
In research, it is often useful to take multiple samples from a single experimental unit to obtain a more detailed and precise picture of how the response variable reacts to the treatment. This method, known as Sub-Sampling, gives a more precise estimate of each unit's response and allows the variability within an experimental unit (sampling error) to be separated from the variability between units (experimental error).

SmartstatXL provides data analysis for experiments with multiple observations (multi-observation), particularly those involving Sub-Sampling.

There are two types of multiple observation data:

  • Sub-Sampling: Several samples are measured within a single experimental unit. For example, when measuring plant height, 10 plants may be measured in each plot to obtain a more precise plot average.
  • Repeated Measure: The same attribute (for example plant height, number of tillers, or leaf area) is measured periodically on the same unit, for example at different stages of plant growth.

SmartstatXL also offers several Post Hoc Tests for exploring which treatments differ from one another. The options are: Tukey, Duncan, LSD, Bonferroni, Sidak, Scheffe, REGWQ, Scott-Knott, and Dunnett.

SmartstatXL is designed to analyze Sub-Sampling data so that the results are reliable and easy to interpret.

Case Example

The number of tillers in the rice variety IR729-67-3 was tested with nine fertilization treatments in a Randomized Block Design with four replications and four sampling units (s1 to s4).

 

             Replication 1     Replication 2     Replication 3     Replication 4
Treatment    s1  s2  s3  s4    s1  s2  s3  s4    s1  s2  s3  s4    s1  s2  s3  s4
1            30  23  27  22    22  26  25  32    34  26  30  24    40  42  37  26
2            48  46  33  42    57  60  38  50    67  64  63  58    40  57  36  60
3            52  47  61  46    49  41  43  70    52  48  54  56    50  61  58  74
4            45  51  73  55    65  62  79  54    75  56  75  75    58  41  47  58
5            52  62  56  52    50  72  51  51    56  39  49  59    53  53  40  72
6            62  63  56  43    52  48  54  56    74  58  48  51    63  59  46  52
7            58  46  63  55    47  50  70  53    75  48  73  52    66  76  72  74
8            63  56  59  49    47  53  60  68    47  58  65  78    63  70  80  68
9            70  72  72  49    55  44  42  52    69  55  56  59    53  52  44  49

Cited from:
Gomez, Kwanchai A. and Gomez, Arturo A. 1995. Statistical Procedures for Agricultural Research. [translator] Endang Sjamsuddin and Justika S. Baharsjah. Second Edition. Jakarta: UI-Press, 1995. ISBN: 979-456-139-8. p. 250.
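For readers who want to cross-check the worksheet outside SmartstatXL, the case data can be held as one record per observation. The following is a minimal Python sketch, not part of SmartstatXL; the names `TILLERS`, `long_format`, and `plot_mean` are illustrative assumptions.

```python
# Tiller counts from the case example: one row per treatment,
# 16 values = 4 replications x 4 sampling units (s1..s4).
TILLERS = {
    1: [30, 23, 27, 22, 22, 26, 25, 32, 34, 26, 30, 24, 40, 42, 37, 26],
    2: [48, 46, 33, 42, 57, 60, 38, 50, 67, 64, 63, 58, 40, 57, 36, 60],
    3: [52, 47, 61, 46, 49, 41, 43, 70, 52, 48, 54, 56, 50, 61, 58, 74],
    4: [45, 51, 73, 55, 65, 62, 79, 54, 75, 56, 75, 75, 58, 41, 47, 58],
    5: [52, 62, 56, 52, 50, 72, 51, 51, 56, 39, 49, 59, 53, 53, 40, 72],
    6: [62, 63, 56, 43, 52, 48, 54, 56, 74, 58, 48, 51, 63, 59, 46, 52],
    7: [58, 46, 63, 55, 47, 50, 70, 53, 75, 48, 73, 52, 66, 76, 72, 74],
    8: [63, 56, 59, 49, 47, 53, 60, 68, 47, 58, 65, 78, 63, 70, 80, 68],
    9: [70, 72, 72, 49, 55, 44, 42, 52, 69, 55, 56, 59, 53, 52, 44, 49],
}
SAMPLES_PER_PLOT = 4


def long_format(table):
    """Return one (treatment, replication, sample, value) record per observation."""
    records = []
    for treatment, values in table.items():
        for index, value in enumerate(values):
            replication = index // SAMPLES_PER_PLOT + 1
            sample = index % SAMPLES_PER_PLOT + 1
            records.append((treatment, replication, sample, value))
    return records


def plot_mean(records, treatment, replication):
    """Mean of the sub-samples taken from one plot (treatment x replication)."""
    values = [v for t, r, _, v in records if t == treatment and r == replication]
    return sum(values) / len(values)


records = long_format(TILLERS)
print(len(records))               # 144 observations
print(plot_mean(records, 4, 1))   # 56.0
```

Each plot contributes four records, so the plot mean is the natural summary of one experimental unit.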

Steps for Randomized Block Design (RBD) Analysis of Variance:

  1. Ensure that the worksheet you wish to analyze is active.
  2. Place the cursor on the Dataset. (For information on creating a Dataset, please refer to the 'Data Preparation' guide).
  3. If the active cell is not on the dataset, SmartstatXL will automatically detect and determine the appropriate dataset.
  4. Activate the SmartstatXL Tab.
  5. Click on the Menu One Factor > RCBD:Sub-Sampling.
    (Image: One Factor > RCBD:Sub-Sampling menu)
  6. SmartstatXL will display a dialog box to confirm whether the Dataset is correct (usually the cell address for the Dataset is automatically selected correctly).
  7. After confirming that the Dataset is correct, press the Next Button.
  8. Next, the Anova – Single Factor RBD Dialog Box will appear:
    (Image: Anova – Single Factor RBD Dialog Box)
  9. There are 3 Stages. In the first stage, select the Factor and at least one Response to be analyzed (as shown in the picture above).
  10. When you select a Factor, SmartstatXL will provide additional information about the number of levels and their names. In RBD experiments, Replications are included as a factor.
  11. Details of the Anova STAGE 1 dialog box can be seen in the following image:
    (Image: Anova STAGE 1 Dialog Box)
  12. After selecting the Factor and Response, press the Next Button to move to the Anova Stage 2 Dialog Box.
  13. The dialog box for the second stage will appear.
    (Image: Anova STAGE 2 Dialog Box)
  14. Adjust the settings based on your research method. In this example, the Post Hoc Test used is Bonferroni (Dunn).
  15. To set additional output and default values for subsequent output, press the "Advanced Options…" button.
  16. Here is the view of the Advanced Options Dialog Box:
  17. Once the settings are complete, close the "Advanced Options" dialog box.
  18. Next, in the Anova Stage 2 Dialog Box, click the Next button.
  19. In the Anova Stage 3 Dialog Box, you will be asked to specify the table of means, the ID for each Factor, and the rounding of the mean values. Details can be seen in the following image:
    (Image: Anova STAGE 3 Dialog Box)
  20. As the final step, click "OK".

Analysis Results

Analysis Information

Analysis of Variance (ANOVA) tests whether the means of two or more groups differ, based on the numerical data collected. In this case, the focus is on the effect of fertilization treatments (9 levels) and replications (4 levels) on the number of tillers in the rice variety IR729-67-3.

Experimental Design

The experimental design is a single-factor Randomized Block Design (RBD) with Sub-Sampling. The study has one main factor (fertilization treatment) applied in four replications (blocks), and four sub-samples are measured in each plot, that is, in each treatment-by-replication combination.

Post Hoc Test

The Post Hoc Test used is Bonferroni, which is a correction method for controlling the Type I error rate when making multiple comparisons.

Response

The response or dependent variable in this experiment is the "Number of Tillers," serving as an indicator of the success of various fertilization treatments applied.

Factors

There are two factors analyzed in this experiment:

  1. Replications: With 4 levels, these are the four blocks in which every treatment is applied. Blocking accounts for natural variability between blocks in the experiment.
  2. Treatments: With 9 levels, this refers to the nine different types of fertilization treatments applied to the rice.

Thus, the aim of this analysis is to determine the extent to which fertilization treatments and replications affect the number of rice tillers.


Variance Analysis

The results of the Analysis of Variance reveal several key points:

  1. Effect of Replication (U): The effect of replications is not statistically significant at either the 5% or 1% significance level (p-value > 0.05). Differences between replications (blocks) are not large enough to be distinguished from experimental error.
  2. Effect of Treatments (P): The effect of treatments is statistically significant at the 1% significance level (p-value < 0.01). The fertilization treatments differ in their effect on the number of rice tillers.
  3. Error Effect: The experimental error is statistically significant at the 1% significance level (p-value < 0.01). In a sub-sampling design this term is usually tested against the sampling error, so the result indicates that plots receiving the same treatment vary more than the sub-samples within a plot do.
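The partition behind these three tests can be sketched in a few lines. The example below follows the standard textbook decomposition for an RBD with sub-sampling (replication, treatment, experimental error, sampling error); whether SmartstatXL computes its table in exactly this way is an assumption, and the function name `subsampling_anova` is hypothetical. It stops at the F ratios, because p-values need an F distribution that the Python standard library does not provide.

```python
# Case data: one row per treatment, 16 values = 4 replications x 4 sub-samples.
TILLERS = {
    1: [30, 23, 27, 22, 22, 26, 25, 32, 34, 26, 30, 24, 40, 42, 37, 26],
    2: [48, 46, 33, 42, 57, 60, 38, 50, 67, 64, 63, 58, 40, 57, 36, 60],
    3: [52, 47, 61, 46, 49, 41, 43, 70, 52, 48, 54, 56, 50, 61, 58, 74],
    4: [45, 51, 73, 55, 65, 62, 79, 54, 75, 56, 75, 75, 58, 41, 47, 58],
    5: [52, 62, 56, 52, 50, 72, 51, 51, 56, 39, 49, 59, 53, 53, 40, 72],
    6: [62, 63, 56, 43, 52, 48, 54, 56, 74, 58, 48, 51, 63, 59, 46, 52],
    7: [58, 46, 63, 55, 47, 50, 70, 53, 75, 48, 73, 52, 66, 76, 72, 74],
    8: [63, 56, 59, 49, 47, 53, 60, 68, 47, 58, 65, 78, 63, 70, 80, 68],
    9: [70, 72, 72, 49, 55, 44, 42, 52, 69, 55, 56, 59, 53, 52, 44, 49],
}


def subsampling_anova(table, n_rep, n_samp):
    """Partition the total sum of squares for an RBD with sub-sampling."""
    n_trt = len(table)
    values = [v for row in table.values() for v in row]
    correction = sum(values) ** 2 / len(values)

    # Plot totals: one total per treatment x replication combination.
    plots = {
        t: [sum(row[r * n_samp:(r + 1) * n_samp]) for r in range(n_rep)]
        for t, row in table.items()
    }
    rep_totals = [sum(p[r] for p in plots.values()) for r in range(n_rep)]
    trt_totals = [sum(p) for p in plots.values()]

    ss_total = sum(v * v for v in values) - correction
    ss_plots = sum(x * x for p in plots.values() for x in p) / n_samp - correction
    ss = {
        "rep": sum(x * x for x in rep_totals) / (n_trt * n_samp) - correction,
        "trt": sum(x * x for x in trt_totals) / (n_rep * n_samp) - correction,
    }
    ss["exp_error"] = ss_plots - ss["rep"] - ss["trt"]
    ss["samp_error"] = ss_total - ss_plots

    df = {
        "rep": n_rep - 1,
        "trt": n_trt - 1,
        "exp_error": (n_rep - 1) * (n_trt - 1),
        "samp_error": n_rep * n_trt * (n_samp - 1),
    }
    ms = {source: ss[source] / df[source] for source in ss}
    # Replication and treatment are tested against experimental error;
    # experimental error is tested against sampling error.
    f = {
        "rep": ms["rep"] / ms["exp_error"],
        "trt": ms["trt"] / ms["exp_error"],
        "exp_error": ms["exp_error"] / ms["samp_error"],
    }
    return ss, df, ms, f


ss, df, ms, f = subsampling_anova(TILLERS, n_rep=4, n_samp=4)
for source in ("rep", "trt", "exp_error", "samp_error"):
    print(f"{source:<11} df={df[source]:>3}  SS={ss[source]:10.2f}  MS={ms[source]:8.2f}")
print({source: round(value, 2) for source, value in f.items()})
```

The design choice that matters is the denominator: treatments are compared against the plot-to-plot (experimental) error, not against the variation among sub-samples, because sub-samples from one plot are not independent replications of the treatment.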

Discussion

  1. Replication: No significant differences between replications were detected, indicating that conditions were fairly uniform across the blocks.
  2. Treatments: There is a significant difference in the number of rice tillers between the fertilization treatments. This indicates the need to further explore which types of fertilization are most effective.
  3. Error: The significant experimental error indicates plot-to-plot variation beyond what the model explains, which could come from factors not included in the model.

Overall, these results provide strong evidence that the type of fertilization used has a significant effect on the number of rice tillers. Therefore, further research is recommended to determine the most effective type of fertilization.

Post Hoc Test

Interpretation of Post Hoc Test Results

  1. Single Effect of Treatments (P):
    • The Post Hoc Test results using the Bonferroni method indicate that there is a significant difference between specific treatments on the number of rice tillers.
  2. Critical Value:
    • The Standard Error is the standard error of the difference between the means of two treatments.
    • The Bonferroni critical value at the 0.05 level (13.7002) is the threshold: two treatments differ significantly when the absolute difference between their means exceeds it.
  3. Table of Average Number of Tillers:
    • Treatment 1 is significantly different from other treatments with a lower number of rice tillers.
    • Treatments 2-9 are not significantly different from each other but differ from Treatment 1.
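The grouping can be checked by comparing every pair of treatment means against the critical value. In this minimal sketch the means are computed from the case data table rather than copied from the SmartstatXL output, the critical value 13.7002 is the one reported above, and the helper `significant_pairs` is a hypothetical name.

```python
from itertools import combinations

# Treatment means computed from the case data (16 observations per treatment).
MEANS = {
    1: 29.125, 2: 51.1875, 3: 53.875, 4: 60.5625, 5: 54.1875,
    6: 55.3125, 7: 61.125, 8: 61.5, 9: 55.8125,
}
CRITICAL_DIFFERENCE = 13.7002  # Bonferroni critical value at the 0.05 level


def significant_pairs(means, critical):
    """Return the treatment pairs whose mean difference exceeds the critical value."""
    return [
        (a, b)
        for a, b in combinations(sorted(means), 2)
        if abs(means[a] - means[b]) > critical
    ]


pairs = significant_pairs(MEANS, CRITICAL_DIFFERENCE)
print(pairs)  # [(1, 2), (1, 3), (1, 4), (1, 5), (1, 6), (1, 7), (1, 8), (1, 9)]
```

Every significant pair involves Treatment 1; the largest gap among Treatments 2-9 (Treatment 8 versus Treatment 2, about 10.31) stays below the threshold.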

In practical terms, the fertilization treatments represented by Treatments 2-9 all produce more rice tillers than Treatment 1. Among Treatments 2-9, however, no pairwise difference is significant, so this experiment cannot distinguish their effectiveness.

Therefore, further research may be needed to understand the factors that make Treatments 2-9 more effective compared to Treatment 1, and to explore whether there are further differences that may not have been detected in this experiment.

ANOVA Assumption Checks

Formal Approach (Statistical Tests)

Levene's Test for Homogeneity of Variance

Levene's Test checks the assumption of homogeneity of variances among groups. The p-value exceeds 0.05 (p = 0.213), so the test gives no evidence against homogeneity and the assumption is considered met.
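As an illustration of what the test measures, the sketch below computes Levene's W statistic from absolute deviations around each group mean. The text does not state which variant or grouping SmartstatXL uses, so this is an assumption; the function name `levene_statistic` and the two small groups are hypothetical and are not the case data. Turning W into a p-value would additionally need an F distribution with (k - 1, N - k) degrees of freedom.

```python
def levene_statistic(groups):
    """Levene's W, using absolute deviations from each group mean."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)

    # Absolute deviation of every observation from its own group mean.
    deviations = []
    for group in groups:
        mean = sum(group) / len(group)
        deviations.append([abs(y - mean) for y in group])

    group_means = [sum(z) / len(z) for z in deviations]
    grand_mean = sum(sum(z) for z in deviations) / n_total

    between = sum(len(z) * (m - grand_mean) ** 2
                  for z, m in zip(deviations, group_means))
    within = sum((v - m) ** 2
                 for z, m in zip(deviations, group_means) for v in z)
    return (n_total - k) / (k - 1) * between / within


# Hypothetical data: the second group is twice as spread out as the first.
w = levene_statistic([[1, 2, 3], [0, 2, 4]])
print(round(w, 4))  # 0.8
```

In effect the test is a one-way ANOVA on the absolute deviations: a large W means the groups differ in spread.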

Normality Test

All normality tests show a p-value greater than 0.05, so there is no evidence that the residuals depart from a normal distribution. The normality assumption for the residuals is therefore considered met.

Discussion

  • Variance Homogeneity: Levene's test does not reject homogeneity of variance between groups. Homogeneity of variance is one of the basic assumptions of ANOVA, and a violation could affect the validity of the results.
  • Data Normality: Several test methods are used to check whether the residuals are normally distributed, which is also a basic assumption of ANOVA. None of the tests rejects this assumption.

Overall, both the homogeneity of variance and normal distribution assumptions are met, further validating the results of the variance analysis conducted. This reinforces confidence in the findings that show a significant effect of fertilization treatments on the number of rice tillers and validates the decision to proceed with Post Hoc tests like Bonferroni.

Visual Approach (Graph Plots)

Interpretation

  1. Normal P-Plot of Residual Data
    • The Normal P-Plot graph is used to check whether the residual data is normally distributed. If the points on the graph nearly form a straight diagonal line, this indicates that the data tends to be normal. From the graph presented, it appears that the data points tend to follow the diagonal line, indicating that the normality assumption is met.
  2. Residual Data Histogram
    • The histogram is used to visualize the data distribution. In this context, the histogram shows the shape of the residual data distribution. From the graph, it appears that the residual data spreads around the zero value and forms a pattern resembling a normal distribution. This also indicates that the data normality assumption is met.
  3. Residual vs. Predicted Plot
    • The Residual vs. Predicted graph is used to check the assumption of homoscedasticity, that is, the homogeneity of residual variance. If the pattern of the points is random and does not show any specific pattern (e.g., opening or closing like a funnel), then the homoscedasticity assumption is met. From the graph presented, it appears that the residual points scatter randomly without showing any specific pattern, indicating that the homoscedasticity assumption is met.
  4. Standard Deviation vs. Mean
    • This graph is also used to check for homoscedasticity. If the points scatter randomly and do not show any specific pattern, this indicates that the variance from each group is the same or homogeneous. From the graph presented, the points appear to scatter randomly, further validating the homogeneity of variance assumption.

Conclusion

Graphically, the basic ANOVA assumptions, normality of the residuals and homogeneity of variance (homoscedasticity), appear to be met based on the presented graphs. This strengthens the validity of the variance analysis and Post Hoc tests performed, making the interpretation and conclusions drawn from the analysis more convincing.

Box-Cox and Residual Analysis

Box-Cox Transformation

The Box-Cox Transformation identifies a power transformation that brings the data closer to the normality and homoscedasticity assumptions required in variance analysis. The Lambda value found is 0.526, which is close to 0.5 and suggests that the square root transformation (√Y) would be the most suitable if a transformation were required. Because the assumption checks show that the ANOVA assumptions are already met, no transformation is needed.
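The link between Lambda and the square root can be seen from the usual one-parameter Box-Cox formula, (y^λ - 1) / λ for λ ≠ 0 and ln(y) for λ = 0. The sketch below assumes that standard form; how SmartstatXL estimates Lambda is not described in the text, and the function name `box_cox` is illustrative.

```python
import math


def box_cox(y, lam):
    """One-parameter Box-Cox transform of a single positive value."""
    if y <= 0:
        raise ValueError("Box-Cox requires positive values")
    if lam == 0:
        return math.log(y)
    return (y ** lam - 1) / lam


# With lambda = 0.5 the transform is a linear rescaling of the square root:
# (sqrt(y) - 1) / 0.5 = 2 * (sqrt(y) - 1)
print(box_cox(49, 0.5))    # 12.0
print(box_cox(49, 0.526))  # estimated Lambda from the text, close to the sqrt case
```

Because a linear rescaling does not change the ANOVA F tests, an estimated Lambda near 0.5 is normally rounded to the simpler √Y transformation.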

Outlier Data Examination

Based on the table presented, several points show the value "Outlier" in the "Diagnostic" column. This indicates that there are some observations that could potentially be outliers or values very different from other observations in the sample.

  • For instance, in Replication 1 and Treatment 4, the Tiller Count value is 73, while the predicted value is 56. This gives a residual of 17 and a Studentized Deleted Residual of 2.2870, flagging the observation as an outlier.
  • Similarly, in Replication 2 and Treatment 3, the Tiller Count value is 70, while the predicted value is 50.75. This gives a residual of 19.25 and a Studentized Deleted Residual of 2.6077, also flagging the observation as an outlier.
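Both predicted values quoted above equal the mean of the four sub-samples in the corresponding plot, which suggests that the predicted value is the plot mean. Under that assumption the residuals can be reproduced as follows; the function name `plot_residual` is hypothetical, and the studentized statistics are not recomputed here.

```python
def plot_residual(observed, plot_values):
    """Return (predicted, residual), taking the plot mean as the predicted value."""
    predicted = sum(plot_values) / len(plot_values)
    return predicted, observed - predicted


# Replication 1, Treatment 4: sub-samples s1..s4 from the data table.
print(plot_residual(73, [45, 51, 73, 55]))  # (56.0, 17.0)

# Replication 2, Treatment 3.
print(plot_residual(70, [49, 41, 43, 70]))  # (50.75, 19.25)
```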

Discussion

  1. Box-Cox Transformation: Based on the analysis results, the ANOVA assumptions are met, thus no transformation is needed.
  2. Outlier Data: Outliers can affect the analysis results and lead to potentially inaccurate conclusions. Therefore, it's important to further investigate this outlier data. Researchers may need to decide whether to retain or remove these outliers from the analysis, depending on the research context.
  3. Residual and Predicted Values: These columns show how closely the model's predictions match the observations. Large residuals suggest that the model may not fully explain the variation in the data.
  4. Leverage, Studentized Residual, and Studentized Deleted Residual: These are diagnostics used to identify unusual observations. Studentized residuals far from zero indicate potential outliers, while high leverage marks observations with the potential to influence the fit.

Overall, the Box-Cox transformation and outlier data examination are crucial steps in ensuring the accuracy and reliability of statistical analysis results. This enables researchers to make more convincing interpretations and conclusions.