SmartstatXL is an Excel Add-In designed to facilitate the analysis of experimental data. One of its main features is the Split-Split Plot Analysis of Variance, in which the main plots are arranged in a Randomized Block Design (RBD) or a Completely Randomized Design (CRD). Although balanced designs are the primary focus, SmartstatXL is not limited to standard designs and also supports other mixed models.

Here are the specific features for Split-Split Plot experiments available in SmartstatXL:

  • CRD/RBD/LSD Split-Split Plot: The standard Split-Split Plot experiment, in which each observational unit is measured only once. (In these menu names, LSD stands for Latin Square Design, not the Least Significant Difference test.)
  • CRD/RBD/LSD Split-Split Plot: Sub-Sampling: For experiments in which several sub-samples are measured within a single observational unit, for example 10 plants measured in one observational unit (treatment 3Dok1, first replication).
  • Split-Split Plot: Repeated Measure: Specifically for observations that are made periodically from a single observational unit, such as every 14 days.
  • Split-Split Plot: Multi Location/Season/Year: An ideal solution for experiments conducted across multiple locations, seasons, or years.

If there are significant treatment effects, SmartstatXL offers various Post hoc Tests for comparing treatment means. The available options include: Tukey, Duncan, LSD, Bonferroni, Sidak, Scheffe, REGWQ, Scott-Knott, and Dunnett.

Case Example

An agricultural experiment aims to study the effect of three factors: Nitrogen fertilization (A), Plant Management (B), and Variety Type (C) on rice yield (tons/ha). The Nitrogen factor is placed as the main plot, Management as the subplot, and Variety as the sub-subplot. The Rice Yield Data (tons/ha) is presented in the following table, with one column per group:

Nitrogen  Management  Variety  Group 1  Group 2  Group 3
1         1           1        3.320    3.864    4.507
1         1           2        6.101    5.122    4.815
1         1           3        5.355    5.536    5.244
1         2           1        3.766    4.311    4.875
1         2           2        5.096    4.873    4.166
1         2           3        7.442    6.462    5.584
1         3           1        4.660    5.915    5.400
1         3           2        6.573    5.495    4.225
1         3           3        7.018    8.020    7.642
2         1           1        3.188    4.752    4.756
2         1           2        5.595    6.780    5.390
2         1           3        6.706    6.546    7.092
2         2           1        3.625    4.809    5.295
2         2           2        6.357    5.925    5.163
2         2           3        8.592    7.646    7.212
2         3           1        5.232    5.170    6.046
2         3           2        7.016    7.442    4.478
2         3           3        8.480    9.942    8.714
3         1           1        5.468    5.788    4.422
3         1           2        5.442    5.988    6.509
3         1           3        8.452    6.698    8.650
3         2           1        5.759    6.130    5.308
3         2           2        6.398    6.533    6.569
3         2           3        8.662    8.526    8.514
3         3           1        6.215    7.106    6.318
3         3           2        6.953    6.914    7.991
3         3           3        9.112    9.140    9.320

Cited from:
Gaspersz, Vincent. 1991. Experimental Design Methods: For Agricultural Sciences, Engineering, and Biology. Bandung: Armico, 1991. p. 297.
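
Statistical tools generally expect a table like this in long format, with one row per observation and one column per factor. The following Python sketch reshapes the first three rows of the table (Nitrogen 1, Management 1) into that layout. The column names and the `to_long` helper are illustrative assumptions, not SmartstatXL identifiers, and the layout SmartstatXL itself requires may differ.

```python
# Wide rows copied from the table:
# (nitrogen, management, variety, [group 1, group 2, group 3])
wide_rows = [
    (1, 1, 1, [3.320, 3.864, 4.507]),
    (1, 1, 2, [6.101, 5.122, 4.815]),
    (1, 1, 3, [5.355, 5.536, 5.244]),
]


def to_long(rows):
    """Return one record per observation (hypothetical column names)."""
    records = []
    for nitrogen, management, variety, yields in rows:
        # Each yield in the list belongs to group 1, 2, 3 in order.
        for group, value in enumerate(yields, start=1):
            records.append({
                "Group": group,
                "Nitrogen": nitrogen,
                "Management": management,
                "Variety": variety,
                "Yield": value,
            })
    return records


long_rows = to_long(wide_rows)
for record in long_rows:
    print(record)
```

Each wide row expands into three records, so the full 27-row table would give 81 observations.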

Steps for Analysis of Variance (ANOVA) and Post Hoc Tests:

  1. Make sure the worksheet (Sheet) you wish to analyze is active.
  2. Place the cursor on the Dataset. (For information on creating a Dataset, please refer to the 'Data Preparation' guide).
  3. If the active cell is not on the dataset, SmartstatXL will automatically detect and determine the appropriate dataset.
  4. Activate the SmartstatXL Tab
  5. Click on the Menu Split-Split Plot > Main Plot CRD/RBD.
    Menu Split-Split Plot > Main Plot CRD/RBD
  6. SmartstatXL will display a dialog box to confirm whether the Dataset is correct (usually, the cell address for the Dataset is automatically selected correctly).
  7. After confirming the Dataset is correct, press the Next Button
  8. The following Anova – RBD/CRD Split-Split Plot Dialog Box will appear:
    Dialog Box Anova – RBD/CRD Split-Split Plot
  9. There are three stages in this dialog. In the first stage, select the Factors and at least one Response to analyze.
  10. When you select Factors, SmartstatXL will provide additional information about the number of levels and their names. In Split-Split Plot experiments (CRD/RBD), Replications are still included as a factor.
  11. Details of the Anova STAGE 1 Dialog Box can be seen in the following image:
    Dialog Box Anova STAGE 1
  12. After selecting the Factors and the Response, press the Next Button to proceed to the Anova Stage-2 Dialog Box
  13. The dialog box for the second stage will appear.
    Dialog Box Anova STAGE 2
  14. Adjust settings according to your research method. In this example, the Post Hoc Test used is LSD.
  15. To adjust additional output and default values for the subsequent output, press the "Advanced Options…" button
  16. Here is the display of the Advanced Options Dialog Box:
  17. Once settings are finalized, close the "Advanced Options" dialog box.
  18. Next, in the Anova Stage 2 Dialog Box, click the Next button.
  19. In the Anova Stage 3 Dialog Box, you will be asked to specify the average table, ID for each Factor, and rounding off the average values. The details can be seen in the following image:
    Dialog Box Anova STAGE 3
  20. As the final step, click "OK"

Analysis Results

Analysis Information

Information on experimental design, Post Hoc tests used, and output of the analysis of variance table

In this context, a Randomized Block Design (RBD) Split-Split Plot is used to evaluate the effects of three factors: Nitrogen (A), Management (B), and Variety (C) on rice yield (tons/ha).

Key Points:

  • Experimental Design: The RBD Split-Split Plot accounts for the nested structure of the experimental units. The three factors themselves are crossed, but the Management (B) subplots are nested within the Nitrogen (A) main plots, and the Variety (C) sub-subplots are nested within the Management subplots.
  • Post Hoc Test: The Least Significant Difference (LSD) test will be used to compare group means if significant effects from these factors are found.
  • Response: The response variable to be analyzed is rice yield, measured in tons per hectare.
  • Factors and Levels: Each factor has three levels, which will facilitate the interpretation of interactions between the factors.
  • Assumption Violations: There seems to be a violation of assumptions affecting the validity of the analysis. The proposed solution is to replace outlier data with values from missing data calculations.

Assumption Checks (Original Data)

Formal Approach (Statistical Tests)

Levene's Test for Homogeneity of Variances

Levene's test indicates a violation of the homogeneity of variance assumption, as the p-value (0.010) is smaller than 0.05. This means that the variability differs between groups, which could affect the validity of the analysis of variance. In this case, further steps may be necessary, such as a data transformation or a more robust alternative analysis method.
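
To make the test concrete, here is a minimal pure-Python sketch of Levene's W statistic, which is an F-ratio computed on the absolute deviations of each observation from its group mean. The source does not state which variant (mean-centered or median-centered) or which grouping SmartstatXL uses, so both are assumptions here, and the two small groups are made-up numbers rather than the rice data.

```python
from statistics import mean


def levene_w(groups):
    """Levene's W statistic using absolute deviations from each group mean."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    # Absolute deviation of every observation from its own group mean.
    z = [[abs(x - mean(g)) for x in g] for g in groups]
    z_means = [mean(zi) for zi in z]
    z_grand = sum(sum(zi) for zi in z) / n_total
    # Between-group and within-group sums of squares of the deviations.
    between = sum(len(zi) * (zm - z_grand) ** 2 for zi, zm in zip(z, z_means))
    within = sum((v - zm) ** 2 for zi, zm in zip(z, z_means) for v in zi)
    return (n_total - k) / (k - 1) * between / within


# Made-up example: the second group is twice as spread out as the first.
toy_groups = [[1.0, 2.0, 3.0], [2.0, 4.0, 6.0]]
w = levene_w(toy_groups)
print(f"Levene W = {w:.3f}")
```

W is compared with an F distribution on (k - 1, N - k) degrees of freedom; a large W, and hence a small p-value, points to unequal variances.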

Tests for Normality

  • Shapiro-Wilk's: p-value = 0.832
  • Anderson Darling: p-value = 0.646
  • D'Agostino Pearson: p-value = 0.636
  • Lilliefors: p > 0.20
  • Kolmogorov-Smirnov: p > 0.20

All normality tests indicate that the assumption of a normal distribution of residuals has been met, as all p-values are greater than 0.05. This means that the normality assumption for ANOVA is satisfied and is not a concern.

Conclusion

  • The homogeneity of variance assumption is violated, but the normality assumption is met.
  • Due to the violation of the homogeneity of variance assumption, the results from ANOVA may not be entirely valid. Therefore, alternative analysis methods or data transformation should be considered.

Once the status of the assumptions was known, several steps were taken to safeguard the validity of the results. SmartstatXL tried various data transformations, but none of them restored homogeneity of variance. As an alternative, SmartstatXL resolved the violation by correcting outlier data. It still presents the analysis results from the original data as a reference, alongside the results from the modified data.

Data Replacement Methods

To restore the violated homogeneity of variance assumption, SmartstatXL replaced outlier data with values generated from missing data calculations (Depth: 1). The aim is to mitigate the effect of outliers that could jeopardize the validity of the analysis of variance.

Sample Data

  • For example, in group 1, with Nitrogen type 1, Management type 3, and Variety type 2, the original rice yield was 6.573 tons/ha. However, this value is considered as outlier data and replaced with 3.9486 tons/ha.
  • Similarly, in group 1, with Nitrogen type 2, Management type 2, and Variety type 1, the original rice yield was 3.625 tons/ha. This value was likewise treated as outlier data and replaced with 6.0348 tons/ha.
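
The source does not describe how the "missing data calculation" is carried out. As a point of reference only, the classical missing-plot estimate for a simple randomized block design treats the suspect observation as missing and estimates it from the block, treatment, and grand totals of the remaining observations. The sketch below implements that textbook formula. It is not SmartstatXL's procedure, a split-split plot layout needs a more elaborate formula, and the totals used are made-up numbers.

```python
def missing_plot_estimate(n_blocks, n_treatments,
                          block_total, treatment_total, grand_total):
    """Textbook missing-plot estimate for a simple randomized block design.

    block_total:     sum of the remaining values in the affected block
    treatment_total: sum of the remaining values of the affected treatment
    grand_total:     sum of all remaining values
    """
    numerator = (n_blocks * block_total
                 + n_treatments * treatment_total
                 - grand_total)
    denominator = (n_blocks - 1) * (n_treatments - 1)
    return numerator / denominator


# Hypothetical totals for a design with 3 blocks and 4 treatments.
estimate = missing_plot_estimate(
    n_blocks=3, n_treatments=4,
    block_total=14.0, treatment_total=9.0, grand_total=60.0,
)
print(f"Estimated replacement value: {estimate:.4f}")
```

With these made-up totals the estimate is (3 × 14 + 4 × 9 - 60) / (2 × 3).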

Implications

This outlier data replacement is a step taken to ensure the validity of the statistical analysis. However, it is important to note that this method also has its own implications and should be handled carefully when interpreting results. In addition, SmartstatXL also presents the analysis results from the original data as a reference, which can be used for comparison and validation of the results from the modified data.

Understanding these changes, we can proceed to the next phase of analysis with the modified data.

Analysis of Variance

Analysis of Variance from the original data:

Interpretation and Discussion

  1. Group Effect (G)
    • F-value (1.551) < F-0.05 (6.944): There is no significant group effect on rice yield. The p-value (0.317) is greater than 0.05, so we fail to reject the null hypothesis of no group effect.
  2. Nitrogen Effect (N)
    • F-value (54.098) > F-0.01 (18.000): There is a significant effect of Nitrogen on rice yield at the 1% significance level. This indicates that Nitrogen application has a significant impact on rice yield.
  3. Management Effect (M)
    • F-value (49.332) > F-0.01 (6.927): There is a significant effect of Management on rice yield at the 1% significance level. This indicates that the management strategies used in this experiment have a significant impact on rice yield.
  4. Interaction of Nitrogen x Management (N x M)
    • F-value (0.175) < F-0.05 (3.259): There is no significant interaction between Nitrogen and Management on rice yield. This means the effect of Nitrogen on rice yield is not dependent on the type of Management used, and vice versa.
  5. Variety Effect (V)
    • F-value (77.626) > F-0.01 (5.248): There is a significant effect of Variety on rice yield at the 1% significance level. This indicates that the choice of rice variety plays a very important role in determining production output.
  6. Other Interactions (N x V, M x V, N x M x V)
    • All other interactions are not significant at the 5% level, indicating that the combined effects of the factors do not show a significant difference in rice yield.
  7. Coefficient of Variation (CV)
    • CV(a) = 9.18%, CV(b) = 7.75%, CV(c) = 12.59%
    • The coefficient of variation for each error stratum (main plot, subplot, and sub-subplot) is relatively low, indicating that the data is quite consistent and the variability between replications is relatively small.
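
In the usual textbook convention for split-split plot designs, each coefficient of variation is computed from the error mean square of its own stratum. Assuming SmartstatXL follows this convention (the source does not state its formula), and writing $E_a$, $E_b$, $E_c$ for the mean squares of the main-plot, subplot, and sub-subplot errors and $\bar{y}$ for the grand mean of the response:

```latex
\mathrm{CV}(a) = \frac{\sqrt{E_a}}{\bar{y}} \times 100\%, \qquad
\mathrm{CV}(b) = \frac{\sqrt{E_b}}{\bar{y}} \times 100\%, \qquad
\mathrm{CV}(c) = \frac{\sqrt{E_c}}{\bar{y}} \times 100\%
```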

Overall, the factors of Nitrogen, Management, and Variety have a significant impact on rice yield. However, the interactions between these factors are generally not significant, indicating that the effects of each factor stand alone and are not dependent on each other. This provides valuable insights for further research and practical applications in the field of agriculture.
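
The decision rule behind each item above is a comparison of the computed F-value with the tabulated critical values. The short sketch below applies it to the Group and Nitrogen effects from the original data. The `significance` function is a hypothetical helper. Both critical values quoted in the text, 6.944 at the 5% level and 18.000 at the 1% level, are used for both effects on the assumption that the two effects share the same degrees of freedom (2 and 4) in this three-group, three-level layout.

```python
def significance(f_value, f_crit_05, f_crit_01):
    """Classify an F-value against its 5% and 1% critical values."""
    if f_value > f_crit_01:
        return "significant at 1%"
    if f_value > f_crit_05:
        return "significant at 5%"
    return "not significant"


# F critical values quoted in the text (assumed to apply to both effects).
F_05, F_01 = 6.944, 18.000

effects = {"Group": 1.551, "Nitrogen": 54.098}
results = {name: significance(f, F_05, F_01) for name, f in effects.items()}
for name, verdict in results.items():
    print(f"{name}: {verdict}")
```

The subplot and sub-subplot effects follow the same rule, each with the critical values for its own degrees of freedom.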

Analysis of Variance from the Revised Data:

After replacing the outlier data, the analysis of variance results show several significant changes compared to the analysis of the original data. Here is the discussion:

Group Effect (G)

  • Original Data: Not significantly different (P-Value: 0.317)
  • Modified Data: Significantly different at the 1% level (P-Value: 0.008)

There is now a significant group effect on rice yield, which was not present in the original data. This could indicate that the replacement of outlier data helps in identifying previously hidden group effects.

Nitrogen Effect (N)

  • The effect of Nitrogen remains significant at the 1% level in both analyses, but the F-value increased from 54.098 to 732.243. This indicates that the influence of Nitrogen becomes stronger in the modified data.

Management Effect (M)

  • The Management effect remains significant at the 1% level in both analyses, although the F-value decreased substantially, from 49.332 to 19.451.

Interaction of Nitrogen x Variety (N x V)

  • Original Data: Not significantly different (P-Value: 0.251)
  • Modified Data: Significantly different at the 5% level (P-Value: 0.015)

There is now a significant interaction between Nitrogen and Variety, which was not present in the original data. This could mean that the effect of Nitrogen on rice yield depends on the type of Variety used, and vice versa.

Interaction of Nitrogen x Management x Variety (N x M x V)

  • Original Data: Not significantly different (P-Value: 0.859)
  • Modified Data: Significantly different at the 1% level (P-Value: 0.005)

The three-way interaction between Nitrogen, Management, and Variety now becomes significant, indicating deeper complexity in the effects of these factors.

Coefficient of Variation (CV)

  • CV(a), CV(b), and CV(c) in the modified data show lower figures compared to the original data, indicating smaller variability between repetitions.

Conclusion

The replacement of outlier data has significantly affected the results of the analysis of variance. Some previously non-significant effects and interactions have become significant, and the effect of Nitrogen appears stronger. Additionally, the coefficient of variation shows increased data consistency. However, it should be remembered that these results are derived from modified data, so the interpretation must be done cautiously.

Discussion on Revised Data

In the subsequent discussion, the focus will be placed on the analysis of variance results from the modified data. The replacement of outlier data has brought significant changes in the results, especially in uncovering interaction effects that were previously not visible in the original data. Therefore, it will be crucial to delve into these interaction effects to understand the complex dynamics between the factors under study.

Specifically, the discussion will be directed towards:

  • Two-Way Interaction between Nitrogen and Variety (N x V): Becoming significant in the modified data, this interaction suggests that the effect of Nitrogen on rice yield depends on the type of Variety used, and vice versa.
  • Three-Way Interaction between Nitrogen, Management, and Variety (N x M x V): Also becoming significant in the modified data, this interaction indicates more complex dynamics between these three factors in affecting rice yield.

Both of these interactions provide valuable insights into how the combination of Nitrogen, Variety, and Management can affect rice yield, and therefore, will be the focus of attention in the subsequent discussion.

Post hoc Test

Effect of Nitrogen x Variety Interaction

There are two presentation formats for the interaction effect tables. You can choose either or both. The First Format is in a one-way table form, where treatment levels are combined and the layout is like a single effect table. The Second Format tests simple effects and is presented in a two-way table format. The choice of average table and graph display can be set through the Advanced Options (refer back to step 15 of the Analysis of Variance Steps).

First Format:

Second Format:

Comparison of Data Presentation between the Two Formats

First Format:

  • This format presents the interaction between Nitrogen and Variety directly by showing the mean values and confidence intervals.
  • There is no explicit separation between the simple effects of Nitrogen or Variety; the focus is on the combined effect of both these factors.

Second Format:

  • This format provides a more explicit separation between the effects of Nitrogen and Variety, allowing the reader to more easily compare the simple effects of each factor at each level of the other factor.
  • Using two types of notation (lowercase for Nitrogen and uppercase for Variety) facilitates the interpretation of the simple effects of each factor.

Interpretation and Discussion of the Second Format: Simple Effects of the Interaction

Simple Effect of Nitrogen (N):

  • For Variety 1: The average rice yield increased from 4.513 ton/ha (Nitrogen 1) to 5.835 ton/ha (Nitrogen 3). All these values are significantly different (a, b, c).
  • For Variety 2: There is a significant increase from 4.871 ton/ha (Nitrogen 1) to 6.589 ton/ha (Nitrogen 3) (a, b).
  • For Variety 3: Significant increase from 6.478 ton/ha (Nitrogen 1) to 8.816 ton/ha (Nitrogen 3) (a, b, c).

Simple Effect of Variety (V):

  • For Nitrogen 1: The lowest rice yield is for Variety 1 (4.513 ton/ha) and the highest is for Variety 3 (6.478 ton/ha). All these values are significantly different (A, B).
  • For Nitrogen 2: There is a significant increase from 5.031 ton/ha (Variety 1) to 7.881 ton/ha (Variety 3) (A, B, C).
  • For Nitrogen 3: The lowest rice yield is for Variety 1 (5.835 ton/ha) and the highest is for Variety 3 (8.816 ton/ha). All these values are significantly different (A, B, C).

Notes

  • Letter notation (both lowercase and uppercase) indicates significant differences at the 0.05 level according to the Post hoc LSD Test.
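
The letter groupings rest on the LSD critical difference: two means are declared different when their absolute difference exceeds it. The sketch below uses the generic form LSD = t × sqrt(2 × MSE / r). In a split-split plot, the appropriate error mean square (and for some comparisons a weighted t-value) depends on which factor levels are being compared, and the source does not report those values. The t-value, error mean square, and replication count below are therefore placeholders. Only the two means are taken from the text (Nitrogen 1 and Nitrogen 3 at Variety 1).

```python
import math


def lsd(t_critical, error_mean_square, n_per_mean):
    """Generic least significant difference between two means."""
    return t_critical * math.sqrt(2.0 * error_mean_square / n_per_mean)


def differs(mean_1, mean_2, critical_difference):
    """True when two means differ by more than the critical difference."""
    return abs(mean_1 - mean_2) > critical_difference


# Placeholder inputs, NOT values from the SmartstatXL output.
critical_difference = lsd(t_critical=2.0, error_mean_square=0.36, n_per_mean=9)

# Means quoted in the text for Variety 1 at Nitrogen 1 and Nitrogen 3.
is_different = differs(4.513, 5.835, critical_difference)
print(f"LSD = {critical_difference:.4f}, different: {is_different}")
```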

Conclusion

The simple effects of the interaction between Nitrogen and Variety show that both factors have different effects on rice yield depending on the level of the other factor. For example, the effect of increasing Nitrogen levels on rice yield differs depending on the Variety used, and vice versa. This indicates the complexity of the interaction between Nitrogen and Variety and the importance of considering both factors simultaneously in agricultural practice.

Impact of Three-Way Interaction: Nitrogen x Management x Variety

Notation Explanation

  • Lowercase letters are used to compare between Management at the same Nitrogen x Variety.
  • Uppercase letters are used to compare between Variety at the same Nitrogen x Management.
  • Letters in parentheses are used to compare between Nitrogen at the same Management x Variety.

Graphical Presentation:

Simple Effects of the Three-Way Interaction (N x M x V)

This analysis reveals how the interaction between Nitrogen (N), Management (M), and Variety (V) affects rice yield. By observing the average value table and considering the critical values from the Post hoc LSD Test, various important insights can be found.

Simple Effects of Nitrogen (N):

  • At Management 1 and Variety 1: Rice yield increases from 3.897 ton/ha (N1) to 5.226 ton/ha (N3) (a, b).
  • At Management 2 and Variety 3: Rice yield increases from 6.496 ton/ha (N1) to 9.191 ton/ha (N3) (a, b).

Simple Effects of Management (M):

  • At Nitrogen 1 and Variety 1: Rice yield is higher in M2 (4.317 ton/ha) compared to M1 (3.897 ton/ha) but not significantly different (a, ab).
  • At Nitrogen 3 and Variety 3: Rice yield increases from 8.691 ton/ha (M1) to 9.191 ton/ha (M3), and this is significantly different (a, b).

Simple Effects of Variety (V):

  • At Nitrogen 1 and Management 1: The lowest rice yield is at V1 (3.897 ton/ha) and the highest is at V3 (5.378 ton/ha) (A, B).
  • At Nitrogen 3 and Management 3: Rice yield increases from 6.546 ton/ha (V1) to 9.191 ton/ha (V3) and this is significantly different (A, B).

Interactions and Combinations:

  • At Nitrogen 2, Management 3, and Variety 1 (N2, M3, V1), rice yield is 5.483 ton/ha, which is significantly different from the combination N2, M3, V3 with rice yield of 9.045 ton/ha.
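
Each treatment mean in the three-way table is the average over the three groups. As a check, the sketch below recomputes four of the means quoted in this section from the rice yield table given earlier. For these four combinations the raw values reproduce the quoted means. Combinations that contain a replaced observation would need the modified values, which the source does not list in full.

```python
from statistics import mean

# Yields for groups 1, 2, 3, copied from the rice yield table.
cells = {
    ("N1", "M1", "V1"): [3.320, 3.864, 4.507],
    ("N2", "M3", "V1"): [5.232, 5.170, 6.046],
    ("N2", "M3", "V3"): [8.480, 9.942, 8.714],
    ("N3", "M3", "V3"): [9.112, 9.140, 9.320],
}

# Average over groups, rounded to three decimals as in the text.
cell_means = {combo: round(mean(values), 3) for combo, values in cells.items()}
for combo, value in cell_means.items():
    print(" x ".join(combo), "=", value, "ton/ha")
```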

If the three-way interaction (e.g., Nitrogen x Management x Variety) is significant, it indicates that the effect of one factor on the response variable (in this case, rice yield) depends on the levels of the other two factors. In such situations, it is usually very informative to discuss the three-way interaction to understand the more complex dynamics between the variables.

Why Discussing Three-Way Interaction is Important:

  1. Complexity of Dynamics: Three-way interaction can indicate complexities that cannot be explained by two-way interactions alone.
  2. Management Strategy: Knowing how three factors interact can affect management decisions in practice, such as how to design a fertilization plan or choose a variety.
  3. Deeper Understanding: Discussing three-way interaction provides a deeper understanding of how these factors influence each other, which can be very valuable in research and practical applications.
  4. Accuracy of Interpretation: If you only consider two-way interactions when a three-way interaction is significant, you may miss some important dynamics that could affect your interpretation of the data.

However, there are also drawbacks:

  1. Complexity of Analysis: Discussing three-way interaction can be very complex and requires a good understanding of the subject and statistical methods.
  2. Risk of Type I Error: Performing multiple statistical tests increases the risk of finding significant results by chance.
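
The Type I error risk can be quantified with a rough calculation. With three factors at three levels each there are 27 treatment combinations, and therefore many possible pairwise comparisons. The sketch below counts those pairs and computes the chance of at least one false positive under the simplifying assumption that the tests are independent, which pairwise comparisons sharing the same means are not. The result is an approximation for illustration, not a property of the SmartstatXL output.

```python
from math import comb


def familywise_error(alpha, n_tests):
    """Chance of at least one false positive among independent tests."""
    return 1.0 - (1.0 - alpha) ** n_tests


n_combinations = 3 * 3 * 3          # Nitrogen x Management x Variety
n_pairs = comb(n_combinations, 2)   # all pairwise comparisons of cell means

print(f"{n_pairs} pairwise comparisons")
for m in (1, 3, 10, n_pairs):
    print(f"{m:>3} tests at alpha = 0.05 -> {familywise_error(0.05, m):.3f}")
```

This is why simple-effect comparisons are usually restricted to the contrasts of interest, or made with one of the adjusted tests such as Bonferroni or Sidak.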

So, although more complex and requiring careful interpretation, discussing three-way interaction is generally recommended if you find significant effects in your analysis of variance.

Conclusion

The three-way interaction between Nitrogen, Management, and Variety shows very complex effects on rice yield. These results indicate that an optimal agricultural management strategy will require a highly coordinated approach, considering how these three factors interact with each other. This reinforces the importance of further research to understand how these factors interact in real field conditions.

ANOVA Assumption Checks

Formal Approach (Statistical Test)

Interpretation and Discussion: Anova Assumption Checks After Data Modification

Variance Homogeneity: Levene's Test

  • The F-value for Levene's Test is 1.563 with a P-value of 0.083.
  • Since the P-value is greater than 0.05, there is insufficient evidence to reject the null hypothesis that variances are equal across all groups.
  • Thus, the assumption of variance homogeneity appears to have been met.

Data Normality: Normality Test

  • Various normality tests (Shapiro-Wilk, Anderson Darling, D'Agostino Pearson, Lilliefors, and Kolmogorov-Smirnov) all yield P-values greater than 0.05.
  • This indicates that there is insufficient evidence to reject the null hypothesis that residuals are normally distributed.
  • Therefore, the assumption of data normality also appears to have been met.

Conclusion

After making modifications to the data, both the assumption of variance homogeneity and the assumption of data normality appear to have been met. This increases our confidence in the validity of the results from the analysis of variance and subsequent tests generated from this data.

The success in meeting these assumptions after data modification indicates that further analysis based on this data will be more reliable and valid. This also validates the steps taken to modify the data, including replacing outlier data.

Visual Approach (Plot Graphs)

  1. Normal P-Plot of Residual Data
    • In the Normal P-Plot graph, we want to see if the data points are close to the diagonal line. If so, this is a strong indication that the residual data is normally distributed.
    • From the graph, it appears that data points tend to follow the diagonal line, although there are some deviations. This suggests that the normality assumption is generally met, despite some minor deviations.
  2. Residual Data Histogram
    • The histogram is used to check the shape of the residual data distribution. A shape approaching a normal distribution (bell-shaped) indicates that the normality assumption is met.
    • From the histogram, it appears that the residual data is distributed in a shape that approaches a normal distribution. Although there are some deviations, the overall shape of this distribution suggests that the normality assumption of residual data has been met.
  3. Residual vs. Predicted Plot
    • In the Residual vs. Predicted plot, we want to see if the residuals are randomly distributed around the horizontal zero line. If so, this indicates that the assumption of homoscedasticity (constant residual variance across the range of predicted values) is met.
    • Based on the Residual vs. Predicted plot, it appears that the residuals are relatively randomly distributed around the zero line. Although there are some deviations, this pattern generally supports the homoscedasticity assumption.
  4. Standard Deviation vs. Mean
    • The Standard Deviation vs. Mean graph is used to check the variance homogeneity assumption. If the points are randomly distributed around a horizontal line, then the variance homogeneity assumption is considered to be met.
    • From the graph, it appears that the points are distributed in a relatively random pattern, although there are some deviations. This suggests that, generally, the variance homogeneity assumption has been met, with some minor exceptions.

Taking all these graphs into consideration, we can conclude that the assumptions required for the analysis of variance (ANOVA) have mostly been met. This adds to our confidence in the validity and reliability of the statistical analysis results that have been performed.