Enhancing Excel's functionality, SmartstatXL comes as an Add-In specialized in experimental data analysis. Its strengths include the analysis of variance for RBD Factorial, an experimental design that adopts Randomized Block Design as its basic designs. While currently focusing more on balanced designs, SmartstatXL also has the capability to analyze experimental data based on different mixed models.
Specific features for Factorial experiments in SmartstatXL include:
- Factorial RBD: Refers to Factorial experiments where each observational unit is measured only once.
- Factorial RBD: Sub-Sampling: Tailored for repeated measurements, with an option to draw sub-samples from a single observational unit. For instance, in one observational unit (treatment 3Dok1, repetition 1), there are 10 plants that are measured.
- Factorial RBD: Repeated Measure: Specifically for observations made periodically on a single observational unit, such as every 14 days.
- Factorial RBD: Multi-Location/Season/Year: An ideal solution for experiments conducted in different locations, seasons, or years.
If a significant treatment effect is found, SmartstatXL makes it easy to perform Post hoc Tests to compare the average treatment values. Some available options include: Tukey, Duncan, LSD, Bonferroni, Sidak, Scheffe, REGWQ, Scott-Knott, and Dunnet.
Case Example
There is an experiment aiming to understand the effect of food on the weight gain of lambs. There are three factors being tested, namely: Lysine (4 levels), Methionine (3 levels), and high-protein soybean meal (2 levels). Randomized Block Design consisting of two groups of lambs is used as the basic design. Weight gain is measured in specific weight units. The experimental data is as follows:
Block | ||||
Lysine | Methionine | Protein | 1 | 2 |
1 | 1 | 1 | 1.11 | 0.97 |
2 | 1.52 | 1.45 | ||
2 | 1 | 1.09 | 0.99 | |
2 | 1.27 | 1.22 | ||
3 | 1 | 0.85 | 1.21 | |
2 | 1.67 | 1.24 | ||
2 | 1 | 1 | 1.30 | 1.00 |
2 | 1.55 | 1.53 | ||
2 | 1 | 1.03 | 1.21 | |
2 | 1.24 | 1.34 | ||
3 | 1 | 1.12 | 0.96 | |
2 | 1.76 | 1.27 | ||
3 | 1 | 1 | 1.22 | 1.13 |
2 | 1.38 | 1.08 | ||
2 | 1 | 1.34 | 1.41 | |
2 | 1.40 | 1.21 | ||
3 | 1 | 1.34 | 1.19 | |
2 | 1.46 | 1.39 | ||
4 | 1 | 1 | 1.19 | 1.03 |
2 | 0.80 | 1.29 | ||
2 | 1 | 1.36 | 1.16 | |
2 | 1.42 | 1.39 | ||
3 | 1 | 1.46 | 1.03 | |
2 | 1.62 | 1.27 | ||
Cited from:
Gaspersz, Vincent. 1991. Experimental Design Methods: For Agricultural Sciences, Engineering, and Biology. Bandung: Armico, 1991. p. 236.
Steps for Analysis of Variance (Anova) and Post hoc Tests:
- Ensure that the worksheet (Sheet) you wish to analyze is active.
- Place the cursor on the Dataset. (For information on creating a Dataset, please refer to the 'Data Preparation' guide).
- If the active cell is not on the dataset, SmartstatXL will automatically detect and select the appropriate dataset.
- Activate the SmartstatXL Tab
- Click Menu Factorial > Factorial RBD.

- SmartstatXL will display a dialog box to confirm whether the Dataset is correct (usually the cell address for the Dataset is automatically selected correctly).

- After confirming the Dataset is correct, press the Next Button
- The following Anova – Factorial RBD Dialog Box will then appear:

- There are three stages in this dialog. In the first stage, select the Factors and at least one Response to be analyzed.
- When selecting Factors, SmartstatXL will provide additional information on the number of levels and the names of these levels. In RBD experiments, Replications are included as a factor.
- The details of the Anova STAGE 1 Dialog Box can be seen in the following image:

- After confirming the Dataset is correct, press the Next Button to proceed to the Anova Stage-2 Dialog Box
- The dialog box for the second stage will appear.

- Adjust the settings according to your research method. In this example, the Post hoc Test used is the REGWQ Test.
- To configure additional output and default values for subsequent outputs, press the "Advanced Options…" button.
- Here is the Advanced Options Dialog Box:

- After completing the settings, close the "Advanced Options" dialog box.
- Next, in the Anova Stage 2 Dialog Box, click the Next Button.
- In the Anova Stage 3 Dialog Box, you will be asked to specify the average table, IDs for each Factor, and rounding of average values. The details can be seen in the following image:

- As a final step, click "OK"
Analysis Results
Below is the Output of the Factorial Randomized Block Design (RAK) Analysis of Variance (Anova):
Analysis Information

Summary of Experimental Design
- Experimental Design: Factorial Randomized Block Design (RAK)
- Response (Dependent Variable): Increase in lamb body weight
- Independent Factors:
- Block: 2 levels
- Lysine: 4 levels
- Methionine: 3 levels
- Protein (soybean-based): 2 levels
- Post hoc Test: REGWQ (Ryan-Einot-Gabriel-Welsch-Quelett range test) used for comparing means between groups.
Note:
In the table of contents, there is a note about "Violation of Assumptions" on the weight response. This indicates that the basic assumptions of the analysis of variance may have been violated. These assumptions include data normality, homogeneity of variances, and independence of observations. Violations of these assumptions are important to consider when interpreting the results.
Analysis of Variance

Interpretation and Discussion
- Block (K)
- Block has a significant effect on the increase in lamb body weight at the 5% significance level. This is indicated by the F-Calculated (4.854), which is greater than F-0.05 (4.279), and a P-value of 0.038.
- Lysine (L)
- Lysine does not have a significant effect on the increase in lamb body weight. This is evidenced by the F-Calculated (0.518), which is smaller than F-0.05 (3.028), and a P-value of 0.674.
- Methionine (M)
- Methionine also does not have a significant effect on the increase in lamb body weight. This is indicated by the F-Calculated (0.956), which is smaller than F-0.05 (3.422), and a P-value of 0.399.
- Protein (P)
- Protein has a highly significant effect on the increase in lamb body weight at the 1% significance level. This is indicated by the F-Calculated (19.492), which is much greater than F-0.01 (7.881), and a P-value approaching 0.
- Interaction among factors
- The interaction between Lysine and Methionine (L x M) does not have a significant effect, with F-Calculated (1.543) and a P-value of 0.209.
- The interaction between Lysine and Protein (L x P) almost has a significant effect at the 5% significance level, but has not reached significance. This is indicated by the F-Calculated (2.911), which is close to F-0.05 (3.028), and a P-value of 0.056.
- The interaction between Methionine and Protein (M x P) does not have a significant effect, with F-Calculated (1.495) and a P-value of 0.245.
- The interaction among Lysine, Methionine, and Protein (L x M x P) does not have a significant effect, with F-Calculated (0.415) and a P-value of 0.861.
- Coefficient of Variation (KK)
- The coefficient of variation of 13.16% indicates the relative variation of the analyzed data.
Conclusion
From the results of the analysis of variance, it can be concluded that the Block and Protein factors have a significant effect on the increase in lamb body weight. Meanwhile, the Lysine, Methionine, and all interactions among these factors do not have a significant effect. Therefore, to increase lamb body weight, it is necessary to consider Protein as one of the main factors that need to be addressed.
Post hoc Test
Based on the Analysis of Variance, only the single effect of Protein is significant, while the single effects of other factors are not significant. Similarly, the interaction effects are all non-significant.
Single Effect of Lysine and Methionine

Single Effect of Protein (P)

Interpretation and Discussion
- Critical Value
- Based on the critical value table, the average difference between two levels of protein that will be considered significant must be greater than 0.0990 (REGWQ 0.05).
- Average Weight Table
- Protein level 1 yields an average weight of 1.1542 with a confidence interval (CI) ± 0.0667.
- Protein level 2 yields a higher average weight, specifically 1.3654 with a confidence interval (CI) ± 0.0850.
- The difference between the average weights for Protein level 1 and 2 is 1.3654−1.1542=0.2112, which is significantly greater than the critical value REGWQ 0.05 of 0.0990. Therefore, these averages are considered significantly different.
- Letter Notation on Averages
- The letter notations 'a' and 'b' on the average weights indicate that the two protein levels yield significantly different results at the 0.05 significance level. This means that treatment with Protein level 2 results in a higher weight gain compared to Protein level 1.
Conclusion
Based on the REGWQ post hoc test, it can be concluded that Protein level 2 has a better effect on the weight gain of lambs compared to Protein level 1. Therefore, to improve weight gain in lambs, it is recommended to use Protein level 2.
Interaction Table

There are two formats for presenting the average table for interaction effects. You can choose one or both. The First Format is in the form of a one-way table, where treatment levels are combined and the layout is similar to the single effect table. The Second Format tests simple effects and is presented in a two-way table format. The display option for average tables and graphs can be configured through the Advanced Options (refer back to step 15 from the Analysis of Variance Steps).
Examination of Anova Assumptions
Formal Approach (Statistical Tests)

Interpretation and Discussion
Levene's Test for Homogeneity of Variances
Levene's Test is used to test the assumption of homogeneity of variances among groups. From the results provided:
- The F-Value is very large, with a value of 8.27×1029, and a P-value of 0.000.
- This indicates that the variances among groups are not homogeneous, thus violating the assumption of homogeneity of variances.
The lack of homogeneity in variances can be influenced by many factors, including data distribution, sample size, or outliers. Given that the number of replications is only 2, this could be one reason why the F-Value is very large. When the sample size is very small, the variability of the sample may not accurately represent the population, leading to significant differences in variances between groups.
Normality Test
The normality test is used to test the assumption of normal distribution of residuals. From the results provided:
- All the P-values from the normality tests (Shapiro-Wilk's, Anderson Darling, D'Agostino Pearson, Liliefors, and Kolmogorov-Smirnov) are greater than 0.05, except for Liliefors, which gives a p < 0.20.
- This indicates that the residuals are normally distributed and the assumption of normality is not violated.
Conclusion
Based on the examination of assumptions:
- The assumption of homogeneity of variances is violated based on Levene's Test.
- The assumption of normality is not violated based on various normality tests.
Regarding the very large F-Value in Levene's Test, one possible reason could indeed be the low number of replications, which is only 2. When the sample size is small, the variance estimate may become unstable and can lead to large differences in variances between groups.
Visual Approach (Plotting Graphs)

Interpretation and Discussion
- Normal P-Plot of Residual Data
- The Normal P-Plot is used to check whether the residual data follows a normal distribution. If the points on the plot follow the diagonal line, this indicates that the data distribution is approximately normal.
- From the graph, most points follow the diagonal line, but some points deviate at the upper end. This indicates that the residual data has a distribution that is close to normal with slight deviations.
- Residual Data Histogram
- The histogram is used to visualize the frequency distribution of the data.
- From the histogram, the residual data distribution appears to be bell-shaped, indicating a normal distribution. However, there is a slight skew to the right (positively skewed).
- Residual vs. Predicted Plot
- The Residual vs. Predicted Plot is used to check the homoscedasticity (variance uniformity) of the data. A random and uniform distribution of points indicates that the residual variance is constant across all predicted values.
- From the graph, the points appear to be randomly and uniformly distributed around the horizontal zero line. However, there are a few points with extreme residual values.
- Standard Deviation vs. Mean
- The Standard Deviation vs. Mean graph is used to check data variability.
- Points on the graph appear to be distributed around the horizontal line, indicating that the standard deviation is consistent across the mean.
Conclusion
- The residual data has a distribution that is close to normal with a slight skew to the right.
- The residual variance appears to be uniform across all predicted values, although there are a few extreme residual values that need to be considered.
- Data variability appears to be consistent across the mean, albeit with some deviations.
Note that while graphs provide good visual insights into data assumptions, it is always advisable to complement them with statistical tests (as you have done previously) for further confirmation.
Box-Cox Transformation, Residual Table, and Outlier Data Information

Interpretation and Discussion
Box-Cox Transformation
The Box-Cox Transformation is used to transform data distributions to approximate normality. The lambda value obtained is 1.169, indicating that the appropriate transformation is "No Transformation." This means the data does not require transformation as it already approximates a normal distribution.
Residual Values and Outlier Data Examination
This table provides information about residuals, leverage, and various other measures used to detect outliers and influential points.
- Residual: The difference between the actual observation values and the values predicted by the model.
- Leverage: Measures the extent to which an observation influences its own prediction. Observations with high leverage are points that have extreme predictor (X) values.
- Studentized Residual and Studentized Deleted Residual: Normalized residuals. If their absolute values are large, they indicate outliers.
- Cook's Distance and DFITS: Measure the influence of an observation on all predictions. If these values are high, the observation can be considered as an influential point.
From the provided table, some observations are labeled as "Outliers." This means these observations have large residuals and could influence the model.
Conclusion
- Your data appears to have some outliers that could influence the model. You may want to consider revisiting these outlier data or taking corrective actions.
- Although the Box-Cox Transformation indicates that your data already approximates normality, if other model assumptions are not met, you may consider this transformation as one of the solutions.
- It is important to always check model assumptions to ensure that your analysis results are valid and reliable.








