SmartstatXL, serving as an Excel Add-In, is specifically designed to facilitate the process of experimental data analysis. It includes the analysis of variance for one-factor Randomized Block Designs (RBD). For the time being, SmartstatXL focuses on balanced designs (Balanced Design). Nevertheless, it's not just standard designs that can be analyzed. SmartstatXL also offers flexibility in handling other mixed models of experimental data.
Below is the list of menus and sub-menus for one-factor RBD experiments that can be found in SmartstatXL:
- RBD: This refers to one-factor RBD experiments where each observational unit is measured only once.
- RBD: Sub-Sampling: Specifically for repeated observations, allowing subsamples to be drawn from the same observational unit. For example, measurements are conducted on 10 plants from the same observational unit (treatment 3Dok1, repetition 1).
- RBD: Repeated Measure: This is provided for repeated measurements that are taken periodically from a single observational unit, such as at 14-day intervals.
- RBD: Multi-Location/Season/Year: This option is ideal for experiments conducted across multiple locations, seasons, or different years.
- Friedman Test (Two Way Anova): If ANOVA assumptions are not met, there is an option for non-parametric analysis as an alternative for one-factor RBD.
If there is a significant treatment effect, comparisons between the average treatment values can be made through post-hoc tests. Several options available in SmartstatXL include: Tukey, Duncan, LSD, Bonferroni, Sidak, Scheffe, REGWQ, Scott-Knott, and Dunnet.
Case Example
There is a study that tests the combination of NPK fertilizers on various soil chemical properties and rice yields. The factors being investigated are six levels of NPK fertilizer combinations, namely Control, PK, N, NP, NK, NPK, each repeated four times. The experiment uses a Randomized Block Design (RBD). The data is presented in the following table:
Fertilizer | Group | Results | pH | N-Total | P-Available | VAT |
1 | 1 | 24.20 | 5.93 | 0.122 | 19.08 | 0.148 |
1 | 2 | 32.55 | 5.30 | 0.145 | 23.50 | 0.128 |
1 | 3 | 27.90 | 6.25 | 0.121 | 21.22 | 0.135 |
1 | 4 | 37.70 | 5.50 | 0.126 | 26.89 | 0.150 |
2 | 1 | 33.30 | 5.63 | 0.121 | 30.89 | 0.163 |
2 | 2 | 32.25 | 5.50 | 0.129 | 28.56 | 0.120 |
2 | 3 | 26.25 | 6.25 | 0.128 | 33.23 | 0.197 |
2 | 4 | 37.95 | 5.60 | 0.141 | 35.73 | 0.193 |
3 | 1 | 38.65 | 6.05 | 0.145 | 15.56 | 0.128 |
3 | 2 | 41.35 | 5.50 | 0.154 | 25.18 | 0.142 |
3 | 3 | 45.90 | 5.40 | 0.157 | 27.94 | 0.136 |
3 | 4 | 44.55 | 5.20 | 0.158 | 24.46 | 0.121 |
4 | 1 | 41.50 | 5.80 | 0.116 | 28.80 | 0.120 |
4 | 2 | 44.75 | 5.45 | 0.148 | 30.79 | 0.113 |
4 | 3 | 44.60 | 5.73 | 0.122 | 20.86 | 0.121 |
4 | 4 | 44.80 | 5.29 | 0.165 | 31.74 | 0.106 |
5 | 1 | 41.10 | 5.83 | 0.114 | 19.67 | 0.178 |
5 | 2 | 42.55 | 5.60 | 0.132 | 22.37 | 0.135 |
5 | 3 | 42.85 | 5.83 | 0.140 | 21.92 | 0.150 |
5 | 4 | 45.50 | 5.50 | 0.162 | 27.01 | 0.194 |
6 | 1 | 45.75 | 5.65 | 0.135 | 24.38 | 0.133 |
6 | 2 | 47.85 | 5.30 | 0.158 | 27.10 | 0.150 |
6 | 3 | 43.25 | 6.23 | 0.123 | 16.39 | 0.171 |
6 | 4 | 46.10 | 5.55 | 0.163 | 32.60 | 0.207 |
Steps for Analysis of variancein RBD:
- Ensure that the worksheet (Sheet) you wish to analyze is active.
- Place the cursor on the Dataset. (For information on creating a Dataset, please refer to the 'Data Preparation' guide).
- If the active cell is not on the dataset, SmartstatXL will automatically detect and select the appropriate dataset.
- Activate the SmartstatXL Tab
- Click on the Menu One Factor > Randomized Block Design (RBD).

- SmartstatXL will display a dialog box to confirm whether the Dataset is correct or not (usually the cell address for the Dataset is automatically selected correctly).

- After confirming that the Dataset is correct, press the Next Button
- The following Anova – Single-Factor RBD dialog box will appear:

- There are 3 Stages. In the first stage, choose the Factor and at least one Response to be analyzed (as shown in the picture above)!
- When you select a Factor, SmartstatXL will provide additional information about the number of levels and their names. In RBD experiments, Repetitions are included as a factor.
- Details of the Anova STEP 1 dialog box can be seen in the following image:

- After confirming the Dataset is correct, press the Next Button to proceed to Anova Step 2 Dialog Box
- The dialog box for the second step will appear.

- Adjust the settings based on your research methodology. In this example, the Post Hoc test used is Duncan's Test.
- To set additional output and default values for subsequent output, press the "Advanced Options…" button.
- Here is the appearance of the Advanced Options dialog box:

- Once settings are complete, close the "Advanced Options" dialog box.
- Next, in the Anova Step 2 dialog box, click the Next Button.
- In the Anova Step 3 dialog box, you will be asked to specify the average table, ID for each Factor, and rounding of average values. The details can be seen in the following image:

- As the final step, click "OK"
Analysis Results
Analysis Information

In the conducted research, Analysis of Variance (ANOVA) is employed to assess the impact of NPK fertilizer combinations on several soil chemical properties and rice yield. The experimental design utilized is the Randomized Complete Block Design (RCBD) with a single factor. In this analysis, the observed responses include Rice Yield, Soil pH, Total Nitrogen, Available Phosphorus, and Kdd. The factor tested consists of 4 levels of blocks and 6 levels of fertilizer.
From the ANOVA results, responses to all five variables have been successfully analyzed, except for the Kdd variable. There is a violation of assumptions in the Kdd variable, indicating that the data may not meet the criteria of normality or homogeneity of variances. This assumption violation could affect the accuracy of ANOVA interpretation.
However, SmartstatXL has found a solution to address the assumption violation in the Kdd variable by applying the Inverse Square transformation. This transformation is conducted by altering each data value with the formula Y=1/Y2. Hence, the transformed data is expected to meet the necessary assumptions for variance analysis, making the interpretation of the results more valid.
It is crucial to always check assumptions in analysis of varianceto ensure that the results can be interpreted correctly. If assumptions are not met, data transformation or other analysis methods may be needed to obtain accurate interpretation.
Response Results
Variance Analysis

In the ANOVA for the "Yield" response, there are two sources of variance, namely "Block" and "Fertilizer".
- Block (B):
- With degrees of freedom (DF) of 3, Block contributes a variance of 98.5553 with a Mean Square (MS) of 32.8518.
- The F-value for Block is 3.681 with a p-value of 0.036. Given that F-0.05 (F-value at 5% significance level) is 3.287 and F-value > F-0.05, the impact of Block on Rice Yield is significant at the 5% level. This means that there is a significant difference among the blocks in this study related to Rice Yield.
- Fertilizer (F):
- With DF of 5, Fertilizer contributes a variance of 837.3980 with an MS of 167.4796.
- The F-value for Fertilizer is 18.767 with a p-value almost close to 0 (that is, 0.000). Given that F-0.01 (F-value at 1% significance level) is 4.556 and F-value > F-0.01, the impact of Fertilizer on Rice Yield is highly significant at the 1% level. This shows that there is a very significant difference among the types of fertilizers used in this study related to Rice Yield.
- Error:
- The error variance is 133.8641 with an MS of 8.9243. This represents variability that cannot be explained by the existing factors in the model.
- CV (Coefficient of Variation):
- The CV value is 7.52%, indicating how much of the data variability can be explained by the model.
Conclusion:
Based on the variance analysis, both Block and Fertilizer factors have a significant impact on Rice Yield. Therefore, the type of fertilizer and experimental block used influence rice production. Specifically, the impact of Fertilizer on Rice Yield is highly significant at the 1% level, highlighting the importance of selecting the appropriate type of fertilizer to enhance rice production.
Post hoc Test
Based on the variance analysis, treatment significantly affects the yield. The mean difference in yield among the treatments is presented in table and graph form. Table and graph display settings can be adjusted through Advanced Options (see again step 15 from the Steps of Variance Analysis).
Interpretation and Discussion:
Following the individual fertilizer effect analysis on Yield, a post hoc test was conducted to assess whether there is a significant difference in rice production among various fertilizer combinations.
Critical Values:
- The Standard Error for comparison between two types of fertilizer is 1.4937.
- The Duncan values at the 0.05 significance level for various comparisons are presented, ranging from 4.5019 to 5.0128. These values are used to compare differences between the mean yields from various fertilizer combinations.
Table of Mean Yield Values:
- Control Fertilizer provides a mean rice production yield of 30.59 with a confidence interval (CI) ± 9.30 and is marked with the letter "a".
- PK Fertilizer provides a mean rice production yield of 32.44 with a CI ± 7.66 and is also marked with the letter "a".
- N, NP, NK, and NPK Fertilizers show higher mean rice production yields, ranging from 42.61 to 45.74, with a lower CI compared to Control and PK Fertilizers. All these fertilizer combinations are marked with the letter "b".
Conclusion:
Based on the Duncan post hoc test at the 0.05 significance level, it can be concluded that:
- Control and PK Fertilizers yield relatively lower rice production and are not significantly different from each other.
- N, NP, NK, and NPK Fertilizers yield higher rice production and are not significantly different from each other. However, rice production from N, NP, NK, and NPK Fertilizers is significantly different compared to Control and PK Fertilizers.
This suggests that the choice of fertilizer combination has a significant impact on rice production. The combinations of N, NP, NK, and NPK Fertilizers appear to be more effective in increasing rice production compared to Control and PK Fertilizers.
Examination of ANOVA Assumptions
Formal Approach (Statistical Test)

Interpretation and Discussion:
In Analysis of Variance (ANOVA), there are several assumptions that must be met for the analysis results to be correctly interpreted. These assumptions include the homogeneity of variance (equal variance across all groups) and the normality of residuals.
Levene's Test for Homogeneity of Variance:
- Levene's test is conducted to examine whether the variance across various groups is equal (homogeneity of variance).
- With an F-value of 2.096 and a p-value of 0.113 (greater than 0.05), there is no strong evidence to reject the null hypothesis that variances are equal across all groups. This means the assumption of homogeneity of variance is satisfied for this analysis.
Tests for Normality:
- Various statistical tests are used to examine the distribution normality of residuals.
- Based on the Shapiro-Wilk, Anderson Darling, D'Agostino Pearson, Liliefors, and Kolmogorov-Smirnov tests, all p-values are greater than 0.05. This suggests that the residuals from the model are normally distributed, thus the assumption of normality is satisfied.
Conclusion:
Based on the examination of ANOVA assumptions, both the assumption of homogeneity of variance and the assumption of normality of residuals are met. This means the Analysis of Variance (ANOVA) carried out can be correctly interpreted, and the results obtained from this analysis are considered valid.
Visual Approach (Plotting Graphs)

Besides formal testing, assumptions can also be visually inspected using the included residual plots. Inspection can be done using the Normal Probability Plot (Normal P-Plot), Histogram, and Residual vs. Predicted plots.
Normal P-Plot for Residuals:
The Normal P-Plot is used to check whether the data (in this case, residuals) follows a normal distribution. Ideally, points on this plot should follow a straight diagonal line. If points deviate from the diagonal line, this may indicate deviations from normality.
From the graph visualization:
- Most points appear to follow the diagonal line, indicating that most residuals are normally distributed.
- However, there are some points, particularly at the ends of the graph, that deviate from the diagonal line. This indicates the presence of some residuals that are not normally distributed and could be considered outliers.
Therefore, based on the visualization of the Normal P-Plot, residuals mostly follow a normal distribution with some exceptions at extreme values. These deviations may affect the analysis results and the validity of the model. Hence, it is crucial to consider appropriate actions, such as replacing outlier data or performing data transformation, to obtain a more reliable model.
Histogram for Residuals:
The histogram plot is used to visualize the distribution of data. In this context, we use the histogram to examine the distribution of residuals. The histogram should display a distribution that approximates a bell shape (normal distribution). Deviations from this shape (for example, skewed or long-tailed distributions) may indicate violations of the normality assumption.
From the graph visualization:
- The shape of the distribution seems to approximate a bell shape, indicating that the distribution of residuals tends to approximate a normal distribution.
- There is no significant skewness or clear asymmetry in the histogram, indicating that the distribution of residuals is likely symmetrical.
- Although there are some peaks, there is no clear mode or extremely high frequency at any specific point, indicating that no group of residuals is overly dominant.
Therefore, based on the visualization of the residual histogram, the distribution of residuals appears to approximate a normal distribution. This indicates that the normality assumption of residuals, which is one of the key assumptions in variance analysis, is likely met. This adds confidence to the validity and reliability of the analysis of variancemodel used.
Residual vs Predicted:
- Homoscedasticity:
- If residuals are randomly scattered around the horizontal line at value 0 and do not show any particular pattern (for example, funnel shape or U-shape), then the assumption of homoscedasticity is considered met. If there is a pattern, such as a funnel shape, this indicates heteroscedasticity, meaning the variability of residuals changes with the predicted values.
- The points appear to be randomly scattered around the horizontal line at value 0 and do not show any particular pattern, indicating that the variability of residuals is consistent across the range of predicted values. This suggests that the assumption of homoscedasticity is likely met.
- Independence:
- If residuals are randomly scattered and do not show any autocorrelation pattern (for example, wave-like or cyclical patterns), then the assumption of independence is considered met.
- There is no clear wave-like or cyclical pattern, indicating that the residuals are independent and not autocorrelated.
Therefore, based on the visualization of the predicted vs. residual graphs, both the assumptions of homoscedasticity and independence of residuals seem to be met. This indicates that the analysis of variancemodel has met two crucial assumptions, and the results from this model can be considered reliable.
Standard Deviation vs. Mean Graph
The Standard Deviation vs. Mean Graph is used to examine the assumption of homoscedasticity in the data in a manner distinct from the Predicted vs. Residual graph.
In the Standard Deviation vs. Mean Graph:
- The horizontal axis (X-axis) represents the mean value of each group or category.
- The vertical axis (Y-axis) represents the standard deviation of each group or category.
Here's how to interpret the Standard Deviation vs. Mean Graph:
Homoscedasticity: If the points on the graph do not exhibit a clear correlation pattern (e.g., a straight line either ascending or descending), then the assumption of homoscedasticity is considered to be met. Conversely, if a clear positive or negative correlation between the mean and standard deviation is present, this indicates heteroscedasticity, meaning that the data variability changes along with the mean value.
From the visualization of the graph:
- The points appear to be scattered without a clear correlation pattern between the standard deviation and the mean. This indicates that data variability (standard deviation) is consistent across the range of mean values.
Therefore, based on the visualization of the Standard Deviation vs. Mean Graph, the assumption of homoscedasticity appears to be met. This suggests that the data variability is consistent across the range of mean values, making further analysis of this data considered valid.
Box-Cox Transformation and Residual Table

Interpretation and Discussion:
- Box-Cox Transformation:
- Box-Cox is a transformation method used to make data that do not meet the assumption of normality approximate normality. In this case, the Lambda value found is 2.000, which suggests that the appropriate transformation for this data is Square Transformation (Y2).
- Residual Value and Outlier Examination:
- Residual is the difference between the actual observed value and the value predicted by the model. A large residual may indicate the presence of an outlier.
- Leverage indicates how far the predictor value (X) is from the mean predictor value. A high leverage value can indicate an extreme or "outlier" observation in terms of its predictor value.
- Studentized Residual is a normalized residual. If its absolute value is large, this may indicate the presence of an outlier.
- Cook's Distance measures the influence of one observation on all other observations. A high Cook's Distance value indicates an observation with significant influence on the model.
- DFITS is a measure of the influence of an observation on the predictions made by the model. A high DFITS value indicates an observation with significant influence on model predictions.
Conclusion:
Based on the data presented, it is crucial to examine each observation for potential outliers or extreme values that could affect the model. Using the Box-Cox transformation can help in ensuring that the model assumptions are met. If these assumptions are not met, it could affect the validity and interpretation of the analysis results. In this case, the Square Transformation (Y2) is recommended to ensure the data meet the assumption of normality.
Soil pH Response
Variance Analysis

Further Tests
Based on the Variance Analysis, the Treatment does not have a significant impact on the soil pH value. There are three options for adding letter annotations to the Average Value Table. The presentation option for the Average Value Table can be modified by accessing "Advanced Options" in the Anova Dialog Box (as explained in step 15)
Three Options for Presenting the Average Value Table
First: The average values are presented without any letter annotations following them:
Second: All average values are given the same letter annotation as a marker.
Third: Letter annotations are provided based on the results of each further test for each average value.
Sometimes, Anova results indicate that the Treatment is not significant, yet further tests show significant differences between groups. This often occurs in Duncan's Further Test. The sensitive nature of Duncan's Further Test can lead to interpreting differences between pairs of average values as significant, especially when the p-value (probability of significance) is close to the predetermined significance threshold.
In such situations, it is recommended to present the average table according to the conclusions drawn from the variance analysis, either in accordance with the First Presentation or the Second Presentation.








