Assumptions for hypothesis testing based on the single-factor ANOVA model actually concern the residual, or error, term (ε_ij). Many references state that single-factor ANOVA is quite robust to violations of these assumptions, i.e., the F test remains reliable even when the assumptions are not fully met. However, the degree of robustness is very difficult to measure and also depends on having balanced sample sizes. The F test can become very unreliable when sample sizes are unbalanced, especially if this is combined with non-normal data and non-homogeneous variances. Therefore, I strongly recommend checking the ANOVA assumptions before proceeding to the analysis stage.
What happens if we analyze data that does not actually meet the assumptions of the analysis of variance? The conclusions drawn will not reflect the actual situation and may even be misleading! Thus, before conducting an analysis of variance, we must first check whether the data meet its basic assumptions.
The general strategy for examining the ANOVA assumptions, and the order in which they should be checked, is discussed in detail by Dean and Voss (1999). They focus on examining residual plots, for the following reason: although inspecting residual plots is more subjective than running formal tests, the plots are more informative about the nature of a problem, its consequences, and the corrective actions that can be taken.
Take a look at the linear models for the completely randomized design (CRD/RAL, i.e., one-way ANOVA) and the randomized complete block design (RCBD/RAK):
Linear model for the CRD (one-way ANOVA):
Y_ij = μ + τ_i + ε_ij
and the linear model for the RCBD:
Y_ij = μ + τ_i + β_j + ε_ij,
where ε_ij ~ NIID(0, σ²)
NIID = Normal, Independent, and Identically Distributed, with mean 0 and variance σ².
In practice, the implied meaning of the model is:
- Observational data in each treatment group come from a normally distributed population (this is necessary so that ε_ij is normally distributed).
- All treatment groups have homogeneous variance (this is necessary so that ε_ij has the same variance at every treatment level i).
- Experimental units are assigned randomly to each treatment group (this is necessary so that the ε_ij are independent of each other).
- The effects of treatment (τ_i), environment/block (β_j), and error (ε_ij) are additive, meaning that the level of the response is due solely to the added effects of the treatments and/or groups. The response value (Y_ij) is the general mean (μ) plus the treatment effect (τ_i) and the error (ε_ij).
Thus, the assumptions that must be met in performing the analysis of variance are: normality, homoscedasticity (homogeneous variance), independence (of the errors), and additivity.
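These models are easy to simulate, which also makes the assumptions concrete. The following sketch (Python with NumPy; all numeric values are illustrative assumptions, not taken from this article) generates data from the CRD model Y_ij = μ + τ_i + ε_ij with NIID errors:

```python
import numpy as np

rng = np.random.default_rng(42)

mu = 8.0                      # general mean (illustrative value)
tau = np.array([1.0, 3.0])    # treatment effects (illustrative values)
sigma = 1.0                   # error standard deviation
n = 1000                      # replications per treatment (large, for a stable demo)

# Y_ij = mu + tau_i + eps_ij, with eps_ij ~ NIID(0, sigma^2)
y = mu + tau[:, None] + rng.normal(0.0, sigma, size=(len(tau), n))

# Sample means should be close to mu + tau_i, i.e. about 9 and 11
print(y.mean(axis=1))
```

Every assumption listed above is an assumption about ε_ij in this generating process: change the error distribution, its variance per group, or its independence, and the corresponding ANOVA assumption is violated.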
1. Normality
Normality means that the residual (ε_ij) in each treatment (group), associated with the observed value Y_ij, must be normally distributed. If the residuals are normally distributed, then Y_ij will also be normally distributed. If the sample sizes and variances are equal, the ANOVA F test is very robust to violations of this assumption. The impact of non-normality alone is not too serious, but if non-normality is accompanied by heterogeneous variances, the problem can be serious!
1.1. Causes of Non-normality
In practice, it is rare to find a distribution of observed values with the ideal, normal shape; on the contrary, we often find shapes that tend to be non-normal (skewed or multimodal) because of sampling variability. This variability arises when the sample size is too small, for example fewer than 8-12 observations (Keppel & Wickens, 2004; Tabachnick & Fidell, 2007), or when there are outliers. Outliers usually arise from errors, especially data-entry errors, incorrect coding, participants' failure to follow instructions, and so on.
Some examples of cases in which the data distribution tends to be non-normal:
- Counts of parasites in wildlife
- Counts of bacteria
- Data in the form of proportions or percentages
- Arbitrary scales, such as a 10-point taste-test scale
- Weighing very small objects, where the limits of the weighing instrument matter.
Another thing that can violate the normality assumption is randomization that does not follow the randomization principles of the experimental design; this can cause the data to be distributed non-normally.
1.2. Consequence
The consequence of analyzing data that are not normally distributed is that the actual Type I error rate will be underestimated or overestimated relative to the nominal significance level set for the experiment.
However, remember that among the conditions for model adequacy, the normality assumption is less critical than the others, provided that:
- The sample sizes are large and balanced.
- All samples have roughly the same distribution, the sample sizes are equal or nearly equal, and there are no extreme deviations; in that case there is little need for normality testing.
1.3. Relationship with homogeneity of variance
In practice there is a close relationship between normally distributed data and data with homogeneous variance: data with homogeneous variance will often also be normally distributed, whereas normally distributed data do not always have homogeneous variance.
1.4. Normality Testing
We can check the normality assumption in several ways.
- The normality check should be carried out for each treatment combination (on a cell-by-cell basis).
- Check for outliers, skewness, and bimodality.
- Histogram and stem-and-leaf plot of the observed or residual values.
- Box plots (side-by-side box plots of the observations or residuals in each treatment/group): the boxes should be symmetric.
- Coefficients of skewness and kurtosis: a sample from a skewed distribution will show a positive relationship between the mean and the variance.
- Plot of the treatment/group means against the variances: because the mean and the variance of a normal distribution are independent, this plot should show no relationship.
- Normal probability plot of the residuals against the predicted or observed values: the data can be considered normally distributed if the points follow the normal (diagonal) line.
- Formal tests: the Shapiro-Wilk test and the Kolmogorov-Smirnov (goodness-of-fit) test. According to some of the literature, however, graphical methods are much more informative than formal tests for checking the ANOVA assumptions before the analysis of variance is performed.
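As an illustration of the formal tests, the sketch below (Python with SciPy; the simulated groups and their parameters are assumptions for demonstration only) applies the Shapiro-Wilk test and a Kolmogorov-Smirnov test to within-group residuals:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)

# Three treatment groups (simulated normal data, for illustration)
groups = [rng.normal(10, 2, 30), rng.normal(12, 2, 30), rng.normal(15, 2, 30)]

# Residuals: deviation of each observation from its own group mean
residuals = np.concatenate([g - g.mean() for g in groups])

# Shapiro-Wilk test (H0: the residuals are normally distributed)
w, p_sw = stats.shapiro(residuals)

# Kolmogorov-Smirnov test against a normal with the sample's spread
p_ks = stats.kstest(residuals, "norm", args=(0.0, residuals.std(ddof=1))).pvalue

print(f"Shapiro-Wilk p = {p_sw:.3f}, K-S p = {p_ks:.3f}")
```

A p-value above the chosen significance level means the test finds no evidence against normality; as noted above, this should complement, not replace, the graphical checks.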
1.5. Solution
- Try to have the same number of replications in each treatment, because equal sample sizes make the F test much more robust to non-normality.
- Check for outliers; remove them if the data points are not representative, or verify that the data were recorded correctly.
- Another approach for reducing normality violations is to trim the most extreme observations, to reduce the effects of skewness and kurtosis; for example, remove the top and bottom 5 percent of the distribution (Anderson, 2001).
- Data transformation.
- Non-parametric tests.
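The last two remedies can be sketched as follows (Python with SciPy; the lognormal example data are an assumption, chosen to mimic skewed, count-like responses): a log transformation followed by ordinary one-way ANOVA, and a Kruskal-Wallis test on the raw data:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(7)

# Right-skewed (lognormal) data in three groups -- illustrative only
g1 = rng.lognormal(mean=1.0, sigma=0.8, size=25)
g2 = rng.lognormal(mean=1.3, sigma=0.8, size=25)
g3 = rng.lognormal(mean=1.6, sigma=0.8, size=25)

# Remedy 1: log transformation, then ordinary one-way ANOVA
f, p_anova = stats.f_oneway(np.log(g1), np.log(g2), np.log(g3))

# Remedy 2: non-parametric Kruskal-Wallis test on the raw data
h, p_kw = stats.kruskal(g1, g2, g3)

print(f"ANOVA on log data: p = {p_anova:.4f}")
print(f"Kruskal-Wallis:    p = {p_kw:.4f}")
```

For lognormal data the log transform also stabilizes the variance, which connects this remedy to the homogeneity assumption discussed next.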
» Testing for non-normality of observational data will be discussed in a separate topic
2. Homogeneity of Variance (homoscedasticity)
Another assumption underlying the analysis of variance is homogeneity of variance, or homoscedasticity. Homoscedasticity means that the variance of the residuals is constant. The homogeneity assumption requires the residual distribution for each treatment/group to have the same variance. In practice, this means that the values of Y_ij at each level of the independent variable vary around their mean by a similar amount.
- The variance of the residuals, and of the observations within the same group, should be homogeneous.
- The impact of variance inhomogeneity is more serious than non-normality because it directly affects the F test. It can inflate the Type I error rate (it looks as if there is a treatment effect when in fact there is none).
- Box plots of the observations should have a similar spread across the treatment groups (among groups).
- The spread of the residuals should be even when plotted against the group means.
Heterogeneous variance is a deviation from the basic assumptions of the analysis of variance; such data are not suitable for ANOVA as they stand. To carry out the analysis of variance, the data must have homogeneous variance.
2.1. Causes of Heteroscedasticity
First, the levels or classifications of a factor (independent variable), such as gender or variety, may have their own, naturally different variability. Second, manipulation of the treatment factor may cause the objects (plants, participants, etc.) to behave more similarly, or more differently, than the control. Third, the variability of the response (dependent variable) is related to the sizes of the samples we take; it can become a serious problem if the sample sizes are unbalanced (Keppel & Wickens, 2004).
2.2. Consequences of Heteroscedasticity
Inhomogeneous variance coupled with unequal sample sizes can be a serious problem in hypothesis testing with ANOVA. Violation of this assumption is more serious than violation of the normality assumption, because it seriously affects the sensitivity of the analysis of variance. Using simulated data, Wilcox et al. (1986) showed that:
- with four treatments/groups and equal sample sizes (n = 11), a ratio of the largest to the smallest standard deviation of 4:1 (i.e., a variance ratio of 16:1) produced an actual Type I error rate of 0.109 at a nominal significance level of 0.05;
- with the same setup but unequal sample sizes of 6, 10, 16, and 40, the Type I error rate could reach 0.275.
A larger variance paired with a smaller sample size inflates the Type I error rate, so the F test tends to be liberal: where we set the significance level at α = 0.05, the actual α is looser, for example 0.10. Conversely, a larger variance paired with a larger sample size reduces power, so the F test tends to be conservative: where we set α = 0.05, the actual α is tighter, for example 0.01 (Coombs et al., 1996; Stevens, 2002).
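The inflation of the Type I error rate described above can be reproduced qualitatively by simulation. The sketch below (Python with SciPy; the number of simulations and the seed are arbitrary choices) pairs the largest variance with the smallest group, mimicking the second scenario of Wilcox et al.:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)

# Four groups, all with the SAME population mean (so H0 is true),
# but standard deviations in ratio 4:1 and unequal sample sizes.
sds = [4.0, 1.0, 1.0, 1.0]
ns = [6, 10, 16, 40]          # largest variance paired with smallest n

n_sim = 2000
rejections = 0
for _ in range(n_sim):
    groups = [rng.normal(0.0, sd, n) for sd, n in zip(sds, ns)]
    _, p = stats.f_oneway(*groups)
    if p < 0.05:
        rejections += 1

# Under valid assumptions this would be about 0.05; here it is far larger
print(f"Empirical Type I error rate: {rejections / n_sim:.3f}")
```

Swapping the pairing (largest variance with the largest n) would instead make the test conservative, as the text notes.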
2.3. Homogeneity of variance tests
There are several alternatives to test whether the experimental data have met the assumption of homogeneity of variance or not.
- Graphical methods:
  - Side-by-side box plots: the distributions of the observations in each treatment/group should look similar.
  - Plot of the residuals against the mean values: the spread of the residuals around each treatment/group mean should be similar.
  - Variance, standard deviation, or IQR statistics per group.
- Formal tests:
  - Several formal tests of homogeneity of variance are available, for example Bartlett's, Hartley's, Cochran's, and Levene's tests.
It should be noted, first, that some of these formal tests are very sensitive to non-normality, especially for distributions with a long right tail (positive skewness). Second, and more importantly, when the sample size is small the formal tests can fail to reject H0, leading us to assume that the variance is homogeneous. In other words, if the data are not normally distributed, the formal tests of homogeneity of variance cannot be relied upon.
Finally, the homogeneity of variance test provides little information about the underlying causes of the variance inhomogeneity, and diagnostic techniques (eg residual plots) are still needed to decide on the appropriate corrective action.
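A minimal sketch of two of the formal tests named above (Python with SciPy; the three simulated groups with deliberately unequal variances are assumptions for illustration):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(3)

# Three groups with the same mean but clearly unequal spreads (SD 1, 3, 6)
g1 = rng.normal(10, 1, 30)
g2 = rng.normal(10, 3, 30)
g3 = rng.normal(10, 6, 30)

# Levene's test (robust to non-normality; deviations from the median)
stat_l, p_levene = stats.levene(g1, g2, g3, center="median")

# Bartlett's test (more powerful, but sensitive to non-normality)
stat_b, p_bartlett = stats.bartlett(g1, g2, g3)

print(f"Levene p = {p_levene:.4f}, Bartlett p = {p_bartlett:.4f}")
```

Because Bartlett's test is sensitive to non-normality while Levene's is not, comparing the two can itself hint at which assumption is being violated.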
2.4. Solution
- Use a more stringent significance level, for example 0.025 (so that the actual Type I error rate is expected to stay below 0.05)
- Data transformation
- Using other more suitable estimation models
» Testing the homogeneity of variance of observational data will be discussed in a separate topic
3. Independence
The residuals, and the observations for each experimental unit, must be independent of each other, both within a treatment (within group) and between treatments (between groups). If this condition is not met, it becomes difficult to detect real differences that may exist.
3.1. Causes of Non-independence
- Errors are not independent when:
  - There is a positive correlation between replicates within a treatment group, which leads to an underestimated variance and therefore inflates the Type I error rate (α: a treatment effect is "detected" that is not real). This often occurs when observations are made repeatedly on the same experimental unit (repeated measures).
  - There is a negative correlation between replicates within a treatment group, which leads to an overestimated variance and therefore inflates the Type II error rate (β: a real effect goes undetected).
  - The response to one treatment affects the response to another treatment, for example when an animal moves to another treatment.
- This assumption should be addressed at the design stage, before the experiment starts.
3.2. Consequences of non-independent errors
This check of independence is often ignored by researchers, especially in the social and behavioral sciences. Hays (1981) and Stevens (2002) state that violation of the independence of the data is a very serious problem in the analysis of variance: it inflates the actual significance level (α) above the nominal one. For example, Stevens (2002) states that even a small amount of dependence among the observations can increase the Type I error rate severalfold; if we set the significance level at 0.05, the actual level may be much larger (e.g., 0.10 or 0.20).
3.3. Tests for independence of errors
- Plot the treatment/group mean values against the variances.
  - If the values are independent, the points will scatter around a horizontal line.
  - If they are not independent, the scatter will follow a pattern, for example a linear, quadratic, or other curved shape.
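Besides graphical checks, serial correlation of the residuals in run order can be quantified. The sketch below (Python with NumPy; the simulated residuals are illustrative and independent by construction) computes the lag-1 autocorrelation and the related Durbin-Watson statistic, which is near 2 for independent errors:

```python
import numpy as np

rng = np.random.default_rng(5)

# Residuals in the run order of the experiment (independent here, by construction)
resid = rng.normal(0, 1, 200)

# Lag-1 autocorrelation: near 0 for independent errors, clearly
# positive or negative when the errors are serially correlated
r1 = np.corrcoef(resid[:-1], resid[1:])[0, 1]
print(f"lag-1 autocorrelation: {r1:.3f}")

# Durbin-Watson statistic: approximately 2*(1 - r1), so ~2 under independence
dw = np.sum(np.diff(resid) ** 2) / np.sum(resid ** 2)
print(f"Durbin-Watson: {dw:.2f}")
```

A strongly positive r1 (Durbin-Watson well below 2) would match the first cause listed in section 3.1, e.g., repeated measurements on the same experimental unit.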
3.4. Solution
- The assumption of independent errors can usually be met if the randomization of the experimental units has been carried out correctly (according to the principles of the experimental design). Conversely, if your experimental units are arranged systematically, the independence assumption may well be violated.
- An appropriate data transformation can help remove the effects of such dependencies.
» Testing the independence of observational data will be discussed in a separate topic
4. Additivity
The effects of the treatment and environmental factors are additive, meaning that the level of the response is due solely to the added effects of the treatments and/or groups.
In the linear model above, the treatment (τ_i) and error (ε_ij) terms are additive; in other words, the effect added by a treatment is constant across replications, and the effect of a replication is constant across treatments. The response value (Y_ij) is the general mean plus the treatment effect and the error.
To make this easier to understand, consider the following illustration. Suppose the general mean (μ) = 8, and the added effect of each treatment (τ_i) and of each replication/group (β_j) is as shown in the table below. To simplify the example, assume ε_ij = 0, so that the response Y_ij = μ + τ_i + β_j + ε_ij can be calculated directly.
| Factor A | Replication/Group β1 = +1 | Replication/Group β2 = +2 | Difference between replications |
|---|---|---|---|
| τ1 = +1 | (8+1+1) = 10 | (8+1+2) = 11 | 1 |
| τ2 = +3 | (8+3+1) = 12 | (8+3+2) = 13 | 1 |
| Difference between treatments | 2 | 2 | |
In the table above you can see that the effect of the treatment is constant in each replication, and the effect of the replication (or of the group, if you use groups) is constant across all treatments. If this is the case, the data are additive. However, if the effects are not additive but multiplicative (here the treatment and replication effects multiply: Y_ij = μ + τ_i × β_j), the response data will look like the following table.
| Factor A | Replication β1 = +1 | Replication β2 = +2 | Difference between replications |
|---|---|---|---|
| τ1 = +1 | (8+1×1) = 9 | (8+1×2) = 10 | 1 |
| τ2 = +3 | (8+3×1) = 11 | (8+3×2) = 14 | 3 |
| Difference between treatments | 2 | 4 | |
Notice that the differences between the treatment effects, and between the group effects, are no longer constant! If there is a multiplicative effect of factors outside our experiment, the effect of the factor we are studying is no longer additive but multiplicative.
For more detail, consider the following comparison between additive and multiplicative effects for a randomized block design.
Comparison of additive and multiplicative effects
| Factor B | τ1 = +1 | τ2 = +2 | τ3 = +3 | Effect |
|---|---|---|---|---|
| β1 = +1 | 2 | 3 | 4 | Additive (τ + β) |
| | 1 | 2 | 3 | Multiplicative (τ × β) |
| | 0 | 0.30 | 0.48 | Multiplicative, log10 scale |
| β2 = +5 | 6 | 7 | 8 | Additive (τ + β) |
| | 5 | 10 | 15 | Multiplicative (τ × β) |
| | 0.70 | 1.00 | 1.18 | Multiplicative, log10 scale |
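The comparison table can be reproduced numerically, showing how a log transform restores additivity. A minimal sketch (Python with NumPy, using the same τ and β values as the table):

```python
import numpy as np

# Multiplicative structure from the comparison table: cell = tau_i * beta_j
tau = np.array([1.0, 2.0, 3.0])
beta = np.array([1.0, 5.0])
cells = np.outer(beta, tau)        # rows: beta levels, columns: tau levels

# On the raw scale, the column differences change from row to row
print(np.diff(cells, axis=1))      # not constant across rows -> non-additive

# After a log transform the differences are the same in every row:
# log10(tau * beta) = log10(tau) + log10(beta), an additive model again
log_cells = np.log10(cells)
print(np.round(np.diff(log_cells, axis=1), 2))
```

This is exactly why the log transformation is the standard remedy listed in section 4.4.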
4.1. Causes
There are influences from factors other than those we are studying:
- Residual (carry-over) effects from previous experiments.
- Interaction between the treatment and factors not included in the model, such as gender, variety, and so on.
- In a randomized block design, there is often an interaction between the treatment and the blocks.
4.2. Relationship with homogeneity of variance
Usually, if the data are additive, they also have homogeneous variance; conversely, if the data are not additive, the variance tends to be heterogeneous. This means that data that do not satisfy additivity will have a large error variance. To see the error variance of your experiment, look at the error mean square (MS error) in your analysis-of-variance table: the larger the error MS, the larger the variance of your experiment.
The treatment and group effects are said to be additive if the treatment effect is constant in every replication or group, and the replication or group effect is constant for all treatments.
4.3. Test for Non-additivity
RCBD (RAK) linear model: Y_ij = μ + τ_i + β_j + ε_ij. The errors ε_ij are assumed to be independent, homogeneous, and normally distributed. The model is additive if the interaction between treatment and block (τ_i × β_j) is not significant. If there is such an interaction, the F test is no longer efficient and wrong conclusions may be drawn, because the influence of the two factors is no longer additive but multiplicative.
The usual test of whether the model is additive is Tukey's one-degree-of-freedom test for non-additivity:
SS(non-additivity) = (Σ_i Σ_j τ̂_i β̂_j Y_ij)² / [(Σ_i τ̂_i²)(Σ_j β̂_j²)], where τ̂_i = ȳ_i. − ȳ.. and β̂_j = ȳ_.j − ȳ.. are the estimated treatment and block effects.
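Tukey's one-degree-of-freedom test can be sketched directly from this formula (Python with NumPy/SciPy; the 3×3 data matrix is an invented example, not data from this article):

```python
import numpy as np
from scipy import stats

# Observations y[i, j]: rows = treatments, columns = blocks (illustrative data)
y = np.array([[10.0, 11.0, 13.0],
              [12.0, 13.0, 16.0],
              [15.0, 17.0, 21.0]])

t, b = y.shape
grand = y.mean()
tau_hat = y.mean(axis=1) - grand       # estimated treatment effects
beta_hat = y.mean(axis=0) - grand      # estimated block effects

# Tukey's one-degree-of-freedom sum of squares for non-additivity
num = (tau_hat @ y @ beta_hat) ** 2    # (sum_ij tau_i * beta_j * y_ij)^2
ss_nonadd = num / ((tau_hat ** 2).sum() * (beta_hat ** 2).sum())

# Error SS of the additive model, split into non-additivity + remainder
ss_error = ((y - y.mean(axis=1, keepdims=True)
               - y.mean(axis=0, keepdims=True) + grand) ** 2).sum()
ss_rem = ss_error - ss_nonadd
df_rem = (t - 1) * (b - 1) - 1

# F test: significant F means the additive model is inadequate
f_stat = ss_nonadd / (ss_rem / df_rem)
p = stats.f.sf(f_stat, 1, df_rem)
print(f"F = {f_stat:.2f}, p = {p:.3f}")
```

A small p-value would suggest a multiplicative component and point toward a log transformation.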

» Testing for non-additivity of observational data will be discussed in a separate topic
4.4. Solution:
- Log Transform
5. Conclusion
Of the four assumptions above, the one most commonly violated is homogeneity of variance. If the homogeneity-of-variance assumption is met, the normality assumption is usually also met, but the reverse is not always true.
Diagnostic with SmartstatXL - Excel Add-In
References:
Dean, Angela, and Daniel Voss. 1999. Design and Analysis of Experiments. Springer-Verlag New York, Inc.
Quinn, Gerry P., and Michael J. Keough. 2002. Experimental Design and Data Analysis for Biologists. Cambridge University Press.
Gamst, Glenn, Lawrence S. Meyers, and A. J. Guarino. 2008. Analysis of Variance Designs: A Conceptual and Computational Approach with SPSS and SAS. Cambridge University Press.
Dowdy, Shirley, Stanley Weardon, and Daniel Chilko. 2004. Statistics for Research (Third Edition). John Wiley & Sons, Inc.
Plus other references cited in the text.