A completely randomized design (CRD) is the simplest type of experimental design. The reasons behind the use of a completely randomized design are as follows:
- The experimental units used are homogeneous, or there are no factors other than the factor being studied that affect the response.
- External factors that could affect the experiment can be controlled, for example in an experiment conducted in a laboratory.
Because of this, completely randomized designs are usually found in laboratories or greenhouses.
Sub-discussion:
- Experimental Design Classification
- Background to the Use of CRD
- Advantages of Completely Randomized Design:
- Disadvantage: sometimes these designs are inefficient.
- When should we choose CRD?
- Randomization And Trial Plan
- Randomization and Placement of Experimental Units:
- Linear Model and Analysis of Variance (Anova/F-Test) in Completely Randomized Design
- Linear Model
- Assumption:
- Hypothesis:
- Analysis of Variance (Anova or F-Test)
- Standard Error
- Example of Completely Randomized Design
- Case example 1: Completely Randomized Design with Equal Replications
- Case example 2: Completely Randomized Design with Unequal Replications
A full discussion of Completely Randomized Design (CRD) can be read in the following document.
Experimental Design Classification
An experimental design consists of several component designs, namely:
- An environmental design is a design of how the treatments tried are placed in the experimental units. This category includes the Completely Randomized Design (CRD), the Randomized Complete Block Design (RCBD), the Latin Square Design (LSD), and the Lattice design.
- A treatment design is a design of how the treatments are formed. A treatment is a level of a factor or a combination of levels of several factors. Treatment designs consist of single-factor designs and designs with more than one factor (Factorial, Split-Plot, Split-Block). Combining an environmental design with a treatment design gives the various design names, for example:
- CRD (one or more factors)
- RCBD (one or more factors)
- A measurement design is a design of the procedure for measuring the characteristics of the experimental units under study; the measurements produce what is referred to as the experimental response.
Background to the Use of CRD
A Completely Randomized Design is the simplest type of experimental design. The reasons behind the use of a Completely Randomized Design are as follows:
- The experimental units used are homogeneous, or there are no factors other than the factor being studied that affect the response.
- External factors that could affect the experiment can be controlled, for example in experiments conducted in a laboratory.
Therefore, Completely Randomized Designs are usually found in laboratories or greenhouses.
Advantages of Completely Randomized Design:
- Easy design and execution
- The data analysis is simple
- Flexible (slightly more flexible than RCBD) in terms of:
- Number of treatments
- Number of replications
- Can be done with unequal replications
- Appropriate nonparametric alternatives for the analysis are available
- The problem of missing data is easier to handle (a little easier compared to RCBD)
- Missing data do not cause serious problems in the data analysis
- Less loss of sensitivity compared to other designs
- The degrees of freedom for error (DFE) are as large as possible (maximum). This advantage matters most in small experiments, where the DFE would otherwise be very small.
- It does not require a high level of understanding of the experimental materials.
The disadvantage: sometimes this design is inefficient.
- The precision of the experiment may not be very satisfactory unless the experimental units are completely homogeneous
- Only suitable for experiments with a relatively small number of treatments
- Replications of the same experiment may be inconsistent (weak) if the experimental units are not truly homogeneous, especially when the number of replications is small.
When should we choose CRD?
- If the experimental unit is completely homogeneous, for example:
- experiments in the laboratory
- Greenhouse
- If there is no prior knowledge/information about the homogeneity of the experimental unit.
- If the number of treatments is small, so that the degrees of freedom for error would also be small
Randomization And Experimental Plan
Randomization is carried out so that the subsequent data analysis is valid. Randomization can be done with a lottery, a table of random numbers, or with the help of software. Suppose we design an experiment with 4 treatments (t = 4: A, B, C, D), each repeated 3 times (r = 3), so that there are 4x3 = 12 experimental units. The treatments are placed randomly into the 12 experimental units.
Randomization and Placement of Experimental Units:
The treatments can be placed into the experimental units using a table of random numbers, a lottery, or computer assistance; a short script example is also shown after Figure 1.
Example of randomization by lottery.
- Make 12 rolls of paper, each labelled with one treatment code (A1, A2, A3, B1, ..., D2, D3).
- Draw the rolls one by one without replacement. The treatment code drawn first is placed in experimental unit no. 1, the second in unit no. 2, and so on. For example, if code C3 is drawn first, unit no. 1 receives C3; if A2 is drawn second, unit no. 2 receives A2. Continue the draw until the last treatment code is placed in unit no. 12.
An example of randomization by using Microsoft Excel.
- Create a table with one row for each treatment-replication combination (12 rows for the example above), with the treatment codes in column B and the formula "=RAND()" in column C (the random numbers).
- Highlight/select columns B and C and sort by the third column (the random numbers).
- The randomization is now complete. Place treatment code A1 in unit no. 1, A3 in unit no. 2, and so on, until the last code, B1, in unit no. 12. The results are as follows:
A1 | A3 | C2 | C3 |
B2 | D2 | D3 | C1 |
D1 | A2 | B3 | B1 |
Figure 1. CRD trial layout with four treatments (A, B, C, D), each repeated three times
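As an alternative to the lottery or Excel approach, the randomization can also be scripted. The following is a minimal Python sketch; the treatment labels, the 4 x 3 layout, and the random seed are illustrative choices matching the example above.

```python
import random

treatments = ["A", "B", "C", "D"]   # t = 4 treatments
replications = 3                    # r = 3 replications each

# Build the 12 treatment codes (A1, A2, A3, B1, ..., D3)
codes = [f"{t}{j}" for t in treatments for j in range(1, replications + 1)]

random.seed(123)        # arbitrary seed, only for reproducibility
random.shuffle(codes)   # random assignment, analogous to the lottery draw

# Print the field layout as 3 rows of 4 experimental units
for row in range(3):
    print(" | ".join(codes[row * 4:(row + 1) * 4]))
```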
From the results of the experiments carried out based on the randomization and experimental plan above, the following data will be generated:
Table 1. Completely Randomized Design Data Tabulation With 4 Treatments And 3 Replications
Replication | Treatment A | Treatment B | Treatment C | Treatment D | Total
1 | Y11 | Y21 | Y31 | Y41 |
2 | Y12 | Y22 | Y32 | Y42 |
3 | Y13 | Y23 | Y33 | Y43 |
Total | Y1. | Y2. | Y3. | Y4. | Y..
Linear Models and Analysis of Variance in a Completely Randomized Design
Linear Model
There are two types of models in experimental design, depending on how the treatments are chosen: a random model, in which the treatments are taken randomly from a population of treatments, and a fixed model, in which the researcher is concerned only with the treatments actually used, which are determined by the researcher. The difference between the fixed model and the random model can be seen in the following figure. For example, suppose we want to know the yield of several rice varieties. In the random model, a sample of 10 varieties is taken at random and the results are used to draw conclusions about the 100 varieties in the population. In the fixed model, the number of levels observed is determined by the researcher, so the researcher can only draw conclusions about the rice varieties actually observed, not about the entire rice population.
Figure 2. Differences between Random Model and Fixed Model
A general linear model of a Completely Randomized Design of one factor can be divided into two, namely a fixed model if the factor used is fixed and a random model if the factor used is random.
The general form of a single-factor linear model can be written as follows:
$$\begin{matrix}Y_{ij}=\mu_i+\varepsilon_{ij}\\=\mu+(\mu_i-\mu)+\varepsilon_{ij}\\=\mu+\tau_i+\varepsilon_{ij}\ \ \ \ ;\\\end{matrix}$$
i = 1, 2, ..., t ; j = 1, 2, ..., ri ; μi = mean of the i-th treatment
With:
μ = population mean
τi = (μi- μ) = Additive effect of the i-th treatment
εij = experimental error / random effect of the j-th replication of the i-th treatment, with εij ~ N(0, σ2)
t = number of treatments and ri = number of replications of the i-th treatment; for experiments with equal replication, ri = r.
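To make the model concrete, the following minimal Python sketch simulates observations from the fixed-effects version of this model; the values chosen for μ, τi and σ are illustrative assumptions, not taken from the text.

```python
import numpy as np

rng = np.random.default_rng(1)

mu = 50.0                               # population mean (illustrative value)
tau = np.array([3.0, -1.0, -2.0, 0.0])  # fixed treatment effects, sum(tau) = 0
sigma = 2.0                             # error standard deviation
r = 3                                   # replications per treatment

# Y_ij = mu + tau_i + eps_ij, with eps_ij ~ N(0, sigma^2)
eps = rng.normal(0.0, sigma, size=(len(tau), r))
Y = mu + tau[:, None] + eps

print(Y)           # one row per treatment, one column per replication
print(Y.mean())    # sample estimate of mu
```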
Assumption:
Fixed Model | Random Model |
$$E(\tau_i)=\tau_i\ \ \ ;\ \ \ \sum_{i=1}^{t}\tau_i=0\ \ \ ;\ \ \ \varepsilon_{ij}\overset{iid}{\sim}N(0,\sigma^2)$$ | $$E(\tau_i)=0\ \ \ ;\ \ \ E({\tau_i}^2)={\sigma_\tau}^2\ \ \ ;\ \ \ \varepsilon_{ij}\overset{iid}{\sim}N(0,\sigma^2)$$ |
Hypothesis:
Hypotheses to Be Tested: | Fixed Model | Random Model |
H0 | All τi = 0 | στ2 = 0 |
H1 | Not all τi = 0 | στ2 > 0 |
Analysis of variance
Analysis of variance is an analysis that decomposes the total variance into its constituent components. The least-squares estimators of the parameters in the Completely Randomized Design model are as follows:
Parameters | Estimators |
μ | $$\hat{\mu}=\bar{Y}_{..}$$ |
τi | $${\hat{\tau}}_i=\bar{Y}_{i.}-\bar{Y}_{..}$$ |
εij | $${\hat{\varepsilon}}_{ij}=Y_{ij}-{\bar{Y}}_{i.}$$ |
To understand the decomposition of total variance into its constituent components, consider the following example case:
The following are the results of estrogen assays of several solutions that had undergone certain treatments. The uterine weight of rats is used as a measure of estrogen activity. The uterine weights, in milligrams, of four rats for the control and for each of six different solutions are listed in the table below.
From these data, we next decompose the observations into their sum-of-squares components according to the linear model:
$$\begin{matrix}Y_{ij}&=&\mu&+&\tau_i&+&\varepsilon_{ij}\\\sum_{i=1}^{t}\sum_{j=1}^{r}{(Y_{ij})^2}&=&\sum_{i=1}^{t}\sum_{j=1}^{r}{(\bar{Y}..)^2}&+&\sum_{i=1}^{t}\sum_{j=1}^{r}{({\bar{Y}}_{i.}-\bar{Y}..)^2}&+&\sum_{i=1}^{t}\sum_{j=1}^{r}{(Y_{ij}-{\bar{Y}}_{i.})^2}\\\end{matrix}$$
Treatment | Uterine data Yij | Overall mean μ | Additive treatment effect τi | Error (residual) εij = Yij - μ - τi |
control | 89.8 | 80.32 | 15.83 | -6.35 |
control | 93.8 | 80.32 | 15.83 | -2.35 |
control | 88.4 | 80.32 | 15.83 | -7.75 |
control | 112.6 | 80.32 | 15.83 | 16.45 |
P1 | 84.4 | 80.32 | 7.93 | -3.85 |
P1 | 116.0 | 80.32 | 7.93 | 27.75 |
P1 | 84.0 | 80.32 | 7.93 | -4.25 |
P1 | 68.6 | 80.32 | 7.93 | -19.65 |
P2 | 64.4 | 80.32 | -4.92 | -11.00 |
P2 | 79.8 | 80.32 | -4.92 | 4.40 |
P2 | 88.0 | 80.32 | -4.92 | 12.60 |
P2 | 69.4 | 80.32 | -4.92 | -6.00 |
P3 | 75.2 | 80.32 | -11.87 | 6.75 |
P3 | 62.4 | 80.32 | -11.87 | -6.05 |
P3 | 62.4 | 80.32 | -11.87 | -6.05 |
P3 | 73.8 | 80.32 | -11.87 | 5.35 |
P4 | 88.4 | 80.32 | 4.58 | 3.50 |
P4 | 90.2 | 80.32 | 4.58 | 5.30 |
P4 | 73.2 | 80.32 | 4.58 | -11.70 |
P4 | 87.8 | 80.32 | 4.58 | 2.90 |
P5 | 56.4 | 80.32 | -1.42 | -22.50 |
P5 | 83.2 | 80.32 | -1.42 | 4.30 |
P5 | 90.4 | 80.32 | -1.42 | 11.50 |
P5 | 85.6 | 80.32 | -1.42 | 6.70 |
P6 | 65.6 | 80.32 | -10.12 | -4.60 |
P6 | 79.4 | 80.32 | -10.12 | 9.20 |
P6 | 65.6 | 80.32 | -10.12 | -4.60 |
P6 | 70.2 | 80.32 | -10.12 | 0.00 |
Sum of Squares | 186121.4 | 180642.89 | 2415.937 | 3062.57 |
Linear Model | Yij | μ | τi | εij |
Sum of Squares decomposition | $$\sum_{i=1}^{t}\sum_{j=1}^{r}{(Y_{ij})^2}$$ | $$\sum_{i=1}^{t}\sum_{j=1}^{r}{(\bar{Y}_{..})^2}$$ | $$\sum_{i=1}^{t}\sum_{j=1}^{r}{({\bar{Y}}_{i.}-\bar{Y}_{..})^2}$$ | $$\sum_{i=1}^{t}\sum_{j=1}^{r}{(Y_{ij}-{\bar{Y}}_{i.})^2}$$ |
SS | | Correction/Intercept factor (CF) | Treatment: SST (Between) | Error: SSE (Within) |
$$\begin{matrix}Y_{ij}=\mu+\tau_i+\varepsilon_{ij}\\Y_{ij}-\mu=\tau_i+\varepsilon_{ij}\\Model\ SS:\\\sum_{i=1}^{t}\sum_{j=1}^{r}{(Y_{ij})^2}=\sum_{i=1}^{t}\sum_{j=1}^{r}{(\bar{Y}..)^2}+\sum_{i=1}^{t}\sum_{j=1}^{r}{({\bar{Y}}_{i.}-\bar{Y}..)^2}+\sum_{i=1}^{t}\sum_{j=1}^{r}{(Y_{ij}-{\bar{Y}}_{i.})^2}\\\sum_{i=1}^{t}\sum_{j=1}^{r}{(Y_{ij}-\bar{Y}..)^2}=\sum_{i=1}^{t}\sum_{j=1}^{r}{({\bar{Y}}_{i.}-\bar{Y}..)^2}+\sum_{i=1}^{t}\sum_{j=1}^{r}{(Y_{ij}-{\bar{Y}}_{i.})^2}\\(186121.40)-(180642.89)=(2415.94)+(3062.57)\\(5478.51)=(2415.94)+(3062.57)\\SSTOT=SST+SSE\\\end{matrix}$$
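The decomposition above can be reproduced with a short script. Below is a minimal Python sketch using the uterine-weight data as tabulated; it is only meant to illustrate how the estimates and sums of squares are obtained.

```python
import numpy as np

# Uterine weights (mg) from the table above: one row per treatment, one column per replication
Y = np.array([
    [89.8, 93.8, 88.4, 112.6],   # control
    [84.4, 116.0, 84.0, 68.6],   # P1
    [64.4, 79.8, 88.0, 69.4],    # P2
    [75.2, 62.4, 62.4, 73.8],    # P3
    [88.4, 90.2, 73.2, 87.8],    # P4
    [56.4, 83.2, 90.4, 85.6],    # P5
    [65.6, 79.4, 65.6, 70.2],    # P6
])

grand_mean = Y.mean()                        # estimate of mu (about 80.32)
treat_means = Y.mean(axis=1, keepdims=True)  # treatment means Y_i.
tau_hat = treat_means.ravel() - grand_mean   # treatment effects tau_i
residuals = Y - treat_means                  # eps_ij = Y_ij - Y_i.

SSTOT = ((Y - grand_mean) ** 2).sum()        # total SS (corrected)
SST = (Y.shape[1] * tau_hat ** 2).sum()      # treatment SS = r * sum(tau_i^2)
SSE = (residuals ** 2).sum()                 # error SS

print("tau_hat:", np.round(tau_hat, 2))
print(f"SSTOT = {SSTOT:.2f}, SST = {SST:.2f}, SSE = {SSE:.2f}")
# Should match the worked example: 5478.51 = 2415.94 + 3062.57
```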
Thus, for an experiment with t treatments and r replications, the total variance can be decomposed as follows:
Equal Number of Replications | Unequal Number of Replications |
$$\begin{matrix}CF=\frac{Y_{..}^2}{rt}\\SSTOT=\sum_{i=1}^{t}\sum_{j=1}^{r}{(Y_{ij}-\bar{Y}_{..})^2}=\sum_{i=1}^{t}\sum_{j=1}^{r}{Y_{ij}}^2-\frac{Y_{..}^2}{rt}=\sum_{i,j}{Y_{ij}}^2-CF\\SST=\sum_{i=1}^{t}\sum_{j=1}^{r}{({\bar{Y}}_{i.}-\bar{Y}_{..})^2}=\sum_{i=1}^{t}\frac{{Y_{i.}}^2}{r}-\frac{Y_{..}^2}{rt}=\sum_{i=1}^{t}\frac{{Y_{i.}}^2}{r}-CF\\SSE=\sum_{i=1}^{t}\sum_{j=1}^{r}{(Y_{ij}-{\bar{Y}}_{i.})^2}=\sum_{i=1}^{t}\sum_{j=1}^{r}{e_{ij}}^2=SSTOT-SST\\\end{matrix}$$ | $$\begin{matrix}SSTOT=\sum_{i=1}^{t}\sum_{j=1}^{r_i}{(Y_{ij}-\bar{Y}_{..})^2}=\sum_{i=1}^{t}\sum_{j=1}^{r_i}{Y_{ij}}^2-\frac{Y_{..}^2}{\sum_{i=1}^{t}r_i}\\SST=\sum_{i=1}^{t}\sum_{j=1}^{r_i}{({\bar{Y}}_{i.}-\bar{Y}_{..})^2}=\sum_{i=1}^{t}\frac{{Y_{i.}}^2}{r_i}-\frac{Y_{..}^2}{\sum_{i=1}^{t}r_i}\\SSE=\sum_{i=1}^{t}\sum_{j=1}^{r_i}{(Y_{ij}-{\bar{Y}}_{i.})^2}=\sum_{i=1}^{t}\sum_{j=1}^{r_i}{e_{ij}}^2=SSTOT-SST\\\end{matrix}$$ |
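Applied to the uterine-weight example, these working formulas (treatment totals and the correction factor) give the same sums of squares as the deviation form above. The following is a minimal Python sketch of that computation; the data are re-entered so the snippet is self-contained.

```python
import numpy as np

# Same uterine-weight data: rows = treatments, columns = replications
Y = np.array([
    [89.8, 93.8, 88.4, 112.6],
    [84.4, 116.0, 84.0, 68.6],
    [64.4, 79.8, 88.0, 69.4],
    [75.2, 62.4, 62.4, 73.8],
    [88.4, 90.2, 73.2, 87.8],
    [56.4, 83.2, 90.4, 85.6],
    [65.6, 79.4, 65.6, 70.2],
])
t, r = Y.shape                              # t = 7 treatments, r = 4 replications

CF = Y.sum() ** 2 / (r * t)                 # correction factor Y..^2 / (rt)
SSTOT = (Y ** 2).sum() - CF                 # sum of Y_ij^2 minus CF
SST = (Y.sum(axis=1) ** 2).sum() / r - CF   # sum of Y_i.^2 / r minus CF
SSE = SSTOT - SST

print(f"CF = {CF:.2f}")
print(f"SSTOT = {SSTOT:.2f}, SST = {SST:.2f}, SSE = {SSE:.2f}")
```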
The analysis of variance table for fixed models and random models is given as follows:
Table 2. Anova Table of Completely Randomized Design with Fixed Model and Random Model For the same Number of Replications
Source of variance | Degrees of freedom (df) | Sum of squares (SS) | Mean square (MS) | F-stat | E(MS): Fixed model | E(MS): Random model |
Treatment | t-1 | SST | MST | $\frac{MST}{MSE}$ | $\sigma^2+\frac{r}{t-1}\sum_{i=1}^{t}{\tau_i}^2$ | $\sigma^2+r{\sigma_\tau}^2$ |
Error | t(r-1) | SSE | MSE | | $\sigma^2$ | $\sigma^2$ |
Total | tr-1 | SS(Total) | | | | |
Table 3. Anova Table of Completely Randomized Design with Fixed Model and Random Model for an Unequal Number of Replications
Source of variance | Degrees of freedom (df) | Sum of squares (SS) | Mean square (MS) | F-stat | E(MS): Fixed model | E(MS): Random model |
Treatment | t-1 | SST | MST | $\frac{MST}{MSE}$ | $\sigma^2+\frac{\sum_{i=1}^{t}{r_i{\tau_i}^2}-\left(\sum_{i=1}^{t}{r_i\tau_i}\right)^2/\sum_{i=1}^{t}r_i}{t-1}$ | $\sigma^2+r_a{\sigma_\tau}^2$ |
Error | $\sum_{i=1}^{t}{(r_i-1)}$ | SSE | MSE | | $\sigma^2$ | $\sigma^2$ |
Total | $\sum_{i=1}^{t}{r_i}-1$ | SS(Total) | | | | |
with:
$$r_a=\frac{1}{t-1}\left(\sum_{i=1}^{t}r_i-\frac{\sum_{i=1}^{t}{r_i}^2}{\sum_{i=1}^{t}r_i}\right)$$
Fstat = $\frac{MST}{MSE}$ follows an F distribution with numerator degrees of freedom (df1) equal to the treatment degrees of freedom and denominator degrees of freedom (df2) equal to the error degrees of freedom. The table F value can be looked up in a table of F values. If the computed F value is greater than the table F value at df1, df2 and a given significance level (α), the null hypothesis is rejected, and vice versa.
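Continuing the uterine-weight example, the F statistic and the table F value can be computed as in the sketch below (Python with scipy; the 5% significance level is an illustrative choice, and the sums of squares are taken from the worked example above).

```python
from scipy import stats

t, r = 7, 4                       # treatments and replications in the example
SST, SSE = 2415.94, 3062.57       # sums of squares from the decomposition above

df1, df2 = t - 1, t * (r - 1)     # treatment and error degrees of freedom
MST, MSE = SST / df1, SSE / df2
F_stat = MST / MSE

alpha = 0.05                                  # illustrative significance level
F_table = stats.f.ppf(1 - alpha, df1, df2)    # table value F(alpha; df1, df2)

print(f"F = {F_stat:.3f}, F table = {F_table:.3f}")
print("Reject H0" if F_stat > F_table else "Fail to reject H0")
```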
The reliability of an experiment can be judged from the coefficient of variance (CV), which indicates the degree of precision of the experiment.
$$CV=\frac{\sqrt{MSE}}{\bar{Y}_{..}}\times100\%$$
The larger the CV, the lower the reliability of the experiment. There is no fixed benchmark for how large the CV may be, since this also depends on the field of study, but a reasonably reliable experiment should aim for a CV that does not exceed 20%. On the other hand, a very small CV can be a sign that the experimental data have been manipulated.
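For the same example, the CV can be computed directly from the MSE and the grand mean; the short Python sketch below uses the values from the worked example above (SSE = 3062.57 with 21 error df, grand mean about 80.32).

```python
import math

MSE = 3062.57 / 21    # error mean square from the example (SSE / error df)
grand_mean = 80.32    # overall mean of the uterine-weight data

CV = math.sqrt(MSE) / grand_mean * 100   # coefficient of variance in percent
print(f"CV = {CV:.1f}%")                 # compare with the 20% rule of thumb above
```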
Standard Error
To compare treatment means, the standard error for the CRD must first be determined. The standard error of the difference between two treatment means is calculated with the following formula:
$$S_{\bar{Y}}=\sqrt{\frac{2MSE}{r}}$$
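As a brief illustration using the uterine-weight example (MSE from the worked example above, r = 4 replications per treatment), the standard error can be computed as in this minimal Python sketch.

```python
import math

MSE = 3062.57 / 21   # error mean square from the example
r = 4                # replications per treatment

SE_diff = math.sqrt(2 * MSE / r)   # standard error of a difference between two treatment means
print(f"SE of a treatment-mean difference = {SE_diff:.2f} mg")
```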