MATH 4220 Dr. Zeng
Student Activity 6: Randomized Block Design, Latin Square, Repeated Latin Square, and Graeco-Latin Square
Consider the “one-way treatment structure in a completely randomized design structure” experiment.
We have “a” treatments, each replicated n times (we consider the balanced case for simplicity). The
appropriate means model is
$$Y_{ij} = \mu_i + \varepsilon_{ij}, \qquad i = 1, 2, \ldots, a; \quad j = 1, 2, \ldots, n,$$
where $\varepsilon_{ij} \sim \text{iid } N(0, \sigma^2)$.
The error terms $\varepsilon_{ij}$ denote the plot-to-plot variation in the response that cannot be attributed to the
treatment effect.
The variance $\sigma^2$ is a measure of this variation. If the plots are more alike (homogeneous), then $\sigma^2$
will be low. If the plots are very different from one another, $\sigma^2$ will be large.
A small $\sigma^2$ enables an experimenter to attribute even small variation in the treatment sample means
$\bar{Y}_{1\cdot}, \ldots, \bar{Y}_{a\cdot}$ to differences between treatment (population) means $\mu_i$. In other words, a small
$\sigma^2$ results in a more sensitive experiment.
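The role of the error variance can be made explicit with a short calculation. This is a sketch under the balanced means model with independent errors; the index $i'$ for a second treatment is introduced here only for illustration.

```latex
\[
\operatorname{Var}\left(\bar{Y}_{i\cdot} - \bar{Y}_{i'\cdot}\right)
  = \frac{\sigma^2}{n} + \frac{\sigma^2}{n}
  = \frac{2\sigma^2}{n}, \qquad i \neq i' .
\]
```

An observed difference between two treatment sample means is therefore judged against a standard error of $\sqrt{2\sigma^2/n}$: the smaller the error variance, the smaller the true difference that can be detected for a given $n$.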
Thus, one of the main tasks of an experimenter is to reduce $\sigma^2$ by using homogeneous experimental
units.
However, one should make sure that such homogeneity does not compromise the applicability of the
results.
[e.g.: Using white males ages 21-25 in a test of a hair-growing formulation
will make the results inapplicable to older males, individuals of
other races, and females.]
Another way to reduce $\sigma^2$ is by grouping experimental units that are more alike.
All the above are examples of “BLOCKING”. In example 1), the block is a pair of twins, in example
2), the block is a person, and in example 3), the block is a piece of land consisting of 5 adjoining
plots.
In all cases, plot to plot variation within a block is less than block to block variation.
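The appropriate model is
$$Y_{ij} = \mu_i + \beta_j + \varepsilon_{ij}, \qquad i = 1, 2, \ldots, a; \quad j = 1, 2, \ldots, b,$$
where $\beta_j \sim \text{iid } N(0, \sigma_b^2)$ is the block effect and $\varepsilon_{ij} \sim \text{iid } N(0, \sigma^2)$.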
One can consider the above model as a two-way model where the row effect is fixed but the
column effect is random (so it is a mixed model). In fact, the appropriate sum of squares can be
obtained by treating it as a two-way model without interaction.
The plot-to-plot variation within a fixed block is $\sigma^2$. Thus, the error variance of a plot selected
randomly from a pre-specified block (after accounting for the block effect) is $\sigma^2$.
Thus $\operatorname{Var}(Y_{12} - Y_{22}) = \operatorname{Var}(\varepsilon_{12} - \varepsilon_{22}) = 2\sigma^2$. However, the variance of the response of a plot randomly
picked from the totality of $ab$ plots is not $\sigma^2$ but is $\sigma^2 + \frac{b-1}{b}\sigma_b^2 \approx \sigma^2 + \sigma_b^2$ if $b$ is large. Note that $\sigma_b^2$ is
the block variation (scaled to reflect the plot size).
If $\sigma_b^2$ is large, then blocking enables the experimenter to design a more “sensitive” experiment.
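$$Y_{ij} = \mu + \tau_i + \beta_j + \varepsilon_{ij}, \qquad i = 1, 2, \ldots, a; \quad j = 1, 2, \ldots, b,$$
with $\varepsilon_{ij}$, $\beta_j$ defined as before.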
Observe that we have no interaction term. In blocked experiments, it is assumed that there
is no block-by-treatment interaction.
The constraints assumed are $\sum_{i=1}^{a} \tau_i = 0$ and $\sum_{j=1}^{b} \beta_j = 0$. The second restriction is needed only if
the block effects are fixed.
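For reference, the usual sum-of-squares decomposition for this two-way model without interaction can be written out as follows. This is a sketch of the standard randomized complete block analysis, not a transcription of the handout's own display; the dot-and-bar notation for means and $N = ab$ are notational assumptions chosen to match the formulas used later for Latin squares.

```latex
\begin{align*}
SS_{Total}      &= \sum_{i=1}^{a}\sum_{j=1}^{b} Y_{ij}^2 - N\,\bar{Y}_{\cdot\cdot}^2, \qquad N = ab,\\
SS_{Treatment}  &= b\sum_{i=1}^{a} \bar{Y}_{i\cdot}^2 - N\,\bar{Y}_{\cdot\cdot}^2 \qquad (a-1 \text{ d.f.}),\\
SS_{Blocks}     &= a\sum_{j=1}^{b} \bar{Y}_{\cdot j}^2 - N\,\bar{Y}_{\cdot\cdot}^2 \qquad (b-1 \text{ d.f.}),\\
SS_{Error}      &= SS_{Total} - SS_{Treatment} - SS_{Blocks} \qquad ((a-1)(b-1) \text{ d.f.}),\\
F_0             &= \frac{SS_{Treatment}/(a-1)}{SS_{Error}/[(a-1)(b-1)]} \sim F_{a-1,\,(a-1)(b-1)}
                   \quad \text{under } H_0: \tau_1 = \cdots = \tau_a = 0 .
\end{align*}
```

Compared with the completely randomized design, the block sum of squares is removed from the error term, which is how blocking produces a smaller error mean square when $\sigma_b^2$ is large.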
Example 1: Three different washing solutions are being compared to study their effectiveness in
retarding bacteria growth in 5-gallon milk containers. The analysis is done in a laboratory, and only
three trials can be run on any day. Because days could represent a potential source of variability, the
experimenter decides to use a randomized block design. Observations are taken for four days, and
the data are shown here. Analyze the data from this experiment and draw conclusions.
In this example, the blocking factor is the day. The treatment is “solution”. We have three types of
solutions and four levels for “day.”
There are many types of block designs, with RCB being one of them. Some of the other block
designs are Latin Square Designs, Graeco-Latin Square Designs, and Split-Plot Designs.
In a randomized complete block design, the blocking was done to reduce variation that can be
attributed to some random (and in some cases fixed) factor. For example, in an agricultural
experiment, blocking may be done to remove the effect due to a fertility gradient; in a chemistry
experiment blocking may be done to remove the effect of the chemists’ skills. In some situations, it is
possible that one wishes to remove the effect of two factors. Then blocking has to be done in two
“directions”, each “direction” corresponding to the “gradient” of a given factor.
e.g: An agricultural scientist wishes to study the effects of 4 different kinds of fertilizer
on a certain variety of wheat. The experimental field in which the wheat
is to be grown has a moisture gradient in one direction and a sunlight
gradient perpendicular to it.
                Column Blocks (moisture gradient)
Row Blocks          A  B  C  D
(sunlight           B  C  D  A
gradient)           C  D  A  B
                    D  A  B  C
One may block as above (with 4 row blocks to take care of the sunlight
gradient and 4 column blocks to take care of the moisture gradient).
If you now apply the four fertilizers (i.e. treatments A,B,C,D) in such a
way that each treatment occurs once (and only once) in each row and
in each column, then we have what is known as a Latin Square Design.
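The layout above can be generated and checked in SAS. The following is a minimal sketch, assuming a cyclic construction; the dataset name latin4 and the variable names row, col, and trt are illustrative and not part of the handout.

```sas
/* Sketch: build the 4 x 4 cyclic Latin square shown above.
   Names (latin4, row, col, trt) are illustrative assumptions. */
data latin4;
  length trt $1;
  do row = 1 to 4;
    do col = 1 to 4;
      /* shift the letters A-D one position to the left in each new row */
      trt = substr('ABCD', mod(row + col - 2, 4) + 1, 1);
      output;
    end;
  end;
run;

/* Each treatment should occur exactly once in every row and every column,
   so every cell count in both cross-tabulations should equal 1. */
proc freq data=latin4;
  tables row*trt col*trt / norow nocol nopercent;
run;
```

In an actual experiment one would also randomize the order of the rows, the order of the columns, and the assignment of fertilizers to letters before using the square.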
Usually, the row block effects and the column block effects are random effects and it is assumed that
there is no row * column, row * treatment, column * treatment, or row * column * treatment
interaction. In fact, it is the contrasts that would estimate the above interactions that are used to estimate
the error variance $\sigma^2$.
Sometimes the row effects or the column effects (or both) are due to a specific treatment factor.
Then the rows, the columns (or both) are fixed effects.
e.g.: In the agriculture example given above, suppose the
experimental field is homogeneous (and hence no blocking
is necessary), but the agriculturalist is interested in two other
factors, namely wheat variety and time of application of fertilizer.
Suppose each of these two factors also has 4 levels.
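$$Y_{ijk} = \mu + \tau_i + \beta_j + \gamma_k + \varepsilon_{ijk}, \qquad i, j, k = 1, 2, \ldots, p,$$
where $\tau_i$ is the treatment effect, $\beta_j$ the row effect, $\gamma_k$ the column effect, and
$p$ = # of treatments = # of rows = # of columns,
or
$$Y_{ijk} = \mu_{ijk} + \varepsilon_{ijk}, \qquad \varepsilon_{ijk} \sim \text{iid } N(0, \sigma^2),$$
if row & column effects are fixed.
In the effects model
$$Y_{ijk} = \mu + \tau_i + \beta_j + \gamma_k + \varepsilon_{ijk}, \qquad i, j, k = 1, 2, \ldots, p,$$
the treatment effects satisfy $\sum_{i=1}^{p} \tau_i = 0$.
If $\beta_j$, $\gamma_k$ are fixed, then $\sum_{j=1}^{p} \beta_j = 0$, $\sum_{k=1}^{p} \gamma_k = 0$, and $\varepsilon_{ijk} \sim \text{iid } N(0, \sigma^2)$.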
Note that the above model is completely additive. That is, it has no interaction terms.
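The sums of squares are
$$SS_{Total} = \sum_{i}\sum_{j}\sum_{k} Y_{ijk}^2 - N\,\bar{Y}_{\cdot\cdot\cdot}^2, \qquad \text{where } N = p^2,$$
$$SS_{Treatment} = p\sum_{i=1}^{p} \bar{Y}_{i\cdot\cdot}^2 - N\,\bar{Y}_{\cdot\cdot\cdot}^2,$$
$$SS_{Rows} = p\sum_{j=1}^{p} \bar{Y}_{\cdot j\cdot}^2 - N\,\bar{Y}_{\cdot\cdot\cdot}^2,$$
$$SS_{Columns} = p\sum_{k=1}^{p} \bar{Y}_{\cdot\cdot k}^2 - N\,\bar{Y}_{\cdot\cdot\cdot}^2.$$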
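It can be shown that $SS_{Treatment}$, $SS_{Rows}$, $SS_{Columns}$ are independent, and are also independent of $SS_{Error}$,
where
$$SS_{Error} = SS_{Total} - SS_{Treatment} - SS_{Rows} - SS_{Columns}.$$
Further,
$$F_0 = \frac{SS_{Treatment}/(p-1)}{SS_{Error}/[(p-2)(p-1)]} \sim F_{p-1,\,(p-2)(p-1)}$$
if
$H_0: \tau_1 = \tau_2 = \cdots = \tau_p = 0$ holds (otherwise, $F_0$ has a non-central F distribution).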
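Source        d.f.             SS            MS            F
Treatments    p - 1            SSTreatment   MSTreatment   Fo = MSTreatment / MSError
Rows          p - 1            SSRows        MSRows
Columns       p - 1            SSColumns     MSColumns
Error         (p - 2)(p - 1)   SSError       MSError
Total         p^2 - 1          SSTotal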
Example 2: Consider an experiment to investigate the effect of 4 diets on milk production. There are
4 cows. In each lactation period, each cow receives a different diet. Assume there is a washout period so
that the previous diet does not affect future results.
One disadvantage of a Latin Square Design is that smaller squares yield low d.f. for error (e.g., a 3 x 3
design has only 2 d.f. for error; a 5 x 5 design has only 12 d.f. for error). To overcome this problem,
one may replicate the Latin Square n times (n > 1).
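The error degrees of freedom quoted above, and the gain from replication, follow from simple counting. The sketch below assumes $p$ denotes the size of the square and that the replicates reuse the same rows and columns (Case 1 below); the replicated formula is obtained by subtraction from the total, not quoted from the handout.

```latex
\begin{align*}
\text{Single square:}\quad
  & (p^2 - 1) - 3(p - 1) = (p-1)(p-2)
    \;\Rightarrow\; p = 3:\ 2, \quad p = 4:\ 6, \quad p = 5:\ 12,\\
\text{$n$ replicates (same rows and columns):}\quad
  & (np^2 - 1) - 3(p - 1) - (n - 1) = (p-1)\left[n(p+1) - 3\right]
    \;\Rightarrow\; p = 3,\ n = 3:\ 18 .
\end{align*}
```

Three replicates of a 3 x 3 square thus raise the error d.f. from 2 to 18, at the cost of one extra term (replicates, with $n - 1$ d.f.) in the model.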
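Case 1: Latin squares replicated with the same blocks. (Use the same columns and rows in each replicate.)
$$Y_{ijkl} = \mu + \tau_i + \beta_j + \gamma_k + \delta_l + \varepsilon_{ijkl}, \qquad i, j, k = 1, 2, \ldots, p; \quad l = 1, 2, \ldots, n,$$
where $\delta_l$ is the replication effect.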
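With $N = np^2$:
$$SS_{Treatment} = np\sum_{i=1}^{p} \bar{Y}_{i\cdot\cdot\cdot}^2 - N\,\bar{Y}_{\cdot\cdot\cdot\cdot}^2,$$
$$SS_{Rows} = np\sum_{j=1}^{p} \bar{Y}_{\cdot j\cdot\cdot}^2 - N\,\bar{Y}_{\cdot\cdot\cdot\cdot}^2,$$
$$SS_{Columns} = np\sum_{k=1}^{p} \bar{Y}_{\cdot\cdot k\cdot}^2 - N\,\bar{Y}_{\cdot\cdot\cdot\cdot}^2,$$
$$SS_{Replication} = SS_{Squares} = p^2\sum_{l=1}^{n} \bar{Y}_{\cdot\cdot\cdot l}^2 - N\,\bar{Y}_{\cdot\cdot\cdot\cdot}^2,$$
$$SS_{Total} = \sum_{i}\sum_{j}\sum_{k}\sum_{l} Y_{ijkl}^2 - N\,\bar{Y}_{\cdot\cdot\cdot\cdot}^2.$$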
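ANOVA
Source        d.f.                    SS              MS              F
Treatments    p - 1                   SSTreatment     MSTreatment     Fo = MSTrt / MSError
Rows          p - 1                   SSRows          MSRows
Columns       p - 1                   SSColumns       MSColumns
Replicates    n - 1                   SSReplication   MSReplication
Error         (p - 1)[n(p + 1) - 3]   SSError         MSError
Total         np^2 - 1                SSTotal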
Example 3 (Case 1): Same rows and same columns in additional squares
Replicate 1
         Column         Response
Row      1  2  3        1  2  3
1        A  B  C        7  8  9
2        B  C  A        4  5  6
3        C  A  B        6  3  4
Replicate 2
         Column         Response
Row      1  2  3        1  2  3
1        C  B  A        8  4  7
2        B  A  C        6  3  6
3        A  C  B        5  8  7
Replicate 3
         Column         Response
Row      1  2  3        1  2  3
1        B  A  C        9  6  8
2        A  C  B        5  7  6
3        C  B  A        9  3  7
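data case1;
  /* rep = replicate (square); row, col = blocking factors; trt: 1=A, 2=B, 3=C */
  input rep row col trt resp;
  datalines;
1 1 1 1 7
1 1 2 2 8
1 1 3 3 9
1 2 1 2 4
1 2 2 3 5
1 2 3 1 6
1 3 1 3 6
1 3 2 1 3
1 3 3 2 4
2 1 1 3 8
2 1 2 2 4
2 1 3 1 7
2 2 1 2 6
2 2 2 1 3
2 2 3 3 6
2 3 1 1 5
2 3 2 3 8
2 3 3 2 7
3 1 1 2 9
3 1 2 1 6
3 1 3 3 8
3 2 1 1 5
3 2 2 3 7
3 2 3 2 6
3 3 1 3 9
3 3 2 2 3
3 3 3 1 7
;
/* Case 1: the same rows and columns appear in every replicate,
   so rep, row, and col all enter as crossed main effects */
proc glm data=case1;
  class rep row col trt;
  model resp=rep row col trt;
run;
quit;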
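Case 2: Latin squares replicated by introducing additional versions of one blocking factor but using the
same blocks for the other blocking factor. (Use different rows but the same columns in each
replicate.)
W.l.o.g. assume that the column blocks are repeated but the row blocks have additional versions.
$$Y_{ijkl} = \mu + \tau_i + \beta_{j(l)} + \gamma_k + \delta_l + \varepsilon_{ijkl}, \qquad i, j, k = 1, 2, \ldots, p; \quad l = 1, 2, \ldots, n,$$
where $\beta_{j(l)}$ is the effect of row $j$ within replicate $l$.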
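where $N = np^2$ and
$$SS_{Treatment} = np\sum_{i=1}^{p} \bar{Y}_{i\cdot\cdot\cdot}^2 - N\,\bar{Y}_{\cdot\cdot\cdot\cdot}^2,$$
$$SS_{Rows} = p\sum_{l=1}^{n}\sum_{j=1}^{p} \bar{Y}_{\cdot j\cdot l}^2 - p^2\sum_{l=1}^{n} \bar{Y}_{\cdot\cdot\cdot l}^2,$$
$$SS_{Columns} = np\sum_{k=1}^{p} \bar{Y}_{\cdot\cdot k\cdot}^2 - N\,\bar{Y}_{\cdot\cdot\cdot\cdot}^2,$$
$$SS_{Replicates} = p^2\sum_{l=1}^{n} \bar{Y}_{\cdot\cdot\cdot l}^2 - N\,\bar{Y}_{\cdot\cdot\cdot\cdot}^2.$$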
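ANOVA
Source        d.f.               SS             MS             F
Treatments    p - 1              SSTreatment    MSTreatment    Fo = MSTrt / MSError
Rows          n(p - 1)           SSRows         MSRows
Columns       p - 1              SSColumns      MSColumns
Replicates    n - 1              SSReplicates   MSReplicates
Error         (p - 1)(np - 2)    SSError        MSError
Total         np^2 - 1           SSTotal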
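Example (Case 2): New (different) rows and same columns
Replicate 1
         Column         Response
Row      1  2  3        1  2  3
1        A  B  C        7  8  9
2        B  C  A        4  5  6
3        C  A  B        6  3  4
Replicate 2
         Column         Response
Row      1  2  3        1  2  3
4        C  B  A        8  4  7
5        B  A  C        6  3  6
6        A  C  B        5  8  7
Replicate 3
         Column         Response
Row      1  2  3        1  2  3
7        B  A  C        9  6  8
8        A  C  B        5  7  6
9        C  B  A        9  3  7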
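The Case 2 example can be analyzed along the same lines as the Case 1 program. The sketch below is one possible coding, not the handout's own program: the dataset name case2 is illustrative, the data are transcribed from the Case 2 layout with A=1, B=2, C=3, and the nested term row(rep) is used to express that each replicate has its own rows.

```sas
data case2;
  /* rep = square; rows are numbered 1-9 across squares; columns 1-3 are shared;
     trt: 1=A, 2=B, 3=C (coding assumed, as in the Case 1 program) */
  input rep row col trt resp @@;
  datalines;
1 1 1 1 7  1 1 2 2 8  1 1 3 3 9
1 2 1 2 4  1 2 2 3 5  1 2 3 1 6
1 3 1 3 6  1 3 2 1 3  1 3 3 2 4
2 4 1 3 8  2 4 2 2 4  2 4 3 1 7
2 5 1 2 6  2 5 2 1 3  2 5 3 3 6
2 6 1 1 5  2 6 2 3 8  2 6 3 2 7
3 7 1 2 9  3 7 2 1 6  3 7 3 3 8
3 8 1 1 5  3 8 2 3 7  3 8 3 2 6
3 9 1 3 9  3 9 2 2 3  3 9 3 1 7
;
proc glm data=case2;
  class rep row col trt;
  /* row(rep): rows nested within replicates; columns are common to all squares */
  model resp = rep row(rep) col trt;
run;
quit;
```

With p = 3 and n = 3, the d.f. in the output should match the Case 2 ANOVA table: 2 for replicates, 6 for rows within replicates, 2 for columns, 2 for treatments, and 14 for error.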
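Case 3: Latin squares replicated by introducing additional versions of both blocking variables.
(Use different rows and columns in each replicate.)
$$Y_{ijkl} = \mu + \tau_i + \beta_{j(l)} + \gamma_{k(l)} + \delta_l + \varepsilon_{ijkl}, \qquad i, j, k = 1, 2, \ldots, p; \quad l = 1, 2, \ldots, n,$$
$$SS_{Columns} = p\sum_{l=1}^{n}\sum_{k=1}^{p} \bar{Y}_{\cdot\cdot k l}^2 - p^2\sum_{l=1}^{n} \bar{Y}_{\cdot\cdot\cdot l}^2,$$
$$SS_{Replicates} = p^2\sum_{l=1}^{n} \bar{Y}_{\cdot\cdot\cdot l}^2 - N\,\bar{Y}_{\cdot\cdot\cdot\cdot}^2, \qquad \text{where } N = np^2.$$
The ANOVA table is as in Case 2, except that SSColumns is computed differently and has n(p - 1) d.f., and SSError
has (p - 1)[n(p - 1) - 1] d.f.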
SAS can be used to analyze repeated Latin Squares as follows:
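Example (Case 3): New rows and new columns
Replicate 1
         Column         Response
Row      1  2  3        1  2  3
1        A  B  C        7  8  9
2        B  C  A        4  5  6
3        C  A  B        6  3  4
Replicate 2
         Column         Response
Row      4  5  6        4  5  6
4        C  B  A        8  4  7
5        B  A  C        6  3  6
6        A  C  B        5  8  7
Replicate 3
         Column         Response
Row      7  8  9        7  8  9
7        B  A  C        9  6  8
8        A  C  B        5  7  6
9        C  B  A        9  3  7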
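A possible SAS program for the Case 3 layout is sketched here. It is an illustration under stated assumptions rather than the handout's own code: the dataset name case3 is made up, treatments are coded A=1, B=2, C=3, and both blocking factors are entered as nested within replicates.

```sas
data case3;
  /* rep = square; rows and columns are both numbered 1-9 across squares;
     trt: 1=A, 2=B, 3=C (coding assumed, as in the Case 1 program) */
  input rep row col trt resp @@;
  datalines;
1 1 1 1 7  1 1 2 2 8  1 1 3 3 9
1 2 1 2 4  1 2 2 3 5  1 2 3 1 6
1 3 1 3 6  1 3 2 1 3  1 3 3 2 4
2 4 4 3 8  2 4 5 2 4  2 4 6 1 7
2 5 4 2 6  2 5 5 1 3  2 5 6 3 6
2 6 4 1 5  2 6 5 3 8  2 6 6 2 7
3 7 7 2 9  3 7 8 1 6  3 7 9 3 8
3 8 7 1 5  3 8 8 3 7  3 8 9 2 6
3 9 7 3 9  3 9 8 2 3  3 9 9 1 7
;
proc glm data=case3;
  class rep row col trt;
  /* both blocking factors are nested within replicates in Case 3 */
  model resp = rep row(rep) col(rep) trt;
run;
quit;
```

With p = 3 and n = 3, the error line should carry (p - 1)[n(p - 1) - 1] = 10 d.f., compared with 14 in Case 2 and 18 in Case 1 for the same 27 observations.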
Def.
Let a p x p Latin square consist of Latin letters and another p x p Latin square consist of Greek
letters. Suppose they have the property that when superimposed, each Latin letter coincides
exactly once with each Greek letter. Then the two squares are said to be orthogonal.
A collection of n p x p Latin squares is said to be a mutually orthogonal set of Latin squares
if each letter in one square coincides with each combination of the letters in the other squares
exactly once.
Def.
A pair of orthogonal p x p squares, one of Latin letters and one of Greek letters, forms a Graeco-Latin square.
Using a Graeco-Latin square, one may block in a 3rd direction or analyze a 2nd treatment.
A B C D E
B C D E A
C D E A B
D E A B C
E A B C D
Note: When more than two orthogonal Latin squares are superimposed, we obtain a Hyper-Graeco-
Latin square.
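To make the orthogonality condition concrete, the sketch below builds a 5 x 5 Graeco-Latin square whose Latin part is the cyclic square displayed above, and then checks the defining property. The construction of the Greek square (stepping two letters per row) is an assumption that works because 5 is prime; lowercase letters a-e stand in for Greek letters, and the dataset name gl5 is illustrative.

```sas
/* Sketch: superimpose two orthogonal 5 x 5 Latin squares.
   Lowercase a-e stand in for Greek letters; names are illustrative. */
data gl5;
  length latin greek $1;
  do row = 1 to 5;
    do col = 1 to 5;
      /* Latin square: cyclic shift of one letter per row (as displayed above) */
      latin = substr('ABCDE', mod((row - 1) + (col - 1), 5) + 1, 1);
      /* Greek square: cyclic shift of two letters per row */
      greek = substr('abcde', mod(2*(row - 1) + (col - 1), 5) + 1, 1);
      output;
    end;
  end;
run;

/* Orthogonality: every Latin-Greek pair occurs exactly once, and the Greek
   letters themselves form a Latin square, so all cell counts should be 1. */
proc freq data=gl5;
  tables latin*greek row*greek col*greek / norow nocol nopercent;
run;
```

The same check (a cross-tabulation of the two letter factors with every count equal to 1) can be applied to any proposed Graeco-Latin layout before running the experiment.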
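Model is:
$$Y_{ijkl} = \mu + \tau_i + \omega_j + \beta_k + \gamma_l + \varepsilon_{ijkl},$$
where $\tau_i$ is the Latin letter (treatment) effect, $\omega_j$ the Greek letter effect, $\beta_k$ the row effect, and $\gamma_l$ the column effect.
$$SS_{Total} = \sum_{i}\sum_{j}\sum_{k}\sum_{l} Y_{ijkl}^2 - N\,\bar{Y}_{\cdot\cdot\cdot\cdot}^2, \qquad \text{where } N = p^2.$$
The remaining sums of squares, with their d.f. in parentheses, are
$$SS_{Trt} = SS_{Latin} = p\sum_{i=1}^{p} \bar{Y}_{i\cdot\cdot\cdot}^2 - N\,\bar{Y}_{\cdot\cdot\cdot\cdot}^2 \qquad (p - 1),$$
$$SS_{Greek} = p\sum_{j=1}^{p} \bar{Y}_{\cdot j\cdot\cdot}^2 - N\,\bar{Y}_{\cdot\cdot\cdot\cdot}^2 \qquad (p - 1),$$
$$SS_{Rows} = p\sum_{k=1}^{p} \bar{Y}_{\cdot\cdot k\cdot}^2 - N\,\bar{Y}_{\cdot\cdot\cdot\cdot}^2 \qquad (p - 1),$$
$$SS_{Columns} = p\sum_{l=1}^{p} \bar{Y}_{\cdot\cdot\cdot l}^2 - N\,\bar{Y}_{\cdot\cdot\cdot\cdot}^2 \qquad (p - 1).$$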
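The error line of the Graeco-Latin analysis follows by subtraction. This is a sketch of the standard result, written with $p$ for the size of the square; it is not copied from the handout's display.

```latex
\begin{align*}
SS_{Error} &= SS_{Total} - SS_{Latin} - SS_{Greek} - SS_{Rows} - SS_{Columns},\\
\text{d.f.}_{Error} &= (p^2 - 1) - 4(p - 1) = (p - 1)(p - 3),\\
F_0 &= \frac{SS_{Latin}/(p-1)}{SS_{Error}/[(p-1)(p-3)]} \sim F_{p-1,\,(p-1)(p-3)}
       \quad \text{under } H_0 \text{ of no treatment effect.}
\end{align*}
```

For a 4 x 4 Graeco-Latin square this leaves only $(4-1)(4-3) = 3$ d.f. for error, so the F test for treatments is based on 3 and 3 d.f.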
SAS code:
data additives;
input row col trt greek resp @@;
datalines;
1 1 1 1 32 1 2 2 2 25
1 3 3 3 31 1 4 4 4 27
2 1 2 4 24 2 2 1 3 36
2 3 4 2 20 2 4 3 1 25
3 1 3 2 28 3 2 4 1 30
3 3 1 4 23 3 4 2 3 31
4 1 4 3 34 4 2 3 4 35
4 3 2 1 29 4 4 1 2 33
;
proc glm data=additives;
class row col trt greek;
model resp=row col trt greek;
run;
SAS outputs:
Practice:
1. Run the repeated Latin square design for Cases 2 and 3. Interpret your results.
2. Interpret the results for Examples 1-4.
3. Draw conclusions for Example 5. Include the analysis of multiple comparisons.
Assignments:
1. Lew (2007) presents the data from an experiment to determine whether cultured cells respond to two
drugs. The experiment was conducted using a stable cell line plated onto Petri dishes, with each
experimental run involving assays of responses in three Petri dishes: one treated with drug 1, one
treated with drug 2, and one untreated serving as a control. The data are shown in the table below:
(a) Analyze the data as if it came from a completely randomized design (CRD). Write down the classical
effect model for CRD and the five steps for hypothesis testing. Is there a significant difference between
the treatment groups?
(b) Analyze the data as a complete randomized block design (CRBD). What is the treatment? What is the
blocking factor? Write down the classical effect model for CRBD and the five steps for hypothesis
testing. Is there a significant difference between the treatment groups?
(c) Is there any difference in the results you obtain in (a) and (b)? If so, explain what may be the cause of
the difference in the results. Which method would you recommend?
2. Le Riche and Csima (1964) evaluated four hypnotic drugs and a placebo to determine their effect on
quality of sleep in elderly patients. The treatment levels were labeled (A=Placebo, B=Ethchlorvynol,
C=Glutethimide, D=Chloral hydrate and E=Secobarbital sodium). Elderly patients were given one of the
capsules for five nights in succession and their quality of sleep was rated by a trained nurse on a four-
point scale (0=poor to 3=excellent) each night. An average score was calculated for each patient over
the five nights in a week. Each patient received all five treatments in successive weeks. The design and
the response (mean quality of sleep rating ) are shown in the table below:
Week
Patient 1 2 3 4 5
1 B 2.92 E 2.43 A 2.19 C 2.71 D 2.71
2 D 2.86 A 1.64 E 3.02 B 3.03 C 3.03
3 E 1.97 B 2.5 C 2.47 D 2.65 A 1.89
4 A 1.99 C 2.39 D 2.37 E 2.33 B 2.71
5 C 2.64 D 2.31 B 2.44 A 1.89 E 2.78
(a) What are the nuisance factors in this problem? What is the appropriate model for this data?
(b) Write down the classical effect model for this design and determine if there are any significant
differences among the treatments.
(c) Use an appropriate method to determine if there is a significant difference between the placebo and
the other four drugs.
(d) Use an appropriate method to determine which drug or drugs have the highest rating.
(e) Use residual plots to check the assumption for the model you fit.
3. A manufacturing firm investigated the breaking strengths of components made from raw materials
purchased from 4 suppliers (A, B, C, D). Data were collected from 2 replicates of a 4 x 4 Latin square
design. The blocking factors were days and operators.
(a) The same four operators were used in both replicates. Each replicate was also run on the same four
days, with replicated values taken during the mornings and afternoons of these four days. Write down
the statistical model for this data. Is there any significant difference among the different suppliers?
Replicate 1
Operator    Day 1     Day 2     Day 3     Day 4
1           B 810     C 1080    A 700     D 910
2           C 1100    D 880     B 780     A 600
3           D 840     A 540     C 1055    B 830
4           A 650     B 740     D 1025    C 900
Replicate 2
Operator    Day 1     Day 2     Day 3     Day 4
1           D 840     C 1050    A 775     B 805
2           A 670     D 930     B 720     C 1035
3           C 980     B 700     D 810     A 610
4           B 860     A 730     C 970     D 900
(b) Eight operators were used with four operators randomly assigned to each replicate. The two replicates
were run over 8 days with the first 4 days assigned to replicate 1 and the second four days assigned to
replicate 2. Write down the statistical model for this data. Is there any significant difference among the
different suppliers?
Replicate 1
Operator    Day 1     Day 2     Day 3     Day 4
1           B 810     C 1080    A 700     D 910
2           C 1100    D 880     B 780     A 600
3           D 840     A 540     C 1055    B 830
4           A 650     B 740     D 1025    C 900
Replicate 2 (days numbered 1-4 within the replicate, as in the original table)
Operator    Day 1     Day 2     Day 3     Day 4
5           D 840     C 1050    A 775     B 805
6           A 670     D 930     B 720     C 1035
7           C 980     B 700     D 810     A 610
8           B 860     A 730     C 970     D 900