MATH 4220 Dr. Zeng
Student Activity 6: Randomized Block Design, Latin Square, Repeated Latin Square, and Graeco Latin Square
Consider the “one-way treatment structure in a completely randomized design structure” experiment.

We have “a” treatments, each replicated n times (we consider the balanced case for simplicity). The
appropriate means model is

$$Y_{ij} = \mu_i + \epsilon_{ij}, \qquad i = 1, 2, \ldots, a; \quad j = 1, 2, \ldots, n,$$

where $\epsilon_{ij} \sim \text{iid } N(0, \sigma^2)$.

The error terms $\epsilon_{ij}$ denote the plot-to-plot variation in the response that cannot be attributed to the
treatment effect.

The variance $\sigma^2$ is a measure of this variation. If the plots are more alike (homogeneous), then $\sigma^2$
will be small. If the plots are very different from one another, $\sigma^2$ will be large.

A small $\sigma^2$ enables an experimenter to attribute even small variation in the treatment sample means
$\bar{Y}_{1\cdot}, \ldots, \bar{Y}_{a\cdot}$ to differences between treatment (population) means $\mu_i$. In other words, a small $\sigma^2$ results
in a more powerful F-test. The reverse is true if $\sigma^2$ is large.

Thus, one of the main tasks of an experimenter is to reduce $\sigma^2$ by using homogeneous experimental
units.

However, one should make sure that such homogeneity does not compromise the applicability of the
results.

[e.g.: Using white males ages 21-25 in a test of a hair-growing formulation
will make the results inapplicable to older males, individuals of
other races, and females.]

Another way to reduce $\sigma^2$ is by grouping experimental units that are more alike.

e.g.: 1) We have two drugs to be tested. Use identical twins, say
5 pairs. Randomly pick one twin from each pair and give
drug one. The other twin gets drug two. We rely on the
fact that within-pair variation is less than between-pair
variation.

e.g.: 2) We need to test two types of shoe soles. Pick 20 people
and randomly assign one type of sole to one foot of each
person and the other type to the other foot. Here again,
between-foot variation within a person is less than between-
person variation.

e.g.: 3) In an agricultural experiment to compare the yield of 4 varieties
of soybeans, divide experimental land into four blocks, each
block containing 5 plots (i.e. experimental units). In each block,
randomly assign each variety to a plot.

All the above are examples of “BLOCKING”. In example 1), the block is a pair of twins, in example
2), the block is a person, and in example 3), the block is a piece of land consisting of 5 adjoining
plots.

In all cases, plot to plot variation within a block is less than block to block variation.

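THE MEANS MODEL FOR A ONE-WAY (FIXED EFFECT) TREATMENT STRUCTURE IN A
RANDOMIZED BLOCK DESIGN

$$Y_{ij} = \mu_i + \beta_j + \epsilon_{ij}, \qquad i = 1, 2, \ldots, a; \quad j = 1, 2, \ldots, b,$$

where $\beta_j \sim \text{iid } N(0, \sigma_b^2)$, $\epsilon_{ij} \sim \text{iid } N(0, \sigma^2)$, and the $\beta_j$ and $\epsilon_{ij}$ are independent.

$\mu_i$ denotes the population mean for the $i$th treatment, and $\beta_j$ denotes the (random) effect of the $j$th block.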

One can consider the above model as a two-way model where the row effect is fixed but the
column effect is random (so it is a mixed model). In fact, the appropriate sum of squares can be
obtained by treating it as a two-way model without interaction.

The plot-to-plot variation within a fixed block is $\sigma^2$. Thus, the error variance of a plot selected
randomly from a pre-specified block (after accounting for the block effect) is $\sigma^2$.

Thus $\mathrm{Var}(Y_{12} - Y_{22}) = \mathrm{Var}(\epsilon_{12} - \epsilon_{22}) = 2\sigma^2$. However, the variance of the response of a plot randomly
picked from the totality of $ab$ plots is not $\sigma^2$ but is

$$\sigma^2 + \frac{b-1}{b}\,\sigma_b^2 \;\approx\; \sigma^2 + \sigma_b^2 \quad \text{if } b \text{ is large.}$$

Note that $\sigma_b^2$ is the block variation (scaled to reflect the plot size).

If $\sigma_b^2$ is large, then blocking enables the experimenter to come up with a more “sensitive” experiment.

THE CLASSICAL MODEL

$$Y_{ij} = \mu + \tau_i + \beta_j + \epsilon_{ij}, \qquad i = 1, 2, \ldots, a; \quad j = 1, 2, \ldots, b,$$

where $\mu$ is the overall mean, $\tau_i$ is the effect of the $i$th treatment, and $\epsilon_{ij}$, $\beta_j$ are defined as before.

Observe that we have no interaction term. In blocked experiments, it is assumed that there
is no block-by-treatment interaction.

The constraints assumed are $\sum_{i=1}^{a} \tau_i = 0$ and $\sum_{j=1}^{b} \beta_j = 0$. The second restriction is needed only if
the blocking effect is considered fixed.


AN EXAMPLE OF AN ANALYSIS OF DATA FROM A
RANDOMIZED COMPLETE BLOCK DESIGN

Example 1: Three different washing solutions are being compared to study their effectiveness in
retarding bacteria growth in 5-gallon milk containers. The analysis is done in a laboratory, and only
three trials can be run on any day. Because days could represent a potential source of variability, the
experimenter decides to use a randomized block design. Observations are taken for four days, and
the data are shown here. Analyze the data from this experiment and draw conclusions.

In this example, the blocking factor is the day. The treatment is “solution”. We have three types of
solutions and four levels for “day.”

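options ls=72 nodate;

/* One record per run: solution (treatment), day (block), bacteria (response) */
data wash;
input solution day bacteria;
cards;
1 1 13
1 2 22
1 3 18
1 4 39
2 1 16
2 2 24
2 3 17
2 4 44
3 1 5
3 2 4
3 3 1
3 4 22
;
proc print;
title1 ' MATH 338 : Experimental Design';
title2 'Example on Randomized Complete Block Design';
title3 'List of Data';
/* Additive model: block (day) + treatment (solution).                */
/* The SOLUTION option after the slash prints the parameter estimates. */
proc glm;
title3 'analysis of variance results';
class solution day;
model bacteria = day solution / solution;
means solution / tukey;   /* Tukey pairwise comparisons of the solutions */
proc glm;
title3 'analysis of variance results with lsmeans';
class solution day;
model bacteria = day solution / solution;
lsmeans solution / tdiff; /* t tests for differences of least-squares means */
run;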
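As a cross-check on the PROC GLM analysis, the sums of squares for this randomized block design can be computed directly from the treatment and block totals. The following is a minimal sketch in plain Python (Python is not used in these notes; the function name `rcbd_anova` and the dictionary keys are assumptions made for this illustration). It uses the data listed in the SAS data step above and assumes a balanced design with one observation per solution-by-day cell.

```python
# (solution, day, bacteria) triples copied from the SAS data step above.
data = [
    (1, 1, 13), (1, 2, 22), (1, 3, 18), (1, 4, 39),
    (2, 1, 16), (2, 2, 24), (2, 3, 17), (2, 4, 44),
    (3, 1, 5), (3, 2, 4), (3, 3, 1), (3, 4, 22),
]


def rcbd_anova(records):
    """Sums of squares for a balanced RCBD with one observation per cell."""
    treatments = sorted({trt for trt, _, _ in records})
    blocks = sorted({blk for _, blk, _ in records})
    a, b = len(treatments), len(blocks)

    grand_total = sum(y for _, _, y in records)
    correction = grand_total ** 2 / (a * b)          # N * (grand mean)^2
    ss_total = sum(y ** 2 for _, _, y in records) - correction

    # Totals per treatment and per block.
    trt_totals = [sum(y for trt, _, y in records if trt == t) for t in treatments]
    blk_totals = [sum(y for _, blk, y in records if blk == k) for k in blocks]

    ss_trt = sum(t ** 2 for t in trt_totals) / b - correction
    ss_blk = sum(t ** 2 for t in blk_totals) / a - correction
    ss_err = ss_total - ss_trt - ss_blk

    df_trt, df_err = a - 1, (a - 1) * (b - 1)
    f_trt = (ss_trt / df_trt) / (ss_err / df_err)
    return {
        "ss_treatment": ss_trt, "ss_block": ss_blk, "ss_error": ss_err,
        "ss_total": ss_total, "df_treatment": df_trt, "df_error": df_err,
        "f_treatment": f_trt,
    }


result = rcbd_anova(data)
for name, value in result.items():
    print(f"{name:>14}: {value:.4f}")
```

For these data the sketch gives SS(solution) = 703.5, SS(day) of about 1106.92, SS(error) of about 51.83, and F of about 40.72 on (2, 6) d.f., which can be compared against the ANOVA table that PROC GLM prints.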


THE SAS OUTPUT IS GIVEN BELOW


OTHER BLOCK DESIGNS

There are many types of block designs, with RCB being one of them. Some of the other block
designs are Latin Square Designs, Graeco-Latin Square Designs, and Split-Plot Designs.

LATIN SQUARE DESIGN

In a randomized complete block design, the blocking was done to reduce variation that can be
attributed to some random (and in some cases fixed) factor. For example, in an agricultural
experiment, blocking may be done to remove the effect due to a fertility gradient; in a chemistry
experiment blocking may be done to remove the effect of the chemists’ skills. In some situations, it is
possible that one wishes to remove the effect of two factors. Then blocking has to be done in two
“directions”, each “direction” corresponding to the “gradient” of a given factor.

e.g: An agricultural scientist wishes to study the effects of 4 different kinds of fertilizer
on a certain variety of wheat. The experimental field in which the wheat
is to be grown has a moisture gradient in one direction and a sunlight
gradient perpendicular to it.

Hence we need to block in both directions.

                          Column Blocks (Moisture Gradient)

                                  A  B  C  D
Row Blocks                        B  C  D  A
(Sunlight Gradient)               C  D  A  B
                                  D  A  B  C

One may block as above (with 4 row blocks to take care of the sunlight
gradient and 4 column blocks to take care of the moisture gradient).

If you now apply the four fertilizers (i.e. treatments A,B,C,D) in such a
way that each treatment occurs once (and only once) in each row and
in each column, then we have what is known as a Latin Square Design.
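The defining property just stated (each treatment exactly once in every row and every column) is easy to verify mechanically before running an experiment. The sketch below is an illustrative plain-Python check; the function name `is_latin_square` is an assumption of this example and is not part of the course material.

```python
def is_latin_square(square):
    """Return True if every symbol appears exactly once in each row and column."""
    p = len(square)
    symbols = set(square[0])
    if len(symbols) != p:
        return False
    rows_ok = all(len(row) == p and set(row) == symbols for row in square)
    cols_ok = all({square[r][c] for r in range(p)} == symbols for c in range(p))
    return rows_ok and cols_ok


# The 4 x 4 fertilizer layout from the text: one string per row block.
layout = ["ABCD", "BCDA", "CDAB", "DABC"]
print(is_latin_square(layout))
```

A layout that repeats a treatment within a column, such as one with two identical rows, fails the check.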

Usually, the row block effects and the column block effects are random effects and it is assumed that
there is no row * column, row * treatment, column * treatment and row * column * treatment
interaction. In fact, it is the contrasts that estimate the above interactions that are used to estimate
the error variance $\sigma^2$.

Sometimes, the row effects or the column effects are those due to a specific treatment (or both are).
Then the rows, the columns (or both) are fixed effects.

e.g.: In the agriculture example given above, suppose the
experimental field is homogeneous (and hence no blocking
is necessary), but the agriculturalist is interested in two other
factors, namely wheat variety and time of application of fertilizer.
Suppose each of these two factors also has 4 levels.

Then the agriculturalist could have conducted a 3-way
experiment. With 2 replications for each of the 4 * 4 * 4
treatment combinations, he would need 128 experimental
units (plots).

Suppose he knows that no interaction exists, so he need not
replicate, because the interaction contrasts can be used to estimate error.
Even then he needs 64 plots.

Now, if the no-interaction hypothesis is true (i.e. no variety *
fertilizer, variety * time, time * fertilizer, and variety * time *
fertilizer interactions), then he could use the Latin square design shown earlier,
with the varieties randomly assigned to the rows and times
of fertilizing randomly assigned to the columns. This way, he
needs only 16 plots!

Usually, however, such an assumption of no interaction is not
reasonable, and thus the agriculturalist may end up having to
use 128 plots.

THE GENERAL MEANS MODEL FOR A LATIN SQUARE DESIGN

$$Y_{ijk} = \mu_i + \rho_j + \gamma_k + \epsilon_{ijk}, \qquad i, j, k = 1, 2, \ldots, p,$$

where $p$ = # of treatments = # of rows = # of columns. Here $i$ denotes the treatment, $j$ denotes the row, and $k$ denotes the column.

(*) If row and column effects are random:

$$\rho_j \sim \text{iid } N(0, \sigma_r^2), \qquad \gamma_k \sim \text{iid } N(0, \sigma_c^2), \qquad \epsilon_{ijk} \sim \text{iid } N(0, \sigma^2),$$

with $\rho_j$, $\gamma_k$, $\epsilon_{ijk}$ independent;

or, if row and column effects are fixed:

$$Y_{ijk} = \mu_{ijk} + \epsilon_{ijk}, \qquad \epsilon_{ijk} \sim \text{iid } N(0, \sigma^2).$$

THE GENERAL CLASSICAL MODEL FOR LATIN SQUARE DESIGN

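$$Y_{ijk} = \mu + \tau_i + \rho_j + \gamma_k + \epsilon_{ijk}, \qquad i, j, k = 1, 2, \ldots, p, \qquad \sum_{i=1}^{p} \tau_i = 0$$

($\tau_i$ denoting the treatment effect and $\mu$ denoting the overall mean).

If the row effects $\rho_j$ and column effects $\gamma_k$ are considered random effects, then (*) above holds.

If $\rho_j$, $\gamma_k$ are fixed, then $\sum_{j=1}^{p} \rho_j = 0$, $\sum_{k=1}^{p} \gamma_k = 0$, and $\epsilon_{ijk} \sim \text{iid } N(0, \sigma^2)$.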

Note that the above model is completely additive. That is, it has no interaction terms.

ANALYSIS OF A LATIN SQUARE DESIGN


  
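$$SS_{Total} = \sum_{i}\sum_{j}\sum_{k} Y_{ijk}^2 - N\bar{Y}_{\cdot\cdot\cdot}^2, \qquad \text{where } N = p^2$$

(the sum runs over the $p^2$ combinations $(i, j, k)$ that actually occur in the square),

$$SS_{Treatment} = p\sum_{i=1}^{p} \bar{Y}_{i\cdot\cdot}^2 - N\bar{Y}_{\cdot\cdot\cdot}^2$$

$$SS_{Rows} = p\sum_{j=1}^{p} \bar{Y}_{\cdot j\cdot}^2 - N\bar{Y}_{\cdot\cdot\cdot}^2$$

$$SS_{Columns} = p\sum_{k=1}^{p} \bar{Y}_{\cdot\cdot k}^2 - N\bar{Y}_{\cdot\cdot\cdot}^2$$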

It can be shown that SSTreatment, SSRows, SSColumns are independent, and are also independent of SSError,
where

$$SS_{Error} = SS_{Total} - SS_{Treatment} - SS_{Rows} - SS_{Columns}.$$

Further,

$$F_0 = \frac{SS_{Treatment}/(p-1)}{SS_{Error}/[(p-2)(p-1)]} \sim F_{p-1,\,(p-2)(p-1)}$$

if $H_0: \tau_1 = \tau_2 = \cdots = \tau_p = 0$ holds, where $p$ is the number of treatments and $\tau_i$ the $i$th treatment effect (otherwise, $F_0$ has a non-central F distribution).

THE ANOVA TABLE

Source        d.f.              SS             MS             F

Treatments    p - 1             SSTreatment    MSTreatment    Fo = MSTreatment / MSError
Rows          p - 1             SSRows         MSRows
Columns       p - 1             SSColumns      MSColumns
Error         (p - 2)(p - 1)    SSError        MSError
Total         p^2 - 1           SSTotal

Then analysis using SAS can be done as follows:

proc glm data=yourdata;


class row col treatment;
model y = row col treatment;
means treatment/lsd tukey;
run;

Example 2: Consider an experiment to investigate the effect of 4 diets on milk production. There are
4 cows. Each lactation period the cows receive a different diet. Assume there is a washout period so
previous diet does not affect future results.

options nocenter ls=75;

/* cow = row block, period = column block, trt = diet, resp = milk production */
data milk;
input cow period trt resp @@;   /* @@ reads several observations from one line */
cards;
1 1 1 38 1 2 2 32 1 3 3 35 1 4 4 33
2 1 2 39 2 2 3 37 2 3 4 36 2 4 1 30
3 1 3 45 3 2 4 38 3 3 1 37 3 4 2 35
4 1 4 41 4 2 1 30 4 3 2 32 4 4 3 33
;
proc glm;
class cow trt period;
model resp=trt period cow;
means trt/lsd tukey;
means period cow;
output out=new r=res p=pred;    /* save residuals and predicted values */
symbol1 v=circle;
/* residuals versus predicted values */
proc gplot;
plot res*pred;
/* normality diagnostics for the residuals */
proc univariate noprint normal;
histogram res/normal (L=1 mu=0 sigma=est) kernel (L=2);
qqplot res/normal (L=1 MU=0 sigma=est);
run;
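The Latin square sums of squares given earlier can also be evaluated by hand for the milk data, which is a useful check on the PROC GLM output. The sketch below is a plain-Python illustration, not part of the course code; the helper names `factor_ss` and `latin_square_anova` are assumptions of this example. It relies on the design being balanced, so each main-effect sum of squares is (sum of squared level totals divided by the level size) minus the correction term.

```python
# (cow, period, diet, response) copied from the SAS data step above.
milk = [
    (1, 1, 1, 38), (1, 2, 2, 32), (1, 3, 3, 35), (1, 4, 4, 33),
    (2, 1, 2, 39), (2, 2, 3, 37), (2, 3, 4, 36), (2, 4, 1, 30),
    (3, 1, 3, 45), (3, 2, 4, 38), (3, 3, 1, 37), (3, 4, 2, 35),
    (4, 1, 4, 41), (4, 2, 1, 30), (4, 3, 2, 32), (4, 4, 3, 33),
]


def factor_ss(records, index):
    """Main-effect SS for the factor stored at position `index` (response is last)."""
    totals, counts = {}, {}
    for rec in records:
        level, y = rec[index], rec[-1]
        totals[level] = totals.get(level, 0) + y
        counts[level] = counts.get(level, 0) + 1
    grand = sum(rec[-1] for rec in records)
    return sum(totals[k] ** 2 / counts[k] for k in totals) - grand ** 2 / len(records)


def latin_square_anova(records):
    """ANOVA quantities for a single p x p Latin square (row, column, treatment, y)."""
    p = len({rec[2] for rec in records})
    grand = sum(rec[-1] for rec in records)
    ss_total = sum(rec[-1] ** 2 for rec in records) - grand ** 2 / len(records)
    ss_rows = factor_ss(records, 0)
    ss_cols = factor_ss(records, 1)
    ss_trt = factor_ss(records, 2)
    ss_err = ss_total - ss_rows - ss_cols - ss_trt
    df_trt, df_err = p - 1, (p - 1) * (p - 2)
    return {
        "ss_rows": ss_rows, "ss_columns": ss_cols, "ss_treatment": ss_trt,
        "ss_error": ss_err, "ss_total": ss_total, "df_error": df_err,
        "f_treatment": (ss_trt / df_trt) / (ss_err / df_err),
    }


for name, value in latin_square_anova(milk).items():
    print(f"{name:>13}: {value:.4f}")
```

For these data the sketch gives SS(diet) = 40.6875, SS(period) = 147.1875, SS(cow) = 54.6875 and SS(error) = 4.875 on 6 d.f., so F for diets is about 16.69 on (3, 6) d.f.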


REPEATED LATIN SQUARES

One disadvantage of a Latin Square Design is that smaller squares yield low d.f. for error (e.g. a 3 x 3
design has only 2 d.f. for error; a 5 x 5 design has only 12 d.f. for error). To overcome this problem,
one may replicate the Latin Square n times (n > 1).

Case 1 Latin Squares replicated with same blocks. (Use the same col & row in each replicate)

The classical model is:

$$Y_{ijkl} = \mu + \tau_i + \rho_j + \gamma_k + \psi_l + \epsilon_{ijkl},$$

with $i, j, k$ as before and $l = 1, 2, \ldots, n$, where $\psi_l$ denotes the effect of the $l$th square (the replication,
which is also a block effect).

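$$SS_{Treatment} = np\sum_{i=1}^{p} \bar{Y}_{i\cdot\cdot\cdot}^2 - N\bar{Y}_{\cdot\cdot\cdot\cdot}^2, \qquad \text{where } N = np^2$$

$$SS_{Rows} = np\sum_{j=1}^{p} \bar{Y}_{\cdot j\cdot\cdot}^2 - N\bar{Y}_{\cdot\cdot\cdot\cdot}^2$$

$$SS_{Columns} = np\sum_{k=1}^{p} \bar{Y}_{\cdot\cdot k\cdot}^2 - N\bar{Y}_{\cdot\cdot\cdot\cdot}^2$$

$$SS_{Replication} = p^2\sum_{l=1}^{n} \bar{Y}_{\cdot\cdot\cdot l}^2 - N\bar{Y}_{\cdot\cdot\cdot\cdot}^2 \quad (= SS_{Squares})$$

$$SS_{Total} = \sum_{i}\sum_{j}\sum_{k}\sum_{l} Y_{ijkl}^2 - N\bar{Y}_{\cdot\cdot\cdot\cdot}^2$$

$$SS_{Error} = SS_{Total} - SS_{Treatment} - SS_{Rows} - SS_{Columns} - SS_{Rep}$$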

ANOVA

Source        d.f.                     SS              MS              F

Treatments    p - 1                    SSTreatment     MSTreatment     Fo = MSTrt / MSError
Rows          p - 1                    SSRows          MSRows
Columns       p - 1                    SSColumns       MSColumns
Replicates    n - 1                    SSReplicates    MSReplicates
Error         (p - 1)[n(p + 1) - 3]    SSError         MSError
Total         np^2 - 1                 SSTotal


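Example 3 (Case 1): Same rows and same columns in additional squares

Square 1       Treatment       Response
   Column:     1  2  3         1  2  3
   Row 1       A  B  C         7  8  9
   Row 2       B  C  A         4  5  6
   Row 3       C  A  B         6  3  4

Square 2
   Column:     1  2  3         1  2  3
   Row 1       C  B  A         8  4  7
   Row 2       B  A  C         6  3  6
   Row 3       A  C  B         5  8  7

Square 3
   Column:     1  2  3         1  2  3
   Row 1       B  A  C         9  6  8
   Row 2       A  C  B         5  7  6
   Row 3       C  B  A         9  3  7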

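/* trt is coded 1 = A, 2 = B, 3 = C; values are separated by blanks for list input */
data case1;
input rep row col trt resp;
datalines;
1 1 1 1 7
1 1 2 2 8
1 1 3 3 9
1 2 1 2 4
1 2 2 3 5
1 2 3 1 6
1 3 1 3 6
1 3 2 1 3
1 3 3 2 4
2 1 1 3 8
2 1 2 2 4
2 1 3 1 7
2 2 1 2 6
2 2 2 1 3
2 2 3 3 6
2 3 1 1 5
2 3 2 3 8
2 3 3 2 7
3 1 1 2 9
3 1 2 1 6
3 1 3 3 8
3 2 1 1 5
3 2 2 3 7
3 2 3 2 6
3 3 1 3 9
3 3 2 2 3
3 3 3 1 7
;
proc glm data=case1;
class rep row col trt;
model resp=rep row col trt;
run;
quit;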
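The Case 1 sums of squares can be checked by hand for Example 3. The sketch below is a plain-Python illustration under the assumption that replicates, rows, columns, and treatments are all balanced (true for these three squares); the names `squares`, `records`, and `factor_ss` are hypothetical and chosen only for this example. The data are taken from the three squares tabulated in Example 3.

```python
# Each square: (treatment letters by row, responses by row), as tabulated in Example 3.
squares = [
    (["ABC", "BCA", "CAB"], [[7, 8, 9], [4, 5, 6], [6, 3, 4]]),
    (["CBA", "BAC", "ACB"], [[8, 4, 7], [6, 3, 6], [5, 8, 7]]),
    (["BAC", "ACB", "CBA"], [[9, 6, 8], [5, 7, 6], [9, 3, 7]]),
]

# One record per plot: (rep, row, col, treatment, response).
records = [
    (rep, row, col, letters[row][col], resp[row][col])
    for rep, (letters, resp) in enumerate(squares)
    for row in range(3)
    for col in range(3)
]


def factor_ss(recs, index):
    """Main-effect SS for the factor at position `index` (response is last)."""
    totals, counts = {}, {}
    for rec in recs:
        level, y = rec[index], rec[-1]
        totals[level] = totals.get(level, 0) + y
        counts[level] = counts.get(level, 0) + 1
    grand = sum(rec[-1] for rec in recs)
    return sum(totals[k] ** 2 / counts[k] for k in totals) - grand ** 2 / len(recs)


p, n = 3, len(squares)
grand = sum(rec[-1] for rec in records)
ss_total = sum(rec[-1] ** 2 for rec in records) - grand ** 2 / len(records)
ss = {name: factor_ss(records, idx)
      for idx, name in enumerate(["rep", "row", "col", "trt"])}
ss_error = ss_total - sum(ss.values())
df_error = (p - 1) * (n * (p + 1) - 3)          # Case 1 error d.f.
f_trt = (ss["trt"] / (p - 1)) / (ss_error / df_error)

print({k: round(v, 4) for k, v in ss.items()})
print("SSError =", round(ss_error, 4), "on", df_error, "d.f.; F(trt) =", round(f_trt, 3))
```

With these data the treatment sum of squares is 518/27 (about 19.19) and the error sum of squares is 942/27 (about 34.89) on 18 d.f.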

Case 2 Latin Squares replicated by introducing additional versions of one blocking factor but using the
same blocks for the other blocking factor. (Use different rows but the same columns in each
replicate.)
Without loss of generality, assume that the column blocks are repeated but the row blocks have additional versions.

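The classical model is:

$$Y_{ijkl} = \mu + \tau_i + \rho_{j(l)} + \gamma_k + \psi_l + \epsilon_{ijkl}, \qquad l = 1, 2, \ldots, n,$$

where $\rho_{j(l)}$ is the effect of row $j$ within replicate $l$ and $\psi_l$ is the effect of replicate $l$.

$$SS_{Total} = \sum_{i}\sum_{j}\sum_{k}\sum_{l} Y_{ijkl}^2 - N\bar{Y}_{\cdot\cdot\cdot\cdot}^2, \qquad \text{where } N = np^2$$

$$SS_{Treatment} = np\sum_{i=1}^{p} \bar{Y}_{i\cdot\cdot\cdot}^2 - N\bar{Y}_{\cdot\cdot\cdot\cdot}^2$$

$$SS_{Rows} = p\sum_{l=1}^{n}\sum_{j=1}^{p} \bar{Y}_{\cdot j\cdot l}^2 - p^2\sum_{l=1}^{n} \bar{Y}_{\cdot\cdot\cdot l}^2$$

$$SS_{Columns} = np\sum_{k=1}^{p} \bar{Y}_{\cdot\cdot k\cdot}^2 - N\bar{Y}_{\cdot\cdot\cdot\cdot}^2$$

$$SS_{Replicates} = p^2\sum_{l=1}^{n} \bar{Y}_{\cdot\cdot\cdot l}^2 - N\bar{Y}_{\cdot\cdot\cdot\cdot}^2$$

$$SS_{Error} = SS_{Total} - SS_{Trt} - SS_{Rows} - SS_{Col} - SS_{Rep}$$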

ANOVA
Source        d.f.                SS              MS              F

Treatments    p - 1               SSTreatment     MSTreatment     Fo = MSTrt / MSError
Rows          n(p - 1)            SSRows          MSRows
Columns       p - 1               SSColumns       MSColumns
Replicates    n - 1               SSReplicates    MSReplicates
Error         (p - 1)(np - 2)     SSError         MSError
Total         np^2 - 1            SSTotal

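Example (Case 2): New (different) rows and same columns

Square 1       Treatment       Response
   Column:     1  2  3         1  2  3
   Row 1       A  B  C         7  8  9
   Row 2       B  C  A         4  5  6
   Row 3       C  A  B         6  3  4

Square 2
   Column:     1  2  3         1  2  3
   Row 4       C  B  A         8  4  7
   Row 5       B  A  C         6  3  6
   Row 6       A  C  B         5  8  7

Square 3
   Column:     1  2  3         1  2  3
   Row 7       B  A  C         9  6  8
   Row 8       A  C  B         5  7  6
   Row 9       C  B  A         9  3  7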

Case 3 Latin Squares replicated by introducing additional versions of both blocking variables.
(use different row & col in each replicate)

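The model is:

$$Y_{ijkl} = \mu + \tau_i + \rho_{j(l)} + \gamma_{k(l)} + \psi_l + \epsilon_{ijkl}$$

where both the row effect $\rho_{j(l)}$ and the column effect $\gamma_{k(l)}$ are nested within replicate $l$.

SSTotal computed as before

SSTreatment computed as before

SSRows computed as before (as in Case 2)

$$SS_{Columns} = p\sum_{l=1}^{n}\sum_{k=1}^{p} \bar{Y}_{\cdot\cdot kl}^2 - p^2\sum_{l=1}^{n} \bar{Y}_{\cdot\cdot\cdot l}^2$$

$$SS_{Replicates} = p^2\sum_{l=1}^{n} \bar{Y}_{\cdot\cdot\cdot l}^2 - N\bar{Y}_{\cdot\cdot\cdot\cdot}^2, \qquad \text{where } N = np^2$$

SSError obtained by subtraction.

The ANOVA table is as in Case 2, except that SSColumns is computed differently and has n(p - 1) d.f., and SSError
has (p - 1)[n(p - 1) - 1] d.f.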
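Because the error degrees of freedom differ across the three replication schemes, it helps to derive them by subtraction and compare with the closed forms. The sketch below is a plain-Python illustration; the function name `repeated_latin_square_df` and the integer `case` argument are assumptions of this example. Here `p` is the size of each square and `n` is the number of replicate squares.

```python
def repeated_latin_square_df(p, n, case):
    """Degrees of freedom for n replicates of a p x p Latin square.

    case 1: same rows and same columns in every replicate
    case 2: new rows in each replicate, same columns
    case 3: new rows and new columns in each replicate
    """
    if case not in (1, 2, 3):
        raise ValueError("case must be 1, 2, or 3")
    total = n * p * p - 1
    df = {
        "treatments": p - 1,
        "rows": p - 1 if case == 1 else n * (p - 1),     # nested in reps for cases 2, 3
        "columns": n * (p - 1) if case == 3 else p - 1,  # nested in reps for case 3
        "replicates": n - 1,
    }
    df["error"] = total - sum(df.values())               # error d.f. by subtraction
    df["total"] = total
    return df


for case in (1, 2, 3):
    print(case, repeated_latin_square_df(p=3, n=3, case=case))
```

For three replicates of a 3 x 3 square, subtraction gives 18, 14, and 10 error d.f. for Cases 1, 2, and 3, which agree with (p - 1)[n(p + 1) - 3], (p - 1)(np - 2), and (p - 1)[n(p - 1) - 1] respectively.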

SAS can be used to analyze repeated Latin Squares as follows:

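Example 4 (Case 3): different rows and new columns

Square 1       Treatment       Response
   Column:     1  2  3         1  2  3
   Row 1       A  B  C         7  8  9
   Row 2       B  C  A         4  5  6
   Row 3       C  A  B         6  3  4

Square 2
   Column:     4  5  6         4  5  6
   Row 4       C  B  A         8  4  7
   Row 5       B  A  C         6  3  6
   Row 6       A  C  B         5  8  7

Square 3
   Column:     7  8  9         7  8  9
   Row 7       B  A  C         9  6  8
   Row 8       A  C  B         5  7  6
   Row 9       C  B  A         9  3  7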

Case 1 (rows and columns crossed w/reps)

proc glm data=yourdata;


class rep row col treatment;
model y = rep row col treatment;
run;

Case 2 (row nested, columns crossed w/reps)

proc glm data=yourdata;


class rep row col treatment;
model y = rep row(rep) col treatment;
run;

Case 3 (rows and columns nested w/reps)

proc glm data=yourdata;


class rep row col treatment;
model y = rep row(rep) col(rep) treatment;
run;
Note: ROW (REP) gives SS due to rows within replication. (Similarly for COLUMN (REP)).
GRAECO-LATIN SQUARES

Def. Let one p x p Latin square consist of Latin letters and another p x p Latin square consist of Greek
letters. Suppose they have the property that, when superimposed, each Latin letter coincides
exactly once with each Greek letter. Then the two squares are said to be orthogonal.

A collection of n p x p Latin squares is said to be a mutually orthogonal set of Latin squares
if each letter in one square coincides with each combination of the letters in the other squares
exactly once.

Def. A pair of orthogonal p x p squares, one of Latin letters and one of Greek letters, forms a Graeco-Latin square.

Using a Graeco-Latin square, one may block in a 3rd direction or analyze a 2nd treatment.

An example of a Graeco-Latin Square

A B C D E
B C D E A
C D E A B
D E A B C
E A B C D

Note: When more than two orthogonal Latin squares are superimposed, we obtain a Hyper-Graeco-
Latin square.

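Analysis of Graeco-Latin Squares

Model is:

$$Y_{ijkl} = \mu + \tau_i + \omega_j + \rho_k + \gamma_l + \epsilon_{ijkl},$$

where $\tau_i$ is the Latin letter (treatment) effect, $\omega_j$ the Greek letter effect, $\rho_k$ the row effect, and $\gamma_l$ the column effect.

$$SS_{Total} = \sum_{i}\sum_{j}\sum_{k}\sum_{l} Y_{ijkl}^2 - N\bar{Y}_{\cdot\cdot\cdot\cdot}^2, \qquad \text{where } N = p^2$$

$$SS_{Trt} = SS_{Latin} = p\sum_{i=1}^{p} \bar{Y}_{i\cdot\cdot\cdot}^2 - N\bar{Y}_{\cdot\cdot\cdot\cdot}^2 \qquad (p - 1 \text{ d.f.})$$

$$SS_{Greek} = p\sum_{j=1}^{p} \bar{Y}_{\cdot j\cdot\cdot}^2 - N\bar{Y}_{\cdot\cdot\cdot\cdot}^2 \qquad (p - 1 \text{ d.f.})$$

$$SS_{Rows} = p\sum_{k=1}^{p} \bar{Y}_{\cdot\cdot k\cdot}^2 - N\bar{Y}_{\cdot\cdot\cdot\cdot}^2 \qquad (p - 1 \text{ d.f.})$$

$$SS_{Columns} = p\sum_{l=1}^{p} \bar{Y}_{\cdot\cdot\cdot l}^2 - N\bar{Y}_{\cdot\cdot\cdot\cdot}^2 \qquad (p - 1 \text{ d.f.})$$

SSError is obtained by subtraction and has (p - 3)(p - 1) d.f.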


SAS can be utilized as follows:

proc glm data=yourdata;


class greek row col tx;
model y = row col greek tx;
run;

Example 5: Graeco-Latin Square


An experiment is conducted to compare four gasoline additives by testing them on four cars with
four drivers over four days. Only four runs can be conducted on each day. The response is the amount
of automobile emission.
Treatment factor: gasoline additive, denoted by A, B, C, and D
Block factor 1: driver, denoted by 1,2,3,4
Block factor 2: day, denoted by 1,2,3,4
Block factor 3: car, denoted by α, β, γ, δ

Graeco-Latin Square Design Matrix:

SAS codes:

data additives;
input row col trt greek resp @@;
datalines;
1 1 1 1 32 1 2 2 2 25
1 3 3 3 31 1 4 4 4 27
2 1 2 4 24 2 2 1 3 36
2 3 4 2 20 2 4 3 1 25
3 1 3 2 28 3 2 4 1 30
3 3 1 4 23 3 4 2 3 31
4 1 4 3 34 4 2 3 4 35
4 3 2 1 29 4 4 1 2 33
;
proc glm data=additives;
class row col trt greek;
model resp=row col trt greek;
run;
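Before analyzing such data it is worth confirming that the layout really is a Graeco-Latin square, i.e. that every pair of the four factors (row, column, Latin letter, Greek letter) meets exactly once. The sketch below is a plain-Python illustration; the function name `is_graeco_latin` is an assumption of this example, and the design cells are copied from the first four columns of the SAS data step above.

```python
# (row, col, trt, greek) for each run, copied from the SAS data step above.
design = [
    (1, 1, 1, 1), (1, 2, 2, 2), (1, 3, 3, 3), (1, 4, 4, 4),
    (2, 1, 2, 4), (2, 2, 1, 3), (2, 3, 4, 2), (2, 4, 3, 1),
    (3, 1, 3, 2), (3, 2, 4, 1), (3, 3, 1, 4), (3, 4, 2, 3),
    (4, 1, 4, 3), (4, 2, 3, 4), (4, 3, 2, 1), (4, 4, 1, 2),
]


def is_graeco_latin(cells, p):
    """True if each of the four factors has p levels and every pair of
    factors takes each of its p * p level combinations exactly once."""
    if len(cells) != p * p:
        return False
    if any(len({cell[f] for cell in cells}) != p for f in range(4)):
        return False
    factor_pairs = [(0, 1), (0, 2), (0, 3), (1, 2), (1, 3), (2, 3)]
    return all(
        len({(cell[f], cell[g]) for cell in cells}) == p * p
        for f, g in factor_pairs
    )


print(is_graeco_latin(design, 4))
```

The (trt, greek) pair is the orthogonality condition between the Latin and Greek squares; the pairs involving row or column verify that each square is itself Latin.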

Multiple comparisons can be carried out using similar methods.

SAS outputs:

Practice:

1. Run the repeated Latin square analyses for Cases 2 and 3. Interpret your results.
2. Interpret the results for Examples 1-4.
3. Draw conclusions for Example 5. Include the analysis of multiple comparisons.

Assignments:

1. Lew (2007) presents the data from an experiment to determine whether cultured cells respond to two
drugs. The experiment was conducted using a stable cell line plated onto Petri dishes, with each
experimental run involving assays of responses in three Petri dishes: one treated with drug 1, one
treated with drug 2, and one untreated serving as a control. The data are shown in the table below:

                Control   Drug 1   Drug 2
Experiment 1      1147     1169     1009
Experiment 2      1273     1323     1260
Experiment 3      1216     1276     1143
Experiment 4      1046     1240     1099
Experiment 5      1108     1432     1385
Experiment 6      1265     1562     1164

(a) Analyze the data as if it came from a completely randomized design (CRD). Write down the classical
effect model for CRD and the five steps for hypothesis testing. Is there a significant difference between
the treatment groups?
(b) Analyze the data as a complete randomized block design (CRBD). What is the treatment? What is the
blocking factor? Write down the classical effect model for CRBD and the five steps for hypothesis
testing. Is there a significant difference between the treatment groups?
(c) Is there any difference in the results you obtain in (a) and (b)? If so, explain what may be the cause of
the difference in the results and which method would you recommend?

2. Le Riche and Csima (1964) evaluated four hypnotic drugs and a placebo to determine their effect on
quality of sleep in elderly patients. The treatment levels were labeled (A=Placebo, B=Ethchlorvynol,
C=Glutethimide, D=Chloral hydrate and E=Secobarbital sodium). Elderly patients were given one of the
capsules for five nights in succession and their quality of sleep was rated by a trained nurse on a four-
point scale (0=poor to 3=excellent) each night. An average score was calculated for each patient over
the five nights in a week. Each patient received all five treatments in successive weeks. The design and
the response (mean quality of sleep rating ) are shown in the table below:

                                  Week
Patient      1          2          3          4          5
   1       B 2.92     E 2.43     A 2.19     C 2.71     D 2.71
   2       D 2.86     A 1.64     E 3.02     B 3.03     C 3.03
   3       E 1.97     B 2.50     C 2.47     D 2.65     A 1.89
   4       A 1.99     C 2.39     D 2.37     E 2.33     B 2.71
   5       C 2.64     D 2.31     B 2.44     A 1.89     E 2.78

(a) What are the nuisance factors in this problem? What is the appropriate model for this data?
(b) Write down the classical effect model for this design and determine if there are any significant
differences among the treatments.

(c) Use an appropriate method to determine if there is a significant difference between the placebo and
other four drugs?
(d) Use an appropriate method to determine which drug/drugs has/have the highest rating?
(e) Use residual plots to check the assumption for the model you fit.

3. A manufacturing firm investigated the breaking strengths of components made from raw materials
purchased from 4 suppliers (A, B, C, D). Data were collected from 2 replicates of a 4 x 4 Latin square
design. The blocking factors were days and operators.

(a) The same four operators were used in both replicates. Each replicate was also run on the same four
days, with replicated values taken during the mornings and afternoons of these four days. Write down
the statistical model for this data. Is there any significant difference among the different suppliers?

Replicate 1
                            Days
Operator      1          2          3          4
   1        B  810     C 1080     A  700     D  910
   2        C 1100     D  880     B  780     A  600
   3        D  840     A  540     C 1055     B  830
   4        A  650     B  740     D 1025     C  900

Replicate 2
                            Days
Operator      1          2          3          4
   1        D  840     C 1050     A  775     B  805
   2        A  670     D  930     B  720     C 1035
   3        C  980     B  700     D  810     A  610
   4        B  860     A  730     C  970     D  900

(b) Eight operators were used with four operators randomly assigned to each replicate. The two replicates
were run over 8 days with the first 4 days assigned to replicate 1 and the second four days assigned to
replicate 2. Write down the statistical model for this data. Is there any significant difference among the
different suppliers?

Replicate 1
                            Days
Operator      1          2          3          4
   1        B  810     C 1080     A  700     D  910
   2        C 1100     D  880     B  780     A  600
   3        D  840     A  540     C 1055     B  830
   4        A  650     B  740     D 1025     C  900

Replicate 2
                            Days
Operator      1          2          3          4
   5        D  840     C 1050     A  775     B  805
   6        A  670     D  930     B  720     C 1035
   7        C  980     B  700     D  810     A  610
   8        B  860     A  730     C  970     D  900
