STAT3010: Lecture 8
TWO-WAY ANALYSIS OF VARIANCE
(Your textbook doesn't cover two-way ANOVA, so you'll have to
use my lecture notes.)
In Chapter 9, we used analysis of variance to decide whether
three or more populations have the same mean. Those analyses
were called one-way ANOVAs (or single-factor ANOVAs) because
the data were categorized into groups according to a single
factor (or treatment).
Let's look at this table:
Weights (kg) of Poplar Trees in Year 1

                        Treatment
                        None    Fertilizer    Irrigation    Fertilizer and Irrigation
Site 1 (rich, moist)    0.15    1.34          0.23          2.03
                        0.02    0.14          0.04          0.27
                        0.16    0.02          0.34          0.92
                        0.37    0.08          0.16          1.07
                        0.22    0.08          0.05          2.38

Site 2 (sandy, dry)     0.60    1.16          0.65          0.22
                        1.11    0.93          0.08          2.13
                        0.07    0.30          0.62          2.33
                        0.07    0.59          0.01          1.74
                        0.44    0.17          0.03          0.12

This table lists weights (in kilograms) of poplar trees. The listed
weights are partitioned into 8 categories according to two
variables: (1) the row variable of site and (2) the column
variable of treatment.
Two-way analysis of variance involves TWO factors, such as site
and treatment in the table above. The eight subcategories are
called cells, so our table has eight cells containing five values
each.
So, why don't we just use two separate one-way ANOVAs?
Definition: There is an interaction between two factors if the
effect of one of the factors changes for different categories of
the other factor.
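One common way to write this definition down, offered here only as a sketch, is in terms of the population cell means. The symbols below ($\mu_{ij}$, $\alpha_i$, $\beta_j$, $\gamma_{ij}$) are assumed notation that does not appear elsewhere in these notes.

```latex
% Assumed notation: \mu_{ij} is the population mean for row category i
% and column category j; \gamma_{ij} is the interaction term.
\mu_{ij} = \mu + \alpha_i + \beta_j + \gamma_{ij}

% "No interaction" means every \gamma_{ij} = 0. Then, for any two row
% categories i and i', the difference in means is the same in every column j:
\mu_{ij} - \mu_{i'j} = \alpha_i - \alpha_{i'} \qquad \text{for all } j
```

If some $\gamma_{ij}$ is not zero, the difference between two rows depends on which column we look at, which is exactly the situation the definition describes.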
Example of interaction between two factors:
What about a more serious matter?
Let's explore the poplar tree table above by calculating the
mean for each cell.
Weights (kg) of Poplar Trees in Year 1

                        Treatment
                        None    Fertilizer    Irrigation    Fertilizer and Irrigation
Site 1 (rich, moist)    0.15    1.34          0.23          2.03
                        0.02    0.14          0.04          0.27
                        0.16    0.02          0.34          0.92
                        0.37    0.08          0.16          1.07
                        0.22    0.08          0.05          2.38

Site 2 (sandy, dry)     0.60    1.16          0.65          0.22
                        1.11    0.93          0.08          2.13
                        0.07    0.30          0.62          2.33
                        0.07    0.59          0.01          1.74
                        0.44    0.17          0.03          0.12

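The cell means can also be computed with a few lines of code. The following is a minimal sketch in Python using only the standard library; the dictionary layout and the names `weights` and `cell_means` are illustrative choices, not part of the course material.

```python
from statistics import mean

# Poplar weights (kg) from the table, keyed by (site, treatment).
weights = {
    ("Site 1", "None"): [0.15, 0.02, 0.16, 0.37, 0.22],
    ("Site 1", "Fertilizer"): [1.34, 0.14, 0.02, 0.08, 0.08],
    ("Site 1", "Irrigation"): [0.23, 0.04, 0.34, 0.16, 0.05],
    ("Site 1", "Fertilizer and Irrigation"): [2.03, 0.27, 0.92, 1.07, 2.38],
    ("Site 2", "None"): [0.60, 1.11, 0.07, 0.07, 0.44],
    ("Site 2", "Fertilizer"): [1.16, 0.93, 0.30, 0.59, 0.17],
    ("Site 2", "Irrigation"): [0.65, 0.08, 0.62, 0.01, 0.03],
    ("Site 2", "Fertilizer and Irrigation"): [0.22, 2.13, 2.33, 1.74, 0.12],
}


def cell_means(data):
    """Return the mean of each cell, rounded to three decimals."""
    return {cell: round(mean(values), 3) for cell, values in data.items()}


means = cell_means(weights)
for (site, treatment), m in means.items():
    print(f"{site:7s} {treatment:26s} {m:.3f}")
```

For example, the Site 1 / None cell sums to 0.92 over five trees, so its mean is 0.184.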
Means

                        None    Fertilizer    Irrigation    Fertilizer and Irrigation
Site 1 (rich, moist)
Site 2 (sandy, dry)

Let's plot these means:
What does this plot show?
Note: If your mean line segments are approximately parallel,
we have evidence that there is not an interaction between the
row and column variables. If the mean line segments are far
from parallel, we would have evidence of an interaction
between the row and column variables.
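A rough numerical version of the "parallel lines" check is to look at the vertical gap between the two site lines at each treatment: perfectly parallel segments would give the same gap everywhere. The sketch below assumes the eight cell means have already been calculated from the poplar table (they are typed in to three decimals), and the variable names are illustrative.

```python
# Cell means calculated from the poplar table (three decimals).
treatments = ["None", "Fertilizer", "Irrigation", "Fertilizer and Irrigation"]
site1_means = [0.184, 0.332, 0.164, 1.334]
site2_means = [0.458, 0.630, 0.278, 1.308]

# Vertical gap between the Site 2 line and the Site 1 line at each treatment.
gaps = [round(s2 - s1, 3) for s1, s2 in zip(site1_means, site2_means)]

for treatment, gap in zip(treatments, gaps):
    print(f"{treatment:26s} Site 2 - Site 1 = {gap:+.3f}")
```

The gaps are not identical, but whether they differ by more than chance would explain has to be judged against the variation within the cells, which is what the formal interaction test does.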
Now, in using a two-way ANOVA for these data, we must
consider three possible effects on the weights of the poplar trees:
(1) the effects of an interaction between site and
treatment
(2) the effects of site
(3) the effects of treatment
Before carrying out a two-way ANOVA, we must first learn its
requirements:
1. For each cell, the sample values come from a population
with a distribution that is normal. (We may nearly always
assume normality, but if normality is not given you must
state that you are assuming it.)
2. For each cell, the sample values come from populations
having the same variance. (The method works best with
equal variances, but it still works if the variances are
not equal.)
3. The samples are simple random samples.
4. The samples are independent of each other.
5. The sample values are categorized two ways. (Hence
two-way ANOVA.)
6. All of the cells have the same number of sample values.
Definitions and Notation
In general, a two-factor analysis of variance has several
observations for each combination of levels of the two factors,
so any particular treatment combination (cell) has several
observations associated with it. A dot appearing in place of a
subscript indicates that the missing subscript has been summed
out. As usual, a bar over a symbol means that the appropriate
mean has been calculated.
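i = index of the level of the first factor (factor A), i = 1, ..., I
j = index of the level of the second factor (factor B), j = 1, ..., J
I = number of levels of factor A
J = number of levels of factor B
K = number of observations in each cell
$x_{ijk}$ = the kth observation in cell (i, j)
$\bar{x}_{ij\cdot}$ = mean of the K observations in cell (i, j)
$\bar{x}_{i\cdot\cdot}$ = mean of all observations at level i of factor A
$\bar{x}_{\cdot j\cdot}$ = mean of all observations at level j of factor B
$\bar{x}_{\cdot\cdot\cdot}$ = mean of all observations (grand mean)
N = IJK = total number of observations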
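Written out as formulas, the dot and bar conventions look like this. The block is a sketch of the standard definitions; the numerical line at the end assumes, purely for illustration, that Site 1 is labelled i = 1 and the treatment None is labelled j = 1 in the poplar data.

```latex
% Totals: a dot replaces each subscript that has been summed out.
x_{ij\cdot} = \sum_{k=1}^{K} x_{ijk}, \qquad
x_{i\cdot\cdot} = \sum_{j=1}^{J}\sum_{k=1}^{K} x_{ijk}, \qquad
x_{\cdot j\cdot} = \sum_{i=1}^{I}\sum_{k=1}^{K} x_{ijk}, \qquad
x_{\cdot\cdot\cdot} = \sum_{i=1}^{I}\sum_{j=1}^{J}\sum_{k=1}^{K} x_{ijk}

% Means: divide each total by the number of observations it contains.
\bar{x}_{ij\cdot} = \frac{x_{ij\cdot}}{K}, \qquad
\bar{x}_{i\cdot\cdot} = \frac{x_{i\cdot\cdot}}{JK}, \qquad
\bar{x}_{\cdot j\cdot} = \frac{x_{\cdot j\cdot}}{IK}, \qquad
\bar{x}_{\cdot\cdot\cdot} = \frac{x_{\cdot\cdot\cdot}}{IJK}

% Illustration (assumed labelling: Site 1 is i = 1, None is j = 1, K = 5):
x_{11\cdot} = 0.15 + 0.02 + 0.16 + 0.37 + 0.22 = 0.92, \qquad
\bar{x}_{11\cdot} = \frac{0.92}{5} = 0.184
```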
Note: Two-factor ANOVA with $K_{ij} = 1$ is the special case in
which we analyze data from a two-factor experiment with only
one observation for each of the IJ combinations of levels of
the two factors. For example:
Consider the following data from an experiment to compare
three different brands of pens and four different wash
treatments with respect to their ability to remove marks on a
particular type of fabric. The response variable is a quantitative
indicator of overall specimen colour change; the lower this
value, the more marks were removed.

                    Washing Treatment
                    1       2       3       4
Brand of Pen   1    0.97    0.48    0.48    0.46
               2    0.77    0.14    0.22    0.25
               3    0.67    0.39    0.57    0.19

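The notes do not work through this one-observation-per-cell case, so the following Python sketch rests on an assumption: with K = 1 there is no variation within a cell, so the usual approach is to assume no interaction and use the leftover sum of squares, SST - SSA - SSB, as the error term with (I-1)(J-1) degrees of freedom. The function name `additive_anova` is hypothetical.

```python
# Colour-change scores: rows are pen brands (I = 3),
# columns are washing treatments (J = 4), one observation per cell.
scores = [
    [0.97, 0.48, 0.48, 0.46],
    [0.77, 0.14, 0.22, 0.25],
    [0.67, 0.39, 0.57, 0.19],
]


def additive_anova(table):
    """Sums of squares and F ratios for a two-factor layout with K = 1.

    Assumes no interaction, so the residual SST - SSA - SSB is the error.
    """
    I, J = len(table), len(table[0])
    grand_total = sum(sum(row) for row in table)
    correction = grand_total ** 2 / (I * J)

    sst = sum(x ** 2 for row in table for x in row) - correction
    ssa = sum(sum(row) ** 2 for row in table) / J - correction        # brands
    ssb = sum(sum(col) ** 2 for col in zip(*table)) / I - correction  # washes
    sse = sst - ssa - ssb                                             # residual

    df_a, df_b, df_e = I - 1, J - 1, (I - 1) * (J - 1)
    mse = sse / df_e
    return {
        "SSA": ssa, "SSB": ssb, "SSE": sse, "SST": sst,
        "F_A": (ssa / df_a) / mse,
        "F_B": (ssb / df_b) / mse,
    }


result = additive_anova(scores)
for name, value in result.items():
    print(f"{name}: {value:.4f}")
```

Each F ratio would then be compared with an F critical value on the matching degrees of freedom, (2, 6) for brands and (3, 6) for washing treatments.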
Procedure of the Two-way ANOVA
Step 1:
Interaction Effect: In two-way ANOVA, begin by testing the null
hypothesis that there is no interaction between the two factors:
Test Statistic:
Decision:
Conclusion.
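Note: If we reject the null hypothesis and conclude that there
is an interaction between the row and column factors, then we
STOP NOW, because the effect of one factor then depends on the
category of the other and the two factors should not be judged
separately. If we do not reject the null hypothesis, so that
there is no evidence of an interaction between the row and
column factors, we go on to Step 2.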
Step 2:
Row/Column Effects (main effects): [Step 2 should be completed
only if Step 1 found no evidence of an interaction between the
row and column factors.] We wish to test the main effects:
Test Statistic for row factor:
Test Statistic for column factor:
Decision:
Conclusion.
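Let's look at the sum of squares and degrees of freedom
formulas. Every triple sum runs over i = 1, ..., I, j = 1, ..., J
and k = 1, ..., K. A symbol with dots but no bar is a total, and
a symbol with a bar is the corresponding mean.

$$\mathrm{SST} = \sum_{i}\sum_{j}\sum_{k} (X_{ijk} - \bar{X}_{\cdot\cdot\cdot})^2 = \sum_{i}\sum_{j}\sum_{k} X_{ijk}^2 - \frac{1}{IJK} X_{\cdot\cdot\cdot}^2, \qquad \mathrm{df} = IJK - 1$$

$$\mathrm{SSE} = \sum_{i}\sum_{j}\sum_{k} (X_{ijk} - \bar{X}_{ij\cdot})^2 = \sum_{i}\sum_{j}\sum_{k} X_{ijk}^2 - \frac{1}{K} \sum_{i}\sum_{j} X_{ij\cdot}^2, \qquad \mathrm{df} = IJ(K - 1)$$

$$\mathrm{SSA} = \sum_{i}\sum_{j}\sum_{k} (\bar{X}_{i\cdot\cdot} - \bar{X}_{\cdot\cdot\cdot})^2 = \frac{1}{JK} \sum_{i} X_{i\cdot\cdot}^2 - \frac{1}{IJK} X_{\cdot\cdot\cdot}^2, \qquad \mathrm{df} = I - 1$$

$$\mathrm{SSB} = \sum_{i}\sum_{j}\sum_{k} (\bar{X}_{\cdot j\cdot} - \bar{X}_{\cdot\cdot\cdot})^2 = \frac{1}{IK} \sum_{j} X_{\cdot j\cdot}^2 - \frac{1}{IJK} X_{\cdot\cdot\cdot}^2, \qquad \mathrm{df} = J - 1$$

$$\mathrm{SSAB} = \sum_{i}\sum_{j}\sum_{k} (\bar{X}_{ij\cdot} - \bar{X}_{i\cdot\cdot} - \bar{X}_{\cdot j\cdot} + \bar{X}_{\cdot\cdot\cdot})^2, \qquad \mathrm{df} = (I - 1)(J - 1)$$
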
The fundamental identity
SST = SSA + SSB + SSAB + SSE
implies that the interaction sum of squares SSAB can be obtained
by subtraction: SSAB = SST - SSA - SSB - SSE.
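The test statistics in Steps 1 and 2 are left blank in these notes. As a sketch of the standard fixed-effects form (an assumption here, not something the notes state), each sum of squares is divided by its degrees of freedom to give a mean square, and each effect's mean square is compared with the error mean square:

```latex
% Mean squares: sum of squares divided by its degrees of freedom.
\mathrm{MSA} = \frac{\mathrm{SSA}}{I-1}, \qquad
\mathrm{MSB} = \frac{\mathrm{SSB}}{J-1}, \qquad
\mathrm{MSAB} = \frac{\mathrm{SSAB}}{(I-1)(J-1)}, \qquad
\mathrm{MSE} = \frac{\mathrm{SSE}}{IJ(K-1)}

% Step 1 (interaction): numerator df (I-1)(J-1), denominator df IJ(K-1).
F_{AB} = \frac{\mathrm{MSAB}}{\mathrm{MSE}}

% Step 2 (main effects): numerator df I-1 and J-1, denominator df IJ(K-1).
F_{A} = \frac{\mathrm{MSA}}{\mathrm{MSE}}, \qquad
F_{B} = \frac{\mathrm{MSB}}{\mathrm{MSE}}
```

In each case a large value of F is evidence against the corresponding null hypothesis.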

                        Treatment
                        None    Fertilizer    Irrigation    Fertilizer and Irrigation
Site 1 (rich, moist)    0.15    1.34          0.23          2.03
                        0.02    0.14          0.04          0.27
                        0.16    0.02          0.34          0.92
                        0.37    0.08          0.16          1.07
                        0.22    0.08          0.05          2.38

Site 2 (sandy, dry)     0.60    1.16          0.65          0.22
                        1.11    0.93          0.08          2.13
                        0.07    0.30          0.62          2.33
                        0.07    0.59          0.01          1.74
                        0.44    0.17          0.03          0.12

ANOVA TABLE:

Source of Variation    df    Sum of Squares    Mean Square
Treatments
Site
Interaction
Error
Total
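The entries of this table can be checked with a short program. The sketch below, in Python with only the standard library, applies the computing forms of the sum of squares formulas to the poplar data. It assumes a balanced design (every cell has the same K), takes site as factor A and treatment as factor B (a labelling choice), and uses the standard F ratios of each mean square to the error mean square. The function name `two_way_anova` is illustrative.

```python
# cells[i][j] holds the K weights for site i and treatment j
# (treatments in the order None, Fertilizer, Irrigation, Fertilizer and Irrigation).
cells = [
    [[0.15, 0.02, 0.16, 0.37, 0.22], [1.34, 0.14, 0.02, 0.08, 0.08],
     [0.23, 0.04, 0.34, 0.16, 0.05], [2.03, 0.27, 0.92, 1.07, 2.38]],
    [[0.60, 1.11, 0.07, 0.07, 0.44], [1.16, 0.93, 0.30, 0.59, 0.17],
     [0.65, 0.08, 0.62, 0.01, 0.03], [0.22, 2.13, 2.33, 1.74, 0.12]],
]


def two_way_anova(data):
    """Return {source: (df, SS)} for a balanced two-way layout."""
    I, J, K = len(data), len(data[0]), len(data[0][0])
    n = I * J * K
    values = [x for row in data for cell in row for x in cell]
    correction = sum(values) ** 2 / n                 # X...^2 / IJK
    sum_sq = sum(x ** 2 for x in values)              # sum of X_ijk^2

    cell_totals = [[sum(cell) for cell in row] for row in data]
    row_totals = [sum(row) for row in cell_totals]          # X_i..
    col_totals = [sum(col) for col in zip(*cell_totals)]    # X_.j.

    sst = sum_sq - correction
    sse = sum_sq - sum(t ** 2 for row in cell_totals for t in row) / K
    ssa = sum(t ** 2 for t in row_totals) / (J * K) - correction
    ssb = sum(t ** 2 for t in col_totals) / (I * K) - correction
    ssab = sst - ssa - ssb - sse                      # by subtraction
    return {
        "Treatments": (J - 1, ssb),
        "Site": (I - 1, ssa),
        "Interaction": ((I - 1) * (J - 1), ssab),
        "Error": (I * J * (K - 1), sse),
        "Total": (n - 1, sst),
    }


table = two_way_anova(cells)
mse = table["Error"][1] / table["Error"][0]
f_stats = {}
for source, (df, ss) in table.items():
    line = f"{source:12s} df={df:2d}  SS={ss:8.4f}"
    if source != "Total":
        line += f"  MS={ss / df:7.4f}"
    if source not in ("Error", "Total"):
        f_stats[source] = (ss / df) / mse
        line += f"  F={f_stats[source]:5.2f}"
    print(line)
```

With these data the F ratios come out at roughly 0.17 for interaction, 0.81 for site and 7.50 for treatments. Each one still has to be compared with an F critical value (or converted to a P-value) on the matching degrees of freedom before a decision is made.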