STAT3010: Lecture 6
Multiple Comparison Procedures, Cont'd (Section 9.5, Page 425)
Recall: Last class we looked at the Scheffe Multiple
Comparison Procedure. These calculations take some time
(especially when you have many more than 3 treatments!) and are
generally performed using statistical software on a computer,
so let's try it with SAS.
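SAS CODE:
options ps=62 ls=80;
/* One observation per line: drug label, then time */
data scheffe;
input Drug $ time;
cards;
1 30
1 35
1 40
1 25
1 35
2 25
2 20
2 30
2 25
2 30
3 15
3 20
3 25
3 20
3 20
;
run;
/* One-way ANOVA with Scheffe pairwise comparisons of the drug means */
proc anova;
class drug;
model time=drug;
means drug/scheffe;
run;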
SAS OUTPUT:

The SAS System

The ANOVA Procedure

Class Level Information

Class    Levels    Values
Drug     3         1 2 3

Number of Observations Read    15
Number of Observations Used    15

The ANOVA Procedure

Dependent Variable: time

Source             DF    Sum of Squares    Mean Square    F Value    Pr > F
Model               2       423.3333333    211.6666667      10.16    0.0026
Error              12       250.0000000     20.8333333
Corrected Total    14       673.3333333

R-Square    Coeff Var    Root MSE    time Mean
0.628713    17.33299     4.564355    26.33333

Source    DF       Anova SS    Mean Square    F Value    Pr > F
Drug       2    423.3333333    211.6666667      10.16    0.0026

The ANOVA Procedure

Scheffe's Test for time

NOTE: This test controls the Type I experimentwise error rate.

Alpha                              0.05
Error Degrees of Freedom           12
Error Mean Square                  20.83333
Critical Value of F                3.88529
Minimum Significant Difference     8.047

Means with the same letter are not significantly different.

Scheffe Grouping    Mean      N    Drug
     A              33.000    5    1
     A
B    A              26.000    5    2
B
B                   20.000    5    3
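
To see where the Minimum Significant Difference in the Scheffe output comes from, here is a minimal Python sketch. It assumes the usual equal-sample-size form of the Scheffe pairwise difference, sqrt((k-1) * F * MSE * 2/n), and takes the critical F value, MSE and treatment means straight from the SAS output rather than computing them. The variable names are our own.

```python
import math

# Values read from the SAS output (not recomputed here)
k = 3             # number of treatments (drugs)
n = 5             # observations per treatment
mse = 20.83333    # Error Mean Square
f_crit = 3.88529  # Critical Value of F (alpha = 0.05, df = 2 and 12)

# Assumed formula: Scheffe minimum significant difference for a
# pairwise comparison when every treatment has n observations
msd = math.sqrt((k - 1) * f_crit * mse * (2 / n))

means = {"1": 33.0, "2": 26.0, "3": 20.0}
pairs = [("1", "3"), ("1", "2"), ("2", "3")]

# A pair differs significantly when its mean difference exceeds the MSD
significant = {
    pair: abs(means[pair[0]] - means[pair[1]]) > msd for pair in pairs
}

print(f"MSD = {msd:.3f}")
for (a, b), sig in significant.items():
    diff = abs(means[a] - means[b])
    print(f"Drug {a} vs Drug {b}: difference = {diff:.1f}, significant = {sig}")
```

The computed value agrees with the 8.047 reported by SAS, and only the Drug 1 versus Drug 3 difference exceeds it, which matches the letter grouping.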
The Tukey Procedure
The Tukey procedure is also called the Studentized Range test.
It is appropriate for pairwise comparisons, but it doesn't
handle general contrasts. However, for pairwise comparisons
this procedure has better statistical power than the Scheffe
procedure.
Before I outline the Tukey procedure, I need to explain how
the comparisons are ordered: you have to compare in the
right order.
The first comparison of treatments uses the largest sample
mean compared with the smallest sample mean. If this test is
significant (there is a difference), then the next comparison
deals with the largest sample mean compared with the
second-smallest sample mean. If this test is also significant,
then the next comparison deals with the largest sample mean
compared with the third-smallest sample mean, and so on. Once
you reach a non-significant difference (there is no
difference), the testing stops.
Outline of the Tukey Procedure:
1. Set up the hypothesis:
2. Compute the test statistic:
3. Decision Rule:
4. Conclusion.
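
As a reference for steps 1 to 3, here is the studentized range statistic in the form commonly used when all k treatments have the same sample size n. This is a sketch under that assumption; the notation (MSE for the error mean square, N for the total sample size) is ours and may differ from the textbook's.

```latex
% Step 1: hypotheses for one pair of treatments i and j
\[
H_0\colon \mu_i = \mu_j
\quad\text{versus}\quad
H_1\colon \mu_i \neq \mu_j
\]
% Step 2: studentized range statistic (larger sample mean first)
\[
q = \frac{\bar{x}_i - \bar{x}_j}{\sqrt{\mathrm{MSE}/n}},
\qquad \bar{x}_i > \bar{x}_j
\]
% Step 3: compare with the studentized range critical value
\[
\text{Reject } H_0 \text{ if } q > q_{\alpha}(k,\, N - k)
\]
```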
Example 9.9: Recall Example 9.3.
Summary Statistics by Treatment
$n_1 = n_2 = n_3 = 5$
$\bar{x}_1 = 33$, $s_1 = 5.7$
$\bar{x}_2 = 26$, $s_2 = 4.2$
$\bar{x}_3 = 20$, $s_3 = 3.5$
Drug A versus Drug C:
1. Hypothesis:
2. Test Statistic:
3. Decision:
4. Conclusion:
Now, because the first test was significant, we proceed to the
next test (largest sample mean compared to the second
smallest sample mean).
Drug A versus Drug B:
1. Hypothesis:
2. Test Statistic:
3. Decision:
4. Conclusion:
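
The two comparisons above can be checked numerically with a small Python sketch. It assumes the equal-sample-size statistic q = (larger mean - smaller mean) / sqrt(MSE/n), takes MSE = 20.83333 from the ANOVA table, and uses 3.77278 as the critical value of the studentized range (the value SAS reports for alpha = 0.05 with 3 treatments and 12 error degrees of freedom). The function name and the labels A, B, C for the means 33, 26, 20 are our own choices.

```python
import math

means = {"A": 33.0, "B": 26.0, "C": 20.0}  # sample means by drug
n = 5             # observations per drug
mse = 20.83333    # Error Mean Square from the ANOVA table
q_crit = 3.77278  # studentized range critical value reported by SAS

se = math.sqrt(mse / n)  # denominator of q for equal sample sizes


def tukey_step_down(means, se, q_crit):
    """Compare the largest mean with the smallest, then the second
    smallest, and so on, stopping at the first non-significant test."""
    ordered = sorted(means, key=means.get)  # labels, smallest mean first
    largest = ordered[-1]
    results = []
    for other in ordered[:-1]:
        q = (means[largest] - means[other]) / se
        significant = q > q_crit
        results.append((largest, other, round(q, 2), significant))
        if not significant:
            break  # stop once a difference is not significant
    return results


results = tukey_step_down(means, se, q_crit)
for larger, smaller, q, significant in results:
    print(f"Drug {larger} vs Drug {smaller}: q = {q}, significant = {significant}")
```

Under these assumptions, Drug A versus Drug C gives q of about 6.37, which exceeds the critical value, while Drug A versus Drug B gives q of about 3.43, which does not, so the testing stops there.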
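SAS CODE:
options ps=62 ls=80;
/* One observation per line: drug number, then time */
data tukey;
input drug time;
cards;
1 30
1 35
1 40
1 25
1 35
2 25
2 20
2 30
2 25
2 30
3 15
3 20
3 25
3 20
3 20
;
run;
/* One-way ANOVA with Tukey pairwise comparisons of the drug means */
proc anova;
class drug;
model time=drug;
means drug/tukey;
run;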
SAS OUTPUT:

The SAS System

The ANOVA Procedure

Class Level Information

Class    Levels    Values
drug     3         1 2 3

Number of Observations Read    15
Number of Observations Used    15

The ANOVA Procedure

Dependent Variable: time

Source             DF    Sum of Squares    Mean Square    F Value    Pr > F
Model               2       423.3333333    211.6666667      10.16    0.0026
Error              12       250.0000000     20.8333333
Corrected Total    14       673.3333333

R-Square    Coeff Var    Root MSE    time Mean
0.628713    17.33299     4.564355    26.33333

Source    DF       Anova SS    Mean Square    F Value    Pr > F
drug       2    423.3333333    211.6666667      10.16    0.0026

The ANOVA Procedure

Tukey's Studentized Range (HSD) Test for time

NOTE: This test controls the Type I experimentwise error rate, but it generally
has a higher Type II error rate than REGWQ.

Alpha                                   0.05
Error Degrees of Freedom                12
Error Mean Square                       20.83333
Critical Value of Studentized Range     3.77278
Minimum Significant Difference          7.7012

Means with the same letter are not significantly different.

Tukey Grouping    Mean      N    drug
     A            33.000    5    1
     A
B    A            26.000    5    2
B
B                 20.000    5    3
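
The Tukey Minimum Significant Difference can be reproduced by hand. This short Python sketch assumes the equal-sample-size form MSD = q * sqrt(MSE/n), with all inputs read from the SAS output. It also compares the result with the Scheffe value of 8.047 from the earlier run; the variable names are ours.

```python
import math

n = 5               # observations per drug
mse = 20.83333      # Error Mean Square
q_crit = 3.77278    # Critical Value of Studentized Range
scheffe_msd = 8.047 # Minimum Significant Difference from the Scheffe output

# Assumed formula: smallest difference between two sample means that
# the Tukey procedure declares significant when sample sizes are equal
tukey_msd = q_crit * math.sqrt(mse / n)

print(f"Tukey MSD   = {tukey_msd:.4f}")
print(f"Scheffe MSD = {scheffe_msd:.3f}")
print(f"Tukey needs a smaller difference: {tukey_msd < scheffe_msd}")
```

The Tukey threshold is smaller than the Scheffe threshold for the same data, which is what "better power for pairwise comparisons" means in practice. In this example both procedures still lead to the same grouping.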
Contrasts
When we plan an ANOVA, in the ideal situation, specific
questions regarding comparisons among the means are posed
before the data are collected. Let's look at an example:
This is a randomized comparative experiment to compare
three methods for teaching reading. Our response variable is
COMP, a measure of reading comprehension that was
measured by a test taken after the instruction was completed.
Group   COMP     Group   COMP     Group   COMP
Basal    41      DRTA     31      Strat    53
Basal    41      DRTA     40      Strat    47
Basal    43      DRTA     48      Strat    41
Basal    46      DRTA     30      Strat    49
Basal    46      DRTA     42      Strat    43
Basal    45      DRTA     48      Strat    45
Basal    45      DRTA     49      Strat    50
Basal    32      DRTA     53      Strat    48
Basal    33      DRTA     48      Strat    49
Basal    39      DRTA     43      Strat    42
Basal    42      DRTA     55      Strat    38
Basal    45      DRTA     55      Strat    42
Basal    39      DRTA     57      Strat    34
Basal    44      DRTA     53      Strat    48
Basal    36      DRTA     37      Strat    51
Basal    49      DRTA     50      Strat    33
Basal    40      DRTA     54      Strat    44
Basal    35      DRTA     41      Strat    48
Basal    36      DRTA     49      Strat    49
Basal    40      DRTA     47      Strat    33
Basal    54      DRTA     49      Strat    45
Basal    32      DRTA     49      Strat    42
The following are the summary statistics (proc means) and the
ANOVA table (proc anova) from SAS:
The SAS System

The MEANS Procedure

Variable     N          Mean       Std Dev       Minimum       Maximum
Basal       22    41.0454545     5.6355781    32.0000000    54.0000000
DRTA        22    46.7272727     7.3884196    30.0000000    57.0000000
Strat       22    44.2727273     5.7667505    33.0000000    53.0000000

The SAS System

The ANOVA Procedure

Class Level Information

Class    Levels    Values
trt      3         Basal DRTA Strat

Number of Observations Read    66
Number of Observations Used    66

The ANOVA Procedure

Dependent Variable: COMP

Source             DF    Sum of Squares    Mean Square    F Value    Pr > F
Model               2        357.303030     178.651515       4.48    0.0152
Error              63       2511.681818      39.867965
Corrected Total    65       2868.984848

R-Square    Coeff Var    Root MSE    score Mean
0.124540    14.34531     6.314108    44.01515

Source    DF       Anova SS    Mean Square    F Value    Pr > F
trt        2    357.3030303    178.6515152       4.48    0.0152
The above ANOVA shows:
Let's say the researchers are now investigating a specific theory
about reading comprehension. The instruction for the Basal
group was the standard method commonly used in schools.
The DRTA and Strat groups received innovative methods of
teaching that were designed to increase the reading
comprehension of the children. The DRTA and Strat methods
were not identical, but they both involved teaching the
students to use similar comprehension strategies in their
reading. Based on this supposition, the relevant contrast is:
The above hypothesis compares the average of the two
innovative methods (DRTA and Strat) with the standard method
(Basal). The alternative is one-sided because the researchers
are interested in demonstrating that the new methods are
better than the old.
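
One way to write the hypothesis just described is sketched below. The subscripts B, D and S (for Basal, DRTA and Strat) are our own labels, and the form follows directly from "average of the two innovative methods versus the standard method" with a one-sided alternative.

```latex
\[
H_0\colon \tfrac{1}{2}(\mu_D + \mu_S) - \mu_B = 0
\quad\text{versus}\quad
H_a\colon \tfrac{1}{2}(\mu_D + \mu_S) - \mu_B > 0
\]
```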
Notice that, under the null hypothesis, this combination of
population means equals 0, and that its coefficients sum to 0.
Such combinations of means are called contrasts. We use $\psi$,
the Greek letter psi, for contrasts among population means:
Here are the relevant formulas for carrying out a contrast:

Contrasts
A contrast is a combination of population means of the form
$$\psi = \sum a_i \mu_i$$
where the coefficients $a_i$ have sum 0. The corresponding sample
contrast is
$$c = \sum a_i \bar{x}_i$$
The standard error of $c$ is
$$SE_c = s_p \sqrt{\sum \frac{a_i^2}{n_i}}$$
where $s_p$ is the root MSE. To test the null hypothesis
$$H_0\colon \psi = 0$$
use the t statistic
$$t = \frac{c}{SE_c}$$
with degrees of freedom DFE that are associated with $s_p$. The
alternative hypothesis can be one-sided or two-sided.
The sample contrast that estimates $\psi$ is
with standard error
The t statistic is
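
The three quantities above can be computed with a short Python sketch. It assumes the contrast coefficients -1 for Basal and 1/2 each for DRTA and Strat (the average of the two new methods minus the standard method), and takes the group means, sample sizes and root MSE from the SAS output. The variable names are our own.

```python
import math

# Group means and sample sizes from the MEANS procedure output
means = {"Basal": 41.0454545, "DRTA": 46.7272727, "Strat": 44.2727273}
n = {"Basal": 22, "DRTA": 22, "Strat": 22}

# Assumed coefficients: average of DRTA and Strat minus Basal
coef = {"Basal": -1.0, "DRTA": 0.5, "Strat": 0.5}

s_p = 6.314108  # Root MSE from the ANOVA output
dfe = 63        # error degrees of freedom

# A contrast requires coefficients that sum to 0
assert abs(sum(coef.values())) < 1e-12

# Sample contrast: c = sum of a_i * xbar_i
c = sum(coef[g] * means[g] for g in means)

# Standard error: SE_c = s_p * sqrt(sum of a_i^2 / n_i)
se_c = s_p * math.sqrt(sum(coef[g] ** 2 / n[g] for g in means))

# t statistic with DFE degrees of freedom
t = c / se_c

print(f"c = {c:.4f}, SE_c = {se_c:.4f}, t = {t:.2f} on {dfe} df")
```

With these inputs the sample contrast is about 4.45, its standard error is about 1.65, and the t statistic is about 2.70 on 63 degrees of freedom. The one-sided p-value would then come from a t table or software.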