STAB22 section 8.2 and chapter 8 exercises
8.35 45 out of 100 women is proportion 0.45; 80 out of 140 men is 0.571, so the estimated difference is $0.45 - 0.571 = -0.121$ (or you can do it as men minus women if you prefer). For the confidence interval, $SE_D = \sqrt{(0.45)(0.55)/100 + (0.571)(0.429)/140} = 0.0650$; for a 95% interval $z^* = 1.96$, so the margin of error is $m = (1.96)(0.0650) = 0.1274$. Go up and down this much from the estimated difference in proportions, -0.121, to get a confidence interval from -0.249 to 0.006. (If you did men minus women, your interval will be the other way around: -0.006 to 0.249. Either is good.)

Minitab can also do this kind of confidence interval. It does the calculation the same way we do, so you can use Minitab to check your answers. Select Stat, Basic Statistics, 2 Proportions. Click on Summarized Data, and enter the number of trials and number of successes for each sample: 100 and 45 in the first row, 140 and 80 in the second row. Click Options and check that you have the confidence level correct (95), and an Alternative of Not Equal. My output is in Figure 1 (it comes with some stuff about doing a test, which you can ignore). The confidence interval is the same as we found by hand.

    Test and CI for Two Proportions

    Sample   X    N  Sample p
    1       45  100  0.450000
    2       80  140  0.571429

    Difference = p (1) - p (2)
    Estimate for difference:  -0.121429
    95% CI for difference:  (-0.248815, 0.00595810)

Figure 1: Minitab output for 8.35

All the other questions in this section that do confidence intervals can be checked in Minitab this way.

8.36 Each subject had to choose one of Commercial A and Commercial B. So the ones that did not choose Commercial A chose Commercial B: $100 - 45 = 55$ of the women, $140 - 80 = 60$ of the men. You can plug all these numbers into the formulas again, or you can stop and think a moment: it really makes no difference whether you count successes (prefer Commercial A) or failures (prefer Commercial B), so the difference in proportions will be the same (just the other way around), and the square root and $z^*$ will be exactly the same ($p(1-p)$ is the same as $(1-p)p$ each time). More of the women prefer Commercial B, so the confidence interval is now mostly positive: from -0.006 to 0.249. (If you did men minus women, your answer here will be the same one as I got in (a) and vice versa.)

8.37 When doing a test for two proportions, bear in mind that the null hypothesis tells you that the two population proportions are the same (you just don't know what they are). So you estimate the common proportion first (as the total number of successes divided by the total number of trials). Here that is $\hat{p} = (45 + 80)/(100 + 140) = 0.5208$. $SE_{Dp}$ then uses the $\hat{p}$ that you just found: $SE_{Dp} = \sqrt{(0.5208)(0.4792)(1/100 + 1/140)} = 0.0654$. This is the bottom of your test statistic. (Note that it's really the same formula as $SE_D$ for a confidence interval, except that you work out $\hat{p}$ first and plug that in where you had $\hat{p}_1$ and $\hat{p}_2$ before.) Then the test statistic is the difference in proportions divided by the square root you just found: $z = (45/100 - 80/140)/0.0654 = -1.86$. Our null hypothesis was that the proportions for men and women are the same, and the alternative is the two-sided one that they are different, so we need a two-sided P-value: $2 \times 0.0317 = 0.0634$. This is not quite smaller than 0.05, so we do not quite have evidence that the population proportions differ (even though the sample proportions differ by 12 percentage points). As ever, larger sample sizes would help.
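The hand calculations for 8.35 and 8.37 can also be checked outside Minitab. The following is only a sketch: the function names `two_prop_ci` and `two_prop_z_test` are made up for illustration, and the normal distribution function is built from `math.erf` rather than read from a table, so the last decimal place may differ slightly from the table-based answers.

```python
import math


def normal_cdf(z):
    """Standard normal cumulative distribution function, via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def two_prop_ci(x1, n1, x2, n2, z_star=1.96):
    """Large-sample interval for p1 - p2, using the unpooled SE_D."""
    p1, p2 = x1 / n1, x2 / n2
    se_d = math.sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    diff = p1 - p2
    return diff - z_star * se_d, diff + z_star * se_d


def two_prop_z_test(x1, n1, x2, n2):
    """Pooled z statistic (uses SE_Dp) and its two-sided P-value."""
    p1, p2 = x1 / n1, x2 / n2
    p_pooled = (x1 + x2) / (n1 + n2)
    se_dp = math.sqrt(p_pooled * (1 - p_pooled) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se_dp
    p_two_sided = 2 * (1 - normal_cdf(abs(z)))
    return z, p_two_sided


# Women first (45 of 100), then men (80 of 140), as in 8.35 and 8.37.
low, high = two_prop_ci(45, 100, 80, 140)
z, p_value = two_prop_z_test(45, 100, 80, 140)
print(round(low, 3), round(high, 3))
print(round(z, 2), round(p_value, 3))
```

Swapping the two samples in the calls changes the signs of the interval endpoints and of z but leaves the two-sided P-value alone, which is the point made in 8.36.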
As ever, you can compare this two-sided test with the confidence interval: we did not quite reject the null, so a difference of 0 is (just) inside the confidence interval. In both cases, though, it is a close thing.

Minitab can also do this test (as we do, so you can compare answers). Select Stat, Basic Statistics, 2 Proportions. Click on Summarized Data and enter the number of trials and number of successes for each sample in the same way as 8.35. Click on Options and select the correct Alternative (here Not Equal is OK). Also select Use Pooled Estimate of p For Test (so that Minitab calculates $\hat{p}$ the same way as we do by hand). See Figure 2; this time you ignore the confidence interval and look at the test statistic (-1.86) and P-value 0.063, which are the same as we got by hand.

    Test and CI for Two Proportions

    Sample   X    N  Sample p
    1       45  100  0.450000
    2       80  140  0.571429

    Difference = p (1) - p (2)
    Estimate for difference:  -0.121429
    95% CI for difference:  (-0.248815, 0.00595810)
    Test for difference = 0 (vs not = 0):  Z = -1.86  P-Value = 0.063

Figure 2: Minitab test output for 8.37

All the other questions in this section that do tests (except for 8.62, which is different) can be checked in Minitab this way.

8.38 If you imagine that you had some reason ahead of time to prefer a one-sided alternative (maybe Commercial A featured sports cars or scantily-clad women or beer), you would first check that the test statistic was on the correct side. Since we did women minus men, the test statistic should be, and is, negative. This means that we can cite the undoubled P-value, 0.0317. This is significant at the 5% level, so we can conclude that the proportion of men favouring Commercial A is larger than the proportion of women. (If you did men minus women, you'll find that your test statistic is 1.86, positive, and the same logic leads you to conclude that you are also on the correct side and the P-value is the same.)

8.39 According to the bottom of the box on pages 506-507, we can use the large-sample confidence interval when the number of successes and failures in each of the two samples is at least 10 (that is, there are four things to check, and they all have to be OK). What we are really checking is that the normal approximation to the binomial works all right in both samples.

(a) is all right: the first sample has 10 successes and 20 failures, and the second has 15 successes and 15 failures. (d) is likewise OK. (b) and (c) fail because the number of successes in the second sample is less than 10 (so you don't need to worry about anything else); (e) fails because the number of failures in the second sample is $50 - 45 = 5$, which is less than 10.

Most of the time, you'll be dealing with large enough samples in practice, but it is always a good idea to check.

8.40 This is the same idea as 8.39: do we have at least 10 successes and 10 failures in each sample? (c) and (d) are both all right ((c) has exactly 50% successes in its large-enough samples), but the others all fail: (a) has too few successes in the 2nd sample, (b) has too few failures (too many successes) in the 1st sample, and (e) has too few failures in the 2nd sample.

8.41 In 8.4, we attacked the same data by treating the 2003 data as a population value, which it is not really: it came from a sample too. Here we're going to tackle things the honest way: compare the figures as coming from two samples.
89% is 1068 out of 1200; 83% is 996 out of 1200. For our test, the null is that the two population proportions are the same, and the alternative is that the proportion of cellphone owners has increased (the same as in 8.4).

First we need the overall estimate of the population proportion of successes: $\hat{p} = (1068 + 996)/(1200 + 1200) = 0.86$, or, more simply, note that the two samples are the same size, so that the best estimate of $p$ is just the average of the two sample proportions. The square root on the bottom of the test statistic ($SE_{Dp}$) is $\sqrt{(0.86)(0.14)(1/1200 + 1/1200)} = 0.0142$, so the test statistic itself is $z = (0.89 - 0.83)/0.0142 = 4.24$. This is off the end of the table, so the P-value is very small, and we can definitely conclude that the proportion of cellphone owners has increased. (This is the same conclusion as in 8.4, though there the test statistic was bigger, because we were treating the 2003 value (0.83) as the truth, whereas it really comes from a sample and is subject to sampling variability as well, so if we are honest, as here, we should have a slightly larger P-value than we did in 8.4.)

Usually, if you are calculating a confidence interval as well as doing a test, you can re-use most of your work to get the confidence interval. Not with proportions, though. The margin of error for the 95% interval is $z^* SE_D = 1.96\sqrt{(0.89)(0.11)/1200 + (0.83)(0.17)/1200} = (1.96)(0.0141) = 0.0277$, and so the confidence interval is $0.89 - 0.83 \pm 0.0277$, from 0.032 to 0.088. We are pretty sure the proportion has gone up; this is how much we think it has gone up by.

Two observations to make here: these samples gave a difference of six percentage points, which was strongly significant, as compared to 8.35, where a difference of 12 percentage points was not quite significant. You might guess that the larger sample sizes made the difference here. Also, note that here $SE_D$ and $SE_{Dp}$ are almost, but not quite, the same; they are never exactly equal (unless the two sample proportions are also exactly equal), but they will be close if the two sample proportions are also close. (The difference here was statistically significant but not very big, no more than about 9 percentage points.)

8.44 I can't see which proportion is bigger, so I'll just go ahead and calculate them: for pet owners, 285/595 = 0.4790 and for non-pet-owners, 1024/1939 = 0.5281.

For the confidence interval, we want $SE_D$:

$$SE_D = \sqrt{\frac{(0.4790)(0.5210)}{595} + \frac{(0.5281)(0.4719)}{1939}} = 0.0234,$$

so the 95% interval is $0.5281 - 0.4790 \pm (1.96)(0.0234)$, or from 0.0032 to 0.0950. (If you do the subtraction the other way around, your interval endpoints will be negative.)

8.45 Now we're comparing the proportion of pet-owners that are married with the proportion of non-pet-owners that are married. This time, we have the sample proportions, so we can go right ahead and calculate:

$$SE_D = \sqrt{\frac{(0.533)(0.467)}{595} + \frac{(0.577)(0.423)}{1939}} = 0.0233,$$

giving a 95% confidence interval of $0.577 - 0.533 \pm (1.96)(0.0233)$, from -0.0017 to 0.0897. According to our interval, the proportions married for pet-owners vs. non-pet-owners could be anywhere from almost equal to about 9 percentage points higher for non-pet-owners. (This required us to look carefully at which proportion was which.)

8.46 The alternative hypothesis is that the population proportions are different (two-sided), and the null is that they are the same. Plug the given numbers into your formulas to find that $\hat{p}_1 = 2973/14995 = 0.1983$, $\hat{p}_2 = 3140/13819 = 0.2272$, $\hat{p} = (2973 + 3140)/(14995 + 13819) = 0.2122$, $SE_{Dp} = \sqrt{(0.2122)(0.7878)(1/14995 + 1/13819)} = 0.0048$, z = (0.1983 -
0.2272)/0.0048 = -6.01, which gives a P-value that is very small indeed. The population proportions are definitely different, even though the sample proportions differ by only 3 percentage points. (That is because of the huge sample sizes.) Also, $\hat{p}$ is slightly closer to the 1993 proportion because the 1993 data was based on a slightly larger sample size.

The 95% confidence interval, therefore, should not include 0. Even though $SE_{Dp}$ and $SE_D$ are different, they agree to 4 decimal places: $SE_D = \sqrt{(0.1983)(0.8017)/14995 + (0.2272)(0.7728)/13819} = 0.0048$, so that the margin of error is $m = (1.96)(0.0048) = 0.0095$ and the confidence interval goes from (1999 values first) 0.0195 to 0.0384. We think the proportion of frequent binge drinkers in 1999 is bigger than the proportion of frequent binge drinkers in 1993 by about this amount: not much bigger, but a real increase, statistically speaking.

8.47 This is like 8.35 and 8.36: looking at failures rather than successes shouldn't change your opinion of the situation in any important way. The confidence interval should be the same, only with negatives instead of positives (the proportion of people who are not frequent binge drinkers was higher in the 1993 sample), and the test statistic should be positive (as I calculated it in 8.46), with the same P-value. (It doesn't matter whether you have a one-sided or two-sided alternative; the P-value will be the same because with a one-sided alternative the definition of "correct side" changes along with the sign of the test statistic.)

You can run through the calculations to convince yourself that the above is correct. Your calculations of $SE_D$ and $SE_{Dp}$ have the same numbers in a different order, so $SE_D$ and $SE_{Dp}$ have the same value as before. The difference in sample proportions of failures is the same as the difference in sample proportions of successes, though with the opposite sign, so test statistic and confidence interval are as before but with the opposite sign.

8.48 The sample proportions are $\hat{p}_1 = 35/165 = 0.2121$ and $\hat{p}_2 = 17/283 = 0.0601$ for the congested area and uncongested area respectively. These differ by $0.2121 - 0.0601 = 0.1520$.

There are two different standard errors: $SE_D$ for a confidence interval, and $SE_{Dp}$ for a test. We are going to need both, so we might as well find both now:

$$SE_D = \sqrt{\frac{(0.2121)(0.7879)}{165} + \frac{(0.0601)(0.9399)}{283}} = 0.0348;$$

$$\hat{p} = (17 + 35)/(283 + 165) = 0.1161;$$

$$SE_{Dp} = \sqrt{(0.1161)(0.8839)\left(\frac{1}{165} + \frac{1}{283}\right)} = 0.0314.$$

($SE_D$ and $SE_{Dp}$ are different, but almost the same. This is usually the case. $SE_{Dp}$ is calculated assuming that the two population proportions are the same, whereas $SE_D$ makes no such assumption.)

The study is aiming to see whether reducing pollution reduces wheezing, so a one-sided alternative is in order. With the labels as used above, the null is $p_1 = p_2$ and the alternative is $p_1 > p_2$. (If you used 1 and 2 the other way around, your hypotheses should reflect that.)

Test statistic is

$$z = \frac{\hat{p}_1 - \hat{p}_2}{SE_{Dp}} = \frac{0.1520}{0.0314} = 4.84,$$

whose P-value is very small. So we can reject the null and conclude that building the bypass[1] did reduce the symptoms of wheezing. (I skipped the drawing of the sketch, but if you drew one, it should look like a normal curve with 4.84 marked somewhere way over to the right, and the area beyond 4.84 shaded. If you had your labels the other way around, your test statistic will be -4.84, which is way over on the left, and "beyond" means "less than".)

Part (e) is a rather odd question, but to calculate the confidence interval anyway: $0.1520 \pm (1.96)(0.0348)$, from 0.0838 to 0.2203, so that the percentage reporting improvement was between 8% and 22% higher for the bypass residents. We had evidence of a difference (in (d)), but if we had not, our confidence interval would have included both negative and positive values for the difference in proportions; there could be either a positive or negative difference, or no difference at all, so that absence of evidence is not evidence of absence.

There might be something different about the United Kingdom that would cause these results not to be reproducible in other countries. Weather is the obvious one; in smoggy places, it might not matter much how far away from the pollution you were.

[1] If you know your Hitchhiker's Guide to the Galaxy, you will recall that when Arthur Dent asked why a bypass had to be built through his house, Mr. Prosser replied "Well, you've got to build bypasses, haven't you?".

8.51 Standard procedure once again for a confidence interval: $\hat{p}_1 = 73/91 = 0.8022$; $\hat{p}_2 = 75/109 = 0.6881$. $SE_D = 0.0609$, and the interval is $0.8022 - 0.6881 \pm (1.96)(0.0609)$, from -0.0053 to 0.2335.

8.53 This (we hope) should be getting straightforward by now. The sample proportions, taking female references first, are $\hat{p}_1 = 48/60 = 0.80$ and $\hat{p}_2 = 52/132 = 0.394$. $SE_D = \sqrt{(0.8)(0.2)/60 + (0.394)(0.606)/132} = 0.0669$. For a 90% interval, $z^* = 1.645$, so the margin of error is $m = (1.645)(0.0669) = 0.11$, and the confidence interval for the difference in proportions goes from 0.30 to 0.52. We're not very sure how big the difference in proportions is, but we are pretty sure that it is positive: that is, female references are more likely to be to juveniles than male references.

For the test (though you can guess what the result will be): use a two-sided alternative (since any gender bias might go either way), and figure out $\hat{p} = (48 + 52)/(60 + 132) = 0.5208$ (quite a lot closer to the male value), $SE_{Dp} = \sqrt{(0.5208)(0.4792)(1/60 + 1/132)} = 0.0778$ (quite a lot different from $SE_D$ because the sample proportions are very different and the sample sizes are small), $z = (0.80 - 0.394)/0.0778 = 5.22$. The P-value, even two-sided, is very small, so we have evidence that there is a gender bias in textbooks of the type sampled. (Are you surprised, given the confidence interval? I thought not.)

8.57 (a) Our confidence interval calculation assumes that there are no problems with nonresponse: that is, the confidence interval includes an allowance for sampling variability but nothing else. (b) $\hat{p}_1$ and $\hat{p}_2$ are the sample proportions; a null hypothesis has to be about the population proportions. (Once you fix up the null hypothesis, you do indeed use a z test statistic to do the test, so that part is correct.) (c) One sample could be bigger than the other and have more successes in proportion: for example, one sample has 20 trials and 10 successes (proportion 0.5), and the other has 40 trials and 20 successes (also 0.5).

8.58 This is a straightforward one-proportion confidence interval; the only twist is that it is a 90% interval so $z^* = 1.645$. The sample proportion is 0.58; exercise 8.14 says that the sample size is $n = 1048$, so the margin of error is $1.645\sqrt{(0.58)(0.42)/1048} = 0.025$, and the confidence interval is 0.555 to 0.605. (Note that the margin of error is around 3% again.)

8.59 Under the assumptions, we have 524 each of men and women. The confidence interval uses $z^* = 1.96$ and $SE_D = \sqrt{(0.59)(0.41)/524 + (0.56)(0.44)/524} = 0.0305$, so the margin of error is $m = (1.96)(0.0305) = 0.06$. The difference in sample proportions is 0.03, so the confidence interval goes from -0.03 to 0.09. The sample proportions are close, so the sample sizes are
not big enough to prove that there is a difference between men and women.

8.62 In Exercise 8.41, the sample proportion of students owning a cellphone in 2004 was 89% (based on a sample of 1200 students), and in 2000 was 43%, likewise (as far as we can tell) based on a sample of 1200 students.

For these two samples, $\hat{p}_1 - 2\hat{p}_2 = 0.89 - 2(0.43) = 0.03$. Treating $\hat{p}_1$ and $\hat{p}_2$ as random variables, $\hat{p}_1$ has variance $p_1(1 - p_1)/1200$ and $\hat{p}_2$ has variance $p_2(1 - p_2)/1200$. According to the rules for variances, $\hat{p}_1 - 2\hat{p}_2$ has variance $\mathrm{var}(\hat{p}_1) + 2^2\,\mathrm{var}(\hat{p}_2)$ and standard error the square root of that.

Our null hypothesis is $H_0: p_1 = 2p_2$ and the alternative is $H_a: p_1 > 2p_2$. We don't know the values of $p_1$ and $p_2$, so we'll put in the values we got from our samples to get a variance of $(0.89)(0.11)/1200 + 4(0.43)(0.57)/1200 = 0.0009$ and SE $\sqrt{0.0009} = 0.03$. This is not the absolute best way to do it, since we haven't used the $H_0$ fact that $p_1 = 2p_2$, but it'll be good enough. See below for some more thoughts on this.

The test statistic is $z = \{0.89 - 2(0.43)\}/0.03 = 1$, so the P-value (one-sided, since we are testing "more than doubled") is 0.1587. There is no evidence at the 5% level to conclude that the proportion has more than doubled.

When you're doing a test, as in this section, of $H_0: p_1 = p_2$, what you do is to say "let $p_1 = p_2 = p$" and then estimate $p$ (that's what that business of calculating an overall success rate, and using $SE_{Dp}$ rather than $SE_D$, is for). So the best thing to do here would be to let $q = p_1 = 2p_2$ and figure out an estimate of $q$, which will be somewhere between 89% and 2 times 43% (though it's hard to see what it would be exactly). But since, for these data, $\hat{p}_1$ and $2\hat{p}_2$ are already pretty close, this improvement wouldn't make much difference to anything, so the P-value you'd get by doing this would be very similar to the 0.1587 we got above.

8.63 These symptoms differ in severity: at the top of the list, you might notice a wheezing attack but not wish to do anything about it. The symptoms at the bottom of the list are the kind of thing that are rarer, but you probably would consult a doctor. The short answer is "different people have different symptoms".

Part (b) is a lot of calculation, which you can get Minitab to help you with. A summary of the results is in Figure 3.

We could expect the bypass proportions to be higher: that is, we'd expect more improvement when the pollution decreased.

See the P-value column of Figure 3. Only the sleep difference is significant, and three of the differences are even on the wrong side.

For (e), see the last two columns of Figure 3. Part (b), though, gives you an improvement relative to a control group (the congested group), and we know that comparison with a control is a good idea: we can see the effect of the bypass compared to no bypass, rather than just looking at the bypass group alone.

    Complaint    p1-hat - p2-hat   CI                    z      P-value   p-hat (bypass alone)   CI
    Sleep             0.0864        0.0280 to 0.1448     2.64   0.0042    0.1596                 0.1168 to 0.2023
    Number            0.0307       -0.0361 to 0.0976     0.88   0.1897    0.1596                 0.1168 to 0.2023
    Speech            0.0182       -0.0152 to 0.0515     0.99   0.1600    0.0426                 0.0190 to 0.0661
    Activities        0.0137       -0.0395 to 0.0670     0.50   0.3100    0.0925                 0.0586 to 0.1264
    Doctor           -0.0112       -0.0796 to 0.0573    -0.32   > 0.5     0.1174                 0.0773 to 0.1576
    Phlegm           -0.0220       -0.0711 to 0.0271    -0.92   > 0.5     0.0474                 0.0212 to 0.0736
    Cough            -0.0323       -0.0853 to 0.0207    -1.25   > 0.5     0.0575                 0.0292 to 0.0857

Figure 3: Summary of results for exercise 8.63

8.64 63 of 296 label users among the women is proportion $\hat{p}_1 = 0.213$, and 27 of 251 men is proportion $\hat{p}_2 = 0.108$. For the confidence interval, you need $SE_D = \sqrt{(0.213)(0.787)/296 + (0.108)(0.892)/251} = 0.0308$, so the margin of error is $m = (1.96)(0.0308) = 0.06$. The confidence interval goes from 0.045 to 0.166, which says that the proportion of women label users is higher than the proportion of men label users by somewhere between 4 and 16 percentage points. We haven't estimated the difference in proportions very precisely, but we think
it's positive, which means that we should end up rejecting a hypothesis that the proportions are equal.

To do the test: use a null hypothesis that the proportions are equal against a two-sided alternative (unless you can justify that, say, women are if anything more likely to be label users). Then calculate $\hat{p} = (63 + 27)/(296 + 251) = 0.1645$, $SE_{Dp} = \sqrt{(0.1645)(0.8355)(1/296 + 1/251)} = 0.0318$, and the test statistic is $z = (0.213 - 0.108)/0.0318 = 3.31$. The P-value for this is $2 \times 0.0005 = 0.0010$, so we can certainly reject the null hypothesis at the 5% (or 1%) level and conclude that there is a difference in proportion of label users between men and women.

8.65 Same old thing again. Let's do this one in Minitab for a change. Select Stat, Basic Statistics, 2 Proportions. Click on Summarized Data and enter 1132 trials, 643 successes (for the Internet users) and, on the second row, 852 trials, 349 successes (for the nonusers). In this case, "success" means "completed college". For the test, "differ significantly" implies a two-sided alternative, so Not Equal is right for the Alternative. Make sure that Use Pooled Estimate is checked. See the top half of Figure 4 for the results. The test statistic (on the bottom line) is 6.98, with a two-sided P-value very close to 0. There definitely is a difference in proportion of college graduates between users and nonusers (which of course does not mean that using the Internet will help you to graduate).

For the confidence interval, you probably shouldn't use the output you just got; you should go back and start again. Select Stat, Basic Statistics, 2 Proportions; all the numbers you typed in should still be in the Summarized Data boxes. Click Options, make sure that Confidence Level is correct (95) and Alternative is Not Equal, and uncheck Use Pooled Estimate of p (just to be safe). This gives the bottom half of Figure 4. The confidence interval on the second-to-last line is 0.115 to 0.202 for the difference in proportions; we think Internet users have this much greater probability of graduating.

The suggestion from this output is that, for the confidence interval, it doesn't actually matter whether you have the Use Pooled Estimate box checked or not, and that you might as well have gotten the confidence interval from the top half of the output anyway. You can check the Minitab help files to see whether this will work in general. (Though, if you're doing a 1-sided test, you'll have to run this a second time to get the confidence interval anyway.)

    Test and CI for Two Proportions

    Sample    X     N  Sample p
    1       643  1132  0.568021
    2       349   852  0.409624

    Difference = p (1) - p (2)
    Estimate for difference:  0.158397
    95% CI for difference:  (0.114544, 0.202249)
    Test for difference = 0 (vs not = 0):  Z = 6.98  P-Value = 0.000

    Test and CI for Two Proportions

    Sample    X     N  Sample p
    1       643  1132  0.568021
    2       349   852  0.409624

    Difference = p (1) - p (2)
    Estimate for difference:  0.158397
    95% CI for difference:  (0.114544, 0.202249)
    Test for difference = 0 (vs not = 0):  Z = 7.08  P-Value = 0.000

Figure 4: Test and confidence interval for 8.65

8.68 Before the changes, 16% of 200 repairs were completed in 5 days, which is $(0.16)(200) = 32$ orders. The confidence interval for the "before" proportion is

$$0.16 \pm 1.96\sqrt{\frac{(0.16)(0.84)}{200}},$$

from 0.1092 to 0.2108.
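A one-proportion interval like this is a one-line formula, so it is easy to check by machine. The sketch below uses a made-up helper name, `one_prop_ci`, and plugs in the "before" figures from 8.68 (proportion 0.16, 200 repairs).

```python
import math


def one_prop_ci(p_hat, n, z_star=1.96):
    """Large-sample interval for one proportion: p_hat +/- z* sqrt(p_hat(1-p_hat)/n)."""
    margin = z_star * math.sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - margin, p_hat + margin


# "Before" proportion from 8.68: 16% of 200 repairs.
low, high = one_prop_ci(0.16, 200)
print(round(low, 4), round(high, 4))  # prints 0.1092 0.2108
```

Changing the first argument to 0.90 gives the corresponding "after" interval.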
After the changes, 90% of 200 repairs were completed in 5 days, which is 180 orders. The confidence interval here is

$$0.90 \pm 1.96\sqrt{\frac{(0.90)(0.10)}{200}},$$

from 0.8584 to 0.9416.

These confidence intervals don't overlap, which might make you suspect that the proportions are significantly different. But to compare two proportions, you have to use the two-proportion methods of section 8.2. For that, we need

$$SE_D = \sqrt{\frac{(0.16)(0.84)}{200} + \frac{(0.90)(0.10)}{200}} = 0.0335,$$

and hence the confidence interval for the difference of proportions is $0.90 - 0.16 \pm (1.96)(0.0335)$, from 0.6743 to 0.8057. In percents, this is from 67.43% to 80.57%.

8.69 This is a 1-sample CI for a proportion (just to confuse you). The margin of error is $z^*\sqrt{\hat{p}(1 - \hat{p})/n} = 1.96\sqrt{(0.56)(0.44)/1200} = 0.028$, so the confidence interval goes from 0.532 to 0.588.

8.70 Back to two proportions. Labelling the die-hard fans as 1 and the less loyal fans as 2, for sample proportions of people who watched/listened to Cubs games as a child, $\hat{p}_1 = 0.903$, $\hat{p}_2 = 0.679$. These are 121 of 134 and 161 of 237, respectively.

For the test, you can justify a one-sided test, on the basis that die-hard fans should be more likely if anything to have paid attention to Cubs games as children. (A two-sided test wouldn't be wrong, either. Just state your choice and get on with the calculations.) Here that means $H_0: p_1 = p_2$ and $H_a: p_1 > p_2$. Then go through the calculations for the test: $\hat{p} = (121 + 161)/(134 + 237) = 0.7601$ (which is not just the average of the two $\hat{p}$ values, because there are more less loyal fans). Then $SE_{Dp} = \sqrt{(0.7601)(0.2399)(1/134 + 1/237)} = 0.0462$, and the test statistic is $z = (0.903 - 0.679)/0.0462 = 4.85$, again off the end of the table. So again the P-value is very small and we have strong evidence that the die-hard fans were more likely to watch/listen to Cubs games as children (assuming that the fans surveyed really were random samples of all possible Cubs fans).

The test isn't really very interesting, because who really wants to know that die-hard fans are more likely to have been fans as children (where else, after all, would someone learn to be a baseball fan)? A confidence interval is more interesting because it answers a more interesting question: how much more likely is a die-hard fan to have been a childhood fan? The calculation is the usual business: $SE_D = \sqrt{(0.903)(0.097)/134 + (0.679)(0.321)/237} = 0.0397$ (not so close to $SE_{Dp}$ because the sample proportions are quite different), the margin of error is $m = (1.96)(0.0397) = 0.078$, and the interval is $0.903 - 0.679 \pm 0.078$, from 0.146 to 0.301. Die-hard fans are quite a bit more likely to have been childhood fans as well, or at least fans enough to watch or listen to games.

8.74 463 out of 975 is $\hat{p} = 0.475$, so a 95% confidence interval for the population proportion is $0.475 \pm 1.96\sqrt{(0.475)(0.525)/975} = 0.475 \pm 0.031$, from 0.444 to 0.506. Somewhere close to, but probably a little less than, half of students at this university changed their major at some time. The confidence interval as percents goes from 44.4% to 50.6%. To get a confidence interval for the number of students, just multiply the endpoints by 37500: 16650 to 18975. An interval that expresses the proportion with reasonable accuracy doesn't do the same for the number of students!

8.77 This one is good to do with a spreadsheet, to save yourself the repetitive calculations. $\hat{p}$ is always 0.45 (the average of 0.4 and 0.5) and so $SE_{Dp} = \sqrt{(0.45)(0.55)(1/n + 1/n)}$. In your spreadsheet, set this up as a formula so that you can work it out for various $n$. Then the test statistic is $z = (0.5 - 0.4)/SE_{Dp} = 0.1/SE_{Dp}$, and if your spreadsheet has a normal distribution function (mine is called NORMSDIST), you can work out the P-value too. Some
judicious copying and pasting gets you Figure 5. Notice how the P-values go steadily down. This shows that when you increase the sample size, a constant difference (0.1) between the two samples becomes more and more statistically significant (the larger the samples get, the closer each $\hat{p}$ should be to its population value $p$, by the law of large numbers, so, loosely, any difference in sample proportions becomes more likely to be real rather than chance).

    n      SE_Dp   z      P-value
    40     0.11    0.90   0.1843
    50     0.10    1.01   0.1574
    80     0.08    1.27   0.1018
    100    0.07    1.42   0.0776
    400    0.04    2.84   0.0022
    500    0.03    3.18   0.0007
    1000   0.02    4.49   0.0000

Figure 5: Test statistics and P-values for 8.77

8.79 The margin of error for the confidence interval is $m = z^* SE_D = 1.96\sqrt{(0.5)(0.5)/n + (0.5)(0.5)/n} = 1.96\sqrt{2(0.5^2)/n} = 1.3859/\sqrt{n}$. Writing $n$ in terms of $m$ gives $n = (1.3859/m)^2$; putting in $m = 0.075$ gives $n = (1.3859/0.075)^2 = 341.48$, so we will need a sample size of 342 in each group.

Mimicking the above with more symbols and fewer numbers gives $m = z^* SE_D = z^*\sqrt{2(0.5)^2/n} = z^*\sqrt{0.5}/\sqrt{n}$, so that $n = 0.5(z^*/m)^2$. (Check this with your numbers in (a).) As always with these sample size calculations, if you want to make the margin of error smaller, you have to make the sample size a lot bigger.

8.81 The proportion of all eligible jurors that were Mexican-American is 143611/181535 = 0.79. We're going to treat this as our null hypothesis value of $p$, and look to see whether our sample proportion of 339/870 = 0.3897 is significantly less (i.e. we are doing a one-sided test, looking for evidence that the number of Mexican-Americans selected is too small).

The test statistic is $z = (0.3897 - 0.79)/\sqrt{(0.79)(0.21)/870} = -28.99$. This is not just on the correct side, it is so far on the correct side that the P-value is tiny; Mexican-Americans are definitely being selected for jury duty less frequently than would be expected, given how many of them are eligible jurors. (This, by itself, is not evidence of discrimination, but it is now up to the defence to prove some other (acceptable) reason for the difference.)

We would expect the two-sample version of the problem to give us very similar answers. The tricky thing is getting the numbers right: we have 339 Mexican-Americans out of 870 jurors, and $143611 - 339 = 143272$ MAs out of $181535 - 870 = 180665$ non-jurors. (The sample of non-jurors is so big that we almost have no sampling variability here, in other words we'd expect a result like the above.) Going through the calculations, $SE_{Dp} = 0.0138$ and the test statistic is -29.20, basically the same strong evidence against equal proportions as the one-sample test.

8.82 I went to nhl.com and looked at the Vancouver Canucks (I used to live in Vancouver). In the 2007-2008 season, the Canucks played 41 regular season games at home, winning 26 and losing 15, and 41 regular season games on the road, winning 17 and losing 24 (counting shootout wins as wins and shootout losses as losses). This looks like quite a disparity, but could it happen by chance?

First up, a test. We can justify a one-sided alternative because we are looking for a positive advantage to playing at home. I got a test statistic of 1.99 for my data, with a one-sided P-value of 0.023, so there is evidence of a home ice advantage.

To see how big the home-ice advantage might be, a confidence interval is called for. In Minitab, I switched Alternative back to Not Equal to get a 95% confidence interval from 0.009 to 0.430. As is often the case, we have evidence that the effect (here home
ice advantage) exists, but we do not have a good sense of how big
it is. One way to get a bigger sample size is to look at the whole
NHL: you can go to the Standings page at nhl.com and add up
the home and road wins and losses, bearing in mind that each
game is listed twice, once for each team. (You can ignore the ties,
or count them as half a win and half a loss.)
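The Canucks figures in 8.82 (26 wins in 41 home games, 17 wins in 41 road games) can be run through the same two-proportion formulas by machine. This is a sketch only: `home_advantage_summary` is a made-up name, the normal distribution function comes from `math.erf`, and the binomial assumption it relies on is the one questioned in the next paragraph. Feeding in league-wide totals instead is just a matter of changing the four counts.

```python
import math


def normal_cdf(z):
    """Standard normal cumulative distribution function, via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def home_advantage_summary(home_wins, home_games, road_wins, road_games, z_star=1.96):
    """One-sided pooled z test of home > road, plus an unpooled interval for the difference."""
    p_home = home_wins / home_games
    p_road = road_wins / road_games
    diff = p_home - p_road

    # Test: pooled proportion under H0 (home and road win probabilities equal).
    p_pooled = (home_wins + road_wins) / (home_games + road_games)
    se_dp = math.sqrt(p_pooled * (1 - p_pooled) * (1 / home_games + 1 / road_games))
    z = diff / se_dp
    p_one_sided = 1 - normal_cdf(z)  # alternative: home proportion is larger

    # Interval: unpooled standard error SE_D.
    se_d = math.sqrt(p_home * (1 - p_home) / home_games
                     + p_road * (1 - p_road) / road_games)
    ci = (diff - z_star * se_d, diff + z_star * se_d)
    return z, p_one_sided, ci


z, p_value, (low, high) = home_advantage_summary(26, 41, 17, 41)
print(round(z, 2), round(p_value, 3), round(low, 3), round(high, 3))
```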
To do this kind of thing properly, you need to account for the
fact that some teams are stronger than others (for example, a
strong team will tend to win most of its games whether they are
at home or on the road). The Canucks (or whichever team you
chose) played teams of a variety of strengths, so their probability
of winning a home game was not equal for all games. This means
that the binomial assumption that underlies our calculations is
not actually true. There is a thing called the Bradley-Terry model
that enables you to estimate the size of the home advantage while
also allowing for the fact that teams are of different strengths;
using this enables you to show that a home advantage exists in
almost every sports league. Its size varies, though: in the soccer
leagues of southern and eastern Europe, the home field advantage
is quite considerable (which can be hard to see because the teams
differ considerably in strength as well: once you allow for the
strengths of the teams, you can see just how big it is).