Chapter 9 Large-sample tests of
hypothesis
• Objectives: To learn how to formulate a
statistical hypothesis and test it
9.1 Introduction
• Suppose that a pharmaceutical company
is concerned that the mean potency µ of an
antibiotic meets the minimum government
potency standards. They need to decide
between two possibilities:
– The mean potency µ does not exceed
the minimum allowable potency.
– The mean potency µ exceeds the minimum
allowable potency.
• This is an example of a test of hypothesis.
• This is similar to a courtroom trial. In trying a person for a
crime, the jury needs to decide between two
possibilities:
– The person is guilty.
– The person is innocent.
• To begin with, the person is assumed innocent.
• The prosecutor presents evidence, trying to
convince the jury to reject the original assumption
of innocence, and conclude that the person is
guilty.
9.2 Components of a statistical test of
hypothesis
1. The null hypothesis, H0:
– Assumed to be true until we can prove
otherwise.
2. The alternative hypothesis, Ha:
– Will be accepted as true if we can disprove
H0
Court trial:
H0: innocent
Ha: guilty
Pharmaceuticals:
H0: µ does not exceed the allowed amount
Ha: µ exceeds the allowed amount
3. The test statistic and its p-value:
• A single statistic (a number) calculated from the
sample which will allow us to reject or not reject
H0, and
• A probability (the p-value), calculated from the test statistic,
that measures whether the test statistic is likely
or unlikely, assuming H0 is true.
4. The rejection region:
– A rule that tells us for which values of the test
statistic, or for which p-values, the null
hypothesis should be rejected.
5. Conclusion:
– Either “Reject H0” or “Do not reject H0”, along
with a statement about the reliability of your
conclusion.
How do you decide when to reject H0?
– It depends on the significance level α, the probability of a
Type I error (rejecting H0 when it is true). This is the maximum
tolerable risk you want to have of making a mistake, if you
decide to reject H0.
– Usually, the significance level is α = .01 or α = .05.
9.3 Large Sample Test of a Population Mean, µ
• Take a random sample of size n ≥ 30 from a
population with mean µ and standard
deviation σ.
• We assume that either
1. σ is known or
2. s ≈ σ since n is large
• The hypothesis to be tested is
– H0: µ = µ0 versus Ha: µ ≠ µ0
Test statistic (or z-score)
• Assume to begin with that H0 is true. The
sample mean x̄ is our best estimate of µ,
and we use it in a standardized form as the
test statistic:
$$z = \frac{\bar{x} - \mu_0}{\sigma/\sqrt{n}} \approx \frac{\bar{x} - \mu_0}{s/\sqrt{n}}$$
since x̄ has an approximate normal distribution
with mean µ0 and standard deviation σ/√n.
• If H0 is true, the value of x̄ should be close
to µ0, and z will be close to 0. If H0 is false, x̄
will be much larger or smaller than µ0, and z
will be much larger or smaller than 0,
indicating that we should reject H0.
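The standardized test statistic can be sketched in a few lines of Python. This is only an illustration: the function name z_statistic and the sample numbers are hypothetical and do not come from the slides.

```python
from math import sqrt

def z_statistic(sample_mean, mu0, sd, n):
    """Distance of the sample mean from mu0, measured in standard errors."""
    standard_error = sd / sqrt(n)  # sd is sigma if known, otherwise the sample s
    return (sample_mean - mu0) / standard_error

# Hypothetical data: n = 36, sample mean 52, s = 6, tested against mu0 = 50.
z = z_statistic(52, 50, 6, 36)
print(round(z, 2))
```

A value of z near 0 is consistent with H0; a value far from 0 in either direction is evidence against it.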
Likely or Unlikely? P-value
• Once you’ve calculated the observed value of the test
statistic, calculate its p-value:
p-value: The probability, assuming H0 is true, of
observing just by chance a test statistic as extreme
as or even more extreme than what we’ve actually
observed. It is the smallest significance level at
which H0 can be rejected; it is not the probability
that H0 is true.
• If this probability is very small, less than some
preassigned significance level α, H0 can be rejected.
Likely or Unlikely? Critical value and rejection
region
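• Reject H0 if the test statistic is more extreme than
the critical value, the number that has a proportion α
of the distribution more extreme than it (α/2 in each
tail for a two-tailed test), for some pre-assigned α.
[Figure: normal curve with a critical region in each tail, bounded by the critical values]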
Example
• The daily yield for a chemical plant
has averaged 880 tons for several years.
The quality control manager wants to know if
this average has changed. She randomly selects
50 days and records an average yield of 871 tons
with a standard deviation of 21 tons.
H0: µ = 880
Ha: µ ≠ 880
Test statistic:
$$z = \frac{\bar{x} - \mu_0}{s/\sqrt{n}} = \frac{871 - 880}{21/\sqrt{50}} = -3.03$$
Example continued
What is the probability that this test
statistic or something even more extreme (far
from what is expected if H0 is true) could have
happened just by chance?
p-value: P(z > 3.03) + P(z < -3.03)
= 2 P(z < -3.03) = 2(.0012) = .0024
This is an unlikely occurrence, which
happens about 2 times in 1000, assuming µ = 880!
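As a rough check of the yield example, the two-tailed p-value can be computed with Python's standard library instead of a z table. The helper name two_tailed_p_value is an assumption made for this sketch.

```python
from math import sqrt
from statistics import NormalDist

def two_tailed_p_value(z):
    """Area in both tails of the standard normal beyond |z|."""
    return 2 * NormalDist().cdf(-abs(z))

# Yield example: n = 50, sample mean 871, s = 21, H0: mu = 880.
z = (871 - 880) / (21 / sqrt(50))
p_value = two_tailed_p_value(z)
print(f"z = {z:.2f}, p-value = {p_value:.4f}")
```

Because the code keeps z unrounded, the p-value may differ from the table-based value in the last decimal place.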
Example continued
• To make our decision clear, we choose
a significance level, say α = .01.
If the p-value is less than α, H0 is rejected as false. You
report that the results are statistically significant at
level α.
If the p-value is greater than α, H0 is not rejected. You
report that the results are not significant at level α.
Since our p-value = .0024 is less than .01, we
reject H0 and conclude that the average yield
has changed.
Example continued (rejection region)
If α = .01, what would be the critical
value that marks the “dividing line” between “not
rejecting” and “rejecting” H0?
If p-value < α, H0 is rejected.
If p-value > α, H0 is not rejected.
The dividing line occurs when p-value = α. The value of the
test statistic at that point is called the critical value.
For a two-tailed test, compare the absolute value of the test statistic:
|Test statistic| > critical value implies p-value < α, H0 is rejected.
|Test statistic| < critical value implies p-value > α, H0 is not
rejected.
Example
What is the critical value of z that
cuts off exactly α/2 = .01/2 = .005 in each tail
of the z distribution?
Rejection Region: Reject H0 if z > 2.58 or z < -2.58. If the
test statistic falls in the rejection region, its p-value will be
less than α = .01.
For our example, z = -3.03 falls in the rejection region
and H0 is rejected at the 1% significance level.
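The critical value for a two-tailed test can also be looked up numerically rather than from a table. The sketch below assumes Python 3.8 or later, where statistics.NormalDist provides inv_cdf; the function name two_tailed_critical_value is hypothetical.

```python
from statistics import NormalDist

def two_tailed_critical_value(alpha):
    """z value that leaves alpha/2 in each tail of the standard normal."""
    return NormalDist().inv_cdf(1 - alpha / 2)

critical = two_tailed_critical_value(0.01)
z = -3.03  # test statistic from the yield example
reject = abs(z) > critical
print(f"critical value = {critical:.2f}, reject H0: {reject}")
```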
One-tailed tests
• Sometimes we are interested in detecting a
specific directional difference in the value of µ.
• The alternative hypothesis to be tested is one
tailed:
– Ha: µ > µ0 or Ha: µ < µ0
• Rejection regions and p-values are calculated
using only one tail of the sampling distribution.
Example
• A homeowner randomly samples 64 homes
similar to her own and finds that the average
selling price is $252,000 with a standard
deviation of $15,000. Is this sufficient evidence
to conclude that the average selling price is
greater than $250,000? Use α = .01.
H0: µ = 250,000
Ha: µ > 250,000
Test statistic:
$$z = \frac{\bar{x} - \mu_0}{s/\sqrt{n}} = \frac{252{,}000 - 250{,}000}{15{,}000/\sqrt{64}} = 1.07$$
Example continued (critical value approach)
What is the critical value of z that
cuts off exactly α = .01 in the right tail of the z
distribution?
Rejection Region: Reject H0 if z > 2.33. If the test statistic falls
in the rejection region, its p-value will be less than α = .01.
For our example, z = 1.07 does not fall in the rejection region
and H0 is not rejected. There is not enough evidence to indicate
that µ is greater than $250,000.
Example continued (p-value approach)
• The p-value is the probability that our sample results
or something even more unlikely would
have occurred just by chance, when µ =
250,000.
p-value: P(z > 1.07) = 1 - .8577 = .1423
Since the p-value is greater than α = .01, H0
is not rejected. There is insufficient evidence to
indicate that µ is greater than $250,000.
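Both approaches to the home-price example can be reproduced in a short sketch using only the standard library. The variable names are illustrative. Because the code keeps z unrounded (about 1.067 rather than 1.07), its p-value differs slightly from the table-based .1423.

```python
from math import sqrt
from statistics import NormalDist

alpha = 0.01
# Home-price example: n = 64, sample mean 252,000, s = 15,000, H0: mu = 250,000.
z = (252_000 - 250_000) / (15_000 / sqrt(64))

# Critical value approach: right-tailed test, so all of alpha sits in the upper tail.
critical = NormalDist().inv_cdf(1 - alpha)
reject_by_critical_value = z > critical

# p-value approach: area to the right of the observed z.
p_value = 1 - NormalDist().cdf(z)
reject_by_p_value = p_value < alpha

print(f"z = {z:.2f}, critical value = {critical:.2f}, p-value = {p_value:.3f}")
print(reject_by_critical_value, reject_by_p_value)
```

The two decisions agree, as they must for the same α.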
How to choose α (the significance level)
• The critical value approach and the p-value
approach produce identical results.
• The p-value approach is often preferred because
– Computer printouts usually calculate p-values
– You can evaluate the test results at any
significance level you choose.
• What should you do if you are the experimenter
and no one gives you a significance level to use?
• If the p-value is less than .01, reject H0. The
results are highly significant.
• If the p-value is between .01 and .05, reject
H0. The results are statistically significant.
• If the p-value is between .05 and .10, do not
reject H0. But, the results are tending
towards significance.
• If the p-value is greater than .10, do not reject
H0. The results are not statistically
significant.
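The rule of thumb above can be written as a small lookup function. This is a sketch only: the name describe_significance is hypothetical, and a p-value falling exactly on a cutoff is assigned to the less significant category, a choice the slides do not specify.

```python
def describe_significance(p_value):
    """Map a p-value to the conventional wording used when no alpha is given."""
    if p_value < 0.01:
        return "reject H0: highly significant"
    if p_value < 0.05:
        return "reject H0: statistically significant"
    if p_value < 0.10:
        return "do not reject H0: tending towards significance"
    return "do not reject H0: not statistically significant"

for p in (0.0024, 0.03, 0.07, 0.1423):
    print(p, "->", describe_significance(p))
```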
General form of test statistics
• The sample mean and the sample binomial
proportion are not the only estimators that can be used for
testing. In general, any estimator that has an
approximately normal distribution
can be used to build a test statistic, z:
$$z = \frac{\text{estimator} - \text{hypothesized value}}{\text{standard error of the estimator}}$$
9.4 Large-sample test of hypothesis for
the difference between two population
means
A random sample of size n1 drawn from
population 1 with mean µ1 and variance σ1².
A random sample of size n2 drawn from
population 2 with mean µ2 and variance σ2².
• The hypothesis of interest involves the
difference, µ1 − µ2, in the form:
• H0: µ1 − µ2 = D0 versus
Ha: one of three alternatives
where D0 is some hypothesized difference,
usually 0.
Sampling distribution of x̄1 − x̄2
• For large sample sizes, the CLT tells us that
x̄1 − x̄2 has an approximately normal
distribution with mean µ1 − µ2 and
standard deviation
$$\sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}$$
which can be estimated by
$$\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}$$
• Hence, to test
H0: µ1 − µ2 = D0 versus
Ha: one of three alternatives
Test statistic:
$$z = \frac{(\bar{x}_1 - \bar{x}_2) - D_0}{\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}}$$
with rejection regions and/or p-values
based on the standard normal z distribution.
Example
Avg Daily Intakes    Men    Women
Sample size          50     50
Sample mean          756    762
Sample Std Dev       35     30
• Is there a difference in the average daily intakes of dairy
products for men versus women? Use α = .05.
H0: µ1 − µ2 = 0 (same)
Ha: µ1 − µ2 ≠ 0 (different)
Test statistic:
$$z = \frac{(\bar{x}_1 - \bar{x}_2) - 0}{\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}} = \frac{756 - 762 - 0}{\sqrt{\frac{35^2}{50} + \frac{30^2}{50}}} = -.92$$
Example continued (p-value approach)
• The p-value is the probability of observing values of z
that are as far away from z = 0 as we have,
just by chance, if indeed µ1 − µ2 = 0.
p-value: P(z > .92) + P(z < -.92)
= 2(.1788) = .3576
Since the p-value is greater than α = .05, H0
is not rejected. There is insufficient evidence to
indicate that men and women have different
average daily intakes.
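The dairy-intake comparison can be checked with a short sketch. The function name two_sample_z and its parameter names are assumptions made for illustration; only the standard library is used.

```python
from math import sqrt
from statistics import NormalDist

def two_sample_z(mean1, sd1, n1, mean2, sd2, n2, d0=0.0):
    """Large-sample z statistic for H0: mu1 - mu2 = d0."""
    standard_error = sqrt(sd1**2 / n1 + sd2**2 / n2)
    return (mean1 - mean2 - d0) / standard_error

# Men: n = 50, mean 756, s = 35. Women: n = 50, mean 762, s = 30.
z = two_sample_z(756, 35, 50, 762, 30, 50)
p_value = 2 * NormalDist().cdf(-abs(z))  # two-tailed alternative
print(f"z = {z:.2f}, p-value = {p_value:.3f}")
```

The p-value is computed from the unrounded z, so it may differ from the table-based .3576 in the last decimal place.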
9.5 Large-sample test of hypothesis for a
binomial proportion
A random sample of size n from a binomial population
is used to test
H0: p = p0 versus
Ha: one of three alternatives
Test statistic:
$$z = \frac{\hat{p} - p_0}{\sqrt{\frac{p_0 q_0}{n}}}$$
where q0 = 1 − p0, with rejection regions and/or p-values based on
the standard normal z distribution.
Example
• Regardless of age, about 20% of American
adults participate in fitness activities at least twice
a week. A random sample of 100 adults over 40
years old found 15 who exercised at least twice a
week. Is this evidence of a decline in participation
after age 40? Use α = .05.
H0: p = .2
Ha: p < .2
Test statistic:
$$z = \frac{\hat{p} - p_0}{\sqrt{\frac{p_0 q_0}{n}}} = \frac{.15 - .2}{\sqrt{\frac{.2(.8)}{100}}} = -1.25$$
Example continued (p-value approach)
• The test statistic is z = -1.25, and the alternative hypothesis is
one-tailed (we reject only for very low values
of z), so the p-value is:
P(z < -1.25) = .1056
Since the p-value = .1056 is greater than α = .05, H0 is not rejected.
Example continued (critical value
approach)
What is the critical value of z that
cuts off exactly α = .05 in the left tail of
the z distribution?
Rejection Region: Reject H0 if z < -1.645. If the test statistic
falls in the rejection region, its p-value will be less than
α = .05.
For our example, z = -1.25 does not fall in the
rejection region and H0 is not rejected. There is not
enough evidence to indicate that p is less than
.2 for people over 40.
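The fitness example can be verified numerically with the standard library. This is a sketch with illustrative variable names. Note that the standard error uses the hypothesized p0, not the sample proportion.

```python
from math import sqrt
from statistics import NormalDist

alpha = 0.05
n, x, p0 = 100, 15, 0.2
p_hat = x / n

# The standard error is computed under H0, so it uses p0 and q0 = 1 - p0.
z = (p_hat - p0) / sqrt(p0 * (1 - p0) / n)

p_value = NormalDist().cdf(z)          # left-tailed alternative, Ha: p < p0
critical = NormalDist().inv_cdf(alpha) # cuts off alpha in the left tail
print(f"z = {z:.2f}, p-value = {p_value:.4f}, critical value = {critical:.3f}")
print("reject H0:", z < critical)
```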
9.6 Testing difference between two
binomial proportions
• To compare two binomial proportions:
A random sample of size n1 drawn from
binomial population 1 with parameter p1.
A random sample of size n2 drawn from
binomial population 2 with parameter p2.
• The hypothesis of interest involves the
difference, p1 − p2, in the form:
H0: p1 − p2 = D0 versus Ha: one of three alternatives
• where D0 is some hypothesized difference,
usually 0.
H0: p1 − p2 = 0 versus
Ha: one of three alternatives
Test statistic:
$$z = \frac{\hat{p}_1 - \hat{p}_2}{\sqrt{\hat{p}\hat{q}\left(\frac{1}{n_1} + \frac{1}{n_2}\right)}}$$
with
$$\hat{p} = \frac{x_1 + x_2}{n_1 + n_2}$$
to estimate the common value of p, q̂ = 1 − p̂,
and rejection regions or p-values
based on the standard normal z distribution.
Example
Youth Soccer     Male    Female
Sample size      80      70
Played soccer    65      39
• Compare the proportion of male and female college
students who said that they had played on a soccer team
during their K-12 years using a test of hypothesis.
H0: p1 − p2 = 0 (same)
Ha: p1 − p2 ≠ 0 (different)
Calculate
p̂1 = 65/80 = .81
p̂2 = 39/70 = .56
$$\hat{p} = \frac{x_1 + x_2}{n_1 + n_2} = \frac{104}{150} = .69$$
Example continued
Test statistic:
$$z = \frac{\hat{p}_1 - \hat{p}_2 - 0}{\sqrt{\hat{p}\hat{q}\left(\frac{1}{n_1} + \frac{1}{n_2}\right)}} = \frac{.81 - .56}{\sqrt{.69(.31)\left(\frac{1}{80} + \frac{1}{70}\right)}} = 3.30$$
p-value: P(z > 3.30) + P(z < -3.30) = 2(.0005) = .001
Since the p-value is less than α = .01, H0 is rejected. The
results are highly significant. There is evidence to indicate
that the rates of participation are different for boys and girls.
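The soccer example can be reproduced with a pooled two-proportion z statistic. The function name two_proportion_z is hypothetical. The code works from the raw counts without rounding the proportions, so it gives z of about 3.38 rather than the 3.30 obtained above from proportions rounded to two decimals. The conclusion is the same.

```python
from math import sqrt
from statistics import NormalDist

def two_proportion_z(x1, n1, x2, n2):
    """Pooled z statistic for H0: p1 - p2 = 0."""
    p1_hat, p2_hat = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)  # estimate of the common p under H0
    standard_error = sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    return (p1_hat - p2_hat) / standard_error

# Males: 65 of 80 played soccer. Females: 39 of 70 played soccer.
z = two_proportion_z(65, 80, 39, 70)
p_value = 2 * NormalDist().cdf(-abs(z))  # two-tailed alternative
print(f"z = {z:.2f}, p-value = {p_value:.4f}")
```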