The theory of statistical inference
consists of those methods by which
one makes inferences or
generalizations about a population.
Statistical inference may be divided
into two major areas:
estimation and tests of hypotheses.
A candidate for public office may wish to
estimate the true proportion of voters favoring him
by obtaining the opinions from a random sample
of 100 eligible voters.
The fraction of voters in the sample favoring
the candidate could be used as an estimate of
the true proportion of the population of voters.
This problem falls in the area of estimation.
Consider the case in which a housewife is
interested in finding out whether brand A floor
wax is more scuff-resistant than brand B floor wax.
She might hypothesize that brand A is better than
brand B and, after proper testing, accept or
reject this hypothesis.
A point estimate of some
population parameter is a single
value of a statistic.
An estimator is not expected to
estimate the population parameter
without error.
We do not expect x̄ to estimate 𝜇
exactly, but we certainly hope that it is
not too far off.
For a particular sample it is possible to
obtain a closer estimate of 𝜇 by using
the sample median x̃ as an estimator.
Not knowing the true value of 𝜇, we
must decide in advance whether to use
x̄ or x̃ as our estimator.
If we consider all possible unbiased
estimators of some parameter θ, the
one with the smallest variance is the
most efficient estimator of θ.
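The idea of efficiency can be illustrated with a small simulation. The sketch below is only an illustration under stated assumptions: it draws repeated samples from a normal population (where both the sample mean and the sample median are unbiased for 𝜇) and compares the variances of the two estimators. The function name `compare_estimators` and the values of `mu`, `sigma`, `n`, `reps`, and `seed` are hypothetical choices, not part of the original material.

```python
import random
import statistics


def compare_estimators(mu=0.0, sigma=1.0, n=25, reps=4000, seed=1):
    """Estimate the sampling variance of the sample mean and sample median.

    Assumes a normal population; all parameter values are illustrative.
    """
    rng = random.Random(seed)
    means, medians = [], []
    for _ in range(reps):
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        means.append(statistics.fmean(sample))
        medians.append(statistics.median(sample))
    # Variance of each estimator across the repeated samples
    return statistics.variance(means), statistics.variance(medians)


var_mean, var_median = compare_estimators()
print(f"variance of sample mean:   {var_mean:.4f}")
print(f"variance of sample median: {var_median:.4f}")
```

For a normal population the sample mean should show the smaller variance, which is what makes it the more efficient of these two estimators in this setting.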
Even the most efficient unbiased
estimator is unlikely to estimate the
population parameter exactly.
It is true that our accuracy increases
with large samples, but there is still no
reason why we should expect a point
estimate from a given sample to be
exactly equal to the population
parameter it is supposed to estimate.
Perhaps it would be more desirable to
determine an interval within which we
would expect to find the value of the
parameter.
Such an interval is called an interval
estimate.
An interval estimate of a population
parameter 𝜃 is an interval of the form
𝜃₁ < 𝜃 < 𝜃₂,
where 𝜃₁ and 𝜃₂ are computed from the selected
sample. Such an interval is then called a
(1 − α)100% confidence interval.
The fraction 1 − α is called the
confidence coefficient or the degree of
confidence, and the endpoints 𝜃₁ and
𝜃₂ are called the lower and upper
confidence limits.
Thus, when α = 0.05, we have a 95%
confidence interval,
and when α = 0.01, we obtain a wider
99% confidence interval.
The wider the confidence interval is, the
more confident we can be that the
given interval contains the unknown
parameter.
If our sample is selected from a normal
population or, if n is sufficiently large,
we can establish a confidence interval
for 𝜇 by considering the sampling
distribution of x̄.
Based on Theorems 8.1 and 8.3, we can
expect the sampling distribution of x̄ to
be approximately normally distributed
with mean $\mu_{\bar{x}} = \mu$ and
standard deviation $\sigma_{\bar{x}} = \sigma/\sqrt{n}$.
If x̄ is the mean of a random sample of size n from
a population with known variance 𝜎², a (1 − α)100%
confidence interval for 𝜇 is given by
$$\bar{x} - z_{\alpha/2}\frac{\sigma}{\sqrt{n}} < \mu < \bar{x} + z_{\alpha/2}\frac{\sigma}{\sqrt{n}}$$
where $z_{\alpha/2}$ is the z value leaving an area of α/2 to
the right.
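What a (1 − α)100% confidence level means in practice can be checked by simulation. The following is a minimal sketch, not a definitive implementation: it repeatedly samples from a normal population with known 𝜎, builds the z interval above each time, and counts how often the interval contains 𝜇. The helper names `z_interval` and `coverage`, and the population values `mu=50`, `sigma=10`, are assumptions introduced only for illustration.

```python
import math
import random
from statistics import NormalDist, fmean


def z_interval(xbar, sigma, n, confidence=0.95):
    """Return (lower, upper) limits of the z interval for the mean, sigma known."""
    alpha = 1.0 - confidence
    z = NormalDist().inv_cdf(1.0 - alpha / 2.0)  # z value with area alpha/2 to the right
    half_width = z * sigma / math.sqrt(n)
    return xbar - half_width, xbar + half_width


def coverage(mu=50.0, sigma=10.0, n=36, confidence=0.95, reps=5000, seed=7):
    """Fraction of simulated intervals that contain the true mean mu."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(reps):
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        lower, upper = z_interval(fmean(sample), sigma, n, confidence)
        if lower < mu < upper:
            hits += 1
    return hits / reps


rate = coverage()
print(f"fraction of intervals containing mu: {rate:.3f}")
```

The printed fraction should be close to the chosen confidence coefficient, here 0.95, up to simulation noise.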
For samples of size n ≥ 30, regardless of the shape
of most populations, sampling theory guarantees
good results.
To compute a (1-α)100% confidence interval for 𝜇,
we have assumed that 𝜎 is known.
Since this is generally not the case, we shall
replace 𝜎 by the sample standard deviation s,
provided that n ≥ 30.
The mean and standard deviation for
the quality grade-point averages of a
random sample of 36 college seniors
are calculated to be 2.6 and 0.3,
respectively. Find the 95% and 99%
confidence intervals for the mean of
the entire senior class.
n = 36 (large)
x̄ = 2.6
𝜎 ≈ s = 0.3
For the 95% confidence interval, $z_{\alpha/2} = z_{0.025}$.
The z value leaving an area of 0.025 to the
right, and therefore an area of 0.975 to the
left, is $z_{0.025} = 1.96$.
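95% confidence interval:
$$\bar{x} - z_{\alpha/2}\frac{\sigma}{\sqrt{n}} < \mu < \bar{x} + z_{\alpha/2}\frac{\sigma}{\sqrt{n}}$$
$$2.6 - 1.96\,\frac{0.3}{\sqrt{36}} < \mu < 2.6 + 1.96\,\frac{0.3}{\sqrt{36}}$$
$$2.502 < \mu < 2.698$$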
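99% confidence interval, with $z_{\alpha/2} = z_{0.005} = 2.575$:
$$\bar{x} - z_{\alpha/2}\frac{\sigma}{\sqrt{n}} < \mu < \bar{x} + z_{\alpha/2}\frac{\sigma}{\sqrt{n}}$$
$$2.6 - 2.575\,\frac{0.3}{\sqrt{36}} < \mu < 2.6 + 2.575\,\frac{0.3}{\sqrt{36}}$$
$$2.471 < \mu < 2.729$$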
95% confidence interval:
2.502 < 𝜇 < 2.698
99% confidence interval:
2.471 < 𝜇 < 2.729
A longer interval is required to estimate 𝜇
with a higher degree of confidence.
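The two intervals for the grade-point example can be reproduced in a few lines. This is a sketch using Python's standard library, with the critical values taken from `statistics.NormalDist` rather than from a printed table; the variable names `ci95` and `ci99` are illustrative.

```python
import math
from statistics import NormalDist

n, xbar, s = 36, 2.6, 0.3  # values from the example; s stands in for sigma since n >= 30


def interval(confidence):
    """z interval for the mean at the given confidence level."""
    z = NormalDist().inv_cdf(1.0 - (1.0 - confidence) / 2.0)
    half_width = z * s / math.sqrt(n)
    return xbar - half_width, xbar + half_width


ci95 = interval(0.95)
ci99 = interval(0.99)
print(f"95%: {ci95[0]:.3f} < mu < {ci95[1]:.3f}")
print(f"99%: {ci99[0]:.3f} < mu < {ci99[1]:.3f}")
```

The computed quantile for 99% is about 2.576 rather than the tabled 2.575, but the limits agree with the hand calculation to three decimal places.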
If x̄ is used as an estimate of 𝜇, we can
be (1 − α)100% confident that the error
will not exceed a specified amount e
when the sample size is
$$n = \left(\frac{z_{\alpha/2}\,\sigma}{e}\right)^{2}$$
When solving for the sample size, n,
all fractional values are rounded up to
the next whole number. By adhering to
this principle, we can be sure that our
degree of confidence never falls below
(1-𝛼)100%.
How large a sample is required in
Example 9.1 if we want to be 95%
confident that our estimate of 𝜇 is not
off by more than 0.05?
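$$n = \left(\frac{z_{\alpha/2}\,\sigma}{e}\right)^{2} = \left(\frac{(1.96)(0.3)}{0.05}\right)^{2} = 138.3$$
Rounding up to the next whole number, n = 139.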
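The sample-size rule, including the round-up step, can be written as a short function. This is a minimal sketch; the name `sample_size` and its parameters are hypothetical, and 𝜎 is taken to be 0.3 as in the example.

```python
import math
from statistics import NormalDist


def sample_size(sigma, e, confidence=0.95):
    """Smallest n for which the error of x-bar stays within e at the given confidence."""
    z = NormalDist().inv_cdf(1.0 - (1.0 - confidence) / 2.0)
    # Always round up so the confidence level does not fall below the target
    return math.ceil((z * sigma / e) ** 2)


n_required = sample_size(sigma=0.3, e=0.05, confidence=0.95)
print(n_required)
```

Using `math.ceil` rather than ordinary rounding mirrors the rule that fractional values are always rounded up.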
Frequently, we are attempting to
estimate the mean of a population
when the variance is unknown and it is
impossible to obtain a sample of size n ≥
30. Cost can often be a factor that
limits our sample size.
As long as the population is
approximately bell-shaped, confidence
intervals can be computed when 𝜎² is
unknown and the sample size is small by
using the sampling distribution of T.
If x̄ and s are the mean and standard deviation
of a random sample of size n < 30 from an
approximately normal population with unknown
variance 𝜎², a (1 − α)100% confidence interval for 𝜇
is given by
$$\bar{x} - t_{\alpha/2}\frac{s}{\sqrt{n}} < \mu < \bar{x} + t_{\alpha/2}\frac{s}{\sqrt{n}}$$
where $t_{\alpha/2}$ is the t value with v = n − 1 degrees of freedom,
leaving an area of α/2 to the right.
The contents of 7 similar containers of
sulfuric acid are 9.8, 10.2, 10.4, 9.8, 10.0,
10.2, and 9.6 liters. Find a 95%
confidence interval for the mean
content of all such containers, assuming
an approximate normal distribution for
container contents.
x̄ = 10.0
s = 0.2828
v = n − 1 = 7 − 1 = 6
Using Table A.5, $t_{\alpha/2} = t_{0.025} = 2.447$
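95% confidence interval:
$$\bar{x} - t_{\alpha/2}\frac{s}{\sqrt{n}} < \mu < \bar{x} + t_{\alpha/2}\frac{s}{\sqrt{n}}$$
$$10.0 - 2.447\,\frac{0.2828}{\sqrt{7}} < \mu < 10.0 + 2.447\,\frac{0.2828}{\sqrt{7}}$$
$$9.738 < \mu < 10.262$$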
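The small-sample calculation can be checked numerically. The sketch below computes x̄ and s from the seven measurements with Python's `statistics` module. Because the standard library has no t-distribution quantile function, the critical value 2.447 is entered directly from Table A.5, as in the worked solution; that hard-coded value is an assumption tied to v = 6 and 95% confidence only.

```python
import math
import statistics

contents = [9.8, 10.2, 10.4, 9.8, 10.0, 10.2, 9.6]  # liters, from the example

n = len(contents)
xbar = statistics.mean(contents)
s = statistics.stdev(contents)  # sample standard deviation (divisor n - 1)

t_crit = 2.447  # t value for v = 6, area 0.025 to the right (Table A.5)
half_width = t_crit * s / math.sqrt(n)
lower, upper = xbar - half_width, xbar + half_width

print(f"x-bar = {xbar:.1f}, s = {s:.4f}")
print(f"95%: {lower:.3f} < mu < {upper:.3f}")
```

For a different sample size or confidence level, the critical value would need to be looked up again or obtained from a statistics package that provides t quantiles.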