Section 9: Confidence Intervals
Introduction to Probability & Statistics
                      Dr. Oliver Russell
                            201 - SN1
               Lectures 1 & 2: Sections 9.1-9.2
201 - SN1             Section 9: Confidence Intervals   Lectures 1 & 2: Sections 9.1-9.2   1 / 32
Target parameter
Definition
The unknown population parameter (e.g., mean or proportion) that we are
interested in estimating is called the target parameter.
Point vs. interval estimator
Definition
A point estimator of a population parameter is a rule or formula that
tells us how to use the sample data to calculate a single number that can
be used as an estimate of the target parameter.
Definition
An interval estimator is a formula that tells us how to use the sample
data to calculate an interval that estimates the target parameter.
Definition
A confidence interval is an interval estimator with a level of reliability (or
certainty) attached to it.
$z_{\alpha/2}$
Definition
The value $z_{\alpha/2}$ is defined as the value of the standard normal random
variable Z such that the area α/2 lies to its right. In other words,
$$P(Z > z_{\alpha/2}) = \alpha/2.$$
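As a minimal sketch of this definition, the critical value can be computed from the inverse CDF of the standard normal distribution. The function name `z_crit` is an illustrative assumption, not notation from the course; `statistics.NormalDist` is part of the Python standard library (3.8+).

```python
from statistics import NormalDist

def z_crit(alpha: float) -> float:
    """Return z_{alpha/2}: the point with area alpha/2 to its right under N(0, 1)."""
    # Area alpha/2 to the right means area 1 - alpha/2 to the left.
    return NormalDist().inv_cdf(1 - alpha / 2)

alpha = 0.05
z = z_crit(alpha)
# Check the defining property P(Z > z) = alpha/2.
right_tail = 1 - NormalDist().cdf(z)
print(round(z, 3), round(right_tail, 4))
```

With α = 0.05 the right-tail area should come back as α/2 = 0.025, which is exactly the property stated in the definition.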
Confidence coefficient and confidence interval
Definition
The confidence coefficient, $(1 - \alpha)$, is the probability that an interval
estimator encloses the population parameter; that is, the relative
frequency with which the interval estimator encloses the population
parameter when the estimator is used repeatedly a very large number of
times. The confidence level, $100(1 - \alpha)\%$, is the confidence coefficient
expressed as a percentage.
For example, a confidence interval for µ with confidence coefficient
$(1 - \alpha)$ could look like
$$\bar{x} \pm z_{\alpha/2}\,\sigma_{\bar{x}}.$$
Example: confidence interval for µ
Example: if our confidence level is 90%, then in the long run, 90% of the
confidence intervals computed from repeated samples will contain µ. In this
case, α = 0.10, so α/2 = 0.05 and $z_{0.05} = 1.645$. Thus,
$\bar{x} \pm 1.645\,\sigma_{\bar{x}}$ is a 90% confidence interval for µ.
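The long-run interpretation can be checked with a small simulation. The sketch below is only an illustration under stated assumptions: a normal population with µ = 50 and σ = 10 (both made up), σ treated as known so that $\sigma_{\bar{x}} = \sigma/\sqrt{n}$, and a hypothetical helper named `coverage`.

```python
import random
from statistics import NormalDist, fmean

def coverage(mu=50.0, sigma=10.0, n=40, level=0.90, reps=2000, seed=1):
    """Estimate how often x_bar +/- z * sigma/sqrt(n) contains mu."""
    rng = random.Random(seed)
    alpha = 1 - level
    z = NormalDist().inv_cdf(1 - alpha / 2)      # about 1.645 for a 90% level
    half_width = z * sigma / n ** 0.5            # z_{alpha/2} * sigma_xbar
    hits = 0
    for _ in range(reps):
        # Draw one sample of size n and compute its mean.
        x_bar = fmean(rng.gauss(mu, sigma) for _ in range(n))
        if x_bar - half_width <= mu <= x_bar + half_width:
            hits += 1
    return hits / reps

print(coverage())
```

The printed fraction should land close to 0.90. Any single interval either contains µ or it does not; the 90% describes the procedure over many repetitions.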
Important values of zα/2
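The table from this slide is not reproduced here. As a hedged stand-in, the sketch below computes $z_{\alpha/2}$ for three commonly used confidence levels; the choice of 90%, 95% and 99% and the helper name `z_table` are assumptions, not taken from the slide.

```python
from statistics import NormalDist

def z_table(levels=(0.90, 0.95, 0.99)):
    """Map each confidence level to (alpha, alpha/2, z_{alpha/2})."""
    rows = {}
    for level in levels:
        alpha = 1 - level
        rows[level] = (alpha, alpha / 2, NormalDist().inv_cdf(1 - alpha / 2))
    return rows

for level, (alpha, half, z) in z_table().items():
    print(f"{level:.0%}  alpha={alpha:.2f}  alpha/2={half:.3f}  z={z:.3f}")
```

The computed critical values are approximately 1.645, 1.960 and 2.576, matching standard normal tables.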
RECAP: Central Limit Theorem (CLT)
Theorem
Consider a random sample of n observations selected from any population
with mean µ and standard deviation σ. Then, when n is sufficiently large,
the sampling distribution of X̄ will be approximately normal with mean
$$\mu_{\bar{X}} = \mu$$
and standard deviation
$$\sigma_{\bar{X}} = \frac{\sigma}{\sqrt{n}}.$$
RECAP: Sampling distribution of X̄
Thus, the CLT says that for large enough n, approximately,
$$\bar{X} \sim N\!\left(\mu, \frac{\sigma^2}{n}\right),$$
or, equivalently,
$$\frac{\bar{X} - \mu}{\sigma/\sqrt{n}} = Z \sim N(0, 1).$$
The larger n is, the closer the sampling distribution of X̄ comes to a normal
distribution. For most sampled populations, a sample size of n ≥ 30 will
suffice for the normal approximation to be reasonable.
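A quick simulation can make the recap concrete. The sketch below assumes a clearly non-normal population, an exponential distribution with mean 1 and standard deviation 1 (chosen only for illustration), and compares the mean and standard deviation of many sample means with the CLT predictions µ and σ/√n. The helper name `simulate_sample_means` is hypothetical.

```python
import random
from statistics import fmean, stdev

def simulate_sample_means(n=30, reps=5000, seed=0):
    """Draw `reps` samples of size n from an exponential population
    (mean 1, standard deviation 1) and return the sample means."""
    rng = random.Random(seed)
    return [fmean(rng.expovariate(1.0) for _ in range(n)) for _ in range(reps)]

means = simulate_sample_means()
print("mean of sample means:", round(fmean(means), 3))  # CLT predicts about mu = 1
print("sd of sample means:  ", round(stdev(means), 3))  # CLT predicts about 1/sqrt(30)
```

Here 1/√30 ≈ 0.183, so the second printed value should be close to that.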
RECAP: Small samples: problems and solutions
We face 2 problems when dealing with small sample sizes:
 1. If the sample size, n, is not large enough, then the normal
    approximation of X̄ may not hold.
    Solution: if we assume the population is approximately normal, then
    X̄ will also be approximately normal.
 2. When estimating µ, we must often approximate σ using s, but for
    small samples s may not be a good approximation for σ.
    Solution: instead of using the standard normal statistic
    $$Z = \frac{\bar{X} - \mu}{\sigma/\sqrt{n}},$$
    which relies on either knowing σ or having a close approximation to
    σ, we will define and use a new statistic,
    $$T_{n-1} = \frac{\bar{X} - \mu}{S/\sqrt{n}},$$
    where we choose to use the sample standard deviation, S.
RECAP: Student’s T-statistic
Definition
For any sample of size n randomly drawn from an approximately normal
population with mean µ, the Student’s T-statistic with (n − 1)
degrees of freedom (df) is defined as
$$T_{n-1} = \frac{\bar{X} - \mu}{S/\sqrt{n}},$$
where we recall that
$$S = \sqrt{\frac{\sum_{i=1}^{n} (X_i - \bar{X})^2}{n-1}}$$
is the sample standard deviation.
Note: the same statistic can also be used, approximately, if the population
is not normally distributed, provided the sample size n is large enough.
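To make the definition concrete, here is a minimal sketch that computes the T-statistic and its degrees of freedom from raw data. The function name `t_statistic` and the sample values are made up for illustration; `statistics.stdev` divides by n − 1, which matches the definition of S above.

```python
from math import sqrt
from statistics import fmean, stdev

def t_statistic(sample, mu):
    """Return (t, df) where t = (x_bar - mu) / (s / sqrt(n)) and df = n - 1."""
    n = len(sample)
    if n < 2:
        raise ValueError("need at least two observations to compute s")
    x_bar = fmean(sample)
    s = stdev(sample)          # divides by n - 1, matching the definition of S
    return (x_bar - mu) / (s / sqrt(n)), n - 1

# Made-up data, for illustration only.
t, df = t_statistic([2, 4, 4, 4, 5, 5, 7, 9], mu=4)
print(round(t, 3), df)
```

For this sample, x̄ = 5 and s² = 32/7, so the statistic works out to √(7/4) ≈ 1.323 with 7 degrees of freedom.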
RECAP: T probabilities: table (See Appendix B)
Quick quiz
Suppose you wish to estimate the mean µ of a population of interest by
taking a random sample of size n. Using the Student’s T-statistic, derive
a 100(1 − α)% confidence interval for µ.
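One way to carry out this derivation is sketched below. It assumes that $T_{n-1}$ follows a T-distribution with n − 1 df, that this distribution is symmetric about 0, and that $t_{\alpha/2}$ denotes the value with upper-tail area α/2 (by analogy with $z_{\alpha/2}$).

```latex
\begin{align*}
1 - \alpha
  &= P\!\left(-t_{\alpha/2} \le T_{n-1} \le t_{\alpha/2}\right) \\
  &= P\!\left(-t_{\alpha/2} \le \frac{\bar{X} - \mu}{S/\sqrt{n}} \le t_{\alpha/2}\right) \\
  &= P\!\left(-t_{\alpha/2}\,\frac{S}{\sqrt{n}} \le \bar{X} - \mu \le t_{\alpha/2}\,\frac{S}{\sqrt{n}}\right) \\
  &= P\!\left(\bar{X} - t_{\alpha/2}\,\frac{S}{\sqrt{n}} \le \mu \le \bar{X} + t_{\alpha/2}\,\frac{S}{\sqrt{n}}\right).
\end{align*}
```

Replacing X̄ and S with the observed values x̄ and s gives the interval $\bar{x} \pm t_{\alpha/2}\, s/\sqrt{n}$.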
Summary: confidence interval for µ
Assuming that either the population is approximately normal or the
sample size n is large enough, we can compute a
100(1 − α)% confidence interval for µ based on a T-statistic:
$$\bar{x} \pm t_{\alpha/2}\,\frac{s}{\sqrt{n}},$$
where $t_{\alpha/2}$ is the T-value corresponding to an α/2 area in the upper tail
of the T-distribution with (n − 1) df, and s is the standard deviation of
the sample.
Quick quiz
A manufacturer of printers wishes to estimate the mean number of
characters printed before the printhead fails. The printer manufacturer
tests n = 15 printheads and finds x̄ = 1.24 million characters printed until
failure and s = 0.19 million characters printed until failure. Form a 99%
confidence interval for the mean number of characters printed before the
printhead fails. Interpret the result.
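A possible way to work through the arithmetic is sketched below. The critical value $t_{0.005}$ with 14 df is taken as 2.977 from a standard t-table, which is an assumption here since the Python standard library has no T-distribution. The helper name `t_interval` is also hypothetical.

```python
from math import sqrt

def t_interval(x_bar, s, n, t_crit):
    """Return (lower, upper, margin) for x_bar +/- t_crit * s / sqrt(n)."""
    margin = t_crit * s / sqrt(n)
    return x_bar - margin, x_bar + margin, margin

# Values from the problem statement (in millions of characters).
n, x_bar, s = 15, 1.24, 0.19
# Assumption: t_{0.005} with n - 1 = 14 df, read from a standard t-table.
t_crit = 2.977

lower, upper, margin = t_interval(x_bar, s, n, t_crit)
print(f"99% CI: ({lower:.3f}, {upper:.3f}) million characters")
```

Since n = 15 is small, this interval relies on the population of printhead lifetimes being approximately normal. Under that assumption, we can be 99% confident that the mean number of characters printed before failure lies between roughly 1.09 and 1.39 million.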
RECAP: population vs. sample proportions
Definition
When discussing data which only have 2 potential outcomes (say, success
or failure), the binomial proportion of a population, p, is the
population’s proportion of successes.
Definition
The sample proportion, P̂, is a random variable representing the
proportion of successes in a randomly drawn sample.
RECAP: corollary of CLT for proportions
Corollary
By the CLT, if the sample size is large enough, then it turns out that the
random variable P̂ is also approximately normally distributed with mean
$$\mu_{\hat{P}} = p$$
and standard deviation
$$\sigma_{\hat{P}} = \sqrt{\frac{p(1 - p)}{n}}.$$
RECAP: sampling distribution of P̂
In other words, if n is large enough, then approximately,
$$\hat{P} \sim N\!\left(p, \frac{p(1 - p)}{n}\right),$$
or, equivalently,
$$\frac{\hat{P} - p}{\sqrt{p(1 - p)/n}} \sim N(0, 1).$$
Here, large enough means n ≥ 30, np ≥ 10 and n(1 − p) ≥ 10.
Quick quiz
Suppose you wish to estimate the binomial proportion p of a population of
interest by taking a random sample of size n. Using the standard normal
Z-statistic, derive a 100(1 − α)% confidence interval for p.
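A sketch of one possible derivation follows, assuming n is large enough for the normal approximation of P̂ to hold and using the symmetry of the standard normal distribution.

```latex
\begin{align*}
1 - \alpha
  &\approx P\!\left(-z_{\alpha/2} \le \frac{\hat{P} - p}{\sqrt{p(1 - p)/n}} \le z_{\alpha/2}\right) \\
  &= P\!\left(\hat{P} - z_{\alpha/2}\sqrt{\frac{p(1 - p)}{n}} \le p \le \hat{P} + z_{\alpha/2}\sqrt{\frac{p(1 - p)}{n}}\right).
\end{align*}
```

Because p is unknown, the standard deviation is estimated by substituting the observed sample proportion p̂ for p, which gives the interval $\hat{p} \pm z_{\alpha/2}\sqrt{\hat{p}(1 - \hat{p})/n}$.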
Summary: large-sample (n ≥ 30, np ≥ 10 and n(1 − p) ≥ 10) confidence interval for p
Large-sample 100(1 − α)% confidence interval for p, based on a
standard normal (Z) statistic:
$$\hat{p} \pm z_{\alpha/2} \sqrt{\frac{\hat{p}(1 - \hat{p})}{n}},$$
where p is the binomial proportion of the population and p̂ is the binomial
proportion of the sample.
Quick quiz
The Bureau of Economic and Business Research (BEBR) conducts
quarterly surveys to gauge consumer sentiment in Florida. Suppose that
BEBR randomly samples 484 consumers and finds that only 157 are
optimistic about the state of the economy. Use a 90% confidence interval
to estimate the proportion of all consumers in Florida who are optimistic
about the state of the economy. What is the 90% margin of error?
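The calculation can be sketched as follows. The helper name `proportion_interval` is hypothetical, and checking the large-sample conditions with p̂ in place of the unknown p is an assumption made for this illustration.

```python
from math import sqrt
from statistics import NormalDist

def proportion_interval(successes, n, level):
    """Large-sample Z interval for p; returns (p_hat, lower, upper, margin)."""
    p_hat = successes / n
    # Assumption: the large-sample conditions are checked with p_hat in place of p.
    if n < 30 or n * p_hat < 10 or n * (1 - p_hat) < 10:
        raise ValueError("sample too small for the normal approximation")
    z = NormalDist().inv_cdf(1 - (1 - level) / 2)   # z_{alpha/2}
    margin = z * sqrt(p_hat * (1 - p_hat) / n)
    return p_hat, p_hat - margin, p_hat + margin, margin

# Values from the problem statement: 157 optimistic consumers out of 484.
p_hat, lower, upper, margin = proportion_interval(157, 484, 0.90)
print(f"p_hat = {p_hat:.3f}, 90% CI = ({lower:.3f}, {upper:.3f}), margin = {margin:.3f}")
```

With these numbers, p̂ ≈ 0.324 and the 90% margin of error is about 0.035, so the interval runs from roughly 0.289 to 0.359.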
Margin of error
Definition
The 100(1 − α)% margin of error (or sampling error), ∆, is half of
the width of the confidence interval.
Examples:
    For a confidence interval for µ, the 95% margin of error is
    $$\Delta = t_{0.025}\,\frac{s}{\sqrt{n}}.$$
    For a confidence interval for p, the 90% margin of error is
    $$\Delta = z_{0.05} \sqrt{\frac{\hat{p}(1 - \hat{p})}{n}}.$$
Quick quiz (Example 9.1.1)
Quick quiz (Example 9.2.1)
Quick quiz (Example 9.1.2)
Quick quiz