
Lecture 11 Stat 1100Q

The document discusses the concepts of parameters and statistics in the context of statistical inference, highlighting their definitions and relationships. It provides examples to illustrate the identification of parameters versus statistics, the behavior of sampling distributions, and the application of the Central Limit Theorem. Additionally, it examines various scenarios involving sample proportions and means to assess claims about population parameters.

Uploaded by

dramosj55

Austin Menger 1

STAT 1100Q-001

Lecture 11
Inference

Introduction

Recall the statistics lifecycle:

Some terminology:

Parameter →

Statistic →

Note: Parameters typically have statistic counterparts

We briefly discussed this earlier in the course, but formally there are 3 parameters with
corresponding statistics used in the remainder of this course:

Parameter Statistic

Proportion

Mean

Standard Deviation

Example – Identifying Parameters vs Statistics


Identify the parameter and corresponding statistic in each of the following scenarios:

a) 43% of people in the U.S. say they like Taylor Swift’s music enough to keep her on the radio
for longer than 1 minute. To test this, Austin took a cluster sample of 350 UConn statistics
students. Of the 350 students in the sample, 159 say they like Taylor Swift’s music enough
to keep her on the radio for longer than 1 minute.

b) Among adults aged 26-35, the mean length of time that one owns the same pair of
sneakers for athletic activity is 283 days, give or take 11 days. A simple random sample
of 134 adults aged 26-35 was followed, and it was found that the mean length of time they
owned the same pair was 291 days, with a standard deviation of 18 days.
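
As a rough sketch of how the two scenarios sort out, the values that describe the whole population can be stored as parameters and the values computed from the sample as statistics. The variable names are illustrative only, and reading "give or take 11 days" as the population standard deviation is an assumption based on how the phrase is used elsewhere in these notes.

```python
# Scenario (a): the claimed 43% describes the population, so it is the parameter p.
p_claimed = 0.43
# The sample of 350 students yields the statistic p-hat.
liked, n_a = 159, 350
p_hat = liked / n_a
print(f"(a) parameter p = {p_claimed:.2f}, statistic p-hat = {p_hat:.3f}")

# Scenario (b): population values are parameters, sample values are statistics.
mu, sigma = 283, 11   # population mean and SD in days (SD reading is an assumption)
x_bar, s = 291, 18    # sample mean and SD in days, from n = 134 adults
print(f"(b) parameters: mu = {mu}, sigma = {sigma}; statistics: x-bar = {x_bar}, s = {s}")
```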

A Few Notes on Parameters & Statistics:

i.

ii. Sampling Variability →

iii.
We must understand the sampling distribution of the statistic in order to
infer about the population parameter using the statistic.

Sampling Distribution of the Sample Proportion 𝒑̂

Here, we will examine the first statistic of interest, the sample proportion 𝑝̂ , and look at its
behavior.

Let’s look at the situation visually:

In order to discuss this new dataset that we have constructed, the Sampling Distribution of 𝒑̂,
we analyze it in the same way we analyze any other dataset: shape, center, and spread.

Statistical theory gives us the following results about the sampling distribution of 𝑝̂ :

Shape:

Since the distribution is Bell-shaped, we then discuss the mean and the standard deviation with
regard to measuring central tendency and variability (spread), respectively.

Center:

Spread:
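
The standard textbook results are that 𝑝̂ is approximately Normal for large samples, with mean 𝑝 and standard deviation √(𝑝(1 − 𝑝)/𝑛). The simulation below is a minimal sketch that checks the center and spread empirically. The values 𝑝 = 0.60 and 𝑛 = 100, the number of repetitions, and the seed are illustrative assumptions, not part of the notes at this point.

```python
import random
import statistics
from math import sqrt

random.seed(0)        # fixed seed so the run is repeatable
p, n = 0.60, 100      # assumed population proportion and sample size
num_samples = 5000    # number of repeated samples to draw

# Each p-hat is the fraction of "successes" in one random sample of size n.
p_hats = [
    sum(random.random() < p for _ in range(n)) / n
    for _ in range(num_samples)
]

sim_mean = statistics.mean(p_hats)
sim_sd = statistics.stdev(p_hats)
theory_sd = sqrt(p * (1 - p) / n)

print(f"simulated mean of p-hat: {sim_mean:.3f} (theory: {p:.3f})")
print(f"simulated SD of p-hat:   {sim_sd:.4f} (theory: {theory_sd:.4f})")
```

A histogram of `p_hats` would show the bell shape; it is omitted here to keep the sketch free of plotting libraries.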

Example – College Exercising


Some time ago, an estimated 60% of college students exercised on a regular basis.
A health educator suspects that this proportion has increased since then. To check his claim, the
educator chooses a random sample of 100 college students and finds that 67 of them exercise on
a regular basis.

a) Write down the given information

b) Describe the shape, center, and spread of the sampling distribution of 𝑝̂ . Draw this
distribution.

c) Is 𝑝̂ = .67 a reasonable value to get when 𝑝 = .60 (i.e. could it have occurred just by
chance due to sampling variability), or is 𝑝̂ = .67 unusually high if 𝑝 = .60 (suggesting
that 𝑝 has actually increased)?

So, the data (𝑝̂ = .67) do not provide enough evidence to conclude that 𝑝 has increased.
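
A short sketch of the arithmetic behind that conclusion, assuming the standard formula √(𝑝(1 − 𝑝)/𝑛) for the standard deviation of 𝑝̂: it measures how many standard deviations 𝑝̂ = .67 lies above 𝑝 = .60 when 𝑛 = 100. Variable names are illustrative.

```python
from math import sqrt

p0, n, successes = 0.60, 100, 67
p_hat = successes / n
sd = sqrt(p0 * (1 - p0) / n)   # SD of p-hat if p really is 0.60
z = (p_hat - p0) / sd          # number of SDs that p-hat lies above p0
print(f"SD = {sd:.4f}, z = {z:.2f}")   # prints SD = 0.0490, z = 1.43
```

Since 𝑝̂ is fewer than 2 standard deviations above .60, it is a value that sampling variability alone could reasonably produce.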

Example – College Exercising (revised)


Some time ago, an estimated 60% of college students exercised on a regular basis.
A health educator suspects that this proportion has increased since then. To check his claim, the
educator chooses a random sample of 400 college students and finds that 268 of them exercise
on a regular basis. Do the data provide evidence of a “real” change?

a) Write down the given information

b) Describe the shape, center, and spread of the sampling distribution of 𝑝̂ . Draw this
distribution.

c) With an increased sample size, is 𝑝̂ = .67 a reasonable value to get when 𝑝 = .60 (i.e.
could it have occurred just by chance due to sampling variability), or is 𝑝̂ = .67 unusually
high if 𝑝 = .60 (suggesting that 𝑝 has actually increased)?

So, the data (𝑝̂ = .67) do provide evidence to suggest that 𝑝 has increased and is thus
higher than .60.

d) From here, we can formally calculate the probability of obtaining, just by chance, (i.e.
just due to sampling variability) such an unusual (high) 𝑝̂ or even more unusual (higher):

There is less than a 1% chance of getting 𝑝̂ as high as .67 or higher when 𝑝 = .60 (i.e. it
would be extremely unlikely). This is strong evidence that 𝑝 has probably increased
(since if it stayed .60, we would not have observed such a high 𝑝̂ as we did).
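
The probability can be reproduced with Python's standard library. This is a sketch that assumes the Normal approximation to the sampling distribution of 𝑝̂ with mean .60 and standard deviation √(.60 × .40 / 400); the variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

p0, n, successes = 0.60, 400, 268
p_hat = successes / n
sd = sqrt(p0 * (1 - p0) / n)          # SD of p-hat if p really is 0.60
z = (p_hat - p0) / sd                 # p-hat is nearly 3 SDs above p0

# Upper-tail area: P(p-hat >= 0.67) under the Normal approximation.
tail = 1 - NormalDist(mu=p0, sigma=sd).cdf(p_hat)
print(f"SD = {sd:.4f}, z = {z:.2f}, P(p-hat >= {p_hat:.2f}) = {tail:.4f}")
```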

Note: If the sample statistic falls more than 2 standard deviations ABOVE the mean of its
sampling distribution, then this suggests the true parameter value is higher than originally
claimed. Likewise, if the sample statistic falls more than 2 standard deviations BELOW the
mean, then this suggests the true parameter value is lower than originally claimed.

What do we notice?

How can we quantify this decrease?
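
One way to quantify the decrease in spread, assuming the standard formula √(𝑝(1 − 𝑝)/𝑛), is to tabulate the standard deviation of 𝑝̂ for several sample sizes. The sizes 100 and 400 come from the two examples above; 1600 is an extra illustrative value, and the helper name `sd_p_hat` is hypothetical.

```python
from math import sqrt

def sd_p_hat(p: float, n: int) -> float:
    """Standard deviation of the sample proportion for sample size n."""
    return sqrt(p * (1 - p) / n)

p = 0.60
for n in (100, 400, 1600):
    print(f"n = {n:4d}   SD of p-hat = {sd_p_hat(p, n):.4f}")

# Quadrupling the sample size cuts the spread in half.
ratio = sd_p_hat(p, 100) / sd_p_hat(p, 400)
print(f"SD at n=100 divided by SD at n=400: {ratio:.1f}")
```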

Visually:

Now let’s move on to the distribution of another key statistic, the sample mean 𝑥̅ .

The Sampling Distribution of the Sample Mean 𝒙̅

Visually, what is the big picture?

Again, we aim to describe this distribution using the mean and standard deviation to measure
central tendency and variation, respectively.

Statistical theory tells us the following about the sampling distribution of the sample mean 𝑥̅ :

Center:

Spread:

Note: In words, this means that the variation from sample mean to sample
mean is less than the variation from population individual to population
individual by a factor of 1/√(sample size).

Shape: Central Limit Theorem (CLT) →
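
For reference, the standard results for the center, spread, and shape of the sampling distribution of 𝑥̅ can be written compactly as below. The subscript notation and the N(mean, standard deviation) shorthand are assumptions about notation, since the notes leave these lines blank; 𝜎 denotes the population standard deviation and 𝑛 the sample size.

```latex
\mu_{\bar{x}} = \mu,
\qquad
\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}},
\qquad
\bar{x} \text{ is approximately } N\!\left(\mu,\ \frac{\sigma}{\sqrt{n}}\right) \text{ when } n \text{ is large.}
```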



Example – Light Bulb Duration


The number of hours that a typical incandescent light bulb lasts before burning out varies from
bulb to bulb. One particular type of General Electric Soft White light bulb is advertised to have a
mean lifetime of 750 hours.

Consumer advocacy groups occasionally test such claims for truth-in-advertising. Suppose some
researchers from Consumer Reports wished to test the advertised lifetime of the GE Soft White
light bulbs. So, suppose that they selected a reasonably random sample of 𝑛 = 36 light bulbs
(for instance by choosing a random sample of stores in the region, and then, at each store,
randomly selecting one of the packages from the shelves). They took the light bulbs to the
Consumer Reports laboratory, and the light bulbs were left on until each one burned out. Then,
the researchers calculated the average lifetime in the sample to be 744 hours. Suppose the
standard deviation of lifetimes of all light bulbs is known to be 12 hours.

a) Write down the given information

b) Describe the shape, center, and spread of the sampling distribution of 𝑥̅ . Draw this
distribution. (hint: does the Central Limit Theorem apply here?)

c) Is this sufficient evidence of false advertising? Would Consumer Reports win an
expensive lawsuit against GE, embarrassing the company and forcing them to re-tool the
plant? Or was Consumer Reports’ test result nothing more than natural variation due to
sampling, and not a sign of false advertising overall?

So, the data (𝑥̅ = 744) provide evidence to suggest that 𝜇 has decreased and is thus lower
than the advertised 750 hours.

d) From here, we can formally calculate the probability of obtaining, just by chance, (i.e.
just due to sampling variability) such an unusual (low) 𝑥̅ or even more unusual (lower):

There is less than a 1% chance (.0013) of getting 𝑥̅ as low as 744 or lower when 𝜇 = 750
(i.e. it would be extremely unlikely). This is strong evidence that 𝜇 is probably lower than
the advertised 750 (since if it was 750, we would not have observed such a low 𝑥̅ as we
did).
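
The .0013 figure can be reproduced with the standard library. This sketch assumes the sampling distribution of 𝑥̅ is approximately Normal with mean 750 and standard deviation 12/√36, which the Central Limit Theorem supports for 𝑛 = 36; the variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

mu0, sigma, n, x_bar = 750, 12, 36, 744
sd_x_bar = sigma / sqrt(n)        # 12 / 6 = 2 hours
z = (x_bar - mu0) / sd_x_bar      # 3 SDs below the advertised mean

# Lower-tail area: P(x-bar <= 744) if mu really is 750.
tail = NormalDist(mu=mu0, sigma=sd_x_bar).cdf(x_bar)
print(f"SD of x-bar = {sd_x_bar:.1f}, z = {z:.1f}, P(x-bar <= {x_bar}) = {tail:.4f}")
# prints SD of x-bar = 2.0, z = -3.0, P(x-bar <= 744) = 0.0013
```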

Comments Regarding the Central Limit Theorem

i. The Central Limit Theorem (CLT) applies regardless of the shape of the population
distribution from which we sample.

ii. How large is “large enough”? → The answer depends on the shape of the population
distribution from which we take our sample. (In general, with 𝑛 ≥ 30, we are safe)

If the population is far from Normal →

If the population is close to Normal →

If the population distribution happens to be Normal itself →

iii. Note that averages tend to be ____ spread out, with the spread decreasing by a factor
of ____ as the sample size 𝑛 increases.

Let’s see the Central Limit Theorem in action on 2 specific population distributions:

The Central Limit Theorem in action for the Uniform Distribution



The Central Limit Theorem in action for the Exponential Distribution
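
The original figures for these two demonstrations are not reproduced here. As a stand-in, the sketch below simulates sample means from a Uniform(0, 1) population and from an Exponential population with rate 1, then reports the spread and skewness of the sample means for a few sample sizes. The population parameters, sample sizes, repetition count, seed, and helper names are all illustrative assumptions.

```python
import random
import statistics

random.seed(1)  # fixed seed so the run is repeatable

def skewness(values):
    """Sample skewness: about 0 for a symmetric, bell-shaped dataset."""
    m = statistics.mean(values)
    s = statistics.pstdev(values)
    return sum(((v - m) / s) ** 3 for v in values) / len(values)

def sample_means(draw, n, reps=4000):
    """Draw `reps` samples of size n and return the mean of each."""
    return [sum(draw() for _ in range(n)) / n for _ in range(reps)]

populations = {
    "uniform": random.random,                      # symmetric, not bell-shaped
    "exponential": lambda: random.expovariate(1.0),  # strongly right-skewed
}

results = {}
for name, draw in populations.items():
    for n in (1, 5, 30):
        means = sample_means(draw, n)
        results[(name, n)] = (statistics.stdev(means), skewness(means))
        sd, skew = results[(name, n)]
        print(f"{name:12s} n = {n:2d}   SD = {sd:.3f}   skewness = {skew:+.2f}")
```

For both populations the spread of the sample means shrinks as 𝑛 grows, and for the skewed Exponential population the skewness moves toward 0, which is the Central Limit Theorem at work.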



Example – Height & Popularity


The heights of male seniors in high school follow a normal distribution with a mean of 74 inches,
give or take 2.7 inches. To assess whether there is a connection between height and popularity in
high school, the average height (𝑥̅ ) of the top 5 nominees for prom king at a high school was
calculated.

a) Draw what’s going on in the “big picture”.

b) What is the sampling distribution of 𝑥̅ ? Remember to mention the shape of the
distribution, its mean, and the standard deviation.

c) Draw the sampling distribution of 𝑥̅ for this example.



d) Suppose that, in fact, the average height of the 5 students was found to be 𝑥̅ = 71.8.
While this is some evidence that the more popular students tend to be shorter (71.8 is below
the population mean of 74), would you say that this is enough evidence to draw the conclusion
that there is a connection between height and popularity in high school? Or is 𝑥̅ = 71.8 just
a result that could have happened by chance due to sampling variability?

e) Support your answer to part (d) by calculating the probability of obtaining 𝑥̅ = 71.8 or
lower just by chance (i.e. 𝑃(𝑋̅ ≤ 71.8) ).
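
A sketch of the calculation for part (e): because the population of heights is itself Normal, the sampling distribution of 𝑥̅ is Normal even for 𝑛 = 5, with mean 74 and standard deviation 2.7/√5. Reading "give or take 2.7 inches" as the population standard deviation is an assumption, and the variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

mu, sigma, n, x_bar = 74, 2.7, 5, 71.8
sd_x_bar = sigma / sqrt(n)       # SD of the sample mean for n = 5
z = (x_bar - mu) / sd_x_bar      # number of SDs that x-bar lies below mu

# Lower-tail area: P(X-bar <= 71.8) if popularity is unrelated to height.
prob = NormalDist(mu=mu, sigma=sd_x_bar).cdf(x_bar)
print(f"SD of x-bar = {sd_x_bar:.3f}, z = {z:.2f}, P(X-bar <= {x_bar}) = {prob:.3f}")
```

The resulting 𝑧 can be compared against the 2-standard-deviation guideline from the earlier note to answer part (d).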

Summary of Sampling Distributions


