
4 Sampling-Distributions

This document discusses sampling and sampling distributions. It defines key terms like population, sample, and sample statistics. It explains that sampling is done to save time and money compared to examining an entire population. A sampling distribution is the distribution of all possible values of a sample statistic from samples of the same size selected from a population. The document uses an example population to demonstrate how to develop a sampling distribution by considering all possible samples of a given size. It compares the characteristics of the population distribution to the sampling distribution.

Uploaded by

tunetonevidevide
Copyright
© © All Rights Reserved

11/16/20

END 311E
STATISTICS
Sampling and Sampling Distributions

16.11.2020 1

Population
Are we able to observe all members of the population?
[Figure labels: census; motor vehicle registration]


Sampling
• Sampling: the process of taking samples that represent the population
• Sample: a finite part of a population whose properties are studied to gain information about the whole
• Sample statistic: any value computed from observed data, especially one used to estimate the corresponding parameter of the population


Why Sample?
• Selecting a sample is less time-consuming than selecting every item in the population.

• Selecting a sample is less costly than selecting every item in the population.

• An analysis of a sample is less cumbersome and more practical than an analysis of the entire population.


Sampling Distributions
• A sampling distribution is a distribution of all of the possible values of a sample
statistic for a given size sample selected from a population.

• For example, suppose you sample 50 students from your college and compute their mean GPA. If you drew many different samples of 50, you would generally compute a different mean for each sample. We are interested in the distribution of all the mean GPAs we might calculate from samples of 50 students.


Developing a Sampling Distribution

• Assume there is a population of four individuals: A, B, C, D
• Population size N = 4
• Random variable, X, is the age of individuals
• Values of X: 18, 20, 22, 24 (years)


Developing a Sampling Distribution

Summary Measures for the Population Distribution:

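$$\mu = \frac{\sum X_i}{N} = \frac{18 + 20 + 22 + 24}{4} = 21$$

$$\sigma = \sqrt{\frac{\sum (X_i - \mu)^2}{N}} = 2.236$$

[Figure: population distribution P(x) over x = 18 (A), 20 (B), 22 (C), 24 (D): Uniform Distribution]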


Developing a Sampling Distribution


Now consider all possible samples of size n = 2 (sampling with replacement)

16 possible samples:

1st Obs    2nd Observation
           18      20      22      24
18         18,18   18,20   18,22   18,24
20         20,18   20,20   20,22   20,24
22         22,18   22,20   22,22   22,24
24         24,18   24,20   24,22   24,24

16 Sample Means:

1st Obs    2nd Observation
           18   20   22   24
18         18   19   20   21
20         19   20   21   22
22         20   21   22   23
24         21   22   23   24

Developing a Sampling Distribution


Sampling Distribution of All Sample Means

16 Sample Means:

1st Obs    2nd Observation
           18   20   22   24
18         18   19   20   21
20         19   20   21   22
22         20   21   22   23
24         21   22   23   24

Sample Means Distribution (no longer uniform):

$\bar{X}$       18     19     20     21     22     23     24
$P(\bar{X})$    1/16   2/16   3/16   4/16   3/16   2/16   1/16


Developing a Sampling Distribution


Summary Measures of this Sampling Distribution:

$$\mu_{\bar{X}} = \frac{18 + 19 + 19 + \cdots + 24}{16} = 21$$

$$\sigma_{\bar{X}} = \sqrt{\frac{(18 - 21)^2 + (19 - 21)^2 + \cdots + (24 - 21)^2}{16}} = 1.58$$

Note: Here we divide by 16 because there are 16 different samples of size 2.
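
The enumeration above can be checked with a short script. This is a minimal sketch, assuming the four ages from the example and ordered samples drawn with replacement; the variable names are illustrative only.

```python
from itertools import product
from math import sqrt

population = [18, 20, 22, 24]  # ages of A, B, C, D
n = 2                          # sample size

# All ordered samples of size n drawn with replacement (4**2 = 16 of them).
samples = list(product(population, repeat=n))
sample_means = [sum(s) / n for s in samples]

# Mean and standard deviation of the sampling distribution.
# The divisor is the number of samples (16), as in the note above.
mu_xbar = sum(sample_means) / len(sample_means)
sigma_xbar = sqrt(sum((m - mu_xbar) ** 2 for m in sample_means) / len(sample_means))

print(len(samples), mu_xbar, round(sigma_xbar, 2))  # 16 21.0 1.58
```

The mean of the sample means equals the population mean (21), while their standard deviation (1.58) is smaller than the population standard deviation (2.236).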


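Comparing the Population Distribution to the Sample Means Distribution

Population (N = 4): μ = 21, σ = 2.236
Sample Means Distribution (n = 2): $\mu_{\bar{X}} = 21$, $\sigma_{\bar{X}} = 1.58$

[Figure: P(X) for the population over X = 18 (A), 20 (B), 22 (C), 24 (D), next to $P(\bar{X})$ for the sample means over $\bar{X}$ = 18, 19, 20, 21, 22, 23, 24]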

Population mean = the mean of all possible sample means

Sample mean is an UNBIASED ESTIMATOR for the population mean!!

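Suppose we have a random sample $X_1, X_2, \dots, X_n$, where each $E[X_i] = \mu$:

$$E\left[\sum_{i=1}^{n} X_i\right] = E[X_1] + E[X_2] + \cdots + E[X_n] = n\mu$$

$$E[\bar{X}] = E\left[\frac{1}{n}\sum_{i=1}^{n} X_i\right] = \frac{1}{n}\, n\mu = \mu$$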


Sample Mean Sampling Distribution:


Standard Error of the Mean
• Different samples of the same size from the same
population will yield different sample means
• A measure of the variability in the mean from sample to sample is given by the Standard Error of the Mean:

$$\sigma_{\bar{X}} = \frac{\sigma}{\sqrt{n}}$$

(This assumes that sampling is with replacement, or without replacement from an infinite population)
• Note that the standard error of the mean decreases as the
sample size increases
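
As a rough illustration of how the standard error shrinks with n, the sketch below evaluates σ/√n for the age population used earlier (σ = 2.236). The function name `standard_error` and the list of sample sizes are assumptions made for this example.

```python
from math import sqrt

def standard_error(sigma: float, n: int) -> float:
    """Standard error of the mean, sigma / sqrt(n)."""
    if n <= 0:
        raise ValueError("n must be a positive sample size")
    return sigma / sqrt(n)

sigma = 2.236  # population standard deviation from the age example

# n = 2 reproduces the value found by enumerating all 16 samples.
print(round(standard_error(sigma, 2), 2))  # 1.58

# Larger samples give a smaller standard error.
for n in (2, 8, 32):
    print(n, round(standard_error(sigma, n), 3))
```

Quadrupling the sample size halves the standard error, because n enters through its square root.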

Sample variance (with divisor n) is a BIASED ESTIMATOR for the population variance!!

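Suppose we have a random sample $X_1, X_2, \dots, X_n$, where each $\mathrm{Var}(X_i) = \sigma^2$:

$$\mathrm{Var}\left(\sum_{i=1}^{n} X_i\right) = \mathrm{Var}(X_1) + \mathrm{Var}(X_2) + \cdots + \mathrm{Var}(X_n) = n\sigma^2$$

$$\mathrm{Var}(\bar{X}) = \mathrm{Var}\left(\frac{1}{n}\sum_{i=1}^{n} X_i\right) = \frac{1}{n^2}\, n\sigma^2 = \frac{\sigma^2}{n}$$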


Sample Mean Sampling Distribution:


If the Population is Normal
• If a population is normal with mean μ and standard deviation σ, the sampling distribution of $\bar{X}$ is also normally distributed with

$$\mu_{\bar{X}} = \mu \quad \text{and} \quad \sigma_{\bar{X}} = \frac{\sigma}{\sqrt{n}}$$


Z-value for Sampling Distribution of the Mean

• Z-value for the sampling distribution of $\bar{X}$:

$$Z = \frac{\bar{X} - \mu_{\bar{X}}}{\sigma_{\bar{X}}} = \frac{\bar{X} - \mu}{\sigma / \sqrt{n}}$$

where: $\bar{X}$ = sample mean
μ = population mean
σ = population standard deviation
n = sample size


Sampling Distribution Properties

$\mu_{\bar{x}} = \mu$ (i.e. $\bar{x}$ is unbiased)

[Figure: Normal Population Distribution of x centered at μ, above the Normal Sampling Distribution of $\bar{x}$, which has the same mean $\mu_{\bar{x}}$]

Example
• A cereal firm fills thousands of boxes of cereal each day.
• To be consistent with the package labeling, boxes should contain 368 grams of cereal.
• Cereal weight varies from box to box.
• The standard deviation of the cereal-filling process is 15 grams.
• What will the standard error be for a sample that contains 25 boxes?


Example

The standard error is 3:

$$\sigma_{\bar{X}} = \frac{\sigma}{\sqrt{n}} = \frac{15}{\sqrt{25}} = 3$$

The variation in the sample means for samples of n = 25 is much less than the variation in the individual boxes of cereal.


Example
• What is the probability that the mean of a sample (n = 25) is less than 365 g?

$$Z = \frac{\bar{X} - \mu_{\bar{X}}}{\sigma_{\bar{X}}} = \frac{\bar{X} - \mu}{\sigma / \sqrt{n}} = \frac{365 - 368}{3} = -1$$

$$P\{Z < -1\} = 0.1587$$
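
The table lookup can be reproduced numerically. The sketch below assumes the cereal example's values (μ = 368, σ = 15, n = 25) and uses `statistics.NormalDist` from the Python standard library in place of a printed normal table.

```python
from math import sqrt
from statistics import NormalDist

mu, sigma, n = 368, 15, 25   # cereal example: target weight, process sd, sample size

se = sigma / sqrt(n)         # standard error of the mean
z = (365 - mu) / se          # standardized value of a sample mean of 365 g
p = NormalDist().cdf(z)      # P(Z < z) under the standard normal

print(se, z, round(p, 4))    # 3.0 -1.0 0.1587
```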


Example

[Figure: standard normal curves showing the left tail $P\{Z < -1\}$, the right tail $P\{Z > 1\}$, the area $P\{Z < 1\}$, and its complement $1 - P\{Z < 1\}$]

$$P\{Z < -1\} = P\{Z > 1\} = 1 - P\{Z < 1\}$$


Example
• Find the probability that a single box weighs less than 365 g, using the population parameters.

$$Z = \frac{365 - 368}{15} = -0.2$$

$$P\{Z < -0.2\} = 0.4207$$

• Many more individual boxes than sample means are below 365 grams.
• The chance that the sample mean of 25 boxes is far away from the population mean is less than the chance that a single box is far away!

Example
Taking a larger sample results in less variability in the sample means from sample to sample!

• What is the probability that the mean of a sample (n = 100) is less than 365 g?

$$\sigma_{\bar{X}} = \frac{15}{\sqrt{100}} = 1.5$$

$$Z = \frac{365 - 368}{1.5} = -2$$

$$P\{Z < -2\} = 0.0228$$
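
To see the effect of sample size side by side, the sketch below recomputes the three probabilities from the cereal example (a single box, n = 25, and n = 100). It assumes box weights are normally distributed; `prob_mean_below` is a hypothetical helper name.

```python
from math import sqrt
from statistics import NormalDist

def prob_mean_below(x: float, mu: float, sigma: float, n: int) -> float:
    """P(sample mean < x) when the sample mean is Normal(mu, sigma / sqrt(n))."""
    return NormalDist(mu, sigma / sqrt(n)).cdf(x)

# n = 1 is a single box; n = 25 and n = 100 are sample means.
results = {n: round(prob_mean_below(365, 368, 15, n), 4) for n in (1, 25, 100)}
print(results)  # {1: 0.4207, 25: 0.1587, 100: 0.0228}
```

The probability of falling below 365 g drops quickly as n grows, because the standard error shrinks from 15 to 3 to 1.5.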


Sampling Distribution Properties

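As n increases, $\sigma_{\bar{x}}$ decreases

[Figure: sampling distributions of $\bar{x}$ centered at μ; the curve for the larger sample size is narrower than the curve for the smaller sample size]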

Determining An Interval Including A Fixed Proportion of the Sample Means

Find a symmetrically distributed interval around μ that will include 95% of the sample means when μ = 368, σ = 15, and n = 25.

• Since the interval contains 95% of the sample means, 5% of the sample means will be outside the interval.
• Since the interval is symmetric, 2.5% will be above the upper limit and 2.5% will be below the lower limit.
• From the standardized normal table, the Z score with 2.5% (0.0250) below it is -1.96 and the Z score with 2.5% (0.0250) above it is 1.96.


Determining An Interval Including A Fixed Proportion of the Sample Means

• Calculating the lower limit of the interval

$$\bar{X}_L = \mu + Z\frac{\sigma}{\sqrt{n}} = 368 + (-1.96)\frac{15}{\sqrt{25}} = 362.12$$

• Calculating the upper limit of the interval

$$\bar{X}_U = \mu + Z\frac{\sigma}{\sqrt{n}} = 368 + (1.96)\frac{15}{\sqrt{25}} = 373.88$$

• 95% of all sample means of sample size 25 are between 362.12 and 373.88
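
The same limits can be obtained without a table. This is a sketch under the example's assumptions (μ = 368, σ = 15, n = 25); `central_interval` is a hypothetical helper, and the Z score comes from `NormalDist.inv_cdf` rather than the rounded value 1.96.

```python
from math import sqrt
from statistics import NormalDist

def central_interval(mu: float, sigma: float, n: int, coverage: float):
    """Symmetric interval around mu containing `coverage` of the sample means."""
    se = sigma / sqrt(n)
    # Z score with (1 - coverage) / 2 of the area above it, about 1.96 for 95%.
    z = NormalDist().inv_cdf(0.5 + coverage / 2)
    return mu - z * se, mu + z * se

lower, upper = central_interval(368, 15, 25, 0.95)
print(round(lower, 2), round(upper, 2))  # 362.12 373.88
```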


Sample Mean Sampling Distribution:


If the Population is not Normal
• We can apply the Central Limit Theorem:
• Even if the population is not normal,
• …sample means from the population will be
approximately normal as long as the sample size is large
enough.

Properties of the sampling distribution:

$$\mu_{\bar{x}} = \mu \quad \text{and} \quad \sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}}$$
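
A small simulation can make the Central Limit Theorem concrete. The sketch below is illustrative only: the exponential population (mean 1, standard deviation 1), the sample size of 30, the 5000 replications, and the seed are all assumptions chosen for the example, not values from the slides.

```python
import random
from math import sqrt
from statistics import mean, pstdev

rng = random.Random(0)   # fixed seed so the run is repeatable
mu, sigma = 1.0, 1.0     # mean and sd of an Exponential(rate=1) population (skewed, not normal)
n, reps = 30, 5000       # sample size and number of simulated samples

# Draw many samples and record each sample mean.
sample_means = [mean([rng.expovariate(1.0) for _ in range(n)]) for _ in range(reps)]

se = sigma / sqrt(n)     # theoretical standard error
inside = sum(abs(m - mu) <= 1.96 * se for m in sample_means) / reps

print("mean of sample means:", round(mean(sample_means), 3))
print("sd of sample means:  ", round(pstdev(sample_means), 3), "vs sigma/sqrt(n) =", round(se, 3))
print("share within 1.96 standard errors:", round(inside, 3))
```

If the sample means are approximately normal, their average should be close to μ, their spread close to σ/√n, and roughly 95% of them should lie within 1.96 standard errors of μ.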

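$$E[S^2] = \sigma^2$$

$$E\left[\frac{\sum_{i=1}^{n}(X_i - \bar{X})^2}{k}\right] = \sigma^2 \implies k = ?$$

$$\frac{1}{k}E\left[\sum_{i=1}^{n}(X_i - \bar{X})^2\right] = \sigma^2$$

Write $X_i - \bar{X} = (X_i - \mu) - (\bar{X} - \mu)$:

$$\frac{1}{k}E\left[\sum_{i=1}^{n}\big((X_i - \mu) - (\bar{X} - \mu)\big)^2\right] = \sigma^2$$

$$\frac{1}{k}E\left[\sum_{i=1}^{n}\big((X_i - \mu)^2 - 2(X_i - \mu)(\bar{X} - \mu) + (\bar{X} - \mu)^2\big)\right] = \sigma^2$$

$$\frac{1}{k}\sum_{i=1}^{n}\big(E[(X_i - \mu)^2] - 2E[(X_i - \mu)(\bar{X} - \mu)] + E[(\bar{X} - \mu)^2]\big) = \sigma^2$$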


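$$\frac{1}{k}\sum_{i=1}^{n}\big(E[(X_i - \mu)^2] - 2E[(X_i - \mu)(\bar{X} - \mu)] + E[(\bar{X} - \mu)^2]\big) = \sigma^2$$

Here $E[(X_i - \mu)^2] = \sigma^2$ and $E[(\bar{X} - \mu)^2] = \frac{\sigma^2}{n}$ (the square of the standard error):

$$\frac{1}{k}\left(n\sigma^2 + n\frac{\sigma^2}{n} - 2\,\frac{n}{n}\,E\left[\sum_{i=1}^{n}(X_i - \mu)(\bar{X} - \mu)\right]\right) = \sigma^2$$

$(\bar{X} - \mu)$ is constant with respect to the sum over i, so it can be moved outside the sum:

$$\frac{1}{k}\left(n\sigma^2 + \sigma^2 - 2\,\frac{n}{n}\,E\left[(\bar{X} - \mu)\sum_{i=1}^{n}(X_i - \mu)\right]\right) = \sigma^2$$

$$\frac{1}{k}\left(n\sigma^2 + \sigma^2 - 2n\,E\left[(\bar{X} - \mu)\sum_{i=1}^{n}\frac{X_i - \mu}{n}\right]\right) = \sigma^2$$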


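$$\frac{1}{k}\left(n\sigma^2 + \sigma^2 - 2n\,E\left[(\bar{X} - \mu)\sum_{i=1}^{n}\frac{X_i - \mu}{n}\right]\right) = \sigma^2$$

Since $\sum_{i=1}^{n}\frac{X_i - \mu}{n} = \bar{X} - \mu$:

$$\frac{1}{k}\left(n\sigma^2 + \sigma^2 - 2n\,E[(\bar{X} - \mu)^2]\right) = \sigma^2$$

With $E[(\bar{X} - \mu)^2] = \frac{\sigma^2}{n}$:

$$\frac{1}{k}\left(n\sigma^2 + \sigma^2 - 2n\frac{\sigma^2}{n}\right) = \sigma^2 \implies \frac{1}{k}\left(n\sigma^2 + \sigma^2 - 2\sigma^2\right) = \sigma^2$$

$$\frac{1}{k}\left(n\sigma^2 - \sigma^2\right) = \sigma^2 \implies \frac{1}{k}(n - 1)\sigma^2 = \sigma^2 \implies k = n - 1 \implies S^2 = \frac{\sum_{i=1}^{n}(X_i - \bar{X})^2}{n - 1}$$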
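
The result k = n − 1 can be checked exactly on the small age population used earlier (18, 20, 22, 24). The sketch below is an illustration under that assumption: it averages the sample variance over all 16 ordered samples of size 2 drawn with replacement, once with divisor n − 1 and once with divisor n.

```python
from itertools import product
from statistics import pvariance, variance

population = [18, 20, 22, 24]
sigma2 = pvariance(population)   # population variance, divisor N

# All 16 ordered samples of size 2, drawn with replacement.
samples = list(product(population, repeat=2))

# Average of the sample variance over every possible sample.
avg_div_n_minus_1 = sum(variance(s) for s in samples) / len(samples)   # divisor n - 1
avg_div_n = sum(pvariance(s) for s in samples) / len(samples)          # divisor n

print(sigma2, avg_div_n_minus_1, avg_div_n)
```

With divisor n − 1 the average equals the population variance (5), while divisor n gives 2.5, which underestimates it by the factor (n − 1)/n = 1/2.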


Central Limit Theorem


As the sample size gets large enough (n↑), the sampling distribution of the sample mean becomes almost normal regardless of the shape of the population.


Sample Mean Sampling Distribution:


If the Population is not Normal

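Sampling distribution properties:

• Central Tendency: $\mu_{\bar{x}} = \mu$
• Variation: $\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}}$

[Figure: Population Distribution of x with mean μ, above the Sampling Distribution of $\bar{x}$ with mean $\mu_{\bar{x}}$ (becomes normal as n increases); the curve for the larger sample size is narrower than the curve for the smaller sample size]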


Example
• Suppose a population has mean μ = 8 and standard deviation σ = 3. Suppose a random sample of size n = 36 is selected.

• What is the probability that the sample mean is between 7.8 and 8.2?


Example
Solution:
• Even if the population is not normally distributed, the central limit theorem can be used (n is relatively large)
• … so the sampling distribution of $\bar{x}$ is approximately normal
• … with $\mu_{\bar{x}} = 8$
• … and $\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}} = \frac{3}{\sqrt{36}} = 0.5$


Example
Solution (continued):
$$P(7.8 < \bar{X} < 8.2) = P\left(\frac{7.8 - 8}{3/\sqrt{36}} < \frac{\bar{X} - \mu}{\sigma/\sqrt{n}} < \frac{8.2 - 8}{3/\sqrt{36}}\right)$$

$$= P(-0.4 < Z < 0.4) = 0.6554 - 0.3446 = 0.3108$$
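
As a numerical check of this solution, the sketch below models the sample mean as approximately Normal(8, 3/√36), which is the Central Limit Theorem approximation used in the example, and evaluates the probability with `statistics.NormalDist`.

```python
from math import sqrt
from statistics import NormalDist

mu, sigma, n = 8, 3, 36

# Approximate sampling distribution of the sample mean (CLT).
xbar_dist = NormalDist(mu, sigma / sqrt(n))   # standard error = 0.5

p = xbar_dist.cdf(8.2) - xbar_dist.cdf(7.8)
print(round(p, 4))  # 0.3108
```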

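[Figure: Population Distribution of X (unknown shape, μ = 8); after sampling, the Sampling Distribution of $\bar{X}$ ($\mu_{\bar{X}} = 8$, area between 7.8 and 8.2); after standardizing, the Standard Normal Distribution of Z ($\mu_Z = 0$, area between -0.4 and 0.4)]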

Population Proportions
π = the proportion of the population having some characteristic
• Sample proportion (p) provides an estimate of π:

$$p = \frac{X}{n} = \frac{\text{number of items in the sample having the characteristic of interest}}{\text{sample size}}$$

• 0 ≤ p ≤ 1
• p is approximately distributed as a normal distribution when n is large (assuming sampling with replacement from a finite population or without replacement from an infinite population)


Sampling Distribution of p
• Approximated by a normal distribution if:

$$n\pi \ge 5 \quad \text{and} \quad n(1 - \pi) \ge 5$$

where

$$\mu_p = \pi \quad \text{and} \quad \sigma_p = \sqrt{\frac{\pi(1 - \pi)}{n}}$$

(where π = population proportion)

[Figure: Sampling Distribution $P(p_s)$ plotted against p from 0 to 1]

Standardize p to a Z value with the formula:


$$Z = \frac{p - \pi}{\sigma_p} = \frac{p - \pi}{\sqrt{\frac{\pi(1 - \pi)}{n}}}$$
Suppose q is the probability of success and (1-q) is the probability of failure where
2
𝑞 = 3.
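$$E[X] = nq$$

$$\mathrm{Var}(X) = nq(1 - q)$$

$$E[p] = E\left[\frac{X}{n}\right] = \frac{nq}{n} = q$$

$$\mathrm{Var}(p) = \mathrm{Var}\left(\frac{X}{n}\right) = \frac{1}{n^2}\, nq(1 - q) = \frac{q(1 - q)}{n}$$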
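
These two results can be verified directly from the binomial probabilities. The sketch below assumes X counts successes in n independent trials; the values n = 20 and q = 0.4 are hypothetical choices for illustration, and `proportion_moments` is an illustrative helper name.

```python
from math import comb

def proportion_moments(n: int, q: float):
    """Exact mean and variance of p = X / n when X ~ Binomial(n, q)."""
    # Binomial probability of each possible number of successes x = 0..n.
    pmf = [comb(n, x) * q**x * (1 - q) ** (n - x) for x in range(n + 1)]
    mean_p = sum((x / n) * w for x, w in enumerate(pmf))
    var_p = sum((x / n - mean_p) ** 2 * w for x, w in enumerate(pmf))
    return mean_p, var_p

n, q = 20, 0.4   # hypothetical values, not taken from the slides
mean_p, var_p = proportion_moments(n, q)

print(mean_p, q)                  # both close to 0.4
print(var_p, q * (1 - q) / n)     # both close to 0.012
```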

Example

• If the true proportion of voters who support Proposition A is π = 0.4, what is the probability that a sample of size 200 yields a sample proportion between 0.40 and 0.45?

i.e.: if π = 0.4 and n = 200, what is P(0.40 ≤ p ≤ 0.45)?


Example
if π = 0.4 and n = 200, what is P(0.40 ≤ p ≤ 0.45)?

Find $\sigma_p$:

$$\sigma_p = \sqrt{\frac{\pi(1 - \pi)}{n}} = \sqrt{\frac{0.4(1 - 0.4)}{200}} = 0.03464$$

Convert to standardized normal:

$$P(0.40 \le p \le 0.45) = P\left(\frac{0.40 - 0.40}{0.03464} \le Z \le \frac{0.45 - 0.40}{0.03464}\right) = P(0 \le Z \le 1.44)$$


Example
if π = 0.4 and n = 200, what is P(0.40 ≤ p ≤ 0.45)?
Utilize the cumulative normal table:
P(0 ≤ Z ≤ 1.44) = 0.9251 – 0.5000 = 0.4251
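
The whole proportion example can be sketched in a few lines. The code below assumes the example's values (π = 0.4, n = 200) and uses `statistics.NormalDist` instead of a printed table; it reports the probability both with Z rounded to two decimals, as a table lookup requires, and with the unrounded Z.

```python
from math import sqrt
from statistics import NormalDist

pi, n = 0.4, 200

# Condition for the normal approximation: n*pi >= 5 and n*(1 - pi) >= 5.
approximation_ok = n * pi >= 5 and n * (1 - pi) >= 5

sigma_p = sqrt(pi * (1 - pi) / n)   # standard error of the proportion
z_upper = (0.45 - pi) / sigma_p     # the lower limit 0.40 standardizes to Z = 0

std_normal = NormalDist()
p_table = std_normal.cdf(round(z_upper, 2)) - std_normal.cdf(0.0)   # Z rounded to 1.44
p_unrounded = std_normal.cdf(z_upper) - std_normal.cdf(0.0)

print(approximation_ok, round(sigma_p, 5), round(z_upper, 2))
print(round(p_table, 4), round(p_unrounded, 4))
```

Rounding Z to 1.44 reproduces the table value 0.4251; keeping the unrounded Z gives a slightly larger probability, which is a reminder that table answers carry a small rounding error.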

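[Figure: Sampling Distribution of p with the area between 0.40 and 0.45 shaded; after standardizing, the Standardized Normal Distribution of Z with the area between 0 and 1.44 shaded (area = 0.4251)]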