Unit Eight
Estimation and Hypothesis Testing
Objectives
Having studied this unit, you should be able to
Construct and interpret confidence interval estimates
Formulate hypotheses about a population mean and/or proportion
Determine an appropriate sample size for estimation
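Points from chapter 7
Sampling Distribution of $\bar{X}$
If $X_1, X_2, \dots, X_n$ is a random sample from a population with mean $\mu$ and variance $\sigma^2$, then
o [Normal population OR sample size $n \ge 30$], $\sigma$ known: $\bar{X} \sim N(\mu, \sigma^2/n)$
o [Normal population OR sample size $n \ge 30$], $\sigma$ unknown: $\frac{\bar{X} - \mu}{S/\sqrt{n}} \sim t_{n-1}$
o Population not normal and $n < 30$: use nonparametric methods
Central Limit Theorem
Let $X_1, X_2, \dots, X_n$ be a random sample from a population with mean $\mu$ and variance $\sigma^2$.
$\bar{X}$ converges to a normal distribution with mean $\mu$ and variance $\sigma^2/n$ if the sample size is large enough,
i.e. as $n \to \infty$, $\bar{X}$ is approximately $N(\mu, \sigma^2/n)$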
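The Central Limit Theorem can be checked with a small simulation. The sketch below is only an illustration: the choice of an Exponential(1) population (clearly non-normal, with mean 1 and variance 1), the sample size, the number of repetitions, and the seed are all assumptions made for this example, not values from the unit.

```python
import random
import statistics

def sample_means(n, reps, rng):
    """Return `reps` sample means, each computed from n Exponential(1) draws."""
    return [statistics.fmean(rng.expovariate(1.0) for _ in range(n))
            for _ in range(reps)]

rng = random.Random(0)      # fixed seed so the run is repeatable
n, reps = 50, 4000          # illustrative values only
means = sample_means(n, reps, rng)

# Exponential(1) has mean 1 and variance 1, so theory predicts
# E(X-bar) = 1 and Std(X-bar) = 1 / sqrt(n).
print("mean of sample means:", round(statistics.fmean(means), 3))
print("std of sample means: ", round(statistics.stdev(means), 3))
print("theoretical std:     ", round(1 / n ** 0.5, 3))
```

A histogram of `means` would look roughly bell-shaped even though the population itself is strongly skewed.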
Introduction
So far we studied
Data collection and presentation techniques
Measures of centers
Measures of Variation
Probability distributions
Sampling and sampling distribution
Next is statistical inference: procedures used to draw conclusions about a population based on a sample.
Statistical inference involves
Estimation - estimating population parameters based on sample statistics.
Hypothesis Testing - evaluating whether an asserted value of a population parameter is supported by the sample data or not.
Estimation of Population Mean
Point Estimation: provides a single value as an estimate of a population parameter
Estimator:
o It is a random variable $\hat{\theta}$ that helps us to approximate a parameter ($\theta$)
o It should be
• Unbiased - $E(\hat{\theta}) = \theta$
• Consistent - $\hat{\theta}$ gets closer to $\theta$ as the sample size $n$ increases
• Efficient - among unbiased estimators, the one with the smallest variance is considered efficient
• Estimate - a particular value that an estimator takes
• Point estimator of the mean $\mu$ is the sample mean $\bar{X}$
• Difficulty with point estimation - it gives no clue about how close the sample statistic is to the parameter being estimated.
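Estimation …
Interval Estimation - provides a range of likely values for the parameter.
Deals with finding the limiting values of the interval estimator.
The limiting values are random variables.
To construct a confidence interval we need
Point estimator
Distribution of the point estimator
Significance level $\alpha$
For the mean: $E(\bar{X}) = \mu$ and $\mathrm{Std}(\bar{X}) = \sigma/\sqrt{n}$, so
$Z = \frac{\bar{X} - \mu}{\sigma/\sqrt{n}} \sim N(0, 1)$
For level of significance $\alpha$, the $(1-\alpha)100\%$ confidence interval follows from
$P(-z_{\alpha/2} < Z < z_{\alpha/2}) = 1 - \alpha$
Therefore, the $100(1-\alpha)\%$ confidence interval for $\mu$ is
$\bar{X} \pm z_{\alpha/2}\,\frac{\sigma}{\sqrt{n}}$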
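Interval Estimation of the mean ($\mu$)
The point estimator is $\bar{X}$
Distribution of the standardized $\bar{X}$ is
Normal - if $\sigma$ is known: $\bar{X} \pm z_{\alpha/2}\,\frac{\sigma}{\sqrt{n}}$
t - if $\sigma$ is unknown: $\bar{X} \pm t_{\alpha/2}\,\frac{S}{\sqrt{n}}$ with $n-1$ df
Significance level, say $\alpha$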
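Estimation …
Margin of Error: is the $\pm$ term in CI estimation, $E = z_{\alpha/2}\,\frac{\sigma}{\sqrt{n}}$
Sample size determination:
To achieve a specific margin of error $E$, the sample size must be $n = \left(\frac{z_{\alpha/2}\,\sigma}{E}\right)^2$, rounded up to the next whole number.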
Example:
From scores on exams in statistics, a random sample of 36 scores is taken and gives a sample
mean of 68. It is known that the standard deviation of statistics exam scores is 3. Find a 95%
confidence interval for the true (population) mean of statistics exam scores.
Solution: $\sigma = 3$ is known and $n = 36$ is large enough, $\bar{x} = 68$
Confidence level = 95%, so $\alpha = 0.05$ and $z_{\alpha/2} = z_{0.025} = 1.96$
Margin of error = $1.96 \times \frac{3}{\sqrt{36}} = 0.98$
95% CI for $\mu$: (68 - 0.98, 68 + 0.98) = (67.02, 68.98)
We are 95% confident that the interval (67.02, 68.98) covers $\mu$.
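The arithmetic of this example can be reproduced in a few lines. This is a minimal sketch using only Python's standard library; the function name `z_confidence_interval` is a hypothetical helper introduced here, and it assumes the population standard deviation is known, as in the example.

```python
from math import sqrt
from statistics import NormalDist

def z_confidence_interval(xbar, sigma, n, confidence):
    """CI for the mean with known sigma: xbar +/- z_{alpha/2} * sigma / sqrt(n)."""
    alpha = 1 - confidence
    z = NormalDist().inv_cdf(1 - alpha / 2)   # upper alpha/2 critical value
    margin = z * sigma / sqrt(n)
    return xbar - margin, xbar + margin, margin

# Values from the exam-score example: n = 36, sample mean 68, sigma = 3.
lower, upper, margin = z_confidence_interval(xbar=68, sigma=3, n=36, confidence=0.95)
print(f"margin of error = {margin:.2f}")
print(f"95% CI = ({lower:.2f}, {upper:.2f})")
```

Changing `confidence` shows the trade-off directly: a higher confidence level gives a larger critical value and therefore a wider interval.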
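Estimation: Population Proportion (P)
Point estimator of P is the sample proportion $\hat{p}$
For large n, i.e. $n\hat{p} > 10$ and $n(1-\hat{p}) > 10$, $\hat{p}$ is approximately $N\left(P, \frac{P(1-P)}{n}\right)$
For significance level $\alpha$, the $100(1-\alpha)\%$ confidence interval for the population proportion is
given by $\left(\hat{p} - z_{\alpha/2}\sqrt{\frac{\hat{p}(1-\hat{p})}{n}},\ \hat{p} + z_{\alpha/2}\sqrt{\frac{\hat{p}(1-\hat{p})}{n}}\right)$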
Example:
Suppose 250 randomly selected people are surveyed to determine if they own a tablet.
Of the 250 surveyed, 98 reported owning a tablet. Using a 95% confidence level,
compute a confidence interval estimate for the true proportion of people who own
tablets.
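Sol. $n = 250$, $\hat{p} = 98/250 = 0.392$, $1-\hat{p} = 0.608$; both $n\hat{p} = 98 > 10$ and $n(1-\hat{p}) = 152 > 10$
$z_{\alpha/2} = z_{0.025} = 1.96$
95% CI for P:
$\hat{p} \pm z_{\alpha/2}\sqrt{\frac{\hat{p}(1-\hat{p})}{n}} = 0.392 \pm 1.96\sqrt{\frac{0.392 \times 0.608}{250}}$ = (0.392 - 0.061, 0.392 + 0.061) = (0.33, 0.45)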
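As a check on the tablet-survey calculation, the interval can be computed directly. The sketch below assumes the large-sample normal approximation described above; `proportion_confidence_interval` is a hypothetical helper name, and the ">10" check mirrors the rule of thumb used in this unit.

```python
from math import sqrt
from statistics import NormalDist

def proportion_confidence_interval(successes, n, confidence):
    """Large-sample CI for a proportion: p_hat +/- z * sqrt(p_hat(1-p_hat)/n)."""
    p_hat = successes / n
    # Rule of thumb from the unit: both counts should exceed 10.
    if n * p_hat <= 10 or n * (1 - p_hat) <= 10:
        raise ValueError("sample too small for the normal approximation")
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z * sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - margin, p_hat + margin

# Tablet survey: 98 owners out of 250 people, 95% confidence.
lower, upper = proportion_confidence_interval(successes=98, n=250, confidence=0.95)
print(f"95% CI for P = ({lower:.2f}, {upper:.2f})")
```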
Examples
The amount of a particular biochemical substance related to bone breakdown
was measured in 30 healthy women. The sample mean and standard deviation
were 3.3 nanograms per milliliter (ng/mL) and 1.4 ng/mL. Construct an 80%
confidence interval for the mean level of this substance in all healthy women.
A thread manufacturer tests a sample of eight lengths of a certain type of
thread made of blended materials and obtains a mean tensile strength of 8.2 lb
with standard deviation 0.06 lb. Assuming tensile strengths are normally
distributed, construct a 90% confidence interval for the mean tensile strength of
this thread.
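When the population standard deviation is unknown and the sample is small, the interval uses a t critical value instead of z. The sketch below works through the thread example. Python's standard library has no t quantile function, so the critical value $t_{0.05,\,7} = 1.895$ is typed in from a t table; the helper name `t_confidence_interval` is an assumption made for this illustration.

```python
from math import sqrt

def t_confidence_interval(xbar, s, n, t_crit):
    """CI for the mean with unknown sigma: xbar +/- t_crit * s / sqrt(n)."""
    margin = t_crit * s / sqrt(n)
    return xbar - margin, xbar + margin

# Thread example: n = 8, mean 8.2 lb, s = 0.06 lb, 90% confidence.
# t critical value with n - 1 = 7 df and upper-tail area 0.05, from a t table.
T_CRIT_90_DF7 = 1.895
lower, upper = t_confidence_interval(xbar=8.2, s=0.06, n=8, t_crit=T_CRIT_90_DF7)
print(f"90% CI = ({lower:.2f}, {upper:.2f}) lb")
```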
Examples
In a random sample of 900 adults, 42 defined themselves as
vegetarians. Of these 42, 29 were women.
a. Give a point estimate of the proportion of all self-described
vegetarians who are women.
b. Verify that the sample is sufficiently large to use it to construct
a confidence interval for that proportion.
c. Construct a 90% confidence interval for the proportion of all self-
described vegetarians who are women.
A software engineer wishes to estimate, to within 5 seconds, the mean
time that a new application takes to start up, with 95% confidence.
Estimate the minimum sample size required if the standard deviation of
startup times for similar software is 12 seconds.
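The sample-size rule $n = (z_{\alpha/2}\,\sigma / E)^2$, rounded up, can be sketched as a small function. The name `required_sample_size` is hypothetical; the inputs are the startup-time values above (margin 5 seconds, standard deviation 12 seconds, 95% confidence).

```python
from math import ceil
from statistics import NormalDist

def required_sample_size(sigma, margin, confidence):
    """Smallest n with z_{alpha/2} * sigma / sqrt(n) <= margin."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    # Always round up: rounding down would exceed the target margin.
    return ceil((z * sigma / margin) ** 2)

n_min = required_sample_size(sigma=12, margin=5, confidence=0.95)
print("minimum sample size:", n_min)
```

Because n grows with the square of $\sigma / E$, halving the desired margin of error roughly quadruples the required sample size.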
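Estimation for population variance ($\sigma^2$)
The point estimator - sample variance $S^2$
If a simple random sample of size n is obtained from a normally distributed
population with mean μ and standard deviation σ, then $\frac{(n-1)S^2}{\sigma^2}$ follows a chi-squared
distribution with $n-1$ degrees of freedom.
If $\alpha$ is the significance level and $\chi^2_{a,\,n-1}$ is the value with upper-tail area $a$,
A $100(1-\alpha)\%$ CI for $\sigma^2$: $\left(\frac{(n-1)S^2}{\chi^2_{\alpha/2,\,n-1}},\ \frac{(n-1)S^2}{\chi^2_{1-\alpha/2,\,n-1}}\right)$
A $100(1-\alpha)\%$ CI for $\sigma$: $\left(\sqrt{\frac{(n-1)S^2}{\chi^2_{\alpha/2,\,n-1}}},\ \sqrt{\frac{(n-1)S^2}{\chi^2_{1-\alpha/2,\,n-1}}}\right)$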
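Example
Suppose a sample of 30 randomly selected students are given an IQ test. If the sample has a standard
deviation of 12.23 points, find a 90% confidence interval for the population standard deviation.
$n = 30$, $S = 12.23$, $\alpha = 0.10$, df = 29
$\chi^2_{0.05,\,29} = 42.557$ (upper-tail area 0.05)
$\chi^2_{0.95,\,29} = 17.708$ (upper-tail area 0.95)
A 90% CI for $\sigma$: $\left(\sqrt{\frac{29(12.23)^2}{42.557}},\ \sqrt{\frac{29(12.23)^2}{17.708}}\right) \approx (10.10, 15.65)$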
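The chi-squared interval for the IQ example can be computed as follows. This is a sketch under one stated assumption: the two critical values for 29 degrees of freedom (42.557 and 17.708) are taken from a chi-squared table and typed in, because Python's standard library does not provide chi-squared quantiles. The function name is hypothetical.

```python
from math import sqrt

def variance_confidence_interval(s, n, chi2_upper, chi2_lower):
    """CI for sigma^2: ((n-1)s^2 / chi2_upper, (n-1)s^2 / chi2_lower)."""
    scaled = (n - 1) * s ** 2
    # The larger critical value gives the lower limit, and vice versa.
    return scaled / chi2_upper, scaled / chi2_lower

# IQ example: n = 30, s = 12.23, 90% confidence, df = 29.
CHI2_UPPER = 42.557   # table value with upper-tail area 0.05
CHI2_LOWER = 17.708   # table value with upper-tail area 0.95
var_lo, var_hi = variance_confidence_interval(12.23, 30, CHI2_UPPER, CHI2_LOWER)
sd_lo, sd_hi = sqrt(var_lo), sqrt(var_hi)
print(f"90% CI for variance: ({var_lo:.2f}, {var_hi:.2f})")
print(f"90% CI for std dev:  ({sd_lo:.2f}, {sd_hi:.2f})")
```

Note that the interval is not symmetric around the sample value 12.23, because the chi-squared distribution is skewed.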
Exercise
A manufacturer measures 19 randomly selected dowels
and finds the standard deviation of the sample to be s =
0.16. Find the 95% confidence interval for the
population variance and standard deviation.
Hypothesis Testing
Hypothesis: A hypothesis is an assumption about a population parameter
Example: I assume that the average GPA ($\mu$) of this class is 3.5.
A Statistical hypothesis:
A hypothesis that is testable on the basis of observing a process that is modeled via a set of
random variables.
Main Idea:
It is difficult to prove that a claim is “right”.
But it is easier to show that it is “wrong”: finding a counterexample is enough.
Components of Hypothesis Test
The Null Hypothesis (Ho):
States the assumption to be tested, e.g. Ho: $\mu = 3.5$
Until proven wrong, we assume it is true
May or may not be rejected
Alternative Hypothesis (Ha):
Competes with Ho
Never contains an equality sign, e.g. Ha: $\mu \neq 3.5$
May or may not be accepted
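Hypothesis Testing on Population Mean
Steps:
1. State the Null and Alternative Hypotheses
A. Ho: $\mu = \mu_0$ Vs Ha: $\mu < \mu_0$, or
B. Ho: $\mu = \mu_0$ Vs Ha: $\mu > \mu_0$, or
C. Ho: $\mu = \mu_0$ Vs Ha: $\mu \neq \mu_0$
2. State the level of significance ($\alpha$) and get the critical value for the test, $z_{\alpha}$ or $z_{\alpha/2}$
• Usually $\alpha$ is set to be 0.1, 0.05, or 0.01
• It is the probability of rejecting a true Ho.
3. Calculate the appropriate test statistic (Z or t)
When $\sigma$ is known (Normal or n>30): $Z = \frac{\bar{X} - \mu_0}{\sigma/\sqrt{n}}$
When $\sigma$ is unknown (Normal or n>30): $t = \frac{\bar{X} - \mu_0}{S/\sqrt{n}}$ with $n-1$ df
4. Establish a decision rule (critical or rejection region)
A) Decision: Reject Ho if $Z < -z_{\alpha}$
B) Decision: Reject Ho if $Z > z_{\alpha}$
C) Decision: Reject Ho if $|Z| > z_{\alpha/2}$
5. Give Conclusion - Interpretation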
Errors in Hypothesis Testing
There are two types of errors that can occur in a decision making process
Type I Error ($\alpha$) - occurs when one rejects Ho while it is actually true.
Type II Error ($\beta$) - occurs when one fails to reject Ho while it is actually false.
                    Null Hypothesis (Ho)
Decision            True                        False
Reject Ho           Type I Error ($\alpha$)     Correct Decision
Don’t reject Ho     Correct Decision            Type II Error ($\beta$)
$\alpha$ and $\beta$ are inversely related: for a fixed sample size, reducing $\alpha$ increases $\beta$.
Taking an adequate sample size helps to keep both small.
Depending on how severe the consequences of each error are, try to balance the two.
Example
We want to test the claim that the climate has changed since industrialization at
0.05 significance level. Suppose that the mean temperature throughout history
is 50 degrees. To test the claim, data on the temperature was collected from the
past 40 years. The mean temperature was found to be 51 degrees. If the
population standard deviation is 2, what should our conclusion be?
Sol. $n = 40$, $\bar{x} = 51$, $\sigma = 2$
Step1: Ho: $\mu = 50$ (No change) Vs Ha: $\mu \neq 50$ (There is change)
Step2: Significance level $\alpha = 0.05$
Step3: Since n=40 and $\sigma$ is known, the appropriate test statistic is Z
$Z = \frac{\bar{x} - \mu_0}{\sigma/\sqrt{n}} = \frac{51 - 50}{2/\sqrt{40}} = 3.16$
Step4: Critical value = $z_{\alpha/2} = z_{0.025}$. From the standard Normal table, $z_{0.025} = 1.96$
Decision: Since $|Z| = 3.16 > 1.96 = z_{\alpha/2}$, reject Ho.
Step5: At 0.05 level of significance, there is enough evidence that supports
the claim that the temperature has changed since industrialization.
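The five steps of the temperature example can be traced in code. This is a minimal sketch of a two-sided Z test with known $\sigma$; the function name `z_test_two_sided` is hypothetical, and the critical value is obtained from the standard library's normal distribution rather than from a printed table.

```python
from math import sqrt
from statistics import NormalDist

def z_test_two_sided(xbar, mu0, sigma, n, alpha):
    """Two-sided Z test of Ho: mu = mu0 against Ha: mu != mu0 (sigma known)."""
    z = (xbar - mu0) / (sigma / sqrt(n))            # step 3: test statistic
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)    # step 4: critical value
    return z, z_crit, abs(z) > z_crit               # reject Ho if |Z| > z_crit

# Temperature example: historical mean 50, sample mean 51, sigma = 2, n = 40.
z, z_crit, reject = z_test_two_sided(xbar=51, mu0=50, sigma=2, n=40, alpha=0.05)
print(f"Z = {z:.2f}, critical value = {z_crit:.2f}, reject Ho: {reject}")
```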
P-value Method
The p-value is the probability, computed assuming Ho is true, of observing a test statistic at
least as extreme as the one actually obtained.
Small p-values provide evidence against the null hypothesis.
The smaller (closer to 0) the p-value, the stronger is the evidence against the null
hypothesis.
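If $Z_c = \frac{\bar{x} - \mu_0}{\sigma/\sqrt{n}}$ is the computed test statistic, the decision rule using the p-value is summarized as
follows.
Null Hypothesis:          Ho: $\mu = \mu_0$    Ho: $\mu = \mu_0$    Ho: $\mu = \mu_0$
Alternative Hypothesis:   Ha: $\mu < \mu_0$    Ha: $\mu > \mu_0$    Ha: $\mu \neq \mu_0$
P-value =                 $P(Z < Z_c)$         $P(Z > Z_c)$         $2P(Z > |Z_c|)$
Significance level = $\alpha$
Decision: Reject Ho if P-value < $\alpha$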
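The three p-value formulas can be collected into one function. The sketch below is illustrative only: the function name `z_p_value` and the labels `"less"`, `"greater"`, and `"two-sided"` are naming choices made here, not notation from the unit. It is applied to the Z = 3.16 obtained in the temperature example.

```python
from statistics import NormalDist

def z_p_value(z_c, alternative):
    """P-value of a Z test for the alternative 'less', 'greater' or 'two-sided'."""
    std_normal = NormalDist()              # standard normal, mean 0 and sd 1
    if alternative == "less":              # Ha: mu < mu0
        return std_normal.cdf(z_c)
    if alternative == "greater":           # Ha: mu > mu0
        return 1 - std_normal.cdf(z_c)
    if alternative == "two-sided":         # Ha: mu != mu0
        return 2 * (1 - std_normal.cdf(abs(z_c)))
    raise ValueError("alternative must be 'less', 'greater' or 'two-sided'")

# Temperature example: Z = 3.16 with a two-sided alternative, alpha = 0.05.
p = z_p_value(3.16, "two-sided")
print(f"p-value = {p:.4f}, reject Ho at 0.05: {p < 0.05}")
```

The p-value rule and the critical-value rule always lead to the same decision; the p-value additionally shows how strong the evidence is.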
Example
A teacher believes that 85% of students in the class will want to go on a field trip to
the local zoo. She performs a hypothesis test to determine if the percentage is the
same or different from 85%. The teacher samples 50 students and 39 reply that they
would want to go to the zoo. For the hypothesis test, use a 1% level of significance.
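Solution
Ho: P = 0.85 Vs Ha: P $\neq$ 0.85
Significance level $\alpha = 0.01$
Test statistic:
$\hat{p} = 39/50 = 0.78$; $1-\hat{p} = 0.22$
Both $n\hat{p} = 39$ and $n(1-\hat{p}) = 11$ are >10
Test statistic = $Z = \frac{\hat{p} - P_0}{\sqrt{P_0(1-P_0)/n}} = \frac{0.78 - 0.85}{\sqrt{0.85 \times 0.15/50}} = \frac{-0.07}{0.0505} = -1.39$
P-value = 2P(Z > 1.39) = 2(1 - 0.9177) = 2(0.0823) = 0.1646
Decision: Since P-value = 0.1646 > 0.01 = $\alpha$, don’t reject Ho.
Conclusion: at the 1% level of significance, the evidence is not enough to reject the null
hypothesis.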
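The field-trip test can be reproduced as below. This is a sketch of a two-sided Z test for a proportion, with a hypothetical function name. One caveat: the code keeps the unrounded test statistic, so its p-value (about 0.166) differs slightly from the table-based 0.1646, which uses Z rounded to -1.39. The decision is the same either way.

```python
from math import sqrt
from statistics import NormalDist

def proportion_z_test(successes, n, p0, alpha):
    """Two-sided Z test of Ho: P = p0 using the normal approximation."""
    p_hat = successes / n
    # The standard error uses p0, the value assumed under Ho.
    z = (p_hat - p0) / sqrt(p0 * (1 - p0) / n)
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value, p_value < alpha

# Field-trip example: 39 of 50 students say yes, Ho: P = 0.85, alpha = 0.01.
z, p_value, reject = proportion_z_test(successes=39, n=50, p0=0.85, alpha=0.01)
print(f"Z = {z:.2f}, p-value = {p_value:.4f}, reject Ho: {reject}")
```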
Thank you All!