Overview
• Terminology
• Confidence intervals
• Hypothesis testing
What is statistics?
• Statistics is about making decisions in the face of uncertainty
• In the simplest case, we are trying to understand uncertainty
when estimating a single mean
• Often, we characterize the uncertainty by assuming a
distribution
Notation
• The normal or Gaussian distribution can be written in one of
two ways.
G = Gaussian, N = Normal
G(µ, σ)   or   N(µ, σ²)
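As a quick software aside (a minimal sketch, assuming NumPy and SciPy): because the N(µ, σ²) form is written in terms of the variance, it is easy to mix up σ and σ² when simulating; the calls below take the standard deviation σ, not the variance.

```python
import numpy as np
from scipy import stats

mu, sigma = 5.0, 2.0

# both NumPy and SciPy parameterize the normal by the standard deviation sigma
draws = np.random.default_rng(0).normal(loc=mu, scale=sigma, size=100_000)
print(draws.std())                            # close to sigma = 2

print(stats.norm(loc=mu, scale=sigma).var())  # sigma**2 = 4.0
```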
Standard Error (se) and
Standard Deviation (sd)
• There are unfortunately no standard definitions of sd and se.
– Standard deviation: square root of the variance of a random variable
  sd(X) = √Var(X)
– Standard error: square root of the variance of a function of a random variable
  – e.g. se(X̄) = √Var(X̄)
Standard Error (se) and
Standard Deviation (sd)
• Some professors refer to standard deviations as the theoretical
quantities (the formula), and to standard errors as the
estimated quantities
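For the most common case, the standard error of the sample mean of n independent observations, the two quantities connect as follows (a short derivation; σ denotes sd(X)):

```latex
\operatorname{Var}(\bar{X})
   = \operatorname{Var}\!\left(\tfrac{1}{n}\textstyle\sum_{i=1}^{n} X_i\right)
   = \tfrac{1}{n^2}\textstyle\sum_{i=1}^{n}\operatorname{Var}(X_i)
   = \frac{\sigma^2}{n},
\qquad
\operatorname{se}(\bar{X})
   = \sqrt{\operatorname{Var}(\bar{X})}
   = \frac{\sigma}{\sqrt{n}}
   = \frac{\operatorname{sd}(X)}{\sqrt{n}} .
```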
Small Example
• Draw a sample from some population
• Data: x=4,5,6
• Clicker Question: What is the sample standard deviation sd(X)?
Answers:
a) sqrt(2/3)
b) 2/3
c) 1
d) 2
e) other
Small Example
• Data: 4,5,6
• Given that you have computed the sd, what is the standard error of the mean, se(X̄)?
• Clicker question: What is the se? Answers:
a) sd/n
b) sd/sqrt(n)
c) sd*n
d) sd*sqrt(n)
e) sd
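A minimal Python sketch (assuming NumPy) that checks both clicker answers for the data 4, 5, 6:

```python
import numpy as np

x = np.array([4.0, 5.0, 6.0])
n = len(x)

sd = x.std(ddof=1)     # sample standard deviation (divisor n - 1) -> 1.0, answer c)
se = sd / np.sqrt(n)   # standard error of the mean, sd / sqrt(n)  -> 0.577..., answer b)
print(sd, se)
```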
Symbols for estimator vs estimate
• An estimator is a formula, an estimate is a numerical value
• In Statistics we distinguish between an estimator
(tilde ~) and an estimate (hat ^)
Confidence intervals
• One sample model
Y = µ + ε
where ε is a residual or error
• What is a plausible range for µ ?
• We have a sample of Y’s
Confidence interval
• For a mean:
  ( ȳ − c·σ/√n ,  ȳ + c·σ/√n )
• Where c is a critical value and depends on the distribution of y
• Because of the central limit theorem, for large samples the distribution of ȳ is approximately normal
  – regardless of the distribution of y
• The critical value c needs to be obtained from a look-up table or a
computer
– For the 95% two-sided confidence interval c=1.96
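A quick sketch (assuming SciPy) of where the critical value comes from; for a two-sided 95% interval it is the upper α/2 = 2.5% quantile of the standard normal:

```python
from scipy import stats

alpha = 0.05
c = stats.norm.ppf(1 - alpha / 2)   # upper alpha/2 quantile of N(0, 1)
print(c)                            # 1.9599... ~ 1.96
```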
Sampling distribution of ȳ
• E.g. alpha = 5%
• We are 95% confident that the true mean is in the interval
• 95% of the time, when we calculate a confidence interval in this way, the true mean will be in the confidence interval
• Technically incorrect: "The probability of the true mean being in the interval is 95%"
[Figure: probability density of the sampling distribution of ȳ, centred at µ; the middle area is (1 − α), with α/2 in each tail. Because the total probability is 1, the total area under the curve is 1: α/2 + (1 − α) + α/2 = 1.]
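A small simulation sketch (assuming NumPy and SciPy; the true mean, sd, and sample size are made up for illustration) of the "95% of the time" interpretation:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
mu, sigma, n, reps = 10.0, 2.0, 50, 10_000
c = stats.norm.ppf(0.975)

covered = 0
for _ in range(reps):
    y = rng.normal(mu, sigma, n)
    half = c * sigma / np.sqrt(n)               # half-width with known sigma
    covered += (y.mean() - half <= mu <= y.mean() + half)
print(covered / reps)                            # close to 0.95
```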
Sampling distribution assuming normality
• Suppose we assume normality:
  Y = µ + ε,  ε ∼ G(0, σ)
• Then the sampling distribution is t:
  (µ̂ − µ) / (σ̂ₙ₋₁ / √n)  ∼  tₙ₋₁
• The t-distribution looks very similar to the normal distribution, but
has heavier tails
t-distribution vs. normal
• The t-distribution has heavier tails
[Figure: t density with 4 df vs. normal density]
t-distribution vs. normal
• The t-distribution converges to the normal distribution as n
increases.
– Once n > 30, there is not much difference.
• For a 95% confidence interval:
– critical values of the t-distribution are always larger than 1.96.
– critical values converge to 1.96 for “large” sample sizes
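A short sketch (assuming SciPy) showing the two-sided 95% critical values of the t-distribution shrinking toward 1.96 as the degrees of freedom grow:

```python
from scipy import stats

for df in (2, 4, 10, 30, 100, 1000):
    print(df, round(stats.t.ppf(0.975, df), 3))
# 2 -> 4.303, 30 -> 2.042, 1000 -> 1.962
```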
Example
• A 95% confidence interval for the simple example with 3
observations:
CI = ( ȳ − t₂ · se(ȳ),  ȳ + t₂ · se(ȳ) )
   = ( 5 − 4.3 · 1/√3,  5 + 4.3 · 1/√3 )
   = ( 2.52, 7.48 )
• Note: If I had had a large number of observations, the critical value
would have been 1.96
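The same interval computed in Python (assuming SciPy), reproducing (2.52, 7.48):

```python
import numpy as np
from scipy import stats

y = np.array([4.0, 5.0, 6.0])
n = len(y)

se = y.std(ddof=1) / np.sqrt(n)          # 1 / sqrt(3)
t_crit = stats.t.ppf(0.975, df=n - 1)    # 4.30... for 2 degrees of freedom
print(y.mean() - t_crit * se, y.mean() + t_crit * se)   # about 2.52 and 7.48
```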
Hypothesis tests
• A hypothesis is a statement about a parameter
• The most common hypothesis test is
H0 : µ = 0
H1 : µ ≠ 0
• The alternative hypothesis affects the answer. Here it specifies that both large positive and large negative values are evidence against the null hypothesis
Rejection of hypotheses
• We either reject the null hypothesis or we
fail to reject the hypothesis
– We never accept the null hypothesis, because
we can never be sure it is true
• This parallels convictions in a court of law (H₀: innocent, H₁: not innocent):
  – "guilty" implies proof of guilt beyond reasonable doubt
  – "not guilty" does not mean innocence; it simply means you couldn't prove guilt
Hypotheses
• In hypothesis testing you compute the probability, assuming the hypothesis is true, of observing data at least as extreme as what was observed (the alternative determines what counts as extreme)
• If that probability is very small, you reject the hypothesis
Hypothesis
H₀: µ = µ₀
H₁: µ ≠ µ₀
[Figure: probability density of the test statistic under H₀, centred at µ₀; the middle area is (1 − α) and each rejection region in the tails has area α/2.]
Suppose that we flipped the coin 50 times, and observed Y = 22 heads.
The observed value of D = |Y − 25| is d = |22 − 25| = 3.
What is the probability of observing a value of D greater than or equal to 3 if H₀: θ = 0.5 is true?
Suppose that we repeated the same experiment
with many coins – each coin is tossed 50 times.
If the coins were fair, then we would expect about 48% of the experiments to have a value of D = |Y − 25| greater than or equal to the value d = |22 − 25| = 3.
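A sketch (assuming SciPy) of the exact binomial calculation behind the "about 48%" figure:

```python
from scipy import stats

n, theta0 = 50, 0.5
center = 25                    # E[Y] under H0: theta = 0.5
d = abs(22 - center)           # observed value of D, here 3

# P(D >= d | H0) = P(Y <= 25 - d) + P(Y >= 25 + d) for Y ~ Binomial(50, 0.5)
p_value = stats.binom.cdf(center - d, n, theta0) + stats.binom.sf(center + d - 1, n, theta0)
print(p_value)                 # roughly 0.48
```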
Important Note: This test cannot prove that the
coin is fair. However, in this case, based on the
evidence on hand (the data), there is no evidence
to suggest that the coin is biased
Key Concept: We care about the probability of
observing a value of D greater than or equal to the
observed value (d = 3 in this example) if the null
hypothesis were true.
We do not care about the probability of observing D
= 3 exactly.
Why?
If we repeated the experiment and got y = 22 again, the only surprise would be that we got the exact same result; any single exact outcome has small probability, so by itself it provides zero evidence against H₀: θ = 0.5.
Instead we ask: what is the probability of a result at least as surprising?
The probability calculated in this question is called
the p-value
The p-value represents the probability of
observing a value as extreme or more extreme
than the value observed, under the assumption
that the null hypothesis is true
It is a measure of the level of evidence against H₀ based on the observed data.
So, the smaller the p-value, the more evidence we have against H₀, or the less the data support the claim that the null hypothesis is true.
Hypothesis testing vs confidence interval
For a given significance level alpha:
• The null hypothesis H₀: µ = µ₀ is rejected if and only if the hypothesized value µ₀ falls outside of the (1 − alpha) confidence interval
• Therefore, hypothesis testing and confidence intervals always
lead to the same conclusion
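A sketch (assuming SciPy; the data are the 4, 5, 6 example from above) of this duality: the two-sided one-sample t-test rejects H₀: µ = µ₀ at level α exactly when µ₀ falls outside the (1 − α) confidence interval:

```python
import numpy as np
from scipy import stats

y = np.array([4.0, 5.0, 6.0])
mu0, alpha = 0.0, 0.05

# two-sided one-sample t-test of H0: mu = mu0
t_stat, p_value = stats.ttest_1samp(y, popmean=mu0)

# matching (1 - alpha) confidence interval
n = len(y)
se = y.std(ddof=1) / np.sqrt(n)
t_crit = stats.t.ppf(1 - alpha / 2, df=n - 1)
lo, hi = y.mean() - t_crit * se, y.mean() + t_crit * se

print(p_value < alpha, not (lo <= mu0 <= hi))   # both True here: reject, and mu0 is outside the CI
```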
Significance level
• The significance level is the cutoff for the p-value below which the null hypothesis is considered so unlikely that you no longer believe it
– usually alpha=0.05, sometimes alpha=0.01
– p-value is interpreted as “degree of evidence” against the hypothesis
– strong evidence, reasonable evidence, weak evidence against the
hypothesis
• Unfortunately, in practice you often have to decide: reject or
not reject
“Statistical significance” vs.
(English) significance
• “significant” has a different meaning in statistics than in the
English language
• “This is a significant change”. What does this mean?
– English: “This is an important change” or “This change is large
enough to matter in practice”
– Statistics: “The change is not zero” or “The change cannot be
explained by chance alone”
Highway or surface road?
• When Dr. Schonlau was living in Los Angeles working for the
RAND Corporation, he had two choices for his morning
commute:
– Surface streets (shorter distance)
– Highway (longer distance)
• It was not clear to him which of the two options took less time
• Therefore, he started to record how many minutes each
commute took
• We will now play the commuting game
Highway or surface road?
• Here are the times of the first 3 trips (in minutes) on the two routes (Surface Roads vs. Highway):
  n   Time
  1   19
  2   19
  3   23
• I will give you additional commuting times shortly
• When you think you know which route is shorter, write down the value of n
• From that point on you would only commute on the shorter route
One–sample t-test
• The one-sample t-test corresponds to the following model:
Y = µ + ε,  ε ∼ G(0, σ)
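A minimal sketch of running this test (assuming SciPy; the commute-style numbers are made up, framed as paired differences "surface minus highway" so that H₀: µ = 0 means the routes take equally long on average):

```python
import numpy as np
from scipy import stats

# hypothetical paired differences: surface-road time minus highway time, in minutes
diff = np.array([2.0, -1.0, 3.0, 4.0, 1.0, 2.0])

# one-sample t-test of H0: mu = 0 against H1: mu != 0
t_stat, p_value = stats.ttest_1samp(diff, popmean=0.0)
print(t_stat, p_value)
```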
Regression
• In regression, the systematic part of the model changes as a
function of other x-variables
Y = α + βx + ε,  ε ∼ G(0, σ)
• This is a generalization of the one sample model
– The one-sample t-test arises when x = 0 (then Y = α + ε, with α playing the role of µ)
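A minimal sketch (assuming SciPy; x, α, β, and σ are made up and the data are simulated) of fitting this model:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(2)
x = np.linspace(0, 10, 30)
y = 1.0 + 0.5 * x + rng.normal(0.0, 1.0, size=x.size)   # alpha = 1, beta = 0.5, sigma = 1

fit = stats.linregress(x, y)
print(fit.intercept, fit.slope)    # estimates of alpha and beta
```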