
Statistical Modeling, Lecture 4

Melike Efe
July 21, 2025
Sabancı University
Table of contents

1. Point Estimation
   Efficient Estimators
   Consistency
   Sufficient Estimator
Point Estimation
Efficient Estimators

▶ Unbiasedness is a desirable property for an estimator: the estimator's
expected value equals the true value of the parameter.
▶ There are usually many unbiased estimators of a parameter θ.
▶ In that case we search the class of unbiased estimators for one whose
variance is as small as possible.
▶ This process leads to the MVUE, the minimum variance unbiased
estimator.
Definition: For a random sample X1 , X2 , . . . , Xn from a given
distribution with parameter θ, the estimator Θ̂ = h(X1 , X2 , . . . , Xn ) is a
minimum variance unbiased estimator of θ if Θ̂ is unbiased, that is, if
E(Θ̂) = θ (for all possible θ), and Var(Θ̂) is less than or equal to the
variance of any unbiased estimator of θ.

▶ We may use the variance as a measure to compare two unbiased
estimators of the same parameter.
▶ Obviously the estimator with the smallest variance is preferred.
Definition: If Θ̂1 and Θ̂2 are two unbiased estimators for the parameter
θ of a given population, we say that Θ̂1 is more efficient than Θ̂2 if the
relative efficiency
\[
e(\hat{\Theta}_1, \hat{\Theta}_2) = \frac{\operatorname{Var}(\hat{\Theta}_2)}{\operatorname{Var}(\hat{\Theta}_1)} > 1.
\]

The above ratio e(Θ̂1 , Θ̂2 ) is called the efficiency of Θ̂1 relative to Θ̂2 .
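As an illustration (a simulation sketch added here, not part of the slides), the following Python snippet estimates the efficiency of the sample mean relative to the sample median for a normal population. Both estimators are unbiased for µ under normality, so comparing variances is a fair comparison; the ratio should come out near π/2 ≈ 1.571 in favor of the mean. All names and parameter values are illustrative.

```python
# Simulation sketch: efficiency of the sample mean relative to the
# sample median for a normal population (both unbiased for mu).
import numpy as np

rng = np.random.default_rng(0)
n, reps = 50, 20_000
samples = rng.normal(loc=5.0, scale=2.0, size=(reps, n))

means = samples.mean(axis=1)           # Theta_hat_1: sample mean
medians = np.median(samples, axis=1)   # Theta_hat_2: sample median

# e(Theta_1, Theta_2) = Var(Theta_2) / Var(Theta_1); a value > 1 favors Theta_1
eff = medians.var(ddof=1) / means.var(ddof=1)
print(f"estimated efficiency of mean relative to median: {eff:.3f}")  # ~ 1.57
```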

Theorem: Let X1 , X2 , . . . , Xn be a random sample drawn from a
population with the density f (x) and Θ̂ be an unbiased estimator of θ.
Under reasonably general regularity conditions, it follows that
\[
\operatorname{Var}(\hat{\Theta}) \;\ge\; \frac{1}{n\,E\!\left[\left(\dfrac{\partial \ln f(X)}{\partial \theta}\right)^{2}\right]}
\]
The right-hand side in the above equation is called the Cramér-Rao lower
bound (CRLB).
Theorem: If Θ̂ is an unbiased estimator of θ and
\[
\operatorname{Var}(\hat{\Theta}) = \frac{1}{n\,E\!\left[\left(\dfrac{\partial \ln f(X)}{\partial \theta}\right)^{2}\right]},
\]

then Θ̂ is a minimum variance unbiased estimator of θ.

Example: Show that X̄ is a minimum variance unbiased estimator of the
mean µ of a normal population.
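A sketch of the argument (an outline added here, not the slide's worked solution; σ² is treated as known):

```latex
% For a single observation X ~ N(mu, sigma^2):
\[
\ln f(x) = -\tfrac{1}{2}\ln(2\pi\sigma^{2}) - \frac{(x-\mu)^{2}}{2\sigma^{2}}
\quad\Longrightarrow\quad
\frac{\partial \ln f(x)}{\partial \mu} = \frac{x-\mu}{\sigma^{2}},
\]
\[
E\!\left[\left(\frac{X-\mu}{\sigma^{2}}\right)^{\!2}\right] = \frac{1}{\sigma^{2}}
\quad\Longrightarrow\quad
\text{CRLB} = \frac{\sigma^{2}}{n} = \operatorname{Var}(\bar{X}).
\]
% Since E(\bar X) = mu and Var(\bar X) attains the CRLB,
% \bar X is a minimum variance unbiased estimator of mu.
```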

Example: Let X be a random variable taken from a binomial population
with parameters n and θ. Use the Cramér–Rao inequality to show that
Θ̂ = X /n is a minimum variance unbiased estimator of θ.
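A sketch (added here, not the slide's worked solution), viewing X as the sum of n Bernoulli(θ) trials with per-trial density f(x; θ) = θ^x (1 − θ)^(1−x):

```latex
\[
\frac{\partial \ln f(x)}{\partial \theta}
= \frac{x}{\theta} - \frac{1-x}{1-\theta}
= \frac{x-\theta}{\theta(1-\theta)},
\qquad
E\!\left[\left(\frac{X_1-\theta}{\theta(1-\theta)}\right)^{\!2}\right]
= \frac{1}{\theta(1-\theta)},
\]
\[
\text{CRLB} = \frac{\theta(1-\theta)}{n}
= \operatorname{Var}\!\left(\frac{X}{n}\right),
\qquad
E\!\left(\frac{X}{n}\right) = \theta.
\]
% X/n is unbiased and its variance attains the bound, hence it is an MVUE.
```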

Understanding Mean Square Error (MSE)

Why Variance is Not Always Enough:
If an estimator is biased, variance alone is not a sufficient measure of its
performance.
Definition (Mean Square Error (MSE)): If Θ̂ is an estimator of θ, its
mean square error is:

\[
\operatorname{MSE}(\hat{\Theta}) := E\big[(\hat{\Theta} - \theta)^{2}\big]
\]

▶ The MSE accounts for both variance and bias.
▶ If the estimator is unbiased, the MSE reduces to its variance.

Decomposition of MSE: The mean square error can be decomposed as:

\[
\operatorname{MSE}(\hat{\Theta}) = \operatorname{Var}(\hat{\Theta}) + [b(\hat{\Theta})]^{2},
\]

where b(Θ̂) = E(Θ̂) − θ is the bias of Θ̂.
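The decomposition follows in one line by adding and subtracting E(Θ̂), a step worth recording:

```latex
\[
E\big[(\hat{\Theta}-\theta)^{2}\big]
= E\Big[\big(\hat{\Theta}-E(\hat{\Theta}) + E(\hat{\Theta})-\theta\big)^{2}\Big]
= \operatorname{Var}(\hat{\Theta}) + [b(\hat{\Theta})]^{2}.
\]
% The cross term 2 b(Theta_hat) E[Theta_hat - E(Theta_hat)] vanishes
% because E[Theta_hat - E(Theta_hat)] = 0.
```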

MSE and Biased Estimators

Key Insight:
▶ If we are not restricted to unbiased estimators, the decomposition
formula suggests that a biased estimator might have a lower MSE than
an unbiased one.
▶ This means biased estimators can sometimes be preferable if MSE is
the primary criterion.

Example: Let X1 , X2 , . . . , Xn be a random sample from a normal
population with µ = 0 and unknown variance θ = σ².
(a) Show that
\[
\hat{\Theta}_1 = \frac{1}{n}\sum_{i=1}^{n} X_i^{2}
\]
is an unbiased estimator for θ with variance 2θ²/n and a minimum
variance unbiased estimator for θ.
(b) Let
\[
\hat{\Theta}_2 = \frac{\alpha}{n}\sum_{i=1}^{n} X_i^{2}
\]
and find the value of α such that the MSE is minimal.
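A sketch of part (b) (added here, not the slide's worked solution; it uses the facts that Xᵢ²/θ ∼ χ²₁, so E(Xᵢ²) = θ and Var(Xᵢ²) = 2θ²):

```latex
\[
E(\hat{\Theta}_2) = \alpha\theta,
\qquad
\operatorname{Var}(\hat{\Theta}_2) = \frac{2\alpha^{2}\theta^{2}}{n},
\qquad
b(\hat{\Theta}_2) = (\alpha-1)\theta,
\]
\[
\operatorname{MSE}(\hat{\Theta}_2)
= \left(\frac{2\alpha^{2}}{n} + (\alpha-1)^{2}\right)\theta^{2},
\qquad
\frac{d}{d\alpha}\operatorname{MSE}(\hat{\Theta}_2) = 0
\;\Longrightarrow\;
\alpha = \frac{n}{n+2}.
\]
% The minimal MSE is 2 theta^2/(n+2), strictly smaller than the
% unbiased estimator's MSE of 2 theta^2/n.
```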

Key Takeaway: Tradeoff Between Bias and Variance

Important Insight:
▶ An estimator with a small bias can have a significantly lower MSE
than an unbiased estimator.
▶ MSE accounts for both variance and bias: MSE = Variance + Bias².
▶ This example demonstrates that in some cases, introducing bias can
improve the overall accuracy of an estimator.
▶ In practice, minimizing MSE is often preferable to insisting on
unbiased estimators.
▶ This is the foundation for regularization techniques in machine
learning, where a small bias is introduced to significantly reduce
variance; the simulation sketch below illustrates the effect.
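The following is a minimal simulation sketch (added here, not from the slides) of the tradeoff in the previous example: shrinking the unbiased estimator by α = n/(n + 2) introduces a small bias but lowers the MSE. Parameter values are illustrative.

```python
# Simulation sketch: bias-variance tradeoff for the variance estimators
# Theta_1 (unbiased) and Theta_2 (shrunken, alpha = n/(n+2)).
import numpy as np

rng = np.random.default_rng(1)
theta, n, reps = 4.0, 10, 100_000     # true variance theta = sigma^2, mu = 0
x = rng.normal(0.0, np.sqrt(theta), size=(reps, n))

sum_sq = (x ** 2).sum(axis=1)
theta1 = sum_sq / n          # unbiased estimator (alpha = 1)
theta2 = sum_sq / (n + 2)    # MSE-optimal choice (alpha = n/(n+2))

mse = lambda est: ((est - theta) ** 2).mean()
print(f"MSE, unbiased : {mse(theta1):.4f}  (theory: {2 * theta**2 / n:.4f})")
print(f"MSE, shrunken : {mse(theta2):.4f}  (theory: {2 * theta**2 / (n + 2):.4f})")
```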

What is Consistency?

Key Idea: An estimator is consistent if it gets closer and closer to the
true parameter as the sample size increases.
Definition: An estimator Θ̂ is consistent for a parameter θ if
\[
\lim_{n \to \infty} P(|\hat{\Theta} - \theta| < \varepsilon) = 1
\quad \text{for every } \varepsilon > 0.
\]
This means that as n grows, the probability that Θ̂ is close to θ
approaches 1 (illustrated in the simulation sketch below).
Intuition:
- If we keep collecting more data, our estimate should get closer to the
true value.
- The probability of making a big mistake should go to zero.
- Unbiasedness is not required, but it helps.
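A quick simulation sketch (added here, not from the slides) makes the definition concrete for the sample mean of a N(µ, 1) sample: the probability in the definition visibly climbs toward 1 as n grows. The values of µ, ε, and the sample sizes are illustrative.

```python
# Simulation sketch: P(|X_bar - mu| < eps) approaches 1 as n grows,
# which is exactly the definition of consistency.
import numpy as np

rng = np.random.default_rng(2)
mu, eps, reps = 3.0, 0.1, 1_000

for n in (10, 100, 1_000, 10_000):
    xbar = rng.normal(mu, 1.0, size=(reps, n)).mean(axis=1)
    prob = (np.abs(xbar - mu) < eps).mean()
    print(f"n = {n:>6}:  P(|X_bar - mu| < {eps}) ~ {prob:.3f}")
```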

Example: Let X1 , X2 , . . . , Xn be a random sample from a continuous
population with the density
\[
f(x) =
\begin{cases}
\dfrac{1}{\theta} & \text{for } 0 < x < \theta \\[4pt]
0 & \text{elsewhere}
\end{cases}
\]
where θ > 0 is an unknown parameter. Determine whether
Yn = max{X1 , X2 , . . . , Xn } is a consistent estimator of θ.
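A sketch (added here, not the slide's worked solution), for any 0 < ε < θ:

```latex
\[
P(Y_n \le y) = \left(\frac{y}{\theta}\right)^{\!n}, \quad 0 \le y \le \theta,
\]
\[
P(|Y_n - \theta| < \varepsilon)
= P(Y_n > \theta - \varepsilon)
= 1 - \left(\frac{\theta-\varepsilon}{\theta}\right)^{\!n}
\longrightarrow 1 \text{ as } n \to \infty,
\]
% since Y_n <= theta always and (theta - eps)/theta < 1.
% Hence Y_n is a consistent estimator of theta.
```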

Example: Suppose X1 , X2 , . . . , Xn is a random sample from a
distribution with the density function
\[
f(x) =
\begin{cases}
e^{-(x-\theta)} & \text{for } x > \theta \\
0 & \text{elsewhere}
\end{cases}
\]
Determine whether Y1 = min{X1 , X2 , . . . , Xn } is a consistent
estimator of θ.
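A sketch (added here, not the slide's worked solution), for any ε > 0:

```latex
\[
P(Y_1 > y) = \big[P(X_1 > y)\big]^{n} = e^{-n(y-\theta)}, \quad y > \theta,
\]
\[
P(|Y_1 - \theta| < \varepsilon)
= P(Y_1 < \theta + \varepsilon)
= 1 - e^{-n\varepsilon}
\longrightarrow 1 \text{ as } n \to \infty,
\]
% since Y_1 >= theta always. Hence Y_1 is a consistent estimator of theta.
```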

A Sufficient Condition for Consistency

▶ Checking consistency directly from the definition is not always easy.
Sometimes the following sufficient condition proves useful:
Theorem: If Θ̂n is an estimator of θ and MSE(Θ̂n ) → 0 as n → ∞,
then Θ̂n is a consistent estimator of θ.
Proof: The proof follows from the Markov inequality, which says that for
a given non-negative random variable Y ,
\[
P(Y \ge \ell) \le \frac{E(Y)}{\ell} \quad \text{for any constant } \ell > 0.
\]
Pick an ε > 0 and take Y = |Θ̂n − θ|². By the Markov inequality with ℓ = ε²,
\[
P(|\hat{\Theta}_n - \theta| > \varepsilon)
= P(|\hat{\Theta}_n - \theta|^{2} > \varepsilon^{2})
\le \frac{\operatorname{MSE}(\hat{\Theta}_n)}{\varepsilon^{2}},
\]
since E(|Θ̂n − θ|²) = MSE(Θ̂n ) by the definition of MSE. As
MSE(Θ̂n ) → 0, the right-hand side tends to 0; hence
P(|Θ̂n − θ| < ε) → 1 and Θ̂n is consistent.
Remark: Recall the decomposition

\[
\operatorname{MSE}(\hat{\Theta}_n) = \operatorname{Var}(\hat{\Theta}_n) + [b(\hat{\Theta}_n)]^{2}.
\]

If Θ̂n is an unbiased or asymptotically unbiased estimator of θ, then

[b(Θ̂n )]² → 0 as n → ∞.

Hence, if Var(Θ̂n ) → 0 as n → ∞, then MSE (Θ̂n ) → 0 as n → ∞.

Example: For a random sample from a normal distribution, verify that
the sample variance is a consistent estimator of σ².
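A sketch via the MSE criterion (added here; it relies on the standard normal-sampling fact that (n − 1)S²/σ² ∼ χ²_{n−1}):

```latex
\[
\frac{(n-1)S^{2}}{\sigma^{2}} \sim \chi^{2}_{n-1}
\;\Longrightarrow\;
E(S^{2}) = \sigma^{2},
\qquad
\operatorname{Var}(S^{2}) = \frac{2\sigma^{4}}{n-1},
\]
% so MSE(S^2) = Var(S^2) = 2 sigma^4/(n-1) -> 0 as n -> infinity,
% and by the sufficient-condition theorem S^2 is consistent for sigma^2.
```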

Example: For a random sample from a normal distribution, the
alternative estimator
\[
\hat{\Theta}_n = \frac{1}{n}\sum_{i=1}^{n} (X_i - \bar{X})^{2}
\]
is also a consistent estimator of σ².
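A sketch (added here), writing Θ̂n = ((n − 1)/n) S² and reusing the χ² facts above:

```latex
\[
b(\hat{\Theta}_n) = E(\hat{\Theta}_n) - \sigma^{2} = -\frac{\sigma^{2}}{n}
\longrightarrow 0,
\qquad
\operatorname{Var}(\hat{\Theta}_n)
= \left(\frac{n-1}{n}\right)^{\!2}\frac{2\sigma^{4}}{n-1}
= \frac{2(n-1)\sigma^{4}}{n^{2}}
\longrightarrow 0,
\]
% so MSE(Theta_hat_n) -> 0 and, despite its small bias,
% Theta_hat_n is a consistent estimator of sigma^2.
```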

