Questions For Unit 4

Estimation - Classical and Bayesian Approach

Classical (Frequentist) Approach


• Focus: Treats the parameter as a fixed but unknown quantity and uses the data alone to infer its value.
• Point Estimation: Uses estimators like Maximum Likelihood Estimation (MLE) or Method of Moments (MoM).
– MLE Formula:
θ̂_MLE = arg max_θ L(θ; X)
• Interval Estimation: Constructs confidence intervals, often using normal approximations.
• Objective: Obtain point estimates for parameters without involving prior knowledge.
• Properties:
– Consistency: θ̂ → θ as n → ∞.
– Unbiasedness: E(θ̂) = θ.
– Efficiency: Minimizes variance among unbiased estimators.
– Sufficiency: Uses all information in the sample.
• Key Method: Maximum Likelihood Estimation (MLE); a minimal numerical sketch follows this list.
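To make the MLE definition concrete, here is a minimal Python sketch (not part of the original notes): the Bernoulli model, the simulated data, and the helper neg_log_lik are illustrative assumptions, and scipy's bounded scalar minimizer is used only to show that the arg max can be found numerically.

import numpy as np
from scipy.optimize import minimize_scalar

# Hypothetical data: 200 Bernoulli trials with success probability 0.3 (theta is fixed but unknown).
rng = np.random.default_rng(1)
x = rng.binomial(1, 0.3, size=200)

# Negative log-likelihood of the Bernoulli model; minimizing it maximizes the likelihood.
def neg_log_lik(theta):
    return -np.sum(x * np.log(theta) + (1 - x) * np.log(1 - theta))

# theta_hat_MLE = arg max L(theta; X), found here by a bounded numerical search.
res = minimize_scalar(neg_log_lik, bounds=(1e-6, 1 - 1e-6), method="bounded")
print("numerical MLE:", res.x, " closed form (sample proportion):", x.mean())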

Bayesian Approach
• Focus: Treats parameters as random variables with prior distributions.
• Bayes’ Theorem:
p(θ|X) = p(X|θ) p(θ) / p(X)
• Posterior Distribution: Combines prior and likelihood to form the posterior,
p(θ|X) ∝ p(X|θ) · p(θ)
• Objective: Incorporate prior beliefs with sample evidence to update knowledge.
• Properties:
– Flexible, subjective.
– Allows prior updates with new data.
– Uses posterior predictive distribution for inference.

Summary:
• Philosophy: Classical estimation treats parameters as fixed values, while Bayesian estimation treats parameters
as random variables with distributions.
• Prior Information: Bayesian estimation incorporates prior beliefs through prior distributions, while classical
estimation uses data alone.
• Uncertainty Quantification: Classical methods typically use point estimates and confidence intervals, while
Bayesian methods provide a full distribution (posterior) and credible intervals.
• Computation: Classical estimation is often simpler, while Bayesian methods require more complex computational
techniques such as MCMC (a minimal sketch follows this list).
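As a small illustration of the last point, the following is a toy random-walk Metropolis sketch; the N(0, 10²) prior on a normal mean, the unit-variance likelihood, the simulated data, and the step size 0.3 are all assumptions made purely for demonstration, not a recommended sampler.

import numpy as np

# Toy posterior: prior mu ~ N(0, 10^2), likelihood X_i ~ N(mu, 1), 50 simulated observations.
rng = np.random.default_rng(0)
data = rng.normal(loc=2.0, scale=1.0, size=50)

def log_posterior(mu):
    log_prior = -0.5 * (mu / 10.0) ** 2          # N(0, 10^2) prior, up to an additive constant
    log_lik = -0.5 * np.sum((data - mu) ** 2)    # N(mu, 1) likelihood, up to an additive constant
    return log_prior + log_lik

# Random-walk Metropolis: propose a nearby value, accept with the usual log-ratio rule.
samples, mu_current = [], 0.0
for _ in range(5000):
    proposal = mu_current + rng.normal(scale=0.3)
    if np.log(rng.uniform()) < log_posterior(proposal) - log_posterior(mu_current):
        mu_current = proposal
    samples.append(mu_current)

print("approximate posterior mean:", np.mean(samples[1000:]))   # discard burn-in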

Methods of Estimation
Point Estimation
• A single value estimate of a parameter.
• Methods:
– MLE: Maximizes the likelihood function;
θ̂_MLE = arg max_θ L(θ|X)

– Method of Moments (MoM): Sets sample moments equal to population moments.


– Bayesian Estimation: Uses posterior mean, median, or mode as estimates.

Interval Estimation
• Provides a range within which the parameter lies with a certain confidence.
• Confidence Interval:
[ θ̂ − z_{α/2} · σ_θ̂ , θ̂ + z_{α/2} · σ_θ̂ ]

• Method of Moments (MoM):

– Sets sample moments equal to population moments to solve for parameters.


– Example: for a parameter θ, set E(X) equal to the sample mean and solve for θ.

• Maximum Likelihood Estimation (MLE):


– Maximizes the likelihood function with respect to parameters.
• Bayesian Estimation:
– Computes posterior mean, median, or mode based on the posterior distribution.
– Posterior Mean (see the numerical sketch after this list):
E(θ|X) = ∫ θ p(θ|X) dθ
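The posterior-mean integral above can be checked numerically. The sketch below assumes, purely for illustration, that the posterior is Beta(3, 5), so the numeric answer can be compared to the exact mean 3/8.

import numpy as np
from scipy.stats import beta
from scipy.integrate import trapezoid

# Assume, for illustration, that the posterior p(theta | X) is Beta(3, 5).
theta = np.linspace(0.0, 1.0, 10001)
posterior = beta.pdf(theta, 3, 5)

# E(theta | X) = integral of theta * p(theta | X) d theta, evaluated with the trapezoidal rule.
post_mean = trapezoid(theta * posterior, theta)
print(post_mean)   # ~0.375, the exact Beta(3, 5) mean 3 / (3 + 5)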

Likelihood and EM Algorithm


• Likelihood Function:
L(θ; X) = p(X|θ)

• Log-Likelihood:
ℓ(θ; X) = log(L(θ; X))

• Properties of Likelihood:
– Consistency: θ̂MLE → θ as n → ∞.
– Asymptotic Normality:
√n (θ̂_MLE − θ) ∼ N(0, I(θ)⁻¹)
where I(θ) is the Fisher information.
– The likelihood is the objective function maximized to obtain the MLE.
– For large samples, the MLE is approximately normally distributed (by the asymptotic normality above).
• EM Algorithm:
– Purpose: Estimate parameters in models with latent variables.
– Properties:
∗ Iterative improvement.
∗ Converges to a local maximum of the likelihood.

– Iterative algorithm for finding MLE when data are incomplete or have latent variables.
– Steps:
∗ E-step: Compute the expected complete-data log-likelihood Q(θ|θ^(t)) under the current estimate θ^(t).
∗ M-step: Maximize Q(θ|θ^(t)) with respect to θ to obtain the updated estimate θ^(t+1).

Prior Distributions
Conjugate Priors
• Definition: Prior and posterior distributions are in the same family.
• Examples:
– Normal prior for normal likelihood.
– Beta prior for binomial likelihood.
• Benefit: Simplifies computation of posterior.

Informative Prior
• Reflects specific prior knowledge about the parameter.

• Example: Using expert data for priors.

Non-informative Prior
• Represents lack of prior information (e.g., uniform distribution).

• Objective: Minimize influence of prior on posterior.

Loss Functions
• Purpose: Quantify the cost of estimation errors.

• Common Loss Functions:


– Squared Error Loss:
L(θ, θ̂) = (θ − θ̂)2
– Absolute Error Loss:
L(θ, θ̂) = |θ − θ̂|
– Zero-One Loss:
L(θ, θ̂) = I(θ ̸= θ̂)

Risk Function
• Definition: Expected value of the loss function, taken over the sampling distribution of the data for a given θ.
• Formula:
R(θ, θ̂) = Eθ [L(θ, θ̂)]

• Bayes Risk: The expected posterior loss; Bayesian decision-making chooses the estimator that minimizes it.

Examples
Example (Classical): Suppose you have a sample of heights from a population and want to estimate the population
mean µ.
Let’s assume the sample heights are X = {170, 165, 180, 175, 160}.
Sample Mean (Point Estimate):
µ̂ = (1/n) Σᵢ Xᵢ = (170 + 165 + 180 + 175 + 160) / 5 = 170

Confidence Interval for Mean (assuming a normal distribution with unknown variance): Compute the sample
standard deviation s:
s = √[ (1/(n−1)) Σᵢ (Xᵢ − µ̂)² ] = √[ (1/4) Σᵢ (Xᵢ − 170)² ] = 7.91

For a 95% confidence level, with t_{0.025, 4} ≈ 2.776:

CI = [ µ̂ − t_{α/2, n−1} · s/√n , µ̂ + t_{α/2, n−1} · s/√n ]
   = (170 − 2.776 × 3.54, 170 + 2.776 × 3.54) = (160.17, 179.83)
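A quick numerical check of these figures (assuming numpy and scipy are available; the variable names are arbitrary):

import numpy as np
from scipy import stats

x = np.array([170, 165, 180, 175, 160])
n = len(x)
xbar = x.mean()                           # 170.0
s = x.std(ddof=1)                         # ~7.906
t_crit = stats.t.ppf(0.975, df=n - 1)     # ~2.776
half = t_crit * s / np.sqrt(n)            # ~9.82
print(xbar - half, xbar + half)           # ~(160.2, 179.8), matching the interval above up to rounding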


Example (Bayesian): Assume a prior belief that the population mean µ is normally distributed with µ₀ = 160 and
variance σ₀² = 25.
The likelihood (data distribution) is also normal, X ∼ N(µ, σ²), with σ² = 16.
Posterior Mean (using the conjugate normal prior, treating X̄ as a single observation with variance σ²):

µ_posterior = (σ₀² · X̄ + σ² · µ₀) / (σ₀² + σ²) = (25 · 170 + 16 · 160) / (25 + 16) ≈ 166.10
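A one-line check of the conjugate-update arithmetic above:

# Conjugate normal update from the example: prior N(160, 25), data mean 170 with variance 16.
sigma0_sq, mu0 = 25.0, 160.0
sigma_sq, xbar = 16.0, 170.0
mu_post = (sigma0_sq * xbar + sigma_sq * mu0) / (sigma0_sq + sigma_sq)
print(mu_post)   # ~166.10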

2. Methods of Estimation
Example (Maximum Likelihood Estimation): Suppose X1 , X2 , . . . , Xn are i.i.d. samples from an exponential
distribution with unknown rate λ: f (x|λ) = λe−λx .
Likelihood Function:
L(λ) = Π_{i=1}^{n} λ e^(−λXᵢ) = λ^n e^(−λ Σᵢ Xᵢ)
Log-Likelihood:
ℓ(λ) = n ln λ − λ Σᵢ Xᵢ
Maximize by taking the derivative:
dℓ/dλ = n/λ − Σᵢ Xᵢ = 0  ⇒  λ̂ = n / Σᵢ Xᵢ
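A short sketch of the closed-form estimator λ̂ = n/ΣXᵢ on simulated data (the true rate 1.5 and the sample size 500 are illustrative assumptions):

import numpy as np

# Hypothetical exponential sample with true rate lambda = 1.5 (numpy uses scale = 1/lambda).
rng = np.random.default_rng(2)
x = rng.exponential(scale=1 / 1.5, size=500)

# Closed-form MLE derived above: lambda_hat = n / sum(X_i).
lam_hat = len(x) / x.sum()
print(lam_hat)   # should be close to 1.5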
Example (Method of Moments): Suppose you have data from a distribution with unknown mean µ and variance σ².
For a normal distribution, the first moment is E[X] = µ and the second central moment is E[(X − µ)²] = σ².
Equating sample moments to population moments: sample mean X̄ = µ and sample variance S² = σ².
Thus, the method of moments estimates are µ̂ = X̄ and σ̂² = S².
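A minimal sketch of these moment estimates on a simulated normal sample (the true values µ = 10 and σ² = 4 are assumptions for illustration):

import numpy as np

# Hypothetical normal sample with true mu = 10 and sigma^2 = 4 (i.e. sigma = 2).
rng = np.random.default_rng(3)
x = rng.normal(loc=10.0, scale=2.0, size=1000)

# Method of moments: match the first moment and the second central moment.
mu_hat = x.mean()
sigma2_hat = x.var()     # plain (divide-by-n) sample second central moment
print(mu_hat, sigma2_hat)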

3. Likelihood and Expectation-Maximization (EM) Algorithm


Example (Likelihood): Suppose X1 , X2 , . . . , Xn are i.i.d. samples from a normal distribution N (µ, σ 2 ).
Likelihood Function:
L(µ, σ²) = Π_{i=1}^{n} (1/√(2πσ²)) e^(−(Xᵢ−µ)²/(2σ²))
Log-Likelihood:
ℓ(µ, σ²) = −(n/2) ln(2πσ²) − (1/(2σ²)) Σ_{i=1}^{n} (Xᵢ − µ)²
To find the MLEs µ̂ and σ̂ 2 , differentiate ℓ with respect to µ and σ 2 , set to zero, and solve.
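Solving those equations gives the familiar closed forms µ̂ = X̄ and σ̂² = (1/n) Σᵢ (Xᵢ − X̄)². A minimal sketch on simulated data (the true values are assumed for illustration):

import numpy as np

# Hypothetical normal sample; the MLEs are the sample mean and the (1/n)-variance.
rng = np.random.default_rng(4)
x = rng.normal(loc=5.0, scale=3.0, size=2000)

mu_mle = x.mean()
sigma2_mle = np.mean((x - mu_mle) ** 2)   # divides by n, not n - 1
print(mu_mle, sigma2_mle)                 # close to 5 and 9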
Example (EM Algorithm): Assume you observe data X from a mixture of two normal distributions with unknown
means µ1 , µ2 and common variance σ 2 .
E-Step: Compute the probability each observation belongs to each component, given the current parameter estimates.
M-Step: Use these probabilities to update the parameter estimates (e.g., means and variances) by maximizing the
expected log-likelihood.
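A compact sketch of this EM loop for a two-component normal mixture with common variance, following the description above; the simulated data, the starting values, and the fixed number of iterations are all assumptions made for illustration.

import numpy as np
from scipy.stats import norm

# Hypothetical data: mixture of N(0, 1) and N(5, 1) with equal weights.
rng = np.random.default_rng(5)
x = np.concatenate([rng.normal(0, 1, 300), rng.normal(5, 1, 300)])

# Arbitrary initial guesses.
mu1, mu2, sigma, pi1 = -1.0, 6.0, 2.0, 0.5

for _ in range(100):
    # E-step: responsibility of each component for each observation.
    p1 = pi1 * norm.pdf(x, mu1, sigma)
    p2 = (1 - pi1) * norm.pdf(x, mu2, sigma)
    gamma1 = p1 / (p1 + p2)
    gamma2 = 1 - gamma1

    # M-step: weighted updates of the means, the common variance, and the mixing weight.
    mu1 = np.sum(gamma1 * x) / np.sum(gamma1)
    mu2 = np.sum(gamma2 * x) / np.sum(gamma2)
    sigma = np.sqrt(np.sum(gamma1 * (x - mu1) ** 2 + gamma2 * (x - mu2) ** 2) / len(x))
    pi1 = gamma1.mean()

print(mu1, mu2, sigma, pi1)   # should approach roughly 0, 5, 1, 0.5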

4. Prior Distributions
Example (Conjugate Prior with Beta-Binomial): Assume X ∼ Binomial(n, θ) with a beta prior θ ∼ Beta(α, β).
Posterior Distribution: Since the beta prior is conjugate, the posterior is also a Beta distribution:
θ|X ∼ Beta(α + X, β + n − X)
Interpretation: Posterior updates based on the observed successes X and failures n − X, blending prior beliefs with
new data.
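A tiny sketch of this Beta-Binomial update, assuming for illustration a Beta(2, 2) prior and X = 7 successes out of n = 10 trials:

from scipy.stats import beta

# Prior Beta(2, 2); observe X = 7 successes in n = 10 Bernoulli trials.
a, b, n, X = 2, 2, 10, 7

# Conjugacy: the posterior is Beta(a + X, b + n - X) = Beta(9, 5).
posterior = beta(a + X, b + n - X)
print(posterior.mean())   # (a + X) / (a + b + n) = 9/14, about 0.643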

5. Loss Functions
Example (Squared Error Loss): Suppose you want to estimate the parameter θ = 5 and your estimate θ̂ = 4.
Squared Error Loss:
L(θ, θ̂) = (θ − θ̂)2 = (5 − 4)2 = 1
Example (Bayesian Decision with Loss Function): For estimating a parameter with squared error loss, the
Bayes estimator is the posterior mean.
Suppose the posterior distribution of θ after observing data is θ|X ∼ N (10, 2).
Posterior Mean: Since the Bayes estimator minimizes squared error loss, the best estimate of θ is 10.

6. Risk Function
Example (Risk Function for a Specific Estimator): Suppose X ∼ N (µ, 1) and you use µ̂ = X as the estimator for
µ.
Squared Error Risk: Since X is an unbiased estimator, R(µ, µ̂) = E[(X − µ)2 ] = Var(X) = 1.
The risk, or expected loss, is constant at 1, regardless of µ.
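A Monte Carlo check that this risk really is constant in µ (the particular µ values and the sample size are arbitrary):

import numpy as np

# Estimator mu_hat = X for X ~ N(mu, 1); its squared-error risk should be 1 for every mu.
rng = np.random.default_rng(6)
for mu in (-3.0, 0.0, 10.0):
    X = rng.normal(loc=mu, scale=1.0, size=200_000)
    print(mu, np.mean((X - mu) ** 2))   # ~1.0 in each case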

Problem:
Suppose you have a sample of n = 5 observations from a distribution with the probability density function (PDF) given
by:

f(x; θ) = θ x^(θ−1),  0 ≤ x ≤ 1,  θ > 0
where θ is the unknown parameter.

1. Method of Moments: Use the method of moments to estimate the parameter θ.


2. Maximum Likelihood Estimation (MLE): Find the Maximum Likelihood Estimate (MLE) for θ.
3. EM Algorithm: Suppose the observed data come from a mixture of two distributions with the same form (but different parameters) and the goal is to estimate the parameters of the mixture. Set up the Expectation-Maximization (EM) algorithm for this problem.

Solution:
1. Method of Moments:
The method of moments is used to estimate parameters by equating the sample moments to the population moments.
The first population moment (mean) for the given distribution is:

E[X] = ∫₀¹ x f(x; θ) dx = ∫₀¹ x · θ x^(θ−1) dx = ∫₀¹ θ x^θ dx

This simplifies to:

E[X] = θ / (θ + 1)

Now, the sample mean is:

x̄ = (1/n) Σ_{i=1}^{n} xᵢ

Equating the sample mean to the population mean:

x̄ = θ / (θ + 1)

Solving for θ:

θ = x̄ / (1 − x̄)

Thus, the method of moments estimate for θ is:

θ̂_MM = x̄ / (1 − x̄)
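A short sketch of θ̂_MM on simulated data; it assumes the true θ = 3 and uses numpy's Beta sampler, since f(x; θ) = θx^(θ−1) on [0, 1] is the Beta(θ, 1) density. With only n = 5 observations the estimate is naturally noisy.

import numpy as np

# f(x; theta) = theta * x**(theta - 1) on [0, 1] is the Beta(theta, 1) density.
rng = np.random.default_rng(7)
theta_true = 3.0
x = rng.beta(theta_true, 1.0, size=5)     # n = 5 observations, as in the problem

xbar = x.mean()
theta_mm = xbar / (1 - xbar)              # method-of-moments estimate derived above
print(theta_mm)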

2. Maximum Likelihood Estimation (MLE):


To find the MLE, we first write down the likelihood function for a sample of size n:
L(θ) = Π_{i=1}^{n} f(xᵢ; θ) = Π_{i=1}^{n} θ xᵢ^(θ−1)

This simplifies to:

L(θ) = θ^n Π_{i=1}^{n} xᵢ^(θ−1)

The log-likelihood function is:

log L(θ) = n log θ + (θ − 1) Σ_{i=1}^{n} log xᵢ

To maximize the log-likelihood, we take the derivative with respect to θ:

d/dθ log L(θ) = n/θ + Σ_{i=1}^{n} log xᵢ

Setting the derivative equal to 0:

n/θ + Σ_{i=1}^{n} log xᵢ = 0

Solving for θ, the MLE estimate is:

θ̂_MLE = −n / Σ_{i=1}^{n} log xᵢ
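The corresponding MLE on the same kind of simulated sample (again assuming θ = 3 purely for illustration):

import numpy as np

rng = np.random.default_rng(8)
x = rng.beta(3.0, 1.0, size=5)            # 5 draws from f(x; theta) with theta = 3

# MLE derived above: theta_hat = -n / sum(log x_i).
theta_mle = -len(x) / np.sum(np.log(x))
print(theta_mle)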

3. EM Algorithm:
In this case, suppose the observed data comes from a mixture of two distributions with the same form but different
parameters: f (x; θ1 ) and f (x; θ2 ). The goal is to estimate the parameters θ1 and θ2 .
E-step (Expectation step): Given the current estimates of θ1 and θ2 , compute the responsibilities, i.e., the
probabilities of each data point belonging to each distribution:

γ_{i1} = π₁ f(xᵢ; θ₁) / (π₁ f(xᵢ; θ₁) + π₂ f(xᵢ; θ₂))
γ_{i2} = π₂ f(xᵢ; θ₂) / (π₁ f(xᵢ; θ₁) + π₂ f(xᵢ; θ₂))
where π1 and π2 are the mixing coefficients.
M-step (Maximization step): Update the parameter estimates θ1 , θ2 , π1 , and π2 based on the responsibilities:
θ̂₁ = − (Σᵢ γ_{i1}) / (Σᵢ γ_{i1} log xᵢ)
θ̂₂ = − (Σᵢ γ_{i2}) / (Σᵢ γ_{i2} log xᵢ)
(these are the weighted versions of the MLE from part 2, obtained by maximizing the expected complete-data log-likelihood with respect to θ₁ and θ₂)
The mixing coefficients are updated as:
π̂₁ = (1/n) Σ_{i=1}^{n} γ_{i1}
π̂₂ = (1/n) Σ_{i=1}^{n} γ_{i2}
Repeat the E-step and M-step iteratively until convergence.
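A minimal sketch of this EM loop; the simulated mixture (θ₁ = 0.5, θ₂ = 5, equal weights), the starting values, and the fixed 200 iterations are assumptions made for illustration.

import numpy as np

def f(x, theta):
    # Component density f(x; theta) = theta * x**(theta - 1) on [0, 1].
    return theta * x ** (theta - 1)

# Hypothetical mixture data: half from theta1 = 0.5, half from theta2 = 5 (Beta(theta, 1) samples).
rng = np.random.default_rng(9)
x = np.concatenate([rng.beta(0.5, 1, 400), rng.beta(5.0, 1, 400)])

theta1, theta2, pi1 = 0.3, 3.0, 0.5       # arbitrary initial guesses
for _ in range(200):
    # E-step: responsibilities.
    p1 = pi1 * f(x, theta1)
    p2 = (1 - pi1) * f(x, theta2)
    g1 = p1 / (p1 + p2)
    g2 = 1 - g1
    # M-step: weighted MLE updates for theta1, theta2 and the mixing-weight update.
    theta1 = -g1.sum() / np.sum(g1 * np.log(x))
    theta2 = -g2.sum() / np.sum(g2 * np.log(x))
    pi1 = g1.mean()

print(theta1, theta2, pi1)   # should approach roughly 0.5, 5, 0.5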
