
THE UNIVERSITY OF NEW SOUTH WALES

DEPARTMENT OF STATISTICS

Solutions to selected exercises for MATH5905, Statistical Inference

Part two: Data reduction. Sufficient statistics. Classical estimation


Question 1: a) Denoting $T = \sum_{i=1}^{n} X_i$, you can factorise $L(X, \mu)$ with
$$h(X) = \exp\Big(-\frac{1}{2}\sum_{i=1}^{n} X_i^2\Big), \qquad g(T, \mu) = \exp\Big(-\frac{n}{2}\mu^2\Big)\exp(T\mu)\,\frac{1}{(\sqrt{2\pi})^n}.$$
b) Denoting $T = \sum_{i=1}^{n} X_i^2$, you can factorise $L(X, \sigma^2)$ with
$$h(X) = 1, \qquad g(T, \sigma^2) = \exp\Big(-\frac{1}{2\sigma^2}T\Big)\,\frac{1}{(\sqrt{2\pi}\,\sigma)^n}.$$

c) For a point $x$ and a set $A$, we use the notation
$$I_A(x) = I(x \in A) = \begin{cases} 1 & \text{if } x \in A, \\ 0 & \text{if } x \notin A. \end{cases}$$
Then
$$L(X, \theta) = \prod_{i=1}^{n} I_{(\theta,\theta+1)}(x_i) = I_{(\theta,\theta+1)}(x_{(n)})\, I_{(\theta,\theta+1)}(x_{(1)}) = I_{(x_{(n)}-1,\,\infty)}(\theta)\, I_{(-\infty,\,x_{(1)})}(\theta).$$
Hence $T = (X_{(1)}, X_{(n)})'$ can be taken as a sufficient vector-statistic.
d) Denoting $T = \sum_{i=1}^{n} X_i$, you can factorise $L(X, \lambda)$ with
$$g(T, \lambda) = \exp(-n\lambda)\lambda^{T} \quad \text{and} \quad h(X) = \frac{1}{\prod_{i=1}^{n} X_i!}.$$
According to the factorisation criterion, $T$ is sufficient.


Now, using the definition and noting that $T = \sum_{i=1}^{n} X_i \sim Po(n\lambda)$, we have
$$P(X = x \mid T = t) = \frac{P(X = x \cap T = t)}{P(T = t)} = \begin{cases} 0 & \text{if } \sum_{i=1}^{n} x_i \neq t, \\[4pt] \dfrac{P(X = x)}{P(\sum_{i=1}^{n} X_i = t)} & \text{if } \sum_{i=1}^{n} x_i = t. \end{cases}$$
Since $\sum_{i=1}^{n} X_i \sim Po(n\lambda)$, the latter expression on the right can be shown to be equal to
$$\frac{t!}{n^{t}\prod_{i=1}^{n} x_i!}$$
and it does not depend on $\lambda$. Hence $T = \sum_{i=1}^{n} X_i$ is sufficient according to the original definition of sufficiency.
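If you wish to check the last formula numerically, here is a small sketch (not part of the original solution; the sample x, the sample size n and the two values of λ are arbitrary illustrative choices). It computes $P(X = x \mid T = t)$ exactly from the Poisson pmf and compares it with $t!/(n^t\prod_i x_i!)$:

```python
# Sketch (illustrative values only): exact numerical check of the Question 1 d) formula
# P(X = x | T = t) = t! / (n^t * prod(x_i!)) for a Poisson sample.
from math import exp, factorial, prod

def poisson_pmf(k, mu):
    return exp(-mu) * mu**k / factorial(k)

x = [2, 0, 3, 1]             # a particular sample, with t = 6
n, t = len(x), sum(x)

for lam in (0.5, 2.0):       # the conditional probability should not change with lambda
    p_joint = prod(poisson_pmf(xi, lam) for xi in x)   # P(X = x), which forces T = t
    p_T = poisson_pmf(t, n * lam)                      # T ~ Po(n * lambda)
    print(lam, p_joint / p_T)

print("formula:", factorial(t) / (n**t * prod(factorial(xi) for xi in x)))
```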

Question 2: For $S = X_1 + X_2 + X_3$ sufficiency is already known ($n = 3$ is a special case of the general case considered in the lecture). To show that $T = X_1 X_2 + X_3$ is not sufficient, it suffices to show that, say, $f_{(X_1,X_2,X_3)\mid T=1}(0, 0, 1 \mid 1)$ does depend on $p$. You can see that
$$f_{(X_1,X_2,X_3)\mid T=1}(0, 0, 1 \mid 1) = \frac{P(X_1 = 0 \cap X_2 = 0 \cap X_3 = 1 \cap T = 1)}{P(T = 1)} = \frac{(1-p)^2 p}{3p^2(1-p) + p(1-p)^2} = \frac{1-p}{1+2p}.$$
Hence $T = X_1 X_2 + X_3$ is not sufficient for $p$.
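As an illustration (assuming, as the computation above suggests, that $X_1, X_2, X_3$ are i.i.d. Bernoulli($p$)), the following sketch enumerates all eight outcomes and shows that the conditional probability indeed changes with $p$:

```python
# Sketch (illustrative, assuming X1, X2, X3 i.i.d. Bernoulli(p)): compute
# P(X = (0,0,1) | X1*X2 + X3 = 1) exactly by enumeration and compare with (1 - p)/(1 + 2p).
from itertools import product

def conditional_prob(p):
    num = den = 0.0
    for x1, x2, x3 in product((0, 1), repeat=3):
        prob = 1.0
        for x in (x1, x2, x3):
            prob *= p if x == 1 else 1 - p
        if x1 * x2 + x3 == 1:
            den += prob
            if (x1, x2, x3) == (0, 0, 1):
                num += prob
    return num / den

for p in (0.2, 0.7):
    print(p, conditional_prob(p), (1 - p) / (1 + 2 * p))
```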
Question 3: We will show that T1 = X1 + X2 is sufficient but T2 = X1 X2 is not sufficient. By a direct check
we have

$$\begin{aligned}
&P(X_1 = 0 \cap X_2 = 0 \mid X_1 + X_2 = 0) = 1,\\
&P(X_1 = 1 \cap X_2 = 0 \mid X_1 + X_2 = 0) = P(X_1 = 1 \cap X_2 = 1 \mid X_1 + X_2 = 0) = P(X_1 = 0 \cap X_2 = 1 \mid X_1 + X_2 = 0) = 0,\\
&P(X_1 = 1 \cap X_2 = 0 \mid X_1 + X_2 = 1) = \frac{\theta(4-\theta)/12}{\theta(4-\theta)/6} = \frac{1}{2} = P(X_1 = 0 \cap X_2 = 1 \mid X_1 + X_2 = 1),\\
&P(X_1 = 0 \cap X_2 = 0 \mid X_1 + X_2 = 1) = 0 = P(X_1 = 1 \cap X_2 = 1 \mid X_1 + X_2 = 1),\\
&P(X_1 = 1 \cap X_2 = 1 \mid X_1 + X_2 = 2) = \frac{\theta(\theta-1)/12}{\theta(\theta-1)/12} = 1,\\
&P(X_1 = 0 \cap X_2 = 1 \mid X_1 + X_2 = 2) = P(X_1 = 1 \cap X_2 = 0 \mid X_1 + X_2 = 2) = 0,\\
&P(X_1 = 0 \cap X_2 = 0 \mid X_1 + X_2 = 2) = 0,
\end{aligned}$$

and we see that in all possible cases the conditional distribution does not involve the parameter
$\theta$. However, for $T_2 = X_1 X_2$ we can see, by following the same pattern, that
$$P(X_1 = 1 \cap X_2 = 0 \mid X_1 X_2 = 0) = \frac{4\theta - \theta^2}{\theta - \theta^2 + 12}.$$
This clearly depends on $\theta$, hence $T_2$ is not sufficient.
Question 4: The conditional probability $P(X = x \mid X_1 = x_1)$ is the probability $P(X_2 = x_2 \cap \cdots \cap X_n = x_n)$ and it depends on $p$, since for each $i$ we have
$$P(X_i = x_i) = p^{x_i}(1-p)^{1-x_i}.$$
Hence $X_1$ alone is not sufficient.
Question 5: We need to show that at least in some cases there is explicit dependence on $\theta$ of the conditional distribution of the vector $\binom{X_1}{X_2}$ given the statistic $T = X_1 + X_2$. We note that the possible realisations of $T$ are $t = 2, 3, \ldots, 2\theta$. We examine $P\big(\binom{X_1}{X_2} = \binom{x_1}{x_2} \mid X_1 + X_2 = x\big)$. Of course, if $x_1 + x_2 \neq x$, this conditional probability is zero and does not involve $\theta$. Let us now study the case $x_1 + x_2 = x$. We have two scenarios.

First scenario: $2 \le x \le \theta$. Then
$$P\Big(\binom{X_1}{X_2} = \binom{x_1}{x_2} \,\Big|\, X_1 + X_2 = x\Big) = \frac{P(X_1 = x_1 \cap X_2 = x - x_1)}{\sum_{i=1}^{x-1} P(X_1 = i \cap X_2 = x - i)} = \frac{(1/\theta)^2}{(x-1)(1/\theta)^2} = \frac{1}{x-1},$$
which does not involve $\theta$.

Second scenario: $\theta < x \le 2\theta$. Then
$$P\Big(\binom{X_1}{X_2} = \binom{x_1}{x_2} \,\Big|\, X_1 + X_2 = x\Big) = \frac{P(X_1 = x_1 \cap X_2 = x - x_1)}{\sum_{i=x-\theta}^{\theta} P(X_1 = i \cap X_2 = x - i)} = \frac{(1/\theta)^2}{(2\theta - x + 1)(1/\theta)^2} = \frac{1}{2\theta - x + 1}.$$
In the second case, the conditional distribution explicitly involves $\theta$, hence $T = X_1 + X_2$ cannot be sufficient for $\theta$.
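A small sketch (assuming $X_1$ and $X_2$ are i.i.d. uniform on $\{1, \ldots, \theta\}$, as the probabilities $(1/\theta)^2$ above indicate) illustrating the dependence on $\theta$ in the second scenario:

```python
# Sketch (illustrative, assuming X1, X2 i.i.d. uniform on {1, ..., theta}): for a sum x with
# theta < x <= 2*theta the conditional probability equals 1/(2*theta - x + 1) and so
# changes with theta, which is exactly why T = X1 + X2 cannot be sufficient here.
def conditional_prob(theta, x1, x2):
    x = x1 + x2
    pairs = [(i, x - i) for i in range(1, theta + 1) if 1 <= x - i <= theta]
    return 1.0 / len(pairs)   # every admissible pair has the same probability (1/theta)^2

for theta in (4, 6, 8):
    # x = theta + 1 always lies in the second scenario; the answer 1/theta depends on theta
    print(theta, conditional_prob(theta, 1, theta), 1 / (2 * theta - (theta + 1) + 1))
```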

Question 6: The solution is similar to that of Question 4 above; we leave it as an exercise for you.
Question 7: a) The ratio takes the form
$$\frac{L(x, \lambda)}{L(y, \lambda)} = \lambda^{\sum_{i=1}^{n} x_i - \sum_{i=1}^{n} y_i}\,\frac{\prod_{i=1}^{n} (y_i)!}{\prod_{i=1}^{n} (x_i)!}$$
and this does not depend on $\lambda$ if and only if $\sum_{i=1}^{n} x_i = \sum_{i=1}^{n} y_i$. Hence $T = \sum_{i=1}^{n} X_i$ is minimal sufficient.

b) The ratio takes the form
$$\frac{L(x, \sigma^2)}{L(y, \sigma^2)} = \exp\Big(-\frac{1}{2\sigma^2}\Big(\sum_{i=1}^{n} x_i^2 - \sum_{i=1}^{n} y_i^2\Big)\Big).$$
This does not depend on $\sigma^2$ if and only if $\sum_{i=1}^{n} x_i^2 = \sum_{i=1}^{n} y_i^2$. Hence $T(X) = \sum_{i=1}^{n} X_i^2$ is minimal sufficient.
c) Similarly, $T = \prod_{i=1}^{n} X_i$ is minimal sufficient. We can also take $\tilde{T} = \sum_{i=1}^{n} \log X_i$ as minimal sufficient.

d) We have
$$\frac{L(x, \theta)}{L(y, \theta)} = \frac{I_{(x_{(n)},\infty)}(\theta)}{I_{(y_{(n)},\infty)}(\theta)}.$$
This has to be considered as a function of $\theta$ for fixed $x_{(n)}$ and $y_{(n)}$. Assume that $x_{(n)} \neq y_{(n)}$ and, to be specific, let $x_{(n)} > y_{(n)}$ first. Then the ratio $\frac{L(x,\theta)}{L(y,\theta)}$ is:
– not defined if $\theta \le y_{(n)}$,
– equal to zero when $\theta \in (y_{(n)}, x_{(n)}]$,
– equal to one when $\theta > x_{(n)}$.
In other words, the ratio's value depends on the position of $\theta$ on the real axis, that is, it is a function of $\theta$. A similar conclusion is reached if $x_{(n)} < y_{(n)}$ (do it yourself). Hence the ratio does not depend on $\theta$ if and only if $x_{(n)} = y_{(n)}$. This implies that $T = X_{(n)}$ is minimal sufficient.

e) $T = (X_{(1)}, X_{(n)})$ is minimal sufficient. We know from 1c) that $L(x, \theta)$ depends on the sample via $x_{(1)}$ and $x_{(n)}$ only. If $x = (x_1, \ldots, x_n)$ and $y = (y_1, \ldots, y_n)$ are such that either $x_{(1)} \neq y_{(1)}$ or $x_{(n)} \neq y_{(n)}$ (or both), then $\frac{L(x,\theta)}{L(y,\theta)}$ will take different values on different intervals, that is, it will depend on $\theta$. For this not to happen, $x_{(1)} = y_{(1)}$ and $x_{(n)} = y_{(n)}$ must hold.

f) Similar to e). $T = (X_{(1)}, X_{(n)})$ is minimal sufficient.


Question 8: a) Since
$$L(x, \theta) = \theta^n \Big(\prod_{i=1}^{n} x_i\Big)^{\theta-1},$$
we see by the factorisation criterion that $T = \prod_{i=1}^{n} x_i$ is sufficient. Note that $\tilde{T} = \sum_{i=1}^{n} \log x_i$ is also sufficient since it is a one-to-one transformation of $T$.

b) Since
$$L(x, \theta) = \frac{1}{(6\theta^4)^n}\Big(\prod_{i=1}^{n} x_i^3\Big)e^{-(\sum_{i=1}^{n} x_i)/\theta},$$
we can factorise with $h(x) = \prod_{i=1}^{n} x_i^3$, $g(t, \theta) = \frac{1}{(6\theta^4)^n}e^{-t/\theta}$, where $t = \sum_{i=1}^{n} x_i$.

Question 9, 10: Left for you as exercises. I have treated the location case for the Cauchy family in the lectures; the scale case is along the same lines.

Question 11: Parts (a) to (d) were covered during the lectures. For part (e), look at the score representation.
Question 12: Take $\hat{\tau} = I_{\{X_1 = 0 \,\cap\, X_2 = 0\}}(X)$. Then we have that $E(\hat{\tau}) = e^{-2\lambda}$ (that is, $\hat{\tau}$ is unbiased for $\tau(\lambda) = e^{-2\lambda}$). Then the UMVUE would be
$$E\Big(\hat{\tau} \,\Big|\, \sum_{i=1}^{n} X_i = t\Big) = 1 \cdot P\Big(\hat{\tau} = 1 \,\Big|\, \sum_{i=1}^{n} X_i = t\Big).$$
We know that $\sum_{i=1}^{n} X_i \sim Po(n\lambda)$. The unbiased estimator is
$$a(t) = \frac{P(X_1 = 0 \cap X_2 = 0 \cap \sum_{i=1}^{n} X_i = t)}{P(\sum_{i=1}^{n} X_i = t)} = \frac{P(X_1 = 0 \cap X_2 = 0 \cap \sum_{i=3}^{n} X_i = t)}{P(\sum_{i=1}^{n} X_i = t)} = \frac{(n-2)^t}{n^t} = \Big(1 - \frac{2}{n}\Big)^t.$$
We can check directly that this estimator is unbiased for $\tau(\lambda)$ (although this is not necessary: we have stated a general theorem that Rao-Blackwellization preserves unbiasedness). The calculation is included below as an additional exercise:
$$E\big(a(T)\big) = \sum_{t=0}^{\infty} \Big(1 - \frac{2}{n}\Big)^t \frac{e^{-n\lambda}(n\lambda)^t}{t!} = e^{-n\lambda}\sum_{t=0}^{\infty} \frac{[\lambda(n-2)]^t}{t!} = e^{-2\lambda}.$$

The variance given by the Cramer-Rao lower bound is
$$\frac{(\tau'(\lambda))^2}{nI_{X_1}(\lambda)} = \frac{\lambda(-2e^{-2\lambda})^2}{n} = \frac{4\lambda e^{-4\lambda}}{n}.$$
For the variance of the unbiased estimator, we have
$$\operatorname{Var}(a(T)) = \sum_{t=0}^{\infty} \Big(1 - \frac{2}{n}\Big)^{2t} \frac{e^{-n\lambda}(n\lambda)^t}{t!} - (e^{-2\lambda})^2 = e^{-n\lambda}\sum_{t=0}^{\infty} \frac{(n-2)^{2t}\lambda^t}{n^t\,t!} - e^{-4\lambda} = e^{-n\lambda}e^{(n-4+\frac{4}{n})\lambda} - e^{-4\lambda} = e^{-4\lambda}\big[e^{4\lambda/n} - 1\big] > 0.$$
The latter value is strictly larger than the bound:
$$e^{-4\lambda}\Big[e^{4\lambda/n} - 1 - \frac{4\lambda}{n}\Big] = e^{-4\lambda}\Big(\frac{1}{2!}\Big(\frac{4\lambda}{n}\Big)^2 + \frac{1}{3!}\Big(\frac{4\lambda}{n}\Big)^3 + \ldots\Big) > 0.$$
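For those who like to see it empirically, here is a Monte Carlo sketch (illustrative n and λ, not part of the solution) checking the unbiasedness of $a(T)$ and the two variance expressions:

```python
# Sketch (Monte Carlo, illustrative n and lam): check that a(T) = (1 - 2/n)^T with
# T = sum of the Poisson sample is unbiased for exp(-2*lam), and that its variance
# agrees with exp(-4*lam) * (exp(4*lam/n) - 1) derived above.
import numpy as np

rng = np.random.default_rng(0)
n, lam, reps = 10, 1.3, 200_000

T = rng.poisson(lam, size=(reps, n)).sum(axis=1)
a = (1 - 2 / n) ** T

print("mean of a(T):", a.mean(), " target:", np.exp(-2 * lam))
print("var  of a(T):", a.var(),  " target:", np.exp(-4 * lam) * (np.exp(4 * lam / n) - 1))
print("CR lower bound:", 4 * lam * np.exp(-4 * lam) / n)
```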
Question 13: This is again just to refresh some required, useful technical skills.
$$f_X(x) = \int_0^x 8xy\,dy = 4x^3 \quad \text{if } x \in (0,1) \text{ (and zero else)},$$
$$f_Y(y) = \int_y^1 8xy\,dx = 4y - 4y^3 \quad \text{if } y \in (0,1) \text{ (and zero else)},$$
$$f_{Y|X}(y|x) = \frac{8xy}{4x^3} = \frac{2y}{x^2} \quad \text{if } 0 < y < x,\ 0 < x < 1 \text{ (and zero else)},$$
$$a(x) = E(Y \mid X = x) = \int_0^x y f_{Y|X}(y|x)\,dy = \frac{2x}{3}, \quad 0 < x < 1,$$
$$E(a(X)) = \int_0^1 a(x) f_X(x)\,dx = \int_0^1 \frac{2x}{3}\,4x^3\,dx = \frac{8}{15},$$
$$E(Y) = 4\int_0^1 y(y - y^3)\,dy = \frac{8}{15}.$$
Similarly, $E a^2(X) = \frac{8}{27}$, $\operatorname{Var}(a(X)) = \frac{8}{27} - \big(\frac{8}{15}\big)^2 = \frac{8}{675}$, and
$$E(Y^2) = \frac{1}{3}, \qquad \operatorname{Var}(Y) = \frac{11}{225},$$
and we see directly that indeed $\operatorname{Var}(a(X)) < \operatorname{Var}(Y)$ holds.
Note again that the fact that conditioning reduces the variance was proved quite generally in the lectures. In this problem we are just verifying that $\operatorname{Var}(a(X)) < \operatorname{Var}(Y)$ holds in a particular example.
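A Monte Carlo sketch of the same check (not required; it simply samples from $f(x, y) = 8xy$ using the conditional factorisation derived above):

```python
# Sketch (Monte Carlo, not part of the original solution): sample from the joint density
# f(x, y) = 8xy on 0 < y < x < 1 via inverse CDFs (X has density 4x^3, and Y | X = x
# has density 2y/x^2), then compare the two variances.
import numpy as np

rng = np.random.default_rng(1)
reps = 500_000

x = rng.uniform(size=reps) ** 0.25          # inverse CDF of f_X(x) = 4x^3
y = x * np.sqrt(rng.uniform(size=reps))     # inverse CDF of f_{Y|X}(y|x) = 2y/x^2

a = 2 * x / 3                               # a(X) = E(Y | X)
print("Var(a(X)):", a.var(), " target 8/675 =", 8 / 675)
print("Var(Y)   :", y.var(), " target 11/225 =", 11 / 225)
```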
Question 14: Steps:
a) $T = \sum_{i=1}^{n} X_i$ is complete and sufficient for $\theta$.
b) If $\hat{\tau} = X_1 X_2$ then $E\hat{\tau} = \theta^2$ (that is, $\hat{\tau}$ is unbiased for $\theta^2$).
c) $a(t) = E(\hat{\tau} \mid T = t) = \cdots = \frac{t(t-1)}{n(n-1)}$, which is the UMVUE.
We can also check the unbiasedness of this estimator directly:
$$E(a(T)) = E\Big[\bar{X}\Big(\frac{n}{n-1}\bar{X} - \frac{1}{n-1}\Big)\Big] = \frac{n}{n-1}E(\bar{X}^2) - \frac{E(\bar{X})}{n-1} = \frac{n}{n-1}\big[\operatorname{Var}(\bar{X}) + (E(\bar{X}))^2\big] - \frac{\theta}{n-1} = \frac{n}{n-1}\Big(\frac{\theta(1-\theta)}{n} + \theta^2\Big) - \frac{\theta}{n-1} = \theta^2.$$
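A quick Monte Carlo sketch (assuming the Bernoulli($\theta$) model of Question 14; illustrative n and θ) confirming the unbiasedness numerically:

```python
# Sketch (Monte Carlo, assuming X1, ..., Xn i.i.d. Bernoulli(theta)): check that
# a(T) = T(T - 1) / (n(n - 1)) is unbiased for theta^2.
import numpy as np

rng = np.random.default_rng(2)
n, theta, reps = 12, 0.3, 200_000

T = rng.binomial(n, theta, size=reps)          # T = sum of the Bernoulli sample
a = T * (T - 1) / (n * (n - 1))
print("mean of a(T):", a.mean(), " target theta^2 =", theta**2)
```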

Question 15: The density $f(x; \theta)$ forms a one-parameter exponential family with $d(x) = x$. Using our general statement from the lecture, we can claim that $T = \sum_{i=1}^{n} X_i$ is complete and minimal sufficient for $\theta$. We also know that for this distribution $E(X_1) = \theta$ and $\operatorname{Var}(X_1) = \theta^2$ hold. Let us calculate:
$$E(\bar{X}^2) = \operatorname{Var}(\bar{X}) + (E(\bar{X}))^2 = \frac{\operatorname{Var}(X_1)}{n} + (EX_1)^2 = \frac{n+1}{n}\theta^2 \neq \theta^2.$$
After bias-correction, by Lehmann-Scheffé's theorem,
$$\frac{n(\bar{X})^2}{n+1} = \frac{T^2}{n(n+1)}$$
is unbiased for $\theta^2$, and since $T$ is complete and sufficient, we conclude that $\frac{T^2}{n(n+1)}$ is the UMVUE for $\theta^2$.
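As a numerical illustration only: the parent distribution is not restated here, so the sketch below assumes $X_1, \ldots, X_n$ i.i.d. exponential with mean $\theta$, which satisfies $E(X_1) = \theta$ and $\operatorname{Var}(X_1) = \theta^2$ as used above:

```python
# Sketch (Monte Carlo; the exponential-with-mean-theta parent is an assumption made only
# for illustration, chosen because it has E(X1) = theta and Var(X1) = theta^2):
# check that T^2 / (n(n + 1)) is unbiased for theta^2.
import numpy as np

rng = np.random.default_rng(3)
n, theta, reps = 8, 2.0, 200_000

T = rng.exponential(theta, size=(reps, n)).sum(axis=1)
est = T**2 / (n * (n + 1))
print("mean of estimator:", est.mean(), " target theta^2 =", theta**2)
```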

Question 16: a) $T = X_{(n)}$ is complete and sufficient for $\theta$, with
$$f_T(t) = \frac{nt^{n-1}}{\theta^n}, \quad 0 < t < \theta.$$
Hence $E(T^2) = \frac{n}{n+2}\theta^2$, so $T_1 = \frac{n+2}{n}T^2$ is an unbiased estimator of $\theta^2$. By Lehmann-Scheffé,
$$\frac{n+2}{n}T^2$$
is the UMVUE. Its variance is
$$E\Big(\frac{n+2}{n}T^2\Big)^2 - \theta^4 = \Big(\frac{n+2}{n}\Big)^2 ET^4 - \theta^4 = \Big(\frac{n+2}{n}\Big)^2 \int_0^\theta \frac{n t^{n+3}}{\theta^n}\,dt - \theta^4 = \theta^4\Big[\frac{(n+2)^2}{n(n+4)} - 1\Big] = \frac{4\theta^4}{n(n+4)}.$$
b) Similar to a): $\frac{n-1}{nT}$ is the UMVUE; its variance is $\frac{1}{n(n-2)\theta^2}$.
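A Monte Carlo sketch for part a) (illustrative n and θ, assuming the Uniform(0, θ) model implied by $f_T$ above):

```python
# Sketch (Monte Carlo, assuming X1, ..., Xn i.i.d. Uniform(0, theta)): check that
# (n + 2)/n * X_(n)^2 is unbiased for theta^2 and that its variance is close to
# 4*theta^4 / (n(n + 4)).
import numpy as np

rng = np.random.default_rng(4)
n, theta, reps = 10, 3.0, 200_000

T = rng.uniform(0, theta, size=(reps, n)).max(axis=1)   # T = X_(n)
est = (n + 2) / n * T**2
print("mean:", est.mean(), " target theta^2 =", theta**2)
print("var :", est.var(),  " target:", 4 * theta**4 / (n * (n + 4)))
```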

Question 17: This is a more difficult (*) question. It is meant to challenge the interested students.

a) The density f (t; θ) in 7a) is also called Gamma(n, θ) density. To show the result, we
could use convolution. Reminder: the convolution formula for the density of the sum of two
independent random variables X, Y :
$$f_{X+Y}(t) = \int_{-\infty}^{\infty} f_X(x) f_Y(t-x)\,dx.$$
In particular, if the random variables are non-negative, the above formula simplifies to
$$f_{X+Y}(t) = \int_{0}^{t} f_X(x) f_Y(t-x)\,dx, \quad \text{if } t > 0 \text{ (and 0 elsewhere)}.$$
Applying it to the two non-negative random variables in our case, we get
$$f_{X_1+X_2}(t) = \int_0^t \theta^2 e^{-\theta x} e^{-t\theta + \theta x}\,dx = \theta^2 e^{-t\theta}\int_0^t dx = \theta^2 t e^{-t\theta},$$

which means that for $n = 2$ the claim is proved (note that $\Gamma(2) = 1$). We apply induction to show the general case. Assume that the formula is true for $T = \sum_{i=1}^{k} X_i$; we want to show that it is then true for $k + 1$. Applying the convolution formula to $\sum_{i=1}^{k+1} X_i = \sum_{i=1}^{k} X_i + X_{k+1}$, we get
$$f_{\sum_{i=1}^{k+1} X_i}(t) = \frac{t^{k}\theta^{k+1}e^{-\theta t}}{\Gamma(k+1)},$$
that is, the claim is true for $k + 1$.

Note: It is possible to give an alternative proof by using the moment generating functions
approach. Try it if you feel familiar enough with moment generating functions.
b) Consider the estimator $\hat{\tau} = I_{\{X_1 > k\}}(X)$. Then $E(\hat{\tau}) = 1 \cdot P(X_1 > k) = \int_k^\infty \theta e^{-\theta x}\,dx = e^{-k\theta}$.
c) Let $T = \sum_{i=1}^{n} X_i$. Consider, for small enough $\Delta x_1$:
$$\begin{aligned}
f_{X_1|T}(x_1|t)\Delta x_1 &= \frac{f_{X_1,T}(x_1, t)\Delta x_1 \Delta t}{f_T(t)\Delta t}\\
&\approx \frac{P[x_1 < X_1 < x_1 + \Delta x_1;\ t < \sum_{i=1}^{n} X_i < t + \Delta t]}{\frac{1}{\Gamma(n)}\theta^n t^{n-1}e^{-\theta t}\Delta t}\\
&\approx \frac{P[x_1 < X_1 < x_1 + \Delta x_1;\ t - x_1 < \sum_{i=2}^{n} X_i < t - x_1 + \Delta t]}{\frac{1}{\Gamma(n)}\theta^n t^{n-1}e^{-\theta t}\Delta t}\\
&\approx \frac{P(x_1 < X_1 < x_1 + \Delta x_1)\,P(t - x_1 < \sum_{i=2}^{n} X_i < t - x_1 + \Delta t)}{\frac{1}{\Gamma(n)}\theta^n t^{n-1}e^{-\theta t}\Delta t}\\
&\approx \frac{\theta e^{-\theta x_1}\,\frac{1}{\Gamma(n-1)}\theta^{n-1}(t - x_1)^{n-2}e^{-\theta(t - x_1)}\,\Delta x_1 \Delta t}{\frac{1}{\Gamma(n)}\theta^n t^{n-1}e^{-\theta t}\Delta t} = (n-1)\frac{(t - x_1)^{n-2}}{t^{n-1}}\Delta x_1.
\end{aligned}$$

Going to the limit as $\Delta x_1$ tends to zero, we get
$$f_{X_1|T}(x_1|t) = \frac{n-1}{t}\Big(1 - \frac{x_1}{t}\Big)^{n-2}, \quad 0 < x_1 < t < \infty.$$
Now we can find the UMVUE. It will be
$$E\big(I_{(k,\infty)}(X_1) \mid T = t\big) = \int_k^t f_{X_1|T}(x_1|t)\,dx_1 = \frac{n-1}{t^{n-1}}\int_k^t (t - x_1)^{n-2}\,dx_1 = \Big(\frac{t-k}{t}\Big)^{n-1}.$$
That is,
$$\Big(\frac{T-k}{T}\Big)^{n-1} I_{(k,\infty)}(T)$$
with $T = \sum_{i=1}^{n} X_i$ is the UMVUE of $e^{-k\theta}$.
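A Monte Carlo sketch (illustrative values of n, θ and k) checking that the statistic above is indeed unbiased for $e^{-k\theta}$:

```python
# Sketch (Monte Carlo, assuming X1, ..., Xn i.i.d. exponential with rate theta, as in the
# density theta*exp(-theta*x) used above): check that ((T - k)/T)^(n-1) * I(T > k)
# is unbiased for exp(-k*theta).
import numpy as np

rng = np.random.default_rng(5)
n, theta, k, reps = 6, 0.8, 1.5, 300_000

T = rng.exponential(1 / theta, size=(reps, n)).sum(axis=1)   # numpy uses the mean 1/theta as scale
umvue = np.where(T > k, ((T - k) / T) ** (n - 1), 0.0)
naive = rng.exponential(1 / theta, size=reps) > k            # the crude unbiased indicator I(X1 > k)

print("UMVUE mean:", umvue.mean(), " target:", np.exp(-k * theta))
print("UMVUE var :", umvue.var(), "  indicator var:", naive.mean() * (1 - naive.mean()))
```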

Question 18: The restriction θ ∈ (0, 1/5) makes sure that the probabilities calculated as a function of θ indeed
belong to [0, 1]. Let $E_\theta h(X) = 0$ for all $\theta \in (0, 1/5)$. This means
$$h(0)\,2\theta^2 + h(1)(\theta - 2\theta^3) + h(2)\theta^2 + h(3)(1 + 2\theta^3 - 3\theta^2 - \theta) = 0.$$
We rewrite the above relationship as follows:
$$[2h(3) - 2h(1)]\theta^3 + [2h(0) + h(2) - 3h(3)]\theta^2 + [h(1) - h(3)]\theta + h(3) = 0$$
for all $\theta \in (0, 1/5)$. A non-zero polynomial of degree three has at most three roots, so a cubic that vanishes for every $\theta \in (0, 1/5)$ must have all of its coefficients equal to zero. Hence $h(3) = 0 \implies h(1) - h(3) = 0 \implies h(1) = 0 \implies 2h(0) + h(2) = 0$. The latter relationship does not necessarily imply that both $h(0) = 0$ and $h(2) = 0$ hold. Hence the family of distributions is not complete.
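To make the conclusion concrete: reading the pmf off the displayed expectation, the non-zero function $h$ with $h(0) = 1$, $h(2) = -2$, $h(1) = h(3) = 0$ has $E_\theta h(X) = 0$ for every $\theta$, which the following small sketch verifies numerically:

```python
# Sketch (numerical illustration): with P(X=0) = 2*theta^2, P(X=1) = theta - 2*theta^3,
# P(X=2) = theta^2, P(X=3) = 1 + 2*theta^3 - 3*theta^2 - theta (read off the expectation
# above), the non-zero h = (1, 0, -2, 0) satisfies 2h(0) + h(2) = 0, so E_theta h(X) = 0.
h = {0: 1.0, 1: 0.0, 2: -2.0, 3: 0.0}

def pmf(theta):
    return {0: 2*theta**2, 1: theta - 2*theta**3,
            2: theta**2,   3: 1 + 2*theta**3 - 3*theta**2 - theta}

for theta in (0.05, 0.1, 0.19):
    p = pmf(theta)
    print(theta, sum(p.values()), sum(h[x] * p[x] for x in p))   # total mass 1, E h(X) = 0
```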
Question 19: Parts 19a), 19b), 19c) were treated in lecture and are complete. We consider 19d) here. We
have to show that T = X(n) is complete. We know that the density of T is

$$f_T(t) = \frac{nt^{n-1}}{\theta^n}, \quad 0 < t < \theta \ \text{(and 0 else)}.$$
Let $E_\theta g(T) = 0$ for all $\theta > 0$. This implies that
$$\int_0^\theta g(t)\frac{nt^{n-1}}{\theta^n}\,dt = 0 = \frac{1}{\theta^n}\int_0^\theta g(t)\,nt^{n-1}\,dt$$
for all $\theta > 0$ must hold. Since $\frac{1}{\theta^n} \neq 0$, we get $\int_0^\theta g(t)\,nt^{n-1}\,dt = 0$ for all $\theta > 0$. Differentiating both sides with respect to $\theta$, we get
$$n g(\theta)\theta^{n-1} = 0$$
for all $\theta > 0$. This implies $g(\theta) = 0$ for all $\theta > 0$, which also means $P_\theta(g(T) = 0) = 1$. In particular, this result implies that $S = \frac{n+1}{n}X_{(n)}$ is the UMVUE of $\tau(\theta) = \theta$ in this model, since $E_\theta S = \theta$ holds (see previous lectures) and $S$ is a function of a sufficient and complete statistic.
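A quick Monte Carlo check (illustrative n and θ) that $S = \frac{n+1}{n}X_{(n)}$ is indeed unbiased for $\theta$:

```python
# Sketch (Monte Carlo, for the Uniform(0, theta) model of Question 19 d)): check that
# S = (n + 1)/n * X_(n) is unbiased for theta.
import numpy as np

rng = np.random.default_rng(6)
n, theta, reps = 7, 5.0, 200_000

S = (n + 1) / n * rng.uniform(0, theta, size=(reps, n)).max(axis=1)
print("mean of S:", S.mean(), " target theta =", theta)
```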
Question 20: The likelihood is
$$L(\mathbf{X},\mathbf{Y}; \mu_1, \sigma_1^2, \mu_2, \sigma_2^2) = \frac{1}{(\sqrt{2\pi})^{n}\sigma_1^{n_1}\sigma_2^{n_2}}\exp\Big\{-\frac{1}{2}\sum_{i=1}^{n_1}\frac{(x_i - \mu_1)^2}{\sigma_1^2} - \frac{1}{2}\sum_{i=1}^{n_2}\frac{(y_i - \mu_2)^2}{\sigma_2^2}\Big\}$$
and the log-likelihood is
$$\ln L = -n\ln(\sqrt{2\pi}) - n_1\ln\sigma_1 - n_2\ln\sigma_2 - \frac{1}{2}\sum_{i=1}^{n_1}\frac{(x_i - \mu_1)^2}{\sigma_1^2} - \frac{1}{2}\sum_{i=1}^{n_2}\frac{(y_i - \mu_2)^2}{\sigma_2^2}.$$
Solving the system of equations
$$\frac{\partial}{\partial\mu_1}\ln L = 0 \quad \text{and} \quad \frac{\partial}{\partial\mu_2}\ln L = 0$$
delivers
$$\hat{\mu}_1 = \bar{X}_{n_1} \quad \text{and} \quad \hat{\mu}_2 = \bar{Y}_{n_2}$$
for the MLE. Using the transformation invariance property, we get $\hat{\theta} = \bar{X}_{n_1} - \bar{Y}_{n_2}$ for the maximum likelihood estimator of $\theta$. Further,
$$\operatorname{Var}(\hat{\theta}) = \operatorname{Var}(\bar{X}_{n_1}) + \operatorname{Var}(\bar{Y}_{n_2}) = \frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n - n_1} = f(n_1).$$
To find the minimum, we set the derivative with respect to $n_1$ equal to zero and solve the resulting equation. This gives $\frac{\sigma_1}{\sigma_2} = \frac{n_1}{n_2}$; in other words, the sample sizes must be proportional to the standard deviations. In particular, if $n$ is fixed, we get $n_1 = \frac{\sigma_1}{\sigma_1+\sigma_2}\,n$.
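A small numerical sketch (illustrative σ1, σ2 and n) confirming where $f(n_1)$ attains its minimum:

```python
# Sketch (numerical check, illustrative sigma values): f(n1) = sigma1^2/n1 + sigma2^2/(n - n1)
# is minimised at n1 = sigma1/(sigma1 + sigma2) * n, as derived above.
import numpy as np

sigma1, sigma2, n = 2.0, 3.0, 100
n1 = np.linspace(1, n - 1, 10_000)
f = sigma1**2 / n1 + sigma2**2 / (n - n1)

print("argmin over the grid:", n1[np.argmin(f)])
print("sigma1/(sigma1+sigma2)*n =", sigma1 / (sigma1 + sigma2) * n)
```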
Question 21: i) The likelihood is
$$L(X; \theta) = \theta^n \prod_{i=1}^{n} x_i^{-2}\, I_{[\theta,\infty)}(x_{(1)}).$$
We consider $L$ as a function of $\theta$ after the sample has been substituted. As $\theta$ moves along the positive half-axis, this function first grows monotonically (while $\theta$ moves between $0$ and $x_{(1)}$) and then drops to zero, since the indicator becomes equal to zero. Hence $L$ is a discontinuous function of $\theta$ and its maximum is attained at $x_{(1)}$. This means that $\hat{\theta}_{mle} = X_{(1)}$.
ii) Using the factorisation criterion, we see that $X_{(1)}$ is sufficient. It is also minimal sufficient due to dimension considerations. The minimal sufficiency can also be shown by directly examining the ratio $\frac{L(X;\theta)}{L(Y;\theta)}$.
Question 22: a) The likelihood is
$$L(X; \theta) = \theta^n \Big(\prod_{i=1}^{n} x_i\Big)^{\theta-1}$$
with log-likelihood
$$\ln L(X; \theta) = n\ln\theta + (\theta - 1)\sum_{i=1}^{n} \ln x_i.$$
Setting the score equal to zero,
$$\frac{\partial}{\partial\theta}\ln L = \frac{n}{\theta} + \sum_{i=1}^{n} \ln x_i = 0,$$
gives the root
$$\hat{\theta} = \hat{\theta}_{mle} = \frac{-n}{\sum_{i=1}^{n} \ln x_i}.$$
Then, using the transformation invariance property, we get
$$\widehat{\tau(\theta)} = \frac{\hat{\theta}}{\hat{\theta} + 1}.$$

b) We have that
$$\sqrt{n}(\hat{\theta} - \theta) \to_d N\Big(0, \frac{1}{I_{X_1}(\theta)}\Big).$$
We need to find $I_{X_1}(\theta)$. To this end, we take
$$\ln f(x; \theta) = \ln\theta + (\theta - 1)\ln x; \qquad \frac{\partial}{\partial\theta}\ln f(x; \theta) = \frac{1}{\theta} + \ln x; \qquad \frac{\partial^2}{\partial\theta^2}\ln f(x; \theta) = -\frac{1}{\theta^2}.$$
This means that $I_{X_1}(\theta) = \frac{1}{\theta^2}$ and
$$\sqrt{n}(\hat{\theta} - \theta) \to_d N(0, \theta^2).$$
Since $\tau(\theta) = \frac{\theta}{\theta+1}$, by applying the delta method we get
$$\sqrt{n}(\hat{\tau} - \tau) \to_d N\Big(0, \frac{\theta^2}{(1+\theta)^4}\Big).$$
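A Monte Carlo sketch of this delta-method result (assuming the density $f(x; \theta) = \theta x^{\theta-1}$ on $(0,1)$ that corresponds to the likelihood in part a); the values of n and θ are illustrative):

```python
# Sketch (Monte Carlo, assuming f(x; theta) = theta * x^(theta - 1) on (0, 1), so that
# X = U^(1/theta) for U ~ Uniform(0, 1)): check the delta-method variance
# theta^2 / (1 + theta)^4 for tau_hat = theta_hat / (theta_hat + 1).
import numpy as np

rng = np.random.default_rng(7)
n, theta, reps = 200, 2.0, 20_000

x = rng.uniform(size=(reps, n)) ** (1 / theta)
theta_hat = -n / np.log(x).sum(axis=1)
tau_hat = theta_hat / (theta_hat + 1)

z = np.sqrt(n) * (tau_hat - theta / (theta + 1))
print("empirical var:", z.var(), " target:", theta**2 / (1 + theta)**4)
```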
c) According to the factorisation criterion, $\prod_{i=1}^{n} X_i$ is sufficient (also, $\sum_{i=1}^{n} \ln X_i$ is sufficient). Since the density belongs to a one-parameter exponential family, we have completeness as well.
The statistic $T = \sum_{i=1}^{n} X_i$ is not sufficient. Consider, for example, $0 < t < 1$, $n = 2$, $T = X_1 + X_2$. Using the convolution formula (see the previous tutorial sheet) we have
$$f_{X_1+X_2}(t) = \theta^2\int_0^t x^{\theta-1}(t-x)^{\theta-1}\,dx.$$
Changing the variables $x = ty$, $dx = t\,dy$, we can continue to obtain
$$f_{X_1+X_2}(t) = t^{2\theta-1}\theta^2\int_0^1 y^{\theta-1}(1-y)^{\theta-1}\,dy = t^{2\theta-1}\theta^2 B(\theta, \theta).$$
Then the conditional density becomes
$$f_{(X_1,X_2)|T}(x_1, x_2|t) = \frac{\theta^2(x_1 x_2)^{\theta-1}}{t^{2\theta-1}\theta^2 B(\theta, \theta)}$$
(if $x_1 + x_2 = t$, and, of course, zero elsewhere). Hence the conditional density of the sample given the value of the statistic does depend on the parameter.
d) Looking at
$$\frac{\partial}{\partial\theta}\ln L = -n\Big(\frac{-\sum_{i=1}^{n} \ln x_i}{n} - \frac{1}{\theta}\Big),$$
we see that for $\frac{1}{\theta}$ the CRLB will be attained. This means that $\frac{1}{\theta}$ can be estimated by the UMVUE
$$T = -\frac{\sum_{i=1}^{n} \ln X_i}{n}.$$
The attainable bound is easily seen to be
$$\frac{1}{n\theta^2}.$$

Question 23: a) The density of a single observation is
$$f(x; \mu, \sigma^2) = \frac{1}{\sqrt{2\pi}\,\sigma}e^{-\frac{(x-\mu)^2}{2\sigma^2}},$$
where only $\sigma^2$ is assumed unknown. Then
$$\ln L(X; \sigma^2) = -n\ln(\sqrt{2\pi}) - \frac{n}{2}\ln(\sigma^2) - \frac{1}{2}\sum_{i=1}^{n}\frac{(x_i - \mu)^2}{\sigma^2}.$$
Then the equation
$$\frac{\partial}{\partial\sigma^2}\ln L = -\frac{n}{2\sigma^2} + \frac{1}{2}\sum_{i=1}^{n}\frac{(x_i - \mu)^2}{\sigma^4} = 0$$
has a root
$$\hat{\sigma}^2 = \frac{1}{n}\sum_{i=1}^{n}(x_i - \mu)^2,$$
which is also the MLE.
Further,
$$\ln f(x; \mu, \sigma^2) = -\ln(\sqrt{2\pi}) - \frac{1}{2}\ln(\sigma^2) - \frac{1}{2}\frac{(x-\mu)^2}{\sigma^2},$$
$$\frac{\partial}{\partial\sigma^2}\ln f = -\frac{1}{2\sigma^2} + \frac{1}{2}\frac{(x-\mu)^2}{\sigma^4},$$
$$\frac{\partial^2}{\partial\sigma^2\partial\sigma^2}\ln f = \frac{1}{2\sigma^4} - \frac{(x-\mu)^2}{\sigma^6}.$$
Taking $-E(\ldots)$ in the last equation gives $I_{X_1}(\sigma^2) = \frac{1}{2\sigma^4}$. Hence
$$\sqrt{n}(\hat{\sigma}^2 - \sigma^2) \to_d N(0, 2\sigma^4).$$

b) We apply the delta method. First, we notice that
$$\hat{\sigma}_{mle} = \sqrt{\hat{\sigma}^2} = \hat{\sigma} = \sqrt{\frac{1}{n}\sum_{i=1}^{n}(X_i - \mu)^2}$$
is the MLE (due to the transformation invariance property). Now
$$\sqrt{n}(\hat{\sigma} - \sigma) \to_d N\Big(0, \Big(\frac{\partial}{\partial\sigma^2}h\Big)^2 2\sigma^4\Big),$$
where $h(\sigma^2) = \sqrt{\sigma^2}$. Hence $\frac{\partial}{\partial\sigma^2}h(\sigma^2) = \frac{1}{2\sigma}$ and we get, after substitution,
$$\sqrt{n}(\hat{\sigma} - \sigma) \to_d N(0, \sigma^2/2).$$
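A Monte Carlo sketch (illustrative values, with µ known) checking the limiting variance $\sigma^2/2$:

```python
# Sketch (Monte Carlo, illustrative values, mu assumed known): check that
# sqrt(n) * (sigma_hat - sigma) has variance close to sigma^2 / 2 for
# sigma_hat = sqrt(mean((X - mu)^2)).
import numpy as np

rng = np.random.default_rng(8)
n, mu, sigma, reps = 300, 1.0, 2.0, 20_000

x = rng.normal(mu, sigma, size=(reps, n))
sigma_hat = np.sqrt(((x - mu) ** 2).mean(axis=1))

z = np.sqrt(n) * (sigma_hat - sigma)
print("empirical var:", z.var(), " target sigma^2/2 =", sigma**2 / 2)
```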

Question 24: a) i) The MLE of $\lambda$ is $\bar{X}$, hence the MLE of $\tau(\lambda) = \frac{1}{\lambda}$ would be $\hat{\tau} = \frac{1}{\bar{X}}$.
ii) Since $P(\bar{X} = 0) > 0$, we get that even the first moment of $\hat{\tau}$ is infinite (not to mention the second), and there is no finite variance.
iii) The delta method gives us
$$\sqrt{n}\Big(\frac{1}{\bar{X}} - \frac{1}{\lambda}\Big) \to_d N\Big(0, \frac{1}{\lambda^4}I_{X_1}^{-1}(\lambda)\Big)$$
(since in our case $h(\lambda) = \frac{1}{\lambda}$, $\frac{\partial}{\partial\lambda}h(\lambda) = -\frac{1}{\lambda^2}$). But, as you can easily see (and we discussed in the lectures), for $Po(\lambda)$ we have $I_{X_1}(\lambda) = \frac{1}{\lambda}$, therefore
$$\sqrt{n}\Big(\frac{1}{\bar{X}} - \frac{1}{\lambda}\Big) \to_d N\Big(0, \frac{1}{\lambda^3}\Big).$$
(Comparing the outcomes in (ii) and (iii), we see that although a finite variance does not exist, the asymptotic variance is well defined ($= \frac{1}{\lambda^3}$).)

b) i) $\bar{X}$ is the MLE and, using the delta method, we get
$$\sqrt{n}\big(\sqrt{\bar{X}} - \sqrt{\lambda}\big) \to_d N\Big(0, \Big(\frac{1}{2\sqrt{\lambda}}\Big)^2\lambda\Big) = N\Big(0, \frac{1}{4}\Big).$$
(Since the asymptotic variance becomes constant ($= \frac{1}{4}$) and does not depend on the parameter, we call the transformation $h(\lambda) = \sqrt{\lambda}$ a variance stabilising transformation.)
ii) $\sqrt{\bar{X}} \pm \frac{z_{\alpha/2}}{2\sqrt{n}}$ would be the confidence interval for $\sqrt{\lambda}$, and
$$\Big(\Big(\sqrt{\bar{X}} - \frac{z_{\alpha/2}}{2\sqrt{n}}\Big)^2,\ \Big(\sqrt{\bar{X}} + \frac{z_{\alpha/2}}{2\sqrt{n}}\Big)^2\Big)$$
would be the confidence interval for $\lambda$.
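A Monte Carlo sketch (illustrative n and λ) of the empirical coverage of this interval at the nominal 95% level:

```python
# Sketch (Monte Carlo, illustrative n and lambda): empirical coverage of the approximate
# confidence interval for lambda built from the variance-stabilising transformation above,
# with nominal level 95% (z_{alpha/2} = 1.96).
import numpy as np

rng = np.random.default_rng(9)
n, lam, z, reps = 50, 3.0, 1.96, 100_000

xbar = rng.poisson(lam, size=(reps, n)).mean(axis=1)
lower = (np.sqrt(xbar) - z / (2 * np.sqrt(n))) ** 2
upper = (np.sqrt(xbar) + z / (2 * np.sqrt(n))) ** 2

print("empirical coverage:", np.mean((lower <= lam) & (lam <= upper)))
```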
