Stat2602 Chapter6 Part 1

This document discusses advanced topics in hypothesis testing, focusing on powerful tests and the Neyman-Pearson Lemma, which provides a method for constructing the most powerful tests for simple hypotheses. It defines the criteria for a most powerful critical region and illustrates the application of these concepts through examples involving normal and binomial distributions. Additionally, it introduces the concept of uniformly most powerful tests for composite hypotheses.

Uploaded by

jeffsiu456

Stat2602 Probability and Statistics II Fall 2014-2015

Chapter VI Further Topics in Hypothesis Testing

§ 6.1 Powerful Tests

For general statistical tests, hypotheses may not be explicitly stated in terms of the
population mean or population variance. In such cases, the test statistic cannot be
determined intuitively as in the previous chapter. To construct a test in such situations, we
often hold the probability of a type I error fixed and look for the test statistic
that minimizes the probability of a type II error, or equivalently, that maximizes
the power.

For simplicity, we start with constructing such tests for simple hypotheses.

Definition

When testing $H_0: \theta = \theta_0$ against $H_1: \theta = \theta_1$, the critical (rejection) region $C$ is said
to be a most powerful critical region of size $\alpha$ if the following conditions hold:

1. It has size $\alpha$, i.e. $K_C(\theta_0) = \alpha$.

2. If $D$ is any other critical region with size $K_D(\theta_0) \le \alpha$, then $K_C(\theta_1) \ge K_D(\theta_1)$.

Here $K_C(\theta) = P(\mathbf{X} \in C \mid \theta)$ is the power function of the test with critical region $C$.

The corresponding test

“Reject $H_0$ if $\mathbf{X} \in C$”

is said to be a most powerful test of size $\alpha$.

In other words, a most powerful test of size $\alpha$ has power no smaller than that of any other
$\alpha$-level test.

To construct a most powerful test for these simple hypotheses, we can make use of
the likelihood ratio

$$\frac{L(\theta_1; \mathbf{X})}{L(\theta_0; \mathbf{X})}$$

as it is expected to take a large value if the alternative hypothesis is true, i.e. a large
value of the likelihood ratio should lead to the rejection of the null hypothesis.
Based on this reasoning, the following theorem guarantees a most powerful test.


§ 6.1.1 Neyman-Pearson Lemma

Neyman-Pearson Lemma

When testing $H_0: \theta = \theta_0$ against $H_1: \theta = \theta_1$, the critical region

$$C = \left\{ \mathbf{X} : \frac{L(\theta_1; \mathbf{X})}{L(\theta_0; \mathbf{X})} \ge k \right\}$$

for some constant $k > 0$, is a most powerful critical region of size $\alpha = K_C(\theta_0)$.

Hence the corresponding most powerful test takes the form

“Reject $H_0$ if $\dfrac{L(\theta_1; \mathbf{X})}{L(\theta_0; \mathbf{X})} \ge k$.”

Proof

(In this proof, we assume that X has a continuous distribution. The proof for the
discrete case is similar, with integrals replaced by summations.)

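Suppose that $D$ is a test with size $\alpha' = K_D(\theta_0) \le \alpha$. With $k > 0$ the constant that defines $C$,

$$
\begin{aligned}
K_C(\theta_1) - K_D(\theta_1) - k(\alpha - \alpha')
&= K_C(\theta_1) - K_D(\theta_1) - k\left[K_C(\theta_0) - K_D(\theta_0)\right] \\
&= \left[K_C(\theta_1) - kK_C(\theta_0)\right] - \left[K_D(\theta_1) - kK_D(\theta_0)\right] \\
&= \int_C \left[L(\theta_1; \mathbf{x}) - kL(\theta_0; \mathbf{x})\right] d\mathbf{x} - \int_D \left[L(\theta_1; \mathbf{x}) - kL(\theta_0; \mathbf{x})\right] d\mathbf{x} \\
&= \int \left(I_{\mathbf{x} \in C} - I_{\mathbf{x} \in D}\right)\left[L(\theta_1; \mathbf{x}) - kL(\theta_0; \mathbf{x})\right] d\mathbf{x}
\end{aligned}
$$

where $I_{\mathbf{x} \in C}$ and $I_{\mathbf{x} \in D}$ are indicator variables such that $I_{\mathbf{x} \in C} = 1$ if $\mathbf{x} \in C$, and $0$ otherwise.

If $\mathbf{x} \in C$, then $I_{\mathbf{x} \in C} = 1 \ge I_{\mathbf{x} \in D}$ and $L(\theta_1; \mathbf{x}) - kL(\theta_0; \mathbf{x}) \ge 0$.

If $\mathbf{x} \notin C$, then $I_{\mathbf{x} \in C} = 0 \le I_{\mathbf{x} \in D}$ and $L(\theta_1; \mathbf{x}) - kL(\theta_0; \mathbf{x}) < 0$.

As a result, the integrand is always non-negative and so is the integral, i.e.

$$K_C(\theta_1) - K_D(\theta_1) \ge k(\alpha - \alpha') \ge 0 \quad (\because \alpha' \le \alpha)$$

and hence $K_C(\theta_1) \ge K_D(\theta_1)$. Therefore, the critical region $C$ is most powerful.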
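
The lemma can be checked by brute force on a small discrete model. The sketch below is only an illustration, not part of the proof: it assumes a single binomial observation with $n = 10$, $p_0 = 0.5$ and $p_1 = 0.7$ (values chosen for convenience), builds the likelihood-ratio region $C$, and compares its power with that of every other region whose size does not exceed the size of $C$. The helper names `pmf` and `prob` are made up for this example.

```python
from fractions import Fraction
from itertools import combinations
from math import comb

n = 10
p0, p1 = Fraction(1, 2), Fraction(7, 10)   # illustrative simple hypotheses

def pmf(x, p):
    """Binomial probability P(X = x) as an exact fraction."""
    return comb(n, x) * p**x * (1 - p)**(n - x)

def prob(region, p):
    """P(X in region) under success probability p."""
    return sum(pmf(x, p) for x in region)

# Likelihood-ratio region {x : L(p1; x) / L(p0; x) >= k}.
# k is set to the ratio at x = 9, so that C = {9, 10}.
k = pmf(9, p1) / pmf(9, p0)
C = [x for x in range(n + 1) if pmf(x, p1) / pmf(x, p0) >= k]
size_C, power_C = prob(C, p0), prob(C, p1)

# Largest power among all 2^11 regions D whose size does not exceed that of C.
best_power = max(
    prob(D, p1)
    for r in range(n + 2)
    for D in combinations(range(n + 1), r)
    if prob(D, p0) <= size_C
)

print("C =", C)
print("size of C =", float(size_C))
print("power of C =", float(power_C))
print("no competing region beats C:", best_power == power_C)
```

Exact fractions are used so that the comparison of sizes is not affected by rounding.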


Example 6.1

Suppose that a random sample X 1 , X 2 ,..., X n  from a normal population N  ,  02 


with known  0 is to be used to test

H 0 :   0 vs H1 :   1 ,

where 1  0 . The likelihood ratio can be obtained as

 1 2 
2  exp  2    xi  x   n x  1   
n
2 n 2 2

L1 ; x   2 0  i1 
0


L  0 ; x 
2 02 n 2 exp  1 2   xi  x 2  nx  0 2  
n

 2 0  i1 
 n 2 

 exp 2  x  0    x  1  
2

 2 0 
 n1  0 2 x  0  1  
 exp 
 2 0
2

Use the Neyman-Pearson lemma, a most powerful test would reject H 0 whenever

 n1  0 2 X  0  1  
exp   k
 2 2
0 
1  2 02 
 X   log k  0  1 
2  n1  0  

for some constant k  0 , where k is obtained such that the size of the test is  . In
actual practice, the most powerful test can be obtained as

“Reject H 0 if X  c ”

such that
 c  0 
  P X  c |   0   1   
0 n 
0
 c  0  Z
n


Therefore the most powerful test is

“Reject $H_0$ if $\bar{X} \ge \mu_0 + Z_\alpha \dfrac{\sigma_0}{\sqrt{n}}$”

which is just the Z-test derived in Section 5.4.

Note that if $\mu_1 < \mu_0$, then $\mu_1 - \mu_0$ becomes negative and the above inequality
should be solved as

$$
\exp\left\{ \frac{n(\mu_1 - \mu_0)(2\bar{X} - \mu_0 - \mu_1)}{2\sigma_0^2} \right\} \ge k
\iff \bar{X} \le \frac{1}{2}\left[ \frac{2\sigma_0^2}{n(\mu_1 - \mu_0)} \log k + \mu_0 + \mu_1 \right].
$$

The most powerful test should take the form

“Reject $H_0$ if $\bar{X} \le c$”

with the value of $c$ obtained as $c = \mu_0 - Z_\alpha \dfrac{\sigma_0}{\sqrt{n}}$.

Example 6.2

Suppose that a single observation $X$ from a binomial distribution $b(n, p)$ is to be
used to test

$H_0: p = p_0$ vs $H_1: p = p_1$,

where $p_1 > p_0$. The likelihood ratio can be obtained as

$$
\frac{L(p_1; x)}{L(p_0; x)}
= \frac{\binom{n}{x} p_1^x (1 - p_1)^{n-x}}{\binom{n}{x} p_0^x (1 - p_0)^{n-x}}
= \left( \frac{1 - p_1}{1 - p_0} \right)^n \left( \frac{p_1(1 - p_0)}{p_0(1 - p_1)} \right)^x
$$


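For any $k > 0$,

$$
\begin{aligned}
\left( \frac{1 - p_1}{1 - p_0} \right)^n \left( \frac{p_1(1 - p_0)}{p_0(1 - p_1)} \right)^x \ge k
&\iff n \log\frac{1 - p_1}{1 - p_0} + x \log\frac{p_1(1 - p_0)}{p_0(1 - p_1)} \ge \log k \\
&\iff x \ge \left[ \log k - n \log\frac{1 - p_1}{1 - p_0} \right] \Big/ \log\frac{p_1(1 - p_0)}{p_0(1 - p_1)}
\end{aligned}
$$

where the last step uses $\log\dfrac{p_1(1 - p_0)}{p_0(1 - p_1)} > 0$, which holds because $p_1 > p_0$.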

Therefore from the Neyman-Pearson lemma, the test with the form

“Reject $H_0$ if $X > c$”

is a most powerful test. Since $X$ is discrete, we may not be able to find a value of $c$
such that the test has size exactly equal to $\alpha$. For an $\alpha$-level test, the value of $c$
should be taken as the smallest integer such that

$$P(X > c \mid p = p_0) \le \alpha.$$

For example, suppose that $n = 10$. To test

$H_0: p = 0.5$ vs $H_1: p = 0.7$

at the 5% significance level, we need to find the smallest integer $c$ such that

$$P(X > c \mid p = 0.5) = \sum_{x=c+1}^{10} \binom{10}{x} 0.5^x \, 0.5^{10-x} \le 0.05.$$

By trial and error,

put $c = 7$: $\sum_{x=8}^{10} \binom{10}{x} 0.5^{10} = 0.0547 > 0.05$;

put $c = 8$: $\sum_{x=9}^{10} \binom{10}{x} 0.5^{10} = 0.0107 \le 0.05$.

Therefore we take $c = 8$ and the test becomes

“Reject $H_0$ if $X > 8$”

The size of this test is 0.0107.
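
The trial-and-error search above is easy to automate. The following is a minimal sketch using only the Python standard library; the function names `upper_tail` and `smallest_cutoff` are made up for this illustration.

```python
from math import comb

def upper_tail(c, n, p):
    """P(X > c) for X ~ b(n, p)."""
    return sum(comb(n, x) * p**x * (1 - p)**(n - x) for x in range(c + 1, n + 1))

def smallest_cutoff(n, p0, alpha):
    """Smallest integer c with P(X > c | p = p0) <= alpha."""
    for c in range(n + 1):
        if upper_tail(c, n, p0) <= alpha:
            return c   # always reached, since P(X > n) = 0

n, p0, p1, alpha = 10, 0.5, 0.7, 0.05
c = smallest_cutoff(n, p0, alpha)

print(c)                                   # 8
print(round(upper_tail(c - 1, n, p0), 4))  # 0.0547, too large for a 5% test
print(round(upper_tail(c, n, p0), 4))      # 0.0107, the size of the test
print(round(upper_tail(c, n, p1), 4))      # 0.1493, the power at p = 0.7
```

The last line shows the price of discreteness in this example: because the attainable size 0.0107 is well below 0.05, the power at $p = 0.7$ is only about 0.15.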

For large samples, we can use the normal approximation to obtain a most powerful
test with size approximately equal to  , as described in Section 5.9.


§ 6.1.2 Uniformly Most Powerful Test

In this section, we consider the construction of most powerful tests for certain
testing problems involving composite hypotheses.

Definition

A size-$\alpha$ critical region $C$ for testing $H_0: \theta \in \Theta_0$ against $H_1: \theta \in \Theta_1$ is said to be
uniformly most powerful (UMP) if, at every $\theta \in \Theta_1$, its power is no smaller than that of any other $\alpha$-level
critical region. The corresponding test is said to be a uniformly most powerful test.

In other words, if $C$ is a size-$\alpha$ UMP critical region, then

1. it has size $\alpha$, i.e. $\max_{\theta \in \Theta_0} K_C(\theta) = \alpha$; and

2. for any other critical region $D$ with size $\max_{\theta \in \Theta_0} K_D(\theta) \le \alpha$,

   $K_C(\theta) \ge K_D(\theta)$ for any $\theta \in \Theta_1$.

Unfortunately, UMP tests basically exist only in rather limited situations: testing
one-sided hypotheses for one-dimensional parameter models. For such situations,
we can first use the Neyman-Pearson lemma to find a most powerful test for
simple hypotheses, and then check whether it is also a UMP test for the composite
hypotheses, as shown in the following examples.

Example 6.3

Consider the normal model: $X_1, X_2, \ldots, X_n \overset{iid}{\sim} N(\mu, \sigma_0^2)$, $\sigma_0$ known.

From Example 6.1, the test “Reject $H_0$ if $\bar{X} \ge \mu_0 + Z_\alpha \dfrac{\sigma_0}{\sqrt{n}}$” is a most powerful test
for the simple hypotheses

$H_0: \mu = \mu_0$ vs $H_1: \mu = \mu_1$,

where $\mu_1 > \mu_0$. Since the test does not depend on the value of $\mu_1$, it has the
largest power at every $\mu > \mu_0$. Therefore it is also a UMP test for the composite
hypotheses

$H_0: \mu = \mu_0$ vs $H_1: \mu > \mu_0$.


Moreover, since the power function of the test

$$K(\mu) = 1 - \Phi\left( \frac{\mu_0 - \mu}{\sigma_0/\sqrt{n}} + Z_\alpha \right)$$

is an increasing function of $\mu$, it achieves its maximum at $\mu = \mu_0$ under the
constraint that $\mu \le \mu_0$. Hence the test is also a UMP size-$\alpha$ test for the hypotheses

$H_0: \mu \le \mu_0$ vs $H_1: \mu > \mu_0$.
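
The behaviour of this power function can be inspected numerically. The sketch below uses `statistics.NormalDist` from the Python standard library; the values of $\mu_0$, $\sigma_0$, $n$ and $\alpha$ are arbitrary choices made for illustration.

```python
from math import sqrt
from statistics import NormalDist

def power(mu, mu0, sigma0, n, alpha):
    """K(mu) = 1 - Phi((mu0 - mu) / (sigma0 / sqrt(n)) + Z_alpha)."""
    std_normal = NormalDist()
    z_alpha = std_normal.inv_cdf(1 - alpha)   # upper-alpha point of N(0, 1)
    return 1 - std_normal.cdf((mu0 - mu) / (sigma0 / sqrt(n)) + z_alpha)

mu0, sigma0, n, alpha = 0.0, 1.0, 25, 0.05    # illustrative values
grid = [-0.4, -0.2, 0.0, 0.2, 0.4]
values = [power(mu, mu0, sigma0, n, alpha) for mu in grid]

for mu, k in zip(grid, values):
    print(f"mu = {mu:+.1f}  K(mu) = {k:.4f}")
```

On this grid the printed powers increase with $\mu$ and equal $\alpha$ at $\mu = \mu_0$, which is why the maximum type I error probability over $\mu \le \mu_0$ is attained at the boundary.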

Example 6.4

Consider the binomial model: $X \sim b(n, p)$, $p$ unknown.

From Example 6.2, the test “Reject $H_0$ if $X > c$” is a most powerful test for the
simple hypotheses

$H_0: p = p_0$ vs $H_1: p = p_1$,

where $p_1 > p_0$. Since the test does not depend on the value of $p_1$, it has the
largest power at every $p > p_0$. Therefore it is also a UMP test for the composite
hypotheses

$H_0: p = p_0$ vs $H_1: p > p_0$.

Using the normal approximation for large samples, the power function of the test is
given by

$$K(p) = 1 - \Phi\left( \frac{c - np}{\sqrt{np(1 - p)}} \right)$$

which is an increasing function of $p$ (refer to Example 5.8). Under the constraint
that $p \le p_0$, the power function attains its maximum at $p = p_0$. Hence the test
is also a UMP test for the hypotheses

$H_0: p \le p_0$ vs $H_1: p > p_0$.

The value of $c$ can be obtained by solving the equation

$$\text{size} = \max_{p \le p_0} K(p) = \alpha$$

and the resulting test is just the approximate test described in Section 5.9.


Example 6.5

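Consider the exponential model: $X_1, X_2, \ldots, X_n \overset{iid}{\sim} \text{Exponential}(\lambda)$, $\lambda$ unknown.

Suppose that we want to test

$H_0: \lambda = \lambda_0$ vs $H_1: \lambda = \lambda_1$,

where $\lambda_1 < \lambda_0$. The likelihood ratio can be obtained as

$$
\frac{L(\lambda_1; \mathbf{x})}{L(\lambda_0; \mathbf{x})}
= \frac{\lambda_1^n \exp\left( -\lambda_1 \sum_{i=1}^n x_i \right)}{\lambda_0^n \exp\left( -\lambda_0 \sum_{i=1}^n x_i \right)}
= \left( \frac{\lambda_1}{\lambda_0} \right)^n \exp\left\{ (\lambda_0 - \lambda_1) \sum_{i=1}^n x_i \right\}
$$

Using the Neyman-Pearson lemma, a most powerful test would reject $H_0$ whenever

$$
\left( \frac{\lambda_1}{\lambda_0} \right)^n \exp\left\{ (\lambda_0 - \lambda_1) \sum_{i=1}^n X_i \right\} \ge k
\iff \bar{X} \ge \frac{1}{n(\lambda_0 - \lambda_1)} \left[ \log k - n \log\frac{\lambda_1}{\lambda_0} \right]
$$

for some constant $k > 0$. Therefore the most powerful test should take the form

“Reject $H_0$ if $\bar{X} \ge c$”

for some constant $c$. In Example 4.2, we derived that $2n\lambda\bar{X} \sim \chi^2_{2n}$. For a
size-$\alpha$ test, the value of $c$ can be obtained as

$$
\alpha = P(\bar{X} \ge c \mid \lambda = \lambda_0) = P\left( \chi^2_{2n} \ge 2n\lambda_0 c \right)
\implies c = \frac{\chi^2_{2n, \alpha}}{2n\lambda_0}
$$

Therefore the most powerful test is constructed as

“Reject $H_0$ if $\bar{X} \ge \dfrac{\chi^2_{2n, \alpha}}{2n\lambda_0}$.”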


As in the previous two examples, the test does not depend on the value of $\lambda_1$ and
hence it has the largest power at every $\lambda < \lambda_0$. Therefore it is also a UMP test
for the composite hypotheses

$H_0: \lambda = \lambda_0$ vs $H_1: \lambda < \lambda_0$.

Moreover, since the power function of the test

$$K(\lambda) = 1 - F\left( \frac{\lambda}{\lambda_0} \chi^2_{2n, \alpha} \right),$$

where $F(\cdot)$ is the cdf of $\chi^2_{2n}$, is a decreasing function of $\lambda$, it achieves its
maximum at $\lambda = \lambda_0$ under the constraint that $\lambda \ge \lambda_0$. Therefore the test is also a
UMP size-$\alpha$ test for the hypotheses

$H_0: \lambda \ge \lambda_0$ vs $H_1: \lambda < \lambda_0$.
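
The cutoff and the power function of this test can be computed without special libraries. The sketch below relies on a fact not stated in the notes: for an even number of degrees of freedom, $P(\chi^2_{2n} > x) = e^{-x/2} \sum_{j=0}^{n-1} (x/2)^j / j!$. The quantile is then found by bisection. The function names and the values of $n$, $\lambda_0$ and $\alpha$ are assumptions made for illustration.

```python
from math import exp

def chi2_sf_even(x, n):
    """P(chi-square with 2n d.f. > x), using the closed form for even d.f."""
    term, total = 1.0, 1.0
    for j in range(1, n):
        term *= (x / 2) / j        # (x/2)^j / j!
        total += term
    return exp(-x / 2) * total

def chi2_upper_quantile_even(alpha, n):
    """Value q with P(chi-square with 2n d.f. > q) = alpha, by bisection."""
    lo, hi = 0.0, 1.0
    while chi2_sf_even(hi, n) > alpha:   # expand until the tail drops below alpha
        hi *= 2
    for _ in range(200):
        mid = (lo + hi) / 2
        if chi2_sf_even(mid, n) > alpha:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

def power(lam, lam0, n, alpha):
    """K(lam) = 1 - F(lam / lam0 * chi2_{2n, alpha})."""
    q = chi2_upper_quantile_even(alpha, n)
    return chi2_sf_even(lam / lam0 * q, n)

n, lam0, alpha = 5, 2.0, 0.05                # illustrative values
q = chi2_upper_quantile_even(alpha, n)       # upper 5% point of chi-square(10)
c = q / (2 * n * lam0)                       # reject H0 if the sample mean >= c

print(round(q, 3), round(c, 4))
for lam in (1.0, 1.5, 2.0, 2.5, 3.0):
    print(lam, round(power(lam, lam0, n, alpha), 4))
```

The printed powers decrease as $\lambda$ increases and equal $\alpha$ at $\lambda = \lambda_0$, in line with the argument above.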

Example 6.6

Suppose that a random sample X 1 , X 2 ,..., X n  is drawn from a Weibull distribution


with pdf
f  x;    x 1 exp x  , x  0

where   0 is an unknown parameter. To find a most powerful test for

H 0 :  1 vs H1 :   1

where 1  1 , we consider the likelihood ratio

1n  xi 1 exp i1 xi 


n
n

L1 ; x 
1 1

 1n  xi 1 exp   xi   xi 


n n n

 
i 1 1 1

L1; x  exp  i 1 xi  i1 


n
i 1 i 1

Use the Neyman-Pearson lemma, a most powerful test would reject H 0 whenever

1n  xi 1 exp   xi   xi   k


n n n
1 1

i 1  i 1 i 1 
 1  1 log X i 1   X i   X i  log k  n log1
n n n
1 1

i 1 i 1 i 1


for some constant $k > 0$. The expression on the left-hand side cannot be simplified
any further, and the most powerful test should take the form

“Reject $H_0$ if $(\theta_1 - 1) \sum_{i=1}^n \log X_i - \sum_{i=1}^n X_i^{\theta_1} + \sum_{i=1}^n X_i \ge c$”

where $c$ is obtained by solving the equation size $= \alpha$. Note that the test
statistic depends on the value of $\theta_1$, and so does the form of the test. Therefore none
of these tests has the largest power for all values of $\theta > 1$. For this model, there is no
UMP test for the composite hypotheses

$H_0: \theta = 1$ vs $H_1: \theta > 1$.

Example 6.7

Consider a normal model: $X_1, X_2, \ldots, X_n \overset{iid}{\sim} N(\mu, \sigma_0^2)$, known $\sigma_0$.

Suppose we want to construct a two-sided test for the hypotheses

$H_0: \mu = \mu_0$ vs $H_1: \mu \ne \mu_0$.

First we consider the simple hypotheses: $H_0: \mu = \mu_0$ vs $H_1: \mu = \mu_1$.

As can be seen from Example 6.1, if $\mu_1 > \mu_0$, the most powerful size-$\alpha$ test is
given by

“Reject $H_0$ if $\bar{X} \ge \mu_0 + Z_\alpha \dfrac{\sigma_0}{\sqrt{n}}$.”

However, if $\mu_1 < \mu_0$, the most powerful size-$\alpha$ test becomes

“Reject $H_0$ if $\bar{X} \le \mu_0 - Z_\alpha \dfrac{\sigma_0}{\sqrt{n}}$.”

Therefore neither of these tests has the largest power at all values of
$\mu \ne \mu_0$, i.e. there is no UMP test for

$H_0: \mu = \mu_0$ vs $H_1: \mu \ne \mu_0$.


§ 6.2 Likelihood Ratio Tests

As can be seen from the last section, most powerful tests exist only for rather limited
situations and rarely exist for composite hypotheses. In fact, the optimality theory
breaks down in statistical models with higher-dimensional parameter spaces. In
practical situations, we need general methods for constructing reasonable tests,
even though they may not be uniformly most powerful.

A generalization of the method of Section 6.1 that combines the Neyman-Pearson
paradigm with maximum likelihood estimation is introduced in this section. The
idea is to compare the likelihood functions of two competing statistical models at
their “most favourable” parameter values. In most cases, such likelihood ratio tests
(LRT) have very satisfactory properties and are widely used in many standard
statistical procedures.

Likelihood ratio test statistic

Consider the problem of testing $H_0: \theta \in \Theta_0$ against $H_1: \theta \in \Theta_1$ where
$\Theta_0 \cup \Theta_1 = \Theta$. Let $\hat{\theta}$ be the MLE of $\theta$ under the full model ($\theta \in \Theta$), and $\hat{\theta}_0$ be the
MLE of $\theta$ under the null model ($\theta \in \Theta_0$). Then the likelihood ratio test statistic is
defined as

$$\Lambda(\mathbf{X}) = \frac{L(\hat{\theta}_0; \mathbf{X})}{L(\hat{\theta}; \mathbf{X})}$$

where $L(\theta; \mathbf{X})$ is the likelihood function of $\theta$ based on a random vector $\mathbf{X}$.

Note that the value of $\Lambda(\mathbf{X})$ is always in $[0, 1]$ because $\hat{\theta}$ is the MLE of $\theta$ under
the full model and hence $L(\hat{\theta}; \mathbf{X}) \ge L(\theta; \mathbf{X})$ for all $\theta \in \Theta$.

Since $\hat{\theta}$ is a consistent estimator of $\theta$, $\Lambda(\mathbf{X})$ would be close to 1 if $H_0$ is true. In
other words, a small value of $\Lambda(\mathbf{X})$ indicates strong evidence against the null
model. Therefore the LRT can be simply formulated as

“Reject $H_0$ if $\Lambda(\mathbf{X}) \le c$.”

where $c$ is a constant that can be determined according to the pre-assigned size or
significance level, i.e. by solving

$$\max_{\theta \in \Theta_0} P\left( \Lambda(\mathbf{X}) \le c \mid \theta \right) = \alpha \quad \text{or} \quad P\left( \Lambda(\mathbf{X}) \le c \mid H_0 \right) = \alpha.$$


Example 6.8

Suppose that a random sample X 1 , X 2 ,..., X n  from a normal population N  ,  2 


with both  and  unknown is to be used to test

H 0 :   0 vs H1 :   0 .

From Example 3.20, the MLE of θ   ,   under the full model is given by

 n 1 
θ̂   X , S  .
 n 

Substituting in the likelihood function gives

 2 n  1 2 
 
n 2
 n
xi  x 2 
n
L θˆ ; x   S  exp  
   2n  1S i 1 
2
n
n 2
 2 n 2  n
    xi  x   exp   .
 n i1   2

From question 3(a) of the class test, the MLE of θ   ,   under the null model is
given by
 1 n 
θˆ 0   0 ,   X i  0 2  .
 n i 1 

Substituting in the likelihood function gives

 2
 
n 2
 2  n
 x     
n n
L θˆ 0 ; x  
2
exp  x   
 
 n  2i1 xi  0  i1

i 0 n 2 i 0
i 1

n 2
 2 n 2  n
    xi  0   exp   .
 n i1   2

The likelihood ratio test statistic can be obtained as

   in1  X i  X 2  n X  0 2 
n 2 n 2
L θˆ 0 ; X  i1  X i  0  
n 2

 X     
 
L θˆ ; X  i1  X i  X  
n 2

 i1 X i  X 
n 2


1 n X   0  
n 2
 2

 1  

 n  1 S 2


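Therefore the LRT would reject $H_0$ when

$$
\Lambda(\mathbf{X}) = \left[ 1 + \frac{1}{n - 1} \cdot \frac{n(\bar{X} - \mu_0)^2}{S^2} \right]^{-n/2} \le c
\iff \frac{|\bar{X} - \mu_0|}{S/\sqrt{n}} \ge \sqrt{(n - 1)\left( c^{-2/n} - 1 \right)}
$$

where $c$ may be chosen appropriately for a size-$\alpha$ test, so that the test becomes

“Reject $H_0$ if $\dfrac{|\bar{X} - \mu_0|}{S/\sqrt{n}} \ge t_{n-1, \alpha/2}$”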

which is just the one sample t-test described in Section 5.4.
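
The link between the likelihood ratio statistic and the t statistic can be verified numerically. In the sketch below the data and $\mu_0$ are made up for illustration; the likelihood ratio is computed directly from the two sums of squares and compared with $\left[1 + t^2/(n-1)\right]^{-n/2}$, where $t = (\bar{x} - \mu_0)/(s/\sqrt{n})$.

```python
from math import sqrt
from statistics import mean, stdev

def lrt_statistic(xs, mu0):
    """Lambda = [sum (x_i - mu0)^2 / sum (x_i - xbar)^2]^(-n/2)."""
    n, xbar = len(xs), mean(xs)
    ss_null = sum((x - mu0) ** 2 for x in xs)    # n times the null-model variance MLE
    ss_full = sum((x - xbar) ** 2 for x in xs)   # n times the full-model variance MLE
    return (ss_null / ss_full) ** (-n / 2)

def t_statistic(xs, mu0):
    """(xbar - mu0) / (S / sqrt(n)), with S the sample standard deviation."""
    return (mean(xs) - mu0) / (stdev(xs) / sqrt(len(xs)))

xs = [4.9, 5.6, 5.1, 6.2, 5.8, 5.3]   # made-up observations
mu0 = 5.0
n = len(xs)

lam = lrt_statistic(xs, mu0)
t = t_statistic(xs, mu0)
via_t = (1 + t**2 / (n - 1)) ** (-n / 2)

print(lam, via_t)   # the two values agree up to rounding error
```

Because the likelihood ratio is a decreasing function of $|t|$, rejecting for small values of the ratio is the same as rejecting for large values of $|t|$.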

Example 6.9

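Suppose that $X_1, X_2, \ldots, X_n \overset{iid}{\sim} \text{Exponential}(\lambda_1)$ and $Y_1, Y_2, \ldots, Y_n \overset{iid}{\sim} \text{Exponential}(\lambda_2)$ are
two independent samples, where $\lambda_1$ and $\lambda_2$ are unknown parameters. Consider the
hypotheses

$H_0: \lambda_1 = \lambda_2$ vs $H_1: \lambda_1 \ne \lambda_2$.

From Example 3.22, the MLE of $\lambda_1$ based on the X-sample is given by $\hat{\lambda}_1 = 1/\bar{X}$.
Similarly, the MLE of $\lambda_2$ based on the Y-sample is $\hat{\lambda}_2 = 1/\bar{Y}$. Therefore under the
full model, the MLE of $\theta = (\lambda_1, \lambda_2)$ is given by

$$\hat{\theta} = \left( \frac{1}{\bar{X}}, \frac{1}{\bar{Y}} \right)$$

and the corresponding likelihood function value can be expressed as

$$
L(\hat{\theta}; \mathbf{x}, \mathbf{y})
= \frac{1}{\bar{x}^n} \exp\left( -\frac{1}{\bar{x}} \sum_{i=1}^n x_i \right) \cdot \frac{1}{\bar{y}^n} \exp\left( -\frac{1}{\bar{y}} \sum_{i=1}^n y_i \right)
= \frac{1}{\bar{x}^n \bar{y}^n} \exp(-2n).
$$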


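Under the null model, the two samples are from the same distribution. The MLE of
the common parameter $\lambda_1 = \lambda_2$ can be obtained from the combined sample:

$$\hat{\lambda}_0 = \left( \frac{\bar{X} + \bar{Y}}{2} \right)^{-1}$$

and the corresponding likelihood function value is given by

$$
L(\hat{\theta}_0; \mathbf{x}, \mathbf{y})
= \left( \frac{\bar{x} + \bar{y}}{2} \right)^{-2n} \exp\left\{ -\left( \frac{\bar{x} + \bar{y}}{2} \right)^{-1} \left( \sum_{i=1}^n x_i + \sum_{i=1}^n y_i \right) \right\}
= \left( \frac{\bar{x} + \bar{y}}{2} \right)^{-2n} \exp(-2n).
$$

The likelihood ratio test statistic can be obtained as

$$
\Lambda(\mathbf{X}, \mathbf{Y}) = \frac{L(\hat{\theta}_0; \mathbf{X}, \mathbf{Y})}{L(\hat{\theta}; \mathbf{X}, \mathbf{Y})}
= \frac{2^{2n} \bar{X}^n \bar{Y}^n}{(\bar{X} + \bar{Y})^{2n}}
= \frac{2^{2n} T^n}{(1 + T)^{2n}}
$$

where $T = \bar{Y}/\bar{X}$. Therefore the LRT would reject $H_0$ when

$$
\Lambda(\mathbf{X}, \mathbf{Y}) = \frac{2^{2n} T^n}{(1 + T)^{2n}} \le c
\iff \frac{T}{(1 + T)^2} \le \frac{1}{4} c^{1/n},
$$

which is a quadratic inequality in $T$ with solution of the form $T \le c_1$ or $T \ge c_2$.

From Example 4.2, the sampling distributions of $\bar{X}$ and $\bar{Y}$ are given by

$$2n\lambda_1\bar{X} \sim \chi^2_{2n}, \quad 2n\lambda_2\bar{Y} \sim \chi^2_{2n}$$

from which we have

$$\frac{2n\lambda_2\bar{Y} / (2n)}{2n\lambda_1\bar{X} / (2n)} = \frac{\lambda_2\bar{Y}}{\lambda_1\bar{X}} \sim F(2n, 2n).$$

Hence the null distribution of $T$ can be determined as $T = \bar{Y}/\bar{X} \sim F(2n, 2n)$ under $H_0$. Since
$\Lambda$ is unchanged when $T$ is replaced by $1/T$, the two cutoffs satisfy $c_1 = 1/c_2$, and the
LRT becomes

“Reject $H_0$ if $\dfrac{\bar{Y}}{\bar{X}} \le \dfrac{1}{F_{2n, 2n, \alpha/2}}$ or $\dfrac{\bar{Y}}{\bar{X}} \ge F_{2n, 2n, \alpha/2}$.”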


Example 6.10

Suppose that X 1 , X 2 ,..., X n  is a random sample from Poisson   , where


  0,   is the unknown parameter. The likelihood function is given by
n

e   X X
L ; X   
n i n
 e  n 
i

i 1 X i!
i 1
 X !.
i 1
i

From Example 3.18, the MLE under the full model was found as ˆ  X .

Consider the hypotheses H 0 :   5 against H 1 :   5 . The parameter space for the


null model is  0   5 which consist of a single value. Hence the MLE of  under
the null model is ˆ  5 .
0

The likelihood ratio test statistic for testing H 0 :   5 against H 1 :   5 is given by

 X  
 
L ˆ0 ; X
e  n 5 X   5 
nX
 
 exp n X  5  X log
5 

 
L ˆ; X

X

  X 

The LRT will reject H 0 when

  5  5 log c
exp n X  5  X log    c  X  X log   5.
  X  X n

Therefore the LRT can be formulated as:

5
“ Reject H 0 if X  X log  c' ”
X

where c' is some constant satisfying

 5 
P X  X log  c'| H 0    .
 X 

5
However, the constant c' is still hard to be determined as X  X log may not
X
follow a nice distribution. Fortunately, the following theorem gives the large
sample approximation on the distribution of the LRT statistic.


Wilks' Theorem

Under certain regularity conditions (including those for the asymptotic properties
of MLEs), if $H_0$ is true, then

$$-2 \log \Lambda(\mathbf{X}) \overset{L}{\longrightarrow} \chi^2_r \quad \text{as } n \to \infty$$

where $r$ is the difference in the dimensions of $\Theta$ and $\Theta_0$.

Hence when $n$ is large, the LRT can be formulated as

“Reject $H_0$ if $-2 \log \Lambda(\mathbf{X}) > \chi^2_{r, \alpha}$.”

Example 6.11

Consider the LRT statistic in Example 6.10:

$$
\Lambda(\mathbf{X}) = \exp\left\{ n\left[ \bar{X} - 5 + \bar{X} \log\frac{5}{\bar{X}} \right] \right\}
\implies -2 \log \Lambda(\mathbf{X}) = 2n\left[ \bar{X} \log\frac{\bar{X}}{5} - \bar{X} + 5 \right]
$$

When $n$ is large, to test $H_0: \lambda = 5$ against $H_1: \lambda \ne 5$ at $\alpha = 0.05$, the LRT can be
formulated as

“Reject $H_0$ if $2n\left[ \bar{X} \log\dfrac{\bar{X}}{5} - \bar{X} + 5 \right] > \chi^2_{1, 0.05} = 3.84$.”

The number of degrees of freedom for this test is equal to 1 because there is no free parameter
in the null model while there is one parameter in the full model.
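
As a worked illustration, the large-sample test can be coded in a few lines. The counts below are made up (they are not data from the notes), and the cutoff 3.84 is the value quoted in the example; the function name `neg2_log_lambda` is chosen only for this sketch.

```python
from math import log

CHI2_CUTOFF = 3.84   # upper 5% point of chi-square with 1 d.f., as quoted in the text

def neg2_log_lambda(xs, lam0=5.0):
    """-2 log Lambda = 2n [xbar log(xbar / lam0) - xbar + lam0]; requires xbar > 0."""
    n = len(xs)
    xbar = sum(xs) / n
    return 2 * n * (xbar * log(xbar / lam0) - xbar + lam0)

# Made-up sample of n = 20 Poisson counts with sample mean 6.5.
xs = [7, 4, 6, 8, 5, 9, 6, 7, 3, 8, 6, 5, 7, 10, 4, 6, 8, 7, 5, 9]

stat = neg2_log_lambda(xs)
print(round(stat, 4))               # about 8.2147
print("reject H0:", stat > CHI2_CUTOFF)
```

For this sample the statistic exceeds 3.84, so $H_0: \lambda = 5$ would be rejected at the 5% level. With only 20 observations the chi-square approximation is rough, so the conclusion should be read as approximate.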
