
EE 278 Tuesday, November 20, 2007

Statistical Signal Processing Handout #17


Homework #6 Solutions

1. (40 points) Gaussian random vector. Throughout this problem, X = (X1, X2, X3) is a Gaussian
   random vector with mean µ = (1, 5, 2) and covariance matrix

   $$\Sigma = \begin{bmatrix} 1 & 1 & 0 \\ 1 & 4 & 0 \\ 0 & 0 & 9 \end{bmatrix},$$

   so in particular σ13 = σ23 = 0.

a. The marginal pdfs of a jointly Gaussian pdf are Gaussian. Therefore X1 ∼ N (1, 1).
b. Since X2 and X3 are independent (σ23 = 0), the variance of the sum is the sum of the
   variances, and a sum of jointly Gaussian random variables is Gaussian. Therefore
   X2 + X3 ∼ N (7, 13).
c. Since 2X1 + X2 + X3 is a linear transformation of a Gaussian random vector,

   $$2X_1 + X_2 + X_3 = \begin{bmatrix} 2 & 1 & 1 \end{bmatrix}
   \begin{bmatrix} X_1 \\ X_2 \\ X_3 \end{bmatrix},$$

   it is a Gaussian random variable with mean and variance

   $$\mu = \begin{bmatrix} 2 & 1 & 1 \end{bmatrix} \begin{bmatrix} 1 \\ 5 \\ 2 \end{bmatrix} = 9
   \quad\text{and}\quad
   \sigma^2 = \begin{bmatrix} 2 & 1 & 1 \end{bmatrix}
   \begin{bmatrix} 1 & 1 & 0 \\ 1 & 4 & 0 \\ 0 & 0 & 9 \end{bmatrix}
   \begin{bmatrix} 2 \\ 1 \\ 1 \end{bmatrix} = 21 .$$

   Thus 2X1 + X2 + X3 ∼ N (9, 21).
d. Since σ13 = 0, X3 and X1 are uncorrelated and hence independent since they are jointly
Gaussian; similarly, since σ23 = 0, X3 and X2 are independent. Therefore the conditional
pdf of X3 given (X1 , X2 ) is the same as the pdf of X3 , which is N (2, 9).
e. We use the general formula for the conditional Gaussian pdf:

   $$X_2 \mid \{X_1 = x_1\} \sim
   N\!\left(\Sigma_{21}\Sigma_{11}^{-1}(x_1 - \mu_1) + \mu_2,\;
   \Sigma_{22} - \Sigma_{21}\Sigma_{11}^{-1}\Sigma_{12}\right).$$

   In the case of (X2, X3) | X1,

   $$\Sigma_{11} = \begin{bmatrix} 1 \end{bmatrix}, \quad
   \Sigma_{21} = \begin{bmatrix} 1 \\ 0 \end{bmatrix}, \quad
   \Sigma_{22} = \begin{bmatrix} 4 & 0 \\ 0 & 9 \end{bmatrix}.$$

   Therefore the conditional mean and covariance matrix of (X2, X3) given X1 = x1 are

   $$\mu_{(X_2,X_3)|X_1} = \begin{bmatrix} 1 \\ 0 \end{bmatrix}
   \begin{bmatrix} 1 \end{bmatrix}^{-1}(x_1 - 1) + \begin{bmatrix} 5 \\ 2 \end{bmatrix}
   = \begin{bmatrix} x_1 + 4 \\ 2 \end{bmatrix}$$

   $$\Sigma_{(X_2,X_3)|X_1} = \begin{bmatrix} 4 & 0 \\ 0 & 9 \end{bmatrix}
   - \begin{bmatrix} 1 \\ 0 \end{bmatrix}\begin{bmatrix} 1 & 0 \end{bmatrix}
   = \begin{bmatrix} 4 & 0 \\ 0 & 9 \end{bmatrix} - \begin{bmatrix} 1 & 0 \\ 0 & 0 \end{bmatrix}
   = \begin{bmatrix} 3 & 0 \\ 0 & 9 \end{bmatrix}$$

   Thus X2 and X3 are conditionally independent given X1. The conditional densities are
   X2 | {X1 = x1} ∼ N (x1 + 4, 3) and X3 | {X1 = x1} ∼ N (2, 9).
f. In the case of X1 | (X2, X3),

   $$\Sigma_{11} = \begin{bmatrix} 4 & 0 \\ 0 & 9 \end{bmatrix}, \quad
   \Sigma_{21} = \begin{bmatrix} 1 & 0 \end{bmatrix}, \quad
   \Sigma_{22} = \begin{bmatrix} 1 \end{bmatrix}.$$

   So the mean and variance of X1 | {X2 = x2, X3 = x3} are

   $$\mu_{X_1|X_2,X_3} = \begin{bmatrix} 1 & 0 \end{bmatrix}
   \begin{bmatrix} 4 & 0 \\ 0 & 9 \end{bmatrix}^{-1}
   \begin{bmatrix} x_2 - 5 \\ x_3 - 2 \end{bmatrix} + 1
   = \begin{bmatrix} \tfrac14 & 0 \end{bmatrix}
   \begin{bmatrix} x_2 - 5 \\ x_3 - 2 \end{bmatrix} + 1
   = \tfrac14 x_2 - \tfrac14$$

   $$\sigma^2_{X_1|X_2,X_3} = \Sigma_{22} - \begin{bmatrix} 1 & 0 \end{bmatrix}
   \begin{bmatrix} 4 & 0 \\ 0 & 9 \end{bmatrix}^{-1}
   \begin{bmatrix} 1 \\ 0 \end{bmatrix}
   = 1 - \tfrac14 = \tfrac34$$

   The conditional mean does not depend on x3 since X1 and X3 are independent. (These
   conditional moments are checked numerically in the sketch following part (h).)
g. Let Y = 2X1 + X2 + X3 . In part (c) we found that Y ∼ N (9, 21). Thus

   P{Y < 0} = Φ((0 − µY )/σY ) = Φ(−9/√21) = Φ(−1.96) = Q(1.96) = 2.48 × 10⁻² .

h. In general, AX ∼ N (AµX , AΣX Aᵀ). For this problem,

   $$\mu_Y = A\mu_X = \begin{bmatrix} 2 & 1 & 1 \\ 1 & -1 & 1 \end{bmatrix}
   \begin{bmatrix} 1 \\ 5 \\ 2 \end{bmatrix} = \begin{bmatrix} 9 \\ -2 \end{bmatrix}$$

   $$\Sigma_Y = A\Sigma_X A^T = \begin{bmatrix} 2 & 1 & 1 \\ 1 & -1 & 1 \end{bmatrix}
   \begin{bmatrix} 1 & 1 & 0 \\ 1 & 4 & 0 \\ 0 & 0 & 9 \end{bmatrix}
   \begin{bmatrix} 2 & 1 \\ 1 & -1 \\ 1 & 1 \end{bmatrix}
   = \begin{bmatrix} 21 & 6 \\ 6 & 12 \end{bmatrix}$$

   Thus

   $$Y \sim N\!\left(\begin{bmatrix} 9 \\ -2 \end{bmatrix},
   \begin{bmatrix} 21 & 6 \\ 6 & 12 \end{bmatrix}\right).$$
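The matrix computations in parts (c) through (h) are easy to verify numerically. Below is a
minimal MATLAB sketch (not part of the original handout; the conditioning values x1, x2, x3 are
arbitrary test choices) that reproduces the conditional moments of parts (e) and (f), the
probability in part (g), and the mean and covariance in part (h):

mu    = [1; 5; 2];
Sigma = [1 1 0; 1 4 0; 0 0 9];

% Part (e): (X2, X3) | {X1 = x1}
x1 = 2;                                                            % arbitrary test value
meanE = Sigma(2:3,1) / Sigma(1,1) * (x1 - mu(1)) + mu(2:3)         % [x1 + 4; 2]
covE  = Sigma(2:3,2:3) - Sigma(2:3,1) / Sigma(1,1) * Sigma(1,2:3)  % [3 0; 0 9]

% Part (f): X1 | {X2 = x2, X3 = x3}
x2 = 3; x3 = 7;                                                    % arbitrary test values
meanF = Sigma(1,2:3) / Sigma(2:3,2:3) * ([x2; x3] - mu(2:3)) + mu(1)  % (x2 - 1)/4
varF  = Sigma(1,1) - Sigma(1,2:3) / Sigma(2:3,2:3) * Sigma(2:3,1)     % 3/4

% Parts (g) and (h): Y = A X with A = [2 1 1; 1 -1 1]
A      = [2 1 1; 1 -1 1];
muY    = A * mu                                  % [9; -2]
SigmaY = A * Sigma * A'                          % [21 6; 6 12]
Q      = @(x) 0.5 * erfc(x / sqrt(2));           % Q(x) via erfc, as in the problem 7 code
Pneg   = Q(muY(1) / sqrt(SigmaY(1,1)))           % P{2X1 + X2 + X3 < 0}, about 2.48e-2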

2. (10 points) Definition of Gaussian random vector.


a. Just copying the definition from the lecture notes:

   φX(ω) = E(e^{iωᵀX}) .

b. Let Y = ωᵀX . Then φX(ω) = E(e^{iY}) = φY(1) .

c. Since Y is Gaussian by the new definition, its characteristic function is

   φY(ω) = e^{−(1/2)σY²ω² + iµY ω} .

   Thus

   φX(ω) = φY(1) = e^{−(1/2)σY² + iµY} .

d. By linearity (lecture notes #6),

   µY = ωᵀµ and σY² = ωᵀΣω .

e. Combining the previous two steps,

   φX(ω) = e^{−(1/2)ωᵀΣω + iωᵀµ} ,

   which is the characteristic function of a Gaussian pdf with mean µ and covariance matrix Σ.
   Therefore X is a Gaussian random vector.
Note: This proof shows the power of the characteristic function. Try to prove this without
using the characteristic function!
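As an aside (not part of the original solution), the characteristic function formula derived above
is easy to see numerically: a Monte Carlo estimate of E(e^{iωᵀX}) for a Gaussian X matches
e^{iωᵀµ − ωᵀΣω/2}. The values of µ, Σ, and ω below are arbitrary test choices.

mu = [1; 2]; Sigma = [2 1; 1 3]; omega = [0.7; -0.4];   % arbitrary test values
K = 1e6;
X = mu + chol(Sigma, 'lower') * randn(2, K);            % samples of N(mu, Sigma)
phiMC      = mean(exp(1i * (omega' * X)))               % Monte Carlo estimate of E(e^{i w'X})
phiFormula = exp(1i * (omega' * mu) - 0.5 * omega' * Sigma * omega)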



3. (10 points) Proof of Property 4.
a. Let X̂ be the best MSE linear estimate of X given Y. In the MSE vector case section
of lecture notes #6 it was shown that X̂ and X − X̂ are individually zero-mean Gaussian
random variables with variances ΣXY ΣY⁻¹ ΣYX and σX² − ΣXY ΣY⁻¹ ΣYX , respectively.

b. The random variables X̂ and X − X̂ are jointly Gaussian since they are obtained by a linear
transformation of the GRV [ Y X ]T . By orthogonality, X̂ and X − X̂ are uncorrelated, so
they are also independent. By the same reasoning, X − X̂ and Y are independent.
c. Now write X = X̂ + (X − X̂). Then given Y = y
X = ΣXY ΣY⁻¹ y + (X − X̂) ,

since X − X̂ is independent of Y.
d. Thus X | {Y = y} is Gaussian with mean ΣXY ΣY⁻¹ y and variance σX² − ΣXY ΣY⁻¹ ΣYX .

4. (15 points) Noise cancellation. This is a vector MSE linear estimation problem. Since Z1 and Z2
   are zero mean, µY1 = µX + µZ1 = µ and µY2 = µZ1 + µZ2 = 0 . We first normalize the random
   variables by subtracting their means to get

   $$X' = X - \mu \quad\text{and}\quad Y' = \begin{bmatrix} Y_1 - \mu \\ Y_2 \end{bmatrix}.$$

   Now using the orthogonality principle we can find the best linear MSE estimate X̂′ of X′. To
   do so we first find

   $$\Sigma_Y = \begin{bmatrix} P + N_1 & N_1 \\ N_1 & N_1 + N_2 \end{bmatrix}
   \quad\text{and}\quad
   \Sigma_{YX} = \begin{bmatrix} P \\ 0 \end{bmatrix}.$$
   Therefore

   $$\begin{aligned}
   \hat X' &= \Sigma_{YX}^T \Sigma_Y^{-1} Y' \\
   &= \begin{bmatrix} P & 0 \end{bmatrix}
      \frac{1}{P(N_1+N_2) + N_1 N_2}
      \begin{bmatrix} N_1+N_2 & -N_1 \\ -N_1 & P+N_1 \end{bmatrix} Y' \\
   &= \frac{P}{P(N_1+N_2) + N_1 N_2}
      \begin{bmatrix} N_1+N_2 & -N_1 \end{bmatrix}
      \begin{bmatrix} Y_1 - \mu \\ Y_2 \end{bmatrix} \\
   &= \frac{P(N_1+N_2)(Y_1 - \mu) - P N_1 Y_2}{P(N_1+N_2) + N_1 N_2}.
   \end{aligned}$$

   The best linear MSE estimate is X̂ = X̂′ + µ. Thus

   $$\hat X = \frac{P(N_1+N_2)(Y_1 - \mu) - P N_1 Y_2}{P(N_1+N_2) + N_1 N_2} + \mu
   = \frac{P(N_1+N_2)Y_1 - P N_1 Y_2 + N_1 N_2\,\mu}{P(N_1+N_2) + N_1 N_2}.$$



The MSE can be calculated by

   $$\mathrm{MSE} = \sigma_X^2 - \Sigma_{YX}^T \Sigma_Y^{-1} \Sigma_{YX}
   = P - \frac{P}{P(N_1+N_2) + N_1 N_2}
     \begin{bmatrix} N_1+N_2 & -N_1 \end{bmatrix}
     \begin{bmatrix} P \\ 0 \end{bmatrix}
   = P - \frac{P^2(N_1+N_2)}{P(N_1+N_2) + N_1 N_2}
   = \frac{P N_1 N_2}{P(N_1+N_2) + N_1 N_2}.$$
Note that if either N1 or N2 goes to 0, the MSE also goes to 0. If N1 → 0, then Y1 = X + Z1
becomes a noiseless observation of X and the estimator reduces to X̂ = Y1, ignoring the other
measurement. If N2 → 0, then Y2 = Z1 exactly, so the noise in Y1 can be cancelled completely;
indeed the formula above reduces to X̂ = Y1 − Y2 = X in that case.
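A quick Monte Carlo sketch of this result (not part of the original solution). It uses the model
implied by the moments above, Y1 = X + Z1 and Y2 = Z1 + Z2 with X, Z1, Z2 independent;
Gaussianity and the particular values of P, N1, N2, µ are arbitrary simulation choices.

P = 4; N1 = 1; N2 = 2; mu = 3;               % arbitrary test values
K = 1e5;                                     % number of Monte Carlo trials
X  = mu + sqrt(P)  * randn(1, K);            % signal with mean mu and variance P
Z1 = sqrt(N1) * randn(1, K);                 % noise common to both measurements
Z2 = sqrt(N2) * randn(1, K);
Y1 = X + Z1;                                 % noisy observation of the signal
Y2 = Z1 + Z2;                                % noisy observation of the noise
D  = P*(N1+N2) + N1*N2;
Xhat = (P*(N1+N2)*Y1 - P*N1*Y2 + N1*N2*mu) / D;
empMSE    = mean((X - Xhat).^2)              % empirical MSE
theoryMSE = P*N1*N2 / D                      % P N1 N2 / (P(N1+N2) + N1 N2)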

5. (20 points) Additive nonwhite Gaussian noise channel. The best estimate of X is of the form

   $$\hat X = \sum_{i=1}^{n} h_i Y_i .$$

   We apply the orthogonality condition E(XYj ) = E(X̂Yj ) for 1 ≤ j ≤ n:

   $$P = \sum_{i=1}^{n} h_i E(Y_i Y_j)
   = \sum_{i=1}^{n} h_i E\big((X + Z_i)(X + Z_j)\big)
   = \sum_{i=1}^{n} h_i \big(P + N \cdot 2^{-|i-j|}\big).$$

   There are n equations with n unknowns:

   $$\begin{bmatrix} P \\ P \\ \vdots \\ P \\ P \end{bmatrix} =
   \begin{bmatrix}
   P+N & P+N/2 & \cdots & P+N/2^{n-2} & P+N/2^{n-1} \\
   P+N/2 & P+N & \cdots & P+N/2^{n-3} & P+N/2^{n-2} \\
   \vdots & \vdots & \ddots & \vdots & \vdots \\
   P+N/2^{n-2} & P+N/2^{n-3} & \cdots & P+N & P+N/2 \\
   P+N/2^{n-1} & P+N/2^{n-2} & \cdots & P+N/2 & P+N
   \end{bmatrix}
   \begin{bmatrix} h_1 \\ h_2 \\ \vdots \\ h_{n-1} \\ h_n \end{bmatrix}$$
   By the hint, h has only two degrees of freedom, a and b (h1 = hn = a and h2 = · · · = hn−1 = b).
   Solving this equation using the first two rows of the matrix, we obtain

   $$\begin{bmatrix} h_1 \\ h_2 \\ \vdots \\ h_{n-1} \\ h_n \end{bmatrix}
   = \frac{P}{3N + (n+2)P}
   \begin{bmatrix} 2 \\ 1 \\ \vdots \\ 1 \\ 2 \end{bmatrix}.$$
   The minimum mean square error is

   $$\mathrm{MSE} = E\big((X - \hat X)X\big)
   = P - \sum_{i=1}^{n} h_i E(Y_i X)
   = P - P\sum_{i=1}^{n} h_i
   = P\left(1 - \frac{(n+2)P}{3N + (n+2)P}\right)
   = \frac{3PN}{3N + (n+2)P}.$$
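A small numerical sketch (n, P, N below are arbitrary test values) confirming that this h solves
the normal equations and gives MSE = 3PN/(3N + (n + 2)P):

n = 6; P = 2; N = 1;                          % arbitrary test values
R = toeplitz(P + N * 2.^(-(0:n-1)));          % matrix of E(Yi Yj) = P + N 2^{-|i-j|}
h = R \ (P * ones(n, 1));                     % solve the n normal equations
hClosed = P / (3*N + (n+2)*P) * [2; ones(n-2, 1); 2];
maxDiff = max(abs(h - hClosed))               % ~0 up to roundoff
MSE     = P - P * sum(h)                      % equals 3PN/(3N + (n+2)P)
MSEform = 3*P*N / (3*N + (n+2)*P)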



6. (15 points) Convergence examples. First consider Xn . Since Xn(ω) is either 0 or 1/n,

   0 ≤ Xn(ω) ≤ 1/n for every ω .

   Therefore limn→∞ Xn(ω) = 0 for every ω ∈ Ω. Thus

   P{ω : limn→∞ Xn(ω) = 0} = P(Ω) = 1 ,

   which shows that Xn → 0 with probability one.


Now consider convergence in mean square.

   E((Xn − 0)²) = Σₓ x² pXn(x) = 0 · P{Xn = 0} + (1/n²) P{Xn = 1/n}
                = (1/n²) P{ω = (n mod m)} = (1/n²) · (1/m) .

Obviously,

   limn→∞ 1/(mn²) = 0 ,

so Xn → 0 in mean square. This implies that Xn → 0 in probability also.
Next consider Yn . For any ε > 0,

   limn→∞ P{|Yn − 0| > ε} = limn→∞ P{Yn > ε} = limn→∞ P{Yn = 2n} = P{ω = 1} = 1/m ≠ 0 .

Thus Yn ↛ 0 in probability. Since convergence with probability 1 implies convergence in
probability, Yn ↛ 0 with probability 1. Similarly, Yn ↛ 0 in mean square.
Finally consider Zn , which is independent of n. For any ε such that 0 < ε < 1,

   limn→∞ P{|Zn − 0| > ε} = limn→∞ P{Zn > ε} = P{Zi = 1} = P{ω = 1} = 1/m ≠ 0 .

Thus Zn ↛ 0 in probability, hence Zn ↛ 0 either with probability 1 or in mean square.
Comments:
• Xn also converges to 0 in distribution since it converges in probability.
• Yn does not converge in any sense. To show this it suffices to show that it does not converge
  in distribution. But

     P{Yn ≤ y} → 1 − 1/m < 1

  for every y < ∞, so the limit of FYn is not a cdf at all.
• Zn converges in every sense to the nonzero random variable Z defined by

     Z(ω) = 1 if ω = 1, and Z(ω) = 0 otherwise.



This convergence is immediate since Zn = Z for every n.

7. (15 points) Convergence experiments. The Matlab code is below. The output is shown in
Figure 1.
clear all;
clf;

% Part (a)

% Generate 200 samples (X_1 to X_200) of i.i.d. zero


% mean, unit variance Gaussian random variables.
% Hint: Use randn and cumsum.

n = 1:200;

% WRITE MATLAB CODE HERE


%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
X = randn( 1, 200 );
S = cumsum( X );
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

% Do the divide by n part.


S = S./n;

subplot( 4, 1, 1 );
plot( n, S );
xlabel( 'n' );
ylabel( 'Sn' );
title( '2(a) Sample average sequence' );

% Part (b)

% Now generate 5000 such sequences.


% Hint: use randn and cumsum (be careful!) again

% WRITE MATLAB CODE HERE


%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
X = randn( 5000, 200 );
S = cumsum( X, 2 );
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

S = S./repmat( n, 5000, 1 );

% Part (c) Strong Law of Large Numbers (this loop will run for a minute)

E_m = zeros(200,1);
for m = 0:199,

% N_m should be the number of rows in S that have an entry whose


% absolute value is > 0.1 in columns m+1 through 200.

% WRITE MATLAB CODE HERE


%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
N_m = sum(sum((abs(S(:, m+1:200)) > 0.1), 2) > 0);
E_m(m+1) = N_m/5000;
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

end;
subplot( 4, 1, 2 );



plot( E_m );
xlabel( 'n' );
ylabel( 'E_m' );
title( '2(c) Strong Law of Large Numbers' );

% Part (d) Convergence in Mean Square

% S_squared is the square of the S matrix. EX = 0.


S_squared = S.^2;

% WRITE MATLAB CODE HERE


%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
M = 1/5000*sum( S_squared );
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

subplot( 4, 1, 3 );
plot( n, M );
xlabel( 'n' );
ylabel( 'Mn' );
title( '2(d) Mean square convergence' );

% Part (e) Weak Law of Large Numbers

% Count the number of times |S_i,n| > 0.1 in each column.

% WRITE MATLAB CODE HERE


%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
N = sum( abs( S ) > 0.1 );
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

% Find E by dividing by 5000.


E = N/5000;

% Plot Pn and En vs. n. Since EX = 0, Pn is the probability


% that |Sn| > 0.1. Sn has zero mean and a standard deviation
% sigma_Sn. Hint: sigma_Sn is a function of n.
% Therefore P( |Sn| > 0.1 ) = 2*Q( 0.1/sigma_Sn ).
% erfc( x ) = 2/sqrt( pi ) * integral from x to inf of exp( -t^2 ) dt
% So Q( x ) = 1/2*erfc( x/sqrt( 2 ) );

% Find sigma_Sn and Pn. Hint: Pn and En look quite similar.

% WRITE MATLAB CODE HERE


%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
sigma_Sn = 1./sqrt(n);
P = erfc( ( 0.1./sigma_Sn )/sqrt(2) );
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

subplot( 4, 1, 4 );
plot( n, P, 'r--' );
hold on;
plot( n, E );
axis( [ 0 200 0 1 ] );
xlabel( 'n' );
ylabel( 'En (solid), Pn (dashed)' );
title( '2(e) Convergence in probability' );

% Produce hardcopy
orient tall
print hw6q7



[Figure 1: Output of convergence experiments. Four panels plotted against n (0 to 200):
2(a) sample average sequence Sn; 2(c) strong law of large numbers, Em; 2(d) mean square
convergence, Mn; 2(e) convergence in probability, En (solid) and Pn (dashed).]

8. (10 points) Convergence with probability 1. For any values of {Xn}, the sequence of Yn values
   is monotonically decreasing in n. Since the random variables are ≥ 0, we know that the limit
   of Yn is ≥ 0. We suspect that Yn → 0. To prove that Yn converges w.p.1 to 0, we show that
   for every ε > 0,

   limm→∞ P{|Yn − 0| < ε for all n ≥ m} = 1 ,

   which is equivalent to limm→∞ P{|Yn − 0| ≥ ε for some n ≥ m} = 0. So, let m ≥ 1 and consider
   $$\begin{aligned}
   P\{|Y_n - 0| \ge \varepsilon \text{ for some } n \ge m\}
   &= P\{Y_n \ge \varepsilon \text{ for some } n \ge m\} \\
   &\overset{(a)}{=} P\Big(\bigcup_{n=m}^{\infty}
      \{X_1 \ge \varepsilon, \ldots, X_n \ge \varepsilon,\; X_{n+1} < \varepsilon\}\Big) \\
   &\overset{(b)}{=} \sum_{n=m}^{\infty}
      P\{X_1 \ge \varepsilon, \ldots, X_n \ge \varepsilon,\; X_{n+1} < \varepsilon\} \\
   &\overset{(c)}{=} \sum_{n=m}^{\infty} P\{X_{n+1} < \varepsilon\}
      \prod_{i=1}^{n} P\{X_i \ge \varepsilon\} \\
   &= \sum_{n=m}^{\infty} (1 - e^{-\lambda\varepsilon})\, e^{-\lambda\varepsilon n}
      = e^{-\lambda\varepsilon m} \to 0 \ \text{as } m \to \infty .
   \end{aligned}$$
n=m



Step (a) follows because the event on the previous line is the same as saying that the smallest
index k such that Xk < ε is n + 1 for some n ≥ m (the event that no such index exists has
probability zero). Step (b) follows by the fact that these events are disjoint. Step (c) follows
by the independence of X1 , X2 , . . ..
Therefore Yn converges w.p.1 to 0.
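A simulation sketch of this result (not part of the original solution). The problem statement is
not reproduced in this handout, so the setup below is read off the proof rather than restated
from the problem: X1, X2, . . . i.i.d. Exponential(λ) and Yn the running minimum of X1, . . . , Xn.
The values of λ, ε, and m are arbitrary.

lambda = 2; epsilon = 0.05; m = 50;                % arbitrary test values
K = 2000; nMax = 1000;                             % number of sample paths and path length
X = -log(rand(K, nMax)) / lambda;                  % i.i.d. Exp(lambda) via inverse CDF
Y = cummin(X, 2);                                  % running minimum Y_n = min(X_1, ..., X_n)
empirical = mean(any(Y(:, m:end) >= epsilon, 2))   % fraction of paths with Y_n >= eps, some n >= m
theory    = exp(-lambda * epsilon * m)             % e^{-lambda eps m}, as computed above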

9. (10 points) Convergence in probability.


a. A sequence of random variables Xn converges in probability to X if for every ε > 0

   limn→∞ P{|Xn − X| > ε} = 0 .

   We guess that Xn converges to 0. To prove this, consider

   P{|Xn − 0| > ε} = P{Xn > ε} ≤ E(Xn)/ε → 0 as n → ∞ .

   The inequality follows by the Markov inequality, since Xn ≥ 0.

b. We guess that Yn also converges to 0. By Jensen’s inequality

   E(Yn) ≤ 1 − e^{−E(Xn)} ,

   since (1 − e^{−x}) is a concave function. Therefore E(Yn) → 0 as n → ∞ and so Yn → 0 by the
   result of part (a).



Extra Problems Solutions
1. Covariance matrices.
a. No: not symmetric.
b. Yes: covariance matrix of X1 = Z1 + Z2 and X2 = Z1 + Z3 .
c. Yes: covariance matrix of X1 = Z1 , X2 = Z1 + Z2 , and X3 = Z1 + Z2 + Z3 .
d. No: several justifications, each of which is easy to check numerically (see the sketch after
   this list).
   • σ23² = 9 > σ22 σ33 = 6, which contradicts the Schwarz inequality.
   • The matrix is not nonnegative definite since the determinant is −2.
   • One of the eigenvalues is negative (λ1 = −0.8056).
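The candidate matrices themselves are given in the problem statement and are not reproduced in
this handout, but the checks above are easy to automate. A minimal MATLAB sketch with a
hypothetical 3 × 3 candidate, chosen only to exhibit a part (d)-style failure:

Sigma = [2 1 3; 1 2 3; 3 3 3];                    % hypothetical candidate, not from the problem
isSym   = isequal(Sigma, Sigma')                  % a covariance matrix must be symmetric
ev      = eig((Sigma + Sigma')/2)                 % eigenvalues of the symmetrized matrix
isPSD   = all(ev >= -1e-12)                       % nonnegative definite, up to roundoff
schwarz = Sigma(2,3)^2 <= Sigma(2,2)*Sigma(3,3)   % sigma_23^2 <= sigma_22 sigma_33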

2. Bit error rate.


a. As suggested, let the Bernoulli random variables X1 , . . . , X1000 have the common pmf

   Xi = 1 with probability 10⁻¹⁰ , and Xi = 0 with probability 1 − 10⁻¹⁰ .

   Then E(Xi) = 10⁻¹⁰ . If N is the number of errors in 1000 bits, then

   E(N) = E(X1 + · · · + X1000) = E(X1) + · · · + E(X1000) = 1000 E(Xi) = 1000 · 10⁻¹⁰ = 10⁻⁷ .

b. The upper bound can be obtained from the Markov inequality:

   P{N ≥ 10} = P{N ≥ a E(N)} ≤ 1/a ,

   where a = 10⁸ and E(N) = 10⁻⁷ . Therefore P{N ≥ 10} ≤ 10⁻⁸ . The maximum probability
   of ≥ 10 errors occurs when the errors occur in groups of 10.

3. Minimum MSE for Gaussian random vector. We are given a Gaussian random vector

   $$X \sim N\!\left(\begin{bmatrix} 0 \\ 0 \\ 0 \end{bmatrix},
   \begin{bmatrix} 1 & 2 & 1 \\ 2 & 5 & 2 \\ 1 & 2 & 9 \end{bmatrix}\right).$$

   Means and covariances are

   $$\mu_1 = 0, \quad \Sigma_{11} = 1, \quad
   \Sigma_{12} = \begin{bmatrix} 2 & 1 \end{bmatrix}, \quad
   \mu_2 = \begin{bmatrix} 0 \\ 0 \end{bmatrix}, \quad
   \Sigma_{22} = \begin{bmatrix} 5 & 2 \\ 2 & 9 \end{bmatrix}, \quad
   \Sigma_{21} = \begin{bmatrix} 2 \\ 1 \end{bmatrix}.$$



We apply the general formula from lecture notes #6 to obtain the MMSE.

   $$\hat X_1 = \Sigma_{12}\Sigma_{22}^{-1}(X_2 - \mu_2) + \mu_1
   = \begin{bmatrix} 2 & 1 \end{bmatrix}
     \begin{bmatrix} 5 & 2 \\ 2 & 9 \end{bmatrix}^{-1} X_2
   = \begin{bmatrix} 2 & 1 \end{bmatrix}
     \frac{1}{41}\begin{bmatrix} 9 & -2 \\ -2 & 5 \end{bmatrix} X_2
   = \frac{1}{41}\begin{bmatrix} 16 & 1 \end{bmatrix} X_2
   = \frac{16}{41} X_2 + \frac{1}{41} X_3$$

   $$\mathrm{MMSE} = \Sigma_{11} - \Sigma_{12}\Sigma_{22}^{-1}\Sigma_{21}
   = 1 - \begin{bmatrix} 2 & 1 \end{bmatrix}
     \frac{1}{41}\begin{bmatrix} 9 & -2 \\ -2 & 5 \end{bmatrix}
     \begin{bmatrix} 2 \\ 1 \end{bmatrix}
   = 1 - \frac{1}{41}\begin{bmatrix} 16 & 1 \end{bmatrix}\begin{bmatrix} 2 \\ 1 \end{bmatrix}
   = 1 - \frac{33}{41} = \frac{8}{41}$$

Since X1 is correlated with X2 and X3 , the MMSE is smaller than Var(X1 ) = 1.
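These numbers can be reproduced with the block-matrix formulas directly; a minimal MATLAB sketch
(not part of the original solution):

Sigma = [1 2 1; 2 5 2; 1 2 9];
S12 = Sigma(1, 2:3);                     % cross-covariance of X1 with (X2, X3)
S22 = Sigma(2:3, 2:3);                   % covariance of (X2, X3)
w    = S12 / S22                         % estimator weights: [16/41  1/41]
MMSE = Sigma(1,1) - S12 / S22 * S12'     % 8/41, about 0.1951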

4. Gaussian MMSEE is linear.


a. This follows from orthogonality. Since X̂2 is the best linear estimate of X given Y, the error
   X − X̂2 and the observation Y − E(Y) are orthogonal, i.e.,

   E[(X − X̂2)(Y − E(Y))] = 0 .

b. We can show that X − X̂2 and Y − E(Y) are jointly Gaussian by representing them as a
   linear function of a GRV:

   $$\begin{bmatrix} X - \hat X_2 \\ Y - E(Y) \end{bmatrix}
   = \begin{bmatrix} 1 & -\Sigma_{XY}\Sigma_Y^{-1} \\ 0 & I \end{bmatrix}
     \begin{bmatrix} X \\ Y \end{bmatrix}
   + \begin{bmatrix} \Sigma_{XY}\Sigma_Y^{-1} E(Y) - E(X) \\ -E(Y) \end{bmatrix}.$$

c. By part (a), X − X̂2 and Yi − E(Yi ) are uncorrelated for every i. By part (b), they are
jointly Gaussian. Therefore they are independent since uncorrelated jointly Gaussian ran-
dom variables are independent. Since E(X̂2 ) = E(X),
E(X − X̂2 | Y) = E(X − X̂2 ) = E(X) − E(X̂2 ) = 0 .

d. E(X̂2 | Y) = X̂2 since X̂2 is a function of Y. Thus, using part (c),

   X̂1 = E(X | Y) = E(X̂2 | Y) + E(X − X̂2 | Y) = X̂2 + 0 = X̂2 .

5. Prediction. The zero-mean random vector (X1 , . . . , Xn ) has covariance matrix

   $$\Sigma_X = \begin{bmatrix}
   1 & \alpha & \alpha^2 & \cdots & \alpha^{n-2} & \alpha^{n-1} \\
   \alpha & 1 & \alpha & \cdots & \alpha^{n-3} & \alpha^{n-2} \\
   \alpha^2 & \alpha & 1 & \cdots & \alpha^{n-4} & \alpha^{n-3} \\
   \vdots & \vdots & \vdots & \ddots & \vdots & \vdots \\
   \alpha^{n-2} & \alpha^{n-3} & \alpha^{n-4} & \cdots & 1 & \alpha \\
   \alpha^{n-1} & \alpha^{n-2} & \alpha^{n-3} & \cdots & \alpha & 1
   \end{bmatrix},$$



 
Let Y = [X1 · · · Xn−1]ᵀ and let X = Xn . Then

   $$\Sigma_Y = \begin{bmatrix}
   1 & \cdots & \alpha^{n-2} \\
   \vdots & \ddots & \vdots \\
   \alpha^{n-2} & \cdots & 1
   \end{bmatrix}, \quad
   \Sigma_{YX} = \begin{bmatrix} \alpha^{n-1} \\ \vdots \\ \alpha \end{bmatrix}, \quad
   \sigma_X^2 = 1 .$$

Applying the general formula for the linear MSE estimate and MMSE:

   $$\hat X_n = \Sigma_{YX}^T \Sigma_Y^{-1} Y
   = \begin{bmatrix} \alpha^{n-1} & \cdots & \alpha \end{bmatrix}
     \begin{bmatrix}
     1 & \cdots & \alpha^{n-2} \\
     \vdots & \ddots & \vdots \\
     \alpha^{n-2} & \cdots & 1
     \end{bmatrix}^{-1} Y
   = h^T Y \quad\text{where } h^T = \Sigma_{XY}\Sigma_Y^{-1} .$$

By noting that hᵀ = [0 · · · 0 α] satisfies hᵀΣY = ΣXY (α times the last row of ΣY is
[αⁿ⁻¹ · · · α] = ΣXY), we get

   $$\hat X_n = \begin{bmatrix} 0 & \cdots & 0 & \alpha \end{bmatrix} Y = \alpha X_{n-1} .$$

   $$\mathrm{MSE} = \sigma_X^2 - \Sigma_{XY}\Sigma_Y^{-1}\Sigma_{YX}
   = 1 - h^T \Sigma_{YX}
   = 1 - \begin{bmatrix} 0 & \cdots & 0 & \alpha \end{bmatrix}
     \begin{bmatrix} \alpha^{n-1} \\ \vdots \\ \alpha \end{bmatrix}
   = 1 - \alpha^2 .$$
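A numerical sketch (α and n below are arbitrary test values) confirming the predictor weights
[0 · · · 0 α] and the MMSE 1 − α²:

alpha = 0.7; n = 8;                           % arbitrary test values
SigmaX  = toeplitz(alpha .^ (0:n-1));         % covariance matrix, (i,j) entry alpha^|i-j|
SigmaY  = SigmaX(1:n-1, 1:n-1);               % covariance of Y = (X1, ..., X_{n-1})
SigmaYX = SigmaX(1:n-1, n);                   % cross-covariance of Y with X = Xn
h    = SigmaY \ SigmaYX                       % predictor weights: [0 ... 0 alpha]'
MMSE = SigmaX(n,n) - SigmaYX' * h             % 1 - alpha^2 = 0.51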

6. Gambling. By the weak law of large numbers, the sample mean Sn = (1/n) Σᵢ Xi converges to the
   mean µ = E(X) in probability, so P(|Sn − µ| > ε) → 0 as n → ∞ for every ε > 0. The limiting
   value of P(Sn < µ/2) depends on µ.
   • If µ < 0 then P(Sn < µ/2) → 1. Take ε = |µ|/2. Then P(|Sn − µ| < ε) → 1 as n → ∞, and the
     event |Sn − µ| < |µ|/2 implies Sn < µ + |µ|/2 = µ/2. Therefore P(Sn < µ/2) → 1.
   • If µ > 0 then P(Sn < µ/2) → 0. Take ε = µ/2. Then P(|Sn − µ| < ε) → 1 as n → ∞, and the
     event |Sn − µ| < µ/2 implies Sn > µ/2, so P(Sn < µ/2) → 0.

7. Convergence to a random variable. We show that Sn converges to P in mean square. Consider

   $$\begin{aligned}
   E\big((S_n - P)^2\big) &= E_P\big(E((S_n - P)^2 \mid P)\big) \\
   &= E_P\big(\mathrm{Var}(S_n \mid P)\big) \\
   &= E_P\Big(\tfrac{1}{n^2}\,\mathrm{Var}\big(\textstyle\sum_{i=1}^{n} X_i \mid P\big)\Big) \\
   &= E_P\Big(\tfrac{1}{n^2}\, nP(1-P)\Big)
      \qquad \big(\text{since } \textstyle\sum_{i=1}^{n} X_i \text{ is Binom}(n, P) \text{ given } P\big) \\
   &= \tfrac{1}{n}\big(E(P) - E(P^2)\big) .
   \end{aligned}$$

   Therefore limn→∞ E((Sn − P)²) = 0 and Sn converges to P in mean square.
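A simulation sketch of this result (not part of the original solution). The distribution of P is
not restated in this handout, so P ∼ Uniform(0, 1) is assumed below purely for illustration.

K = 5000; n = 200;                           % number of trials and coin flips per trial
Pv = rand(K, 1);                             % one draw of P per trial (assumed Uniform(0,1))
X  = rand(K, n) < Pv;                        % given P, the Xi are i.i.d. Bernoulli(P)
Sn = mean(X, 2);                             % sample means
empirical = mean((Sn - Pv).^2)               % estimate of E((Sn - P)^2)
theory    = (mean(Pv) - mean(Pv.^2)) / n     % (E(P) - E(P^2))/n, about 1/(6n) for Uniform(0,1)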

