Chap 3.2
Oftentimes when two random variables (X, Y) are observed, the values of the two variables are related. For example, suppose that X is the height (in cm) and Y is the weight (in pounds) of a randomly selected person. Surely we would think it more likely that Y > 200 pounds if we were told that X = 182 cm than if we were told that X = 104 cm.
The knowledge about the value of X gives us some information about the value of Y, even though it does not reveal the exact value of Y.
Recall that for any two events E and F, the conditional probability of E given F is defined by
$$P(E \mid F) = \frac{P(E \cap F)}{P(F)}, \qquad \text{provided that } P(F) > 0.$$
Definition
Let X and Y be discrete random variables with joint pmf p(x, y). For any x such that $p_X(x) = P(X = x) > 0$, the conditional pmf of Y given that X = x is the function of y denoted by $p_{Y|X}(y \mid x)$ and defined by
$$p_{Y|X}(y \mid x) = P(Y = y \mid X = x) = \frac{P(Y = y, X = x)}{P(X = x)} = \frac{p(x, y)}{p_X(x)}.$$
On the other hand, for any y such that $p_Y(y) = P(Y = y) > 0$, the conditional pmf of X given that Y = y is the function of x denoted by $p_{X|Y}(x \mid y)$ and defined by
$$p_{X|Y}(x \mid y) = P(X = x \mid Y = y) = \frac{P(X = x, Y = y)}{P(Y = y)} = \frac{p(x, y)}{p_Y(y)}.$$
Example 3.25
Referring to example 1, suppose we randomly draw 3 balls from an urn containing 3 red balls, 4 white balls, and 5 blue balls. Let X be the number of red balls and Y be the number of white balls in the sample. The joint pmf of (X, Y) is given by the following table.
Values of Y
Values of X 0 1 2 3 Total
0 0.0454 0.1818 0.1364 0.0182 0.3818
1 0.1364 0.2727 0.0818 0 0.4909
2 0.0682 0.0545 0 0 0.1227
3 0.0045 0 0 0 0.0045
Total 0.2545 0.5091 0.2182 0.0182 1.0000
Dividing all the entries by the corresponding row totals, we obtain the conditional pmfs of Y given X = x.
Values of Y
Values of X 0 1 2 3 Total
0 0.1190 0.4762 0.3571 0.0476 1
1 0.2778 0.5556 0.1667 0 1
2 0.5556 0.4444 0 0 1
3 1 0 0 0 1
Each row represents a conditional pmf of Y given X = x. For example, the first row is the conditional pmf of Y given that X = 0, the second row is the conditional pmf of Y given that X = 1, etc. From these conditional pmfs we can see how our uncertainty about the value of Y is affected by our knowledge of the value of X.
Similarly, dividing all the entries in the joint pmf table by the corresponding column totals gives the conditional pmfs of X given Y = y, as shown in the following table.
Values of Y
Values of X 0 1 2 3
0 0.1786 0.3571 0.6250 1
1 0.5357 0.5357 0.3750 0
2 0.2679 0.1071 0 0
3 0.0179 0 0 0
Total 1 1 1 1
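The two conditional tables above are just the joint pmf table normalised by its row and column totals. A minimal numerical sketch, assuming Python with NumPy is available, reproduces them:

```python
import numpy as np

# Joint pmf of (X, Y) from Example 3.25; rows index X = 0..3, columns index Y = 0..3
joint = np.array([
    [0.0454, 0.1818, 0.1364, 0.0182],
    [0.1364, 0.2727, 0.0818, 0.0000],
    [0.0682, 0.0545, 0.0000, 0.0000],
    [0.0045, 0.0000, 0.0000, 0.0000],
])

# Conditional pmf of Y given X = x: divide each row by its row total p_X(x)
p_Y_given_X = joint / joint.sum(axis=1, keepdims=True)

# Conditional pmf of X given Y = y: divide each column by its column total p_Y(y)
p_X_given_Y = joint / joint.sum(axis=0, keepdims=True)

print(np.round(p_Y_given_X, 4))   # matches the first conditional table (up to rounding)
print(np.round(p_X_given_Y, 4))   # matches the second conditional table
```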
Example 3.26
Let X ~ ℘(λ₁) and Y ~ ℘(λ₂) be two independent Poisson random variables, so that X + Y ~ ℘(λ₁ + λ₂). For k = 0, 1, ..., n, consider
$$\begin{aligned}
P(X = k \mid X + Y = n) &= \frac{P(X = k,\, X + Y = n)}{P(X + Y = n)} \\
&= \frac{P(X = k,\, Y = n - k)}{P(X + Y = n)} \\
&= \frac{P(X = k)\,P(Y = n - k)}{P(X + Y = n)} \\
&= \frac{\bigl(e^{-\lambda_1}\lambda_1^{k}/k!\bigr)\bigl(e^{-\lambda_2}\lambda_2^{\,n-k}/(n-k)!\bigr)}{e^{-(\lambda_1+\lambda_2)}(\lambda_1+\lambda_2)^{n}/n!} \\
&= \frac{n!}{k!\,(n-k)!}\,\frac{\lambda_1^{k}\lambda_2^{\,n-k}}{(\lambda_1+\lambda_2)^{n}} \\
&= \binom{n}{k}\left(\frac{\lambda_1}{\lambda_1+\lambda_2}\right)^{k}\left(\frac{\lambda_2}{\lambda_1+\lambda_2}\right)^{n-k}, \qquad k = 0, 1, \ldots, n.
\end{aligned}$$
Hence $X \mid X + Y = n \;\sim\; b\!\left(n,\ \dfrac{\lambda_1}{\lambda_1+\lambda_2}\right)$.
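A quick Monte Carlo check of this result (a sketch assuming NumPy; the parameter values below are arbitrary illustrations, not from the notes):

```python
import numpy as np
from math import comb

rng = np.random.default_rng(0)
lam1, lam2, n = 2.0, 3.0, 5              # illustrative values
x = rng.poisson(lam1, size=1_000_000)
y = rng.poisson(lam2, size=1_000_000)

# Keep only the draws where X + Y equals n and look at the distribution of X among them
x_given_sum = x[(x + y) == n]
empirical = np.bincount(x_given_sum, minlength=n + 1) / len(x_given_sum)

# Binomial(n, lam1/(lam1+lam2)) pmf for comparison
p = lam1 / (lam1 + lam2)
theoretical = np.array([comb(n, k) * p**k * (1 - p)**(n - k) for k in range(n + 1)])
print(np.round(empirical, 3))
print(np.round(theoretical, 3))
```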
Remarks
1. If X is independent of Y, then the conditional pmf of X given Y = y becomes
$$p_{X|Y}(x \mid y) = \frac{p(x, y)}{p_Y(y)} = \frac{p_X(x)\,p_Y(y)}{p_Y(y)} = p_X(x) \qquad \text{for all } y.$$
Similarly,
$$p_{Y|X}(y \mid x) = \frac{p(x, y)}{p_X(x)} = \frac{p_X(x)\,p_Y(y)}{p_X(x)} = p_Y(y) \qquad \text{for all } x.$$
Hence knowledge of the value of one variable does not affect our uncertainty about the value of the other variable, i.e. knowledge of one variable gives us no information about the other variable if they are independent.
2. Analogously, if X and Y are jointly continuous with joint pdf f(x, y), the conditional pdf of Y given X = x is defined by
$$f_{Y|X}(y \mid x) = \frac{f(x, y)}{f_X(x)}, \qquad \text{provided that } f_X(x) > 0.$$
3. Similarly, the conditional pdf of X given Y = y is defined by
$$f_{X|Y}(x \mid y) = \frac{f(x, y)}{f_Y(y)}, \qquad \text{provided that } f_Y(y) > 0.$$
4. The conditional pmf/pdf satisfies all the properties of a pmf/pdf and describes the probabilistic behaviour of a random variable given the value of another variable. Hence we can have the following definitions.
Conditional cdf:
$$F_{Y|X}(y \mid x) = P(Y \le y \mid X = x) = \begin{cases} \displaystyle\sum_{i \le y} f_{Y|X}(i \mid x) & \text{discrete case} \\[2ex] \displaystyle\int_{-\infty}^{y} f_{Y|X}(t \mid x)\,dt & \text{continuous case} \end{cases}$$
Conditional expectation:
$$E\bigl(g(Y) \mid X = x\bigr) = \begin{cases} \displaystyle\sum_{i} g(i)\,f_{Y|X}(i \mid x) & \text{discrete case} \\[2ex] \displaystyle\int_{-\infty}^{\infty} g(t)\,f_{Y|X}(t \mid x)\,dt & \text{continuous case} \end{cases}$$
In particular, taking g(Y) = Y gives the conditional mean E(Y | X = x). The conditional variance is
$$\operatorname{Var}(Y \mid X = x) = E\bigl((Y - E(Y \mid X = x))^2 \mid X = x\bigr) = E(Y^2 \mid X = x) - \bigl(E(Y \mid X = x)\bigr)^2.$$
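For instance, using the conditional pmf of Y given X = 0 from Example 3.25, the conditional mean and variance can be computed directly from these definitions (a short sketch assuming NumPy):

```python
import numpy as np

y_vals = np.array([0, 1, 2, 3])
p_y_given_x0 = np.array([0.1190, 0.4762, 0.3571, 0.0476])    # row for X = 0 above

cond_mean = np.sum(y_vals * p_y_given_x0)                     # E(Y | X = 0)
cond_var = np.sum(y_vals**2 * p_y_given_x0) - cond_mean**2    # Var(Y | X = 0)
print(cond_mean, cond_var)
```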
Example 3.27
Suppose that the joint density of X and Y is given by
$$f(x, y) = \begin{cases} \dfrac{e^{-x/y}\,e^{-y}}{y} & x > 0,\; y > 0 \\[1ex] 0 & \text{otherwise} \end{cases}$$
Marginal pdf of Y:
$$f_Y(y) = \int_{0}^{\infty} \frac{e^{-x/y}\,e^{-y}}{y}\,dx = e^{-y}\bigl[-e^{-x/y}\bigr]_{0}^{\infty} = e^{-y}, \qquad y > 0.$$
Conditional pdf of X given Y = y:
$$f_{X|Y}(x \mid y) = \frac{f(x, y)}{f_Y(y)} = \frac{1}{y}\,e^{-x/y}, \qquad x > 0,$$
i.e. given Y = y, X is exponentially distributed with mean y. Hence
$$F_{X|Y}(x \mid y) = \begin{cases} 1 - e^{-x/y} & x > 0 \\ 0 & \text{otherwise} \end{cases}, \qquad E(X \mid Y) = Y, \qquad \operatorname{Var}(X \mid Y) = Y^2.$$
What is E(E(X | Y))?
Two important and useful formulae of conditional expectation are given below.
1. (Law of total expectation)
$$E\bigl(u(X)\bigr) = E\bigl(E(u(X) \mid Y)\bigr)$$
Proof:
$$\begin{aligned}
E\bigl(E(u(X) \mid Y)\bigr) &= \int_{-\infty}^{\infty} E\bigl(u(X) \mid Y = y\bigr)\,f_Y(y)\,dy \\
&= \int_{-\infty}^{\infty} \left\{ \int_{-\infty}^{\infty} u(x)\,f_{X|Y}(x \mid y)\,dx \right\} f_Y(y)\,dy \\
&= \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} u(x)\,f(x, y)\,dx\,dy \\
&= E\bigl(u(X)\bigr).
\end{aligned}$$
2. (Law of total variance)
$$\operatorname{Var}(X) = E\bigl(\operatorname{Var}(X \mid Y)\bigr) + \operatorname{Var}\bigl(E(X \mid Y)\bigr)$$
Proof:
$$E\bigl(\operatorname{Var}(X \mid Y)\bigr) = E\bigl\{E(X^2 \mid Y) - (E(X \mid Y))^2\bigr\} = E(X^2) - E\bigl\{(E(X \mid Y))^2\bigr\},$$
$$\begin{aligned}
\operatorname{Var}\bigl(E(X \mid Y)\bigr) &= E\bigl\{(E(X \mid Y))^2\bigr\} - \bigl\{E\bigl(E(X \mid Y)\bigr)\bigr\}^2 \\
&= \bigl[E(X^2) - E\bigl(\operatorname{Var}(X \mid Y)\bigr)\bigr] - \bigl(E(X)\bigr)^2 \\
&= \operatorname{Var}(X) - E\bigl(\operatorname{Var}(X \mid Y)\bigr).
\end{aligned}$$
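These two identities can be checked numerically for the hierarchical structure of Example 3.27, where Y ~ Exp(1) and, given Y = y, X is exponential with mean y, so that E(X|Y) = Y and Var(X|Y) = Y². The following is a Monte Carlo sketch assuming NumPy:

```python
import numpy as np

rng = np.random.default_rng(1)
y = rng.exponential(scale=1.0, size=1_000_000)   # Y ~ Exp(1): E(Y) = 1, Var(Y) = 1
x = rng.exponential(scale=y)                     # X | Y = y ~ Exp(mean y): E(X|Y) = Y, Var(X|Y) = Y^2

print(x.mean())                    # ~ 1 = E(E(X|Y)) = E(Y)
print(x.var())                     # ~ 3 = E(Var(X|Y)) + Var(E(X|Y)) = E(Y^2) + Var(Y) = 2 + 1
print((y**2).mean() + y.var())     # the same decomposition evaluated from the simulated Y's
```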
Example 3.28
Suppose we have a binomial random variable X which represents the number of successes in n independent Bernoulli trials. Sometimes the success probability p is unknown. However, we usually have some understanding of the value of p; e.g. we may believe that p is itself a random variable picked uniformly from (0, 1). Then we have the following hierarchical model:
$$p \sim U(0, 1), \qquad X \mid p \sim b(n, p).$$
By the law of total expectation,
$$E(X) = E\bigl(E(X \mid p)\bigr) = E(np) = n\,E(p) = \frac{n}{2}.$$
By the law of total variance,
$$\operatorname{Var}(X) = E\bigl(\operatorname{Var}(X \mid p)\bigr) + \operatorname{Var}\bigl(E(X \mid p)\bigr) = E\bigl(np(1-p)\bigr) + \operatorname{Var}(np)$$
$$= n\,E(p) - n\,E(p^2) + n^2\operatorname{Var}(p) = \frac{n}{2} - \frac{n}{3} + \frac{n^2}{12} = \frac{n(n+2)}{12}.$$
To determine the marginal pmf of X, define the indicator
$$I_x = \begin{cases} 1 & X = x \\ 0 & \text{otherwise} \end{cases}$$
Then
$$\begin{aligned}
P(X = x) = E(I_x) &= E\bigl(E(I_x \mid p)\bigr) \\
&= E\bigl(P(X = x \mid p)\bigr) \\
&= E\!\left( \binom{n}{x} p^{x} (1-p)^{n-x} \right) \\
&= \binom{n}{x} \int_{0}^{1} p^{x} (1-p)^{n-x}\,dp \\
&= \binom{n}{x} \frac{\Gamma(x+1)\,\Gamma(n-x+1)}{\Gamma(n+2)} \\
&= \binom{n}{x} \frac{x!\,(n-x)!}{(n+1)!} = \frac{1}{n+1}, \qquad x = 0, 1, \ldots, n.
\end{aligned}$$
The conditional pdf of p given X = x is
$$\begin{aligned}
f_{p|X}(p \mid x) &= \frac{p_{X|p}(x \mid p)\,f_p(p)}{p_X(x)} \\
&= \frac{\binom{n}{x} p^{x} (1-p)^{n-x} \times 1}{1/(n+1)} \\
&= \frac{(n+1)!}{x!\,(n-x)!}\,p^{x}(1-p)^{n-x} \\
&= \frac{\Gamma(n+2)}{\Gamma(x+1)\,\Gamma(n-x+1)}\,p^{(x+1)-1}(1-p)^{(n-x+1)-1}, \qquad 0 < p < 1,
\end{aligned}$$
i.e. p | X = x ~ Beta(x + 1, n − x + 1). Hence
$$E(p \mid X = x) = \frac{x+1}{(x+1) + (n-x+1)} = \frac{x+1}{n+2}.$$
This formula is known as Laplace's law of succession. It was derived in the 18th century by Pierre-Simon Laplace in the course of treating the sunrise problem, which tried to answer the question "What is the probability that the sun will rise tomorrow?"
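A short simulation of the hierarchical model illustrates both results: the marginal pmf of X is uniform on {0, 1, ..., n}, and the posterior mean of p given X = x is (x+1)/(n+2). The sketch below assumes NumPy; n = 10 and the conditioning value x = 7 are arbitrary choices.

```python
import numpy as np

rng = np.random.default_rng(2)
n, reps = 10, 1_000_000
p = rng.uniform(0, 1, size=reps)          # p ~ U(0,1)
x = rng.binomial(n, p)                    # X | p ~ b(n, p)

# Marginal pmf of X: every value should be close to 1/(n+1)
print(np.round(np.bincount(x, minlength=n + 1) / reps, 4))

# Posterior mean of p given X = 7: should be close to (7+1)/(n+2)
print(p[x == 7].mean(), (7 + 1) / (n + 2))
```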
Example 3.29
Let X ~ ℘(λ₁), Y ~ ℘(λ₂) be two independent Poisson random variables. Find the expected value of the proportion X / (X + Y).
Ans: Let N = X + Y. From Example 3.26, we know that $X \mid N \sim b\!\left(N,\ \dfrac{\lambda_1}{\lambda_1+\lambda_2}\right)$. Consider
$$E\!\left(\frac{X}{X+Y}\right) = E\!\left(\frac{X}{N}\right) = E\!\left[E\!\left(\frac{X}{N} \,\Big|\, N\right)\right] = E\!\left\{\frac{1}{N}\,E(X \mid N)\right\} = E\!\left\{\frac{1}{N}\cdot\frac{N\lambda_1}{\lambda_1+\lambda_2}\right\} = \frac{\lambda_1}{\lambda_1+\lambda_2}.$$
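As a numerical check, the proportion can be simulated directly (a sketch assuming NumPy; the rates are arbitrary illustrations, and draws with X + Y = 0, for which the ratio is undefined, are discarded):

```python
import numpy as np

rng = np.random.default_rng(3)
lam1, lam2 = 4.0, 6.0
x = rng.poisson(lam1, size=1_000_000)
y = rng.poisson(lam2, size=1_000_000)

keep = (x + y) > 0                          # the ratio is undefined when X + Y = 0
ratio = x[keep] / (x[keep] + y[keep])
print(ratio.mean(), lam1 / (lam1 + lam2))   # both close to 0.4
```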
Example 3.30 (Prediction of Y from X)
When X and Y are not independent, we can use the observed value of X to predict the value of the unobserved random variable Y. That is, we may predict the value of Y by g(X), where g is a function chosen in such a way that the mean square error of the prediction, $Q = E\bigl((Y - g(X))^2\bigr)$, is minimized.
First, conditioning on X, consider
$$\begin{aligned}
E\bigl((Y - g(X))^2 \mid X\bigr) &= E(Y^2 \mid X) - 2g(X)\,E(Y \mid X) + g(X)^2 \\
&= \operatorname{Var}(Y \mid X) + \bigl(E(Y \mid X)\bigr)^2 - 2g(X)\,E(Y \mid X) + g(X)^2 \\
&= \operatorname{Var}(Y \mid X) + \bigl(g(X) - E(Y \mid X)\bigr)^2.
\end{aligned}$$
Hence
$$Q = E\Bigl\{E\bigl((Y - g(X))^2 \mid X\bigr)\Bigr\} = E\bigl(\operatorname{Var}(Y \mid X)\bigr) + E\Bigl\{\bigl(g(X) - E(Y \mid X)\bigr)^2\Bigr\}.$$
Therefore Q is minimized if we choose g(x) = E(Y | X = x), i.e. the best predictor of Y given the value of X is g(X) = E(Y | X). The mean square error of this predictor is
$$E\bigl((Y - E(Y \mid X))^2\bigr) = E\bigl(\operatorname{Var}(Y \mid X)\bigr) = \operatorname{Var}(Y) - \operatorname{Var}\bigl(E(Y \mid X)\bigr) \le \operatorname{Var}(Y).$$
Example 3.31
Two random variables X and Y are said to have a bivariate normal distribution if their joint pdf is
$$f(x, y) = \frac{1}{2\pi\sigma_x\sigma_y\sqrt{1-\rho^2}} \exp\!\left\{ -\frac{1}{2(1-\rho^2)}\left[ \left(\frac{x-\mu_x}{\sigma_x}\right)^2 - \frac{2\rho(x-\mu_x)(y-\mu_y)}{\sigma_x\sigma_y} + \left(\frac{y-\mu_y}{\sigma_y}\right)^2 \right] \right\},$$
for $-\infty < x < \infty$, $-\infty < y < \infty$, where $\mu_x, \sigma_x^2$ are the mean and variance of X; $\mu_y, \sigma_y^2$ are the mean and variance of Y; and ρ is the correlation coefficient between X and Y. It is denoted as
$$\begin{pmatrix} X \\ Y \end{pmatrix} \sim N\!\left( \begin{pmatrix} \mu_x \\ \mu_y \end{pmatrix},\ \begin{pmatrix} \sigma_x^2 & \rho\sigma_x\sigma_y \\ \rho\sigma_x\sigma_y & \sigma_y^2 \end{pmatrix} \right).$$
The marginal pdf of X is
$$f_X(x) = \int_{-\infty}^{\infty} f(x, y)\,dy = C(x) \int_{-\infty}^{\infty} \frac{1}{\sqrt{2\pi\sigma_y^2(1-\rho^2)}} \exp\!\left\{ -\frac{1}{2(1-\rho^2)}\left(\frac{y-\mu_y}{\sigma_y}\right)^2 \right\} \exp\!\left[ \frac{\rho(x-\mu_x)(y-\mu_y)}{(1-\rho^2)\,\sigma_x\sigma_y} \right] dy,$$
where
$$C(x) = \frac{1}{\sqrt{2\pi\sigma_x^2}} \exp\!\left\{ -\frac{1}{2(1-\rho^2)}\left(\frac{x-\mu_x}{\sigma_x}\right)^2 \right\}.$$
Putting $z = \dfrac{y-\mu_y}{\sigma_y\sqrt{1-\rho^2}}$,
$$f_X(x) = C(x) \int_{-\infty}^{\infty} \frac{1}{\sqrt{2\pi}} \exp\!\left(-\frac{1}{2}z^2\right) \exp\!\left[ \frac{\rho(x-\mu_x)}{\sigma_x\sqrt{1-\rho^2}}\,z \right] dz = C(x)\,M_Z\!\left( \frac{\rho(x-\mu_x)}{\sigma_x\sqrt{1-\rho^2}} \right),$$
where $M_Z(t) = e^{t^2/2}$ is the mgf of Z ~ N(0, 1). Hence
$$f_X(x) = \frac{1}{\sqrt{2\pi\sigma_x^2}} \exp\!\left\{ -\frac{1}{2(1-\rho^2)}\left(\frac{x-\mu_x}{\sigma_x}\right)^2 \right\} \exp\!\left\{ \frac{1}{2}\,\frac{\rho^2(x-\mu_x)^2}{\sigma_x^2(1-\rho^2)} \right\} = \frac{1}{\sqrt{2\pi\sigma_x^2}} \exp\!\left\{ -\frac{1}{2}\left(\frac{x-\mu_x}{\sigma_x}\right)^2 \right\}, \qquad -\infty < x < \infty.$$
Hence the marginal distribution of X is $N(\mu_x, \sigma_x^2)$. The conditional pdf of Y given X = x is given by
$$f_{Y|X}(y \mid x) = \frac{f(x, y)}{f_X(x)} = \frac{1}{\sqrt{2\pi\sigma_y^2(1-\rho^2)}} \exp\!\left\{ -\frac{1}{2\sigma_y^2(1-\rho^2)}\left( y - \mu_y - \frac{\rho\sigma_y}{\sigma_x}(x-\mu_x) \right)^{2} \right\}, \qquad -\infty < y < \infty.$$
Hence
$$Y \mid X \;\sim\; N\!\left( \mu_y + \frac{\rho\sigma_y}{\sigma_x}(X - \mu_x),\ (1-\rho^2)\,\sigma_y^2 \right).$$
The best predictor of Y given the value of X is
$$E(Y \mid X) = \mu_y + \frac{\rho\sigma_y}{\sigma_x}(X - \mu_x).$$
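The regression form of the conditional mean can be seen in simulation: averaging Y over draws whose X value falls near a chosen x tracks the line μ_y + (ρσ_y/σ_x)(x − μ_x). The sketch below assumes NumPy; all parameter values and the conditioning point x₀ are arbitrary illustrations.

```python
import numpy as np

rng = np.random.default_rng(4)
mu_x, mu_y, sd_x, sd_y, rho = 1.0, -2.0, 2.0, 0.5, 0.7        # illustrative parameters
cov = np.array([[sd_x**2, rho * sd_x * sd_y],
                [rho * sd_x * sd_y, sd_y**2]])
xy = rng.multivariate_normal([mu_x, mu_y], cov, size=1_000_000)
x, y = xy[:, 0], xy[:, 1]

x0 = 3.0                                                      # condition on X close to x0
near = np.abs(x - x0) < 0.05
print(y[near].mean())                                         # empirical E(Y | X ~ x0)
print(mu_y + rho * sd_y / sd_x * (x0 - mu_x))                 # theoretical conditional mean
print(y[near].var(), (1 - rho**2) * sd_y**2)                  # conditional variance check
```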
For example, the functions
$$Y_1 = g_1(X_1, X_2) = X_1 + X_2, \qquad Y_2 = g_2(X_1, X_2) = X_1 - X_2$$
would transform the random variables X₁, X₂ into their sum and difference. To determine the joint pdf of the transformed random variables, we may use the following theorem, which is a generalization of the one-variable transformation formula in section 2.6.
In general, let $Y_i = g_i(X_1, \ldots, X_n)$, i = 1, 2, ..., n, for some functions g₁, ..., gₙ satisfying the following conditions:
1. The transformation is one-to-one, i.e. the equations $y_i = g_i(x_1, \ldots, x_n)$, i = 1, ..., n, can be uniquely solved for $x_1, \ldots, x_n$ in terms of $y_1, \ldots, y_n$, say $x_i = h_i(y_1, \ldots, y_n)$.
2. The functions g₁, ..., gₙ have continuous partial derivatives at all points $(x_1, x_2, \ldots, x_n)$ and are such that the n × n Jacobian determinant is non-zero, i.e.
$$J(x_1, \ldots, x_n) = \begin{vmatrix} \dfrac{\partial g_1}{\partial x_1} & \cdots & \dfrac{\partial g_1}{\partial x_n} \\ \vdots & \ddots & \vdots \\ \dfrac{\partial g_n}{\partial x_1} & \cdots & \dfrac{\partial g_n}{\partial x_n} \end{vmatrix} \ne 0.$$
Under these two conditions, the joint pdf of $Y_1, Y_2, \ldots, Y_n$ is given by the following formula:
$$f_Y(y_1, \ldots, y_n) = f_X(x_1, \ldots, x_n) \times \bigl|J(x_1, \ldots, x_n)\bigr|^{-1},$$
where each $x_i$ is expressed in terms of $y_1, \ldots, y_n$ via $x_i = h_i(y_1, \ldots, y_n)$.
Example 3.31
Suppose that two random variables X₁, X₂ have a continuous joint distribution for which the joint pdf is as follows:
$$f_X(x_1, x_2) = \begin{cases} \tfrac{1}{2}(x_1 + x_2)\,e^{-x_1 - x_2} & \text{for } x_1 > 0,\; x_2 > 0 \\ 0 & \text{otherwise} \end{cases}$$
Consider the transformation $Y_1 = X_1 + X_2$, $Y_2 = X_1 - X_2$, so that $X_1 = (Y_1 + Y_2)/2$ and $X_2 = (Y_1 - Y_2)/2$. The Jacobian determinant is
$$J(x_1, x_2) = \begin{vmatrix} 1 & 1 \\ 1 & -1 \end{vmatrix} = -2,$$
and the constraints $x_1 > 0$, $x_2 > 0$ correspond to $-y_1 < y_2 < y_1$, $y_1 > 0$. Hence
$$f_Y(y_1, y_2) = f_X\!\left( \frac{y_1+y_2}{2},\ \frac{y_1-y_2}{2} \right) \times |{-2}|^{-1} = \begin{cases} \tfrac{1}{4}\,y_1 e^{-y_1} & \text{for } -y_1 < y_2 < y_1,\; y_1 > 0 \\ 0 & \text{otherwise} \end{cases}$$
Example 3.32
Let X ~ Γ(α, λ) and Y ~ Γ(β, λ) be independent, so that their joint pdf is
$$f_{X,Y}(x, y) = \begin{cases} \dfrac{\lambda^{\alpha+\beta}}{\Gamma(\alpha)\Gamma(\beta)}\,x^{\alpha-1}y^{\beta-1}e^{-\lambda(x+y)} & \text{for } x > 0,\; y > 0 \\ 0 & \text{otherwise} \end{cases}$$
Consider the transformation
$$U = \frac{X}{X+Y}, \qquad V = X + Y \qquad\Longleftrightarrow\qquad X = UV, \qquad Y = V(1-U).$$
Jacobian determinant:
$$\frac{\partial u}{\partial x} = \frac{y}{(x+y)^2} = \frac{1-u}{v}, \qquad \frac{\partial u}{\partial y} = -\frac{x}{(x+y)^2} = -\frac{u}{v}, \qquad \frac{\partial v}{\partial x} = \frac{\partial v}{\partial y} = 1,$$
$$J(x, y) = \begin{vmatrix} \dfrac{\partial u}{\partial x} & \dfrac{\partial u}{\partial y} \\[1ex] \dfrac{\partial v}{\partial x} & \dfrac{\partial v}{\partial y} \end{vmatrix} = \begin{vmatrix} \dfrac{1-u}{v} & -\dfrac{u}{v} \\[1ex] 1 & 1 \end{vmatrix} = \frac{1-u}{v} \times 1 - 1 \times \left(-\frac{u}{v}\right) = \frac{1}{v} \ne 0.$$
$$\begin{aligned}
f_{U,V}(u, v) &= f_{X,Y}(x, y) \times \bigl|J(x, y)\bigr|^{-1} \\
&= \frac{\lambda^{\alpha+\beta}}{\Gamma(\alpha)\Gamma(\beta)}\,(uv)^{\alpha-1}\bigl(v(1-u)\bigr)^{\beta-1} e^{-\lambda v} \times v \\
&= \frac{\lambda^{\alpha+\beta}}{\Gamma(\alpha)\Gamma(\beta)}\,u^{\alpha-1}(1-u)^{\beta-1}\,v^{\alpha+\beta-1}e^{-\lambda v}, \qquad 0 < u < 1,\; v > 0.
\end{aligned}$$
From the joint pdf it is easily observed that U and V are independent. Note that the joint pdf of (U, V) can be written as
$$f_{U,V}(u, v) = \left\{ \frac{\Gamma(\alpha+\beta)}{\Gamma(\alpha)\Gamma(\beta)}\,u^{\alpha-1}(1-u)^{\beta-1} \right\} \left\{ \frac{\lambda^{\alpha+\beta}}{\Gamma(\alpha+\beta)}\,v^{\alpha+\beta-1}e^{-\lambda v} \right\}, \qquad 0 < u < 1,\; v > 0,$$
and therefore
$$U = \frac{X}{X+Y} \sim \text{Beta}(\alpha, \beta), \qquad V = X + Y \sim \Gamma(\alpha+\beta, \lambda).$$
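A simulation check of this factorisation (a sketch assuming NumPy; α, β, λ are arbitrary illustrations). The sample correlation between U and V should be near zero, and the means of U and V should match those of Beta(α, β) and Γ(α+β, λ).

```python
import numpy as np

rng = np.random.default_rng(5)
alpha, beta, lam = 2.0, 3.0, 1.5                              # illustrative parameters
x = rng.gamma(shape=alpha, scale=1 / lam, size=1_000_000)     # X ~ Gamma(alpha, rate lam)
y = rng.gamma(shape=beta, scale=1 / lam, size=1_000_000)      # Y ~ Gamma(beta, rate lam)

u, v = x / (x + y), x + y
print(np.corrcoef(u, v)[0, 1])                                # ~ 0, consistent with independence
print(u.mean(), alpha / (alpha + beta))                       # Beta(alpha, beta) mean
print(v.mean(), (alpha + beta) / lam)                         # Gamma(alpha+beta, lam) mean
```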
Remark
In practice it is often easier to differentiate the inverse functions $x_i = h_i(y_1, \ldots, y_n)$: the Jacobian determinant of the inverse transformation equals the reciprocal of $J(x_1, \ldots, x_n)$,
$$J(x_1, x_2, \ldots, x_n)^{-1} = \begin{vmatrix} \dfrac{\partial g_1}{\partial x_1} & \dfrac{\partial g_1}{\partial x_2} & \cdots & \dfrac{\partial g_1}{\partial x_n} \\ \vdots & \vdots & \ddots & \vdots \\ \dfrac{\partial g_n}{\partial x_1} & \dfrac{\partial g_n}{\partial x_2} & \cdots & \dfrac{\partial g_n}{\partial x_n} \end{vmatrix}^{-1} = \begin{vmatrix} \dfrac{\partial h_1}{\partial y_1} & \dfrac{\partial h_1}{\partial y_2} & \cdots & \dfrac{\partial h_1}{\partial y_n} \\ \vdots & \vdots & \ddots & \vdots \\ \dfrac{\partial h_n}{\partial y_1} & \dfrac{\partial h_n}{\partial y_2} & \cdots & \dfrac{\partial h_n}{\partial y_n} \end{vmatrix}.$$
Example 3.33
In Example 3.32, since X = UV and Y = V(1 − U),
$$J(x, y)^{-1} = \begin{vmatrix} \dfrac{\partial x}{\partial u} & \dfrac{\partial x}{\partial v} \\[1ex] \dfrac{\partial y}{\partial u} & \dfrac{\partial y}{\partial v} \end{vmatrix} = \begin{vmatrix} v & u \\ -v & 1-u \end{vmatrix} = v \times (1-u) - (-v) \times u = v.$$
Example 3.34
Suppose $X_1, X_2, \ldots, X_n \overset{iid}{\sim} N(\mu, \sigma^2)$. The joint pdf of $X_1, X_2, \ldots, X_n$ is given by
$$f(x_1, x_2, \ldots, x_n) = \prod_{i=1}^{n} \left\{ \frac{1}{\sqrt{2\pi\sigma^2}} \exp\!\left( -\frac{(x_i-\mu)^2}{2\sigma^2} \right) \right\} = \bigl(2\pi\sigma^2\bigr)^{-n/2} \exp\!\left\{ -\frac{1}{2\sigma^2}\sum_{i=1}^{n}(x_i-\mu)^2 \right\}, \qquad -\infty < x_1, x_2, \ldots, x_n < \infty.$$
Consider the transformation
$$Y_1 = \bar{X}, \qquad Y_i = X_i - \bar{X}, \quad i = 2, 3, \ldots, n.$$
The inverse transformation is
$$X_1 = n\bar{X} - (X_2 + \cdots + X_n) = \bar{X} - (X_2 - \bar{X}) - \cdots - (X_n - \bar{X}) = Y_1 - Y_2 - \cdots - Y_n,$$
$$X_i = Y_i + Y_1 \quad \text{for } i = 2, 3, \ldots, n,$$
and the Jacobian determinant of this inverse transformation can be shown to equal n.
Now
$$\sum_{i=1}^{n}(x_i-\mu)^2 = \sum_{i=1}^{n}(x_i-\bar{x})^2 + n(\bar{x}-\mu)^2 = \left(\sum_{i=2}^{n} y_i\right)^{2} + \sum_{i=2}^{n} y_i^2 + n(y_1-\mu)^2.$$
Hence
$$\begin{aligned}
f_Y(y_1, \ldots, y_n) &= \bigl(2\pi\sigma^2\bigr)^{-n/2} \exp\!\left\{ -\frac{1}{2\sigma^2}\left[ \left(\sum_{i=2}^{n} y_i\right)^{2} + \sum_{i=2}^{n} y_i^2 + n(y_1-\mu)^2 \right] \right\} \times n \\
&= \left\{ \frac{1}{\sqrt{2\pi(\sigma^2/n)}} \exp\!\left( -\frac{(y_1-\mu)^2}{2(\sigma^2/n)} \right) \right\} \times \left\{ \sqrt{n}\,\bigl(2\pi\sigma^2\bigr)^{-(n-1)/2} \exp\!\left\{ -\frac{1}{2\sigma^2}\left[ \left(\sum_{i=2}^{n} y_i\right)^{2} + \sum_{i=2}^{n} y_i^2 \right] \right\} \right\}.
\end{aligned}$$
The joint pdf factorizes into a function of $y_1$ alone and a function of $(y_2, \ldots, y_n)$ alone. Therefore $\bar{X} = Y_1 \sim N(\mu, \sigma^2/n)$ is independent of $(Y_2, \ldots, Y_n)$, and hence is independent of the sample variance
$$S^2 = \frac{1}{n-1}\sum_{i=1}^{n}\bigl(X_i - \bar{X}\bigr)^2 = \frac{1}{n-1}\left[ \left(\sum_{i=2}^{n} Y_i\right)^{2} + \sum_{i=2}^{n} Y_i^2 \right],$$
which is a function of $Y_2, \ldots, Y_n$ only.
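The independence of X̄ and S² for normal samples can be illustrated by simulation. The rough check below (a sketch assuming NumPy; the parameter values are illustrative) only computes the correlation of the two statistics across many samples, which is a necessary but not sufficient condition for independence.

```python
import numpy as np

rng = np.random.default_rng(6)
mu, sigma, n, reps = 5.0, 2.0, 10, 200_000        # illustrative values
samples = rng.normal(mu, sigma, size=(reps, n))

xbar = samples.mean(axis=1)                       # sample mean of each sample
s2 = samples.var(axis=1, ddof=1)                  # sample variance with divisor n - 1
print(np.corrcoef(xbar, s2)[0, 1])                # ~ 0 for normal data
```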
In this section we consider situations in which two random variables are transformed into a single random variable.

Z = X + Y

Discrete case:
$$p_Z(z) = P(X + Y = z) = \sum_{x} p(x, z - x) = \sum_{y} p(z - y, y).$$
Continuous case:
$$F_Z(z) = P(Z \le z) = P(X + Y \le z) = P(Y \le z - X) = \int_{-\infty}^{\infty}\int_{-\infty}^{z-x} f(x, y)\,dy\,dx = P(X \le z - Y) = \int_{-\infty}^{\infty}\int_{-\infty}^{z-y} f(x, y)\,dx\,dy,$$
and differentiating,
$$f_Z(z) = \int_{-\infty}^{\infty} f(x, z - x)\,dx = \int_{-\infty}^{\infty} f(z - y, y)\,dy.$$
Example 3.35
Let X and Y be independent geometric(p) random variables with common pmf $p_X(x) = p_Y(x) = (1-p)^{x-1}p$, x = 1, 2, .... What is the distribution of Z = X + Y?
Support of Z: z = 2, 3, ...
$$\begin{aligned}
p_Z(z) &= \sum_{x} p(x, z - x) = \sum_{x=1}^{\infty} p_X(x)\,p_Y(z - x) \\
&= \sum_{x=1}^{z-1} \bigl((1-p)^{x-1}p\bigr)\bigl((1-p)^{z-x-1}p\bigr) \qquad (\,p_Y(z - x) = 0 \text{ if } z - x < 1\,) \\
&= p^2 \sum_{x=1}^{z-1} (1-p)^{z-2} \\
&= (z-1)\,p^2(1-p)^{z-2} = \binom{z-1}{2-1} p^2 (1-p)^{z-2}.
\end{aligned}$$
Comparing with the pmf of the negative binomial distribution nb(r, p),
$$p(x) = \binom{x-1}{r-1} p^{r} (1-p)^{x-r}, \qquad x = r, r+1, \ldots,$$
we can see that Z ~ nb(2, p).
Example 3.36
Suppose that X and Y have joint density f(x, y) = e^{-y} for 0 < x < y (and 0 otherwise). What is the distribution of Z = X + Y?
Support of Z: z ∈ (0, ∞)
$$f_Z(z) = \int_{-\infty}^{\infty} f(z - y, y)\,dy = \int_{z/2}^{z} e^{-y}\,dy = \bigl[-e^{-y}\bigr]_{z/2}^{z} = e^{-z/2} - e^{-z}, \qquad z > 0,$$
since the integrand is non-zero only when 0 < z − y < y, i.e. z/2 < y < z.
Example 3.37
Let X and Y be two independent U(0, 1) random variables. What is the distribution of Z = X + Y?
Support of Z: z ∈ (0, 2)
$$f_Z(z) = \int_{-\infty}^{\infty} f(z - y, y)\,dy = \int_{-\infty}^{\infty} f_X(z - y)\,f_Y(y)\,dy.$$
For 0 < z ≤ 1,
$$f_Z(z) = \int_{0}^{z} (1)(1)\,dy = z.$$
For 1 < z < 2,
$$f_Z(z) = \int_{z-1}^{1} (1)(1)\,dy = 2 - z.$$
Therefore
$$f_Z(z) = \begin{cases} z & \text{for } 0 < z \le 1 \\ 2 - z & \text{for } 1 < z < 2 \\ 0 & \text{otherwise} \end{cases}$$
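A quick check of this triangular density (a sketch assuming NumPy): the histogram of simulated values of X + Y should match f_Z at the bin midpoints.

```python
import numpy as np

rng = np.random.default_rng(7)
z = rng.uniform(size=1_000_000) + rng.uniform(size=1_000_000)   # X + Y with X, Y ~ U(0,1)

# Compare the empirical density on a few bins with the triangular pdf
hist, edges = np.histogram(z, bins=20, range=(0, 2), density=True)
mids = (edges[:-1] + edges[1:]) / 2
triangular = np.where(mids <= 1, mids, 2 - mids)
print(np.round(hist, 2))
print(np.round(triangular, 2))
```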
Z = X − Y

Discrete case:
$$p_Z(z) = P(X - Y = z) = \sum_{x} p(x, x - z) = \sum_{y} p(z + y, y).$$
Continuous case:
$$F_Z(z) = P(Z \le z) = P(X - Y \le z) = P(Y \ge X - z) = \int_{-\infty}^{\infty}\int_{x-z}^{\infty} f(x, y)\,dy\,dx = P(X \le z + Y) = \int_{-\infty}^{\infty}\int_{-\infty}^{z+y} f(x, y)\,dx\,dy.$$
Example 3.38
Suppose that X and Y have joint density f(x, y) = e^{-y} for 0 < x < y, as in Example 3.36. What is the distribution of Z = Y − X?
Support of Z: z ∈ (0, ∞)
$$\begin{aligned}
F_Z(z) = P(Z \le z) = P(Y - X \le z) &= \int_{0}^{\infty}\int_{0}^{x+z} f(x, y)\,dy\,dx \\
&= \int_{0}^{\infty}\int_{x}^{x+z} e^{-y}\,dy\,dx \\
&= \int_{0}^{\infty} \bigl( e^{-x} - e^{-(x+z)} \bigr)\,dx \\
&= 1 - e^{-z}.
\end{aligned}$$
(The region of integration is the strip between the lines y = x and y = x + z in the first quadrant.)
$$f_Z(z) = F_Z'(z) = e^{-z}, \qquad z > 0.$$
Hence Z = Y − X ~ Exp(1).
Example 3.39
Let X and Y be two independent U(0, 1) random variables. What is the distribution of Z = X − Y?
Support of Z: z ∈ (−1, 1)
$$f_Z(z) = \int_{-\infty}^{\infty} f(z + y, y)\,dy = \int_{-\infty}^{\infty} f_X(z + y)\,f_Y(y)\,dy.$$
For −1 < z ≤ 0,
$$f_Z(z) = \int_{-z}^{1} (1)(1)\,dy = 1 + z.$$
For 0 < z < 1,
$$f_Z(z) = \int_{0}^{1-z} (1)(1)\,dy = 1 - z.$$
Therefore
$$f_Z(z) = \begin{cases} 1 + z & \text{for } -1 < z \le 0 \\ 1 - z & \text{for } 0 < z < 1 \\ 0 & \text{otherwise} \end{cases}$$
Z = XY

Discrete case:
$$p_Z(z) = P(XY = z) = \sum_{x} p(x, z/x) = \sum_{y} p(z/y, y).$$
Continuous case:
$$F_Z(z) = P(Z \le z) = P(XY \le z) = P(Y \le z/X,\, X > 0) + P(Y \ge z/X,\, X < 0) = \int_{0}^{\infty}\int_{-\infty}^{z/x} f(x, y)\,dy\,dx + \int_{-\infty}^{0}\int_{z/x}^{\infty} f(x, y)\,dy\,dx,$$
$$f_Z(z) = F_Z'(z) = \int_{0}^{\infty} f(x, z/x)\,\frac{1}{x}\,dx - \int_{-\infty}^{0} f(x, z/x)\,\frac{1}{x}\,dx = \int_{-\infty}^{\infty} f(x, z/x)\,\frac{1}{|x|}\,dx = \int_{-\infty}^{\infty} f(z/y, y)\,\frac{1}{|y|}\,dy.$$
Example 3.40
Let X and Y be two independent U(0, 1) random variables. What is the distribution of Z = XY?
Support of Z: z ∈ (0, 1)
$$f_Z(z) = \int_{-\infty}^{\infty} f(x, z/x)\,\frac{1}{|x|}\,dx = \int_{-\infty}^{\infty} f_X(x)\,f_Y(z/x)\,\frac{1}{|x|}\,dx.$$
The integrand is non-zero only when 0 < x < 1 and 0 < z/x < 1, i.e. z < x < 1. Therefore
$$f_Z(z) = \int_{z}^{1} (1)(1)\,\frac{1}{x}\,dx = -\ln z, \qquad 0 < z < 1.$$
Z = X / Y

Discrete case:
$$p_Z(z) = P(X/Y = z) = \sum_{x} p(x, x/z) = \sum_{y} p(zy, y).$$
Continuous case:
$$F_Z(z) = P(Z \le z) = P(X/Y \le z) = P(X \le zY,\, Y > 0) + P(X \ge zY,\, Y < 0) = \int_{0}^{\infty}\int_{-\infty}^{zy} f(x, y)\,dx\,dy + \int_{-\infty}^{0}\int_{zy}^{\infty} f(x, y)\,dx\,dy,$$
$$f_Z(z) = \int_{-\infty}^{\infty} f(zy, y)\,|y|\,dy = \int_{-\infty}^{\infty} f\!\left( x, \frac{x}{z} \right)\frac{|x|}{z^2}\,dx.$$
Example 3.41
Let X and Y be two independent N(0, 1) random variables. What is the distribution of Z = X / Y?
Support of Z: z ∈ (−∞, ∞)
$$\begin{aligned}
f_Z(z) &= \int_{-\infty}^{\infty} f(zy, y)\,|y|\,dy = \int_{-\infty}^{\infty} f_X(zy)\,f_Y(y)\,|y|\,dy \\
&= \int_{-\infty}^{\infty} \left( \frac{1}{\sqrt{2\pi}}\,e^{-z^2y^2/2} \right)\left( \frac{1}{\sqrt{2\pi}}\,e^{-y^2/2} \right)|y|\,dy \\
&= 2\int_{0}^{\infty} \frac{1}{2\pi}\,\exp\!\left( -\frac{(1+z^2)\,y^2}{2} \right) y\,dy \qquad \text{(the integrand is an even function)} \\
&= \frac{1}{\pi}\left[ -\frac{1}{1+z^2}\,\exp\!\left( -\frac{(1+z^2)\,y^2}{2} \right) \right]_{0}^{\infty} \\
&= \frac{1}{\pi(1+z^2)}, \qquad -\infty < z < \infty,
\end{aligned}$$
which is the pdf of the standard Cauchy distribution.
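A simulation check (a sketch assuming NumPy): the empirical CDF of X/Y should match the standard Cauchy CDF, F(z) = 1/2 + arctan(z)/π, at a few reference points.

```python
import numpy as np

rng = np.random.default_rng(8)
z = rng.standard_normal(1_000_000) / rng.standard_normal(1_000_000)   # ratio of independent N(0,1)

for q in (-2.0, -0.5, 0.0, 0.5, 2.0):
    empirical = (z <= q).mean()                  # empirical CDF at q
    cauchy_cdf = 0.5 + np.arctan(q) / np.pi      # standard Cauchy CDF at q
    print(q, round(empirical, 4), round(cauchy_cdf, 4))
```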
Example 3.42
Let X ~ N(0, 1) and Y ~ χ²_r be independent, and consider Z = X / √(Y/r). First, the pdf of W = √(Y/r) follows from the one-variable transformation formula (with Y = rW²):
$$f_W(w) = f_Y(rw^2) \times 2rw = \frac{1}{\Gamma(r/2)\,2^{r/2}}\,(rw^2)^{r/2-1} e^{-rw^2/2} \times 2rw = \frac{r^{r/2}}{\Gamma(r/2)\,2^{r/2-1}}\,w^{r-1} e^{-rw^2/2}, \qquad w > 0.$$
Now Z = X / W. Support of Z: z ∈ (−∞, ∞).
$$\begin{aligned}
f_Z(z) &= \int_{-\infty}^{\infty} f(zw, w)\,|w|\,dw = \int_{-\infty}^{\infty} f_X(zw)\,f_W(w)\,|w|\,dw \\
&= \int_{0}^{\infty} \left( \frac{1}{\sqrt{2\pi}}\,e^{-z^2w^2/2} \right)\left( \frac{r^{r/2}}{\Gamma(r/2)\,2^{r/2-1}}\,w^{r-1}e^{-rw^2/2} \right) w\,dw \qquad (\,f_W(w) = 0 \text{ if } w < 0\,) \\
&= \frac{r^{r/2}}{\sqrt{\pi}\,\Gamma(r/2)\,2^{(r-1)/2}} \int_{0}^{\infty} w^{r}\,\exp\!\left( -\frac{(r+z^2)\,w^2}{2} \right) dw \\
&= \frac{r^{r/2}}{\Gamma(1/2)\,\Gamma(r/2)\,2^{(r-1)/2}} \int_{0}^{\infty} \left( \frac{2u}{r+z^2} \right)^{r/2} e^{-u}\,\frac{1}{2}\left( \frac{2u}{r+z^2} \right)^{-1/2} \frac{2}{r+z^2}\,du \qquad \left( \text{put } u = \frac{(r+z^2)w^2}{2} \;\Leftrightarrow\; w = \left( \frac{2u}{r+z^2} \right)^{1/2} \right) \\
&= \frac{r^{r/2}\,2^{(r-1)/2}}{\Gamma(1/2)\,\Gamma(r/2)\,2^{(r-1)/2}\,(r+z^2)^{(r+1)/2}} \int_{0}^{\infty} u^{\frac{r-1}{2}} e^{-u}\,du \\
&= \frac{\Gamma\!\left(\frac{r+1}{2}\right)}{\Gamma\!\left(\frac{1}{2}\right)\Gamma\!\left(\frac{r}{2}\right)}\,r^{r/2}\,(r+z^2)^{-\frac{r+1}{2}} = \frac{\Gamma\!\left(\frac{r+1}{2}\right)}{\Gamma\!\left(\frac{1}{2}\right)\Gamma\!\left(\frac{r}{2}\right)}\,r^{-1/2}\left( 1 + \frac{z^2}{r} \right)^{-\frac{r+1}{2}}, \qquad -\infty < z < \infty.
\end{aligned}$$
This is the pdf of the t distribution with r degrees of freedom.
Example 3.43
Let X and Y be two independent U(0, 1) random variables. What is the distribution of Z = Y / X?
Support of Z: z ∈ (0, ∞)
Distribution function of Z:
$$F_Z(z) = P(Z \le z) = P\!\left( \frac{Y}{X} \le z \right) = P(Y \le zX).$$
For 0 < z < 1 (the region of integration is the part of the unit square below the line y = zx),
$$F_Z(z) = \int_{0}^{1}\int_{0}^{zx} (1)(1)\,dy\,dx = \int_{0}^{1} zx\,dx = \frac{z}{2}.$$
For 1 ≤ z < ∞ (the region of integration is the part of the unit square to the right of the line x = y/z),
$$F_Z(z) = \int_{0}^{1}\int_{y/z}^{1} (1)(1)\,dx\,dy = \int_{0}^{1}\left( 1 - \frac{y}{z} \right) dy = 1 - \frac{1}{2z}.$$
Hence
$$F_Z(z) = \begin{cases} 0 & \text{for } z \le 0 \\ \dfrac{z}{2} & \text{for } 0 < z < 1 \\ 1 - \dfrac{1}{2z} & \text{for } z \ge 1 \end{cases}, \qquad f_Z(z) = \begin{cases} \dfrac{1}{2} & \text{if } 0 < z < 1 \\ \dfrac{1}{2z^2} & \text{if } z \ge 1 \\ 0 & \text{otherwise} \end{cases}$$