Ch4-Part 1

Chapter 4 discusses multivariate random variables, focusing on their properties, joint distributions, independence, and conditional distributions. It introduces concepts such as joint probability mass functions, marginal distributions, and the multivariate moment generating function (MGF). The chapter also includes examples and applications to illustrate these concepts.

Chapter 4: Multivariate Random Variables

Li-Pang Chen, PhD

Department of Statistics, National Chengchi University

©Fall 2023

1 / 77
Outline

1 Motivation

2 Jointly Distributed Random Variables

3 Independence and Conditional Distributions

4 Multivariate MGF

5 Transformation of Random Variables

6 Expectation, Variance, and Covariance

7 Some Specific Applications

2 / 77
4.1 Motivation

In the preceding chapters, we discussed the univariate random variable X
and its relevant properties.
In practice, however, we often work with two or more random variables at
the same time.
In this chapter, we discuss multivariate random variables and their
properties.

3 / 77
4.2 Jointly Distributed Random Variables
Definition
When X and Y are both discrete random variables with possible values
x1, x2, ... and y1, y2, ..., respectively, the joint probability mass
function of X and Y at X = xi and Y = yj is defined as

p(xi, yj) = P(X = xi, Y = yj),   (1)

for i = 1, 2, ... and j = 1, 2, ....

4 / 77
4.2 Jointly Distributed Random Variables

Definition (Marginal distribution)

If X and Y are discrete random variables with joint pmf (1), then the
marginal pmf of X at X = xi is pX(xi) = Σ_j p(xi, yj), and the marginal
pmf of Y at Y = yj is pY(yj) = Σ_i p(xi, yj).

5 / 77
4.2 Jointly Distributed Random Variables
Example: Suppose that 3 batteries are randomly chosen from a group of
3 new, 4 used but still working, and 5 defective batteries. If we let X and
Y denote, respectively, the number of new and used but still working
batteries that are chosen, then find the joint probability mass function of
X and Y .

Solution: There are C(12,3) = 220 equally likely ways to choose 3 of the 12
batteries, so for i, j ≥ 0 with i + j ≤ 3,

p(i, j) = P(X = i, Y = j) = C(3, i) C(4, j) C(5, 3 − i − j) / C(12, 3).

For instance, p(0, 0) = 10/220, p(1, 1) = 60/220, and p(2, 1) = 12/220;
entries with i + j > 3 have probability 0.
6 / 77
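A quick way to check the joint pmf above is to enumerate it directly. The following Python sketch (not part of the original slides; it only uses the standard library) tabulates p(i, j) = C(3,i)C(4,j)C(5,3−i−j)/C(12,3) and verifies that the probabilities sum to one.

from math import comb

total = comb(12, 3)  # 220 equally likely ways to choose 3 of the 12 batteries
pmf = {}
for i in range(4):            # i = number of new batteries chosen
    for j in range(4 - i):    # j = number of used-but-working batteries, i + j <= 3
        pmf[(i, j)] = comb(3, i) * comb(4, j) * comb(5, 3 - i - j) / total

print(pmf[(1, 1)])            # 60/220 ≈ 0.2727
print(sum(pmf.values()))      # 1.0, so the entries form a valid joint pmf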
4.2 Jointly Distributed Random Variables

Recall: if X is a continuous random variable, then we usually consider
an interval/range.
In Section 2.1, we considered the probability that X falls in an interval
[a, b], i.e., P(a < X < b).
Naturally, this can be extended to the bivariate case.
If X and Y are both continuous random variables, then we work with
regions in the two-dimensional plane.

7 / 77
4.2 Jointly Distributed Random Variables

Definition
Let X and Y denote continuous random variables, and let f : R² → R⁺ be a
bivariate continuous function. Suppose that C is a set in the two-dimensional
plane; then the probability that (X, Y) falls in C is defined as

P((X, Y) ∈ C) = ∫∫_{(x,y) ∈ C} f(x, y) dy dx.   (2)
8 / 77
4.2 Jointly Distributed Random Variables

Similar to discrete random variables, we can define marginal densities
by "integrating out" the other random variable (the analogue of summing
over the other random variable in the discrete case).
That is, suppose that the support of f(x, y) is R²; then the marginal
density functions of X and Y are, respectively, given by

fX(x) = ∫_{−∞}^{∞} f(x, y) dy   and   fY(y) = ∫_{−∞}^{∞} f(x, y) dx.   (3)

Note: in (3), if we integrate with respect to x (or y), then we treat
y (or x) as a constant.

9 / 77
4.2 Jointly Distributed Random Variables

Moreover, the marginal probabilities of X ∈ [a, b] and Y ∈ [c, d] are,
respectively, given by

P(X ∈ [a, b]) = ∫_a^b fX(x) dx = ∫_a^b ( ∫_{−∞}^{∞} f(x, y) dy ) dx

and

P(Y ∈ [c, d]) = ∫_c^d fY(y) dy = ∫_c^d ( ∫_{−∞}^{∞} f(x, y) dx ) dy.
10 / 77
4.2 Jointly Distributed Random Variables
Example: The joint density function of X and Y is given by

f(x, y) = 2e^{−x−2y} for 0 < x < ∞, 0 < y < ∞, and f(x, y) = 0 otherwise.

Compute (a) P(X > 1, Y < 1) and (b) P(X < a) for some constant a > 0.

Solution:
(a) P(X > 1, Y < 1) = ∫_0^1 ∫_1^∞ 2e^{−x} e^{−2y} dx dy = e^{−1}(1 − e^{−2}).
(b) P(X < a) = ∫_0^a ∫_0^∞ 2e^{−x−2y} dy dx = ∫_0^a e^{−x} dx = 1 − e^{−a}.
11 / 77
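Because 2e^{−x−2y} factors as e^{−x} · 2e^{−2y}, part (a) can be sanity-checked by simulation. The sketch below is illustrative (it assumes NumPy is available and is not part of the original slides): it draws X with rate 1 and Y with rate 2 and compares the empirical frequency with e^{−1}(1 − e^{−2}) ≈ 0.318.

import numpy as np

rng = np.random.default_rng(0)
n = 1_000_000
x = rng.exponential(scale=1.0, size=n)   # X has density e^{-x}
y = rng.exponential(scale=0.5, size=n)   # Y has density 2e^{-2y} (rate 2)

est = np.mean((x > 1) & (y < 1))         # Monte Carlo estimate of P(X > 1, Y < 1)
exact = np.exp(-1) * (1 - np.exp(-2))
print(est, exact)                        # both ≈ 0.318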
4.3 Independence and Conditional Distributions

Independence:

Recall: in Section 3.4, we said that two events E and F are independent if
P(E ∩ F) = P(E)P(F).
A similar idea can be applied to two random variables: if X and Y are
independent, then the joint pmf/pdf of X and Y can be decomposed as
the product of the marginal pmf/pdf of X and of Y.

12 / 77
4.3 Independence and Conditional Distributions

Here we state the formal definition.

Definition
The random variables X and Y are said to be independent if for any two
sets of real numbers A and B,

P(X ∈ A, Y ∈ B) = P(X ∈ A) P(Y ∈ B).

13 / 77
4.3 Independence and Conditional Distributions

When X and Y are continuous, then we take A and B as intervals,


and f (x, y ) = fX (x)fY (y ) with the support sets of X and Y being a
Cartesian product.
When X and Y are discrete, then taking A = {a} and B = {b}, we
have p(x, y ) = pX (x)pY (y ) for all x and y with the support sets of X
and Y being a Cartesian product.
In general, suppose that there are n independent random variables
X1 , · · · , Xn , then for all sets of real numbers A1 , · · · , An , we have
n
'
P(X1 ↓ A1 , · · · , Xn ↓ An ) = P(Xi ↓ Ai ). (4)
i=1

P Xiiyiltpxlxiypu uxiigi

14 / 77
4.3 Independence and Conditional Distributions

Definition
X1 , · · · , Xn are independent and identically distributed (i.i.d.) if
(a) they are independent and satisfy (4);
(b) they have the same distribution.

15 / 77
4.3 Independence and Conditional Distributions

Example: Continue the example on page 6 and examine the independence
between X and Y.

Solution: From the joint pmf, p(3, 1) = P(X = 3, Y = 1) = 0, since only
3 batteries are drawn. However, pX(3) = 1/220 > 0 and pY(1) > 0, so
p(3, 1) ≠ pX(3) pY(1). Hence X and Y are not independent.
16 / 77
4.3 Independence and Conditional Distributions
Example: Continue the earlier example in which X and Y are two random
variables whose joint density function is given by

f(x, y) = 2e^{−x−2y} for 0 < x < ∞, 0 < y < ∞, and f(x, y) = 0 otherwise.

Check the independence of X and Y.

Solution: fX(x) = ∫_0^∞ 2e^{−x−2y} dy = e^{−x} for x > 0, and
fY(y) = ∫_0^∞ 2e^{−x−2y} dx = 2e^{−2y} for y > 0. Since
f(x, y) = fX(x) fY(y) and the support is a Cartesian product,
X and Y are independent.
17 / 77
4.3 Independence and Conditional Distributions
Example: The joint pdf of X and Y is
f (x, y ) = 8xy for 0 < x < y < 1.
Examine if X is independent of Y .

Solution: fX(x) = ∫_x^1 8xy dy = 4x(1 − x²) for 0 < x < 1, and
fY(y) = ∫_0^y 8xy dx = 4y³ for 0 < y < 1. Since
fX(x) fY(y) = 16x y³ (1 − x²) ≠ 8xy, and the support {0 < x < y < 1}
is not a Cartesian product, X is not independent of Y.
18 / 77
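The marginal computations above can be reproduced symbolically. A minimal SymPy sketch (illustrative only, assuming SymPy is available; not part of the original slides):

import sympy as sp

x, y = sp.symbols('x y', positive=True)
f = 8 * x * y                              # joint pdf on 0 < x < y < 1

fX = sp.integrate(f, (y, x, 1))            # marginal of X: 4x(1 - x^2)
fY = sp.integrate(f, (x, 0, y))            # marginal of Y: 4y^3
print(sp.simplify(fX), sp.simplify(fY))
print(sp.simplify(fX * fY - f))            # not identically 0, so X and Y are dependent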
4.3 Independence and Conditional Distributions

Conditional probability:

Recall: in Section 3.4, we defined the conditional probability: for two
events E and F, we have P(E|F) = P(E ∩ F) / P(F), provided that P(F) > 0.
Now, suppose that X and Y are two random variables; how can we
define "conditional probability" for them?

19 / 77
4.3 Independence and Conditional Distributions

If X and Y are discrete random variables, then the conditional pmf of
X given Y = y is defined as

pX|Y(x|y) := P(X = x | Y = y) = P(X = x, Y = y) / P(Y = y) = p(x, y) / pY(y).

20 / 77
4.3 Independence and Conditional Distributions

If X and Y are continuous random variables, then the conditional pdf
of X given Y = y is defined as

fX|Y(x|y) := f(x, y) / fY(y).

21 / 77
4.3 Independence and Conditional Distributions
Example: Suppose that p(x, y ), the joint pmf of X and Y , is given by
p(0, 0) = 0.4, p(0, 1) = 0.2, p(1, 0) = 0.1, p(1, 1) = 0.3.
Calculate the conditional pmf of X given Y = 1.

Solution: pY(1) = p(0, 1) + p(1, 1) = 0.2 + 0.3 = 0.5. Hence
pX|Y(0|1) = 0.2/0.5 = 0.4 and pX|Y(1|1) = 0.3/0.5 = 0.6.
22 / 77
4.3 Independence and Conditional Distributions
Example: The joint pdf of X and Y is given by

f(x, y) = (12/5) x (2 − x − y) for 0 < x < 1, 0 < y < 1, and f(x, y) = 0 otherwise.

Compute the conditional pdf of X given Y = y for y ∈ (0, 1).

Solution: fY(y) = ∫_0^1 (12/5) x (2 − x − y) dx = (12/5)(2/3 − y/2) = (2/5)(4 − 3y).
Therefore,

fX|Y(x|y) = f(x, y) / fY(y) = 6x(2 − x − y) / (4 − 3y),   0 < x < 1.
23 / 77
4.3 Independence and Conditional Distributions

In applications, X and Y can follow different distributions when
characterizing the conditional distribution.
For example, consider the model

X | Y ∼ Binom(Y, p)  and  Y ∼ Pois(λ),   (5)

which is called a hierarchical model.
We then ask: what is the (marginal) distribution of X?

24 / 77
4.3 Independence and Conditional Distributions
Example: Derive the distribution of X under (5).

Solution:
P(X = x) = Σ_{y=x}^{∞} P(X = x | Y = y) P(Y = y)
         = Σ_{y=x}^{∞} [ C(y, x) p^x (1 − p)^{y−x} ] [ e^{−λ} λ^y / y! ]
         = e^{−λ} (λp)^x / x! · Σ_{y=x}^{∞} [λ(1 − p)]^{y−x} / (y − x)!
         = e^{−λ} (λp)^x / x! · e^{λ(1−p)}
         = e^{−λp} (λp)^x / x!,   x = 0, 1, 2, ...

That is, X ∼ Pois(λp).
25 / 77
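The mixture result X ∼ Pois(λp) can also be checked by simulating the hierarchy (5) directly. The sketch below is illustrative (assuming NumPy and SciPy are available; λ = 3 and p = 0.4 are arbitrary choices, not values from the slides).

import numpy as np
from scipy.stats import poisson

rng = np.random.default_rng(1)
lam, p, n = 3.0, 0.4, 1_000_000

y = rng.poisson(lam, size=n)          # Y ~ Pois(lambda)
x = rng.binomial(y, p)                # X | Y ~ Binom(Y, p)

for k in range(5):                    # compare empirical pmf with Pois(lambda * p)
    print(k, np.mean(x == k), poisson.pmf(k, lam * p))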
4.3 Independence and Conditional Distributions

Remark:
The distribution of X derived under (5) is called a mixture distribution:
the distribution of X depends on a quantity (here Y) that itself has a
distribution.
In general, hierarchical models lead to mixture distributions.

26 / 77
4.4 Multivariate MGF

Recall: in Chapter 3, we introduced MGF for a random variable X .


In the presence of multivariate random variables X1 , · · · , Xn , the MGF
is defined as

M(t1, ..., tn) = E{exp(t1 X1 + ··· + tn Xn)}.

27 / 77
4.4 Multivariate MGF

Some properties of M(t1 , · · · , tn ):


If ti ≠ 0 for some i and tj = 0 for all j ≠ i, then M(t1, ..., tn) is equal
to MXi(ti), the MGF of Xi.
If X1, ..., Xn are independent, then M(t1, ..., tn) can be written as

M(t1, ..., tn) = MX1(t1) × ··· × MXn(tn).
28 / 77
4.4 Multivariate MGF

Applications: Sums of random variables


Suppose that X1, ..., Xn are i.i.d. random variables.
We may ask: what is the distribution of Y = Σ_{i=1}^{n} Xi?
To answer this, we can apply the MGF (why?).

29 / 77
4.4 Multivariate MGF

Example: Suppose that X1, ..., Xn are i.i.d. random variables. Determine
the distribution of Y = Σ_{i=1}^{n} Xi if Xi follows
(a) Bernoulli(p);
(b) Pois(λ);
(c) N(µ, σ²).

Solution: By independence, MY(t) = E(e^{tY}) = {MX1(t)}^n.
(a) MX1(t) = 1 − p + pe^t, so MY(t) = (1 − p + pe^t)^n, i.e., Y ∼ Binom(n, p).
(b) MX1(t) = exp{λ(e^t − 1)}, so MY(t) = exp{nλ(e^t − 1)}, i.e., Y ∼ Pois(nλ).
(c) MX1(t) = exp(µt + σ²t²/2), so MY(t) = exp(nµt + nσ²t²/2), i.e., Y ∼ N(nµ, nσ²).
30 / 77
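Part (a) can be verified by simulation: sums of i.i.d. Bernoulli(p) draws should match the Binomial(n, p) pmf. An illustrative sketch (assuming NumPy and SciPy; n = 10 and p = 0.3 are arbitrary choices):

import numpy as np
from scipy.stats import binom

rng = np.random.default_rng(2)
n, p, reps = 10, 0.3, 500_000

sums = rng.binomial(1, p, size=(reps, n)).sum(axis=1)   # Y = X1 + ... + Xn, Xi ~ Bernoulli(p)
for k in (0, 2, 5):
    print(k, np.mean(sums == k), binom.pmf(k, n, p))    # empirical vs Binom(n, p) pmf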
4.4 Multivariate MGF
Example: Suppose that X1, ..., Xn are independent random variables, where Xi
follows the Gamma distribution gamma(ϑi, ϖ) for i = 1, ..., n. Determine
the distribution of Σ_{i=1}^{n} Xi.

Solution: Writing the MGF of gamma(ϑ, ϖ) as MX(t) = (1 − ϖt)^{−ϑ} (with ϖ the
common scale parameter), independence gives

M_{ΣXi}(t) = ∏_{i=1}^{n} (1 − ϖt)^{−ϑi} = (1 − ϖt)^{−Σϑi},

which is the MGF of gamma(Σ_{i=1}^{n} ϑi, ϖ). Hence Σ Xi ∼ gamma(Σ ϑi, ϖ);
recall that the MGF of a random variable uniquely determines its pdf/pmf.
31 / 77
4.5 Transformation of Random Variables

Let X and Y be two random variables.


Suppose that there exist two continuous functions g1 and g2 , such
that two transformed random variables U and V are defined as

U = g1 (X , Y ) and V = g2 (X , Y ).

The questions are:
(i) What is the joint density function of (U, V)?
(ii) What are the marginal density functions of U and V?

32 / 77
4.5 Transformation of Random Variables
The joint distribution function of U and V is defined as

P(U ≤ u, V ≤ v) = ∫∫_{g1(x,y) ≤ u, g2(x,y) ≤ v} fX,Y(x, y) dy dx.

Taking derivatives of P(U ≤ u, V ≤ v) with respect to u and v gives

fU,V(u, v) = fX,Y(x, y) |J(x, y)|^{−1},   (6)

where X = h1(U, V) and Y = h2(U, V) for some continuous functions h1 and h2, and

J(x, y) = | ∂g1/∂x  ∂g1/∂y |
          | ∂g2/∂x  ∂g2/∂y |

is called the Jacobian determinant.


33 / 77
4.5 Transformation of Random Variables
Example: Suppose that X and Y are independent random variables, each following
the standard normal distribution. Determine the joint distribution of
U = X + Y and V = X − Y.

Solution: fX,Y(x, y) = fX(x) fY(y) = (1/2π) exp{−(x² + y²)/2}. With
g1(x, y) = x + y and g2(x, y) = x − y, the Jacobian determinant is
J(x, y) = (1)(−1) − (1)(1) = −2, and the inverse transform is
x = (u + v)/2, y = (u − v)/2. By (6),

fU,V(u, v) = (1/2π) exp{−[(u + v)² + (u − v)²]/8} · (1/2)
           = (1/4π) exp{−(u² + v²)/4}.

Hence U and V are independent, each distributed as N(0, 2).
34 / 77
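The Jacobian bookkeeping in this example can be automated. A small SymPy sketch (illustrative only, not part of the slides) reproduces |J| = 2 and the joint density of (U, V):

import sympy as sp

x, y, u, v = sp.symbols('x y u v', real=True)
g = sp.Matrix([x + y, x - y])                      # (g1, g2)
J = g.jacobian([x, y]).det()                       # -2, so |J| = 2

f_xy = sp.exp(-(x**2 + y**2) / 2) / (2 * sp.pi)    # standard bivariate normal density
f_uv = (f_xy / sp.Abs(J)).subs({x: (u + v) / 2, y: (u - v) / 2})
print(sp.simplify(f_uv))                           # exp(-(u**2 + v**2)/4) / (4*pi)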
4.5 Transformation of Random Variables
Example: Suppose that X and Y are independent random variables, each following
the standard normal distribution. Determine the joint distribution of
R = √(X² + Y²) and Θ = tan⁻¹(Y/X).

Solution: fX,Y(x, y) = (1/2π) exp{−(x² + y²)/2}. With r = g1(x, y) = √(x² + y²)
and θ = g2(x, y) = tan⁻¹(y/x) (interpreted as the polar angle of (x, y)), the
inverse transform is x = r cos θ, y = r sin θ, and the Jacobian determinant
satisfies |J(x, y)| = 1/r. By (6),

fR,Θ(r, θ) = r · (1/2π) exp(−r²/2),   r > 0, θ over an interval of length 2π.

Hence R and Θ are independent: Θ is uniform, and R has the Rayleigh density
fR(r) = r exp(−r²/2) for r > 0.
35 / 77
4.5 Transformation of Random Variables

Example: Suppose that X ∼ gamma(α, λ) and Y ∼ gamma(β, λ) are independent.
Determine the joint distribution of U = X + Y and V = X/(X + Y).

Solution: The inverse transform is x = uv and y = u(1 − v), and the Jacobian
determinant satisfies |J(x, y)|^{−1} = u. With
fX(x) = λ^α x^{α−1} e^{−λx} / Γ(α) and fY(y) = λ^β y^{β−1} e^{−λy} / Γ(β),

fU,V(u, v) = fX(uv) fY(u(1 − v)) · u
           = [λ^{α+β} / (Γ(α)Γ(β))] u^{α+β−1} e^{−λu} v^{α−1} (1 − v)^{β−1},

for u > 0 and 0 < v < 1. The factorization shows that U and V are independent,
with U ∼ gamma(α + β, λ) and V ∼ Beta(α, β).
36 / 77
4.5 Transformation of Random Variables

Extension:
The idea of deriving (6) can be generalized to the multivariate version
if we have X1 , · · · , Xn with the density function fX1 ,··· ,Xn for n > 2.
Specifically, suppose that there are n functions g1 , g2 , · · · , gn , then
the transformed random variables are

Y1 = g1 (X1 , · · · , Xn ), · · · , Yn = gn (X1 , · · · , Xn ).

37 / 77
4.5 Transformation of Random Variables

By (6) we have that

fY1,...,Yn(y1, ..., yn) = fX1,...,Xn(x1, ..., xn) |J(x1, ..., xn)|^{−1},

where Xi = hi(Y1, ..., Yn) with continuous functions hi for i = 1, ..., n, and

J(x1, ..., xn) = | ∂g1/∂x1  ∂g1/∂x2  ···  ∂g1/∂xn |
                 | ∂g2/∂x1  ∂g2/∂x2  ···  ∂g2/∂xn |
                 |    ⋮         ⋮      ⋱      ⋮    |
                 | ∂gn/∂x1  ∂gn/∂x2  ···  ∂gn/∂xn |.
38 / 77
4.5 Transformation of Random Variables
Example: Suppose that X1, X2, and X3 are independent standard normal random
variables. Determine the joint distribution of Y1 = X1 + X2 + X3,
Y2 = X1 − X2, and Y3 = X1 − X3.

Solution: The Jacobian determinant of the transformation is

J = | 1  1  1 |
    | 1 −1  0 |  = 3,
    | 1  0 −1 |

and the inverse transform is x1 = (y1 + y2 + y3)/3, x2 = (y1 − 2y2 + y3)/3,
x3 = (y1 + y2 − 2y3)/3. Hence

fY1,Y2,Y3(y1, y2, y3) = (1/3) (2π)^{−3/2} exp{−(x1² + x2² + x3²)/2},

with x1, x2, x3 as above.
39 / 77
4.5 Transformation of Random Variables

Example: Suppose that X1, ..., Xn are independent exponential random variables
with rate λ. Let Yi = X1 + ··· + Xi for i = 1, ..., n.
(a) Find the joint density function of Y1, ..., Yn.
(b) Find the density function of Yn.

Solution:
(a) The inverse transform is x1 = y1 and xi = yi − y_{i−1} for i = 2, ..., n,
and the Jacobian determinant equals 1 (a triangular matrix with 1s on the
diagonal). Since fX1,...,Xn(x1, ..., xn) = λ^n e^{−λ(x1 + ··· + xn)}, we obtain

fY1,...,Yn(y1, ..., yn) = λ^n e^{−λ yn},   0 < y1 < y2 < ··· < yn,

and 0 otherwise.
(b) Integrating out y1, ..., y_{n−1} over 0 < y1 < ··· < y_{n−1} < yn gives

fYn(yn) = λ^n yn^{n−1} e^{−λ yn} / (n − 1)!,   yn > 0,

i.e., Yn ∼ gamma(n, λ).
40 / 77
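Part (b) can be checked numerically: the sum of n i.i.d. exponentials with rate λ should have the gamma(n, λ) mean n/λ and variance n/λ². An illustrative NumPy sketch (n = 5 and λ = 2 are arbitrary choices):

import numpy as np

rng = np.random.default_rng(3)
n, lam, reps = 5, 2.0, 500_000

yn = rng.exponential(scale=1 / lam, size=(reps, n)).sum(axis=1)  # Yn = X1 + ... + Xn
print(yn.mean(), n / lam)          # gamma(n, lam) mean: n/lam = 2.5
print(yn.var(), n / lam**2)        # gamma(n, lam) variance: n/lam^2 = 1.25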
4.5 Transformation of Random Variables

Application: If one wishes to determine the distribution of any operation
on two random variables X and Y, such as X + Y or X/Y, then one can
specify U as the new transformed random variable and let V be either X or
Y. After that, computing the marginal distribution of U yields the desired
result.
Step 1: Define U = g(X, Y), such as U = X + Y or U = X/Y, and let V = X.
Step 2: Determine the joint pdf of (U, V), say fU,V(u, v).
Step 3: Compute the marginal pdf of U: fU(u) = ∫_{−∞}^{∞} fU,V(u, v) dv.
41 / 77
4.5 Transformation of Random Variables

Proposition (convolution formula)
Let X and Y be two independent random variables with density functions
fX and fY, respectively. The density function of U = X + Y is given by

fU(u) = ∫_{−∞}^{∞} fX(v) fY(u − v) dv.

Remark: If X and Y are discrete, then the integral can be replaced by
a summation, and the density functions can be replaced by pmfs.

42 / 77
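For discrete random variables the convolution is just the Cauchy product of the two pmfs. The sketch below (illustrative, assuming NumPy and SciPy) convolves two Poisson pmfs numerically and matches the result against Pois(µ + λ), previewing the next example; µ = 2 and λ = 3 are arbitrary.

import numpy as np
from scipy.stats import poisson

mu, lam = 2.0, 3.0
k = np.arange(40)                              # support truncated at 39 (negligible tail mass)
pmf_x = poisson.pmf(k, mu)
pmf_y = poisson.pmf(k, lam)

pmf_u = np.convolve(pmf_x, pmf_y)[:40]         # pmf of U = X + Y by discrete convolution
print(np.max(np.abs(pmf_u - poisson.pmf(k, mu + lam))))   # ≈ 0, so U ~ Pois(mu + lam)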
4.5 Transformation of Random Variables

Example: Suppose that X and Y are independent Poisson random variables with
means µ and λ. Determine the distribution of U = X + Y.

Solution: By the discrete convolution formula,

P(U = u) = Σ_{x=0}^{u} P(X = x) P(Y = u − x)
         = Σ_{x=0}^{u} [e^{−µ} µ^x / x!] [e^{−λ} λ^{u−x} / (u − x)!]
         = [e^{−(µ+λ)} / u!] Σ_{x=0}^{u} C(u, x) µ^x λ^{u−x}
         = e^{−(µ+λ)} (µ + λ)^u / u!,   u = 0, 1, 2, ...

Hence U ∼ Pois(µ + λ).
43 / 77
4.5 Transformation of Random Variables

Example: Suppose that X and Y are independent exponential random variables
with rates µ and λ. Determine the distribution of U = X + Y.

Solution: By the convolution formula, for u > 0,

fU(u) = ∫_0^u µ e^{−µv} λ e^{−λ(u−v)} dv
      = µλ e^{−λu} ∫_0^u e^{−(µ−λ)v} dv
      = [µλ / (µ − λ)] (e^{−λu} − e^{−µu}),   if µ ≠ λ.

If µ = λ, then fU(u) = λ² u e^{−λu}, i.e., U ∼ gamma(2, λ).
44 / 77
4.5 Transformation of Random Variables
Example: Let X follow the standard normal distribution and let Y follow the
chi-square distribution with v degrees of freedom. Determine the
distribution of U = X / √(Y/v), if X and Y are independent.

Solution (sketch): Let W = Y, so that X = U √(W/v) and Y = W, and the Jacobian
determinant of the inverse transform equals √(w/v). Then

fU,W(u, w) = fX(u √(w/v)) fY(w) √(w/v),

and integrating out w (a gamma-type integral) gives

fU(u) = Γ((v + 1)/2) / [√(vπ) Γ(v/2)] · (1 + u²/v)^{−(v+1)/2},

which is the density of the t distribution with v degrees of freedom.
45 / 77
4.5 Transformation of Random Variables

In summary: to determine the distribution of X + Y, we have introduced two
methods in this chapter:
the multivariate MGF;
the convolution formula.

46 / 77
4.6 Expectation, Variance, and Covariance

Expectation:

Suppose that X and Y are two random variables with joint pmf p(x, y)
or joint pdf f(x, y).
Let g(x, y) denote an arbitrary function of X and Y. Then the expected
value of g(X, Y) is defined as

E{g(X, Y)} = Σ_x Σ_y g(x, y) p(x, y)   (discrete version);
E{g(X, Y)} = ∫_{−∞}^{∞} ∫_{−∞}^{∞} g(x, y) f(x, y) dy dx   (continuous version).

47 / 77
4.6 Expectation, Variance, and Covariance

In particular, consider g(x, y) = x ± y. Then we have

E{g(X, Y)} = E(X ± Y) = E(X) ± E(Y),

i.e., the expectation has the additivity property.
In general, additivity still holds for n (n > 2) random variables.
That is, for X1, ..., Xn, we have

E(X1 + X2 + ··· + Xn) = Σ_{i=1}^{n} E(Xi).

Sketch (continuous case): E(X + Y) = ∫∫ (x + y) f(x, y) dy dx
= ∫ x fX(x) dx + ∫ y fY(y) dy = E(X) + E(Y).

48 / 77
4.6 Expectation, Variance, and Covariance

Proposition
Let X and Y denote two random variables. Define g1 and g2 as two
continuous functions.
(a) If X is independent of Y, then E(XY) = E(X)E(Y).
(b) If X and Y are independent, then so are g1(X) and g2(Y).

Sketch of (a): by independence, E(XY) = ∫∫ xy fX(x) fY(y) dy dx
= (∫ x fX(x) dx)(∫ y fY(y) dy) = E(X) E(Y).

Exercise: Let Z = X + Y. What is the MGF of Z?

49 / 77
4.6 Expectation, Variance, and Covariance
Example: A construction firm has recently sent in bids for 3 jobs worth
(in profits) 10, 20, and 40 (thousands) dollars. If its probabilities of
winning the job are respectively 0.2, 0.8, and 0.3, what is the firm’s
expected total profit?

Solution: Let Xi denote the profit from job i, so Xi equals the job's value if
the firm wins it and 0 otherwise. By additivity,

E(X1 + X2 + X3) = 10(0.2) + 20(0.8) + 40(0.3) = 2 + 16 + 12 = 30,

i.e., the expected total profit is 30 thousand dollars.
50 / 77
4.6 Expectation, Variance, and Covariance
Example: A secretary has typed N letters along with their respective
envelopes. The envelopes get mixed up when they fall on the floor. If the
letters are placed in the mixed-up envelopes in a completely random
manner (that is, each letter is equally likely to end up in any of the
envelopes), what is the expected number of letters that are placed in the
correct envelopes?

Solution: Let Xi = 1 if letter i is placed in its correct envelope and Xi = 0
otherwise. Then P(Xi = 1) = 1/N, so E(Xi) = 1/N, and by additivity

E(X1 + ··· + XN) = N · (1/N) = 1.

Thus, on average, exactly one letter ends up in its correct envelope.
51 / 77
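The answer E = 1 (regardless of N) can be checked by simulating random permutations and counting fixed points. An illustrative NumPy sketch with N = 20 (an arbitrary choice):

import numpy as np

rng = np.random.default_rng(4)
N, reps = 20, 200_000

matches = np.empty(reps)
for r in range(reps):
    perm = rng.permutation(N)                     # random envelope assignment of the N letters
    matches[r] = np.sum(perm == np.arange(N))     # number of letters in the correct envelope

print(matches.mean())                             # ≈ 1, independent of N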
4.6 Expectation, Variance, and Covariance

Definition (conditional expectation)

Suppose that X and Y are two random variables. The conditional
expectation of X given Y, denoted E(X|Y), is defined as
(a) E(X|Y = y) = Σ_x x pX|Y(x|y) if X and Y are discrete;
(b) E(X|Y = y) = ∫_{−∞}^{∞} x fX|Y(x|y) dx if X and Y are continuous.

That is, we first condition on Y and then compute the expectation of X.
52 / 77
4.6 Expectation, Variance, and Covariance

Example: Let X and Y be i.i.d. Binomial random variables Binom(n, p).


Find the conditional expectation of X given X + Y = m for some m.

Solution: Since X + Y ∼ Binom(2n, p), for x = 0, 1, ..., m,

P(X = x | X + Y = m) = P(X = x) P(Y = m − x) / P(X + Y = m)
                     = C(n, x) C(n, m − x) / C(2n, m),

which is a hypergeometric distribution (the p's cancel). Its mean is

E(X | X + Y = m) = m · n / (2n) = m/2.

53 / 77
4.6 Expectation, Variance, and Covariance
Example: Suppose that the joint density function of X and Y is given by

fX,Y(x, y) = exp(−x/y) exp(−y) / y,   x, y ∈ (0, ∞).

Find the conditional expectation of X given Y.

Solution: fY(y) = ∫_0^∞ [exp(−x/y)/y] exp(−y) dx = exp(−y), so
fX|Y(x|y) = (1/y) exp(−x/y), i.e., X | Y = y is exponential with mean y.
Hence (using integration by parts) E(X | Y = y) = ∫_0^∞ x (1/y) e^{−x/y} dx = y.
54 / 77
4.6 Expectation, Variance, and Covariance

Remarks:
We can replace X by an arbitrary continuous function g(X), so that
(a) E(g(X)|Y = y) = Σ_x g(x) pX|Y(x|y) if X and Y are discrete;
(b) E(g(X)|Y = y) = ∫_{−∞}^{∞} g(x) fX|Y(x|y) dx if X and Y are continuous.
The additivity property also holds:

E( Σ_{i=1}^{n} Xi | Y ) = Σ_{i=1}^{n} E(Xi | Y).

55 / 77
4.6 Expectation, Variance, and Covariance

There are several useful equalities based on the conditional expectation:


Proposition
Let X, Y, and Z be random variables. Define g : R → R as a continuous
function.
(a) E(X) = E{E(X|Y)}  (law of total expectation).
(b) E(X g(Y) | Y) = g(Y) E(X|Y).
(c) E(X|Y) = E(X) if X and Y are independent.
(d) E(X | Y, g(Y)) = E(X|Y).
(e) E{E(X | Y, Z) | Y} = E(X|Y).

56 / 77
Proof sketch (continuous case):
(a) E{E(X|Y)} = ∫ E(X|Y = y) fY(y) dy = ∫ ( ∫ x fX|Y(x|y) dx ) fY(y) dy
    = ∫∫ x f(x, y) dx dy = E(X).
(b) Given Y = y, g(y) is a constant, so
    E(X g(Y) | Y = y) = ∫ x g(y) fX|Y(x|y) dx = g(y) E(X | Y = y).
(c) If X and Y are independent, then fX|Y(x|y) = fX(x), so
    E(X | Y = y) = ∫ x fX(x) dx = E(X).
(e) E{E(X|Y, Z) | Y} = ∫ ( ∫ x fX|Y,Z(x|y, z) dx ) fZ|Y(z|y) dz
    = ∫∫ x fX,Z|Y(x, z|y) dz dx = E(X | Y = y).
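Property (a), the law of total expectation, can be verified directly on the discrete joint pmf from page 22 (p(0,0) = 0.4, p(0,1) = 0.2, p(1,0) = 0.1, p(1,1) = 0.3). A small illustrative Python sketch:

p = {(0, 0): 0.4, (0, 1): 0.2, (1, 0): 0.1, (1, 1): 0.3}   # joint pmf of (X, Y)

pY = {y: sum(p[(x, y)] for x in (0, 1)) for y in (0, 1)}   # marginal pmf of Y
E_X_given_Y = {y: sum(x * p[(x, y)] for x in (0, 1)) / pY[y] for y in (0, 1)}

lhs = sum(x * p[(x, y)] for (x, y) in p)                   # E(X) computed directly
rhs = sum(E_X_given_Y[y] * pY[y] for y in (0, 1))          # E{E(X|Y)}
print(lhs, rhs)                                            # both 0.4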
4.6 Expectation, Variance, and Covariance
Example: Derive E (X ) under (5).

Solution: E(X) = E{E(X|Y)} = E(Yp) = p E(Y) = λp.
57 / 77
4.6 Expectation, Variance, and Covariance
Example: Example 4.4.5

58 / 77
4.6 Expectation, Variance, and Covariance

Variance and Covariance:

In Section 2.4, we defined the variance of a random variable.
When two random variables are available, similar to Section 4.2 in
Chapter 2, we can discuss the covariance of the two random variables.

59 / 77
4.6 Expectation, Variance, and Covariance

Definition (covariance)
Let X and Y be two random variables. Then the covariance of X and Y is
defined as

cov(X, Y) = E{(X − µX)(Y − µY)},

where µX = E(X) and µY = E(Y).

Remark: Similar to the discussion of variance, cov(X, Y) has a simpler
formulation:

cov(X, Y) = E(XY) − E(X)E(Y).

60 / 77
4.6 Expectation, Variance, and Covariance

Properties:
(a) cov(X, Y) = cov(Y, X).
(b) cov(X, X) = var(X).
(c) cov(aX, bY) = ab cov(X, Y) for constants a and b.
(d) For three random variables X, Y, Z,
    cov(X + Z, Y) = cov(X, Y) + cov(Z, Y).
(e) In general, cov( Σ_{i=1}^{n} Xi, Σ_{j=1}^{m} Yj ) = Σ_{i=1}^{n} Σ_{j=1}^{m} cov(Xi, Yj).

61 / 77
4.6 Expectation, Variance, and Covariance

Proposition
Let X and Y denote two random variables.
(a) If X and Y are independent, then cov(X, Y) = 0 (we say X and Y are
    uncorrelated).
(b) The converse of (a) is not true: cov(X, Y) = 0 does not imply that X and
    Y are independent (the dependence may simply be nonlinear).

Sketch of (a): if X and Y are independent, then E(XY) = E(X)E(Y), so
cov(X, Y) = E(XY) − E(X)E(Y) = 0.
For (b), consider X uniform on (−1, 1) and Y = X²: then
cov(X, Y) = E(X³) − E(X)E(X²) = 0, yet X and Y are clearly dependent.
62 / 77
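The counterexample for (b) is easy to see numerically: with X uniform on (−1, 1) and Y = X², the sample covariance is essentially zero even though Y is a deterministic function of X. An illustrative NumPy sketch:

import numpy as np

rng = np.random.default_rng(5)
x = rng.uniform(-1, 1, size=1_000_000)
y = x**2                                         # Y is completely determined by X

print(np.cov(x, y)[0, 1])                        # ≈ 0, so X and Y are uncorrelated
print(y[np.abs(x) > 0.9].mean(), y.mean())       # ≈ 0.90 vs ≈ 0.33: knowing X changes Y, so they are dependent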
4.6 Expectation, Variance, and Covariance

Similar to the expectation, we can consider the variance of g(X, Y),
say var{g(X, Y)}.
In particular, we mainly study g(X, Y) = X + Y, and thus var(X + Y).
Q: The expectation has the additivity property; does the variance
have this property as well, say var(X + Y) = var(X) + var(Y)?
A: NO! In general, we have

var(X + Y) = var(X) + var(Y) + 2 cov(X, Y).

Exercise: How about var(X − Y)?

Sketch: var(X + Y) = E{(X + Y − µX − µY)²}
= E{(X − µX)²} + 2 E{(X − µX)(Y − µY)} + E{(Y − µY)²}
= var(X) + var(Y) + 2 cov(X, Y).
63 / 77
4.6 Expectation, Variance, and Covariance

Theorem
If X and Y are independent random variables, then cov(X , Y ) = 0, and
thus, var(X + Y ) = var(X ) + var(Y ).

64 / 77
4.6 Expectation, Variance, and Covariance

Example: Compute the variance of the sum obtained when 10


independent rolls of a fair die are made.

Solution: Let Xi denote the outcome of the ith roll. By independence,
var( Σ_{i=1}^{10} Xi ) = Σ_{i=1}^{10} var(Xi) = 10 var(X1). For a fair die,
E(X1) = 7/2 and E(X1²) = (1 + 4 + 9 + 16 + 25 + 36)/6 = 91/6, so
var(X1) = 91/6 − (7/2)² = 35/12. Hence the variance of the sum is
10 · 35/12 = 175/6 ≈ 29.17.
65 / 77
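A quick simulation check of 175/6 ≈ 29.17 (illustrative, assuming NumPy is available):

import numpy as np

rng = np.random.default_rng(6)
rolls = rng.integers(1, 7, size=(500_000, 10))   # 10 independent fair-die rolls per trial
sums = rolls.sum(axis=1)
print(sums.var(), 175 / 6)                       # both ≈ 29.17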
4.6 Expectation, Variance, and Covariance

Example: Compute the variance of the number of heads resulting from 10


independent tosses of a fair coin.

66 / 77
4.6 Expectation, Variance, and Covariance

Given the variances and covariance of X and Y, we define the correlation
coefficient of X and Y:

ρ = cov(X, Y) / √(var(X) var(Y)).   (7)

ρ close to −1 indicates a strong negative (linear) correlation; ρ close to 1
indicates a strong positive (linear) correlation.

67 / 77
4.6 Expectation, Variance, and Covariance

Proposition
Let ρ denote the correlation coefficient in (7). Then
(a) −1 ≤ ρ ≤ 1.
(b) If ρ = ±1, then X and Y satisfy an exact (one-to-one) linear relationship.

Proof sketch: for any real t,

0 ≤ var(tX + Y) = t² var(X) + 2t cov(X, Y) + var(Y).

Since this quadratic in t is never negative, its discriminant satisfies
4 cov(X, Y)² − 4 var(X) var(Y) ≤ 0, which gives −1 ≤ ρ ≤ 1. If ρ = ±1, the
discriminant is zero, so var(tX + Y) = 0 for some t; then tX + Y is constant,
i.e., X and Y are linearly related.
68 / 77
4.6 Expectation, Variance, and Covariance

Definition (conditional variance)

Let X and Y denote two random variables. The conditional variance,
denoted var(X|Y), is defined as

var(X|Y) = E[ {X − E(X|Y)}² | Y ].   (8)

For convenience, (8) can be expressed as

var(X|Y) = E(X² | Y) − {E(X|Y)}².

69 / 77
4.6 Expectation, Variance, and Covariance

There is an important relationship between variance and conditional


variance:
Proposition
X and Y are two random variables. Then the variance of X can be
decomposed as

var(X ) = var{E (X |Y )} + E {var(X |Y )}.

Proof sketch: var(X) = E(X²) − {E(X)}²
= E{E(X² | Y)} − [E{E(X|Y)}]²
= E{var(X|Y) + [E(X|Y)]²} − [E{E(X|Y)}]²
= E{var(X|Y)} + ( E{[E(X|Y)]²} − [E{E(X|Y)}]² )
= E{var(X|Y)} + var{E(X|Y)}.

Application (the hierarchical model (5) on page 24):
var(X) = E{var(X|Y)} + var{E(X|Y)} = E{Yp(1 − p)} + var(Yp)
       = λp(1 − p) + p²λ = λp.

70 / 77
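The variance decomposition for the hierarchical model (5) can also be checked by simulation. The sketch below is illustrative (assuming NumPy; λ = 3 and p = 0.4 are arbitrary choices): it estimates E{var(X|Y)} = λp(1 − p), var{E(X|Y)} = p²λ, and confirms that their sum matches var(X) = λp.

import numpy as np

rng = np.random.default_rng(7)
lam, p, n = 3.0, 0.4, 1_000_000

y = rng.poisson(lam, size=n)              # Y ~ Pois(lambda)
x = rng.binomial(y, p)                    # X | Y ~ Binom(Y, p)

within = np.mean(y * p * (1 - p))         # E{var(X|Y)}  ≈ lam * p * (1 - p)
between = np.var(y * p)                   # var{E(X|Y)}  ≈ p^2 * lam
print(within, between, within + between)  # sum ≈ lam * p = 1.2
print(np.var(x))                          # direct estimate of var(X), also ≈ 1.2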
4.7 Some Specific Applications

A. Multivariate normal distributions.

Let p ≥ 2.
Let X ∈ R^p be a p-dimensional random vector.
Let µ ∈ R^p be a p-dimensional vector.
Let Σ ∈ R^{p×p} denote a p × p symmetric and positive definite matrix.
70 / 76
4.7 Some Specific Applications
Definition
We say X follows a multivariate normal distribution, denoted as
X ∼ N(µ, Σ), if the pdf is

f(x) = 1 / {(2π)^{p/2} |Σ|^{1/2}} · exp{ −(1/2) (x − µ)ᵀ Σ⁻¹ (x − µ) }.   (9)

Definition
The MGF of (9) is given by

MX(t) = exp( tᵀµ + (1/2) tᵀΣt )

for t ∈ R^p.

71 / 76
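The roles of µ and Σ can be illustrated by sampling. A minimal NumPy sketch (illustrative; the particular µ and Σ below are arbitrary choices, not from the slides) draws from N(µ, Σ) and recovers the mean vector and covariance matrix empirically:

import numpy as np

rng = np.random.default_rng(8)
mu = np.array([1.0, -2.0])
Sigma = np.array([[2.0, 0.6],
                  [0.6, 1.0]])            # symmetric, positive definite

sample = rng.multivariate_normal(mu, Sigma, size=200_000)
print(sample.mean(axis=0))                # ≈ mu
print(np.cov(sample, rowvar=False))       # ≈ Sigma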
11.2 Multiple Linear Regression Models

(Recap) Yi = β0 + β1 Xi1 + ··· + βp Xip + εi, for i = 1, ..., n.
A simpler formulation: matrix form. Let

X = | 1  X11  X12  ···  X1p |      Y = | Y1 |      β = | β0 |
    | 1  X21  X22  ···  X2p |          | Y2 |          | β1 |
    | 1  X31  X32  ···  X3p |          | Y3 |          | β2 |
    | ⋮   ⋮    ⋮    ⋱    ⋮  |          | ⋮  |          | ⋮  |
    | 1  Xn1  Xn2  ···  Xnp |          | Yn |          | βp |

The first column of X always equals one because we include the intercept β0.
13 / 79
4.7 Some Specific Applications

Remark:
µ is the mean vector of X, i.e., µ = E(X).
Σ := cov(X) is called the covariance matrix of X, where
(i) the ith diagonal entry is the variance of Xi;
(ii) the off-diagonal (j, k) entry (j ≠ k) is the covariance of Xj and Xk.
If p = 2, then (9) is called the bivariate normal distribution.
72 / 76
4.7 Some Specific Applications

Here cov(X, Y) denotes the cross-covariance matrix of the random vectors X
and Y, capturing the covariance between each element of X and each element of Y.

Aside: some operations for the expectation and variance.
Proposition
Let A ∈ R^{r×p}, B ∈ R^{r×q}, and c ∈ R^r denote constant matrices or a
constant vector. Let X ∈ R^p and Y ∈ R^q denote two random vectors. Then
(a) E(AX + c) = A E(X) + c.
(b) E(AX + BY) = A E(X) + B E(Y).
(c) cov(AX, BY) = A cov(X, Y) Bᵀ.
(d) var(AX) = A var(X) Aᵀ.

Multiplying X by A transforms X by the linear transformation specified by A.
73 / 76
4.7 Some Specific Applications

Theorem
Let X ∼ N(µ, Σ), and let A ∈ R^{m×p} and b ∈ R^m. Then
AX + b ∼ N(Aµ + b, AΣAᵀ).

Proof sketch: let Y = AX + b. Then

MY(t) = E{exp(tᵀ(AX + b))} = exp(tᵀb) E{exp((Aᵀt)ᵀX)} = exp(tᵀb) MX(Aᵀt)
      = exp{ tᵀb + (Aᵀt)ᵀµ + (1/2)(Aᵀt)ᵀΣ(Aᵀt) }
      = exp{ tᵀ(Aµ + b) + (1/2) tᵀ(AΣAᵀ) t },

which is the MGF of N(Aµ + b, AΣAᵀ).

74 / 76
4.7 Some Specific Applications

Theorem
Let X = (X1ᵀ, X2ᵀ)ᵀ ∼ N( (µ1ᵀ, µ2ᵀ)ᵀ, [ Σ11  Σ12 ; Σ12ᵀ  Σ22 ] ),
with X1 ∈ R^r and X2 ∈ R^{p−r}. Then
(a) X1 ∼ N(µ1, Σ11) and X2 ∼ N(µ2, Σ22);
(b) cᵀX ∼ N(cᵀµ, cᵀΣc) for any non-random vector c ∈ R^p.

Proof sketch: (a) apply the previous theorem with A = [I_r  0] (for X1) or
A = [0  I_{p−r}] (for X2) and b = 0. (b) apply it with A = cᵀ, a 1 × p matrix,
which yields a scalar normal random variable.
75 / 76
4.7 Some Specific Applications

Theorem (independence)
Let X = (X1ᵀ, X2ᵀ)ᵀ ∼ N( (µ1ᵀ, µ2ᵀ)ᵀ, [ Σ11  Σ12 ; Σ12ᵀ  Σ22 ] ),
with X1 ∈ R^r and X2 ∈ R^{p−r}. Then X1 and X2 are independent if and only if
Σ12 = 0.

Proof sketch: write t = (t1ᵀ, t2ᵀ)ᵀ. The MGF of X is

MX(t) = exp{ t1ᵀµ1 + t2ᵀµ2 + (1/2)( t1ᵀΣ11 t1 + 2 t1ᵀΣ12 t2 + t2ᵀΣ22 t2 ) },

which factorizes as MX1(t1) MX2(t2) for all t1 and t2 if and only if
t1ᵀΣ12 t2 = 0 for all t1, t2, i.e., Σ12 = 0.

76 / 76