Ch4-Part 1
©Fall 2023
1 / 77
Outline
1 Motivation
2 Jointly Distributed Random Variables
3 Independence and Conditional Distributions
4 Multivariate MGF
5 Transformation of Random Variables
6 Expectation, Variance, and Covariance
7 Some Specific Applications
2 / 77
4.1 Motivation
3 / 77
4.2 Jointly Distributed Random Variables
Definition
When X and Y are both discrete random variables with possible values
x1, x2, · · · and y1, y2, · · · , respectively, the joint probability mass
function of X and Y at X = xi and Y = yj is defined as
p(xi, yj) = P(X = xi, Y = yj),   i = 1, 2, · · · , j = 1, 2, · · · .
4 / 77
4.2 Jointly Distributed Random Variables
The marginal pmf of X is obtained by summing the joint pmf over y:
pX(xi) = Σ_j p(xi, yj),   and similarly   pY(yj) = Σ_i p(xi, yj).
Summing the joint pmf over all pairs gives Σ_i Σ_j p(xi, yj) = Σ_i pX(xi) = 1.
5 / 77
4.2 Jointly Distributed Random Variables
Example: Suppose that 3 batteries are randomly chosen from a group of
3 new, 4 used but still working, and 5 defective batteries. If we let X and
Y denote, respectively, the number of new and used but still working
batteries that are chosen, then find the joint probability mass function of
X and Y .
Since the 3 batteries are chosen from 12, p(i, j) = P(X = i, Y = j) is given by
p(i, j) = C(3, i) C(4, j) C(5, 3 − i − j) / C(12, 3),   i + j ≤ 3,
with C(12, 3) = 220. In table form (rows X = i, columns Y = j):

X \ Y      0         1         2         3
0        10/220    40/220    30/220     4/220
1        30/220    60/220    18/220       0
2        15/220    12/220       0         0
3         1/220       0         0         0
6 / 77
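As a quick check of the table against the marginal pmf defined earlier, summing its first row gives

P(X = 0) = (10 + 40 + 30 + 4)/220 = 84/220 = C(9, 3)/C(12, 3),

which matches choosing all three batteries from the nine that are not new; the sixteen entries themselves sum to 220/220 = 1.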
4.2 Jointly Distributed Random Variables
7 / 77
4.2 Jointly Distributed Random Variables
Definition
Let X and Y denote continuous random variables, and let f : ℝ² → ℝ⁺ be a
bivariate continuous function. Suppose that C is a set in the two-dimensional
plane; then the probability that (X, Y) falls in C is defined as

P((X, Y) ∈ C) = ∫∫_{(x,y) ∈ C} f(x, y) dy dx.   (2)
The function f is called the joint probability density function (pdf) of X and Y.
8 / 77
4.2 Jointly Distributed Random Variables
The marginal density functions are obtained by integrating out the other variable:

fX(x) = ∫_{−∞}^{∞} f(x, y) dy,   fY(y) = ∫_{−∞}^{∞} f(x, y) dx,

so that, for example,

P(X ∈ [a, b]) = ∫_a^b fX(x) dx = ∫_a^b ( ∫_{−∞}^{∞} f(x, y) dy ) dx,
9 / 77
4.2 Jointly Distributed Random Variables
and
P(Y ∈ [c, d]) = ∫_c^d fY(y) dy = ∫_c^d ( ∫_{−∞}^{∞} f(x, y) dx ) dy.
10 / 77
4.2 Jointly Distributed Random Variables
Example: The joint density function of X and Y is given by
f(x, y) = 2e^{−x−2y} for 0 < x < ∞, 0 < y < ∞, and f(x, y) = 0 otherwise.
Compute (a) P(X > 1, Y < 1) and (b) P(X < a) for some constant
a > 0.
Solution:
(a) P(X > 1, Y < 1) = ∫_1^∞ ∫_0^1 2e^{−x−2y} dy dx = ∫_1^∞ e^{−x}(1 − e^{−2}) dx
    = e^{−1}(1 − e^{−2}).
(b) P(X < a) = ∫_0^a ∫_0^∞ 2e^{−x−2y} dy dx = ∫_0^a e^{−x} dx = 1 − e^{−a}.
11 / 77
4.3 Independence and Conditional Distributions
Independence:
12 / 77
4.3 Independence and Conditional Distributions
Definition
The random variables X and Y are said to be independent if, for any two
sets of real numbers A and B,
P(X ∈ A, Y ∈ B) = P(X ∈ A) P(Y ∈ B).
13 / 77
4.3 Independence and Conditional Distributions
Equivalently, X and Y are independent if and only if the joint pmf/pdf factorizes:
p(xi, yj) = pX(xi) pY(yj) for all i, j (discrete case), and f(x, y) = fX(x) fY(y) for all x, y (continuous case).
14 / 77
4.3 Independence and Conditional Distributions
Definition
X1 , · · · , Xn are independent and identically distributed (i.i.d.) if
(a) they are independent and satisfy (4);
(b) they have the same distribution.
15 / 77
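For instance, taking the Bernoulli case as an illustration: if X1, · · · , Xn are i.i.d. Bernoulli(p), then independence together with the common marginal gives the joint pmf

P(X1 = x1, · · · , Xn = xn) = Π_{i=1}^n p^{xi} (1 − p)^{1−xi},   xi ∈ {0, 1}.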
4.3 Independence and Conditional Distributions
Example: From the battery example table, P(X = 3, Y = 0) = 1/220, while
P(X = 3) = 1/220 and P(Y = 0) = 56/220, so P(X = 3, Y = 0) ≠ P(X = 3)P(Y = 0).
Hence X and Y are not independent.
16 / 77
4.3 Independence and Conditional Distributions
Example: Continuing an earlier example, suppose that X and Y are two random
variables whose joint density function is given by
f(x, y) = 2e^{−x−2y} for 0 < x < ∞, 0 < y < ∞, and f(x, y) = 0 otherwise.
Check the independence of X and Y .
Solution: fX(x) = ∫_0^∞ 2e^{−x−2y} dy = e^{−x} for x > 0, and
fY(y) = ∫_0^∞ 2e^{−x−2y} dx = 2e^{−2y} for y > 0.
Since fX(x) fY(y) = 2e^{−x−2y} = f(x, y) for all x, y, X and Y are independent.
17 / 77
4.3 Independence and Conditional Distributions
Example: The joint pdf of X and Y is
f (x, y ) = 8xy for 0 < x < y < 1.
Examine if X is independent of Y .
Solution: fX(x) = ∫_x^1 8xy dy = 4x(1 − x²) for 0 < x < 1, and
fY(y) = ∫_0^y 8xy dx = 4y³ for 0 < y < 1.
Since fX(x) fY(y) ≠ 8xy on the support, X is not independent of Y.
18 / 77
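As a concrete numerical check (the point (x, y) = (1/4, 1/2) is chosen arbitrarily for illustration), compare f with the product of the marginals:

f(1/4, 1/2) = 8 · (1/4) · (1/2) = 1,   while   fX(1/4) fY(1/2) = {4 · (1/4)(1 − 1/16)} · {4 · (1/2)³} = (15/16)(1/2) = 15/32 ≠ 1.

The failure can also be anticipated from the support: {0 < x < y < 1} is not a product set, so the joint pdf cannot factor into a function of x alone times a function of y alone.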
4.3 Independence and Conditional Distributions
Conditional probability:
19 / 77
4.3 Independence and Conditional Distributions
20 / 77
4.3 Independence and Conditional Distributions
fX|Y(x|y) ≜ f(x, y) / fY(y).
21 / 77
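The next example uses the discrete analogue of this definition: for pY(y) > 0,

pX|Y(x|y) ≜ P(X = x | Y = y) = p(x, y) / pY(y).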
4.3 Independence and Conditional Distributions
Example: Suppose that p(x, y ), the joint pmf of X and Y , is given by
p(0, 0) = 0.4, p(0, 1) = 0.2, p(1, 0) = 0.1, p(1, 1) = 0.3.
Calculate the conditional pmf of X given Y = 1
Solution: pY(1) = p(0, 1) + p(1, 1) = 0.2 + 0.3 = 0.5, so
pX|Y(0|1) = p(0, 1)/pY(1) = 0.2/0.5 = 2/5,
pX|Y(1|1) = p(1, 1)/pY(1) = 0.3/0.5 = 3/5.
22 / 77
4.3 Independence and Conditional Distributions
Example: The joint pdf of X and Y is given by
f(x, y) = (12/5) x (2 − x − y) for 0 < x < 1, 0 < y < 1, and f(x, y) = 0 otherwise.
Compute the conditional pdf of X given Y = y for y ↓ (0, 1).
Solution: for y ∈ (0, 1),
fY(y) = ∫_0^1 (12/5) x (2 − x − y) dx = (12/5)(1 − 1/3 − y/2) = (12/5)(2/3 − y/2),
so
fX|Y(x|y) = f(x, y)/fY(y) = x(2 − x − y)/(2/3 − y/2) = 6x(2 − x − y)/(4 − 3y),   0 < x < 1.
23 / 77
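As a sanity check on the formula just derived, the conditional pdf integrates to one for each fixed y ∈ (0, 1):

∫_0^1 6x(2 − x − y)/(4 − 3y) dx = 6(1 − 1/3 − y/2)/(4 − 3y) = (4 − 3y)/(4 − 3y) = 1.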
4.3 Independence and Conditional Distributions
24 / 77
4.3 Independence and Conditional Distributions
Example: Derive the distribution of X under (5).
Solution sketch: condition on Y and sum over its distribution,
P(X = k) = Σ_y P(X = k, Y = y) = Σ_y P(X = k | Y = y) P(Y = y),
then substitute the conditional pmf of X given Y and the pmf of Y specified in (5)
and simplify the sum.
Remark:
The distribution of X derived under (5) is called a mixture distribution,
which means that the distribution of X depends on a quantity that itself
has a distribution.
In general, hierarchical models usually lead to mixture distributions.
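A classical instance of such a hierarchy, assuming (5) specifies X | Y = y ~ Binomial(y, p) with Y ~ Poisson(λ):

P(X = k) = Σ_{y ≥ k} C(y, k) p^k (1 − p)^{y−k} · e^{−λ} λ^y / y!
         = e^{−λ} (λp)^k / k! · Σ_{y ≥ k} {λ(1 − p)}^{y−k} / (y − k)!
         = e^{−λp} (λp)^k / k!,

so under that assumption X ~ Poisson(λp), a Poisson mixture of binomials.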
26 / 77
4.4 Multivariate MGF
M(t1, · · · , tn) = E{exp(t1 X1 + · · · + tn Xn)}.
27 / 77
4.4 Multivariate MGF
28 / 77
4.4 Multivariate MGF
29 / 77
4.4 Multivariate MGF
Example: Suppose that X1, · · · , Xn are i.i.d. random variables. Determine the
distribution of Y = Σ_{i=1}^n Xi if Xi follows
(a) Bernoulli(p);
(b) Pois(λ);
(c) N(µ, σ²).
Solution: by independence, MY(t) = E{e^{tY}} = Π_{i=1}^n MXi(t) = {MX1(t)}^n.
(a) MX1(t) = 1 − p + pe^t, so MY(t) = (1 − p + pe^t)^n and Y ~ Binomial(n, p).
(b) MX1(t) = exp{λ(e^t − 1)}, so MY(t) = exp{nλ(e^t − 1)} and Y ~ Pois(nλ).
(c) MX1(t) = exp{µt + σ²t²/2}, so MY(t) = exp{nµt + nσ²t²/2} and Y ~ N(nµ, nσ²).
30 / 77
4.4 Multivariate MGF
Example: Suppose that X1, · · · , Xn are independent random variables, where Xi
follows the Gamma distribution Gamma(αi, λ) for i = 1, · · · , n. Determine the
distribution of Y = Σ_{i=1}^n Xi.
Solution: MXi(t) = {λ/(λ − t)}^{αi} for t < λ, so
MY(t) = Π_{i=1}^n MXi(t) = {λ/(λ − t)}^{α1 + · · · + αn},
which is the MGF of a Gamma distribution. Hence Y ~ Gamma(Σ_{i=1}^n αi, λ).
31 / 77
4.5 Transformation of Random Variables
U = g1 (X , Y ) and V = g2 (X , Y ).
32 / 77
4.5 Transformation of Random Variables
The joint distribution function of U and V is defined as
P(U ≤ u, V ≤ v) = ∫∫_{g1(x,y) ≤ u, g2(x,y) ≤ v} fX,Y(x, y) dy dx.
Example: Let U = X + Y and V = X − Y; find the joint density fU,V.
Solution sketch: solving for x and y gives x = (u + v)/2 and y = (u − v)/2, and the
transformation has Jacobian
J = det [ 1  1 ; 1  −1 ] = −2,
so
fU,V(u, v) = fX,Y((u + v)/2, (u − v)/2) · |−2|^{−1}.
In particular, if X and Y are independent standard normal, then
fU,V(u, v) = (1/2π) e^{−(u²+v²)/4} · (1/2) = (1/(4π)) e^{−(u²+v²)/4},
so U and V are independent N(0, 2) random variables.
34 / 77
4.5 Transformation of Random Variables
Example: If X and Y are independent random variables and follow the
standard normal distribution, determine the joint distribution of
R = √(X² + Y²) and Θ = tan⁻¹(Y/X).
Solution: fX,Y(x, y) = (1/2π) e^{−(x²+y²)/2}. With x = r cos θ and y = r sin θ, we have
r = g1(x, y) = √(x² + y²) and θ = g2(x, y) = tan⁻¹(y/x), and the Jacobian of the
transformation satisfies |J(x, y)|^{−1} = r. Hence
fR,Θ(r, θ) = (1/2π) r e^{−r²/2},   r > 0, 0 ≤ θ < 2π,
so R and Θ are independent, with Θ uniform on [0, 2π).
35 / 77
4.5 Transformation of Random Variables
In general, if u = g1(x, y) and v = g2(x, y) define a one-to-one transformation with
Jacobian
J(x, y) = det [ ∂g1/∂x  ∂g1/∂y ; ∂g2/∂x  ∂g2/∂y ] ≠ 0,
then the joint density of U and V is
fU,V(u, v) = fX,Y(x, y) |J(x, y)|^{−1},   (6)
where (x, y) solves u = g1(x, y), v = g2(x, y).
36 / 77
4.5 Transformation of Random Variables
Extension:
The idea of deriving (6) can be generalized to the multivariate version
if we have X1 , · · · , Xn with the density function fX1 ,··· ,Xn for n > 2.
Specifically, suppose that there are n functions g1 , g2 , · · · , gn , then
the transformed random variables are
Y1 = g1 (X1 , · · · , Xn ), · · · , Yn = gn (X1 , · · · , Xn ).
37 / 77
4.5 Transformation of Random Variables
fY1,··· ,Yn(y1, · · · , yn) = fX1,··· ,Xn(x1, · · · , xn) |J(x1, · · · , xn)|^{−1},
38 / 77
4.5 Transformation of Random Variables
Example: If X1, X2, and X3 are independent standard normal random variables,
determine the joint distribution of Y1 = X1 + X2 + X3, Y2 = X1 − X2, and
Y3 = X1 − X3.
Solution: solving for the Xi gives
x1 = (y1 + y2 + y3)/3,  x2 = (y1 − 2y2 + y3)/3,  x3 = (y1 + y2 − 2y3)/3,
and the Jacobian of the transformation is
J = det [ 1 1 1 ; 1 −1 0 ; 1 0 −1 ] = 3,
so
fY1,Y2,Y3(y1, y2, y3) = (2π)^{−3/2} exp{−(x1² + x2² + x3²)/2} · (1/3),
with the xi expressed in terms of (y1, y2, y3) as above.
39 / 77
4.5 Transformation of Random Variables
Example: If X1, · · · , Xn are independent exponential random variables with
rate λ, let Yi = X1 + · · · + Xi for i = 1, · · · , n.
(a) Find the joint density function of Y1 , · · · , Yn .
(b) Find the density function of Yn .
Solution:
(a) The inverse transformation is x1 = y1 and xi = yi − yi−1 for i = 2, · · · , n,
with Jacobian equal to 1, so
fY1,··· ,Yn(y1, · · · , yn) = λ^n e^{−λ yn},   0 < y1 < y2 < · · · < yn.
(b) Integrating out y1, · · · , yn−1 over the region 0 < y1 < · · · < yn−1 < yn
contributes the factor yn^{n−1}/(n − 1)!, so
fYn(t) = λ^n t^{n−1} e^{−λt}/(n − 1)!,   t > 0,
i.e., Yn ~ Gamma(n, λ).
41 / 77
4.5 Transformation of Random Variables
Proposition (convolution formula)
Let X and Y be two independent random variables with density functions
fX and fY , respectively. The density function of U = X + Y is given by
fU(u) = ∫_{−∞}^{∞} fX(v) fY(u − v) dv.
Remark: If X and Y are discrete, then the integral can be replaced by a
summation Σ, and the density functions can be replaced by pmfs.
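As an illustration of the convolution formula (the exponential case is chosen here because it connects to the Gamma results above): if X and Y are independent Exp(λ), then fX(v) = 0 for v < 0 and fY(u − v) = 0 for v > u, so for u > 0

fU(u) = ∫_0^u λe^{−λv} · λe^{−λ(u−v)} dv = λ² e^{−λu} ∫_0^u dv = λ² u e^{−λu},

which is the Gamma(2, λ) density, consistent with the earlier MGF calculation for sums of independent Gamma random variables.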
42 / 77
4.5 Transformation of Random Variables
Proof sketch: let U = X + Y and V = X. Then x = v, y = u − v, and the Jacobian of
the transformation is 1, so
fU,V(u, v) = fX,Y(v, u − v) = fX(v) fY(u − v)
by independence. Integrating out v gives
fU(u) = ∫_{−∞}^{∞} fX(v) fY(u − v) dv.
43 / 77
4.5 Transformation of Random Variables
44 / 77
4.5 Transformation of Random Variables
Example: Let X follow the standard normal distribution and let Y follow the
chi-square distribution with v degrees of freedom. Determine the
distribution of U = X/√(Y/v), if X and Y are independent.
Solution sketch: set W = Y, so that x = u√(w/v) and y = w, with Jacobian factor
√(w/v). By independence,
fU,W(u, w) = fX(u√(w/v)) fY(w) √(w/v),
and integrating out w,
fU(u) = ∫_0^∞ (1/√(2π)) e^{−u²w/(2v)} · {1/(2^{v/2} Γ(v/2))} w^{v/2−1} e^{−w/2} · √(w/v) dw.
The integrand is, up to constants, a Gamma density in w, so the integral can be
evaluated with the Gamma function, giving
fU(u) = Γ((v + 1)/2) / {√(vπ) Γ(v/2)} · (1 + u²/v)^{−(v+1)/2},   −∞ < u < ∞.
45 / 77
4.5 Transformation of Random Variables
That is, U follows the t distribution with v degrees of freedom.
46 / 77
4.6 Expectation, Variance, and Covariance
Expectation:
Suppose that X and Y are two random variables with joint pmf
p(x, y ) or pdf f (x, y ).
Let g (x, y ) denote an arbitrary function of X and Y . Then the
expected value of g (x, y ) is defined as
E{g(X, Y)} = Σ_x Σ_y g(x, y) p(x, y)   (discrete version).
E{g(X, Y)} = ∫_{−∞}^{∞} ∫_{−∞}^{∞} g(x, y) f(x, y) dy dx   (continuous version).
47 / 77
4.6 Expectation, Variance, and Covariance
In particular, taking g(x, y) = x ± y gives E(X ± Y) = E(X) ± E(Y), since
E(X + Y) = ∫∫ (x + y) f(x, y) dy dx
         = ∫∫ x f(x, y) dy dx + ∫∫ y f(x, y) dx dy
         = ∫ x fX(x) dx + ∫ y fY(y) dy = E(X) + E(Y).
48 / 77
If X and Y are independent, then
E(XY) = ∫∫ xy fX(x) fY(y) dy dx = {∫ x fX(x) dx}{∫ y fY(y) dy} = E(X)E(Y).
Proposition
Let X and Y denote two random variables. Define g1 and g2 as two
continuous functions.
(a) If X is independent of Y , then E (XY ) = E (X )E (Y ).
(b) If X and Y are independent, then so are g1 (X ) and g2 (Y ).
49 / 77
4.6 Expectation, Variance, and Covariance
Example: A construction firm has recently sent in bids for 3 jobs worth
(in profits) 10, 20, and 40 (thousands) dollars. If its probabilities of
winning the job are respectively 0.2, 0.8, and 0.3, what is the firm’s
expected total profit?
Solution: let Xi denote the profit from job i, so Xi equals the job's value if the
firm wins it and 0 otherwise. The expected total profit is
E(X1 + X2 + X3) = 10(0.2) + 20(0.8) + 40(0.3) = 2 + 16 + 12 = 30 (thousand dollars).
50 / 77
4.6 Expectation, Variance, and Covariance
Example: A secretary has typed N letters along with their respective
envelopes. The envelopes get mixed up when they fall on the floor. If the
letters are placed in the mixed-up envelopes in a completely random
manner (that is, each letter is equally likely to end up in any of the
envelopes), what is the expected number of letters that are placed in the
correct envelopes?
Solution: let Xi = 1 if letter i is placed in its own envelope and Xi = 0 otherwise.
Then P(Xi = 1) = 1/N, so the expected number of correct placements is
E(X1 + · · · + XN) = Σ_{i=1}^N E(Xi) = N · (1/N) = 1.
51 / 77
4.6 Expectation, Variance, and Covariance
52 / 77
4.6 Expectation, Variance, and Covariance
Example: Suppose that X ~ Bin(n1, p) and Y ~ Bin(n2, p) are independent. Find the
conditional pmf of X given X + Y = m.
Solution: since X + Y ~ Bin(n1 + n2, p), for 0 ≤ k ≤ m,
P(X = k | X + Y = m) = P(X = k, Y = m − k)/P(X + Y = m)
  = {C(n1, k) p^k (1−p)^{n1−k} · C(n2, m−k) p^{m−k} (1−p)^{n2−m+k}} / {C(n1+n2, m) p^m (1−p)^{n1+n2−m}}
  = C(n1, k) C(n2, m − k) / C(n1 + n2, m),
i.e., the conditional distribution of X given X + Y = m is hypergeometric and does
not depend on p. In particular, E(X | X + Y = m) = m n1/(n1 + n2).
53 / 77
4.6 Expectation, Variance, and Covariance
Example: Suppose that the joint density function of X and Y is given by
fXY(x, y) = exp(−x/y) exp(−y) / y,   x, y ∈ (0, ∞).
Find the conditional expectation of X given Y .
Solution: fY(y) = ∫_0^∞ e^{−x/y} e^{−y}/y dx = e^{−y}, so
fX|Y(x|y) = f(x, y)/fY(y) = (1/y) e^{−x/y},   x > 0,
i.e., X | Y = y is exponential with mean y. Hence (using integration by parts)
E(X | Y = y) = ∫_0^∞ x (1/y) e^{−x/y} dx = y,   that is, E(X | Y) = Y.
54 / 77
4.6 Expectation, Variance, and Covariance
Remarks:
We can replace X by an arbitrary continuous function g (X ), such
that
(a) E(g(X) | Y = y) = Σ_x g(x) pX|Y(x|y) if X and Y are discrete;
(b) E(g(X) | Y = y) = ∫_{−∞}^{∞} g(x) fX|Y(x|y) dx if X and Y are continuous.
The additivity property holds:
E( Σ_{i=1}^n Xi | Y ) = Σ_{i=1}^n E(Xi | Y).
55 / 77
4.6 Expectation, Variance, and Covariance
56 / 77
Key properties of conditional expectation:
(a) E(X) = E{E(X | Y)}. Proof (continuous case):
E(X) = ∫ x fX(x) dx = ∫∫ x f(x, y) dy dx
     = ∫ { ∫ x fX|Y(x|y) dx } fY(y) dy = ∫ E(X | Y = y) fY(y) dy = E{E(X | Y)}.
(b) E{X g(Y) | Y} = g(Y) E(X | Y), since given Y, the factor g(Y) acts as a constant.
(c) If X is independent of Y, then
E(X | Y = y) = ∫ x fX|Y(x|y) dx = ∫ x fX(x) dx = E(X).
The same conditioning argument applies with more variables; for example,
E{ E(X | Y, Z) | Y } = E(X | Y).
4.6 Expectation, Variance, and Covariance
Example: Derive E (X ) under (5).
Solution: E(X) = E{E(X | Y)} = E(Yp) = p E(Y).
57 / 77
4.6 Expectation, Variance, and Covariance
Example: Example 4.4.5
58 / 77
4.6 Expectation, Variance, and Covariance
59 / 77
4.6 Expectation, Variance, and Covariance
Definition (covariance)
Let X and Y be two random variables. Then the covariance of X and Y is
defined as
cov(X, Y) = E{(X − µX)(Y − µY)},
where µX = E(X) and µY = E(Y). Equivalently, cov(X, Y) = E(XY) − E(X)E(Y).
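The two displayed expressions agree because the product expands as

E{(X − µX)(Y − µY)} = E(XY) − µX E(Y) − µY E(X) + µX µY = E(XY) − E(X)E(Y).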
60 / 77
4.6 Expectation, Variance, and Covariance
Properties:
(a) cov(X , Y ) = cov(Y , X ).
(b) cov(X , X ) = var(X ).
(c) cov(aX, bY) = ab cov(X, Y) for constants a, b ≠ 0.
(d) for three random variables X , Y , Z ,
cov(X + Z , Y ) = cov(X , Y ) + cov(Z , Y ).
(e) In general, cov( Σ_{i=1}^n Xi , Σ_{j=1}^m Yj ) = Σ_{i=1}^n Σ_{j=1}^m cov(Xi, Yj).
61 / 77
4.6 Expectation, Variance, and Covariance
Proposition
Let X and Y denote two random variables.
(a) If X and Y are independent, then cov(X, Y) = 0.
(b) The converse of (a) is not true.
Proof of (a): independence gives E(XY) = E(X)E(Y), so
cov(X, Y) = E(XY) − E(X)E(Y) = 0.
When cov(X, Y) = 0, X and Y are said to be uncorrelated; the converse fails
because covariance only measures linear association, so a nonlinear dependence
can still produce zero covariance.
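A standard counterexample for (b), using the standard normal only for concreteness: let X ~ N(0, 1) and Y = X². Then

cov(X, Y) = E(X³) − E(X)E(X²) = 0 − 0 · 1 = 0,

yet Y is a deterministic function of X, so X and Y are certainly not independent.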
62 / 77
4.6 Expectation, Variance, and Covariance
var(X + Y) = E{(X + Y − µX − µY)²}
           = E{(X − µX)²} + 2E{(X − µX)(Y − µY)} + E{(Y − µY)²}
           = var(X) + 2cov(X, Y) + var(Y).
63 / 77
4.6 Expectation, Variance, and Covariance
Theorem
If X and Y are independent random variables, then cov(X , Y ) = 0, and
thus, var(X + Y ) = var(X ) + var(Y ).
64 / 77
4.6 Expectation, Variance, and Covariance
65 / 77
4.6 Expectation, Variance, and Covariance
66 / 77
4.6 Expectation, Variance, and Covariance
Definition: the correlation coefficient of X and Y is
ρ ≜ cov(X, Y) / { √var(X) √var(Y) }.   (7)
Values of ρ near +1 indicate strong positive (linear) correlation, and values
near −1 indicate strong negative correlation.
67 / 77
4.6 Expectation, Variance, and Covariance
Proposition
Let ρ denote the correlation coefficient in (7). Then
(a) −1 ≤ ρ ≤ 1.
(b) If ρ = ±1, then X and Y are related exactly by a (one-to-one) linear function.
Proof idea for (a): with σX² = var(X) and σY² = var(Y),
0 ≤ var(X/σX + Y/σY) = 2(1 + ρ)   and   0 ≤ var(X/σX − Y/σY) = 2(1 − ρ),
which together give −1 ≤ ρ ≤ 1.
69 / 77
4.6 Expectation, Variance, and Covariance
Example: under (5), the law of total variance gives
var(X) = E{var(X | Y)} + var{E(X | Y)}
       = E{Yp(1 − p)} + var(Yp) = p(1 − p)E(Y) + p² var(Y).
4.7 Some Specific Applications
70 / 76
4.7 Some Specific Applications
Definition
We say that a random vector X ∈ ℝᵖ follows a multivariate normal distribution,
denoted X ~ N(µ, Σ), if its pdf is
f(x) = (2π)^{−p/2} |Σ|^{−1/2} exp{ −(1/2)(x − µ)ᵀ Σ^{−1} (x − µ) },   (9)
where µ ∈ ℝᵖ is a column vector and Σ is a p × p matrix.
Definition
The MGF of (9) is given by
MX(t) = exp{ tᵀµ + (1/2) tᵀΣt }
for t ∈ ℝᵖ.
71 / 76
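As a quick consistency check, taking p = 1 with Σ = σ² reduces this MGF to the univariate normal MGF used earlier in the chapter:

MX(t) = exp{ µt + σ²t²/2 }.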
11.2 Multiple Linear Regression Models
(recap)
A simpler formulation: matrix form.
With Yi = β0 + β1 Xi1 + · · · + βp Xip + εi for i = 1, · · · , n, write

X = [ 1  X11  X12  · · ·  X1p
      1  X21  X22  · · ·  X2p
      1  X31  X32  · · ·  X3p
      ⋮    ⋮     ⋮    ⋱     ⋮
      1  Xn1  Xn2  · · ·  Xnp ],   Y = (Y1, Y2, Y3, · · · , Yn)ᵀ,   β = (β0, β1, β2, · · · , βp)ᵀ,

so that the model can be written as Y = Xβ + ε. The first column of X is always
equal to one because the model includes the intercept β0. The matrix XᵀX then has
(1, 1) entry n, first row and column given by the sums Σ_i Xij, and remaining
entries given by the cross-products Σ_i Xij Xik.
13 / 79
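If, in addition, the errors are assumed to satisfy ε ~ N(0, σ²Iₙ) (an assumption added here for the connection; the slide does not state it), then the linear-transformation theorem below gives

Y = Xβ + ε ~ N(Xβ, σ²Iₙ),

which is how the multivariate normal distribution enters the regression model.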
4.7 Some Specific Applications
Remark:
µ is the mean vector of X, that is, µ = E(X).
Σ ≜ cov(X) is called the covariance matrix of X, where
(i) the ith diagonal entry is the variance of Xi;
(ii) the off-diagonal (j, k) entry (j ≠ k) is the covariance cov(Xj, Xk).
If p = 2, then (9) is called the bivariate normal distribution.
72 / 76
4.7 Some Specific Applications
Linear transformations: let A be an m × p matrix and b ∈ ℝᵐ a vector. Multiplying
X by A (and adding b) transforms X linearly, and the next theorem shows that such
a transformation of a multivariate normal vector is again multivariate normal.
73 / 76
4.7 Some Specific Applications
Theorem
Let X ~ N(µ, Σ), and let A ∈ ℝ^{m×p} and b ∈ ℝᵐ. Then
AX + b ~ N(Aµ + b, AΣAᵀ).
Proof sketch: let Y = AX + b. Then
MY(t) = E{e^{tᵀY}} = e^{tᵀb} E{e^{(Aᵀt)ᵀX}} = e^{tᵀb} MX(Aᵀt)
      = exp{ tᵀ(Aµ + b) + (1/2) tᵀ A Σ Aᵀ t },
which is the MGF of N(Aµ + b, AΣAᵀ).
Theorem
Let X ≜ (X1ᵀ, X2ᵀ)ᵀ ~ N( (µ1ᵀ, µ2ᵀ)ᵀ, [ Σ11  Σ12 ; Σ12ᵀ  Σ22 ] ), with X1 ∈ ℝʳ and
X2 ∈ ℝ^{p−r}. Then
(a) X1 ~ N(µ1, Σ11) and X2 ~ N(µ2, Σ22);
(b) cᵀX ~ N(cᵀµ, cᵀΣc) for any non-random vector c ∈ ℝᵖ.
Proof sketch: both parts follow from the previous theorem.
(a) Take A = [ I_r  0 ] (and b = 0) to obtain X1; take A = [ 0  I_{p−r} ] to obtain X2.
(b) Take A = cᵀ, a 1 × p matrix, so that AX = cᵀX is a scalar with mean cᵀµ and
variance cᵀΣc.
75 / 76
4.7 Some Specific Applications
Theorem (independence)
Let X ≜ (X1ᵀ, X2ᵀ)ᵀ ~ N( (µ1ᵀ, µ2ᵀ)ᵀ, [ Σ11  Σ12 ; Σ12ᵀ  Σ22 ] ), with X1 ∈ ℝʳ and
X2 ∈ ℝ^{p−r}. Then X1 and X2 are independent if and only if Σ12 = 0.
Proof sketch: the "only if" direction is immediate, since independence implies
cov(X1, X2) = Σ12 = 0.
76 / 76
For the converse, write t = (t1ᵀ, t2ᵀ)ᵀ. The MGF of X is
MX(t) = exp{ t1ᵀµ1 + t2ᵀµ2 + (1/2)( t1ᵀΣ11t1 + 2 t1ᵀΣ12t2 + t2ᵀΣ22t2 ) }.
If Σ12 = 0, this factors as
MX(t) = exp{ t1ᵀµ1 + (1/2) t1ᵀΣ11t1 } × exp{ t2ᵀµ2 + (1/2) t2ᵀΣ22t2 } = MX1(t1) MX2(t2),
so X1 and X2 are independent.