Prob Lec3
Outline
• Mass function
• Expectation, variance
• Bernoulli, Binomial distribution
• Poisson distribution
• Geometric distribution
Random variables
Definition
A random variable is a function from the sample space Ω to the
real numbers R.
Random variables
Example
Flipping three coins, let X count the number of Heads obtained.
Then, as a function on Ω,
X (T , T , T ) = 0;
X (T , T , H) = X (T , H, T ) = X (H, T , T ) = 1;
X (T , H, H) = X (H, T , H) = X (H, H, T ) = 2;
X (H, H, H) = 3.
Example
The number of Heads in three coinflips is discrete.
Example
The number of coinflips needed to first see a Head is discrete: it
can be 1, 2, 3, . . . .
Example
The lifetime of a device is not discrete, it can be anything in the
real interval [0, ∞).
Mass function
The distribution of a random variable will be the object of
central importance to us.
Definition
Let X be a discrete random variable with possible values
x1, x2, . . . . The probability mass function (pmf), or distribution, of X tells us the probabilities of these possible values:
pX(xi) = P{X = xi},
for all possible xi's.
Mass function
Proposition
For any discrete random variable X,
p(xi) ≥ 0 for all i,    and    Σi p(xi) = 1.
Proof.
Remark
Conversely, any function p that is non-zero at only countably
many values xi, and which has the above properties, is a
probability mass function: there is a sample space and a
random variable that realise this mass function.
Mass function
Example
We have seen X , the number of Heads in three coinflips. Its
possible values are X = 0, 1, 2, 3, and its mass function is
given by
p(0) = p(3) = 1/8;    p(1) = p(2) = 3/8.
Indeed,
Σ(i=0..3) p(i) = 1/8 + 3/8 + 3/8 + 1/8 = 1.
Mass function
Example
Fix a positive parameter λ > 0, and define
p(i) = c · λ^i / i!,    i = 0, 1, 2, . . . .
How should we choose c to make this into a mass function? In
that case, what are P{X = 0} and P{X > 2} for the random
variable X having this mass function?
Mass function
Solution
First, p(i) ≥ 0 iff c ≥ 0. Second, we need
Σ(i=0..∞) p(i) = c · Σ(i=0..∞) λ^i / i! = c · e^λ = 1,
hence c = e^(−λ). Then
P{X = 0} = p(0) = e^(−λ) · λ^0 / 0! = e^(−λ);
Mass function
Solution (. . . cont’d)
P{X > 2} = 1 − P{X ≤ 2}
= 1 − P{X = 0} − P{X = 1} − P{X = 2}
= 1 − e^(−λ) · λ^0/0! − e^(−λ) · λ^1/1! − e^(−λ) · λ^2/2!
= 1 − e^(−λ) − e^(−λ) · λ − e^(−λ) · λ^2/2.
Expectation, variance
1. Expectation
Definition
The expectation, or mean, or expected value of a discrete
random variable X is defined as
EX := Σi xi · p(xi).
Remark
The expectation is nothing else than a weighted average of the
possible values xi with weights p(xi ). A center of mass, in other
words.
[Figure: the p(xi) as point masses at positions x1, x2, x3, x4, balancing at E(X).]
1. Expectation
Remark
Let X be an indicator variable:
X = 1 if event E occurs, and X = 0 if E^c occurs.
The expectation of an indicator random variable equals the probability of the event it indicates:
EX = 1 · P(E) + 0 · P(E^c) = P(E).
1. Expectation
For the number X shown on a fair die roll,
EX = Σ(i=1..6) i · p(i) = Σ(i=1..6) i · 1/6 = 1/6 · 6 · (1+6)/2 = 7/2.
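The weighted-average formula can be evaluated exactly with rational arithmetic; a short sketch for the fair-die pmf:

```python
from fractions import Fraction

# pmf of a fair die: p(i) = 1/6 for i = 1, ..., 6
pmf = {i: Fraction(1, 6) for i in range(1, 7)}

# EX = sum_i x_i * p(x_i), a weighted average of the possible values
EX = sum(x * p for x, p in pmf.items())
print(EX)  # 7/2
```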
Proposition (expectation of a function of X)
For a discrete random variable X and a function g,
E g(X) = Σi g(xi) · p(xi),
if this sum exists. This formula is rather natural.
Proof.
Corollary
E(aX + b) = a · EX + b.
Proof.
According to the above (with g(x) = ax + b),
E(aX + b) = Σi (a·xi + b) · p(xi) = a · Σi xi p(xi) + b · Σi p(xi)
= a · EX + b · 1 = a · EX + b.
2. Moments
Definition
The n-th moment of X is EX^n, and the n-th absolute moment of X is E|X|^n.
Remark
Our notation in this definition and in the future will be
EX^n := E(X^n) ≠ (EX)^n !!
3. Variance
Definition
The variance of X is VarX := E(X − EX)², and its standard deviation is SD X := √VarX.
Example
Define X ≡ 0, and
Y = 1 w.p. 1/2, −1 w.p. 1/2;    Z = 2 w.p. 1/5, −1/2 w.p. 4/5;    U = 10 w.p. 1/2, −10 w.p. 1/2.
Notice EX = EY = EZ = EU = 0: the expectation does not
distinguish between these rv.'s. Yet they are clearly different.
3. Variance
Example (. . . cont’d)
VarX = E(X − 0)² = 0² = 0,    SD X = √0 = 0.
VarY = E(Y − 0)² = 1² · 1/2 + (−1)² · 1/2 = 1,    SD Y = √1 = 1.
VarZ = E(Z − 0)² = 2² · 1/5 + (−1/2)² · 4/5 = 1,    SD Z = √1 = 1.
VarU = E(U − 0)² = 10² · 1/2 + (−10)² · 1/2 = 100,    SD U = √100 = 10.
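These four variances can be reproduced with a small helper that computes E(X − EX)² directly from a pmf (a sketch; the pmf-as-dict representation is my own convention, not the lecture's):

```python
from fractions import Fraction

def var(pmf):
    """VarX = E(X - EX)^2 for a discrete pmf given as {value: probability}."""
    mean = sum(x * p for x, p in pmf.items())
    return sum((x - mean) ** 2 * p for x, p in pmf.items())

Y = {1: Fraction(1, 2), -1: Fraction(1, 2)}
Z = {2: Fraction(1, 5), Fraction(-1, 2): Fraction(4, 5)}
U = {10: Fraction(1, 2), -10: Fraction(1, 2)}

print(var(Y), var(Z), var(U))  # 1 1 100
```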
3. Variance
Proposition
VarX = EX² − (EX)².
Proof.
Corollary
For any X, EX² ≥ (EX)², with equality only if X is a.s. constant.
Example
The variance of the number X shown after rolling a fair die is
VarX = EX² − (EX)² = (1² + 2² + · · · + 6²) · 1/6 − (7/2)² = 35/12,
and its standard deviation is √(35/12) ≃ 1.71.
The two most informative numbers about a fair die are the
average of 3.5 and the typical deviation of 1.71 around this
average.
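The die computation can be checked exactly via VarX = EX² − (EX)² (a minimal sketch):

```python
from fractions import Fraction

# Fair die: EX and EX^2, then VarX = EX^2 - (EX)^2
p = Fraction(1, 6)
EX = sum(i * p for i in range(1, 7))
EX2 = sum(i**2 * p for i in range(1, 7))
VarX = EX2 - EX**2
print(EX, VarX)                       # 7/2 35/12
print(round(float(VarX) ** 0.5, 2))   # 1.71
```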
Proposition
Var(aX + b) = a² · VarX.
Proof.
Var(aX + b) = E(aX + b − E(aX + b))² = E(a · (X − EX))² = a² · E(X − EX)² = a² · VarX.
Bernoulli, Binomial
In this part we’ll get to know the Bernoulli and the Binomial
distributions.
1. Definition
Definition
Suppose that n independent trials are performed, each
succeeding with probability p. Let X count the number of
successes within the n trials. Then X has the
Binomial distribution with parameters n and p or, in short,
X ∼ Binom(n, p).
Notice that the Bernoulli(p) distribution (the case n = 1) is just the
distribution of the indicator variable from before.
2. Mass function
Proposition
Let X ∼ Binom(n, p). Then X takes the values 0, 1, . . . , n, and its mass function is
p(i) = P{X = i} = (n choose i) · p^i · (1 − p)^(n−i),    i = 0, 1, . . . , n.
In the special case n = 1, X is Bernoulli(p):    p(0) = 1 − p,    p(1) = p.
2. Mass function
Remark
That the above is indeed a mass function we verify via the
Binomial Theorem (p(i) ≥ 0 is clear):
Σ(i=0..n) p(i) = Σ(i=0..n) (n choose i) p^i (1 − p)^(n−i) = [p + (1 − p)]^n = 1.
2. Mass function
Example
Screws are sold in packages of 10. Due to a manufacturing
error, each screw today is independently defective with
probability 0.1. If there is a money-back guarantee that at most
one screw is defective in a package, what percentage of
packages is returned?
3. Expectation, variance
Proposition
Let X ∼ Binom(n, p). Then EX = n·p and VarX = n·p·(1 − p).
Proof.
We first need to calculate
EX = Σi i · p(i) = Σ(i=0..n) i · (n choose i) · p^i · (1 − p)^(n−i).
To handle this, here is a cute trick: i = d/dt t^i |_(t=1).
3. Expectation, variance
Proof.
EX = Σ(i=0..n) i · (n choose i) p^i (1 − p)^(n−i)
= Σ(i=0..n) (d/dt t^i |_(t=1)) · (n choose i) p^i (1 − p)^(n−i)
= d/dt [ Σ(i=0..n) (n choose i) (tp)^i (1 − p)^(n−i) ] |_(t=1)
= d/dt (tp + 1 − p)^n |_(t=1) = n · (tp + 1 − p)^(n−1) · p |_(t=1) = n·p.
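The identity EX = n·p can also be verified by summing the mass function directly (a sketch; n = 10, p = 0.3 are arbitrary illustration values):

```python
import math

def binom_pmf(n, p, i):
    """P{X = i} for X ~ Binom(n, p)."""
    return math.comb(n, i) * p**i * (1 - p) ** (n - i)

n, p = 10, 0.3  # arbitrary illustration values
EX = sum(i * binom_pmf(n, p, i) for i in range(n + 1))
print(EX)  # ≈ 3.0 = n·p
```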
Poisson
1. Mass function
Definition
Fix a positive real number λ. The random variable X is
Poisson distributed with parameter λ, in short X ∼ Poi(λ), if it is
non-negative integer valued, and its mass function is
p(i) = P{X = i} = e^(−λ) · λ^i / i!,    i = 0, 1, 2, . . .
Proposition
Fix λ > 0, and suppose that Yn ∼ Binom(n, p) with p = p(n) in
such a way that n · p → λ. Then the distribution of Yn converges
to Poisson(λ):
∀ i ≥ 0:    P{Yn = i} −→ e^(−λ) · λ^i / i!    as n → ∞.
Proof.
P{Yn = i} = (n choose i) · p^i (1 − p)^(n−i)
= 1/i! · [np] · [(n−1)p] · · · [(n−i+1)p] · (1 − p)^n / (1 − p)^i.
Each of the i bracketed factors converges to λ, while (1 − p)^n → e^(−λ) (since n·p → λ) and (1 − p)^i → 1, giving the claimed limit e^(−λ) · λ^i / i!.
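The convergence is visible numerically: fixing n·p = λ and letting n grow, the Binomial probabilities approach the Poisson mass (a sketch; λ = 2, i = 3 are arbitrary illustration values):

```python
import math

lam, i = 2.0, 3  # arbitrary illustration values
poisson = math.exp(-lam) * lam**i / math.factorial(i)

# Binom(n, lam/n) probabilities at i approach the Poisson(lam) mass
for n in (10, 100, 1000, 10000):
    p = lam / n  # chosen so that n·p = λ exactly
    binom = math.comb(n, i) * p**i * (1 - p) ** (n - i)
    print(n, binom)

print(poisson)  # the limiting value
```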
3. Expectation, variance
Proposition
For X ∼ Poi(λ), EX = VarX = λ.
Proof.
EX = Σ(i=0..∞) i · p(i) = Σ(i=1..∞) i · e^(−λ) · λ^i / i!
= λ · Σ(i=1..∞) e^(−λ) · λ^(i−1) / (i−1)! = λ · Σ(j=0..∞) e^(−λ) · λ^j / j! = λ.
4. Examples
Example
Because of the Poisson approximation of the Binomial, the
◮ number of typos on a page of a book;
◮ number of citizens over 100 years of age in a city;
◮ number of incoming calls per hour in a customer centre;
◮ number of customers in a post office today
are each well approximated by the Poisson distribution.
4. Examples
Example
A book on average has 1/2 typos per page. What is the
probability that the next page has at least three of them?
The number X of typos on a page follows a Poisson(λ)
distribution, where λ can be determined from 1/2 = EX = λ. To
answer the question,
P{X ≥ 3} = 1 − P{X ≤ 2}
= 1 − P{X = 0} − P{X = 1} − P{X = 2}
= 1 − (1/2)^0/0! · e^(−1/2) − (1/2)^1/1! · e^(−1/2) − (1/2)^2/2! · e^(−1/2)
≃ 0.014.
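A one-line numerical check of this tail probability (a minimal sketch):

```python
import math

lam = 0.5  # half a typo per page on average
p_at_most_2 = sum(math.exp(-lam) * lam**i / math.factorial(i) for i in range(3))
p_at_least_3 = 1 - p_at_most_2
print(round(p_at_least_3, 3))  # 0.014
```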
4. Examples
Example
Screws are sold in packages of 10. Due to a manufacturing
error, each screw today is independently defective with
probability 0.1. If there is a money-back guarantee that at most
one screw is defective in a package, what percentage of
packages is returned? (This time, approximate with the Poisson distribution.)
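The slide leaves the comparison to the reader; the exact Binomial answer and the Poisson approximation can be computed side by side, reading "returned" as "more than one defective screw" (a sketch under that assumption):

```python
import math

n, q = 10, 0.1   # 10 screws, each defective with probability 0.1
lam = n * q      # Poisson parameter for the approximation

# Returned when more than one screw is defective
exact = 1 - (1 - q) ** n - n * q * (1 - q) ** (n - 1)   # Binom(10, 0.1)
approx = 1 - math.exp(-lam) - lam * math.exp(-lam)      # Poi(1)
print(round(exact, 4), round(approx, 4))  # 0.2639 0.2642
```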
Geometric
1. Mass function
Definition
Suppose that independent trials, each succeeding with
probability p, are repeated until the first success. The total
number X of trials made has the Geometric(p) distribution (in
short, X ∼ Geom(p)).
Proposition
X takes positive integer values, with probabilities
p(i) = (1 − p)^(i−1) · p,    i = 1, 2, . . . .
1. Mass function
Remark
For a Geometric(p) random variable and any k ≥ 1 we have
P{X ≥ k} = (1 − p)^(k−1) (we have at least k − 1 failures).
Corollary
The Geometric random variable is (discrete) memoryless: for
every k ≥ 1, n ≥ 0,
P{X ≥ n + k | X > n} = P{X ≥ k}.
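The memoryless identity can be checked directly from P{X ≥ k} = (1 − p)^(k−1) (a sketch; p, n, k are arbitrary illustration values):

```python
p = 0.3  # arbitrary success probability

def tail(k):
    """P{X >= k} for X ~ Geom(p)."""
    return (1 - p) ** (k - 1)

n, k = 4, 6
# P{X >= n+k | X > n} = P{X >= n+k} / P{X >= n+1}
cond = tail(n + k) / tail(n + 1)
print(cond, tail(k))  # the two sides agree
```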
2. Expectation, variance
Proposition
For a Geometric(p) random variable X ,
EX = 1/p,    VarX = (1 − p)/p².
2. Expectation, variance
Proof.
EX = Σ(i=1..∞) i · (1 − p)^(i−1) · p = Σ(i=0..∞) i · (1 − p)^(i−1) · p
= Σ(i=0..∞) (d/dt t^i |_(t=1)) · (1 − p)^(i−1) · p = d/dt [ Σ(i=0..∞) t^i · (1 − p)^(i−1) · p ] |_(t=1)
= p/(1 − p) · d/dt [1/(1 − (1 − p)t)] |_(t=1)
= p/(1 − p) · (1 − p)/(1 − (1 − p))² = 1/p.
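The result EX = 1/p can be verified by summing the series numerically (a sketch; p = 0.25 is an arbitrary illustration value):

```python
p = 0.25  # arbitrary illustration value

# Truncation of EX = sum_{i>=1} i·(1-p)^(i-1)·p; the neglected tail is tiny
EX = sum(i * (1 - p) ** (i - 1) * p for i in range(1, 2000))
print(EX)  # ≈ 4.0 = 1/p
```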
3. Example
Example
To first see 3 appearing on a fair die, we wait X ∼ Geom(1/6)
many rolls. Our average waiting time is EX = 1/(1/6) = 6 rolls, and
the standard deviation is
SD X = √VarX = √((1 − 1/6)/(1/6)²) = √30 ≃ 5.48.
3. Example
Example (. . . cont’d)
The chance that 3 first comes on the 7th roll is
p(7) = P{X = 7} = (1 − 1/6)^6 · 1/6 ≃ 0.056,
while the chance that 3 first comes on the 7th or a later roll is
P{X ≥ 7} = (1 − 1/6)^6 ≃ 0.335.
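These numbers can also be recovered by simulation: rolling a virtual die until the first 3 and averaging over many repetitions (a sketch; the seed and sample size are arbitrary choices):

```python
import random

random.seed(0)  # reproducible sketch

def rolls_until_three():
    """Roll a fair die until the first 3; return how many rolls it took."""
    count = 1
    while random.randint(1, 6) != 3:
        count += 1
    return count

N = 100_000
samples = [rolls_until_three() for _ in range(N)]
est_mean = sum(samples) / N
frac_ge_7 = sum(s >= 7 for s in samples) / N
print(est_mean)   # ≈ 6, the theoretical EX = 1/p
print(frac_ge_7)  # ≈ (5/6)^6 ≈ 0.335
```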