4. Basic probability theory
Lect04.ppt S-38.145 - Introduction to Teletraffic Theory – Spring 2005
Contents
• Basic concepts
• Discrete random variables
• Discrete distributions (number distributions)
• Continuous random variables
• Continuous distributions (time distributions)
• Other random variables
Sample space, sample points, events
• Sample space Ω is the set of all possible sample points ω ∈ Ω
– Example 0. Tossing a coin: Ω = {H,T}
– Example 1. Casting a die: Ω = {1,2,3,4,5,6}
– Example 2. Number of customers in a queue: Ω = {0,1,2,...}
– Example 3. Call holding time (e.g. in minutes): Ω = {x ∈ ℜ | x > 0}
• Events A,B,C,... ⊂ Ω are measurable subsets of the sample space Ω
– Example 1. “Even numbers of a die”: A = {2,4,6}
– Example 2. “No customers in a queue”: A = {0}
– Example 3. “Call holding time greater than 3.0 (min)”: A = {x ∈ ℜ | x > 3.0}
• Denote by ℱ the set of all events A ∈ ℱ
– Sure event: The sample space Ω ∈ ℱ itself
– Impossible event: The empty set ∅ ∈ ℱ
Combination of events
• Union “A or B”: A ∪ B = {ω ∈ Ω | ω ∈ A or ω ∈ B}
• Intersection “A and B”: A ∩ B = {ω ∈ Ω | ω ∈ A and ω ∈ B}
• Complement “not A”: Aᶜ = {ω ∈ Ω | ω ∉ A}
• Events A and B are disjoint if
– A∩B=∅
• A set of events {B1, B2, …} is a partition of event A if
– (i) Bi ∩ Bj = ∅ for all i ≠ j
– (ii) ∪i Bi = A
[Figure: event A partitioned into B1, B2 and B3]
Probability
• Probability of event A is denoted by P(A), P(A) ∈ [0,1]
– Probability measure P is thus a real-valued set function defined on the set of events ℱ, P: ℱ → [0,1]
• Properties:
– (i) 0 ≤ P(A) ≤ 1
– (ii) P(∅) = 0
– (iii) P(Ω) = 1
– (iv) P(Aᶜ) = 1 − P(A)
– (v) P(A ∪ B) = P(A) + P(B) − P(A ∩ B)
– (vi) A ∩ B = ∅ ⇒ P(A ∪ B) = P(A) + P(B)
– (vii) {Bi} is a partition of A ⇒ P(A) = Σi P(Bi)
– (viii) A ⊂ B ⇒ P(A) ≤ P(B)
[Figure: Venn diagram of two overlapping events A and B]
Conditional probability
• Assume that P(B) > 0
• Definition: The conditional probability of event A given that event B occurred is defined as
P(A | B) = P(A ∩ B) / P(B)
• It follows that
P( A ∩ B) = P(B)P( A | B) = P( A)P(B | A)
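The definition can be tried out on the die example. The following is a minimal sketch, assuming a fair die (the slides do not fix the probabilities); the helper names `prob` and `cond_prob` are illustrative only.

```python
from fractions import Fraction

# Sample space of a die; assumed fair, so each sample point has probability 1/6.
OMEGA = frozenset(range(1, 7))

def prob(event):
    """P(A) for an event A ⊂ Ω under the uniform (fair die) assumption."""
    return Fraction(len(event & OMEGA), len(OMEGA))

def cond_prob(a, b):
    """P(A | B) = P(A ∩ B) / P(B); requires P(B) > 0."""
    if prob(b) == 0:
        raise ValueError("P(B) must be positive")
    return prob(a & b) / prob(b)

A = {2, 4, 6}   # "even number"
B = {4, 5, 6}   # "greater than 3"
print(cond_prob(A, B))                            # 2/3
print(prob(A & B) == prob(B) * cond_prob(A, B))   # True
```

Exact fractions are used so that the product rule P(A ∩ B) = P(B)P(A | B) can be compared with `==` instead of a floating-point tolerance.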
Theorem of total probability
• Let {Bi} be a partition of the sample space Ω
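• It follows that {A ∩ Bi} is a partition of event A. Thus (by property (vii) under “Probability”)
P(A) = ∑i P(A ∩ Bi)
• Assume further that P(Bi) > 0 for all i. Then (by the definition of conditional probability)
P(A) = ∑i P(Bi)P(A | Bi)
• This is the theorem of total probability
[Figure: sample space Ω partitioned into B1, B2, B3 and B4, with event A overlapping the parts]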
Bayes’ theorem
• Let {Bi} be a partition of the sample space Ω
• Assume that P(A) > 0 and P(Bi) > 0 for all i. Then (by the definition of conditional probability)
P(Bi | A) = P(A ∩ Bi) / P(A) = P(Bi)P(A | Bi) / P(A)
• Furthermore, by the theorem of total probability, we get
P(Bi | A) = P(Bi)P(A | Bi) / ∑j P(Bj)P(A | Bj)
• This is Bayes’ theorem
– Probabilities P(Bi) are called a priori probabilities of events Bi
– Probabilities P(Bi | A) are called a posteriori probabilities of events Bi (given that the event A occurred)
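Bayes’ theorem translates directly into a few lines of code. A minimal sketch follows; the function name `posterior` and the numbers are hypothetical and chosen only for illustration.

```python
from fractions import Fraction

def posterior(priors, likelihoods):
    """Return [P(Bi | A)] from priors P(Bi) and likelihoods P(A | Bi)."""
    joint = [p * l for p, l in zip(priors, likelihoods)]   # P(Bi)P(A | Bi)
    total = sum(joint)                                     # P(A) by total probability
    if total == 0:
        raise ValueError("P(A) must be positive")
    return [j / total for j in joint]

# Hypothetical numbers: two equally likely events B1, B2 partitioning Ω.
priors = [Fraction(1, 2), Fraction(1, 2)]
likelihoods = [Fraction(1, 10), Fraction(3, 10)]
print(posterior(priors, likelihoods))   # [Fraction(1, 4), Fraction(3, 4)]
```

The denominator is exactly the sum from the theorem of total probability, so the a posteriori probabilities always add up to 1.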
Statistical independence of events
• Definition: Events A and B are independent if
P( A ∩ B) = P( A)P(B)
• It follows that
P(A | B) = P(A ∩ B) / P(B) = P(A)P(B) / P(B) = P(A)
• Correspondingly:
P(B | A) = P(A ∩ B) / P(A) = P(A)P(B) / P(A) = P(B)
Random variables
• Definition: Real-valued random variable X is a real-valued and measurable function defined on the sample space Ω, X: Ω → ℜ
– Each sample point ω ∈ Ω is associated with a real number X(ω)
• Measurability means that all sets of type
{X ≤ x} := {ω ∈ Ω | X(ω) ≤ x} ⊂ Ω
belong to the set of events ℱ, that is
{X ≤ x} ∈ ℱ
• The probability of such an event is denoted by P{X ≤ x}
Example
• A coin is tossed three times
• Sample space:
Ω = {(ω1, ω2, ω3) | ωi ∈ {H, T}, i = 1,2,3}
• Let X be the random variable that tells the total number of tails in these three experiments:
ω      HHH  HHT  HTH  THH  HTT  THT  TTH  TTT
X(ω)    0    1    1    1    2    2    2    3
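The table can be reproduced by enumerating the sample space. This is a minimal sketch; the point probabilities at the end additionally assume a fair coin, which the slide does not state.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# All sample points ω = (ω1, ω2, ω3) with ωi ∈ {H, T}
omega = list(product("HT", repeat=3))

def X(w):
    """Total number of tails in the sample point w."""
    return w.count("T")

# Assuming a fair coin, each of the 8 sample points has probability 1/8.
counts = Counter(X(w) for w in omega)
pmf = {k: Fraction(c, len(omega)) for k, c in sorted(counts.items())}
print(pmf)
```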
Indicators of events
• Let A ∈ ℱ be an arbitrary event
• Definition: The indicator of event A is a random variable defined as follows:
1A(ω) = 1, if ω ∈ A
1A(ω) = 0, if ω ∉ A
• Clearly:
P{1A = 1} = P(A)
P{1A = 0} = P(Aᶜ) = 1 − P(A)
Cumulative distribution function
• Definition: The cumulative distribution function (cdf) of a random
variable X is a function FX: ℜ → [0,1] defined as follows:
FX (x) = P{X ≤ x}
• Cdf determines the distribution of the random variable,
– that is: the probabilities P{X ∈ B}, where B ⊂ ℜ and {X ∈ B} ∈ ℱ
• Properties:
– (i) FX is non-decreasing
– (ii) FX is continuous from the right
– (iii) FX(−∞) = 0
– (iv) FX(∞) = 1
[Figure: cdf FX(x) plotted against x, rising from 0 to 1]
Statistical independence of random variables
• Definition: Random variables X and Y are independent if for all x and y
P{X ≤ x, Y ≤ y} = P{X ≤ x}P{Y ≤ y}
• Definition: Random variables X1,…, Xn are totally independent if for all i and xi
P{X1 ≤ x1,…, Xn ≤ xn} = P{X1 ≤ x1}⋯P{Xn ≤ xn}
Maximum and minimum of independent random variables
• Let the random variables X1,…, Xn be totally independent
• Denote: Xmax := max{X1,…, Xn}. Then
P{Xmax ≤ x} = P{X1 ≤ x,…, Xn ≤ x}
= P{X1 ≤ x}⋯P{Xn ≤ x}
• Denote: Xmin := min{X1,…, Xn}. Then
P{Xmin > x} = P{X1 > x,…, Xn > x}
= P{X1 > x}⋯P{Xn > x}
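Both product formulas can be checked by brute force on a small case. The sketch below assumes n = 2 independent fair dice (an illustrative choice, not from the slides) and compares the formulas with a direct count over all outcomes.

```python
from fractions import Fraction
from itertools import product

faces = range(1, 7)

def cdf(x):
    """P{Xi ≤ x} for one fair die (assumed)."""
    return Fraction(sum(1 for f in faces if f <= x), 6)

n, x = 2, 4
# Product formulas for independent random variables
p_max = cdf(x) ** n          # P{Xmax ≤ x}
p_min = (1 - cdf(x)) ** n    # P{Xmin > x}

# Brute-force check over all 6^n equally likely outcomes
outcomes = list(product(faces, repeat=n))
brute_max = Fraction(sum(max(o) <= x for o in outcomes), len(outcomes))
brute_min = Fraction(sum(min(o) > x for o in outcomes), len(outcomes))
print(p_max, brute_max, p_min, brute_min)
```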
Discrete random variables
• Definition: Set A ⊂ ℜ is called discrete if it is
– finite, A = {x1,…, xn}, or
– countably infinite, A = {x1, x2,…}
• Definition: Random variable X is discrete if there is a discrete set SX ⊂ ℜ such that
P{X ∈ SX} = 1
• It follows that
– P{X = x} ≥ 0 for all x ∈ SX
– P{X = x} = 0 for all x ∉ SX
• The set SX is called the value set
Point probabilities
• Let X be a discrete random variable
• The distribution of X is determined by the point probabilities pi,
pi := P{X = xi}, xi ∈ SX
• Definition: The probability mass function (pmf) of X is a function pX: ℜ → [0,1] defined as follows:
pX(x) := P{X = x} = pi, if x = xi ∈ SX
pX(x) := P{X = x} = 0, if x ∉ SX
• Cdf is in this case a step function:
FX(x) = P{X ≤ x} = ∑_{i: xi ≤ x} pi
Example
[Figure: probability mass function (pmf) pX(x) and the corresponding step-shaped cumulative distribution function (cdf) FX(x) for the value set SX = {x1, x2, x3, x4}]
Independence of discrete random variables
• Discrete random variables X and Y are independent if and only if for all xi ∈ SX and yj ∈ SY
P{X = xi, Y = yj} = P{X = xi}P{Y = yj}
Expectation
• Definition: The expectation (mean value) of X is defined by
µX := E[X] := ∑_{x∈SX} P{X = x}⋅x = ∑_{x∈SX} pX(x)x = ∑i pi xi
– Note 1: The expectation exists only if Σi pi|xi| < ∞
– Note 2: If Σi pi xi = ∞, then we may denote E[X] = ∞
• Properties:
– (i) c ∈ ℜ ⇒ E[cX] = cE[X]
– (ii) E[X + Y] = E[X] + E[Y]
– (iii) X and Y independent ⇒ E[XY] = E[X]E[Y]
Variance
• Definition: The variance of X is defined by
σX² := D²[X] := Var[X] := E[(X − E[X])²]
• Useful formula (prove!):
D²[X] = E[X²] − E[X]²
• Properties:
– (i) c ∈ ℜ ⇒ D²[cX] = c²D²[X]
– (ii) X and Y independent ⇒ D²[X + Y] = D²[X] + D²[Y]
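As a numerical illustration (not a proof) of the definition and the useful formula, the sketch below computes both for a fair die. The fair-die point probabilities and the helper name `expect` are assumptions made for the example.

```python
from fractions import Fraction

# Point probabilities of a fair die (assumed): pi = 1/6 for xi = 1,...,6
pmf = {x: Fraction(1, 6) for x in range(1, 7)}

def expect(g, pmf):
    """E[g(X)] = Σi pi g(xi) for a discrete random variable."""
    return sum(p * g(x) for x, p in pmf.items())

mean = expect(lambda x: x, pmf)                         # E[X]
var_def = expect(lambda x: (x - mean) ** 2, pmf)        # E[(X − E[X])²]
var_formula = expect(lambda x: x * x, pmf) - mean ** 2  # E[X²] − E[X]²
print(mean, var_def, var_formula)
```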
Covariance
• Definition: The covariance between X and Y is defined by
σXY² := Cov[X,Y] := E[(X − E[X])(Y − E[Y])]
• Useful formula (prove!):
Cov[X,Y] = E[XY] − E[X]E[Y]
• Properties:
– (i) Cov[X,X] = Var[X]
– (ii) Cov[X,Y] = Cov[Y,X]
– (iii) Cov[X+Y,Z] = Cov[X,Z] + Cov[Y,Z]
– (iv) X and Y independent ⇒ Cov[X,Y] = 0
Other distribution related parameters
• Definition: The standard deviation of X is defined by
σX := D[X] := √(D²[X]) = √(Var[X])
• Definition: The coefficient of variation of X is defined by
cX := C[X] := D[X] / E[X]
• Definition: The kth moment, k=1,2,…, of X is defined by
µX^(k) := E[X^k]
Average of IID random variables
• Let X1,…, Xn be independent and identically distributed (IID) with mean µ and variance σ²
• Denote the average (sample mean) as follows:
X̄n := (1/n) ∑_{i=1}^{n} Xi
• Then (prove!)
E[X̄n] = µ
D²[X̄n] = σ²/n
D[X̄n] = σ/√n
Law of large numbers (LLN)
• Let X1,…, Xn be independent and identically distributed (IID) with mean µ and variance σ²
• Weak law of large numbers: for all ε > 0, as n → ∞,
P{|X̄n − µ| > ε} → 0
• Strong law of large numbers: with probability 1, as n → ∞,
X̄n → µ
Bernoulli distribution
X ∼ Bernoulli( p), p ∈(0,1)
– describes a simple random experiment with two possible outcomes:
success (1) and failure (0); cf. coin tossing
– success with probability p (and failure with probability 1 − p)
• Value set: SX = {0,1}
• Point probabilities:
P{X = 0} = 1 − p, P{X = 1} = p
• Mean value: E[X] = (1 − p)⋅0 + p⋅1 = p
• Second moment: E[X²] = (1 − p)⋅0² + p⋅1² = p
• Variance: D²[X] = E[X²] − E[X]² = p − p² = p(1 − p)
Binomial distribution
X ∼ Bin(n, p), n ∈ {1,2,...}, p ∈ (0,1)
– number of successes in an independent series of simple random experiments (of Bernoulli type); X = X1 + … + Xn (with Xi ∼ Bernoulli(p))
– n = total number of experiments
– p = probability of success in any single experiment
• Value set: SX = {0,1,…,n}
• Point probabilities:
P{X = i} = C(n,i) p^i (1 − p)^(n−i)
– where C(n,i) := n! / (i!(n−i)!) is the binomial coefficient and n! = n⋅(n−1)⋯2⋅1
• Mean value: E[X] = E[X1] + … + E[Xn] = np
• Variance: D²[X] = D²[X1] + … + D²[Xn] = np(1 − p) (independence!)
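The point probabilities, mean and variance can be checked numerically. A minimal sketch, with hypothetical parameters n = 10 and p = 0.3:

```python
from math import comb

def binom_pmf(i, n, p):
    """P{X = i} for X ∼ Bin(n, p)."""
    return comb(n, i) * p**i * (1 - p) ** (n - i)

n, p = 10, 0.3   # hypothetical parameters
pmf = [binom_pmf(i, n, p) for i in range(n + 1)]
mean = sum(i * q for i, q in enumerate(pmf))
var = sum(i * i * q for i, q in enumerate(pmf)) - mean**2
print(sum(pmf), mean, var)   # approximately 1, np and np(1 − p)
```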
Geometric distribution
X ∼ Geom( p), p ∈(0,1)
– number of successes until the first failure in an independent series of simple
random experiments (of Bernoulli type)
– p = probability of success in any single experiment
• Value set: SX = {0,1,…}
• Point probabilities:
P{X = i} = p^i (1 − p)
• Mean value: E[X] = ∑i i p^i (1 − p) = p/(1 − p)
• Second moment: E[X²] = ∑i i² p^i (1 − p) = 2(p/(1 − p))² + p/(1 − p)
• Variance: D²[X] = E[X²] − E[X]² = p/(1 − p)²
Memoryless property of geometric distribution
• Geometric distribution has the so-called memoryless property: for all i,j ∈ {0,1,...}
P{X ≥ i + j | X ≥ i} = P{X ≥ j}
• Prove!
– Tip: Prove first that P{X ≥ i} = p^i
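A numerical sanity check (not a substitute for the proof) of the tip and of the memoryless property. The values of p, i and j are hypothetical, and the point probabilities P{X = i} = p^i (1 − p) are those of the geometric distribution as defined in these slides.

```python
def tail(i, p):
    """P{X ≥ i} = p^i for X ∼ Geom(p) with P{X = i} = p^i (1 − p)."""
    return p**i

def tail_by_sum(i, p, terms=2000):
    """The same tail probability, summed term by term from the pmf (truncated)."""
    return sum(p**k * (1 - p) for k in range(i, i + terms))

p, i, j = 0.8, 3, 5                 # hypothetical values
lhs = tail(i + j, p) / tail(i, p)   # P{X ≥ i + j | X ≥ i}
rhs = tail(j, p)                    # P{X ≥ j}
print(lhs, rhs, tail_by_sum(i, p))
```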
Minimum of geometric random variables
• Let X1 ∼ Geom(p1) and X2 ∼ Geom(p2) be independent. Then
Xmin := min{X1, X2} ∼ Geom(p1 p2)
and
P{Xmin = Xi} = (1 − pi) / (1 − p1 p2), i ∈ {1,2}
• Prove!
– Tip: See “Maximum and minimum of independent random variables”
Poisson distribution
X ∼ Poisson(a), a > 0
– limit of binomial distribution as n → ∞ and p → 0 in such a way that np → a
• Value set: SX = {0,1,…}
• Point probabilities:
P{X = i} = (a^i / i!) e^(−a)
• Mean value: E[X] = a
• Second moment: E[X(X − 1)] = a² ⇒ E[X²] = a² + a
• Variance: D²[X] = E[X²] − E[X]² = a
Example
• Assume that
– 200 subscribers are connected to a local exchange
– each subscriber’s characteristic traffic is 0.01 erlang
– subscribers behave independently
• Then the number of active calls X ∼ Bin(200,0.01)
• Corresponding Poisson-approximation X ≈ Poisson(2.0)
• Point probabilities P{X = i}:
i               0      1      2      3      4      5
Bin(200,0.01)   .1340  .2707  .2720  .1814  .0902  .0357
Poisson(2.0)    .1353  .2707  .2707  .1804  .0902  .0361
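The two rows can be recomputed directly from the binomial and Poisson point probabilities. A minimal sketch (the function names are illustrative):

```python
from math import comb, exp, factorial

def binom_pmf(i, n, p):
    """P{X = i} for X ∼ Bin(n, p)."""
    return comb(n, i) * p**i * (1 - p) ** (n - i)

def poisson_pmf(i, a):
    """P{X = i} for X ∼ Poisson(a)."""
    return a**i / factorial(i) * exp(-a)

n, p = 200, 0.01   # 200 subscribers, 0.01 erlang each
a = n * p          # Poisson parameter a = np = 2.0
for i in range(6):
    print(i, round(binom_pmf(i, n, p), 4), round(poisson_pmf(i, a), 4))
```

With n this large and p this small, the two columns differ only in the third decimal, which is why the Poisson approximation is convenient here.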
Properties of the Poisson distribution
• (i) Sum: Let X1 ∼ Poisson(a1) and X2 ∼ Poisson(a2) be independent. Then
X1 + X2 ∼ Poisson(a1 + a2)
• (ii) Random sample: Let X ∼ Poisson(a) denote the number of
elements in a set, and Y denote the size of a random sample of this set
(each element taken independently with probability p). Then
Y ∼ Poisson( pa)
• (iii) Random sorting: Let X and Y be as in (ii), and Z = X − Y. Then
Y and Z are independent (given that X is unknown) and
Z ∼ Poisson((1− p)a)
Continuous random variables
• Definition: Random variable X is continuous if there is an integrable function fX: ℜ → ℜ+ such that for all x ∈ ℜ
FX(x) := P{X ≤ x} = ∫_{−∞}^{x} fX(y) dy
• The function fX is called the probability density function (pdf)
– The set SX, where fX > 0, is called the value set
• Properties:
– (i) P{X = x} = 0 for all x ∈ ℜ
– (ii) P{a < X < b} = P{a ≤ X ≤ b} = ∫_{a}^{b} fX(x) dx
– (iii) P{X ∈ A} = ∫_{A} fX(x) dx
– (iv) P{X ∈ ℜ} = ∫_{−∞}^{∞} fX(x) dx = ∫_{SX} fX(x) dx = 1
Example
[Figure: probability density function (pdf) fX(x) and the corresponding cumulative distribution function (cdf) FX(x) for the value set SX = [x1, x3]]
Expectation and other distribution related parameters
• Definition: The expectation (mean value) of X is defined by
µX := E[X] := ∫_{−∞}^{∞} fX(x) x dx
– Note 1: The expectation exists only if ∫_{−∞}^{∞} fX(x)|x| dx < ∞
– Note 2: If ∫_{−∞}^{∞} fX(x) x dx = ∞, then we may denote E[X] = ∞
– The expectation has the same properties as in the discrete case (see “Expectation”)
• The other distribution parameters (variance, covariance,...) are defined just as in the discrete case
– These parameters have the same properties as in the discrete case (see “Variance”, “Covariance” and “Other distribution related parameters”)
Uniform distribution
X ∼ U(a, b), a < b
– continuous counterpart of “casting a die”
• Value set: SX = (a,b)
• Probability density function (pdf):
fX(x) = 1/(b − a), x ∈ (a,b)
• Cumulative distribution function (cdf):
FX(x) := P{X ≤ x} = (x − a)/(b − a), x ∈ (a,b)
• Mean value: E[X] = ∫_{a}^{b} x/(b − a) dx = (a + b)/2
• Second moment: E[X²] = ∫_{a}^{b} x²/(b − a) dx = (a² + ab + b²)/3
• Variance: D²[X] = E[X²] − E[X]² = (b − a)²/12
Exponential distribution
X ∼ Exp(λ), λ > 0
– continuous counterpart of geometric distribution (“failure” prob. ≈ λdt)
• Value set: SX = (0,∞)
• Probability density function (pdf):
fX(x) = λe^(−λx), x > 0
• Cumulative distribution function (cdf):
FX(x) = P{X ≤ x} = 1 − e^(−λx), x > 0
• Mean value: E[X] = ∫_{0}^{∞} λx exp(−λx) dx = 1/λ
• Second moment: E[X²] = ∫_{0}^{∞} λx² exp(−λx) dx = 2/λ²
• Variance: D²[X] = E[X²] − E[X]² = 1/λ²
Memoryless property of exponential distribution
• Exponential distribution has the so-called memoryless property: for all x,y ∈ (0,∞)
P{X > x + y | X > x} = P{X > y}
• Prove!
– Tip: Prove first that P{X > x} = e^(−λx)
• Application:
– Assume that the call holding time is exponentially distributed with mean h (min).
– Consider a call that has already lasted for x minutes. Due to the memoryless property, this gives no information about the length of the remaining holding time: it is distributed as the original holding time and, on average, still lasts h minutes!
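The application can be illustrated numerically. In the sketch below the mean holding time h = 3 min and the times x and y are hypothetical values; the check only confirms the identity for these numbers and is not a proof.

```python
from math import exp

def survival(x, lam):
    """P{X > x} = e^(−λx) for X ∼ Exp(λ)."""
    return exp(-lam * x)

h = 3.0           # hypothetical mean holding time (min), so λ = 1/h
lam = 1 / h
x, y = 2.0, 5.0   # the call has lasted x minutes; will it last y more?

lhs = survival(x + y, lam) / survival(x, lam)   # P{X > x + y | X > x}
rhs = survival(y, lam)                          # P{X > y}
print(lhs, rhs)
```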
Minimum of exponential random variables
• Let X1 ∼ Exp(λ1) and X2 ∼ Exp(λ2) be independent. Then
Xmin := min{X1, X2} ∼ Exp(λ1 + λ2)
and
P{Xmin = Xi} = λi / (λ1 + λ2), i ∈ {1,2}
• Prove!
– Tip: See “Maximum and minimum of independent random variables”
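Before proving the result, it can be made plausible by simulation. This is a rough Monte Carlo sketch with hypothetical rates λ1 = 1 and λ2 = 2; it relies on `random.expovariate` from the Python standard library and only gives approximate agreement.

```python
import random

random.seed(1)          # fixed seed so the run is repeatable
lam1, lam2 = 1.0, 2.0   # hypothetical rates
n = 20000

sum_min = 0.0
first_wins = 0
for _ in range(n):
    x1 = random.expovariate(lam1)
    x2 = random.expovariate(lam2)
    sum_min += min(x1, x2)
    first_wins += x1 < x2

mean_min = sum_min / n        # should be near E[Xmin] = 1/(λ1 + λ2)
frac_first = first_wins / n   # should be near λ1/(λ1 + λ2)
print(mean_min, 1 / (lam1 + lam2))
print(frac_first, lam1 / (lam1 + lam2))
```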
Standard normal (Gaussian) distribution
X ∼ N(0,1)
– limit of the “normalized” sum of IID r.v.s with mean 0 and variance 1 (cf. “Central limit theorem (CLT)”)
• Value set: SX = (−∞,∞)
• Probability density function (pdf):
fX(x) = ϕ(x) := (1/√(2π)) e^(−x²/2)
• Cumulative distribution function (cdf):
FX(x) := P{X ≤ x} = Φ(x) := ∫_{−∞}^{x} ϕ(y) dy
• Mean value: E[X] = 0 (symmetric pdf)
• Variance: D²[X] = 1
Normal (Gaussian) distribution
X ∼ N(µ,σ²), µ ∈ ℜ, σ > 0
– if (X − µ)/σ ∼ N(0,1)
• Value set: SX = (−∞,∞)
• Probability density function (pdf):
fX(x) = FX'(x) = (1/σ) ϕ((x − µ)/σ)
• Cumulative distribution function (cdf):
FX(x) := P{X ≤ x} = P{(X − µ)/σ ≤ (x − µ)/σ} = Φ((x − µ)/σ)
• Mean value: E[X] = µ + σE[(X − µ)/σ] = µ (symmetric pdf around µ)
• Variance: D²[X] = σ²D²[(X − µ)/σ] = σ²
Properties of the normal distribution
• (i) Linear transformation: Let X ∼ N(µ,σ²) and α,β ∈ ℜ. Then
Y := αX + β ∼ N(αµ + β, α²σ²)
• (ii) Sum: Let X1 ∼ N(µ1,σ1²) and X2 ∼ N(µ2,σ2²) be independent. Then
X1 + X2 ∼ N(µ1 + µ2, σ1² + σ2²)
• (iii) Sample mean: Let Xi ∼ N(µ,σ²), i = 1,…,n, be independent and identically distributed (IID). Then
X̄n := (1/n) ∑_{i=1}^{n} Xi ∼ N(µ, σ²/n)
Central limit theorem (CLT)
• Let X1,…, Xn be independent and identically distributed (IID) with mean µ and variance σ² (and the third moment exists)
• Central limit theorem:
(X̄n − µ) / (σ/√n) → N(0,1) in distribution (i.d.)
• It follows that
X̄n ≈ N(µ, σ²/n)
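The approximation can be observed in a small simulation. The sketch below is an illustration under assumptions not taken from the slides: the Xi are drawn from U(0,1), with n = 30 and 5000 repetitions, and Φ is evaluated through the error function.

```python
import random
from math import erf, sqrt

def Phi(z):
    """Cdf of N(0,1), written with the error function."""
    return 0.5 * (1 + erf(z / sqrt(2)))

random.seed(1)                  # fixed seed so the run is repeatable
n, runs = 30, 5000              # hypothetical sample size and repetitions
mu, sigma = 0.5, sqrt(1 / 12)   # mean and standard deviation of U(0,1)

z = 1.0
hits = 0
for _ in range(runs):
    avg = sum(random.random() for _ in range(n)) / n
    hits += (avg - mu) / (sigma / sqrt(n)) <= z

print(hits / runs, Phi(z))      # empirical vs. limiting probability
```

The empirical fraction only approximates Φ(z); the agreement improves as the number of repetitions grows.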
Other random variables
• In addition to discrete and continuous random variables, there are so-called mixed random variables
– containing some discrete as well as continuous portions
• Example:
– The customer waiting time W in an M/M/1 queue has an atom at zero (P{W = 0} = 1 − ρ > 0) but otherwise the distribution is continuous
[Figure: cdf FW(x) plotted against x, jumping from 0 to 1 − ρ at x = 0 and then rising continuously towards 1]
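To make the atom at zero concrete, the sketch below evaluates a cdf of this mixed type. The closed form FW(x) = 1 − ρe^(−µ(1−ρ)x) is the usual textbook expression for the M/M/1 FIFO waiting time with service rate µ; it is not given in these slides and is used here as an assumption, as are the parameter values.

```python
from math import exp

def waiting_time_cdf(x, rho, mu):
    """F_W(x) = P{W ≤ x}; assumes the textbook M/M/1 FIFO form (not from the slides)."""
    if x < 0:
        return 0.0
    return 1 - rho * exp(-mu * (1 - rho) * x)

rho, mu = 0.6, 1.0   # hypothetical load and service rate
print(waiting_time_cdf(-1e-9, rho, mu))   # 0.0 just below zero
print(waiting_time_cdf(0.0, rho, mu))     # jumps to 1 − ρ at x = 0 (the atom)
print(waiting_time_cdf(50.0, rho, mu))    # approaches 1
```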