Common Families of Distributions
Study Guide
By Yuanhao Jiang
1 Introduction
The main purpose of this chapter is to survey the common families of distributions, derive their means and variances, and note several of their most useful applications.
2 Discrete Distributions
A distribution is discrete when the sample space of the random variable X is countable; in most cases, X takes integer values.
Discrete Uniform Distribution
Definition: A random variable X has a discrete uniform(1, N) distribution if

P(X = x | N) = 1/N,  x = 1, 2, …, N,

where N is a specified integer.
Mean:

EX = (N + 1)/2

Variance:

Var X = (N + 1)(N − 1)/12
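As a quick sanity check, here is a short Python sketch (NumPy assumed available; the variable names are mine) that verifies the mean and variance formulas by summing directly over the pmf:

```python
import numpy as np

# Check EX = (N+1)/2 and Var X = (N+1)(N-1)/12 for the
# discrete uniform(1, N) distribution by direct summation over the pmf.
N = 10
x = np.arange(1, N + 1)
p = np.full(N, 1.0 / N)          # P(X = x | N) = 1/N for each outcome

mean = np.sum(x * p)
var = np.sum((x - mean) ** 2 * p)

assert np.isclose(mean, (N + 1) / 2)
assert np.isclose(var, (N + 1) * (N - 1) / 12)
```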
Hypergeometric Distribution
Definition: A random variable X has a hypergeometric(N, M, K) distribution if

P(X = x | N, M, K) = C(M, x) C(N−M, K−x) / C(N, K),  x = 0, 1, …, K,

where C(n, k) denotes the binomial coefficient "n choose k".
Mean:

EX = KM/N

Variance:

Var X = (KM/N) · (N−M)(N−K) / (N(N−1))
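These formulas can be checked against a library implementation; a minimal sketch, assuming scipy is available (note scipy orders the parameters differently than the notation above):

```python
from scipy.stats import hypergeom

# Check EX = KM/N and the variance formula against scipy. scipy's
# hypergeom(M, n, N) uses M = population size, n = successes in the
# population, N = sample size, i.e. (N, M, K) in the notation above.
N, M, K = 50, 20, 10
rv = hypergeom(N, M, K)

assert abs(rv.mean() - K * M / N) < 1e-12
assert abs(rv.var() - (K * M / N) * ((N - M) * (N - K)) / (N * (N - 1))) < 1e-12
```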
Binomial Distribution
Before we look into Binomial Distribution, we should consider the Bernoulli Distribution first.
Definition: A random variable X has a Bernoulli(p) distribution if

X = 1 with probability p, and X = 0 with probability 1 − p,  0 ≤ p ≤ 1.
Mean:
EX= p
Variance:
Var X= p (1− p)
The binomial distribution is built from the Bernoulli distribution, so now let's take a look at it.
Definition: A random variable Y has a binomial(n, p) distribution if

P(Y = y | n, p) = C(n, y) p^y (1 − p)^(n−y),  y = 0, 1, 2, …, n.
Mean:

EY = np

Variance:

Var Y = np(1 − p)

MGF:

M_Y(t) = [p e^t + (1 − p)]^n
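As a check, we can recover the mean and variance by summing directly over the binomial pmf; a short sketch (standard library only, variable names mine):

```python
from math import comb

# Check EY = np and Var Y = np(1-p) by summing directly over the
# binomial(n, p) pmf P(Y = y) = C(n, y) p^y (1-p)^(n-y).
n, p = 12, 0.3
pmf = [comb(n, y) * p**y * (1 - p)**(n - y) for y in range(n + 1)]

mean = sum(y * q for y, q in zip(range(n + 1), pmf))
var = sum((y - mean) ** 2 * q for y, q in zip(range(n + 1), pmf))

assert abs(sum(pmf) - 1) < 1e-12     # the pmf sums to 1
assert abs(mean - n * p) < 1e-12
assert abs(var - n * p * (1 - p)) < 1e-12
```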
Poisson Distribution
Definition: A random variable X has a Poisson(λ) distribution if

P(X = x | λ) = e^(−λ) λ^x / x!,  x = 0, 1, …
Mean:

EX = Σ_{x=0}^∞ x e^(−λ) λ^x / x!
   = Σ_{x=1}^∞ x e^(−λ) λ^x / x!          (the x = 0 term is 0)
   = λ e^(−λ) Σ_{x=1}^∞ λ^(x−1) / (x−1)!
   = λ e^(−λ) Σ_{y=0}^∞ λ^y / y!          (substituting y = x − 1)
   = λ

Variance: A similar calculation to the mean gives

Var X = λ

MGF:

M_X(t) = e^(λ(e^t − 1))
Special relationship between the Poisson and binomial distributions: when n is very large and p is small, the binomial(n, p) distribution is well approximated by the Poisson(np) distribution.
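We can see this approximation numerically; a minimal sketch (standard library only; n, p, and the tolerance are my own illustrative choices):

```python
from math import comb, exp, factorial

# Compare binomial(n, p) with n large and p small against Poisson(np):
# the two pmfs agree pointwise to within a small tolerance.
n, p = 1000, 0.005
lam = n * p                        # λ = np = 5

for x in range(15):
    binom_pmf = comb(n, x) * p**x * (1 - p)**(n - x)
    pois_pmf = exp(-lam) * lam**x / factorial(x)
    assert abs(binom_pmf - pois_pmf) < 2e-3
```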
Negative Binomial Distribution
Definition: A random variable X has a negative binomial(r, p) distribution if

P(X = x | r, p) = C(x−1, r−1) p^r (1 − p)^(x−r),  x = r, r + 1, …

An equivalent form, in terms of the number of failures Y = X − r, is

P(Y = y) = C(r+y−1, y) p^r (1 − p)^y,  y = 0, 1, …
Mean:

EY = r(1 − p)/p

Variance:

Var Y = r(1 − p)/p²
Special relationship between the negative binomial and Poisson distributions: if r → ∞ and p → 1 such that r(1 − p) → λ, 0 < λ < ∞, then

EY = r(1 − p)/p → λ,
Var Y = r(1 − p)/p² → λ,

which match the mean and variance of a Poisson(λ) random variable.
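The convergence holds for the pmf itself, not just the moments; a short numerical sketch (standard library only; λ, r, and the tolerance are illustrative choices of mine):

```python
from math import comb, exp, factorial

# As r → ∞ and p → 1 with r(1-p) = λ held fixed, the negative binomial
# pmf (failure-count form) approaches the Poisson(λ) pmf.
lam, r = 3.0, 1000
p = 1 - lam / r                     # chosen so that r(1 - p) = λ

for y in range(10):
    nb = comb(r + y - 1, y) * p**r * (1 - p)**y
    pois = exp(-lam) * lam**y / factorial(y)
    assert abs(nb - pois) < 2e-3
```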
Geometric Distribution
Definition: The geometric distribution is the simplest of the waiting-time distributions and is a special case of the negative binomial distribution. A random variable X has a geometric distribution if

P(X = x | p) = p(1 − p)^(x−1),  x = 1, 2, …

This is the negative binomial pmf P(X = x | r, p) = C(x−1, r−1) p^r (1 − p)^(x−r) with r = 1.
Mean:

EX = EY + 1 = 1/p

Variance:

Var X = (1 − p)/p²
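A quick check of these formulas by truncated summation over the geometric pmf (a sketch, standard library only; the truncation point is my choice and makes the tail negligible):

```python
# Check EX = 1/p and Var X = (1-p)/p^2 by summing the geometric pmf
# P(X = x | p) = p(1-p)^(x-1) over a long truncated range.
p = 0.25
xs = range(1, 400)                      # (1-p)^399 is negligible here
pmf = [p * (1 - p) ** (x - 1) for x in xs]

mean = sum(x * q for x, q in zip(xs, pmf))
var = sum((x - mean) ** 2 * q for x, q in zip(xs, pmf))

assert abs(mean - 1 / p) < 1e-9
assert abs(var - (1 - p) / p**2) < 1e-6
```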
3 Continuous Distributions
In this section we discuss some well-known continuous distributions.
Uniform Distribution
Definition: The continuous uniform(a, b) distribution is defined by spreading mass uniformly over an interval [a, b]. Its pdf is given by

f(x | a, b) = 1/(b − a) if x ∈ [a, b], and 0 otherwise.
Mean:

EX = (b + a)/2

Variance:

Var X = (b − a)²/12
Gamma Distribution
Definition: The gamma(α, β) distribution is defined over the interval [0, +∞). Its pdf is given by

f(x | α, β) = [1 / (Γ(α) β^α)] x^(α−1) e^(−x/β),  0 < x < +∞,  α > 0,  β > 0.
Mean:

EX = αβ

Variance:

Var X = αβ²

MGF:

M_X(t) = (1/(1 − βt))^α,  t < 1/β
Special cases of the gamma distribution
When α = p/2, where p is an integer, and β = 2, the gamma pdf becomes

f(x | p) = [1 / (Γ(p/2) 2^(p/2))] x^(p/2 − 1) e^(−x/2),  0 < x < +∞,

which is the χ² pdf with p degrees of freedom.
When α = 1, the gamma pdf becomes

f(x | β) = (1/β) e^(−x/β),  0 < x < +∞,

which is the exponential pdf with scale parameter β.
When X ~ exponential(β), the variable Y = X^(1/γ) has a Weibull(γ, β) distribution. Its pdf is given by

f_Y(y | γ, β) = (γ/β) y^(γ−1) e^(−y^γ/β),  0 < y < +∞,  γ > 0,  β > 0.

The Weibull distribution is important for analyzing failure-time data and very useful for modeling hazard functions.
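The two gamma special cases above can be verified against library implementations; a minimal sketch, assuming scipy is available (scipy's gamma takes shape a = α and scale = β):

```python
import numpy as np
from scipy.stats import gamma, chi2, expon

# Check the special cases: gamma(p/2, 2) is chi-squared with p degrees
# of freedom, and gamma(1, β) is exponential with scale β.
x = np.linspace(0.1, 20, 50)

p = 5
assert np.allclose(gamma(a=p / 2, scale=2).pdf(x), chi2(p).pdf(x))

beta = 1.7
assert np.allclose(gamma(a=1, scale=beta).pdf(x), expon(scale=beta).pdf(x))
```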
Normal Distribution (Gaussian Distribution)
Definition: The normal(μ, σ²) distribution is defined over the interval (−∞, +∞). Its pdf is given by

f(x | μ, σ²) = [1 / (√(2π) σ)] e^(−(x−μ)²/(2σ²)),  −∞ < x < +∞.
Mean:
EX=μ
Variance:
2
Var X=σ
Beta Distribution
Definition: The beta(α, β) distribution is defined over the interval (0, 1). Its pdf is given by

f(x | α, β) = [1 / B(α, β)] x^(α−1) (1 − x)^(β−1),  0 < x < 1.
Mean:

EX = α/(α + β)

Variance:

Var X = αβ / [(α + β)²(α + β + 1)]
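These moment formulas can be checked against scipy (a sketch assuming scipy is available; the parameter values are my own):

```python
from scipy.stats import beta as beta_dist

# Check the beta(α, β) mean and variance formulas against scipy.
a, b = 2.5, 4.0
rv = beta_dist(a, b)

assert abs(rv.mean() - a / (a + b)) < 1e-12
assert abs(rv.var() - a * b / ((a + b) ** 2 * (a + b + 1))) < 1e-12
```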
Cauchy Distribution
Definition: The Cauchy(θ) distribution is defined over the interval (−∞, +∞). Its pdf is given by

f(x | θ) = (1/π) · 1/(1 + (x − θ)²),  −∞ < x < +∞,  −∞ < θ < +∞.

Mean: E|X| = ∞, so the mean does not exist.
Variance: Does not exist.
Lognormal Distribution
Definition: X has a lognormal(μ, σ²) distribution if log X ~ n(μ, σ²); it is defined over the interval (0, +∞). Its pdf is given by

f(x | μ, σ²) = [1 / (√(2π) σ)] (1/x) e^(−(log x − μ)²/(2σ²)),  0 < x < +∞,  −∞ < μ < +∞,  σ > 0.
Mean:

EX = e^(μ + σ²/2)

Variance:

Var X = e^(2(μ+σ²)) − e^(2μ+σ²)
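These can be checked against scipy's lognormal (a sketch assuming scipy is available; note scipy's lognorm takes s = σ and scale = e^μ):

```python
import numpy as np
from scipy.stats import lognorm

# Check EX = exp(μ + σ²/2) and Var X = exp(2(μ+σ²)) − exp(2μ+σ²)
# against scipy's lognorm(s=σ, scale=exp(μ)) parameterization.
mu, sigma = 0.5, 0.8
rv = lognorm(s=sigma, scale=np.exp(mu))

assert np.isclose(rv.mean(), np.exp(mu + sigma**2 / 2))
assert np.isclose(rv.var(), np.exp(2 * (mu + sigma**2)) - np.exp(2 * mu + sigma**2))
```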
Double Exponential Distribution
Definition: The double exponential(μ, σ) distribution is defined over the interval (−∞, +∞). Its pdf is given by

f(x | μ, σ) = [1/(2σ)] e^(−|x−μ|/σ),  −∞ < x < +∞,  −∞ < μ < +∞,  σ > 0.

Mean:

EX = μ

Variance:

Var X = 2σ²
4 Exponential Families
Definition 4.1: A family of pdfs or pmfs is called an exponential family if it can be expressed as

f(x | θ) = h(x) c(θ) exp( Σ_{i=1}^k w_i(θ) t_i(x) ).
Exponential families include the continuous normal, gamma, and beta families, and the discrete binomial, Poisson, and negative binomial families.
To verify whether a family of pdfs or pmfs is an exponential family, we must identify the functions h(x), c(θ), w_i(θ), and t_i(x) and show that the family has the form above.
Theorem 4.2 If X is a random variable with pdf or pmf of the above form, then

E( Σ_{i=1}^k (∂w_i(θ)/∂θ_j) t_i(X) ) = −(∂/∂θ_j) log c(θ);

Var( Σ_{i=1}^k (∂w_i(θ)/∂θ_j) t_i(X) ) = −(∂²/∂θ_j²) log c(θ) − E( Σ_{i=1}^k (∂²w_i(θ)/∂θ_j²) t_i(X) ).
Definition 4.3 The indicator function of a set A, most often denoted by I_A(x), is the function

I_A(x) = 1 if x ∈ A, and 0 if x ∉ A.
We can re-parameterize an exponential family as

f(x | η) = h(x) c*(η) exp( Σ_{i=1}^k η_i t_i(x) ).

The set H = {η = (η_1, …, η_k) : ∫_{−∞}^{+∞} h(x) exp( Σ_{i=1}^k η_i t_i(x) ) dx < ∞} is called the natural parameter space for the family. For values of η ∈ H, to ensure that the pdf integrates to 1, we must have

c*(η) = [ ∫_{−∞}^{+∞} h(x) exp( Σ_{i=1}^k η_i t_i(x) ) dx ]^(−1).
Definition 4.4 A curved exponential family is a family of densities of the above form for which the dimension of the vector θ is equal to d, where d < k. If d = k, the family is a full exponential family.
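As a concrete illustration of the verification step, the binomial(n, p) pmf factors into exponential family form with h(x) = C(n, x), c(p) = (1−p)^n, w(p) = log(p/(1−p)), and t(x) = x. A short numerical sketch (standard library only):

```python
from math import comb, exp, log

# Verify numerically that the binomial(n, p) pmf equals
# h(x) c(p) exp(w(p) t(x)) with h(x) = C(n, x), c(p) = (1-p)^n,
# w(p) = log(p/(1-p)), and t(x) = x.
n, p = 8, 0.35
for x in range(n + 1):
    pmf = comb(n, x) * p**x * (1 - p)**(n - x)
    ef = comb(n, x) * (1 - p)**n * exp(x * log(p / (1 - p)))
    assert abs(pmf - ef) < 1e-12
```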
5 Location and Scale Families
In this section we will talk about 3 approaches to constructing families of distributions. The 3
types of families are called location families, scale families, and location-scale families. Each of
them is constructed by specifying a single pdf f (x), called the standard pdf for the family. Then
all other pdfs in the family are generated by transforming the standard pdf in a prescribed way.
We start with a simple theorem about pdfs.
Theorem 5.1 Let f(x) be any pdf and let μ and σ > 0 be any given constants. Then the function

g(x | μ, σ) = (1/σ) f((x − μ)/σ)

is a pdf.
Definition 5.2 Assume f (x) to be any pdf. Then the family of pdfs f ( x−μ ), indexed by the
parameter μ, −∞ < μ<+ ∞, is called the location family with standard pdf f (x) and μ is called
the location parameter for the family.
Definition 5.3 Assume f(x) to be any pdf. Then for any σ > 0, the family of pdfs (1/σ) f(x/σ), indexed by the parameter σ, is called the scale family with standard pdf f(x), and σ is called the scale parameter of the family.
Definition 5.4 Assume f(x) to be any pdf. Then for any μ, −∞ < μ < +∞, and any σ > 0, the family of pdfs (1/σ) f((x − μ)/σ), indexed by the parameter (μ, σ), is called the location-scale family with standard pdf f(x); μ is called the location parameter and σ is called the scale parameter.
Theorem 5.5 Let f(·) be any pdf. Let μ be any real number, and let σ be any positive real number. Then X is a random variable with pdf (1/σ) f((x − μ)/σ) if and only if there exists a random variable Z with pdf f(z) such that X = σZ + μ.
Theorem 5.6 Let Z be a random variable with pdf f(z). Suppose EZ and Var Z exist. If X is a random variable with pdf (1/σ) f((x − μ)/σ), then

EX = σ EZ + μ  and  Var X = σ² Var Z.

In particular, if EZ = 0 and Var Z = 1, then EX = μ and Var X = σ².
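This construction is exactly what scipy's loc/scale arguments implement: every continuous distribution is the standard pdf shifted by loc = μ and stretched by scale = σ. A minimal sketch (scipy assumed available) checking Theorem 5.6 for the normal family:

```python
from scipy.stats import norm

# scipy's loc/scale arguments realize the location-scale family:
# EX = σ·EZ + μ and Var X = σ²·Var Z, where Z has the standard pdf.
mu, sigma = 2.0, 3.0
Z = norm()                       # standard pdf: EZ = 0, Var Z = 1
X = norm(loc=mu, scale=sigma)

assert abs(X.mean() - (sigma * Z.mean() + mu)) < 1e-12
assert abs(X.var() - sigma**2 * Z.var()) < 1e-12
```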
6 Inequalities and Identities
Theorem 6.1 Let X be a random variable and let g(x) be a nonnegative function. Then for any r > 0,

P(g(X) ≥ r) ≤ E g(X) / r.
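A tiny worked check of this inequality (a sketch with an example of my own choosing: X a fair die roll and g(x) = x²):

```python
# Check P(g(X) ≥ r) ≤ E g(X)/r for X uniform on {1,...,6} and g(x) = x².
outcomes = range(1, 7)
g = lambda x: x ** 2
r = 16.0

Eg = sum(g(x) / 6 for x in outcomes)                 # E g(X) = 91/6
prob = sum(1 / 6 for x in outcomes if g(x) >= r)     # P(X² ≥ 16) = 1/2

assert prob <= Eg / r
```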
Theorem 6.2 Let X_{α,β} denote a gamma(α, β) random variable with pdf f(x | α, β), where α > 1. Then for any constants a and b,

P(a < X_{α,β} < b) = β( f(a | α, β) − f(b | α, β) ) + P(a < X_{α−1,β} < b).
Lemma 6.3 (Stein's Lemma) Let X ~ n(θ, σ²), and let g be a differentiable function satisfying E|g′(X)| < ∞. Then

E[g(X)(X − θ)] = σ² E g′(X).
Theorem 6.4 Let χ²_p denote a chi-squared random variable with p degrees of freedom. For any function h(x),

E h(χ²_p) = p E( h(χ²_{p+2}) / χ²_{p+2} ).
Theorem 6.5 (Hwang) Let g(x) be a function with −∞ < E g(X) < ∞ and −∞ < g(−1) < ∞. Then:
a. If X ~ Poisson(λ),

E(λ g(X)) = E(X g(X − 1)).

b. If X ~ negative binomial(r, p),

E((1 − p) g(X)) = E( (X / (r + X − 1)) g(X − 1) ).
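Part (a) can be verified by a truncated sum over the Poisson pmf; a short sketch (standard library only; the choice g(x) = x² and the truncation point are mine):

```python
from math import exp, factorial

# Numerically verify E(λ·g(X)) = E(X·g(X−1)) for X ~ Poisson(λ),
# here with g(x) = x², using a truncated sum over the pmf.
lam = 2.5
g = lambda x: x ** 2
pmf = lambda x: exp(-lam) * lam**x / factorial(x)

lhs = sum(lam * g(x) * pmf(x) for x in range(60))
rhs = sum(x * g(x - 1) * pmf(x) for x in range(60))

assert abs(lhs - rhs) < 1e-9
```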