Chapter 2
Random Variables
Continuous and Discrete Probability Distributions
Random Variables
Let $(\Omega, \mathcal{F}, P)$ be a probability model for an experiment, and let $X$ be a function that maps every outcome $\xi \in \Omega$ to a unique point $x \in \mathbb{R}$, the set of real numbers. Since the outcome $\xi$ is not certain, neither is the value $X(\xi) = x$. Thus, if $B$ is some subset of $\mathbb{R}$, we may want to determine the probability of the event "$X(\xi) \in B$". To determine this probability, we can look at the set $A = X^{-1}(B) \subset \Omega$ that contains all $\xi \in \Omega$ that map into $B$ under the function $X$.
[Figure: the set $A = X^{-1}(B) \subset \Omega$ is mapped by $X(\xi)$ onto the subset $B$ of the real line $\mathbb{R}$.]
Random Variables
• If $A = X^{-1}(B)$ belongs to the associated field $\mathcal{F}$, then
      Probability of the event "$X(\xi) \in B$" $= P(X^{-1}(B)) = P(A)$.
Random Variable
• Definition
   – A random variable is a finite, single-valued function $X(\cdot)$ that maps the set of all experimental outcomes $\Omega$ into the set of real numbers $\mathbb{R}$,
   – such that $\{\xi \mid X(\xi) \le x\}$ is an event ($\in \mathcal{F}$) for every $x$ in $\mathbb{R}$.
Random Variable
• The specification of a random variable can also imply a redefinition of the sample space.
Example
• Experiment: a fair coin is tossed 3 times.
• We could define as a random variable "the total number of heads," as in the sketch below.
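• A minimal Python sketch (added for illustration; not part of the original slides): enumerating the 8-point sample space of three fair tosses and the probabilities of the redefined outcomes $X \in \{0, 1, 2, 3\}$.

    # Enumerate the sample space and the r.v. X = "total number of heads".
    from itertools import product

    outcomes = list(product("HT", repeat=3))        # 8 equally likely outcomes
    X = {w: w.count("H") for w in outcomes}         # X(w) = number of heads in w

    pmf = {}
    for w, x in X.items():
        pmf[x] = pmf.get(x, 0) + 1 / len(outcomes)  # each outcome has probability 1/8

    print(pmf)   # {3: 0.125, 2: 0.375, 1: 0.375, 0: 0.125}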
Notation
• Notation:
   – RV will always be denoted with uppercase letters
   – The realized values of the variable (its range) will be denoted by
     the corresponding lowercase letter.
   – Thus, the random variable X can take the value x.
Borel collection
• If $X$ is a r.v., then $X^{-1}(B) \in \mathcal{F}$.
• Here $B$ represents semi-infinite intervals of the form $\{-\infty < x \le a\}$.
• The Borel collection $\mathcal{B}$ of such subsets of $\mathbb{R}$ is the smallest $\sigma$-field of subsets of $\mathbb{R}$ that includes all semi-infinite intervals of the above form.
• Thus, if $X$ is a r.v., then
      $\{\xi \mid X(\xi) \le x\} = X^{-1}\big((-\infty, x]\big)$
  is an event for every $x$.
Probability Distribution Function (PDF)
(Cumulative Distribution Function)
      $P\{\xi \mid X(\xi) \le x\} = F_X(x) \ge 0.$
• $F_X(x)$ is said to be the Probability Distribution Function associated with the r.v. $X$.
• The role of the subscript $X$ is only to identify the actual r.v.
Properties of a PDF
• If $g(x)$ is a distribution function, then
    – $g(+\infty) = 1$, $g(-\infty) = 0$,
    – if $x_1 < x_2$, then $g(x_1) \le g(x_2)$,
    – $g(x^+) = g(x)$ for all $x$.
Properties of a PDF - Proof
• The probability distribution function is:
    – nonnegative,
    – monotone nondecreasing.
• Proof - We have
      $F_X(+\infty) = P\{\xi \mid X(\xi) \le +\infty\} = P(\Omega) = 1,$
      $F_X(-\infty) = P\{\xi \mid X(\xi) \le -\infty\} = P(\emptyset) = 0.$
• If $x_1 < x_2$, then $(-\infty, x_1] \subset (-\infty, x_2]$.
• $X(\xi) \le x_1$ implies $X(\xi) \le x_2$. So
      $\{\xi \mid X(\xi) \le x_1\} \subset \{\xi \mid X(\xi) \le x_2\},$
      $F_X(x_1) = P(X(\xi) \le x_1) \le P(X(\xi) \le x_2) = F_X(x_2).$
Properties of a PDF - Proof
• Let $x < x_n < x_{n-1} < \cdots < x_2 < x_1$.
• Consider the events $A_k = \{\xi \mid x < X(\xi) \le x_k\}$.
• Since $\{x < X(\xi) \le x_k\} \cup \{X(\xi) \le x\} = \{X(\xi) \le x_k\}$ (a union of mutually exclusive events),
• we get
      $P(A_k) = P(x < X(\xi) \le x_k) = F_X(x_k) - F_X(x).$
• But $A_{k+1} \subset A_k \subset A_{k-1} \subset \cdots$, and hence
      $\lim_{k \to \infty} A_k = \bigcap_{k=1}^{\infty} A_k = \emptyset \;\Rightarrow\; \lim_{k \to \infty} P(A_k) = 0.$
Properties of a PDF - Proof
• Thus
      $\lim_{k \to \infty} P(A_k) = \lim_{k \to \infty} \big[ F_X(x_k) - F_X(x) \big] = 0.$
• But $\lim_{k \to \infty} x_k = x^+$, the right limit of $x$, and hence $F_X(x^+) = F_X(x)$,
• i.e., $F_X(x)$ is right-continuous, justifying all properties of a distribution function.
Additional Properties
• If $F_X(x_0) = 0$ for some $x_0$, then $F_X(x) = 0$ for all $x \le x_0$.
   This follows, since $F_X(x_0) = P(X(\xi) \le x_0) = 0$ implies that $\{X(\xi) \le x_0\}$ is the null set, and for any $x \le x_0$, $\{X(\xi) \le x\}$ will be a subset of the null set.
• $P\{X(\xi) > x\} = 1 - F_X(x)$,
   since $\{X(\xi) \le x\} \cup \{X(\xi) > x\} = \Omega$ and the two events are mutually exclusive.
• $P\{x_1 < X(\xi) \le x_2\} = F_X(x_2) - F_X(x_1)$, for $x_2 > x_1$.
   The events $\{X(\xi) \le x_1\}$ and $\{x_1 < X(\xi) \le x_2\}$ are mutually exclusive and their union represents the event $\{X(\xi) \le x_2\}$.
Additional Properties
• $P(X(\xi) = x) = F_X(x) - F_X(x^-)$.
• Let $x_1 = x - \varepsilon$, $\varepsilon > 0$, and $x_2 = x$. We have
      $\lim_{\varepsilon \to 0} P\{x - \varepsilon < X(\xi) \le x\} = F_X(x) - \lim_{\varepsilon \to 0} F_X(x - \varepsilon),$
  or $P\{X(\xi) = x\} = F_X(x) - F_X(x^-)$.
• Thus the only discontinuities of a distribution function $F_X(x)$ occur at points $x_0$ where $P\{X(\xi) = x_0\} = F_X(x_0) - F_X(x_0^-) \ne 0$.
Continuous-type & Discrete-type R.V
• $X$ is said to be a continuous-type r.v. if its distribution function is continuous, i.e., $F_X(x^-) = F_X(x)$ for all $x$. In that case
      $P\{X(\xi) = x_0\} = F_X(x_0) - F_X(x_0^-) = 0,$
  i.e., $P\{X = x\} = 0$ for every $x$.
• If $F_X(x)$ is constant except for a finite number of jump discontinuities (piecewise constant; step-type), then $X$ is said to be a discrete-type r.v.
• If $x_i$ is such a discontinuity point, then
      $p_i = P\{X = x_i\} = F_X(x_i) - F_X(x_i^-).$
Example
• $X$ is a r.v. such that $X(\xi) = c$ for all $\xi \in \Omega$. Find $F_X(x)$.
Solution
• For $x < c$, $\{X(\xi) \le x\} = \emptyset$, so that $F_X(x) = 0$.
• For $x \ge c$, $\{X(\xi) \le x\} = \Omega$, so that $F_X(x) = 1$.
[Figure: $F_X(x)$ is a unit step at $x = c$.]
• At the point of discontinuity we get
      $P\{X = c\} = F_X(c) - F_X(c^-) = 1 - 0 = 1.$
Example
• In tossing a coin, $\Omega = \{H, T\}$. Suppose the r.v. $X$ is such that $X(T) = 0$, $X(H) = 1$, with $P\{H\} = p$ and $P\{T\} = q = 1 - p$. Find $F_X(x)$.
Solution
• For $x < 0$, $\{X(\xi) \le x\} = \emptyset$, so that $F_X(x) = 0$.
• For $0 \le x < 1$, $\{X(\xi) \le x\} = \{T\}$, so that $F_X(x) = P\{T\} = 1 - p = q$.
• For $x \ge 1$, $\{X(\xi) \le x\} = \{H, T\} = \Omega$, so that $F_X(x) = 1$.
[Figure: $F_X(x)$ steps from 0 to $q$ at $x = 0$ and from $q$ to 1 at $x = 1$.]
• At the point of discontinuity we get
      $P\{X = 0\} = F_X(0) - F_X(0^-) = q - 0 = q.$
Example
• A fair coin is tossed twice, and the r.v. $X$ represents the number of heads. Find $F_X(x)$.
Solution
• $X(HH) = 2$, $X(HT) = 1$, $X(TH) = 1$, $X(TT) = 0$.
• For $x < 0$:        $\{X(\xi) \le x\} = \emptyset \Rightarrow F_X(x) = 0$.
• For $0 \le x < 1$:  $\{X(\xi) \le x\} = \{TT\} \Rightarrow F_X(x) = P\{TT\} = P(T)P(T) = \tfrac{1}{4}$.
• For $1 \le x < 2$:  $\{X(\xi) \le x\} = \{TT, HT, TH\} \Rightarrow F_X(x) = P\{TT, HT, TH\} = \tfrac{3}{4}$.
• For $x \ge 2$:      $\{X(\xi) \le x\} = \Omega \Rightarrow F_X(x) = 1$.
[Figure: $F_X(x)$ is a staircase with jumps of 1/4, 1/2, and 1/4 at $x = 0, 1, 2$.]
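• A minimal scipy.stats sketch (added for illustration; not part of the original slides): evaluating the staircase CDF of this two-toss example with the generic discrete r.v. constructor.

    # Build the discrete r.v. with P(X=0)=1/4, P(X=1)=1/2, P(X=2)=1/4.
    from scipy.stats import rv_discrete

    X = rv_discrete(values=([0, 1, 2], [0.25, 0.5, 0.25]))

    print(X.cdf([-0.5, 0.0, 0.5, 1.0, 1.5, 2.0, 3.0]))
    # [0.   0.25 0.25 0.75 0.75 1.   1.  ]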
Probability Density Function (p.d.f)
(Probability Mass Function - for discrete RVs)
      $f_X(x) = \dfrac{dF_X(x)}{dx}$
• Since
      $\dfrac{dF_X(x)}{dx} = \lim_{\Delta x \to 0} \dfrac{F_X(x + \Delta x) - F_X(x)}{\Delta x} \ge 0$
  from the monotone-nondecreasing nature of $F_X(x)$, we have
      $f_X(x) \ge 0$ for all $x$.
Probability Density Function (p.d.f)
• If $X$ is a continuous-type r.v., $f_X(x)$ will be a continuous function.
• If $X$ is a discrete-type r.v., $f_X(x)$ has the general form
      $f_X(x) = \sum_i p_i \, \delta(x - x_i),$
  where the $x_i$ represent the jump-discontinuity points of $F_X(x)$.
[Fig. 3.5: a train of impulses of weight $p_i$ located at the points $x_i$.]
• $f_X(x)$ then represents a collection of positive discrete masses.
• It is known as the probability mass function (p.m.f.) in the discrete case.
Probability Density Function (p.d.f)
• We have
      $F_X(x) = \int_{-\infty}^{x} f_X(u)\, du.$
• Since $F_X(+\infty) = 1$, we can say
      $\int_{-\infty}^{+\infty} f_X(x)\, dx = 1,$
      $P\{x_1 < X(\xi) \le x_2\} = F_X(x_2) - F_X(x_1) = \int_{x_1}^{x_2} f_X(x)\, dx.$
• Thus the area under $f_X(x)$ in the interval $(x_1, x_2)$ represents the probability $P\{x_1 < X(\xi) \le x_2\}$ (a numerical check follows below).
[Figure: (a) $F_X(x)$ with the increment $F_X(x_2) - F_X(x_1)$; (b) $f_X(x)$ with the shaded area between $x_1$ and $x_2$.]
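• A minimal numerical check (added for illustration; the density used is an assumed example, not from the slides): for $f_X(x) = 2x$ on $(0, 1)$, so that $F_X(x) = x^2$, the area under $f_X$ between $x_1$ and $x_2$ equals $F_X(x_2) - F_X(x_1)$.

    # Compare the integral of the density with the CDF increment.
    from scipy.integrate import quad

    f = lambda x: 2 * x      # assumed example density on (0, 1)
    F = lambda x: x ** 2     # its distribution function

    x1, x2 = 0.3, 0.8
    area, _ = quad(f, x1, x2)          # numerical area under f_X on (x1, x2)

    print(area, F(x2) - F(x1))         # both are 0.55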
Continuous-type Random Variables
X can take any value in the real line, within a bounded or unbounded interval.
•   Normal (Gaussian)
•   Uniform
•   Exponential
•   Gamma
•   Beta
•   Chi-Square
•   Rayleigh
•   Nakagami-m
•   Cauchy
•   Laplace
•   Student's t (with n degrees of freedom)
•   Fisher's F
•   Others: Weibull, Erlang, Lognormal, …
Normal Distribution
• Notation: $X \sim N(\mu, \sigma^2)$, where $\mu$ is the mean and $\sigma^2$ the variance.
      $f_X(x) = \dfrac{1}{\sqrt{2\pi\sigma^2}} \, e^{-(x-\mu)^2 / 2\sigma^2}.$
      $F_X(x) = \displaystyle\int_{-\infty}^{x} \dfrac{1}{\sqrt{2\pi\sigma^2}} \, e^{-(y-\mu)^2 / 2\sigma^2} \, dy = G\!\left(\dfrac{x-\mu}{\sigma}\right),$
  where
      $G(x) = \displaystyle\int_{-\infty}^{x} \dfrac{1}{\sqrt{2\pi}} \, e^{-y^2/2} \, dy.$
[Figure: bell-shaped p.d.f $f_X(x)$ centered at $x = \mu$.]
Normal Distribution
The normal distribution is symmetric about its mean.
[Figures: p.d.f and PDF of the normal distribution.]
Normal Table
• The standard normal distribution is tabulated in the normal table.
• It has a mean of 0 and a standard deviation of 1.
• In examples and exercises where $z$ is used, it is found using the formula
      $z = \dfrac{x - \mu}{\sigma}$, i.e., (value given − mean) / standard deviation.
Using the Normal Table
• The shaded area, A, gives the probability that Z is greater than the given value (a quick numerical check follows below).
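• A minimal scipy.stats sketch (added for illustration; the numbers are assumptions, not from the slides): converting a value to a z-score and reading the tail area $A = P(Z > z)$ without the printed table.

    # Assumed example: X ~ N(mu = 100, sigma = 15); find P(X > 120).
    from scipy.stats import norm

    mu, sigma = 100.0, 15.0
    x = 120.0

    z = (x - mu) / sigma            # (value given - mean) / standard deviation
    p_greater = 1 - norm.cdf(z)     # shaded tail area A = P(Z > z); same as norm.sf(z)

    print(z, p_greater)             # z = 1.33..., P(Z > z) is approximately 0.091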
Applications
• Approximately normal distributions occur in many situations.
• They arise when a large number of small effects act additively and independently.
• Example: measurement errors are often assumed to be normally distributed.
Uniform Distribution
• Notation: $X \sim U(a, b)$, $a < b$,
      $f_X(x) = \begin{cases} \dfrac{1}{b-a}, & a \le x \le b, \\[4pt] 0, & \text{otherwise.} \end{cases}$
[Figures: p.d.f and PDF of the uniform distribution.]
Application
• Sampling from arbitrary distributions (random number generation).
• The inverse transform sampling method applies the inverse of the cumulative distribution function (CDF) of the target random variable to uniform samples, as in the sketch below.
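• A minimal sketch of inverse transform sampling (added for illustration; the exponential target and its parameter are assumptions, chosen because its CDF inverts in closed form — see the next slide).

    # If U ~ U(0,1) and F is the target CDF, then X = F^{-1}(U) has distribution F.
    # For the exponential target F(x) = 1 - exp(-x/lam), F^{-1}(u) = -lam*ln(1 - u).
    import numpy as np

    rng = np.random.default_rng(0)
    lam = 2.0                                  # assumed mean of the target exponential
    u = rng.uniform(0.0, 1.0, size=100_000)    # uniform samples on [0, 1)
    x = -lam * np.log(1.0 - u)                 # inverse-CDF transform

    print(x.mean())                            # close to the target mean lam = 2.0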
Exponential Distribution
• Notation: $X \sim \varepsilon(\lambda)$,
      $f_X(x) = \begin{cases} \dfrac{1}{\lambda}\, e^{-x/\lambda}, & x \ge 0, \\[4pt] 0, & \text{otherwise.} \end{cases}$
[Figure: decaying exponential p.d.f $f_X(x)$.]
• Here the mean is $\overline{X} = \lambda$.
Exponential Distribution
[Figures: p.d.f and PDF of the exponential distribution.]
Exponential Distribution
• Assume the occurrences of events over nonoverlapping time intervals are independent, and define:
    – $q(t)$: the probability that in a time interval of length $t$ no event has occurred,
    – $x$: the waiting time to the first arrival.
    – Then we have $P(x > t) = q(t)$.
    – Let $t_1$ and $t_2$ be two consecutive nonoverlapping intervals.
Exponential Distribution
• Then we have $q(t_1)\, q(t_2) = q(t_1 + t_2)$.
• The only bounded solution of this functional equation is
      $q(t) = e^{-\lambda t},$
  and hence
      $F_X(t) = P(X \le t) = 1 - q(t) = 1 - e^{-\lambda t}.$
• So the p.d.f is exponential: if the occurrences of events over nonoverlapping intervals are independent, the corresponding p.d.f has to be exponential. (A numerical check follows below.)
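• A minimal check (added for illustration; the rate value is an assumption): $q(t) = e^{-\lambda t}$ satisfies the functional equation, and the resulting $F_X(t) = 1 - q(t)$ matches scipy's exponential CDF with scale $= 1/\lambda$.

    # Verify q(t1)q(t2) = q(t1+t2) and compare 1 - q(t) with scipy's expon CDF.
    import numpy as np
    from scipy.stats import expon

    rate = 0.5                     # assumed arrival rate (events per unit time)
    q = lambda t: np.exp(-rate * t)

    t1, t2 = 1.2, 3.4
    print(q(t1) * q(t2), q(t1 + t2))        # equal: the functional equation holds

    t = np.array([0.5, 1.0, 2.0, 5.0])
    print(1 - q(t))                         # F_X(t) = 1 - e^{-rate t}
    print(expon.cdf(t, scale=1 / rate))     # same values from scipy.stats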
Application
• Describing the lengths of the inter-arrival times in a homogeneous Poisson process:
    – e.g., the time until your next telephone call arrives.
• Situations where certain events occur with a constant probability per unit distance:
    – e.g., the distance between mutations on a DNA strand.
Gamma Distribution
• Notation: $X \sim G(\alpha, \beta)$, $(\alpha > 0, \ \beta > 0)$,
      $f_X(x) = \begin{cases} \dfrac{x^{\alpha-1}}{\Gamma(\alpha)\, \beta^{\alpha}} \, e^{-x/\beta}, & x \ge 0, \\[4pt] 0, & \text{otherwise,} \end{cases}$
  where
      $\Gamma(\alpha) = \displaystyle\int_0^{\infty} x^{\alpha-1} e^{-x} \, dx.$
[Figure: gamma p.d.f $f_X(x)$ for representative parameters.]
• For integer $\alpha = n$, $\Gamma(n) = (n-1)!$, as checked below.
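• A minimal check (added for illustration): $\Gamma(n) = (n-1)!$ for integer $n$, using scipy's gamma function and Python's factorial.

    # Compare Gamma(n) with (n-1)! for the first few integers.
    from math import factorial
    from scipy.special import gamma as gamma_fn

    for n in range(1, 6):
        print(n, gamma_fn(n), factorial(n - 1))   # Gamma(n) equals (n-1)!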
Gamma Distribution
[Figures: p.d.f and PDF of the gamma distribution.]
Beta Distribution
• Notation: $X \sim \beta(a, b)$, $(a > 0, \ b > 0)$,
      $f_X(x) = \begin{cases} \dfrac{1}{\beta(a, b)} \, x^{a-1} (1 - x)^{b-1}, & 0 < x < 1, \\[4pt] 0, & \text{otherwise,} \end{cases}$
  where
      $\beta(a, b) = \displaystyle\int_0^1 u^{a-1} (1 - u)^{b-1} \, du = \dfrac{\Gamma(a)\, \Gamma(b)}{\Gamma(a + b)}.$
[Figure: beta p.d.f $f_X(x)$ on $(0, 1)$.]
• The beta distribution with $a = b = 1$ is the uniform distribution on $(0, 1)$, as checked below.
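• A minimal check (added for illustration): $\beta(1, 1)$ has a constant density of 1 on $(0, 1)$, i.e., it coincides with the uniform distribution $U(0, 1)$.

    # Compare the Beta(1,1) density with the U(0,1) density at a few points.
    import numpy as np
    from scipy.stats import beta, uniform

    x = np.linspace(0.05, 0.95, 5)
    print(beta.pdf(x, 1, 1))        # [1. 1. 1. 1. 1.]
    print(uniform.pdf(x, 0, 1))     # [1. 1. 1. 1. 1.]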
Chi-Square Distribution
• Notation:
[Figure: p.d.f of the chi-square distribution.]
Applications
• Chi-square tests for goodness of fit.
[Figure: PDF of the chi-square distribution.]
Rayleigh Distribution
• Notation:
[Figures: p.d.f and PDF of the Rayleigh distribution.]
• When a two-dimensional vector (e.g., wind velocity) has elements that are
    • normally distributed,
    • uncorrelated, and
    • of equal variance,
  the vector's magnitude (e.g., wind speed) will then have a Rayleigh distribution, as in the simulation sketch below.
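• A minimal simulation sketch (added for illustration; sigma and the sample size are assumptions): the magnitude of a 2-D vector with independent $N(0, \sigma^2)$ components follows a Rayleigh distribution with scale $\sigma$.

    # Compare an empirical probability with the Rayleigh CDF.
    import numpy as np
    from scipy.stats import rayleigh

    rng = np.random.default_rng(1)
    sigma = 2.0
    vx = rng.normal(0.0, sigma, size=200_000)     # e.g., wind-velocity components
    vy = rng.normal(0.0, sigma, size=200_000)
    speed = np.hypot(vx, vy)                      # vector magnitude

    print(np.mean(speed <= 3.0))                  # empirical P(speed <= 3)
    print(rayleigh.cdf(3.0, scale=sigma))         # theoretical value, about 0.675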
Nakagami-m Distribution
[Figure: p.d.f of the Nakagami-m distribution.]
Applications
• Used to model attenuation of wireless signals traversing multiple paths.
[Figure: PDF of the Nakagami-m distribution.]
Cauchy Distribution
• Notation:
[Figure: p.d.f of the Cauchy distribution.]
• The ratio of two independent standard normal random variables is a standard Cauchy variable (see the simulation sketch below).
• It has no mean, variance, or higher moments defined.
[Figure: PDF of the Cauchy distribution.]
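• A minimal simulation sketch (added for illustration; the sample size is an assumption): the ratio of two independent standard normals is standard Cauchy, so for example $P(Z_1/Z_2 \le 1)$ should match cauchy.cdf(1) = 0.75.

    # Compare an empirical probability for the normal ratio with the Cauchy CDF.
    import numpy as np
    from scipy.stats import cauchy

    rng = np.random.default_rng(2)
    z1 = rng.standard_normal(500_000)
    z2 = rng.standard_normal(500_000)
    ratio = z1 / z2

    print(np.mean(ratio <= 1.0))     # close to 0.75
    print(cauchy.cdf(1.0))           # 0.75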
Laplace Distribution
• The difference between two i.i.d. exponential random variables follows a Laplace distribution.
[Figures: p.d.f and PDF of the Laplace distribution.]
Student's t-Distribution
• Arises in the problem of estimating the mean of a normally distributed population when the sample size is small.
[Figures: p.d.f and PDF of Student's t-distribution.]
Fisher's F-Distribution
• A random variate of the F-distribution arises as the ratio of two chi-squared variates, each divided by its degrees of freedom.
[Figures: p.d.f and PDF of the F-distribution.]
Discrete-type Random Variables
     X can take only a finite (or countably infinite) number of values
•   Bernoulli
•   Binomial
•   Poisson
•   Hypergeometric
•   Geometric
•   Negative Binomial
•   Discrete-Uniform
•   Polya’s distribution
Bernoulli Distribution
• $X$ takes the values 0 and 1, with
      $P(X = 0) = q, \qquad P(X = 1) = p, \qquad p + q = 1.$
Binomial Distribution
• Notation: $X \sim B(n, p)$,
      $P(X = k) = \binom{n}{k} p^k q^{n-k}, \qquad k = 0, 1, 2, \ldots, n.$
• This is the probability of k successes in n independent trials (drawing balls with replacement); a numerical check follows below.
[Figures: p.m.f and PDF of the binomial distribution.]
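• A minimal check (added for illustration; n, p, and k are assumptions): the binomial pmf from scipy matches the formula $\binom{n}{k} p^k q^{n-k}$.

    # Compare the closed-form binomial probability with scipy's pmf.
    from math import comb
    from scipy.stats import binom

    n, p, k = 10, 0.3, 4
    q = 1 - p

    print(comb(n, k) * p**k * q**(n - k))   # about 0.2001
    print(binom.pmf(k, n, p))               # same value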
Poisson Distribution
• Notation: $X \sim P(\lambda)$,
      $P(X = k) = e^{-\lambda}\, \dfrac{\lambda^k}{k!}, \qquad k = 0, 1, 2, \ldots, \infty.$
[Figures: p.m.f and PDF of the Poisson distribution.]
Poisson Distribution - Applications
• The Poisson distribution applies to various phenomena of a discrete nature.
• The average rate of occurrence of the phenomenon should be constant in time or space.
• Examples: the number of…
    – spelling mistakes one makes while typing a single page,
    – phone calls at a call center per minute,
    – times a web server is accessed per minute,
    – roadkill (animals killed) found per unit length of road,
    – pine trees per unit area of mixed forest,
    – stars in a given volume of space.
  (A numerical sketch for the call-center example follows below.)
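• A minimal sketch (added for illustration; the rate and count are assumptions): if a call center receives on average 4 calls per minute, the probability of exactly k calls in a minute is $e^{-\lambda} \lambda^k / k!$.

    # Compare the closed-form Poisson probability with scipy's pmf.
    from math import exp, factorial
    from scipy.stats import poisson

    lam, k = 4.0, 6
    print(exp(-lam) * lam**k / factorial(k))   # about 0.1042
    print(poisson.pmf(k, lam))                 # same value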
Hypergeometric Distribution
• The probability of k successes in n draws without replacement (ball drawing) from a population of N items, m of which are successes:
      $P(X = k) = \dfrac{\binom{m}{k} \binom{N-m}{n-k}}{\binom{N}{n}}, \qquad \max(0,\, m + n - N) \le k \le \min(m, n).$
Application
• The classical application of the hypergeometric distribution is sampling without replacement, as in the sketch below.
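• A minimal sketch (added for illustration; N, m, n, and k are assumptions): an urn holds N = 20 balls, m = 7 of them red; draw n = 5 without replacement and ask for the probability of exactly k = 2 red balls.

    # Compare the closed-form hypergeometric probability with scipy's pmf.
    from math import comb
    from scipy.stats import hypergeom

    N, m, n, k = 20, 7, 5, 2
    print(comb(m, k) * comb(N - m, n - k) / comb(N, n))   # about 0.3874
    print(hypergeom.pmf(k, N, m, n))   # scipy order: (k, total, successes, draws)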
Geometric Distribution
• Notation:
      $P(X = k) = p\, q^k, \qquad k = 0, 1, 2, \ldots, \infty, \qquad q = 1 - p.$
• The geometric distribution is memoryless: $P(X \ge k + m \mid X \ge m) = P(X \ge k)$, as checked below.
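• A minimal check of memorylessness (added for illustration; p, k, and m are assumptions): with $P(X = k) = p q^k$ for $k = 0, 1, 2, \ldots$, the tail is $P(X \ge j) = q^j$, so the conditional tail equals the unconditional one.

    # Verify P(X >= k+m | X >= m) = P(X >= k) for the geometric distribution.
    p = 0.3
    q = 1 - p
    k, m = 4, 6

    tail = lambda j: q**j               # P(X >= j) = q^j
    print(tail(k + m) / tail(m))        # conditional probability P(X >= k+m | X >= m)
    print(tail(k))                      # P(X >= k): the same value, 0.7**4 = 0.2401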