Probability and statistics                                                                       April 26, 2016
Lecturer: Prof. Dr. Mete SONER
Coordinator: Yilin WANG
                                     Solution Series 7
Q1. Suppose two random variables X1 and X2 have a continuous joint distribution for which the
    joint p.d.f. is as follows:
                                               (
                                                4x1 x2 for 0 < x1 , x2 < 1,
                                f (x1 , x2 ) =
                                                0      otherwise.
     Calculate Cov(X1 , X2 ). Determine the joint p.d.f. of two new random variables Y1 and Y2 ,
     which are defined by the relations
                                             Y_1 = \frac{X_1}{X_2} \quad \text{and} \quad Y_2 = X_1 X_2 .
     Solution:
     We apply the formula
                                   Cov(X1 , X2 ) = E(X1 X2 ) − E(X1 )E(X2 ).
     Since the marginal density for X1 is
                                           f_1(x_1) = \int_0^1 4 x_1 x_2 \, dx_2 = 2 x_1 ,

      we get

                                    E(X_1) = \int_0^1 x_1 f_1(x_1) \, dx_1 = 2/3 = E(X_2).

      We also have

                         E(X_1 X_2) = \int_0^1 \int_0^1 x_1 x_2 \cdot 4 x_1 x_2 \, dx_1 \, dx_2 = 4/9.
     Thus Cov(X1 , X2 ) = 0. Actually, one can see from f1 (x1 )f2 (x2 ) = f (x1 , x2 ) that X1 and X2
     are independent.
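      As a sanity check, the covariance computation can be reproduced symbolically; here is a
      minimal sketch using sympy (the tooling choice is ours, not part of the exercise):

```python
# Symbolic check of Cov(X1, X2) = 0 for the joint density 4*x1*x2 on (0,1)^2.
import sympy as sp

x1, x2 = sp.symbols('x1 x2', positive=True)
f = 4 * x1 * x2  # joint p.d.f. on the unit square

E_X1X2 = sp.integrate(x1 * x2 * f, (x1, 0, 1), (x2, 0, 1))  # 4/9
E_X1 = sp.integrate(x1 * f, (x1, 0, 1), (x2, 0, 1))         # 2/3
E_X2 = sp.integrate(x2 * f, (x1, 0, 1), (x2, 0, 1))         # 2/3
print(E_X1X2 - E_X1 * E_X2)                                 # 0
```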
      Joint p.d.f. of (Y_1, Y_2) = (r_1(X_1, X_2), r_2(X_1, X_2)), where r_1(x_1, x_2) = x_1/x_2 and
      r_2(x_1, x_2) = x_1 x_2 . Inverting these relations gives

                                x_1 = \sqrt{r_1 r_2} \quad \text{and} \quad x_2 = \sqrt{r_2/r_1} .

      The Jacobian of the inverse map is

            Jac(r_1, r_2) = \det \begin{pmatrix} \partial x_1/\partial r_1 & \partial x_2/\partial r_1 \\ \partial x_1/\partial r_2 & \partial x_2/\partial r_2 \end{pmatrix}
                          = \frac{1}{4} \det \begin{pmatrix} \sqrt{r_2/r_1} & -\sqrt{r_2/r_1^3} \\ \sqrt{r_1/r_2} & \sqrt{1/(r_1 r_2)} \end{pmatrix} = \frac{1}{2 r_1} .
      By the change-of-variables formula for densities,

                  f_{Y_1, Y_2}(r_1, r_2) = f_{X_1, X_2}(\sqrt{r_1 r_2}, \sqrt{r_2/r_1}) \, |Jac(r_1, r_2)|
                                         = 4 \sqrt{r_1 r_2} \sqrt{r_2/r_1} \cdot \frac{1}{2 r_1} \, \mathbf{1}_{\{0 < \sqrt{r_1 r_2}, \sqrt{r_2/r_1} < 1\}}
                                         = \frac{2 r_2}{r_1} \, \mathbf{1}_{\{0 < \sqrt{r_1 r_2}, \sqrt{r_2/r_1} < 1\}} .
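      The derived density can be checked by simulation; the sketch below compares the empirical
      probability of one event in (Y_1, Y_2) with the corresponding integral of f_{Y_1, Y_2}
      (the seed, sample size, and test event are arbitrary choices):

```python
# Monte Carlo check of f_{Y1,Y2}(r1, r2) = 2*r2/r1 on its support.
import numpy as np
from scipy import integrate

rng = np.random.default_rng(0)
n = 10**6
# X1, X2 are i.i.d. with density 2x on (0,1): inverse-CDF sampling, F(x) = x^2.
x1, x2 = np.sqrt(rng.uniform(size=(2, n)))
y1, y2 = x1 / x2, x1 * x2

# Empirical P(Y1 <= 1, Y2 <= 1/2) ...
emp = ((y1 <= 1.0) & (y2 <= 0.5)).mean()

# ... versus the integral of 2*r2/r1 over that event intersected with the
# support {r2 < r1, r2 < 1/r1}: here r1 in (0, 1), r2 in (0, min(r1, 1/2)).
exact, _ = integrate.dblquad(lambda r2, r1: 2 * r2 / r1,
                             0, 1, 0, lambda r1: min(r1, 0.5))
print(emp, exact)  # both close to 0.125 + 0.25*log(2) ~ 0.298
```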
Let α and β be positive numbers. A random variable X has the gamma distribution with parameters
α and β if X has a continuous distribution for which the p.d.f. is
                             f(x|\alpha, \beta) = \begin{cases} \dfrac{\beta^\alpha}{\Gamma(\alpha)} x^{\alpha-1} e^{-\beta x} & \text{for } x > 0, \\ 0 & \text{for } x \le 0, \end{cases}

where \Gamma is the Gamma function, defined for \alpha > 0 by

                                      \Gamma(\alpha) = \int_0^\infty x^{\alpha-1} e^{-x} \, dx .
Q2. Let X have the gamma distribution with parameters α and β.
      (a) For k = 1, 2, . . . , show that the k-th moment of X is
                                   E(X^k) = \frac{\Gamma(\alpha + k)}{\beta^k \Gamma(\alpha)} = \frac{\alpha(\alpha + 1) \cdots (\alpha + k - 1)}{\beta^k} .

            What are E(X) and Var(X)?
      (b) What is the moment generating function of X?
      (c) If the random variables X1 , · · · , Xk are independent, and if Xi (for i = 1, . . . , k) has the
          gamma distribution with parameters αi and β, show that the sum X1 + · · · + Xk has
          the gamma distribution with parameters α1 + · · · + αk and β.
     Solution:
      (a) For k = 1, 2, · · ·
                                        E[X^k] = \frac{\beta^\alpha}{\Gamma(\alpha)} \int_0^\infty x^k x^{\alpha-1} e^{-\beta x} \, dx
                                               = \frac{\beta^\alpha}{\Gamma(\alpha)} \int_0^\infty x^{\alpha+k-1} e^{-\beta x} \, dx
                                               = \frac{\beta^\alpha}{\Gamma(\alpha)} \int_0^\infty \beta^{-\alpha-k} y^{\alpha+k-1} e^{-y} \, dy
                                               = \frac{\Gamma(\alpha + k)}{\beta^k \Gamma(\alpha)} ,
            where the change of variable y = \beta x is used. Integrating by parts shows that, for \alpha > 0,

                                                      \Gamma(\alpha + 1) = \alpha \Gamma(\alpha).
            Applying this identity k times to \Gamma(\alpha + k) gives the second equality. In particular,
            with k = 1 and k = 2,

                                            E(X) = \alpha/\beta ,
                                         Var(X) = E(X^2) - E(X)^2 = \frac{\alpha(\alpha+1)}{\beta^2} - \frac{\alpha^2}{\beta^2} = \alpha/\beta^2 .
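            These formulas are easy to spot-check numerically; a minimal sketch using scipy
            (\alpha = 2.5, \beta = 1.5 are arbitrary test values; note that scipy parametrizes the
            Gamma distribution by scale = 1/\beta):

```python
# Numerical spot-check of the Gamma moment formulas.
from scipy import stats
from scipy.special import gamma as G

a, b = 2.5, 1.5                  # arbitrary test values for alpha, beta
X = stats.gamma(a, scale=1 / b)  # scipy's scale parameter is 1/beta
print(X.mean(), a / b)           # E(X) = alpha/beta
print(X.var(), a / b**2)         # Var(X) = alpha/beta^2
k = 3
print(X.moment(k), G(a + k) / (b**k * G(a)))  # k-th moment, two ways
```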
     (b) By definition of moment generating function,
                                 \psi_X(t) = E(e^{tX}) = \frac{\beta^\alpha}{\Gamma(\alpha)} \int_0^\infty e^{tx} x^{\alpha-1} e^{-\beta x} \, dx
                                           = \frac{\beta^\alpha}{\Gamma(\alpha)} \int_0^\infty x^{\alpha-1} e^{-(\beta - t) x} \, dx
                                           = \frac{\beta^\alpha}{\Gamma(\alpha)} \cdot \frac{\Gamma(\alpha)}{(\beta - t)^\alpha}
                                           = \left( \frac{\beta}{\beta - t} \right)^{\alpha} .
           The above computation is only valid for t < β.
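            A quick Monte Carlo sanity check of this formula (a sketch; \alpha = 3, \beta = 2 and
            t = 0.5 < \beta are arbitrary choices):

```python
# Check E[exp(t*X)] against (beta/(beta - t))**alpha for t < beta.
import numpy as np

rng = np.random.default_rng(1)
a, b, t = 3.0, 2.0, 0.5
x = rng.gamma(shape=a, scale=1 / b, size=10**6)  # numpy's scale is 1/beta
print(np.exp(t * x).mean())   # ~ 2.37
print((b / (b - t)) ** a)     # (4/3)^3 = 2.370...
```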
       (c) If \psi_i denotes the m.g.f. of X_i , then it follows from part (b) that, for i = 1, \ldots, k,

                                               \psi_i(t) = \left( \frac{\beta}{\beta - t} \right)^{\alpha_i} , \quad t < \beta .
           The m.g.f. ψ of X1 + · · · + Xk is, by independence,
                                  \psi(t) = \prod_{i=1}^k \psi_i(t) = \left( \frac{\beta}{\beta - t} \right)^{\alpha_1 + \cdots + \alpha_k} \quad \text{for } t < \beta .
            This coincides, on an open interval around 0, with the m.g.f. of a Gamma random
            variable with parameters (\alpha_1 + \cdots + \alpha_k, \beta). Since the m.g.f. determines the
            distribution, the sum X_1 + \cdots + X_k has that Gamma distribution.
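            The closure under addition can also be observed empirically; a minimal simulation
            sketch (the parameter values and the use of a Kolmogorov-Smirnov test are our choices):

```python
# Sum of independent Gamma(alpha_i, beta) samples vs Gamma(sum(alpha_i), beta).
import numpy as np
from scipy import stats

rng = np.random.default_rng(2)
alphas, b = [0.5, 1.0, 2.5], 2.0
s = sum(rng.gamma(a, scale=1 / b, size=10**5) for a in alphas)
# High p-value: the sum is indistinguishable from Gamma(4, 2) here.
print(stats.kstest(s, stats.gamma(sum(alphas), scale=1 / b).cdf))
```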
Q3. Service Times in a Queue. For i = 1, · · · , n, suppose that customer i in a queue must
    wait time Xi for service once reaching the head of the queue. Let Z be the rate at which
    the average customer is served. A typical probability model for this situation is to say that,
    conditional on Z = z, X1 , . . . , Xn are i.i.d. with a distribution having the conditional p.d.f.
    g1 (xi |z) = z exp(−zxi ) for xi > 0. Suppose that Z is also unknown and has the p.d.f.
    f2 (z) = 2 exp(−2z) for z > 0.
       (a) What is the joint p.d.f. of X_1, \ldots, X_n, Z?
       (b) What is the marginal joint distribution of X_1, \ldots, X_n?
       (c) What is the conditional p.d.f. g_2(z|x_1, \ldots, x_n) of Z given X_1 = x_1, \ldots, X_n = x_n?
           For this we can set y = 2 + \sum_{i=1}^n x_i .
     (d) What is the expected average service rate given the observations X1 = x1 , · · · , Xn = xn ?
     Solution:
     (a) The joint p.d.f. of X1 , · · · , Xn , Z is
                       f(x_1, \ldots, x_n, z) = \prod_{i=1}^n g_1(x_i|z) \, f_2(z) = 2 z^n \exp(-z[2 + x_1 + \cdots + x_n]),
          if z, x1 , · · · , xn > 0 and 0 otherwise.
     (b) The marginal joint distribution of X1 , · · · , Xn is obtained by integrating z out of the
         joint p.d.f. above.
                 \int_0^\infty f(x_1, \ldots, x_n, z) \, dz = \frac{2 \, \Gamma(n+1)}{(2 + x_1 + \cdots + x_n)^{n+1}} = \frac{2 \, (n!)}{(2 + x_1 + \cdots + x_n)^{n+1}} ,
         for all xi > 0 and 0 otherwise.
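          For n = 1 this marginal density is 2/(2 + x_1)^2, so for instance P(X_1 \le 1) =
          \int_0^1 2/(2 + x)^2 \, dx = 1/3, which is easy to confirm by simulating the hierarchy
          (a sketch; the seed and sample size are arbitrary):

```python
# Simulate Z ~ Exp(2) and X | Z = z ~ Exp(z), then check P(X <= 1) = 1/3.
import numpy as np

rng = np.random.default_rng(3)
n = 10**6
z = rng.exponential(scale=1 / 2, size=n)  # f2(z) = 2*exp(-2z)
x = rng.exponential(scale=1 / z)          # g1(x|z) = z*exp(-z*x)
print((x <= 1.0).mean(), 1 / 3)           # both ~ 0.333
```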
      (c) Set y = 2 + \sum_{i=1}^n x_i . For z > 0,

                                  g_2(z|x_1, \ldots, x_n) = f(x_1, \ldots, x_n, z) \, \frac{y^{n+1}}{2 \, (n!)}
                                                          = \frac{z^n \exp(-zy) \, y^{n+1}}{n!}
                                                          = \frac{y^{n+1}}{\Gamma(n+1)} \, z^{(n+1)-1} e^{-yz} ,

           so we recognize that the conditional distribution of Z given X_1 = x_1, \ldots, X_n = x_n
           is the Gamma distribution with parameters \alpha = n + 1, \beta = y.
      (d) The conditional expectation of Z given X_1 = x_1, \ldots, X_n = x_n is the expected value
          of the Gamma distribution with parameters \alpha = n + 1, \beta = y, which by Q2.(a) equals

                                     E(Z|X_1 = x_1, \ldots, X_n = x_n) = \frac{n+1}{2 + \sum_{i=1}^n x_i} .
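          The posterior-mean formula can be cross-checked by integrating z against g_2 directly;
          a minimal sketch with made-up observations x = (0.7, 1.3, 0.2) (purely hypothetical
          values):

```python
# Check E(Z | x) = (n+1)/y by numerically integrating z * g2(z|x).
import numpy as np
from scipy import integrate
from scipy.special import factorial

x = np.array([0.7, 1.3, 0.2])  # hypothetical observations
n, y = len(x), 2 + x.sum()
g2 = lambda z: z**n * np.exp(-z * y) * y**(n + 1) / factorial(n)
post_mean, _ = integrate.quad(lambda z: z * g2(z), 0, np.inf)
print(post_mean, (n + 1) / y)  # both ~ 0.952
```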
Q4. Least-squares line.
      (a) Let (x_1, y_1), \ldots, (x_n, y_n) be n points of \mathbb{R}^2 whose x_i are not all equal. Show
          that the straight line y(x) = \hat{\beta}_0 + \hat{\beta}_1 x minimizing the sum of the squares of
          the vertical deviations of all the points from the line has the following intercept and
          slope, i.e. (\hat{\beta}_0, \hat{\beta}_1) minimizes

                                            I(\beta_0, \beta_1) := \sum_{i=1}^n (\beta_0 + \beta_1 x_i - y_i)^2

           over all choices of (\beta_0, \beta_1) \in \mathbb{R}^2 :

                                               \hat{\beta}_1 = \frac{\sum_{i=1}^n (y_i - \bar{y})(x_i - \bar{x})}{\sum_{i=1}^n (x_i - \bar{x})^2} ,
                                               \hat{\beta}_0 = \bar{y} - \hat{\beta}_1 \bar{x} ,
          where \bar{x} = \frac{1}{n} \sum_{i=1}^n x_i and \bar{y} = \frac{1}{n} \sum_{i=1}^n y_i .
          The minimizing line is called the least-squares line. Remark that the least-squares line
          passes through the point (\bar{x}, \bar{y}).

                                     Table 1: Data for Q4.(b)
                                            i        x_i          y_i
                                            1        0.5          40
                                            2        1.0          41
                                            3        1.5          43
                                            4        2.0          42
                                            5        2.5          44
                                            6        3.0          42
                                            7        3.5          43
                                            8        4.0          42
      (b) Fit a straight line of the form y = \beta_0 + \beta_1 x to the values in Table 1 by the
          method of least squares (with your calculator or Excel).
     Solution:
      (a) Since I(\beta_0, \beta_1) \to +\infty as \|(\beta_0, \beta_1)\| \to \infty (which holds because the x_i are not
          all equal), the infimum of I over \mathbb{R}^2 is attained on some compact set. Since I is
          continuous, the infimum is therefore a minimum, and we can look for it among the
          critical points (\hat{\beta}_0, \hat{\beta}_1):
                                 \partial_{\beta_0} I(\hat{\beta}_0, \hat{\beta}_1) = 2 \sum_{i=1}^n (\hat{\beta}_0 + \hat{\beta}_1 x_i - y_i) = 0 ,
                                 \partial_{\beta_1} I(\hat{\beta}_0, \hat{\beta}_1) = 2 \sum_{i=1}^n x_i (\hat{\beta}_0 + \hat{\beta}_1 x_i - y_i) = 0 .
          We solve the above system. The first equation gives

                                   n \hat{\beta}_0 + n \bar{x} \hat{\beta}_1 = n \bar{y} \quad \Rightarrow \quad \hat{\beta}_0 = \bar{y} - \bar{x} \hat{\beta}_1 ,
         and
                                   n \bar{x} \hat{\beta}_0 + \left( \sum_{i=1}^n x_i^2 \right) \hat{\beta}_1 = \sum_{i=1}^n x_i y_i
                              \Rightarrow \; n \bar{x} (\bar{y} - \bar{x} \hat{\beta}_1) + \left( \sum_{i=1}^n x_i^2 \right) \hat{\beta}_1 = \sum_{i=1}^n x_i y_i
                              \Rightarrow \; \left( \sum_{i=1}^n x_i^2 - n \bar{x}^2 \right) \hat{\beta}_1 = \sum_{i=1}^n x_i y_i - n \bar{x} \bar{y}
                              \Rightarrow \; \left( \sum_{i=1}^n (x_i - \bar{x})^2 \right) \hat{\beta}_1 = \sum_{i=1}^n (x_i - \bar{x})(y_i - \bar{y})
                              \Rightarrow \; \hat{\beta}_1 = \frac{\sum_{i=1}^n (x_i - \bar{x})(y_i - \bar{y})}{\sum_{i=1}^n (x_i - \bar{x})^2} .
      (b) We apply the above formulas to the data in Table 1 and find \hat{\beta}_1 = 0.548 and \hat{\beta}_0 = 40.89.
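          Instead of a calculator or Excel, the fit can be done in a few lines of Python; a
          minimal sketch applying the formulas of part (a) to the data of Table 1 (numpy.polyfit
          is used only as an independent cross-check):

```python
# Least-squares fit of y = b0 + b1*x to the data of Table 1.
import numpy as np

x = np.array([0.5, 1.0, 1.5, 2.0, 2.5, 3.0, 3.5, 4.0])
y = np.array([40, 41, 43, 42, 44, 42, 43, 42])

b1 = ((x - x.mean()) * (y - y.mean())).sum() / ((x - x.mean()) ** 2).sum()
b0 = y.mean() - b1 * x.mean()
print(b0, b1)               # 40.89..., 0.547...
print(np.polyfit(x, y, 1))  # same line: [slope, intercept]
```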