Advanced Econometrics
Masters Class
                     Chrispin Mphuka
                             UNZA
                      January 2010
Assumptions of the CLRM (Recap)
   Linearity: $y_i = x_{i1}\beta_1 + x_{i2}\beta_2 + \dots + x_{iK}\beta_K + \varepsilon_i$   $(i = 1, 2, 3, \dots, n)$
   Full rank: the $n \times K$ data matrix $X$ has full column rank.
   Exogeneity of the regressors: $E(\varepsilon_i \mid X_1, \dots, X_n) = 0$   $(i = 1, 2, \dots, n)$
   Homoscedasticity and nonautocorrelation: each disturbance $\varepsilon_i$ has the same finite variance $\sigma^2$ and is uncorrelated with every other disturbance $\varepsilon_j$.
   Exogenously generated data
   Normal distribution of the disturbances
Linearity
   Recall: the OLS estimator is $b = (X'X)^{-1}X'y$.
   Let $C = (X'X)^{-1}X' \Rightarrow b = Cy$.
   This shows that the OLS estimator is a linear combination of the regressand.
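As a quick numerical illustration of $b = Cy$, here is a minimal sketch in Python/NumPy; the sample size, design matrix, and coefficient values are simulated and purely illustrative, not from the lecture.

```python
import numpy as np

# Minimal sketch: the OLS estimator as a linear map of the regressand, b = C y
rng = np.random.default_rng(0)
n, K = 100, 3
X = np.column_stack([np.ones(n), rng.normal(size=(n, K - 1))])  # illustrative design matrix
beta = np.array([1.0, 2.0, -0.5])                               # illustrative true coefficients
y = X @ beta + rng.normal(size=n)

C = np.linalg.inv(X.T @ X) @ X.T   # C = (X'X)^{-1} X'
b = C @ y                          # b is linear in y
print(b)                           # close to beta for this sample size
```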
Unbiasedness

   $b = (X'X)^{-1}X'y = (X'X)^{-1}X'(X\beta + \varepsilon) = \beta + (X'X)^{-1}X'\varepsilon = \beta + C\varepsilon$   (1)

   Now take expectations of the OLS estimator conditional on the $X$ matrix:
   $E(b \mid X) = \beta + E\big[(X'X)^{-1}X'\varepsilon \mid X\big]$   (2)
   $= \beta + (X'X)^{-1}X'\,E[\varepsilon \mid X]$   (3)
   By the exogeneity assumption, $E[\varepsilon \mid X] = 0$, so the last term is 0.
   Hence:
   $E(b \mid X) = \beta$   (4)
   This shows that under the CLRM assumptions the OLS estimator is unbiased.
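A small Monte Carlo sketch of the unbiasedness result, assuming a fixed simulated design matrix and normal disturbances with $E[\varepsilon \mid X] = 0$ (all values are illustrative):

```python
import numpy as np

# Monte Carlo check of unbiasedness (a sketch; the design matrix and beta are made up)
rng = np.random.default_rng(1)
n, K, reps = 50, 3, 5000
X = np.column_stack([np.ones(n), rng.normal(size=(n, K - 1))])  # fixed X across replications
beta = np.array([1.0, 2.0, -0.5])

estimates = np.empty((reps, K))
for r in range(reps):
    eps = rng.normal(size=n)                          # E[eps | X] = 0 by construction
    y = X @ beta + eps
    estimates[r] = np.linalg.solve(X.T @ X, X.T @ y)  # OLS estimate for this replication

print(estimates.mean(axis=0))  # average of b over replications, approximately equal to beta
```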
The Variance of the OLS estimator
   $\mathrm{Var}(b \mid X) = E\big[(b - \beta)(b - \beta)' \mid X\big]$
   $= E\big[(X'X)^{-1}X'\varepsilon\varepsilon'X(X'X)^{-1} \mid X\big]$
   $= (X'X)^{-1}X'\,E[\varepsilon\varepsilon' \mid X]\,X(X'X)^{-1}$
   $= (X'X)^{-1}X'(\sigma^2 I)X(X'X)^{-1}$
   $= \sigma^2 (X'X)^{-1}$
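A sketch of the covariance formula $\mathrm{Var}(b \mid X) = \sigma^2 (X'X)^{-1}$, treating $\sigma^2$ as known purely for illustration:

```python
import numpy as np

# Sketch: conditional covariance matrix of b when sigma^2 is known (values are illustrative)
rng = np.random.default_rng(2)
n, K = 100, 3
X = np.column_stack([np.ones(n), rng.normal(size=(n, K - 1))])
sigma2 = 1.5  # assumed known here only for illustration

var_b = sigma2 * np.linalg.inv(X.T @ X)   # Var(b | X) = sigma^2 (X'X)^{-1}
print(np.sqrt(np.diag(var_b)))            # standard errors of the individual coefficients
```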
Gauss-Markov Theorem
   In the classical linear regression model with regressor matrix $X$, the least squares estimator $b$ is the minimum variance linear unbiased estimator of $\beta$. For any vector of constants $W$, the minimum variance linear unbiased estimator of $W'\beta$ in the classical regression model is $W'b$, where $b$ is the least squares estimator.
Proving Minimum Variance
   Let $b_0 = Ay$ be another linear unbiased estimator, where $A$ is a $K \times n$ matrix. If $b_0$ is unbiased, then:
   $E[Ay \mid X] = E[(AX\beta + A\varepsilon) \mid X] = \beta$   (5)
   This implies that $AX = I$.
   The variance-covariance matrix of $b_0$ is
   $\mathrm{Var}[b_0 \mid X] = \sigma^2 AA'$
   Now let $D = A - (X'X)^{-1}X' \iff Dy = b_0 - b$.
   Then
   $\mathrm{Var}[b_0 \mid X] = \sigma^2\big[D + (X'X)^{-1}X'\big]\big[D + (X'X)^{-1}X'\big]'$
Proving Minimum Variance
   But $AX = I = DX + (X'X)^{-1}X'X$, so $DX = 0$.
   Therefore:
   $\mathrm{Var}[b_0 \mid X] = \sigma^2 (X'X)^{-1} + \sigma^2 DD' = \mathrm{Var}[b \mid X] + \sigma^2 DD'$
   The conditional variance-covariance matrix of $b_0$ equals that of $b$ plus a nonnegative definite matrix.
   Thus
   $\mathrm{Var}[b \mid X] \le \mathrm{Var}[b_0 \mid X]$
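The Gauss-Markov result can also be checked numerically. The sketch below constructs another linear unbiased estimator $b_0 = Ay$ with $A = (X'WX)^{-1}X'W$ for an arbitrary positive definite $W$ (so $AX = I$), and verifies that $\mathrm{Var}[b_0 \mid X] - \mathrm{Var}[b \mid X]$ is nonnegative definite; the choice of $W$ and the data are illustrative assumptions, not from the lecture.

```python
import numpy as np

# Sketch: compare OLS with another linear unbiased estimator b0 = A y,
# where A = (X'WX)^{-1} X'W for some positive definite W, so that AX = I.
rng = np.random.default_rng(3)
n, K = 60, 3
X = np.column_stack([np.ones(n), rng.normal(size=(n, K - 1))])
sigma2 = 1.0
W = np.diag(rng.uniform(0.5, 2.0, size=n))       # an arbitrary p.d. weight matrix

A = np.linalg.solve(X.T @ W @ X, X.T @ W)        # AX = I, so b0 = Ay is linear unbiased
var_b0 = sigma2 * A @ A.T                        # Var(b0 | X) = sigma^2 A A'
var_b = sigma2 * np.linalg.inv(X.T @ X)          # Var(b  | X) = sigma^2 (X'X)^{-1}

# The difference should be nonnegative definite: smallest eigenvalue >= 0 up to rounding
print(np.linalg.eigvalsh(var_b0 - var_b).min())
```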
Estimator of the variance
   The least squares residual is $e = My = M(X\beta + \varepsilon) = M\varepsilon$, since $MX = 0$ by construction (where $M = I_n - X(X'X)^{-1}X'$ is the residual maker).
   The estimator of the variance of the disturbances is based on the residual sum of squares:
   $e'e = \varepsilon' M \varepsilon$
   $E[e'e \mid X] = E[\varepsilon' M \varepsilon \mid X]$
   Since $\varepsilon' M \varepsilon$ is a scalar, it is equal to its trace. Thus
   $E[\mathrm{tr}(\varepsilon' M \varepsilon) \mid X] = E[\mathrm{tr}(M\varepsilon\varepsilon') \mid X]$
   Since $X$ is treated as nonstochastic (we condition on $X$), we have:
   $E[\mathrm{tr}(M\varepsilon\varepsilon') \mid X] = \mathrm{tr}\big(M\,E[\varepsilon\varepsilon' \mid X]\big) = \mathrm{tr}(M\sigma^2 I) = \sigma^2\,\mathrm{tr}(M)$
Estimator of the variance
   The trace of $M$ is:
   $\mathrm{tr}\big[I_n - X(X'X)^{-1}X'\big] = \mathrm{tr}(I_n) - \mathrm{tr}\big[X(X'X)^{-1}X'\big] = n - \mathrm{tr}\big[(X'X)^{-1}X'X\big] = n - \mathrm{tr}(I_K) = n - K$
   Thus $E[e'e \mid X] = \sigma^2 (n - K)$.
   Therefore the variance estimator is
   $s^2 = \frac{e'e}{n - K}$
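A sketch of the variance estimator $s^2 = e'e/(n - K)$ computed through the residual maker $M$, using simulated data (all values are illustrative):

```python
import numpy as np

# Sketch: unbiased estimator s^2 = e'e / (n - K) via the residual maker M
rng = np.random.default_rng(4)
n, K = 100, 3
X = np.column_stack([np.ones(n), rng.normal(size=(n, K - 1))])
y = X @ np.array([1.0, 2.0, -0.5]) + rng.normal(scale=1.2, size=n)

M = np.eye(n) - X @ np.linalg.solve(X.T @ X, X.T)   # residual maker, MX = 0
e = M @ y                                           # least squares residuals
s2 = (e @ e) / (n - K)
print(s2)  # should be roughly sigma^2 = 1.2**2 = 1.44
```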
Statistical Inference
   Since $b$ is a linear function of $\varepsilon$, which is multivariate normal, it follows that $b$ is also multivariate normal.
   Therefore:
   $b \mid X \sim N\big(\beta, \sigma^2 (X'X)^{-1}\big)$
   So each element of $b \mid X$ is also normally distributed:
   $b_k \mid X \sim N\big(\beta_k, \sigma^2 (X'X)^{-1}_{kk}\big)$
Hypothesis Testing
   Let $S^{kk}$ be the $k$th diagonal element of $(X'X)^{-1}$; then
   $z_k = \frac{b_k - \beta_k}{\sqrt{\sigma^2 S^{kk}}} \sim N(0, 1)$
   If $\sigma^2$ were known, we could use the standard normal distribution for statistical inference.
   But normally this is not the case, so we need a new statistic. We start with
   $\frac{(n - K)\,s^2}{\sigma^2} = \frac{e'e}{\sigma^2} = \Big(\frac{\varepsilon}{\sigma}\Big)' M \Big(\frac{\varepsilon}{\sigma}\Big)$
   This is an idempotent quadratic form in the standard normal vector $\varepsilon/\sigma$. Therefore it has a chi-squared distribution with degrees of freedom $\mathrm{tr}(M) = n - K$.
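A sketch that simulates $(n - K)s^2/\sigma^2$ and compares it with the $\chi^2_{n-K}$ distribution; the sample size, number of replications, and the Kolmogorov-Smirnov check are illustrative choices, not part of the lecture:

```python
import numpy as np
from scipy import stats

# Sketch: simulate (n-K) s^2 / sigma^2 and compare with the chi-squared(n-K) distribution
rng = np.random.default_rng(5)
n, K, reps, sigma = 40, 3, 2000, 1.0
X = np.column_stack([np.ones(n), rng.normal(size=(n, K - 1))])
M = np.eye(n) - X @ np.linalg.solve(X.T @ X, X.T)

stat = np.empty(reps)
for r in range(reps):
    eps = rng.normal(scale=sigma, size=n)
    e = M @ eps                       # residuals equal M*eps regardless of the true beta
    stat[r] = (e @ e) / sigma**2      # equals (n-K) s^2 / sigma^2

print(stat.mean(), n - K)                                 # chi2(n-K) has mean n-K
print(stats.kstest(stat, "chi2", args=(n - K,)).pvalue)   # typically a large p-value
```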
Hypothesis Testing
   Let's define
   $\frac{b - \beta}{\sigma} = (X'X)^{-1}X'\frac{\varepsilon}{\sigma}$
   Theorem: a linear function $Lx$ and a symmetric idempotent quadratic form $x'Ax$ in a standard normal vector $x$ are statistically independent if $LA = 0$ (for a proof, see Theorem B.12).
   Let $\varepsilon/\sigma$ be $x$; then the requirement of the theorem is that $(X'X)^{-1}X'M = 0$, which holds since $MX = 0$.
   Theorem: if $\varepsilon$ is normally distributed, then the least squares coefficient estimator $b$ is statistically independent of the residual vector $e$ and, therefore, of all functions of $e$.
Hypothesis Testing
   Recall: a standard normal variable divided by the square root of an independent chi-squared variable that has been divided by its respective degrees of freedom has a t distribution.
   Therefore
   $t_k = \frac{(b_k - \beta_k)/\sqrt{\sigma^2 S^{kk}}}{\sqrt{\frac{(n - K)s^2}{\sigma^2} \big/ (n - K)}} = \frac{b_k - \beta_k}{\sqrt{s^2 S^{kk}}}$
   This is distributed as a t distribution with $n - K$ degrees of freedom.
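A sketch computing the $t_k$ statistics for the hypotheses $H_0: \beta_k = 0$ from the formula above; the simulated data set (with one truly zero coefficient) is an illustrative assumption:

```python
import numpy as np

# Sketch: t statistics for H0: beta_k = 0, computed from the formulas above
rng = np.random.default_rng(6)
n, K = 100, 3
X = np.column_stack([np.ones(n), rng.normal(size=(n, K - 1))])
y = X @ np.array([1.0, 2.0, 0.0]) + rng.normal(size=n)   # third coefficient truly zero

XtX_inv = np.linalg.inv(X.T @ X)
b = XtX_inv @ X.T @ y
e = y - X @ b
s2 = (e @ e) / (n - K)
se = np.sqrt(s2 * np.diag(XtX_inv))     # sqrt(s^2 * S^kk)
t_stats = b / se                        # t_k under H0: beta_k = 0
print(t_stats)                          # the third |t| is typically small
```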
Confidence Intervals
   $\mathrm{Prob}\big(b_k - t_{\alpha/2}\,s_{b_k} \le \beta_k \le b_k + t_{\alpha/2}\,s_{b_k}\big) = 1 - \alpha$
   Look at Example 4.4, page 52, of Greene.
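A sketch of the confidence interval $b_k \pm t_{\alpha/2}\,s_{b_k}$ using the $t_{n-K}$ critical value from SciPy; the simulated data and the 95% level are illustrative assumptions:

```python
import numpy as np
from scipy import stats

# Sketch: 95% confidence intervals b_k +/- t_{alpha/2} * se(b_k)
rng = np.random.default_rng(7)
n, K, alpha = 100, 3, 0.05
X = np.column_stack([np.ones(n), rng.normal(size=(n, K - 1))])
y = X @ np.array([1.0, 2.0, -0.5]) + rng.normal(size=n)

XtX_inv = np.linalg.inv(X.T @ X)
b = XtX_inv @ X.T @ y
e = y - X @ b
se = np.sqrt((e @ e) / (n - K) * np.diag(XtX_inv))
t_crit = stats.t.ppf(1 - alpha / 2, df=n - K)    # critical value t_{alpha/2, n-K}
lower, upper = b - t_crit * se, b + t_crit * se
print(np.column_stack([lower, b, upper]))        # interval typically covers the true betas
```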
Data Problems
   Multicollinearity
   - Effects of high correlation among the regressors include: small changes in the data produce wide swings in the parameter estimates; coefficients may have very high standard errors and low significance levels; coefficients may have the wrong signs or implausible magnitudes.
   - Detection includes using the variance inflation factor, $\frac{1}{1 - R_k^2}$. Values in excess of 20 are taken to indicate serious multicollinearity (see the sketch after this slide).
   - Corrections for collinearity include dropping the problem variable, using additional data, and using principal components.
   Missing Observations
   - If observations are missing at random, then you can ignore the affected cases.
   - If they are missing in a systematic manner, then some researchers have resorted to imputing the missing data using the available information.
   Look at Example 4.4, page 52, of Greene.
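A sketch of the variance inflation factor diagnostic: each $R_k^2$ comes from regressing regressor $k$ on the remaining regressors. The helper function and the simulated (nearly collinear) data are illustrative, not from the lecture.

```python
import numpy as np

# Sketch: variance inflation factor VIF_k = 1 / (1 - R_k^2)
def vif(X):
    """Return the VIF of each column of X (X should exclude the constant)."""
    n, p = X.shape
    out = np.empty(p)
    for k in range(p):
        xk = X[:, k]
        Z = np.column_stack([np.ones(n), np.delete(X, k, axis=1)])  # other regressors + constant
        resid = xk - Z @ np.linalg.lstsq(Z, xk, rcond=None)[0]
        r2 = 1.0 - (resid @ resid) / ((xk - xk.mean()) @ (xk - xk.mean()))
        out[k] = 1.0 / (1.0 - r2)
    return out

rng = np.random.default_rng(8)
x1 = rng.normal(size=200)
x2 = x1 + rng.normal(scale=0.05, size=200)   # nearly collinear with x1
x3 = rng.normal(size=200)
print(vif(np.column_stack([x1, x2, x3])))    # first two VIFs should be very large
```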
Data Problems
   Influential Data Points
   - We identify which residuals are too large in order to detect outliers.
   - We use the standardized residual $\hat{e}_i = \frac{e_i}{\big[s^2(1 - p_{ii})\big]^{1/2}}$, where $p_{ii}$ is the $i$th diagonal element of $X(X'X)^{-1}X'$.
   - If the value is above 2.0, it suggests a problem observation.
   - Solution: in cross-section data we can drop the observation, but this is not advisable with time series data.
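A sketch of the standardized-residual diagnostic $e_i / [s^2(1 - p_{ii})]^{1/2}$, with $p_{ii}$ taken from the diagonal of the hat matrix; the simulated data and the planted outlier are illustrative assumptions:

```python
import numpy as np

# Sketch: flag observations whose standardized residual exceeds about 2 in absolute value
rng = np.random.default_rng(9)
n, K = 50, 2
X = np.column_stack([np.ones(n), rng.normal(size=n)])
y = X @ np.array([1.0, 2.0]) + rng.normal(size=n)
y[10] += 8.0                                         # plant an outlier at observation 10

H = X @ np.linalg.solve(X.T @ X, X.T)                # hat matrix, p_ii on its diagonal
e = y - H @ y                                        # least squares residuals
s2 = (e @ e) / (n - K)
std_resid = e / np.sqrt(s2 * (1.0 - np.diag(H)))
print(np.where(np.abs(std_resid) > 2.0)[0])          # observation 10 should appear
```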