Violation of CLRM
Prof. Rishman Jot Kaur Chahal
                  HSC - 205
Department of Humanities and Social Sciences
   Indian Institute of Technology Roorkee
Autocorrelation
Detecting Autocorrelation
      Graphically: Visual examination of the estimated residuals provides a
      clue to the presence of autocorrelation. We can simply plot them
      against time: the time-sequence plot.
      Durbin–Watson d test: The most celebrated test for detecting serial
      correlation was developed by the statisticians Durbin and Watson.
      The d statistic is
          d = \frac{\sum_{t=2}^{n} (\hat{u}_t - \hat{u}_{t-1})^2}{\sum_{t=1}^{n} \hat{u}_t^2}    (1)
      which is simply the ratio of the sum of squared differences in
      successive residuals to the residual sum of squares (RSS).
      Note that in the numerator of the d statistic the number of
      observations is n − 1, because one observation is lost in taking
      successive differences.
      The d statistic is based on the estimated residuals, which is a great
      advantage. However, the statistic rests on the following assumptions:
           1. The regression model includes the intercept term. If it is not present,
           as in the case of the regression through the origin, it is essential to
           rerun the regression including the intercept term to obtain the RSS.
           2. The explanatory variables, the X’s, are nonstochastic, or fixed in
           repeated sampling.
           3. The disturbances ut are generated by the first-order autoregressive
           scheme: ut = ρut−1 + ϵt . Therefore, it cannot be used to detect
           higher-order autoregressive schemes.
      4. The error term ut is assumed to be normally distributed.
      5. The regression model does not include the lagged value(s) of the
      dependent variable as one of the explanatory variables. Thus, the test
      is inapplicable in models of the following type:
               Y_t = \beta_1 + \beta_2 X_{2t} + \beta_3 X_{3t} + \dots + \beta_k X_{kt} + \gamma Y_{t-1} + u_t
      where Y_{t-1} is the one-period lagged value of Y.
      6. There are no missing observations in the data.
      Deriving the exact sampling (probability) distribution of the
      d statistic is difficult, as it depends in a complicated way on the
      X values.
      So, unlike the t, F, and χ² tests, there is no unique critical value
      that leads us to accept or reject the null hypothesis of no first-order
      serial correlation in the disturbances u_t.
      So, what did Durbin and Watson do?
      Durbin and Watson derived a lower bound (d_L) and an upper bound
      (d_U) such that if the computed d lies outside these critical values, a
      decision can be made regarding the presence of positive or negative
      serial correlation.
      Simplify the d statistic as follows:
          d = \frac{\sum \hat{u}_t^2 + \sum \hat{u}_{t-1}^2 - 2 \sum \hat{u}_t \hat{u}_{t-1}}{\sum \hat{u}_t^2}

      Note that, since the two sums differ by only one observation,

          \sum \hat{u}_{t-1}^2 \approx \sum \hat{u}_t^2    for large n.
      Thus,

          d \approx 2\left(1 - \frac{\sum \hat{u}_t \hat{u}_{t-1}}{\sum \hat{u}_t^2}\right)
      Consider the sample first-order autocorrelation coefficient of the
      residuals,

          \hat{\rho} = \frac{\sum \hat{u}_t \hat{u}_{t-1}}{\sum \hat{u}_t^2}

      so that d \approx 2(1 - \hat{\rho}).
      Since −1 ≤ \hat{\rho} ≤ 1, it follows that 0 ≤ d ≤ 4.
      These are the bounds of d; any estimated d value must lie within
      these limits.
      Rule of Thumb: If d is found to be about 2 in an application, one may
      assume that there is no first-order autocorrelation, either positive or
      negative. By contrast, a residual autocorrelation of, say,
      \hat{\rho} = 0.9 gives d ≈ 2(1 − 0.9) = 0.2, signalling strong positive
      autocorrelation.
      Decision Rule:
      [Figure: Durbin–Watson decision regions; Gujarati et al., 2009, fifth
      edition. In summary:]
           0 < d < d_L : reject H_0; evidence of positive autocorrelation.
           d_L ≤ d ≤ d_U : zone of indecision.
           d_U < d < 4 − d_U : do not reject H_0; no first-order autocorrelation.
           4 − d_U ≤ d ≤ 4 − d_L : zone of indecision.
           4 − d_L < d < 4 : reject H_0; evidence of negative autocorrelation.
      But how do we get d_L and d_U? From the Durbin–Watson tables, for
      the given sample size and the given number of explanatory variables.
      The mechanics of the Durbin–Watson test (a code sketch follows the
      list):
           Run the OLS regression and obtain the residuals.
           Compute d from Eq. (1).
           Find the critical values d_L and d_U for the given sample size and
           given number of explanatory variables.
           Follow the decision rules given above.
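      As a concrete illustration, here is a minimal Python sketch of these
      steps using numpy and statsmodels; the simulated regression and all
      variable names are hypothetical, not from the lecture's data:

          import numpy as np
          import statsmodels.api as sm

          # Hypothetical data: y on a single regressor x, with AR(1)
          # errors so the test has something to detect.
          rng = np.random.default_rng(0)
          n = 50
          x = rng.normal(size=n)
          u = np.zeros(n)
          for t in range(1, n):
              u[t] = 0.8 * u[t - 1] + rng.normal()   # rho = 0.8
          y = 1.0 + 2.0 * x + u

          # Step 1: run OLS and obtain the residuals.
          res = sm.OLS(y, sm.add_constant(x)).fit()
          uhat = res.resid

          # Step 2: compute d from Eq. (1): squared successive
          # differences of residuals over the residual sum of squares.
          d = np.sum(np.diff(uhat) ** 2) / np.sum(uhat ** 2)
          print(f"d = {d:.4f}")   # compare with d_L and d_U from the tables

      The computed d is then compared with d_L and d_U from the tables,
      exactly as in the decision rule above; statsmodels also ships a
      ready-made statsmodels.stats.stattools.durbin_watson that computes
      the same ratio.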
Detecting Autocorrelation from an Example
      Let us consider the example of the wages–productivity regression,
      where wages depend on the productivity of individuals.
           From the Durbin–Watson tables, for n = 46 and one explanatory
           variable, d_L = 1.475 and d_U = 1.566 at the 5 percent level of
           significance.
           Suppose the estimated d value of the regression is 0.2175.
           Since the computed d of 0.2175 lies below d_L , we cannot reject the
           hypothesis that there is positive serial correlation in the residuals.
Detecting Autocorrelation through a General Test: Breusch-Godfrey (BG) Test
      Also known as the LM test, as it is based on the Lagrange multiplier
      principle.
      Consider a two-variable regression model:

          Y_t = \beta_1 + \beta_2 X_t + u_t    (2)

      Assume that the error term u_t follows the pth-order autoregressive,
      AR(p), scheme:

          u_t = \rho_1 u_{t-1} + \rho_2 u_{t-2} + \dots + \rho_p u_{t-p} + \epsilon_t

      The null hypothesis is of no serial correlation, i.e.
      H_0 : \rho_1 = \rho_2 = \dots = \rho_p = 0.
     Steps for BG Test:
           Estimate the model by OLS and obtain \hat{u}_t .
           Regress \hat{u}_t on the original X_t and on the additional regressors
           \hat{u}_{t-1}, \hat{u}_{t-2}, ..., \hat{u}_{t-p}. So, if p = 5, we include five
           lagged values of the residuals as additional regressors:

               \hat{u}_t = \alpha_1 + \alpha_2 X_t + \hat{\rho}_1 \hat{u}_{t-1} + \dots + \hat{\rho}_p \hat{u}_{t-p} + \epsilon_t

           Obtain the R^2 of this auxiliary regression.
           If the sample size is large (technically, infinite), Breusch and Godfrey
           have shown that

               (n − p) R^2 \sim \chi^2_p

           If (n − p)R^2 exceeds the critical chi-square value at the chosen level
           of significance, we reject the null hypothesis (see the sketch below).
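      A minimal sketch of the BG test in Python: acorr_breusch_godfrey in
      statsmodels carries out the auxiliary regression and LM computation
      described above; the simulated data and names here are purely
      hypothetical:

          import numpy as np
          import statsmodels.api as sm
          from statsmodels.stats.diagnostic import acorr_breusch_godfrey

          # Hypothetical data with AR(2) errors.
          rng = np.random.default_rng(1)
          n = 200
          x = rng.normal(size=n)
          u = np.zeros(n)
          for t in range(2, n):
              u[t] = 0.5 * u[t - 1] + 0.3 * u[t - 2] + rng.normal()
          y = 1.0 + 2.0 * x + u

          res = sm.OLS(y, sm.add_constant(x)).fit()

          # BG test with p = 2 lagged residuals as additional regressors.
          lm_stat, lm_pval, f_stat, f_pval = acorr_breusch_godfrey(res, nlags=2)
          print(f"LM = {lm_stat:.3f}, p-value = {lm_pval:.4f}")
          # A small p-value rejects H0 of no serial correlation up to order p.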
Remedies
      One must first analyse whether the autocorrelation is pure or not:
      sometimes, because of a misspecified model, important variables are
      excluded, and this shows up as autocorrelation.
      If pure autocorrelation remains, one can transform the model. As in
      the case of heteroscedasticity, we use the generalized least squares
      (GLS) method.
      In large samples, we can use the Newey–West method to obtain
      autocorrelation-robust standard errors of the OLS estimators (see the
      sketch below).
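      For instance, statsmodels lets one request Newey–West (HAC)
      standard errors directly when fitting; in this sketch the data are
      hypothetical and maxlags = 4 is an illustrative tuning choice, not a
      prescribed value:

          import numpy as np
          import statsmodels.api as sm

          # Hypothetical data with AR(1) errors.
          rng = np.random.default_rng(2)
          n = 100
          x = rng.normal(size=n)
          u = np.zeros(n)
          for t in range(1, n):
              u[t] = 0.8 * u[t - 1] + rng.normal()
          y = 1.0 + 2.0 * x + u

          # Same OLS point estimates; autocorrelation-robust (HAC)
          # standard errors.
          res_hac = sm.OLS(y, sm.add_constant(x)).fit(cov_type="HAC",
                                                      cov_kwds={"maxlags": 4})
          print(res_hac.bse)   # Newey-West standard errors of the estimators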
GLS to correct pure Autocorrelation
       If the coefficient of first-order autocorrelation ρ is known, the problem
       of autocorrelation can be easily solved. Consider a two-variable
       regression model:

           Y_t = \beta_1 + \beta_2 X_t + u_t    (3)

       Say u_t = \rho u_{t-1} + \epsilon_t , where −1 < ρ < 1. Lagging Eq. (3)
       by one period,

           Y_{t-1} = \beta_1 + \beta_2 X_{t-1} + u_{t-1}    (4)

       Multiply both sides of Eq. (4) by ρ and subtract the result from Eq. (3):

           Y_t − \rho Y_{t-1} = \beta_1 (1 − \rho) + \beta_2 (X_t − \rho X_{t-1}) + \epsilon_t

       where \epsilon_t = u_t − \rho u_{t-1} .
       Thus,

           Y_t^* = \beta_1^* + \beta_2^* X_t^* + \epsilon_t    (5)

       where Y_t^* = Y_t − \rho Y_{t-1} , X_t^* = X_t − \rho X_{t-1} ,
       \beta_1^* = \beta_1 (1 − \rho), and \beta_2^* = \beta_2 .
       Now this error term satisfies the usual OLS assumptions, and one can
       apply OLS to the transformed model to estimate the betas.
       Recall that GLS is nothing but OLS applied to a transformed model
       that satisfies the classical assumptions.
       Eq. (5) is also known as the generalized difference equation (a code
       sketch follows).
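       A sketch of the generalized difference transformation in Python,
       assuming ρ is known (here fixed at 0.8 for the hypothetical data);
       note that β_1 must be recovered from β_1^* = β_1(1 − ρ):

           import numpy as np
           import statsmodels.api as sm

           rho = 0.8   # assumed known here, for illustration only

           # Hypothetical data whose errors follow u_t = rho*u_{t-1} + e_t.
           rng = np.random.default_rng(3)
           n = 100
           x = rng.normal(size=n)
           u = np.zeros(n)
           for t in range(1, n):
               u[t] = rho * u[t - 1] + rng.normal()
           y = 1.0 + 2.0 * x + u

           # Generalized differences (the first observation is lost).
           y_star = y[1:] - rho * y[:-1]
           x_star = x[1:] - rho * x[:-1]

           res_gls = sm.OLS(y_star, sm.add_constant(x_star)).fit()
           b1_star, b2_hat = res_gls.params
           b1_hat = b1_star / (1.0 - rho)   # beta1* = beta1 * (1 - rho)
           print(f"beta1 = {b1_hat:.3f}, beta2 = {b2_hat:.3f}")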
       When ρ is not known
       First-Difference Model. If ρ = 1, the generalized difference
       equation (5) reduces to the first-difference model:

           Y_t − Y_{t-1} = \beta_2 (X_t − X_{t-1}) + (u_t − u_{t-1})

           \Delta Y_t = \beta_2 \Delta X_t + \epsilon_t    (6)

       where Δ is the first-difference operator. Note that the intercept drops
       out, since \beta_1 (1 − \rho) = 0 when ρ = 1.
       The first-difference transformation may therefore be appropriate when
       the coefficient of autocorrelation is very high (close to 1); a sketch
       follows.
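       A corresponding sketch for the first-difference model; since the
       intercept drops out, the regression is run through the origin (data
       hypothetical, as above):

           import numpy as np
           import statsmodels.api as sm

           # Hypothetical data with unit-root errors (rho = 1: a random walk).
           rng = np.random.default_rng(4)
           n = 100
           x = rng.normal(size=n)
           u = np.cumsum(rng.normal(size=n))
           y = 1.0 + 2.0 * x + u

           # Delta y on Delta x with NO intercept: beta1 cancels out.
           dy = np.diff(y)
           dx = np.diff(x)
           res_fd = sm.OLS(dy, dx).fit()   # no add_constant: through the origin
           print(res_fd.params)   # estimate of beta2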
       Example: our wages–productivity regression. Rerunning the
       regression in first-difference form gives

           \widehat{\Delta Y_t} = 0.653 \, \Delta X_t

       with t = 11.40, r^2 = 0.426, and d = 1.7442.
       The d value has increased dramatically, perhaps indicating that there
       is little autocorrelation in the first-difference regression.