Auto Correlation
12.1 Motivation
• Autocorrelation occurs when something that happens today has an impact on what happens tomorrow, i.e., the errors in the model are correlated over time.
• Note: Autocorrelation can only happen into the past, not into the future.
    ε_i = ρ ε_{i-1} + u_i
• Note: In general we can have AR(p) errors, which implies p lagged terms in the error structure, i.e.,
    ε_i = ρ_1 ε_{i-1} + ρ_2 ε_{i-2} + · · · + ρ_p ε_{i-p} + u_i
• Note: We will need |ρ| < 1 for stability and stationarity. If |ρ| < 1 happens to fail then we have the following problems:
    1. ρ = 1: The process is a random walk and its variance grows without bound
    2. ρ > 1: The process explodes
• The consequences for OLS: β̂ is unbiased and consistent but no longer efficient, and the usual statistical inference is rendered invalid.
• Lemma:
    ε_i = Σ_{j=0}^∞ ρ^j u_{i-j}
• Proof:
    ε_i = ρ ε_{i-1} + u_i
  and ε_{i-1} = ρ ε_{i-2} + u_{i-1}, so that
    ε_i = ρ² ε_{i-2} + ρ u_{i-1} + u_i
  If we continue to substitute for ε_{i-k} we get
    ε_i = Σ_{j=0}^∞ ρ^j u_{i-j}
• Taking expectations,
    E[ε_i] = E[Σ_{j=0}^∞ ρ^j u_{i-j}] = Σ_{j=0}^∞ ρ^j E[u_{i-j}] = Σ_{j=0}^∞ ρ^j · 0 = 0
• The variance of ε is
    var(ε_i) = E[ε_i²] = E[(ρ ε_{i-1} + u_i)²] = ρ² E[ε_{i-1}²] + 2ρ E[ε_{i-1} u_i] + E[u_i²]
• Note: E[u_i u_j] = 0 for all i ≠ j via the white-noise assumption, so the cross term E[ε_{i-1} u_i] is wiped out. This is not the same as claiming E[ε_i ε_j] = 0. Therefore,
    var(ε_i) = σ_u² + ρ² var(ε_{i-1})
  But, assuming homoscedasticity, var(ε_i) = var(ε_{i-1}), so that
    var(ε_i) = σ_u² + ρ² var(ε_i)
    var(ε_i) = σ_u² / (1 − ρ²) ≡ σ²
• Note: This is why we need |ρ| < 1 for stability in the process.
• If |ρ| > 1 then the denominator is negative, but var(ε_i) cannot be negative.
• The covariance between adjacent errors is
    cov(ε_i, ε_{i-1}) = E[ε_i ε_{i-1}] = E[(ρ ε_{i-1} + u_i) ε_{i-1}] = ρ var(ε_{i-1}) = (ρ / (1 − ρ²)) σ_u²
• In general,
    cov(ε_i, ε_{i-j}) = E[ε_i ε_{i-j}] = (ρ^j / (1 − ρ²)) σ_u²
• which implies that

    σ²Ω = (σ_u² / (1 − ρ²)) ×
          [ 1         ρ         ρ²       · · ·   ρ^{N−1} ]
          [ ρ         1         ρ        · · ·   ρ^{N−2} ]
          [ ρ²        ρ         1        · · ·   ρ^{N−3} ]
          [ ·         ·         ·        · · ·   ·       ]
          [ ρ^{N−1}   ρ^{N−2}   · · ·    ρ       1       ]
• The first-order autocorrelation is then
    corr(ε_i, ε_{i-1}) = cov(ε_i, ε_{i-1}) / √(var(ε_i) var(ε_{i-1}))
                       = [(ρ / (1 − ρ²)) σ_u²] / [(1 / (1 − ρ²)) σ_u²] = ρ
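As a sanity check on these results, a short simulation (a sketch with an arbitrary ρ = 0.7 and σ_u = 1, not values from the notes) can confirm that var(ε) ≈ σ_u²/(1 − ρ²) and corr(ε_i, ε_{i-1}) ≈ ρ:

```python
import random

random.seed(42)
rho, sigma_u, N = 0.7, 1.0, 200_000

# Simulate eps_i = rho * eps_(i-1) + u_i with white-noise u_i ~ N(0, sigma_u^2)
eps = [0.0]
for _ in range(N):
    eps.append(rho * eps[-1] + random.gauss(0.0, sigma_u))
eps = eps[5000:]  # discard burn-in so the start-up value does not matter

mean = sum(eps) / len(eps)
var = sum((e - mean) ** 2 for e in eps) / len(eps)
cov1 = sum((eps[i] - mean) * (eps[i - 1] - mean)
           for i in range(1, len(eps))) / (len(eps) - 1)

print(round(var, 2))         # theory: sigma_u^2 / (1 - rho^2) = 1.96
print(round(cov1 / var, 2))  # theory: corr(eps_i, eps_(i-1)) = rho = 0.7
```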
• Note: If we know Ω then we can apply our previous results of GLS for an easy fix.
• This implies that the naive estimate σ²(X′X)⁻¹ tends to understate the true covariance σ²(X′X)⁻¹X′ΩX(X′X)⁻¹ if ρ > 0.
• This implies that t-statistics are over-stated and we may commit Type I errors in our inferences.
• How do we know if we have autocorrelation or not? One non-parametric option is the Runs Test.
• Take the sign of each residual and write them out in time order, e.g.,
    + + + + − − − − − − − + + + + − + − − − + + + + + +
• Let a "run" be an uninterrupted sequence of the same sign, and let the "length" of a run be the number of residuals in that sequence.
• Here we have 7 runs: 4 plus, 7 minus, 4 plus, 1 minus, 1 plus, 3 minus, 6 plus.
    N = n1 + n2     Total observations
    n1              Number of positive residuals
    n2              Number of negative residuals
    k               Number of runs
  where
    E[k] = 2 n1 n2 / (n1 + n2) + 1 ;   σ_k² = 2 n1 n2 (2 n1 n2 − n1 − n2) / ((n1 + n2)² (n1 + n2 − 1))
• Here we have n1 = 15, n2 = 11, and k = 7, thus E[k] = 13.69, σ_k² = 5.93, and σ_k = 2.43.
• Under the null of random errors, z = (k − E[k])/σ_k = (7 − 13.69)/2.43 = −2.75, so we reject the null hypothesis that the errors are truly random.
• In STATA, after a reg command, calculate the fitted residuals and use the runtest command.
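The runs-test arithmetic above is easy to script; this sketch reproduces the numbers for the 26-residual example (tiny rounding differences from the notes aside):

```python
import math

def runs_test(signs):
    """Runs test on a '+'/'-' residual-sign string: returns the number of
    runs k, E[k], var(k), and the z-statistic under the null of randomness."""
    n1, n2 = signs.count('+'), signs.count('-')
    N = n1 + n2
    k = 1 + sum(1 for a, b in zip(signs, signs[1:]) if a != b)
    E_k = 2 * n1 * n2 / N + 1
    var_k = 2 * n1 * n2 * (2 * n1 * n2 - n1 - n2) / (N ** 2 * (N - 1))
    z = (k - E_k) / math.sqrt(var_k)
    return k, E_k, var_k, z

# The 26-residual example from the notes: runs of 4+, 7-, 4+, 1-, 1+, 3-, 6+
signs = '+' * 4 + '-' * 7 + '+' * 4 + '-' + '+' + '-' * 3 + '+' * 6
k, E_k, var_k, z = runs_test(signs)
print(k, round(E_k, 2), round(var_k, 2), round(z, 2))  # → 7 13.69 5.94 -2.75
```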
4. Durbin-Watson Test
• The Durbin-Watson test is a very popular test for AR(1) error terms.
• Assumptions: the X's are non-stochastic (in particular, no lagged dependent variables on the right-hand side) and the regression includes an intercept.
• The test statistic is
    d = Σ_{t=2}^N (ε̂_t − ε̂_{t-1})² / Σ_{t=1}^N ε̂_t²
  which is equivalent to
    d = ε̂′Aε̂ / ε̂′ε̂   where A =
        [  1  −1   0  · · ·   0   0 ]
        [ −1   2  −1  · · ·   0   0 ]
        [  0  −1   2  · · ·   0   0 ]
        [  ·   ·   ·  · · ·   ·   · ]
        [  0   0  · · ·  −1   2  −1 ]
        [  0   0  · · ·   0  −1   1 ]
• Some statistical packages report the Durbin-Watson statistic for every regression by default.
• A rule of thumb for the DW test: a statistic very close to 2, either above or below, suggests little evidence of first-order autocorrelation.
• There is a potential problem with the DW test, however. The DW test has three
  regions: We can reject the null, we can fail to reject the null, or we may have an
  inconclusive result.
• The reason for the ambiguity is that the DW statistic does not follow a standard distribution. The distribution of the statistic depends on the ε̂_t, which in turn depend on the X_t's in the model. Further, each application of the test has a different number of degrees of freedom.
• To implement the Durbin-Watson test:
    (a) Estimate the model via OLS and compute d from the fitted residuals.
    (b) Using N, the number of observations, and k, the number of rhs variables (excluding the intercept), determine the upper and lower bounds of the DW statistic, DW_L and DW_U, from the DW tables.
• Then if
    d < DW_L                  Reject H0: Evidence of positive correlation
    DW_L ≤ d ≤ DW_U           Inconclusive
    DW_U < d < 4 − DW_U       Fail to reject H0
    4 − DW_U ≤ d ≤ 4 − DW_L   Inconclusive
    d > 4 − DW_L              Reject H0: Evidence of negative correlation
• For example, let N = 25, k = 3; then DW_L = 0.906 and DW_U = 1.409. If d = 1.78 then d > DW_U but d < 4 − DW_U and we fail to reject the null.
• Some packages will generate a P-value for the calculated DW statistic. If not, then non-parametric tests may be used instead.
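Computing d itself is a one-liner; here is a sketch on a short, made-up residual series whose signs cluster (positive autocorrelation), which drives d well below 2:

```python
def durbin_watson(e):
    """d = sum_t (e_t - e_(t-1))^2 / sum_t e_t^2 on a residual series."""
    num = sum((e[t] - e[t - 1]) ** 2 for t in range(1, len(e)))
    den = sum(et ** 2 for et in e)
    return num / den

# A made-up residual series whose signs cluster (positive autocorrelation)
e = [1.0, 1.2, 0.9, 1.1, 0.8, -0.2, -0.9, -1.1, -1.0, -0.7, 0.1, 0.8, 1.0]
d = durbin_watson(e)
print(round(d, 2))  # well below 2, pointing to positive autocorrelation
```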
• Let's look a little closer at our DW statistic.
    DW = [Σ_{i=2}^N ε̂_i² − 2 Σ_{i=2}^N ε̂_i ε̂_{i-1} + Σ_{i=2}^N ε̂_{i-1}²] / ε̂′ε̂
       = [ε̂′ε̂ − 2 Σ_{i=2}^N ε̂_i ε̂_{i-1} + ε̂′ε̂ − ε̂_1² − ε̂_N²] / ε̂′ε̂
  Therefore,
    DW = [2 ε̂′ε̂ − 2 Σ_{i=2}^N ε̂_i ε̂_{i-1} − ε̂_1² − ε̂_N²] / ε̂′ε̂
       = 2 − [2 Σ_{i=2}^N (ρ̂ ε̂_{i-1} + u_i) ε̂_{i-1} + ε̂_1² + ε̂_N²] / ε̂′ε̂
  Defining
    γ_1 = Σ_{i=2}^N ε̂_{i-1}² / ε̂′ε̂   and   γ_2 = (ε̂_1² + ε̂_N²) / ε̂′ε̂
  and noting that Σ u_i ε̂_{i-1} ≈ 0, we have DW ≈ 2 − 2ρ̂γ_1 − γ_2. For large N, γ_1 → 1 and γ_2 → 0, so DW ≈ 2(1 − ρ̂).
5. Durbin’s h-Test
• The Durbin-Watson test assumes that X is non-stochastic. This may not always
be the case, e.g., if we include lagged dependent variables on the right-hand side.
     • Durbin offers an alternative test in this case.
    h = (1 − d/2) √( N / (1 − N · var(α̂)) )
  where α̂ is the estimated coefficient on the lagged dependent variable; under the null of no autocorrelation, h is asymptotically N(0, 1).
• Note: If N · var(α̂) > 1 then we have a problem because we can't take the square root of a negative number.
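A small helper makes the h-test mechanical; the inputs below (d = 1.5, N = 100, var(α̂) = 0.004) are hypothetical, chosen only to illustrate both the computation and the N·var(α̂) ≥ 1 failure case:

```python
import math

def durbin_h(d, N, var_alpha):
    """Durbin's h; var_alpha is the estimated variance of the coefficient on
    the lagged dependent variable. Returns None when N*var_alpha >= 1,
    in which case h is undefined (negative number under the root)."""
    if N * var_alpha >= 1:
        return None
    return (1 - d / 2) * math.sqrt(N / (1 - N * var_alpha))

print(round(durbin_h(1.5, 100, 0.004), 3))  # → 3.227
print(durbin_h(1.5, 300, 0.004))            # → None (300 * 0.004 > 1)
```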
6. Wald Test
7. Breusch-Godfrey Test
• Regress ²̂i on Xi , ²̂i−1 , . . . , ²̂i−p and obtain N R2 ∼ χ2p where p is the number of
lagged residuals.
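Stata's bgodfrey implements this, but the mechanics are simple enough to sketch directly. The helper below runs the auxiliary regression and returns N·R²; the data are simulated (hypothetical, not from the notes), with AR(1) errors so the test should reject:

```python
import random

def ols(X, y):
    """OLS coefficients via the normal equations (X'X)b = X'y,
    solved by Gauss-Jordan elimination with partial pivoting."""
    k = len(X[0])
    XtX = [[sum(r[i] * r[j] for r in X) for j in range(k)] for i in range(k)]
    Xty = [sum(r[i] * yi for r, yi in zip(X, y)) for i in range(k)]
    A = [row[:] + [b] for row, b in zip(XtX, Xty)]  # augmented matrix
    for c in range(k):
        p = max(range(c, k), key=lambda r: abs(A[r][c]))
        A[c], A[p] = A[p], A[c]
        for r in range(k):
            if r != c and A[r][c] != 0.0:
                f = A[r][c] / A[c][c]
                A[r] = [a - f * b for a, b in zip(A[r], A[c])]
    return [A[i][k] / A[i][i] for i in range(k)]

def breusch_godfrey(X, e, p):
    """Auxiliary regression of e_t on X_t and p lagged residuals
    (dropping the first p observations); returns N*R^2 ~ chi2(p)."""
    rows = [X[t] + [e[t - j] for j in range(1, p + 1)] for t in range(p, len(e))]
    y = e[p:]
    b = ols(rows, y)
    fit = [sum(bi * xi for bi, xi in zip(b, r)) for r in rows]
    ybar = sum(y) / len(y)
    ss_res = sum((yi - fi) ** 2 for yi, fi in zip(y, fit))
    ss_tot = sum((yi - ybar) ** 2 for yi in y)
    return len(y) * (1 - ss_res / ss_tot)

# Simulated (hypothetical) data: y = 1 + 2x + eps, eps AR(1) with rho = 0.6
random.seed(1)
N, rho = 500, 0.6
x = [random.gauss(0, 1) for _ in range(N)]
eps, prev = [], 0.0
for _ in range(N):
    prev = rho * prev + random.gauss(0, 1)
    eps.append(prev)
y = [1 + 2 * xi + ei for xi, ei in zip(x, eps)]
X = [[1.0, xi] for xi in x]
b = ols(X, y)
e = [yi - (b[0] + b[1] * xi) for yi, xi in zip(y, x)]
stat = breusch_godfrey(X, e, 1)
print(stat > 3.84)  # True: N*R^2 exceeds the chi2(1) 5% critical value
```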
8. Box-Pierce Test
    Q = N Σ_{i=1}^L r_i²,   where r_i = Σ_{j=i+1}^N ε̂_j ε̂_{j-i} / Σ_{j=1}^N ε̂_j²
  Under the null of no autocorrelation, Q is asymptotically χ² with L degrees of freedom.
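A sketch of Q on simulated AR(1) residuals (ρ = 0.6, hypothetical data) shows the statistic far exceeding the χ²₅ 5% critical value of 11.07:

```python
import random

def box_pierce(e, L):
    """Q = N * sum_{i=1..L} r_i^2, with r_i the lag-i autocorrelation of e."""
    N = len(e)
    den = sum(x * x for x in e)
    r = [sum(e[j] * e[j - i] for j in range(i, N)) / den
         for i in range(1, L + 1)]
    return N * sum(ri * ri for ri in r)

# Hypothetical residuals from an AR(1) process with rho = 0.6
random.seed(7)
rho, N = 0.6, 1000
e, prev = [], 0.0
for _ in range(N):
    prev = rho * prev + random.gauss(0, 1)
    e.append(prev)
q = box_pierce(e, 5)
print(q > 11.07)  # True: Q far exceeds the chi2(5) 5% critical value
```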
• One way to fix the problem is to get the error term of the estimated equation to satisfy the full ideal conditions. One way to do this might be through substitution. Starting from
    y_t = β_0 + β_1 X_t + ε_t,   ε_t = ρ ε_{t-1} + u_t
  substitution gives
    y_t = β_0 + β_1 X_t + ρ ε_{t-1} + u_t
  and, since ε_{t-1} = y_{t-1} − β_0 − β_1 X_{t-1},
    y_t − ρ y_{t-1} = β_0(1 − ρ) + β_1(X_t − ρ X_{t-1}) + u_t
• We can estimate the transformed model, which satisfies the full ideal conditions, as
    y_t* = β_0* + β_1 X_t* + u_t,   where y_t* = y_t − ρ y_{t-1}, X_t* = X_t − ρ X_{t-1}, β_0* = β_0(1 − ρ)
• One downside is the loss of the first observation, which can be a considerable sacrifice in degrees of freedom. For instance, if our sample size were 30 observations, this would mean giving up more than 3% of the sample.
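This substitution idea is the Cochrane-Orcutt procedure, usually iterated: estimate β by OLS, estimate ρ from the residuals, quasi-difference the data, re-estimate, and repeat. A pure-Python sketch on simulated data (ρ = 0.8, β = (1, 2), all hypothetical):

```python
import random

def ols_line(x, y):
    """Simple-regression OLS: returns (intercept, slope)."""
    n = len(x)
    xbar, ybar = sum(x) / n, sum(y) / n
    b1 = (sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
          / sum((xi - xbar) ** 2 for xi in x))
    return ybar - b1 * xbar, b1

def cochrane_orcutt(x, y, iterations=10):
    """Iterated Cochrane-Orcutt: OLS, estimate rho from residuals,
    quasi-difference (losing the first observation), re-estimate, repeat."""
    b0, b1 = ols_line(x, y)
    rho = 0.0
    for _ in range(iterations):
        e = [yi - b0 - b1 * xi for xi, yi in zip(x, y)]
        rho = (sum(e[i] * e[i - 1] for i in range(1, len(e)))
               / sum(ei ** 2 for ei in e[1:]))
        ys = [y[t] - rho * y[t - 1] for t in range(1, len(y))]
        xs = [x[t] - rho * x[t - 1] for t in range(1, len(x))]
        a0, b1 = ols_line(xs, ys)
        b0 = a0 / (1 - rho)  # recover beta0 from beta0 * (1 - rho)
    return b0, b1, rho

# Hypothetical data: y = 1 + 2x + eps with AR(1) errors, rho = 0.8
random.seed(3)
N, rho_true = 2000, 0.8
x = [random.gauss(0, 1) for _ in range(N)]
eps, prev = [], 0.0
for _ in range(N):
    prev = rho_true * prev + random.gauss(0, 1)
    eps.append(prev)
y = [1 + 2 * xi + ei for xi, ei in zip(x, eps)]
b0, b1, rho = cochrane_orcutt(x, y)
print(round(b1, 1), round(rho, 1))  # close to the true values 2 and 0.8
```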
  where
    Ω = (1 / (1 − ρ²)) ×
        [ 1         ρ         ρ²     · · ·   ρ^{N−1} ]
        [ ρ         1         ρ      · · ·   ρ^{N−2} ]
        [ ·         ·         ·      · · ·   ·       ]
        [ ρ^{N−2}   · · ·     ·      1       ρ       ]
        [ ρ^{N−1}   · · ·     ·      ρ       1       ]
• Note that for GLS we seek Ω^{−1/2} such that Ω^{−1/2} Ω (Ω^{−1/2})′ = I and transform the model. Thus we estimate
    Ω^{−1/2} y = Ω^{−1/2} X β + Ω^{−1/2} ε
  where
    Ω^{−1/2} =
        [ √(1 − ρ²)   0     0   · · ·    0 ]
        [ −ρ          1     0   · · ·    0 ]
        [ 0          −ρ     1   · · ·    0 ]
        [ ·           ·     ·   · · ·    · ]
        [ 0         · · ·   ·    −ρ      1 ]
• This implies that
    1st observation:         √(1 − ρ²) y_1 = √(1 − ρ²) X_1 β + √(1 − ρ²) ε_1
    observations 2, …, N:    y_i − ρ y_{i-1} = (X_i − ρ X_{i-1}) β + u_i
  where u_i = ε_i − ρ ε_{i-1}
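A quick simulation (hypothetical ρ = 0.8, σ_u = 1) confirms what the transformation buys us: the transformed errors have constant variance σ_u² and no first-order correlation:

```python
import math
import random

random.seed(9)
rho, sigma_u, N = 0.8, 1.0, 100_000

# Simulate stationary AR(1) errors; draw eps_1 from the stationary distribution
eps = [random.gauss(0.0, sigma_u / math.sqrt(1 - rho ** 2))]
for _ in range(N - 1):
    eps.append(rho * eps[-1] + random.gauss(0.0, sigma_u))

# Prais-Winsten transform: keep the first observation, quasi-difference the rest
star = [math.sqrt(1 - rho ** 2) * eps[0]]
star += [eps[i] - rho * eps[i - 1] for i in range(1, N)]

var = sum(s * s for s in star) / N
lag1 = sum(star[i] * star[i - 1] for i in range(1, N)) / (N - 1)
print(round(var, 2), round(lag1 / var, 2))  # near sigma_u^2 = 1 and 0
```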
• Thus,
    cov(β̃) = σ_u² (X′Ω⁻¹X)⁻¹
  and
    σ̃² = (1/N) (y − Xβ̃)′ Ω⁻¹ (y − Xβ̃)
• Methods of estimating ρ
  1. Use the OLS residuals:
    ρ̂ = Σ_{i=2}^N ε̂_i ε̂_{i-1} / Σ_{i=2}^N ε̂_i²
  2. Durbin's Method (1960): Estimate the unrestricted regression
    y_i = ρ y_{i-1} + β_0(1 − ρ) + β_1 X_i − ρβ_1 X_{i-1} + u_i
  by OLS. From this we obtain ρ̂, which is the coefficient on y_{i-1}. This parameter estimate is biased but consistent.
We can correct the covariance matrix of β̂ much like we did in the case of heteroscedasticity. This extension of White (1980) was offered by Newey and West:
    cov̂(β̂) = (X′X)⁻¹ [X′ΩX]̂ (X′X)⁻¹
  where
    [X′ΩX]̂ = (1/N) Σ_{i=1}^N ε̂_i² X_i X_i′ + (1/N) Σ_{i=1}^L Σ_{j=i+1}^N ω_i ε̂_j ε̂_{j-i} (X_j X_{j-i}′ + X_{j-i} X_j′)
  where
    ω_i = 1 − i / (L + 1)
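The Bartlett weights ω_i and the middle matrix above are straightforward to code; this is a sketch of the formula as written, not production code (statsmodels and Stata's newey implement the full estimator):

```python
def bartlett_weights(L):
    """Newey-West (Bartlett) weights w_i = 1 - i/(L+1), i = 1, ..., L."""
    return [1 - i / (L + 1) for i in range(1, L + 1)]

def nw_middle(X, e, L):
    """The (1/N) [X'Omega X] estimate from the formula above; X is a list of
    regressor rows, e the fitted residuals, L the lag truncation."""
    N, k = len(X), len(X[0])
    w = bartlett_weights(L)
    # lag-0 (White) term
    S = [[sum(e[t] ** 2 * X[t][a] * X[t][b] for t in range(N)) / N
          for b in range(k)] for a in range(k)]
    # weighted cross-lag terms, kept symmetric by adding both outer products
    for i in range(1, L + 1):
        for t in range(i, N):
            c = w[i - 1] * e[t] * e[t - i] / N
            for a in range(k):
                for b in range(k):
                    S[a][b] += c * (X[t][a] * X[t - i][b] + X[t - i][a] * X[t][b])
    return S

print(bartlett_weights(3))  # → [0.75, 0.5, 0.25]
```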
A possible problem in this approach is determining L, or how far back into the past the correlation in the errors extends. In practice we can tell when we have included enough lags because the estimated coefficient on the k-th lag will be insignificant.
• This approach is useful in time-series studies with lots of data, where you are safely within the large-sample conditions the estimator requires.
• Having estimated β̃_GLS, we know that β̃_GLS is BLUE when cov(ε) = σ²Ω with Ω ≠ I.
• With an AR(1) process, we know that tomorrow's error depends upon today's error, and we can exploit this when forecasting.
• We estimate
    y_t = X_t β + ε_t,   where ε_t = ρ ε_{t-1} + u_t
• The forecast becomes
    ŷ_{t+1} = X_{t+1} β̃ + ρ̂ ε̃_t
• To finish the forecast, we need ρ̂ from our previous estimation techniques, and then we recognize that
    ε̃_t = y_t − X_t β̃
• What if X_{t+1} doesn't exist? This occurs when we try to perform out-of-sample forecasting. Perhaps we use X_t β̃?
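A simulation illustrates the payoff: adding ρ̂ε̃_t to the forecast cuts the one-step-ahead MSE from var(ε) = σ_u²/(1 − ρ²) down toward var(u). Here β and ρ are plugged in at their true values purely for illustration (a sketch; in practice they come from the estimation above):

```python
import random

# Hypothetical model y_t = 1 + 2 x_t + eps_t with AR(1) errors, rho = 0.7
random.seed(5)
b0, b1, rho = 1.0, 2.0, 0.7
N = 3000
x = [random.gauss(0, 1) for _ in range(N)]
eps, prev = [], 0.0
for _ in range(N):
    prev = rho * prev + random.gauss(0, 1)
    eps.append(prev)
y = [b0 + b1 * xi + ei for xi, ei in zip(x, eps)]

# One-step-ahead forecasts with and without the rho * (today's error) term
se_ar, se_naive = [], []
for t in range(1000, N - 1):
    e_t = y[t] - (b0 + b1 * x[t])          # today's estimated error
    f_ar = b0 + b1 * x[t + 1] + rho * e_t  # exploits the AR(1) structure
    f_naive = b0 + b1 * x[t + 1]           # ignores it
    se_ar.append((y[t + 1] - f_ar) ** 2)
    se_naive.append((y[t + 1] - f_naive) ** 2)
mse_ar = sum(se_ar) / len(se_ar)
mse_naive = sum(se_naive) / len(se_naive)
print(mse_ar < mse_naive)  # True: the correction lowers forecast MSE
```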
• In this example we look at the relationship between the U.S. average retail price of gasoline and the wholesale price of gasoline from January 1985 through February 2006, using the Stata data file gasprices.dta.
• As an initial step, we plot the two series over time and notice a highly correlated set
of series:
[Figure: time-series plot of allgradesprice and wprice, which move closely together]
• The results suggest that for every penny increase in the wholesale price, there is a 1.21 penny increase in the average retail price of gasoline. The constant term suggests that, on average, there is approximately a 32 cent difference between retail and wholesale prices.
   . dwstat
  Durbin-Watson d-statistic(         2,        254) =    .1905724
• The DW statistic suggests that the data suffer from significant autocorrelation. Re-
  versing out an estimate of ρ̂ = 1 − d/2 suggests that ρ = 0.904.
 .    reg allgradesprice wprice, r
• The robust regression results suggest that the naive OLS over-states the variance in
       the parameter estimate on wprice, but the positive value of ρ suggests the opposite is
       likely true.
F(1,252) =   3558.42
Prob > F       =    0.0000
----------------------------------------------------------------------
           |            Newey-West
allgrades |      Coef.   Std. Err.    t    P>|t| [95% Conf. Interval]
----------+-----------------------------------------------------------
   wprice |   1.219083   .0204364 59.65    0.000 1.178835     1.259331
    _cons |   31.98693   2.023802 15.81    0.000 28.00121     35.97265
     • The Newey-West corrected standard errors, assuming AR(1) errors, are significantly
       higher than the robust OLS standard errors but are only slightly lower than those in
naive OLS.
  • Prais-Winsten using Cochrane-Orcutt transformation (note: the first observation is
lost):
------------------------------------------------------------------------------
Durbin-Watson statistic (original)    0.190572
Durbin-Watson statistic (transformed) 2.052344
• Notice that both Prais-Winsten results reduce the parameter on WPRICE and increase its standard error. The t-statistic drops, although the qualitative result doesn't change.
• In both cases, the DW stat on the transformed data is nearly two, indicating zero
autocorrelation.
• We can try the "large sample fix" by going back to the original model and including lagged values of the dependent variable on the right-hand side:
. durbina
---------------------------------------------------------------------------
                        H0: no serial correlation
• The large sample fix suggests a smaller parameter estimate on WPRICE; the standard error is larger and the t-statistic is much lower than in the original OLS model.
• In this case, we included three lagged values of the dependent variable. Note that they are all significant. If we include four or more lags, the fourth (and higher) lags are insignificant. Notice that the marginal effect of wholesale price on retail price is smaller than in the naive OLS model.
• In our next example, the dependent variable is the percentage of people who answer "I don't know" to the Gallup poll question "How is the president doing in his job?" The data are posted at the course website.
  • Our first step is to take a crack at the standard OLS model:
------------------------------------------------------------------------------
    dontknow |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
     newpres |   4.876033   .6340469     7.69   0.000     3.624252    6.127813
unemployment | -.6698213    .1519708    -4.41   0.000    -.9698529   -.3697897
     eleyear |   -1.94002   .5602409    -3.46   0.001    -3.046087   -.8339526
   inflation |   .1646866   .0770819     2.14   0.034     .0125061    .3168672
       _cons |   16.12475   .9076888    17.76   0.000     14.33273    17.91678
  • Things look pretty good. All variables are statistically significant and take reasonable
    values and signs. We wonder if there is autocorrelation in the data, as the data are
    time series. If there is autocorrelation it is possible that the standard errors are biased
    downwards, t-stats are biased upwards, and Type I errors are possible (falsely rejecting
    the null hypothesis).
We grab the fitted residuals from the above regression: . predict e1, resid. We then plot the residuals using scatter and tsline (twoway tsline e1 || scatter e1 yearq):
[Figure: fitted residuals plotted over time]
• It's not readily apparent, but the data look to be AR(1) with positive autocorrelation. How do we know? A positive residual tends to be followed by another positive residual, and a negative residual by another negative residual.
[Figure: partial autocorrelations of e1 through lag 40, with 95% confidence bands (se = 1/sqrt(n))]
  • We see that the first lag is the most important, the other lags (4, 27, 32) are also
    important statistically, but perhaps not economically/politically.
  • Can we test for AR(1) process in a more statistically valid way? How about the Runs
    test? Use the STATA command runtest and give the command the error term defined
    above, e1.
. runtest e1
 N(e1 <= -.3066953718662262) = 86
 N(e1 > -.3066953718662262) = 86
        obs = 172
    N(runs) = 54
         z = -5.05
   Prob>|z| = 0
• Looks like the error terms are not distributed randomly (the p-value is small). The threshold(0) option splits the residuals at zero rather than at the median:
. runtest e1, threshold(0)
 N(e1 <= 0) = 97
 N(e1 > 0) = 75
        obs =    172
    N(runs) =    56
         z =     -4.6
   Prob>|z| =    0
• It still looks like the error terms are not distributed randomly.
. dwstat
Durbin-Watson d-statistic(  5,   172) =  1.016382
• The results suggest there is positive autocorrelation (DW stat is less than 2). We can
test this more directly by regressing the current error term on the previous period’s
error term
. reg e1 l.e1
------------------------------------------------------------------------------
e1           |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
e1           |
          L1 |   .4827093    .066574     7.25   0.000     .3512854    .6141331
_cons        | -.0343568    .2074016    -0.17   0.869    -.4437884    .3750747
  • The l.e1 variable tells STATA to use the once-lagged value of e1. In the results, notice
the L1 tag for e1 - the parameter estimate suggests positive autocorrelation with rho
close to 0.48.
• Just for giggles, we find that 2·(1 − ρ̂) = 2·(1 − 0.483) = 1.034 is "close" to the reported DW stat.
. reg e1 l.e1,noc
------------------------------------------------------------------------------
e1           |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
e1           |
          L1 |   .4826932   .0663833     7.27   0.000     .3516515    .6137348
------------------------------------------------------------------------------
e1           |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
e1           |
          L1 |   .4986584   .0766115     6.51   0.000     .3474132    .6499036
          L2 | -.0572775    .0759676    -0.75   0.452    -.2072515    .0926966
• It doesn't look like there is an AR(2) process. Let's test for AR(2) with the Breusch-Godfrey test:
------------------------------------------------------------------------------
e1           |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
e1           |
          L1 |   .5015542   .0775565     6.47   0.000     .3484092    .6546992
          L2 | -.0400182    .0795952    -0.50   0.616    -.1971888    .1171524
newpres      | -.5280386    .5741936    -0.92   0.359    -1.661855    .6057781
 ( 1)   L.e1 = 0
 ( 2)   L2.e1 = 0
       F(   2,   163) =      24.80
             Prob > F =       0.0000
• Notice that we reject the null hypothesis that the once and twice lagged error terms are jointly equal to zero. The t-stat on the twice lagged error term is not different from zero, however.
• An AR(1) process has been well confirmed. What do we do to "correct" the original model? One option is the Prais-Winsten estimator:
------------------------------------------------------------------------------
    dontknow |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
     newpres |   5.248822   .7448114     7.05   0.000     3.778298    6.719347
unemployment | -.6211503    .2441047    -2.54   0.012      -1.1031   -.1392004
     eleyear | -2.630427    .6519536    -4.03   0.000    -3.917617   -1.343238
   inflation |   .1577092   .1247614     1.26   0.208    -.0886145    .4040329
       _cons |   15.91768   1.474651    10.79   0.000     13.00619    18.82917
-------------+----------------------------------------------------------------
         rho |   .5022053
------------------------------------------------------------------------------
Durbin-Watson statistic (original)    1.016382
Durbin-Watson statistic (transformed) 1.941491
• Now, inflation is insignificant; did autocorrelation lead to a Type I error in the naive OLS? Notice the new DW statistic on the transformed data is close to 2. The Cochrane-Orcutt version gives similar results:
------------------------------------------------------------------------------
    dontknow |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
     newpres |   5.209751    .749466     6.95   0.000     3.730103      6.6894
unemployment | -.5766546    .2457891    -2.35   0.020    -1.061909   -.0914003
     eleyear | -2.707991    .6550137    -4.13   0.000    -4.001165   -1.414816
   inflation |   .1100198   .1229761     0.89   0.372    -.1327684    .3528079
       _cons |   15.98344   1.493388    10.70   0.000     13.03509     18.9318
-------------+----------------------------------------------------------------
         rho |   .5069912
------------------------------------------------------------------------------
Durbin-Watson statistic (original)    1.016382
Durbin-Watson statistic (transformed) 1.918532
                                                                   Prob > F           =       0.0000
------------------------------------------------------------------------------
             |             Newey-West
    dontknow |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
     newpres |   4.876033    1.06021     4.60   0.000      2.78289    6.969175
unemployment | -.6698213    .1502249    -4.46   0.000    -.9664059   -.3732366
     eleyear |   -1.94002   .5212213    -3.72   0.000    -2.969052   -.9109878
   inflation |   .1646866    .083663     1.97   0.051    -.0004868    .3298601
       _cons |   16.12475   .9251126    17.43   0.000     14.29833    17.95117
------------------------------------------------------------------------------
             |               Robust
    dontknow |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
     newpres |   4.876033   .9489814     5.14   0.000     3.002486    6.749579
unemployment | -.6698213    .1257882    -5.32   0.000    -.9181613   -.4214812
     eleyear |   -1.94002    .419051    -4.63   0.000     -2.76734     -1.1127
   inflation |   .1646866   .0702247     2.35   0.020     .0260441    .3033292
       _cons |   16.12475   .7825265    20.61   0.000     14.57983    17.66967
• Here, inflation is still significant, although the standard error on inflation is a bit smaller with the robust correction than with Newey-West.
1. The Cochrane-Orcutt approach assumes a constant rho over the entire sample period.
2. The Prais-Winsten approach retains the first observation but likewise assumes a constant rho.
3. Newey-West standard errors do not adjust the parameter estimates but do alter the estimated standard errors.
4. Robust standard errors are perhaps the most flexible option; the correction might help, but robust standard errors are not guaranteed to accurately control for the first-order autocorrelation.