Econometrics: Assumption Violations

This document discusses violations of assumptions in regression analysis, including multicollinearity, heteroscedasticity, and autocorrelation. It defines these terms, provides examples of their causes and consequences, and discusses methods to detect and remedy them.

Chapter 4: Violation of Assumptions

Sied Hassen (PhD)

Department of Economics, Addis Ababa University

December, 2015
Outline

Multicollinearity
Heteroscedasticity
Autocorrelation
Multicollinearity

Multicollinearity refers to a linear relationship among some or
all explanatory variables of a multiple linear regression model.
The relationship can be perfect or exact, like

λ1 X1 + λ2 X2 + ... + λk Xk = 0 (1)

Or the relationship can be less than perfect, like

λ1 X1 + λ2 X2 + ... + λk Xk + vi = 0 (2)

where vi is an error term


As stated above, multicollinearity refers only to linear
correlation. It does not rule out a non-linear relation like

Yi = α + β1 X1 + β2 X1² + β3 X1³ + ui (3)


Causes of multicollinearity

Data collection method employed


sampling over a limited range of the values taken by the
explanatory variables in the population
constraints on the model or population
For example, in the regression of electricity consumption on
income (X1 ) and house size (X2 ) there is a physical constraint
in the population in that families with higher incomes generally
have larger homes than families with lower incomes.
model specification
for example, adding polynomial terms to a regression model,
especially when the range of the X variable is small.
An over-determined model
This happens when the model has more explanatory variables
than the number of observations
Consequences of Multicollinearity

If multicollinearity is perfect, the regression coefficients are


indeterminate and their standard errors are infinite
If multicollinearity is less than perfect, the regression
coefficients are determinate but their standard errors are large
This makes the t-values of the coefficients smaller, and hence
most of the coefficients appear individually insignificant
It also makes the model R² very large, usually greater than
80%
The high R² makes the slope coefficients jointly significant
Detection of Multicollinearity

High R² (> 0.8) but few significant coefficients

High pair-wise correlation among the explanatory
variables (r_{x1 x2} > 0.8)
This is a sufficient but not a necessary condition.
This is because it is possible for R² to be high even when the
pairwise correlations between the variables are low
Using R²_{x1 x2 x3 ...xk} from an auxiliary regression, i.e., the regression of
one explanatory variable on the remaining explanatory variables

X1 = λ1 X2 + λ2 X3 + ... + λk Xk (4)

In this case there is multicollinearity if R²_{x1 x2 x3 ...xk} from the
auxiliary regression is greater than R² from the regression of
Y on the explanatory variables (X's)
Detection of Multicollinearity

A more formal test uses R²_{x1 x2 x3 ...xk} from the auxiliary regression:

F = \frac{R^2_{x_1 x_2 x_3 \dots x_k}/(k-2)}{(1 - R^2_{x_1 x_2 x_3 \dots x_k})/(n-k+1)} (5)

If the computed F is greater than the critical F(k − 2, n − k + 1),
there is a multicollinearity problem
Variance inflation factor (VIF) or Tolerance (TOL)

VIF = \frac{1}{1 - R^2_{x_1 x_2 x_3 \dots x_k}} (6)

TOL = 1/VIF
If VIF > 10, TOL < 0.1 and R²_{x1 x2 x3 ...xk} > 0.9, it is an indication of
high collinearity
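As an illustration, the sketch below computes VIFs with statsmodels' variance_inflation_factor. The data and variable names are hypothetical, simulated so that the two regressors are nearly collinear.

```python
# A minimal sketch, assuming hypothetical data with two nearly collinear
# regressors x1 and x2.  VIF > 10 flags severe collinearity.
import numpy as np
import pandas as pd
import statsmodels.api as sm
from statsmodels.stats.outliers_influence import variance_inflation_factor

rng = np.random.default_rng(0)
x1 = rng.normal(size=100)
x2 = x1 + rng.normal(scale=0.1, size=100)   # almost a linear function of x1
X = sm.add_constant(pd.DataFrame({"x1": x1, "x2": x2}))

for i, name in enumerate(X.columns):
    print(name, variance_inflation_factor(X.values, i))
```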
Remedial Measure

Do nothing
or follow some rule of thumb
The rule of thumb
Drop the variable/variables that cause the problem of
multicollinearity
This may cause an omitted variable bias
Transform the variables: (1) First difference
Yt = α + β1 X1t + β2 X2t + ut (7)
The one-period lag of the above equation is
Yt−1 = α + β1 X1t−1 + β2 X2t−1 + ut−1 (8)
Taking the difference of the above two equations gives
Yt − Yt−1 = β1 (X1t − X1t−1 ) + β2 (X2t − X2t−1 ) + (ut − ut−1 ) (10)
This can be rewritten as
∆Yt = β1 ∆X1t + β2 ∆X2t + ∆ut (11)
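The sketch below applies this first-difference transformation and re-estimates equation (11) without an intercept; the series used are hypothetical, made up for illustration.

```python
# A minimal sketch of the first-difference transformation (hypothetical data).
import numpy as np
import statsmodels.api as sm

y  = np.array([2.0, 2.5, 3.1, 3.8, 4.2, 5.0, 5.9, 6.5])
x1 = np.array([1.0, 1.2, 1.5, 1.9, 2.1, 2.6, 3.0, 3.3])
x2 = np.array([0.5, 0.7, 0.8, 1.1, 1.2, 1.5, 1.7, 1.9])

dy, dx1, dx2 = np.diff(y), np.diff(x1), np.diff(x2)
# Equation (11) has no intercept, so regress dy on the differenced regressors only.
res = sm.OLS(dy, np.column_stack([dx1, dx2])).fit()
print(res.params)
```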
The rule of thumb

Transform the variables: (2) Ratio transformation

Yt = α + β1 X1t + β2 X2t + ut (12)


\frac{Y_t}{X_{2t}} = \frac{\alpha}{X_{2t}} + \beta_1 \frac{X_{1t}}{X_{2t}} + \beta_2 + \frac{u_t}{X_{2t}} (13)
Use additional or new data
Reduce collinearity in polynomial regression
Heteroscedasticity

Constant variance of the disturbance term is one of the
classical regression assumptions, i.e.,

E(ui²) = σ² = constant (14)

This means the variance of the error term is constant regardless of
the values of the explanatory variables.
If the variance varies with the values of the explanatory variables,
we have a heteroscedastic variance, i.e.,

E(ui²) = σi² ≠ constant (15)

This is most common in cross-sectional data.


Heteroscedasticity

Fig. (a): Homoscedastic error variance. Fig. (b): Heteroscedastic error variance.


Matrix Representation: Variance of the error term

E(UU') =
\begin{pmatrix}
E(U_1^2) & E(U_1 U_2) & \cdots & E(U_1 U_n) \\
E(U_1 U_2) & E(U_2^2) & \cdots & E(U_2 U_n) \\
\vdots & \vdots & & \vdots \\
E(U_1 U_n) & E(U_2 U_n) & \cdots & E(U_n^2)
\end{pmatrix}
Matrix Representation of Homoscedasticity
Assuming that there is no autocorrelation (E(Ui Uj) = 0 for i ≠ j), the
homoscedastic variance of the error term is given by

E(UU') =
\begin{pmatrix}
\sigma^2 & 0 & \cdots & 0 \\
0 & \sigma^2 & \cdots & 0 \\
\vdots & \vdots & & \vdots \\
0 & 0 & \cdots & \sigma^2
\end{pmatrix}
= \sigma^2
\begin{pmatrix}
1 & 0 & \cdots & 0 \\
0 & 1 & \cdots & 0 \\
\vdots & \vdots & & \vdots \\
0 & 0 & \cdots & 1
\end{pmatrix}
= \sigma^2 I
Matrix Representation of Heteroscedasticity
Assuming that there is no autocorrelation (E(Ui Uj) = 0 for i ≠ j), the
heteroscedastic variance of the error term is given by

E(UU') =
\begin{pmatrix}
\sigma_1^2 & 0 & \cdots & 0 \\
0 & \sigma_2^2 & \cdots & 0 \\
\vdots & \vdots & & \vdots \\
0 & 0 & \cdots & \sigma_n^2
\end{pmatrix}
Causes of heteroscedasticity

1. Error learning model:
It states that as people learn, their errors of behavior become
smaller over time. In this case σi² is expected to decrease.
Example: as the number of hours of typing practice increases,
the average number of typing errors as well as their variance
decreases.
2. As data collection techniques improve, σi² is likely to
decrease.
3. Heteroscedasticity can also arise as a result of the
presence of outliers.
An outlier is an observation that is much different (either very
small or very large) in relation to the other observations in the
sample.
Consequences of Heteroscedasticity

The OLS estimators are still linear, unbiased and consistent.
However, they are no longer efficient.
With homoscedastic variance and one explanatory variable, the
variance of the slope coefficient β̂ is

var(\hat{\beta}) = \frac{\sigma^2}{\sum x_i^2} (16)

However, if we have heteroscedasticity and a single regressor, the
correct variance of the slope coefficient is

var(\hat{\beta}) = \frac{\sum x_i^2 \sigma_i^2}{(\sum x_i^2)^2} (17)

If heteroscedasticity is not corrected, it can be shown that

\frac{\sum x_i^2 \sigma_i^2}{(\sum x_i^2)^2} > \frac{\sigma^2}{\sum x_i^2} (18)
Consequences of Heteroscedasticity

If heteroscedasticity is not corrected, the large variance will
make the t-values lower.
The estimated coefficients then appear to be insignificant
while in fact they may be significant.
This will make our inference and prediction wrong.
Detecting Heteroscedasticity

There are several methods to detect the presence or absence
of heteroscedasticity.
The main reference, Gujarati, Basic Econometrics, 4th edition,
has a discussion of these methods.
Here we focus only on those most commonly applied in research
and in regression software.
Graphing the squared residuals against the predicted Y (Ŷ) or an
explanatory variable (X) can give a hint; see the sketch below.
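A minimal plotting sketch of this graphical check, using hypothetical simulated data in which the error spread grows with X (the data and model are assumptions made up for illustration):

```python
# A minimal sketch: plot squared OLS residuals against fitted values.
# A funnel or fan shape hints at heteroscedasticity.
import numpy as np
import matplotlib.pyplot as plt
import statsmodels.api as sm

rng = np.random.default_rng(1)
x = rng.uniform(1, 10, 200)
u = rng.normal(scale=x, size=200)          # error spread grows with x
y = 1.0 + 2.0 * x + u

res = sm.OLS(y, sm.add_constant(x)).fit()
plt.scatter(res.fittedvalues, res.resid ** 2, s=10)
plt.xlabel("fitted Y")
plt.ylabel("squared residual")
plt.show()
```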
Detecting Heteroscedasticity
White’s test for Heteroscedasticity

Unlike some other tests, the White test does not rely on the normality
assumption and is easy to implement.
As an illustration of the basic idea, consider the following
two-explanatory-variable regression model

Yi = α + β1 X1i + β2 X2i + ui (19)

The White test proceeds as follows:
Stage 1: estimate the model and obtain the residuals

ei = Yi − α̂ − β̂1 X1i − β̂2 X2i (20)

Stage 2: run the following auxiliary regression

ei² = λ0 + λ1 X1i + λ2 X2i + λ3 X1i² + λ4 X2i² + λ5 X1i X2i + vi (21)
White’s test for Heteroscedasticity
Stage 3: obtain R² from the above auxiliary regression.
Under the null hypothesis that there is no heteroscedasticity,
it can be shown that

n · R² ∼ χ²_df (22)

where the degrees of freedom df equal the number of regressors
in the auxiliary regression (excluding the constant), which is 5 here.
Stage 4: if the chi-square value obtained in (22) exceeds the
critical chi-square value at the chosen level of significance,
the conclusion is that there is heteroscedasticity.
If it does not exceed the critical chi-square value, there is no
heteroscedasticity, which is to say that in the auxiliary regression

λ1 = λ2 = λ3 = λ4 = λ5 = 0 (23)
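In practice the test is rarely run by hand; the sketch below uses statsmodels' het_white, which builds the auxiliary regression with squares and cross products internally. The data are hypothetical, simulated with a heteroscedastic error.

```python
# A minimal sketch of White's test via statsmodels (hypothetical data).
import numpy as np
import statsmodels.api as sm
from statsmodels.stats.diagnostic import het_white

rng = np.random.default_rng(2)
x1 = rng.uniform(1, 10, 300)
x2 = rng.uniform(1, 10, 300)
u = rng.normal(scale=0.5 * x1, size=300)      # error spread depends on x1
y = 1.0 + 2.0 * x1 - 1.5 * x2 + u

X = sm.add_constant(np.column_stack([x1, x2]))
res = sm.OLS(y, X).fit()

lm_stat, lm_pvalue, f_stat, f_pvalue = het_white(res.resid, X)
print(lm_stat, lm_pvalue)   # a small p-value rejects the null of homoscedasticity
```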
Solution (Remedial Measure) to Heteroscedasticity

The solution or remedial measure for heteroscedasticity is to
transform the regression equation.
The purpose of the transformation is to make the error term
in the transformed model homoscedastic.
Applying OLS to the transformed model is called Generalized
Least Squares (GLS) or Weighted Least Squares (WLS).
Hence, the solution to heteroscedasticity is to use GLS or WLS.
Solution (Remedial Measure) to Heteroscedasticity
For the model
Yi = α + βXi + ui (24)
Generally the transformation is done as follows.
If the variance of the error term is given by
var(ui) = σ² f(Xi) (25)
the transformation is done by dividing the regression model
by \sqrt{f(X_i)}, as

\frac{Y_i}{\sqrt{f(X_i)}} = \frac{\alpha}{\sqrt{f(X_i)}} + \beta \frac{X_i}{\sqrt{f(X_i)}} + \frac{u_i}{\sqrt{f(X_i)}} (26)

For example, if var(ui) = E(ui²) = σ² Xi,
the transformed model is

\frac{Y_i}{\sqrt{X_i}} = \frac{\alpha}{\sqrt{X_i}} + \beta \frac{X_i}{\sqrt{X_i}} + \frac{u_i}{\sqrt{X_i}} (27)
Solution (Remedial Measure) to Heteroscedasticity

In equation (27) it can be shown that the variance of the
transformed error term (u_i/\sqrt{X_i}) is constant:

var\left(\frac{u_i}{\sqrt{X_i}}\right) = E\left(\frac{u_i}{\sqrt{X_i}}\right)^2 = \frac{1}{X_i} E(u_i^2) (28)

From above, E(ui²) = σ² Xi.
This implies equation (28) can be written as

var\left(\frac{u_i}{\sqrt{X_i}}\right) = \frac{1}{X_i} E(u_i^2) = \frac{1}{X_i}(\sigma^2 X_i) = \sigma^2 = constant (29)

Hence in the transformed model the error term is


homoscedastic.
Applying OLS on equation 27 is called GLS or WLS
GLS/WLS estimates are BLUE
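For the case var(ui) = σ²Xi discussed above, WLS can be run directly: statsmodels' WLS expects weights proportional to 1/var(ui), i.e. 1/Xi here. The data below are hypothetical, simulated so that the error variance is proportional to X.

```python
# A minimal sketch of WLS for var(u_i) = sigma^2 * X_i (hypothetical data).
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(3)
x = rng.uniform(1, 10, 200)
u = rng.normal(scale=np.sqrt(x), size=200)     # var(u_i) proportional to X_i
y = 1.0 + 2.0 * x + u

X = sm.add_constant(x)
wls_res = sm.WLS(y, X, weights=1.0 / x).fit()  # weights = 1 / var(u_i) up to a constant
print(wls_res.params)
```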
Autocorrelation

In both the simple and the multiple regression models, we assumed
that successive values of the error terms are independent

Cov(Ut , Ut−1 ) = E(Ut Ut−1 ) = 0 (30)

This is called the assumption of no autocorrelation.
If successive values of the error terms are correlated, then
there is autocorrelation.

Yt = α + βXt + Ut (31)

Ut = ρUt−1 + vt ; ρ ≠ 0 (32)
vt is assumed to be non-auto-correlated and homoscedastic
Auto-correlation is most common in time series data
Difference between correlation and autocorrelation

Autocorrelation is a special case of correlation which refers to


the relationship between successive values of the same variable
While correlation may also refer to the relationship between
two or more different variables
Autocorrelation is also sometimes called serial correlation,
but some economists distinguish between these two terms.
Autocorrelation is the lag correlation of a given series with
itself, lagged by a number of time
units, e.g., (U2 , U4 , ..., U10 ) and (U1 , U3 , ..., U11 )
Whereas correlation between time series such as
(U1 , U2 , ..., U10 ) and (V1 , V2 , ..., V10 ) where U and V are two
different time series, is called serial correlation
Graphical representation of Autocorrelation
Graphs (a)-(d) show the existence of autocorrelation, while graph (e)
shows no autocorrelation.

Figure: residuals Ui plotted against time t, panels (a)-(e).
Matrix representation of Autocorrelation

Homoscedastic and autocorrelated:

E(UU') =
\begin{pmatrix}
\sigma^2 & \sigma_{12} & \cdots & \sigma_{1n} \\
\sigma_{21} & \sigma^2 & \cdots & \sigma_{2n} \\
\vdots & \vdots & & \vdots \\
\sigma_{n1} & \sigma_{n2} & \cdots & \sigma^2
\end{pmatrix}

Heteroscedastic and autocorrelated:

E(UU') =
\begin{pmatrix}
\sigma_1^2 & \sigma_{12} & \cdots & \sigma_{1n} \\
\sigma_{21} & \sigma_2^2 & \cdots & \sigma_{2n} \\
\vdots & \vdots & & \vdots \\
\sigma_{n1} & \sigma_{n2} & \cdots & \sigma_n^2
\end{pmatrix}

Causes of Autocorrelation
1. Cyclical fluctuations:
Time series such as GNP, price indices, production, employment
and unemployment exhibit business cycles.
2. Exclusion of variables from the regression model
The error term captures any variable excluded from the model.
Thus the error term will show a systematic change as these
excluded variables change.
3. Incorrect functional form
The error term also captures any mistake in the functional form.
This will also make the error terms correlated.
For example, suppose that the correct model is
Yt = α + βXt + λYt−1 + Ut (33)
but for some reason we incorrectly estimate
Yt = α + βXt + Vt (34)
Causes of Autocorrelation

This implies that

Vt = λYt−1 + Ut (35)

Hence, Vt shows a systematic change, reflecting autocorrelation.


Consequences of Autocorrelation

The OLS estimators are linear, unbiased and consistent


However, they are no longer efficient
The OLS estimates will appear to be statistically
insignificant while in fact they may be significant.
Detection of Autocorrelation

The most commonly used method for testing for the presence
or absence of autocorrelation is the Durbin-Watson d test

d = \frac{\sum_{t=2}^{n} (e_t - e_{t-1})^2}{\sum_{t=1}^{n} e_t^2} (36)

Note that in the numerator of the d statistic the number of
observations is n − 1 because one observation is lost in taking
successive differences.
There are certain assumptions that underlie this test
1. The regression model includes an intercept term
Detection of Autocorrelation
2. The explanatory variables, the Xs, are non-stochastic, or
fixed in repeated sampling.
3. The disturbances are generated by the first-order
autoregressive scheme

Ut = ρUt−1 + εt (37)

This is the first-order autoregressive model (AR(1)).

Ut = ρ1 Ut−1 + ρ2 Ut−2 + εt (38)

This is the second-order autoregressive model (AR(2)).
4. The regression model does not include a lagged value of Y,
the dependent variable, as one of the explanatory variables,
i.e., it is not of the form
Yt = α + βXt + λYt−1 + Ut (39)
5. There are no missing observations in the data
Detection of Autocorrelation

d can also be rewritten as

d^* = \frac{\sum_{t=2}^{n} (e_t^2 + e_{t-1}^2 - 2 e_t e_{t-1})}{\sum_{t=1}^{n} e_t^2} (40)

For a large sample size, \sum_{t=2}^{n} e_t^2 \approx \sum_{t=2}^{n} e_{t-1}^2.
This implies that

d^* = \frac{2\sum_{t=2}^{n} e_t^2}{\sum_{t=1}^{n} e_t^2} - \frac{2\sum_{t=2}^{n} e_t e_{t-1}}{\sum_{t=1}^{n} e_t^2} (41)

d^* = 2\left(1 - \frac{\sum_{t=2}^{n} e_t e_{t-1}}{\sum_{t=1}^{n} e_t^2}\right) (42)

d^* = 2(1 - \hat{\rho}) (43)

where

\hat{\rho} = \frac{\sum_{t=2}^{n} e_t e_{t-1}}{\sum_{t=1}^{n} e_t^2} (44)

-1 \le \hat{\rho} \le 1 \Rightarrow 0 \le d^* \le 4 (45)
Detection of Autocorrelation

If ρ̂ = 0, then d* = 2 ⇒ no autocorrelation
If ρ̂ = 1, then d* = 0 ⇒ positive autocorrelation
If ρ̂ = −1, then d* = 4 ⇒ negative autocorrelation
The formal test compares d* with two critical values,
called the lower critical value (dL) and the upper critical
value (dU).
If d* < dL or d* > (4 − dL) ⇒ we reject the null hypothesis of
no autocorrelation in favor of the alternative, which implies
the existence of autocorrelation.
If dU < d* < 4 − dU ⇒ we accept the null hypothesis of no
autocorrelation.
If dL < d* < dU or 4 − dU < d* < 4 − dL ⇒ indeterminate,
which means we neither accept nor reject the null
hypothesis of no autocorrelation.
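With regression software the statistic is usually reported directly; the sketch below computes it with statsmodels' durbin_watson on hypothetical data with positively autocorrelated errors.

```python
# A minimal sketch of the Durbin-Watson statistic (hypothetical AR(1) errors).
import numpy as np
import statsmodels.api as sm
from statsmodels.stats.stattools import durbin_watson

rng = np.random.default_rng(4)
n = 100
x = rng.uniform(0, 10, n)
u = np.zeros(n)
for t in range(1, n):
    u[t] = 0.7 * u[t - 1] + rng.normal()     # positively autocorrelated errors
y = 1.0 + 2.0 * x + u

res = sm.OLS(y, sm.add_constant(x)).fit()
print(durbin_watson(res.resid))              # values well below 2 suggest positive autocorrelation
```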
Graphical representation of the d-test: regions of rejection (positive or
negative autocorrelation), indecision, and acceptance of no autocorrelation
over the interval from 0 to 4.
Example on d-test
Consider the simple regression model Yt = α̂ + β̂Xt + et .
Given the following information, test whether there is
autocorrelation in the estimated model using the Durbin-Watson
d test, where x and y are in deviation form:

∑ xt yt = 255; ∑ xt² = 280; ∑ yt² = 274; X̄ = 8; Ȳ = 7 (46)

\sum_{t=2}^{n} (e_t - e_{t-1})^2 = 60.21; \sum_{t=1}^{n} e_t^2 = 41.767; dL = 1.08; dU = 1.36 (47)

Solution:

d = \frac{\sum_{t=2}^{n}(e_t - e_{t-1})^2}{\sum_{t=1}^{n} e_t^2} = \frac{60.21}{41.767} = 1.44 (48)

4 − dU = 4 − 1.36 = 2.64; 4 − dL = 4 − 1.08 = 2.92; dL = 1.08; dU = 1.36
Since dU < d < 4 − dU , i.e., 1.36 < 1.44 < 2.64, we accept the
null hypothesis of no autocorrelation. Hence the estimated
model is not autocorrelated.
Exercise on d-test
From the table below, first estimate the coefficients and the
residuals (et and et−1), then compute
\sum_{t=2}^{n}(e_t - e_{t-1})^2 / \sum_{t=1}^{n} e_t^2, i.e., the d-value.
Make sure that you get the same d-value as in the example above;
a code sketch follows the table.

Xt 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15

Yt 2 2 2 1 3 5 6 6 10 10 10 12 15 10 11

t 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15
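A minimal sketch of this exercise in code, fitting the OLS line with the deviation formulas and computing d from its definition (the data are the table values above):

```python
# A minimal sketch: estimate the simple regression and the Durbin-Watson d
# for the tabulated data.
import numpy as np

X = np.arange(1, 16, dtype=float)
Y = np.array([2, 2, 2, 1, 3, 5, 6, 6, 10, 10, 10, 12, 15, 10, 11], dtype=float)

x = X - X.mean()                       # deviations from the mean
y = Y - Y.mean()
beta = (x * y).sum() / (x ** 2).sum()  # slope = sum(xy) / sum(x^2)
alpha = Y.mean() - beta * X.mean()

e = Y - (alpha + beta * X)             # residuals e_t
d = np.sum(np.diff(e) ** 2) / np.sum(e ** 2)
print(round(d, 3))                     # compare with the d-value in the example above
```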
Solution of Auto-correlation
Generally the solution is to apply OLS on the transformed
model. Again, this is called GLS or WLS.
The transformation depends on whether ρ in the model below is known
or not.
Yt = α + βXt + Ut (49)
Ut = ρUt−1 + vt ; ρ ≠ 0 (50)
If ρ is known, the procedure of transformation is as
given below.
Take the lagged form of Yt = α + βXt + Ut and multiply
through by ρ:
ρYt−1 = ρα + ρβXt−1 + ρUt−1 (51)
Subtract the above equation from Yt = α + βXt + Ut , i.e.,
(Yt − ρYt−1 ) = (α − ρα) + β(Xt − ρXt−1 ) + (Ut − ρUt−1 ) (52)
⇒ (Ut − ρUt−1 ) = vt , and vt is not autocorrelated.
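A minimal sketch of this quasi-difference (GLS) transformation when ρ is known, using hypothetical data and a hypothetical ρ = 0.7; the first observation is simply dropped here.

```python
# A minimal sketch of the GLS transformation with known rho (hypothetical data).
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(5)
n, rho = 100, 0.7
x = rng.uniform(0, 10, n)
u = np.zeros(n)
for t in range(1, n):
    u[t] = rho * u[t - 1] + rng.normal()
y = 1.0 + 2.0 * x + u

y_star = y[1:] - rho * y[:-1]                # Y_t - rho * Y_{t-1}
x_star = x[1:] - rho * x[:-1]                # X_t - rho * X_{t-1}
const_star = np.full(n - 1, 1.0 - rho)       # (1 - rho) replaces the intercept column
res = sm.OLS(y_star, np.column_stack([const_star, x_star])).fit()
print(res.params)                            # estimates of alpha and beta
```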
Solution of Auto-correlation
OLS on the above transformed model is GLS, and its estimated
coefficients are BLUE.
When ρ is not known, we rely either on prior information or
use an estimated ρ.
1. Prior information: a researcher usually makes a
reasonable guess. The most commonly used value is ρ = 1. In this
case the transformation is called the first difference:
(Yt − Yt−1 ) = (α − α) + β(Xt − Xt−1 ) + (Ut − Ut−1 ) (53)
2. Use an estimated ρ: for example, we can estimate ρ from
the Durbin-Watson d statistic as

d^* = 2(1 - \hat{\rho}) (54)

or estimate it from

\hat{\rho} = \frac{\sum_{t=2}^{n} e_t e_{t-1}}{\sum_{t=1}^{n} e_t^2} (55)
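A minimal sketch of option 2, recovering ρ̂ from the Durbin-Watson statistic (equation 54) and directly from the residuals (equation 55) on hypothetical data; either estimate can then be used in the quasi-difference transformation sketched above.

```python
# A minimal sketch: estimate rho from an OLS fit, two ways (hypothetical data).
import numpy as np
import statsmodels.api as sm
from statsmodels.stats.stattools import durbin_watson

rng = np.random.default_rng(6)
n = 100
x = rng.uniform(0, 10, n)
u = np.zeros(n)
for t in range(1, n):
    u[t] = 0.6 * u[t - 1] + rng.normal()
y = 1.0 + 2.0 * x + u

res = sm.OLS(y, sm.add_constant(x)).fit()
e = res.resid

rho_from_d = 1.0 - durbin_watson(e) / 2.0              # from equation (54)
rho_from_e = np.sum(e[1:] * e[:-1]) / np.sum(e ** 2)   # from equation (55)
print(rho_from_d, rho_from_e)
```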
