Introductory Econometrics
ECON2206/ECON3209
Slides02
Lecturer: Minxian Yang
ie_Slides02
my, School of Economics, UNSW
2. Simple Regression Model (Ch2)
2. Simple Regression Model
 Lecture plan
Motivation and definitions
ZCM assumption
Estimation method: OLS
Units of measurement
Nonlinear relationships
Underlying assumptions of simple regression model
Expected values and variances of OLS estimators
Regression with STATA
 Motivation
 Example 1. Ceteris paribus effect of fertiliser on soybean yield
yield = β0 + β1 ferti + u .
 Example 2. Ceteris paribus effect of education on wage
wage = β0 + β1 educ + u .
 In general,
y = β0 + β1 x + u ,
where u represents factors other than x that affect y.
 We are interested in
 explaining y in terms of x,
 how y responds to changes in x,
holding other factors fixed.
 Simple regression model
 Definition
y = β0 + β1 x + u ,
y : dependent variable (observable)
x : independent variable (observable)
β1 : slope parameter, partial effect (to be estimated)
β0 : intercept parameter (to be estimated)
u : error term or disturbance (unobservable)
 The disturbance u represents all factors other than x.
 With the intercept β0, the population average of u can
always be normalised to zero (without losing anything):
E(u) = 0 ,
since y = [β0 + E(u)] + β1 x + [u − E(u)].
 Zero conditional mean assumption
y = β0 + β1 x + u
y + Δy = β0 + β1(x + Δx) + (u + Δu)
 If other factors in u are held fixed (Δu = 0), the ceteris
paribus effect of x on y is β1 :
Δy = β1 Δx .   (Δ = change)
 But under what condition can u be held fixed while x
changes?
 As x and u are treated as random variables,
u is fixed while x varies is described as:
the mean of u for any given x is the same (zero).
 The required condition is
E(u | x) = E(u) = 0 for every value of x,
known as the zero-conditional-mean (ZCM) assumption.
 Zero conditional mean assumption
 Example 2. wage = β0 + β1 educ + u
Suppose u represents ability.
Then ZCM assumption amounts to
E(ability | educ) = 0 ,
ie, the average ability is the same irrespective of the
years of education.
This is not true
 if people choose the education level to suit their ability;
 or if more ability is associated with less (or more)
education.
In practice, we do not know if ZCM holds and have to
deal with this issue.
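The consequence of a ZCM violation can be illustrated with a small simulation (a Python sketch rather than the course's STATA; the population, coefficients, and variable constructions below are made up for illustration). When unobserved ability raises both education and the error term, the OLS slope overstates the true return to education:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100_000

# Hypothetical population: true return to education is 0.5,
# but unobserved ability raises both educ and u, so E(u | educ) != 0.
ability = rng.normal(size=n)
educ = 12 + 2 * ability + rng.normal(size=n)   # abler people get more schooling
u = 1.0 * ability + rng.normal(size=n)         # ability also sits in the error
wage = 1.0 + 0.5 * educ + u

# OLS slope: sample cov(x, y) / sample var(x)
b1 = np.cov(educ, wage)[0, 1] / np.var(educ, ddof=1)
print(round(b1, 2))   # well above the true 0.5: ZCM fails, OLS is biased
```

Here the algebra predicts a bias of cov(educ, u)/var(educ) = 2/5 = 0.4, so the fitted slope settles near 0.9 rather than 0.5.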
 Zero conditional mean assumption
 Taking the conditional expectation of
y = β0 + β1 x + u
for given x, ZCM implies
E(y | x) = β0 + β1 x ,
known as the population regression function
(PRF), which is a linear function of x.
 The distribution of y is centred about E(y | x).
Systematic part of y : E(y | x).
Unsystematic part of y : u.
 Simple regression model
yi = β0 + β1 xi + ui
[Figure: for each of x1, x2, x3, the conditional distribution of y is centred at the population regression line E(y | x) = β0 + β1 x.]
 Observations on (x, y)
 A random sample is a set of independent
observations on (x, y), ie, {(xi , yi), i = 1,2,...,n}.
 At observation level, the model may be written as
yi = 0 + 1xi + ui , i = 1, 2, ..., n
where i is the observation index.
 Collectively, the n equations stack into matrix notation
Y = X B + U ,
where Y = (y1, ..., yn)′, U = (u1, ..., un)′, B = (β0, β1)′,
and X is the n×2 matrix whose i-th row is (1, xi).
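The stacked form suggests computing the estimates directly from the matrices, as B̂ = (X′X)⁻¹X′Y. A minimal numpy sketch (simulated data; the coefficients 2.0 and 0.7 are made up):

```python
import numpy as np

rng = np.random.default_rng(1)
n = 200
x = rng.uniform(0, 10, size=n)
y = 2.0 + 0.7 * x + rng.normal(size=n)

# Stack the n equations: Y = X B + U, where X has rows (1, x_i)
X = np.column_stack([np.ones(n), x])

# (X'X)^{-1} X'Y, solved without forming the explicit inverse
B_hat = np.linalg.solve(X.T @ X, X.T @ y)
print(B_hat)   # approximately [2.0, 0.7]
```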
 Estimate simple regression
 The model:
yi = β0 + β1 xi + ui ,
i = 1, 2, ..., n
 Let (β̂0, β̂1) be the estimates of (β0, β1).
 The corresponding residual is
ûi = yi − β̂0 − β̂1 xi , i = 1, 2, ..., n.
 The sum of squared residuals (SSR),
SSR = Σᵢ₌₁ⁿ ûi² = Σᵢ₌₁ⁿ (yi − β̂0 − β̂1 xi)² ,
indicates the goodness of the estimates.
 Good estimates should make SSR small.
 Ordinary least squares (OLS)
 The OLS estimates (β̂0, β̂1) minimise the SSR:
(β̂0, β̂1) = minimiser of SSR.
 Choosing (β̂0, β̂1) to minimise SSR, the first-order
conditions lead to
Σᵢ₌₁ⁿ (yi − β̂0 − β̂1 xi) = 0 ,   (mean residual = 0)
Σᵢ₌₁ⁿ (yi − β̂0 − β̂1 xi) xi = 0 .   (covariance of residual and x = 0)
 Ordinary least squares (OLS)
 Solving the two equations in two unknowns gives
β̂1 = Σᵢ₌₁ⁿ (xi − x̄)(yi − ȳ) / Σᵢ₌₁ⁿ (xi − x̄)² ,
β̂0 = ȳ − β̂1 x̄ ,
where
ȳ = (1/n) Σᵢ₌₁ⁿ yi ,  x̄ = (1/n) Σᵢ₌₁ⁿ xi .
 OLS requires the condition
Σᵢ₌₁ⁿ (xi − x̄)² ≠ 0.
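The closed-form solution can be checked numerically. A sketch on simulated data (the true coefficients 1.5 and −0.8 are invented), cross-checked against numpy's own least-squares fit:

```python
import numpy as np

rng = np.random.default_rng(2)
x = rng.normal(5, 2, size=500)
y = 1.5 - 0.8 * x + rng.normal(size=500)

# Textbook formulas: b1 = sum (xi - xbar)(yi - ybar) / sum (xi - xbar)^2,
#                    b0 = ybar - b1 * xbar
xbar, ybar = x.mean(), y.mean()
b1 = ((x - xbar) * (y - ybar)).sum() / ((x - xbar) ** 2).sum()
b0 = ybar - b1 * xbar

# Cross-check against numpy's least-squares line fit
b1_np, b0_np = np.polyfit(x, y, 1)
assert np.isclose(b1, b1_np) and np.isclose(b0, b0_np)
```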
 OLS regression line or SRF
 For any set of data {(xi , yi), i = 1,2,...,n} with n > 2,
OLS can always be carried out as long as
Σᵢ₌₁ⁿ (xi − x̄)² ≠ 0.
 Once OLS estimates are obtained, ŷi = β̂0 + β̂1 xi
is known as the fitted value of y when x = xi.
 By OLS regression line or sample regression
function (SRF), we refer to
ŷ = β̂0 + β̂1 x ,
which is an estimate of the PRF E(y | x) = β0 + β1 x.
 Interpretation of OLS estimate
 In the SRF ŷ = β̂0 + β̂1 x,
the slope estimate β̂1 is the change in ŷ when x
increases by one unit: β̂1 = Δŷ / Δx ,
which is of primary interest in practice.
 The dependent variable y may be decomposed either
as the sum of the SRF and the residual,
y = ŷ + û ,
or as the sum of the PRF and the disturbance,
y = E(y | x) + u .
 PRF versus SRF
 Hope: the SRF equals the PRF on average, or approaches it as n goes to infinity.
[Figure: scatter of data points (xi, yi) from yi = β0 + β1 xi + ui, with the sample regression line β̂0 + β̂1 x and the population regression line β0 + β1 x; the residual ûi is the vertical distance from (xi, yi) to the sample regression line.]
 OLS example
 Example 2. (regress wage educ)
Population: workforce in 1976
y = wage : hourly earnings (in $)
x = educ : years of education
OLS SRF (n = 526):
wâge = −0.90 + 0.54 educ .
[Figure: scatter of wage against educ with the fitted line.]
 Interpretation
 Slope 0.54 : each additional year of schooling increases
the wage by $0.54.
 Intercept −0.90 : the fitted wage of a person with educ = 0?
The SRF does poorly at low levels of education.
 Predicted wage for a person with educ = 10?
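The last question is simple arithmetic with the fitted SRF. A tiny sketch (the function name wage_hat is mine, not STATA's):

```python
# Fitted SRF from the slide: wage-hat = -0.90 + 0.54 * educ  (wage in dollars)
def wage_hat(educ):
    return -0.90 + 0.54 * educ

print(round(wage_hat(10), 2))  # 4.5: about $4.50 per hour at educ = 10
print(round(wage_hat(0), 2))   # -0.9: nonsensical, the SRF does poorly at educ = 0
```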
 Properties of OLS
 The first-order conditions,
Σᵢ₌₁ⁿ (yi − β̂0 − β̂1 xi) = 0 ,
Σᵢ₌₁ⁿ (yi − β̂0 − β̂1 xi) xi = 0 ,
imply that
 the sum of residuals is zero.
 the sample covariance of x and the residual is zero.
 the mean point (x̄, ȳ) is always on the SRF (or OLS
regression line).
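These three algebraic properties can be confirmed on any data set. A Python sketch on simulated data (all numbers invented):

```python
import numpy as np

rng = np.random.default_rng(3)
x = rng.uniform(0, 10, size=300)
y = 1.0 + 2.0 * x + rng.normal(size=300)

b1, b0 = np.polyfit(x, y, 1)      # OLS slope and intercept
resid = y - (b0 + b1 * x)         # residuals

print(resid.sum())                # ~0: residuals sum to zero
print((resid * x).sum())          # ~0: sample covariance of x and residual is zero
print(np.isclose(y.mean(), b0 + b1 * x.mean()))  # True: mean point lies on the SRF
```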
 Sums of squares
 Each yi may be decomposed into yi = ŷi + ûi .
 Measure variations from ȳ :
 Total sum of squares (total variation in yi):
SST = Σᵢ₌₁ⁿ (yi − ȳ)² ,
 Explained sum of squares (variation in ŷi):
SSE = Σᵢ₌₁ⁿ (ŷi − ȳ)² ,
 Sum of squared residuals (variation in ûi):
SSR = Σᵢ₌₁ⁿ ûi² .
 It can be shown that
SST = SSE + SSR .
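The decomposition, and the R-squared built from it, can be verified numerically. A sketch on simulated data (coefficients invented):

```python
import numpy as np

rng = np.random.default_rng(4)
x = rng.normal(size=400)
y = 0.5 + 1.2 * x + rng.normal(size=400)

b1, b0 = np.polyfit(x, y, 1)
yhat = b0 + b1 * x
uhat = y - yhat

SST = ((y - y.mean()) ** 2).sum()     # total variation in y
SSE = ((yhat - y.mean()) ** 2).sum()  # variation in fitted values
SSR = (uhat ** 2).sum()               # variation in residuals

assert np.isclose(SST, SSE + SSR)     # the decomposition holds for OLS
R2 = 1 - SSR / SST                    # equals SSE / SST
print(round(R2, 3))
```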
 R-squared: a goodness-of-fit measure
 How well does x explain y?
or how well does the OLS regression line fit data?
 We may measure this by the fraction of the variation in y
that is explained by x (or by the SRF).
 R-squared (coefficient of determination):
R² = SSE / SST = 1 − SSR / SST .
 larger R² ⇒ better fit;
 0 ≤ R² ≤ 1.
(It is not advisable to put too much weight on R² when
evaluating regression models.)
eg. R² = 0.165 for wâge = −0.90 + 0.54 educ :
16.5% of the variation in wage is explained by educ.
 Effects of changing units of measurement
 If y is multiplied by a constant c, then the OLS
intercept and slope estimates are also multiplied by c.
 If x is multiplied by a constant c, then the OLS
intercept estimate is unchanged but the slope
estimate is multiplied by 1/c.
 The R2 does not change when varying the units of
measurement.
eg. When wage is in dollars, wâge = −0.90 + 0.54 educ .
If wage is in cents, wâge = −90 + 54 educ .
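The rescaling rules can be verified on simulated data (a Python sketch; the data below are made up, not the 1976 sample):

```python
import numpy as np

rng = np.random.default_rng(5)
educ = rng.uniform(8, 18, size=300)
wage = -0.9 + 0.54 * educ + rng.normal(size=300)

b1, b0 = np.polyfit(educ, wage, 1)
b1_c, b0_c = np.polyfit(educ, 100 * wage, 1)   # wage in cents: both estimates x100
b1_m, b0_m = np.polyfit(12 * educ, wage, 1)    # educ in months: slope x(1/12),
                                               # intercept unchanged

assert np.isclose(b1_c, 100 * b1) and np.isclose(b0_c, 100 * b0)
assert np.isclose(b1_m, b1 / 12) and np.isclose(b0_m, b0)
```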
 Nonlinear relationships between x and y
 The OLS only requires the regression model
y = β0 + β1 x + u
to be linear in parameters.
 Nonlinear relationships between y and x can be easily
accommodated.
eg. Suppose a better description is that each year of
education increases wage by a fixed percentage. This
leads to
log(wage) = β0 + β1 educ + u ,
with %Δwage = (100 β1) Δeduc when Δu = 0.
OLS: lwâge = 0.584 + 0.083 educ ,  R² = 0.186.
[Figure: scatter of lwage against educ with the fitted line.]
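The semi-elasticity reading of the log-level slope can be made concrete with a line of arithmetic (a sketch; 100·β1 is the usual approximation, exp(β1) − 1 the exact implied percentage change):

```python
import numpy as np

# Slide's fitted log-level SRF: lwage-hat = 0.584 + 0.083 * educ
b1 = 0.083

# Approximation: each year of schooling raises wage by about 100*b1 percent
approx_pct = 100 * b1
# Exact percentage change implied by the log model
exact_pct = 100 * (np.exp(b1) - 1)

print(round(approx_pct, 1))   # 8.3
print(round(exact_pct, 2))    # slightly larger, about 8.65
```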
 Nonlinear relationships between x and y
 Linear models are linear in parameters.
 OLS applies to linear models no matter how x and y
are defined.
 But be careful about the interpretation of β1.
 OLS estimators
 A random sample consists of independent draws
from the same population and is itself random.
 A data set is a realisation of the random sample.
 The OLS estimates (β̂0, β̂1) computed from a random
sample are themselves random; viewed as such, they
are called the OLS estimators.
 To make inference about the population parameters
(0, 1), we need to understand the statistical
properties of the OLS estimators.
 In particular, we would like to know the means and
variances of the OLS estimators.
 We find these under a set of assumptions about the
simple regression model.
 Assumptions about simple regression model
(SLR1 to SLR4)
1. (linear in parameters) In the population model, y is
related to x by y = β0 + β1 x + u, where (β0, β1) are
population parameters and u is the disturbance.
2. (random sample) {(xi , yi), i = 1,2,...,n} with n > 2 is a
random sample drawn from the population model.
3. (sample variation) The sample outcomes on x are
not of the same value.
4. (zero conditional mean) The disturbance u satisfies
E(u | x) = 0 for any given value of x. For the random
sample, E(ui | xi) = 0 for i = 1,2,...,n.
 Property 1 of OLS estimators
Theorem 2.1
Under SLR1 to SLR4, the OLS estimators are
unbiased: E(β̂1) = β1 , E(β̂0) = β0 .
Unbiased estimators (β̂0, β̂1)
 are centred around (β0, β1);
 correctly estimate (β0, β1) on average.
It is useful to note that (yi − ȳ) = β1(xi − x̄) + (ui − ū),
so that
β̂1 = β1 + Σᵢ₌₁ⁿ (ui − ū)(xi − x̄) / Σᵢ₌₁ⁿ (xi − x̄)² .
The estimation error is entirely driven by a linear
combination of the ui with weights dependent on x.
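Unbiasedness is a statement about repeated sampling, which a Monte Carlo experiment makes visible (a Python sketch; the true coefficients, sample size, and replication count are invented):

```python
import numpy as np

rng = np.random.default_rng(6)
b0_true, b1_true, n, reps = 1.0, 0.5, 50, 2000

slopes = np.empty(reps)
for r in range(reps):
    x = rng.uniform(0, 10, size=n)
    u = rng.normal(size=n)            # ZCM holds: u drawn independently of x
    y = b0_true + b1_true * x + u
    slopes[r] = np.cov(x, y)[0, 1] / np.var(x, ddof=1)   # OLS slope

print(round(slopes.mean(), 3))        # close to the true 0.5: beta1-hat is unbiased
```

Individual draws of β̂1 scatter around 0.5, but their average over many samples sits very close to the truth.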
 Property 2 of OLS estimators
5. (SLR5, homoskedasticity)
Var(ui | xi) = σ² for i = 1,2,...,n. (It implies Var(ui) = σ².)
Strictly, Theorem 2.2 is about the variances
of OLS estimators, conditional on given x.
Theorem 2.2
Under SLR1 to SLR5, the variances of (β̂0, β̂1) are:
Var(β̂1) = σ² / Σᵢ₌₁ⁿ (xi − x̄)² ,
Var(β̂0) = σ² (n⁻¹ Σᵢ₌₁ⁿ xi²) / Σᵢ₌₁ⁿ (xi − x̄)² .
 the larger is σ², the greater are the variances.
 the larger the variation in x, the smaller the variances.
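The two formulas are easy to evaluate for a given x and σ². A deterministic sketch (the x values and σ² = 4 are invented), which also confirms that spreading x out shrinks the slope variance:

```python
import numpy as np

sigma2 = 4.0                                  # assumed error variance
x = np.array([2.0, 4.0, 6.0, 8.0, 10.0])
sst_x = ((x - x.mean()) ** 2).sum()           # sum of squared deviations = 40

var_b1 = sigma2 / sst_x                       # 4 / 40  = 0.1
var_b0 = sigma2 * (x ** 2).mean() / sst_x     # 4 * 44 / 40 = 4.4

# Doubling the spread of x quadruples sst_x and shrinks Var(beta1-hat)
x_wide = 2 * x
sst_wide = ((x_wide - x_wide.mean()) ** 2).sum()
assert sigma2 / sst_wide < var_b1
print(var_b1, var_b0)
```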
 Homoskedasticity and heteroskedasticity
[Figure: conditional distributions of y given x with constant error variance (homoskedasticity) versus error variance that changes with x (heteroskedasticity).]
2. Simple Regression Model (Ch2)
 Estimation of σ²
 As the residual approximates u, the estimator of σ² is
σ̂² = Σᵢ₌₁ⁿ ûi² / (n − 2) = SSR / (n − 2) ,
where 2 is the number of estimated coefficients.
 σ̂ = √σ̂² is known as the standard error of the
regression, useful in forming the standard errors of
(β̂0, β̂1).
Theorem 2.3 (unbiased estimator of σ²)
Under SLR1 to SLR5, E(σ̂²) = σ² .
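The estimator σ̂² and the resulting standard errors can be sketched on simulated data (the true σ = 2 and all other numbers are invented):

```python
import numpy as np

rng = np.random.default_rng(7)
n = 5000
x = rng.uniform(0, 10, size=n)
y = 1.0 + 0.5 * x + rng.normal(scale=2.0, size=n)   # true sigma = 2

b1, b0 = np.polyfit(x, y, 1)
resid = y - (b0 + b1 * x)

sigma2_hat = (resid ** 2).sum() / (n - 2)   # SSR / (n - 2)
ser = np.sqrt(sigma2_hat)                   # standard error of the regression
# standard error of beta1-hat: sigma-hat / sqrt(sum (xi - xbar)^2)
se_b1 = ser / np.sqrt(((x - x.mean()) ** 2).sum())

print(round(ser, 2))    # close to the true sigma = 2
```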
 OLS in STATA
[Figure: STATA output for the wage regression, with the standard error of the regression and the SSR indicated.]
 Summary
 What is a simple regression model?
 What is the ZCM assumption? Why is it crucial for
model interpretation and OLS being unbiased?
 What is the OLS estimation principle?
 What are PRF, SRF, error term and residual?
 How is R-squared related to SSR?
 Can we describe, in a simple linear regression model,
the nonlinear relationship between x and y?
 What are Assumptions SLR1 to SLR5? Why do we
need to understand them?
 What are the statistical properties of OLS estimators?
 How do you run OLS in STATA? regress y x