Outline
1. Some basic ideas
2. The problem of estimation: OLS method
3. Classical Normal Linear Regression Model (CNLRM)
4. Interval estimation and hypothesis testing
5. Extensions of the two-variable linear regression model
2. The problem of estimation: OLS method
2.1. The method of ordinary least squares (OLS)
2.2. Some characteristics of OLS estimators
2.3. Assumptions of the OLS method
2.4. Precision or standard error of least-squares estimates
2.5. Properties of least squares estimators: The Gauss-Markov theorem
2.6. The coefficient of determination r2: a measure of goodness of fit
2.1. The method of ordinary least squares
The method of ordinary least squares is attributed to
Carl Friedrich Gauss, a German mathematician.
Under certain assumptions, the method of least
squares has some very attractive statistical properties
that have made it one of the most powerful and
popular methods of regression analysis.
2.1. The method of ordinary least squares
Recall the two-variable PRF:
$Y_i = \beta_1 + \beta_2 X_i + u_i$   (2.1.1)
However, the PRF is not directly observable. We estimate it from the SRF:
$Y_i = \hat{\beta}_1 + \hat{\beta}_2 X_i + \hat{u}_i$   (2.1.2)
$\phantom{Y_i} = \hat{Y}_i + \hat{u}_i$   (2.1.3)
where $\hat{Y}_i$ is the estimated (conditional mean) value of $Y_i$.
2.1. The method of ordinary least squares
But how is the SRF itself determined?
To see this, let us proceed as follows. First, express (2.1.3) as:
$\hat{u}_i = Y_i - \hat{Y}_i = Y_i - \hat{\beta}_1 - \hat{\beta}_2 X_i$   (2.1.4)
→ which shows that the 𝑢ෝ𝑖 (the residuals) are simply the
differences between the actual and estimated Y values.
2.1. The method of ordinary least squares
Now given n pairs of observations on Y and X, we
would like to determine the SRF in such a manner that
it is as close as possible to the actual Y.
To this end, we may adopt the following criterion: choose the SRF in such a way that the sum of the residuals $\sum \hat{u}_i = \sum (Y_i - \hat{Y}_i)$ is as small as possible.
However, this is not a very good criterion. Why?
2.1. The method of ordinary least squares
[Figure: scatter of Y against X with the SRF $\hat{Y}_i = \hat{\beta}_1 + \hat{\beta}_2 X_i$; the residuals $\hat{u}_1, \hat{u}_2, \hat{u}_3, \hat{u}_4$ are drawn at $X_1, X_2, X_3, X_4$.]
The figure shows that $\hat{u}_2$ and $\hat{u}_3$ as well as $\hat{u}_1$ and $\hat{u}_4$ receive the same weight in the sum ($\hat{u}_1 + \hat{u}_2 + \hat{u}_3 + \hat{u}_4$), although the first two residuals are much closer to the SRF than the latter two.
All the residuals receive equal importance no matter how close or how widely scattered the individual observations are from the SRF.
A consequence: the algebraic sum of the $\hat{u}_i$ can be small (even zero) even though the $\hat{u}_i$ are widely scattered about the SRF.
2.1. The method of ordinary least squares
We can avoid this problem if we adopt the least-squares criterion, which states that the SRF can be fixed in such a way that:
$\sum \hat{u}_i^2 = \sum \left(Y_i - \hat{Y}_i\right)^2 = \sum \left(Y_i - \hat{\beta}_1 - \hat{\beta}_2 X_i\right)^2$   (2.1.5)
is as small as possible, where $\hat{u}_i^2$ are the squared residuals.
By squaring $\hat{u}_i$, this method gives more weight to residuals such as $\hat{u}_1$ and $\hat{u}_4$ than to $\hat{u}_2$ and $\hat{u}_3$.
2.1. The method of ordinary least squares
Under the minimum $\sum \hat{u}_i$ criterion, the sum can be small even though the $\hat{u}_i$ are widely spread about the SRF.
But this is not possible under the least-squares procedure, for the larger the $\hat{u}_i$ (in absolute value), the larger the $\sum \hat{u}_i^2$.
Then, the sum of the squared residuals is some function of the estimators $\hat{\beta}_1$ and $\hat{\beta}_2$:
$\sum_{i=1}^{n} \hat{u}_i^2 = f(\hat{\beta}_1, \hat{\beta}_2) = \sum_{i=1}^{n} \left(Y_i - \hat{\beta}_1 - \hat{\beta}_2 X_i\right)^2$
2.1. The method of ordinary least squares
Now which set of $\hat{\beta}$ values should we choose? We will choose the $\hat{\beta}_1$ and $\hat{\beta}_2$ that give the smallest possible value of $\sum \hat{u}_i^2$.
Recall that a function $f(X)$ reaches its smallest value where its first derivative $f'(X) = 0$ and its second derivative $f''(X) > 0$.
Then, to find the smallest value of $\sum \hat{u}_i^2$, we have to solve these normal simultaneous equations:
$n\hat{\beta}_1 + \hat{\beta}_2 \sum_{i=1}^{n} X_i = \sum_{i=1}^{n} Y_i$
$\hat{\beta}_1 \sum_{i=1}^{n} X_i + \hat{\beta}_2 \sum_{i=1}^{n} X_i^2 = \sum_{i=1}^{n} X_i Y_i$
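As an aside, these two normal equations can be solved directly on a computer; below is a minimal NumPy sketch (the data are those of Example 1 later in this section; variable names are illustrative):

```python
import numpy as np

# Example 1 data (consumption Y, income X), used later in this section
Y = np.array([5, 7, 8, 10, 11, 13], dtype=float)
X = np.array([6, 9, 10, 12, 13, 16], dtype=float)
n = len(Y)

# The two normal equations in matrix form:
#   [ n       sum(X)   ] [b1]   [ sum(Y)   ]
#   [ sum(X)  sum(X^2) ] [b2] = [ sum(X*Y) ]
A = np.array([[n, X.sum()],
              [X.sum(), (X**2).sum()]])
b = np.array([Y.sum(), (X * Y).sum()])

b1_hat, b2_hat = np.linalg.solve(A, b)
print(b1_hat, b2_hat)  # ≈ -0.1667, 0.8333
```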
2.1. The method of ordinary least squares
Differentiating (2.1.5) partially with respect to $\hat{\beta}_1$ and $\hat{\beta}_2$, we obtain:
$\dfrac{\partial \left(\sum \hat{u}_i^2\right)}{\partial \hat{\beta}_1} = -2 \sum_{i=1}^{n} \left(Y_i - \hat{\beta}_1 - \hat{\beta}_2 X_i\right) = -2 \sum_{i=1}^{n} \hat{u}_i$
$\dfrac{\partial \left(\sum \hat{u}_i^2\right)}{\partial \hat{\beta}_2} = -2 \sum_{i=1}^{n} \left(Y_i - \hat{\beta}_1 - \hat{\beta}_2 X_i\right) X_i = -2 \sum_{i=1}^{n} \hat{u}_i X_i$
Setting these equations to zero, after algebraic
simplification and manipulation, we obtain:
2.1. The method of ordinary least squares
$\hat{\beta}_2 = \dfrac{n\sum_{i=1}^{n} X_i Y_i - \sum_{i=1}^{n} X_i \sum_{i=1}^{n} Y_i}{n\sum_{i=1}^{n} X_i^2 - \left(\sum_{i=1}^{n} X_i\right)^2} = \dfrac{\sum_{i=1}^{n} (X_i - \bar{X})(Y_i - \bar{Y})}{\sum_{i=1}^{n} (X_i - \bar{X})^2} = \dfrac{\sum_{i=1}^{n} x_i y_i}{\sum_{i=1}^{n} x_i^2}$   (2.1.6)
$\hat{\beta}_1 = \dfrac{\sum_{i=1}^{n} X_i^2 \sum_{i=1}^{n} Y_i - \sum_{i=1}^{n} X_i \sum_{i=1}^{n} X_i Y_i}{n\sum_{i=1}^{n} X_i^2 - \left(\sum_{i=1}^{n} X_i\right)^2} = \bar{Y} - \hat{\beta}_2 \bar{X}$   (2.1.7)
where $\bar{X}$ and $\bar{Y}$ are the sample means of X and Y, and where we define $x_i = X_i - \bar{X}$ and $y_i = Y_i - \bar{Y}$.
Henceforth we adopt the convention of letting lowercase letters denote deviations from mean values.
→ The estimators obtained above are known as the least-squares estimators, for they are derived from the least-squares principle.
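A minimal NumPy sketch of (2.1.6) and (2.1.7) in deviation form (the data are those of Example 1 on the next slide):

```python
import numpy as np

Y = np.array([5, 7, 8, 10, 11, 13], dtype=float)   # consumption
X = np.array([6, 9, 10, 12, 13, 16], dtype=float)  # income

x = X - X.mean()  # deviations from the mean
y = Y - Y.mean()

b2_hat = (x * y).sum() / (x**2).sum()  # eq. (2.1.6)
b1_hat = Y.mean() - b2_hat * X.mean()  # eq. (2.1.7)
print(b2_hat, b1_hat)  # ≈ 0.8333, -0.1667
```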
2.1. The method of ordinary least squares
Example 1: Estimation of the consumption function
Given the Keynesian consumption function:
$cons_i = \beta_1 + \beta_2 \, inc_i + u_i$
Estimate the coefficients $\hat{\beta}_1$ and $\hat{\beta}_2$ of the model.
Obs. 1 2 3 4 5 6
Consi(Yi) 5 7 8 10 11 13
Inci(Xi) 6 9 10 12 13 16
Obs.   Yi    Xi    xi = Xi − X̄   yi = Yi − Ȳ   Xi·Yi   Xi²    xi·yi   xi²
1      5     6     −5            −4            30      36     20      25
2      7     9     −2            −2            63      81     4       4
3      8     10    −1            −1            80      100    1       1
4      10    12    1             1             120     144    1       1
5      11    13    2             2             143     169    4       4
6      13    16    5             4             208     256    20      25
Sums   54    66    0             0             644     786    50      60
Mean   9     11
$\hat{\beta}_2 = \dfrac{\sum_{i=1}^{6} (X_i - \bar{X})(Y_i - \bar{Y})}{\sum_{i=1}^{6} (X_i - \bar{X})^2} = \dfrac{644 - 9 \times 66}{786 - 11 \times 66} = \dfrac{50}{60} \approx 0.8333$
$\hat{\beta}_1 = \bar{Y} - \hat{\beta}_2 \bar{X} = 9 - 0.8333 \times 11 \approx -0.1667$
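As a cross-check, a library routine should reproduce these estimates; here is one way with NumPy's polyfit (one option among many):

```python
import numpy as np

Y = np.array([5, 7, 8, 10, 11, 13], dtype=float)
X = np.array([6, 9, 10, 12, 13, 16], dtype=float)

# polyfit with deg=1 fits Y = b2*X + b1 by least squares and
# returns the coefficients from the highest degree down.
b2_hat, b1_hat = np.polyfit(X, Y, deg=1)
print(b2_hat, b1_hat)  # ≈ 0.8333, -0.1667
```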
2.2. Some characteristics of OLS estimators
The OLS estimators are expressed solely in terms of
the observable (i.e., sample) quantities (i.e., X and Y).
Therefore, they can be easily computed.
They are point estimators; that is, given the sample,
each estimator will provide only a single (point) value
of the relevant population parameter.
The regression line obtained has the following
properties:
2.2. Some characteristics of OLS estimators
It passes through the sample means of Y and X: $\bar{Y} = \hat{\beta}_1 + \hat{\beta}_2 \bar{X}$.
[Figure: the SRF $\hat{Y}_i = \hat{\beta}_1 + \hat{\beta}_2 X_i$ passing through the point $(\bar{X}, \bar{Y})$.]
2.2. Some characteristics of OLS estimators
The mean value of the estimated $\hat{Y}_i$ is equal to the mean value of the actual Y: $\bar{\hat{Y}} = \bar{Y}$.
The mean value of the residuals $\hat{u}_i$ is zero: $\sum_{i=1}^{n} \hat{u}_i = 0$.
The residuals $\hat{u}_i$ are uncorrelated with the predicted $\hat{Y}_i$: $\sum_{i=1}^{n} \hat{Y}_i \hat{u}_i = 0$.
The residuals $\hat{u}_i$ are uncorrelated with $X_i$: $\sum_{i=1}^{n} \hat{u}_i X_i = 0$.
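These four properties are easy to verify numerically; a minimal sketch on the Example 1 data:

```python
import numpy as np

Y = np.array([5, 7, 8, 10, 11, 13], dtype=float)
X = np.array([6, 9, 10, 12, 13, 16], dtype=float)

x = X - X.mean()
b2 = (x * (Y - Y.mean())).sum() / (x**2).sum()
b1 = Y.mean() - b2 * X.mean()
Y_hat = b1 + b2 * X
u_hat = Y - Y_hat

print(np.isclose(Y_hat.mean(), Y.mean()))      # mean of fitted Y equals mean of actual Y
print(np.isclose(u_hat.sum(), 0.0))            # residuals sum to zero
print(np.isclose((Y_hat * u_hat).sum(), 0.0))  # residuals uncorrelated with fitted Y
print(np.isclose((X * u_hat).sum(), 0.0))      # residuals uncorrelated with X
```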
2.3. Assumptions of the OLS method
In regression analysis our objective is not only to obtain $\hat{\beta}_1$ and $\hat{\beta}_2$ but also to draw inferences about the true $\beta_1$ and $\beta_2$.
For example, we would like to know how close $\hat{\beta}_1$ and $\hat{\beta}_2$ are to their counterparts in the population, or how close $\hat{Y}_i$ is to the true $E(Y|X_i)$.
To that end, we must not only specify the functional form
of the model, but also make certain assumptions about the
manner in which Yi are generated.
The Gaussian, standard, or classical linear regression
model (CLRM), which is the cornerstone of most
econometric theory, makes 10 assumptions:
2.3. Assumptions of the OLS method
1. The regression model is linear in the parameters:
$Y_i = \beta_1 + \beta_2 X_i + u_i$   (2.3.1)
2. X values are fixed in repeated sampling. Values
taken by the regressor X are considered fixed in
repeated samples. More technically, X is assumed to
be nonstochastic. → Our regression analysis is conditional regression analysis, that is, conditional on the given values of the regressor(s) X.
2.3. Assumptions of the OLS method
3. Zero mean value of the disturbance $u_i$. Given the value of X, the mean, or expected, value of the random disturbance term $u_i$ is zero. Technically, the conditional mean value of $u_i$ is zero. Symbolically, we have:
$E(u_i | X_i) = 0$   (2.3.2)
[Figure: the PRF $Y_i = \beta_1 + \beta_2 X_i$ with deviations $+u_i$ and $-u_i$ averaging to zero at each of $X_1, X_2, X_3, X_4$.]
2.3. Assumptions of the OLS method
4. Homoscedasticity or equal variance of ui. Given the
value of X, the variance of ui is the same for all
observations. That is, the conditional variances of ui are
identical. Symbolically, we have:
$\text{var}(u_i|X_i) = E[u_i - E(u_i|X_i)]^2 = E(u_i^2|X_i) = \sigma^2$   (2.3.3)
This assumption states that the variance of ui for each Xi is
some positive constant number equal to σ2.
In other words, the Y populations corresponding to various X values have the same variance.
2.3. Assumptions of the OLS method
Put simply, the variation around the regression line (which is the line of average relationship between Y and X) is the same across the X values; it neither increases nor decreases as X varies.
[Figure: homoscedasticity — identical conditional probability densities f(u) of $u_i$ at $X_1, X_2, \dots, X_i$ around the PRF $Y_i = \beta_1 + \beta_2 X_i$.]
2.3. Assumptions of the OLS method
In contrast, the conditional variance of the Y population may vary with X. This situation is known appropriately as heteroscedasticity, or unequal spread, or unequal variance. Symbolically, we have:
$\text{var}(u_i|X_i) = \sigma_i^2$   (2.3.4)
[Figure: heteroscedasticity — conditional probability densities f(u) of $u_i$ whose spread changes across $X_1, X_2, \dots, X_i$ around the PRF $Y_i = \beta_1 + \beta_2 X_i$.]
2.3. Assumptions of the OLS method
5. No autocorrelation between the disturbances. Given any two X values, $X_i$ and $X_j$ ($i \neq j$), the correlation between any two $u_i$ and $u_j$ ($i \neq j$) is zero. Symbolically,
$\text{cov}(u_i, u_j | X_i, X_j) = E\{[u_i - E(u_i)]|X_i\}\{[u_j - E(u_j)]|X_j\} = E(u_i|X_i)E(u_j|X_j) = 0$   (2.3.5)
where i and j are two different observations and where cov means covariance.
In words, it postulates that the disturbances ui and uj are
uncorrelated.
Technically, this is the assumption of no serial
correlation, or no autocorrelation.
Patterns of correlation among the disturbances:
[Figure: three scatter plots of the disturbances — (a) positive serial correlation; (b) negative serial correlation; (c) zero correlation.]
2.3. Assumptions of the OLS method
6. Zero covariance between $u_i$ and $X_i$, or $E(u_i X_i) = 0$   (2.3.6)
Formally,
$\text{cov}(u_i, X_i) = E[u_i - E(u_i)][X_i - E(X_i)]$
$= E[u_i(X_i - E(X_i))]$   since $E(u_i) = 0$
$= E(u_i X_i) - E(X_i)E(u_i)$   since $E(X_i)$ is nonstochastic
$= E(u_i X_i)$   since $E(u_i) = 0$
$= 0$
It states that the disturbance u and explanatory variable X
are uncorrelated.
The rationale for this assumption is as follows:
2.3. Assumptions of the OLS method
When we expressed the PRF, we assumed that X and u
(which may represent the influence of all the omitted
variables) have separate (and additive) influence on Y.
But if X and u are correlated, it is not possible to assess
their individual effects on Y.
Thus, if X and u are positively correlated, X increases when
u increases and it decreases when u decreases.
Similarly, if X and u are negatively correlated, X increases
when u decreases and it decreases when u increases.
In either case, it is difficult to isolate the influence of X and
u on Y.
2.3. Assumptions of the OLS method
7. The number of observations n must be greater than
the number of parameters to be estimated.
Alternatively, the number of observations n must be
greater than the number of explanatory variables.
8. Variability in X values. The X values in a given sample
must not all be the same. Technically, var (X) must be a
finite positive number.
9. The regression model is correctly specified.
Alternatively, there is no specification bias or error in
the model used in empirical analysis.
10. There is no perfect multicollinearity. That is, there are
no perfect linear relationships among the explanatory
variables.
2.4. Precision or standard errors of OLS estimates
It is evident that least-squares estimates are a function
of the sample data. But since the data are likely to
change from sample to sample, the estimates will
change ipso facto.
Therefore, what is needed is some measure of precision of the estimators $\hat{\beta}_1$ and $\hat{\beta}_2$.
In statistics the precision of an estimate is measured
by its standard error (se).
Given the Gaussian assumptions, the standard errors
of the OLS estimates can be obtained as follows:
2.4. Precision or standard errors of OLS estimates
$\text{var}(\hat{\beta}_2) = \dfrac{\sigma^2}{\sum_{i=1}^{n} x_i^2}$   (2.3.7)
$\text{se}(\hat{\beta}_2) = \dfrac{\sigma}{\sqrt{\sum_{i=1}^{n} x_i^2}}$   (2.3.8)
$\text{var}(\hat{\beta}_1) = \dfrac{\sum_{i=1}^{n} X_i^2}{n\sum_{i=1}^{n} x_i^2}\,\sigma^2$   (2.3.9)
$\text{se}(\hat{\beta}_1) = \sqrt{\dfrac{\sum_{i=1}^{n} X_i^2}{n\sum_{i=1}^{n} x_i^2}}\;\sigma$   (2.3.10)
where var = variance, se = standard error, and σ² is the constant or homoscedastic variance of $u_i$ of Assumption 4.
2.4. Precision or standard errors of OLS estimates
All the quantities entering into the preceding equations
except σ2 can be estimated from the data. σ2 itself is
estimated by the following formula:
$\hat{\sigma}^2 = \dfrac{\sum_{i=1}^{n} \hat{u}_i^2}{n-2}$   (2.3.11)
where:
$\hat{\sigma}^2$ = the OLS estimator of the true but unknown σ²
n − 2 = the number of degrees of freedom (df)
$\sum \hat{u}_i^2$ = the sum of the squared residuals, or the residual sum of squares (RSS)
2.4. Precision or standard errors of OLS estimates
$\sum \hat{u}_i^2$ can be computed as follows:
$\hat{u}_i^2 = \left(Y_i - \hat{\beta}_1 - \hat{\beta}_2 X_i\right)^2$
Note that the positive square root of $\hat{\sigma}^2$,
$\hat{\sigma} = \sqrt{\dfrac{\sum_{i=1}^{n} \hat{u}_i^2}{n-2}}$   (2.3.12)
is known as the standard error of estimate or the standard error of the regression (se).
It is often used as a summary measure of the “goodness of
fit” of the estimated regression line.
2.4. Precision or standard errors of OLS estimates
Example 2:
Compute the variance and standard error of $\hat{\beta}_1$ and $\hat{\beta}_2$ in Example 1.
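A minimal sketch of the computation for Example 2, applying (2.3.7), (2.3.9), and (2.3.11) to the Example 1 data:

```python
import numpy as np

Y = np.array([5, 7, 8, 10, 11, 13], dtype=float)
X = np.array([6, 9, 10, 12, 13, 16], dtype=float)
n = len(Y)

x = X - X.mean()
b2 = (x * (Y - Y.mean())).sum() / (x**2).sum()
b1 = Y.mean() - b2 * X.mean()
u_hat = Y - (b1 + b2 * X)

sigma2_hat = (u_hat**2).sum() / (n - 2)                  # eq. (2.3.11)
var_b2 = sigma2_hat / (x**2).sum()                       # eq. (2.3.7)
var_b1 = (X**2).sum() * sigma2_hat / (n * (x**2).sum())  # eq. (2.3.9)

print(var_b2, np.sqrt(var_b2))  # ≈ 0.0014, se(b2) ≈ 0.0373
print(var_b1, np.sqrt(var_b1))  # ≈ 0.1819, se(b1) ≈ 0.4266
```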
2.5. Properties of OLS estimators: The
Gauss-Markov theorem
Given the assumptions of the classical linear
regression model, the least-squares estimates possess
some ideal or optimum properties.
These properties are contained in the well-known
Gauss–Markov theorem.
To understand this theorem, we need to consider the best linear unbiasedness property of an estimator.
2.5. Properties of OLS estimators: The
Gauss-Markov theorem
An estimator, say the OLS estimator $\hat{\beta}_2$, is said to be a best linear unbiased estimator (BLUE) of $\beta_2$ if the following hold:
1. It is linear, that is, a linear function of a random variable,
such as the dependent variable Y in the regression model.
2. It is unbiased, that is, its average or expected value, $E(\hat{\beta}_2)$, is equal to the true value, $\beta_2$.
3. It has minimum variance in the class of all such linear
unbiased estimators; an unbiased estimator with the least
variance is known as an efficient estimator.
2.5. Properties of OLS estimators: The
Gauss-Markov theorem
In the regression context it can be proved that the OLS
estimators are BLUE.
This is the gist of the famous Gauss–Markov theorem,
which can be stated as follows:
Gauss–Markov Theorem: Given the assumptions of
the classical linear regression model, the least-squares estimators, in the class of unbiased linear
estimators, have minimum variance, that is, they
are BLUE.
2.5. Properties of OLS estimators: The
Gauss-Markov theorem
In Figure (a) we have shown the sampling distribution of the OLS estimator $\hat{\beta}_2$, that is, the distribution of the values taken by $\hat{\beta}_2$ in repeated sampling experiments. For convenience we have assumed $\hat{\beta}_2$ to be distributed symmetrically.
Here, the mean of the $\hat{\beta}_2$ values, $E(\hat{\beta}_2)$, is equal to the true $\beta_2$. In this situation we say that $\hat{\beta}_2$ is an unbiased estimator of $\beta_2$.
[Figure (a): the sampling distribution of $\hat{\beta}_2$, centered at $E(\hat{\beta}_2) = \beta_2$.]
2.5. Properties of OLS estimators: The
Gauss-Markov theorem
In Figure (b) we have shown the sampling distribution of $\beta_2^*$, an alternative estimator of $\beta_2$ obtained by using another (i.e., other than OLS) method.
For convenience, assume that $\beta_2^*$, like $\hat{\beta}_2$, is unbiased, that is, its average or expected value is equal to $\beta_2$. Assume further that both $\hat{\beta}_2$ and $\beta_2^*$ are linear estimators: they are linear functions of Y.
Which estimator, $\hat{\beta}_2$ or $\beta_2^*$, would you choose?
[Figure (b): the sampling distribution of $\beta_2^*$, centered at $E(\beta_2^*) = \beta_2$.]
2.5. Properties of OLS estimators: The
Gauss-Markov theorem
Although both $\hat{\beta}_2$ and $\beta_2^*$ are unbiased, the distribution of $\beta_2^*$ is more diffused or widespread around the mean value than the distribution of $\hat{\beta}_2$. In other words, the variance of $\beta_2^*$ is larger than the variance of $\hat{\beta}_2$.
Given two estimators that are both linear and unbiased, one would choose the estimator with the smaller variance, because it is more likely to be close to $\beta_2$ than the alternative estimator. In short, one would choose the BLUE estimator.
[Figure (c): the sampling distributions of $\hat{\beta}_2$ and $\beta_2^*$ around $\beta_2$; $\hat{\beta}_2$ has the smaller variance.]
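The "repeated sampling" idea can be illustrated with a small Monte Carlo sketch; the true parameter values and error distribution below are chosen purely for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)
beta1, beta2, sigma = -0.17, 0.83, 1.0  # illustrative "true" values
X = np.array([6, 9, 10, 12, 13, 16], dtype=float)
x = X - X.mean()

b2_draws = np.empty(10_000)
for k in range(b2_draws.size):          # repeated samples, X held fixed
    u = rng.normal(0.0, sigma, size=X.size)
    Y = beta1 + beta2 * X + u
    b2_draws[k] = (x * (Y - Y.mean())).sum() / (x**2).sum()

print(b2_draws.mean())  # ≈ 0.83: unbiasedness, E(b2_hat) = beta2
print(b2_draws.var())   # ≈ sigma**2 / sum(x**2) = 1/60 ≈ 0.0167
```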
2.6. The coefficient of determination r2 : a
measure of goodness of fit
It is clear that if all the observations were to lie on the
regression line, we would obtain a “perfect” fit, but this
is rarely the case.
Generally, there will be some positive 𝑢ො 𝑖 and some
negative 𝑢ො 𝑖 . What we hope for is that these residuals
around the regression line are as small as possible.
The coefficient of determination r2 (two-variable
case) or R2 (multiple regression) is a summary
measure that tells how well the sample regression line
fits the data.
2.6. The coefficient of determination r2 : a
measure of goodness of fit
[Figure: two overlapping circles — circle Y for variation in the dependent variable and circle X for variation in the explanatory variable; their overlap is the variation in Y explained by X.]
2.6. The coefficient of determination r2 : a
measure of goodness of fit
In the above figure the circle Y represents variation in the
dependent variable Y and the circle X represents variation
in the explanatory variable X.
The overlap of the two circles (the shaded area) indicates
the extent to which the variation in Y is explained by the
variation in X (say, via an OLS regression).
The greater the extent of the overlap, the greater the
variation in Y is explained by X.
The r2 is simply a numerical measure of this overlap: as the area of the overlap increases, a successively greater proportion of the variation in Y is explained by X.
2.6. The coefficient of determination r2 : a
measure of goodness of fit
To compute this r2, we proceed as follows. Recall that:
$Y_i = \hat{Y}_i + \hat{u}_i$   (2.6.1)
Or in the deviation form:
$y_i = \hat{y}_i + \hat{u}_i$   (2.6.2)
Squaring (2.6.2) on both sides and summing over the sample, we obtain:
$\sum y_i^2 = \sum \hat{y}_i^2 + \sum \hat{u}_i^2 + 2\sum \hat{y}_i \hat{u}_i = \sum \hat{y}_i^2 + \sum \hat{u}_i^2 = \hat{\beta}_2^2 \sum x_i^2 + \sum \hat{u}_i^2$   (2.6.3)
since $\sum \hat{y}_i \hat{u}_i = 0$ and $\hat{y}_i = \hat{\beta}_2 x_i$.
2.6. The coefficient of determination r2 : a
measure of goodness of fit
$TSS = \sum y_i^2 = \sum (Y_i - \bar{Y})^2$   (2.6.5)
Total sum of squares (TSS): the total variation of the actual Y values about their sample mean.
$ESS = \sum \hat{y}_i^2 = \sum (\hat{Y}_i - \bar{Y})^2 = \hat{\beta}_2^2 \sum x_i^2$   (2.6.6)
Explained sum of squares (ESS): the variation of the estimated Y values about their mean.
$RSS = \sum \hat{u}_i^2 = \sum (Y_i - \hat{Y}_i)^2$   (2.6.7)
Residual sum of squares (RSS): the residual or unexplained variation of the Y values about the regression line.
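A quick numerical check of these three sums on the Example 1 data (a sketch):

```python
import numpy as np

Y = np.array([5, 7, 8, 10, 11, 13], dtype=float)
X = np.array([6, 9, 10, 12, 13, 16], dtype=float)

x = X - X.mean()
b2 = (x * (Y - Y.mean())).sum() / (x**2).sum()
Y_hat = (Y.mean() - b2 * X.mean()) + b2 * X

TSS = ((Y - Y.mean())**2).sum()      # ≈ 42.00
ESS = ((Y_hat - Y.mean())**2).sum()  # ≈ 41.67
RSS = ((Y - Y_hat)**2).sum()         # ≈ 0.33
print(np.isclose(TSS, ESS + RSS))    # True: TSS = ESS + RSS
```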
2.6. The coefficient of determination r2 : a
measure of goodness of fit
[Figure: for an observation $(X_i, Y_i)$, the total deviation $Y_i - \bar{Y}$ (TSS) splits into the explained part $\hat{Y}_i - \bar{Y}$ (ESS) and the residual $Y_i - \hat{Y}_i$ (RSS), relative to the SRF.]
2.6. The coefficient of determination r2 : a
measure of goodness of fit
Thus:
$TSS = ESS + RSS$   (2.6.8)
$\Leftrightarrow \sum y_i^2 = \hat{\beta}_2^2 \sum x_i^2 + \sum \hat{u}_i^2$
This shows that the total variation in the observed Y
values about their mean value can be partitioned into
two parts, one attributable to the regression line and
the other to random forces because not all actual Y
observations lie on the fitted line.
2.6. The coefficient of determination r2 : a
measure of goodness of fit
Now dividing (2.6.8) by TSS on both sides, we obtain:
$1 = \dfrac{ESS}{TSS} + \dfrac{RSS}{TSS} = \dfrac{\sum (\hat{Y}_i - \bar{Y})^2}{\sum (Y_i - \bar{Y})^2} + \dfrac{\sum \hat{u}_i^2}{\sum (Y_i - \bar{Y})^2}$
We now define r2 as:
$r^2 = \dfrac{\sum (\hat{Y}_i - \bar{Y})^2}{\sum (Y_i - \bar{Y})^2} = \dfrac{ESS}{TSS}$   (2.6.8)
or
$r^2 = 1 - \dfrac{\sum \hat{u}_i^2}{\sum (Y_i - \bar{Y})^2} = 1 - \dfrac{RSS}{TSS}$   (2.6.9)
The quantity r2 thus is known as the (sample) coefficient of
determination and is the most commonly used measure of the
goodness of fit of a regression line.
Verbally, r2 measures the proportion or percentage of the total
variation in Y explained by the regression model.
2.6. The coefficient of determination r2 : a
measure of goodness of fit
Two properties of r2 may be noted:
It is a nonnegative quantity.
Its limits are 0 ≤ r2 ≤ 1.
An r2 of 1 means a perfect fit, that is, $\hat{Y}_i = Y_i$ for each i.
On the other hand, an r2 of zero means that there is no relationship between the regressand and the regressor whatsoever (i.e., $\hat{\beta}_2 = 0$).
2.6. The coefficient of determination r2 : a
measure of goodness of fit
r2 can be computed more quickly from the following formula:
$r^2 = \dfrac{\sum \hat{y}_i^2}{\sum y_i^2} = \dfrac{\hat{\beta}_2^2 \sum x_i^2}{\sum y_i^2} = \hat{\beta}_2^2 \left(\dfrac{\sum x_i^2}{\sum y_i^2}\right)$   (2.6.10)
If we divide the numerator and the denominator of (2.6.10) by the sample size n (or n − 1 if the sample size is small), we obtain:
$r^2 = \hat{\beta}_2^2 \left[\dfrac{\sum x_i^2/(n-1)}{\sum y_i^2/(n-1)}\right] = \hat{\beta}_2^2 \dfrac{S_x^2}{S_y^2}$   (2.6.11)
where $S_x^2$ and $S_y^2$ are the sample variances of X and Y, respectively.
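A quick check of (2.6.11) on the Example 1 data (a sketch; ddof=1 gives the n − 1 sample variances):

```python
import numpy as np

Y = np.array([5, 7, 8, 10, 11, 13], dtype=float)
X = np.array([6, 9, 10, 12, 13, 16], dtype=float)

x = X - X.mean()
b2 = (x * (Y - Y.mean())).sum() / (x**2).sum()

r2 = b2**2 * X.var(ddof=1) / Y.var(ddof=1)  # eq. (2.6.11)
print(r2)  # ≈ 0.992
```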
2.6. The coefficient of determination r2 : a
measure of goodness of fit
• On the other side, the SRF in deviation form is: $y_i = \hat{\beta}_2 x_i + \hat{u}_i$
$\Rightarrow \hat{\beta}_2 = \dfrac{\sum x_i (y_i - \hat{u}_i)}{\sum x_i^2} = \dfrac{\sum x_i y_i - \sum x_i \hat{u}_i}{\sum x_i^2}$
• Since $\sum x_i \hat{u}_i = 0$: $\hat{\beta}_2 = \dfrac{\sum x_i y_i}{\sum x_i^2}$   (2.6.12)
• Replacing (2.6.12) in (2.6.10), we have: $r^2 = \dfrac{\left(\sum x_i y_i\right)^2}{\sum x_i^2 \sum y_i^2}$   (2.6.13)
• Taking the square root of both sides of (2.6.13):
$r = \dfrac{\sum x_i y_i}{\sqrt{\left(\sum x_i^2\right)\left(\sum y_i^2\right)}} = \dfrac{n\sum X_i Y_i - \left(\sum X_i\right)\left(\sum Y_i\right)}{\sqrt{\left[n\sum X_i^2 - \left(\sum X_i\right)^2\right]\left[n\sum Y_i^2 - \left(\sum Y_i\right)^2\right]}}$   (2.6.14)
→ which is known as the simple correlation coefficient.
2.6. The coefficient of determination r2 : a
measure of goodness of fit
Example 3:
Compute r2 and r of the model in example 1
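One way to compute both for Example 3, using the deviation-form formula (2.6.14) (a sketch):

```python
import numpy as np

Y = np.array([5, 7, 8, 10, 11, 13], dtype=float)
X = np.array([6, 9, 10, 12, 13, 16], dtype=float)

x = X - X.mean()
y = Y - Y.mean()

r = (x * y).sum() / np.sqrt((x**2).sum() * (y**2).sum())  # eq. (2.6.14)
print(r, r**2)  # r ≈ 0.996, r² ≈ 0.992
```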
2.6. The coefficient of determination r2 : a
measure of goodness of fit
Some of the properties of r are as follows:
1. It can be positive or negative, the sign depending on
the sign of the term in the numerator of (2.6.14), which
measures the sample covariation of two variables.
2. It lies between the limits of −1 and +1; that is, −1 ≤ r ≤ 1.
3. It is symmetrical in nature; that is, the coefficient of
correlation between X and Y(rXY) is the same as that
between Y and X(rYX).
2.6. The coefficient of determination r2 : a
measure of goodness of fit
4. If X and Y are statistically independent, the correlation coefficient between them is zero; but if r = 0, it does not mean that the two variables are independent.
5. It is a measure of linear association or linear
dependence only; it has no meaning for
describing nonlinear relations.
6. Although it is a measure of linear association between two variables, it does not necessarily imply any cause-and-effect relationship.
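Finally, all of the quantities in this section — the coefficient estimates, their standard errors, and r2 — can be read off a standard regression fit; a minimal sketch using statsmodels (one common choice, not the only one):

```python
import numpy as np
import statsmodels.api as sm

Y = np.array([5, 7, 8, 10, 11, 13], dtype=float)
X = np.array([6, 9, 10, 12, 13, 16], dtype=float)

# add_constant prepends a column of ones so the model has an intercept
results = sm.OLS(Y, sm.add_constant(X)).fit()
print(results.params)    # [b1_hat, b2_hat] ≈ [-0.167, 0.833]
print(results.bse)       # [se(b1_hat), se(b2_hat)] ≈ [0.427, 0.037]
print(results.rsquared)  # ≈ 0.992
```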