Statistical Inference and OLS Asymptotics
Prof. Rishman Jot Kaur Chahal
                    HSN - 206
  Department of Humanities and Social Sciences
     Indian Institute of Technology Roorkee
Normality Assumption
Remember we assumed that $u_i \sim N(0, \sigma^2)$. So, we assumed that $u_i$ follows a normal distribution with mean 0 and constant variance $\sigma^2$.
Moreover, the estimators $\hat{\beta}_1$, $\hat{\beta}_2$ and $\hat{\beta}_3$ are themselves normally distributed with means equal to the true $\beta_1$, $\beta_2$ and $\beta_3$, and their respective variances can also be derived for a three-variable regression model.
Also, $(n-3)\,\hat{\sigma}^2/\sigma^2$ follows the $\chi^2$ distribution with $n-3$ degrees of freedom.
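A quick way to convince yourself of this result is by simulation. The sketch below is only an illustration, not part of the slides: the sample size, error variance, true coefficients and number of replications are all assumed. It repeatedly draws data from a three-variable model, forms $(n-3)\hat{\sigma}^2/\sigma^2$, and checks that its average is close to $n-3$, the mean of a $\chi^2_{n-3}$ variate.

```python
import numpy as np

rng = np.random.default_rng(0)
n, sigma2, reps = 30, 4.0, 5000          # assumed sample size, error variance, replications
X = np.column_stack([np.ones(n), rng.normal(size=n), rng.normal(size=n)])  # intercept, X2, X3

draws = []
for _ in range(reps):
    u = rng.normal(scale=np.sqrt(sigma2), size=n)
    y = 1.0 + 2.0 * X[:, 1] - 1.0 * X[:, 2] + u   # arbitrary true betas
    beta_hat = np.linalg.solve(X.T @ X, X.T @ y)
    resid = y - X @ beta_hat
    sigma2_hat = resid @ resid / (n - 3)          # unbiased estimator of sigma^2
    draws.append((n - 3) * sigma2_hat / sigma2)   # should behave like a chi-square(n-3) draw

print(np.mean(draws))   # close to n - 3 = 27, the mean of a chi-square(n-3)
```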
Normality Assumption
Further, each of the estimated parameters follows a t-distribution with $(n-3)$ df:
$$t = \frac{\hat{\beta}_1 - \beta_1}{se(\hat{\beta}_1)} \qquad (1)$$
$$t = \frac{\hat{\beta}_2 - \beta_2}{se(\hat{\beta}_2)} \qquad (2)$$
$$t = \frac{\hat{\beta}_3 - \beta_3}{se(\hat{\beta}_3)} \qquad (3)$$
Testing the Individual Regression Coefficients
Consider the following 3-variable regression model:
$$Y_i = \beta_1 + \beta_2 X_{2i} + \beta_3 X_{3i} + \epsilon_i \qquad (4)$$
$H_0: \beta_2 = 0$, and
$H_1: \beta_2 \neq 0$
Can you explain in words what your null hypothesis states?
     It means that, holding X3 constant, X2 has no effect or influence on Y.
To test the null hypothesis, we can rely on the t-test.
$$t = \frac{\hat{\beta}_2 - \beta_2}{se(\hat{\beta}_2)}$$
With the confidence interval approach:
$$\hat{\beta}_2 - t_{\alpha/2}\, se(\hat{\beta}_2) \leq \beta_2 \leq \hat{\beta}_2 + t_{\alpha/2}\, se(\hat{\beta}_2)$$
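A minimal sketch of both the t-test and the confidence-interval approach (the data are simulated, so the sample size, true coefficients and variable names are assumptions, not part of the slides):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
n = 50                                            # assumed sample size
X = np.column_stack([np.ones(n), rng.normal(size=n), rng.normal(size=n)])  # 1, X2, X3
y = 1.0 + 0.5 * X[:, 1] + 0.0 * X[:, 2] + rng.normal(size=n)

k = X.shape[1]
beta_hat = np.linalg.solve(X.T @ X, X.T @ y)
resid = y - X @ beta_hat
sigma2_hat = resid @ resid / (n - k)
se = np.sqrt(np.diag(sigma2_hat * np.linalg.inv(X.T @ X)))

t_b2 = beta_hat[1] / se[1]                        # t statistic for H0: beta2 = 0
p_val = 2 * stats.t.sf(abs(t_b2), df=n - k)       # two-sided p-value
t_crit = stats.t.ppf(0.975, df=n - k)             # alpha = 0.05
ci = (beta_hat[1] - t_crit * se[1], beta_hat[1] + t_crit * se[1])
print(t_b2, p_val, ci)
```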
Testing the overall significance of the Regression
    H0 : β2 = β3 = 0.
This is called testing the overall significance because the null is a joint hypothesis that β2 and β3 are jointly, or simultaneously, equal to zero.
But why can't we test the overall significance by testing the significance of $\hat{\beta}_2$ and $\hat{\beta}_3$ individually?
Testing the overall significance of the Regression
Look back at the confidence interval approach, where we establish a confidence interval for β2, say at the 95% level.
However, we cannot say that both β2 and β3 simultaneously lie in their respective confidence intervals with probability (1 − α)(1 − α) = (0.95)(0.95) = 0.9025, because $\hat{\beta}_2$ and $\hat{\beta}_3$ are generally correlated and the two interval statements are not independent.
So the following statements are true individually, but not simultaneously for both β2 and β3:
$$\Pr[\hat{\beta}_2 - t_{\alpha/2}\, se(\hat{\beta}_2) \leq \beta_2 \leq \hat{\beta}_2 + t_{\alpha/2}\, se(\hat{\beta}_2)] = 1 - \alpha$$
$$\Pr[\hat{\beta}_3 - t_{\alpha/2}\, se(\hat{\beta}_3) \leq \beta_3 \leq \hat{\beta}_3 + t_{\alpha/2}\, se(\hat{\beta}_3)] = 1 - \alpha$$
Testing the overall significance of the Regression
The Analysis of Variance (ANOVA) approach can also be used to test the overall significance.
Remember,
$$TSS = ESS + RSS$$
$$\sum y_i^2 = \left(\hat{\beta}_2 \sum y_i x_{2i} + \hat{\beta}_3 \sum y_i x_{3i}\right) + \sum \hat{u}_i^2$$
Now, for the null hypothesis β2 = β3 = 0, it can be shown that
$$F = \frac{\left(\hat{\beta}_2 \sum y_i x_{2i} + \hat{\beta}_3 \sum y_i x_{3i}\right)/2}{\sum \hat{u}_i^2/(n - 3)} \qquad (5)$$
follows the F-distribution with 2 and (n − 3) df.
Testing the overall significance of the Regression
For a multiple regression, we test the hypothesis
$$H_0: \beta_2 = \beta_3 = \dots = \beta_k = 0$$
$H_1$: Not all slope coefficients are simultaneously zero.
$$F = \frac{ESS/df}{RSS/df} = \frac{ESS/(k - 1)}{RSS/(n - k)}$$
Can you establish a relationship between $R^2$ and F? (Hint: use the above equation.)
Testing the overall significance of the Regression
    Using the above equation we can say that
$$F = \frac{n - k}{k - 1}\cdot\frac{ESS}{RSS}
    = \frac{n - k}{k - 1}\cdot\frac{ESS}{TSS - ESS}
    = \frac{n - k}{k - 1}\cdot\frac{ESS/TSS}{1 - ESS/TSS}
    = \frac{n - k}{k - 1}\cdot\frac{R^2}{1 - R^2}
    = \frac{R^2/(k - 1)}{(1 - R^2)/(n - k)}$$
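A small numerical check of the algebra above (a sketch with simulated data and assumed names): the ESS/RSS form and the R² form of F give the same value.

```python
import numpy as np

rng = np.random.default_rng(2)
n, k = 60, 3                                       # assumed sample size; intercept + 2 slopes
X = np.column_stack([np.ones(n), rng.normal(size=n), rng.normal(size=n)])
y = 2.0 + 1.5 * X[:, 1] - 0.8 * X[:, 2] + rng.normal(size=n)

beta_hat = np.linalg.solve(X.T @ X, X.T @ y)
resid = y - X @ beta_hat
rss = resid @ resid
tss = np.sum((y - y.mean()) ** 2)
ess = tss - rss
r2 = ess / tss

F_anova = (ess / (k - 1)) / (rss / (n - k))        # ESS/RSS form
F_r2 = (r2 / (k - 1)) / ((1 - r2) / (n - k))       # R^2 form
print(F_anova, F_r2)                               # identical up to rounding
```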
Testing the overall significance of the Regression
Thus the F test, which is a measure of the overall significance of the estimated regression, is also a test of the significance of $R^2$. In other words, testing the null $H_0: \beta_2 = \beta_3 = \dots = \beta_k = 0$ is equivalent to testing the null $H_0: R^2 = 0$.
Testing the overall significance of the Regression
Numerical Example: Suppose you regress child mortality (CM) on per capita GNP (PGNP) and obtain the following results:
$$\widehat{CM}_i = 157.42 - 0.0114\, PGNP_i$$
$$t = (15.9894)\;\; (-3.5156)$$
where $r^2 = 0.1662$ and adjusted $r^2 = 0.1528$.
    Now, test the hypothesis that PGNP has no significant effect on CM.
    Also, you are given the following information.
            SS            df    MSS
    ESS     60,449.50      1    60,449.50
    RSS     303,228.50    62     4,890.782
    Total   363,678.00    63
    Further, can you establish a relationship between F and t?
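As a worked check with the numbers in the ANOVA table above (this only rearranges figures already given):
$$F = \frac{ESS/1}{RSS/62} = \frac{60{,}449.50}{4{,}890.782} \approx 12.36, \qquad t^2 = (-3.5156)^2 \approx 12.36,$$
so with a single slope coefficient the overall F statistic is simply the square of the t statistic on PGNP, i.e. $F_{1,\,n-k} = t_{n-k}^2$.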
Testing the overall significance of the Regression
    Consider the following multiple regression
                   Yi = β1 + β2 X2i + β3 X3i + β4 X4i + ui
    We want to test the hypotheses:
$H_0: \beta_3 = \beta_4$, or $(\beta_3 - \beta_4) = 0$
$H_1: \beta_3 \neq \beta_4$, or $(\beta_3 - \beta_4) \neq 0$
    How do we test this?
Testing the overall significance of the Regression
$$t = \frac{(\hat{\beta}_3 - \hat{\beta}_4) - (\beta_3 - \beta_4)}{se(\hat{\beta}_3 - \hat{\beta}_4)}$$
which follows a t-distribution with (n − 4) df.
Remember, $se(\hat{\beta}_3 - \hat{\beta}_4) = \sqrt{var(\hat{\beta}_3) + var(\hat{\beta}_4) - 2\,cov(\hat{\beta}_3, \hat{\beta}_4)}$
Thus, under $H_0: \beta_3 = \beta_4$ the test statistic will be
$$t = \frac{\hat{\beta}_3 - \hat{\beta}_4}{\sqrt{var(\hat{\beta}_3) + var(\hat{\beta}_4) - 2\,cov(\hat{\beta}_3, \hat{\beta}_4)}}$$
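A minimal sketch of this test on simulated data (all names, the sample size and the true coefficients are assumptions): the estimated variance-covariance matrix of β̂ supplies var(β̂3), var(β̂4) and cov(β̂3, β̂4).

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(3)
n = 80                                             # assumed sample size
X = np.column_stack([np.ones(n)] + [rng.normal(size=n) for _ in range(3)])  # 1, X2, X3, X4
y = 1.0 + 0.5 * X[:, 1] + 1.2 * X[:, 2] + 1.0 * X[:, 3] + rng.normal(size=n)

k = X.shape[1]                                     # k = 4
beta_hat = np.linalg.solve(X.T @ X, X.T @ y)
resid = y - X @ beta_hat
sigma2_hat = resid @ resid / (n - k)
cov = sigma2_hat * np.linalg.inv(X.T @ X)          # var-cov matrix of beta_hat

diff = beta_hat[2] - beta_hat[3]                   # beta3_hat - beta4_hat
se_diff = np.sqrt(cov[2, 2] + cov[3, 3] - 2 * cov[2, 3])
t = diff / se_diff                                 # H0: beta3 = beta4
p_val = 2 * stats.t.sf(abs(t), df=n - k)
print(t, p_val)
```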
Testing the overall significance of the Regression
                                                              
Numerical Example: Given
$$Y = \begin{pmatrix} 3 \\ 1 \\ 8 \\ 3 \\ 5 \end{pmatrix}, \qquad
  X = \begin{pmatrix} 1 & 3 & 5 \\ 1 & 1 & 4 \\ 1 & 5 & 6 \\ 1 & 2 & 4 \\ 1 & 4 & 6 \end{pmatrix}$$
    Write the regression model and estimate β̂.
    Write the estimated regression model.
Estimate the $R^2$. (Hint: convert $R^2 = \sum \hat{y}_i^2 / \sum y_i^2$ into X and Y form.)
Estimate the $\bar{R}^2$.
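For the first part of the exercise, a minimal numpy sketch with the data given above (using the standard OLS formula β̂ = (X′X)⁻¹X′Y):

```python
import numpy as np

Y = np.array([3.0, 1.0, 8.0, 3.0, 5.0])
X = np.array([[1, 3, 5],
              [1, 1, 4],
              [1, 5, 6],
              [1, 2, 4],
              [1, 4, 6]], dtype=float)

beta_hat = np.linalg.inv(X.T @ X) @ X.T @ Y        # beta_hat = (X'X)^{-1} X'Y
print(beta_hat)                                    # [beta1_hat, beta2_hat, beta3_hat]
```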
Testing the overall significance of the Regression
Remember, TSS = $\sum y_i^2$, which is $\sum (Y_i - \bar{Y})^2$. So,
$$TSS = \sum Y_i^2 - \frac{1}{n}\left(\sum Y_i\right)^2 = Y'Y - \frac{(\sum Y_i)^2}{n} \qquad (6)$$
RSS = $\sum e_i^2 = e'e = (Y - X\hat{\beta})'(Y - X\hat{\beta})$, so
$$e'e = Y'Y - \hat{\beta}'X'Y - Y'X\hat{\beta} + \hat{\beta}'X'X\hat{\beta}$$
and, using the normal equations $X'X\hat{\beta} = X'Y$,
$$RSS = e'e = Y'Y - \hat{\beta}'X'Y$$
So, ESS = $\sum \hat{y}_i^2 = \sum y_i^2 - \sum e_i^2 = \hat{\beta}'X'Y - \frac{(\sum Y_i)^2}{n}$.
Thus, $R^2$ is
$$R^2 = \frac{\hat{\beta}'X'Y - (\sum Y_i)^2/n}{Y'Y - (\sum Y_i)^2/n} \qquad (7)$$
Testing the overall significance of the Regression
Remember, $\bar{R}^2$ is
$$\bar{R}^2 = 1 - \frac{RSS/(n-k)}{TSS/(n-1)} \qquad (8)$$
Further, it is given as
$$\bar{R}^2 = 1 - \frac{n-1}{n-k}\,(1 - R^2) \qquad (9)$$
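Continuing the numerical example from a few slides back (the data are repeated here so the sketch stays self-contained), R² and R̄² follow directly from eqs. (6), (7) and (9):

```python
import numpy as np

Y = np.array([3.0, 1.0, 8.0, 3.0, 5.0])
X = np.array([[1, 3, 5], [1, 1, 4], [1, 5, 6], [1, 2, 4], [1, 4, 6]], dtype=float)
beta_hat = np.linalg.inv(X.T @ X) @ X.T @ Y

n, k = 5, 3                                        # 5 observations, 3 parameters
tss = Y @ Y - (Y.sum() ** 2) / n                   # eq. (6): Y'Y - (sum Y)^2 / n
ess = beta_hat @ (X.T @ Y) - (Y.sum() ** 2) / n    # beta_hat'X'Y - (sum Y)^2 / n
r2 = ess / tss                                     # eq. (7)
r2_adj = 1 - (n - 1) / (n - k) * (1 - r2)          # eq. (9)
print(r2, r2_adj)
```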
Restricted Least Squares
In economic models there are situations where the regression coefficients must satisfy certain linear equality restrictions.
Can you give an example of such a situation? (Hint: think of a form of production function widely used in economics.)
Restricted Least Squares
Cobb-Douglas Production Function:
$$Y_i = \beta_1 X_{2i}^{\beta_2} X_{3i}^{\beta_3} e^{u_i}$$
where Y = output, X2 = labor input, and X3 = capital input.
Taking logs on both sides,
$$\ln Y_i = \beta_0 + \beta_2 \ln X_{2i} + \beta_3 \ln X_{3i} + u_i$$
where $\beta_0 = \ln \beta_1$.
Now remember that under constant returns to scale $\beta_2 + \beta_3 = 1$, which is an example of a linear restriction.
Now you need to test whether this restriction is valid, i.e. whether there are constant returns to scale or not. How will you approach this?
Restricted Least Squares
    Two ways to approach the above question:
The usual t-test approach: here you do not impose the linear restriction; you simply estimate the parameters of the log model and then test the hypothesis (restriction).
This is known as the unrestricted or unconstrained regression.
$$t = \frac{(\hat{\beta}_2 + \hat{\beta}_3) - (\beta_2 + \beta_3)}{se(\hat{\beta}_2 + \hat{\beta}_3)}
    = \frac{(\hat{\beta}_2 + \hat{\beta}_3) - 1}{\sqrt{var(\hat{\beta}_2) + var(\hat{\beta}_3) + 2\,cov(\hat{\beta}_2, \hat{\beta}_3)}}$$
where the null hypothesis is $\beta_2 + \beta_3 = 1$.
So here we check whether the linear restriction holds after estimating the “unrestricted” regression (see the sketch below).
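A sketch of this unrestricted approach (the Cobb-Douglas data are simulated here, so every number and name is an assumption): estimate the log model and form the t statistic for H0: β2 + β3 = 1 from the coefficient estimates and their variance-covariance matrix.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(4)
n = 40                                                 # assumed sample size
lnX2 = rng.normal(3.0, 0.5, size=n)                    # log labor (simulated)
lnX3 = rng.normal(4.0, 0.5, size=n)                    # log capital (simulated)
lnY = 0.5 + 0.6 * lnX2 + 0.4 * lnX3 + rng.normal(scale=0.1, size=n)  # CRS holds by construction

X = np.column_stack([np.ones(n), lnX2, lnX3])
k = X.shape[1]
b = np.linalg.solve(X.T @ X, X.T @ lnY)
resid = lnY - X @ b
cov = (resid @ resid / (n - k)) * np.linalg.inv(X.T @ X)

t = (b[1] + b[2] - 1) / np.sqrt(cov[1, 1] + cov[2, 2] + 2 * cov[1, 2])  # H0: beta2 + beta3 = 1
p_val = 2 * stats.t.sf(abs(t), df=n - k)
print(t, p_val)
```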
Restricted Least Squares
The second, more direct, approach is the F-test.
Here we first incorporate the restriction into the estimation procedure at the outset, i.e.
$$\beta_2 = 1 - \beta_3 \qquad \text{or} \qquad \beta_3 = 1 - \beta_2$$
Using either of these conditions we can eliminate one of the β coefficients and write the model as
$$\ln Y_i = \beta_0 + (1 - \beta_3)\ln X_{2i} + \beta_3 \ln X_{3i} + u_i \qquad (10)$$
or
$$\ln Y_i - \ln X_{2i} = \beta_0 + \beta_3(\ln X_{3i} - \ln X_{2i}) + u_i \qquad (11)$$
$$\ln\frac{Y_i}{X_{2i}} = \beta_0 + \beta_3 \ln\frac{X_{3i}}{X_{2i}} + u_i \qquad (12)$$
where $Y_i/X_{2i}$ is the output-labor ratio and $X_{3i}/X_{2i}$ is the capital-labor ratio, quantities of great economic importance.
Restricted Least Squares
Eq. (11) is known as Restricted Least Squares (RLS).
Once β3 is estimated from eq. (11), β2 can easily be obtained from the restriction.
This procedure can be generalized to models containing any number of explanatory variables and more than one linear equality restriction.
But how do we know that the restriction is valid? Or, how can we compare the restricted and unrestricted least squares regressions?
Restricted Least Squares
$$F = \frac{(R^2_{UR} - R^2_R)/m}{(1 - R^2_{UR})/(n - k)} \qquad (13)$$
where $R^2_{UR}$ and $R^2_R$ are, respectively, the $R^2$ values obtained from the unrestricted and restricted regressions, and m is the number of linear restrictions.
Remember $R^2_{UR} \geq R^2_R$ and thus $\sum \hat{u}^2_{UR} \leq \sum \hat{u}^2_R$.
    Try to prove this on your own.
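A small helper for eq. (13) (a sketch; the numbers in the example call are hypothetical, and m denotes the number of linear restrictions):

```python
from scipy import stats

def restriction_f(r2_ur, r2_r, m, n, k):
    """F statistic of eq. (13) comparing unrestricted and restricted R^2."""
    f = ((r2_ur - r2_r) / m) / ((1 - r2_ur) / (n - k))
    p_val = stats.f.sf(f, m, n - k)
    return f, p_val

# hypothetical numbers, just to show the call
print(restriction_f(r2_ur=0.95, r2_r=0.92, m=1, n=30, k=3))
```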
Structural or parameter stability of the models: Chow Test
Structural change in the relationship between Y and the regressors may occur, especially in time series data.
    But what do you mean by structural change?
Structural or parameter stability of the models: Chow Test
    Structural change means that the values of the parameters of the
    model do not remain the same through the entire time period.
This may happen due to external factors like the Gulf War of 1990-91, policy changes (such as the switch from a fixed exchange-rate system to a flexible exchange-rate system around 1973), and other changes.
    But how do we know that a structural change has occurred?
Example: Suppose we want to estimate a simple savings function that relates savings (Y) to disposable personal income, DPI (X). Since we have data for 1970-95, we can obtain an OLS regression of Y on X.
Is there anything we are missing here?
Structural or parameter stability of the models: Chow Test
In 1982 the US suffered its worst peacetime recession, so assuming that the relationship between savings and disposable income stays the same over all 26 years is quite a strict assumption.
There is thus the possibility of “large” prediction errors, which would cast doubt on the constancy hypothesis; conversely, “small” prediction errors would support it.
In such a scenario, divide the n observations into two parts, n1 and n2, where n1 can be used for estimation and n2 for testing.
But how would we do this with cross-section data?
Structural or parameter stability of the models: Chow Test
    Now, in our US example we can divide the data as n1 = 12 for
    1970-81, n2 = 14 for 1982-95.
    Thus we can undertake 3 different regressions:
                           Yt = λ1 + λ2 Xt + u1t                   (14)
    for n1 = 12
                           Yt = γ1 + γ2 Xt + u2t                   (15)
    for n2 = 14
                            Yt = α1 + α2 Xt + ut                   (16)
    for n = n1 + n2 = 26
    What difference can you see between eq. 14, 15 and 16 in terms of
    parameters?
Structural or parameter stability of the models: Chow Test
For eq. 16 we can say that we assume λ1 = γ1 = α1 and λ2 = γ2 = α2.
Estimate eqs. 14 and 15 separately and then compare their parameters.
    But there exists a formal test as well i.e. the Chow test.
Using the designated n1 observations, regress y1 on X1 and obtain RSS1, i.e. e1′e1, with df = n1 − k; similarly, using the n2 observations, regress y2 on X2 and obtain RSS2 = e2′e2 with df = n2 − k.
Fit the same regression to all (n1 + n2) observations and obtain the restricted RSS, e∗′e∗, with df = n1 + n2 − k.
Wait! Why did I call it restricted?
Structural or parameter stability of the models: Chow Test
    Because it is obtained by imposing the restrictions that λ1 = γ1 and
    λ2 = γ2 , that is, the subperiod regressions are not different.
    Further, since the two sets of samples are deemed independent, we
    can add RSS1 and RSS2 to obtain what may be called the
    unrestricted residual sum of squares (RSSUR ).
    If there is no structural change then RSSR and RSSUR should not be
    different.
$$F = \frac{\left(e_*'e_* - (e_1'e_1 + e_2'e_2)\right)/k}{(e_1'e_1 + e_2'e_2)/(n_1 + n_2 - 2k)} \sim F_{k,\; n_1 + n_2 - 2k} \qquad (17)$$
Chow showed that, under the null hypothesis that regressions (14) and (15) are (statistically) the same (i.e., no structural change or break), this statistic follows the F distribution with k and (n1 + n2 − 2k) df.
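A sketch of the Chow test on simulated data (the break point, subsample sizes and coefficients are assumptions chosen to mimic the savings-income example):

```python
import numpy as np
from scipy import stats

def rss(y, X):
    """Residual sum of squares from an OLS fit of y on X."""
    b = np.linalg.solve(X.T @ X, X.T @ y)
    e = y - X @ b
    return e @ e

rng = np.random.default_rng(5)
n1, n2, k = 12, 14, 2                                # two subperiods, intercept + slope
x1, x2 = rng.normal(100, 10, n1), rng.normal(120, 10, n2)
y1 = 5 + 0.05 * x1 + rng.normal(size=n1)             # "pre-break" relation (assumed)
y2 = 1 + 0.10 * x2 + rng.normal(size=n2)             # "post-break" relation (assumed)

X1 = np.column_stack([np.ones(n1), x1])
X2 = np.column_stack([np.ones(n2), x2])
Xp = np.vstack([X1, X2])
yp = np.concatenate([y1, y2])

rss_r = rss(yp, Xp)                                  # restricted (pooled) RSS, e*'e*
rss_ur = rss(y1, X1) + rss(y2, X2)                   # unrestricted RSS, e1'e1 + e2'e2

F = ((rss_r - rss_ur) / k) / (rss_ur / (n1 + n2 - 2 * k))   # eq. (17)
p_val = stats.f.sf(F, k, n1 + n2 - 2 * k)
print(F, p_val)
```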
Can you name any assumption that we have kept in mind here?
Structural or parameter stability of the models: Chow Test
    The Chow test assumes that we know the point(s) of structural
    break. In our example, we assumed it to be in 1982.
Further, the error terms in the subperiod regressions, i.e. (14) and (15), are assumed to be normally distributed with the same (homoscedastic) variance σ², i.e. u1 ∼ N(0, σ²) and u2 ∼ N(0, σ²).
Can you examine this assumption here? Why would you say that the error variances of the two periods are the same? (Hint: think in terms of RSS.)
Structural or parameter stability of the models: Chow Test
Since we cannot observe the true error variances, let us work with their estimates based on the RSS. Thus, for eqs. (14) and (15) the estimated variances are
$$\hat{\sigma}_1^2 = \frac{RSS_1}{n_1 - k}, \qquad \hat{\sigma}_2^2 = \frac{RSS_2}{n_2 - k}$$
Given the assumptions, $\hat{\sigma}_1^2$ and $\hat{\sigma}_2^2$ are unbiased estimators of the true variances in the two subperiods. Thus, if $\sigma_1^2 = \sigma_2^2$, as assumed by the Chow test (this is your null hypothesis), then
$$\frac{\hat{\sigma}_1^2/\sigma_1^2}{\hat{\sigma}_2^2/\sigma_2^2} \sim F_{(n_1-k),\,(n_2-k)}$$
Structural or parameter stability of the models: Chow Test
If $\sigma_1^2 = \sigma_2^2$, then the F statistic reduces to
$$F = \frac{\hat{\sigma}_1^2}{\hat{\sigma}_2^2} \qquad (18)$$
Now, if this null hypothesis is not rejected, one can use the Chow test (see the sketch below).
If the null is rejected, one can use the modified Chow test.
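A sketch of this variance-equality check (the RSS values in the example call are hypothetical): compute σ̂1² and σ̂2² from the two subperiod regressions and compare their ratio with the F distribution.

```python
from scipy import stats

def variance_ratio_test(rss1, rss2, n1, n2, k):
    """F test of H0: sigma1^2 = sigma2^2 using subperiod residual sums of squares."""
    s1, s2 = rss1 / (n1 - k), rss2 / (n2 - k)        # sigma1_hat^2, sigma2_hat^2
    f = s1 / s2                                      # eq. (18) under the null
    # two-sided p-value for the variance ratio
    p = 2 * min(stats.f.sf(f, n1 - k, n2 - k), stats.f.cdf(f, n1 - k, n2 - k))
    return f, p

# hypothetical numbers, just to illustrate the call
print(variance_ratio_test(rss1=50.0, rss2=80.0, n1=12, n2=14, k=2))
```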
Specification error
The specification of the linear model centers on the disturbance vector u and the matrix X.
Specification bias arises from incorrect specification of the model, which could be due to an omitted variable or a wrong functional form.
Let us start this topic with the assumptions about u, which include
$$u_i \sim iid(0, \sigma^2) \qquad (19)$$
or
$$u_i \sim iid\; N(0, \sigma^2) \qquad (20)$$
Further, $E(X_{it} u_s) = 0$ for all i = 1, 2, ..., k and t, s = 1, 2, ..., n.
X is non-stochastic with full column rank k.
If assumption (19) holds but (20) does not, will this affect the BLUE property of the OLS estimates?