FINM3123 Introduction to Econometrics
Chapter 4
Multiple Regression Analysis: Inference
Multiple Regression Analysis: Inference
Statistical inference in the regression model
§ Hypothesis tests about population parameters
§ Construction of confidence intervals
Sampling distributions of the OLS estimators
§ The OLS estimators are random variables
§ We already know their expected values and their variances
§ However, for hypothesis tests we need to know their distribution
§ In order to derive their distribution we need additional assumptions
§ Assumption about distribution of errors: normal distribution
Multiple Regression Analysis: Inference
Assumption MLR.6 (Normality of error terms)
u_i ∼ N(0, σ²), independently of x_{i1}, x_{i2}, …, x_{ik}
It is assumed that the unobserved
factors are normally distributed around
the population regression function.
The form and the variance of the
distribution do not depend on any of
the explanatory variables.
It follows that:
y|x ∼ N(β₀ + β₁x₁ + ⋯ + β_k x_k, σ²)
Multiple Regression Analysis: Inference
Discussion of the normality assumption
§ The error term is the sum of “many” different unobserved factors
§ Sums of many independent, identically distributed factors tend towards a
normal distribution (Central Limit Theorem, provided the variance is finite)
§ The normality of the error term is an empirical question
§ At least the error distribution should be “close” to normal
§ In many cases, normality is questionable or impossible by definition
Multiple Regression Analysis: Inference
Discussion of the normality assumption (cont.)
§ Examples where normality cannot hold:
- Wages (nonnegative; also: minimum wage)
- Number of arrests (takes on a small number of integer values)
- Unemployment (indicator variable, takes on only 1 or 0)
§ In some cases, normality can be achieved through transformations of the
dependent variable (e.g. use log(𝑤𝑎𝑔𝑒) instead of 𝑤𝑎𝑔𝑒)
§ Under normality, OLS is the best unbiased estimator (among all estimators, whether linear or nonlinear)
§ Important: For the purposes of statistical inference, the assumption of normality
can be replaced by a large sample size
Multiple Regression Analysis: Inference
Terminology
“Gauss-Markov assumptions” = MLR.1 – MLR.5; “classical linear model (CLM) assumptions” = MLR.1 – MLR.6
Theorem 4.1 (Normal sampling distributions)
Under assumptions MLR.1 – MLR.6:
The estimators are normally distributed around the true parameters, with the
variance that was derived earlier: β̂_j ∼ N(β_j, Var(β̂_j)). Consequently, the
standardized estimators follow a standard normal distribution: (β̂_j − β_j) / sd(β̂_j) ∼ N(0, 1).
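A small Monte Carlo sketch can illustrate Theorem 4.1 (the simulation and all numbers in it are illustrative, not from the slides): repeatedly drawing normal errors and re-estimating the slope produces estimates that are normally distributed around the true parameter.

import numpy as np

rng = np.random.default_rng(42)
n, reps, beta0, beta1, sigma = 50, 5000, 1.0, 0.5, 2.0
x = rng.uniform(0, 10, n)                    # explanatory variable, held fixed
X = np.column_stack([np.ones(n), x])

slopes = np.empty(reps)
for r in range(reps):
    u = rng.normal(0, sigma, n)              # MLR.6: normally distributed errors
    y = beta0 + beta1 * x + u
    slopes[r] = np.linalg.lstsq(X, y, rcond=None)[0][1]

# The simulated mean is close to beta1, and the standardized estimates
# are approximately standard normal.
print(slopes.mean(), slopes.std())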
Multiple Regression Analysis: Inference
Testing hypotheses about a single population parameter
§ Theorem 4.2 (𝒕-distribution for standardized estimators)
Under assumptions MLR.1 – MLR.6:
If the standardization is done using the estimated
standard deviation (= the standard error se(β̂_j)), the
normal distribution is replaced by a t-distribution:
(β̂_j − β_j) / se(β̂_j) ∼ t_{n−k−1}
Note: The 𝑡-distribution is close to the standard normal distribution if 𝑛 − 𝑘 − 1 is large.
§ Null hypothesis (for more general hypotheses, see below): H₀: β_j = 0
The population parameter is equal to zero, i.e.
after controlling for the other independent
variables, there is no effect of x_j on y
Multiple Regression Analysis: Inference
§ 𝒕-statistic (or 𝒕-ratio): t_{β̂_j} = β̂_j / se(β̂_j)
The t-statistic will be used to test the above null hypothesis. The farther
the estimated coefficient is away from zero, the less likely it is that the
null hypothesis holds true. But what does “far” away from zero mean?
This depends on the variability of the estimated coefficient, i.e. its
standard deviation. The 𝑡-statistic measures how many estimated
standard deviations the estimated coefficient is away from zero.
§ Distribution of the 𝒕-statistic if the null hypothesis is true: t_{β̂_j} ∼ t_{n−k−1}
§ Goal: Define a rejection rule so that, if H₀ is true, it is rejected only with a small
probability (= significance level, e.g. 5%)
Multiple Regression Analysis: Inference
Testing against one-sided alternatives (greater than zero)
Test 𝑯𝟎 : 𝜷𝒋 = 𝟎 against 𝑯𝟏 : 𝜷𝒋 > 𝟎
Reject the null hypothesis in favour of the
alternative hypothesis if the estimated coefficient
is “too large” (i.e. larger than a critical value).
Construct the critical value so that, if the null
hypothesis is true, it is rejected in, for example,
5% of the cases.
In the given example, this is the point of the
𝑡-distribution with 28 degrees of freedom that is
exceeded in 5% of the cases.
Reject if 𝑡-statistic greater than 1.701
Multiple Regression Analysis: Inference
Example: Wage equation
Test whether, after controlling for education and tenure, higher work experience
leads to higher hourly wages
[Estimated equation with standard errors in parentheses]
Test 𝑯𝟎 ∶ 𝜷𝒆𝒙𝒑𝒆𝒓 = 𝟎 against 𝑯𝟏 ∶ 𝜷𝒆𝒙𝒑𝒆𝒓 > 𝟎
One would either expect a positive effect of experience on hourly wage or no effect at all.
Multiple Regression Analysis: Inference
Example: Wage equation (cont.)
𝑡-statistic: t_exper = β̂_exper / se(β̂_exper)
Degrees of freedom: n − k − 1; here the standard
normal approximation applies
Critical values for the 5% and the 1% significance
level (these are conventional significance levels).
The null hypothesis is rejected because the 𝑡-
statistic exceeds the critical value.
The effect of experience on hourly wage is statistically greater
than zero at the 5% (and even at the 1%) significance level.
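As a sketch, this test could be reproduced in Python with statsmodels, assuming the textbook's wage1 data are available via the wooldridge data package (the variable names educ, exper, tenure follow that dataset; the API call below is an assumption about that package):

import numpy as np
import statsmodels.formula.api as smf
from scipy import stats
import wooldridge   # assumed: wooldridge.data('wage1') returns a DataFrame

df = wooldridge.data('wage1')
res = smf.ols('np.log(wage) ~ educ + exper + tenure', data=df).fit()

t_exper = res.tvalues['exper']                   # t-statistic for beta_exper
p_one_sided = stats.t.sf(t_exper, res.df_resid)  # P(T > t) for H1: beta > 0
print(t_exper, p_one_sided)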
Multiple Regression Analysis: Inference
Testing against one-sided alternatives (less than zero)
Test 𝑯𝟎 : 𝜷𝒋 = 𝟎 against 𝑯𝟏 : 𝜷𝒋 < 𝟎
Reject the null hypothesis in favour of the
alternative hypothesis if the estimated coefficient
is “too small” (i.e. smaller than a critical value).
Construct the critical value so that, if the null
hypothesis is true, it is rejected in, for example,
5% of the cases.
In the given example, this is the point of the
𝑡-distribution with 18 degrees of freedom below
which 5% of the cases lie.
Reject if the 𝑡-statistic is less than −1.734
Multiple Regression Analysis: Inference
Example: Student performance and school size
§ Test whether smaller school size leads to better student performance
Variables: percentage of students passing maths test (dependent variable), average annual
teacher compensation, staff per one thousand students, school enrollment (= school size)
Test 𝑯𝟎 ∶ 𝜷𝒆𝒏𝒓𝒐𝒍𝒍 = 𝟎 against 𝑯𝟏 ∶ 𝜷𝒆𝒏𝒓𝒐𝒍𝒍 < 𝟎
Do larger schools hamper student performance or is there no such effect?
Multiple Regression Analysis: Inference
Example: Student performance and school size (cont.)
𝑡-statistic: t_enroll = β̂_enroll / se(β̂_enroll)
Degrees of freedom: n − k − 1; here the standard
normal approximation applies
Critical values for the 5% and the 15% significance level.
The null hypothesis is not rejected because the 𝑡-statistic
is not smaller than the critical value.
One cannot reject the hypothesis that there is no effect of school size
on student performance (not even for a lax significance level of 15%).
Multiple Regression Analysis: Inference
Example: Student performance and school size (cont.)
§ Alternative specification of functional form (school size enters as log(𝑒𝑛𝑟𝑜𝑙𝑙)):
R-squared slightly higher
Test 𝑯𝟎 ∶ 𝜷𝒍𝒐𝒈(𝒆𝒏𝒓𝒐𝒍𝒍) = 𝟎 against 𝑯𝟏 ∶ 𝜷𝒍𝒐𝒈(𝒆𝒏𝒓𝒐𝒍𝒍) < 𝟎
Multiple Regression Analysis: Inference
Example: Student performance and school size (cont.)
The 𝑡-statistic is smaller than the critical value for the 5% significance level: reject the null hypothesis
The hypothesis that there is no effect of school size on student performance
can be rejected in favor of the hypothesis that the effect is negative.
How large is the effect? A +10% increase in enrollment is associated with a drop of
0.129 percentage points in the share of students passing the test:
−1.29 = ∂𝑚𝑎𝑡ℎ10 / ∂log(𝑒𝑛𝑟𝑜𝑙𝑙) = ∂𝑚𝑎𝑡ℎ10 / (∂𝑒𝑛𝑟𝑜𝑙𝑙 / 𝑒𝑛𝑟𝑜𝑙𝑙)
so for ∂𝑒𝑛𝑟𝑜𝑙𝑙/𝑒𝑛𝑟𝑜𝑙𝑙 = 1/100 (i.e. +1%), ∂𝑚𝑎𝑡ℎ10 = −1.29/100 = −0.0129 (small effect)
Multiple Regression Analysis: Inference
Testing against two-sided alternatives
Test 𝑯𝟎 : 𝜷𝒋 = 𝟎 against 𝑯𝟏 : 𝜷𝒋 ≠ 𝟎
Reject the null hypothesis in favour of the
alternative hypothesis if the absolute value
of the estimated coefficient is too large.
Construct the critical value so that, if the
null hypothesis is true, it is rejected in, for
example, 5% of the cases.
In the given example, these are the points
of the 𝑡-distribution such that 5% of the cases
lie in the two tails (2.5% in each tail).
Reject if the absolute value of the 𝑡-statistic
is greater than 2.06
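The critical values quoted above can be reproduced from the t-distribution with scipy (a quick check, not part of the slides; the two-sided example is assumed to have 25 degrees of freedom, which matches the quoted 2.06):

from scipy import stats

print(stats.t.ppf(0.95, 28))    # one-sided 5% critical value, df = 28 -> ~1.701
print(stats.t.ppf(0.05, 18))    # one-sided 5% critical value, df = 18 -> ~-1.734
print(stats.t.ppf(0.975, 25))   # two-sided 5% critical value, df = 25 -> ~2.06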
Multiple Regression Analysis: Inference
Example: determinants of college GPA
𝑠𝑘𝑖𝑝𝑝𝑒𝑑 = lectures missed per week
For critical values, use standard normal distribution
The effects of ℎ𝑠𝐺𝑃𝐴 and 𝑠𝑘𝑖𝑝𝑝𝑒𝑑 are
significantly different from zero at the
1% significance level. The effect of 𝐴𝐶𝑇
is not significantly different from zero,
not even at the 10% significance level.
Multiple Regression Analysis: Inference
“Statistically significant” variables in a regression
§ If a regression coefficient is significantly different from zero in a two-sided test, the
corresponding variable is said to be “statistically significant”
§ If the number of degrees of freedom is large enough so that the normal approximation
applies, the following rules of thumb apply:
|t| > 1.645: “statistically significant at 10 % level”
|t| > 1.96: “statistically significant at 5 % level”
|t| > 2.576: “statistically significant at 1 % level”
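These thresholds are just the two-sided standard normal critical values, which can be verified directly (a quick check):

from scipy import stats

for level in (0.10, 0.05, 0.01):
    # two-sided critical value c with P(|Z| > c) = level
    print(level, stats.norm.ppf(1 - level / 2))   # 1.645, 1.960, 2.576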
Multiple Regression Analysis: Inference
Guidelines for discussing economic and statistical significance
§ If a variable is statistically significant, discuss the magnitude of the coefficient to get
an idea of its economic or practical importance
§ The fact that a coefficient is statistically significant does not necessarily mean it is
economically or practically significant!
§ If a variable is statistically and economically important but has the “wrong” sign, the
regression model might be misspecified
§ If a variable is statistically insignificant at the usual conventional levels (10%, 5%, 1%),
one may think about dropping it from the regression
§ If the sample size is small, effects might be imprecisely estimated so that the case for
dropping insignificant variables is less strong
Multiple Regression Analysis: Inference
Testing more general hypotheses about a regression coefficient
§ Null hypothesis: H₀: β_j = a_j, where a_j is the hypothesized value of the coefficient
§ 𝒕-statistic: t = (β̂_j − a_j) / se(β̂_j)
The test works exactly as before, except that the hypothesized value is
subtracted from the estimate when forming the statistic
Multiple Regression Analysis: Inference
Example: Campus crime and enrollment
§ An interesting hypothesis is whether crime increases by one percent if enrollment is
increased by one percent
The estimate is different from one but is
this difference statistically significant?
The null hypothesis is
rejected at the 5% level
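A sketch of this kind of test on simulated data (the numbers and names below are made up to mimic a log-log crime/enrollment regression; they are not the slide's estimates):

import numpy as np
import statsmodels.api as sm
from scipy import stats

rng = np.random.default_rng(0)
n = 97
log_enroll = rng.normal(9, 1, n)
log_crime = -6.6 + 1.27 * log_enroll + rng.normal(0, 1, n)

res = sm.OLS(log_crime, sm.add_constant(log_enroll)).fit()

a = 1.0                                        # hypothesized value under H0
t_stat = (res.params[1] - a) / res.bse[1]      # t = (bhat - a) / se(bhat)
p_two_sided = 2 * stats.t.sf(abs(t_stat), res.df_resid)
print(t_stat, p_two_sided)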
Multiple Regression Analysis: Inference
Computing 𝒑-values for 𝒕-tests
§ If the significance level is made smaller and smaller, there will be a point where
the null hypothesis cannot be rejected anymore
§ The reason is that, by lowering the significance level, one increasingly wants to
avoid the error of rejecting a correct 𝐻0
§ The smallest significance level at which the null hypothesis is still rejected is
called the 𝒑-value of the hypothesis test
§ A small 𝑝-value is evidence against the null hypothesis because that means one
would reject the null hypothesis even at small significance levels (above the 𝑝-value)
§ A large 𝑝-value is evidence in favor of the null hypothesis
§ 𝑃-values are more informative than tests at fixed significance levels
Multiple Regression Analysis: Inference
How the 𝒑-value is computed (here: two-sided test)
The 𝑝-value is the significance level at which one
is indifferent between rejecting and not rejecting
the null hypothesis.
In the two-sided case, the 𝑝-value is thus the
probability that the 𝑡-distributed variable takes
on a larger absolute value than the realized
value of the test statistic, e.g.:
P(|t-ratio| > 1.85) = 2(.0359) = .0718
From this, it is clear that a null hypothesis is
rejected if and only if the corresponding
𝒑-value is smaller than the significance level.
For example, for a significance level of 5% the
𝑡-statistic would not lie in the rejection region,
because the 𝑝-value (.0718) exceeds .05.
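The quoted p-value can be reproduced with scipy, assuming 40 degrees of freedom (an assumption that matches the numbers shown):

from scipy import stats

# p-value = P(|T| > 1.85) = 2 * P(T > 1.85) for T ~ t with 40 df (assumed)
print(2 * stats.t.sf(1.85, 40))   # ~0.0718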
Multiple Regression Analysis: Inference
Confidence intervals
§ Simple manipulation of the result in Theorem 4.2 implies the confidence interval
β̂_j ± c · se(β̂_j)
where c is the critical value of the two-sided test at the chosen confidence level;
β̂_j − c · se(β̂_j) and β̂_j + c · se(β̂_j) are the lower and upper bounds of the confidence interval
§ Interpretation of the confidence interval
• The bounds of the interval are random
• In repeated samples, the interval that is constructed in the above way will cover the
population regression coefficient in 95% of the cases
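A minimal sketch of the construction in code (the estimate, standard error, and degrees of freedom below are illustrative, not from the slides):

from scipy import stats

bhat, se, df_resid = 0.50, 0.20, 28            # illustrative inputs
c = stats.t.ppf(0.975, df_resid)               # two-sided 5% critical value
print((bhat - c * se, bhat + c * se))          # 95% confidence interval

In statsmodels, res.conf_int() returns the same intervals for all coefficients of a fitted model.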
Multiple Regression Analysis: Inference
Confidence intervals for typical confidence levels (using the large-sample rules of thumb
for the critical values):
99%: β̂_j ± 2.576 · se(β̂_j); 95%: β̂_j ± 1.96 · se(β̂_j); 90%: β̂_j ± 1.645 · se(β̂_j)
Relationship between confidence intervals and hypothesis tests:
Reject 𝐻₀: β_j = a_j in favor of 𝐻₁: β_j ≠ a_j if and only if a_j lies outside the confidence interval
Multiple Regression Analysis: Inference
Example: Model of firms’ R&D expenditures
Variables: spending on R&D (dependent), annual sales, profits as percentage of sales
.0217 ± 2.045(.0128)
The effect of sales on R&D is relatively precisely estimated, as its interval is narrow;
moreover, the effect is significantly different from zero, because zero is outside the
interval. The effect of the profit margin (the interval quoted above) is imprecisely
estimated, as the interval is very wide; it is not even statistically significant,
because zero lies in the interval.
Multiple Regression Analysis: Inference
Testing hypotheses about a linear combination of parameters
Example: Return to education at 2 year vs. at 4 year colleges
Regressors: years of education at 2-year colleges (coefficient 𝛽₁) and years of education at 4-year colleges (coefficient 𝛽₂)
Test 𝑯𝟎 ∶ 𝜷𝟏 − 𝜷𝟐 = 𝟎 against 𝑯𝟏 ∶ 𝜷𝟏 − 𝜷𝟐 < 𝟎
A possible test statistic would be:
t = (β̂₁ − β̂₂) / se(β̂₁ − β̂₂)
The difference between the estimates is normalized by the estimated
standard deviation of the difference. The null hypothesis would have
to be rejected if the statistic is “too negative” to believe that the true
difference between the parameters is equal to zero.
Multiple Regression Analysis: Inference
Impossible to compute with standard regression output, because
se(β̂₁ − β̂₂) = √( se(β̂₁)² + se(β̂₂)² − 2·s₁₂ )
where s₁₂ is an estimate of Cov(β̂₁, β̂₂), which is usually not available in regression output
Alternative method
Define 𝜽𝟏 = 𝜷𝟏 − 𝜷𝟐 and test 𝑯𝟎 ∶ 𝜽𝟏 = 𝟎 against 𝑯𝟏 ∶ 𝜽𝟏 < 𝟎
Insert 𝛽₁ = 𝜃₁ + 𝛽₂ into the original regression: this yields a model with a new regressor (= total years of college), in which 𝜃₁ becomes the coefficient on years of education at 2-year colleges and can be tested with an ordinary t-test
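A sketch of this reparameterization on simulated data (the names jc, univ, totcoll and all numbers are made up for illustration):

import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(1)
n = 500
jc = rng.poisson(1.0, n).astype(float)         # years at 2-year college
univ = rng.poisson(2.0, n).astype(float)       # years at 4-year college
lwage = 1.5 + 0.07 * jc + 0.10 * univ + rng.normal(0, 0.4, n)

totcoll = jc + univ                            # total years of college
X = sm.add_constant(np.column_stack([jc, totcoll]))
res = sm.OLS(lwage, X).fit()

# The coefficient on jc now estimates theta1 = beta1 - beta2 (~ -0.03 here),
# and its standard error can be read directly off the regression output.
print(res.params[1], res.bse[1], res.tvalues[1])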
Multiple Regression Analysis: Inference
Estimation results (with the new regressor total years of college):
the null hypothesis is rejected at the
10% level but not at the 5% level
This method always works for single linear hypotheses
Multiple Regression Analysis: Inference
Testing multiple linear restrictions: the F-test
Testing exclusion restrictions
Dependent variable: salary of a major league baseball player. Regressors: years in the
league, average number of games per year, batting average, home runs per year,
and runs batted in per year.
Test 𝑯𝟎: the coefficients on the three performance measures (batting average, home runs
per year, runs batted in per year) are all equal to zero, against 𝑯𝟏: 𝑯𝟎 is not true.
That is, test whether the performance measures have any effect/can be excluded from the regression.
Multiple Regression Analysis: Inference
Estimation of the unrestricted model
None of these variables is statistically significant when tested individually
Idea: How would the model fit be if these variables were dropped from the regression?
Multiple Regression Analysis: Inference
Estimation of the restricted model
The sum of squared residuals necessarily increases,
but is the increase statistically significant?
Test statistic (q = number of restrictions; subscript r: restricted model, ur: unrestricted model):
F = [ (SSR_r − SSR_ur) / q ] / [ SSR_ur / (n − k − 1) ] = [ (R²_ur − R²_r) / q ] / [ (1 − R²_ur) / (n − k − 1) ] ∼ F_{q, n−k−1}
The relative increase of the sum of squared residuals when going from 𝐻1 to 𝐻0
follows an 𝑭-distribution (if the null hypothesis 𝐻0 is true), a.k.a. the
Fisher–Snedecor distribution. The 𝑭-statistic (a.k.a. 𝑭-ratio) can equivalently be
computed in its R-squared form, shown on the right.
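In code, the SSR form of the statistic is a one-liner; a minimal helper sketch with its p-value (the numeric inputs in the example call are illustrative):

from scipy import stats

def f_test_ssr(ssr_r, ssr_ur, q, df_ur):
    # relative increase in SSR, scaled by the unrestricted degrees of freedom
    F = ((ssr_r - ssr_ur) / q) / (ssr_ur / df_ur)
    return F, stats.f.sf(F, q, df_ur)           # p-value = P(F_{q,df_ur} > F)

print(f_test_ssr(198.3, 183.2, 3, 347))         # illustrative inputs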
Multiple Regression Analysis: Inference
Rejection rule (Figure 4.7)
An 𝐹-distributed variable only takes on
positive values. This corresponds to the
fact that the sum of squared residuals can
only increase if one moves from 𝐻1 to 𝐻0
Choose the critical value so that the null
hypothesis is rejected in, for example, 5%
of the cases, although it is true.
Multiple Regression Analysis: Inference
Test decision in example
Number of restrictions to be tested: q = 3
Degrees of freedom in
the unrestricted model: n − k − 1
The null hypothesis is overwhelmingly rejected
(even at very small significance levels).
Discussion
§ The three variables are “jointly significant”
§ They were not significant when tested individually
§ The likely reason is multicollinearity between them
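This phenomenon is easy to reproduce by simulation (a sketch; all numbers are made up): two nearly collinear regressors, each individually insignificant, are nonetheless jointly significant.

import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(3)
n = 100
x1 = rng.normal(0, 1, n)
x2 = x1 + rng.normal(0, 0.05, n)               # nearly collinear with x1
y = 1.0 + 0.5 * x1 + 0.5 * x2 + rng.normal(0, 1, n)

res = sm.OLS(y, sm.add_constant(np.column_stack([x1, x2]))).fit()
print(res.tvalues[1:])                         # individual t-statistics: small
print(res.f_test('x1 = 0, x2 = 0'))            # joint F-test: clearly rejects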
Multiple Regression Analysis: Inference
Test of overall significance of a regression
The null hypothesis states that the explanatory variables
are not useful at all in explaining the dependent variable: H₀: β₁ = β₂ = ⋯ = β_k = 0
Restricted model (regression on a constant): y = β₀ + u
The test of overall significance is reported in most regression packages; the null hypothesis
is usually overwhelmingly rejected
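In statsmodels, this overall F-statistic is reported with every fit: for any fitted OLS results object res (e.g. the one from the collinearity sketch above), it is available directly.

# H0: all slope coefficients are zero (restricted model: regression on a constant)
print(res.fvalue, res.f_pvalue)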
Multiple Regression Analysis: Inference
Testing general linear restrictions with the F-test
Example: Test whether house price assessments are rational
Variables: actual house price (dependent), assessed housing value (before the house was sold),
size of lot (in feet), square footage, number of bedrooms
If house price assessments are rational, a 1% change in the
assessment should be associated with a 1% change in price.
In addition, other known factors should not influence the
price once the assessed value has been controlled for.
Multiple Regression Analysis: Inference
Unrestricted regression: log(price) regressed on the log of the assessed value and the other factors.
Restricted regression: under the null hypothesis, the restricted model is actually a
regression of [y − x1] on a constant.
Test statistic: the F-statistic is computed in SSR form (the R-squared form is not valid here,
because the two models have different dependent variables); the null hypothesis
cannot be rejected
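A sketch of this SSR-based test on simulated data in which H₀ holds (all names and numbers are made up; q = 4 restrictions: β₁ = 1 plus three zero restrictions):

import numpy as np
import statsmodels.api as sm
from scipy import stats

rng = np.random.default_rng(7)
n = 88
lassess = rng.normal(11.0, 0.3, n)
llotsize = rng.normal(9.0, 0.5, n)
lsqrft = rng.normal(7.5, 0.2, n)
bdrms = rng.integers(2, 6, n).astype(float)
lprice = lassess + rng.normal(0, 0.15, n)      # H0 is true in this simulation

# Unrestricted: lprice on lassess, llotsize, lsqrft, bdrms
X_ur = sm.add_constant(np.column_stack([lassess, llotsize, lsqrft, bdrms]))
ssr_ur = sm.OLS(lprice, X_ur).fit().ssr

# Restricted: (lprice - lassess) regressed on a constant only
ssr_r = sm.OLS(lprice - lassess, np.ones(n)).fit().ssr

q, df_ur = 4, n - 4 - 1
F = ((ssr_r - ssr_ur) / q) / (ssr_ur / df_ur)
print(F, stats.f.sf(F, q, df_ur))              # large p-value: cannot reject H0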
Multiple Regression Analysis: Inference
Regression output for the unrestricted regression
When tested individually, there is also
no evidence against the rationality of
house price assessments
The F-test works for general multiple linear hypotheses
For all tests and confidence intervals, validity of assumptions MLR.1 – MLR.6 has been
assumed. Tests may be invalid otherwise.
Summary
§ Assumption MLR.6 (normality of error terms)
§ Gauss-Markov assumptions (MLR.1-5) and Classical linear model (CLM)
assumptions (MLR.1-6)
§ Two theorems
• Normal sampling distributions
• 𝒕-distribution for standardized estimators
§ For all tests and confidence intervals, MLR.1-6 should be satisfied.
Summary
Three types of hypothesis tests
§ Tests about a single population parameter
• 𝑡 test
• Three approaches: critical value, 𝑝-value, confidence interval for two-sided tests
§ Tests about a linear combination of parameters
• 𝑡 test
• Alternative method: variable transformation
§ Tests about multiple linear restrictions
• 𝐹 test (SSR form, R-squared form)
• SSR form should be used if 𝑦 is different between unrestricted and restricted models.