Ch-2 Lecture Note
Chapter Two
Basic Regression Analysis with Time Series Data
2.1 The Nature of Time Series Data
An obvious characteristic of time series data which distinguishes it from cross-
sectional data is that a time series data set comes with a temporal ordering. For instance,
the data for 1970 immediately precede the data for 1971. A sequence of random variables
indexed by time is called a stochastic process or a time series process. In other words,
stochastic or random process is a collection of random variables ordered in time.
Another difference between cross-sectional and time series data lies in the randomness of
the variables. Cross-sectional data should be viewed as random outcomes
since a different sample drawn from the population will generally yield different values of
the independent and dependent variables. Therefore, the OLS estimates computed from
different random samples will generally differ, and this is why we consider the OLS
estimators to be random variables.
Regarding time series data, when we collect a time series data set, we obtain one possible
outcome, or realization, of the stochastic process. This is because we cannot go back in time
and start the process over again. (This is analogous to cross-sectional analysis where we can
collect only one random sample.) However, if certain conditions in history had been
different, we would generally obtain a different realization for the stochastic process.
Consider, for instance, a nation with a GDP of $2,872.8 billion for 1970–I. In theory, the GDP
figure for the first quarter of 1970 could have been any number, depending on the economic
and political climate then prevailing. The figure of 2,872.8 is therefore a particular outcome
of the stochastic process that prevailed at that time. Thus, time series data are considered
the outcome of random variables. The distinction between the stochastic process and its
realization is akin to the distinction between population and sample in cross-sectional
data. Just as we use sample data to draw inferences about a population, in time series we
use the realization to draw inferences about the underlying stochastic process.
A. Static Models
Suppose that we have time series data available on two variables, say y and z, where yt and
zt are dated contemporaneously, that is, data occurring in the same period of time. A static
model relating y to z is
$y_t = \beta_0 + \beta_1 z_t + u_t, \quad t = 1, 2, \ldots, n$   (1.1)
The name “static model” comes from the fact that we are modeling a contemporaneous
relationship between y and z. Usually, a static model is postulated when a change in z at
time t is believed to have an immediate effect on y. Static regression models are also used
when we are interested in knowing the tradeoff between y and z. An example of a static
model is the static Phillips curve, given by
$inf_t = \beta_0 + \beta_1 unem_t + u_t$   (1.2)
This equation is used to study the contemporaneous tradeoff between the annual inflation
rate, inft and the unemployment rate, unemt.
Naturally, we can have several explanatory variables in a static regression model. Let
mrdrtet denote the murders per 10,000 people in a particular city during year t, let convrtet
denote the murder conviction rate, let unemt be the local unemployment rate, and let
yngmlet be the fraction of the population consisting of males between the ages of 18 and 25.
Then, a static multiple regression model explaining murder rates is
$mrdrte_t = \beta_0 + \beta_1 convrte_t + \beta_2 unem_t + \beta_3 yngmle_t + u_t$   (1.3)
Using a model such as this, we can hope to estimate, for example, the ceteris paribus effect
of an increase in the conviction rate on criminal activity.
B. Finite Distributed Lag Models
In a finite distributed lag (FDL) model, we allow one or more variables to affect y with a
lag. For example, for annual observations, consider the model
$gfr_t = \alpha_0 + \delta_0 pe_t + \delta_1 pe_{t-1} + \delta_2 pe_{t-2} + u_t$   (1.4)
Where gfrt is the general fertility rate (children born per 1,000 women of childbearing age)
and pet is the real dollar value of the personal tax exemption. The idea is to see whether, in
the aggregate, the decision to have children is linked to the tax value of having a child.
Equation (1.4) recognizes that, for both biological and behavioral reasons, decisions to have
children would not immediately result from changes in the personal exemption. Equation
(1.4) is an example of the model
$y_t = \alpha_0 + \delta_0 z_t + \delta_1 z_{t-1} + \delta_2 z_{t-2} + u_t$   (1.5)
Special terminology and notation are used to indicate future and past values of z. The value
of z in the previous period is called its first lagged value or, more simply, its first lag, and
is denoted $z_{t-1}$. Its jth lagged value (or simply its jth lag) is its value j periods ago, which
is $z_{t-j}$. In this regard, equation (1.5) is a finite distributed lag model of order two: it
contains two lags of z. The change between two consecutive periods, $\Delta z_t = z_t - z_{t-1}$,
is termed the first difference of z.¹ Note that $z_{t+1}$ denotes the value of z one period into
the future.² The parameter $\delta_0$ is the immediate change in y due to a one-unit increase in
z at time t, and it is usually called the impact propensity or impact multiplier. It
summarizes the dynamic effect that a temporary increase in z has on y; note that there are
no further changes in y after two periods. The sum of the coefficients on current and lagged
z, $\delta_0 + \delta_1 + \delta_2$, is the long-run change in y given a permanent increase in z and
is called the long-run propensity (LRP) or long-run multiplier:
$LRP = \delta_0 + \delta_1 + \delta_2$   (1.6)
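As a numerical illustration (the coefficients here are hypothetical, not estimates from any
data set), suppose $\delta_0 = 0.5$, $\delta_1 = 0.3$, and $\delta_2 = 0.1$ in (1.5). A temporary
one-unit increase in z at time t raises y by 0.5 in period t, by 0.3 in period t+1, by 0.1 in
period t+2, and not at all thereafter. Under a permanent one-unit increase, y is higher by
0.5 after one period, by 0.5 + 0.3 = 0.8 after two periods, and by $LRP = 0.5 + 0.3 + 0.1 = 0.9$
from the third period onward.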
Assumption 1 (Linear in parameters):- The stochastic process $\{(x_{t1}, \ldots, x_{tk}, y_t): t = 1, 2, \ldots, n\}$
follows the linear model $y_t = \beta_0 + \beta_1 x_{t1} + \cdots + \beta_k x_{tk} + u_t$, where
$\{u_t\}$ is the sequence of errors and n is the number of observations (time periods).
Assumption 2 (Zero conditional mean):- For each t, the expected value of the error $u_t$,
given the explanatory variables for all time periods, is zero. Mathematically,
1
Economic time series are often analyzed after computing their logarithms or the changes in their logarithms. The first
difference of the logarithm, multiplied by 100, that is, $100 \cdot \Delta \ln(z_t)$, measures the (approximate percentage)
growth rate. This indicates that the original time series data would be transformed into log form to compute the growth rate
of a variable of interest.
2
The interval between observations, that is, the period of time between observation t and observation t + 1, is some unit of
time such as weeks, months, quarters (three-month units), or years. For example, inflation data are usually studied on a
quarterly basis, so the unit of time (a 'period') is a quarter of a year.
$E(u_t \mid X) = 0, \quad t = 1, 2, \ldots, n$   (1.7)
Assumption 3 (No perfect collinearity):- In the sample (and therefore in the underlying
time series process), no independent variable is constant or a perfect linear combination of
the others.
Assumption 4 (Homoskedasticity):- Conditional on X, the variance of ut is the same for all
t:
$Var(u_t \mid X) = Var(u_t) = \sigma^2, \quad t = 1, 2, \ldots, n$   (1.8)
Assumption 5 (No serial correlation):- Conditional on X, the errors in two different time
periods are uncorrelated:
$Corr(u_t, u_s \mid X) = 0, \quad \text{for all } t \neq s$   (1.9)
Note that in order to use the usual OLS standard errors, t statistics, and F statistics for
hypothesis testing or inference, we need to add a final assumption (Assumption 6) that is
analogous to the normality assumption we used for cross sectional analysis.
Assumption 6 (Normality):- The errors $u_t$ are independent of X and are independently and
identically distributed as Normal$(0, \sigma^2)$.
Theorems
Theorem 1 Unbiasedness of OLS
Under Assumptions 1, 2, and 3 the OLS estimators are unbiased conditional on X, and
therefore unconditionally as well:
$E(\hat{\beta}_j) = \beta_j, \quad j = 0, 1, \ldots, k$   (1.10)
Theorem 2 Gauss–Markov theorem
Under Assumptions 1 through 5, the OLS estimators are the best linear unbiased estimators
conditional on X.
Theorem 3 OLS sampling variance
Under the time series Gauss–Markov assumptions 1 through 5, the variance of $\hat{\beta}_j$,
conditional on X, is
$Var(\hat{\beta}_j \mid X) = \dfrac{\sigma^2}{SST_j (1 - R_j^2)}, \quad j = 1, \ldots, k$   (1.11)
where $SST_j$ is the total sum of squares of $x_{tj}$ and $R_j^2$ is the R-squared from the
regression of $x_j$ on the other independent variables. The estimator $\hat{\sigma}^2$ is the
unbiased estimator of $\sigma^2$, computed as $\hat{\sigma}^2 = SSR/df$, where $df = n - k - 1$.
Theorem 4 Normal sampling distributions
Under Assumptions 1 through 6, the CLM assumptions for time series, the OLS estimators
are normally distributed, conditional on X.
As an example, consider a model relating the employment rate in Puerto Rico to the minimum
wage:
$\log(prepop_t) = \beta_0 + \beta_1 \log(mincov_t) + \beta_2 \log(usgnp_t) + u_t$   (1.12)
Where prepopt is the employment rate in Puerto Rico during year t (ratio of those working
to total population), usgnpt is real U.S. gross national product (in billions of dollars), and
mincov measures the importance of the minimum wage relative to average wages. In
particular, mincov = (avgmin/avgwage)·avgcov, where avgmin is the average minimum
wage, avgwage is the average overall wage, and avgcov is the average coverage rate (the
proportion of workers actually covered by the minimum wage law).
The estimated equation (standard errors in parentheses) is
$\widehat{\log(prepop_t)} = -1.05 - 0.154 \log(mincov_t) - 0.012 \log(usgnp_t)$
                 (0.77)   (0.065)              (0.089)
$n = 38, \quad R^2 = 0.661$   (1.13)
The estimated elasticity of prepop with respect to mincov is -0.154, and it is statistically
significant with t = -2.37. Therefore, a higher minimum wage lowers the employment
rate, something that classical economics predicts. The GNP variable is not statistically
significant, but this changes when we account for a time trend in the next section.
We can use logarithmic functional forms in distributed lag models, too. For example,
for quarterly data, suppose that money demand (Mt) and gross domestic product (GDPt)
are related by
$\log(M_t) = \alpha_0 + \delta_0 \log(GDP_t) + \delta_1 \log(GDP_{t-1}) + \delta_2 \log(GDP_{t-2}) + \delta_3 \log(GDP_{t-3}) + \delta_4 \log(GDP_{t-4}) + u_t$   (1.14)
Dummy Variables
We do not always have purely numerical variables in a regression. When we incorporate
qualitative variables, such as nominal or ordinal variables (e.g., gender, race, religion,
region), into the regression analysis, we use a dummy variable to quantify the qualitative
variable and obtain a meaningful result.
A dummy variable is, therefore, an artificial variable constructed such that it takes the value
unity whenever the qualitative phenomenon it represents occurs and zero otherwise. For
example, if we take gender as a dummy variable, we may assign 1 for male and 0 for
female as a base or benchmark.
Dummy independent variables are also quite useful in time series applications. Since the
unit of observation is time, a dummy variable represents whether, in each time period, a
certain event has occurred. For example, for annual data, we can indicate in each year
whether a Democrat or a Republican is president of the United States by defining a variable
democt , which is unity if the president is a Democrat, and zero otherwise. Or, in looking at
the effects of capital punishment on murder rates in Texas, we can define a dummy variable
for each year equal to one if Texas had capital punishment during that year, and zero
otherwise.
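As a minimal sketch of constructing such a dummy in practice (the years listed below are
only illustrative):

    import pandas as pd

    # Illustrative annual data set.
    df = pd.DataFrame({"year": range(1975, 1983)})

    # Suppose dem_years lists the years with a Democratic president.
    dem_years = {1977, 1978, 1979, 1980}
    df["democ"] = df["year"].isin(dem_years).astype(int)  # 1 if Democrat, else 0
    print(df)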
Often dummy variables are used to isolate certain periods that may be systematically
different from other periods covered by a data set.
Consider the model
$gfr_t = \beta_0 + \beta_1 pe_t + \beta_2 ww2_t + \beta_3 pill_t + u_t$   (1.15)
where gfr is explained in terms of the average real dollar value of the personal tax exemption (pe)
and two binary variables. The variable ww2 takes on the value unity during the years
1941 through 1945, when the United States was involved in World War II. The variable
pill is unity from 1963 on, when the birth control pill was made available for
contraception.
The estimated equation (standard errors in parentheses) is
$\widehat{gfr_t} = 98.68 + 0.083\,pe_t - 24.24\,ww2_t - 31.59\,pill_t$
          (3.21)  (0.030)      (7.46)        (4.08)
$n = 72, \quad R^2 = 0.473$   (1.16)
The regression result shows that the fertility rate was lower during World War II: given pe,
there were about 24 fewer births for every 1,000 women of childbearing age. That is, the
historical event of WWII reduced births by about 24, from roughly 98 to 74 per 1,000
women of childbearing age, compared with periods free from this event.
Similarly, the fertility rate has been substantially lower since the introduction of the birth
control pill, and the coefficient is interpreted in the same way as that of the dummy variable
ww2. The coefficient on pe implies that a 12-dollar increase in pe increases gfr by about one
birth per 1,000 women of childbearing age. The intercept represents the value for the base
categories of the dummy variables.
Index numbers
An index number is a summary measure that aggregates a vast amount of information into a
single quantity. Index numbers are used regularly in time series analysis, especially in
macroeconomic applications. An example of an index number is the index of industrial
production (IIP), computed monthly by the Board of Governors of the Federal Reserve (in
the case of United States). The IIP is a measure of production across a broad range of
industries, and, as such, its magnitude in a particular year has no quantitative meaning.
In order to interpret the magnitude of the IIP, we must know the base period and the
base value. For instance, in the data set of industrial production from 1987 to 1999 the
base year is 1987 (the base period is just a convention) and the base value is 100. If the IIP
was 107.7 in 1992, we can say that industrial production was 7.7% higher in 1992 than in
1987. Similarly, if IIP = 61.4 in 1970 and IIP = 85.7 in 1979, industrial production
grew by about 39.6% during the 1970s. Indexes are typically defined with a base value of
1 or 100 so that values are easy to interpret relative to the base period.
It is easy to change the base period for any index number, and sometimes we must do
this to give index numbers reported with different base years a common base year. For
example, if we want to change the base year of the IIP from 1987 to 1982, we simply
divide the IIP for each year by the 1982 value and then multiply by 100 to make the
base period value 100. Generally, the formula is
$newindex_t = 100 \left( \dfrac{oldindex_t}{oldindex_{newbase}} \right)$   (1.17)
where $oldindex_{newbase}$ is the original value of the index in the new base year. For example,
with base year 1987, the IIP in 1992 is 107.7; if we change the base year to 1982, the
IIP in 1992 becomes 100(107.7/81.9) = 131.5, given that 81.9 is the IIP in 1982.
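A minimal sketch of this rebasing in code (the function is ours; the 1982, 1987, and 1992
values echo the example above, other years are omitted):

    import pandas as pd

    # IIP values from the example in the text.
    iip = pd.Series({1982: 81.9, 1987: 100.0, 1992: 107.7})

    def rebase(index: pd.Series, new_base_year: int) -> pd.Series:
        """Re-express an index so that its value in new_base_year is 100."""
        return 100 * index / index[new_base_year]

    print(rebase(iip, 1982)[1992])  # 100 * (107.7 / 81.9) ≈ 131.5, as in the text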
Another important example of an index number is a price index, such as the consumer price
index (CPI). The CPI is used to compute the annual inflation rate and to convert nominal
economic variables into real terms. The annual inflation rate is computed as the percentage
change of the CPI across years (or months/quarters, if we are using monthly/quarterly data).
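For example, with annual data, the inflation rate in year t is
$inflation_t = 100 \cdot \dfrac{CPI_t - CPI_{t-1}}{CPI_{t-1}}$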
Price indexes are also used for turning a time series measured in nominal dollars (or current
dollars) into real dollars (or constant dollars). This is because most economic behavior is
assumed to be influenced by real, not nominal, variables. The real values of the economic
variables such as GDP and wage are obtained by dividing the nominal value by the CPI.
Note that we must be a little careful to first divide the CPI by 100 so that its value in the
base year is one (i.e., p = CPI/100). In other words, the real value is obtained by dividing
the nominal value by p.
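A minimal sketch of this deflation step (the wage and CPI numbers are made up for
illustration):

    import pandas as pd

    # Hypothetical nominal hourly wages and CPI (base year 1990, CPI = 100).
    nominal_wage = pd.Series({1990: 10.0, 2000: 14.0})
    cpi = pd.Series({1990: 100.0, 2000: 140.0})

    p = cpi / 100                 # scale the CPI so its base-year value is one
    real_wage = nominal_wage / p  # real (constant-dollar) wage
    print(real_wage)              # both years equal 10.0: no real wage growth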
2.5.1 Trends
Many economic time series have a common tendency to grow over time. A popular way to
capture trending behavior is the linear time trend:
$y_t = \alpha_0 + \alpha_1 t + e_t, \quad t = 1, 2, \ldots$   (1.18)
Interpreting $\alpha_1$ in (1.18) is simple: holding all other factors (those in $e_t$) fixed,
$\alpha_1$ measures the change in $y_t$ from one period to the next due to the passage of time:
when $\Delta e_t = 0$,
$\Delta y_t = y_t - y_{t-1} = \alpha_1$   (1.19)
A more realistic characterization of trending time series allows {et} to be correlated over
time, but this does not change the flavor of a linear time trend. In fact, what is important for
regression analysis under the classical linear model assumptions is that E(yt) is linear in t.
$E(y_t) = \alpha_0 + \alpha_1 t$   (1.20)
Many other economic time series are better approximated by an exponential trend, which
follows when a series has the same average growth rate from period to period. For example,
in the early years of an imports series, the change in imports over each year is relatively
small, whereas the change increases as time passes. This is consistent with a constant
average growth rate: the percentage change is roughly the same from year to year. An
exponential trend is captured by the model
$\log(y_t) = \beta_0 + \beta_1 t + e_t, \quad t = 1, 2, \ldots$   (1.21)
Here, $\beta_1$ is interpreted as the average per-period growth rate in $y_t$. For example, if t
denotes year and $\beta_1 = .027$, then $y_t$ grows about 2.7% per year on average. Thus,
$\beta_1$ represents the proportionate change in $y_t$, $\log(y_t) - \log(y_{t-1})$, which is
also called the growth rate in y from period t-1 to period t:
$\Delta \log(y_t) = \log(y_t) - \log(y_{t-1}) = \beta_1, \quad \text{for all } t$   (1.22)
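A minimal numerical sketch of (1.22) (the series is made up to grow at roughly 2.7% per
period):

    import numpy as np

    y = np.array([100.0, 102.7, 105.5, 108.3])  # hypothetical trending series

    growth = 100 * np.diff(np.log(y))  # 100 * (log(y_t) - log(y_{t-1}))
    print(growth.round(2))             # each entry is close to 2.7 (percent)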
For concreteness, consider a model where two observed factors, xt1 and xt2, affect yt. In
addition, there are unobserved factors that are systematically growing or shrinking over
time. A model that captures this is
$y_t = \beta_0 + \beta_1 x_{t1} + \beta_2 x_{t2} + \beta_3 t + u_t$   (1.23)
Although Assumptions 1, 2, and 3 may be satisfied, omitting t from the regression and
regressing $y_t$ on $x_{t1}$ and $x_{t2}$ will generally yield biased estimators of $\beta_1$
and $\beta_2$: we have effectively omitted an important variable, t, which induces an apparent
relationship between y and the explanatory variables through the time trend. This is
especially true if $x_{t1}$ and $x_{t2}$ are themselves trending, because they can then be
highly correlated with t.
Consider, for example, a regression of the log of per capita housing investment (invpc) on
the log of a housing price index (price). The estimated equation (standard errors in
parentheses) is
$\widehat{\log(invpc_t)} = -0.550 + 1.241 \log(price_t)$
                (0.043)   (0.382)
$n = 42, \quad R^2 = 0.208$   (1.24)
The elasticity of per capita investment with respect to price is very large and statistically
significant; it is not statistically different from one. We must be careful here. Both invpc
and price have upward trends. In particular, if we regress log(invpc) on t, we obtain a
coefficient on the trend equal to .0081 (standard error = .0018); the regression of log( price)
on t yields a trend coefficient equal to .0044 (standard error = .0004). While the standard
errors on the trend coefficients are not necessarily reliable—these regressions tend to
contain substantial serial correlation—the coefficient estimates do reveal upward trends.
To account for the trending behavior of the variables, we add a time trend:
$\widehat{\log(invpc_t)} = -0.913 - 0.381 \log(price_t) + 0.0098\,t$
                (0.136)   (0.679)              (0.0035)
$n = 42, \quad R^2 = 0.341$   (1.25)
The story is much different now: the estimated price elasticity is negative and not
statistically different from zero. The time trend is statistically significant, and its
coefficient implies an approximate 1% increase in invpc per year, on average. From
this analysis, we cannot conclude that real per capita housing investment is influenced
at all by price. There are other factors, captured in the time trend, that affect invpc, but
we have not modeled these. The results in (1.24) show a spurious relationship between
invpc and price due to the fact that price is also trending upward over time.
Consider again a model with a linear time trend:
$y_t = \beta_0 + \beta_1 x_{t1} + \beta_2 x_{t2} + \beta_3 t + u_t$   (1.26)
The OLS estimators $\hat{\beta}_1$ and $\hat{\beta}_2$ in (1.26) can be obtained by detrending
the dependent and independent variables, that is, by a regression without a time trend, as
follows:
(i) Regress each of $y_t$, $x_{t1}$, and $x_{t2}$ on a constant and the time trend t, and save
the residuals, say $\ddot{y}_t$, $\ddot{x}_{t1}$, $\ddot{x}_{t2}$, so that both the dependent
and independent variables are detrended.
(ii) Regress $\ddot{y}_t$ on $\ddot{x}_{t1}$ and $\ddot{x}_{t2}$. This regression yields
exactly $\hat{\beta}_1$ and $\hat{\beta}_2$.
This means that the estimates of primary interest, $\hat{\beta}_1$ and $\hat{\beta}_2$, can be interpreted as coming
from a regression without a time trend, but where we first detrend the dependent variable
and all other independent variables. The same conclusion holds with any number of
independent variables and if the trend is quadratic or of some other polynomial degree.
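A minimal sketch of this two-step equivalence (the Frisch–Waugh result) on simulated data,
using Python with statsmodels; all numbers below are made up for illustration:

    import numpy as np
    import statsmodels.api as sm

    rng = np.random.default_rng(0)
    n = 100
    t = np.arange(1, n + 1, dtype=float)

    # Simulated trending regressors and outcome; all coefficients are made up.
    x1 = 0.05 * t + rng.normal(size=n)
    x2 = 0.03 * t + rng.normal(size=n)
    y = 1.0 + 0.5 * x1 - 0.2 * x2 + 0.02 * t + rng.normal(size=n)

    # Direct regression of y on x1, x2 and the time trend, as in (1.26).
    direct = sm.OLS(y, sm.add_constant(np.column_stack([x1, x2, t]))).fit()

    # Two-step route: detrend y, x1, x2 by regressing each on (1, t).
    T = sm.add_constant(t)
    detrend = lambda v: sm.OLS(v, T).fit().resid
    yd, x1d, x2d = detrend(y), detrend(x1), detrend(x2)
    twostep = sm.OLS(yd, np.column_stack([x1d, x2d])).fit()

    # The slope estimates on x1 and x2 coincide across the two routes.
    print(direct.params[1:3])
    print(twostep.params)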
R-squared measures computed from trending series can overstate how well the explanatory
variables explain y. Recall that the usual adjusted R-squared can be written as
$\bar{R}^2 = 1 - \dfrac{\hat{\sigma}_u^2}{\hat{\sigma}_y^2}$   (1.27)
where $\hat{\sigma}_u^2$ is the unbiased estimator of the error variance, $\hat{\sigma}_y^2 = SST/(n-1)$,
and $SST = \sum_{t=1}^{n} (y_t - \bar{y})^2$.
When the dependent variable satisfies a linear, quadratic, or any other polynomial trend, it is
easy to compute a goodness-of-fit measure that first nets out the effect of the time trend on
$y_t$ through detrending. That is, first regress $y_t$ on t and obtain the residuals $\ddot{y}_t$;
then regress $y_t$ on $x_{t1}$, $x_{t2}$, and t, and compute the R-squared as
$R^2 = 1 - \dfrac{SSR}{\sum_{t=1}^{n} \ddot{y}_t^2}$   (1.28)
where SSR is the sum of squared residuals from the regression that includes the trend.
The R-squared in (1.28) better reflects how well xt1 and xt2 explain yt, because it nets out the
effect of the time trend. An adjusted R-squared can also be computed based on (1.28):
divide SSR by its degrees of freedom (n - k), where k is the number of parameters in the
usual regression, counting the intercept and any time-trend terms (in this case, k = 4), and
divide $\sum_{t=1}^{n} \ddot{y}_t^2$ by (n - p), where p is the number of trend parameters
estimated in detrending $y_t$ (p = 2 for a linear trend with an intercept). A short
computation of (1.28) follows, continuing the sketch above.
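This minimal continuation reuses `direct` and `yd` from the earlier sketch:

    # R-squared that nets out the trend, as in (1.28): SSR comes from the
    # regression of y on x1, x2 and t; the denominator uses detrended y.
    ssr = np.sum(direct.resid ** 2)
    r2_detrended = 1 - ssr / np.sum(yd ** 2)
    print(r2_detrended)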
2.5.2 Seasonality
If a time series is observed at monthly or quarterly intervals (or even weekly or daily),
it may exhibit seasonality. For example, retail sales in the fourth quarter are typically
higher than in the previous three quarters because of the Christmas holidays in the fourth
quarter. This can be captured by allowing average retail sales
to differ over the course of a year. This is in addition to possibly allowing for a trending
mean. For example, retail sales in the most recent first quarter were higher than retail sales
in the fourth quarter from 30 years ago, because retail sales have been steadily growing.
Nevertheless, if we compare average sales within a typical year, the seasonal holiday factor
tends to make sales larger in the fourth quarter.
Even though many monthly and quarterly data series display seasonal patterns, not all of
them do. For example, there is no noticeable seasonal pattern in monthly interest or
inflation rates. In addition, series that do display seasonal patterns are often seasonally
adjusted before they are reported for public use. A seasonally adjusted series is one that, in
principle, has had the seasonal factors removed from it. Sometimes we may be faced with
seasonally unadjusted data; annually reported series such as GDP are a typical example,
since a series observed once a year cannot be adjusted seasonally. Fortunately, simple
methods are available for
dealing with seasonality in regression models for unadjusted data. Generally, we can
include a set of seasonal dummy variables to account for seasonality in the dependent
variable, the independent variables, or both.
The approach is simple. Suppose that we have monthly data, and we think that seasonal
patterns within a year are roughly constant across time. For example, since Christmas
always comes at the same time of year, we can expect retail sales to be, on average, higher
in months late in the year than in earlier months. Or, since weather patterns are broadly
similar across years, housing starts in the Midwest will be higher on average during the
summer months than the winter months. A general model for monthly data that captures
these phenomena is
$y_t = \beta_0 + \delta_1 feb_t + \delta_2 mar_t + \cdots + \delta_{11} dec_t + \beta_1 x_{t1} + \cdots + \beta_k x_{tk} + u_t$   (1.29)
where febt, mart,…, dect are dummy variables indicating whether time period t corresponds
to the appropriate month. In this formulation, January is the base month, and $\beta_0$ is the
intercept for January. If there is no seasonality in $y_t$, once the $x_{tj}$ have been
controlled for, then $\delta_1$ through $\delta_{11}$ are all zero. This is easily tested via
an F test, as in the sketch below.
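A minimal sketch on simulated monthly data, using Python with statsmodels (the
data-generating numbers are made up; `C(month)` creates the eleven dummies with January
as the base):

    import numpy as np
    import pandas as pd
    import statsmodels.formula.api as smf

    rng = np.random.default_rng(1)
    n = 120  # ten years of simulated monthly observations

    df = pd.DataFrame({
        "month": pd.date_range("2000-01-01", periods=n, freq="MS").month,
        "x": rng.normal(size=n),
    })
    # Built-in December effect of +1.5, mimicking a holiday-sales bump.
    df["y"] = 2.0 + 0.8 * df["x"] + 1.5 * (df["month"] == 12) + rng.normal(size=n)

    res = smf.ols("y ~ x + C(month)", data=df).fit()

    # Joint F test that all eleven seasonal coefficients are zero.
    constraints = ", ".join(f"C(month)[T.{m}] = 0" for m in range(2, 13))
    print(res.f_test(constraints))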
Stationarity
Stationarity is important in time series analysis because it simplifies the behavior of the
data, making a series easier to model and forecast than a non-stationary one. The stability
of a stationary series makes predictions easier and more reliable. In contrast, non-stationary
data can lead to unreliable model output and inaccurate predictions, because the models do
not anticipate the changing behavior.
Some time series are non-stationary because their statistical properties change through time.
This implies that the data do not have stable or predictable behavior and that past
observations may not be representative of future ones. A non-stationary series, then, is one
that exhibits trend, seasonality, cyclicity, or irregular components. Non-stationary data can
be converted to stationary data through detrending and differencing. In finance, for
example, many processes are non-stationary.
The correlation between ice cream sales and shark attacks is a classic example of
spuriousness. The two series look causally related, both in their statistical measures and in
graphs, but the relationship is not real: ice cream sales and shark attacks correlate
positively at a beach because both rise in the summer, when more people buy ice cream and
more people swim, not because one causes the other.
If a time series has a unit root, it is non-stationary, but the first difference of the series
(i.e., the series of changes from one period to the next) is stationary. Therefore, the
standard solution to the unit root problem is to take the first difference of the series.
A unit root test is an econometric procedure that helps researchers identify whether a time
series is stationary or non-stationary. Standard methods include the Dickey–Fuller (DF),
Augmented Dickey–Fuller (ADF), and Phillips–Perron (PP) tests; among these, the ADF test is
well known and valid in large samples. A minimal sketch of running an ADF test follows.
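The sketch below uses Python with statsmodels' adfuller on simulated series; in practice,
one would pass the actual data:

    import numpy as np
    from statsmodels.tsa.stattools import adfuller

    rng = np.random.default_rng(2)

    # A random walk has a unit root; white noise is stationary.
    walk = np.cumsum(rng.normal(size=500))
    noise = rng.normal(size=500)

    for name, series in [("random walk", walk), ("white noise", noise)]:
        stat, pvalue, *_ = adfuller(series)
        print(f"{name}: ADF statistic = {stat:.2f}, p-value = {pvalue:.3f}")

    # A large p-value means the unit root cannot be rejected; first-differencing
    # the random walk, np.diff(walk), yields a stationary series.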