Lecture Notes of Bus 41202 (Spring 2017)
Analysis of Financial Time Series
Ruey S. Tsay
Simple AR models: (Regression with lagged variables.)
Motivating example: The growth rate of U.S. quarterly real
GNP from 1947 to 1991. Recall that the model discussed before is
rt = 0.005 + 0.35rt−1 + 0.18rt−2 − 0.14rt−3 + at, σ̂a = 0.01.
This is called an AR(3) model because the growth rate rt depends
on the growth rates of the past three quarters. How do we specify
this model from the data? Is it adequate for the data? What are the
implications of the model? These are the questions we shall address
in this lecture.
Another example: U.S. monthly unemployment rate.
AR(1) model:
1. Form: rt = φ0 + φ1rt−1 + at, where φ0 and φ1 are real numbers,
which are referred to as “parameters” (to be estimated from the
data in an application). For example,
rt = 0.05 + 0.4rt−1 + at
2. Stationarity: necessary and sufficient condition |φ1| < 1. Why?
3. Mean: E(rt) = φ0/(1 − φ1)
[Figure 1: U.S. quarterly growth rate of real GNP: 1947-1991 (time plot of the series, 1947.II to 1991.I).]
[Figure 2: Various plots of U.S. quarterly growth rate of real GNP: 1947-1991 (time plot, lag-1 and lag-2 scatterplots, and sample ACF).]
[Figure 3: U.S. monthly unemployment rate (total civilian, 16 and older) from January 1948 to February 2017.]
4. Alternative representation: Let E(rt) = µ be the mean of rt so
that µ = φ0/(1 − φ1). Equivalently, φ0 = µ(1 − φ1). Plugging
in the model, we have
(rt − µ) = φ1(rt−1 − µ) + at. (1)
This model also has two parameters (µ and φ1). It explicitly uses
the mean of the series. It is less commonly used in the literature,
but is the model representation used in R.
5. Variance: Var(rt) = σa^2/(1 − φ1^2).
6. Autocorrelations: ρ1 = φ1, ρ2 = φ1^2, etc. In general, ρk = φ1^k,
so the ACF ρk decays exponentially as k increases.
7. Forecast (minimum squared error): Suppose the forecast origin
is n. For simplicity, we shall use the model representation in (1)
and write xt = rt − µ. The model then becomes xt = φ1xt−1 + at.
Note that the forecast of rt is simply the forecast of xt plus µ.
(a) 1-step ahead forecast at time n:
x̂n(1) = φ1xn
(b) 1-step ahead forecast error:
en(1) = xn+1 − x̂n(1) = an+1
Thus, an+1 is the un-predictable part of xn+1. It is the shock
at time n + 1!
(c) Variance of 1-step ahead forecast error:
Var[en(1)] = Var(an+1) = σa^2.
(d) 2-step ahead forecast:
x̂n(2) = φ1x̂n(1) = φ1^2 xn.
(e) 2-step ahead forecast error:
en(2) = xn+2 − x̂n(2) = an+2 + φ1an+1
(f) Variance of 2-step ahead forecast error:
Var[en(2)] = (1 + φ1^2)σa^2
which is greater than or equal to Var[en(1)], implying that
uncertainty in forecasts increases as the number of steps in-
creases.
(g) Behavior of multi-step ahead forecasts. In general, for the
ℓ-step ahead forecast at origin n, we have
x̂n(ℓ) = φ1^ℓ xn,
the forecast error
en(ℓ) = an+ℓ + φ1an+ℓ−1 + · · · + φ1^(ℓ−1)an+1,
and the variance of the forecast error
Var[en(ℓ)] = (1 + φ1^2 + · · · + φ1^(2(ℓ−1)))σa^2.
In particular, as ℓ → ∞,
x̂n(ℓ) → 0, i.e., r̂n(ℓ) → µ.
This is called the mean-reversion of the AR(1) process. The
variance of forecast error approaches
Var[en(ℓ)] → σa^2/(1 − φ1^2) = Var(rt).
In practice, it means that for the long-term forecasts serial
dependence is not important. The forecast is just the sample
mean and the uncertainty is simply the uncertainty about the
series. (A short R illustration follows item 8 below.)
8. A compact form: (1 − φ1B)rt = φ0 + at.
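For illustration, a minimal R sketch with a simulated AR(1) series (the parameter values are made up):
> set.seed(7)
> y=arima.sim(model=list(ar=0.5),n=300)+0.02  # AR(1) series with mean 0.02
> m=arima(y,order=c(1,0,0))                   # fit an AR(1) model
> p=predict(m,20)                             # 1-step to 20-step ahead forecasts
> p$pred  # forecasts converge to the fitted mean (the "intercept")
> p$se    # standard errors grow toward the sample standard deviation of y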
Half-life: A common way to quantify the speed of mean reversion
is the half-life, which is defined as the number of periods needed so
that the magnitude of the forecast becomes half of that at the forecast
origin. For an AR(1) model, this means
x̂n(k) = (1/2) xn.
Thus, φ1^k xn = (1/2) xn. Consequently, the half-life of the AR(1) model
is k = ln(0.5)/ln(|φ1|). For example, if φ1 = 0.5, then k = 1. If φ1 = 0.9, then
k ≈ 6.58.
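In R, this is a one-line computation (a small sketch):
> half.life=function(phi1) log(0.5)/log(abs(phi1))  # half-life of an AR(1)
> half.life(0.5)  # 1
> half.life(0.9)  # about 6.58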
AR(2) model:
1. Form: rt = φ0 + φ1rt−1 + φ2rt−2 + at, or
(1 − φ1B − φ2B^2)rt = φ0 + at.
2. Stationarity condition: all solutions of the characteristic equation (next item) must be greater than 1 in modulus, i.e., lie outside the unit circle (factor the polynomial).
3. Characteristic equation: 1 − φ1x − φ2x^2 = 0
4. Mean: E(rt) = φ0/(1 − φ1 − φ2)
5. Mean-adjusted format: Using φ0 = µ − φ1µ − φ2µ, we can write
the AR(2) model as
(rt − µ) = φ1(rt−1 − µ) + φ2(rt−2 − µ) + at.
This form is often used in the finance literature to highlight the
mean-reverting property of a stationary AR(2) model.
6. ACF: ρ0 = 1, ρ1 = φ1/(1 − φ2), and
ρℓ = φ1ρℓ−1 + φ2ρℓ−2, ℓ ≥ 2.
7. Stochastic business cycle: if φ1^2 + 4φ2 < 0, then rt shows char-
acteristics of business cycles with average length
k = 2π / cos^(−1)[φ1/(2√(−φ2))],
where the cosine inverse is stated in radians. If we denote the
solutions of the polynomial as a ± bi, where i = √(−1), then we
have φ1 = 2a and φ2 = −(a^2 + b^2), so that
k = 2π / cos^(−1)(a/√(a^2 + b^2)).
In R or S-Plus, one can obtain √(a^2 + b^2) using the command
Mod.
8. Forecasts: Similar to AR(1) models
Simulation in R: Use the command arima.sim
1. y1=arima.sim(model=list(ar=c(1.3,-.4)),1000)
2. y2=arima.sim(model=list(ar=c(.8,-.7)),1000)
Check the ACF and PACF of the above two simulated series.
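A minimal sketch of this check (the seed is arbitrary):
> set.seed(42)
> y1=arima.sim(model=list(ar=c(1.3,-.4)),1000)
> y2=arima.sim(model=list(ar=c(.8,-.7)),1000)
> par(mfcol=c(2,2))
> acf(y1); pacf(y1)  # PACF cuts off at lag 2 for an AR(2)
> acf(y2); pacf(y2)  # phi1^2+4*phi2 < 0 here, so the ACF shows damped cycles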
Discussion: (Reference only)
An AR(2) model can be written as an AR(1) model if one expands
the dimension. Specifically, we have
rt − µ = φ1(rt−1 − µ) + φ2(rt−2 − µ) + at
rt−1 − µ = rt−1 − µ, (an identity.)
Now, putting the two equations together, we have
    [ rt − µ   ]   [ φ1  φ2 ] [ rt−1 − µ ]   [ at ]
    [ rt−1 − µ ] = [ 1   0  ] [ rt−2 − µ ] + [ 0  ]
This is a 2-dimensional AR(1) model. Several properties of the AR(2)
model can be obtained from the expanded AR(1) model.
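A quick numerical check of the expanded form (the coefficients are illustrative; stationarity requires the eigenvalues of the coefficient matrix to be less than 1 in modulus):
> phi1=0.348; phi2=0.179                       # illustrative AR(2) coefficients
> Phi=matrix(c(phi1,phi2,1,0),2,2,byrow=TRUE)  # the coefficient matrix above
> Mod(eigen(Phi)$values)                       # both moduli < 1, hence stationary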
Building an AR model
• Order specification
1. Partial ACF: (naive, but effective)
– Use consecutive fittings
– See Text (p. 40) for details
– Key feature: PACF cuts off at lag p for an AR(p)
model.
– Illustration: See the PACF of the U.S. quarterly growth
rate of GNP.
2. Akaike information criterion:
AIC(ℓ) = ln(σ̃ℓ^2) + 2ℓ/T,
for an AR(ℓ) model, where σ̃ℓ^2 is the MLE of the residual vari-
ance.
Find the AR order with minimum AIC for ℓ ∈ {0, 1, · · · , P} (see
the sketch after this list).
3. BIC criterion:
BIC(ℓ) = ln(σ̃ℓ^2) + ℓ ln(T)/T.
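A minimal order-selection sketch (assuming x holds the series of interest, e.g., the GNP growth rates below; R's arima reports the likelihood-based AIC, whose ranking is what matters here):
> P=12  # maximum order entertained
> aic=sapply(0:P, function(p) arima(x,order=c(p,0,0),method="ML")$aic)
> (0:P)[which.min(aic)]  # AR order with the smallest AIC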
[Figure: Sample partial ACF of the U.S. quarterly GNP growth rate series (dgnp), lags 0-20.]
R command: ar(rt, method="mle", order.max=12)
• Need a constant term? Check the sample mean.
• Estimation: least squares method or maximum likelihood method
• Model checking:
1. Residual: observation minus the fit, i.e., the 1-step ahead forecast
error at each time point.
2. Residuals should be close to white noise if the model is ade-
quate. Use the Ljung-Box statistic of the residuals, but the degrees
of freedom become m − g, where g is the number of AR coefficients
used in the model (see the sketch below).
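A sketch of the adjusted test (assuming res holds the residuals of a fitted AR model with g coefficients; fitdf is the adjustment argument of Box.test in recent versions of R):
> m=10; g=3  # e.g., checking an AR(3) fit at 10 lags
> Box.test(res,lag=m,type="Ljung",fitdf=g)  # chi-square with m-g degrees of freedom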
Example: Analysis of U.S. GNP growth rate series.
R demonstration:
> setwd("C:/Users/rst/teaching/bs41202/sp2015")
> library(fBasics)
> da=read.table("dgnp82.dat")
> x=da[,1]
> par(mfcol=c(2,2)) % put 4 plots on a page
### See Figure 2 of the lecture note 2.
> plot(x,type=’l’) % first plot
> plot(x[1:175],x[2:176]) % 2nd plot
> plot(x[1:174],x[3:176]) % 3rd plot
> acf(x,lag=12) % 4th plot
> pacf(x,lag.max=12) % Compute PACF (not shown in this handout)
> Box.test(x,lag=10,type=’Ljung’) % Compute Q(10) statistics
Box-Ljung test
data: x
X-squared = 43.2345, df = 10, p-value = 4.515e-06
> m1=ar(x,method=’mle’) % Automatic AR fitting using AIC criterion.
> m1
Call: ar(x = x, method = "mle")
Coefficients:
1 2 3 % An AR(3) is specified.
0.3480 0.1793 -0.1423
Order selected 3 sigma^2 estimated as 9.427e-05
> names(m1)
[1] "order" "ar" "var.pred" "x.mean" "aic"
[6] "n.used" "order.max" "partialacf" "resid" "method"
[11] "series" "frequency" "call" "asy.var.coef"
> plot(m1$resid,type=’l’) % Plot residuals of the fitted model (not shown)
> Box.test(m1$resid,lag=10,type=’Ljung’) % Model checking
Box-Ljung test
data: m1$resid
X-squared = 7.0808, df = 10, p-value = 0.7178
> m2=arima(x,order=c(3,0,0)) % Another approach with order given.
> m2
Call: arima(x = x, order = c(3, 0, 0))
Coefficients:
ar1 ar2 ar3 intercept % Fitted model is
0.3480 0.1793 -0.1423 0.0077 % y(t)=0.348y(t-1)+0.179y(t-2)
s.e. 0.0745 0.0778 0.0745 0.0012 % -0.142y(t-3)+a(t),
% where y(t) = x(t)-0.0077
sigma^2 estimated as 9.427e-05: log likelihood = 565.84, aic = -1121.68
> names(m2)
[1] "coef" "sigma2" "var.coef" "mask" "loglik" "aic"
[7] "arma" "residuals" "call" "series" "code" "n.cond"
[13] "model"
> Box.test(m2$residuals,lag=10,type=’Ljung’)
Box-Ljung test
data: m2$residuals
X-squared = 7.0169, df = 10, p-value = 0.7239
> ts.plot(m2$residuals) % Residual plot
> tsdiag(m2) % obtain 3 plots of model checking (not shown in handout).
> p1=c(1,-m2$coef[1:3]) % Further analysis of the fitted model.
> roots=polyroot(p1)
> roots
[1] 1.590253+1.063882e+00i -1.920152-3.530887e-17i 1.590253-1.063882e+00i
> Mod(roots)
[1] 1.913308 1.920152 1.913308
> k=2*pi/acos(1.590253/1.913308)
> k
[1] 10.65638
> predict(m2,8) % Prediction 1-step to 8-step ahead.
$pred
Time Series:
Start = 177
End = 184
Frequency = 1
[1] 0.001236254 0.004555519 0.007454906 0.007958518
[5] 0.008181442 0.007936845 0.007820046 0.007703826
$se
Time Series:
Start = 177
End = 184
Frequency = 1
[1] 0.009709322 0.010280510 0.010686305 0.010688994
[5] 0.010689733 0.010694771 0.010695511 0.010696190
Another example: Monthly U.S. unemployment rate from Jan-
uary 1948 to February 2017. I use this example to emphasize two
messages: (1) modeling and prediction using AR models, including
model simplification, and (2) handling outliers.
Demonstration:
> require(quantmod)
> getSymbols("UNRATE",src="FRED")
> chartSeries(UNRATE)
> unrate <- as.numeric(UNRATE) ## create a regular vector, instead of an "xts" object
> tail(UNRATE)
UNRATE
2016-09-01 4.9
2016-10-01 4.8
2016-11-01 4.6
2016-12-01 4.7
2017-01-01 4.8
2017-02-01 4.7
> tdx <- c(1:830)/12+1948
> plot(tdx,unrate,type=’l’,xlab=’year’,ylab=’rate’)
> title(main="Monthly U.S. unemployment rate")
> ar(unrate,method="mle")
Call:ar(x = unrate, method = "mle")
Coefficients:
1 2 3 4 5 6 7 8
0.9946 0.2152 -0.0713 -0.0533 0.0494 -0.1275 -0.0610 0.0513
9 10 11
-0.0077 -0.1048 0.1003
Order selected 11 sigma^2 estimated as 0.03719
> m1 <- arima(unrate,order=c(11,0,0))
> m1
Call:arima(x = unrate, order = c(11, 0, 0))
Coefficients:
ar1 ar2 ar3 ar4 ar5 ar6 ar7 ar8
0.9945 0.2152 -0.0712 -0.0532 0.0493 -0.1275 -0.0610 0.0513
s.e. 0.0346 0.0488 0.0495 0.0495 0.0497 0.0495 0.0496 0.0496
ar9 ar10 ar11 intercept
-0.0077 -0.1047 0.1004 5.6715
s.e. 0.0496 0.0490 0.0348 0.4417
sigma^2 estimated as 0.03718: log likelihood = 186.03, aic = -346.07
> names(m1)
[1] "coef" "sigma2" "var.coef" "mask" "loglik" "aic"
[7] "arma" "residuals" "call" "series" "code" "n.cond"
[13] "nobs" "model"
> tsdiag(m1,gof=24)
> c1 <- c(NA,NA,NA,0,0,NA,0,0,0,NA,NA,NA)
> m2 <- arima(unrate,order=c(11,0,0),fixed=c1) ## refinement
> m2
Call:arima(x = unrate, order = c(11, 0, 0), fixed = c1)
Coefficients:
ar1 ar2 ar3 ar4 ar5 ar6 ar7 ar8 ar9 ar10
0.9967 0.2045 -0.0800 0 0 -0.1369 0 0 0 -0.0989
s.e. 0.0343 0.0481 0.0427 0 0 0.0291 0 0 0 0.0407
ar11 intercept
0.0998 5.6702
s.e. 0.0342 0.4416
sigma^2 estimated as 0.03733: log likelihood = 184.37, aic = -352.74
> tsdiag(m2)
> tsdiag(m2,gof=24)
> pm2 <- predict(m2,4)
> names(pm2)
[1] "pred" "se"
> low <- pm2$pred-1.96*pm2$se
> upp <- pm2$pred+1.96*pm2$se
> names(pm2)
[1] "pred" "se"
> pm2$pred
Time Series:
Start = 831
End = 834
Frequency = 1
[1] 4.737312 4.710012 4.745765 4.759146
> pm2$se
Time Series:
Start = 831
End = 834
Frequency = 1
[1] 0.1932128 0.2727943 0.3577585 0.4391267
> low
Time Series:
Start = 831
End = 834
Frequency = 1
[1] 4.358614 4.175335 4.044559 3.898457
> upp
Time Series:
Start = 831
End = 834
Frequency = 1
[1] 5.116009 5.244688 5.446972 5.619834
######################## Handling outliers
> which.min(m2$residuals) ### locate the minimum of residuals
[1] 23
> I23 <- rep(0,830)
> I23[23] <- 1
> c1 <- c(c1,NA)
> m3 <- arima(unrate,order=c(11,0,0),fixed=c1,xreg=I23)
> m3
Call: arima(x = unrate, order = c(11, 0, 0), xreg = I23, fixed = c1)
Coefficients:
ar1 ar2 ar3 ar4 ar5 ar6 ar7 ar8 ar9 ar10
1.0449 0.1219 -0.0472 0 0 -0.1345 0 0 0 -0.1021
s.e. 0.0349 0.0515 0.0434 0 0 0.0287 0 0 0 0.0413
ar11 intercept I23
0.1025 5.6709 -0.7749
s.e. 0.0345 0.4428 0.1338
sigma^2 estimated as 0.03591: log likelihood = 200.43, aic = -382.87
> tsdiag(m3,gof=24)
> which.max(m3$residuals) ### locate the maximum of the residuals
[1] 22
> I22 <- rep(0,830)
> I22[22] <- 1
> c1 <- c(c1,NA)
> X <- cbind(I23,I22)
> m4 <- arima(unrate,order=c(11,0,0),fixed=c1,xreg=X)
> m4
Call:arima(x = unrate, order = c(11, 0, 0), xreg = X, fixed = c1)
Coefficients:
ar1 ar2 ar3 ar4 ar5 ar6 ar7 ar8 ar9 ar10
1.0764 0.1170 -0.0955 0 0 -0.1069 0 0 0 -0.0951
s.e. 0.0346 0.0507 0.0431 0 0 0.0283 0 0 0 0.0416
ar11 intercept I23 I22
0.0901 5.6690 -0.2580 1.1729
s.e. 0.0348 0.4388 0.1367 0.1375
sigma^2 estimated as 0.03305: log likelihood = 234.86, aic = -449.73
> tsdiag(m4,gof=24)
[Figure 4: Model checking for the AR(11) model fitted to the UNRATE series: standardized residuals, ACF of residuals, and p-values of the Ljung-Box statistic.]
Moving-average (MA) model
Model with finite memory!
Some daily stock returns have minor serial correlations and can be
modeled as MA or AR models.
MA(1) model
• Form: rt = µ + at − θat−1
• Stationarity: always stationary.
• Mean (or expectation): E(rt) = µ
• Variance: Var(rt) = (1 + θ^2)σa^2.
• Autocovariance:
1. Lag 1: Cov(rt, rt−1) = −θσa^2
2. Lag ℓ: Cov(rt, rt−ℓ) = 0 for ℓ > 1.
Thus, rt is not related to rt−2, rt−3, · · ·.
• ACF: ρ1 = −θ/(1 + θ^2), ρℓ = 0 for ℓ > 1.
Finite memory! MA(1) models do not remember what happened
two time periods ago.
• Forecast (at origin t = n):
1. 1-step ahead: r̂n(1) = µ − θan. Why? Because at time n,
an is known, but an+1 is not.
2. 1-step ahead forecast error: en(1) = an+1 with variance σa^2.
3. Multi-step ahead: r̂n(ℓ) = µ for ℓ > 1.
Thus, for an MA(1) model, the multi-step ahead forecasts
are just the mean of the series. Why? Because the model
has memory of 1 time period.
4. Multi-step ahead forecast error:
en(ℓ) = an+ℓ − θan+ℓ−1
5. Variance of multi-step ahead forecast error:
(1 + θ^2)σa^2 = variance of rt.
• Invertibility:
– Concept: rt is a proper linear combination of at and the past
observations {rt−1, rt−2, · · ·}.
– Why is it important? It provides a simple way to obtain the
shock at.
For an invertible model, the dependence of rt on rt−ℓ con-
verges to zero as ℓ increases.
– Condition: |θ| < 1.
– Invertibility of MA models is the dual property of stationarity
for AR models.
MA(2) model
• Form: rt = µ + at − θ1at−1 − θ2at−2, or
rt = µ + (1 − θ1B − θ2B^2)at.
• Stationary with E(rt) = µ.
• Variance: Var(rt) = (1 + θ1^2 + θ2^2)σa^2.
• ACF: ρ2 ≠ 0, but ρℓ = 0 for ℓ > 2.
• Forecasts go to the mean after 2 periods.
Building an MA model
• Specification: Use sample ACF
Sample ACFs are all small after lag q for an MA(q) series. (See
test of ACF.)
• Constant term? Check the sample mean.
• Estimation: use maximum likelihood method
– Conditional: Assume at = 0 for t ≤ 0
– Exact: Treat at with t ≤ 0 as parameters, estimate them to
obtain the likelihood function.
The exact method is preferred, but it is more computationally intensive.
• Model checking: examine residuals (to be white noise)
• Forecast: use the residuals as {at} (which can be obtained from
the data and fitted parameters) to perform forecasts.
Model form in R: R parameterizes the MA(q) model as
rt = µ + at + θ1at−1 + · · · + θqat−q,
instead of with the usual minus signs on the θ's. Consequently, care needs to be
exercised in writing down a fitted MA parameter in R. For instance,
an estimate θ̂1 = −0.5 of an MA(1) in R indicates the model is
rt = at − 0.5at−1.
Example: Daily log return of the value-weighted index
R demonstration
> setwd("C:/Users/rst/teaching/bs41202/sp2017")
> library(fBasics)
> da=read.table("d-ibmvwew6202.txt")
> dim(da)
[1] 10194 4
> vw=log(1+da[,3])*100 % Compute percentage log returns of the vw index.
> acf(vw,lag.max=10) % ACF plot is not shown in this handout.
> m1=arima(vw,order=c(0,0,1)) % fits an MA(1) model
> m1
Call:
arima(x = vw, order = c(0, 0, 1))
Coefficients:
ma1 intercept
0.1465 0.0396 % The model is vw(t) = 0.0396+a(t)+0.1465a(t-1).
s.e. 0.0099 0.0100
sigma^2 estimated as 0.7785: log likelihood = -13188.48, aic = 26382.96
> tsdiag(m1)
> predict(m1,5)
$pred
Time Series:
Start = 10195
End = 10199
Frequency = 1
[1] 0.05036298 0.03960887 0.03960887 0.03960887 0.03960887
$se
Time Series:
Start = 10195
End = 10199
Frequency = 1
[1] 0.8823290 0.8917523 0.8917523 0.8917523 0.8917523
Mixed ARMA model: A compact form for flexible models.
Focus on the ARMA(1,1) model because of
1. its simplicity, and
2. its usefulness for understanding GARCH models in Ch. 3 for volatility
modeling.
ARMA(1,1) model
• Form: (1 − φ1B)rt = φ0 + (1 − θ1B)at, or
rt = φ1rt−1 + φ0 + at − θ1at−1.
A combination of an AR(1) on the LHS and an MA(1) on the
RHS.
• Stationarity: same as AR(1)
• Invertibility: same as MA(1)
• Mean: as AR(1), i.e., E(rt) = φ0/(1 − φ1)
• Variance: given in the text
• ACF: satisfies ρk = φ1ρk−1 for k > 1, but
ρ1 = φ1 − [θ1σa^2/Var(rt)] ≠ φ1.
This is the difference between AR(1) and ARMA(1,1) models.
• PACF: does not cut off at finite lags.
Building an ARMA(1,1) model
• Specification: use EACF or AIC
• Alternatively, use the command auto.arima of the package forecast (see the sketch after this list).
• Estimation: cond. or exact likelihood method
• Model checking: as before
• Forecast: MA(1) affects the 1-step ahead forecast. Others are
similar to those of AR(1) models.
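A minimal sketch (assuming the forecast package is installed and vw is the return series from the earlier MA example):
> library(forecast)
> m=auto.arima(vw,d=0,max.p=5,max.q=5)  # search over ARMA orders by AIC-type criteria
> m                                     # selected model and estimates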
Three model representations:
• ARMA form: compact, useful in estimation and forecasting
• AR representation: (by long division)
rt = φ0 + at + π1rt−1 + π2rt−2 + · · ·
It tells how rt depends on its past values.
• MA representation: (by long division)
rt = µ + at + ψ1at−1 + ψ2at−2 + · · ·
It tells how rt depends on the past shocks.
For a stationary series, ψi converges to zero as i → ∞. Thus, the
effect of any shock is transitory.
The MA representation is particularly useful in computing variances
of forecast errors.
For an ℓ-step ahead forecast, the forecast error is
en(ℓ) = an+ℓ + ψ1an+ℓ−1 + · · · + ψℓ−1an+1.
The variance of forecast error is
Var[en(ℓ)] = (1 + ψ1^2 + · · · + ψℓ−1^2)σa^2.
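In R, the ψ-weights can be obtained with the stats function ARMAtoMA (a sketch; the coefficients are illustrative, and R's plus-sign MA convention applies):
> psi=ARMAtoMA(ar=0.8,ma=-0.4,lag.max=10)  # psi-weights of an ARMA(1,1)
> psi                  # they decay to zero: shocks are transitory
> cumsum(c(1,psi^2))   # Var[en(l)]/sigma_a^2 for l = 1, ..., 11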
Unit-root Nonstationarity
Random walk
• Form: pt = pt−1 + at
• Unit root? It is an AR(1) model with coefficient φ1 = 1.
• Nonstationary: Why? Because the variance of pt diverges to
infinity as t increases.
• Strong memory: sample ACF approaches 1 for any finite lag.
• Repeated substitution shows
pt = Σ_{i=0}^∞ at−i = Σ_{i=0}^∞ ψi at−i,
where ψi = 1 for all i. Thus, ψi does not converge to zero. The
effect of any shock is permanent.
Random walk with drift
• Form: pt = µ + pt−1 + at, µ 6= 0.
• Has a unit root
• Nonstationary
• Strong memory
• Has a time trend with slope µ. Why?
Differencing
• 1st difference: rt = pt − pt−1
If pt is the log price, then the 1st difference is simply the log
return. Typically, 1st difference means the “change” or “incre-
ment” of the original series.
• Seasonal difference: yt = pt − pt−s, where s is the periodicity,
e.g. s = 4 for quarterly series and s = 12 for monthly series.
If pt denotes quarterly earnings, then yt is the change in earnings
from the same quarter one year before.
Meaning of the constant term in a model
• MA model: mean
• AR model: related to mean
• 1st differenced: time slope, etc.
Practical implication in financial time series
Example: Monthly log returns of General Electric (GE) from 1926
to 1999 (74 years)
Sample mean: 1.04% per month, with std(µ̂) = 0.26%.
Very significant!
This is about 12.45% a year.
$1 investment in the beginning of 1926 is worth
• annual compounded payment: $5907
• quarterly compounded payment: $8720
• monthly compounded payment: $9570
• Continuously compounded?
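A quick check of these numbers (plain arithmetic; the annual rate 12.45% and the 74-year horizon are from above, and the last line answers the bullet):
> r=0.1245; h=74
> (1+r)^h          # annual compounding: about 5907
> (1+r/4)^(4*h)    # quarterly compounding: about 8720
> (1+r/12)^(12*h)  # monthly compounding: about 9570
> exp(r*h)         # continuous compounding: about 10027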
Unit-root test
Let pt be the log price of an asset. To test that pt is not predictable
(i.e. has a unit root), two models are commonly employed:
pt = φ1pt−1 + et
pt = φ0 + φ1pt−1 + et.
The hypothesis of interest is Ho : φ1 = 1 vs Ha : φ1 < 1.
The Dickey-Fuller test is the usual t-ratio for testing that the OLS
estimate of φ1 equals 1. This is the DF unit-root test. The t-ratio,
however, has a non-standard limiting distribution.
Let ∆pt = pt − pt−1. Then, the augmented DF unit-root test for an
AR(p) model is based on
∆pt = ct + βpt−1 + Σ_{i=1}^{p−1} φi ∆pt−i + et.
The t-ratio of the OLS estimate of β is the ADF unit-root test statis-
tic. Again, the statistic has a non-standard limiting distribution.
Example: Consider the log series of the U.S. quarterly real GDP se-
ries from 1947.I to 2009.IV (data from the Federal Reserve Bank of St.
Louis). See q-gdpc96.txt on the course web.
R demonstration
> library(fUnitRoots)
> help(UnitrootTests) % See the tests available
> da=read.table("q-gdpc96.txt",header=T)
> gdp=log(da[,4])
> adfTest(gdp,lag=4,type=c("c")) #Assume an AR(4) model for the series.
Title: Augmented Dickey-Fuller Test
Test Results:
PARAMETER:
Lag Order: 4
STATISTIC:
Dickey-Fuller: -1.7433
P VALUE:
0.4076 # cannot reject the null hypothesis of a unit root.
*** A more careful analysis
> x=diff(gdp)
> ord=ar(x) # identify an AR model for the differenced series.
> ord
Call:ar(x = x)
Coefficients:
1 2 3
0.3429 0.1238 -0.1226
Order selected 3 sigma^2 estimated as 8.522e-05
> # An AR(3) for the differenced data is confirmed.
# Our previous analysis is justified.
Discussion: The command arima in R.
1. Dealing with the constant term: if there is any differencing, no
constant is used. The subcommand include.mean=T in the arima
command controls the constant term.
2. Fixing some parameters: use the subcommand fixed in arima.
See the unemployment rate series used in AR modeling.
Exponential Smoothing Approach
Suppose the available data are rt, rt−1, rt−2, · · · and we are interested
in predicting rt+1.
Intuition: The data rt should be more relevant than rt−1 in pre-
dicting rt+1. Similarly, rt−1 is more relevant than rt−2, etc.
A simple formulation: Suppose we assign weight w to rt, weight
wθ with 0 < θ < 1 to rt−1, weight wθ2 to the data rt−2, etc. That
is, we use an initial weight w and a discounting factor θ.
Simple fact: The weights must sum to 1. Why? So as not to change the
scale. Therefore,
1 = w + wθ + wθ^2 + wθ^3 + · · ·
  = w[1 + θ + θ^2 + θ^3 + · · ·]
  = w/(1 − θ).
Therefore, w = 1 − θ. Consequently, the 1-step ahead prediction of
rt+1 is
rt(1) = (1 − θ)rt + (1 − θ)θrt−1 + (1 − θ)θ^2rt−2 + (1 − θ)θ^3rt−3 + · · · . (2)
Let at+1 be the forecast error (or the innovation at time t + 1), then
we have
rt+1 = rt(1) + at+1. (3)
Now, suppose we have data rt−1, rt−2, · · · and are interested in fore-
casting rt at time t − 1. The same argument shows that
rt−1(1) = (1 − θ)rt−1 + (1 − θ)θrt−2 + (1 − θ)θ^2rt−3 + · · · . (4)
Next, using Equation (2), we have
rt(1) = (1 − θ)rt + θ[(1 − θ)rt−1 + (1 − θ)θrt−2 + (1 − θ)θ^2rt−3 + · · ·]
      = (1 − θ)rt + θrt−1(1). [see Equation (4)]
Putting the prior result into Equation (3), we obtain
rt+1 = (1 − θ)rt + θrt−1(1) + at+1
     = rt − θ[rt − rt−1(1)] + at+1
     = rt − θat + at+1.
In the above, we have used rt = rt−1(1) + at. Consequently, the
exponential smoothing model is
rt+1 − rt = at+1 − θat,
which is an ARIMA(0,1,1) model and can be written as
(1 − B)rt = (1 − θB)at.
This shows that the exponential smoothing method is simply using
an ARIMA(0,1,1) model with a positive θ, which is the discounting
factor.
Updating: For a given discounting rate θ, it is easy to update the
forecast via the exponential smoothing method, because
rt(1) = (1 − θ)rt + θrt−1(1),
which means the new prediction is simply a weighted average of the
new data rt and the previous forecast rt−1(1). The weights are simply
the initial weight and the discounting factor, respectively.
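A small numerical sketch of this updating (simulated data; θ = 0.6 is illustrative):
> set.seed(1)
> theta=0.6  # discounting factor
> r=cumsum(arima.sim(model=list(ma=-theta),n=500))  # an ARIMA(0,1,1) series
> f=rep(0,500); f[1]=r[1]  # initialize the 1-step forecasts r_t(1)
> for (t in 2:500) f[t]=(1-theta)*r[t]+theta*f[t-1]  # smoothing update
> m=arima(r,order=c(0,1,1))
> -m$coef["ma1"]  # recovers theta (about 0.6); recall R's plus-sign convention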