RATS 10 Reference Manual
Estima
1560 Sherman Ave., Suite 1029
Evanston, IL 60201
The point of the Reference Manual is to describe the syntax, algorithms and output.
Most instructions will include at least one or two short examples of their use. We
have included few full running examples here, preferring to leave those to the User’s
Guide. As might be expected from the name, we expect that this manual will be used
mainly for quick checks, while the User’s Guide will be used to learn techniques. Note
that almost all the information in the Reference Manual is now included in the on-line
Help.
A list of the changes and new features in Version 10 is included at the beginning of
the User’s Guide.
The User’s Guide also includes a combined Index covering the Introduction, User’s
Guide, and Reference Manual.
ACCUMULATE
Wizard
You can use the Data/Graphics—Transformations wizard to set up this instruction.
Choose the “Partial Sums–Accumulation” operation in the dialog box.
Parameters
series Series to sum.
start end Range of entries over which to compute partial sums. If you
haven’t set a SMPL, this defaults to the defined range of series.
newseries Series for the partial sums. By default, newseries=series.
newstart New starting entry for the resulting series. By default,
newstart=start.
Option
standardize/[nostandardize]
When you use the option STANDARDIZE, ACCUMULATE divides the entries of
newseries by the final sum. newseries thus ends with the value 1.0. (This op-
tion was called SCALE in some older versions.)
Missing Values
ACCUMULATE excludes missing values in series from the calculation of the partial
sums and sets the corresponding entries in newseries to missing. For example:
set trend 1 10 = t
set trend 5 6 = %na
accumulate trend / trendsum
print
produces output in which TRENDSUM takes the values 1, 3, 6, 10, NA, NA, 17, 25, 34,
44: the sum skips the missing entries 5 and 6 and resumes at entry 7.
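The missing-value rule can be illustrated outside rats. Here is a minimal Python sketch (the function name is hypothetical; it mirrors what ACCUMULATE does with NA entries, with None standing in for %NA):

```python
# Minimal sketch of ACCUMULATE's missing-value rule:
# NA entries are skipped in the running sum, and the
# corresponding output entries are set to NA (None here).
def accumulate(series):
    total, out = 0.0, []
    for x in series:
        if x is None:          # missing value
            out.append(None)   # output entry is also missing
        else:
            total += x         # an NA does not reset the running sum
            out.append(total)
    return out

trend = [1, 2, 3, 4, None, None, 7, 8, 9, 10]
print(accumulate(trend))
# -> [1.0, 3.0, 6.0, 10.0, None, None, 17.0, 25.0, 34.0, 44.0]
```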
Examples
acc sales / totalsales
set pchange = sales/totalsales{1}
linreg pchange
# constant totalsales{1}
takes the series SALES, computes the total sales through each entry, then computes a
linearized logistic trend regression.
impulse(model=varmodel,result=impulses,noprint,steps=nsteps)
dec rect[ser] accumimp(%nvar,%nvar)
do i=1,%nvar
do j=1,%nvar
accumulate impulses(i,j) 1 nsteps accumimp(i,j)
end do j
end do i
This stores accumulated impulse responses into an array of series called ACCUMIMP.
Notes
ACCUMULATE is sometimes used when all that is needed is the final value, not the
whole series of sums. For that, it’s better to use SSTATS:
sstats(mean) 1963:1 1963:12 deuip>>avg1963
computes (into AVG1963) the average value of DEUIP over 1963:1 to 1963:12.
ALLOCATE
The ALLOCATE instruction is now optional. If you prefer, you can omit the ALLOCATE
and set the workspace length using the start and end parameters on your first
DATA or SET instruction.
If you use ALLOCATE, you will usually use the first of the two forms shown below.
The second form is generally seen only in programs written for much older versions.
allocate length
allocate series length
Wizard
The Data wizards, located on the Data/Graphics menu, automatically determine the
appropriate range when the data are read from a single data file.
Parameters
length This sets the standard workspace length. This is not a binding
constraint—you can define series (such as out-of-sample fore-
casts) which exceed it. We would recommend that you set it to
the length of the data which you read with DATA since length
sets the default value for the end parameter on DATA.
The length parameter can be:
• a simple entry number.
• a date value, if you have used CALENDAR to define a dating
scheme. Note that you must use a “:” when specifying a
date, even for annual data (for instance, 1991:1)
• individual//observation for panel data.
observation can be an entry number or a date.
series If using the second form, you specify a non-zero value for this
parameter, which tells rats to create a set of data series
numbered 1 to series.
Examples
allocate 1200 1200 observation data set
calendar(d) 1986:1:8 Daily, Jan. 8, 1986 through Dec. 31, 2017
allocate 2017:12:31
cal(m) 2008:1 Monthly, Jan. 2008 through Dec. 2009
all 24
cal(m,panelobs=12) 1998:1 Monthly panel data, 12 months per individual
all 50//12 (or 50//1998:12) and 50 individuals
Pre-Allocating Series
If you use the second form, rats will create series numbers 1 to series. Any series
created later by other instructions will be numbered series+1 and above. These
numbered series can be helpful when working with large, well-structured data sets.
However, many of these cases are more easily handled using VECTORS of SERIES (see
page UG–483 in the User’s Guide).
You can manipulate %X and its elements just like any other RECTANGULAR array.
See Also . . .
CALENDAR Sets the date of the first entry in the data set. In time series
or panel data sets this comes before ALLOCATE.
SCRATCH Creates new numbered data series.
UG Section 15.3 Numbered data series.
AR1
Wizard
AR1 is in the Statistics—Linear Regressions wizard. Select either “AR(1)–Regression”
or “AR(1)–Instrumental Variables” from the “Method” list.
Parameters
depvar Dependent variable for the regression.
start end Range to use in estimation. If you haven’t set a SMPL, this
defaults to the largest range for which all variables involved are
defined. If you choose a method which does not retain the initial
observation (Cochrane–Orcutt or Hildreth–Lu), rats will actu-
ally run the regressions beginning at start+1, using entry
start to provide the lag for start+1.
residuals (Optional) Residuals are automatically saved in %RESIDS. You
can use this parameter to save them in a different series.
Options
method=corc/[hilu]/maxl/search/pw
This chooses the method for estimation. See “Technical Details” for details:
CORC is (iterated) Cochrane–Orcutt, or, for instrumental variables, Fair’s
(1970) procedure.
HILU (the default) is Hildreth–Lu, a grid search procedure.
MAXL is Beach and MacKinnon’s (1978) maximum likelihood procedure.
SEARCH is a maximum likelihood grid search procedure.
PW is Prais-Winsten, which is similar to SEARCH, but doesn’t use the log
variance terms from the likelihood.
AR1 may not be able to honor your choice. The methods which retain the initial
observation (MAXL, SEARCH and PW) cannot be used for instrumental variables
and the iterated methods (MAXL and CORC) cannot be used if there are missing
observations. AR1 will pick the closest permitted choice when it must switch.
[print]/noprint
vcv/[novcv]
smpl=SMPL Series or formula (“SMPL Option” on page RM–546)
dfc=Degrees of Freedom Correction (Additional Topics pdf, Section 1.4)
unravel/[nounravel] (Section 2.10)
equation=equation to estimate
entries=number of supplementary card entries to process [all]
title=”title to identify estimation method”
These are the same as for LINREG.
instruments/[noinstruments]
wmatrix=weighting matrix
Use the INSTRUMENTS option to do two-stage least squares. You must set your
instruments list first using the instruction INSTRUMENTS. WMATRIX is described
in Section 7.8 of the User’s Guide.
define=equation to define
frml=formula to define
These define an equation and formula, respectively, for forecasting purposes. The
equation (or formula) created incorporates the serial correlation within it, so that
an estimated model of
Technical Details
For the following model with first order serially correlated errors:

(1)  y_t = X_t β + u_t ,  u_t = ρ u_{t−1} + ε_t

the log likelihood is

(2)  −(T/2) log(2π) − (T/2) log(σ²) + (1/2) log(1 − ρ²) − (1/(2σ²))(1 − ρ²)(y_1 − X_1 β)²
        − (1/(2σ²)) Σ_{t=2}^{T} ( y_t − ρ y_{t−1} − (X_t − ρ X_{t−1}) β )²
The goal in every method is to reach a point where the estimate of ρ changes by less
than the convergence criterion (set by the CVCRIT option). This usually takes more
trials with the search procedures. However, the search procedures guarantee that
you have found the global optimum.

(4)  ρ̂ = ( Σ u_t u_{t−1} ) / ( Σ u_{t−1}² )
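The CORC loop can be sketched in a few lines of Python. This is illustrative only (a single regressor plus a constant, closed-form OLS, and synthetic data), not the rats implementation:

```python
# Illustrative Cochrane-Orcutt iteration for y = a + b*x + u,
# with u_t = rho*u_{t-1} + e_t.  Alternates between OLS on
# quasi-differenced data and the rho estimate of equation (4).
def ols2(y, x):
    # closed-form OLS of y on a constant and a single regressor x
    n = len(y)
    mx, my = sum(x) / n, sum(y) / n
    b = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / \
        sum((xi - mx) ** 2 for xi in x)
    return my - b * mx, b

def corc(y, x, tol=1e-5, maxit=100):
    rho = 0.0
    for _ in range(maxit):
        # quasi-difference, losing the first observation
        ys = [y[t] - rho * y[t - 1] for t in range(1, len(y))]
        xs = [x[t] - rho * x[t - 1] for t in range(1, len(x))]
        a, b = ols2(ys, xs)
        a /= (1 - rho)                     # undo the transform on the intercept
        u = [yi - a - b * xi for yi, xi in zip(y, x)]
        new = sum(u[t] * u[t - 1] for t in range(1, len(u))) / \
              sum(u[t - 1] ** 2 for t in range(1, len(u)))   # equation (4)
        if abs(new - rho) < tol:           # CVCRIT-style convergence check
            return a, b, new
        rho = new
    return a, b, rho

# Synthetic example: y = 1 + 2x + u, with u_t = .5 u_{t-1} (no innovations)
x = [float(t) for t in range(100)]
y = [1 + 2 * x[t] + 0.5 ** t for t in range(100)]
print(corc(y, x))   # a, b, rho near (1, 2, 0.5)
```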
Missing Values
If there are any missing values within the data range, the simple iterative process
described above for the Cochrane–Orcutt (CORC) and Beach–MacKinnon (MAXL) esti-
mators can’t be used, since there will be terms missing in (4). If you have requested
one of these, the most similar search procedure will be used instead.
If there is a gap of s periods in the data before period t, the likelihood function will
include the extra term
(5)  (1/2) log[ (1 − ρ²) / (1 − ρ^{2s}) ]

which arises from the conditional distribution

(7)  u_t | u_{t−s} ~ N( ρ^s u_{t−s} , σ² (1 + ρ² + ρ⁴ + … + ρ^{2(s−1)}) )
Examples
The first estimates an investment equation using instrumental variables, producing
the results shown on the next page.
ar1(inst,frml=investeq) invest
# constant ydiff gnp{1} rate{4}
This estimates first by Hildreth-Lu (the default method) and then by maximum like-
lihood:
ar1 loggpop
# constant logpg logypop logpnc logpuc
ar1(method=maxl) loggpop
# constant logpg logypop logpnc logpuc
Hypothesis Tests
You can use any of the hypothesis testing instructions after AR1, but you can’t test
RHO using EXCLUDE or SUMMARIZE.
Sample Output
Regression with AR1 - Estimation by Hildreth-Lu Search
Dependent Variable RATE
Monthly Data From 1959:04 To 1996:02
Usable Observations 443
Degrees of Freedom 438
Centered R^2 0.9679799
R-Bar^2 0.9676875
Uncentered R^2 0.9945027
Mean of Dependent Variable 6.0806546275
Std Error of Dependent Variable 2.7714419161
Standard Error of Estimate 0.4981857000
Sum of Squared Residuals 108.70677837
Regression F(4,438) 3310.2261
Significance Level of F 0.0000000
Log Likelihood -317.4010
Durbin-Watson Statistic 1.6508
Q(36-1) 144.4029
Significance Level of Q 0.0000000
The R2, other summary statistics and the residuals are based upon the complete
model—so they use the u’s, not the ε’s. The estimate of %RHO is separated from the
other regressors in the regression output. AR1 computes the standard errors and
covariance matrix from a linearization of the objective function.
If you use the HETEROGENOUS option with panel data, AR1 omits the output for %RHO,
and computes the standard errors from the second stage regression on the quasi-
differenced data.
For higher-order serially correlated errors, use BOXJENK with the GLS option; for example,
boxjenk(gls,ar=2) y
# constant x1 x2
estimates the regression of Y on a constant, X1 and X2 with AR(2) errors.
ASSOCIATE
Description
When you estimate an equation within rats, the estimating instruction automatical-
ly sets the coefficients and residual variance. So, you usually only need ASSOCIATE if
you wish to alter what is stored, or to assign coefficients to equations (such as iden-
tity equations) that have not been estimated within your rats program. ASSOCIATE
provides three ways to set the coefficients of an equation:
1. It can read the coefficients from supplementary cards (the default method).
2. It can copy them from a VECTOR by using the coeffVECTOR parameter.
3. It can read them from a file (using the options BINARY or FREE).
This is rarely used in any modern programs, but you may come across it in code
written for RATS version 4 or earlier. If you need to set or reset the coefficients of
a Vector Autoregression (such as when doing Monte Carlo integration and similar
types of simulations), use the functions %MODELGETCOEFFS and %MODELSETCOEFFS.
Similarly, the functions %EQNSETCOEFFS and %EQNSETVARIANCE provide alternative
methods of setting the coefficient values and variance of an individual equation.
Parameters
equation ASSOCIATE sets the coefficients of this equation.
coeffVECTOR (Optional) This is a VECTOR which holds the coefficients.
Options
variance=residual variance
Residual variance for this equation. You only need to supply this if you are going
to use SIMULATE, IMPULSE or ERRORS. It is usually set when the equation is
estimated.
residuals=series of residuals
Series holding the residuals for this equation. You only need this information if
you want to use FORECAST, STEPS, SIMULATE or THEIL with an ARMA equation,
or the HISTORY instruction with any equation. The residuals are usually saved
when the equation is estimated.
[coeffs]/nocoeffs
Use NOCOEFFS when you want to use RESIDUALS, VARIANCE or FRML, but you do
not want to change the coefficients of equation. Omit the supplementary card if
you use NOCOEFFS.
perm/[noperm]
You can only use the PERM option if you are also using the coeffVECTOR param-
eter. PERM makes the association with the specified VECTOR permanent—if you
change the values of the VECTOR, you change the coefficients.
free/[nofree]
binary/[nobinary]
unit=[data]/input/other unit
These options let you read the coefficients from a file rather than from a supple-
mentary card. With FREE, the coefficients are read free-format (tab, blank, or
comma-delimited text file) and with BINARY, they are read as binary data. In
either case, omit the supplementary card.
Examples
There are several ways to input the relationship
See Also . . .
EQUATION Defines equations
FRML Defines formulas (FRMLS)
BOOT
Parameters
BOOTseries The SERIES of INTEGERS created by BOOT.
start end The range of entries in BOOTseries set by BOOT. By default,
the standard workspace.
lower upper The lower and upper bounds (inclusive on both ends) on the
value range of the random integers. These default to start and
end. If you specify these as a range of entries, BOOT fills the
series with random entry numbers from that range.
Option
[replace]/noreplace
Determines whether or not the sampling will be done with replacement. With
NOREPLACE, once a value is drawn, it won’t be drawn again for a different ele-
ment of BOOTseries. With REPLACE, numbers may be drawn more than once.
Drawing with replacement is the normal procedure in bootstrapping operations,
as the sample is treated as if it were the population from which the data are
drawn. Drawing without replacement is typically part of approximate randomiza-
tion analyses, and is usually done to shuffle the entire entry range.
panel/[nopanel]
Use PANEL if you want to randomize the individuals in a (balanced) panel data
set. Within each individual, the original time order is maintained. Use the PANEL
option on any subsequent SET instructions you use to get the shuffled data.
Examples
calendar(q) 1980:1
allocate 2017:4
boot entries / 1980:1 2003:4
The dates of lower and upper correspond to the 1st and the 96th entry numbers, so
BOOT fills the ENTRIES series with random integers ranging from 1 to 96.
boot(noreplace) entry 1 50
set shuffle 1 50 = ressqr(entry(t))
creates SHUFFLE as a random reordering of the fifty elements of the series RESSQR.
Notes
BOOT draws with replacement by scaling and translating uniform random numbers.
A similar calculation can be done with FIX(%UNIFORM(lower,upper+1)). The +1
is needed because the upper bound is included as a possible value.
Draws without replacement are done by randomly shuffling the numbers between
LOWER and UPPER.
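Both drawing schemes are easy to mimic outside rats. A Python sketch (illustrative only; rats uses its own random number generator):

```python
import random

# With replacement: scale-and-translate uniform draws, as BOOT does.
# int() plays the role of the rats FIX() function; the upper+1 is
# needed because the upper bound is a possible value.
def boot_replace(n, lower, upper, rng=random):
    return [int(rng.uniform(lower, upper + 1)) for _ in range(n)]

# Without replacement: a random shuffle of the integers lower..upper.
def boot_noreplace(lower, upper, rng=random):
    vals = list(range(lower, upper + 1))
    rng.shuffle(vals)
    return vals

# 1000 bootstrap entry numbers from the range 1..96
draws = boot_replace(1000, 1, 96)
# a random reordering of entries 1..50
perm = boot_noreplace(1, 50)
```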
See Also . . .
SEED Controls the seeding of the random number generator
%RANINTEGER (Function) Draws a random integer from a set range
%RANPERMUTE (Function) Returns a random permutation of {1,..,n}
%RANCOMBO (Function) Returns a random combination (sample without
replacement) from {1,...,n}
%RAN(x) (Function) Fills an array with draws from a Normal distribu-
tion with a standard deviation of x.
BOXJENK
Wizard
The Time Series—Box Jenkins (ARIMA) Models wizard provides dialog-driven access
to most of the features of the BOXJENK instruction.
Parameters
depvar Dependent variable.
start end Estimation range. If you use these to set the range (rather than
using the default range), and are not using the maximum likeli-
hood method, you must set start to allow for:
• the autoregressive and seasonal autoregressive lags in the
dependent variable.
• the required lags for input variables. The total number of
lags of actual data required is the sum of the highest ar
lag, the highest seasonal ar lag and the highest lag in the
input numerator.
You do not have to allow for lags in the moving average part or
any denominator lags in the inputs.
If you haven’t set a SMPL, the range defaults to the maximum
range over which all of these lags are defined.
residuals (Optional) The residuals are automatically saved in the series
%RESIDS. You can supply a series name for this parameter if
you also want to store the residuals in a different series.
Options
constant/[noconstant]
demean/[nodemean]
CONSTANT includes an intercept (constant) term in the model as an estimated
parameter (by default, there is none). DEMEAN offers an alternative for handling
series with non-zero means. It removes the mean from the dependent variable
(after differencing, if required), prior to estimating the model. The mean removal
is done using an internal copy—the actual series itself is not affected.
rats supports any combination of lags for the AR, MA, SAR, and SMA options:
• For N consecutive lags (all lags from 1 through N) for a given parameter, use
the format AR=N, MA=N, etc.
• For non-consecutive lags, use ||list of lags||. If you are listing more
than one lag, separate them by commas. For example, use AR=||3|| for an ar
parameter at lag 3 only, while AR=||1,3|| gives parameters on lags 1 and 3.
You can also use a VECTOR of INTEGERs.
[print]/noprint
vcv/[novcv]
dfc=Degrees of Freedom Correction (User’s Guide, Section 5.15)
These are the same as for LINREG. Note that there are no WEIGHT or SPREAD op-
tions.
smpl=SMPL series or formula (“SMPL Option” on page RM–546)
You can supply a series or a formula that can be evaluated across entry numbers.
Entries for which the series or formula is zero or “false” will be skipped, while
entries that are non-zero or “true” will be included in the operation. You must use
the MAXL option with this.
maxl/[nomaxl]
Estimates the model using maximum likelihood estimation rather than condi-
tional least squares. MAXL has the advantage of being able to handle missing
values within the estimation range.
method=[gauss]/bfgs/simplex/genetic/annealing/ga/initial/evaluate
iterations=iteration limit [100]
subiterations=subiteration limit [30]
cvcrit=convergence limit [.00001]
trace/[notrace]
BOXJENK uses non-linear optimization. METHOD sets the estimation method to be
used, with Gauss-Newton being the default choice. INITIAL is the initial guess
algorithm—the same one used by the rats instruction INITIAL. EVALUATE sim-
ply evaluates the model given the initial parameter values (which you can input
using the INITIAL option).
ITERATIONS sets the maximum number of iterations, SUBITERS sets the maxi-
mum number of subiterations, CVCRIT the convergence criterion. TRACE prints
the intermediate results. See Chapter 4 in the User’s Guide for more details.
pmethod=gauss/bfgs/simplex/genetic/annealing/ga/initial
piters=number of PMETHOD iterations to perform [none]
Use PMETHOD and PITERS if you want to use a preliminary estimation method to
refine your initial parameter values before switching to one of the other estima-
tion methods—rats will automatically switch to the METHOD choice after com-
pleting the preliminary iterations requested using PMETHOD and PITERS.
initial=VECTOR of initial guesses
The initial guess values used by rats are usually adequate for well-specified
models. However, if you are having trouble getting convergence, you can input
your own guess values. See “Coefficient Order” for the proper placement.
The option INITIAL=%BETA will start iteration from the point at which the previ-
ous BOXJENK left off, if you decide you simply want to let the process run for a
few more iterations.
define=equation to define from result
If you intend to use the model for forecasting, you must use DEFINE to save the
estimated equation. rats obtains the equation from the model by multiplying
out the autoregressive and input denominator polynomials. For instance, rats
converts the model:
y_t = ( 10 / (1 − .3L) ) x_t + ( (1 + .4L) / (1 − .5L) ) u_t
into the equation
y_t = .8 y_{t−1} − .15 y_{t−2} + 10 x_t − 5 x_{t−1} + u_t + .1 u_{t−1} − .12 u_{t−2}
If you transformed your dependent variable or any input series with differenc-
ing operators prior to doing BOXJENK, or if they were residuals themselves from
BOXJENK (prewhitened), then you must use MODIFY and VREPLACE on the equa-
tion to put it into a form directly usable by the forecasting instructions.
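The “multiplying out” that DEFINE performs is ordinary polynomial multiplication in the lag operator. A small Python sketch (the helper name is hypothetical; rats does this internally):

```python
# Multiply two lag polynomials given as coefficient lists
# [c0, c1, c2, ...] meaning c0 + c1*L + c2*L^2 + ...
def polymult(a, b):
    out = [0.0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] += ai * bj
    return out

# (1 - .3L)(1 - .5L) = 1 - .8L + .15L^2 : the polynomial that ends up
# multiplying y when the model above is cleared of its denominators.
print(polymult([1, -0.3], [1, -0.5]))
```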
outlier=[none]/ao/ls/tc/standard/all
critical=critical (t-statistic) value [based on # of observations]
With any of the choices for OUTLIER other than NONE, BOXJENK does an automat-
ic procedure for detecting and removing outliers. This can be used with or with-
out the GLS option. If used without GLS, it operates like GLS with an empty set of
base regressors, that is, it estimates dummy shifts to the mean of the dependent
variable, using maximum likelihood. CRITICAL allows you to set the t-statistic
value used for the automatic outlier detection threshold. See “Outliers” on page RM–21
for more details.
Supplementary Cards
If using the REGRESSORS option, supply any additional regressors in regression for-
mat. If using the INPUTS option, supply one supplementary card for each input. The
lag polynomial form of an input is:
( (ω_0 + ω_1 L + … + ω_n L^n) / (1 − δ_1 L − … − δ_m L^m) ) X_{t−d}
Missing Values
BOXJENK requires use of the MAXL option if there are any missing values within the
data range.
Technical Information
The parameterization used for the arima model in rats is:

(1 − L)^d (1 − L^s)^{d_s} y_t = α +
   [ (1 + θ_1 L + … + θ_q L^q)(1 + Θ_1 L^s + … + Θ_l L^{sl}) /
     (1 − φ_1 L − … − φ_p L^p)(1 − Φ_1 L^s − … − Φ_m L^{sm}) ] u_t
Note the sign convention on the coefficients. Also, note that the parameterization of
the constant term is different from that used in some other software.
If you are using the INPUTS option, polynomial terms of the form shown in
“Supplementary Cards” are added to the right hand side of the above expression (one
polynomial per input).
Algorithm
By default, BOXJENK uses the Gauss-Newton algorithm with numerical derivatives.
The simplex and genetic methods are also available with the METHOD and PMETHOD
options. These can be helpful in improving initial parameter guesses for models that
prove difficult to fit. Maximum likelihood estimation is done using a state-space
representation. Ansley’s (1979) conversion of the full-sample likelihood into a least-
squares problem is employed if you use Gauss-Newton. METHOD=BFGS, though, often
provides better estimation performance.
For transfer function models, rats generates

Z_t = X_t / δ(L)

(when denominator lags are present) by solving δ(L) Z_t = X_t, where presample
values of X are set equal to the mean of the first twenty observations.
Outliers
The OUTLIER and related options implement an automatic procedure for dealing with
outliers in the data.
The choices for OUTLIER are NONE, AO, LS, TC, STANDARD, and ALL. AO locates
additive outliers. For an outlier at entry t0, the resulting dummy would be 1 only at
t0. LS detects level shifts, generating a dummy with -1’s from the beginning of the
sample until t0 - 1. TC detects temporary changes. For a temporary change starting
at t0, the dummy takes the value 1 at t0, then declines exponentially for data points
beyond that. OUTLIER=AO, OUTLIER=LS and OUTLIER=TC select scans for only the
indicated type of outlier. OUTLIER=STANDARD scans for AO and LS, OUTLIER=ALL
does all three.
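The three dummy shapes are simple to write down. An illustrative Python sketch (the function names and the TC decay rate are assumptions for the example, not rats settings):

```python
# Outlier dummy variables for an outlier at entry t0 in a sample
# of length n, following the shapes described above.
def ao_dummy(n, t0):
    # additive outlier: 1 only at t0
    return [1.0 if t == t0 else 0.0 for t in range(n)]

def ls_dummy(n, t0):
    # level shift: -1 from the start of the sample through t0-1
    return [-1.0 if t < t0 else 0.0 for t in range(n)]

def tc_dummy(n, t0, delta=0.7):
    # temporary change: 1 at t0, decaying exponentially beyond it
    # (the decay rate delta=0.7 is an assumed value for illustration)
    return [delta ** (t - t0) if t >= t0 else 0.0 for t in range(n)]

print(ao_dummy(6, 2))   # -> [0.0, 0.0, 1.0, 0.0, 0.0, 0.0]
```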
The following procedure is repeated until no further outliers are detected:
Beginning with the last RegARIMA model (including previously accepted outliers), LM
tests are performed for each of the requested types of outliers at all data points. If
the largest t-stat exceeds the critical value, that shift dummy is added to the model,
which is then re-estimated.
When there are no further outliers to be added, the list is then pruned by examining
the t-stats from the full estimation using the same critical value.
Note that the first step uses a “robust” estimate of the standard error of the residu-
als, based upon the median absolute value. There are several ways to compute
maximum likelihood estimates; rats uses Kalman filtering, Census x12–arima
uses optimal backcasting. The two lead to identical values for the likelihood function,
identical values for the sum of squares of the residuals, but not to identical sets of
estimated residuals. As a result, the two programs can differ slightly in this robust
estimate of the standard error. In some cases, the differences can be large enough to cause the
two programs to differ on whether a marginal t-stat is above or below the limit. (x12–
arima tends to give a lower value for the standard error, and hence higher t-statis-
tics). This tends to correct itself in the backwards pruning steps.
Coefficient Order
For a standard ARIMA/Transfer function model, the coefficients are in the model in
the following order. This is used in the INITIAL option or for doing hypothesis tests.
1. Constant
2. Autoregressive
3. Seasonal autoregressive
4. Moving average
5. Seasonal moving average
6. Inputs in order, numerator first
For RegARIMA models, the regressors are considered to be of principal interest, so
they are moved to the start:
1. Regressors in order
2. Autoregressive
3. Seasonal autoregressive
4. Moving average
5. Seasonal moving average
Hypothesis Tests
You can use TEST and RESTRICT to test hypotheses on the coefficients. The coeffi-
cient order is provided in the description of the INITIAL option. It’s probably easiest
to use the Regression Tests Wizard to set those up.
Examples
boxjenk(diff=1,sdiff=1,ar=1,ma=||1,12,13||,maxl,define=reseq) $
rescons
estimates by maximum likelihood an ARIMA(1,1,3)×(0,1,0) with the MA parameters
on lags 1, 12 and 13. This also defines the forecasting equation RESEQ from the re-
sult.
boxjenk(diffs=1,ar=1,inputs=1,apply,define=saleseq) sales
# ads 0 1 0
boxjenk(diffs=1,ar=||2,4||,define=adseq) ads
estimates a transfer function model from ADS to SALES. The transfer term is (0,1,0)
and the noise term is ARIMA(1,1,0). The second BOXJENK fits an ARIMA(2,1,0) to ADS
with AR lags on 2 and 4. Use the two equations together to forecast SALES out-of-
sample.
Variables Defined
In addition to the standard regression variables (see page RM–288), BOXJENK defines:
%NARMA number of arma parameters (useful for degrees of freedom cor-
rection when computing a Q statistic) (INTEGER)
%NFREE number of free parameters (INTEGER)
%FUNCVAL final value of the estimated function (REAL)
%ITERS iterations completed (INTEGER)
%CONVERGED = 1 or 0. Set to 1 if the process converged, 0 if not.
%CVCRIT final convergence criterion. This will be equal to zero if the subi-
terations limit was reached on the last iteration (REAL).
Output
The output is the standard regression output, except for the labels on the coefficients,
and the inclusion of the Ljung–Box Q statistic, which will show the correction to
degrees of freedom for the estimated arma parameters (below 8 correlations minus 1
lost degree for the one ma parameter). rats labels the ar, ma, and seasonal param-
eters as AR, AR_SEAS, MA, and MA_SEAS. The numerator and denominator coefficients
for input series are N_nnnnnn and D_nnnnnn. The R2 is computed based upon the
original dependent variable before differencing, if any is used.
boxjenk(ma=1,maxl,constant,inputs=1,define=saleseq) sales
# adv 1 1 0
BREAK
Example
This does iterated weighted least squares, allowing for up to 40 iterations, but break-
ing out of the loop if the slope coefficient changes by less than .00001 from one itera-
tion to the next.
CACCUMULATE
Parameters
cseries The complex series to transform.
start end Range of entries to transform. By default, range of cseries.
newcseries Complex series for the result. By default, same as cseries.
newstart Starting entry of the result series. By default, same as start.
Options
standardize/[nostandardize]
If you use STANDARDIZE, CACCUMULATE normalizes the resulting series so the
final entry has the value one. (This was called SCALE before version 7).
Example
This is the business end of a Durbin test. The CACCUMULATE instruction cumulates
the periodogram (squared Fourier transform) over frequencies 0 to π and standardizes
it to have an end value of 1.0. If the series examined is white noise, this cumu-
lated periodogram should differ only slightly from the theoretical spectral distribu-
tion function for white noise: a straight line. The CXTREMUM instruction computes the
maximum gap between the two distribution functions.
frequency 3 nords
compute half=(nords+1)/2
rtoc
# resids
# 1
clabels 1 2 3
# "Cumprdgm" "Whitenoise" "Gap"
fft 1
cmult 1 1 Periodogram of resids
cacc(standardize) 1 1 half
cunits 2 Spectral density of white noise (constant)
cacc(standardize) 2 1 half
csubtract 2 1 1 half 3
cxt(part=absval) 3 1 half Computes extreme values of series 3
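The same cumulated-periodogram comparison can be sketched in Python to make the logic concrete. This is an illustrative stdlib-only version with a naive DFT and made-up helper names; it makes no claim to match the rats FFT output exactly:

```python
import cmath

# Cumulated, standardized periodogram of a series, compared with the
# straight-line spectral distribution of white noise (the Durbin test idea).
def cumulated_periodogram(x):
    n = len(x)
    pgram = []
    # periodogram ordinates at frequencies 2*pi*k/n, k = 1..n//2
    for k in range(1, n // 2 + 1):
        z = sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n)
                for t in range(n))
        pgram.append(abs(z) ** 2)
    total = sum(pgram)
    cum, out = 0.0, []
    for p in pgram:
        cum += p
        out.append(cum / total)      # standardized: the series ends at 1.0
    return out

def max_gap(x):
    # maximum gap between the cumulated periodogram and the
    # straight-line white-noise benchmark
    cp = cumulated_periodogram(x)
    m = len(cp)
    return max(abs(c - (k + 1) / m) for k, c in enumerate(cp))
```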
CALENDAR
Common Frequencies
calendar(a) year:1
calendar(q) year:quarter
calendar(m) year:month
calendar(b) year:month:day
calendar(w) year:month:day
calendar(d) year:month:day
calendar(7) year:month:day
For Annual (A), Quarterly (Q), Monthly (M), Bi-Weekly (B), Weekly (W), and Daily
data (D=five day per week, 7= seven days per week). The parameter gives the date
of the first entry, in standard rats date notation.
Wizards
The Data/Graphics—Calendar wizard allows you to set the CALENDAR using a dialog
box. Both of the Data wizards set the CALENDAR as part of the process of reading in
data.
Other Options
RECALL=saved VECT[INTEGER] to be used for calendar scheme
SAVE=VECT[INTEGER] into which calendar scheme is saved
Note that SAVE does not change the working calendar scheme—this allows you to
define a calendar scheme to be used for processing a data file which has no date
information but which follows a different arrangement than the main workspace.
Description
CALENDAR is one of the most important instructions in rats:
• With time series data, use CALENDAR to tell rats the starting date and peri-
odicity of the data.
• With panel data, use CALENDAR to set the number of time periods per indi-
vidual, and, optionally, the starting date and periodicity of the data.
After setting a CALENDAR, you can refer to entry ranges using an easy-to-understand
date notation (“Date Notation”), and your output will be labeled similarly. Note that
you should never use a CALENDAR instruction with cross-sectional data. It will only
confuse rats.
Frequencies
The form of the CALENDAR instruction will vary depending on the desired frequency,
as described in the following paragraphs. See the examples later in this section.
Intraday Data
If you have a data set with a fixed number of time periods per day, use the PPD option
to set the periods per day, combined with one of the other options (usually D or 7) to
describe the day-to-day arrangement of the data set.
You can use any number of periods with PPD—you do not have to cover a full 24 hour
period. For instance, data at five minute intervals from 9:05am to 3:00pm could be
handled using PPD=72.
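As a quick check of that count (plain arithmetic in Python, nothing rats-specific):

```python
# Five-minute observations from 9:05am through 3:00pm, both endpoints
# included, give 72 periods per day.
start = 9 * 60 + 5        # 9:05am in minutes after midnight
end = 15 * 60             # 3:00pm
ppd = (end - start) // 5 + 1
print(ppd)                # -> 72
```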
Panel Data
Include the option PANELOBS=periods per individual together with other op-
tions to describe the date scheme in the time direction. periods per individual
is the number of time periods for each individual in a data set. The setting of the time
series scheme is optional and has no effect on your ability to use the special panel
data features of rats. Note that you cannot use PANELOBS with the PPD option. See
page UG–393 of the User’s Guide for more on panel data.
Examples
The first three examples all use ALLOCATE to set the default ending period:
calendar(m) 1950:1
allocate 2017:12
Monthly data, starting January, 1950, ending December, 2017.
cal 1791:1
allocate 1856:1
Annual data, beginning in 1791, ending in 1856. Note that the :1 on the date references
is required when specifying dates for annual data.
cal(w) 1985:5:8
allocate 1989:3:1
Weekly data, with the first week ending on May 8, 1985. Data end March 1, 1989.
cal(7) 1980:1:1
open data daily.rat
data(format=rats) * 1999:12:31
Daily data with seven days per week, beginning on January 1, 1980, and set the
default ending period (December 31, 1999) using a DATA instruction.
cal(ppd=8,d) 2000:7:1
allocate 2003:8:31//8
Eight time periods per business day, starting July 1, 2000, ending on August 31,
2003 (with a full eight periods on the last day).
cal(panelobs=52,w) 1990:1:5
allocate 30//52
Panel data at a weekly frequency, with 52 time periods per individual, first week
ending Friday, January 5, 1990. There are 30 individuals in the data set.
Converting Frequencies
rats will not allow you to work with two frequencies of data simultaneously, mean-
ing that only one CALENDAR can be in effect at a time. However, the DATA instruction
can translate data automatically from other frequencies to the current CALENDAR
frequency (see Introduction, Section 2.5 for details). Thus, if you have a mixture of
monthly and quarterly data, you can work with it all at either quarterly or monthly
frequencies.
DATA itself does not use any sophisticated techniques for producing a higher fre-
quency version of a series. For instance, in translating quarterly data from a file
into monthly series, it merely copies the quarterly value to each of the corresponding
months. You can then use the @DISAGGREGATE procedure to do more complex inter-
polations and distributions on the expanded data.
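The simple expansion that DATA performs can be sketched in a few lines. This is an illustrative Python fragment (not rats code) with made-up quarterly values:

```python
# Each quarterly value is simply copied to each of its three corresponding months.
quarterly = [100.0, 104.0, 101.5]
monthly = [q for q in quarterly for _ in range(3)]
print(monthly)  # [100.0, 100.0, 100.0, 104.0, 104.0, 104.0, 101.5, 101.5, 101.5]
```

More sophisticated interpolation or distribution of the expanded data is left to @DISAGGREGATE.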
Saving/Recalling Calendars
It’s sometimes necessary to switch to a different frequency temporarily, or to switch
out of time-dated data to simple sequence data. You can save the current calendar
setting, then recall it later on using the function %CALENDAR() and the RECALL op-
tion on the CALENDAR instruction. The following saves the initial quarterly calendar
setting in SAVECAL, switches to IRREGULAR, then switches back later.
calendar(q) 1954:1
...
compute savecal=%calendar()
calendar(irregular)
...
calendar(recall=savecal)
You can also save a calendar scheme without resetting the workspace date scheme
by using the SAVE option. For instance, the following sets up a quarterly workspace
starting in 1954:1, then reads a data file which has monthly data from 1948:1, com-
pacting by quarterly averages. Since the data file is free format, there is no iden-
tifying date information on it, thus the need for the CALENDAR option on the DATA
instruction.
cal(q) 1954:1
cal(m,save=mdata) 1948:1
data(format=free,compact=average,calendar=mdata) $
1954:1 2016:4 ffunds
calendar(yearsperentry=years) year
Use this form for data spaced the indicated number of years apart.
Wizard
If you do use File–Open to open a rats format file, you’ll get the same information,
but in a scrolling list. From this list, you can view/edit data, export data, and more.
Parameters
list of series If you provide a list of series names, CATALOG will display in-
formation only for those series. Otherwise, rats will list all the
series on the file.
Options
full/[nofull]
FULL requests full information (names, dates, comments) on each series in the
list. Otherwise CATALOG lists names only. FULL is the default only if you ask for
information on specific series by using the list parameter.
unit=[output]/copy/other unit
Choose where you want output to go.
Examples
catalog
Lists the names of all the series on the file.
catalog(like="gdp*",full,unit=copy)
Lists information on all series whose names start with GDP to the current COPY unit.
catalog real_gnp nom_gnp
Lists full information on the series REAL_GNP and NOM_GNP.
See Also . . .
DEDIT Opens a rats data file for editing.
PRTDATA Prints series from a rats data file.
Parameters
distribution The distribution selected. Choose one of the following (at least the first three letters are required):
• FTEST (for F)
• TTEST (for two-tailed t)
• CHISQUARED
• NORMAL (for two-tailed standard normal).
With TTEST and NORMAL, divide the significance level by two if
you want just a one-tailed test.
statistic The value of the test statistic. This can be an expression.
degree1, degree2
• For t and χ², degree1 is the degrees of freedom.
• For F, degree1 and degree2 are the numerator and denominator degrees of freedom, respectively.
Option
[print]/noprint
rats displays the statistic and significance level unless you use NOPRINT.
Example
This does a simple specification test by regressing the residuals on a larger set of
exogenous variables. See page UG–95 in the User’s Guide for more.
linreg lwage / resids
# constant exper expersq educ
linreg resids
# constant exper expersq educ age kidslt6 kidsge6
cdf(title="Specification Test") chisqr %trsq 3
Parameters
oldunit Name of the I/O unit that you want to set. It makes little sense
for this to be anything other than INPUT, OUTPUT or PLOT.
newunit The existing I/O unit that you want to take the place of
oldunit.
Description
CHANGE makes newunit take over the role currently assigned to oldunit. For in-
stance,
change input keyboard
will switch the source of rats instructions from whatever it is currently (probably a
file) to the keyboard.
change output screen
switches output to the screen or edit window.
Examples
Suppose you are running a long batch job, and you want to direct some output (such
as regression results) to one file, and other output (such as hypothesis test results)
to another file. If you only need to change output files once, you could simply use two
OPEN OUTPUT commands.
A similar example is where you have an important piece of output which you would
like to place in its own window, so that it can be found easily. This can be done by
something like
open(window) regwindow "Key Regressions"
change output regwindow
linreg ...
....
change output screen
This opens a new window (which will be titled on the display as “Key Regressions”)
and puts all output created by the LINREG and any other instructions down to the
next CHANGE OUTPUT into that window. That second CHANGE OUTPUT switches the
remaining output back to the main output window.
clabels list of complex series
# labels within "..." or label expressions, separated by blanks
Parameters
list of series List of complex series you want to label.
Supplementary Cards
The labels can be any collection of characters (up to sixteen) enclosed within quotes
("..." or '...'). You can also use string expressions, LABEL variables, or elements of
an array of LABELS.
Example
freq 5 512
clabels 4 5
# "Crosspec" "Spectrum"
labels series 4 and 5 as CROSSPEC and SPECTRUM, respectively.
Parameters
list of series The list of the series you want to clear. You can list an array of
series to initialize all the series in the array. Use the ALL option
to clear all existing series in memory.
If you include a new series name on the list, that series will be
created.
Options
all/[noall]
If you use ALL, rats clears all existing series.
zeros/[nozeros]
Use ZEROS if you want to set the values of the series to zero rather than to the
missing value code.
Parameters
cseries The complex series to transform.
start end Range of entries to transform. By default, the defined range of
cseries.
newcseries Complex series for the result. By default, same as cseries.
newstart Starting entry of the result series. By default, same as start.
Example
cln 3
ift 3
cset 3 = %z(t,3)*(t<=ordinate/2)
fft 3
cexp 3
ift 3
cset(scratch) 3 = %z(t,3)/%z(1,3)
is a section of the procedure @SPECFORE, which computes forecasts using spectral
methods. It takes the log of series 3 (a spectral density), applies an inverse transform
to the series, sets its negative “lags” to zero, transforms the series back, then expo-
nentiates the series and inverse transforms it again. The CSET instruction at the end
normalizes series 3 to a value of 1.0 in entry 1.
Parameters
RATS I/O unit This can be COPY, PLOT, DATA or any other unit which you have
defined.
Examples
This is a section of a two-part program. The first part runs 100,000 replications of some
testing procedure, producing values for the scalar variables CDSTAT1 and CDSTAT2.
The DISPLAY instruction writes each pair of values to file SIMUL.DAT. When the rep-
lications are finished, the first program is terminated, and a second program reads
this information back in as data series for analysis.
open copy simul.dat
do i=1,100000
...
display(unit=copy) cdstat1 cdstat2
end do i
*
end(reset)
*
close copy
open data simul.dat
all 100000
data(org=obs) / cdstat1 cdstat2
order cdstat1
order cdstat2
...
See Also . . .
OPEN Opens an I/O unit (includes general discussion of rats I/O
units).
REWIND Rewinds an I/O unit.
Parameters
cseries Series to create.
start end Range of entries to set. By default, 1 and FREQUENCY length.
seasonalwidth Number of entries between seasonals. By default, rats will use
the number of frequencies divided by the CALENDAR seasonal.
bandwidth The width of the zero band around each seasonal frequency.
rats rounds even values up to the next odd value. By default:
seasonalwidth/6.
bandcenter The center (entry number) of the first seasonal band. By de-
fault, entry one (zero frequency). If this is seasonalwidth+1,
CMASK will leave the low frequencies unaffected.
Description
CMASK creates a mask for seasonal frequencies. It sets entries start to end of
cseries to the value 1.0. Then it sets to zero a band of width bandwidth about
every seasonalwidth entry, beginning with bandcenter. A band will wrap around
to the other end of the series if necessary. For instance:
cmask 3 1 128 32 5 33
results in zeros in entries 31 to 35, 63 to 67 and 95 to 99. These are the bands around π/2, π, and 3π/2.
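The masking rule can be sketched as follows. This is an illustrative Python fragment (not rats code) reproducing the example above; the wrap-around handling is an assumption based on the description:

```python
# Build a 1.0 mask, then zero a band of width `bandwidth` around every
# `seasonalwidth`-th entry, starting at `bandcenter` (entries are 1-based).
n, seasonalwidth, bandwidth, bandcenter = 128, 32, 5, 33
half = bandwidth // 2
mask = [1.0] * n
center = bandcenter
while center <= n:
    for e in range(center - half, center + half + 1):
        mask[(e - 1) % n] = 0.0   # wrap around if the band runs off either end
    center += seasonalwidth
zeros = [e for e in range(1, n + 1) if mask[e - 1] == 0.0]
print(zeros)  # entries 31-35, 63-67 and 95-99
```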
Notes
You need to take some special steps when smoothing a spectrum which you intend
to mask: use the option MASK=masking series on WINDOW, then multiply the
smoothed series by the mask.
fft 2
cmult(scale=1./(2*%pi*%scaletap)) 2 2
cmask 3 1 128 32 5 1
window(mask=3,width=9) 2 1 128 4
cset 4 = %z(t,4)*%z(t,3)
Parameters
start end Range of entries to use in computing the cross-moment matrix.
If you haven’t set a SMPL, this defaults to the largest interval
over which all the variables are defined.
Supplementary Cards
The first supplementary card lists the variables to include in the cross-moment ma-
trix. You must include in this list the dependent variable(s) of any regressions that you
plan to base on this CMOMENT. Omit this if using the EQUATION, LASTREG, MODEL, or
INSTRUMENTS option.
The second card is used only with ZXMATRIX.
Description
Cross-moment and correlation matrices are important in statistical work. However, the main uses of CMOMENT come from the fact that a cross-moment matrix which includes both the explanatory variables and the dependent variable(s) of a linear regression has all the sample information required to compute regression coefficients and sums of squared residuals. This can be used in several ways within rats:
• The LINREG instruction with the CMOMENT option pulls that information out
of the computed cross product matrix. This can cut down on the computation
time required for doing large numbers of similar regressions, and also ensures
that all those regressions use a common sample. See, for example, the infor-
mation criteria searches in Section 2.9 in the User’s Guide.
• The %RANMVPOSTCMOM and %RSSCMOM (for univariate regressions) and the
%RANMVKRONCMOM and %SIGMACMOM (for multivariate regressions) functions
simulate coefficients from the posterior in a linear model (the %RAN... func-
tions) or compute the sum of squared residuals or covariance matrix (the other
two) using the statistics included in the cross-moment matrix. This is a huge
time-saver in simulations as the only calculation that depends upon the data
gets computed just once, rather than with every draw.
• The %SWEEP function can be used to do a sequence of “regressions” using a
common sample range if all you need are the coefficients and sum of squared
residuals.
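The underlying fact can be verified directly: a cross-moment matrix of [X y] carries everything needed for the OLS coefficients and residual sum of squares. This is a minimal Python sketch (not rats code) using a hypothetical two-regressor data set:

```python
# Stack the regressors and the dependent variable, form the cross-moment
# matrix M = [X y]'[X y], then recover beta and RSS from its blocks.
X = [[1.0, 1.0], [1.0, 2.0], [1.0, 3.0], [1.0, 4.0]]   # constant + one regressor
y = [2.0, 4.0, 6.0, 8.0]                               # exactly y = 2*x here
Z = [row + [yv] for row, yv in zip(X, y)]
k = 2
M = [[sum(a[i] * a[j] for a in Z) for j in range(k + 1)] for i in range(k + 1)]
# Solve (X'X) b = X'y from the blocks of M (2x2 system solved directly).
xx = [r[:k] for r in M[:k]]
xy = [M[i][k] for i in range(k)]
det = xx[0][0] * xx[1][1] - xx[0][1] * xx[1][0]
b = [(xx[1][1] * xy[0] - xx[0][1] * xy[1]) / det,
     (xx[0][0] * xy[1] - xx[1][0] * xy[0]) / det]
rss = M[k][k] - sum(b[i] * xy[i] for i in range(k))    # y'y - b'X'y
print(b, rss)  # [0.0, 2.0] 0.0
```

Only the construction of M touches the data; the solve step could be repeated for many subsets of the variables, which is why LINREG(CMOMENT) and the %...CMOM functions can be so much faster than re-running regressions.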
Options
print/[noprint]
Use PRINT to print the cross-moment matrix. rats prints it in a table like that
used for the covariance matrix of regression coefficients (see page Int–77 of the
Introduction).
corr/[nocorr]
center/[nocenter]
With CORR, CMOMENT computes the correlation matrix instead of the cross-
moment matrix. With CENTER, it computes a centered (means subtracted) cross-
moment matrix. You cannot use a LINREG(CMOMENT) instruction after CMOMENT
with either of these.
smpl=SMPL series or formula (“SMPL Option” on page RM–546)
You can supply a series or a formula that can be evaluated across entry numbers.
Entries between start and end for which the series or formula is zero or “false”
will be omitted from the computation.
rats will form the cross-moment matrix using the explanatory variables, and
will include the dependent variable(s) of the equation, regression, or model unless
you use the option NODEPVAR.
Note: With LASTREG, the sample range is determined by the CMOM instruction
(that is, by the default range, the start and end parameters, or the SMPL op-
tion), which may not be the same as the range used in the preceding regression.
To use the same range as the regression, use the %regstart() and %regend()
functions as the start and end parameters of the CMOM instruction.
zxmatrix=RECTANGULAR Z’X matrix [unused]
Takes two sets of supplementary cards, or some combination of LASTREG or
EQUATION and INSTRUMENT options and possibly a supplementary card. It
computes Z′X where Z is the first list of variables and X the second. If you use
LASTREG or EQUATION, that will always be the first list; INSTRUMENT will be first
if you use it but not LASTREG or EQUATION. If you only use one of these options,
use a single supplementary card to supply the other list of variables. You cannot
use a LINREG(CMOMENT) instruction after CMOMENT with the ZXMATRIX option.
matrix=SYMMETRIC array for computed matrix [%CMOM]
This saves the computed cross-moment matrix in the SYMMETRIC array. By de-
fault, the result goes into the array %CMOM, which you can access. You cannot use
a LINREG(CMOMENT) instruction after a CMOM using the MATRIX option.
setup/[nosetup]
SETUP creates and dimensions the %CMOM array and prepares for future
LINREG(CMOM) instructions, but does not actually compute the matrix. Instead,
you can fill the array yourself with subsequent instructions, such as COMPUTE.
Notes
Correlations computed by CMOM(CORR) may be different from pairwise correlations
computed with the function %CORR or with CROSS. CMOMENT does the computations
over a single set of entries appropriate for all the variables involved. This may not be
the same as the entries which could be used for a single pair of the variables.
Missing Values
Any observation for which any of the series in the cross-moment matrix is missing is
dropped from the sample.
Examples
cmom
# constant m1{0 to 12} gnp
do lag=1,12
linreg(cmoment) gnp
# constant m1{0 to lag}
end do lag
computes a set of distributed lag regressions. This is similar to what you would get
with LINREGs alone except the regressions are computed over a uniform sample pe-
riod, because the CMOM is computed using the full 12-period lag length for M1.
cmom(corr,print)
# rate30 rate60 rate90 rate120 rate1yr rate2yr
computes and prints the correlation of six interest rate series.
cmom 1974:2 *
# uscpi{1} exrat{1} itcpi{1} uscpi exrat italcpi
compute s=%sweeptop(%cmom,3)
“sweeps out” the lag values. This, in effect, runs a one lag VAR.
cmom
# constant shortrate{0 to 24} longrate
linreg(cmom) longrate
# constant shortrate{0 to 24}
compute beta=%beta
*
* Draw residual precision conditional on previous beta
*
compute rssplus=nu*s2+%rsscmom(%cmom,beta)
compute hu =%ranchisqr(nu+%nobs)/rssplus
*
* Draw betas given hu
*
compute beta=%ranmvpostcmom(%cmom,hu,priorh,priorb)
uses the information in %CMOM to make random draws for the regression variance
(done with the help of %RSSCMOM) and coefficients (%RANMVPOSTCMOM).
See Also . . .
%CORR Means-subtracted correlation between two vectors or series.
%COV Means-subtracted covariance between two vectors or series.
CROSS Computes cross-correlations or covariances of two series
CORRELATE Computes autocorrelations of a series
Parameters
cseries1, 2 The pair of complex series you want to transform.
start end Range of entries to transform. By default, the common defined
range of cseries1 and cseries2.
newcseries Complex series for the result. By default, same as cseries1.
newstart Start of the result series. By default, same as start.
Comments
CMULTIPLY is the instruction generally used to convert a Fourier transform into
a spectrum and cross-spectrum. The proper scaling factor for spectral estimates is
SCALE=1./(2*%PI*N), where N is the number of actual data points and not the pad-
ded length. If you use a taper (see TAPER), you should use the %SCALETAP variable
defined by TAPER instead of N.
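The role of the 1/(2πN) scale factor can be illustrated with a direct computation. This is a hedged Python sketch (not rats code) with made-up data, showing that the scaled squared transform integrates back to the series' mean square (Parseval's relation):

```python
import cmath
import math

# Discrete Fourier transform of a short series, scaled by 1/(2*pi*N).
x = [1.0, 2.0, 0.5, -1.0, 3.0, -2.0, 0.0, 1.5]
N = len(x)
dft = [sum(x[t] * cmath.exp(-2j * math.pi * j * t / N) for t in range(N))
       for j in range(N)]
spec = [abs(z) ** 2 / (2 * math.pi * N) for z in dft]
# Summing the spectrum over the N frequencies, times the spacing 2*pi/N,
# recovers the mean square of the data.
recovered = sum(spec) * (2 * math.pi / N)
mean_square = sum(v * v for v in x) / N
print(round(recovered, 6), round(mean_square, 6))
```

This is why N must be the number of actual data points: padding the transform changes the number of frequencies but not the data's mean square.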
Examples
cmult(scale=1./(2*%pi*nobs)) 1 1
cmult 1 2 / 3
The first replaces series 1 by its squared absolute value (series 1 times its complex conjugate), divided by 2π·NOBS. The second sets 3 equal to the product of 1 and the conjugate of 2.
Description
Among its uses in typical rats programs:
• You can do calculations using values such as the Residual Sum of Squares (%RSS
variable) produced by instructions like LINREG and NLLS.
• You can set integer values for starting and ending observations, then use these
variables as parameters on instructions. You can then easily alter the program to
use a different range of entries by changing just a few COMPUTE instructions.
• You can do matrix calculations such as multiplication, addition, inversion and
determinants.
With COMPUTE, you can do just about any numerical computation you could do with
a high level programming language such as fortran or C and often much more ef-
ficiently, since rats can work with entire matrices. And it has similar capabilities to
well-known “math” packages.
Parameters
variable This can be almost any type of variable supported by rats,
except FRML (formula) which must be set using the instruction
FRML. It can be an element of a series, but not an entire series
itself—use SET, CSET or GSET (depending upon the type of the
series) to set multiple elements of a series with a single instruc-
tion.
expression The expression can include any legal combination of arithme-
tic and matrix operators, functions, numeric values, character
literals, and previously defined variables and arrays. If the
variable is of a known type, the expression must evaluate to a
type which can be converted to the variable’s type.
Scalar Calculations
COMPUTE can do almost any kind of scalar calculation. Some examples:
compute ndraws=500
compute fstat = ( (rssr-rssu)/q ) / (rssu/%ndf)
The first sets the (INTEGER) NDRAWS to 500. The second computes an F-statistic using previously set values, and saves the result in the variable FSTAT.
Matrix Calculations
COMPUTE is the primary instruction for matrix calculations. Some examples:
compute c = a*b
sets the array C equal to the product of the arrays A and B. You do not have to declare
or dimension C ahead of time.
compute [symmetric] cmominv = inv(%cmom)
computes the inverse of the cross-moment matrix %CMOM and stores the resulting ar-
ray in the (new) SYMMETRIC array CMOMINV.
Multiple Expressions
You can evaluate multiple expressions with a single COMPUTE instruction. Just sepa-
rate the expressions with commas. For example:
compute b0=%beta(1) , b1=%beta(2) , b2=%beta(3)
You can also combine different types of expressions in a single COMPUTE:
compute adotb = %dot(a,b), ainv = inv(a), binv = inv(b)
Here, ADOTB is a real variable, while AINV and BINV are arrays.
In-Line Matrices
You can use COMPUTE to construct arrays directly in an expression. For example:
compute [vector] r = ||1.0, 2.0, 3.0||
creates a 3-element VECTOR called R, with elements 1.0, 2.0, and 3.0, while
compute [vector[label]] names = ||"US","Canada","Japan","Mexico"||
creates and sets a 4-element VECTOR of LABELS.
do row=1,%rows(%cmom)
compute %cmom(row,row) = %cmom(row,row)+k
end do row
adds the constant K to each entry on the diagonal of the %CMOM array.
See Also . . .
UG, Section 1.1 Scalar calculations
UG, Section 1.7 Matrix calculations
Wizard
You can also copy series to a file using View—Series Window to display a list of all
the series in memory, selecting the series you wish to export, and doing File—Export.
You can choose the file format using the “Save As Type” field.
Parameters
start end Range of entries to be printed or copied to a file. If you have not
set a SMPL, COPY uses the smallest range required to show all
defined data for the series. Any undefined data are treated as
missing values.
list of series List of series to be printed or saved. If you omit the list, all cur-
rent series are printed.
Options
format=[free]/binary/cdf/dbf/dif/fame/html/portable/prn/rats/
rtf/tex/tsd/wks/xls/xlsx/"(fortran format)"
The format to be used for the output. See Chapter 2 of the Introduction and Data
Formats on the help for details on the various file formats supported. The FREE
and FORTRAN formats are also discussed on the following pages. You can use
any of these if you are putting the output to the COPY or other unit. However,
UNIT=OUTPUT will only accept FREE, PRN, PORTABLE and CDF.
organization=columns/[rows]
The option ORGANIZATION (or ORG) tells rats how the data should be organized
on the file. Use COLUMNS if you want the data arranged in columns (that is, series
run down the page in columns, with each row containing a single observation for
all series). Use ROWS if you want the series to run across the page in rows.
Note that COPY still supports the choices OBSERVATION (equivalent to COLUMNS)
and VARIABLE (equivalent to ROWS) used in older versions. The ORG option is
ignored for most formats, since only text files and spreadsheets allow for the dif-
ferent arrangements.
unit=[copy]/output/other unit
This option sets the destination of the COPY operation. This can be an external
file or the current OUTPUT unit. The default choice is the COPY unit, which you
open with the instruction OPEN COPY. If you don’t open the unit in advance,
rats will prompt you for a file name.
dates/[nodates]
With the DATES option, observations are labeled with dates or entry numbers (if
no CALENDAR is in effect). With NODATES, observations are not labeled at all.
If the file has already been closed, use the APPEND option on the OPEN COPY command to reopen the file.
Using FORMAT=FREE
COPY with FORMAT=FREE is not quite analogous to DATA with FORMAT=FREE: if you
use the DATES option, COPY will produce a file which DATA cannot process.
Use FORMAT=FREE in one of two situations:
• You are porting data to a program which cannot accept any of the “labeled”
formats. You will get a file which consists of numbers only.
• You want to print data to the OUTPUT unit (usually the screen when working
in interactive mode).
While the spreadsheet formats will use only one (possibly extremely long) line to
represent all entries of a series (with ORG=ROWS) or observation (with ORG=COLS),
FORMAT=FREE uses multiple lines if necessary to keep the width of the output from
getting too large. For instance, this is a segment of the output for a single quarterly
data series (this was done with options FORMAT=FREE,ORG=ROWS,DATES):
1947:01 224.9000000 229.1000000 233.3000000 243.6000000
1948:01 249.6000000 257.1000000 264.0000000 265.5000000
1949:01 260.1000000 256.6000000 258.6000000 256.5000000
1950:01 267.4000000 276.9000000 294.5000000 305.9000000
1951:01 319.9000000 327.7000000 334.4000000 338.5000000
FORTRAN Format
FORTRAN format (Additional Topics PDF, Section 3.9) is very similar to free-format
with a picture code. It does, however, give you a bit more control over the appearance.
If you use a FORTRAN format with the option DATES, your format must allow for the
date string at the start of each line. This is an Aw format, where w is at least 7 for
quarterly and other “periods per year” data, 10 for daily, weekly, etc. and 15 for in-
traday or panel data. In addition, if you use ORG=ROWS with DATES, there is a special
option ACROSS which you must set correctly:
ACROSS=number of entries per line [4]
ACROSS indicates the number of data values the format will print per line.
For example, we could display the data above formatted eight across by using COPY
with the options FORMAT="(A8,8F8.1)",ACROSS=8,DATES.
Examples
open copy fooddata.xls
copy(dates,format=xls,org=columns)
writes all series in memory onto an Excel file, labeled with dates. Please note that
you have to create an entire spreadsheet or database file with a single instruction—
rats cannot “append” data to an existing spreadsheet.
cal(q) 1960:1
all 1998:4
open data oecdg7.rat
data(format=rats) / canrgdps frargdps deurgdps gbrrgdps usargdps
pform g7rgdp
# canrgdps to usargdps
cal(q,panelobs=%nobs) 1960:1
open copy g7gdp.rat
copy(format=rats,header=1) / g7rgdp
G7 Real GDP Data
constructs a panel data set and writes it out as a rats format file.
See Also . . .
OPEN Opens disk files.
PRINT Displays data series to the OUTPUT unit.
DISPLAY Displays scalars and expressions.
WRITE Displays the contents of arrays.
DATA Reads data from a file into rats.
Wizard
You can use Autocorrelations on the Time Series menu.
Parameters
series The series for which to compute statistics.
start end Range of entries to use. If you have not set a SMPL, this defaults
to the defined range of series.
There is also a fourth parameter which has the same function as the RESULTS option.
Options
number=number of correlations to compute [min(T/4, 2√T)]
The default is a function of the number of observations T.
method=yule/[burg]
This selects the method used to compute the correlations. The default is the Burg
algorithm. The option YULE selects the Yule–Walker method (the method used
in versions of rats before 5.0, and the one provided in some other software). See
"Yule–Walker vs Burg Methods" on page RM–538.
covariances/[nocovariances]
By default CORRELATE computes autocorrelations. If you use the COVARIANCES
option, CORRELATE computes autocovariances rather than autocorrelations.
[center]/nocenter
Use NOCENTER if you want the correlations to be computed without subtracting
means from the data.
qstats/[noqstats]
span=width of test intervals [all]
dfc=degrees of freedom correction for test [0]
With QSTATS, CORRELATE computes Ljung–Box (1978) Q tests for absence of
autocorrelation. If you use SPAN, it will do a series of Q tests, beginning with lags
1 to width, and increasing the upper bound by width each time. Without SPAN,
it does a single test using all the computed lags.
Use DFC to correct the degrees of freedom of the Q statistic. If you are applying CORRELATE to the residuals from an ARIMA model, the degrees of freedom correction should be the number of ARMA parameters that you estimated.
[print]/noprint
picture=picture code for output
title="Title for output"
Unless you use NOPRINT, rats displays the computed statistics. PICTURE
allows you to control the formatting of the numbers in the output table—see
"Picture Codes" on page RM–545 for more information. Use the TITLE option if you want
to provide your own title for the output. By default, this will be “Autocorrelations (covariances) of Series xxxx”.
organization=rows/columns
The ORGANIZATION option determines whether the output is oriented in rows
or in columns. The default is ORG=ROWS except when using the WINDOW option,
where it is ORG=COLUMNS.
window="Title of window"
If you use WINDOW, the output goes to a (read-only) spreadsheet window with the
given title, rather than being inserted into the output window or file as text.
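The Q test computed with QSTATS is the standard Ljung–Box statistic. The following is a minimal Python sketch (not rats code) of the formula, applied to a made-up, strongly negatively autocorrelated series:

```python
# Ljung-Box (1978): Q = T(T+2) * sum_{k=1}^{m} r_k^2 / (T - k),
# asymptotically chi-squared with m - dfc degrees of freedom.
def ljung_box(x, m):
    T = len(x)
    mean = sum(x) / T
    d = [v - mean for v in x]
    c0 = sum(v * v for v in d) / T
    r = [sum(d[t] * d[t - k] for t in range(k, T)) / T / c0
         for k in range(1, m + 1)]
    return T * (T + 2) * sum(rk * rk / (T - k) for k, rk in enumerate(r, start=1))

# An alternating series has lag-1 autocorrelation near -1, so Q is large.
print(round(ljung_box([1.0, -1.0] * 10, 1), 2))  # 20.9
```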
Notes
All the output series (RESULTS, PARTIALS, INVERSE and STDERRS) have the “0” lag in entry 1. This has the value 1 for the first three of these (and zero for STDERRS) unless you use the COVARIANCES option. Including the 0 lag makes it easier to do graphs, since the difference between (for instance) the theoretical pattern of an AR(1) and an ARMA(1,1) depends upon the shape when the 0 lag is included.
There are other instructions you can use, directly or indirectly, to compute autocor-
relations. In general, these will produce somewhat different results than CORRELATE,
which uses methods specifically designed for autocorrelations. For instance
CMOM(CORR) applied to a set of lags of a series will give different results for two rea-
sons: each lag of the series will have a separate mean calculated and removed; and
CMOM does all calculations over the range available for all the lags. For instance, if
you use four lags, it will skip four initial points.
Missing Values
In computing the correlations, rats treats any missing values as being equal to the
mean value of the series. The series itself is not modified in any way.
Partial Autocorrelations
The partial autocorrelation for lag k is (in effect) the coefficient on lag k in a k lag
autoregression. If all partial autocorrelations greater than p are “small,” it indicates
that a model no bigger than AR(p) is adequate for modelling the series.
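The equivalence between the lag-k partial autocorrelation and the last coefficient of a k-lag autoregression is usually computed via the Durbin–Levinson recursion. This is a hedged Python sketch (not rats code), checked against the theoretical autocorrelations of an AR(1):

```python
# Durbin-Levinson recursion: from autocorrelations r[0..m] (r[0] = 1),
# phi[k][k] is the coefficient on lag k in a k-lag autoregression,
# i.e. the partial autocorrelation at lag k.
def pacf(r, m):
    phi = [[0.0] * (m + 1) for _ in range(m + 1)]
    out = []
    for k in range(1, m + 1):
        num = r[k] - sum(phi[k - 1][j] * r[k - j] for j in range(1, k))
        den = 1.0 - sum(phi[k - 1][j] * r[j] for j in range(1, k))
        phi[k][k] = num / den
        for j in range(1, k):
            phi[k][j] = phi[k - 1][j] - phi[k][k] * phi[k - 1][k - j]
        out.append(phi[k][k])
    return out

# For an AR(1) with coefficient 0.6, the ACF is 0.6**k; the PACF should be
# 0.6 at lag 1 and (essentially) zero at every higher lag.
r = [0.6 ** k for k in range(4)]
print([round(v, 6) for v in pacf(r, 3)])  # approximately [0.6, 0.0, 0.0]
```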
Inverse Autocorrelations
If a series y has the ARMA representation
(1) Φ(L)y_t = Θ(L)u_t
then the inverse autocorrelations are the autocorrelations of the process z given by
(2) Θ(L)z_t = Φ(L)u_t
The inverse autocorrelations have two primary uses:
• They can assist in the identification stage of a Box–Jenkins model, since they reverse the AR and MA polynomials.
• You can use them to compute a maximum entropy estimate of a spectrum (MESA).
Note that the inverse autocorrelations will usually look nothing like the auto-
correlations themselves.
Variables Defined
%NOBS Number of observations (INTEGER)
%QSTAT The test statistic for the (final) Q test (REAL)
%QSIGNIF The significance level of the (final) Q test (REAL)
%NDFQ degrees of freedom for (final) Q test (INTEGER)
Examples
It is usually easiest to examine correlations by graphing them. The quickest way to
do that is to use the @BJIDENT procedure included with rats. However, it is use-
ful to know how to do these yourself so that you can choose exactly how you want to
present the information.
We recommend that you always use the options MAX=1.0, MIN=-1.0 and NUMBER=0 on your GRAPH. The NUMBER=0 option causes the “time” axis to be labeled from 0 up to the number of correlations computed. Below is a simple example. See the code in BJIDENT.SRC and other procedures for more complex examples.
log gdpq
difference gdpq / dgdp
correlate(number=40,results=corrl) gdpq
correlate(number=40,results=corrd) dgdp
graph(style=bargraph,key=upright,number=0,max=1.0,min=-1.0) 2
# corrl
# corrd
Output
The code below computes and prints the autocorrelations, partial autocorrelations,
and a range of Q-statistics for the series DGDP. The output is displayed in columns.
correlate(corrs=corrd,partial=partd,qstats,span=4,org=column) dgdp
Ljung-Box Q-Statistics
Lags Statistic Signif Lvl
4 12.801 0.012290
8 22.795 0.003638
12 31.987 0.001390
16 40.468 0.000665
Parameters
start end Range of entries to print. By default, the range required to dis-
play all defined entries of the series printed.
cseries list List of complex series to print. If you omit the list, rats prints
all of the complex series stored in memory.
Description
CPRINT prints each entry as a pair of real numbers, the real and imaginary parts
respectively. It prints the series in blocks, with two or three to a block, until the list
is exhausted. See the sample output.
Options
interval=sampling interval [1]
You can use the INTERVAL option to reduce the number of frequencies printed.
CPRINT prints just one entry out of each sampling interval, beginning with
start. Because spectral estimates are usually quite smooth, you will lose very
little detail while reducing rather considerably the size of the output.
lc=[entries]/periods/frequencies
LC (Left Column) indicates how you want to label the entries. CPRINT assumes
entry one is frequency zero when it computes the PERIODS and FREQUENCIES.
window="Title of window"
If you use the WINDOW option, the output is displayed in a (read-only) spreadsheet
window. The series will be in columns, with labels across the top row. You can
export the contents of this window to various file formats using File–Export....
Sample Output
cprint(length=nords,lc=freq) 1 (nords+1)/2 9 10
This prints series 9 and 10 labeling the entries by their frequencies. The LENGTH op-
tion is used because the number of frequencies printed is different from the number
that should be used in calculating the frequencies.
cprint(len=nords,lc=periods,picture="##.####") 1 (nords+1)/2 9 10
CROSS — Cross-Correlations
CROSS computes the cross-correlations or cross-covariances for a pair of series.
Wizard
You can use Cross Correlations on the Time Series menu to do these computations.
Parameters
series1 series2 CROSS computes the cross correlations of these two series.
start end The range of entries to use. If you have not set a SMPL, this
defaults to the maximum range over which both series1 and
series2 are defined.
Two additional parameters used in versions of rats before 7.0 for specifying the
range of lags have been replaced by the FROM and TO options. The old parameters are
still recognized, but we recommend that you use the options.
Description
CROSS computes the cross-correlations (or, optionally, the cross-covariances) of series1 and series2. Note that the “lag” refers to the number of periods by which series2 lags series1; that is, the correlation at lag k is the correlation between series1(t) and series2(t-k).
The @CROSSCORR procedure uses the CROSS instruction to compute and graph cross-correlations. You’ll probably find it better to use that than CROSS itself.
Options
from=lowest lag number
to=highest lag number
These specify the range of lags for which CROSS will compute the cross correlations. Represent leads as negative lags. FROM defaults to −n and TO defaults to n, where n is min(T/4, 2√T).
covariances/[nocovariances]
If you use the option COVARIANCES, CROSS computes the cross-covariances
rather than the cross-correlations of the series.
qstats/[noqstats]
span=width of test intervals [all]
dfc=degrees of freedom correction for test [0]
With QSTATS, CROSS computes Q tests for absence of correlation. If you compute
both lags and leads, it will include tests for lags only, leads only and lags and
leads combined. If you use SPAN, it will do a series of Q tests, beginning with the
tests for intervals 1 to width, –width to –1 and –width to width (subject to
your choices of FROM and TO), and increasing the test span by width with each
set. Use DFC to correct the degrees of freedom of the Q statistic.
[center]/nocenter
Use NOCENTER if you want the correlations to be computed without subtracting
means from the data.
[print]/noprint
picture="picture code for output"
title="Title for output" [“Cross-correlations of xx and yy”]
Unless you use NOPRINT, CROSS displays the computed statistics. PICTURE
allows you to control the formatting of the numbers in the output table—see
"Picture Codes" on page RM–545. Use the TITLE option if you want to provide your
own title for the output. By default, this will be “Cross-correlations (covariances) of Series series1 and series2”.
organization=rows/columns
The ORGANIZATION option determines whether the output is oriented in rows
or in columns. The default is ORG=ROWS except when using the WINDOW option,
where it is ORG=COLUMNS.
window="Title of window"
If you use the WINDOW option, the output goes to a (read-only) spreadsheet window with the given title rather than being inserted into the output window or file as text. Each lag is shown on a separate line.
Missing Values
CROSS treats missing values as being equal to the mean of the valid observations.
Variables Defined
%NOBS Number of observations (integer)
%QSTAT The test statistic for the (final) Q test.
%QSIGNIF The significance level of the (final) Q test.
Examples
set dsales = sales-sales{1}
set dlead = lead-lead{1}
cross(qstats,org=column,from=-8,to=8) dlead dsales
computes cross-correlations of the first differences of SALES and LEAD. It also performs Q tests on the correlations.
Output
Cross Correlations of Series DLEAD and DSALES
Ljung-Box Q-Statistics
Lag Range Statistic Signif Lvl
1 to 8 6.482 0.593389
-8 to -1 110.167 0.000000
-8 to 8 116.651 0.000000
Technical Information
CROSS computes the Ljung-Box Q statistic for the cross-correlations of x and y over a lag range M1 to M2 as

(4) Q = T(T + 2) ∑_{M1 ≤ j ≤ M2} r̂xy(j)² / (T − j)
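Formula (4) is straightforward to reproduce. The Python sketch below is illustrative (the function name and dict input are assumptions, not a RATS interface); it computes Q from a set of estimated cross-correlations at positive lags:

```python
from math import fsum

def ljung_box_q(r, T):
    # Q = T(T+2) * sum_j r(j)^2 / (T - j)
    # r is a dict mapping (positive) lag j -> estimated correlation
    return T * (T + 2) * fsum(rj ** 2 / (T - j) for j, rj in r.items())

q = ljung_box_q({1: 0.1, 2: 0.2}, T=100)
print(round(q, 3))   # 5.194
```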
See Also . . .
CORRELATE Computes autocorrelations for a single series.
CMOM Computes correlations of a set of regressors.
%CORR(A,B) Computes the correlation of a pair of series or vectors.
%COV(A,B) Computes the covariance between a pair of series or vectors.
Parameters
cseries Complex series to transform.
start end Range of entries to process. By default, the defined range of
cseries.
newcseries Series for the result.
newstart Starting entry in newcseries for the result. By default, same
as start.
Options
interval=sampling interval [1]
The INTERVAL option specifies the width of the sampling interval.
Example
freq 5 384
...
csample(interval=3) 1 1 384 2
This sets series 2 equal to entries 1, 4, 7, ... of series 1.
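The effect of INTERVAL is just fixed-step subsampling, as this Python sketch (list slicing standing in for complex series) shows:

```python
source = list(range(1, 385))     # stands in for complex series 1, entries 1..384
sampled = source[::3]            # keeps entries 1, 4, 7, ... as with INTERVAL=3
print(sampled[:4])               # [1, 4, 7, 10]
print(len(sampled))              # 128 of the 384 entries survive
```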
Parameters
cseries The complex series to create or set with this instruction.
start end The range of entries to set. This defaults to the FREQUENCY
range, regardless of the series involved in the function.
function(T) The function of the entry number T, which gives the value for
entry T of cseries. There should be at least one blank on either
side of the = which comes between the other parameters and the
function.
Description
This sets the values of entries start to end of cseries by evaluating the function
at each of the entries, substituting for T the number of the entry being set.
To get entry T of complex series n, use %Z(T,n). There are several specialized functions which are handy for computing functions of frequency ordinates:
%UNIT(x) returns exp(ix) for the real number x
%UNIT2(t1,t2) returns exp(−i2π(t1−1)/t2) for t1 and t2 integers. t1 is usually T. With t2 equal to the number of frequencies, this maps the unit circle as T runs from 1 to t2.
%ZLAG(t,x) returns exp(−i2π(t−1)x/N) where N is the number of frequencies that you have specified on FREQUENCY. As t runs from 1 to N, this gives the appropriate Fourier transform for the lag operator L^x.
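The definitions above can be mimicked with Python's cmath to check the sign and scaling conventions (unit2 and zlag are stand-in names for %UNIT2 and %ZLAG):

```python
import cmath

def unit2(t1, t2):
    # %UNIT2: exp(-i * 2pi * (t1 - 1) / t2)
    return cmath.exp(-2j * cmath.pi * (t1 - 1) / t2)

def zlag(t, x, nfreq):
    # %ZLAG: exp(-i * 2pi * (t - 1) * x / nfreq), the transform of L^x
    return cmath.exp(-2j * cmath.pi * (t - 1) * x / nfreq)

# t1 = 1 maps to frequency zero, the point 1+0i on the unit circle
print(unit2(1, 8))               # (1+0j)
# a quarter of the way around (t1 = 3 with t2 = 8) gives -i
print(unit2(3, 8))
```

With x = 1, %ZLAG reduces to %UNIT2, since one lag is one step around the unit circle.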
Options
first=complex expression/FRML
A special value, or a FRML if the value of start affects the calculation, for entry start. Use this only if the first entry uses a different calculation from the others. (This is not common in frequency domain calculations.)
scratch/[noscratch]
Use the SCRATCH option when you are redefining a series (that is, the cseries is
also part of the function) and the transformation uses data from more than just
the current entry T. This is rare in frequency domain work.
Examples
cset 3 = %z(t,3)*%z(t,4)
cset 3 = %z(t,3)*%conjg(%z(t,4))
The first makes series 3 equal to series 3 multiplied by series 4 (without conjugation), while the second makes 3 equal to 3 multiplied by the conjugate of series 4.
cset 3 = %unit2(t,400)
cset 3 = %cexp(%cmplx(0.0,2*%pi*(t-1)/400.))
cset 3 = %cmplx(cos(2*%pi*(t-1)/400.),sin(2*%pi*(t-1)/400.))
cset 5 = c=%cmplx(1.0,0.0),%do(i,1,4,c=c+b(i)*%zlag(t,i)),c
Makes series 5 equal to the transfer function for 1 + b(1)L + b(2)L² + b(3)L³ + b(4)L⁴.
Writing it this way allows the number of lags in the polynomial to be changed quickly
by changing the size of the vector b.
cset(scratch) 3 = %z(t,3)/%z(1,3)
Replaces series 3 by itself normalized so that entry 1 is 1.0. The SCRATCH option is
needed because entry 1 will be overwritten before it gets used to compute entries
from 2 on.
Parameters
start end Range of entries to transfer. This defaults to the (possibly different) defined range of each complex series.
newstart Entry in the real series to which the start entry is copied. By
default, same as start. The default starting entry may change
from series to series if you use the default start to end range.
Description
This transfers data from the complex series listed on the first supplementary card to
the corresponding real series listed on the second supplementary card.
Options
part=[real]/imaginary/absvalue
PART specifies which part of the complex numbers you want to send to the time
domain. (ABSVALUE = Absolute Value)
Examples
ctor(interval=4) 1 720
# 1
# spectrum
copies the real part of every fourth entry of complex series 1 to entries 1 through 180
of the real series SPECTRUM.
This next example sends two series to the frequency domain, filters them by using a seasonal mask on the Fourier transforms, then sends the results back. Note that, while this produces “values” for series 1 and 2 which go beyond the actual data length, it is only the original range which is actually useful.
compute nords=2*%freqsize(2018:6)
freq 3 nords
rtoc 1960:1 2018:6
# x y
# 1 2
fft 1
fft 2
cmask 3 1 nords
cmult 1 3
cmult 2 3
ift 1
ift 2
ctor 1960:1 2018:6
# 1 2
# xadj yadj
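The same mask-and-invert pipeline can be sketched with a naive discrete Fourier transform in pure Python (a toy stand-in for FFT, CMASK, CMULT and IFT, not the RATS implementation):

```python
import cmath
import math

def dft(x):
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
            for k in range(n)]

def idft(z):
    n = len(z)
    return [sum(z[k] * cmath.exp(2j * cmath.pi * k * t / n) for k in range(n)) / n
            for t in range(n)]

# a constant level of 5 plus a period-4 "seasonal" cycle, 12 observations
x = [5 + math.cos(math.pi * t / 2) for t in range(12)]
z = dft(x)
# the mask zeroes the seasonal ordinates: with n = 12, the period-4 cycle
# sits at k = 3 and its mirror image k = 9
masked = [0 if k in (3, 9) else z[k] for k in range(12)]
filtered = idft(masked)
# the cycle is removed; only the constant level survives
print([round(v.real, 6) for v in filtered])   # twelve values of 5.0
```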
See Also...
RTOC The inverse operation of CTOR—this copies real data into complex series.
(6) E(u_t u_t′) = V
In most applications, only one of A and B is actually used. A and B are specified using FRML[RECT]’s, that is, formulas which evaluate to a rectangular matrix, and V as a FRML[SYMM].
Parameters
We recommend that you use the A and B options rather than the A Formula and B
Formula parameters.
sigma The SYMMETRIC covariance matrix of the residuals u.
AFRML A FRML[RECT] which gives the A matrix in (1). We would recommend using the newer A option instead.
BFRML A FRML[RECT] which gives the B matrix in (1). We would recommend using the newer B option instead.
Options
a=A formula producing RECTANGULAR array
b=B formula producing RECTANGULAR array
Use these to provide formulas for the A and B matrices in (1). If you use these,
don’t use the AFRML and BFRML parameters.
dmatrix=[concentrated]/identity/marginalized
dfc=degrees of freedom correction [0]
pdf=prior degrees of freedom [0]
dvector=(output)VECTOR of values for diagonal of D matrix
The first three select the specific form for the maximand. The DMATRIX option determines the treatment of the “D” matrix in (1): whether it’s concentrated out, set to the identity, or marginalized (integrated out). You can use the DVECTOR option to capture the estimated values for its diagonal. The DFC option is used when you’re maximizing the posterior density. It gives the value of c in the formulas in Section 7.5.2 of the User’s Guide. The PDF option sets d, which is used for the posterior density with D integrated out.
method=[bfgs]/simplex/genetic/annealing/ga/evaluate
iterations=iteration limit [100]
subiterations=subiteration limit [30]
cvcrit=convergence limit [.00001]
trace/[notrace]
These control the non-linear optimization. BFGS is Broyden, Fletcher, Goldfarb
and Shanno and is the only method which computes standard errors. However, it
is very sensitive to initial guess values, so you might need to use a PMETHOD with
one of the other methods first. EVALUATE simply evaluates the function given the initial values—by default, no output is displayed.
ITERATIONS sets the maximum number of iterations, SUBITERS sets the maxi-
mum number of subiterations, CVCRIT the convergence criterion. TRACE prints
the intermediate results. See the User’s Guide Chapter 4 for more details.
pmethod=bfgs/[simplex]/genetic/annealing/ga
piters=number of PMETHOD iterations to perform [none]
Use PMETHOD and PITERS if you want to use a preliminary estimation method to refine your initial parameter values before switching to one of the other estimation methods. For example, to do 20 simplex iterations before switching to bfgs, you use the options PMETHOD=SIMPLEX, PITERS=20, and METHOD=BFGS.
[print]/noprint
vcv/[novcv]
title="title for output" ["Covariance Model"]
These are the same as for other estimation instructions (see LINREG).
Examples
This estimates a model for a six-variable covariance matrix. With nine free parameters, this is overidentified by six. (There are 21 = 6 × (6 + 1)/2 distinct values in a 6 × 6 covariance matrix; with six parameters used for the variances of the v’s, that leaves fifteen to be estimated as part of the model). It is estimated using the BFGS
method. The “Sample Output” is from this example.
The factor matrix is put into AFACTOR. Note that AFRML must be declared as a
FRML[RECT] before it can be defined using the FRML instruction.
nonlin rx ur cu cr pc pr mp mc mr
dec frml[rect] afrml
frml afrml = ||1.0,0.0,ur ,0.0,0.0,0.0|$
cu ,1.0,cr ,0.0,0.0,0.0|$
0.0,0.0,1.0,rx ,0.0,0.0|$
0.0,0.0,0.0,1.0,0.0,0.0|$
0.0,mc ,mr ,0.0,1.0,mp |$
0.0,pc ,pr ,0.0,0.0,1.0||
compute ur=cu=cr=rx=mc=mr=mp=pc=pr=0.0
cvmodel(method=bfgs,factor=afactor,a=afrml) %sigma
The code below uses the V option to estimate a one factor model for the covariance
matrix RR.
dec vect lambda(n) d(n)
compute lambda=%const(0.1)
compute d=%xdiag(rr)
dec frml[symm] vf
frml vf = %outerxx(lambda)+%diag(d)
cvmodel(method=bfgs,v=vf,obs=tdim) rr
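The structure that VF builds, V = λλ′ + diag(d), can be written out directly; this Python sketch (illustrative names) mirrors %OUTERXX(LAMBDA)+%DIAG(D):

```python
def one_factor_v(lam, d):
    # V = lambda * lambda' + diag(d): outer product of the factor loadings
    # plus an idiosyncratic variance on the diagonal
    n = len(lam)
    return [[lam[i] * lam[j] + (d[i] if i == j else 0.0) for j in range(n)]
            for i in range(n)]

v = one_factor_v([0.5, 0.3], [1.0, 2.0])
print(v)   # [[1.25, 0.15], [0.15, 2.09]] up to floating-point rounding
```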
Sample Output
This is the output from the first example above. With METHOD=SIMPLEX or
METHOD=GENETIC, the only columns that will show in the bottom table will be “Variable” and “Coeff”, since neither of those methods can compute standard errors.
If the model is overidentified, it will show both the Log Likelihood of the estimated
model, and the unrestricted Log Likelihood, along with the test of the restrictions.
For a just identified model, it will show both the likelihoods, but no test. The two likelihoods should match—if they don’t, you’ve hit a local, but not global, maximum.
Notes
These models can be quite difficult to estimate successfully (see Section 7.5.2 of the
User’s Guide). There are often multiple local maxima, and there can even be multiple
global maxima in very different regions of the parameter space. Before you put a lot
of effort into writing up results, experiment with different initial guess values and
PMETHOD=GENETIC to make sure you didn’t end up at the wrong spot on the first try.
Variables Defined
Standard estimation: %NREG, %LOGL, %XX, %BETA, %STDERRS, %TSTATS
Non-linear estimation: %CVCRIT, %ITERS, %CONVERGED, %LAGRANGE
%CDSTAT The test statistic for overidentifying restrictions
%SIGNIF The significance level of that test
%FUNCVAL The log likelihood or log posterior at the final estimates
%NVAR The number of variables (INTEGER)
%NFREE The number of free parameters (including the diagonal if concentrated/marginalized) (INTEGER)
Parameters
cseries Complex series whose extreme values you want to locate.
start end Range of entries to search.
Options
part=[real]/imaginary/absvalue
The part of the complex numbers (real, imaginary, or absolute value) whose extreme values you want.
[print]/noprint
title="title for output" ["Extreme Values of the yyy Part for Series xxx"]
NOPRINT suppresses the output. Use NOPRINT if you want to use the variables
described below, but you don’t need the printed output. If you use the default of
PRINT, the TITLE option can be used to provide a more descriptive title.
Output
cxt(part=absval) 3
Wizard
The Data wizards (on the Data/Graphics menu) generate the appropriate sequence
of CALENDAR, OPEN DATA, and DATA instructions to read data from a file.
Parameters
start end Range of entries to read. If you have not set a SMPL, this defaults to the standard workspace. If you have not set that, and the length of the data set is clear from the contents of the data file, DATA uses that and sets it as the workspace length.
list of series The list of series names or numbers you want DATA to read from
the file. If you omit the list, DATA reads all of the series on the
file. You can take advantage of this with any formats except
FREE, BINARY or FORTRAN format (as they have no labels).
You can use series<<fileseriesname parameter fields to
map a series/column name on an original file to a (valid) rats
name. This can be used to handle source file names which are
too long, improperly formatted (for instance, having spaces),
conflict with rats reserved names (such as INV or T), or simply
not easily understood. If fileseriesname has spaces or punctua-
tion, enclose it in quotes (single or double). This was added with
rats 9.2.
Options
format=[free]/binary/cdf/citibase/crsp/dbf/dif/dri/dta/fame/fred/
haver/matlab/odbc/portable/prn/rats/tsd/wf1/wks/
xls/xlsx/'(fortran format)'
This describes the format of your data set. rats will not attempt to determine
the format from the file name or extension—you must use the correct FORMAT option. The various formats are described in Chapter 2 of the Introduction.
organization=columns/[rows]/multiline
This tells DATA whether the data series run down the file in columns (one series
per column) or across it in rows. The ORG option isn’t needed for most formats,
since only text files and spreadsheets allow for the different arrangements.
ORG=MULTILINE can be used to read data for a single series (only) which spans
more than one row in a spreadsheet (or similar file format). If you have more
than one series on a file blocked this way, use separate DATA instructions with
the TOP and BOTTOM options to isolate the information for each series.
unit=[data]/input/other unit
Chooses the rats I/O unit from which the data are read. With the default option,
DATA reads the data from the external data file. Use UNIT=INPUT when you want
to enter data directly into the input file.
compact=[average]/geometric/sum/maximum/minimum/first/last
select=subperiod
These options, which work with most dated file formats, set the method DATA will
use to convert data from higher frequency to lower, such as monthly to quarterly.
See Section 2.5 of the Introduction.
blank=[zero]/missing
This applies only with FORTRAN formats. If an area where DATA expected a number is blank, it treats it either as a zero or as a missing value, depending upon your choice for this option.
which forces the data to be interpreted as having the CALENDAR scheme of the
current workspace.
reverse/[noreverse]
If your data are in time-reversed order (starting with the most recent at the top
or left), use REVERSE to have it switched to the standard time series order when
read in. Note that if the file has usable dates which run in the reversed order,
DATA will automatically detect that and correct the ordering, so this is only necessary if you have a data set with reversed sequence and no usable dates.
verbose/[noverbose]
Use VERBOSE if you want DATA to list the name and comments (if any) of each
series it reads, along with other information about the file. You can use VERBOSE
to verify that DATA is reading date information on your data file properly—this is
particularly important when converting from one frequency to another.
With QUERY=INPUT, rats reads the sql commands from the lines following the
DATA instruction in the input window (or input file). With QUERY=unit, rats
reads the query from the text file associated with the specified i/o unit (opened
previously with an OPEN instruction). In either case, use a “;” symbol (which tells
rats to begin a new instruction) to signal the end of the sql string.
See OPEN for more on i/o units, and Section 3.15 of the Additional Topics for
more on using sql commands to read data.
In the date format string, use y for positions providing year information, m for
positions providing month information, and d for positions with day information.
Include the delimiters (if any) used on the file. For example:
dateformat="mm/dd/yyyy"
dateformat="yyyymmdd"
Examples
open data states.xls
data(org=col,format=xls) 1 50
reads all series from the Excel file STATES.XLS. The file is organized by column.
cal(m) 1955:1
open data brdfix.dat
data(org=row) 1955:1 1979:12 ip
reads data for the series IP from the file BRDFIX.DAT. This is a free-format file, organized by row. Note that, unlike the “labeled” formats, such as XLS and RATS, we are free to give the series any name we wish.
cal(a) 1920:1
all 1945:1
open data klein.prn
data(org=col,format=prn) 1920:1 1941:1 consumption $
profit privwage invest klagged production govtwage govtexp taxes
reads data for nine series in labelled ascii (prn) format over the range 1920:1 to
1941:1.
data(format=rats,compact=max) / spdji
set djmax = spdji
data(format=rats,compact=min) / spdji
set djmin = spdji
data(format=rats,compact=last) / spdji
set djlast = spdji
reads data which are monthly on the data file into an annual CALENDAR, taking the
maximum in each year (to create DJMAX), the minimum (to create DJMIN) and the
final (to create DJLAST).
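Each COMPACT choice reduces the twelve monthly values in a year to a single number. A minimal Python sketch of the monthly-to-annual case (plain lists; no RATS file handling or calendar logic):

```python
def compact(monthly, how):
    # reduce each calendar year (12 consecutive entries) to a single value
    years = [monthly[i:i + 12] for i in range(0, len(monthly), 12)]
    reducers = {"average": lambda y: sum(y) / len(y),
                "sum": sum, "maximum": max, "minimum": min,
                "first": lambda y: y[0], "last": lambda y: y[-1]}
    return [reducers[how](y) for y in years]

data = list(range(1, 25))           # two "years" of monthly values 1..24
print(compact(data, "maximum"))     # [12, 24]
print(compact(data, "minimum"))     # [1, 13]
print(compact(data, "average"))     # [6.5, 18.5]
```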
dbox(action=define)
dbox( options ) output value
dbox(action=run)
Parameters
output value For most dialog box elements, output value should be an integer variable, which will contain the current value of that item.
The output value variables can be used as the arguments on
VERIFY options, to check that the user has supplied valid input,
and on DISABLE options, to enable or disable elements of the
dialog based on values currently selected or input by the user.
When the user clicks “OK” to close the dialog box, output value
will contain the final value of the dialog box element.
A variable must be declared before it can be used as an output
value on the DBOX instruction.
Note that, because these get set during the course of running
the dialog, their values can change even if the user eventually
cancels the dialog. Use the STATUS option to determine if the
user ok’ed the dialog.
Item Types
There are several types of items you can include within a DBOX. Some of these are
shown below (see "Item Type Options" for the full list). To add an item to a dialog,
use the corresponding option name together with ACTION=MODIFY (the default choice
for ACTION). The samples below generally include only one type of item (although
some include an additional Static Text item with descriptive text), but note that a
single dialog box can include many items. Below each example, we show the code
used to generate the dialog box.
Static text A simple text label which can’t be edited.
dbox(action=define,title="Sample Static Text")
dbox(stext="Static text box")
dbox(action=run)
Edit text A text box in which the user types information. You can request
that this be interpreted as a number or passed back intact as a
string. You do that by setting the type of the output value.
declare string result
dbox(action=define,title="Sample Edit Text")
dbox(stext="Edit text box")
dbox(edittext,focus) result
dbox(action=run)
Check box A box which the user checks or unchecks by clicking; the output value is an integer.
declare integer on
dbox(action=define,title="Sample Check Box")
dbox(checkbox="A check box") on
dbox(action=run)
Popup box A list of choices which has much the same function as the radio
buttons, but with only the selected choice showing when the box
is at rest. If you click on it, you have access to the other choices.
declare integer seriesnum
dbox(action=define,title="Sample Popup Box")
dbox(stext="A Popup box")
dbox(popupbox,series) seriesnum
dbox(action=run)
Combo box A combination of a text box with a popup box. The user can
either type information directly into the text box, or activate the
popup and choose one of the choices available there.
General Options
action=define/[modify]/run
As mentioned above, you start with ACTION=DEFINE and end with ACTION=RUN.
Since ACTION=MODIFY is the default, you don’t need an ACTION option when
defining the items in the dialog box.
focus/[nofocus]
Use FOCUS to indicate that the item is to be the one which receives the initial
focus.
initialize/[noinitialize]
For an edit text item, this indicates whether the value already in the output
value should be used to initialize the text field. If not, it starts out blank.
optional/[nooptional]
This applies to the EDITTEXT and COMBOBOX items. If OPTIONAL, the field can be
left blank. If it is blank, the output value will be %NA for a real, 0 for an inte-
ger, and an empty string for a label or string variable.
series
strings=VECTOR of STRINGS to list
list=VECTOR of INTEGERS to list
regressors
When creating a popup, combo, or scroll box, use one of these four options to set
the list to be displayed:
• Use SERIES to request a selection from the list of series currently in memory. The
output values will be the selected series numbers. Note that the special series
CONSTANT is never included in the list.
• Use STRINGS to request a selection from a list of labels.
• Use the LIST option to request a selection from an arbitrary list of integers.
• Use the REGRESSORS option to request a selection of regressors from the last com-
pleted regression. The return values are the selected positions in the list.
edittext
output value can be a scalar variable of any type. If you use the INITIALIZE
option, the text field will be initialized with the value in that variable at the time
the DBOX is executed. You can use the WIDTH option to restrict the width of the
text box. For instance, if you’re expecting an integer value no larger than 99, you
can include WIDTH=2.
spinbutton
lowerlimit=lower limit on values
upperlimit=upper limit on values
date/[nodate]
SPINBUTTON must follow immediately after an EDITTEXT item. It does not have
its own output value. Instead, it adds a spin button (or up-down control) to the
preceding EDITTEXT, which allows the user to make minor adjustments to the
text box without retyping. You can provide LOWERLIMIT and/or UPPERLIMIT to
prevent the user from moving the value outside of that range. If you use DATE,
the information will be displayed in date format, though the output value will
still be an integer.
popupbox
scrollbox
combobox
The set of strings for any of these comes from one of the SERIES, STRINGS, LIST
and REGRESSORS options described above.
For a COMBOBOX, output value can be a scalar variable of any type, which is
filled with the value obtained by processing the text field.
For POPUPBOX and SCROLLBOX, the values returned depend upon which option
was used to fill the box:
SERIES the series number
STRINGS the index in the array (1,...,n)
REGRESSORS the index into the list of regressors (1,...,number of regressors)
LIST the values of the integers
For POPUPBOX, the DEFAULT option can be used to set which of the choices will be
selected initially. By default, it will be the first. For POPUPBOX, output value is
an INTEGER.
matrix
picture=picture code for data [none]
vlabels=VECTOR of STRINGS for row labels
hlabels=VECTOR of STRINGS for column labels
If you use MATRIX, the output value is a real-valued matrix which must al-
ready be dimensioned. A MATRIX item can be used for one of two things: it can be
used to request input of numerical information into a matrix (which is the default
behavior), or it can be used to request that the user select rows or columns based
upon the displayed values (described below with the MSELECT option).
A picture code is most helpful when you’re using this to display data for user
selection, rather than having the user enter or edit it. By default, DBOX will use
a representation with seven digits right of the decimal. This not only may show
many more digits than are statistically reliable, but also will take up more room
per number than a smaller format, so fewer values will be visible at one time. For
example, the option PICTURE="*.###" could be used to limit the display to three
digits right of the decimal. See "Picture Codes" on page RM–545 for more information.
You can use VLABELS and HLABELS to supply your own labels for the rows and
columns in the matrix box. By default, columns and rows are labeled with integer
row or column numbers.
mselect=rows/one/byrow/bycol
select=VECTOR[INTEGER] of selections
If you use the MSELECT option (with MATRIX), DBOX will display the data, and
the user can select items from the matrix. When the window is closed, the
VECTOR[INTEGER] that you provide with the option SELECT will be filled with
information regarding the selection, as described below.
You use MSELECT to control how selections are made. With MSELECT=ROWS, the user can select one or more rows. The array provided using the SELECT option
will have dimension equal to the number of selections made, and will list the
rows selected using numbers 1,...,n. If MSELECT=ONE, it will have dimension 2
with array(1)=row and array(2)=column. If MSELECT=BYROW, the user selects
one cell per row. The SELECT array will have dimension equal to the number of
rows, with array(i) set equal to the column selected in row i. MSELECT=BYCOL
is similar, but one cell per column is selected.
Wizard
The Statistics—Limited/Discrete Dependent Variables operation provides dialog-
driven access to most of the features of the DDV instruction.
Parameters
depvar Dependent variable. DDV requires numeric coding for this. The coding required will depend upon the model type. See the description under the TYPE option.
start end Estimation range. If you have not set a SMPL, this defaults to
the maximum common range of all the variables involved.
Options
type=[binary]/ordered/multinomial/conditional/count
Model type. BINARY is a binary choice model. ORDERED is an ordered choice
model, MULTINOMIAL and CONDITIONAL are for multiple choice logits, and COUNT
is for a Poisson count model. The main difference between MULTINOMIAL and
CONDITIONAL is that the former is designed for fixed individual characteristics,
with different coefficients for the choices, and the latter is for analyzing differing
attributes across choices, with a fixed set of coefficients. See Section 12.2 of the
User’s Guide for more information.
For BINARY, one choice is represented (in depvar) by zero, and the other by any
non-zero value or values. For ORDERED, each choice must have a distinct value,
ordered systematically, but they don’t have to have any specific set of values. For
MULTINOMIAL each choice again must have a distinct value. COUNT data are just
non-negative integer values.
distribution=[probit]/logit/extremevalue
The distribution of the underlying “index” model. Probit is the standard normal.
This matters only for BINARY and ORDERED; MULTINOMIAL and CONDITIONAL
are always done as logit, and COUNT uses a Poisson model.
[print]/noprint
vcv/[novcv]
smpl=SMPL series or formula ("SMPL Option" on page RM–546)
unravel/[nounravel] (Section 5.11)
title="title for output" [depends upon options]
equation=equation to estimate
These are similar to the LINREG options.
All models except the conditional logit are estimated using Newton-Raphson,
that is, they use analytical second derivatives. Conditional logit uses bhhh. See
the User’s Guide Chapter 4 for details. ITERATIONS sets the maximum number
of iterations, SUBITERS sets the maximum number of subiterations, CVCRIT the
convergence criterion. TRACE prints the intermediate results. INITIAL supplies
initial estimates for the coefficients. The default values are usually sufficient.
robusterrors/[norobusterrors]
lags=correlated lags [0]
lwindow=neweywest/bartlett/damped/parzen/quadratic/[flat]/
panel/white
damp=value of g for lwindow=damped [0.0]
lwform=VECTOR with the window form [not used]
cluster=SERIES with category values for clustered calculation
These permit calculation of a consistent covariance matrix. Note, however, that, with the exception of TYPE=COUNT, these models are unlikely to produce consistent estimates of the coefficients themselves if the distributional assumptions are incorrect.
Description
The three distributions which you can select for the binary and ordered models are the standard normal (probit), logistic (logit) and extreme value. The distribution function for the logistic is F(x) = exp(x) / (1 + exp(x)). For the binary model, the log likelihood maximized is

(9) L = ∑_{Yi≠0} log F(Xib) + ∑_{Yi=0} log(1 − F(Xib))
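For the logit case, the binary log likelihood can be checked numerically; in this Python sketch, F is the logistic distribution function and the index values X_i b are supplied directly (the helper names are illustrative):

```python
from math import exp, log

def logistic_cdf(z):
    # F(x) = exp(x) / (1 + exp(x)), written in an equivalent stabler form
    return 1.0 / (1.0 + exp(-z))

def binary_loglik(y, index, cdf=logistic_cdf):
    # sum of log F(Xb) over y != 0, plus log(1 - F(Xb)) over y == 0
    return sum(log(cdf(z)) if yi != 0 else log(1.0 - cdf(z))
               for yi, z in zip(y, index))

# with a zero index, F = 0.5 everywhere: each observation contributes log(0.5)
print(binary_loglik([1, 0, 1, 0], [0.0] * 4))   # 4*log(0.5), about -2.7726
```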
As mentioned above, this is done using the Newton-Raphson algorithm, as the first
and second derivatives are easy to compute.
TYPE=MULTINOMIAL will always use the logit model. This estimates a separate set of
coefficients for the m–1 choices with the highest values; the model is normalized on
the choice with the lowest numerical value.
TYPE=CONDITIONAL will also use the logit model. As indicated above, you have to
structure your data set with one observation for each combination of individual and
choice. That is, if you have 400 individuals and 4 choices for each, your data set will
have 1600 observations. A series identifying the individuals is required, and one
identifying the choices is recommended: use the INDIV and MODES options to tell DDV
which series they are. The depvar series shows whether an observation represents
the choice made by the individual. Note that if a choice is unavailable to an individual, you don’t need to include an observation for it in the data set.
DDV calculates the likelihood elements individual by individual. If none of the obser-
vations are flagged (in depvar) as chosen, the individual will be skipped. You can
use this to estimate a model over a restricted set of choices: if you use a SMPL which
knocks out any observation for a particular mode, it will remove from consideration
any individual who chose that mode.
TYPE=COUNT is the one model within this instruction which doesn’t model “choice.”
Instead, it’s used for models of “count” data, where the dependent variable is required
to be a non-negative integer, usually a small one. DDV estimates a Poisson regression
model (see, for instance, Wooldridge(2010), section 18.2), estimating the log of the
mean of the Poisson distribution as a linear function of the regressors. The likelihood
for an individual is given by
(12)  exp(−exp(X_i b)) × exp(y_i × X_i b)
(This ignores the factorial term, which doesn’t interact with the coefficients). This
is the only model type here where the estimates have some robustness to deviations
from the distribution chosen, so using ROBUSTERRORS to correct the covariance ma-
trix is reasonable.
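Ignoring the factorial term, each individual in (12) contributes −exp(X_i b) + y_i (X_i b) to the log likelihood. A minimal Python sketch of that sum (illustrative only, not RATS code):

```python
import math

def poisson_loglik(y, xb):
    """Poisson regression log likelihood (factorial term omitted):
    each observation contributes -exp(xb_i) + y_i * xb_i,
    where xb_i = X_i . b is the linear index for the log mean."""
    return sum(-math.exp(v) + yi * v for yi, v in zip(y, xb))
```

For y = 0 and a zero index the contribution is −exp(0) = −1, which matches the formula term by term.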
Examples
ddv(smpl=(kids==0)) lfp
# constant wa agesq faminc we
estimates a binary choice probit for the observations where KIDS is zero.
ddv(dist=logit) union
# potexp exp2 grade married high constant
prj(dist=logit,cdf=fitlgt)
estimates a binary choice logit, and uses PRJ to get the predicted probabilities from
the model.
ddv(type=multinomial,smpl=y87) status
# educ exper expersq black constant
estimates a multinomial logit.
Output
TYPE=COUNT is similar to a linear regression, and has similar output. The rest all
model probabilities. The following statistics are included in the output (N is the num-
ber of individuals, yi is the observed value):
Log Likelihood      log L ≡ ∑_{i=1}^{N} log P(Y_i = y_i | X_i)

Average Likelihood  exp( (1/N) ∑_{i=1}^{N} log P(Y_i = y_i | X_i) )

Base Likelihood     log L_c ≡ ∑_{j=1}^{m} N p̂_j log(p̂_j)

Pseudo-R²           1 − (log L / log L_c)^(−(2/N) log L_c)
The average likelihood is the geometric mean of the likelihood elements. The base
likelihood is the log likelihood of a model which predicts the observed probability
of each choice; in most cases, this is the maximum of the likelihood without
any slope coefficients. The Pseudo-R² is the measure of fit from Estrella (1998).
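Given the fitted log probabilities and the observed count of each choice, the statistics above can be reproduced directly. The following is a Python illustration (not RATS code; the function name and argument layout are our own):

```python
import math

def ddv_fit_stats(logprob, counts):
    """logprob: list of log P(Y_i = y_i | X_i), one entry per individual.
    counts:  observed count of each of the m choices (sums to N).
    Returns (log L, average likelihood, base log L, Estrella pseudo-R^2)."""
    N = len(logprob)
    logL = sum(logprob)
    avg = math.exp(logL / N)                       # geometric mean likelihood
    # base model predicts the observed frequencies p_j = n_j / N
    logLc = sum(n * math.log(n / N) for n in counts if n > 0)
    pseudo_r2 = 1.0 - (logL / logLc) ** (-(2.0 / N) * logLc)
    return logL, avg, logLc, pseudo_r2
```

With an even 50/50 split and fitted probabilities of exactly 0.5, the model matches the base model and the pseudo-R² is zero, as expected.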
Hypothesis Testing
You can apply the hypothesis testing instructions (EXCLUDE, TEST, RESTRICT
and MRESTRICT) to estimates from DDV except that you can’t use EXCLUDE with
TYPE=MULTINOMIAL. They compute the “Wald” test based upon the quadratic
approximation to the likelihood function. rats uses the second set of formulas
on page UG–74 of the User’s Guide to compute the statistics. Note that you cannot use
the CREATE or REPLACE options on RESTRICT and MRESTRICT. You can also do
likelihood ratio tests “by hand”: run restricted and unrestricted models and use
COMPUTE and CDF to compute the statistic.
debug( options )
Execute the DEBUG(SYMBOL) instruction first, and then execute the commands
that define your procedure or function. For procedures stored on separate files,
you can use the SOURCE instruction to read them in.
rats will display a list of any global variables found in the procedure or function.
You can use LOCAL instructions in the procedure or function to define these as
local variables instead.
table/[notable]
Displays a list of all variables, procedures, and functions currently in use. This
includes both user-defined variables, and reserved variable and function names.
Parameters
datatype The data type you want to assign to the listed variables. For
instance, REAL, RECTANGULAR, or VECTOR[SERIES].
list of names The list of variables that will have datatype. Separate the
names with blanks. Symbolic names must be unique—if you at-
tempt to assign a new type to a name already in use (by you or
by rats) you will get an error message.
You can also set the dimensions of an array at the time you
DECLARE it, by using a dimension field instead of simply the
variable name—just provide the dimensions for the array in
parentheses immediately after the variable name. Arrays can
also be dimensioned later (or redimensioned) using DIMENSION.
Description
In most situations, you do not need to declare a variable name before using it. rats
will determine what type of variable is needed from the context. However:
• You will probably have to DECLARE any array whose values will be set in a
COMPUTE instruction, to be sure that you get the array type that you want.
• Any “non-standard” aggregate type, such as a VECTOR[SYMMETRIC] or
zFRML[COMPLEX] will need a DECLARE.
• Any variable or array which you initialize with INPUT or ENTER needs DECLARE,
since those instructions can handle a number of variable types.
You can, however, often dispense with a separate DECLARE instruction by putting the
type specification directly into the instruction. For instance,
compute [symmetric] s=tr(a)*a
can be used instead of
declare symmetric s
compute s=tr(a)*a
Variables introduced with DECLARE (as well as any that are created from context by
a rats instruction) have global scope. They can be used anywhere later on in the
program, and in procedures. You cannot change the type of such a variable later in
the program.
Examples
declare real a b c d e
declare integer count
declare vector[integer] counters(10)
declare symm s(nvar,nvar)
The statements above declare the following variables:
• A through E as scalar real variables.
• COUNT as a scalar integer variable.
• COUNTERS as a vector of integers. COUNTERS is also dimensioned to have 10 ele-
ments (referenced as element 1 through element 10).
• S as a symmetric array, with dimensions NVAR by NVAR (where NVAR has been set
previously to some value)
Wizards
The Open and New operations on the File menu are an alternative way to open or
create a rats format data file. There are then other operations which allow you to
view, edit, and create data series, copy data from one file to another, export data, and
more. This is now the preferred way to handle RATS format files. See Section 2.7 in
the Introduction.
Parameters
file name Name of the data file you want to edit or create (can include
drive and path information). If you simply type DEDIT, rats
will prompt you for the file name.
Option
new/[nonew]
Use the option NEW when you are creating a new file. If you forget to put NEW
when creating a file, rats will prompt you if it cannot find the file requested.
Notes
You can only edit one rats format file at a time using DEDIT. If you have a rats file
open for editing, and then issue another DEDIT, rats will ask you if you want to save
the changes to the old file before opening the new file. You can, however, open other
rats format files for input (using OPEN DATA) while you are editing a rats file. You
can use the instruction ENVIRONMENT RATSDATA=file name in place of DEDIT for
existing files.
DEDIT does not allow you to read data from the opened file. Use OPEN DATA (or
ENVIRONMENT RATSDATA=) and DATA instructions (or the Data Wizard (RATS For-
mat) operation) to read data from rats format files.
You can also use the command COPY(FORMAT=RATS) to write data to a rats format
file. With COPY, all the data must be written to the file with a single instruction; if a
file with the same name already exists, it will be overwritten. With DEDIT, you can
use several STORE instructions to write series to a file. If you do create a rats file
using COPY, you can always go back and use DEDIT and related instructions to add,
remove, display, or edit series on the file.
Examples
dedit(new) results.rat
store testone testtwo control
save
creates a new rats format file called RESULTS.RAT, includes the series TESTONE,
TESTTWO and CONTROL on it, then saves the file.
When transferring files from one platform to another (by ftp or e-mail, for example),
to avoid corrupting the file be sure to use a “binary” transfer, rather than a “text”
transfer.
See Also . . .
Intro. Section 2.7 rats format data files
QUIT Closes rats file being edited without saving any changes.
SAVE Saves the changes made to a rats format file.
Wizard
The Statistics—(Kernel) Density Estimation operation provides access to the major
features of DENSITY.
Parameters
series The series for which you want to compute the density function.
start end Range to use in computing the density. If you have not set a
SMPL, this defaults to the defined range of series.
grid Series of points at which the density is estimated.
density Series for the estimated density corresponding to grid.
If DENSITY creates the grid series (GRID=AUTOMATIC option), the grid and density
series will be defined from entry 1 through the number of points in the grid.
Options
type=[epanechnikov]/triangular/gaussian/logistic/flat/
parzen/histogram
counts/[nocounts]
See “Description” on page RM–100.
bandwidth=kernel bandwidth
BANDWIDTH specifies the bandwidth for the kernel. The default value is

0.79 × N^(−1/5) × IQR

where IQR is the interquartile range (75%ile–25%ile) of the series and N is the
number of data points.
grid=[automatic]/input
If AUTOMATIC, the grid series runs in equal steps from the 1%-ile to the 99%-
ile of the input series. If INPUT, you fill in the grid series with whatever values
you want prior to using the DENSITY instruction. Note that the grid series does
not have to be in increasing order. If the density is being used as part of a more
involved calculation (see Section 13.1 of the User’s Guide), the grid series is usu-
ally a series created from the data.
print/[noprint]
PRINT produces a table of grid values and the estimated density at each point.
Description
For types other than HISTOGRAM, DENSITY estimates the density function for a series
of data x_1, …, x_T by computing at each point u in the grid:

f̂(u) = ∑_{t=1}^{T} w_t K( (u − x_t) / h ) / ( h ∑_{t=1}^{T} w_t )

where K is the kernel function, h the bandwidth and w_t are the weights, which, by
default, are 1 for all t. The kernel types take the following forms:

GAUSSIAN   K(v) = (1/√(2π)) exp(−v²/2)

LOGISTIC   K(v) = e^v / (1 + e^v)²
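The weighted estimator above can be sketched in a few lines of plain Python, here with the Gaussian kernel. This is an illustration of the formula, not RATS code; the function name is ours:

```python
import math

def kde(u, x, h, w=None):
    """f_hat(u) = sum_t w_t K((u - x_t)/h) / (h * sum_t w_t),
    with the Gaussian kernel K(v) = exp(-v^2/2)/sqrt(2*pi).
    Weights default to 1 for every observation."""
    if w is None:
        w = [1.0] * len(x)
    K = lambda v: math.exp(-v * v / 2.0) / math.sqrt(2.0 * math.pi)
    num = sum(wt * K((u - xt) / h) for wt, xt in zip(w, x))
    return num / (h * sum(w))
```

With a single data point at zero and h = 1, the estimate at u = 0 is just K(0) = 1/√(2π), and the estimate is symmetric around the data point.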
Examples
density gdpgrowth / sgrid sdensity
density(type=histogram) gdpgrowth / hgrid hdensity
scatter(style=bargraph,overlay=line,ovsame,header="GDP Growth") 2
# hgrid hdensity
# sgrid sdensity
computes an Epanechnikov kernel density estimate and a histogram, and graphs
these along with the smoothed density (shown as a line overlaying the histogram).
data(unit=input) 1 12 gridpts
5 15 25 35 45 55 65 85 105 135 165 235
density(type=histogram,grid=input,print) income / gridpts idensity
computes a histogram for an income series using an input set of interval midpoints
which give wider intervals for the higher incomes.
Variables Defined
%EBW The computed bandwidth
Wizard
The Time Series—VAR (Setup/Estimate) wizard provides an easy, dialog-driven
interface for defining and estimating var models.
Parameters
list of ... Lists the variables that you want to include in the equations.
Use regression format (see page Int–79 of the Introduction).
Examples
system(model=deumodel)
variables deuip deum1 deutbill deucpi
determ constant seasonal{-10 to 0}
lags 1 2 3 4 6 9 12 13
end(system)
This adds a constant and eleven seasonal dummies to the equations of a four variable
VAR. See SEASONAL for details on using leads of a single seasonal dummy variable.
system(model=unrestricted)
variables gdpq unemprate gpdi
lags 1 to 4
det constant gdpdefl{1 to 4} m2{1 to 4}
end(system)
This example includes the lags of GDPDEFL and M2 as exogenous variables in the
unrestricted model, allowing us to test the hypothesis that they do not enter the
equations for GDPQ, UNEMPRATE, and GPDI.
Notes
The ECT instruction is available as a subcommand of SYSTEM to define error correc-
tion terms. In some older programs, you might find these put into the system using
DETERMINISTIC.
See Also . . .
User’s Guide, Ch. 7 Vector Autoregressions.
SYSTEM First instruction for setting up a VAR.
Wizard
The Data/Graphics—Differencing wizard provides an easy, dialog-driven interface to
the differencing capabilities of the DIFFERENCE instruction.
Parameters
series Series to difference.
start end Range of entries over which to difference the series. start
must allow for the lagged values needed. If you have not set a
SMPL, these default to the maximum allowed by series. If you
have set a SMPL, you probably will need to reset it before using
the differenced series.
newseries Resulting series. By default, newseries=series.
newstart Starting entry for newseries. By default, newstart=start.
Options
differences=number of regular differences [see “Description”, page RM–104]
sdifferences=number of seasonal differences [0]
span=span for seasonal differences [CALENDAR seasonal]
Use these to specify the differencing operators to apply. The default, if you use
neither option, is a single regular difference (DIFFERENCES=1). If you use the
SDIFFERENCES option, DIFFERENCES defaults to zero. So, if you want to use a
combination of regular and seasonal, you must use both options.
standardize/[nostandardize]
center/[nocenter]
The CENTER option centers the series by subtracting its mean from each observa-
tion (Yt = X t − X ). STANDARDIZE subtracts off the mean and divides by the stan-
dard deviation (Yt = ( X t − X ) s ) to create a series with mean 0 and variance 1. If
you use one of these with the DIFFERENCES or SDIFFERENCES options, it will be
applied to the results of the differencing operation. If you use one of these with
the FRACTION option, the centering or standardization will be done first.
Description
If d=regular differences, e=seasonal differences, s=span, and L
represents the lag (or backshift) operator, DIFFERENCE computes

Y_t = (1 − L)^d (1 − L^s)^e X_t
For frequencies defined in terms of the number of periods per year, SPAN defaults to
the CALENDAR seasonal (for example, SPAN=12 for monthly data). For frequencies
like weekly and daily, where there is no clear definition of a seasonal span, SPAN
defaults to 1.
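The operator (1 − L)^d (1 − L^s)^e amounts to repeated first differencing at lag 1 and at lag s. A short Python sketch of the computation (illustrative only, not RATS code):

```python
def difference(x, d=1, e=0, span=1):
    """Apply (1 - L)^d (1 - L^span)^e to the list x, mirroring the
    DIFFS, SDIFFS and SPAN options. Each pass shortens the series
    by the lag used, as the first entries have no lagged values."""
    def lag_diff(y, k):
        return [y[t] - y[t - k] for t in range(k, len(y))]
    for _ in range(d):
        x = lag_diff(x, 1)
    for _ in range(e):
        x = lag_diff(x, span)
    return x
```

For example, one regular plus one seasonal difference of a series loses 1 + span leading entries, which matches the starting dates in the M1NSA example below.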
Examples
data 1959:1 2009:12 m1nsa
set m1nsa = log(m1nsa)
diff m1nsa / m11diff
diff(sdiffs=1) m1nsa / m11sdiff
diff(diffs=1,sdiffs=1) m1nsa / m1both
This computes the first difference, the first seasonal difference and a combination of
one difference and one seasonal difference for the log of M1NSA. M11DIFF is defined
from 1959:2, M11SDIFF from 1960:1 and M1BOTH from 1960:2.
diff(fraction=d,pad=uupresample) uu / uufilter
does fractional differencing on series UU producing UUFILTER. The out-of-sample
values are treated as having the value UUPRESAMPLE.
Notes
The instruction BOXJENK includes the DIFFS, SDIFFS and SPAN options which al-
low you to use the undifferenced form of the dependent variable in the estimation
procedure. BOXJENK handles the differencing itself prior to estimating the arma
parameters, then rewrites the estimated equations in terms of the original dependent
variable.
If you estimate a regression on a differenced series (whether with BOXJENK or
LINREG or any other instruction), and want the equivalent equation using the origi-
nal series, use the instructions MODIFY and VREPLACE.
For instance, the following
diff m1nsa / m11diff
linreg(define=diffeq) m11diff
# constant m11diff{1 to 12}
modify diffeq
vreplace m11diff with m1nsa diff 1
estimates a twelve lag autoregression on the first difference of M1NSA, then recasts
the equation DIFFEQ in terms of M1NSA itself by substituting out M11DIFF.
The instruction FILTER can handle more general types of linear filtering.
Parameters
array(dims) The list of arrays you want to dimension, with the dimensions
for each array listed in parentheses. You can dimension any
number of arrays with a single instruction and the arrays di-
mensioned do not have to be of the same type.
Dimension fields
A dimension field has one of the forms:
name(dimension) for one-dimensional arrays (VECTOR arrays).
name(dim1,dim2) for the two-dimensional arrays: RECTANGULAR, SYMMETRIC,
and PACKED. This sets up a dim1 × dim2 matrix. (dim2 is
ignored for SYMMETRIC and PACKED, since such matrices must be
square).
The dimension expressions can be any integer-valued expression, not just constants.
Examples
compute n=6
declare rect s1 s2
declare vector v1
dimension v1(10) s1(n,3) s2(n^2+n,n+17)
This declares and dimensions the arrays V1, S1, and S2. V1 is a 10-element VECTOR,
S1 is a 6 by 3 RECTANGULAR and S2 is a 42 by 23 RECTANGULAR.
declare vector[rectangular] vv(10)
do i=1,10
dim vv(i)(5,5)
end do i
When you have an array of arrays, as in this case, you must dimension each of the
component arrays separately.
Notes
You can DIMENSION an array several times during a program, but re-dimensioning
an existing array erases its previous contents. If you have an array which (possibly)
needs to be extended without losing the original information, you might be better off
using a LIST rather than a VECTOR.
Parameters
variables These can be variables or expressions to be evaluated. You can
also display EQUATIONs, MODELs or PARMSETs, and SERIES of
(almost) any type, though PRINT is generally much better for
working with series.
strings Strings of characters enclosed in single or double quotes. Any
trailing blanks on a string are discarded—use a position code if
you want more than one space to separate two fields.
picture codes A picture code takes a form such as ##### for integers and
#####.## or *.## for reals (details on the next page). It
provides the representation for the next variable printed.
position/tab By default, each of the fields (variables or strings) is
codes followed by exactly one space. Position and tab codes provide
control over the positioning of fields, and are particularly useful
in formatting tables of output. See the next page for details.
Options
unit=[output]/copy/Other Unit
Use the UNIT option if you want the results to go somewhere other than the cur-
rent output.
delimited=[none]/tab/comma/semicolon
By default (with the NONE choice), output in a DISPLAY instruction is separated
by blank spaces. This option allows you to generate tab, comma, or semicolon-de-
limited output instead. This works when outputting to the screen, but is primar-
ily useful with the UNIT option to output to a text file.
store=string variable
This saves the created line in string variable; it does not display it. You can use
this variable, for instance, as a header for a graph, though you can often do the
same operation with a COMPUTE instruction.
hold/[nohold]
If you use HOLD, DISPLAY creates the string and waits for another DISPLAY
instruction to add more to it. This can be useful when the number of objects to
display is not known in advance, but REPORT is generally the better choice.
Picture Codes
A picture code is a string of characters that provide a template for formatting the
next variable (or variables) displayed. You can use the following characters to con-
struct a picture code:
#, *, and .
To display an integer value (or a real value rounded to the nearest integer) with a
fixed number of character spaces, use a form like “####”. This tells rats to display
the number right-justified in a field whose width is the number of # signs.
For real-valued output, use a form like “###.##”. The number of character spaces
used equals the total number of characters in the string, with the decimal at the
indicated location. Use * to the left of the decimal point (“*.##”) if you want rats to
use only as many places as needed to show the number and no more.
For instance, ###.## would display the value 1.23 with two blanks padding the field
on the left, while *.## will display it using just the four characters.
Using * by itself requests a standard format which generally shows five digits right
of the decimal.
You can get a picture code from a string variable or expression using the notation
&expression, where the expression should evaluate to a string with one of the
forms shown above.
Two functions which can prove quite useful in formatting tables are %BESTREP(X,W)
and %MINIMALREP(X,W), where X is a real array and W is the field width. %BESTREP
returns a picture code which best represents the data in X in a field of width W.
%MINIMALREP returns the smallest picture code which represents the data in X ac-
curately using at most W positions. The difference between the two is that %BESTREP
will always use up the whole width, usually by adding digits right of the decimal
place, while %MINIMALREP will give a shorter picture if it can.
A picture code stays in effect in a particular DISPLAY instruction until overridden by
another picture code. To “cancel” a picture code, just use a field with * by itself.
Position Codes
By default, each field is followed by one blank space. Position codes allow you to alter
the placement of the field immediately following it. They take one of the forms:
@n display next field at position n
@+n display next field n positions to the right
@-n display next field n positions to the left
The last two move from the blank space after the previous field. For instance, to run
two fields together, use a position code of @-1. Position codes can use expressions, for
instance, @13*I will place the next field at position 13*I.
Tab Codes
A tab field takes the form @@[flag]spacing. The flags are
. decimal
> right
< left (the default if no flag is included)
^ center
The spacing starts relative to the current position. Subsequent fields will be posi-
tioned with the “flagged” feature placed spacing positions apart. The instruction
below displays the elements of XVECT with the first decimal place in position 12, the
second at 24, etc; each number displayed with two digits right of the decimal point:
display @@.12 *.## xvect
Using ?
You can abbreviate the command to simply ? if you aren’t using any options.
Examples
display "Test statistic is" ((rssr-rssu)/q)/(rssu/%ndf)
display "Degrees of freedom are" q "and" %ndf
which will produce something like
Test statistic is 3.92092
Degrees of freedom are 13 and 123
If you want to make this look a little cleaner (when it’s displayed), you could do some-
thing like
display "F(" q "," %ndf ") =" ((rssr-rssu)/q)/(rssu/%ndf)
which will give you
F( 13 , 123 ) = 3.92092
A further improvement (which you might do if you’re writing this as part of a proce-
dure which others will use), is to use a picture code to squeeze out the extra spaces
and drop some of the excess digits on the result. This:
display "F("+q+","+%ndf +") =" *.### ((rssr-rssu)/q)/(rssu/%ndf)
will produce
F(13,123) = 3.921
Note that it’s often easier to use REPORT rather than complicated DISPLAY instruc-
tions.
@johmle(lags=6,det=rc,cv=cvector)
# ftbs3 ftb12 fcm7
equation(coeffs=cvector) ecteq *
# ftbs3 ftb12 fcm7 constant
?ecteq
estimates a cointegrating vector using @JOHMLE, creates an EQUATION based upon it,
and displays that.
See Also . . .
REPORT Organizes output into a table format.
WRITE Displays arrays and other variables. More general, but less
flexible, than DISPLAY.
MESSAGEBOX Displays information messages requiring user response.
INFOBOX Displays messages, progress bars (no user response required).
Parameters
start end Range to use in estimation.
state vectors A SERIES of VECTORS into which the output state vectors are
placed. If you call this STATES, for instance, STATES(2014:1)
is the complete estimated state vector for 2014:1. To get compo-
nent k of this, use STATES(2014:1)(k).
state variances A SERIES of SYMMETRIC arrays into which the variance matri-
ces of the states are saved.
Options
Most of the options define information needed in the model. For state-space/signal
extraction, the model consists of two equations:
(1)  X_t = A_t X_{t−1} + Z_t + F_t W_t , and
(2)  Y_t = μ_t + C_t′ X_t + V_t
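To show the recursions behind TYPE=FILTER, here is a minimal Python sketch of one Kalman filter predict/update step for a scalar version of (1) and (2). The argument names echo the option names (A, Z, F, SW, C, MU, SV), but this is an illustration only, not RATS code or DLM's actual implementation:

```python
def kalman_step(x, P, y, a, z, f, sw, c, mu, sv):
    """One filter step for the scalar model
      x_t = a*x_{t-1} + z + f*w_t,  var(w_t) = sw
      y_t = mu + c*x_t + v_t,       var(v_t) = sv
    Given the previous filtered mean x and variance P and the new
    observation y, returns the updated filtered mean and variance."""
    # predict the state forward one period
    xp = a * x + z
    Pp = a * P * a + f * sw * f
    # update on the observation
    S = c * Pp * c + sv            # forecast variance of y
    K = Pp * c / S                 # Kalman gain
    return xp + K * (y - (mu + c * xp)), Pp - K * c * Pp
```

For the local-level case (a = c = f = 1, z = mu = 0) with no state noise, the update simply splits the difference between the prior mean and the observation in proportion to their precisions.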
type=[filter]/smooth/control/simulate/csimulate
This determines what technique is to be used. FILTER is the Kalman filter,
SMOOTH is the Kalman smoother and CONTROL solves the control problem (see
“Options for Optimal Control”). SIMULATE does a random (Normal) simulation
of the dlm drawing randomly from the presample distribution, the state distur-
bances and the measurement equation disturbances. CSIMULATE does a condi-
tional simulation, drawing the states from their distribution conditional on the
observed Y’s.
save=DLM to save inputs [not used]
get=DLM with saved model to be used [not used]
SAVE saves the information about the inputs to the model (basic input options A,
C, Y, F, SW, SV, MU, Z and less standard ones SX0, X0, Q, R, SH0 and DISCOUNT).
You can then use GET on a later DLM instruction to re-use those inputs. Note that
this only applies to the inputs, not the outputs, or to TYPE options or estimation
controls.
presample=ergodic/[x0]/x1/diffuse
exact/[noexact]
g=matrix (RxN, R<N) reducing model to stationarity [not used]
sh0=variance of diffuse part of the prior [identity if EXACT]
x0=VECTOR or FRML[VECTOR] [zero vector]
sx0=SYMMETRIC or FRML[SYMMETRIC] [zero]
These options can be used to initialize the pre-sample states of the Kalman filter.
See Section 10.6 of the User’s Guide for the technical details.
If PRESAMPLE=X0 (the default), the initial (finite) state mean and covariance
matrix are supplied by the X0 and SX0 options. X0 supplies the initial state
mean—it should evaluate to an N VECTOR while SX0 gives the covariance of X0—
it should evaluate to an N×N SYMMETRIC array. If PRESAMPLE=X1, X0 and SX0
are still used, but they provide X1|0 and S1|0, not X0|0 and S0|0.
variance=[known]/concentrated/chisquaredinverse
scale/[noscale] (old option—use VARIANCE=CONCENTRATED instead of SCALE)
VARIANCE=KNOWN assumes all variances are known (or being estimated).
VARIANCE=CONCENTRATED assumes all variances are known up to a single un-
known scale factor (usually the variance of the measurement equation) which is
to be concentrated out. VARIANCE=CHISQUARED assumes all variances are known
up to an unknown scale factor (again, usually the variance of the measurement
equation) which has an informative (inverse) chi-squared prior distribution—see
the PDF and PSCALE options below.
pmethod=bfgs/[simplex]/genetic/annealing/ga/gauss
piters=number of PMETHOD iterations to perform [none]
Use PMETHOD and PITERS if you want to use a preliminary estimation method to
refine your initial parameter values before switching to one of the other estima-
tion methods—rats will automatically switch to the “METHOD” choice after com-
pleting the preliminary iterations requested via PMETHOD and PITERS.
(5)  E [ X_0′ Q_0 X_0 + ∑_{t=1}^{T} ( X_t′ Q_t X_t + U_t′ R_t U_t ) ]
function WithStdSigma xt vt
type vector *xt
type symm *vt
*
if xt(1)>1.0
compute xt(1)=1.0
end
*
dlm(fpost=WithStdSigma,y=y,c=1.0,sw=sigsqw,sv=sigsqv,$
presample=diffuse,type=filter) / xstates_std vstates_std
Missing Values
A missing data point can be represented by either a Ct or Yt which has a missing
value. You can also use a SMPL option to skip data points. Note that the missing time
periods are still considered part of the overall sample. The Kalman filter and smooth-
er will estimate the state vector at those time periods and optimal control will gener-
ate a control value for them.
Note that if Y has more than one component, some can be missing, while others
aren’t. DLM will adjust the calculation in those cases to use the information available.
Examples
(6)  y_t − μ = φ( y_{t−1} − μ ) + ε_t + θ ε_{t−1}
If p is known and fairly small, it’s probably easiest to just code the A matrix directly.
We show here how to do it for a general p. The transition equation matrices are
(9)  X_t = ( y_t , y_{t−1} , … , y_{t−p+1} )′

     A = [ φ_1  φ_2  …  φ_{p−1}  φ_p ]
         [ 1    0    …  0        0   ]
         [ 0    1    …  0        0   ]
         [ …                         ]
         [ 0    0    …  1        0   ] ,   F = ( 1, 0, …, 0 )′ ,   W_t = ε_t
If the coefficients are known and aren’t being estimated, A can be set up with an
EWISE instruction with an %IF to handle the first row. Assume that the coefficients
are in a vector named PHI.
dec rect a(p,p)
ewise a(i,j)=%if(i==1,phi(j),(i==j+1))
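The same construction, mirrored in Python for readers more comfortable outside RATS (illustrative only; the function name is ours):

```python
def companion(phi):
    """Build the p x p companion matrix A of (9): the first row holds
    the AR coefficients phi_1..phi_p, the subdiagonal is 1, and all
    other entries are 0 -- the same logic as the EWISE with %IF."""
    p = len(phi)
    return [[phi[j] if i == 0 else (1.0 if i == j + 1 else 0.0)
             for j in range(p)] for i in range(p)]
```

For p = 2 with coefficients (0.5, 0.2), this produces the familiar matrix with (0.5, 0.2) in the first row and (1, 0) in the second.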
However, if the coefficients are to be estimated, the “ewise” has to be done within the
defining formula. The easiest way to do that is to create a FUNCTION which returns
the matrix. Because the A matrix doesn’t depend on time (just on the parameters),
it’s a good candidate for being computed using the START option. In the DLM instruc-
tion below, this calls AFUNC with the current values of the parameters, puts the
do index = startvalue,endvalue,increment
instructions executed for each value taken by index
end do
Parameters
index An integer-valued variable or array element. At the end of each
pass through the loop, rats increments this variable by the
increment value.
startvalue The initial (integer) value for index to take. It can be a con-
stant, another variable, or an integer-valued expression.
endvalue The terminating (integer) value for the index. rats executes
the instructions in the loop as long as:
index ≤ endvalue when increment > 0
index ≥ endvalue when increment < 0
Note that rats will not execute the loop at all if
startvalue is greater than endvalue (or less than endvalue
for a negative increment).
increment (Optional: Default is 1). This is an (integer) value by which
index changes with each pass through the loop. It can be a con-
stant, variable or expression, and it can be positive or negative.
Examples of Syntax
do step=0,6 STEP=0,1,2,..,6
do count=5,0,-1 COUNT=5,4,...,0
do j=1979:1,enddata,4 J=1979:1,1980:1 (with quarterly data),... as
long as J ≤ ENDDATA
Nesting Loops
You can nest loops, that is, put one loop inside of another. Each DO must have its own
END DO instruction. For example:
do iters=1,500
do steps=0,6
...
end do steps
end do iters
The indenting that we used in this example is not necessary, but makes the code
easier to follow (you can use Edit–Indent Lines to do this). rats ignores any text
(on the same line) after an END DO, so you can add comments to indicate which loop
is “ended” by each END DO. In the example above, we’ve used the names of the two
index variables (STEPS, ITERS) as comments.
Notes
You should never change the value of the index variable within the loop. rats deter-
mines the number of trips that it will make through the loop when it first executes
the DO instruction. This will not be affected by changes you make to the index. If you
need a more flexible loop form, use WHILE or UNTIL.
The value of the index when the loop is exited is the value it had on the last pass
through, not the value that causes the loop to terminate.
Avoid using the reserved variable T as a loop index, because it is reserved for use by
SET and FRML instructions (which can change the value of T).
The function %DO can be used to put “looping” subcalculations into a larger calcula-
tion. For instance,
frml archpart = v=0.0, %do(i, 1, p, v=v+b(i)*u{i}^2), v
will compute (for a value of T) a sum involving P lags of the series U.
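What that %DO computes at a given entry can be written out in Python (illustrative names; 1-based lag indexing as in the FRML):

```python
def archpart(b, u, t):
    """v = sum_{i=1}^{p} b_i * u_{t-i}^2 -- the looping subcalculation
    that the %DO performs inside the ARCHPART formula at entry t."""
    return sum(bi * u[t - i] ** 2 for i, bi in enumerate(b, start=1))
```

With a single lag coefficient of 0.5 and u_{t−1} = 2, the value is 0.5 × 4 = 2, matching a hand evaluation of the sum.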
Examples
do ar=0,3
do ma=0,3
boxjenk(diffs=1,constant,ar=ar,ma=ma,maxl) $
rate 1975:4 2017:4
end do ma
end do ar
estimates 16 arima models, one for each possible combination of p and q between 0
and 3.
Parameters
index The variable that takes each of the values in the list in turn.
This can have any data type but if not previously declared it
will be an INTEGER.
list of values The list of values index is to take. These values must be com-
patible with index. See the rules in the next paragraph.
List of Values
• if index is an INTEGER, they may be any integer-valued variables or expres-
sions. You can use “n TO m” as shorthand for consecutive integer values.
• If index is a SERIES or an EQUATION, they may be series or equation names,
or any integer-valued variables or expressions. Integer values are interpreted
as series or equation “handles”.
• if index is any other type, they must be variables or expressions which can be
converted to the correct type.
With any type of index, you can use an array aggregate of that type in the list of
values: for example, you can use VECTOR of INTEGERS with an INTEGER index.
DOFOR will loop over all the elements in that array.
Variables Defined
%DOFORPASS pass number through the list (starting at 1) (INTEGER).
Notes
index is used as a “placeholder”: its value at the end of execution of the loop is un-
changed from what it was at the beginning.
You can use an INTEGER index even if you intend to loop over a list of series. That
is the preferred way of handling a list of series, since you don’t have to create a new
series (which DECLARE SERIES will do) solely for the purpose of serving as a loop in-
dex. rats converts a series name into its numerical “handle” and will convert it back
to reference a series if necessary. If you need to refer to the series in a SET or FRML
instruction, however, explicitly tell rats that you intend the index to represent a se-
ries by using index{0} rather than index alone the way you could with an ordinary
series name. See the first two examples below.
Examples
spgraph(hfields=3,vfields=4,$
xlabels=||"Levels","1st Difference","12th Difference"||,$
ylabels=||"Interest Rate","Money","Price","Output"||)
dofor y = ffunds lm lp lo
set dy = y{0}-y{1}
set d12y = y{0}-y{12}
graph(row=%doforpass,col=1)
# y
graph(row=%doforpass,col=2)
# dy
graph(row=%doforpass,col=3)
# d12y
end dofor y
spgraph(done)
This loops over four series, creating the 1st and 12th difference for each, then
graphing the level and the two derived series in a 4×3 graph matrix, organized by
SPGRAPH. The GRAPH instructions use the %DOFORPASS variable to put the graph into
the proper slot.
Parameters
target series List of series which are the endogenous variables in the model.
The number of series should match the number of equations in
the model.
initial values (Optional) If desired, you can use the syntax “series<<value”
to provide an initial guess value for the expansion point for any
of the series in the list of targets.
Options
model=MODEL to be solved
This must take the form shown in “Form for Model”.
expand=[none]/linear/loglinear
This indicates the type of expansion required. The default (EXPAND=NONE) is
used when the model is fully linear. EXPAND=LINEAR does a linear expansion and
EXPAND=LOGLINEAR does a log-linear expansion.
The SOLVEBY option controls the method used for solving the Newton’s method
steps. SOLVEBY=LU uses the faster lu-decomposition, but that fails when
the model has unit roots, in which case you’ll need to switch to the slower
SOLVEBY=SVD.
You can retrieve Q1 using the A option, Qc with the Z option and Q0 with the F
option. The final term drops out if the Z process is serially uncorrelated. If it isn’t,
or if you want to predict the effect of (known) future shocks to Z, you can use
the ETZ option to obtain the three matrices needed for that. After ETZ=THETA,
THETA(1) has Qy , THETA(2) has Qf and THETA(3) has Qz . Note that the infinite
sum in the final term will rarely simplify easily, so the sum will generally have to
be approximated with a finite number of terms.
The CONTROLS option controls the positioning of the states for the final set of se-
ries listed on the DSGE instruction. Ordinarily, all augmenting states (leads and
extra lags) are included after all the original series. However, with the CONTROLS
option, those final series are placed at the end of the state vector so you can easily
separate them from the remainder of the states.
(4)  log x_{n+1} = log x_n − diag(x_n)^{−1} F′(x_n)^{−1} F(x_n)
This is repeated until the maximum of the absolute values of the components of the
adjustment vectors is less than the convergence criterion.
The default initial guess values are a vector of zeros if a linear expansion is used, and
a vector of ones if a log-linear expansion is used. There’s a good chance that you’ll
need to provide a better set of values than these. That can be done either with the <<
fields on the DSGE instruction or with the INITIAL option.
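Equation (4) is an ordinary Newton's method step carried out in the logs of the variables. As a check on the arithmetic, here is a minimal numpy sketch using a made-up two-equation system F (not a RATS model); the stopping rule matches the text, comparing the largest adjustment with the convergence criterion:

```python
import numpy as np

def log_newton(F, J, x0, crit=1e-12, maxit=100):
    """Solve F(x)=0 for x>0 by Newton's method in log coordinates:
    log x_{n+1} = log x_n - diag(x_n)^{-1} J(x_n)^{-1} F(x_n)."""
    x = np.asarray(x0, dtype=float)
    for _ in range(maxit):
        # (J(x) diag(x)) step = F(x)  <=>  step = diag(x)^{-1} J(x)^{-1} F(x)
        step = np.linalg.solve(J(x) @ np.diag(x), F(x))
        x = x * np.exp(-step)
        # stop when the maximum absolute adjustment is below the criterion
        if np.max(np.abs(step)) < crit:
            return x
    raise RuntimeError("steady state solution did not converge")

# Illustrative system with root x = (sqrt(2), sqrt(2))
F = lambda x: np.array([x[0] * x[1] - 2.0, x[0] - x[1]])
J = lambda x: np.array([[x[1], x[0]], [1.0, -1.0]])
print(log_newton(F, J, [1.0, 1.0]))   # approximately [1.41421356 1.41421356]
```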
Examples
The example file CASSKOOPMANS.RPF shows a simple Cass-Koopmans growth model.
The model is deterministic and the conditions are linear. What DSGE does here is to
suppress the unstable root. The state-space representation is solved using DLM for
several different sets of initial conditions, one of which we show here. Because the
model is linear, there is no need to find the steady-state in order to expand it; here,
we compute it in order to know how to set the initial conditions to be above or below
the steady-state.
declare series c lambda k
declare real u0 u1 f0 beta
frml(identity) f1 = u0-u1*c-lambda
frml(identity) f2 = f0*lambda-1.0/beta*lambda{1}
frml(identity) f3 = f0*k{1}-k-c{1}
compute beta=.95,f0=1.3,u0=1.0,u1=0.2
group casskoopmans f1 f2 f3
dsge(expand=linear,steadystate=ss,a=a,z=z,model=casskoopmans) $
c k lambda
dlm(x0=||3.0,17.0,u0-u1*3.0||,a=a,z=z,presample=x1) 1 20 xstates
set c 1 20 = xstates(t)(1)
set k 1 20 = xstates(t)(2)
spgraph(vfields=2,footer="Initial consumption below steady state")
graph(hlabel="Consumption")
# c
graph(hlabel="Capital")
# k
spgraph(done)
These are the first order conditions for a simple rbc model with inelastically sup-
plied labor, log utility function and an AR(1) productivity shock.
1 = E_t [β R_{t+1} c_t / c_{t+1}]
R_t = α k_{t−1}^{α−1} θ_t + (1 − δ)
y_t = k_{t−1}^{α} θ_t
y_t = k_t − (1 − δ) k_{t−1} + c_t
log θ_t = ρ log θ_{t−1} + (1 − ρ) μ + ε_t
Part of the example file SIMPLERBC.RPF is provided below which analyzes this. The
model has five equations in five endogenous variables, with one fundamental shock.
Because the conditions are non-linear, this needs to be (log) linearized, which is
why the DSGE instruction includes the EXPAND=LOGLIN option. We need to save the
steady-state solution (using the STEADY option) to transform log-linearized simula-
tions back to levels.
dec real beta rho mu alpha delta
dec series r c k y theta
*
frml(identity) f1 = 1 - (beta*r{-1}*c/c{-1})
frml(identity) f2 = r - (theta*alpha*k{1}^(alpha-1)+1-delta)
frml(identity) f3 = y - (theta*k{1}^alpha)
frml(identity) f4 = y - (c + k - (1-delta)*k{1})
frml f5 = log(theta) - (rho*log(theta{1})+(1-rho)*mu)
*
group simplerbc f1 f2 f3 f4 f5
compute alpha=.3,delta=.15,rho=.9,beta=.95,mu=4.0
dsge(model=simplerbc,expand=loglin,steady=ss,a=adsge,f=fdsge) $
y c k r theta
Simulate 40 periods of the model with productivity shocks that have a standard deviation of .02 (2%, since theta is in log form).
dlm(a=adsge,f=fdsge,presample=ergodic,sw=.02^2,type=simulate) $
1 40 xsims
set y = exp(xsims(t)(1))*ss(1)
set c = exp(xsims(t)(2))*ss(2)
graph(header="Simulated Economy",key=below) 2
# y
# c
Variables Defined
%CONVERGED 1 or 0. Takes the value 1 if the solution for the steady state con-
verged, 0 otherwise.
%CVCRIT Final convergence criterion (if steady state solution is needed)
Parameters
series series to define
start end range to set. These default to the current sample period.
Options
from=starting period for treatment [defaults to start period]
to=ending period for treatment [defaults to end period]
FROM sets the starting period for the dummy treatment, while TO sets the ending
period. Used alone or together, these define a shift dummy which is 0 outside the
FROM, TO range, and 1 inside it.
FROM and TO are also used with the RAMP option and may be used with the LS
option.
ramp/[noramp]
Defines a “ramp”, which is a temporary change in a linear trend (increasing at a
rate of 1 per entry). The FROM and TO option define the limits of the trend shift.
The dummy is (FROM-TO) from start until FROM, (T-TO) from FROM until TO
and 0 after that. Again, this is designed to make the dummy zero at the end of
the sample.
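As a quick illustration of the RAMP arithmetic (this is ordinary Python, not RATS code), the dummy for a sample of n entries could be built as:

```python
def ramp_dummy(n, frm, to):
    # Entry numbers are 1-based, as in RATS.
    # (FROM-TO) before FROM, (t-TO) from FROM until TO, 0 afterwards.
    return [frm - to if t < frm else (t - to if t <= to else 0)
            for t in range(1, n + 1)]

print(ramp_dummy(10, 3, 7))
# [-4, -4, -4, -3, -2, -1, 0, 0, 0, 0]
```

Note that the dummy ends at zero, so the trend shift is zero at the end of the sample.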
DFROM and DTO provide the base date for the effect.
The ||month,day|| form is for a specific day of the month, for instance,
||12,25|| for Christmas. Use ||month,day of week,count|| for “floating”
dates, given as the n’th occurrence in a given month of the specified day of the
week. Days of the week are coded as 1=Monday,…,7=Sunday. So the second Sun-
day in May is ||5,7,2|| (5th month is May, 7th day is Sunday, 2 gives the 2nd
occurrence in the month).
Use a negative value for count to count back from the end of the month, with -1
giving the last occurrence, -2 the next to last occurrence, etc.
Examples
dummy(ao=1987:10) temp
dummy(ls=1987:10) perm
defines TEMP as a single period dummy for 1987:10, and PERM as a level shift which is
-1 through 1987:10 and 0 afterwards.
set trend = t
dummy(ramp,to=1981:1) tbreak
linreg y
# constant trend tbreak
estimates a regression with a broken trend, where the rate (but not the level) chang-
es at 1981:1.
dummy(dfrom=||9,1,1||,wfrom=10,dto=||9,1,1||,wto=0) statefair
defines a split dummy in which the months of August and September in each year
get the fraction of the 11 days up to and including U.S. Labor Day (first Monday in
September) that fall in that month.
Wizard
The Time Series—VAR (Setup/Estimate) wizard includes support for defining and
estimating error-correction models.
Parameters
equations lists equations describing the “stationary” relationships among
variables in the var model. You can use a VECTOR of equations
if you might need to vary the number. The equations can also
include exogenous variables if desired.
Suppose that, instead, a RECTANGULAR matrix B has been estimated which lists
the cointegrating vectors. The example on the following page creates a VECTOR of
EQUATIONS to hold the equations. In this case, the dependent variables of the equa-
tions are ignored in forming the stationary conditions.
The following excerpt is taken from the example program KPSW5.RPF, which repro-
duces Table 5 from King, Plosser, Stock and Watson (1991).
Parameters
matrix Matrix for which the eigen decomposition is computed. In gen-
eral, it can be any N×N RECTANGULAR or SYMMETRIC matrix.
However, if you use either the SCALE or EXPLAIN options, it
must be a positive definite SYMMETRIC.
eigenvalues (Optional output) For an N×N matrix, EIGEN saves the (real
parts of the) N eigenvalues in this VECTOR. Use the CVAL-
UES option if you are expecting complex-valued eigenvalues.
Use * for this parameter if you want eigenvectors but not
eigenvalues.
eigenvectors (Optional output) EIGEN saves the (real parts of the) eigenvec-
tors in this N×N RECTANGULAR. Use the CVECTORS option
instead if you are expecting complex-valued eigenvectors.
Options
dmatrix=[eigenvalues]/identity
scale/[noscale]
For a SYMMETRIC matrix, the eigen decomposition for A takes the form
A = PDP ′ where D is a diagonal matrix. If DMATRIX=IDENTITY (the older op-
tion SCALE is a synonym), the matrix of eigenvectors (P) is normalized so that
D is the identity matrix. This is only possible if matrix is SYMMETRIC and posi-
tive definite. For the default DMATRIX=EIGENVALUES, each column of P has unit
length. With the EXPLAIN option, EIGEN reports the fraction of the variance of
each component explained by each eigenvector, computed as λ_j e_{ij}² / V_{ii},
where λ_j is the jth eigenvalue, e_{ij} is the ith element of eigenvector j and V_{ii} is the
ith diagonal element of the matrix. This can only be used if matrix is SYMMETRIC
and positive definite.
Examples
eigen %cmom eigenval
compute condition=%maxvalue(eigenval)/%minvalue(eigenval)
computes the condition number for the (positive-definite) SYMMETRIC matrix %CMOM.
eigen(cvalues=cxvalues,sort=absval) a
dec vector absval(%rows(a))
ewise absval(i) = %real(%cabs(cxvalues(i)))
computes the absolute values of the eigenvalues of A, producing the VECTOR ABSVAL.
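The same two calculations can be sketched in numpy, which is handy for checking results outside of rats (illustrative matrices, not RATS data):

```python
import numpy as np

# Condition number of a positive-definite symmetric matrix
cmom = np.array([[4.0, 1.0], [1.0, 3.0]])
vals = np.linalg.eigvalsh(cmom)           # real eigenvalues, ascending order
condition = vals.max() / vals.min()

# Absolute values of possibly complex eigenvalues of a general matrix
a = np.array([[0.0, -1.0], [1.0, 0.0]])   # eigenvalues are +i and -i
absval = np.abs(np.linalg.eigvals(a))
print(condition, absval)
```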
Parameters
codings A RECTANGULAR matrix of dimension number of new series x
number of original variables. Each row of codings provides the
coefficients of a different linear combination of the variables.
start end This is the range of entries over which ENCODE will compute the
transformation. If you have not set a SMPL, this defaults to the
maximum range allowed by the variables being transformed.
new series List of series that ENCODE will construct from the linear com-
binations of regressors. There should be one series per row of
codings. We recommend using the newer RESULTS option
instead.
Options
[clear]/noclear
Unless you use NOCLEAR, ENCODE erases any information from previous ENCODE
instructions. If a regression requires more than one ENCODE, the ENCODEs after the
first must use NOCLEAR.
show up in your output—Z does. ENCODE and UNRAVEL don’t change this procedure;
they just ensure that the regression is reported using the original regressors. Basically,
ENCODE computes the “Z” variable, and remembers how it was constructed. When you
UNRAVEL a regression, the “Z” is replaced with equal coefficients on X4 and X5.
Example
The code below is taken from the example PDL.RPF. It computes a 3rd order polyno-
mial distributed lag first using ENCODE and UNRAVEL, then with the @PDL procedure.
declare rect r
* PDL with no end constraints
dim r(4,25)
ewise r(i,j)=j^(i-1)
encode(results=enc) r
# shortrate{0 to 24}
linreg(unravel) longrate
# constant enc
* Same estimation using the PDL procedure
@pdl(graph) longrate
# shortrate 0 24 3
* PDL with far constraint
dim r(3,25)
ewise r(i,j)=(j-26)*j^(i-1)
encode(results=enc) r
# shortrate{0 to 24}
linreg(unravel) longrate
# constant enc
* Same estimation using the PDL procedure
@pdl(constrain=far,graph) longrate
# shortrate 0 24 3
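Conceptually, ENCODE forms the transformed regressors Z = XR′ from the original regressors X and the coding matrix R, and UNRAVEL maps the estimated coefficients back to the original regressors via β = R′γ. A noiseless numpy sketch of the idea (illustrative data, not the PDL.RPF series):

```python
import numpy as np

rng = np.random.default_rng(0)
nobs, nlags = 200, 25
xfull = rng.standard_normal(nobs + nlags)
# Columns of X are lags 0..24 of the series
X = np.column_stack([xfull[nlags - j: nlags - j + nobs] for j in range(nlags)])
# Coding matrix: row i holds j^(i-1) for j = 1..25 (3rd order polynomial)
R = np.array([[float((j + 1) ** i) for j in range(nlags)] for i in range(4)])
Z = X @ R.T                          # the "encoded" regressors
gamma_true = np.array([1.0, -0.1, 0.01, -0.0002])
y = Z @ gamma_true                   # noiseless, so recovery is exact
gamma_hat, *_ = np.linalg.lstsq(Z, y, rcond=None)
beta_hat = R.T @ gamma_hat           # "unraveled" coefficients on the lags
```

The 25 unraveled coefficients in beta_hat lie exactly on a cubic polynomial in the lag number, which is the point of the reparameterization.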
Notes
When you UNRAVEL a regression, the result is a covariance matrix which is not full-
rank. For instance, in the example above, the 27 coefficients in the final regression
are actually linear combinations of the 9 coefficients estimated. You can run hypothe-
sis tests on the restricted regressions as you would unrestricted ones. However, rats
may have to adjust the degrees of freedom of the test. For instance,
exclude
# shortrate{0 to 12}
will actually have just 4 numerator degrees of freedom, not 13. rats will automati-
cally adjust the degrees of freedom accordingly, and issue a warning like:
X13. Redundant Restrictions. Using 4 Degrees, not 13
See Also . . .
RESTRICT Tests or imposes general linear restrictions.
MRESTRICT Tests or imposes general linear restrictions using matrices.
We primarily describe the third purpose here. See the discussions of SYSTEM, looping
instructions, and procedures elsewhere for examples of the other uses.
If rats encounters an END instruction that does not include either the SYSTEM or
RESET options, and does not terminate a loop or other compiled section, it is sim-
ply ignored. This is a change from versions prior to 7.0, where a lone END command
would terminate the program.
If you wish to clear the memory, use END(RESET), as described below.
end(option)
Options
SYSTEM When END is used with this option, it signals the end of a
SYSTEM block used to define a VAR model or other system of
equations. See the SYSTEM instruction or User’s Guide Chapter
10 for details.
RESET This clears all variables and other information from rats
memory. When working in interactive mode, you can also do
this using the File—Clear Memory menu operation.
See Also . . .
HALT Terminates execution from inside a compiled section.
Parameters
variables list ENTER sets the values of the variables you list using the in-
formation on the supplementary card. These can be INTEGER,
REAL, COMPLEX, LABEL or STRING variables or array elements.
You can use any combination of these types.
You can enter a complete array only if it is the only object in the
variables list.
Options
varying/[novarying]
entries=(output) INTEGER for number of entries
You can only use VARYING when you are entering a VECTOR of one of the basic
data types. It allows you to input a list of unspecified length, such as a regression
supplementary card. If you use VARYING, you can include the option ENTRIES.
ENTRIES saves in an INTEGER the number of entries processed.
Supplementary Card
ENTER has a number of uses, and the placement of the supplementary card depends
upon how it is being applied:
• If you use ENTER to bring in specific information when the procedure is used (a
regressor list, for instance), omit the card from the procedure. When there is
no supplementary card immediately after ENTER, rats expects to see it after
the instruction which called the procedure.
• If you use ENTER within the procedure itself to build up a list in a VECTOR,
include the supplementary card in the procedure, right after the ENTER.
If you are reading in a complete array, ENTER takes it from a single supplementary
card according to the internal arrangement of the array:
• For RECTANGULAR, by columns
• For SYMMETRIC or PACKED, by rows, lower triangle only.
Examples
procedure constrain nconstr r
type integer nconstr
type vector *r
local integer nser nper i
local real value
do i=1,nconstr
enter nser nper value
compute r(i)=value-([series]nser)(nper)
end do i
end
CONSTRAIN will have NCONSTR supplementary cards, each with two integer values
and one real. The first integer should be a series name or number. For example:
@constrain 3 r
# tbill 2019:1 2.2
# tbill 2019:2 2.3
# tbill 2019:3 2.4
You can use ENTER(VARYING) along with regressor list functions to build up a list
of regressors, which is particularly useful in writing procedures. The following code,
taken from the @VARLAGSELECT procedure, builds a var regressor list:
enter(varying) list
compute n=%rows(list)
compute ntotal=n*(lags+1)
* Build the regressor list, starting with the dependent variables:
dim reglist(0)
do i=1,n
compute reglist=%rladdone(reglist,list(i))
end do i
* Now add the lagged variables.
do j=1,lags
do i=1,n
compute reglist=%rladdlag(reglist,list(i),j)
end do i
end do j
See page UG–482 in the User’s Guide for more on the regressor list functions.
Notes
INPUT, READ, COMPUTE and FIXED are superior to ENTER when you have known val-
ues. QUERY and MEDIT also can be used to get information from a user.
Environment Strings
The following are the available environment strings. Note that blanks are significant.
[showgraphs]/noshowgraphs
printgraphs/[noprintgraphs]
NOSHOWGRAPHS suppresses the displaying of graphs on the screen (they will
still be saved to disk if you use OPEN PLOT or ENV GSAVE instructions).
PRINTGRAPHS causes rats to automatically print each graph as it is generated—
graphs are printed even if you use NOSHOWGRAPHS as well.
echo/noecho
When running in batch mode, rats normally prints out (echoes) each line it
reads from the input file. If you don’t want the input lines echoed, use ENV
NOECHO at the beginning of your program. ENV ECHO will restart the echoing.
traperrors/notraperrors
subscripterrors/nosubscripterrors
TRAPERRORS suppresses direct error messages. If you are writing a program
or procedure and want to handle errors yourself, use TRAPERRORS and test the
%ERRCODE variable where appropriate. Subscript errors occur when you reference
an out-of-range array or series element in an expression. Suppressing subscript
errors with NOSUBSCRIPTERRORS will speed up (somewhat) programs which
make heavy use of subscripted expressions, such as COMPUTE instructions within
loops or EWISE instructions. You should never do this until you are sure the pro-
gram is running correctly.
Note: The ENV GSAVE/GFORMAT operation described below has been superseded by
the GSAVE instruction.
gsave=filename template for saving graphs
gformat=rgf/portrait/landscape/wmf/pict
Use GSAVE=template to automatically save graphs to disk. If you include an
asterisk (*) somewhere in the template, rats will save each subsequent graph using
that template, replacing the * with a sequence number. For example, to save
graphs as MYGRAF1.RGF, MYGRAF2.RGF, etc., do:
env gsave="mygraf*.rgf"
Without the asterisk, rats will use the explicit filename provided, and add each
graph to that file. By default, graphs are saved in rgf (rats Graph Format). Use
the GFORMAT parameter to select a different form from the choices listed above
(PORTRAIT is PostScript format with a portrait orientation, and LANDSCAPE is
PostScript format in landscape orientation).
Examples
environment gsave="basics*.eps" gformat=portrait
graph(key=upleft) 3
# rate
# ip
This saves a graph to disk in PostScript format with the “Portrait” orientation. The
file will be called BASICS1.EPS.
Variables Defined
%ERRCODE An INTEGER variable. If you use ENV TRAPERRORS, this will
be set to the appropriate error code if rats generates an error.
You can look up the error code in the text file RATSERRS.MSG to
see the corresponding error message.
Wizard
You can use the Statistics—Equation/FRML Definition wizard to define equations.
Description
EQUATION takes slightly different forms for arma (autoregressive–moving average)
equations compared with standard regression relationships.
• For standard equations, you supply only the equation and depvar parameters. The explanatory variables are listed on the supplementary card.
• For arma equations, you use the AR option (or the ARlags parameter) and the MA option (or the MAlags parameter) in addition to equation and depvar.
Parameters
equation Name or number of the equation being defined.
depvar Dependent variable of the equation.
ARlags (Optional) AutoRegressive lags. If this is a single number, it
means consecutive lags from 1 to ARlags. To skip lags, use
||list of lags||; for instance, ||1,4|| for lags 1 and 4.
You can also use a VECTOR of INTEGERS.
The AR and MA options are the preferred way to input arma lag
information.
MAlags (Optional) Moving Average coefficients. These work the same
way as the ARlags parameter.
Supplementary Card
Standard form For standard regression equations, EQUATION needs one supple-
mentary card listing the explanatory variables in regression
format. Omit it if you use the LASTREG or EMPTY options.
ARMA models With arma equations, you only need a supplementary card
if you use the REGRESSORS option to include extra variables
besides the arma parameters and the automatic CONSTANT.
Options
identity/[noidentity]
Use IDENTITY when you are defining an identity. Some of the forecasting in-
structions (IMPULSE, ERRORS, SIMULATE and HISTORY) need this information.
lastreg/[nolastreg]
LASTREG defines the equation using the variables and estimated coefficients from
the most recent regression. Omit the supplementary card if you use this option.
constant/noconstant
For an arma equation, EQUATION includes the CONSTANT series among the ex-
planatory variables in the equation unless you use NOCONSTANT. For non-arma
models, NOCONSTANT is the default—if a constant term is needed, it is usually
supplied using CONSTANT on the supplementary card along with the other ex-
planatory variables.
regressors/[noregressors]
You can use this option with arma models when you want to include variables
in addition to the CONSTANT and the arma part. Models like this are sometimes
called armax models (arma with extra variables). List the additional regressors
(in regression format) on a supplementary card. (This option was called MORE in
versions of rats before 7, and that’s still acceptable.)
empty/[noempty]
Use the EMPTY option to create an equation which has only a dependent variable,
with no right-hand-side terms. This can be useful if you need to apply shocks to
an “exogenous” variable in an impulse response analysis.
ARMA Equations
The rats forecasting instructions cannot directly handle equations with the multiplicative structure permitted by the BOXJENK instruction. They require equations with
the simpler parametric form:
Φ(L) y_t = a + Ψ(L) x_t + Θ(L) u_t
where Φ, Ψ and Θ are simple polynomials in the lag operator L. The DEFINE option
on BOXJENK generates such equations by multiplying through by all polynomials
which appear in the denominators of the estimated form and expanding all products.
If you want to generate a series which follows a particular ARMA structure, you can
write the equation in the form above and input the equation with EQUATION.
You can estimate simple arma equations (with no armax components) by INITIAL
followed by ITERATE. INITIAL does not compute coefficients for non-arma variables,
so you need to do one of the following for armax equations:
• Use INITIAL to set initial guesses for the ARMA parameters, while zeroing out
the extra coefficients. Use ITERATE from there. This is most likely to be successful
when most of the explanatory power is in the serial correlation model.
• For equations with ma components, but no ar components, first estimate a
standard regression. Use EQUATION and LINREG(EQUATION=...) or just
LINREG(DEFINE=...) to create the base equation. Then use:
MODIFY equation
VADD %MVGAVGE list of lags
to add the ma components. This works better when the regression part of the
model dominates.
The next example defines an ARIMA(2,0,1) with AR lags 1 and 4 and MA lag 4:
(1 − Φ_1 L − Φ_2 L^4) y_t = a + (1 + Θ_1 L^4) u_t
and assigns the coefficient values Φ_1 = 0.5, Φ_2 = 0.4, and Θ_1 = 0.7 (and a 0 intercept).
Equations are set up with the CONSTANT first, then the AR lags, then the MA lags.
equation(coeffs=||0.0,0.5,0.4,0.7||,ar=||1,4||,ma=||4||) yeq y
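To see what a series following this structure looks like, the same equation can be simulated directly. Here is a small numpy sketch (illustrative, with unit-variance shocks and a zero intercept; this is not what SIMULATE does internally):

```python
import numpy as np

phi1, phi2, theta1 = 0.5, 0.4, 0.7       # the coefficients assigned above
nobs, burn = 200, 100
rng = np.random.default_rng(42)
u = rng.standard_normal(nobs + burn)
y = np.zeros(nobs + burn)
for t in range(4, nobs + burn):
    # (1 - phi1*L - phi2*L^4) y_t = (1 + theta1*L^4) u_t
    y[t] = phi1 * y[t - 1] + phi2 * y[t - 4] + u[t] + theta1 * u[t - 4]
y = y[burn:]                             # drop the burn-in entries
```

Since |0.5| + |0.4| < 1, the autoregressive polynomial has no roots on or inside the unit circle, so the simulated process is stationary.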
See Also . . .
ASSOCIATE Sets coefficients for an EQUATION
FRML Defines a FRML—a (possibly) non-linear relationship
BOXJENK Estimates ARIMA, transfer function, and intervention models.
MODIFY With VREPLACE and VADD, changes the structure of an equa-
tion.
Parameters
list of series List of series to be given names
Text Card
This is the list of the names, separated by blanks, that you want to assign to the
list of series. The names on this card must be legal variable names:
• The name must begin with a letter, % or _.
• The only characters are letters, digits, $, % and _.
• The maximum length is sixteen characters.
Description
EQV used to be an important instruction, and you may see it in programs written for
rats version 3 or earlier. Now, if you need an instruction like this, LABELS will prob-
ably be the better choice. While EQV assigns series names, which can be used on input
and output, LABELS only sets output labels, which offers several advantages:
• Series names (done with EQV) must be unique: you can’t have two series called
RESIDS or FORECAST, but any number of series can share a label.
• Labels are not subject to the restrictions put on symbolic names—you can use
any combination of characters (up to sixteen).
• You can set labels in a more flexible fashion. For example, you can use string
expressions and LABEL variables.
Example
open data g7oecd.rat
cal(q) 1956:1
allocate 9 1997:4
eqv 1 to 9
usashort frashort gbrshort usagbond fragbond gbrgbond $
usardiff frardiff gbrrdiff
data(format=rats) / 1 to 6
do i=1,3
set i+6 = (i+3){0}-i{0}
end do i
This uses ALLOCATE and EQV to produce a uniform numbering relationship between
the short and long rates of three countries. The DO loop uses this relationship to con-
struct series of differences for each of the countries.
Wizard
The Time Series—VAR (Forecast/Analyze) wizard provides an easy, dialog-driven
interface for computing variance decompositions.
Parameters
equations Number of equations in the system.
Options
With the exception of MODEL, PRINT/NOPRINT, STEPS, and STDERRORS, these all ap-
ply to ERRORS only when used for decomposition of variance.
model=model name
Of the two ways to input the form of the model to be solved (the other is with
supplementary cards), this is the more convenient. MODELs are usually created by
GROUP or SYSTEM. FRML’s (formulas) can be included if they are of a simple lin-
ear form. (ERRORS requires that the model be fully linear.) If the model includes
any identities, those should be last in the model. If you use this, omit the “equations” parameter.
print/noprint
PRINT is the default if you have more than one equation, and NOPRINT is the
default if you have just one. ERRORS can produce a great deal of output for a big
var. See “Output” for a sample.
window=”Title of window”
If you use the WINDOW option, a (read-only) spreadsheet window is created with
the indicated title and displayed on the screen. This will display N blocks of N+1
columns, in a format similar to the standard output.
impulses/[noimpulses]
Use the IMPULSES option to print the impulse responses which go into the decom-
position. An IMPULSE instruction of similar form would produce exactly the same
output, so if you want the responses in addition to the decomposition, you need
only use the ERRORS instruction with IMPULSES. If you need more flexibility, or
if you need to be able to save the impulse responses, use the IMPULSE instruction
instead. Note: for a big var, this produces a lot of output.
Example of Decomposition
This example computes decompositions for a system of four equations using two
orderings.
system(model=canmodel)
variables cpr m1 ppi gdp
lags 1 to 4
det constant
end(system)
estimate(cvout=v)
*
* The first decomposition is done in the original order:
* CPR-M1-PPI-GDP. The second is GDP-PPI-M1-CPR.
*
errors(model=canmodel,steps=24,cv=v)
errors(model=canmodel,factor=%psdfactor(v,||4,3,2,1||),steps=24)
Technical Information
For a static model such as
(1)  y_t = X_t β + u_t ;  Var(u_t) = σ²
(See, for instance, Greene (2012), p. 81). The first term is due to the equation error
u_{t+1}, and the second is due to sampling error in using β̂ to estimate β. For simple
projections, you can get this variance by using the STDERR option of PRJ.
The situation is much more complicated for a multiple step forecast in a dynamic
model. Take the simplest possible case:
(5)  a u_t + u_{t+1} + (a² − â²) y_{t−1}
Note that the effect of sampling error (the last term) depends upon the squares of the
coefficients. This term becomes extremely complicated as the size of the model and
the number of steps increases.
ERRORS ignores the sampling error term and concentrates on the others: the ones due
to the effects of the innovations (u’s). Note that the two step forecast error depends
not only upon the second period’s innovation, but also upon the first period’s innova-
tion as well. More generally, in the moving average representation,
(7)  ∑_{s=0}^{K−1} Ψ_s u_{t−s} = ∑_{s=0}^{K−1} Ψ_s F v_{t−s}
where FF′ is a factorization of the covariance matrix of u, and the v’s are orthogonalized innovations. The covariance matrix of the K-step ahead forecasts is
(8)  ∑_{s=0}^{K−1} Ψ_s F F′ Ψ_s′ = ∑_{s=0}^{K−1} Ψ_s Σ Ψ_s′
This does not depend upon which factorization of Σ is chosen. However, the
decomposition of variance, which breaks this sum down into the contributions of the
components of v, does. See User’s Guide Section 7.6 for a more detailed discussion on
the decomposition of variance.
decomposition of variance.
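Given the moving average matrices Ψ_s and a factor F of Σ, both the forecast standard errors in (8) and the decomposition shares can be computed directly. A small numpy sketch with a made-up bivariate VAR(1), where Ψ_s = A^s (illustrative numbers, not RATS output):

```python
import numpy as np

def error_decomposition(psi, F, K):
    """Return K-step forecast standard errors and the fraction of each
    variable's K-step variance attributable to each orthogonalized shock."""
    n = F.shape[0]
    contrib = np.zeros((n, n))
    for s in range(K):
        G = psi[s] @ F            # responses to orthogonalized shocks
        contrib += G ** 2         # element (i,j): shock j's share of var(i)
    total = contrib.sum(axis=1)   # diagonal of sum Psi_s Sigma Psi_s'
    return np.sqrt(total), contrib / total[:, None]

A = np.array([[0.5, 0.1], [0.2, 0.4]])
sigma = np.array([[1.0, 0.3], [0.3, 0.5]])
F = np.linalg.cholesky(sigma)             # one possible factorization
psi = [np.linalg.matrix_power(A, s) for s in range(24)]
stderr, shares = error_decomposition(psi, F, K=24)
```

The total variance (and hence stderr) is the same for any factorization with FF′ = Σ; only the shares depend on the factor chosen.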
Output
This is part of the output from an ERRORS instruction applied to a six variable var.
There will be one such table for each endogenous variable.
The first column in the output is the standard error of forecast for this variable in the
model. This is computed using (8). Since the computation assumes the coefficients
are known, it is lower than the true uncertainty when the model has estimated coef-
ficients. The remaining columns provide the decomposition. In each row they add up
to 100%. For instance, in the sample above, 81.84% of the variance of the one-step
forecast error is due to the innovation in CANRGDPS itself.
Notes
If you want to compute the true uncertainty of forecast, you need to apply the tech-
nique of Monte Carlo integration (User’s Guide, Section 16.5) to generate draws from
the posterior distribution of the coefficients of the model. You then use SIMULATE to
draw random shocks for the innovations during the forecast period.
Wizard
The Time Series—Exponential Smoothing wizard provides dialog-driven access to
most of the features of the ESMOOTH instruction.
Parameters
series Series to smooth, seasonally adjust or forecast.
start end Range to smooth. If you have not set a SMPL, this defaults to the
defined range of series.
Options
trend=[none]/linear/exponential/select
seasonal=[none]/additive/multiplicative/select
These jointly determine the type of model. You can choose any combination of
the two. The linear trend model is the Holt–Winters two parameter model. If you
choose SELECT, ESMOOTH tests all three choices for that option, and chooses the
best-fitting model. Used together, TREND=SELECT and SEASONAL=SELECT give
you the best-fitting combination.
estimate/[noestimate]
alpha=constant level smoothing parameter [.3]
gamma=trend smoothing parameter [.3]
delta=seasonal smoothing parameter [.3]
constrain/[noconstrain]
initial=[full]/start
If you use the ESTIMATE option, ESMOOTH finds the values for α, γ and δ (see
"Technical Information") which produce the best fit with the data by minimizing
the sum of squared in-sample forecast errors. By default (with NOESTIMATE), the
three other options provide the values of α, γ and δ. In all cases, values close to
zero provide the most smoothing; values close to one the least. Note: if you use
SELECT for TREND or SEASONAL, ESMOOTH always estimates the parameters.
[print]/noprint
This matters only if you use ESTIMATE or one of the SELECT choices. PRINT out-
puts the squared-error statistics and the final estimated coefficients.
Description
You choose which of the nine methods you want by choosing a trend model: no
trend, linear trend (Holt–Winters) or exponential trend; and a seasonality model:
no seasonal, additive seasonal, multiplicative seasonal. You can, of course, al-
low TREND=SELECT and/or SEASONAL=SELECT to help you make the decision. See
"Technical Information" for a description of the models and formulas.
Missing Values
ESMOOTH simply smooths over missing values, assuming (in effect) that the miss-
ing datum is the forecast value for that period. This permits you to use ESMOOTH for
patching gaps in a time series, provided:
• the series is reasonably smooth, so an exponential smoothing representation is
adequate.
• the gaps are not too near the start of the data, since exponential smoothing relies
solely on the past for the generation of the smoothed data.
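To make the missing-value rule concrete, here is a Python sketch (not RATS code) of the no-trend, no-seasonal case: a missing observation (None) contributes a zero forecast error, so the level carries through the gap and patches it with the forecast.

```python
def smooth_simple(x, alpha=0.3):
    """Simple exponential smoothing in error-correction form,
    S_t = S_{t-1} + alpha * e_t, where a missing datum (None) is treated
    as equal to its own forecast, giving e_t = 0."""
    s = None
    out = []
    for obs in x:
        if s is None:                        # initialize at first valid datum
            s = obs
            out.append(s)
            continue
        e = 0.0 if obs is None else obs - s  # missing => zero error
        s = s + alpha * e
        out.append(s)
    return out

patched = smooth_simple([10.0, 12.0, None, 11.0])
```

In the output, the entry for the missing observation simply repeats the level carried forward from the previous period.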
Examples
esmooth(estimate,fore=forecast,steps=12) tbill3mo 1980:1 2009:12
This uses data from 1980:1 through 2009:12 to fit a non-trending, non-seasonal
model with estimated coefficients, and then uses that model to produce forecasts for
2010:1 through 2010:12.
esmooth(alpha=1.0,gamma=0.3,trend=exponential,forecasts=ship_f, $
  steps=21) shipment 1994:1 2010:3
forecasts 2010:4 to 2011:12 using an exponential trend, non-seasonal model, with as-
signed parameters.
esmooth(trend=select,seasonal=select,smooth=canretsax) canrett
smooths Canadian Retail Sales using the best fitting model of the nine possibilities.
This saves the smoothed (seasonally adjusted) data in the series CANRETSAX. See
Section 6.3 of the User’s Guide for a complete version of this example.
Output
The following is the output from the Canadian example above. The table at the start
shows the selection of the model. For each of the nine possible models, the sum of
squared errors and the Schwarz (bic) criterion are shown. The Schwarz criterion
penalizes the models which have extra parameters. The chosen model is the one
which minimizes the value of Schwarz. Here, it is the linear trend with multiplicative
seasonal. The data are much clearer about the choice of seasonal model than trend
model, as exponential trend with multiplicative seasonal has a very similar value.
The second section shows the estimated coefficients for the chosen model.
Exponential Smoothing for Series CANRETT
Model Selection
TREND SEASONAL SumSquares SBC
None None 1919642978.894695 8531.21
None Additive 387239181.209608 7629.87
None Multiplicative 258039909.180180 7399.71
Linear None 1661329541.805498 8455.61
Linear Additive 139617130.682347 7057.79
Linear Multiplicative 76249709.192393 6714.82
Exponential None 1518459266.953026 8404.62
Exponential Additive 120020798.921642 6972.04
Exponential Multiplicative 76876443.721707 6719.46
Notes on SELECT
You should not automatically use SELECT for the options. ESMOOTH sees a series as
just a set of numbers. It has no knowledge of how the series is expected to behave—
you do! If it sees a general upward movement over the data set, it may very well se-
lect a trending model over a non-trending one. It can do this even for a series (such as
U.S. interest rates) where you probably would not choose to include a trend yourself.
With enough data, ESMOOTH will probably pick the model which is truly the best of
those available, but with small data sets, your judgment becomes very important.
Technical Information
The table below lists the error-correction forms of the models used for the different
combinations of SEASONAL (top) and TREND (left). We are using Gardner’s (1985)
notation:
S_t	smoothed level of the series
T_t	trend rate
I_t	seasonal index (factor)
e_t	period t forecast error
p	seasonal span
Seasonal index update: I_t = I_{t−p} + δ(1−α)e_t (additive); I_t = I_{t−p} + δ(1−α)e_t / S_t (multiplicative)
ESMOOTH uses the simplex method to estimate parameters, minimizing the sum of
e_t². It obtains initial values for I_t by a regression of the data on seasonal dummy vari-
ables and for T_t by a regression on a simple time trend.
Note, by the way, that while some programs limit the smoothing parameters to the
range of [0,1], the smoothing model is stable for a wider range than that (for instance,
[0,2] for α), and the optimal values for many economic series are, in fact, greater
than one. Thus ESMOOTH does not constrain the values to the [0,1] range by default. If
you do want to impose the [0,1] constraint, use the CONSTRAIN option.
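The error-correction recursions can be sketched in Python (not RATS code) for the linear-trend, additive-seasonal case. Initial values here are passed in directly rather than obtained from the preliminary regressions described above.

```python
def holt_winters_additive(x, p, alpha=0.3, gamma=0.3, delta=0.3,
                          s0=0.0, t0=0.0, seas0=None):
    """Error-correction form: S_t = S_{t-1} + T_{t-1} + alpha*e_t,
    T_t = T_{t-1} + alpha*gamma*e_t, I_t = I_{t-p} + delta*(1-alpha)*e_t,
    with one-step forecast S_{t-1} + T_{t-1} + I_{t-p}."""
    seas = list(seas0) if seas0 is not None else [0.0] * p
    S, T = s0, t0
    fitted = []
    for t, obs in enumerate(x):
        f = S + T + seas[t % p]              # one-step forecast
        e = obs - f                          # forecast error e_t
        S, T = S + T + alpha * e, T + alpha * gamma * e
        seas[t % p] += delta * (1.0 - alpha) * e
        fitted.append(f)
    return fitted, S, T, seas
```

With the level initialized at the data and a flat series, every forecast error is zero and the components never move, which is a quick sanity check on the recursions.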
Variables Defined
%NOBS Number of observations
%RSS Sum of squared errors
%ESALPHA	α, the level smoothing parameter
%ESGAMMA	γ, the trend smoothing parameter
%ESDELTA	δ, the seasonal smoothing parameter
See Also . . .
UG, Chapter 6 Univariate Forecasting
Wizard
The Time Series—VAR (Setup/Estimate) wizard provides an easy, dialog-driven
interface for defining and estimating var models.
Parameters
start end Estimation period. If you have not set a SMPL, this defaults to
the maximum range that ESTIMATE can use, taking into ac-
count the required lags.
Options
dfc=Degrees of freedom correction (Additional Topics, Section 1.4)
smpl=SMPL series or formula ("SMPL Option" on page RM–546)
spread=standard SPREAD option ("SPREAD Option" on page RM–547)
weight=series of entry weights ("WEIGHT option" on page RM–549)
These are the same as for LINREG, except that for each option the single value
applies to all equations in the system. If you need differing SPREADs, for instance,
you must use a set of LINREG instructions instead.
[print]/noprint
ftests/noftests
Use NOPRINT to suppress the standard regression output. For a vector autore-
gressive system, FTESTS prints a set of F–tests after each estimated equation.
This tests (for each regression separately) the block of included lags of the depen-
dent variables of the system. NOPRINT will also suppress the F–tests unless you
use the FTESTS option explicitly.
sigma/[nosigma]
cvout=Symmetric covariance matrix of residuals
Respectively, these compute and print, or compute and save, the covariance ma-
trix of the residuals. If you use SIGMA, ESTIMATE prints a covariance/correlation
matrix of the form shown in Section 2.1 of the User’s Guide. In some older ver-
sions, CVOUT was called OUTSIGMA. ESTIMATE will still recognize the old name.
Σ = (1/T) ∑_{t=1}^{T} u_t u_t′
This is the more general form for Σ, as using the degrees of freedom correction is not
appropriate for Bayesian or near–var models.
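The computation can be sketched in Python (not RATS code): average the outer products of the residual vectors, dividing by T with no degrees-of-freedom correction.

```python
# Sketch of Sigma = (1/T) * sum_t u_t u_t'
def residual_cov(u):
    """u: list of T residual vectors, each of length N (plain lists).
    Returns the N x N covariance matrix as nested lists."""
    T, N = len(u), len(u[0])
    sigma = [[0.0] * N for _ in range(N)]
    for ut in u:
        for i in range(N):
            for j in range(N):
                sigma[i][j] += ut[i] * ut[j] / T
    return sigma

sigma = residual_cov([[1.0, 0.0], [0.0, 1.0]])
```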
Variables Defined
%BETASYS stacked coefficient VECTOR
%LOGDET log determinant of the estimate of S (REAL).
%LOGL Normal log likelihood (REAL).
%NFREE free coefficients, including the covariance matrix (INTEGER)
%NOBS number of observations (INTEGER)
%NREG number of regressors in the first equation (INTEGER)
%NREGSYSTEM total number of regressors in the model (INTEGER)
%NVAR number of equations (INTEGER)
%SIGMA covariance matrix of residuals (SYMMETRIC)
%VARLAGSUMS	(for a var only) the N×N matrix I − ∑_{s=1}^{p} Φ_s, where Φ_s is the
matrix of var coefficients for lag s. In an ect var, these are
computed with the coefficients on the differenced dependent
variables.
%VECMALPHA	When using ECT, this is set to the α matrix (the loadings) for
the error correction model (N×r RECTANGULAR)
%VECMPI	When using ECT, this is set to the Π matrix for the error correc-
tion model (N×N RECTANGULAR)
%XX	the (X′X)⁻¹ matrix (SYMMETRIC)
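The lag-sum matrix stored in %VARLAGSUMS can be sketched in Python (not RATS code): start from the identity and subtract each lag coefficient matrix.

```python
# Sketch of %VARLAGSUMS: I - sum_{s=1..p} Phi_s
def var_lag_sums(phis):
    """phis: list of p coefficient matrices, each N x N (nested lists)."""
    N = len(phis[0])
    out = [[1.0 if i == j else 0.0 for j in range(N)] for i in range(N)]
    for phi in phis:
        for i in range(N):
            for j in range(N):
                out[i][j] -= phi[i][j]
    return out

lagsums = var_lag_sums([[[0.5, 0.1], [0.0, 0.3]]])   # one-lag, two-variable VAR
```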
These test the significance of the block of lags associated with each of the variables in
turn. In this one, for instance, the significance level of the block of CANCPINF lags in
the CANTBILL equation is .0262556. Note that in a VAR with more than two vari-
ables, an insignificant result should not (by itself) be interpreted as a lack of causal-
ity from the tested variable to the dependent variable—that requires a more formal
block exogeneity test.
Examples
system(model=canmodel)
variables usargdps canusxsr cancd90d canm1s canrgdps cancpinf
lags 1 to 4
det constant
end(system)
estimate(noprint,cvout=v,residuals=resblock) * 1997:4
sets up and estimates a six-variable var with four lags, saving the residual covariance
matrix in V and the residuals in the VECTOR[SERIES] RESBLOCK. The estimation
range runs from the earliest possible time through 1997:4.
linreg(define=eweq) w
# constant y z
*
system(model=vecm)
variables y z w
lags 1 2
det constant
ect eweq
end(system)
*
estimate
This estimates a model with cointegration (vecm), where the error correction equa-
tion is generated from an "Engle-Granger" regression.
Dependent Variable Y
Mean of Dependent Variable 0.0103956122
Std Error of Dependent Variable 0.3270248297
Standard Error of Estimate 0.3151203397
Sum of Squared Residuals 9.2349770506
Durbin-Watson Statistic 1.9036
See Also . . .
UG, Chapter 7 Vector Autoregressions
KALMAN Executes the Kalman filter
SYSTEM Sets up a vector autoregression
SPECIFY Sets the prior for a var
EQUATION Defines a single equation
Description
You must dimension the array before using EWISE. For each I or I,J combination
within the bounds of the array, EWISE carries out the indicated calculation. The
“function” can include multiple expressions, separated by commas—the values of the
array will be set according to the last expression in the list. This can be useful when
you need to do intermediate computations to produce the final values.
You can apply EWISE to arrays of any data type (including arrays of arrays).
You can use any INTEGER variable in place of I and J. However, I and J are already
defined as INTEGER’s (primarily for use in EWISE); any other variable would need to
be DECLARE’d.
Note that you can only use EWISE if you want to set all the entries of an array. An
instruction like
ewise a(i,1)=....
(intending to set only column 1) is not permitted. Use a DO loop or the %DO function
for this type of operation. For example:
do i=1,%rows(a)
compute a(i,1)=myseries1(i)
end do i
or
compute %do(i, 1, %rows(a), a(i,1)=myseries1(i))
Notes
Anything that you can do with EWISE, you can also do with DO loops. If the DO loops
would just loop over a COMPUTE instruction, use the more efficient EWISE. We recom-
mend DO loops in several situations:
• The EWISE expression gets so complex (for instance, if it requires several
nested %IF functions) that you cannot easily tell what it is doing.
• You can set up the DO loops to run over a reduced set of entries, where EWISE
must set the entire array.
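The equivalence between EWISE and DO loops can be sketched in Python (not RATS code): EWISE a(i,j)=expression fills every (i,j) cell exactly as a doubly-nested loop would.

```python
# Python analogue of: ewise a(i,j)=(i==j), which fills an identity matrix.
rows, cols = 3, 3
a = [[0.0] * cols for _ in range(rows)]
for i in range(1, rows + 1):          # RATS indexes arrays from 1
    for j in range(1, cols + 1):
        a[i - 1][j - 1] = 1.0 if i == j else 0.0
```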
Wizard
In the Statistics—Regression Tests wizard, select the Exclusion Restrictions radio
button.
Supplementary Card
The supplementary card lists the collection of variables from the previous regression
which you want tested (as a block). List them in regression format.
Options
[print]/noprint
NOPRINT suppresses the printing of the test information. This is useful if you
only need the %CDSTAT or %SIGNIF variable, and don’t need to see the output.
all/[noall]
Use ALL to test whether all of the coefficients can be excluded. Omit the supple-
mentary card if you use this. (This was called WHOLE in version 6 or earlier.)
form=f/chisquared
This determines the form of the test statistic used. By default, rats will select
the appropriate form based upon the estimation technique used last. You can use
FORM to manually select a distribution if you have made changes to the regres-
sion that require a different distribution, such as altering the %XX matrix in a
way which incorporates the residual variance into %XX. See Section 3.3 in the
User’s Guide.
Technical Information
These are done as Wald tests, with formulas described in Section 3.3 of the User’s
Guide. The main test statistic is usually shown as an F, but will be shown as a chi-
squared when EXCLUDE is applied to estimates from a DDV or LDV instruction, or from
any instruction for which the ROBUSTERRORS option was used during estimation. You
can also control the distribution yourself using the FORM option.
For F tests with one degree of freedom, EXCLUDE will report a two-tailed t test in
addition to the F test. For chi-squared tests with more than one degree of freedom,
EXCLUDE will report an F with an infinite number of denominator degrees of freedom
(that is, the chi-squared statistic divided by the numerator degrees of freedom) in ad-
dition to the chi-square.
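The two report conversions described above can be sketched in Python (not RATS internals): an F with one numerator degree of freedom corresponds to a two-tailed t via |t| = √F, and a chi-squared with q degrees of freedom corresponds to an F with infinite denominator degrees of freedom via F = χ²/q.

```python
import math

def t_from_F(F):
    """|t| equivalent to an F statistic with 1 numerator df."""
    return math.sqrt(F)

def F_from_chi2(chi2, q):
    """F (with infinite denominator df) equivalent to a chi-squared with q df."""
    return chi2 / q
```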
Examples
linreg gdp
# constant m1{ -4 to 8 }
exclude(title="Sims Causality Test")
# m1{-4 to -1}
exclude
# m1{ 6 to 8 }
This regresses GDP on 4 leads, current and 8 lags of M1. The first EXCLUDE tests the
joint significance of the leads, and labels the test as “Sims Causality Test.” The sec-
ond tests the joint significance of lags 6, 7 and 8.
linreg(robusterrors) cge
# constant fge ige
exclude
# fge ige
This tests the significance of the regressors with the covariance matrix of the regres-
sion corrected for possible heteroscedasticity using ROBUSTERRORS.
Variables Defined
%CDSTAT the computed test statistic (REAL)
%SIGNIF the marginal significance level (REAL)
%NDFTEST (numerator) degrees of freedom for the test (INTEGER)
See Also . . .
TEST Tests for equality with specific constants.
RESTRICT Tests more general linear restrictions.
MRESTRICT Tests more general linear restrictions (using matrices).
Parameters
procname Name of the procedure invoked. You must use the full name,
not an abbreviation. Before rats can EXECUTE a procedure, it
must compile the code for the procedure itself. See “Compiling
Procedures” below for details.
parameters List of actual parameters. These pass information (series, ar-
rays, scalars, etc.) to matching formal parameters listed on the
PROCEDURE instruction which defined the procedure.
Options
Procedure options (defined with the OPTION instruction) are selected in the same
fashion as are options for any standard rats instruction. You can abbreviate the
option name, and the choices for a CHOICE option, to three or more letters. For the
three option types:
• SWITCH options use option name alone for “On” (translated as 1), and NOoption
name for “Off” (translated as 0). That is, if you define an option PRINT, the user
would use PRINT or NOPRINT to turn it on or off.
• CHOICE options use option name=keyword for choice. For instance, if you define
a CHOICE option TYPE with FLAT and TENT as the choices, the user would select
the type by means of TYPE=FLAT or TYPE=TENT.
• Value options use option name=variable or expression. rats handles value
options in the same fashion as procedure parameters of the same type.
Compiling Procedures
Before you can use a procedure, you must execute, or “compile”, the code that defines
the procedure. If the procedure code is included in the same file as the program being
executed, you just need to make sure the procedure code precedes the EXECUTE com-
mand that calls the procedure.
More commonly, the procedure code is stored on a separate file. In that case, you can:
• compile the procedure explicitly, using a SOURCE instruction or by including the
file in your “procedure library” using the File—Preferences menu operation, the
ENVIRONMENT instruction, or the /PROC command-line switch, or
• let rats search for the file. Given a procedure called “PROCNAME”, rats will
search for a file called PROCNAME.SRC. It will check (in order) the "parent"
directory if one procedure calls another, your “Procedure Directories” (defined in
the File—Preferences), the current default directory, and the directory containing
the rats executable file.
Notes
If you have nested procedures, that is, if one procedure contains an EXECUTE for an-
other procedure, rats needs to compile the code for the inner procedure first.
When control is returned from the procedure by a RETURN or END instruction, execu-
tion of the main program continues with the instruction that follows EXECUTE.
Undefined Parameters/Options
If you either omit a parameter or use * in its place, the procedure will treat it as if
you put a * wherever that parameter occurs. If it is used in a calculation, that calcu-
lation will be skipped. If certain parameters must be assigned values for your proce-
dure to make sense, you should use the %DEFINED function to check this and exit if it
returns 0.
proc quickie matpar
type symmetric *matpar
if .not.%defined(matpar) {
disp "Syntax: @quickie matrix"
return
}
compute matpar=inv(matpar)
end
Example
procedure specfore series start end forecast
type series series
type integer start end
type series *forecast
*
option integer diffs 0
option integer sdiffs 0
option switch const 1
option choice trans 1 none log root
shows the parameters and options for the procedure SPECFORE. The following is an
example of this being executed:
@specfore(diffs=1,noconst,trans=root) longrate $
2019:1 2019:12 flong
LONGRATE is the SERIES parameter, 2019:1 and 2019:12 are START and END, and the
forecasts are returned to the series FLONG. DIFFS is equal to one, CONST will be zero
(the effect of the use of NOCONST), TRANS will be 3—the value for the ROOT choice.
See Also . . .
UG, Chapter 15 General information about rats procedures.
SOURCE	Runs a set of rats instructions on a text file. Use SOURCE for
making available the procedures distributed with rats.
PROCEDURE Sets up a procedure.
TYPE Sets data types for formal parameters.
OPTION Defines options for a procedure.
LOCAL Declares local variables and arrays in a procedure.
%DEFINED(x) Returns 1 if X (a parameter or option) was defined and 0 oth-
erwise.
Wizard
From the Statistics—Univariate Statistics wizard, choose "Extreme Values".
Parameters
series Series you are analyzing.
start end Range to use. If you have not set a SMPL, this defaults to the
defined range of series.
Options
[print]/noprint
title="title for output" ["Extreme Values for Series xxx"]
You can use NOPRINT to suppress the output.
Sample Output
ext gdpgrowth
Parameters
cseries The complex series to transform.
start end The range of entries to transform. By default, the defined range
of cseries. See the comment below.
newcseries New complex series for the result. By default, same as
cseries.
newstart The starting entry for the result. For FFT, this is the entry
where frequency 0 is placed. By default, same as start.
Description
The Finite Fourier transform of the series X(t), t = 1,…,T, is
(1)  X(2πj/T) = ∑_{t=1}^{T} X(t) exp(−2πi j(t−1)/T);  j = 0,1,…,T−1
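Formula (1) can be transcribed directly into Python (an O(T²) sketch, not the fast FFT algorithm RATS actually uses):

```python
import cmath

def fourier_transform(x):
    """Direct evaluation of formula (1); the Python index t runs 0..T-1
    and plays the role of (t-1) in the formula."""
    T = len(x)
    return [sum(x[t] * cmath.exp(-2j * cmath.pi * j * t / T)
                for t in range(T))
            for j in range(T)]

X = fourier_transform([1.0, 1.0, 1.0, 1.0])   # X[0] = 4, the rest ~ 0
```

Frequency 0 is the first entry of the output, matching the placement described under newstart.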
Usage
You will rarely need to use any parameters other than cseries and newcseries.
The most common forms of the instructions are
fft cseries
fft cseries / newcseries
the first transforming cseries onto itself, the second transforming it to a new series.
Examples
The following passes series X through a high-pass filter, zeroing out a band around
0 frequency. The frequencies which aren’t zeroed are from π/2 (1/4 of the number of
ordinates) to 3π/2.
freq 2 256
rtoc
# x
# 1
fft 1
cset 1 = %z(t,1)*(t>64.and.t<=192)
ift 1
See Also . . .
TRFUNC computes a transfer function for a filter.
Wizard
You can use the Data/Graphics—Filter/Smooth wizard to do various filtering opera-
tions. Choose which technique to apply using the "Filter Type" drop down.
Parameters
series Series to transform
start end Range of entries to transform. You must set start to allow for
the lags in the filter and end to allow for the leads, if any, un-
less you use the TRUNCATE option.
If you have not set a SMPL, this defaults to the maximum range
allowed by the defined portion of series. This range is from
(series start) + (highest lag) to (series end) – (highest lead).
newseries Resulting series. By default, newseries=series.
newstart The starting entry for newseries. By default,
newstart=start.
Options
type=[general]/flat/henderson/spencer/hp/hpdetrend/lagging/centered
width=base window size (for TYPE=FLAT,HENDERSON,SPENCER,LAGGING)
span=same as WIDTH (used in older versions of rats)
by=repetitions of simpler filter, used with TYPE=FLAT
tuning=tuning value for TYPE=HP or TYPE=HPDETREND [depends on
CALENDAR]
TYPE indicates the type of filter to be used. See the "Technical Information" for
details on each of these.
TYPE=GENERAL (the default) allows any pattern—if used, it requires use of either
the supplementary cards or the EQUATION option to indicate the form of the filter.
With TYPE=GENERAL, the zero lag is assumed to have a coefficient of 1.0 unless
included explicitly.
TYPE=FLAT is a centered flat moving average, with width given by WIDTH. It can
also be used for an N×M convolution of flat filters by combining the WIDTH and
BY options.
TYPE=HENDERSON gives a centered Henderson filter with width given by the
WIDTH option. This filter is used extensively by the X11 seasonal adjustment
procedure.
TYPE=HP implements the Hodrick-Prescott filter. You can use the TUNING option
to supply a value for the tuning parameter for the hp filter. The default values
for this depend on the frequency of the current CALENDAR setting (100 for annual
data, 1,600 for quarterly, and 14,400 for monthly).
TYPE=HPDETREND produces the residuals from the Hodrick-Prescott filter (gener-
ally interpreted as the "cycle"). The TUNING option applies as it does for TYPE=HP.
TYPE=SPENCER gives the Spencer moving average filter. WIDTH must be 15 or 21
(the default is 15).
TYPE=LAGGING gives a filter on a list of lags. If you use the WEIGHTS option, it
will use those values. Otherwise it will put equal weights on the lags 0 through
L–1, given WIDTH=L.
TYPE=CENTERED is the same as TYPE=FLAT, except that you can supply filter
weights with the WEIGHTS option. Note that for TYPE=CENTERED, you only need
to supply the weights for half the window, starting with 0 lag and working out—
the same lag weights will automatically be used for the leads.
remove=mean/trend/seasonal/both
Removes from series, by linear regression: the mean (MEAN); the mean and trend
(TREND); the seasonals and mean, using dummies (SEASONAL); or the trend and
seasonals (BOTH). This is not technically a linear filter, but serves a similar
purpose to many of the other filters.
weights=VECTOR of filter weights (TYPE=LAGGING,CENTERED)
unitsum/[nounitsum] (Used with WEIGHTS)
WEIGHTS provides the filter weights used with the TYPE=LAGGING or
TYPE=CENTERED options. Use UNITSUM if you want FILTER to normalize the
WEIGHTS to sum to 1—this allows you to just supply a shape using WEIGHTS
and let FILTER figure out the normalization. For example: TYPE=CENTERED,
WEIGHTS=||5.0,3.0,2.0||,UNITSUM gives filter coefficients: 2/15, 3/15,
5/15, 3/15, 2/15.
truncate/[notruncate]
extend=zeros/repeat/rescale/msereweight
If you set start and end manually, you normally need to allow for lags and
leads. For example, if your filter includes lags, you could not start at entry one,
since the lag terms would refer to non-existent entries. TRUNCATE and EXTEND
offer two alternative ways to handle out-of-sample values.
With TRUNCATE, you can specify any start and end values. rats will assume
that all unavailable entries outside the data range are equal to 0 for the purposes
of computing the filtered series (Missing values in the middle of the data are still
treated as missing).
Supplementary Cards
You need two supplementary cards unless you use one of the pre-defined filter types,
or the EQUATION option. Note that TYPE=CENTERED or LAGGING with input WEIGHTS
will usually be simpler than using these.
1. The first card lists the lags in the filter. Represent leads as negative numbers. In
filters which involve lags only, lag 0 gets a coefficient of 1.0 unless you set it explic-
itly.
2. The second card lists the filter coefficients. They correspond, in order, to the lags
on the first card.
Technical Information
For TYPE=FLAT, if the window size is odd (WIDTH=2N+1), the filtered series is
x̄_t = (1/(2N+1)) (x_{t−N} + x_{t−N+1} + … + x_{t+N−1} + x_{t+N})
If it’s even (WIDTH=2N), the output is
x̄_t = (1/2N) (0.5 x_{t−N} + x_{t−N+1} + … + x_{t+N−1} + 0.5 x_{t+N})
This is generally used with seasonal data where the WIDTH is the length of the sea-
sonal, and is also known as a 2×N filter.
If TYPE=FLAT and you use both the WIDTH and BY options, the result is the same
effect as a flat filter of width given by the WIDTH applied to the output of a flat filter
of width given by the BY option. This concentrates the filter a bit more towards the
center than a flat filter of the same overall length.
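The odd and even flat-window formulas can be sketched in Python (not RATS code); entries that would need out-of-range data are left unset, as FILTER does without TRUNCATE or EXTEND.

```python
def flat_filter(x, width):
    """Centered flat moving average. An odd width 2N+1 gives a simple
    average; an even width 2N puts half weight on the two end terms
    (the 2 x N filter used with seasonal data)."""
    T, N = len(x), width // 2
    out = [None] * T
    for t in range(N, T - N):
        if width % 2:                                  # WIDTH = 2N+1
            out[t] = sum(x[t - N:t + N + 1]) / width
        else:                                          # WIDTH = 2N
            out[t] = (0.5 * x[t - N] + sum(x[t - N + 1:t + N])
                      + 0.5 * x[t + N]) / width
    return out
```

With an even width equal to the seasonal span, each calendar period gets the same total weight, which is why the seasonal is flattened.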
A Henderson filter is a centered filter whose lag coefficients are selected to minimize
the sum of squared third differences of the lag coefficients for the class of symmetric
lag polynomials which pass third order polynomials through without change. It’s
used within the X11 adjustment procedure to estimate local trends.
A Spencer filter is similar to the Henderson filter, but allows only the two widths (15
and 21).
The Hodrick-Prescott Filter (Hodrick and Prescott, 1997) computes the estimated
growth component g of a series x that minimizes the sum over t of:
(x_t − g_t)² + λ (g_t − 2g_{t−1} + g_{t−2})²
The value of λ is set by the TUNING option. This is solved internally by Kalman
smoothing. See the HPFILTER.RPF example for additional technical details.
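The minimization has the closed form (I + λD′D)g = x, where D is the (T−2)×T second-difference matrix. A Python sketch (a dense solve for short series, not the Kalman-smoother implementation RATS uses):

```python
def hp_trend(x, lam=1600.0):
    """Hodrick-Prescott growth component: solve (I + lam*D'D) g = x by
    Gaussian elimination (A is symmetric positive definite, so no
    pivoting is needed). Fine for short series only."""
    T = len(x)
    A = [[1.0 if i == j else 0.0 for j in range(T)] for i in range(T)]
    for r in range(T - 2):                   # accumulate lam * D'D
        d = {r: 1.0, r + 1: -2.0, r + 2: 1.0}
        for i, di in d.items():
            for j, dj in d.items():
                A[i][j] += lam * di * dj
    b = list(x)
    for k in range(T):                       # forward elimination
        for i in range(k + 1, T):
            m = A[i][k] / A[k][k]
            for j in range(k, T):
                A[i][j] -= m * A[k][j]
            b[i] -= m * b[k]
    g = [0.0] * T
    for i in range(T - 1, -1, -1):           # back substitution
        g[i] = (b[i] - sum(A[i][j] * g[j]
                           for j in range(i + 1, T))) / A[i][i]
    return g
```

A useful check: for exactly linear data the second differences vanish, so the trend reproduces the data.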
Missing Values
If any of the values required to compute an entry are missing, FILTER sets the entry
in the output series to missing. Exception: data that are unavailable at the beginning
or end of the filter are treated differently when using TRUNCATE or EXTEND options.
Examples
data 1955:1 2010:6 ip
filter(type=flat,width=12) ip 1955:7 2009:12 trdcycle
This sets TRDCYCLE_t = (0.5·IP_{t−6} + IP_{t−5} + … + IP_{t+5} + 0.5·IP_{t+6}) / 12. Note that the
range is set to run from (1955:1)+6 to (2010:6)–6. This use of an even window size
means that each calendar month will get the same weight in the average, and thus a
seasonal will tend to be flattened.
filter(type=hpdetrend,tuning=1600) y / detrend
makes DETREND equal to the difference between Y and its Hodrick-Prescott trend
estimate.
filter(type=henderson,width=23,extend=mse) data / tc
does a 23-term Henderson with minimum MSE revision handling of the end points.
Using Vectors
For long and complex filters, it is probably simplest to put the filter coefficients into
a vector. The supplementary cards on FILTER will accept a vector of integers for the
list of lags and vector of real numbers. For instance, the following generates a
fifty-lag expansion of a fractional difference filter (Section 14.10 of the User's
Guide). Note that DIFFERENCE has an option for a specific form of fractional
differencing.
dec vect[int] lags(50)
dec vect coeffs(50)
ewise lags(i)=i
ewise coeffs(i)=%binomial(.7,i)*(-1)^i
filter(truncate) series / fracdiffs
# lags
# coeffs
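The weights built by the EWISE above can be sketched in Python (not RATS code): the expansion of (1−L)^d has w_0 = 1 and w_i = w_{i−1}·(i−1−d)/i, which equals (−1)^i·binomial(d, i), the %binomial(.7,i)*(-1)^i used in the RATS code.

```python
def fracdiff_weights(d, n):
    """Lag weights w_0..w_n of the fractional difference filter (1-L)^d,
    via the recursion w_i = w_{i-1} * (i - 1 - d) / i."""
    w = [1.0]
    for i in range(1, n + 1):
        w.append(w[-1] * (i - 1 - d) / i)
    return w

w = fracdiff_weights(0.7, 2)   # leading weights for d = 0.7
```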
1. y_t = A(L) y_t (univariate autoregression)
2. y_t = B(L) x_t (univariate distributed lag)
The only other variable allowed in the equation is the CONSTANT. FILTER uses
1 − A(L) for type 1 and B(L) for type 2. For instance, the equation
y_t = 1.3 y_{t−1} − 0.6 y_{t−2} is converted into the filter 1 − 1.3L + 0.6L²
The variables in the equation do not have to be related to the series you are filter-
ing. For instance, the equation in the next example is an autoregression on residuals,
after which FILTER is applied to the dependent variable and regressors.
Example
This filters the series LOGGPOP and LOGPG with a linear filter derived from a second
order ar:
linreg(define=ar2) resids
# resids{1 2}
filter(equation=ar2) loggpop / fgpop
filter(equation=ar2) logpg / fpg
Parameters
max/min/root Choose which of the operations you wish to do.
expression This is the (real-valued) expression that FIND is optimizing.
This will usually be a variable set within the block of instruc-
tions. You must declare any variables in this expression which
are set within the instruction block.
Options
parmset=PARMSET to estimate [default internal]
This tells which PARMSET is to be estimated by the FIND. If you don’t provide a
PARMSET, rats uses the last one created by NONLIN (Section 4.6, User’s Guide).
method=[simplex]/genetic/bfgs/annealing/ga/grid
iterations=iteration limit [100]
subiterations=subiteration limit [30]
cvcrit=convergence limit [.00001]
trace/[notrace]
These control the optimization method. Of these, only BFGS makes assumptions
about differentiability (it requires the function be twice continuously differen-
tiable) and is the only one capable of producing standard errors for the estimates.
SIMPLEX requires continuity, while the others have no special requirements. GRID
is specific to FIND and does a grid search over a range of values provided by the
GRID option.
ITERATIONS sets the maximum number of iterations, SUBITERS sets the maxi-
mum number of subiterations, CVCRIT the convergence criterion. TRACE prints
the intermediate results. The "iteration" counts for simplex, genetic, annealing
and GA are designed to give a similar amount of calculation to what an iteration
in BFGS requires.
pmethod=simplex/genetic/bfgs/grid
piters=number of PMETHOD iterations to perform [none]
Use PMETHOD and PITERS if you want to use a preliminary estimation method to
refine your initial parameter values before switching to one of the other estima-
tion methods. For example, to do 20 simplex iterations before switching to BFGS,
you use the options PMETHOD=SIMPLEX, PITERS=20, and METHOD=BFGS.
stderrs/[nostderrs]
With this you can decide whether or not to show the computed standard errors
of the coefficients. This is only an option if you use METHOD=BFGS as the other
methods don't assume enough about the function to allow standard errors calcu-
lations. If you use METHOD=BFGS, you can use STDERRS to make the output show
standard errors, t–statistics and significance levels, just like other estimation
instructions. However, if the function that you are maximizing isn’t a likelihood
or quasi-likelihood function, the numbers reported (computed as described in Sec-
tion 4.5 of the User’s Guide) are unlikely to be interpretable as standard errors.
[print]/noprint
vcv/[novcv]
title=”description of optimization being done”
These are the same as for other estimation instructions (see LINREG for details).
VCV only works with METHOD=BFGS and only if you use the STDERRS option.
Statement Block
FIND is actually very similar to the instruction LOOP (User’s Guide, Section 15.1).
A function evaluation will execute all instructions between the FIND and the END
FIND. A BREAK instruction within the statement block will cause FIND to abort esti-
mation.
Technical Information
FIND ROOT actually minimizes the absolute value of the expression. This allows the
optimization algorithms to be used, but it can be thrown off if initial values are near
a local minimum of the function. FIND ROOT is provided as a convenience, but is not
a dedicated “root finder.” Make sure you check the function value (either in the out-
put or in the %FUNCVAL variable) to see if it is near zero before using the results.
Notes
FIND can be very slow. The instructions it controls will have to be executed once for
each function evaluation, and it may take several hundred function evaluations to
reach convergence even with just four or five free parameters.
Examples
This is a trivial example which shows the basic steps required to use FIND. It finds
the minimum of x² + x + 3
nonlin x
compute x=0.5
find minimum x^2+x+3
end find
The NONLIN sets X as the only parameter. Notice that you don’t have to bracket the
minimum: a single point is enough. This example is so trivial that we don’t need to do
any computations in the statement block. You still need the END FIND, however.
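The same minimization can be sketched with a derivative-free search in Python (not RATS code). A one-dimensional golden-section search stands in for the simplex method FIND uses by default, though unlike FIND it does need a bracketing interval rather than a single starting point.

```python
import math

def golden_min(f, a, b, tol=1e-6):
    """Golden-section search for the minimum of a unimodal f on [a, b].
    Derivative-free: f need only be continuous."""
    invphi = (math.sqrt(5.0) - 1.0) / 2.0
    c, d = b - invphi * (b - a), a + invphi * (b - a)
    while b - a > tol:
        if f(c) < f(d):
            b, d = d, c                      # minimum lies in [a, d]
            c = b - invphi * (b - a)
        else:
            a, c = c, d                      # minimum lies in [c, b]
            d = a + invphi * (b - a)
    return (a + b) / 2.0

xmin = golden_min(lambda x: x**2 + x + 3.0, -5.0, 5.0)   # minimum at x = -0.5
```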
The code below does a preliminary grid search for C (estimating the other parame-
ters) then estimates a final model starting at the best value found by the grid search.
stats(fractiles) y
compute grid=%seqa(%fract01,(%fract99-%fract01)/99.0,100)
nonlin(parmset=conly) c
nonlin(parmset=gammaonly) gamma
find(parmset=conly,method=grid,grid=||grid||) min %rss
nlls(frml=star,parmset=regparms+gammaonly,noprint) y
end find
nonlin(parmset=starparms) gamma c
nlls(frml=star,parmset=regparms+starparms,print) y
The next example is more complex. It determines an optimal scale factor for a logistic
in creating draws for a truncated Normal. This finds the minimum of a maximum,
and thus requires one FIND instruction inside another, and two PARMSET’s—one be-
ing estimated by the outer FIND, one by the inner one.
The inner FIND has a NOPRINT option. Without this, you’ll get an output from the
estimation for each trial value of the outer FIND. In this case, we are using the slower
METHOD=GENETIC because of a (justified) fear that the inner FIND will end up at a
local and not global maximum for some of the trial values for SFAC.
Note that the inner FIND has an IF which sets the value to %NA if the scale factor
goes out of range. This method allows you to steer the optimization away from re-
stricted regions.
nonlin(parmset=xset) xm
nonlin(parmset=sset) sfac
*
compute xm=1.0,sfac=1.0
*
compute value=0.0
find(parmset=sset,trace) min value
find(parmset=xset,method=genetic,noprint) max value
if sfac<=0.0
compute value=%na
else
compute value = -.5*xm^2+xm/sfac+ $
log(sfac)+2*log(1+exp(-xm/sfac))
end find
end find
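The nested-FIND pattern (an outer minimization wrapped around an inner maximization) can be illustrated outside rats. This Python sketch uses a made-up toy objective, not the logistic problem above, and grid search in place of the derivative-based and genetic methods:

```python
# Outer: minimize over s; inner: maximize over x of g(x, s).
# Toy objective: g(x, s) = -(x - s)**2 + s**2.
# The inner max (over x) is attained at x = s, giving s**2;
# the outer min (over s) is then attained at s = 0, giving 0.
def g(x, s):
    return -(x - s)**2 + s**2

grid = [i / 10.0 for i in range(-20, 21)]   # -2.0 .. 2.0 in steps of .1

def inner_max(s):
    return max(g(x, s) for x in grid)       # plays the role of the inner FIND

best_s = min(grid, key=inner_max)           # plays the role of the outer FIND
print(best_s, inner_max(best_s))            # 0.0 0.0
```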
Output
This is a typical output from a FIND instruction. Because there is no connection to
“data” in the basic instruction, there aren’t any goodness of fit statistics or the like.
Standard errors, t–statistics and significance levels are included only if you use
METHOD=BFGS and the STDERRS option and you should do that only if you are quite
sure that the use of the inverse Hessian as an estimate of the covariance matrix is
proper (see Section 4.5 in the User’s Guide).
FIND Optimization - Estimation by Genetic
Convergence in 43 Iterations. Final criterion was 0.0000061 < 0.0000100
Function Value 0.99668260
Variable Coeff
*****************************************
1. XM -0.999996023
Variables Defined
%FUNCVAL Final value of expression (REAL)
%BETA The estimated parameter values (VECTOR)
%NFREE The number of free parameters (INTEGER)
%CONVERGED = 1 or 0. Set to 1 if the process converged, 0 if not.
%CVCRIT Final convergence criterion. This will be equal to zero if the
subiterations limit was reached on the last iteration (REAL)
%LAGRANGE VECTOR of Lagrange multipliers if estimating with constraints
and METHOD=BFGS.
Parameters
datatype can indicate an array or array of arrays of integer, real, com-
plex, label or string values.
array(dims) the name of the array you want to create, followed by the di-
mensions of the array in parentheses. Only a single array can
be initialized on a FIXED instruction. If you have more than one
array to create, use multiple FIXED instructions.
Text Card
values for the array This is a collection of numbers or strings to fill the elements of
the array. They can be separated by blanks, tabs or commas,
and can cover more than one line. The array is filled by rows.
If a row has been filled, and there are more values on the input
line, they will be used for the next row.
These must be literal values; expressions are not allowed.
If you use an array of arrays, such as:
fixed vect[rect] cval(5)(3,3)
CVAL(1) will be filled first, followed by CVAL(2), etc.
Description
FIXED can only be used inside a PROCEDURE or FUNCTION. It is used to set the values
for an array whose size and values are fixed; for instance, a lookup table for critical
values. The arrays declared using FIXED are considered to be local to the procedure.
Note that no expressions can be used either in the dimensions or the values. If you
need something more general, use the combination of LOCAL, DIMENSION and ENTER
or COMPUTE.
Examples
fixed vect[int] dfssize(6)
25 50 100 250 500 9999
fixed vect dfsigval(8)
.01 .025 .05 .10 .90 .95 .975 .99
fixed vect[rect] dftcval(3)(6,8)
-2.66 -2.26 -1.95 -1.60 0.92 1.33 1.70 2.16
-2.62 -2.25 -1.95 -1.61 0.91 1.31 1.66 2.08
-2.60 -2.24 -1.95 -1.61 0.90 1.29 1.64 2.03
-2.58 -2.23 -1.95 -1.62 0.89 1.29 1.63 2.01
-2.58 -2.23 -1.95 -1.62 0.89 1.28 1.62 2.00
-2.58 -2.23 -1.95 -1.62 0.89 1.28 1.62 2.00
Parameters
array RECTANGULAR array to set. You must dimension this array at
some point before the FMATRIX instruction.
startrow The starting row for the process. By default, startrow=1.
startcolumn The column in startrow for the coefficient on the zero lag of
the filter. By default, startcolumn=1.
endrow The ending row for the process. By default, the last row in
array.
Description
You set the form of the filter using the options.
FMATRIX puts the zero lag of the filter into startcolumn of startrow. The coeffi-
cient for lag m is m columns to the right of startcolumn. Leads (negative lags) are
to the left. In the next row, the coefficients are laid out beginning one column further
to the right. This continues until endrow.
FMATRIX skips filter coefficients if they would end up outside the bounds of the array.
Thus, you have to be careful about your choice of dimensions so the filter doesn’t get
truncated unintentionally.
Options
DIFFERENCES, SDIFFERENCES and SPAN (shown on the next page) set differencing
filters as in the instruction DIFFERENCE. The EQUATION option is identical to that
for the instruction FILTER—see the discussion there. The third way to specify the
filter is to use a pair of supplementary cards, again as described with the instruction
FILTER.
differences=number of differences
sdifferences=number of seasonal differences
span=seasonal span [CALENDAR seasonal]
[zeros]/nozeros
The ZEROS (the default) option sets to zero all entries in array, other than those
set as filter coefficients. If you use NOZEROS, these other entries keep the values
that they had before FMATRIX. If you need several FMATRIX instructions to set up
the array, use NOZEROS on FMATRIX instructions after the first.
Examples
dec rect amatrix(4,7)
fmatrix(diffs=3) amatrix
creates the matrix
1.0 -3.0 3.0 -1.0 0.0 0.0 0.0
0.0 1.0 -3.0 3.0 -1.0 0.0 0.0
0.0 0.0 1.0 -3.0 3.0 -1.0 0.0
0.0 0.0 0.0 1.0 -3.0 3.0 -1.0
The 1, -3, 3, -1 entries are the coefficients of a third order differencing operator. With
each new row, these coefficients move over one column.
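The layout FMATRIX produces here can be reproduced by hand; this Python sketch (illustrative, not RATS code) builds the same 4x7 matrix from the third-difference coefficients 1, -3, 3, -1:

```python
# Coefficients of (1 - L)^3 are the signed binomials 1, -3, 3, -1.
# FMATRIX places them one column further to the right in each row.
coeffs = [1.0, -3.0, 3.0, -1.0]
rows, cols = 4, 7
amatrix = [[0.0] * cols for _ in range(rows)]
for r in range(rows):
    for lag, c in enumerate(coeffs):
        amatrix[r][r + lag] = c          # zero lag in column r, lag m to its right

for row in amatrix:
    print(row)
```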
dec rect s(5,5)
fmatrix s 1 1
# -1 0 1
# .25 .5 .25
The 1 1 parameters mean that the lag 0 coefficient goes in the first column in the
first row:
.50 .25 .00 .00 .00
.25 .50 .25 .00 .00
.00 .25 .50 .25 .00
.00 .00 .25 .50 .25
.00 .00 .00 .25 .50
Note how the filter coefficients are truncated in the first and last rows.
See Also . . .
DIFFERENCE Differences or seasonally differences a series.
FILTER General linear filter.
Parameters
cseries Source complex series to transform.
start end Range of entries to process. By default, the defined range of
cseries.
newcseries Series for the result. For FOLD, this must be different from
cseries.
newstart Starting entry for the result. By default, same as start.
Option
factor=reduction factor [1]
reduction factor is the ratio of sampling rates: 3 (=12/4) for monthly to quar-
terly. FOLD transforms N frequencies to N/reduction factor frequencies.
Description
FOLD sets each frequency in newcseries equal to the sum of all the frequencies for
cseries aliased with it at the lower sampling rate.
Example
*
* nords is the number of ordinates,
* nobs is the number of actual data points
*
compute nords=768
freq 5 nords
rtoc 1966:1 2017:11
# ipx
# 1
compute nobs=%nobs
fft 1
cmult(scale=1./(2*%pi*nobs)) 1 1
fold(factor=3) 1 1 nords 2
Series 2 has 768/3=256 entries, which are entries 1+257+513, 2+258+514,... of series
1.
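The aliasing arithmetic FOLD performs is easy to verify; this Python sketch (0-based indexing and stand-in data, not RATS code) computes what FACTOR=3 produces for a 768-ordinate series:

```python
# FOLD with factor 3: entry j of the result is the sum of source entries
# j, j+256, j+512 (0-based here; RATS series are 1-based).
nords, factor = 768, 3
nnew = nords // factor               # 256 output frequencies
source = list(range(nords))          # stand-in for the complex series
folded = [sum(source[j + k * nnew] for k in range(factor))
          for j in range(nnew)]

print(folded[0])   # 0 + 256 + 512 = 768
```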
Wizards
The Time Series—VAR (Forecast/Analyze) wizard can forecast any MODEL, and thus
provides access to most of the functionality of FORECAST. If you only need to forecast
a single equation, you can use the Time Series—Single-Equation Forecasts wizard.
Parameters
equations Number of equations in the system. Omit this when using the
MODEL option.
Two additional parameters (steps and start) used in versions before 7 have been
replaced by the STEPS, FROM, and TO options described below. rats will still
recognize these older parameters.
Supplementary Cards
There is one supplementary card for each equation, unless you use the MODEL option.
The other two types of supplementary cards shown above are only used with special
options for adding shocks: INPUT and PATHS.
equation The equation name or number.
forecasts (Optional) The series for the computed forecasts of the dependent
variable of equation. If convenient, forecasts may be the
same as the dependent variable.
newstart (Optional) The forecasts will be saved starting in entry
newstart of the forecasts series. By default, this will be the
same as the starting period of the forecast range.
Options
model=model name
Of the two ways to input the form of the model to be solved (the other is with
supplementary cards), this is the more convenient and is the only way to forecast
with a set of FRMLs. MODELs are usually created by GROUP or SYSTEM.
print/[noprint]
window="Title of window"
Use PRINT if you want rats to display the forecasts in the output window or file;
this is not done automatically. Use the WINDOW option to have rats display the
results in a (read-only) spreadsheet window with the indicated title. This will
show the forecasts in columns below the name of the dependent variable. Note
that you can use the File–Export... operation to export data directly from such a
window.
static/[nostatic]
errors=VECT[SERIES] of one-step forecast errors
Use STATIC if you want static forecasts rather than dynamic forecasts—that is, if
you want to use actual values rather than forecasted values for lagged dependent
variable terms at later forecast horizons.
When doing static forecasts, you can use the ERRORS option to save the forecast
errors into a VECTOR of SERIES.
input/[noinput]
shocks=VECTOR of first period shocks
matrix=RECTANGULAR array of shock paths
paths/[nopaths]
These options (which are mutually exclusive) add shocks to the equations.
INPUT and SHOCKS options do this only at the first period, MATRIX and PATHS
over the whole forecast horizon. You can use them either with linear or general
models. They have a number of uses, from add factoring to bootstrapping. See
"Options for Adding Shocks".
Variables Defined
%FSTART Starting entry of forecasts (INTEGER)
%FEND Ending entry of forecasts (INTEGER)
Description
FORECAST solves the model for each time period requested. If there are any lagged
dependent variables, FORECAST will use the actual data if the lag reaches back before
the first forecast period, and will use the forecasted values if not (unless you use the
STATIC option).
For instance, in forecasting the range T+1 to T+n, a one period lag will come from
the actual data at period T when forecasting T+1, but will come from the period T+1
forecast when forecasting T+2. Note, however, that this only applies for variables
which are dependent variables of one of the equations of the system. Any exogenous
variables have to be treated as described below.
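The dynamic treatment of lagged dependent variables can be sketched for a simple AR(1); this is Python with made-up coefficients, purely to illustrate the mechanics:

```python
# Dynamic forecasts of y(t) = c + phi*y(t-1): the first step uses the
# last actual value; later steps feed the forecasts back in as the lag.
c, phi = 1.0, 0.5
y_T = 4.0                    # last actual observation (period T)
forecasts = []
lag = y_T
for h in range(3):
    f = c + phi * lag        # T+1 uses actual y(T); T+2 uses the T+1 forecast
    forecasts.append(f)
    lag = f                  # dynamic: forecast becomes next period's lag

print(forecasts)   # [3.0, 2.5, 2.25]
```

With the STATIC option, lag would instead be reset to the actual value of y at each horizon.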
If you have an equation with a moving average part (for instance, one estimated with
BOXJENK with ma or sma terms), the lagged residuals come from the series of residu-
als saved when estimating the equation.
If you have a MODEL which includes FRML’s, rats uses the Gauss–Seidel solution
technique described in Section 8.3 of the User’s Guide.
If you get forecasts which show as NA (missing values), the usual cause is that an
exogenous variable is not defined over your forecasting period. Another possibility
is a missing lagged dependent variable: for instance, you try to forecast beginning
at 2019:1, but one of your series has no data for 2018:4 and you need a lag of it.
Exogenous Variables
Exogenous variables require some care, as you need values for them throughout the
forecast period. If a variable is not the dependent variable of one of the equations,
rats takes its values over the forecast horizon from its data series. If you are fore-
casting out-of-sample, you have two basic ways to handle exogenous variables:
• Close the model by adding to it equations or formulas, such as univariate autore-
gressions, that forecast the exogenous variables.
• Set up time paths for the variables (prior to forecasting) with SET or DATA or some
other instruction that will directly provide the values.
Examples
system(model=canmodel)
variables canrgdps canm1s cancd90d cancpinf canusxsr usargdps
lags 1 to 4
det constant
specify(tightness=.15,type=symmetric) 0.50
end(system)
estimate
forecast(model=canmodel,results=forecasts,from=2006:3,steps=10)
forecasts 2006:3 to 2008:4 using a six-variable var. The forecasts are in series
FORECASTS(1) (for CANRGDPS) to FORECASTS(6) (for USARGDPS).
Below, equation UNEMPEQ is a forecasting equation for the series UNEMP. The
FORECAST instruction computes and prints forecasts for 24 months beginning with
2018:7. The GRAPH graphs both the forecasts and the last year of actual data.
forecast(print,from=2018:7,to=2020:6) 1
# unempeq foreun
graph 2
# unemp %fstart %fend
# foreun
Next, equation M1EQ has the log of M1 (LOGM1) as its dependent variable, RATEEQ is
an equation for an interest rate, and PRICEQ is for LOGPRICE. This computes
forecasts over 2018:2 to 2021:4 (fifteen quarters) and appends them to the original data
series. The SET instructions take the anti-logs of LOGM1 and LOGPRICE, and the
PRINT instruction prints the forecasts along with the last five quarters of data. This
uses the variables %FSTART and %FEND defined by FORECAST to simplify the steps
after doing the forecasts.
forecast(from=2018:2,steps=15) 3
# m1eq logm1
# rateeq rate
# priceq logprice
set m1 %fstart %fend = exp(logm1)
set price %fstart %fend = exp(logprice)
print %fstart-5 %fend m1 rate price
You can do the same operation using a GROUPed system by incorporating definitional
identities for the log relationships:
frml(identity) m1iden m1 = exp(logm1)
frml(identity) priceiden price = exp(logprice)
group smallmod m1eq rateeq>>rate priceeq $
m1iden>>m1 priceiden>>price
forecast(model=smallmod,from=2018:2,to=2021:4,print)
With PATHS, a supplementary card lists the series which provide the paths of the
shocks. These series must be defined for steps entries beginning with the entry
given by the start parameter. Put a * on the supplementary card for an equa-
tion whose shocks you want to be zero.
input/[noinput]
shocks=VECTOR for first period shocks
You can use either of these options to input general first period shocks. With
INPUT, a supplementary card provides the values; with SHOCKS, the indicated
VECTOR provides the shocks.
Parameters
cseries Number of complex series to create. You must supply a value for
this parameter. It can be zero, but usually is positive.
length Length of the complex series.
Description
You can create complex series using instructions similar to those you use for real se-
ries. However, you will probably find it much more convenient to set up the full block
of series (using the cseries parameter) and then refer to them with numbers rather
than with names.
You can invoke FREQUENCY any number of times during the program. Each use of
FREQUENCY eliminates the previous block of complex series. This makes it possible
(and desirable) to write self-contained procedures for your analysis.
Choosing length
You usually want to choose length to be a convenient number of frequencies for the
data analysis. We refer to the extra length beyond the actual number of data points
as padding. There are two considerations in choosing length:
• rats can compute the Fourier transforms much faster for lengths with small
prime factors (2, 3, and 5) than for lengths that are products of large primes.
• The exact seasonal frequencies are included if the number of frequencies is a mul-
tiple of the number of periods per year.
The function %FREQSIZE(n) will produce a recommended size for n actual data
points. This returns the smallest integer of the form 2^m * S which is greater than
or equal to n, where S is the number of periods per year.
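The %FREQSIZE rule can be written out directly; this Python sketch is illustrative only (the built-in may handle edge cases differently):

```python
# Smallest integer of the form (2**m) * s that is >= n, where s is the
# number of periods per year: start at s and double until large enough.
def freqsize(n, s):
    size = s
    while size < n:
        size *= 2
    return size

# 623 monthly observations (e.g. 1966:1 to 2017:11) -> 12 * 2**6 = 768
print(freqsize(623, 12))   # 768
```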
Whether or not you use %FREQSIZE, we recommend that, as a habit, you save the
number of ordinates in a variable before doing the FREQUENCY instruction. If you
need to refer to the number of ordinates in any of your instructions, you can use your
variable instead. That way, you can easily modify your program to allow for a
different number of ordinates. If you’re writing a procedure where you’re working
with complex series provided by the user, you can find out the FREQUENCY length by
using the %FREQEND() function.
Examples
frequency 8 512
Creates 8 complex series with 512 entries.
Wizard
You can use the Statistics—Equation/FRML wizard to define formulas.
Parameters
formulaname The symbolic name you are assigning to this formula.
depvar (Optional) Dependent (or left-side) variable for full equations.
Options
equation=equation to convert
lastreg/[nolastreg]
regressors/[noregressors]
The (mutually exclusive) options EQUATION and LASTREG can be used to convert
the specified equation or the last regression to a formula. Skip the parameters
after formulaname when you use EQUATION or LASTREG. You should only use
FRML(EQUATION=xxx) after you have estimated the equation. The REGRESSORS
option generates the right hand side of the formula from a list of regressors sup-
plied (in regression format) on a supplementary card. Omit the “=” symbol and
function if you use this option.
You must also use VECTOR or NAMES to specify how the parameters are created.
identity/[noidentity]
Use the option IDENTITY when the formula you are defining is an identity.
variance=residual variance
Residual variance for this formula. You only need to supply this if you are going
to use SIMULATE.
residuals=series of residuals
Series holding the residuals for this formula.
Description
FRML defines a function of the form
f(X1t, X2t, ..., Xkt, b1, b2, ..., bp)
or
yt = f(X1t, X2t, ..., Xkt, b1, b2, ..., bp)
where each Xt is a rats data series, and each b is a REAL variable or element of
a real-valued array. If you are using the FRML for non-linear estimation, you will
usually need to do a NONLIN instruction prior to defining the FRML to introduce the
parameters. Note that you also can define formulas without any b parameters.
rats stores the relationship in a variable of type FRML, with the name you supply for
formulaname. In general, you write the function in terms of the INTEGER time sub-
script T in the same manner as the right side expression is written for SET. As with
SET you can use either series or series(T) to refer to the current value of series
and series{lag} for a lagged value.
Once you have defined a formula (and specified values for any b variables, if there
are any), you can use it in expressions with
formulaname(entry)
which will take the value of function(entry). You can do this no matter which
form of the FRML command you chose—rats only uses the function(T) part in the
full equations.
Examples
frml avgvalue = (gdp{2}+gdp{1}+gdp+gdp{-1}+gdp{-2})/ 5.0
compute avg1909 = avgvalue(1909:1)
compute avg1901 = avgvalue(1901:1)
compute avg1885 = avgvalue(1885:1)
evaluates five-year moving averages of GDP around specified time periods.
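The centered five-term average that AVGVALUE computes corresponds to the following Python sketch (made-up data; recall that {k} in a FRML is a lag and negative values are leads):

```python
# Centered 5-term moving average: entry t averages entries t-2 .. t+2,
# matching gdp{2} .. gdp{-2} in the FRML above.
gdp = [10.0, 12.0, 11.0, 13.0, 14.0, 15.0, 16.0]

def avgvalue(t):
    return sum(gdp[t - 2 : t + 3]) / 5.0

print(avgvalue(2))   # (10+12+11+13+14)/5 = 12.0
```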
sur(inst) 3
# conseq
# inveq
# wageeq
frml(equation=conseq) consfrml
frml(equation=inveq) invfrml
frml(equation=wageeq) wagefrml
This estimates a set of three equations (CONSEQ, INVEQ, WAGEEQ) by three stage
least squares, then converts the three estimated equations to formulas.
nonlin rho b1 b2 b3
frml auto1 = rho*y{1} + (1-rho)*b1 + $
b2*(x1-rho*x1{1}) + b3*(x2-rho*x2{1})
linreg y
# constant x1 x2
compute rho=%rho, b1=%beta(1), b2=%beta(2), b3=%beta(3)
nlls(frml=auto1) y
or:
nonlin rho
linreg y
# constant x1 x2
frml(lastreg,names="B",addparms) regfrml
frml auto1 = rho*y{1} + regfrml(t) - rho*regfrml(t-1)
compute rho = %rho
nlls(frml=auto1) y
The second example builds AUTO1 using a second formula (named REGFRML) for the
regression part. This can make it simpler to code a complex formula, and does make
it simpler to change it. If you want to add an additional explanatory variable to the
model the first way, you need to change the NONLIN, FRML, LINREG and COMPUTE
instructions. All you need to do in the second is change the supplementary card on
LINREG. The first FRML instruction in this second example does all of the following:
• Creates REGFRML with the variables B1, B2 and B3 for the three regression
coefficients.
• Initializes B1, B2 and B3 to the values estimated by LINREG.
• Adds B1, B2 and B3 to the list of NONLIN parameters.
If you are creating a formula using a number of “subfrmls,” we would strongly sug-
gest that you explicitly use subfrml(T) (as we’ve done above) rather than subfrml
alone when you need to refer to the subformula. This makes it easier to tell formula
references from series references.
Self-Referencing Formulas
You cannot create a FRML which is defined using an explicit reference to itself. If the
value of your formula at T depends upon previously calculated values of itself (direct-
ly or indirectly), you need to write your FRML to store the values into a series as they
are computed. The lagged values then come from the series.
arch and garch models are the most common types of such recursively defined
functions. For instance, in an arch-m model, the variance depends upon the lagged
residual and the residual depends upon the variance. We can’t write down two simple
FRML’s for this without each (at least indirectly) referencing itself. While this can be
done with GARCH, we’ll show how to define FRML’s to do it. The first of these uses the
UU series to save the squared residuals for use in the variance formula.
set u = 0.0
set uu = %seesq
frml archvar = a0 + a1 * uu{1}
frml regresid = y - b1 - b2 * x1 - b3 * sqrt(archvar(t))
frml archlogl = u(t)=regresid(t), uu(t)=u(t)^2, $
%logdensity(archvar(t),u)
We now give a somewhat quicker formulation which also saves the variance in a
series. This eliminates redundant calculations of ARCHVAR.
set u = 0.0
set uu = %seesq
set v = %seesq
frml archvar = a0 + a1 * uu{1}
frml regresid = y - b1 - b2 * x1 - b3 * sqrt(v)
frml archlogl = (v(t)=archvar(t)), (u(t)=regresid(t)), $
uu(t)=u(t)^2, %logdensity(v,u)
When you have several layers of formulas, be careful that calculations are done in
the correct order. REGRESID depends upon the current value of ARCHVAR, so ARCHVAR
has to be calculated first.
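The required ordering can be made concrete with a short Python sketch of the same recursion (all data and parameter values here are made up for illustration):

```python
import math

# ARCH-M recursion: the variance v(t) needs uu(t-1), and the residual
# u(t) needs v(t), so within each period v must be computed before u.
a0, a1 = 0.5, 0.3            # variance equation parameters (assumed)
b1, b2, b3 = 1.0, 2.0, 0.1   # mean equation parameters (assumed)
y  = [3.1, 3.4, 3.0, 3.6]
x1 = [1.0, 1.1, 0.9, 1.2]

uu_prev = 0.25               # presample squared residual (like %SEESQ)
v, u = [], []
for t in range(len(y)):
    vt = a0 + a1 * uu_prev                            # ARCHVAR: uses lagged uu
    ut = y[t] - b1 - b2 * x1[t] - b3 * math.sqrt(vt)  # REGRESID: uses vt
    v.append(vt)
    u.append(ut)
    uu_prev = ut ** 2                                 # becomes uu{1} next period

print(len(v), len(u))
```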
VECTORs of FRMLs
rats allows you to create arrays of FRMLs.
See Also . . .
GROUP Combines FRMLs to build a forecasting model.
EQUATION Defines linear relationships.
Parameters
funcname The name you want to give this function. funcname must be
distinct from any other function, procedure, or variable name
in the program. By default, the function will return a real
value. You can change that by using a TYPE statement to define
funcname as a different type.
parameters Names of the parameters. These names are local to the func-
tion—they will not conflict with variables elsewhere in the
program. By default, parameters are INTEGER passed by value.
You can change this using TYPE statements.
Description
User-defined functions are very similar to rats procedures (see Section 15.2 of the
User’s Guide). However, procedures are designed to mimic rats instructions, and
thus cannot be called from within an expression. Functions, on the other hand, are
designed to be used in expressions, and thus make it possible to embed complicated
computations in FRMLs and other expressions. As a result, functions make it pos-
sible to handle a wide range of optimization problems that would be very difficult or
impossible to handle without them.
A function definition begins with a FUNCTION statement and ends with a matching
END. The FUNCTION statement itself names the function and lists the formal param-
eters, if any.
You need to use function instructions in the following order:
function statement
type statements, if any, to define the type of the function and any parameters
other instructions
end
If possible, you should try to write the function using only parameters, local variables
and global variables which rats itself defines. That way, you don’t have to worry
about accidentally changing a global variable elsewhere in the program.
Using SOURCE
As with procedures, you may find it convenient to save functions that you’ve writ-
ten on separate files, so that they can be used in different applications. If you have a
function stored on a separate file, bring it into your current rats program using the
instruction SOURCE. The typical instruction is
source file with function
If you have a collection of functions which you use regularly, you can include them in
a “procedure library” that gets brought in right at the start of your program (set on
the File—Preferences operation).
Indirect References
You can set up a PROCEDURE or FUNCTION to accept a FUNCTION as an option or
parameter, and then pass a specific function through to it when you execute the
procedure. The type specification for such an indirect function reference is
FUNCTION[returntype](argtype1,argtype2,...,argtypeN)
if it has N arguments. For instance,
procedure MCVARDoDraws
*
option model model
option integer steps 48
option integer draws 1000
option vect[int] accum
option function[rect](symm,model) ffunction
This adds an option called FFUNCTION (for Factor FUNCTION) to the procedure.
This takes a SYMMETRIC and MODEL as its parameters. Within the procedure, this is
used as:
if %defined(ffunction)
compute factor=ffunction(%outerxx(fsigmad),model)
else
compute factor=fsigmad
This checks to see if FFUNCTION has been defined. If FFUNCTION isn’t used in execut-
ing @MCVARDODRAWS, it won’t be. If it is defined, it gets used to compute FACTOR; if
not FSIGMAD (which in this case is the Cholesky factor) gets used instead.
function bqfactor sigma model
type rect bqfactor
type symm sigma
type model model
*
compute bqfactor=%bqfactor(sigma,%modellagsums(model))
end
Examples
This defines a function for a regime-switching model which returns a 2-VECTOR
with the density functions in the two regimes, where the regimes differ based
upon the values of PHI.
function RegimeF time
type vector RegimeF
type integer time
local integer i
*
dim RegimeF(2)
ewise RegimeF(i)=exp(%logdensity(sigsq,$
%eqnrvalue(xeq,time,phi(i))))
end
This computes the tri-gamma function (second derivative of log gamma) of its argu-
ment. (Note: the function %TRIGAMMA is now built-in, but this offers a good example
of writing a general purpose function). The function checks for an out-of-range argu-
ment and returns the missing value in that case. Note also that the function name is
used repeatedly in the body of the function for the intermediate calculations. This is
permitted; the only thing special about the function name (compared with any other
local variables) is that the final value assigned to it before the return will be the
value returned by the function.
function TriGamma z
type real TriGamma z
local real zp zpr
if z<=0 {
compute TriGamma=%NA
return
}
compute zp=z
compute TriGamma=0.0
while (zp<30) {
compute zpr=1/zp
compute TriGamma=TriGamma+zpr*zpr
compute zp=zp+1
}
compute zpr=1/zp
compute TriGamma=TriGamma+zpr*(1+zpr*(.5+zpr*(1.0/6.0+zpr^2*$
(-1.0/30.0+zpr^2/42.0))))
end
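The same logic ports directly to other languages; here is a Python version of the recurrence-plus-asymptotic-series calculation above, checked against the known value psi'(1) = pi^2/6:

```python
import math

def trigamma(z):
    # Out-of-range argument returns NaN (the RATS version returns %NA).
    if z <= 0:
        return float("nan")
    total = 0.0
    zp = z
    while zp < 30:                 # recurrence: psi'(z) = 1/z^2 + psi'(z+1)
        total += 1.0 / (zp * zp)
        zp += 1
    zpr = 1.0 / zp                 # asymptotic series at large zp
    total += zpr * (1 + zpr * (0.5 + zpr * (1.0/6.0 + zpr**2 *
             (-1.0/30.0 + zpr**2 / 42.0))))
    return total

print(abs(trigamma(1.0) - math.pi**2 / 6) < 1e-10)   # True
```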
Wizard
There are separate Time Series—ARCH/GARCH(Univariate) and Time Series—
ARCH/GARCH(Multivariate) wizards. However, the GARCH instruction has several
options not included in the Wizards, so be sure to refer to the information below if
you need capabilities beyond those available using the wizards.
Parameters
start end Range to use in estimation. If you have not set a SMPL, this
defaults to the largest common range for all the variables in-
volved.
list of series List of one or more dependent variables
xregressors/[noxregressors]
Use this if you want some exogenous shift variables in the variance equation(s).
If you use it, include them on a supplementary card. If you are also using the
REGRESSORS option, the XREGRESSORS are on a second card. For multivariate
models, the same set of regressors are included in all variance equations.
distrib=[normal]/t/ged
shapeparm=input value for shape parameter for t or GED [estimated]
The assumed distribution of the error process, Normal, t (Student) or Generalized
Error Distribution. If using T or GED, you can use SHAPEPARM to provide your
own value for the shape parameter. If you don’t, it will be estimated.
asymmetric/[noasymmetric]
ASYMMETRIC includes an asymmetry term. For a standard garch model, this
will give you the gjr (Glosten, Jagannathan, Runkle, 1993) model. This can be
used with all of the supported univariate and multivariate models.
condition/[nocondition]
If CONDITION, GARCH conditions on the required lagged residuals rather than as-
signing them the presample values.
i=nodrift/drift
This constrains the garch coefficients (all a’s and b’s) to sum to one. (In an
egarch model only the b's are included). For multivariate models, each compo-
nent is constrained separately. With I=NODRIFT, constant terms in the variance
equations are constrained to zero. With I=DRIFT, constant terms are estimated.
You cannot use I with MV=BEKK or MV=VECH.
exponential/[noexponential]
EXPONENTIAL does the e-garch model of Nelson (1991) or a generalization of it.
BEKK gives the “bekk” formulation (also known as bek or ek), which imposes
positive-definiteness on the covariance matrix. DBEKK (diagonal bekk) restricts
the lagged variance and residual terms to have diagonal multiplying matrices.
TBEKK (triangular bekk) restricts the lagged variances and residuals so the front
multiplier matrix is lower triangular—the results will depend on variable order.
xbekk=[combined]/separate
Determines how "X" regressors for a bekk model are handled. Both of these add
a lower triangular matrix to the parameter set for each X regressor. The default
(COMBINED) does a linear combination of the constant variance matrix plus the X
effects and squares the combined matrix:
(C + E1*x1t + ... + Ek*xkt)(C + E1*x1t + ... + Ek*xkt)'
The SEPARATE choice takes the outer product of each term separately:
CC' + x1t^2*E1*E1' + ... + xkt^2*Ek*Ek'
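In the scalar (1x1) case the difference between the two choices reduces to simple algebra; a Python sketch with made-up values:

```python
# Scalar analogue: COMBINED squares the sum, SEPARATE sums the squares.
c, e, x = 0.8, 0.3, 2.0

combined = (c + e * x) ** 2          # (C + E*x)(C + E*x)'
separate = c**2 + (x**2) * e**2      # CC' + x^2 * E E'

print(combined, separate)   # combined includes the 2*c*e*x cross term
```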
DCC=[COVARIANCE]/CORRELATION/CDCC
QBAR=initial Q matrix for DCC [empirical]
DCC chooses the form of the recursion for the model for the "GARCH" submodel
that determines the dynamic correlations. The default is DCC=COVARIANCE,
which means
Q(t) = (1 - a - b)*Q + a*u(t-1)u(t-1)' + b*Q(t-1)
where u(t) is the mean model residual and Q is the unconditional covariance
matrix. DCC=CORRELATION is Engle's original idea of
Q(t) = (1 - a - b)*Q + a*e(t-1)e(t-1)' + b*Q(t-1)
where e(t) is the vector of standardized residuals (u(t) divided by the model
estimate for its standard deviation) and Q is the sample covariance matrix of e(t).
Engle's recursion is really designed for "two-step" estimation procedures which
estimate univariate models first and take the standardized residuals as given
in a second step of estimating the joint covariance matrix. However, it can't be
applied easily with any of the VARIANCE models which have interactions among
the variances from different equations. It also, in practice, rarely fits better than
DCC=COVARIANCE, and often fits quite a bit worse. (It tends to be more easily
dominated by outliers.) DCC=COVARIANCE is the calculation that GARCH has been
doing since DCC was added and remains the default.
For whichever model is chosen for DCC, the QBAR option can be used to override
the sample calculation of the Q in case you want to experiment.
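As an illustration only, one step of the DCC=COVARIANCE recursion can be written out elementwise; this Python sketch uses made-up 2-variable inputs:

```python
# Q(t) = (1-a-b)*Qbar + a*u(t-1)u(t-1)' + b*Q(t-1), elementwise for 2x2.
a, b = 0.05, 0.90                    # DCC parameters (assumed)
qbar   = [[1.0, 0.3], [0.3, 1.0]]    # unconditional covariance (assumed)
q_prev = [[1.2, 0.4], [0.4, 0.9]]    # Q(t-1)
u_prev = [0.5, -0.2]                 # lagged mean-model residuals

q = [[(1 - a - b) * qbar[i][j]
      + a * u_prev[i] * u_prev[j]
      + b * q_prev[i][j]
      for j in range(2)] for i in range(2)]

print(round(q[0][0], 6))   # 0.05*1.0 + 0.05*0.25 + 0.9*1.2 = 1.1425
```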
With METHOD=EVALUATE, GARCH simply evaluates the model given the initial
parameter values (input using the INITIAL option), without trying to estimate
new coefficient values. You can use options like HSERIES and RESIDS to save the
resulting variance and residual series for evaluation.
pmethod=bhhh/bfgs/[simplex]/genetic/annealing/ga
piters=number of PMETHOD iterations to perform [none]
Use PMETHOD and PITERS if you want to use a preliminary estimation method to
refine your initial parameter values before switching to one of the other estima-
tion methods. For example, to do 10 simplex iterations before switching to BFGS,
use PMETHOD=SIMPLEX, PITERS=10, and METHOD=BFGS.
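For example (DX is a hypothetical return series):
garch(p=1,q=1,pmethod=simplex,piters=10,method=bfgs) / dx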
[print]/noprint
vcv/[novcv]
title="description of estimation" [depends upon options]
These control the printing of regression output, the printing of the estimated co-
variance/correlation matrix of the coefficients (page Int–77 of the Introduction) and the
title of the output. By default, the title will be the type of model (“GARCH
model”, “EGARCH model”, etc.).
robusterrors/[norobusterrors]
lags=correlated lags [0]
lwindow=neweywest/bartlett/damped/parzen/quadratic/[flat]
damp=value of g [0.0]
lwform=VECTOR with the window form [not used]
will estimate a standard garch model but with t distributed errors with differ-
ent degrees of freedom for the first 1000 observations and the remainder of the
sample. The added parameters will come at the end of the output. Note that you
do not use DISTRIB=T with this (that would add a SHAPE parameter that has no
effect, since you've overridden the distribution calculation).
Examples
An ARCH(2) on an AR(1) regression model, with the presample variance fixed at the
value of %SEESQ from a previous regression:
garch(p=0,q=2,regressors,presample=||%seesq||) / ffed
# constant ffed{1}
garch(p=1,q=1,exp,asymmetric,regressors) / ibmln
# constant %mvgavge{1}
A trivariate DCC GARCH(1,1) model, with separate mean models for each variable:
group meanm rspeq rnkeq rhseq
garch(p=1,q=1,mv=dcc,model=meanm,method=bhhh)
This uses HADJUST to include the square root of the variance of the third equation
(the OILGROW equation) as a regressor in the mean equation of a VAR-GARCH model.
This updates the variable HOIL as a function of the variances, which are being saved
in HH (a SERIES of SYMMETRIC arrays):
set hoil = 0.0
system(model=basevar)
variables pgrow ipgrow oilgrow rate
lags 1 to nlags
det constant hoil
end(system)
garch(model=basevar,mv=diag,hmatrices=hh, $
hadjust=%(hoil=sqrt(hh(t)(3,3))))
Wizard
If you generate a series list, you can use the toolbar to generate a quick box plot
from selected series.
Positioning
When using SPGRAPH to put multiple graphs on a page, rats normally fills the page
by column starting at the top left (field 1,1). To place a graph in a specific field, use
either the ROW and COL options or the hfield (column) and vfield (row) parameters.
Parameters
number Number of series to graph. The maximum permitted is twenty.
hfield vfield See “Positioning” above.
Supplementary Cards
Use one supplementary card for each series you want to plot.
series The series to be plotted.
start end (Optional) The range to use in generating the box plot. If you
have not set a SMPL, this defaults to the defined range of
series. start and end can be different for each series in the
graph.
Options
Most of the GBOX options are the same as those available on the GRAPH instruction.
These common options are listed briefly below—see GRAPH for details on these. The
options that are unique to GBOX are described in more detail below.
[axis]/noaxis Draw horizontal axis if Y=0 is within bounds
col=column number Column for this graph in the SPGRAPH matrix.
extend/[noextend] Extend horizontal grid lines across graph
footer=footer label Adds a footer label below graph
frame=[full]/half/
none/bottom Controls frame around the graph
header=string Adds a header to the top of the graph
height=value Sets height of graph in inches
hlabel=label Adds a label to the horizontal axis
log=value Base for log scale graphs
max=value Value for upper boundary of graph
min=value Value for lower boundary of graph
picture=pict. code Picture code for axis label numbers
row=row number Row for this graph in the SPGRAPH matrix.
scale=[left]/right/
both/none Placement of vertical scale
smpl=series or frml Series/formula indicating entries to be graphed
subhead=string Subheader string for graph
vgrid=vector Values for grid lines across from vertical axis.
vlabel=label Label for the vertical axis
vticks=number or VECTOR Maximum number or specific list of vertical ticks
width=value Sets width of graph in inches
window=string Title for graph window
(Figure: annotated box plot of the series LNWAGE, with features such as the minimum value labeled.)
This draws a separate box plot for each year’s worth of data in the quarterly series
WASH.
open data washpower.dat
calendar(q) 1980
data(format=free,org=columns) 1980:1 1986:4 wash
gbox(group=%year(t),$
labels=||"1980","1981","1982","1983","1984","1985","1986"||)
# wash
(Figure: box plots of WASH for each year, 1980–1986.)
Positioning
When using SPGRAPH to put multiple graphs on a page, rats normally fills the page
by column starting at the top left (field 1,1). To place a graph in a specific field, use
either the ROW and COL options or the hfield (column) and vfield (row) parameters.
Parameters
hfield vfield See “Positioning” above.
Options—Specific to GCONTOUR
The following is a list of the options for GCONTOUR. Many of these are identical to
options on SCATTER, while some are unique to GCONTOUR. The options specific to
GCONTOUR are described in detail here. See SCATTER for details on the other options.
x=VECTOR of X grid values [required]
y=VECTOR of Y grid values [required]
f=RECTANGULAR array supplying function values [required]
Use X to supply a vector of grid values for the x-axis, and Y to supply a vector of
grid values for the y-axis. Use F to supply a RECTANGULAR array with dimensions
dim(x) × dim(y) containing the function values for each x,y grid point. For
example, F(1,2) should contain the function value associated with the grid point
given by X(1),Y(2). X and Y are usually created using the %SEQA function or an
EWISE instruction. The F array is usually set using EWISE.
Example
This draws contours of the likelihood surface for a GARCH(1,1) model. To avoid an
excessive number of contours, function values that are too small are set to NA.
compute [vector] atest=%seqa(.002,.002,100)
compute [vector] btest=%seqa(.500,.005,100)
dec rect ftest(100,100)
do i=1,100
do j=1,100
garch(initial=||%beta(1),lrvar*(1-btest(i)-atest(j)),$
atest(j),btest(i)||,p=1,q=1,method=eval) / sp500
compute ftest(i,j)=%if(%funcval<base-50,%na,%funcval)
end do j
end do i
gcontour(x=btest,y=atest,f=ftest)
goto gotolabel
Parameters
gotolabel The label indicating the point in the program where you want
to continue execution. rats permits forward and backward
branching. This must be sixteen characters or less in length.
Example
In the example below, there is no good alternative to GOTO. We wish to break out of
a pair of nested loops when some condition in the inner loop becomes true. A BREAK
instruction would only break the J loop.
{
do i=1,nvar
do j=1,nlag
@compval i j value
if value>100.0
goto done
end do j
end do i
:done
display "Result Achieved At Variable" i "Lag" j
}
Wizard
Data/Graphics—Graph is the wizard for the GRAPH instruction. Note that in order
to keep the Wizard from getting too complicated, some of the GRAPH options have
been omitted in the wizard. To further customize your graph, you can edit the GRAPH
instruction generated by the wizard by adding the desired options.
You can also display time series plots, histogram plots, and box plots from the Se-
ries Window. First, select View—Series Window to display this window. Then select
(highlight) one or more series in the window and click on one of the graphing buttons
on the toolbar. Those are for quick looks—they don’t generate instructions.
Positioning
If you’re using SPGRAPH to put multiple graphs on a single page, by default, the fields
are filled by column, starting at the top left (field 1,1). If you want to fill a particular
field instead, use either the combination of ROW and COL options or hfield (for the
column) and vfield (for the row) parameters.
Parameters
number Number of series to graph.
hfield vfield See “Positioning” above.
Options
This is a brief alphabetical listing. Detailed descriptions organized by function follow.
[axis]/noaxis Draw horizontal axis if Y=0 is within bounds
[box]/nobox Replaced by the FRAME option
col=column number Column for this graph in the SPGRAPH matrix.
[dates]/nodates Label entries with dates
extend/[noextend] Extend horizontal grid lines across graph
footer=footer label Adds a footer label below graph
frame=[full]/half/ Controls frame around the graph
none/bottom
grid=gridseries Series with non-zeros where you want vertical lines
header=string Adds a header to the top of the graph
height=value Sets height of graph in inches
hlabel=label Adds a label to the horizontal axis
[kbox]/nokbox Controls whether a box is drawn around the key
key=[none]/upleft/upright/ Allows addition of key to the graph
loleft/loright/above/
below/left/right/attach
kheight=value Specifies key box height as fraction of graph height
klabel=vect[strings] Used to supply your own labels for the graph key
kwidth=value Specifies key box width as fraction of graph width
[ksample]/noksample Controls whether samples are included in the key
log=value Base for log scale graphs
max=value Value for upper boundary of graph
min=value Value for lower boundary of graph
number=number Starting number for x-axis labels (with NODATES)
omax=value Value for upper boundary of overlay scale
omin=value Value for lower boundary of overlay scale
ovcount=number Number of series for right-side (overlay) scale
overlay=(see style) Style for overlay (creates a two-scale graph)
[ovkey]/noovkey Adds a key for the overlay series, if any
ovlabel=label Scale label for the right side of an overlay graph
ovrange=fraction Controls offset of vertical scales in overlay graph
ovsame/[noovsame] Use same scale for both axes of an overlay graph.
patterns/[nopatterns] Use patterns, not colors, to distinguish series
picture=pict. code Picture code for axis label numbers
row=row number Row for this graph in the SPGRAPH matrix.
scale=[left]/right/ Placement of vertical scale
both/none
series=vector[series] VECTOR of SERIES to graph
shading=series Series with non-zeros at entries to be shaded
smpl=series or frml Series/formula indicating entries to be graphed
General Options
patterns/[nopatterns]
This chooses the way you want GRAPH to distinguish among the series. rats
normally uses different colors, and will automatically switch to black and white
patterns if you print the graph on a black and white printer. If you want to see
on the screen (approximately) how the black and white hard copy will appear,
use the option PATTERNS—rats will display the series with patterns rather than
colors. Note that you can also switch between patterns and colors after displaying
the graph by using the toolbar icon on the graph window.
window="Window title" (in quotes: "..." or STRING)
When working in interactive mode, the WINDOW option allows you to set a title for
the graph window that will be associated with the graph. By default, graph win-
dows are titled using either the HEADER or FOOTER strings; if you have neither of
those, nor WINDOW, they are “Graph.01,” “Graph.02,” etc.
footer=STRING for graph footer
Adds a left-justified label in the lower left corner of the graph. To prevent a footer
from getting too wide, you can use the characters \\ to insert a line break.
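For example (the footer text and series name are illustrative):
graph(footer="Source: Haver Analytics\\Quarterly data") 1
# gdp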
frame=[full]/half/none/bottom
[box]/nobox
FRAME controls the box displayed around the outside of the graph. HALF displays
the frame only to the left and below the graph, omitting the top and right sides.
BOX/NOBOX is an older option for controlling the box (frame) which is superseded
by the more flexible FRAME. NOBOX is equivalent to FRAME=NONE.
smpl=SMPL series or formula (Chapter 25 of the User's Guide)
You can supply a series or a formula that can be evaluated across entry numbers.
Only entries for which the series or formula are non-zero will be graphed.
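For example, this sketch (GDP being a hypothetical series) graphs only the entries from 1990 onward:
graph(smpl=%year(t)>=1990) 1
# gdp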
row=row number
col=column number
When using SPGRAPH, you can use the ROW and COL options (or the hfield and
vfield parameters) to manually specify the position of the graph in the SPGRAPH grid.
style=[line]/polygonal/bar/stacked/overlapping/
vertical/step/symbol/midpolygon/fan/dots/spike
LINE is a simple line graph. It draws a line from one point to the next.
POLYGONAL draws a line from one point to the next, and paints the region be-
tween this line and the X-axis (or bottom of the graph if the minimum is
greater than 0). Recommended only when graphing a single series.
BAR draws a separate rectangle for each data series at each entry. If you are
graphing more than one series, you can’t really use BAR for more than about
100 data points, as the bars get too thin—use SPIKE instead.
STACKED is only useful with a set of non-negative series. With the STACKED op-
tion, rats stacks the bars for all the series at a given time point into a single
large rectangle.
OVERLAPPING is similar to BAR, except that the bars overlap somewhat. This allows it
to be used with more data points or series than the simple bar graph. Since
this paints the bar for the second series over part of the first series, the third
over part of the second, and so on, this style works best when the first series
is the largest and the last the smallest.
VERTICAL connects all values at a given time period with a vertical line, with
hash marks at all the values. You can use this for high/low/close plots or for
plotting confidence intervals.
STEP is similar to LINE except instead of drawing a line directly from one point to
the next, it draws horizontally to the new “x” position, then vertically to the
new “y” position.
SYMBOL is similar to LINE except that it draws symbols at regular intervals
along the line. This may produce a better printed copy of the graph if you
have a number of intertwined series.
MIDPOLYGON is like POLYGONAL, except that the polygons are centered on tick
marks (similar to BAR), rather than between tick marks.
FAN creates a fan chart, filling in the gap between series with a set of shaded fill
patterns, getting lighter towards the outside. Can be used to fill space be-
tween two series, but is most useful with five or more series.
DOTS plots each data point with a large dot.
SPIKE is similar to a bar graph, but uses narrow spikes rather than wide bars.
[ticks]/noticks
NOTICKS suppresses the tick marks and entry labeling on the time axis.
[dates]/nodates
number=labeling number for first entry
By default, GRAPH labels entries with dates (if possible) on the horizontal axis.
How rats represents dates depends on the font sizes (controlled with GRPARM),
the number of observations, and the size and shape of the graph. With a rela-
tively short series of daily data, GRAPH will probably use full month names with
dates at each entry. With long annual series, it may only label one out of every
five years. Year or month labels are centered under entries covered by that date.
With NODATES, GRAPH labels graphs with entry numbers. You can use NUMBER
(with NODATES) to use a number other than the entry number for the first obser-
vation. For instance, rats stores autocorrelations with the 0 lag in entry 1. To
label them correctly, use the options NODATES and NUMBER=0.
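For example, this sketch (with a hypothetical series Y) computes autocorrelations with CORRELATE—which puts lag 0 into entry 1—and graphs them with lag numbers on the axis:
correlate(number=24,results=cors) y
graph(nodates,number=0,style=bar) 1
# cors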
[axis]/noaxis
With NOAXIS, GRAPH will not draw the horizontal axis line even if the zero value
lies within the range of values in series.
ovsamescale/[noovsamescale]
You can use OVSAMESCALE to force both the regular and the overlay series to
share a common scale. They will just be shown in different styles.
[ovkey]/noovkey
You can use NOOVKEY to eliminate the key for the overlay series, if the meaning
is either obvious, or provided using the labels.
Key Options
key=[none]/upleft/upright/loleft/loright/above/below/
left/right/attached
KEY controls the placement of the key for the graph. The choices are:
NONE No key
UPLEFT Key in upper left corner, inside the graph box
UPRIGHT upper right corner, inside
LOLEFT lower left corner, inside
LORIGHT lower right corner, inside
ABOVE centered above the graph (and any HEADER and SUBHEADER).
BELOW centered below the graph, and below any X-axis labeling
LEFT left side, centered vertically, outside graph and Y-axis labeling
RIGHT right side, centered vertically, outside graph and Y-axis labeling
ATTACHED used with the LINE and SYMBOL styles, this puts the labels inside
the graph near the lines or symbols, at positions where the as-
sociation of a line with the labels is as unambiguous as possible.
klabel=VECTOR of STRINGS for key labels
By default, rats labels the KEY with the names of the series. Use KLABEL to
supply your own labels. You can create the VECTOR[STRINGS] ahead of time, or
enter it using the ||..|| matrix notation (see page UG–26 in the User’s Guide).
The order of labels in the VECTOR should match the order of the supplementary
cards. You can use \\ in a string to put a line break in the string.
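For example (the series names and labels are illustrative):
graph(key=below,klabel=||"Actual","Fitted"||) 2
# y ; # yhat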
[kbox]/nokbox
This controls whether or not a box (border) is drawn around the key.
[ksample]/noksample
NOKSAMPLE eliminates the sample line styles, colors, or fill patterns from the key,
leaving only the labels.
Notes
rats leaves missing values out of the graph. For a line graph, a dotted line connects
the points on either side of the missing data point.
For graphing a large number of series on a single graph, you may want to collect the
series into a VECTOR of SERIES and supply that variable using the SERIES option
rather than using supplementary cards. You can also use LIST and CARDS to
automate the supplementary card list.
Examples
This generates the stacked-bar graph shown below:
cal(q) 1946:1
all 2002:6
open data haversample.rat
data(format=rats) / cd cn cs
labels cd cn cs ; # "Durable" "Non-Durable" "Services"
smpl 1993:1 *
graph(style=stacked,header="Major Components of Consumption", $
key=below,patterns) 3
# cd ; # cn ; # cs
(Figure: stacked-bar graph “Major Components of Consumption,” 1993–2002, with key Durable / Non-Durable / Services.)
You can also use the toolbar icon to preview the black and white version (click
the button again to switch back to color). The PATTERNS option forces rats to gen-
erate the graph in black and white mode only, which can be useful if you are only
concerned with the black and white appearance.
Custom Styles
You can use the Graph Style Sheet feature to define and use your own custom styles.
You can define your own colors, mix pattern and color attributes, adjust line thick-
nesses, and more. See Section 3.16 in the Introduction and GRPARM for more informa-
tion.
Parameters
model The MODEL being created or modified.
formula>>result Each field gives the name of an EQUATION or FRML. The result
part of each of these fields is optional: it allows you to supply a
series into which the forecasts are to be placed. For example,
you might want to save the simulations for some formulas in
the model, but not for others. (In most cases, the instruction(s)
in which you use the MODEL will have some way to capture the
output as well).
You can use a VECTOR[FRML] or VECTOR[EQUATION] to make
up all or part of the list. If you do that, you can’t use the >> on
those.
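As a sketch (the equation and series names are hypothetical), this groups two equations into a model, directing the results for the first into the series GDPFIT while leaving the second uncaptured:
group smallmod gdpeq>>gdpfit rateeq
forecast(model=smallmod,from=2010:1,steps=8)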
Options
cv=SYMMETRIC covariance matrix of residuals
This option provides the (default) covariance matrix for the residuals if you do
SIMULATE, ERRORS or IMPULSE; you can also input this matrix directly into those
instructions using their own CV options. The matrix should have dimensions
of N×N, where N is the number of structural equations (non-identities) in the
model. (The older VCV is a synonym for CV.)
Examples
This groups three univariate autoregressions to form a model named AR1, and uses
that for the mean model for the GARCH.
equation(constant) jpneq xjpn 1
equation(constant) fraeq xfra 1
equation(constant) suieq xsui 1
group ar1 jpneq fraeq suieq
garch(p=1,q=1,model=ar1,mv=dcc,pmethod=simplex,piter=10)
This defines six equations (four identities and two defining shocks) and groups them
into a model for solution by DSGE.
declare series a1 a2 eps eta
declare real alpha lambda sig_eta sig_eps
*
frml(identity) f1 = x -(x{1}+a1-lambda*a1{1})
frml(identity) f2 = mu-$
((1-lambda)*x{1}+lambda*mu{1}+a2-lambda*a2{1})
frml(identity) f3 = a1-1.0/(lambda+(1-lambda)*alpha)*(eps-eta)
frml(identity) f4 = a2-(1.0/(lambda+(1-lambda)*alpha)*$
((1+alpha*(1-lambda))*eps-(1-lambda)*eta))
frml d1 = eps
frml d2 = eta
*
group cagan f1 f2 f3 f4 d1 d2
Parameters
The GRPARM parameters specify the label type or types to which the GRPARM com-
mand will apply. You can change one or more of these with a single GRPARM instruc-
tion. Separate the items with one or more spaces. You can truncate any of the param-
eter names to three or more characters, for instance, KEY for KEYLABELING.
If you don’t want to change the size of a label (that is, if you only want to change its
font or style), use * for the newsize parameter. For example:
grparm(nobold) vlabel *
Options
italics/[noitalics]
bold/[nobold]
font="font name"
BOLD and ITALICS allow you to choose bold and italics styles for the label types
listed on the GRPARM instruction. The FONT option allows you to specify the font
used for the specified labels. See “Choosing Fonts” (page Int–155) of the Introduction.
portrait/[noportrait]
The PORTRAIT option lets you rotate the graph by 90 degrees. In combination
with the Portrait/Landscape choices in the Page Setup dialog box (Windows/Ma-
cintosh versions) or the Portrait and Landscape PostScript exporting choices (all
versions), this provides additional choices for orienting the graph. See page Int–129 in
the Introduction.
background=stylenum
This allows you to choose a background for the main box of the graph other than
the default choice, which is a solid white background. The stylenum is one of the
color fills—you can select from one of the default style choices, or a custom style
defined as part of a style sheet (see next page).
shading=stylenum
This allows you to choose a different style for the SHADING (on GRAPH) or VSHADE
or HSHADE (on SCATTER and GCONTOUR). The default is a very light solid gray.
The stylenum is one of the color fills—you can select from one of the default
style choices, or a custom style.
grid=stylenum
This allows you to choose a different style for the GRID or VGRID options (on
GRAPH), VGRID or HGRID (on SCATTER and GCONTOUR) or VGRID (on GBOX). The
default is a “hairline” (very thin) solid black. The stylenum is one of the color
fills—you can select from one of the default style choices, or a custom style.
grparm(cmlabels=||"E","F","M","A","M","J",$
"J","A","S","O","N","D"||)
grparm(smlabels=||"enero","feb","marzo","abr","mayo","jun",$
"jul","agosto","set","oct","nov","dic"||)
grparm(lmlabels=||"enero","febrero","marzo","abril",$
"mayo","junio","julio","agosto","septiembre","octubre",$
"noviembre","diciembre"||)
would use Spanish month names and abbreviations instead. Note that the graphs
themselves don’t have the labels built in—instead, they save the raw date infor-
mation. The substitution is done whenever any graph is displayed, printed or
exported after the GRPARM instruction.
You first use an OPEN instruction to open the style sheet, and then use GRPARM with
the IMPORT option to load the styles. For example, if you have a style sheet called
GRAPHSTYLES.TXT, you might do:
open style graphstyles.txt
grparm(import=style)
See page Int–151 of the Introduction for more information on style sheets.
Examples
grparm(bold) keylabeling 18
grparm monthlabels 12
The first sets the key labels to 18 point bold; the second increases the month and day
labels to 12 point.
grparm axislabel 18 header 30
grparm(italics) subheader 22
graph(extend, $
header="Canadian - US Exchange Rate",sub="Can $/US $") 1
# canusx 1978:1 2002:12
This increases the sizes of the axis, header and subheader labels.
Usage
The basic procedure is:
1. Use SPGRAPH to initiate the special graph.
2. Use GRAPH, SCATTER or GCONTOUR to draw the graph.
3. Use one or more GRTEXT instructions to add text to the graph created in step 2.
Use one GRTEXT for each string you want to add.
4. If you are putting multiple graphs on the page (by using the HFIELDS and
VFIELDS options on SPGRAPH), repeat steps 2 and 3 as necessary to draw the
other graphs and, if desired, add text to those graphs.
5. Issue the SPGRAPH(DONE) instruction to complete the special graph.
It’s usually a good idea to do the graph first to decide where you want the added text.
(GRTEXT should usually be the final step when preparing a graph for publication).
The best choice for the ALIGNMENT and DIRECTION options may not be apparent
until you see how the graph is laid out.
Parameters
"string" The string of text you want to add to the graph. This can be a
string of text enclosed in quotes, or a STRING or LABEL type
variable. You can include line breaks by inserting the charac-
ters \\ in the string at the points where you want line breaks.
Options
position=upleft/upright/loleft/loright/rightmargin/bottommargin/
leftmargin/topmargin
entry=entry number or date for x-axis position (used with GRAPH)
x=x-axis value for x-axis position (used with SCATTER or GCONTOUR)
y=y-axis value for y-axis position (with GRAPH, SCATTER, or GCONTOUR)
By default, the text will be centered in the graph. You can use these options to
control the positioning of the text.
The POSITION option puts the text in one of the corners of the graph box (upper
left, upper right, lower left, lower right), or in one of the margins of the graph. To
use any of the “margin” options, you need to do the GRTEXT before the graphing
instruction.
ENTRY and Y (for GRAPH) or X and Y (for SCATTER or GCONTOUR) allow you to
place the text at a specific location to annotate some feature. If you are doing a
GRAPH, you use the ENTRY option to set the horizontal position as a date or entry
number, and the Y option to set the vertical position of the strings, within the
Y-axis range. If you are doing a SCATTER or GCONTOUR, you use the X option to
specify the horizontal position within the X-axis range, and the Y option to specify
the vertical position within the Y-axis range.
alignment=[centered]/right/left
valignment=[centered]/top/bottom
direction=compass heading in degrees (integer from 0 to 360)
ALIGNMENT determines whether the text should be centered or right- or left-jus-
tified at the specified position. VALIGNMENT determines whether that position is
the vertical center, the top, or bottom.
DIRECTION allows you to position the text by specifying a direction from the
(x,y) point as a compass heading in degrees. For example, DIRECTION=0 (or 360)
will center the text at a point just above the (x,y) location; DIRECTION=45 will
display left-justified text, starting just above and to the right of the (x,y) location;
DIRECTION=270 will display right-justified text directly to the left of the (x,y)
point.
font="font name"
size=relative size of type in points
These select the typeface and size of the string. The default point size is 14
points, based on a full-page graph. Fonts are automatically scaled for smaller
graphs. See page Int–155 of the Introduction for details.
bold/[nobold]
italics/[noitalics]
These display the string in bold and/or italic type.
box/[nobox]
Puts a box around the text.
transparent/[notransparent]
GRTEXT strings are normally displayed with an opaque white background, so
any lines, patterns or symbols lying “under” the string will be obscured from
view. With TRANSPARENT, only the text itself will be opaque—all the white space
within and between letters will be transparent, allowing any underlying graph
elements to show through.
Examples
The following computes the histogram of a series, graphs it as a bar graph, overlays
it with the normal density using the sample mean and variance, and displays the
skewness and excess kurtosis in the upper left corner.
density(maxgrid=25,type=histogram) x / fx dx
stats(noprint) x
set nx = 1.0/sqrt(%variance)*%density((fx-%mean)/sqrt(%variance))
spgraph
scatter(style=bargraph,overlay=line,ovsamescale) 2
# fx dx
# fx nx
display(store=s) "Skewness" %skewness $
"\\Excess Kurtosis" %kurtosis
grtext(position=upleft) s
spgraph(done)
In the example below, we graph a series and add some text showing the maximum
and minimum values of the series. We use the EXTREMUM instruction to calculate the
maximum (%MAXIMUM) and minimum (%MINIMUM) values of the series and the entry
numbers (%MAXENT and %MINENT) at which these values occur. We don’t want the
text to overwrite any of the graph line, so we’ll put the labels a little above and below
the maximum and minimum values. We’ve also chosen to draw the labels left-justi-
fied, just to the right of the maximum and minimum points.
cal(q) 1980:1
all 2003:1
set(first=0) x = .05*t + .5*x{1} + %ran(2.0)
extremum(noprint) x
* Calculate positions for the labels:
compute maxentry = %maxent + 1   ;* One entry right of the maximum
compute maxval   = %maximum + .3 ;* A little above the maximum
compute minentry = %minent + 1   ;* One entry right of the minimum
compute minval   = %minimum - .3 ;* A little below the minimum
* Construct the strings:
disp(store=maxstring) "Maximum = " #.## %maximum
disp(store=minstring) "Minimum = " #.## %minimum
spgraph
graph(max=(maxval+1),min=(minval-1)) 1
# x
* Add the text at the specified positions:
grtext(align=left,valign=bottom,entry=maxentry,y=maxval) $
maxstring
grtext(align=left,valign=top,entry=minentry,y=minval) $
minstring
spgraph(done)
(Figure: line graph of X, annotated with “Maximum = 10.75” at the peak and “Minimum = -3.69” at the trough.)
gsave(options) template
Parameter
template A template (or a specific filename) to be used for saving graphs.
See “Filename Templates” below for details. Execute GSAVE
without a parameter to turn off the automatic saving of graphs.
Options
format=[rgf]/portrait/landscape/pdf/lpdf/wmf/png/pict
By default, graphs are saved in rgf (“rats graph”) format. Use FORMAT to select
a different format. PORTRAIT and LANDSCAPE are PostScript format in por-
trait and landscape orientations, respectively. PDF and LPDF are Adobe PDF in
portrait and landscape modes respectively. Portrait means standard orientation
so you don't have to rotate the page. To keep the proportions reasonable, it will,
by default, use only about half of a page. Landscape rotates clockwise and,
since the height of the page is then the width, uses the full page. PNG is Portable
Network Graphics (bitmap).
WMF is Windows Metafile (or "Picture"), while PICT is an older Macintosh format.
The availability of some formats is platform-dependent.
[header]/noheader
[footer]/nofooter
Use NOHEADER and/or NOFOOTER to strip header or footer labels from the graph.
This is useful if you want to use headers or footers to identify the graphs as
displayed on screen in rats, but don’t want those labels included in the version
saved to disk, such as when generating graphs for publication where any captions
will be added in the word-processing software. If applied to an SPGRAPH or nested
SPGRAPHs, only the headers/footers on the outer SPGRAPH would be stripped—
any such labels on the individual graphs (or nested SPGRAPHs) are retained.
patterns/[nopatterns]
Use PATTERNS to switch to black and white only when graphs are saved or
exported—they will still show as color on the screen. This applies to all graphs
saved or otherwise exported, whether with GSAVE, or menu operations.
Filename Templates
The template will usually be a filename string that includes an asterisk (*)
symbol. For each graph you generate, rats will save the graph using a filename
constructed by replacing the asterisk with a sequence number. Omit the asterisk
to save a single graph using a specific name.
Example
gsave(format=portrait) "ImpulseResponse*.eps"
@varirf(model=canmodel,steps=12,page=byshocks)
This uses the @VARIRF procedure to generate a set of impulse response graphs. The
graphs will be saved as PostScript files called “ImpulseResponse1.eps”,
“ImpulseResponse2.eps”, and so on.
Parameters
series The series of arrays to create or set with this instruction.
start end The range of entries to set. If you have not set a SMPL, this de-
faults to the standard workspace.
function(T) Function of the entry number T giving the value for entry T of
series. This should evaluate to the type of the elements of the
series. There should be at least one blank on either side of the =.
The function can actually include multiple expressions, sepa-
rated by commas. The series will be set to the value returned by
the final expression in the list.
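For example, an auxiliary expression can hold an intermediate result that the final expression then uses. This is only a sketch: the names VV and V are illustrative, and it assumes the standard functions %RANMAT (draw a random matrix) and %OUTERXX (form v v′ as a SYMMETRIC).

```rats
dec series[symm] vv
dec rect v
* Each entry of VV gets the 2x2 outer product of a freshly drawn 2x1 vector
gset vv 1 100 = v=%ranmat(2,1), %outerxx(v)
```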
Options
startup=expression evaluated at period “start”
Use the STARTUP option to provide an expression which is computed just once, before the regular formula is computed. This allows you to do any time-consuming
calculations that don’t depend upon time. It can be an expression of any type.
[panel]/nopanel
When working with panel data, NOPANEL disables the special panel data treat-
ment of expressions which cross individual boundaries.
Examples
declare series[symm] uu
gset uu gstart gend = %zeros(2,2)
creates UU as a set of series of 2 × 2 symmetric matrices, all initialized to zeros.
dec series[vect] xu
gset xu regstart regend = %eqnxvector(0,t)*u
set ssqtest regstart regend = %qform(xxs,xu)
XXS is a previously defined SERIES[SYMMETRIC], and U is a series (of residuals).
This creates XU as a SERIES of vectors, set equal to the X_t u_t from the last regression
(%EQNXVECTOR(0,T) is X_t). SSQTEST is the series of values for (X_t u_t)′ XXS_t (X_t u_t).
halt message
Parameters
message (Optional) You can provide a string of up to 255 characters.
rats will print the message.
Example
loop
menu "What Next?"
choice "Enter Data"
source indata.src
choice "Do Forecasts"
source forecast.src
choice "Quit"
halt
end menu
end loop
loops until the user chooses “Quit” from the menu.
See Also . . .
END Ends a rats program or an interactive session.
BREAK Breaks control out of a loop.
RETURN Returns control from a procedure or other compiled section.
history( options )
Wizard
In Time Series—VAR (Forecast/Analyze) wizard choose "Historical Decomposition" in
the Action drop down.
Options
model=model name
Of the two ways to input the form of the model to be solved (the other is with
supplementary cards), this is the more convenient. MODELs are usually created by
GROUP or SYSTEM. It can include FRML's (formulas) only if they represent simple
linear equations, as HISTORY requires that the model be fully linear. If the model
includes any identities, those should be last in the model. If you use MODEL, omit
the “equation” supplementary cards.
Variables Defined
%FSTART Starting entry of forecasts (INTEGER)
%FEND Ending entry of forecasts (INTEGER)
Technical Description
The historical decomposition is based upon the following partition of the moving
average representation
(1)  y_{T+j} = Σ_{s=0}^{j−1} Ψ_s u_{T+j−s} + Σ_{s=j}^{∞} Ψ_s u_{T+j−s}
The first sum represents that part of y_{T+j} due to innovations in periods T+1 to T+j.
The second sum is the forecast of y_{T+j} based on information available at time T.
If u has N components, the historical decomposition of y_{T+j} has N+1 parts:
• The forecast of y_{T+j} based upon information at time T (the second sum).
• For each of the N components of u, the part of the first term that is due to the
time path of that component.
This is the order in which the resulting series are organized: if you use the RESULTS
option, the first row in each column gives the base forecasts and the remaining rows
are the components. If you use supplementary cards, the base forecast is put into the
series given by the series field and the effects of the components go into the next N
series.
Comments
If you use the actual estimated model and its residuals, the components of the decom-
position will add up to the observed data.
Using the ADD option superimposes the innovation components on the base projec-
tion: the influential variables will tend to create movements from the base that are
much larger than the less important variables.
Examples
system(model=canmodel)
variables usargdps canusxsr cancd90d canm1s canrgdps cancpinf
lags 1 to 4
det constant
end(system)
compute hstart=2000:1
compute hend =2006:4
estimate
history(model=canmodel,add,results=history,from=hstart,to=hend)
print / history
This computes the historical decomposition from the first quarter of 2000 through
the fourth quarter of 2006 for the model CANMODEL. The results are stored into an
array of series named HISTORY, which is then printed.
history(model=canmodel,add,results=history,window="History",$
labels=||"f1","f2","r1","r2","n1","n2"||,factor=bfactor,$
from=hstart,to=hend)
This modifies the previous example by using a non-standard factorization, with rela-
beled shocks (financial shocks 1 and 2, real 1 and 2 and nominal 1 and 2). In addition
to being stored into an array of series, the results are also displayed in a window
named “History.”
Older Syntax
history( options ) equations periods start VCVmatrix
# equation series newstart column (one for each equation if no MODEL)
Parameters
equations Number of equations in the system.
periods Number of periods to decompose. Prefer the newer STEPS or
FROM and TO options.
start Starting period. Prefer the newer FROM option.
VCVmatrix Covariance matrix for the Cholesky factor. Prefer the newer CV
option.
if condition 1
statement or block of statements (executed if condition 1 is true)
else if condition n
statement or block of statements (executed if condition n is true and no earlier
conditions in the current IF block were true)
else
statement or block of statements (executed if no conditions were true)
Parameters
condition IF and ELSE IF statements evaluate a condition expression.
This can be either an integer- or real-valued expression. The
condition is true if it has a non-zero value and false if it has
value zero. Usually, you construct these using logical and rela-
tional operators.
Description
An IF-ELSE block has the following structure:
1. An IF instruction.
2. (Optionally) One or more ELSE IF instructions.
3. (Optionally) An ELSE instruction.
• rats executes the block of statements following the first IF or ELSE IF whose
condition is true.
• If all of the IF or ELSE IF conditions are false, and you have an ELSE, rats will
execute the block of statements following ELSE.
• If all of the conditions are false, and you do not have an ELSE statement, rats
drops to the first instruction following the last ELSE IF block.
After rats executes the instruction(s) associated with a true IF or ELSE condition, it
will ignore any remaining ELSE IF conditions and drop down to the first instruction
following the last block of statements in the IF-ELSE block.
If the IF instruction is the initial instruction in a compiled section, you should sur-
round it with a { before the IF and a } after the end of the logical block.
Blocks of Statements
You can follow each IF, ELSE IF, or ELSE statement with a single instruction, a
group of instructions enclosed in braces (the { and } symbols), a DO, DOFOR, WHILE,
UNTIL or LOOP loop, or another “nested” IF-ELSE block. When using nested IF’s,
please note the information on the next page regarding ambiguous ELSE’s. If you
have any doubt about whether you have set up a block properly, just enclose the
statements that you want executed in braces ({ and }).
For example, consider these two fragments, which differ only in their indentation:

if condition 1
   if condition 2 ; statement 1
   else ; statement 2

and

if condition 1
   if condition 2 ; statement 1
else ; statement 2
While indenting makes it easier for you to read and interpret code (and we strongly
recommend that you get in the habit of using indenting), the leading spaces or tabs
are ignored by rats itself. So, both examples are executed the same way, even
though the ELSE instructions are intended to apply to different IF instructions.
rats resolves the ambiguity by having an ELSE apply to the IF which immediately
precedes it; that is, rats uses the interpretation implied by the first example. If you
want to use the bottom form instead, you must put { and } around the second IF;
this tells rats that it should only execute those instructions within this block if con-
dition 1 is true:
if condition 1
{
if condition 2 ; statement 1
}
else ; statement 2
Examples
First, here is an example of the use of the %IF function in a transformation. The
series PATCH is set equal to XS if XS isn’t a missing value, or equal to XV otherwise:

set patch = %if(%valid(xs),xs,xv)

This is not done with IF and ELSE instructions, as the IF and ELSE would choose one
of the two SETs to execute for all entries.
This handles a four branch decision. If LAGS>1 and TREND is non-zero, the first
LINREG is executed. If LAGS>1 and TREND is zero, it's the second. Because the {}'s
wrap around those two, the second ELSE applies when LAGS is not bigger than 1.
So if LAGS<=1 and TREND is non-zero, the third LINREG is executed and finally if
LAGS<=1 and TREND is zero, the fourth LINREG is executed. (Note that the check for
LAGS>1 is no longer necessary: SDIFF{1 to 0} is now interpreted in the natural
way of meaning no lags of SDIFF.)
if lags>1 {
if trend
linreg(noprint) series startl+lags endl
# series{1} constant strend sdiff{1 to lags-1}
else
linreg(noprint) series startl+lags endl
# series{1} constant sdiff{1 to lags-1}
}
else {
if trend
linreg(noprint) series startl+1 endl
# series{1} constant strend
else
linreg(noprint) series startl+1 endl
# series{1} constant
}
if trans == 1
set transfrm nbeg nend = series
else if trans == 2
set transfrm nbeg nend = log(series)
else
set transfrm nbeg nend = sqrt(series)
See Also . . .
UG, Section 15.1 The rats Compiler.
impulse( options )
# first period shocks (only with INPUT option)
# list of path series (only with PATHS option)
Wizard
In the Time Series—VAR (Forecast/Analyze) wizard choose "Impulse Responses" in
the Action drop down.
Options
model=model name
Of the two ways to input the form of the model to be solved (the other is with
supplementary cards), this is the more convenient. MODELs are usually created by
GROUP or SYSTEM. It can include FRML's (formulas) only if they represent simple
linear equations, as IMPULSE requires that the model be fully linear. If the model
includes any identities, those should be last in the model.
As an alternative, you can use FACTOR to supply your own factorization of the
covariance matrix, such as the factor matrix produced by a CVMODEL instruction.
(User’s Guide, Section 7.5.4). This option was called DECOMP in earlier versions.
DECOMP is still recognized as a synonym for FACTOR. FACTOR can have a reduced
number of columns (shocks), but must have rows equal to the number of (struc-
tural) equations.
results=RECTANGULAR[SERIES] for result series
flatten=(output) RECT for use with %%RESPONSES
This provides a RECTANGULAR array of SERIES which will be filled with the
results. This will typically be used when you are getting the full moving average
representation of a var (the default behavior), when it will have dimensions
M×M. The responses to the shock in innovation i will be column i in the matrix.
If you are requesting responses to a single shock, the matrix will have dimensions
M×1. Each series created will be filled from entries 1 to steps.
FLATTEN packs the results into a NVAR*NSHOCKS × steps matrix in the form
needed for an element of the %%RESPONSES array (page UG–519).
[print]/noprint
window="Title of window"
Use NOPRINT to suppress the display of the responses to the output window or
file. Use the WINDOW option if you want to display the output in a (read-only)
spreadsheet window, which will have the title you supply. The output will be
organized as separate sub–tables for each variable shocked. You can export infor-
mation from this window to a file in a variety of formats using the File—Export...
operation.
input/[noinput]
shocks=VECTOR for first period shocks
You can use one of these options to input general first period shocks. With INPUT,
you supply the shocks on a supplementary card of the second form; with SHOCKS,
the indicated VECTOR provides the shocks. See “Technical Details and Choices
for Providing Shocks”.
With MATRIX, you set up a RECTANGULAR array to provide the paths of shocks to
the equations. The columns of the array should match the order of the equations,
that is, the shocks to the first equation should be in the first column. The number
of rows does not have to be equal to steps. The shocks will be set to zero for any
steps beyond those supplied by the array.
With PATHS, you supply a list of series on a supplementary card. These series
provide the paths of the shocks. You must define these series for steps entries
beginning with the START option entry. Use * on the supplementary card for any
equation whose shocks are to be zero for the entire period.
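As a sketch of the PATHS form (OILSHOCK here is a hypothetical series holding the desired path of shocks to the first equation of the six-equation CANMODEL; the asterisks give the remaining equations zero shocks):

```rats
* Responses to a user-supplied path of shocks to equation 1 only
impulse(model=canmodel,paths,steps=20,results=responses)
# oilshock * * * * *
```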
Given the moving average representation

(1)  y_t = Σ_{s=0}^{∞} Ψ_s u_{t−s}

the response at t=k to an initial shock z in the u process is Ψ_k z. For instance, the
response at step k to a unit shock in equation i at t=0 is just the ith column of the Ψ_k
matrix.
IMPULSE allows the shock to the system to take one of several forms:
Examples
These examples use the six variable var from the example program IMPULSES.RPF.
The interest rate is the third variable in the system, a fact which is used in several of
the examples.
impulse(model=canmodel,steps=20,results=impulses)
computes twenty steps of the impulse responses to all the orthogonalized shocks to
the equations in CANMODEL. IMPULSES(i,j) is a series defined from entries 1 to 20
which has the response of the ith dependent variable to a shock in the jth.
impulse(model=canmodel,steps=24,col=3,window="Shock to Rate")
shocks the third orthogonalized component (the rate shocks) and puts 24 step re-
sponses out to a window.
impulse(model=canmodel,steps=20,factor=bfactor,$
window="Responses",labels=||"f1","f2","r1","r2","n1","n2"||)
puts out to a window a 20 step response to each of the components in an orthogonal-
ized system. The shocks are given labels of f1, f2, r1, r2, n1 and n2.
impulse(model=canmodel,steps=24,shock=%unitv(6,3),noprint,$
results=torate)
computes the response to a unit shock in the interest rate alone (not an orthogonalized
component). So the impact on the interest rate will be 1.0 initially, while all
other variables will have an impact shock of 0. TORATE(i,1) will be the response of
variable i to the shock. Note that you must be very careful with the scaling of shocks
like this. A unit shock to a variable in logs means an impact the same as multiplying
the data by 2.718. Unit shocks in orthogonalized components as in the previous
examples all adjust automatically to the scale of the variables.
Older Syntax
impulse( options ) equations steps shockto VCVmatrix
# equation response newstart column (one for each equation if no MODEL)
# first period shocks (only with INPUT option)
# list of path series (only with PATHS option)
Parameters
equations Number of equations in the system.
steps Number of steps to compute. Prefer the newer STEPS option.
shockto Variable/component to shock. Prefer the newer COLUMN option.
VCVmatrix Covariance matrix for the Cholesky factor. Prefer the newer CV
option.
Equation Supplementary Cards (if you don’t use the MODEL option)
If you aren’t using the MODEL option, supply one supplementary card for each equa-
tion in the system. The order of listing affects the decomposition order. You must list
supplementary cards for identities last.
equation The equation name or number.
response (Optional) The series for the response of the dependent vari-
able of equation. rats ignores response if you use a single
IMPULSE instruction to print responses to each of the variables
in turn. You can get the whole matrix of responses by using the
RESULTS option.
newstart (Optional) The starting entry for the series response. The
default is 1.
column (Optional) The column of the CV matrix which corresponds to
this equation. By default, this is just the supplementary card
position: that is, the first supplementary card corresponds to
column one, the second card to column two, etc.
impulse(input,steps=20,noprint) 3
# 1 resp1
# 2 resp2
# 3 resp3
# 1.0 -1.0 0.0
computes responses over 20 periods to a shock in the first period of 1.0 to the first
equation and –1.0 to the second. This saves the responses in RESP1, RESP2 and
RESP3.
See Also . . .
UG, Section 7.5 Orthogonalization.
UG, Section 7.6 Using IMPULSE and ERRORS.
ERRORS Decomposition of Variance.
Parameters
"messagestring" (optional) Message to display in the dialog box. When used with
the ACTION=DEFINE option, this sets the first line of text. The
first line cannot be changed later. Use it to describe the overall
operation. With ACTION=MODIFY, messagestring sets the
second line of text, which you can change each time you use
ACTION=MODIFY. You can supply this as a string of text en-
closed in quotes, or as a LABEL or STRING type variable.
Options
action=define/[modify]/remove
ACTION=DEFINE creates and displays the information box. If you want to display
a progress bar in the dialog, you must include the PROGRESS, LOWER, and UPPER
options when you do ACTION=DEFINE. The messagestring parameter, if sup-
plied, is displayed as the first line in the dialog box.
ACTION=MODIFY updates the box. Use this along with the CURRENT option to
update the progress bar, or with the messagestring parameter to replace the
second line of text in the dialog box.
The ACTION=REMOVE option removes the box from the screen.
progress/[noprogress]
If you want to display a progress bar, use this option when you do
ACTION=DEFINE. You must also use LOWER and UPPER to set the lower and up-
per bounds for the progress bar. Once the operation has taken longer than ten
seconds, an estimate of the time to completion is added to the box. This assumes
that the time required from lower bound to upper bound is roughly linear.
Example
If you are doing something like a time-consuming Monte Carlo simulation which
loops over a large number of draws, you can use INFOBOX to keep the user (or your-
self) informed of its progress. This displays an information box with a progress
bar and a single descriptive line. The progress bar is incremented by one each trip
through the loop.
infobox(action=define,progress,lower=1,upper=ndraws) $
"Monte Carlo Simulation"
do draw=1,ndraws
*** Monte Carlo simulation instructions here:
...
infobox(current=draw)
end do draw
infobox(action=remove)
See Also . . .
DBOX Generates custom dialog boxes for user-input. These can
include check boxes, text input fields, lists of “radio” buttons,
and more.
MESSAGEBOX Displays simple informational messages in a dialog box. Un-
like INFOBOX, a MESSAGEBOX pauses program execution and
waits for the user to respond (using OK, OK or Cancel, or Yes,
No, or Cancel buttons).
Parameters
equation Equation whose initial estimates are to be computed.
start end Range of entries to use in computing the covariances needed for
the calculation. If you have not set a SMPL, this defaults to the
defined range of the dependent variable of equation.
Options
[print]/noprint
The output from INITIAL lists the variables, lags and initial estimates of the
coefficients. You can suppress the output with NOPRINT.
covariances=series of autocovariances
Use the option COVARIANCES to solve for the ARMA representation of a process
with a specific covariogram. The series of autocovariances should start
with lag 0 (the variance) in entry 1 and should have at least as many lags as the
highest AR lag plus the highest MA lag.
Description
INITIAL computes initial estimates for the ARMA parameters of equation using the
autocovariances of the dependent variable, or those provided by the COVARIANCES
option. INITIAL uses the algorithms on pp. 223-224 of Box, Jenkins, and Reinsel
(2008), using the linearly convergent (Gauss–Seidel) algorithm for the initial esti-
mates of the moving average part. If the equation is an autoregression, this process
produces solutions to the Yule-Walker equations.
There may be no solution to the system of equations used for the moving average
part. For instance, no MA(1) model is compatible with a first lag correlation greater
than .5. This failure usually indicates a poorly specified or overparameterized model.
If rats cannot solve the system of equations, it issues a warning, sets the estimate
for the highest MA lag to zero and computes estimates for the remaining coefficients.
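A minimal sketch of typical use, assuming Y is an existing series (the EQUATION options CONSTANT, AR and MA define an ARMA(2,1) specification with intercept; the name ARMAEQ is illustrative):

```rats
* Define an ARMA(2,1) equation for Y and compute initial ARMA estimates
equation(constant,ar=2,ma=1) armaeq y
initial(print) armaeq
```

The initial estimates can then serve as starting values for full estimation of the model.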
Parameters
arrays,... These are the objects for which data is to be read. You can use
any combination of variables. You can use arrays of arrays, but
any arrays must be dimensioned ahead of time (unless you use
the option VARYING).
Options
unit=[input]/data/other unit
INPUT reads the data from the specified I/O unit. The INPUT unit (the default
setting) is simply the console if you are working interactively or the current input
file if you are working in batch mode or have done a SOURCE or OPEN INPUT
instruction.
varying/[novarying]
status=INTEGER variable set to 0,1 status
[singleline]/nosingleline
These are more advanced options. See their description later in this section.
Description
Some general information about INPUT:
• It reads the arrays and variables in the order listed.
• It fills the elements of arrays in the order described below.
• It can read two or more arrays from the same line of data. It can read two or
more rows of an array from a single line.
• It reads complex numbers as a pair of real numbers: the real and imaginary
parts, respectively.
• A STRING variable is filled with the contents of a complete line.
Before you can INPUT data for a variable, you must first introduce it with DECLARE
or some other instruction. You also need to dimension any array prior to using it in
an INPUT instruction, unless you use the VARYING option.
Organization of Arrays
These are the three general forms of arrays and their organization:
One-dimensional arrays (VECTOR array type)
are written and read as row vectors.
Example
declare symmetric v(3,3)
declare real y
declare vector[complex] c2(2)
declare integer i
input v y c2 i
1.0 2.0 3.0 4.0 5.0 6.0 7.3
0.0 1.0 1.0 0.0 5
sets all of the following:

V  = | 1.0           |
     | 2.0  3.0      |
     | 4.0  5.0  6.0 |

Y  = 7.3

C2 = | 0.0 + 1.0i |
     | 1.0 + 0.0i |

I  = 5
Notes
You can also use COMPUTE for the initialization above:
compute [symmetric] v=||1.0|2.0,3.0|4.0,5.0,6.0||
compute y=7.3
compute [vector[complex]] c2=||%cmplx(0.0,1.0),%cmplx(1.0,0.0)||
compute i=5
We prefer to use COMPUTE for integer and real scalars and vectors. For arrays with
multiple rows, however, INPUT is easier to read and requires fewer extra characters.
Compare the multi-row COMPUTE initializer for V above with the simpler layout of
the INPUT version.
Advanced Options
varying/[novarying]
status=INTEGER variable set to 0,1 status
[singleline]/nosingleline
The VARYING and STATUS options allow you to work with lists whose size you do
not want to set in advance.
You can use VARYING to input data for a single VECTOR array of any numeric
or character type. With VARYING, the VECTOR is filled with as much data as is
available. By default, this is whatever is on a single line of data.
With NOSINGLELINE, it will read data until the end of the file—though READ is
preferable for data coming from a file.
If you use the option STATUS, INPUT does not give you an error if there is not
enough data to fill all the variables. Instead, it sets your status variable to 0. If
the INPUT is successful, it sets the status variable to 1.
Example
dec vector v
input(varying) v
1 5 10 25 50 100 250 500 1000
VARYING is useful if you would rather not count the entries or want to make quick
changes to the list without having to worry about changing the dimensions.
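The STATUS option can be sketched the same way (OK is an INTEGER name chosen for illustration; here the data line is one value short of the dimension of V):

```rats
declare vector v(5)
* STATUS sets OK to 0 rather than raising an error if the data run short
input(status=ok) v
1 5 10 25
if .not.ok
   display "Not enough data to fill V"
```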
See Also . . .
READ An alternative to INPUT with a wider set of options. READ is
designed primarily for reading data from an external file.
WRITE Writes arrays and variables to output or to an external file.
QUERY Requests input from the program’s user.
MEDIT Inputs a matrix from a screen data editor
ENTER Inputs data from supplementary cards.
FIXED Defines fixed-value arrays in user-defined functions and pro-
cedures.
Parameters
value1 value2 These are INTEGER variables or INTEGER elements of a VECTOR
or RECTANGULAR array which are filled with the information
requested. Some options only return one value—you do not need
to specify a variable for value2 in such cases.
p1 p2 (Optional) Use these where you want your PROCEDURE to mimic
the standard “these default to the defined range of series”
behavior of rats instructions. p1 and p2 should be PROCEDURE
parameters or options of type INTEGER. value1 and value2
will take the values of p1 and p2, if explicit values for those are
provided. Otherwise, they get the INQUIRE values. See the first
example.
regressorlist/[noregressorlist]
equation=EQUATION supplying list of variables
model=MODEL supplying list of variables
Use REGRESSORLIST and a supplementary card listing a set of variables in
regression format when you want to determine the maximum defined range for a
set of variables. value1 and value2 are set to the starting and ending entries of
that range. EQUATION is similar, but determines the range based on the variables
(both dependent and explanatory) in the equation you supply. MODEL determines
the range based upon all variables in all (linear) equations in the MODEL.
smpl/[nosmpl]
value1 and value2 are set to the starting and ending entries of the current
SMPL.
seasonal/[noseasonal]
This returns as value1 the current calendar seasonal: for instance, 4 for a quar-
terly CALENDAR, 12 for a monthly CALENDAR.
matrix=matrix name
This returns the dimensions of the indicated matrix. value1 and value2 are set
to the number of rows and columns, respectively. However, it's simpler to use the
%ROWS(matrix) and %COLS(matrix) functions instead.
lastreg/[nolastreg]
value1 and value2 are set to the starting and ending entries of the last regres-
sion. However, it's simpler to use the %REGSTART() and %REGEND() functions.
Examples
procedure test series start end
type series series
type integer start end
local integer startl endl
inquire(series=series) startl<<start endl<<end
This is similar to the entry code for many of the procedures which we provide with
rats. Let’s look at three possible command lines to execute @TEST.
@test gdp82
@test gdp82 1947:1 2017:4
@test gdp82 1955:1 *
In the first, STARTL and ENDL will be the start and end of the series GDP82. In the
second, they will be 1947:1 and 2017:4 and in the third, 1955:1 and the end of GDP82.
env ratsdata=modeldat.rat
cal(q) 1960:1
inquire(dseries=gdp) * endgdp
inquire(dseries=unemp) * endunemp
compute dataend=%imin(endgdp,endunemp)
allocate dataend
This checks the current length of series GDP and UNEMP on the file MODELDAT.RAT,
and sets DATAEND to the minimum of the two. You can use INQUIRE in this fashion to
write a forecasting program, for instance, which requires no modification from month
to month except updates of the data file.
See Also . . .
rats also offers a number of functions for getting information about arrays, series,
and other information:
%ROWS(matrix) Returns the number of rows in a matrix.
%COLS(matrix) Returns the number of columns in a matrix.
%SIZE(matrix) Returns total number of elements in a matrix
%REGSTART() Returns the starting entry of the last regression.
%REGEND() Returns the final entry of the last regression.
%EQNSIZE(equation) Returns the number of explanatory variables in an
equation. Use equation=0 to get information for the
last regression in this and the next six functions.
%EQNTABLE(equation) Returns a 2×K INTEGER array of the explanatory
variables of an equation. The first row is the series
number, the second is the lag.
%EQNCOEFFS(equation) Returns the vector of coefficients from an equation.
%EQNDEPVAR(equation) Returns the dependent variable of an equation.
%EQNREGLABELS(equation) Returns a K vector of STRINGS which gives the regres-
sor labels as they appear on regression output.
%MODELSIZE(model) Returns the number of equations or formulas in a
model.
%MODELDEPVARS(model) Returns a VECTOR[INTEGER] which lists the de-
pendent variables of the equations or formulas in a
model.
Wizards
The relevant regression and estimation wizards provide fields for including instru-
mental variables, so you do not need to do an INSTRUMENTS instruction if you will be
using a wizard to do the estimation.
Parameters
variables List the exogenous and predetermined variables in regression
format. Note that the CONSTANT isn’t included automatically as
an instrument. If you need it, put it in the variables list.
Options
drop/[nodrop]
add/[noadd]
With ADD and DROP, you can make small changes to the existing list of exogenous
variables. ADD adds the new list to the existing one, while DROP removes any of
the listed variables. ADD and DROP are useful when you estimate large models
with many potential instruments. If each equation uses a different subset of the
instruments, these options can simplify specification of the instrument sets.
print/[noprint]
Use PRINT to list the current set of instruments.
Example
instruments constant trend govtwage taxes govtexp $
capital{1} production{1} profit{1}
sets the instruments list for Klein’s Model I.
Notes
The instruction NLSYSTEM has a special option MASK which allows a different set of
instruments to be used for each equation in the system. You provide a RECTANGULAR
array with dimensions “number of instruments” × “number of formulas” which has
a 1.0 in row i, column j if and only if you want instrument i to be used for formula
j. For instance:
instruments constant csz{1 to 4} pcs{1 to 4} aaz{1 to 4}
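To sketch how the mask might be set up for a three-formula system using that 13-instrument list (the FRML names and the particular exclusions are hypothetical; %ONES fills the array with ones and the COMPUTEs zero out the excluded cells):

```rats
dec rect mask(13,3)
* Start with every instrument used in every formula...
compute mask=%ones(13,3)
* ...then exclude instruments 10 and 11 from the first formula
compute mask(10,1)=0.0, mask(11,1)=0.0
nlsystem(instruments,mask=mask) frml1 frml2 frml3
```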
You need to be careful in using lags as instruments. For instance, in the example of
the large simultaneous equations model above, lags 1 to 4 of Y are used as instru-
ments. Since Y{4} isn’t available until T=5 (at a minimum), the estimation range can
start no earlier than period 5. This can be a major problem in a panel data set, as you
lose data points in each cross section. An alternative to using lag notation is to create
a separate series for each lag, but with zero values where the lagged data is unavail-
able. For panel data, use the %PERIOD function to get the time period within an indi-
vidual’s data. For instance, rather than Y{1 to 4}, you could do the following:
set y1 = %if(%period(t)<=1,0.0,y{1})
set y2 = %if(%period(t)<=2,0.0,y{2})
set y3 = %if(%period(t)<=3,0.0,y{3})
set y4 = %if(%period(t)<=4,0.0,y{4})
and then use Y1 Y2 Y3 Y4 on the instrument list.
Variables Defined
%NINSTR number of variables on the list (INTEGER)
See Also . . .
SWEEP Projects a set of target variables on a set of other variables.
UG, Section 2.3 Instrumental Variables and Two-Stage Least Squares.
UG, Section 4.9 Method of Moments Estimators (Univariate).
UG, Section 4.10 Non-Linear Systems Estimation.
%instlist() Function returning current instruments as a regressor list
%insttable() Function returning current instruments as a table
%instxvector(t) Function for extracting an X(t) for the current instrument set
kalman( options )
Basic Options
print/[noprint]
ftests/[noftests]
These control the printing of the regression and F test output. The defaults are
NOPRINT and NOFTESTS. If you want to display output based on the value of a
conditional expression, you can use the syntax OPTION=(condition). See “Ex-
amples” on page RM–273 in this section for more.
Advanced Options
startup=startup entry
When you start the Kalman filter without an ESTIMATE, the first KALMAN in-
struction should include this option. KALMAN instructions thereafter will recom-
pute coefficients given the next observation.
rtype=[all]/current/onestep/recursive
This option allows you to specify how you want the residuals series filled.
ALL : all entries using the updated coefficient estimates.
CURRENT : just the current entry, using the updated coefficients.
ONESTEP : just the current entry, using the previous coefficients.
RECURSIVE : recursive residual, just the current entry.
backwards/[noback]
With the BACKWARDS option, the filter operates backwards for this update, adding
an observation to the beginning of the sample rather than to the end. KALMAN
replaces the lags in the equation with leads, and vice versa.
drop=observation to drop
add=observation to add
Use DROP and ADD to drop an observation from the sample or to add one, respectively. You
can use DROP along with the TEMP and CHANGE options (below) to see the effect
upon the coefficients of dropping a single observation.
temp/[notemp]
change=(output) VECTOR of coefficient changes
TEMP causes KALMAN to compute the new coefficients and (optionally) residuals,
then discards the results, leaving the filter in the same state as it was before the
KALMAN instruction. Use CHANGE with or without TEMP to save the change in the
coefficient vector of a one equation system.
The V option gives the variances of v; if you don’t include it, the equation
variances provided on the KFSET are used. If you do not use Y, the Kalman filter
error is taken to be zero, so the coefficients don’t change; only the covariance
matrix of the coefficients is updated.
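As a sketch (the entry and the vector name DBETA are hypothetical), you could examine how the coefficients of a one-equation system would change if a single observation were dropped, without disturbing the state of the filter:
kalman(temp,drop=1975:3,change=dbeta)
display dbeta
Because of TEMP, the filter is left exactly as it was before the KALMAN instruction, while DBETA records the change in the coefficient vector.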
Coefficient Updating
To use the Kalman filter for coefficient updates, define the system using the SYSTEM
and related instructions, and then use ESTIMATE to initialize the Kalman filter,
computing estimates over some subset of the data range. Each KALMAN instruction
thereafter will add a single observation to the end of the data set.
Variables Defined
%NREG number of regressors in the first equation (INTEGER)
%XX the (X′X)⁻¹ matrix (SYMMETRIC)
Examples
system(model=yrmp)
variables gdp cpr ipd m1
lags 1 to 4
det constant
end(system)
estimate 1948:1 2011:4
do time=2012:1,2017:4
kalman(print=(time==2017:4))
end do
ESTIMATE computes the regressions through 2011:4 and prints them with the F-
tests. The loop then executes KALMAN twenty-four times (quarterly data) so that the
last produces estimates using data through 2017:4. This only prints the estimates for
this final period because the argument for PRINT will be zero until TIME is 2017:4.
equation kfeq y
# constant x1 x2 x3
system kfeq
end(system)
estimate(cohistory=cokalman) 1 25
do entry=26,200
kalman(cohistory=cokalman)
end do entry
This Kalman filters a single equation over the period 26 to 200 and saves time series
of the coefficients in series COKALMAN(1),...,COKALMAN(4).
Parameters
residuals, coeffs first series in a block of series for the residuals and coefficients,
respectively (use VECTORS of SERIES or numbered series). The
new RESIDUALS and COEFFS options are generally more conve-
nient for saving these values.
printflag, testflag KALMAN prints output and F-tests (respectively) when these are
equal to 1 (usually set with logical expressions). These override
the NOPRINT and NOFTESTS options.
See Also . . .
UG, Section 7.14 Kalman Filter.
DLM General state-space and dynamic linear modeling
SYSTEM Initial instruction in the definition of a system.
KFSET Defines parameters for a Kalman filter.
TVARYING Defines parameters for time-varying coefficients.
ESTIMATE Estimates a system of equations.
Parameters
list of matrices List of SYMMETRIC arrays which are to hold the covariance
matrices of the coefficients (states). You can use the option
NAMES as an alternative to listing these explicitly.
There should be one array for each equation in the system,
except if the system is a simple OLS vector autoregression,
in which case you only need one array.
If you do not need access to the covariance matrices (either
to set them or to examine them), you can omit the list of
matrices. rats will store all information internally.
Technical Information
rats analyzes the equations in the system individually. The following model is used
for each equation:
Let β(t|t) be the estimates of β(t) using information through t, and let the covariance
matrix of β(t|t) be written Σ(t). What KFSET provides are
1. The Σ(t) matrices (one for each equation)
2. The n(t) variances. The options VARIANCE, CONSTANT, VECTOR and the supplementary cards
allow you to handle both situations where n is constant over time, and
those where it changes.
Options
variance=[concentrated]/known
constant/[noconstant] (with VARIANCE=KNOWN only)
v=VECTOR of variances (with VARIANCE=KNOWN, CONSTANT only)
Use these options for setting the variances of the measurement equation errors
(nt). Note that constant variances are provided in a VECTOR. This is done because
a SYSTEM can, and often does, have several equations. If you’re analyzing a single
equation, just use a vector with dimension one, or the notation ||variance||.
(The VARIANCE option replaces the SCALE option used before version 7).
Description
KFSET uses the list of matrices in two ways:
• If you execute KALMAN without an ESTIMATE, list of matrices supplies
the initial estimates of the covariance matrices of the coefficients (states). You
have to dimension the array(s) and set them before you can do the KALMAN.
• If you do execute an ESTIMATE, rats fills the list of matrices with the
estimated covariance matrices. For this use, you do not need to dimension the
arrays because ESTIMATE does it for you.
If you are using KALMAN without ESTIMATE, the initial coefficients also need to be
set. This is usually done with the ASSOCIATE instruction.
If you want time-varying parameters (M t non-zero), you need to use TVARYING as
well as KFSET.
Notes
If you want to write a single set of code which can handle VARs of various sizes, we
recommend that you use a VECTOR[SYMMETRIC] in place of a list. For instance,
dec vector[symmetric] kfs(neqn)
kfset kfs
makes KFS(1), KFS(2),..., KFS(NEQN) the list of matrices.
Equation Variances
There are several options which control the behavior of the measurement equation
variances. Choose the method which is correct for your assumptions from the
VARIANCE, CONSTANT and V options described above.
Examples
system(model=rmpy)
variables cpr m1 ppi ip
lags 1 to 12
kfset xxsys
end(system)
estimate 1948:1 2010:12 ESTIMATE dimensions and sets XXSYS
uses KFSET to obtain the (X′X)⁻¹ matrix from the estimation of a var.
equation kfeq y
# constant x1 x2
system kfeq
kfset(variance=known,constant) xxmat
# .01
tvarying tvmat
end(system)
dimension xxmat(3,3) tvmat(3,3)
Time-varying coefficients with n(t)=.01. You need to set the initial values of XXMAT,
TVMAT and the coefficients of KFEQ before you can use KALMAN.
Parameters
list of series List of series to be given labels.
Supplementary Card
The labels can be any collection of characters (up to sixteen) enclosed within single or
double quotes. You can also use string expressions, LABEL variables, or elements of
an array of LABELs.
Notes
The instruction EQV is similar, but goes beyond LABELS by attaching a name which can
also be used on input to reference the series. LABELS is usually more appropriate, as
labels aren’t subject to the restrictions put on symbolic names—you can use any
combination of characters (up to sixteen). Note also that any number of series can share
an output label, while series names (set by EQV) must be unique.
Example
open data setup.dat
declare vector[label] pairs
read(varying) pairs
compute npairs=%rows(pairs)/2
open data ticker.rat
cal(d) 1988:1:5
allocate 2002:12:31
dec vector[series] tickseries(2)
do i=1,npairs
labels tickseries
# pairs(2*i-1) pairs(2*i)
data(format=rats) / tickseries
...
end do i
reads a (possibly very long) list of pairs of series names from the file SETUP.DAT.
NPAIRS is the number of pairs read. The series are read, two at a time, from the
rats format file TICKER.RAT, and some (unspecified) analysis is performed. You
need to use LABELS because DATA(FORMAT=RATS) searches for the data on the basis
of the series labels.
Wizard
The Time Series—VAR (Setup/Estimate) wizard provides an easy, dialog-driven
interface for defining and estimating var models.
Parameters
list of lags The set of lags of each of the endogenous variables which will
go into each created equation. Usually these will be consecutive
lags as shown below, but you can skip lags. For example:
LAGS 1 2 3 6 12
Example
system(model=canmodel)
variables cangnp canm1 cantbill canunemp canusxr usagnp
lags 1 to 4
det constant
end(system)
defines a six-variable var with four lags of each variable plus a constant in each
equation.
Notes
If you use VARIABLES and LAGS to define your VAR, each equation will include the
same list of lags for each variable listed on VARIABLES. You cannot, for instance, use
a different set of lags for the dependent variable or leave lags of variable z out of the
variable x equation. You can get such flexibility by defining each equation separately
using the instruction EQUATION. However, you cannot use a prior with such models.
If you use the ECT instruction to define an error-correction model, use the actual lag
length that you want for the original, undifferenced model—rats will automatically
drop one lag when estimating the differenced error correction form. ECT models do
not support non-consecutive lists of lags. If you use ECT in your SYSTEM definition,
rats will always define the model using all lags from 1 to L, where L is the longest
lag specified on the LAGS instruction. For an error correction model with non-consec-
utive lags, you will need to create the differenced variables and error correction terms
manually and specify the model in error correction form directly, without using ECT.
Wizard
The Statistics—Limited/Discrete Dependent Variables wizard provides dialog-driven
access to most of the features of the LDV instruction.
Parameters
depvar Dependent variable. rats requires numeric coding for this.
start end Estimation range. If you have not set a SMPL, this defaults to
the maximum common range of all the variables involved.
residuals (Optional) Series for the residuals.
Options
truncate=[neither]/lower/upper/both
censor=[neither]/lower/upper/both
interval/[nointerval]
These choose the type of estimation method to be employed. The TRUNCATE op-
tions are used when an observation is in the data set only if the dependent vari-
able is in range. CENSOR is used when you can observe the data points which hit
the limit. INTERVAL is used when the data actually consist only of the upper and
lower limits, that is, all you can observe are bounds above and below. With the
interval estimation, you still need the dependent variable, but it is used only to
determine which observations to use.
[print]/noprint
vcv/[novcv]
smpl=SMPL series or formula (“SMPL Option” on page RM–546)
unravel/[nounravel] (User’s Guide, Section 2.10)
equation=equation to estimate
These are similar to the LINREG options.
robusterrors/[norobusterrors]
lags=correlated lags [0]
lwindow=neweywest/bartlett/damped/parzen/quadratic/[flat]/
panel/white
damp=value of γ for lwindow=damped [0.0]
lwform=VECTOR with the window form [not used]
cluster=SERIES with category values for clustered calculation
These permit calculation of a consistent covariance matrix allowing for het-
eroscedasticity (with ROBUSTERRORS) or serial correlation (with LAGS), as well as
clustered standard errors. See Sections 2.2, 2.3, and 2.4 of the User’s Guide and
“Long-Run Variance/Robust Covariance Calculations” on page RM–540.
Note, however, that the estimates themselves may be inconsistent if the distribu-
tional assumptions are incorrect.
weight=series of entry weights (“WEIGHT option” on page RM–549)
Use this option if you want to weight the observations unequally.
Description
The technical information on these models is provided in Section 12.3 of the User’s
Guide. All the models are based upon the standard
(1) y(i) = X(i)β + u(i), u(i) ~ N(0, σ²) i.i.d.
They differ in when and what values can be observed for the dependent variable.
Note that while theoretically you can have a data set which is truncated at one end
and censored at the other, LDV isn’t designed for it.
Truncated and censored models tend to be fairly easy to set up. The INTERVAL esti-
mator is a bit trickier. It’s used when all that is observed for an individual is a pair
of values which bracket the true dependent variable. If you have hard numbers for
the upper and lower bounds for all observations, you’re unlikely to get much of an
improvement from using LDV versus a linear regression using the interval midpoints
for the dependent variable. INTERVAL is most useful when some of the observations
are unlimited on one end. With the interval estimator, the “dependent variable” is
provided using two series, one indicated with the UPPER option and one with the
LOWER option. The dependent variable is used only to determine which observations
are valid—LDV can’t look just at the upper and lower series, since a missing value in
them is used to show no limit in that direction.
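As a sketch of an interval setup (series names hypothetical): UB and LB hold the upper and lower bracket values, with a missing value in UB or LB marking an observation that is unbounded in that direction:
ldv(interval,upper=ub,lower=lb) y
# constant x1 x2
Here Y is used only to determine which observations enter the estimation.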
All models are estimated by Newton-Raphson on the model reparameterized as
described in Olsen (1978), that is, with {γ, h} ≡ {β/σ, 1/σ}. This assumes that σ is
being estimated: if you want to input a specific value, use the SIGMA option. With this
parameterization, for instance, the log likelihood for an observation for the interval
model is
(2) log L = log(1 − Φ(L(i)h − X(i)γ))  if unbounded above
          = log(Φ(U(i)h − X(i)γ))  if unbounded below
          = log(Φ(U(i)h − X(i)γ) − Φ(L(i)h − X(i)γ))  otherwise
The covariance matrix for the natural parameterization is estimated by taking minus
the inverse Hessian from the reparameterized model and using the “delta method”
(linearization) to recast it in the original terms.
Hypothesis Testing
You can apply the hypothesis testing instructions (EXCLUDE, TEST, RESTRICT and
MRESTRICT) to estimates from LDV and you can use the Statistics—Regression Tests
wizard to set those up. They compute the “Wald” test based upon the quadratic
approximation to the likelihood function. rats uses the second set of formulas in
Section 3.3 of the User’s Guide to compute the statistics. Note that you cannot use the
CREATE or REPLACE options on RESTRICT and MRESTRICT.
Examples
ldv(censor=lower,lower=0.0) hours
# nwifeinc educ exper expersq age kidslt6 kidsge6 constant
This is a classic “tobit” model, censored below at zero.
ldv(truncate=lower,lower=0.0,smpl=affair) y
# constant z2 z3 z5 z7 z8
This is truncated below at zero, with the sample restricted to those with a non-zero
value for AFFAIR.
Wizard
Use the Statistics—Linear Regressions wizard, and select “OLS”, “Weighted Least
Squares”, or “Instrumental Variables” as the technique, depending upon the applica-
tion.
Parameters
depvar Dependent variable.
start end Range to use in estimation. If you have not set a SMPL, this
defaults to the largest common range for all the variables in-
volved. If you use the INSTR option, the instruments are includ-
ed in determining the default.
residuals (Optional) Series for the residuals. Omit with * if you do not
want to save the residuals but want to use coeffs.
coeffs (Optional) Series for the coefficients. It rarely makes sense to
use a coeffs series rather than simply using the %BETA vector.
Options
[print]/noprint
vcv/[novcv]
These control the printing of regression output and the printing of the estimated
covariance/correlation matrix of the coefficients (page Int–77 of the Introduction).
unravel/[nounravel]
Substitutes for ENCODED variables (User’s Guide, Section 2.10). rats does not
print the intermediate regression (in terms of encoded variables).
cmom/[nocmom]
Use the CMOM option together with the CMOMENT instruction (executed prior to the
LINREG). LINREG takes the required cross products of the variables from the ar-
ray created by CMOMENT. This has two uses:
• By computing the cross products just once, you can reduce the computations
involved in repetitive regressions.
• By altering the %CMOM array before running the regression, you can implement
ridge, mixed and related estimators (Section 1.2 in Additional Topics PDF).
To use CMOM, you must include all variables involved in the regression in the list
for CMOMENT. rats will ignore the LINREG start and end parameters, and carry
over the range from CMOMENT.
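As a sketch (series names illustrative), the pattern is to compute the cross-product matrix once with CMOMENT and then estimate from it. Note that the CMOMENT list must include the dependent variable as well as all the regressors:
cmoment
# constant x1 x2 y
linreg(cmom) y
# constant x1 x2
Further regressions on subsets of the same variables can reuse the %CMOM array without recomputing the cross products.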
create/[nocreate]
If you use CREATE, LINREG does not estimate the regression. Instead, it gener-
ates the standard regression output and statistics using information that you
provide. See “Options Used with CREATE” for details.
robusterrors/[norobusterrors]
lags=correlated lags [0]
lwindow=neweywest/bartlett/damped/parzen/quadratic/[flat]/
panel/white
damp=value of γ for lwindow=damped [0.0] (overrides LWINDOW setting)
lwform=VECTOR with the window form [not used]
cluster=SERIES with category values for clustered calculation
When used without the INSTRUMENTS option, these permit calculation of a con-
sistent covariance matrix allowing for heteroscedasticity (with ROBUSTERRORS)
or serial correlation (with LAGS), or clustering based upon some other set of
categories. See Sections 2.2, 2.3, and 2.4 of the User’s Guide and “Long-Run Variance/
Robust Covariance Calculations” on page RM–540. Note especially the possible
computational problems that may arise when using LAGS.
center/[nocenter]
CENTER adjusts the formula for the weight matrix to subtract off the (sample)
means of Z ' u , which may be non-zero for an overidentified model. For more infor-
mation, see “ZUMEAN and CENTER options” on page RM–541.
jrobust=statistic/[distribution]
You can use this option to adjust the J-statistic specification test when the
weighting matrix used is not the optimal one. See Section 4.9 in the User’s Guide
for more information.
Examples
linreg(define=foodeq) foodcons / resids
# constant dispinc trend
regresses FOODCONS on DISPINC and TREND, saves the residuals as RESIDS, and cre-
ates equation FOODEQ.
Variables Defined
%BETA Coefficient VECTOR
%DURBIN Durbin-Watson statistic (REAL)
%MEAN Mean of dependent variable (REAL)
%NDF Degrees of freedom (INTEGER)
%NFREE Number of free parameters (INTEGER)
%NOBS Number of observations (INTEGER)
%NREG Number of regressors (INTEGER)
%RESIDS Series containing the residuals (SERIES)
%RSS Residual sum of squares (REAL)
%SEESQ Standard error of estimate squared (REAL)
%STDERRS VECTOR of coefficient standard errors
%TSTATS VECTOR containing the t-stats for the coefficients
%RHO First lag correlation coefficient (REAL)
%VARIANCE Variance of dependent variable (REAL)
%XX Covariance matrix of coefficients, or (X′X)⁻¹ (SYMMETRIC)
form=[ftest]/chisquared
If you change %XX or use COVMAT, and the new matrix is itself an estimate of the
covariance matrix, use the option FORM=CHISQUARED to switch those formulas
based upon F and t to those based upon χ² and Normal.
regcorr=number of restrictions
Use this option if you have computed the coefficients subject to restrictions. This
allows LINREG to compute the proper degrees of freedom.
equation=equation to use
You can use the EQUATION option as an alternative to a supplementary card
to input a regression which differs from the preceding one (so you can’t use
LASTREG). The equation should be an equation which you have already de-
fined—it supplies the list of explanatory variables and dependent variable. Use
the COEFF and COVMAT options to input the coefficients and covariance matrix.
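As a sketch (all names hypothetical, and assuming a coefficient vector B0 and covariance matrix V0 have already been computed elsewhere), the idea is:
linreg(create,equation=supplyeq,coeff=b0,covmat=v0,form=chisquared)
which produces standard regression output for the previously defined equation SUPPLYEQ from the supplied values.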
Parameters
index The index variable, such as IEQN in the example below. It
should be followed by at least one blank, then the =.
list of values The set of values (numbers or series) the index is to take. You
can use a VECTOR of INTEGERS to form part or all of the list.
CARDS fields The fields from the standard supplementary card for the in-
struction, written in terms of the index.
Description
CARDS generates a supplementary card for each element in the list of values
in turn, by setting index to the value and evaluating the CARDS fields. The easiest
way to explain how the procedure works is with an example:
declare vector[series] imp(7)
impulse(cv=vcv,steps=48,col=1)
# 1 imp(1) 1 1
# 2 imp(2) 1 2
# 3 imp(3) 1 3
# 4 imp(4) 1 4
# 5 imp(5) 1 5
# 6 imp(6) 1 6
# 7 imp(7) 1 7
Notice the pattern in the supplementary cards. With CARDS and LIST, you can re-
place this with
declare vector[series] imp(7)
list ieqn = 1 to 7
impulse(cv=vcv,steps=48,col=1)
cards ieqn imp(ieqn) 1 ieqn
LIST is an actual rats instruction, while CARDS is a line which replaces the set of
supplementary cards.
Comments
Only one LIST is active at a given time. You can set up one LIST and use it for many
instructions, for instance,
list ieqn = 1 to 6
sur 6
cards ieqn
sur 4 / equate 3 5 and 6 won’t be used since the SUR says 4 equations
cards ieqn
but you can’t set up a LIST IEQN = ... and then a LIST ISER = ... and then
use IEQN on a CARDS. The second LIST instruction deactivates the first.
Applicability
You can use CARDS for the main set of supplementary cards for the forecasting in-
structions FORECAST, STEPS, IMPULSE, ERRORS, SIMULATE and HISTORY; and the
instructions GRAPH, GBOX, SCATTER, SUR, and SMODIFY.
You can also use CARDS to supply the supplementary card for an ENTER instruction
(in a PROCEDURE, for example). Use the SEQUENCE option on ENTER to indicate the
value for the LIST index that should be used in evaluating the CARDS instruction.
Parameters
datatype Variable type that you want to assign to these variables. Local
variables can be any of the data types supported by rats. See
Section 1.4 of the User’s Guide.
list of names The list of variables that will have datatype. Separate the
names with blanks. Symbolic names must be unique—if you
attempt to assign a new type to a name already in use as a local
variable, you will get an error message.
You can also set the dimensions of an array at the time you
declare it, by using a dimension field instead of simply the vari-
able name—just provide the dimensions for the array in paren-
theses immediately after the variable name. The dimensions
can use PROCEDURE or FUNCTION parameters or options.
Description
Any variable (other than a PROCEDURE parameter or option or FUNCTION param-
eter) in a rats program is a global variable unless you state otherwise. You can
use a global variable within any part of a program, including PROCEDURES and
FUNCTIONS. Global variable names must be unique—you cannot define two differ-
ent global variables with the same name (even if they are of different types) within a
particular program. Also, within a particular program or session, you cannot redefine
an existing global variable as a different type. For instance, you cannot redefine a
VECTOR as a REAL.
Procedures and functions, however, may have local variables as well as globals.
These are only recognized within the procedure which defines them. rats keeps
global and local variables completely separate, so a local variable can have the same
name as a global one. If a name conflict arises within a procedure, the local variable
is used.
If you plan to use a single PROCEDURE or FUNCTION in many different programs, it is
a good idea to write the procedure using only parameters, local variables, procedure
options, and variables that rats defines. That way there is no possibility that your
procedure will conflict with the names of global variables used in the main program.
You declare a local variable using LOCAL. LOCAL is quite similar to DECLARE.
Note that rats does not release the memory space for local series and local arrays
when you exit the procedure. If you find it necessary to release the space before re-
turning, use the instruction RELEASE.
Example
These are the first actual program lines for the @DFUNIT procedure. With the
exception of DFUNIT itself and %%AUTOP, all the variables defined by the TYPE, OPTION
and LOCAL instructions are part of DFUNIT’s “namespace”: they are recognized only
here, and the same names can be used in other procedures or as global variables
without conflict. %%AUTOP is intentionally defined using DECLARE so it will be visible
outside the procedure.
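The procedure’s code itself is not reproduced here. As a generic sketch of the layout being described (the parameter, option and variable names below are illustrative, not the actual @DFUNIT source):
procedure dfunit yser start end
type series yser
type integer start end
*
option integer lags 1
option switch print 1
*
local integer nobs
local series ydiff
In the real procedure, %%AUTOP is set up with DECLARE rather than LOCAL, which is what makes it global.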
Wizard
You can use the Data/Graphics—Transformations wizard.
Parameters
series Series to transform.
start end Range to transform. If you have not set a SMPL, this defaults to
the defined range of series.
newseries Series for the result. By default, newseries=series.
newstart Starting entry of new series. The default is newstart=start.
Examples
The first instruction takes the natural log of GNP and puts it into the series LOGGNP.
The second replaces the series FORECAST by its log.
log gnp / loggnp
log forecast
See Also . . .
EXP Takes the exponential (inverse log) of a series.
CLN Takes the natural log of a complex series.
LOG(x) Returns the log of a single real value.
%LOG(x) Takes the log of each element of a matrix.
Description
LOOP repeatedly executes the block of instructions between the LOOP and END LOOP
statements. You decide where and when to execute BREAK instructions to terminate
the loop. BREAK is the only way out of a LOOP. A NEXT will jump back up to the top of
the loop from wherever it is executed.
LOOP is most valuable when you make the continuation decision in the middle of the
block, rather than at the beginning or end. It is good programming practice to use
WHILE or UNTIL where possible, since they make it easier to follow the program flow.
Example
loop
menu "What Next?"
choice "Specify Model"
source specify.src
choice "Estimate Model"
source estimate.src
choice "Do Forecasts"
source forecast.src
choice "Quit"
break
end menu
end loop
This repeatedly offers a set of choices, until the person using the program selects
Quit. The BREAK after the Quit choice breaks control out of the loop.
See Also . . .
BREAK Breaks control out of a loop.
NEXT Skips to the top of a loop.
DO Looping over an index.
DOFOR Looping over a list of items.
WHILE Conditional looping.
UNTIL Conditional looping.
UG, Section 15.1 The rats Compiler
lqprog( options ) x
Parameters
x VECTOR into which the solution will be saved. This corresponds
to the x matrix (the matrix of unknowns) in the technical
discussions later in this section. You do not need to declare or
dimension this array ahead of time.
c A b Q These are for an older form of the instruction. They are the 2nd
through 5th parameters in order and provide the same infor-
mation as the newer options of the same names. We strongly
recommend using the option form.
Options
c=VECTOR of coefficients for c matrix [unused]
q=SYMMETRIC array of quadratic coefficients for Q matrix [unused]
These input the matrices controlling the objective function. If there is a Q matrix,
LQPROG does quadratic programming; otherwise it does linear programming.
[trace]/notrace
If you use TRACE, LQPROG will issue a report on the progress of the estimation
after each iteration.
[print]/noprint
This controls the printing of the results.
feasible/[nofeasible]
By default, LQPROG computes the full solution of the problem supplied. If you use
the FEASIBLE option, LQPROG will only compute the initial feasible solution.
Description
LQPROG can solve both linear programming problems and quadratic programming
problems. Examples of each are shown below. For details on using these instructions,
see Section 13.2 in the User’s Guide.
Note that, while LQPROG uses a fairly good general-purpose algorithm for solving
these problems, it is not designed to handle very large problems with hundreds of
variables or constraints.
Variables Defined
%FUNCVAL Final value of the function (REAL)
%LAGRANGE VECTOR of Lagrange multipliers on the constraints
Examples
This example minimizes c′x + (1/2)x′Qx, that is,
x1² + x2² + x3² + x1·x3 − 10x1 − 4x3
subject to 2x1 + x2 + x3 = 2
x1 + 2x2 + 3x3 = 5
2x1 + 2x2 + x3 ≤ 6
xi ≥ 0
declare rectangular amat(3,3)
declare symmetric qmat(3,3)
declare vector cvec(3) bvec(3)
compute amat = || 2, 1, 1 | 1, 2, 3 | 2, 2, 1||
compute bvec = || 2, 5, 6 ||
compute cvec = || -10 , 0, -4 ||
compute qmat = || 2 | 0, 2 | 1, 0, 2 ||
lqprog(equalities=2,c=cvec,a=amat,b=bvec,q=qmat) results
which produces the output:
QPROG converges at 3 iterations
minimum = -5.480000
solution x =
0.2000 0.0000 1.6000
The example below demonstrates a simple portfolio optimization routine (see ex-
ample PORTFOLIO.RPF for a more extensive example). This finds the global mini-
mum variance portfolio. N is the number of returns, and ExpRet and CovMat are the
expected values and covariance matrix of the returns. To modify this for any number
of assets, you simply need to change those three variables accordingly.
compute N=3
compute [vect] ExpRet = ||.15,.20,.08||
compute [symm] CovMat = ||.20|.05,.30|-.01,.015,.1||
*
* Compute the minimum variance portfolio, and its expected return
*
compute units=%fill(1,N,1.0)
lqprog(q=covMat,a=units,b=1.0,equalities=1) x
compute minr=%dot(x,expRet)
compute maxr=%maxvalue(expRet)
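LQPROG does linear programming when the Q option is omitted. A sketch with illustrative data (this assumes that, with no EQUALITIES option, all rows of A are treated as ≤ constraints): minimize −x1 − 2x2 subject to x1 + x2 ≤ 4, x1 ≤ 2 and x ≥ 0:
compute alp=||1.0,1.0|1.0,0.0||
compute blp=||4.0,2.0||
compute clp=||-1.0,-2.0||
lqprog(c=clp,a=alp,b=blp) xlp
For this problem the minimizer is x = (0, 4) with objective value −8.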
Parameters
array RECTANGULAR array to create. You do not need to declare or
dimension it.
start end Range of entries to use. If you have not set a SMPL, this defaults
to the largest range over which all the variables are defined.
Two additional parameters used in previous versions to store the number of observa-
tions and the number of variables have been deprecated. MAKE now automatically
saves these values in the variables %NOBS and %NVAR.
Description
MAKE creates the array from entries start to end of the listed series. It will au-
tomatically create and dimension the array. By default, each column represents a
different variable and each row represents an observation in the array. This is the
standard arrangement for an X array in y = Xβ. You can use the TRANSPOSE option
to get X′.
Options
equation=name/number of equation for variables
lastreg/[nolastreg]
depvar/[nodepvar]
With EQUATION, MAKE uses the explanatory variables from the equation as its list
of variables, while LASTREG takes the explanatory variables from the last regres-
sion (or similar instruction). With either option, you can use DEPVAR to include
the dependent variable as well, as the final column.
panel/[nopanel]
You can use this option to split one or more panel data series into a T×(NK) ar-
ray, where T is the number of time periods per individual, N is the number of
individuals, and K is the number of series.
transpose/[notranspose]
Use TRANSPOSE if you want the transpose of the X array—each column is an en-
try and each row a variable. If you want to use the instruction OVERLAY to isolate
a subset of the entries, you need to use TRANS.
Examples
make(lastreg) x %regstart() %regend()
compute m=%identity(%nobs)-x*inv(tr(x)*x)*tr(x)
This takes the matrix of explanatory variables from the previous regression and cre-
ates the matrix X from them. That is used to compute the residual projection matrix:
I − X(X′X)⁻¹X′
make(panel) ut
# %resids
compute tdim=%rows(ut),ndim=%cols(ut)
This creates a T × N matrix UT from the residuals series %RESIDS where T is the
number of observations per individual in the panel set and N is the number of indi-
viduals.
inquire(regressorlist,valid=common) n1 n2
# constant x1 x2 x3 y1
make(smpl=common) x n1 n2
# constant x1 x2 x3
make(smpl=common) y n1 n2
# y1
compute [vector] b=inv(tr(x)*x)*(tr(x)*y)
does a least squares regression using matrix instructions. We include the SMPL op-
tion because each MAKE, on its own, might determine a different default range.
“Unmaking” a Matrix
Use SET or a sequence of SET instructions to set series equal to the rows (or columns)
of matrices. For example,
set x2 1 100 = a(t,1)
sets X2 equal to the first 100 entries of column 1 of A. Remember, however, that the T
subscript runs over the start to end range on SET, which is not necessarily 1,...,di-
mension. The following (trivial) example shows how to fix the subscript in the SET to
keep the output series aligned with the input. (XX will be the same as X).
make r 1948:1 2017:12
# constant x
set xx 1948:1 2017:12 = r(t-(1948:1-1),2)
If the size of the matrix can change from one use to another, you should “unmake”
into a VECTOR of SERIES. For instance, the following creates a vector of NCOLS series
from the columns of R.
compute ncols=%cols(r)
dec vect[series] columns(ncols)
do i=1,ncols
set columns(i) 1 %rows(r) = r(t,i)
end do i
Missing Values
If any series has a missing value at an entry, MAKE drops that entry from the con-
structed array. If you are using MAKE to create several arrays which must have iden-
tical sets of entries, use a common SMPL to deal with missing values.
smpl(reglist)
# constant v1{0 to 3} includes all variables
make x
# constant v1{1 to 3}
make y
# v1
Variables Defined
%NOBS Number of observations (INTEGER)
%NVAR Number of variables (INTEGER)
Description
MAXIMIZE finds (or attempts to find) the β which solves:

    max (over β)  G(β) = Σ_{t=1,...,T} f(y_t, X_t, β)
where f is a rats formula (FRML). Before you can use it, you must:
• Set the list of free parameters using NONLIN
• Create the explanatory formula f using FRML
• Set initial values for the parameters (usually with COMPUTE or INPUT)
See Chapter 4 in the User’s Guide for a more detailed description of NONLIN, FRML
and the processes used in non-linear estimation.
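As a toy illustration of the criterion being maximized (plain Python, not rats code; the data and the crude grid-refinement search are purely hypothetical), the sketch below maximizes Σ f(y_t, β) for a normal log density with unknown mean. The maximizer should land at the sample mean:

```python
y = [1.0, 2.0, 4.0, 5.0]   # hypothetical data

def f(yt, beta):
    # log density of N(beta, 1), up to its additive constant
    return -0.5 * (yt - beta) ** 2

def G(beta):
    # the MAXIMIZE-style criterion: sum of f over the sample
    return sum(f(yt, beta) for yt in y)

# crude maximizer: successively refined grid search
lo, hi = -10.0, 10.0
for _ in range(60):
    grid = [lo + (hi - lo) * k / 20 for k in range(21)]
    best = max(grid, key=G)
    lo, hi = best - (hi - lo) / 20, best + (hi - lo) / 20
print(best)  # -> close to mean(y) = 3.0
```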
Parameters
frml This is the FRML (created with the FRML instruction) which com-
putes the function f.
start end the estimation range. If you have not set a SMPL, it defaults to
the range over which frml can be computed.
Some frmls (particularly recursively defined formulas such as
garch models) require you to specify a range, either with SMPL
or start and end. If you get the error message:
Missing Values ... Leave No Usable Data Points
your range either has to be set explicitly or, if you have already
specified one, it must be changed to match the available data.
Options
parmset=PARMSET to use [default internal]
This option selects the parameter set to be estimated (User’s Guide, page UG–129). rats
maintains a single unnamed parameter set which is the one used for estimation
if you don’t provide a named set.
method=bhhh/[bfgs]/simplex/genetic/annealing/ga/evaluate
Chooses the estimation method. See Chapter 4 in the User’s Guide for a technical
description of these. BHHH should be used only if the function being maximized
is the log likelihood (apart from additive constants). BFGS and BHHH require the
formula to be twice continuously differentiable and are the only ones that can
compute standard errors. The others are derivative-free, make less stringent
assumptions and are generally used as a PMETHOD choice to refine initial guess
values rather than as the final estimation method.
[print]/noprint
vcv/[novcv]
title=”title for output” [“MAXIMIZE”]
These are the same as for other regressions.
robusterrors/[norobusterrors]
lags=correlated lags [0]
lwindow=newey/bartlett/damped/parzen/quadratic/[flat]/panel/white
damp=value of g for lwindow=damped [0.0]
lwform=VECTOR with the window form [not used]
cluster=SERIES with category values for clustered calculation
These permit calculation of a consistent covariance matrix (Section 4.5 in the
User’s Guide) or clustered standard errors if you are doing quasi-maximum
likelihood estimation. With MAXIMIZE, the LAGS, LWINDOW, LWFORM and CLUSTER
options are used for dealing with correlation in the gradient elements. See
“Long-Run Variance/Robust Covariance Calculations” on page RM–540.
Hypothesis tests
If you have used METHOD=BFGS or METHOD=BHHH, you can test hypotheses on the coeffi-
cients using TEST, RESTRICT and MRESTRICT and compute standard errors for functions
of the coefficients using SUMMARIZE. Coefficients are numbered by their positions in the
parameter set. See “Technical Information” regarding the validity of these test statistics.
Missing Values
MAXIMIZE drops any entry which cannot be computed because of missing values.
Technical Information
Chapter 4 in the User’s Guide describes the estimation methods and how the instruc-
tion NLPAR controls some of the finer adjustments of them. MAXIMIZE uses numerical
derivatives to compute the gradient. After the first iteration, the perturbations used
in computing these adapt to the estimates of the dispersion of the parameters.
As discussed in Section 4.5, the validity of the covariance matrix and standard errors
produced by MAXIMIZE depend upon the functional form and the options that you
choose. In general, standard errors produced using METHOD=BHHH are asymptotically
correct only if the function being maximized is the log likelihood apart from additive
constants. This requirement applies even if the function you use is a 1–1 monotonic
transformation of the log likelihood and thus has an identical maximum. The default standard
error calculations for bfgs are correct only under similar circumstances. If you’re
maximizing a function other than the log likelihood, you should use the combination
of METHOD=BFGS and ROBUSTERRORS.
Examples
This estimates a probit model for a data set where there are repetitions of settings for
the explanatory variables (X1 and X2), with REPS and SUCCESS giving the total ob-
servations and the number of successes. Note the use of the “sub-formula” to describe
the function of the explanatory variables and how the PROBIT formula evaluates
this just once and puts it into the variable Z. The use of the sub-formula allows quick
changes to the basic model.
nonlin b0 b1 b2
frml zfrml = b0+b1*x1+b2*x2
frml probit = (z=zfrml(t)) , $
success*log(%cdf(z))+(reps-success)*log(1-%cdf(z))
compute b0=b1=b2=0.0
maximize probit
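The PROBIT formula above sums, over settings, the grouped binomial log likelihood s·log Φ(z) + (r − s)·log(1 − Φ(z)). A hedged plain-Python rendering of that same likelihood (Φ built from math.erf; the data rows are hypothetical, and the function names are illustrative rather than rats syntax):

```python
import math

def phi(z):
    # standard normal CDF, the analogue of %CDF
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def loglik(b0, b1, b2, data):
    # data rows: (x1, x2, reps, successes)
    total = 0.0
    for x1, x2, reps, succ in data:
        z = b0 + b1 * x1 + b2 * x2          # the "sub-formula"
        total += succ * math.log(phi(z)) + \
                 (reps - succ) * math.log(1.0 - phi(z))
    return total

data = [(0.0, 0.0, 10, 5), (1.0, 0.0, 10, 7)]  # hypothetical
print(loglik(0.0, 0.0, 0.0, data))  # 20*log(0.5) at the zero vector
```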
Box-Cox models can take several forms. The only one which requires use of
MAXIMIZE rather than NLLS has the dependent variable subject to a Box-Cox
transformation with unknown parameter. In rats, you can compute the Box-Cox
class of transformations using the function %BOXCOX. The following estimates
by maximum likelihood the model

    %BOXCOX(y,λ) = b0 + b1 %BOXCOX(x,λ) + u ,  u ~ N(0,σ²)

Initial guess values are obtained using a linear regression, which (with an adjust-
ment to the intercept) is a special case for λ=1.
nonlin b0 b1 lambda sigmasq
frml rhsfrml = b0+b1*%boxcox(x,lambda)
frml boxcox = (lambda-1)*log(y) + $
%logdensity(sigmasq,%boxcox(y,lambda)-rhsfrml(t))
linreg y
# constant x
compute sigmasq=%seesq,b0=%beta(1)+%beta(2)-1
compute b1=%beta(2),lambda=1.0
maximize(method=bfgs,robusterrors,iters=200) boxcox
Note: convergence can be extremely slow with these models, which is why the itera-
tions limit is increased.
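The %BOXCOX transformation is (x^λ − 1)/λ, with the λ→0 limit log(x). A small plain-Python check of those two facts (the function name mirrors %BOXCOX but is an illustration, not the rats implementation):

```python
import math

def boxcox(x, lam):
    # (x**lam - 1)/lam, continuous at lam = 0 where it becomes log(x)
    if abs(lam) < 1e-12:
        return math.log(x)
    return (x ** lam - 1.0) / lam

print(boxcox(5.0, 1.0))   # -> 4.0 (lambda = 1 is just x - 1, i.e. linear)
print(boxcox(5.0, 0.0))   # -> log(5)
```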
This estimates a linear regression on a panel data set allowing for heteroscedastic-
ity among the (five) individuals. The variance parameters are in the vector SIGSQ,
while the regression parameters come off the vector B generated using FRML with the
LASTREG option.
linreg iv
# constant f c
frml(lastreg,vector=b) zfrml
dec vector sigsq(5)
compute sigsq=%const(1.0)
nonlin b sigsq
frml groupgls = %logdensity(sigsq(%indiv(t)),iv-zfrml)
maximize(iters=200,method=bfgs) groupgls
Here Y given E is assumed to have an exponential distribution with mean B+E, where
B is a parameter to be estimated. This is estimated by maximum likelihood using the
bhhh algorithm.
data(format=prn,org=columns) 1 20 id y e
nonlin b
frml logl = -log(b+e)-y/(b+e)
maximize(method=bhhh) logl
Parameters
start end Range of entries to use. If you have not set a SMPL, this
defaults to:
• the range of the most recent regression, if you use the op-
tion LASTREG.
• the maximum range over which all the regressors are de-
fined, otherwise.
list of resids This is a list of one or more series of residuals. If you omit this,
it’s treated as a series of 1’s.
Options
See “Long-Run Variance/Robust Covariance Calculations” on page RM–540 for a more
complete explanation of the calculations and the main options (LAGS, LWINDOW,
CLUSTER).
lags=correlated lags [0]
The number of lags of autocorrelation (in the form of moving average terms) that
you want included. There are certain technical problems which arise when LAGS
is non-zero which typically require a choice for LWINDOW other than the default of
LWINDOW=FLAT. For the quadratic spectral window (LWINDOW=QUADRATIC), LAGS
gives the bandwidth, since a full set of lags are always used with that. For other
window types, note that rats counts the number of correlated lags, which is one
less than the width that is often used in descriptions of these windows. LAGS can
take non-integer values; while this feature is mainly used with the quadratic
window, it can be used with the others.
lwindow=newey/bartlett/damped/parzen/quadratic/[flat]/panel/white
damp=value of g for lwindow=damped [0.0]
LWINDOW chooses the form of the lag window to be used. NEWEYWEST and
BARTLETT are identical to each other, and to LWINDOW=DAMPED with DAMP=1.0.
QUADRATIC is the quadratic spectral window. DAMP gives the parameter g of
the window with LWINDOW=DAMPED (if DAMP is set to something other than 0.0,
LWINDOW is automatically set to DAMPED). None of these matter if LAGS is zero,
except for PANEL which is a special case of clustered calculation; it’s the equiva-
lent of FLAT with LAGS equal to the number of time periods per individual (minus
one).
center/[nocenter]
CENTER adjusts the formula to subtract off the (sample) means of Z′u, which may
be non-zero for an overidentified model. For more information, see “ZUMEAN
and CENTER options” on page RM–541.
[zudependent]/nozudep
ZUDEP is the default for MCOV (but not for NLSYSTEM), as there is little reason to
use the MCOV instruction when the instruments and residuals aren’t dependent.
print/[noprint]
Use PRINT if you want to print the computed matrix. It is printed in a table simi-
lar to the one used for the covariance matrix of regression coefficients (Introduc-
tion, page Int–77).
[square]/nosquare
Use NOSQUARE with a v series if you want v to enter the cross-product linearly,
computing Σ v_t Z_t Z_t′ rather than Σ v_t² Z_t Z_t′. The LAGS option is ignored if
you use NOSQUARE. This can only be used with a single “v” series.
Examples
mcov(lwindow=bartlett,lags=lags) startl+1 endl resids
# constant
computes an estimate of the spectral density of RESIDS at frequency zero, using
“LAGS” lags. The computed matrix (which here will be 1×1) will be in %CMOM.
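For a single series regressed on CONSTANT, this calculation amounts to the usual lag-window estimate γ₀ + 2 Σ w_j γ_j with Bartlett weights w_j = 1 − j/(L+1). A plain-Python sketch of that arithmetic (hypothetical series, one correlated lag; this is an illustration of the formula, not the MCOV code path):

```python
# Lag-window (Bartlett) estimate of the spectral density at frequency
# zero: gamma0 + 2 * sum_j w_j * gamma_j, with w_j = 1 - j/(L+1).
u = [1.0, 2.0, 3.0, 4.0]            # hypothetical residual series
T = len(u)
mean = sum(u) / T
d = [v - mean for v in u]

def gamma(j):
    # sample autocovariance at lag j
    return sum(d[t] * d[t - j] for t in range(j, T)) / T

L = 1                                # number of correlated lags
lrv = gamma(0) + 2.0 * sum((1.0 - j / (L + 1)) * gamma(j)
                           for j in range(1, L + 1))
print(lrv)  # -> 1.5625 for this series
```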
mcov(opgstat=sclm) / u
# %reglist() u{1}
cdf(title="LM Test for Serial Correlation") chisqr sclm 1
This computes an lm test using the OPGSTAT option as described above.
Variables Defined
In addition to %CMOM, MCOV defines %NCMOM (dimensions of the matrix) and %NOBS
(number of observations).
Parameters
list of arrays This is the list of arrays you want to view or edit. These can be
any combination of real-valued RECTANGULAR, SYMMETRIC, or
VECTOR arrays. You must declare and dimension these before
using them with MEDIT. You should also initialize the entries of
the array(s) to some valid real number (such as 0.0).
MEDIT currently cannot be used with other types of arrays (that
is, arrays of integers, arrays of labels, etc.).
Options
[edit]/noedit
With EDIT, the user can modify the values in the array(s). With NOEDIT, the user
can view, but not change, the values in the array.
modal/[nomodal]
With MODAL, an MEDIT editing window functions like a modal dialog box. Execu-
tion of any pending instructions is suspended until the user closes the editing
window(s) opened by MEDIT. Also, rats is essentially “locked” into local mode, so
no new instructions can be executed while the editing window is open. The user
can, however, switch to other windows or select menu operations. This setting
is very useful in interactive procedures where you need the user to set the array
before continuing with the rest of the program.
With NOMODAL, rats continues executing any pending instructions after open-
ing the editing window(s). The user can also enter new instructions in the input
window without first closing the editing window(s).
select=VECTOR[INTEGER] of selections
stype=[rows]/one/byrow/bycol
If you use the SELECT option, MEDIT will display the data, and the user can select
items from the matrix. When the window is closed, the VECTOR[INTEGER] that
you provide on the option will be filled with information regarding the selection,
as described below.
You use STYPE to control how selections are made. With the default of
STYPE=ROWS, the user can select one or more rows. The array provided using
the SELECT option will have dimension equal to the number of selections made,
and will list the rows selected using numbers 1,...,n. If STYPE=ONE, it will have
dimension 2 with array(1)=row and array(2)=column. If STYPE=BYROW, the
user selects one cell per row. The SELECT array will have dimension equal to
the number of rows, with array(i) set equal to the column selected in row i.
STYPE=BYCOL is similar, but one cell per column is selected.
Note that this type of selection is probably done more easily now using the DBOX
instruction with the MATRIX, MSELECT and STYPE options.
Description
MEDIT can be particularly useful in writing interactive procedures which require the
user to enter data into arrays. It provides a more useful and intuitive way to enter
data than the alternative methods using INPUT, READ or ENTER.
Note: MEDIT currently will not work in batch mode, and thus will not work in batch-
mode only versions of rats.
Examples
This simple example uses MEDIT as a way to type values into a matrix:
declare rectangular r(10,10)
medit r
The code below estimates an arima model for various combinations of ar and ma
lags, and computes Akaike and Schwarz criterion values for each model. The result-
ing criteria values are then displayed in an MEDIT window. (This can also be done
with REPORT).
do mas=0,3
do ars=0,3
boxjenk(constant,diffs=1,ma=mas,ar=ars,maxl, $
noprint) logyen 1973:5 1994:12
@regcrits(noprint)
compute aic(ars+1,mas+1)=%aic
compute sic(ars+1,mas+1)=%sbc
end do ars
end do mas
medit(hlabels=||"0","1","2","3"||,vlabels=||"0","1","2","3"||, $
picture="*.###", noedit) aic sic
Parameters
Menu string rats displays the menu string as a title in the dialog box. Use
this for a description of what the user is selecting.
Choice string This is the string which identifies the choice.
Description
A menu block consists of:
1. a MENU instruction
2. two or more CHOICE instructions, each followed by the instruction or block of
instructions to be executed if the user makes that choice.
3. an END MENU to terminate.
You can use either literal strings ("...") or STRING variables for the
menu description and choice identifier strings.
Example
menu "Which Transformation Do You Want?"
choice "Log"
set series = log(series)
choice "Square Root"
set series = sqrt(series)
choice "Percent Change"
set(scratch) series = log(series/series{1})
choice "None"
;
end menu
Parameters
"messagestring" This is the string that will be displayed in the dialog box. You
can supply this as a string enclosed in quotes (' or "), or as a
STRING or LABEL type variable.
For a long message, rats will automatically wrap lines to keep
the width reasonable. If you want to control breaks yourself,
insert the characters \\ where you want a line break.
Options
style=[alert]/okcancel/yesno/yncancel
The message box displays the message supplied by messagestring, along with
one, two, or three “buttons.” For all choices except ALERT, you will want to use
the STATUS option to determine which button the user selected. The buttons
shown and the returned status are shown in the table below.
default=default choice for STYLE=YESNO [1=yes]
This controls which button will be the default choice (the button that will be ex-
ecuted if the user just hits <Enter>). Use DEFAULT=0 if you want the “No” button
to be the default.
status=status code
status code is an INTEGER variable which will be set to the return values
shown in the table.
Example
messagebox(style=yesno,status=yn) "View the residuals?"
if yn==1
{
graph(header="Regression Residuals") 1
# %resids
}
This displays the following dialog:
If the user clicks on the “Yes” button, rats will execute the GRAPH instruction.
The code below uses a MESSAGEBOX to ask the user whether the current degrees of freedom value
(stored in DGF) is correct. If the user responds “Yes”, the rest of the code is skipped.
If the user answers “No”, a QUERY instruction prompts the user to input a new value.
The VERIFY option requires the user to supply a valid (greater than –1) value.
if %BFlag == 1 {
messagebox(sty=yesno,status=Stop) "Calculated "+%string(dgf)+ $
" degrees of freedom. Is this correct?"
* If “No”, use QUERY to have user input a value for DGF:
if Stop == 0 {
query(prompt="Input correct degrees of freedom", $
verify=(dgf>-1),error="Value must be larger than -1") dgf
}
}
Parameters
oldequation Source equation for the modification.
newequation (output) The target for the modified equation. By default, this is
the same as oldequation.
Option
print/[noprint]
Use PRINT to display the form of oldequation. Note that DISPLAY is a simpler
way to do that.
Example
linreg(inst,define=supply) price
# constant quant sshift1 sshift2
modify supply
vreplace price by quant swap
frml(equation=supply) supplyeq
The LINREG instruction does an instrumental variables estimation with PRICE as the
left-side variable, and saves the equation under the name SUPPLY. MODIFY and
VREPLACE replace PRICE with QUANT as the left-side variable. The FRML instruction
converts the equation to a formula (a FRML variable).
Note
Once you do a MODIFY, all subsequent VREPLACE and VADD instructions will apply to
oldequation, until you do another MODIFY. If you do a series of VREPLACE or VADD
operations to a single EQUATION, do only one MODIFY at the start of the sequence.
See Also . . .
EQUATION Sets up linear equations.
VREPLACE Replace variables in an equation with other variables. It can
also renormalize equations.
VADD Adds additional variables to an equation.
Wizard
You can use the Statistics—Regression Tests wizard to do restriction tests, though it
will produce a RESTRICT instruction rather than an MRESTRICT.
Parameters
See “Description” below for details on the Rmatrix and rvec parameters.
restrictions Number of restrictions imposed.
Rmatrix A RECTANGULAR with dimensions restrictions x number of
coefficients. This matrix must be set before doing MRESTRICT.
rvec A VECTOR with dimension equal to restrictions. You can
omit this if rvec is the zero vector. Use a * to skip rvec if you
want to use either of the next two parameters.
Options
The options are the same as for RESTRICT, so we only include a brief description.
create/[nocreate]
With CREATE, MRESTRICT does a restricted regression (as opposed to just a test
of the restrictions).
Example
This does the polynomial distributed lag from Section 1.1 in the Additional Topics
pdf using MRESTRICT. The restriction is that the 4th difference of the distributed
lag coefficients is zero. It estimates the unconstrained distributed lag, then applies
the 21 restrictions. The 1 and 2 on FMATRIX keep CONSTANT (regressor 1) out of the
restriction by starting the difference operator at the 1,2 element of R.
linreg longrate 1951:1 2006:4
# constant shortrate{0 to 24}
declare rect r(21,26)
fmatrix(diffs=4) r 1 2
mrestrict(create) 21 r
Variables Defined
%CDSTAT the computed test statistic (REAL)
%SIGNIF the marginal significance level (REAL)
%NDFTEST (numerator) degrees of freedom for the test (INTEGER)
%RESIDS SERIES of residuals (if CREATE or REPLACE)
%BETA VECTOR of coefficients (if CREATE or REPLACE)
Wizard
You can use the Data/Graphics—Moving Window Statistics wizard to generate sta-
tistics and fractiles for moving windows of data.
Parameters
series Series to transform.
start end Range of entries to transform. Unless you choose one of the
EXTEND options, start and end need to allow for the span of
data required by the window. For instance, a centered window
of width 11 requires that start and end be at least 5 entries
in from the beginning and end of series. If you have not set a
SMPL, these default to the maximum usable range of series.
newseries Resulting series, used only when using the FRACTILE option to
compute a single fractile. Use the RESULTS option when com-
puting more than one fractile.
newstart The starting entry for newseries. By default,
newstart=start.
Options
width=Width of the moving window [5]
span=same as WIDTH (used in older versions of rats)
centered/[nocentered]
The WIDTH is the number of entries considered at any one time. If you choose
a CENTERED window, this should be odd. If it’s even, MVSTATS will make it odd
by adding one. If you choose CENTERED, the results for an entry use a window
centered at that entry. If not (the default), the window consists of the entry and
preceding ones.
Examples
mvstats(variances=volatile,width=20) price
sets the series VOLATILE at each period equal to the sample variance of the twenty
entries ending at that period. VOLATILE is defined beginning at entry 20.
mvstats(max=mhigh,min=mlow,width=13,center,extend=rescale) price
sets the series MHIGH to the highest value achieved by PRICE in the thirteen periods
centered on each entry. Values for entries near each end of the data are computed
with increasingly asymmetric windows. For instance, the final entry will have the
maximum and minimum of the last seven entries.
mvstats(fractiles=||.1,.25,.75,.9||,results=pricefract) price
computes four different fractiles of PRICE, storing the results into a VECTOR of
SERIES called PRICEFRACT.
mvstats(iqrange=iqr,centered,width=21) x
sets IQR to a moving interquartile range of series X, using current and ten entries on
either side.
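The centered/trailing distinction can be sketched directly. The plain-Python illustration below computes a trailing and a centered moving mean of width 3 (hypothetical data; no EXTEND handling, so entries near the ends that lack a full window are left undefined):

```python
x = [1.0, 2.0, 3.0, 4.0, 5.0]
w = 3                                # window width (odd, for centered)
h = w // 2

# trailing window: entry t uses x[t-w+1] .. x[t]
trailing = [None] * len(x)
for t in range(w - 1, len(x)):
    trailing[t] = sum(x[t - w + 1:t + 1]) / w

# centered window: entry t uses x[t-h] .. x[t+h]
centered = [None] * len(x)
for t in range(h, len(x) - h):
    centered[t] = sum(x[t - h:t + h + 1]) / w

print(trailing)  # [None, None, 2.0, 3.0, 4.0]
print(centered)  # [None, 2.0, 3.0, 4.0, None]
```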
See Also . . .
STATISTICS With the FRACTILES option, computes a fixed list of fractiles.
EXTREMUM Finds the maximum and minimum values of a series.
%MINVALUE(x) Returns the minimum value of an array (or series).
%MAXVALUE(x) Returns the maximum value of an array (or series).
%FRACTILES(x,f) Computes a collection of fractiles for an array (or series).
Parameters
newtype The identifier that you want to use as shorthand for a more
complicated data type. Note that, unlike standard RATS data
types, you need to use the entire name, not just the first three
letters.
datatype The data type that newtype can now be used to represent.
Example
new GARCHSwitch function[vector](integer)
declare GARCHSwitch msgarchfnc
Description
As noted above, the NEXT instruction immediately ends the current pass through a
loop. Execution continues with the start of the next pass through the loop:
• the index update for DO and DOFOR
• the condition check for WHILE and UNTIL
• the top of the loop for LOOP.
Example
dofor i = result1 result2 result3 result4
statistics(noprint) i
if %cdstat < 1.96
next
(remaining analysis executed only when the condition is false)
end dofor
This performs a t-test on four series. If a series has mean insignificantly different
from zero (at 5%), we skip the rest of the analysis for that series and go on to the
next.
See Also . . .
UG, Section 15.1 The rats compiler
BREAK Breaks control out of a loop.
BRANCH Branches to a labeled instruction.
(1) y_t = f(X_t, β) + u_t
Before you can use it, you must:
• Set the list of free parameters using NONLIN
• Create the explanatory formula f using FRML
• Set initial values for the parameters (usually with COMPUTE or INPUT)
Parameters
depvar the dependent variable. You can use an asterisk (*) for this
parameter if the model is f(X_t, β) = u_t so there is no “dependent
variable.” If you use *, the regression output will exclude R² and
similar summary statistics.
start end the estimation range. If you have not set a SMPL, this defaults
to the maximum range over which the residuals can be com-
puted, and the instruments are defined (if you are doing instru-
mental variables).
residuals (Optional) Series for the residuals at the final estimates.
General Options
frml=formula name (This option is required)
The name of the formula for the function f.
[print]/noprint
vcv/[novcv]
title=”title for output” [“Nonlinear Least Squares” or “Nonlinear
Instrumental Variables”]
These are the same as for other regressions.
method=[gauss]/simplex/genetic/annealing/ga/evaluate
METHOD sets the estimation method to be used—see Chapter 4 of the User’s Guide
for descriptions. The default (GAUSS) is Gauss–Newton, which requires that the
formula be twice continuously differentiable and is the only choice that can com-
pute standard errors. The others make weaker assumptions, and can compute
point estimates only—they are more often used as a PMETHOD to refine initial
guesses before using Gauss–Newton to finish the estimation.
EVALUATE simply evaluates the formula given the initial parameter values and
produces no output.
pmethod=gauss/[simplex]/genetic/annealing/ga
piters=number of PMETHOD iterations to perform [none]
Use PMETHOD and PITERS if you want to do preliminary iterations using one
method, and then switch to another method for final estimates. For example, to
do 10 simplex iterations before switching to Gauss-Newton, you can use the op-
tions PMETHOD=SIMPLEX, PITERS=10, and METHOD=GAUSS.
[zudep]/nozudep
lags=correlated lags [0]
lwindow=newey/bartlett/damped/parzen/quadratic/[flat]/panel/white
damp=value of g for lwindow=damped [0.0]
lwform=VECTOR with the lag window form [not used]
cluster=SERIES with category values for clustered calculation
Use these to select the type of correlation assumed for the Z'u process. Note that
the default for the ZUDEP (ZU Dependence) option is different from the NLSYSTEM
instruction. For more information, see “Long-Run Variance/Robust Covariance
Calculations” on page RM–540.
center/[nocenter]
CENTER adjusts the formula for the weight matrix to subtract off the (sample)
means of Z ¢u, which may be non-zero for an overidentified model. For more infor-
mation, see "ZUMEAN and CENTER options" on page RM–541.
jrobust=statistic/[distribution]
You can use this option to adjust the J-statistic specification test when the
weighting matrix used is not the optimal one. See page UG–140 in the User’s Guide
for more information.
Description
By default, NLLS estimates using the Gauss–Newton algorithm with numerical par-
tial derivatives. See User’s Guide Sections 4.2 and 4.8 for more on this.
Hypothesis tests
You can test hypotheses on the coefficients with TEST, RESTRICT and MRESTRICT
and you can use the Statistics—Regression Tests wizard to set those up. rats num-
bers the coefficients based on their positions in the list set up by NONLIN. You may
not use EXCLUDE after NLLS and can use SUMMARIZE only to expand an expression.
The hypothesis tests are based upon a quadratic approximation to the sum of squares
surface at the final estimates. This is a “Wald” test.
Missing Values
rats drops any entries which it can’t compute because of missing values. If you have
a recursive formula (that is, the value for entry T relies upon some quantity
computed at T-1), then you have to be careful about the choice of start and end. You can’t
let rats determine the default range because it will simply start at entry one (which
generally isn’t computable), and will never find, by itself, the correct place to begin.
Examples
nonlin lgamma delta nu rho
frml ces = lgamma-nu/rho* $
log( delta * k^(-rho) + (1.-delta) * l^(-rho) )
compute lgamma=1.0, delta=0.4, nu=0.8, rho=.6
nlls(frml=ces,trace) q
nonlin a b g
linreg realcons
# constant realdpi
compute a=%beta(1),b=%beta(2),g=1
frml cfrml realcons = a+b*realdpi^g
instruments constant realcons{1} realdpi{1 2}
nlls(frml=cfrml,inst) realcons 1950:3 *
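Gauss–Newton, the default NLLS method, linearizes f around the current β, regresses the residuals on the partial derivatives, and updates. A compact pure-Python sketch of one such loop for the hypothetical model y = b1·x^b2 (analytic derivatives, exact data, so it should settle at b1 = 2, b2 = 1.5; this illustrates the algorithm, not the NLLS implementation):

```python
import math

x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0 * v ** 1.5 for v in x]      # exact data with b1=2, b2=1.5

b1, b2 = 1.8, 1.4                    # initial guesses
for _ in range(50):
    # residuals and Jacobian of f(x,b) = b1 * x**b2
    r  = [yi - b1 * xi ** b2 for xi, yi in zip(x, y)]
    j1 = [xi ** b2 for xi in x]                      # df/db1
    j2 = [b1 * xi ** b2 * math.log(xi) for xi in x]  # df/db2
    # solve the 2x2 normal equations (J'J) d = J'r for the step d
    a11 = sum(v * v for v in j1)
    a12 = sum(p * q for p, q in zip(j1, j2))
    a22 = sum(v * v for v in j2)
    g1  = sum(p * q for p, q in zip(j1, r))
    g2  = sum(p * q for p, q in zip(j2, r))
    det = a11 * a22 - a12 * a12
    b1 += (a22 * g1 - a12 * g2) / det
    b2 += (a11 * g2 - a12 * g1) / det
print(b1, b2)  # -> close to 2.0 and 1.5
```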
Variables Defined
NLLS defines the following standard estimation variables (see LINREG)
%BETA, %DURBIN, %MEAN, %NDF, %NFREE, %NOBS, %NREG, %RESIDS, %RSS,
%SEESQ, %RHO, %STDERRS, %TSTATS, %VARIANCE, %XX
for least squares only
%LOGL, %RBARSQ, %RSQUARED, %TRSQ, %TRSQUARED
for instrument variables only
%JSIGNIF, %JSTAT, %NDFJ, %UZWZU, %WMATRIX
Description
The instructions BOXJENK and ITERATE (ARMA estimation), NLLS and NLSYSTEM
(Non-Linear Least Squares and systems estimation), DDV and LDV (discrete and lim-
ited dependent variables), ESMOOTH (exponential smoothing), MAXIMIZE, FIND, DLM
(dynamic linear models) and CVMODEL all involve nonlinear optimization.
Because of this, estimation is an iterative process, and certain models may prove
difficult to fit. Most of the time, you can get your model estimated by altering the
method used, the initial guess values, and the number of iterations. The options to
control those are provided on each estimation instruction.
However, some models can be difficult or impossible to estimate without further ad-
justment to the estimation process. The options on NLPAR provide control over these
settings.
In general, there are five main methods of optimization used by RATS: the hill-climb-
ing methods (Section 4.2 of the User’s Guide), and the simplex, genetic, simulated an-
nealing and genetic annealing (Section 4.3). Of these, the ones most likely to benefit
from “tuning” are the last three, which are all designed to deal with surfaces with
possibly multiple peaks. If you do have multiple peaks, it requires quite a bit more
effort (bigger populations, slower speed of convergence) to allow all of those to be ex-
plored properly. The genetic algorithm has four controls and the annealing algorithm
does as well (and genetic annealing uses both sets).
Options
The following is a general description of the NLPAR options. Please see Chapter 4 in
the User’s Guide for technical details on the parameters controlled via NLPAR, as well
as general information on the algorithms and methods used.
criterion=[coeffs]/value
This chooses whether rats determines convergence by looking at the change
from one iteration to the next in the coefficients, or the change in the value of the
function being optimized. Convergence of coefficients is the default. Convergence
on value should be chosen only if the coefficients themselves are unimportant.
exactlinesearch/[noexactlinesearch]
In the “climbing” methods (Section 4.2 of the User’s Guide), where a direction is
chosen and a search made along that direction, this option controls whether the
estimation routine attempts to find the “exact” optimum in that direction, or only
looks for a new set of points which meet certain criteria. This choice rarely affects
whether you will get convergence—it merely alters how quickly you get there. In
general, EXACTLINESEARCH is slower for smaller models, as the extra function
evaluations provide little improvement for the effort. It can speed up models with
many free parameters, where the “cost” of each new iteration, particularly the
need to compute the gradient, is higher.
derive=[first]/second/fourth
This determines the method used for computing numerical derivatives. The
default is a right arc derivative. DERIVE=SECOND and DERIVE=FOURTH use more
accurate but slower formulas. In most cases, you won't need those.
mutate=[simple]/shrink/best/random
crossover=probability of taking mutated parameter [.5]
scalefactor=scale factor for mutations [.7]
populate=scale for population size [6]
These parameters control details of the GENETIC estimation method (User’s
Guide, Section 4.3)
marquardt/[nomarquardt]
For Gauss-Newton estimation, you can use MARQUARDT to select the Levenberg–
Marquardt subiteration algorithm (Marquardt, 1963). We generally don’t recom-
mend this, as we have obtained better results with the default method.
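The DERIVE choices trade accuracy for extra function evaluations. The difference between a one-sided (first) difference and a central (second-order) difference is easy to see on a smooth function; the plain-Python check below uses f(x) = x², whose true derivative at x = 1 is 2:

```python
def f(x):
    return x * x

h = 1e-3
forward = (f(1.0 + h) - f(1.0)) / h            # one-sided, error of order h
central = (f(1.0 + h) - f(1.0 - h)) / (2 * h)  # two-sided, error of order h**2

print(forward)  # about 2.001 (error of order h)
print(central)  # 2.0 up to rounding (central is exact for a quadratic)
```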
Parameters
start end Estimation range. If you have not set a SMPL, this defaults to
the common defined range of all the dependent variables of the
equations.
list of FRMLS The list of the formulas you want to estimate. A formula
can define the residuals u_t either as y_t = f(X_t, β) + u_t or as
f(X_t, β) = u_t. NLSYSTEM assumes the first if you define the
formula with a dependent variable and the second if you don’t.
Please note that unlike NLLS, you can only indicate the dependent
variable of a formula when you define the formula; there
is no way to do it on the NLSYSTEM instruction.
Description
NLSYSTEM uses one of two techniques:
• Multivariate non-linear least squares (non-linear SURE) when you estimate
without instruments.
• Generalized Method of Moments for instrumental variables estimation. Non-
linear three-stage least squares is a special case of this.
It does not do Maximum Likelihood estimation (except when these other techniques
are equivalent to ML).
Before you can use NLSYSTEM, you must:
• Set the list of free parameters using NONLIN
• Define the FRMLS
• Set initial values for the parameters (usually with COMPUTE or INPUT).
If you have a linear system, the instruction SUR may be a better alternative. It is
much faster at any type of model which both instructions can estimate.
Technical Information
(1) u_t = (u_1t, ..., u_nt)′ is the vector of residuals at time t (u depends upon β), and
(2) Σ = E(u_t u_t′)
Multivariate non-linear least squares chooses β to minimize Σ_t u_t′ Σ⁻¹ u_t. With
instruments, estimation is by Generalized Method of Moments, which works with
the sample orthogonality conditions
(5) G(β) = Σ_t u_t ⊗ Z_t
and minimizes the quadratic form
(6) G(β)′ SW G(β)
where SW is the weighting matrix for the orthogonality conditions. Some of the
options of NLSYSTEM let you set the form for the matrix SW in formula (6). By de-
fault, it is just Σ⁻¹ ⊗ (Z′Z)⁻¹, where Z is the T × r matrix of instruments for the entire
sample. With this, NLSYSTEM does non-linear three stage least squares.
General Options
model=model to be estimated
frmlvect=FRML[VECTOR] to be estimated
As an alternative to listing the formulas to be estimated as parameters, you can
use the MODEL option to estimate an existing MODEL, or use FRMLVECT to provide
a single FRML which, at each entry, provides the residuals in a VECTOR form. If
you use MODEL, the MODEL must consist only of FRML’s.
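For example, a small system could be collected into a MODEL with GROUP and then
estimated in one step (the parameter, formula, and series names here are illustrative):

nonlin c0 c1 i0 i1
frml consfrml cons = c0+c1*income
frml invfrml invest = i0+i1*rate
group smallsys consfrml invfrml
compute c0=c1=i0=i1=0.0
nlsystem(model=smallsys)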
[print]/noprint
vcv/[novcv]
title="title for output" [depends upon options]
These are the standard output control options.
sigma/[nosigma]
This controls the printing of the final estimate of the residual covariance matrix.
robusterrors/[norobusterrors]
lags=correlated lags [0]
lwindow=neweywest/bartlett/damped/parzen/quadratic/[flat]/
panel/white
damp=value of g for lwindow=damped [0.0]
lwform=VECTOR with the window form [not used]
cluster=identifying SERIES for clustered std. errors [not used]
When you use these without the INSTRUMENTS option, they allow you to calculate
a consistent covariance matrix allowing for heteroscedasticity (with
ROBUSTERRORS), serial correlation (with ROBUSTERRORS and LAGS), or clustered
standard errors (with ROBUSTERRORS and CLUSTER). For more information, see
Sections 2.2, 2.3, 2.4, and 4.5 of the User's Guide and "Long-Run Variance/Robust
Covariance Calculations" on page RM–540.
Note that none of these options affect the parameter estimates. Just as with
LINREG and NLLS, these options come into play when the covariance matrix
of the estimates is computed. These behave differently when you are using
NLSYSTEM with the INSTRUMENTS option, as described below.
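For example, to correct the covariance matrix for heteroscedasticity and serial
correlation in a multivariate non-linear least squares estimation (the formula
names here are illustrative):

nlsystem(robusterrors,lags=4,lwindow=neweywest) / eq1frml eq2frml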
zudependent/[nozudep]
wmatrix=SYMMETRIC weighting matrix for instruments [(Z′Z)^(-1)]
iwmatrix=SYMMETRIC inverse weighting matrix
sw=SYMMETRIC grand weighting matrix [not used]
isw=SYMMETRIC inverse grand weighting matrix
swout=(output) SYMMETRIC grand weighting matrix [not used]
NOZUDEP (the default) is the special case for the SW matrix. We call it NOZUDEP
because the most important case is where u is (serially uncorrelated and) inde-
pendent of the instruments Z. More generally, this is Case (i) in Hansen (1982,
page 1043).
With NOZUDEP, you can use the WMATRIX or IWMATRIX option (whichever is more
convenient) to set the W part of Σ^(-1) ⊗ W and the CV option to set Σ. Other-
wise, NLSYSTEM estimates a new Σ after each iteration. Note that, if you use the
LAGS option, NLSYSTEM will automatically switch to the ZUDEP method of han-
dling the weight matrix.
With ZUDEP, you can use the SW or ISW option to set the full SW array. This is an
nr × nr SYMMETRIC array. Otherwise, NLSYSTEM determines a new SW matrix after
each iteration by taking the inverse of
(7) (1/T) ∑_t (u_t ⊗ Z_t)(u_t ⊗ Z_t)′
(or the generalization of this if you use the LAGS option). The SWOUT option allows
you to save the estimated SW matrix into the specified array.
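For example, you could save the weight matrix from one estimation and supply it
as a fixed weighting matrix in a second one (the formula names here are illustrative):

nlsystem(inst,zudep,swout=swhat) / frml1 frml2
nlsystem(inst,zudep,sw=swhat,robusterrors) / frml1 frml2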
update=none/once/continuous [default depends on other options]
This controls the updating of the weighting matrix. The default is normally
UPDATE=CONTINUOUS, which recalculates the weight matrix at each iteration,
except in the following cases:
robusterrors/[norobusterrors]
If you use ROBUSTERRORS combined with an input CV or SW matrix, NLSYSTEM
will compute the coefficients using the “suboptimal” weighting matrix and then
correct the covariance matrix of the coefficients based upon the choices for the
LAGS, LWINDOW and other options immediately above.
jrobust=statistic/[distribution]
You can use this option to adjust the J-statistic specification test when the
weighting matrix used is not the optimal one. See page UG–140 in the User’s Guide
for more information.
center/[nocenter]
CENTER adjusts the weight matrix calculation to subtract off the (sample) means
of u ⊗ Z, which may be non-zero for an overidentified model. For more informa-
tion, see "ZUMEAN and CENTER options" on page RM–541.
Variables Defined
The matrices %XX, %BETA, %STDERRS and %TSTATS are all defined for the full set
of coefficients. %NREG is the number of regressors in the full system, %NFREE is
the number of free parameters (including the estimated covariance matrix),
%NOBS is the number of observations, and %NVAR is the number of equations. Other
variables defined are:
%LOGDET log determinant of the estimate of Σ (REAL)
%SIGMA final covariance matrix of the residuals (SYMMETRIC)
%LAGRANGE VECTOR of Lagrange multipliers if estimating with constraints.
For multivariate least squares only
%LOGL (normal) log likelihood (REAL)
For instrumental variables/GMM only
%JSTAT Test statistic for overidentification for instrumental variables
(REAL). If you don’t use ROBUSTERRORS, NLSYSTEM will assume
the weight matrix is “optimal” and use the value of (6) as the J-
statistic. With ROBUSTERRORS, the formula in Lemma 4.1 from
Hansen (1982) is used.
%JSIGNIF Significance of %JSTAT (REAL)
%NDFJ Degrees of freedom of %JSTAT (INTEGER)
%UZWZU criterion function for instrumental variables (REAL)
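For example, after a GMM estimation with instruments, you can report the
overidentification test directly (the formula names here are illustrative):

nlsystem(inst) / frml1 frml2
display "J-statistic" %jstat "with" %ndfj "d.f., signif" %jsignif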
Missing Values
rats drops any entries which it cannot compute because of missing values.
Output
NLSYSTEM prints a summary of information on the fit for each equation. The param-
eter estimates are listed in a single table.
Examples
This estimates by gmm a model for the behavior of interest rates which is based
upon moment conditions for the mean and variance of the residuals.
nonlin(parmset=structural) c0 c1 c2 i0 i1 i2 i3 r0 r1 r2 r3 r4
nonlin(parmset=ar) rho
compute c0=c1=c2=0.0
compute i0=i1=i2=i3=0.0
compute r0=r1=r2=r3=r4=0.0
compute rho=0.0
nlsystem(inst,parmset=structural,lags=2,lwindow=newey,$
iters=400) 1950:1 1985:4 investnl consnl ratenl
Parameters
start end Range of entries of the output series to use.
Supplementary cards
The first supplementary card supplies the list of input series, while the second card
supplies the list of output series. The input series (first supplementary card) are
analogous to explanatory or independent variables in a regression, while the output
series (second card) are analogous to dependent variable(s).
The first card supports regression format, which means that you can include lags
or leads on the input list. The output list, however, must consist only of one or more
series names (no lags or leads).
Options
save=memory vector (required)
restart/[norestart]
The SAVE option saves the estimated weights of the neural network model, as well
as general information about the model (number of inputs, number of outputs,
etc.) in a VECTOR of REALS. The memory vector can be used in subsequent NNLEARN
commands for additional training as described below, or with the NNTEST instruc-
tion to generate fitted values.
If you re-use an existing memory vector, NNLEARN will, by default, use the
values in the vector as the starting point for further training. This allows you to
do further training of an existing network using additional NNLEARN commands.
Use the RESTART option if you want to re-use the same vector name, but want
NNLEARN to start the estimation from a new set of randomly generated initial
values. In either case, after the estimation is completed, the new weights and
information are saved into memory vector, replacing the earlier values.
direct/[nodirect]
If DIRECT, the model will include direct links between the input nodes and the
output nodes. If NODIRECT, the only connection will be through hidden nodes.
pad=fraction to pad [ 0 ]
The values of the network outputs run from 0 to 1 or –1 to 1 (depending on the
SQUASH choice). By default, the outputs are scaled so that this range maps to the
smallest and largest values in the training sample output series. If the model
is ever used with samples that should produce larger or smaller output values
than were present in the training sample, the outputs produced by NNTEST will
be artificially truncated. You can avoid this by using the PAD option to provide a
value between 0 and 1 which indicates the fraction of “padding” to include when
rescaling the output variables.
If, for instance, you choose PAD=.2, the smallest output value in the training
sample will be mapped to .1 while the largest will be mapped to .9. If the original
range of the data were from 7.2 to 8, this would allow the network to produce
forecasts up to 8.1 and down to 7.1. See page UG–444 of the User’s Guide.
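For example, this allows the network outputs to extend 20% beyond the range seen
in the training sample (the series names here are illustrative):

nnlearn(save=weights,pad=.2,hidden=3)
# x1 x2
# y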
mode=[epoch]/example
This controls how often new weights are computed. With EPOCH, NNLEARN does
a forward and backward pass through the network for all observations in the
sample range before recomputing the weights. With EXAMPLE, weights are re-
computed after (a forward and backward pass through) each observation in the
sample.
squash=[logistic]/ht1/ht2
Selects the sigmoidal filter to be used for “squashing” the node outputs:
If you use the CVCRIT option, NNLEARN will train the model until the mean
square error (the mean of the squared error between the output series and the
current output values of the network) is less than the CVCRIT value.
If you use the RSQUARED option, NNLEARN will train the model until the
mean square error is less than (1−R²)s², where R² is the value specified in the
RSQUARED option, and s² is the smallest of the output series variances.
The default setting is CVCRIT=.00001. If you specify both options, NNLEARN will
use the CVCRIT setting.
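For example, to train until the network explains 99% of the variance of the output
series, with progress reporting (the series names here are illustrative):

nnlearn(save=weights,rsquared=.99,trace)
# x1 x2
# y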
trace/[notrace]
If you turn on the TRACE option, rats will periodically display the number of
iterations (epochs) evaluated and the current value of the convergence criterion.
We recommend that you always use TRACE, particularly when developing new
models.
Description
The NNLEARN instruction fits a neural net model based on the relationship between a
set of input series and a set of output series. If the input series are X1, X2, ..., Xn and
the output series are Y1, Y2, ..., Ym, this fits:
Examples
As a simple demonstration, we’ll fit a neural network model for the XOR (exclusive
OR) function. The XOR function takes two binary values (1 or 0, true or false) as
input, and returns a true value when either (but not both) of the inputs are true, and
returns a false value otherwise.
all 4
data(unit=input,org=obs) / x1 x2 xor_actual
0 0 0
0 1 1
1 0 1
1 1 0
nnlearn(save=mem,rsquared=.9999,hidden=2)
# x1 x2
# xor_actual
nntest / mem
# x1 x2
# xor_output
The first 12 elements of the vector contain basic information about the network
(number of input nodes, number of hidden nodes, and so on). The remaining elements
contain the computed weights of the neural network, as described below:
memory(13,...,P) = aij, for i=1,...,H; j=0,...,I (i.e. a10, a11,a12,...,a20, a21, a22, etc.)
For j=0, aij is the bias weight on hidden node i, otherwise aij is the weight
on hidden node i from input node j.
memory(Q+1,...,R) = dij, for i=1,...,O; j=1,...,I (only when using the DIRECT option)
If you use DIRECT, the memory vector will also contain the weights on the
direct connections—dij is the weight on the connection from input node j on
output node i.
memory(S+1,...,S+2×O) = Pairs of values for each output node, containing the
lower and upper bounds when using YMAX or YMIN, or the lower and
upper scale factors when using PAD.
Parameters
start end The range of entries for which you want to generate output.
memoryvector (Required) A memory vector containing the neural net model
weights (set by the SAVE option on NNLEARN).
Options
smpl=SMPL series or formula ("SMPL Option" on page RM–546)
If you use the SMPL option, NNTEST will only compute output for entries where
the SMPL series or formula has a non-zero value. No output will be calculated for
entries where it's zero or NA.
validate/[novalidate]
If you use the VALIDATE option, NNTEST compares the output from the network
with the actual data in the output series. The mean square error is computed
and saved in %FUNCVAL. You can use this for automated validation of a part of
the sample. If you use this, the values of output series won’t be affected.
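For example, to check the fit of a trained network over a hold-out sample (the
series names and range here are illustrative):

nntest(validate) 2001:1 2010:12 weights
# x1 x2
# yactual
display "Hold-out MSE =" %funcval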
Description
Using the neural net model specified by the memory vector parameter, NNTEST
takes the supplied input series and computes the output. If you use VALIDATE, it will
compare these with the data in the series listed on the supplementary card. If you
don’t (by default), it will store the results in the series listed on the supplementary
card.
Note that the number of input and output series must match those used on the
NNLEARN instruction that estimated the model.
Example
nntest / nnmodel
# x1 x2 x3
# ypreds
Wizard
You can define a list of nonlinear parameters when using the Statistics—Equation/
FRML Definition wizard to define a formula.
Parameters
parameterfields A parameterfield is one of the following:
• a simple REAL variable, such as B1
• a real array (VECTOR, RECTANGULAR, SYMMETRIC,
PACKED)
• an array of arrays, such as a VECTOR of VECTORs.
• a substitution operation: B3=B1*B2
• an equality constraint: B3==B1*B2
• an inequality constraint: B3>=0.0. You cannot use a strict
inequality here. For instance, B3>0.0 is illegal.
See the notes below on the use of these.
Options
parmset=PARMSET to define [default internal]
Using the PARMSET option, you can define or redefine a parameter set. By using
the PARMSET option on your estimation instruction, you can switch easily from
one parameter set to another. rats maintains a single unnamed parameter set,
which is the one used for estimation if you don’t provide a named set.
zeros/[nozeros]
If you use the ZEROS option, the elements of the PARMSET are initialized to zero.
By default, whatever values they previously had are retained—a new variable
name will be initialized to NA. If you use the PARMSET without initializing some
of the values, you will get a warning message like:
## NL6. NONLIN Parameter C1 Has Not Been Initialized. Trying 0
Thus, zero is (in effect) the default guess value for any parameter—the warning
is there in case you run into numerical problems related to not having proper
guess values. If zeroes are a reasonable set of guesses (for instance, linear func-
tions aren't sensitive to guess values), then the ZEROS option will allow you to use
the PARMSET without explicit guess values and without getting the warnings.
add/[noadd]
This allows you to add parameters or constraints to a parameter set without
reentering the full set. This is largely obsolete because you can now separate the
parameter set into different parts and “add” them using the + operator when you
need to use them for estimation. That is, MAXIMIZE(PARMSET=MODELPARMS+GAR
CH) will combine the MODELPARMS and GARCH parameter sets to form the work-
ing parameter set.
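For example (the parameter set, parameter, and formula names here are
illustrative):

nonlin(parmset=modelparms) b0 b1
nonlin(parmset=garch) vc va vb
maximize(parmset=modelparms+garch) loglfrml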
Notes
If you use an array, it must be DECLAREd before the NONLIN. It does not need to be
DIMENSIONed until you are ready to use it. You can include inequality constraints on
the individual elements of an array, or on all of them at once. For instance,
declare vector b
nonlin b b>=0.0
constrains all elements of B to be non-negative.
Use a substitution constraint rather than an equality constraint wherever possible.
rats handles B3=B1*B2 by setting B3 equal to B1*B2 every time B1 or B2 changes,
thus eliminating one free parameter. With B3==B1*B2, rats estimates the three
parameters separately, and uses Lagrange multiplier methods to push the estimates
towards the constraint. This is a much slower process. The equality constraint should
be used only when you can’t easily solve out for one parameter in terms of the others.
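For example, one way to write the two forms of the same restriction (parameter
names illustrative): the first eliminates B3 as a free parameter, while the second
estimates all three and pushes them towards the constraint:

nonlin b1 b2 b3=b1*b2
nonlin b1 b2 b3 b3==b1*b2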
The FRML instruction can be used to create a formula and matching PARMSET from a
linear equation or regression. This can be very handy when there is a linear model
for the mean whose form you don’t want to fix in advance.
Before you can use any of the estimation instructions, you should set initial values
for the parameters. COMPUTE and INPUT are the two simplest ways to accomplish
this—the two examples below show two ways of setting the same set of initial values:
compute b1=b2=b3=0.0,b4=1.0
input b1 b2 b3 b4
0 0 0 1
You can use the DISPLAY instruction to display a list of the variables in a parameter
set, along with the current values of those variables. This only works for “named” pa-
rameter sets—you cannot display the contents of the (unnamed) default internal set.
Examples
nonlin alpha lambda sig_eta sig_eps
defines the default parameter set as having the four named parameters.
nonlin(parmset=lrrest) lrf(1)(2,3)==0.0
This defines a PARMSET called LRREST. This imposes an equality constraint (note the
==). This is used because LRF(1)(2,3) is a matrix calculation involving other param-
eters.
Wizard
The Statistics—Nonparametric Regression wizard provides access to the major fea-
tures of NPREG.
Parameters
Y series The dependent variable.
X series The explanatory variable.
start end Range to use in estimating the regression. If you have not set a
SMPL, this defaults to the maximum range over which both the
Y series and X series are defined.
grid Series of X values at which the fit is computed.
fit Series of fitted values corresponding to the grid series.
Options
method=[nadaraya]/lowess/nn
METHOD=NADARAYA does the Nadaraya-Watson kernel estimator,
METHOD=LOWESS does lowess, and METHOD=NN does nearest neighbor smooth-
ing. See the "Technical Information" for descriptions.
grid=[automatic]/input
maxgrid=maximum number of grid points for grid=automatic [100]
GRID=AUTOMATIC has NPREG generate the grid points for the fit. These range
from the lowest to the highest values attained by the actual X series, with the
number of points being given by the MAXGRID option. To control the points your-
self, use GRID=INPUT, in which case the grid series should be filled in advance
with your settings. Usually, an equally spaced grid is handy if you're mainly in-
terested in examining the shape of the f function. If the NPREG is part of a more
complex calculation, the grid series will usually be the X series itself.
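For example, to evaluate the fit on your own equally spaced grid (the series names
here are illustrative):

set xgrid 1 50 = 0.1*t
npreg(method=nadaraya,grid=input) y x / xgrid fit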
type=[epanechnikov]/triangular/gaussian/logistic/flat/parzen
bandwidth=kernel bandwidth [see below]
TYPE selects the kernel type for the Nadaraya-Watson estimator. BANDWIDTH
specifies the bandwidth for the kernel. The default value is
(0.79 × IQR) × N^(-1/5)
Technical Information
The Nadaraya-Watson estimator (METHOD=NADARAYA) is:
f̂(x) = ∑_i K((x − x_i)/h) y_i / ∑_i K((x − x_i)/h)
where K is the kernel function and h is the bandwidth. The kernels have the forms:
(8/3)(1 − |v|)³ if 0.5 ≤ |v| ≤ 1, and 0 otherwise
As you increase the bandwidth, the estimated function becomes smoother, but is less
able to detect sharp features. A shorter bandwidth leads to a more ragged estimated
function, but sharp features will be more apparent.
where D (x ) is the range of the sample X’s used in the fit at x. fˆ (x ) is the intercept of
this regression.
Finally, METHOD=NN takes the simple average of y for the requested fraction of the
data closest to each value of x.
Examples
This example implements several non-parametric regressions from the nist Engi-
neering Statistics Handbook. See the file LOWESS.RPF for the complete code and the
web link for the nist document.
The first estimates using LOWESS with FRAC=.33. The default FRAC=.5 would give a
“stiffer” function, and we suspect that .33 was picked (by nist) because of that. The
second does Nadaraya-Watson with the default bandwidth. As this seems to produce
a bit too stiff a function as well, this is refit with a shorter bandwidth.
All of the estimated functions are graphed using SCATTER, with the original data
done as dots, and the functions done as lines. Note the use of OVSAME on these. If you
don’t use it, the function could end up being graphed on a different scale than the
data.
allocate 21
data(unit=input,org=columns) / seq x y
* See the LOWESS.RPF file for the actual data
npreg(method=lowess,frac=.33) y x / xv yv
scatter(style=dots,overlay=line,ovsame,$
header="LOWESS Fit, Frac=.33") 2
# x y
# xv yv
*
npreg(method=nadaraya,grid=input) y x / xv yn
scatter(style=dots,overlay=line,ovsame,$
header="Kernel Fit, Default Bandwidth") 2
# x y
# xv yn
npreg(method=nadaraya,grid=input,smoothing=.75) y x / xv yx
scatter(style=dots,overlay=line,ovsame,$
header="Kernel Fit, Smaller Bandwidth") 2
# x y
# xv yx
Variables Defined
%EBW The computed bandwidth
Parameters
RATS I/O unit The name of the unit you wish to open.
filename The name of the file or window to associate with the I/O unit.
Options
window/[nowindow]
If you use the WINDOW option, a text window will be opened with the title that you
provide in the filename field. Such a window can be used for text output.
append/[noappend]
If you are writing output to a file, you can use the APPEND option to append any
new output to the end of the current file. By default, the contents of the existing
file (if any) are destroyed.
immediate/[noimmediate]
format=[free]/binary/cdf/dbf/dif/html/matlab/portable/prn/rats/
rtf/tex/tsd/wf1/xls/xlsx/'(FORTRAN)'
write/[nowrite]
status=INTEGER variable returning status of open attempt [unused]
Normally, OPEN just associates a filename with a unit name. The file isn’t actu-
ally opened until you use an instruction like DATA or COPY. Use the IMMEDIATE
option if you want to open the file immediately. The FORMAT option specifies the
format of the file (see COPY or DATA for details on these). WRITE opens the file for
writing (output), otherwise it will be opened for reading (input). If you use the
STATUS option, the variable you supply will be set to 1 if the file was opened suc-
cessfully, and 0 otherwise.
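For example, to test whether a data file can be opened before trying to read it
(the file name here is illustrative):

open(immediate,status=ok) data mydata.dat
if .not.ok
{
   display "Unable to open the data file"
}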
You can also define your own names for I/O units. By defining your own units, you
can keep several data or output files open at a time and switch between them.
Examples
open copy teststat.fil
do i=1,draws
...
write(unit=copy) tstats
end do i
opens the file TESTSTAT.FIL as a “copy” file, and writes output to it from inside a
loop. Note that you put the OPEN instruction outside the loop. If you put it inside,
each OPEN will erase the old results.
compute fname=thiscountry+".rat"
open data &fname
data(format=rats)
opens as the DATA unit a file whose name is created from appending “.rat” to the cur-
rent contents of the string THISCOUNTRY and reads its contents. (DATA is the default
I/O unit for the DATA instruction).
open(window) test "Test Results"
display(unit=test) ...
opens the unit TEST as a window titled "Test Results" and puts information from the
DISPLAY instruction into it.
Notes
You can use the instruction CHANGE to redefine an I/O unit, which can be particularly
useful for redirecting output to different files.
The instruction CLOSE closes its associated file. A text file is usually left “open” after
an instruction writes to it. In general, an open file can’t be read from another pro-
gram. If you’re done with a unit, issuing a CLOSE instruction for it will make it pos-
sible to process the information with another application.
Parameters
optionname This is the name which you are assigning to the option. Within
the procedure, you must use its full name. However, when you
execute the procedure, only the first three characters are signifi-
cant, so make sure you don't use conflicting names (rats will
warn if you do so). Also, do not use a name which begins with
the letters NO.
The option names are local to the procedure, so you are free to
use the same option names in other procedures.
By default, all options are passed by value. SWITCH and CHOICE
options are always passed by value. You cannot change the
value of such an option within the PROCEDURE.
If you want an option to return a value, you need to pass it by
address. To do this, use *option name rather than option
name alone (User’s Guide, page UG–475).
default (for SWITCH): Use the value 0 if you want the option “off” by de-
fault, or the value 1 if you want it to be “on”. If you don’t specify
a value, this defaults to 0 (off).
default number (for CHOICE): the listed choice (by position in the list of
choices, not by name) you want to be the default. Use a 0 if
you want the default to be “not selected.”
list of choices (for CHOICE): the keywords (separated by blanks) that you want
to represent the choices. Only the first three letters are signifi-
cant. If you are using a PROCEDURE option to pass a choice for a
standard rats instruction, you should list the choices in exactly
the order used in the description of the instruction.
datatype For value options, this can be any of the rats supported data
types.
default value (for value options). Omit this if you want the default for the op-
tion to be “not selected.” You can use global variables or proce-
dure parameters in a default value expression. However, all
such variables must have been introduced before the OPTION
instruction. You cannot use local variables. Subject to these
restrictions, default value can be
• for any option passed by address: a single variable or array
element of the proper type.
• for REAL, INTEGER, COMPLEX, LABEL or STRING: any ex-
pression of the proper type.
• for SERIES or EQUATIONS: a series or equation name, or an
integer expression.
• for arrays, FRMLS, MODELS and PARMSETS: a single vari-
able (global variable or parameter) of the proper type.
• for FUNCTIONS, a FUNCTION which has the identical return
type and parameter list.
Examples
option choice type 1 flat tent
The TYPE option has choices FLAT and TENT with FLAT (choice 1) the default.
The code below is taken from the @TSAYTEST procedure, which does an arranged
regression test for threshold autoregression. This procedure requires that the user
supply a series of threshold values using the THRESHOLD option. The code below
checks to make sure that a series has been supplied for this option, as well as for
the dependent variable parameter. If the user has failed to supply either item, the
procedure generates a message detailing the required syntax, and then exits from the
procedure.
if .not.%defined(threshold).or..not.%defined(depvar)
{
display $
"Syntax: @tsaytest(threshold=threshold series) depvar start end"
display " # list of regressors"
return
}
Parameters
series Series to sort or rank.
start end Range of entries to sort or rank. If you have not set a SMPL, this
defaults to the defined range of series.
list of series (Optional) The listed series are reordered in parallel with
series. Observations are kept intact across series and list
of series and reordered as a group based upon the values of
series. To reorder all current data series, use the option ALL.
Options
decreasing/[nodecreasing]
Use DECREASING to sort or rank in decreasing order.
all/[noall]
Use ALL to reorder all current data series based upon series.
The INDEX option returns the permutation of entry numbers produced by the sort,
which you can use to reorder other series yourself:
order(index=ix) x
set y = x(ix)
Examples
order pop / spend tax aid
reorders POP, SPEND, TAX and AID based upon POP.
order(rank=xrank) xseries
order(rank=yrank) yseries
linreg yrank
# constant xrank
ranks XSERIES and YSERIES and regresses the Y-ranks on the X-ranks. The coeffi-
cient and T-stat on the XRANK coefficient give Spearman’s rank correlation test.
Technical Information
ORDER uses a "Shell sort." The sort time varies on the order of N^1.5, where N is the
number of data points. It should provide acceptable performance with virtually any
typical rats data set.
If you use the RANK option, ORDER assigns to all data points involved in a tie the aver-
age of the ranks. The smallest (if you are doing the default sort in increasing order)
gets a rank value of 1.
On a sort, ORDER breaks ties by keeping data points in their original entry order.
Thus, if you do ORDER(ALL) on series A, then ORDER(ALL) on series B, the result
will be a data set sorted first on B and then A for tied values of B.
See Also...
The functions %SORT, %SORTC, %SORTCL, and %RANKS can be used to sort or rank
data in a vector or matrix. The instruction STATISTICS with the FRACTILES option
computes a standard set of fractiles for a data series, and the function %FRACTILES
computes fractiles for a vector or matrix.
Parameters
series Source series.
start end Range of entries to use. If you have not set a SMPL, this defaults
to the defined range of series.
newseries Series for transformed values of series. By default,
newseries=series.
newstart The starting entry for the data in newseries. By default,
newstart=start.
Options
entry=Weight on value of series [0.0]
indiv=Weight on the individual mean [0.0]
time=Weight on the time mean [0.0]
icount=Weight on the count per individual [0.0]
tcount=Weight on the count per time period [0.0]
isum=Weight on the individual sum [0.0]
tsum=Weight on the time sum [0.0]
mean=Weight on the series mean [0.0]
These supply the weights on the various components. See "Description" for de-
tails.
effects=[individual]/time/both
This indicates whether to allow for INDIVIDUAL effects, TIME effects or BOTH.
This applies when you use the GLS and related options or the DUMMIES option.
compress/[nocompress]
If the transformation takes individual statistics only, or time statistics only, you
can use COMPRESS. It eliminates the repetitions and creates a series of length N
for INDIV and length T for TIME (as opposed to length N×T).
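For example, this produces a series of length N holding each individual's mean of
X (the series names here are illustrative):

panel(indiv=1.0,compress) x / xmeans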
panel data set or general grouped data. For those situations, you can use the ID
and IDENTRIES options. ID returns a VECTOR of values of the grouping variable
(in sorted order), while IDENTRIES provides (for each value in ID) the corre-
sponding set of entry numbers. These would be used something like the example
below, which walks through the individuals (the I loop) and then the entries for
that individual (the J loop, with IT being the entry number).
panel(group=p_cusip,id=vid,identries=identries)
do i=1,%size(vid)
do j=1,%size(identries(i))
compute it=identries(i)(j)
...
end do j
end do i
Description
With yit; i = 1, ..., N; t = 1, ..., T representing series, for entry it:
ENTRY is yit
INDIV is yi•, the mean of y for individual i, averaged across t
TIME is y•t, the mean of y for time t, averaged across i
MEAN is the mean across all entries
ISUM is the sum across t of y for individual i
TSUM is the sum across i of y for time period t
ICOUNT is the number of valid observations (time periods) in the current individual.
That is, for yit, ICOUNT returns the count of valid observations across all t for
individual i.
TCOUNT is the number of valid observations across individuals for the current time
period. That is, for yit, TCOUNT returns the count of valid observations across all i
at time period t.
For example, you would use the following options to create the indicated series:
For yit − yi•, use options ENTRY=1.0,INDIV=-1.0
For yit − θy•t, use options ENTRY=1.0,TIME=-THETA
Missing Values
PANEL removes any missing values from any average it calculates.
Examples
panel(entry=1.0,time=-1.0,smpl=oecd) lnrxrate / cxrate
Computes the deviations from time period means for LNRXRATE, using only the en-
tries for oecd countries.
panel(effects=time,dummies=tdummies) constant
Creates TDUMMIES as a set of time period dummies.
linreg(robust) lgaspcar
# lincomep lrpmg constant idummies
*
panel(spreads=countryvar) %resids
linreg(spread=countryvar) lgaspcar
# lincomep lrpmg constant idummies
This does a two-step feasible weighted least squares allowing for each individual to
have a different variance.
Variables Defined
%NGROUP number of individuals or groups used in the computation that
actually contain data. Does not include individuals/groups
where all time periods are empty (missing values).
Parameters
newseries The series to be constructed with correct form for a rats panel
data set.
start end The range of the input series to use.
Supplementary Card
input variables This can be a list of several series (each containing data for
one individual), or a single series that needs to be re-ordered to
conform to the panel organization supported by rats.
Options
input=[indiv]/time
The default behavior for PFORM is to take a set of series where each series con-
tains the data for one individual (across time), and stack those “individual” series
into a single “panel” series. Use INPUT=TIME if you instead have separate series
for each time period (that is, a given series contains data for all individuals for a
single time period). You can also use the TRANSPOSE, INDIVIDUAL, or TIME op-
tions described below to handle other situations—any of those options will over-
ride the INPUT=INDIV default.
transpose/[notranspose]
block=number of individuals per time block
If the input is a single series containing balanced panel data (each individual has
data for the same set of time periods), but is blocked by time, rather than by in-
dividual, the combination of TRANSPOSE plus BLOCK will cause the input variable
to be transposed into a series blocked by individual.
repeat/[norepeat]
If you have a series which is common to all individuals (there should be only one
input variable), and you have already set up a panel CALENDAR scheme, PFORM
with REPEAT will replicate one set of data across all individuals.
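As a sketch of the REPEAT case (the series names are illustrative; the CALENDAR and ALLOCATE settings are taken from the Grunfeld example used elsewhere in this manual):

```
cal(panelobs=20) 1935
all 10//1954:1
set trend 1 20 = t
* Replicate the single 20-entry series across all 10 individuals
pform(repeat) paneltrend
# trend
```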
Variables Defined
%NGROUP number of individuals in the created series (INTEGER)
%NOBS number of time periods in the created series (INTEGER)
Description
PFORM constructs newseries as a series with the proper panel form: balanced (same
number of time periods for each individual, although the series can contain missing
values), and blocked by individuals (entries 1 through T contain all data for individu-
al 1, entries T+1 to 2T contain all data for individual 2, etc.). It constructs one series
at a time. PFORM can handle three different forms of input data:
Each individual is in a separate series
List the set of series on the supplementary card. Set start and end to the range of
each of these you want included in the concatenated series. If you leave start and
end blank, they will default to the maximum range covered by the input series, with
missing values used to pad the records of individuals whose series are shorter.
Each time period is in a separate series
If you have a separate series for each time period, list the set of series on the supple-
mentary card and use the INPUT=TIME option.
Single series, balanced panel, but data blocked by time, not individual
If you have a series blocked by time rather than by individual (first N observations
contain data for time period 1 for all individuals, next N observations contain data for
time period 2, etc.), supply the series on the supplementary card, use the TRANSPOSE
options to tell rats to reorder the data, and the BLOCK option to tell it the number of
individuals.
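For instance, with a balanced panel of five individuals stacked time period by time period in a single series (STACKED and BYINDIV are hypothetical names), the reordering might be sketched as:

```
* Reorder from time-blocked to individual-blocked
pform(transpose,block=5) byindiv
# stacked
```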
Single series, possibly unbalanced, with a separate index series
If your series is unbalanced, or isn’t in any regular order whatsoever, you can still
form a panel series if you have separate index or “tag” series that identify the indi-
viduals and time periods. List the single input series on the supplementary card and
use the INDIVIDUAL and TIME options to supply your index series. If you use TIME
but not INDIVIDUAL, rats assumes that the first time a value of the TIME series oc-
curs, it is on an observation for the first individual; the second time is for the second
individual, etc. Similar assumptions apply if you use INDIV but not TIME.
Regardless of the options, transform each of the series in your data set first. Then set
the CALENDAR to describe your data set and either continue your analysis, or save the
data to a new data file. The variable %NOBS is set to the number of observations per
individual, which is the proper value for the PANELOBS option on CALENDAR.
Examples
This creates a single panel series from eight separate time series and resets the CAL-
ENDAR to the appropriate values.
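A minimal sketch of that first case (assuming eight series S1 through S8, each holding the data for one individual) might look like this:

```
pform spanel
# s1 s2 s3 s4 s5 s6 s7 s8
*
cal(panel=%nobs)
all %ngroup//%panelobs()
```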
This reblocks an unbalanced sample using the ID and TIME options to identify the
two dimensions.
open data abdata.dta
data(format=dta) 1 1031 ind year emp wage cap indoutpt n w k $
ys rec yearm1 id nl1 nl2 wl1 kl1 kl2 ysl1 ysl2 yr1976 yr1977 $
yr1978 yr1979 yr1980 yr1981 yr1982 yr1983 yr1984
*
pform(indiv=id,time=year) p_n
# n
pform(indiv=id,time=year) p_w
# w
pform(indiv=id,time=year) p_k
# k
pform(indiv=id,time=year) p_ys
# ys
pform(indiv=id,time=year) p_year
# year
*
cal(panel=%nobs)
all %ngroup//%panelobs()
Parameters
cseries Complex series which POLAR is to decompose.
start end Range of entries to process. By default, the defined range of
cseries.
modulus (Optional) Complex series for the modulus of each entry. Use *
here if you want the argument but not modulus.
argument (Optional) Complex series for the argument of each entry. Use *
here if you want to use newstart but not argument.
newstart Starting entry for modulus and argument. By default, same as
start.
Options
periods/[noperiods]
If you use the PERIODS option, POLAR converts the phase lead (argument) from
radians to periods. At frequency ν, a phase lead of θ radians is equivalent to a lead
of θ/ν periods. This conversion is meaningful only for frequencies 0 to π.
Notes
For individual complex numbers, you can get the polar decomposition using the
functions %CABS(z) and %ARG(z). These are real-valued functions which return the
modulus and the argument of the complex number z.
If you try to do the polar decomposition of an unsmoothed cross periodogram, you will
find that you get a coherence of 1.0, as, in effect, you are estimating a separate rela-
tionship at each frequency. Make sure that you smooth everything first.
Example
POLAR can compute coherences and phase leads for cross-spectral analysis. In the fol-
lowing example, series 6, 7 and 8 are the two spectral densities and the cross-spectral
density of a pair of series. POLAR sets 9 as the series of coherences and 10 as the
phase leads. The phase leads are converted to periods. These two series are then sent
back to the time domain and graphed using SCATTER with a “production-quality”
setup, using x-axis labels showing fractions of π, and separate scales for each. The
coherence is forced onto a scale of 0.0 to 1.0.
fft 1
fft 2
cmult 1 1 / 3 We’re not scaling the periodograms as
cmult 2 2 / 4 the scale factors will wash out when we
cmult 1 2 / 5 define series 8 below.
window 3 / 6
window 4 / 7
window 5 / 8
cset 8 = %z(t,8)/%csqrt(%z(t,6)*%z(t,7))
polar(periods) 8 / 9 10
ctor 1 nords/2
# 9 10
# coher phase
Use the Symbol font to get π’s for the axis labels. This doesn’t affect the numbers.
grparm(font="Symbol") axislabels 14
These are good labels for monthly data. For quarterly data, labeling at 0, π/4,
π/2, 3π/4 and π is more sensible.
Wizard
The Statistics—Panel Data Regressions wizard provides dialog-driven access to the
PREGRESS instruction.
Parameters
depvar Dependent variable.
start end Estimation range. If you have not set a SMPL, this defaults to
the maximum common range of all the variables involved.
resids (Optional) Series for the residuals. These will be the trans-
formed residuals.
Description
This estimates β in the linear regression
(1) y_it = X_it β + u_it, where
(2) u_it = ε_i + λ_t + η_it, unless METHOD=SUR
ε is the individual effect, λ is the time effect and η is the purely random effect. If you
use the option EFFECTS=INDIV, or METHOD=FD, the decomposition only includes the ε
and η components. With EFFECTS=TIME, it only includes λ and η.
METHOD=POOLED just estimates (1) by least squares, with no panel effects.
METHOD=BETWEEN estimates (1) by least squares on individual averages. If you
use METHOD=FIXED, ε_i and λ_t are treated as constants and are “swept” out. With
METHOD=RANDOM, they are treated as part of the error term and β is estimated
by gls. If METHOD=FD (first difference), the data are differenced to eliminate ε_i.
METHOD=SUR assumes that the u’s are serially uncorrelated, but are correlated across
i at a given t.
For random effects estimation, you can input the variances of the components your-
self using the VRANDOM, VINDIV and VTIME options, or you can allow PREGRESS to
estimate them. There are many ways to estimate consistently these variances; most
of the commonly used choices can be implemented by a combination of the VCOMP and
CORRECTION options.
Options
effects=[individual]/time/both
This indicates whether to allow for INDIVIDUAL effects, TIME effects or BOTH.
method=[fixedeffects]/randomeffects/fd/sur/between/pooled
This chooses the estimation method: fixed or random effects, first difference,
cross-section sur, the “between” estimator, or pooled panel regression.
instruments/[noinstruments]
Use the INSTRUMENTS option to do instrumental variables. You must set your
instruments list first using the instruction INSTRUMENTS. This can be used with
any of the METHOD options except SUR.
hausman/nohausman
If you do METHOD=RANDOM, the HAUSMAN option requests that a Hausman test (for
random vs fixed effects) be done. Computing this requires the fixed effects esti-
mate. Because VCOMP=WK or VCOMP=SA each do a fixed effects regression anyway,
if you choose one of those, PREGRESS will do the Hausman test. This option is
necessary only if you want the Hausman test and you are using a different choice
for the component variances.
robusterrors/[norobusterrors]
cluster=SERIES with category values for clustered calculation
The combination of ROBUSTERRORS and CLUSTER allows the calculation of coef-
ficient standard errors which are robust to arbitrary correlation within groups
defined by the CLUSTER expression. This can be applied to any choice of METHOD
except SUR. CLUSTER=%INDIV(T) (or LWINDOW=PANEL) would be used for stan-
dard errors clustered by individuals.
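For example, a pooled regression with standard errors clustered by individual might look like the following sketch (Y and X are placeholders for your own series):

```
preg(method=pooled,robusterrors,cluster=%indiv(t)) y
# constant x
```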
[print]/noprint
vcv/[novcv]
smpl=SMPL series or formula ("SMPL Option" on page RM–546)
unravel/[nounravel] (User’s Guide, Section 2.10)
define=equation to define (User’s Guide, Section 1.5.4)
frml=formula to define
equation=equation to estimate
dfc=Degrees of Freedom Correction (Additional Topics, Section 1.4)
title=title to identify estimation method [depends upon options]
See LINREG for details. If you use EQUATION, omit the supplementary card.
Variables Defined
Besides the usual regression variables (%XX, %BETA, %LOGL and %RSS, etc.) PREGRESS
defines:
%VRANDOM variance of η: the random component (REAL)
%VINDIV variance of ε: the individual component (REAL)
%VTIME variance of λ: the time component (REAL)
%NGROUP number of individuals or groups (INTEGER)
%SIGMA covariance matrix for METHOD=SUR (SYMMETRIC)
%NFREE number of freely estimated parameters. This includes any fixed
effects coefficients that aren't reported directly and variance or
covariance matrix parameters (INTEGER)
Technical Information
The different choices for VCOMP use different quadratic forms in residuals to estimate
the component variances. VCOMP=WK (Wansbeek-Kapteyn) estimates fixed effects and
uses its residuals several ways. VCOMP=SA (Swamy-Arora) uses residuals from fixed
effects and “between” estimators. VCOMP=WH (Wallace-Hussain) uses the ols residu-
als several ways. See Baltagi (2008) for more detail.
To demonstrate how the CORRECTION option works, we’ll look at part of the calcula-
tion for VCOMP=WH. The ols residuals will be (I − X(X′X)⁻¹X′)u, where u is the vec-
tor of true residuals. The expected value of the sum of squared residuals will be
(3) trace((I − X(X′X)⁻¹X′) E(uu′))
where E(uu′) is a matrix with elements which are linear combinations of the
component variances. CORRECTION=NONE ignores the X matrices (in effect, acting as
if the ols residuals are the true residuals) and uses as one of its conditions that the
sum of squared ols residuals is equal to trace(E(uu′)). CORRECTION=DEGREES uses
a correction based on the number of regressors in X rather than on their specific values.
CORRECTION=FULL computes the exact correction using the structure of E(uu′)
and X. (This can be rearranged using the properties of the trace to avoid multiplying
the potentially very large matrices in (3).)
Examples
preg(method=between) invest
# constant value cap
preg(method=fixed) invest
# constant value cap
preg(method=random,vcomp=wh) invest
# constant value cap
preg(method=random,vcomp=wk) invest
# constant value cap
preg(method=random,vcomp=sa) invest
# constant value cap
preg(method=random,vcomp=ml) invest
# constant value cap
This estimates an equation, allowing for individual effects only, using the between
estimator, fixed and random effects, done with several variance components estima-
tors. Note that we left the CONSTANT in the fixed effects equation. This will show a
zero coefficient with zero standard error.
preg vfrall
# beertax
preg(effects=both) vfrall
# beertax
This estimates an equation (by fixed effects, which is the default), first allowing for
individual effects only, then allowing for both individual and time effects.
Notes
If you estimate using fixed effects, the reported degrees of freedom will be reduced by
the number of implied “dummy” variables.
For a balanced sample, the coefficient estimates for both fixed and random effects
will be identical to those you would get doing the equivalent regression “by hand,”
using PANEL to transform the data. The covariance matrix will be slightly different
with random effects because of a different estimate of the variance. In an unbalanced
sample, the results will be the same for fixed effects, but only for EFFECTS=TIME or
EFFECTS=INDIV. Random effects on an unbalanced data set should only be done us-
ing PREGRESS.
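The “by hand” equivalence described above can be sketched as follows (Y and X are hypothetical series; the PANEL options are the within-transformation settings described under the PANEL instruction):

```
* Fixed effects (individual) via PREGRESS
preg(method=fixed,effects=indiv) y
# constant x
* The same slope "by hand": subtract individual means, then regress
panel(entry=1.0,indiv=-1.0) y / ydev
panel(entry=1.0,indiv=-1.0) x / xdev
linreg ydev
# xdev
```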
Wizard
You can also view series by using the View—Series Window operation to open a Se-
ries Window, selecting the series you want to view, and selecting View—Data Table or
clicking on the View Data toolbar icon:
Parameters
start end Range of entries to print. If you have not set a SMPL, PRINT
uses the smallest range required to show all defined data for
the series. Any undefined data are treated as missing values.
list of series The list of series to print. If you omit the list, all current series
are printed.
Options
[dates]/nodates
By default, PRINT labels entries with their dates if you have set a CALENDAR. Use
NODATES to get just the entry number instead.
picture="picture code"
By default, PRINT shows the full precision of the data, which can make the out-
put hard to read. You can use PICTURE to reduce the number of decimals. A
picture code takes a form like "##.###" or "*.#". The first requests two digits
left of the decimal and three digits to the right. The second asks for one digit
right and as many digits as needed to the left. For details, see "Picture Codes" on
page RM–545.
window="Title of window"
If you use the WINDOW option, the output is displayed in a (read-only) spreadsheet
window. The series will be in columns, with labels across the top row. You can
export the contents of this window to various file formats using File—Export....
Description
This displays entries start to end of the list of series. See the sample out-
put below. If all of the series will not fit across the page, PRINT will put them in blocks of
between four and seven. Note, by the way, that PRINT determines the default range
(start to end) separately for each block. (If you use the WINDOW option, however,
the series will always be in a single block.)
Missing Values
Missing values and data outside the range of a series display as NA.
Examples
print 1920:1 1929:1 prod profit capital
Wizard
You can use the Time Series—Single-Equation Forecasts wizard to forecast univari-
ate models, including static forecasts as produced by PRJ, though it will produce a
UFORECAST instruction to do that.
Parameters
series Series for the fitted values. If you use the options described
later under "Distribution Statistics", these are the normalized
fitted values (zi in the notation there).
start end Range of entries for which fitted values are to be computed.
If you have not used the SMPL instruction to set a range, this
defaults to the range of the most recent regression. Note: using
the SMPL option on the preceding regression has no effect on the
range set by PRJ.
Description
PRJ handles forecasts for only certain types of models and certain situations. Use
UFORECAST, FORECAST, or STEPS if you need more flexibility.
You can use PRJ to get fitted values after a LINREG, STWISE, DDV, LDV, AR1, or
BOXJENK, although some statistics cannot be computed for AR1 and BOXJENK.
PRJ also has several options for computing distribution statistics from the fitted val-
ues. These are important for programming truncated and censored regressions, and
for diagnostic tests in probit and related models (see User’s Guide, Section 12.3).
Fitted Values
PRJ takes the coefficients (b) and the regressors (x) from the most recent regression
and computes the fitted values (xtb) over entries start to end. For a logit or probit,
this gives the index value for the case.
When you use PRJ outside the regression range, it computes a simple form of forecast
called a static forecast: predicting the dependent variable given the values of all the
regressors. This is useful only for models with no lagged dependent variables. Note
that you must have data available for the right-hand-side variables in order to com-
pute forecasts.
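A sketch of such a static out-of-sample forecast (hypothetical series names; data on X1 and X2 must extend through 2012:4):

```
linreg y * 2010:4
# constant x1 x2
prj yhat 2011:1 2012:4
```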
Distribution Statistics
You can use PRJ with the set of options described below to obtain one or more of the
following statistics from a series zi of (standardized) deviates:
• Density: φ(z_i)
• Distribution: Φ(z_i)
• Inverse Mills’ ratio: φ(z_i)/Φ(z_i)
• Derivative of the inverse Mills’ ratio, evaluated at z_i.
If observation i is truncated at the value T_i, z_i takes the following values:
Bottom truncation
z_i = (X_i β̂ − T_i)/σ
Top truncation
z_i = (T_i − X_i β̂)/σ
density=Series of densities
cdf=Series of distributions
mills=Series of Inverse Mills’ Ratios
dmills=Series of Derivatives of MILLS
You can use any or all of these four options in a single PRJ instruction.
After a DDV estimation, the CDF option will generate the series of predicted prob-
abilities of the “1” choice. Use the option DISTRIB=LOGIT if you want these to be
calculated for the logit, as the default is to compute these for the normal (regardless
of your choice on the DDV).
Other Options
smpl=SMPL series or formula ("SMPL Option" on page RM–546)
You can supply a series or a formula that can be evaluated across entry numbers.
Entries for which the series or formula is zero or “false” will be skipped, while
entries that are non-zero or “true” will be included in the operation.
If the output series already exists, observations of that series not included in the
SMPL will be completely unaffected by the PRJ operation.
Note that UPPER and LOWER replace the older TRUNCATE and TOP/[NOTOP] op-
tions which provided the same functionality.
Use missing value codes for any entries that are to be treated as unlimited.
atmean/[noatmean]
xvector=the value of xi [unused]
The ATMEAN and XVECTOR options allow you to compute the index, density, stan-
dard error, and predicted probability for a single input set of X’s. The values are
returned as the variables %PRJFIT, %PRJDENSITY, %PRJSTDERR and %PRJCDF.
The ATMEAN option does the calculation at the mean of the regressors over the
estimation range. With XVECTOR, you provide a vector at which you want the
values calculated.
Additional statistics can be obtained using the options DENSITY, MILLS and DMILLS.
Notes
The MILLS option is the only one necessary to compute the correction for truncation.
You can use DMILLS with the MCOV instruction and matrix operations to compute the
covariance matrix of these estimators.
ddv(noprint) choice1
# constant age nadults nkids nkids2 lnx $
agelnx nadlnx bluecol whitecol
prj(mills=lambda)
linreg(smpl=share1>0,title="Tobit II") share1
# constant age nadults nkids nkids2 lnx agelnx nadlnx lambda
Parameters
procname The name you want to give this procedure. procname must be
distinct from any other procedure or variable name in the pro-
gram.
parameters Names of the formal parameters. These names are local to the
procedure—they will not conflict with variables elsewhere in
the program. When you execute the procedure, rats passes the
values on the EXECUTE instruction to the formal parameters.
By default, parameters are INTEGER passed by value. Use TYPE
to use other RATS Data types or to pass by address.
Description
The most powerful compiler structure in rats is the procedure. A procedure is simi-
lar to the subroutines of Fortran, the functions of C, or the procedures of Java. In effect,
procedures allow you to define new instructions from a sequence of rats commands.
A procedure begins with a PROCEDURE statement and ends with a matching END. The
PROCEDURE statement itself names the procedure and lists the formal parameters.
The usual arrangement of the statements in a procedure is
procedure statement
type, declare, local, option and fixed statements, if any.
other instructions
end
You should try to write the procedure using only parameters, options, local variables
and global variables which rats itself defines. That way you don’t have to worry
about accidentally changing a global variable. If you have a global variable that you
want to access outside the procedure, use a name starting with %% so it won’t conflict
with either the user’s names or any names defined by rats.
An alternative to a procedure is a function, which you can create with the FUNCTION
instruction. Where procedures define new instructions, functions define operations
similar to the functions used within rats expressions.
Procedures are described in much greater detail in Chapter 15 of the User’s Guide. You
should also see the descriptions of TYPE, LOCAL, and OPTION, which are used to set
parameter types, define local variables, and define procedure options, respectively.
Examples (Partial)
procedure cumpdgm series start end
type series series
The procedure CUMPDGM has three parameters: SERIES is a type SERIES, START and
END are INTEGER.
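Putting the pieces together, a complete (if trivial) procedure following the layout above might look like this sketch:

```
procedure showmean series start end
type series series
type integer start end
*
statistics(noprint) series start end
display "Mean of series =" %mean
end
```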
Using SOURCE
Once you have a procedure working the way you want, it is usually a good idea to
save it as a separate file, so you can use it in different applications. A well-designed
procedure can be used with a variety of data sets if specific information about the
current data is passed to the procedure through parameters and options, rather than
being hard-coded into the procedure. It’s most useful to name the file procname.src,
which will allow rats to locate it automatically when you use it.
If you have a procedure on a separate file, bring it into your current rats program
using the instruction SOURCE. The typical instruction is
source file with procedure
If you have a collection of procedures which you use regularly, you can create a proce-
dure library that gets brought in right at the start of your program.
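For example, if the CUMPDGM procedure from earlier were saved to the file CUMPDGM.SRC, a program could begin with (GDP is a placeholder series):

```
source cumpdgm.src
@cumpdgm gdp 1960:1 2007:4
```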
Running a Procedure
Procedures are executed using the EXECUTE command or (more commonly), using the
@ sign (a shortcut for EXECUTE). For example:
@cumpdgm x 1980:1 2017:12
Order of Procedures
For complex tasks, it is very common for a procedure to execute other procedures. Be-
cause rats needs to know the syntax of a procedure before it can even interpret the
instruction which will execute it, it must process the sub-procedure before the main
one. Thus you either need to place the sub-procedure first in your file, or “SOURCE” it
in from a separate file before processing the main procedure.
Wizard
If you open a rats format file using File—Open..., you can export series to another
file using the File—Export... operation.
Parameters
list If you list a set of series names, PRTDATA will only print the
data for those series. If you leave this parameter blank, it will
print all the series on the file.
Options
format=[portable]/binary/cdf/dbf/dif/free/html/prn/rats/rtf/tex/
tsd/wks/xls/xlsx/”(FORTRAN format)”
This selects the desired format for the output. The “spreadsheet” formats (XLS,
XLSX, WKS, RTF, DBF, PRN, CDF, DIF, HTML and TEX) will only work if the listed
series have the same frequency. If you are making an archival copy, we would
suggest you stick with PORTABLE, since it includes all the information present on
the rats format file itself.
unit=output/copy/other unit
This sets the output unit. UNIT=OUTPUT is the default if you use
FORMAT=PORTABLE. Otherwise PRTDATA defaults to UNIT=COPY. See the OPEN
instruction for an explanation of I/O units.
Description
You can use PRTDATA to:
• check the data on the file.
• make archival copies of the data in a “human-readable” format, since rats
format files are machine-readable only.
• transfer data to another program.
Examples
dedit ukdata.rat
open copy archive.dat
prtdata(unit=copy)
This opens the rats format file UKDATA.RAT and a COPY file called ARCHIVE.DAT.
Then the PRTDATA instruction prints, in PORTABLE format, all series on the UKDATA.
RAT file to ARCHIVE.DAT.
dedit sales.rat
open copy sales.wks
prtdata(format=wks,org=cols) salesx31 salesy45 salesy66
This constructs the WKS format file SALES.WKS using the data from series SALESX31,
SALESY45 and SALESY66.
See Also . . .
DEDIT Opens or creates rats format data files.
OPEN Opens files
COPY Writes information from working data series to a file.
Parameters
series Series for which you want to compute statistics.
start end Range of entries to use. If you have not set a SMPL, this defaults
to the defined range of series.
Description
PSTATS uses the following decomposition of u_it (the series):
u_it = ε_i + λ_t + η_it
ε is the individual effect, λ is the time effect and η is the purely random effect. If you
use the option EFFECTS=INDIV, the decomposition only includes the ε and η compo-
nents. With EFFECTS=TIME, it only includes λ and η. (Note: if a particular effect is
weak, it is possible for the estimated variance of the component to be negative.)
Options
effects=[individual]/time/both
This indicates whether to allow for INDIVIDUAL effects, TIME effects or BOTH.
smpl=SMPL series
This is the standard SMPL option ("SMPL Option" on page RM–546).
tests/[notests]
TESTS requests the calculation of F-tests (analysis of variance) for the effects. For
EFFECTS=INDIVIDUAL or EFFECTS=TIME, these are just the one-factor analysis
of variance tests. For EFFECTS=BOTH, these are two-factor tests with one obser-
vation per cell.
spread/[nospread]
If you use SPREAD, PSTATS does a likelihood ratio test for equal variances across
cross-sections.
Missing Values
rats drops missing values from the computation.
Variables Defined
%NGROUP number of individuals/groups that actually contain data (omits
individuals where all time periods are empty) (INTEGER)
%VRANDOM variance of η: the random component (REAL)
%VINDIV variance of ε: the individual component (REAL)
%VTIME variance of λ: the time component (REAL)
Example
This is a variation on the code presented in the PANEL.RPF example program.
cal(panelobs=20) 1935
all 10//1954:1
open data grunfeld.dat
data(format=prn,org=cols)
linreg invest
# firmvalue cstock
pstats(tests,effects=both) %resids
preg(method=fixed) invest
# firmvalue cstock
pstats(tests,effects=time) %resids
pstats(spread) %resids
The first PSTATS tests the residuals from an OLS regression for time and individual
effects, the second tests the residuals from a fixed effects regression for time effects.
The third PSTATS tests for equal variances.
Output
The output from the second and third PSTATS instructions above is:
Notes
If you use EFFECTS=BOTH, the analysis of variance table will include F-tests for indi-
vidual effects, time effects and a joint test. Note that the individual effects test will
not be the same as you would get with EFFECTS=INDIV (and similarly for the time
effects test with EFFECTS=TIME), as it is testing for individual effects allowing for
time effects, while with EFFECTS=INDIV, it is not conditional on time effects.
Parameters
list This can be any collection of INTEGER, REAL, COMPLEX, STRING
or LABEL variables or array elements. You may not use an array
itself. You must introduce any variable prior to using it in a
QUERY instruction, with DECLARE for instance.
Options
prompt="string" or STRING variable ["Enter Information"]
This displays a prompt on the screen as a message to the user. It can either be a
string enclosed in quotes (") or a STRING variable.
The default error message is “Illegal Value”. If you use VERIFY, it’s a good idea to
include an error message and to make it as informative as possible.
initialize/[noinitialize]
If INITIALIZE, the text box is filled with the current value of the variable. This
only works if there is only one variable. By default, the field is blank.
Examples
These show two uses of QUERY to set up the early part of a rats session.
declare integer ninput
query(prompt="How many observations") ninput
allocate ninput
Because rats does not make any changes to the file until you give an explicit SAVE
instruction, you actually only need to use QUIT if you want to free up the memory
space occupied by the file directory.
See Also . . .
DEDIT Opens or creates a rats format file.
SAVE Saves changes to an open rats format file.
STORE Adds data series to an open rats format file. You must do a
SAVE to make the changes permanent.
END Ends a rats program.
QZ Instruction — QZ Decomposition
qz( options ) A B
This computes the generalized Schur decomposition of the matrix pair (A,B), which
should be square matrices of the same dimensions. This generates a collection of ma-
trices such that QΛZ′ = A and QΩZ′ = B, where Q and Z are orthogonal matrices
(QQ′ = ZZ′ = I), Ω is upper triangular and Λ is block upper triangular, where the
blocks on the diagonal are 1×1 for real generalized eigenvalues and 2×2 for complex
conjugate pairs.
Unlike the standard eigenvalue routines, it isn’t easy to sort generalized eigenval-
ues. Instead, if it’s necessary to control positioning, they are partitioned into two sets
based upon some criterion. Assuming that the criterion never separates a pair of
complex conjugate eigenvalues, the L and W matrices will retain a block triangular
structure, which is generally all that is needed for further work. The blocking is con-
trolled by the combination of the BLOCK and CUTOFF options. The SIZE option allows
you to find out how big the “upper” block is.
Options
block=below/above/real/imag
Indicates the criterion used for determining which eigenvalues go into the upper
block. BELOW and ABOVE are based upon absolute values, and use the CUTOFF
value. BLOCK=BELOW,CUTOFF=1.0 will put all eigenvalues with absolute value
less than 1.0 in the upper block. REAL and IMAG partition the eigenvalues into
real and complex, moving the indicated group into the upper block.
The following are used to provide names for the variables computed by QZ:
q=(output) Q matrix
z=(output) Z matrix
lambda=(output) Λ matrix
omega=(output) Ω matrix
cvalues=(output) VECTOR[COMPLEX] of generalized eigenvalues
evalues=(output) VECTOR of real parts of generalized eigenvalues
size=(output) size of the upper block
Example
qz(q=q,z=z,lambda=lambda,omega=omega,block=below,$
cutoff=1.0/beta) g0 g1
does a qz decomposition of the pair G0, G1 with blocking to put all generalized eigen-
values less than 1.0/beta in absolute value at the top.
Parameters
start end Range over which the covariance matrices are computed. If you
have not set a SMPL, this defaults to the maximum common
range of all the series in the two sets.
Supplementary Cards
On the two supplementary cards, list the two sets of series to use in the test. RATIO
will compare the covariance matrices of these two sets of series. Both sets should
have the same number of series. You don’t need to worry about which of the two is
from restricted estimates and which is from the unrestricted as RATIO takes the
absolute value of the computed statistic.
Options
degrees=test degrees of freedom (Required)
This is the degrees of freedom for the chi-squared statistic, that is, the number of
restrictions. You must use this option.
[print]/noprint
title=”string for output title”
Use NOPRINT to suppress the printing of the test results. If you are showing the
output, you can use the TITLE option to provide your own title for the output.
Description
RATIO takes two lists of residual series, computes the two covariance matrices (Σ₁
and Σ₂) and generates the chi-squared statistic:
(T − c) | log|Σ₁| − log|Σ₂| |
where T is the number of observations and c is given by the MCORR option. Note that
RATIO does not compute centered covariance matrices, that is, it does not subtract
means from the input series.
The null hypothesis is that the two log determinants are equal. Small test statistics
and significance levels close to 1.0 suggest the hypothesis can be accepted. Larger
statistics and significance levels close to 0.0 suggest the hypothesis is rejected.
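Since the statistic is just a function of two log determinants, the computation can be sketched in a few lines of Python. This is an illustrative sketch only, with made-up residual series (RATS performs this computation internally):

```python
import math

def uncentered_cov(series):
    """Covariance matrix (1/T) * sum of u_t u_t', with no mean
    subtraction, matching RATIO's uncentered calculation."""
    k, T = len(series), len(series[0])
    return [[sum(series[i][t] * series[j][t] for t in range(T)) / T
             for j in range(k)] for i in range(k)]

def det2(m):
    # determinant of a 2x2 matrix
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]

# Hypothetical residuals from unrestricted and restricted estimates
unres = [[1.0, 0.5, -0.3, 0.8], [0.2, -0.4, 0.6, -0.1]]
res   = [[1.1, 0.6, -0.2, 0.9], [0.3, -0.5, 0.7, -0.2]]
T, c  = 4, 0                      # c is the MCORR correction

s1, s2 = uncentered_cov(unres), uncentered_cov(res)
stat = abs((T - c) * (math.log(det2(s1)) - math.log(det2(s2))))
```

With DEGREES=q, the resulting statistic would then be compared against a chi-squared distribution with q degrees of freedom.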
Example
ratio(degrees=27,mcorr=10)
# ures1 ures2 ures3
# rres1 rres2 rres3
tests the difference between the covariance matrix of series ures1, ures2 and ures3
and that of series rres1, rres2 and rres3. The test statistic is compared with a χ²
distribution with 27 degrees of freedom. Output such as:
Log Determinants are -46.441108 -44.054300
Chi-Squared(100)= 90.698721 with Significance Level 0.73622035
would suggest that the null hypothesis can be accepted.
Variables Defined
%CDSTAT the computed test statistic (REAL)
%SIGNIF the marginal significance level (REAL)
%NOBS number of observations (INTEGER)
%NVAR number of variables (INTEGER)
See Also . . .
VCV Computes a covariance matrix of a single set of series.
CDF Computes the significance level for a test statistic.
%DET(A) Function returns the determinant of a matrix.
Parameters
arrays,... These are the objects for which data is to be read. You can use
any combination of variables. You can use arrays of arrays, but
any arrays must be dimensioned ahead of time (unless you use
the option VARYING, or are reading from a matlab file).
Options
format=[free]/binary/cdf/matlab/prn/tsd/wks/xls/xlsx/
“(FORTRAN format)”
This tells READ the format of the data. See below for information on how the for-
mat option affects the way that READ fills arrays.
unit=input/[data]/other unit
READ reads the data from the specified I/O unit. By default, this is the DATA unit.
varying/[novarying]
status=INTEGER variable set to 0,1 status
singleline/[nosingleline]
These are more advanced options. See their description later in this section.
Description
The READ instruction reads information into the arrays and variables in the order
listed. The manner in which arrays are read depends upon the FORMAT option.
• With FORMAT=FREE, READ is identical to INPUT except for the different default set-
ting for the UNIT and SINGLELINE options. It reads RECTANGULAR arrays by rows,
and SYMMETRIC or PACKED arrays by the rows of the lower triangle. It can read
more than one row from a single line.
• With FORMAT=CDF, TSD, PRN, WKS, XLS, and XLSX, READ fills cells in the target
variables based on the arrangement of the data on the file. If you don’t have
enough values in a row, the remaining elements get missing value codes.
• With FORMAT=MATLAB, READ will read (and dimension) only full arrays which exist
with the same name on the file.
• With FORMAT=BINARY, READ processes arrays in their internal order. Arrays
which are RECTANGULAR are read by columns, SYMMETRIC or PACKED by the rows
of the lower triangle.
• With a FORTRAN format (Additional Topics, Section 3.9), READ requires that each
array and each row of an array begin on a new line. The format should be the
format required to read a single row, not the whole array. As with FORMAT=FREE,
it reads RECTANGULAR arrays by rows and SYMMETRIC or PACKED arrays by the
rows of the lower triangle. A VECTOR is treated like a single row.
Notes
You should use FORMAT=BINARY only to read data written out of rats using
WRITE(FORMAT=BINARY). If the binary data are generated in some other fashion, it
is possible that the byte streams won’t match and you will end up with gibberish.
Examples
declare symmetric d(3,3)
read(unit=input,format="(3f6.2)") d
10.00
-5.20 12.20
1.30 5.40 14.60
read(unit=input,format=free) d
10.0 -5.2 12.2 1.3 5.4 14.6
The two READ instructions put the same set of numbers into the SYMMETRIC array D.
The first is an example of a formatted read: each row of data for the array appears on
a separate line. The second creates the same matrix using the free format option.
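The lower-triangle-by-rows fill order that READ uses for SYMMETRIC arrays can be sketched in Python (illustrative only, not RATS code), using the same numbers as the example above:

```python
values = [10.00, -5.20, 12.20, 1.30, 5.40, 14.60]  # free-format stream

n = 3
d = [[0.0] * n for _ in range(n)]
stream = iter(values)
for i in range(n):              # row i of the lower triangle
    for j in range(i + 1):      # columns 0 through i
        x = next(stream)
        d[i][j] = d[j][i] = x   # symmetric: fill both halves
```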
declare rectangular a
open data arrayin.mat
read(format=matlab) a
This reads the array A from a matlab file. It will take its dimensions from the ma-
trix A on the file.
Advanced Options
varying/[novarying]
status=INTEGER variable set to 0,1 status
singleline/[nosingleline]
The VARYING and STATUS options allow you to work with lists whose size you do
not want to set in advance.
You can use VARYING to input data for a single VECTOR array of any numeric or
character type. With VARYING, the VECTOR is filled with as much data as is avail-
able. By default, this is the entire contents of the data file. With SINGLELINE,
it will read data only from a single line. (SINGLELINE will be ignored if you use
FORMAT=BINARY).
If you use the option STATUS, READ does not give you an error if there is not
enough data to fill all the variables. Instead it sets your STATUS variable to 0. If
the READ is successful, it sets the status variable to 1.
See Also . . .
INPUT An alternative to READ, capable of reading only free-format data.
It is designed for reading data from the input unit.
ENTER Reads data for arrays and variables from supplementary cards.
MEDIT Obtains data for an array from the user through a spreadsheet
window.
WRITE Writes arrays and variables to output or to an external file.
Parameters
list... The arrays or series whose memory blocks you want to release.
Options
The options allow the release of space set aside for certain other purposes, though
with modern computers, these are generally not large enough to matter. Each of
these is a switch option, which is off by default:
FREQUENCY releases the frequency domain series block set up by a
FREQUENCY instruction.
REGRESS releases the block of regression information: %XX, %BETA and
other things.
CMOMENT releases the array %CMOM and other information set up by CMOMENT.
Notes
You may find it necessary to use this instruction if you have severe constraints on
available memory, or if you are running a large program. However, it makes little
sense to release a series or array unless you are truly finished with it. If you set it or
dimension it later, you simply have borrowed the space temporarily.
RELEASE does not actually remove the arrays and series themselves from the table of
variables, so you can re-dimension them later without using another DECLARE.
If you use LOCAL arrays or series in a PROCEDURE, note that rats does not automati-
cally release the space allocated to them when it finishes executing the procedure. If
you want to free up the space, you will have to do an explicit RELEASE instruction.
Parameters
variables/expressions With ACTION=MODIFY, this list of variables and
expressions provides the information which is to be
inserted into the report.
Options
action=define/[modify]/format/show/sort
REPORT with the ACTION=DEFINE option initiates the report. ACTION=MODIFY
(which is the default, and so can be omitted) adds content to the report.
ACTION=FORMAT, in conjunction with PICTURE and WIDTH, allows you to adjust
the formatting. ACTION=SORT sorts the report on the values of one of the columns.
Finally, use REPORT with ACTION=SHOW to display the report.
use=name of report
USE allows you to define a new report object (when used with ACTION=DEFINE),
or work with an existing report object (when used with the other choices for the
ACTION option: MODIFY, FORMAT, SHOW, and SORT). If you omit the USE option,
rats uses the default internal report.
The ACTION option is used on every REPORT instruction. The remaining options are
described below, grouped by the ACTION choice with which you use them.
align=[left]/center/right/decimal
Sets the alignment of a string or label. RIGHT and DECIMAL are the only ones you
can use for numbers.
special=[none]/onestar/twostars/threestars/parens/brackets
You can use SPECIAL to enclose cells in parentheses (), brackets [] or tag them
with one star (*), two stars (**), or three stars (***).
span/[nospan]
With NOSPAN, the width of the column being used will be expanded as needed to
fit the supplied information.
Examples
Below (and continued on the next page) is an excerpt from example 10.3 from Ver-
beek (2008). This uses the REGRESS option to build a report presenting the coeffi-
cients and standard errors from four different regressions (three panel data regres-
sions and one ols estimation).
Define the report and provide column headers:
report(action=define,hlabels=||"Variable","Between",$
"Fixed Effects","OLS","Random Effects"||)
Perform the regressions, following each with REPORT(REGRESS) to add the
results to the report:
preg(method=between) wage
# constant school exper expersq union mar black hisp pub
report(regress)
preg(method=fixed) wage
# exper expersq union mar pub
report(regress)
linreg wage
# constant school exper expersq union mar black hisp pub
report(regress)
preg(method=random,vindiv=.1055,vrand=.1234) wage
# constant school exper expersq union mar black hisp pub
report(regress)
Format the cells to use 3 decimal places and display the table:
report(action=format,picture="*.###")
report(action=show)
The next set of code is from example 2.7 from Verbeek (2008):
cal(m) 1960:1
open data capm2.dat
data(format=prn,org=columns) 1960:1 2002:12 $
rfood rdur rcon rmrf rf jandum
Define a report, and provide a set of row labels for the first column:
report(action=define)
report(atrow=1,fillby=cols) $
"Company" "Excess Returns" "" "Uncentered R^2" "s"
Now do the first regression and add four numerical results to the report.
COL=NEW puts these in a new column. ATROW=1 starts the information in row
one. FILLBY adds items going down the column, rather than across in rows:
linreg rfood
# rmrf
report(col=new,atrow=1,fillby=cols,align=center) $
"Food" %beta(1) %stderrs(1) %trsq/%nobs sqrt(%seesq)
Repeat the process for two more regressions:
linreg rdur
# rmrf
report(col=new,atrow=1,fillby=cols) $
"Durables" %beta(1) %stderrs(1) %trsq/%nobs sqrt(%seesq)
linreg rcon
# rmrf
report(col=new,atrow=1,fillby=cols) $
"Construction" %beta(1) %stderrs(1) %trsq/%nobs sqrt(%seesq)
Generate the report:
report(action=show)
Wizard
Use Statistics—Regression Tests, and select “General Linear Restrictions”.
Parameters
restrictions The number of linear restrictions.
residuals (Optional) With the CREATE option only, this is a series for the
residuals from the restricted regression. Note that the standard
%RESIDS series is set to the residuals as well, so you will rarely
need this.
Supplementary Cards
Represent each restriction by a pair of supplementary cards. On the first, list the
numbers of the coefficients which enter the restriction. On the second supplementary
card in each pair, list the weights attached to the coefficients listed on the first card,
followed by the value which this linear combination of coefficients takes.
Note that you list the coefficients entering the restriction by coefficient numbers
rather than by variable names. rats puts the coefficients for a LINREG or similar
instruction in the regression in the order listed on the supplementary card, with a
block of lags ordered from the lowest lag (highest lead) to the highest lag.
Hypothesis Tests
RESTRICT, when used without the option CREATE, operates like the other hypothesis
testing instructions (EXCLUDE, TEST). It does not make any changes to the stored in-
formation about the last regression, so any additional hypothesis testing instructions
will apply to the original regression. rats ignores the residuals parameter on the
instruction line.
The main test statistic is usually shown as an F, but will be shown as a chi-squared
when RESTRICT is applied to estimates from GARCH, DDV and similar instructions
which do maximum likelihood estimation or from any instruction for which the
ROBUSTERRORS option was used during estimation.
For F tests with one degree of freedom, RESTRICT will report a two-tailed t test in
addition to the F test. For chi-squared tests with more than one degree of freedom,
RESTRICT will report an F with an infinite number of denominator degrees of free-
dom (that is, the chi-squared statistic divided by the numerator degrees of freedom)
in addition to the chi-square.
You can also control the distribution yourself using the FORM option.
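For a single restriction of the form r′b = q, the standard Wald statistic is (r′b − q)²/(r′Vr), where V is the estimated covariance matrix of the coefficients. A minimal Python sketch with made-up numbers (the exact formulas RATS uses are given in the User's Guide):

```python
# Test b2 - b3 = 0 on a 4-coefficient regression (hypothetical numbers)
b = [2.0, 1.4, 1.1, 0.3]            # coefficient estimates
V = [[0.50, 0.00, 0.00, 0.00],      # their covariance matrix
     [0.00, 0.04, 0.01, 0.00],
     [0.00, 0.01, 0.05, 0.00],
     [0.00, 0.00, 0.00, 0.02]]
r, q = [0.0, 1.0, -1.0, 0.0], 0.0   # restriction weights and value

rb  = sum(ri * bi for ri, bi in zip(r, b)) - q
rvr = sum(r[i] * V[i][j] * r[j] for i in range(4) for j in range(4))
chisq = rb * rb / rvr               # one degree of freedom
```

RATS reports either an F or chi-squared version of such a statistic, as described above.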
print/[noprint]
By default, RESTRICT produces only the standard errors and t-statistics. If you
use the PRINT option, rats prints the restricted coefficient vector in a table with
the label, lag and coefficient, but without standard errors and t-statistics. If you
use NOPRINT explicitly, rats suppresses all output from RESTRICT.
Restricted Regressions
There are two ways to estimate linear models subject to restrictions. One is to “code”
the restrictions into the explanatory variables. rats has an instruction ENCODE for
implementing that strategy. The other way, used by RESTRICT, is to estimate the
unrestricted model and then impose the restriction.
Use the option CREATE when you want to use RESTRICT to compute the restricted
regression. CREATE does the following:
• It computes the new coefficient vector and covariance matrix of coefficients
subject to the restrictions. See Section 3.3 of the User’s Guide for the formulas
used in the calculation.
• It computes new summary statistics. It recomputes all the LINREG variables
such as %RSS and %NDF.
• It replaces the old regression with the new restricted regression. Any further
hypothesis tests will apply to the restricted regression, not the original one.
It does all of this in addition to the regular task of computing the test statistic for the
restriction. You can save the residuals or coefficients from the restricted regression
using the residuals and coeffs parameters on the instruction line.
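The standard textbook update for imposing a single linear restriction r′b = q on unrestricted estimates b with coefficient covariance V is b* = b − Vr(r′Vr)⁻¹(r′b − q). A Python sketch with hypothetical numbers (RATS's exact computation is documented in Section 3.3 of the User's Guide):

```python
b = [2.0, 1.4, 1.1, 0.3]            # unrestricted estimates
V = [[0.50, 0.00, 0.00, 0.00],      # their covariance matrix
     [0.00, 0.04, 0.01, 0.00],
     [0.00, 0.01, 0.05, 0.00],
     [0.00, 0.00, 0.00, 0.02]]
r, q = [0.0, 1.0, -1.0, 0.0], 0.0   # impose b2 = b3

gap = sum(ri * bi for ri, bi in zip(r, b)) - q
rvr = sum(r[i] * V[i][j] * r[j] for i in range(4) for j in range(4))
Vr  = [sum(V[i][j] * r[j] for j in range(4)) for i in range(4)]
b_star = [b[i] - Vr[i] * gap / rvr for i in range(4)]
# b_star now satisfies the restriction exactly
```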
[print]/noprint
vcv/[novcv]
When you use these with CREATE, they perform the same task as they do for
LINREG: controlling the printing of the regression results and covariance-correla-
tion matrix of coefficients, respectively.
unravel/[nounravel]
UNRAVEL causes a substitution for ENCODEd variables. This is one of the two steps
in the other method of restricted regression.
define=equation to define
frml=FRML to define
Respectively, these define an equation and a formula from the results.
form=f/chisquared
Described earlier—it affects only the test statistic, not the restricted regression.
Examples
Suppose you have the following equation:
yt = b0 + b1 x1t + b2 x2t + b3 x3t + ut
which can be estimated using:
linreg y
# constant x1 x2 x3
We want to test the simple hypothesis that b1=b2. RESTRICT operates by testing
whether a linear combination of coefficients is equal to a specific value, so we need to
rewrite this hypothesis in this form: b1-b2=0.
We are testing the second and third coefficients here, so the RESTRICT would be:
restrict 1
# 2 3
# 1.0 -1.0 0.0
If you want to compute the restricted regression, do:
restrict(create) 1
# 2 3
# 1.0 -1.0 0.0
The code below estimates a four lag autoregression on LGNP and forces the lag coef-
ficients (numbers 2 through 5) to sum to one.
linreg lgnp
# constant lgnp{1 to 4}
restrict(create) 1
# 2 3 4 5
# 1 1 1 1 1
Variables Defined
%CDSTAT the computed test statistic (REAL).
%SIGNIF the marginal significance level (REAL).
%NDFTEST (numerator) degrees of freedom for the test (INTEGER)
If you use CREATE, all the variables defined by LINREG will be defined as well.
See Also . . .
MRESTRICT is similar to RESTRICT, but uses matrices rather than supplementary
cards to specify the restrictions. TEST is more specialized than RESTRICT—it tests
specific values for coefficients, and cannot test any linear combinations including
more than one coefficient. EXCLUDE is even more specialized—it tests exclusion re-
strictions only.
See page UG–62 of the User’s Guide for more on restricted regressions.
Example
procedure regdiags resids
type series resids
*
option arch integer 0
option qstat integer 0
*
local series ressqr
*
if arch<0
{
display "@REGDIAGS: ARCH option must be >=0"
return
}
if arch>0
...
uses RETURN to abort the procedure if the user specifies an improper value for an
option.
See Also . . .
UG, Section 15.2 Procedures.
END Signals the end of a procedure or loop.
HALT Terminates rats from within a procedure.
Parameters
RATS I/O unit The I/O unit to rewind.
Example
data(format=free,org=columns,missing=-88.888) * 1991:2 $
year month notmiss regnobs stderr taxrate $
r0m r1m r2m r3m r4m r5m r6m r7m r8m r9m r10m r11m r12m $
r13m r14m r15m r16m r17m r18m r21m r24m r30m r36m r48m $
r60m r72m r84m r96m r108m r120m r132m r144m r156m r168m $
r180m r192m r204m r216m r228m r240m r252m r264m r276m $
r288m r300m r312m r324m r336m r348m r360m r372m r384m $
r396m r408m r420m r480m
*
rewind data
data(format=free,org=columns,missing=-88.888) 1991:3 531 $
year month notmiss regnobs stderr taxrate $
r0m r1m r2m r3m r4m r5m r6m r7m r8m r9m r10m r11m r12m $
r13m r14m r15m r16m r17m r18m r21m r24m r30m r36m r48m $
r60m r72m r84m r96m r108m r120m r132m r144m r156m r168m $
r180m r192m r204m r216m r228m r240m r252m r264m r276m $
r288m r300m r312m r324m r336m r348m r360m r372m r384m $
r396m r408m r420m r480m
This was for duplicating a paper which had an error in data handling. There weren’t
531 data points on the data file and the program used (which was not rats) rewound
the data file to fill the request.
See Also . . .
OPEN Opens a rats I/O unit.
CLOSE Closes a rats I/O unit.
Wizard
You can use the Statistics—Recursive Least Squares operation to do recursive least
squares.
Parameters
depvar Dependent variable.
start end Range to use in estimation. If you have not set a SMPL, this
defaults to the largest common range for all the variables in-
volved.
residuals (Optional) Series for the recursive residuals.
Options
[print]/noprint
vcv/[novcv]
title="title for output" [“Recursive Least Squares”]
These control the printing of regression output and the printing of the estimated
covariance/correlation matrix of the coefficients (Introduction, page Int–77), and the
title used in labeling the output.
equation=equation to estimate
lastreg/[nolastreg]
Use the EQUATION option to estimate a previously defined equation. LASTREG
re-estimates the most recent regression using recursive least squares. If you use
either, don’t include a supplementary card.
Technical Information
If there are K regressors, RLS will first find the smallest set of entries in the sample,
added in the order indicated, which will give a full rank regression (unless you use
the CONDITION option, in which case RLS uses the number of entries you specify).
This will give a coefficient estimate b_t and (X′X)⁻¹ matrix Σ_t. The residual, SIGHIST
and SEHIST entries for these early entries will be zeros. Call the starting entry T0
and the point where we get to full rank T1. Given a previous set of entries, the result
of adding a new data point is

(1)  ê_t = (y_t − X_t b_{t−1}) / √(1 + X_t Σ_{t−1} X_t′)

(2)  b_t = b_{t−1} + Σ_{t−1} X_t′ (y_t − X_t b_{t−1}) / (1 + X_t Σ_{t−1} X_t′)

(3)  Σ_t = Σ_{t−1} − (Σ_{t−1} X_t′ X_t Σ_{t−1}) / (1 + X_t Σ_{t−1} X_t′)

where ê_t is the recursive residual at t. The estimated variance of the regression
through t is

(4)  s_t² = ( ê_{T1+1}² + … + ê_t² ) / (t − T1)

and the standard errors of the coefficient estimates are square roots of the diagonal
elements of s_t² Σ_t.
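The recursion in (1) through (3) can be sketched for the one-regressor case in Python (illustrative only, with made-up data). Two standard properties make it easy to check: the final coefficient matches full-sample OLS, and the squared recursive residuals sum to the OLS residual sum of squares:

```python
import math

x = [1.0, 2.0, 1.5, 3.0, 2.5]      # single regressor (no constant)
y = [1.1, 2.3, 1.4, 3.2, 2.4]

sigma = 1.0 / (x[0] * x[0])        # (X'X)^-1 after the first usable entry
b = y[0] / x[0]                    # exact-fit starting estimate
rresid = []
for t in range(1, len(x)):
    denom = 1.0 + x[t] * sigma * x[t]
    resid = y[t] - x[t] * b                          # prediction error
    rresid.append(resid / math.sqrt(denom))          # equation (1)
    b += sigma * x[t] * resid / denom                # equation (2)
    sigma -= (sigma * x[t]) * (x[t] * sigma) / denom # equation (3)

b_ols = sum(xi * yi for xi, yi in zip(x, y)) / sum(xi * xi for xi in x)
rss   = sum((yi - b_ols * xi) ** 2 for xi, yi in zip(x, y))
```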
Examples
The following excerpts are taken from the example on pages 121-126 of Johnston and
DiNardo (1997). The complete program is provided on the file JOHN4P121.RPF.
The COHIST and SEHIST options provide a VECTOR[SERIES] with the “histories”
of the coefficients and the standard errors of the coefficient estimates, while
SIGHIST gives the standard errors of the regression. The CSQUARED option
returns the sum of squared recursive residuals, which will also be the sequence of
sums of squared residuals from the regressions.
rls(sehist=sehist,cohist=cohist,sighist=sighist, $
dfhistory=dfhist,csquared=cusumsq) y 1959:1 1973:3 rresids
# constant x2 x3
Do the sequential F-test graph shown in figure 4.4. Since the cusumsq series has
the RSS’s, the F-tests themselves are fairly easy. The second stage takes the F-
tests and converts them into a ratio to the critical value, which changes with t,
since the denominator degrees of freedom changes. %invftest is used for that.
set seqf = (t-%nreg-%regstart())*(cusumsq-cusumsq{1})/cusumsq{1}
set seqfcval %regstart()+%nreg+1 * = $
seqf/%invftest(.05,1,dfhist(t))
graph(header=$
"Figure 4.4 Sequential F-Tests as Ratio to .05 Critical Value",$
vgrid=||1.0||)
# seqfcval
Wizard
Use the Statistics—Linear Regressions wizard and select “Robust Regression” as the
technique.
Parameters
depvar Dependent variable.
start end Range to use in estimation. If you have not set a SMPL, this
defaults to the largest common range for all the variables in-
volved.
residuals (Optional) Series for the residuals.
Options
[print]/noprint
vcv/[novcv]
title="title for output" [depends upon options]
These control the printing of regression output and the printing of the estimated
covariance/correlation matrix of the coefficients (Introduction, page Int–77), and the
title used in labeling the output.
method=[lad]/quantile
quantile=quantile to use for METHOD=QUANTILE [not used]
METHOD chooses between lad and quantile estimation methods. If using
METHOD=QUANTILE, you can use the QUANTILE option to specify the quantile to
use. See “Technical Information” for details.
iters=number of iterations
Allows the user to control the number of iterations used for the linear program-
ming algorithm. The default value depends on the number of parameters, but is
generally set to 100.
lastreg/[nolastreg]
equation=equation to estimate
LASTREG will re-estimate the most recent regression. Use the EQUATION option to
estimate a previously defined equation. If you use either, omit the supplementary
card.
Technical Information
LAD chooses β to minimize

(1)  Σ_t |y_t − X_t β|

while quantile regression chooses β to minimize

(2)  Σ_{t: y_t − X_t β > 0} α |y_t − X_t β|  +  Σ_{t: y_t − X_t β < 0} (1 − α) |y_t − X_t β|

where α is the quantile requested. lad is a special case of the quantile regression
with α=0.5—the only difference is that the function value would be half as large.
The asymptotic covariance matrix of the estimates involves the scale factor

(5)  α(1 − α) / f(x_α)²

where x_α is the α quantile of the residuals.
While the parameter estimates are robust to non-normality, the estimates of the co-
variance matrix are not robust against heteroscedasticity and similar problems. More
complex quantile regression techniques exist that can produce robust covariance
estimates in these circumstances—these are currently not supported in RREG.
RREG computes f using a Gaussian kernel (see DENSITY), and bandwidth

(6)  0.79 × IQR / N^(1/5)
where IQR is the interquartile range, and N is the number of observations. This
choice has certain general optimality properties (see discussion in Pagan and Ullah,
1999), but can be too narrow in some circumstances.
If you want to override this choice of bandwidth, you can use the BANDWIDTH option.
And if you want to choose your own scale factor, use the XXSCALE option.
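A Python sketch of the criterion in (2) and the bandwidth in (6), using made-up residuals (illustrative only; the quartile computation here is a stand-in, and RATS's internal IQR calculation may differ in detail):

```python
import statistics

def quantile_loss(u, alpha):
    """Check-function criterion from equation (2) for one residual."""
    return alpha * abs(u) if u > 0 else (1.0 - alpha) * abs(u)

resids = [0.4, -1.2, 0.1, 2.5, -0.3, 0.8, -0.7, 1.1]

lad_obj = sum(abs(u) for u in resids)                  # equation (1)
med_obj = sum(quantile_loss(u, 0.5) for u in resids)   # alpha = 0.5 case
# alpha = 0.5 gives exactly half the LAD criterion

# Bandwidth (6): 0.79 * IQR / N^(1/5)
q = statistics.quantiles(resids, n=4)   # [Q1, median, Q3]
iqr = q[2] - q[0]
h = 0.79 * iqr / len(resids) ** 0.2
```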
Example
This is part of an example from Greene (2012). It estimates a Cobb-Douglas produc-
tion function. There are two very large outliers, so the second LINREG estimates with
those omitted. As an alternative to dropping the outliers entirely, the RREG estimates
by lad to reduce their effect on the estimates.
linreg logy / resols
# constant logk logl
linreg(smpl=t<>4.and.t<>10) logy
# constant logk logl
rreg logy / reslad
# constant logk logl
rreg(smpl=t<>4.and.t<>10) logy / reslad
# constant logk logl
Variables Defined
%BETA Coefficient vector (VECTOR)
%XX Covariance matrix of coefficients, or (X′X)⁻¹ (SYMMETRIC)
%TSTATS Vector containing the t-stats for the coefficients (VECTOR)
%STDERRS Vector of coefficient standard errors (VECTOR)
%NOBS Number of observations (INTEGER)
%NREG Number of regressors (INTEGER)
%NFREE Number of free parameters (INTEGER)
%NDF Degrees of freedom (INTEGER)
%FUNCVAL minimized value of equation (1) or (2) (REAL)
%MEAN Mean of dependent variable (REAL)
%RESIDS Series containing the residuals (SERIES)
%DURBIN Durbin-Watson statistic (REAL)
%RHO First lag correlation coefficient (REAL)
%VARIANCE Variance of dependent variable (REAL)
%EBW Bandwidth used in estimate of the density (REAL)
Parameters
start end Range of entries to transfer. If you have not set a SMPL, this
defaults to the defined range of each real series, determined
separately for each series.
newstart Entry in the complex series for the start entry of the real se-
ries. By default, same as start. The default starting entry may
change from series to series if you use the default start and
end range.
Description
RTOC transfers data from the real series on the first supplementary card to the
corresponding complex series on the second card. It sets the imaginary parts of the
complex series to zero.
See Section 14.5 of the User’s Guide for a discussion of preparing data for frequency
domain analysis.
Options
[pad]/nopad
By default, RTOC sets to zero all the entries of the complex series which it doesn’t
explicitly transfer from the real series. Thus the “defined length” of the complex
series is just its total length. You can suppress this automatic padding with the
NOPAD option.
RTOC will pad both ends of the series if necessary. For instance, residuals from a
regression involving lags have beginning entries which aren’t defined. If you use
the defaults for the RTOC parameters, it will transfer the defined portion of the
real series and set the initial undefined observations to zero.
even/[noeven]
With EVEN, RTOC will take a one-sided sequence and turn it into a symmetric
two-sided sequence by reflecting the sequence around the midpoint of the series.
This can be useful if you have a set of autocovariances which have only been com-
puted in one direction in the real domain.
[labels]/nolabels
RTOC transfers the labels of the real series to the complex series unless you use
NOLABELS.
Missing Values
RTOC sets to zero any entries of the complex series which correspond to missing val-
ues in the real series.
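The transfer rules above amount to: copy the defined real entries into the real parts, zero the imaginary parts, and zero-fill everything else (padding and missing values alike). A Python sketch with hypothetical data:

```python
NA = None                               # stand-in for a missing value
real_series = [NA, 1.5, -0.7, 2.2, NA]  # middle entries defined

length = 8                              # length of the complex series
cseries = [0j] * length                 # padding: everything starts at 0
for t, v in enumerate(real_series):
    if v is not None:                   # missing values stay zero
        cseries[t] = complex(v, 0.0)    # imaginary part set to zero
```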
Example
This transfers series RESIDS1 to complex series 1 and RESIDS2 to complex series 2
and pads both complex series to 256 entries.
frequency 5 256
rtoc
# resids1 resids2
# 1 2
Using Matrices
It is possible to read and write complex series with the matrix input instruction
READ. You must use OVERLAY to place a complex vector over the set of entries to be
read or written.
For example, the following overlays complex series 2 with CSERIES (a VECTOR of
COMPLEX numbers) and reads data for it using READ:
freq 5 128
open data freqs.dat
declare vector[complex] cseries
overlay %z(1,2) with cseries(128)
read cseries
Variables Defined
%NOBS Number of observations transferred (taken from the last series
on the list) (INTEGER)
Parameters
series The series providing the values.
start end The sampled range of series. If you have not set a SMPL, this
defaults to the defined length of series.
newseries Resulting series, which will contain the extracted values. By
default, newseries=series.
newstart Starting entry for newseries. newstart=start by default.
Options
The two options are mutually exclusive. If you don’t specify an option, SAMPLE uses
INTERVAL=1.
interval=sampling interval[1]
Use this option for a regular sampling interval. SAMPLE will copy the entries:
start, start+interval, start+2*interval, start+3*interval, ...
into consecutive entries of newseries, beginning at entry newstart.
You can use this to change the frequency of a data series after it has been read
into rats. It is, however, rather clumsy compared to the options available on the
DATA instruction, which has many choices for method of compaction and does the
translation automatically. See page Int–104 in the Introduction.
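The copying rule is equivalent to a strided slice. In Python terms (illustrative; entry numbers are 1-based in the RATS description but 0-based below):

```python
series = list(range(1, 13))    # a 12-entry series: 1, 2, ..., 12
interval, start = 4, 2         # every 4th entry, starting at entry 2

newseries = series[start - 1::interval]
# picks entries 2, 6, 10, like breaking out quarter 2 of each year
```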
Examples
set filter_missing = %valid(sp500)
sample(smpl=filter_missing) sp500 / c_sp500
cal(irregular)
C_SP500 has the same set of values as SP500, except that the missing values (for
non-trading days) have been removed. Note that, while SP500 is regular daily data,
C_SP500 is not, so the CALENDAR is changed to IRREGULAR.
calendar(q) 1980:1
all 2010:4
open data quarters.rat
data(format=rats) / qseries
sample(interval=4) qseries 1980:1 * q1 1
sample(interval=4) qseries 1980:2 * q2 1
sample(interval=4) qseries 1980:3 * q3 1
sample(interval=4) qseries 1980:4 * q4 1
calendar(a) 1980:1
reads a quarterly series, breaks out separate series (q1, q2, q3 and q4) for each
quarter, and then resets the calendar to annual. Note that it’s very important for the
start and newstart parameters to be correct—it’s the start parameter that
determines which period within the year you get.
Notes
SAMPLE is not your best choice for drawing random subsamples. The instruction
BOOT combined with SET is the correct way to do that. For instance, to draw a ran-
dom sample of size 20 from a data series (X) with 100 observations:
boot entries 1 20 1 100
set draw 1 20 = x(entries)
Variables Defined
%NOBS Number of observations in the new series (INTEGER)
Example
dedit(new) prices.rat
store price90 price80 price70
save
This creates a new rats format file called PRICES.RAT, adds three series to it, then
saves the changes to the file.
See Also . . .
Intro, Section 2.7 rats format data files.
DEDIT Opens a rats format file for editing.
QUIT Aborts data editing without making any changes.
Wizard
Use Data/Graphics—Scatter (X-Y) Graph. Note that in order to keep the Wizard
from getting too complicated, some of the SCATTER options have been omitted in the
wizard. To further customize your graph, you can edit the SCATTER instruction gen-
erated by the wizard.
Positioning
If you’re using SPGRAPH to put multiple graphs on a single page, by default, the fields
are filled by column, starting at the top left (field 1,1). If you want to fill a particular
field instead, use either the combination of ROW and COL options or hfield (for the
column) and vfield (for the row) parameters.
Parameters
pairs Number of pairs of series to plot against each other.
hfield vfield See “Positioning” above.
Options—Quick Reference
The following is a list of all of the options for SCATTER. Many of these are identical
to options on GRAPH, some are unique to SCATTER, and some operate differently with
SCATTER than with GRAPH. See the section on GRAPH for details on the options com-
mon to both instructions. We describe the options specific to SCATTER in more detail
below, with options grouped by function.
SCATTER Options Function
axis=none/vertical/horizontal/[both] Draw x=0 and/or y=0 axes
extend=[none]/vertical/horizontal/both Extend grid lines across graph
hgrid=VECTOR of grid values Sets position of grid lines
hlabel=horizontal scale label Adds a label to the x-axis.
hlog=base for a log scale Selects a log scale for x-axis
hmax=value for right boundary Sets the maximum x-axis value
hmin=value for left boundary Sets the minimum x-axis value
hpicture=picture code for x-axis Format of x-axis scale values
hscale=[lower]/upper/both/none Placement of x-axis scale
hshade=RECTANGULAR with shading zones Shading zones for x-axis
hticks=max number of horizontal ticks Number of tick marks on x-axis
lines=RECTANGULAR with intercept/slope Draw lines given slope/intercept
omax=max value for overlay scale Sets max value of overlay scale
omin=min value for overlay scale Sets min value of overlay scale
ovcount=number of series for overlay # of series using overlay scale
overlay=dots/symbols/line/bar/poly/ Style used for overlay series
filled/spike/step
ovkey/[nooverkey] Adds a key for overlay series
ovlabel=label for the overlay scale Label for the overlay scale
ovsamescale/[noovsamescale] Same scale for regular & overlay
style=dots/[symbols]/line/bar/poly/ Style of graph
filled/spike/step
vgrid=vector of grid line values Sets grid for vertical axis
vlabel=vertical scale label Label for vertical axis
vlog=base for a log scale for y-axis Selects a log scale for y-axis
vmax=value for upper boundary Sets maximum y-axis value
vmin=value for lower boundary Sets minimum y-axis value
vpicture=picture code for y-axis Formatting of y-axis scale values
vscale=[left]/right/both/none Placement of vertical scale
vshade=RECTANGULAR with shading zones Shading zones for y-axis
vticks=max number of vertical ticks Number of tick marks on y-axis
xlabels=VECT[STRING] for x-axis labels Strings for labeling x-axis
Options Common to GRAPH and SCATTER Function
[box]/nobox Superseded by FRAME option
col=column number Column in the SPGRAPH matrix
footer=footer label Adds a footer label below graph
frame=[full]/half/none/bottom Controls frame around the graph
Labelling Options
hlabel=horizontal scale label ("..." or STRING) [none]
vlabel=vertical scale label ("..." or STRING) [none]
These provide labels for the horizontal and vertical scales. The placement of the
labels depends upon your choices for HSCALE and VSCALE. The HLABEL will be
centered at the bottom just below the horizontal tick marks for HSCALE=LOWER
or BOTH, or centered at the top above the tick marks if you use HSCALE=UPPER.
The VLABEL is centered at the left for VSCALE=LEFT or NONE, and centered at the
right for VSCALE=RIGHT. It appears on both sides with VSCALE=BOTH.
hscale=[lower]/upper/both/none
vscale=[left]/right/both/none
These control the placement of the horizontal and vertical scales on the graph.
The horizontal scale indicates the values of the x-series. You can place it on
the bottom of the graph (the default), on the top of the graph, on both the top and
bottom, or you can omit it entirely. The vertical scale indicates the values of the
y-series. You can place it on the left of the graph (the default), the right, both
left and right or omit it.
hlog=base for a log scale for the horizontal axis [not used]
vlog=base for a log scale for the vertical axis [not used]
Use one or both of these to graph data on a semi-log or log-log scale. The base
you choose really affects only the levels that get labeled, which will always be
powers of the base. Bases of 10, 2, 4 and 5 usually work best.
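For instance, the following sketch (the series names here are hypothetical) puts the y values on a base-10 log scale while leaving the x-axis linear:

scatter(vlog=10,style=symbols) 1
# xdata ydata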
extend=[none]/both/vertical/horizontal
Normally, rats marks the vertical and horizontal axes with small tick marks
outside the graph. You can use the EXTEND option to have rats draw dotted grid
lines from each tick mark all the way across or down the graph. HORIZONTAL
draws horizontal grid lines, VERTICAL draws vertical grid lines, and BOTH draws
both.
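As a quick sketch (series names hypothetical), this draws dotted grid lines in both directions across a symbol plot:

scatter(extend=both,style=symbols) 1
# xdata ydata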
right-side (overlay) scale and style. The other series are graphed using the left-
side scale and style.
[ovkey]/noovkey
You can use NOOVKEY to eliminate the key for the overlay series, if the meaning
is either obvious, or provided using the labels.
ovsamescale/[noovsamescale]
You can use OVSAMESCALE to force both the regular and the overlay series to
share a common scale.
Missing Values
rats leaves out any entry for which either of the two series in the pair is missing.
Examples
This graphs inflation vs unemployment with one set of symbols (squares) for the
period 1954 to 1968 and another (diamonds) for 1969 to 1983.
scatter(style=symbol,header="Inflation vs. Unemployment", $
patterns,hlabel="Unemployment",vlabel="Inflation") 2
# unemp inflation 1954:1 1968:1 1
# unemp inflation 1969:1 1983:1 2
This computes a histogram for the series AGES, presented as a bar graph. The title on
the graph is “Histogram of Ages” and it goes into a window labeled “Histogram”.
density(type=histogram) ages / grid density
scatter(style=bar,window="Histogram",header="Histogram of Ages")
# grid density
This draws a set of data from a gamma distribution, estimates the sample density,
then graphs the estimate with the true density. The actual density is done using the
(filled) polygon style, while the estimated density overlays that with a line. The colors
are adjusted so the line comes in as solid black. OVSAME is used to force both to use a
single scale. The graph is shown below.
all 1000
set test = %rangamma(4.0)
density(bandwidth=1.00) test / x fx
set actual = exp(log(x)*3.0-x-%lngamma(4.0))
scatter(style=polygon,overlay=line,ovsame) 2
# x actual / 4
# x fx / 1
[Graph: the actual density drawn as a filled polygon, overlaid by the estimated density as a line, over the range 0.0 to 15.0]
This rather complex example is taken from the cumulated periodogram procedure (on
the file CUMPGDM.SRC).
ctor 1 half
# 1 2
# actual white_noise
scatter(style=line,key=upleft, $
header="Cumulated Periodogram Test", $
xlabels=||"0","p/4","p/2","3p/4","p"||) 2
# freqs white_noise 1 half
# freqs actual 1 half
display(store=gaplabel) "gap = " #.#### %maximum
if actual(%maxent)<white_noise(%maxent)
grtext(align=left,x=%maxent-1,y=actual(%maxent)-.01) gaplabel
else
grtext(align=right,x=%maxent-1,y=actual(%maxent)+.01) gaplabel
spgraph(done)
[Graph: Cumulated Periodogram Test, x-axis labeled from 0 to π, with the annotation “gap = 0.3838”]
Wizard
The Data/Graphics—Trend/Seasonals/Dummies wizard can create seasonal dum-
mies.
Parameters
series Series to set as a seasonal dummy.
start end Range of entries to set. If you have not set a SMPL, this defaults
to the standard workspace plus SEASONAL-1. See the explanation below.
Options
span=the seasonal span [CALENDAR seasonal]
period=first period to receive value 1 [last period in year]
SPAN sets the seasonal span, or periodicity, in terms of the number of periods per
year (12 for monthly, 4 for quarterly). This defaults to the seasonal defined by the
CALENDAR instruction.
PERIOD indicates the first period (between start and end) which is to get the
value 1. In other words, this determines the month or quarter represented by the
dummy. This defaults to the entry start+SEASONAL-1, that is, the dummy is
set up for the last period within the year. See the examples below for details.
centered/[nocentered]
If you use CENTERED, SEASONAL produces a centered seasonal dummy rather
than a standard 0–1 dummy. From a standard dummy, subtract 1.0/seasonal
from each entry. For instance, a 4th quarter dummy will use the sequence
-.25, -.25, - .25, .75. Centered dummies have certain advantages when you use
seasonal–1 of them together with a CONSTANT. You need to be careful, however,
as a full set of seasonal centered dummies is linearly dependent.
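As a sketch, this sets up a centered fourth-quarter dummy, assuming a quarterly CALENDAR is in effect (the series name is hypothetical):

seasonal(centered) q4cent

Each year of Q4CENT will then contain the sequence -.25, -.25, -.25, .75.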
Examples
seasonal(period=1948:12) december 1948:1 2018:12
seasonal(period=1948:6) june 1948:1 2018:12
sets up the series DECEMBER as a December seasonal (one in every 12th entry, begin-
ning with 1948:12) and the series JUNE as a June seasonal. Make sure you define the
dummy over the full range that you will need, using start and end as needed .
calendar(q) 1947:1
allocate 2018:4
seasonal seasons
defines a 4th quarter dummy over a period which runs 3 entries beyond 2018:4. It is
equivalent to:
seasonal(span=4,period=1947:4) seasons 1947:1 2019:3
Sometimes, however, you may need a separate series for each period. For quarterly
data, it is probably simplest to just do three or four separate SEASONAL instructions.
With more periods per year, however, you will probably want to automate the pro-
cess. Use a vector of series, and define the dummies in a loop:
dec vect[series] seasonals(12)
do i=1,12
seasonal(period=i) seasonals(i) 1947:1 2010:12
end do i
creates a vector of series which have the twelve dummies as its twelve elements. You
can then use SEASONALS in a regressor list to add the full set.
seed value
Parameters
value The desired seed. If value is 0 or omitted, rats computes a
new seed using the current date and time. You can select any
other value as your seed. While it is common to choose “random-
looking” numbers like the 13939 below, they have no particular
advantage over seeds like 1 or 777.
Description
The seed is an integer which initializes the rats random number generator. This
generator is used by the instructions SIMULATE and BOOT and by the functions such
as %RAN, %UNIFORM, %RANGAMMA and %RANWISHART. The seed uniquely determines
the sequence of (pseudo-) random numbers. The algorithms used are provided on the
next page.
rats normally sets the seed using the time and date at which the program begins
execution. This default seed will be different each time you run the program.
Instead, you can use SEED to rerun a program with the same set of generated num-
bers. This is very helpful when you are testing a program which draws random num-
bers, as it is easier to find and fix coding errors when your “data” doesn’t change each
time you run the program.
Put SEED right at the beginning of your code so it will be easy to remove once the
program is running correctly.
Notes
SEED only permits you to reproduce random numbers when you use exactly the same
instructions in the same sequence. Consider, for instance,
declare vector b1(5) b2(5) b(10)
seed 13939
compute b=%ran(1.0)
seed 13939
compute b1=%ran(1.0)
compute b2=%ran(1.0)
The entries of B1 match the first five entries of B. However, the entries of B2 are dif-
ferent from the last five entries of B. Technically, rats uses an acceptance-rejection
algorithm for generating Normal values from uniform. This requires an extra draw
from the uniform at the beginning of each set of numbers.
Algorithms
rats uses the period 2^191 pseudo-random number generator from L’Ecuyer (1999).
This generates the same set of numbers from a given seed on any platform supported
by rats that is version 6.10 or later. (The random number generator was changed
with version 6.10). Real numbers in the range [0,1] are obtained from this by dividing
the integer by its period.
Normal and gamma deviates are generated from the uniforms by acceptance-rejec-
tion algorithms.
See Also . . .
SIMULATE Forecasts a model with random shocks.
BOOT Draws random entry numbers.
%RAN(x) Function returning draws from Normal(0,x2) distribution.
%UNIFORM(x1,x2) Function returning draws from a Uniform(x1,x2)
%RANINTEGER(L,U) Function returns integer in [L,U]
Parameters
number If you use just this one parameter, the user can only select one
item from the list, and number will be an integer indicating
which it was. The meaning of the return value depends upon
the list type–see the options below.
If you use both parameters, the user can select any number of
items and number returns the number of items selected.
selectlist To allow the user to select more than one item, supply a new
variable name for the selectlist parameter. rats will create
this variable as a VECTOR[INTEGERS] containing the return
values of the selected items.
Options
series
strings=VECTOR of STRINGS to list
list=VECTOR of INTEGERS to list
regressors
These four (mutually exclusive) options determine the type of list displayed:
• Use SERIES to request a selection from the list of series currently in memory.
When you use SERIES, the return values are the series numbers. You can use
a selectlist array directly in a regression list and other locations where
a rats instruction requests a “list of series.” The special series CONSTANT is
never displayed as one of the choices in the list.
• STRINGS requests a selection from a list of labels. You must declare and set
the VECTOR[STRINGS] before executing SELECT. The return values are the
selected positions (element numbers) in the VECTOR[STRINGS].
• Use LIST to request a selection from an arbitrary list of integers. You must
declare and set the VECTOR[INTEGERS] before executing SELECT. SELECT
returns the actual integer values selected (not the positions in the array).
• REGRESSORS requests a selection of regressors from the last completed re-
gression. The return values are the selected positions in the list.
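For instance, a sketch of the STRINGS form (the labels here are hypothetical):

declare vector[strings] choices(3)
compute choices=||"Levels","Logs","Differences"||
select(strings=choices,prompt="Pick a transformation") which

WHICH returns the position (1, 2 or 3) of the selected label in CHOICES.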
prompt="prompt string"
Use PROMPT to display a message at the top of the dialog box.
LIMIT is most useful with the SERIES option, since you may want to restrict the
user to choosing only from the original data series, and not derived series like
residuals. If you know that there were four original series, you could use LIMIT=4
to ask the user to select one of the first four series.
Example
select(series,prompt="Select Explanatory Variables",status=ok) $
nselect reglist
if ok {
linreg y
# constant reglist
}
end if
This requests the user to select one or more series. The selected variables are then
used in a regression.
Wizard
You can use Data/Graphics—Transformations for general transformations. Select the
“General-Input Formula” action.
Parameters
series The series to create or set with this instruction.
start end (Optional) The range of entries to set. If you have not used
a SMPL instruction, this defaults to the standard workspace,
regardless of the series involved. However, in effect, rats cuts
this back to the maximum range available given the trans-
formation by setting to “missing” all entries which it cannot
compute.
function(T) This is the function of the entry number T which gives the value
of entry T of series. There should be at least one blank on
each side of the =. The function can actually include multiple
expressions, separated by commas. The series will be set to the
values returned by the final expression in the list.
Options
first=real expression/FRML for first entry set [not used]
Use FIRST when the first entry of series requires a different formula than the
remaining entries; for instance, a benchmark value. The option will accept a
FRML (formula) so it can adapt to the value of start.
scratch/[noscratch]
Use SCRATCH when redefining an existing series (that is, when series appears
in the function) and the transformation uses data from more than just the cur-
rent time period T of series. You don’t need it if only one of these is true.
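As an illustration (a sketch only): differencing a series in place uses both the current and lagged entries of X, so SCRATCH is needed:

set(scratch) x = x-x{1}

Without SCRATCH, entry T would use the already-transformed value of X{1} rather than the original data.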
[panel]/nopanel
When working with panel data, NOPANEL disables the special panel data treat-
ment of expressions which cross individual boundaries.
Description
This sets the values of entries start to end of series by evaluating the function
at each entry T, substituting for T the number of the entry being set. You can omit
the T subscript in most cases—see the note below.
The Variable T
When you do a SET instruction, the variable T is always equal to the number of the
entry being computed. You can use this to create trend variables and time period
dummies:
set trendsq = t^2 Creates “time” squared.
set postwar = t>=1946:1 Creates a postwar dummy for 1946 on.
Missing Values
SET propagates missing values through the formula. With only two exceptions (the
%VALID(x) function and the %IF(x,y,z) function), any operation which involves a
missing value returns a missing value.
SET also sets to missing any observation which involves an illegal operation, such as
divide by zero and square root of a negative number.
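As a sketch using the two exceptions (the series names are hypothetical), this replaces missing entries of X with zeros:

set xfilled = %if(%valid(x),x,0.0)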
Examples
set gnp = log(gnp)
replaces the series GNP by its log.
set xswitch = %if(x>0,xpos,xneg)
makes XSWITCH equal to either the current entry of XPOS or of XNEG, depending upon
whether X is positive or not.
set(first=0.0) rw = rw{1}+%ran(1.0)
makes RW a “random walk” with an initial value of 0 at entry 1, adding N(0,1) incre-
ments. Without the FIRST option, this would create a series of missing values since
RW{1} isn’t defined when T=1.
set u = %ran(1.0)
set(first=0.0) x = u+.5*u{1}
generates U as a series of random numbers, then makes X as a moving average of
U. Again, without the FIRST option, the series would be all missing values. (Note,
by the way, that you should discard the first few entries of a series generated this
way since 0.0 is a convenient starting point but isn’t representative of the stationary
MA(1) process).
See Sections 1.4.3 and 1.5.9 of the Introduction for many more examples using SET.
The added / (for default range) is harmless but unnecessary—if there aren’t start
and end parameters, SET uses the default range.
show memory
show series
show files file identifier
Description
show memory
This displays the amount of memory available to rats for workspace, the num-
ber of bytes used and the number of bytes remaining.
show series
Lists the names of all the series currently in use.
show *.rat
will list all the files in the current directory with .RAT extensions. If you omit
file identifier, SHOW will list all files in the current directory.
Parameters
equations Number of equations in the system. Omit this parameter when
using the MODEL option.
Options
model=model name
Of the two ways to input the form of the model to be solved (the other is with
supplementary cards), this is the more convenient and is the only way to forecast
with a set of FRMLs. MODELs are usually created by GROUP or SYSTEM. If the model
includes any identities, those should be last in the model. If you use this, omit the
“equation” supplementary cards.
As an alternative, you can use the FACTOR option (the older DECOMP is also
acceptable as a synonym) to provide a factorization of the covariance matrix of
residuals—see orthogonalization. Note that the distribution of the simulations
(though not the actual simulated values) doesn’t depend upon which factorization
of S is chosen.
print/[noprint]
Use PRINT if you want rats to print the forecasts; this is not done automatically.
Supplementary Cards
There is one supplementary card for each equation. Include these only if you are not
inputting a MODEL. Identities should be listed last.
equation The equation.
forecasts (Optional) The series for the forecasts of the dependent variable
of equation. forecasts can be the same as the dependent
variable.
newstart (Optional) The starting entry for forecasts. By default, the
same as FROM.
column If you use the FACTOR option, this is the column in that matrix
which corresponds to this equation. By default, rats assumes
that the order in which you list the equations matches the
order of the columns in the FACTOR matrix.
Variables Defined
%FSTART Starting entry of forecasts (INTEGER)
%FEND Ending entry of forecasts (INTEGER)
Description
SIMULATE forecasts a system of equations using a random number generator to add
shocks to those equations which are not identities. Note that identities must be last
in the list of supplementary cards, or at the end of the model.
The shocks at each period are generated by a draw from a Normal(0,S) distribution.
You put the variance/covariance matrix S into SIMULATE in one of the following
ways:
1. If you use MODEL, it will already be part of that if you estimated the model with
ESTIMATE or SUR, or provided a covariance matrix with the CV option on GROUP.
2. You can supply it using the CV option. If you use the column fields on the supple-
mentary cards, you will get a rearrangement of this.
3. You can supply a factorization of S using the FACTOR option.
4. In all other cases, S is taken to be a diagonal matrix with diagonal equal to the
variances of the individual equations or formulas.
Comments
The standard way to use SIMULATE is to loop over the number of draws you want to
make, compiling statistics on characteristics of the simulated series. Chapter 16 of the
User’s Guide describes this in greater detail.
The instruction SEED can be helpful in checking programs which use SIMULATE be-
cause it allows you to control the set of random numbers that rats generates.
Note that SIMULATE isn’t needed to create a series of random numbers or simple
transformations of them. A SET instruction using functions like %RAN and %UNIFORM
is easier to set up.
If you need the shocks to have some distribution other than Normal, use FORECAST
with the PATHS option, using SET instructions to create the series of shocks. (That is
how bootstrapping is done).
To simulate an equation that includes moving average (ma) terms, the equation must
have a series of residuals associated with it. This happens automatically for equa-
tions estimated using instructions like BOXJENK, but for equations defined using
EQUATION, you must assign the residuals manually, using either ASSOCIATE or the
%eqnsetresids() function. For example:
set armaresids = 0.0
equation(ar=lags,ma=1,noconstant,coeffs=coeffsvec,$
variance=1) yeq y
compute %eqnsetresids(yeq,armaresids)
Examples
This computes and copies to a file a 100 element realization of the AR(2) process
y(t) = 6.0 + 1.4 y(t-1) - 0.8 y(t-2) + u(t), with u(t) ~ N(0,4).
Since the simulation starts out “cold” with zeros for the lagged values, the program
generates an additional 48 points at the beginning for burn-in. Simulations start at
period 3 because of the two lags:
allocate 150
set y = 0.0
equation(variance=4.0,coeffs=||6.0,1.4,-.8||) simeq y 2 0
simulate(from=3,steps=148) 1
# simeq y Simulations into Y beginning at 3
open copy simul.dat
copy 51 150 y
This next example does 100 simulations of a six-variable var over the period 2011:1
to 2016:4 and fills a pair of vectors with the minimum and maximum achieved by the
exchange rate over that period. (The simulated values for the exchange rate are in
SIMULS(2), since it is the second equation in the model.)
system(model=canmodel)
variables usargdps canusxsr cancd90d canm1s canrgdps cancpinf
lags 1 to nlags
det constant
end(system)
estimate(noprint)
declare vector minxrate(100) maxxrate(100)
do draw=1,100
simulate(model=canmodel,results=simuls, $
steps=24,start=2011:1)
ext(noprint) simuls(2)
compute minxrate(draw)=%minimum,maxxrate(draw)=%maximum
end do draw
Parameters
start The starting entry number or date of the range to be used in
subsequent instructions.
end The ending entry or date of the range.
You can use an asterisk (*) for either parameter if you want to fix only one end of
the range. rats will determine the *’ed end separately for each instruction. See the
examples.
SMPL with no parameters tells rats to return to using the maximum possible range.
Options
series=SMPL series or formula (Introduction, Section 1.6.2)
This sets the sample based on the values of a data series or a logical expression
that can be evaluated across entry numbers. Any entries for which the supplied
series or formula are zero, missing (NA), or “false” will be excluded from the
sample. This offers an alternative to the SMPL options available on many instruc-
tions, and allows you to do transformations that would be very difficult otherwise.
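For instance, this sketch (the series name is hypothetical) limits subsequent instructions to entries with positive GDP growth:

smpl(series=gdpgrowth>0.0)

A later SMPL with no parameters restores the maximum possible range.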
reglist/[noreglist]
Sets the sample range based on a set of regressors, listed in regression format.
Description
Following a SMPL instruction, rats will use the SMPL range on any subsequent in-
structions for which you do not give an explicit entry range. You can always override
the current SMPL by providing the entry range on an instruction.
You can set and clear the SMPL any number of times.
Also, see the description of the SMPL option in Section 1.6.2 of the Introduction.
Examples
Following are two functionally identical sets of instructions. The first uses the range
parameters on each instruction, while the second uses both default ranges and a
SMPL. In both cases, the DATA instruction reads from 1955:1 to 2009:12, then the
other computations are done from 1960:1 to 2009:12.
cal(m) 1955:1
data 1955:1 2009:12 inter infla
linreg inter 1960:1 2009:12
# constant infla
set dif 1960:1 2009:12 = inter-infla
set lgs 1960:1 2009:12 = log(dif)
cal(m) 1955:1
allocate 2009:12
data / inter infla The / tells rats to use default range.
smpl 1960:1 2009:12 Sets a new default range
linreg inter Range omitted, so uses range set by SMPL
# constant infla
set dif = inter-infla Again, range omitted, so SMPL range used.
set lgs = log(dif)
In the next example, both regressions begin in 1950:1. The LINREG runs through the
last entry for which both CONS and GNP are defined. The AR1 runs through the last
entry for which INVEST, YDIFF, GNP{1} and RATE{4} are defined.
smpl 1950:1 *
linreg(inst) cons
# constant gnp cons{1}
ar1(inst,frml=investeq) invest
# constant ydiff gnp{1} rate{4}
The last example runs a series of NLLS instructions with increasing numbers of lags
in the instrument list. By using SMPL(REGLIST) with the maximum number of lags
we plan to use, we ensure that the regressions will all be run over a standard range.
smpl(reglist)
# constant consgrow{0 to 6} realret{0 to 6}
dofor nlag = 1 2 4 6
instruments constant consgrow{1 to nlag} realret{1 to nlag}
nlls(inst,noprint,optimalweights,frml=h)
end dofor
Options
echo/[noecho]
Normally, rats won’t show the lines in the SOURCE file as it reads them. Use
ECHO if you want to see them. Note that the default setting is now NOECHO.
key="key for decrypting file"
Contact Estima if you want to obtain software to encrypt source files.
status=INTEGER variable returning status [unused]
If you use the STATUS option, rats will set the variable you supply to 1 if the file
was successfully opened, or 0 if it was not.
Notes
SOURCE has two primary uses:
• You can save on separate files PROCEDURES and FUNCTIONS, or groups of
rats instructions you use often, and SOURCE them into different programs.
• You can save your initial instructions: CALENDAR, ALLOCATE, DATA, and trans-
formations, and start programs by SOURCEing these, rather than repeating
them in each program.
If you use SOURCE within a compiled section, you should note that rats does not ex-
ecute the SOURCE instruction during the compilation phase; it is not like an “include”
allowed in many programming languages. During the execution phase, rats tempo-
rarily stops executing the compiled code to execute the instructions on the SOURCE
file, then switches back. Because the file is not processed during the compilation
phase, it cannot include references to procedure parameters or local variables.
Example
source gammaparms.src
compute gguess=%GammaParms(%mean,sqrt(%variance))
This example brings in the FUNCTION %GammaParms from the file GAMMAPARMS.SRC
and then applies it to determine the parameters for a gamma distribution to hit a
particular mean-standard deviation pair.
Wizard
The Time Series—VAR (Setup/Estimate) wizard provides an easy, dialog-driven
interface for defining and estimating var models.
Parameter
other’s weight This is the relative weight on other variables in a SYMMETRIC
prior; it is irrelevant for other prior types. If you omit it, its
default value is 1.0.
Description
We use the following notation throughout this section:
• variable j refers to the jth variable listed on the VARIABLES instruction.
• equation i refers to the equation whose dependent variable is variable i.
• the standard deviation of the prior distribution for lag l of variable j in equation
i: denoted S(i,j,l).
S(i,j,l) = {γ × g(l) × f(i,j)} × (si / sj), with f(i,i) = 1.0 and g(1) = 1.0
Options
type=[symmetric]/general
matrix=RECTANGULAR array of weights for TYPE=GENERAL
TYPE selects the f(i,j) function. SYMMETRIC provides a restricted form of the func-
tion, while GENERAL allows complete generality. See the discussions below.
lagtype=[harmonic]/geometric
decay=lag decay parameter [no decay with lag]
These two options control the function g(l): how the standard deviation changes
with increasing lags. With d=lag decay parameter, rats uses the formulas:
g(l) = l^(-d) for LAGTYPE=HARMONIC and
g(l) = d^(l-1) for LAGTYPE=GEOMETRIC
For TYPE=SYMMETRIC, the f(i,j) function is
f(i,j) = 1.0 if i = j, w otherwise
where w is the other’s weight parameter.
system(model=sixvar)
variables ip m1 cpr unemp wage cpi
lags 1 to 12
det constant
specify(type=symmetric,tightness=.10,decay=1.0) .5
end(system)
For example:
system(model=sixvar)
variables ip m1 cpr unemp wage cpi
lags 1 to 12
det constant
specify(type=general)
# 1.0 0.5 0.5 0.2 0.2 0.2 $
0.2 1.0 1.0 0.2 0.2 0.5 $
0.2 1.0 1.0 0.2 0.2 0.2 $
0.2 0.5 0.5 1.0 0.2 0.2 $
0.2 0.5 0.2 0.2 1.0 0.5 $
0.0 0.5 0.2 0.2 0.5 1.0
end(system)
Dummy Observations
Suppose you impose the prior βk ~ N(b, λ²) upon a coefficient βk. You can represent
this as a “dummy observation” with
Ψ = σ/λ and r = (σ/λ) b
where σ is the standard deviation of the equation being estimated.
The first NL+D elements of each column of the FULL array provide the Ψ values for
the dummy observations for the coefficients. The last element provides the r for the
first own lag. This allows you:
• complete freedom in setting standard deviations on the lags
• the ability to put mean zero priors on any of the deterministic variables.
spgraph( options )
One or more graphics instructions
spgraph(done) To signal end of special graphs.
Description
SPGRAPH tells rats the overall structure of the graphics content. As shown above,
you follow this with the instructions which create the graphs themselves, then
SPGRAPH(DONE) when you are finished. You can also “nest” SPGRAPH blocks inside
other SPGRAPH blocks to create more complex graph presentations.
You must use SPGRAPH(DONE) at the end even if you only need a single graphing
instruction.
Positioning
If the SPGRAPH is inside of another SPGRAPH, the fields in the outer SPGRAPH are,
by default, filled by column, starting at the top left (field 1,1). If you want to fill a
particular field instead, use the combination of ROW and COL options.
Options
hfields=number of horizontal fields [1]
vfields=number of vertical fields [1]
fillby=[columns]/rows
You can use these options, alone or together, to put several graphs on a single
page. They tell rats to divide the specified axis into multiple fields. If you use
these options, you can use the ROW and COL options (or hfield and vfield
parameters) on GRAPH, GBOX, GCONTOUR, or SCATTER (or inner SPGRAPHs) to control
the positioning of each graph. By default, the graph instructions fill the fields by
columns, beginning with the top left, working down. Use FILLBY=ROWS to fill across
in rows by default instead. Either can be overridden by the ROW and COL options
on the graphing instructions.
done
You must do SPGRAPH(DONE) after all the graphics instructions. rats will draw
the graph as soon as you execute the SPGRAPH(DONE).
row=row in the SPGRAPH matrix for this graph
col=column in the SPGRAPH matrix for this graph
If you have one SPGRAPH inside another, this allows you to override the
positioning. See "Positioning".
samesize/[nosamesize]
Use SAMESIZE if you want all the graphs in the SPGRAPH to be the same size.
NONE no key
ABOVE entered above the content (and any HEADER and SUBHEADER).
BELOW centered below the content, and any outer horizontal labeling
LEFT left side, centered vertically, outside the content and any outer
vertical labeling
RIGHT right side, centered vertically, outside the content and any outer
vertical labeling
style=[line]/polygon/bar/stackedbar/overlapbar/vertical/step/
symbol/midpolygon/fan/dots/spike
Choose the style that will be used for displaying the information in the graph.
There are actually only three ways that a key sample is shown: a line, a filled
rectangle, or a symbol, so you just have to pick a style in the correct "family".
klabel=VECTOR of STRINGS for KEY labels [required]
This is required so SPGRAPH will know how to describe the series. You can create
the VECTOR[STRINGS] ahead of time, or enter it using the ||..|| notation.
symbols=VECTOR of INTEGERS supplying style numbers for SERIES
Use SYMBOLS to supply a VECTOR of INTEGERS with the style numbers you want
to use for the corresponding series if you don't want the standard behavior of us-
ing styles 1, 2, 3, ... in order.
[kbox]/nokbox
This controls whether or not a box (border) is drawn around the key.
kheight=height of key box (between 0 and 1)
kwidth=width of key box (between 0 and 1)
By default, rats tries to find the most efficient arrangement for the key, given
the number of series in the key, its position, the setting of the KEYLABELING
parameter on GRPARM and so on. KHEIGHT and KWIDTH allow you to control the
size and proportion of the box by specifying the height and width of the box as a
fraction of the graph’s overall height and width. You must specify both options.
Notes
HFIELDS and VFIELDS offer an alternative to creating full size graphs and then
pasting and resizing them into another application (or photoreducing them) in order
to put several on a page. rats will automatically scale all elements of the graphs,
including labels, to fit the page. Together, they create a “matrix” of graphs. See, for
instance, the example below. Once you get beyond two or three graphs on a page, you
will probably want to eliminate the tick marks and axis labeling. The graphs will get
too cluttered for the labeling to be useful.
You can use the SAMESIZE option to force the graph boxes to be the same size.
Examples
This generates a 3×3 “matrix” of pairwise scatter plots for the three series
SLIST(1), SLIST(2) and SLIST(3). The “diagonal” is left empty by skipping the
SCATTER instruction if I is equal to J.
spgraph(vfield=3,hfield=3,header="Scatter Plots")
do i=1,3 ; do j=1,3
if (i<>j)
scatter(vscale=none,hscale=none,style=symbols) 1 i j
# slist(i) slist(j)
end do j ; end do i
spgraph(done)
list s = transform
spgraph(window="Transformations")
graph(scale=none,header=header,key=below,noksample,$
klabels=klabels) 3
cards s nbeg nend 1
do i=1,3
grtext(entry=nbeg,align=right,$
y=%if(transform(i)(nbeg)>(i-1)+.50,$
transform(i)(nbeg)-.25,transform(i)(nbeg)+.25)) labels(i)
end do i
spgraph(done)
The example below is taken from the MONTEVAR.SRC procedure. Note the use of
the labeling options available on SPGRAPH:
dec vect[strings] xlabel(nvar) ylabel(nvar)
dec vect[integer] depvars
compute depvars=%modeldepvars(varmodel)
do i=1,nvar
compute ll=%l(depvars(i))
compute xlabel(i)=ll, ylabel(i)=ll
end do i
smpl 1 nstep
spgraph(hfields=nvar,vfields=nvar,xlabels=xlabel,ylabels=ylabel)
do i=1,nvar
do j=1,nvar
graph(ticks,min=minlower,max=maxupper,number=0) 3 j i
# resp(j)
# upper(j) / 2
# lower(j) / 2
end do j
end do i
spgraph(done)
See Also . . .
GBOX Generates box-and-whisker plots.
GCONTOUR Generates high-resolution contour plots.
GRAPH Generates high-resolution time series graphs.
GRPARM Sets parameters for graphics instructions.
GRTEXT Adds text to a graph or scatter plot.
SCATTER Generates high-resolution X-Y scatter plots.
Parameters
start end Range to use. If you have not set a SMPL, this defaults to the
standard workspace. Unlike many rats instructions, you will
need to use the SMPL option to exclude any observations that
would cause the expression to return a missing value.
expression A variable or formula.
result The (REAL) variable into which the computed result will be
stored.
Options
smpl=SMPL series or formula ("SMPL Option" on page RM–546)
You can supply a series or a formula that can be evaluated across entry numbers.
Entries for which the series or formula is zero or “false” will be omitted from the
calculations.
mean
product
maximum
minimum
frac=desired fractile (quantile) [not used]
Use one of these (mutually exclusive) options to select the statistic you want to
compute. If you don’t use any of the options, SSTATS computes the sum. Use
FRAC=.50 for the median value.
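For example, this computes the sample median of a (hypothetical) series X over entries 1 to 100, storing it into XMEDIAN:
sstats(frac=.50) 1 100 x>>xmedian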
startup=FRML evaluated at period “start”
You can use the STARTUP option to provide an expression which is computed once
per function evaluation, before the regular formula is computed. This allows you
to do any time-consuming calculations that don’t depend upon time. It can be an
expression of any type.
[panel]/nopanel
When working with panel data, NOPANEL disables the special panel data treat-
ment of expressions which cross individual boundaries.
weight=series of entry weights ("WEIGHT option" on page RM–549)
Use this option if you want to give unequal weights to the observations.
Variables Defined
%NOBS Number of observations (INTEGER)
%MAXENT Entry number of maximum value (if MAXIMUM) (INTEGER)
%MINENT Entry number of minimum value (if MINIMUM) (INTEGER)
Examples
This is a common use for SSTATS. It does a comparison of a series of statistics with a
critical value and computes the percentage of those that exceed it. STATS>%CDSTAT
is either 1 or 0 depending upon whether the value of STATS(T) is bigger than
%CDSTAT, so the mean of that over the sample is the fraction for which that is
true. The resulting value (called PVALUE) is an empirical significance level.
sstats(mean) 1 ndraws (stats>%cdstat)>>pvalue
disp "Bootstrapped p-value" pvalue
This example is taken from Chapter 14 of Stock and Watson (2007). Here, we com-
pute pseudo out-of-sample forecasts and use SSTATS to compute the (uncentered)
second moments of forecast errors. The @UFOREERRORS procedure uses an SSTATS in
a similar way to compute forecast performance statistics.
set fcst_adl = 0
set fcst_const = 0
do t1=1992:12,2002:11
linreg(noprint) exreturn 1960:1 t1
# constant exreturn{1} ln_divyield{1}
prj fcst_adl t1+1 t1+1
linreg(noprint) exreturn 1960:1 t1
# constant
prj fcst_const t1+1 t1+1
end do t1
declare vector rmse(3)
sstats(mean) 1993:1 2002:12 exreturn^2>>rmse(1) $
(exreturn-fcst_const)^2>>rmse(2) (exreturn-fcst_adl)^2>>rmse(3)
disp "RMSFE of Zero Return" @25 *.## sqrt(rmse(1))
disp "RMSFE of Constant" @25 *.## sqrt(rmse(2))
disp "RMSFE of ADL" @25 *.## sqrt(rmse(3))
Wizard
In the Statistics—Univariate Statistics wizard, choose "Basic Statistics".
Parameters
series Series to analyze.
start end Range to use. If you have not set a SMPL, this defaults to the
defined range of series.
Options
[print]/noprint
title="title for output" ["Statistics on Series xxxx"]
Use NOPRINT to suppress the output. Use TITLE to supply your own title to label
the resulting output.
smpl=SMPL series or formula ("SMPL Option" on page RM–546)
You can supply a series or a formula that can be evaluated across entry numbers.
Entries for which the series or formula is zero or “false” will be omitted from the
calculations, while entries that are non-zero or “true” will be included.
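For example, assuming a dummy series FEMALE (non-zero for the entries to include), this computes statistics on a hypothetical series WAGE using only those entries:
stats(smpl=female) wage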
fractiles/[nofractiles]
If you use the FRACTILES option, STATISTICS computes the maximum, mini-
mum, median and a number of other sample fractiles (quantiles) (1%, 5%, 10%,
25%, 75%, 90%, 95% and 99%). If you want only the fractiles and not these
moment-based statistics, use the option NOMOMENTS.
[center]/nocenter
NOCENTER can be used if a mean of zero is assumed—it uses variants on the cal-
culations which don't include subtracting off an unknown mean.
[moments]/nomoments
By default, STATISTICS computes the following:
• sample mean, variance and standard error
• test for mean = 0
• skewness
• (excess) kurtosis
• Jarque–Bera (1987) normality test
The skewness and kurtosis statistics include a test of the null hypotheses that
each is zero (the population values if series is i.i.d. Normal.) Jarque–Bera is a
test for normality based upon the skewness and kurtosis measures combined. We
list the formulas for these in "Technical Information".
Notes
rats has a number of other related instructions which you may find useful. SSTATS
computes statistics on one or more general formulas. EXTREMUM computes the maxi-
mum and minimum values only. MVSTATS computes means, variances, and various
quantiles for a moving window on a series.
Output
The following instructions read data from the file HAVERSAMPLE.RAT and compute
statistics on two series derived from the data (real gdp growth and trade surplus as
a fraction of gdp). The first does the moment statistics, the second fractiles.
open data haversample.rat
calendar(q) 1947:1
data(format=rats) 1947:01 2006:04 gdp gdph m x
*
set reltrade = (x-m)/gdp
set gdpgrowth = 400.0*log(gdph/gdph{1})
stats gdpgrowth
stats(fractiles,nomoments) reltrade
Technical Information
The skewness and kurtosis formulas and the test statistics based upon them are from
Kendall and Stuart (1958).
For the sample $X_1, X_2, X_3, \ldots, X_N$:

Mean: $\bar{X} = \frac{1}{N}\sum_{i=1}^{N} X_i$

Variance: $s^2 = \frac{1}{N-1}\sum_{i=1}^{N}\left(X_i-\bar{X}\right)^2$

Standard Error of Mean: $\frac{s}{\sqrt{N}}$

t-statistic for mean=0: $\frac{\bar{X}\sqrt{N}}{s}$

$m_k$ (used below): $m_k = \frac{1}{N}\sum_{i=1}^{N}\left(X_i-\bar{X}\right)^k$

Skewness (Sk): $Sk = \frac{N^2\, m_3}{(N-1)(N-2)\, s^3}$

Sk=0 statistic: $z = Sk\sqrt{\frac{(N-1)(N-2)}{6N}}$

Kurtosis (Ku): $Ku = \frac{N^2\left((N+1)m_4 - 3(N-1)m_2^2\right)}{(N-1)(N-2)(N-3)\, s^4}$

Ku=0 statistic: $z = Ku\sqrt{\frac{(N-1)(N-2)(N-3)}{24N(N+1)}}$

Jarque-Bera: $jb = N\left(\frac{(Ku)^2}{24} + \frac{(Sk)^2}{6}\right)$
Notes
For accuracy, the calculations are all done as written here, and are not done using
(theoretically) equivalent expressions with uncentered moments.
Estimates of the variance can also be obtained from many other instructions, such as
CORRELATE, VCV, CMOMENT. Those estimates will be different from those produced by
STATISTICS as only STATISTICS uses an N–1 divisor. The estimates from VCV and
CMOMENT might show an even greater difference because those use a set of entries
common to all series involved, which may not be the same as the range which would
be used for each separately.
The rats formulas for skewness and kurtosis include small-sample corrections
which are omitted by many other software packages. As a result, values for these will
often be somewhat different from those obtained using other software.
Variables Defined
%NOBS Number of observations (INTEGER)
See Also . . .
TABLE Computes a table of statistics on one or more series.
EXTREMUM Locates the maximum and minimum values of a series.
Parameters
equations Number of equations in the system. You can use a * if you are
using the MODEL option.
Options
model=model name
This is an alternative way to specify the system of equations and is the only way
to forecast with a set of FRMLs. MODELs are usually created by GROUP or SYSTEM.
from=starting period for the forecast interval
to=ending period for the forecast interval
steps=number of forecast periods to compute
These determine the periods for which forecasts will be computed. If you have set
a SMPL, FROM and TO default to that range. Otherwise, FROM and TO default to
the beginning and end of the most recent estimation range, respectively. If you
want something other than the defaults, you can use:
• FROM and TO to set the starting and ending periods for the forecasts, or
• FROM and STEPS to set the starting date and number of steps (periods)
print/[noprint]
If PRINT, rats prints the forecasts and actual values over the forecast period.
window="Title of window"
If you use the WINDOW option, a (read-only) spreadsheet window is created with
the indicated title and displayed on the screen. This will show the forecasts in
columns below the name of the dependent variable.
Variables Defined
%FSTART Starting entry of forecasts (INTEGER)
%FEND Ending entry of forecasts (INTEGER)
Example
steps(model=canmodel,print,results=stepsfore,from=2000:1,$
  to=2009:4)
do i=1,6
graph(header="Forecasts of "+%modellabel(canmodel,i)) 2
# stepsfore(1,i)
# %modeldepvars(canmodel)(i) %fstart %fend
end do i
This computes and graphs one-step forecast errors for 2000:1 to 2009:4 for a six vari-
able model. The graphs include both the forecasts and the actual data for compari-
son. Note that this type of graph is likely to be very uninteresting if plotted over a
long time span—the forecast errors are generally quite small compared to the overall
range of the series, so the actuals and forecasts may be almost on top of each other.
Wizard
If you open a rats file with File—Open, you can write series to the file by doing
View—Series Window and dragging series from that window onto the rats data
window.
Parameters
list of series rats stores the listed series on the data file. If you have not
set a SMPL, STORE will use the defined range of each individual
series. Omit this list if you use the CONVERT option.
unit=[data]/input/other unit
This specifies the source I/O unit for the data you are converting. rats will read
the series from this unit, and store them on the rats format file. Usually, this
will just be the default setting of UNIT=DATA.
org=[rows]/columns
For spreadsheet-style formats, ORG=COLS is for a worksheet with the series
running down columns, while ORG=ROWS is for a worksheet with series running
across rows. Note: The older VARIABLES and OBSERVATION arguments for this
option are still supported.
sheet="worksheet name"
When converting data from a format that supports multiple pages or sheets, you
can use the SHEET option to provide the name of the particular worksheet from
which you want to read data. By default, rats will use the first sheet in the file.
sql="SQL string"
Used with FORMAT=ODBC, this allows you to connect to a database and read data
using sql commands. See Section 3.15 of the Additional Topics pdf for more
details.
[adddates]/noadddates
For XLS, XLSX, DIF, PRN, WKS, ODBC and DBF only. If there are dates on the file,
STORE will transfer them to the rats format file. If there are none, however, you
can still attach dates to the series by using a CALENDAR instruction (before the
STORE). The CALENDAR tells rats the date of the first observation on the file and
the data frequency. NOADDDATES is only necessary in the rare instance that you
have used a CALENDAR instruction, but want to save the series as undated.
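As a sketch (the file names are hypothetical, and this assumes the CONVERT option takes the source format, as in CONVERT=XLS), a CALENDAR before the STORE attaches quarterly dates starting in 1980:1 to series converted from a worksheet that has no date column:
cal(q) 1980:1
open data mydata.xls
dedit(new) mydata.rat
store(convert=xls)
save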
Usage
To use STORE, you must first open a rats format data file for editing using DEDIT.
Be sure to do a SAVE after the STORE to put the changes out to the file. To use STORE
with CONVERT, you also need to OPEN the unit with the data.
Examples
cal(q) 1947:1
all 2018:4
open data country.dat
data / real_gnp deflator
set nom_gnp = real_gnp*deflator/100
dedit(new) country.rat
store real_gnp deflator nom_gnp
save
quit
reads REAL_GNP and DEFLATOR (the GNP deflator) from a free-format file, creates a
nominal GNP series (NOM_GNP), and stores all three series on the file COUNTRY.RAT.
prtdata
This opens the data file BASICS.XLS, creates a new rats file called BASICS.RAT,
and then copies all series whose names begin with “M” into the rats file. The SAVE
command saves the changes to the new file, and the PRTDATA command displays the
contents of the file to verify that the transfer was done properly.
Notes
When converting from RATS or PORTABLE files, STORE will extract up to two lines of
comments per series, and store them on the rats file. With the other formats, if you
want comment lines, you will have to use RENAME to add them after doing the STORE.
Series names longer than sixteen characters will be trimmed to the first sixteen. For
data downloaded from a commercial data base, you may want to do some RENAMEing
after you convert the file, since the data base names for series often are very cryptic.
See Also . . .
Intro, Section 2.7 rats format data files.
DEDIT Instruction required to open a rats format file for editing.
INCLUDE Adds a single series to a rats format file.
RENAME Renames or adds comments to a series on a rats format file.
SAVE Instruction required to save changes to a rats format file.
Wizard
Use the Statistics—Linear Regressions wizard and choose “Stepwise Regression” as
the technique.
Parameters
depvar Dependent variable in the regression.
start end Estimation range. If you have not set a SMPL, this defaults to
the maximum range over which all of the variables involved,
dependent and explanatory, are defined.
residuals (Optional) Series for the residuals.
Supplementary card
The order of listing is important if you want to force variables (such as the
CONSTANT) into the model, because the FORCE option, described later, will only act on
the first variables listed on the card.
Please note that no variables are automatically included in each model unless you
use the FORCE option. (Some software automatically includes an intercept.)
Options
method=[stepwise]/forward/backward/gtos
Sets the method to be used (see description later).
slenter=threshold significance level for adding a variable
slstay=threshold significance level for retaining a variable
A variable is added only if its t-statistic would have a p-value below SLENTER.
At each deletion step, the included variables are checked: if the significance
level of the one with the smallest t-statistic is larger than the SLSTAY
threshold value, it is deleted and the procedure is repeated with the reduced
regression.
If you choose METHOD=STEPWISE, STWISE will not allow you to set SLSTAY to a
lower value than SLENTER.
For example, this forces the first two listed regressors (CONSTANT and TREND)
into every model:
stwise(force=2) gnp_82
# constant trend x1 x2 x3 ...
The following options are the same as those for the LINREG instruction:
[print]/noprint
vcv/[novcv]
smpl=SMPL series or formula (“SMPL Option” on page RM–546)
spread=standard SPREAD option (“SPREAD Option” on page RM–547)
dfc=Degrees of Freedom Correction (Additional Topics, Section 1.4)
define=Equation to define (Introduction, Section 1.5.4)
frml=Formula to define
unravel/[nounravel] (User’s Guide, Section 2.10)
weight=series of entry weights (“WEIGHT option” on page RM–549)
Stepwise Methods
STWISE performs stepwise regressions using one of four methods, selected using the
METHOD option:
Forward selection Variables are added to the model sequentially until no
variable not yet in the model would, when added, have a t-
statistic with a p-value (significance level) smaller than the
SLENTER threshold value.
Backward selection Starting from the full set of regressors, variables with the
lowest t-statistics are deleted until all remaining variables
have a p-value smaller than the SLSTAY threshold.
Full stepwise This is the default method and combines the first two. At
each stage in the forward selection procedure, the backward
selection algorithm is run to delete variables which now
have small t-statistics.
General to specific Use GTOS (for General TO Specific) if you want STWISE
to drop regressors starting at the end of the list on the
supplementary card. This is useful for pruning lags from an
autoregressive model.
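For example, assuming a (hypothetical) series Y, this prunes an autoregression down from twelve lags, dropping from the longest lag first, while forcing the CONSTANT to stay in the model:
stwise(method=gtos,force=1) y
# constant y{1 to 12}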
Missing Values
Any observation for which any of the variables is missing is omitted from the analysis.
Output
In addition to the standard regression output, STWISE prints the steps taken in ar-
riving at the final model, for instance,
stwise(force=1) yesvm
# constant public1_2 public3_4 public5 private years $
teacher loginc logproptax
Produces:
Stepwise Regression
Dependent Variable YESVM
Usable Observations 95
Degrees of Freedom 89
Centered R^2 0.2091332
R-Bar^2 0.1647025
Uncentered R^2 0.7086280
Mean of Dependent Variable 0.6315789474
Std Error of Dependent Variable 0.4849354328
Standard Error of Estimate 0.4432048524
Sum of Squared Residuals 17.482318163
Regression F(5,89) 4.7070
Significance Level of F 0.0007346
Log Likelihood -54.3965
Durbin-Watson Statistic 1.9187
Notes
You can use DEFINE or FRML to save the form of the estimated regression as an
EQUATION or FRML. If, however, your program needs to examine the choices made for
the regressors, you can “fetch” the final regression by using the functions
%EQNSIZE(0) number of regressors (could also use %NREG)
%EQNCOEFFS(0) coefficient vector (could also use %BETA)
%EQNTABLE(0) 2 x regressors INTEGER array, where the first row elements are
the series numbers of the chosen regressors and the second row
are their lags
%EQNREGLABELS(0) VECTOR[STRINGS] with the regressor labels
For instance, after the STWISE instruction executed above,
disp %reglabels()
will show
Constant PUBLIC3_4 PRIVATE TEACHER LOGINC LOGPROPTAX
Count Variables
%NDF, %NOBS, %NREG, %NFREE
summarize( options )
# list of variables in regression format (omit with VECTOR or ALL options)
Parameter
expression To analyze a linear or nonlinear combination of the coefficients,
supply the desired expression as a parameter, using elements
of the %BETA vector or nonlinear parameters (with PARMSET op-
tion) to represent the coefficients. Omit the supplementary card.
Supplementary Card
On the supplementary card, list in regression format the set of variables from the
previous regression that you want to sum. Omit this if you use the VECTOR or ALL
options, or if you are using the expression parameter to analyze your own function of
the coefficients.
Applicability
SUMMARIZE, in the first form described above, can be used only after LINREG, ST-
WISE, AR1, DDV (logit and probit), LDV, SUR or ITERATE, as those instructions use re-
gressor lists either directly or indirectly. It can be used after estimation instructions
that use parameter sets (such as MAXIMIZE) with the use of the VECTOR option or by
using the expression form with references to elements of %BETA. You can also use
the original parameters in the expression form if you use the PARMSET option.
Options
[print]/noprint
title=”string for output title”
Use NOPRINT if you just need to compute results, and don’t need printed output
Use TITLE to add to the output a description of what is being computed.
all/[noall]
Use ALL to test the sum of all the coefficients. Omit the supplementary card.
vector=VECTOR of weights
VECTOR allows you to compute any linear combination of coefficients, not just the
sum. The VECTOR should have dimension equal to the number of regressors and
give the weights to apply to the regressors in computing the linear combination.
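For instance, after a (hypothetical) regression with four coefficients, VECTOR can compute the difference between the X1 and X2 coefficients rather than a sum:
linreg y
# constant x1 x2 x3
summarize(vector=||0.0,1.0,-1.0,0.0||)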
form=f/chisquared
This determines the form of the test statistic. By default, rats selects the appro-
priate form based on the estimation technique used last. Use FORM to manually
select a distribution if you have made changes to the regression that require a
different distribution, such as altering the %XX matrix in a way which incorpo-
rates the residual variance into %XX. See Section 2.11 in the User’s Guide.
derives=VECTOR of derivatives
numerical/[nonumerical]
DERIVES returns the VECTOR of derivatives of the expression with respect to the
(full set of) parameters in case you need them for further calculations. By de-
fault, SUMMARIZE uses analytical derivatives, which aren’t always available (for
instance, rats doesn’t differentiate matrix operations). With NUMERICAL, it does
numerical derivatives instead.
Examples
The most common use of SUMMARIZE is the analysis of a set of coefficients in a
distributed lag. For instance, this SUMMARIZE computes the sum of the lags of
SHORTRATE:
linreg longrate
# constant shortrate{0 to 24}
summarize
# shortrate{0 to 24}
Output
The example above produces the following output:
Summary of Linear Combination of Coefficients
SHORTRATE Lag(s) 0 to 24
Value 0.97229804 t-Statistic 96.53677
Standard Error 0.01007179 Signif Level 0.00000000
Technical Information
If the preceding estimation instruction produces $\hat{\beta}$ as the estimator, with covariance matrix

(1) $\hat{\sigma}^2\left(X'X\right)^{-1}$ (denoted $\Sigma_X$ below)

then, for the linear combination given by the weight vector $c$, SUMMARIZE reports the value

(2) $c\hat{\beta}$

with standard error

(3) $\hat{\sigma}\sqrt{c\left(X'X\right)^{-1}c'}$

and t-statistic

(4) $c\hat{\beta}\Big/\sqrt{c\,\Sigma_X\,c'}$
Parameters
equations Number of equations in the system you are estimating. This is
ignored if you use the MODEL option—you can use * in its place
if you need any of the remaining parameters.
start end Estimation range. If you have not set a SMPL, this defaults to
the common defined range of all variables involved in the com-
plete regression.
equate This is just the word EQUATE. Omit this and the list if you
want the standard estimation without cross-equation restric-
tions. The differences between SUR with and without EQUATE
are described later in this section.
list (Optional) List of coefficient positions to be equated across equa-
tions. If you use EQUATE without list, all equations have the
same coefficient vector.
Supplementary Cards
Supply one supplementary card for each equation in the system being estimated.
Omit the supplementary cards if using the MODEL option to supply the model. If
you use supplementary cards to input the model, you can still use the RESIDS and
COEFFS options to save the residuals or coefficients, rather than the resids and
coeffs fields on the supplementary cards.
equation The equation name or number. You must set up the equations
in advance using EQUATION or LINREG with DEFINE.
residuals (Optional) The series for the residuals for equation.
coeffs (Optional) The series of the coefficient estimates for equation.
General Options
model=model name
This is an alternative way to specify the system of equations to be estimated.
MODELs are usually created by GROUP or SYSTEM. For SUR, the MODEL must con-
tain only EQUATIONS, and no FRMLs.
[print]/noprint
vcv/[novcv]
[sigma]/nosigma
These control the printing of the regression output, the covariance matrix of the
complete coefficient vector, and the final estimate of the residual covariance/cor-
relation matrix, respectively.
cv=input residual covariance matrix [estimated]
When you use CV, the standard errors and covariance matrix of coefficients will
be correct only if the CV matrix incorporates the residual variances. For instance,
you can obtain two-stage least squares estimates of the coefficients of a system of
equations using SUR(INST) with a CV of the identity matrix, but the covariance
matrix will be incorrect.
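As a sketch of the two-stage least squares point above (the equation names are hypothetical, and this assumes the CV option accepts a matrix expression such as %IDENTITY(2)):
sur(inst,cv=%identity(2)) 2
# eq1
# eq2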
cmom/[nocmom]
This pulls cross products out of the cross product matrix computed previously
with a CMOM instruction. This can improve calculation time if the SUR is being
executed many times with different input CV matrices.
create/[nocreate]
setup/[nosetup]
Use CREATE to print the output from the system if you recompute the coefficients
and/or covariance matrix using an instruction other than SUR. This is the sys-
tems analogue of the CREATE option for LINREG. SETUP does no estimation: it
sets up the %BETA and %XX arrays (described below) so that you can compute the
coefficients and covariance matrix using matrix instructions. You can then use
SUR(CREATE...) to get the output.
unravel/[nounravel]
Substitutes for ENCODED variables (User’s Guide, Section 2.10). rats does not
print the intermediate regression (in terms of encoded variables).
robusterrors/[norobusterrors]
lags=correlated lags [0]
lwindow=newey/bartlett/damped/parzen/quadratic/[flat]/panel/white
damp=value of g for lwindow=damped [0.0]
lwform=VECTOR with the window form [not used]
cluster=SERIES for clustered std. errors [not used]
When you use these without the INSTRUMENTS option, they allow you to calculate
a consistent covariance matrix allowing for heteroscedasticity (with ROBUST),
serial correlation (with ROBUST and LAGS), or clustered standard errors (with
ROBUST and CLUSTER). For more information, see Sections 2.2, 2.3, and 2.4, and
4.5 of the User’s Guide and “Long-Run Variance/Robust Covariance Calculations”
on page RM–540.
None of these affect the parameter estimates. As with LINREG and NLLS, they
only come into play when the covariance matrix of the estimates is computed.
They behave differently when used with INSTRUMENTS.
zudependent/[nozudep]
wmatrix=SYMMETRIC weighting matrix for instruments [(Z′Z)⁻¹]
sw=SYMMETRIC grand weighting matrix [not used]
swout=estimated SYMMETRIC grand weighting matrix [not used]
NOZUDEP (the default) is the special case for the SW matrix. We call this
NOZUDEP because the most important case is where u is (serially uncorrelated
and) independent of the instruments Z. More generally, this is Case (i) in Hansen
(1982, page 1043). With NOZUDEP, you can use WMATRIX to set the W part of the
S⁻¹ ⊗ W and the CV option to set S. Otherwise, SUR estimates a new S after
each iteration.
If ZUDEP, you can use the SW option to set the full SW array. This is an nr x nr
SYMMETRIC array. Otherwise, SUR determines a new SW matrix after each itera-
tion by taking the inverse of
(1) $\frac{1}{T}\sum_{t}\left(u_t \otimes Z_t\right)\left(u_t \otimes Z_t\right)'$
(or the generalization of this if you use the LAGS option). The SWOUT option allows
you to save the estimated SW matrix into the specified array.
center/[nocenter]
CENTER adjusts the weight matrix formula to subtract off the (sample) means of
u ⊗ Z, which may be non-zero for an overidentified model. For more information,
see “ZUMEAN and CENTER options” on page RM–541.
robusterrors/[norobusterrors]
If you use ROBUSTERRORS combined with an input CV or SW matrix, SUR will
compute the coefficients using the “suboptimal” weighting matrix and then cor-
rect the covariance matrix of the coefficients based upon the choices for the LAGS,
LWINDOW and other options immediately above.
jrobust=statistic/[distribution]
You can use this option to adjust the J-statistic specification test when the
weighting matrix used is not the optimal one. See Section 4.9 in the User’s Guide
for more information.
Variables Defined
Because SUR estimates a whole set of equations, most of the single equation fit statis-
tics aren’t defined.
%BETA VECTOR of coefficients (across equations)
%XX covariance matrix of coefficients (SYMMETRIC)
%TSTATS VECTOR containing the t-stats for the coefficients
%STDERRS VECTOR of coefficient standard errors
%NOBS number of observations (INTEGER)
%NREG number of regressors (INTEGER)
%NFREE number of free parameters, including covariance matrix (INTEGER)
%LOGDET log determinant of the estimate of sigma (REAL)
%LOGL log likelihood (if not INSTRUMENTS) (REAL)
%SIGMA final estimate of the S matrix (SYMMETRIC)
%NVAR number of equations (INTEGER)
Output
The next page shows the output from a two equation SUR. Note that SUR prints the
regression output (controlled by [PRINT]/NOPRINT) separately for each equation.
However, the covariance/correlation matrix of the coefficient estimates (controlled by
VCV/[NOVCV]) is for the full system. The final part of the output is the covariance/
correlation matrix of the residuals (controlled by [SIGMA]/NOSIGMA). rats labels
the rows and columns with the names of the dependent variables of the equations.
For each position in list, SUR forces the coefficients in all equations at that position
to be equal. For example, you would put a 2 in list to equate the 2nd coefficients in
all equations.
Example
equation geeq ige
# constant fge cge
equation westeq iwest
# constant fwest cwest
sur 2 / equate 2 3
# geeq
# westeq
This restricts the coefficient in position 2 of the first equation (the FGE coefficient) to
be equal to the coefficient in position 2 of the second equation (the FWEST coefficient).
It also restricts to be equal the coefficients in position 3: CGE and CWEST.
Output
The output for SUR with EQUATE is the same as for standard SUR with one exception:
the covariance matrix of coefficients does not include duplicates of the equated coeffi-
cients. The equated coefficients are listed first, followed by the coefficients which are
estimated separately. For the example above:
Restricted coefficients will take their labels from the last equation. The first CON-
STANT is the intercept from equation 1, the second is from equation 2.
The variables %XX, %BETA, %STDERRS, %TSTATS and %NREG are all set up in this or-
der as well. For instance, %NREG will just be four (the number of free coefficients) and
%BETA will have four entries.
Parameters
start, end estimation range. By default, maximum range permitted by all
variables involved in the regression, including instruments if
required.
Options
smpl=SMPL series or formula (“SMPL Option” on page RM–546)
You can supply a series or a formula that can be evaluated across entry numbers.
Entries for which the series or formula is zero or “false” will be skipped, while
entries that are non-zero or “true” will be included in the operation.
depvar/[nodepvar]
Controls whether or not the dependent variable(s) from EQUATION or MODEL are
included as target variables.
instruments/[noinstruments]
If INSTRUMENTS, use the current list from the INSTRUMENTS instruction rather
than a list on a supplementary card. Omit the “instruments” card if you use
INSTRUMENTS.
The VARIANCES option indicates whether the variances are HOMOGENEOUS (same
across groups) or HETEROGENEOUS (different).
The AVERAGE option indicates how the coefficient vectors for the different groups
are combined. COUNT weights them by the size of the group, SIMPLE weights each
group equally, PRECISION weights by the precision of the estimates.
Variables Defined
%BETA Averaged VECTOR of regression coefficients
%XX SYMMETRIC covariance matrix of averaged regression coeffi-
cients
%NOBS Number of observations (INTEGER)
%NREG Number of regressors (INTEGER)
%NFREE Number of free coefficients in the full system, including
variances/covariance matrices (INTEGER)
%NREGSYSTEM Number of regressors across groups and targets (INTEGER)
%NVAR Number of target variables (INTEGER)
%NGROUP Number of groups (INTEGER)
%LOGL Log likelihood (REAL)
%SIGMA Covariance matrix of residuals (SYMMETRIC)
Examples
sweep(group=%indiv(t),var=hetero)
# dc
# lpc{1} constant lndi dp dy ddp
computes into %BETA the average regression coefficients from regressing DC on the
variables in the second list, with a separate regression on each individual in a panel
data set, allowing the variances to differ from individual to individual.
set group = 1+(z{1}>=g1t)+(z{1}>=g2t)
sweep(grouping=group)
# df ds
# constant df{1 to 8} ds{1 to 8} z{1}
does a bivariate systems regression of DF and DS on the second list, with regressions
done on three subsamples using a common variance matrix (since VARIANCES is the
default HOMOGENEOUS).
This does a panel causality test (for m causing y) allowing heterogeneity in the coef-
ficients and variances. The first SWEEP is the unrestricted model (including lags of m)
and the second excludes the lags of m. This uses an INQUIRE instruction to generate
a SMPL dummy which will ensure that both regressions use the same entries.
cal(panelobs=40)
open data panelmoney.xls
data(org=obs,format=xls) 1//1 19//40 realm realy
*
* Number of lags
*
compute p=3
*
set dy = realy-realy{1}
set dm = realm-realm{1}
*
inquire(valid=fullsmpl,reglist)
# dy{0 to p} dm{0 to p}
*
sweep(group=%indiv(t),smpl=fullsmpl,var=hetero)
# dy
# constant dy{1 to p} dm{1 to p}
compute loglunr=%logl,nregunr=%nregsystem
sweep(group=%indiv(t),smpl=fullsmpl,var=hetero)
# dy
# constant dy{1 to p}
compute loglres=%logl,nregres=%nregsystem
cdf(title="Heterogeneous Panel Causality Test") chisqr $
2.0*(loglunr-loglres) nregunr-nregres
compute jointtest=%cdstat,jointsignif=%signif
Wizard
The Time Series—VAR (Setup/Estimate) wizard provides an easy, dialog-driven
interface for defining and estimating VAR models.
Parameters
equations The list of equations you want to include in the system. Omit
this if you use the MODEL option, which is the recommended way
to handle a VAR. If you are grouping some dissimilar equations
for Kalman filtering, use EQUATION or LINREG(DEFINE=xxx)
to define them before using SYSTEM.
Options
model=model name [unused]
The MODEL option defines the equations in the system as a MODEL variable. If you
use MODEL, omit the list of equations.
The MODEL option makes using instructions such as FORECAST, IMPULSE, and
HISTORY easier—rather than having to list each equation on a supplementary
card, you just use the MODEL option on those instructions to reference the system.
If you are going to use the ECT instruction to add error-correction terms to your
model, you must use MODEL.
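As a hedged sketch (series and model names hypothetical), the MODEL option lets a later instruction reference the whole system at once:

system(model=smallvar)
variables gdpgrow inflation
lags 1 to 2
det constant
end(system)
estimate
forecast(model=smallvar,steps=8,results=fores)

Here FORECAST needs only the MODEL option, rather than a supplementary card for each equation.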
cmoment/[nocmoment]
Use the CMOMENT option when you are putting together a system of very similar
but not identical equations. It will reduce the computation time in estimating the
system (though that is no longer an issue in practice).
Description
A system is specified beginning with SYSTEM, ending with END(SYSTEM) and
(optionally) including the following instructions:
VARIABLES, LAGS, DETERMINISTIC
list the dependent variables, lags and exogenous variables which form a VAR.
SPECIFY
specifies the prior distribution for a Bayesian VAR (BVAR).
ECT
lists equations which describe error-correction terms
KFSET, TVARYING
set various options for standard and time-varying coefficients applications of the
Kalman filter.
Examples
system(model=ratemod)
variables shortrat longrate m1 twdollar
lags 1 2 3 4 6 9 12
det constant
end(system)
creates a four variable VAR. Using the older “numbered equations” method, this
would be:
system 1 to 4
variables shortrat longrate m1 twdollar
lags 1 2 3 4 6 9 12
det constant
end(system)
linreg(define=olseqn) depvar
# constant x1 x2 x3
system olseqn
end(system)
estimates a regression with LINREG, defining it as equation OLSEQN. This equation
then goes into a one-equation system, which can be estimated sequentially using the
Kalman filter.
Wizards
If you use the View—Series Window operation, you can get a statistics table by
selecting the desired series and then using the Statistics toolbar icon.
Parameters
start end Range of entries to use in computing statistics. If you have not
set a SMPL, TABLE uses all available data for each series,
determined separately.
list of series The list of series to include in the table. If you omit the list, all
current series are included.
Options
smpl=SMPL series or formula ("SMPL Option" on page RM–546)
You can supply a series or a formula that can be evaluated across entry numbers.
Entries for which the series or formula is zero or “false” will be omitted from the
computation, while entries that are non-zero or “true” will be included.
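For instance (series name hypothetical), a dummy series can restrict the table to a subsample:

set postwar = t>=1946:1
table(smpl=postwar)

This computes the statistics using only the entries from 1946:1 on.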
window="Title of window"
If you use the WINDOW option, the output goes to a (read-only) spreadsheet
window with the given title, rather than being inserted into the output window or
file as text. From this window, you can easily export data (using File–Export...) to
a spreadsheet or word processing program to prepare a table for publication.
[print]/noprint
title="title for output" [none]
Use NOPRINT if you want to suppress the display of the output on the screen
(which would normally only make sense if you are saving the results with the
MATRIX option). Use TITLE to supply your own title to label the resulting output.
Notes
It’s always a good idea to use a TABLE instruction immediately after the DATA
instruction, particularly the first time you use a data set. It gives you a quick way
to check whether your data are in good shape. For instance, you can easily detect
series which have missing data because the number of observations will not match
those for other series. You might also have a series whose data range doesn’t match
what you expect. TABLE can help you spot series which have problems, but you
will have to use PRINT or some other instruction to isolate the cause.
The sample standard error is computed using an N–1 divisor where N is the number
of data points in a series.
%MAXIMUM and %MINIMUM are the only variables that TABLE defines that you can
access from within your program (except by using the MATRIX option). Use the
STATISTICS instruction applied to the series one at a time if you need access to such
things as the mean and number of observations.
Variables Defined
%MAXIMUM maximum value found in the list of series (REAL).
%MINIMUM minimum value found in the list of series (REAL).
Examples
open data states.wks
data(org=obs,format=wks) 1 50 expend pcaid pop pcinc
set pcexp = expend/pop
table
produces the table of statistics in the sample output below for the four series read
from the data file plus the created series PCEXP.
Sample Output
Series Obs Mean Std Error Minimum Maximum
EXPEND 50 3316.1600000 4360.4221634 368.0000000 22750.0000000
PCAID 50 185.0200000 74.1984350 103.0000000 570.0000000
POP 50 4149.5200000 4399.9075744 325.0000000 20411.0000000
PCINC 50 4309.4600000 604.9145735 3188.0000000 5414.0000000
PCEXP 50 0.7941836 0.2472352 0.5049801 2.1476923
See Also:
STATISTICS computes more detailed statistics on a single series. EXTREMUM locates
the maximum and minimum values of a single series.
Parameters
cseries Complex series to taper.
start end Range to taper. By default, the range of cseries. However,
you will generally need these because the taper is applied to the
unpadded portion of the data.
newcseries Series for the result of the tapering. By default, same as
cseries.
newstart Starting entry of the tapered series. By default, same as start.
Options
type=[trapezoidal]/cosine
This gives the type of taper. See "The TYPE Option" for technical details.
Note that TAPER has no option analogous to the FORM option of WINDOW. Tapering
is simply the multiplication of two series, so you can implement other tapering
functions fairly easily using CSET or CMULTIPLY.
Example
This uses a cosine taper affecting 20% of the data on either end.
frequency 3 768
rtoc 1956:1 2017:12
# prices
# 1
taper(type=cosine,fraction=.20) 1 1956:1 2017:12
fft 1
cmult(scale=1./(2*%pi*%scaletap)) 1 1
Wizard
In the Statistics—Regression Tests wizard, use "Exclusion Restrictions" if you’re
testing for zero values, and "Other Constant Restrictions" if some are non-zero.
Supplementary Cards
1. List the coefficients which enter the restriction, by coefficient number, not by
variable name. rats numbers the coefficients of a LINREG or similar instruction
in the order listed on the supplementary card. You can use TO triples, like
“1 TO 5”, to abbreviate the list.
2. List the values you want these coefficients to assume under the restriction. TEST
computes a joint test of these restrictions.
Options
zeros/[nozeros]
ZEROS tests exclusion (zero) restrictions. Omit the second supplementary card if
you use this option.
[print]/noprint
NOPRINT suppresses the regular output of TEST. You may want to use this option
if you are just using TEST to compute the variables %CDSTAT or %SIGNIF.
all/[noall]
Use ALL to test whether all of the coefficients can be excluded. Omit the
supplementary card if you use this. This option was called WHOLE before version 7.
form=f/chisquared
This determines the form of the test statistic used. By default, rats will select
the appropriate form for the Wald test based upon the estimation technique used
last. You can use FORM to select a distribution manually if you have made changes
to the regression that require a different distribution, such as altering the %XX
matrix in a way which incorporates the residual variance into %XX. See Section
2.11 in the User’s Guide.
vector=coefficient vector
The coefficient vector is a VECTOR which supplies the restricted values. If you
use it, omit the second supplementary card. It must have the same size as the
current regression. Used properly, this permits tests between two estimated
coefficient vectors. See the note below.
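As a sketch (variable names hypothetical), the VECTOR option can test whether the current estimates match a coefficient vector saved from an earlier regression:

linreg y 1960:1 1979:4
# constant x1 x2
compute [vect] beta1 = %beta
linreg y 1980:1 1999:4
# constant x1 x2
test(vector=beta1,title="Test of Subsample Stability")

This treats BETA1 as fixed, so it is not a full Chow test, which would also account for the sampling variation in the first subsample estimates.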
Variables Defined
%CDSTAT the computed test statistic (REAL)
%SIGNIF the marginal significance level (REAL)
%NDFTEST (numerator) degrees of freedom for the test (INTEGER)
Examples
linreg logd 1 150
# constant logy logown logother
test(title="Test for Unit Elasticity")
# 3
# -1.0
This tests whether the LOGOWN coefficient is –1.0.
instruments constant z1 z2
linreg(inst) y
# constant x1 x2
compute [vect] beta2sls = %beta, [symm] xx2sls=%xx
instruments(add) z3 z4 z5
linreg(inst) y
# constant x1 x2
test(coeffs=beta2sls,covmat=xx2sls-%xx)
performs a Hausman test, comparing the 2SLS estimators from a small instrument
set with that from a larger set which contains the first.
linreg lgnp
# constant lgnp{1} trend dgnp{1 to 4}
*
* This computes the joint F-test. We suppress the printing,
* because it will give a misleading p-value, based upon an F and
* not the non-standard distribution.
*
test(noprint)
# 2 3
# 1.0 0.0
disp "ADF Joint test for p=1, and trend=0" %cdstat
This does a Dickey-Fuller joint test. While the Wald test gets the statistic correct,
it has a non-standard distribution, hence the use of NOPRINT (which would show a
standard F significance level).
Description
For each variable and each forecast step, THEIL computes:
• the mean error
• the mean absolute error
• the root mean square (rms) error
• Theil’s U statistic: a ratio of the rms error to the rms error of the “naive”
forecast of no change in the dependent variable.
1) Setup, using THEIL with the SETUP option. This defines the model or set of
equations to be evaluated and the number of forecast steps.
2) Forecasting, using THEIL without the SETUP or DUMP options. This usually goes
inside a loop, and does the forecasting and accumulation of forecast performance
statistics. If you want to save the forecasts, use a FORECAST instruction in
addition to THEIL.
3) Final Statistics, using THEIL with the option DUMP. This takes the forecast
statistics and prints a table for each variable in the model.
Use TO to indicate the last period in the sample. Since THEIL needs to compare
the forecasts with actual data, it will not compute any forecasts for periods
beyond that period. TO defaults to the default series length.
Parameters—Forecasting Form
start Start of the forecast period. The same information can be input
using the FROM option.
Options—Forecasting Form
from=starting period of the forecast interval [last estimation end + 1]
Starting period of forecast. This provides the same information as the start
parameter.
forecasts=(input) VECT[SERIES] of forecasts
This allows you to input a series of forecasts which can't be generated by
forecasting calculations using standard EQUATIONs or MODELs. Note that you still
have to provide a MODEL or set of equations in the SETUP form so THEIL knows
what the dependent variable(s) are.
print/[noprint]
Use PRINT to get THEIL to print the forecasts together with the actual values.
This can generate a lot of output.
Theil’s U
Theil’s U statistic has several advantages over the RMS error when you are
comparing models.
First, as a unit-free measurement, it is often easier to work with than the unit-
bound RMSE. For instance, Theil U’s for interest rate series usually are between
.8 and 1.0, while RMSE’s will vary depending upon term.
Second, it provides an immediate comparison of the forecasts with those of the
naive scheme of forecasting no change over time.
A value in excess of one is not promising, since it means the model did worse
than the naive method. However, a value substantially less than one should not
necessarily be construed as a major success—almost any reasonable procedure
will produce such a value for a series with a strong trend.
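As a sketch of the standard definition, the U statistic at horizon t is the ratio

U_t = sqrt( (1/N_t) ∑ e_it² ) / sqrt( (1/N_t) ∑ a_it² )

where e_it are the model’s forecast errors at horizon t, a_it are the corresponding errors of the no-change (naive) forecast, and N_t is the number of forecasts computed at that horizon.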
N. Obs column
The N. Obs column lists the number of different forecasts upon which the
statistics at that horizon are based. Here, we’ve used THEIL in a loop over 24 different
starting periods. As the starting period approaches the end of the sample, there is
no data against which to compare the forecasts of the later steps. Thus the total
number of available data points is less for the longer horizons. For example, the
table shows that the one-step horizon results are based on 24 different one-step
forecasts (one for each starting period used), while the two-step horizon results
are based on 23 forecasts, and so on.
Examples
boxjenk(define=bjeq,sdiffs=1,ar=2,sma=1) ldeuip * 2003:12
theil(setup,estimate=6,steps=24,to=2009:12) 1
# bjeq
do time=2004:1,2009:12,6
theil time
boxjenk(define=bjeq,sdiffs=1,ar=2,sma=1) ldeuip * time+5
end do time
theil(dump)
This evaluates a Box-Jenkins model over the period 2004:1 to 2009:12. To reduce
computation time, the model is re-estimated only every 6th period. The THEIL
instruction inside the loop will do one set of forecasts beginning with period TIME,
another beginning with TIME+1, etc. through TIME+5.
system(model=canmodel)
variables canrgnp canm1s cantbill cancpinf canusxsr usargnp
lags 1 to 4
det constant
specify(tightness=.15,type=symmetric) 0.5
end(system)
theil(setup,model=canmodel,steps=8,to=2009:4)
estimate(noprint,noftests) * 1997:4
do time=1998:1,2009:4
if time==2003:1 ; theil(dump)
theil time
kalman
end
theil(dump)
This evaluates a six-variable VAR. The evaluation period is 1998:1 to 2009:4. The
initial ESTIMATE goes through 1997:4, then the KALMAN instructions update the
coefficients. The THEIL(DUMP) after the IF statement prints the intermediate
results as of 2003:1.
Technical Information
Each time you execute THEIL in forecasting form, rats computes forecasts for
steps horizons (or fewer, if the end of the available data is hit first). From these
and the actual data, it computes forecast errors, absolute errors, squared errors,
and squared errors of the naive (flat) forecasts. The program stores running sums
of these statistics, and keeps track of the number of times that statistics have been
computed for each horizon. We’ll call this latter value N_t, the number of times that
a forecast has been computed for horizon t (t=1 for one-step-ahead forecasts, t=2 for
two-step-ahead forecasts, etc.). In the sample shown earlier in "Interpreting the
Output", N_1 was 24, N_2 was 23, and so on.
When you use THEIL(DUMP), rats divides the sums by N_t to convert them into
means, and from these it computes the rms error and Theil U statistics. The
formulas for these computations (at forecast horizon t) are given below:
Sum of Forecast Errors   SFE_t = ∑(i=1 to N_t) e_it ,  e_it = y_t − ŷ_it

where ŷ_it is the forecast at step t from the ith call to the THEIL forecast step
and y_t is the actual value of the dependent variable.
TRFUNCTION: Transfer Functions
TRFUNCTION computes the transfer function of the lag polynomial which is implied
by an equation. Unfortunately, the phrase “transfer function” is used in time series
analysis in two different ways. If you are interested in a Box-Jenkins transfer
function model, please see the instruction BOXJENK.
Parameters
cseries Complex series for the computed transfer function.
start end Range of entries (frequencies) for which you want to compute
the transfer function. By default: 1 and the FREQUENCY length.
Option
equation=equation [no default—you must specify an equation]
Equation for which you want to compute the transfer function. See below for
details.
Description
The equation must have one of the forms:
1. y_t = A(L) y_t (univariate autoregression)
2. y_t = A(L) y_t + B(L) u_t (univariate ARMA)
3. y_t = C(L) x_t (univariate distributed lag)
4. y_t = C(L) x_t + B(L) u_t (univariate distributed lag with MA error)
5. y_t = B(L) u_t (MA process)
In all of these, the variable CONSTANT may be among the right side variables.
TRFUNC computes the transfer function by:
• Taking the Fourier transform of the lag polynomial 1-A(L) in types 1 and 2 or
C(L) for types 3 and 4.
• Dividing that by the Fourier transform of the moving average part B(L), if it is
present.
Notes
If you are filtering in the frequency domain and are using a transfer function,
remember that CMULTIPLY conjugates the second series, as does the * operation in
an expression such as CSET.
For simple transfer functions, particularly those with known coefficients, you might
find it easier to use the function %ZLAG to construct the transfer function directly
using CSET.
Examples
* Type 3: distributed lag
linreg(define=dlag) ipd
# constant m1{0 to 8}
frequency 1 512
* Series 1 = transfer function
trfunc(equation=dlag) 1
* Squared gain
cmult 1 1
equation(noconstant,coeffs=||.9,.3||) arma y 1 1
trfunc(equation=arma) 1
The same transfer function as this last example can be created using the following
CSET instruction.
cset 1 = (1-.9*%zlag(t,1))/(1+.3*%zlag(t,1))
Parameters
list of ... These are the state-transition covariance matrices, which are
the covariance matrices of the changes in the coefficients. You
need one array for each equation in the SYSTEM.
These are SYMMETRIC arrays. You do not need to DECLARE
these in advance, but you do need to dimension and set them
before you execute a KALMAN instruction.
Notes
If you want to write a single set of code which can handle VAR’s of various sizes, we
recommend that you use a VECTOR[SYMMETRIC] in place of a list. For instance,
dec vector[symmetric] tvs(neqn)
tvarying tvs
makes TVS(1),TVS(2),...,TVS(NEQN) the list of matrices.
The TVARYING matrices, and the matrices for KFSET as well, have to be set before
you can do a KALMAN instruction.
See the instruction DLM and Section 10.1 of the User’s Guide for information on using
the Kalman filter for general state-space modelling. TVARYING is used for the specific
situation where you are estimating a linear model with time-varying coefficients.
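As a hedged one-equation sketch (series name, lag length, and drift variance all hypothetical): the equation below has two coefficients (CONSTANT and one lag), and only the first is allowed to drift.

dec symm tvs1(2,2)
compute tvs1=%const(0.0)
compute tvs1(1,1)=1.e-6
system 1
variables y
lags 1
det constant
tvarying tvs1
end(system)

The KFSET matrices would still need to be set before executing KALMAN.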
See Also . . .
See the SYSTEM, KFSET, KALMAN instructions.
Parameters
datatype Data type you want to assign to the listed parameter names.
This can be any of the basic and aggregate data types that rats
supports. By default, parameters for a PROCEDURE are type
INTEGER, called by value, while those for a FUNCTION are type
REAL, called by value.
list of names The formal parameters of the procedure which are to have this
type. Put * in front of a name if you want it to be called by
address. Otherwise, it will be called by value.
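As a hedged sketch (procedure and parameter names hypothetical), a procedure that returns a result through a by-address parameter might declare its parameters like this:

procedure meanof x result
type series x
type real *result
*
statistics(noprint) x
compute result=%mean
end

A caller could then use, for example, @meanof gdp gdpmean to put the mean of GDP into the real variable GDPMEAN.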
Wizard
You can use the Time Series—Single-Equation Forecasts wizard to generate
forecasts.
Parameters
series The series into which you wish to save the forecasts. rats will
create this series if it doesn’t already exist.
start end The range over which forecasts are to be computed. You can
also use the FROM, TO, or STEPS options instead.
Options
equation=equation to forecast
The name or number of the equation to forecast. If you omit this option,
UFORECAST will use the most recently estimated regression.
print/[noprint]
PRINT will display the forecasted and actual values in the output window.
simulate/[nosimulate]
SIMULATE draws independent N(0, σ²) shocks over the forecast period, where σ² is
the residual variance for the equation or regression being forecast.
bootstrap/[nobootstrap]
BOOTSTRAP draws shocks over the forecast period randomly with replacement
from the residuals associated with the equation or regression being forecast.
Because this is a simple shuffling of the residuals, it would not be completely
appropriate for a model with moving average terms if you’re bootstrapping an
entire sample.
Variables Defined
%FSTART Starting entry of forecasts (INTEGER)
%FEND Ending entry of forecasts (INTEGER)
Examples
The command below forecasts an equation called GDPEQ for the four quarters of 2004.
The forecasts are saved into GDPFORE.
uforecast(equation=gdpeq,from=2004:1,steps=4) gdpfore
This example, from Tsay (2010), uses the @UForeErrors procedure to get
benchmark forecasts using the sample mean, then compares the errors with those
from an AR(1) estimation. See TSAY3P203.RPF for the full example.
This example, taken from Diebold (2004), uses UFORECAST to generate forecasts
from an ARIMA model. See the file DIEB3P216.RPF for the full example.
boxjenk(regressors,ar=3,noconstant,define=ar3eq) lsales $
1968:1 1993:12 resids
# time time2 seasons{0 to -11}
Compute the forecasts and their standard errors over the year 1994, and generate
upper and lower 95% confidence bands.
ufore(stderrs=stderrs,equation=ar3eq) fcst 1994:1 1994:12
set upper 1994:1 1994:12 = fcst+1.96*stderrs
set lower 1994:1 1994:12 = fcst-1.96*stderrs
Create a dummy variable to be used in shading the forecast period. Since we’re
doing forecasts out to 1998:12 later on, we create this over that entire period.
set forezone * 1998:12 = t>=1994:1
graph(header="History and 12-Month-Ahead Forecast",$
shading=forezone) 4
# lsales 1992:1 1993:12
# fcst 1994:1 1994:12
# upper 1994:1 1994:12
# lower 1994:1 1994:12
until condition
instruction or block of instructions to be executed until condition is true.
Parameters
condition This is the expression that UNTIL tests. It can either be an
integer or a real-valued expression. The condition is true if it
has a non-zero value (such as 1) and false if it has value zero.
Usually you construct it using logical and relational operators.
Statement Block
The statement block can be a single instruction, or a group of instructions enclosed in
braces. It can also be a DO, DOFOR, WHILE, UNTIL or LOOP loop, or an IF-ELSE block.
If you have any doubt about whether you have set up a block properly, just enclose
the statements that you want executed in { and }.
If the UNTIL instruction is the first instruction in a compiled section, you must
enclose the loop in an extra set of { and } to signal the limits of the compiled section.
Comments
Usually, the condition compares two variables, and it is quite common for one of
these to be set only within the statement block following UNTIL. Since the
condition is processed first (though not executed first), rats will not yet have seen
this variable. To get around this, you must DECLARE any variable which is
introduced later in the loop.
Example
set smpl = 1.0
compute lastnobs=-1
{
until lastnobs==%nobs {
compute lastnobs=%nobs
linreg(smpl=smpl) depvar
# constant testv1 testv2 testv3 testv4
set normalize = abs(%resids/sqrt(%seesq))
set smpl = smpl.and.(normalize<3.00)
}
}
This keeps running a regression, each time dropping all observations with residuals
larger than three standard errors until no more observations are dropped.
Parameters
id_num>>"string" The parameter list specifies which menu items will be affected
by the current USERMENU command. The id_numbers are
integer values used to identify each menu item. For example,
you might use 1, 2, 3 and 4, or 1000, 1001, 1002, and 1003 as
id_numbers. The menustrings supply the corresponding
names that will appear in the pull-down menu. They are
required when you create the menu with ACTION=DEFINE, but
are optional with ACTION=MODIFY.
When you define the menu, the parameter list must include
all of the menu items you want in the menu. You cannot add
or remove menu items after you have defined the menu.
With the ACTION=MODIFY option, the parameters specify the
menu items you wish to modify. Use this to change the name
of the item(s) (by supplying a different menustring), to
enable or disable, and check or uncheck items.
All menu items must have a unique id_num, even if you are
doing multiple USERMENU’s. See the section on “Using
Multiple USERMENU’s” for more information.
Options
id=menu identification number [0]
If you want to use two or more USERMENU’s simultaneously, you must use the ID
option with ACTION=DEFINE to assign a unique, non-zero integer identification
number to the USERMENU being defined. You can then use that number to refer to
the menu in later operations. Do not use the ID option when working with only
one menu. See "Using Multiple USERMENU’s" for details.
action=define/modify/[run]/remove
Use ACTION=DEFINE to define a menu, set its menu bar title, and the list of
menu items. This is the first step. Note that the menu does not go into the menu
bar until you do ACTION=RUN. This gives you time to modify its appearance after
the initial definition.
ACTION=RUN actually adds your menu(s) to the menu bar (if they are not already
there) and suspends the execution of the program until the user makes a
selection from a menu. Your menu is in (almost) complete control of the program.
The user has access to the other menu bar operations, but can execute no other
rats instructions until you say so. You should provide some form of a quit or
abort item in your menu to allow the user to get out when she is done. When the
user has selected an item from a USERMENU menu, execution continues with the
line following the USERMENU(ACTION=RUN) instruction. The variable
%MENUCHOICE is set to the id_number of the selected item.
ACTION=REMOVE removes the menu(s) from the menu bar. It is your job to do this
when you no longer need the menu(s).
title="title string"
The TITLE option sets the title of the menu as it will appear in the menu bar.
Use this option with ACTION=DEFINE. You should keep this short (single word)
so the menu bar does not get cluttered.
enable=no/yes
With ACTION=MODIFY, you can use ENABLE=YES to enable the specified menu
item(s), or ENABLE=NO to disable them. Use ENABLE=NO to prevent the user of
your program from selecting items which cannot be executed yet, possibly
because other steps must be completed first. All items are enabled initially.
check=no/yes
Use CHECK=YES with ACTION=MODIFY to place “check marks” in front of the
specified menu item(s), and CHECK=NO to remove them. You can use this to
indicate to the user that a “switch” is off or on, or that a certain operation has
been completed.
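As a hedged sketch (id number hypothetical), a check mark could record that the task for item 2 has been completed:

* After the user selects item 2 and its task completes:
if %menuchoice==2
   usermenu(action=modify,check=yes) 2

On the next ACTION=RUN, item 2 would then appear with a check mark.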
Examples
The trivial example below creates a simple menu with three items. Selecting either of
the first two items causes rats to display a message in the output window. The third
item closes the menu.
* Define a menu with three items:
usermenu(action=define,title="Test") $
1>>"First item" 2>>"Second item" 3>>"Quit menu"
* Begin loop
loop
* Activate menu:
usermenu(action=run)
Parameters
newseries The series whose lag(s) you want to add to the equation. Use
the variable name %MVGAVGE to add a moving average lag.
list of lags The lags of newseries to add to the equation. Use the second
form (without the keyword LAGS or the list of lags) if you
just want to add the zero lag.
Option
print/[noprint]
PRINT prints the modified equation.
Examples
linreg(define=gdpeq) gdp
# constant gdp{1} m1{0 to 3}
modify gdpeq
vadd %mvgavge lags 1 2
iterate gdpeq
This estimates an equation by OLS, then adds two moving average lags and
re-estimates it using ITERATE.
See Also
MODIFY Required to initiate modification of an equation.
VREPLACE Replaces a variable in an equation with a transformation of a
second variable.
Wizard
The Time Series—VAR (Setup/Estimate) wizard provides an easy, dialog-driven
interface for defining and estimating VAR models.
Parameters
list of ... These are the dependent variables for the equations in the
system.
If you are setting up an error-correction model using ECT, list
the undifferenced variables. The restrictions are handled
automatically by rats.
Comments
If you do not use a prior, the order of subcommands VARIABLES, LAGS and
DETERMINISTIC is unimportant.
Example
system(model=var12)
variables tbillus tbillcan m1us m1can exrcan
lags 1 to 12
det constant
end(system)
sets up a five variable VAR with twelve lags on each variable.
See Also
SYSTEM, LAGS, DETERMINISTIC, SPECIFY, ECT
Wizard
You can use the Statistics—Covariance Matrix wizard to compute a covariance
matrix. You can also compute a covariance matrix for a set of series by using View—
Series Window to open a Series Window, selecting the desired series, and clicking on
the toolbar icon.
Parameters
start end Range of entries to use. If you have not set a SMPL, this defaults
to the common defined range of all the listed series.
Supplementary Card
Lists the set of series for which VCV will compute the covariance matrix. Note that
this is a list of series only: you cannot use regression format. If you need something
more general, use CMOM with the proper set of options.
Options
[print]/noprint
Use NOPRINT to suppress the printing of the computed matrix. rats prints the
matrix with covariances on and below the diagonal and correlations above the
diagonal. See the example on the next page.
centered/[nocentered]
By default, VCV does not subtract means out of the input series. If you use the
option CENTERED, it does.
window="Title of window"
If you use the WINDOW option, the output is displayed in a (read-only) spreadsheet
window. You can export the contents of this window to various file formats using
File—Export....
Examples
vcv(matrix=v) 1921:1 1941:1
# rcons rinv rwage
computes and prints the covariance matrix of three series, creates the 3×3
SYMMETRIC array V and saves the result there. A sample output is given below. This
has the covariances on and below the diagonal and the correlations above it. The
matrix V will have just the covariances.
Covariance\Correlation Matrix
RCONS RINV RWAGE
RCONS 0.89175982593 0.3010684027 -0.5780087046
RINV 0.41131881809 2.09304660284 0.3863244971
RWAGE -0.39361453883 0.40304589112 0.52002665162
vcv(center)
# xjpn xfra xsui
compute qbar=%cvtocorr(%sigma)
Computes the covariance matrix of the three series XJPN, XFRA and XSUI and
computes QBAR as the correlation matrix using %CVTOCORR. (%SIGMA and the
MATRIX option both give covariance matrices.) Since NOCENTER is the default, this
uses CENTER to take the means out of the data.
This computes the covariance matrix of the VECT[SERIES] called ULIST and saves it
into the matrix V.
Technical Information
VCV will, in general, give the same result as the SIGMA options on instructions such
as ESTIMATE, NLSYSTEM and SUR if applied to the residuals from those. If ut is the
column vector of the group of series at time t, then the estimate is
(2)  Σ̂ = (1/T) ∑(t=1 to T) u_t u_t′
Note the use of the T divisor, without adjustment for “degrees of freedom.” The T
divisor gives the maximum likelihood estimator (in general), and will thus give a
matrix which can be used in further likelihood-based analysis, such as testing and
restricted modeling.
VCV will, in general, give a different result than instructions or functions which
remove the means in the course of their calculations (examples are the function
%COV, and the CMOM instruction when used with the CENTER or CORR options).
Also, because the standard errors of estimate in regression outputs are corrected for
degrees of freedom, there will also be a scale factor difference between the SEESQ’s in
the output and the diagonal elements produced by VCV.
Missing Values
Any observation which has a missing value for any of the series will be dropped from
the calculation.
Variables Defined
%SIGMA computed covariance matrix (SYMMETRIC)
%LOGDET log determinant of the matrix (REAL)
%NOBS number of observations (INTEGER)
%NVAR number of variables (INTEGER)
%MEANV VECTOR of means of the variables (only if CENTER option)
See Also
The %COV(a,b) and %CORR(a,b) functions and the CMOMENT instruction compute
similar statistics (but with means subtracted for %COV and %CORR, and CMOM if
used with the CENTER or CORR options). The instruction RATIO takes two sets of
compatible residuals and conducts a likelihood ratio test based upon them.
%CVTOCORR(a) produces a correlation matrix from a covariance matrix.
Parameters
oldseries The variable in the old equation that you want to replace.
newseries The variable replacing oldseries.
transform The transformation which was done originally to newseries to
produce oldseries. These are described below. You can omit
transform if you simply want to replace oldseries by
newseries.
number Depends upon the transformation as indicated above.
Supplementary Cards
For a transform of FILTER, include the same supplementary cards which you used
on the FILTER instruction.
Options
print/[noprint]
If you use the PRINT option, rats prints the new equation.
Description
VREPLACE replaces variable oldseries by the indicated transformation(s) of variable newseries (transform=SWAP is explained on the previous page).
Examples
instruments constant dshift1 dshift2 sshift1 sshift2 sshift3
linreg(inst,frml=demandeq) price
# constant quantity dshift1 dshift2
linreg(inst,define=supplyee) price
# constant quantity sshift1 sshift2 sshift3
modify supplyee
vreplace price by quantity swap
frml(equation=supplyee) supplyeq
group market demandeq>>f_price supplyeq>>f_quant
estimates a supply-demand system by two-stage least squares, with PRICE as the left
hand side variable in both equations. The MODIFY and VREPLACE instructions replace
PRICE with QUANTITY on the left side of the supply equation, and the FRML converts
the equation into a formula. The GROUP instruction groups the two formulas into a
system which will determine both PRICE and QUANTITY.
boxjenk(ar=1,ma=1,define=yprewh) y / yres
boxjenk(ar=2,ma=0,define=xprewh) x / xres
boxjenk(ma=1,inputs=1,define=trfunc) yres
# xres 0 1 0
modify trfunc
vreplace yres by y prewhitened yprewh
vreplace xres by x prewhitened xprewh
computes a transfer function model of Y on X. This first prewhitens Y and X by
ARMA(1,1) and ARMA(2,0) models, respectively.
See Also
MODIFY, VADD
while condition
instruction or block of instructions to be executed as long as condition is true.
Parameters
condition This is the expression that rats tests. It can either be an integer or a real-valued expression. The condition is false if it has
a value zero and true if it has any non-zero value. Usually, you
construct it using logical and relational operators.
Description
When rats encounters a WHILE instruction, it tests the condition. If it is true,
rats executes the statement or block of statements following the WHILE instruction,
and then tests the condition again. As long as the condition is true, rats will execute
the statement(s). If the condition tests false, rats immediately skips to the first
instruction following the WHILE block. You can terminate the loop at any point with a
BREAK instruction, or skip directly to the next condition check with NEXT.
Example
compute count=0, i=0, sum=0.0
{
while count<20 {
compute i=i+1
if x(i)>0.0
compute sum=sum+x(i),count=count+1
}
}
computes the sum of the first 20 positive values of X.
Parameters
cseries Complex series to smooth.
start end Range of entries to smooth. This defaults to the defined range
of cseries. It should always be the range used for the Fourier
transform.
newcseries Series for the resulting smoothed series. By default, same as
cseries.
newstart Starting entry for the smoothed series. By default, same as
start.
Options
type=[flat]/tent/quadratic/triangular
form=VECTOR with the window form
TYPE selects from the types of windows that rats provides: flat, tent-shaped,
or quadratic. FORM allows you to choose your own form for the window. See
"The Window Options" for details.
mask=masking series
Use this option when you are going to apply a mask to the bands of the spectrum.
The smoothing transformation for entries near a masked-out band becomes
increasingly one-sided so it gives no weight to the excluded ordinates. After
WINDOW, you should multiply the smoothed series by the masking series.
Description
For a continuous spectral density there are two requirements for consistency:
• The window width must go to infinity as the number of data points increases
(ensuring that the variance goes to zero).
• The window width must increase at a rate slower than the increase in the
number of data points (ensuring that bias goes to zero).
rats uses a default width based upon the square root of the number of ordinates. It
offers several choices for the type of window, explained below.
rats uses the following formula to smooth series I to produce S:
S(j) = \sum_{k=-(n-1)}^{n-1} w_k \, I(j+k)
where m is the window width, n = (m+1)/2, and the w's are the window weights.
Series I is considered to be periodic of period end—in smoothing the last entries of
I, the window wraps around to use entries 1,2, etc. when the formula above asks for
end+1,end+2,etc.
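The circular (wrap-around) smoothing described above can be sketched in a few lines of Python. This is an illustration only, not rats's internal code; the weights and series here are hypothetical:

```python
def smooth_circular(I, w):
    """Smooth a periodic series I with a symmetric window.

    w holds one side of the window: w[0] is the center weight, w[k] the
    weight at offsets +k and -k; indices wrap around modulo len(I).
    """
    N, n = len(I), len(w)
    out = []
    for j in range(N):
        s = w[0] * I[j]
        for k in range(1, n):
            s += w[k] * (I[(j + k) % N] + I[(j - k) % N])
        out.append(s)
    return out

# Flat window of width 3 (weights 1/3 each) on a 4-point "periodogram":
# the spike at entry 0 is spread over its circular neighbors.
S = smooth_circular([4.0, 0.0, 0.0, 0.0], [1 / 3, 1 / 3])
```

Because the weights sum to one, the total mass of the series is preserved.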
For a mean zero series, the ordinate for frequency zero has, perforce, a value of zero. Since most spectral analysis is done with mean zero series, WINDOW treats zero frequency as if it were a masked entry. This means that the moving averages for ordinates near this point are adjusted so they give no weight to the zero value. To include the zero frequency of cseries in the moving averages, you must use a MASK option with a masking series that has ones in all entries. CSET is the easiest way to create such a series.
Use the FORM option for a general symmetric window. The VECTOR is dimension n and provides the values for w_0, w_1, ..., w_{n-1}, that is, just one side of the window. rats automatically reflects this for the negative k's and scales so the weights sum to one.
For instance, the quartic window of width 11 could be done by:
declare vector mytent(6)      ;* 6=(11+1)/2
ewise mytent(k)=6^4-(k-1)^4   ;* max at k=1, zero at k=7
window(form=mytent,width=11) 1 / 2
Comments
The choice of window type is largely a matter of personal preference. With relatively
smooth time series, it makes little difference. When the spectrum is likely to have
sharp features, the FLAT window, used in conjunction with a taper (instruction TAPER), is probably the best choice. If you are expecting some interesting features
(such as peaks at the seasonals), it is usually a good idea to try several window
widths, as very narrow peaks may get flattened by a window that is too wide.
Variables Defined
%EDF Equivalent Degrees of Freedom (REAL)
%EBW Equivalent Band Width (REAL)
EDF = m for WINDOW=FLAT, where m is the window width
EDF = \frac{3n^3}{2n^2+1} for WINDOW=TENT, TRIANGULAR, with n = (m+1)/2
EDF = \frac{1}{\text{sum of squared weights}} in general
EBW = \frac{\pi \cdot \text{EDF}}{T} where T is the number of ordinates
Note that for a padded series, the computed value for %EBW is correct, but %EDF must be corrected by multiplying by N/T, where N is the number of actual data points.
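The general formula EDF = 1/(sum of squared weights) can be checked against the closed form for the tent window. A Python sketch, using a made-up width of m=5 (so n=3, one-sided weights 3, 2, 1 before normalizing):

```python
def edf(w):
    """Equivalent degrees of freedom: 1 / (sum of squared weights).

    w holds one side of a symmetric window (w[0] = center); the full
    window is reflected and normalized to sum to one first.
    """
    full = [w[k] for k in range(len(w) - 1, 0, -1)] + list(w)
    total = sum(full)
    return 1.0 / sum((x / total) ** 2 for x in full)

# Tent window of width m=5 (n=3): full weights 1,2,3,2,1 before normalizing.
# The closed form 3*n^3/(2*n^2+1) gives 81/19 for n=3.
val = edf([3.0, 2.0, 1.0])
```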
Example
fft 1
cmult(scale=1.0/(2*%pi*%nobs)) 1 1
window(width=9) 1 / 2
window(width=15,type=triangular) 1 / 3
produces series 2 as the periodogram 1 smoothed with flat window of width 9 and
series 3 as 1 smoothed with a tent, width 15.
Parameters
arrays,... These are the objects to be displayed. WRITE will put each on a
separate line (or set of lines for a matrix).
Options
unit=[output]/copy/other unit
The unit to which data are written.
format=binary/cdf/[free]/html/prn/rtf/tex/tsd/wks/xls/
xlsx/"(FORTRAN format)"
The format in which the data are to be written. A FORTRAN format should describe the format of a row of an array, not the entire array.
[skip]/noskip
With FORMAT=FREE or FORMAT="(format)", SKIP (the default) puts two
blank lines after each array printed. You can suppress these line skips by using
NOSKIP.
Example
declare rectangular e(3,2)
input e
1.5,2.0,0.0,3.5,2.7,4.6
write "E" e
E
1.500000 2.000000
0.000000 3.500000
2.700000 4.600000
Comments
WRITE does not label the arrays that it prints. You may find it helpful to include a
descriptive string on the WRITE instruction as in the example.
WRITE always writes out arrays one at a time. DISPLAY, used within a loop, can produce output unavailable through WRITE:
do i=1,n
display(unit=copy) x1(i) x2(i) x3(i)
end do i
You can also use REPORT to display arrays. For instance, this will display X1, X2 and
X3 in three columns, put into a common format that takes up 10 character positions.
report(action=define,hlabels=||"X1","X2","X3"||)
report(atrow=1,atcol=1,fillby=cols) x1
report(atrow=1,atcol=2,fillby=cols) x2
report(atrow=1,atcol=3,fillby=cols) x3
report(action=format,width=10)
report(action=show)
Finally, you can also use MEDIT to display an array in a report window, and then
export the contents of the window to a file using the File—Export... operation.
See Also
DISPLAY, MEDIT, REPORT, READ
Wizard
Use the Data/Graphics—X11 wizard.
Parameters
series Series to be seasonally adjusted.
start end Range to use in X11 seasonal adjustment. If you have not set a
SMPL, this defaults to the largest range of series.
Overview
X11 can only be applied to monthly or quarterly series containing at least three years
of data. Note that the x11 method will not necessarily work well with some types of
series:
• It assumes that the seasonal component changes, if at all, in a fairly smooth
fashion from year to year.
• It assumes that the seasonality is primarily a function of the calendar (rather
than weather-based).
If, instead, the series shows a fairly sharp break in its seasonal pattern, or if the seasonal is more a function of some other variable (e.g. weather-driven electric demand), x11 is likely to leave some obvious seasonality in all, or at least part, of the data. One simple way to check how well x11 is doing on a new series is to run it first on the first 3/4 of the data, then on the last 3/4, and compare the overlapping range.
Options
mode=[multiplicative]/additive/pseudoadditive/logadditive
multiplicative/[nomult] (for compatibility with older versions of rats)
The MODE option chooses which of the four adjustment modes to use. These are
described later in the supplement. The older MULTIPLICATIVE option still works,
but we would recommend correcting any of your programs to use the new option.
print=[none]/short/standard/long/full
decimals=number of decimals to show in output [depends upon data]
PRINT controls the amount of output displayed by the X11 instruction:
NONE This produces no output—used for production runs where you don’t
need to examine the output.
SHORT This choice produces the minimum amount of printed output—only
the initial series and final component estimates.
STANDARD This is the standard level of x11 output—still primarily the final
estimates.
LONG This setting includes most of the important intermediate steps in
addition to the standard output.
FULL This reports on every step in the main adjustment process.
DECIMALS controls the number of digits right of the decimal to display in any of
the output. This has no effect on the level of precision of the calculations or the
output series, just the printed output. By default, this depends upon the data.
[graduate]/nograduate
lower=Lower limit standard errors from the mean [1.5]
upper=Upper limit standard errors from the mean [2.5]
Graduating extremes reduces the effect of outliers on the estimates of the seasonal factors. This only makes a great deal of difference if a series has a good deal of non-seasonal variability. The actual process used in graduating extreme values is fairly complicated, but basically, it drops data points which are more than “Upper Limit” standard errors from the mean (within a certain calculation), leaves unchanged those closer to the mean than “Lower Limit” standard errors, and makes a smooth transition from inclusion to exclusion for those in between. The default values for the limits are 1.5 and 2.5.
Before rats 7.1, the following holidays were switch options. These are still supported, though it’s recommended that any of these be done as preliminary factors instead.
All of the holiday shifts are adjusted for long-run mean values, which prevents them
from picking up a spurious trend effect for particular ranges of data.
easter=number of days before Easter at which effect is felt
It’s assumed that the level of activity is different for this number of days before
Easter. This generates a dummy which splits this among February, March and
April based upon the number of days falling in each month. The analogue to the
old EASTER switch option is 21, though the calculation is now done differently.
Examples
This seasonally adjusts DEUIP (German Industrial Production) by applying the multiplicative decomposition. The adjusted series and the seasonal factors are saved into DEUADJIP and DEUFACT, respectively.
calendar(m) 1961:12
allocate 1988:1
open data examx11.rat
data(format=rats) / deuip
x11(mode=mult,adjusted=deuadjip,factors=deufact) deuip
This uses the @GMAUTOFIT procedure to automatically select a model for the log of
the data, and then estimates the selected model using BOXJENK, with automatic detection of outliers. The adjustment series needs to be anti-logged before it can be used as a preliminary set of factors.
The value of FINAL at the end of the data is subtracted before the exp is taken so the
adjustment will leave the end of data value at its observed level. This is done because
the level shift and temporary change dummies are defined from t0 on, and so will give non-zero shift values to the end of the data, rather than the beginning. (The adjusted data will be the same either way; any printed output looks more natural with this correction.) The series is adjusted using the log-additive model, with a full set of
printed output.
@gmautofit(report,diffs=1) ldata
boxjenk(ar=%%autop,diffs=1,ma=%%autoq,sar=%%autops,$
sdiffs=1,sma=%%autoqs,method=bfgs,outliers=standard,$
adjustments=final) ldata
set prior = exp(final-final(2008:7))
x11(mode=logadd,prefactors=prior,print=full) u36cvs
Technical Information
For technical details, please see the X11 Manual Supplement PDF file (included with
the Professional version of rats).
Operator Precedence
“Precedence” is the order in which rats evaluates the operators in an expression.
For example, consider the following instruction.
compute c = 100-5*5
Multiplication has a higher precedence than subtraction, so rats first multiplies 5
times 5. Then, it subtracts this result from 100, returning a final result of 75.
When operators have the same precedence (the same value in the Precedence column
in the table on the previous page), rats evaluates them from left to right, with two
exceptions:
• The exponentiation operators (** , ^ , .^) are evaluated from right to left. For
example, rats will handle A^B^C as A^(B^C)
• In the absence of parentheses, rats will evaluate the rightmost = sign (assignment operator) first, then move left.
Exponentiation takes precedence over negation in expressions of the form -A^B. For example, -A^2 is interpreted as -(A^2), rather than (-A)^2.
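Python happens to follow the same two conventions for its ** operator, so both exceptions can be checked directly (this is an illustration, not rats syntax):

```python
# Exponentiation associates right to left: a ** b ** c == a ** (b ** c)
assert 2 ** 3 ** 2 == 2 ** (3 ** 2) == 512   # not (2 ** 3) ** 2 == 64

# Exponentiation binds tighter than unary minus: -a ** 2 == -(a ** 2)
a = 3
assert -a ** 2 == -9
assert (-a) ** 2 == 9
```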
Using Parentheses
You can use parentheses to control the order in which rats evaluates an expression.
rats will evaluate an expression contained in parentheses first, then move to operations outside the parentheses. If an expression includes nested parentheses (one set
inside another), rats evaluates the expression in the innermost set of parentheses
first, then moves outward. For example:
1+2*3+4*5 = 27
(1+2)*3+4*5 = 29
1+((2*3)+4)*5 = 51
You can use logical operators in conjunction with the %REAL and %IMAG functions to
compare the real or imaginary parts of a complex number. For example, if A and B
are complex:
%real(a) >= %real(b)
is a legal expression.
Autocorrelations
Long-Run Variance/Robust Covariance Calculations
Algorithms
Autocorrelations
Yule–Walker vs Burg Methods
RATS offers two algorithms for computing autocorrelations and partial autocorrelations. The “textbook” calculation is known as METHOD=YULE (for Yule-Walker). This
starts by estimating the autocorrelation at lag k using
(1) \hat{r}(k) = \frac{\sum_{t=k+1}^{T} (x_t - \bar{x})(x_{t-k} - \bar{x})}{\sum_{t=1}^{T} (x_t - \bar{x})^2}
(optionally without subtracting the mean). The Yule–Walker equations then give the partial autocorrelation coefficients in terms of these autocorrelations.
Standard Errors
In large samples, the variance of the autocorrelation estimators is approximately
(3) \mathrm{Var}(\hat{r}(k)) \approx \frac{1}{T} \left( 1 + 2 \sum_{j<k} \hat{r}^2(j) \right)
Q Tests
The Ljung–Box Q statistic for M lags is
(4) Q = T(T+2) \sum_{j \le M} \frac{\hat{r}_j^2}{T-j}
Under appropriate circumstances, for a null hypothesis of no serial correlation, Q is asymptotically distributed as a \chi^2. rats uses (M - DFC option) for the degrees of
freedom. If the series are residuals from an arima model, the DFC option should be
set to the number of estimated arma parameters: BOXJENK saves this in the variable
%NARMA. For other types of residuals, there is no known asymptotic distribution, so
you shouldn’t rely upon this as a formal test.
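Formulas (1) and (4) can be sketched in Python as a cross-check. This is illustrative only; in practice you would use rats's own correlation instructions. The data series is made up:

```python
def acf(x, k):
    """Lag-k autocorrelation, equation (1), with the mean subtracted."""
    T = len(x)
    m = sum(x) / T
    num = sum((x[t] - m) * (x[t - k] - m) for t in range(k, T))
    den = sum((xt - m) ** 2 for xt in x)
    return num / den

def ljung_box(x, M):
    """Ljung-Box Q over lags 1..M, equation (4)."""
    T = len(x)
    return T * (T + 2) * sum(acf(x, j) ** 2 / (T - j) for j in range(1, M + 1))

# A short alternating series is strongly negatively autocorrelated at lag 1
x = [1.0, -1.0, 1.0, -1.0, 1.0, -1.0, 1.0, -1.0]
r1 = acf(x, 1)        # -7/8 for this series
Q = ljung_box(x, 2)
```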
Eicker-White/Heteroscedasticity/Misspecification Consistent
This is the simplest case. The terms in (1) are independent (or close to it), but not
identically distributed. The covariance matrix is approximated as
(2) \mathrm{var}\left( \sum_j z_j \right) \approx \sum_j z_j' z_j
This is done using MCOV with no other options, for correcting covariance matrices
with the ROBUSTERRORS option with none of the other options described here, and for
gmm weight matrices with ZUDEP with none of the other options.
(3) \sum_{l=-L}^{L} \sum_t w_l \, (z_t' z_{t-l})
where w_l is a set of window weights. This is chosen using the LAGS option on any of
the instructions. If the w’s are all one (LWINDOW=FLAT), the matrix in (3) may fail to
be positive semi-definite, which can produce invalid standard errors. Various window
types are provided using the LWINDOW option to avoid this. The formulas for windows
are most easily described by defining
(4) z = \frac{l}{L+1}
Except for the quadratic window, all the window weights are zero when |z|>1.
Note that the phrase bandwidth for these windows usually means L+1, not L itself.
BARTLETT and NEWEYWEST are identical (Bartlett is the historical name for the win-
dow in spectral analysis). For more information on lag windows, see Hamilton (1994),
pages 281-284.
(5) LWINDOW=FLAT w(z) = 1
(6) LWINDOW=NEWEYWEST w(z) = 1 - |z|
(7) LWINDOW=DAMPED w(z) = (1 - |z|)^g, where g is the DAMP option.
(8) LWINDOW=PARZEN w(z) = \begin{cases} 1 - 6z^2 + 6|z|^3 & 0 \le |z| \le \tfrac{1}{2} \\ 2(1 - |z|)^3 & \tfrac{1}{2} \le |z| \le 1 \end{cases}
(9) LWINDOW=QUADRATIC w(z) = \frac{3}{(6\pi z/5)^2} \left( \frac{\sin(6\pi z/5)}{6\pi z/5} - \cos(6\pi z/5) \right)
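The window weights and the scalar version of (3) can be sketched in Python for illustration. This is not rats's internal code; the DAMP exponent g and the summation limits are spelled out literally:

```python
import math

def lwindow(z, kind, g=1.0):
    """Window weight w(z) for z = l/(L+1), following formulas (5)-(9)."""
    az = abs(z)
    if kind == "flat":
        return 1.0
    if kind == "neweywest":              # Bartlett window
        return max(0.0, 1.0 - az)
    if kind == "damped":
        return max(0.0, 1.0 - az) ** g
    if kind == "parzen":
        if az <= 0.5:
            return 1.0 - 6 * az ** 2 + 6 * az ** 3
        return 2.0 * (1.0 - az) ** 3 if az <= 1.0 else 0.0
    if kind == "quadratic":
        a = 6 * math.pi * z / 5
        return 3.0 / a ** 2 * (math.sin(a) / a - math.cos(a))
    raise ValueError(kind)

def long_run_var(z, L, kind="neweywest"):
    """Scalar version of (3): sum over lags -L..L of w_l * z_t * z_{t-l}."""
    T = len(z)
    total = 0.0
    for l in range(-L, L + 1):
        w = lwindow(l / (L + 1), kind)
        for t in range(max(0, l), T + min(0, l)):   # keep t and t-l in range
            total += w * z[t] * z[t - l]
    return total

v = long_run_var([1.0, 1.0], 1)
```

The two Parzen branches agree at |z| = 1/2, so the window is continuous.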
(10) \left( \sum_t z_t \right)' \left( \sum_t z_t \right)
This is positive semi-definite, because it’s a matrix times its transpose. For a single
time series, however, it’s not very useful, because it has rank one. However, if you
add these up across a large number of individuals or categories, you get a full rank,
consistent estimator (big N, small T). See, for instance, Greene (2012), section 11.3.2.
If this is used in computing a covariance matrix, you get clustered standard errors.
LWINDOW=PANEL will do this calculation (clustering on individuals) for a panel data
set. If you want to do the clustering on a variable other than this, use the option
CLUSTER=series with clustering categories. This series should have a
unique value for each category, and should have at least as many categories as you
have variables (regressors) if you want a positive definite result, and should have
many more if you want consistency.
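The clustered calculation amounts to summing the outer products of within-cluster sums, as in (10). A Python sketch with hypothetical scores and cluster labels:

```python
def clustered_meat(z, cluster):
    """Sum over clusters of (sum_t z_t)'(sum_t z_t), per formula (10).

    z : list of T score vectors; cluster : list of T cluster labels.
    Returns a k x k matrix as nested lists.
    """
    k = len(z[0])
    sums = {}
    for zt, c in zip(z, cluster):            # within-cluster sums
        s = sums.setdefault(c, [0.0] * k)
        for i in range(k):
            s[i] += zt[i]
    meat = [[0.0] * k for _ in range(k)]
    for s in sums.values():                  # add each rank-one outer product
        for i in range(k):
            for j in range(k):
                meat[i][j] += s[i] * s[j]
    return meat

# Two clusters: within-cluster sums are [3] and [-1] for a 1-dim score
M = clustered_meat([[1.0], [2.0], [-1.0]], ["a", "a", "b"])
```

Each cluster contributes a rank-one term, which is why many clusters are needed for a full-rank, consistent estimate.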
Multivariate Residuals
If you have more than one series of residuals, the calculation in (3) is done as
(11) \sum_{l=-L}^{L} \sum_t w_l \, (u_t \otimes Z_t)' (u_{t-l} \otimes Z_{t-l})
with the appropriate changes for other options. This arrangement (blocking the matrix by equation) matches with that expected for a weight matrix by the instructions
NLSYSTEM and SUR. Note that this matrix will likely not be full rank if the size of Z
times the size of u is greater than the number of data points.
The NOZUDEP option can only be used with LAGS=0, and only with NOCENTER. It computes the special case of
(12) \Sigma \otimes Z'Z
Date Notation
Picture Codes
SMPL Option
SPREAD Option
SHUFFLE Option
WEIGHT Option
Common Elements
Date Notation
You refer to observations by date in rats using one of the following formats:
year:period annual, monthly, quarterly, other periods per year,
or years per period. For example, for monthly data,
2006:12 is December, 2006. For annual data,
2003:1 is the year 2003.
year:month:day weekly, daily, etc. (2005:5:15 for May 15, 2005).
year:month:day//period intraday data. (2000:12:1//3 for the third obser-
vation on December 1, 2000).
individual//date dated panel data (2//2005:4 for the April, 2005
observation of the second individual and
1//2001:2:1 for the February 1, 2001 observation
of the first individual).
individual//period dated or undated panel data.
You can also use “year:period” for weekly or daily data; it will give you the indicated week or day (business day for business day data) within the year. With seven-day
data, 2005:35 would be the February 4th observation (35th period for that year).
Also, with annual data, you must use “year:1”, and not just “year”, for a date, as
rats won’t recognize it as a date except in that form.
Picture Codes
Picture codes are used to choose the formatting for numerical values. They take the
form #### for integers, or ####, ####.##, or *.## for reals. A complex number will display as (real part, imaginary part), where each part is formatted according to the picture.
A code with no decimal point causes the number to be printed right-justified in a field
whose width is the number of # signs. A code with a decimal point produces a number
the width of the total string, with the decimal point in the indicated location. If you
use * left of the decimal point, it uses as many places as needed to show the number
and no more.
A * by itself requests the default formatting, which, for a DISPLAY instruction, is
generally fourteen total places, typically with five digits right of the decimal. (By
default, graphics instructions format the axis labels to align decimals while using as
few digits as possible).
On all instructions except DISPLAY, the picture code will be in quotes, as it will be
part of an option:
graph(picture="*.##")
For DISPLAY the picture is just another field and isn’t quoted (in quotes, it would be
a literal string):
display ##.## %beta
As examples, if the value is 123.4567, the code ####.## displays it as 123.46, *.## also displays 123.46 (using only the places needed left of the decimal), and ####.#### displays 123.4567.
SMPL Option
Many instructions that deal with series, including most estimation instructions,
have an SMPL option, which allows you to selectively include and omit observations
within the entry range used by the instruction. To use it, either:
• create a series with non-zero values (usually ones) in the entries you wish to
be part of the sample, and zeros in the entries you wish to exclude, or,
• supply an expression as a function of T (this can be a FRML) that evaluates to
true for the entries you want to include, and false for the entries you want to
exclude.
Examples
set small = pop<2000 ;* Will be 1 only when POP<2000
linreg(smpl=small) ...
The first SSTATS operates only on the entries where YESVM is 0 and the second
operates only on the entries where YESVM is 1. NOS will be the count (sum of 1’s) for
the YESVM==0 branch, NNNOS will be the number of entries where YESVM is 0 and
TESTVM<.5, etc.
linreg(smpl=age>=30.and.age<=64,robust) logahe
# yrseduc constant
The LINREG runs only over the entries where AGE is at least 30 and no more than 64.
SPREAD Option
spread=SERIES with assumed residual variances (at least to a scale
factor)
Use the SPREAD option to perform weighted least squares to correct for heteroscedasticity. With this option you provide a series to which the residual variances are assumed to be proportional. (“WEIGHT” options on many packages require you to indicate the reciprocal square root of this series.) Note that there is a separate WEIGHT
option which is for probability-weighted (or stratified) estimates, which is not the
same as a heteroscedasticity correction.
Examples
If variances are assumed to be proportional to the series INCOME:
linreg(spread=income) food
# constant income
SHUFFLE Option
shuffle=SERIES[INTEGER] with entry remapping
The SHUFFLE option is available on some instructions for simplifying the task of
bootstrapping when the data set allows for direct rearrangement of the observations
(if, for instance, the observations are assumed to be independent). It won’t help if a
parametric bootstrap is required in, for instance, a vector autoregression.
Before using SHUFFLE, you need to do a BOOT instruction to generate a
SERIES[INTEGER] with the random remapping entries. If you have a five entry data
set, and BOOT generates the sequence 5, 1, 4, 4, 2, then the STATISTICS instruction
with that sequence input using SHUFFLE will compute the sample statistics of the
input series at those five entries (with 4 repeated and 3 missing).
As with any bootstrapping, you will typically need to embed that within a loop, with a
BOOT generating the remapping, the instruction with SHUFFLE generating statistics,
and then some form of bookkeeping to process that.
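Conceptually, the BOOT/SHUFFLE pattern draws entry indices with replacement and recomputes the statistic on the remapped entries. A Python sketch of one such loop, with a made-up data series and the sample mean standing in for the statistic of interest:

```python
import random

def boot_indices(n, rng):
    """Draw n entry indices with replacement (the role BOOT plays)."""
    return [rng.randrange(n) for _ in range(n)]

def stat_on_shuffle(x, idx):
    """Compute a statistic on the remapped sample (the role SHUFFLE plays)."""
    return sum(x[i] for i in idx) / len(idx)

rng = random.Random(42)          # fixed seed so the draws are reproducible
x = [2.0, 4.0, 6.0, 8.0, 10.0]
# The bookkeeping step: collect the statistic across bootstrap replications
means = [stat_on_shuffle(x, boot_indices(len(x), rng)) for _ in range(200)]
```

The spread of the collected values then approximates the sampling variability of the statistic.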
Example
This is part of the BOOTFGLS.RPF program, which bootstraps a feasible gls estimator with a heteroscedasticity correction. Note that SHUFFLE needs to be used on all three LINREG instructions that are part of the calculation.
boot shuffle
*
* Run the original regression using the bootstrapped sample
*
linreg(shuffle=shuffle,noprint) food
# constant income
*
* Generate ESQ using the original sample data (so it’s
* compatible with the original data)
*
set esq = log(%resids^2)
set z = log(income)
*
linreg(shuffle=shuffle,noprint) esq
# constant z
*
* Get the fitted values (again, this uses the original sample)
* and run the FGLS estimates.
*
prj vhat
linreg(spread=exp(vhat),shuffle=shuffle,noprint) food
# constant income
WEIGHT option
weight=SERIES of (relative) weights for the data points
The WEIGHT option serves a different purpose from the SPREAD option. WEIGHT is
for weighting observations (typically with probability weights), while SPREAD is for
“weighting” data due to assumed variance differences. The difference is subtle, as the
two options will produce the same regression coefficients when given compatible options (if the two are reciprocals). If you are doing “weighted least squares” to correct for heteroscedasticity, use SPREAD, not WEIGHT.
If wt is the series of weights (they don’t have to sum to 1—they are normalized as
part of the calculation), then the mean for any function of the data {xt } is computed
as
\overline{f(x)} = \frac{\sum_t w_t f(x_t)}{\sum_t w_t}
Note that the weight applies to the function, not to the data. In particular for least
squares of y on X:
b = \left( \sum_t w_t X_t' X_t \right)^{-1} \left( \sum_t w_t X_t' y_t \right)
where the sum of weights cancels out, since one copy is inside the inverse and so becomes a reciprocal. By contrast, weighted least squares for heteroscedasticity (in effect) divides the data through by \sigma_t (both y and X in the case of a regression) right at the start and carries that through all calculations.
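The weighted-mean and weighted least squares formulas above can be sketched in Python (one regressor, no intercept, made-up data). Note that rescaling all the weights leaves both results unchanged, since the weights are normalized within the calculation:

```python
def weighted_mean(x, w):
    """Probability-weighted mean: sum(w*x)/sum(w); weights self-normalize."""
    return sum(wi * xi for wi, xi in zip(w, x)) / sum(w)

def weighted_slope(y, x, w):
    """One-regressor weighted LS: b = (sum w*x*x)^(-1) * (sum w*x*y)."""
    num = sum(wi * xi * yi for wi, xi, yi in zip(w, x, y))
    den = sum(wi * xi * xi for wi, xi in zip(w, x))
    return num / den

x = [1.0, 2.0, 3.0]
y = [2.0, 4.0, 9.0]
w = [1.0, 1.0, 2.0]   # hypothetical probability weights
m = weighted_mean(x, w)
b = weighted_slope(y, x, w)
b2 = weighted_slope(y, x, [10 * wi for wi in w])   # same answer: scale cancels
```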
WEIGHT is usually used as part of a much more complicated calculation, as the
probability weights are generated by another calculation. The following is part of
the ROBUSTSTAR.RPF program—it does 10 passes through a loop that adjusts the
weights on observations based upon how extreme the residuals are.
Ansley, C.F. (1979). “An Algorithm for the Exact Likelihood of a Mixed Autoregressive-Moving
Average Process.” Biometrika, Vol. 66, pp. 59-65.
Baltagi, B.H. (2008). Econometric Analysis of Panel Data, 4th Edition. Chichester, UK: Wiley.
Beach, C. and J. MacKinnon (1978). “A Maximum Likelihood Procedure for Regression with Auto-
correlated Errors.” Econometrica, Vol. 46, pp. 51-58.
Box, G.E.P., G.M. Jenkins, and G.C. Reinsel (2008). Time Series Analysis, Forecasting and Control,
4th ed. Hoboken: Wiley.
Burg, J.P. (1967). “Maximum Entropy Spectral Analysis,” Proceedings of the 37th Meeting of the
Society of Exploration Geophysicists.
Campbell, J.Y., A. Lo, and A.C. MacKinlay (1997). The Econometrics of Financial Markets. Princ-
eton: Princeton University Press.
Doan, T.A. (2010). “Practical Issues with State-Space Models with Mixed Stationary and Non-Sta-
tionary Dynamics,” Estima Technical Paper, (1).
Doan, T., R. Litterman, and C.A. Sims (1984). “Forecasting and Conditional Projection Using Real-
istic Prior Distributions.” Econometric Reviews, Vol. 3, pp. 1-100.
Durbin, J. (1969). “Tests for Serial Correlation in Regression Analysis Based on the Periodogram of
Least Squares Residuals.” Biometrika, Vol. 56, pp. 1-16.
Estrella, A. (1998). “A New Measure of Fit for Equations with Dichotomous Dependent Variables.”
Journal of Business and Economic Statistics, Vol. 16, pp. 198-205.
Fair, R.C. (1970). “The Estimation of Simultaneous Equation Models with Lagged Endogenous
Variables and First Order Serially Correlated Errors.” Econometrica, Vol. 38, pp. 507-516.
Findley, D.F., B.C. Monsell, W.R. Bell, M.C. Otto, and B. Chen (1998). “New Capabilities and
Methods of the X-12-ARIMA Seasonal-Adjustment Program”. Journal of Business and Eco-
nomic Statistics, Vol. 16, No. 2., 127-177.
Gali, J. (1999). “Technology, Employment and the Business Cycle: Do Technology Shocks Explain
Aggregate Fluctuations.” American Economic Review, Vol. 89, pp. 249-271.
Gardner, E.S. (1985). “Exponential Smoothing: the State of the Art.” Journal of Forecasting, Vol. 4,
pp. 1-28.
Glosten, L., R. Jagannathan and D. Runkle (1993) “On the Relation between the Expected Value
and the Volatility of the Nominal Excess Return on Stocks.” Journal of Finance, Vol. 48, pp.
1779-1801.
Greene, W.H. (2012). Econometric Analysis, 7th Edition. New Jersey: Prentice Hall.
Hall, A. (2000). “Covariance Matrix Estimation and the Power of the Overidentifying Restrictions
Test.” Econometrica, Vol. 68, pp. 1517-1528.
Hamilton, J.D. (1994). Time Series Analysis. Princeton: Princeton University Press.
Hansen, L.P. (1982). “Large Sample Properties of Generalized Method of Moments Estimators.”
Econometrica, Vol. 50, pp. 1029-1054.
Hodrick, R. and E. Prescott (1997) “Post-War U.S. Business Cycles: An Empirical Investigation.”
Journal of Money, Credit and Banking, Vol. 29, No. 1, pp 1-16.
Jarque, C.M. and A.K. Bera (1987). “A Test for Normality of Observations and Regression Residu-
als.” International Statistical Review, Vol. 55, pp. 163-172.
Johnston, J. and J. DiNardo (1997). Econometric Methods, 4th Edition. New York: McGraw Hill.
Kendall, M.G. and A. Stuart (1958). The Advanced Theory of Statistics, Vol. 1. New York: Hafner.
King, R., C. Plosser, J. Stock and M. Watson (1991). “Stochastic Trends and Economic Fluctua-
tions.” American Economic Review, Vol. 81, No. 4, pp. 819-840.
Koenker, R. and G. Bassett, Jr. (1978). “Regression Quantiles.” Econometrica, Vol. 46, pp. 33-50.
Keane, M. P. and D. E. Runkle (1992). “On the Estimation of Panel-Data Models with Serial Cor-
relation When Instruments Are Not Strictly Exogenous.” Journal of Business & Economic
Statistics, Vol. 10, No. 1, pp. 1-9.
L’Ecuyer, P. (1999). “Good Parameter Sets for Combined Multiple Recursive Random Number Gen-
erators.” Operations Research, Vol. 47, pp. 159-164.
Ljung, G.M. and G.E.P. Box (1978). “On a Measure of Lack of Fit in Time Series Models.” Biometri-
ka, Vol. 65, pp. 297-303.
Nelson, D.B. (1991) “Conditional Heteroskedasticity in Asset Returns: A New Approach.” Econo-
metrica, Vol. 59, pp. 347-370.
Nyblom, J. (1989) “Testing for Constancy of Parameters Over Time”, Journal of the American Sta-
tistical Association, Vol. 84, pp. 223-230.
Olsen, R. (1978). “A Note on the Uniqueness of the Maximum Likelihood Estimator in the Tobit
Model”, Econometrica, Vol. 46, pp. 1211-1215.
Pindyck, R. and D. Rubinfeld (1998). Econometric Models and Economic Forecasts, 4th Edition.
New York: McGraw-Hill.
Press, W.H., B.P. Flannery, S.A. Teukolsky and W.T. Vettering (2007). Numerical Recipes, 3rd Edi-
tion. New York: Cambridge University Press.
Sims, C.A. (1980). “Macroeconomics and Reality.” Econometrica, Vol. 48, pp. 1-49.
Sims, C.A. (2002). “Solving Linear Rational Expectations Models”, Computational Economics, Octo-
ber 2002, Vol. 20, Nos. 1-2, pp. 1-20.
Tsay, R.S. (2010). Analysis of Financial Time Series, 3rd Edition. New York: Wiley.
Verbeek, M. (2008). A Guide to Modern Econometrics, 3rd Edition. New York: Wiley.
West, M. and J. Harrison (1997). Bayesian Forecasting and Dynamic Models, 2nd Edition. New
York: Springer-Verlag.
Wooldridge, J. (2010). Econometric Analysis of Cross Section and Panel Data, 2nd Edition. Cam-
bridge, Mass.: The MIT Press.