INDIAN INSTITUTE OF TECHNOLOGY KANPUR
Introduction and Background
• Introduction to panel data methods
• Issues with the OLS model in the presence of panel data
• Panel data: Least squares dummy variable (LSDV) method
• Panel data: First difference (FD) estimators
• Panel data: Fixed effects (FE) estimators
• Panel data: Random effects (RE) estimators
• Panel data: FE vs. RE estimators
• Summary and concluding remarks
Introduction to Panel Data Methods
Relationship between security returns 𝑟𝑖𝑡 and order imbalance 𝑂𝐼𝐵𝑖𝑡
• Here $OIB_{it} = \frac{BuyVolume - SellVolume}{BuyVolume + SellVolume}$
• 𝑟𝑖𝑡 = 𝑎0 + 𝑎1 𝑂𝐼𝐵𝑖𝑡 + 𝑣𝑡 + α𝑖 + µ𝑖𝑡
• Assume 10 years and 100 securities
• 𝑣𝑡 + α𝑖 + µ𝑖𝑡 together form the error term; let us discuss the components one by one
• 𝑣𝑡 (t = 1, …, T) is a purely time-dependent term, e.g., broad market-wide changes
• These time-dependent terms do not vary across securities and can be accounted for by T-1 (10-1 = 9) time dummy variables [i.e., least squares dummy variable estimation]
• α𝑖 (i = 1, …, N) is the security-specific term, capturing characteristics such as firm size, firm beta, industry, etc., that do not change frequently over time
• Usually T, the number of periods, is small, and N, the number of individual entities, is large
• So, accounting for α𝑖 through the dummy variable method needs N-1 extra parameters and can make the model extremely inefficient
• So, what if we do not account for this α𝑖?
Issues with the Ordinary Least Squares (OLS) Model
[Figure: Fitting OLS Through Scattered Data Points]
[Figure: Fitting OLS with Panel Data. Axes: Return and OIB; points labelled by security: ONGC, HDFC, JSW, ICICI, TATA, NTPC, INFY, WIPRO, RELIANCE, TCS, L&T, ADANI, ITC, BPCL]
Pooled OLS Estimation with Panel Data
Relationship between security returns 𝑟𝑖𝑡 and order imbalance 𝑂𝐼𝐵𝑖𝑡
• 𝑟𝑖𝑡 = 𝑎0 + 𝑎1 𝑂𝐼𝐵𝑖𝑡 + 𝑣𝑡 + 𝛼𝑖 + µ𝑖𝑡
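• Pooled OLS leaves 𝛼𝑖 in the composite error: 𝑛𝑖𝑡 = 𝛼𝑖 + µ𝑖𝑡 [𝛼𝑖: unobserved heterogeneity]
• If 𝛼𝑖 is correlated with the regressor, Cov(𝑛𝑖𝑡, 𝑂𝐼𝐵𝑖𝑡) ≠ 0 [Problem of endogeneity]
• Cov(𝑛𝑖𝑡, 𝑛𝑖𝑡+1) = Cov(µ𝑖𝑡 + 𝛼𝑖, µ𝑖𝑡+1 + 𝛼𝑖) ≠ 0, since 𝛼𝑖 appears in every period [Problem of autocorrelation]
• With endogeneity, pooled OLS estimates will be biased and inconsistent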
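The endogeneity problem can be illustrated with a small simulation. This is only a sketch on synthetic data: the sample sizes, the true coefficients (a0 = 0.5, a1 = 2.0), the strength of the correlation between OIB and the unobserved effect, and the function names are all assumptions made for illustration.

```python
import numpy as np

def simulate_panel(n_sec=100, n_per=10, a0=0.5, a1=2.0, seed=0):
    """Simulate r_it = a0 + a1*OIB_it + alpha_i + mu_it with Cov(OIB, alpha) != 0."""
    rng = np.random.default_rng(seed)
    alpha = rng.normal(size=(n_sec, 1))                   # unobserved heterogeneity
    oib = 0.8 * alpha + rng.normal(size=(n_sec, n_per))   # regressor correlated with alpha
    mu = rng.normal(size=(n_sec, n_per))                  # idiosyncratic error
    r = a0 + a1 * oib + alpha + mu
    return r, oib

def pooled_ols_slope(r, oib):
    """Slope from regressing the stacked r on the stacked OIB with an intercept."""
    x = oib.ravel() - oib.mean()
    y = r.ravel() - r.mean()
    return float(x @ y / (x @ x))

r, oib = simulate_panel()
slope = pooled_ols_slope(r, oib)
print(f"true a1 = 2.0, pooled OLS estimate = {slope:.3f}")
```

Because alpha_i is left in the error term and moves together with OIB, the pooled slope in this setup lands noticeably above the true value of 2.0.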
Least Squares Dummy Variable (LSDV) Estimators
LSDV Estimators
Relationship between security returns 𝑟𝑖𝑡 and order imbalance 𝑂𝐼𝐵𝑖𝑡
• Assuming that time-varying effects can be modeled using time-dummies
• 𝑟𝑖𝑡 = 𝑎0 + 𝑎1 𝑂𝐼𝐵𝑖𝑡 + 𝛼𝑖 + µ𝑖𝑡 (1)
• Include N-1 dummy variables for N securities ($S_2, S_3, \ldots, S_N$) as follows
• $r_{it} = a_0 + a_1 OIB_{it} + \sum_{n=2}^{N} a_n S_n + \mu_{it}$ (2)
• Here, 𝑆2 is a dummy variable that takes a value of 1 for security 2, and 0
otherwise; and so on for securities 3, 4,…, N
• Thus, we are explicitly accounting for the unobserved heterogeneity for each
security individually
• $r_{it} = a_0 + a_1 OIB_{it} + \sum_{n=2}^{N} a_n S_n + \mu_{it}$
• The estimate of 𝑎1 is consistent when Cov(µ𝑖𝑡, 𝑂𝐼𝐵𝑖𝑡) = 0; no serial correlation and homoscedasticity in the error terms are additionally needed for efficiency and for the usual standard errors to be valid
• The LSDV estimate of 𝑎1 is the same as the fixed-effects (FE) estimate
• However, dummy variables allow estimating the 𝛼𝑖 explicitly, unlike FE estimators
• The drawback is that the model is not parsimonious with large N
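The LSDV regression and its link to FE can be sketched as follows. The data are synthetic and every number (5 securities, 10 periods, true a1 = 2.0) is an assumption chosen for illustration; the dummy columns correspond to S_2, ..., S_N with security 1 as the base category.

```python
import numpy as np

rng = np.random.default_rng(1)
n_sec, n_per, a1_true = 5, 10, 2.0
alpha = rng.normal(size=n_sec)                     # security-specific effects
sec = np.repeat(np.arange(n_sec), n_per)           # security id of each row
oib = 0.8 * alpha[sec] + rng.normal(size=n_sec * n_per)
r = 0.5 + a1_true * oib + alpha[sec] + rng.normal(size=n_sec * n_per)

# LSDV design matrix: intercept, OIB, and N-1 dummies (security 1 is the base)
dummies = (sec[:, None] == np.arange(1, n_sec)).astype(float)
X = np.column_stack([np.ones_like(oib), oib, dummies])
coef, *_ = np.linalg.lstsq(X, r, rcond=None)
a1_lsdv = coef[1]

def demean(v):
    """Subtract each security's own time average."""
    means = np.bincount(sec, weights=v) / np.bincount(sec)
    return v - means[sec]

# Within (FE) slope for comparison
a1_fe = demean(oib) @ demean(r) / (demean(oib) @ demean(oib))
print(f"LSDV a1 = {a1_lsdv:.6f}, FE a1 = {a1_fe:.6f}")
```

The two slopes agree up to floating-point error, while the design matrix already has N+1 columns; with 100 securities it would have 101.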
First Difference (FD) Estimators
First Differences Estimators
Relationship between security returns 𝑟𝑖𝑡 and order imbalance 𝑂𝐼𝐵𝑖𝑡
• Assuming that time-varying effects can be modeled using time-dummies
• 𝑟𝑖𝑡 = 𝑎0 + 𝑎1 𝑂𝐼𝐵𝑖𝑡 + 𝛼𝑖 + µ𝑖𝑡 (1)
• 𝑟𝑖𝑡−1 = 𝑎0 + 𝑎1 𝑂𝐼𝐵𝑖𝑡−1 + 𝛼𝑖 + µ𝑖𝑡−1 (2)
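• Subtract (2) from (1); 𝑎0 and 𝛼𝑖 cancel out
• 𝛥𝑟𝑖𝑡 = 𝑎1 𝛥𝑂𝐼𝐵𝑖𝑡 + 𝛥µ𝑖𝑡 (3)
• With 𝛼𝑖 removed, Cov(𝛥µ𝑖𝑡, 𝛥𝑂𝐼𝐵𝑖𝑡) = 0, provided the idiosyncratic errors are uncorrelated with 𝑂𝐼𝐵 in both periods
• This model can therefore be estimated with pooled OLS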
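• $Cov(\Delta\mu_{it}, \Delta\mu_{it-1}) = Cov(\mu_{it} - \mu_{it-1}, \mu_{it-1} - \mu_{it-2}) \neq 0$, because $\mu_{it-1}$ appears in both differences
• Hence, there is an issue of autocorrelation due to first differencing
• Differencing leaves only a small variation in the variables and therefore leads to a considerable increase in the standard errors of the estimates
• Loss of observations: the first period of each security is lost
• Time-invariant factors cannot be estimated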
• All terms with no variation across time will be eliminated; so, we need the dependent and independent variables to have some variation across time and across securities
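A minimal sketch of the FD estimator on synthetic data follows. The panel dimensions and true coefficients are assumptions; the point is only to show the differencing step and the loss of one period per security.

```python
import numpy as np

rng = np.random.default_rng(2)
n_sec, n_per, a1_true = 200, 10, 2.0
alpha = rng.normal(size=(n_sec, 1))                       # unobserved heterogeneity
oib = 0.8 * alpha + rng.normal(size=(n_sec, n_per))       # correlated with alpha
r = 0.5 + a1_true * oib + alpha + rng.normal(size=(n_sec, n_per))

# First-difference along the time axis; a0 and alpha_i drop out
d_r = np.diff(r, axis=1)        # shape (n_sec, n_per - 1): one period lost per security
d_oib = np.diff(oib, axis=1)

# Pooled OLS without an intercept on the differenced data, as in Eq. (3)
a1_fd = float((d_oib * d_r).sum() / (d_oib ** 2).sum())
print(f"FD estimate of a1 = {a1_fd:.3f} (true value {a1_true})")
```

Even though OIB is correlated with alpha_i here, the FD slope stays close to the true value because alpha_i has been differenced away.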
Fixed-Effects (FE) Estimator
Fixed-Effects Estimators
• 𝑟𝑖𝑡 = 𝑎0 + 𝑎1 𝑂𝐼𝐵𝑖𝑡 +𝜶𝒊 +µ𝒊𝒕 (1)
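• Time-average equation (1) for each security: $\bar{r}_i = \frac{1}{T}\sum_{t=1}^{T} r_{it}$, for all $i = 1, 2, 3, \ldots, N$
• $\bar{r}_i = a_0 + a_1 \overline{OIB}_i + \alpha_i + \bar{\mu}_i$ (2)
• Subtract (2) from (1)
• $r_{it} - \bar{r}_i = a_1 (OIB_{it} - \overline{OIB}_i) + (\mu_{it} - \bar{\mu}_i)$
• $\tilde{r}_{it} = a_1 \widetilde{OIB}_{it} + U_{it}$
• Here, $Cov(\widetilde{OIB}_{it}, U_{it}) = 0$, and pooled OLS estimates will be consistent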
• Fixed effects remove any time-constant terms, observed or unobserved
• Fixed effects are costly (due to the transformation of the original data): the time-demeaning uses up one degree of freedom per security
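The time-demeaning step can be sketched in a few lines. The data are synthetic and the dimensions and coefficients are assumptions; the degrees-of-freedom line reflects that N security means plus one slope are estimated from the N*T observations.

```python
import numpy as np

rng = np.random.default_rng(3)
n_sec, n_per, a1_true = 200, 10, 2.0
alpha = rng.normal(size=(n_sec, 1))
oib = 0.8 * alpha + rng.normal(size=(n_sec, n_per))
r = 0.5 + a1_true * oib + alpha + rng.normal(size=(n_sec, n_per))

# Time-demean each security's series; a0 and alpha_i are removed
r_tilde = r - r.mean(axis=1, keepdims=True)
oib_tilde = oib - oib.mean(axis=1, keepdims=True)

# Pooled OLS on the demeaned data
a1_fe = float((oib_tilde * r_tilde).sum() / (oib_tilde ** 2).sum())

# Standard error with the FE degrees of freedom: N*T - N - 1
resid = r_tilde - a1_fe * oib_tilde
dof = n_sec * n_per - n_sec - 1
se_fe = float(np.sqrt((resid ** 2).sum() / dof / (oib_tilde ** 2).sum()))
print(f"FE estimate of a1 = {a1_fe:.3f}, se = {se_fe:.4f}, dof = {dof}")
```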
Fixed-Effects (FE) vs.
First Difference (FD) Estimators
Fixed-Effects vs. First Difference Estimators
• $\tilde{r}_{it} = a_1 \widetilde{OIB}_{it} + U_{it}$
• For T = 2, FD = FE
• For T > 2, FD ≠ FE
• With the assumptions that (a) the sample is large (N → ∞), (b) the error term ($\mu_{it}$) is uncorrelated with the independent variable (e.g., $OIB_{it}$), (c) the sample is random, and (d) there is sufficient variance in the variables, the following hold
• $E[\hat{a}_1^{FD}] = E[\hat{a}_1^{FE}] = a_1$ (both FE and FD estimates of $a_1$ are unbiased)
• $\hat{a}_1^{FD} \xrightarrow{p} a_1$; $\hat{a}_1^{FE} \xrightarrow{p} a_1$ (both estimators are consistent)
• 𝑟𝑖𝑡 = 𝑎0 + 𝑎1 𝑂𝐼𝐵𝑖𝑡 +𝛼𝑖 +µ𝑖𝑡
• (A) $Cov(\mu_{it}, \mu_{it-1}) = 0$: error terms ($\mu_{it}$) are serially uncorrelated
• First differencing then introduces serial correlation in the differenced error terms
• Due to this, the standard errors of the FE estimates are lower (FE is more efficient) than those of the FD estimates: $se(\hat{a}_1^{FE}) < se(\hat{a}_1^{FD})$
• (B) $\mu_{it} = \mu_{it-1} + e_{it}$, i.e., $\Delta\mu_{it} = e_{it}$
• Random walk structure in the error term, i.e., strong autocorrelation in errors
• $se(\hat{a}_1^{FE}) > se(\hat{a}_1^{FD})$
• (C) µ𝑖𝑡 = 𝜌µ𝑖𝑡−1 + 𝑒𝑖𝑡: AR(1) structure in error terms
• If 𝜌 is close to 1, then the FD estimator is more efficient
• If 𝜌 is close to 0, then the FE estimator is more efficient
• One solution is to examine the autocorrelation structure in FD errors
• If FD errors have a clearly negative first-order autocorrelation (about -0.5), that indicates the original errors have no autocorrelation; hence FE is more appropriate
• If FD errors have a very small autocorrelation, that indicates the original errors follow a random walk; hence the FD estimator is more appropriate
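The diagnostic above can be sketched with simulated errors. This is an illustration under assumptions: the true errors are used directly (in practice one would use the residuals from the FD regression), and the helper name fd_error_autocorr and the panel dimensions are hypothetical.

```python
import numpy as np

def fd_error_autocorr(mu):
    """First-order autocorrelation of first-differenced errors, pooled over securities."""
    d = np.diff(mu, axis=1)
    lead, lag = d[:, 1:].ravel(), d[:, :-1].ravel()
    return float(np.corrcoef(lead, lag)[0, 1])

rng = np.random.default_rng(4)
e = rng.normal(size=(500, 20))

white_noise = e                       # case (A): no serial correlation (rho = 0)
random_walk = np.cumsum(e, axis=1)    # case (B): mu_it = mu_it-1 + e_it (rho = 1)

rho_fd_a = fd_error_autocorr(white_noise)
rho_fd_b = fd_error_autocorr(random_walk)
print(f"FD autocorrelation, serially uncorrelated errors: {rho_fd_a:.3f}")
print(f"FD autocorrelation, random-walk errors: {rho_fd_b:.3f}")
```

For serially uncorrelated errors, the differenced errors have Cov = -Var(mu) and variance 2 Var(mu), so the autocorrelation is about -0.5; for random-walk errors it is about 0.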
• For scenarios in between, one can estimate both FD and FE and compare
• For non-stationary processes, first differences are more useful
• For small sample sizes, FD is more appropriate
• For data with a large time dimension, FE estimators are more appropriate
Random Effects Estimator: Part 1
Random Effects (RE) Estimators
• 𝑟𝑖𝑡 = 𝑎0 + 𝑎1 𝑂𝐼𝐵𝑖𝑡 +𝛼𝑖 +µ𝑖𝑡
• Recall that the model has an endogeneity problem if the unobserved heterogeneity ($\alpha_i$) is correlated with one of the independent variables: $Cov(OIB_{it}, \alpha_i) \neq 0$
• Thus, pooled OLS is not effective, and we used FD/FE methods to remove $\alpha_i$ from the model
• However, if $Cov(OIB_{it}, \alpha_i)$ is reasonably close to 0, then we need not apply FD/FE, as they involve a heavy transformation of the data
• E.g., FD leads to a loss of observations (T-1 periods instead of T), and FE uses up one degree of freedom per security
• $Cov(OIB_{it}, \alpha_i) = 0$ is a reasonable assumption in the following cases
• All the relevant variables are accounted for
• $\alpha_i$ is very small relative to other variables
• In this scenario, pooled OLS provides consistent estimates
• However, the errors may still be serially correlated: $Cov(\alpha_i + \mu_{it}, \alpha_i + \mu_{is}) \neq 0$ for $t \neq s$
• This serial correlation can be corrected through RE estimation without imposing a heavy cost on the data (as in FD/FE)
• In this scenario, RE is more efficient than pooled OLS and FE
• If you believe that sufficient variables have been entered in the model and the endogeneity problem [$Cov(OIB_{it}, \alpha_i) \neq 0$] has been resolved
• Then RE is better than FE and OLS
• $r_{it} - \lambda\bar{r}_i = a_0(1-\lambda) + a_1(OIB_{it} - \lambda\overline{OIB}_i) + (n_{it} - \lambda\bar{n}_i)$, where $n_{it} = \alpha_i + \mu_{it}$
• The RE estimator is the pooled OLS estimation of the above transformed equation
• If λ = 0, then RE ≈ pooled OLS
• If λ = 1, then RE ≈ FE
Random Effects Estimator: Part 2
Random Effects (RE) Estimators
• $r_{it} - \lambda\bar{r}_i = a_0(1-\lambda) + a_1(OIB_{it} - \lambda\overline{OIB}_i) + (n_{it} - \lambda\bar{n}_i)$, where $n_{it} = \alpha_i + \mu_{it}$
• Typically, 0≤ λ≤1, hence RE is somewhere between pooled OLS and FE
• What is λ?
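• $\lambda = 1 - \left(\frac{\sigma_u^2}{\sigma_u^2 + T\sigma_a^2}\right)^{0.5}$; here $\sigma_u^2$ is the variance of the error term $\mu_{it}$ and $\sigma_a^2$ is the variance of $\alpha_i$
• If $\sigma_a^2 = 0$, then $\lambda = 0$; that is, $\alpha_i$ is insignificant/not important and RE converges to pooled OLS
• If $T\sigma_a^2 \gg \sigma_u^2$, then $\lambda \to 1$ and RE converges to FE
• Thus, unlike FE (fully time-demeaned), RE is quasi time-demeaned
• RE also allows estimating time-constant terms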
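The behaviour of λ at the two extremes can be checked numerically. The function below is a direct transcription of the formula; the name re_lambda and the variance values passed to it are illustrative assumptions.

```python
import math

def re_lambda(sigma_u2, sigma_a2, n_periods):
    """Quasi-demeaning weight: 1 - sqrt(sigma_u^2 / (sigma_u^2 + T * sigma_a^2))."""
    if sigma_u2 <= 0 or sigma_a2 < 0 or n_periods < 1:
        raise ValueError("need sigma_u2 > 0, sigma_a2 >= 0 and at least one period")
    return 1.0 - math.sqrt(sigma_u2 / (sigma_u2 + n_periods * sigma_a2))

# No unobserved heterogeneity: RE collapses to pooled OLS
print(re_lambda(1.0, 0.0, 10))
# Heterogeneity dominates: RE approaches FE
print(re_lambda(1.0, 1e6, 10))
# An intermediate case with T = 3 and equal variances
print(re_lambda(1.0, 1.0, 3))
```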
• 𝑟𝑖𝑡 = 𝑎0 + 𝑎1 𝑂𝐼𝐵𝑖𝑡 +𝛼𝑖 +µ𝑖𝑡 (1)
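• $r_{it} - \lambda\bar{r}_i = a_0(1-\lambda) + a_1(OIB_{it} - \lambda\overline{OIB}_i) + (n_{it} - \lambda\bar{n}_i)$ (2)
• where $n_{it} = \alpha_i + \mu_{it}$
• $\lambda = 1 - \left(\frac{\sigma_u^2}{\sigma_u^2 + T\sigma_a^2}\right)^{0.5}$
1. The first step is to estimate $\hat{\lambda}$; this requires estimation of Eq. (1) through FE or pooled OLS methods to obtain the variance components
2. Then estimate the transformed Eq. (2), with $\hat{\lambda}$ in place of $\lambda$, using pooled OLS
• This two-step procedure is the RE method of estimation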
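The two steps can be sketched on synthetic data. The way the variance components are obtained here is an assumption, not something the slides specify: the error variance is taken from the FE (within) residuals and the variance of alpha_i from a regression on security means (a between regression). The data-generating values are also assumptions, with alpha_i drawn independently of OIB so that the RE assumption holds.

```python
import numpy as np

rng = np.random.default_rng(5)
n_sec, n_per, a0, a1 = 300, 8, 0.5, 2.0
alpha = rng.normal(size=(n_sec, 1))            # uncorrelated with OIB (RE assumption)
oib = rng.normal(size=(n_sec, n_per))
r = a0 + a1 * oib + alpha + rng.normal(size=(n_sec, n_per))

def ols(x, y):
    """OLS of y on [1, x]; returns (coefficients, residuals)."""
    X = np.column_stack([np.ones(x.size), x.ravel()])
    coef, *_ = np.linalg.lstsq(X, y.ravel(), rcond=None)
    return coef, y.ravel() - X @ coef

r_bar = r.mean(axis=1, keepdims=True)
oib_bar = oib.mean(axis=1, keepdims=True)

# Step 1a: sigma_u^2 from the within (FE) residuals
_, e_within = ols(oib - oib_bar, r - r_bar)
sigma_u2 = e_within @ e_within / (n_sec * (n_per - 1) - 1)

# Step 1b: sigma_a^2 from the between regression, whose error variance is sigma_a^2 + sigma_u^2 / T
_, e_between = ols(oib_bar, r_bar)
sigma_a2 = max(e_between @ e_between / (n_sec - 2) - sigma_u2 / n_per, 0.0)

lam = 1.0 - np.sqrt(sigma_u2 / (sigma_u2 + n_per * sigma_a2))

# Step 2: pooled OLS on the quasi-demeaned data; the intercept estimates a0 * (1 - lam)
coef, _ = ols(oib - lam * oib_bar, r - lam * r_bar)
a0_re, a1_re = coef[0] / (1.0 - lam), coef[1]
print(f"lambda = {lam:.3f}, a0 = {a0_re:.3f}, a1 = {a1_re:.3f}")
```

With equal variances of 1 and T = 8, the population value of λ is 1 - sqrt(1/9) = 2/3, so the estimate should fall near that figure.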
Random Effects Estimator: Part 3
Assumptions of RE
The following assumptions are made for RE estimators to be consistent, i.e., $\hat{a}_1^{RE} \xrightarrow{p} a_1$ (as N → ∞)
• $Cov(OIB_{it}, \alpha_i) = 0$
• Each cross section is randomly sampled
• 𝐸[𝑢𝑖𝑡 |𝑋𝑖𝑡 , 𝛼𝑖 ]=0
• No perfect multicollinearity
• The last three assumptions are applicable to FE/FD also
• Only the first assumption is specific to RE
Estimating Time-Constant Variables with RE
Recall the transformed model
• $r_{it} - \lambda\bar{r}_i = a_0(1-\lambda) + a_1(OIB_{it} - \lambda\overline{OIB}_i) + (n_{it} - \lambda\bar{n}_i)$
• In this model, let us add a time-constant term $Size_i$; the resulting model is
• $r_{it} - \lambda\bar{r}_i = a_0(1-\lambda) + a_1(OIB_{it} - \lambda\overline{OIB}_i) + a_2 Size_i(1-\lambda) + (n_{it} - \lambda\bar{n}_i)$
• As long as $\lambda \neq 1$, the term $Size_i(1-\lambda)$ does not vanish, and we can estimate $a_2$, the effect of the time-constant variable $Size_i$ (with $\lambda = 1$, i.e., FE, it drops out)
• However, for these estimates to remain consistent, the assumptions pertaining to RE need to hold [e.g., $Cov(Size_i, \alpha_i) = 0$]
Fixed Effects (FE) vs. Random Effects (RE)
FE vs. RE
If Cov(𝛼𝑖, 𝑋𝑖𝑡) = 0
• Both FE and RE estimates are consistent
• SE (RE estimate) < SE (FE estimate): RE is more efficient
• RE estimation allows estimating the effect of time-constant variables on the dependent variable (for FE, that is not possible)
If Cov(𝛼𝑖, 𝑋𝑖𝑡) ≠ 0
• Only the FE estimate is consistent
• SE (RE estimate) < SE (FE estimate) still holds, but the RE estimate is inconsistent
• The Hausman test can be employed to select between the two
Hausman Test
The Hausman test statistic tests this hypothesis
• Null H0: $Cov(\alpha_i, X_{it}) = 0$, in which case we should be able to use RE
• Estimate $W = \frac{(\hat{\beta}_{FE} - \hat{\beta}_{RE})^2}{Var(\hat{\beta}_{FE}) - Var(\hat{\beta}_{RE})}$, which is distributed as chi-square with one df under H0
• If H0 is true, the numerator is small (both estimates are consistent and close to each other) relative to the denominator, and the statistic W is close to 0: fail to reject the null, use the RE estimator
• If the null is false [$Cov(\alpha_i, X_{it}) \neq 0$], the numerator is large and W is away from zero: reject the null, use the fixed-effects estimator
• Essentially, this test compares consistency (in the numerator) relative to efficiency (in the denominator)
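For a single coefficient, the statistic and its p-value can be computed with the standard library alone. This is a sketch: the function name hausman_one_coef is hypothetical, and the coefficient and variance values in the usage lines are made-up numbers, not results from any data set. The p-value uses the fact that a chi-square variable with one df is a squared standard normal.

```python
import math

def hausman_one_coef(b_fe, b_re, var_fe, var_re):
    """Hausman statistic W and chi-square(1) p-value for a single coefficient."""
    diff_var = var_fe - var_re
    if diff_var <= 0:
        raise ValueError("Var(FE) - Var(RE) must be positive for W to be defined")
    w = (b_fe - b_re) ** 2 / diff_var
    p_value = math.erfc(math.sqrt(w / 2.0))   # upper-tail probability of chi-square(1)
    return w, p_value

# Hypothetical inputs: estimates far apart relative to the efficiency gain
w, p = hausman_one_coef(b_fe=2.5, b_re=2.0, var_fe=0.02, var_re=0.01)
decision = "use FE" if p < 0.05 else "use RE"
print(f"W = {w:.2f}, p-value = {p:.2g}: {decision}")
```

If the difference of variances is not positive in a finite sample, the statistic is not defined in this form, which is why the sketch raises an error instead of returning a number.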
Summary and Concluding Remarks
• Compared with simple cross-sectional or time-series data, panel (longitudinal) data is richer in granularity and in the information it offers
• However, it also entails a number of estimation issues
• One of the important issues is the role of unobserved heterogeneity
• That is, the individual-specific time-invariant effects
• There are also purely time-varying effects, but they can be easily modeled by applying T-1 time dummies for T time periods
• Usually, there are many individual units as compared to time periods; therefore,
accounting for these units explicitly through dummies can make the model
extremely non-parsimonious
• One simple approach is to estimate such models using the FD method, which is
simply differencing the series by one lag and then applying pooled OLS
• A more evolved FE approach is to estimate time-demeaned series with pooled OLS
• Both FD and FE methods, though useful, put a heavy cost on data due to the
extreme nature of data transformation
• A less exacting approach is the RE method, which is a compromise between the two extremes of the FE and pooled OLS approaches
• The RE method is more appropriate when the assumption that the unobserved heterogeneity is not correlated with the independent variables [Cov(𝜶𝒊, 𝑋𝑖𝑡) = 0] holds
• Cov(𝜶𝒊, 𝑋𝑖𝑡) = 0: Both RE and FE are consistent, but RE is more efficient
• Cov(𝜶𝒊, 𝑋𝑖𝑡) ≠ 0: Only FE is consistent
• The decision to select FE vs. RE is taken through the Hausman test (HT) statistic
• The HT statistic is essentially a test of the gains in consistency against the cost in efficiency when choosing FE over RE
Thanks!
Lesson 2: Panel Data: Application
Introduction
• Application of panel data methods to the prediction of security prices
• Prediction of broad market-wide returns
• Data visualization
• Pooled vs. panel models (fixed and random effects)
• Interpreting the output from different models
• Examining the cross-sectional/serial correlation in errors
• Obtaining coefficients using robust standard errors
Case Study: Prediction of Broad Market-Wide Returns
Case Study: Index Return Prediction
• Broad market-wide indices are known to reflect the growth of the economy and are often correlated with macroeconomic factors such as GDP
• Investing in a market index based on forecasts of factors such as GDP has become a very prevalent strategy known as factor investing
• However, this exercise may be vitiated by unobserved heterogeneities such as country-specific factors
• In this case study, we will employ panel data methods to forecast
the market index prices
Sr   Country   Year   GDP         Return
1    A         1990    21.01801    27.8%
2    A         1991   -21.3649     32.1%
3    A         1992   -16.2345     36.3%
4    A         1993    21.69623    24.6%
5    A         1994    21.82465    42.5%
6    A         1995    21.89562    47.7%
7    A         1996    21.73732    50.0%
8    A         1997    21.74277     5.2%
9    A         1998    21.94626    36.6%
10   A         1999    17.49863    39.6%
11   B         1990   -22.5041     -8.2%
12   B         1991   -20.3831     10.6%
• The data include information about the country, the year, GDP (in log-scaled, mean-deviated form), and returns
• Using different panel data methods, the relationship between GDP and returns is to be modeled
• The subsequent slides provide the problem statement
The following tasks need to be performed:
Visualize the data
• Plot year-wise market returns for each country
• Plot year-wise GDP for each country
• Using box plots, show the heterogeneity in GDP and returns across countries and years
• What do we infer?
• Model the relationship of GDP with index returns using simple pooled OLS: what are the problems with this approach?
• Show the fitted line against the actual data visually
• Follow the LSDV approach and model the relationship after adding country dummies
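The pooled OLS and LSDV tasks can be sketched on the twelve rows shown in the data excerpt. This is only an illustration of the mechanics: the full data set is not reproduced here, country B has just two rows in the excerpt, and the helper name fit is hypothetical, so the coefficients should not be read as case-study results.

```python
import numpy as np

# Rows 1-12 of the data excerpt: country, GDP, return (in percent)
country = np.array(list("AAAAAAAAAABB"))
gdp = np.array([21.01801, -21.3649, -16.2345, 21.69623, 21.82465, 21.89562,
                21.73732, 21.74277, 21.94626, 17.49863, -22.5041, -20.3831])
ret = np.array([27.8, 32.1, 36.3, 24.6, 42.5, 47.7, 50.0, 5.2, 36.6, 39.6, -8.2, 10.6])

def fit(X, y):
    """Least-squares fit; returns the coefficients and the sum of squared residuals."""
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ coef
    return coef, float(resid @ resid)

ones = np.ones_like(gdp)

# Pooled OLS: return on an intercept and GDP, ignoring the country
pooled_coef, pooled_ssr = fit(np.column_stack([ones, gdp]), ret)

# LSDV: add a dummy for country B (country A is the base category)
dummy_b = (country == "B").astype(float)
lsdv_coef, lsdv_ssr = fit(np.column_stack([ones, gdp, dummy_b]), ret)

print("pooled [intercept, GDP]:", pooled_coef)
print("LSDV   [intercept, GDP, country B]:", lsdv_coef)
print("SSR pooled vs. LSDV:", pooled_ssr, lsdv_ssr)
```

Since the pooled model is nested in the LSDV model, the LSDV sum of squared residuals cannot be larger; how much it drops indicates how much the country effect matters in these rows.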
• Convert the data into panel format with country and year as the data identifiers (index)
• Model the data using fixed effects: individual effects, time effects, and both together
• Perform the tests of poolability to show whether these fixed effects are significant
• Using the fitted fixed-effects model object, extract the time and individual fixed effects
• Next, model the data with random effects, examine the output, and comment on the individual and idiosyncratic heterogeneity
• Comment on whether the random effects transformation is closer to pooled OLS or to fixed effects
• Conduct the tests to examine whether the random or the fixed effects method is more appropriate
• Examine the cross-sectional and serial correlation in errors for the pooled, fixed, and random effects methods
• Using robust standard errors, obtain inference on the coefficients that is robust to autocorrelation and heteroscedasticity in the model
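One common way to obtain such robust standard errors, assumed here since the slides do not name a specific estimator, is the cluster-robust sandwich formula with each unit (country) treated as a cluster. The sketch below uses synthetic data, omits small-sample corrections, and the function name cluster_robust_se is hypothetical.

```python
import numpy as np

def cluster_robust_se(X, resid, groups):
    """Sandwich standard errors clustered by group (no small-sample correction)."""
    bread = np.linalg.inv(X.T @ X)
    meat = np.zeros((X.shape[1], X.shape[1]))
    for g in np.unique(groups):
        score = X[groups == g].T @ resid[groups == g]   # sum of x_it * u_it within the cluster
        meat += np.outer(score, score)
    return np.sqrt(np.diag(bread @ meat @ bread))

rng = np.random.default_rng(6)
n_units, n_per = 50, 10
unit = np.repeat(np.arange(n_units), n_per)
x = rng.normal(size=n_units * n_per) + rng.normal(size=n_units)[unit]
u = rng.normal(size=n_units)[unit] + rng.normal(size=n_units * n_per)  # correlated within unit
y = 1.0 + 0.5 * x + u

X = np.column_stack([np.ones_like(x), x])
coef, *_ = np.linalg.lstsq(X, y, rcond=None)
resid = y - X @ coef

se_cluster = cluster_robust_se(X, resid, unit)
se_classical = np.sqrt(np.diag(np.linalg.inv(X.T @ X)) * (resid @ resid) / (len(y) - 2))
print("coefficients:", coef)
print("classical se:", se_classical)
print("cluster-robust se:", se_cluster)
```

The coefficient estimates are unchanged; only the standard errors, and hence the test statistics, differ between the two rows of output.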
Summary and Concluding Remarks
• Broad market-wide returns are modeled with the GDP factor
• First, the data is visualized to see the individual and time effects
• Fixed and random effects models are estimated
• Residual diagnostics are performed for cross-sectional and serial correlation in errors
• Coefficients are estimated using robust standard errors
Thanks!