Factor Modeling for Volatility

Yi Ding∗   Robert Engle†   Yingying Li‡   Xinghua Zheng§

November 20, 2022

Abstract

We establish a framework to study the factor structure in stock variance under
a high-frequency and high-dimensional setup. We prove the consistency
of conducting principal component analysis on realized variances in estimating
the factor structure. Moreover, based on strong empirical evidence, we propose
a multiplicative volatility factor (MVF) model, where stock variance is repre-
sented by a common variance factor and a multiplicative lognormal idiosyncratic
component. We further show that our MVF model leads to significantly im-
proved volatility prediction. The favorable performance of the proposed MVF
model is seen in both US stocks and global equity indices.

Keywords: Volatility modeling; Factor model; High-frequency data; High-dimension;
Principal component analysis.
JEL Codes: C13, C51, C53, C55, C58, G17

∗ Faculty of Business Administration, University of Macau, Taipa, Macau. Email: yiding@um.edu.mo
† Stern School of Business, New York University, 44 West Fourth Street Suite 9-62, New York.
Email: rengle@stern.nyu.edu
‡ Department of ISOM and Department of Finance, Hong Kong University of Science and Technology,
Clear Water Bay, Kowloon, Hong Kong. Research is supported in part by RGC grants GRF
16503419, GRF 16502118 and T31-64/18-N of the HKSAR and NSFC 19BM03. Email: yyli@ust.hk
§ Department of ISOM, Hong Kong University of Science and Technology, Clear Water Bay,
Kowloon, Hong Kong. Research is supported in part by RGC grants GRF-16304521, GRF 16304019
and T31-64/18-N of the HKSAR. Email: xhzheng@ust.hk

Electronic copy available at: https://ssrn.com/abstract=4282265


1 Introduction
Volatilities, intrinsically linked with macro- and microeconomics, play a central role
in investments, asset pricing, risk management and monetary policies. Recent years
have shown the impact of geopolitical turmoil, pandemics and climate change on a
global economy that is more uncertain and more connected than ever. The prospect of
long-lasting instability raises challenges and makes a better understanding of the
volatilities in the financial market all the more imperative.
The past few decades have seen tremendous progress in modeling time-varying
volatilities. Well-known volatility models include the autoregressive conditional het-
eroskedasticity (ARCH) model and the generalized ARCH (GARCH) model (Engle
(1982); Bollerslev (1986)), as well as stochastic volatility models (Clark (1973); Taylor
(1982)).
Thanks to the availability of high-frequency data and recent developments in
volatility measuring with high-frequency data, we can now estimate daily volatili-
ties with high accuracy. The realized volatility (RV) is a consistent estimator of the
integrated volatility (IV) as the sampling frequency increases when microstructure
noise is absent; see, for example, Jacod and Protter (1998) and Barndorff-Nielsen
and Shephard (2002). Various robust IV estimators have been proposed when there
is microstructure noise, including the two-scale realized volatility (Zhang, Mykland,
and Aït-Sahalia (2005)), multi-scale realized volatility (Zhang et al. (2006)), pre-
averaging approach (Jacod, Li, Mykland, Podolskij, and Vetter (2009), Jacod, Li,
and Zheng (2019)), and quasi-maximum likelihood estimator (Xiu (2010)). Jump-
robust volatility estimators have also been proposed, including the bipower variation
by Barndorff-Nielsen and Shephard (2004) and the truncated RV by Mancini (2009).
Gonçalves and Meddahi (2009) and Hounyo, Gonçalves, and Meddahi (2017) develop
bootstrap methods for inference on integrated volatility. Li, Liu, and Xiu (2019) pro-
pose an efficient multi-scale jackknife estimator for integrated volatility functionals.
Empirically, Liu, Patton, and Sheppard (2015) find that the simple 5-minute RVs
achieve high estimation accuracies. RV-based models have been further studied for
volatility prediction, including fractionally-integrated Gaussian vector autoregression
for log RV (Andersen, Bollerslev, Diebold, and Labys (2003)) and heterogeneous AR
Model (HAR, Corsi (2009)).
To have a concrete idea of daily volatilities in the market, we compute the daily RVs
of S&P 500 Index constituent stocks using 5-minute intraday returns between 2003
and 2020, pick out a few stocks, more specifically, the stocks that have their mean
RVs on the 30%, 50%, and 70% quantiles, and plot the time series of their RVs in
Figure 1.

[Figure 1 plot: stock daily log(RV) on a log scale (about 2e−05 to 1e−02), January 2003 to January 2019; series RV 30%, RV 50%, RV 70%.]

Figure 1: Time series plots (log scale) of three representative S&P500 Index con-
stituent stocks’ RVs (RV 30%, RV 50%, RV 70%) based on 5-minute intraday returns
from 2003 to 2020 with mean RVs falling on the 30%, 50%, 70% quantiles of all mean
RVs.

Figure 1 shows clearly that the stock RVs co-move. Such a co-movement feature
in volatilities has been well-documented. For example, Engle, Ito, and Lin (1990)
and Calvet, Fisher, and Thompson (2006) examine exchange markets, Susmel and
Engle (1994); Da and Schaumburg (2006) and Kelly, Lustig, and Van Nieuwerburgh
(2013) study equities, and Bollerslev, Hood, Huss, and Pedersen (2018) and Engle
and Martin (2019) study global multiple asset classes. The volatility co-movement
has been used in volatility forecasting; see, for example, Luciani and Veredas (2015);
Asai and McAleer (2015); Barigozzi and Hallin (2017), and Bollerslev, Hood, Huss,
and Pedersen (2018).
The co-movement in volatility is not surprising as it is well known that returns admit
a factor structure, such as the capital asset pricing model (CAPM, Sharpe (1964)),
Fama-French three-factor/five-factor/six-factor models (FF3/FF5/FF6, Fama and
French (1993, 2015, 2018)), multi-factor and approximate factor models (Ross (1976);
Chamberlain and Rothschild (1983); Chen, Roll, and Ross (1986)). When volatili-
ties are stochastic, the factor structure in returns will induce a factor structure in
variance, which leads to the volatility co-movement. An interesting question then
arises:

Is the co-movement in volatility purely due to the factor structure in returns?

We find that it is not the case.


First, factors in volatilities can exist even when the return factor is absent. Consider
the following example. Suppose that there are $N$ assets with returns $R_t =
(R_{1t}, ..., R_{Nt})^T$. They follow a single factor model, $R_t = \beta f_t + U_t$, where $U_t =
(U_{1t}, ..., U_{Nt})^T$ are the idiosyncratic returns. Suppose
$U_{it} \overset{\text{i.i.d.}}{\sim} N(0, \sigma^2_{u;it})$, where $\sigma^2_{u;it} = a + b\delta_t + z_{it}$,
$a \ge 0$, $b > 0$, and $\delta_t$ and $z_{it}$ are independent positive random variables. Under such a
model, the idiosyncratic returns $(U_{it})_{1 \le i \le N}$ are uncorrelated, and hence do not admit
a factor structure. The idiosyncratic variances, however, have a common factor $\delta_t$.
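The example can be checked numerically. The following is a minimal simulation sketch, not part of the original analysis: the uniform distributions for δt and zit, the sample sizes and the helper names are illustrative assumptions. It compares the average pairwise correlation of the simulated idiosyncratic returns with that of their variances.

```python
import numpy as np

def simulate_idiosyncratic(n_assets=20, n_days=2000, a=0.0, b=1.0, seed=0):
    """Draw U_it ~ N(0, sigma2_it) with sigma2_it = a + b * delta_t + z_it."""
    rng = np.random.default_rng(seed)
    # Assumed distributions (not from the text): uniform draws keep both
    # delta_t and z_it positive and independent of each other.
    delta = rng.uniform(0.5, 2.5, size=n_days)            # common factor delta_t
    z = rng.uniform(0.0, 0.5, size=(n_days, n_assets))    # asset-specific z_it
    sigma2 = a + b * delta[:, None] + z                   # shape (n_days, n_assets)
    u = np.sqrt(sigma2) * rng.standard_normal((n_days, n_assets))
    return u, sigma2

def mean_offdiag_corr(x):
    """Average pairwise correlation between the columns of x."""
    c = np.corrcoef(x, rowvar=False)
    n = c.shape[0]
    return (c.sum() - np.trace(c)) / (n * (n - 1))

u, sigma2 = simulate_idiosyncratic()
print("mean pairwise corr of returns:  ", mean_offdiag_corr(u))
print("mean pairwise corr of variances:", mean_offdiag_corr(sigma2))
```

Under these assumptions the returns should show an average correlation close to zero, while the variances should be strongly correlated through δt.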
An idiosyncratic variance factor structure that is unlikely induced by omitted
return factors is indeed what we find in the empirical data. Our sample includes the
high-frequency data of 291 constituent stocks in the S&P 500 Index between 2003
and 2020. We take the FF3 model as an example. We obtain 5-minute estimated
idiosyncratic returns and the daily idiosyncratic realized variances by regressing the 5-
minute intraday stock returns over the FF3 factors. The principal component analysis
(PCA) on the idiosyncratic returns does not suggest a clear factor structure. In
contrast, the PCA on the idiosyncratic RVs suggests a clear factor structure. The
cross-sectional average of the idiosyncratic variances can be approximately considered
as the factor in idiosyncratic variances. The results are consistent with the findings
of Herskovic, Kelly, Lustig, and Van Nieuwerburgh (2016).
Second, the number of factors in stock variance is not necessarily the same as
the number of return factors. In fact, we find strong empirical evidence for a sin-
gle factor in stock variance. We take the FF3 model again as an example. When
performing PCA on four variance factor candidates, namely, the RVs of FF3 factors
and the common idiosyncratic realized variance factor, we find that they are strongly
correlated with each other, and there exists a common component in the four variance
factor candidates. The common component in the four variance factors is strongly
correlated with the first principal component (PC) in stock RVs, which explains the
majority (60%) of the total variation in stock RVs. The evidence strongly
suggests a common component in stock variances. Our empirical evidence about a sin-
gle variance factor in the stock variances agrees with the findings of Kapadia, Linn,
and Paye (2020), who find a common factor in both market volatility and market
neutral volatilities of a wide range of return factors.
The empirical evidence described above is obtained using PCA on realized vari-
ances. Under the high-dimensional setting, PCA is consistent in identifying factor
structure in returns; see, for example, Bai and Ng (2002); Bai (2003); Fan, Liao,
and Mincheva (2013); Aït-Sahalia and Xiu (2017), and Ding, Li, and Zheng (2021).
However, unlike returns, volatilities are not observable and have to be estimated.
The resulting estimated variance contains errors that accumulate as the dimension
increases. This leads to an important question:

Is PCA-based estimation valid in identifying the factor structure in stock variances?

To address this question, we establish a framework to study the factor structure
in stock variance under a high-frequency and high-dimensional setup. In brief, one
estimates stock integrated variances using realized variances and then conducts PCA
on the sample covariance matrix of the stock realized variances. We prove the con-
sistency of such a procedure in estimating the factor structure in stock variance.
Specifically, we develop statistical theories about the explicit convergence rate in es-
timating the population covariance matrix of the stock variance using the sample
covariance matrix of the stock realized variances. Furthermore, we obtain the con-
vergence rate in using PCA to estimate the factor structure when it exists in stock
variance. We also obtain the consistency results for the factor structure estimation
in idiosyncratic variances using PCA on idiosyncratic realized variances, which are
based on estimated idiosyncratic returns from regressing the high-frequency stock returns
over factor returns. It is worth pointing out that our setting is different from
the usual errors-in-variables setting because the errors in RVs are not i.i.d.
Next, we investigate the following question:

What model could suitably describe the volatility co-movements?

We propose a multiplicative volatility factor (MVF) model for stock volatility. We
observe a high correlation between the first PC and the cross-sectional average of
stock realized variances, which we term common realized variance (CRV). It suggests
that the cross-sectional average of variance, or common variance (CV), can be ap-
proximately considered as the single factor in stock variance. We also find strong
empirical evidence from the analysis of the usual additive linear model that the vari-
ance factor model is in a surprisingly neat multiplicative format. In our proposed
MVF model, the variance is represented by a multiplicative common factor and a
multiplicative lognormal idiosyncratic component, which we term the idiosyncratic
variance exposure.
Our proposed MVF model has a number of desirable properties. First, the
model captures important volatility characteristics such as non-negativity and heavy-
tailedness. Second, it incorporates the co-movement in the volatility in a simple
way, which makes the model estimation straightforward. The common multiplicative
factor can be well approximated by the common variance. The idiosyncratic vari-
ance exposure is then simply the variance divided by the common variance, based on
which, the two model coefficients, mean and standard deviation of the idiosyncratic
lognormal component can be estimated. The simplicity of the MVF model makes it
particularly attractive for the study of volatility in a high-dimensional context. Third,
the MVF model enjoys internal model consistency. To be more specific, under the
MVF model, the volatility of a stock portfolio inherits the common factor from the
underlying stock volatilities, while other factor model structures such as the log-linear
factor model do not have such a desirable property.
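The estimation steps described above can be sketched as follows. This is a minimal illustration under stated assumptions: the function name `fit_mvf` is hypothetical, the common factor is proxied by the cross-sectional average of realized variances, and the two lognormal coefficients are taken per stock as the mean and standard deviation of the log exposure. The exact parametrization is the one given in Section 3.

```python
import numpy as np

def fit_mvf(rv):
    """rv: (T, N) array of positive realized variances (days x stocks)."""
    rv = np.asarray(rv, dtype=float)
    crv = rv.mean(axis=1)                  # cross-sectional average, proxy for the common factor
    exposure = rv / crv[:, None]           # idiosyncratic variance exposure
    log_exposure = np.log(exposure)
    mu = log_exposure.mean(axis=0)         # per-stock mean of the log exposure
    sigma = log_exposure.std(axis=0, ddof=1)  # per-stock standard deviation
    return crv, exposure, mu, sigma

# Synthetic panel (assumed values) generated from a multiplicative structure.
rng = np.random.default_rng(0)
common = np.exp(rng.normal(-9.0, 0.5, size=300))
panel = common[:, None] * np.exp(rng.normal(0.0, 0.3, size=(300, 10)))
crv, exposure, mu, sigma = fit_mvf(panel)
print(mu.round(3), sigma.round(3))
```

By construction the exposures average to one across stocks on each day, so the cross-sectional average absorbs the level of the common factor.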
Finally, we address the following question:

How much could the MVF model be helpful?

We mainly answer this question from the perspective of volatility forecasting. Under
our proposed MVF model, we simply predict the CV factor and the multiplicative
idiosyncratic components separately using log HAR models. The stock volatility
forecast is the product of the forecasts of the two components. We use the
MVF model to predict daily volatilities of the S&P 500 Index constituent stocks be-
tween 2004 and 2020. We find that our approach outperforms the benchmark models
by generating lower Q-like losses. When checking stock by stock, the outperformance
of our approach is statistically significant in a majority (around 90%) of the stocks
we evaluate.
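The forecasting recipe can be sketched in a few lines. The block below is a sketch under assumptions rather than the implementation used in Section 4: the HAR regressors use the conventional 1-, 5- and 22-day windows, the fit is plain OLS on the full history, no bias correction is applied when exponentiating the log forecast, and the QLIKE loss is taken to be RV/F − log(RV/F) − 1. All function names are hypothetical.

```python
import numpy as np

def har_features(log_x):
    """Intercept, daily value, 5-day mean and 22-day mean of log_x at each day t >= 21."""
    t_idx = np.arange(21, len(log_x))
    daily = log_x[t_idx]
    weekly = np.array([log_x[t - 4:t + 1].mean() for t in t_idx])
    monthly = np.array([log_x[t - 21:t + 1].mean() for t in t_idx])
    return np.column_stack([np.ones(len(t_idx)), daily, weekly, monthly])

def log_har_forecast(x):
    """Fit a log-HAR by OLS on the history x; return a one-step-ahead forecast of x."""
    log_x = np.log(np.asarray(x, dtype=float))
    feats = har_features(log_x)          # row k uses data up to day 21 + k
    target = log_x[22:]                  # next-day values for all rows but the last
    beta, *_ = np.linalg.lstsq(feats[:-1], target, rcond=None)
    return float(np.exp(feats[-1] @ beta))

def mvf_forecast(rv):
    """rv: (T, N) panel. Forecast = (CRV forecast) x (exposure forecasts)."""
    rv = np.asarray(rv, dtype=float)
    crv = rv.mean(axis=1)
    exposure = rv / crv[:, None]
    crv_hat = log_har_forecast(crv)
    exposure_hat = np.array([log_har_forecast(exposure[:, i]) for i in range(rv.shape[1])])
    return crv_hat * exposure_hat

def qlike(realized, forecast):
    """Assumed QLIKE form: mean of RV/F - log(RV/F) - 1 (zero when F equals RV)."""
    ratio = np.asarray(realized, dtype=float) / np.asarray(forecast, dtype=float)
    return float(np.mean(ratio - np.log(ratio) - 1.0))
```

In an out-of-sample exercise one would refit on a rolling or expanding window and average `qlike` over the evaluation days for each stock.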
Our proposed MVF model is built upon empirical evidence from the US stocks.
Beyond the US market, we also find that the MVF model applies to the global market
based on a parallel analysis using daily realized variances of 31 global equity indices.
In addition, the MVF model outperforms the benchmarks in global market volatility forecasting.
To summarize, our contributions lie in the following aspects:
First, we develop a framework to study the factor structure in stock variance (and
idiosyncratic variance) using high-frequency data under a high-dimensional setup.
Second, we establish theoretical support for using PCA on realized variances to
estimate the factor structure in stock variance (and idiosyncratic variance).
Third, we propose a single factor volatility model with a multiplicative idiosyn-
cratic component, the MVF model, based on strong empirical evidence in US stocks.
Our MVF model has several desirable features that make it attractive in various
applications.
Fourth, we utilize the proposed MVF model for volatility forecasting. Our model
clearly outperforms various benchmark approaches.
Last but not least, we show that our MVF model applies to the global market
and helps predict the volatilities of global equity indices.
The rest of this paper is organized as follows. In Section 2, we discuss the evidence
of factor structure in variance. We develop the MVF model in Section 3. Section 4
presents the out-of-sample volatility forecasting results. In Section 5, we examine
our MVF model in the global equity indices. Section 6 contains concluding remarks.
Proofs and additional empirical results are collected in the Supplementary Materials
Ding, Engle, Li, and Zheng (2022).


2 Evidence of Factor Model for Volatility

2.1 Data

We focus on the S&P 500 Index constituent stocks in 2003 and exclude the least liquid
stocks that have more than 20% zero 5-minute returns from January 2003 to Decem-
ber 2020. We collect high-frequency stock prices from the TAQ database. Following
the common data cleaning procedure (e.g., Aït-Sahalia and Mancini (2008)), “bounce
back”s are removed. We sample the log prices starting from 9:35 until 16:00, using
the previous-tick approach (Gençay, Dacorogna, Muller, Pictet, and Olsen (2001)).
Regarding the sampling frequency, we use 5-minute log-returns for which the market
microstructure noise can be safely ignored (Liu, Patton, and Sheppard (2015)). Holidays,
half trading days and overnight returns are eliminated. As in Li and Xiu (2016),
we also remove May 6, 2010, the day when the “Flash Crash”
occurred. After the cleaning procedure, we obtain 291 stocks for 4491 trading days
in 2003–2020, and each stock has 77 5-minute intraday log-returns per day. For
return factors, we consider the Fama-French three-factor model (Fama and French
(1993)) and use 5-minute returns of the market, the small-minus-big (SmB) and the
high-minus-low (HmL) portfolios.¹
Following Bollerslev and Todorov (2011); Aït-Sahalia, Fan, and Li (2013) and
Li, Todorov, and Tauchen (2017), for each stock $i$ and each day $t$, we estimate
the continuous component of variance with
$$RV^c_{it} = \sum_{j=1}^{77} \big(R^{tr}_{i;t[j]}\big)^2, \quad\text{where}\quad R^{tr}_{i;t[j]} = R_{i;t[j]} \mathbf{1}_{\{|R_{i;t[j]}| \le v_{it}\}}, \quad 1 \le j \le 77,$$
and $v_{it}$ is set to be $v_{it} = 3\sqrt{\min(RV_{it}, BV_{it})} \times \Delta_n^{0.49}$, $\Delta_n = 1/77$,
$$RV_{it} = \sum_{j=1}^{[1/\Delta_n]} R^2_{i;t[j]}, \quad\text{and}\quad BV_{it} = \frac{\pi}{2}\,\frac{[1/\Delta_n]}{[1/\Delta_n] - 1} \sum_{j=2}^{[1/\Delta_n]} |R_{i;t[j]} R_{i;t[j-1]}|$$
is the bipower variation (Barndorff-Nielsen and Shephard (2004)). We apply the same
truncation procedure to the high-frequency factor data to obtain the continuous component
of the factor return. The truncated factor returns are denoted by $F^{tr}$. Our
analysis is based on the truncated returns $R^{tr}$, truncated factor returns $F^{tr}$ and the
continuous component of realized variance, $RV^c$. For notational ease, when there is no
ambiguity, we denote the truncated return $R^{tr}$ by $R$ and the continuous component
of realized variance $RV^c$ by $RV$.
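The truncation rule above translates directly into code. The following sketch computes the realized variance, the bipower variation and the truncated (continuous-component) realized variance for one stock-day; the function name and the toy input are illustrative assumptions.

```python
import numpy as np

def truncated_rv(returns):
    """Return (RV^c, RV, BV) for one stock-day of intraday returns."""
    r = np.asarray(returns, dtype=float)
    n = r.size                      # [1/Delta_n], e.g. 77 five-minute returns
    delta_n = 1.0 / n
    rv = np.sum(r ** 2)
    bv = (np.pi / 2.0) * n / (n - 1) * np.sum(np.abs(r[1:] * r[:-1]))
    # Threshold v = 3 * sqrt(min(RV, BV)) * Delta_n^0.49
    threshold = 3.0 * np.sqrt(min(rv, bv)) * delta_n ** 0.49
    r_tr = np.where(np.abs(r) <= threshold, r, 0.0)   # zero out suspected jumps
    return float(np.sum(r_tr ** 2)), float(rv), float(bv)

# Toy day (assumed values): small alternating returns plus one large jump.
day = [0.001 * (-1) ** j for j in range(77)]
day[40] = 0.05
rv_c, rv, bv = truncated_rv(day)
print(rv_c, rv, bv)
```

In this toy input the single large return exceeds the threshold and is removed, so the truncated measure reflects only the small returns.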
¹ We thank Saketh Aleti for sharing high-frequency factor data from the paper Aleti (2022).


2.2 Factor Structure in Idiosyncratic Variances

We analyze daily idiosyncratic realized variances constructed from 5-minute returns
of S&P 500 Index stocks and the Fama-French three-factor model. Specifically, we
regress the 5-minute intraday returns over the Fama-French three factors, from which
we get the idiosyncratic returns and the corresponding idiosyncratic realized vari-
ances. We find that PCA on the idiosyncratic returns shows no evidence of a factor
structure. In contrast, PCA on the idiosyncratic realized variances shows that the
first PC accounts for a large proportion (49%) of the total variation in idiosyncratic
realized variances. In addition, consistent with the results in Herskovic, Kelly, Lustig,
and Van Nieuwerburgh (2016), we find that the first PC has a high correlation (0.955)
with the cross-sectional average of the idiosyncratic realized variances, that is, the
common idiosyncratic realized variance (CiRV). Therefore, the cross-sectional average
of idiosyncratic variance, or common idiosyncratic variance (CiV),² can be considered
as the factor in idiosyncratic variance. Beyond the Fama-French three factors,
we also use statistical factors to check the robustness of our findings. The results are
similar.³

2.3 Factor Structure in Stock Variances

The factor structure in stock returns naturally induces a factor structure in stock
variance. For example, under the FF3 model, we have

$$V_{it} = \beta_{i,Mkt}^2 V_{Mkt,t} + \beta_{i,HmL}^2 V_{HmL,t} + \beta_{i,SmB}^2 V_{SmB,t} + V_{U_i,t} + \text{covariance terms},$$

where $V_{it}$ and $V_{U_i,t}$ denote stock and idiosyncratic variances, respectively, and $V_{Mkt,t}$,
$V_{HmL,t}$ and $V_{SmB,t}$ are the factor variances. Hence, the return factor variances ($V_{Mkt}$,
$V_{HmL}$, $V_{SmB}$) are potential factors for the stock variance. Moreover, as discussed in
² In Herskovic, Kelly, Lustig, and Van Nieuwerburgh (2016), CIV refers to the cross-sectional
average of standard deviations, while in our paper, we refer to CiRV as the cross-sectional average
of idiosyncratic realized variances. We refer to CiV as the cross-sectional average of integrated
idiosyncratic variance. Despite such a difference, we still use the name CiV. We find this name
nicely summarizes the most important first principal component.
³ The detailed results are available upon request.


[Figure 2 plots: eigenvalues over the sum of total eigenvalues (y-axis, 0.0 to 1.0) for components 1 to 4. Left panel: based on the covariance matrix of FF3 RVs and CiRV. Right panel: based on the correlation matrix of FF3 RVs and CiRV.]

Figure 2: Eigenvalue ratios of the sample covariance matrix (left panel) and sample
correlation matrix (right panel) of FF3 factors’ RVs and CiRV.

the previous section, a single factor exists in idiosyncratic variance, CiV. Therefore,
altogether, there are four potential factors in stock variances. An interesting question
arises, namely, are there indeed four factors?
To address this question, we first employ PCA on the four variance factor can-
didates, namely, the three return factor realized variances (RVM kt , RVHmL , RVSmB ),
and the common idiosyncratic realized variance, CiRV. We compute the eigenvalue
ratios, which are the eigenvalues divided by the sum of the total eigenvalues, and plot
the results in Figure 2. Surprisingly, we find that the first PC explains more than
90% of the total variation in the four variance factor candidates, suggesting a single
common component.
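The eigenvalue ratios used here are straightforward to compute. A minimal sketch, assuming a T × p array whose columns are the variance factor candidates (the function name and the synthetic data are illustrative):

```python
import numpy as np

def eigenvalue_ratios(x, use_correlation=False):
    """Eigenvalues of the sample covariance (or correlation) matrix of x,
    divided by their sum, in descending order. x has shape (T, p)."""
    x = np.asarray(x, dtype=float)
    m = np.corrcoef(x, rowvar=False) if use_correlation else np.cov(x, rowvar=False)
    eig = np.linalg.eigvalsh(m)[::-1]     # eigvalsh returns ascending order
    return eig / eig.sum()

# Synthetic example (assumed values): four series driven by one common component.
rng = np.random.default_rng(0)
common = rng.standard_normal(500)
series = common[:, None] * np.array([1.0, 0.9, 1.1, 1.0]) + 0.1 * rng.standard_normal((500, 4))
print(eigenvalue_ratios(series))
print(eigenvalue_ratios(series, use_correlation=True))
```

The first entry is the share of total variation explained by the first PC; using the correlation matrix removes the influence of differing scales across the candidates.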
We then perform PCA directly⁴ on the stock RVs, and compute the ratio of the top
eigenvalues over the sum of the total eigenvalues, based on both the covariance matrix
and the correlation matrix of the stock RVs. The results are plotted in Figure 3.
Figure 3 shows that a high proportion (60%) of the total variation in stock RVs
can be explained by the first PC, while none of the remaining PCs accounts for
a proportion substantially higher than the others. These observations suggest a
single factor model for the stock variances. We also estimate the number of factors
⁴ Outliers are removed by 95% winsorization to avoid the effect of extreme variations in the RVs.


[Figure 3 plots: top ten eigenvalues over the sum of total eigenvalues (y-axis, 0.0 to 1.0) for components 1 to 10. Left panel: based on the covariance matrix of RV. Right panel: based on the correlation matrix of RV.]

Figure 3: Top ten eigenvalue ratios of the sample covariance matrix (left panel) and
sample correlation matrix (right panel) of stock RVs.

using estimators from Bai and Ng (2002) and Ahn and Horenstein (2013), and the
results also suggest a single factor model for the stock variances.
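For illustration, the eigenvalue-ratio idea behind the Ahn and Horenstein (2013) estimator can be sketched as below. This is a simplified version under assumptions (sample covariance of the raw panel, a user-chosen maximum number of factors `k_max`, hypothetical function name), not the exact procedure applied to the data.

```python
import numpy as np

def eigenvalue_ratio_estimator(x, k_max=8):
    """Pick the k in 1..k_max that maximizes lambda_k / lambda_{k+1}.
    x has shape (T, N) with N > k_max."""
    x = np.asarray(x, dtype=float)
    eig = np.linalg.eigvalsh(np.cov(x, rowvar=False))[::-1]   # descending order
    ratios = eig[:k_max] / eig[1:k_max + 1]
    return int(np.argmax(ratios)) + 1, ratios

# Synthetic one-factor panel (assumed values).
rng = np.random.default_rng(0)
factor = rng.standard_normal(400)
panel = np.outer(factor, rng.uniform(0.5, 1.5, size=20)) + rng.standard_normal((400, 20))
k_hat, ratios = eigenvalue_ratio_estimator(panel)
print(k_hat)
```

A large gap between the first and second eigenvalues, followed by ratios close to one, points to a single factor.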
We next compute the pairwise correlations among the variance factor candidates,
which are the return factor RVs (RVM kt , RVHmL , RVSmB ), the CiRV, and in addition,
the first PC (P CRV ) in stock RVs. The results are summarized in Table 1.

Table 1: Pairwise correlations among RV_Mkt, RV_HmL, RV_SmB, CiRV, and the first
PC in the stock RVs (PC_RV).

            CiRV     RV_Mkt   RV_HmL   RV_SmB
PC_RV       0.970    0.868    0.800    0.827
CiRV                 0.848    0.788    0.855
RV_Mkt                        0.680    0.920
RV_HmL                                 0.689

Table 1 shows that all variance factor candidates are highly correlated with an
average pairwise correlation of around 0.80, and are also highly correlated with the
first PC in the stock RVs. These results are consistent with the findings in Li, Todorov,
and Tauchen (2016), which show a high correlation between the spot market factor
volatilities and idiosyncratic volatilities of sector portfolios. Kapadia, Linn, and Paye
(2020) also have similar findings about the high correlation between market volatility
and the market neutral volatilities of a wide range of return factors.
In summary, we find compelling evidence for a single factor structure in stock
variance.

2.4 PCA Consistency of Using Realized Variance in Estimating Factors in Integrated Variance

The empirical studies in Sections 2.2 and 2.3 are based on realized (idiosyncratic)
variances, which inevitably contain estimation errors. Because both the number of
assets and the time span are large, the estimation errors accumulate. In this section,
we analyze the consistency of conducting PCA on realized variances in identifying
factor structure in integrated variances.

2.4.1 Continuous-time Log Price Process

We consider the following continuous-time factor model for log prices:

$$dY_t = \alpha_t\,dt + \beta\,dX_t + dZ_t, \quad t \in [0, T], \tag{2.1}$$

where $(Y_t)$ is an $N$-dimensional log price process, $(X_t)$ is a $K$-dimensional factor
process, $(Z_t)$ is the idiosyncratic component, $\alpha_t =: (\alpha_{1t}, ..., \alpha_{Nt})^T$ is the drift term,
and $\beta = (\beta_1, ..., \beta_N)^T =: (\beta_{ik})$ is a factor loading matrix of dimension $N \times K$.
For any matrix $A = (a_{ij})$, we denote the entrywise norm as $\|A\|_{\max} := \max_{i,j} |a_{ij}|$;
the spectral norm is denoted as $\|A\|_2 := \max_{\|x\|_2 \le 1} \|Ax\|_2$, where $\|x\|_2 = \sqrt{\sum_i x_i^2}$;
and the minimum singular value and the maximum singular value are denoted as
$\lambda_{\min}(A)$ and $\lambda_{\max}(A)$, respectively.
We make the following assumptions.

Assumption 1. $(X_t)$ and $(Z_t)$ are continuous Itô semimartingales:

$$X_t = X_0 + \int_0^t h_s\,ds + \int_0^t \eta_s\,dW_s, \qquad Z_t = \int_0^t \zeta_s\,dB_s,$$

where $(W_t)$ and $(B_t)$ are independent Brownian motions, and $h_t$ is the drift term
for factors. We write the spot covariance matrix of $(X_t)$ and $(Z_t)$ as $\Phi_t = \eta_t\eta_t^T$
and $\Theta_t = \zeta_t\zeta_t^T$. The processes $(\eta_t)$ and $(\zeta_t)$ are càdlàg, and $\Phi_t$, $\Phi_{t-}$, $\Theta_t$ and $\Theta_{t-}$
are positive definite. In addition, $\max\big(\|\beta\|_{\max}, \sup_{s \ge 0}\|\alpha_s\|_{\max}, \sup_{s \ge 0}\|h_s\|_{\max}\big) \le C$,
and $\lambda_{\min}\big(E(\int_0^1 \Phi_s\,ds)\big) > c$ for some constants $c, C > 0$.
Finally, the mixing coefficients $\rho(\chi) = \sup_{A \in \mathcal{F}^0_{-\infty},\, B \in \mathcal{F}^\infty_\chi} |P(AB) - P(A)P(B)|$, where
$\mathcal{F}^0_{-\infty}$, $\mathcal{F}^\infty_\chi$ are σ-algebras generated by $\{(\Phi_t, \Theta_t) : -\infty \le t \le 0\}$ and $\{(\Phi_t, \Theta_t) : \chi \le
t \le \infty\}$, respectively, satisfy $\rho(\chi) \le c_1 \exp(-c_2\chi)$ for some constants $c_1, c_2 > 0$
and any positive integer $\chi$.

For each day $t = 1, ..., T$, we denote the integrated variances and integrated idiosyncratic
variances of $N$ stocks as $V_t = (V_{1t}, ..., V_{Nt})^T$ and $V_{U;t} = (V_{U;1t}, ..., V_{U;Nt})^T$,
respectively, namely,
$$V_{it} = \int_{t-1}^{t} \Psi_{\tau,ii}\,d\tau, \quad\text{and}\quad V_{U;it} = \int_{t-1}^{t} \Theta_{\tau,ii}\,d\tau, \quad 1 \le i \le N, \tag{2.2}$$
where $\Psi_t = \beta\Phi_t\beta^T + \Theta_t$. We define $V_{F;kt} = \int_{t-1}^{t} \Phi_{\tau,kk}\,d\tau$, $1 \le k \le K$, and $V_{F;t} =
(V_{F;1t}, ..., V_{F;Kt})^T$.
Suppose that we observe log-returns of stocks and factors at sampling frequency
$\Delta_n$. For each $t = 1, ..., T$ and $j = 1, ..., n := [1/\Delta_n]$, we write the log-returns of stocks
and factors as $R_{t[j]}$ and $F_{t[j]}$, respectively, where
$$R_{t[j]} = Y_{t-1+\Delta_n j} - Y_{t-1+\Delta_n (j-1)} =: (R_{1,t[j]}, ..., R_{N,t[j]})^T, \quad\text{and}$$
$$F_{t[j]} = X_{t-1+\Delta_n j} - X_{t-1+\Delta_n (j-1)} =: (F_{1,t[j]}, ..., F_{K,t[j]})^T.$$
Model (2.1) induces a factor model for high-frequency returns:
$$R_{t[j]} = \alpha_{n;t[j]} + \beta F_{t[j]} + U_{t[j]}, \quad \alpha_{n;t[j]} = \int_{t-1+\Delta_n (j-1)}^{t-1+\Delta_n j} \alpha_s\,ds, \quad\text{and}$$
$$U_{t[j]} = Z_{t-1+\Delta_n j} - Z_{t-1+\Delta_n (j-1)} =: (U_{1,t[j]}, ..., U_{N,t[j]})^T, \tag{2.3}$$
where $U_{t[j]}$ is an $N$-dimensional vector of idiosyncratic returns.


The realized variance of stock $i$ on day $t$, $RV_{it}$, is defined as $RV_{it} = \sum_{j=1}^{n} R^2_{i,t[j]}$. It
is consistent in estimating the integrated variance and enjoys a $\sqrt{n}$ rate of convergence.


2.4.2 Stock Variance Factor Estimation

Next, we present the theoretical results for the estimation of factor structure in stock
variance. We make the following assumptions on the stock volatility processes.
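Assumption 2. The integrated variances $\big(\int_{t-1}^{t} \Psi_{\tau,ii}\,d\tau\big)_{1 \le i \le N}$ are stationary, and
$\sup_{t \in \mathbb{N}} E\big((\sup_{t-1 \le s < t} \Psi_{s,ii})^M\big) \le k_\delta$ for some positive constants $k_\delta, M > 0$ and for
all $t \in \mathbb{N}$, $1 \le i \le N$.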

Assumption 3. The covariance matrix of integrated variances, $\Sigma_V = \mathrm{Cov}(V_t)$,
satisfies that for some constants $c, C > 0$, $c \le \lambda_{V,i}/N < \lambda_{V,i-1}/N \le C$ and $\lambda_{V,i-1}/N -
\lambda_{V,i}/N > c$ for $1 \le i \le q$, and $c \le \lambda_{V,i} \le C$ for $q < i \le N$, where $\lambda_{V,1} \ge ... \ge \lambda_{V,N}$
are the eigenvalues of $\Sigma_V$, and $q$ is a fixed positive integer.

Assumption 3 is a standard assumption in factor models (Bai (2003); Fan, Liao,
and Mincheva (2013)). It implies that the integrated variances admit a factor structure
with q (strong) factors.
To estimate $\Sigma_V$, we use the sample covariance matrix of the realized variances,

$$\hat\Sigma_{RV} = \frac{1}{T} \sum_{t=1}^{T} (RV_t - \overline{RV})(RV_t - \overline{RV})^T,$$

where $RV_t = (RV_{1t}, ..., RV_{Nt})^T$ and $\overline{RV} = \sum_{t=1}^{T} RV_t / T$. We denote the $i$th eigenvector
of $\Sigma_V$ by $\xi_{V,i}$, the $i$th largest eigenvalue of $\hat\Sigma_{RV}$ by $\hat\lambda_{RV,i}$ and the corresponding
eigenvector by $\hat\xi_{RV,i}$, $1 \le i \le N$.
The next theorem gives the error bound of $\hat\Sigma_{RV}$ in estimating $\Sigma_V$.

Theorem 1. Under Assumptions 1 and 2, if $\log T / |\log \Delta_n| = O(1)$, $N = O(T^\gamma)$ for
some $\gamma > 0$, and the $M$ in Assumption 2 satisfies $M > 4(1 + 2\gamma)$, then
$$\|\hat\Sigma_{RV} - \Sigma_V\|_{\max} = O_p\Big(\sqrt{\Delta_n} + \sqrt{\frac{\log N}{T}}\Big). \tag{2.4}$$
In addition, if Assumption 3 holds, then
$$\max_{1 \le i \le q} \Big|\frac{\hat\lambda_{RV,i}}{\lambda_{V,i}} - 1\Big| = O_p\Big(\sqrt{\Delta_n} + \sqrt{\frac{\log N}{T}}\Big), \quad\text{and} \tag{2.5}$$
$$\max_{1 \le i \le q} \|\hat\xi_{RV,i} - \xi_{V,i}\|_2 = O_p\Big(\sqrt{\Delta_n} + \sqrt{\frac{\log N}{T}}\Big). \tag{2.6}$$

Theorem 1 guarantees that if a factor structure exists in the stock variance, then
it can be consistently estimated by conducting PCA on the stock RV as long as
$\max\big(\Delta_n, (\log N)/T\big) \to 0$.
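The procedure covered by Theorem 1 (sample covariance matrix of the realized variances, followed by an eigendecomposition) can be sketched as follows. The function name, the sign convention for the eigenvector and the synthetic multiplicative panel are illustrative assumptions.

```python
import numpy as np

def first_pc(rv):
    """rv: (T, N) panel of realized variances.
    Returns the largest eigenvalue of the sample covariance matrix,
    its unit-norm eigenvector and the time series of first-PC scores."""
    rv = np.asarray(rv, dtype=float)
    centered = rv - rv.mean(axis=0)
    sigma_hat = centered.T @ centered / rv.shape[0]   # 1/T normalization, as in the text
    eigval, eigvec = np.linalg.eigh(sigma_hat)        # ascending eigenvalues
    xi = eigvec[:, -1]
    if xi.sum() < 0:                                  # fix the sign for readability
        xi = -xi
    return float(eigval[-1]), xi, centered @ xi

# Synthetic panel (assumed values): a common factor times lognormal exposures.
rng = np.random.default_rng(0)
common = np.exp(rng.normal(0.0, 0.5, size=500))
panel = common[:, None] * np.exp(rng.normal(0.0, 0.3, size=(500, 30)))
lam, xi, scores = first_pc(panel)
print(np.corrcoef(scores, panel.mean(axis=1))[0, 1])
```

On such a panel the first-PC scores track the cross-sectional average closely, which is the kind of comparison made with the common realized variance in Section 3.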

2.4.3 Idiosyncratic Variance Factor Estimation

In this subsection, we give the consistency results of conducting PCA on realized
idiosyncratic variances in identifying factor structure in integrated idiosyncratic variances.
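Under the factor model setup, we obtain the factor loading estimator $\hat\beta =:
(\hat\beta_1, ..., \hat\beta_N)^T$, the estimator $\hat\alpha_n = (\hat\alpha_{n1}, ..., \hat\alpha_{nN})^T$ of the average drift
$\alpha_n = \frac{1}{T \cdot [1/\Delta_n]} \sum_{t=1}^{T} \sum_{j=1}^{[1/\Delta_n]} \alpha_{n;t[j]}$,
and the idiosyncratic return estimator $\hat U_{t[j]} =: (\hat U_{1,t[j]}, ..., \hat U_{N,t[j]})^T$. Specifically,

$$\hat\beta = \Big(\sum_{t=1}^{T} \sum_{j=1}^{[1/\Delta_n]} (R_{t[j]} - \bar R)(F_{t[j]} - \bar F)^T\Big) \Big(\sum_{t=1}^{T} \sum_{j=1}^{[1/\Delta_n]} (F_{t[j]} - \bar F)(F_{t[j]} - \bar F)^T\Big)^{-1},$$
$$\hat\alpha_n = \bar R - \hat\beta \bar F, \quad\text{and}\quad \hat U_{t[j]} = R_{t[j]} - \hat\alpha_n - \hat\beta F_{t[j]}, \tag{2.7}$$

where $\bar R = \frac{1}{T \cdot [1/\Delta_n]} \sum_{t=1}^{T} \sum_{j=1}^{[1/\Delta_n]} R_{t[j]}$, and $\bar F = \frac{1}{T \cdot [1/\Delta_n]} \sum_{t=1}^{T} \sum_{j=1}^{[1/\Delta_n]} F_{t[j]}$. The feasible
idiosyncratic realized variance is defined as follows:

$$RV_{\hat U;it} = \sum_{j=1}^{[1/\Delta_n]} \hat U^2_{i,t[j]} = \sum_{j=1}^{[1/\Delta_n]} (R_{i,t[j]} - \hat\alpha_{ni} - \hat\beta_i^T F_{t[j]})^2, \quad 1 \le i \le N,\ 1 \le t \le T. \tag{2.8}$$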

We then estimate $\Sigma_{V_U}$ using the sample covariance matrix of $(RV_{\hat U;it})_{1 \le i \le N, 1 \le t \le T}$:

$$\hat\Sigma_{RV_{\hat U}} = \frac{1}{T} \sum_{t=1}^{T} (RV_{\hat U;t} - \overline{RV}_{\hat U})(RV_{\hat U;t} - \overline{RV}_{\hat U})^T,$$

where $RV_{\hat U;t} = (RV_{\hat U;1t}, ..., RV_{\hat U;Nt})^T$ and $\overline{RV}_{\hat U} = \frac{1}{T} \sum_{t=1}^{T} RV_{\hat U;t}$.
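The estimators in (2.7) and (2.8) amount to a pooled least-squares regression of intraday stock returns on intraday factor returns, followed by summing squared residuals within each day. A minimal sketch, assuming the returns are stored as arrays of shape (T, n, N) and (T, n, K); the function name and the simulated inputs are hypothetical.

```python
import numpy as np

def idiosyncratic_rv(r, f):
    """r: (T, n, N) stock returns, f: (T, n, K) factor returns.
    Returns beta_hat (N, K), alpha_hat (N,) and idiosyncratic RVs (T, N)."""
    T, n, N = r.shape
    K = f.shape[2]
    r2 = r.reshape(T * n, N)            # pool all intraday observations
    f2 = f.reshape(T * n, K)
    r_bar, f_bar = r2.mean(axis=0), f2.mean(axis=0)
    rc, fc = r2 - r_bar, f2 - f_bar
    # beta_hat = (sum rc fc^T)(sum fc fc^T)^{-1}, computed via a linear solve
    beta_hat = np.linalg.solve(fc.T @ fc, fc.T @ rc).T
    alpha_hat = r_bar - beta_hat @ f_bar
    u_hat = r2 - alpha_hat - f2 @ beta_hat.T          # estimated idiosyncratic returns
    rv_u = (u_hat.reshape(T, n, N) ** 2).sum(axis=1)  # daily sums of squared residuals
    return beta_hat, alpha_hat, rv_u

# Simulated inputs (assumed values): 20 days, 77 intraday returns, 5 stocks, 3 factors.
rng = np.random.default_rng(0)
beta = rng.uniform(0.5, 1.5, size=(5, 3))
f = 0.001 * rng.standard_normal((20, 77, 3))
r = f @ beta.T + 0.0005 * rng.standard_normal((20, 77, 5))
beta_hat, alpha_hat, rv_u = idiosyncratic_rv(r, f)
print(np.abs(beta_hat - beta).max())
```

The resulting T × N matrix of idiosyncratic realized variances is the input to the sample covariance matrix defined above.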
We make the following assumptions on the integrated factor variances and the
integrated idiosyncratic variances.
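Assumption 4. $\big(\int_{t-1}^{t} \Phi_{\tau,jk}\,d\tau\big)_{1 \le j,k \le K}$ and $\big(\int_{t-1}^{t} \Theta_{\tau,ii}\,d\tau\big)_{1 \le i \le N}$ are stationary, and
there exist $k_\delta$ and $M$ such that for all $t \ge 1$, $1 \le k \le K$ and $1 \le i \le N$,
$\max\big(\sup_{t \in \mathbb{N}} E(\sup_{t-1 \le s \le t} \Phi^M_{s,kk}),\ \sup_{t \in \mathbb{N}} E(\sup_{t-1 \le s \le t} \Theta^M_{s,ii})\big) \le k_\delta$.

Assumption 5. The covariance matrix of integrated idiosyncratic variance, $\Sigma_{V_U} =
\mathrm{Cov}(V_{U;t})$, satisfies that for some constants $c, C > 0$, one has $c \le \lambda_{V_U;i}/N <
\lambda_{V_U;i-1}/N \le C$ for $1 \le i \le r$, and $c \le \lambda_{V_U;i} \le C$ for $r < i \le N$, where
$\lambda_{V_U;1} \ge \lambda_{V_U;2} \ge ... \ge \lambda_{V_U;N}$ are the eigenvalues of $\Sigma_{V_U}$, and $r$ is a fixed positive
integer.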

We denote by $\xi_{V_U;i}$ the $i$th eigenvector of $\Sigma_{V_U}$, $1 \le i \le N$. For the sample covariance
matrix $\hat\Sigma_{RV_{\hat U}}$, the eigenvectors and eigenvalues are denoted by $\hat\xi_{RV_{\hat U};i}$ and $\hat\lambda_{RV_{\hat U};i}$, $1 \le i \le N$,
respectively. The next theorem gives the error bound of using $\hat\Sigma_{RV_{\hat U}}$ to estimate $\Sigma_{V_U}$.

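Theorem 2. Under Assumptions 1, 2 and 4, if $\log T/|\log \Delta_n| = O(1)$, $N = O(T^{\gamma})$ for some $\gamma > 0$ and the $M$ in Assumption 4 satisfies $M > 4(1 + 2\gamma)$, then

\[
\big\|\widehat{\Sigma}_{RV_{\widehat{U}}} - \Sigma_{V_U}\big\|_{\max} = O_p\Big(\sqrt{\Delta_n} + \sqrt{\frac{\log N}{T}}\Big). \tag{2.9}
\]

In addition, if Assumption 5 holds, then

\[
\max_{1 \le i \le r}\Big|\frac{\widehat{\lambda}_{RV_{\widehat{U}};i}}{\lambda_{V_U;i}} - 1\Big| = O_p\Big(\sqrt{\Delta_n} + \sqrt{\frac{\log N}{T}}\Big), \quad \text{and} \tag{2.10}
\]
\[
\max_{1 \le i \le r}\big\|\widehat{\xi}_{RV_{\widehat{U}};i} - \xi_{V_U;i}\big\|_2 = O_p\Big(\sqrt{\Delta_n} + \sqrt{\frac{\log N}{T}}\Big). \tag{2.11}
\]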

Theorem 2 guarantees that if a factor structure exists in the idiosyncratic variance, then the factor structure can be consistently estimated by conducting PCA on the idiosyncratic RV provided that $\max\big(\Delta_n, (\log N)/T\big) \to 0$.
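To make the PCA step concrete, a minimal sketch of extracting the leading eigenvalues and eigenvectors of a sample covariance matrix of realized variances is given below. The function names are hypothetical and chosen only for this illustration; the sketch relies on `numpy.linalg.eigh`, which returns eigenvalues in ascending order.

```python
import numpy as np

def leading_eigenpairs(sigma_hat, r):
    """Return the r largest eigenvalues and matching eigenvectors (as columns)."""
    vals, vecs = np.linalg.eigh(sigma_hat)   # eigh returns ascending eigenvalues
    order = np.argsort(vals)[::-1][:r]
    return vals[order], vecs[:, order]

def eigenvalue_share(sigma_hat, r):
    """Fraction of total variance explained by the first r principal components."""
    vals, _ = leading_eigenpairs(sigma_hat, sigma_hat.shape[0])
    return vals[:r].sum() / vals.sum()
```

Eigenvectors are only identified up to sign, so any comparison with a population eigenvector should be made after aligning signs.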


3 Factor Modeling for Stock Volatility

3.1 Common Variance Factor

The empirical evidence in Section 2.3 suggests that both return factor variances and
idiosyncratic variances are driven by a single variance factor. In order to construct the
single factor, by Theorem 1, we can use the first PC in stock RVs. Alternatively, one
can take the common variance (CV) of stocks, which is defined as the cross-sectional
average of stock integrated variance:

\[
CV_t = \frac{1}{N}\sum_{i=1}^{N} V_{it}.
\]

Correspondingly, the common realized variance (CRV) is defined as follows:

\[
CRV_t = \frac{1}{N}\sum_{i=1}^{N} RV_{it}.
\]

[Figure: "Rescaled first PC in stock RV and CRV". Series shown: log rescaled first PC in RV and log CRV, from Jan-2003 to Jan-2019, with the vertical axis on a log scale.]

Figure 4: Time series plots of common realized variance (CRV) and rescaled first PC
in stock RVs. The plots are drawn on a log scale to improve visibility.

In Figure 4, we plot the time series of CRV and the first PC in stock RV rescaled
to have the same mean and standard deviation as CRV. We find that the first PC
in RVs largely coincides with CRV. They have a high correlation of 0.979. The
results suggest that CV can be approximately considered as the single factor in stock
variance. Compared with the first PC, the CV factor is more interpretable and easier
to estimate.
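The comparison between CRV and the rescaled first PC can be sketched as follows. This is a hedged illustration, assuming a (T, N) array of daily realized variances; the function name and the sign convention for the eigenvector (flipped so that the PC correlates positively with CRV) are assumptions of the sketch rather than details taken from the paper.

```python
import numpy as np

def crv_and_rescaled_pc1(rv):
    """rv: (T, N) panel of daily realized variances (layout is an assumption)."""
    crv = rv.mean(axis=1)                      # cross-sectional average
    centered = rv - rv.mean(axis=0)
    cov = centered.T @ centered / rv.shape[0]
    vals, vecs = np.linalg.eigh(cov)
    w = vecs[:, -1]                            # eigenvector of the largest eigenvalue
    pc1 = centered @ w
    if np.corrcoef(pc1, crv)[0, 1] < 0:        # eigenvector sign is arbitrary
        pc1 = -pc1
    # rescale the PC to have the same mean and standard deviation as CRV
    pc1_rescaled = (pc1 - pc1.mean()) / pc1.std() * crv.std() + crv.mean()
    return crv, pc1_rescaled, np.corrcoef(crv, pc1_rescaled)[0, 1]
```

On data simulated from a single multiplicative factor, the returned correlation is close to one, in line with the qualitative finding reported for Figure 4.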

Remark 1. Besides CV, we also evaluate VIX (the CBOE Market Volatility Index,
transformed to daily variance) as a possible candidate for the volatility factor. The
correlation between VIX and the first PC in stock RVs is lower than the correlation
between CRV and the first PC (0.842 vs. 0.979). VIX measures implied volatility
for the future and is more informative in longer monthly/yearly horizons (see, e.g.,
the discussion in Andersen and Benzoni (2010)). In addition, it carries a volatility
risk premium, which complicates volatility forecasting. Given the nature of VIX, we
consider CV as a more appropriate factor proxy in stock volatility. Nevertheless,
in practice, when predicting volatility, people often find VIX to be helpful. In our
volatility prediction method to be introduced in Section 4, replacing CRV with VIX
leads to similar performance.

3.2 Evidence about the Multiplicative Factor Structure

3.2.1 Single Factor Model for Variance

The empirical evidence from conducting PCA on stock RVs suggests the following
factor model:
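\[
V_{it} = a_{\xi,i} + b_{\xi,i}\,\xi_t + \varepsilon_{\xi,it}, \quad 1 \le i \le N, \tag{3.1}
\]

where $\xi_t$ is the single (latent) factor. Taking the average over $i$ on both sides yields

\[
CV_t = \frac{1}{N}\sum_{i=1}^{N} V_{it} = \bar{a}_\xi + \bar{b}_\xi\,\xi_t + \bar{\varepsilon}_{\xi,t}, \tag{3.2}
\]

where $\bar{a}_\xi = \sum_{i=1}^{N} a_{\xi,i}/N$, $\bar{b}_\xi = \sum_{i=1}^{N} b_{\xi,i}/N$, and $\bar{\varepsilon}_{\xi,t} = \sum_{i=1}^{N} \varepsilon_{\xi,it}/N$. We can hence rewrite model (3.1) using CV as the variance factor:

\[
V_{it} = a_i + b_i\,CV_t + \varepsilon_{it}, \quad 1 \le i \le N, \tag{3.3}
\]

where $a_i = a_{\xi,i} - \bar{a}_\xi b_{\xi,i}/\bar{b}_\xi + E(\varepsilon_{\xi,it}) - b_{\xi,i} E(\bar{\varepsilon}_{\xi,t})/\bar{b}_\xi$, $b_i = b_{\xi,i}/\bar{b}_\xi$, and $\varepsilon_{it} = \varepsilon_{\xi,it} - E(\varepsilon_{\xi,it}) - b_{\xi,i}\big(\bar{\varepsilon}_{\xi,t} - E(\bar{\varepsilon}_{\xi,t})\big)/\bar{b}_\xi$.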
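To estimate the coefficients $a_i$ and $b_i$ in model (3.3), we regress $RV_{it}$ on $CRV_t$, and obtain

\[
\widehat{b}_i = \frac{\sum_{t=1}^{T} (CRV_t - \overline{CRV})(RV_{it} - \overline{RV}_i)}{\sum_{t=1}^{T} (CRV_t - \overline{CRV})^2}, \quad \widehat{a}_i = \overline{RV}_i - \widehat{b}_i\,\overline{CRV}, \quad \text{for } 1 \le i \le N, \tag{3.4}
\]

where $\overline{CRV} = \sum_{t=1}^{T} CRV_t/T$, and $\overline{RV}_i = \sum_{t=1}^{T} RV_{it}/T$.
The next result shows that the estimators $(\widehat{a}_i, \widehat{b}_i)$ are consistent under the following mild assumption.

Assumption 6. The factor process $(\xi_t)$ is stationary. $(\varepsilon_{\xi,t}) = \big((\varepsilon_{\xi,1t}, \varepsilon_{\xi,2t}, \ldots, \varepsilon_{\xi,Nt})^T\big)$ is stationary and uncorrelated with $\xi_t$. Moreover, $|\bar{b}_\xi| > c$, and $\|\mathrm{Cov}(\varepsilon_{\xi,t})\|_2 \le C$ for some constants $c, C > 0$.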

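Proposition 1. Under the assumptions of Theorem 1 and Assumption 6,

\[
\max_{1 \le i \le N} |\widehat{b}_i - b_i| = O_p\Big(\sqrt{\Delta_n} + \sqrt{\frac{\log N}{T}} + \sqrt{\frac{1}{N}}\Big),
\]
\[
\max_{1 \le i \le N} |\widehat{a}_i - a_i| = O_p\Big(\sqrt{\Delta_n} + \sqrt{\frac{\log N}{T}} + \sqrt{\frac{1}{N}}\Big). \tag{3.5}
\]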
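A minimal sketch of the regression estimators in (3.4) is given below, assuming a (T, N) array of daily realized variances; the function name is hypothetical and introduced only for this illustration.

```python
import numpy as np

def cv_factor_regression(rv):
    """OLS of each RV_it on CRV_t as in (3.4); rv has shape (T, N)."""
    crv = rv.mean(axis=1)                    # common realized variance
    crv_c = crv - crv.mean()
    rv_c = rv - rv.mean(axis=0)
    b_hat = crv_c @ rv_c / (crv_c @ crv_c)   # slope for each stock, shape (N,)
    a_hat = rv.mean(axis=0) - b_hat * crv.mean()
    return a_hat, b_hat
```

Because $CRV_t$ is the cross-sectional average of the $RV_{it}$, the estimated slopes average to one and the estimated intercepts average to zero by construction, mirroring the normalization $b_i = b_{\xi,i}/\bar{b}_\xi$ in (3.3).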

3.2.2 From Additive to Multiplicative

We estimate model (3.3) using the S&P 500 Index constituent stock RVs. We have
the following interesting findings.
First, when checking the idiosyncratic component $\varepsilon_{it}$, we find a strong correlation between $\varepsilon_{it}^2$ and $CV_t^2$, while the correlation between $\varepsilon_{it}^2/CV_t^2$ and $CV_t^2$ is almost zero. This result suggests that $\varepsilon_{it}$ scales with $CV_t$. In addition, after checking the distribution of $\varepsilon_{it}/CV_t$, we find that $\varepsilon_{it}$ can be well modeled by the product of $CV_t$ and a centered lognormal random variable, namely, $\varepsilon_{it} = CV_t\big(\exp(\mu_i + \sigma_i z_{it}) - \exp(\mu_i + \sigma_i^2/2)\big)$. Details are given in Appendix A.1 of the Supplementary Material.
Second, when further analyzing the coefficients $a_i$ and $b_i$, strong evidence suggests that the intercept terms $(a_i)_{1 \le i \le N}$ in (3.3) are close to zero. In addition, the slope term $b_i$ in (3.3) and $\exp(\mu_i + \sigma_i^2/2)$, that is, the expectation of the lognormal term, are approximately equal. These results motivate us to impose the restrictions: $a_i = 0$, and $b_i = \exp(\mu_i + \sigma_i^2/2)$, $1 \le i \le N$.
Combining the findings above yields a multiplicative factor model, which we
present in the next subsection. The detailed analysis results are relegated to Ap-
pendix A in the Supplementary Material.
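For the reader's convenience, the algebra behind this step can be written out explicitly. Substituting the restrictions $a_i = 0$ and $b_i = \exp(\mu_i + \sigma_i^2/2)$, together with the lognormal form of $\varepsilon_{it}$ described above, into the additive model (3.3) gives:

```latex
\begin{aligned}
V_{it} &= a_i + b_i\,CV_t + \varepsilon_{it} \\
       &= 0 + e^{\mu_i+\sigma_i^2/2}\,CV_t
          + CV_t\left(e^{\mu_i+\sigma_i z_{it}} - e^{\mu_i+\sigma_i^2/2}\right) \\
       &= CV_t\, e^{\mu_i+\sigma_i z_{it}} .
\end{aligned}
```

The centering term is the lognormal mean, since $E\big[e^{\mu_i+\sigma_i z}\big] = e^{\mu_i+\sigma_i^2/2}$ for $z \sim N(0,1)$, so the additive error has mean zero whenever $z_{it}$ is independent of $CV_t$.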

3.3 Main Model: Multiplicative Volatility Factor Model

The analysis in Section 3.2.2 leads us to propose the following Multiplicative Volatility
Factor (MVF) model:

\[
V_{it} = \xi_t \exp(\mu_i + \sigma_i z_{it}), \quad 1 \le i \le N, \tag{3.6}
\]

where $\xi_t$ is the multiplicative latent factor, and $\exp(\mu_i + \sigma_i z_{it})$ is a multiplicative idiosyncratic component, where $z_{it} \sim N(0, 1)$ is independent of $\xi_t$. We denote $\widetilde{V}_{it} = \exp(\mu_i + \sigma_i z_{it})$ and refer to it as the idiosyncratic variance exposure (iE). We allow $\log(\widetilde{V}_{it})$ to be dependent over time. The latent factor $\xi_t$ can be well approximated by the common variance $CV_t$. Correspondingly, $\widetilde{V}_{it} = V_{it}/CV_t$.
The MVF model (3.6) has several desirable properties: 1) it reflects important
volatility characteristics such as non-negativity and heavy-tailedness; 2) it has a sim-
ple form that eases the model estimation and volatility prediction; 3) it enjoys internal
model consistency in the following sense: if a model applies to individual assets, then
it also applies to portfolios of the assets. We explain the three properties in more
detail below.
First, volatility is usually modeled in a multiplicative way. Examples include the
lognormal stochastic volatility model (Hull and White (1987)), EGARCH (Nelson
(1991)), and realized-GARCH with log-linear specification (Hansen, Huang, and Shek
(2012)). The multiplicative model (3.6) naturally captures important features in
volatilities such as non-negativity and heavy-tailedness.
Second, the MVF model (3.6) has a surprisingly neat format, which makes the model estimation and volatility prediction very straightforward. The common factor is the common variance, and the only two model parameters, $\mu_i$ and $\sigma_i$, can be easily estimated using the sample mean and standard deviation of $\log(\widetilde{RV}_{it})$, where $\widetilde{RV}_{it} = RV_{it}/CRV_t$ is the idiosyncratic realized variance exposure (iRE). For volatility forecasting, $\log(CV_t)$ and $\log(\widetilde{V}_{it})$ can be separately modeled by, for example, a HAR model (Corsi (2009)).
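The estimation of $\mu_i$ and $\sigma_i$ described above can be sketched in a few lines. This is an illustration under the assumption that realized variances are stored in a (T, N) array; the function name and the use of the unbiased (ddof = 1) standard deviation are choices made for the sketch.

```python
import numpy as np

def estimate_mvf_params(rv):
    """Estimate (mu_i, sigma_i) from log idiosyncratic realized exposures."""
    crv = rv.mean(axis=1, keepdims=True)     # common realized variance, (T, 1)
    log_ire = np.log(rv / crv)               # log(RV_it / CRV_t)
    mu_hat = log_ire.mean(axis=0)
    sigma_hat = log_ire.std(axis=0, ddof=1)  # ddof=1 is an assumption
    return mu_hat, sigma_hat
```

In simulations where the exposures are normalized to have unit mean (so that CRV tracks the latent factor closely for large N), the estimates are close to the true parameters.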
Third, the MVF model (3.6) is a special case of the additive linear factor model
(3.1), and hence describes the factor structure in variance. When evaluating portfolio
risk, the variance of a portfolio involves a linear combination of underlying stock vari-
ances. Under our MVF model, the portfolio’s volatility inherits the factor component
from the underlying stock volatilities. That is, the MVF model enjoys internal model
consistency.
We find that PCA on log(RV) also suggests a single factor model structure. The modeling of log variance will lead to a log-linear single factor model:

\[
\log(V_{it}) = a_i' + b_i'\,\xi_t + \varepsilon_{it}'. \tag{3.7}
\]

Our MVF model (3.6) is closely related to (3.7). However, there are subtle but important differences between them.
Comparing the MVF model to the single factor model for log variance, we see that model (3.7) is more difficult to interpret and does not enjoy internal model consistency. Note that model (3.7) is equivalent to

\[
V_{it} = \xi_t^{\,b_i'}\, e^{a_i' + \varepsilon_{it}'},
\]

where the coefficient $b_i'$ becomes the exponent, which makes the interpretation difficult. One natural choice of the factor $\xi_t$ in (3.7) is the common log variance (ClogV), namely, the cross-sectional average of the log variances. We estimate the coefficients $b'$ in the log-linear model (3.7) by regressing log(RV) on ClogRV, the cross-sectional average of the log RVs. We find that the coefficient $b'$ varies around 1, with an interquartile range of 0.89∼1.08. In particular, $b'$ can deviate from 1. As a result, model (3.7) does not enjoy internal model consistency.
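The contrast can be made explicit with a short calculation. Take any linear combination of stock variances with generic weights $c_i$ (the weights are hypothetical and introduced only for this illustration; covariance terms in a portfolio variance are not considered here):

```latex
\sum_{i=1}^{N} c_i V_{it}
  = \xi_t \sum_{i=1}^{N} c_i \exp(\mu_i + \sigma_i z_{it})
  \quad \text{under (3.6)},
\qquad
\sum_{i=1}^{N} c_i V_{it}
  = \sum_{i=1}^{N} c_i\, \xi_t^{\,b_i'}\, e^{a_i' + \varepsilon_{it}'}
  \quad \text{under (3.7)}.
```

Under (3.6) the factor $\xi_t$ factors out of the sum, so the combination retains the multiplicative form. Under (3.7) this is possible only when all the exponents $b_i'$ are equal.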
In addition, the MVF model has the same format as the linear model for log variance with the slope term constrained to be one. Note that if all $b_i' = 1$, then model (3.7) becomes our MVF model (3.6). The difference between the two is the choice of factor. Under the linear model for log variance, a natural choice of the factor is ClogV. We find that CRV and the exponential of ClogRV are almost identical with a correlation higher than 0.99. The volatility prediction performance of the MVF model and the constrained log-linear model is also very similar. We conclude that the MVF model is almost equivalent to the log-linear single factor model with the slope term constrained to be one.
As a result of the above comparison, we recommend our MVF model for variance
over the linear model for log variance.

Remark 2. Barigozzi and Hallin (2020) discuss a factor model for log variance and
the estimation consistency of the factor structure. However, we find that modeling
variance rather than log variance has several advantages as discussed above. In addition,
they assume no heteroskedasticity in returns. In contrast, we model the dynamic
volatilities and naturally allow heteroskedasticity in returns. Moreover, our model does
not rely on factor structure in returns.

Remark 3. The MVF model (3.6) can be easily modified to include multiple factors. When multiple factors exist, the generalized MVF takes the following form:

\[
V_{it} = \Big(\sum_{k=1}^{K} \xi_{kt} \exp(\mu_{ki})\Big) \exp(\sigma_i z_{it}), \quad 1 \le i \le N, \tag{3.8}
\]

where $(\xi_{kt})_{k=1}^{K}$ are $K$ factors, and $\big(\exp(\mu_{ki} + \sigma_i z_{it})\big)_{k=1}^{K}$ are the multiplicative idiosyncratic exposures. Model (3.8) can be analyzed, estimated, and used in volatility forecasting in a similar way to the single factor model.

4 Volatility Forecasting
In this section, we utilize the proposed MVF model (3.6) for volatility forecasting.


4.1 Forecasting Models

4.1.1 Our Approach: MVF Model

We use $CV_t$ as the proxy of the latent factor in our proposed MVF model (3.6).$^5$ The idiosyncratic variance exposure is $\widetilde{V}_{it} = V_{it}/CV_t$. We estimate $CV_t$ by the common realized variance $CRV_t$ and compute the idiosyncratic realized variance exposures (iRE), $\widetilde{RV}_{it} := RV_{it}/CRV_t$ for $i = 1, \ldots, N$. Then, we model $\log(CRV_t)$ and $\log(\widetilde{RV}_{it})$ separately using the HAR model$^6$ (Corsi (2009)):

\[
x_{t+1} = \theta_0 + \theta_d x_t + \theta_w x_{t-5,t} + \theta_m x_{t-22,t} + u_t, \tag{4.1}
\]

where $x_t$ represents $\log(CRV_t)$ or $\log(\widetilde{RV}_{it})$, $x_{t-5,t}$ and $x_{t-22,t}$ are the previous one-week and one-month averages of $x_t$, respectively, $u_t \overset{\text{i.i.d.}}{\sim} N(0, \sigma_u^2)$, and $\theta_0$, $\theta_d$, $\theta_w$, $\theta_m$ and $\sigma_u$ are constants. The parameters in (4.1) and the models in benchmark approaches presented in the next subsection are estimated with a 252-day rolling window. We denote the forecasts of $CV$ and $\widetilde{V}_i$ on day $t + 1$ as $\widehat{CV}_{t+1}$ and $\widehat{\widetilde{V}}_{it+1}$ for $1 \le i \le N$. Then, the forecast of the volatility of stock $i$ on day $t + 1$ is

\[
\widehat{V}_{it+1} = \widehat{CV}_{t+1} \times \widehat{\widetilde{V}}_{it+1}, \quad i = 1, \ldots, N.
\]

We denote the predictions by our proposed MVF model as CVlogHAR × iElogHAR.
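A minimal sketch of this forecasting recipe is shown below, for illustration only. Several details are assumptions of the sketch and are not specified in the text: the weekly and monthly regressors are taken as the averages of the last 5 and 22 observations ending at day $t$, the HAR model is fitted by ordinary least squares, and the log forecasts are mapped back to variances by plain exponentiation without a bias correction. All function names are hypothetical.

```python
import numpy as np

def har_design(x):
    """Rows are (1, x_t, 5-day average, 22-day average) for t = 21, ..., len(x) - 1."""
    t_idx = np.arange(21, len(x))
    weekly = np.array([x[t - 4:t + 1].mean() for t in t_idx])
    monthly = np.array([x[t - 21:t + 1].mean() for t in t_idx])
    return np.column_stack([np.ones(len(t_idx)), x[t_idx], weekly, monthly])

def har_forecast(x):
    """Fit (4.1) by OLS on the window x and return the one-step-ahead forecast."""
    X = har_design(x)
    # regress x_{t+1} on the regressors dated t; the last row has no target yet
    theta, *_ = np.linalg.lstsq(X[:-1], x[22:], rcond=None)
    return X[-1] @ theta

def mvf_forecast(rv_window):
    """rv_window: (T, N) realized variances in the estimation window."""
    crv = rv_window.mean(axis=1)
    log_crv_next = har_forecast(np.log(crv))
    log_ire = np.log(rv_window / crv[:, None])
    log_ire_next = np.array([har_forecast(log_ire[:, i])
                             for i in range(log_ire.shape[1])])
    # product of the factor forecast and the exposure forecasts
    return np.exp(log_crv_next) * np.exp(log_ire_next)
```

In a rolling exercise, `mvf_forecast` would be called once per day on the most recent 252 rows of the realized variance panel.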

4.1.2 Benchmark Models

We compare our proposed MVF model with the following benchmark models.

BM1: Individual Volatility Modeling


This approach predicts each stock volatility using a logHAR model.$^7$ Specifically, for each stock $i$, we fit $\log(RV_{it})$ with a HAR model (4.1). The fitted model is then used for prediction. This approach does not incorporate cross-sectional information, neither the factor structure in returns nor that in volatilities. We denote the prediction of this method as IdvlogHAR.

$^5$ We also evaluate the performance of the MVF model by replacing CV with the market volatility as the factor proxy. The forecasting results are much worse than the model with CV factor.
$^6$ Fitting $\log(\widetilde{RV}_{it})$ with a HAR model is equivalent to first estimating $\widehat{\mu}_i$ and $\widehat{\sigma}_i$ from model (3.6) with the sample mean and sample standard deviation of $\log(\widetilde{RV}_{it})$, then fitting $\widehat{z}_{it} = (\log(\widetilde{RV}_{it}) - \widehat{\mu}_i)/\widehat{\sigma}_i$ with a HAR model.
$^7$ We also evaluate the forecasting performance of the standard HAR model by fitting a HAR model directly on variance. When comparing the logHAR model with the standard HAR model, we find that the standard HAR model performs worse.

BM2: Return Factor Model + Individual Idiosyncratic Volatility Modeling

This approach predicts systematic and idiosyncratic components of the volatility separately using return factor models. On each day $t$, we estimate $\beta$ and $(U_{t[j]})_{1 \le j \le n}$ using a 252-day rolling window under the Fama-French three-factor model or a statistical factor model.$^8$
As to volatility forecasting, similar to BM1, the idiosyncratic variance (iV) is predicted with a logHAR model. To forecast the factor covariance matrix $\Sigma_{F_{t+1}}$, BM2 uses the realized GARCH-DCC (rG) model. Specifically, we use the realized-GARCH(1,1) model with a log-linear specification (Eqn. (1), (4) and (5) in Hansen, Huang, and Shek (2012)) to forecast the factor variances, and use the DCC model (Engle (2002)) to forecast the correlation matrix of the factor returns. We denote the resulting forecast of the factor covariance matrix as $\widehat{\Sigma}_{F_{t+1}}$. Then, the forecast of the stock variance is

\[
\widehat{V}_{it+1} = \widehat{\beta}_{i(t)}^T\, \widehat{\Sigma}_{F_{t+1}}\, \widehat{\beta}_{i(t)} + \widehat{iV}_{it+1}, \quad 1 \le i \le N.
\]

This approach utilizes the cross-sectional structure in returns but does not incorporate the factor structure in idiosyncratic variances. We denote the prediction of the method under the Fama-French three-factor model as FF3rG + iVlogHAR and the prediction under the statistical factor model as StatsFrG + iVlogHAR.

$^8$ Specifically, for the statistical factor model, we use five PCs in stock returns as return factors, estimated with a 252-day rolling window based on the high-frequency 5-min returns.

BM3: Return Factor Model + Common Idiosyncratic Volatility Modeling


This approach utilizes the return factor model as well as the factor structure in idiosyncratic variances. As in the BM2 approach, the systematic and idiosyncratic components of the volatility are predicted separately, and the systematic component is predicted using the realized GARCH-DCC model. For the forecasting of the idiosyncratic variance, BM3 utilizes the following single factor structure in idiosyncratic variance:

\[
iV_{it} = c_{0i} + c_{1i}\, CiV_t + \varepsilon_{it}, \quad 1 \le i \le N.
\]

On each day $t$, $c_{0i}$ and $c_{1i}$ are estimated using a 252-day rolling window regression of the idiosyncratic realized variance $iRV_i$ on the common idiosyncratic realized variance $CiRV$. We employ a logHAR model to predict the CiV factor. The residual of the factor model for idiosyncratic variance, $\varepsilon$, is modeled by an AR(1), $\varepsilon_{i,t} = \xi_i \varepsilon_{i,t-1} + u_{it}$.$^9$ We predict the idiosyncratic variance of stock $i$ on day $t + 1$ with $\widehat{iV}_{it+1} = \widehat{c}_{0i(t)} + \widehat{c}_{1i(t)}\, \widehat{CiV}_{t+1} + \widehat{\varepsilon}_{it+1}$. The forecast of the stock volatility is

\[
\widehat{V}_{it+1} = \widehat{\beta}_{i(t)}^T\, \widehat{\Sigma}_{F_{t+1}}\, \widehat{\beta}_{i(t)} + \widehat{c}_{0i(t)} + \widehat{c}_{1i(t)}\, \widehat{CiV}_{t+1} + \widehat{\varepsilon}_{it+1}, \quad i = 1, \ldots, N.
\]

We denote the prediction of BM3 under the FF3 model as FF3rG + iVCiV and the prediction of BM3 under the statistical factor model as StatsFrG + iVCiV.

4.2 Evaluation Metrics

We evaluate the performance of different models in forecasting daily continuous variance. We use the same S&P 500 Index constituent stocks data described in Section 2.1. The evaluation period is from January 2004 to December 2020. The performance of different approaches is evaluated by Q-like (Patton (2011)):$^{10}$

\[
QLIKE_i = \frac{1}{L}\sum_{t=1}^{L}\Big(\log\Big(\frac{\widehat{V}_{it}}{RV_{it}}\Big) + \frac{RV_{it}}{\widehat{V}_{it}} - 1\Big), \quad i = 1, \ldots, N,
\]

where $\widehat{V}_{it}$ is stock $i$'s forecast variance on day $t$, $RV_{it}$ is the truncated realized variance, and $L$ is the total length of the forecasting period. The Q-like measure is robust to the presence of noise in the volatility proxy; see Patton (2011). A smaller Q-like indicates a better volatility prediction. We use the same formulation of Q-like loss as Bollerslev, Patton, and Quaedvlieg (2016) so that the Q-likes for different stocks are standardized by the stocks' volatilities.

$^9$ We also check the method that predicts $\varepsilon$ with 0. The performance is worse than the AR(1) model.
$^{10}$ Besides Q-like, we also evaluate the performance using out-of-sample $R^2$. Our approach performs consistently well compared to the benchmark models in most of the years under evaluation.
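The Q-like loss above can be computed with a short helper. This is a sketch that assumes the forecasts and realized variances are strictly positive arrays of matching shape; the function name is hypothetical.

```python
import numpy as np

def qlike(forecast, rv):
    """Average Q-like loss: log(forecast / rv) + rv / forecast - 1.

    For 2-D inputs of shape (L, N), the average is taken over time (axis 0).
    """
    ratio = np.asarray(rv) / np.asarray(forecast)
    # log(forecast / rv) equals -log(rv / forecast)
    return np.mean(-np.log(ratio) + ratio - 1.0, axis=0)
```

The loss is zero for a perfect forecast and positive otherwise. For example, a forecast that is twice the realized variance on every day gives $\log 2 - 1/2 \approx 0.193$.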

4.3 Forecasting Results

In Table 2, we summarize the Q-likes of different models in forecasting the volatilities of the S&P 500 Index constituent stocks. The results show that our MVF model, CVlogHAR × iElogHAR, outperforms the benchmark models in that its Q-likes have the lowest mean, median, 25% and 75% quantiles.

Table 2: Summary statistics of the Q-likes of various forecasting models in predicting S&P 500 Index constituent stocks’ daily volatilities from January 2004 to December 2020. The total number of stocks under evaluation is 291, and the length of the evaluation period is L = 4239. The reported values are the 25% quantile (Q1), median, mean and the 75% quantile (Q3) of Q-likes across the stocks under evaluation.

Forecasting models              Q1       Median   Mean     Q3

BM1: Individual volatility modeling
  IdvlogHAR                     0.126    0.135    0.140    0.149
BM2: Return factor model + Individual idiosyncratic volatility modeling
  FF3rG + iVlogHAR              0.127    0.136    0.142    0.150
  StatsFrG + iVlogHAR           0.129    0.137    0.144    0.150
BM3: Return factor model + Common idiosyncratic volatility modeling
  FF3rG + iVCiV                 0.134    0.143    0.159    0.165
  StatsFrG + iVCiV              0.136    0.145    0.161    0.166
MVF model
  CVlogHAR × iElogHAR           0.120    0.129    0.135    0.142


We further compare the forecasting performance stock by stock and test the significance of the Q-like differences between our MVF model and the benchmark models. Specifically, we compare the Q-like of the MVF model with those of the benchmark models for each stock, and compute the percentage of stocks for which our model generates lower Q-likes:

\[
\text{Out.perf.} = \frac{1}{N}\sum_{i=1}^{N} 1_{\{QLIKE_{MVF,i} < QLIKE_{bm,i}\}}, \tag{4.2}
\]

where $N = 291$ is the total number of stocks under evaluation, and $QLIKE_{MVF,i}$ and $QLIKE_{bm,i}$ are the Q-likes of our MVF model and the benchmark model for the $i$th stock, respectively. Furthermore, we perform the Diebold-Mariano (DM) test (Diebold and Mariano (1995)) to examine the significance of the Q-like differences between our MVF model and the benchmark models. Specifically, we perform the following one-sided Q-like difference test:

\[
H_0: E(e_{1,t} - e_{2,t}) \ge 0 \quad \text{vs.} \quad H_1: E(e_{1,t} - e_{2,t}) < 0,
\]

where $e_{m,t} = \log(\widehat{V}_{m,t}/RV_t) + RV_t/\widehat{V}_{m,t} - 1$ is the Q-like loss, and $m = 1, 2$ index our MVF model and the benchmark model, respectively. We write $d_t = e_{1,t} - e_{2,t}$. The DM test statistic is $\bar{d}/\widehat{\sigma}(\bar{d})$, where $\bar{d} = \sum_{t=1}^{L} d_t/L$ and $\widehat{\sigma}(\bar{d})$ is the standard error of $\bar{d}$ estimated by a heteroskedasticity- and autocorrelation-consistent (HAC) estimator. We then compute the proportion of stocks where our MVF model generates statistically significantly lower Q-likes than the benchmark models:

\[
\text{Sig.Out.perf.} = \frac{1}{N}\sum_{i=1}^{N} 1_{\{p_i < \alpha\}}, \tag{4.3}
\]

where $p_i$ is the p-value of the DM test for the $i$th stock, and $\alpha = 5\%$ is the significance level.
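The one-sided DM test and the two proportions in (4.2) and (4.3) can be sketched as follows. The text does not specify the HAC estimator, so the Bartlett (Newey-West) kernel, the rule-of-thumb lag choice, and the standard normal reference distribution used below are assumptions of this sketch; the function names are hypothetical.

```python
import math
import numpy as np

def dm_test_one_sided(e1, e2, lags=None):
    """DM statistic and p-value for H1: E(e1 - e2) < 0."""
    d = np.asarray(e1) - np.asarray(e2)
    L = len(d)
    if lags is None:
        # rule-of-thumb bandwidth (an assumption, not taken from the paper)
        lags = int(np.floor(4 * (L / 100.0) ** (2.0 / 9.0)))
    d_bar = d.mean()
    dc = d - d_bar
    lrv = dc @ dc / L                       # lag-0 autocovariance
    for k in range(1, lags + 1):
        weight = 1.0 - k / (lags + 1.0)     # Bartlett kernel weight
        lrv += 2.0 * weight * (dc[k:] @ dc[:-k]) / L
    stat = d_bar / math.sqrt(lrv / L)
    # left-tail p-value under a standard normal approximation
    p_value = 0.5 * (1.0 + math.erf(stat / math.sqrt(2.0)))
    return stat, p_value

def outperformance(qlike_mvf, qlike_bm, p_values, alpha=0.05):
    """Proportions in (4.2) and (4.3) across stocks."""
    out = np.mean(np.asarray(qlike_mvf) < np.asarray(qlike_bm))
    sig = np.mean(np.asarray(p_values) < alpha)
    return out, sig
```

A strongly negative statistic (small p-value) indicates that the first model has a significantly lower expected Q-like loss than the second.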
The outperformance proportion (Out.perf.) of our proposed MVF model over the benchmark models and the significant outperformance proportion (Sig.Out.perf.) are reported in Table 3. The results show that the MVF model, CVlogHAR × iElogHAR, yields a more accurate prediction than the benchmark models for almost all the stocks. Moreover, the DM test results show that the outperformance of our MVF model is statistically significant for a very high percentage of stocks under evaluation.

Table 3: Outperformance proportion of the proposed MVF model over the benchmark models among the S&P 500 Index constituent stocks in terms of the Q-like measure during January 2004–December 2020. The values are the percentages of stocks for which the MVF model outperforms the benchmark models.

MVF Model: CVlogHAR × iElogHAR       Outperformance proportion
vs. Benchmark models                 Out.perf. (%)    Sig.Out.perf. (%)

BM1: IdvlogHAR                       99.3             93.5
BM2: FF3rG + iVlogHAR                96.9             86.6
BM2: StatsFrG + iVlogHAR             97.9             93.1
BM3: FF3rG + iVCiV                   99.3             94.5
BM3: StatsFrG + iVCiV                99.0             96.6

The prediction results have several implications. First, the outperformance of our
proposed MVF model compared with the model that only uses the individual stock’s
information, BM1:IdvlogHAR , demonstrates the benefit of utilizing cross-sectional in-
formation in individual stock volatility prediction. Second, our MVF model has a
dominant outperformance compared with the model that uses only return factor mod-
els, BM2: FF3rG +iVlogHAR and BM2: StatsFrG +iVlogHAR , showing the importance
of incorporating the stock/idiosyncratic variance factor structure. Third, our MVF
model not only simplifies the forecasting but also generates more accurate forecasts
than the more complex models that combine return factors with a common idiosyncratic
variance factor (BM3: FF3rG + iVCiV and StatsFrG + iVCiV). These comparisons
demonstrate the solid advantages of the MVF model in volatility forecasting.

5 Global Evidence
The MVF model is built upon empirical evidence from the S&P 500 Index constituent stocks. We next examine whether the MVF model also applies to the global market. We perform a parallel analysis of the factor structure in the global market using daily realized variances of 31 global equity indices$^{11}$ from January 1, 2001 to March 12, 2021 obtained from the Oxford-Man Institute’s “realized library”. In the global equity indices’ volatilities, we also find a strong co-movement feature and high pairwise correlation, with a mean pairwise correlation of 0.5. When performing PCA on the indices’ RVs, we find that a high proportion (67%) of the total variation in global equity indices’ RVs can be explained by the first PC, while none of the second and higher PCs accounts for a substantially higher proportion than the rest. Estimators of the number of factors (Bai and Ng (2002), Ahn and Horenstein (2013)) also suggest a single factor in the global indices’ variances. We find that CRV (the cross-sectional average of all indices’ RVs) has a 0.988 correlation with the first PC and can still be a good proxy for the variance factor in the global market. In addition, similar to the US market, strong evidence suggests that the multiplicative factor structure still holds for the volatilities in the global market.$^{12}$
We next evaluate the performance of our MVF model in forecasting the volatility of the global indices. We forecast the volatility based on our MVF model, CVlogHAR × iElogHAR, and compare the results with the benchmark model BM1: IdvlogHAR. We do not include BM2 and BM3 because there is no well-established factor model for the global indices. In Table 4, we summarize the Q-likes of these two models in forecasting the volatilities of the global equity indices. Table 4 shows that our MVF model outperforms BM1: IdvlogHAR in that its Q-likes have the lower mean, median, 25% and 75% quantiles. We also find that our MVF model outperforms BM1 for 25 out of the 31 indices we study (80.6%). The results confirm the advantage of our MVF model in global market volatility forecasting.

Table 4: Summary statistics of the Q-likes of the forecasting models in predicting global equity indices’ daily volatilities from January 2001 to March 2021. The reported values are the 25% quantile (Q1), median, mean and the 75% quantile (Q3) of Q-likes across the indices under evaluation.

Forecasting models              Q1       Median   Mean     Q3

BM1: IdvlogHAR                  0.152    0.185    0.189    0.222
MVF: CVlogHAR × iElogHAR        0.146    0.183    0.183    0.209

$^{11}$ The data from different indices are synchronized by treating GMT 00:00 – GMT 23:59 as the same day. There is no market opening overnight.
$^{12}$ The detailed results are available upon request.

6 Conclusion
This work provides a framework to study the factor structure in stock variance based
on high-frequency and high-dimensional price data. We theoretically show that the
factor structure in stock and idiosyncratic variance can be consistently estimated by
conducting PCA on the stock/idiosyncratic realized variances. Empirically, based on
the strong empirical evidence from the analysis of daily volatilities of S&P 500 Index
constituent stocks, we propose a multiplicative volatility factor (MVF) model. The
MVF model includes a multiplicative variance factor and a multiplicative idiosyncratic
component, where the variance factor is approximately the cross-sectional average of
stock variances. Based on the proposed MVF model, we develop a forecasting model,
which is found to provide more accurate volatility forecasts than various benchmark
approaches for a majority of the stocks we evaluate. Finally, we demonstrate that
our MVF model also applies to the global market and helps to predict the volatilities
of global equity indices.
The volatility factor modeling framework that we propose facilitates a deeper
understanding of the financial market. The MVF model achieves dimension reduction
in volatility modeling for a large cross-section of assets. Besides volatility forecasting,
our framework provides insights into the study of shock spillover and transmission in
financial systems and holds promise in applications such as large portfolio allocation,
risk management and volatility trading.

References
Ahn, Seung C. and Alex R. Horenstein (2013): “Eigenvalue ratio test for the
number of factors,” Econometrica, 81, 1203–1227.

Aït-Sahalia, Yacine, Jianqing Fan, and Yingying Li (2013): “The leverage
effect puzzle: Disentangling sources of bias at high frequency,” Journal of Financial
Economics, 109, 224–249.

Aït-Sahalia, Yacine and Loriano Mancini (2008): “Out of sample forecasts of
quadratic variation,” Journal of Econometrics, 147, 17–33.

Aït-Sahalia, Yacine and Dacheng Xiu (2017): “Using principal component
analysis to estimate a high dimensional factor model with high-frequency data,”
Journal of Econometrics, 201, 384–399.

Aleti, Saketh (2022): “The high-frequency factor zoo,” Working paper.

Andersen, Torben G and Luca Benzoni (2010): “Do bonds span volatility risk
in the US Treasury market? A specification test for affine term structure models,”
The Journal of Finance, 65, 603–653.

Andersen, Torben G, Tim Bollerslev, Francis X Diebold, and Paul
Labys (2003): “Modeling and forecasting realized volatility,” Econometrica, 71,
579–625.

Asai, Manabu and Michael McAleer (2015): “Forecasting co-volatilities via
factor models with asymmetry and long memory in realized covariance,” Journal
of Econometrics, 189, 251–262.

Bai, Jushan (2003): “Inferential theory for factor models of large dimensions,”
Econometrica, 71, 135–171.

Bai, Jushan and Serena Ng (2002): “Determining the number of factors in
approximate factor models,” Econometrica, 70, 191–221.

Barigozzi, Matteo and Marc Hallin (2017): “Generalized dynamic factor
models and volatilities: estimation and forecasting,” Journal of Econometrics, 201,
307–321.

——— (2020): “Generalized dynamic factor models and volatilities: Consistency,
rates, and prediction intervals,” Journal of Econometrics, 216, 4–34.

Barndorff-Nielsen, Ole E and Neil Shephard (2002): “Econometric analysis
of realized volatility and its use in estimating stochastic volatility models,” Journal
of the Royal Statistical Society: Series B (Statistical Methodology), 64, 253–280.

——— (2004): “Power and bipower variation with stochastic volatility and jumps,”
Journal of Financial Econometrics, 2, 1–37.

Bollerslev, Tim (1986): “Generalized autoregressive conditional
heteroskedasticity,” Journal of Econometrics, 31, 307–327.

Bollerslev, Tim, Benjamin Hood, John Huss, and Lasse Heje Pedersen
(2018): “Risk everywhere: Modeling and managing volatility,” The Review of
Financial Studies, 31, 2729–2773.

Bollerslev, Tim, Andrew J Patton, and Rogier Quaedvlieg (2016):
“Exploiting the errors: A simple approach for improved volatility forecasting,” Journal
of Econometrics, 192, 1–18.

Bollerslev, Tim and Viktor Todorov (2011): “Estimation of jump tails,”
Econometrica, 79, 1727–1783.

Calvet, Laurent E, Adlai J Fisher, and Samuel B Thompson (2006):
“Volatility comovement: a multifrequency approach,” Journal of Econometrics, 131,
179–215.

Chamberlain, Gary and Michael Rothschild (1983): “Arbitrage, Factor
Structure, and Mean-Variance Analysis on Large Asset Markets,” Econometrica,
51, 1281–1304.

Chen, Nai-Fu, Richard Roll, and Stephen A Ross (1986): “Economic forces
and the stock market,” Journal of Business, 383–403.

Clark, Peter K (1973): “A subordinated stochastic process model with finite
variance for speculative prices,” Econometrica: Journal of the Econometric Society,
135–155.

Corsi, Fulvio (2009): “A simple approximate long-memory model of realized
volatility,” Journal of Financial Econometrics, 7, 174–196.

Da, Zhi and Ernst Schaumburg (2006): “The factor structure of realized
volatility and its implications for option pricing.”

32

Electronic copy available at: https://ssrn.com/abstract=4282265


Diebold, Francis X and Roberto S Marino (1995): “Comparing Predictive
Accuracy,” Journal of Business & Economic Statistics, 13.

Ding, Yi, Robert Engle, Yingying Li, and Xinghua Zheng (2022): “Supple-
ment to “Factor modeling for volatility”,” .

Ding, Yi, Yingying Li, and Xinghua Zheng (2021): “High dimensional min-
imum variance portfolio estimation under statistical factor models,” Journal of
Econometrics, 222, 502–515.

Engle, Robert (1982): “Autoregressive conditional heteroscedasticity with esti-


mates of the variance of united kingrom inflation,” Econometrica, 50, 391–407.

——— (2002): “Dynamic conditional correlation: A simple class of multivariate gen-


eralized autoregressive conditional heteroskedasticity models,” Journal of Business
& Economic Statistics, 20, 339–350.

Engle, Robert and Susana Martin (2019): “Measuring and hedging geopolitical
risks,” .

Engle, Robert F, Takatoshi Ito, and Wen-Ling Lin (1990): “Meteor Showers
or Heat Waves? Heteroskedastic Intra-Daily Volatility in the Foreign Exchange
Market,” Econometrica, 58, 525–542.

Fama, Eugene F and Kenneth R French (1993): “Common risk factors in the
returns on stocks and bonds,” Journal of Financial Economics, 33, 3–56.

——— (2015): “A five-factor asset pricing model,” Journal of financial economics,
116, 1–22.

——— (2018): “Choosing factors,” Journal of financial economics, 128, 234–252.

Fan, Jianqing, Yuan Liao, and Martina Mincheva (2013): “Large covariance
estimation by thresholding principal orthogonal complements,” J. R. Stat. Soc. Ser.
B. Stat. Methodol., 75, 603–680, with 33 discussions by 57 authors and a reply by
Fan, Liao and Mincheva.



Gençay, Ramazan, Michel Dacorogna, Ulrich A Muller, Olivier
Pictet, and Richard Olsen (2001): An introduction to high-frequency finance,
Elsevier.

Gonçalves, Sílvia and Nour Meddahi (2009): “Bootstrapping realized
volatility,” Econometrica, 77, 283–306.

Hansen, Peter Reinhard, Zhuo Huang, and Howard Howan Shek (2012):
“Realized garch: a joint model for returns and realized measures of volatility,”
Journal of Applied Econometrics, 27, 877–906.

Herskovic, Bernard, Bryan Kelly, Hanno Lustig, and Stijn
Van Nieuwerburgh (2016): “The common factor in idiosyncratic volatility:
Quantitative asset pricing implications,” Journal of Financial Economics, 119,
249–283.

Hounyo, Ulrich, Sílvia Gonçalves, and Nour Meddahi (2017):
“Bootstrapping pre-averaged realized volatility under market microstructure noise,”
Econometric Theory, 33, 791–838.

Hull, John and Alan White (1987): “The pricing of options on assets with
stochastic volatilities,” The Journal of Finance, 42, 281–300.

Jacod, Jean, Yingying Li, Per A Mykland, Mark Podolskij, and Mathias
Vetter (2009): “Microstructure noise in the continuous case: the pre-averaging
approach,” Stochastic processes and their applications, 119, 2249–2276.

Jacod, Jean, Yingying Li, and Xinghua Zheng (2019): “Estimating the
integrated volatility with tick observations,” Journal of Econometrics, 208, 80–100.

Jacod, Jean and Philip Protter (1998): “Asymptotic error distributions for the
Euler method for stochastic differential equations,” The Annals of Probability, 26,
267–307.

Kapadia, Nishad, Matthew Linn, and Bradley S Paye (2020): “One Vol to
Rule Them All: Common Volatility Dynamics in Factor Returns,” Available at
SSRN 3606637.



Kelly, Bryan, Hanno Lustig, and Stijn Van Nieuwerburgh (2013): “Firm
volatility in granular networks,” Tech. rep., National Bureau of Economic Research.

Li, Jia, Yunxiao Liu, and Dacheng Xiu (2019): “Efficient estimation of
integrated volatility functionals via multiscale jackknife,” The Annals of Statistics, 47,
156–176.

Li, Jia, Viktor Todorov, and George Tauchen (2016): “Inference theory for
volatility functional dependencies,” Journal of Econometrics, 193, 17–34.

——— (2017): “Jump regressions,” Econometrica, 85, 173–195.

Li, Jia and Dacheng Xiu (2016): “Generalized Method of Integrated Moments for
High-Frequency Data,” Econometrica, 84, 1613–1633.

Liu, Lily Y, Andrew J Patton, and Kevin Sheppard (2015): “Does anything
beat 5-minute RV? A comparison of realized measures across multiple asset classes,”
Journal of Econometrics, 187, 293–311.

Luciani, Matteo and David Veredas (2015): “Estimating and forecasting large
panels of volatilities with approximate dynamic factor models,” Journal of
Forecasting, 34, 163–176.

Mancini, Cecilia (2009): “Non-parametric threshold estimation for models with
stochastic diffusion coefficient and jumps,” Scandinavian Journal of Statistics, 36,
270–296.

Nelson, Daniel B (1991): “Conditional heteroskedasticity in asset returns: A new
approach,” Econometrica: Journal of the Econometric Society, 347–370.

Patton, Andrew J (2011): “Volatility forecast comparison using imperfect
volatility proxies,” Journal of Econometrics, 160, 246–256.

Ross, Stephen (1976): “The arbitrage theory of capital asset pricing,” Journal of
Economic Theory, 13, 341–360.

Sharpe, William (1964): “Capital Asset Prices: A Theory of Market Equilibrium
Under Conditions of Risk,” Journal of Finance, 19, 425–442.



Susmel, Raul and Robert F Engle (1994): “Hourly volatility spillovers between
international equity markets,” Journal of International Money and Finance, 13,
3–25.

Taylor, Stephen John (1982): “Financial returns modelled by the product of
two stochastic processes-a study of the daily sugar prices 1961-75,” Time series
analysis: theory and practice, 1, 203–226.

Xiu, Dacheng (2010): “Quasi-maximum likelihood estimation of volatility with high
frequency data,” Journal of Econometrics, 159, 235–250.

Zhang, Lan, Per A Mykland, and Yacine Aït-Sahalia (2005): “A tale of
two time scales: Determining integrated volatility with noisy high-frequency data,”
Journal of the American Statistical Association, 100, 1394–1411.

Zhang, Lan et al. (2006): “Efficient estimation of stochastic volatility using noisy
observations: A multi-scale approach,” Bernoulli, 12, 1019–1043.



Supplement to “Factor Modeling for Volatility”
Yi Ding∗ Robert Engle† Yingying Li‡ Xinghua Zheng§

November 20, 2022

This supplement gives additional empirical results and proofs of the theoretical
results for Ding, Engle, Li, and Zheng (2022).

A Analysis under Additive Linear CV Factor Model

A.1 Analysis of Idiosyncratic Component

We first investigate the idiosyncratic component $\varepsilon$ in (3.3). To do so, we pick a
random stock, estimate the coefficients to obtain the residual $\hat\varepsilon$, and then check the
time series plots of $\hat\varepsilon$ and $\hat\varepsilon/CRV$ in Figure 1. The plots suggest that the residual $\hat\varepsilon$ is
clearly heteroskedastic. Moreover, $\hat\varepsilon$ seems to scale with CRV: if we scale $\hat\varepsilon$ by CRV, the
scaled residual $\hat\varepsilon/CRV$ appears to be more homoskedastic.
To quantitatively evaluate this relationship, for each stock $i$, we
compute the correlation between $\hat\varepsilon_i^2$ and $CRV^2$, and the correlation between $\hat\varepsilon_i^2/CRV^2$
and $CRV^2$. The results are summarized in Table 1. From Table 1, we see a very

∗ Faculty of Business Administration, University of Macau, Taipa, Macau. Email:
yiding@um.edu.mo
† Stern School of Business, New York University, 44 West Fourth Street Suite 9-62, New York.
Email: rengle@stern.nyu.edu
‡ Department of ISOM and Department of Finance, Hong Kong University of Science and
Technology, Clear Water Bay, Kowloon, Hong Kong. Research is supported in part by RGC grants GRF
16503419, GRF 16502118 and T31-64/18-N of the HKSAR and NSFC 19BM03. Email: yyli@ust.hk
§ Department of ISOM, Hong Kong University of Science and Technology, Clear Water Bay,
Kowloon, Hong Kong. Research is supported in part by RGC grants GRF-16304521, GRF 16304019
and T31-64/18-N of the HKSAR. Email: xhzheng@ust.hk



Figure 1: Left: Estimated idiosyncratic component $\hat\varepsilon$ of a typical stock in model (3.3)
and $\hat b \times CRV$. Right: Time series plot of $\hat\varepsilon/CRV$. (Horizontal axis in both panels:
January 2003 to January 2019.)

Table 1: Summary statistics of the absolute correlations between $\hat\varepsilon^2$ from model
(3.3) and $CRV^2$, and between $\hat\varepsilon^2/CRV^2$ and $CRV^2$, across stocks $1 \le i \le N$. The
values are the 25% quantile (Q1), median, mean, and 75% quantile (Q3).

                                                                        Q1      Median  Mean    Q3
$|\mathrm{corr}(\hat\varepsilon_{i,\cdot}^2,\ CRV_\cdot^2)|$                      0.276   0.447   0.425   0.568
$|\mathrm{corr}(\hat\varepsilon_{i,\cdot}^2/CRV_\cdot^2,\ CRV_\cdot^2)|$          0.006   0.009   0.017   0.017



Figure 2: Distributions of $(RV - \hat a)/CRV$ and $\log\bigl((RV - \hat a)/CRV\bigr)$ of a random
stock, where $\hat a$ is the weighted regression estimate in model (A.1). Top row: histogram
(vertical axis: Frequency) and normal Q-Q plot (vertical axis: Sample Quantiles) of
$(RV - \hat a)/CRV$. Bottom row: the same two plots for $\log\bigl((RV - \hat a)/CRV\bigr)$.

interesting phenomenon: the size of $\hat\varepsilon$ clearly correlates with CRV, and the correlation
almost disappears when we scale $\hat\varepsilon$ by CRV.
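The diagnostic summarized in Table 1 can be illustrated on simulated data. The sketch below is not the authors' code: the series `crv`, the way the residual is generated, and all parameter values are assumptions, chosen only so that the residual scales with the common factor as described above.

```python
import numpy as np

rng = np.random.default_rng(0)
T = 4000

# Hypothetical common realized variance factor (positive by construction).
crv = np.exp(rng.normal(0.0, 0.5, size=T))

# Hypothetical residual whose size scales with CRV: eps = CRV * (centered noise).
noise = rng.normal(0.0, 0.3, size=T)
eps = crv * noise

def abs_corr(x, y):
    """Absolute sample correlation between two 1-D arrays."""
    return abs(np.corrcoef(x, y)[0, 1])

raw = abs_corr(eps**2, crv**2)               # squared residual vs. CRV^2
scaled = abs_corr(eps**2 / crv**2, crv**2)   # scaled squared residual vs. CRV^2
print(f"raw: {raw:.3f}, scaled: {scaled:.3f}")
```

In this toy setting the raw correlation is sizeable while the scaled one is close to zero, which is the qualitative pattern the table reports for the actual data.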
In summary, the evidence suggests the following model:

$$V_{it} = a_i + b_i\,CV_t + CV_t\,\tilde\varepsilon_{it}, \qquad 1 \le i \le N, \tag{A.1}$$

where $\tilde\varepsilon_{it}$ is independent of CV. We estimate the model (A.1) by weighted
regression using CRV as weights, and then compare the distributions of $(RV - \hat a)/CRV$
and $\log\bigl((RV - \hat a)/CRV\bigr)$ against the normal distribution. The results for a random stock
are presented in Figure 2. We observe in Figure 2 that $(RV - \hat a)/CRV$ is heavy-tailed,
while $\log\bigl((RV - \hat a)/CRV\bigr)$ appears to be roughly normally distributed. We



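are therefore led to the following refined model:

$$\begin{aligned}
V_{it} &= a_i + b_i\,CV_t + CV_t\,\tilde\varepsilon_{it} \\
       &= a_i + b_i\,CV_t + CV_t\bigl(\exp(\mu_i + \sigma_i z_{it}) - \exp(\mu_i + \sigma_i^2/2)\bigr), \qquad i = 1, \ldots, N,
\end{aligned} \tag{A.2}$$

where the idiosyncratic component scales with CV, and $\tilde\varepsilon_{it}$ is modeled by a centered
lognormal distribution.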
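To make the centered lognormal component in (A.2) concrete, the following sketch simulates it. It assumes that $z_{it}$ is standard normal, which is what the centering term $\exp(\mu_i + \sigma_i^2/2)$ suggests; the function name `simulate_variance`, the factor series `cv`, and all parameter values are hypothetical.

```python
import numpy as np

def simulate_variance(a, b, mu, sigma, cv, rng):
    """Draw V_t = a + b*CV_t + CV_t*eps_t, with eps_t a centered lognormal."""
    z = rng.standard_normal(cv.shape[0])
    # exp(mu + sigma*z) has mean exp(mu + sigma^2/2) when z ~ N(0, 1),
    # so subtracting that mean centers the idiosyncratic term at zero.
    eps = np.exp(mu + sigma * z) - np.exp(mu + sigma**2 / 2.0)
    return a + b * cv + cv * eps, eps

rng = np.random.default_rng(1)
cv = np.exp(rng.normal(0.0, 0.5, size=200_000))   # hypothetical common variance factor
v, eps = simulate_variance(a=0.02, b=1.1, mu=-0.3, sigma=0.4, cv=cv, rng=rng)
print(eps.mean(), v.min())
```

With these particular values the simulated variances stay positive because $b$ exceeds $\exp(\mu + \sigma^2/2)$; for other parameter choices positivity is not guaranteed by (A.2) alone.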

A.2 Analysis of Model Coefficients

We estimate the model (A.2) using MLE on subsampled data. We subsample the
data to circumvent possible auto-correlation. Specifically, we conduct MLE of model
(A.2) on observations subsampled every ten days, and then take the average of the
ten estimates.
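The subsample-and-average step can be sketched as follows. The helper `subsample_average` and the toy estimator are illustrative assumptions: the estimator used here is the Gaussian MLE for log-data, standing in for the MLE of model (A.2), which is not reproduced in this sketch.

```python
import numpy as np

def subsample_average(data, estimator, step=10):
    """Apply `estimator` to each of the `step` interleaved subsamples and average."""
    estimates = [np.asarray(estimator(data[offset::step]), dtype=float)
                 for offset in range(step)]
    return np.mean(estimates, axis=0)

# Toy usage: (mean, std) of the log-data is the MLE for a lognormal sample.
rng = np.random.default_rng(2)
x = np.exp(rng.normal(0.5, 0.8, size=5000))
est = subsample_average(x, lambda s: (np.log(s).mean(), np.log(s).std()))
print(est)
```

Each subsample keeps every tenth observation, so consecutive observations within a subsample are ten days apart, and averaging over the ten offsets uses all of the data.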
We compute the proportion of the estimated intercept $\hat a_i$ over $\overline{RV}_i$ (the time series
average) for each stock $i$ and find that the proportions are all very small, with an
average of 0.035. The results suggest that $a_i$ plays little role in the model (A.2). As
another check, we note that (A.2) is equivalent to $V_{it}/CV_t = b_i + a_i/CV_t + \tilde\varepsilon_{it}$. We
then regress $RV_{it}/CRV_t$ on $1/CRV_t$ for each individual stock and find that the $R^2$s
are all nearly zero, with an average of 0.024. This evidence suggests that the intercept
term $a_i$ in (A.2) can be ignored. We therefore get the simplified model with zero
intercept:

$$V_{it} = b_i\,CV_t + CV_t\bigl(\exp(\mu_i + \sigma_i z_{it}) - \exp(\mu_i + \sigma_i^2/2)\bigr).$$
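The second intercept check, regressing the ratio on the reciprocal of the common factor, can be sketched on simulated data. Everything below is an assumption for illustration: the data are generated with a zero intercept, so a near-zero slope and $R^2$ are expected by construction.

```python
import numpy as np

rng = np.random.default_rng(3)
T = 5000
crv = np.exp(rng.normal(0.0, 0.5, size=T))              # hypothetical common factor
mu, sigma = -0.2, 0.5
b = np.exp(mu + sigma**2 / 2.0)
eps = np.exp(mu + sigma * rng.standard_normal(T)) - b   # centered lognormal
rv = b * crv + crv * eps                                # zero-intercept data (a_i = 0)

# Regress RV/CRV on a constant and 1/CRV; the slope estimates a_i.
y = rv / crv
X = np.column_stack([np.ones(T), 1.0 / crv])
coef, *_ = np.linalg.lstsq(X, y, rcond=None)
resid = y - X @ coef
r2 = 1.0 - resid.var() / y.var()
print(coef, r2)
```

The constant in this regression estimates $b_i$ and the slope on $1/CRV_t$ estimates $a_i$, so an $R^2$ near zero indicates that the intercept contributes little.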

Next, we check the relation between $b_i$ and $\exp(\mu_i + \sigma_i^2/2)$. The scatterplot of the
MLE of $b_i$ and $\exp(\mu_i + \sigma_i^2/2)$ is presented in Figure 3. It shows that $b$ and
$\exp(\mu + \sigma^2/2)$ are strongly linearly related. The correlation between the two reaches 0.99.
Moreover, the linear relation fits well with the line $y = x$.
By constraining $a = 0$ and $b = \exp(\mu + \sigma^2/2)$ in (A.2), we reach our final
multiplicative volatility factor model (3.6).
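Substituting the two constraints into (A.2) makes the multiplicative structure explicit. The worked step below is a sketch: model (3.6) itself is not reproduced in this supplement, so the last line should be read as the form implied by (A.2) under the stated constraints.

```latex
\begin{align*}
V_{it} &= a_i + b_i\,CV_t + CV_t\bigl(\exp(\mu_i + \sigma_i z_{it}) - \exp(\mu_i + \sigma_i^2/2)\bigr) \\
       &= \exp(\mu_i + \sigma_i^2/2)\,CV_t + CV_t\exp(\mu_i + \sigma_i z_{it}) - CV_t\exp(\mu_i + \sigma_i^2/2)
          && \bigl(a_i = 0,\ b_i = \exp(\mu_i + \sigma_i^2/2)\bigr) \\
       &= CV_t\,\exp(\mu_i + \sigma_i z_{it}).
\end{align*}
```

If $z_{it}$ is standard normal, as the lognormal specification suggests, then $E[\exp(\mu_i + \sigma_i z_{it})] = \exp(\mu_i + \sigma_i^2/2)$, which is exactly the constrained value of $b_i$.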



Figure 3: Scatterplot of the MLE of $b_i$ (horizontal axis, $\hat b$) and $\exp(\mu_i + \sigma_i^2/2)$
(vertical axis, $\exp(\hat\mu + \hat\sigma^2/2)$) of all stocks under evaluation in model (A.2). Both axes
range from 0 to 4. The diagonal line is $y = x$.

B Proofs
In the following, $c, C, C_1, C', C_0, \ldots$ denote constants that do not depend on $T$, $N$, or
$\Delta_n$, and can vary from place to place.
The first lemma extends the concentration inequality for estimating (co)-integrated
variance (Fan, Li, and Yu (2012); Cai, Hu, Li, and Zheng (2020)) to the case when
only polynomial tail decay is imposed on the spot volatility.

Lemma 1. Suppose that $(\nu_{1t})$ and $(\nu_{2t})$ satisfy $d\nu_{jt} = \mu_{jt}\,dt + \sigma_{jt}\,dW_{j;t}$ for $j = 1, 2$,
where $(W_{1;t})$ and $(W_{2;t})$ are standard Brownian motions that may be mutually
dependent, and there exist constants $C_\mu, K_\sigma, M > 0$ such that $\max_{0\le t\le 1}|\mu_{jt}| \le C_\mu$,
and for any $x > 0$ and $j = 1, 2$,

$$P\Bigl(\max_{0\le t\le 1}|\sigma_{jt}| > x\Bigr) < \frac{K_\sigma}{x^{M}}.$$

Suppose also that the observation times $(t_i)$ satisfy $\sup_n \max_{1\le i\le n} n|t_i - t_{i-1}| \le C_\Delta$
for some constant $C_\Delta > 0$. For $j_1, j_2 \in \{1, 2\}$, denote the realized (co)variance
by $[\nu_{j_1}, \nu_{j_2}]_t = \sum_{\{i:\,t_i\le t\}} (\nu_{j_1 t_i} - \nu_{j_1 t_{i-1}})(\nu_{j_2 t_i} - \nu_{j_2 t_{i-1}})$. Then for any $0 < \delta < 1$, there
exist $C_1$ and $C_2$ such that for all $x \in [0, (2C_\Delta\sqrt{n})^{1/(1-\delta)}]$,

$$P\Bigl(\sqrt{n}\,\Bigl|[\nu_j, \nu_j]_1 - \int_0^1 (\sigma_{js})^2\,ds\Bigr| > x\Bigr) \le C_1 x^{-\frac{M\delta}{2}}, \qquad j = 1, 2, \text{ and} \tag{B.1}$$

$$P\Bigl(\sqrt{n}\,\Bigl|[\nu_1, \nu_2]_1 - \int_0^1 \sigma_{1s}\sigma_{2s}\rho_s\,ds\Bigr| > x\Bigr) \le C_2 x^{-\frac{M\delta}{2}}, \tag{B.2}$$

where $\rho_s = \mathrm{corr}(dW_{1;s}, dW_{2;s}) := \lim_{h\to 0} \mathrm{corr}(W_{1;s+h} - W_{1;s},\, W_{2;s+h} - W_{2;s} \mid \mathcal{F}_s)$, $\mathcal{F}_s$ is
the information available at time $s$, and the constants $C_1, C_2$ depend only on $K_\sigma$,
$C_\mu$, $\delta$, $C_\Delta$ and $M$.
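The quantity controlled by Lemma 1 is the scaled estimation error of the realized variance. The sketch below simulates the simplest special case, a constant volatility and a constant drift on an equally spaced grid; these choices, and all numerical values, are assumptions made only to show what $\sqrt{n}\,|[\nu,\nu]_1 - \int_0^1\sigma_s^2\,ds|$ looks like in practice.

```python
import numpy as np

def realized_variance(path):
    """Sum of squared increments of an observed path, i.e. [v, v]_1."""
    return float(np.sum(np.diff(path) ** 2))

rng = np.random.default_rng(4)
n = 10_000        # number of equally spaced observations on [0, 1]
sigma = 0.2       # constant volatility, so the integrated variance equals sigma**2
mu = 0.05         # constant (hence bounded) drift
dt = 1.0 / n
increments = mu * dt + sigma * np.sqrt(dt) * rng.standard_normal(n)
path = np.concatenate([[0.0], np.cumsum(increments)])

rv = realized_variance(path)
scaled_error = np.sqrt(n) * abs(rv - sigma**2)
print(rv, scaled_error)
```

The lemma bounds the tail probability of `scaled_error` when the volatility is random with only polynomially decaying tails, a setting this constant-volatility sketch does not attempt to reproduce.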

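Proof: Define

$$\phi(x) = \begin{cases} 1, & \text{if } 0 \le x < 1, \text{ and} \\ x^{\delta/2}, & \text{if } 1 \le x \le (2C_\Delta\sqrt{n})^{1/(1-\delta)}. \end{cases} \tag{B.3}$$

Note that $0 \le x \le 2\phi(x)^2 C_\Delta\sqrt{n}$ when $0 \le x \le (2C_\Delta\sqrt{n})^{1/(1-\delta)}$. By the proof of
Lemma 1 in Cai, Hu, Li, and Zheng (2020), for any $0 \le x \le (2C_\Delta\sqrt{n})^{1/(1-\delta)}$, we have

$$\begin{aligned}
&P\Bigl(\sqrt{n}\,\Bigl|[\nu_j, \nu_j]_1 - \int_0^1 (\sigma_{js})^2\,ds\Bigr| > x\Bigr) \\
&\quad\le P\Bigl(\Bigl\{\sqrt{n}\,\Bigl|[\nu_j, \nu_j]_1 - \int_0^1 (\sigma_{js})^2\,ds\Bigr| > x\Bigr\} \cap \Bigl\{\max_{0\le t\le 1}|\sigma_{jt}| \le \phi(x)\Bigr\}\Bigr) + P\Bigl(\max_{0\le t\le 1}|\sigma_{jt}| > \phi(x)\Bigr) \\
&\quad\le 3\exp\Bigl(\frac{C_\mu^2}{\phi(x)^2} - \frac{x^2}{32\phi(x)^4 C_\Delta^2}\Bigr) + \frac{K_\sigma}{\phi(x)^{M}}.
\end{aligned}$$

We have $\phi(x)^{-M} \le x^{-\frac{M\delta}{2}}$. Moreover, when $0 \le x \le 1$, $\exp\bigl(-x^2/(32\phi(x)^4 C_\Delta^2)\bigr) \le 1 \le x^{-\frac{M\delta}{2}}$.
When $1 < x \le (2C_\Delta\sqrt{n})^{1/(1-\delta)}$, by the fact that $\exp(-x)x^{y} \le \exp(-y)y^{y}$
for all $x, y > 0$, we have

$$\exp\Bigl(-\frac{x^2}{32\phi(x)^4 C_\Delta^2}\Bigr) = \exp\Bigl(-\frac{x^{2(1-\delta)}}{32 C_\Delta^2}\Bigr)
\le \Bigl(\frac{8M\delta C_\Delta^2}{1-\delta}\Bigr)^{\frac{M\delta}{4(1-\delta)}} \cdot \exp\Bigl(-\frac{M\delta}{4(1-\delta)}\Bigr) \cdot x^{-\frac{M\delta}{2}}.$$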
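The elementary inequality $\exp(-x)x^{y} \le \exp(-y)y^{y}$ and its use in the last display can be verified in two short steps. The derivation below is a sketch that writes $w$ for the generic positive variable, to avoid a clash with the $x$ of the lemma, and introduces $u$ and $y$ as shorthand that does not appear in the original text.

```latex
\begin{align*}
g(w) &:= \log\bigl(e^{-w}w^{y}\bigr) = -w + y\log w, \qquad
g'(w) = -1 + \frac{y}{w}, \qquad g''(w) = -\frac{y}{w^{2}} < 0, \\
\max_{w>0} g(w) &= g(y) = -y + y\log y
\quad\Longrightarrow\quad e^{-w}w^{y} \le e^{-y}y^{y} \text{ for all } w, y > 0. \\
\intertext{Taking $u = x^{2(1-\delta)}/(32C_\Delta^{2})$ and $y = M\delta/(4(1-\delta))$ gives}
e^{-u} &\le e^{-y}y^{y}u^{-y}
 = e^{-y}\bigl(32C_\Delta^{2}\,y\bigr)^{y}x^{-2(1-\delta)y}
 = \Bigl(\frac{8M\delta C_\Delta^{2}}{1-\delta}\Bigr)^{\frac{M\delta}{4(1-\delta)}}
   \exp\Bigl(-\frac{M\delta}{4(1-\delta)}\Bigr)x^{-\frac{M\delta}{2}}.
\end{align*}
```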

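Hence for all $0 \le x \le (2C_\Delta\sqrt{n})^{1/(1-\delta)}$, we have

$$\begin{aligned}
3\exp\Bigl(\frac{C_\mu^2}{\phi(x)^2} - \frac{x^2}{32\phi(x)^4 C_\Delta^2}\Bigr)
&\le 3\exp\Bigl(C_\mu^2 - \frac{x^2}{32\phi(x)^4 C_\Delta^2}\Bigr) \\
&\le 3\exp(C_\mu^2) \cdot \Bigl(\Bigl(\frac{8M\delta C_\Delta^2}{1-\delta}\Bigr)^{\frac{M\delta}{4(1-\delta)}} \cdot \exp\Bigl(-\frac{M\delta}{4(1-\delta)}\Bigr) + 1\Bigr) \cdot x^{-\frac{M\delta}{2}}.
\end{aligned}$$

The desired bound (B.1) follows by setting

$$C_1 = 3\exp(C_\mu^2) \cdot \Bigl(\Bigl(\frac{8M\delta C_\Delta^2}{1-\delta}\Bigr)^{\frac{M\delta}{4(1-\delta)}} \cdot \exp\Bigl(-\frac{M\delta}{4(1-\delta)}\Bigr) + 1\Bigr) + K_\sigma.$$

The bound (B.2) follows from a similar argument by using the inequality

$$\begin{aligned}
&P\Bigl(\sqrt{n}\,\Bigl|[\nu_1, \nu_2]_1 - \int_0^1 \sigma_{1s}\sigma_{2s}\rho_s\,ds\Bigr| > x\Bigr) \\
&\quad\le P\Bigl(\Bigl\{\sqrt{n}\,\Bigl|[\nu_1, \nu_2]_1 - \int_0^1 \sigma_{1s}\sigma_{2s}\rho_s\,ds\Bigr| > x\Bigr\} \cap \Bigl\{\max_{0\le t\le 1}(|\sigma_{1t}|, |\sigma_{2t}|) \le \phi(x)\Bigr\}\Bigr) \\
&\qquad + P\Bigl(\max_{0\le t\le 1}|\sigma_{1t}| > \phi(x)\Bigr) + P\Bigl(\max_{0\le t\le 1}|\sigma_{2t}| > \phi(x)\Bigr),
\end{aligned}$$

Lemma 2 in Cai, Hu, Li, and Zheng (2020), and setting

$$C_2 = 6\exp(C_\mu^2) \cdot \Bigl(\Bigl(\frac{32M\delta C_\Delta^2}{1-\delta}\Bigr)^{\frac{M\delta}{4(1-\delta)}} \cdot \exp\Bigl(-\frac{M\delta}{4(1-\delta)}\Bigr) + 1\Bigr) + 2K_\sigma.$$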

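Lemma 2. Suppose that $X_t = (x_{1t}, \ldots, x_{St})^{T}$, $1 \le t \le T$, $S = O(T^{\gamma})$ and
$\max_{1\le i\le S,\,1\le t\le T} |E(x_{it}^{M})| < c$ for some constants $\gamma > 0$, $M > 2 + 2\gamma$, and $c > 0$.
Assume the strong mixing condition that $\rho(\chi) \le c_1\exp(-c_2\chi)$ for some constants
$c_1, c_2 > 0$ and any positive integer $\chi$, where
$\rho(\chi) = \sup_{A\in\mathcal{F}_{-\infty}^{0},\,B\in\mathcal{F}_{\chi}^{\infty}} |P(AB) - P(A)P(B)|$, and
$\mathcal{F}_{-\infty}^{0}$, $\mathcal{F}_{\chi}^{\infty}$ are the σ-algebras generated by $\{X_t : -\infty \le t \le 0\}$ and
$\{X_t : \chi \le t \le \infty\}$, respectively. Then for some constant $C > 0$,

(i) if $S$ is fixed,

$$P\Bigl(\max_{1\le i\le S}\Bigl|\frac{1}{T}\sum_{t=1}^{T}\bigl(x_{it} - E(x_{it})\bigr)\Bigr| > C\sqrt{\frac{\log T}{T}}\Bigr) \le C\Bigl(\frac{(\log T)^{M/2}}{T^{M/2-1-\gamma}} + \frac{1}{T}\Bigr);$$

(ii) if $S \to \infty$,

$$P\Bigl(\max_{1\le i\le S}\Bigl|\frac{1}{T}\sum_{t=1}^{T}\bigl(x_{it} - E(x_{it})\bigr)\Bigr| > C\sqrt{\frac{\log S}{T}}\Bigr) \le C\Bigl(\frac{(\log S)^{M/2}}{T^{M/2-1-\gamma}} + \frac{1}{T} + \frac{1}{S}\Bigr).$$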

Proof: We only show the case when $S \to \infty$. For the case when $S$ is fixed,
the results can be shown similarly by using the truncation level $\sqrt{T/\log T}$ instead of
$\sqrt{T/\log S}$.
We denote $x_{it}^{tr} = x_{it}\,\mathbf{1}_{\{|x_{it}| \le C_0\sqrt{T/\log S}\}}$. By Markov's inequality, we have

$$P\Bigl(|x_{it}| > C_0\sqrt{T/\log S}\Bigr) \le \frac{C(\log S)^{M/2}}{T^{M/2}}.$$

By Bonferroni's inequality and that $E(x_{it}^{M}) < c$ and $M > 2 + 2\gamma$, we have

$$\begin{aligned}
P\bigl(x_{it} = x_{it}^{tr} \text{ for all } 1 \le t \le T,\ 1 \le i \le S\bigr)
&= 1 - P\Bigl(\max_{1\le t\le T,\,1\le i\le S}|x_{it}| > C_0\sqrt{T/\log S}\Bigr) \\
&\ge 1 - \frac{C(\log S)^{M/2}\,ST}{T^{M/2}} \ge 1 - \frac{C(\log S)^{M/2}}{T^{M/2-1-\gamma}}.
\end{aligned} \tag{B.4}$$

By $E(x_{it}^{M}) < c$ and the Cauchy-Schwarz inequality, we have that, for any $1 \le M_0 < M$,

$$\begin{aligned}
\max_{1\le i\le S} E\bigl(|x_{it}^{tr} - x_{it}|^{M_0}\bigr)
&= \max_{1\le i\le S} E\Bigl(|x_{it}|^{M_0} \cdot \mathbf{1}_{\{|x_{it}| > C_0\sqrt{T/\log S}\}}\Bigr) \\
&\le \Bigl(\max_{1\le i\le S} E(|x_{it}|^{M})\Bigr)^{M_0/M} \cdot \Bigl(\max_{1\le i\le S} P\bigl(|x_{it}| > C_0\sqrt{T/\log S}\bigr)\Bigr)^{1-M_0/M} \\
&\le \frac{C(\log S)^{(M-M_0)/2}}{T^{(M-M_0)/2}}.
\end{aligned} \tag{B.5}$$

Because $M > 2$, (B.5) implies that

$$\max_{1\le i\le S}\bigl|E(x_{it}^{tr}) - E(x_{it})\bigr| = o\bigl(\sqrt{1/T}\bigr). \tag{B.6}$$

By the fact that $(a+b)^{g} \le 2^{g}(a^{g} + b^{g})$ for all $a, b > 0$ and $g \ge 1$, for some $2 < M_0 < M$
and $C > 0$,

$$\max_{1\le i\le S} E\bigl(|x_{it}^{tr}|^{M_0}\bigr) \le \max_{1\le i\le S} 2^{M_0}\Bigl(E\bigl(|x_{it}^{tr} - x_{it}|^{M_0}\bigr) + E\bigl(|x_{it}|^{M_0}\bigr)\Bigr) < C. \tag{B.7}$$

By (B.6), (B.7) and the triangle inequality, applying Bernstein's inequality (Theorem 2,
Eqn. (2.3) of Merlevède, Peligrad, Rio et al. (2009)) to $x_{it}^{tr} - E(x_{it}^{tr})$ yields

$$P\Bigl(\max_{1\le i\le S}\Bigl|\frac{1}{T}\sum_{t=1}^{T}\bigl(x_{it}^{tr} - E(x_{it})\bigr)\Bigr| > C\sqrt{\frac{\log S}{T}}\Bigr) \le C\Bigl(\frac{1}{S} + \frac{1}{T}\Bigr). \tag{B.8}$$

The desired bound follows from (B.4) and (B.8).
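The truncation device used in the proof can be illustrated numerically. In the sketch below the panel dimensions, the Student-t distribution, the constant `C0`, and the use of independent draws (rather than a mixing sequence) are all assumptions made for illustration.

```python
import numpy as np

def truncate(x, level):
    """Return x * 1{|x| <= level}: entries beyond the level are set to zero."""
    return np.where(np.abs(x) <= level, x, 0.0)

rng = np.random.default_rng(5)
S, T = 50, 2000
x = rng.standard_t(df=8, size=(S, T))     # heavier tails than Gaussian, mean zero
C0 = 3.0                                  # hypothetical truncation constant
level = C0 * np.sqrt(T / np.log(S))
x_tr = truncate(x, level)

share_truncated = np.mean(x_tr != x)      # fraction of entries changed by truncation
max_dev = np.max(np.abs(x.mean(axis=1)))  # max over i of |sample mean - E(x_it)|
rate = np.sqrt(np.log(S) / T)             # the rate appearing in Lemma 2(ii)
print(share_truncated, max_dev, rate)
```

The truncation level grows with $T$, so in moderately large samples very few entries are affected, while the truncated variables are bounded and therefore amenable to a Bernstein-type inequality.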

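Lemma 3. Under the assumptions of Theorem 2, for some constant $C_0 > 0$, the $\hat\beta$
and $\hat\alpha_n$ defined in (2.7) satisfy

$$P\Bigl(\max_{1\le i\le N}\|\hat\beta_i - \beta_i\|_2 > C_0\sqrt{\Delta_n}\Bigr) \le C_0\Bigl(\frac{1}{N} + \frac{1}{T}\Bigr), \text{ and} \tag{B.9}$$

$$P\Bigl(\|\hat\alpha_n - \alpha_n\|_{\max} > C_0\Delta_n\Bigl(\sqrt{\Delta_n} + \sqrt{\frac{\log N}{T}}\Bigr)\Bigr) \le C_0\Bigl(\frac{1}{N} + \frac{1}{T}\Bigr). \tag{B.10}$$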

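Proof: First, we denote

$$\bar U =: (\bar U_1, \ldots, \bar U_N)^{T} = \frac{1}{T\cdot[1/\Delta_n]}\sum_{t=1}^{T}\sum_{j=1}^{[1/\Delta_n]} U_{t[j]}, \text{ and}$$

$$\bar F =: (\bar F_1, \ldots, \bar F_K)^{T} = \frac{1}{T\cdot[1/\Delta_n]}\sum_{t=1}^{T}\sum_{j=1}^{[1/\Delta_n]} F_{t[j]}.$$

By definition (2.7), we have,

$$\begin{aligned}
\hat\beta_i - \beta_i
&= \Bigl(\sum_{t=1}^{T}\sum_{j=1}^{[1/\Delta_n]} (F_{t[j]} - \bar F)(F_{t[j]} - \bar F)^{T}\Bigr)^{-1}\Bigl(\sum_{t=1}^{T}\sum_{j=1}^{[1/\Delta_n]} (F_{t[j]} - \bar F)(U_{i,t[j]} - \bar U_i)\Bigr) \\
&\quad + \Bigl(\sum_{t=1}^{T}\sum_{j=1}^{[1/\Delta_n]} (F_{t[j]} - \bar F)(F_{t[j]} - \bar F)^{T}\Bigr)^{-1}\Bigl(\sum_{t=1}^{T}\sum_{j=1}^{[1/\Delta_n]} (F_{t[j]} - \bar F)(\alpha_{n;t[j]i} - \alpha_{ni})\Bigr),
\end{aligned}$$

where $\alpha_{n;t[j]i}$ and $\alpha_{ni}$ are the $i$th elements of $\alpha_{n;t[j]}$ defined in (2.3) and of the average drift
$\alpha_n = \frac{1}{T\cdot[1/\Delta_n]}\sum_{t=1}^{T}\sum_{j=1}^{[1/\Delta_n]} \alpha_{n;t[j]}$, respectively.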
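The decomposition above has the form of an ordinary least-squares regression of high-frequency asset increments on demeaned factor increments. Since definition (2.7) is not reproduced in this supplement, the sketch below assumes that form; the function name `ols_beta_alpha`, the sampling grid, and the simulated data are hypothetical.

```python
import numpy as np

def ols_beta_alpha(returns, factors):
    """
    OLS of asset increments on factor increments with an intercept.

    returns: (n_obs, N) array of asset increments
    factors: (n_obs, K) array of factor increments
    Returns (beta_hat of shape (N, K), alpha_hat of shape (N,)).
    """
    f_bar = factors.mean(axis=0)
    r_bar = returns.mean(axis=0)
    Fc = factors - f_bar
    Rc = returns - r_bar
    # Solve (Fc' Fc) B = Fc' Rc for B (K x N), then transpose to N x K.
    B = np.linalg.solve(Fc.T @ Fc, Fc.T @ Rc)
    beta_hat = B.T
    alpha_hat = r_bar - beta_hat @ f_bar
    return beta_hat, alpha_hat

rng = np.random.default_rng(6)
n_obs, N, K = 20_000, 5, 2
dt = 1.0 / 78                                 # hypothetical grid: 78 intervals per day
beta = rng.normal(1.0, 0.3, size=(N, K))
F = 0.01 * np.sqrt(dt) * rng.standard_normal((n_obs, K))
U = 0.02 * np.sqrt(dt) * rng.standard_normal((n_obs, N))
R = F @ beta.T + U
beta_hat, alpha_hat = ols_beta_alpha(R, F)
print(np.max(np.abs(beta_hat - beta)))
```

Solving the normal equations with `np.linalg.solve` avoids forming the inverse explicitly; the estimation error of `beta_hat` is driven by the cross term between factor and idiosyncratic increments, which is the term bounded in event $A$ of the proof.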
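We define an event $A$ as follows. For some $c, C > 0$,

$$\begin{aligned}
A ={}& \Bigl\{\max_{1\le k\le K,\,1\le i\le N}\Bigl|\sum_{t=1}^{T}\sum_{j=1}^{[1/\Delta_n]} (F_{k,t[j]} - \bar F_k)(U_{i,t[j]} - \bar U_i)\Bigr| < C\sqrt{\Delta_n T N^{1/\gamma}}\Bigr\} \\
&\cap \Bigl\{\lambda_{\min}\Bigl(\sum_{t=1}^{T}\sum_{j=1}^{[1/\Delta_n]} (F_{t[j]} - \bar F)(F_{t[j]} - \bar F)^{T}\Bigr) \ge \frac{1}{2}\lambda_{\min}\Bigl(\int_0^{T}\Phi_s\,ds\Bigr)\Bigr\} \\
&\cap \Bigl\{\lambda_{\max}\Bigl(\sum_{t=1}^{T}\sum_{j=1}^{[1/\Delta_n]} (F_{t[j]} - \bar F)(F_{t[j]} - \bar F)^{T}\Bigr) \le 2\lambda_{\max}\Bigl(\int_0^{T}\Phi_s\,ds\Bigr)\Bigr\} \\
&\cap \Bigl\{cT < \lambda_{\min}\Bigl(\int_0^{T}\Phi_s\,ds\Bigr) < \lambda_{\max}\Bigl(\int_0^{T}\Phi_s\,ds\Bigr) < CT\Bigr\}.
\end{aligned}$$

By Assumption 1 that $\sup_{s\ge 0}\|\alpha_s\|_{\max} = O(1)$, we have that

$$\max_{1\le i\le N,\,1\le t\le T,\,1\le j\le[1/\Delta_n]} |\alpha_{n;t[j]i} - \alpha_{ni}| \le C\Delta_n.$$

Under the event $A$, we have

$$\begin{aligned}
\max_{1\le i\le N}\|\hat\beta_i - \beta_i\|_2^2
&\le \frac{8}{c^2T^2}\sum_{k=1}^{K}\max_{1\le i\le N}\Bigl(\sum_{t=1}^{T}\sum_{j=1}^{[1/\Delta_n]} (F_{k,t[j]} - \bar F_k)(U_{i,t[j]} - \bar U_i)\Bigr)^2 \\
&\quad + \frac{32C^2T}{c^2T^2}\cdot TK[1/\Delta_n]\cdot \max_{1\le i\le N,\,1\le t\le T,\,1\le j\le[1/\Delta_n]} (\alpha_{n;t[j]i} - \alpha_{ni})^2 \\
&\le \frac{8KC^2\Delta_n T N^{1/\gamma}}{c^2T^2} + \frac{32C^2T^2KC^2\Delta_n}{c^2T^2}.
\end{aligned}$$

By the assumption that $N = O(T^{\gamma})$, we have $\max_{1\le i\le N}\|\hat\beta_i - \beta_i\|_2 = O(\sqrt{\Delta_n})$.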
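It remains to show that $P(A) \ge 1 - O(1/N + 1/T)$. First, under Assumption 4
and that $M > 4$, by Jensen's inequality, we have, for any $t \le T - 1$ and $1 \le j, k \le N$,

$$E\Bigl|\int_t^{t+1}\Phi_{s,jk}\,ds\Bigr|^{M} \le E\Bigl(\int_t^{t+1}|\Phi_{s,jk}|\,ds\Bigr)^{M} \le E\Bigl(\int_t^{t+1}|\Phi_{s,jk}|^{M}\,ds\Bigr) \le C. \tag{B.11}$$

Under Assumption 1 that $0 < c_1 < \lambda_{\min}\bigl(E(\int_0^1\Phi_s\,ds)\bigr) \le \lambda_{\max}\bigl(E(\int_0^1\Phi_s\,ds)\bigr) < C_1$,
by Lemma 2(i) and Weyl's Theorem, we have, for all large $T$,

$$P\Bigl(\frac{c_1}{2} < \lambda_{\min}\Bigl(\frac{1}{T}\int_0^{T}\Phi_s\,ds\Bigr) < \lambda_{\max}\Bigl(\frac{1}{T}\int_0^{T}\Phi_s\,ds\Bigr) < 2C_1\Bigr) \ge 1 - \frac{CK^2}{T}. \tag{B.12}$$

Applying Lemma 1 to $X_{tT}/\sqrt{T}$ with $x = \sqrt{T}$ and $\delta = 1/2$, and by Bonferroni's
inequality, under Assumption 4, we have, for some constants $C_1, C_2 > 0$,

$$P\Bigl(\Bigl\|\frac{1}{T}\sum_{t=1}^{T}\sum_{j=1}^{[1/\Delta_n]} F_{t[j]}F_{t[j]}^{T} - \frac{1}{T}\int_0^{T}\Phi_s\,ds\Bigr\|_{\max} > C_1\sqrt{\Delta_n}\Bigr) < \frac{C_2K^2}{T^{M/2}} \le \frac{C_2K^2}{T^2}, \tag{B.13}$$

where the last inequality holds because $M > 4$.
By Assumption 1 that $\sup_{s\ge 0}\|h_s\|_{\max} = O(1)$, we have

$$\max_{1\le k\le K}\Bigl|\Bigl(\frac{1}{T}\int_0^{T} h_s\,ds\Bigr)_k\Bigr| \le C.$$

By Assumption 4 and the Burkholder-Davis-Gundy inequality, we have

$$\max_{1\le k\le K} E\Bigl(\Bigl(\int_0^{T}\eta_s\,dW_s\big/\sqrt{T}\Bigr)_k\Bigr)^{2M}
\le \max_{1\le k\le K} E\Bigl(\frac{1}{T}\int_0^{T}\Phi_{s,kk}\,ds\Bigr)^{M}
\le \max_{1\le k\le K} E\Bigl(\frac{1}{T}\int_0^{T}\Phi_{s,kk}^{M}\,ds\Bigr) \le C,$$

where the second inequality holds by Jensen's inequality and that $M > 4$. By
Markov's inequality, we have, for large $T$,

$$P\Bigl(\Bigl\|\frac{1}{T}\int_0^{T}\eta_s\,dW_s\Bigr\|_{\max} > C\Bigr) \le \frac{CK}{T^2}. \tag{B.14}$$

Note that $\bar F = \bigl(\int_0^{T} h_s\,ds + \int_0^{T}\eta_s\,dW_s\bigr)/(T[1/\Delta_n])$. For large $T$, we have

$$P\Bigl(\max_{1\le k\le K}|\bar F_k| > C\Delta_n\Bigr) \le \frac{CK}{T^2}. \tag{B.15}$$

Therefore, by the inequality that $\|A\|_2 \le \mathrm{tr}(A)$ for any nonnegative definite matrix
$A$, we have

$$P\Bigl(T\cdot[1/\Delta_n]\cdot\|\bar F\bar F^{T}\|_2 > CKT\Delta_n\Bigr) \le \frac{CK^2}{T^2}. \tag{B.16}$$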
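Note that $\Delta_n = o(1)$. By Weyl's Theorem, (B.12), (B.13) and (B.16), we get

$$\begin{aligned}
&P\Bigl(\lambda_{\min}\Bigl(\sum_{t=1}^{T}\sum_{j=1}^{[1/\Delta_n]} (F_{t[j]} - \bar F)(F_{t[j]} - \bar F)^{T}\Bigr) < \frac{1}{2}\lambda_{\min}\Bigl(\int_0^{T}\Phi_s\,ds\Bigr)\Bigr) \\
&\quad = P\Bigl(\lambda_{\min}\Bigl(\sum_{t=1}^{T}\sum_{j=1}^{[1/\Delta_n]} F_{t[j]}F_{t[j]}^{T} - T\cdot[1/\Delta_n]\cdot\bar F\bar F^{T}\Bigr) < \frac{1}{2}\lambda_{\min}\Bigl(\int_0^{T}\Phi_s\,ds\Bigr)\Bigr) \\
&\quad \le \frac{CK^2}{T^2},
\end{aligned} \tag{B.17}$$

and

$$\begin{aligned}
&P\Bigl(\lambda_{\max}\Bigl(\sum_{t=1}^{T}\sum_{j=1}^{[1/\Delta_n]} (F_{t[j]} - \bar F)(F_{t[j]} - \bar F)^{T}\Bigr) > 2\lambda_{\max}\Bigl(\int_0^{T}\Phi_s\,ds\Bigr)\Bigr) \\
&\quad = P\Bigl(\lambda_{\max}\Bigl(\sum_{t=1}^{T}\sum_{j=1}^{[1/\Delta_n]} F_{t[j]}F_{t[j]}^{T} - T\cdot[1/\Delta_n]\cdot\bar F\bar F^{T}\Bigr) > 2\lambda_{\max}\Bigl(\int_0^{T}\Phi_s\,ds\Bigr)\Bigr) \\
&\quad \le \frac{CK^2}{T^2}.
\end{aligned} \tag{B.18}$$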
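The assumptions that $\log T/|\log\Delta_n| = O(1)$ and $N = O(T^{\gamma})$ imply that there exists
$\delta_0 \in (0, 1)$ such that $N^{1/(2\gamma)} = o\bigl((T/\Delta_n)^{\frac{1}{2(1-\delta_0)}}\bigr)$. Applying Lemma 1 to $X_{tT}/\sqrt{T}$ and
$Z_{tT}/\sqrt{T}$ with $x = N^{1/(2\gamma)}$ and $\delta = \delta_0$, and using Bonferroni's inequality again, we
obtain under Assumption 4 that

$$P\Bigl(\max_{1\le k\le K,\,1\le i\le N}\Bigl|\frac{1}{\sqrt{T}}\sum_{t=1}^{T}\sum_{j=1}^{[1/\Delta_n]} F_{k,t[j]}U_{i,t[j]}\Bigr| > C\sqrt{\Delta_n N^{1/\gamma}}\Bigr) \le \frac{CKN}{N^{M\delta_0/(2\gamma)}} \le \frac{CK}{N}, \tag{B.19}$$

where the last inequality holds by the assumption that $M > 4(1 + 2\gamma)$ and we can
choose $\delta_0$ close to one such that $M\delta_0/(2\gamma) > 2$. By Assumption 4, the
Burkholder-Davis-Gundy inequality and Jensen's inequality, we have

$$\max_{1\le i\le N,\,1\le t\le T} E\Bigl(\Bigl(\int_{t-1}^{t}\zeta_s\,dB_s\Bigr)_i\Bigr)^{2M}
\le \max_{1\le i\le N,\,1\le t\le T} E\Bigl(\int_{t-1}^{t}\Theta_{s,ii}\,ds\Bigr)^{M}
\le \max_{1\le i\le N,\,1\le t\le T} E\Bigl(\int_{t-1}^{t}\Theta_{s,ii}^{M}\,ds\Bigr) \le C.$$

By Lemma 2(ii), under the assumption that $M > 4(1 + 2\gamma)$, we have

$$P\Bigl(\Bigl\|\int_0^{T}\zeta_s\,dB_s\Bigr\|_{\max} > C\sqrt{T\log N}\Bigr) \le C\Bigl(\frac{1}{N} + \frac{1}{T}\Bigr).$$

Noting that $\bar U_i = \bigl(\int_0^{T}\zeta_s\,dB_s\bigr)_i/(T[1/\Delta_n])$, we get

$$P\Bigl(\max_{1\le i\le N}|\bar U_i| > C\Delta_n\sqrt{\frac{\log N}{T}}\Bigr) \le C\Bigl(\frac{1}{N} + \frac{1}{T}\Bigr). \tag{B.20}$$

Combining (B.15) and (B.20) yields, for large $T$,

$$P\Bigl(\max_{1\le k\le K,\,1\le i\le N} T\cdot[1/\Delta_n]\cdot|\bar F_k\bar U_i| > \frac{C}{2}\sqrt{\Delta_n T\log N}\Bigr) \le C\Bigl(\frac{1}{N} + \frac{K}{T}\Bigr). \tag{B.21}$$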

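By (B.19), (B.21) and that $\Delta_n = o(1)$,

$$\begin{aligned}
&P\Bigl(\max_{1\le k\le K,\,1\le i\le N}\Bigl|\sum_{t=1}^{T}\sum_{j=1}^{[1/\Delta_n]} (F_{k,t[j]} - \bar F_k)(U_{i,t[j]} - \bar U_i)\Bigr| > C\sqrt{\Delta_n T N^{1/\gamma}}\Bigr) \\
&\quad = P\Bigl(\max_{1\le k\le K,\,1\le i\le N}\Bigl|\sum_{t=1}^{T}\sum_{j=1}^{[1/\Delta_n]} F_{k,t[j]}U_{i,t[j]} - T\cdot[1/\Delta_n]\cdot\bar F_k\bar U_i\Bigr| > C\sqrt{\Delta_n T N^{1/\gamma}}\Bigr) \\
&\quad \le P\Bigl(\max_{1\le k\le K,\,1\le i\le N}\Bigl|\sum_{t=1}^{T}\sum_{j=1}^{[1/\Delta_n]} F_{k,t[j]}U_{i,t[j]}\Bigr| > \frac{C}{2}\sqrt{\Delta_n T N^{1/\gamma}}\Bigr) \\
&\qquad + P\Bigl(\max_{1\le k\le K,\,1\le i\le N} T\cdot[1/\Delta_n]\cdot|\bar F_k\bar U_i| > \frac{C}{2}\sqrt{\Delta_n T N^{1/\gamma}}\Bigr) \\
&\quad \le CK\Bigl(\frac{1}{N} + \frac{1}{T}\Bigr).
\end{aligned} \tag{B.22}$$

Combining (B.12), (B.17), (B.18) and (B.22), we get that $P(A) \ge 1 - O(1/N + 1/T)$.
The desired bound (B.9) follows.
As to (B.10), note that $\hat\alpha_n - \alpha_n = (\beta - \hat\beta)\bar F + \bar U$. Hence,

$$\|\hat\alpha_n - \alpha_n\|_{\max} \le \|(\beta - \hat\beta)\bar F\|_{\max} + \|\bar U\|_{\max} \le \max_{1\le i\le N}\|\beta_i - \hat\beta_i\|_2 \cdot \|\bar F\|_2 + \|\bar U\|_{\max}.$$

The bound (B.10) follows from (B.9), (B.15) and (B.20).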

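Lemma 4. Under the assumptions of Theorem 2, for some $0 < \varepsilon < M/4 - 1 - 2\gamma$
and $C_0 > 0$, $RV_{\hat U}$ defined in (2.8) satisfies

$$P\Bigl(\max_{1\le i\le N}\Bigl|\frac{1}{T}\sum_{t=1}^{T} RV_{\hat U;it} - \frac{1}{T}\sum_{t=1}^{T} V_{U;it}\Bigr| > C_0\sqrt{\Delta_n}\Bigr) \le C_0\Bigl(\frac{1}{N} + \frac{1}{T}\Bigr), \tag{B.23}$$

and

$$P\Bigl(\max_{1\le i,j\le N}\Bigl|\frac{1}{T}\sum_{t=1}^{T} V_{U;it}V_{U;jt} - \frac{1}{T}\sum_{t=1}^{T} RV_{\hat U;it}RV_{\hat U;jt}\Bigr| > C_0\sqrt{\Delta_n}\Bigr) \le C_0\Bigl(\frac{1}{N} + \frac{1}{T^{\varepsilon}}\Bigr). \tag{B.24}$$

Proof: Recall that $RV_{\hat U;it} = \sum_{j=1}^{[1/\Delta_n]} \hat U_{i,t[j]}^2$. We write $RV_{U;it} = \sum_{j=1}^{[1/\Delta_n]} U_{i,t[j]}^2$.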

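About (B.23), we consider the following event

$$\begin{aligned}
B ={}& \Bigl\{\max_{1\le i\le N}\frac{1}{T}\sum_{t=1}^{T}\sum_{j=1}^{[1/\Delta_n]} (\hat U_{i,t[j]} - U_{i,t[j]})^2 \le CK^2\Delta_n\Bigl(1 + \frac{N^{1/\gamma}}{T}\Bigr)\Bigr\} \\
&\cap \Bigl\{\max_{1\le i\le N}\Bigl|\frac{1}{T}\sum_{t=1}^{T} RV_{U;it} - \frac{1}{T}\sum_{t=1}^{T} V_{U;it}\Bigr| \le C\sqrt{\frac{\Delta_n N^{1/\gamma}}{T}}\Bigr\} \\
&\cap \Bigl\{\max_{1\le i\le N}\frac{1}{T}\sum_{t=1}^{T} V_{U;it} \le C \text{ for some } C > 0\Bigr\}.
\end{aligned}$$

By the Cauchy-Schwarz inequality,

$$\begin{aligned}
&\max_{1\le i\le N}\Bigl|\frac{1}{T}\sum_{t=1}^{T} RV_{\hat U;it} - \frac{1}{T}\sum_{t=1}^{T} RV_{U;it}\Bigr| \\
&\quad \le \max_{1\le i\le N}\frac{1}{T}\sum_{t=1}^{T}\sum_{j=1}^{[1/\Delta_n]} (\hat U_{i,t[j]} - U_{i,t[j]})^2 + 2\max_{1\le i,s\le N}\Bigl|\frac{1}{T}\sum_{t=1}^{T}\sum_{j=1}^{[1/\Delta_n]} (\hat U_{i,t[j]} - U_{i,t[j]})U_{s,t[j]}\Bigr| \\
&\quad \le \max_{1\le i\le N}\frac{1}{T}\sum_{t=1}^{T}\sum_{j=1}^{[1/\Delta_n]} (\hat U_{i,t[j]} - U_{i,t[j]})^2 \\
&\qquad + 2\sqrt{\max_{1\le i\le N}\frac{1}{T}\sum_{t=1}^{T}\sum_{j=1}^{[1/\Delta_n]} (\hat U_{i,t[j]} - U_{i,t[j]})^2} \cdot \sqrt{\max_{1\le i\le N}\frac{1}{T}\sum_{t=1}^{T} RV_{U;it}}.
\end{aligned}$$

Under event $B$, by the triangle inequality,

$$\max_{1\le i\le N}\frac{1}{T}\sum_{t=1}^{T} RV_{U;it} \le C\Bigl(1 + \sqrt{\frac{\Delta_n N^{1/\gamma}}{T}}\Bigr) < 2C,$$

where the last inequality holds by the assumptions that $N^{1/\gamma} = O(T)$ and $\Delta_n = o(1)$.
Therefore, under event $B$,

$$\max_{1\le i\le N}\Bigl|\frac{1}{T}\sum_{t=1}^{T} RV_{\hat U;it} - \frac{1}{T}\sum_{t=1}^{T} RV_{U;it}\Bigr| \le C'K\sqrt{\Delta_n}.$$

It follows from the triangle inequality that

$$\begin{aligned}
&\max_{1\le i\le N}\Bigl|\frac{1}{T}\sum_{t=1}^{T} RV_{\hat U;it} - \frac{1}{T}\sum_{t=1}^{T} V_{U;it}\Bigr| \\
&\quad \le \max_{1\le i\le N}\Bigl|\frac{1}{T}\sum_{t=1}^{T} RV_{\hat U;it} - \frac{1}{T}\sum_{t=1}^{T} RV_{U;it}\Bigr| + \max_{1\le i\le N}\Bigl|\frac{1}{T}\sum_{t=1}^{T} RV_{U;it} - \frac{1}{T}\sum_{t=1}^{T} V_{U;it}\Bigr| \\
&\quad \le C''K\sqrt{\Delta_n}.
\end{aligned}$$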

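It remains to show that $P(B) \ge 1 - O(K/N + K^2/T)$. The assumptions that
$N = O(T^{\gamma})$, $\log T/|\log\Delta_n| = O(1)$, and $M > 4(1 + 2\gamma)$ imply that there exists $\delta_0$
such that $M\delta_0/(2\gamma) > 2$ and $N^{1/(2\gamma)} = o\bigl((T/\Delta_n)^{\frac{1}{2(1-\delta_0)}}\bigr)$. Applying Lemma 1(ii)
to $Z_{tT}/\sqrt{T}$ with $x = N^{1/(2\gamma)}$ and $\delta = \delta_0$, and using Bonferroni's inequality, under
Assumption 4, we have, for some $C > 0$,

$$P\Bigl(\max_{1\le i\le N}\Bigl|\frac{1}{T}\sum_{t=1}^{T} RV_{U;it} - \frac{1}{T}\sum_{t=1}^{T} V_{U;it}\Bigr| > \sqrt{\frac{\Delta_n N^{1/\gamma}}{T}}\Bigr) \le \frac{C_2N}{N^{M\delta_0/(2\gamma)}} \le \frac{C}{N}. \tag{B.25}$$

By (2.7) and the inequality that $(a + b)^2 \le 2a^2 + 2b^2$, for each $1 \le i \le N$, $1 \le t \le T$
and $1 \le j \le [1/\Delta_n]$,

$$\begin{aligned}
(\hat U_{i,t[j]} - U_{i,t[j]})^2 &\le 2(\hat\alpha_{ni} - \alpha_{n;t[j]i})^2 + 2\|\hat\beta_i - \beta_i\|_2^2 \cdot \|F_{t[j]}\|_2^2 \\
&\le 4(\hat\alpha_{ni} - \alpha_{ni})^2 + 4(\alpha_{ni} - \alpha_{n;t[j]i})^2 + 2\|\hat\beta_i - \beta_i\|_2^2 \cdot \|F_{t[j]}\|_2^2.
\end{aligned}$$

By the assumption that $\sup_{s\ge 0}\|\alpha_s\|_{\max} = O(1)$, we have

$$\begin{aligned}
&\max_{1\le i\le N}\frac{1}{T}\sum_{t=1}^{T}\sum_{j=1}^{[1/\Delta_n]} (\hat U_{i,t[j]} - U_{i,t[j]})^2 \\
&\quad \le 2\max_{1\le i\le N}\|\hat\beta_i - \beta_i\|_2^2 \cdot \Bigl(\frac{1}{T}\sum_{t=1}^{T}\sum_{j=1}^{[1/\Delta_n]}\|F_{t[j]}\|_2^2\Bigr) + 4[1/\Delta_n]\cdot\|\hat\alpha_n - \alpha_n\|_{\max}^2 + C\Delta_n.
\end{aligned} \tag{B.26}$$

By (B.12) and (B.13), for some $C' > 0$, when $T$ is large, we have

$$P\Bigl(\frac{1}{T}\sum_{t=1}^{T}\sum_{j=1}^{[1/\Delta_n]}\|F_{t[j]}\|_2^2 > KC'\Bigr) \le \frac{C'K^2}{T}. \tag{B.27}$$

Combining (B.9), (B.10), (B.26) and (B.27) yields, for some $C > 0$,

$$P\Bigl(\max_{1\le i\le N}\frac{1}{T}\sum_{t=1}^{T}\sum_{j=1}^{[1/\Delta_n]} (\hat U_{i,t[j]} - U_{i,t[j]})^2 > CK^2\Delta_n\Bigl(1 + \frac{N^{1/\gamma}}{T}\Bigr)\Bigr) \le C\Bigl(\frac{K}{N} + \frac{K^2}{T}\Bigr). \tag{B.28}$$

Under Assumptions 1 and 4, by Lemma 2(ii),

$$P\Bigl(\max_{1\le i\le N}\Bigl|\frac{1}{T}\sum_{t=1}^{T} V_{U;it} - E(V_{U;i})\Bigr| \ge C\sqrt{\frac{\log N}{T}}\Bigr) \le C\Bigl(\frac{1}{N} + \frac{1}{T}\Bigr). \tag{B.29}$$

Assumption 4 implies $\max_{1\le i\le N} E(V_{U;i}^4) = O(1)$. By the assumption that $N = O(T^{\gamma})$,
we have for some $C > 0$,

$$P\Bigl(\max_{1\le i\le N}\frac{1}{T}\sum_{t=1}^{T} V_{U;it} > C\Bigr) \le C\Bigl(\frac{1}{N} + \frac{1}{T}\Bigr). \tag{B.30}$$

Combining (B.25), (B.28) and (B.30) yields $P(B) \ge 1 - O(1/N + 1/T)$. The desired
bound (B.23) follows.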
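As to (B.24), by the triangle inequality and the Cauchy-Schwarz inequality, we
have

$$\begin{aligned}
&\max_{1\le i,j\le N}\Bigl|\frac{1}{T}\sum_{t=1}^{T} V_{U;it}V_{U;jt} - \frac{1}{T}\sum_{t=1}^{T} RV_{\hat U;it}RV_{\hat U;jt}\Bigr| \\
&\quad \le \max_{1\le i,j\le N}\Bigl|\frac{2}{T}\sum_{t=1}^{T} V_{U;it}(V_{U;jt} - RV_{\hat U;jt})\Bigr|
 + \max_{1\le i,j\le N}\Bigl|\frac{1}{T}\sum_{t=1}^{T} (V_{U;it} - RV_{\hat U;it})(V_{U;jt} - RV_{\hat U;jt})\Bigr| \\
&\quad \le 2\sqrt{\max_{1\le i\le N}\frac{1}{T}\sum_{t=1}^{T} V_{U;it}^2} \cdot \sqrt{\max_{1\le i\le N}\frac{1}{T}\sum_{t=1}^{T} (V_{U;it} - RV_{\hat U;it})^2}
 + \max_{1\le i\le N}\frac{1}{T}\sum_{t=1}^{T} (V_{U;it} - RV_{\hat U;it})^2.
\end{aligned} \tag{B.31}$$

Under Assumption 3, by Lemma 2(ii), we have, for some $\varepsilon < M/4 - 1 - 2\gamma$,

$$P\Bigl(\max_{1\le i,j\le N}\Bigl|\frac{1}{T}\sum_{t=1}^{T} V_{U;it}V_{U;jt} - E(V_{U;i}V_{U;j})\Bigr| > C\sqrt{\frac{\log N}{T}}\Bigr) \le C\Bigl(\frac{1}{N} + \frac{1}{T^{\varepsilon}}\Bigr). \tag{B.32}$$

Assumption 4 implies that $\max_{1\le i\le N} E(V_{U;i}^{M}) = O(1)$. By Lemma 2(ii) and
$M > 4(1 + 2\gamma)$,

$$P\Bigl(\max_{1\le i\le N}\Bigl|\frac{1}{T}\sum_{t=1}^{T} V_{U;it}^2 - E(V_{U;i}^2)\Bigr| \ge C\sqrt{\frac{\log N}{T}}\Bigr) \le C\Bigl(\frac{1}{N} + \frac{1}{T^{\varepsilon}}\Bigr).$$

This implies that

$$P\Bigl(\max_{1\le i\le N}\frac{1}{T}\sum_{t=1}^{T} V_{U;it}^2 > C\Bigr) \le C\Bigl(\frac{1}{N} + \frac{1}{T^{\varepsilon}}\Bigr). \tag{B.33}$$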

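By Lemma 1, for any $1 \le t \le T$, $1 \le i \le N$, $0 < \delta < 1$, there exist constants
$C, C_1 > 0$ such that for all $x \le C\Delta_n^{-\frac{1}{2(1-\delta)}}$,

$$P\Bigl(|RV_{U;it} - V_{U;it}| > \sqrt{\Delta_n}\,x\Bigr) \le \frac{C_1}{x^{M\delta}}. \tag{B.34}$$

We define $(RV_{U;it} - V_{U;it})_{tr} = (RV_{U;it} - V_{U;it}) \cdot \mathbf{1}_{\{|RV_{U;it} - V_{U;it}| \le \sqrt{\Delta_n}\,T^{(1+\gamma+\varepsilon)/(M\delta)}\}}$, where
$\varepsilon > 0$ and $0 < \delta < 1$ are to be determined. The assumptions $\log T/|\log\Delta_n| = O(1)$
and $M > 4(1 + 2\gamma)$ imply that, for some $\varepsilon$ sufficiently small, we have

$$\Delta_n^{1/\varepsilon}\,T = o(1) \quad\text{and}\quad M > 4(1 + \gamma + \varepsilon).$$

Set $\delta$ satisfying $M\delta > 4(1 + \gamma + \varepsilon)$ and $\delta/(1 - \delta) > 2(1 + \gamma + \varepsilon)/(M\varepsilon)$. Then

$$T^{(1+\gamma+\varepsilon)/(M\delta)} = o\Bigl(\Delta_n^{-1/(2(1-\delta))}\Bigr).$$

Hence, by (B.34), for any $1 \le t \le T$, $1 \le i \le N$ and all $x > 0$,

$$P\Bigl(\frac{(RV_{U;it} - V_{U;it})_{tr}^2}{\Delta_n} > x\Bigr) \le \frac{C_1}{x^{M\delta/2}}.$$

This implies that, for all $0 < y < M\delta/2$,

$$\max_{1\le i\le N,\,1\le t\le T} E\Bigl(\Bigl(\frac{(RV_{U;it} - V_{U;it})_{tr}^2}{\Delta_n}\Bigr)^{y}\Bigr) = O(1).$$

In particular, by setting $y = 1$,

$$\max_{1\le i\le N,\,1\le t\le T} E\Bigl(\frac{(RV_{U;it} - V_{U;it})_{tr}^2}{\Delta_n}\Bigr) < C. \tag{B.35}$$

Applying Lemma 2(ii) to $(RV_{U;it} - V_{U;it})_{tr}^2/\Delta_n - E\bigl((RV_{U;it} - V_{U;it})_{tr}^2/\Delta_n\bigr)$, and by
$M\delta > 4(1 + \gamma)$, we have that

$$P\Bigl(\max_{1\le i\le N}\Bigl|\frac{1}{T}\sum_{t=1}^{T}\Bigl(\frac{(RV_{U;it} - V_{U;it})_{tr}^2}{\Delta_n} - E\Bigl(\frac{(RV_{U;it} - V_{U;it})_{tr}^2}{\Delta_n}\Bigr)\Bigr)\Bigr| > C\sqrt{\frac{\log N}{T}}\Bigr) \le C\Bigl(\frac{1}{N} + \frac{1}{T^{\varepsilon}}\Bigr). \tag{B.36}$$

By (B.35), (B.36) and the triangle inequality, we get

$$P\Bigl(\max_{1\le i\le N}\frac{1}{T}\sum_{t=1}^{T} (RV_{U;it} - V_{U;it})_{tr}^2 > C\Delta_n\Bigr) \le C\Bigl(\frac{1}{N} + \frac{1}{T^{\varepsilon}}\Bigr). \tag{B.37}$$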

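Note also that by (B.34) and Bonferroni's inequality, we have

$$\begin{aligned}
&P\Bigl(\max_{1\le i\le N}\Bigl|\frac{1}{T}\sum_{t=1}^{T} (RV_{U;it} - V_{U;it})_{tr}^2 - \frac{1}{T}\sum_{t=1}^{T} (RV_{U;it} - V_{U;it})^2\Bigr| > 0\Bigr) \\
&\quad \le \sum_{i=1}^{N}\sum_{t=1}^{T} P\Bigl(|RV_{U;it} - V_{U;it}| > \sqrt{\Delta_n}\,T^{(1+\gamma+\varepsilon)/(M\delta)}\Bigr) \le \frac{CN}{T^{\gamma+\varepsilon}} = O\Bigl(\frac{1}{T^{\varepsilon}}\Bigr),
\end{aligned} \tag{B.38}$$

where the last inequality holds by the assumption that $N = O(T^{\gamma})$. Combining
(B.37) and (B.38) yields, for some constant $C > 0$,

$$P\Bigl(\max_{1\le i\le N}\frac{1}{T}\sum_{t=1}^{T} (RV_{U;it} - V_{U;it})^2 > C\Delta_n\Bigr) \le C\Bigl(\frac{1}{N} + \frac{1}{T^{\varepsilon}}\Bigr). \tag{B.39}$$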

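Moreover, by the inequality $(a + b)^2 \le 2a^2 + 2b^2$, we have

$$\begin{aligned}
&\max_{1\le i\le N}\frac{1}{T}\sum_{t=1}^{T} (RV_{\hat U;it} - RV_{U;it})^2 \\
&\quad = \max_{1\le i\le N}\frac{1}{T}\sum_{t=1}^{T}\Bigl(\sum_{j=1}^{[1/\Delta_n]} (\hat U_{i,t[j]} - U_{i,t[j]})^2 + 2\sum_{j=1}^{[1/\Delta_n]} (\hat U_{i,t[j]} - U_{i,t[j]})U_{i,t[j]}\Bigr)^2 \\
&\quad \le \max_{1\le i\le N}\frac{2}{T}\sum_{t=1}^{T}\Bigl(\sum_{j=1}^{[1/\Delta_n]} (\hat U_{i,t[j]} - U_{i,t[j]})^2\Bigr)^2
 + \max_{1\le i\le N}\frac{8}{T}\sum_{t=1}^{T}\Bigl(\sum_{j=1}^{[1/\Delta_n]} (\hat U_{i,t[j]} - U_{i,t[j]})U_{i,t[j]}\Bigr)^2.
\end{aligned} \tag{B.40}$$

Applying the Cauchy-Schwarz inequality repeatedly yields

$$\begin{aligned}
&\max_{1\le i\le N}\frac{1}{T}\sum_{t=1}^{T}\Bigl(\sum_{j=1}^{[1/\Delta_n]} (\hat U_{i,t[j]} - U_{i,t[j]})U_{i,t[j]}\Bigr)^2 \\
&\quad \le \max_{1\le i\le N}\frac{1}{T}\sum_{t=1}^{T}\Bigl(\sum_{j=1}^{[1/\Delta_n]} (\hat U_{i,t[j]} - U_{i,t[j]})^2\Bigr) \cdot \Bigl(\sum_{j=1}^{[1/\Delta_n]} U_{i,t[j]}^2\Bigr) \\
&\quad \le \sqrt{\max_{1\le i\le N}\frac{1}{T}\sum_{t=1}^{T}\Bigl(\sum_{j=1}^{[1/\Delta_n]} (\hat U_{i,t[j]} - U_{i,t[j]})^2\Bigr)^2} \cdot \sqrt{\max_{1\le i\le N}\frac{1}{T}\sum_{t=1}^{T} RV_{U;it}^2}.
\end{aligned} \tag{B.41}$$

By (B.33) and (B.39), we have, for some constant $C > 0$,

$$P\Bigl(\max_{1\le i\le N}\frac{1}{T}\sum_{t=1}^{T} RV_{U;it}^2 > C\Bigr) \le C\Bigl(\frac{1}{N} + \frac{1}{T^{\varepsilon}}\Bigr). \tag{B.42}$$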

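Similarly, we have

$$\begin{aligned}
&\max_{1\le i\le N}\sqrt{\frac{1}{T}\sum_{t=1}^{T}\Bigl(\sum_{j=1}^{[1/\Delta_n]} (\hat U_{i,t[j]} - U_{i,t[j]})^2\Bigr)^2} \\
&\quad \le \max_{1\le i\le N}\sqrt{\frac{1}{T}\sum_{t=1}^{T}\Bigl(\sum_{j=1}^{[1/\Delta_n]} 2\|\hat\beta_i - \beta_i\|_2^2 \cdot \|F_{t[j]}\|_2^2 + \sum_{j=1}^{[1/\Delta_n]} 2|\hat\alpha_{ni} - \alpha_{n;t[j]i}|^2\Bigr)^2} \\
&\quad \le \max_{1\le i\le N}\|\hat\beta_i - \beta_i\|_2^2 \cdot \sqrt{\frac{8}{T}\sum_{t=1}^{T}\Bigl(\sum_{k=1}^{K}\sum_{j=1}^{[1/\Delta_n]} F_{k,t[j]}^2\Bigr)^2}
 + \max_{1\le i\le N}\sqrt{\frac{8}{T}\sum_{t=1}^{T}\Bigl(\sum_{j=1}^{[1/\Delta_n]} |\hat\alpha_{ni} - \alpha_{n;t[j]i}|^2\Bigr)^2} \\
&\quad \le \max_{1\le i\le N}\|\hat\beta_i - \beta_i\|_2^2 \cdot \sqrt{\frac{8K}{T}\sum_{t=1}^{T}\sum_{k=1}^{K} RV_{F;kt}^2}
 + 8[1/\Delta_n]\cdot\|\hat\alpha_n - \alpha_n\|_{\max}^2 \\
&\qquad + 8\max_{1\le i\le N}\sqrt{\frac{1}{T}\sum_{t=1}^{T}\Bigl(\sum_{j=1}^{[1/\Delta_n]} |\alpha_{ni} - \alpha_{n;t[j]i}|^2\Bigr)^2},
\end{aligned}$$

where $RV_{F;kt} = \sum_{j=1}^{[1/\Delta_n]} F_{k,t[j]}^2$. By the assumption that $\sup_{s\ge 0}\|\alpha_s\|_{\max} = O(1)$, we
have

$$\max_{1\le i\le N}\sqrt{\frac{1}{T}\sum_{t=1}^{T}\Bigl(\sum_{j=1}^{[1/\Delta_n]} |\alpha_{ni} - \alpha_{n;t[j]i}|^2\Bigr)^2} \le C\Delta_n. \tag{B.43}$$

Moreover, by Lemma 2(i), similar to the proof of (B.33), (B.39) and (B.42), one can
show that, for some $C > 0$,

$$P\Bigl(\max_{1\le k\le K}\frac{1}{T}\sum_{t=1}^{T} RV_{F;kt}^2 > C\Bigr) \le \frac{C}{T^{\varepsilon}}. \tag{B.44}$$

Under the assumption that $N = O(T^{\gamma})$, combining (B.43), (B.44) and Lemma 3
yields

$$P\Bigl(\max_{1\le i\le N}\sqrt{\frac{1}{T}\sum_{t=1}^{T}\Bigl(\sum_{j=1}^{[1/\Delta_n]} (\hat U_{i,t[j]} - U_{i,t[j]})^2\Bigr)^2} > CK^2\Delta_n\Bigr) \le C\Bigl(\frac{K}{N} + \frac{K^2}{T}\Bigr). \tag{B.45}$$

By the triangle inequality, combining (B.39), (B.40), (B.41), (B.42) and (B.45) yields

$$P\Bigl(\max_{1\le i\le N}\sqrt{\frac{1}{T}\sum_{t=1}^{T} (RV_{\hat U;it} - V_{U;it})^2} > CK\sqrt{\Delta_n}\Bigr) \le C\Bigl(\frac{K}{N} + \frac{1}{T^{\varepsilon}}\Bigr). \tag{B.46}$$

By (B.31), (B.33) and (B.46), the desired bound (B.24) follows.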

Proof of Theorem 1:
Applying the same argument as the proof of (B.25) to the stock RV, under the
assumptions of Theorem 1, one can show that, for some C > 0,

T T
r !
1 X X ∆n N 1/γ C
P max RVit − Vit > ≤ . (B.47)
1≤i≤N T T N
t=1 t=1

By the triangle inequality and the Cauchy-Schwarz inequality,

T T
1X 1X
max Vit Vjt − RVit RVjt
1≤i,j≤N T t=1 T t=1
T T
2X 1X
≤ max Vit (Vjt − RVjt ) + max (Vit − RVit )(Vjt − RVjt )
1≤i,j≤N T 1≤i,j≤N T
t=1 t=1
v v
u T u T T
u 1 X u 1X 1X
≤ 2 max
t 2 t
Vit · max 2
(Vit − RVit ) + max (Vit − RVit )2 .
1≤i≤N T 1≤i≤N T 1≤i≤N T
t=1 t=1 t=1

In addition, under the assumptions of Theorem 1, (B.29), (B.32), (B.33) and (B.39) hold by replacing $RV_{U;it}$ with $RV_{it}$, and $V_{U;it}$ with $V_{it}$. It follows that

$$
P\Bigg(\max_{1\le i,j\le N}\bigg|\frac{1}{T}\sum_{t=1}^{T}V_{it}V_{jt}-\frac{1}{T}\sum_{t=1}^{T}RV_{it}RV_{jt}\bigg|>C\sqrt{\Delta_n}\Bigg)\le C\Big(\frac{1}{N}+\frac{1}{T^{\varepsilon}}\Big), \quad\text{and} \tag{B.48}
$$

$$
P\Bigg(\|\widehat{\Sigma}_V-\Sigma_V\|_{\max}>C\sqrt{\frac{\log N}{T}}\Bigg)\le C\Big(\frac{1}{N}+\frac{1}{T^{\varepsilon}}\Big), \tag{B.49}
$$

where $\widehat{\Sigma}_V$ is the sample covariance matrix of $V_{it}$.

The bound (2.4) follows from (B.47), (B.48) and (B.49). The bounds (2.5) and
(2.6) follow from (2.4), Assumption 3, Weyl’s Theorem and the sin θ Theorem (Davis
and Kahan (1970)), which asserts that, for $i\le q$,

$$
\|\widehat{\xi}_{RV_i}-\xi_{V_i}\|\le\frac{2\|\widehat{\Sigma}_V-\widehat{\Sigma}_{RV}\|_2}{\min\big(|\widehat{\lambda}_{RV_{i-1}}-\lambda_{V_i}|,\ |\widehat{\lambda}_{RV_{i+1}}-\lambda_{V_i}|\big)}.
$$
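To make the eigenvector bound concrete, here is a minimal Python sketch that evaluates both sides of a sin θ type inequality for a pair of symmetric matrices. The helper names `eig_desc` and `sin_theta_sides`, the 6×6 test matrix and the perturbation size are hypothetical choices for illustration; the first eigenvector has no left neighbour, so only the available gap is used there.

```python
import numpy as np

def eig_desc(S):
    """Eigenvalues in descending order with matching eigenvectors (symmetric S)."""
    vals, vecs = np.linalg.eigh(S)
    order = np.argsort(vals)[::-1]
    return vals[order], vecs[:, order]

def sin_theta_sides(sigma_v, sigma_rv, i):
    """Return (lhs, rhs) of the eigenvector bound for the i-th (0-based) eigenpair."""
    lam_v, xi_v = eig_desc(sigma_v)
    lam_rv, xi_rv = eig_desc(sigma_rv)
    v, v_hat = xi_v[:, i], xi_rv[:, i]
    if v @ v_hat < 0:                      # eigenvectors are only defined up to sign
        v_hat = -v_hat
    lhs = np.linalg.norm(v_hat - v)
    gaps = []
    if i > 0:
        gaps.append(abs(lam_rv[i - 1] - lam_v[i]))
    if i + 1 < len(lam_v):
        gaps.append(abs(lam_rv[i + 1] - lam_v[i]))
    rhs = 2.0 * np.linalg.norm(sigma_v - sigma_rv, 2) / min(gaps)
    return lhs, rhs

rng = np.random.default_rng(1)
Q, _ = np.linalg.qr(rng.normal(size=(6, 6)))              # random orthogonal basis
sigma_v = Q @ np.diag([8.0, 4.0, 2.0, 1.0, 0.5, 0.25]) @ Q.T
E = rng.normal(size=(6, 6))
sigma_rv = sigma_v + 0.01 * (E + E.T)                      # small symmetric perturbation
results = [sin_theta_sides(sigma_v, sigma_rv, i) for i in range(2)]
for i, (lhs, rhs) in enumerate(results):
    print(f"eigenvector {i}: distance = {lhs:.5f}, bound = {rhs:.5f}")
```

The example uses well-separated leading eigenvalues and a small perturbation, which is the regime in which the eigengap in the denominator stays bounded away from zero.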

Proof of Theorem 2:
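By (B.29) and (B.32), we have

$$
\|\widehat{\Sigma}_{V_U}-\Sigma_{V_U}\|_{\max}=O_p\Bigg(\sqrt{\frac{\log N}{T}}\Bigg), \tag{B.50}
$$

where $\widehat{\Sigma}_{V_U}$ is the sample covariance matrix of $V_U$. By Lemma 4, we have

$$
\|\widehat{\Sigma}_{V_U}-\widehat{\Sigma}_{RV_{\widehat{U}}}\|_{\max}=O_p\big(\sqrt{\Delta_n}\big). \tag{B.51}
$$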

Combining (B.50) and (B.51) yields the desired bound (2.9).


The bound (2.10) follows from (B.51), Weyl’s Theorem and Assumption 5. The
bound (2.11) follows from (2.9) and the sin θ Theorem.

Proof of Proposition 1:



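Define

$$
\widehat{b}_{i,V}=\frac{\sum_{t=1}^{T}(CV_t-\overline{CV})(V_{it}-\overline{V}_i)}{\sum_{t=1}^{T}(CV_t-\overline{CV})^2},\qquad
\widehat{a}_{i,V}=\overline{V}_i-\widehat{b}_{i,V}\,\overline{CV}, \tag{B.52}
$$

where $\overline{CV}=\sum_{t=1}^{T}CV_t/T$, and $\overline{V}_i=\sum_{t=1}^{T}V_{it}/T$. We have,

$$
\widehat{b}_{i,V}-b_i=\Big(\sum_{t=1}^{T}(CV_t-\overline{CV})^2\Big)^{-1}\cdot\Big(\sum_{t=1}^{T}(CV_t-\overline{CV})(\varepsilon_{it}-\overline{\varepsilon}_i)\Big),
$$

$$
|\widehat{a}_{i,V}-a_i|\le|\widehat{b}_{i,V}-b_i|\cdot\overline{CV}+|\overline{\varepsilon}_i|,
$$

where $\overline{\varepsilon}_i=\sum_{t=1}^{T}\varepsilon_{it}/T$.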
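The estimators in (B.52) are ordinary least squares coefficients from regressing each $V_{it}$ on $CV_t$. The Python sketch below computes them on simulated data and checks the two error expressions above. It assumes the linear form $V_{it}=a_i+b_i CV_t+\varepsilon_{it}$ implied by those expressions; all parameter values, distributions and dimensions are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(2)
N, T = 10, 400                                   # hypothetical dimensions
a = rng.uniform(0.1, 0.5, size=N)                # hypothetical intercepts a_i
b = rng.uniform(0.5, 1.5, size=N)                # hypothetical slopes b_i
CV = np.exp(0.3 * rng.normal(size=T))            # illustrative common factor CV_t (positive)
eps = 0.02 * rng.normal(size=(N, T))             # illustrative errors eps_it
V = a[:, None] + b[:, None] * CV + eps           # assumed model V_it = a_i + b_i CV_t + eps_it

cv_c = CV - CV.mean()                            # CV_t minus its time average
v_c = V - V.mean(axis=1, keepdims=True)          # V_it minus its time average
b_hat = (v_c @ cv_c) / (cv_c @ cv_c)             # slope estimator in (B.52)
a_hat = V.mean(axis=1) - b_hat * CV.mean()       # intercept estimator in (B.52)

eps_bar = eps.mean(axis=1)
eps_c = eps - eps_bar[:, None]
slope_err = (eps_c @ cv_c) / (cv_c @ cv_c)       # closed form for b_hat - b

print("max |b_hat - b| =", np.abs(b_hat - b).max())
print("max |a_hat - a| =", np.abs(a_hat - a).max())
```

Because the slope error depends only on the centred errors and the centred factor, the closed form agrees with `b_hat - b` up to floating-point rounding.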

Under the assumptions of Theorem 1, $\max_{1\le i\le N}E(V_{it}^M)=O(1)$, $E(CV_t^M)=O(1)$, and $\max_{1\le i\le N}\big(|b_{\xi,i}|,|a_{\xi,i}|\big)=O(1)$. Moreover, under Assumption 6 that $|\bar{b}_{\xi}|>c$, we have $\max_{1\le i\le N}|b_{\xi,i}/\bar{b}_{\xi}|=O(1)$. Hence,

$$
\max_{1\le i\le N}|E(\varepsilon_{it}^M)|\le M\cdot\max_{1\le i\le N}E(V_{it}^M)+M\cdot|b_{\xi,i}/\bar{b}_{\xi}|^M\cdot E(CV_t^M)=O(1).
$$

By Lemma 2, we have $\sum_{t=1}^{T}(CV_t-\overline{CV})^2/T=\mathrm{Var}(CV_t)+o_p(1)$, and $\overline{CV}=O_p(1)$.
By the assumption that $|\bar{b}_{\xi}|>c$, we have $\mathrm{Var}(CV_t)>\bar{b}_{\xi}^2>0$.

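Recall that $CV_t=\bar{a}_{\xi}+\bar{b}_{\xi}\xi_t+\bar{\varepsilon}_{\xi,t}$, and $\varepsilon_{it}=\varepsilon_{\xi,it}-E(\varepsilon_{\xi,it})-(b_{\xi,i}/\bar{b}_{\xi})\big(\bar{\varepsilon}_{\xi,t}-E(\bar{\varepsilon}_{\xi,t})\big)$.
By Assumption 6, we have

$$
\begin{aligned}
\max_{1\le i\le N}|E(CV_t\cdot\varepsilon_{it})|
&=\max_{1\le i\le N}\Big|E\Big(\big(\bar{\varepsilon}_{\xi,t}-E(\bar{\varepsilon}_{\xi,t})\big)\cdot\big(\varepsilon_{\xi,it}-E(\varepsilon_{\xi,it})\big)-(b_{\xi,i}/\bar{b}_{\xi})\big(\bar{\varepsilon}_{\xi,t}-E(\bar{\varepsilon}_{\xi,t})\big)^2\Big)\Big|\\
&\le\Big(\frac{1}{\sqrt{N}}+\frac{1}{N}\max_{1\le i\le N}|b_{\xi,i}/\bar{b}_{\xi}|\Big)\cdot\lambda_{\max}\big(\mathrm{Cov}(\varepsilon_{\xi,t})\big)=O(1/\sqrt{N}).
\end{aligned}
$$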
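The factors $1/\sqrt{N}$ and $1/N$ in the last bound reflect two generic covariance facts, under the assumption (suggested by the bar notation) that $\bar{\varepsilon}_{\xi,t}$ is the cross-sectional average of the $\varepsilon_{\xi,it}$: the covariance between the average and any single component is at most $\lambda_{\max}/\sqrt{N}$ in absolute value, and the variance of the average is at most $\lambda_{\max}/N$. The Python sketch below checks both on a hypothetical random covariance matrix.

```python
import numpy as np

rng = np.random.default_rng(3)
N = 50                                   # hypothetical cross-sectional dimension
A = rng.normal(size=(N, N))
cov = A @ A.T / N                        # hypothetical covariance matrix (symmetric PSD)
lam_max = np.linalg.eigvalsh(cov)[-1]    # largest eigenvalue (eigvalsh sorts ascending)

# Cov(average, component i) = N^{-1} * sum_j Cov(component j, component i)
cov_with_mean = cov.mean(axis=1)
# Var(average) = N^{-2} * sum_{i,j} Cov(component i, component j)
var_of_mean = cov.mean()

print("max |Cov(avg, eps_i)| =", np.abs(cov_with_mean).max(),
      " bound =", lam_max / np.sqrt(N))
print("Var(avg) =", var_of_mean, " bound =", lam_max / N)
```

Both facts follow from the Rayleigh quotient bound applied to the vector of ones, so they hold for any covariance matrix, not only the simulated one.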

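By Lemma 2 and that $M>4(1+2\gamma)$, we get

$$
\max_{1\le i\le N}\bigg|\frac{1}{T}\sum_{t=1}^{T}\big(CV_t\cdot\varepsilon_{it}-E(CV_t\cdot\varepsilon_{it})\big)\bigg|=O_p\Bigg(\sqrt{\frac{\log N}{T}}\Bigg),
$$

$$
|\overline{CV}-E(CV_t)|=O_p\Bigg(\sqrt{\frac{\log T}{T}}\Bigg),\quad\text{and}\quad\max_{1\le i\le N}|\overline{\varepsilon}_i|=O_p\Bigg(\sqrt{\frac{\log N}{T}}\Bigg).
$$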



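Combining the results above yields that

$$
\max_{1\le i\le N}\bigg|\frac{1}{T}\sum_{t=1}^{T}(CV_t-\overline{CV})\cdot(\varepsilon_{it}-\overline{\varepsilon}_i)\bigg|=O_p\Bigg(\sqrt{\frac{\log N}{T}}+\frac{1}{\sqrt{N}}\Bigg).
$$

It follows that

$$
\max_{1\le i\le N}\big(|\widehat{b}_{i,V}-b_i|,\ |\widehat{a}_{i,V}-a_i|\big)=O_p\Bigg(\sqrt{\frac{\log N}{T}}+\frac{1}{\sqrt{N}}\Bigg). \tag{B.53}
$$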

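Next, we bound the differences $\widehat{b}_{i,V}-\widehat{b}_i$ and $\widehat{a}_{i,V}-\widehat{a}_i$, where, recall that $\widehat{b}_i$ and $\widehat{a}_i$
are defined in (3.4). By (B.47),

$$
|\overline{CV}-\overline{CRV}|\le\max_{1\le i\le N}|\overline{RV}_i-\overline{V}_i|=O_p\Bigg(\sqrt{\frac{\Delta_n N^{1/\gamma}}{T}}\Bigg). \tag{B.54}
$$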

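By the triangle inequality and the Cauchy-Schwarz inequality, we have

$$
\bigg|\frac{1}{T}\sum_{t=1}^{T}CRV_t^2-\frac{1}{T}\sum_{t=1}^{T}CV_t^2\bigg|
\le 2\sqrt{\frac{1}{T}\sum_{t=1}^{T}CV_t^2}\cdot\sqrt{\frac{1}{T}\sum_{t=1}^{T}(CV_t-CRV_t)^2}
+\frac{1}{T}\sum_{t=1}^{T}(CV_t-CRV_t)^2,
$$

and

$$
\begin{aligned}
&\max_{1\le i\le N}\bigg|\frac{1}{T}\sum_{t=1}^{T}CV_t\cdot V_{it}-\frac{1}{T}\sum_{t=1}^{T}CRV_t\cdot RV_{it}\bigg|\\
&\quad\le 2\sqrt{\frac{1}{T}\sum_{t=1}^{T}CV_t^2}\cdot\sqrt{\max_{1\le i\le N}\frac{1}{T}\sum_{t=1}^{T}(V_{it}-RV_{it})^2}
+\sqrt{\frac{1}{T}\sum_{t=1}^{T}(CV_t-CRV_t)^2}\cdot\sqrt{\max_{1\le i\le N}\frac{1}{T}\sum_{t=1}^{T}(V_{it}-RV_{it})^2}.
\end{aligned}
$$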



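By the Cauchy-Schwarz inequality again,

$$
\begin{aligned}
\frac{1}{T}\sum_{t=1}^{T}(CV_t-CRV_t)^2
&=\frac{1}{TN^2}\sum_{t=1}^{T}\Big(\sum_{i=1}^{N}V_{it}-\sum_{i=1}^{N}RV_{it}\Big)^2\\
&\le\frac{1}{TN}\sum_{t=1}^{T}\sum_{i=1}^{N}(V_{it}-RV_{it})^2
\le\max_{1\le i\le N}\frac{1}{T}\sum_{t=1}^{T}(V_{it}-RV_{it})^2.
\end{aligned}
$$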
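The chain above says that averaging across stocks cannot increase the mean squared measurement error. A short Python sketch on simulated data illustrates the three quantities, taking $CV_t$ and $CRV_t$ to be the cross-sectional averages of $V_{it}$ and $RV_{it}$ as in the first equality; the lognormal panel and the multiplicative noise are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(4)
N, T = 30, 250                                    # hypothetical dimensions
V = np.exp(rng.normal(size=(N, T)))               # illustrative variances V_it
RV = V * np.exp(0.1 * rng.normal(size=(N, T)))    # illustrative noisy proxies RV_it
CV, CRV = V.mean(axis=0), RV.mean(axis=0)         # cross-sectional averages per period

first = ((CV - CRV) ** 2).mean()                  # T^{-1} sum_t (CV_t - CRV_t)^2
middle = ((V - RV) ** 2).mean()                   # (TN)^{-1} sum_t sum_i (V_it - RV_it)^2
last = ((V - RV) ** 2).mean(axis=1).max()         # max_i T^{-1} sum_t (V_it - RV_it)^2

print(f"{first:.5f} <= {middle:.5f} <= {last:.5f}")
```

With independent noise across stocks the first quantity is typically much smaller than the other two, which shows that the bound used in the proof is conservative.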

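Under Assumption 6, we have $\frac{1}{T}\sum_{t=1}^{T}CV_t^2=O_p(1)$. Moreover, similar to (B.39),
one can show that $\max_{1\le i\le N}\frac{1}{T}\sum_{t=1}^{T}(RV_{it}-V_{it})^2=O_p(\Delta_n)$. Combining the results
above, we have that

$$
\begin{aligned}
\bigg|\frac{1}{T}\sum_{t=1}^{T}CRV_t^2-\frac{1}{T}\sum_{t=1}^{T}CV_t^2\bigg|&=O_p\big(\sqrt{\Delta_n}\big),\quad\text{and}\\
\max_{1\le i\le N}\bigg|\frac{1}{T}\sum_{t=1}^{T}CV_t\cdot V_{it}-\frac{1}{T}\sum_{t=1}^{T}CRV_t\cdot RV_{it}\bigg|&=O_p\big(\sqrt{\Delta_n}\big).
\end{aligned} \tag{B.55}
$$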

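Combining (B.47), (B.52), (B.54) and (B.55) yields

$$
\max_{1\le i\le N}|\widehat{b}_{i,V}-\widehat{b}_i|=O_p\big(\sqrt{\Delta_n}\big),\quad\text{and}\quad\max_{1\le i\le N}|\widehat{a}_{i,V}-\widehat{a}_i|=O_p\big(\sqrt{\Delta_n}\big). \tag{B.56}
$$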

The desired bound (3.5) follows from (B.53) and (B.56).

References
Cai, T Tony, Jianchang Hu, Yingying Li, and Xinghua Zheng (2020):
“High-dimensional minimum variance portfolio estimation based on high-frequency
data,” Journal of Econometrics, 214, 482–494.

Davis, Chandler and William Morton Kahan (1970): “The rotation of eigen-
vectors by a perturbation. III,” SIAM Journal on Numerical Analysis, 7, 1–46.

Ding, Yi, Robert Engle, Yingying Li, and Xinghua Zheng (2022): “Factor
modeling for volatility,” .

Fan, Jianqing, Yingying Li, and Ke Yu (2012): “Vast volatility matrix estimation
using high-frequency data for portfolio selection,” Journal of the American
Statistical Association, 107, 412–428.

Merlevède, Florence, Magda Peligrad, Emmanuel Rio, et al. (2009):
“Bernstein inequality and moderate deviations under strong mixing conditions,”
in High dimensional probability V: the Luminy volume, Institute of Mathematical
Statistics, 273–292.
