Factor Modeling For Volatility
Yi Ding Robert Engle Yingying Li Xinghua Zheng
Abstract
[Figure 1 plot omitted. Vertical axis: stock daily RV on a log scale; series: RV 30%, RV 50%, RV 70%.]
Figure 1: Time series plots (log scale) of three representative S&P500 Index con-
stituent stocks’ RVs (RV 30%, RV 50%, RV 70%) based on 5-minute intraday returns
from 2003 to 2020 with mean RVs falling on the 30%, 50%, 70% quantiles of all mean
RVs.
Figure 1 shows clearly that the stock RVs co-move. Such a co-movement feature
in volatilities has been well-documented. For example, Engle, Ito, and Lin (1990)
and Calvet, Fisher, and Thompson (2006) examine exchange markets, Susmel and
Engle (1994); Da and Schaumburg (2006) and Kelly, Lustig, and Van Nieuwerburgh
(2013) study equities, and Bollerslev, Hood, Huss, and Pedersen (2018) and Engle
and Martin (2019) study global multiple asset classes. The volatility co-movement
has been used in volatility forecasting; see, for example, Luciani and Veredas (2015);
Asai and McAleer (2015); Barigozzi and Hallin (2017), and Bollerslev, Hood, Huss,
and Pedersen (2018).
The co-movement in volatility is not surprising as it is well known that returns ad-
We mainly answer this question from the perspective of volatility forecasting. Under
our proposed MVF model, we simply predict the CV factor and the multiplicative
idiosyncratic components separately using log HAR models. The stock volatility
2.1 Data
We focus on the S&P 500 Index constituent stocks in 2003 and exclude the least liquid
stocks, namely those with more than 20% zero 5-minute returns from January 2003 to
December 2020. We collect high-frequency stock prices from the TAQ database. Following
the common data cleaning procedure (e.g., Aït-Sahalia and Mancini (2008)), “bounce-back”
observations are removed. We sample the log prices from 9:35 until 16:00 using
the previous-tick approach (Gençay, Dacorogna, Muller, Pictet, and Olsen (2001)).
Regarding the sampling frequency, we use 5-minute log-returns, for which the market
microstructure noise can be safely ignored (Liu, Patton, and Sheppard (2015)).
Holidays, half trading days and overnight returns are eliminated. As in
Li and Xiu (2016), we also remove May 6, 2010, the day when the “Flash Crash”
occurred. After the cleaning procedure, we obtain 291 stocks for 4491 trading days
in 2003–2020, and each stock has 77 5-minute intraday log-returns per day. For the
return factors, we consider the Fama-French three-factor model (Fama and French
(1993)) and use 5-minute returns of the market, the small-minus-big (SmB) and the
high-minus-low (HmL) portfolios.1
Following Bollerslev and Todorov (2011); Aït-Sahalia, Fan, and Li (2013) and
Li, Todorov, and Tauchen (2017), for each stock $i$ and each day $t$, we estimate
the continuous component of variance with
$$RV^{c}_{it} = \sum_{j=1}^{77} \big(R^{tr}_{i;t[j]}\big)^2,$$
where $R^{tr}_{i;t[j]} = R_{i;t[j]} \mathbf{1}_{\{|R_{i;t[j]}| \le v_{it}\}}$, $1 \le j \le 77$,
and $v_{it}$ is set to be $v_{it} = 3\sqrt{\min(RV_{it}, BV_{it})} \times \Delta_n^{0.49}$,
$\Delta_n = 1/77$, $RV_{it} = \sum_{j=1}^{[1/\Delta_n]} R_{i;t[j]}^2$, and
$$BV_{it} = \frac{\pi}{2}\,\frac{[1/\Delta_n]}{[1/\Delta_n]-1} \sum_{j=2}^{[1/\Delta_n]} |R_{i;t[j]} R_{i;t[j-1]}|$$
is the bipower variation (Barndorff-Nielsen and Shephard (2004)). We apply the same
truncation procedure to the high-frequency factor data to obtain the continuous
component of the factor return. The truncated factor returns are denoted by $F^{tr}$. Our
analysis is based on the truncated returns $R^{tr}$, truncated factor returns $F^{tr}$ and the
continuous component of realized variance, $RV^{c}$. For notational ease, when there is no
ambiguity, we denote the truncated return $R^{tr}$ by $R$ and the continuous component
of realized variance $RV^{c}$ by $RV$.
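The truncation step described above can be sketched in a few lines. This is a minimal illustration for a single stock-day, not the authors' code; the function name `continuous_rv` and its arguments are hypothetical, and the input is assumed to be a one-dimensional array of intraday log-returns (77 five-minute returns in the paper's setting).

```python
import numpy as np

def continuous_rv(returns, c=3.0, power=0.49):
    """Truncated realized variance for one stock-day (illustrative sketch).

    returns : 1-D array of intraday log-returns.
    c, power: threshold constants; 3 and 0.49 follow the text.
    """
    r = np.asarray(returns, dtype=float)
    n = r.size
    delta_n = 1.0 / n
    # Plain realized variance: sum of squared intraday returns.
    rv = np.sum(r ** 2)
    # Bipower variation with the n/(n-1) scaling used in the text.
    bv = (np.pi / 2.0) * (n / (n - 1.0)) * np.sum(np.abs(r[1:] * r[:-1]))
    # Threshold v = c * sqrt(min(RV, BV)) * delta_n^power.
    threshold = c * np.sqrt(min(rv, bv)) * delta_n ** power
    # Returns larger than the threshold in absolute value are set to zero.
    r_tr = np.where(np.abs(r) <= threshold, r, 0.0)
    return np.sum(r_tr ** 2), r_tr, threshold
```

Returning the truncated returns alongside the variance is a convenience of this sketch, since the same truncated series would be needed for the factor regressions later on.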
1
We thank Saketh Aleti for sharing high-frequency factor data from the paper Aleti (2022).
The factor structure in stock returns naturally induces a factor structure in stock
variance. For example, under the FF3 model, we have
$$V_{it} = \beta_{i,Mkt}^2 V_{Mkt,t} + \beta_{i,HmL}^2 V_{HmL,t} + \beta_{i,SmB}^2 V_{SmB,t} + V_{U_i,t} + \text{covariance terms},$$
where $V_{it}$ and $V_{U_i,t}$ denote stock and idiosyncratic variances, respectively, and $V_{Mkt,t}$,
$V_{HmL,t}$ and $V_{SmB,t}$ are the factor variances. Hence, the return factor variances ($V_{Mkt}$,
$V_{HmL}$, $V_{SmB}$) are potential factors for the stock variance. Moreover, as discussed in
2
In Herskovic, Kelly, Lustig, and Van Nieuwerburgh (2016), CIV refers to the cross-sectional
average of standard deviations, while in our paper, we refer to CiRV as the cross-sectional average
of idiosyncratic realized variances. We refer to CiV as the cross-sectional average of integrated
idiosyncratic variance. Despite this difference, we still use the name CiV. We find that this name
nicely summarizes the most important first principal component.
3
The detailed results are available upon request.
Figure 2: Eigenvalue ratios of the sample covariance matrix (left panel) and sample
correlation matrix (right panel) of FF3 factors’ RVs and CiRV.
the previous section, a single factor exists in idiosyncratic variance, CiV. Therefore,
altogether, there are four potential factors in stock variances. An interesting question
arises, namely, are there indeed four factors?
To address this question, we first employ PCA on the four variance factor
candidates, namely, the three return factor realized variances ($RV_{Mkt}$, $RV_{HmL}$, $RV_{SmB}$),
and the common idiosyncratic realized variance, CiRV. We compute the eigenvalue
ratios, which are the eigenvalues divided by the sum of all eigenvalues, and plot
the results in Figure 2. Surprisingly, we find that the first PC explains more than
90% of the total variation in the four variance factor candidates, suggesting a single
common component.
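The eigenvalue-ratio computation can be illustrated as follows. This is a minimal sketch, not the authors' code: the function name `eigenvalue_ratios` is hypothetical, and the input is simulated single-factor data standing in for the four variance factor candidates.

```python
import numpy as np

def eigenvalue_ratios(X, use_correlation=False):
    """Eigenvalues of the sample covariance (or correlation) matrix of the
    columns of X, sorted in decreasing order and divided by their sum."""
    X = np.asarray(X, dtype=float)
    M = np.corrcoef(X, rowvar=False) if use_correlation else np.cov(X, rowvar=False)
    eigvals = np.linalg.eigvalsh(M)[::-1]  # eigvalsh returns ascending order
    return eigvals / eigvals.sum()

# Simulated stand-in data (an assumption of this sketch): four series driven
# by one common positive component plus small independent noise.
rng = np.random.default_rng(0)
common = rng.lognormal(size=1000)
loadings = np.array([1.0, 0.8, 1.2, 0.9])
candidates = np.outer(common, loadings) + 0.1 * rng.standard_normal((1000, 4))

ratios_cov = eigenvalue_ratios(candidates)
ratios_corr = eigenvalue_ratios(candidates, use_correlation=True)
```

Using the correlation matrix removes the influence of differing scales across the candidates, which is why both versions are typically inspected side by side.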
We then perform PCA directly4 on the stock RVs, and compute the ratio of each of the top
eigenvalues to the sum of all eigenvalues, based on both the covariance matrix
and the correlation matrix of the stock RVs. The results are plotted in Figure 3.
Figure 3 shows that a high proportion (60%) of the total variation in stock RVs
can be explained by the first PC, while the second and subsequent PCs each account
for a proportion that is not substantially higher than that of the remaining PCs.
These observations suggest a single factor model for the stock variances. We also
estimate the number of factors
4
Outliers are removed by 95% winsorization to avoid the effect of extreme variations in the RVs.
Figure 3: Top ten eigenvalue ratios of the sample covariance matrix (left panel) and
sample correlation matrix (right panel) of stock RVs.
using estimators from Bai and Ng (2002) and Ahn and Horenstein (2013), and the
results also suggest a single factor model for the stock variances.
We next compute the pairwise correlations among the variance factor candidates,
which are the return factor RVs ($RV_{Mkt}$, $RV_{HmL}$, $RV_{SmB}$), the CiRV, and in addition,
the first PC ($PC_{RV}$) in stock RVs. The results are summarized in Table 1.
Table 1: Pairwise correlations among $RV_{Mkt}$, $RV_{HmL}$, $RV_{SmB}$, CiRV, and the first
PC in the stock RVs.
Table 1 shows that all variance factor candidates are highly correlated with an
average pairwise correlation of around 0.80, and are also highly correlated with the
first PC in the stock RVs. These results are consistent with the findings in Li, Todorov,
and Tauchen (2016), which show a high correlation between the spot market factor
The empirical studies in Sections 2.2 and 2.3 are based on realized (idiosyncratic)
variances, which inevitably contain estimation errors. Because both the number of
assets and the time span are large, the estimation errors accumulate. In this section,
we analyze the consistency of conducting PCA on realized variances in identifying
factor structure in integrated variances.
For each day $t = 1, ..., T$, we denote the integrated variances and integrated
idiosyncratic variances of $N$ stocks as $V_t = (V_{1t}, ..., V_{Nt})^T$ and
$V_{U;t} = (V_{U;1t}, ..., V_{U;Nt})^T$, respectively, namely,
$$V_{it} = \int_{t-1}^{t} \Psi_{\tau,ii}\, d\tau, \quad \text{and} \quad V_{U;it} = \int_{t-1}^{t} \Theta_{\tau,ii}\, d\tau, \quad 1 \le i \le N, \qquad (2.2)$$
where $\Psi_t = \beta \Phi_t \beta^T + \Theta_t$. We define $V_{F;kt} = \int_{t-1}^{t} \Phi_{\tau,kk}\, d\tau$,
$1 \le k \le K$, and $V_{F;t} = (V_{F;1t}, ..., V_{F;Kt})^T$.
Suppose that we observe log-returns of stocks and factors at sampling frequency
∆n . For each t = 1, ..., T and j = 1, ..., n := [1/∆n ], we write the log-returns of stocks
and factors as Rt[j] and Ft[j] , respectively, where
Next, we present the theoretical results for the estimation of factor structure in stock
variance. We make the following assumptions on the stock volatility processes.
Assumption 2. The integrated variances $\big(\int_{t-1}^{t} \Psi_{\tau,ii}\, d\tau\big)_{1 \le i \le N}$ are stationary, and
$\sup_{t \in \mathbb{N}} E\big[(\sup_{t-1 \le s < t} \Psi_{s,ii})^{M}\big] \le k_\delta$ for some positive constants $k_\delta, M > 0$ and for
all $t \in \mathbb{N}$, $1 \le i \le N$.
Assumption 3. The covariance matrix of integrated variances, $\Sigma_V = \mathrm{Cov}(V_t)$,
satisfies that for some constants $c, C > 0$, $c \le \lambda_{V,i}/N < \lambda_{V,i-1}/N \le C$ and
$\lambda_{V,i-1}/N - \lambda_{V,i}/N > c$ for $1 \le i \le q$, and $c \le \lambda_{V,i} \le C$ for $q < i \le N$,
where $\lambda_{V,1} \ge ... \ge \lambda_{V,N}$ are the eigenvalues of $\Sigma_V$, and $q$ is a fixed positive integer.
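$$\widehat{\Sigma}_{RV} = \frac{1}{T} \sum_{t=1}^{T} (RV_t - \overline{RV})(RV_t - \overline{RV})^T,$$
where $RV_t = (RV_{1t}, ..., RV_{Nt})^T$ and $\overline{RV} = \sum_{t=1}^{T} RV_t / T$. We denote the $i$th
eigenvector of $\Sigma_{V_U}$ by $\xi_{V_U;i}$, the $i$th largest eigenvalue of $\widehat{\Sigma}_{RV}$ by
$\widehat{\lambda}_{RV;i}$ and the corresponding eigenvector by $\widehat{\xi}_{RV;i}$, $1 \le i \le N$.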
Theorem 1 guarantees that if a factor structure exists in the stock variance, then
it can be consistently estimated by conducting PCA on the stock RV as long as
$\max\big(\Delta_n, (\log N)/T\big) \to 0$.
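$$\widehat{\beta} = \Big( \sum_{t=1}^{T} \sum_{j=1}^{[1/\Delta_n]} (R_{t[j]} - \bar{R})(F_{t[j]} - \bar{F})^T \Big) \Big( \sum_{t=1}^{T} \sum_{j=1}^{[1/\Delta_n]} (F_{t[j]} - \bar{F})(F_{t[j]} - \bar{F})^T \Big)^{-1},$$
$$\widehat{\alpha}_n = \bar{R} - \widehat{\beta} \bar{F}, \quad \text{and} \quad \widehat{U}_{t[j]} = R_{t[j]} - \widehat{\alpha}_n - \widehat{\beta} F_{t[j]}, \qquad (2.7)$$
where $\bar{R} = \frac{1}{T \cdot [1/\Delta_n]} \sum_{t=1}^{T} \sum_{j=1}^{[1/\Delta_n]} R_{t[j]}$, and
$\bar{F} = \frac{1}{T \cdot [1/\Delta_n]} \sum_{t=1}^{T} \sum_{j=1}^{[1/\Delta_n]} F_{t[j]}$. The feasible
idiosyncratic realized variance is defined as follows:
$$RV_{\widehat{U};it} = \sum_{j=1}^{[1/\Delta_n]} \widehat{U}_{i,t[j]}^2 = \sum_{j=1}^{[1/\Delta_n]} (R_{i,t[j]} - \widehat{\alpha}_{ni} - \widehat{\beta}_i^T F_{t[j]})^2, \quad 1 \le i \le N, \ 1 \le t \le T. \qquad (2.8)$$
We then estimate $\Sigma_{V_U}$ using the sample covariance matrix of $(RV_{\widehat{U};it})_{1 \le i \le N, 1 \le t \le T}$:
$$\widehat{\Sigma}_{RV_{\widehat{U}}} = \frac{1}{T} \sum_{t=1}^{T} (RV_{\widehat{U};t} - \overline{RV}_{\widehat{U}})(RV_{\widehat{U};t} - \overline{RV}_{\widehat{U}})^T,$$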
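The pooled regression in (2.7) and the feasible idiosyncratic realized variance in (2.8) can be sketched numerically as below. This is an illustration under assumed array layouts, not the authors' implementation: the function name `idiosyncratic_rv` is hypothetical, stock returns are assumed to be stored as a `(T, n, N)` array and factor returns as a `(T, n, K)` array, with `n` intraday returns per day.

```python
import numpy as np

def idiosyncratic_rv(R, F):
    """Pooled least-squares betas and daily idiosyncratic realized variances.

    R : array (T, n, N) of intraday stock returns.
    F : array (T, n, K) of intraday factor returns.
    Returns beta (N, K), alpha (N,), and rv_u (T, N).
    """
    T, n, N = R.shape
    K = F.shape[2]
    R2 = R.reshape(T * n, N)          # stack all days and intraday intervals
    F2 = F.reshape(T * n, K)
    Rc = R2 - R2.mean(axis=0)         # demean by the pooled averages
    Fc = F2 - F2.mean(axis=0)
    # beta_hat = (sum Rc Fc^T)(sum Fc Fc^T)^{-1}, computed via a linear solve.
    beta = np.linalg.solve(Fc.T @ Fc, Fc.T @ Rc).T
    alpha = R2.mean(axis=0) - beta @ F2.mean(axis=0)
    # Residual returns, then sum of squares within each day.
    U = R2 - alpha - F2 @ beta.T
    rv_u = (U.reshape(T, n, N) ** 2).sum(axis=1)
    return beta, alpha, rv_u
```

The sample covariance matrix of the rows of `rv_u` then plays the role of the estimator displayed above.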
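We denote by $\xi_{V_U;i}$ the $i$th eigenvector of $\Sigma_{V_U}$, $1 \le i \le N$. For the sample
covariance matrix $\widehat{\Sigma}_{RV_{\widehat{U}}}$, the eigenvectors and eigenvalues are denoted by
$\widehat{\xi}_{RV_{\widehat{U}};i}$ and $\widehat{\lambda}_{RV_{\widehat{U}};i}$, $1 \le i \le N$,
respectively. The next theorem gives the error bound of using $\widehat{\Sigma}_{RV_{\widehat{U}}}$ to estimate $\Sigma_{V_U}$.
$$\max_{1 \le i \le r} \| \widehat{\xi}_{RV_{\widehat{U}};i} - \xi_{V_U;i} \|_2 = O_p\Big( \sqrt{\Delta_n} + \sqrt{\frac{\log N}{T}} \Big). \qquad (2.11)$$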
The empirical evidence in Section 2.3 suggests that both return factor variances and
idiosyncratic variances are driven by a single variance factor. In order to construct the
single factor, by Theorem 1, we can use the first PC in stock RVs. Alternatively, one
can take the common variance (CV) of stocks, which is defined as the cross-sectional
average of stock integrated variance:
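$$CV_t = \frac{1}{N} \sum_{i=1}^{N} V_{it}.$$
Its realized counterpart is the common realized variance (CRV):
$$CRV_t = \frac{1}{N} \sum_{i=1}^{N} RV_{it}.$$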
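How closely the cross-sectional average tracks the first PC can be checked with a short computation. The sketch below is illustrative only: the function name `crv_and_first_pc` is hypothetical, and the input panel is simulated from a multiplicative single-factor design chosen here for demonstration, not taken from the paper's data.

```python
import numpy as np

def crv_and_first_pc(RV):
    """CRV, first principal component, and their correlation.

    RV : array (T, N) of daily realized variances.
    """
    RV = np.asarray(RV, dtype=float)
    crv = RV.mean(axis=1)                        # cross-sectional average per day
    centered = RV - RV.mean(axis=0)
    cov = centered.T @ centered / RV.shape[0]    # sample covariance matrix
    eigvals, eigvecs = np.linalg.eigh(cov)       # ascending eigenvalues
    xi1 = eigvecs[:, -1]                         # eigenvector of largest eigenvalue
    if xi1.sum() < 0:                            # fix the arbitrary sign
        xi1 = -xi1
    pc1 = centered @ xi1
    return crv, pc1, np.corrcoef(crv, pc1)[0, 1]

# Simulated panel (assumption of this sketch): common factor times
# lognormal idiosyncratic exposures.
rng = np.random.default_rng(0)
T, N = 500, 50
cv = np.exp(0.5 * rng.standard_normal(T))
mu = 0.2 * rng.standard_normal(N)
RV_sim = cv[:, None] * np.exp(mu + 0.3 * rng.standard_normal((T, N)))
crv_sim, pc1_sim, corr_sim = crv_and_first_pc(RV_sim)
```

The sign flip is needed because eigenvectors are only determined up to sign, so without it the correlation could come out negative for purely numerical reasons.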
Figure 4: Time series plots of common realized variance (CRV) and rescaled first PC
in stock RVs. The plots are drawn on a log scale to improve visibility.
In Figure 4, we plot the time series of CRV and the first PC in stock RV rescaled
Remark 1. Besides CV, we also evaluate VIX (the CBOE Market Volatility Index,
transformed to daily variance) as a possible candidate for the volatility factor. The
correlation between VIX and the first PC in stock RVs is lower than the correlation
between CRV and the first PC (0.842 vs. 0.979). VIX measures implied volatility
for the future and is more informative in longer monthly/yearly horizons (see, e.g.,
the discussion in Andersen and Benzoni (2010)). In addition, it carries a volatility
risk premium, which complicates volatility forecasting. Given the nature of VIX, we
consider CV as a more appropriate factor proxy in stock volatility. Nevertheless,
in practice, when predicting volatility, people often find VIX to be helpful. In our
volatility prediction method to be introduced in Section 4, replacing CRV with VIX
leads to similar performance.
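The empirical evidence from conducting PCA on stock RVs suggests the following
factor model:
$$V_{it} = a_{\xi,i} + b_{\xi,i}\, \xi_t + \varepsilon_{\xi,it}, \quad 1 \le i \le N, \qquad (3.1)$$
where $\xi_t$ is the single (latent) factor. Taking the average over $i$ on both sides yields
$$CV_t = \frac{1}{N} \sum_{i=1}^{N} V_{it} = \bar{a}_\xi + \bar{b}_\xi\, \xi_t + \varepsilon_{\xi,t}, \qquad (3.2)$$
where $\bar{a}_\xi = \sum_{i=1}^{N} a_{\xi,i}/N$, $\bar{b}_\xi = \sum_{i=1}^{N} b_{\xi,i}/N$, and
$\varepsilon_{\xi,t} = \sum_{i=1}^{N} \varepsilon_{\xi,it}/N$. We can hence
rewrite model (3.1) using CV as the variance factor: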
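The substitution behind this rewriting can be sketched as follows. This is a reconstruction under the assumption that $\bar{b}_\xi \neq 0$; the labels $a_i$, $b_i$ and $\varepsilon_{it}$ are chosen here to match the notation used for model (3.3) in the surrounding text and are not quoted from it.

```latex
\begin{align*}
\xi_t &= \frac{CV_t - \bar a_\xi - \varepsilon_{\xi,t}}{\bar b_\xi}, \\
V_{it} &= a_{\xi,i} + b_{\xi,i}\,\xi_t + \varepsilon_{\xi,it}
       = \underbrace{\Big(a_{\xi,i} - \frac{b_{\xi,i}}{\bar b_\xi}\,\bar a_\xi\Big)}_{a_i}
       + \underbrace{\frac{b_{\xi,i}}{\bar b_\xi}}_{b_i}\, CV_t
       + \underbrace{\Big(\varepsilon_{\xi,it} - \frac{b_{\xi,i}}{\bar b_\xi}\,\varepsilon_{\xi,t}\Big)}_{\varepsilon_{it}} .
\end{align*}
```

Under this labeling, the $b_i$ average to one and the $a_i$ average to zero across stocks by construction.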
We estimate model (3.3) using the S&P 500 Index constituent stock RVs. We have
the following interesting findings.
First, when checking the idiosyncratic component $\varepsilon_{it}$, we find a strong correlation
between $\varepsilon_{it}^2$ and $CV_t^2$, while the correlation between $\varepsilon_{it}^2/CV_t^2$ and $CV_t^2$ is almost
zero. This result suggests that $\varepsilon_{it}$ scales with $CV_t$. In addition, after checking the
distribution of $\varepsilon_{it}/CV_t$, we find that $\varepsilon_{it}$ can be well modeled by the product of
$CV_t$ and a centered lognormal random variable, namely,
$\varepsilon_{it} = CV_t \big( \exp(\mu_i + \sigma_i z_{it}) - \exp(\mu_i + \sigma_i^2/2) \big)$.
Details are given in Appendix A.1 of the Supplementary Material.
Second, when further analyzing the coefficients ai and bi , strong evidence suggests
that the intercept terms (ai )1≤i≤N in (3.3) are close to zero. In addition, the slope
The analysis in Section 3.2.2 leads us to propose the following Multiplicative Volatility
Factor (MVF) model:
Our MVF model (3.6) is closely related to (3.7). However, there are subtle but
important differences between them.
Comparing the MVF model to the single factor model for log variance, we see
that model (3.7) is more difficult to interpret and does not enjoy internal model
consistency. Note that model (3.7) is equivalent to
$$V_{it} = \xi_t^{\,b'_i}\, e^{a'_i + \varepsilon'_{it}},$$
where the coefficient $b'_i$ becomes the exponent, which makes the interpretation
difficult. One natural choice of the factor $\xi_t$ in (3.7) is the common log variance (ClogV),
namely, the cross-sectional average of the log variances. We estimate the coefficients $b'$
in the log-linear model (3.7) by regressing $\log(RV)$ on $ClogRV$, the cross-sectional
average of the log RVs. We find that the coefficient $b'$ varies around 1, with an
interquartile range of 0.89∼1.08. In particular, $b'$ can deviate from 1. As a result,
model (3.7) does not enjoy internal model consistency.
In addition, the MVF model has the same format as the linear model for log
Remark 2. Barigozzi and Hallin (2020) discuss a factor model for log variance and
the estimation consistency of the factor structure. However, we find that modeling
variance rather than log variance has several advantages as discussed above. In
addition, they assume no heteroskedasticity in returns. In contrast, we model the dynamic
volatilities and naturally allow heteroskedasticity in returns. Moreover, our model does
not rely on a factor structure in returns.
Remark 3. The MVF model (3.6) can be easily modified to include multiple factors.
When multiple factors exist, the generalized MVF takes the following form:
$$V_{it} = \sum_{k=1}^{K} \xi_{kt} \exp(\mu_{ki}) \exp(\sigma_i z_{it}), \quad 1 \le i \le N, \qquad (3.8)$$
where $(\xi_{kt})_{k=1}^{K}$ are $K$ factors, and $\big(\exp(\mu_{ki} + \sigma_i z_{it})\big)_{k=1}^{K}$ are the multiplicative
idiosyncratic exposures. Model (3.8) can be analyzed, estimated, and used in volatility
forecasting in a similar way to the single factor model.
4 Volatility Forecasting
In this section, we utilize the proposed MVF model (3.6) for volatility forecasting.
We use $CV_t$ as the proxy of the latent factor in our proposed MVF model (3.6).5 The
idiosyncratic variance exposure is $\widetilde{V}_{it} = V_{it}/CV_t$. We estimate $CV_t$ by the common
realized variance $CRV_t$ and compute the idiosyncratic realized variance exposures
(iRE), $\widetilde{RV}_{it} := RV_{it}/CRV_t$ for $i = 1, ..., N$. Then, we model the $\log(CRV_t)$ and
$\log(\widetilde{RV}_{it})$ separately using the HAR model6 (Corsi (2009)):

$$\widehat{V}_{it+1} = \widehat{CV}_{t+1} \times \widehat{\widetilde{V}}_{it+1}, \quad i = 1, \ldots, N.$$
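The two-step forecast can be sketched as below. This is a hedged illustration rather than the authors' procedure: the HAR regressors are assumed to be the usual daily value, 5-day average and 22-day average of the log series, the coefficients are fit by ordinary least squares on the full sample passed in, the log forecast is exponentiated without any bias correction, and the function names are hypothetical.

```python
import numpy as np

def har_features(logx):
    """Rows [1, daily, weekly mean, monthly mean] for days t = 21, ..., len-1."""
    logx = np.asarray(logx, dtype=float)
    rows = []
    for t in range(21, len(logx)):
        rows.append([1.0,
                     logx[t],
                     logx[t - 4:t + 1].mean(),     # last 5 days including t
                     logx[t - 21:t + 1].mean()])   # last 22 days including t
    return np.array(rows)

def log_har_forecast(x):
    """One-step-ahead forecast of a positive series from a log-HAR fit."""
    logx = np.log(np.asarray(x, dtype=float))
    X = har_features(logx)            # row k corresponds to day t = 21 + k
    y = logx[22:]                     # next-day targets for all rows but the last
    coef, *_ = np.linalg.lstsq(X[:-1], y, rcond=None)
    return float(np.exp(X[-1] @ coef))

def mvf_forecast(RV):
    """Forecast each stock's variance as (CRV forecast) x (exposure forecast).

    RV : array (T, N) of daily realized variances.
    """
    RV = np.asarray(RV, dtype=float)
    crv = RV.mean(axis=1)
    exposure = RV / crv[:, None]      # idiosyncratic realized variance exposures
    cv_hat = log_har_forecast(crv)
    exp_hat = np.array([log_har_forecast(exposure[:, i])
                        for i in range(RV.shape[1])])
    return cv_hat * exp_hat
```

In a rolling evaluation, `mvf_forecast` would be called on each estimation window in turn; the window length is left to the caller in this sketch.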
We compare our proposed MVF model with the following benchmark models.
$$\widehat{V}_{it+1} = \widehat{\beta}_{i(t)}^{T}\, \widehat{\Sigma}_{F_{t+1}}\, \widehat{\beta}_{i(t)} + \widehat{iV}_{it+1}, \quad 1 \le i \le N.$$
This approach utilizes the cross-sectional structure in returns but does not incorporate
the factor structure in idiosyncratic variances. We denote the prediction of the method
under the Fama-French three-factor model as FF3rG +iVlogHAR and the prediction
under the statistical factor model as StatsFrG +iVlogHAR .
8
Specifically, for the statistical factor model, we use five PCs in stock returns as return factors,
estimated with a 252-day rolling window based on the high-frequency 5-min returns.
On each day $t$, $c_{0i}$ and $c_{1i}$ are estimated using a 252-day rolling window regression
of the idiosyncratic realized variance $iRV_i$ over the common idiosyncratic realized
variance $CiRV$. We employ a logHAR model to predict the CiV factor. The residual
of the factor model for idiosyncratic variance, $\varepsilon$, is modeled by an AR(1),
$\varepsilon_{i,t} = \xi_i \varepsilon_{i,t-1} + u_{it}$.9 We predict the idiosyncratic variance of stock $i$ on day $t+1$ with
$\widehat{iV}_{it+1} = \widehat{c}_{0i(t)} + \widehat{c}_{1i(t)} \widehat{CiV}_{t+1} + \widehat{\varepsilon}_{it+1}$.
The forecast of the stock volatility is
$$\widehat{V}_{it+1} = \widehat{\beta}_{i(t)}^{T}\, \widehat{\Sigma}_{F_{t+1}}\, \widehat{\beta}_{i(t)} + \widehat{c}_{0i(t)} + \widehat{c}_{1i(t)} \widehat{CiV}_{t+1} + \widehat{\varepsilon}_{it+1}, \quad i = 1, ..., N.$$
We denote the prediction of BM3 under FF3 model as FF3rG +iVCiV and the prediction
of BM3 under the statistical factor model as StatsFrG +iVCiV .
$$QLIKE_i = \frac{1}{L} \sum_{t=1}^{L} \Big( \log\frac{\widehat{V}_{it}}{RV_{it}} + \frac{RV_{it}}{\widehat{V}_{it}} - 1 \Big), \quad i = 1, \ldots, N,$$
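The Q-like loss above translates directly into code. The following is a minimal sketch; the function name `qlike` is hypothetical, and forecasts and realized variances are assumed to be strictly positive arrays of equal shape with time along the first axis.

```python
import numpy as np

def qlike(forecast, realized):
    """Average Q-like loss over the evaluation period (first axis).

    forecast, realized : arrays of shape (L,) or (L, N), strictly positive.
    Returns a scalar for one stock or an array of length N for a panel.
    """
    f = np.asarray(forecast, dtype=float)
    r = np.asarray(realized, dtype=float)
    # log(V_hat / RV) + RV / V_hat - 1, which is zero when V_hat equals RV
    # and positive otherwise.
    return np.mean(np.log(f / r) + r / f - 1.0, axis=0)
```

Because the loss depends only on the ratio of forecast to realization, it is not dominated by high-volatility days in the way a squared-error loss on the level would be.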
9
We also check the method that predicts ε with 0. The performance is worse than the AR(1)
model.
10
Besides Q-like, we also evaluate the performance using out-of-sample R2 . Our approach
performs consistently well compared to the benchmark models in most of the years under evaluation.
where $N = 291$ is the total number of stocks under evaluation, and $QLIKE_{MVF,i}$
and $QLIKE_{bm,i}$ are the Q-likes of our MVF model and the benchmark model for
the $i$th stock, respectively. Furthermore, we perform the Diebold-Mariano (DM) test
(Diebold and Mariano (1995)) to examine the significance of the Q-like differences
between our MVF model and the benchmark models. Specifically, we perform the
following one-sided Q-like difference test:
where $e_{m,t} = \log(\widehat{V}_{m,t}/RV_t) + RV_t/\widehat{V}_{m,t} - 1$ is the Q-like loss, $m = 1, 2$, which represent
the Q-like loss of our MVF model and the benchmark model, respectively. We write
$d_t = e_{1,t} - e_{2,t}$. The DM test statistic is $\bar{d}/\widehat{\sigma}(\bar{d})$, where $\bar{d} = \sum_{t=1}^{L} d_t/L$ and
$\widehat{\sigma}(\bar{d})$ is the standard error of $\bar{d}$ estimated by a
heteroskedasticity-autocorrelation-consistent (HAC) estimator. We then compute the
proportion of stocks where our MVF model generates
statistically significantly lower Q-likes than the benchmark models:
$$\text{Sig.Out.perf.} = \frac{1}{N} \sum_{i=1}^{N} \mathbf{1}_{\{p_i < \alpha\}}, \qquad (4.3)$$
where $p_i$ is the p-value of the DM test for the $i$th stock, and $\alpha = 5\%$ is the significance
level.
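A sketch of the DM statistic and the significant outperformance proportion is given below. Several choices are assumptions of this illustration rather than details from the text: the HAC variance uses Bartlett (Newey-West) weights with a rule-of-thumb lag length, the p-value comes from the standard normal approximation, the one-sided alternative is taken to be that the first model has lower expected loss, and the function names are hypothetical.

```python
import math
import numpy as np

def dm_test_one_sided(loss_model, loss_bench, lags=None):
    """DM statistic and one-sided p-value for H1: E[d_t] < 0,
    where d_t = loss_model_t - loss_bench_t."""
    d = np.asarray(loss_model, dtype=float) - np.asarray(loss_bench, dtype=float)
    L = d.size
    if lags is None:
        # Rule-of-thumb truncation lag (an assumption of this sketch).
        lags = int(np.floor(4 * (L / 100.0) ** (2.0 / 9.0)))
    d_bar = d.mean()
    u = d - d_bar
    # Long-run variance with Bartlett weights.
    lrv = u @ u / L
    for k in range(1, lags + 1):
        w = 1.0 - k / (lags + 1.0)
        lrv += 2.0 * w * (u[k:] @ u[:-k]) / L
    se = math.sqrt(lrv / L)           # HAC standard error of the mean
    stat = d_bar / se
    # Standard normal CDF at stat: small when the model's loss is lower.
    p_value = 0.5 * math.erfc(-stat / math.sqrt(2.0))
    return stat, p_value

def sig_outperf(p_values, alpha=0.05):
    """Proportion of stocks with p-value below alpha, as in (4.3)."""
    return float(np.mean(np.asarray(p_values, dtype=float) < alpha))
```

A typical use would compute the per-day Q-like losses of the two models for each stock, call `dm_test_one_sided` stock by stock, and pass the collected p-values to `sig_outperf`.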
The outperformance proportion (Out.perf.) of our proposed MVF model over
the benchmark models and the significant outperformance proportion (Sig.Out.perf.)
are reported in Table 3. The results show that the MVF model, CVlogHAR ×iElogHAR ,
yields more accurate predictions than the benchmark models for almost all the stocks.
Moreover, the DM test results show that the outperformance of our MVF model is
Table 3: Outperformance proportion of the proposed MVF model over the bench-
mark models among the S&P Index constituent stocks in terms of the Q-like measure
during January 2004–December 2020. The values are the percentages of stocks for
which the MVF model outperforms the benchmark models.
The prediction results have several implications. First, the outperformance of our
proposed MVF model compared with the model that only uses the individual stock’s
information, BM1:IdvlogHAR , demonstrates the benefit of utilizing cross-sectional
information in individual stock volatility prediction. Second, our MVF model clearly
outperforms the models that use only return factor models,
BM2: FF3rG +iVlogHAR and BM2: StatsFrG +iVlogHAR , showing the importance
of incorporating the stock/idiosyncratic variance factor structure. Third, our MVF
model not only simplifies the forecasting but also generates more accurate forecasts
than the more complex four-factor models (BM3: FF3rG +iVCiV
and StatsFrG +iVCiV ). These comparisons demonstrate the solid advantages of the
MVF model in volatility forecasting.
5 Global Evidence
The MVF model is built upon empirical evidence from the S&P 500 Index constituent
stocks. We next examine whether the MVF model also applies to the global market.
We perform a parallel analysis of the factor structure in the global market using
daily realized variances of 31 global equity indices11 from January 1, 2001 to March
12, 2021 obtained from the Oxford-Man Institute’s “realized library”. In the global
equity indices’ volatilities, we also find a strong co-movement feature and high pairwise
correlation with a mean pairwise correlation of 0.5. When performing PCA on the
indices’ RVs, we find that a high proportion (67%) of the total variation in global
equity indices’ RVs can be explained by the first PC, while the second and subsequent
PCs each account for a proportion that is not substantially higher than that of the
remaining PCs. The estimators of the number of factors (Bai and Ng (2002), Ahn and
Horenstein (2013)) also suggest a
single factor in the global indices’ variances. We find that CRV (the cross-sectional
average of all indices’ RVs) has a 0.988 correlation with the first PC and can still
be a good proxy for the variance factor in the global market. In addition, similar to
the US market, strong evidence suggests that the multiplicative factor structure still
holds for the volatilities in the global market.12
We next evaluate the performance of our MVF model in forecasting the volatil-
ity of the global indices. We forecast the volatility based on our MVF model,
CVlogHAR ×iElogHAR , and compare the results with the benchmark model BM1:IdvlogHAR .
We do not include BM2 and BM3 because there is no well-established factor model
for the global indices. In Table 4, we summarize the Q-likes of these two models in
forecasting the volatilities of the global equity indices. Table 4 shows that our MVF
model outperforms BM1:IdvlogHAR in that its Q-likes have the lowest mean, median,
25% and 75% quantiles. We also find that our MVF model outperforms the benchmark
for 25 of the 31 indices we study (80.6%). The results confirm the advantage of our MVF
11
The data from different indices are synchronized by treating GMT 00:00 – GMT 23:59 as the
same day. There is no market opening overnight.
12
The detailed results are available upon request.
6 Conclusion
This work provides a framework to study the factor structure in stock variance based
on high-frequency and high-dimensional price data. We theoretically show that the
factor structure in stock and idiosyncratic variance can be consistently estimated by
conducting PCA on the stock/idiosyncratic realized variances. Empirically, based on
the strong empirical evidence from the analysis of daily volatilities of S&P 500 Index
constituent stocks, we propose a multiplicative volatility factor (MVF) model. The
MVF model includes a multiplicative variance factor and a multiplicative idiosyncratic
component, where the variance factor is approximately the cross-sectional average of
stock variances. Based on the proposed MVF model, we develop a forecasting model,
which is found to provide more accurate volatility forecasts than various benchmark
approaches for a majority of the stocks we evaluate. Finally, we demonstrate that
our MVF model also applies to the global market and helps to predict the volatilities
of global equity indices.
The volatility factor modeling framework that we propose facilitates a deeper
understanding of the financial market. The MVF model achieves dimension reduction
in volatility modeling for a large cross-section of assets. Besides volatility forecasting,
our framework provides insights into the study of shock spillover and transmission in
financial systems and holds promise in applications such as large portfolio allocation,
risk management and volatility trading.
References
Ahn, Seung C. and Alex R. Horenstein (2013): “Eigenvalue ratio test for the
number of factors,” Econometrica, 81, 1203–1227.
Andersen, Torben G and Luca Benzoni (2010): “Do bonds span volatility risk
in the US Treasury market? A specification test for affine term structure models,”
The Journal of Finance, 65, 603–653.
Bai, Jushan (2003): “Inferential theory for factor models of large dimensions,”
Econometrica, 71, 135–171.
Bai, Jushan and Serena Ng (2002): “Determining the number of factors in ap-
proximate factor models,” Econometrica, 70, 191–221.
Barigozzi, Matteo and Marc Hallin (2017): “Generalized dynamic factor mod-
els and volatilities: estimation and forecasting,” Journal of Econometrics, 201,
307–321.
Bollerslev, Tim, Benjamin Hood, John Huss, and Lasse Heje Pedersen
(2018): “Risk everywhere: Modeling and managing volatility,” The Review of Fi-
nancial Studies, 31, 2729–2773.
Chen, Nai-Fu, Richard Roll, and Stephen A Ross (1986): “Economic forces
and the stock market,” Journal of Business, 383–403.
Da, Zhi and Ernst Schaumburg (2006): “The factor structure of realized volatil-
ity and its implications for option pricing,” .
Ding, Yi, Robert Engle, Yingying Li, and Xinghua Zheng (2022): “Supple-
ment to “Factor modeling for volatility”,” .
Ding, Yi, Yingying Li, and Xinghua Zheng (2021): “High dimensional min-
imum variance portfolio estimation under statistical factor models,” Journal of
Econometrics, 222, 502–515.
Engle, Robert and Susana Martin (2019): “Measuring and hedging geopolitical
risks,” .
Engle, Robert F, Takatoshi Ito, and Wen-Ling Lin (1990): “Meteor Showers
or Heat Waves? Heteroskedastic Intra-Daily Volatility in the Foreign Exchange
Market,” Econometrica, 58, 525–542.
Fama, Eugene F and Kenneth R French (1993): “Common risk factors in the
returns on stocks and bonds,” Journal of Financial Economics, 33, 3–56.
Fan, Jianqing, Yuan Liao, and Martina Mincheva (2013): “Large covariance
estimation by thresholding principal orthogonal complements,” J. R. Stat. Soc. Ser.
B. Stat. Methodol., 75, 603–680, with 33 discussions by 57 authors and a reply by
Fan, Liao and Mincheva.
Hansen, Peter Reinhard, Zhuo Huang, and Howard Howan Shek (2012):
“Realized garch: a joint model for returns and realized measures of volatility,”
Journal of Applied Econometrics, 27, 877–906.
Hull, John and Alan White (1987): “The pricing of options on assets with
stochastic volatilities,” The Journal of Finance, 42, 281–300.
Jacod, Jean, Yingying Li, Per A Mykland, Mark Podolskij, and Mathias
Vetter (2009): “Microstructure noise in the continuous case: the pre-averaging
approach,” Stochastic processes and their applications, 119, 2249–2276.
Jacod, Jean, Yingying Li, and Xinghua Zheng (2019): “Estimating the inte-
grated volatility with tick observations,” Journal of Econometrics, 208, 80–100.
Jacod, Jean and Philip Protter (1998): “Asymptotic error distributions for the
Euler method for stochastic differential equations,” The Annals of Probability, 26,
267–307.
Kapadia, Nishad, Matthew Linn, and Bradley S Paye (2020): “One Vol to
Rule Them All: Common Volatility Dynamics in Factor Returns,” Available at
SSRN 3606637.
Li, Jia, Yunxiao Liu, and Dacheng Xiu (2019): “Efficient estimation of inte-
grated volatility functionals via multiscale jackknife,” The Annals of Statistics, 47,
156–176.
Li, Jia, Viktor Todorov, and George Tauchen (2016): “Inference theory for
volatility functional dependencies,” Journal of Econometrics, 193, 17–34.
Li, Jia and Dacheng Xiu (2016): “Generalized Method of Integrated Moments for
High-Frequency Data,” Econometrica, 84, 1613–1633.
Liu, Lily Y, Andrew J Patton, and Kevin Sheppard (2015): “Does anything
beat 5-minute RV? A comparison of realized measures across multiple asset classes,”
Journal of Econometrics, 187, 293–311.
Luciani, Matteo and David Veredas (2015): “Estimating and forecasting large
panels of volatilities with approximate dynamic factor models,” Journal of Fore-
casting, 34, 163–176.
Ross, Stephen (1976): “The arbitrage theory of capital asset pricing,” Journal of
Economic Theory, 13, 341–360.
Zhang, Lan et al. (2006): “Efficient estimation of stochastic volatility using noisy
observations: A multi-scale approach,” Bernoulli, 12, 1019–1043.
This supplement gives additional empirical results and proofs of the theoretical
results for Ding, Engle, Li, and Zheng (2022).
[Plot omitted: time series panels labeled $\widehat{\varepsilon}$ and CRV, January 2003 to January 2019.]
Table 1: Summary statistics of correlations in absolute value between $\widehat{\varepsilon}^2$ from model
(3.3) and $CRV^2$, and between $\widehat{\varepsilon}^2/CRV^2$ and $CRV^2$. The values are the 25% quantile
(Q1), median, mean and the 75% quantile (Q3).

                                                                              Q1      Median   Mean    Q3
$|corr(\widehat{\varepsilon}^2_{i,\cdot}, CRV^2_{\cdot})|_{1 \le i \le N}$                 0.276   0.447    0.425   0.568
$|corr(\widehat{\varepsilon}^2_{i,\cdot}/CRV^2_{\cdot}, CRV^2_{\cdot})|_{1 \le i \le N}$   0.006   0.009    0.017   0.017
Figure 2: Distributions of $(RV - \widehat{a})/CRV$ and $\log\big((RV - \widehat{a})/CRV\big)$ of a random
stock, where $\widehat{a}$ is the weighted regression estimate in model (A.1).
interesting phenomenon: the size of εb clearly correlates with CRV, and the correlation
almost disappears when we scale εb by CRV.
In summary, the evidence suggests the following model:
where $\tilde{\varepsilon}_{it}$ is independent of CV. We then estimate the model (A.1) by weighted
regression using CRV as weights, and compare the distributions of $(RV - \widehat{a})/CRV$
and $\log\big((RV - \widehat{a})/CRV\big)$ against the normal distribution. The results for a random stock
are presented in Figure 2. We observe in Figure 2 that $(RV - \widehat{a})/CRV$ is heavy-tailed,
while $\log\big((RV - \widehat{a})/CRV\big)$ appears to be roughly normally distributed. We
where the idiosyncratic component scales with CV, and ε̃it is modeled by a centered
lognormal distribution.
We estimate the model (A.2) using MLE on subsampled data. We subsample the
data to circumvent possible auto-correlation. Specifically, we conduct MLE of model
(A.2) on observations subsampled every ten days, and then take the average of the
ten estimates.
We compute the ratio of the estimated intercept $\widehat{a}_i$ to $\overline{RV}_i$ (the time series
average) for each stock $i$ and find that the ratios are all very small, with an
average of 0.035. The results suggest that $a_i$ plays little role in the model (A.2). As
another check, we note that (A.2) is equivalent to $V_{it}/CV_t = b_i + a_i/CV_t + \tilde{\varepsilon}_{it}$. We
then regress $RV_{it}/CRV_t$ on $1/CRV_t$ for each individual stock and find that the $R^2$s
are all nearly zero, with an average of 0.024. This evidence suggests that the intercept
term $a_i$ in (A.2) can be ignored. We therefore get the simplified model with zero
intercept:
$$V_{it} = b_i\, CV_t + CV_t \big( \exp(\mu_i + \sigma_i z_{it}) - \exp(\mu_i + \sigma_i^2/2) \big).$$
Next, we check the relation between $b_i$ and $\exp(\mu_i + \sigma_i^2/2)$. The scatterplot of the MLE of $b_i$ and $\exp(\mu_i + \sigma_i^2/2)$ is presented in Figure 3. It shows that $b$ and $\exp(\mu + \sigma^2/2)$ are strongly linearly related. The correlation between the two reaches 0.99. Moreover, the linear relation fits well with the line $y = x$.
By constraining $a = 0$ and $b = \exp(\mu + \sigma^2/2)$ in (A.2), we reach our final multiplicative volatility factor model (3.6).
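As a worked step, substituting the two constraints into the zero-intercept model makes the multiplicative structure explicit. The display below is our own algebra, assuming $z_{it}$ is standard normal (our reading of "centered lognormal"); equation (3.6) itself is not reproduced in this appendix, so the match with it is presumed rather than verified.

```latex
\begin{aligned}
V_{it} &= b_i\,CV_t + CV_t\big(\exp(\mu_i + \sigma_i z_{it}) - \exp(\mu_i + \sigma_i^2/2)\big) \\
       &= CV_t\,\exp(\mu_i + \sigma_i z_{it})
          \qquad \text{when } b_i = \exp(\mu_i + \sigma_i^2/2), \\
\log\big(V_{it}/CV_t\big) &= \mu_i + \sigma_i z_{it}.
\end{aligned}
```

Under these constraints the log ratio is Gaussian, in line with the roughly normal shape of $\log\big((RV - \widehat{a})/CRV\big)$ seen in Figure 2.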
Figure 3: Scatterplot of the MLE of $b_i$ (horizontal axis, $\widehat{b}$) and $\exp(\mu_i + \sigma_i^2/2)$ (vertical axis, $\exp(\widehat{\mu} + \widehat{\sigma}^2/2)$) of all stocks under evaluation in model (A.2). The diagonal line is $y = x$.
B Proofs
In the following, $c, C, C_1, C', C_0$, etc., denote constants which do not depend on $T$, $N$, $\Delta_n$, and can vary from place to place.
The first lemma extends the concentration inequality for estimating (co)-integrated
variance (Fan, Li, and Yu (2012); Cai, Hu, Li, and Zheng (2020)) to the case when
only polynomial tail decay is imposed on the spot volatility.
Lemma 1. Suppose that $(\nu_{1t})$ and $(\nu_{2t})$ satisfy $d\nu_{jt} = \mu_{jt}\,dt + \sigma_{jt}\,dW_{j;t}$ for $j = 1, 2$, where $(W_{1;t})$ and $(W_{2;t})$ are standard Brownian motions that can be dependent on each other, and there exist constants $C_\mu, K_\sigma, M > 0$ such that $\max_{0\le t\le 1}|\mu_{jt}| \le C_\mu$, and for any $x > 0$ and $j = 1, 2$,
$$P\Big(\max_{0\le t\le 1}|\sigma_{jt}| > x\Big) < \frac{K_\sigma}{x^{M}}.$$
Suppose also that the observation times $(t_i)$ satisfy $\sup_n \max_{1\le i\le n} n|t_i - t_{i-1}| \le C_\Delta$ for some constant $C_\Delta > 0$. For $j_1, j_2 \in \{1, 2\}$, denote the realized (co)variance by $[\nu_{j_1}, \nu_{j_2}]_t = \sum_{\{i:\,t_i\le t\}} (\nu_{j_1 t_i} - \nu_{j_1 t_{i-1}})(\nu_{j_2 t_i} - \nu_{j_2 t_{i-1}})$. Then for any $0 < \delta < 1$, there
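Proof: Define
$$\varphi(x) = \begin{cases} 1, & \text{if } 0 \le x < 1, \text{ and} \\ x^{\delta/2}, & \text{if } 1 \le x \le (2C_\Delta\sqrt{n})^{1/(1-\delta)}. \end{cases} \tag{B.3}$$
Note that $0 \le x \le 2\varphi(x)^2 C_\Delta\sqrt{n}$ when $0 \le x \le (2C_\Delta\sqrt{n})^{1/(1-\delta)}$. By the proof of Lemma 1 in Cai, Hu, Li, and Zheng (2020), for any $0 \le x \le (2C_\Delta\sqrt{n})^{1/(1-\delta)}$, we have
$$\begin{aligned}
& P\bigg(\sqrt{n}\,\Big|[\nu_j,\nu_j]_1 - \int_0^1 (\sigma_{js})^2\,ds\Big| > x\bigg) \\
\le{}& P\bigg(\Big\{\sqrt{n}\,\Big|[\nu_j,\nu_j]_1 - \int_0^1 (\sigma_{js})^2\,ds\Big| > x\Big\} \cap \Big\{\max_{0\le t\le 1}|\sigma_{jt}| \le \varphi(x)\Big\}\bigg) + P\Big(\max_{0\le t\le 1}|\sigma_{jt}| > \varphi(x)\Big) \\
\le{}& 3\exp\bigg(\frac{C_\mu^2}{\varphi(x)^2} - \frac{x^2}{32\,\varphi(x)^4 C_\Delta^2}\bigg) + \frac{K_\sigma}{\varphi(x)^{M}}.
\end{aligned}$$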
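We have $\varphi(x)^{-M} \le x^{-M\delta/2}$. Moreover, when $0 \le x \le 1$, $\exp\big(-x^2/(32\varphi(x)^4 C_\Delta^2)\big) \le 1 \le x^{-M\delta/2}$. When $1 < x \le (2C_\Delta\sqrt{n})^{1/(1-\delta)}$, by the fact that $\exp(-x)x^{y} \le \exp(-y)y^{y}$ for all $x, y > 0$, we have
$$\exp\bigg(-\frac{x^2}{32\,\varphi(x)^4 C_\Delta^2}\bigg) = \exp\bigg(-\frac{x^{2(1-\delta)}}{32\,C_\Delta^2}\bigg) \le \bigg(\frac{8M\delta C_\Delta^2}{1-\delta}\bigg)^{\frac{M\delta}{4(1-\delta)}} \cdot \exp\bigg(-\frac{M\delta}{4(1-\delta)}\bigg) \cdot x^{-\frac{M\delta}{2}}.$$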
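The elementary inequality $\exp(-x)x^{y} \le \exp(-y)y^{y}$ used in this step holds because $x \mapsto \exp(-x)x^{y}$ is maximized at $x = y$. A quick numerical check on a grid is sketched below; the grid ranges are arbitrary choices, and the check illustrates the inequality rather than proving it.

```python
import numpy as np

def gap(x, y):
    """Return exp(-y) * y**y - exp(-x) * x**y, which should be nonnegative."""
    return np.exp(-y) * y**y - np.exp(-x) * x**y

xs = np.linspace(0.01, 20.0, 400)
ys = np.linspace(0.01, 5.0, 100)
X, Y = np.meshgrid(xs, ys)
min_gap = gap(X, Y).min()  # smallest gap over the grid
```

In the proof the inequality is applied with $x^{2(1-\delta)}/(32C_\Delta^2)$ in the role of $x$ and $M\delta/(4(1-\delta))$ in the role of $y$, which produces the factor $x^{-M\delta/2}$.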
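The desired bound (B.1) follows by setting
$$C_1 = 3\exp(C_\mu^2)\cdot\bigg(\frac{8M\delta C_\Delta^2}{1-\delta}\bigg)^{\frac{M\delta}{4(1-\delta)}}\cdot\exp\bigg(-\frac{M\delta}{4(1-\delta)}\bigg) + 1 + K_\sigma.$$
The bound (B.2) follows from a similar argument above by using the inequality
$$\begin{aligned}
& P\bigg(\sqrt{n}\,\Big|[\nu_1,\nu_2]_1 - \int_0^1 \sigma_{1s}\sigma_{2s}\rho_s\,ds\Big| > x\bigg) \\
\le{}& P\bigg(\Big\{\sqrt{n}\,\Big|[\nu_1,\nu_2]_1 - \int_0^1 \sigma_{1s}\sigma_{2s}\rho_s\,ds\Big| > x\Big\} \cap \Big\{\max_{0\le t\le 1}\big(|\sigma_{1t}|, |\sigma_{2t}|\big) \le \varphi(x)\Big\}\bigg) \\
& + P\Big(\max_{0\le t\le 1}|\sigma_{1t}| > \varphi(x)\Big) + P\Big(\max_{0\le t\le 1}|\sigma_{2t}| > \varphi(x)\Big),
\end{aligned}$$
Lemma 2 in Cai, Hu, Li, and Zheng (2020), and setting
$$C_2 = 6\exp(C_\mu^2)\cdot\bigg(\frac{32M\delta C_\Delta^2}{1-\delta}\bigg)^{\frac{M\delta}{4(1-\delta)}}\cdot\exp\bigg(-\frac{M\delta}{4(1-\delta)}\bigg) + 1 + 2K_\sigma.$$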
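$$P\Bigg(\max_{1\le i\le S}\bigg|\frac{1}{T}\sum_{t=1}^{T} x_{it} - E(x_{it})\bigg| > C\sqrt{\frac{\log T}{T}}\Bigg) \le C\bigg(\frac{(\log T)^{M/2}}{T^{M/2-1-\gamma}} + \frac{1}{T}\bigg);$$
(ii) if $S \to \infty$,
$$P\Bigg(\max_{1\le i\le S}\bigg|\frac{1}{T}\sum_{t=1}^{T} x_{it} - E(x_{it})\bigg| > C\sqrt{\frac{\log S}{T}}\Bigg) \le C\bigg(\frac{(\log S)^{M/2}}{T^{M/2-1-\gamma}} + \frac{1}{T} + \frac{1}{S}\bigg).$$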
Proof: We only show the case when $S \to \infty$. For the case when $S$ is fixed, the results can be shown similarly by using truncation level $\sqrt{T/(\log T)}$ instead of $\sqrt{T/(\log S)}$.
We denote $x_{it}^{tr} = x_{it}\,1_{\{|x_{it}| \le C_0\sqrt{T/(\log S)}\}}$. By Markov's inequality, we have
$$P\Big(|x_{it}| > C_0\sqrt{T/(\log S)}\Big) \le \frac{C(\log S)^{M/2}}{T^{M/2}}.$$
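The truncation device and the Markov bound can be illustrated numerically. The sketch below is an assumption-laden toy: the panel is simulated from a Student-t distribution with 8 degrees of freedom so that the 6th moment is finite, and the values of `T`, `S`, `M`, and `C0` are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(2)
T, S, M, C0 = 500, 50, 6, 1.0
x = rng.standard_t(df=8, size=(S, T))        # toy panel with finite M-th moment
level = C0 * np.sqrt(T / np.log(S))          # truncation level C0 * sqrt(T / log S)
x_tr = np.where(np.abs(x) <= level, x, 0.0)  # x^tr = x * 1{|x| <= level}

exceed_freq = np.mean(np.abs(x) > level)     # empirical frequency of |x| > level
markov_bound = np.mean(np.abs(x) ** M) / level ** M  # sample version of E|x|^M / level^M
```

Because the level grows like $\sqrt{T/\log S}$, the bound $E|x|^{M}/\text{level}^{M}$ decays like $(\log S)^{M/2}/T^{M/2}$, which is the rate displayed above.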
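$$P\big(x_{it} = x_{it}^{tr} \text{ for all } 1 \le t \le T,\ 1 \le i \le S\big) = 1 - P\Big(\max_{1\le t\le T,\,1\le i\le S}|x_{it}| > C_0\sqrt{T/(\log S)}\Big) \tag{B.4}$$
By $E(x_{it}^{M}) < c$ and the Cauchy-Schwarz inequality, we have that, for any $1 \le M_0 < M$,
$$\begin{aligned}
\max_{1\le i\le S} E\big(|x_{it}^{tr} - x_{it}|^{M_0}\big)
&= \max_{1\le i\le S} E\Big(|x_{it}|^{M_0}\cdot 1_{\{x_{it} \ge C_0\sqrt{T/(\log S)}\}}\Big) \\
&\le \max_{1\le i\le S}\big(E(|x_{it}|^{M})\big)^{M_0/M}\cdot \max_{1\le i\le S} P\Big(x_{it} \ge C_0\sqrt{T/(\log S)}\Big)^{1-M_0/M}
\end{aligned} \tag{B.5}$$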
By the fact that $(a+b)^{g} \le 2^{g}(a^{g} + b^{g})$ for all $a, b > 0$ and $g \ge 1$, for some $2 < M_0 < M$ and $C > 0$,
$$\max_{1\le i\le S} E\big(|x_{it}^{tr}|^{M_0}\big) \le \max_{1\le i\le S} 2^{M_0}\Big(E\big(|x_{it}^{tr} - x_{it}|^{M_0}\big) + E\big(|x_{it}|^{M_0}\big)\Big) < C. \tag{B.7}$$
By (B.6), (B.7) and the triangle inequality, applying Bernstein's inequality (Theorem 2, Eqn. (2.3) of Merlevède, Peligrad, Rio et al. (2009)) to $x_{it}^{tr} - E(x_{it}^{tr})$ yields
$$P\Bigg(\max_{1\le i\le S}\bigg|\frac{1}{T}\sum_{t=1}^{T} x_{it}^{tr} - E(x_{it})\bigg| > C\sqrt{\frac{\log S}{T}}\Bigg) \le C\bigg(\frac{1}{S} + \frac{1}{T}\bigg). \tag{B.8}$$
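Lemma 3. Under the assumptions of Theorem 2, for some constant $C_0 > 0$, the $\widehat{\beta}$ and $\widehat{\alpha}_n$ defined in (2.7) satisfy
$$P\Big(\max_{1\le i\le N}\|\widehat{\beta}_i - \beta_i\|_2 > C_0\sqrt{\Delta_n}\Big) \le C_0\bigg(\frac{1}{N} + \frac{1}{T}\bigg), \text{ and} \tag{B.9}$$
$$P\Bigg(\|\widehat{\alpha}_n - \alpha_n\|_{\max} > C_0\,\Delta_n\bigg(\sqrt{\Delta_n} + \sqrt{\frac{\log N}{T}}\bigg)\Bigg) \le C_0\bigg(\frac{1}{N} + \frac{1}{T}\bigg). \tag{B.10}$$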
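$$\bar{U} =: (\bar{U}_1, \ldots, \bar{U}_N)^{T} = \frac{1}{T\cdot[1/\Delta_n]}\sum_{t=1}^{T}\sum_{j=1}^{[1/\Delta_n]} U_{t[j]}, \text{ and}$$
$$\bar{F} =: (\bar{F}_1, \ldots, \bar{F}_K)^{T} = \frac{1}{T\cdot[1/\Delta_n]}\sum_{t=1}^{T}\sum_{j=1}^{[1/\Delta_n]} F_{t[j]}.$$
$$\begin{aligned}
\widehat{\beta}_i - \beta_i ={}& \Bigg(\sum_{t=1}^{T}\sum_{j=1}^{[1/\Delta_n]}(F_{t[j]} - \bar{F})(F_{t[j]} - \bar{F})^{T}\Bigg)^{-1}\Bigg(\sum_{t=1}^{T}\sum_{j=1}^{[1/\Delta_n]}(F_{t[j]} - \bar{F})(U_{i,t[j]} - \bar{U}_i)\Bigg) \\
& + \Bigg(\sum_{t=1}^{T}\sum_{j=1}^{[1/\Delta_n]}(F_{t[j]} - \bar{F})(F_{t[j]} - \bar{F})^{T}\Bigg)^{-1}\Bigg(\sum_{t=1}^{T}\sum_{j=1}^{[1/\Delta_n]}(F_{t[j]} - \bar{F})(\alpha_{n;t[j]i} - \alpha_{ni})\Bigg),
\end{aligned}$$
where $\alpha_{n;t[j]i}$ and $\alpha_{ni}$ are the $i$th elements of $\alpha_{n;t[j]}$ defined in (2.3) and of the average drift $\alpha_n = \frac{1}{T\cdot[1/\Delta_n]}\sum_{t=1}^{T}\sum_{j=1}^{[1/\Delta_n]}\alpha_{n;t[j]}$, respectively.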
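The decomposition of $\widehat{\beta}_i - \beta_i$ can be checked numerically. The sketch below assumes that the estimator in (2.7), which is not reproduced in this appendix, is the least-squares slope of demeaned return increments on demeaned factor increments; all sizes and scales are arbitrary toy values, and `n` plays the role of $T\cdot[1/\Delta_n]$.

```python
import numpy as np

rng = np.random.default_rng(3)
n, K, N = 2000, 3, 5                         # n stands in for T * [1/Delta_n]
F = rng.normal(scale=0.01, size=(n, K))      # factor increments F_{t[j]}
U = rng.normal(scale=0.01, size=(n, N))      # idiosyncratic increments U_{t[j]}
alpha = rng.normal(scale=1e-4, size=(n, N))  # drift increments alpha_{n;t[j]}
beta = rng.normal(size=(N, K))
X = alpha + F @ beta.T + U                   # observed increments

Fc = F - F.mean(axis=0)                      # demeaned factors
G = Fc.T @ Fc                                # sum of (F - Fbar)(F - Fbar)^T
beta_hat = np.linalg.solve(G, Fc.T @ (X - X.mean(axis=0))).T  # assumed OLS estimator

term_u = np.linalg.solve(G, Fc.T @ (U - U.mean(axis=0))).T          # idiosyncratic part
term_a = np.linalg.solve(G, Fc.T @ (alpha - alpha.mean(axis=0))).T  # drift part
```

Row `i` of `beta_hat - beta` equals row `i` of `term_u + term_a` up to rounding, which mirrors the two terms on the right-hand side of the display.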
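We define an event $A$ as follows. For some $c, C > 0$,
$$\begin{aligned}
A ={}& \Bigg\{\max_{1\le k\le K,\,1\le i\le N}\bigg|\sum_{t=1}^{T}\sum_{j=1}^{[1/\Delta_n]}(F_{k,t[j]} - \bar{F}_k)(U_{i,t[j]} - \bar{U}_i)\bigg| < C\sqrt{\Delta_n T N^{1/\gamma}}\Bigg\} \\
& \cap \Bigg\{\lambda_{\min}\bigg(\sum_{t=1}^{T}\sum_{j=1}^{[1/\Delta_n]}(F_{t[j]} - \bar{F})(F_{t[j]} - \bar{F})^{T}\bigg) \ge \frac{1}{2}\lambda_{\min}\bigg(\int_0^{T}\Phi_s\,ds\bigg)\Bigg\} \\
& \cap \Bigg\{\lambda_{\max}\bigg(\sum_{t=1}^{T}\sum_{j=1}^{[1/\Delta_n]}(F_{t[j]} - \bar{F})(F_{t[j]} - \bar{F})^{T}\bigg) \le 2\lambda_{\max}\bigg(\int_0^{T}\Phi_s\,ds\bigg)\Bigg\} \\
& \cap \Bigg\{cT < \lambda_{\min}\bigg(\int_0^{T}\Phi_s\,ds\bigg) < \lambda_{\max}\bigg(\int_0^{T}\Phi_s\,ds\bigg) < CT\Bigg\}.
\end{aligned}$$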
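$$\begin{aligned}
\max_{1\le i\le N}\|\widehat{\beta}_i - \beta_i\|_2^2 \le{}& \frac{8}{c^2T^2}\sum_{k=1}^{K}\max_{1\le i\le N}\Bigg(\sum_{t=1}^{T}\sum_{j=1}^{[1/\Delta_n]}(F_{k,t[j]} - \bar{F}_k)(U_{i,t[j]} - \bar{U}_i)\Bigg)^{2} \\
& + \frac{32C^2T^2}{c^2T^2}\cdot TK[1/\Delta_n]\cdot\max_{1\le i\le N,\,1\le t\le T,\,1\le j\le[1/\Delta_n]}(\alpha_{n;t[j]i} - \alpha_{ni})^2
\end{aligned}$$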
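$$E\Bigg(\bigg|\int_t^{t+1}\Phi_{s,jk}\,ds\bigg|^{M}\Bigg) \le E\Bigg(\bigg(\int_t^{t+1}|\Phi_{s,jk}|\,ds\bigg)^{M}\Bigg) \le E\bigg(\int_t^{t+1}|\Phi_{s,jk}|^{M}\,ds\bigg) \le C. \tag{B.11}$$
Under Assumption 1 that $0 < c_1 < \lambda_{\min}\big(E(\int_0^1\Phi_s\,ds)\big) \le \lambda_{\max}\big(E(\int_0^1\Phi_s\,ds)\big) < C_1$, by Lemma 2(i) and Weyl's Theorem, we have, for all large $T$,
$$P\Bigg(\frac{c_1}{2} < \lambda_{\min}\bigg(\frac{1}{T}\int_0^{T}\Phi_s\,ds\bigg) < \lambda_{\max}\bigg(\frac{1}{T}\int_0^{T}\Phi_s\,ds\bigg) < 2C_1\Bigg) \ge 1 - \frac{CK^2}{T}. \tag{B.12}$$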
Applying Lemma 1 to $X_{tT}/\sqrt{T}$ with $x = \sqrt{T}$ and $\delta = 1/2$, and by Bonferroni's inequality, under Assumption 4, we have, for some constants $C_1, C_2 > 0$,
$$P\Bigg(\bigg\|\frac{1}{T}\sum_{t=1}^{T}\sum_{j=1}^{[1/\Delta_n]}F_{t[j]}F_{t[j]}^{T} - \frac{1}{T}\int_0^{T}\Phi_s\,ds\bigg\|_{\max} > C_1\sqrt{\Delta_n}\Bigg) < \frac{C_2K^2}{T^{M/2}} \le \frac{C_2K^2}{T^2}, \tag{B.13}$$
where the last inequality holds because $M > 4$.
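By Assumption 1 that $\sup_{s\ge 0}\|h_s\|_{\max} = O(1)$, we have
$$\max_{1\le k\le K}\bigg|\bigg(\frac{1}{T}\int_0^{T}h_s\,ds\bigg)_k\bigg| \le C.$$
where the second inequality holds by Jensen's inequality and that $M > 4$. By
Note that $\bar{F} = \big(\int_0^{T}h_s\,ds + \int_0^{T}\eta_s\,dW_s\big)/(T[1/\Delta_n])$. For large $T$, we have
$$P\Big(\max_{1\le k\le K}(|\bar{F}_k|) > C\Delta_n\Big) \le \frac{CK}{T^2}. \tag{B.15}$$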
Therefore, by the inequality that $\|A\|_2 \le \mathrm{tr}(A)$ for any nonnegative definite matrix $A$, we have
$$P\Big(T\cdot[1/\Delta_n]\cdot\|\bar{F}\bar{F}^{T}\|_2 > CKT\Delta_n\Big) \le \frac{CK^2}{T^2}. \tag{B.16}$$
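Note that $\Delta_n = o(1)$. By Weyl's Theorem, (B.12), (B.13) and (B.16), we get
$$\begin{aligned}
& P\Bigg(\lambda_{\min}\bigg(\sum_{t=1}^{T}\sum_{j=1}^{[1/\Delta_n]}(F_{t[j]} - \bar{F})(F_{t[j]} - \bar{F})^{T}\bigg) < \frac{1}{2}\lambda_{\min}\bigg(\int_0^{T}\Phi_s\,ds\bigg)\Bigg) \\
={}& P\Bigg(\lambda_{\min}\bigg(\sum_{t=1}^{T}\sum_{j=1}^{[1/\Delta_n]}F_{t[j]}F_{t[j]}^{T} - T\cdot[1/\Delta_n]\cdot\bar{F}\bar{F}^{T}\bigg) < \frac{1}{2}\lambda_{\min}\bigg(\int_0^{T}\Phi_s\,ds\bigg)\Bigg) \\
\le{}& \frac{CK^2}{T^2},
\end{aligned} \tag{B.17}$$
and
$$\begin{aligned}
& P\Bigg(\lambda_{\max}\bigg(\sum_{t=1}^{T}\sum_{j=1}^{[1/\Delta_n]}(F_{t[j]} - \bar{F})(F_{t[j]} - \bar{F})^{T}\bigg) > 2\lambda_{\max}\bigg(\int_0^{T}\Phi_s\,ds\bigg)\Bigg) \\
={}& P\Bigg(\lambda_{\max}\bigg(\sum_{t=1}^{T}\sum_{j=1}^{[1/\Delta_n]}F_{t[j]}F_{t[j]}^{T} - T\cdot[1/\Delta_n]\cdot\bar{F}\bar{F}^{T}\bigg) > 2\lambda_{\max}\bigg(\int_0^{T}\Phi_s\,ds\bigg)\Bigg) \\
\le{}& \frac{CK^2}{T^2}.
\end{aligned} \tag{B.18}$$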
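Both bounds rest on Weyl's Theorem: subtracting the rank-one matrix $T\cdot[1/\Delta_n]\cdot\bar{F}\bar{F}^{T}$ moves each ordered eigenvalue by at most its spectral norm, which for a rank-one nonnegative definite matrix equals its trace. A small numerical illustration on random toy matrices (not the paper's data) is sketched below.

```python
import numpy as np

rng = np.random.default_rng(5)
K = 4
B = rng.normal(size=(K, K))
A = B @ B.T                              # symmetric nonnegative definite matrix
f_bar = rng.normal(scale=0.1, size=K)    # toy stand-in for the mean factor increment
E = -np.outer(f_bar, f_bar)              # rank-one perturbation, like -Fbar Fbar^T

# eigvalsh returns eigenvalues in ascending order, so the comparison is order-wise
shift = np.abs(np.linalg.eigvalsh(A + E) - np.linalg.eigvalsh(A))
spectral_norm = np.linalg.norm(E, 2)     # largest singular value of E
```

Here `shift.max()` never exceeds `spectral_norm`, and `spectral_norm` coincides with `f_bar @ f_bar`, the trace of $\bar{F}\bar{F}^{T}$ in this toy setting.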
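The assumptions that $\log T/|\log\Delta_n| = O(1)$ and $N = O(T^{\gamma})$ imply that there exists $\delta_0 \in (0, 1)$ such that $N^{1/(2\gamma)} = o\big((T/\Delta_n)^{\frac{1}{2(1-\delta_0)}}\big)$. Applying Lemma 1 to $X_{tT}/\sqrt{T}$ and $Z_{tT}/\sqrt{T}$ with $x = N^{1/(2\gamma)}$ and $\delta = \delta_0$, and using Bonferroni's inequality again, we have
$$P\Bigg(\max_{1\le k\le K,\,1\le i\le N}\bigg|\frac{1}{\sqrt{T}}\sum_{t=1}^{T}\sum_{j=1}^{[1/\Delta_n]}F_{k,t[j]}U_{i,t[j]}\bigg| > C\sqrt{\Delta_n N^{1/\gamma}}\Bigg) \le \frac{CKN}{N^{M\delta_0/(2\gamma)}} \le \frac{CK}{N}, \tag{B.19}$$
where the last inequality holds by the assumption that $M > 4(1 + 2\gamma)$ and we can choose $\delta_0$ close to one such that $M\delta_0/(2\gamma) > 2$. By Assumption 4, the Burkholder-Davis-Gundy inequality and Jensen's inequality, we have
$$\max_{1\le i\le N,\,1\le t\le T}E\Bigg(\bigg(\int_{t-1}^{t}\zeta_s\,dB_s\bigg)_i^{2M}\Bigg) \le \max_{1\le i\le N,\,1\le t\le T}E\Bigg(\bigg(\int_{t-1}^{t}\Theta_{s,ii}\,ds\bigg)^{M}\Bigg) \le \max_{1\le i\le N,\,1\le t\le T}E\bigg(\int_{t-1}^{t}\Theta_{s,ii}^{M}\,ds\bigg) \le C.$$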
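By Lemma 2(ii), under the assumption that $M > 4(1 + 2\gamma)$, we have
$$P\Bigg(\bigg\|\int_0^{T}\zeta_s\,dB_s\bigg\|_{\max} > C\sqrt{T\log N}\Bigg) \le C\bigg(\frac{1}{N} + \frac{1}{T}\bigg).$$
Noting that $\bar{U}_i = \big(\int_0^{T}\zeta_s\,dB_s\big)_i/(T[1/\Delta_n])$, we get
$$P\Bigg(\max_{1\le i\le N}|\bar{U}_i| > C\Delta_n\sqrt{\frac{\log N}{T}}\Bigg) \le C\bigg(\frac{1}{N} + \frac{1}{T}\bigg). \tag{B.20}$$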
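$$\begin{aligned}
& P\Bigg(\max_{1\le k\le K,\,1\le i\le N}\bigg|\sum_{t=1}^{T}\sum_{j=1}^{[1/\Delta_n]}(F_{k,t[j]} - \bar{F}_k)(U_{i,t[j]} - \bar{U}_i)\bigg| > C\sqrt{\Delta_n T N^{1/\gamma}}\Bigg) \\
={}& P\Bigg(\max_{1\le k\le K,\,1\le i\le N}\bigg|\sum_{t=1}^{T}\sum_{j=1}^{[1/\Delta_n]}F_{k,t[j]}U_{i,t[j]} - T\cdot[1/\Delta_n]\cdot\bar{F}_k\bar{U}_i\bigg| > C\sqrt{\Delta_n T N^{1/\gamma}}\Bigg) \\
\le{}& P\Bigg(\max_{1\le k\le K,\,1\le i\le N}\bigg|\sum_{t=1}^{T}\sum_{j=1}^{[1/\Delta_n]}F_{k,t[j]}U_{i,t[j]}\bigg| > \frac{C}{2}\sqrt{\Delta_n T N^{1/\gamma}}\Bigg) \\
& + P\Bigg(\max_{1\le k\le K,\,1\le i\le N}T\cdot[1/\Delta_n]\cdot\big|\bar{F}_k\bar{U}_i\big| > \frac{C}{2}\sqrt{\Delta_n T N^{1/\gamma}}\Bigg) \\
\le{}& CK\bigg(\frac{1}{N} + \frac{1}{T}\bigg).
\end{aligned} \tag{B.22}$$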
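Combining (B.12), (B.17), (B.18) and (B.22), we get that $P(A) \ge 1 - O(1/N + 1/T)$. The desired bound (B.9) follows.
As to (B.10), note that $\widehat{\alpha}_n - \alpha_n = (\beta - \widehat{\beta})\bar{F} + \bar{U}$. Hence,
$$\|\widehat{\alpha}_n - \alpha_n\|_{\max} \le \|(\beta - \widehat{\beta})\bar{F}\|_{\max} + \|\bar{U}\|_{\max} \le \max_{1\le i\le N}\|\beta_i - \widehat{\beta}_i\|_2\cdot\|\bar{F}\|_2 + \|\bar{U}\|_{\max}.$$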
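Lemma 4. Under the assumptions of Theorem 2, for some $0 < \varepsilon < M/4 - 1 - 2\gamma$ and $C_0 > 0$, $RV_{\widehat{U}}$ defined in (2.8) satisfies
$$P\Bigg(\max_{1\le i\le N}\frac{1}{T}\bigg|\sum_{t=1}^{T}RV_{\widehat{U};it} - \sum_{t=1}^{T}V_{U;it}\bigg| > C_0\sqrt{\Delta_n}\Bigg) \le C_0\bigg(\frac{1}{N} + \frac{1}{T}\bigg), \tag{B.23}$$
and
$$P\Bigg(\max_{1\le i,j\le N}\bigg|\frac{1}{T}\sum_{t=1}^{T}V_{U;it}V_{U;jt} - \frac{1}{T}\sum_{t=1}^{T}RV_{\widehat{U};it}RV_{\widehat{U};jt}\bigg| > C_0\sqrt{\Delta_n}\Bigg) \le C_0\bigg(\frac{1}{N} + \frac{1}{T^{\varepsilon}}\bigg). \tag{B.24}$$
Proof: Recall that $RV_{\widehat{U};it} = \sum_{j=1}^{[1/\Delta_n]}\widehat{U}_{i,t[j]}^2$. We write $RV_{U;it} = \sum_{j=1}^{[1/\Delta_n]}U_{i,t[j]}^2$.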
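$$\begin{aligned}
B ={}& \Bigg\{\max_{1\le i\le N}\frac{1}{T}\sum_{t=1}^{T}\sum_{j=1}^{[1/\Delta_n]}(\widehat{U}_{i,t[j]} - U_{i,t[j]})^2 \le CK^2\Delta_n\bigg(1 + \frac{N^{1/\gamma}}{T}\bigg)\Bigg\} \\
& \cap \Bigg\{\max_{1\le i\le N}\bigg|\frac{1}{T}\sum_{t=1}^{T}RV_{U;it} - \frac{1}{T}\sum_{t=1}^{T}V_{U;it}\bigg| \le C\sqrt{\frac{\Delta_n N^{1/\gamma}}{T}}\Bigg\} \\
& \cap \Bigg\{\max_{1\le i\le N}\frac{1}{T}\sum_{t=1}^{T}V_{U;it} \le C\Bigg\}, \quad\text{for some } C > 0.
\end{aligned}$$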
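$$\begin{aligned}
& \max_{1\le i\le N}\bigg|\frac{1}{T}\sum_{t=1}^{T}RV_{\widehat{U};it} - \frac{1}{T}\sum_{t=1}^{T}RV_{U;it}\bigg| \\
\le{}& \max_{1\le i\le N}\frac{1}{T}\sum_{t=1}^{T}\sum_{j=1}^{[1/\Delta_n]}(\widehat{U}_{i,t[j]} - U_{i,t[j]})^2 + 2\max_{1\le i,s\le N}\bigg|\frac{1}{T}\sum_{t=1}^{T}\sum_{j=1}^{[1/\Delta_n]}(\widehat{U}_{i,t[j]} - U_{i,t[j]})U_{s,t[j]}\bigg| \\
\le{}& \max_{1\le i\le N}\frac{1}{T}\sum_{t=1}^{T}\sum_{j=1}^{[1/\Delta_n]}(\widehat{U}_{i,t[j]} - U_{i,t[j]})^2 + 2\sqrt{\max_{1\le i\le N}\frac{1}{T}\sum_{t=1}^{T}\sum_{j=1}^{[1/\Delta_n]}(\widehat{U}_{i,t[j]} - U_{i,t[j]})^2}\cdot\sqrt{\max_{1\le i\le N}\frac{1}{T}\sum_{t=1}^{T}RV_{U;it}}.
\end{aligned}$$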
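$$\max_{1\le i\le N}\frac{1}{T}\sum_{t=1}^{T}RV_{U;it} \le C\Bigg(1 + \sqrt{\frac{\Delta_n N^{1/\gamma}}{T}}\Bigg) < 2C,$$
where the last inequality holds by the assumptions that $N^{1/\gamma} = O(T)$ and $\Delta_n = o(1)$. Therefore, under event $B$,
$$\max_{1\le i\le N}\bigg|\frac{1}{T}\sum_{t=1}^{T}RV_{\widehat{U};it} - \frac{1}{T}\sum_{t=1}^{T}RV_{U;it}\bigg| \le C'K\sqrt{\Delta_n}.$$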
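$$\begin{aligned}
& \max_{1\le i\le N}\bigg|\frac{1}{T}\sum_{t=1}^{T}RV_{\widehat{U};it} - \frac{1}{T}\sum_{t=1}^{T}V_{U;it}\bigg| \\
\le{}& \max_{1\le i\le N}\bigg|\frac{1}{T}\sum_{t=1}^{T}RV_{\widehat{U};it} - \frac{1}{T}\sum_{t=1}^{T}RV_{U;it}\bigg| + \max_{1\le i\le N}\bigg|\frac{1}{T}\sum_{t=1}^{T}RV_{U;it} - \frac{1}{T}\sum_{t=1}^{T}V_{U;it}\bigg| \\
\le{}& C''K\sqrt{\Delta_n}.
\end{aligned}$$
$$P\Bigg(\max_{1\le i\le N}\bigg|\frac{1}{T}\sum_{t=1}^{T}RV_{U;it} - \frac{1}{T}\sum_{t=1}^{T}V_{U;it}\bigg| > \sqrt{\frac{\Delta_n N^{1/\gamma}}{T}}\Bigg) \le \frac{C_2N}{N^{M\delta_0/(2\gamma)}} \le \frac{C}{N}. \tag{B.25}$$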
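By (2.7) and the inequality that $(a + b)^2 \le 2a^2 + 2b^2$, for each $1 \le i \le N$, $1 \le t \le T$ and $1 \le j \le [1/\Delta_n]$,
$$\max_{1\le i\le N}\frac{1}{T}\sum_{t=1}^{T}\sum_{j=1}^{[1/\Delta_n]}(\widehat{U}_{i,t[j]} - U_{i,t[j]})^2 \le 2\max_{1\le i\le N}\|\widehat{\beta}_i - \beta_i\|_2^2\cdot\Bigg(\frac{1}{T}\sum_{t=1}^{T}\sum_{j=1}^{[1/\Delta_n]}\|F_{t[j]}\|_2^2\Bigg) + 4[1/\Delta_n]\cdot\|\widehat{\alpha}_n - \alpha_n\|_{\max}^2 + C\Delta_n. \tag{B.26}$$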
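$$P\Bigg(\frac{1}{T}\sum_{t=1}^{T}\sum_{j=1}^{[1/\Delta_n]}\|F_{t[j]}\|_2^2 > KC'\Bigg) \le \frac{C'K^2}{T}. \tag{B.27}$$
Combining (B.9), (B.10), (B.26) and (B.27) yields, for some $C > 0$,
$$P\Bigg(\max_{1\le i\le N}\frac{1}{T}\sum_{t=1}^{T}\sum_{j=1}^{[1/\Delta_n]}(\widehat{U}_{i,t[j]} - U_{i,t[j]})^2 > CK^2\Delta_n\bigg(1 + \frac{N^{1/\gamma}}{T}\bigg)\Bigg) \le C\bigg(\frac{K}{N} + \frac{K^2}{T}\bigg). \tag{B.28}$$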
Under Assumptions 1 and 4, by Lemma 2(ii),
$$P\Bigg(\max_{1\le i\le N}\bigg|\frac{1}{T}\sum_{t=1}^{T}V_{U;it} - E(V_{U;i})\bigg| \ge C\sqrt{\frac{\log N}{T}}\Bigg) \le C\bigg(\frac{1}{N} + \frac{1}{T}\bigg). \tag{B.29}$$
$$P\Bigg(\max_{1\le i\le N}\frac{1}{T}\sum_{t=1}^{T}V_{U;it} > C\Bigg) \le C\bigg(\frac{1}{N} + \frac{1}{T}\bigg). \tag{B.30}$$
Combining (B.25), (B.28) and (B.30) yields $P(B) \ge 1 - O(1/N + 1/T)$. The desired bound (B.23) follows.
As to (B.24), by the triangle inequality and the Cauchy-Schwarz inequality, we
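Under Assumption 3, by Lemma 2(ii), we have, for some $\varepsilon < M/4 - 1 - 2\gamma$,
$$P\Bigg(\max_{1\le i,j\le N}\bigg|\frac{1}{T}\sum_{t=1}^{T}V_{U;it}V_{U;jt} - E(V_{U;i}V_{U;j})\bigg| > C\sqrt{\frac{\log N}{T}}\Bigg) \le C\bigg(\frac{1}{N} + \frac{1}{T^{\varepsilon}}\bigg). \tag{B.32}$$
Assumption 4 implies that $\max_{1\le i\le N}E(V_{U;i}^{M}) = O(1)$. By Lemma 2(ii) and $M > 4(1 + 2\gamma)$,
$$P\Bigg(\max_{1\le i\le N}\bigg|\frac{1}{T}\sum_{t=1}^{T}V_{U;it}^2 - E(V_{U;i}^2)\bigg| \ge C\sqrt{\frac{\log N}{T}}\Bigg) \le C\bigg(\frac{1}{N} + \frac{1}{T^{\varepsilon}}\bigg).$$
$$P\Bigg(\max_{1\le i\le N}\frac{1}{T}\sum_{t=1}^{T}V_{U;it}^2 > C\Bigg) \le C\bigg(\frac{1}{N} + \frac{1}{T^{\varepsilon}}\bigg). \tag{B.33}$$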
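$\Delta_n^{1/\varepsilon}T = o(1)$ and $M > 4(1 + \gamma + \varepsilon)$.
Set $\delta$ satisfying $M\delta > 4(1 + \gamma + \varepsilon)$, and $\delta/(1 - \delta) > 2(1 + \gamma + \varepsilon)/(M\varepsilon)$. Then
$$T^{(1+\gamma+\varepsilon)/(M\delta)} = o\big(\Delta_n^{-1/(2(1-\delta))}\big).$$
In particular, by setting $y = 1$,
$$\max_{1\le i\le N,\,1\le t\le T}E\bigg(\frac{(RV_{U;it} - V_{U;it})_{tr}^2}{\Delta_n}\bigg) < C. \tag{B.35}$$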
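Applying Lemma 2(ii) to $(RV_{U;it} - V_{U;it})_{tr}^2/\Delta_n - E\big((RV_{U;it} - V_{U;it})_{tr}^2/\Delta_n\big)$, and by $M\delta > 4(1 + \gamma)$, we have that
$$P\Bigg(\max_{1\le i\le N}\bigg|\frac{1}{T}\sum_{t=1}^{T}\frac{(RV_{U;it} - V_{U;it})_{tr}^2}{\Delta_n} - E\bigg(\frac{(RV_{U;it} - V_{U;it})_{tr}^2}{\Delta_n}\bigg)\bigg| > C\sqrt{\frac{\log N}{T}}\Bigg) \le C\bigg(\frac{1}{N} + \frac{1}{T^{\varepsilon}}\bigg). \tag{B.36}$$
By (B.35), (B.36) and the triangle inequality, we get
$$P\Bigg(\max_{1\le i\le N}\frac{1}{T}\sum_{t=1}^{T}(RV_{U;it} - V_{U;it})_{tr}^2 > C\Delta_n\Bigg) \le C\bigg(\frac{1}{N} + \frac{1}{T^{\varepsilon}}\bigg). \tag{B.37}$$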
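$$\begin{aligned}
& P\Bigg(\max_{1\le i\le N}\bigg|\frac{1}{T}\sum_{t=1}^{T}(RV_{U;it} - V_{U;it})_{tr}^2 - \frac{1}{T}\sum_{t=1}^{T}(RV_{U;it} - V_{U;it})^2\bigg| > 0\Bigg) \\
\le{}& \sum_{i=1}^{N}\sum_{t=1}^{T}P\Big(|RV_{U;it} - V_{U;it}| > \sqrt{\Delta_n}\,T^{(1+\gamma+\varepsilon)/(M\delta)}\Big) \le \frac{CN}{T^{\gamma+\varepsilon}} = O\bigg(\frac{1}{T^{\varepsilon}}\bigg),
\end{aligned} \tag{B.38}$$
where the last inequality holds by the assumption that $N = O(T^{\gamma})$. Combining (B.37) and (B.38) yields, for some constant $C > 0$,
$$P\Bigg(\max_{1\le i\le N}\frac{1}{T}\sum_{t=1}^{T}(RV_{U;it} - V_{U;it})^2 > C\Delta_n\Bigg) \le C\bigg(\frac{1}{N} + \frac{1}{T^{\varepsilon}}\bigg). \tag{B.39}$$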
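$$\begin{aligned}
& \max_{1\le i\le N}\frac{1}{T}\sum_{t=1}^{T}(RV_{\widehat{U};it} - RV_{U;it})^2 \\
={}& \max_{1\le i\le N}\frac{1}{T}\sum_{t=1}^{T}\Bigg(\sum_{j=1}^{[1/\Delta_n]}(\widehat{U}_{i,t[j]} - U_{i,t[j]})^2 + 2\sum_{j=1}^{[1/\Delta_n]}(\widehat{U}_{i,t[j]} - U_{i,t[j]})U_{i,t[j]}\Bigg)^{2} \\
\le{}& \max_{1\le i\le N}\frac{2}{T}\sum_{t=1}^{T}\Bigg(\sum_{j=1}^{[1/\Delta_n]}(\widehat{U}_{i,t[j]} - U_{i,t[j]})^2\Bigg)^{2} + \max_{1\le i\le N}\frac{8}{T}\sum_{t=1}^{T}\Bigg(\sum_{j=1}^{[1/\Delta_n]}(\widehat{U}_{i,t[j]} - U_{i,t[j]})U_{i,t[j]}\Bigg)^{2}.
\end{aligned} \tag{B.40}$$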
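$$\begin{aligned}
& \max_{1\le i\le N}\frac{1}{T}\sum_{t=1}^{T}\Bigg(\sum_{j=1}^{[1/\Delta_n]}(\widehat{U}_{i,t[j]} - U_{i,t[j]})U_{i,t[j]}\Bigg)^{2} \\
\le{}& \max_{1\le i\le N}\frac{1}{T}\sum_{t=1}^{T}\Bigg(\sum_{j=1}^{[1/\Delta_n]}(\widehat{U}_{i,t[j]} - U_{i,t[j]})^2\Bigg)\cdot\Bigg(\sum_{j=1}^{[1/\Delta_n]}U_{i,t[j]}^2\Bigg) \\
\le{}& \sqrt{\max_{1\le i\le N}\frac{1}{T}\sum_{t=1}^{T}\Bigg(\sum_{j=1}^{[1/\Delta_n]}(\widehat{U}_{i,t[j]} - U_{i,t[j]})^2\Bigg)^{2}}\cdot\sqrt{\max_{1\le i\le N}\frac{1}{T}\sum_{t=1}^{T}RV_{U;it}^2}.
\end{aligned} \tag{B.41}$$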
$$P\Bigg(\max_{1\le i\le N}\frac{1}{T}\sum_{t=1}^{T}RV_{U;it}^2 > C\Bigg) \le C\bigg(\frac{1}{N} + \frac{1}{T^{\varepsilon}}\bigg). \tag{B.42}$$
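Similarly, we have
$$\begin{aligned}
& \max_{1\le i\le N}\sqrt{\frac{1}{T}\sum_{t=1}^{T}\Bigg(\sum_{j=1}^{[1/\Delta_n]}(\widehat{U}_{i,t[j]} - U_{i,t[j]})^2\Bigg)^{2}} \\
\le{}& \max_{1\le i\le N}\sqrt{\frac{1}{T}\sum_{t=1}^{T}\Bigg(2\|\widehat{\beta}_i - \beta_i\|_2^2\cdot\sum_{j=1}^{[1/\Delta_n]}\|F_{t[j]}\|_2^2 + 2\sum_{j=1}^{[1/\Delta_n]}|\widehat{\alpha}_{ni} - \alpha_{n;t[j]i}|^2\Bigg)^{2}} \\
\le{}& \max_{1\le i\le N}\|\widehat{\beta}_i - \beta_i\|_2^2\cdot\sqrt{\frac{8}{T}\sum_{t=1}^{T}\Bigg(\sum_{k=1}^{K}\sum_{j=1}^{[1/\Delta_n]}F_{k,t[j]}^2\Bigg)^{2}} + \max_{1\le i\le N}\sqrt{\frac{8}{T}\sum_{t=1}^{T}\Bigg(\sum_{j=1}^{[1/\Delta_n]}|\widehat{\alpha}_{ni} - \alpha_{n;t[j]i}|^2\Bigg)^{2}} \\
\le{}& \max_{1\le i\le N}\|\widehat{\beta}_i - \beta_i\|_2^2\cdot\sqrt{\frac{8K}{T}\sum_{t=1}^{T}\sum_{k=1}^{K}RV_{F;kt}^2} + 8[1/\Delta_n]\cdot\|\widehat{\alpha}_n - \alpha_n\|_{\max}^2 + 8\max_{1\le i\le N}\sqrt{\frac{1}{T}\sum_{t=1}^{T}\Bigg(\sum_{j=1}^{[1/\Delta_n]}|\alpha_{ni} - \alpha_{n;t[j]i}|^2\Bigg)^{2}},
\end{aligned}$$
where $RV_{F;kt} = \sum_{j=1}^{[1/\Delta_n]}F_{k,t[j]}^2$. By the assumption that $\sup_{s\ge 0}\|\alpha_s\|_{\max} = O(1)$, we have
$$\max_{1\le i\le N}\sqrt{\frac{1}{T}\sum_{t=1}^{T}\Bigg(\sum_{j=1}^{[1/\Delta_n]}|\alpha_{ni} - \alpha_{n;t[j]i}|^2\Bigg)^{2}} \le C\Delta_n. \tag{B.43}$$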
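Moreover, by Lemma 2(i), similar to the proof of (B.33), (B.39) and (B.42), one can show that, for some $C > 0$,
$$P\Bigg(\max_{1\le k\le K}\frac{1}{T}\sum_{t=1}^{T}RV_{F;kt}^2 > C\Bigg) \le \frac{C}{T^{\varepsilon}}. \tag{B.44}$$
Under the assumption that $N = O(T^{\gamma})$, combining (B.43), (B.44) and Lemma 3
$$P\Bigg(\max_{1\le i\le N}\sqrt{\frac{1}{T}\sum_{t=1}^{T}(RV_{\widehat{U};it} - V_{U;it})^2} > CK\sqrt{\Delta_n}\Bigg) \le C\bigg(\frac{K}{N} + \frac{1}{T^{\varepsilon}}\bigg). \tag{B.46}$$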
Proof of Theorem 1:
Applying the same argument as the proof of (B.25) to the stock RV, under the assumptions of Theorem 1, one can show that, for some $C > 0$,
$$P\Bigg(\max_{1\le i\le N}\frac{1}{T}\bigg|\sum_{t=1}^{T}RV_{it} - \sum_{t=1}^{T}V_{it}\bigg| > \sqrt{\frac{\Delta_n N^{1/\gamma}}{T}}\Bigg) \le \frac{C}{N}. \tag{B.47}$$
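$$\begin{aligned}
& \max_{1\le i,j\le N}\bigg|\frac{1}{T}\sum_{t=1}^{T}V_{it}V_{jt} - \frac{1}{T}\sum_{t=1}^{T}RV_{it}RV_{jt}\bigg| \\
\le{}& \max_{1\le i,j\le N}\bigg|\frac{2}{T}\sum_{t=1}^{T}V_{it}(V_{jt} - RV_{jt})\bigg| + \max_{1\le i,j\le N}\bigg|\frac{1}{T}\sum_{t=1}^{T}(V_{it} - RV_{it})(V_{jt} - RV_{jt})\bigg| \\
\le{}& 2\sqrt{\max_{1\le i\le N}\frac{1}{T}\sum_{t=1}^{T}V_{it}^2}\cdot\sqrt{\max_{1\le i\le N}\frac{1}{T}\sum_{t=1}^{T}(V_{it} - RV_{it})^2} + \max_{1\le i\le N}\frac{1}{T}\sum_{t=1}^{T}(V_{it} - RV_{it})^2.
\end{aligned}$$
In addition, under the assumptions of Theorem 1, (B.29), (B.32), (B.33) and (B.39)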
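$$P\Bigg(\max_{1\le i,j\le N}\bigg|\frac{1}{T}\sum_{t=1}^{T}V_{it}V_{jt} - \frac{1}{T}\sum_{t=1}^{T}RV_{it}RV_{jt}\bigg| > C\sqrt{\Delta_n}\Bigg) \le C\bigg(\frac{1}{N} + \frac{1}{T^{\varepsilon}}\bigg), \text{ and} \tag{B.48}$$
$$P\Bigg(\|\widehat{\Sigma}_V - \Sigma_V\|_{\max} > C\sqrt{\frac{\log N}{T}}\Bigg) \le C\bigg(\frac{1}{N} + \frac{1}{T^{\varepsilon}}\bigg), \tag{B.49}$$
where $\widehat{\Sigma}_V$ is the sample covariance matrix of $V_{it}$.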
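The bound (2.4) follows from (B.47), (B.48) and (B.49). The bounds (2.5) and (2.6) follow from (2.4), Assumption 3, Weyl's Theorem and the $\sin\theta$ Theorem (Davis and Kahan (1970)), which asserts that, for $i \le q$,
$$\|\widehat{\xi}_{RV_i} - \xi_{V_i}\| \le \frac{\sqrt{2}\,\|\widehat{\Sigma}_V - \widehat{\Sigma}_{RV}\|_2}{\min\big(|\widehat{\lambda}_{RV_{i-1}} - \lambda_{V_i}|,\ |\widehat{\lambda}_{RV_{i+1}} - \lambda_{V_i}|\big)}.$$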
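Proof of Theorem 2:
By (B.29) and (B.32), we have
$$\|\widehat{\Sigma}_{V_U} - \Sigma_{V_U}\|_{\max} = O_p\Bigg(\sqrt{\frac{\log N}{T}}\Bigg), \tag{B.50}$$
where $\widehat{\Sigma}_{V_U}$ is the sample covariance matrix of $V_U$. By Lemma 4, we have
$$\|\widehat{\Sigma}_{V_U} - \widehat{\Sigma}_{RV_{\widehat{U}}}\|_{\max} = O_p\big(\sqrt{\Delta_n}\big). \tag{B.51}$$
Proof of Proposition 1: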
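where $\overline{CV} = \sum_{t=1}^{T}CV_t/T$, and $\bar{V}_i = \sum_{t=1}^{T}V_{it}/T$. We have,
$$\widehat{b}_{i,V} - b_i = \Bigg(\sum_{t=1}^{T}(CV_t - \overline{CV})^2\Bigg)^{-1}\cdot\Bigg(\sum_{t=1}^{T}(CV_t - \overline{CV})(\varepsilon_{it} - \bar{\varepsilon}_i)\Bigg),$$
$$|\widehat{a}_{i,V} - a_i| \le |\widehat{b}_{i,V} - b_i|\cdot\overline{CV} + |\bar{\varepsilon}_i|,$$
where $\bar{\varepsilon}_i = \sum_{t=1}^{T}\varepsilon_{it}/T$.
$$\max_{1\le i\le N}|E(\varepsilon_{it}^{M})| \le M\cdot\max_{1\le i\le N}E(V_{it}^{M}) + M\cdot|b_{\xi,i}/\bar{b}_{\xi}|^{M}\cdot E(CV_t^{M}) = O(1).$$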
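By Lemma 2, we have $\sum_{t=1}^{T}(CV_t - \overline{CV})^2/T = \mathrm{Var}(CV_t) + o_p(1)$, and $\overline{CV} = O_p(1)$. By the assumption that $|\bar{b}_{\xi}| > c$, we have $\mathrm{Var}(CV_t) > \bar{b}_{\xi}^2 > 0$.
Recall that $CV_t = \bar{a}_{\xi} + \bar{b}_{\xi}\xi_t + \bar{\varepsilon}_{\xi,t}$, and $\varepsilon_{it} = \varepsilon_{\xi,it} - E(\varepsilon_{\xi,it}) - (b_{\xi,i}/\bar{b}_{\xi})\big(\bar{\varepsilon}_{\xi,t} - E(\bar{\varepsilon}_{\xi,t})\big)$. By Assumption 6, we have
$$\max_{1\le i\le N}\bigg|\frac{1}{T}\sum_{t=1}^{T}CV_t\cdot\varepsilon_{it} - E(CV_t\cdot\varepsilon_{it})\bigg| = O_p\Bigg(\sqrt{\frac{\log N}{T}}\Bigg),$$
$$|\overline{CV} - E(CV_t)| = O_p\Bigg(\sqrt{\frac{\log T}{T}}\Bigg), \text{ and } \max_{1\le i\le N}|\bar{\varepsilon}_i| = O_p\Bigg(\sqrt{\frac{\log N}{T}}\Bigg).$$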
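$$\max_{1\le i\le N}\bigg|\frac{1}{T}\sum_{t=1}^{T}(CV_t - \overline{CV})\cdot(\varepsilon_{it} - \bar{\varepsilon}_i)\bigg| = O_p\Bigg(\sqrt{\frac{\log N}{T}} + \frac{1}{\sqrt{N}}\Bigg).$$
It follows that
$$\max_{1\le i\le N}\big(|\widehat{b}_{i,V} - b_i|,\ |\widehat{a}_{i,V} - a_i|\big) = O_p\Bigg(\sqrt{\frac{\log N}{T}} + \frac{1}{\sqrt{N}}\Bigg). \tag{B.53}$$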
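and
$$\begin{aligned}
& \max_{1\le i\le N}\bigg|\frac{1}{T}\sum_{t=1}^{T}CV_t\cdot V_{it} - \frac{1}{T}\sum_{t=1}^{T}CRV_t\cdot RV_{it}\bigg| \\
\le{}& 2\sqrt{\frac{1}{T}\sum_{t=1}^{T}CV_t^2}\cdot\sqrt{\max_{1\le i\le N}\frac{1}{T}\sum_{t=1}^{T}(V_{it} - RV_{it})^2} + \sqrt{\frac{1}{T}\sum_{t=1}^{T}(CV_t - CRV_t)^2}\cdot\sqrt{\max_{1\le i\le N}\frac{1}{T}\sum_{t=1}^{T}(V_{it} - RV_{it})^2}.
\end{aligned}$$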
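$$\begin{aligned}
\frac{1}{T}\sum_{t=1}^{T}(CV_t - CRV_t)^2 &= \frac{1}{TN^2}\sum_{t=1}^{T}\Bigg(\sum_{i=1}^{N}V_{it} - \sum_{i=1}^{N}RV_{it}\Bigg)^{2} \\
&\le \frac{1}{TN}\sum_{t=1}^{T}\sum_{i=1}^{N}(V_{it} - RV_{it})^2 \le \max_{1\le i\le N}\frac{1}{T}\sum_{t=1}^{T}(V_{it} - RV_{it})^2.
\end{aligned}$$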
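The chain above combines the Cauchy-Schwarz (or Jensen) inequality across stocks with the fact that an average is bounded by a maximum. It can be confirmed on simulated panels; in the sketch below the panels and the noise scale are toy assumptions, and `CV` and `CRV` are computed as equal-weighted cross-sectional averages, as the first equality in the display indicates.

```python
import numpy as np

rng = np.random.default_rng(4)
T, N = 250, 40
V = np.exp(rng.normal(-9.0, 0.5, size=(T, N)))      # toy integrated volatilities
RV = V * np.exp(rng.normal(0.0, 0.1, size=(T, N)))  # toy realized volatilities
CV, CRV = V.mean(axis=1), RV.mean(axis=1)           # cross-sectional averages

lhs = np.mean((CV - CRV) ** 2)                # (1/T) sum_t (CV_t - CRV_t)^2
mid = np.mean((V - RV) ** 2)                  # (1/(TN)) sum_t sum_i (V_it - RV_it)^2
rhs = np.max(np.mean((V - RV) ** 2, axis=0))  # max_i (1/T) sum_t (V_it - RV_it)^2
```

The ordering `lhs <= mid <= rhs` holds for any pair of panels, so the estimation error in the common factor is controlled by the worst-case error across individual stocks.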
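Under Assumption 6, we have $\frac{1}{T}\sum_{t=1}^{T}CV_t^2 = O_p(1)$. Moreover, similar to (B.39), one can show that $\max_{1\le i\le N}\frac{1}{T}\sum_{t=1}^{T}(RV_{it} - V_{it})^2 = O_p(\Delta_n)$. Combining the results above, we have that
$$\begin{aligned}
\frac{1}{T}\sum_{t=1}^{T}CRV_t^2 - \frac{1}{T}\sum_{t=1}^{T}CV_t^2 &= O_p\big(\sqrt{\Delta_n}\big), \text{ and} \\
\max_{1\le i\le N}\bigg|\frac{1}{T}\sum_{t=1}^{T}CV_t\cdot V_{it} - \frac{1}{T}\sum_{t=1}^{T}CRV_t\cdot RV_{it}\bigg| &= O_p\big(\sqrt{\Delta_n}\big).
\end{aligned} \tag{B.55}$$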
References
Cai, T Tony, Jianchang Hu, Yingying Li, and Xinghua Zheng (2020):
“High-dimensional minimum variance portfolio estimation based on high-frequency
data,” Journal of Econometrics, 214, 482–494.
Davis, Chandler and William Morton Kahan (1970): “The rotation of eigen-
vectors by a perturbation. III,” SIAM Journal on Numerical Analysis, 7, 1–46.
Ding, Yi, Robert Engle, Yingying Li, and Xinghua Zheng (2022): “Factor
modeling for volatility.”
Fan, Jianqing, Yingying Li, and Ke Yu (2012): “Vast volatility matrix esti-