1 jds997
1 jds997
6339/21-JDS997
January 2021                                                                       Statistical Data Science
                                                 Abstract
Climate change is widely recognized as one of the most challenging, urgent and complex problem
facing humanity. There are rising interests in understanding and quantifying climate changing.
We analyze the climate trend in Canada using Canadian monthly surface air temperature, which
is longitudinal data in nature with long time span. Analysis of such data is challenging due
to the complexity of modeling and associated computation burdens. In this paper, we divide
this type of longitudinal data into time blocks, conduct multivariate regression and utilize a
vine copula model to account for the dependence among the multivariate error terms. This
vine copula model allows separate specification of within-block and between-block dependence
structure and has great flexibility of modeling complex association structures. To release the
computational burden and concentrate on the structure of interest, we construct composite
likelihood functions, which leave the connecting structure between time blocks unspecified. We
discuss different estimation procedures and issues regarding model selection and prediction. We
explore the prediction performance of our vine copula model by extensive simulation studies.
An analysis of the Canada climate dataset is provided.
Keywords climate change; composite likelihood; longitudinal data; prediction
1     Introduction
Climate change is widely recognized as one of the most challenging, urgent and complex problem
facing humanity (OBrien, 2010). Its impacts are global in scope and unprecedented in scale, its
fingerprints are across natural systems (Parmesan and Yohe, 2003), and it exposes human-being
under the risks of but not limited to global food security (Wheeler and von Braun, 2013) and
forced immigration (Hugo, 2013). Climate change features global warming since the mid-20th
century and it is believed that human activities are responsible for the observed waming (Stocker
et al., 2013). Therefore, it is increasingly important to understand and quantify climate change
and the magnitude and the speed of global warming, which provides a basis for policy makers
and financial institutions to respond smartly (Lim et al., 2004; Fang et al., 2019).
     The objective of our research is to develop a new statistical model to better characterize
and forecast the trend of temperature changing. The temperature data is usually longitudinal
in nature and features long time span, which imposes challenges to conventional method for
longitudinal data analysis.
     Longitudinal data analysis, which studies the change of repeated observations of the same
subjects over time, has long been a thriving topic in statistical research. There have been sev-
eral approaches in this field, including multivariate analysis, linear and generalized linear mixed
and mixture models, generalized estimating equations, structural equation models, transition
methods, Bayesian methods and so on. A large body of books, papers and reviews discussed
and summarized the aforementioned topics, and for comprehensive summaries of different ap-
proaches, we can refer to Diggle et al. (2002), Hedeker and Gibbons (2006), Fitzmaurice et al.
(2009), Verbeke and Molenberghs (2009), Verbeke et al. (2014), for example. Analysis of intensive
longitudinal data (Walls and Schafer, 2006) is also an emerging area.
      Copula (Joe, 1997; Nelsen, 2007) is a powerful and flexible tool to model the multivariate
distribution and it allows separate models for marginal distribution and dependence structure.
To cope with the restrictions of multivariate copula, a graphical model, vine copula, (Joe, 1997;
Bedford and Cooke, 2002; Aas et al., 2009) was developed based on density decomposition and
bivariate copulas and it can model the multivariate distribution flexibly.
      The applications of copulas and vine copulas to longitudinal data are limited. Lambert and
Vandenhende (2002) introduced copula to model multivariate non-normal longitudinal data.
Smith et al. (2010) considered using D-Vine copula to model the serial dependence in time
series, but they focused more on the estimation of the vine copulas and did not include covariates
into the model. Ruscone and Osmetti (2016) and Smith (2015) consider using copula and vine
copula to model the multivariate time series. Killiches and Czado (2018) considered modeling
the unbalanced longitudinal data with a homogeneous vine copula model. Each bivariate copula
in the vine structure is assumed to have the form of Gaussian copula, so that the model can
be used to make prediction easily. Other studies include Frees and Wang (2006); Shen and
Weissfeld (2006); Domma et al. (2009); Madsen and Fang (2011); Shi and Yang (2018). Most of
these references considered a short time span, or used model selection methods to create sparse
vine structure, such as Smith et al. (2010).
      In this paper, we use vine copula model to describe the dependence structure of longitudi-
nal data with possible long time span by dividing data into different time blocks. The temporal
length of the longitudinal data determines the number of parameters in regular vine model,
which increases quadratically as time length increases. Thus directly using vine copula model
for the longitudinal data on a large time-span will introduce a large number of parameters and
hence create difficulties for parameter estimation. As a result, we consider using the compos-
ite likelihood (Lindsay, 1988; Varin, 2008; Varin et al., 2011; Lindsay et al., 2011; Yi, 2017)
to simplify the likelihood function and concentrate on the parameters of primary interest. We
also compare different estimation procedures, simultaneous estimation and two-stage estima-
tion, to further facilitate the fast inference of our proposed model. Moreover, we find out in
simulation studies that the composite likelihood provides robustness against misspecification on
structure linking between time blocks, accurate selection of the (conditional) bivariate copulas
and convenient structure for prediction. The proposed model yields promising prediction results
in terms of subject and time extrapolations in both simulation studies and analysis of Canadian
temperature data.
      The rest of the paper is organized as follows. In Section 2, we discuss the model formulation,
including marginal model and association model. In Section 3, we describe how to estimate the
parameters, and in Section 4, we give the procedure for copula selection and prediction based
on our model. In Section 5 and 6, simulation studies and analysis of Canadian temperature data
are provided, respectively.
                           A Vine Copula Model for Climate Trend Analysis                                        39
2     Model Formulation
Suppose that we are interested in modeling longitudinal data collected over a long period of
time, for instance, monthly temperature data over years. Such data usually features with natural
period (e.g., years) or exhibits a periodic pattern. To feature the periodic patterns, we examine
the data by periods, called time blocks in what follows, and let b denote the number of time
points in each time blocks. Suppose that we have a time blocks, let m = ab denote the total
number of observed occasions, and n subjects are observed at the m occasions. For longitudinal
data with no periodic pattern, we set a = 1. Let Yikl be the continuous response for the ith
subject at the lth time point in the kth time block, and let xikl be the associated covariate
matrices. Let Yik = (Yik1 , . . . , Yikb )T be the vector of responses of the ith subject in the kth time
block, and let Yi = (Yi1T , . . . , YiaT )T be the full vector of responses of subject i for i = 1, . . . , n and
k = 1, . . . , a. Let lower case letters yik and yi denote the realizations of of Yik and Yi , respectively,
and let xik and xi denote the corresponding covariates.
     We now introduce the joint model for Yi which shows the dependence of Yi on xi . It is difficult
to directly specify a meaningful joint distribution of Yi , given xi , to facilitate the dependence
structure of the components of Yi . To come up with an interpretable joint model for Yi given xi ,
we take two steps. In the first step, we characterize the dependence of Yi on xi via regression
models, which contain random errors; in the second step, we further delineate the dependence
structures of the components of Yi by characterizing the dependence structures of the random
errors resulted from the first step.
     Specifically, for i = 1, . . . , n, k = 1, . . . , a, and l = 1, . . . , b, we assume that
where μikl = E(Yikl |xikl ), and εikl is the associated random error term. We further assume that
                                                gl (μikl ) = xikl
                                                              T
                                                                  βl ,
where gl (·) is the link function and βl is the parameter vector associated with time l. Let
β = (β1T , . . . , βbT )T . For i = 1, . . . , n and k = 1, . . . , a, we let εik = (εik1 , . . . , εikb )T and εi =
  T             T T
(εi1 , . . . , εia ) .
       To reflect that responses from the same subject across time points are possibly associated,
in the next step, we focus on characterizing the dependence structure among the components of
εi using vine copula models.
εikl ∼ Fl (εikl ; ωl ),
T1 T2 T3
εi31 εi33 εi31 , εi32 εi32 , εi33 |εi31 εi32 , εi34 |εi31
Figure 1: A R-Vine structure for 4 time blocks and 4 time points within each block.
     We first introduce necessary notation before we give the mathematical form of the R-Vine
structure. For c = 1, . . . , a and d = 2, . . . , b + 1, let Gicd = {εicl : l = 1, . . . , d − 1}. For
s, g ∈ {1, . . . , a} and h, r ∈ {1, . . . , b}, let
                                  ⎧ g−1              
                                  ⎪
                                  ⎪     
                                  ⎪
                                  ⎪           Gic(b+1) ∪ Gish ∪ Gigr ,        if s < g − 1;
                                  ⎨
                       Dish,igr =      c=s+1
                                  ⎪
                                  ⎪ G
                                  ⎪   ish ∪ Gigr ,                            if s = g − 1;
                                  ⎪
                                  ⎩
                                    Gish ,                             if s = g and h < r.
     Furthermore, for a random variable Z1 , a random vector Z2 = (Z21 , . . . , Z2d1 )T and a random
vector Z3 = (Z31 , . . . , Z3d2 )T with 1 + d1 + d2 = m, let FZ1 Z2 Z3 (z1 , z2 , z3 ) denote the joint CDF of
Z1 ,Z2 and Z3 , with fZ1 Z2 Z3 as the corresponding density function. As a result, the joint density
of the random vector Z2 is derived as fZ2 (z2 ) =         fZ1 Z2 Z3 (z1 , z2 , z3 )dz1 dz3 , and the conditional
CDF of Z1 , given Z2 is
                                                                ∂ d1 FZ1 Z2 (z1 , z2 ) 1
                                    FZ1 |Z2 (z1 |z2 ) =                                      ,                             (2)
                                                                  ∂z21 . . . ∂z2d1 fZ2 (z2 )
where FZ1 Z2 (z1 , z2 ) = lim FZ1 Z2 Z3 (z1 , z2 , z3 ) is the joint CDF of Z1 and Z2 .
                          z3 →∞
      For εikh and εikr with h < r in the same time block k, let ckh,kr (·, ·) denote the conditional
copula density function between εikh and εikr , given the conditioning set Dikh,ikr , where the first
and second arguments in the copula density are given by uikh|Dikh,ikr = Fεikh |Dikh,ikr (εikh |Dikh,ikr )
and uikr|Dikh,ikr = Fεikr |Dikh,ikr (εikr |Dikh,ikr ) respectively, and Fεikh |Dikh,ikr and Fεikr |Dikh,ikr are the con-
ditional CDFs of εikh and εikr , given the conditioning set Dikh,ikr respectively, which are ob-
tained from (2) by letting Z1 = εikh or εikr , Z2 = Dikh,ikr and Z3 = εi \{εikh ∪ Dikh,ikr } or
εi \{εikr ∪ Dikh,ikr }.
      For εish and εigr in different time block with s < g, let csh,gr (·, ·) denotes the conditional
copula density function between εish and εigr , given the conditioning set Dish,igr , where the first
and second arguments in the copula density are given by uish|Dish,igr = Fεish |Dish,igr (εish |Dish,igr )
and uigr|Dish,igr = Fεigr |Dish,igr (εigr |Dish,igr ) respectively, and Fεish |Dish,igr and Fεigr |Dish,igr are the con-
ditional CDFs of εish and εigr , given the conditioning set Dish,igr respectively, which are ob-
tained from (2) by letting Z1 = εish or εigr , Z2 = Dish,igr and Z3 = εi \{εish ∪ Dish,igr } or
εi \{εigr ∪ Dish,igr }.
      Combining the marginal model and the dependence structures specified, we write the joint
density function of εi as
                                        a     b
               f (εi ; ω, θ, ψ) =                 fl (εikl ; ωl )
                                     k=1 l=1
                                             a b−1      b
                                    ×                           ckh,kr (uikh|Dikh,ikr , uikr|Dikh,ikr ; θkh,kr )
                                            k=1 h=1 r=h+1
                                         a−1       a       b     b                                                   
                                    ×                                 csh,gr (uish|Dish,igr , uigr|Dish,igr ; ψsh,gr ) ,   (3)
                                            s=1 g=s+1 h=1 r=1
where the product in the first set of brackets corresponds to the marginal densities of the εikl ,
the product in the second set of brackets corresponds to the C-Vine structure within time blocks
42                                                         Zhuang, H. et al.
by using (1).
3     Estimation Methods
Given the availability of the joint distribution of Yi , it is natural to use the likelihood method
to estimate the marginal parameters η and dependence parameters ϑ simultaneously. Let
Maximizing the likelihood function (5) with respect to η and ϑ gives the maximum likelihood
estimator of (ηT , ϑ T )T , denoted by (η̂T , ϑ̂ T )T .
     The likelihood method is conceptually easy to implement, and it yields consistent and ef-
ficient estimators if the associated models are correctly specified. However, this method has
two major limitations. Computationally, when the dimension of Yi increases, the number of pa-
rameters in the likelihood function will increase dramatically, and thus, using the likelihood for
estimation can be computationally prohibitive. Theoretically, the validity of the maximum likeli-
hood estimator hinges on the correctness of all the assumed models. Any model misspecification
may result in biased results.
                          A Vine Copula Model for Climate Trend Analysis                                           43
    To overcome the weakness of the likelihood method, we explore the alternative estimation
methods using the composite likelihood framework (Lindsay, 1988; Varin, 2008; Varin et al., 2011;
Lindsay et al., 2011; Yi, 2017), of which a review is provided in Section 2 of the supplementary
materials and the details of formulation are elaborated in following sections.
Matrices HCS (φ) and JCS (φ) can be estimated by their empirical counterparts with φ̂cs plugged
in, when using the asymptotic distribution to conduct inference about φ.
                                                   
where Lil (ηl ) = ak=1 fl yikl − gl−1 (xikl
                                         T
                                            βl ); ωl . Maximizing (8) with respect to ηl yields an esti-
mator of ηl , denoted by η̂l , for l = 1, . . . , b. Let η̂CT = (η̂1T , . . . , η̂bT )T . In the second stage, we
plug η̂CT into (7) and obtain Lc (η̂CT , θ ). Then maximizing Lc (η̂CT , θ ) with respect to θ provides
an estimator of θ , denoted   by θ̂CT . Let φ̂CT = (η̂CT  T     T T
                                                            , θ̂CT ) .
                   ∂ b
    Let Qi (η) = ∂η l=1 log[Lil (ηl )] and Ui (η, θ ) = ∂θ log[Lci (η, θ )]. Define
                                                                ∂
                                                        
                           ∂
                         ∂ηT i
                              Q (η)             0
         HCT (φ) = E ∂                    ∂                   and JCT (φ) = E[Wi (η, θ )Wi (η, θ )T ],
                        ∂ηT
                             Ui (η, θ ) ∂θ T
                                             U  i (η, θ )
where Wi (η, θ ) = (Qi (η)T , Ui (θ, η)T )T . Similarly, by the results of Varin (2008); Varin et al.
(2011) and Yi (2017), under regularity conditions, φ̂CT is a consistent estimator of φ, and
√
  n(φ̂CT − φ) has the asymptotic normal distribution with mean zero and covariance matrix
         −1
HCT (φ)JCT  (φ)HCT (φ). When conducting inference about φ using this asymptotic distribution,
HCT (φ) and JCT (φ) are estimated by their empirical counterparts with φ̂ct plugged in.
for l = 1, . . . , (h + 1). Then, the error terms of the h observed time points can be calculated and
transformed as “pseudo-observations", i.e., for l = 1, . . . , h,
Next, the conditional distribution of the error term at time (h + 1) can be approximated as
                                                                        f (ε̂ik1 , . . . , ε̂ikh , εik(h+1) )
                             f (εik(h+1) |ε̂ik1 , . . . , ε̂ikh ) =                                           ,
                                                                              f (ε̂ik1 , . . . , ε̂ikh )
which by (3), is equal to
                                                            h
               f (ε̂ik1 , . . . , ε̂ikh )fh+1 (εik(h+1) )       r=1 ckr,k(h+1) (ûikr|Dikr,ik(h+1) , uik(h+1)|Dikr,ik(h+1) )
                                                            f (ε̂ik1 , . . . , ε̂ikh )
                           A Vine Copula Model for Climate Trend Analysis                                   45
                                          h
            = fh+1 (εik(h+1) ; ω̂h+1 )         ckr,k(h+1) (ûikr|Dikr,ik(h+1) , uik(h+1)|Dikr,ik(h+1) ),    (9)
                                         r=1
where the conditional terms ûikr|Dikr,ik(h+1) and uik(h+1)|Dikr,ik(h+1) are calculated by applying the
                  ∂c (u ,u )            ∂c (u ,u )
formulas up|q = pq∂uqp q and uq|p = pq∂upp q iteratively, in which p and q can be any uncon-
ditional label, such as ikr, or conditional label, such as ikr|Dikr,ik(h+1) . As a result, the predicted
outcome ŷik(h+1) for subject i at time point (h + 1) in time block k is given by
with f (εik(h+1) |ε̂ik1 , . . . , ε̂ikh ) determined by (9). The prediction variance of ŷik(h+1) is calculated
as
where the first component is related to the marginal model at time h + 1, and the second
component can be calculated from the conditional density f (εik(h+1) |ε̂ik1 , . . . , ε̂ikh ).
5     Simulation Studies
We conduct extensive simulation studies to examine the finite sample performance of the pro-
posed composite likelihood under simultaneous and two-stage estimation procedures. To save
space, the details of the simulation designs, evaluation measures, simulation results concerning
efficiency, robustness, model selection and subject extrapolation (making prediction for a sample
out of training set) are reported in supplementary materials. Here we summarize the simula-
tion results which confirm that the proposed methods yield consistent estimates, with negligible
empirical biases, fairly agreeable empirical standard errors and asymptotic standard errors and
good coverage rates of 95% confidence intervals. The simultaneous procedure incurs moderate
efficiency loss, compared to the full likelihood method, as expected. It is more efficient than
the two-stage estimation procedure, at the price of a longer computational time. The composite
likelihood provides robustness against misspecification on structure linking between time blocks
and accurate selection of the (conditional) bivariate copulas. The prediction performance for
subject extrapolation is similar to that for time extrapolation, which is discussed in detail as
follows.
      We elaborate the studies to evaluate the time extrapolation (predicting for a future time)
using the proposed R-Vine model and compare it to that of the conventional regression models
and time-series models here. We consider various settings in Section 5.1, and report our findings
in Section 5.2, respectively.
 • Scenario 5: The error terms εi are simulated from an AR(1) structure instead of an R-Vine.
   We set ρ = 0.5 for m = ab = 20 time points. The marginal model is assumed to be
                                                                  
                                                  πj            πj
                      yij = 2.5 + 3.5xij − 50 sin      + 50 cos      + εij ,
                                                   2             2
   where εij are independently generated from N (0, 1) for i = 1, . . . , n and j = 1, . . . , m. The
   sine and cosine functions are used to model the periodic trend.
 • Scenario 6: We consider the same setting as that of Scenario 5, except that the marginal
   model does not contain the periodic sine and cosine functions but is of the form
 • VINE2 : The same as VINE1 except that the parameters are estimated using two-stage
   estimation procedure presented in Section 3.2.
 • VINE3 : The (conditional) bivariate copulas are selected using the methods presented in
   Section 4 and the parameters are estimated under simultaneous estimation.
 • VINE4 : The same as VINE3 except that the parameters are estimated under two-stage
   estimation.
 • MRM: We assume that the marginal model for the lth time point is identical across time
   blocks. A marginal regression model of the form (10) is fitted. The dependence structure is
   completely ignored.
 • LRM: A linear regression model is fitted, which takes both time block k and time point l as
   covariates and is of the form
Table 1: MAEs of different models for time extrapolation under the proposed scenarios. S: strong
dependence setting; M: moderate dependence setting.
Scenario     VINE1             VINE2             VINE3             VINE4              MRM               LRM                 AR
    1(S)   0.760   (1.076)   0.760   (1.082)   0.765   (1.076)   0.765   (1.083)   1.596   (1.954)   2.999   (2.823)   11.356 (7.681)
    1(M)   1.145   (1.344)   1.145   (1.352)   1.146   (1.345)   1.146   (1.352)   1.598   (1.963)   3.002   (2.832)   11.360 (7.682)
    2(S)   0.760   (1.076)   0.760   (1.083)   0.765   (1.076)   0.765   (1.083)   1.596   (1.953)   1.596   (1.953)    1.597 (1.951)
    2(M)   1.145   (1.344)   1.145   (1.352)   1.146   (1.344)   1.146   (1.353)   1.598   (1.963)   1.597   (1.963)    1.598 (1.962)
    3(S)   0.847   (0.663)   0.865   (0.675)   0.837   (0.664)   0.888   (0.675)   1.596   (1.942)   3.000   (2.818)   11.356 (7.665)
    3(M)   1.219   (1.168)   1.222   (1.190)   1.230   (1.169)   1.232   (1.190)   1.599   (1.951)   3.002   (2.827)   11.359 (7.781)
    4(S)   0.847   (0.663)   0.865   (0.675)   0.837   (0.664)   0.888   (0.675)   1.596   (1.942)   1.596   (1.942)    1.597 (2.976)
    4(M)   1.219   (1.168)   1.222   (1.190)   1.230   (1.169)   1.232   (1.190)   1.599   (1.951)   1.598   (1.951)    1.599 (2.981)
      5    0.830   (1.040)   0.830   (1.040)   0.830   (1.040)   0.830   (1.040)   0.922   (1.154)   0.922   (1.154)    0.920 (1.153)
      6    0.830   (1.040)   0.831   (1.040)   0.831   (1.040)   0.831   (1.040)   0.923   (1.156)   0.923   (1.156)    0.922 (1.154)
However, factoring in the computation cost, the improvement of using the former method over
the latter one seems marginal; in applications, it may not always be worthwhile to pursue the
simultaneous estimation method due to its computation cost. Incorporating the observation his-
tory can greatly reduce the prediction standard errors. Moreover, prediction standard errors
decrease as the strength of dependence increases.
     From the boxplots of MAEs by time points in the supplementary materials, we find the
MAEs for a later time point are always smaller and less variant when using the vine models,
which is the benefit of taking into account the dependence structure within time blocks.
6     Data Analysis
6.1    Description of the Dataset
We consider the climate data available publicly on the website of Government of Canada. It
is homogenized Canadian surface air temperature data (Vincent et al., 2012). The data is
available at https://www.canada.ca/en/environment-climate-change/services/climate-change/
science-research-data/climate-trends-variability/adjusted-homogenized-canadian-data.html. The
dataset we use contains monthly mean of daily mean temperature in Celsius degree at 47 On-
tarian observation stations from January 1978 to December 2018. Figure 2 is a run chart of the
monthly temperature of the 47 stations from January 1978 to December 2018, which obviously
exhibits a yearly periodic pattern and a mild overall increasing trend.
Figure 2: Monthly temperatures of all 47 stations from January 1978 to December 2018 (in grey)
overlaid by that of the yearly average temperatures of the 47 stations (in red).
and for l = 3, 4, 5, 6, 7, 8, 9,
Table 2: Summary of the selected bivariate copula functions for the C-Vine structure within each
year. Cl=Clayton, Fr=Frank, Ga=Gaussian, Gu=Gumbel, In=Independent, Jo=Joe, T=Stu-
dent t, T1=Tawn type 1, T2=Tawn Type 2, CG=Clayton-Gumbel mixed, JC=Joe-Clayton
mixed, JF=Joe-Frank mixed. R means rotated with rotated degree in the bracket and S means
survival copula.
          2      3   4       5       6    7       8      9        10       11        12     min(τ̂ ) max(τ̂ )
 1    RT1(180) T2 Ga       Cl      In SCl    Cl          Fr    Ga       In      In    −0.151           0.186
 2             JF SCl      In     SCl SCl   SCl         T1     Jo      T2       T2     0.000           0.215
 3                 T       Cl      In Cl RT1(90)         In   SJC    RT2(180)   T     −0.054           0.179
 4                       RT1(180) T Ga       In          In RCl(90)    SJF      Ga    −0.089           0.165
 5                                 In SCl RT2(180)       In   SCl       In      Jo     0.000           0.076
 6                                    Ga     JC          Fr    Fr    RJo(90)    In    −0.048           0.371
 7                                          SJC         SCl RJo(90)    Gu     RGu(90) −0.081           0.178
 8                                                       In RT2(180)    In      T     −0.109           0.111
 9                                                            SGu    RT2(180)   In     0.000           0.180
10                                                                   RT2(180) RCl(90) −0.077           0.053
11                                                                              JF     0.208           0.208
where the εikl are marginally distributed as N (0, σl2 ) for l = 1, . . . , 12, i = 1 . . . , n is the index
of observation stations and k = 1, . . . , 40 is the index of time block (year).
Table 3: The estimates of marginal parameters for each month l under simultaneous estimation
and two-stage estimation of composite likelihood method (standard error in the parentheses).
                                                 Two-Stage Estimation
 l          β0l               β1l              β2l             β2l2              β3l             β4l             σl
  1   −135.740(62.240)   −1.978(0.041)    −0.210(0.037)          −          −0.009(0.002)    0.101(0.030)   3.204(0.064)
  2   −34.827(27.651)    −1.739(0.042)    −0.256(0.039)          −          −0.008(0.003)    0.043(0.014)   3.164(0.067)
  3    22.785(89.626)    −1.429(0.119)   −44.929(19.069)   16.994(5.543)    −0.005(0.003)    0.021(0.044)   2.207(0.116)
  4     2.431(26.704)    −1.012(0.099)   −28.691(12.179)   25.054(4.627)    −0.003(0.002)    0.025(0.133)   1.944(0.090)
  5     44.601(6.172)    −0.708(0.100)   −14.893(26.378)   25.789(6.031)    −0.002(0.007)    0.007(0.102)   1.939(0.034)
  6   −100.571(21.824)    0.681(0.046)   −17.278(12.666)   22.259(3.084)    −0.002(0.003)    0.075(0.010)   1.578(0.034)
  7    30.540(45.497)    −0.626(0.036)   −21.546(5.831)    19.626(2.988)   −0.004(< 0.001)   0.010(0.022)   1.417(0.034)
  8    1.261(23.072)     −0.685(0.032)   −27.126(5.479)    15.480(3.112)    −0.006(0.001)    0.025(0.011)   1.482(0.032)
  9   −90.891(13.597)    −0.877(0.073)   −27.676(10.608)   8.679(3.121)     −0.007(0.001)    0.074(0.006)   1.335(0.063)
 10   −54.479(10.110)    −0.918(0.020)    −0.117(0.012)          −         −0.008(< 0.001)   0.048(0.005)   1.518(0.029)
 11   −28.415(11.045)    −1.275(0.028)    −0.061(0.018)          −         −0.009(< 0.001)   0.042(0.006)   2.085(0.047)
 12   −68.781(19.581)    −1.764(0.045)    −0.112(0.028)          −         −0.008(< 0.001)   0.068(0.010)   3.324(0.067)
     In the estimation results, β1l is negative for all 12 months, which suggests high-latitude
areas tend to have lower temperature year around and this trend is more obvious in winter
months (i.e., |β1l | is larger in months 1, 2, 3, 11 and 12). For winter months, i.e., months 1-2 and
10-12, the mean temperature has a linear negative relation with the longitude. For months 3-9
in spring and summer, the mean temperature has a quadratic relation with the longitude. β3l
is negative but close to zero, suggesting that as the elevation increases, the mean temperature
will slightly decrease. β1l , the annual temperature increase of the lth month in Celsius degree, is
positive in all 12 months, which suggests a mildly increasing trend of temperature change over
years. The findings perfectly align with our expectations.
     We are interested in both subject extrapolation (predicting temperature for a new station
based on geographical information and time) and time extrapolation (predicting temperature
for a future time). In practice, the former allows us to predict temperatures for locations without
a station and the latter allows us to forecasting future temperatures. For subject extrapolation,
we predict temperatures for the 5 stations in the test group from January 1978 to December
2008, of which the results are provided in Section 4.3 in Supplementary Materials. For time
extrapolation, we predict for 37 stations in the training group from January 2009 to December
2018. There are five stations closed after 2008 and data from January 2009 to December 2018
are not available. We are interested in short-term, mid-term and long-term prediction. For short-
term prediction, the prediction for the lth month is made based on information from previous
l − 1 months in the same year and the prediction of the first months is using the marginal
distribution; in other words, this is prediction for the next month. For mid-term prediction, the
prediction for the lth month is made based on the temperature in the first season (months 1-3)
in the same year, for l = 4, . . . , 12; in other words, this is the prediction made for the rest of the
year. For long-term prediction, we are predicting the change of the temperature in a decade.
     We compare the prediction performance of VINE4 with MRM, LRM and AR using the
evaluation metrics MAE and Percentage Outperformance as we did in the simulation studies:
  • MRM: The monthly marginal regression model (MRM) (11) and (12) without considering
    the dependence structure.
  • LRM: A linear regression model (LRM) includes month x5 as a covariate to account for the
52                                           Zhuang, H. et al.
     variation across months. The LRM model is selected by the AIC criterion and fitted to be
                      Yikl =β0 + β1 · latitude + β2 · longitude + β3 · elevation
                                             
                                             2
                             + β4 · year +          β5j · monthj + εikl ,
                                             j =1
7     Discussion
In this paper, we propose a R-Vine based regression model for analyzing longitudinal data with
long time span. We introduce composite likelihood methods which outperforms the likelihood-
based methods in terms of robustness and computational efficiency. We conduct extensive sim-
ulation studies to evaluate the performance of the proposed methods. The numerical studies
                      A Vine Copula Model for Climate Trend Analysis                         53
Figure 3: Boxplot of MAEs for the short-term (on the left) and mid-term (on the right) time
extrapolation.
suggest that the (conditional) bivariate copulas can still be accurately selected and the param-
eters of interest can be consistently estimated with moderate efficiency loss when simultaneous
procedure is used. Moreover, the model provides more precise prediction results than the con-
ventional models in both the simulation studies and the real data analysis. Time extrapolation
is what we usually care about in prediction problems, and both time and subject extrapolations
are valuable for imputing missing response values.
References
Aas K, Czado C, Frigessi A, Bakken H (2009). Pair-copula constructions of multiple dependence.
  Insurance: Mathematics and Economics, 44(2): 182–198.
Bedford T, Cooke RM (2002). Vines–a new graphical model for dependent random variables.
  Annals of Statististics, 30(4): 1031–1068.
Brechmann EC, Czado C, Aas K (2012). Truncated regular vines in high dimensions with ap-
  plication to financial data. Canadian Journal of Statistics, 40(1): 68–85.
54                                      Zhuang, H. et al.
Diggle P, Heagerty P, Heagerty PJ, Liang KY, Zeger S (2002). Analysis of Longitudinal Data.
  Oxford University Press.
Dissmann J, Brechmann EC, Czado C, Kurowicka D (2013). Selecting and estimating regular
  vine copulae and application to financial returns. Computational Statistics and Data Analysis,
  59(1): 52–69.
Domma F, Giordano S, Perri P (2009). Statistical modelling of temporal dependence in financial
  data via a copula function. Communications in Statistics – Simulation and Computation, 38:
  703–728.
Fang M, Tan KS, Wirjanto TS (2019). Sustainable portfolio management under climate change.
  Journal of Sustainable Finance & Investment, 9(1): 45–67.
Fitzmaurice G, Davidian M, Verbeke G, Molenberghs G (2009). Longitudinal Data Analysis.
  CRC Press.
Frees E, Wang P (2006). Copula credibility for aggregate loss models. Insurance: Mathematics
  and Economics, 38: 360–373.
Hedeker D, Gibbons RD (2006). Longitudinal Data Analysis. John Wiley & Sons.
Hugo G (2013). Migration and Climate Change. Edward Elgar Publishing Limited.
Joe H (1997). Multivariate Models and Multivariate Dependence Concepts. Chapman and
  Hall/CRC.
Killiches M, Czado C (2018). AD-vine copula-based model for repeated measurements extending
  linear mixed models with homogeneous correlation structure. Biometrics, 74(3): 997–1005.
Lambert P, Vandenhende F (2002). A copula-based model for multivariate non-normal longi-
  tudinal data: analysis of a dose titration safety study on a new antidepressant. Statistics in
  Medicine, 21: 3197–3217.
Lim B, Spanger-Siegfried E, Burton I, Malone E, Huq S (2004). Adaptation Policy Frame-
  works for Climate Change: Developing Strategies, Policies and Measures. Cambridge Univer-
  sity Press.
Lindsay B (1988). Composite likelihood methods. Contemporary Mathematics, 80: 220–239.
Lindsay BG, Yi GY, Sun J (2011). Issues and strategies in the selection of composite likelihoods.
  Statistica Sinica, 21: 71–105.
Madsen L, Fang Y (2011). Joint regression analysis for discrete longitudinal data. Biometrics,
  67: 1171–1176.
Nelsen RB (2007). An Introduction to Copulas. Springer Science & Business Media.
OBrien K (2010). Responding to climate change: The need for an integral approach. In: Integral
  Theory in Action: Applied, Theoretical, and Constructive Perspectives on the AQAL Model (S
  Esbjörn-Hargens, ed.), 65–78. SUNY Press.
Parmesan C, Yohe G (2003). A globally coherent fingerprint of climate change impacts across
  natural systems. Nature, 421(6918): 37–42.
Ruscone MN, Osmetti SA (2016). Modelling the dependence in multivariate longitudinal data
  by pair copula decomposition. In: Soft Methods for Data Science (MB FerraroPaolo Giordani,
  B Vantaggi, M Gagolewski, M Ángeles Gil, P Grzegorzewski, O Hryniewicz, eds.), 373–380.
  Springer.
Schepsmeier U, Stoeber J, Brechmann EC, Graeler B, Nagler T, Erhardt T, et al. (2020).
  VineCopula: Statistical Inference of Vine Copulas. R package version 2.4.1.
Shen C, Weissfeld L (2006). A copula model for repeated measurements with non-ignorable
  non-monotone missing outcome. Statistics in Medicine, 25: 2427–2440.
Shi P, Yang L (2018). Pair copula constructions for insurance experience rating. Journal of the
                      A Vine Copula Model for Climate Trend Analysis                         55