Copula Goodness of Fit Test
Abstract
Many proposals have been made recently for goodness-of-fit testing of copula models. After reviewing them briefly, the authors concentrate
on “blanket tests”, i.e., those whose implementation requires neither an arbitrary categorization of the data nor any strategic choice of smoothing
parameter, weight function, kernel, window, etc. The authors present a critical review of these procedures and suggest new ones. They describe
and interpret the results of a large Monte Carlo experiment designed to assess the effect of the sample size and the strength of dependence on
the level and power of the blanket tests for various combinations of copula models under the null hypothesis and the alternative. To circumvent
problems in the determination of the limiting distribution of the test statistics under composite null hypotheses, they recommend the use of a
double parametric bootstrap procedure, whose implementation is detailed. They conclude with a number of practical recommendations.
© 2007 Elsevier B.V. All rights reserved.
Keywords: Anderson–Darling statistic; Copula; Cramér–von Mises statistic; Gaussian process; Goodness-of-fit; Kendall’s tau; Kolmogorov–Smirnov statistic;
Monte Carlo simulation; Parametric bootstrap; Power study; Pseudo-observations; P-values
1. Introduction

Consider a continuous random vector $\mathbf{X} = (X_1, \ldots, X_d)$ with joint cumulative distribution function $H$ and margins $F_1, \ldots, F_d$. The copula representation of $H$ is given by $H(x_1, \ldots, x_d) = C\{F_1(x_1), \ldots, F_d(x_d)\}$, where $C$ is a unique cumulative distribution function having uniform margins on $(0, 1)$. A copula model for $\mathbf{X}$ arises when $C$ is unknown but assumed to belong to a class

$$\mathcal{C}_0 = \{C_\theta : \theta \in \mathcal{O}\},$$

where $\mathcal{O}$ is an open subset of $\mathbb{R}^p$ for some integer $p \ge 1$. The books of Joe (1997) and Nelsen (2006) provide handy compendiums of the most common parametric families of copulas.

Copula modeling has found many successful applications of late, notably in actuarial science, survival analysis and hydrology; see, e.g., Frees and Valdez (1998), Cui and Sun (2004) and Genest and Favre (2007) and references therein. However, nowhere has the methodology been adopted and used with greater intensity than in finance. Ample illustrations are provided in the books of Cherubini et al. (2004) and McNeil et al. (2005), notably in the context of asset pricing and credit risk management.

Given independent copies $\mathbf{X}_1 = (X_{11}, \ldots, X_{1d}), \ldots, \mathbf{X}_n = (X_{n1}, \ldots, X_{nd})$ of $\mathbf{X}$, the problem of estimating $\theta$ under the assumption

$$H_0 : C \in \mathcal{C}_0$$

has already been the object of much work; see, e.g., Genest et al. (1995), Shih and Louis (1995), Joe (1997, 2005), Tsukahara (2005) or Chen et al. (2006). However, the complementary issue of testing $H_0$ is only beginning to draw attention.

The situation is evolving rapidly but at this point in time, the literature on the subject can be divided broadly into three groups:

(1) Procedures developed for testing specific dependence structures such as the Normal copula (Malevergne and Sornette, 2003) or the equally popular Clayton family, also referred to as the gamma frailty model in survival analysis (Shih, 1998; Glidden, 1999; Cui and Sun, 2004).

∗ Corresponding author.
E-mail address: genest@mat.ulaval.ca (C. Genest).
doi:10.1016/j.insmatheco.2007.10.005
200 C. Genest et al. / Insurance: Mathematics and Economics 44 (2009) 199–213
(2) Statistics that can be used to test the goodness-of-fit of any class of copulas but whose implementation involves:
    (a) an arbitrary parameter, as in the rank-based statistic due to Wang and Wells (2000);
    (b) kernels, weight functions and associated smoothing parameters, as in Berg and Bakken (2005), Fermanian (2005), Panchenko (2005) and Scaillet (2007);
    (c) ad hoc categorization of the data into a multiway contingency table in order to apply an analogue of the standard chi-squared test, along the lines of Genest and Rivest (1993), Klugman and Parsa (1999), Andersen et al. (2005), Dobrić and Schmid (2005) or Junker and May (2005).
(3) “Blanket tests”, i.e., those applicable to all copula structures and requiring no strategic choice for their use. Included in this category are variants of the Wang–Wells approach due to Genest et al. (2006), but also the procedures investigated or used by Breymann et al. (2003), Genest and Rémillard (in press) and Dobrić and Schmid (2007).

And then there are authors who, in applied work, use standard goodness-of-fit statistics as a tool for choosing between several copulas, but without attempting to formally test whether the selected model is appropriate, in the light of a P-value. See, e.g., the analysis of stock index returns by Ané and Kharoubi (2003).

The purpose of this paper is to present a critical review of the blanket goodness-of-fit tests proposed to date, to suggest variants or improvements, and to compare the relative power of these procedures through a Monte Carlo study involving a large number of copula alternatives and dependence conditions. After some general considerations given in Section 2, existing tests are described in Section 3 and new statistics are proposed in Section 4. Listed in Section 5 are the factors considered in the study designed to assess the level and compare the power of the selected tests. Results are reported and discussed in Section 6. Finally, various observations and methodological recommendations are made in the Conclusion.

2. General considerations

There is a fundamental difference between the problem of estimating the dependence parameter of a copula model $\mathcal{C}_0 = \{C_\theta : \theta \in \mathcal{O}\}$ and the complementary issue of testing the validity of the null hypothesis $H_0 : C \in \mathcal{C}_0$ for some class $\mathcal{C}_0$ of copulas. The distinction is spelled out below, as it helps to understand the technical challenges associated with goodness-of-fit testing in this context.

2.1. Estimation

Two broad approaches to the estimation of the dependence parameter $\theta$ have been developed. They differ mainly through the user’s willingness to make parametric assumptions or not about the unknown margins.

Given specific choices of parametric families $\mathcal{F}_j = \{F_{\gamma_j} : \gamma_j \in \Gamma_j\}$ of univariate distributions, estimation can proceed via the full standard maximum likelihood method under $H_0$ and the additional assumption that

$$H_0' : F_1 \in \mathcal{F}_1, \ldots, F_d \in \mathcal{F}_d.$$

An alternative technique that is computationally more convenient has been advocated by Joe (1997). His “Inference Functions for Margins” or IFM approach proceeds in two steps: parametric estimates $F_{\hat\gamma_1}, \ldots, F_{\hat\gamma_d}$ of the margins are first obtained under $H_0'$; they are then plugged into the log-likelihood, viz.

$$L(\theta) = \sum_{i=1}^{n} \log[c_\theta\{F_{\hat\gamma_1}(X_{i1}), \ldots, F_{\hat\gamma_d}(X_{id})\}],$$

in which $c_\theta$ denotes the density of the copula $C_\theta$ (assuming that it exists). The function $L(\theta)$ is then maximized. As illustrated by Joe (2005), however, the gain in computational convenience often comes at the expense of efficiency. Kim et al. (2007) further show that an inappropriate choice of models for the margins may have detrimental effects on the estimation of the dependence parameter per se.

If one is unwilling to assume $H_0'$, nonparametric estimation of the margins must be used. The most natural choice consists in replacing $F_j$ by its empirical counterpart

$$\hat F_j(t) = \frac{1}{n} \sum_{i=1}^{n} \mathbf{1}(X_{ij} \le t),$$

and then estimating $\theta$ by the value $\hat\theta$ that maximizes the log pseudo-likelihood

$$\ell(\theta) = \sum_{i=1}^{n} \log[c_\theta\{\hat F_1(X_{i1}), \ldots, \hat F_d(X_{id})\}].$$

This amounts to working with the ranks of the observations, because for all $i \in \{1, \ldots, n\}$ and $j \in \{1, \ldots, d\}$, $R_{ij} = n \hat F_j(X_{ij})$ is the rank of $X_{ij}$ among $X_{1j}, \ldots, X_{nj}$.

The asymptotic normality of $\hat\theta$ was established by Genest et al. (1995), and by Shih and Louis (1995) in the presence of censorship. As shown by Genest and Werker (2002), however, this method is not asymptotically semi-parametrically efficient in general. See Klaassen and Wellner (1997) for a notable exception and Tsukahara (2005) or Chen et al. (2006) for other rank-based estimators.

2.2. Goodness-of-fit testing

When testing the hypothesis $H_0 : C \in \mathcal{C}_0$ that the dependence structure of a multivariate distribution is well-represented by a specific parametric family $\mathcal{C}_0$ of copulas, the option of modeling the margins by parametric families is no longer viable. For, it would be tantamount to testing the much narrower null hypothesis $H_0 \cap H_0'$ corresponding to a full parametric model. In this context, the marginal distributions $F_1, \ldots, F_d$ are (infinite-dimensional) nuisance parameters.

Given that the underlying copula $C$ of a random vector is invariant by continuous, strictly increasing transformations of its components, it appears that the only reasonable option
for testing $H_0$ consists of basing the inference on the maximally invariant statistics with respect to this set of transformations, i.e., the ranks. Indeed, all formal goodness-of-fit tests mentioned in the introduction are rank-based. Alternatively, they can be viewed as functions of the collection $\mathbf{U}_1 = (U_{11}, \ldots, U_{1d}), \ldots, \mathbf{U}_n = (U_{n1}, \ldots, U_{nd})$ of pseudo-observations deduced from the ranks, viz. $U_{ij} = R_{ij}/(n + 1) = n \hat F_j(X_{ij})/(n + 1)$, where the scaling factor $n/(n + 1)$ is only introduced to avoid potential problems with $c_\theta$ blowing up at the boundary of $[0, 1]^d$.

The pseudo-observations $\mathbf{U}_1, \ldots, \mathbf{U}_n$ can be interpreted as a sample from the underlying copula $C$. It is plain, however, that they are not mutually independent and that their components are only approximately uniform on $(0, 1)$. Accordingly, any inference procedure based on these constructs should take these features into account. As will be seen, testing procedures that mistakenly ignore these considerations not only lack power but fail to hold their nominal level.

3. “Blanket tests” currently available

This section describes five rank-based procedures that have been recently proposed for testing the goodness-of-fit of any class of d-variate copulas. Of all the tests listed in Section 1, these are the only ones that qualify as “blanket”, in the sense that they involve no parameter tuning or other strategic choices.

3.1. Two tests based on the empirical copula

As mentioned in Section 2, the pseudo-observations $\mathbf{U}_1, \ldots, \mathbf{U}_n$ constitute the maximally invariant statistics on which to test $H_0 : C \in \mathcal{C}_0$. The information they contain is conveniently summarized by the associated empirical distribution, viz.

$$C_n(u) = \frac{1}{n} \sum_{i=1}^{n} \mathbf{1}(U_{i1} \le u_1, \ldots, U_{id} \le u_d), \quad u = (u_1, \ldots, u_d) \in [0, 1]^d. \quad (1)$$

It is usually called the “empirical copula”, though it is neither a copula nor exactly the same (except asymptotically) as originally defined by Deheuvels (1979).

Gänßler and Stute (1987), Fermanian et al. (2004) and Tsukahara (2005) give various conditions under which $C_n$ (or slight variants thereof) is a consistent estimator of the true underlying copula $C$, i.e., whether $H_0$ is true or not. Given that it is entirely nonparametric, $C_n$ is arguably the most objective benchmark for testing $H_0 : C \in \mathcal{C}_0$. Therefore, natural goodness-of-fit tests consist in comparing a “distance” between $C_n$ and an estimation $C_{\theta_n}$ of $C$ obtained under $H_0$. Here and in the sequel, $\theta_n = \mathcal{T}_n(\mathbf{U}_1, \ldots, \mathbf{U}_n)$ stands for an estimate of $\theta$ derived from the pseudo-observations.

Goodness-of-fit tests based on the empirical process

$$\mathbb{C}_n = \sqrt{n}\,(C_n - C_{\theta_n})$$

are briefly considered by Fermanian (2005), who comments that they “seem to be unpractical, except by bootstrapping”. Genest and Rémillard (in press) examine the implementation issues in detail. In particular, they consider rank-based versions of the familiar Cramér–von Mises and Kolmogorov–Smirnov statistics, viz.

$$S_n = \int_{[0,1]^d} \mathbb{C}_n(u)^2 \, dC_n(u) \quad \text{and} \quad T_n = \sup_{u \in [0,1]^d} |\mathbb{C}_n(u)|. \quad (2)$$

Large values of these statistics lead to the rejection of $H_0$. Approximate P-values can be deduced from their limiting distributions, which depend on the asymptotic behavior of the process $\mathbb{C}_n$. Genest and Rémillard (in press) establish the convergence of the latter under appropriate regularity conditions on the parametric family $\mathcal{C}_0$ and the sequence $(\theta_n)$ of estimators. They also show that the tests based on $S_n$ and $T_n$ are consistent; i.e., if $C \notin \mathcal{C}_0$, then $H_0$ is rejected with probability 1 as $n \to \infty$.

In practice, the limiting distributions of $S_n$ and $T_n$ depend on the family of copulas under the composite null hypothesis, and on the unknown parameter value $\theta$ in particular. As a result, the asymptotic distribution of the test statistics cannot be tabulated and approximate P-values can only be obtained via specially adapted Monte Carlo methods. A specific parametric bootstrap procedure is described in Appendix A. Its validity is established by Genest and Rémillard (in press).

3.2. Two tests based on Kendall’s transform

Another avenue successively explored by Genest and Rivest (1993), Wang and Wells (2000) and Genest et al. (2006) consists in basing a test of $H_0$ on a probability integral transformation of the data. The specific mapping they consider is

$$\mathbf{X} \mapsto V = H(\mathbf{X}) = C(U_1, \ldots, U_d),$$

where $U_i = F_i(X_i)$ for $i \in \{1, \ldots, d\}$ and the joint distribution of $\mathbf{U} = (U_1, \ldots, U_d)$ is $C$. This has come to be called Kendall’s transform, because the expectation of $V$ is an affine transformation of the multivariate version of Kendall’s coefficient of concordance; see Barbe et al. (1996) or Jouini and Clemen (1996).

Let $K$ denote the (univariate) distribution function of $V$. Genest and Rivest (1993) show that $K$ can be estimated nonparametrically by the empirical distribution function of a rescaled version of the pseudo-observations $V_1 = C_n(\mathbf{U}_1), \ldots, V_n = C_n(\mathbf{U}_n)$. Barbe et al. (1996) give weak regularity conditions under which a central limit theorem can be proved for the slight variant

$$K_n(v) = \frac{1}{n} \sum_{i=1}^{n} \mathbf{1}(V_i \le v), \quad v \in [0, 1]. \quad (3)$$

In particular, the latter is a consistent estimator of the underlying distribution $K$.

Now under $H_0$, the vector $\mathbf{U} = (U_1, \ldots, U_d)$ is distributed as $C_\theta$ for some $\theta \in \mathcal{O}$, and hence the Kendall transform $C_\theta(\mathbf{U})$ has distribution $K_\theta$. Through a measure of distance between $K_n$ and a parametric estimation $K_{\theta_n}$ of $K$, one can test

$$H_0'' : K \in \mathcal{K}_0 = \{K_\theta : \theta \in \mathcal{O}\}.$$
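The rank-based Cramér–von Mises statistic $S_n$ of Section 3.1 and its bootstrap P-value can be sketched in a few lines. The example below is only an illustration under stated assumptions: it restricts attention to the bivariate Clayton family, estimates $\theta$ by inversion of Kendall's tau (one possible choice of $\mathcal{T}_n$, not necessarily the one used in the paper), and uses a generic parametric bootstrap loop rather than the exact algorithm of Appendix A. All function names are hypothetical.

```python
import numpy as np
from scipy.stats import kendalltau


def pseudo_observations(x):
    """U_ij = R_ij / (n + 1), with R_ij the rank of X_ij in column j (no ties assumed)."""
    x = np.asarray(x, dtype=float)
    ranks = x.argsort(axis=0).argsort(axis=0) + 1
    return ranks / (x.shape[0] + 1.0)


def clayton_cdf(u, theta):
    """Bivariate Clayton copula C_theta(u1, u2) for theta > 0."""
    return (u[:, 0] ** -theta + u[:, 1] ** -theta - 1.0) ** (-1.0 / theta)


def estimate_theta(u):
    """Assumed estimator: inversion of Kendall's tau, theta = 2 tau / (1 - tau)."""
    tau, _ = kendalltau(u[:, 0], u[:, 1])
    tau = min(max(tau, 0.01), 0.99)  # assumption: keep theta in a numerically safe range
    return 2.0 * tau / (1.0 - tau)


def empirical_copula(u):
    """C_n evaluated at each pseudo-observation U_i, as in Eq. (1)."""
    below = (u[None, :, :] <= u[:, None, :]).all(axis=2)
    return below.mean(axis=1)


def cvm_statistic(u, theta):
    """S_n = sum_i {C_n(U_i) - C_theta(U_i)}^2, the rank-based form of Eq. (2)."""
    return float(np.sum((empirical_copula(u) - clayton_cdf(u, theta)) ** 2))


def sample_clayton(n, theta, rng):
    """Draw n pairs from the Clayton copula by conditional inversion."""
    u1, w = rng.uniform(size=n), rng.uniform(size=n)
    u2 = (u1 ** -theta * (w ** (-theta / (1.0 + theta)) - 1.0) + 1.0) ** (-1.0 / theta)
    return np.column_stack([u1, u2])


def bootstrap_pvalue(x, n_boot=200, seed=0):
    """Parametric bootstrap: re-estimate theta and recompute S_n on each replicate."""
    rng = np.random.default_rng(seed)
    u = pseudo_observations(x)
    theta = estimate_theta(u)
    s_obs = cvm_statistic(u, theta)
    s_boot = np.empty(n_boot)
    for b in range(n_boot):
        ub = pseudo_observations(sample_clayton(len(u), theta, rng))
        s_boot[b] = cvm_statistic(ub, estimate_theta(ub))
    return s_obs, float(np.mean(s_boot > s_obs))
```

Re-ranking and re-estimating $\theta$ inside the loop is what lets the bootstrap replicates mimic the dependence of the limiting distribution on the estimated parameter; skipping either step would ignore the features of the pseudo-observations stressed in Section 2.2.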
Because $H_0 \subset H_0''$, of course, the nonrejection of $H_0''$ does not entail the acceptance of $H_0$. Consequently, tests based on the empirical process

$$\mathbb{K}_n = \sqrt{n}\,(K_n - K_{\theta_n})$$

are not generally consistent. Although they point out this limitation, Genest et al. (2006) investigate tests of $H_0$ based on this process. The idea had been put forward earlier (but not carried through) by Wang and Wells (2000) in the case of bivariate Archimedean copulas, for which $H_0''$ and $H_0$ are equivalent.

The specific statistics considered by Genest et al. (2006) are rank-based analogues of the Cramér–von Mises and Kolmogorov–Smirnov statistics, viz.

$$S_n^{(K)} = \int_0^1 \mathbb{K}_n(v)^2 \, dK_{\theta_n}(v) \quad \text{and} \quad T_n^{(K)} = \sup_{v \in [0,1]} |\mathbb{K}_n(v)|. \quad (4)$$

Large values of either one of these statistics lead to the rejection of $H_0''$. Approximate P-values can be deduced from their limiting distributions, which depend on the asymptotic behavior of $\mathbb{K}_n$. The convergence of the latter is established by Genest et al. (2006) under appropriate regularity conditions on the parametric families $\mathcal{C}_0$, $\mathcal{K}_0$, and the sequence $(\theta_n)$ of estimators.

As the asymptotic distributions of $S_n^{(K)}$ and $T_n^{(K)}$ depend both on the unknown copula $C_\theta$ and on $\theta$, approximate P-values for these statistics must again be found via simulation. See Appendix B for a parametric bootstrap procedure.

3.3. A test based on Rosenblatt’s transform

Another well-known probability integral transformation on which goodness-of-fit tests could be based is due to Rosenblatt (1952). This mapping, which is commonly used for simulation, provides a simple way of decomposing a random vector with a given distribution into mutually independent components that are uniformly distributed on the unit interval. Its standard definition is recalled below for convenience.

Definition. Rosenblatt’s probability integral transform of a copula $C$ is the mapping $\mathcal{R} : (0, 1)^d \to (0, 1)^d$ which to every $u = (u_1, \ldots, u_d) \in (0, 1)^d$ assigns another vector $\mathcal{R}(u) = (e_1, \ldots, e_d)$ with $e_1 = u_1$ and for each $i \in \{2, \ldots, d\}$,

$$e_i = \frac{\partial^{i-1} C(u_1, \ldots, u_i, 1, \ldots, 1)}{\partial u_1 \cdots \partial u_{i-1}} \bigg/ \frac{\partial^{i-1} C(u_1, \ldots, u_{i-1}, 1, \ldots, 1)}{\partial u_1 \cdots \partial u_{i-1}}. \quad (5)$$

A critical property of Rosenblatt’s transform is that $\mathbf{U}$ is distributed as $C$, denoted $\mathbf{U} \sim C$, if and only if the distribution of $\mathcal{R}(\mathbf{U})$ is the d-variate independence copula

$$C_\perp(e_1, \ldots, e_d) = e_1 \times \cdots \times e_d, \quad e_1, \ldots, e_d \in [0, 1].$$

Thus $H_0 : \mathbf{U} \sim C \in \mathcal{C}_0$ is equivalent to $H_0^* : \mathcal{R}_\theta(\mathbf{U}) \sim C_\perp$ for some $\theta \in \mathcal{O}$.

To test this hypothesis, therefore, one can use the fact that under $H_0$, the pseudo-observations $\mathbf{E}_1 = \mathcal{R}_{\theta_n}(\mathbf{U}_1), \ldots, \mathbf{E}_n = \mathcal{R}_{\theta_n}(\mathbf{U}_n)$ can be interpreted as a sample from the independence copula $C_\perp$. Of course, these pseudos are not mutually independent and only approximately uniform on $(0, 1)^d$. Any inference procedure involving these constructs should thus take these features into account. This point is raised though eventually ignored by Breymann et al. (2003).

To describe the procedure of Breymann et al. (2003), let $\Phi$ denote the cumulative distribution function of a standard $N(0, 1)$ random variable and define

$$\chi_i = \sum_{j=1}^{d} \{\Phi^{-1}(E_{ij})\}^2, \quad i \in \{1, \ldots, n\}.$$

Exploiting the fact that $\mathbf{E}_1, \ldots, \mathbf{E}_n$ are “approximately” uniformly distributed over $(0, 1)^d$, these authors argue that $\chi_1, \ldots, \chi_n$ can be interpreted as a sample from $G$, the distribution function of a chi-square random variable with $d$ degrees of freedom. Now a natural estimate of $G$ is the empirical distribution of the set $\chi_1, \ldots, \chi_n$, viz.

$$G_n(t) = \frac{1}{n} \sum_{i=1}^{n} \mathbf{1}(\chi_i \le t), \quad t \ge 0. \quad (6)$$

For convenience, Breymann et al. (2003) assume that the empirical process $\mathbb{G}_n = \sqrt{n}\,(G_n - G)$ behaves asymptotically as if $\mathbf{E}_1, \ldots, \mathbf{E}_n$ were exactly uniform. They further suppose that the asymptotic distribution is independent of $\theta$, and hence that it can be represented as $\beta \circ G$, where $\beta$ is the standard Brownian bridge.

Should these assumptions hold true, Breymann et al. (2003) argue that it would then be possible to test $H_0$ with the Anderson–Darling statistic

$$A_n = -n - \frac{1}{n} \sum_{i=1}^{n} (2i - 1)\,[\log\{G(\chi_{(i)})\} + \log\{1 - G(\chi_{(n+1-i)})\}], \quad (7)$$

where $\chi_{(1)} \le \cdots \le \chi_{(n)}$ are the order statistics corresponding to $\chi_1, \ldots, \chi_n$. The P-value would be simply given by reference to the limiting distribution of the original Anderson–Darling statistic; see e.g., Shorack and Wellner (1986).

As mentioned by Dobrić and Schmid (2007), however, the conclusions of Breymann et al. (2003) are too optimistic. Simulations show clearly that if the tabulated values of the Anderson–Darling statistic are used to perform their test, the resulting procedure has essentially no power and does not even maintain its nominal level.

To fix this problem, Dobrić and Schmid (2007) explain how the results of Genest and Rémillard (in press) could be exploited to compute reliable P-values for test statistics based on $\mathbb{G}_n$. In their paper, the Anderson–Darling test statistic $A_n$ is used, together with the parametric bootstrap procedure described in Appendix C. Note, however, that the validity of the parametric bootstrap depends critically on the existence of a limiting distribution for $A_n$. The conditions (if any) under which this happens remain to be determined. Nevertheless, this test was included in the simulation study.
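As a concrete illustration of Eqs. (5) and (7), the sketch below computes Rosenblatt's transform for the bivariate Clayton copula, where $e_2 = \partial C_\theta(u_1, u_2)/\partial u_1$ has a closed form, and then evaluates the Anderson–Darling statistic $A_n$. The Clayton family and the function names are assumptions made for the example. In line with the discussion above, the value of $A_n$ should be calibrated by a parametric bootstrap rather than compared with tabulated Anderson–Darling critical values.

```python
import numpy as np
from scipy.stats import chi2, norm


def rosenblatt_clayton(u, theta):
    """Rosenblatt transform (e1, e2) of bivariate points u under a Clayton copula."""
    u1, u2 = u[:, 0], u[:, 1]
    # e2 = dC_theta(u1, u2) / du1, the conditional distribution of U2 given U1 = u1
    e2 = u1 ** (-theta - 1.0) * (u1 ** -theta + u2 ** -theta - 1.0) ** (-1.0 / theta - 1.0)
    return np.column_stack([u1, e2])


def anderson_darling(e):
    """A_n of Eq. (7), computed from the transformed pseudo-observations E_1, ..., E_n."""
    n, d = e.shape
    # chi_i = sum_j {Phi^{-1}(E_ij)}^2, sorted to obtain the order statistics
    chi = np.sort(np.sum(norm.ppf(e) ** 2, axis=1))
    log_g = chi2.logcdf(chi, df=d)          # log G(chi_(i))
    log_1mg = chi2.logsf(chi, df=d)[::-1]   # log{1 - G(chi_(n+1-i))}
    i = np.arange(1, n + 1)
    return float(-n - np.sum((2 * i - 1) * (log_g + log_1mg)) / n)
```

Working with `logcdf` and `logsf` avoids taking the logarithm of a value rounded to 0 or 1 when some $\chi_i$ falls far in a tail.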
4. New procedures based on Rosenblatt’s transform

One avenue not covered by Breymann et al. (2003) or Dobrić and Schmid (2007) consists in working directly with the process, using the full power of Rosenblatt’s transform. The idea is not new, as it appeared in Klugman and Parsa (1999) for bivariate censored data. These authors propose a Pearson chi-square statistic computed from $\mathbf{E}_1, \ldots, \mathbf{E}_n$. However, their P-value calculation is incorrect, because it assumes wrongly that the limiting distribution is chi-square. The fact that the margins were estimated using parametric families is not taken into account in their work.

Under the null hypothesis $H_0$, the empirical distribution function

$$D_n(u) = \frac{1}{n} \sum_{i=1}^{n} \mathbf{1}(\mathbf{E}_i \le u), \quad u \in [0, 1]^d \quad (8)$$

associated with the pseudo-observations $\mathbf{E}_1, \ldots, \mathbf{E}_n$ should be “close” to $C_\perp$. Thus, any reasonable notion of distance between $D_n$ and $C_\perp$ is a good candidate for testing goodness-of-fit. Here, two Cramér–von Mises statistics are considered, namely

$$S_n^{(C)} = n \int_{[0,1]^d} \{D_n(u) - C_\perp(u)\}^2 \, dD_n(u) = \sum_{i=1}^{n} \{D_n(\mathbf{E}_i) - C_\perp(\mathbf{E}_i)\}^2 \quad (9)$$

and

$$S_n^{(B)} = n \int_{[0,1]^d} \{D_n(u) - C_\perp(u)\}^2 \, du = \frac{n}{3^d} - \frac{1}{2^{d-1}} \sum_{i=1}^{n} \prod_{k=1}^{d} \left(1 - E_{ik}^2\right) + \frac{1}{n} \sum_{i=1}^{n} \sum_{j=1}^{n} \prod_{k=1}^{d} \left(1 - E_{ik} \vee E_{jk}\right),$$

where $a \vee b = \max(a, b)$. These statistics only differ in their integration measure.

Using the tools described in the paper of Ghoudi and Rémillard (2004), one can easily determine the asymptotic null behavior of $\sqrt{n}\,(D_n - C_\perp)$ and, in turn, the convergence of $S_n^{(B)}$ and $S_n^{(C)}$. The limiting null distributions of these statistics are both unwieldy and, as in previous cases, they are functions both of the underlying copula and of its unknown parameter value $\theta$. Nevertheless, goodness-of-fit testing is possible through the parametric bootstrap procedure described in Appendix D.

5. Experimental design

A large-scale Monte Carlo experiment was conducted to assess the finite-sample properties of the proposed goodness-of-fit tests for various choices of dependence structures and degrees of association. Two characteristics of the tests were of interest: their ability to maintain their nominal level, arbitrarily fixed at 5% throughout the study, and their power under a variety of alternatives.

To curtail the computational effort, comparisons were limited to the bivariate case and to three degrees of dependence, viz. $\tau = 0.25, 0.50, 0.75$. Seven one-parameter families of copulas were also considered, both under the null hypothesis and under the alternative. They fall into three categories:

(1) Three meta-elliptical copula families uniquely determined from the following classical bivariate distributions with correlation coefficient $\rho = \sin(\pi\tau/2)$:
    (a) the Gaussian distribution;
    (b) the Student distribution with $\nu = 4$ degrees of freedom;
    (c) the Pearson type II distribution with $\nu = 4$ degrees of freedom.
(2) Three of the most common Archimedean copula models, namely
    (a) the Clayton family, also known in the survival analysis literature as the gamma frailty model (Clayton, 1978; Cook and Johnson, 1981);
    (b) the Frank family (Nelsen, 1986; Genest, 1987);
    (c) the Gumbel–Hougaard family originally considered by Gumbel (1960) in the context of extreme-value theory.
(3) The Plackett family of copulas (Plackett, 1965).

The class of meta-elliptical copulas was introduced by Fang et al. (2002, 2005); its properties were examined by Frahm et al. (2003) and Abdous et al. (2005). These dependence structures are popular in actuarial science and in finance, where data often (but not always) exhibit heavy-tail dependence; see Malevergne and Sornette (2003), Cherubini et al. (2004) and McNeil et al. (2005) and references therein.

The Archimedean models are also commonly used in practice, particularly in survival analysis, because of their interpretation as mixture models and the natural extension they provide for Cox’s proportional hazards model; see, e.g., Oakes (1989), Faraggi and Korn (1996) or Wang and Wells (2000). Refer also to Frees and Valdez (1998) and Klugman and Parsa (1999) for actuarial applications.

Finally, the Plackett system of distributions, which is neither Archimedean nor meta-elliptical, has found applications in biostatistics because of its constant cross-ratio property; see, e.g., Burzykowski et al. (2004). Dobrić and Schmid (2005), among others, investigated the relevance of this specific copula model in a financial context.

For every possible choice of copula and fixed value of $\tau$, 10,000 random samples of size $n = 50$ were generated. An equal number of samples of size $n = 150$ was also obtained. Each of these samples was then used to test the goodness-of-fit of the seven families of distributions. Each of the following eight tests was applied in turn:

(1) The two tests derived by Genest and Rémillard (in press) from the empirical copula process, i.e., those based on the statistics $S_n$ and $T_n$.
(2) The two tests developed by Genest et al. (2006) using Kendall’s transform, i.e., those involving statistics $S_n^{(K)}$ and $T_n^{(K)}$.
(3) The test of Breymann et al. (2003) based on the statistic $A_n$ and its corrected version developed by Dobrić and Schmid (2007), which both rely on Rosenblatt’s transform.
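The two Cramér–von Mises statistics of Section 4 have explicit finite-sample expressions, so they are easy to compute once the transformed pseudo-observations $\mathbf{E}_1, \ldots, \mathbf{E}_n$ are available. The following is a minimal sketch: the function names `sn_b` and `sn_c` are hypothetical, and the input `e` is assumed to be an $n \times d$ array with entries in $(0, 1)$.

```python
import numpy as np


def sn_b(e):
    """S_n^(B): Cramér–von Mises distance integrated with respect to Lebesgue measure."""
    n, d = e.shape
    term1 = n / 3.0 ** d
    term2 = np.sum(np.prod(1.0 - e ** 2, axis=1)) / 2.0 ** (d - 1)
    # pairwise componentwise maxima E_ik v E_jk, shape (n, n, d)
    pair_max = np.maximum(e[:, None, :], e[None, :, :])
    term3 = np.sum(np.prod(1.0 - pair_max, axis=2)) / n
    return float(term1 - term2 + term3)


def sn_c(e):
    """S_n^(C) of Eq. (9): the same distance integrated with respect to D_n."""
    # D_n(E_i): proportion of points E_k lying componentwise below E_i
    d_n = (e[None, :, :] <= e[:, None, :]).all(axis=2).mean(axis=1)
    c_perp = np.prod(e, axis=1)  # independence copula evaluated at E_i
    return float(np.sum((d_n - c_perp) ** 2))
```

Both functions build $n \times n \times d$ arrays, which is convenient for the sample sizes considered here ($n = 50$ and $n = 150$) but would need to be replaced by a loop or a blocked computation for much larger samples.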
Table 1
Percentage of rejection of H0 by various tests for data sets of size n = 150 arising from different copula models with τ = 0.25
the seven tests and the 21 = 7 × 3 combinations of null hypothesis $\mathcal{C}_0$ and level of dependence $\tau$. The data for $n = 50$ (from Beaudoin (2007)) and $n = 150$ (from Tables 1–3) are in the top and bottom panel, respectively.

In Fig. 1, the dimensions of each box are defined by the three quartiles of the empirical distribution of levels; outliers are indicated by open dots. The graphs show that overall, the parametric bootstrap algorithm does a very good job of approximating the null distribution of the various statistics. Except in a few cases, the performance is quite acceptable when $n = 50$. It is almost irreproachable when $n = 150$.

6.2. Effect of sample size

It is a classical fact of statistics that the power of a test increases with sample size. As Fig. 2 clearly shows, the present
Table 2
Percentage of rejection of H0 by various tests for data sets of size n = 150 arising from different copula models with τ = 0.50
case is no exception. The box plots displayed there portray the variation in the ratio power ($n = 150$)/power ($n = 50$) for each of the seven tests, as observed across 126 = 7 × 6 × 3 combinations of factors $\mathcal{C}_0$, $C$ and $\tau$, when the first two factors are different.

One can readily see from Fig. 2 that on average, the tests double their power as sample size goes from $n = 50$ to 150. In many instances, the improvement is more than four-fold but needless to say, it would quickly level off (to 1) as $n$ keeps growing. However, there are also a few cases where no gain in power occurs. It is instructive to examine more carefully what happens in those extreme cases.

(1) What are the outliers identified in Fig. 2 and why is the increase in power so large in those cases?
    (a) Most outliers occur either at $\tau = 0.25$ or 0.75.
    (b) The statistics $S_n$, $T_n$ and $S_n^{(B)}$ have very few outliers, if any.
Table 3
Percentage of rejection of H0 by various tests for data sets of size n = 150 arising from different copula models with τ = 0.75
    (c) The outliers at $\tau = 0.25$ are for $S_n^{(K)}$ and $T_n^{(K)}$, which prove particularly apt at detecting that data are not of the Clayton type as $n$ increases.
    (d) Most of the outliers at $\tau = 0.75$ are for $T_n^{(K)}$ and $A_n$; when $n = 150$, the first is much better as a goodness-of-fit test for the Frank copula, while the second can discriminate a Clayton dependence structure more easily.
(2) In what cases does one observe an increase in power of 10% or less (as identified by the vertical line crossing the box plots), and why?
    (a) This phenomenon occurs mostly when $\tau = 0.25$ or 0.75, and twice as often in the former case than in the latter.
    (b) This problem spares $S_n^{(K)}$ and $T_n^{(K)}$ and affects all others equally.
Fig. 3. Samples of size n = 50 from seven different copulas with parameter τ = 0.50. From left to right, and top to bottom: Clayton, Gumbel–Hougaard, Frank,
Plackett, Normal, Student and Pearson with 4 degrees of freedom.
Fig. 4. Samples of size n = 1000 from seven different copulas with parameter τ = 0.50. From left to right, and top to bottom: Clayton, Gumbel–Hougaard, Frank,
Plackett, Normal, Student and Pearson with 4 degrees of freedom.
on the combination of factors $\tau$, $C$, and $\mathcal{C}_0$. From the additional tables reported in Beaudoin (2007), a dependence on $n$ is also evident.

In practice, of course, only the last two factors are known for sure, i.e., the null hypothesis under investigation and the sample size. At the expense of mild “data snooping”, one can get also a fairly good idea of the level of dependence in the data, as measured by Kendall’s tau. Prior knowledge of the exact nature of dependence in the data, however, would defeat the purpose of goodness-of-fit testing.

In order to extract methodological recommendations from the mass of data contained in Tables 1–3, it is convenient to rank the tests from 1 to 7 in each of the 126 = 7 × 6 × 3 experimental conditions corresponding to the seven possible choices of $\mathcal{C}_0$, the six alternatives $C$, and the three values of tau. The sample size was fixed at $n = 150$ throughout, as those results are less subject to random variation and possibly more representative of situations one would encounter in practice.

Table 4 displays average ranks computed over the alternatives, for given $\mathcal{C}_0$ and $\tau$. In the table, the best test is highlighted in each of the 21 = 7 × 3 scenarios considered. In Table 4, the tests are ranked from 1 to 7 in increasing order of power. Based on the number of times each test had the highest rank, it appears that:

(1) The best procedures overall are those based on $S_n^{(B)}$, $S_n$ and $S_n^{(K)}$ with 6.5, 5 and 5 “wins”, respectively.
(2) The tests based on $S_n^{(C)}$ and $A_n$ are average with 2.5 and 2 wins, respectively.
(3) The performance of the tests involving $T_n$ and $T_n^{(K)}$ is much less impressive, as they had no victory.

These observations are consistent with the common wisdom of the goodness-of-fit literature, to the effect that test statistics based on the Cramér–von Mises functional of a process tend to be more powerful than those based on the Kolmogorov–Smirnov distance taken on the same process.

A similar message is conveyed by the average ranks reported at the bottom of Table 4, which yield the following preference ranking:

$$S_n^{(B)} \succ S_n \succ S_n^{(K)} \succ S_n^{(C)} \succ T_n \succ A_n \succ T_n^{(K)}.$$

Although their differences may not be statistically significant, these means suggest that the tests based on $T_n$, $A_n$ and $T_n^{(K)}$ are much less powerful than the others.

Other salient features of Table 4 are as follows:

(1) Among the tests based on a Cramér–von Mises statistic, there seems to be little to choose between a construction involving $C_n$, $K_n$ or Rosenblatt’s transform. Their averages are comparable, as are their respective number of wins (although $S_n^{(C)}$ had only 2.5 wins).
(2) The statistic $S_n$ unequivocally yields the most powerful test of the Clayton hypothesis; it also does quite well for goodness-of-fit testing of Frank’s model.
Table 4
Average ranking over factor C of the seven goodness-of-fit tests in 21 = 7 × 3 combinations of factors C0 and τ

                           Test based on
H0                τ        Sn     Tn     Sn(K)  Tn(K)  Sn(B)  Sn(C)  An
Clayton           0.25     7.0    3.5    3.3    1.8    5.8    5.0    1.5
                  0.50     7.0    3.2    3.8    2.0    6.0    5.0    1.0
                  0.75     5.9    3.5    4.0    2.0    5.8    5.8    1.0
Gumbel–Hougaard   0.25     3.6    4.0    7.0    5.6    3.7    2.3    1.8
                  0.50     3.5    2.8    6.3    3.5    6.2    4.7    1.0
                  0.75     2.9    2.7    4.8    2.7    6.8    5.8    2.5
Frank             0.25     4.6    3.4    5.3    4.0    2.8    4.0    3.8
                  0.50     5.3    2.8    5.5    4.0    3.3    5.2    1.8
                  0.75     5.8    2.3    4.8    2.2    3.6    5.6    3.7
Plackett          0.25     4.2    4.1    4.8    4.2    3.8    2.8    4.3
                  0.50     4.9    4.3    4.8    3.3    3.9    2.7    4.1
                  0.75     4.8    5.0    4.6    2.8    3.3    2.3    5.2
Normal            0.25     4.7    3.8    3.7    2.4    5.0    4.8    3.8
                  0.50     4.3    3.3    4.3    2.8    5.0    4.7    3.7
                  0.75     4.3    3.2    3.7    2.5    5.5    4.8    4.1
Student 4 df      0.25     4.6    4.4    4.5    2.6    4.7    1.9    5.3
                  0.50     4.8    3.9    5.1    3.0    5.7    2.3    3.2
                  0.75     3.7    3.3    4.2    3.3    5.2    3.5    4.8
Pearson 4 df      0.25     4.2    2.2    3.1    2.3    5.3    6.5    4.5
                  0.50     4.8    3.0    3.3    1.8    6.2    6.2    2.7
                  0.75     4.8    2.3    3.8    1.9    5.8    5.9    3.5
Average                    4.75   3.38   4.50   2.89   4.92   4.36   3.21
Standard error             1.04   0.75   0.99   0.96   1.15   1.45   1.39
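As a check on how the summary figures relate to the entries of Table 4, the short sketch below recomputes the column means and the number of "wins" per test from the rounded ranks printed in the table. The names `TESTS`, `RANKS`, `mean_ranks` and `count_wins` are illustrative only, and the rule that tied tests share a win equally is an assumption (it is the reading under which the half-wins quoted in the text come out). Because the inputs are rounded to one decimal, recomputed means may differ from the printed averages in the second decimal.

```python
# Average ranks transcribed from Table 4 (7 = most powerful, 1 = least powerful).
TESTS = ["S_n", "T_n", "S_n^(K)", "T_n^(K)", "S_n^(B)", "S_n^(C)", "A_n"]
RANKS = {
    ("Clayton", 0.25): [7.0, 3.5, 3.3, 1.8, 5.8, 5.0, 1.5],
    ("Clayton", 0.50): [7.0, 3.2, 3.8, 2.0, 6.0, 5.0, 1.0],
    ("Clayton", 0.75): [5.9, 3.5, 4.0, 2.0, 5.8, 5.8, 1.0],
    ("Gumbel-Hougaard", 0.25): [3.6, 4.0, 7.0, 5.6, 3.7, 2.3, 1.8],
    ("Gumbel-Hougaard", 0.50): [3.5, 2.8, 6.3, 3.5, 6.2, 4.7, 1.0],
    ("Gumbel-Hougaard", 0.75): [2.9, 2.7, 4.8, 2.7, 6.8, 5.8, 2.5],
    ("Frank", 0.25): [4.6, 3.4, 5.3, 4.0, 2.8, 4.0, 3.8],
    ("Frank", 0.50): [5.3, 2.8, 5.5, 4.0, 3.3, 5.2, 1.8],
    ("Frank", 0.75): [5.8, 2.3, 4.8, 2.2, 3.6, 5.6, 3.7],
    ("Plackett", 0.25): [4.2, 4.1, 4.8, 4.2, 3.8, 2.8, 4.3],
    ("Plackett", 0.50): [4.9, 4.3, 4.8, 3.3, 3.9, 2.7, 4.1],
    ("Plackett", 0.75): [4.8, 5.0, 4.6, 2.8, 3.3, 2.3, 5.2],
    ("Normal", 0.25): [4.7, 3.8, 3.7, 2.4, 5.0, 4.8, 3.8],
    ("Normal", 0.50): [4.3, 3.3, 4.3, 2.8, 5.0, 4.7, 3.7],
    ("Normal", 0.75): [4.3, 3.2, 3.7, 2.5, 5.5, 4.8, 4.1],
    ("Student", 0.25): [4.6, 4.4, 4.5, 2.6, 4.7, 1.9, 5.3],
    ("Student", 0.50): [4.8, 3.9, 5.1, 3.0, 5.7, 2.3, 3.2],
    ("Student", 0.75): [3.7, 3.3, 4.2, 3.3, 5.2, 3.5, 4.8],
    ("Pearson", 0.25): [4.2, 2.2, 3.1, 2.3, 5.3, 6.5, 4.5],
    ("Pearson", 0.50): [4.8, 3.0, 3.3, 1.8, 6.2, 6.2, 2.7],
    ("Pearson", 0.75): [4.8, 2.3, 3.8, 1.9, 5.8, 5.9, 3.5],
}


def mean_ranks(ranks):
    """Mean rank of each test over all scenarios (rows of the table)."""
    n = len(ranks)
    return {t: sum(row[j] for row in ranks.values()) / n
            for j, t in enumerate(TESTS)}


def count_wins(ranks):
    """Scenarios won by each test; tied tests share the win equally (assumption)."""
    wins = dict.fromkeys(TESTS, 0.0)
    for row in ranks.values():
        best = max(row)
        winners = [t for t, r in zip(TESTS, row) if r == best]
        for t in winners:
            wins[t] += 1 / len(winners)
    return wins


means = mean_ranks(RANKS)
ordering = sorted(TESTS, key=means.get, reverse=True)  # most to least powerful
wins = count_wins(RANKS)
print(" > ".join(ordering))
print(wins)
```

Sorting by mean rank and counting wins are two different summaries: a test can win few scenarios outright and still sit high in the ordering if it is rarely among the worst.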
(3) The test based on $S_n^{(B)}$ seems particularly good at detecting the lack of Normal or Student types of dependence, while $S_n^{(C)}$ is most powerful for the Pearson hypothesis; it would be interesting to see whether this conclusion extends to other meta-elliptical copula structures.
(4) Among the tests constructed using the Kendall transform, the procedure based on $S_n^{(K)}$ was far superior and offered the best performance when testing the goodness-of-fit of Gumbel–Hougaard and Frank copula structures.
(5) No clear recommendation emerges for goodness-of-fit testing of the Plackett copula.

7. Observations and recommendations

Based on the experience gained from carrying out this comparative power study of the existing blanket goodness-of-fit tests for copula models, the following general observations and specific recommendations can be made.

I. General observations:
(a) In goodness-of-fit testing as in any other inferential context, the greater the sample size, the better. Large data sets not only help to distinguish between copula models but play a role in the reliability of the parametric bootstrap procedures used to approximate the statistics’ null distribution.
(b) In order for the double bootstrap to be efficient, the number m of repetitions must be substantially larger than the sample size n. In the present study, m = 2500 was found to be an acceptable minimum. While this is not a problem when using a test once, it quickly becomes computationally demanding in the context of a simulation study. In the present case, the recourse to a double bootstrap whenever $C_\theta$ or $K_\theta$ was not available in closed form made it totally impractical to run the experiment at a sample size of n = 250, for lack of sufficient computing resources.
(c) In this regard, the tests based on $A_n$, $S_n^{(B)}$ and $S_n^{(C)}$ are at an advantage: because they rely on Rosenblatt’s transform, a single bootstrap is enough to approximate their null distribution and extract P-values. However, the value of these statistics depends on the order in which the variables are successively conditioned. While it is traditional to take $U_2 \mid U_1$, $U_3 \mid (U_1, U_2)$, ... as in (5), any other sequence could be used. Different decisions could possibly ensue. (This point will need to be the object of future research.)
(d) When statistics based on Cramér–von Mises and Kolmogorov–Smirnov functionals of the same empirical process are compared, the former are almost invariably more powerful. The present simulations and those reported earlier by Genest et al. (2006) both point strongly in that direction.

II. Specific recommendations, based on the present state of knowledge:
(a) Overall, statistics $S_n$ and $S_n^{(B)}$ yield the best blanket goodness-of-fit test procedures for copula models.
While $S_n^{(B)}$ is slightly more consistent than $S_n$ in its performance across models and can be implemented without ever calling upon a double bootstrap, it relies on a nonunique (and therefore somewhat arbitrary) Rosenblatt transform.
(b) Statistics $S_n^{(C)}$ and $S_n^{(K)}$ are also recommendable, and the latter is especially convenient when the null hypothesis is Archimedean, since the Kendall distribution $K$ is then available in closed form.
(c) The jury is still out on the merits of the test based on $A_n$. Anderson–Darling type statistics have proved useful in many other contexts, particularly in circumstances where differences in the tail of a distribution were deemed to be important. While it seems plausible that the same would hold in a copula context, the simulation results are not convincing in this regard. The asymptotic behavior of this statistic also remains to be studied.
(d) There are no strong arguments in favor of using the tests based on $T_n$ or $T_n^{(K)}$. As for the uncorrected version of the test proposed by Breymann et al. (2003), it should never be used.

In future work, it would be interesting to investigate the sensitivity of tests based on the Rosenblatt transform to the order in which conditioning is done. It would also be useful to expand the present study to include comparisons with general goodness-of-fit tests involving tuning parameters, as well as with procedures developed to test for specific dependence structures such as the Clayton or the Normal copula.

On the theoretical front, several of the procedures that have been proposed recently for goodness-of-fit testing of copula models remain on shaky grounds. As illustrated by the appalling performance of the test proposed by Breymann et al. (2003), the dependence between pseudo-observations must imperatively be taken into account.

Nontrivial mathematics are required before one can conclude (or not) that the limiting distribution of a rank-based statistic is the same as in the classical multivariate context in which it was originally developed. Furthermore, conditions are required for the convergence of bootstrap algorithms, and failure to check them may lead to disaster. No sleight of hand will change that fact.

(1) Compute $C_n$ as per formula (1) and estimate $\theta$ with $\theta_n = T_n(U_1, \ldots, U_n)$.
(2) If there is an analytical expression for $C_\theta$, compute the value of $S_n$, as defined in (2). Otherwise, proceed by Monte Carlo approximation. Specifically, choose $m \ge n$ and carry out the following extra steps:
    (a) Generate a random sample $U_1^*, \ldots, U_m^*$ from distribution $C_{\theta_n}$.
    (b) Approximate $C_{\theta_n}$ by
        $$B_m^*(u) = \frac{1}{m} \sum_{i=1}^{m} \mathbf{1}(U_i^* \le u), \quad u \in [0, 1]^d.$$
    (c) Approximate $S_n$ by
        $$S_n = \sum_{i=1}^{n} \{C_n(U_i) - B_m^*(U_i)\}^2.$$
(3) For some large integer $N$, repeat the following steps for every $k \in \{1, \ldots, N\}$:
    (a) Generate a random sample $Y_{1,k}^*, \ldots, Y_{n,k}^*$ from distribution $C_{\theta_n}$ and compute their associated rank vectors $R_{1,k}^*, \ldots, R_{n,k}^*$.
    (b) Compute $U_{i,k}^* = R_{i,k}^*/(n + 1)$ for $i \in \{1, \ldots, n\}$ and let
        $$C_{n,k}^*(u) = \frac{1}{n} \sum_{i=1}^{n} \mathbf{1}(U_{i,k}^* \le u), \quad u \in [0, 1]^d,$$
        and estimate $\theta$ by $\theta_{n,k}^* = T_n(U_{1,k}^*, \ldots, U_{n,k}^*)$.
    (c) If there is an analytical expression for $C_\theta$, let
        $$S_{n,k}^* = \sum_{i=1}^{n} \{C_{n,k}^*(U_{i,k}^*) - C_{\theta_{n,k}^*}(U_{i,k}^*)\}^2.$$
        Otherwise, proceed as follows:
        (i) Generate a random sample $Y_{1,k}^{**}, \ldots, Y_{m,k}^{**}$ from distribution $C_{\theta_{n,k}^*}$.
        (ii) Approximate $C_{\theta_{n,k}^*}$ by
            $$B_{m,k}^{**}(u) = \frac{1}{m} \sum_{i=1}^{m} \mathbf{1}(Y_{i,k}^{**} \le u), \quad u \in [0, 1]^d,$$
            and let
            $$S_{n,k}^* = \sum_{i=1}^{n} \{C_{n,k}^*(U_{i,k}^*) - B_{m,k}^{**}(U_{i,k}^*)\}^2.$$

An approximate P-value for the test is then given by $\sum_{k=1}^{N} \mathbf{1}(S_{n,k}^* > S_n)/N$.
(b) Approximate $K_{\theta_n}$ by
    $$B_m^*(t) = \frac{1}{m} \sum_{i=1}^{m} \mathbf{1}(V_i^* \le t), \quad t \in [0, 1],$$
    where
    $$V_i^* = \frac{1}{m} \sum_{j=1}^{m} \mathbf{1}(U_j^* \le U_i^*), \quad i \in \{1, \ldots, m\}.$$
(c) Approximate $S_n^{(K)}$ by
    $$S_n^{(K)} = \frac{n}{m} \sum_{i=1}^{m} \{K_n(V_i^*) - B_m^*(V_i^*)\}^2.$$
    Note in passing that $m \times B_m^*(V_i^*)$ is the rank of $V_i^*$ among $V_1^*, \ldots, V_m^*$.

Appendix C. A parametric bootstrap for $A_n$

Although the following algorithm is described in terms of statistic $A_n$, it is also valid mutatis mutandis for any other rank-based statistic based on $\chi_1, \ldots, \chi_n$.

(1) Compute $G_n$ as per formula (6) and estimate $\theta$ with $\theta_n = T_n(U_1, \ldots, U_n)$.
(2) Compute the value of $A_n$ as per formula (7).
(3) For some large integer $N$, repeat the following steps for every $k \in \{1, \ldots, N\}$:
    (a) Generate a random sample $Y_{1,k}^*, \ldots, Y_{n,k}^*$ from distribution $C_{\theta_n}$ and compute their associated rank vectors $R_{1,k}^*, \ldots, R_{n,k}^*$.
    (b) Compute $U_{i,k}^* = R_{i,k}^*/(n + 1)$ for $i \in \{1, \ldots, n\}$.
    (c) Estimate $\theta$ with $\theta_{n,k}^* = T_n(U_{1,k}^*, \ldots, U_{n,k}^*)$, and compute $\chi_{1,k}^*, \ldots, \chi_{n,k}^*$, where
        $$\chi_{i,k}^* = \sum_{j=1}^{d} \{\Phi^{-1}(E_{ij,k}^*)\}^2 \quad \text{and} \quad E_{i,k}^* = \mathcal{R}_{\theta_{n,k}^*}(U_{i,k}^*).$$

An approximate P-value for the test is then given by $\sum_{k=1}^{N} \mathbf{1}(S_{n,k}^{(C)*} > S_n^{(C)})/N$.

References

Abdous, B., Genest, C., Rémillard, B., 2005. Dependence properties of meta-elliptical distributions. In: Statistical Modeling and Analysis for Complex Data Problems. In: GERAD 25th Anniv. Ser., vol. 1. Springer, New York, pp. 1–15.
Andersen, P.K., Ekstrøm, C.T., Klein, J.P., Shu, Y., Zhang, M.-J., 2005. A class of goodness of fit tests for a copula based on bivariate right-censored data. Biometrical Journal 47, 815–824.
Ané, T., Kharoubi, C., 2003. Dependence structure and risk measure. Journal of Business 76, 411–438.
Barbe, P., Genest, C., Ghoudi, K., Rémillard, B., 1996. On Kendall’s process. Journal of Multivariate Analysis 58, 197–229.
Beaudoin, D., 2007. Estimation de la dépendance et choix de modèles pour des données bivariées sujettes à censure et à troncation. PhD thesis, Université Laval, Québec, Canada.
Berg, D., Bakken, H., 2005. A goodness-of-fit test for copulae based on the probability integral transform. Technical Report SAMBA/41/05, Norsk Regnesentral, Oslo, Norway.
Breymann, W., Dias, A., Embrechts, P., 2003. Dependence structures for multivariate high-frequency data in finance. Quantitative Finance 3, 1–14.
Burzykowski, T., Molenberghs, G., Buyse, M., 2004. The validation of surrogate end points by using data from randomized clinical trials: A case-study in advanced colorectal cancer. Journal of the Royal Statistical Society Series A 167, 103–124.
Chen, X., Fan, Y., Tsyrennikov, V., 2006. Efficient estimation of semiparametric multivariate copula models. Journal of the American Statistical Association 101, 1228–1240.
Cherubini, U., Luciano, E., Vecchiato, W., 2004. Copula Methods in Finance. Wiley, New York.
Clayton, D.G., 1978. A model for association in bivariate life tables and its application in epidemiological studies of familial tendency in chronic disease incidence. Biometrika 65, 141–151.
Cook, R.D., Johnson, M.E., 1981. A family of distributions for modelling nonelliptically symmetric multivariate data. Journal of the Royal Statistical Society Series B 43, 210–218.
Cui, S., Sun, Y., 2004. Checking for the gamma frailty distribution under the marginal proportional hazards frailty model. Statistica Sinica 14, 249–267.
Deheuvels, P., 1979. La fonction de dépendance empirique et ses propriétés: Un test non paramétrique d’indépendance. Académie Royale de Belgique. Bulletin de la Classe des Sciences, 5e Série 65, 274–292.
Dobrić, J., Schmid, F., 2005. Testing goodness of fit for parametric families of copulas: Application to financial data. Communications in Statistics. Simulation and Computation 34, 1053–1068.
Dobrić, J., Schmid, F., 2007. A goodness of fit test for copulas based on Rosenblatt’s transformation. Computational Statistics & Data Analysis 51, 4633–4642.
Fang, H.-B., Fang, K.-T., Kotz, S., 2002. The meta-elliptical distributions with given marginals. Journal of Multivariate Analysis 82, 1–16.
Fang, H.-B., Fang, K.-T., Kotz, S., 2005. Corrigendum to: “The meta-elliptical distributions with given marginals” [J. Multivariate Anal. 82: 1–16 (2002)]. Journal of Multivariate Analysis 94, 222–223.
Faraggi, D., Korn, E.L., 1996. Competing risks with frailty models when treatment affects only one failure type. Biometrika 83, 467–471.
Fermanian, J.-D., 2005. Goodness-of-fit tests for copulas. Journal of Multivariate Analysis 95, 119–152.
Fermanian, J.-D., Radulović, D., Wegkamp, M.H., 2004. Weak convergence of empirical copula processes. Bernoulli 10, 847–860.
Frahm, G., Junker, M., Szimayer, A., 2003. Elliptical copulas: Applicability and limitations. Statistics & Probability Letters 63, 275–286.
Frees, E.W., Valdez, E.A., 1998. Understanding relationships using copulas. North American Actuarial Journal 2, 1–25.
Gänßler, P., Stute, W., 1987. Seminar on Empirical Processes. Birkhäuser Verlag, Basel, Switzerland.
Genest, C., 1987. Frank’s family of bivariate distributions. Biometrika 74, 549–555.
Genest, C., Favre, A.-C., 2007. Everything you always wanted to know about copula modeling but were afraid to ask. Journal of Hydrologic Engineering 12, 347–368.
Genest, C., Ghoudi, K., Rivest, L.-P., 1995. A semiparametric estimation procedure of dependence parameters in multivariate families of distributions. Biometrika 82, 543–552.
Genest, C., Quessy, J.-F., Rémillard, B., 2006. Goodness-of-fit procedures for copula models based on the integral probability transformation. Scandinavian Journal of Statistics 33, 337–366.
Genest, C., Rémillard, B., 2008. Validity of the parametric bootstrap for goodness-of-fit testing in semiparametric models. Annales de l’Institut Henri Poincaré. Probabilités et Statistiques 44 (in press).
Genest, C., Rivest, L.-P., 1993. Statistical inference procedures for bivariate Archimedean copulas. Journal of the American Statistical Association 88, 1034–1043.
Genest, C., Werker, B.J.M., 2002. Conditions for the asymptotic semiparametric efficiency of an omnibus estimator of dependence parameters in copula models. In: Distributions with Given Marginals and Statistical Modelling. Kluwer, Dordrecht, The Netherlands, pp. 103–112.
Ghoudi, K., Rémillard, B., 2004. Empirical processes based on pseudo-observations. II. The multivariate case. In: Asymptotic Methods in Stochastics. In: Fields Inst. Commun., vol. 44. Amer. Math. Soc., Providence, RI, pp. 381–406.
Glidden, D.V., 1999. Checking the adequacy of the gamma frailty model for multivariate failure times. Biometrika 86, 381–393.
Gumbel, E.J., 1960. Distributions des valeurs extrêmes en plusieurs dimensions. Publications de l’Institut de Statistique de l’Université de Paris 9, 171–173.
Joe, H., 1997. Multivariate Models and Dependence Concepts. Chapman & Hall, London.
Joe, H., 2005. Asymptotic efficiency of the two-stage estimation method for copula-based models. Journal of Multivariate Analysis 94, 401–419.
Jouini, M.N., Clemen, R.T., 1996. Copula models for aggregating expert opinions. Operations Research 44, 444–457.
Junker, M., May, A., 2005. Measurement of aggregate risk with copulas. The Econometrics Journal 8, 428–454.
Kim, G., Silvapulle, M.J., Silvapulle, P., 2007. Comparison of semiparametric and parametric methods for estimating copulas. Communications in Statistics. Simulation and Computation 51, 2836–2850.
Klaassen, C.A.J., Wellner, J.A., 1997. Efficient estimation in the bivariate normal copula model: Normal margins are least favourable. Bernoulli 3, 55–77.
Klugman, S., Parsa, R., 1999. Fitting bivariate loss distributions with copulas. Insurance: Mathematics & Economics 24, 139–148.
Malevergne, Y., Sornette, D., 2003. Testing the Gaussian copula hypothesis for financial assets dependences. Quantitative Finance 3, 231–250.
McNeil, A.J., Frey, R., Embrechts, P., 2005. Quantitative Risk Management. Princeton University Press, Princeton, NJ.
Nelsen, R.B., 1986. Properties of a one-parameter family of bivariate distributions with specified marginals. Communications in Statistics A. Theory and Methods 15, 3277–3285.
Nelsen, R.B., 2006. An Introduction to Copulas, second ed. Springer, New York.
Oakes, D., 1989. Bivariate survival models induced by frailties. Journal of the American Statistical Association 84, 487–493.
Panchenko, V., 2005. Goodness-of-fit test for copulas. Physica A 355, 176–182.
Plackett, R.L., 1965. A class of bivariate distributions. Journal of the American Statistical Association 60, 516–522.
Rosenblatt, M., 1952. Remarks on a multivariate transformation. The Annals of Mathematical Statistics 23, 470–472.
Scaillet, O., 2007. Kernel based goodness-of-fit tests for copulas with fixed smoothing parameters. Journal of Multivariate Analysis 98, 533–543.
Shih, J.H., 1998. A goodness-of-fit test for association in a bivariate survival model. Biometrika 85, 189–200.
Shih, J.H., Louis, T.A., 1995. Inferences on the association parameter in copula models for bivariate survival data. Biometrics 51, 1384–1399.
Shorack, G.R., Wellner, J.A., 1986. Empirical Processes with Applications to Statistics. Wiley, New York.
Tsukahara, H., 2005. Semiparametric estimation in copula models. The Canadian Journal of Statistics 33, 357–375.
Wang, W., Wells, M.T., 2000. Model selection and semiparametric inference for bivariate failure-time data. Journal of the American Statistical Association 95, 62–72.