Wave and Ocean Energy
Wave and Ocean Energy
Ocean Engineering
journal homepage: www.elsevier.com/locate/oceaneng
A R T I C LE I N FO A B S T R A C T
Keywords: This paper considers three types of method for calculating return periods of individual wave and crest heights.
Wave height The methods considered differ in the assumptions made about serial correlation in wave conditions. The long-
Crest height term distribution of individual waves is formed under the assumption that either (1) individual waves, (2) the
Long-term distribution maximum wave height in each sea state or (3) the maximum wave height in each storm are independent events.
Return period
The three types of method are compared using long time series of synthesised storms, where the return periods of
Serial correlation
individual wave heights are known. The methods which neglect serial correlation in sea states are shown to
produce a positive bias in predicted return values of wave heights. The size of the bias is dependent on the shape
of the tail of the distribution of storm peak significant wave height, with longer-tailed distributions resulting in
larger biases. It is shown that storm-based methods give accurate predictions of return periods of individual
wave heights. In particular, a Monte Carlo storm-based method is recommend for calculating return periods of
individual wave and crest heights. Of all the models considered, the Monte Carlo method requires the fewest
assumptions about the data, the fewest subjective judgements from the user and is simplest to implement.
∗
Corresponding author.
E-mail address: e.mackay@exeter.ac.uk (E. Mackay).
https://doi.org/10.1016/j.oceaneng.2018.07.047
Received 22 May 2018; Received in revised form 5 July 2018; Accepted 15 July 2018
Available online 21 July 2018
0029-8018/ © 2018 The Authors. Published by Elsevier Ltd. This is an open access article under the CC BY license
(http://creativecommons.org/licenses/BY/4.0/).
E. Mackay, L. Johanning Ocean Engineering 165 (2018) 164–183
random “event”, where the event is either a storm, a sea state, or a generate random time series of realistic storm histories, which are used
single wave. In the last case the maximum wave height in the event is to compare various methods of estimating return periods of individual
just the individual wave height. Long-term statistics of individual waves wave heights. The block-resampling method divides a time series of
are then calculated from the random event distribution, assuming that measured wave data into discrete blocks consisting of storms where the
occurrences of events are independent. peak wave heights can be considered approximately independent. The
The assumption of independence of these events is not true in problem of determining time scales over which storm peak wave
general, with records of wave measurements exhibiting serial correla- heights can be considered independent is also discussed in some detail.
tion at multiple scales, with correlation between successive individual The results presented in this paper also have implications for the
wave heights, sea states and storms. In the statistical literature, the estimation of extreme load values on marine structures. The “full long-
effect of serial correlation on estimates of extreme values is often term response analysis” method advocated by some authors (e.g.
quantified using the extremal index, which can be introduced as fol- Sagrilo et al., 2011; Naess and Moan, 2012; Giske et al., 2017) is es-
lows. Suppose that X1 , …, Xn are a sequence of n independent variables sentially a method for forming the long-term distribution of the max-
with common distribution function F . In this case, the distribution of imum load in each sea state, analogous to the SSM method for wave and
the maximum observation is given by Pr(max{X1, …, Xn } < x ) = F n (x ) . If crest heights. This method is therefore likely to be subject to the same
instead Y1, …, Yn are a stationary process, also with common distribution problems associated with neglecting serial correlation in sea states.
function F , but with some level of serial correlation, then subject to Methods for calculating extreme loads which account for serial corre-
certain regularity conditions, it can be shown that lation in sea states have been proposed by other authors (e.g. Forristall
Pr(max{Y1, …, Yn} < x ) = F θn (x ) , where θ ∈ [0,1] is known as the ex- et al., 1991; Tromans and Vanderschuren, 1995). However, the focus of
tremal index (see e.g. Coles, 2001). Serial correlation effectively re- this work is on wave and crest heights, and the effect of serial corre-
duces the probability of large observations in a sequence of a given lation on extreme load values is beyond the scope of the paper.
length. Therefore, for processes with θ < 1, assuming that observations The paper starts in Section 2 with a brief review of how return
are independent will lead to a positive bias in the estimates of extreme periods and return values are defined in the context of the various types
values. Some studies have proposed estimating θ explicitly and using of long-term distributions considered. Section 3 presents a short dis-
this in the estimation of extremes (e.g. Fawcett and Walshaw, 2012). cussion of models for the short-term distribution of wave and crest
However, obtaining a reliable estimate of θ is difficult in practice and heights conditional on sea state. Section 4 presents the mathematical
subject to considerable uncertainties (Davis et al., 2013). For oceano- derivation of the methods for combining the long-term and short-term
graphic data it is more common to adopt the peaks-over-threshold distributions, and highlights where various assumptions about in-
scheme which selects events that are approximately independent, en- dependence are made – either implicitly or explicitly. The methods are
suring that θ ≈ 1 (Jonathan and Ewans, 2013). compared in a simplified example in Section 5, which isolates the ef-
The methods proposed by Battjes (1970) and Krogstad (1985) (re- fects of serial correlation in sea states. The methods are then applied to
ferred to as the ‘all-wave’ (AW) and ‘sea state maxima’ (SSM) methods measured data in Section 6, providing a quantitative comparison of the
respectively from here onwards) and the various storm-based methods effect the various assumptions made in each method in a real situation.
make implicit assumptions about independence between events. The The accuracy of the methods is compared in Section 7 using Monte
AW method assumes that all wave heights are independent, the SSM Carlo simulations of synthetic storms. Finally, conclusions are pre-
method assumes that sea state maxima are independent, and storm- sented in Section 8.
based methods assumes that the maximum wave heights in separate
storms are independent. Given the serial correlation in wave height 2. Return periods & return values
time series, the three methods would be expected to give different re-
sults, with the AW method producing the highest estimates and storm- The methods for estimating the long-term distribution of individual
based methods producing the lowest estimates. wave heights considered in this paper are used to define return periods
Forristall (2008) compared estimates of the long-term distribution and return values in slightly different ways. It is therefore useful to
of individual wave heights from the AW, SSM and storm-based review how return periods and return values are defined in each con-
methods. Forristall conducted Monte Carlo simulations of 100,000 text. Return values are defined in terms of the distribution of the
years of individual wave heights from synthetic storms, assuming that maximum wave height in a year. We denote the probability that the
the time series of Hs in a storm follows a triangular shape with a fixed maximum individual wave height, Hmax , does not exceed h in any year
relationship between the peak Hs and duration of the storm. The selected at random as Pr(Hmax ≤ h 1 year) . The T -year return value of
duration, D , of the storm was defined to be the time for which Hs > 0 individual wave height, HT , is then defined as the value which has an
and a value of D = 8Hs, peak was used, where D is in hours. The zero- exceedance probability 1/ T in any year:
crossing period, Tz , was assumed to be constant at 10s and wave heights
1
were assumed to follow a Rayleigh distribution. It was shown that AW Pr(Hmax > HT 1 year) = , T>1
T (1)
and SSM methods produce a positive bias in estimates of the 100-year
wave height, consistent with both models neglecting serial-correlation The duration T is referred to as the return period and is the average
effects. Forristall showed that the storm-based method of Tromans and period between exceedances of HT . Over the last few decades, the
Vanderschuren (1995) gave the correct return values of individual peaks-over-threshold (POT) method has gained popularity over the
wave heights when applied to the synthetic triangular storm data with annual-maxima method (see Jonathan and Ewans, 2013, for a review of
Rayleigh distributed wave heights. the use of POT in an oceanographic context). In this method, the dis-
Forristall's study provides a useful insight into the differences be- tribution of the annual maximum is not estimated explicitly. Instead,
tween various methods for calculating return periods of individual the distribution of independent threshold exceedances is estimated. If
waves. However, due to the assumptions about the fixed shape of the each independent threshold exceedance is described as an ‘event’ then
storms and constant wave period, it is difficult to draw conclusions return values of wave heights can be defined in terms of the distribution
about the accuracy of storm-based methods when applied to real data. of the maximum wave height in a random event, as the solution of:
The purpose of this paper is to compare methods for estimating return
1 1
periods of individual wave heights based on more realistic simulations Pr(Hmax > HT event) = , T>
Tm m (2)
of synthetic storms, where the wave period varies throughout the storm
and the temporal evolution of sea state parameters is based on mea- where m is the expected number of events per year (see e.g. Coles,
sured data. In the current work, a block-resampling method is used 2001). In the present context, the event is either a storm, sea state or
165
E. Mackay, L. Johanning Ocean Engineering 165 (2018) 164–183
individual wave. as discussed in Section 3, is not necessarily true. The effect of other sea
The definitions (1) and (2) lead to slightly different return values for state parameters on the short-term distribution could be captured by
low return periods, but the differences become negligible at larger re- using the joint density function of multiple parameters, however, this
turn periods. This can be seen as follows. The distribution of the max- quickly becomes very complicated as the number of variables is in-
imum wave height in a year can be calculated by assuming in- creased. In some applications of the AW method, the joint density
dependence of maxima in each event as: function is used (e.g. Hagen et al., 2017). However, in most applications
(7) is simplified as follows. To avoid having to model joint distribution,
Pr(Hmax ≤ h 1 year) = Pr(Hmax ≤ h event)m, (3)
the joint density function is written as p (Hs , Tz ) = p (Hs ) p (Tz Hs ) , and
Substituting (3) into (1) and using a Taylor series expansion gives: the mean number of waves for a given Hs is defined as
∞
1 1/ m 1 1
Pr(Hmax ≤ HT event) = ⎛1 − ⎞ =1− + O⎛ 2 ⎞ N (Hs ) = ∫ p (Tz Hs) N (Tz )dTz.
⎝ T⎠ Tm ⎝T ⎠ (4)
0 (8)
So, the two definitions are equivalent for large T . In practice the
Substituting this back into (7) gives:
difference is negligible for T ≥ 10 , so both definitions will be used in the
∞
following discussions, depending on which is more appropriate.
Myr (h) = NPY ∫ p (Hs) N (Hs)Pr(H > h|σ (Hs))dHs.
0 (9)
3. Short-term distribution of wave heights conditional on sea state
In this approach, all the sea state variables that influence the short-
A sea state, σ , is defined to be a period of time in which the wave term distribution must be modelled as functions of Hs .
conditions can be considered approximately stationary, with the dura- In the original AW method proposed by Battjes, the expected
tion normally defined in the range 30 min to 3 h. A sea state is defined number of waves in any sea state selected at random is calculated as
in terms of the wave spectrum and summarised in terms of spectral ∞
parameters, such as significant wave height Hs = 4 m 0 , zero up- Nr = ∫ p (Hs) N (Hs) dHs.
crossing period Tz = m 0 / m2 , mean period Tm = m 0 / m1, where 0 (10)
∞
mn = ∫ f n S (f )df is the nth moment of the wave frequency spectrum, The expected number of waves per year is therefore Nr . NPY and
0 the probability that any wave selected at random exceeds height h is
S (f ) .
The short-term distribution of wave heights, H , conditional on sea Myr (h)
Pr(H > h any wave) =
state is denoted Nr . NPY
∞
1
Pr(H ≤ h|σ ). (5) =
Nr
∫ p (Hs) N (Hs)Pr(H > h|σ (Hs))dHs
0 (11)
Models for (5) in terms of sea state parameters and water depth is
the subject of ongoing research and it is beyond the scope of this work The distribution of the annual maximum wave height is then cal-
to provide a review of the extensive literature on this topic. The focus of culated as:
this work is on how to combine the short-term distribution (5) with the
Pr(Hmax ≤ h 1 year) = Pr(H ≤ h|any wave) Nr ⋅ NPY (12)
long-term distribution of sea states. The methods described are ap-
plicable to any model for the short-term distribution. The short-term This calculation makes the implicit assumption that individual wave
distribution is primarily dependent on Hs , but is also affected by the heights are independent, which, as discussed in the introduction, is not
water depth, wave steepness, spectral bandwidth, directional spread true.
and currents. In the following, it is assumed that the model for the Tucker (1989) pointed out that Nr is sensitive to the number of
short-term distribution adequately captures these effects. waves in low sea states and that it seems counter-intuitive that this
should influence the extreme values. Tucker noted that the T -year re-
4. Long-term distributions turn value of individual wave height is the value of h for which the
expected number of waves exceeding h per year is Myr (h) = 1/ T . The
4.1. Long-term distribution of all wave heights annual distribution can therefore be calculated directly from (1) and (9)
as:
Battjes (1970) proposed a method for calculating return values of
individual wave heights in terms of the long-term distribution of all Pr(Hmax ≤ h 1 year) = 1 − Myr (h)
∞
wave heights. The probability that any wave selected at random ex-
ceeds height h is calculated as the ratio of the expected number of
= 1 − NPY ∫ Pr(H > h|σ (Hs)) p (Hs) N (Hs)dHs.
0 (13)
waves exceeding h per year to the expected number of waves per year.
The expected number of waves exceeding h in a sea state is Note that this definition is only valid for Myr (h) < 1. Although it is
not immediately obvious from (13), Tucker's expression for the annual
Mσ (h) = N (Tz ) Pr(H > h|σ ), (6)
distribution implicitly assumes that individual wave heights are un-
where N (Tz ) = Dσ / Tz is the expected number of waves in a sea state, Dσ correlated. Expanding the original Battjes expression (11) as a Taylor
is the duration of the sea state, normally assumed to be somewhere in series, we have
the region 1–6 h, and Tz is the zero up-crossing period. The expected
Pr(Hmax ≤ h 1 year) = Pr(H ≤ h|any wave) Nr ⋅ NPY
number of waves exceeding h in a year, denoted Myr (h) , is calculated by
Nr ⋅ NPY
integrating Mσ (h) over the probability of occurrence of sea states and
multiplying by the number of sea states per year, NPY :
(
= 1−
Myr (h)
Nr . NPY )
∞ ∞
= 1 − Myr (h) + O ((Myr (h))2) (14)
Myr (h) = NPY ∫ ∫ p (Hs, Tz ) N (Tz )Pr(H > h|σ ) dHs dTz. In Tucker's definition Myr (h) = 1/ T , so for large T the terms of order
0 0 (7)
(Myr (h))2 are negligible and the two expressions are equivalent.
The occurrence of sea states is specified in terms of p (Hs , Tz ) , the Therefore, both Battjes' and Tucker's expressions for the distribution of
joint probability density function of Hs and Tz . This implicitly assumes the maximum wave height in a year can be thought of as being formed
that the short-term distribution is only influenced by Hs and Tz , which, under the assumption that there is no serial correlation in individual
166
E. Mackay, L. Johanning Ocean Engineering 165 (2018) 164–183
wave heights. This assumption is discussed further in the following the calculation.
section. The application of the ergodic assumption in passing from (17) to
(18) effectively counts the proportion of sea states which exceed a
4.2. Long-term distribution of sea state maxima threshold level but does not account for temporal clustering of sea
states in storms. To make this explicit, if A = Dσ , then, recalling
The long-term distribution of the maximum wave height in a sea N (Hs ) = Dσ / Tz (Hs ) and using (15), gives
state can be calculated in two ways, with one approach based on an ∞
⎡
average over time and the other based on an average over the prob- Pr(Hmax ≤ h any sea state) = exp ⎢ ∫ p (Hs)ln[Pr(Hmax ≤ h|σ (Hs))]dHs⎤⎥
ability of occurrence of sea states. Both methods give equivalent results, ⎣ 0 ⎦
but it is not immediately obvious to see this from looking at the ex- (19)
pressions. The temporal average method will be presented first and it
will then be shown how it is equivalent to the probabilistic method. Putting A = NPY ⋅Dσ = 1 year gives
The distribution of the maximum wave in a sea state is usually ∞
calculated by assuming that successive wave heights are independent, Pr(Hmax ≤ h 1 year) = exp ⎡NPY ∫ p (Hs )ln[Pr(Hmax ≤ h|σ (Hs ))]dHs⎤
⎢ ⎥
so that ⎣ 0 ⎦
= Pr(Hmax ≤ h any sea state) NPY
Pr(Hmax ≤ h|σ ) = Pr(H ≤ h|σ ) N . (15)
(20)
where N = Dσ / Tz is the expected number of waves in the sea state, as
So, in Krogstad's formulation (18), the annual distribution is identical to
defined in the previous section. In reality, consecutive wave and crest
that formed under the assumption that the maxima in each sea state are
heights are correlated, with the largest waves occurring in groups.
independent.
Naess (1985) showed that for linear waves from a JONSWAP spectrum,
A distinction should be made between the type of independence
the effect of ignoring correlation between successive waves has a neg-
assumed to derive (16) and the type of independence assumed in (18).
ligible influence on the distribution of the maximum wave height.
In the derivation of (16) it is assumed that the maximum wave height in
Fedele (2016) considered the effect of correlation between successive
each sea state is dependent only on the sea state parameters and not on
wave crests using the linear model of Fedele (2005) and also concluded
the maximum height in adjacent sea states. In this expression, the va-
that correlation could be neglected for typical broadbanded waves from
lues of the sea state parameters are assumed to be known, such as
a JONSWAP spectrum. Fedele (2016) also noted that for the case of
would be the case for a measured time series of sea states in a storm. In
nonlinear waves, since the nonlinear harmonics are phase-locked to the
(18) the sequence of sea states over the year are not known, only their
linear components, the dependence between successive crests is un-
marginal distribution is known. By replacing the integral over a known
likely to be affected by nonlinearities. Janssen (2015) suggested that it
time series of sea states (17) with an integral over the marginal dis-
is more appropriate to characterise N as a function of the number of
tribution of sea states (18), the implicit assumption is made that there is
wave groups in the record, as it is more plausible that the maximum
no serial correlation in sea states. This implicit assumption is made
wave height in each group is independent than it is that successive
clear by equations (19) and (20). As noted in the introduction, serial
wave heights are independent. However, this idea has not been pursued
correlation reduces the probability of large observations in a sequence
in this work.
of a given length. This effect is not accounted for in the model of
The maximum wave heights in successive sea states can be con-
Krogstad (1985).
sidered independent, in the sense that the maximum height is depen-
dent only on the sea state parameters and not the maximum height in
4.2.1. Connection to other formulations
adjacent sea states. The distribution of the maximum wave height in the
Krogstad's model can be shown to be equivalent to other expressions
time interval [0, A] can therefore be written as:
for the long-term distribution of sea state maxima. Sagrilo et al. (2011)
k noted that by using Taylor series expansions for the exponential and
Pr(Hmax ≤ h [0, A]) = ∏i = 1 Pr(Hmax ≤ h σ (τi ))
k logarithm terms, Krogstad's expression (18) can be shown to be ap-
= ∏i = 1 Pr(H ≤ h|σ (τi )) Dσ / Tz (τi) proximately equal to the probabilistic average. Using the Taylor series
( k
= exp ∑i = 1 T
Dσ
z (τi )
ln[Pr(H ≤ h|σ (τi ))] ) (16)
expansion for the logarithm ln(x ) = x − 1 + O (x 2) and noting that
∞
∫ p (Hs )dHs = 1 gives
where τi ∈ [0, A]. As the duration of each sea state, Dσ , tends to zero, 0
∞
the summation in (16) can be expressed as an integral (Borgman, 1973): ⎡
Pr(Hmax ≤ h any sea state) ≈ exp ⎢ ∫ p (Hs)Pr(Hmax ≤ h|σ (Hs))dHs − 1⎤⎥
A ⎣ 0 ⎦
⎛
Pr(Hmax ≤ h [0, A]) = exp ⎜ ∫ Tz1(t ) ln[Pr(H ≤ h σ (t ))]dt ⎞⎟, (21)
⎝ 0 ⎠ (17)
Note that the approximation ln(x ) ≈ x − 1 is an identity when
Equations (16) and (17) give expressions for the distribution of the x = 1, and for the values of h relevant to extremes,
maximum wave height in an interval [0, A] if the time series of sea Pr(Hmax ≤ h|σ (Hs )) ≈ 1, so this approximation of the logarithm term is
states is known, buy they contain no information about the long-term accurate for this purpose. Similarly, using the Taylor series expansion
distribution of sea states. By applying the ergodic assumption (Naess, exp(x ) ≈ 1 + x (which is an identity for x = 0 and a good approxima-
1984) that, on average, the length of time that sea states are in the tion in this case) gives
interval [Hs , Hs + dHs] during time [0, A] is dt = Ap (Hs )dHs , the dis- ∞
tribution (17) can be rewritten as (Krogstad, 1985): Pr(Hmax ≤ h any sea state) ≈ ∫ p (Hs)Pr(Hmax ≤ h|σ (Hs))dHs
∞ 0 (22)
⎡
Pr(Hmax ≤ h [0, A]) = exp ⎢A ∫ Tz (1Hs) p (Hs)ln[Pr(H ≤ h|σ (Hs))]dHs ⎤⎥ This expression, given by Tucker and Pitt (2001, Section 6.4.2), is
⎣ 0 ⎦ the population mean, whereas Krogstad's expression (18) is the ergodic
(18) mean. These two expressions differ slightly for lower values of h , but
Note that in moving from the integral over time to the integral over are almost identical for the higher values of h relevant to extremes.
Hs , it must be assumed that Tz and other sea state parameters can be However, Krogstad's expression is more accurate to compute as it uses
modelled as taking a mean value conditional on Hs , without affecting the logarithm of Pr(Hmax ≤ h|σ (Hs )) .
167
E. Mackay, L. Johanning Ocean Engineering 165 (2018) 164–183
Krogstad's model can also be related to the AW model as follows. states (in terms of the values of Hs ), for which a linear, power or ex-
Taking A = NPY ⋅Dσ = 1 year, in (18) and using the same linear ex- ponential law is a reasonable fit. The drawback of this approach is that
pansions for the exponential and logarithm terms gives: only the temporal evolution of Hs is modelled. It is therefore necessary
∞ to establish models for the mean values of Tz and other sea state
Pr(Hmax ≤ h 1 year) ≈ 1 + NPY ∫ p (Hs) N (Hs)(Pr(H ≤ h|Hs) − 1)dHs, parameters as a function of Hs , in the same way as for the AW and SSM
0 models.
(23) The other approach is to model Pr(Hmax ≤ h|MS) directly and define
the equivalent storm as a statistical distribution rather than a time
which is identical to (13). It is somewhat surprising that the AW and
series of Hs . This approach was adopted by Tromans and Vanderschuren
SSM methods result in approximately equivalent expressions for the
(1995), who assumed that the square of the maximum wave height in
distribution of the annual maximum wave height. However, if we
the storm followed a Gumbel distribution. Mackay and Johanning
consider that both models assume that there is no serial correlation
(2018a) developed this idea and used the generalised extreme value
between individual wave heights or sea states, then the reason for the
distribution (GEV) to model the distribution of the maximum wave
equivalence becomes apparent. In the examples considered in Sections
height in a storm. They demonstrated that using the GEV to model
5–7 the AW and SSM methods show differences for return periods less
Pr(Hmax ≤ h|MS) improves the goodness-of-fit by an order of magnitude
than 10 years but give nearly identical results for higher return periods.
compared to both temporal evolution methods and the Tromans and
Vanderschuren (TV) method. Moreover, the GEV model was shown to
4.3. Long-term distribution of storm maxima be more robust than both the TV and temporal evolution methods to
uncertainties in the models for the distribution of storm parameters.
A storm can be thought of as a sequence of sea states, where wave The GEV model also results in a much simpler form for Pr(Hmax ≤ h|ES)
heights increase to a peak value and then decrease again. The dis- than temporal evolution methods.
tribution of the maximum wave height in a storm can be calculated The TV method is recommend in several metocean design standards
using (16) where the time interval [0, A] is defined to be the duration of (ISO, 2015; DNV GL, 2017) and will therefore be considered as a re-
the storm. The process of partitioning a time series into separate storms ference model in comparison to the GEV model. The temporal evolution
is analogous to defining “declustering” criteria in the peaks-over- methods will not be considered further.
threshold (POT) method (see Coles, 2001). The criteria used to define In the TV method the distribution of the maximum wave height in
separate storms typically state that the time between the peak Hs of two the equivalent storm is defined as
adjacent storms must be larger than some minimum value. In some
2
models an additional criterion is used, requiring that the minimum Hs ⎛ ⎛ ⎛ h ⎞ ⎞⎞⎞
between two adjacent peaks must be less than some multiple of the Pr(Hmax ≤ h ES ) = exp ⎜−exp ⎜−ln(N ) ⎜ ⎜⎛ ⎟ − 1
⎟ ⎟ ⎟⎟,
Hmp
lower of the peak. The criteria for defining what constitutes separate ⎝ ⎝ ⎝⎝ ⎠ ⎠⎠⎠ (24)
storms can be defined more rigorously by considering the correlation
where N is the number of waves in the storm and Hmp is the most
between successive storm peak wave heights. This will be discussed in
probable maximum wave height in the storm (i.e. the mode of
more detail in Section 6.2. For now, it is assumed that the time series
Pr(Hmax ≤ h|MS) ). Tromans and Vanderschuren (1995) noted that
has been partitioned into discrete blocks, constituting storms where the
ln(N ) ≈ 8 for Northern European winter storms, but did not state ex-
peak wave heights can be considered independent.
plicitly how they estimated ln(N ) . Mackay and Johanning (2018a)
The distribution of the maximum wave height in any storm selected
considered two options for calculating ln(N ) . In the first method, ln(N )
at random, denoted Pr(Hmax ≤ h|any storm) , can be estimated in two
was calculated in terms of the moments of Pr(Hmax ≤ h|MS) , as
ways. In the first method, the distributions of the maximum wave 2
ln(N ) = πHmp 2
/(std(Hmax ) 6 ) . In the second method the value of ln(N )
height in the measured storms can be parameterised in some way, in was determined numerically by finding the value that minimises the
terms of a statistically equivalent storm, and the long-term distribution difference between Pr(Hmax ≤ h|MS) and Pr(Hmax ≤ h|ES) , quantified in
of storm parameters can be combined with the short-term distribution terms of the Cramér–von Mises goodness-of-fit parameter, ω :
for the storm, in a manner that is analogous to the SSM and AW ∞
methods. Alternatively, Pr(Hmax ≤ h|any storm) can be estimated di-
rectly from the data using a Monte Carlo method. The two methods are
ω2 = ∫ [Pr(Hmax ≤ h|MS) − Pr(Hmax ≤ h|ES)]2 dh
0 (25)
described in Sections 4.3.1 and 4.3.2.
It was demonstrated that the second method gave a better fit to the
data. However, the moment based estimators are recommended in DNV
4.3.1. The equivalent storm method
GL (2017) and will therefore be used here.
The distribution of the maximum wave height in a measured storm
Mackay and Johanning (2018a) defined the distribution of the
will be denoted Pr(Hmax ≤ h|MS) and the distribution for the equivalent
maximum wave height in the equivalent storm in terms of the GEV as:
storm will be denoted Pr(Hmax ≤ h|ES) . Two types of approach for
parameterising Pr(Hmax ≤ h|MS) have been proposed. One approach is ⎧ −1
to model the temporal evolution of sea states in a storm using some
Pr(Hmax ≤ h ES) =
⎛
⎝
(
⎪ exp ⎜− 1 + k ( ))
h−a
b
k⎞
⎟
⎠
for k ≠ 0
simplified geometric form, such as a triangle (Boccotti, 1986, 2000; ⎨
Arena and Pavone, 2006; Laface et al., 2017), trapezoid (Martin-
Soldevilla et al., 2015), parabola (Tucker and Pitt, 2001, Section 6.5.4), ⎩
(
⎪ exp −exp −( ( )))
h−a
b
for k = 0
(26)
power law (Fedele and Arena, 2010; Fedele, 2012; Arena et al., 2014) where a , b and k are the location, scale and shape parameters, re-
or exponential law (Laface and Arena, 2016). In most of the ap- spectively. The GEV can be fitted to (16) by finding the parameters that
proaches, the parameters of the equivalent storm are fitted using an minimise ω . The first guess for the parameters of the GEV is defined by
iterative procedure so that the distribution of the maximum wave assuming k = 0 , which from the properties of the GEV, then implies
height in the equivalent storm is matched as closely as possible to the a = Hmp and b = 6 std(Hmax MS)/ π . The parameters that minimise ω
measured storm (although Fedele (2012) provided an explicit formula are then found using a simplex search technique (Lagarias et al., 1998).
for the duration of the equivalent storm). This approach is reasonably Mackay and Johanning (2018a) demonstrated that the GEV provides an
effective because the order of sea states in the storm does not affect the excellent fit to Pr(Hmax ≤ h|MS) , with a bias of less than 0.2% in
distribution of the maximum wave height. Therefore, the product in quantiles up to an exceedance probability of 10−3 , meaning that there is
(16) can be re-ordered into a monotonically increasing series of sea very little inaccuracy introduced by parameterising the storm using the
168
E. Mackay, L. Johanning Ocean Engineering 165 (2018) 164–183
GEV, compared to modelling all sea state variables as directly related to Pi1/ Ni = Pr(H ≤ Hmax , i |σi ), (28)
Hs .
Once each measured storm has been fitted with an equivalent storm, where Ni = Dσ / Tz, i , as before. Many models for Pr(H ≤ h|σ ) can be in-
the distribution of the maximum wave height in a random storm can be verted analytically to obtain a closed form solution for the random
calculated using the law of total probability, by integrating maximum wave heights. However, even if this is not possible, inter-
Pr(Hmax ≤ h|ES) over the long-term joint density function of storm polation of Pr(H ≤ h|σ ) to the desired probability gives a simple means
parameters, in a manner analogous to the SSM model (22). In the case of for calculating the random maximum wave heights. The difference
of the TV model, ln(N ) is assumed to be constant and the mean value between the MC method of Brown et al. (2017) and that of Mackay and
over all storms is used. For the GEV model Mackay and Johanning Johanning (2018b) is that in the latter study a new set of random
(2018a) showed that the triple integral over the joint density function maximum wave heights are generated on each trial via (28), whereas in
p (a, b, k ) could be reduced to a single integral over p (a) using the the former study a pre-determined set of load maxima are resampled.
mean value of k and a linear model b = A + Ba established using re- A threshold must be selected for fitting the GPD to storm peak wave
gression. It was also shown that the GEV location parameter a is almost heights. Selecting a threshold for each iteration of the MC method
identical to Hmp (since the mean value of k was close to zero). There- would be impractical. Mackay and Johanning (2018b) recommended
fore, the distribution of the maximum wave height in a random storm that the threshold is selected prior to the MC simulations, using the
can be written in the same way for both the TV and GEV models as: median value of the maximum wave height in each sea state, denoted
Hmed , which is calculated from (28) by setting Pi = 0.5 in each sea state.
The median values can then be declustered to select the peak values in
Pr(Hmax ≤ h any storm) = ∫ p (Hmp)Pr(Hmax ≤ h ES) dHmp. (27)
each storm and the threshold can be selected, following the procedures
outlined in Section 6.3. The choice of declustering criteria is discussed
The similarity between (27) the SSM model (22) is clear. However,
in Section 6.2.
one advantage of this form is that no model is required for how other
The MC method could also be applied to estimate the SSM and AW
sea state parameters vary with Hs .
distributions directly from the data, removing the need to model the
The density function p (Hmp) is modelled using the POT method,
variation of the short-term distribution parameters with Hs . However,
with the generalised Pareto distribution (GPD) fitted to exceedances of
for the AW method, simulating all waves for each sea state would be
Hmp above some high threshold. The fitting of the GPD and the selection
very computationally intensive and these ideas are not pursued further
of the threshold is discussed in Section 6.3.
here.
169
E. Mackay, L. Johanning Ocean Engineering 165 (2018) 164–183
gives larger return values, rather than a positive bias. However, the
simulation studies presented in Section 7 will show that it is in fact a
positive bias. The assumptions made in this example are clearly un-
realistic. In the following two sections it will be shown that biases
observed in this artificial example also occur under realistic conditions.
In this section the methods used to fit the models to the data are
discussed. Two sets of measured wave data are used to illustrate the
procedures. The datasets used are described in Section 6.1. The selec-
tion of declustering criteria for the storm-based methods is discussed in
Section 6.2. Parameter estimation and threshold selection for the GPD
are discussed in Section 6.3. Finally, a comparison of return values of
individual crest heights calculated using each method is presented in
Section 6.4.
6.1. Datasets
Fig. 1. Distribution of the maximum wave height in a year from the all-wave, The datasets used are long-term records from wave buoys in two
sea-state-maxima and storm methods for hypothetical case of independent
distinct wave climate regimes and are the same as those used in Mackay
rectangular storms.
and Johanning (2018a). The first dataset, referred to as Site A, is from
NDBC buoy number 44025, located off the coast of Long Island, New
Comparison of (30) and (31) shows that the assumptions about in- York, in a water depth of 36 m. The dataset consists of 25 years of
dependence of sea states and storms lead to differing expressions for the hourly records of wave spectra over the period April 1991–December
distribution of the annual maximum wave height. To quantify the dif- 2016. The second dataset, referred to as Site B, is from NDBC buoy
ference between the two models, we assume that sea states have a 1-h number 46014, located off the coast of Northern California in a water
duration, storms consist of N = 100 sea states and there are NPY = 8760 depth of 256 m. The dataset consists of 35 years of hourly records of
sea states per year. We also assume that Hs follows a Weibull dis- wave spectra over the period April 1981–December 2016. Scatterplots
tribution: of Hs against Tz for the two datasets are shown in Fig. 2. Both datasets
have similar maximum observed Hs , with a maximum Hs at Site A of
x β
Pr(Hs ≤ x ) = 1 − exp⎜⎛−⎛ ⎞ ⎞⎟ 9.64 m and a maximum at Site B of 10.34 m. In the shallower water site,
⎝ ⎝ α ⎠ ⎠ (32) the maximum values of Hs occur for the steepest seas with
0.06 < sz < 0.07 (where sz = 2πHs / gTz2 ), whilst at the deeper site the
where α = β = 2 . Fig. 1 shows the distributions of the annual maximum extreme values of Hs occur for a wider range of steepness. The range of
wave heights from the sea state (30) and storm-based models (31) for Tz observed at the Californian site is also much larger than at the New
this simple example. Results from the AW model (13) are also included York site.
for comparison. As expected from (23) the AW and SSM models give In the examples presented in Sections 6.4 and 7, the Forristall
almost identical results for lower exceedance probabilities. The AW and (2000) model for the short-term distribution of crest heights in direc-
SSM method predicts larger wave heights than the storm method at a tionally spread seas, will be used to illustrate the application of the
given exceedance level (i.e. larger return values), but the difference long-term models. The Forristall model assumes that crest heights, C ,
decreases at lower exceedance probabilities (i.e. larger return periods). follow a Weibull distribution:
In this example, the difference is due only to the differing assumptions
β
about independence of sea states and storms. The results are consistent ⎛ η ⎞ ⎞
Pr(C ≤ η|σ ) = 1 − exp ⎜−⎛ ⎜ ⎟
⎟.
with the observation made in the introduction, that neglecting serial ⎝ αHs ⎠
⎝ ⎠ (33)
correlation results in a positive bias in estimates of extreme values.
Strictly speaking, we have only shown that the sea state-based model The parameters α and β are defined in terms of the significant
Fig. 2. Scatter plots of Hs against Tz for the two datasets used in this study. Colour indicates density of occurrence. Black dashed lines indicate sz from 0.01 to 0.07 at
intervals of 0.01. (For interpretation of the references to colour in this figure legend, the reader is referred to the web version of this article.)
170
E. Mackay, L. Johanning Ocean Engineering 165 (2018) 164–183
steepness, sm , and Ursell number, Urs , defined as to the annual signal. The pattern in the autocorrelation in storm peak Hs
is very similar for these two locations, which is due to both datasets
2πHs
sm = , being located at similar latitudes where the level of seasonal variability
gTm2 (34)
is similar. The slight difference between the autocorrelation functions
Hs for the two buoys is likely to be due to differences in the distribution of
Urs = , storm lengths and the effects of missing data in time series, which has
km2 d3 (35)
not been accounted for in this simple analysis.
where km is the finite depth wave number corresponding to Tm and d is The seasonal variation in Hs can be modelled as:
the water depth. The distribution parameters are given by
Hs (t ) = m (t ) + s (t ) Hsstat (t ) (38)
α = 0.3536 + 0.2568sm + 0.0800Urs (36)
where Hsstat (t ) is assumed to be a stationary time series and m (t ) and
β = 2 − 1.7912sm − 0.5302Urs + 0.2824Urs2 (37) s (t ) are deterministic periodic functions with period one year, re-
The AW and SSM methods both require models for the mean values presenting the mean and standard deviation (STD) of the distribution of
of α , β and Tz as a functions of Hs . Mackay and Johanning (2018a) fitted Hs at a given time in the year (see Monbet el al, 2007 and references
linear models for the variation of the mean values of these parameters therein). The functions m (t ) and s (t ) are estimated here using the
as a function of Hs , which are used here as well. The reader is referred method of Athanassoulis and Stefanakos (1995), with the mean and
to Mackay and Johanning (2018a) for details. STD of Hs modelled using low order Fourier expansions. Figs. 4 and 5
show the mean and STD calculated in 10-day bins throughout the year
for each dataset, together with a second order Fourier expansion of the
6.2. Selection of declustering criteria
binned values. The pattern in the seasonal variation differs between the
two locations, but the second order model appears adequate for both
The objective of this article is to establish the accuracy of storm-
locations. Figs. 4 and 5 also show the original data plotted against the
based methods for estimating return periods of individual wave heights
day of year and the normalised data, Hsstat , together with 90, 95 and
and contrast this to methods which neglect serial correlation in sea
99% quantiles calculated in 15-day windows (the longer bin length has
states. It is therefore important to consider the correlation structure of
been used here for stability). It appears that the normalised data is
sea state time series in some detail and to establish a rigorous method of
approximately stationary for Site A, but there is some residual trend in
estimating a minimum separation time between local maxima in the
the extreme values at Site B. However, as will be shown below, the
wave height time series, for which adjacent maxima can be considered
normalisation seems to be sufficient to examine the short-range de-
statistically independent.
pendence (i.e. on time scales less than seasonal variation) in the ex-
Most statistical methods developed to quantify dependence between
treme values.
extreme events assume that the time series is stationary. Time series of
Fig. 3 showed the autocorrelation function between storm peak Hs
Hs exhibit a clear seasonal variation in storm peak wave heights, which
for peaks separated by at least 5 days. If we wish to assess the depen-
introduces a long-range dependence structure between extreme events
dence structure in the extreme values without pre-selecting a declus-
and a non-stationarity in the time series. The long-range dependence
tering criterion to identify storm peaks, then the autocorrelation func-
due to seasonal variability is deterministic in origin, with the variation
tion is not appropriate since it measures the correlation between data at
having a period of exactly one year. The variation can therefore be
all levels. Davis and Mikosch (2009) introduced the concept of the
filtered out and the normalised, stationary time series can be examined
extremogram as an analogue of the autocorrelation function for se-
for shorter-range dependence of extreme events using established sta-
quences of extreme events in a time series. The definition of the ex-
tistical methods.
tremogram presented by Davis and Mikosch (2009) is used for assessing
The autocorrelation in storm peak Hs is shown in Fig. 3 for both
dependence between multivariate time series, but the concept is also
datasets. In this example, storm peaks are defined as the maxima within
applicable to univariate time series. To simplify the notation, the de-
a 5-day moving window. The lag shown is the number of storms se-
finition will be presented here for univariate time series only.
parating adjacent peaks. The median time between storm peaks is 9.79
For a stationary univariate process (Xt ) , the extremogram is defined
days for Site A and 9.86 days for Site B. The peaks in autocorrelation
is defined as
function occur around lags of ∼36 and ∼72 storms, which corresponds
ρ (τ ) = lim Pr(Xt + τ > u Xt > u)
u →∞ (39)
In the univariate case, the extremogram is the same as the tail de-
pendence coefficient between Xt and Xt + τ (Beirlant et al., 2004). To test
for correlation in exceedances at finite levels, an estimate of the ex-
tremogram at level u can be defined as
n−h
∑t = 1 Iu (Xt ) Iu (Xt + τ )
ρu (τ ) = n
∑t = 1 Iu (Xt ) (40)
1 if X > u,
Iu (X ) = ⎧
⎩ 0 if X ≤ u.
⎨ (41)
171
E. Mackay, L. Johanning Ocean Engineering 165 (2018) 164–183
Fig. 4. Seasonal variation plots for Site A. Lower panels: 10-day mean and STD of Hs and second-order Fourier expansion. Upper panels: Hourly values of Hs (left) and
normalised Hs (right) against day of year. Red dashed lines are 90, 95 and 99% quantiles in 15-day bins. (For interpretation of the references to colour in this figure
legend, the reader is referred to the web version of this article.)
172
E. Mackay, L. Johanning Ocean Engineering 165 (2018) 164–183
Fig. 6. Extremogram plots for normalised Hs data for both dataset. Colour of lines denote threshold value u from 0 to 4 at intervals of 5m. (For interpretation of the
references to colour in this figure legend, the reader is referred to the web version of this article.)
Fig. 6 shows ρu (τ ) and ρu (τ ) − Pr(X > u) for the normalised Hs data 5-day window appears to be sufficient to ensure independence. This
for both datasets. Values are shown for separation times between 0 and definition of storms will be used from here onwards. A separation time
10 days and a range of threshold levels. A threshold of 0 corresponds to of 5 days is also reasonable based on physical arguments, since peaks
just over the 50th percentile for both datasets, whilst the upper separated by 5 days will correspond to waves generated from separate
threshold level shown corresponds approximately the 99.9th percentile low pressure systems.
for both datasets. Obviously, the threshold value of 0 does not represent It is emphasised that the seasonal normalisation applied above has
extreme values, but the range of thresholds is shown to illustrate the only been used to examine the dependence structure in the extreme
difference in the dependence structure between common and extreme values and define a declustering criterion. No model of the seasonal
values. It is also interesting to note that even for lower sea states the variation is required for estimating the long-term distribution of wave
correlation decays to zero after about 5 days for both datasets. This heights. The effect of seasonal variability on the simulated time series
means that selecting a declustering criteria that ensures independence and extreme values is discussed in Section 7.1.
at lower levels will also ensure independence at extreme levels.
For Site A, the dependence structure remains almost constant with
6.3. Fitting of distributions
threshold for most of the range, with events separated by ∼4 days
being effectively independent. For the highest thresholds, the correla-
So far, the method used to fit the long-term distribution models to
tion range decreases rapidly, with events separated by ∼1 day being
the data has not been discussed. The AW and SSM methods both require
effectively independent for a threshold of 5.5. In contrast, Site B ex-
a model for of p (Hs ) , the equivalent storm method requires an estimate
hibits a more continuous gradual decrease in dependence with
of p (Hmp) , and the MC method requires multiple estimates of
threshold, but a similar pattern in dependence at the highest levels.
Pr(Hmax ≤ h|any storm) . For the storm-based methods, estimates of the
By applying a seasonal normalisation and examining the depen-
distributions are only required for storms where the maximum wave
dence structure of the normalised time series, we are making the im-
height exceeds a threshold level, and the occurrence rate of storms
plicit assumption that the dependence structure of the smaller storms in
exceeding the threshold level is accounted for in the calculation of re-
the summer months is the same as that for the larger storms in the
turn periods (2). This means that the standard POT approach can be
winter months, except for a shift in mean and STD of the time series. It
adopted for the storm-based methods.
is conceivable that this is assumption is invalid, and that the duration of
For the AW and SSM methods, a model for p (Hs ) is only needed for
high peaks differs in summer and winter. However, selecting a suffi-
the tail of the distribution as well, since for high values of h and low
ciently large minimum separation time for declustering will ensure that
values of Hs , Pr(H > h|σ (Hs )) ≈ 0 . Therefore, the GPD can also be used
the declustered events are effectively independent.
to model p (Hs ) for the AW and SSM models and we can set the lower
Given these observations, defining storms as local maxima in Hs in a
limit of integration in (13) and (18) as the threshold level used to fit the
173
E. Mackay, L. Johanning Ocean Engineering 165 (2018) 164–183
GPD. This avoids the need to establish a model for the distribution of estimators do not achieve these asymptotic properties until large
the entire range of values of Hs , which can be problematic, since fitting sample sizes. Hosking and Wallis (1987) showed that the ML estimators
a model for the entire range of Hs does not guarantee a good fit to the are non-optimal for sample sizes up to 500, with higher bias and var-
highest values which have the strongest influence on the extreme wave iance than other estimators, such as the moments and probability-
heights. Ferreira and Guedes Soares (1999) showed that three models of weighted-moments estimators. Hosking and Wallis also noted that
p (Hs ) which all had a good fit over the bulk of the data, gave estimates sometimes solutions to the ML equations do not exist and that at other
of 100-year Hs which differed by over 5 m for a dataset from the Por- times when a solution does exist there can be convergence problems
tuguese coast. Just modelling the tail of the distribution avoids this with the algorithm they used to find them.
problem. Estimation of the GPD parameters is the subject of ongoing research.
There are conflicting requirements in selecting the threshold used A quantitative comparison of recent methods for estimating the para-
for modelling p (Hs ) . The threshold must be low enough that the in- meters was presented by Kang and Song (2017). They concluded that
tegrals (13) and (18) are not affected, but high enough to ensure a good the empirical Bayesian-likelihood method (EBM) of Zhang (2010) has
fit of the GPD to the data. For the datasets considered here, it was found one of the best performances in a wide range of cases. The EBM method
that the maximum percentile which could be used for the threshold is both computationally efficient and has low bias and variance com-
level differed between the two datasets, with the integrals for Site A pared to other methods, which is especially important for the MC
remaining invariant for thresholds up to the 99.5% quantile (Hs ≈ 8m ), method, where the fitting is performed for each random trial. The EBM
but the integrals for Site B remaining invariant only up to the 95% method has therefore been used in this work. An algorithm for calcu-
quantile (Hs ≈ 7m ). The difference in sensitivity is due to the different lating the EBM estimators of the GPD parameters was provided by
shapes of the tails of the distribution (discussed below), with the dis- Zhang (2010), using the R statistical language, which can easily be
tribution for Site A having a “longer tail” than for Site B, meaning that translated into other codes. The reader is referred there for details.
exceedance probabilities decrease with Hs at a lower rate in the tail of
the distribution for Site A. Fortunately, for both datasets it was found 6.3.2. Threshold selection
that the GPD provides a good fit to the data for lower threshold values The selection of the threshold for fitting the GPD is a compromise
than this, as discussed below. between having a sufficient number of observations and violating the
The GPD fitted to the tail of the distribution gives the conditional asymptotic arguments which justify the use of the GPD. Various
non-exceedance probability Pr(Hs ≤ h Hs > u) , where u is the threshold methods have been proposed for selecting an appropriate threshold. An
level. The CDF of Hs can then be calculated as: overview of graphical methods is presented by Coles (2001) and
Pr(Hs ≤ h) = 1 − Pr(Hs > u)[1 − Pr(Hs ≤ h Hs > u)] Jonathan and Ewans (2013) discuss some more recent studies. In this
(42)
work the threshold has been selected by fitting the GPD for a range of
where Pr(Hs > u) is the fraction of samples where Hs exceeds the thresholds and selecting the threshold as the lowest value for which the
threshold. The GPD is defined as shape parameter ξ and estimates of high quantiles of the distribution
reach an approximately stable value.
−1
⎧1 − 1 + ξ h − u
Pr(X ≤ h|X > u) =
⎪ ( σ( )) ξ
for ξ ≠ 0 and σ > 0, For each threshold value a bootstrap technique is used to establish
confidence bounds on the estimates. At each trial of the bootstrap
⎨
⎪
⎩
1 − exp − ( ( ))
h−u
σ
for ξ = 0 and σ > 0.
(43)
procedure, a random error is added to each resampled data point,
corresponding to the sampling uncertainty in the measurement. This
Here, the variable X denotes either Hs , Hmp or Hmax , depending on the addition of a random error helps smooth the threshold plots, making
application. The parameters σ and ξ are the scale and shape parameters trends easier to identify. The sampling distribution of Hs is approxi-
respectively. When ξ ≥ 0 , the distribution is unbounded from above and mately Guassian, with a coefficient of variation of COV(Hs ) = α Tm/ Dσ
referred to as heavy tailed or long tailed. When ξ < 0 the distribution where α ≈ 0.48 for a Pierson Moskowitz spectrum and α ≈ 0.61 for a
has a maximum value, equal to u − σ / ξ , and is referred to as short JONSWAP spectrum with peak enhancement factor of 3.3 (Forristall
tailed. The methods used to estimate the parameters of the GPD and et al., 1996). In our datasets Dσ = 1200s and the mean value of Tm/ Dσ
select the threshold are discussed in the following two subsections. was 0.067 for Site A and 0.082 for Site B. A value of COV(Hs ) = 0.04 has
It is worth noting that some practitioners opt to use the Weibull dis- been used for both datasets as a representative value, given the range of
tribution for modelling the tail of the long-term distribution, rather than spectral shapes and values of Tm .
the GPD. Jonathan and Ewans (2013) discuss differences between the GPD Threshold selection plots for the fit of the GPD to measured Hs for
and Weibull approach and note that using a Weibull distribution implicitly each dataset are shown in Fig. 7. Note that for the AW and SSM
constrains the tail to be unbounded from above. It is reasonable to expect methods, the GPD is fitted to all values of Hs exceeding the threshold
an upper bound on storm peak Hs or wave heights, due to physical con- and no declustering is applied. For Site B the threshold has been se-
straints such as fetch limitations or water depth. The use of the GPD allows lected as 7.5 m, since both the GPD shape parameter and 50-year return
for the possibility of an upper bound on the tail of the distribution, value of Hs tend to approximately constant values above this threshold.
whereas the Weibull distribution does not. The use of the GPD for mod- There are 44 storms where the peak Hs exceeds 7.5 m or ∼1.4 storms
elling the tail of the distribution is also justified by asymptotic arguments per year. For Site A, the shape parameter and 50-year return values also
(see e.g. Coles, 2001). For these reasons, the GPD has been used to model appear approximately constant for thresholds above ∼7.5 m. However,
the tails of the long-term distributions in this work. in this case there are only two storms where the peak Hs exceeds 7.5 m,
which is insufficient to establish a reliable fit to the tail of the dis-
6.3.1. GPD parameter estimation tribution. Instead, the threshold has been set at 5 m, where there is also
The method used to estimate the parameters of the GPD can have a a period of stability in the GPD shape parameter, and for which there
significant influence on results, especially with relatively small sample are 54 storms where the peak Hs exceeds the threshold (2.4 storms per
sizes (Mackay et al., 2011). The metocean time series available for es- year). The fit of the GPD to the data at the selected thresholds is shown
timating extreme values are typically quite short in relation to the re- in Fig. 8. In both cases, the GPD provides a good visual fit to the data.
turn periods of interest, resulting in small samples sizes. This means Note that at Site A, for the two storms where the peak Hs exceeds 7.5 m,
that using accurate estimators is important. The maximum likelihood there are 17 hourly records with Hs exceeding 7.5 m.
(ML) method is commonly used in metocean applications (e.g. The fitting of equivalent storms (defined in terms of the Gumbel and
Jonathan and Ewans, 2013), due to its asymptotic properties of being GEV distributions) to the two datasets was discussed in detail by
unbiased and having the lowest possible variance. However, the ML Mackay and Johanning (2018a) and is not repeated here. The threshold
174
E. Mackay, L. Johanning Ocean Engineering 165 (2018) 164–183
selection plots for the fit of the GPD to the most probable maximum with Cmp ∼2 m higher than all other storms. In contrast the distribution
crest height in each storm, Cmp , are shown in Fig. 9. In this case, there is for Site B is short-tailed and there is a relatively close grouping in the
a clear choice of threshold for both datasets. For Site A both the shape values of Cmp for the largest storms.
parameter and 50-year return value of Cmp are approximately stable for For the MC method the threshold is selected by fitting to the de-
thresholds above ∼5 m and for Site B stability is reached at ∼5.5 m. clustered values of Cmed (the median value of the maximum crest height
The fit of the GPD to each dataset at the selected threshold level is in each storm, defined in Section 4.3.2). Fig. 11 shows the maximum
shown in Fig. 10 and appears satisfactory in both cases. For Site A the value of Cmed in each storm compared to Cmp for both datasets. The
distribution is long-tailed due to the occurrence of two large storms, value of Cmp is slightly larger than max(Cmed ) on average, but the two
Fig. 8. Fit of the GPD to the tail of the distribution of Hs for both datasets.
175
E. Mackay, L. Johanning Ocean Engineering 165 (2018) 164–183
Fig. 10. Fit of the GPD to the tail of the distribution of Cmp for both datasets.
variables are strongly correlated with a Pearson correlation coefficient the predicted crest heights from the SSM method for return periods of
in excess of 0.99. Given the very strong correlation in the two variables, greater than 10 years, as expected from the theoretical considerations
the choice of threshold for the equivalent storm method will also be presented in Section 4. The storm-based methods agree well for both
appropriate for the MC method and has been applied here. datasets. For Site A, the Gumbel and GEV methods are in close agree-
ment over the entire range of return periods, whereas for Site B the
6.4. Comparison of return values of individual crest heights Gumbel method produces slightly lower estimates at higher return
periods. Mackay and Johanning (2018a) showed that for this dataset,
Fig. 12 shows the return values of maximum crest heights calculated the Gumbel and GEV methods agreed better if ln(N ) is modelled as
using each method. For both datasets the AW method gives the highest linearly dependent on Cmp , rather than constant. The difference in the
predicted crest heights at return periods of 1 year, but converges with estimate of Cmax at the 100-year level between the Gumbel and GEV
176
E. Mackay, L. Johanning Ocean Engineering 165 (2018) 164–183
Fig. 11. Comparison of maximum value of Cmed in each storm with Cmp . Dashed lines show 1:1 relation.
methods is around 0.2 m and remains similar at the 1000-year level. investigate the accuracy of each method. The synthetic time series are
Given the uncertainty in the estimates due to the finite length of the not intended to be precise representations of the wave climates at either
metocean dataset (discussed further in Section 7.2.2), the difference of the locations considered in the previous section, but are intended to
between the two methods is not significant. capture realistic features of wave climates in order to assess the accu-
For Site A, the AW and SSM methods result in significantly higher racy of the models for the long-term distributions of individual crest
predictions at lower return periods than the storm-based methods, with heights. Section 7.1 describes the method used to synthesise the time
a difference of 1.5 m at a return period of 10years and 1.1 m at a return series and the application of the models to the synthetic data is de-
period of 100years. For Site B, the AW and SSM methods agree better scribed in Section 7.2.
with the storm-based methods and give slightly lower predictions at
return periods above 50 years. 7.1. Creation of synthetic time series
Given that the AW and SSM methods neglect serial correlation in sea
states, it would be reasonable to expect that they would give higher Numerous methods have been proposed for generating synthetic
estimates of return values than the storm-based methods for both da- time series of sea states (see Monbet et al., 2007 for a review). The
tasets. However, the difference between the return values calculated challenge here is to generate a synthetic time series where the extreme
using AW and SSM methods and storm-based methods will depend both characteristics are representative of a real situation and the joint evo-
on the level of serial correlation in the extreme values and also any lution of sea state parameters during storms follows realistic patterns,
differences in the fitted distributions of Hs and Cmp . Given the un- matching the observed dependence structures (both in terms of serial
certainty in the fitted distributions, it is not possible to conclude from correlation and the joint distribution of parameters). A block-resam-
this example whether the differences in the return values observed for pling method (Härdle et al., 2003) has been applied here, where the
Site A are due to serial correlation or uncertainty in the fitted dis- measured time series of sea states is divided into discrete blocks and
tributions. The simulations presented in the following section remove these blocks are resampled with replacement. The blocks correspond to
the influence of the uncertainty in the fitted distributions, to illustrate storms where the peak Hs can be considered independent. Independent
the differences that arise from serial correlation only. values of storm peak Hs are defined in the same way as before, as
maxima within a 5-day moving window. The dividing points between
7. Application to synthesised data blocks are defined to be the minimum value of Hs between adjacent
storm peaks. Blocks with more than 10% missing data have been dis-
In this section, synthesised time series of sea states are used to carded. Dividing the time series in this way ensures that the maximum
Fig. 12. Return values of maximum crest heights calculated from measured data using each method.
177
E. Mackay, L. Johanning Ocean Engineering 165 (2018) 164–183
crest heights in each block are independent and that storm histories are shown in Fig. 15, together with the measured Hs against Tz . The block-
realistic. resampling method provides a realistic-looking joint distribution,
If the time series was resampled without modification, this would which replicates the observed shapes reasonably well and preserves the
mean that the maximum Hs over any time period would be the max- relationship between Hs and significant steepness. However, as the
imum observed Hs . To overcome this, a distribution is fitted to the method does not create new storm histories, but simply rescales mea-
observed values of storm peak Hs , which can be used to extrapolate to sured storms, the storm with the largest value of Tz at Site A has been
extreme values (described below). To generate a storm with a random resampled several times, exaggerating this part of the distribution. It is
peak Hs , a random variable is generated from the fitted distribution and reasonable to ask whether resampling of measured storms will give an
a block is selected at random from the 20 measured storms with the adequate representation of the variability in storm characteristics.
closest peak Hs . The values of Hs in the randomly selected measured Mackay and Johanning (2018a) showed that the distribution of the
storm are then linearly rescaled so that the peak Hs matches the ran- maximum crest heights in storms can be very accurately represented in
domly generated value. The values of Tm and Tz are scaled to maintain terms of the three-parameter GEV. Moreover, the joint distribution of
the observed steepness. If we denote the ratio of the random storm peak the fitted GEV parameters exhibits remarkably regular characteristics,
Hs to the measured storm peak Hs as r = HsR, peak / HsM, peak (where the su- with the scale parameter (b ) being strongly correlated to the location
perscripts denote the random and measured values) then the rescaled parameter (a ≈ Cmp ) and the shape parameter (k ) taking a narrow
values of Tm and Tz in the random storm are given by TmR = TmM r and distribution of values, approximately independent of a . Due to this
TzR = TzM r . regularity in the distribution of the maximum crest heights in measured
The reason for selecting the random storm from only the measured storms, the block-resampling method should give an adequate re-
storms with the closest values of peak Hs is that the characteristics of presentation of the variability in storm characteristics.
the storms change with Hs , so sampling from the nearest neighbours No attempt has been made to capture the seasonal variation in sea
preserves this relation. From Fig. 2 it is clear that the distribution of Tz states in the synthetic time series. The rationale for this is that sampling
is dependent on Hs . The distribution of Hs within a storm is also con- all storms at random will capture the distribution of storms throughout
ditional on the value of the storm peak Hs . Fig. 13 shows the distribu- the year. Over a one year section of simulated time series the seasonal
tion of Hs / Hs, peak , binned by the value of storm peak Hs . It is evident that variation in storms should be captured on average. Moreover, as we
the distribution changes with the value of storm peak Hs and that the have assumed that the time series is composed of independent blocks, it
patterns differ between the two datasets, with the largest storms at Site is only the distribution of storms within the year that determines the
A having a more peaked shape than the largest storms at Site B. distribution of the maximum crest height in the year and not the order
To avoid the problems with associated with initial distribution in which storms occur. This argument is analogous to the argument that
methods, discussed in Section 6.3, the distribution of storm peak Hs is the order of sea states within a storm does not affect the distribution of
fitted with a two-part model, using a lognormal model for the body of the maximum crest height in the storm.
the data and a GPD model for the tail. A smooth transition between the
models for the body and tail of the distribution is achieved using a
mixing function, defined as 7.2. Results and discussion
1⎛ h − v ⎞⎞
m (h) = 1 + tanh ⎛
⎜ ⎟
A 100,000-year synthetic time series of sea states was generated for
2⎝ ⎝ d ⎠⎠ (44)
each dataset. For each sea state a random maximum crest height was
where v is the mixing point and d is the mixing distance. The model for simulated from the Forristall (2000) distribution using the method
the CDF of storm peak Hs is then given by described in Section 4.3.2. The largest crest in each year of the synthetic
time series was extracted and used to form an empirical estimate of
Pr(Hs, peak ≤ h) = (1 − m (h)) Pbody (h) + m (h) Ptail (h) (45)
Pr(Cmax ≤ h 1 year) , from which return values were calculated using
It was found that setting d = 0.5 and v = u + d , where u is the (1). The return periods estimated from the 100,000-year simulation
threshold at which the GPD is fitted, gave a smooth transitions for both have approximately a 3% STD at the 100-year level and a 10% STD at
datasets. The threshold for the GPD was selected as u = 5.25m for Site A the 1000-year level (see Appendix A). The synthetic time series were
and u = 5.75m for Site B. The fit of the models to the data are shown in used to assess the accuracy of the SSM, ES and MC methods in two
Fig. 14. The models provide a good visual fit, both in the bodies and the ways, as described in the following sections. The AW method was not
tails of the distributions. considered in this section as it gives identical results to the SSM method
Scatter plots of Hs against Tz for example 500-year simulations are for return periods above 10years.
Fig. 13. CDF of normalised Hs , binned by peak Hs . Colour of line indicates peak Hs in metres. (For interpretation of the references to colour in this figure legend, the
reader is referred to the web version of this article.)
178
E. Mackay, L. Johanning Ocean Engineering 165 (2018) 164–183
7.2.1. Example 1 – distributions derived from 100,000-year simulation the mean value of ln(N ) for the Gumbel method and for the GEV the
In the first example, the long-term distributions Pr(Hs ≤ h) and mean value of k and linear model b (Hmp) ). Similarly, the linear models
Pr(Cmp ≤ h) have been derived from the full 100,000-year simulated for the variation of the short-term distribution parameters α (Hs ) and
time series. This effectively eliminates the uncertainty associated with β (Hs ) derived from the measured data were also used here for the
fitting the distributions and isolates the effect of serial correlation on calculations using the SSM method.
the estimated return values. The values of Cmp were calculated explicitly The MC method is not considered in this example. As the long-term
for each simulated storm, since the variation of Cmp with Hs, peak is distribution is assumed to be known, there is no fitting stage involved
nonlinear due to the dependence of the short-term distribution para- for the MC method. This means that the application of the MC method
meters on the Ursell number. to the simulated sea states would be identical to the method used to
To reduce computational time, the Gumbel and GEV distributions derive the simulated crest heights.
were not fitted to the distribution of the maximum crest height in the The return values of Cmax calculated using the SSM and storm-based
simulated storms. Instead the Gumbel and GEV parameters that were methods are compared to the simulated values in Fig. 16. The estimates
calculated from the measured data were applied in the simulations (i.e. are very similar to those shown in Fig. 12, indicating that the models
Fig. 15. Scatter plots of Hs against Tz for measured data and 500-year simulation.
179
E. Mackay, L. Johanning Ocean Engineering 165 (2018) 164–183
Fig. 16. Return values of maximum crest heights from simulations and calculated from sea-state-maxima and storm-based methods for case when the long-term
distribution is known.
derived for Pr(Hs ≤ h) and Pr(Cmp ≤ h) for the measured data were parameter. This confirms that the tail shape has the dominant influence
reasonably accurate. The return values from the GEV method are very on the level of bias caused by neglecting serial correlation.
close to the simulated values for both datasets, whereas the Gumbel
method produces slightly lower estimates than the simulated values. As
mentioned in Section 6.4, this is likely to be due using a constant value 7.2.2. Example 2 – distributions derived from 20-year simulations
of ln(N ) in the Gumbel model rather than a value that is linearly de- The second example is designed to examine the robustness of the
pendent on Cmp (see Mackay and Johanning, 2018a). The SSM method estimated return values when the long-term distributions are fitted to
gives return values that are close to the simulated values for Site B, but the data. The ‘true’ return periods are calculated in the same way as in
significantly overestimates return values for Site A. Example 1, using 100,000-year simulations for each dataset. The dis-
In Section 6.2 the serial-dependence in sea states during high storms tributions Pr(Hs ≤ h) and Pr(Cmp ≤ h) are fitted to 20-year simulations,
was shown to be similar for both datasets. Why then does neglecting which is representative of a real situation using measured or hindcast
serial correlation give reasonable results for one dataset but not the data (analogous to that described in Section 6). The thresholds identi-
other? The fitted distribution of storm peak Hs for Site B was short- fied in Section 6.3.2 for fitting the GPD have also been used here. The
tailed, with the GPD shape parameter ξ = −0.2 3, whereas the dis- fitting process was repeated 1000 times on different 20-year simula-
tribution of storm peak Hs for Site A was (marginally) long-tailed, with tions to establish the mean and STD of the return periods calculated
ξ = 0.03. This means that the difference between the 10-year and 100- using each method. The mean and STD of the return values over the
year Hs is larger for Site A than for Site B. So when a storm occurs with 1000 simulations are shown in Figs. 18 and 19.
Hs > Hs,100 at Site B there are only a few sea states with Hs > Hs,10 , The storm-based methods give very similar predictions for both
whereas a storm with Hs > Hs,100 for Site A will contain many sea states datasets. They produce a small positive bias in the predicted return
with Hs > Hs,10 . Therefore, neglecting serial correlation will have a values for both datasets. This positive bias is a result of the estimators
larger impact when the distribution of storm peak Hs is long-tailed. used for the GPD parameters, which is a known feature of the EBM
To test this explanation, the simulations for Site A were repeated method (Kang and Song, 2017). However, of the estimators considered,
with GPD shape parameter set at values of ξ = −0.2, −0.1, 0 and 0.05. the EBM method performed the best in terms of bias and STD. In par-
All other variables were left unchanged. In particular, the same set of ticular, using ML estimators results in larger bias and much slower
measured storms was resampled. The assumed distributions of Hs, peak computations. The Gumbel and GEV methods have very similar STDs
are shown in Fig. 17(a) and the bias in the return values of Cmax cal- for both datasets. The STD for the MC method is similar to that for the
culated using the SSM method for each set of simulations are shown in Gumbel and GEV methods at return periods up to 100 years, but differs
Fig. 17(b). As expected, the bias increases with increasing shape slightly at longer return periods, with a slightly higher STD for Site A
and slightly lower STD for Site B.
Fig. 17. Left: Assumed distributions of storm peak Hs with various GPD tail shapes. Right: Bias in return values of crest heights using sea-state-maxima method for
simulations with various GPD tail shapes.
180
E. Mackay, L. Johanning Ocean Engineering 165 (2018) 164–183
Fig. 18. Mean return values of maximum crest heights from 100,000-year simulations and calculated from 20-year simulations using sea-state-maxima and storm-
based methods.
The SSM method performs similarly here to the previous examples, the tail of the distribution of storm peak Hs , with longer tails (larger values
with a significant positive bias for low return periods at Site A and a of the GPD shape parameter) leading to larger bias.
negative bias at high return periods. For Site B the estimates from the It was shown that storm-based methods give accurate predictions of
SSM method are in close agreement with the simulated return values. return periods of individual wave heights. The method of Tromans and
Interestingly, the SSM method gives a higher STD in return value esti- Vanderschuren (1995) which models the distribution of the maximum
mates than the storm-based methods at return periods less than 200 wave height in a storm using a Gumbel distribution, gives very similar
years, but lower STD at longer return periods. The higher STD at low results to the GEV method of Mackay and Johanning (2018a). The
return periods is likely related to the SSM method neglecting serial Gumbel method produces a slight underestimate of return values when
correlation. When a single large storm occurs which has multiple sea the number of waves in the storm (ln(N ) ) is assumed to be constant.
states above the 100 year level, the inclusion of all these sea states in The Monte Carlo method proposed by Mackay and Johanning
the long-term distribution affects return values at low return periods. In (2018b) was shown to compare well to the Gumbel and GEV storm-
contrast, the occurrence of such a storm only affects a single data point based methods for both the measured and simulated data. Of all the
used in the inference for the storm-based methods, which results in models considered, the Monte Carlo method requires the fewest as-
greater stability in the estimates at lower return periods. sumptions and fitting stages with subjective judgements from the user.
It does not require calculating the distribution of the maximum wave
8. Conclusions height in measured storms, fitting equivalent storms, estimating joint
distributions of storm parameters or combining distributions via in-
This paper has considered three methods for combining the long-term tegration. Also, in comparison to the all-wave and sea-state-maxima
distribution of sea states with the short-term distribution of wave or crest methods, there is no requirement to form models of for the mean values
heights conditional on sea state. It has been demonstrated both theoreti- of the short-term distribution parameters as a function of Hs . It is
cally and numerically that methods which treat all waves as independent therefore recommended that the Monte Carlo method should be used
events give the same long-term distributions as methods that treat the for calculating return periods of individual wave and crest heights.
maximum wave height in each sea state as independent events. Both of
these methods neglect the effects of serial correlation in sea states. The
numerical simulations presented in this work show that this can lead to Declarations of interest
significant positive bias in estimates of return values of individual wave
and crest heights. The size of the positive bias is dependent on the shape of None.
Fig. 19. STD in return values of maximum crest heights from 20-year simulations using sea-state-maxima and storm-based methods.
181
E. Mackay, L. Johanning Ocean Engineering 165 (2018) 164–183
Fig. 20. STD in empirical estimates of return periods for simulation length of n = 105 years.
Acknowledgement
The sampling properties of the return periods estimated from the data can be derived as follows. Denote the ordered sample of annual maximum
wave (or crest) heights as H(1) ≤ H(2) ≤ …≤H(n) , where n is the length of the dataset in years. The empirical non-exceedance probability for the k th
order statistic is defined as
∼ k
P =
n+1 (A.1)
Define
P(k ) = Pr(H(k ) ≤ h) (A.2)
∼ ∼
Note that P(k ) is a random variable, but P is not – for any sample P is always given by (A.1). The empirical return period associated with the k order
th
statistic is
∼ 1
T = ∼
1−P (A.3)
which is also fixed for any sample. However, the true return period, T = 1/(1 − P(k ) ) , associated with the k th
order statistic is a random variable,
whose sampling properties can be derived in terms of the distribution of P(k ) . The variable P(k ) follows a beta distribution (David and Nagaraja,
2003), with density function given by:
n!
f (P(k ) ) = (P(k ) )k − 1 (1 − P(k ) )n − k
(k − 1) ! (n − k )! (A.4)
The expected values of T and T2 are:
1 k − 1 (1
− P )n − k
E(T ) =
n!
(k − 1) ! (n − k )!
∫P 1−P
dP =
n
n−k
0 (A.5)
1 k − 1 (1
− P )n − k
E(T 2) =
n!
(k − 1) ! (n − k )!
∫P (1 − P )2
dP =
n (n − 1)
(n − k )(n − k − 1)
0 (A.6)
The variance of T is given by:
kn
Var(T ) = E(T 2) − [E(T )]2 =
(n − k )2 (n − k − 1) (A.7)
From (A.1) and (A.3) we have:
∼
(T − 1)
k = (n + 1) ∼
T (A.8)
Substituting this into (A.7) we arrive at:
182
E. Mackay, L. Johanning Ocean Engineering 165 (2018) 164–183
∼
STD (T ) 1 n (n + 1)(T − 1) ∼ n+1
= ∼ ∼ , T <
T (n − T + 1) n − 2T + 1 2 (A.9)
The function in (A.9) is plotted in Fig. 20 using a value of n = 105 years. The estimated return periods from the 100,000 year simulations have a
3% STD at the 100-year level and a 10% STD at the 1000-year level.
Substituting (A.8) into (A.5) gives the expected value of the sample return periods as:
n ∼
E(T ) = ⎛ ⎜
∼
⎞T ⎟
⎝n − T + 1⎠ (A.10)
∼ ∼
When n is large and n ≫ T , the term in brackets is close to unity, but when T is close to n , there can be a significant positive bias in the sample return
periods. For the 100,000 year simulations the bias is approximately 0.1% at the 100-year level and 1% at the 1000-year level. The bias is therefore
effectively negligible compared to the STD.
References Hogben, N., 1990. Long term wave statistics. In: Le Mehaute, B., Hanes, D. (Eds.), In the
Sea v.9, Pt. A Ocean Engineering Science. Wiley, New York, pp. 293–333.
Hosking, J.R.M., Wallis, J.R., 1987. Parameter and quantile estimation for the generalized
Arena, F., Pavone, D., 2006. The return period of non-linear high wave crests. J. Geophys. Pareto distribution. Technometrics 29, 339–349.
Res. 111 C08004. ISO, 2015. Petroleum and Natural Gas Industries – Specific Requirements for Offshore
Arena, F., Malara, G., Romolo, A., 2014. On long-term statistics of high waves via the Structures - Part 1: Metocean Design and Operating Considerations. ISO 19901-
equivalent power storm model. Probabilist. Eng. Mech. 38, 103–110. 1:2015.
Athanassoulis, G.A., Stefanakos, C.N., 1995. A nonstationary stochastic model for long- Jahns, H.O., Wheeler, J.D., 1973. Long term wave probabilities based on hindcasting of
term time series of significant wave height. J. Geophys. Res. 100 (C8), 16149–16162. severe storms. J. Petrol. Technol. 25 (4), 473–486.
Battjes, J.A., 1970. Long-term Wave Height Distribution at Seven Stations Around the Janssen, P.A.E.M., 2015. Notes on the maximum wave height distribution. European
British Isles. National Institute of Oceanography Internal Report No. A44, 31pp. centre for medium-range weather forecasts. Tech. Memo. 755 June 2015.
Beirlant, J., Goegebeur, Y., Segers, J., Teugels, J., 2004. Statistics of Extremes: Theory and Jonathan, P., Ewans, K., 2013. Statistical modelling of extreme ocean environments for
Applications. Wiley, New York MR2108013. marine design: a review. Ocean Eng. 62, 91–109.
Boccotti, P., 1986. On coastal and offshore structure risk analysis. Excerpta of the Italian Kang, S., Song, J., 2017. Parameter and quantile estimation for the generalized Pareto
Contribution to the Field of Hydraulic Eng., 1. pp. 19–36. distribution in peaks over threshold framework. J. Korean Statistical Society 46 (4),
Boccotti, P., 2000. Wave Mechanics for Ocean Engineering. Elsevier, New York. 487–501.
Borgman, L.E., 1973. Probabilities for highest wave in hurricane. J. Waterw. Harb. Coast. Krogstad, H.E., 1985. Height and period distributions of extreme waves. Appl. Ocean Res.
Eng. Div. 99 (WW2), 185–207. 7, 158–165.
Brown, A., Gorter, W., Vanderschuren, L., Tromans, P., Jonathan, P., Verlaan, P., 2017. Krogstad, H.E., Barstow, S.F., 2004. Analysis and applications of second-order models for
Design approach for Turret Moored vessels in highly variable Squall conditions. In: maximum crest height. J. Offshore Mech. Arctic Eng. 126, 66–71.
ASME. International Conference on Offshore Mechanics and Arctic Engineering, Laface, V., Arena, F., 2016. A new equivalent exponential storm model for long-term
Volume 3A: Structures, Safety and Reliability:V03AT02A049, https://doi.org/10. statistics of ocean waves. Coast Eng. 116, 133–151.
1115/OMAE2017-61005. Laface, V., Arena, F., Maisondieu, C., Romolo, A., 2017. On long term statistics of ocean
Carter, D.J.T., Challenor, P.G., 1990. Metocean Parameters – Wave Parameters. Part 2: storms starting from partitioned sea states. In: Proc. 36th Int. Conf. Oce., Offshore
Estimation of Extreme Waves. UK Department of Energy, OTH89–300. HMSO, Arctic Eng., June 25-30, 2017, Trondheim, Norway, OMAE2017-61750.
London. Lagarias, J.C., Reeds, J.A., Wright, M.H., Wright, P.E., 1998. Convergence properties of
Coles, S., 2001. An Introduction to the Statistical Modelling of Extreme Values. Springer- the Nelder-Mead simplex method in low dimensions. SIAM J. Optim. 9 (1), 112–147.
Verlag, London. Mackay, E.B.L., Johanning, J., 2018a. A Generalised Equivalent Storm Model for Long-
David, H.A., Nagaraja, H.N., 2003. Order Statistics. Wiley Series in Probability and term Statistics of Ocean Waves. Accepted for publication in Coastal
Statistics. Engineeringhttps://doi.org/10.1016/j.coastaleng.2018.06.001.
Davis, R.A., Mikosch, T., 2009. The extremogram: a correlogram for extreme events. Mackay, E.B.L., Johanning, J., 2018b. A simple and robust method for calculating return
Bernoulli 15, 977–1009. periods of ocean waves. In: ASME 37th International Conference on Ocean, Offshore
Davis, R.A., Mikosch, T., Zhao, Y., 2013. Measures of serial extremal dependence and and Arctic Engineering OMAE2018, Madrid, Spain, 17th - 22nd Jun 2018.
their estimation. Stoch. Process. their Appl. 123, 2575–2602. Mackay, E.B.L., Challenor, P.G., Bahaj, A.S., 2011. A comparison of estimators for the
DNV GL, 2017. Environmental Conditions and Environmental Loads. Recommended generalised Pareto distribution. Ocean Eng. 38 (11–12), 1338–1346.
Practice DNVGL-RP-C205, August 2017. Martín-Soldevilla, M.J., Martín-Hidalgo, M., Negro, V., López-Gutiérrez, J.S., Aberturas,
Fawcett, L., Walshaw, D., 2012. Estimating return levels from serially dependent ex- P., 2015. Improvement of theoretical storm characterization for different climate
tremes. Environmetrics 23, 272–283. conditions. Coast Eng. 96, 71–80.
Fedele, F., 2005. Successive wave crests in Gaussian Seas. Probabilist. Eng. Mech. 20, Monbet, V., Ailliot, P., Prevosto, M., 2007. Survey of stochastic models for wind and sea
355–363. state time series. Probabilist. Eng. Mech. 22 (2), 113–126.
Fedele, F., 2012. Space-time extremes in short-crested storm seas. J. Phys. Oceanogr. 42 Naess, A., 1984. Technical Note: on the long-term statistics of extremes. Appl. Ocean Res.
(9), 1601–1615. 6 (4), 227–228.
Fedele, F., 2016. Are rogue waves really unexpected? J. Phys. Oceanogr. 46, 1495–1508. Naess, A., 1985. The joint crossing frequency of stochastic processes and its application to
Fedele, F., Arena, F., 2010. Long-term statistics and extreme waves of sea storms. J. Phys. wave theory. Appl. Ocean Res. 7 (1), 35–50.
Oceanogr. 40 (5), 1106–1117. Naess, A., Gaidai, O., Haver, S., 2007a. Efficient estimation of extreme response of drag
Ferreira, J.A., Guedes Soares, C., 1999. Modelling the long-term distribution of significant dominated offshore structures by Monte Carlo simulation. Ocean Eng. 34 (16),
wave height with the Beta and Gamma models. Ocean Eng. 26, 713–725. 2188–2197.
Forristall, G.Z., Heideman, J.C., Leggett, I.M., Roskam, B., Vanderschuren, L., 1996. Effect Naess, A., Gaidai, O., Teigen, P.S., 2007b. Extreme response prediction for nonlinear
of sampling variability on hindcast and measured wave heights. J. Waterw. Port, floating offshore structures by Monte Carlo simulation. Appl. Ocean Res. 29,
Coast. Ocean Eng. 122 (5), 216–225 Discussion: Goda Y 124(4): 214. 221–230.
Forristall, G.Z., Larrabee, R.D., Mercier, R.S., 1991. Combined oceanographic criteria for Naess, A., Moan, T., 2012. Stochastic Dynamics of Marine Structures. Cambridge
deepwater structures in the Gulf of Mexico. In: 23rd Offshore Technology Conference, University Press, Cambridge ISBN: 9781139021364.
Houston, Texas, OTC-6541-MS. Prevosto, M., Krogstad, H.E., Robin, A., 2000. Probability distributions for maximum
Forristall, G.Z., 2000. Wave crest distributions: observations and second order theory. J. wave and crest heights. Coast Eng. 40 (4), 329–360.
Phys. Oceanogr. 30 (8), 1931–1943. Sagrilo, L.V.S., Naess, A., Doria, A.S., July 2011. On the long-term response of marine
Forristall, G.Z., 2008. How should we combine long and short term wave height dis- structures. Appl. Ocean Res. 33 (3), 208–214.
tributions? In: Proc 27th Int. Conf. Offshore Mech. Arctic Eng, (OMAE2008–58012). Tromans, P.S., Vanderschuren, L., 1995. Response based design conditions in the North
Estoril, Portugal, 15–20 June 2008. Sea: application of a new method. In: Proc. Offshore Tech. Conf., OTC, pp. 7683.
Giske, F.I.G., Leira, B.J., Øiseth, O., October 2017. Full long-term extreme response Tucker, M.J., 1989. Improved ‘Battjes’ method for predicting the probability of extreme
analysis of marine structures using inverse FORM. Probabilist. Eng. Mech. 50, 1–8. waves. Appl. Ocean Res. 11 (4), 212–218.
Hagen, O., Grue, I.H., Birknes-Berg, J., Lian, G., Bruserud, K., 2017. Analysis of Short- Tucker, M.J., Pitt, E.G., 2001. Waves in Ocean Engineering. Elsevier, Amsterdam.
term and Long-term Wave Statistics by Time Domain Simulations Statistics. Proc Ward, E.G., Borgman, L.E., Cardone, V.J., 1979. Statistics of hurricane waves in the Gulf
OMAE. Paper 61510. of Mexico. J. Petrol. Technol. 632–642.
Haring, R.E., Heideman, J.C., 1980. Gulf of Mexico rare wave return periods. J. Petrol. Zhang, J., 2010. Improving on estimation for the generalized Pareto distribution.
Technol. 32 (1), 35–47. Technometrics 52, 335–339.
Hardle, W., Horowitz, J., Kreiss, J.P., 2003. Bootstrap methods for time series. Int. Stat.
Rev. 71 (2), 435–459.
183