2001 HSJ Brath Et Al
MARCO FRANCHINI
Department of Engineering, University of Ferrara, Via Saragat, 1, I-44100 Ferrara, Italy
GIORGIO GALEATI
ENEL Production Division, Corso del Popolo 93, I-30172 Mestre, Italy
Abstract Three indirect techniques for index flood estimation are analysed in order to
evaluate their applicability and effectiveness. These indirect techniques, based on both
statistical and conceptual approaches, are applied to a set of 33 hydrometric stations,
located in a large area in northern-central Italy. The results show that the statistical
model, due to its flexible structure, has a better descriptive ability than the physically-based
models, which are rigidly structured as they conceptualize the rainfall–runoff
transformation. However, the rigid structure of the conceptual approaches reduces
their dependence on the specific information of the single stations and therefore
increases their robustness. Finally, the results highlight that direct estimation
techniques could be advisable for catchments with peculiar geomorphoclimatic
properties; that is to say properties which differ substantively from those of the
majority of the basins considered in the identification of the indirect models. This
conclusion seems to hold even when a very limited amount of hydrometric
information is available.
Key words index flood, northern-central Italy, jack-knife, rational formula, multiple
regression, geomorphoclimatic model
Estimation de l’indice de crue par des méthodes indirectes
Résumé Cet article analyse l’adaptabilité et la précision de trois méthodes indirectes
d’estimation de l’indice de crue. Ces méthodes, statistique pour l’une et conceptuelles
pour les deux autres, ont été appliquées à un ensemble de 33 stations hydrométriques
réparties sur une vaste zone du centre de l’Italie du Nord. Le modèle statistique, à
structure non prédéfinie, présente une meilleure capacité descriptive par rapport aux
modèles à base physique, dont les structures sont plus rigides en raison des
conceptualisations de la transformation pluie-débit sur lesquelles ils s’appuient. Par
contre les structures rigides de ces derniers les rendent moins sensibles à l’information
spécifique des stations hydrométriques, et donc plus robustes. Enfin l’analyse montre
que les techniques directes d’estimation peuvent être préférables lorsque les bassins
sont sensiblement différents, en termes de caractéristiques géomorphoclimatiques, de
l’échantillon de bassins ayant servi à l’identification des méthodes indirectes. Cela
reste vrai même lorsque cet échantillon est très restreint.
Mots clefs indice de crue; Italie du Nord et Centrale; Jack-Knife; formule rationnelle;
régression multiple; modèle géomorphologique
INTRODUCTION
The index flood method, introduced by Dalrymple (1960), is the most widely used
method of regional flood frequency analysis. It is based on the identification of zones,
called homogeneous regions, within which the probability distribution of annual
maximum peak flows is invariant except for a scale factor represented by the index
flood. The flood peak discharge with an assigned return period T relative to the
selected site is, in fact, expressed as the product of two terms: the scale factor of the
examined site (the index flood) and the dimensionless growth factor, which has
regional validity. In general it is assumed that the index flood coincides with the
average µX of annual maximum flood peak flows. The literature contains numerous
studies on the identification of homogeneous groups of basins and the estimation of the
growth factor (see for instance Reed et al., 1999; Burn & Goel, 2000; Castellarin et al.,
2001), and relatively few on estimating the index flood.
The countless practical applications of the index flood method have however
highlighted the importance and, at the same time, the difficulty in obtaining reliable
estimates of µX. In fact, whereas it is clearly possible to obtain this estimate directly in
gauged sections, by calculating the arithmetic mean of the available observations,
indirect methods have to be used in ungauged sections.
The most widely used indirect methods are of the statistical type which, using
multiregression models, link µX to an appropriate set of morphological and climatic
indices of the basin. This approach does not make any attempt to represent the physical
phenomena that determine the transformation of rainfall to runoff. Other models, by
contrast, estimate µX using relations derived from schematic representations of the
mechanism of occurrence of intense rainfall events and the hydrological response of
the basin. This is the case with the studies by Rossi & Villani (1988), in which an
estimation method is proposed based on the use of the well known rational formula,
and the studies by Becciu et al. (1993) and Brath et al. (1997a), in which an index
flood evaluation method is based on the analytical derivation of the probability
distribution of floods (geomorphoclimatic model).
This paper examines the models mentioned above, in order to highlight, for each
of them, both the problems encountered on application and the reliability of the index
flood estimates they produce. The analysis was conducted over a vast area of central-northern
Italy, encompassing the regions of Emilia-Romagna and Marche (Fig. 1).
Multiregression model
The multiregression model expresses the estimate µ̂ X of the index flood µX as:
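Equation (1) itself does not appear in this copy of the text. Judging from the coefficients Cj and the independent variables Aj referred to in the application of the model, and from the power-law form of the fitted equation (10), it presumably has the following structure; the leading constant C0 and the number of variables n are notation assumed here for illustration:

```latex
\hat{\mu}_X = C_0 \, A_1^{C_1} \, A_2^{C_2} \cdots A_n^{C_n} \qquad (1)
```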
Rational model
This model is based on a simplified conceptual representation of the process of
transformation of intense rainfall into runoff. It implicitly assumes that the average
Estimating the index flood using indirect methods 401
value of the annual maximum peak discharges is related to the average value of the
annual maximum rainfall depths within a duration equal to the time of concentration of
the basin Tc (defined as the time taken for a droplet falling on the most remote point of
a drainage basin to reach the outlet). In other words, the index flood can be expressed
as a function of the intensity of the areal rainfall I(A, Tc) of duration Tc:
$$\hat{\mu}_X = \psi \cdot A \cdot I(A, T_c) \tag{2}$$
where A is the area of the basin and ψ the runoff coefficient (ratio of flood runoff to
rainfall), assumed to be independent of rainfall duration and intensity. The standard
form of the intensity-duration curve (e.g. Chow et al., 1988) can be used to represent
the link between the average value of the annual maximum rainfall intensity and the
corresponding duration, i.e. $I(A, T_c) = \mu'_{P,1} \cdot T_c^{\,n'-1}$, where $\mu'_{P,1}$ is the average of the
annual maximum depths of areal hourly precipitation, while the coefficient $n'$ can be
expressed as:
$$n' = \frac{\ln(\mu'_{P,24}) - \ln(\mu'_{P,1})}{\ln(24)} \tag{3}$$
where $\mu'_{P,24}$ is the average of the annual maximum areal precipitation depth over a
storm duration of 24 h, and the prime used in the notation (i.e. $\mu'_{P,1}$, $\mu'_{P,24}$ and $n'$)
indicates that areal rainfall instead of local rainfall is being considered. Finally,
equation (2) yields:
$$\hat{\mu}_X = \psi \cdot A \cdot \mu'_{P,1} \cdot T_c^{\,n'-1} \tag{4}$$
The runoff coefficient ψ can be assumed to be linked to the permeability (Cp) and soil
use (Cu) characteristics of the considered basin (i.e. ψ = ψ(Cp, Cu)). This relationship
can be identified using an optimization procedure aimed at minimizing the differences
between the observed values of µX and the values calculated using the model.
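The chain from equation (3) to equation (4) can be sketched in a few lines of Python. This is only an illustration: the function names and the sample inputs are hypothetical, and the division by 3.6 (which converts mm/h over km² into m³/s) is an assumption about units that the text does not state explicitly.

```python
import math


def idf_exponent(mu_p1_areal: float, mu_p24_areal: float) -> float:
    """Exponent n' of the intensity-duration curve, equation (3)."""
    return (math.log(mu_p24_areal) - math.log(mu_p1_areal)) / math.log(24.0)


def rational_index_flood(psi: float, area_km2: float, mu_p1_areal: float,
                         mu_p24_areal: float, tc_hours: float) -> float:
    """Index flood from equation (4), returned in m3/s.

    mu_p1_areal and mu_p24_areal are mean annual maximum areal rainfall
    depths (mm) for 1 h and 24 h; tc_hours is the time of concentration.
    """
    n_prime = idf_exponent(mu_p1_areal, mu_p24_areal)
    # Mean annual maximum areal intensity for duration Tc, in mm/h.
    intensity = mu_p1_areal * tc_hours ** (n_prime - 1.0)
    # ASSUMPTION: 1 mm/h falling on 1 km2 equals 1/3.6 m3/s.
    return psi * area_km2 * intensity / 3.6


# Hypothetical basin: psi = 0.4, A = 100 km2, 25 mm in 1 h, 80 mm in 24 h, Tc = 4 h.
example_flood = rational_index_flood(0.4, 100.0, 25.0, 80.0, 4.0)
```

For a time of concentration of exactly 1 h the intensity reduces to the hourly depth, which gives a quick sanity check on the implementation.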
Geomorphoclimatic model
As previously stated, the rational model implicitly assumes that the average value of
the annual maximum peak discharge can be related to the average value of annual
maximum rainfall depth. This assumption over-simplifies the phenomenon and is also
incorrect in its conceptual premise (see, for example, Yen, 1990). These limitations
can, at least in principle, be overcome by means of the analytical derivation of the
flood frequency distribution, first presented by Eagleson (1972). In this latter context,
Adom et al. (1989) and Brath et al. (1997a) suggest employing methods based on the
approximate derivation of the moments of dependent random variables. Assuming
(a) that the precipitation process can be described as a sequence, with Poissonian
occurrence, of rectangular pulses of random intensity and duration with average values
µi and µd respectively, (b) that the runoff production can be described using the Soil
Conservation Service Curve Number method (SCS-CN; US Soil Conservation Service,
1985), and (c) that the instantaneous unit hydrograph (IUH) of the basin is of the Nash type, these authors arrive at the
following equation, which allows one to express the expected value µQmax of the peak
discharge Qmax for any given flood event:
402 Armando Brath et al.
$$\mu_{Q\max} = A \mu_i \eta \left\{ \left[ \left(1 - e^{-\chi}\right)\left(1 + \kappa^2\right) - \kappa^2 \chi e^{-\chi} \left(1 + \frac{\chi}{2}\right) \right] \cdot \left[ 1 + 3\kappa^2 (1 - \eta)^2 \right] + \kappa^2 (2 - \eta) \left[ e^{-\chi} (1 + \chi) - 1 \right] \right\} \tag{5}$$
where κ is a climatic scale factor that accounts for the reduction which occurs when
considering areal rainfall instead of local rainfall (Rodriguez-Iturbe & Mejía, 1974),
while the dimensionless parameters η and χ are given by $\eta = \mu_p/(\mu_p + S)$ and
$\chi = \mu_d/\tau_L$ respectively, with $\mu_p = \mu_d \cdot \mu_i$ representing the average rainfall depth at the
centre of the shower, S representing the soil’s potential maximum retention at basin
scale (determined in accordance with the SCS-CN method) and $\tau_L$ the basin lag time.
The average value of the annual maximum flood flows, µX, which coincides with
the index flood in the present study, can be expressed as a function of µQmax, once an
appropriate schematization of the process of occurrence of the peak flood flows of the
individual events has been selected. For example, assuming that the frequency
distribution of the annual maximum flood flows can be suitably fitted by the Two
Component Extreme Value distribution (TCEV; Rossi et al., 1984), it is possible to
obtain the following relationships for µQmax and µX (Becciu et al., 1993):
$$\mu_X = \vartheta_1 \left[ \ln(\lambda_1) + 0.5772 - \sum_{i=1}^{\infty} \frac{(-1)^i (\lambda^*)^i}{i!} \, \Gamma\!\left(\frac{i}{\vartheta^*}\right) \right] = \vartheta_1 f_2(\lambda_1, \lambda^*, \vartheta^*) \tag{6}$$
$$\mu_{Q\max} = \vartheta_1 \left( \frac{\lambda_1 + \lambda_2 \vartheta^*}{\lambda_1 + \lambda_2} \right) = \vartheta_1 f_1(\lambda_1, \lambda^*, \vartheta^*) \tag{7}$$
with $\vartheta^* = \vartheta_2/\vartheta_1$ and $\lambda^* = \lambda_2/\lambda_1^{1/\vartheta^*}$, where $\lambda_1 > 0$, $\lambda_2 \ge 0$ and $0 < \vartheta_1 < \vartheta_2$ are the four
parameters of the TCEV distribution, representing, in that order, the average annual
number of events and the mean value of the two components, ordinary and extraordinary,
envisaged by this distribution. The following expression is obtained from
equations (6) and (7):
$$\mu_X = \mu_{Q\max} \, \frac{f_2(\lambda_1, \lambda^*, \vartheta^*)}{f_1(\lambda_1, \lambda^*, \vartheta^*)} \tag{8}$$
where f2(⋅) and f1(⋅) are defined by equations (6) and (7), respectively. Once the values
of the parameters λ*, ϑ* and λ1 of the TCEV law are known (they can be deduced from a
regional flood frequency analysis), expression (8) can be used to estimate µX through
µQmax, which, in turn, is a function of a set of geomorphoclimatic parameters
characterizing the basin, according to equation (5).
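As a rough numerical check of equations (6) to (8), the two TCEV factors can be evaluated with the Python standard library. The sketch below is illustrative only: the function names are hypothetical, the infinite series of equation (6) is truncated at an assumed 60 terms, and a value of ϑ* greater than 1 is assumed.

```python
import math

EULER_GAMMA = 0.5772156649  # Euler-Mascheroni constant (0.5772 in equation (6))


def annual_max_factor(lambda1: float, lambda_star: float, theta_star: float,
                      n_terms: int = 60) -> float:
    """Bracketed term of equation (6), i.e. mu_X / theta_1."""
    series = sum(
        (-1) ** i * lambda_star ** i / math.factorial(i) * math.gamma(i / theta_star)
        for i in range(1, n_terms + 1)
    )
    return math.log(lambda1) + EULER_GAMMA - series


def event_peak_factor(lambda1: float, lambda_star: float, theta_star: float) -> float:
    """Ratio of equation (7), i.e. mu_Qmax / theta_1."""
    lambda2 = lambda_star * lambda1 ** (1.0 / theta_star)
    return (lambda1 + lambda2 * theta_star) / (lambda1 + lambda2)


def index_flood_from_event_mean(mu_qmax: float, lambda1: float,
                                lambda_star: float, theta_star: float) -> float:
    """Equation (8): scale the mean event peak up to the mean annual maximum."""
    return (mu_qmax * annual_max_factor(lambda1, lambda_star, theta_star)
            / event_peak_factor(lambda1, lambda_star, theta_star))


# Regional parameters reported in the paper for the Emilia basins.
emilia_annual = annual_max_factor(9.39, 0.13, 1.34)
emilia_event = event_peak_factor(9.39, 0.13, 1.34)
```

With the Emilia parameters (λ1 = 9.39, λ* = 0.13, ϑ* = 1.34) the two factors come out at approximately 2.97 and 1.02, the pair of values reported for these basins in the application of the geomorphoclimatic model, so the mean annual maximum is roughly three times the mean event peak.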
It should be noted that, in general, the use of the geomorphoclimatic model is not
dependent on the choice of the TCEV distribution, which can in fact be substituted by
another probabilistic law such as, for example, the Generalized Extreme Value (GEV)
distribution (Brath et al., 1996). In this presentation, however, only the approach based
on the TCEV distribution is presented, since the numerical application described later
refers to a region for which the TCEV distribution has proved to be a suitable model
for regional flood frequency analyses (Franchini & Galeati, 1996; Brath et al., 1997b).
Accordingly, the estimates of λ*, ϑ* and λ1 required for the parameterization of
equation (8) were retrieved from these latter papers.
Available data
In order to analyse the various index flood estimation models, reference was made to
the data available at the gauging stations located within the geographic region shown
in Fig. 1, situated in central-northern Italy and referred to henceforth as “Emilia-Romagna-Marche”.
The study considered only the gauging stations with at least
15 years of observed annual maximum floods, identifying a total of 33 stations (Fig. 1)
having a minimum, average and maximum number of observations of 15, 32 and 74
respectively. The size of the corresponding basins ranges from 6.3 to 1439 km², with
an average area of 421.5 km². Numerous data were collected on the morphology,
permeability, soil use and rainfall regime of these basins.
The morphological information employed in the study was partially obtained from
the National Hydrographic Service of Italy (SIMN) and partially derived from the
topography maps (1:100 000) produced by the Geographic Military Institute of Italy
(IGM). Thus, for instance, the watershed area A and its highest, zmax, medium, zmed, and
lowest, zmin, altitudes were obtained from SIMN, while the mainstream length L and
the Hortonian bifurcation, length and area ratios were derived from the topography
maps. Further morphological indexes were then evaluated from these quantities, such
as the average slope of the mainstream, a shape factor (i.e. $A/L^2$), an elongation ratio
(i.e. $2\sqrt{A/\pi}/L$) and the time of concentration of the basin Tc. In particular, this latter
parameter was calculated according to Giandotti’s empirical formula:
$$T_c = \frac{4\sqrt{A} + 1.5\,L}{0.8\sqrt{z_{med} - z_{min}}} \tag{9}$$
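Giandotti's formula is simple enough to check by hand. The sketch below assumes the units conventionally used with this formula (A in km², L in km, altitudes in m, Tc in hours), which the text does not restate; the function name and the sample basin are hypothetical.

```python
import math


def giandotti_tc(area_km2: float, length_km: float,
                 z_med_m: float, z_min_m: float) -> float:
    """Time of concentration Tc (hours) from Giandotti's formula, equation (9).

    ASSUMPTION: area in km2, mainstream length in km, altitudes in m.
    """
    relief = z_med_m - z_min_m
    if relief <= 0:
        raise ValueError("mean altitude must exceed outlet altitude")
    return (4.0 * math.sqrt(area_km2) + 1.5 * length_km) / (0.8 * math.sqrt(relief))


# Hypothetical basin: 100 km2, 20 km mainstream, 400 m between mean and outlet altitude.
tc_example = giandotti_tc(100.0, 20.0, 500.0, 100.0)  # (40 + 30) / (0.8 * 20) = 4.375 h
```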
The permeability and soil use characteristics of the basins were extracted from a
Geographic Information System (GIS) of the area of interest, which identifies a
subdivision in five soil-use classes and three permeability classes (Table 1). Using the
GIS tools it was possible to obtain automatically the portion of each basin falling within
each of the permeability and soil-use classes set out in the classification.
The rainfall regime information consists of the maximum annual rainfall values for
durations of 1 h, 24 h and one day, collected at the gauging stations present in the area
under study. The difference existing between 24-h and 1-day annual maxima is due to
the fact that the former are obtained for a given raingauge from the hourly rainfall
series by using a moving time-window of size 24 h, whereas the latter are achieved
directly from the daily rainfall series (e.g. rainfall depth observed from 10:00 to
10:00). Only stations with at least 20 years of observed data were considered, thereby
obtaining 169 stations for rainfall with duration of 1 h and 24 h and 495 stations for
daily rainfall. Numerous gauging stations located outside the area of study were added
to this set in order to improve the description of the rainfall regime in the boundary
areas. Once the average value of the samples of data collected had been calculated at
each station and for a set duration (d = 1 h, 24 h, 1 day), the corresponding maps of the
contour lines were drawn for the entire area under study (see, for example, Fig. 2).
These maps were then used to estimate, for each of the 33 basins involved, the mean
value of the rainfall depths in the basin centroid for each of the three durations considered,
$\mu_{P,1}$, $\mu_{P,24}$ and $\mu_{P,day}$, which are referred to henceforth as reference local
rainfall values. Using the data available for d = 1 h and 24 h, the indices µi and µd
were estimated at each station by applying the method described in Bacchi et al.
(1989), and their corresponding contour line representations were drawn. Lastly, the
mean values of µi and µd were inferred over the area of each basin; these values,
namely $\mu_i^r$ and $\mu_d^r$, are referred to hereafter as reference local values of the mean
intensity and duration of rainstorms.
Fig. 2 Contour line representation of the mean annual maximum rainfall depth for
storm duration of 1 h (mm).
The estimates $\hat{\lambda}^*$, $\hat{\vartheta}^*$ and $\hat{\lambda}_1|_{\hat{\lambda}^*, \hat{\vartheta}^*}$ of the parameters of the regional TCEV growth
curve, required for application of the geomorphoclimatic model, were extracted
from the studies developed by Franchini & Galeati (1996) and Brath et al. (1997b).
The first study showed that for the Romagna-Marche region, including basins
numbered 11 to 33, a single TCEV growth curve can be used with parameters
$\hat{\lambda}^* = 0.75$, $\hat{\vartheta}^* = 2.51$ and $\hat{\lambda}_1|_{\hat{\lambda}^*, \hat{\vartheta}^*} = 9.50$. The second study, on the other hand,
showed that for the Emilia region, encompassing basins numbered 1–10, a TCEV
regional growth curve can be assumed with the following parameters: $\hat{\lambda}^* = 0.13$,
$\hat{\vartheta}^* = 1.34$ and $\hat{\lambda}_1|_{\hat{\lambda}^*, \hat{\vartheta}^*} = 9.39$. As one may notice, the values of the parameters for the
two TCEV regional models are rather different. The underlying reason is that the severe
storm events which occur in the two regions have different origins: the storms occurring
in the Emilia region mainly come from the Tyrrhenian Sea (see Fig. 1), while those in
the Romagna-Marche region mostly arrive from the cold areas of central Europe.
Consequently, these regions are characterized by two different flood frequency
distributions as shown in Fig. 3 (Franchini & Galeati, 1996; Brath et al., 1997b).
Fig. 3 Theoretical regional growth curves and observed frequency distributions of the
annual maximum peak discharge for Emilia (sites 1–10) and Romagna-Marche (sites
11–33); growth factor [-] against recurrence interval T [years], from T = 2 to T = 5000.
Multiregression model
The coefficients Cj of the model (equation (1)) were estimated by means of the least-
squares method, using the information available at the gauging stations. All the
geomorphoclimatic quantities mentioned at the beginning of the present section were
considered as independent variables Aj, to which were added the percentage of wooded
area in the basin and the reduced area Ared, defined as the part of the basin characterized
by low and medium permeability according to the classification given in Table 1.
The definition of how many and which variables Aj to use in expression (1) was
undertaken following a stepwise regression analysis procedure (Draper & Smith,
1981). Furthermore, to ensure the maximum “robustness” of the model and thus limit
the reliability problems typical of purely statistical links, this procedure was coupled
with the jack-knife method (Shao & Tu, 1995). In this way the following equation was
obtained:
$$\hat{\mu}_X = 2.904 \cdot 10^{-5} \, A_{red}^{1.299} \, \mu_{P,1}^{3.504} \, L^{-0.778} \tag{10}$$
where Ared is expressed in km², $\mu_{P,1}$ in mm, the length of the main stream L in km, and
µX in m³ s⁻¹. The use of statistical relations characterized by more than three
explanatory variables was not deemed appropriate, since it was observed that the
determination coefficient increases to an insignificant degree and that, at the same
time, the estimation variance of the coefficients of equation (1) increases considerably.
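A quick evaluation of equation (10) helps to check the order of magnitude it produces. The sketch below reads the exponents as 1.299 on Ared, 3.504 on the hourly rainfall depth and −0.778 on L, which is the assignment that yields discharges of a plausible size; the function name and the sample basin are hypothetical.

```python
def multiregression_index_flood(a_red_km2: float, mu_p1_mm: float,
                                length_km: float) -> float:
    """Index flood (m3/s) from the fitted power law of equation (10).

    ASSUMPTION: exponents assigned as A_red^1.299, mu_P1^3.504, L^-0.778.
    """
    return 2.904e-5 * a_red_km2 ** 1.299 * mu_p1_mm ** 3.504 * length_km ** -0.778


# Hypothetical basin: 300 km2 of reduced area, 30 mm hourly rainfall, 40 km mainstream.
flood_example = multiregression_index_flood(300.0, 30.0, 40.0)
```

For this made-up basin the result is of the order of 400 m³/s, within the range of observed index floods shown in the scatter plots.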
Table 2 summarizes the performance of the multiregression model (equation (10))
in terms of determination coefficient, $R^2$, and expectation of the absolute value of
relative error, $E[|\varepsilon_{rel}|]$. These statistical indexes were calculated according to the
following equations:
$$R^2 = 1 - \frac{\sum_{i=1}^{N} \left(\hat{\mu}_X^i - \mu_X^i\right)^2}{\sum_{i=1}^{N} \left(\mu_X^i - \bar{\mu}_X\right)^2} \tag{11}$$
$$E[|\varepsilon_{rel}|] = \frac{1}{N} \sum_{i=1}^{N} \left| \frac{\hat{\mu}_X^i - \mu_X^i}{\mu_X^i} \right| \tag{12}$$
where N is the number of sites in the area of interest; $\mu_X^i$ and $\hat{\mu}_X^i$ are the observed and
estimated values of the index flood for the site i; $\bar{\mu}_X$ is the average value of $\mu_X^i$.
Moreover Table 2 sets out the analogous coefficients, $R^2_{jk}$ and $E[|\varepsilon_{rel,jk}|]$, obtained
from equations (11) and (12) in which the estimates $\hat{\mu}_X^i$, with i = 1, ..., N, are derived
by applying a jack-knife procedure. The procedure is applied N times and each time it
estimates $\hat{\mu}_X^i$ through an equation having the structure of equation (10), the coefficients
of which are calibrated by the least-squares method, using the index floods
observed at every gauging station but the site i. The other two quantities shown in
Table 2, $\Delta R^2 = R^2 - R^2_{jk}$ and $\Delta E[|\varepsilon_{rel}|] = E[|\varepsilon_{rel,jk}|] - E[|\varepsilon_{rel}|]$, can be used as
measures of model robustness (the smaller their value, the more robust the model).
Lastly, Fig. 4 contains the dispersion diagrams characteristic of the multiregression
[Fig. 4: scatter plot of predicted index flood (m3/s) against observed index flood (m3/s), showing standard estimates and jack-knife estimates.]
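The performance indices of equations (11) and (12) and the leave-one-out logic of the jack-knife procedure can be sketched as follows. The `fit` and `predict` callables stand for any of the three models; the one-parameter proportional model and the numbers used here are purely hypothetical stand-ins, not the paper's data.

```python
from typing import Callable, List, Sequence


def r_squared(observed: Sequence[float], estimated: Sequence[float]) -> float:
    """Determination coefficient, equation (11)."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((e - o) ** 2 for o, e in zip(observed, estimated))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


def mean_abs_relative_error(observed: Sequence[float],
                            estimated: Sequence[float]) -> float:
    """Expectation of the absolute relative error, equation (12)."""
    return sum(abs(e - o) / o for o, e in zip(observed, estimated)) / len(observed)


def jackknife_estimates(x: Sequence[float], y: Sequence[float],
                        fit: Callable, predict: Callable) -> List[float]:
    """Estimate y[i] with a model calibrated on every site except i."""
    xs, ys = list(x), list(y)
    estimates = []
    for i in range(len(ys)):
        params = fit(xs[:i] + xs[i + 1:], ys[:i] + ys[i + 1:])
        estimates.append(predict(params, xs[i]))
    return estimates


# Hypothetical stand-in model y = c * x, fitted by least squares.
def fit_proportional(x, y):
    return sum(a * b for a, b in zip(x, y)) / sum(a * a for a in x)


def predict_proportional(c, x_i):
    return c * x_i


areas = [1.0, 2.0, 3.0, 4.0]   # illustrative predictor values
floods = [2.0, 4.0, 6.0, 8.0]  # illustrative observed index floods
jk = jackknife_estimates(areas, floods, fit_proportional, predict_proportional)
delta_r2 = r_squared(floods, floods) - r_squared(floods, jk)
```

The difference between the index computed on the standard estimates and the one computed on the jack-knife estimates is the robustness measure reported in Table 2.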
Rational model
The application of this model requires estimation of $\mu'_{P,1}$ and $\mu'_{P,24}$, which are used in
equations (3) and (4), the time of concentration of the basin Tc and the definition of an
explicit form for the relationship ψ = ψ(Cp, Cu). The estimation of $\mu'_{P,1}$ and $\mu'_{P,24}$ was
performed as follows: (a) for each basin, reference point rainfall values $\mu_{P,1}$ and $\mu_{P,24}$
were selected by referring to the relevant centroids as described above; (b) the Areal
Reduction Factor (ARF) for the duration d and the area A, ARF(d, A), was calculated
by using the formula proposed by the US Weather Bureau (1958), the coefficients of
which were re-estimated to optimize the reproduction of the rainfall data available in
the area under study (Brath et al., 1999):
$$\mathrm{ARF}(d, A) = 1 - \left[1 - \exp(-0.01298\,A)\right] \exp\!\left(-0.679\,d^{0.3320}\right) \tag{13}$$
and (c) the parameters $\mu'_{P,1}$ and $\mu'_{P,24}$ were then estimated as $\mu'_{P,1} = \mathrm{ARF}(1, A)\,\mu_{P,1}$ and
$\mu'_{P,24} = \mathrm{ARF}(24, A)\,\mu_{P,24}$.
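Equation (13) and step (c) translate directly into code. In the sketch below the duration is taken in hours and the area in km², consistent with the way d and A are used elsewhere in the text; the function names and the sample values are hypothetical.

```python
import math


def areal_reduction_factor(duration_h: float, area_km2: float) -> float:
    """Areal reduction factor ARF(d, A) of equation (13).

    ASSUMPTION: duration in hours and area in km2.
    """
    return 1.0 - (1.0 - math.exp(-0.01298 * area_km2)) * math.exp(
        -0.679 * duration_h ** 0.3320)


def areal_rainfall(point_depth_mm: float, duration_h: float, area_km2: float) -> float:
    """Step (c): scale a reference point rainfall depth to its areal value."""
    return areal_reduction_factor(duration_h, area_km2) * point_depth_mm


arf_1h = areal_reduction_factor(1.0, 100.0)    # about 0.63 for a 100 km2 basin
arf_24h = areal_reduction_factor(24.0, 100.0)  # closer to 1 for the longer duration
```

The factor tends to 1 as the area shrinks or the duration grows, so the areal correction matters most for short storms over large basins.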
The law ψ = ψ(Cp, Cu) was identified on the basis of the permeability and soil use
information of the various basins. The process was structured around repeated
attempts, which considered the various forms of the law ψ = ψ(Cp, Cu) and verified
their performance, using the jack-knife procedure, in terms of their ability to describe
the observed values of µX. The form ultimately adopted is as follows:
$$\psi = a \left( \frac{A_{red}}{A} \right)^{b} \tag{14}$$
The coefficients a and b of equation (14) were estimated by applying the least-squares
method to the deviations between the observed index flood values and the index flood
values calculated using formula (4). The estimates obtained were â = 0.610 and
b̂ = 1.366, while the related values of variance were very small as a consequence of the
quite large amount of data used for their estimation. The estimate obtained for a,
â = 0.610, represents the maximum value that ψ can assume when Ared ≡ A. This value
is significantly lower than 1; however, one should recall that Ared represents the poorly
permeable or almost impervious portions of a basin (which are characterized by values
of ψ very near to 1), along with those parts with a medium relative permeability, which
are characterized by smaller values of ψ. As a consequence, the maximum value of ψ
reflects the combined effect of the two levels of permeability thus assuming a value
around 0.6.
Figure 5 shows the scatter plot of the index flood estimates furnished by the
rational model in two cases. In the first case, the model employs the values â = 0.610
and b̂ = 1.366 estimated by using the index floods observed at all available gauging
stations (standard estimates). In the second case, a jack-knife procedure is applied: the
estimate of µX for each site is now obtained from the rational model where the
coefficients a and b of equation (14) are calibrated using the data pooled from all
available sites except for the station to which the µX estimate refers (jack-knife
estimates). Table 2 summarizes the statistical performance indices of the rational
model applied with (i.e. $R^2_{jk}$ and $E[|\varepsilon_{rel,jk}|]$) or without (i.e. $R^2$ and $E[|\varepsilon_{rel}|]$) the
jack-knife procedure, along with the robustness indices presented before (i.e. $\Delta R^2$ and
$\Delta E[|\varepsilon_{rel}|]$).
Fig. 5 Rational model: observed vs predicted index flood (m3 s-1).
Geomorphoclimatic model
The application of the geomorphoclimatic model requires the estimation of µi, µd, κ, τL
and S, and also of the parameters λ*, ϑ* and λ1 which define the regional growth curve
according to the TCEV distribution. As discussed above, the entire area examined
comprises two homogeneous regions for which, therefore, two separate sets of λ*, ϑ*
and λ1 values are identified. The estimates shown were used to calculate the
coefficients f1 and f2 of equations (6) and (7), obtaining for the Emilia basins
(i.e. basins 1–10): f1 = 1.023 and f2 = 2.969 and for the Romagna-Marche basins
(i.e. basins 11–33): f1 = 1.245 and f2 = 4.128.
The values to be used in equation (5) of parameters µi and µd should be regarded
as local values characteristic of each basin and they can therefore be taken as equal to
$\mu_i^r$ and $\mu_d^r$ respectively. The climatic scale factor κ, as suggested in Brath et al.
(1997a), was considered equal to the areal reduction factor ARF(d, A), for a rainfall
duration d equal to Tc, and it was calculated using equation (13).
The lag time τL can be estimated using one of the many empirical relations that
link this parameter to morphometric indices of the basin. In this study various relations
were examined, and the best results were obtained with the law:
$$\hat{\tau}_L = \xi \, \frac{L}{3.6\,v} \tag{15}$$
First parameterization The values CNi = CNi(Cp, Cu) were inferred directly from
the indications reported by Borselli et al. (1992) for Italian soils, obtaining the values
reported in Table 3. The coefficient ξ was calibrated over the whole area of interest by
using a least squares optimization procedure with respect to the observed values of
index flood, µX. The optimization was performed twice, firstly by simultaneously
utilizing all 33 observed µX ( ξ̂ = 0.536) and secondly by applying a jack-knife
procedure. The results are shown in Fig. 6 and the model performances are
summarized by the statistical indexes of Table 2.
Table 3 Values of curve number (CN) for different combinations of land-use and permeability classes.
Land-use class    Permeability class
                  I      II     III
1                 84     74     50
2                 82     70     40
3                 80     67     35
4                 98     90     80
5                 90     83     55
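The step from the curve numbers of Table 3 to the retention S used in equation (5) can be sketched as follows. Two points are assumptions rather than statements from the text: the conversion uses the standard metric SCS relation S = 25400/CN − 254 (in mm), and the basin-scale CN is taken as the area-weighted mean over the land-use and permeability classes. The basin composition in the example is hypothetical.

```python
from typing import Dict, Tuple

# Curve numbers of Table 3, keyed by (land-use class, permeability class).
CN_TABLE: Dict[Tuple[int, str], float] = {
    (1, "I"): 84, (1, "II"): 74, (1, "III"): 50,
    (2, "I"): 82, (2, "II"): 70, (2, "III"): 40,
    (3, "I"): 80, (3, "II"): 67, (3, "III"): 35,
    (4, "I"): 98, (4, "II"): 90, (4, "III"): 80,
    (5, "I"): 90, (5, "II"): 83, (5, "III"): 55,
}


def potential_retention_mm(cn: float) -> float:
    """Standard metric SCS relation between CN and retention S (mm)."""
    return 25400.0 / cn - 254.0


def basin_curve_number(fractions: Dict[Tuple[int, str], float]) -> float:
    """ASSUMPTION: basin CN is the area-weighted mean of the class CNs."""
    total = sum(fractions.values())
    return sum(CN_TABLE[key] * share for key, share in fractions.items()) / total


# Hypothetical basin: half in class (1, I), half in class (3, III).
cn_basin = basin_curve_number({(1, "I"): 0.5, (3, "III"): 0.5})
s_basin = potential_retention_mm(cn_basin)
```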
Fig. 6 Geomorphoclimatic model, first parameterization: observed vs predicted index
flood (m3 s-1).
PRESENTATION OF RESULTS
With regard to the multiregression model, it is observed that equation (10) contains
one parameter representative of the area of the basin, one representing the rainfall
regime, and a third parameter linked to the shape of the basin itself. Only a few of the
many parameters available to describe the basin morphology and climate were found
to be useful as explanatory variables of the index flood. This indicates that, due to the
close statistical correlation between these parameters (cf. Pitlik, 1994), it is sufficient
to consider a very small number of them in order to represent the effects of the
morphology on the index flood.
With regard to the rational model, it can be observed that equation (14) adopted for
the runoff coefficient ψ does not take into consideration the dependence of ψ on the
soil use characteristics Cu, but only its dependence on the permeability characteristics
Cp, as expressed through reduced area Ared. The introduction of the soil use
characteristics did not, in fact, produce any significant improvements in the ability to
represent the observed values of µX by means of an expression similar to equation (2).
Fig. 7 Geomorphoclimatic model, second parameterization: observed vs predicted
index flood (m3 s-1).
The geomorphoclimatic model, insofar as it seeks to represent and characterize the
phenomena underlying the formation of floods in more detail than the preceding
models, is more complex to use, since its application requires the definition of eight
parameters. However, the estimation of the parameters µi, µd and κ is performed using
coded techniques and therefore with a minimal margin of subjectivity; the same
applies to the parameters λ*, ϑ* and λ1, which can be obtained on the basis of
estimation procedures typical of the TCEV model. More caution is required in
estimating the parameter τL, in respect of which both of the parameterizations
described above highlighted the need to adopt a calibration procedure.
The performance indices of the various models are shown in Table 2, which
contains the values of the determination coefficients and the mean relative errors of
each model, both in the case in which the data of all 33 gauging stations are used
simultaneously and in the case in which the jack-knife procedure is used. With
reference to the values of $R^2$, it was immediately noted that the rational model
produced the worst results, whereas the best results were obtained from the multiregression
model. Similar considerations may arise with reference to the values of $R^2_{jk}$.
An analysis of the expectations of the absolute value of the relative error, $E[|\varepsilon_{rel}|]$,
produces different conclusions. In fact, according to the values of $E[|\varepsilon_{rel}|]$, the
rational model is seen to perform better than the geomorphoclimatic one, even though
the multiregression model continues to produce the best results.
In summary, the multiregression model is the one which performs best; the rational
model is the one which performs worst in terms of explained variance; the
geomorphoclimatic model seems to show promising performance in terms of $R^2$, even
though some significant differences are present between the two examined
parameterizations. The second parameterization outperforms the first parameterization
in terms of both the performance indexes, $R^2$ and $E[|\varepsilon_{rel}|]$. Conversely, the first
parameterization shows lower values of the indexes $\Delta R^2$ and $\Delta E[|\varepsilon_{rel}|]$, and therefore
a lower sensitivity to the jack-knife procedure, not only with respect to the second
parameterization, but also to all other models (see Table 2).
These results were somewhat expected. In fact, they prove that the more flexible
the structure of the model (e.g. multiregression model), the higher its capability to fit
the observed data. On the other hand they seem to show that the sensitivity of the
model to the jack-knife procedure becomes higher as the number of parameters
evaluated via statistical optimization increases. Accordingly, the first parameterization
of the geomorphoclimatic model shows the least sensitivity to the jack-knife procedure
and is therefore the most robust.
The three box plots shown in Fig. 8 represent the distributions of the relative
estimation errors obtained by applying the jack-knife procedure to the multiregression
model, the rational model and the second parameterization of the geomorphoclimatic
model (the latter was preferred to the first parameterization because of its better
performance in terms of both R2 and E[|εrel|]). Each box plot shows the median, the
upper and lower quartiles, and the minimum and maximum values, excluding the outliers
(•). Outliers are defined as values lying more than 1.5 times the inter-quartile
distance below the lower quartile or above the upper quartile. Low error dispersion
and the absence of bias highlight the superior performance of the multiregression
model. On the other hand, both of the conceptual models reveal a tendency to
overestimate the index flood, and this tendency is particularly noticeable in the
geomorphoclimatic model.
Fig. 8 Box plot representations (median, upper and lower quartiles, highest and
lowest values and outliers, •) of the relative errors for the jack-knife estimates:
(a) multiregression model, (b) rational model, and (c) geomorphoclimatic model,
second parameterization.
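The outlier rule used for the box plots can be sketched as follows. This is a minimal illustration: the function name, the quartile convention (linear interpolation, the NumPy default) and the sample of relative errors are assumptions, not values taken from Fig. 8.

```python
import numpy as np

def box_plot_summary(values, k=1.5):
    """Median, quartiles, whisker ends and outliers under the k * IQR rule."""
    v = np.sort(np.asarray(values, dtype=float))
    q1, med, q3 = np.percentile(v, [25, 50, 75])
    iqr = q3 - q1                               # inter-quartile distance
    low_fence, high_fence = q1 - k * iqr, q3 + k * iqr
    inside = v[(v >= low_fence) & (v <= high_fence)]
    outliers = v[(v < low_fence) | (v > high_fence)]
    return {
        "median": float(med),
        "lower_quartile": float(q1),
        "upper_quartile": float(q3),
        "whisker_low": float(inside.min()),     # minimum excluding outliers
        "whisker_high": float(inside.max()),    # maximum excluding outliers
        "outliers": outliers.tolist(),
    }

# Hypothetical relative errors; 4.2 mimics an overestimation in excess of 400%.
summary = box_plot_summary([-0.3, -0.1, 0.0, 0.1, 0.2, 0.4, 4.2])
print(summary)
```

A median close to zero indicates the absence of bias, while the distance between the quartiles measures the dispersion of the errors.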
As shown in Fig. 8, for the two gauging stations 28 and 29, both located on the
Chienti River, all three models produce very large overestimation errors (even in
excess of 400%). These basins have very high percentages of permeable area, as a
result of which their unit flood contributions are much lower than those of the
remaining basins. The singular nature of these basins makes it difficult to describe
their hydrological behaviour using models such as those considered here which, by
their very nature, tend to represent the dominant mode of behaviour in the set of basins
considered for their calibration. This is certainly true of the multiregression model,
which is purely statistical in structure; however, it is also true of the remaining
models, given that at least some of their parameters are identified using an
optimization procedure. An examination of the relative estimation errors for each
gauging station, reported in Fig. 9, shows that sizeable overestimation errors are also
obtained for basins 27 and 30, which are located close to the above-mentioned basins 28
and 29 and are likewise characterized by very high permeability, though lower than
that of basins 28 and 29. Furthermore, Fig. 9 shows that, for basins 16, 18 and 19, the
conceptual models produce very large deviations, with values approaching 200%. These
basins are the smallest of all those examined (A = 11.8, 23.1 and 6.3 km2
respectively), well below the average area of the others (A = 462.2 km2), and they
therefore present values of one geomorphological characteristic, in this case the area,
that are very different from those typical of the set of basins examined. This appears
to confirm that basins whose characteristics differ greatly from the average of the set
tend to be poorly represented.
Fig. 9 Comparison between the relative errors of the index flood estimates obtained
with the jack-knife procedure for the multiregression model, the rational model and the
second parameterization of the geomorphoclimatic model (x axis: number of the
hydrometric station; y axis: relative error of index-flood estimation).
Table 4 Statistical indexes of model performance once stations 28 and 29 were excluded (see Table 2
for the meaning of the terms).
Model                                        R2     R2jk   ∆R2    E[|εrel|]  E[|εrel,jk|]  ∆E[|εrel|]
Multiregression                              0.954  0.928  0.026  0.220      0.250         0.030
Rational                                     0.768  0.717  0.051  0.410      0.440         0.030
Geomorphoclimatic, first parameterization    0.889  0.873  0.016  0.489      0.495         0.007
Geomorphoclimatic, second parameterization   0.913  0.887  0.026  0.440      0.463         0.023
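As a reading aid, the short sketch below recomputes the two difference columns of Table 4 from the other four. It assumes, as the tabulated values suggest, that ∆R2 = R2 − R2jk and ∆E[|εrel|] = E[|εrel,jk|] − E[|εrel|]; the definitions given with Table 2 take precedence. The variable and function names are illustrative.

```python
# Values copied from Table 4: (R2, R2jk, dR2, E, Ejk, dE)
table4 = {
    "Multiregression": (0.954, 0.928, 0.026, 0.220, 0.250, 0.030),
    "Rational": (0.768, 0.717, 0.051, 0.410, 0.440, 0.030),
    "Geomorphoclimatic, first parameterization": (0.889, 0.873, 0.016, 0.489, 0.495, 0.007),
    "Geomorphoclimatic, second parameterization": (0.913, 0.887, 0.026, 0.440, 0.463, 0.023),
}

def max_discrepancy(rows):
    """Largest gap between a tabulated difference and the recomputed one."""
    worst = 0.0
    for r2, r2_jk, d_r2, err, err_jk, d_err in rows.values():
        worst = max(worst, abs((r2 - r2_jk) - d_r2), abs((err_jk - err) - d_err))
    return worst

# Model with the smallest loss of R2 under the jack-knife procedure.
least_sensitive = min(table4, key=lambda name: table4[name][2])

print(f"largest discrepancy: {max_discrepancy(table4):.4f}")
print(f"least sensitive model: {least_sensitive}")
```

The recomputed differences agree with the tabulated ones to within rounding of the third decimal place.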
Next, the performance provided by the direct estimation method in relation to the
size of the available sample was analysed. For this purpose, the direct estimates of µX
were calculated for each site using a moving average process within a time window of
size F = 2, 5 and 10 years. These N–F+1 direct estimates, where N is the length of the
available sample of observations for the considered gauging station, were compared to
the average of the N observed annual maximum floods (i.e. observed µX) and the
corresponding relative errors were then calculated. These errors were compared with
those of the jack-knife estimates resulting from the application of the indirect
multiregression model (equation (10)), selected because it produces the best
performance indicators. The three box plots in Fig. 10 show the distributions of the
percentages of cases in which the absolute value of the relative error produced by
the direct estimate, for F = 2, 5 and 10 years respectively, is less than that
of the indirect multiregression model. Comparing the three distributions in Fig. 10, it is
found that, as expected, the performance of the direct estimate improves substantially
as the time window increases: the median of the distributions rises from 47%
for F = 2 years to 58% for F = 5 years, and to 86% for F = 10 years.
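The comparison described above can be sketched for a single station as follows. The function names, the series of annual maxima and the indirect estimate are hypothetical; only the procedure (N–F+1 moving-window means compared with the mean of all N observations) follows the text.

```python
import numpy as np

def direct_estimates(annual_maxima, window):
    """Moving averages over `window` consecutive years: N - F + 1 estimates."""
    x = np.asarray(annual_maxima, dtype=float)
    kernel = np.ones(window) / window
    return np.convolve(x, kernel, mode="valid")

def share_direct_better(annual_maxima, indirect_estimate, window):
    """Percentage of windows whose absolute relative error is below the indirect one."""
    x = np.asarray(annual_maxima, dtype=float)
    mu_obs = x.mean()                                   # observed index flood
    direct_err = np.abs(direct_estimates(x, window) - mu_obs) / mu_obs
    indirect_err = abs(indirect_estimate - mu_obs) / mu_obs
    return 100.0 * np.mean(direct_err < indirect_err)

# Hypothetical annual maximum floods (m3/s) and jack-knife indirect estimate.
floods = [120, 85, 210, 150, 95, 300, 175, 130, 160, 110, 240, 145]
indirect = 250.0

for F in (2, 5, 10):
    pct = share_direct_better(floods, indirect, F)
    print(f"F = {F:2d} years: direct estimate better in {pct:.0f}% of windows")
```

Repeating the calculation for every station gives, for each F, the set of percentages summarized by one box plot.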
In Fig. 11, the case F = 2 is analysed station by station. Even with such a limited
number of observations, it can be seen that, for stations 28 and 29, the direct estimate is
better than the indirect method in 100% of cases. Similar results are obtained for basins
27 and 30, which are also characterized by a permeability much greater than the average
permeability of the set of basins, although it is lower than that of basins 28 and 29.
Fig. 10 Box plot representations (median, upper and lower quartiles, highest and
lowest values) of the percentages of cases in which the direct estimation of the index
flood based on few annual data is more precise than the indirect one (2 yr, 5 yr and
10 yr stand for 2-, 5- and 10-year data, respectively).
Fig. 11 Percentage of cases in which direct estimates of index flood based upon two
years of observation are characterized by lower relative errors than indirect ones
(x axis: number of the hydrometric station).
The results obtained agree well with those of Hebson & Cunnane (1987) and
confirm that, even when only a limited amount of data is available, direct estimation
may be preferable to the indirect methods, especially for basins with
geomorphoclimatic characteristics very different from the average characteristics of
the set of basins considered for the identification of the indirect estimation models.
With respect to these basins, it may be useful, if the number of gauged sections allows
it, to undertake specific sub-region scale studies, grouping the basins together on the
basis of those characteristics which make them unusual, in order to parameterize more
effectively the indirect index flood estimation methods to be used for the ungauged
sections of the sub-region.
CONCLUSIONS
This study analysed several indirect methods for estimating the index flood, in order
to highlight both the problems encountered in their application and the quality of
their performance. The first model considered was a multiregression model in which
the explanatory variables were selected using stepwise regression techniques and the
jack-knife procedure. Next, two conceptual models were considered, namely the
rational model and the geomorphoclimatic model, the latter being based on
the analytical derivation of the flood frequency distribution and parameterized in two
different ways.
These models were applied to a set of 33 gauging stations located over an
extensive geographical area of central-northern Italy. The results show that the
multiregression model best describes the index flood values at the stations used; the
first parameterization of the geomorphoclimatic model, on the other hand, shows
the least sensitivity to the jack-knife procedure; the rational model, despite reducing
the expectation of the relative estimation error, is characterized by the lowest explained
variance. These conclusions are not definitive, but they do indicate that different
levels of reliability should be assigned to the various index flood estimation models.
The results also show the superior descriptive ability, as one might intuitively have
expected, of models without a predefined structure, such as the statistical type, over
the conceptual models, which have a more rigid structure since they are derived from
an attempt to interpret the dynamics of the rainfall–runoff transformation. By
contrast, a more pronounced conceptualization of the natural phenomena, as in the
case of the geomorphoclimatic model, may reduce the influence that the information
specific to any single gauging station has on model parameterization, making the
model itself less sensitive to the jack-knife procedure.
The analysis undertaken also shows that the indirect estimation of the index flood
using both statistical and conceptual models presents limited reliability in the case of
basins whose geomorphoclimatic characteristics are very different from the average for
the set of basins used for the identification of the models themselves. With respect to
these basins, it was seen that it may be preferable to estimate the index flood using the
direct method, even when the amount of available sample information is very limited.
At the same time, it may be useful to exclude these basins from the set used for the
identification of the indirect estimation model, since their presence could diminish its
descriptive ability with regard to the remaining basins. Moreover, it may be advisable
to develop, should the number of basins allow it, an indirect model on a smaller spatial
scale, relating to a sub-region defined by them.
REFERENCES
Adom, D. N., Bacchi, B., Brath, A. & Rosso, R. (1989) On the geomorphoclimatic derivation of flood frequency (peak and
volume) at the basin and regional scale. In: New Directions for Surface Water Modeling (ed. by M. L. Kavvas) (Proc.
Baltimore Symp., May 1989), 165–176. IAHS Publ. no. 181.
Bacchi, B., Burlando, P. & Rosso, R. (1989) Extreme value analysis of stochastic models of temporal rainfall. Poster paper
presented at the Third IAHS Scientific Assembly, Baltimore, Maryland, USA, May 1989.
Becciu, G., Brath, A. & Rosso, R. (1993) A physically based methodology for regional flood frequency analysis. In:
Engineering Hydrology (ed. by C. Y. Kuo), 461–466. Am. Soc. Civil Engng, New York, USA.
Borselli, L., Busoni, E. & Torri, D. (1992) Applicabilità del S.C.S. Curve Number Method: il fattore lambda per la stima
del deflusso superficiale (Applicability of the US SCS Curve Number Method: the lambda factor for estimating
surface runoff, in Italian). Linea 1 – 1989 Report, CNR—GNDCI (Group for the Prevention of Hydrogeological
Disasters), Arti Grafiche Lux, Genova, Italy, 43–52.
Brath, A., Castellarin, A., Franchini, M. & Galeati, G. (1999) La stima della portata indice mediante metodi indiretti
(Indirect methods for index flood estimation, in Italian). L'Acqua 6, 21–34.
Brath, A., De Michele, C. & Rosso, R. (1996) Una metodologia indiretta a base concettuale per la valutazione della portata
indice (A conceptual model for indirect index-flood estimation, in Italian). In: Proc. XXV Convegno di Idraulica e
Costruzioni Idrauliche (Torino, Italy), 52–63.
Brath, A., De Michele, C. & Rosso, R. (1997a) Combining statistical and conceptual approaches for index flood
estimation. In: FRIEND’97—Regional Hydrology: Concepts and Models for Sustainable Water Resources
Management (ed. by A. Gustard et al.) (Proc. Postojna Conf., October 1997), 287–295. IAHS Publ. no. 246.
Brath, A., De Michele, C., Galeati, G. & Rosso, R. (1997b) Una metodologia per l'identificazione delle regioni omogenee
nel regime di piena. Applicazione all'Italia nord-occidentale (A methodology for delineating homogeneous regions
with respect to flood regime: an application to north-western Italy, in Italian). L'Acqua 1, 17–26.
Burn, D. H. & Goel, N. K. (2000) The formation of groups for regional flood frequency analysis. Hydrol. Sci. J. 45(1), 97–
112.
Castellarin, A., Burn, D. H. & Brath, A. (2001) Assessing the effectiveness of hydrological similarity measures for
regional flood frequency analysis. J. Hydrol. 241(3–4), 270–285.
Chow, V. T., Maidment, D. R. & Mays, L. W. (1988) Applied Hydrology. McGraw-Hill, New York, USA.
Dalrymple, T. (1960) Flood frequency analyses. Water Supply Paper 1543–A, US Geol. Survey, Reston, Virginia, USA.
Draper, N. & Smith, H. (1981) Applied Regression Analysis (second edn). John Wiley & Sons, New York, USA.
Eagleson, P. S. (1972) Dynamics of flood frequency. Wat. Resour. Res. 8(4), 878–898.
Franchini, M. & Galeati, G. (1996) Analisi regionale dei massimi annuali delle portate al colmo per la regione Romagna-
Marche (Regional analysis of annual maximum floods observed in the Italian region Romagna-Marche, in Italian).
L'Energia Elettrica 73(3), 200–212.
Hebson, C. S. & Cunnane, C. (1987) Assessment of use of at-site and regional flood data for flood frequency estimation.
In: Hydrologic Frequency Modeling (ed. by V. P. Singh), 433–448. Reidel, Dordrecht, The Netherlands.
Pitlik, J. (1994) Relation between peak flows, precipitation and physiography for five mountainous regions in the western
USA. J. Hydrol. 158, 219–240.
Reed, D. W., Jakob, D., Robinson, A. J., Faulkner, D. S. & Stewart, E. J. (1999) Regional frequency analysis: a new
vocabulary. In: Hydrological Extremes: Understanding, Predicting, Mitigating (ed. by L. Gottschalk, J.-C. Olivry,
D. Reed & D. Rosbjerg) (Proc. Birmingham Symp., July 1999), 237–243. IAHS Publ. no. 255.
Rodriguez-Iturbe, I. & Mejía, J. M. (1974) The design of rainfall networks in time and space. Wat. Resour. Res. 10(4),
713–735.
Rossi, F., Fiorentino, M. & Versace, P. (1984) Two-component extreme value distribution for flood frequency analysis.
Wat. Resour. Res. 20(7), 847–856.
Rossi, F. & Villani, P. (1988) La regionalizzazione della piena annuale media attraverso un metodo analitico di tipo
geomorfoclimatico (Regionalization of the mean annual maximum flood through an analytically derived
geomorphoclimatic model, in Italian). In: Proc. XXI Convegno di Idraulica e Costruzioni Idrauliche (L’Aquila,
Italy), Part I, 225–242.
Shao, J. & Tu, D. (1995) The Jackknife and Bootstrap. Springer-Verlag, Berlin, Germany.
US Soil Conservation Service (1985) National Engineering Handbook, Sec. 4, Hydrology, US Dept Agric., Washington
DC, USA.
US Weather Bureau (1958) Rainfall–intensity–frequency regime, Part 2 – Southeastern United States. Tech. Paper no. 29.
Yen, B. C. (1990) Return period, risk and probability in urban storm drainage. From the experience of the 20th century to
the science in 21st century. Proc. Fifth Int. Conf. on Urban Storm Drainage (Osaka, Japan, 23–27 July 1990) (ed. by
Y. Iwasa & T. Sueishi), 59–72.