Comparison of Short-Term Rainfall Prediction Models For Real-Time Flood Forecasting
Comparison of Short-Term Rainfall Prediction Models For Real-Time Flood Forecasting
www.elsevier.com/locate/jhydrol
Abstract
  This study compares the accuracy of the short-term rainfall forecasts obtained with time-series analysis techniques, using
past rainfall depths as the only input information. The techniques proposed here are linear stochastic auto-regressive moving-
average (ARMA) models, artificial neural networks (ANN) and the non-parametric nearest-neighbours method. The rainfall
forecasts obtained using the considered methods are then routed through a lumped, conceptual, rainfall–runoff model, thus
implementing a coupled rainfall–runoff forecasting procedure for a case study on the Apennines mountains, Italy. The study
analyses and compares the relative advantages and limitations of each time-series analysis technique, used for issuing rainfall
forecasts for lead-times varying from 1 to 6 h. The results also indicate how the considered time-series analysis techniques, and
especially those based on the use of ANN, provide a significant improvement in the flood forecasting accuracy in comparison to
the use of simple rainfall prediction approaches of heuristic type, which are often applied in hydrological practice. 䉷 2000
Elsevier Science B.V. All rights reserved.
Keywords: Rainfall forecasting; Flood warning; Stochastic processes; Artificial neural networks; Non-parametric predictors
of rain intensities yet (Krzysztofowicz, 1995). The              3–5 describe the examined methodologies and some
radar detection is, in addition, particularly difficult          relevant aspects regarding the application to short-
in mountainous regions, because of ground occulta-               term rainfall forecasting. We have gone into more
tion and altitude effects.                                       details about artificial neural networks and nearest-
   The QPF may be obtained also by means of time-                neighbour methods because their application in
series analysis techniques (Brath et al., 1998;                  hydrology is relatively new and less well established
Burlando et al., 1993). One should be aware that the             with respect to ARMA models. Section 6 presents and
predictive ability of these methods is limited owing to          compares the obtained results in terms of predicted
the low persistence in time that usually characterises           rainfall depths. Section 7 describes the hydrologic
rainfall time-series, even if sampled at a fine temporal         rainfall–runoff transformation model and introduces
scale. However, the moderate computer time and data              heuristic rainfall forecasting approaches used as
availability required by such techniques make their              benchmarks for the comparison of the time-series
application very attractive in the context of real-time          rainfall forecasting methods. The performances of
flood forecasting. Thus, the benefits potentially                the coupled rainfall–runoff forecasting systems corre-
achievable from their application with real precipita-           sponding to all the rainfall predictive schemes, both
tion data, in terms of efficiency of the flood forecast,         time-series methods and heuristic approaches, are
are worth analysing.                                             then compared, evaluating the efficiency of the
   The following time-series analysis techniques have            rainfall predictions in terms of the transformed river
been applied: (1) Linear stochastic auto-regressive              discharges. Section 8 offers conclusions and a
moving-average models (ARMA), which express the                  perspective on future developments.
future rainfall as a linear function of past data. The
approach is thus linear, model-driven and parametric,
i.e. it first requires the identification of the type of         2. Case study
relationship among the variables (model identifica-
tion) and then the estimation of model parameters.               2.1. Study area and data set description
(2) Artificial neural network architectures (ANN),
belonging to the non-linear, data-driven approaches:                The case study considered herein is referred to the
the resulting model depends on the available data to             Sieve River basin, a first-order tributary of the Arno
be “learned”, without any a priori hypothesis about              River, located in the Apennines Mountains in Central
the kind of relationship, which is allowed to be                 Italy. Since the Sieve joins the right bank of the Arno
complex and non-linear. (3) K-nearest-neighbour                  just a few kilometres upstream of the city of Florence,
method (K-NN), a non-parametric regression metho-                the prediction of the corresponding flood hydrographs
dology, not implying any structured interaction, but             plays a key role in determining the flood risk for the
exploiting the closeness (“neighbourhood”) between               city. The main stream length is 58 km, the areal exten-
the most recent observations and K “similar” sets of             sion of the watershed is around 830 km 2 and the time
observations chosen in an adequately large training              of concentration is approximately 10 h. The closure
sample.                                                          section is Fornacina, where hourly discharge observa-
   The aim of our analysis is a comparison of the above          tions were collected between 1 January 1992 and 31
methods, not only in terms of obtained rainfall depths,          December 1996.
but also from an hydrological point of view, considering            For the same observation period, hourly tempera-
an integrated rainfall and runoff forecasting system             tures (for estimating the potential evapo-transpiration)
operated on a real-world case study. The issued rainfall         were measured at four stations and hourly rainfall
forecasts, therefore, will be routed through a lumped            depths at 12 raingauges located inside the study
conceptual rainfall–runoff transformation model and              basin, thus allowing the computation of the average
the performances of the flow forecast will be analysed           areal precipitation over the watershed. The rainfall is
and compared.                                                    spatially averaged (with the Thiessen polygons
   Section 2 describes the study area and the data sets,         method) so that it is consistent with the use in the
along with the use of the data in the analysis. Sections         lumped rainfall–runoff transformation model. The
134                               E. Toth et al. / Journal of Hydrology 239 (2000) 132–147
averaging also improves the efficiency of the forecast,          allowing the calibration phase to “learn” an accurate
compared to the efficiency of making forecasts for               representation of the problem domain. These first two
each one of the gauges separately before averaging               sets of events will be referred to as split-sample pair
the results over the basin. In fact, the averaging               A. In order to verify the variability of the forecasts
produces a smoothing of non-stationarities and fluc-             results if different split-samples are used, three further
tuations recorded at each gauge, while preserving the            pairs (B–D) of calibration and validation sets were
general pattern of the storm event over the watershed            selected, by slightly changing the definition of storm
(Burlando et al., 1993).                                         event described in Section 2.1, so as to include also
   Given the predominance of the presence of null or             events not considered before. In pair B, the validation
very low values in the rainfall series and our interest          set remains the same, so that a fair comparison with
in flood forecasting, we limited our analysis, both in           the results described above is allowed. For the
the calibration and validation phases, to the rainfall           calibration set, instead, the events (again twice the
values belonging to storm events, so as to identify the          number of the validation events) containing the high-
temporal pattern characterising the storms, whose                est rainfall values (above 3 mm/h) in the remaining
persistence properties are different from those of dry           data were chosen, lowering at the same time the
or low rainfall sequences. A storm event was conven-             threshold imposed on the change in the corresponding
tionally defined as an interval containing at least one          discharge from 20 to 10 m 3/s. In pairs C and D, new
hourly rainfall depth exceeding 1 mm (“wet” observa-             events were chosen for both the calibration and
tion), preceded by at least 5 and followed by at least           validation sets, so as to force the presence of the
20 h of rainfall lower than 1 mm, provided that in the           most extreme rainfall events in the calibration set
same time interval the corresponding observed hourly             still more (even if at the expense of the corresponding
discharge increases by at least 20 m 3/h. In this way,           increase in discharges), with the aim of improving the
the wet observations are followed by a time interval             “learning” of rainfall peaks. The new events in these
not shorter than the concentration time of the basin, so         last two experiments were chosen requiring a peak of
that the observed river discharges include the total             at least 4 and 5 mm/h, respectively. In addition, the
contribution of the surface runoff generated by each             number of observations following the wet observa-
storm event. In addition, the requirement on the                 tions was reduced in the last case from 20 to 5; in
change in discharge allows the exclusion of the events           this way, there is a further reduction of the presence
that do not produce hydrologic response (dry periods).           of low rainfall values in the sets. We here anticipate
In the observation period a total of 84 storm events             that the forecast performances on the split-samples
were identified, including a total number of 4580                B–D in the analysis were found to be very close to
hourly rainfall observations, for an average storm               those obtained when using the A split-sample pairs.
length of 55 h.                                                  For this reason and for brevity and clarity of presenta-
                                                                 tion, only the results referring to case A will be
2.2. Description of the calibration approaches                   described in what follows.
                                                                    In the adaptive calibration no database of past
   Two alternative approaches were followed for esti-            significant observed events is supposed available,
mating the parameters of the models: split-sample                but only the most recently observed values, immedi-
calibration and adaptive calibration.                            ately preceding the forecast instant. Thus, the set of
   In the split-sample calibration, the storm events             data is extremely poor for generalisation purposes,
were divided into two sets: a calibration (or training)          but, on the other hand, the limited amount of data to
set and a validation set, where the former contains              be processed allows the re-calibration of the model
twice the number of events as the latter. The sets               on-line, at each time step, as soon as new observations
were chosen so as to have approximately the same                 become available. This adaptability enables the model
proportion of low duration–high intensity and high               parameters to adjust to the properties of an ongoing
duration–low intensity rainfall events. It follows               event, by capturing the characteristics of the current
that the training set should be representative of the            meteorological situation.
characteristics of different kinds of event, thus                   For both calibration approaches, in correspondence
                                    E. Toth et al. / Journal of Hydrology 239 (2000) 132–147                           135
of each hourly time step belonging to the validation               vary with the season. Nonetheless, the limited number
set, a rainfall forecast was issued for the subsequent             of rainfall events in the observation period prevented
1–6 h, using the most recent observations as input.                us, in the split-sample calibration, from grouping the
The resulting forecasted rainfall values were                      events in monthly periods, as it is usually done in
processed as inputs to the rainfall–runoff transforma-             hydrology to circumvent non-stationarity. In the
tion model, thus providing the discharge forecast.                 adaptive calibration application, non-stationarity is
                                                                   accounted for by allowing the model parameters to
                                                                   vary with time since the calibration is performed
3. ARMA models                                                     solely on the progress of the current event. We
                                                                   preferred not to perform any preliminary transforma-
   Most of the time-series techniques traditionally                tion of the data in order to make them as close to
used for modelling water resources series fall within              Gaussian as possible. In fact, Gaussian data are not
the framework of the ARMA class of linear stochastic               required for the forecast application of ARMA
processes. They are usually denoted as ARMA(p,q)                   models, since they provide the best linear prediction
models, where p and q are the auto-regressive and                  even in the non-Gaussian case (Brockwell and Davis,
moving-average orders, respectively (Box and                       1987).
Jenkins, 1976; Brockwell and Davis, 1987; Bras                        The selection of the model orders, p and q, was
and Rodriguez-Iturbe, 1994). They describe each                    driven by some results available in literature.
observation of the time series as a weighted sum of                Obeysekera et al. (1987) determined an equivalence
p previous data, and the current as well as q previous             between the correlation structure of an ARMA(1,1)
values of a white noise process:                                   model and some point process models, like the
xt  f1 
xt⫺1 ⫺ mx  ⫹ f2 
xt⫺2 ⫺ mx                              Poisson rectangular pulses and the Neyman–Scott
                                                                   white noise models (see Rodriguez-Iturbe et al.,
      ⫹… ⫹ fp 
xt⫺p ⫺ mx  ⫹ ht ⫹ u1 ht⫺1                          1984). On the other hand, the Neyman–Scott rectan-
                                                                   gular pulses model, which has proved to represent the
      ⫹u2 ht⫺2 ⫹ … ⫹ uq ht⫺q ⫹ mx ;                     (1)        stochastic structure of rainfall better (Rodriguez-
                                                                   Iturbe et al., 1987), has a correlation structure
where xt is the investigated time series; ht , a white             equivalent to that of an ARMA(2,2) process.
noise, i.e. a non-correlated, zero-mean random                        In the adaptive calibration, the parameters are
variable that is also not correlated with the past values          estimated in correspondence with each forecast
of xt ; f1 ; …; fp and u1 ; …; up , the auto-regressive and        instant, on the basis of the last values measured in
moving-average parameters, respectively; and mx ; the              real-time. The number of past observations to be
mean of the time series.                                           used for each calibration was chosen on the basis of
   Parameter estimation for ARMA models can be                     the results of a previous study (Brath et al. 1998). The
performed in several ways. We applied here an                      estimation of the parameters was performed there
approximation in the spectral domain of the Gaussian               with a number w of observations xt immediately
maximum likelihood function, which was first                       preceding each forecast instant, with w varying from
proposed by Whittle (1953) for short-memory models.                7 to 100, aiming at identifying the value of w that
                                                                   provides the best forecasting performances. The
3.1. ARMA model application                                        results showed that for increasing w, the efficiency
                                                                   of the forecast improved moderately for short lead-
   The application of low-order ARMA processes to                  times (1–3 h), but a longer set of past data (more than
model short-term precipitation values is considered                3 days of previous hourly observations) provided a
here, following the modelling framework proposed                   much better performance for lead-times longer than
by Brath et al. (1988) and Burlando et al. (1993).                 4 h. Thus, we set the moving window of past rainfall
The application of ARMA models requires the data                   observations to be used in each adaptive calibration
to be stationary and this is often not the case for hourly         equal to the 100 last measured hourly observations
rainfall observations, whose statistical properties may            (that is, w  100:
136                               E. Toth et al. / Journal of Hydrology 239 (2000) 132–147
4. Artificial neural networks                                       The use of ANN for rainfall forecasting has not been
                                                                 fully explored, yet. A pioneer work is the study by
   As mentioned above, the rainfall generation                   French et al. (1992), who applied a neural network to
mechanism is extremely difficult to identify and                 forecast 1 h ahead, two-dimensional rainfall fields on a
model: there is still much that is not understood                regular grid. The forecast was based on synthetically
regarding the small-scale behaviour of precipitation             generated rainfall grid values corresponding to the
and the underlying physical laws. It is not easy to              previous hourly period. Kuligowski and Barros (1998)
recognise all the existing complex, and typically                generated a QPF of point precipitation accumulated
non-linear, relationships between the various aspects            over the following 6-h period using as inputs the ante-
of the dynamical process (French et al., 1992;                   cedent rainfall depths measured in adjacent gauges and
Kuligowski and Barros, 1998).                                    the radiosonde-based wind direction. Our interest is
   To accommodate the inability of an ARMA-type                  mainly focused on the hydrological use of rainfall fore-
model to account for these effects, non-linear                   casts: the implemented scheme provides a spatially
statistical models have been proposed, such as the               averaged rainfall forecast for all the lead-times from 1
threshold model and the bilinear model, but their                to 6 h, directly usable as input to the rainfall–runoff
complexity often makes them unsuitable for                       lumped model, and the only available information
operational applications (Chakraborty et al., 1992).             used are past rainfall observations.
In addition, such models still belong, like the                     Neural networks distribute computations to
ARMA type, to the model-driven approaches, requir-               relatively simple processing units called neurons,
ing an identification of the kind of relation between            grouped in layers and densely interconnected. Three
the variables (model selection) and an estimation of             different layer types can be distinguished: input layer,
the selected model parameters. From these considera-             connecting the input information (and not carrying out
tions stems the interest in alternative non-linear fore-         any computation), one or more hidden layers, acting
casters, belonging to the data-driven approaches,                as intermediate computational layers between input
where no a priori relationship between known para-               and output, and an output layer, producing the final
meters and observed values has to be hypothesised                output. In correspondence to each computational node
and no knowledge of the underlying process is                    J, each entering value 
IJi  is multiplied by a connec-
needed. Both artificial neural network architectures             tion weight 
wij : Such products are then all summed
and nearest-neighbour methods present these                      with a neuron-specific parameter, called bias 
bj ;
appealing characteristics.                                       used to scale the sum of products into a useful
   Artificial neural networks have been widely studied           range. The computational node finally applies a non-
and applied to a variety of problems, including hydro-           linear activation function ( f ), often a sigmoid, to the
meteorological simulation and forecasting. Several               above sum producing the node output 
OJ:
studies have been dedicated to the prediction of                    Neural networks are trained (calibrated) with a set
river flows (at a time scale ranging from one year to            of observed input and output (called target to be
one day) with no exogenous inputs, that is with only             distinguished from the network final output) data
the use of past flow observations (e.g. Karunanithi et           pairs, the training data set, which is processed repeat-
al., 1994). However, the large majority of the ANN               edly, changing the values of the parameters until they
hydrologic applications predict future flows based on            converge to values such that each input vector
the knowledge of previous rainfall depths (and other             produces output values as close as possible to the
meteorological variables) along with past observed               desired target vectors. Several variants of the popular
flows. The appeal of the use of ANN as black-box                 and extensively tested BackPropagation (BP) training
rainfall–runoff models lies mainly in their ability to           algorithm were tested in the present work. The
reproduce the non-linear nature of the rainfall–runoff           Levenberg–Marquardt algorithm, a quasi-Newton
transformation, and encouraging results have been                method that proved to be the quickest and less easily
obtained in literature on both real and synthetic hydro-         trapped in local minima among all the tested training
logic data (e.g. Hsu et al., 1995; Minns and Hall,               techniques, was finally chosen.
1996; Shamseldin, 1997; Zealand et al., 1999).                      The comparison of the performance of networks
                                          E. Toth et al. / Journal of Hydrology 239 (2000) 132–147                                            137
                                              w11h              b1h
                         Pt
                                                                          w1o
                                                                b2h                       bo
                         Pt-1
                                                                                                 Pt+1* Lead-time = 1 step
                         Pt-2                                   b3h
Pt-3
                                              w11h              b1h
                         Pt+1*
                                                                          w1o
                                                                b2h                       bo
                         Pt
                                                                                                 Pt+2* Lead-time = 2 steps
                         Pt-1                                   b3h
Pt-2
                  a
                                                                                          bo1
                 b
Fig. 1. (a) Feed-forward recursive network. Pt is the precipitation process; whij ; woi are the connection weights towards the hidden and output
layers, respectively, and bj are the node biases. (b) Feed-forward direct multi-step network. Pt is the precipitation process; whij ; woij are the
connection weights towards the hidden and output layers, respectively, and bj are the node biases.
allowing different feed-forward and feedback connec-                            were also tested: they provided comparable results,
tions between the nodes led to the choice of a classic                          but longer time and more random initialisations
multi-layer feed-forward network, where the informa-                            were needed for training.
tion flows only in one direction, from the input                                   As far as the number of hidden layers is concerned,
through the hidden up to the output layer. Networks                             there is no theory yet to tell how many hidden layers
allowing other types of connections between the                                 are needed. It has been proved that only one layer of
layers (recurrent and cascade-forward networks)                                 hidden units “suffices to approximate any function
138                                E. Toth et al. / Journal of Hydrology 239 (2000) 132–147
with finitely many discontinuities to arbitrary preci-            that the optimal network architecture and properties
sion”, provided the activation functions of the hidden            are highly problem-dependent and no definitive estab-
units are non-linear (the “Universal Approximation                lished methodology exists to deal with the neural
Theorem”, see Hornik et al., 1989). The hidden                    network modelling problem. As described above,
nodes also allow taking into account the presence of              preliminary forecast analyses were performed on a
non-stationarities in the data, such as trends and                few storm events, for choosing, on the analysed time
seasonal variations (Maier and Dandy, 1996).                      series, different architectures and properties of the
Obviously, the introduction of additional hidden                  networks.
layers allows the fit of a larger variety of target func-            From all the above considerations it was decided to
tions. On the other hand, the use of more than one                extensively test ANN architectures consisting of a
hidden layer substantially increases the number of                multi-layer feed-forward network with only one hidden
parameters to be estimated and the training time,                 layer. The network is trained with the Levenberg–
and it exacerbates the problem of local minima,                   Marquardt algorithm and the multi-step approach is
increasing the need for several random initialisations.           the direct multi-step method. The optimal complexity
Moreover, the addition of hidden layers often fails to            of the model, that is the number of input and hidden
provide noticeable improvement in the out-of-training             nodes, will be determined, as it is usually done, by a
forecasting application (see, for example, Zealand et             trial-and-error approach.
al., 1999), as confirmed also in our case study
applications.
   The use of an ANN for forecasting time series                  5. K-NN method
implies that the input nodes are connected to a number
of past observed values supposed sufficient to identify              The K-nearest-neighbour method has its origins as a
the process at future time steps. For forecasting                 non-parametric statistical pattern recognition proce-
several time steps ahead (multi-step ahead predic-                dure, aiming at distinguishing between different
tion), two methods have been considered. The first                patterns according to chosen criteria. Among the
is the recursive multi-step method: the network has               various non-parametric techniques, in the sense that
only one output node, forecasting a single step ahead,            no theoretical or analytical relation is known or
and the network is applied recursively, using the                 assumed between the inputs and the outputs, it is the
previous predictions as inputs for the subsequent                 most intuitive, but nevertheless possesses powerful
forecasts (see Fig. 1a). This forecasting technique is            statistical properties. Yakowitz (1987) and Karlsson
similar to the one used by the ARMA-type models.                  and Yakowitz (1987a,b) did considerable work in
The second method (direct multi-step) exploits the                extending the K-NN method to time-series and fore-
capability of a neural network to provide a multiple              casting problems, obtaining satisfactory results and
output, when several nodes are included in the output             constructing a robust theoretical base for the K-NN
layer, and each output node represents one time step to           method. The intuitiveness of the approach and the
be forecasted (see Fig. 1b). In our preliminary tests,            powerful theoretical basis have made the method
the recursive multi-step method provided very good                attractive to forecasters, especially in the hydrologic
results for lead-time equal to one time step, but there           field, where the method found successful applications
was a drastic deterioration when the lead-time                    (Karlsson and Yakowitz, 1987a,b; Galeati, 1990;
increased, as could be expected, because in the recur-            Kember and Flower, 1993; Todini, 1999).
sive methodology the forecast errors are propagated                  The prediction of a time series is based on a local
into subsequent forecasts. Given the importance of                approximation, making use of only the nearby obser-
good performance in correspondence with the longer                vations. For each forecast instant t, let x d 
t 
lead-times, the direct multi-step method was chosen.              
xt ; …; xt⫺d⫹1  be a feature vector of past records. A
                                                                  feature vector is a vector that summarises the whole
4.1. ANN application                                              past history in a smaller-dimension vector of observa-
                                                                  tions supposed to contain most of the information
  The most crucial disadvantage of ANN models is                  relevant to the forecast. The method assumes that
                                     E. Toth et al. / Journal of Hydrology 239 (2000) 132–147                              139
the probability distribution of the random variable                 subsequent to these K historical nearest neighbours.
conditioned on the entire past 
xt⫹1 =xt ; xt⫺1 ; …; is            It may be noticed that the K-NN approach does not
the same as that of the random variable conditioned                 require the selection of a class of models and the
on only the d past observations 
xt⫹1 =xd 
t:                    estimation of the model parameters, so that the
   It was proved that, even if x d 
t does not satisfy the        identification of a specific form of the input/output
above “history summarisation” properties, the K-NN                  relationship is not needed.
forecaster will be asymptotically optimal among all
the forecasters defined on the feature vector x d 
t:             5.1. K-NN method application
That is, under fairly general circumstances, conver-
                                                                       The nature of the nearest-neighbour method makes an
gence to the optimal forecaster is assured as the
                                                                    adaptive calibration approach completely meaningless,
historical data set increases (Karlsson and Yakowitz,
                                                                    because the approach is based on the presence of an
1987b). Let us indicate the expectation of the next
                                                                    extended database. In fact, in addition to being data-
value as x^t⫹1 , conditioned on the current feature
                                                                    driven, it is a method that does not detect any input/
vector x d 
t; that is,
                                                                    output mapping function, not even a posteriori (whereas
x^t⫹1  Ext⫹1 兩xd 
t:                                
2        the ANNs do), and it has, therefore, no extrapolation
                                                                    ability when presented with an unfamiliar input vector.
To estimate x^t⫹1 ; the K-NN method imposes a metric,               As a consequence, only the split-sample application was
denoted by 储·储, on the feature vector x d 
t to find the          performed.
set of K past nearest neighbours of x d 
t; i.e. the K d-            A trial-and-error procedure was implemented for
dimensional vectors of past observations: x d 
tj ; J            finding the number of nearest neighbours, K, and the
1; …; K; which minimise 储xd 
t ⫺ x d 
tj 储:                     dimension of the feature vector d (corresponding to
   The most intuitive and widely used metric to                     the number of past rainfall data considered represen-
identify neighbours is the Euclidean norm, which,                   tative for the forecast), providing the best performing
for a d-dimensional vector Z  d  
z1 ; z2 ; …; zd ; is
                                                                    forecasts. As Karlsson and Yakowitz (1987a) empha-
                !1=2                                                sise, K and d seem, and indeed are, parameters but the
          X
          d
储Z 储 
   d          2
             zi      :                                    
3       method itself is nevertheless non-parametric. In fact,
           i1                                                      K and d do not imply a model for x^t⫹L : the purpose of
                                                                    their search “is pragmatic; it is to make the Nearest-
The forecast is then obtained by averaging the                      Neighbour forecaster as accurate as possible for a
temporal evolution of the nearest neighbours,                       given database”.
assumed to be similar to the evolution of the current
situation, that is,
                                                                    6. Analysis and comparison of rainfall forecasting
          1 XK
                                                                    results
x^t⫹1         x :                                       
4
          K j1 tj ⫹1
                                                                       The performances of the considered time-series
The generalisation to higher lead-times L is straight-              methods (ARMA models, ANN and K-NN method)
forward:                                                            are first of all analysed and compared assessing the
                                                                    respective ability to forecast the spatially averaged
          1 XK
                                                                    rainfall depths belonging to the validation set storms.
x^t⫹L         x :                                       
5
          K j1 tj ⫹L                                               The issued rainfall forecasts will be successively
                                                                    routed through the conceptual rainfall–runoff trans-
Thus, in our case, the K-NN algorithm looks through                 formation model and the performances of the result-
all consecutive d-dimensional vectors in the entire                 ing river flow forecasts will be analysed and compared
historical rainfall depths database and locates K of                in the next session.
these d-ples, which are closest to the vector of the d                 The performances of the various forecasting techni-
most recent rainfalls. The prediction of the next rain-             ques are investigated through a trial-and-error process.
fall is then taken to be the average of the rainfall                The rainfall forecasting results are extremely difficult to
140                                      E. Toth et al. / Journal of Hydrology 239 (2000) 132–147
Fig. 2. Mean correlation coefficients of the one to six steps ahead rainfall forecasts issued: (a) by ANN architectures with varying number of
input nodes, and (b) by nearest-neighbour implementations with varying number of neighbours.
analyse, and a variety of quantitative measures (root                     respective standard deviations. It ranges from ⫺1.0 to
mean square error, mean absolute error, coefficient of                    1.0, with higher values indicating better agreement.
persistence, efficiency coefficient, correlation coeffi-
cient, index of agreement) were considered for synthe-                    6.1. Description of the split-sample and adaptive
sising the effectiveness of the performances over all the                 applications
lead-times, so to make a comparison possible. Different
methods provided the best results for different forecast                  6.1.1. ARMA models: trial-and-error processes
performance measures and for different lead-times, so                        In the split-sample calibration, low-order ARMA
that a performance classification could not be assessed                   models were implemented, both purely auto-regressive
unequivocally.                                                            and with a moving-average component, provided that
   Among the considered measures, the correlation                         the sum of the auto-regressive and moving-average
coefficient (CC) was chosen, following other rainfall                     orders, p ⫹ q; is equal to or less than 6. All the tested
forecasting studies (French et al., 1992; Kuligowski                      ARMA models provided almost analogous perfor-
and Barros, 1998), as the most representative for                         mances (mean CC  0.291–0.293).
assessing rainfall forecast performance. The correla-                        In the adaptive calibration, the tested ARMA
tion coefficient is given by the covariance of the fore-                  models had auto-regressive orders of 1 and 2 and
casts and observations divided by the product of the                      moving-average orders less or equal to 3. The trend
                                           E. Toth et al. / Journal of Hydrology 239 (2000) 132–147                                   141
Table 1
Correlation coefficients of rainfall predictions
1 2 3 4 5 6
ANN split-sample                     NI  18, NH  2                 0.689     0.511     0.407        0.358   0.331   0.327   0.437
ARMA split-sample                    Almost all equivalent           0.686     0.430     0.276        0.173   0.114   0.077   0.293
Nearest Neighbours                   K  70, d  2                   0.709     0.493     0.336        0.239   0.174   0.110   0.344
ANN adaptive                         NI  3, NH  3                  0.527     0.307     0.196        0.171   0.169   0.162   0.255
ARMA adaptive                        p  1, q  0                    0.744     0.472     0.283        0.134   0.060   0.003   0.281
of the obtained performances, for all the ARMA                            from 2 to 12. The improvement of the performance
models, is characterised by relatively good perfor-                       with an increasing number of nearest neighbours is
mance for lead-time of 1 h, followed by a collapse                        less noticeable for more than 20 neighbours and
in correspondence of longer time horizons. The over-                      there is no marginal improvement in the overall
all best results are provided by ARMA models with                         performance when increasing K beyond 70 (see Fig.
parsimonious configurations, indicating ARMA(1,0)                         2b). Small values (from 2 to 4) of the feature dimen-
as the best performing model (mean CC  0.281).                           sion d gave the most satisfactory results for each given
                                                                          number of neighbour vectors K.
6.1.2. ANN: trial-and-error processes
   In the split-sample application, architectures with a                  6.2. Overall comparison of rainfall forecasts
number of input nodes NI ranging from 2 to 24 were
tested. For each input layer dimension, the number of                        A direct comparison of all the implemented
hidden nodes (NH) was progressively increased from                        methodologies is presented here, so as to highlight
2 to 8 nodes. A deterioration of the forecasting per-                     the relative strengths and limitations. Table 1 shows
formance on the validation set, indicating over-fitting,                  the correlation coefficients of the rainfall predictions
was always shown for moderate dimensions of the                           provided by each time-series analysis technique (with
hidden layer, the best results corresponding to NH                        the modelling configurations identified in the trial-
between 2 and 6. The performance of ANN architec-                         and-error processes), both for different lead-times
tures, considering all the lead-times, improved as the                    and for the mean over all the lead-times. The table
number of input nodes (NI) increased, with modest                         ranks ANNs above the nearest-neighbour method and
additional gain for more than 15–18 nodes (see                            this latter above the ARMA models, on our precipita-
Fig. 2a).                                                                 tion data. Such results seem to indicate the appropri-
   The networks tested in the adaptive calibration of                     ateness of non-linear approaches when modelling
were extremely parsimonious (both NI and NH                               rainfall time-series.
ranging from 2 to 5), because the limited number of                          With regard to the ARMA models, the adaptive
training samples (here chosen corresponding to 100                        calibration application provides very good performance
past observations for a fair comparison with the                          for very short lead-times, while the split-sample
ARMA adaptive approach) would make a complex                              approach achieves better results for lead-times longer
network easily subject to over-fitting. The optimal net-                  than 3 h. This is consistent with the results obtained by
work complexity for adaptively calibrated neural                          Burlando et al. (1993), who found a superiority of the
networks seems to correspond to numbers of input                          adaptive calibration method tested for predictions 1 and
and hidden nodes, NI and NH, equal to 3.                                  2 h ahead.
                                                                             The ANN adaptive calibration application proves to
6.1.3. Nearest-neighbours: trial-and-error process                        be unreliable for short lead-times and especially
   A trial-and-error procedure was implemented for a                      inadequate for reproducing low rainfall, even if it is
number of nearest neighbours, K, ranging from 5 to                        satisfactorily stable for lead-times longer than 3 h. For
100 and a dimension of the feature vector, d, ranging                     a better result on short lead-times, we should probably
 142                                        E. Toth et al. / Journal of Hydrology 239 (2000) 132–147
Table 2
Mean correlation coefficients of predictions for different rainfall ranges
Low rainfall (⬍0.1 mm) Medium rainfall (0.1–1 mm) High rainfall (⬎1 mm)
 have resorted to the recursive multi-step method, but                       obtained when predicting extreme rainfall. However,
 at the expense of the performance for several steps                         such a choice would provide poor performance in the
 ahead. The ANN corresponding to the split-sample                            prediction of events that, although not extreme, are
 calibration gives the overall best results for lead-                        relevant in terms of vulnerability of flood-prone areas.
 times longer than 2 h, even if this kind of structure                          Considering all the lead-times and all the rainfall
 slightly penalises the 1-h ahead forecast.                                  categories, the ANN with split-sample calibration
    Table 2 presents the correlation coefficients                            seems the most adequate among all the considered
 obtained for different rainfall intensity ranges, to get                    approaches, at least in reference to our case study.
 an insight into the type of rainfall for which each                         In addition, it should be underlined that almost all
 method performs better. Both the nearest-neighbour                          the computational effort spent for the implementation
 and ARMA split-sample models provide a good fit of                          of the ANN split-sample application is concentrated
 low-intensity precipitation, but the ANN with split-                        in the training phase, while the issue of the forecasts
 sample calibration allows the best results for medium                       with the trained network is practically instantaneous,
 and high rainfall values.                                                   thus making this approach very appealing in a real-
    The forecasts issued when the calibration is of the                      time forecasting framework.
 split-sample type present, overall, smaller deviations
 from the observed values. Nevertheless, Table 2 high-
 lights the difficulties experienced by all the methods                      7. Rainfall–runoff transformation: analysis and
 in reproducing high-intensity rainfall. The analysed                        comparison of discharge forecasts
 techniques do not seem to properly reproduce high
 precipitation values. This is probably due to the fact                      7.1. Hydrologic model description
 that they are influenced by the majority of low-rainfall
 observations in the calibration sets. Analysing the                            The deterministic model used for simulating the rain-
 issued forecasts, it may be noticed that the adaptive                       fall–runoff transformation is a conceptual continuous
 calibrations often issue unreliable forecasts in                            simulation model called ADM (Franchini, 1996),
 correspondence with abrupt changes in the rainfall                          which is based on the concept of probability distributed
 intensities; on the other hand, the forecasts obtained                      soil moisture storage capacity. The catchment is
 with the split-sample approaches tend, when the lead-                       assumed to be composed of an infinite number of
 time increases, towards the average of the calibration                      elementary areas (each one with a different soil moisture
 data, often underestimating the values that follow the                      content and a different soil moisture capacity) and the
 high-intensity occurrences. It may be inferred that this                    proportion of elementary areas that are saturated is
 is the same reason that causes the presence of an                           described by a distribution function: the total surface
 apparent upper limit in the split-sample forecasts.                         runoff is the spatial integral of the infinitesimal contri-
    If a different goodness-of-fit criterion had been                        bution deriving from the saturated elementary areas.
 chosen in the trial-and-error processes, for instance                       The model is divided into two main blocks: the first
 maximising the fit on the highest rainfall values                           represents the water balance at soil level, while the
 alone, better results probably would have been                              second represents the transfer of runoff production at
                                  E. Toth et al. / Journal of Hydrology 239 (2000) 132–147                             143
the basin outlet. The soil, in turn, is divided into two         casting theory, is the persistent method, which equals
zones: the upper zone produces surface and subsur-               the future rainfall intensity, over all the investigated
face runoff, having as inputs precipitation and poten-           lead-times, to the last measured value,
tial evapo-transpiration, while the lower zone (whose
                                                                 x^t⫹L  xt ;        ᭙L:                               
6
input from the first one is the percolation flow)
produces base runoff. The transfer of these compo-               The last investigated heuristic approach, somehow
nents to the outlet section takes place in two distinct          similar to the persistent method, consists in extrapo-
stages: the first represents the flow along the hill-            lating future values setting the intensity for each given
slopes towards the channel network, while the second,            lead-time L equal to the mean intensity measured over
the flow along the channel network towards the basin             the last L observations, that is,
outlet. The 11 parameters of the ADM model were                            X
                                                                           L
accurately calibrated with the Shuffled Complex                                  xt⫺i⫹1
Evolution global optimisation algorithm (Duan et                 x^t⫹L    i1
                                                                                          :                            
7
al., 1992). The conceptual model has been separately                             L
parameterised off-line (fixed calibration) so that its           This last predictive scheme will be denoted as the
parameters do not change during the forecasting                  modified persistent method.
period. Joint optimisation of the rainfall–runoff
model coupled with the rainfall forecasters would,               7.3. Analysis and comparison of flow forecasting
of course, be possible, but this might cause, owing              performances
to compensation effects, undesirable biases in the
calibration of the parameters of the rainfall–runoff                The performance of the discharge forecasts attain-
model (and possibly a departure from their physical              able using the QPF provided by the different rainfall
meaning).                                                        predictive models was evaluated by computing the
                                                                 corresponding coefficient of efficiency, which is
                                                                 widely recognised as one of the most suitable good-
7.2. Standards of reference: heuristic rainfall
                                                                 ness-of-fit measures for runoff. For the analysis of
predictive approaches
                                                                 discharge performance results, the discharge series
   To evaluate the performances of the analysed time-            chosen as a reference was not the series of observed
series forecasting methods when used for providing               discharges, but the hourly discharges simulated by the
the inputs to the rainfall–runoff transformation model,          conceptual model when using as inputs the observed
the resulting discharges will be compared with those             precipitation (“true” discharges). This scenario was
obtained with some predictive benchmarks, consisting             considered in order to be able to evaluate the improve-
of rainfall forecasting approaches of a purely heuristic         ment obtainable by the rainfall forecasting module
nature. Three predictive procedures have been consid-            alone, independently of the effects of the simulation
ered among the possible alternatives that a hydrologi-           errors induced by possible residual inadequacies of
cal practitioner may envisage in case sophisticated              the hydrologic model.
modelling tools are not available and in case the                   The coefficient of efficiency for each lead-time L is
only information at his/her disposal are the most                given by:
                                                                             P
recent rainfall observations.                                                 
Q ⫺ Q^ t⫹L 2
   Probably, the most widespread approach when                   EL  1 ⫺ P t⫹L             2 ;      L  1; …; 6     
8
                                                                                
Qt⫹L ⫺ Q
using a rainfall–runoff transformation model in a
real-time framework is to assume that the future rain-           where Q^ t is the discharge at time t forecasted using as
fall will be null (null rainfall). It is an optimistic           input the predicted rainfall values; Qt ; the value of the
hypothesis, assuming that the forecast is issued at              corresponding “true” discharge (known rainfall); and
the end of the event, whereas, especially in watersheds            the mean of the Qt series. The summations are
                                                                 Q;
with short response time, a forecast is needed earlier           extended to all the issued forecasts, that is, to all the
in the storm progress.                                           forecast instants t belonging to all the validation
   A second term of comparison, widely used in fore-             events.
144                                                      E. Toth et al. / Journal of Hydrology 239 (2000) 132–147
                                     1.00
                                     0.99
            EFFICIENCY COEFFICIENT
                                     0.98
                                     0.97
                                     0.96
                                     0.95
                                     0.94
                                     0.93
                                     0.92
                                     0.91
                                     0.90
                                            1             2                   3             4                           5               6
                                                                             LEAD-TIME (HOURS)
                                                ARMA split-sample                                       ARMA adaptive
                                                ANN split-sample                                        ANN adaptive
                                                Nearest Neighbours                                      Null rainfall
                                                Heuristic persistent                                    Heuristic persistent modified
Fig. 3. Efficiency coefficients of the river flows corresponding to the different rainfall forecasting procedures: ARMA models with split-sample
and adaptive calibration; ANN with split-sample and adaptive calibrations; and nearest-neighbour method. The heuristic approaches: null
rainfall, persistent rainfall method and persistent modified method.
   Fig. 3 shows the performances, in terms of                                           inadequate data sets, as the records of low-intensity
efficiency coefficient, of the coupled rainfall–runoff                                  rainfall immediately preceding the arrival of the
forecasting schemes obtained with all the considered                                    highest precipitation intensities certainly are. As was
rainfall forecasting procedures. It may be observed                                     expected, the null rainfall hypothesis proves to be
that the hydrological processes taking place in the                                     unrealistic, since it may strongly underestimate the
rainfall–runoff transformation tend to level out all                                    rainfall volumes, whereas the persistent methods
the rainfall forecasts corresponding to very short                                      (both in the traditional formulation and modified)
lead-times and it dampens out most of the variability                                   provide an improvement with respect to the null
between the different methods highlighted in the                                        rainfall approach.
analysis of the rainfall depths described in the                                           The ARMA model with adaptive calibration shows
previous section. As a consequence, the good                                            a relevant deterioration with increasing lead-time, as
performance of the ARMA adaptive calibration                                            could be expected, given the poor performance of the
approach for 1-h lead-time becomes unnoticeable,                                        rainfall forecasts obtained using this approach in
whereas the relevant deterioration with increasing                                      correspondence to longer lead-times. The discharges
lead-time confirms how the method is less                                               simulated with the ARMA split-sample rainfall
appropriate than the three split-sample calibration                                     predictions appear slightly closer to the reference
methods.                                                                                discharges in comparison with the results obtained
   Moreover, the response time of the watershed shifts                                  with the nearest-neighbours scheme, whilst the perfor-
the poor results provided by the adaptive calibration                                   mance of the rainfall forecasts appeared superior for
of ANN from the low to the large lead-times. The                                        the nearest-neighbour method. It may be hypothesised
unsatisfactory results of the ANN with adaptive                                         that this is due to the non-linear and threshold effects
calibration is probably due to the strong limitation                                    characterising the rainfall–runoff transformation
of the forecasting ability of ANN when trained on                                       modelling.
                                                        E. Toth et al. / Journal of Hydrology 239 (2000) 132–147                             145
                                               0
                  Hourly precipitation (mm)
                                              0.5
                                               1
                                              1.5
                                               2
                                              2.5
                                                    0   10             20             30             40            50        60
                                                             ANN
                                              200            K-NN
                                                             ARMA
                  Discharge (m /s)
                                              150
                 3
100
50
                                                0
                                                    0   10             20             30             40            50        60
                                                                    Time (hours from 5.00 of 22/02/95)
Fig. 4. Example of rainfall and runoff observations and forecasts during the event of 22 February 1995: observed precipitation (past, black bars;
future, white bars), observed discharge (past, solid line; future, dotted line) and 1 to 6 h ahead predictions issued by ANN, nearest-neighbour
and ARMA methods with split-sample calibration.
the predictors based on ANN architectures, within the            anthropogenic effects on hydrological processes”,
analysed range of parameters and model structures.               contract no. 9908032871_001.
The dampening effect induced by the rainfall–runoff
transformation processes tends to level the per-
formances of the different methods, so that, when                References
considering the predicted runoff, the satisfactory fit
that some of the methods allow for very short-term               Box, G.E.P., Jenkins, G.M., 1976. Time Series Analysis: Forecasting
rainfall forecasts is of little worth.                              and Control. Holden Day, San Francisco, CA.
   Overall, the study indicates that the considered              Bras, R.L., Rodriguez-Iturbe, I., 1994. Random Functions and
time-series analysis techniques provide an improve-                 Hydrology. Dover, New York.
                                                                 Brath, A., 1999. On the role of numerical weather prediction models
ment in the flood forecasting accuracy with respect to              in real-time flood forecasting. In: Proceedings of the
the use of intuitive, heuristic rainfall prediction                 International Workshop on River Basin Modeling: Management
approaches, even if the rainfall forecasting perfor-                and Flood Mitigation, September 25–26, 1997, Monselice
mance measures indicate only a weak to moderate                     (Italy), pp. 249–259.
relationship between forecasted and observed values.             Brath, A., Burlando, P., Rosso, R., 1988. Sensitivity analysis of real-
                                                                    time flood forecasting to on-line rainfall predictions. In: Siccardi,
This is due to the fact that past rainfall observations             F., Bras, R.L. (Eds.), Selected Papers from the Workshop on
alone are not sufficient to predict future precipitation            Natural Disasters in European-Mediterranean Countries, Perugia,
accurately, not even for short time periods.                        Italy, pp. 469–488.
   The results shows that the use of time-series                 Brath, A., Montanari, A., Toth, E., 1998. Su alcune tecniche stocas-
analysis techniques for precipitation forecasting may               tiche per il miglioramento delle prestazioni del preannuncio di
                                                                    piena (in Italian, with English Abstr.). Proc. XXVI Convegno di
allow an extension of the lead-time up to which a                   idraulica e costruzioni idrauliche, Catania, Italy, vol. II, pp. 159–
reliable flood forecast may be issued, providing a                  172.
quick prediction based on past values solely and                 Brockwell, P.J., Davis, R.A., 1987. Time Series: Theory and
directly in the format required by the rainfall–runoff              Methods. Springer, New York.
transformation model. On the other hand, strong                  Burlando, P., Rosso, R., Cadavid, L.G., Salas, J.D., 1993. Forecasting
                                                                    of short-term rainfall using ARMA models. J. Hydrol. 144, 193–
limitations to a time-series analysis approach are                  211.
due to the lack of information needed for a reliable             Chakraborty, K., Mehootra, K., Mohan, C.K., Ranka, S., 1992.
prediction. More substantial improvement should                     Forecasting the behavior of multivariate series using neural
certainly be pursued through numerical weather                      networks. Neural Netw. 5, 961–970.
prediction models, once they are able to provide                 Duan, Q., Sorooshian, S., Gupta, V.K., 1992. Effective and efficient
                                                                    global optimization for conceptual rainfall–runoff models.
timely rainfall forecasts at a temporal and spatial                 Water Resour. Res. 28 (4), 1015–1031.
scale compatible with the requirements of flood fore-            Franchini, M., 1996. Use of a genetic algorithm combined with a
casting in small and medium-sized basins.                           local search method for automatic calibration of conceptual
                                                                    rainfall runoff models. Hydrol. Sci. J. 41 (1), 21–39.
                                                                 French, M.N., Krajewski, W.F., Cuykendall, R.R., 1992. Rainfall
                                                                    forecasting in space and time using a neural network. J. Hydrol.
Acknowledgements                                                    137, 1–31.
                                                                 Galeati, G., 1990. A comparison of parametric and non-parametric
   We are grateful to the three anonymous reviewers                 methods for runoff forecasting. Hydrol. Sci. J. 35 (1), 79–94.
whose suggestions allowed us to improve the presen-              Hornik, K., Stinchcombe, M., White, H., 1989. Multilayer feedfor-
tation of our results. The work presented here has been             ward networks are universal approximators. Neural Netw. 2,
                                                                    359–366.
partially supported by the European Community                    Hsu, K., Gupta, H.V., Sorooshian, S., 1995. Artificial neural
through the grant ENV4-CT97-0529 (FRAMEWORK                         network modeling of the rainfall–runoff process. Water Resour.
project), by the National Research Council of Italy —               Res. 31 (10), 2517–2530.
Group for the Prevention of Hydrogeological                      Karlsson, M., Yakowitz, S., 1987a. Nearest-neighbor methods for
Disasters, contract no. 98.00587.PF42 and by the                    nonparametric rainfall–runoff forecasting. Water Resour. Res.
                                                                    23 (7), 1300–1308.
Ministry of University and Research in Science and               Karlsson, M., Yakowitz, S., 1987b. Rainfall–runoff forecasting
Technology (MURST) of Italy through its 40%                         methods, old and new. Stochastic Hydrol. Hydraul. 1, 303–318.
national grant to the program on “Climatic and                   Karunanithi, N., Grenney, W.J., Whitley, D., Bovee, K., 1994.
                                          E. Toth et al. / Journal of Hydrology 239 (2000) 132–147                                          147
   Neural networks for river flow prediction. J. Comput. Civil              considerations in the modeling of temporal rainfall. Water
   Engng 8 (2), 201–220.                                                    Resour. Res. 20 (11), 1611–1619.
Kember, G., Flower, A.C., 1993. Forecasting river flow using             Rodriguez-Iturbe, I., Febres de Power, B., Valdes, J.B., 1987.
   nonlinear dynamics. Stochastic Hydrol. Hydraul. 7, 205–212.              Rectangular pulses point process models for rainfall: analysis
Krzysztofowicz, R., 1995. Recent advances associated with flood             of empirical data. J. Geophys. Res. 92 (D8), 9645–9656.
   forecast and warning systems. US National Report to IUGG,             Shamseldin, A.Y., 1997. Application of a neural network technique
   1991–1994, Rev. Geophys. 33 (Part 2, Suppl. S), 1139–1147.               to rainfall–runoff modelling. J. Hydrol. 199, 272–294.
Kuligowski, R.J., Barros, A.P., 1998. Experiments in short-term          Todini, E., 2000. Real-time flood forecasting: operational experience
   precipitation forecasting using artificial neural networks. Mon.         and recent advances. In: Marsalek, J. et al. (Eds.), Flood issues in
   Weather Rev. 126, 470–482.                                               contemporary water management, Kluwer Academic Publisher,
Maier, H.R., Dandy, G.C., 1996. The use of artificial neural                The Netherlands, pp. 261–270.
   networks for the prediction of water quality parameters. Water        Whittle, P., 1953. Estimation and information in stationary time
   Resour. Res. 32 (4), 1013–1022.                                          series. Ark. Mat. 2, 423–434.
Minns, A.W., Hall, M.J., 1996. Artificial neural networks as             Yakowitz, S., 1987. Nearest neighbor method for time series
   rainfall–runoff models. Hydrol. Sci. J. 41 (3), 399–417.                 analysis. J. Time Ser. Anal. 8 (2), 235–247.
Obeysekera, J.T.B., Tabios III, G.Q., Salas, J.D., 1987. On parameter    Zealand, C.M., Burn, D.H., Simonovic, S.P., 1999. Short term
   estimation of temporal rainfall models. Water Resour. Res. 23            streamflow forecasting using artificial neural networks. J.
   (10), 1837–1850.                                                         Hydrol. 214, 32–48.
Rodriguez-Iturbe, I., Gupta, V.K., Waymire, E.C., 1984. Scale