FLOOD SUCCOUR
Forecasting and Predictive Analytics of Flood using Data Mining
ABSTRACT
Flood forecasting (FF) is one of the most challenging problems in hydrology. It is also
one of the most important, because of its critical contribution to reducing economic losses and
loss of life. In many regions of the world, flood forecasting is one of the few feasible options to manage
floods. Reliability of forecasts has increased in the recent years due to the integration of meteorological and
hydrological modelling capabilities, improvements in data collection through satellite observations, and
advancements in knowledge and algorithms for analysis and communication of uncertainties. Earthquakes,
floods and rainfall represent a class of nonlinear systems termed chaotic, in which the relationships between
variables in a system are dynamic and disproportionate, yet completely deterministic. Classical linear
time series models have proved inadequate in analysis and prediction of complex geophysical phenomena.
Nonlinear approaches such as Artificial Neural Networks, Hidden Markov Models and Nonlinear
Prediction are useful in forecasting of daily discharge values in a river. The focus of these methods is on
forecasting magnitudes of future discharge values and not the prediction of floods. Chaos theory provides
a structured explanation for irregular behavior and anomalies in systems that are not inherently stochastic.
Time Series Data Mining methodology combines chaos theory and data mining to characterize and predict
complex, non-periodic and chaotic time series. Time Series Data Mining focuses on the prediction of
events. Floods constitute the events in a river daily discharge time series. This system focuses on
application of the Time Series Data Mining to prediction of floods.
Chapter 1. Introduction
This research is an application of Time Series Data Mining methodology to prediction of floods.
Chapter 1 is an introduction to effects of floods, nature of geophysical phenomena,
existing flood forecasting techniques, the Time Series Data Mining approach and its application
to flood forecasting.
1.1 Effects of Floods
According to the United States Geological Survey (USGS) and National Weather Service
(NWS), as much as 90 percent of the damage related to natural disasters (excluding droughts) is
caused by floods and associated mud and debris flows. Over the last 10 years, floods have cost, on
average, $3.1 billion annually in damages. USGS and NWS estimate that more than 95 lives are
lost per year on average, and that proper detection and prediction of these disasters can save countless
lives and over $1 billion a year in damages.
1.2 Nature of Geophysical Systems
Traditionally, geophysical systems are viewed as systems that exhibit irregular behavior,
essentially due to the large number of variables that govern and dominate the underlying systems.
Geophysical phenomena such as earthquakes and floods represent nonlinear systems whose
occurrences are subject to high levels of uncertainty and unpredictability.
A few deterministic approaches have been applied, but stochastic approaches have proved better at
representing important statistical characteristics of geophysical systems and provide
reasonably good predictions. Nonlinear dynamics, or chaos, addresses the set of systems that may be
stochastic but also display correlations that are deterministic; these are known as “deterministic
chaotic systems”. Deterministic chaos provides a structured explanation for irregular behavior
and anomalies in systems which do not seem to be inherently stochastic. Thus, chaotic
systems are treated as "slightly predictable" and can be studied in the framework of nonlinear
system dynamics. This fact has sparked a surge of interest in nonlinear models among researchers
in applied sciences.
1.3 Existing Flood Forecasting Techniques
Flood prediction is a complex process because of the numerous factors that affect river water
levels such as the location, rainfall, soil types and size of catchments. The relationship between
these factors is not yet fully understood. Nonlinear time series approaches
such as Hidden Markov Models (HMM) [3], Artificial Neural Networks (ANN) and
Nonlinear Prediction (NLP) applied to discharge forecasting produce accurate predictions
for short prediction periods of up to one day.
Because the world of nonlinear models is so vast, much attention has been devoted to particular
families of models, which have been found to perform well in a range of applications. Time
delayed embedding is one such technique that has been applied to a variety of physical
applications in the domains of physiology, economics, geophysics and
engineering. The Time Series Data Mining methodology is
based on a variant of time delayed embedding called the reconstruction of phase space.
1.4 Time Series Data Mining
The Time Series Data Mining (TSDM) methodology, proposed by Richard Povinelli, follows
the time delayed embedding process to predict future occurrences of important events. The TSDM
framework combines the methods of phase space reconstruction and data mining to reveal hidden
patterns predictive of future events in nonlinear, nonstationary time series. TSDM has its
theoretical justification in the theory of nonlinear dynamics, specifically Takens’ Embedding
Theorem and Sauer’s Theorem.
1.5 Flood Forecasting Using Time Series Data Mining
Motivation
Nonlinear approaches such as HMM, ANN and NLP have been applied to
flood forecasting, but their prediction accuracy decreases as the prediction period
exceeds a day. Moreover, HMM, ANN and NLP forecast all future values,
and their accuracy is measured over all forecasted values rather than on the prediction of
events such as floods. The prediction of floods requires a technique that can predict events
(floods) in particular.
Chapter 2. LITERATURE SURVEY
This chapter includes an introduction to presence of chaos in geophysical systems, followed
by the classification and a brief review of existing flood prediction techniques along with their
advantages and disadvantages. This is followed by a review of application of Non Linear Time
Series Analysis to flood forecasting.
2.1 Presence of Chaos in Geophysical Systems
It is necessary to confirm the presence of chaos in any system before applying techniques
based on chaos theory. It has been observed that the application of chaos theory based methods to
systems that are not chaotic may produce wrong results [25]. Additionally, if there is no proof of
existence of chaos, other methods targeted towards deterministic or stochastic time series analysis
can be applied with greater success. Sivakumar [44] presents a detailed literature review on
application of Chaos Theory in geophysics with applications to rainfall, river flow, rainfall runoff,
sediment transport, temperature, pressure, wind velocity, wave amplitudes, sunshine duration and
tree rings. M.N. Islam and B. Sivakumar [23] have performed a comparative study of existing
techniques to determine the presence of chaos in hydrological systems. Each technique has its
own advantages and disadvantages and no single method guarantees a correct classification of a
system as being chaotic or non chaotic. Although proof for existence of chaos in different river
flow time series has been illustrated in [23, 25, 36, 50, 51], its presence in the river discharge
time series needs to be confirmed for the three examples considered.
2.2 Existing Flood Forecasting Techniques
In recent years, numerous studies from varied fields of hydrodynamics, civil engineering,
statistics and data mining have contributed to the area of flood prediction. Some of the existing
techniques used in flood prediction are:
1. Stream and Rain-Gauge Networks and Hydrograph Analysis
2. Radar and Information Systems
3. Linear Statistical Models and
4. Nonlinear Time Series Analysis and Prediction.
The first three techniques mentioned above use more than one input parameter (multivariate)
for characterization and prediction of floods. For example one of the linear statistical models uses
the flood discharge, weighted flood discharge, precipitation intensity, elevations, stream length,
and main channel slope for flood prediction.
Some of the nonlinear time series approaches such as Hidden Markov Models [3] and Artificial
Neural Networks [5] are also based on multiple time series. Nonlinear Prediction (NLP) [28]
method developed by Farmer and Sidorowich [14] has been used in river discharge forecasting by
Porporato et al [36-38]. Researchers have experimented with the application of NLP to discharge
forecasting based on both single variable time series [45-46] and multiple variable time series [8,
37]. Laio et al [28] have performed a comparison of ANN and NLP approaches in daily discharge
forecasting. The results have shown that the NLP method provides accurate forecasts over a
shorter prediction period (1-6 hours), but over prediction periods exceeding 24 hours, the ANN
approach is more accurate. However, for periods exceeding a day, the HMM, ANN and NLP
methods lose their accuracy. Moreover, the HMM, ANN and NLP techniques are not adequate
for event predictions for reasons explained in Section 1.5. These approaches are described in the
following sections.
2.2.1 Stream and Rain-Gauge Networks and Hydrograph Analysis
River flows and precipitation volumes are measured and monitored by more than 7,300 gauge
stations operated mainly by USGS, of which about 4,200 are telemetered through an earth satellite
based communication system. There also exist more than 14,000 rain gauges operated by NWS
[34]. Data from these two recording systems act as input to statistical hydraulic models that
estimate the possible river stage and discharge that may result. The models often turn out to be
inaccurate because the historical data they are built on cover no more than the
past 25 years and because topography has changed as a result of rapid urbanization. Since hydraulic models
cannot predict exactly what will happen to the river, rating curves or hydrographs that show the
relationship between water flows and water levels are used simultaneously [34]. The
disadvantage of using a hydrograph is that it does not consider the changes in river cross sections
that result from changes in channel bed and as a result, the stage and discharge relationship is
altered [34].
2.2.2 Radar and Information Systems
With the development of Advanced Hydrological Prediction Services (AHPS) and NEXRAD
by the NWS, Doppler radar and Geographic Information Systems (GIS) are being used
along with the traditional hydraulic models for improved flood forecasting [34]. AHPS is a complex
system comprising simulation programs that use data from various sources, such as telemetry and
automated gauges, to calculate runoff, infiltration and precipitation volumes using land use and
elevation information. AHPS has been found to result in flood forecasts that are 20% more
accurate than the stream and rain gauge analysis. NWS had implemented 478 AHPS forecast
points by the year 2004, with a one-time fee of $2.1 million and $300,000 annually for
maintenance [34]. The major disadvantages of AHPS are the complexity and the high cost of
implementation and maintenance.
2.2.3 Linear Statistical Models
Linear Statistical Models such as Autocorrelation functions, Spectral Analysis, Analysis of
cross correlations, Linear Regression and Autoregressive Integrated Moving Average (ARIMA)
have been studied for the applicability to flood forecasting. Solomatine et al. [46] have found in
their study that the use of stationary (ARMA) as well as non stationary (ARIMA) versions of linear
prediction techniques does not provide accurate predictions. Application of other linear stochastic
methods has also resulted in inaccurate predictions, clearly indicating that linear statistical models
do not accurately represent historical data and hence are not acceptable methods for a non-linear
application such as flood forecasting [46].
2.3 Nonlinear Time Series Analysis and Prediction Applied to Flood Forecasting
2.3.1 Hidden Markov Models
The concept of the state of a system is powerful even for nondeterministic systems. A
Markov chain consists of a continuous range of flow values and, given the transition probability
of moving from one state to another, will predict the most probable future state based on the
current state. The objective of using Hidden Markov Models to predict floods is to provide a
simpler, generalized data mining model that could be reused for various geographical areas, in
which independence of predictions could be obtained with minimal consideration of past events.
Most of the applications of Hidden Markov Models in flood forecasting have used the
spatio-temporal approach, whereas the time series prediction used in this research is a purely
temporal approach [3, 29]. The drawbacks of this approach are that the initial structure of the
Markov model may not be certain at the time of model construction and it is very difficult to
change the transition probabilities as the model itself changes with time. It was also observed that
the Hidden Markov Models have a higher error for longer prediction periods as well as for
prediction of events with sudden occurrences such as flash floods, leading to the conclusion that
Hidden Markov Models do not perform better than other data mining techniques [3].
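The Markov chain step described at the start of this subsection (predicting the most probable future state from the current state and the transition probabilities) can be illustrated with a minimal sketch. It assumes the continuous flow values have first been discretized into a small number of states; the thresholds, the function names and the uniform fallback for unvisited states are illustrative assumptions, not part of the cited work [3].

```python
import numpy as np

def fit_transition_matrix(states, n_states):
    """Estimate transition probabilities P[i, j] = P(next = j | current = i)
    by counting consecutive pairs in a sequence of integer state labels."""
    counts = np.zeros((n_states, n_states))
    for cur, nxt in zip(states[:-1], states[1:]):
        counts[cur, nxt] += 1
    row_sums = counts.sum(axis=1, keepdims=True)
    # Assumption: states that were never visited get a uniform row.
    uniform = np.full_like(counts, 1.0 / n_states)
    return np.divide(counts, row_sums, out=uniform, where=row_sums > 0)

def most_probable_next(P, current):
    """Return the index of the most probable next state."""
    return int(np.argmax(P[current]))

# Hypothetical discharge thresholds that split flows into low / medium / high.
flows = [100, 150, 250, 900, 300, 800, 950, 120]
states = np.digitize(flows, [200, 500])   # -> [0, 0, 1, 2, 1, 2, 2, 0]
P = fit_transition_matrix(states, n_states=3)
print(most_probable_next(P, current=1))   # state most often following "medium"
```

In this toy series the medium state is always followed by the high state, so the chain predicts a high flow after a medium one. A hidden Markov model adds an observation model on top of such a chain, which this sketch does not attempt.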
2.3.2 Artificial Neural Networks
Artificial Neural Networks (ANN) are widely accepted as a potentially useful way of
modeling complex nonlinear and dynamic systems. They are particularly useful in situations
where the underlying physical process relationships are not fully understood or where the nature
of the event being modeled (i.e. a flood) may display chaotic properties. Though neural networks
do not remove the need for knowledge or prior information about the systems of interest, they
reduce the model's reliance on this prior information. This removes the need for an exact
specification of the precise functional form of the relationship that the model seeks to represent.
Artificial Neural Networks represent input output “connectionist models” where different factors
such as temperature, precipitation, flow rate, depth etc are provided as input to the model. This
technique has been used by a number of researchers in a variety of geophysical phenomena from
predicting currents in sea to flood prediction [5, 10-11, 20, 22, 52-53]. There is however, no set
algorithm that can be applied to ensure that the network will always yield an optimal solution as
opposed to a local minimum value, and the nonlinear nature of the ANN often results in multiple
predicted values [27].
2.3.3 Non Linear Prediction Method
Non Linear Prediction (NLP) method was developed by J. Doyne Farmer and John J.
Sidorowich [14] and was subsequently used in flood prediction by Porporato et al [36-38].
Researchers have experimented with flood prediction based on single parameter (river flow) time
series [45-46] and multiple variable time series as well [8, 37]. The first steps of NLP are the same as
those of the Time Series Data Mining (TSDM) methodology used in this research, starting with the
reconstruction of phase space from the measured time series of the variable to be forecast. Phase
space reconstruction and the Takens theorem, which provides its theoretical justification,
are explained next.
Phase Space Reconstruction
Attractors are the states towards which a system evolves when starting from certain initial
conditions. Since the dynamics of the system are unknown, the original theoretical attractor that
gives rise to the observed time series cannot be constructed. Instead, a phase space is created
in which the attractor is reconstructed from the scalar observed data. The time delay method
approximates the state space from a single time series, and the reconstruction preserves the invariant
characteristics of the original unknown attractor.
The reconstructed phase space is a Q-dimensional metric space into which a time series is
embedded. It is a vector space for the system such that specifying a point in this space specifies
the state of the system and vice versa. Time-delayed embedding maps a set of Q time series
observations taken from X onto $\mathbf{x}_t$, where $\mathbf{x}_t$ is a vector or point in phase space. The point is
formed from the observations $\{x_{t-(Q-1)\tau}, \ldots, x_{t-2\tau}, x_{t-\tau}, x_t\}$, where $x_t$ represents the current observation and
$(x_{t-(Q-1)\tau}, \ldots, x_{t-2\tau}, x_{t-\tau})$ are the past observations. If $t$ is the current time index, then $t-\tau$ is a time index
in the past and $t+\tau$ is a time index in the future. The embedding delay ($\tau$) is the time difference in
number of time units between adjacent components of delay vectors. The embedding dimension
(Q above, also denoted m) is the number of dimensions of the reconstructed phase space required to achieve an
embedding. Any further analysis of deterministic properties of a nonlinear time series depends on
the precondition of a successful reconstruction of a state space of the underlying process [17].
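As an illustration of the time-delayed embedding described above, the following minimal sketch builds the delay vectors for a given embedding dimension Q and delay tau. The function name `delay_embed` and the use of NumPy are assumptions made for this example; the source does not prescribe an implementation.

```python
import numpy as np

def delay_embed(x, Q, tau):
    """Return the matrix of delay vectors for the series x.

    The row for time index t is (x[t-(Q-1)*tau], ..., x[t-tau], x[t]),
    so the first usable time index is t = (Q-1)*tau.
    """
    x = np.asarray(x, dtype=float)
    start = (Q - 1) * tau
    if Q < 1 or tau < 1 or len(x) <= start:
        raise ValueError("series too short for this Q and tau")
    # Column k holds the observation delayed by (Q-1-k)*tau time units.
    cols = []
    for k in range(Q):
        d = (Q - 1 - k) * tau
        cols.append(x[start - d : len(x) - d])
    return np.column_stack(cols)

# Example: six observations, Q = 3, tau = 2 gives two phase-space points.
points = delay_embed([1, 2, 3, 4, 5, 6], Q=3, tau=2)
print(points)  # rows (1, 3, 5) and (2, 4, 6)
```

Each row is one point in the Q-dimensional reconstructed phase space; a series of length N yields N - (Q-1)*tau points.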
There are many theorems based on time delayed embedding used in reconstruction of phase
space such as the Takens Embedding Theorem [44] and Whitney’s embedding Theorem [46].
Other methodologies not based on the method of time delays include Filtered Embedding [40],
which comprises Principal Components [17] and Derivatives and Legendre coordinates [17].
Takens Embedding Theorem
Takens Theorem guarantees that the reconstructed dynamics are
topologically identical to the true dynamics of the system.
According to the Takens Embedding Theorem, the selection of any value for the delay (τ)
will result in an embedding, provided that the data is infinitely precise and does not
contain any noise. The data collected from naturally occurring dynamical systems hardly
matches these specifications.
ARCHITECTURE
Figure 1.1 Block Diagram of Time Series Data Mining Methodology
Time Series Data Mining for Forecasting Flood
A primary difference between ANN, HMM and NLP on the one hand and Richard Povinelli’s Time Series
Data Mining (TSDM) on the other is that the focus of TSDM is to identify and predict the occurrences of
events, which in our case are floods, whereas the methods mentioned in the sections above focus
on forecasting all future values of a time series. The TSDM methodology in this research is based
on a univariate river flow time series. The steps in the TSDM methodology are:
1. Training Stage
Reconstruct the phase space from the observed time series by using the method
of delays.
Frame the TSDM goal in terms of event characterization function, objective function
and optimization formulation.
Define the event characterization function g.
Define the objective function f.
Define the optimization formulation, including the constraints on the objective
function.
Associate eventness with each time index represented by the event
characterization function.
Create the augmented phase space.
Search for optimal temporal pattern cluster in the augmented phase space that
best characterizes the events.
Evaluate training stage results and repeat training stage as necessary.
2. Testing Stage
Embed the testing time series into phase space.
Use optimal temporal pattern cluster for predicting events.
Evaluate testing results.
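The training and testing steps above can be sketched as follows. This is a minimal illustration under stated assumptions: the event characterization function is taken to be a step-ahead function g(t) = x[t + step], cluster membership is tested with Euclidean distance, and the cluster center and radius are supplied by hand rather than found by the search. All function names are hypothetical.

```python
import numpy as np

def embed(x, Q=2, tau=1):
    """Delay vectors (x[t-(Q-1)*tau], ..., x[t]) for every usable time index t."""
    x = np.asarray(x, dtype=float)
    start = (Q - 1) * tau
    return np.column_stack(
        [x[start - (Q - 1 - k) * tau : len(x) - (Q - 1 - k) * tau] for k in range(Q)]
    )

def augmented_phase_space(x, Q=2, tau=1, step=1):
    """Pair each phase-space point at time t with its eventness g(t) = x[t + step].

    Assumes step >= 1; the last `step` points have no future value and are dropped.
    """
    x = np.asarray(x, dtype=float)
    points = embed(x, Q, tau)[:-step]
    eventness = x[(Q - 1) * tau + step :]
    return points, eventness

def predict_events(points, center, radius):
    """Testing stage: flag every point that falls inside the temporal pattern cluster."""
    dist = np.linalg.norm(points - np.asarray(center, dtype=float), axis=1)
    return dist <= radius

# Toy discharge series in which the pattern (2, 3) precedes both high values.
x = [1, 2, 3, 10, 2, 3, 11]
points, eventness = augmented_phase_space(x, Q=2, tau=1, step=1)
flags = predict_events(points, center=(2, 3), radius=0.5)
print(eventness[flags])  # eventness of the points captured by the cluster
```

In the toy series the cluster around (2, 3) captures exactly the two points whose next value is high, which is the kind of temporal pattern the training stage searches for.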
DATA FLOW DIAGRAM
ALGORITHM
GENETIC ALGORITHM:
The Genetic Algorithm [18], used here as an unsupervised clustering technique, performs the search
for the optimal temporal pattern cluster. The Genetic Algorithm (GA) searches for a global
maximum and identifies the optimal cluster. As explained in Section 3.2.2, priorities are assigned
to the two objectives: maximization of the objective function is the first priority and
minimization of the radius is the second. Thus, the GA will search for a temporal pattern
cluster that maximizes the objective function value and, if there are multiple clusters with equally
high objective function value, it selects the cluster with minimum radius. The minimization
objective is required to select the crispest cluster in order to minimize the number of false
positives in the testing phase. The Genetic Algorithm Toolbox in Matlab, version 7.0.1 (Release
14) is used for the search and the output is the cluster center and its radius.
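The two-level priority described above (objective function value first, radius second) amounts to a lexicographic ranking of candidate clusters. The sketch below shows only that ranking rule, not the Matlab toolbox search itself; the tuple layout and the function name `best_cluster` are assumptions for illustration, and ties are taken to mean exactly equal objective values.

```python
def best_cluster(candidates):
    """Pick the candidate with the highest objective value;
    among equally high values, pick the one with the smallest radius.

    candidates: iterable of (objective_value, radius, center) tuples.
    """
    return min(candidates, key=lambda c: (-c[0], c[1]))

# Hypothetical candidates produced by one generation of the search.
candidates = [
    (0.8, 2.0, (1.0, 1.0)),
    (0.9, 3.0, (2.0, 2.0)),
    (0.9, 1.5, (3.0, 3.0)),
]
print(best_cluster(candidates))  # (0.9, 1.5, (3.0, 3.0))
```

A GA could use the same key to order individuals within a population, so that a tighter cluster is preferred only when it does not reduce the objective value.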
Many algorithms have been proposed that provide an estimation of optimal embedding
dimension and time delay. Some of the methods for estimation of embedding dimension are the
method of False Nearest Neighbors, the Fillfactor algorithm, and the Integral Local
Deformation algorithm.
Cluster Prediction Accuracy
In order to determine the prediction accuracy of the cluster and measure its ability to identify
and characterize the floods in the training and testing phases, the following set of parameters is
defined.
1. tp (True Positives): If the event identified by the cluster as a flood is actually a flood it is
called a true positive.
2. fp (False Positive): If an event identified by the cluster is not a flood it constitutes a false
positive or a false alarm.
3. α (Type I Error): In hypothesis testing, α is defined as the probability of rejecting a null
hypothesis when it is true. By analogy, applied to cluster prediction accuracy, α is the
probability of missing a true positive.
4. β (Type II Error): β is defined as the probability of failing to reject the null hypothesis
even though it is false. Applied to cluster prediction accuracy, β is the probability of
selecting a false positive.
5. Positive Prediction Accuracy: Positive Prediction Accuracy (PPA) [41] is the percentage
of true positives in the cluster. Since the events are classified either as true positives or
false positives, the positive prediction accuracy of a cluster is calculated as
$\mathrm{PPA} = 100 \times \frac{tp}{tp + fp}$.
PPA can also be calculated as $100 \times (1 - \beta)$, since β is $\frac{fp}{tp + fp}$; $(1 - \beta)$ is the probability
of selecting a true positive.
6. Correct Prediction Percentage: Correct Prediction Percentage (CPP) is defined as the
percentage of true positives predicted:
$\mathrm{CPP} = \frac{tp(\text{predicted})}{tp(\text{actual})}$.
The cluster having a high PPA as well as high CPP is the crispest cluster with maximum
prediction accuracy.
7. Number of Starts Missed: Since the goal is to predict occurrences of floods, it is more
important to predict the first instance when the discharge exceeded the flood threshold,
causing the river to overflow, than to predict all events (tp’s) when the discharge exceeds
the threshold. Hence, along with CPP measured with respect to number of tp’s predicted,
the number of flood starts missed is also measured.
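The parameters defined above can be computed from two aligned boolean sequences, one marking the time indices flagged by the cluster and one marking actual floods. The sketch below is illustrative only. It assumes that CPP is reported as 100 times the ratio of predicted to actual true positives, that a flood start is an index where a flood occurs without a flood at the previous index, and that a start counts as missed when the cluster does not flag that index. The function name is hypothetical.

```python
def cluster_metrics(predicted, actual):
    """Compute tp, fp, beta, PPA, CPP and the number of flood starts missed."""
    predicted = [bool(p) for p in predicted]
    actual = [bool(a) for a in actual]
    tp = sum(p and a for p, a in zip(predicted, actual))
    fp = sum(p and not a for p, a in zip(predicted, actual))
    n_pred = tp + fp
    n_actual = sum(actual)
    beta = fp / n_pred if n_pred else 0.0              # proportion of false positives
    ppa = 100.0 * tp / n_pred if n_pred else 0.0       # equals 100 * (1 - beta)
    cpp = 100.0 * tp / n_actual if n_actual else 0.0
    # Assumed definition: a start is a flood index not preceded by a flood index.
    starts = [t for t, a in enumerate(actual) if a and (t == 0 or not actual[t - 1])]
    starts_missed = sum(1 for t in starts if not predicted[t])
    return {"tp": tp, "fp": fp, "beta": beta, "PPA": ppa,
            "CPP": cpp, "starts_missed": starts_missed}

# Toy example: two floods (indices 2-3 and index 5); the cluster flags indices 1 and 2.
actual    = [0, 0, 1, 1, 0, 1, 0]
predicted = [0, 1, 1, 0, 0, 0, 0]
print(cluster_metrics(predicted, actual))
```

Here one of the two flagged indices is a real flood (PPA of 50), one of three flood indices is caught (CPP of about 33.3), and the second flood start is missed.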
Conclusions and Future Work
This chapter presents the conclusions and directions for further research. Section 5.1
summarizes this research and presents its application as a decision making tool. Section 5.2
lists the directions for future research in this area.
5.1 Conclusions
Time Series Data Mining is applied to the area of flood forecasting with the goal of predicting floods
accurately and as early as possible. Three examples of gauging stations,
representing high, medium and low flood occurrences are considered. The prediction accuracy is
evaluated in terms of α, β, Positive Prediction Accuracy (PPA) and Correct Prediction Percentage
(CPP). Two variations of objective function are presented. Objective Function I can be used in
flood forecasting problems where the information about history of floods is available. Where this
information is not available, the flood zoning information can be used along with Objective
Function II. The effect of earliness of prediction on the prediction accuracy is also presented.
Earlier approaches have dealt with forecasting magnitudes of future discharge values. This
research focuses on the early prediction of floods (events). It is the first application of an event
based data mining technique to flood forecasting.
The predictions are specific to the location of the gauging station and its catchment area.
Variations in factors such as river depth, cross section, rainfall runoff and snowmelt affect the flood
characteristics. For example, for the St. Louis gauging station, a discharge of 780,000 cubic feet
per second causes the river to overflow. On the other hand, for the Kansas City gauging station, a
discharge of 380,000 cubic feet per second is enough to make the river overflow. Hence, the
predictions are not generic and for a flood prediction problem at another location, the GA needs
to be trained on the discharge time series from that location.
The prediction results from TSDM can be used as a decision making tool by city planners.
The decision variables in this approach are β (proportion of false positives) and the Event
Characterization Function (step-ahead function). Depending on the location of the gauging
station and its catchment area, the impact of flood on surrounding human population and
economy, the city planners can decide the values for the above mentioned variables that would
provide them enough time to plan for flood mitigation and evacuation procedures. For example,
flooding on the Hillsborough River, which passes through the City of Tampa, will affect a huge
population and have large scale economic repercussions, whereas a flood on the Suwannee River,
which passes through large unpopulated areas in Florida, would have a lesser impact. Thus, for an
impact area such as the City of Tampa, the planners would not want to miss predicting a flood
and would accept a certain number of false alarms as long as no flood is missed. This
criterion determines the value of β (proportion of false positives) for the cluster. From the results
it can be seen that a higher β leads to a high number of false alarms in the testing time series.
Another factor that the city planners have to account for is the earliness of prediction. This
factor affects the choice of event characterization function. The planners would want to predict
the flood as early as possible; however, a tradeoff exists between the earliness of prediction and
the Correct Prediction Percentage. From the results it can be seen that the earlier the TSDM
methodology tries to predict the flood, the lower the Correct Prediction Percentage.
Concluding, this research leads to the development of a decision making tool for planning
flood mitigation and evacuation procedures for use by planning authorities. Based on location and
the possible impact, the planners can make a choice in selecting the parameters for TSDM. The
tradeoffs involved in selecting different values for these parameters are also presented. This is a
general approach and can be applied to any gauging station and its catchment area for flood
prediction.
5.2 Future Work
1. Decision Support Software Development
Decision support software can be developed for general-purpose use by city planners and
emergency operations agencies. The proposed software would have a graphical user interface
with input screens to allow users to input the training time series for any gauging station.
Depending on whether the history of floods is available the software will train the GA using one
of the two objective functions. The software would also
allow the decision makers to choose how many days early they want the prediction to be.
The possible customers for this software would be insurance agencies and city planning
authorities.
2. Use of Multiple Parameters
In this research the daily discharge time series is used for flood prediction. A future study
could use other data sets such as rainfall runoff, the height of the river (water level) and snowmelt. The
use of a combination of multiple parameters, such as discharge and rainfall runoff time series, may
lead to higher prediction accuracy. Another approach would be to modify the Event
Characterization Function itself to include multiple variables.
3. Changing Embedding Time Delay
It was observed that the efficiency of the GA in finding a cluster depends on the spread of
points in the phase space. For St. Louis gauging station, the spread of phase space points is
minimum and the points are concentrated around the diagonal. The spread of points can be
controlled by specifying different values for delay. A higher delay would possibly lead to greater
spread in the phase space points; however, it also results in loss of information. An interesting
direction for future research would be to experiment with different time delays to observe the
effect on the prediction accuracy of the GA.
4. TSDM in Conjunction with NLP
TSDM and NLP can be used in conjunction to predict both the time and magnitude of floods.
This will help in real time alerting and evacuation planning. If both the time and the magnitude of a flood are
known, the area of impact can be estimated more accurately. For example, discharge over the
threshold would cause the river to overflow, but the magnitude of discharge will determine
what area is actually affected by the flood.
References
[1] Abarbanel H.D.I. Analysis of Observed Chaotic Data. Institute for Nonlinear Science, 1996.
[2] Albano A.M, Passamante A., and Farell M.E. Using higher-order correlations to define an embedding
window. Physica D, 54:85-97, 1991.
[3] Ayewah N. Prediction of Spatial Temporal Events Using a Hidden Markov Model. Department of
Computer Science and Engineering, Southern Methodist University, Dallas, TX, 2003.
[4] Bangura J.F., Povinelli R.J., Demerdash N.A.O., Brown R.H. Diagnostics of Eccentricities and Bar/End-
Ring Connector Breakages in Polyphase Induction Motors through a Combination of Time-Series Data
Mining and Time-Stepping Coupled FEState Space Techniques. IEEE Transactions On Industry
Applications, vol. 39, no. 4, 1005-1013, 2003.
[5] Boogard H. F. P. van den, Gautam, D. K. and Mynett, A. E. Auto-regressive neural networks for the
modeling of time series. Hydrodinamics98, Babovic & Larsen (eds), Balkema, Rotterdam. 741-748, 1998.
[6] Buzug T., Pfister G. Comparison of algorithms calculating optimal parameters for delay time
coordinates. Physica D, 58:127, 1992.
[7] Buzug T., Reamers T., and Pfister G. Optimal reconstruction of strange attractors from purely
geometrical arguments. Europhysics Letters, 13:605-610, 1990.
[8] Cao L., Mees A., Judd K. Dynamics from multivariate time series. Physica D, 121:75-88, 1998.
[9] Clemins P., Povinelli R.J. Detecting Regimes in Temperature Time Series. Artificial Neural Networks
in Engineering, Proceedings, 727-732, 2001.
[10] Coulibaly, P., Anctil, F. and Bobée, B. Daily reservoir inflow forecasting using artificial neural
networks with stopped training approach. Journal of Hydrology, 230, 244-257, 2000.
[11] Deo, M. C. and Thirumalaiah, K. Real time forecasting using neural networks. Artificial neural
networks in Hydrology, R. S. Govindaraju and A. Ramachandra Rao (eds), Kluwer Academic Publishers,
Dordrecht, 53-71, 2000.
[12] Diggs D.H., Povinelli. R.J. A Temporal Pattern Approach for Predicting Weekly Financial Time
Series. Artificial Neural Networks in Engineering, St. Louis, Missouri, 707-712, 2003.
[13] Duan M., Povinelli R.J. Estimating Stock Price Predictability Using Genetic Programming.
Proceedings of the Genetic and Evolutionary Computation Conference (GECCO2001), 174, 2001.
[14] Farmer J.D. and Sidorowich J.J. Predicting chaotic time series. Physical Review Letters, 59:845-848, 1987.
[15] Fraser A. M., and Swinney H. L. Independent coordinates for strange attractors from mutual
information. Physical Review A, 33:1134-1140, 1986.