Sedgeo S 22 00177
Manuscript Number:
Keywords: Sediment yield; Hydroclimatic data; Mahanadi River; Artificial Neural Network;
ANFIS; Regression analysis
Opposed Reviewers:
Cover Letter
Dear Sir/Madam,
On behalf of my co-author, I am submitting a manuscript entitled “Assessment of sediment yield prediction algorithms for Mahanadi River India” for your kind consideration for publication in your esteemed journal “Sedimentary Geology”.
The Mahanadi River is the second-largest peninsular river in India and the largest river system in the state of Odisha. Sediment is one of the major parameters of any river and directly affects its morphology and planform dynamics. Different computational approaches for the assessment of sediment yield have been analyzed for the Mahanadi River basin. Four standard models, namely an Adaptive Neuro-Fuzzy Inference System (ANFIS), a multilinear regression model, a traditional rating curve model, and an error back-propagation ANN technique, were applied to monthly hydro-meteorological parameters to estimate the suspended sediment concentration at the Tikarapada gauging station of the Mahanadi River. Tikarapada is the major gauging station of the Mahanadi, as the sediment transported from the entire basin is measured here before it is deposited in the Bay of Bengal. All the algorithms show that discharge, stage, and rainfall make a major contribution to sediment yield. After comparing the four standard models, we found that the ANFIS model provides the highest accuracy for sediment yield estimation. The inclusion of the different hydroclimatic parameters is also justified by correlation coefficients and variance tests. Finally, the accuracy of all the algorithms has been assessed and compared using different error statistics to identify the best model. This study can be a major contribution to sediment yield prediction for the Mahanadi River basin, the most important peninsular river basin of India.
We believe that this article communicates significant new findings on the hydrodynamics of river sediment systems, and that Sedimentary Geology would therefore be an ideal platform to publish these results.
We hope for a favorable response to this manuscript and would be happy to provide any further information in support of it.
Best Regards
Biswajit Pradhan
Ph.D. Research scholar
National Institute of Technology Rourkela, India.
E-mail: biswajitpradhan030@gmail.com
a Ph.D. Research Scholar, Department of Civil Engineering, National Institute of Technology Rourkela, India
a Professor, Department of Civil Engineering, National Institute of Technology Rourkela, India
Abstract:
Different computational approaches for the assessment of sediment concentration have been analyzed for the Mahanadi River basin. Four standard predictive models, namely an Adaptive Neuro-Fuzzy Inference System (ANFIS), a multilinear regression (MLR) model, a traditional sediment rating curve (SRC) model, and an error back-propagation ANN technique, were applied to monthly hydro-meteorological parameters to estimate the suspended sediment concentration at the Tikarapada gauging station of the Mahanadi River. All the algorithms show that discharge, stage, and rainfall make a major contribution to sediment yield. The SRC model performed worst of the four models. The MLR model predicted negative sediment yield values, which are not physically possible. Comparison of the four models shows that the ANFIS model provides the highest accuracy for sediment yield estimation. The inclusion of the different hydroclimatic parameters is also justified by correlation coefficients and variance tests. Finally, the accuracy of all the algorithms has been assessed and compared using different error statistics to identify the best model. This study can be a major contribution to sediment yield prediction for the Tikarapada basin of the Mahanadi River.
Keywords: Sediment yield, Hydroclimatic data, Mahanadi River, Artificial Neural Network, ANFIS, Regression
analysis.
1. Introduction
Assessments of sediment yield in rivers are essential for a wide range of applications, from hydrological modeling to water quality issues. Sediment load is primarily responsible for the modification of river patterns. Direct measurement of sediment load is a great challenge because it requires considerable time and technical resources. Assessment of sediment yield is a vital issue in the planning of water resources structures such as dams and reservoirs, the transport of sediment in natural rivers, streams, and lakes, the design of stable channels, water resources management, the protection of aquatic life, environmental and climatic impact assessment, and the life span of hydroelectric equipment (Cigizoglu, 2004a; Cobaner et al., 2009). The deposition of sediment load also has a great impact on flooding, which affects agricultural regions and the soil erosion process of those regions. The productivity of dams has likewise decreased because of sedimentation. In view of these issues, the estimation of suspended sediment yield is becoming a fundamental requirement in natural river management.
Sediment yield depends on many factors, and the interrelations between them are highly complex. The need for modeling suspended sediment has grown quickly in recent decades because of changes in climate and environmental factors, and climate change will continue to have a significant impact on it in the decades to come. Recent research shows that changes in river sediment patterns are strongly affected by discharge (Bürger & Menzel, 2002; Nijssen et al., 2001; Yadav et al., 2018a), soil erosion (Michael et al., 2005; Pruski & Nearing, 2002), and sediment movement (Duan et al., 2013). Apart from discharge, other agents also contribute indirectly to river sediment yield. The most influential parameters have been analyzed in the present study to check their interdependency with sediment patterns for the Mahanadi River, Odisha.
The erosion of soil and weathering of rock in the watersheds and within the channels are intricate hydrological and ecological processes (Seyed Ahmad et al., 1988; Duan et al., 2013) and are the major contributors to sediment load. Because a very large number of hydroclimatic parameters are involved in this system, the existing models and theoretical equations are not adequate to describe the entire hydrological process behind the complex phenomenon of sediment yield.
Suspended sediment cannot be estimated reliably at the flood-event scale because of the different complex cycles involved in sediment yield, so traditional methods do not predict sediment yield with adequate accuracy. Traditional linear models have a restricted capacity to capture non-linearities in hydrologic and climatic datasets. Numerical models such as the multilinear regression (MLR) model have been used for sediment load estimation by many researchers (Zhu et al., 2007; Yadav et al., 2018b). Further, physical and theoretical models require detailed geographical, eco-hydro-climatological, and land data (Mosquera-Machado & Ahmad, 2007; Seyed Ahmad Mirbagheri et al., 1988). Such data are expensive and difficult to acquire in a limited time frame. For this reason, an alternative methodology, an improved Adaptive Neuro-Fuzzy Inference System (ANFIS), is suggested. The model has been regenerated with updated data sets for sediment load prediction.
ANFIS is a learning method that uses fuzzy logic to map a given set of inputs to a desired output through a set of highly interconnected neural network processing elements. The inputs are weighted to map the numerical input to the desired output. This design can capture the advantages of both the neural network and fuzzy logic in one framework. Soft computing techniques use adaptable numerical algorithms and can recognize the complex non-linear behavior of a system. The ANN framework can represent any typical non-linear interaction that relates sediment concentration to hydroclimatic datasets (Wang & Traore, 2009).
A few studies have used artificial neural networks for sediment concentration assessment (Jain, 2001; Cigizoglu, 2004; Tayfur & Guldal, 2006; Kerem Cigizoglu & Kisi, 2006; Karl & Lohani, 2010; Ghose et al., 2012; Yadav et al., 2018). Jain (2001) used an ANN to establish a daily sediment and water discharge relationship and observed that the neural network model outperformed the traditional sediment rating curve. Kisi (2004) used different ANN strategies for daily sediment load forecasting and demonstrated that the MLP model gave better results than the traditional regression model. Cigizoglu (2004b) examined the accuracy of a single neural network in the assessment and prediction of daily sediment datasets. Tayfur & Guldal (2006) used a multi-layer perceptron (MLP) framework for forecasting total suspended sediment from monthly rainfall datasets.
The Mahanadi is the second-largest river in Peninsular India after the Godavari River and one of its most significant (India-WRIS Version 2.0, 2021). For this river basin, only limited research has been conducted using both neural network approaches and numerical models to forecast runoff, streamflow, precipitation, and floods (Ghose et al., 2012; Karl & Lohani, 2010b; Meher, 2014; Panigrahy & Raymahashay, 2015; Yadav et al., 2018); however, no model yet exists that forecasts the sediment concentration in the Mahanadi effectively. The existing models, which are based on limited input parameters, have now been reverified with updated data sets. The fundamental goal of this paper is therefore to assess different algorithms, namely the ANN model, the ANFIS model, and traditional approaches such as the MLR model and the rating curve, for sediment load prediction in the Mahanadi River.
The study area for this research is the Tikarapara gauging station of the Mahanadi River basin, Odisha, India. The Mahanadi River system is a significant river of eastern India. Its overall length from source to mouth is around 850 km, of which about 356 km lies in the state of Chhattisgarh and 494 km in the state of Odisha. The Mahanadi basin is located between longitudes 80°30' and 86°50' east and latitudes 19°20' and 23°35' north. The total catchment area of the river is 141,700 km². The largest drainage area in the basin, 124,450 km², is covered by the Tikarapara gauging station, while the smallest, 1100 km², is covered by the Mahendragarh site. Several major and minor tributaries join the river from both sides. Of the 14 major tributaries, 12 join upstream of the Hirakud reservoir and two join downstream of it. With a catchment area of roughly 200 km, the Mahanadi River's lower basin runs from the Hirakud reservoir to the Bay of Bengal. The Mahanadi catchment receives around 1200 mm to 1400 mm of precipitation per year, based on daily precipitation data recorded from 1990 to 2010 (IWRIS 2021). The basin receives over 90% of its annual precipitation from June to October. The temperature varies strongly between winter and summer, with a lowest winter temperature of 9°C and highest summer temperatures of 39°C to 43°C recorded from 1990 to 2010 (IWRIS 2020). The Mahanadi basin is covered by different land use land cover (LULC) classes: horticulture (54.27%), water bodies (4.45%), forest area (32.74%), barren land (5.24%), and built-up area (3.30%) in the year 2006-2007 (IWRIS 2020). The major water bodies within the river basin are the Hirakud reservoir and Chilika lake. The Mahanadi River basin is the richest mineral belt of eastern India, comprising iron, coal, bauxite, gold, graphite, manganese, limestone, quartz, copper, zinc, lead, etc. (Chakrapani & Subramanian, 1993).
The data set was collected at the Tikarapada gauging station, which is the last measuring point for sediment yield in the Mahanadi basin; downstream of it, the sediment is deposited in the Bay of Bengal. Tikarapada was chosen as the research site because of its location and the availability of hydro-meteorological data (sediment yield, discharge, rainfall intensity, stage, temperature, and wind speed) at this measuring site.
Fig 1 DEM map for Tikarapada basin of Mahanadi River.
3. Methodology
Komasi, 2013). ANFIS combines linguistic knowledge from fuzzy logic with the learning capabilities of artificial neural networks to tune a fuzzy system automatically. A fuzzy inference system (FIS) is also useful because it can merge the descriptive power of membership functions with the strength of an ANN. With fuzzy logic inputs and a neural network base, the ANFIS model can solve complex non-linear problems. The ANFIS model uses a back-propagation algorithm to optimize the membership function parameters (fuzzy rules). Fuzzification makes the neural network approach more accessible by characterizing input parameters that are not crisply distinct, resulting in a more powerful model.
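The way a fuzzy rule base turns inputs into a prediction can be illustrated with a minimal sketch of first-order Takagi-Sugeno-Kang inference. It is not the authors' implementation: the function names, the two rules, and all parameter values below are hypothetical and chosen only for illustration, and plain sigmoidal membership functions are assumed.

```python
import math

def sigmoid_mf(x, slope, center):
    """Sigmoidal membership grade in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-slope * (x - center)))

def tsk_predict(inputs, rules):
    """First-order Takagi-Sugeno-Kang inference.

    Each rule holds one (slope, center) pair per input and linear
    consequent coefficients (one per input plus a trailing bias).
    """
    weighted_sum = 0.0
    total_strength = 0.0
    for rule in rules:
        # Firing strength: product of the membership grades of all inputs.
        strength = 1.0
        for x, (slope, center) in zip(inputs, rule["mf"]):
            strength *= sigmoid_mf(x, slope, center)
        # Rule output: linear function of the inputs.
        coeffs = rule["coeffs"]
        output = coeffs[-1] + sum(c * x for c, x in zip(coeffs, inputs))
        weighted_sum += strength * output
        total_strength += strength
    # Final prediction: firing-strength-weighted average of rule outputs.
    return weighted_sum / total_strength

# Hypothetical two-rule system on normalized discharge and stage.
rules = [
    {"mf": [(-10.0, 0.5), (-10.0, 0.5)], "coeffs": [0.2, 0.1, 0.0]},  # "low" rule
    {"mf": [(10.0, 0.5), (10.0, 0.5)], "coeffs": [0.8, 0.3, 0.05]},   # "high" rule
]
print(tsk_predict([0.9, 0.8], rules))
```

In ANFIS, training adjusts both the membership parameters (slope, center) and the consequent coefficients; the sketch only shows the forward inference step with fixed values.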
3.2 Artificial neural networks (ANNs)
An ANN can be made up of hundreds of simple units, also known as artificial neurons or processing elements (PE), which are coupled by coefficients (weights) and organized in layers to form the network structure. The strength of neural computation derives from the interconnection of the neurons in a network. An artificial neural network (ANN) is an adaptable numerical algorithm that can capture linear as well as non-linear relations among variables through proper learning. Each neuron takes many inputs and produces a weighted combination of them as its output. The neurons are the elements through which the model represents the underlying mechanism. To understand the influence of the various input parameters, the network's inputs and number of neurons were varied. Over the last twenty years, ANN-based models have seen promising growth in research on hydrological simulation (Singh et al., 2011).
3.3 Artificial neural network with multilayer perceptron (MLP) type algorithm
MLP is built on a three-layer feed-forward artificial neural network (ANN): an input layer, a hidden layer, and an output layer. Each layer is made up of a specified number of neurons, each with its activation function. Previous research has shown that a single hidden layer is adequate for a neural network to approximate any complex non-linear function (Cybenko, 1989; Hornik, 1989). A single hidden layer is used to avoid adding complexity to the network (Tang et al., 1991). The MLP model can be trained with a variety of algorithms, including feed-forward back-propagation with Levenberg–Marquardt (FFBP–LM) (Yadav et al., 2018c). Different ANN models were created from the different combinations of the input variables discharge (Q), rainfall intensity (R), wind speed (Vw), water level or stage (y), and temperature (T). All the ANN models were trained on these input variables.
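A minimal sketch of the forward pass through such a three-layer network is given here for illustration. It assumes a tan-sigmoid (tanh) hidden layer and a linear output neuron, which is the configuration the manuscript reports for its FFBP-LM model; the function name and all weights used with it are hypothetical.

```python
import math

def mlp_forward(inputs, hidden_weights, hidden_biases, output_weights, output_bias):
    """Forward pass of a single-hidden-layer perceptron.

    Hidden neurons use tanh (tan-sigmoid); the output neuron is linear.
    """
    hidden = []
    for weights, bias in zip(hidden_weights, hidden_biases):
        # Weighted sum of the inputs followed by the activation function.
        z = bias + sum(w * x for w, x in zip(weights, inputs))
        hidden.append(math.tanh(z))
    # Linear output: weighted sum of the hidden activations plus a bias.
    return output_bias + sum(w * h for w, h in zip(output_weights, hidden))

# Hypothetical network: 2 inputs, 2 hidden neurons, 1 output.
print(mlp_forward([0.4, 0.7], [[1.5, -0.5], [0.3, 0.9]], [0.1, -0.2], [0.6, 0.4], 0.05))
```

Training (for example with Levenberg–Marquardt) would adjust the weights and biases; the sketch covers only how a trained network maps inputs to an output.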
Table 1 shows the statistical analysis of the hydro-climatic data from the Tikarapada gauging station for the Mahanadi River, namely precipitation, temperature, wind speed, stage (water level), discharge, and suspended sediment. Temperature is negatively skewed, whereas precipitation, stage (water level), discharge, wind speed, and suspended sediment are positively skewed. Skewness identifies the shape of the distribution of each variable. Positive skewness denotes an asymmetrical distribution whose tail extends towards larger (above-mean) values, whereas negative skewness denotes a tail extending towards smaller (below-mean) values. The unequal skewness of the variables indicates that they do not follow a symmetric (normal) distribution. Across all variables, the skewness values ranged from -0.52 to 2.77. Because this falls between -3 and 3, it is considered an acceptable range. A large skewness coefficient has a considerable detrimental impact on ANFIS and ANN performance (Altun et al., 2007).
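As an illustration of how a skewness coefficient can be computed, the sketch below uses the moment coefficient of skewness (third central moment divided by the 1.5 power of the second). This choice is an assumption, since the manuscript does not state which estimator was used, and the sample values are invented.

```python
def skewness(values):
    """Moment coefficient of skewness: m3 / m2**1.5."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n  # second central moment
    m3 = sum((v - mean) ** 3 for v in values) / n  # third central moment
    return m3 / m2 ** 1.5

# Invented monthly values with a long right tail (a few very wet months).
rainfall = [2.0, 3.0, 1.0, 4.0, 2.0, 30.0, 45.0, 3.0]
print(skewness(rainfall))  # positive: tail extends towards larger values
```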
Table 1: Statistical analysis of Tikarapada's hydro-climatic parameters
The Spearman rank correlation coefficients are shown in Table 2 and the Pearson correlation coefficients in Table 3. The Pearson coefficient measures the linear relationship between parameters, whereas the Spearman rank coefficient measures monotonic relationships, including non-linear ones. Both coefficients (r) are highest between sediment yield and discharge, stage (water level), and rainfall intensity. This means that discharge, stage, and rainfall intensity are all directly related to sediment yield. Temperature and wind speed are less strongly related to sediment yield, but both have a direct and strong relationship with rainfall intensity (precipitation), which in turn has a strong relationship with sediment load. Temperature and wind speed are therefore assumed to have an indirect or secondary relationship with sediment load. All five input parameters contribute differently to model performance, as was seen when each parameter was applied individually as input to the model. Several studies have confirmed the indirect relationship between sediment yield and temperature (Zhu et al., 2007; Ghose et al., 2012b; Yadav et al., 2018c). Temperature influences sediment yield in several ways. Changes in temperature and wind speed affect evapotranspiration, vegetation, and other factors, which in turn change runoff and erosion rates and hence sediment release (Zhu et al., 2007). Additionally, the selection of all the parameters was supported by an analysis of variance (ANOVA) test. The ANOVA test was performed to check whether the five input parameters (precipitation, temperature, wind speed, stage, and discharge) and the suspended sediment data come from a similar distribution. The test rejected the null hypothesis, so it can be concluded that the input parameters do not all have the same influence on sediment load. In the ANOVA plot, the boxes do not lie on a single line, which shows that the input parameters have different distributions and can be used together as a group of inputs. All five parameters were therefore accepted as inputs for the estimation of sediment yield. The ANOVA test for the input parameters is presented in Figure 2.
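The difference between the two correlation coefficients can be sketched as follows. The helper names and the sample data are hypothetical; the Spearman coefficient is computed here as the Pearson coefficient of the average ranks, which is the standard definition.

```python
def pearson(x, y):
    """Pearson correlation coefficient (linear association)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)

def ranks(values):
    """Average ranks (1-based); tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result

def spearman(x, y):
    """Spearman rank correlation (monotonic association)."""
    return pearson(ranks(x), ranks(y))

# Invented example: sediment grows as the cube of discharge.
discharge = [1.0, 2.0, 3.0, 4.0, 5.0]
sediment = [q ** 3 for q in discharge]
print(pearson(discharge, sediment), spearman(discharge, sediment))
```

For a monotonic but curved relationship like this one, the Spearman coefficient reaches 1 while the Pearson coefficient stays below 1, which is why reporting both helps separate linear from non-linear dependence.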
Table 2 Spearman Rank Correlation Coefficient
Columns: Stage, Discharge, Intensity, Temperature, Wind Speed
Discharge: 0.904
Discharge: 0.747
4. Data processing
The input and output data collected from the gauging station have different ranges and units, so normalizing the data before training is an essential step that removes the dimensions of the variables. The following formula was used to normalize the data to the range 0 to 1:
$$C_{norm} = \frac{C_i - C_{min}}{C_{max} - C_{min}} \qquad (1)$$
where Cnorm is the normalized value, Ci is the ith original value, Cmax is the maximum value of the data, and Cmin is the minimum value of the data. For developing the models, the available data set was divided into groups for training, validation, and testing of all algorithms for the estimation of suspended sediment. Normalization was applied to all input and output data used in this research. The largest share of the data, around 64%, was used for model training, and the remaining data were shared equally between testing and validation. To avoid overestimation or underestimation, an equal number of data sets with equivalent statistical features must be used in training, testing, and validation (Boukhrissa et al., 2013). Hydroclimatic data sets spanning two decades, from 1990 to 2010 (discharge, rainfall, wind speed, water level (stage), and temperature), were used to prepare all four models. To avoid overfitting, the training and validation data were not taken continuously. It is important that the statistical properties of these five data sets are similar. For this reason, a combined t-test was performed to check whether the five parameters in each group show comparatively similar distributions. The results are presented in Table 4. The table shows that the probability (p-value) for the paired test is greater than 0.050. The null hypothesis is therefore not rejected at the 5% significance level, and Table 4 makes clear that the data sets can be used as a group.
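The min-max normalization of Equation (1) can be sketched in a few lines. The function names and the sample values are hypothetical; the inverse transform is included because model outputs on the 0 to 1 scale have to be mapped back to physical units (t/d) before they can be compared with observed yields.

```python
def min_max_normalize(values):
    """Scale values to [0, 1] using Equation (1)."""
    c_min, c_max = min(values), max(values)
    return [(c - c_min) / (c_max - c_min) for c in values]

def denormalize(c_norm, c_min, c_max):
    """Map a normalized value back to the original units."""
    return c_norm * (c_max - c_min) + c_min

# Invented discharge sample (m3/s).
discharge = [200.0, 400.0, 600.0]
print(min_max_normalize(discharge))     # [0.0, 0.5, 1.0]
print(denormalize(0.5, 200.0, 600.0))   # 400.0
```

One design point worth noting: if Cmin and Cmax are taken from the training subset only, later data can fall slightly outside the 0 to 1 range. The manuscript does not say which subset supplied the extremes.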
The MLR is a widely used regression model for the linear relationship between input parameters and sediment load. The MLR equation developed for this research site using all five parameters is:
Qs = 20691 – 2799y – 1966T + 8962Vw – 2240i + 37.944Q (2)
where Qs is the suspended sediment yield (t/d), y is the water stage (m), T is the temperature (°C), Vw is the wind speed (m/s), i is the rainfall intensity (mm/d), and Q is the water discharge (m³/s). The coefficients were determined from the input and observed output data by least-squares regression. Thirty-one MLR models, covering all possible input combinations, were analyzed, and the best model was selected based on minimum error statistics.
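The count of thirty-one models follows from the number of non-empty subsets of five inputs (2^5 - 1 = 31). The sketch below enumerates those subsets and fits an ordinary least-squares model through the normal equations. It is a minimal illustration, not the authors' procedure: the helper names and the toy data are hypothetical.

```python
from itertools import combinations

def input_combinations(names):
    """All non-empty subsets of the input names (2**n - 1 of them)."""
    return [combo for r in range(1, len(names) + 1)
            for combo in combinations(names, r)]

def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x

def fit_mlr(rows, targets):
    """Ordinary least squares; returns [intercept, coef_1, ...]."""
    design = [[1.0] + list(row) for row in rows]
    k = len(design[0])
    xtx = [[sum(d[i] * d[j] for d in design) for j in range(k)] for i in range(k)]
    xty = [sum(d[i] * t for d, t in zip(design, targets)) for i in range(k)]
    return solve_linear(xtx, xty)

combos = input_combinations(["Q", "S", "T", "W", "I"])
print(len(combos))  # 31

# Toy data generated from y = 1 + 2*x1 + 3*x2.
rows = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0), (2.0, 1.0)]
targets = [1.0, 3.0, 4.0, 6.0, 8.0]
print(fit_mlr(rows, targets))
```

Nothing in a linear predictor of this form constrains the output to be non-negative, which is consistent with the negative sediment yields the abstract reports for the MLR model.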
The non-linear model developed for this research is the SRC model, a traditional sediment rating curve, which relates the sediment load to the input through a non-linear function. The SRC model has been used by many researchers (Jain, 2001b; Zhu et al., 2007; Yadav et al., 2018c). The equation for the SRC model developed for this research is presented as follows
where Qs is the suspended sediment yield (t/d) and Q is the water discharge (m³/s). The coefficients were derived by least-squares regression. Only one input, i.e., discharge, is used in the SRC model to forecast sediment concentration (Jansson, 1997). This model's output is also compared to the output of the other three models.
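A rating curve of this kind is commonly written as a power law, Qs = a·Q^b. The sketch below assumes that form and fits it by linear least squares on log-transformed data. Both the power-law form and the log-log fitting route are assumptions made for illustration, since the manuscript states only that a least-squares approach was used. The function names and the synthetic data are hypothetical.

```python
import math

def fit_power_law(discharge, sediment):
    """Fit Qs = a * Q**b by least squares on log-transformed data."""
    lx = [math.log(q) for q in discharge]
    ly = [math.log(s) for s in sediment]
    n = len(lx)
    mx, my = sum(lx) / n, sum(ly) / n
    # Slope of the log-log regression line is the exponent b.
    b = (sum((x - mx) * (y - my) for x, y in zip(lx, ly))
         / sum((x - mx) ** 2 for x in lx))
    # Intercept of the log-log line is log(a).
    a = math.exp(my - b * mx)
    return a, b

def predict_power_law(q, a, b):
    """Sediment yield predicted by the fitted rating curve."""
    return a * q ** b

# Synthetic data generated from a = 0.5, b = 1.5 (not the paper's values).
discharge = [10.0, 100.0, 1000.0]
sediment = [0.5 * q ** 1.5 for q in discharge]
print(fit_power_law(discharge, sediment))
```

Fitting in log space minimizes relative rather than absolute errors, so it tends to under-weight the largest sediment loads; a direct non-linear least-squares fit would weight them more heavily.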
Feedforward back-propagation neural network (FFBP) model with Levenberg-Marquardt algorithm (FFBP-LM) approach
The FFBP-LM neural network was built from inputs, outputs, and neurons with a single hidden layer. A grid search was used to choose the number of hidden neurons and the learning parameter for the ANN-based FFBP-LM model. All possible combinations of the available input parameters were tested to select the inputs for the model: the ANN-based FFBP-LM model was created for all thirty-one possible combinations of the five hydroclimatic parameters considered in this study, i.e., discharge (Q), stage (S), rainfall intensity (I), temperature (T), and wind speed (W). Fig. 6 shows the RMSE of the sediment yield prediction for all the possible models. From Fig. 6, it can be seen that the QSTWI model, which takes all five hydro-climatic datasets as inputs, produces the lowest RMSE of 0.0108. The hidden layer of the ANN-based FFBP-LM model has a tan-sigmoid activation function, while the output layer has a linear activation function. The model was optimized over 1000 iterations (epochs). To find the optimal performance, the number of neurons in the hidden layer was varied from 1 to 60. The learning parameter (µ) was varied from 0.001 to a maximum of 10^9, being increased by a factor of 10 and decreased by a factor of 0.1. To improve model performance, the value of µ changes in each epoch of the algorithm. The optimal value of µ and the number of hidden nodes were found to be 10^4 and 19, respectively. For the development of the model, 70% of the data (Jan 90-Jul 04) was used for training, 15% (Aug 04-Aug 07) for validation, and 15% (Sep 07-Sep 10) for testing. The error statistics used as evaluation criteria for all the models were RMSE, MAE, R2, and NSE. RMSE and MAE capture different aspects of predictive ability: RMSE is more sensitive to errors at high sediment values, whereas MAE reflects errors at moderate sediment values. The variation of both error statistics during training, validation, and testing is shown in Fig. 5 for the ANN-based FFBP-LM model. The error matrix in Table 5 shows different trends for the MAE and RMSE. The RMSE was 0.011 during validation and 0.0108 during testing. Similarly, the MAE was 0.008 during testing and 0.009 during validation. RMSE and MAE are not directly related, but their values do not differ greatly, and this closeness suggests that the model is not overfitted. The R2 between the observed and predicted outputs was 0.988 in the testing phase, which is close to the R2 obtained during validation and training. Similar trends were observed for the mean squared error. The consistency of the R2 and mean squared error values across the phases indicates that the proposed model is effective and performs well for all the input data types. The Nash-Sutcliffe Efficiency (NSE) measures the predictive efficiency of a model. The NSE for the testing data was 0.981, which is very close to 1, so the model has very good predictive capability. Fig. 5 and the error statistics table show that the model performs similarly during training, testing, and validation, with comparable goodness-of-fit values, and no significant overfitting is observed. The results indicate that it is a better model than MLR and SRC for sediment yield prediction for the lower Mahanadi River, India.
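The four error statistics named above can be sketched as follows. The definitions of RMSE, MAE, and NSE are standard. R2 is assumed here to be the squared Pearson correlation between observed and predicted values, which may differ from the definition the authors used. The sample numbers are invented.

```python
import math

def rmse(obs, pred):
    """Root mean squared error; penalizes large errors more strongly."""
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(obs, pred)) / len(obs))

def mae(obs, pred):
    """Mean absolute error; weights all errors linearly."""
    return sum(abs(o - p) for o, p in zip(obs, pred)) / len(obs)

def r_squared(obs, pred):
    """Squared Pearson correlation between observed and predicted values."""
    n = len(obs)
    mo, mp = sum(obs) / n, sum(pred) / n
    cov = sum((o - mo) * (p - mp) for o, p in zip(obs, pred))
    var_o = sum((o - mo) ** 2 for o in obs)
    var_p = sum((p - mp) ** 2 for p in pred)
    return cov ** 2 / (var_o * var_p)

def nse(obs, pred):
    """Nash-Sutcliffe Efficiency: 1 is a perfect fit, 0 matches the mean."""
    mo = sum(obs) / len(obs)
    residual = sum((o - p) ** 2 for o, p in zip(obs, pred))
    spread = sum((o - mo) ** 2 for o in obs)
    return 1.0 - residual / spread

# Invented normalized values for illustration.
observed = [1.0, 2.0, 3.0, 4.0]
predicted = [1.1, 1.9, 3.2, 3.8]
print(rmse(observed, predicted), mae(observed, predicted))
print(r_squared(observed, predicted), nse(observed, predicted))
```

Because squaring amplifies large residuals, RMSE is never smaller than MAE, and a wide gap between the two points to a few large errors, typically at peak sediment values.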
[Fig. 6: bar chart of RMSE (y-axis, 0 to 0.1) for each of the 31 input combinations (x-axis: Input Combinations); the lowest RMSE shown is 0.0108.]
[Figure: RMSE and R-Squared plotted against the 31 input combinations (x-axis: Input Combinations).]
R2 0.938 0.936
Fig 7. Scatter plot of observed yield (t/d, x-axis) against estimated yield (t/d, y-axis) for the testing period of the MLR model (R² = 0.9368)
7.2 Non-linear regression (NLR) model
The non-linear model used in this paper is based on the power relation (PR) model, which was built from the combined training and validation datasets. The model was fitted by ordinary least squares. The coefficients a and b estimated for the PR model are 0.532 and 1.448, respectively. Since the model uses only a single input parameter, i.e., water discharge, no input selection was needed. The error matrix for the model is given in Table 7. The table shows lower R2 and higher RMSE during training than during testing. Thus, the training and testing phases do not show a similar pattern of RMSE and R2 values. Fig. 8 also shows that the testing data points do not fall along the 45-degree line. There were significant differences in the zones of higher sediment yield (Fig. 8). The model's predicted values were lower than the observed values by a factor of about 1.2. Very large discrepancies between observed and predicted sediment yield from the SRC (PR) model were also seen at higher values. These larger deviations could be due to the model's inability to capture the non-linearity of the data or to poor fitting of the non-linear function. This emphasizes the significance of incorporating additional parameters as input data.
R2 0.948 0.949
Fig 8. Scatter plot of observed sediment (t/d, x-axis) against estimated sediment (t/d, y-axis) for the testing phase of the NLR model (R² = 0.949)
7.3 Adaptive Neuro-Fuzzy Inference System (ANFIS)
For the development of the ANFIS Model, water discharge, stage, rainfall intensity, and temperature were
considered as the input parameter to estimate the suspended sediment yield. 70% of data (Jan 90-Jul 04) was used
for training, 15% (Aug 04-Aug 07) for validation, and 15% (Sep 07-Sep 10) for testing the model.
The fuzzy model used was the Takagi-Sugeno-Kang type, trained with the backpropagation learning algorithm for a
maximum of 1000 epochs. This configuration was chosen to identify the network that trains the model most
efficiently. The architecture of the ANFIS network was developed using sigmoidal membership functions, with the
number of membership functions per input varied from 2 to 50. Fig. 9 shows the number of membership functions
plotted against the R2 value. The best type of membership function was found to be the p-sigmoidal (psig)
membership function, with an optimum number of 29 membership functions. From Fig. 9, it is evident that the
maximum R2 value of 0.9839 for testing data was obtained when the number of membership functions was 29 for
the input and output parameters. Fig. 10 shows the generated architecture for the sediment yield modeling rule
base. The error statistics used as the evaluation criteria for the model were RMSE, MAE, R2, and NSE. Table 8
shows the error statistics for the developed ANFIS model. The RMSE was 0.009 during validation and 0.008 during
testing. Similarly, the MAE was 0.005 and 0.001 during testing and validation, respectively. The close agreement
between RMSE and MAE in the error matrix indicates that the model is not overfitted. The R2 between observed and
predicted values was 0.993 in the testing phase (Fig. 11), which was close to the R2 during validation and
training, as seen in Table 8. Similar trends were also observed for the mean squared errors. The NSE for testing
data was 0.989, which is very close to 1; hence the model has very good predictive capability, with no
significant overfitting evident from the results.
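For reference, the four error statistics named above can be written out as follows. This is a minimal sketch using the standard textbook definitions; in particular, R2 is taken here as the squared Pearson correlation between observed and predicted values, which is an assumption, since the manuscript does not state which R2 definition was used. The demonstration values are hypothetical.

```python
import math


def rmse(obs, pred):
    """Root mean squared error."""
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(obs, pred)) / len(obs))


def mae(obs, pred):
    """Mean absolute error."""
    return sum(abs(o - p) for o, p in zip(obs, pred)) / len(obs)


def r_squared(obs, pred):
    """Squared Pearson correlation (assumed definition of R2)."""
    n = len(obs)
    mean_o, mean_p = sum(obs) / n, sum(pred) / n
    cov = sum((o - mean_o) * (p - mean_p) for o, p in zip(obs, pred))
    var_o = sum((o - mean_o) ** 2 for o in obs)
    var_p = sum((p - mean_p) ** 2 for p in pred)
    return cov ** 2 / (var_o * var_p)


def nse(obs, pred):
    """Nash-Sutcliffe efficiency: 1 minus error variance over observed variance."""
    mean_o = sum(obs) / len(obs)
    num = sum((o - p) ** 2 for o, p in zip(obs, pred))
    den = sum((o - mean_o) ** 2 for o in obs)
    return 1.0 - num / den


# Hypothetical demonstration values (not data from this study)
obs_demo = [1.0, 2.0, 3.0, 4.0]
pred_demo = [1.5, 2.0, 2.5, 4.0]
demo_stats = {
    "RMSE": rmse(obs_demo, pred_demo),
    "MAE": mae(obs_demo, pred_demo),
    "R2": r_squared(obs_demo, pred_demo),
    "NSE": nse(obs_demo, pred_demo),
}
```

RMSE and MAE carry the units of the data they are computed on, so values of the order of 0.008 presumably refer to normalized sediment yield rather than to yields in t/d.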
Fig. 9 Plot of the optimum number of membership functions for the ANFIS model
(x-axis: Number of Membership Functions, plotted for 16, 17, 18, 20, 23, 26, 29, 30, 31, and 32; y-axis: R2,
ranging from about 0.979 to 0.984; maximum R2 = 0.9839)
Fig. 10 ANFIS Architecture for sediment yield modeling
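To make the structure described in Section 7.3 more concrete, the sketch below evaluates a product-of-two-sigmoids (psig) membership function and a first-order Takagi-Sugeno-Kang output for a single normalized input. The psig form follows the usual definition (as in the MATLAB `psigmf` function); the two-rule base, all parameter values, and the function names are hypothetical and are not the trained 29-function model of this study. Only forward inference is shown; parameter tuning by backpropagation is omitted.

```python
import math


def sigmoid(x, a, c):
    """Sigmoid with slope a and center c."""
    return 1.0 / (1.0 + math.exp(-a * (x - c)))


def psig(x, a1, c1, a2, c2):
    """Product of two sigmoids; opposite-sign slopes give a bell-like shape."""
    return sigmoid(x, a1, c1) * sigmoid(x, a2, c2)


def sugeno_predict(x, rules):
    """First-order Takagi-Sugeno-Kang output for a single input x.

    Each rule is (mf_params, (p, r)) with linear consequent y = p * x + r.
    The output is the firing-strength-weighted average of the consequents.
    """
    weights = [psig(x, *mf_params) for mf_params, _ in rules]
    outputs = [p * x + r for _, (p, r) in rules]
    return sum(w * y for w, y in zip(weights, outputs)) / sum(weights)


# Hypothetical two-rule base on an input normalized to [0, 1]
demo_rules = [
    ((20.0, -0.25, -20.0, 0.5), (0.2, 0.0)),   # "low" region, gentle slope
    ((20.0, 0.5, -20.0, 1.25), (1.5, -0.4)),   # "high" region, steep slope
]
y_low = sugeno_predict(0.1, demo_rules)
y_high = sugeno_predict(0.9, demo_rules)
```

Each rule contributes a local linear model, and the membership functions blend these models smoothly, which is how such a system can follow a non-linear relationship between inputs and sediment yield.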
Fig. 11 Scatter plot of actual data and predicted sediment yield data of testing phase of ANFIS Model
(x-axis: Observed Sediment (t/d); y-axis: Estimated sediment (t/d); R² = 0.9933)
8. Comparative study of all the models
The potential of all four models was examined in terms of their capacity to estimate the Mahanadi River's
suspended sediment output. The R2 and RMSE values of the testing phase served as the prime performance
indicators for the comparative assessment of sediment yield. The same data set, from Sept 07 to Sept 10, was used
for the testing phase of all four models. The best-performing model among the four was selected for the
prediction of sediment yield. The R2 and RMSE observed in the testing phase of all four models are
presented in Table 9.
Table 9 indicates that the ANFIS-based model has the lowest RMSE (0.008) and the highest R2 (0.993) among all
the models used in this research. The ANFIS-based model therefore performs best, although its improvement over
the ANN model is very marginal. On the other hand, Table 9 also shows that the MLR-based model has the lowest
R2 (0.936) and the highest RMSE (0.041). As a result, the MLR model has the lowest ability to estimate sediment
yield. As previously noted, this could be due to a failure to address the non-linear relationship between
hydro-climatic parameters and suspended sediment output. The earlier observations also showed that the ANFIS
model had the highest R2 and the lowest RMSE during the training period when compared to the other models.
Both the ANFIS and ANN-based models behaved most consistently during training and testing, neither overfitting
nor underfitting the dataset. As per the RMSE and R2 values, the best model is the ANFIS model, followed by the
ANN model. The ANN model was superior to the MLR model even when the number of input parameters was the same.
The results also revealed that traditional mathematical models like MLR and NLR have the least predictive
ability, because they failed to capture the nonlinearity in the datasets. Table 9 shows how the ANFIS model
outperforms the ANN, MLR, and NLR models. Both the ANFIS and ANN models are capable of making accurate
predictions, and their predicted sediment yield values are in good agreement with the observed values; however,
the SRC model was unable to capture the highly nonlinear nature of high sediment yield values, yielding poor
results. When the observed sediment yield is low, the MLR model generates negative sediment yield values. For
both the ANFIS and ANN models, the predicted sediment yield is remarkably similar to the observed values, but
ANFIS has a better prediction capacity than ANN. No significant overestimation or underestimation was observed
in either the ANFIS or the ANN model across low, medium, and high magnitudes; only a few overestimations
occurred, during August 2008, in both models. It was also observed that in the ANN-based model, the predicted
values lie close to the 45-degree line in the plot. The peaks of the predicted and observed sediment yield were
nearly the same in August 2008 and July 2009 in the ANFIS model. The stage, discharge, and rainfall intensity
were 20.66 m, 6412.99 m3/s, and 11.69 mm/d in August 2008, and 10.72 m, 6750.17 m3/s, and 13.63 mm/d in July
2009, respectively. The predicted values approximate the corresponding peak of actual sediment yield in July
2009, with an observed yield of 221796.42 tons/d and an ANFIS-estimated yield of 231890.04 tons/d. The very high
sediment yield observed in July 2009 resulted from a sharp increase in stage caused by high rainfall in that
period.
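The size of the July 2009 peak mismatch quoted above can be expressed as a relative error. The short sketch below uses only the two values given in the text; the function name `percent_error` is introduced here for illustration.

```python
def percent_error(observed, estimated):
    """Signed error of the estimate relative to the observation, in percent."""
    return 100.0 * (estimated - observed) / observed


# July 2009 peak values quoted in the text (t/d)
observed_peak = 221796.42
anfis_peak = 231890.04

abs_diff = anfis_peak - observed_peak          # positive means overestimation
rel_diff = percent_error(observed_peak, anfis_peak)
```

For these two values, the ANFIS estimate exceeds the observed peak by about 10,094 t/d, or roughly 4.6%.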
Fig. 12 Comparison assessment plot between actual data and predicted sediment data of testing phase of
all the four models such as (a) NLR, (b) MLR, (c) FFBP-LM, and (d) ANFIS models
(each panel: x-axis: Months, 1 to 37; y-axis: Suspended sediment yield (t/d); series: Observed and model estimate)
During the testing phase, 0, 2.7, 10.8, and 48.64% of the sediment yield values were underestimated by the ANFIS,
ANN, NLR, and MLR models, respectively. The MLR model produced the most underestimated yield points, whereas
the highest overestimated yield points occurred in the ANFIS and ANN models. The results show three peaks: the
first in August 2008, the second in July 2009, and the third in August 2010.
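The underestimation percentages above can be understood as the share of test months in which the predicted yield falls below the observed yield. The sketch below is one possible counting rule; the manuscript does not state the rule or any tolerance it applied, so the `tolerance` parameter and the function name are assumptions. The second part simply shows what share of a 37-month test set is represented by 0, 1, 4, and 18 months.

```python
def underestimation_share(observed, predicted, tolerance=0.0):
    """Percent of points where the prediction falls below the observation.

    A point counts as underestimated when (observed - predicted) exceeds
    `tolerance` times the observed magnitude (tolerance is an assumption).
    """
    count = sum(
        1 for o, p in zip(observed, predicted) if (o - p) > tolerance * abs(o)
    )
    return 100.0 * count / len(observed)


# With a 37-month test set, each month contributes 100/37 percent
n_test = 37
shares = {k: round(100.0 * k / n_test, 2) for k in (0, 1, 4, 18)}

# Hypothetical demonstration values (not data from this study)
demo_strict = underestimation_share([10, 20, 30, 40], [9, 21, 30, 35])
demo_tolerant = underestimation_share([10, 20, 30, 40], [9, 21, 30, 35], 0.1)
```

The reported values of 0, 2.7, 10.8, and 48.64% appear consistent with 0, 1, 4, and 18 underestimated months out of 37, although this correspondence is an inference from the length of the test period rather than a figure stated in the text.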
The ANFIS model has a modest advantage over the ANN model. Based on the R2 and RMSE, as well as the plot of
observed and predicted values, the ANFIS model outperforms the ANN, MLR, and SRC models in predicting sediment
yield. Owing to large underestimations, the MLR model is the least accurate of all the models. In September
2009, some differences between predicted and observed data were found in the ANFIS model. This error is most
noticeable during rainy seasons. It could be attributed to other influencing factors, such as unknown
contributors to sediment yield. These may be natural or man-made, or may be due to changes in the LULC pattern
over the years, which were not considered as input parameters in this work; only hydroclimatic data were
considered as input parameters. Changes in the LULC pattern and human activities related to deforestation, bank
erosion, construction work, etc. contribute substantially to sediment yield and need to be studied in future
work. The residual histograms for the overall data are presented in Fig 13.
Fig.13 Residual histograms (a) SRC Model, (b) MLR Model, (c) ANFIS Model, (d) ANN Model
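A residual histogram of the kind shown in Fig. 13 can be assembled as sketched below. The residual is taken here as observed minus predicted, and equal-width bins are used; both choices, the function name, and the demonstration values are assumptions, since the manuscript does not describe how the histograms were constructed.

```python
def residual_histogram(observed, predicted, n_bins=5):
    """Return (bin_edges, counts) for residuals = observed - predicted."""
    residuals = [o - p for o, p in zip(observed, predicted)]
    low, high = min(residuals), max(residuals)
    # Fall back to a unit width if all residuals are identical
    width = (high - low) / n_bins or 1.0
    counts = [0] * n_bins
    for r in residuals:
        # The maximum residual is placed in the last bin
        idx = min(int((r - low) / width), n_bins - 1)
        counts[idx] += 1
    edges = [low + i * width for i in range(n_bins + 1)]
    return edges, counts


# Hypothetical demonstration values (not data from this study)
obs_demo = [10.0, 20.0, 30.0, 40.0, 50.0]
pred_demo = [12.0, 18.0, 30.0, 41.0, 45.0]
edges, counts = residual_histogram(obs_demo, pred_demo, n_bins=4)
```

A histogram centered near zero with short tails indicates small and unbiased errors, whereas a histogram shifted to the positive side under this sign convention indicates systematic underestimation.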
9. Conclusions
In the present work, the suspended sediment yield was predicted at the Tikarapada gauging station on the
downstream reach of the Mahanadi River, India, using four different algorithms with different combinations of
hydroclimatic input variables, i.e., stage, discharge, rainfall intensity, wind speed, and temperature.
Earlier studies gave poor results because they included only a limited set of hydroclimatic parameters, such as
discharge and rainfall. In general, sediment yield is not governed by these parameters alone. Additional
hydroclimatic parameters, namely stage, wind speed, and temperature, were therefore considered in the modified
models using the updated data sets. All the sediment transported from the upper and middle reaches of the
Mahanadi River is measured at the Tikarapada gauging station before it is deposited in the Bay of Bengal. Earlier
research treated only rainfall and discharge as the major hydroclimatic parameters, but this study found that
stage is another dominant variable controlling sediment yield. In this research work, additional and updated
datasets were considered for the prediction of sediment yield, and a new model based on a soft computing
technique was introduced that performs better than the rest of the models. The other hydroclimatic parameters
considered in this study, i.e., wind speed and temperature, were found to have an insignificant effect on
sediment yield. The capacity of the MLR model to estimate sediment yield was found to be much
lower than the other three models. The standard mathematical model with extended data sets also failed to estimate
sediment yield as intended. The ANFIS-based model proved to be the best of the four, with the highest prediction
power, and was found to be the most suitable model to replace traditional models such as MLR and SRC, as well as
the new ANN model. Although the improvement over the ANN model was marginal, the ANFIS model is reliable and
simple and provided very high accuracy. The error matrix also shows that the MLR model has the lowest predictive
ability. The MLR model had the highest number of underestimated sediment peak values relative to the observed
peaks, while the ANFIS and ANN models had the highest overestimated yield points. Significant contributing
parameters to the sediment yield were not included in the SRC algorithm. The ANFIS and ANN models produced
positive sediment values even when suspended sediment levels were low, and they outperformed the SRC and MLR
models in terms of output potential. The ANFIS model also provides more accurate predictions than the ANN model.
The MLR and ANN models show reasonable performance when rainfall intensity, stage, discharge, temperature, and
wind speed are considered as input parameters. The ANN model was improved by the inclusion of temperature,
stage, and wind speed as hydro-climatic parameters influencing the suspended sediment yield. The sediment yield
predicted by the ANFIS model is closer to the recorded sediment data than that of the other three models.
Therefore, it can be concluded that ANFIS, or soft computing-based models in general, can improve on the existing
methods in terms of sediment yield prediction ability. The ANFIS model showed the highest accuracy while
considering only one input parameter to estimate suspended sediment yield, which shows that it has very high
predictive capability. Although there is no significant variability in the model output, a few deviations were
observed. These may be due to the effects of unknown factors that contribute to the sediment yield but were not
considered as inputs, and they need to be addressed in future research. Further hydro-climatic parameters can be
included in the ANFIS model to improve the efficiency of sediment yield modeling. This study may be a major
contribution to hydrological studies where sediment values are not accessible or easily available.
Acknowledgments
The first author is grateful to the National Institute of Technology Rourkela for providing a fellowship during
the research period, and all the authors are grateful to the Central Water Commission, Bhubaneswar, Odisha, and
IMD for providing all required data.
References
Altun, H., Bilgil, A., & Fidan, B. C. (2007). Treatment of multi-dimensional data to enhance
neural network estimators in regression problems. Expert Systems with Applications,
32(2), 599–605. https://doi.org/10.1016/j.eswa.2006.01.054
Boukhrissa, Z. A., Khanchoul, K., le Bissonnais, Y., & Tourki, M. (n.d.). Prediction of sediment
load by sediment rating curve and neural network (ANN) in El Kebir catchment, Algeria.
Bürger, G., & Menzel, L. (n.d.). Climate change scenarios and runoff response in the Mulde
catchment (Southern Elbe, Germany). www.elsevier.com/locate/jhydrol
Chakrapani, G. J., & Subramanian, V. (1993). Rates of erosion and sedimentation in the
Mahanadi river basin, India. In Journal of Hydrology (Vol. 149).
Cigizoglu, H. K. (2004a). Estimation and forecasting of daily suspended sediment data by multi-
layer perceptrons. Advances in Water Resources, 27(2), 185–195.
https://doi.org/10.1016/j.advwatres.2003.10.003
Cigizoglu, H. K. (2004b). Estimation and forecasting of daily suspended sediment data by multi-
layer perceptrons. Advances in Water Resources, 27(2), 185–195.
https://doi.org/10.1016/j.advwatres.2003.10.003
Cobaner, M., Unal, B., & Kisi, O. (2009). Suspended sediment concentration estimation by an
adaptive neuro-fuzzy and neural network approaches using hydro-meteorological data.
Journal of Hydrology, 367(1–2), 52–61. https://doi.org/10.1016/j.jhydrol.2008.12.024
Duan, W., He, B., Takara, K., Luo, P., Nover, D., Sahu, N., & Yamashiki, Y. (2013). Spatiotemporal
evaluation of water quality incidents in japan between 1996 and 2007. Chemosphere,
93(6), 946–953. https://doi.org/10.1016/j.chemosphere.2013.05.060
Duan, W., Takara, K., He, B., Luo, P., Nover, D., & Yamashiki, Y. (2013). Spatial and temporal
trends in estimates of nutrient and suspended sediment loads in the Ishikari River, Japan,
1985 to 2010. Science of the Total Environment, 461–462, 499–508.
https://doi.org/10.1016/j.scitotenv.2013.05.022
Ghose, D. K., Swain, P. C., & Panda, S. S. (2012a). Sedimentation load analysis using ANN and
GA. Applied Mechanics and Materials, 110–116, 2693–2698.
https://doi.org/10.4028/www.scientific.net/AMM.110-116.2693
Ghose, D. K., Swain, P. C., & Panda, S. S. (2012b). Sedimentation load analysis using ANN and
GA. Applied Mechanics and Materials, 110–116, 2693–2698.
https://doi.org/10.4028/www.scientific.net/AMM.110-116.2693
hornik1989. (n.d.).
Karl, A. K., & Lohani, A. K. (2010a). Development of Flood Forecasting System Using Statistical
and ANN Techniques in the Downstream Catchment of Mahanadi Basin, India. Journal of
Water Resource and Protection, 02(10), 880–887.
https://doi.org/10.4236/jwarp.2010.210105
Karl, A. K., & Lohani, A. K. (2010b). Development of Flood Forecasting System Using Statistical
and ANN Techniques in the Downstream Catchment of Mahanadi Basin, India. Journal of
Water Resource and Protection, 02(10), 880–887.
https://doi.org/10.4236/jwarp.2010.210105
Kerem Cigizoglu, H., & Kisi, Ö. (2006). Methods to improve the neural network performance in
suspended sediment estimation. Journal of Hydrology, 317(3–4), 221–238.
https://doi.org/10.1016/j.jhydrol.2005.05.019
Meher, J. (2014). RAINFALL AND RUNOFF ESTIMATION USING HYDROLOGICAL MODELS AND
ANN TECHNIQUES ATHESIS SUBMITTED FOR THE AWARD OF THE DEGREE OF DOCTOR OF
PHILOSCOPHY CIVIL ENGINEERING.
Michael, A., Schmidt, J., Enke, W., Deutschländer, T., & Malitz, G. (2005). Impact of expected
increase in precipitation intensities on soil loss - Results of comparative model
simulations. Catena, 61(2-3 SPEC. ISS.), 155–164.
https://doi.org/10.1016/j.catena.2005.03.002
Mosquera-Machado, S., & Ahmad, S. (2007). Flood hazard assessment of Atrato River in
Colombia. Water Resources Management, 21(3), 591–609.
https://doi.org/10.1007/s11269-006-9032-4
Nijssen, B., O’donnell, G. M., Hamlet, A. F., & Lettenmaier, D. P. (2001). HYDROLOGIC
SENSITIVITY OF GLOBAL RIVERS TO CLIMATE CHANGE.
Nourani, V., & Komasi, M. (2013). A geomorphology-based ANFIS model for multi-station
modeling of rainfall-runoff process. Journal of Hydrology, 490, 41–55.
https://doi.org/10.1016/j.jhydrol.2013.03.024
Panigrahy, B. K., & Raymahashay, B. C. (n.d.). River water quality in weathered limestone: A
case study in upper Mahanadi basin, India.
Pruski, F. F., & Nearing, M. A. (2002). Climate-induced changes in erosion during the 21st
century for eight U.S. locations. Water Resources Research, 38(12), 34-1-34–11.
https://doi.org/10.1029/2001wr000493
Seyed Ahmad Mirbagheri, B., Tanji, K. K., Member, A., & Krone, R. B. (n.d.). SEDIMENT
CHARACTERIZATION AND TRANSPORT IN COLUSA BASIN DRAIN.
Singh, G., Kumar Panda, R., Professor, A., & Panda, R. K. (2011). Daily Sediment Yield Modeling
with Artificial Neural Network using 10-fold Cross Validation Method: A small agricultural
watershed, Kapgari, India Climate Change Impacts on Hydrological Extremities View
project GROUNDWATER DEVELOPMENT AND MANAGEMENT View project Daily Sediment
Yield Modeling with Artificial Neural Network using 10-fold Cross Validation Method: A
small agricultural watershed, Kapgari, India. In International Journal of Earth Sciences and
Engineering (Vol. 04). https://www.researchgate.net/publication/265988179
Tang, Z., de Almeida, C., & Fishwick, P. A. (1991). Time series forecasting using neural networks
vs. Box-Jenkins methodology. Simulation, 57(5), 303–310.
https://doi.org/10.1177/003754979105700508
Tayfur, G., & Guldal, V. (2006a). Artificial neural networks for estimating daily total suspended
sediment in natural streams. Nordic Hydrology, 37(1), 69–79.
https://doi.org/10.2166/nh.2005.031
Tayfur, G., & Guldal, V. (2006b). Artificial neural networks for estimating daily total suspended
sediment in natural streams. Nordic Hydrology, 37(1), 69–79.
https://doi.org/10.2166/nh.2005.031
Wang, Y.-M., & Traore, S. (2009). Time-lagged recurrent network for forecasting episodic event
suspended sediment load in typhoon prone area. In International Journal of Physical
Sciences (Vol. 4, Issue 9). http://www.academicjournals.org/ijps
Yadav, A., Chatterjee, S., & Equeenuddin, S. M. (2018a). Prediction of suspended sediment yield
by artificial neural network and traditional mathematical model in Mahanadi river basin,
India. Sustainable Water Resources Management, 4(4), 745–759.
https://doi.org/10.1007/s40899-017-0160-1
Yadav, A., Chatterjee, S., & Equeenuddin, S. M. (2018b). Prediction of suspended sediment
yield by artificial neural network and traditional mathematical model in Mahanadi river
basin, India. Sustainable Water Resources Management, 4(4), 745–759.
https://doi.org/10.1007/s40899-017-0160-1
Yadav, A., Chatterjee, S., & Equeenuddin, S. M. (2018c). Prediction of suspended sediment yield
by artificial neural network and traditional mathematical model in Mahanadi river basin,
India. Sustainable Water Resources Management, 4(4), 745–759.
https://doi.org/10.1007/s40899-017-0160-1
Zhu, Y. M., Lu, X. X., & Zhou, Y. (2007). Suspended sediment flux modeling with artificial neural
network: An example of the Longchuanjiang River in the Upper Yangtze Catchment, China.
Geomorphology, 84(1–2), 111–125. https://doi.org/10.1016/j.geomorph.2006.07.010
Declaration of Interest Statement
The authors declare that they have no known competing financial interests or personal relationships
that could have appeared to influence the work reported in this paper.