Water 15 00475 v2
Article
Water-Quality Prediction Based on H2O AutoML and
Explainable AI Techniques
Hamza Ahmad Madni 1, * , Muhammad Umer 2 , Abid Ishaq 2 , Nihal Abuzinadah 3 , Oumaima Saidani 4 ,
Shtwai Alsubai 5 , Monia Hamdi 6 and Imran Ashraf 7, *
1 College of Electronic and Information Engineering, Beibu Gulf University, Qinzhou 535011, China
2 Department of Computer Science & Information Technology, The Islamia University of Bahawalpur,
Bahawalpur 63100, Pakistan
3 Faculty of Computer Science and Information Technology, King Abdulaziz University, P.O. Box 80200,
Jeddah 21589, Saudi Arabia
4 Department of Information Systems, College of Computer and Information Sciences, Princess Nourah bint
Abdulrahman University, P.O. Box 84428, Riyadh 11671, Saudi Arabia
5 Department of Computer Science, College of Computer Engineering and Sciences in Al-Kharj,
Prince Sattam bin Abdulaziz University, P.O. Box 151, Al-Kharj 11942, Saudi Arabia
6 Department of Information Technology, College of Computer and Information Sciences, Princess Nourah bint
Abdulrahman University, P.O. Box 84428, Riyadh 11671, Saudi Arabia
7 Department of Information and Communication Engineering, Yeungnam University,
Gyeongsan 38541, Republic of Korea
* Correspondence: hamza@bbgu.edu.cn (H.A.M.); imranashraf@ynu.ac.kr (I.A.)
Abstract: Rapid expansion of the world’s population has negatively impacted the environment,
notably water quality. As a result, water-quality prediction has arisen as a hot issue during the last
decade. Existing techniques fall short in terms of good accuracy. Furthermore, presently, the dataset
available for analysis contains missing values; these missing values have a significant effect on the
performance of the classifiers. An automated system for water-quality prediction that deals with
the missing values efficiently and achieves good accuracy for water-quality prediction is proposed
in this study. To handle the accuracy problem, this study makes use of the stacked ensemble H2O
AutoML model; to handle the missing values, this study makes use of the KNN imputer. Moreover,
the performance of the proposed system is compared to that of seven machine learning algorithms.
Experiments are performed in two scenarios: removing missing values and using the KNN imputer.
The contribution of each feature regarding prediction is explained using SHAP (SHapley Additive
exPlanations). Results reveal that the proposed stacked model outperforms other models with 97%
accuracy, 96% precision, 99% recall, and 98% F1-score for water-quality prediction.
Keywords: water-quality prediction; KNN imputer; missing values; machine learning; deep learning
Citation: Madni, H.A.; Umer, M.; Ishaq, A.; Abuzinadah, N.; Saidani, O.; Alsubai, S.; Hamdi, M.;
Ashraf, I. Water-Quality Prediction Based on H2O AutoML and Explainable AI Techniques.
Water 2023, 15, 475. https://doi.org/10.3390/w15030475
Academic Editor: Kyriaki Kalaitzidou
accessible amount of water is 60% [3], which means that water is easily available to use
and abundant on Earth; it is accessible for industry, agriculture, and for drinking use [4].
Rivers and groundwater are the fundamental sources of fresh water; social and economic
development is directly linked with fresh water [5]. Due to human activities, both
surface water and groundwater are under great pressure. Activities such as commercialization,
urbanization, population growth, and industrialization have a direct impact on
water quality and quantity [6]. Additionally, climate change and global warming further
degrade water quality. Therefore, water quality evaluation and estimation are of
great concern today [7].
The index used for the assessment and classification of surface water and groundwater
is the water-quality index (WQI), a widely used parameter for water-quality classification.
Brown et al. [8] proposed the index for water-quality level estimation. It is computed
from physicochemical parameters of water such as pH, the concentration
of pollutants, dissolved oxygen, temperature, turbidity, and biochemical oxygen demand.
For policymakers, the WQI gives meaningful qualitative data, and it is helpful
for the planners of water distribution systems. The drawback of the WQI is that it involves
lengthy and complex computations that require considerable time and effort [9].
To address these problems, an alternative, state-of-the-art system for efficient
water-quality classification (WQC) is needed.
AI-based modeling removes the complex and lengthy calculations and classifies the WQI
promptly [9]. Therefore, water-quality classification using artificial intelligence-based
systems is attracting the attention of many researchers, who have proposed
different WQC systems using machine learning and deep learning models. However,
such efforts often achieve low accuracy. Furthermore, the dataset available for the
experiments has missing values in features that are needed for water-quality prediction,
and these missing values have a direct impact on the results.
Clean and easily available water is required for drinking, home usage, recreational
activities, and food production. Better water-supply and resource management may signifi-
cantly increase a country’s economic development. Sufficient water should be available for
personal and domestic usage and should always be safe, easily accessible, and available
to everyone. Every year, many individuals die from kidney failure, cancer, and other
diseases caused by polluted water. Laboratory methods for classifying water quality are
resource-intensive and time-consuming. Many water-quality classification methods are
already available; however, many lack accuracy. As a result, it is very important to have
an automated system that can classify water quality with low human effort and with
time efficiency.
The continuous, diligent evaluation and acceptability of drinking water sources by
the public health community is referred to as potable-water-quality surveillance. A perfect
water distribution and monitoring system guarantees people’s health if the potable water is
treated without errors. Further, the perfect water treatment system is in vain if the architec-
ture of the water supply and water treatment allows contamination into the potable water.
During the last decade, concerns about water contamination have been raised. Prediction
of water quality has become an important topic, as it directly relates to the survival of
life on Earth. As a result, there is a vast amount of work on automated water-quality prediction
techniques. Such efforts often yield comparatively low accuracy. Moreover, the dataset
available for experimentation has missing values and missing attributes. These missing values
affect the results of water-quality prediction. To address this issue efficiently, this study
makes the following contributions:
• A novel H2O AutoML stacked ensemble model is proposed that provides higher
accuracy for drinking water-quality prediction.
• For resolving the issue of missing values, experiments are performed using two
scenarios, where the first scenario involves deleting the missing values, while a
K-nearest neighbor (KNN) imputer is used in the second scenario.
• Experiments are conducted to assess the performance of the KNN imputer and the
proposed H2O AutoML stacked ensemble model involving the use of several learning
models including logistic regression (LR), extra tree classifier (ETC), random forest
(RF), stochastic gradient descent classifier (SGDC), Gaussian naïve Bayes (GNB),
and gradient-boosting machine (GBM).
• The importance of different features is explained using the SHapley Additive exPlana-
tions (SHAP) model.
This study of WQC consists of four further sections: Section 2 briefly discusses the
previous research related to WQC. Section 3 consists of the description of the dataset,
proposed methodology, and description of the machine learning model used in this study.
Section 4 describes the results, and Section 5 discusses the conclusions of the study.
2. Related Work
Water is one of the most important resources for the existence of life, and human
needs are directly linked with the availability of water from both sources (surface and
groundwater). Thus, it is very important to have a state-of-the-art system that can classify
water quality. Many studies carried out for water-quality classification have provided
promising results. This section reviews several previous works that used
artificial intelligence systems for water-quality index prediction.
Juna et al. [10] worked on automatic water-quality prediction using a KNN imputer
and MLP. They handled the missing values efficiently and obtained higher performance
regarding accuracy. They proposed a nine-layer multilayer perceptron (MLP) system with
KNN imputer to deal with the missing values. They also used seven machine learning
algorithms for comparison. Experimental results show that the proposed nine-layer MLP
achieved an accuracy value of 99% for water-quality prediction using the KNN imputer.
A dependable approach was proposed by Nasir et al. [4] for predicting water quality
accurately. The authors used various machine learning and stacked ensemble learning
models for water-quality classification via the water-quality index. They used LR, RF, DT,
SVM, XGBoost, CatBoost, and MLP for this purpose. Results of the study show that
CatBoost achieved an accuracy of 94.51%. For water-quality classification, Radhakrishnan
and Pillai [2] used three machine learning models, DT, SVM, and NB, on multiple datasets.
The performance of the models was compared, and the results revealed that DT achieved
the best classification accuracy, i.e., 98.50%.
Aldhyani et al. [11] used a non-linear autoregressive neural network (NARNET) and
long short-term memory (LSTM). In addition to these deep learning models, they also
used three machine learning models, NB, SVM, and KNN, for experiments.
NARNET and LSTM achieved almost the same accuracy but slightly different regression
coefficients (R_LSTM = 94.21%, R_NARNET = 96.17%); among the machine learning models, SVM
achieved an accuracy of 97.01%. Shahra et al. [12] proposed a deep learning-based system
for water-quality classification in water distribution networks. The study aimed to achieve
high accuracy while keeping computation time low. They used two learning algorithms,
ANN and SVM. ANN outperformed SVM, achieving an accuracy of 94%, whereas SVM
achieved 89%.
An adaptive neuro-fuzzy system was proposed by Mohammed et al. [13] for the classification
of drinking water into two classes: safe and unsafe. They used a real-time time-series
dataset that had four water quality parameters: bacteria count, color, turbidity, and pH.
The proposed adaptive neuro-fuzzy system achieved an accuracy of 92% for detecting
contaminated data. Abuzir and Abuzir [14] used J48, MLP, and NB for water-quality
classification. They used a dataset that had 10 features. Different feature extraction techniques
were used for the dimensionality reduction of the dataset. They experimented with three
scenarios: using all features, using five features, and using two features. With all features
and with selected features, MLP outperformed the other two learning models.
Hassan et al. [15] used machine learning and deep learning models for the classification of
Indian water quality data. The authors used SVM, RF, NN, multinomial logistic regression
(MLR), and bagged tree models (BTM). The results revealed that the main features, such as
total coliform, biological oxygen demand, dissolved oxygen, conductivity, pH, and nitrate,
affect the water quality classification. A study by Sillberg et al. [16] used attribute realization
(AR) and SVM for water-quality classification of the Chao Phraya River. Using
AR-SVM on six features of river-water data, they achieved accuracy from 86% to 95%.
The study by Ahmed et al. [17] used four different features, including turbidity, pH,
TDS, and temperature, for water-quality prediction. Experimental results show that MLP
outperformed the other learning algorithms in terms of accuracy and achieved an accuracy
of 85.05% with a (3,7) configuration.
IoT-based systems play a vital role in water-quality classification. Kakkar et al. [18]
used IoT-based devices to collect data from residential overhead tanks. After data
collection, they used machine learning and deep learning systems for WQC. Malek et al. [19]
used Kelantan River data from 2005 to 2020 for water-quality classification. They
employed different kinds of machine learning models on 13 physical and chemical
parameters. The results show that gradient boosting
with a learning rate of 0.1 achieved an accuracy of 94.90%. For water-quality and
water-demand prediction, Rustam et al. [20] proposed an artificial neural network
with one hidden layer and several dropout and activation layers. Experiments were
conducted on two datasets to predict water quality
and water consumption. For water-quality prediction, they achieved an accuracy of 0.96,
while the R² score for water consumption prediction was 0.99. A comparative analysis of
existing approaches for water-quality prediction is presented in Table 1.
Table 1. Cont.
[Figure: workflow of the proposed approach. Drinking water quality dataset (water features);
preprocessing (1. KNN imputer, 2. label encoder); feature engineering; train-test split;
stacked ensemble H2O AutoML; trained model; evaluation (accuracy, precision, recall, F-score).]
Table 2.
Feature           Description
pH                Water pH (0 to 14).
Hardness          Soap precipitate capacity in water in mg/L.
Solids            Total dissolved solids in ppm.
Chloramines       Number of chloramines in ppm.
Sulfate           Sulfates dissolved in mg/L.
Conductivity      Electrical conductivity of water in µS/cm.
Organic_carbon    Organic carbon in ppm.
Trihalomethanes   Trihalomethanes in µg/L.
Turbidity         Light-emitting property of water in NTU.
Potability        Target class of whether the water is potable or not potable:
                  potable is 1, and not potable is 0.
where
$$\text{weight} = \frac{\text{total number of coordinates}}{\text{number of present coordinates}} \tag{2}$$
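The weight in Equation (2) can be illustrated with a short sketch. The distance formula itself is not reproduced in this passage, so the code below assumes the NaN-aware Euclidean form used by scikit-learn's `nan_euclidean_distances`: the squared differences over the coordinates present in both records are scaled by the weight before taking the square root. The function name `nan_euclidean` and the use of `None` for a missing value are illustrative choices, not part of the original study.

```python
import math

def nan_euclidean(a, b):
    """Euclidean distance that ignores coordinates missing (None) in either record."""
    # Keep only the coordinate pairs observed in both records.
    present = [(x, y) for x, y in zip(a, b) if x is not None and y is not None]
    if not present:
        return math.nan  # no shared coordinates, so the distance is undefined
    # Equation (2): total number of coordinates / number of present coordinates.
    weight = len(a) / len(present)
    return math.sqrt(weight * sum((x - y) ** 2 for x, y in present))

# Two records with four features each; only the first and last are present in both.
d = nan_euclidean([3.0, None, None, 6.0], [1.0, None, 4.0, 5.0])
print(round(d, 4))
```

Here the weight is 4/2 = 2, so the distance is sqrt(2 × ((3 − 1)² + (6 − 5)²)) = sqrt(10). Scaling by the weight keeps distances comparable between record pairs that share different numbers of observed features.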
forest [27], extra tree classifier [28], gradient boosting machine [29], stochastic gradient
descent [30], and H2O stacked ensemble [23].
For the majority of classification tasks, the Gini index is used as a cost function for the
estimation of a split in the dataset. The Gini index can be computed using
$$\text{Gini} = 1 - \sum_{i=1}^{\text{classes}} p(i \mid t)^2 \tag{7}$$
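As a minimal sketch of Equation (7), the Gini index of a node can be computed from the class labels that reach it, taking p(i | t) as the fraction of samples of class i at node t. The helper `split_gini`, which scores a candidate split by the size-weighted Gini of its two children, is the usual CART convention and is added here as an assumption; the function names are illustrative.

```python
from collections import Counter

def gini_index(labels):
    """Gini impurity of one node: 1 - sum over classes of p(i|t)^2."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())

def split_gini(left, right):
    """Size-weighted Gini impurity of a candidate split into two child nodes."""
    n = len(left) + len(right)
    return (len(left) / n) * gini_index(left) + (len(right) / n) * gini_index(right)

# Potability labels (1 = potable, 0 = not potable) at a hypothetical node.
print(gini_index([1, 1, 0, 0]))      # evenly mixed node
print(split_gini([0, 0], [1, 1]))    # split that separates the classes
```

A lower value means a purer node, so a tree-based learner would prefer the split with the smallest weighted Gini.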
where $\alpha$ is the model learning rate and $\Theta_j$ is the parameter. For better performance, SGD
uses several hyperparameters.
to explain the ML predictions. Tree-SHAP uses a linear explanatory model and Shapley
values for the initial prediction model estimation.
$$h(z') = \phi_0 + \sum_{i=1}^{N} \phi_i z'_i \tag{9}$$
where $z'$ represents the basic features, $\phi_i$ denotes the feature attribution, and $h$ is the
explanation model. Lundberg and Lee [32] calculate each feature attribution using the
equation below:
$$\phi_i = \sum_{K \subseteq M \setminus \{i\}} \frac{|K|!\,(N - |K| - 1)!}{N!} \left[ g_x(K \cup \{i\}) - g_x(K) \right] \tag{10}$$
$$g_x(K) = E[g(x) \mid x_K] \tag{11}$$
where $M$ represents the set of all inputs, $K$ is a subset of the input features, and
$E[g(x) \mid x_K]$ is the expected value of the function conditioned on subset $K$. A linear additive feature
attribution method is used by SHAP for the simpler explanation
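To make Equations (10) and (11) concrete, the following sketch computes exact Shapley attributions by enumerating every subset K of the other features. It is not Tree-SHAP and not the `shap` library: it is a brute-force illustration whose cost grows exponentially with the number of features. The conditional expectation g_x(K) is approximated by replacing the features outside K with a fixed baseline value, which is an assumption of this sketch, and the toy linear model and its inputs are hypothetical.

```python
from itertools import combinations
from math import factorial

def shapley_values(g, x, baseline):
    """Exact Shapley attributions phi_i for model g at input x (Equation (10))."""
    n = len(x)

    def g_x(K):
        # Stand-in for E[g(x) | x_K]: keep features in K, set the rest to the baseline.
        z = [x[j] if j in K else baseline[j] for j in range(n)]
        return g(z)

    phi = []
    for i in range(n):
        others = [j for j in range(n) if j != i]
        total = 0.0
        for size in range(n):  # |K| ranges from 0 to N - 1
            w = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                K = set(subset)
                total += w * (g_x(K | {i}) - g_x(K))
        phi.append(total)
    return phi

def model(z):
    # Hypothetical linear model over three features.
    return 2.0 * z[0] + 1.0 * z[1] - 3.0 * z[2]

x = [1.0, 2.0, 3.0]
baseline = [0.0, 0.0, 0.0]
phi = shapley_values(model, x, baseline)
phi_0 = model(baseline)
print(phi, phi_0)
```

For a linear model each attribution reduces to the coefficient times the feature's offset from the baseline, and phi_0 plus the sum of the attributions recovers the prediction, which matches the additive form of Equation (9).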
3.15. Evaluation
Model evaluation is an important step that estimates
the performance of the model on unseen data. For water-quality classification, the four
possible outcomes are described below:
True Positive (TP): instances that are actually positive and are predicted positive.
True Negative (TN): instances that are actually negative and are predicted negative.
False Positive (FP): instances that are negative and are predicted as positive.
False Negative (FN): instances that are positive and are predicted as negative.
This study evaluates the proposed system in terms of accuracy, precision, recall,
and F-score. The values of these parameters range between 0 and 1.
Accuracy is the proportion of correctly predicted instances. It can be computed using
the following formula:
$$\text{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN} \tag{12}$$
Precision is the exactness of the classifier. Mathematically, precision can be computed as:
$$\text{Precision} = \frac{TP}{TP + FP} \tag{13}$$
Recall is the completeness of the classifier. Mathematically, recall can be computed as:
$$\text{Recall} = \frac{TP}{TP + FN} \tag{14}$$
The harmonic mean of recall and precision is called the F1 score. It is also referred to
as F-score. It can be calculated using the following formula:
$$\text{F1-Score} = 2 \times \frac{\text{Precision} \times \text{Recall}}{\text{Precision} + \text{Recall}} \tag{15}$$
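Equations (12) to (15) can be checked with a few lines of code. The sketch below takes the four confusion-matrix counts and returns the four scores. The counts in the example are hypothetical and are not results from this study, and returning 0.0 when a denominator is zero is a convention chosen for the sketch.

```python
def classification_metrics(tp, tn, fp, fn):
    """Accuracy, precision, recall and F1 score from confusion-matrix counts."""
    accuracy = (tp + tn) / (tp + tn + fp + fn)                 # Equation (12)
    precision = tp / (tp + fp) if (tp + fp) else 0.0           # Equation (13)
    recall = tp / (tp + fn) if (tp + fn) else 0.0              # Equation (14)
    if precision + recall:
        f1 = 2 * precision * recall / (precision + recall)     # Equation (15)
    else:
        f1 = 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}

# Hypothetical counts for a potable (positive) versus not potable (negative) test set.
m = classification_metrics(tp=40, tn=45, fp=5, fn=10)
for name, value in m.items():
    print(f"{name}: {value:.3f}")
```

Because the F1 score is a harmonic mean, it sits between precision and recall and is pulled toward the lower of the two.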
[Figure: instances are passed to RF and GBM, and the class probabilities are averaged:
P(1) = (P_RF + P_GBM)/2, P(2) = (P_RF + P_GBM)/2.]
been removed. The results clearly show that the performances from GBM and RF are
acceptable, and the rest of the models’ performances are poor.
Figure 3. Accuracy, precision, and other metrics for models using the deleted-values dataset.
4.2. Experimental Results of All Learning Models by Filling Values with KNN Imputer
In the second set of experiments, the KNN imputer is used. Preprocessing reveals
missing values in the dataset, which are handled with the KNN imputer.
The imputer finds the nearest neighbors of an incomplete record using the Euclidean
distance and fills each missing value with the mean of the neighbors' values for that
feature. Once the missing values are imputed, the data are used for the machine learning
experiments. Table 4 displays the results of the
machine learning models obtained with the KNN imputer.
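The imputation step described above can be sketched in plain Python. This is an illustration of the general technique and not the authors' code: it assumes a NaN-aware Euclidean distance of the kind used by scikit-learn's `KNNImputer`, uniform weighting of the neighbors, and `None` as the marker for a missing value. The function names, the tiny dataset, and `k = 2` are all chosen for the example.

```python
import math

def nan_euclidean(a, b):
    """Euclidean distance over shared coordinates, rescaled for the missing ones."""
    present = [(x, y) for x, y in zip(a, b) if x is not None and y is not None]
    if not present:
        return math.nan
    weight = len(a) / len(present)
    return math.sqrt(weight * sum((x - y) ** 2 for x, y in present))

def knn_impute(rows, k=2):
    """Fill each None with the mean of that feature over the k nearest donor rows."""
    filled = [list(row) for row in rows]
    for r, row in enumerate(rows):
        for c, value in enumerate(row):
            if value is not None:
                continue
            # Donors are the other rows in which feature c is observed.
            donors = [
                (nan_euclidean(row, other), other[c])
                for s, other in enumerate(rows)
                if s != r and other[c] is not None
            ]
            donors = [d for d in donors if not math.isnan(d[0])]
            donors.sort(key=lambda d: d[0])
            nearest = donors[:k]
            if nearest:
                filled[r][c] = sum(v for _, v in nearest) / len(nearest)
    return filled

data = [
    [1.0, 2.0, None],
    [3.0, 4.0, 3.0],
    [None, 6.0, 5.0],
    [8.0, 8.0, 7.0],
]
filled = knn_impute(data, k=2)
print(filled)
```

Distances are always computed on the original records, so a value imputed for one row does not influence the neighbors chosen for another.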
Table 4. Experimental results using machine learning models with KNN imputer data.
The results show that RF and GBM reach 80% accuracy. RF obtains 80%
precision, recall, and F1 score, while GBM has precision and recall of 80% but an
F1-score of 79%. SGDC attains an accuracy of 59%. In terms of accuracy,
precision, recall, and F1 score, the H2O stacked model once again outperforms all
individual models. The graphical depiction of the machine learning model outcomes using
the KNN imputer is shown in Figure 4. It illustrates that using the KNN imputer enhances
the performance of the machine learning models.
4.3. Accuracy Comparison of All Learning Models with KNN Imputer and Removing Missing Data
For a detailed performance analysis, the results obtained from the learning
models with and without the KNN imputer are compared in this section. Experimental
results show that the learning models perform well when the KNN imputer is employed for
filling in the missing values. These results are better than the results of the learning
models without the KNN imputer. The accuracy comparison of all learning models with
the KNN imputer and the removal of missing data is shown in Table 5.
Table 5. Performance comparison of the machine learning models with and without KNN imputer.
               Accuracy (%)
Model          KNN Imputer   Deletion of Missing Values
LR             61            48
GNB            61            52
ETC            72            72
RF             80            79
SGDC           59            50
GBM            80            76
H2O Stacked    97            87
Figure 5 compares the performance of the machine learning models when missing values
are deleted with their performance on the dataset imputed using the KNN imputer. The
KNN imputer improves the accuracy of every model except ETC, whose accuracy is unchanged
at 72%; the largest gains are for LR (13 points) and the H2O stacked model (10 points).
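The comparison in Table 5 can be summarized as the accuracy gain, in percentage points, that each model obtains from imputation. The short sketch below uses only the values reported in Table 5; the variable names are illustrative.

```python
# Accuracy (%) from Table 5: (with KNN imputer, with missing values deleted).
table5 = {
    "LR": (61, 48),
    "GNB": (61, 52),
    "ETC": (72, 72),
    "RF": (80, 79),
    "SGDC": (59, 50),
    "GBM": (80, 76),
    "H2O Stacked": (97, 87),
}

# Gain in percentage points from using the KNN imputer instead of deletion.
gains = {name: knn - deleted for name, (knn, deleted) in table5.items()}
mean_gain = sum(gains.values()) / len(gains)

for name, gain in sorted(gains.items(), key=lambda item: -item[1]):
    print(f"{name:12s} +{gain}")
print(f"mean gain: {mean_gain:.2f} points")
```

No model loses accuracy under imputation, and the gains range from 0 points (ETC) to 13 points (LR).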
Table 7. Comparison of proposed approach with state-of-the-art approaches for water-quality prediction.
5. Conclusions
Human survival is not possible without safe drinking water. Polluted water has
many adverse effects on human health that ultimately result in severe and life-threatening
diseases. Due to extensive urbanization, drinking water channels are
contaminated with polluted water, which makes it difficult for people to find
safe drinking water. This research work provides a stacked ensemble framework that
accurately classifies safe and harmful drinking water. The proposed stacked H2O AutoML
framework performs best with the KNN imputer technique, which deals with the missing
values of the dataset. Experiments are carried out in two phases: with KNN-imputed values
and with missing values deleted. Results reveal that using the KNN imputer to fill in the
missing values is the better choice, as deleting records with missing values causes information loss
that affects the performance of the models. The contribution of each feature to the prediction
is explained using an explainable AI technique, SHAP. The proposed approach obtains 97%
accuracy when used with the KNN imputer.
Author Contributions: Conceptualization, H.A.M. and M.U.; Data curation, S.A.; Formal analysis,
M.H. and O.S.; Funding acquisition, H.A.M. and M.H.; Investigation, M.U.; Methodology, A.I.; Project
administration, A.I.; Resources, O.S.; Software, A.I. and S.A.; Supervision, I.A. and N.A.; Validation,
I.A.; Visualization, S.A. and M.H.; Writing—original draft, H.A.M. and M.U.; Writing—review and
editing, N.A. and I.A. All authors have read and agreed to the published version of the manuscript.
Funding: This research was funded by College of Electronic and Information Engineering, Beibu
Gulf University, Qinzhou 535011, China and by Princess Nourah bint Abdulrahman University
Researchers Supporting Project number (PNURSP2023R125), Princess Nourah bint Abdulrahman
University, Riyadh, Saudi Arabia.
Institutional Review Board Statement: Not applicable.
Informed Consent Statement: Not applicable.
Data Availability Statement: The datasets are available from the authors upon request.
Acknowledgments: This study is supported via funding from Prince Sattam bin Abdulaziz Univer-
sity project number (PSAU/2023/R/1444).
Conflicts of Interest: The authors declare no conflict of interest.
References
1. Muhammad, S.Y.; Makhtar, M.; Rozaimee, A.; Aziz, A.A.; Jamal, A.A. Classification model for water quality using machine
learning techniques. Int. J. Softw. Eng. Its Appl. 2015, 9, 45–52. [CrossRef]
2. Radhakrishnan, N.; Pillai, A.S. Comparison of water quality classification models using machine learning. In Proceedings of
the 2020 5th International Conference on Communication and Electronics Systems (ICCES), Coimbatore, India, 10–12 June 2020;
pp. 1183–1188.
3. Walley, W.; Džeroski, S. Biological monitoring: A comparison between Bayesian, neural and machine learning methods of water
quality classification. In Environmental Software Systems; Springer: Berlin/Heidelberg, Germany, 1996; pp. 229–240.
4. Nasir, N.; Kansal, A.; Alshaltone, O.; Barneih, F.; Sameer, M.; Shanableh, A.; Al-Shamma’a, A. Water quality classification using
machine learning algorithms. J. Water Process Eng. 2022, 48, 102920. [CrossRef]
5. Nouraki, A.; Alavi, M.; Golabi, M.; Albaji, M. Prediction of water quality parameters using machine learning models: A case
study of the Karun River, Iran. Environ. Sci. Pollut. Res. 2021, 28, 57060–57072. [CrossRef] [PubMed]
6. Ambade, B.; Sethi, S.S.; Giri, B.; Biswas, J.K.; Bauddh, K. Characterization, behavior, and risk assessment of polycyclic aromatic
hydrocarbons (PAHs) in the estuary sediments. Bull. Environ. Contam. Toxicol. 2022, 108, 243–252. [CrossRef]
7. Singha, S.; Pasupuleti, S.; Singha, S.S.; Singh, R.; Kumar, S. Prediction of groundwater quality using efficient machine learning
technique. Chemosphere 2021, 276, 130265. [CrossRef]
8. Brown, R.M.; McClelland, N.I.; Deininger, R.A.; Tozer, R.G. A water quality index-do we dare. Water Sew. Work. 1970,
117, 339–343.
9. Bui, D.T.; Khosravi, K.; Tiefenbacher, J.; Nguyen, H.; Kazakis, N. Improving prediction of water quality indices using novel
hybrid machine-learning algorithms. Sci. Total Environ. 2020, 721, 137612. [CrossRef]
10. Juna, A.; Umer, M.; Sadiq, S.; Karamti, H.; Eshmawi, A.; Mohamed, A.; Ashraf, I. Water Quality Prediction Using KNN Imputer
and Multilayer Perceptron. Water 2022, 14, 2592. [CrossRef]
11. Aldhyani, T.H.; Al-Yaari, M.; Alkahtani, H.; Maashi, M. Water quality prediction using artificial intelligence algorithms. Appl.
Bionics Biomech. 2020, 2020. [CrossRef]
12. Shahra, E.Q.; Wu, W.; Basurra, S.; Rizou, S. Deep Learning for Water Quality Classification in Water Distribution Networks. In
Proceedings of the International Conference on Engineering Applications of Neural Networks, Crete, Greece, 17–20 June 2021;
Springer: Berlin/Heidelberg, Germany, 2021; pp. 153–164.
13. Mohammed, H.; Hameed, I.A.; Seidu, R. Machine learning: Based detection of water contamination in water distribution
systems. In Proceedings of the Genetic and Evolutionary Computation Conference Companion, Kyoto, Japan, 15–19 July 2018;
pp. 1664–1671.
14. Abuzir, S.Y.; Abuzir, Y.S. Machine learning for water quality classification. Water Qual. Res. J. 2022, 57, 152–164. [CrossRef]
15. Hassan, M.M.; Hassan, M.M.; Akter, L.; Rahman, M.M.; Zaman, S.; Hasib, K.M.; Jahan, N.; Smrity, R.N.; Farhana, J.; Raihan,
M.; et al. Efficient prediction of water quality index (WQI) using machine learning algorithms. Hum.-Centric Intell. Syst. 2021,
1, 86–97. [CrossRef]
16. Sillberg, C.V.; Kullavanijaya, P.; Chavalparit, O. Water quality classification by integration of attribute-realization and support
vector machine for the Chao Phraya River. J. Ecol. Eng. 2021, 22, 70–86. [CrossRef]
17. Ahmed, U.; Mumtaz, R.; Anwar, H.; Shah, A.A.; Irfan, R.; García-Nieto, J. Efficient water quality prediction using supervised
machine learning. Water 2019, 11, 2210. [CrossRef]
18. Kakkar, M.; Gupta, V.; Garg, J.; Dhiman, S. Detection of water quality using machine learning and IoT. Int. J. Eng. Res. Technol.
(IJERT) 2021, 10, 73–75.
19. Malek, N.H.A.; Wan Yaacob, W.F.; Md Nasir, S.A.; Shaadan, N. Prediction of Water Quality Classification of the Kelantan River
Basin, Malaysia, Using Machine Learning Techniques. Water 2022, 14, 1067. [CrossRef]
20. Rustam, F.; Ishaq, A.; Kokab, S.T.; de la Torre Diez, I.; Mazón, J.L.V.; Rodríguez, C.L.; Ashraf, I. An Artificial Neural Network
Model for Water Quality and Water Consumption Prediction. Water 2022, 14, 3359. [CrossRef]
21. Kaggle. Water Quality. 2021. Available online: https://www.kaggle.com/datasets/adityakadiwal/water-potability (accessed on
1 November 2022).
22. Zhang, S. Nearest neighbor selection for iteratively kNN imputation. J. Syst. Softw. 2012, 85, 2541–2552. [CrossRef]
23. AUTOML: Automatic machine learning. Available online: https://www.automl.org/automl/ (accessed on 1 November 2022).
24. H2O.ai. H2O: Scalable Machine Learning Platform. Available online: https://h2o.ai/platform/h2o-automl/ (accessed on 1
November 2022).
25. Ishaq, A.; Sadiq, S.; Umer, M.; Ullah, S.; Mirjalili, S.; Rupapara, V.; Nappi, M. Improving the prediction of heart failure patients’
survival using SMOTE and effective data mining techniques. IEEE Access 2021, 9, 39707–39716. [CrossRef]
26. Rustam, F.; Ashraf, I.; Mehmood, A.; Ullah, S.; Choi, G.S. Tweets classification on the base of sentiments for US airline companies.
Entropy 2019, 21, 1078. [CrossRef]
27. Manzoor, M.; Umer, M.; Sadiq, S.; Ishaq, A.; Ullah, S.; Madni, H.A.; Bisogni, C. RFCNN: Traffic accident severity prediction based
on decision level fusion of machine and deep learning model. IEEE Access 2021, 9, 128359–128371. [CrossRef]
28. Sharaff, A.; Gupta, H. Extra-tree classifier with metaheuristics approach for email classification. In Advances in Computer
Communication and Computational Sciences; Springer: Berlin/Heidelberg, Germany, 2019; pp. 189–197.
29. Fabian, D.; Guillermo Prieto Eibl, M.d.P.; Alnahhas, I.; Sebastian, N.; Giglio, P.; Puduvalli, V.; Gonzalez, J.; Palmer, J.D. Treatment
of glioblastoma (GBM) with the addition of tumor-treating fields (TTF): A review. Cancers 2019, 11, 174. [CrossRef] [PubMed]
30. Sowmya, B.; Nikhil Jain, C.; Seema, S.; KG, S. Fake News Detection using LSTM Neural Network Augmented with SGD Classifier.
Solid State Technol. 2020, 63, 6985–9665.
31. Ahmad, M.A.; Eckert, C.; Teredesai, A. Interpretable machine learning in healthcare. In Proceedings of the 2018 ACM International
Conference on Bioinformatics, Computational Biology, and Health Informatics, Washington, DC, USA, 29 August–1 September
2018; pp. 559–560.
32. Lundberg, S.M.; Lee, S.I. A unified approach to interpreting model predictions. Adv. Neural Inf. Process. Syst. 2017, 30, 4768–4777.
33. Hasan, A.N.; Alhammadi, K.M. Quality Monitoring of Abu Dhabi Drinking Water Using Machine Learning Classifiers. In
Proceedings of the 2021 14th International Conference on Developments in eSystems Engineering (DeSE), Sharjah, United Arab
Emirates, 7–10 December 2021; pp. 1–6.
34. Dilmi, S.; Ladjal, M. A novel approach for water quality classification based on the integration of deep learning and feature
extraction techniques. Chemom. Intell. Lab. Syst. 2021, 214, 104329. [CrossRef]