International Journal of Scientific Research and Engineering Development-– Volume 6 Issue 1, Jan-Feb 2023
Available at www.ijsred.com
 RESEARCH ARTICLE                                                                       OPEN ACCESS
 Monitoring and Prediction of Air Pollution Using Machine
               Learning Models: A Review
     Hemangi Tamore, Brinda Temkar, Sanya Wakode, Sunidhi Yadav, Dr.Jyoti Dange
                             EXTC, Atharva College Of Engineering, Mumbai
                              Email: tamorehemangi-extc@atharvacoe.ac.in
                             EXTC, Atharva College Of Engineering, Mumbai
                               Email: temkarbrinda-extc@atharvacoe.ac.in
                             EXTC, Atharva College Of Engineering, Mumbai
                               Email: wakodesanya-extc@atharvacoe.ac.in
                             EXTC, Atharva College Of Engineering, Mumbai
                               Email: yadavsunidhi-extc@atharvacoe.ac.in
                             EXTC, Atharva College Of Engineering, Mumbai
                                  Email: jyotidange2112@gmail.com
-----------------------------------------************************---------------------------------
Abstract:
        Machine learning and data science are increasingly influential in healthcare, personalized
recommendation models, environmental studies, and educational institutions, and have become an
important consideration for both companies and individuals. Prediction of air quality is one of the
fields to which machine learning has contributed. The air quality index measures the concentration of
gases such as carbon dioxide, carbon monoxide, nitrogen dioxide, and sulfur dioxide, together with
particulate matter such as smoke and soot, and methane, which are released when natural gas, coal,
wood, etc. are burned. High concentrations of these substances can cause severe diseases such as lung
cancer and even premature death. Machine learning helps predict air quality so that necessary action
can be taken when pollutant levels rise above a certain limit. If air pollution is not handled
carefully and sensibly, it could one day lead to the extinction of humans. This paper reviews the
results obtained by researchers on monitoring and prediction of air pollution using machine learning
and IoT. After a thorough review, it was observed that the machine learning algorithms used for the
analysis were quite effective.
Keywords — Air pollution, air quality prediction, machine learning, iot, regression, maps
-----------------------------------------************************---------------------------------
  INTRODUCTION
Human society is evolving at a great pace. This is evident in every city, as are the various problems
caused by this growth and expansion. The rapid growth of cities naturally increases pollution from
heavy traffic, industries, etc. As a result, air quality deteriorates, with a great impact on human
health. The pollutants that most affect human health are carbon dioxide (CO2), carbon monoxide (CO),
sulfur dioxide (SO2), ozone (O3), nitrogen oxide (NO), nitrogen dioxide (NO2), and a complex mixture
of solid particles and liquid droplets called particulate matter (PM, e.g., PM2.5, PM10) [2]. Many
studies show that respiratory disorders are linked to air pollution. A study conducted by State of
Global Air (SOGA) shows that continuous exposure to air pollution can reduce life expectancy by up to
20 months [2]. Air pollution is accountable for the death of 7 million persons worldwide each year, or
one in eight
ISSN : 2581-7175                     ©IJSRED: All Rights are Reserved                               Page 156
premature deaths yearly [4]. Predicting the quality of air in a particular region is a difficult
task, which makes people less careful about the existing air quality. The World Health Organisation
(WHO) has set guidelines and limits for urban air pollution that should be respected in order to
protect citizens from pollutants [2]. Almost 570,000 children under the age of five die every year
from respiratory infections linked to indoor/outdoor pollution and second-hand smoke [4]. Small
children who come in contact with air pollution may be at risk of developing respiratory disorders.
Therefore, there is a need to develop a system that predicts air quality accurately.

Air Quality Index (AQI) is calculated on the basis of the concentration of different pollutants like
CO2, SO2, CO, NH3, PM2.5, PM10, etc. As shown in Fig. 1, the index uses different colors for better
understanding of the level of pollutants in air. Here, green indicates the least polluted air while
maroon indicates the most polluted air.

        Fig. 1. Air Quality Index (AQI)

This paper includes Section 2, which describes the methodology adopted by different researchers;
Section 3, which explores the results obtained on the same; and finally Section 4, which includes the
conclusion.

II. LITERATURE REVIEW
In [1], the researchers proposed a web application based on air pollution monitoring systems using
HAPS-based wireless sensor networks that was able to display air quality data in graphical and
tabular form. This is in accordance with the design of the website, where the air condition menu
contains a line graph that is divided into two submenus: graphs of CO and PM10 levels.

[2] presents an IoT platform covering the whole process from data collection at the sensing nodes to
final visualization. The authors provide an easy-to-assemble and easy-to-duplicate design for research
purposes to enable fast sensor deployment and data collection. It is a low-level middleware, meaning
that it can be implemented directly on top of the physical layer of low-consumption protocols such as
the Zigbee protocol.

In [3], the paper proposes that a multivariate modeling approach enhances prediction accuracy and
reduces error because of the dependency between the target gases and other included features such as
temperature, day of the week, and H2S.

In [4], the system was designed using an Arduino microcontroller. It was developed to monitor and
analyze real-time air quality data and log it to a remote server, keeping the data updated over the
internet. Air quality measurements were taken in parts per million (PPM) and analyzed using Microsoft
Excel. The designed system took accurate measurements of air quality. The results were displayed on
the hardware's display interface and could be accessed via the cloud on any smartphone.
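As a minimal sketch of the multivariate idea reported in [3], the Python example below fits ordinary least squares on made-up readings, once with temperature alone and once with temperature and H2S together. The data, the coefficients used to generate the target, and all function names are illustrative assumptions and are not taken from [3]. The only point is that the error drops when a feature the target actually depends on is added to the model.

```python
from statistics import mean


def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Work on an augmented copy so the caller's data is not modified.
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    # Back substitution on the upper-triangular system.
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def fit_least_squares(rows, targets):
    """Ordinary least squares with an intercept, via the normal equations."""
    design = [[1.0] + list(r) for r in rows]
    k = len(design[0])
    xtx = [[sum(d[i] * d[j] for d in design) for j in range(k)] for i in range(k)]
    xty = [sum(d[i] * t for d, t in zip(design, targets)) for i in range(k)]
    return solve_linear_system(xtx, xty)


def predict(coeffs, row):
    """Intercept plus the weighted sum of the feature values."""
    return coeffs[0] + sum(c * v for c, v in zip(coeffs[1:], row))


def mean_absolute_error(actual, predicted):
    return mean(abs(a - p) for a, p in zip(actual, predicted))


# Synthetic, made-up readings: (temperature in deg C, H2S in ppm).
features = [(20, 0.1), (22, 0.4), (25, 0.2), (27, 0.6),
            (30, 0.3), (32, 0.7), (35, 0.5)]
# Hypothetical target gas level that depends on BOTH features.
target = [2.0 + 0.5 * t + 8.0 * h for t, h in features]

uni_rows = [(t,) for t, _ in features]          # temperature only
uni = fit_least_squares(uni_rows, target)
multi = fit_least_squares(features, target)     # temperature and H2S

mae_uni = mean_absolute_error(target, [predict(uni, r) for r in uni_rows])
mae_multi = mean_absolute_error(target, [predict(multi, r) for r in features])
print(f"temperature only:  MAE = {mae_uni:.3f}")
print(f"temperature + H2S: MAE = {mae_multi:.3f}")
```

Because the synthetic target is an exact linear function of both features, the two-feature fit recovers it almost perfectly while the temperature-only fit cannot. Real sensor data would be noisy, so the improvement would be smaller and should be checked on held-out data.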
   Fig. 2. General steps in Air Quality Prediction

[5] used machine learning techniques to predict the concentration of SO2 in the environment. Sulfur
dioxide affects the skin and the mucous membranes of the eyes, nose, throat, and lungs. The system
employed time series models to predict SO2 in the environment. Time series forecasting models have
also been used for prediction of the Air Quality Index in metro cities.

In [6], SARIMAX and Holt-Winters models are used to predict the air quality index. These time series
forecasting models can be used to predict the values of the Air Quality Index (AQI) based on past
data. To analyze the performance of the models, Mean Absolute Percentage Error (MAPE) is used as the
score function. The Holt-Winters algorithm can handle seasonality, but the results produced by this
model are not very accurate. SARIMAX, on the other hand, can handle seasonality and is much more
accurate than the Holt-Winters model.

In [7], the authors compared decision tree, linear regression, and random forest models. Data on
major air pollutants and meteorological conditions were collected using the Arduino platform. Random
forest gives more accurate results because it reduces overfitting, which lowers the error; its
drawback is that it uses more memory and has a higher cost.

Haotian Jing & Yingchun Wang (2020) [8] predicted the air quality index using XGBoost. It combines
weak classifiers, each addressing the shortcomings of the previous one, to form a strong classifier,
thus reducing the error between predicted and actual values. The study uses K-fold cross-validation.
The mean absolute error and the coefficient of determination are computed to measure the difference
between actual and predicted values. The drawback is that it relies on previous values and is
affected by outliers caused by unwanted pollutants in the air.

In [9], the level of air pollutants at any given time was predicted with a recurrent neural network,
which removed the drawback of hourly prediction thanks to the memory of the algorithm, but the
approach does not work without memory operations.

[10] determined PM10 levels best with random forest. The model does not accurately predict the level
of dangerous pollutants, but it can work with incomplete data sets.

III. COMPARATIVE STUDY AND ANALYSIS

In this section we observe the performance of different machine learning models adopted by
researchers in their studies, as follows:

  Table I: Performance of Different ML Models

  ML Model                   SO2      CO     O3      NO2     PM2.5   PM10
  Linear Regression          0.125    0.02   0.09    0.1     0.02    0.02
  Decision Tree              0.8060   0.61   0.62    0.64    0.75    0.61
  Random Forest regression   0.856    0.79   0.79    0.701   0.86    0.79
  SARIMAX                    0.863    0.75   0.832   0.866   0.72    0.81

The above comparison was obtained by observing the results and performance parameters of the
mentioned algorithms in [6] and [7].

IV. CONCLUSION
In this paper, we have examined the need for machine learning and the huge role that it plays in the
prediction of air pollution. The papers discussed above proposed air pollution monitoring and
prediction systems using machine learning models, out of which a few predicted the pollution levels
accurately while
others did not. After going through these papers, we observed that SARIMAX was the best proposed
model, as it took seasonality into consideration, which helped in correctly predicting the air
quality. Such predictions can then alert potential users and also help in future predictions.

       REFERENCES

[1]    Iskandar, Adha, M. B., Hendrawan, & Edward, I. Y. M. (2019).
       Design and Implementation of Web Application on Air Pollution
       Monitoring System Using Wireless Sensor Network Based on
       HAPS. 2019 IEEE 5th International Conference on Wireless
       and Telematics (ICWT). doi:10.1109/icwt47785.2019.8978241
[2]    Ben-Aboud, Y., Ghogho, M., & Kobbane, A. (2020). A
       research-oriented low-cost air pollution monitoring IoT
       platform. 2020 International Wireless Communications and
       Mobile Computing (IWCMC).
       doi:10.1109/iwcmc48107.2020.9148176
[3]    Bashir Shaban, K., Kadri, A., & Rezk, E. (2016). Urban Air
       Pollution Monitoring System With Forecasting Models. IEEE
       Sensors Journal, 16(8), 2598–2606.
       doi:10.1109/jsen.2016.2514378
[4]    Okokpujie, Kennedy & Noma-Osaghae, Etinosa &
       Odusami, Modupe & John, Samuel & Oluwatosin, Oluga.
       (2018). A Smart Air Pollution Monitoring System. International
       Journal of Civil Engineering and Technology. 9. 799-809.
[5]    C R, Aditya & Deshmukh, Chandana & K, Nayana &
       Gandhi Vidyavastu, Praveen. (2018). Detection and
       Prediction of Air Pollution using Machine Learning Models.
       International Journal of Engineering Trends and Technology.
[6]    Arora, Himanshu & Solanki, Arun. (2020). Prediction of Air
       Quality Index in Metro Cities using Time Series Forecasting
       Models. Xi'an Jianzhu Keji Daxue Xuebao/Journal of Xi'an
       University of Architecture & Technology. 12. 3052-3067.
       10.37896/JXA 12.05/1721.
[7]    Venkat Rao Pasupuleti, Uhasri, Pavan Kalyan, Srikanth and
       Hari Kiran Reddy, Air Quality Prediction Of Data Log By
       Machine Learning, 2020, IEEE.
[8]    Haotian Jing & Yingchun Wang, Research on Urban Air
       Quality Prediction Based on Ensemble Learning of XGBoost,
       2020, E3S Web of Conferences.
[9]    Xiaosong Zhao, Rui Zhang, Jheng-Long Wu, Pei-Chann Chang
       and Yuan Ze University, A Deep Recurrent Neural Network for
       Air Quality Classification, 2018, Journal of Information Hiding
       and Multimedia Signal Processing.
[10]   Nicolás Mejía Martínez, Laura Melissa Montes, Ivan Mura and
       Juan Felipe Franco, Machine Learning Techniques for PM10
       Levels Forecast in Bogotá, 2018, IEEE.
[11]   Kennedy Okokpujie, Etinosa Noma-Osaghae, Odusami
       Modupe, Samuel John, and Oluga Oluwatosin, “A SMART AIR
       POLLUTION MONITORING SYSTEM,” International Journal
       of Civil Engineering and Technology (IJCIET), vol. 9, no. 9, pp.
       799–809, Sep. 2018.
[12]   C. Santos, J. A. Jiménez, and F. Espinosa, “Effect of Event-
       Based Sensing on IOT Node Power Efficiency. Case Study: Air
       Quality Monitoring in Smart Cities,” IEEE Access, vol. 7, pp.
       132577–132586, 2019.
[13]   D. Wei, “Predicting air pollution level in a specific city,” 2014.
[14]   Kostandina Veljanovska and Angel Dimoski, “Air Quality
       Index Prediction Using Simple Machine Learning Algorithms,”
       International Journal of Emerging Trends & Technology in
       Computer Science, vol. 7, no. 1, 2018.
[15]   D. Zhu, C. Cai, T. Yang, and X. Zhou, “A Machine Learning
       Approach for Air Quality Prediction: Model Regularization and
       Optimization,” Big Data and Cognitive Computing, vol. 2, no.
       1, p. 5, Mar. 2018.
[16]   A. Masih, “Machine learning algorithms in air quality
       modeling,” Global Journal of Environmental Science and
       Management, vol. 5, no. 4, pp. 515–534, 2019.
[17]   Aditya C R, Chandana R Deshmukh, Nayana D K, Praveen
       Gandhi Vidyavastu, Detection and Prediction of Air Pollution
       using Machine Learning Models (IJETT).
[18]   https://archive.ics.uci.edu/ml/datasets/Air+quality
[19]   David A. Freedman (2009). Statistical Models: Theory and
       Practice. Cambridge University Press. p. 26. A simple
       regression equation has on the right hand side an intercept and
       an explanatory variable with a slope coefficient.
[20]   Rokach, Lior; Maimon, O. (2008). Data mining with decision
       trees: theory and applications. World Scientific Pub Co Inc.
       ISBN 978-9812771711.
[21]   Breiman, L. (2001). "Random Forests". Machine Learning.
       45 (1): 5–32. doi:10.1023/A:1010933404324