Forest Fire Prediction for Tech Students
Prevention
Project Report Submitted in Partial Fulfilment of the Requirements for the Degree of
Bachelor of Technology
In
Computer Science
Submitted by:
Tanul Khare: (Roll No. 200102420)
Shubhangi Maurya: (Roll No. 200102409)
Sudhanshu Kumar: (Roll No. 210102524)
We declare that this written submission represents our ideas in our own words and where
others' ideas or words have been included, we have adequately cited and referenced the
original sources. We also declare that we have adhered to all principles of academic honesty
and integrity and have not misrepresented or fabricated or falsified any idea/data/fact/source
in our submission. We understand that any violation of the above will be cause for disciplinary
action by the University and can also evoke penal action from the sources which have thus
not been properly cited or from whom proper permission has not been taken when needed.
The plagiarism check report is attached at the end of this document.
TABLE OF CONTENTS
Forest Fire Prediction: Harnessing Data for Early Detection and Prevention
DECLARATION
LIST OF ABBREVIATIONS
CHAPTER – 1
Forest Fire Prediction
1.1 INTRODUCTION
1.2 APPLICATION
1.3 TECHNIQUES
1.4 MACHINE LEARNING
1.5 REGRESSION
1.6 ALGORITHMS
CHAPTER – 2
PROJECT ANALYSIS
2.1 LITERATURE REVIEW
2.2 PROBLEM STATEMENT
CHAPTER – 3
METHODOLOGY
CHAPTER – 4
RESULTS AND DISCUSSION
CHAPTER – 5
CONCLUSION AND FUTURE WORK
REFERENCES
LIST OF ABBREVIATIONS
RH Relative Humidity
WS Wind speed
FFMC Fine Fuel Moisture Code
DMC Duff Moisture Code
DC Drought Code
ISI Initial Spread Index
BUI Buildup Index
FWI Fire Weather Index
TEMP Temperature
CHAPTER – 1
FOREST FIRE PREDICTION
1.1 INTRODUCTION
A forest fire, marked by its rapid spread across wooded landscapes and consumption of
vegetation and combustible materials, may stem from natural phenomena like lightning
strikes or human activities such as campfires, discarded cigarettes, or deliberate acts of arson.
These fires pose grave risks to human settlements, wildlife habitats, ecosystems, and air
quality, potentially resulting in community displacement, property devastation, biodiversity
loss, and even loss of human lives.
Forecasting wildfires draws on a range of data sources and analytical methods. These
include landscape factors such as vegetation cover and slope steepness, alongside
meteorological factors such as temperature, humidity, and wind speed. Machine learning
algorithms trained on this data can improve prediction accuracy as more observations
become available.
The primary objective is to integrate decision support systems and operational workflows
with predictive insights to enable proactive mitigation of fire risks. Early warning systems
and real-time monitoring play vital roles in alerting stakeholders to new threats.
1.2 APPLICATION
Forest fire prediction analysis finds its application in various sectors such as:
1.3 TECHNIQUES
Data science is the process of applying methods like statistical analysis, machine learning,
and data mining to extract important insights and knowledge from large and complex
databases. In order to identify patterns, trends, and correlations that may influence decision-
making and inspire innovation across a range of industries, it involves collecting, organizing,
processing, and evaluating data. Organizations may apply data science to solve problems,
predict future events, and optimize operations by transforming raw data into valuable
knowledge.
(Fig 1.1) Hierarchy showing the different fields of Data Science [1]
MODELS
Linear Regression
Logistic Regression
Elastic Net Regression
Decision Tree
SVM (Support Vector Machine) Algorithm
KNN (K-Nearest Neighbours) Algorithm
Random Forest Algorithm
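1.4 MACHINE LEARNING
Machine learning is a branch of artificial intelligence in which systems learn from data and
improve their performance over time through experience. There are various types, including
supervised learning, unsupervised learning, and reinforcement learning, each with its own
techniques and applications. [2]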
SUPERVISED LEARNING:
Supervised learning is the subcategory of machine learning that focuses on learning a
classification or regression model from labelled training data (i.e., inputs that also
contain the desired outputs or targets; basically, “examples” of what we want to predict).
2. Regression: A supervised learning task in which the output is a continuous value. The
goal is to predict a value as close to the actual output as the model can; the model is then
evaluated by calculating an error value. The smaller the error, the greater the accuracy of
the regression model.
UNSUPERVISED LEARNING:
Unsupervised learning is the training of a machine using information that is not labelled,
allowing the algorithm to act on that information without guidance. Here the task of the
machine is to group information according to similarities, patterns and differences without
any prior training on the data.
Unlike supervised learning, no teacher is provided, which means no labelled examples are
given to the machine. Therefore, the machine must find the hidden structure in unlabelled
data by itself.
Reinforcement learning addresses the question of how a system that senses and acts in its
environment can learn to choose optimal actions to achieve its goals. This very generic
problem covers tasks such as learning to control a mobile robot, learning to optimize
operations in factories, and learning to play board games. Each time the system performs an
action in its environment, a trainer may provide a reward or penalty to indicate the
desirability of the resulting state. The task of the agent is to learn to choose sequences of
actions that produce the greatest cumulative reward.
1.5 REGRESSION
Regression is a valuable and widely used tool in data science and machine learning. It
lets us explore and predict the connections between multiple factors. In simpler terms,
regression allows us to uncover a mathematical equation that links one factor (the
dependent variable) with one or more other factors (the independent variables).
1.5.1 IMPORTANCE OF REGRESSION ANALYSIS
1. Prediction: Let's say you have information about the price of houses in a
neighbourhood and want to know the price of a new house. Regression helps you
make a prediction based on the features of the new house, such as its size, number of
rooms, and location.
2. Understanding Relationships: Regression helps us understand how different factors
influence each other. For example, you might want to know how the amount of time
spent studying affects exam scores. Regression can show you if there is a strong
relationship between study time and scores.
3. Identifying Important Factors: In complex situations with many variables,
regression helps us figure out which factors have a significant impact on the outcome.
It helps us separate the essential factors from the ones that don't matter much.
4. Decision Making: Organizations and businesses use regression to make informed
decisions. For instance, a company might use regression to predict customer demand
for a product, helping them plan their production and inventory efficiently.
Regression models work by estimating the relationship between one or more independent
variables, also known as predictors, and a dependent variable, the variable we wish to
forecast. The magnitude and direction of each independent variable's influence on the
dependent variable are represented by the coefficients that the model estimates for each
variable. [3]
The model can be trained on historical data and then, by applying the learnt coefficients to
the values of the independent variables, be used to predict the values of the dependent
variable for new or unseen data. The accuracy of the predictions depends on how well the
model captures the underlying relationship between the variables and on how representative
the training data is of the population to which the model is intended to generalize.
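The train-then-predict workflow described above can be illustrated with a minimal sketch of ordinary least squares for a single predictor. The function names and the temperature and fire-index values are hypothetical and chosen only for illustration; they are not taken from the project dataset.

```python
from statistics import mean

def fit_simple_linear(xs, ys):
    """Estimate (intercept, slope) of y = intercept + slope * x by least squares."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return intercept, slope

def predict(intercept, slope, x):
    """Apply the learnt coefficients to a new value of the predictor."""
    return intercept + slope * x

# Hypothetical historical data: temperature (deg C) and an illustrative fire index.
temps = [20.0, 25.0, 30.0, 35.0]
index = [2.0, 4.0, 6.0, 8.0]

intercept, slope = fit_simple_linear(temps, index)
new_value = predict(intercept, slope, 28.0)  # prediction for an unseen temperature
print(intercept, slope, new_value)
```

The slope plays the role of the coefficient discussed above: its sign gives the direction of the influence and its size gives the magnitude.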
1.6 ALGORITHMS
Linear regression is a statistical technique that fits a linear equation to the observed
data to represent the relationship between one or more independent variables (predictors)
and a dependent variable (outcome). In statistics and machine learning, it is among the
most straightforward and popular regression approaches.
Prediction: Linear regression is commonly used for making predictions based on the
relationship between independent and dependent variables. For example, predicting
house prices based on features such as square footage, number of bedrooms, and
location.
Inference: Linear regression can also be used for inference, where the goal is to
understand the relationship between variables and interpret the coefficients. For
example, determining the effect of education level on income.
Feature Selection: Lasso performs feature selection efficiently, identifying the most
significant predictors while discarding superfluous or irrelevant variables by
shrinking some coefficients to exactly zero.
Lasso regression gets into trouble when the number of predictors is greater than
the number of observations.
If there are two or more highly collinear variables, lasso regression will
select one of them essentially at random, which is undesirable for data
interpretation.
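The feature-selection behaviour of lasso can be sketched with a small coordinate-descent implementation. This is an illustrative sketch, not the implementation used in the project: it assumes the objective (1/2n) * sum of squared residuals + lam * sum of absolute coefficients, centred data with no intercept, and a hypothetical four-row dataset in which the second feature has only a weak effect.

```python
def soft_threshold(rho, lam):
    """Shrink rho towards zero by lam; values within [-lam, lam] become exactly 0."""
    if rho > lam:
        return rho - lam
    if rho < -lam:
        return rho + lam
    return 0.0

def lasso_coordinate_descent(X, y, lam, n_iter=100):
    """Minimise (1/2n) * sum((y - Xw)^2) + lam * sum(|w|) one coefficient at a time."""
    n, p = len(X), len(X[0])
    w = [0.0] * p
    for _ in range(n_iter):
        for j in range(p):
            rho = 0.0  # correlation of feature j with the partial residual
            z = 0.0    # sum of squares of feature j
            for i in range(n):
                pred_without_j = sum(X[i][k] * w[k] for k in range(p) if k != j)
                rho += X[i][j] * (y[i] - pred_without_j)
                z += X[i][j] ** 2
            w[j] = soft_threshold(rho / n, lam) / (z / n)
    return w

# Hypothetical centred data: y = 2 * x1 + 0.1 * x2 (x2 is nearly irrelevant).
X = [[-1.0, 1.0], [-1.0, -1.0], [1.0, -1.0], [1.0, 1.0]]
y = [-1.9, -2.1, 1.9, 2.1]

w = lasso_coordinate_descent(X, y, lam=0.5)
print(w)  # the weak second coefficient is driven to exactly zero
```

The soft-thresholding step is what sets small coefficients to exactly zero, which is why lasso acts as a feature selector.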
The regularization parameter (λ) regulates the degree to which the coefficients are
penalized.
Parameter Estimation:
By reducing the variance of the coefficients, ridge regression provides estimates of
the coefficients that tend to be more reliable, especially if there are more variables
than observations.
Difficulty in Interpretation:
Because the coefficients in ridge regression are decreased towards zero and might not
accurately represent the significance of the predictors, interpreting the results can
become more difficult.
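The shrinking effect of the regularization parameter can be seen in a minimal sketch for a single centred predictor, where the ridge estimate has the closed form sum(x*y) / (sum(x*x) + lam). The data values are hypothetical, and the no-intercept, single-feature setting is an assumption made to keep the example short.

```python
def ridge_slope(xs, ys, lam):
    """Slope w minimising sum((y - w*x)^2) + lam * w^2 for centred data."""
    sxy = sum(x * y for x, y in zip(xs, ys))
    sxx = sum(x * x for x in xs)
    return sxy / (sxx + lam)

# Hypothetical centred data following y = 2 * x exactly.
xs = [-2.0, -1.0, 0.0, 1.0, 2.0]
ys = [-4.0, -2.0, 0.0, 2.0, 4.0]

# A larger lam pulls the coefficient closer to zero without making it exactly zero.
slopes = {lam: ridge_slope(xs, ys, lam) for lam in (0.0, 1.0, 10.0, 100.0)}
for lam, w in slopes.items():
    print(lam, w)
```

With lam = 0 the estimate equals the ordinary least-squares slope. As lam grows the coefficient shrinks but stays non-zero, which is why ridge coefficients are harder to read as measures of importance.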
Independent Variables (X): As in linear regression, these are the predictor variables that are
used to explain the variation in the dependent variable.
Regularization Parameters (α and λ): Two hyperparameters, α and λ, are introduced by
Elastic Net regression to regulate the regularization:
Multicollinearity Handling:
By selecting and shrinking sets of correlated variables together, Elastic Net regression
effectively handles multicollinearity (high correlation) among predictor variables.
Computational Overhead:
Elastic Net regression requires solving a more complicated optimization problem
than Lasso or Ridge regression, which can lead to greater computing cost, particularly
for large datasets.
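How the two hyperparameters interact can be sketched by computing the Elastic Net penalty term directly. The parameterisation below is an assumption, since the definitions of α and λ are not given in the text: λ is taken as the overall penalty strength and α as a mixing weight between 0 and 1, a convention used by several libraries. The coefficient values are hypothetical.

```python
def elastic_net_penalty(coefs, lam, alpha):
    """Penalty lam * (alpha * L1 + (1 - alpha) / 2 * L2) under the assumed convention.

    alpha = 1 gives the lasso (L1) penalty; alpha = 0 gives a ridge (L2) penalty.
    """
    l1 = sum(abs(w) for w in coefs)
    l2 = sum(w * w for w in coefs)
    return lam * (alpha * l1 + (1.0 - alpha) * 0.5 * l2)

coefs = [1.5, -0.5, 0.0]  # hypothetical coefficient vector

lasso_like = elastic_net_penalty(coefs, lam=0.1, alpha=1.0)
ridge_like = elastic_net_penalty(coefs, lam=0.1, alpha=0.0)
mixed = elastic_net_penalty(coefs, lam=0.1, alpha=0.5)
print(lasso_like, ridge_like, mixed)
```

Other references swap the roles of the two symbols, so the convention should be checked against whichever library is used.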
CHAPTER – 2
PROJECT ANALYSIS
2.1 LITERATURE REVIEW
2.1.1 Forest Fire Prediction Using Machine Learning, 2023
Authors: Virupaksha Gouda R, Anoop R, Joshi Sameerna, Arif Basha, Sahana Gali
The existing systems use various technologies such as machine learning, artificial
intelligence and wireless networks to collect weather data continuously, 24 hours a day,
which gives a better chance of accurately reflecting the state of the forest environment.
Based on these systems, forest guards can identify the days with the highest likelihood of
forest fire and danger, and pay special attention to fire prevention on those days. [5]
2.1.2 Forest Fires Detection Using Machine Learning Techniques, 2020
Authors: Amira A. Elsonbaty, Ahmed M. Elshewey
This paper predicts forest fire-prone areas using machine learning regression
techniques. The dataset used in the paper comes from the UCI machine learning
repository and consists of climate and physical factors of the Montesinho park in
Portugal. The research applies several machine learning approaches (linear
regression, ridge regression and lasso regression) to a dataset of 517 entries with
13 features each. Linear regression achieves higher accuracy than the ridge regression
and lasso regression algorithms. [6]
2.1.3 Forest Fire Prediction
Author: Saurab Bhattarai
Different ML algorithms have been applied to forest fire prediction, including decision trees,
random forests, support vector machines (SVM), artificial neural networks (ANN), and
ensemble methods. A comparative study conducted by Di Tommaso et al. (2018) compared
the performance of various algorithms and found that random forests and SVM achieved high
accuracy in predicting forest fire occurrence. [7]
2.1.4 Evaluation of Random Forest model for forest fire prediction based on climatology
over Borneo, 2019
Authors: E. Ayu Shabrina, Intan N. Wahyuni, Rifika Sadikin, Arninda L. Latifah
Forest fires are influenced by human activities, ecosystem and climate processes, but in
Borneo only the climate variables can be quantified. The research objective is to assess the
effectiveness of the random forest model in predicting forest fires using satellite data of
burned areas and climate variables as input. Prediction of forest fires is expected to reduce
the influence of forest fires in the future. Through an analysis of annual and spatial
variability, it was found that the random forest model, incorporating all selected climate
variables, effectively represents forest fire events across the Borneo region of Indonesia. [8]
2.1.5 A Brief Review of Machine Learning Algorithms in Forest Fires Science, 2023
Authors: Ramez Alkhatib, Wahib Sahwan, Anas Alkhatieb and Brigitta Schütt
As forest fires become more frequent globally, early prediction is crucial. Artificial
intelligence, particularly machine learning, is vital for forecasting and assessing fire risk. This
article reviews machine learning methods used for forest fire prediction, aiming to identify
research gaps and recent advancements. Selecting the best model is challenging due to
algorithm variations, but tailoring methods to specific forest characteristics enhances
predictive accuracy. [9]
2.1.6 A Survey of Machine Learning Algorithms Based Forest Fires Prediction and
Detection Systems, 2020
Author: Faroudja Abid
Forest fires pose significant environmental threats, annually consuming millions of hectares
worldwide, leading to economic, ecological, and human losses. Predicting and detecting these
fires is crucial for mitigation. Emerging technologies, including artificial intelligence, are
increasingly integrated into fire prediction and detection systems to automate processes. This
paper conducts a thorough survey of machine learning-based algorithms utilized in forest fire
prediction and detection systems. It introduces the forest fire issue, reviews various prediction
and detection methods, and discusses studies evaluating factors influencing fire occurrence
and risk. The paper presents and discusses key findings and challenges from each study. [10]
2.1.7 Role of Machine Learning Algorithms in Forest Fire Management, 2021
Authors: Muhammad Arif, Khloud K Alghamdi, Salma A Sahel, Samar O Alosaimi,
Mashael E Alsahaft, Maram A Alharthi and Maryam Arif
Given the rising global concern over forest fires amid climate change, accurate prediction and
management are imperative. This paper aims to summarize recent advancements in forest fire
prediction, detection, spread rate estimation, and burned area mapping. Additionally, it
highlights the risks posed by smoke emissions to public health and ecosystems. By leveraging
machine learning algorithms, this review explores opportunities to enhance forest fire
management decision-making, ultimately contributing to cost savings and environmental
health improvement. [11]
2.1.8 Forest Fire Prediction Using Machine Learning Techniques, 2021
Authors: T Preeti, Suvarna Kanakaraddi, Aishwarya Beelagi, Sumalata Malagi, Aishwarya
Sudi
Forest fire prediction is crucial for control due to its environmental impact. Detection
algorithms, often leveraging satellite imagery, are pivotal. This study proposes a system
utilizing meteorological parameters for prediction, employing Random Forest Regression
with Hyperparameter tuning for accuracy enhancement. Comparative analysis encompasses
Decision Trees, Random Forests, Support Vector Machines, and Artificial Neural Networks.
Hyperparameter tuning produces promising outcomes, with MAE at 0.03, MSE at 0.004, and
RMSE at 0.07. [12]
2.1.9 Forest Fires Detection Using Machine Learning Techniques, 2020
Authors: Ahmed M. Elshewey, Amira A. Elsonbaty
Presently, forest fires stand as a significant global concern, leading to the investigation of
machine learning regression techniques for forecasting fire-prone regions. This research
employs a dataset sourced from the UCI machine learning repository, containing climate and
physical factors data from Montesinho park in Portugal. Three regression algorithms (linear
regression, ridge regression, and lasso regression) are applied to a dataset comprising 517
entries with 13 features per row. The dataset is examined in two variations: one encompassing
all features and another with 70% of the features. Training involves 70% of the dataset, with
the remaining 30% allocated for testing. Results reveal that linear regression exhibits superior
accuracy compared to ridge regression and lasso regression algorithms. [13]
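A 70/30 split of the kind described in this study can be sketched as follows. This is a generic illustration rather than the cited authors' code: the shuffling, the fixed seed and the function name are assumptions, and integer row identifiers stand in for the 517 dataset entries.

```python
import random

def train_test_split(rows, train_fraction=0.7, seed=42):
    """Shuffle the rows reproducibly, then cut them into training and test sets."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    cut = int(round(len(shuffled) * train_fraction))
    return shuffled[:cut], shuffled[cut:]

rows = list(range(517))  # placeholder identifiers for the 517 entries
train, test = train_test_split(rows)
print(len(train), len(test))
```

Shuffling before the cut avoids a split that follows the original row order, and the fixed seed makes the split repeatable across runs.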
2.1.10 Predicting wildfires in Algerian forests using machine learning models, 2023
Author: Abdelhamid Zaidi
Algeria faces significant wildfire challenges with lasting impacts. Early detection is crucial,
but limited datasets hinder prediction methods. Using recent data from Bejaia and Sidi Bel-
Abbes in 2012, principal component analysis reduced variables to six while retaining 96.65%
variance. An artificial neural network (ANN) outperformed other classifiers in accuracy,
precision, and recall, achieving 0.967 ± 0.026 accuracy and 0.971 ± 0.023 F1-score. Feature
importance analysis highlighted RH, DC, and ISI as significant predictors in the ANN model.
[14]
2.2 PROBLEM STATEMENT
We focus on predicting the Fire Weather Index (FWI) for Algeria's Bejaia and Sidi
Bel-Abbes regions. FWI, which is crucial for assessing fire risk, relies on meteorological
factors. Our goal is to build a regression model that captures how weather conditions
(temperature, humidity, wind speed, rainfall) and FWI components (FFMC, DMC, DC, ISI, BUI)
influence FWI values. This model will aid proactive fire hazard assessment and prevention
strategies for these areas.
We will train regression models on historical data from June to September 2012, comprising
meteorological information and FWI values. These models will then forecast FWI for
upcoming days, given the anticipated weather conditions.
Our ultimate objective is to develop precise models for forecasting the Fire Weather
Index, in support of fire management and prevention planning in these regions of
Algeria.
CHAPTER – 3
METHODOLOGY
Data collection and Preprocessing:
Gather a diverse and representative dataset that covers a wide range of relative humidity,
temperature, fire index and wind speed conditions.
Preprocess the data to ensure consistent data quality: perform feature selection and remove
null values.
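The preprocessing steps above can be sketched as follows. The records, the `day` field and the function names are hypothetical; the column names follow the list of abbreviations. The z-score scaling in the last step is an assumption about what standardizing the data involves and is not stated in the text.

```python
from statistics import mean, pstdev

def drop_incomplete(rows):
    """Remove records that contain a missing (None) value in any field."""
    return [row for row in rows if all(v is not None for v in row.values())]

def select_features(rows, columns):
    """Keep only the chosen columns in each record."""
    return [{col: row[col] for col in columns} for row in rows]

def standardize(rows, columns):
    """Rescale each column to zero mean and unit variance (assumed z-score scaling)."""
    scaled = [dict(row) for row in rows]
    for col in columns:
        values = [row[col] for row in rows]
        mu, sigma = mean(values), pstdev(values)
        for row in scaled:
            row[col] = (row[col] - mu) / sigma if sigma > 0 else 0.0
    return scaled

# Hypothetical raw records; the second one has a missing humidity reading.
raw = [
    {"day": 1, "TEMP": 29.0, "RH": 57.0, "WS": 18.0},
    {"day": 2, "TEMP": 31.0, "RH": None, "WS": 13.0},
    {"day": 3, "TEMP": 33.0, "RH": 45.0, "WS": 14.0},
    {"day": 4, "TEMP": 31.0, "RH": 51.0, "WS": 16.0},
]
features = ["TEMP", "RH", "WS"]

clean = select_features(drop_incomplete(raw), features)
scaled = standardize(clean, features)
print(len(raw), len(clean))
```

Putting the features on a common scale matters most for penalised models such as ridge and lasso, where the penalty treats all coefficients alike.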
CHAPTER – 4
RESULTS AND DISCUSSION
Our proposed solution was implemented using various regression models in order to determine
which model gives the most accurate results. Features such as temperature, humidity and
wind speed are taken into account to obtain more accurate results related to forest fire spread.
Case 3: Linear Regression Model
Case 5: Ridge Regression model
Case 7: Fire Analysis of region1
CHAPTER – 5
CONCLUSION AND FUTURE WORK
5.1 Conclusion
In both instances, the R2 Score reflects a commendable level of accuracy, indicating that
both models adeptly explain the variance in the data and yield precise predictions.
Upon examining the Mean Absolute Error (MAE), Linear Regression exhibits a marginally
lower value (0.482) compared to Ridge Regression (0.498).
Despite the slight advantage of Linear Regression in terms of prediction accuracy as
indicated by the R2 Score and MAE, we opt for utilizing the Ridge Regression model due
to its efficacy in addressing overfitting concerns.
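For reference, the two metrics used in this comparison can be computed as in the sketch below. The target and prediction values are hypothetical and are unrelated to the MAE figures reported above.

```python
def mean_absolute_error(y_true, y_pred):
    """Average absolute difference between actual and predicted values."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)

def r2_score(y_true, y_pred):
    """Share of the variance in y_true explained by the predictions."""
    y_bar = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - y_bar) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot

# Hypothetical actual and predicted values.
y_true = [1.0, 2.0, 3.0, 4.0]
y_pred = [1.5, 2.0, 2.5, 4.0]

mae = mean_absolute_error(y_true, y_pred)
r2 = r2_score(y_true, y_pred)
print(mae, r2)
```

A lower MAE and an R2 score closer to 1 both indicate a better fit, so the two metrics are usually read together when comparing models.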
5.2 Future Work
While strides have been made in forest fire prediction and management, there remains
ample opportunity for advancement. Future endeavours may centre on enhancing predictive
accuracy, embracing emerging technologies, and nurturing interdisciplinary collaborations
to tackle the multifaceted challenges presented by forest fires.
Integration of Real-Time Sensor Data:
By utilizing real-time sensor data from IoT devices, remote sensors, and weather
stations, environmental conditions can be continuously tracked. Early detection of fire
outbreaks may also be made possible, enabling timely mitigation and control
measures.
Designing a website: Creating a comprehensive website can give users easy access
to reliable data and timely information regarding forest fire outbreaks and
prevention.
REFERENCES
[1] “Data Science vs ML,AI,DL. Differences and Why It Matters | LinkedIn.” Accessed: May 05,
2024. [Online]. Available:
https://www.linkedin.com/pulse/data-science-vs-mlaidl-differences-why-matters-sai-nithya-akuthota/
[2] S. Raschka, “STAT 451: Introduction to Machine Learning Lecture Notes,” 2020, Accessed: May 05,
2024. [Online]. Available: http://stat.wisc.edu/~sraschka/teaching/stat451-fs2020/
[3] “A Refresher on Regression Analysis.” Accessed: May 05, 2024. [Online]. Available:
https://hbr.org/2015/11/a-refresher-on-regression-analysis
[4] “Introduction To lasso Regression, Effects And Its Limitations - Pianalytix - Build Real-World Tech
Projects.” Accessed: May 05, 2024. [Online]. Available:
https://pianalytix.com/introduction-to-lasso-regression-effects-and-limitations-of-lasso-regression/
[5] V. G. R, A. R, J. Sameerna, A. Basha, and S. Gali, “Forest Fire Prediction Using Machine Learning,”
Int J Res Appl Sci Eng Technol, vol. 11, no. 5, pp. 792–797, May 2023, doi:
10.22214/IJRASET.2023.51496.
[6] “(PDF) Forest Fires Detection Using Machine Learning Techniques.” Accessed: May 05, 2024.
[Online]. Available:
https://www.researchgate.net/publication/344462171_Forest_Fires_Detection_Using_Machine_Learning_Techniques
[7] “(PDF) ‘Forest Fire Prediction’ Submitted by Saurab Bhattarai.” Accessed: May 05, 2024. [Online].
Available:
https://www.researchgate.net/publication/371640298_Forest_Fire_Prediction_Submitted_by_Saurab_Bhattarai
[8] A. L. Latifah, A. Shabrina, I. N. Wahyuni, and R. Sadikin, “Evaluation of Random Forest model for
forest fire prediction based on climatology over Borneo,” 2019 International Conference on
Computer, Control, Informatics and its Applications: Emerging Trends in Big Data and Artificial
Intelligence, IC3INA 2019, pp. 4–8, Oct. 2019, doi: 10.1109/IC3INA48034.2019.8949588.
[9] R. Alkhatib, W. Sahwan, A. Alkhatieb, and B. Schütt, “A Brief Review of Machine Learning
Algorithms in Forest Fires Science,” Applied Sciences 2023, Vol. 13, Page 8275, vol. 13, no. 14, p.
8275, Jul. 2023, doi: 10.3390/APP13148275.
[10] F. Abid, “A Survey of Machine Learning Algorithms Based Forest Fires Prediction and Detection
Systems,” Fire Technol, vol. 57, no. 2, pp. 559–590, Mar. 2021, doi:
10.1007/S10694-020-01056-Z/METRICS.
[11] A. Muhammad et al., “Role of Machine Learning Algorithms in Forest Fire Management: A Literature
Review,” Journal of Robotics and Automation, vol. 5, no. 1, Feb. 2021, doi: 10.36959/673/372.
[12] “Forest Fire Prediction Using Machine Learning Techniques | Request PDF.” Accessed: May 05,
2024. [Online]. Available:
https://www.researchgate.net/publication/371651902_Forest_Fire_Prediction_Using_Machine_Learning_Techniques
[13] “(PDF) Forest Fires Detection Using Machine Learning Techniques.” Accessed: May 05, 2024.
[Online]. Available:
https://www.researchgate.net/publication/344462171_Forest_Fires_Detection_Using_Machine_Learning_Techniques
[14] A. Zaidi, “Predicting wildfires in Algerian forests using machine learning models,” Heliyon, vol. 9,
no. 7, p. e18064, Jul. 2023, doi: 10.1016/J.HELIYON.2023.E18064.