
Deep Learning for Wildfire Detection: Optimizing YOLOv11s for High-Precision Fire Identification

Naveen Jayaraj, Madhav K Mohan, Aswin S
DataBase Management System, SRM Institute Of Science And Technology, Tiruchirappalli, India
naveenpainthouse@gmail.com, madhavkmohan@gmail.com, faswin746@gmail.com

Abstract: Wildfires are serious ecological disasters that cause widespread destruction, affecting natural habitats, human settlements, and air quality. Early detection is crucial to minimizing damage and enabling timely intervention. This research presents a wildfire detection model utilizing YOLOv11s, a lightweight yet highly efficient deep learning architecture optimized for real-time object detection. The model is trained on a modified Roboflow dataset, cleaned of unwanted information and enhanced with custom preprocessing techniques to improve detection accuracy under various environmental conditions. Experimental results demonstrate that our approach achieves 89.04% accuracy, outperforming existing models while maintaining a low false-positive rate. Our system ensures high precision while using as little processing power as possible, and it is designed to be deployed on drones, satellites, surveillance cameras, and other UAV platforms for real-time wildfire monitoring. These findings highlight the potential of YOLOv11s in revolutionizing wildfire detection and prevention strategies, contributing to more effective, quick, and precise disaster management for environmental protection.

I. INTRODUCTION

Wildfires have become one of the most destructive natural disasters, resulting in significant environmental, economic, and human losses. They pose a major global challenge with serious implications for ecosystems, human habitation, and air quality. Over the past few decades, the increasing frequency of wildfires, driven by climate change, deforestation, and other human activities, has emphasized the need for timely and accurate detection systems.

The effects of wildfires go beyond the immediate environmental damage, such as loss of biodiversity and carbon emissions; they also encompass economic losses and severe health threats from airborne pollutants. Precise wildfire detection is vital to contain these effects and improve emergency response processes. Traditional detection methods, such as satellite tracking, heat sensors, ground monitoring, lookout towers, and visual observation, have limitations, including reporting delays, restricted coverage, and false alarms caused by atmospheric conditions like haze and fog.

The advance of deep learning and artificial intelligence has been one of the most promising avenues for improving the efficacy of wildfire detection and response systems. There is a pressing need to create an efficient, accurate, and automatic wildfire detection system for proper disaster management, and advances in deep learning and object detection hold tremendous potential for resolving this problem.

YOLO, or You Only Look Once, is a state-of-the-art deep learning technique for real-time object detection and classification. It has been popularized for its speed, flexibility, and accuracy. This paper proposes an improved wildfire detection model based on YOLOv11s that includes recently developed architectural improvements such as the C3K2 block, the SPPF module, and the C2PSA block. These developments enhance detection performance, feature extraction, and computational efficiency, making the architecture a strong choice for detecting wildfire smoke. The proposed system overcomes the limitations of conventional approaches through the use of high-quality datasets, state-of-the-art data augmentation, and optimized training processes. By integrating these deep learning strengths, this study seeks to contribute to a more efficient wildfire detection system, ultimately reducing the environmental, economic, and human impacts of these destructive disasters.

The importance of this study lies in its potential to transform wildfire detection and management. Through the combination of advanced technologies, the proposed system offers:

Real-Time Detection: The YOLOv11s model provides real-time detection of wildfires, enabling quicker response times and less damage.

Enhanced Accuracy: Architectural improvements such as the C3K2 block and the C2PSA block enhance the model's capacity to identify fire and smoke against intricate backgrounds, minimizing false positives.

Scalability: The system can be deployed on multiple platforms, such as drones, satellites, and ground cameras, giving broad coverage.

Environmental Impact: It can greatly reduce the environmental harm caused by wildfires, such as carbon emissions, soil loss, and biodiversity loss.

Economic and Social Benefits: By preventing extensive wildfires, the system can avert billions of dollars in economic losses and safeguard populations from displacement and health hazards.

1.1 Challenges and Future Directions

Despite promising performance, some challenges remain. Environmental factors such as clouds, haze, and fog continue to affect detection accuracy. Furthermore, the dynamics of wildfire behaviour and fluctuating environmental conditions present persistent difficulties for generalizing the model. Future studies will concentrate on:

• Enhancing the model's ability to handle diverse environmental conditions.
• Incorporating multisensor data fusion for more robust detection.
• Expanding the dataset to include more diverse fire scenarios and geographic regions.

In conclusion, the paper provides an extensive method for wildfire detection utilizing the YOLOv11s model and showcases its ability to transform disaster management and response. The results emphasize the need to incorporate advanced technologies in addressing the increasing threat of wildfires in an era of climate change and environmental degradation.

This paper introduces an improved model for wildfire smoke detection with YOLOv11s, integrating new architectural components such as the C3K2 block, the SPPF module, and the C2PSA block. These innovations improve detection accuracy, enhance feature extraction, and optimize computational efficiency, making the model an extremely effective tool for wildfire smoke detection.

II. RELATED WORKS

2.1 An Improved Wildfire Smoke Detection Based on YOLOv8 and UAV Images (Saydirasulovich, Mukhiddinov, Djuraev, Abdusalomov, & Cho, 2023)

This study presents an enhanced model for detecting wildfire smoke using YOLOv8, incorporating improvements to boost accuracy and efficiency. The method includes Wise-IoU (WIoU) v3 for bounding box regression, Ghost Shuffle Convolution (GSConv) to reduce model parameters and accelerate convergence, and the BiFormer attention mechanism to focus on crucial features while ignoring irrelevant background noise. The dataset comprises 3,200 images of forest fire smoke and 2,800 images without smoke, totaling 6,000 images sourced from Google, Kaggle, Flickr, and Bing. The model achieved an average precision (AP) of 79.4%, a notable 3.3% improvement over the baseline, with robust performance for small (APS of 71.3%) and large (APL of 92.6%) smoke areas. However, there are drawbacks, such as the model's sensitivity to atmospheric conditions like fog, haze, and clouds, which can mimic smoke and lead to false positives. Additionally, the dynamic nature of smoke plumes and the complexity of the forest environment pose challenges to precise detection. The study highlights the need for further advancements in distinguishing smoke from similar atmospheric conditions to improve detection accuracy.

2.2 Wildfire Detection from Multisensor Satellite Imagery Using Deep Semantic Segmentation (Rashkovetsky, Mauracher, Langer, & Schmitt, 2021)

In this study, a workflow using deep learning to detect fire-affected areas from multisensor satellite imagery is proposed. To achieve deep semantic segmentation, the method applies a U-Net-based convolutional neural network to data from four satellite instruments: Sentinel-1 C-SAR, Sentinel-2 MSI, Sentinel-3 SLSTR, and MODIS on board the Terra and Aqua satellites. The different satellite datasets provide different spectral bands (visible, infrared, and microwave), which were exploited to achieve more robust and accurate fire detection. Moreover, by fusing data from these instruments, the model could detect wildfires more accurately across different operational environments, including clear and cloudy conditions.

The dataset used in this work is the imagery of the above satellite instruments, augmented with reference data from the California Fire Perimeter Database, which includes wildfire perimeters in California from 1950 to 2019. The multispectral and multisensor quality of the dataset enhances the detection capabilities by exploiting the advantages of each satellite's unique features.

The U-Net models trained in this work exhibit significant success, with substantial accuracy achieved, although the accuracy differs from one condition and satellite combination to another. The Sentinel-2/Sentinel-3 fusion gave the best detection rate under clear skies, while the Sentinel-1/Sentinel-2 fusion performed best in cloudy weather. The single-instrument model using only Sentinel-2 achieves a precision of 0.83 and a recall of 0.92 under clear conditions, dominating the other models. Such advances notwithstanding, several limitations and drawbacks were observed. Most importantly, clouds and smoke can obscure the fire signal and lead to false negatives. The imbalanced classification problem, with fire-affected pixels being far fewer than unaffected ones, complicates fire segmentation. Wildfires are dynamic and complex phenomena involving many interacting factors such as weather, topography, and vegetation, making detection an even more complex process. Also, the performance of the model is heavily dependent on the availability and quality of training data, so complete and high-quality datasets are crucial.

In conclusion, while the multisensor approach and deep learning methods show great promise in improving wildfire detection, it remains imperative to address the issues surrounding atmospheric conditions and data imbalance to further enhance detection accuracy and robustness.

2.3 Forest Fire Detection and Notification Method Based on AI and IoT Approaches (Avazov, Hyun, S, Abdusalomov, & Cho, 2023)

This study proposes a new AI- and IoT-based method for forest fire detection and notification. The method uses a combination of an MQ-2 smoke detection sensor and a YOLOv5 convolutional neural network for real-time fire detection. The MQ-2 sensor detects smoke and flammable gases and, once smoke is detected, sends an alert that triggers the camera to take pictures, which are then analyzed by YOLOv5 to detect fires. The system also sends notifications to alert the fire department and notify all persons within a mile of any confirmed fire to be on the lookout. The dataset for training and testing the YOLOv5 model includes images from the Robmarkcole and Glenn-Jocher databases as well as video footage of various fire scenarios collected from YouTube. The dataset consists of 75% training and 25% test images, for a total of 3,120 images. The model attained high accuracy, with confidence levels above 80%, after retraining the YOLOv5x model for 10 epochs. The study, however, does mention some drawbacks. Small regions of fire could affect the performance of the setup, leading to lower confidence levels in certain situations. Bright lights or red shirts may also cause false fire detections. Despite these shortcomings, the proposed system seems promising for enhancing early fire detection and reducing wildfire incidents by integrating AI- and IoT-based technologies. Further improvements, such as training on larger datasets, should address these challenges and improve detection accuracy.

III. PROPOSED SYSTEM

3.1 System:

This section provides an elaborate description of the hyperparameter settings, test dataset, experimental configuration, and validation process employed to measure the effectiveness of the improved YOLOv11s model in identifying wildfires from CCTV, satellite, or UAV sources. All experiments were conducted under consistent hardware conditions to ensure the reliability of the proposed methodology. The experiments were carried out on Google Colaboratory with the following specifications: an Intel Xeon CPU with 2 vCPUs (virtual CPUs) and 13 GB of RAM, and an NVIDIA Tesla K80 GPU with 12 GB of VRAM. We obtained our data from both Roboflow and Kaggle, with each image resized to 640 × 640 pixels. The comprehensive evaluation encompasses a diverse range of aspects, including the experimental setup and design, YOLOv11s performance analysis, method impact assessment, model comparisons, ablation studies, and visualization results. A table displaying the parameters utilized during training of the fire and smoke detection model is included in Table 4 of the manuscript, providing a clear overview of the training configuration for this task.

3.2 Model Architecture:

What is YOLO?

YOLO (You Only Look Once) is mainly used for object detection in computer vision, including the accurate identification of objects in an image. Joseph Redmon, Santosh Divvala, Ross Girshick, and Ali Farhadi introduced it in the paper "You Only Look Once: Unified, Real-Time Object Detection". The main motivation behind YOLO was to create a fast, single-shot detection algorithm with high accuracy. Its single-shot approach divides an image into a grid and predicts bounding boxes and class probabilities for each grid cell.
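The same single-pass behaviour is exposed directly by the Ultralytics Python API. The sketch below is a minimal illustration that runs a pretrained YOLO11s checkpoint on one image; the image path and confidence threshold are placeholders, not values from this paper.

    from ultralytics import YOLO

    model = YOLO("yolo11s.pt")  # pretrained YOLO11s weights
    results = model.predict("fire_scene.jpg", conf=0.25)  # placeholder image path
    for box in results[0].boxes:  # one Results object per input image
        print(box.cls.item(), box.conf.item(), box.xyxy.tolist())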

Explanation of the YOLOv11 Architecture

Backbone (Efficient Feature Extraction)

The core of YOLOv11 is a strong backbone, responsible for extracting and processing features from the input data. The backbone consists of multiple convolutional layers that progressively downsample the input while increasing the depth of the feature maps. This hierarchical technique guarantees that the network captures both low-level details (such as edges and textures) and the high-level semantic representations required for accurate object detection. A prominent enhancement in YOLOv11 is the introduction of the C3k2 block, which replaces the older C2f module. Unlike standard large-kernel convolutions, the C3k2 block employs two smaller convolutional layers in succession, achieving improved efficiency and reduced computational cost without sacrificing accuracy.

Another key element in the backbone is the Spatial Pyramid Pooling - Fast (SPPF) block, which is a critical part of multi-scale feature accumulation. By pooling features at different scales, SPPF ensures that the network keeps valuable spatial information across varying object sizes and shapes. To further enhance feature extraction, YOLOv11 introduces the Cross Stage Partial with Spatial Attention (C2PSA) block. This module incorporates spatial attention mechanisms, allowing the model to focus on important regions of an image while suppressing less relevant areas, ensuring the most important parts receive the most attention. The union of these elements strengthens the backbone's ability to detect objects with greater precision, particularly in complicated environments where objects may be obstructed or overlapping.
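To make the multi-scale pooling concrete, the following is a minimal PyTorch sketch of an SPPF-style block: one 5 × 5 max-pool applied three times in series, with all intermediate maps concatenated, emulates pooling at 5 × 5, 9 × 9, and 13 × 13 scales at much lower cost. This is a simplified illustration (the production block also wraps its convolutions with batch normalization and SiLU), not the Ultralytics source.

    import torch
    import torch.nn as nn

    class SPPFSketch(nn.Module):
        # Serial 5x5 max-pools emulate parallel 5x5/9x9/13x13 pooling cheaply.
        def __init__(self, c_in, c_out, k=5):
            super().__init__()
            c_mid = c_in // 2
            self.cv1 = nn.Conv2d(c_in, c_mid, kernel_size=1)
            self.cv2 = nn.Conv2d(c_mid * 4, c_out, kernel_size=1)
            self.pool = nn.MaxPool2d(kernel_size=k, stride=1, padding=k // 2)

        def forward(self, x):
            x = self.cv1(x)
            y1 = self.pool(x)
            y2 = self.pool(y1)
            y3 = self.pool(y2)  # receptive field grows with each pass
            return self.cv2(torch.cat([x, y1, y2, y3], dim=1))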

Neck (Multi-Scale Feature Fusion)

The neck in YOLOv11 serves as the intermediary between the backbone and the detection head, playing a pivotal role in aggregating and refining multi-scale features. Traditionally, the neck includes a series of upsampling and feature fusion layers designed to consolidate information from different levels of the network. One of the key architectural improvements in YOLOv11 is the adoption of the C3k2 block within the neck. By replacing the standard C2f block, the C3k2 block streamlines the process of feature fusion, making it more efficient while preserving the integrity of the extracted information. This ensures that the detection head receives high-quality, well-processed features that enhance object localization and classification.

An important advancement in YOLOv11's neck is its enhanced spatial attention mechanism. By integrating the C2PSA module within the feature aggregation layers, the network can dynamically adjust its focus, emphasizing critical regions in the image while reducing the impact of background noise. This is particularly beneficial for detecting smaller objects or those positioned in cluttered scenes. The refined attention mechanism improves the model's ability to distinguish objects from their surroundings, leading to more reliable detections in real-world applications.
Head (Precision in Detection)

The detection head in YOLOv11 makes the final predictions, translating the processed feature maps into precise bounding box coordinates, objectness scores, and class probabilities. A major innovation in this stage is the integration of multiple C3k2 blocks, which allow for greater flexibility in feature refinement. The design of these blocks can be adjusted based on the complexity of the detection task. When the c3k parameter is set to False, the structure mirrors a traditional bottleneck module, ensuring lightweight computations. However, when c3k is set to True, the head employs a deeper C3 module, enabling more extensive feature extraction for complex scenarios.

The detection head incorporates a series of Convolution-BatchNorm-SiLU (CBS) layers, which further refine the extracted features before the final prediction. These layers stabilize the learning process by normalizing feature distributions and applying the SiLU activation function to introduce non-linearity. The final convolutional layers in the detection head condense the feature maps into precise outputs, predicting bounding box coordinates, confidence scores, and class labels. This process is followed by post-processing steps to filter out redundant detections and ensure the most accurate predictions are retained.
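The c3k switch can be pictured as a depth toggle. The sketch below is a schematic simplification under that assumption (the real Ultralytics block also splits and concatenates channels); it only shows how each inner unit might swap a single bottleneck for a deeper stack when c3k=True.

    import torch.nn as nn

    class Bottleneck(nn.Module):
        # Two 3x3 convolutions with a residual connection.
        def __init__(self, c):
            super().__init__()
            self.body = nn.Sequential(
                nn.Conv2d(c, c, 3, padding=1), nn.SiLU(),
                nn.Conv2d(c, c, 3, padding=1), nn.SiLU(),
            )

        def forward(self, x):
            return x + self.body(x)

    class C3k2Sketch(nn.Module):
        # c3k=False: each unit is one lightweight bottleneck.
        # c3k=True: each unit becomes a deeper three-bottleneck stack.
        def __init__(self, c, n=2, c3k=False):
            super().__init__()
            depth = 3 if c3k else 1
            self.units = nn.Sequential(
                *[nn.Sequential(*[Bottleneck(c) for _ in range(depth)])
                  for _ in range(n)]
            )

        def forward(self, x):
            return self.units(x)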
3.3 Experimental Setup:

To strengthen the training process and improve the generalization capabilities of our YOLOv11s wildfire detection model, we arranged a rigorous experimental setup involving data augmentation, preprocessing techniques, and multiple optimization strategies. This setup was designed to ensure that the model could effectively learn and generalize across a wide variety of wildfire scenarios, reducing false detections and improving accuracy in real-world applications.

3.4 Data Augmentation and Preprocessing:

Since the original dataset contained 3,466 high-quality images, we created a broad data augmentation pipeline to artificially increase the dataset size while introducing greater variability. This augmentation process expanded our dataset to 8,340 images, around twice the size of the original dataset, significantly improving the model's ability to recognize fires under different conditions.

3.5 Augmentation Techniques Used in the Dataset:

1. Flip: Horizontal Flip
- We applied horizontal flipping to simulate real-world scenarios where a fire could appear in different orientations.
- This helped the model generalize better to flames appearing on the left or right side of an image.

2. Rotations: Clockwise, Counterclockwise, and Upside Down
- To further enhance model robustness, we rotated images 90° clockwise, 90° counterclockwise, and 180° upside down so the model can recognize fire in any orientation, which is especially useful where the viewing angle varies, as with satellite and UAV platforms.

3. Cropping and Zooming
- We applied random cropping with 0% minimum zoom and 20% maximum zoom to introduce variation in fire size and positioning within an image.
- This helped the model recognize fires from different distances and perspectives, ensuring that both small and large fire instances were adequately detected.
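These transformations were applied through Roboflow's pipeline, but an equivalent pipeline can be reproduced in code. The sketch below uses the albumentations library (an assumption on our part; the paper itself used Roboflow's hosted tooling) with its 1.x-style API. The crop's scale range approximates the 0-20% zoom setting, and bounding boxes are kept in YOLO format.

    import albumentations as A
    import cv2

    # Flip, 90-degree rotations, and up-to-20%-zoom crops, mirroring the
    # Roboflow settings listed above; bounding boxes stay in YOLO format.
    augment = A.Compose(
        [
            A.HorizontalFlip(p=0.5),
            A.RandomRotate90(p=0.5),  # covers 90 CW, 90 CCW, and 180
            A.RandomResizedCrop(height=640, width=640, scale=(0.8, 1.0), p=0.5),
        ],
        bbox_params=A.BboxParams(format="yolo", label_fields=["labels"]),
    )

    image = cv2.imread("fire.jpg")  # placeholder sample image
    out = augment(image=image, bboxes=[[0.5, 0.5, 0.2, 0.3]], labels=[0])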
It is an optimizer widely used that combines the
benefits of SGD with adaptive learning rates. The learning
er rate for each parameter is dynamically adjusted, leading to
faster convergence and improved generalization.

AdamW (Adam with Weight Decay)


pe
A variant of Adam that includes weight decay to
prevent overfitting. This optimizer helps in better
regularization by controlling excessive updates to model
weights.

NAdam (Nesterov-Accelerated Adaptive Moment


ot

Estimation)
An extension of Adam, incorporating Nesterov
3.6 Preprocessing Techniques: momentum to further accelerate convergence and reduce
oscillations in weight updates. This optimizer is
1. Auto-Orient: Applied
tn

particularly effective in complex models with high-


- To Ensure uniformity in all images and consistency of
dimensional data.
orientation to avoid situations like some images may have
stored incorrect and incomplete dataset which could
RAdam (Rectified Adam)
confuse the model.
It addresses the shortcomings of Adam by
rin

reducing variance in adaptive learning rates, ensuring better


2. Resize: Stretch to 250×250 stability during training. It helps in cases where traditional
- resized some images to 250x250 to show that not all Adam struggles with poor convergence in early training
images will be in the exact same resolution which helped stages.
model differentiate fire even in different size and formats.
ep

RMSProp (Root Mean Square Propagation)


3. Auto-Adjust Contrast: Adaptive Equalization A variant of SGD that keeps a moving average of
- Enhanced image clarity by axiomatically adjusting squared gradients, which in effect adapts learning rates for
contrast using Adaptive Equalization. different parameters. This optimizer is known to be
- This method redistributes pixel intensities for better efficient in handling non-stationary objectives, making it
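The contrast step can be approximated locally with OpenCV's CLAHE (contrast-limited adaptive histogram equalization) applied to the luminance channel. This is a sketch of an equivalent operation, not the exact Roboflow implementation, and the file names are placeholders.

    import cv2

    # Adaptive equalization on the L channel, then the resize step.
    img = cv2.imread("smoky_scene.jpg")  # placeholder input
    lab = cv2.cvtColor(img, cv2.COLOR_BGR2LAB)
    l, a, b = cv2.split(lab)
    clahe = cv2.createCLAHE(clipLimit=2.0, tileGridSize=(8, 8))
    merged = cv2.merge((clahe.apply(l), a, b))
    img = cv2.cvtColor(merged, cv2.COLOR_LAB2BGR)
    img = cv2.resize(img, (250, 250))  # the "stretch to 250x250" step
    cv2.imwrite("preprocessed.jpg", img)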
By incorporating all these data augmentation and preprocessing techniques, we both increased the amount of data the model has to work with and improved the diversity and variability of that data, ensuring a high level of generalization that is useful for detecting wildfire even in low-light, high-smoke, low-resolution, or differently angled images from a wide range of sources.

3.7 Optimization Strategies for Training

To ensure the best performance of our YOLOv11s model, we experimented with multiple optimization algorithms. The choice of optimizer plays a crucial role in how well the model learns from the dataset, affecting convergence speed and accuracy. We utilized the following optimizers, all of which are available as default options in YOLOv11s:

SGD (Stochastic Gradient Descent)
The default optimizer for many deep learning models. Gradient descent is an iterative optimization process that searches for an objective function's optimum value (minimum or maximum). It is one of the most widely used methods for updating a model's parameters in order to reduce a cost function in machine learning projects.

Adam (Adaptive Moment Estimation)
A widely used optimizer that combines the benefits of SGD with adaptive learning rates. The learning rate for each parameter is dynamically adjusted, leading to faster convergence and improved generalization.

AdamW (Adam with Weight Decay)
A variant of Adam that includes decoupled weight decay to prevent overfitting. This optimizer improves regularization by controlling excessive updates to the model weights.

NAdam (Nesterov-Accelerated Adaptive Moment Estimation)
An extension of Adam incorporating Nesterov momentum to further accelerate convergence and reduce oscillations in weight updates. This optimizer is particularly effective in complex models with high-dimensional data.

RAdam (Rectified Adam)
Addresses a shortcoming of Adam by reducing the variance of the adaptive learning rates, ensuring better stability during training. It helps in cases where traditional Adam struggles with poor convergence in the early training stages.

RMSProp (Root Mean Square Propagation)
A variant of SGD that keeps a moving average of squared gradients, which in effect adapts the learning rate per parameter. This optimizer is known to handle non-stationary objectives efficiently, making it well suited to dynamic fire detection scenarios.

All of these optimizers were run and tested for the purpose of selecting the best one for training the YOLOv11s model on the wildfire detection task. The final selection was based on convergence speed, validation accuracy, and overall model stability.
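Since the Ultralytics trainer exposes each of these optimizers by name, the screening runs can be scripted as a simple loop. The dataset YAML path and the short run length below are illustrative assumptions, not the paper's exact configuration.

    from ultralytics import YOLO

    # Short screening run for each optimizer exposed by the Ultralytics trainer.
    for opt in ["SGD", "Adam", "AdamW", "NAdam", "RAdam", "RMSProp"]:
        model = YOLO("yolo11s.pt")   # fresh weights for a fair comparison
        model.train(
            data="wildfire.yaml",    # placeholder dataset config
            epochs=10,               # illustrative screening length
            optimizer=opt,
            name=f"screen_{opt}",    # separate run directory per optimizer
        )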

Impact of Experimental Setup on Model Performance

By implementing a comprehensive experimental setup with advanced data augmentation, preprocessing techniques, and multiple optimizers, we significantly improved our model's ability to detect wildfires accurately.

3.9 Training Configuration:

The model was trained using the YOLOv11s architecture with the following settings:

Model      YOLO11s
Epochs     100
Batch      32
Workers    4
GPU        enabled

The choice of 100 epochs was based on convergence trends observed during early experimentation. The batch size of 32 ensured an optimal balance between memory efficiency and gradient stability. Model training was conducted on a GPU-enabled system, significantly accelerating computations.
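In Ultralytics terms, this configuration corresponds to a single train call like the sketch below; the dataset YAML name is a placeholder.

    from ultralytics import YOLO

    model = YOLO("yolo11s.pt")
    model.train(
        data="wildfire.yaml",  # placeholder dataset config
        epochs=100,            # settings from the table above
        batch=32,
        workers=4,
        imgsz=640,             # images resized to 640 x 640
        device=0,              # GPU-enabled training
    )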
IV. RESULTS

4.1 Dataset Description:

In this study, we constructed a custom dataset by combining two publicly available datasets, both sourced from Roboflow (one by Antonytsai). Even though both datasets had a large number of labeled images related to fire and non-fire scenarios, they also included a considerable amount of irrelevant and misleading data that could negatively impact model performance and even hinder or completely change its detection target. Many images contained elements such as human faces, candles, fireworks, and artificial light sources, which do not accurately represent wildfire situations and which degraded our model's predictions and accuracy. As our objective was to develop a robust wildfire detection model, we performed a thorough manual cleaning process to remove unwanted data. This ensured that the final dataset contained only high-quality and relevant images, thereby enabling the model to learn more effectively.

4.2 Dataset Curation and Cleaning Process:

The initial datasets from Roboflow and Kaggle included thousands of images, but a large majority of them were unusable for our purposes. We manually inspected the images and removed misleading samples that could create false detections in our model. For example, candles and controlled flames in an indoor environment might cause the model to report fires where none exist in an outdoor wildfire scenario. Similarly, images of people that had been randomly inserted into the dataset as padding were removed, leading to much better classifications.

Instead of using the datasets as they were, we carefully selected only the best-quality images that clearly depicted fire and no-fire scenarios in outdoor environments. This approach allowed us to create a particularly high-quality dataset that raised the accuracy and precision of our model well above the 65-75% accuracy reported by other users who had previously tried the model. Each image in the final dataset was resized to 640 × 640 pixels to maintain uniformity, improve computational efficiency, and ensure compatibility with our deep learning model; this created additional work, as we needed to crop out many unwanted details from the dataset.

4.3 Dataset Composition and Splitting:

After finishing the extensive curation process, our final dataset contained 3,368 images, about 70% of the original data, which were then split using a 70:20:10 ratio:

Training set    2,437 images (70%)
Valid set         600 images (20%)
Testing set       343 images (10%)

A 70:20:10 split was chosen to ensure that the model had an adequately large training dataset while retaining meaningful validation and test sets for performance evaluation. This balance allows the model to learn effectively while still being tested on a variety of unseen images to assess generalization.
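A split like this is straightforward to reproduce. The sketch below shuffles a flat image folder into train/valid/test directories; the folder layout, file extension, and seed are assumptions for illustration, not the authors' exact script.

    import random
    import shutil
    from pathlib import Path

    # Shuffle once, then slice into 70/20/10 portions.
    images = sorted(Path("dataset/images").glob("*.jpg"))  # placeholder layout
    random.seed(42)
    random.shuffle(images)
    n = len(images)
    splits = {
        "train": images[: int(0.7 * n)],
        "valid": images[int(0.7 * n): int(0.9 * n)],
        "test": images[int(0.9 * n):],
    }
    for name, files in splits.items():
        out = Path("dataset") / name / "images"
        out.mkdir(parents=True, exist_ok=True)
        for f in files:
            shutil.copy(f, out / f.name)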


4.4 Importance of High-Quality Data:

One of the most critical aspects of our dataset preparation was ensuring high-quality images. A typical issue with public datasets is that they often contain images with poor lighting, unclear fire instances, misleading elements, or irrelevant and sometimes garbage data.

To address this, we meticulously filtered out the images that could reduce the accuracy of the model or lead to false detections. By focusing on quality over quantity, we ensured that our dataset provided only the most meaningful training data.

In addition, we carefully balanced the dataset to avoid class imbalance problems, which could lead to biased predictions. By ensuring a nearly equal distribution of fire and no-fire images, we helped the model learn to distinguish fire occurrences without bias toward one class.

4.5 Final Thoughts on Dataset Selection:

By building a high-quality, well-balanced, and diverse dataset, we improved the reliability and performance of our model. The dataset played a crucial role in ensuring that our model could accurately detect wildfires in real-world scenarios. The removal of unwanted data, along with the careful selection of high-quality images, enhances the ability of the model to differentiate fire from non-fire images more effectively.

In summary, our dataset preparation process highlights the importance of manual data curation, high-quality image selection, and appropriate dataset balancing. These efforts significantly contributed to the success of our wildfire detection system, ensuring that it operates efficiently and accurately in real-world conditions.
4.6 Performance Evaluation:

4.6.1 Training and Validation Metrics

The training and validation metrics over 100 epochs are summarized below. These metrics include training losses (box loss, classification loss, and distribution focal loss), validation losses, and key performance indicators such as precision, recall, mAP50, and mAP50-95.

4.6.2 Training Losses

- Box Loss: This reflects the accuracy of the bounding box predictions. It decreased gradually from 1.72343 at epoch 1 to 0.7538 at epoch 100, showing better localization accuracy.

- Classification Loss (Cls Loss): The classification loss, indicating the correctness of class predictions, dropped from 2.35903 at epoch 1 to 0.38772 at epoch 100, demonstrating a marked improvement in class prediction.

- Distribution Focal Loss (DFL Loss): The DFL loss, which targets the quality of bounding box predictions, went down from 1.66835 at epoch 1 to 0.97826 at epoch 100.

4.6.3 Validation Losses

- Box Loss: The validation box loss fluctuated but decreased overall, from 1.7316 at epoch 1 to 1.2743 at epoch 100.

- Classification Loss (Cls Loss): The validation classification loss had a declining trend, beginning at 1.37305 at epoch 1 and concluding at 0.71895 at epoch 100.

- Distribution Focal Loss (DFL Loss): The validation DFL loss also went down, beginning at 1.58207 at epoch 1 and concluding at 1.35067 at epoch 100.

4.6.4 Performance Metrics

- Precision (B): Precision increased from 0.73039 at epoch 1 to 0.874 at epoch 100, reflecting the decrease in false positives.

- Recall (B): Recall rose from 0.56165 at epoch 1 to 0.79644 at epoch 100, indicating improved detection of true positives.

- mAP50 (B): The mean Average Precision at an IoU threshold of 0.5 improved from 0.64417 at epoch 1 to 0.8395 at epoch 100.

- mAP50-95 (B): The mean Average Precision across IoU thresholds from 0.5 to 0.95 went up from 0.32565 at epoch 1 to 0.55355 at epoch 100, showing improved overall detection precision across different IoU thresholds.
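All of these per-epoch values are written by the Ultralytics trainer to a results.csv file inside the run directory, so curves like those in the next subsection can be regenerated with a few lines; the run path below is an assumption.

    import pandas as pd
    import matplotlib.pyplot as plt

    # Ultralytics logs one row per epoch; headers may carry padding spaces.
    df = pd.read_csv("runs/detect/train/results.csv")  # placeholder run path
    df.columns = [c.strip() for c in df.columns]

    plt.plot(df["epoch"], df["train/box_loss"], label="train box loss")
    plt.plot(df["epoch"], df["val/box_loss"], label="val box loss")
    plt.plot(df["epoch"], df["metrics/mAP50(B)"], label="mAP50")
    plt.xlabel("epoch")
    plt.legend()
    plt.savefig("training_curves.png")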
Graphical Representation

4.7 DIAGRAMS

Precision-Recall (PR) Curve and F1-Score Curve

4.8 Validation and Training Batch Images

Ablation Study on Model Performance

Introduction: Ablation studies are of great importance in deep learning research for analysing how different training settings affect the performance of the model. This paper discusses five different experimental setups to evaluate how modifications in training settings affect model performance metrics, focusing on training epochs, augmentation strategies, and different versions of YOLO models.

Experimental Setup: The study was conducted under five different conditions, listed with their results in the following section.

4.9 Results and Analysis

The performance of each configuration is summarized in the table below:
Experiment                    Epochs  Loss     Precision  Recall   mAP@50   mAP@50:95  Time (s)
With 50 epochs                50      0.64916  0.85745    0.76454  0.83088  0.51309    8555.44
Without augmentations         100     0.57935  0.8762     0.80019  0.85791  0.52438    6206.16
Without augmentation, YOLOm   50      0.82362  0.84063    0.7175   0.818    0.48104    5012.48
With augmentation, YOLOn      50      0.62615  0.85554    0.77777  0.8388   0.51096    6968.45
With fine-tuned 50 epochs     50      0.37061  0.87729    0.79808  0.84333  0.56345    7686.16
Discussion: Each experiment provides insight into how training settings affect the model's performance. Below, we discuss key observations from each scenario.

Training With 50 Epochs: The model had a total loss of 0.64916, which is moderate compared to the other configurations. Precision was 0.85745, indicating a strong ability to detect positive cases. Recall was 0.76454, suggesting a good ability to correctly identify positive cases. The mAP@50 score was 0.83088, showing balanced detection capability. The mAP@50:95 score was 0.51309, which is lower than some configurations but within an acceptable range. Total training time was 8555.44 s, making this a reliable baseline for further optimizations. This experiment serves as a reference point for measuring the impact of fine-tuning and data augmentation.

Training Without Augmentations (100 Epochs): The total loss decreased to 0.57935, showing improved model convergence over a longer training period. Precision increased to 0.8762, while recall also improved to 0.80019. mAP@50 improved to 0.85791, suggesting better generalization despite the absence of augmentations. mAP@50:95 increased slightly to 0.52438, demonstrating improved performance across multiple IoU thresholds. The total training time was reduced to 6206.16 s, showing efficiency gains. This experiment suggests that longer training can improve generalization even without augmentations.

Training Without Augmentations on YOLOm (50 Epochs): The total loss increased to 0.82362, the highest among all configurations. Precision decreased slightly to 0.84063, while recall dropped to 0.7175. mAP@50 was slightly lower at 0.818, suggesting reduced detection performance. mAP@50:95 dropped further to 0.48104, showing weaker generalization compared to the other models. The total training time was the lowest, at 5012.48 s, making this the fastest configuration. This suggests that YOLOm struggles without augmentations, leading to lower recall and overall performance despite training quickly.

Training With Augmentations on YOLOn (50 Epochs): The loss decreased to 0.62615, improving stability over the baseline. Precision remained high at 0.85554, showing consistent performance in detecting wildfire cases. Recall improved to 0.77777, indicating that augmentations helped capture more positive cases. mAP@50 improved to 0.8388, showing a well-balanced model. mAP@50:95 remained stable at 0.51096, suggesting generalization similar to the baseline. Training time was 6968.45 s, making it more computationally intensive than YOLOm but better in accuracy. This experiment suggests that augmentations helped improve recall while maintaining balanced performance.
Training With Fine-Tuned Parameters (50 Epochs): The total loss dropped significantly to 0.37061, making this the most optimized configuration. Precision reached the highest value, at 0.87729, ensuring accurate wildfire detection. Recall remained strong at 0.79808, maintaining high sensitivity. mAP@50 increased to 0.84333, showing further improvement over the other experiments. mAP@50:95 reached the highest value of 0.56345, demonstrating the best generalization across all IoU thresholds. Total training time was 7686.16 s, making it efficient without sacrificing accuracy. This experiment demonstrates that fine-tuning effectively improves precision and generalization, making it the best-performing setup.

Conclusions and Key Takeaways: From this ablation study, several conclusions can be drawn:

Longer training without augmentations (100 epochs) produced lower total loss and higher mAP@50:95 than the 50-epoch baseline, though it was still outperformed by the fine-tuned configuration.

Among the unaugmented runs, the 100-epoch configuration achieved the highest precision (0.8762) and recall (0.80019), suggesting the model can generalize well without additional data transformations.

Augmentations did not always lead to significant improvements, especially in recall: augmentation on YOLOn yielded a recall of 0.77777, below the 0.80019 of the unaugmented 100-epoch run.

Confidence scores remained stable, indicating that while loss values varied, the certainty of predictions was largely unchanged.

Augmentations may not always be necessary and should be evaluated per dataset and model type.

V. ACKNOWLEDGMENT

We sincerely thank SRM IST Tiruchirappalli for providing the resources and conducive academic environment required for the success of this project. We express our heartfelt thanks to our mentors, Manivannan Chandrakumar and Shanmugapriya, for their useful guidance, positive feedback, and constant support throughout this research; our study has been deeply enriched by their professional expertise. We also want to thank the subject handler of Database Management System, Prabu, for his advice and support, which have gone a long way toward making this work of such quality. Finally, we appreciate all those who directly or indirectly helped this research; their efforts have contributed to its successful completion.

VI. REFERENCES

• Avazov, K., Hyun, A. E., S, A. A., Abdusalomov, A. B., & Cho, Y. I. (2023). Forest fire detection and notification method based on AI and IoT approaches. Future Internet, 15(2), 61.

• Rashkovetsky, D., Mauracher, F., Langer, M., & Schmitt, M. (2021). Wildfire detection from multisensor satellite imagery using deep semantic segmentation. IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing, 14.

• Saydirasulovich, S. N., Mukhiddinov, M., Djuraev, O., Abdusalomov, A., & Cho, Y.-I. (2023). An improved wildfire smoke detection based on YOLOv8 and UAV images. Sensors, 23, 8374.

• Panquin, P. (2024, November). Fire_Smoke dataset. Roboflow Universe. Available: https://universe.roboflow.com/panquin/fire_smoke-ziczu-yyfct-bkuru (accessed March 11, 2025).

• Jocher, G., Qiu, J., & Chaurasia, A. (2023, January). Ultralytics YOLO (version 8.0.0). Available: https://github.com/ultralytics/ultralytics

