Optimizing YOLOv11s for High-Precision Fire Identification

Naveen Jayaraj
DataBase Management System
SRM Institute Of Science And Technology
Tiruchirappalli, India
naveenpainthouse@gmail.com

Madhav K Mohan
DataBase Management System
SRM Institute Of Science And Technology
Tiruchirappalli, India
madhavkmohan@gmail.com

Aswin S
DataBase Management System
SRM Institute Of Science And Technology
Tiruchirappalli, India
faswin746@gmail.com
Abstract: Wildfires are serious ecological disasters that cause widespread destruction, affecting natural habitats, human settlements, and air quality. Early detection is crucial to minimizing damage and enabling timely intervention. This research presents a wildfire detection model utilizing YOLOv11s, a lightweight yet highly efficient deep learning architecture optimized for real-time object detection. The dataset is a modified Roboflow dataset, cleaned of unwanted information and enhanced with custom preprocessing techniques to improve detection accuracy under varied environmental conditions. Experimental results demonstrate that our approach achieves 89.04% accuracy, outperforming existing models while maintaining a low false-positive rate. By leveraging YOLOv11s, the model best suited to this application, our system ensures high precision while using as little processing power as possible. It is designed to be deployed on drones, satellites, and ground cameras.

I. INTRODUCTION

The consequences of wildfires extend to environmental effects, including loss of biodiversity and carbon emissions. They also encompass economic losses and severe health threats from airborne pollutants. Precise wildfire detection is crucial to mitigating these effects.

Traditional detection methods, such as satellite tracking, heat sensors, ground monitoring, lookout towers, and visual observation, have limitations, including reporting delays, restricted coverage, and false alarms caused by atmospheric conditions like haze and fog.

The innovation of Deep Learning and Artificial Intelligence has been one of the most promising solutions for improving the efficacy of wildfire detection and response systems. There is a pressing need to create an efficient, accurate, and automatic wildfire detection system for proper disaster management. Advancements in deep learning and object detection hold tremendous potential for resolving this problem.

YOLO, or You Only Look Once, is a state-of-the-art deep learning technique for real-time object detection. Through a combination of advanced technologies, the proposed system offers the following:
This preprint research paper has not been peer reviewed. Electronic copy available at: https://ssrn.com/abstract=5232112
Real-Time Detection: The YOLOv11s model provides real-time detection of wildfires, enabling quicker response times and less damage.

Enhanced Accuracy: Architectural improvements such as the C3K2 block and C2PSA block enhance the model's capacity to identify fire and smoke against intricate backgrounds, minimizing false positives.

Scalability: The system can be implemented on multiple platforms such as drones, satellites, and ground cameras, giving overall coverage.

Environmental Impact: It can greatly minimize the environmental harm caused by wildfire, such as carbon emission, soil loss, and biodiversity loss.

Economic and Social Benefits: By avoiding extensive wildfires, the system can prevent billions of dollars in economic losses and safeguard populations from displacement and health hazards.

Though the performance is promising, some challenges remain. Environmental factors such as clouds, haze, and fog continue to affect detection accuracy. Furthermore, the dynamics of wildfire behaviour and fluctuating environmental factors present consistent difficulties for generalizing the model. Future studies will concentrate on:

- Enhancing the model's ability to handle diverse environmental conditions.
- Incorporating multisensor data fusion for more robust detection.
- Expanding the dataset to include more diverse fire scenarios and geographic regions.

In conclusion, the paper provides an extensive response to the increasing issues of wildfires in an era of climate change and environmental degradation.

This paper introduces a better model for wildfire smoke detection with YOLOv11s, integrating new architectural innovations like the C3K2 block, SPPF module, and C2PSA block. These innovations improve detection accuracy, enhance feature extraction, and optimize computational efficiency, making the model an extremely effective tool for wildfire smoke detection.

II. RELATED WORK

This study presents an enhanced model for detecting wildfire smoke using YOLOv8, incorporating improvements to boost accuracy and efficiency. The method includes Wise-IoU (WIoU) v3 for bounding box regression, Ghost Shuffle Convolution (GSConv) to reduce model parameters and accelerate convergence, and a BiFormer attention mechanism to focus on crucial features while ignoring irrelevant background noise. The dataset comprises 3200 images of forest fire smoke and 2800 images without smoke, totaling 6000 images, sourced from Google, Kaggle, Flickr, and Bing. The model achieved an average precision (AP) of 79.4%, a notable 3.3% improvement over the baseline, with robust performance for small (APS of 71.3%) and large (APL of 92.6%) smoke areas. However, there are drawbacks, such as the model's sensitivity to atmospheric conditions like fog, haze, and clouds, which can mimic smoke and lead to false positives. Additionally, the dynamic nature of smoke plumes and the complexity of the forest environment pose challenges to precise detection. The study highlights the need for further advancements in distinguishing smoke from similar atmospheric conditions to improve detection accuracy.

2.2 Wildfire Detection from Multisensor Satellite Imagery Using Deep Semantic Segmentation (Rashkovetsky, Mauracher, Langer, & Schmitt, 2021)

In this study, "Wildfire Detection from Multisensor Satellite Imagery Using Deep Semantic Segmentation", a workflow using deep learning to detect fire-affected areas from multisensor satellite imagery is proposed. To achieve deep semantic segmentation on data from four satellite instruments, Sentinel-1 C-SAR, Sentinel-2 MSI, Sentinel-3 SLSTR, and MODIS on board the Terra and Aqua satellites, the method uses a U-Net-based convolutional neural network. The different spectral bands of these instruments (visible, infrared, and microwave) were exploited, and data from the sensors were fused, to achieve more robust and accurate fire detection.

The dataset used in this work is the imagery of the above satellite instruments, augmented with reference data from the California Fire Perimeter Database, which includes wildfire perimeters in California from 1950 to 2019. The multispectral and multisensor quality of the dataset enhances the detection capabilities by exploiting the advantages of each satellite's unique features.

The U-Net models trained in this work exhibit significant performance; in other words, the single-instrument model implemented with only Sentinel-2 achieves a precision of 0.83 and a recall of 0.92, a clear dominance in performance among the models. Such advances
notwithstanding, several limitations and drawbacks were observed. Clouds and smoke can obscure the fire signal and lead to false negatives; these are the two most important problems. The imbalanced classification problem, with fire-affected pixels being far fewer than their unaffected counterparts, causes difficulties in fire segmentation. Wildfires are dynamic and complex phenomena involving many interacting factors such as weather, topography, and vegetation, resulting in a more complex detection process. Also, the performance of the model is highly dependent on the availability and quality of the training data; thus, complete and high-quality datasets are crucial.

In conclusion, while the multisensor approach and deep learning methods show great promise in improving wildfire detection, it remains imperative to address the issues surrounding atmospheric conditions and data imbalance to enhance detection accuracy and robustness further.

2.3 Forest Fire Detection and Notification Method Based on AI and IoT Approaches (Avazov, Hyun, S, Abdusalomov, & Cho, 2023)

This study proposes a new AI- and IoT-based method to create a faster single-shot detection algorithm with high accuracy. The single-shot approach divides images into grids and predicts bounding boxes and class probabilities for each grid. The model includes images from the Robmarkcole and Glenn-Jocher databases as well as video footage of various fire scenarios collected from YouTube. The dataset consists of 75% training and 25% test images, for a total of 3120 images.

III. PROPOSED SYSTEM

3.1 System:

This section provides an elaborate description of the hyperparameter settings, the test dataset, the experimental configuration, and the validation process employed to measure the effectiveness of the improved YOLOv11s model in identifying wildfires from CCTV, satellite, or UAV sources. All experiments were conducted under consistent hardware conditions to ensure the reliability of the proposed methodology. The experiments were carried out on a Google Colaboratory project with the following specifications: an Intel Xeon CPU with 2 vCPUs (virtual CPUs) and 13 GB of RAM, and an NVIDIA Tesla K80 GPU with 12 GB of VRAM. We obtained our dataset from both Roboflow and Kaggle, with each image resized to 640 × 640 pixels. The comprehensive evaluation encompasses a diverse range of aspects, including the experimental setup and design, YOLOv11s performance analysis, method impact assessment, model comparisons, ablation studies, and visualization results. A table displaying the parameters utilized during training of the fire and smoke detection model is included in Table 4 of the manuscript, providing a clear overview of the training configuration for this specific task.
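As noted above, every image is resized to 640 × 640 pixels before training. The paper does not specify its resizing implementation, so the following is only an illustrative sketch using NumPy nearest-neighbor indexing:

```python
import numpy as np

def resize_nearest(img: np.ndarray, size: int = 640) -> np.ndarray:
    """Resize an H x W x C image to size x size with nearest-neighbor sampling."""
    h, w = img.shape[:2]
    rows = (np.arange(size) * h // size).astype(int)  # source row for each output row
    cols = (np.arange(size) * w // size).astype(int)  # source column for each output column
    return img[rows][:, cols]

# Example: a dummy 480 x 720 RGB frame resized to the model's 640 x 640 input.
frame = np.zeros((480, 720, 3), dtype=np.uint8)
resized = resize_nearest(frame, 640)
print(resized.shape)  # (640, 640, 3)
```

In practice a bilinear resize (as used by common training pipelines) would be preferred; nearest-neighbor is shown here only because it is self-contained.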
This hierarchical technique guarantees that the network captures both low-level details (such as edges and textures) and high-level semantic representations required for accurate object detection. A prominent enhancement in YOLOv11 is the introduction of the C3k2 block, which replaces the older C2f module. Unlike standard large-kernel convolutions, the C3k2 block employs two smaller convolutional layers in succession, achieving improved efficiency and reduced computational cost without sacrificing accuracy.

Another key element in the backbone is the Spatial Pyramid Pooling - Fast (SPPF) block, which is a critical part of multi-scale feature accumulation. By pooling features at different scales, SPPF ensures that the network keeps valuable spatial information across varying object sizes and shapes. To further enhance feature extraction, YOLOv11 introduces the Cross Stage Partial with Spatial Attention (C2PSA) block. This module incorporates spatial attention mechanisms, allowing the model to focus on important regions in an image while suppressing less relevant areas, ensuring that the most important parts receive the most attention. The union of these elements strengthens the backbone's ability to detect objects with greater precision, particularly in complicated environments where objects may be obstructed or overlapping.

When c3k is set to False, the block's structure follows a traditional bottleneck module, ensuring lightweight computations. However, when c3k is set to True, the head employs a deeper C3 module, enabling more extensive feature extraction for complex scenarios.

The detection head incorporates a series of Convolution-BatchNorm-SiLU (CBS) layers, which further refine the extracted features before final prediction. These layers stabilize the learning process by normalizing feature distributions and applying the SiLU activation function to introduce non-linearity. The final convolutional layers in the detection head reduce the feature maps into precise outputs, predicting bounding box coordinates, confidence scores, and class labels. This process is followed by post-processing steps to filter out redundant detections and ensure that the most accurate predictions are retained.

3.3 Experimental Setup:
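For reference, the CBS layer and SPPF block described in the architecture discussion above can be sketched in PyTorch. This follows the common YOLO-family formulation; the channel widths and the 5×5 pooling kernel are standard choices, not values taken from the paper:

```python
import torch
import torch.nn as nn

class CBS(nn.Module):
    """Convolution-BatchNorm-SiLU layer, the basic refinement unit in the head."""
    def __init__(self, c_in, c_out, k=1, s=1):
        super().__init__()
        self.conv = nn.Conv2d(c_in, c_out, k, s, padding=k // 2, bias=False)
        self.bn = nn.BatchNorm2d(c_out)   # normalizes feature distributions
        self.act = nn.SiLU()              # introduces non-linearity

    def forward(self, x):
        return self.act(self.bn(self.conv(x)))

class SPPF(nn.Module):
    """Spatial Pyramid Pooling - Fast: three chained 5x5 max-pools, concatenated,
    giving multi-scale context at low cost."""
    def __init__(self, c_in, c_out):
        super().__init__()
        c_hidden = c_in // 2
        self.cv1 = CBS(c_in, c_hidden)
        self.pool = nn.MaxPool2d(kernel_size=5, stride=1, padding=2)
        self.cv2 = CBS(c_hidden * 4, c_out)

    def forward(self, x):
        x = self.cv1(x)
        p1 = self.pool(x)       # effective 5x5 receptive field
        p2 = self.pool(p1)      # effective 9x9
        p3 = self.pool(p2)      # effective 13x13
        return self.cv2(torch.cat([x, p1, p2, p3], dim=1))

feat = torch.randn(1, 256, 20, 20)   # a dummy backbone feature map
out = SPPF(256, 256)(feat)
print(out.shape)  # torch.Size([1, 256, 20, 20])
```

The stride-1, padding-2 pooling keeps the spatial size unchanged, so SPPF enriches features without altering the feature-map resolution.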
3.5 Augmentation Techniques Used in the Dataset:

1. Flip: Horizontal Flip
- We applied horizontal flipping to simulate real-world scenarios where a fire could appear in different orientations.
- This helped the model generalize better to flames appearing from the left or right side of an image.

2. Rotations: Clockwise, Counterclockwise, and Upside Down
- To further enhance model robustness, we rotated images 90° clockwise, 90° counterclockwise, and 180° upside down so the model can recognize fire in any orientation, which is especially useful where the viewing angle of the fire varies, as with satellite and UAV imagery.

3. Cropping and Zooming
- We applied random cropping with 0% minimum zoom and 20% maximum zoom to introduce variation in fire size and positioning within an image.
- This helped the model recognize fires from different distances and perspectives, ensuring that both small and large fire instances were adequately detected.

These augmentations not only enlarged the dataset, giving the model more data to work with, but also improved the diversity and variability of the data, ensuring a high level of generalization useful for detecting wildfire even in low-light, high-smoke, low-resolution, or differently angled images from a wide range of sources.

3.6 Preprocessing Techniques:

1. Auto-Orient: Applied

3.7 Optimization Strategies for Training

To ensure the best performance of our YOLOv11s model, we experimented with multiple optimization algorithms. The choice of optimizer plays a crucial role in how well the model learns from the dataset, affecting convergence speed and accuracy. We utilized the following optimizers, all of which are available as default options in YOLOv11s:

SGD (Stochastic Gradient Descent)
A default optimizer for deep learning models. Gradient descent is an iterative optimization process that searches for an objective function's optimum value (minimum or maximum). It is one of the most widely used methods for updating a model's parameters in order to reduce a cost function in machine learning projects.

Adam (Adaptive Moment Estimation)
A widely used optimizer that combines the benefits of SGD with adaptive learning rates. The learning rate for each parameter is dynamically adjusted, leading to faster convergence and improved generalization.

NAdam (Nesterov-accelerated Adaptive Moment Estimation)
An extension of Adam, incorporating Nesterov momentum to further accelerate convergence and reduce oscillations in weight updates. This optimizer is
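The flip, rotation, and crop-zoom augmentations of Section 3.5 can be sketched with NumPy array operations. This is an illustrative reimplementation, not the tooling actually used during dataset preparation:

```python
import numpy as np

rng = np.random.default_rng(0)  # fixed seed so the sketch is reproducible

def hflip(img):
    """Horizontal flip: fire on the left also appears on the right."""
    return img[:, ::-1]

def rotate90(img, k):
    """Rotate by k * 90 degrees (k=1 counterclockwise, k=2 upside down, k=3 clockwise)."""
    return np.rot90(img, k)

def random_zoom_crop(img, max_zoom=0.20):
    """Crop away up to 20% of each dimension, simulating the 0-20% zoom range."""
    h, w = img.shape[:2]
    z = rng.uniform(0.0, max_zoom)
    dh, dw = int(h * z / 2), int(w * z / 2)
    return img[dh:h - dh, dw:w - dw]

img = np.arange(640 * 640 * 3, dtype=np.uint8).reshape(640, 640, 3)
flipped = hflip(img)
rotated = rotate90(img, 2)
cropped = random_zoom_crop(img)
```

A flip applied twice recovers the original image, which makes these transforms cheap to verify; the cropped result would normally be resized back to 640 × 640 before training.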
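The iterative update at the heart of the gradient-descent optimizers discussed in Section 3.7 can be shown on a toy one-dimensional cost function. This is a generic illustration, not the model's actual training loop:

```python
# Minimal gradient-descent loop on the cost function f(w) = (w - 3)^2,
# whose minimum (w = 3) the iterative updates should approach.
def grad(w):
    return 2.0 * (w - 3.0)  # derivative of (w - 3)^2

w = 0.0   # initial parameter value
lr = 0.1  # learning rate
for _ in range(100):
    w -= lr * grad(w)  # the core SGD parameter update

print(round(w, 4))  # → 3.0
```

Adam and NAdam keep this same update skeleton but scale the step per parameter using running moment estimates, which is what yields their faster convergence.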
well suited to improving overall model stability.

Impact of Experimental Setup on Model Performance

By implementing a comprehensive experimental setup with advanced data augmentation, preprocessing techniques, and multiple optimizers, we significantly improved our model's ability to detect wildfires accurately.

3.9 Training Configuration:

The model was trained using the YOLOv11s architecture with the following settings:

Model: YOLOv11s
Epochs: 100
Batch: 32
Workers: 4
GPU: enabled

The choice of 100 epochs was based on convergence trends observed during early experimentation. The batch size of 32 ensured an optimal balance between memory efficiency and gradient stability. Model training was conducted on a GPU-enabled system, significantly accelerating computations.

IV. RESULTS

4.1 Dataset Description:

The quality of the dataset directly affects prediction and accuracy. As our objective was to develop a robust wildfire detection model, we performed a thorough manual cleaning process to remove unwanted data. This ensured that the final dataset contained only high-quality and relevant images, thereby enabling the model to learn effectively. The majority of the raw images were unusable for our purposes. We manually inspected the images and removed misleading samples that could create false detections in our model. For example, candles and controlled flames in an indoor environment might cause the model to classify fires where none exist in an outdoor wildfire scenario. Similarly, images of people randomly inserted into the dataset as a curious case of padding were also removed, leading to much better classifications.

Instead of using the datasets as they were, we carefully selected only the best-quality images that clearly depicted fire and no-fire scenarios in outdoor environments. This approach allowed us to create a particularly high-quality dataset that improved the accuracy and precision of our model well beyond that of earlier users of the model, who had achieved accuracies of 65-75%. Each image in the final dataset was resized to 640 × 640 pixels to maintain uniformity, improve computational efficiency, and ensure compatibility with our deep learning model; this created additional work, as we needed to crop out many unwanted details from the dataset.

A 70:20:10 split was chosen to ensure that the model had an adequately large training dataset while retaining meaningful test and validation sets for performance evaluation. This balance allows the model to learn effectively while still being tested on a variety of unseen samples. A key aspect of dataset preparation was ensuring high-quality images. A typical issue with public datasets is that they often contain images with poor lighting, unclear fire instances, misleading elements, or irrelevant and sometimes garbage data. To address this, we meticulously filtered out the images that could reduce the model's accuracy or lead to false detections. By focusing on quality over quantity, we ensured that our dataset provided only the most meaningful training data.

In addition, we carefully balanced the dataset to avoid class-imbalance problems, which could lead to biased predictions. By ensuring a nearly equal distribution of fire and no-fire images, we helped the model learn to distinguish fire occurrences without bias toward one class.

4.6.3 Validation Losses

- Box Loss: The box loss during validation fluctuated but decreased overall, from 1.7316 in epoch 1 to 1.2743 in epoch 100.
- Classification Loss (Cls Loss): The validation classification loss had a declining trend, beginning at 1.37305 in epoch 1 and concluding at 0.71895 in epoch 100.
- Distribution Focal Loss (DFL Loss): The validation DFL loss also went down, beginning at 1.58207 in epoch 1 and concluding at 1.35067 in epoch 100.

4.5 Final Thoughts on Dataset Selection:
By building a high-quality, well-balanced, and
diverse dataset, we improved the reliability and
performance of our model. The dataset played a crucial role
in ensuring that our model could accurately detect wildfires
in real-world scenarios. The removal of unwanted data,
along with the careful selection of high-quality images,
enhances the ability of the model to differentiate fire from
non-fire images more effectively.
In summary, our dataset preparation process
highlights the importance of manual data curation, high-
quality image selection, and appropriate dataset balancing.
These efforts significantly contributed to the success of our
wildfire detection system, ensuring that it operates
efficiently and accurately in real-world conditions.
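The 70:20:10 train/test/validation split described in Section 4.1 can be sketched as follows. The file names and seed below are hypothetical; the paper does not state how its split was produced:

```python
import random

def split_dataset(items, train=0.70, test=0.20, seed=42):
    """Shuffle and split into 70% train / 20% test / 10% validation."""
    items = list(items)
    random.Random(seed).shuffle(items)  # deterministic shuffle for reproducibility
    n = len(items)
    n_train = int(n * train)
    n_test = int(n * test)
    return (items[:n_train],
            items[n_train:n_train + n_test],
            items[n_train + n_test:])          # remainder (~10%) is validation

images = [f"img_{i:04d}.jpg" for i in range(1000)]   # hypothetical file names
train_set, test_set, val_set = split_dataset(images)
print(len(train_set), len(test_set), len(val_set))  # 700 200 100
```

Shuffling before slicing matters: without it, images gathered from the same source would cluster into one partition and bias the evaluation.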
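The precision figures reported in Section 4.6 follow the standard definitions computed from detection counts. A minimal sketch, where the TP/FP/FN counts are hypothetical and chosen only to illustrate the formulas:

```python
def precision_recall_f1(tp, fp, fn):
    """Detection metrics from true-positive, false-positive, false-negative counts."""
    precision = tp / (tp + fp)               # fraction of detections that are correct
    recall = tp / (tp + fn)                  # fraction of real fires that are detected
    f1 = 2 * precision * recall / (precision + recall)  # harmonic mean of the two
    return precision, recall, f1

# Hypothetical counts for illustration (not the paper's actual confusion matrix).
p, r, f1 = precision_recall_f1(tp=874, fp=126, fn=200)
print(round(p, 3), round(r, 3), round(f1, 3))
```

mAP@50 extends this idea by averaging precision over the precision-recall curve at an IoU threshold of 0.5, and mAP@50:95 averages that again over IoU thresholds from 0.5 to 0.95.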
4.6 Performance Evaluation:

- Box Loss: The training box loss declined across the epochs, improving localization accuracy.
- Classification Loss (Cls Loss): The classification loss, indicating the correctness of class predictions, dropped from 2.35903 in epoch 1 to 0.38772 in epoch 100.

4.6.4 Performance Metrics

- Precision (B): Precision increased from 0.73039 in epoch 1 to 0.874 in epoch 100, reflecting the decrease in false positives.
- mAP50 (B): The mean Average Precision at IoU threshold 0.5 improved from 0.64417 in epoch 1 to 0.8395 in epoch 100.
- mAP50-95 (B): The mean Average Precision across IoU thresholds of 0.5 to 0.95 went up from 0.32565 in epoch 1 to 0.55355 in epoch 100, showing improved overall detection precision across different IoU thresholds.

4.7 DIAGRAMS

Precision-Recall (PR) Curve and F1-Score Curve

Graphical Representation

4.8 Validation and Training Batch Images

Ablation Study on Model Performance

Ablation studies are of great importance in deep learning research for analysing how different training settings affect the performance of the model. This paper discusses five different experimental setups for evaluating how modifications in training settings affect model performance metrics, focusing on training epochs, augmentation strategies, and different versions of the YOLO models.

One representative configuration, YOLOn fine-tuned for 50 epochs, achieved: loss 0.37061, precision 0.87729, recall 0.79808, mAP@50 0.84333, mAP@50:95 0.56345, and a training time of 7686.16 s.

Discussion: Each experiment provides insight into how training settings affect the model's performance. Below, we discuss key observations from each scenario:
Training Without Augmentations (100 Epochs): The total loss decreased to 0.57935, showing improved model convergence over a longer training period. Precision increased to 0.8762, while recall also improved to 0.80019. mAP@50 improved to 0.85791, suggesting better generalization despite the absence of augmentations. mAP@50:95 increased slightly to 0.52438, demonstrating improved performance across multiple IoU thresholds. The total training time was reduced to 6206.16 s, showing efficiency gains with extended training. This experiment suggests that longer training improves generalization, even without augmentations, but at the cost of a slight recall drop.

Training Without Augmentations on YOLOm (50 Epochs): The total loss increased to 0.82362, the highest among all configurations. Precision decreased slightly to 0.84063, while recall dropped to 0.7175. mAP@50 was slightly lower at 0.818, suggesting reduced detection performance. mAP@50:95 dropped further to 0.48104, showing weaker generalization compared to the other models. The total training time was the lowest, at 5012.48 s, making this the fastest configuration. This suggests that YOLOm struggles without augmentations, leading to lower recall and overall performance, despite training quickly.

Training With Augmentations on YOLOn (50 Epochs): The loss decreased to 0.62615, improving stability over the baseline. Precision remained high at 0.85554, showing consistent performance in detecting wildfire cases. Recall improved to 0.77777, indicating that augmentations helped capture more positive cases. mAP@50 improved to 0.8388, showing a well-balanced model. mAP@50:95 remained stable at 0.51096, suggesting generalization similar to the baseline. Training time was 6968.45 s, making this setup more computationally intensive than YOLOm but better in accuracy. Another configuration recorded a loss of 0.82362, suggesting it may generalize well without additional data transformations.

Augmentations did not always lead to significant improvements, especially in recall, where augmentation on YOLOvN resulted in a lower recall value (0.62615) than the unaugmented setups.

Confidence scores remained stable, indicating that while loss values varied, the certainty of predictions was largely unchanged.

Augmentations may not always be necessary and should be evaluated per dataset and model type.

V. ACKNOWLEDGMENT

We sincerely thank SRM IST TIRUCHIRAPPALLI for providing the necessary resources and the conducive academic environment required for the project's success. We express our heartfelt thanks to our mentors, Manivannan Chandrakumar sir and Shanmugapriya mam, for their useful guidance, positive feedback, and constant support throughout this research. Our study has been deeply enriched by their professional expertise. We also want to thank Prabu sir, the subject handler of Database Management System, for his advice and support, which have gone a long way toward making this work of such quality. Finally, we appreciate all those who directly or indirectly helped this research. Their efforts have contributed to its successful completion.

VI. REFERENCES

• Avazov, K., Hyun, A. E., S, A. A., Abdusalomov, A. B., & Cho, Y. I. (2023). Forest fire detection and notification method based on AI and IoT approaches. Future Internet, 15(2), 61.

• Rashkovetsky, D., Mauracher, F., Langer, M., & Schmitt, M. (2021). Wildfire Detection from Multisensor Satellite Imagery Using Deep Semantic Segmentation.