Part 1
A
Project Report
on
BACHELOR OF ENGINEERING
in
DEPARTMENT OF INFORMATION SCIENCE AND ENGINEERING
Shylaja M (1KT22IS406)
CERTIFICATE
Certified that the project work prescribed in 21ISP76 entitled “LUNG CANCER DETECTION
USING MACHINE LEARNING TECHNIQUES” is a bona fide work carried out by BHASKAR
JHA (1KT21IS007), DHEEKSHITHA J S (1KT21IS014), KHUSHI CHAUHAN (1KT21IS020)
and SHYLAJA M (1KT22IS406), students of Sri Krishna Institute of Technology, Bengaluru, in
partial fulfilment for the award of Bachelor of Engineering in Information Science and Engineering of
the Visvesvaraya Technological University, Belagavi, during the year 2024-25. It is certified that all
corrections / suggestions indicated for Internal Assessment have been incorporated in the project report
deposited in the departmental library. The project report has been approved as it satisfies the academic
requirements with respect to project work prescribed for the said Bachelor of Engineering Degree.
ABSTRACT
In today’s world, rising health issues are leading to an urgent need for effective diagnostic tools, with
cancer remaining one of the most formidable challenges. Lung cancer, in particular, is a significant
public health problem due to its high mortality rate, which is largely attributed to diagnoses often
occurring at an advanced stage. Research shows that early detection is essential for improving
survival rates, as it allows for timely and potentially less aggressive treatment interventions. Our
project seeks to address this issue by utilizing advanced image analysis techniques on X-ray and CT
scan data to detect lung cancer at earlier stages.
We are implementing a robust machine learning framework that combines deep learning and
computer vision methods to analyze medical images with high precision. Techniques such as
convolutional neural networks (CNNs) are applied to detect patterns and anomalies that may indicate
early signs of lung cancer, even when these signs are not easily discernible to the human eye.
Additionally, by training our models on extensive datasets, we aim to improve the sensitivity and
specificity of lung cancer detection.
ACKNOWLEDGEMENT
The completion of the project work brings with it a sense of satisfaction, but it is never complete without
thanking the persons responsible for its successful completion.
At the outset, we express our most sincere grateful acknowledgment to the holy sanctum “Sri
Krishna Institute of Technology”, the temple of learning, for giving us an opportunity to pursue
the degree course in Information Science and Engineering and thus helping us in shaping our career.
We extend our deep sense of sincere gratitude to Dr. Mahesha K, Principal, Sri Krishna
Institute of Technology, Bangalore, for providing us this opportunity.
We express our heartfelt sincere gratitude to our guide and HOD, Dr. Hemalatha K.L,
Department of Information Science and Engineering, Sri Krishna Institute of Technology,
Bangalore, for her valuable suggestions and support.
We extend our heartfelt gratitude to our Project Coordinator, Mrs. Ragini
Krishna, Assistant Professor, Department of Information Science and Engineering, Sri
Krishna Institute of Technology, Bangalore, for her constant support and valuable guidance towards the
completion of the project work.
We would like to thank all the teaching and non-teaching staff members in our Department of
Information Science and Engineering, Sri Krishna Institute of Technology, Bangalore, for their
support.
Finally, we would like to thank all our friends and family members for their constant support,
guidance and encouragement.
TABLE OF CONTENTS
Abstract
Acknowledgement
Table of Contents
6 TESTING
7 RESULTS
7.1 RESULTS
8 CONCLUSION AND FUTURE ENHANCEMENT
8.1 CONCLUSION
8.2 FUTURE ENHANCEMENT
BIBLIOGRAPHY
APPENDIX
TABLE OF FIGURES
Forest fires are a serious environmental hazard, with devastating impacts on biodiversity, human
life, and economic resources. These fires not only lead to the loss of flora and fauna but also
contribute significantly to air pollution and greenhouse gas emissions, accelerating climate change.
The frequency and intensity of forest fires have increased due to factors such as rising global
temperatures and prolonged droughts, creating an urgent need for effective monitoring and
management solutions.
Traditional methods for detecting and managing forest fires rely heavily on satellite imagery and
ground surveillance, which often suffer from limitations in spatial resolution, delayed response
times, and difficulty in monitoring remote areas.
These limitations make it challenging to detect fires at an early stage, especially in dense or rugged
forest environments where rapid response is crucial to containment. Furthermore, predicting the
spread of fires based on weather patterns and environmental conditions is complex, requiring
real-time data integration and advanced forecasting models.
EcoGuard’s real-time data collection and analysis capabilities enable precise fire detection, spread
prediction, and timely alerts to relevant authorities, enhancing response effectiveness. This
introduction outlines the importance of the problem, the limitations of existing methods, and how
EcoGuard offers an innovative solution. It sets the context for the rest of the report by highlighting
the project's goals and impact.
Ecoguard Introduction
1. Sensor Module
   Initialize sensors (humidity, temperature, and gas sensors)
   Read sensor values continuously
   Send data to the processing module
2. Data Preprocessing
   Normalize sensor data
   Remove anomalies and noise
   Forward cleaned data to the ML model
3. Machine Learning (TensorFlow Lite)
   Load the TensorFlow Lite model
   Input data into the model
   Output anomaly status (0 = Normal, 1 = Anomaly)
4. Visualization and Alerting
   Update real-time charts (temperature, humidity, gas level)
   Display anomaly status on the dashboard
   Send email alerts if an anomaly is detected
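The data preprocessing stage listed above (normalize, then drop noisy readings) could be sketched roughly as follows. This is only an illustration in plain C++, not the project's firmware: the `Reading` struct, the sensor ranges, and the function names are assumptions introduced here, and min-max scaling is one possible choice of normalization.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical container for one sample from the sensor module.
struct Reading {
    float temperatureC;  // temperature in degrees Celsius
    float humidityPct;   // relative humidity in percent
    float gasPpm;        // gas sensor level in ppm
};

// Assumed plausible ranges (illustrative values, not taken from the report).
constexpr float kTempMin = -40.0f, kTempMax = 80.0f;
constexpr float kHumMin = 0.0f, kHumMax = 100.0f;
constexpr float kGasMin = 0.0f, kGasMax = 5000.0f;

// Min-max scaling of a value into [0, 1], clamped at both ends.
float normalize(float value, float lo, float hi) {
    assert(hi > lo);
    const float scaled = (value - lo) / (hi - lo);
    return std::max(0.0f, std::min(1.0f, scaled));
}

// A reading counts as noise if any field is NaN or outside its assumed range
// (comparisons with NaN are false, so NaN readings are rejected as well).
bool isPlausible(const Reading& r) {
    auto inRange = [](float v, float lo, float hi) { return v >= lo && v <= hi; };
    return inRange(r.temperatureC, kTempMin, kTempMax) &&
           inRange(r.humidityPct, kHumMin, kHumMax) &&
           inRange(r.gasPpm, kGasMin, kGasMax);
}

// Drops implausible readings and scales the remaining ones for the model input.
std::vector<Reading> preprocess(const std::vector<Reading>& raw) {
    std::vector<Reading> cleaned;
    for (const Reading& r : raw) {
        if (!isPlausible(r)) continue;
        cleaned.push_back(Reading{normalize(r.temperatureC, kTempMin, kTempMax),
                                  normalize(r.humidityPct, kHumMin, kHumMax),
                                  normalize(r.gasPpm, kGasMin, kGasMax)});
    }
    return cleaned;
}
```

The cleaned, scaled readings would then be passed to the model stage; the actual ranges should be taken from the sensor datasheets and the training data.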
1.3 OBJECTIVES
• To provide multi-level alert systems based on fire intensity (e.g., yellow alert for minor risks,
red alert for high risks).
• To collect data to continuously improve the accuracy of the AI model through machine
learning.
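As an illustration of the multi-level alert objective, the mapping from an estimated fire risk to an alert colour might look like the sketch below. The 0-to-1 risk score, the two cut-off values, and all names are hypothetical; the report does not specify how the levels are derived.

```cpp
#include <string>

// Alert levels named after the yellow/red scheme in the objectives.
enum class AlertLevel { None, Yellow, Red };

// Hypothetical cut-offs on a risk score in [0, 1]; real values would need tuning.
constexpr float kYellowThreshold = 0.4f;
constexpr float kRedThreshold = 0.7f;

// Maps a risk score to an alert level, checking the most severe level first.
AlertLevel classifyRisk(float riskScore) {
    if (riskScore >= kRedThreshold) return AlertLevel::Red;
    if (riskScore >= kYellowThreshold) return AlertLevel::Yellow;
    return AlertLevel::None;
}

// Human-readable label, e.g. for a dashboard or an alert subject line.
std::string alertLabel(AlertLevel level) {
    switch (level) {
        case AlertLevel::Red: return "RED";
        case AlertLevel::Yellow: return "YELLOW";
        default: return "NONE";
    }
}
```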
The project, "EcoGuard," is a system aimed at tackling forest fires through advanced technology.
EcoGuard integrates IoT sensors, such as those for temperature, humidity, and smoke, with AI
algorithms to detect, predict, and manage forest fires in real time. The project's goal is to
address the limitations of traditional methods like satellite imagery and manual ground
surveillance, which are slow and less effective in remote areas. By providing early detection,
spread prediction, and timely alerts, EcoGuard aims to enhance firefighting response, protect
biodiversity, and mitigate economic losses. The report includes a literature review of related
technologies, identifies gaps such as the need for real-time data integration and dynamic fire spread
prediction, and sets objectives to refine AI accuracy and strengthen collaboration with forest management.
The paper presents a smart system designed to detect and monitor fires, smoke, and gas
leaks, with a focus on regions like the GCC where oil and gas leaks pose significant risks. The
proposed system uses solar panels as a renewable energy source, integrating various sensors for
real-time fire and gas detection, and notifying relevant authorities. Additionally, it allows
continuous monitoring of buildings or areas under fire through desktop or mobile devices. Key
features include automatic fire extinguishing mechanisms, redundancy through multiple sensors for
improved accuracy, and an automatic cleaning system for the solar panels to maintain efficiency in
dusty environments like Kuwait. The system architecture uses a PC instead of a microcontroller to
increase computational power, which the authors consider well suited to the GCC environment. The paper
concludes by highlighting the advantages of the system, including its scalability and reliability,
while also acknowledging limitations such as the need for further integration of wireless
communication and IoT in future improvements.
The document "Using IoT and ML for Forest Fire Detection, Monitoring, and Prediction: A
Literature Review" explores the use of Internet of Things (IoT) and machine learning (ML) for
detecting, monitoring, and predicting wildfires. Forest fires are major hazards that cause
deforestation, environmental damage, and air pollution. The paper emphasizes the importance of
early-warning systems and evaluates existing literature on leveraging IoT and deep learning
technologies to address forest fires. It discusses various sensors, including temperature, humidity,
and CO detectors, that monitor environmental data to detect potential fire outbreaks. IoT systems
collect data, which is then analyzed using ML models for accurate fire prediction and early
response. The review categorizes existing approaches into image-based and sensor-based detection,
comparing their effectiveness and technological requirements. Overall, it highlights the potential of
integrating IoT and AI for more efficient wildfire management, prevention, and response strategies.
[5] MURUGAPERUMAL KRISHNAMOORTHY, Md ASIF, ILHAMI COLAK, “A Design
and Development of the Smart Forest Alert Monitoring System Using IoT”
The paper discusses the design and development of a Smart Forest Alert Monitoring System
using IoT. The system aims to detect and mitigate forest fires and unlawful deforestation activities
in real-time by employing wireless sensor networks (WSNs) integrated with IoT. Key components
include temperature, humidity, smoke, and vibration sensors to monitor environmental parameters,
detect fire, and identify human activities such as tree cutting.
The data collected is transmitted to a cloud server using 4G/LTE for real-time monitoring by
authorities. The system automatically activates preventive actions like water sprinkling or CO2 fire
extinguishers upon detecting hazardous events. A case study conducted in Hyderabad demonstrates
the system's effectiveness in early wildfire detection and forest protection. The study highlights the
accuracy, quick response, and scalability of the system for forest fire prevention, with potential
applications in industrial, park, and urban settings.
The paper "Smart Forests: Fire Detection Service" explores a fire detection system within the
context of Smart Forests using IoT. The focus is on Edge Computing through Mobile Hubs (M-Hubs)
to improve cost-efficiency and scalability. These hubs collect and process data from Bluetooth Low
Energy sensors installed in forests to monitor temperature and humidity. By utilizing ContextNet
middleware, the system analyzes environmental data and detects potential fire hazards in real-time.
The architecture supports up to 1,500 connected mobile objects, enabling rapid notifications to
authorities like fire departments. The system’s low-cost infrastructure, relying on mobile devices
carried by forest guards and visitors, enhances early detection and response to wildfires. The research
demonstrates scalability and efficiency in fire detection, proposing future improvements like
integrating solar-powered sensors and drone support for data collection.
3. Anomaly Detection: Use AI/ML models for detecting unusual environmental conditions.
4. Email Alerts: Send alerts based on detected anomalies such as high fire risk.
5. Collaboration Tools: Enable data sharing with forest management authorities for decision-
making.
1. Scalability: Support integration with additional sensors or platforms like AWS IoT and
Google Cloud.
4. User Experience: Provide an intuitive interface for real-time data analysis and alerts.
5. Portability: Include future support for mobile apps for on-the-go monitoring.
3. Power Supply: Battery packs or renewable energy sources like solar panels for
uninterrupted power.
5. Edge Devices: Drones or cameras for enhanced coverage and real-time monitoring.
Ecoguard Requirements
2. AI Frameworks: TensorFlow or PyTorch for developing and deploying machine learning models.
3. Database Management: Cloud-based databases like Firebase or AWS for data storage and retrieval.
5. Communication Protocols: MQTT or HTTP for data exchange between sensors and servers.
CHAPTER 4
METHODOLOGY
• The process starts with input images, such as CT scans of the lungs, which serve as raw data for
the system.
• These input images go through a preprocessing step to enhance their quality and remove noise.
Techniques like resizing, normalization, and noise reduction are used to prepare the images for analysis.
• After preprocessing, the images are passed through Convolutional, ReLU, and Pooling layers of
a Convolutional Neural Network (CNN).
• Convolution extracts key features like edges, shapes, and patterns from the images.
• ReLU introduces non-linearity, allowing the network to learn complex relationships in the data.
• Pooling reduces the size of the feature maps, retaining important information while improving
computational efficiency.
• The extracted features are then sent to the Fully Connected Network (FCN) for training. At this stage,
the model learns to distinguish between "cancerous" and "non-cancerous" images.
• After training, the system generates a Trained Model that can analyze new, unseen lung images.
• During testing, new images are again preprocessed to match the format used in the training phase.
• The trained model then analyzes the new images to predict whether they indicate the presence of
lung cancer.
• Finally, the system produces a Result, which indicates whether the input image shows signs of lung
cancer or not.
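The three feature-extraction operations described above (convolution, ReLU, pooling) can be illustrated with a small sketch in plain C++. It is a teaching example under stated assumptions, not the project's model code: a single-channel image stored as a nested vector, a stride-1 convolution without padding, and 2x2 max pooling with stride 2. The `Matrix` alias and function names are introduced here for illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

using Matrix = std::vector<std::vector<float>>;

// Stride-1 "valid" 2D convolution as used in CNNs (technically cross-correlation):
// the kernel slides over the image and each output is a weighted sum of a patch.
Matrix convolve2d(const Matrix& image, const Matrix& kernel) {
    assert(!image.empty() && !kernel.empty());
    assert(image.size() >= kernel.size() && image[0].size() >= kernel[0].size());
    const std::size_t outRows = image.size() - kernel.size() + 1;
    const std::size_t outCols = image[0].size() - kernel[0].size() + 1;
    Matrix out(outRows, std::vector<float>(outCols, 0.0f));
    for (std::size_t r = 0; r < outRows; ++r)
        for (std::size_t c = 0; c < outCols; ++c)
            for (std::size_t i = 0; i < kernel.size(); ++i)
                for (std::size_t j = 0; j < kernel[0].size(); ++j)
                    out[r][c] += image[r + i][c + j] * kernel[i][j];
    return out;
}

// ReLU: negative activations become zero, positive ones pass through unchanged.
Matrix relu(Matrix m) {
    for (auto& row : m)
        for (float& v : row) v = std::max(v, 0.0f);
    return m;
}

// 2x2 max pooling with stride 2: keeps the largest value in each 2x2 block.
// A trailing odd row or column is dropped.
Matrix maxPool2x2(const Matrix& m) {
    const std::size_t outRows = m.size() / 2;
    const std::size_t outCols = m.empty() ? 0 : m[0].size() / 2;
    Matrix out(outRows, std::vector<float>(outCols, 0.0f));
    for (std::size_t r = 0; r < outRows; ++r)
        for (std::size_t c = 0; c < outCols; ++c)
            out[r][c] = std::max(std::max(m[2 * r][2 * c], m[2 * r][2 * c + 1]),
                                 std::max(m[2 * r + 1][2 * c], m[2 * r + 1][2 * c + 1]));
    return out;
}
```

A feature map would be produced by chaining the calls, for example `maxPool2x2(relu(convolve2d(image, kernel)))`. In a trained network the kernel weights are learned rather than hand-written.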
Ecoguard Methodology
• The user requests to view sample images, and the Data Visualization Module displays these
sample images.
• The user uploads a lung tissue dataset, which is sent to the Data Preparation Module for
processing.
• The Data Preparation Module processes the data and returns the prepared dataset to the user.
• The prepared dataset is passed to the Training Module, which trains the machine learning model.
• The Training Module provides training progress updates back to the user.
• Once the model is trained, the user uploads an image for classification to the Prediction Module.
• The Prediction Module processes the image and returns the predicted class and an explanation
to the user.
• The user then requests model evaluation results, and the request is sent to the Evaluation Module.
• The Evaluation Module processes the request and returns the evaluation metrics to the user.
Dept of ISE, SKIT 11 2024-2025
The workflow begins with the user requesting to view sample images, which are displayed by the
Data Visualization Module. Next, the user uploads a lung tissue dataset, which is processed by the
Data Preparation Module to generate a prepared dataset for training. This prepared dataset is then
passed to the Training Module, where the machine learning model is trained, and progress updates
are provided to the user.
Once the training is complete, the user can upload an image for classification. The Prediction
Module processes the image and returns the predicted class along with an explanation. Finally, the
user can request evaluation results, which are processed by the Evaluation Module and returned as
performance metrics.
This structured approach ensures a seamless pipeline for viewing, preparing, training, predicting,
and evaluating lung tissue datasets, providing an accurate and explainable solution for lung disease
detection.
IMPLEMENTATION
5.1 ALGORITHM
The proposed algorithm is as follows:
Initialize ESP32, sensors (DHT22 for temperature and humidity, MQ135 for gas
detection), and Wi-Fi connection.
Collect sensor readings for: temperature from DHT22, humidity from DHT22, and gas level
from MQ135.
Rule 1: If Temperature > 30°C AND Gas Level > 2500 ppm, trigger an alert.
Rule 2: If Humidity < 60% AND Gas Level > 2000 ppm, trigger an alert.
Rule 3: If Temperature is 30–35°C, Humidity is 70–80%, AND Gas Level > 1500 ppm,
trigger an alert.
Format the email body with the current temperature, humidity, and gas readings.
Serve real-time sensor data on the ESP32 web server through the following endpoints:
/temperature, /humidity, /gas, /anomaly.
STEP 8: Repeat:
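The three alert rules and the email formatting step above can be expressed compactly as in the following sketch. It is written in portable C++ so that the logic can be checked off-device. The `SensorData` struct and function names are assumptions, the 30–35°C and 70–80% ranges in Rule 3 are treated as inclusive, and the message wording is illustrative rather than the project's actual email text.

```cpp
#include <cstdio>
#include <string>

// Hypothetical snapshot of the three monitored quantities.
struct SensorData {
    float temperatureC;
    float humidityPct;
    float gasPpm;
};

// Rule 1: hot and very high gas level.
bool rule1(const SensorData& s) { return s.temperatureC > 30.0f && s.gasPpm > 2500.0f; }

// Rule 2: dry air and high gas level.
bool rule2(const SensorData& s) { return s.humidityPct < 60.0f && s.gasPpm > 2000.0f; }

// Rule 3: warm, humid, and elevated gas level (ranges assumed inclusive).
bool rule3(const SensorData& s) {
    return s.temperatureC >= 30.0f && s.temperatureC <= 35.0f &&
           s.humidityPct >= 70.0f && s.humidityPct <= 80.0f &&
           s.gasPpm > 1500.0f;
}

// An alert is triggered if any one of the rules fires.
bool shouldAlert(const SensorData& s) { return rule1(s) || rule2(s) || rule3(s); }

// Builds a plain-text email body containing the current readings.
std::string formatEmailBody(const SensorData& s) {
    char buf[160];
    std::snprintf(buf, sizeof(buf),
                  "Anomaly detected.\nTemperature: %.1f C\nHumidity: %.1f %%\nGas level: %.0f ppm\n",
                  s.temperatureC, s.humidityPct, s.gasPpm);
    return std::string(buf);
}
```

On the device, `shouldAlert` would be evaluated once per sampling cycle and the formatted body passed to the mail client only when it returns true.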
1. Arduino code
// Select the Wi-Fi and async web server headers for the target board.
#ifdef ESP32
#include <WiFi.h>
#include <ESPAsyncWebServer.h>
#include <LittleFS.h>
#else
#include <ESP8266WiFi.h>
#include <ESPAsyncTCP.h>
#include <ESPAsyncWebServer.h>
#include <LittleFS.h>
#endif
#include <DHT.h>
#include <ESP_Mail_Client.h>

// Define DHT parameters
#define DHTPIN 15      // GPIO pin for DHT22
#define DHTTYPE DHT22  // DHT22 sensor type
DHT dht(DHTPIN, DHTTYPE);

// Define MQ135 gas sensor parameters
#define MQ135PIN 34  // GPIO pin for MQ135 sensor

// Wi-Fi credentials (password replaced with a placeholder; keep real secrets out of source)
const char* ssid = "Akash";
const char* password = "YOUR_WIFI_PASSWORD";

// Email settings
SMTPSession smtp;
ESP_Mail_Session mailSession;
SMTP_Message message;

// Email credentials
const char* smtpHost = "smtp.gmail.com";
const int smtpPort = 465;
const char* emailSenderAccount = "akashs.ise@skit.org.in";
const char* emailSenderPassword = "YOUR_APP_PASSWORD";  // Use app password if 2FA is enabled
const char* emailRecipient = "akashakshay062@gmail.com";
Testing ensures the functionality, reliability, and robustness of the project. The testing
phases and their details are as follows:
1. Functional Testing
Objective: To verify the functionality of individual components (e.g., sensors, decision logic, and
email alerts).
Test Scenarios:
Sensor Accuracy: Verify that DHT22 provides correct temperature and humidity readings.
Web Server: Test endpoints (/temperature, /humidity, /gas, /anomaly) for correct data output.
2. Boundary Testing
Objective: To validate the system behavior at or near threshold values.
Test Scenarios: Test for exact threshold values:
Temperature = 30°C, Humidity = 60%, Gas Level = 2500 ppm.
Test for values slightly above or below thresholds:
Temperature = 29.9°C, 30.1°C, Humidity = 59%, 61%, Gas Level = 2499 ppm, 2501 ppm
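The boundary scenarios above can be captured as a small table-driven check. The sketch below covers only Rule 1 (Temperature > 30°C AND Gas Level > 2500 ppm) and runs on a desktop compiler rather than on the ESP32. The expected outcomes assume that the comparisons are strict, so a reading exactly at a threshold does not trigger an alert; the struct and function names are hypothetical.

```cpp
#include <cstddef>

// Rule 1 as stated in the algorithm: both readings must be strictly above threshold.
bool rule1Alert(float temperatureC, float gasPpm) {
    return temperatureC > 30.0f && gasPpm > 2500.0f;
}

// One boundary scenario together with the outcome expected under strict ">".
struct BoundaryCase {
    float temperatureC;
    float gasPpm;
    bool expectAlert;
};

const BoundaryCase kRule1Cases[] = {
    {30.0f, 2500.0f, false},  // exactly at both thresholds
    {29.9f, 2501.0f, false},  // temperature just below, gas just above
    {30.1f, 2499.0f, false},  // temperature just above, gas just below
    {30.1f, 2501.0f, true},   // both just above
};

// Returns the number of cases whose actual result differs from the expectation.
int countRule1Failures() {
    int failures = 0;
    for (std::size_t i = 0; i < sizeof(kRule1Cases) / sizeof(kRule1Cases[0]); ++i) {
        const BoundaryCase& c = kRule1Cases[i];
        if (rule1Alert(c.temperatureC, c.gasPpm) != c.expectAlert) ++failures;
    }
    return failures;
}
```

The same pattern extends to the humidity threshold of Rule 2 by adding a second table.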
3. Stress Testing
Test Scenarios:
Sensor readings.
4. Performance Testing
Metrics:
Web Server Latency: Time taken to fetch sensor data from the web server.
Test Scenarios:
7. Observations
8. Summary
The testing phase confirmed the system’s robustness and reliability. It successfully identified
anomalies, sent alerts, and provided real-time data without interruptions. Minor improvements could
focus on further optimizing response times and error handling.
Table 7.1: Performance Metrics of RNN Model for Lung Cancer Detection
• Accuracy: 63%
• Macro Average: Precision: 0.49 | Recall: 0.64 | F1-Score: 0.53
• Weighted Average: Precision: 0.48 | Recall: 0.63 | F1-Score: 0.52
2. Confusion Matrix
The confusion matrix highlights the model's predictions and errors:
Table 7.2: Confusion Matrix for RNN Model Predictions on Lung Cancer Detection
The model correctly classified lung_aca and lung_n to some extent but failed to identify lung_scc
entirely.
8.1 CONCLUSION
The IoT-based Environmental Monitoring and Alert System successfully demonstrates the
integration of sensors, microcontrollers, and decision-making logic to monitor temperature,
humidity, and gas levels in real time. The system ensures timely detection of anomalies and
provides instant email alerts to users, enhancing safety and awareness. By leveraging ESP32 for
real-time data hosting and combining rule-based logic with flexible decision-making, the project
offers a reliable, scalable, and cost-effective solution for critical applications such as industrial
safety, fire detection, and environmental monitoring. This project highlights the potential of IoT
technology in creating smarter and safer environments.
8.2 FUTURE ENHANCEMENT
Integration with IoT Platforms: Connect the system to platforms like AWS IoT, Google Cloud, or
ThingSpeak for advanced data analysis, long-term storage, and remote access.
Mobile Application: Develop a mobile app for real-time monitoring, push notifications, and an
intuitive user interface.
Support for Additional Sensors: Include more sensors like flame detectors, smoke sensors, and
PM2.5 sensors for enhanced environmental monitoring capabilities.
Scalability: Implement a mesh network of ESP32 devices to monitor larger areas and aggregate data
from multiple nodes.
Renewable Energy Integration: Power the system using solar panels or other renewable energy
sources for long-term, off-grid operation.
BIBLIOGRAPHY
[1] Mpho Mokate, Vukoti Marinate, “A review and comparative study of cancer detection using
Machine Learning”, Vol. 47, No. 5, pp. 42-46, Year 2022.
[2] R. Wulandari, R. Sight, and S. Warhan, “Automatic lung cancer detection using color
histogram calculation”, Vol. 2, No. 20, pp. 45-46, Year 2024.
[3] Vani Rajshekar, S Premkumar, “Lung cancer disease prediction with CT scan and
histopathological images feature analysis using deep learning technology”, Vol. 46, No. 5, pp.
53-58, Year 2024.
[4] Imran Nazir, Mostafa Darshan, “Machine learning-based lung cancer detection using image
registration and fusion”, Vol. 7, No. 6, pp. 66-68, 5–8, Year 2023.
APPENDIX