Mega Report Final

The project report outlines the development of a machine learning-based Heart Attack Prediction System, focusing on stroke risk assessment using health-related factors. It employs Decision Tree and K-Nearest Neighbors (KNN) algorithms to analyze data and predict stroke likelihood, emphasizing the importance of data preprocessing for model accuracy. The findings aim to enhance preventive healthcare by identifying at-risk individuals for timely interventions.

Uploaded by

codingqueens2024

MAHARASHTRA STATE BOARD OF TECHNICAL EDUCATION

MUMBAI
A
PROJECT REPORT ON
“Heart Attack Prediction System”

UNDER THE GUIDANCE (MENTOR) OF

Mrs. C. B. Kamble

DEPARTMENT OF COMPUTER ENGINEERING

DR. D. Y. PATIL POLYTECHNIC,

KASABA BAWADA, KOLHAPUR.

SEMESTER - V
YEAR: - 2024-25

SUBMITTED BY: -

Sr. No Roll No Enrollment No Name

1. 3167 2205390069 Samruddhi Sanjay Nikam
2. 3168 2205390141 Tanisha Parag Karekar
3. 3169 2205390201 Parineeta Dattatray Varute
4. 3170 2205390224 Samruddhi Bhimrao Metil
CERTIFICATE

THIS IS TO CERTIFY THAT G-19 FROM DR. D. Y. PATIL POLYTECHNIC, HAVING
ENROLLMENT NO- 2205390069, 2205390141, 2205390201, 2205390224,
HAVE SUCCESSFULLY COMPLETED ‘PROJECT REPORT’ HAVING TITLE
“Heart Attack Prediction System Using ML” IN A GROUP CONSISTING OF 4
PEOPLE UNDER THE GUIDANCE OF MRS. C. B. KAMBLE.

Project Guide Head of Department Principal

Seal of Institute
ACKNOWLEDGEMENT

We would like to express our sincere gratitude to all those who helped us complete this
Capstone Project Planning (CPP) project report. During the work we faced many
challenges due to our lack of knowledge and experience, but these people helped us
overcome the difficulties and shape our idea into its final form.

We would like to thank Mrs. C. B. Kamble for her supervision and guidance, because of
which we were able to learn the finer aspects of Computer Engineering.

We would also like to express our gratitude to the head of the computer department, Dr. P. K.
Shinde, for their continuous help and monitoring during the work. We are thankful to the
supporting staff of our department for their help and support towards our project.

Finally, we would like to thank the management of Dr. D. Y. Patil Polytechnic for
providing us with such an opportunity to learn from this experience.

We are also thankful to our whole class and, most of all, to our parents, who have inspired us
to face all the challenges and overcome all the hurdles in life.
DECLARATION

We hereby declare that the Project Report entitled “Heart Attack Prediction System
Using ML”, being submitted by us towards the partial fulfilment of the Diploma in
Engineering in the Department of Computer Engineering, is a Capstone project work
carried out by our team under the supervision of Mrs. C. B. Kamble.

We will be solely responsible if any kind of plagiarism is found.

Enrollment No. Name of Student

2205390069 Samruddhi Sanjay Nikam
2205390141 Tanisha Parag Karekar
2205390201 Parineeta Dattatray Varute
2205390224 Samruddhi Bhimrao Metil
ABSTRACT

This project aims to develop a machine learning-based system for predicting stroke risk by analyzing key
health-related factors such as age, BMI, gender, and lifestyle behaviors. Stroke prediction is crucial in
preventive healthcare, as early identification of at-risk individuals allows for timely interventions and
reduces the burden on healthcare systems. To achieve accurate predictions, this system preprocesses the
data to handle missing values, encodes categorical variables, and scales numerical features.

Two machine learning models, Decision Tree and K-Nearest Neighbors (KNN), are implemented and
trained on a stroke dataset to classify individuals based on their likelihood of experiencing a stroke. Both
models are evaluated using performance metrics such as accuracy and confusion matrices to determine
which model performs best in terms of predictive accuracy and generalizability.

The resulting system can be applied across various healthcare settings, from clinical support in hospitals
to risk assessment in digital health applications. This project not only demonstrates the application of
machine learning in stroke risk prediction but also highlights the importance of data preprocessing and
model evaluation in building effective predictive systems.
CONTENT PAGE

Sr No. Title

1. INTRODUCTION AND BACKGROUND WORK
1.1 Introduction
1.2 Background work

2. LITERATURE SURVEY
2.1 Related work
2.2 Problems of existing system
2.3 Problem description

3. GOALS AND OBJECTIVES

4. APPLICATION

5. PROPOSED DETAILED METHODOLOGY
5.1 Proposed methodology
5.2 Problem definition

6. SYSTEM DESIGN DOCUMENTATION
6.1 System architecture
6.2 DFD
6.3 Use case diagram

7. REQUIREMENTS
7.1 Hardware requirements
7.2 Software requirements

8. REFERENCE AND BIBLIOGRAPHY

Heart attack prediction system

INTRODUCTION AND BACKGROUND

1.1 Introduction

This project focuses on predicting the likelihood of stroke in individuals using
machine learning techniques. Stroke is a medical emergency and a leading cause of death
and disability worldwide. Early prediction and preventive measures can significantly
reduce its impact. Through this project, we aim to build models that assist healthcare
professionals in identifying individuals at risk, thereby contributing to better patient
outcomes and reducing healthcare costs.

The project leverages a dataset containing various health-related attributes such
as age, gender, BMI, hypertension, and heart disease status. This data serves as the
foundation for our predictive models. To ensure the data is suitable for machine learning
algorithms, we perform essential preprocessing steps. These include handling missing
values, encoding categorical data, and scaling numerical features, all of which help
improve the accuracy and reliability of our models.

Two machine learning models are implemented and compared in this project:
the Decision Tree Classifier and the K-Nearest Neighbors (KNN) Classifier. The Decision
Tree model is known for its simplicity and interpretability, while KNN is a versatile model
that is often effective with normalized data. By comparing these models, we gain insight
into their respective strengths and weaknesses for stroke prediction, allowing us to choose
the most effective model for this particular use case.

The performance of each model is evaluated using accuracy scores on both
training and testing datasets. This evaluation helps us understand the models’ predictive power
and their potential for generalizing to new data.

1.2 Background work

Stroke is a leading cause of mortality and long-term disability worldwide,
creating a significant burden on individuals and healthcare systems. Early prediction of
stroke risk is essential for proactive management and prevention, potentially reducing
stroke incidence through timely intervention. Traditional risk assessments, which often
focus on single factors, can lack the comprehensive insight required for accurate
prediction.

Machine learning provides an advanced approach, analyzing multiple health features
(such as age, BMI, lifestyle factors, and medical history) to recognize complex patterns associated
with stroke risk. This project leverages two machine learning algorithms, Decision Tree and K-
Nearest Neighbors (KNN), to develop a predictive model for stroke risk assessment. Decision
Trees are chosen for their interpretability, which can offer healthcare providers actionable insights,
while KNN relies on similarity to known cases, often improving prediction in healthcare
applications with sufficient data.

LITERATURE SURVEY

i. The paper by P. B. Patil et al. [1] suggests an approach for feature extraction and
ensemble deep learning for identifying and predicting the outcome of heart
disease. The application of classifiers like Random Forest and Gaussian NB for
classification is examined in this article. The findings demonstrate that the
suggested method outperforms existing systems in terms of accuracy, achieving
98.5% for heart disease prediction. The paper also includes confusion matrices for
various features and classifiers. Using AlexNet features, for instance, the
Random Forest classifier achieves an overall classification accuracy of 96.43%.
With SIFT features, the Gaussian NB classifier obtains an overall classification
accuracy of 45.74%. In summary, the suggested method predicts heart disease
with high accuracy, and the Random Forest classifier on AlexNet features shows
the most encouraging results.
ii. The paper by M. Thangamani et al. [2] addresses how different algorithms and
techniques are used to classify heart disease. Multi-Layer Perceptron (MLP),
Fuzzy Unordered Rule Induction Algorithm (FURIA), Multinomial
Logistic Regression (MLR), Sequential Minimal Optimization (SMO), Bayes Net,
Support Vector Machine (SVM), K-Nearest Neighbor (KNN), Artificial Neural
Network (ANN), and C4.5 are some of the classifiers that have been mentioned.
Depending on the study and dataset used, these classifiers produce different results
and levels of accuracy. As an illustration, one study improved the prediction
accuracy of MLP by 11%, MLR by 9.3%, FURIA by 9.2%, and C4.5 by 9.4%.
Another study found that a hybrid random forest model could predict heart disease
with an accuracy of 88.7%. Furthermore, in identifying patients at higher risk, the
decision tree classifier demonstrated a sensitivity rate of 93.3% and a specificity
rate of 63.6%. Keep in mind that the accuracy and outcomes can change based on
the dataset, feature choices, preprocessing methods, and assessment metrics
applied in each study.
iii. A comparative analysis of various classifiers for the Heart Disease dataset's
classification is covered by K. M. Almustafa [3]. K-Nearest Neighbor (K-
NN), Naive Bayes, Decision tree J48, JRip, SVM, Adaboost, Stochastic Gradient
Descent (SGD), and Decision Table (DT) classifiers are among the ones that were
employed in the investigation. The K-NN (K = 1), Decision tree J48, and JRip
classifiers exhibit promising classification accuracy, according to the results:
the JRip classifier obtained 97.2683%, the Decision tree
J48 classifier reached 98.0488%, and the K-NN classifier achieved 99.7073%.
These classifiers performed better than Adaboost, Naive Bayes, SGD, SVM, and
Decision Table classifiers. For the K-NN, Decision tree J48, and JRip classifiers,
the corresponding Kappa statistics were 0.9941, 0.961, and 0.9454. To summarize,
the heart disease dataset was classified with high accuracy by the K-NN (K = 1),
JRip, and Decision tree J48 classifiers. At 99.7073%, the K-NN classifier had the
best accuracy.
iv. Senthilkumar Mohan et al. [4] suggested a Hybrid Random Forest with Linear
Model (HRFLM), a combination of a Random Forest algorithm and a linear
model; HRFLM turned out to be very accurate in its predictions, with fewer
errors than all the other tested algorithms. They used the UCI machine learning
database for their data. The dataset had 303 records and used 13 of the 76 features.
HRFLM scored an accuracy of 88.7%.
v. Youness Khourdifi et al. [5] concluded that each algorithm worked better in
certain situations. Random Forest, K-Nearest Neighbor, and Neural Networks
were the models that worked best with the dataset they used. Their results also
showed that the optimization hybrid approach significantly increased prediction
accuracy on medical datasets. They also suggested two dataset optimization methods: Particle
Swarm Optimization (PSO) and Ant Colony Optimization (ACO). They made a
hybrid of both methods and used it with K-Nearest Neighbor, which resulted in an
accuracy of 99.65%, and 99.6% with Random Forest. They got their dataset from
the UCI machine learning repository.
vi. Jaymin Patel et al. [6] suggested the J48 technique, which produced satisfactory
results and took minimal time to build. They used WEKA and the UCI Cleveland
dataset, which comprised 76 features and 303 entries. They used features such as
diagnosis classes, sex, age, the severity of chest pain, and others. They used 10-
fold Cross Validation with the J48 technique. Furthermore, they used two types of
Reduced Error Pruning: Post pruning and Online pruning. The J48 technique had
a test error of 0.1666667.

2.1 Related work

Numerous studies have explored the application of machine learning techniques
for predicting stroke risk by analyzing health-related datasets. Early research in this field
focused on traditional statistical models such as logistic regression to assess stroke risk
factors, using variables such as age, blood pressure, cholesterol, and lifestyle factors.
However, the advent of machine learning has led to more sophisticated and flexible
approaches, with models such as Decision Trees and K-Nearest Neighbors (KNN)
becoming popular due to their interpretability and adaptability to various data structures.

Decision Trees, in particular, are frequently used in medical data analysis for
their ease of interpretation, allowing healthcare professionals to trace the decision-making
process and understand which features are most influential in stroke risk. Research has
shown that Decision Trees can effectively capture complex patterns in patient data, often
improving diagnostic accuracy when combined with appropriate preprocessing
techniques. However, due to their tendency to overfit, particularly on small datasets,
Decision Trees require careful tuning and validation to ensure generalization.
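
To make the interpretability point concrete, the following is a minimal sketch of how a decision tree could choose a single split using Gini impurity. The toy ages and labels, and the helper names `gini` and `best_threshold`, are illustrative assumptions; they are not the project's data or code, and a real tree repeats this search over every feature and every node.

```python
def gini(labels):
    """Gini impurity of a list of 0/1 labels (0.0 means the group is pure)."""
    n = len(labels)
    if n == 0:
        return 0.0
    p = sum(labels) / n
    return 2 * p * (1 - p)


def best_threshold(values, labels):
    """Return (threshold, weighted_gini) for the best split 'value <= threshold'."""
    best = (None, float("inf"))
    # Skip the largest value: splitting there would leave the right side empty.
    for t in sorted(set(values))[:-1]:
        left = [y for x, y in zip(values, labels) if x <= t]
        right = [y for x, y in zip(values, labels) if x > t]
        score = (len(left) * gini(left) + len(right) * gini(right)) / len(labels)
        if score < best[1]:
            best = (t, score)
    return best


# Hypothetical toy data: patient ages and whether a stroke occurred (1) or not (0).
ages = [25, 32, 47, 51, 62, 68, 74, 80]
stroke = [0, 0, 0, 0, 1, 0, 1, 1]

threshold, impurity = best_threshold(ages, stroke)
print(f"split: age <= {threshold}, weighted Gini = {impurity:.3f}")
```

On this toy data the chosen rule is "age <= 51", which a clinician can read directly; this readability is the property the paragraph above refers to.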

K-Nearest Neighbors (KNN) has also been widely applied in stroke prediction
studies. As a distance-based model, KNN relies on the proximity of data points in feature
space to classify patients, which can be beneficial in medical applications where patients
with similar characteristics often exhibit similar health outcomes. KNN's performance is
highly dependent on feature scaling, as differences in scale between features can distort
distance calculations, emphasizing the importance of data normalization. Studies have
shown that KNN can yield high accuracy for stroke prediction, particularly when the
number of neighbors is optimized.
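
The effect of feature scale on distance can be illustrated with a small sketch. The six patients below are hypothetical, and `min_max_scale` and `nearest_neighbor` are helper names introduced only for this example; Min-Max scaling is used here as one of several possible normalization choices.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def min_max_scale(rows):
    """Rescale every column to the range [0, 1]."""
    cols = list(zip(*rows))
    lows = [min(c) for c in cols]
    spans = [max(c) - min(c) for c in cols]
    return [[(v - lo) / span for v, lo, span in zip(row, lows, spans)] for row in rows]


def nearest_neighbor(rows, query_index):
    """Index of the row closest to rows[query_index], excluding the query itself."""
    others = [i for i in range(len(rows)) if i != query_index]
    return min(others, key=lambda i: euclidean(rows[i], rows[query_index]))


# Hypothetical patients as [age in years, BMI]; row 0 is the query patient.
patients = [
    [50, 20.0],  # 0: query
    [60, 20.5],  # 1: ten years older, almost the same BMI
    [51, 27.0],  # 2: almost the same age, clearly higher BMI
    [20, 18.0],
    [90, 33.0],
    [35, 24.0],
]

print("nearest on raw features:   ", nearest_neighbor(patients, 0))
print("nearest on scaled features:", nearest_neighbor(min_max_scale(patients), 0))
```

On this toy data the nearest neighbor of the query changes from row 2 to row 1 once both features are rescaled, because age (spanning 70 years) no longer outweighs BMI (spanning 15 units) simply by virtue of its units.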

This project draws on these methodologies by using both Decision Tree and
KNN classifiers to predict stroke risk in a structured healthcare dataset. The comparison
of these two models in this project aims to identify which model is best suited for
predicting stroke while highlighting the impact of data preprocessing techniques like
feature scaling and label encoding on model performance. The findings can contribute to
ongoing research by providing insights into the strengths and limitations of each approach,
aiding in the selection of optimal models for stroke risk assessment.

2.2 Problems of existing system

a) Dataset Dependency: Accuracy depends on dataset quality and feature selection.
b) Overfitting: Complex models may overfit and perform poorly on unseen data.
c) High Computational Cost: Some models require significant resources and time to
train.
d) Feature Sensitivity: Poor feature selection can reduce model accuracy.
e) Class Imbalance: Imbalanced datasets may cause classifiers to favor the majority
class.
f) Limited Interpretability: Complex models may lack transparency in predictions.
g) Hyperparameter Tuning: Models require careful tuning to achieve optimal
performance.
h) Inconsistent Results: Accuracy can vary across different classifiers and datasets.

2.3 Problem description

o Dataset Dependency: The model's accuracy relies heavily on the quality of the
dataset and the relevance of features. If the dataset lacks diversity or has missing or
inaccurate data, the model's predictions may be unreliable.
o Overfitting: Complex models, particularly Decision Trees, can overfit by learning
noise and specific patterns from the training data. This reduces their ability to
generalize to new data, leading to poor performance on unseen cases.
o High Computational Cost: Some algorithms, like K-Nearest Neighbors, are
computationally intensive, especially with large datasets, because every prediction
compares the new case against all stored training records. This can lead to long
prediction times and require significant processing resources.
o Feature Sensitivity: Selecting the wrong features can greatly reduce model
accuracy. Irrelevant or redundant features may introduce noise, while missing
essential ones can lead to incomplete or misleading predictions.
o Class Imbalance: In datasets where stroke cases are much rarer than non-stroke
cases, models may become biased toward predicting the majority class, resulting in
high accuracy but low sensitivity for detecting actual stroke cases.
o Limited Interpretability: More complex models, like deep learning or ensemble
methods, may produce accurate results but lack transparency in how predictions are
made, making it difficult for healthcare professionals to trust or understand the
output.
o Hyperparameter Tuning: Achieving optimal performance often requires careful
selection and tuning of model hyperparameters. This can be time-consuming and
may require expertise in machine learning to get right.
o Inconsistent Results: Accuracy and performance can vary across different
classifiers and datasets. A model that performs well on one dataset may not
generalize to another, requiring model retraining and evaluation with each new
dataset.
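
The class imbalance problem described above can be shown with a short worked example. The 95/5 split below is a hypothetical test set, and the "model" is deliberately degenerate: it always predicts the majority class.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def sensitivity(y_true, y_pred):
    """Recall on the positive class: true positives / all actual positives."""
    predictions_for_positives = [p for t, p in zip(y_true, y_pred) if t == 1]
    return sum(predictions_for_positives) / len(predictions_for_positives)


# Hypothetical test set: 95 non-stroke cases (0) and 5 stroke cases (1).
y_true = [0] * 95 + [1] * 5
# A degenerate "model" that always predicts the majority class.
y_pred = [0] * 100

print(f"accuracy    = {accuracy(y_true, y_pred):.2f}")     # 0.95
print(f"sensitivity = {sensitivity(y_true, y_pred):.2f}")  # 0.00
```

The 95% accuracy looks strong, yet not a single stroke case is detected, which is why accuracy alone is not a sufficient metric on imbalanced data.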

GOALS AND OBJECTIVES

o Goals

1. Develop a Predictive Model for Stroke Risk: Create a machine learning model
that can accurately predict the likelihood of a stroke based on patient data.
2. Compare Model Performance: Evaluate and compare the effectiveness of
Decision Tree and K-Nearest Neighbors (KNN) algorithms to identify the best-
performing model for stroke prediction.
3. Support Early Stroke Detection: Enable early identification of high-risk
individuals to support proactive medical intervention and preventive healthcare.

o Objectives

a) Collect relevant patient data for training the model.
b) Pre-process the data by handling missing values and scaling features.
c) Train and evaluate multiple machine learning models.
d) Identify the best-performing model for heart attack prediction.
e) Support healthcare professionals in early detection and intervention.

APPLICATION
This stroke prediction system can be used in healthcare for early risk
assessment, helping doctors identify high-risk individuals and take preventive
measures. It can be integrated into hospitals, clinics, and health apps for real-time
monitoring, improving patient care and decision-making. The system also supports
healthcare research by analyzing stroke risk patterns for better prevention and treatment
strategies.

PROPOSED DETAILED METHODOLOGY

5.1 Proposed methodology

1. MODULE-1: Data Collection

• Objective: Gather a comprehensive and relevant dataset to support accurate stroke
prediction.
• Method: Source data from trusted public health repositories or medical databases,
ensuring it includes a range of health-related factors such as age, gender, BMI,
blood pressure, lifestyle indicators (e.g., smoking status), and any history of stroke
or other cardiovascular conditions. This dataset serves as the foundation for model
training and evaluation, so it’s essential that it represents a diverse patient
population and a variety of risk factors for stroke.

2. MODULE-2: Data Preprocessing and Cleaning

• Objective: Prepare the raw dataset to ensure consistency, completeness, and
compatibility with machine learning algorithms.
• Method:
o Handling Missing Values: Identify any missing or incomplete data. Address
missing values by either imputing with statistical measures (e.g., mean for
numerical data, mode for categorical) or, if appropriate, by removing
rows/columns with extensive missing data.
o Encoding Categorical Variables: Convert categorical variables (e.g., gender,
smoking status) into numerical format using Label Encoding or One-Hot
Encoding, depending on the feature type and model requirements.
o Scaling and Standardizing Features: Scale numerical features like age, BMI,
and blood pressure to a standard range using Standardization (zero mean, unit
variance) or Min-Max Scaling. Standardizing data is crucial for distance-based
models like KNN to improve prediction accuracy.
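
The imputation and encoding steps above can be sketched in plain Python as follows. The column names (`age`, `bmi`, `smoking_status`) and the four records are hypothetical, and the helpers modify the records in place for brevity; in practice a library such as pandas or scikit-learn would usually handle these steps.

```python
from collections import Counter


def impute_mean(rows, key):
    """Replace None in a numeric column with the mean of the observed values."""
    observed = [r[key] for r in rows if r[key] is not None]
    mean = sum(observed) / len(observed)
    for r in rows:
        if r[key] is None:
            r[key] = mean


def impute_mode(rows, key):
    """Replace None in a categorical column with the most frequent category."""
    observed = [r[key] for r in rows if r[key] is not None]
    mode = Counter(observed).most_common(1)[0][0]
    for r in rows:
        if r[key] is None:
            r[key] = mode


def label_encode(rows, key):
    """Map each category to an integer code (sorted for a stable mapping)."""
    mapping = {c: i for i, c in enumerate(sorted({r[key] for r in rows}))}
    for r in rows:
        r[key] = mapping[r[key]]
    return mapping


# Hypothetical records with one missing BMI and one missing smoking status.
records = [
    {"age": 67, "bmi": 36.0, "smoking_status": "formerly smoked"},
    {"age": 61, "bmi": None, "smoking_status": "never smoked"},
    {"age": 80, "bmi": 32.0, "smoking_status": None},
    {"age": 49, "bmi": 34.0, "smoking_status": "never smoked"},
]

impute_mean(records, "bmi")
impute_mode(records, "smoking_status")
mapping = label_encode(records, "smoking_status")
print(mapping)
print(records)
```

Label Encoding assigns an arbitrary order to the categories; where that order would be misleading, One-Hot Encoding (one 0/1 column per category) is the usual alternative.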

3. MODULE-3: Training Multiple Models

• Objective: Train initial machine learning models on the preprocessed dataset to
establish baseline predictions.
• Method:
o Data Splitting: Divide the dataset into training (70%) and testing (30%) sets so
that overfitting can be detected and generalization to unseen data can be measured.
o Model Training: Train initial classifiers, starting with Decision Tree and K-
Nearest Neighbors (KNN) models. Decision Trees provide interpretability by
showing which features influence predictions, while KNN uses distance-based
analysis to group similar individuals.
o Performance Tracking: Track metrics such as accuracy, precision, and recall
to gauge each model's initial performance for comparison in later stages.
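
The splitting and training steps can be sketched as below, using a small hand-written KNN classifier so the example runs without extra libraries. The data is synthetic and deliberately well separated (it is not the project's dataset), so the accuracies it prints say nothing about real stroke data; `train_test_split`, `knn_predict`, and `accuracy` are names chosen for this illustration.

```python
import math
import random
from collections import Counter


def train_test_split(X, y, test_fraction=0.3, seed=0):
    """Shuffle row indices reproducibly, then hold out `test_fraction` of them."""
    indices = list(range(len(X)))
    random.Random(seed).shuffle(indices)
    n_test = round(len(X) * test_fraction)
    test_idx, train_idx = indices[:n_test], indices[n_test:]
    return ([X[i] for i in train_idx], [y[i] for i in train_idx],
            [X[i] for i in test_idx], [y[i] for i in test_idx])


def knn_predict(X_train, y_train, x, k=3):
    """Majority vote among the k training rows closest to x (Euclidean distance)."""
    def distance(row):
        return math.sqrt(sum((a - b) ** 2 for a, b in zip(row, x)))
    nearest = sorted(range(len(X_train)), key=lambda i: distance(X_train[i]))[:k]
    return Counter(y_train[i] for i in nearest).most_common(1)[0][0]


def accuracy(X_train, y_train, X_eval, y_eval, k=3):
    """Share of evaluation rows whose predicted label matches the true label."""
    hits = sum(knn_predict(X_train, y_train, x, k) == t for x, t in zip(X_eval, y_eval))
    return hits / len(y_eval)


# Synthetic toy data: class 0 sits on a small grid near the origin,
# class 1 on the same grid shifted by 20 in both features.
X = [[i % 5, i // 5] for i in range(50)] + [[i % 5 + 20, i // 5 + 20] for i in range(50)]
y = [0] * 50 + [1] * 50

X_train, y_train, X_test, y_test = train_test_split(X, y)
print("train size:", len(X_train), "test size:", len(X_test))
print("train accuracy:", accuracy(X_train, y_train, X_train, y_train))
print("test accuracy: ", accuracy(X_train, y_train, X_test, y_test))
```

Comparing the two accuracies is the point of the split: a model that scores much higher on the training rows than on the held-out rows is likely overfitting.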

4. MODULE-4: Model Selection

• Objective: Explore and test various machine learning algorithms to identify the
best model for accurate stroke prediction.
• Method:
o Model Exploration: Beyond Decision Tree and KNN, experiment with other
models such as Logistic Regression (to gauge linear relationships), Random
Forest (an ensemble method to reduce variance), Support Vector Machine
(SVM for classification boundaries), and Neural Networks (to capture complex
patterns in data).
o Hyperparameter Tuning: Perform hyperparameter tuning (e.g., adjusting tree
depth in Decision Trees, selecting the optimal number of neighbors for KNN)
using methods like grid search or cross-validation to optimize model
performance. This fine-tuning ensures each model’s configuration is best suited
to the dataset.
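
As a sketch of the tuning step, the code below selects the number of neighbors for KNN by a grid search scored with 5-fold cross-validation. The 1-D toy data with a few flipped labels, the candidate list, and the function names are assumptions made for illustration; real data would normally be shuffled before the folds are formed.

```python
from collections import Counter


def knn_predict(X_train, y_train, x, k):
    """Majority vote among the k nearest training rows (squared Euclidean distance)."""
    order = sorted(range(len(X_train)),
                   key=lambda i: sum((a - b) ** 2 for a, b in zip(X_train[i], x)))
    return Counter(y_train[i] for i in order[:k]).most_common(1)[0][0]


def cross_val_accuracy(X, y, k, folds=5):
    """Mean validation accuracy of KNN over `folds` interleaved folds."""
    scores = []
    for f in range(folds):
        val = [i for i in range(len(X)) if i % folds == f]
        train = [i for i in range(len(X)) if i % folds != f]
        X_tr, y_tr = [X[i] for i in train], [y[i] for i in train]
        hits = sum(knn_predict(X_tr, y_tr, X[i], k) == y[i] for i in val)
        scores.append(hits / len(val))
    return sum(scores) / folds


def grid_search_k(X, y, candidates):
    """Score every candidate k by cross-validation and return the best one."""
    scores = {k: cross_val_accuracy(X, y, k) for k in candidates}
    return max(scores, key=scores.get), scores


# Toy 1-D data with a few deliberately flipped labels to mimic noise.
X = [[float(i)] for i in range(40)]
y = [1 if i >= 20 else 0 for i in range(40)]
for noisy in (5, 12, 27, 33):
    y[noisy] = 1 - y[noisy]

best_k, scores = grid_search_k(X, y, candidates=[1, 3, 5, 7])
print("cross-validated accuracy per k:", scores)
print("selected k:", best_k)
```

The same pattern applies to other hyperparameters, such as the maximum depth of a Decision Tree: each candidate value is scored on data the model was not fitted on, and the best-scoring value is kept.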

5. MODULE-5: Model Evaluation and Accuracy Measurement

• Objective: Evaluate the predictive performance of each model, comparing them
to select the most accurate and reliable option.
• Method:
o Performance Metrics: Measure each model’s accuracy on both training
and testing datasets to assess generalization capability. Calculate
additional metrics like precision, recall, F1-score, and Area Under the
Curve (AUC) to better understand each model’s performance.
o Confusion Matrix: Analyze the confusion matrix to understand the
distribution of true positives, true negatives, false positives, and false
negatives. This helps gauge how well each model handles stroke
predictions, particularly in cases where the dataset may be imbalanced.
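
The confusion-matrix counts and the metrics derived from them can be computed as in the sketch below. The ten true and predicted labels are made up for illustration, with 1 standing for "stroke" and 0 for "no stroke".

```python
def confusion_counts(y_true, y_pred):
    """Return (tp, tn, fp, fn) for binary labels where 1 is the positive class."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp, tn, fp, fn


def precision_recall_f1(tp, fp, fn):
    """Precision, recall and F1-score; each falls back to 0.0 when undefined."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1


# Hypothetical labels for ten patients.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 1, 0, 0, 0, 0, 0]

tp, tn, fp, fn = confusion_counts(y_true, y_pred)
precision, recall, f1 = precision_recall_f1(tp, fp, fn)
print(f"TP={tp} TN={tn} FP={fp} FN={fn}")
print(f"precision={precision:.2f} recall={recall:.2f} F1={f1:.2f}")
```

In a screening setting the false negatives (missed stroke cases) are usually the costlier error, so recall deserves particular attention when models are compared.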

6. MODULE-6: Standardization for KNN

• Objective: Enhance the effectiveness of the KNN model by applying feature
scaling to improve its distance-based predictions.
• Method:
o Feature Scaling: Apply Standardization (or other scaling methods as
appropriate) to ensure that all numerical features contribute equally to
distance calculations in KNN. KNN’s distance-based nature makes it
sensitive to unscaled data, so normalizing features like age, BMI, and
blood pressure can significantly improve its predictive performance.
o Retraining and Validation: After scaling, retrain the KNN model on the
standardized training dataset and validate it on the test set to observe
performance improvements.
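
A minimal sketch of the standardization step is given below. The `Standardizer` class is a hypothetical helper written for this example, and the rows are made-up [age, BMI] pairs; the sketch assumes the scaler is fitted on the training rows only and then reused unchanged on the test rows.

```python
import math


class Standardizer:
    """Z-score scaler that learns column means and standard deviations from training data."""

    def fit(self, rows):
        cols = list(zip(*rows))
        self.means = [sum(c) / len(c) for c in cols]
        self.stds = [
            # A constant column has std 0.0; fall back to 1.0 to avoid dividing by zero.
            math.sqrt(sum((v - m) ** 2 for v in c) / len(c)) or 1.0
            for c, m in zip(cols, self.means)
        ]
        return self

    def transform(self, rows):
        return [[(v - m) / s for v, m, s in zip(r, self.means, self.stds)] for r in rows]


# Hypothetical [age, BMI] rows.
X_train = [[40, 20.0], [50, 25.0], [60, 30.0]]
X_test = [[70, 35.0]]

scaler = Standardizer().fit(X_train)      # statistics come from the training rows only
X_train_std = scaler.transform(X_train)
X_test_std = scaler.transform(X_test)     # the test rows reuse the training statistics

print(X_train_std)
print(X_test_std)
```

Fitting on the training rows only keeps information about the test set out of the model, so the test accuracy measured after retraining KNN remains an honest estimate.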

7. MODULE-7: Optimal Model Selection

• Objective: Select the model with the best performance and generalization
capabilities for stroke risk prediction.
• Method:
o Comparative Analysis: Compare all trained models (Decision Tree,
KNN, Logistic Regression, Random Forest, SVM, and Neural Networks)
based on evaluation metrics, model complexity, interpretability, and
generalization. The best model should demonstrate high accuracy and
reliability on the test data and should generalize well to new data.
o Final Model Selection: Choose the top-performing model to be
recommended for stroke risk prediction, based on a balance of accuracy,
interpretability, and ease of integration into healthcare applications.

8. MODULE-8: Application and Healthcare Decision-Making

• Objective: Apply the chosen model to assist healthcare providers in early
identification of at-risk patients and support preventative care strategies.
• Method:
o Deployment: Implement the final model in a user-friendly application or
platform that can be used by healthcare professionals. This may include an
interactive interface for inputting patient data and displaying stroke risk
predictions.
o Support for Decision-Making: Use the system as a clinical tool to help
healthcare providers identify patients at higher risk of stroke, enabling
timely interventions and personalized care strategies.
o Patient Education and Awareness: If integrated into digital health
platforms, this tool can empower individuals to monitor their own risk
factors, encouraging lifestyle adjustments or follow-up consultations based
on their risk profile.

5.2 Problem definition

The heart attack prediction system faces challenges such as dependency
on dataset quality, which affects model accuracy, and overfitting, where complex
models fail to generalize well to new data. Additionally, issues like high computational
costs, feature sensitivity, and class imbalance can further hinder model performance.
The lack of model interpretability and the need for extensive hyperparameter tuning
add to the complexity, making it difficult to create a reliable and efficient system.

This project aims to create a machine learning-based solution that can
accurately assess heart attack risk, helping healthcare providers make informed
decisions and offer preventive care to high-risk individuals.

SYSTEM DESIGN DOCUMENTATION

6.1 System architecture

6.2 DFD

6.3 Use case diagram

REQUIREMENTS

7.1 Hardware requirements

Processor : above 500 MHz
RAM : 4 GB
Hard Disk : 4 GB
Input device : Standard keyboard and mouse
Output device : VGA and high-resolution monitor

7.2 Software requirements

Operating System : Windows 7 or higher
Programming : Python 3.6 and related libraries
Software : Anaconda Navigator, Jupyter Notebook and Google Colab

13
Heart attack prediction system

REFERENCE AND BIBLIOGRAPHY

[1] P. B. Patil, P. M. Mallikarjun Shastry, and A. P. S, “Heart Attack
Detection Based On Mask Region Based Convolutional Neural Network Instance
Segmentation and Hybrid Classification Using Machine Learning Techniques,”
Turkish Journal of Computer and Mathematics Education (TURCOMAT), vol. 12, no.
9, pp. 2228–2244, May 2021;
https://turcomat.org/index.php/turkbilmat/article/view/3697

[2] M. Thangamani, R. Vijayalakshmi, M. Ganthimathi, M. Ranjitha, P. Malarkodi, and
S. Nallusamy, “Efficient classification of heart disease using K-means clustering
algorithm,” International Journal of Engineering Trends and Technology, vol. 68, no.
12, pp. 48–53; https://ijettjournal.org/archive/ijett-v68i12p209

[3] K. M. Almustafa, “Prediction of heart disease and classifiers’ sensitivity analysis,”
BMC Bioinformatics, vol. 21, no. 1;
https://bmcbioinformatics.biomedcentral.com/articles/10.1186/s12859-020-03626-y

[4] S. Mohan, C. Thirumalai, and G. Srivastava, “Effective heart disease prediction
using hybrid machine learning techniques,” IEEE Access, vol. 7, pp. 81542–81554,
2019;
https://jocc.journals.ekb.eg/article_282098_4b9e9c103330a9a045517d04f3a0a14a.pdf

[5] Y. Khourdifi and M. Bahaj, “Heart disease prediction and classification using
machine learning algorithms optimized by particle swarm optimization and ant colony
optimization,” International Journal of Intelligent Engineering and Systems, vol. 12,
no. 1, pp. 242–252, 2019; https://www.inass.org/2019/2019022824.pdf

[6] J. Patel, T. Upadhyay, and S. Patel, “Heart disease prediction using machine
learning and data mining technique,” Heart Disease, vol. 7, no. 1, pp. 129–137, 2015;
https://csjournals.com/IJCSC/PDF7-1/18.%20Tejpal.pdf

No ratings yet
Network Traffic Intrusion Detection System Using Decision Tree & K-Means Clustering Algorithm
3 pages
Pakdd 2018 Workshops Bdasc BDM Ml4cyber Paisi Damemo Melbourne Vic Australia June 3 2018 Revised Selected Papers Mohadeseh Ganji
No ratings yet
Pakdd 2018 Workshops Bdasc BDM Ml4cyber Paisi Damemo Melbourne Vic Australia June 3 2018 Revised Selected Papers Mohadeseh Ganji
141 pages
Department of Electronics & Telecommunications Engineering: ETEL71A-Machine Learning and AI
No ratings yet
Department of Electronics & Telecommunications Engineering: ETEL71A-Machine Learning and AI
4 pages
Data Science With R Text Mining by Graham Williams
No ratings yet
Data Science With R Text Mining by Graham Williams
21 pages
hw2 2011spring
0% (1)
hw2 2011spring
3 pages
Artificial Intelligence: Foundations & Applications: Prof. Partha P. Chakrabarti & Arijit Mondal
No ratings yet
Artificial Intelligence: Foundations & Applications: Prof. Partha P. Chakrabarti & Arijit Mondal
24 pages
6 Benefits of DL Techniques For Credit Scoring
No ratings yet
6 Benefits of DL Techniques For Credit Scoring
14 pages
Classification and Clustering Algorithm Notes
No ratings yet
Classification and Clustering Algorithm Notes
19 pages
GENERATING CLOUD MONITORS FROM MODELS TO SECURE - Docx 2
No ratings yet
GENERATING CLOUD MONITORS FROM MODELS TO SECURE - Docx 2
42 pages
BUSINESS ANALYTICS Assignment
No ratings yet
BUSINESS ANALYTICS Assignment
14 pages
Solutions PDF
No ratings yet
Solutions PDF
122 pages
Section 2 - Introduction To Machine Learning-Bje Edits - Ipynb - Colab
No ratings yet
Section 2 - Introduction To Machine Learning-Bje Edits - Ipynb - Colab
7 pages
Decision Trees for Real Estate
No ratings yet
Decision Trees for Real Estate
5 pages
Unit 2
No ratings yet
Unit 2
57 pages
Algorithm
No ratings yet
Algorithm
27 pages
Road Accident Analysis and Prediction of Accident Severity by Using Machine Learning in Bangladesh
No ratings yet
Road Accident Analysis and Prediction of Accident Severity by Using Machine Learning in Bangladesh
6 pages
Pilot Study Using Decision Trees To Diagnose The Efficacy of Virtual Offshore Egress Training
No ratings yet
Pilot Study Using Decision Trees To Diagnose The Efficacy of Virtual Offshore Egress Training
15 pages
AID 4th Semester Machine Learning Laboratory - Lab Manual
No ratings yet
AID 4th Semester Machine Learning Laboratory - Lab Manual
56 pages
ML Ex02 Solution
No ratings yet
ML Ex02 Solution
22 pages
Hybrid Approach of Cotton Disease Detection For en
No ratings yet
Hybrid Approach of Cotton Disease Detection For en
14 pages
Navdeep Gill, Patrick Hall - An Introduction To Machine Learning Interpretability (2018, O'Reilly Media, Inc.) PDF
No ratings yet
Navdeep Gill, Patrick Hall - An Introduction To Machine Learning Interpretability (2018, O'Reilly Media, Inc.) PDF
45 pages
Prepaid Mobile Dormancy Prediction
No ratings yet
Prepaid Mobile Dormancy Prediction
7 pages
20.k1.0038 Proposal Project Report Kelar-1
No ratings yet
20.k1.0038 Proposal Project Report Kelar-1
31 pages
Application of Machine Learning in A Mineral LeachingProcessTaking Pyrolusite Leaching As An Example
No ratings yet
Application of Machine Learning in A Mineral LeachingProcessTaking Pyrolusite Leaching As An Example
9 pages
IoT Water Quality Monitoring
100% (1)
IoT Water Quality Monitoring
3 pages
Human Resource Analytics: Bachelor of Technology
No ratings yet
Human Resource Analytics: Bachelor of Technology
66 pages


