MULTI-CLASS HEART HEALTH CLASSIFICATION FROM
ECG DATA WITH RANDOM FOREST CLASSIFIER

A PROJECT REPORT
Submitted to

JAWAHARLAL NEHRU TECHNOLOGICAL UNIVERSITY ANANTAPUR,

ANANTHAPURAMU
In partial fulfillment of the requirements for the award of the degree of
BACHELOR OF TECHNOLOGY
in
ELECTRONICS AND COMMUNICATION ENGINEERING
By
I. SUPARNA (H.T.NO.20JN1A0409)
B. SWATHI (H.T.NO.20JN1A0420)
G. PRATHYUSHA (H.T.NO.20JN1A0443)
A. AMRUTHA (H.T.NO.20JN1A0403)

Under the Esteemed Guidance of

Mrs. A. PAVANI, M.Tech., (Ph.D.),
Associate Professor
DEPARTMENT OF ELECTRONICS AND COMMUNICATION ENGINEERING
2020-2024

Department of
ELECTRONICS AND COMMUNICATION ENGINEERING

CERTIFICATE
Certified that the project entitled “MULTI-CLASS HEART HEALTH
CLASSIFICATION FROM ECG DATA WITH RANDOM FOREST CLASSIFIER”
is being submitted by I. Suparna (20JN1A0409), B. Swathi (20JN1A0420),
G. Prathyusha (20JN1A0443), and A. Amrutha (21JN5A0403) in partial fulfillment of the
requirements for the award of the degree of BACHELOR OF TECHNOLOGY in
ELECTRONICS AND COMMUNICATION ENGINEERING by Jawaharlal Nehru
Technological University Anantapur, Ananthapuramu, during the academic year
2020-2024. It is certified that all corrections/suggestions indicated for internal assessment have
been incorporated in the report. The project report has been approved as it satisfies the
academic requirements in respect of project work prescribed for the said degree.

Signature of the Guide: Mrs. A. Pavani, M.Tech., (Ph.D.)

Signature of Head of the Dept.: Dr. P. Giri Prasad, M.Tech., Ph.D.

Principal: Dr. V. Anil Kumar, M.Tech., Ph.D.

External Viva-Voce conducted on _

INTERNAL EXAMINER EXTERNAL EXAMINER

ACKNOWLEDGEMENT
The satisfaction and elation that accompany the successful completion of a task
would be incomplete without mention of the people who made it possible. It is a
great privilege to express our gratitude and respect to all those who have guided and
inspired us during the course of the project work.
We express our indebtedness to our honorable Chairman, Dr. P. GUNA SHEKAR,
who provided all the facilities and necessary encouragement during the
course of study.
We express a profound sense of gratitude and sincere thanks to our beloved Principal,
Dr. V. ANIL KUMAR, M.Tech, Ph.D., for motivating us and providing the necessary
infrastructure to complete the project.
We convey our thanks to the Head of the Department, Dr. P. GIRI PRASAD,
M.Tech, Ph.D., for his timely help and guidance whenever required.
We take this opportunity to express our sincere and deep sense of gratitude to our
project coordinator, Mrs. P. MOHANA, M.Tech., Assistant Professor, Department of
Electronics & Communication Engineering, for her significant suggestions and help in
every aspect of accomplishing this report.
We find immense pleasure in expressing profound gratitude to our guide,
Mrs. A. PAVANI, M.Tech, (Ph.D)., Associate Professor, Department of ECE, for standing by
our side all through the project.
We also thank all staff members of the Department of ECE for their support.
We also place on record special thanks to our parents and friends, who were with us all
through the course of our project.
Project Associates,
I. SUPARNA (H.T. No. 20JN1A0409)
B. SWATHI (H.T. No. 20JN1A0420)
G. PRATHYUSHA (H.T. No. 20JN1A0443)
A. AMRUTHA (H.T. No. 21JN5A0403)
Dept. of Electronics and Communication Engineering
S.V. Group Of Institutions,
NH 16 Bypass Road, North Rajupalem, Nellore.
DECLARATION

We hereby declare that the project entitled “Multi-Class Heart Health Classification
From ECG Data with Random Forest Classifier” has been done by us under the guidance
of Mrs. A. PAVANI, M.Tech, (Ph.D), Associate Professor, Department of Electronics &
Communication Engineering. This project work has been submitted to S.V. GROUP OF
INSTITUTIONS in partial fulfillment of the requirements for the award of the degree
of Bachelor of Technology.

We also declare that this project report has not been submitted at any time to another
Institute or University for the award of any degree.

Project Associates,

I. SUPARNA (H.T.NO.20JN1A0409)

B. SWATHI (H.T.NO.20JN1A0420)

G. PRATHYUSHA (H.T.NO.20JN1A0443)

A. AMRUTHA (H.T.NO.20JN1A0403)
TABLE OF CONTENTS

CHAPTER NO. TITLE

ACKNOWLEDGMENT

TABLE OF CONTENTS

ABSTRACT

LIST OF FIGURES

LIST OF TABLES

1 INTRODUCTION
1.1 Overview
1.2 Research Motivation
1.3 Existing System
1.4 Problem Statement
1.5 Objective
1.6 Advantages
1.7 Application
1.8 Dataset Description
1.9 Performance Evaluation
2 LITERATURE SURVEY
2.1 Introduction
2.2 Related Work
2.3 Research Gaps
2.4 Competitive Analysis
2.5 Summary
3 EXISTING SYSTEM
3.1 Existing Classifier Name
3.2 Drawbacks Of Existing Classifier
4 PROPOSED SYSTEM
4.1 Overview
4.2 Dataset Preprocessing
4.3 Random Forest Classifier
4.4 Random Forest Prediction
4.5 Classifier
4.6 Advantages
5 MACHINE LEARNING
6 SOFTWARE ENVIRONMENT
7 SOURCE CODE
8 RESULT AND DISCUSSION
8.1 Sample Dataset
8.2 Sample Figure Of ECG
8.3 The Circle Of Value Count in Dataset
8.4 One ECG For Each Category
8.5 Classification Report
8.6 Confusion Matrix
9 CONCLUSION AND FUTURE SCOPE
9.1 Conclusion
9.2 Future Scope
9.3 References

ABSTRACT

Electrocardiogram (ECG) ailment classification is a critical task in cardiac health
diagnosis, aimed at identifying various heart-related abnormalities from ECG signals.
ECG ailment multiclass classification has significant applications in cardiology and
healthcare. It plays a crucial role in early detection and diagnosis of various cardiac
conditions, including arrhythmias, myocardial infarctions, and heart blocks. Accurate and
timely classification of ECG signals can aid healthcare professionals in making informed
decisions, enabling appropriate treatment and care for patients. Additionally, automated
ECG analysis can support remote monitoring systems and enhance telemedicine solutions
for heart health assessment.

Traditional ECG ailment classification methods often rely on handcrafted features and
statistical measures extracted from the ECG signals. While these methods have been used
successfully for some ailments, they may struggle to capture complex and subtle patterns
indicative of certain cardiac abnormalities. Additionally, manual feature engineering can
be time-consuming and may limit the model's ability to adapt to new and diverse datasets.
Moreover, conventional machine learning algorithms might not fully exploit the temporal
dependencies present in ECG signals, leading to suboptimal performance, especially for
long-term monitoring scenarios.

To overcome the limitations of existing methods, this work proposes a machine
learning (ML) approach for multi-class classification of ECG ailments.
Such ML models can capture long-term dependencies and temporal
patterns, making them well suited for sequential data like ECG signals. The ML model
then learns to extract relevant features from the sequential data, enabling accurate
classification of different cardiac ailments.

LIST OF FIGURES

FIG.NO FIGURE NAME

1.1 Motivation Block Diagram

4.1 Proposed System Block Diagram

4.2 Random Forest Classifier

8.1 Sample Dataset

8.2 The Total Mitbih_train Dataset

8.3 Sample Figure of ECG

8.4 The Circle Of Value Count in Dataset

8.5 One ECG For Each Category

8.6 Naïve Bayes Class Confusion Matrix

8.7 RFC Confusion Matrix

LIST OF TABLES
TABLE.NO TABLE NAME
8.1 Existing Naive Bayes report
8.2 Proposed RFC report
8.3 Overall performance comparison
8.4 Normal class performance comparison
8.5 Atrial premature class performance comparison
8.6 Premature ventricular contraction class performance comparison
8.7 Fusion of ventricular and normal class performance comparison
8.8 Fusion of paced and normal class performance comparison

Multi-Class Heart Health Classification From ECG Data With RFC

CHAPTER-1
INTRODUCTION
1.1 OVERVIEW
Heart health classification involves the assessment and categorization of individuals
based on the condition of their cardiovascular system. This process typically includes
analyzing various factors such as blood pressure, cholesterol levels, heart rate, and overall
lifestyle habits. Healthcare professionals use this classification to determine the risk of
developing heart diseases such as coronary artery disease, heart attack, or stroke. By
classifying individuals into different risk categories, healthcare providers can tailor
preventive measures and interventions to help individuals maintain or improve their heart
health. These interventions may include lifestyle modifications such as diet and exercise,
medication management, or other medical treatments aimed at reducing the risk of
cardiovascular events.

The classification of heart health is crucial for early detection and prevention of
cardiovascular diseases. It allows healthcare providers to identify individuals who are at
higher risk of developing heart-related problems and intervene promptly to mitigate these
risks. Moreover, by regularly monitoring and updating the classification based on changes
in health status or lifestyle habits, healthcare professionals can provide personalized care
and support to individuals to optimize their heart health and overall well-being.
Ultimately, the goal of heart health classification is to empower individuals to take
proactive steps towards maintaining a healthy heart and reducing their risk of
cardiovascular diseases.

1.2 RESEARCH MOTIVATION

Figure 1.1 Motivation Block diagram

Department of Electronics and Communication Engineering,SVCN,Nellore 1


The motivation behind research into heart health classification using smartwatch
technology stems from the desire to enhance early detection and management of
cardiovascular diseases. Smartwatches equipped with electrocardiogram (ECG) sensors
can provide real-time monitoring of heart rhythms and detect irregularities. By leveraging
artificial intelligence (AI) and machine learning (ML) algorithms, these devices can
analyze ECG readings collected from the user's wrist and accurately classify potential
heart conditions. This approach enables individuals to conveniently monitor their heart
health continuously without the need for frequent visits to healthcare facilities, leading to
early detection of abnormalities and timely intervention.

Integrating AI and ML algorithms into smartwatches allows for the rapid processing and
interpretation of ECG data, enabling the device to display potential heart disease names
directly on the watch screen. This immediate feedback empowers users to take proactive
measures to address any identified issues promptly, such as seeking medical attention or
adjusting lifestyle habits. Additionally, by providing personalized insights and actionable
information in real time, smartwatches equipped with heart health classification
capabilities contribute to improving individuals' overall cardiovascular health and
reducing the burden of heart-related illnesses. Thus, the research aims to harness the
potential of wearable technology to revolutionize heart health monitoring and empower
individuals to make informed decisions about their well-being.

1.3 EXISTING SYSTEM

1.3.1 DOCTOR BASED DIAGNOSIS

The existing system for doctors' diagnosis of heart health classification relies on a
combination of patient history, physical examinations, and diagnostic tests. Doctors begin
by gathering information about the patient's symptoms, medical history, lifestyle factors,
and family history of heart disease. They then conduct a physical examination to assess
vital signs, listen to the heart and lungs, and check for any signs of heart problems.
Diagnostic tests such as electrocardiograms (ECGs), echocardiograms, stress tests, and
blood tests are often performed to further evaluate heart function and detect any
abnormalities.


Based on the collected information and test results, doctors classify the patient's heart
health into various categories such as normal, at risk, or indicative of specific conditions
like coronary artery disease, arrhythmias, or heart failure. This classification guides
treatment decisions and helps in providing appropriate care to improve the patient's heart
health.

Heart disease, a leading cause of mortality worldwide, poses significant diagnostic
challenges. Among the various techniques, electrocardiograms (ECGs) offer a cost-effective,
non-invasive approach. However, several hurdles persist: a scarcity of medical
experts, the intricate nature of ECG interpretations, the similarity of heart disease
manifestations in ECG signals, and comorbidity.

1.3.2 CLINICAL TRIALS

The current system for conducting clinical trials on heart health classification involves
several steps. Researchers begin by designing the trial, outlining the specific objectives
and criteria for participant eligibility. Potential participants are then recruited, often
through medical centers or community outreach programs. Once enrolled, participants
undergo a series of assessments to gather baseline data on their heart health status. These
assessments may include medical history interviews, physical examinations, and
diagnostic tests such as electrocardiograms (ECGs) and blood tests. Participants are then
randomly assigned to different treatment groups, which may involve receiving a new
medication, undergoing a specific procedure, or following a particular lifestyle
intervention.

Throughout the trial, researchers carefully monitor participants' progress and collect data
on key outcome measures, such as changes in heart function or incidence of
cardiovascular events. The trial concludes with data analysis to determine the
effectiveness and safety of the intervention in classifying heart health, ultimately
contributing to medical knowledge and informing future clinical practice.


1.4 PROBLEM STATEMENT

Doctors also need assistance: The problem statement revolves around the potential benefit
of integrating artificial intelligence and machine learning (AIML) assistance into the
process of heart health classification by doctors. Currently, doctors rely on traditional
methods such as patient history, physical examinations, and diagnostic tests to classify
heart health. However, these methods may be time-consuming and subjective, leading to
inconsistencies in diagnosis and treatment decisions. By incorporating AIML assistance,
doctors could potentially enhance the accuracy and efficiency of heart health
classification. AIML algorithms could analyze vast amounts of patient data to identify
patterns and predict heart health outcomes, providing doctors with valuable insights and
decision support. This integration has the potential to improve patient care by facilitating
more accurate diagnoses and personalized treatment plans based on individual risk factors
and characteristics. Nevertheless, challenges such as data privacy concerns, algorithm
bias, and the need for ongoing validation remain to be addressed before widespread
implementation can be realized.

Patient with smart watches: The problem statement revolves around patients who utilize
smartwatches for heart health classification. Currently, many individuals rely on
smartwatches equipped with heart rate monitoring features to track their heart health.
However, there are concerns regarding the accuracy and reliability of these devices in
classifying heart health conditions. While smartwatches can provide
continuous heart rate data, there is uncertainty about their ability to detect subtle changes
or accurately diagnose specific heart conditions. Additionally, the lack of medical-grade
validation and oversight raises questions about the reliability of the information provided
by these devices. Incorporating smartwatch data into clinical decision-making processes
presents challenges such as ensuring data accuracy, interpreting results, and integrating
this information into existing healthcare systems. Further research and validation are
necessary to determine the potential benefits and limitations of utilizing smartwatch data
for heart health classification in clinical practice.

Time complexity: The problem statement concerning time complexity in heart health
classification focuses on the duration and efficiency of the classification process.
Currently, the classification of heart health involves numerous steps, including gathering
patient data, conducting diagnostic tests, and analyzing results. However, this process can
be time-consuming, especially when dealing with large datasets or complex cases. The
computational resources required to process and analyze data add to the time complexity
of the classification process. Additionally, the need for timely diagnosis and treatment
decisions underscores the importance of reducing the time required for heart health
classification. Addressing time complexity issues involves optimizing algorithms,
improving data processing techniques, and streamlining the overall classification
workflow. Enhancing the efficiency of heart health classification can lead to quicker
diagnoses, more timely interventions, and ultimately better patient outcomes.

Additional human resources: The problem statement revolves around determining the time
complexity for classifying heart health based on various factors. Specifically, it focuses
on understanding how the use of additional human resources affects the time required for
this classification task. The goal is to analyze and quantify the relationship between the
number of human resources involved and the time it takes to classify heart health
accurately. By examining this relationship, we aim to gain insights into the efficiency and
scalability of the classification process in real-world scenarios.

1.5 OBJECTIVE
The objective of heart health classification is to accurately assess and categorize
individuals' heart conditions based on various medical parameters and data. By analyzing
factors such as blood pressure, cholesterol levels, and heart rate, the aim is to provide a
reliable assessment of an individual's cardiac health status. This classification process
helps healthcare professionals in making informed decisions regarding diagnosis,
treatment, and prevention strategies for heart-related diseases. Ultimately, the goal is to
improve patient outcomes and overall cardiovascular health by effectively identifying and
managing potential heart issues at an early stage.


1.6 ADVANTAGES

Disease Diagnosis and Risk Assessment: High-accuracy heart health classification
enables precise diagnosis and proactive treatment strategies. It facilitates a deeper
understanding of cardiovascular conditions, aiding in effective patient care and risk
management.

Remote Monitoring and Telemedicine: In discussions on heart health, emphasis is
placed on factors like regular exercise and a balanced diet, which are consistently
identified as advantageous. Additionally, managing stress levels and avoiding smoking
are highlighted as crucial for maintaining a healthy heart.

Personalized Treatment Recommendations: In the evaluation of heart health, factors such
as exercise and diet stand out as advantageous, as indicated by out-of-bag estimation
methods. Stress management and abstaining from smoking also emerge as key
components for maintaining a healthy heart, according to current analyses.

Clinical Decision Support Systems: In the context of heart health, factors like regular
exercise and a balanced diet demonstrate robustness to noise, suggesting their consistent
advantages. Stress management and avoiding smoking similarly exhibit resilience to
noise, reinforcing their importance for a healthy heart.

Research and Drug Development: In heart health discussions, activities such as exercise
and maintaining a balanced diet show advantages due to efficient parallelization, allowing
for simultaneous benefits across various aspects of wellness. Similarly, stress
management and abstaining from smoking exhibit efficiencies in parallelization,
contributing to overall heart health.

Healthcare Resource Allocation: In discussions of heart health, practices like regular
exercise and healthy eating demonstrate advantageous generalization, consistently
benefiting overall well-being. Likewise, stress management and avoiding smoking
showcase generalization in their positive impact on heart health, offering broad benefits
across populations.


1.7 APPLICATION

1.7.1 SMART DEVICES

Heart health classification of smart device applications categorizes individuals' heart
health based on factors like heart rate, blood pressure, and activity levels. These
applications provide real-time data and alerts about heart health status, enabling users to
track their progress and receive recommendations for maintaining a healthy heart. They
are user-friendly, making it easy for individuals to stay informed about their
cardiovascular health and take proactive steps to improve it.

1.7.2 DOCTOR LAPTOP

The heart health classification of applications on doctors' laptops categorizes individuals'
heart health based on factors such as ECG readings, blood test results, and medical
history. These applications analyze data to provide accurate assessments of patients'
cardiovascular condition, assisting doctors in making informed diagnoses and treatment
decisions. By utilizing these classifications, healthcare professionals can effectively
manage heart-related conditions and provide personalized care to patients. Overall, these
applications play a crucial role in promoting heart health and improving patient outcomes.

1.7.3 DIAGNOSIS CENTRES

The heart health classification of applications in diagnosis centres categorizes individuals'
heart health based on electrocardiogram (ECG) results. These applications analyze the
electrical activity of the heart to provide immediate feedback on heart rhythm and
potential abnormalities. By accurately interpreting ECG readings, healthcare
professionals can diagnose various heart conditions, such as arrhythmias and myocardial
infarctions. This classification system enables timely intervention and personalized
treatment plans for patients, contributing to better heart health outcomes. Overall, these
applications are essential tools in diagnosing and managing heart-related issues in
diagnosis centres.


1.7.4 IOMT

The heart health classification of applications in the Internet of Medical Things (IOMT)
categorizes individuals' heart health based on data collected from wearable devices. These
applications analyze parameters like heart rate, blood pressure, and activity levels to
assess cardiovascular health. By providing real-time monitoring and analysis, they enable
users to understand their heart health status and take proactive measures to improve it.
Healthcare professionals also utilize these classifications to diagnose heart-related
conditions and recommend personalized treatment plans. Overall, IOMT applications
play a crucial role in promoting awareness and management of heart health.

1.8 DATASET DESCRIPTION

The dataset for heart health classification consists of electrocardiogram (ECG) signals
obtained from human subjects. These signals are captured using ECG sensors attached to
the body, typically on the chest area. The dataset contains various attributes extracted
from these signals, such as waveforms, intervals, and amplitudes. Each data point in the
dataset represents a specific ECG recording from an individual. Additionally, the dataset
includes corresponding labels indicating the heart health status of each subject, such as
normal, arrhythmia, or other cardiac conditions. This dataset serves as a valuable resource
for developing and testing machine learning algorithms aimed at accurately classifying
heart health based on ECG signals.
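
As a minimal sketch of how one record of such a dataset might be handled, the snippet below assumes a CSV layout in which each row holds the samples of one ECG recording followed by an integer class label in the last column. That layout, the `LABEL_NAMES` ordering, and the helper `parse_row` are illustrative assumptions and are not confirmed by this report; only the class names themselves are taken from the performance tables in the List of Tables.

```python
import csv
import io

# Assumed mapping from integer label to class name (hypothetical ordering).
LABEL_NAMES = {
    0: "Normal",
    1: "Atrial premature",
    2: "Premature ventricular contraction",
    3: "Fusion of ventricular and normal",
    4: "Fusion of paced and normal",
}


def parse_row(row):
    """Split one CSV row into (signal samples, integer label)."""
    values = [float(v) for v in row]
    signal = values[:-1]      # every column except the last is a sample
    label = int(values[-1])   # last column is assumed to be the class label
    return signal, label


# Tiny in-memory stand-in for a real file; the values are made up.
fake_csv = io.StringIO("0.98,0.45,0.10,0.0,0.0\n0.71,0.30,0.05,0.0,2.0\n")
records = [parse_row(r) for r in csv.reader(fake_csv)]

for signal, label in records:
    print(len(signal), LABEL_NAMES[label])
```

Keeping the signal and the label separate from the start makes it harder to feed the label to the classifier as a feature by accident.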

1.9 PERFORMANCE EVALUATION

1.9.1 CONFUSION MATRIX

The performance evaluation of heart health classification involves various metrics, one
of which is the confusion matrix. A confusion matrix provides a detailed breakdown of
the model's performance by comparing predicted labels with actual labels. It consists of
four quadrants: true positive (TP), true negative (TN), false positive (FP), and false
negative (FN). The TP represents correctly classified instances of individuals with heart
disease, while TN indicates correctly classified instances of individuals without heart
disease. On the other hand, FP signifies instances where the model wrongly predicts the
presence of heart disease, and FN denotes instances where the model wrongly predicts
the absence of heart disease. By analyzing the values in each quadrant, researchers can
assess the accuracy, sensitivity, specificity, and other performance metrics of the heart
health classification model, providing valuable insights into its effectiveness in
identifying and classifying heart disease cases.
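
The four quadrants described above apply to a two-class problem. For a multi-class task, one common convention is to build an N x N matrix and then read off TP, FP, FN, and TN for each class in a one-vs-rest fashion. The following is a minimal sketch of that idea in plain Python; the function names and the toy labels are assumptions made for illustration and are not taken from the project's source code.

```python
def confusion_matrix(actual, predicted, n_classes):
    """Return matrix m where m[i][j] counts samples of true class i predicted as j."""
    matrix = [[0] * n_classes for _ in range(n_classes)]
    for a, p in zip(actual, predicted):
        matrix[a][p] += 1
    return matrix


def one_vs_rest_counts(matrix, cls):
    """Collapse a multi-class matrix into TP, FP, FN, TN for one class."""
    n = len(matrix)
    tp = matrix[cls][cls]
    fn = sum(matrix[cls]) - tp                        # true cls, predicted as another class
    fp = sum(matrix[r][cls] for r in range(n)) - tp   # another class, predicted as cls
    total = sum(sum(row) for row in matrix)
    tn = total - tp - fn - fp
    return tp, fp, fn, tn


# Made-up labels for three classes (0, 1, 2), purely for illustration.
actual = [0, 0, 0, 1, 1, 2, 2, 2]
predicted = [0, 0, 1, 1, 1, 2, 0, 2]

cm = confusion_matrix(actual, predicted, 3)
tp, fp, fn, tn = one_vs_rest_counts(cm, 0)
sensitivity = tp / (tp + fn)   # also called recall or true positive rate
specificity = tn / (tn + fp)   # true negative rate
print(cm, (tp, fp, fn, tn), sensitivity, specificity)
```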

1.9.2 CLASSIFICATION REPORT

The performance evaluation of heart health classification involves assessing the accuracy
and effectiveness of the classification model. One commonly used method for evaluating
the model's performance is the classification report. This report provides detailed metrics
such as precision, recall, and F1-score for each class in the classification task. Precision
represents the proportion of true positive predictions among all positive predictions made
by the model, indicating how reliable the positive predictions are. Recall, on the other
hand, measures the proportion of true positive predictions among all actual positive
instances in the dataset, reflecting the model's ability to correctly identify all positive
instances. The F1-score is the harmonic mean of precision and recall, providing a
balanced measure of the model's performance on both precision and recall.

In the context of heart health classification, the classification report helps healthcare
professionals understand how well the model performs in identifying different heart
conditions such as coronary artery disease, arrhythmias, or heart failure. By examining
the precision, recall, and F1-score for each class, clinicians can assess the model's
strengths and weaknesses in classifying specific heart conditions. For instance, a high
precision value indicates that the model makes few false positive predictions, while a
high recall value suggests that the model captures a large proportion of true positive
instances. Overall, the classification report provides valuable insights into the
performance of the heart health classification model, aiding healthcare providers in
making informed decisions about patient care and intervention strategies.
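
To make the per-class figures concrete, the sketch below derives precision, recall, F1-score, and support for each class from a confusion matrix. It is an illustrative plain-Python version under the assumption that rows are true classes and columns are predicted classes; `per_class_report` and the example matrix are hypothetical and do not reproduce the results reported in this work.

```python
def per_class_report(matrix):
    """Compute precision, recall, F1 and support for each class.

    matrix[i][j] counts samples whose true class is i and predicted class is j.
    """
    n = len(matrix)
    report = {}
    for c in range(n):
        tp = matrix[c][c]
        predicted_c = sum(matrix[r][c] for r in range(n))  # column total
        actual_c = sum(matrix[c])                          # row total (support)
        precision = tp / predicted_c if predicted_c else 0.0
        recall = tp / actual_c if actual_c else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        report[c] = {"precision": precision, "recall": recall,
                     "f1": f1, "support": actual_c}
    return report


# Made-up 3-class confusion matrix, for illustration only.
example = [[8, 2, 0],
           [1, 4, 0],
           [1, 0, 4]]

report = per_class_report(example)
for cls, scores in report.items():
    print(cls, {k: round(v, 3) for k, v in scores.items()})
```

Reporting the support next to each score is useful because a high score on a class with very few samples is less reliable than the same score on a well-represented class.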


1.9.3 METRICS

The performance evaluation of heart health classification involves assessing various
metrics to gauge the accuracy and effectiveness of the classification system.

Accuracy: Accuracy is the metric most commonly used in everyday discussion. It
answers the question “Out of all the predictions we made, how many were true?”

As we will see later, accuracy is a blunt measure and can sometimes be misleading.

Accuracy measures the overall correctness of the classification model by calculating the
ratio of correctly predicted cases to the total number of cases. Precision focuses on the
proportion of true positive cases among all the cases predicted as positive, highlighting
the model's ability to avoid false positives.
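
A small worked example may help show why accuracy alone can mislead on imbalanced data. The sketch below uses made-up labels (95 samples of one class, 5 of another) and a trivial predictor that always outputs the majority class; the helper names are hypothetical and the numbers are not results from this project.

```python
def accuracy(actual, predicted):
    """Fraction of predictions that match the true label."""
    correct = sum(1 for a, p in zip(actual, predicted) if a == p)
    return correct / len(actual)


def recall_for(actual, predicted, cls):
    """Fraction of true samples of class cls that were predicted as cls."""
    relevant = [p for a, p in zip(actual, predicted) if a == cls]
    return sum(1 for p in relevant if p == cls) / len(relevant)


# Made-up imbalanced data: 95 samples of class 0 and 5 of class 1.
actual = [0] * 95 + [1] * 5
# A trivial "model" that always predicts the majority class.
predicted = [0] * 100

acc = accuracy(actual, predicted)
minority_recall = recall_for(actual, predicted, 1)
print(acc, minority_recall)
```

Here the accuracy is high even though no sample of the minority class is ever detected, which is why per-class recall and precision are reported alongside it.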

Recall: Recall focuses on how good the model is at finding all the positives. Recall is
also called true positive rate and answers the question “Out of all the data points that
should be predicted as true, how many did we correctly predict as true?”

Recall, also known as sensitivity, evaluates the model's ability to correctly identify all
relevant cases, measuring the proportion of true positive cases identified out of all actual
positive cases.

F1 score: The F1 score is a measure that combines recall and precision. Because
there is a trade-off between precision and recall, F1 can be used to measure how
effectively a model balances the two.

One important feature of the F1 score is that the result is zero if either component
falls to zero. It thereby penalizes extremely low values of either component.


The F1 score combines precision and recall into a single metric, providing a balanced
assessment of the classification model's performance by calculating the harmonic mean
of precision and recall.
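
The effect of the harmonic mean can be seen in a short sketch. The function below follows the standard definition F1 = 2PR / (P + R), with the result taken as 0 when both inputs are 0; the input values are arbitrary illustrations, and the arithmetic mean is included only for comparison.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall; taken as 0 when both are 0."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def arithmetic_mean(precision, recall):
    """Plain average, shown only to contrast with the harmonic mean."""
    return (precision + recall) / 2


balanced = f1_score(0.8, 0.8)     # both components equal
lopsided = f1_score(1.0, 0.1)     # high precision, very low recall
collapsed = f1_score(1.0, 0.0)    # one component is zero

print(balanced, lopsided, collapsed, arithmetic_mean(1.0, 0.1))
```

In the lopsided case the arithmetic mean stays above one half, while the F1 score is pulled down toward the weaker component.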

These metrics collectively provide insights into the model's ability to accurately classify
individuals' heart health status, taking into account both true positives and false positives,
as well as true negatives and false negatives.

Precision: Precision is a metric that gives the proportion of true positives among
all the positive predictions that the model makes. It answers the question “Out of all the
positive predictions we made, how many were true?”


CHAPTER-2

LITERATURE SURVEY

2.1 INTRODUCTION

Heart health classification is a critical aspect of modern healthcare, aiming to accurately
diagnose and manage cardiovascular diseases, which remain a leading cause of mortality
worldwide. By employing advanced medical technologies and data analysis techniques,
healthcare professionals can assess an individual's heart health status with greater
precision and efficiency. This process involves evaluating various physiological
parameters such as blood pressure, cholesterol levels, and electrocardiogram (ECG)
readings to provide a comprehensive understanding of the individual's cardiac condition.
Through the classification of heart health, medical practitioners can identify potential
risks and tailor treatment plans to address specific patient needs effectively.

Moreover, heart health classification plays a crucial role in preventive medicine by
enabling early detection and intervention strategies for individuals at risk of developing
cardiovascular diseases. By identifying risk factors and patterns indicative of heart-
related issues, healthcare providers can implement proactive measures to mitigate these
risks and promote heart health. Additionally, accurate classification of heart health allows
for personalized interventions and lifestyle modifications tailored to each patient's unique
circumstances, ultimately leading to improved outcomes and enhanced quality of life. As
such, the development and refinement of heart health classification methodologies
continue to be a key focus area in healthcare research and practice, driving advancements
in cardiovascular care and disease management.

2.2 RELATED WORK

Malakouti, Seyed Matin. [1] One of the most critical steps in diagnosing cardiovascular
disorders was examining and processing ECG data. Classification of healthy and ill persons
was the primary focus of research in this area, and an increasing number of researchers
were turning to techniques based on machine learning. In this particular investigation,
Gaussian NB, Random Forest, Logistic Regression, Linear Discriminant Analysis, and a
Dummy Classifier were used for the automated categorization of electrocardiography
(ECG) data.

Ozcan, Mert, and Serhat Peker. [2] Heart disease remained the leading cause of death,
with nearly one-third of all deaths worldwide estimated to be caused by heart-related
conditions. Advancing applications of classification-based machine learning to medicine
facilitated earlier detection. In this study, the Classification and Regression Tree
(CART) algorithm, a supervised machine learning method, was employed to predict heart
disease and to extract decision rules clarifying the relationships between input and
output variables. In addition, the study’s findings ranked the features influencing heart
disease by importance. Considering all performance parameters, the 87% prediction
accuracy supported the model’s reliability. Moreover, the decision rules extracted in the
study could simplify clinical use without requiring additional knowledge. Overall, the
proposed algorithm could support not only healthcare professionals but also patients
subject to cost and time constraints in the diagnosis and treatment of heart disease.
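To illustrate how a CART-style tree arrives at a readable decision rule, the following minimal sketch searches one feature for the threshold that minimizes weighted Gini impurity, the split criterion commonly used by CART for classification. The feature name `resting_hr`, the values, and the labels are hypothetical and are not taken from the cited study.

```python
from collections import Counter


def gini(labels):
    """Gini impurity: 1 minus the sum of squared class proportions."""
    n = len(labels)
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def best_threshold(values, labels):
    """Return (threshold, weighted_gini) for the best 'value <= threshold' split."""
    best = (None, gini(labels))  # start from the impurity of the unsplit node
    # Skip the largest value: splitting there would leave the right side empty.
    for t in sorted(set(values))[:-1]:
        left = [lab for v, lab in zip(values, labels) if v <= t]
        right = [lab for v, lab in zip(values, labels) if v > t]
        score = (len(left) * gini(left) + len(right) * gini(right)) / len(labels)
        if score < best[1]:
            best = (t, score)
    return best


# Hypothetical single-feature data set, for illustration only.
resting_hr = [58, 62, 70, 88, 95, 102]
labels = ["healthy", "healthy", "healthy", "disease", "disease", "disease"]

threshold, impurity = best_threshold(resting_hr, labels)
print(f"rule: resting_hr <= {threshold} -> healthy (weighted Gini {impurity})")
```

A full tree repeats this search over every feature and recursively over each resulting subset; the chain of chosen thresholds is what can be read off as decision rules.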

Fakhry, Mahmoud, and Ascensión Gallardo-Antolín. [3] The authors proposed optimizing the
resolution of time–frequency atoms and the regularization of fitting models to obtain
better representations of heart sound signals. This was done by evaluating the
classification performance of deep learning (DL) networks in discriminating five heart
valvular conditions based on a new class of time–frequency feature matrices derived from
the fitting models. Several combinations of resolution and regularization were inspected,
and the optimal one was the combination that provided the highest performance. To this
end, a fitting model was obtained from a heart sound signal and an overcomplete dictionary
of Gabor atoms using elastic net regularization of linear models. Two DL architectures
were considered, the first mainly consisting of a 1D convolutional neural network (CNN)
layer and a long short-term memory (LSTM) layer, while the second was composed of 1D and
2D CNN layers followed by an LSTM layer. The networks were trained with two algorithms,
namely stochastic gradient descent with momentum (SGDM) and adaptive moment estimation
(ADAM). Extensive experimentation was conducted using a database containing heart sound
signals of five heart valvular conditions. The best classification accuracy of 98.95% was
achieved with the second architecture when trained with ADAM and feature matrices derived
from optimal models obtained with a Gabor dictionary consisting of atoms with high-time
low-frequency resolution and imposing sparsity on the models.

Nguyen, Minh Tuan, Wei Wen Lin, and Jin H. Huang. [4] In this study, two deep learning
models were proposed to classify heart sounds based on the log-mel spectrogram of heart
sound signals. The heart sound dataset comprised five classes, one normal class and four
anomalous classes, namely Aortic Stenosis, Mitral Regurgitation, Mitral Stenosis, and
Murmur in systole. First, the heart sound signals were framed to a consistent length, and
the log-mel spectrogram features were then extracted. Two deep learning models, a long
short-term memory network and a convolutional neural network, were used to classify
heartbeat sounds based on the extracted features. Analysis results demonstrated the high
performance of the classification models, with an overall accuracy of about 99.67%. The
results also showed higher performance compared to previous studies.

Huang, Youhe, Hongru Li, and Xia Yu. [5] The electrocardiogram (ECG) was an important
tool used to analyze abnormal heart activity and assess heart health, especially in remote
cardiac health monitoring. Although deep learning had achieved significant results in
automatic ECG classification, how to combine the characteristics of ECG physiological
signals to construct discriminative inputs or features was still a key challenge in
classification. To this end, a novel representation input method with temporal
characteristics was proposed in this paper. First, the temporal characteristic of ECG
signals was extracted and transformed into a time representation input alongside the
original input. Subsequently, a deep learning network combining a Convolutional Neural
Network and Long Short-Term Memory was employed for feature extraction. At the same
time, an attention mechanism was used to focus on feature differences. The proposed method
was validated in the classification of five classes of heartbeats (Normal heartbeat, Left
bundle branch block heartbeat, Right bundle branch block heartbeat, Atrial Premature
Contraction, Premature ventricular contraction), achieving an average accuracy, precision,
sensitivity, and specificity of 98.95%, 97.07%, 96.54%, and 99.33%, respectively, on the
MIT-BIH arrhythmia database. The results showed that the method was able to combine the
periodic characteristics of ECG to construct a better temporal representation input than
traditional feature fusion. This method could provide a new way to classify similar
physiological signals with periodic characteristics.
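For a multi-class problem such as five heartbeat classes, the four metrics named above are usually computed per class in a one-vs-rest fashion from the confusion matrix. The sketch below shows one way to do that; the 3-class matrix is made up for illustration and does not reproduce any result from the cited paper, and the function assumes the chosen class appears in both the true and predicted labels (otherwise a division by zero would occur).

```python
def one_vs_rest_metrics(cm, k):
    """Metrics for class k, where cm[i][j] counts true class i predicted as class j."""
    total = sum(sum(row) for row in cm)
    tp = cm[k][k]
    fn = sum(cm[k]) - tp                  # class k samples predicted as something else
    fp = sum(row[k] for row in cm) - tp   # other classes predicted as class k
    tn = total - tp - fn - fp
    return {
        "accuracy": (tp + tn) / total,
        "precision": tp / (tp + fp),
        "sensitivity": tp / (tp + fn),    # also called recall or true positive rate
        "specificity": tn / (tn + fp),
    }


# Made-up confusion matrix for three classes (rows: true, columns: predicted).
cm = [
    [8, 1, 1],
    [2, 7, 1],
    [0, 2, 8],
]

metrics_class0 = one_vs_rest_metrics(cm, 0)
print(metrics_class0)
```

Averaging the per-class values over all classes gives the kind of "average" figures reported in multi-class studies.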

Sk, Khader Basha, D. Roja, Sunkara Santhi Priya, Lavanya Dalavi, Sai Srinivas Vellela,
and Venkateswara Reddy. [6] Digitalization in healthcare organizations placed great
emphasis on technological advances among clinical healthcare providers. Traditional
methods for measuring and evaluating patient outcomes in forecasting and diagnosing
chronic diseases were being replaced by techniques that captured the most significant
insights from medical information by combining predictive modeling with machine learning.
Heart disease was among the worst disorders in the world. Because the death rate from
heart disease remained high, more intensive preventive efforts were required, such as
enhancing the accuracy of heart disease prediction systems. Additionally, early detection
aided the appropriate diagnosis of the condition as well as the management of its symptoms.
By creating forecasting analytics, Machine Learning (ML) techniques could be used to
anticipate chronic diseases, including kidney and cardiac disorders. Hence, this analysis
presented coronary heart disease (CHD) prediction and classification utilizing hybrid
machine learning methods. In this approach, a combination of the Decision Tree (DT) and
AdaBoost algorithms was used as a hybrid ML algorithm to predict CHD. The method's
performance was assessed using metrics such as accuracy, True Positive Rate (TPR), and
Specificity.

Li, Jiajia, Christopher Brown, Dillon J. Dzikowicz, Mary G. Carey, Wai Cheong Tam,
and Michael Xuelin Huang. [7] A machine learning-based heart health monitoring model,
named H2M, was developed. Twenty-four-hour electrocardiogram (ECG) data from 112
career firefighters were used to train the proposed model. The model used carefully
designed multilayer convolutional neural networks with maximum pooling, dropout, and
global maximum pooling to effectively learn the indicative ECG characteristics. H2M
was benchmarked against three existing state-of-the-art machine learning models. Results
showed the proposed model was robust and had an overall accuracy of approximately
94.3%. A parametric study was conducted to demonstrate the effectiveness of key model
components. An additional data study was also carried out, and it showed that using
non-firefighters’ ECG data to train the H2M model led to a substantial error of
approximately 40%. The contribution of this work was to provide firefighters with
on-demand, real-time heart health status to enhance their situational awareness and
safety. This could help reduce firefighters’ injuries and deaths caused by sudden cardiac
events.

Chen, Dan, Juan Feng, HongYan He, WeiPing Xiao, and XiaoJing Liu. [8] Evidence-based
medicine showed that obesity was associated with a wide range of cardiovascular (CV)
diseases. Obesity led to changes in cardiac structure and function, which could result in
obese cardiomyopathy, subclinical cardiac dysfunction, and even heart failure. It also
increased the risk of atrial fibrillation and sudden cardiac death. Many invasive and
noninvasive diagnostic methods could detect obesity-related heart disease at an early
stage, so that appropriate measures could be selected to prevent adverse CV events.
However, studies had shown a protective effect of obesity on clinical outcomes of CV
disease, a phenomenon termed the obesity paradox. The “obesity paradox” essentially
reflected the fact that the classification of obesity defined by body mass index (BMI)
did not consider the impact of obesity heterogeneity on CV disease prognosis, but simply
put subjects with different clinical and biochemical characteristics into the same
category. In any case, indicators such as waist-to-hip ratio, the quality and quantity of
ectopic body fat, and CV fitness had been shown to distinguish different CV risks in
patients with the same BMI, which facilitated early and appropriate intervention. A
multidisciplinary approach, including lifestyle modification, evidence-based generic and
novel pharmacotherapy, and surgical intervention, could improve CV outcomes in
overweight/obese patients.

Maulani, Ahmad Alaik, Sri Winarno, Junta Zeniarja, Rusyda Tsaniya Eka Putri, and Ailsa
Nurina Cahyani. [9] Heart disease, the leading cause of death worldwide, accounted for
about 17.9 million deaths in 2019, or about 32% of total global deaths, according to the
World Health Organization (WHO). The significance of early detection of heart disease
drove research to develop effective diagnosis systems utilizing machine learning. Machine
learning in healthcare primarily played a supporting role, helping clinicians and analysts
fulfill their roles, identify healthcare trends, and develop disease prediction models.
Meanwhile, deep learning had experienced rapid development and had become the most popular
method in recent years, including for disease detection. The main objective of this
research was to optimize the hybrid convolutional neural network (CNN) and long short-term
memory (LSTM) model for classifying heart disease by comparing hyperparameter optimization
using grid search and random search. Although random search required less time for
hyperparameter tuning, grid search yielded higher classification accuracy. In the test,
the hybrid CNN and LSTM model with grid search achieved 91.67% accuracy, 89.66% recall
(sensitivity), 93.55% specificity, 92.86% precision, 91.23% F1-score, and an AUC of 0.9310.
These results confirmed that using a hybrid CNN and LSTM model with a grid search approach
was better suited for classifying heart disease.
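The trade-off between the two tuning strategies can be sketched without any model at all. In the minimal example below, `toy_score` is a made-up stand-in for validation accuracy, and the hyperparameter names and values are assumptions chosen for illustration; none of this reproduces the cited study's setup.

```python
import itertools
import random


def toy_score(learning_rate, units):
    """Stand-in for validation accuracy; peaks at learning_rate=0.01, units=64."""
    return 1.0 - abs(learning_rate - 0.01) * 10 - abs(units - 64) / 256


grid = {"learning_rate": [0.001, 0.01, 0.1], "units": [32, 64, 128]}


def grid_search(grid):
    """Evaluate every combination in the grid (exhaustive, hence slower)."""
    names = list(grid)
    candidates = [dict(zip(names, combo)) for combo in itertools.product(*grid.values())]
    return max(candidates, key=lambda c: toy_score(**c)), len(candidates)


def random_search(grid, n_trials, seed=0):
    """Evaluate only n_trials randomly drawn combinations (cheaper, may miss the best)."""
    rng = random.Random(seed)
    candidates = [
        {name: rng.choice(values) for name, values in grid.items()}
        for _ in range(n_trials)
    ]
    return max(candidates, key=lambda c: toy_score(**c)), len(candidates)


best_grid, n_grid = grid_search(grid)
best_random, n_random = random_search(grid, n_trials=4)
print(n_grid, best_grid)
print(n_random, best_random)
```

Grid search pays for nine evaluations and is guaranteed to find the best grid point; random search pays for four and can only match it by chance, which mirrors the time-versus-accuracy trade-off reported above.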

Parveen, Nikhat, Manisha Gupta, Shirisha Kasireddy, Md Shamsul Haque Ansari, and
Mohammad Nadeem Ahmed. [10] The timely prediction of heart diseases with an automated
system reduced the mortality rate of cardiac disease patients. However, detecting cardiac
disease was a difficult task because the small variations in the ECG signal were not
easily visible to the human eye. To overcome this issue, many techniques had been
introduced to classify the variation in beats. However, those techniques suffered from
high error and failed to learn the spatiotemporal features, which badly affected accuracy.
Hence, a novel hybridized DL technique was introduced, which analyzed the spatiotemporal
features and performed heartbeat classification accurately with a lower error rate. At the
initial stage, the signal from the raw dataset was smoothed to enhance accuracy. The
pre-processed samples were then balanced using the synthetic minority oversampling
technique (SMOTE) to avoid over-fitting issues. Then, spatiotemporal features were
extracted using a novel hybridized DL-based One-Dimensional Residual Deep Convolutional
Auto-Encoder (1D-RDCAE) technique. Finally, an ML-based extreme gradient boosting (XGB)
classifier was introduced to classify the ECG heartbeats effectively. The proposed model
was implemented in Python and evaluated on the MIT-BIH arrhythmia dataset. Performance
measures such as accuracy, sensitivity, specificity, and false negative rate were analyzed
and compared with existing techniques. In the experimental section, the proposed model
obtained an accuracy of 99.9% and a specificity of 99.8%. Compared to other existing
models, the proposed model showed better outcomes. Consequently, clinical cardiac care
systems might benefit from this strategy as well.
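The core idea behind SMOTE, generating a synthetic minority sample by interpolating between a minority sample and one of its nearest minority neighbours, can be sketched in a few lines. This is a simplified illustration under assumed toy data, not the implementation used in the cited paper; the function name `smote_sketch` and its parameters are hypothetical.

```python
import math
import random


def smote_sketch(minority, n_new, k=2, seed=0):
    """Create n_new synthetic points by interpolating between minority neighbours."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # The k nearest minority neighbours of the base point (excluding itself).
        neighbours = sorted(
            (p for p in minority if p is not base),
            key=lambda p: math.dist(base, p),
        )[:k]
        neighbour = rng.choice(neighbours)
        gap = rng.random()  # position along the segment from base to neighbour
        synthetic.append(
            tuple(b + gap * (n - b) for b, n in zip(base, neighbour))
        )
    return synthetic


# Toy 2-D minority-class samples, for illustration only.
minority = [(1.0, 1.0), (1.2, 0.9), (0.8, 1.1), (1.1, 1.3)]
new_points = smote_sketch(minority, n_new=3)
print(new_points)
```

Because each synthetic point lies on a segment between two real minority samples, the new points stay inside the region the minority class already occupies rather than duplicating existing samples.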

Tartarisco, Gennaro, Giovanni Cicceri, Roberta Bruschetta, Alessandro Tonacci, Simona
Campisi, Salvatore Vitabile, Antonio Cerasa et al. [11] Cardiovascular diseases were
the leading cause of death globally. Among the strategies to prevent cardiovascular
issues, the automated classification of heart sound abnormalities was an efficient way to
detect early signs of cardiac conditions leading to heart failure or other, even
asymptomatic, complications, and was quite effective for timely interventions. Despite the
significant improvements in this field, there were still limitations due to the lack of
solutions and available datasets, and to poor (mainly binary, normal vs. abnormal)
classification models and algorithms. This paper presented a Medical Cyber–Physical
System (MCPS) for the automatic, timely, onsite classification of heart valve diseases.
The proposed MCPS could be deployed on personal and mobile devices, addressing the
limitations of existing solutions for patients, healthcare practitioners, and researchers
through an efficient and easily accessible tool. It combined different neural network
models trained on a new Italian dataset of 132 adult patients covering 9 heart sound
categories (1 normal and 8 abnormal), and was also validated against two main open-access
datasets (Physionet/CinC Challenge 2016 and Korean). The overall MCPS performance (time,
processing, and energy resource utilization) and the high accuracy of the models (up to
98%) demonstrated the feasibility of the proposed solution, even with few data. The
dataset supporting the findings of this paper was available upon request to the authors.

Jemima, P. Preethy, R. Gokul, R. Ashwin, and S. Matheswaran. [12] Artificial intelligence
(AI) brought about a revolution in the healthcare sector thanks to the growing availability
of both structured and unstructured data, as well as the rapid advancement of analytical
methodologies. Medical diagnosis models were essential to saving human lives; thus,
clinicians had to be confident enough to treat a patient as advised by a black-box model.
Concerns regarding the lack of openness and understandability, as well as potential bias
in the model's predictions, grew as AI's significance in healthcare increased. The use of
neural networks as a classification method became increasingly significant, since their
benefits made it possible to classify given data effectively. This study used an
optimized generalized metric learning neural network model approach to examine a
dataset on heart disorders. In the context of cardiac disease, the authors first analyzed
the correlation and interdependence of several medical features. The goal was to identify
the most pertinent characteristics (an ideal reduced feature subset) for detecting heart
disease.

Shivadekar, Samit, Ketan Shahapure, Shivam Vibhute, and Ashley Dunn. [13] Heart failure
(HF) was a chronic cardiovascular condition that affected millions of people globally. It
could lead to various symptoms and had a significant impact on quality of life. Despite
the advancements that had been made in treating this condition, it remained a major public
health issue. One of the biggest challenges in HF management was the high number of
readmissions. This issue adversely affected patients' outcomes and imposed costs on the
healthcare system. Implementing effective interventions and identifying those at high risk
of returning to the hospital could help lower the financial burden on the system. Through
the use of machine learning techniques, researchers could now predict the likelihood of HF
readmissions. These tools could analyze large datasets and provide a personalized
diagnosis and treatment plan. Various studies had examined the use of ML for predicting HF
readmissions. The goal of this study was to analyze the techniques used in predicting HF
readmissions and provide a comprehensive analysis of their performance. Using data
collected from various sources, including a diverse set of patients, the authors explored
the performance of various ML algorithms. In addition to the algorithms' performance, they
also examined related aspects such as model evaluation metrics, optimization techniques,
and feature selection. The findings of this study were intended to inform policymakers and
healthcare providers about the use of ML techniques to identify patients at high risk of
HF readmissions. These insights could help them improve the quality of care for those with
this condition and develop effective interventions. The objective of this study was to use
the power of ML to improve the management of HF and reduce the burden of readmissions on
both the patients and the healthcare systems.

Chakraborty, C. Parnasree. [14] Heart disease was one of the major diseases causing
sudden loss of life. Early diagnosis of heart-related problems could prevent the
progression of the disease. In this paper, a hybrid ensemble machine learning model was
suggested with a correlation-based feature selection algorithm. The proposed model was
built using conventional ensemble bagging, boosting, and stacking methods. Standard
machine learning algorithms such as Support Vector Machine, k-Nearest Neighbors,
Logistic Regression, Decision Tree, and Gaussian naïve Bayes were used to build the
ensemble model. The suggested approach was well suited for medical assistance to
physicians, as it achieved 97.4% disease classification accuracy and 98% precision,
improvements of 4% and 2%, respectively, over conventional methods.

Kaur, Ishleen, and Tanvir Ahmad. [15] The main goals of this study were to create a
reliable data analysis model that could help with (i) a better understanding of congenital
heart disease (CHD) prediction in the presence of missing and unbalanced data and (ii)
creating cohorts of expectant mothers with similar lifestyle characteristics. Clusters of
patient cohorts were produced using the unsupervised data mining technique density-based
spatial clustering of applications with noise (DBSCAN). For more accurate CHD
prediction, a random forest model was trained using these clusters and their
corresponding patterns. This study used a dataset of 33,831 expectant mothers to make
its prediction. Missing data were handled using the k-NN imputation approach, while
extremely unbalanced data were balanced using SMOTE. These techniques were all data-
driven and needed little to no user or expert involvement.
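The k-NN imputation step mentioned above can be illustrated with a minimal sketch: a missing value is filled with the mean of that column over the k most similar complete records. The function name `knn_impute`, the distance choice (Euclidean on the observed columns), and the toy rows are assumptions for illustration and do not reflect the cited study's data or exact procedure.

```python
import math


def knn_impute(rows, k=2):
    """Fill None entries with the column mean over the k nearest complete rows."""
    complete = [r for r in rows if None not in r]
    filled = []
    for row in rows:
        if None not in row:
            filled.append(list(row))
            continue
        known = [i for i, v in enumerate(row) if v is not None]
        # Distance is measured only on the columns this row actually has.
        nearest = sorted(
            complete,
            key=lambda c: math.dist([row[i] for i in known], [c[i] for i in known]),
        )[:k]
        filled.append([
            v if v is not None else sum(c[i] for c in nearest) / len(nearest)
            for i, v in enumerate(row)
        ])
    return filled


# Toy records with one missing value (None), for illustration only.
rows = [
    [1.0, 10.0],
    [2.0, 20.0],
    [9.0, 90.0],
    [1.5, None],
]

imputed = knn_impute(rows)
print(imputed[3])
```

The incomplete record is closest to the first two rows, so its missing value is filled from them rather than from the distant third row, which is why this approach needs little manual tuning beyond the choice of k.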

Searles, Charles D. [16] MicroRNAs (miRNAs), short non-coding RNAs, play important roles
in almost all aspects of cardiovascular biology, and changes in intracellular miRNA
expression are indicative of cardiovascular disease development and progression.
Extracellular miRNAs, which are easily measured in blood and can be reflective of changes
in intracellular miRNA levels, have emerged as potential non-invasive biomarkers for
disease. This review summarizes current knowledge regarding miRNAs as biomarkers for
assessing cardiovascular disease risk and prognosis.

Wright, Brandon, Carly Fassler, Dmitry Tumin, and Lauren A. Sarno. [17] Patients with
congenital heart defects (CHD) seen in the clinic during 2018 and subsequently lost to
cardiology follow-up were included in the study. Loss to follow-up was defined as not
being seen in the clinic for at least 6 months past the most recently recommended
follow-up visit. Subsequent visits to other locations, including other subspecialty
clinics, primary care clinics, the emergency department (ED), and the hospital, were
tracked through 2020. Of 235 patients (median age 7 years, 136/99 female/male), 96 (41%)
were seen elsewhere in the health system. Of the 96 patients with any follow-up, 40 were
seen by a primary care provider and 46 by another specialist; 44 were seen in the ED and
12 more were hospitalized. Patients with medical comorbidities or
Medicaid insurance and those living closer to the clinic were more likely to have
continued receiving care within the same health system.

Jou, Stephanie, Sean R. Mendez, Jason Feinman, Lindsey R. Mitrani, Valentin Fuster,
Massimo Mangiola, Nader Moazami, and Claudia Gidea. [18] Approximately 65 million
adults globally had heart failure, and the prevalence was expected to increase
substantially with ageing populations. Despite advances in pharmacological and device
therapy of heart failure, long-term morbidity and mortality remained high. Many patients
progressed to advanced heart failure and developed persistently severe symptoms. Heart
transplantation remained the gold-standard therapy to improve the quality of life,
functional status, and survival of these patients. However, there was a large imbalance
between the supply of organs and the demand for heart transplants. Therefore, expanding
the donor pool was essential to reduce mortality while on the waiting list and improve
clinical outcomes in this patient population. A shift had occurred to consider the use of
organs from donors with hepatitis C virus, HIV, or SARS-CoV-2 infection. Other
advances in this field had also expanded the donor pool, including opt-out donation
policies, organ donation after circulatory death, and xenotransplantation. The authors
provided a comprehensive overview of these novel strategies, presented objective data on
their safety and efficacy, and discussed some of the unresolved issues and controversies
of each approach.

Christogianni, Aikaterini. [19] The article discussed the benefits of continuous data
monitoring in healthcare via digital devices and wearables. The purpose was to discuss
recent advancements in digital health technologies and how they could positively impact
the quality of life in chronic diseases, such as cardiovascular diseases and dementia. In
addition, the article discussed how large amounts of health data, medical patient
information, and continuous monitoring could assist in positive patient feedback,
symptom interpretation, and early disease detection. Data processing and simulation
programs, such as digital twins, showed evidence of predictive validity. Machine learning
algorithms showed evidence of identifying patterns and relationships in the data,
improving patient outcomes, and supporting healthcare decision-making. However,
clinical validations from experts were necessary to ensure correct diagnoses, prognoses,
and treatment plans.

Baek, Ji Yoon, Seung Hee Seo, Sooyoung Cho, Jun-Bean Park, Bhumsuk Keam, Shin
Hye Yoo, and Aesun Shin. [20] This study aimed to examine the impact of the COVID-19
pandemic on the emergency department (ED) visits of cardiovascular disease (CVD)
patients. The customized data of the National Health Insurance Service (NHIS) from 2017
to 2020 were analyzed. CVD patients were defined by the code ‘V192’ based on the NHIS
coverage benefit expansion policy. The number of ED visits of CVD patients, as well as
executed procedures in 2020 (during the pandemic), were compared to the corresponding
average numbers in 2018 and 2019 (pre-pandemic). Stratification by age group,
residential area, and hospital location was performed. The number of ED visits of newly
diagnosed CVD patients decreased by 2.1% nationwide in 2020 (2018–2019: 97,041;
2020: 95,038) and decreased the most (by 14.1%) in March (2018–2019: 8539; 2020:
7334). However, the number of executed procedures increased by 1.1% nationwide in
2020 (2018–2019: 74,696; 2020: 75,520), while it decreased by 11.9% in April
(2018–2019: 6603; 2020: 5819). The most notable decreases in the number of newly diagnosed
CVD patients (31.7%) and procedures (29.2%) in March 2020 were observed in the
Daegu·Gyeongbuk area. CVD patients living in the epicenter of the COVID-19 pandemic
may have experienced difficulty accessing healthcare facilities and receiving proper
treatment.

Trivedi, Rupal. [21] The Southern US was disproportionately burdened by cardiovascular
diseases (CVDs). A large proportion of CVD risk in the South could be attributed to
unhealthy diets. Implementation science approaches could enable the uptake of
evidence-based dietary practices among CVD patients in the South. The objectives of
this dissertation were 1) to explore dietary practices and perceptions of heart failure (HF)
patients in the South who completed a heart-healthy dietary feeding intervention trial
(chapter 2), 2) to determine the acceptability of a newly developed 14-day Southern
Dietary Approaches to Stop Hypertension (DASH) diet meal plan among HF patients in
the South, and to assess its associations with participants’ demographic and health
characteristics (chapter 3), and 3) to examine the effectiveness and feasibility of
telehealth-based dietary interventions in improving CVD risk factors (chapter 4). Study
1 showed HF patients’ dietary practices were characterized by the Southern dietary
pattern and influenced by clinical dietary advice. Participants’ health goals and family
responsibilities enabled the consumption of the DASH diet; however, finances, beliefs
about their ability to follow the diet, social influences, and emotions prevented them from
consuming the diet. Participants reported that additional dietary resources (e.g., new
recipes, weekly menus, financial support) would enable them to align their diet with
a DASH diet. In study 2, 3 out of the 4 tested days of a newly developed Southern DASH
diet meal plan were accepted among HF patients in the South. The mean acceptability
rating was significantly higher for participants who previously received healthcare
provider-led dietary education than those who did not (7.70 vs. 6.70 [p = 0.01],
respectively). In study 3, telehealth-based dietary interventions significantly improved
systolic blood pressure (MD: -2.74 [95% CI: -4.93 to -0.56]) and low-density lipoprotein
cholesterol (SMD: -0.11 [95% CI: -0.19 to 0.03]), compared to usual care among CVD
patients. There was no significant difference between the feasibility of telehealth-based
dietary interventions and usual care. In conclusion, it was critically important to improve
dietary habits and behaviors among CVD patients in the South through implementation
science approaches as a means of promoting secondary CVD management in the region.

Wiatma, Deny Sutrisna, Reksa Samoedra, I. Putu Bayu Agus Saputra, and Bayu Setia.
[22] In 2019, there were 523 million cases of cardiovascular disease, which caused the
deaths of 18.6 million people. Several factors therefore needed to be considered, such as
cardiovascular endurance: high cardiovascular endurance could reduce the incidence of
cardiovascular disease by 40% to 70%. The objective was to analyze the relationship of
physical activity and smoking habits with cardiovascular endurance among farmers in
Pandan Wangi Village. The method used was quantitative research with a cross-sectional
design involving 108 respondents. The respondents were selected by a simple random
sampling technique. Data in this study were collected using GPAQ for physical activity
variables, the Brinkman index questionnaire for smoking variables, and the Harvard step
test for cardiovascular endurance variables. The Spearman rank test was used in the data
analysis. The research showed that the respondents were predominantly male (64.8%) and in
the 36-45 age group (52.8%). In addition, the largest shares of respondents were
non-smokers (62.0%), had a high level of physical activity (52.8%), and had a very good
level of cardiovascular endurance (27.8%). Bivariate analysis showed that both physical
activity (p-value = 0.005) and smoking behavior (p-value = 0.047) had a significant
relationship with cardiovascular endurance among farmers in Pandan Wangi Village.

Bhende, Vishal V., Tanishq S. Sharma, Mathangi Krishnakumar, Anikode Subramanian
Ramaswamy, Kanchan Bilgi, Sohilkhan R. Pathan, and Sohilkhan Pathan. [23] Pediatric
patients undergoing reoperative cardiac surgery after a previous sternotomy faced a
higher degree of surgical complexity compared to those undergoing initial procedures.
They had higher intraoperative and postoperative risks. The increased risk of surgery was
due to preoperative patient factors and intraoperative technical challenges. Redo-pediatric
cardiac surgery was a common event in almost every pediatric cardiac surgeon's
professional life. Redo-surgery was almost inevitable in patients who had multi-stage
repair of congenital heart surgeries and biological valves at a young age, and often in
those having valve repair in rheumatic disease. So, being familiar with the pitfalls and
precautions to be taken was of crucial importance. In general, the patients presenting for
repeat procedures were sicker, older, and had more comorbid conditions. The dissection
was always rendered difficult by adhesions, scarring, and previous graft placements.
Hence, prolonged dissection time, intraoperative injuries to heart chambers, great vessels,
and grafts, increased bleeding, and poorer cardiac function resulted in higher morbidity
and mortality in such subsets of patients. The outcome was worse with emergency redo-
cardiac surgeries.

Charchar, Fadi J., Priscilla R. Prestes, Charlotte Mills, Siew Mooi Ching, Dinesh
Neupane, Francine Z. Marques, James E. Sharman et al. [24] Hypertension, defined as
persistently elevated systolic blood pressure (SBP) >140 mmHg and/or diastolic blood
pressure (DBP) at least 90 mmHg (International Society of Hypertension guidelines),
affected over 1.5 billion people worldwide. Hypertension was associated with increased
risk of cardiovascular disease (CVD) events (e.g., coronary heart disease, heart failure,
and stroke) and death. An international panel of experts convened by the International
Society of Hypertension College of Experts compiled lifestyle management
recommendations as a first-line strategy to prevent and control hypertension in adulthood.
It was also recommended that lifestyle changes be continued even when blood pressure-
lowering medications were prescribed. Specific recommendations based on literature
evidence were summarized with advice to start these measures early in life, including
maintaining a healthy body weight, increasing levels of different types of physical
activity, adopting healthy eating and drinking habits, avoiding and ceasing smoking and
alcohol use, and managing stress and sleep levels. The relevance of specific approaches
including consumption of sodium, potassium, sugar, fiber, coffee, tea, intermittent
fasting, as well as integrated strategies to implement these recommendations using, for
example, behavior change-related technologies and digital tools, was also discussed.

Campbell-Washburn, Adrienne E., Juliet Varghese, Krishna S. Nayak, Rajiv Ramasawmy,
and Orlando P. Simonetti. [25] Cardiac MR imaging was well established for the
assessment of cardiovascular structure and function, myocardial scar, quantitative flow,
parametric mapping, and myocardial perfusion. Despite the clear evidence supporting the
use of cardiac MRI for a wide range of indications, it was underutilized clinically. Recent
developments in low-field MRI technology, including modern data acquisition and image
reconstruction methods, enabled high-quality low-field imaging that might
improve the cost–benefit ratio for cardiac MRI. Studies to date confirmed that low-field
MRI offered high measurement concordance and consistent interpretation with clinical
imaging for several routine sequences. Moreover, low-field MRI might enable
specific new clinical opportunities for cardiac imaging such as imaging near metal
implants, MRI-guided interventions, combined cardiopulmonary assessment, and
imaging of patients with severe obesity. In this review, we discussed the recent progress
in low-field cardiac MRI with a focus on technical developments and early clinical
validation studies.

Seng, Nang San Hti Lar, Gebremichael Zeratsion, Oscar Yasser Pena Zapata, Muhammad
Umer Tufail, and Belinda Jim. [26] Cardiovascular disease was a major cause of death
worldwide, especially in patients with chronic kidney disease (CKD). Troponin T and
troponin I were cardiac biomarkers used not only to diagnose acute myocardial infarction
(AMI) but also to prognosticate cardiovascular and all-cause mortality. The diagnosis of
AMI in the CKD population was challenging because of their elevated troponins at
baseline. The development of high-sensitivity cardiac troponins shortened the time
needed to rule in and rule out AMI in patients with normal renal function. While the
sensitivity of high-sensitivity cardiac troponins was preserved in the CKD population, the
specificity of these tests was compromised. Hence, diagnosing AMI in CKD remained
problematic even with the introduction of high-sensitivity assays. The prognostic
significance of troponins did not differ whether it was detected with standard or high-
sensitivity assays. The elevation of both troponin T and troponin I in CKD patients
remained strongly correlated with adverse cardiovascular and all-cause mortality, and the
prognosis became poorer with advanced CKD stages. Interestingly, the degree of troponin
elevation appeared to be predictive of the rate of renal decline via unclear mechanisms
though activation of the renin-angiotensin and other hormonal/oxidative stress systems
remained suspect. In this review, we presented the latest evidence of the use of cardiac
troponins in both the diagnosis of AMI and the prognosis of cardiovascular and all-cause
mortality. We also suggested strategies to improve on the diagnostic capability of these
troponins in the CKD/end-stage kidney disease population.

Thummisetti, Bala Siva Prakash, and Haritha Atluri. [27] This research paper explored the
transformative potential of federated learning in healthcare informatics, focusing on its
pivotal role in balancing advancements with privacy and security imperatives. In an era
marked by exponential growth in healthcare data, federated learning emerged as a
promising paradigm to enable collaborative model training without compromising the
confidentiality of sensitive patient information. Through a decentralized approach, this
paper elucidated the mechanisms of secure aggregation, differential privacy, and
encryption protocols inherent in federated learning, emphasizing their significance in
preserving data privacy. By dissecting real-world implementations and case studies, it
underscored the practical applicability of federated learning while addressing ethical
implications, regulatory considerations, and potential challenges. Ultimately, this paper
advocated for the widespread integration of federated learning in healthcare informatics,
positing it as a cornerstone in advancing medical research while ensuring robust privacy
and security safeguards.
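
The secure aggregation mechanism mentioned above can be illustrated with pairwise cancelling masks: each pair of clients agrees on a random mask that one adds and the other subtracts, so a server summing the masked updates recovers the true total without seeing any individual update. The sketch below is a toy illustration under simplifying assumptions (integer updates, a single shared random generator standing in for pairwise key agreement, no client dropouts); it is not the protocol described in the cited paper, and all names are hypothetical.

```python
import random

MODULUS = 2**32  # all arithmetic is done modulo this value

def mask_updates(updates, seed=0):
    """Add pairwise cancelling masks to a list of integer client updates."""
    rng = random.Random(seed)  # assumption: stands in for pairwise shared secrets
    n = len(updates)
    masked = list(updates)
    for i in range(n):
        for j in range(i + 1, n):
            m = rng.randrange(MODULUS)
            masked[i] = (masked[i] + m) % MODULUS  # client i adds the mask
            masked[j] = (masked[j] - m) % MODULUS  # client j subtracts it
    return masked

def aggregate(masked):
    """Server-side sum; the pairwise masks cancel modulo MODULUS."""
    return sum(masked) % MODULUS

client_updates = [5, 11, 7]            # hypothetical integer-encoded updates
masked = mask_updates(client_updates)
print(aggregate(masked))               # equals sum(client_updates) modulo MODULUS
```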

Shield, Kevin, Catherine Paradis, Peter Butt, Tim Naimi, Adam Sherk, Mark Asbridge,
Daniel Myran et al. [28] Low-Risk Alcohol Drinking Guidelines (LRDGs) aimed to
reduce the harms caused by alcohol. However, considerable discrepancies existed in the
‘low-risk’ thresholds employed by different countries. Drawing upon Canada's LRDGs
update process, the current paper offered the following propositions for debate regarding
the establishment of ‘low-risk’ thresholds in national guidelines: (1) as an indicator of
health loss, years of life lost (YLL) had several advantages that could make it more
suitable for setting guidelines than deaths, premature deaths, or disability-adjusted years
of life (DALYs) lost. (2) Presenting age-specific guidelines may not have been the most
appropriate way of providing LRDGs. (3) Given past overemphasis on the so-called
protective effects of alcohol on health, presenting cause-specific guidelines may not have
been appropriate compared with a ‘whole health’ effect derived from a weighted
composite risk function comprising conditions that were causally related to alcohol
consumption. (4) To help people reduce their alcohol use, presenting different risk zones
associated with alcohol consumption instead of a single low-risk threshold may have been
advantageous.

Neshat, Sina, Abbas Rezaei, Armita Farid, Salar Javanshir, Fatemeh Dehghan Niri,
Padideh Daneii, Kiyan Heshmat-Ghahdarijani, and Setayesh Sotoudehnia Korani.
[29] Cardiovascular diseases (CVDs) posed a serious threat to people’s health, with
extremely high global morbidity, mortality, and disability rates. This study aimed to
review the literature that examined the relationship between blood groups and CVD.
Many studies have reported that non-O blood groups were associated with an increased
risk and severity of coronary artery disease and acute coronary syndromes. Non-O blood
groups increased the risk and severity of these conditions by increasing von Willebrand
factor and plasma cholesterol levels and inducing endothelial dysfunction and
inflammation. They were also linked with increased coronary artery calcification,
coronary lesion complexity, and poor collateral circulation. Blood groups also affected
the prognosis of coronary artery disease and acute coronary syndrome and could alter the
rate of complications and mortality. Several cardiovascular complications were described
for coronavirus disease 2019, and blood groups could influence their occurrence. No
studies found a significant relationship between the Lewis blood group and CVD. In
conclusion, people with non-O blood groups should be vigilantly monitored for
cardiovascular risk factors as prevention and proper treatment of these risk factors might
mitigate their risk of CVD and adverse cardiovascular events.

Lee, Chien-Chiang, and Zihao Yuan. "Impact of energy poverty on public health: A
non-linear study from an international perspective." [30] Research on energy poverty and its
impact has been quite extensive, but research on the impact of such poverty on public health was still
lacking. This paper thus presented the relationship between energy poverty and public
health of 185 countries from 2000 to 2020 as well as the role of urbanization development
levels in this nexus. To achieve this goal, this study used a partial linear function
coefficient (PLFC) method to analyze the relationship between them, which could also
clearly exhibit the non-linear impact of energy poverty on public health. First, both linear
and non-linear regression results showed that energy poverty had significantly negative
impacts on public health. Second, urbanization level played a significant moderating
effect in the energy poverty and public health nexus, meaning that energy poverty affected
public health under the influence of urbanization. According to the PLFC model results,
countries that exceeded the threshold of urbanization had significantly reduced the
adverse effects of energy poverty on public health. Third, this study investigated the
heterogeneous impact of energy poverty across different regions, comparing the
Sub-Saharan Africa region with other areas. The results revealed that, in the Sub-Saharan
Africa region, affordable energy under the influence of urbanization provided a new
pathway for improving public health, whereas this effect was considerably
smaller in other regions. Additionally, a series of tests confirmed the robustness of the
results. This paper offered a reference for the development and implementation of
renewable energy-related public health policies.

Li, Jian Ping, Amin Ul Haq, Salah Ud Din, Jalaluddin Khan, Asif Khan, and Abdus
Saboor. [31] Heart disease was a complex disease, and globally, many people
suffered from it. Timely and efficient identification of heart disease played a
key role in healthcare, particularly in the field of cardiology. In our article, we proposed
an efficient and accurate system to diagnose heart disease, based on machine learning
techniques. The system was developed based on classification algorithms, including
Support Vector Machine, Logistic Regression, Artificial Neural Network, K-nearest
Neighbor, Naïve Bayes, and Decision Tree, while standard feature selection algorithms
were used, such as Relief, Minimal Redundancy Maximal Relevance, Least Absolute
Shrinkage Selection Operator, and Local Learning for removing irrelevant and redundant
features. We also proposed a novel fast conditional mutual information feature selection
algorithm to solve the feature selection problem. The feature selection algorithms were
used to increase the classification accuracy and reduce the execution
time of the classification system. Furthermore, the leave-one-subject-out cross-validation
method was used for learning the best practices of model assessment and for
hyperparameter tuning. Performance metrics were used to assess the classifiers,
which were evaluated on the features chosen by the feature selection algorithms.
The experimental results
showed that the proposed feature selection algorithm (FCMIM) was feasible with the
classifier Support Vector Machine for designing a high-level intelligent system to identify
heart disease. The suggested diagnosis system (FCMIM-SVM) achieved good accuracy
compared to previously proposed methods. Additionally, the proposed system could
easily be implemented in healthcare for the identification of heart disease.
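
The leave-one-subject-out cross-validation mentioned above holds out all samples of one subject per fold, so that no subject appears in both the training and the test set. A minimal sketch of the splitting step follows; the function name and the subject labels are hypothetical, and the cited work may have organized its folds differently.

```python
def leave_one_subject_out(subject_ids):
    """Yield (held_out_subject, train_indices, test_indices) for each subject."""
    for subject in sorted(set(subject_ids)):
        test = [i for i, s in enumerate(subject_ids) if s == subject]
        train = [i for i, s in enumerate(subject_ids) if s != subject]
        yield subject, train, test

# Hypothetical subject label for each of five samples.
subjects = ["a", "a", "b", "c", "b"]
for held_out, train_idx, test_idx in leave_one_subject_out(subjects):
    print(held_out, train_idx, test_idx)
```

Each fold would then train a classifier on the training indices and score it on the held-out subject; averaging the per-fold scores gives the cross-validated estimate.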

Deng, Muqing, Tingting Meng, Jiuwen Cao, Shimin Wang, Jing Zhang, and Huijie
Fan. [32] Heart sound classification played a vital role in the early detection of
cardiovascular disorders, especially for small primary health care clinics. Despite much
progress being made in heart sound classification in recent years, most methods were
based on conventional segmented features and shallow structure-based classifiers. These
conventional acoustic representation and classification methods might have been
insufficient in characterizing heart sound and generally suffered from degraded
performance due to the complicated and changeable cardiac acoustic environment. In our
paper, we proposed a new heart sound classification method based on improved Mel-
frequency cepstrum coefficient (MFCC) features and convolutional recurrent neural
networks. The Mel-frequency cepstrums were first calculated without dividing the heart
sound signal. We proposed a new improved feature extraction scheme based on MFCC
to elaborate the dynamic characteristics among consecutive heart sound signals. Finally,
the MFCC-based features were fed to a deep convolutional and recurrent neural network
(CRNN) for feature learning and the subsequent classification task. The proposed deep learning
framework could take advantage of the encoded local characteristics extracted from the
convolutional neural network (CNN) and the long-term dependencies captured by the
recurrent neural network (RNN). We presented comprehensive studies on the
performance of different network parameters and different network connection strategies.
Performance comparisons with state-of-the-art algorithms were given for discussion.
Experiments showed that, for the two-class classification problem (pathological or non-
pathological), a classification accuracy of 98% had been achieved on the 2016
PhysioNet/CinC Challenge database.
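
MFCC features are built on the mel frequency scale, which spaces filter bands more densely at low frequencies. The sketch below shows only that frequency mapping, assuming one widely used convention, mel = 2595 * log10(1 + f / 700); the cited paper's improved MFCC scheme is not reproduced here, and the function names and the 0 to 2000 Hz range are hypothetical choices.

```python
import math

def hz_to_mel(f_hz):
    """Convert frequency in Hz to mel (assumed convention: 2595*log10(1+f/700))."""
    return 2595.0 * math.log10(1.0 + f_hz / 700.0)

def mel_to_hz(mel):
    """Inverse of hz_to_mel."""
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)

def mel_spaced_frequencies(low_hz, high_hz, count):
    """Return `count` frequencies in Hz that are evenly spaced on the mel scale."""
    lo, hi = hz_to_mel(low_hz), hz_to_mel(high_hz)
    step = (hi - lo) / (count - 1)
    return [mel_to_hz(lo + i * step) for i in range(count)]

# Hypothetical band edges for a low-frequency signal such as heart sound.
for f in mel_spaced_frequencies(0.0, 2000.0, 10):
    print(round(f, 1))
```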

Abdellatif, Abdallah, Hamdan Abdellatef, Jeevan Kanesan, Chee-Onn Chow, Joon Huang
Chuah, and Hassan Muwafaq Gheni. [33] Cardiovascular disease (CVD) was the leading
cause of death worldwide. A Machine Learning (ML) system could predict CVD in the
early stages to mitigate mortality rates based on clinical data. Recently, many research
works utilized different machine learning approaches to detect CVD or identify the
patient’s severity level. Although these works obtained promising results, none focused
on employing optimization methods to improve the ML model performance for CVD
detection and severity-level classification. This study provided an effective method based
on the Synthetic Minority Oversampling Technique (SMOTE) to handle the imbalanced
class distribution, six different ML classifiers to detect the patient status, and
Hyperparameter Optimization (HPO) to find the best hyperparameters for each ML classifier
together with SMOTE. Two public datasets were used to build and test the model using
all features. The results showed that SMOTE and Extra Trees (ET) optimized using
Hyperband achieved higher results than other models and outperformed the state-of-the-art
works, achieving 99.2% and 98.52% in CVD detection on the two datasets, respectively. Also, the
developed model reached 95.73% in severity classification using the Cleveland
dataset. The proposed model could help doctors determine a patient’s current heart
disease status. As a result, it was possible to prevent heart disease-related mortality by
implementing early therapy.
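
SMOTE balances a dataset by creating synthetic minority-class samples on the line segments between a minority sample and one of its nearest minority neighbours. The following is a simplified sketch of that interpolation step only, written in plain Python under stated assumptions (numeric features, Euclidean distance, at least two minority samples); the function name, parameters, and sample points are hypothetical, and this is not the implementation used in the cited study.

```python
import random

def smote(minority, n_new, k=2, seed=0):
    """Generate n_new synthetic points by interpolating between minority samples."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        idx = rng.randrange(len(minority))
        base = minority[idx]
        # The k nearest minority neighbours of the base point, excluding itself.
        others = [p for i, p in enumerate(minority) if i != idx]
        others.sort(key=lambda p: sum((a - b) ** 2 for a, b in zip(base, p)))
        neighbour = rng.choice(others[:k])
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append(tuple(a + gap * (b - a) for a, b in zip(base, neighbour)))
    return synthetic

# Hypothetical two-feature minority-class samples.
minority_samples = [(1.0, 1.0), (2.0, 1.0), (1.0, 2.0), (2.0, 2.0)]
for point in smote(minority_samples, n_new=3):
    print(point)
```

Oversampling of this kind is normally applied to the training split only, so that synthetic points do not leak into the test data.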

Chen, Yongchao, Shoushui Wei, and Yatao Zhang. [34] We proposed a novel method that
combined modified frequency slice wavelet transform (MFSWT) and convolutional
neural network (CNN) for classifying normal and abnormal heart sounds. A hidden
Markov model was used to find the position of each cardiac cycle in the heart sound
signal and determine the exact position of the four periods of S1, S2, systole, and diastole.
Then, the one-dimensional cardiac cycle signal was converted into a two-dimensional
time-frequency picture using the MFSWT. Finally, two CNN models were trained using
the aforementioned pictures. We combined two CNN models using sample entropy
(SampEn) to determine which model was used to classify the heart sound signal. We
evaluated our model on the heart sound public dataset provided by the PhysioNet
Computing in Cardiology Challenge 2016. Experimental classification performance from
a 10-fold cross-validation indicated that sensitivity (Se), specificity (Sp), and mean
accuracy (MAcc) were 0.95, 0.93, and 0.94, respectively. The results showed the
proposed method could classify normal and abnormal heart sounds with efficiency and
high accuracy.
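
The three figures reported above can be computed from a confusion matrix. The sketch below assumes that mean accuracy (MAcc) is the arithmetic mean of sensitivity and specificity, which is consistent with the reported values since (0.95 + 0.93) / 2 = 0.94; the function name and the label vectors are hypothetical.

```python
def sensitivity_specificity(y_true, y_pred, positive=1):
    """Return (Se, Sp, MAcc) for binary labels, with MAcc = (Se + Sp) / 2."""
    tp = fn = tn = fp = 0
    for truth, pred in zip(y_true, y_pred):
        if truth == positive:
            if pred == positive:
                tp += 1   # abnormal correctly flagged
            else:
                fn += 1   # abnormal missed
        else:
            if pred == positive:
                fp += 1   # normal wrongly flagged
            else:
                tn += 1   # normal correctly passed
    se = tp / (tp + fn)
    sp = tn / (tn + fp)
    return se, sp, (se + sp) / 2

# Hypothetical labels: 1 = abnormal heart sound, 0 = normal.
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 1, 1]
print(sensitivity_specificity(y_true, y_pred))
```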

Shah, Devansh, Samir Patel, and Santosh Kumar Bharti. [35] Heart disease, alternatively
known as cardiovascular disease, encompasses various conditions that impact the heart and has been
the primary cause of death worldwide over the past few decades. Heart disease is associated
with many risk factors, and accurate, reliable, and sensible approaches are needed
to make an early diagnosis and achieve prompt management of the
disease. Data mining is a commonly used technique for processing enormous data in the
healthcare domain. Researchers apply several data mining and machine learning
techniques to analyse huge, complex medical data, helping healthcare professionals to
predict heart disease. This research paper presents various attributes related to heart
disease and a model based on supervised learning algorithms such as Naïve Bayes,
decision tree, K-nearest neighbor, and random forest. It uses the existing
dataset from the Cleveland database of the UCI repository of heart disease patients. The
dataset comprises 303 instances and 76 attributes. Of these 76 attributes, only 14
are considered for testing, which is important for substantiating the performance of the different
algorithms. This research paper aims to estimate the probability of developing heart
disease in patients. The results show that the highest accuracy score is achieved with
K-nearest neighbor.
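
K-nearest neighbor, the best-performing algorithm in the study above, classifies a sample by majority vote among the k closest training samples. A minimal sketch follows, assuming Euclidean distance and k = 3; the function name and the toy two-feature data are hypothetical and unrelated to the Cleveland dataset.

```python
import math
from collections import Counter

def knn_predict(train_X, train_y, query, k=3):
    """Predict the label of `query` by majority vote among its k nearest points."""
    nearest = sorted(
        range(len(train_X)),
        key=lambda i: math.dist(train_X[i], query),  # Euclidean distance
    )[:k]
    votes = Counter(train_y[i] for i in nearest)
    return votes.most_common(1)[0][0]

# Hypothetical training data: two well-separated groups.
train_X = [(0, 0), (0, 1), (1, 0), (5, 5), (5, 6), (6, 5)]
train_y = [0, 0, 0, 1, 1, 1]
print(knn_predict(train_X, train_y, (0.5, 0.5)))
print(knn_predict(train_X, train_y, (5.5, 5.5)))
```

In practice the features would be scaled first, since distance-based voting is sensitive to attributes measured in different units.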

Oliveira, Jorge, Francesco Renna, Paulo Dias Costa, Marcelo Nogueira, Cristina Oliveira,
Carlos Ferreira, Alípio Jorge et al. [36] Cardiac auscultation was one of the most
cost-effective techniques used to detect and identify many heart conditions. Computer-assisted
decision systems based on auscultation could support physicians in their decisions.
Unfortunately, the application of such systems in clinical trials was still minimal since
most of them only aimed to detect the presence of extra or abnormal waves in the
phonocardiogram signal, i.e., only a binary ground truth variable (normal vs. abnormal)
was provided. This was mainly due to the lack of large publicly available datasets, where
a more detailed description of such abnormal waves (e.g., cardiac murmurs) existed. To
pave the way for more effective research on healthcare recommendation systems based
on auscultation, our team prepared the currently largest pediatric heart sound dataset. A
total of 5282 recordings were collected from the four main auscultation locations of 1568
patients; in the process, 215780 heart sounds were manually annotated. Furthermore, and
for the first time, each cardiac murmur was manually annotated by an expert annotator
according to its timing, shape, pitch, grading, and quality. In addition, the auscultation
locations where the murmur was present were identified as well as the auscultation
location where the murmur was detected more intensively. Such detailed description for
a relatively large number of heart sounds may pave the way for new machine learning
algorithms with a real-world application for the detection and analysis of murmur waves
for diagnostic purposes.

Balaji, Tata. [37] Heart Disease (HD) was one of the critical diseases that severely
affected humankind. Heart disease arose due to insufficient blood supply
to other body parts. Hence, diagnosing HD on time could prevent heart failure.
Traditional diagnostic procedures for HD detection and prediction were
unreliable in many circumstances. Recent studies provided evidence that applying
Machine Learning (ML) to traditional HD detection and prediction resulted in superior
performance. Further, Computer-Aided Diagnosis using one-dimensional and multi-
dimensional signals assisted in diagnosing the HDs at an early stage, thereby saving
human lives. The objective of this manuscript was to present an overview of HDs,
symptoms, and the role of ML in HD predictions followed by various state-of-the-art ML
algorithms that aided in the identification and prediction of HD at an early stage to save
human lives.

Pati, Abhilash, Manoranjan Parhi, and Binod Kumar Pattanayak. [38] The prediction of
heart disease (HD) helped physicians make accurate decisions towards the
improvement of patients' health. Hence, machine learning (ML), data mining (DM), and
classification techniques played a vital role in understanding and reducing the symptoms
related to HDs. In this paper, an integrated heart disease prediction model (IHDPM) was
introduced for HD prediction by considering principal component analysis (PCA) for
dimensionality reduction, sequential feature selection (SFS) for feature selection, and
a random forest (RF) classifier for classification. Experiments were performed in the
Python language, using different evaluation measures, on the Cleveland Heart Disease Dataset (CHDD)
sourced from the UCI-ML repository, and led to the conclusion that the
proposed model outperformed six other conventional classification techniques. The
proposed model could help physicians diagnose
heart patients proficiently and, at the same time, could be applicable to
predicting other chronic diseases such as diabetes and cancers.

Nagendra, Kolluru Venkata, Maligela Ussenaiah, and N. Rajasekhar. [39] Prevention was
better than cure. Prediction was a very important aspect of medicinal services.
Forecasting of heart disease was one of the most demanding issues in medicinal services.
Applications of Data Mining Techniques in the healthcare sector were increasing. Neural
Network (NN), Support Vector Machines (SVM), eXtreme Gradient Boosting (XGB),
Random Forest (RF), and Linear Discriminant Analysis (LDA) were some of the Data
Mining Classification Techniques. Enhanced Gradient Boosting (EGB) was a Data
Mining Classification Model that was extended from XGB. The proposed model was
evaluated using statistical metrics (Accuracy, Precision, Recall, and F1-Measure)
and ROC (Receiver Operating Characteristic) curve results obtained for performance
comparison. The results showed that the area under the ROC curve (AUC) obtained for EGB was
higher than that obtained for all the other Data Mining
Classification methods. The Precision value was also high for EGB.
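
The area under the ROC curve used for the comparison above equals the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case, with ties counted as one half. The sketch below computes it directly from that definition; the function name and the scores are hypothetical, and the pairwise loop is meant for clarity rather than speed.

```python
def roc_auc(y_true, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    # A correctly ordered pair counts 1, a tied pair counts 0.5.
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Hypothetical classifier scores for two negative and two positive cases.
y_true = [0, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8]
print(roc_auc(y_true, scores))
```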

Vamshi Kumar, S., T. V. Rajinikanth, and S. Viswanadha Raju. [40] Recent studies showed
that heart attack was one of the severe problems in today’s world. Prediction was one of
the crucial challenges in the medical field. The heart had two main blood vessels,
the coronary arteries, for its supply of blood. If the arteries got completely blocked,
then it led to a heart attack. The healthcare field had lots of data related to different
diseases, so machine learning techniques were useful to find results effectively for
predicting heart diseases. In this paper, the data were preprocessed to remove
noisy data and to fill missing values using measures of central tendency. Later, the
refined dataset was classified using classifiers, in addition to prediction. The number of
attributes was reduced using dimensionality reduction techniques, namely Linear
Transformation Techniques (LTT) like Principal Component Analysis (PCA) and Linear
Discriminant Analysis (LDA). The performances of the classifiers were analyzed based
on various accuracy-related metrics. The designed classifier model was able to predict
the occurrence of a heart attack. The Support Vector Machine (SVM) classifier was
applied along with the three kernels namely Linear (linear), Radial Basis Function (RBF),
and Polynomial (poly). Another technique namely Decision Tree (DT) was also applied
on the Cleveland dataset, and the results were compared in detail, and effective
conclusions were drawn from the results.
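
The three SVM kernels named above differ only in how they measure similarity between two feature vectors. The sketch below writes out their usual textbook forms; the parameter names gamma, coef0, and degree and their default values are assumptions chosen for illustration, not settings reported in the cited paper.

```python
import math

def linear_kernel(x, z):
    """Linear kernel: the dot product of x and z."""
    return sum(a * b for a, b in zip(x, z))

def polynomial_kernel(x, z, degree=3, gamma=1.0, coef0=1.0):
    """Polynomial kernel: (gamma * <x, z> + coef0) ** degree."""
    return (gamma * linear_kernel(x, z) + coef0) ** degree

def rbf_kernel(x, z, gamma=1.0):
    """RBF kernel: exp(-gamma * squared Euclidean distance between x and z)."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, z))
    return math.exp(-gamma * sq_dist)

# Hypothetical two-feature vectors.
x, z = (1.0, 2.0), (3.0, 4.0)
print(linear_kernel(x, z))
print(polynomial_kernel(x, z, degree=2))
print(rbf_kernel(x, z, gamma=0.1))
```

The linear kernel suits data that is close to linearly separable, while the polynomial and RBF kernels allow curved decision boundaries at the cost of extra hyperparameters to tune.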

Ali, Farman, Shaker El-Sappagh, SM Riazul Islam, Daehan Kwak, Amjad Ali,
Muhammad Imran, and Kyung-Sup Kwak. [41] The accurate prediction of heart disease
was essential to efficiently treating cardiac patients before a heart attack occurred. This
goal could be achieved using an optimal machine learning model with rich healthcare
data on heart diseases. Various systems based on machine learning had been presented
recently to predict and diagnose heart disease. However, these systems could not handle
high-dimensional datasets due to the lack of a smart framework that could use different
sources of data for heart disease prediction. In addition, the existing systems utilized
conventional techniques to select features from a dataset and compute a general weight
for them based on their significance. These methods had also failed to enhance the
performance of heart disease diagnosis. In this paper, a smart healthcare system was
proposed for heart disease prediction using ensemble deep learning and feature fusion
approaches. First, the feature fusion method combined the extracted features from both
sensor data and electronic medical records to generate valuable healthcare data. Second,
the information gain technique eliminated irrelevant and redundant features and selected
the important ones, which decreased the computational burden and enhanced the system
performance. In addition, the conditional probability approach computed a specific
feature weight for each class, which further improved system performance. Finally, the
ensemble deep learning model was trained for heart disease prediction. The proposed
system was evaluated with heart disease data and compared with traditional classifiers
based on feature fusion, feature selection, and weighting techniques. The proposed
system obtained an accuracy of 98.5%, which was higher than existing systems. This
result showed that our system was more effective for the prediction of heart disease, in
comparison to other state-of-the-art methods.
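
The information gain technique mentioned above scores a feature by how much knowing its value reduces the entropy of the class labels; features with a gain near zero are candidates for removal. A minimal sketch for a categorical feature follows, with hypothetical function names and toy data; the cited system's exact selection threshold is not stated in this summary and is not assumed here.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def information_gain(feature_values, labels):
    """Entropy of the labels minus their expected entropy after splitting on the feature."""
    n = len(labels)
    groups = {}
    for value, label in zip(feature_values, labels):
        groups.setdefault(value, []).append(label)
    remainder = sum(len(g) / n * entropy(g) for g in groups.values())
    return entropy(labels) - remainder

# Hypothetical labels (1 = disease) and two candidate categorical features.
labels = [1, 1, 0, 0]
informative = ["high", "high", "low", "low"]    # separates the classes perfectly
uninformative = ["x", "y", "x", "y"]            # tells nothing about the class
print(information_gain(informative, labels))
print(information_gain(uninformative, labels))
```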

Katarya, Rahul, and Sunit Kumar Meena. [42] People were increasingly caught up in
their day-to-day work and other activities and ignored their health. Due to
this hectic life and neglect of their health, the number of people getting sick
increased every day. Moreover, many people were suffering from diseases such as heart
disease. According to data from the World Health Organization (WHO), almost 31% of
global deaths were due to heart-related diseases. So, predicting whether
heart disease would occur became important for the medical field. However, the data
received by the medical sector or hospitals was so large that it was sometimes difficult
to analyze. Using machine learning techniques for this prediction and for handling the data
could be very efficient for medical practitioners. Hence, in this study, we discussed heart
disease and its risk factors and explained machine learning techniques. Using those
techniques, we predicted heart disease and provided a comparative
analysis of the machine learning algorithms used in the prediction experiments.
The objective of this research was the prediction of heart
disease via machine learning techniques and the analysis of those techniques.

Katarya, Rahul, and Polipireddy Srinivas. [43] Predicting and detecting heart disease has
always been a critical and challenging task for healthcare practitioners. Hospitals and
other clinics offered expensive therapies and operations to treat heart diseases. So,
predicting heart disease at the early stages was useful to people around the world so
that they could take necessary actions before it became severe. Heart disease was a significant
problem in recent times; the main causes of this disease were the intake of alcohol and
tobacco and a lack of physical exercise. Over the years, machine learning showed effective
results in making decisions and predictions from the broad set of data produced by the
healthcare industry. Some of the supervised machine learning techniques used in this
prediction of heart disease were artificial neural network (ANN), decision tree (DT),
random forest (RF), support vector machine (SVM), naïve Bayes (NB), and k-nearest
neighbor algorithm. Furthermore, the performances of these algorithms were
summarized.

Rath, Adyasha, Debahuti Mishra, Ganapati Panda, and Suresh Chandra Satapathy.[44] In
comparison to other diseases, the number of deaths from Heart Disease (HD) was the
highest across the globe. The trend of death due to HD was still rising, which had become
a constant source of concern amongst human beings. Researchers and doctors were
putting tremendous efforts to save lives from HD. It was observed from the literature that
a large number of researchers were currently carrying out their research work in various
aspects of HD. Among those, the early detection and diagnosis of HD were currently the
focus area of research. Appropriate, reliable, accurate, robust, and affordable HD
detection schemes were the ultimate goal for saving the lives of people. In this research,
articles on HD detection and diagnosis published in the recent past were collected and
critically analyzed. The outcome of the analysis was presented in various tabular forms
for easy understanding and further use. The paper provided thorough knowledge on
standard data sources on HD, the feature extraction, selection, and reduction methods,
and Machine Learning (ML) and Deep Learning (DL) based classification schemes. The
categorization of published articles and the various performance measures employed
were presented, which would develop interest amongst new researchers working in the
area of detection or classification of HD. The best performing technique in each category
was listed. The research challenges and future scope of work were also provided to
facilitate further research work in this promising area.

Menshawi, Alaa, Mohammad Mehedi Hassan, Nasser Allheeib, and Giancarlo
Fortino.[45] The early, valid prediction of heart problems would have minimized life
threats and saved lives, while lack of prediction and false diagnosis could have been fatal.
Addressing a single dataset alone to build a machine learning model for the identification
of heart problems was not practical because each country and hospital had its own data
schema, structure, and quality. On this basis, a generic framework was built for heart
problem diagnosis. This framework was a hybrid framework that employed multiple
machine learning and deep learning techniques and voted for the best outcome based on
a novel voting technique with the intention to remove bias from the model. The
framework contained two consecutive layers. The first layer contained simultaneous
machine learning models running over a given dataset. The second layer consolidated the
outputs of the first layer and classified them as a second classification layer based on
novel voting techniques. Prior to the classification process, the framework selected the
top features using a proposed feature selection framework. It started by filtering the
columns using multiple feature selection methods and considered the top common
features selected. Results from the proposed framework, with 95.6% accuracy, showed
its superiority over the single machine learning model, classical stacking technique, and
traditional voting technique. The main contribution of this work was to demonstrate how
the prediction probabilities of multiple models could be exploited for the purpose of
creating another layer for the final output; this step neutralized any model bias. Another
experimental contribution was proving the complete pipeline’s ability to be retrained and
used for other datasets collected using different measurements and with different
distributions.
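
The idea of a second layer that consolidates the prediction probabilities of several first-layer models can be illustrated with plain soft voting. The sketch below is not the novel voting technique of [45]; it is a simple baseline, and the function name `soft_vote`, the class labels, and the probability values are all assumptions made up for illustration.

```python
def soft_vote(model_probs):
    """Average per-class probabilities from several models and pick the top class.

    model_probs: list of dicts mapping class label -> predicted probability,
    one dict per first-layer model (an assumed, illustrative input format).
    """
    classes = sorted(model_probs[0])
    averaged = {
        c: sum(p[c] for p in model_probs) / len(model_probs) for c in classes
    }
    # The consolidated prediction is the class with the highest mean probability.
    winner = max(classes, key=lambda c: averaged[c])
    return winner, averaged


# Hypothetical outputs of three first-layer models for one patient record.
first_layer = [
    {"healthy": 0.60, "disease": 0.40},
    {"healthy": 0.30, "disease": 0.70},
    {"healthy": 0.20, "disease": 0.80},
]
label, probs = soft_vote(first_layer)
print(label, probs)
```

Averaging probabilities rather than counting hard votes lets a confident model outweigh an uncertain one, which is one common motivation for building a second layer on probabilities.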

Kumar, Ashish, Rama Komaragiri, and Manjeet Kumar.[46] Heart rate monitoring and
therapeutic devices included real-time sensing capabilities reflecting the state of the heart.
Current circuitry could be interpreted as a cardiac electrical signal compression algorithm
representing the time signal information into a single event description of the cardiac
activity. It was observed that some detection techniques developed for ECG signal
detection like artificial neural network, genetic algorithm, Hilbert transform, hidden
Markov model were some sophisticated algorithms which provided suitable results but
their implementation on a silicon chip was very complicated. Due to less complexity and
high performance, wavelet transform-based approaches were widely used. In this paper,
after a thorough analysis of various wavelet transforms, it was found that Biorthogonal
wavelet transform was best suited to detect ECG signal's QRS complex. The main steps
involved in the ECG detection process consisted of de-noising and locating different ECG
peaks using adaptive slope prediction thresholding. Furthermore, the significant
challenges involved in the wireless transmission of ECG data were data conversion and
power consumption. As medical regulatory boards demanded lossless compression, a
lossless technique with a high bit compression ratio was required. Accordingly, in this
work, an LZMA-based ECG data compression technique
was proposed. The proposed methodology achieved the highest signal to noise ratio, and
lowest root mean square error. Also, the proposed ECG detection technique was capable
of distinguishing accurately between healthy, myocardial infarction, congestive heart
failure and coronary artery disease patients with a detection accuracy, sensitivity,
specificity, and error of 99.92%, 99.94%, 99.92% and 0.0013, respectively. The use of
LZMA data compression of ECG data achieved a high compression ratio of 18.84. The
advantages and effectiveness of the proposed algorithm were verified by comparing with
the existing methods.
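
What lossless compression and compression ratio mean in practice can be shown with Python's standard `lzma` module. This is only a sketch under stated assumptions: the signal is synthetic rather than a real ECG, the 16-bit sample format is assumed, and the ratio obtained here says nothing about the 18.84 reported in [46].

```python
import lzma
import math
import struct

# Synthetic, ECG-like test signal: a slow baseline plus a sharp periodic spike.
# This is made-up data for illustration, not a real ECG recording.
samples = []
for n in range(5000):
    baseline = 100 * math.sin(2 * math.pi * n / 360)
    spike = 800 if n % 360 == 0 else 0
    samples.append(int(baseline + spike))

# Pack as 16-bit signed little-endian integers (an assumed raw sample format).
raw = struct.pack("<%dh" % len(samples), *samples)

compressed = lzma.compress(raw)
restored = lzma.decompress(compressed)

# Lossless means the decompressed bytes match the original exactly.
assert restored == raw

# Compression ratio = original size / compressed size.
compression_ratio = len(raw) / len(compressed)
print(f"raw={len(raw)} B, compressed={len(compressed)} B, ratio={compression_ratio:.2f}")
```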

Bahrami, Boshra, and Mirsaeid Hosseini Shirvani.[47] Background: New age- and
sex-specific lipoprotein cut points that were developed from National Health and Nutrition
Examination Survey (NHANES) data were considered to be a more accurate
classification of a high-risk lipoprotein level in adolescents compared with existing cut
points established by the National Cholesterol Education Program (NCEP). The aim of
the study was to determine which of the NHANES or NCEP adolescent lipoprotein
classifications was most effective for predicting abnormal levels in adulthood.

Magnussen, Costan G., Olli T. Raitakari, Russell Thomson, Markus Juonala,
Dharmendrakumar A. Patel, Jorma SA Viikari, Jukka Marniemi et al.[48] Heart murmurs
were often the first signs of pathological changes of the heart valves, and they were
usually found during auscultation in primary health care. Distinguishing a pathological
murmur from a physiological murmur was, however, difficult, which is why an
“intelligent stethoscope” with decision support abilities would have been of great value.
Phonocardiographic signals were acquired from 36 patients with aortic valve stenosis,
mitral insufficiency, or physiological murmurs, and the data were analyzed with the aim
to find a suitable feature subset for automatic classification of heart murmurs. Techniques
such as Shannon energy, wavelets, fractal dimensions, and recurrence quantification
analysis were used to extract 207 features. 157 of these features had not previously been
used in heart murmur classification. A multi-domain subset consisting of 14, both old and
new, features was derived using Pudil’s sequential floating forward selection (SFFS)
method. This subset was compared with several single-domain feature sets. Using neural
network classification, the selected multi-domain subset gave the best results; 86%
correct classifications compared to 68% for the first runner-up. In conclusion, the derived
feature set was superior to the comparative sets and seemed rather robust to noisy data.

Bahrami, Boshra, and Mirsaeid Hosseini Shirvani.[49] Background: New age- and
sex-specific lipoprotein cut points were developed from National Health and Nutrition
Examination Survey (NHANES) data and were considered to be a more accurate
classification of a high-risk lipoprotein level in adolescents compared with existing cut
points established by the National Cholesterol Education Program (NCEP). The aim of
the study was to determine which of the NHANES or NCEP adolescent lipoprotein
classifications was most effective for predicting abnormal levels in adulthood.

Magnussen, Costan G., Olli T. Raitakari, Russell Thomson, Markus Juonala,
Dharmendrakumar A. Patel, Jorma SA Viikari, Jukka Marniemi et al.[50] Heart murmurs
were often the first signs of pathological changes of the heart valves, and they were
usually found during auscultation in primary health care. Distinguishing a pathological
murmur from a physiological murmur was, however, difficult, which is why an
“intelligent stethoscope” with decision support abilities would have been of great value.
Phonocardiographic signals were acquired from 36 patients with aortic valve stenosis,
mitral insufficiency, or physiological murmurs, and the data were analyzed with the aim
to find a suitable feature subset for automatic classification of heart murmurs. Techniques
such as Shannon energy, wavelets, fractal dimensions, and recurrence quantification
analysis were used to extract 207 features. 157 of these features had not previously been
used in heart murmur classification. A multi-domain subset consisting of 14, both old and
new, features was derived using Pudil’s sequential floating forward selection (SFFS)
method. This subset was compared with several single-domain feature sets. Using neural
network classification, the selected multi-domain subset gave the best results; 86%
correct classifications compared to 68% for the first runner-up. In conclusion, the derived
feature set was superior to the comparative sets and seemed rather robust to noisy data.

Acharya, R., Ashwin Kumar, P. S. Bhat, C. M. Lim, S. S. Iyengar, N. Kannathal, and
Shankar M. Krishnan.[51] The heart rate was recognized as a non-stationary signal, with
its variation containing indicators of current disease or warnings about impending cardiac
diseases. These indicators could be present at all times or could occur at random, during
certain intervals of the day. However, studying and pinpointing abnormalities in large
quantities of data collected over several hours was strenuous and time-consuming. Hence,
heart rate variation measurement (instantaneous heart rate against time) became a
popular, non-invasive tool for assessing the autonomic nervous system. Computer-based
analytical tools for the in-depth study and classification of data over day-long intervals
were deemed very useful in diagnostics. The paper dealt with the classification of cardiac
rhythms using an artificial neural network and fuzzy relationships. The results indicated
a high level of efficacy of the tools used, with an accuracy level of 80–85%.

Dewan, Ankita, and Meghna Sharma.[52] Fluctuations in heart rate were intimately
related to changes in the physiological state of the organism. This relationship was
exploited by classifying a human participant's wake/sleep status using their instantaneous
heart rate (IHR) series. An approach was employed using a convolutional neural network
(CNN) to build features from the IHR series extracted from a whole-night
electrocardiogram (ECG) and predict every 30 seconds whether the participant was
awake or asleep. The training database consisted of 56 normal participants, and three
different databases were considered for validation; one was private, and two were public
with different races and apnea severities. On the private database of 27 participants, the
accuracy, sensitivity, specificity, and F1 values for predicting the wake stage were 75.3%,
52.4%, 89.4%, and 0.83, respectively. Validation performance was similar on the two
public databases. When the photoplethysmography was used instead of the ECG to obtain
the IHR series, the performance was also comparable. A robustness check was carried out
to confirm the obtained performance statistics. These results advocated for an effective
and scalable method for recognizing changes in physiological state using non-invasive
heart rate monitoring. The CNN model adaptively quantified IHR fluctuation as well as
its location in time and was suitable for differentiating between the wake and sleep stages.
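
The four figures quoted above (accuracy, sensitivity, specificity, and F1) all derive from the counts of a binary confusion matrix. The sketch below shows the standard formulas; the function name `binary_metrics` and the epoch counts are hypothetical and are not taken from [52].

```python
def binary_metrics(tp, fp, tn, fn):
    """Compute accuracy, sensitivity, specificity, and F1 from confusion counts."""
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    sensitivity = tp / (tp + fn)   # recall of the positive ("wake") class
    specificity = tn / (tn + fp)   # recall of the negative ("sleep") class
    precision = tp / (tp + fp)
    f1 = 2 * precision * sensitivity / (precision + sensitivity)
    return accuracy, sensitivity, specificity, f1


# Hypothetical counts of 30-second epochs, chosen only to illustrate the formulas.
acc, sens, spec, f1 = binary_metrics(tp=50, fp=10, tn=90, fn=50)
print(f"accuracy={acc:.3f} sensitivity={sens:.3f} specificity={spec:.3f} F1={f1:.3f}")
```

With these made-up counts, specificity is high while sensitivity is low, the same qualitative pattern as in the wake-stage results, which is why accuracy alone can hide how often the positive class is missed.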

Shuvo, Samiul Based, Shams Nafisa Ali, Soham Irtiza Swapnil, Mabrook S. Al-Rakhami,
and Abdu Gumaei.[53] The alarmingly high mortality rate and increasing global
prevalence of cardiovascular diseases (CVDs) signified the crucial need for early
detection schemes. Phonocardiogram (PCG) signals had been historically applied in this
domain owing to their simplicity and cost-effectiveness. In that article, CardioXNet was
proposed, a novel lightweight end-to-end CRNN architecture for automatic detection of
five classes of cardiac auscultation namely normal, aortic stenosis, mitral stenosis, mitral
regurgitation, and mitral valve prolapse using raw PCG signal. The process was
automated by the involvement of two learning phases namely, representation learning,
and sequence residual learning. Three parallel CNN pathways were implemented in the
representation learning phase to learn the coarse and fine-grained features from the PCG
and to explore the salient features from variable receptive fields involving 2D-CNN based
squeeze-expansion. Thus, in the representation learning phase, the network extracted
efficient time-invariant features and converged with great rapidity. In the sequential
residual learning phase, because of the bidirectional-LSTMs and the skip connection, the
network could proficiently extract temporal features without performing any feature
extraction on the signal. The obtained results demonstrated that the proposed end-to-end
architecture yielded outstanding performance in all the evaluation metrics compared to
the previous state-of-the-art methods with up to 99.60% accuracy, 99.56% precision,
99.52% recall, and 99.68% F1-score on an average while being computationally
comparable. This model outperformed any previous works using the same database by a
considerable margin. Moreover, the proposed model was tested on the PhysioNet/CinC
2016 challenge dataset achieving an accuracy of 86.57%. Finally, the model was
evaluated on a merged dataset of Github PCG dataset and PhysioNet dataset achieving
an excellent accuracy of 88.09%. The high accuracy metrics on both primary.

Reddy, N. Satish Chandra, Song Shue Nee, Lim Zhi Min, and Chew Xin Ying.[54] Heart
disease has been one of the major causes of death worldwide. Because heart disease
diagnosis was expensive, it was necessary to predict the risk of heart disease from a
small set of selected features. Feature selection methods could reduce the cost of
diagnosis by selecting only the important attributes. The objectives of the study were to
build a classification model and to identify which selected features played a key role in
the prediction of heart disease, using the Cleveland and Statlog heart datasets. The
accuracy of the random forest algorithm, in both the classification and the feature
selection models, was observed to be 90–95% across three different percentage splits.
The 8 and 6 selected features appeared to be the minimum feature requirements for a
well-performing model; dropping further features from the 8 or 6 selected did not lead to
better prediction performance.

Woodward, Mark.[55] The objective was to evaluate a deprivation index, calculated from
small area statistics for postcode sectors, as a measure of individual social status in an
epidemiological study of coronary heart disease (CHD). The design was a baseline,
cross-sectional survey covering twenty-two local authority districts of Scotland surveyed
between 1984 and 1986. The participants were a total of 10,359 men and women aged
40-59 years, randomly selected for the Scottish Heart Health Study. The Scottish
deprivation categorisation, derived from small area statistics, exhibited a strong linear
trend (p = 0.001 or below) for individual prevalent CHD for men and women, both
unadjusted and adjusted for major cardiovascular risk factors. The degree of association
with CHD was similar to that for measures of social class based upon occupation. The
Scottish deprivation categorisation was an effective measure of individual social status in
that study, broadly comparable in its effect with the more traditional classification
derived from occupations. The latter had important problems in definition, especially for
women. Small area statistics may have provided a useful marker of individual social
status in a more general epidemiological setting.

Deng, Shi-Wen, and Ji-Qing Han.[56] In the study, a novel framework for heart sound
classification without segmentation was presented, based on the autocorrelation feature
and diffusion maps, aiming to provide a primary diagnosis in primary health centers and
home care settings. In the proposed framework, autocorrelation features were first
extracted from the sub-band envelopes calculated from the sub-band coefficients of the
heart signal using discrete wavelet decomposition (DWT). Then, the autocorrelation
features were fused to obtain the unified feature representation with diffusion maps.
Finally, the unified feature was input into the Support Vector Machines (SVM) classifier
to perform the task of heart sound classification. Additionally, the proposed framework
was evaluated on two public datasets published in the PASCAL Classifying Heart Sounds
Challenge. The experimental results demonstrated outstanding performance of the
proposed method compared with the baselines.

Singh, Jagdeep, Amit Kamra, and Harbhag Singh.[57] Today's healthcare services had
come a long way to provide medical care to the patients and protect them from various
diseases. This paper comprised the development of a framework based on associative
classification techniques on a heart dataset for early diagnosis of heart-based diseases.
Heart diseases, which could arrive suddenly and prove fatal when uncontrolled, were
hard to diagnose by observation alone. The work was implemented on the Cleveland
heart disease dataset from the University of California Irvine (UCI) machine learning
repository in order to test different data mining techniques. The attributes related to the
cause of heart disease, such as gender, age, chest pain type, blood pressure, and blood
sugar, could predict early symptoms of heart disease. Various
data mining algorithms such as Apriori, FP-Growth, Naive Bayes, ZeroR, OneR, J48, and
k-nearest neighbor were applied in this study for the prediction of heart diseases. On the
basis of the best results, the development of a heart disease prediction system was done
by using a hybrid technique for classification associative rules (CARs) to achieve a
prediction accuracy of 99.19%.

Malik, John, Yu-Lun Lo, and Hau-tieng Wu.[58] Fluctuations in heart rate were intimately
related to changes in the physiological state of the organism. This relationship was
exploited by classifying a human participant's wake/sleep status using his instantaneous
heart rate (IHR) series. An approach was employed using a convolutional neural network
(CNN) to build features from the IHR series extracted from a whole-night
electrocardiogram (ECG) and predict every 30 s whether the participant was awake or
asleep. The training database consisted of 56 normal participants, and three different
databases were considered for validation; one was private, and two were public with
different races and apnea severities. On the private database of 27 participants, the
accuracy, sensitivity, specificity, and F1 values for predicting the wake stage were [values
here]. Validation performance was similar on the two public databases. When the
photoplethysmography was used instead of the ECG to obtain the IHR series, the
performance was also comparable. A robustness check was carried out to confirm the
obtained performance statistics. This result advocated for an effective and scalable
method for recognizing changes in physiological state using non-invasive heart rate
monitoring. The CNN model adaptively quantified IHR fluctuation as well as its location
in time and was suitable for differentiating between the wake and sleep stages.

Fida, Benish, Muhammad Nazir, Nawazish Naveed, and Sheeraz Akram.[59] Heart
disease diagnosis was considered one of the complicated tasks in the medical field. In
order to perform heart disease diagnosis, an accurate and efficient automation system
could have been very helpful. In this research, a classifier ensemble method was proposed
to improve the decision of the classifiers for heart disease diagnosis. Homogeneous
ensemble was applied for heart disease classification, and finally, results were optimized
by using a genetic algorithm. Data was evaluated by using 10-fold cross-validation, and
the performance of the system was evaluated by classifiers' accuracy, sensitivity, and
specificity to check the feasibility of the system. Comparison of the methodology with
existing ensemble techniques showed considerable improvements in terms of
classification accuracy.
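
The 10-fold cross-validation mentioned above partitions the data into ten folds and uses each fold once as the test set while training on the other nine. A minimal sketch of the index bookkeeping follows; the helper name `k_fold_indices`, the seed, and the sample count of 103 are arbitrary assumptions and do not come from [59].

```python
import random


def k_fold_indices(n_samples, k=10, seed=0):
    """Yield (train_indices, test_indices) pairs for k-fold cross-validation."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # Distribute samples so that fold sizes differ by at most one.
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        test = folds[i]
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, test


# Arbitrary sample count, used only to show the resulting fold sizes.
splits = list(k_fold_indices(n_samples=103, k=10))
for train, test in splits[:2]:
    print(len(train), len(test))
```

Every sample is used for testing exactly once, so the averaged accuracy, sensitivity, and specificity depend less on one particular train/test split.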

Kirsch, J., and A. McGuire.[60] The study considered the feasibility of defining a QALY
from disease-specific data using the New York Heart Association (NYHA) classification
of heart failure. Health state values for the four different NYHA classifications of disease
progression were derived using the time trade-off (TTO) instrument associated with the
five-dimensional (EQ-5D) health state valuation method. Consistent mappings between
the disease classification and the chosen QALY instrument were found. With this being
the case, the assumption of constant proportionality, necessary to define the QALY as an
acceptable measure of health-related preferences, was considered. It was found that
constant proportionality did not hold across the more severe health states, thus
questioning the use of QALYs as representing cardinal preference structures.

2.3 RESEARCH GAPS

Current research on heart health classification reveals several notable gaps. Firstly, there
is a lack of standardized criteria for categorizing heart health status across different
populations. This inconsistency hampers the development of universally applicable
classification models. Additionally, while machine learning techniques show promise in
classifying heart health based on various data sources such as medical records and
imaging scans, there remains a need for more robust validation and comparison studies
to determine the most effective algorithms.

Furthermore, there is a scarcity of research focusing on integrating multiple data
modalities for heart health classification. Most existing studies predominantly rely on
single data sources, such as electrocardiograms or echocardiograms, which may not
capture the full complexity of heart health. Addressing this gap by exploring the fusion
of diverse data types, including genetic information, lifestyle factors, and biomarkers,
could significantly enhance the accuracy and reliability of classification models. In
essence, bridging these research gaps is essential for advancing heart health classification
methodologies and ultimately improving diagnostic accuracy and patient outcomes.

2.4 COMPETITIVE ANALYSIS

TABLE 1. SUMMARY OF LITERATURE SURVEY

REF NO | WORK DONE | DRAWBACKS
[1] | Exploration of Machine Learning Techniques. | Limited Algorithm Selection.
[2] | Successful Implementation of CNN model. | Limited scope of Participant Demographics.
[3] | Framework development. | Interpretability.
[4] | Development of a Novel Framework. | Complexity and Computational cost.
[5] | Analysis of Various Wavelet Transforms. | Complexity of Implementation for Some Detection Techniques.
[6] | Development of improved cut points. | Data Limitations.
[7] | Feature Extraction. | Limited Sample Size.
[8] | Longitudinal Analysis. | Generalizability.
[9] | Classification. | Potential Overfitting.
[10] | Achievement of high efficacy. | Limited Accuracy.
[11] | Robustness Check. | Limited Sensitivity.
[12] | Performance Result. | Scalability.
[13] | Feature Selection. | Dataset Dependency.
[14] | Strong Linear trend. | Definition Problem.
[15] | In this section, the key findings and contributions of the research are summarized. | This section highlights the limitations or shortcomings observed in the methodology or findings of the research.
[16] | Development of Hybrid Framework. | Complexity.
[17] | Implementation of Machine Learning Algorithms. | Data Limitations.
[18] | Overview of Decision Tree as a classification algorithm. | Identification of limitations in the proposed methodology.
[19] | Identification of risk factors. | Lack of Implementation details.
[20] | Ensemble Deep Learning. | Scalability.
[21] | Classification technique based on these metrics. | Limited Comparison scope.
[22] | Support Vector Machine. | Model Interpretability.
[23] | Validation Studies. | Cost Consideration.
[24] | Advocacy for Integration. | Communication overhead.
[25] | Introduction of SMOTE. | Limited Generalizability.
[26] | Preprocessing. | Limited Metrics.
[27] | Attribute Selection. | Algorithm selection.
[28] | Heart sound. | Potential Bias.
[29] | Integration of ML. | Data availability.
[30] | Development of H2M, a machine learning-based heart health monitoring model. | Performance Metrics.

2.5 SUMMARY

Heart health classification involves assessing the condition of a person's heart based on
various factors such as blood pressure, cholesterol levels, and overall cardiovascular
function. This classification helps medical professionals determine the risk of heart
disease and recommend appropriate interventions to maintain or improve heart health. By
analyzing data from medical tests and examinations, doctors can classify individuals into
different categories ranging from low risk to high risk for heart disease.

The goal of heart health classification is to identify potential issues early on and
implement preventative measures to reduce the risk of heart disease. This may include
lifestyle changes such as diet and exercise modifications, as well as medication or medical
procedures for individuals with higher risk factors. By accurately classifying heart health
status, healthcare providers can tailor treatment plans to meet the specific needs of each
patient, ultimately improving overall heart health and reducing the incidence of
cardiovascular events.

CHAPTER-3

EXISTING SYSTEM

3.1 EXISTING CLASSIFIER NAME

The existing classifier commonly used for heart health classification from ECG data is
known as the Random Forest classifier. It works by constructing multiple decision trees
during training and outputs the mode of the classes (classification) or the mean prediction
(regression) of the individual trees. This classifier is particularly favoured for its
robustness and ability to handle large datasets with high dimensionality, making it a
popular choice in the medical field for tasks like identifying various heart conditions
based on ECG signals.
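
The aggregation step described above, the mode of the trees' class votes for classification and the mean of their outputs for regression, can be sketched in a few lines. The function names and the class labels "normal" and "arrhythmia" are hypothetical and chosen only for illustration.

```python
from collections import Counter
from statistics import mean


def aggregate_classification(tree_votes):
    """Return the most common class label among the trees' votes (the mode)."""
    return Counter(tree_votes).most_common(1)[0][0]


def aggregate_regression(tree_outputs):
    """Return the mean of the trees' numeric predictions."""
    return mean(tree_outputs)


# Hypothetical outputs of five trees for a single ECG record.
votes = ["normal", "arrhythmia", "normal", "normal", "arrhythmia"]
print(aggregate_classification(votes))

outputs = [0.2, 0.4, 0.3, 0.5, 0.1]
print(aggregate_regression(outputs))
```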

The Random Forest classifier is a powerful tool for assessing heart health using ECG
data. It operates by constructing numerous decision trees during its training process. Each
decision tree examines different aspects of the ECG signals, seeking patterns that might
indicate various heart conditions. These trees work together to make a collective
prediction about the health status of the heart. By combining the outputs of multiple trees,
the classifier arrives at a robust and reliable assessment of the heart's condition.

One of the strengths of the Random Forest classifier is its ability to handle large datasets
with high dimensionality, which is common in medical data like ECG recordings. Each
decision tree in the forest is trained independently on a bootstrap sample of the data,
typically considering a random subset of the features at each split, and makes its own
prediction. This diversity helps to reduce overfitting and helps the classifier generalize
to new, unseen data. As a result, the Random Forest classifier is widely used in medical
applications for identifying various heart conditions from ECG signals.

In the medical field, the Random Forest classifier plays a crucial role in diagnosing
heart-related issues. Its ability to analyze complex ECG data and provide reliable
classifications makes it a valuable tool for healthcare practitioners. By leveraging the
power of ensemble learning, where multiple decision trees collaborate to make informed
decisions, the Random Forest classifier offers a dependable method for assessing heart
health and aiding in the early detection of potential cardiac abnormalities.

The Random Forest Classifier is a versatile ensemble learning method that operates by
constructing multiple decision trees during training and outputting the mode of the classes
(classification) or the mean prediction (regression) of the individual trees. It's known for
its robustness, ability to handle large datasets with high dimensionality, and resistance to
overfitting. In environmental condition monitoring, RFC can be effective in identifying
patterns and relationships within complex sensor data, allowing for accurate predictions
of environmental conditions such as temperature, humidity, air quality, and more.

On the other hand, Naive Bayes is a probabilistic classifier based on Bayes' theorem with
the assumption of independence between features. Despite its simplicity, Naive Bayes
can be remarkably effective in classification tasks, especially when dealing with large
datasets and high-dimensional feature spaces. It's particularly suitable for real-time
applications due to its fast training and prediction times. In environmental monitoring,
Naive Bayes can be utilized to classify sensor data into different environmental
conditions or detect anomalies based on probabilistic reasoning.

In practice, the choice between RFC and Naive Bayes (or other classifiers) depends on
various factors such as the nature of the data, the specific objectives of the monitoring
system, computational resources, and performance requirements. In some cases, a
combination of multiple classifiers or advanced techniques such as ensemble learning
may be employed to further enhance the accuracy and reliability of the monitoring
models. Ultimately, the selection of the most suitable classifier should be based on
thorough experimentation and evaluation to ensure optimal performance in
environmental condition monitoring applications.
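
The experimentation mentioned above can be sketched as a cross-validated comparison of the two classifiers. This is a minimal sketch, assuming scikit-learn is available; the synthetic data from `make_classification` is only a stand-in for a table of extracted features and is not the dataset used in this project, and the names `candidates` and `cv_scores` are illustrative.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score
from sklearn.naive_bayes import GaussianNB

# Synthetic stand-in for a feature table (assumption: not the report's data).
X, y = make_classification(
    n_samples=600, n_features=20, n_informative=8,
    n_classes=4, n_clusters_per_class=1, random_state=0,
)

# The two classifiers discussed in this section.
candidates = {
    "random_forest": RandomForestClassifier(n_estimators=100, random_state=0),
    "naive_bayes": GaussianNB(),
}

# 5-fold cross-validated accuracy for each candidate.
cv_scores = {
    name: cross_val_score(model, X, y, cv=5, scoring="accuracy")
    for name, model in candidates.items()
}

for name, scores in cv_scores.items():
    print(f"{name}: mean accuracy {scores.mean():.3f} (std {scores.std():.3f})")
```

Which classifier scores higher depends on the data, so the comparison should be repeated on the actual dataset before drawing conclusions.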

3.2 DRAWBACKS OF EXISTING CLASSIFIER

The Naive Bayes method, while commonly used for classification tasks, has several
drawbacks when applied to heart health classification. Firstly, it assumes that all features
are independent of each other, which is often not the case with heart health data. For
instance, factors like cholesterol levels, blood pressure, and age can be interrelated, but
Naive Bayes overlooks these correlations, potentially leading to inaccurate predictions.

Secondly, Naive Bayes can struggle with continuous features. In heart health
classification, many metrics such as heart rate and blood pressure are continuous
variables. Naive Bayes must either discretize them into categories or assume a simple
distribution (typically Gaussian) for each feature within each class, which may result in
loss of information and decreased accuracy in identifying patterns associated with
different heart conditions.

Moreover, Naive Bayes is known to oversimplify the relationships between features. In
the context of heart health, where the interactions between various factors can be
complex, this oversimplification can lead to a lack of nuance in the classification process.
For example, it may not adequately capture the interactions between risk factors like
smoking, diet, and exercise habits.

Lastly, Naive Bayes can be unreliable when dealing with small or imbalanced datasets.
Class priors estimated from skewed data bias predictions toward the majority class, and
feature values never seen for a class receive zero probability unless smoothing is applied.
It may therefore generalize poorly to new data if the training set does not adequately
represent the full range of heart health conditions. This limitation can undermine the
reliability of the classifier and diminish its usefulness in real-world applications for heart
health assessment.

CHAPTER-4

PROPOSED SYSTEMS

4.1 OVERVIEW

Figure 4.1 Proposed system Block diagram

The proposed system is a machine learning-based system that can be used to diagnose
diseases. The system works by training a random forest model on a dataset of ECG
signals. The model is then used to test new ECG signals and predict the presence of
disease. The system can be used to diagnose a variety of diseases, including heart disease,
stroke, and epilepsy.

The system is made up of the following components:

ECG Dataset: This is a collection of ECG signals from patients with and without disease.

Dataset Preprocessing: This component preprocesses the ECG signals before they are fed
into the model.

Train/Test Split: This component splits the dataset into a training set and a test set. The
training set is used to train the model, and the test set is used to evaluate the performance
of the model.

RFC Model: This is a random forest model that is trained on the training set.

Performance Estimation: This component evaluates the performance of the model on the
test set.

Simple Test ECG: This is a new ECG signal that is used to test the model.

Type of disease testing: This specifies the type of disease that the model is being used to
test.

The system works as follows:

Step-1: The ECG dataset is collected.

Step-2: The ECG signals are preprocessed.

Step-3: The dataset is split into a training set and a test set.

Step-4: The RFC model is trained on the training set.

Step-5: The performance of the model is evaluated on the test set.

Step-6: A new ECG signal is preprocessed.

Step-7: The preprocessed ECG signal is fed into the model.

Step-8: The model outputs a prediction of the presence of disease.

The system is still under development, but it has the potential to be a valuable
tool for diagnosing diseases.
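
The workflow in Step-1 to Step-8 can be sketched end to end as follows. This is a minimal sketch, assuming scikit-learn; the synthetic feature table, the 80/20 split, the use of `StandardScaler` as the preprocessing step, and all variable names are assumptions for illustration and not the exact configuration of this project.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

# Steps 1-2: synthetic stand-in for a collected ECG feature table (assumption).
X, y = make_classification(
    n_samples=500, n_features=12, n_informative=6,
    n_classes=3, n_clusters_per_class=1, random_state=42,
)

# Step 3: split into a training set and a test set (80/20, stratified by class).
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=42
)

# Preprocessing is fitted on the training set only, then reused everywhere.
scaler = StandardScaler().fit(X_train)

# Step 4: train the RFC model on the training set.
model = RandomForestClassifier(n_estimators=100, random_state=42)
model.fit(scaler.transform(X_train), y_train)

# Step 5: evaluate the model on the held-out test set.
test_accuracy = accuracy_score(y_test, model.predict(scaler.transform(X_test)))
print(f"Test accuracy: {test_accuracy:.3f}")

# Steps 6-8: preprocess one new record and predict its class.
new_record = X_test[:1]
predicted_class = model.predict(scaler.transform(new_record))[0]
print(f"Predicted class for the new record: {predicted_class}")
```

Fitting the scaler on the training set alone keeps information from the test set out of the model, so the test accuracy remains an honest estimate.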

4.2 DATASET PREPROCESSING

Label encoding: The dataset preprocessing for heart health classification involves several
steps. Firstly, we load the dataset containing various features such as age, gender, blood
pressure, cholesterol levels, and more. Then, we check for any missing values in the
dataset and handle them by either removing the rows or imputing values using techniques
like mean or median imputation. Next, we perform label encoding on categorical
variables like gender, where we assign numeric labels to categories (e.g., 0 for male, 1
for female). This ensures that the algorithm can interpret these variables correctly during
training. Once the encoding is done, we split the dataset into training and testing sets to
evaluate the model's performance accurately. Finally, we scale the numerical features to
bring them to a similar scale, which helps in improving the model's convergence and
performance. After preprocessing, the dataset is ready for training a machine learning
model for heart health classification.
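
Label encoding can be illustrated with a small example. This is a minimal sketch, assuming pandas and scikit-learn; the `records` table and its column names are hypothetical. Note that `LabelEncoder` assigns codes in sorted order of the category names, so here "female" becomes 0 and "male" becomes 1; an explicit mapping is shown for cases where a specific coding (such as 0 for male, 1 for female) is required.

```python
import pandas as pd
from sklearn.preprocessing import LabelEncoder

# Hypothetical patient records with one categorical column.
records = pd.DataFrame({
    "age": [54, 61, 47, 58],
    "gender": ["male", "female", "female", "male"],
})

# LabelEncoder sorts the categories, then numbers them from 0.
encoder = LabelEncoder()
records["gender_encoded"] = encoder.fit_transform(records["gender"])

# An explicit mapping gives full control over which category gets which code.
records["gender_mapped"] = records["gender"].map({"male": 0, "female": 1})

print(records)
```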

Null values removal: In preprocessing the dataset for heart health classification, the first
step is to identify and remove any null values present in the dataset. This involves
scanning each column for missing values and either imputing them with a suitable value
or dropping the rows containing nulls entirely. Removing null values ensures that the
dataset is clean and ready for analysis and model training. This process helps in
preventing bias and inaccuracies in the classification task by ensuring that all data points
are complete and reliable. After null values removal, the dataset can proceed to further
preprocessing steps such as normalization, feature scaling, and feature selection before
feeding it into the classification model.
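
The two options described above, dropping rows with nulls or imputing them, can be sketched with pandas. This is a minimal sketch; the `records` table, its column names, and the choice of median imputation are assumptions for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical records with one missing value in each column.
records = pd.DataFrame({
    "heart_rate": [72.0, np.nan, 88.0, 95.0],
    "cholesterol": [210.0, 185.0, np.nan, 240.0],
})

# Scan each column for missing values.
missing_per_column = records.isnull().sum()

# Option 1: drop every row that contains a null.
dropped = records.dropna()

# Option 2: impute each null with the median of its column.
imputed = records.fillna(records.median())

print(missing_per_column)
print(dropped)
print(imputed)
```

Dropping rows is simplest but discards data, while imputation keeps every record at the cost of introducing estimated values.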

Scaling: In preprocessing the dataset for heart health classification, a further step is scaling
the features to ensure uniformity and optimal performance of the machine learning model.
This involves transforming the data so that all features have a similar scale, preventing
any particular feature from dominating the others due to its larger magnitude. Typically,
techniques like MinMax scaling or Standard scaling are employed, where values are
adjusted to fall within a specific range or standardized around a mean of zero and a
standard deviation of one, respectively. By scaling the dataset, we prepare it for further
analysis and modeling, ensuring that each feature contributes proportionately to the
classification task without bias towards features with larger numeric ranges.
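
The two scaling techniques named above can be compared on a tiny example. This is a minimal sketch, assuming scikit-learn; the `features` array is made up for illustration. Tree-based models such as Random Forest split on thresholds and are largely insensitive to feature scale, so scaling matters most when the same data is also fed to scale-sensitive models.

```python
import numpy as np
from sklearn.preprocessing import MinMaxScaler, StandardScaler

# Hypothetical features on different scales: [heart rate, cholesterol].
features = np.array([
    [60.0, 120.0],
    [75.0, 180.0],
    [90.0, 240.0],
])

# MinMax scaling maps each column onto the range [0, 1].
minmax_scaled = MinMaxScaler().fit_transform(features)

# Standard scaling gives each column zero mean and unit standard deviation.
standard_scaled = StandardScaler().fit_transform(features)

print(minmax_scaled)
print(standard_scaled)
```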

4.3 RANDOM FOREST CLASSIFIER

The Random Forest classifier is a powerful tool for assessing heart health using ECG
data. It operates by constructing numerous decision trees during its training process. Each
decision tree examines different aspects of the ECG signals, seeking patterns that might
indicate various heart conditions. These trees work together to make a collective
prediction about the health status of the heart. By combining the outputs of multiple trees,
the classifier arrives at a robust and reliable assessment of the heart's condition.

4.4 RANDOM FOREST PREDICTION:

The random forest classifier predicts the heart health classification by analyzing a
multitude of factors gathered from individuals' medical records and health data. It
examines various indicators such as blood pressure, cholesterol levels, BMI, and lifestyle
habits to determine the likelihood of different heart health outcomes. By considering these
factors collectively, the classifier can make informed predictions about whether an
individual is at low, medium, or high risk for heart-related issues.

Through its analysis, the random forest classifier assigns each individual to one of several
categories based on their heart health status. This classification helps healthcare
professionals identify patients who may require closer monitoring, lifestyle interventions,
or medical treatment to mitigate potential risks and promote better heart health. By
leveraging the power of machine learning, the classifier provides valuable insights that
aid in preventive care and improve overall cardiovascular health outcomes for
individuals.

4.5 CLASSIFIER:

A Random Forest Classifier is a type of machine learning model that is widely used for
both classification and regression tasks. It operates on the principle of ensemble learning,
which combines multiple classifiers to solve complex problems and enhance the model's
performance. The model consists of numerous decision trees, each considering a subset
of observations and splitting based on a subset of features, resulting in a diverse set of
classifiers.

When predicting a new data point, the classifier takes the prediction from each tree and
decides the final output based on the majority votes of predictions. This approach helps
to increase accuracy and prevent overfitting, a common problem in machine learning
where a model performs well on training data but poorly on unseen data.
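
The voting step can be made concrete by collecting the prediction of every tree in a fitted forest. This is a minimal sketch, assuming scikit-learn and synthetic data. It shows a hard majority vote as described above and, for comparison, the averaging of per-tree class probabilities, which is how scikit-learn's `RandomForestClassifier` combines its trees.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

# Synthetic three-class data (assumption: a stand-in, not the report's data).
X, y = make_classification(
    n_samples=300, n_features=10, n_informative=5,
    n_classes=3, n_clusters_per_class=1, random_state=1,
)
forest = RandomForestClassifier(n_estimators=25, random_state=1).fit(X, y)
sample = X[:5]

# Each tree votes for a class index; rows are trees, columns are samples.
tree_votes = np.array(
    [tree.predict(sample) for tree in forest.estimators_]
).astype(int)

# Hard majority vote: the most frequent class index for each sample.
majority_index = np.array([
    np.bincount(votes, minlength=forest.n_classes_).argmax()
    for votes in tree_votes.T
])
majority_labels = forest.classes_[majority_index]

# Soft vote: average the per-tree class probabilities, then take the argmax.
averaged_proba = np.mean(
    [tree.predict_proba(sample) for tree in forest.estimators_], axis=0
)
soft_vote_labels = forest.classes_[averaged_proba.argmax(axis=1)]

print("Majority vote:", majority_labels)
print("Averaged probabilities:", soft_vote_labels)
print("forest.predict:", forest.predict(sample))
```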

Figure 4.2 Random forest classifier

Key parameters of a Random Forest Classifier include the number of trees in the forest
(n_estimators), the function to measure the quality of a split (criterion), the maximum
depth of the tree (max_depth), and the minimum number of samples required to split an
internal node (min_samples_split) or to be at a leaf node (min_samples_leaf). Random
Forests are favored for their simplicity and versatility, being applicable to both regression
and classification tasks. They are also robust to outliers and non-linear data, making them a
popular choice in various fields and applications.
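
The parameters listed above map directly onto the constructor of scikit-learn's `RandomForestClassifier`. This is a minimal sketch; the specific values are illustrative assumptions and are not the tuned settings of this project.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

model = RandomForestClassifier(
    n_estimators=200,       # number of trees in the forest
    criterion="gini",       # function that measures the quality of a split
    max_depth=10,           # maximum depth of each tree
    min_samples_split=4,    # minimum samples required to split an internal node
    min_samples_leaf=2,     # minimum samples required at a leaf node
    random_state=0,         # fixed seed for reproducible results
)

# Fit on small synthetic data to confirm that the depth limit is respected.
X, y = make_classification(n_samples=200, n_features=10, random_state=0)
model.fit(X, y)

params = model.get_params()
deepest_tree = max(tree.get_depth() for tree in model.estimators_)
print(f"Trees: {params['n_estimators']}, deepest tree: {deepest_tree}")
```

In practice these values are usually chosen by cross-validated search rather than fixed by hand.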

4.6 ADVANTAGES

The Random Forest classifier has several advantages:

Robustness: Random Forest is a robust algorithm that can handle noisy data and outliers.
It is less likely to overfit.

Accuracy: Random Forest is one of the most accurate machine learning algorithms. It
can handle both classification and regression tasks.

Speed: Despite being a complex algorithm, Random Forest is efficient and fast, because
its trees can be trained and evaluated in parallel.

Effective on Large Datasets: Random Forests are particularly well-suited for handling
large and complex datasets, dealing with high-dimensional feature spaces.

Feature Importance: Random Forest's ability to provide feature importance scores makes
it a valuable tool for understanding the significance of different variables in the dataset.

Ensemble Nature: The ensemble nature of Random Forests, combining multiple trees,
makes them less prone to overfitting compared to individual decision trees.

Handling of Many Features: Random Forest is effective on datasets with a large number
of features, and it can handle irrelevant variables well.

Capability: Random Forest is capable of performing both Classification and Regression
tasks. It is capable of handling large datasets with high dimensionality.

High Accuracy: Random forests are often among the most accurate of the commonly used
classification methods, although the best method depends on the dataset.

Big Data Handling: The random forest technique can also handle big data with numerous
variables running into thousands.
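
The feature importance scores mentioned among these advantages are exposed by scikit-learn as the `feature_importances_` attribute of a fitted forest. This is a minimal sketch on synthetic data; the feature names are hypothetical placeholders.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

# With shuffle=False the three informative features are the first three columns;
# the remaining five columns are noise.
X, y = make_classification(
    n_samples=400, n_features=8, n_informative=3, n_redundant=0,
    n_classes=3, n_clusters_per_class=1, shuffle=False, random_state=0,
)
feature_names = [f"feature_{i}" for i in range(X.shape[1])]

forest = RandomForestClassifier(n_estimators=100, random_state=0).fit(X, y)

# Impurity-based importances: non-negative and summing to 1.
importances = forest.feature_importances_
ranking = sorted(zip(feature_names, importances), key=lambda pair: pair[1], reverse=True)

for name, score in ranking:
    print(f"{name}: {score:.3f}")
```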

CHAPTER 5

MACHINE LEARNING
What is Machine Learning

Before we take a look at the details of various machine learning methods, let's start by
looking at what machine learning is, and what it isn't. Machine learning is often
categorized as a subfield of artificial intelligence, but I find that categorization can often
be misleading at first brush. The study of machine learning certainly arose from research
in this context, but in the data science application of machine learning methods, it's more
helpful to think of machine learning as a means of building models of data.

Fundamentally, machine learning involves building mathematical models to help
understand data. "Learning" enters the fray when we give these models tunable
parameters that can be adapted to observed data; in this way the program can be
considered to be "learning" from the data. Once these models have been fit to previously
seen data, they can be used to predict and understand aspects of newly observed data. I'll
leave to the reader the more philosophical digression regarding the extent to which this
type of mathematical, model-based "learning" is similar to the "learning" exhibited by the
human brain. Understanding the problem setting in machine learning is essential to using
these tools effectively, and so we will start with some broad categorizations of the types
of approaches we'll discuss here.

Categories of Machine Learning

At the most fundamental level, machine learning can be categorized into two main types:
supervised learning and unsupervised learning.

Supervised learning involves somehow modeling the relationship between measured
features of data and some label associated with the data; once this model is determined,
it can be used to apply labels to new, unknown data. This is further subdivided into
classification tasks and regression tasks: in classification, the labels are discrete
categories, while in regression, the labels are continuous quantities. We will see examples
of both types of supervised learning in the following section.
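
The difference between the two kinds of supervised task can be seen by fitting a classifier and a regressor to the same features. This is a minimal sketch, assuming scikit-learn and NumPy; the data and the rules that generate the labels are invented for illustration.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier, RandomForestRegressor

rng = np.random.default_rng(0)
X = rng.uniform(0, 1, size=(200, 2))

# Classification: the label is a discrete category (0 or 1).
discrete_labels = (X[:, 0] + X[:, 1] > 1.0).astype(int)

# Regression: the label is a continuous quantity.
continuous_labels = 3.0 * X[:, 0] - 2.0 * X[:, 1]

classifier = RandomForestClassifier(n_estimators=50, random_state=0)
classifier.fit(X, discrete_labels)

regressor = RandomForestRegressor(n_estimators=50, random_state=0)
regressor.fit(X, continuous_labels)

# Apply both fitted models to new, unlabeled points.
new_points = np.array([[0.9, 0.8], [0.1, 0.2]])
predicted_categories = classifier.predict(new_points)
predicted_quantities = regressor.predict(new_points)

print("Categories:", predicted_categories)
print("Quantities:", predicted_quantities)
```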

Unsupervised learning involves modeling the features of a dataset without reference to
any label and is often described as "letting the dataset speak for itself." These models
include tasks such as clustering and dimensionality reduction. Clustering algorithms
identify distinct groups of data, while dimensionality reduction algorithms search for
more succinct representations of the data. We will see examples of both types of
unsupervised learning in the following section.

Need for Machine Learning

Human beings, at this moment, are the most intelligent and advanced species on earth
because they can think, evaluate, and solve complex problems. On the other side, AI is
still in its initial stage and has not surpassed human intelligence in many aspects. The
question, then, is why we need to make machines learn. The most suitable reason for
doing this is “to make decisions, based on data, with efficiency and scale”.

Lately, organizations are investing heavily in newer technologies like Artificial
Intelligence, Machine Learning and Deep Learning to get the key information from data
to perform several real-world tasks and solve problems. We can call these data-driven
decisions taken by machines, particularly to automate the process. These data-driven
decisions can be used, instead of programming logic, in problems that cannot
be programmed inherently. The fact is that we can’t do without human intelligence, but
the other aspect is that we all need to solve real-world problems with efficiency at a huge
scale. That is why the need for machine learning arises.

Challenges in Machine Learning

While Machine Learning is rapidly evolving, making significant strides with
cybersecurity and autonomous cars, this segment of AI as a whole still has a long way to
go. The reason is that ML has not been able to overcome a number of challenges.
The challenges that ML is facing currently are −

1. Quality of data − Having good-quality data for ML algorithms is one of the biggest
challenges. Use of low-quality data leads to the problems related to data
preprocessing and feature extraction.
2. Time-Consuming task − Another challenge faced by ML models is the
consumption of time especially for data acquisition, feature extraction and
retrieval.
3. Lack of specialist persons − As ML technology is still in its infancy stage,
finding experts is difficult.
4. No clear objective for formulating business problems − Having no clear objective
and well-defined goal for business problems is another key challenge for ML
because this technology is not that mature yet.
5. Issue of overfitting & underfitting − If the model is overfitting or underfitting, it
cannot represent the problem well.
6. Curse of dimensionality − Another challenge ML model faces is too many features
of data points. This can be a real hindrance.
7. Difficulty in deployment − Complexity of the ML model makes it quite difficult
to be deployed in real life.
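
The overfitting and underfitting challenge in the list above can be observed by comparing training and test accuracy as model complexity grows. This is a minimal sketch, assuming scikit-learn; the synthetic data, the label noise set by `flip_y`, and the chosen depths are assumptions for illustration.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Synthetic data with 10% label noise so that memorizing the training set hurts.
X, y = make_classification(
    n_samples=600, n_features=20, n_informative=5, flip_y=0.1, random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0
)

# A very shallow tree tends to underfit; an unrestricted tree tends to overfit.
results = {}
for depth in (1, 4, None):
    tree = DecisionTreeClassifier(max_depth=depth, random_state=0)
    tree.fit(X_train, y_train)
    results[depth] = (tree.score(X_train, y_train), tree.score(X_test, y_test))

for depth, (train_acc, test_acc) in results.items():
    print(f"max_depth={depth}: train accuracy {train_acc:.3f}, test accuracy {test_acc:.3f}")
```

A large gap between training and test accuracy suggests overfitting, while low accuracy on both suggests underfitting.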

Applications of Machine Learning

Machine Learning is one of the most rapidly growing technologies, and according to
researchers we are in the golden years of AI and ML. It is used to solve many complex
real-world problems which cannot be solved with traditional approaches. Following are
some real-world applications of ML.

• Emotion analysis
• Sentiment analysis
• Error detection and prevention
• Weather forecasting and prediction
• Stock market analysis and forecasting
• Speech synthesis
• Speech recognition
• Customer segmentation
• Object recognition
• Fraud detection
• Fraud prevention
• Recommendation of products to customer in online shopping

How to Start Learning Machine Learning?

Arthur Samuel coined the term “Machine Learning” in 1959 and defined it as a “Field of
study that gives computers the capability to learn without being explicitly programmed”.

And that was the beginning of Machine Learning! In modern times, Machine Learning is
one of the most popular (if not the most!) career choices. According to Indeed, Machine
Learning Engineer Is the Best Job of 2019 with a 344% growth and an average base salary
of $146,085 per year.

But there is still a lot of doubt about what exactly Machine Learning is and how to start
learning it. So, this chapter deals with the basics of Machine Learning and also the path
you can follow to eventually become a full-fledged Machine Learning Engineer. Now
let’s get started!

How to start learning ML?

This is a rough roadmap you can follow on your way to becoming an insanely talented
Machine Learning Engineer. Of course, you can always modify the steps according to
your needs to reach your desired end-goal!

Step 1 – Understand the Prerequisites

In case you are a genius, you could start ML directly but normally, there are some
prerequisites that you need to know which include Linear Algebra, Multivariate Calculus,
Statistics, and Python. And if you don’t know these, never fear! You don’t need a Ph.D.
degree in these topics to get started but you do need a basic understanding.

(a) Learn Linear Algebra and Multivariate Calculus

Both Linear Algebra and Multivariate Calculus are important in Machine Learning.
However, the extent to which you need them depends on your role as a data scientist. If
you are more focused on application heavy machine learning, then you will not be that
heavily focused on maths as there are many common libraries available. But if you want
to focus on R&D in Machine Learning, then mastery of Linear Algebra and Multivariate
Calculus is very important as you will have to implement many ML algorithms from
scratch.

(b) Learn Statistics

Data plays a huge role in Machine Learning. In fact, around 80% of your time as an ML
expert will be spent collecting and cleaning data. And statistics is a field that handles the
collection, analysis, and presentation of data. So it is no surprise that you need to learn
it!!! Some of the key concepts in statistics that are important are Statistical Significance,
Probability Distributions, Hypothesis Testing, Regression, etc. Bayesian Thinking
is also a very important part of ML, which deals with various concepts like Conditional
Probability, Priors and Posteriors, Maximum Likelihood, etc.

(c) Learn Python

Some people prefer to skip Linear Algebra, Multivariate Calculus and Statistics and learn
them as they go along with trial and error. But the one thing that you absolutely cannot
skip is Python! While there are other languages you can use for Machine Learning like
R, Scala, etc., Python is currently the most popular language for ML. In fact, there are
many Python libraries that are specifically useful for Artificial Intelligence and Machine
Learning such as Keras, TensorFlow, Scikit-learn, etc.

So, if you want to learn ML, it’s best if you learn Python! You can do that using various
online resources and courses such as Fork Python available Free on GeeksforGeeks.

Step 2 – Learn Various ML Concepts

Now that you are done with the prerequisites, you can move on to actually learning ML
(Which is the fun part!!!) It’s best to start with the basics and then move on to the more
complicated stuff. Some of the basic concepts in ML are:

(a) Terminologies of Machine Learning

• Model – A model is a specific representation learned from data by applying some
machine learning algorithm. A model is also called a hypothesis.
• Feature – A feature is an individual measurable property of the data. A set of
numeric features can be conveniently described by a feature vector. Feature
vectors are fed as input to the model. For example, in order to predict a fruit, there
may be features like colour, smell, taste, etc.
• Target (Label) – A target variable or label is the value to be predicted by our model.
For the fruit example discussed in the feature section, the label with each set of
input would be the name of the fruit like apple, orange, banana, etc.
• Training – The idea is to give a set of inputs (features) and their expected
outputs (labels), so after training, we will have a model (hypothesis) that will then
map new data to one of the categories trained on.
• Prediction – Once our model is ready, it can be fed a set of inputs to which it will
provide a predicted output(label).
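
The terms in the list above can be tied together with the fruit example. This is a minimal sketch, assuming scikit-learn; the two numeric features (weight and skin smoothness) and their values are invented for illustration.

```python
from sklearn.tree import DecisionTreeClassifier

# Features: each row is a feature vector [weight in grams, skin smoothness 0..1].
feature_vectors = [
    [150, 0.9],
    [170, 0.8],
    [140, 0.3],
    [130, 0.2],
]

# Target (label): the value to be predicted for each feature vector.
labels = ["apple", "apple", "orange", "orange"]

# Training: fit a model (hypothesis) to the inputs and their expected outputs.
model = DecisionTreeClassifier(random_state=0)
model.fit(feature_vectors, labels)

# Prediction: feed a new input to the trained model to get a predicted label.
prediction = model.predict([[160, 0.85]])[0]
print(prediction)
```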
(b) Types of Machine Learning

• Supervised Learning – This involves learning from a training dataset with labeled
data using classification and regression models. This learning process continues
until the required level of performance is achieved.
• Unsupervised Learning – This involves using unlabelled data and then finding the
underlying structure in the data in order to learn more and more about the data
itself using factor and cluster analysis models.
• Semi-supervised Learning – This involves using unlabelled data like
Unsupervised Learning with a small amount of labeled data. Using labeled data
vastly increases the learning accuracy and is also more cost-effective than
Supervised Learning.
• Reinforcement Learning – This involves learning optimal actions through trial and
error. So, the next action is decided by learning behaviors that are based on the
current state and that will maximize the reward in the future.

Advantages of Machine Learning

1. Easily identifies trends and patterns: Machine Learning can review large volumes
of data and discover specific trends and patterns that would not be apparent to humans.
For instance, for an e-commerce website like Amazon, it serves to understand the
browsing behaviors and purchase histories of its users to help cater to the right products,
deals, and reminders relevant to them. It uses the results to reveal relevant advertisements
to them.

2. No human intervention needed (automation): With ML, you don’t need to babysit
your project every step of the way. Since it means giving machines the ability to learn, it
lets them make predictions and also improve the algorithms on their own. A common
example of this is antivirus software, which learns to filter new threats as they are
recognized. ML is also good at recognizing spam.

3. Continuous Improvement: As ML algorithms gain experience, they keep
improving in accuracy and efficiency. This lets them make better decisions. Say you need
to make a weather forecast model. As the amount of data you have keeps growing, your
algorithms learn to make more accurate predictions faster.

4. Handling multi-dimensional and multi-variety data: Machine Learning
algorithms are good at handling data that are multi-dimensional and multi-variety, and
they can do this in dynamic or uncertain environments.

5. Wide Applications: You could be an e-tailer or a healthcare provider and make
ML work for you. Where it does apply, it holds the capability to help deliver a much more
personal experience to customers while also targeting the right customers.

Disadvantages of Machine Learning

1. Data Acquisition: Machine Learning requires massive data sets to train on, and
these should be inclusive/unbiased, and of good quality. There can also be times where
they must wait for new data to be generated.

2. Time and Resources: ML needs enough time to let the algorithms learn and
develop enough to fulfill their purpose with a considerable amount of accuracy and
relevancy. It also needs massive resources to function. This can mean additional
requirements of computer power for you.

3. Interpretation of Results: Another major challenge is the ability to accurately
interpret results generated by the algorithms. You must also carefully choose the
algorithms for your purpose.

4. High error-susceptibility: Machine Learning is autonomous but highly susceptible
to errors. Suppose you train an algorithm with data sets small enough to not be inclusive.
You end up with biased predictions coming from a biased training set. This leads to
irrelevant advertisements being displayed to customers. In the case of ML, such blunders
can set off a chain of errors that can go undetected for long periods of time. And when
they do get noticed, it takes quite some time to recognize the source of the issue, and even
longer to correct it.

CHAPTER 6

SOFTWARE ENVIRONMENT

What is Python?
Below are some facts about Python.
• Python is currently the most widely used multi-purpose, high-level programming
language.

• Python allows programming in Object-Oriented and Procedural paradigms.

• Python programs are generally shorter than equivalent programs in other languages like
Java.

• Programmers have to type relatively less, and the indentation requirement of the
language keeps code readable.

• Python language is being used by almost all tech-giant companies like – Google,
Amazon, Facebook, Instagram, Dropbox, Uber… etc.

The biggest strength of Python is its huge collection of standard and third-party libraries,
which can be used for the following –

• Machine Learning

• GUI Applications (like Kivy, Tkinter, PyQt etc.)

• Web frameworks like Django (used by YouTube, Instagram, Dropbox)

• Image processing (like OpenCV, Pillow)

• Web scraping (like Scrapy, BeautifulSoup, Selenium)

• Test frameworks

• Multimedia

Advantages of Python
Let’s see how Python dominates over other languages.

1. Extensive Libraries

Python downloads with an extensive library that contains code for various purposes like
regular expressions, documentation-generation, unit-testing, web browsers, threading,
databases, CGI, email, image manipulation, and more. So, we don’t have to write the
complete code for that manually.

2. Extensible

As we have seen earlier, Python can be extended to other languages. You can write some
of your code in languages like C++ or C. This comes in handy, especially in projects.

3. Embeddable

Complementary to extensibility, Python is embeddable as well. You can put your Python
code in your source code of a different language, like C++. This lets us add scripting
capabilities to our code in the other language.

4. Improved Productivity

The language’s simplicity and extensive libraries render programmers more productive
than languages like Java and C++ do. You also need to write less code to get
more things done.

5. IOT Opportunities

Since Python forms the basis of new platforms like Raspberry Pi, it finds the future bright
for the Internet of Things. This is a way to connect the language with the real world.

6. Simple and Easy

When working with Java, you may have to create a class to print ‘Hello World’. But in
Python, just a print statement will do. It is also quite easy to learn, understand, and code.
This is why when people pick up Python, they have a hard time adjusting to other more
verbose languages like Java.

7. Readable

Because it is not such a verbose language, reading Python is much like reading English.
This is the reason why it is so easy to learn, understand, and code. It also does not need
curly braces to define blocks, and indentation is mandatory. These further aid the
readability of the code.

8. Object-Oriented

This language supports both the procedural and object-oriented programming paradigms.
While functions help us with code reusability, classes and objects let us model the real
world.
A class allows the encapsulation of data and functions into one.
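
A minimal sketch of how a class bundles data and functions together. The class name Beat, its attributes, and the default of 8 ms per sample are assumptions made for illustration only; they are not part of the project code.

```python
class Beat:
    """Bundles the samples of one heartbeat with the functions that use them."""

    def __init__(self, samples, label):
        self.samples = list(samples)   # data
        self.label = label             # data

    def peak(self):
        # Function operating on the object's own data.
        return max(self.samples)

    def duration_ms(self, ms_per_sample=8):
        # Assumed sampling period; adjust to the real recording.
        return len(self.samples) * ms_per_sample


beat = Beat([0.1, 0.9, 0.3, 0.0], label="Normal")
print(beat.label, beat.peak(), beat.duration_ms())
```

The caller only sees beat.peak() and beat.duration_ms(); how the samples are stored stays inside the class.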

9. Free and Open-Source

Like we said earlier, Python is freely available. But not only can you download Python
for free, but you can also download its source code, make changes to it, and even
distribute it. It downloads with an extensive collection of libraries to help you with your
tasks.

10. Portable

When you code your project in a language like C++, you may need to make some changes
to it if you want to run it on another platform. But it isn’t the same with Python. Here,
you need to code only once, and you can run it anywhere. This is called Write Once Run
Anywhere (WORA). However, you need to be careful enough not to include any system-
dependent features.

11. Interpreted

Lastly, we will say that it is an interpreted language. Since statements are executed one
by one, debugging is easier than in compiled languages.

Advantages of Python Over Other Languages

1. Less Coding

Almost every task requires less code in Python than the same task does in other
languages. Python also has excellent standard library support, so you often don’t have
to search for third-party libraries to get your job done. This is the reason that many
people suggest Python to beginners.

2. Affordable

Python is free, so individuals, small companies and big organizations can leverage
the freely available resources to build applications. Python is popular and widely used,
so it gives you better community support.

GitHub's 2019 annual report showed that Python had overtaken Java in its ranking of the
most popular programming languages.

3. Python is for Everyone

Python code can run on any machine whether it is Linux, Mac or Windows. Programmers
need to learn different languages for different jobs but with Python, you can
professionally build web apps, perform data analysis and machine learning, automate
things, do web scraping and also build games and powerful visualizations. It is an all-
rounder programming language.

Disadvantages of Python

So far, we’ve seen why Python is a great choice for your project. But if you choose it,
you should be aware of its consequences as well. Let’s now see the downsides of choosing
Python over another language.

1. Speed Limitations

We have seen that Python code is executed line by line. But since Python is interpreted,
it often results in slow execution. This, however, isn’t a problem unless speed is a focal
point for the project. In other words, unless high speed is a requirement, the benefits
offered by Python are enough to distract us from its speed limitations.

2. Weak in Mobile Computing and Browsers

While it serves as an excellent server-side language, Python is rarely seen on the
client side. Besides that, it is rarely used to implement smartphone-based
applications. One such application is called Carbonnelle.

The reason it is not so famous on the client side, despite the existence of Brython, is
that it isn’t that secure.

3. Design Restrictions

As you know, Python is dynamically-typed. This means that you don’t need to declare
the type of variable while writing the code. It uses duck-typing. But wait, what’s that?
Well, it just means that if it looks like a duck, it must be a duck. While this is easy on the
programmers during coding, it can raise run-time errors.
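
A minimal sketch of duck typing and of the run-time error it can lead to. The names Duck, Robot and make_it_quack are hypothetical, chosen only for illustration.

```python
class Duck:
    def quack(self):
        return "Quack!"


class Robot:
    # Not a duck, but it offers the same method, so it is accepted too.
    def quack(self):
        return "Beep-quack!"


def make_it_quack(thing):
    # No type is declared: any object with a quack() method works.
    return thing.quack()


results = [make_it_quack(Duck()), make_it_quack(Robot())]

# An object without quack() is only rejected when the call actually runs.
try:
    make_it_quack(42)
    error_name = None
except AttributeError as exc:
    error_name = type(exc).__name__

print(results, error_name)
```

The call with 42 is not flagged before the program runs; the AttributeError appears only when that line executes, which is the run-time risk described above.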

4. Underdeveloped Database Access Layers

Compared to more widely used technologies like JDBC (Java DataBase Connectivity) and
ODBC (Open DataBase Connectivity), Python’s database access layers are a bit
underdeveloped. Consequently, it is less often applied in huge enterprises.

5. Simple

No, we’re not kidding. Python’s simplicity can indeed be a problem. Take my example. I
don’t do Java, I’m more of a Python person. To me, its syntax is so simple that the
verbosity of Java code seems unnecessary.

This was all about the Advantages and Disadvantages of Python Programming Language.

History of Python

What do the alphabet and the programming language Python have in common? Right,
both start with ABC. If we are talking about ABC in the Python context, it's clear that the
programming language ABC is meant. ABC is a general-purpose programming language
and programming environment, which was developed in Amsterdam, the Netherlands,
at the CWI (Centrum Wiskunde & Informatica). The greatest achievement of
ABC was to influence the design of Python. Python was conceptualized in the late 1980s.
Guido van Rossum was working at that time on a project at the CWI called Amoeba, a
distributed operating system. In an interview with Bill Venners, Guido van Rossum said:
"In the early 1980s, I worked as an implementer on a team building a language called ABC
at Centrum voor Wiskunde en Informatica (CWI). I don't know how well people know
ABC's influence on Python. I try to mention ABC's influence because I'm indebted to
everything I learned during that project and to the people who worked on it." Later on in
the same interview, Guido van Rossum continued: "I remembered all my experience and
some of my frustration with ABC. I decided to try to design a simple scripting language
that possessed some of ABC's better properties, but without its problems. So I started
typing. I created a simple virtual machine, a simple parser, and a simple runtime. I made
my own version of the various ABC parts that I liked. I created a basic syntax, used
indentation for statement grouping instead of curly braces or begin-end blocks, and
developed a small number of powerful data types: a hash table (or dictionary, as we call
it), a list, strings, and numbers."

Python Development Steps

Guido van Rossum published the first version of Python code (version 0.9.0) at
alt.sources in February 1991. This release already included exception handling, functions,
and the core data types list, dict, str and others. It was also object oriented and had a
module system. Python version 1.0 was released in January 1994. The major new features
included in this release were the functional programming tools lambda, map, filter and
reduce, which Guido van Rossum never liked. Six and a half years later, in October 2000,
Python 2.0 was introduced. This release included list comprehensions, a full garbage
collector, and support for Unicode. Python flourished for another 8 years in the
2.x versions before the next major release, Python 3.0 (also known as "Python 3000"
and "Py3K"), was released. Python 3 is not backwards compatible with Python 2.x. The
emphasis in Python 3 was on the removal of duplicate programming constructs and
modules, thus fulfilling or coming close to fulfilling the 13th law of the Zen of Python:
"There should be one -- and preferably only one -- obvious way to do it."

Some changes in Python 3:

• Print is now a function.

• Views and iterators instead of lists

• The rules for ordering comparisons have been simplified. E.g., a heterogeneous
list cannot be sorted, because all the elements of a list must be comparable to
each other.
• There is only one integer type left, i.e., int. long is int as well.
• The division of two integers returns a float instead of an integer. "//" can be used
to have the "old" behaviour.
• Text vs. data instead of Unicode vs. 8-bit
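
The changes listed above can be observed directly in a Python 3 interpreter. The following is a small illustrative sketch; the variable names and sample values are arbitrary.

```python
# print is a function, so it accepts keyword arguments such as sep.
print("Python", 3, sep="-")

# dict.keys() returns a view, and map() returns an iterator, not a list.
ages = {"a": 1, "b": 2}
keys_view = ages.keys()
ages["c"] = 3                      # the view reflects later changes
squares = map(lambda n: n * n, [1, 2, 3])

# Ordering comparisons between unrelated types raise TypeError.
try:
    sorted([1, "two", 3.0])
    sortable = True
except TypeError:
    sortable = False

# True division returns a float; // gives the old integer behaviour.
true_div = 7 / 2
floor_div = 7 // 2

# There is a single integer type with arbitrary precision.
big = 2 ** 100

# Text (str) and binary data (bytes) are distinct types.
encoded = "ECG".encode("utf-8")
```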
Purpose

We demonstrated that our approach enables successful segmentation of intra-retinal
layers, even with low-quality images containing speckle noise, low contrast, and
different intensity ranges throughout, with the assistance of the ANIS feature.

Python

Python is an interpreted high-level programming language for general-purpose
programming. Created by Guido van Rossum and first released in 1991, Python has a
design philosophy that emphasizes code readability, notably using significant whitespace.

Python features a dynamic type system and automatic memory management. It supports
multiple programming paradigms, including object-oriented, imperative, functional and
procedural, and has a large and comprehensive standard library.

• Python is Interpreted − Python is processed at runtime by the interpreter. You do
not need to compile your program before executing it. This is similar to PERL and
PHP.

• Python is Interactive − you can actually sit at a Python prompt and interact with
the interpreter directly to write your programs.

Python also acknowledges that speed of development is important. Readable and terse
code is part of this, and so is access to powerful constructs that avoid tedious repetition
of code. Maintainability also ties into this: the amount of code may be an all but useless
metric, but it does say something about how much code you have to scan, read and/or
understand to troubleshoot problems or tweak behaviors. This speed of development, the
ease with which a programmer of other languages can pick up basic Python skills, and the
huge standard library are key to another area where Python excels. All its tools have been
quick to implement, saved a lot of time, and several of them have later been patched and
updated by people with no Python background - without breaking.

Modules Used in Project

TensorFlow

TensorFlow is a free and open-source software library for dataflow and differentiable
programming across a range of tasks. It is a symbolic math library and is also used for
machine learning applications such as neural networks. It is used for both research and
production at Google.

TensorFlow was developed by the Google Brain team for internal Google use. It was
released under the Apache 2.0 open-source license on November 9, 2015.

NumPy

NumPy is a general-purpose array-processing package. It provides a high-performance
multidimensional array object, and tools for working with these arrays.

It is the fundamental package for scientific computing with Python. It contains various
features including these important ones:

• A powerful N-dimensional array object

• Sophisticated (broadcasting) functions
• Tools for integrating C/C++ and Fortran code
• Useful linear algebra, Fourier transform, and random number capabilities
Besides its obvious scientific uses, NumPy can also be used as an efficient multi-
dimensional container of generic data. Arbitrary datatypes can be defined using NumPy
which allows NumPy to seamlessly and speedily integrate with a wide variety of
databases.
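
A short sketch of the NumPy features listed above: the N-dimensional array, broadcasting, and the linear algebra, Fourier transform and random number helpers. The array values are toy numbers chosen for illustration, not real ECG data.

```python
import numpy as np

# A 2-D array: 3 beats with 4 samples each (toy values, not real ECG data).
beats = np.array([[0.0, 0.5, 1.0, 0.5],
                  [0.2, 0.4, 0.8, 0.4],
                  [0.1, 0.3, 0.9, 0.3]])

# Broadcasting: the per-beat maximum (shape 3x1) is stretched across each row.
normalised = beats / beats.max(axis=1, keepdims=True)

# Linear algebra, Fourier transform and random number helpers.
gram = beats @ beats.T                    # 3x3 matrix of dot products
spectrum = np.abs(np.fft.rfft(beats[0]))  # magnitude spectrum of the first beat
rng = np.random.default_rng(seed=0)
noise = rng.normal(scale=0.01, size=beats.shape)

print(normalised.shape, gram.shape, spectrum.shape, noise.shape)
```

Dividing by the per-row maximum without an explicit loop is the kind of operation broadcasting is meant for.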

Pandas

Pandas is an open-source Python library providing high-performance data manipulation
and analysis tools built on its powerful data structures. Before Pandas, Python was mostly
used for data munging and preparation, and contributed very little to data analysis. Pandas
solved this problem. Using Pandas, we can accomplish five typical steps in the processing
and analysis of data, regardless of the origin of the data: load, prepare, manipulate, model,
and analyze. Python with Pandas is used in a wide range of academic and commercial
fields, including finance, economics, statistics, and analytics.
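
The steps named above can be sketched on a tiny in-memory table. The column names, the values and the class name "PVC" are made up for illustration, and the modelling step is left out to keep the example short.

```python
import io
import pandas as pd

# Load: read a small in-memory CSV (made-up values standing in for a real file).
csv_text = "beat_id,label,peak\n1,0,0.98\n2,0,\n3,2,0.71\n4,2,0.65\n"
df = pd.read_csv(io.StringIO(csv_text))

# Prepare: fill the missing peak value with the column median.
df["peak"] = df["peak"].fillna(df["peak"].median())

# Manipulate: add a readable class name for each numeric label.
df["class_name"] = df["label"].map({0: "Normal", 2: "PVC"})

# Analyze: summarise the peak amplitude per class.
summary = df.groupby("class_name")["peak"].mean()
print(summary)
```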

Matplotlib

Matplotlib is a Python 2D plotting library which produces publication quality figures in
a variety of hardcopy formats and interactive environments across platforms. Matplotlib
can be used in Python scripts, the Python and IPython shells, the Jupyter Notebook, web
application servers, and four graphical user interface toolkits. Matplotlib tries to make
easy things easy and hard things possible. You can generate plots, histograms, power
spectra, bar charts, error charts, scatter plots, etc., with just a few lines of code. For
examples, see the sample plots and thumbnail gallery.

For simple plotting the pyplot module provides a MATLAB-like interface, particularly
when combined with IPython. For the power user, you have full control of line styles,
font properties, axes properties, etc, via an object-oriented interface or via a set of
functions familiar to MATLAB users.

Scikit-learn

Scikit-learn provides a range of supervised and unsupervised learning algorithms via a
consistent interface in Python. It is licensed under a permissive simplified BSD license
and is distributed under many Linux distributions, encouraging academic and commercial
use.
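
The consistent interface of scikit-learn can be sketched as follows: two different classifiers are trained and scored with exactly the same calls. The data here is synthetic (generated with make_classification) and merely stands in for the ECG features, so the printed scores say nothing about the project's results.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import BernoulliNB

# Synthetic data standing in for the ECG features (not the MIT-BIH set).
X, y = make_classification(n_samples=300, n_features=10, n_informative=5,
                           n_classes=3, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=0)

# Every estimator exposes the same fit / predict / score methods.
scores = {}
for model in (BernoulliNB(), RandomForestClassifier(n_estimators=50, random_state=0)):
    model.fit(X_train, y_train)
    scores[type(model).__name__] = model.score(X_test, y_test)

print(scores)
```

Because the loop body never depends on the estimator type, swapping in another scikit-learn classifier only requires adding it to the tuple.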

Install Python Step-by-Step in Windows and Mac

Python, a versatile programming language, doesn’t come pre-installed on your computer.
Python was first released in 1991 and remains a very popular high-level programming
language. Its design philosophy emphasizes code readability with its notable use of
significant whitespace.

The object-oriented approach and language constructs provided by Python enable
programmers to write both clear and logical code for projects. This software does not
come pre-packaged with Windows.

How to Install Python on Windows and Mac

There have been several updates to Python over the years. The question is how to
install Python. It might be confusing for a beginner who is willing to start learning
Python, but this tutorial will answer that question. At the time of writing, the latest
version of Python is 3.7.4, in other words, Python 3.

Note: Python version 3.7.4 cannot be used on Windows XP or earlier.

Before you start the installation process, you need to know your system requirements.
You must download the Python version that matches your system type, i.e., your operating
system and processor. My system type is a Windows 64-bit operating system, so the steps
below install Python version 3.7.4 on a Windows 7 device. The steps on how to
install Python on Windows 10, 8 and 7 are divided into 4 parts to help understand better.

Download the Correct version into the system

Step 1: Go to the official site to download and install python using Google Chrome or
any other web browser. OR Click on the following link: https://www.python.org

Now, check for the latest and the correct version for your operating system.

Step 2: Click on the Download Tab.

Step 3: You can either select the yellow Download Python 3.7.4 button for Windows, or
you can scroll further down and click on the download link for the version you want.
Here, we are downloading the most recent Python version for Windows, 3.7.4.

Step 4: Scroll down the page until you find the Files option.

Step 5: Here you see the different builds of Python listed along with their operating systems.

• To download Windows 32-bit python, you can select any one from the three
options: Windows x86 embeddable zip file, Windows x86 executable installer or
Windows x86 web-based installer.

• To download Windows 64-bit python, you can select any one from the three
options: Windows x86-64 embeddable zip file, Windows x86-64 executable
installer or Windows x86-64 web-based installer.
Here we will install the Windows x86-64 web-based installer. This completes the first
part, choosing which version of Python to download. Now we move ahead with the
second part, installation.

Note: To know the changes or updates that are made in the version you can click on the
Release Note Option.

Installation of Python

Step 1: Go to Download and Open the downloaded python version to carry out the
installation process.

Step 2: Before you click on Install Now, make sure to put a tick on Add Python 3.7 to
PATH.

Step 3: Click on Install Now. After the installation is successful, click on Close.

With these above three steps on python installation, you have successfully and correctly
installed Python. Now is the time to verify the installation.

Note: The installation process might take a couple of minutes.

Verify the Python Installation

Step 1: Click on Start

Step 2: In the Windows Run Command, type “cmd”.

Step 3: Open the Command prompt option.

Step 4: Let us test whether Python is correctly installed. Type python -V and press
Enter.

Step 5: You will get the answer as Python 3.7.4

Note: If you have any earlier version of Python already installed, you must first
uninstall the earlier version and then install the new one.
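
The installed version can also be checked from inside Python itself. The following is a small sketch using only the standard library; the variable names are arbitrary.

```python
import platform
import sys

# sys.version_info holds the interpreter version as a tuple of integers.
major, minor, micro = sys.version_info[:3]
version_text = f"{major}.{minor}.{micro}"

# platform.system() names the operating system the interpreter runs on.
print("Running Python", version_text, "on", platform.system())
```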

Check how the Python IDLE works

Step 1: Click on Start

Step 2: In the Windows Run command, type “python idle”.

Step 3: Click on IDLE (Python 3.7 64-bit) and launch the program

Step 4: To go ahead with working in IDLE you must first save the file. Click on File >
Click on Save

Step 5: Name the file and set the save-as type to Python files. Click on SAVE. Here I
have named the file Hey World.

Step 6: Now, for example, enter print("Hey World") and press Enter.

You will see that the command given is executed. With this, we end our tutorial on how
to install Python. You have learned how to download Python for Windows onto your
operating system.

Note: Unlike Java, Python does not need semicolons at the end of statements.

CHAPTER 7

SOURCE CODE
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import joblib
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.metrics import classification_report
from sklearn.metrics import confusion_matrix
import seaborn as sns
import os
import warnings

warnings.filterwarnings("ignore")

# Load the training set; the class label is stored in column 187
df_train = pd.read_csv("mitbih_train.csv", header=None)
df_train.head()

# Plot two example beats (rows 1 and 4)
plt.plot(df_train.iloc[1, :186])
plt.plot(df_train.iloc[4, :186])

# Plot the circle of value counts in dataset
def plot_equilibre(equilibre):
    plt.figure(figsize=(5, 5))
    my_circle = plt.Circle((0, 0), 0.7, color='white')
    plt.pie(equilibre, labels=['n', 'q', 'v', 's', 'f'],
            colors=['red', 'green', 'blue', 'skyblue', 'orange'], autopct='%1.1f%%')
    p = plt.gcf()
    p.gca().add_artist(my_circle)
    plt.show()

print(df_train[187].value_counts())
plot_equilibre(df_train[187].value_counts())

# Features are all columns except the last; the last column is the class label
X = df_train.values[:, :-1]
y = df_train.values[:, -1].astype(int)

# Split the dataset into training and testing sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Plot one ECG for each category
C0 = np.argwhere(y_train == 0).flatten()
C1 = np.argwhere(y_train == 1).flatten()
C2 = np.argwhere(y_train == 2).flatten()
C3 = np.argwhere(y_train == 3).flatten()
C4 = np.argwhere(y_train == 4).flatten()

x = np.arange(0, 187) * 8 / 1000  # time axis in seconds (8 ms per sample)
plt.figure(figsize=(20, 6))
plt.plot(x, X_train[C0, :][0], label="Normal")
plt.plot(x, X_train[C1, :][0], label="Atrial Premature")
plt.plot(x, X_train[C2, :][0], label="Premature ventricular contraction")
plt.plot(x, X_train[C3, :][0], label="Fusion of ventricular and normal")
plt.plot(x, X_train[C4, :][0], label="Fusion of paced and normal")
plt.legend()
plt.title("1-beat ECG for every category", fontsize=20)
plt.ylabel("Amplitude", fontsize=15)
plt.xlabel("Time (s)", fontsize=15)
plt.show()

from sklearn.naive_bayes import BernoulliNB

if os.path.exists('BNB_classifier_weights.pkl'):
    # Load the model from the pkl file
    bnb_classifier = joblib.load('BNB_classifier_weights.pkl')
    y_pred = bnb_classifier.predict(X_test)
    accuracy = accuracy_score(y_test, y_pred)
    print("Accuracy:", accuracy)
else:
    # Initialize Bernoulli Naive Bayes classifier
    bnb_classifier = BernoulliNB()
    # Train the classifier on the training data
    bnb_classifier.fit(X_train, y_train)
    # Make predictions on the test data
    y_pred = bnb_classifier.predict(X_test)
    # Calculate the accuracy of the classifier
    accuracy = accuracy_score(y_test, y_pred)
    print("Accuracy:", accuracy)
    # Save the model weights to a pkl file
    joblib.dump(bnb_classifier, 'BNB_classifier_weights.pkl')
    print("Model trained and model weights saved.")

from sklearn.metrics import classification_report, confusion_matrix

labels = ["Normal",
          "Atrial Premature",
          "Premature ventricular contraction",
          "Fusion of ventricular and normal",
          "Fusion of paced and normal"]
print(classification_report(y_test, y_pred, target_names=labels))

# Create a figure and axis for the plot
cm = confusion_matrix(y_test, y_pred)
fig, ax = plt.subplots()

# Create a heatmap using seaborn
sns.heatmap(cm, annot=True, fmt='d', cmap='Blues', xticklabels=labels,
            yticklabels=labels, ax=ax)

# Set labels and title
ax.set_xlabel('Predicted')
ax.set_ylabel('Actual')
ax.set_title('NBC Confusion Matrix')

# Show the plot
plt.show()

from sklearn.ensemble import RandomForestClassifier

# Check if the pkl file exists
if os.path.exists('rf_classifier_weights.pkl'):
    # Load the model from the pkl file
    rf_classifier = joblib.load('rf_classifier_weights.pkl')
    y_pred1 = rf_classifier.predict(X_test)
    # Score the Random Forest predictions (y_pred1), not the Naive Bayes ones
    accuracy = accuracy_score(y_test, y_pred1)
    print("Accuracy:", accuracy)
else:
    rf_classifier = RandomForestClassifier(n_estimators=100, random_state=42,
                                           min_samples_split=2, min_samples_leaf=1)
    # Train the classifier on the training data
    rf_classifier.fit(X_train, y_train)
    # Make predictions on the test data
    y_pred1 = rf_classifier.predict(X_test)
    # Calculate the accuracy of the classifier
    accuracy = accuracy_score(y_test, y_pred1)
    print("Accuracy:", accuracy)
    # Save the model weights to a pkl file
    joblib.dump(rf_classifier, 'rf_classifier_weights.pkl')
    print("model trained and model weights saved.")

from sklearn.metrics import classification_report, confusion_matrix

labels = ["Normal",
          "Atrial Premature",
          "Premature ventricular contraction",
          "Fusion of ventricular and normal",
          "Fusion of paced and normal"]
print(classification_report(y_test, y_pred1, target_names=labels))

# Create a figure and axis for the plot
cm1 = confusion_matrix(y_test, y_pred1)
fig, ax = plt.subplots()

# Create a heatmap using seaborn
sns.heatmap(cm1, annot=True, fmt='d', cmap='Blues', xticklabels=labels,
            yticklabels=labels, ax=ax)

# Set labels and title
ax.set_xlabel('Predicted')
ax.set_ylabel('Actual')
ax.set_title('RFC Confusion Matrix')

# Show the plot
plt.show()

filename = r"mitbih_test.csv"
dataset = pd.read_csv(filename, header=None)

A = 'Normal'
B = 'Atrial Premature'
C = 'Premature ventricular contraction'
D = 'Fusion of ventricular and normal'
E = 'Fusion of paced and normal'

# Predict on the first nine rows of the test file (the last column is the label),
# so that each printed row matches the prediction shown next to it
predict = rf_classifier.predict(dataset.iloc[:9, :-1].values)
for i in range(len(predict)):
    if predict[i] == 0:
        print("{} :{} ".format(dataset.iloc[i, :], A))
    elif predict[i] == 1:
        print("{} :{} ".format(dataset.iloc[i, :], B))
    elif predict[i] == 2:
        print("{} :{} ".format(dataset.iloc[i, :], C))
    elif predict[i] == 3:
        print("{} :{} ".format(dataset.iloc[i, :], D))
    else:
        print("{} :{} ".format(dataset.iloc[i, :], E))

CHAPTER-8

RESULT AND DISCUSSION

8.1 SAMPLE DATASET

The dataset for heart health classification, shown in Figure 8.1, consists of electrocardiogram (ECG) signals
obtained from human subjects. These signals are captured using ECG sensors attached to
the body, typically on the chest area. The dataset contains various attributes extracted
from these signals, such as waveforms, intervals, and amplitudes. Each data point in the
dataset represents a specific ECG recording from an individual. Additionally, the dataset
includes corresponding labels indicating the heart health status of each subject, such as
normal, arrhythmia, or other cardiac conditions. This dataset serves as a valuable resource
for developing and testing machine learning algorithms aimed at accurately classifying
heart health based on ECG signals.

The dataset consists of electrocardiogram (ECG) signals collected from individuals. Each
ECG signal represents the electrical activity of the heart over a period of time. These
signals are processed to extract numerical features, such as amplitude, frequency, and
duration of specific waveforms, like the P wave, QRS complex, and T wave. These
numerical features serve as inputs for machine learning algorithms to classify the heart
health status of the individuals, distinguishing between normal and abnormal conditions.
The dataset aims to aid researchers and healthcare professionals in developing accurate
and efficient methods for ECG-based heart health classification.

Figure 8.1 Sample Dataset

Figure 8.2 The total Mitbih_train Dataset

8.2 SAMPLE FIGURE OF ECG

The sample electrocardiogram (ECG) in Figure 8.3 displays the electrical activity
of the heart, providing insights into heart health classification. The ECG waveform
comprises distinct peaks and troughs, representing different phases of cardiac activity.
The P wave indicates atrial depolarization, while the QRS complex signifies ventricular
depolarization. The ST segment reflects the time between ventricular depolarization and
repolarization, crucial for assessing myocardial ischemia. Lastly, the T wave corresponds
to ventricular repolarization. By analyzing the amplitude, duration, and morphology of
these waveforms, healthcare professionals can evaluate the heart's rhythm, identify
abnormalities such as arrhythmias or ischemia, and classify the overall cardiac health
status of an individual.

Figure 8.3 Sample Figure of ECG

8.3 THE CIRCLE OF VALUE COUNTS IN DATASET

Figure 8.4 shows the class distribution of the dataset. The majority of the beats, 82.8%,
belong to the category denoted 'n'. The next largest category is 'q', accounting for 7.3%,
followed by 'v' with 6.6% of the total.

The remaining categories, 's' and 'f', make up only a minor portion of the dataset, with
's' at 2.5% and 'f' at 0.7%. The dataset is therefore strongly imbalanced towards the 'n'
category, although all five categories are represented.
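
Percentage shares like the ones above can be computed from a label column as sketched below. The label list is a made-up toy sample, not the real dataset, and the mapping of numeric labels to the letters n, s, v, f and q is an assumption based on the class order used in the source code.

```python
from collections import Counter

# Toy label column: 0=n, 1=s, 2=v, 3=f, 4=q (a made-up sample, not the real data).
labels = [0] * 16 + [4] * 2 + [2, 1]

counts = Counter(labels)
total = sum(counts.values())

# Share of each class as a percentage, largest first.
shares = {cls: 100 * n / total for cls, n in counts.most_common()}
print(shares)
```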

Figure 8.4 The circle of value counts in dataset

8.4 ONE ECG FOR EACH CATEGORY

In a normal heart rhythm, the ECG tracing shows regular, evenly spaced peaks and
valleys with a smooth, consistent and characteristic pattern. This trace is plotted in blue.
Each beat follows the previous one at a predictable interval, as expected of a healthy
heart.

In atrial premature contractions, premature beats originating in the atria disrupt the
regular rhythm and cause deviations in the ECG tracing; this trace is plotted in orange.
Similarly, premature ventricular contractions appear as early, abnormal beats originating
in the ventricles and are plotted in green. Fusion of ventricular and normal beats shows a
combination of normal and irregular beats and is plotted in red. Fusion of paced and
normal beats shows a combination of paced and normal beats, a blend of artificial pacing
and the natural heart rhythm, and is plotted in violet.

Figure 8.5 shows one electrocardiogram (ECG) for each heart condition category listed in
the classification report. Each trace is a representative example of its category and makes
the differences between the conditions visible.

Figure 8.5 One ECG for each category

8.5 CLASSIFICATION REPORT

Table 8.1 reports precision, recall, F1-score, and support for each class, together with the
overall accuracy, macro average, and weighted average, for the existing Naive Bayes
algorithm. Naive Bayes reaches an overall accuracy of only 0.38, mainly because it
recovers just 34% of the Normal beats, which make up most of the evaluated samples.

Table 8.1 Existing Naive Bayes report

Class                                Precision   Recall   F1-Score   Support

Normal                                    0.92     0.34       0.49     14579
Atrial Premature                          0.30     0.10       0.15       426
Premature Ventricular Contraction         0.11     0.30       0.16      1112
Fusion of Ventricular and Normal          0.04     0.97       0.07       145
Fusion of Paced and Normal                0.23     0.95       0.37      1249

Accuracy                                                      0.38     17511
Macro Average                             0.32     0.53       0.25     17511
Weighted Average                          0.79     0.38       0.45     17511
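
The macro and weighted averages in Table 8.1 follow directly from the per-class rows. The sketch below recomputes them from the rounded values printed in the table; the helper names macro_average and weighted_average are hypothetical and are not taken from the project code. Because the inputs are already rounded to two decimals, a recomputed value can differ from the reported one in the last digit.

```python
# Per-class rows from Table 8.1 (Naive Bayes): precision, recall, F1, support.
rows = {
    "Normal": (0.92, 0.34, 0.49, 14579),
    "Atrial Premature": (0.30, 0.10, 0.15, 426),
    "Premature Ventricular Contraction": (0.11, 0.30, 0.16, 1112),
    "Fusion of Ventricular and Normal": (0.04, 0.97, 0.07, 145),
    "Fusion of Paced and Normal": (0.23, 0.95, 0.37, 1249),
}


def macro_average(rows, column):
    """Unweighted mean of one metric column over all classes."""
    return sum(r[column] for r in rows.values()) / len(rows)


def weighted_average(rows, column):
    """Mean of one metric column, weighted by each class's support."""
    total = sum(r[3] for r in rows.values())
    return sum(r[column] * r[3] for r in rows.values()) / total


for name, col in (("precision", 0), ("recall", 1), ("f1", 2)):
    macro = round(macro_average(rows, col), 2)
    weighted = round(weighted_average(rows, col), 2)
    print(name, macro, weighted)
```

The support-weighted recall is the same quantity as the overall accuracy, which is why both appear as 0.38 in the table. The macro average treats the 145-sample fusion class and the 14579-sample Normal class equally, so it is the more informative summary for an imbalanced dataset such as this one.
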
Table 8.2 presents the same per-class metrics, overall accuracy, macro average, and
weighted average for the proposed Random Forest Classifier (RFC) algorithm, so that its
performance can be compared against the Naive Bayes results in Table 8.1.

Table 8.2 Proposed RFC report

Class                                Precision   Recall   F1-Score   Support

Normal                                    0.97     1.00       0.99     14579
Atrial Premature                          0.99     0.63       0.77       426
Premature Ventricular Contraction         0.96     0.88       0.92      1112
Fusion of Ventricular and Normal          0.91     0.63       0.75       145
Fusion of Paced and Normal                1.00     0.95       0.98      1249

Accuracy                                                      0.98     17511
Macro Average                             0.97     0.82       0.88     17511
Weighted Average                          0.98     0.98       0.97     17511

Table 8.3 directly compares the overall performance of Naive Bayes and RFC in terms of
accuracy and the macro- and weighted-average precision taken from Tables 8.1 and 8.2.
RFC outperforms Naive Bayes on all three measures.

Table 8.3 Overall performance comparison

Metric                          Naive Bayes     RFC

Accuracy                               0.38    0.98
Macro Average (Precision)              0.32    0.97
Weighted Average (Precision)           0.79    0.98

Table 8.4 compares Naive Bayes and RFC on the Normal class. Precision is high for both
algorithms, but Naive Bayes recovers only 34% of the Normal beats, whereas RFC
recovers essentially all of them.

Table 8.4 Normal class performance comparison

Metric       Naive Bayes     RFC

Precision           0.92    0.97
Recall              0.34    1.00
F1-Score            0.49    0.99
Support            14579   14579

Table 8.5 compares the two algorithms on the Atrial Premature class. RFC raises precision
from 0.30 to 0.99 and recall from 0.10 to 0.63, although more than a third of the atrial
premature beats are still missed.

Table 8.5 Atrial Premature class performance comparison

Metric       Naive Bayes     RFC

Precision           0.30    0.99
Recall              0.10    0.63
F1-Score            0.15    0.77
Support              426     426

Table 8.6 compares the two algorithms on the Premature Ventricular Contraction class.
RFC improves every metric, with the F1-score rising from 0.16 to 0.92.

Table 8.6 Premature Ventricular Contraction class performance comparison

Metric       Naive Bayes     RFC

Precision           0.11    0.96
Recall              0.30    0.88
F1-Score            0.16    0.92
Support             1112    1112

Table 8.7 compares the two algorithms on the Fusion of Ventricular and Normal class, the
smallest class with only 145 evaluated samples. Naive Bayes attains a recall of 0.97 but a
precision of just 0.04, meaning that it assigns this label to many beats of other classes.
RFC trades some recall (0.63) for a much higher precision (0.91), and its F1-score is 0.75
against 0.07 for Naive Bayes.

Table 8.7 Fusion of Ventricular and Normal class performance comparison

Metric       Naive Bayes     RFC

Precision           0.04    0.91
Recall              0.97    0.63
F1-Score            0.07    0.75
Support              145     145
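
The F1-score in these tables is the harmonic mean of precision and recall, which is why a class with very high recall but very low precision still scores poorly. The sketch below illustrates this with the Fusion of Ventricular and Normal values from Table 8.7; the function name f1_score is a hypothetical helper, and since the inputs are the rounded table values the results agree with the reported F1-scores only to within about 0.01.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall (0.0 when both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Fusion of Ventricular and Normal class, values from Table 8.7.
nb_f1 = f1_score(0.04, 0.97)   # Naive Bayes: high recall, very low precision
rfc_f1 = f1_score(0.91, 0.63)  # RFC: lower recall, high precision
print(f"Naive Bayes F1: {nb_f1:.3f}, RFC F1: {rfc_f1:.3f}")
```

The harmonic mean is dominated by the smaller of the two inputs, so the Naive Bayes recall of 0.97 cannot compensate for its precision of 0.04.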

Table 8.8 compares the two algorithms on the Fusion of Paced and Normal class. Both
reach a recall of 0.95, but the RFC precision of 1.00 against 0.23 for Naive Bayes lifts the
F1-score from 0.37 to 0.98.

Table 8.8 Fusion of Paced and Normal class performance comparison

Metric       Naive Bayes     RFC

Precision           0.23    1.00
Recall              0.95    0.95
F1-Score            0.37    0.98
Support             1249    1249

8.6 CONFUSION MATRIX

Figure 8.6 shows the confusion matrix for the Naive Bayes algorithm. A confusion matrix
is a visual representation of the performance of a classification model: each cell counts
the samples of one actual class that were assigned to one predicted class, so it illustrates
how well the model predicts each class compared to the actual labels.

Figure 8.6 Naive Bayes class confusion matrix

Figure 8.7 shows the corresponding confusion matrix for the Random Forest Classifier
(RFC) algorithm, again depicting the model's predictions against the actual class labels.

Figure 8.7 RFC Confusion matrix
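
As a minimal sketch of how such a matrix is built and read, the example below counts (actual, predicted) pairs and derives the precision and recall of one class from the counts. It assumes rows index the actual class and columns the predicted class, which may not match the orientation used in Figures 8.6 and 8.7. The labels are toy values for illustration only, not the project's data, and the function names are hypothetical.

```python
def confusion_matrix(actual, predicted, labels):
    """Count matrix with rows = actual class, columns = predicted class."""
    index = {label: i for i, label in enumerate(labels)}
    matrix = [[0] * len(labels) for _ in labels]
    for a, p in zip(actual, predicted):
        matrix[index[a]][index[p]] += 1
    return matrix


def precision_recall(matrix, k):
    """Precision and recall of class k read off the confusion matrix."""
    tp = matrix[k][k]
    predicted_k = sum(row[k] for row in matrix)  # column sum: all predicted as k
    actual_k = sum(matrix[k])                    # row sum: all truly of class k
    precision = tp / predicted_k if predicted_k else 0.0
    recall = tp / actual_k if actual_k else 0.0
    return precision, recall


# Toy labels for illustration only; these are not the project's data.
labels = ["n", "s", "v"]
actual = ["n", "n", "n", "s", "s", "v", "v", "v"]
predicted = ["n", "n", "s", "s", "n", "v", "v", "n"]

cm = confusion_matrix(actual, predicted, labels)
for label, row in zip(labels, cm):
    print(label, row)
print(precision_recall(cm, labels.index("n")))
```

Diagonal cells hold the correctly classified samples, so a good classifier concentrates its counts on the diagonal. The per-class precision and recall values in the classification report are the column-wise and row-wise ratios of these diagonal cells.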

CHAPTER 9

CONCLUSION AND FUTURE SCOPE

9.1 CONCLUSION

In conclusion, the study employs a random forest classifier to classify heart health from
ECG data. The classifier reaches an overall accuracy of 0.98, compared with 0.38 for the
Naive Bayes baseline, and distinguishes normal beats from the four abnormal categories.
This suggests that ECG data can be effectively utilized for heart health classification,
offering a non-invasive and efficient method for assessing cardiovascular well-being.

Additionally, the study highlights the importance of leveraging machine learning
techniques, such as random forest classifiers, in healthcare applications. By utilizing large
datasets of ECG data, these algorithms can learn complex patterns indicative of different
heart conditions. This not only enhances diagnostic capabilities but also provides insights
into potential preventive measures and personalized treatment strategies for individuals
at risk of cardiovascular diseases.

Moreover, the findings underscore the potential of incorporating ECG-based heart health
classification into routine medical practices. With advancements in technology and the
increasing availability of ECG devices, this approach could offer a cost-effective and
accessible means of assessing heart health. By identifying individuals at risk early on,
healthcare providers can intervene promptly, potentially reducing the burden of
cardiovascular diseases and improving overall patient outcomes.

In summary, the study demonstrates the efficacy of using random forest classifiers to
classify heart health based on ECG data. It emphasizes the utility of machine learning in
healthcare and advocates for the integration of ECG-based classification methods into
routine medical assessments. Ultimately, these efforts have the potential to enhance
preventive care, facilitate early diagnosis, and improve the management of cardiovascular
diseases, contributing to better health outcomes for individuals.

9.2 FUTURE SCOPE

The future scope of classifying heart health from ECG data using a random forest
classifier is promising. Currently, the method shows effectiveness in accurately
identifying patterns associated with various heart conditions. As technology advances,
the classifier's accuracy and efficiency are likely to improve, making it a valuable tool in
routine health screenings.

Additionally, ongoing research aims to refine the classifier's algorithms, enabling it to
detect subtle abnormalities in ECG data that may indicate early signs of cardiovascular
diseases. This could lead to earlier intervention and better management of heart
conditions, ultimately improving patient outcomes and reducing healthcare costs.

Furthermore, the integration of machine learning techniques with wearable ECG
monitoring devices holds immense potential. This could enable real-time monitoring of
heart health, providing individuals with immediate feedback on their cardiovascular
status and allowing healthcare professionals to intervene promptly when necessary.

Overall, the continued development and implementation of ECG-based heart health
classification with random forest classifiers have the potential to revolutionize preventive
healthcare by offering non-invasive, accessible, and accurate methods for assessing and
managing cardiovascular risk in individuals.

9.3 REFERENCES

[1] Malakouti, Seyed Matin. "Heart disease classification based on ECG using machine
learning models." Biomedical Signal Processing and Control 84 (2023): 104796.
[2] Ozcan, Mert, and Serhat Peker. "A classification and regression tree algorithm for
heart disease modeling and prediction." Healthcare Analytics 3 (2023): 100130.
[3] Fakhry, Mahmoud, and Ascensión Gallardo-Antolín. "Elastic net regularization and
gabor dictionary for classification of heart sound signals using deep learning."
Engineering Applications of Artificial Intelligence 127 (2024): 107406.
[4] Nguyen, Minh Tuan, Wei Wen Lin, and Jin H. Huang. "Heart Sound Classification
Using Deep Learning Techniques Based on Log-mel Spectrogram." Circuits, Systems,
and Signal Processing 42, no. 1 (2023): 344-360.

Department of Electronics and Communication Engineering,SVCN,Nellore 99

Multi-Class Heart Health Classification From ECG Data With RFC

[5] Huang, Youhe, Hongru Li, and Xia Yu. "A novel time representation input based on
deep learning for ECG classification." Biomedical Signal Processing and Control 83
(2023): 104628.
[6] Sk, Khader Basha, D. Roja, Sunkara Santhi Priya, Lavanya Dalavi, Sai Srinivas
Vellela, and Venkateswara Reddy. "Coronary Heart Disease Prediction and
Classification using Hybrid Machine Learning Algorithms." In 2023 International
Conference on Innovative Data Communication Technologies and Application
(ICIDCA), pp. 1-7. IEEE, 2023.
[7] Li, Jiajia, Christopher Brown, Dillon J. Dzikowicz, Mary G. Carey, Wai Cheong Tam,
and Michael Xuelin Huang. "Towards real-time heart health monitoring in firefighting
using convolutional neural networks." Fire Safety Journal 140 (2023): 103852.
[8] Chen, Dan, Juan Feng, HongYan He, WeiPing Xiao, and XiaoJing Liu.
"Classification, Diagnosis, and Treatment of Obesity-Related Heart Diseases."
Metabolic Syndrome and Related Disorders (2024).
[9] Maulani, Ahmad Alaik, Sri Winarno, Junta Zeniarja, Rusyda Tsaniya Eka Putri, and
Ailsa Nurina Cahyani. "Comparison of Hyperparameter Optimization Techniques in
Hybrid CNNLSTM Model for Heart Disease Classification." Sinkron: jurnal dan
penelitian teknik informatika 9, no. 1 (2024): 455-465.
[10] Parveen, Nikhat, Manisha Gupta, Shirisha Kasireddy, Md Shamsul Haque Ansari,
and Mohammad Nadeem Ahmed. "ECG based one-dimensional residual deep
convolutional autoencoder model for heart disease classification." Multimedia Tools
and Applications (2024): 1-27.
[11] Tartarisco, Gennaro, Giovanni Cicceri, Roberta Bruschetta, Alessandro Tonacci,
Simona Campisi, Salvatore Vitabile, Antonio Cerasa et al. "An intelligent Medical
Cyber–Physical System to support heart valve disease screening and diagnosis."
Expert Systems with Applications 238 (2024): 121772.
[12] Jemima, P. Preethy, R. Gokul, R. Ashwin, and S. Matheswaran. "Optimized
Generalised Metric Learning Model for Iterative, Efficient, Accurate, and Improved
Coronary Heart Diseases." In Advanced Applications of Generative AI and Natural
Language Processing Models, pp. 373-388. IGI Global, 2024.
[13] Shivadekar, Samit, Ketan Shahapure, Shivam Vibhute, and Ashley Dunn.
"Evaluation of Machine Learning Methods for Predicting Heart Failure
Readmissions: A Comparative Analysis." International Journal of Intelligent Systems
and Applications in Engineering 12, no. 6s (2024): 694-699.
[14] Chakraborty, C. Parnasree. "Integrating Neural Networks and Traditional Models: A
Hybrid Approach for Accurate Heart Disease Prediction." (2024).

Department of Electronics and Communication Engineering,SVCN,Nellore 100

Multi-Class Heart Health Classification From ECG Data With RFC

[15] Kaur, Ishleen, and Tanvir Ahmad. "A cluster-based ensemble approach for congenital
heart disease prediction." Computer Methods and Programs in Biomedicine 243
(2024): 107922.
[16] Searles, Charles D. "MicroRNAs and Cardiovascular Disease Risk." Current
Cardiology Reports (2024): 1-10.
[17] Wright, Brandon, Carly Fassler, Dmitry Tumin, and Lauren A. Sarno. "Health system
encounters after loss to cardiology follow-up among patients with congenital heart
disease." The Journal of Pediatrics (2024): 113931.
[18] Jou, Stephanie, Sean R. Mendez, Jason Feinman, Lindsey R. Mitrani, Valentin
Fuster, Massimo Mangiola, Nader Moazami, and Claudia Gidea. "Heart
transplantation: Advances in expanding the donor pool and xenotransplantation."
Nature Reviews Cardiology 21, no. 1 (2024): 25-36.
[19] Christogianni, Aikaterini. "The Benefits of Continuous Health Data Monitoring in
Cardiovascular Diseases and Dementia." In Encyclopedia of Information Science
and Technology, Sixth Edition, pp. 1-22. IGI Global, 2025.
[20] Baek, Ji Yoon, Seung Hee Seo, Sooyoung Cho, Jun-Bean Park, Bhumsuk Keam,
Shin Hye Yoo, and Aesun Shin. "Emergency department visits of newly diagnosed
cardiovascular disease patients in Korea during the COVID-19 pandemic." Scientific
Reports 14, no. 1 (2024): 397.
[21] Trivedi, Rupal. "Cardiovascular Disease Management In The South: An
Implementation Science Approach." PhD diss., University of Georgia.
[22] Wiatma, Deny Sutrisna, Reksa Samoedra, I. Putu Bayu Agus Saputra, and Bayu
Setia. "Physical Activity and Smoking Habits are Closely Related to Cardiovascular
Endurance in Farmers." Indonesian Journal of Global Health Research 6, no. 1
(2024): 263-270.
[23] Bhende, Vishal V., Tanishq S. Sharma, Mathangi Krishnakumar, Anikode
Subramanian Ramaswamy, Kanchan Bilgi, Sohilkhan R. Pathan, and Sohilkhan
Pathan. "The Myths, Perils, and Pitfalls of Redo Pediatric Cardiac Surgery: The New
Normal in Developing Countries Such as India." Cureus 16, no. 1 (2024).
[24] Charchar, Fadi J., Priscilla R. Prestes, Charlotte Mills, Siew Mooi Ching, Dinesh
Neupane, Francine Z. Marques, James E. Sharman et al. "Lifestyle management of
hypertension: International Society of Hypertension position paper endorsed by the
World Hypertension League and European Society of Hypertension." Journal of
hypertension 42, no. 1 (2024): 23-49.
[25] Campbell‐Washburn, Adrienne E., Juliet Varghese, Krishna S. Nayak, Rajiv
Ramasawmy, and Orlando P. Simonetti. "Cardiac MRI at low field strengths."
Journal of Magnetic Resonance Imaging 59, no. 2 (2024): 412-430.

Department of Electronics and Communication Engineering,SVCN,Nellore 101

Multi-Class Heart Health Classification From ECG Data With RFC

[26] Seng, Nang San Hti Lar, Gebremichael Zeratsion, Oscar Yasser Pena Zapata,
Muhammad Umer Tufail, and Belinda Jim. "Utility of cardiac troponins in patients
with chronic kidney disease." Cardiology in Review 32, no. 1 (2024): 62-70.
[27]Thummisetti, Bala Siva Prakash, and Haritha Atluri. "Advancing Healthcare
Informatics for Empowering Privacy and Security through Federated Learning
Paradigms." International Journal of Sustainable Development in Computing
Science 1, no. 1 (2024): 1-16.
[28] Shield, Kevin, Catherine Paradis, Peter Butt, Tim Naimi, Adam Sherk, Mark
Asbridge, Daniel Myran et al. "New perspectives on how to formulate alcohol
drinking guidelines." Addiction 119, no. 1 (2024): 9-19.
[29] Neshat, Sina, Abbas Rezaei, Armita Farid, Salar Javanshir, Fatemeh Dehghan Niri,
Padideh Daneii, Kiyan Heshmat-Ghahdarijani, and Setayesh Sotoudehnia Korani.
"Cardiovascular diseases risk predictors: ABO blood groups in a different role."
Cardiology in Review 32, no. 2 (2024): 174-179.
[30] Lee, Chien-Chiang, and Zihao Yuan. "Impact of energy poverty on public health: A
nonlinear study from an international perspective." World Development 174 (2024):
106444.
[31] Li, Jian Ping, Amin Ul Haq, Salah Ud Din, Jalaluddin Khan, Asif Khan, and Abdus
Saboor. "Heart disease identification method using machine learning classification
in e-healthcare." IEEE access 8 (2020): 107562-107582.
[32] Deng, Muqing, Tingting Meng, Jiuwen Cao, Shimin Wang, Jing Zhang, and Huijie
Fan. "Heart sound classification based on improved MFCC features and
convolutional recurrent neural networks." Neural Networks 130 (2020): 22-32.
[33] Abdellatif, Abdallah, Hamdan Abdellatef, Jeevan Kanesan, Chee-Onn Chow, Joon
Huang Chuah, and Hassan Muwafaq Gheni. "An effective heart disease detection
and severity level classification model using machine learning and hyperparameter
optimization methods." ieee access 10 (2022): 79974-79985.
[34] Chen, Yongchao, Shoushui Wei, and Yatao Zhang. "Classification of heart sounds
based on the combination of the modified frequency wavelet transform and
convolutional neural network." Medical & Biological Engineering & Computing 58
(2020): 2039-2047.
[35] Shah, Devansh, Samir Patel, and Santosh Kumar Bharti. "Heart disease prediction
using machine learning techniques." SN Computer Science 1 (2020): 1-6.
[36] Oliveira, Jorge, Francesco Renna, Paulo Dias Costa, Marcelo Nogueira, Cristina
Oliveira, Carlos Ferreira, Alípio Jorge et al. "The CirCor DigiScope dataset: from
murmur detection to murmur classification." IEEE journal of biomedical and health
informatics 26, no. 6 (2021): 2524-2535.

Department of Electronics and Communication Engineering,SVCN,Nellore 102

Multi-Class Heart Health Classification From ECG Data With RFC

[37] Balaji, Tata. "An insight on machine learning algorithms for predicting heart
diseases." Turkish Journal of Computer and Mathematics Education (TURCOMAT)
12, no. 10 (2021): 5867-5877.
[38] Pati, Abhilash, Manoranjan Parhi, and Binod Kumar Pattanayak. "IHDPM: An
integrated heart disease prediction model for heart disease prediction." International
Journal of Medical Engineering and Informatics 14, no. 6 (2022): 564-577.
[39] Nagendra, Kolluru Venkata, Maligela Ussenaiah, and N. Rajasekhar. "Design and
Development of EGB Classification Model for predicting Heart Diseases." In 2020
2nd International Conference on Innovative Mechanisms for Industry Applications
(ICIMIA), pp. 359-366. IEEE, 2020.
[40] Vamshi Kumar, S., T. V. Rajinikanth, and S. Viswanadha Raju. "Heart Attack
Classification Using SVM with LDA and PCA Linear Transformation Techniques."
In Machine Learning Technologies and Applications: Proceedings of ICACECS
2020, pp. 99-112. Springer Singapore, 2021.
[41] Ali, Farman, Shaker El-Sappagh, SM Riazul Islam, Daehan Kwak, Amjad Ali,
Muhammad Imran, and Kyung-Sup Kwak. "A smart healthcare monitoring system
for heart disease prediction based on ensemble deep learning and feature fusion."
Information Fusion 63 (2020): 208-222.
[42] Katarya, Rahul, and Sunit Kumar Meena. "Machine learning techniques for heart
disease prediction: a comparative study and analysis." Health and Technology 11
(2021): 87-97.
[43] Katarya, Rahul, and Polipireddy Srinivas. "Predicting heart disease at early stages
using machine learning: A survey." In 2020 International Conference on Electronics
and Sustainable Communication Systems (ICESC), pp. 302-305. IEEE, 2020.
[44] Rath, Adyasha, Debahuti Mishra, Ganapati Panda, and Suresh Chandra Satapathy.
"An exhaustive review of machine and deep learning based diagnosis of heart
diseases." Multimedia Tools and Applications 81, no. 25 (2022): 36069-36127.
[45] Menshawi, Alaa, Mohammad Mehedi Hassan, Nasser Allheeib, and Giancarlo
Fortino. "A Hybrid Generic Framework for Heart Problem diagnosis based on a
machine learning paradigm." Sensors 23, no. 3 (2023): 1392.
[46] Kumar, Ashish, Rama Komaragiri, and Manjeet Kumar. "Heart rate monitoring and
therapeutic devices: a wavelet transform based approach for the modeling and
classification of congestive heart failure." ISA transactions 79 (2018): 239-250.
[47] Bahrami, Boshra, and Mirsaeid Hosseini Shirvani. "Prediction and diagnosis of heart
disease by data mining techniques." Journal of Multidisciplinary Engineering
Science and Technology (JMEST) 2, no. 2 (2015): 164-168.

Department of Electronics and Communication Engineering,SVCN,Nellore 103

Multi-Class Heart Health Classification From ECG Data With RFC

[48] Magnussen, Costan G., Olli T. Raitakari, Russell Thomson, Markus Juonala,
Dharmendrakumar A. Patel, Jorma SA Viikari, Jukka Marniemi et al. "Utility of
currently recommended pediatric dyslipidemia classifications in predicting
dyslipidemia in adulthood: evidence from the Childhood Determinants of Adult
Health (CDAH) study, Cardiovascular Risk in Young Finns Study, and Bogalusa
Heart Study." Circulation 117, no. 1 (2008): 32-42.
[49] Bahrami, Boshra, and Mirsaeid Hosseini Shirvani. "Prediction and diagnosis of heart
disease by data mining techniques." Journal of Multidisciplinary Engineering
Science and Technology (JMEST) 2, no. 2 (2015): 164-168.
[50] Magnussen, Costan G., Olli T. Raitakari, Russell Thomson, Markus Juonala,
Dharmendrakumar A. Patel, Jorma SA Viikari, Jukka Marniemi et al. "Utility of
currently recommended pediatric dyslipidemia classifications in predicting
dyslipidemia in adulthood: evidence from the Childhood Determinants of Adult
Health (CDAH) study, Cardiovascular Risk in Young Finns Study, and Bogalusa
Heart Study." Circulation 117, no. 1 (2008): 32-42.
[51] Acharya, R., Ashwin Kumar, P. S. Bhat, C. M. Lim, S. S. Lyengar, N. Kannathal, and
ShankarM Krishnan. "Classification of cardiac abnormalities using heart rate
signals." Medical and Biological Engineering and Computing 42 (2004): 288-293.
[52] Dewan, Ankita, and Meghna Sharma. "Prediction of heart disease using a hybrid
technique in data mining classification." In 2015 2nd International Conference on
Computing for Sustainable Global Development (INDIACom), pp. 704-706. IEEE,
2015.
[53] Shuvo, Samiul Based, Shams Nafisa Ali, Soham Irtiza Swapnil, Mabrook S. Al-
Rakhami, and Abdu Gumaei. "CardioXNet: A novel lightweight deep learning
framework for cardiovascular disease classification using heart sound recordings."
ieee access 9 (2021): 36955-36967.
[54] Reddy, N. Satish Chandra, Song Shue Nee, Lim Zhi Min, and Chew Xin Ying.
"Classification and feature selection approaches by machine learning techniques:
Heart disease prediction." International Journal of Innovative Computing 9, no. 1
(2019).
[55] Woodward, Mark. "Small area statistics as markers for personal social status in the
Scottish heart health study." Journal of epidemiology and community health 50, no.
5 (1996): 570.
[56] Deng, Shi-Wen, and Ji-Qing Han. "Towards heart sound classification without
segmentation via autocorrelation feature and diffusion maps." Future Generation
Computer Systems 60 (2016): 13-21.

Department of Electronics and Communication Engineering,SVCN,Nellore 104

Multi-Class Heart Health Classification From ECG Data With RFC

[57] Singh, Jagdeep, Amit Kamra, and Harbhag Singh. "Prediction of heart diseases using
associative classification." In 2016 5th International conference on wireless networks
and embedded systems (WECON), pp. 1-7. IEEE, 2016.
[58] Malik, John, Yu-Lun Lo, and Hau-tieng Wu. "Sleep-wake classification via
quantifying heart rate variability by convolutional neural network." Physiological
measurement 39, no. 8 (2018): 085004.
[59] Fida, Benish, Muhammad Nazir, Nawazish Naveed, and Sheeraz Akram. "Heart
disease classification ensemble optimization using genetic algorithm." In 2011 IEEE
14th International Multitopic Conference, pp. 19-24. IEEE, 2011.
[60] Katarya, Rahul, and Sunit Kumar Meena. "Machine learning techniques for heart
disease prediction: a comparative study and analysis." Health and Technology 11
(2021): 87-97.
