Report
In this study, we present a novel graph-based methodology for an accurate classification of cardiac arrhythmia diseases using a single-lead electrocardiogram (ECG). The proposed approach employs the visibility graph technique to generate graphs from time signals. Subsequently, informative features are extracted from each graph and then fed into classifiers to match the input ECG signal with the appropriate target arrhythmia class. The six target classes in this study are normal (N), left bundle branch block (LBBB), right bundle branch block (RBBB), premature ventricular contraction (PVC), atrial premature contraction (A), and fusion (F) beats. Three classification models were explored, including graph convolutional neural network (GCN), multi-layer perceptron (MLP), and random forest (RF). ECG recordings from the MIT-BIH arrhythmia database were utilized to train and evaluate these classifiers. The results indicate that the multi-layer perceptron model attains the highest performance, showcasing an average accuracy of 99.02%. Following closely, the random forest achieves a strong performance as well, with an accuracy of 98.94% while providing critical intuitions.

Keywords: Arrhythmia classification; Electrocardiogram (ECG); Graph convolutional neural network (GCN); Multi-layer perceptron (MLP); Random forest (RF); Visibility graph (VG)
1. Introduction

Cardiovascular diseases (CVDs), accounting for approximately 17.9 million deaths annually, are the leading cause of global mortality. Reducing the incidence of premature fatalities hinges on the accurate identification of individuals at the highest risk and ensuring their prompt access to appropriate treatment. Among the various manifestations of CVDs, arrhythmia, which denotes abnormal changes in the heart pulse rate, emerges as a common occurrence. A vital initial step in planning effective treatment for patients with cardiac problems is the establishment of a precise and automated arrhythmia diagnosis model. This pivotal tool plays a vital role in enhancing the accuracy of identification and expediting access to timely interventions, contributing to the overall goal of reducing the impact of cardiovascular diseases on global health.

The electrocardiogram (ECG) is a non-invasive medical diagnostic tool, providing essential insights into the heart’s electrical activity, rhythm, and condition (Kligfield et al., 2007). It is widely employed by cardiologists as a primary assessment tool for evaluating patients. Traditionally, cardiologists invested significant time in manually reviewing electrocardiogram recordings. The integration of automated arrhythmia detection models can streamline this process, resulting in enhanced efficacy and accuracy in cardiac assessments. Usually, the 12-lead ECG obtained from 10 electrodes is used for a routine study of the electrical activity of the heart. These leads include three bipolar limb leads (I, II, III), three augmented unipolar limb leads (aVR, aVL, aVF), along with six precordial leads (V1–V6). From bipolar leads, the second lead is the most widely used as it generally provides more information on the important waves of a heartbeat (Lai, Bu, Su, Zhang, & Ma, 2020).

In recent years, the utilization of machine learning algorithms in analyzing diverse bio-signals has gained significant traction, demonstrating superior performance compared to previous methods (Ebrahimi, Loni, Daneshtalab, & Gharehbaghi, 2020; EPMoghaddam, Banta, Post, Razavi, & Aazhang, 2023; EPMoghaddam, Banta, et al., 2023a; EPMoghaddam, Sheth, Haneef, Gavvala, & Aazhang, 2022; Nasiri, Naghibzadeh, Yazdi, & Naghibzadeh, 2009; Wang, Chiang, Hsu, & Yang, 2013). Notably, numerous studies have concentrated on arrhythmia classification, with a discernible trend emphasizing the integration of neural network models, particularly convolutional neural networks (CNNs) and recurrent neural networks (RNNs), to augment the precision and efficacy of arrhythmia classification (Ebrahimi et al., 2020; EPMoghaddam, Muguli, & Aazhang, 2023b; Jun et al., 2018; Singh, Pandey, Pawar, & Janghel, 2018).

As an example, the study by Sahoo et al. introduces an algorithm designed for the detection of QRS complexes, which encompass three deflections on an ECG tracing: the Q wave, the R wave, and the S wave,
https://doi.org/10.1016/j.iswa.2024.200385
Received 28 February 2024; Received in revised form 11 April 2024; Accepted 30 April 2024
Available online 5 May 2024
2667-3053/© 2024 The Authors. Published by Elsevier Ltd. This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-
nc-nd/4.0/).
D. EPMoghaddam et al. Intelligent Systems with Applications 22 (2024) 200385
Table 1
Detailed comparison of several previous studies and their strengths and limitations.
Study | #Classes | Method | Strength | Limitations
Sahoo, Kanungo, Behera, and Sabut (2017) | 4 | Introduces an algorithm for QRS complex detection and feature extraction using multiresolution wavelet transform. | Intuitive features, Utilizing multiple classifiers, High accuracy in classification. | Limited number of arrhythmia classes, Lower number of samples, Limited discussion on dataset size and imbalance.
Khorrami and Moavenian (2010) | 5 | Compares continuous wavelet transform with two other techniques for arrhythmia classification. | Exploring the impact of employing two ECG input leads on classification performance in contrast to using a single lead. | Limited number of arrhythmia classes, Limited data size, Lack of discussion on dataset size and imbalance.
Houssein, Ibrahim, Neggaz, Hassaballah, and Wazery (2021) | 4 | Investigates the application of manta ray foraging optimization with SVM for arrhythmia classification. | High inter-patient accuracy, Comparison with several other metaheuristic algorithms. | Lack of discussion on the effect of data imbalance on the training.
Kim, Seo, Kim, Choi, and Park (2023) | 3 | Introduces a CNN architecture named WavelNet for arrhythmia classification. | Inter-patient paradigm, Addressing imbalance by incorporating class weights. | Model’s dependence on the choice of mother wavelets, Moderate performance.
Li et al. (2021) | 4 | Offers a visibility-graph based denoising approach followed by XGBoost to identify AF patients. | Novel denoising approach, Diverse dataset. | Focused only on AF, Lack of discussion on dataset imbalance.
Kutluana and Türker (2024) | 5 | Uses adjacency matrix of weighted visibility graphs for arrhythmia classification. | Exploring graph representation, Multi-label classification, Inter-patient paradigm. | Using 12-lead ECG as input, Moderate performance.
alongside the extraction of features using multiresolution wavelet transform (Sahoo et al., 2017). These extracted features are subsequently employed for classifying cardiac abnormalities, including normal (N), left bundle branch block (LBBB), right bundle branch block (RBBB), and paced beats (P). The classification is carried out using neural network (NN) and support vector machine (SVM) classifiers. The reported average accuracy is 96.67% for NN and 98.39% for SVM. Another study investigates the application of continuous wavelet transform in comparison to alternative data transformation techniques, specifically discrete wavelet transform and discrete cosine transform, for the task of arrhythmia classification (Khorrami & Moavenian, 2010). This study employs multi-layer perceptron (MLP) and SVM as classifiers. Notably, the study utilizes a dataset comprising only ninety beats for each arrhythmia, a limited set employed for training, testing, and validation purposes. A more recent study in Houssein et al. (2021) explores the application of manta ray foraging optimization with SVM for the efficient classification of ECG arrhythmias. This approach incorporates a diverse set of features, including one-dimensional local binary pattern, wavelet, higher-order statistical, and morphological information. The integration of these features into the proposed framework achieves an inter-patient average accuracy of 98.26%.

In the study by Jangra et al., a novel CNN model is proposed for arrhythmia classification, achieving an accuracy of 99.48% in distinguishing ventricular ectopic beats and supraventricular ectopic beats (Jangra, Dhull, Singh, Singh, & Cheng, 2023). Similarly, the work by Kim et al. (2023) introduces another CNN architecture named WavelNet for the same task. In their three-class classification, they attain sensitivity rates of 91.4%, 49.3%, and 91.4% for non-ectopic, supraventricular ectopic, and ventricular ectopic beat classifications, respectively. In the study presented in Madan et al. (2022), a hybrid model named 2D-CNN-LSTM is introduced, which combines convolutional neural network and long short-term memory (LSTM) network architectures. Initially, ECG signals are translated into scalogram images before being input into the model. The researchers utilize heartbeats from three databases and report accuracy rates of 98.7%, 99%, and 99% for the three-class classification of cardiac arrhythmias, congestive heart failure, and normal sinus rhythm, respectively.

Moreover, the visibility graph (VG) technique has been utilized in several studies for the arrhythmia classification task. For instance, in Li et al. (2021), weighted multi-scale limited penetrability visibility graph features were extracted, and an XGBoost classifier was employed to effectively classify atrial fibrillation (AF). Similarly, another study utilized the adjacency matrix of weighted visibility graphs or sequences of node weights as features for the classification of cardiac disorders using ECG recordings from the PTB-XL dataset (Kutluana & Türker, 2024). Table 1 summarizes several studies’ strengths and limitations. As evident, there is still room for further investigation into a broader spectrum of arrhythmia classes and enhancing classification performance. Additionally, addressing the dataset’s limitations, especially regarding data imbalance, is imperative as it can significantly affect learning outcomes. In this work, we tackle these gaps by examining a wider array of arrhythmia classes while concurrently enhancing classification accuracy. Moreover, we will discuss the dataset’s constraints and propose viable strategies to mitigate any associated limitations.

In this study, we introduce and assess a novel methodology for performing the arrhythmia classification task by leveraging visibility graphs in conjunction with three classifiers: graph convolutional neural network (GCN), multi-layer perceptron, and random forest (RF). The paper is structured as follows: Section 2 delves into the data and provides a comprehensive methodology overview, while Section 3 presents the experimental results. Ultimately, Sections 4 and 5 discuss the findings and conclude the paper.

2. Material and methods

2.1. Data

The MIT-BIH open-source arrhythmia database from the PhysioNet Forum stands out as the most widely utilized dataset for the arrhythmia classification task (Goldberger et al., 2000; Xiao et al., 2023). In line with evaluating the presented model, we employed the MIT-BIH dataset, which encompasses 48 half-hour ECG recordings obtained from 47 distinct patients at the BIH laboratory. This dataset features 2-channel ECG recordings with a sampling rate of 360 Hz, offering a diverse range of heartbeats. For the specific focus of this work, we only utilize the first ECG lead, namely modified limb lead II (MLII). We narrowed down our analysis to data from six classes: N, LBBB, RBBB, PVC (premature ventricular contraction), A (atrial premature contraction), and F (fusion beats). Fig. 1A illustrates the distribution of distinct types of heartbeats for each patient. Notably, recordings 102, 104, 207, and 217 were excluded from our analysis due to poor signal quality (Mark, 1987; Sellami & Hwang, 2019). Fig. 1B displays the proportion of various classes within the final training and testing sets.
Fig. 1. (A) Distribution of different heartbeat types for individuals in MIT-BIH database. (B) Data distribution in both training and testing sets. The dataset exhibits a significant imbalance, with the majority of instances belonging to the normal heartbeat class. This leads to challenges in training a model that adequately captures patterns in the minority classes.

2.2. Preprocessing

The first stage in the proposed methodology is preprocessing, focusing on noise removal and segmenting ECG recordings into individual heartbeats (Kher et al., 2019). Noise removal involves applying a notch filter to eliminate 60 Hz power line noise, followed by wavelet denoising using biorthogonal wavelets to effectively remove high-frequency noise (Mallat, 1999). Additionally, baseline wandering is addressed through the application of successive median filters.

Following the noise removal, ECG signals undergo a subsequent partitioning into individual heartbeats. In Fig. 2A, two ECG heartbeats and their essential components are displayed. A crucial consideration in this process was to ensure a uniform segment length, coupled with the centering of the R peaks across different segments. The goal was to have the same number of time samples in each heartbeat segment, ensuring that the peaks were consistently represented at roughly the same time across all segments. Opting for windows spanning 300 time points, equivalent to 0.83 s, was a deliberate choice to encompass the entire temporal scope of a beat, ensuring a comprehensive analysis that captures all critical components of a heartbeat. To accomplish this, as shown in Fig. 2B, for each heartbeat, the R peaks of the preceding and subsequent heartbeats are identified, and RR intervals are calculated. If, on each side, half of the RR interval is less than 150 time points, we apply padding to ensure a window of 300 centered around the R peak.

Fig. 2. (A) Two heartbeat samples and their components. The plot on the right illustrates a normal heartbeat, while the one on the left represents a right bundle branch block heartbeat. (B) An example of an ECG signal before and after noise removal is shown. Note that black circles indicate R peaks. After noise removal, the signal is partitioned into individual heartbeat samples. Each sample has a fixed length of 300 time points. To achieve this, the R peaks of the preceding and subsequent heartbeats are identified, and RR intervals are calculated. On each side of the main heartbeat, if half of the RR interval is less than 150 time points, we apply padding to ensure a window of 300 centered around the R peak.

Following data segmentation, a down-sampling procedure was employed due to the substantial number of time points within each heartbeat. This down-sampling aimed to reduce the dimensionality of each segment, enhancing the efficiency of both graph generation and graph feature extraction processes. To achieve this, a sliding window approach was applied across the time-series, where the average of the amplitude values within each window was computed. Specifically, a sliding window of 10 time points was used, with a 20% overlap between consecutive windows. The sliding window duration strikes a balance, being large enough to reduce complexity yet small enough to ensure that the final down-sampled segment, consisting of 38 time points, retains sufficient characteristics of the raw heartbeat without substantial distortion.

2.3. Visibility graph generation

The visibility graph is a technique used to transform complex time-series data into a network (Lacasa, Luque, Ballesteros, Luque, & Nuno, 2008; Stephen, Gu, & Yang, 2015). Fundamentally, each time point serves as a node in the graph, and the presence of an edge between two nodes signifies the intervisibility of the corresponding time points. In this study, we employ two variations of the visibility graph, specifically natural visibility graphs (NVG) and horizontal visibility graphs (HVG), as detailed in Lacasa et al. (2008), Luque, Lacasa, Ballesteros, and Luque (2009).

In the context of a natural visibility graph, two arbitrary points $(t_a, y_a)$ and $(t_b, y_b)$ possess visibility and are consequently connected if, for any other data point $(t_c, y_c)$ situated between them, the following condition is satisfied:

$$y_c < y_b + (y_a - y_b)\,\frac{t_c - t_b}{t_a - t_b}. \tag{1}$$
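As an illustration only, the down-sampling step of Section 2.2 and the natural visibility criterion of Eq. (1) can be sketched in plain Python. The function names, the toy segment, and the handling of the final windows are assumptions made for this sketch and are not taken from the authors' implementation. In particular, truncating (rather than dropping) windows that run past the end of the segment is one choice that reproduces the 38 points reported for a 300-point segment.

```python
def downsample(segment, window=10, overlap=0.2):
    """Average the amplitudes inside overlapping sliding windows."""
    stride = round(window * (1 - overlap))  # 8 samples for a 10-sample window
    # Assumption: a window that runs past the end of the segment is truncated,
    # not dropped; this choice yields 38 points from a 300-point segment.
    windows = (segment[s:s + window] for s in range(0, len(segment), stride))
    return [sum(w) / len(w) for w in windows]


def natural_visibility_edges(y, t=None):
    """Return the set of NVG edges (a, b), a < b, for the series y."""
    if t is None:
        t = list(range(len(y)))  # equally spaced time points
    edges = set()
    for a in range(len(y)):
        for b in range(a + 1, len(y)):
            # Eq. (1): every intermediate point must lie strictly below the
            # straight line joining (t_a, y_a) and (t_b, y_b).
            if all(
                y[c] < y[b] + (y[a] - y[b]) * (t[c] - t[b]) / (t[a] - t[b])
                for c in range(a + 1, b)
            ):
                edges.add((a, b))
    return edges


# Hypothetical placeholder segment: flat signal with a toy "R peak" in the centre.
segment = [0.0] * 300
segment[150] = 1.0

nodes = downsample(segment)              # one value per graph node
edges = natural_visibility_edges(nodes)  # neighbouring nodes are always connected
```

The brute-force visibility check is cubic in the number of nodes, which is inexpensive for graphs of this size; faster divide-and-conquer constructions exist but are not needed for a sketch.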
Fig. 5. Three classifiers used for arrhythmia classification task. (A) Graph convolutional network. The GCN processes natural visibility graphs as input, each comprising 38 nodes. Each node is represented by a feature vector of size 1 × 22. The GCN then produces output corresponding to one of the six arrhythmia classes. (B) Random forest. Please note that DT stands for decision tree. The features of graph nodes are concatenated into a single vector, which is then inputted into the classifier. (C) Multi-layer perceptron.
Table 2
GCN performance. Please note that for precision, recall, and F1 columns, the first value represents the macro average, while the second one represents the
weighted average. Learning rate is 0.001 and batch size is 64.
#Hidden layers #Neurons Class weight Accuracy% Precision% Recall% F1 %
3 100 – 97.59 91.86, 97.55 87.72, 97.59 89.62, 97.52
3 150 – 97.89 93.45, 97.83 87.61, 97.89 90.19, 97.83
5 100 – 98.12 94.71, 98.07 89.07, 98.12 91.58, 98.07
5 150 – 98.13 93.16, 98.12 90.45, 98.12 91.59, 98.10
5 150 [0.5, 1, 1, 1, 1, 1] 98.03 98.95, 97.97 89.35, 98.03 91.43, 97.98
7 100 – 98.20 94.05, 98.17 90.60, 98.20 92.14, 98.18
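Tables 2 to 4 report each of precision, recall, and F1 as a macro average followed by a weighted average. The following sketch, written in plain Python with hypothetical toy labels, shows how the two averages differ on imbalanced data: the macro average gives every class the same weight, whereas the weighted average scales each class by its number of true samples. The function names are illustrative and not taken from the paper.

```python
from collections import Counter


def per_class_scores(y_true, y_pred):
    """Precision, recall, and F1 for each class, treated one-vs-rest."""
    scores = {}
    for k in sorted(set(y_true) | set(y_pred)):
        tp = sum(t == k and p == k for t, p in zip(y_true, y_pred))
        fp = sum(t != k and p == k for t, p in zip(y_true, y_pred))
        fn = sum(t == k and p != k for t, p in zip(y_true, y_pred))
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f1 = tp / (tp + 0.5 * (fp + fn)) if tp + fp + fn else 0.0
        scores[k] = (precision, recall, f1)
    return scores


def macro_and_weighted(y_true, y_pred):
    """Return ((P, R, F1) macro-averaged, (P, R, F1) support-weighted)."""
    scores = per_class_scores(y_true, y_pred)
    support = Counter(y_true)  # number of true samples per class
    macro = tuple(sum(s[i] for s in scores.values()) / len(scores) for i in range(3))
    weighted = tuple(
        sum(scores[k][i] * support[k] for k in scores) / len(y_true) for i in range(3)
    )
    return macro, weighted


# Hypothetical toy example: a classifier that labels every beat as normal (N).
y_true = ["N"] * 9 + ["F"]
y_pred = ["N"] * 10
macro, weighted = macro_and_weighted(y_true, y_pred)
```

In this toy case the weighted recall coincides with the overall accuracy, while the macro recall is halved by the completely missed minority class, which is why the macro columns are the more informative ones under class imbalance.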
Table 3
RF performance with different parameters. Please note that for precision, recall, and F1 columns, the first value represents the macro average, while the second one represents the
weighted average.
#Estimators Max depth Min samples split Class weight Accuracy% Precision% Recall% F1 %
20 – 2 – 98.39 97.75, 98.38 87.28, 98.39 91.61, 98.31
50 – 2 [0.1, 1, 1, 1.5, 3, 10] 98.32 98.38, 98.33 87.23, 98.32 91.89, 98.24
100 – 5 – 98.52 98.15, 98.51 87.80, 98.52 92.09, 98.44
100 10 5 [1, 1, 1, 1, 5, 20] 97.95 95.74, 97.94 89.25, 97.95 92.31, 97.90
150 – 5 – 98.51 98.16, 98.51 87.95, 98.51 92.21, 98.43
150 10 2 – 97.82 97.92, 97.84 82.89, 97.82 88.45, 97.67
200 – 2 – 98.55 97.99, 98.55 88.45, 98.56 92.50, 98.49
200 – 10 [1, 1, 1, 1, 5, 20] 98.52 97.82, 98.51 88.93, 98.52 92.81, 98.46
300 – 2 – 98.56 97.97, 98.55 88.17, 98.56 92.28, 98.48
300 – 10 – 98.48 98.24, 98.48 87.73, 98.48 92.08, 98.41
400 – – – 98.94 98.15, 98.93 91.70, 98.94 94.60, 98.91
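For orientation, the configuration in the last row of Table 3 (400 trees, all other parameters left at their scikit-learn defaults) can be set up as sketched below. The synthetic data from `make_classification` is a hypothetical stand-in for the feature vectors extracted from visibility graphs, and the fixed `random_state` values are assumptions added for reproducibility; neither comes from the paper, so the scores this sketch produces are not comparable to those in the table.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score, precision_recall_fscore_support
from sklearn.model_selection import train_test_split

# Hypothetical stand-in for graph-derived feature vectors with six classes.
X, y = make_classification(
    n_samples=600, n_features=20, n_informative=8,
    n_classes=6, n_clusters_per_class=1, random_state=0,
)

# Random split into two non-overlapping halves (training and testing).
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.5, random_state=0
)

# 400 estimators; max_depth, min_samples_split and class_weight stay at defaults.
clf = RandomForestClassifier(n_estimators=400, random_state=0)
clf.fit(X_train, y_train)
y_pred = clf.predict(X_test)

accuracy = accuracy_score(y_test, y_pred)
macro_p, macro_r, macro_f1, _ = precision_recall_fscore_support(
    y_test, y_pred, average="macro", zero_division=0
)
weighted_p, weighted_r, weighted_f1, _ = precision_recall_fscore_support(
    y_test, y_pred, average="weighted", zero_division=0
)
```

To try the class-weighted rows of the table, `class_weight` can be passed as a dictionary mapping each class label to its weight; the order in which the listed weights map to the six classes is not stated in this section and would have to be assumed.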
Following the completion of this stage, we proceed with the classification process. Algorithm 2 provides a comprehensive description of our classification approach, which utilizes the processed visibility graphs to predict arrhythmia classes.

Algorithm 2 Classification
Input: Visibility Graphs, Classifier
Output: Predicted Arrhythmia Classes
1: VG_train, VG_test ← Divide visibility graphs into two sets
2: Y_train ← Extract arrhythmia labels for VG_train
3: Y_test ← Extract arrhythmia labels for VG_test
4: if Classifier is GCN then
5:     X_train ← VG_train
6:     X_test ← VG_test
7: else
8:     X_train ← Extract feature vectors from VG_train
9:     X_test ← Extract feature vectors from VG_test
10: end if
11: Train the Classifier on (X_train, Y_train)
12: Y_pred ← Predict labels for X_test
13: Evaluate the Classifier by comparing Y_pred and Y_test

3. Results

3.1. Performance measures

We investigate four widely used metrics, namely accuracy, precision, recall, and F1-score, to assess the performance of the proposed model. These metrics are defined as follows:

$$\mathrm{Accuracy} = \frac{tp + tn}{tp + fp + tn + fn} \tag{4}$$

$$\mathrm{Precision} = \frac{tp}{tp + fp} \tag{5}$$

$$\mathrm{Recall} = \frac{tp}{tp + fn} \tag{6}$$

$$F1 = \frac{tp}{tp + \frac{1}{2}(fp + fn)} \tag{7}$$

Here $tp$, $fp$, $tn$, $fn$ represent true positives, false positives, true negatives, and false negatives, respectively.

In many cases, solely examining one of these metrics can be misleading and may not furnish sufficient information for comprehensive assessments. For instance, in scenarios characterized by extreme class imbalance, a naive classifier might label every sample as the majority class, yielding a seemingly high accuracy. However, such a classifier is inherently undesirable. By incorporating the four aforementioned metrics, we ensure a thorough evaluation of the model’s performance.

3.2. Results

We employ an intra-patient approach by consolidating samples from all patients. Following this amalgamation, we randomly divide the samples into two non-overlapping sets, one for training and the other for testing, with each set comprising 46,497 samples. The presented results are obtained by evaluating the performance of the trained models on the testing data.

In our exploration of classifiers, we assess three models: GCN, MLP, and RF. Each model undergoes an examination of a diverse range of parameters, including batch size, learning rate, the number of layers, the number of neurons in hidden layers, and more. Notably, for learning rate, values of 0.0001, 0.001, and 0.01 were explored, with 0.001 selected. These selections were made based on commonly used values for learning rates in various studies. Regarding batch size, we tested 32, 64, 128, and 256, finding that 32 and 64 achieved the highest performance.

For GCN models, we experimented with varying the number of layers from three to seven. Initially, we started with smaller models with fewer neurons and observed how incorporating additional layers and neurons impacted the performance. Table 2 presents GCN performance in several scenarios. Through our experimentation with three different layer configurations (3, 5, 7), we observed that employing seven layers resulted in a slightly better outcome, achieving an accuracy of 98.20%.

Table 3 showcases the performance of RF across various scenarios. Several choices for the number of trees in the forest, maximum tree depth, and minimum samples required to split an internal node are explored. In Table 3, unless specified otherwise, the parameter values are set to the defaults of the RandomForestClassifier function in sklearn. We also explored the addition of class weights to address
Table 4
MLP performance. Please note that for precision, recall, and F1 columns, the first value represents the macro average, while the second one
represents the weighted average. Learning rate is 0.001 and batch size is 32.
#Hidden layers #Neurons Accuracy% Precision% Recall% F1 %
2 64, 32 98.67 95.55, 98.64 91.01, 98.67 93.05, 98.64
3 64, 32, 16 98.62 95.82, 98.59 90.79, 98.64 93.09, 98.59
3 128, 64, 32 99.02 95.92, 99.00 94.07, 99.02 94.97, 99.01
4 128, 64, 32, 16 98.78 95.96, 98.75 92.52, 98.78 94.14, 98.76
5 512, 256, 128, 64, 64 98.74 96.06, 98.71 92.02, 98.74 93.91, 98.71
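As a rough sketch, the best-performing row of Table 4 (three hidden layers of 128, 64, and 32 neurons, learning rate 0.001, batch size 32) could be expressed with scikit-learn's `MLPClassifier`. The paper does not state which framework, activation function, or optimizer was used for the MLP, so the scikit-learn defaults here (ReLU and Adam), the synthetic stand-in data, the iteration cap, and the random seeds are all assumptions of this example.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier

# Hypothetical stand-in for graph-derived feature vectors with six classes.
X, y = make_classification(
    n_samples=600, n_features=20, n_informative=8,
    n_classes=6, n_clusters_per_class=1, random_state=0,
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.5, random_state=0
)

mlp = MLPClassifier(
    hidden_layer_sizes=(128, 64, 32),  # three hidden layers, as in Table 4
    learning_rate_init=0.001,          # learning rate from the table caption
    batch_size=32,                     # batch size from the table caption
    max_iter=50,                       # assumption: short run to keep the sketch fast
    random_state=0,
)
mlp.fit(X_train, y_train)
mlp_pred = mlp.predict(X_test)
mlp_accuracy = mlp.score(X_test, y_test)
```

With so few iterations scikit-learn may warn that the optimizer has not converged; for a real experiment the iteration budget and an early-stopping criterion would need to be chosen deliberately.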
Fig. 7. Accuracy and loss curves for three classifiers. MLP1 is an MLP with three layers (64, 32, 16), GCN1 is a 5-layer GCN with 150 neurons, and GCN2 is a 5-layer GCN with
100 neurons. (A) Accuracy plot. (B) Loss plot.
Table 5
Performance comparison between the proposed work and previous studies using the MIT-BIH database. All studies follow the intra-patient paradigm.
Study ECG beat classes #Records #ECG samples Method Accuracy%
Yu and Chen (2007) N, L, R, V, A, P 23 23,200 Probabilistic Neural Network 99.65
Li, Zhang, Zhang, and Wei (2017) N, L, R, V, A 44 26,400 1D-CNN 97.50
Oh, Ng, San Tan, and Acharya (2018) N, L, R, V, A – 16,499 CNN+LSTM 98.10
Yildirim (2018) N, L, R, V, P – 7,326 ULSTM and BLSTM 99.39
Shi et al. (2021) N, L, R, V, A 23 9,943 1D-CNN 99.59
Wu, Lu, Yang, and Wong (2021) N, L, R, V, A 16 32,422 1D-CNN 97.20
Midani, Ouarda, and Ayed (2023) N, L, R, V, A 48 100,062 DeepArr 99.46
Kumar, Mallik, Kumar, Del Ser, and Yang (2023) N, S, V, F, Qa 44 87,554 Fuzz-ClustNet 98.66
Pandey et al. (2023) N, L, R, V, A – 25,000 1D-CNN 99.40
Proposed Work N, L, R, V, A, F 44 92,994 VG+MLP 99.02
a: AAMI recommended arrhythmia classes.
classes, whereas our case involves six classes, introducing additional complexity to the classification task.

A statistical method for comparing the performance of classifiers involves hypothesis testing. The p-value is a statistical measure that quantifies the strength of evidence against the null hypothesis and represents the probability of observing results at least as extreme as those obtained if the null hypothesis were true. A small p-value indicates that the observed data is unlikely to occur if the null hypothesis were true, leading to the rejection of the null hypothesis in favor of the alternative hypothesis. Conversely, a large p-value suggests that the observed data is not unusual under the null hypothesis, providing insufficient evidence to reject the null hypothesis. In simpler terms, the p-value helps determine the likelihood of obtaining the observed results purely by chance, allowing researchers to assess the statistical significance of their findings. A common significance level for hypothesis testing is set at 0.05. In this framework, the null hypothesis posits that two classifiers perform equally well. When comparing our classifier to the one studied in Pandey et al. (2023), the obtained p-value is 0.61. This value, calculated based on accuracy, precision, recall, and F1 values, indicates that there is no compelling evidence to reject the null hypothesis. Hence, statistically speaking, there is no significant difference between the classifiers. However, it is worth noting that our study encompasses a larger sample size and a broader range of arrhythmia classes. Furthermore, compared to Wu et al. (2021) and Kumar et al. (2023), p-values of 0.08 and 0.18 were obtained, respectively, indicating similar outcomes. This comparison supports our study, as it performs similarly on a more challenging task (six-class classification) with a larger sample size.

One of the challenges we encountered in this study was the data imbalance, particularly concerning atrial premature contraction and fusion beats, as shown in Fig. 1B. Our exploration of incorporating sample weights to heighten the penalty for misclassifying these classes proved insufficient, signaling the need for additional measures to address this issue effectively. Consequently, we are considering the prospect of oversampling the minority class and undersampling the majority class as a viable solution to mitigate this challenge.

A limitation of this study is its intra-patient focus. Enhancing the reliability and generalizability of our classification model requires the accumulation of a more extensive array of samples for each arrhythmia class. Furthermore, the inclusion of a broader spectrum of patients and adopting an inter-patient paradigm, where the trained model undergoes testing on a cohort of entirely new patients not represented in the training process, are essential steps. This strategic expansion ensures a comprehensive evaluation of the model’s performance across diverse patient populations, ultimately advancing its effectiveness and applicability. While the predominant approach in current studies leans towards the intra-patient paradigm, it is crucial to explore the inter-patient paradigm more extensively. The prevalence of the intra-patient paradigm in studies may be attributed to the challenges posed by the limited availability of samples for arrhythmic classes. The inadequacy of these samples poses a potential hindrance to the development of a comprehensive and generalized model. Exploring the inter-patient paradigm on a broader scale is integral to gaining a deeper understanding of arrhythmia classification dynamics and ensuring the model’s adaptability across diverse patient cohorts.

Our research underscores the efficacy of visibility graphs in analyzing time-series data and providing rich insights, particularly in bio-signal analysis, and aligns with previous investigations into VG applications (Kutluana & Türker, 2024; Li et al., 2021). By further validating the utility of visibility graphs in arrhythmia classification,
Luque, B., Lacasa, L., Ballesteros, F., & Luque, J. (2009). Horizontal visibility graphs: Exact results for random time series. Physical Review E, 80(4), Article 046103.
Madan, P., Singh, V., Singh, D. P., Diwakar, M., Pant, B., & Kishor, A. (2022). A hybrid deep learning approach for ECG-based arrhythmia classification. Bioengineering, 9(4), 152.
Mallat, S. (1999). A wavelet tour of signal processing. Elsevier.
Mark, R. (1987). AAMI-recommended practice: Testing and reporting performance results of ventricular arrhythmia detection algorithms. Association for the Advancement of Medical Instrumentation, Arrhythmia Monitoring Subcommittee, AAMI ECAR, 1987.
Midani, W., Ouarda, W., & Ayed, M. B. (2023). DeepArr: An investigative tool for arrhythmia detection using a contextual deep neural network from electrocardiograms (ECG) signals. Biomedical Signal Processing and Control, 85, Article 104954.
Nasiri, J. A., Naghibzadeh, M., Yazdi, H. S., & Naghibzadeh, B. (2009). ECG arrhythmia classification with support vector machines and genetic algorithm. In 2009 third UKsim European symposium on computer modeling and simulation (pp. 187–192). IEEE.
Oh, S. L., Ng, E. Y., San Tan, R., & Acharya, U. R. (2018). Automated diagnosis of arrhythmia using combination of CNN and LSTM techniques with variable length heart beats. Computers in Biology and Medicine, 102, 278–287.
Pandey, S. K., Shukla, A., Bhatia, S., Gadekallu, T. R., Kumar, A., Mashat, A., et al. (2023). Detection of arrhythmia heartbeats from ECG signal using wavelet transform-based CNN model. International Journal of Computational Intelligence Systems, 16(1), 80.
Sahoo, S., Kanungo, B., Behera, S., & Sabut, S. (2017). Multiresolution wavelet transform based feature extraction and ECG classification to detect cardiac abnormalities. Measurement, 108, 55–66.
Sellami, A., & Hwang, H. (2019). A robust deep convolutional neural network with batch-weighted loss for heartbeat classification. Expert Systems with Applications, 122, 75–84.
Shi, Z., Yin, Z., Ren, X., Liu, H., Chen, J., Hei, X., et al. (2021). Arrhythmia classification using deep residual neural networks. Journal of Mechanics in Medicine and Biology, 21(10), Article 2140067.
Singh, S., Pandey, S. K., Pawar, U., & Janghel, R. R. (2018). Classification of ECG arrhythmia using recurrent neural networks. Procedia Computer Science, 132, 1290–1297.
Stephen, M., Gu, C., & Yang, H. (2015). Visibility graph based time series analysis. PLoS One, 10(11), Article e0143015.
Wang, J.-S., Chiang, W.-C., Hsu, Y.-L., & Yang, Y.-T. C. (2013). ECG arrhythmia classification using a probabilistic neural network with a feature reduction method. Neurocomputing, 116, 38–45.
Wu, M., Lu, Y., Yang, W., & Wong, S. Y. (2021). A study on arrhythmia via ECG signal classification using the convolutional neural network. Frontiers in Computational Neuroscience, 14, Article 564015.
Xiao, Q., Lee, K., Mokhtar, S. A., Ismail, I., Pauzi, A. L. b. M., Zhang, Q., et al. (2023). Deep learning-based ECG arrhythmia classification: A systematic review. Applied Sciences, 13(8), 4964.
Yildirim, Ö. (2018). A novel wavelet sequence based on deep bidirectional LSTM network model for ECG signal classification. Computers in Biology and Medicine, 96, 189–202.
Yu, S.-N., & Chen, Y.-H. (2007). Electrocardiogram beat classification based on wavelet transformation and probabilistic neural network. Pattern Recognition Letters, 28(10), 1142–1150.
Zhang, H., Liu, C., Zhang, Z., Xing, Y., Liu, X., Dong, R., et al. (2021). Recurrence plot-based approach for cardiac arrhythmia classification using inception-ResNet-v2. Frontiers in Physiology, 12, Article 648950.
Zhao, X., Liu, Z., Han, L., & Peng, S. (2022). Ecgnn: Enhancing abnormal recognition in 12-lead ecg with graph neural network. In 2022 IEEE international conference on bioinformatics and biomedicine (pp. 1411–1416). IEEE.