Article in Multimedia Tools and Applications · January 2024


DOI: 10.1007/s11042-024-18149-4



Multimedia Tools and Applications
https://doi.org/10.1007/s11042-024-18149-4

Automated diabetic retinopathy screening using deep learning

Sarra Guefrachi1 · Amira Echtioui2 · Habib Hamam1,3,4,5,6

Received: 13 May 2023 / Revised: 12 September 2023 / Accepted: 3 January 2024


© The Author(s), under exclusive licence to Springer Science+Business Media, LLC, part of Springer Nature 2024

Abstract
This research proposes a new method for identifying diabetic retinopathy from retinal
fundus images. Identifying diabetic retinopathy from digital fundus images remains a
challenging task in medical image processing, and manual analysis of the retinal fundus is
time-consuming and requires considerable skill. To assist clinicians, this research develops
a graphical user interface that integrates imaging algorithms to assess whether a patient’s
fundus image shows signs of diabetic retinopathy. The diagnosis is made by a deep neural
network, ResNet152-V2, which reached 100% on all four evaluation criteria (accuracy,
recall, precision, and F1 score) in the reported experiments. The severity of the disease is
displayed on the graphical user interface, and the patient’s information is stored in a local
database. Ophthalmologists can also use the proposed method as a backup option to
support disease detection and reduce the necessary processing time.

Keywords Computer aided diagnostic system · CNN · Deep learning · Multi-classification · Diabetic retinopathy

* Amira Echtioui
  echtiouiamira@yahoo.fr

1 Faculty of Engineering, Uni de Moncton, Moncton, NB E1A3E9, Canada
2 Advanced Technologies for Medicine and Signal Laboratory ‘ATMS’, National Engineering School of Sfax (ENIS), Sfax University, Sfax, Tunisia
3 College of Computer Science and Engineering, University of Ha’il, Ha’il 55476, Saudi Arabia
4 International Institute of Technology and Management (IITG), Avenue des Grandes Ecoles, P.O. Box 1989, Libreville, Gabon
5 Spectrum of Knowledge Production & Skills Development, P.O. Box 3027, Sfax, Tunisia
6 Department of Electrical and Electronic Engineering Science, School of Electrical Engineering, University of Johannesburg, Johannesburg 2006, South Africa


1 Introduction

The human eye consists of various vital components, which include the iris, cornea, pupil,
lens, vitreous, macula, retina, and optic nerve. The cornea is positioned at the front of the
eye, playing a key role in allowing light to enter the eye. The iris, along with its adjustable
opening called the pupil, controls the amount of light that enters. Meanwhile, the transpar-
ent lens focuses incoming light onto the retina, a light-sensitive tissue situated at the back
of the eye. The retina generates electrical signals, which are then transmitted to the brain
through the optic nerve (ON), establishing the ON as the critical connection between the
eye and the brain’s visual cortex. Finally, the vitreous substance occupies the central part
of the eye.
Expanding on the previous anatomical description, the retina comprises distinct regions
such as the macula, optic disc (OD), blood vessels (arteries and veins) (BV), and fovea.
The macula, a small portion within the retina surrounding the fovea, contains specialized
light-sensitive cells responsible for enabling clear and detailed vision. Given the vital role
of the retina in visual perception, our focus is directed toward a prevalent eye condition,
namely diabetic retinopathy (DR). Diabetic retinopathy (DR) is a significant and poten-
tially blinding complication associated with elevated blood glucose levels in individuals
with diabetes. This progressive condition has the capacity to harm the retina, sometimes
resulting in sudden vision loss. This ailment is particularly noteworthy as a leading cause
of blindness, especially in affluent nations, and it is directly linked to diabetes mellitus
[1–3]. The prevention of blindness or, at the very least, the mitigation of diabetic retin-
opathy’s progression into blindness hinges on the timely identification and treatment of
the condition. Consequently, comprehensive screening of diabetic patients is highly
recommended. Manual evaluation often produces dependable outcomes because of the
extensive experience and expertise involved, but it is time-consuming. Considerable effort
has therefore gone into building accurate computerized screening systems based on color
fundus images [4, 5]. Detecting diabetic retinopathy from digital image data is still a work
in progress, and alternative approaches are required. The next stage is to apply intelligent
diagnosis systems in addition to screening and artificial vision-based solutions.
Artificial intelligence approaches, particularly deep learning, are now used in practically
all aspects of modern life to reduce effort or cost while achieving better results. The
mathematical nature of our environment has enabled intelligent systems to adapt to many
difficulties [6–9]. Intelligent systems have also been applied in clinical settings [10–14],
where disease detection is a significant research interest [15–17]. In line with the theme of
this study, computer-assisted intelligent diagnosis of diabetes aims to reduce the human
effort required to provide a basic diagnosis and to distinguish diabetic from normal patients.
To reach a diagnosis, doctors often look for certain symptoms, and a similar scheme can be
followed by feeding the corresponding variables into machine learning models. Based on
these indicators, a machine learning-based program can be trained to determine whether a
person is diabetic or not. Numerous applications exist for this purpose, and the most recent
trend is to apply deep learning. Applying image processing to clinical imaging data is
likewise an essential route to diagnosis [18–21]. Several researchers have used transfer
learning models and convolutional neural networks (CNNs) to diagnose diseases such as
COVID-19 [22–24].


The goal of this research is to provide an alternative computer-aided detection technique
for identifying diabetic retinopathy from retinal fundus images. The work is motivated by
the prospect of improved diagnostic outcomes, and it contributes to the literature a
computer-assisted approach for diagnosing a severe disease that combines a graphical user
interface with deep learning.
The main contribution of this paper is the construction of two models, based on the
ResNet152-V2 and InceptionResnet-V2 architectures, for the analysis of retinal fundus
images. These models are integrated into a user-friendly graphical user interface (GUI).
The GUI is intended to streamline diabetic retinopathy (DR) detection and staging and to
improve their accuracy, making it a useful tool for clinical practitioners and researchers in
the field.
The paper is organized into six sections. The second section reviews related research on
the analysis and diagnosis of retinal eye disease. The third section details the data and
methodologies used in the study, including the data source, the data preparation steps, and
the training procedure. The fourth section presents the outcomes of the proposed strategy.
The fifth section discusses the results, highlighting the key findings and implications.
Finally, the sixth section concludes the study, summarizing the main points and outlining
plans for future work in this area.

2 Related work

Recently, several studies have explored the use of computer-aided diagnostic (CAD) sys-
tems to detect diabetic retinopathy (DR) using fundus images. For instance, in [25], the
authors examined the accuracy of a support vector machine (SVM) model in identifying
DR. They first applied a median filter and morphological procedures to the images and
then used grayscale co-occurrence matrices to extract statistical properties such as correla-
tion, frequency, contrast, and homogeneity. These features were then used as input for an
SVM classifier. The results showed that the system achieved an accuracy of 82.35%.
Acharya et al. [26] used texture features in conjunction with SVM classifier to automate
mass screenings for diabetic retinopathy. They categorized the images into four classes:
normal retina, macular edema, proliferative diabetic retinopathy, and non-proliferative dia-
betic retinopathy. The research focused on the analysis of 238 retinal fundus images, from
which five texture features—namely, correlation, homogeneity, long run emphasis, short
run emphasis, and run percentage—were extracted. Subsequently, these relevant features
were input into an SVM to enable automatic classification.
Another approach for diagnosing DR was proposed in [27], where the authors developed a
system that applies machine learning methods to color fundus images preprocessed with
image blurring. Deep learning techniques, such as CNNs [28] and transfer learning with
models like ResNet, VGG, and GoogleNet, have proven effective in categorizing diabetic
retinopathy (DR) [29–34]. The authors of [35] explored various SVM kernels coupled with
the Ant Colony System (ACS) feature selection method to detect DR. The cubic SVM
classifier outperformed the other kernels, achieving accuracies of 92.6%, 91.8%, and 93%
for 250, 550, and 750 attributes, respectively.


In [36], the authors proposed a model based on deep learning. They used the DenseNet
encoder and the convolutional attention unit block for DR severity sensing. The attention
block is employed to refine features, while the encoder is used to extract features from the
input images of the APTOS dataset. They achieved an 82% accuracy in multiclass classifi-
cation and a 97% accuracy in binary classification.
Bodapati et al. [37] developed a diabetic retinopathy (DR) model by employing transfer
learning and feature extraction techniques. They used Inception ResNetV2, VGG-16,
NASNet, and Xception to enhance feature representation and compared them with
handcrafted features for DR detection. The study also explored different approaches for
feature fusion and pooling, ultimately finding that the average-pooling simple fusion
approach with Deep Neural Networks (DNN) yielded the most effective performance. The
authors extracted features from DR images sourced from the APTOS 2019 Blindness
Detection dataset using Xception and VGG16. These features were then combined to obtain
the final feature representations, which were used to train the DNN.
Gurcan et al. [38] introduced an automated classification system for diabetic retinopathy
(DR), encompassing preprocessing, feature extraction, and classification stages, employ-
ing deep convolutional neural networks (CNNs) and machine learning (ML) techniques.
Their model extracted features from a pre-trained InceptionV3 model using transfer learn-
ing. They conducted a comprehensive comparison of various ML methods, including
Bagged Decision Trees, XGBoost, Random Forest, Extra Trees, Support Vector Machines,
Logistic Regression, and multilayer perceptron, with XGBoost demonstrating superior per-
formance. In their analysis, they used Grid search and calibration techniques, along with
meticulous preprocessing and fine-tuning. Notably, the model extracted generic descriptors
from the initial layers of InceptionV3 without performing layer-wise tuning, yet achieved
competitive classification accuracy when compared to ML methods. However, it is impor-
tant to note that these descriptors were derived from the early layers of the CNN, which
contain relatively low-level information in contrast to the higher layers. This choice of fea-
ture extraction from the lower layers may introduce inefficiencies in the DR detection pro-
cess and potentially hinder the recognition of critical features, consequently impacting the
model’s robustness.
In [32], the authors introduced a hybrid approach, which integrates the VGG16 archi-
tecture as the feature detector and employs the XGBoost algorithm as the classifier. This
strategic combination harnesses the respective advantages of deep learning architec-
ture and gradient boosting classification, with the goal of elevating the system’s overall
performance.
Ni et al. [39] proposed an approach to diabetic retinopathy diagnosis from retinal fundus
images that combines image processing and deep learning. The authors validated their
approach on 400 retinal fundus images sourced from the Messidor database. The reported
average values of the performance evaluation metrics are 97% accuracy, 94% sensitivity
(recall), 98% specificity, 94% precision, 94% F-score, and 95% GMean. These results
highlight the potential of the proposed hybrid solution for detecting diabetic retinopathy.
In [40], the authors proposed an approach for diabetic retinopathy (DR) classification
based on swarm-optimized deep neural networks. The method involves several stages,
starting with pre-processing of the input images, followed by DR feature segmentation
using a hybrid entropy model based on UNet and fuzzy C-means techniques. The final
classification step uses the Deep Stacked Encoder, optimized with a swarm algorithm. On
the DIARETDB0 dataset, the model achieved an accuracy of 95.9%, with a specificity of
96.80% and a sensitivity of 88.07%. On the DIARETDB1 dataset, it achieved an accuracy
of 95.48%, a specificity of 91.89%, and a sensitivity of 93.29%. These findings underscore
the potential of swarm-optimized deep neural networks for classifying diabetic retinopathy.

3 Methodology

This paper introduces a system comprising three primary steps to support Diabetic
Retinopathy (DR) diagnosis. First, the dataset is collected and prepared to ensure its
suitability for subsequent analysis. Second, a diagnostic Convolutional Neural Network
(CNN) determines the diagnosis for each retinal scan and classifies the DR severity level
on a scale from 0 (No DR) to 4. Last, a user-friendly Graphical User Interface (GUI) system
is developed using Tkinter [41] and HeidiSQL [42] to manage and preserve predictions
alongside patient IDs and names.
The GUI-based approach streamlines prediction handling and improves usability and
practicality. Each step is explained in the following sections. Figure 1 illustrates our
proposed system.

Fig. 1  Proposed system


3.1 Proposed system process

As a preliminary step, the input image is preprocessed in the first module to bring it to
the required resolution, which is achieved by resizing. A deep learning-based approach is
then employed to distinguish between the different diabetic retinopathy (DR) classes. The
final step evaluates the proposed approach using various performance indicators. The
subsequent sections explain each module in detail.
The proposed system takes fundus images, obtained during screening procedures, as its
primary input. The fundus is the interior surface of the back of the eye, which includes the
retina. The input is provided through a Graphical User Interface (GUI): users upload an
image by opening the interface and clicking the “Upload Image” button.
Image pre-processing is then applied to the uploaded image to make it suitable for further
analysis. The image is converted into the required format, sampled, and resized to the
necessary dimensions.
Subsequently, a pre-trained model grades the severity of diabetic retinopathy (DR) in the
image, using the features learned from the training images.

3.2 Data description

This section introduces data collection, image preprocessing and data augmentation.

3.2.1 Data collection

In this paper, we used the APTOS 2019 Blindness Detection dataset [43], which comprises
3662 retinal images captured under various lighting conditions. The dataset was sourced
from the Aravind Eye Hospital in India, and its images are categorized into five classes
representing increasing levels of diabetic retinopathy (DR) severity: Class 0 denotes the
absence of DR, Class 1 mild DR, Class 2 moderate DR, Class 3 severe DR, and Class 4
proliferative DR. The images are distributed approximately as follows: 1800 in the No-DR
category, 377 in the Mild category, 1000 in the Moderate category, 300 in the
Proliferative-DR category, and 195 in the Severe category.
In the process of dataset curation, we performed the removal of duplicate images
and also eliminated any images affected by noise or damage. As a result, the dataset
was reduced to a total of 2640 images, which were categorized as follows: 967 images
belonged to the No-DR class, 330 images represented Mild DR, 900 images were clas-
sified as Moderate DR, 270 images fell under Proliferative DR, and 173 images were
categorized as Severe DR. To provide a visual representation of these classes, Fig. 2
displays the diverse eye images corresponding to each category.
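
As a quick arithmetic check on the figures above, the per-class counts can be tabulated in Python. This is only an illustrative sketch: the variable and function names are hypothetical, and the numbers are copied from the text. The counts after curation add up to the stated 2640, whereas the counts quoted for the original dataset add up to 3672 rather than 3662, which suggests that the original per-class figures are rounded.

```python
# Per-class image counts as quoted in the text (names are illustrative).
original = {"No DR": 1800, "Mild": 377, "Moderate": 1000,
            "Severe": 195, "Proliferative DR": 300}
curated = {"No DR": 967, "Mild": 330, "Moderate": 900,
           "Severe": 173, "Proliferative DR": 270}


def share(counts):
    """Return each class's fraction of the dataset."""
    total = sum(counts.values())
    return {name: n / total for name, n in counts.items()}


print(sum(original.values()))  # 3672
print(sum(curated.values()))   # 2640
for name, frac in share(curated).items():
    print(f"{name:<17}{frac:6.1%}")
```

The No DR and Moderate classes together account for roughly 71% of the curated images, which is the imbalance that the augmentation step is meant to address.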


Fig. 2  Diabetic retinopathy classes

3.2.2 Image preprocessing and data augmentation

Because the images come from many sources and were captured under varying conditions,
they often contain artifacts that limit their diagnostic usefulness. An image preparation
phase is therefore needed to reduce this variance and standardize the images.
First, the pixel values of the images are scaled to the range between 0 and 1. The scaled
images are then resized to a uniform resolution of 224 × 224 pixels. This is achieved by
cropping the inner circle of the retinal image and fitting it into a square. Resizing matters
because it gives every image the fixed input dimensions that the networks expect.
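
The preparation steps described above can be sketched with NumPy as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names, the darkness threshold used to locate the retina, and the nearest-neighbour resizing are all hypothetical choices, and the scaling to [0, 1] is applied after resizing, which gives the same result for this kind of resize.

```python
import numpy as np


def crop_to_retina(img, threshold=10):
    """Crop to the bounding box of non-dark pixels, then centre-crop to a square.

    The threshold is an assumed value for separating the retina from the
    black background; it is not taken from the paper.
    """
    mask = img.max(axis=2) > threshold
    if not mask.any():
        return img
    rows = np.where(mask.any(axis=1))[0]
    cols = np.where(mask.any(axis=0))[0]
    img = img[rows[0]:rows[-1] + 1, cols[0]:cols[-1] + 1]
    h, w = img.shape[:2]
    side = min(h, w)
    top, left = (h - side) // 2, (w - side) // 2
    return img[top:top + side, left:left + side]


def resize_nearest(img, size=224):
    """Nearest-neighbour resize to size x size (a stand-in for a library resize)."""
    h, w = img.shape[:2]
    ys = np.arange(size) * h // size
    xs = np.arange(size) * w // size
    return img[ys][:, xs]


def preprocess(img, size=224):
    """Crop, resize to size x size, and scale 8-bit pixel values to [0, 1]."""
    img = resize_nearest(crop_to_retina(img), size)
    return img.astype(np.float32) / 255.0
```

In practice an imaging library with interpolated resizing would likely replace `resize_nearest`; the sketch only shows the order and effect of the steps.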
To make the processed dataset more broadly applicable, augmentation techniques are
applied: rotating images by angles of up to 360 degrees, flipping images both horizontally
and vertically, zooming images to 1.2 times their original size, and adjusting contrast to
compensate for lighting differences. In addition, all classes are balanced, which increases
the dataset from 2640 to 5000 samples. The dataset distribution is depicted in Fig. 3.
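
The augmentation and balancing described above might look like the following NumPy sketch. Several points are assumptions rather than details from the paper: rotations are limited to multiples of 90 degrees so that no imaging library is needed, the contrast gain range of 0.8 to 1.2 is arbitrary, each transform is applied with probability 0.5, and the balancing target of 1000 images per class is inferred only from the stated total of 5000 across five classes. All function names are hypothetical.

```python
import numpy as np


def zoom_center(img, factor=1.2):
    """Zoom in by cropping the centre and resizing back (nearest neighbour)."""
    h, w = img.shape[:2]
    ch, cw = int(round(h / factor)), int(round(w / factor))
    top, left = (h - ch) // 2, (w - cw) // 2
    crop = img[top:top + ch, left:left + cw]
    ys = np.arange(h) * ch // h
    xs = np.arange(w) * cw // w
    return crop[ys][:, xs]


def adjust_contrast(img, gain):
    """Scale deviations from the mean intensity; expects values in [0, 1]."""
    mean = img.mean()
    return np.clip((img - mean) * gain + mean, 0.0, 1.0)


def augment(img, rng):
    """Apply a random rotation, flips, zoom, and contrast change to one image."""
    img = np.rot90(img, k=int(rng.integers(0, 4)))  # 0, 90, 180 or 270 degrees
    if rng.random() < 0.5:
        img = img[:, ::-1]                          # horizontal flip
    if rng.random() < 0.5:
        img = img[::-1, :]                          # vertical flip
    if rng.random() < 0.5:
        img = zoom_center(img, 1.2)
    img = adjust_contrast(img, rng.uniform(0.8, 1.2))
    return np.ascontiguousarray(img, dtype=np.float32)


def copies_needed(counts, target):
    """Number of augmented images each class needs to reach the target size."""
    return {name: max(target - n, 0) for name, n in counts.items()}


curated = {"No DR": 967, "Mild": 330, "Moderate": 900,
           "Severe": 173, "Proliferative DR": 270}
extra = copies_needed(curated, target=1000)  # assumed target: 5000 / 5 classes
```

Under the assumed target, 2360 augmented images are needed in total, which together with the 2640 curated images gives the 5000 samples reported in the text.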

3.3 Algorithms used

The purpose of this step is to identify and categorize the severity degrees of Diabetic
Retinopathy (DR). Due to the enormous size of the database, we employ a new training
procedure that combines two distinct approaches: feature extraction and fine-tuning. We
selected two classification algorithms from the most known approaches.

Fig. 3  Dataset distribution

• ResNet152-V2: Residual networks (ResNet) [44] are a family of deep neural networks
with similar topologies but different depths. ResNet introduces a structure known as the
residual learning unit to mitigate the degradation problem of deep networks. The unit is
a feed-forward block with a bypass (skip) link that adds the block's input to its output.
Its main advantage is that classification accuracy can keep improving as the model
becomes deeper. ResNet152 was chosen for its superior accuracy within the ResNet
family.
• InceptionResnet-V2: This is a 164-layer CNN classifier trained on more than a million
ImageNet images. The Inception-Resnet-v2 [45] structure combines the Inception
architecture with residual connections. The Inception-Resnet module combines
convolutional filters of multiple sizes with residual connections. The residual
connections not only overcome the degradation problem caused by deep structures but
also reduce training time.
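
The idea behind the residual learning unit in the list above can be illustrated with a toy NumPy example. This is a deliberately simplified sketch that uses dense layers instead of convolutions and omits batch normalization; the function and parameter names are hypothetical. The unit computes `x + F(x)`, so when the residual branch `F` contributes nothing the unit reduces to the identity, which is why adding more such units does not have to degrade the network.

```python
import numpy as np


def relu(x):
    return np.maximum(x, 0.0)


def residual_unit(x, w1, w2):
    """Return x + F(x), where F is a small two-layer transformation."""
    fx = relu(x @ w1) @ w2  # residual branch F(x)
    return x + fx           # bypass (skip) connection adds the input back


rng = np.random.default_rng(0)
x = rng.normal(size=(4, 8))

# With zero weights the residual branch vanishes and the unit is the identity.
zeros = np.zeros((8, 8))
identity_out = residual_unit(x, zeros, zeros)

# With non-zero weights the unit learns a correction on top of its input.
w1 = rng.normal(scale=0.1, size=(8, 8))
w2 = rng.normal(scale=0.1, size=(8, 8))
out = residual_unit(x, w1, w2)
```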

Figure 4 illustrates the training process in detail.


Moreover, Resnet152-V2 and InceptionResnet-V2 are adopted for a novel training pro-
cess based on 2 steps:

Step 1 - Feature extraction

The top layer of each network is removed and replaced with a fully connected layer that has
a dropout rate of 40% and a final softmax activation function for classification, giving five
outputs that correspond to the five categories of DR. We use the representations learned by
the pre-trained network to extract meaningful features from the new data. To exploit the
previously learned feature maps for our dataset, we build a new classifier on top of the
pre-trained model and train it from scratch. During this stage, the convolutional base is
frozen and used as a feature extractor, so only the added classifier is trained; it is not
necessary to (re)train the entire model. To reduce the number of parameters, the additional
classifier consists of three layers, the first being an average pooling layer with a 5 × 5
window.


Fig. 4  Training process

Step 2 - Fine-tuning

In the feature extraction step, only the few layers added on top of the base model are
trained, and the weights of the pre-trained network are not modified. To improve
performance further, a few of the upper layers of the frozen base are then unfrozen, so that
the newly added classifier layers and the last layers of the base model are trained together.
This process “fine-tunes” the higher-level representations of the base model to make them
better suited to the specific task. It allows a transition from general feature maps to
dataset-specific features, with the unfrozen weights adjusted throughout this training phase.
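
The two steps above can be sketched with the Keras API as follows. This is a hedged illustration rather than the authors' code: the exact head layout (5 × 5 average pooling, flatten, 40% dropout, softmax output) is one reading of the description in Step 1, the number of unfrozen layers (`n_layers=30`) is an arbitrary assumption because the paper only says "a few", and `train_ds` and `val_ds` are hypothetical dataset objects. The learning rates and epoch counts in the usage comments are the ones reported in Section 4.

```python
import tensorflow as tf
from tensorflow.keras import layers

NUM_CLASSES = 5  # DR grades 0-4


def build_model(weights="imagenet"):
    """Frozen ResNet152V2 base with a small classification head on top."""
    base = tf.keras.applications.ResNet152V2(
        include_top=False, weights=weights, input_shape=(224, 224, 3))
    base.trainable = False                        # Step 1: fixed feature extractor
    inputs = tf.keras.Input(shape=(224, 224, 3))
    x = base(inputs, training=False)              # keep batch-norm layers in inference mode
    x = layers.AveragePooling2D(pool_size=5)(x)   # 7x7 feature map -> 1x1
    x = layers.Flatten()(x)
    x = layers.Dropout(0.4)(x)
    outputs = layers.Dense(NUM_CLASSES, activation="softmax")(x)
    return tf.keras.Model(inputs, outputs), base


def compile_model(model, learning_rate):
    """Adam optimizer with cross-entropy loss; expects one-hot encoded labels."""
    model.compile(optimizer=tf.keras.optimizers.Adam(learning_rate),
                  loss="categorical_crossentropy", metrics=["accuracy"])


def unfreeze_top_layers(base, n_layers):
    """Step 2: make only the last n_layers of the base trainable."""
    base.trainable = True
    for layer in base.layers[:-n_layers]:
        layer.trainable = False


# Usage sketch (train_ds and val_ds are hypothetical datasets):
# model, base = build_model()
# compile_model(model, 1e-3)
# model.fit(train_ds, validation_data=val_ds, epochs=25)   # feature extraction
# unfreeze_top_layers(base, n_layers=30)                    # 30 is an assumed value
# compile_model(model, 1e-5)                                # recompile after unfreezing
# model.fit(train_ds, validation_data=val_ds, epochs=25)   # fine-tuning
```

The model is recompiled after unfreezing so that the change in trainable layers takes effect, and the much lower learning rate keeps the pre-trained weights from being overwritten too aggressively.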
All networks were evaluated on the provided database to assess their ability to detect
and diagnose DR.

4 Results

We conducted the training and testing phases on a Google Colab Professional account, and
all tests were implemented in Python [46]. We assessed the performance of each model
outlined in Section 3 by analyzing four key metrics: Accuracy, Precision, Recall, and
F1-score. Our evaluation was conducted on the curated dataset of 2640 retinal images, with
the data split as follows: 80% for training, 10% for validation, and 10% for testing.
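
An 80/10/10 split of this kind could be produced as in the sketch below. The paper does not say whether the split was stratified by class, so the stratification, the fixed random seed, and the function name are assumptions made for illustration.

```python
import random
from collections import defaultdict


def stratified_split(labels, fractions=(0.8, 0.1, 0.1), seed=0):
    """Split sample indices into train/val/test, class by class.

    Stratification is an assumption; it keeps the class proportions
    roughly equal across the three subsets.
    """
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)
    train, val, test = [], [], []
    for indices in by_class.values():
        rng.shuffle(indices)
        n_train = int(len(indices) * fractions[0])
        n_val = int(len(indices) * fractions[1])
        train += indices[:n_train]
        val += indices[n_train:n_train + n_val]
        test += indices[n_train + n_val:]   # remainder goes to the test set
    return train, val, test


# Labels built from the curated class counts reported in Section 3.2.1.
counts = {0: 967, 1: 330, 2: 900, 3: 173, 4: 270}
labels = [grade for grade, n in counts.items() for _ in range(n)]
train_idx, val_idx, test_idx = stratified_split(labels)
```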
The networks were trained using the Adam optimizer and the cross-entropy loss function.
The input image size for all networks was set to 224 × 224 pixels. During the feature
extraction phase, the initial learning rate was set to 1.00e-03; it was subsequently reduced
to 1.00e-05 for both networks during the fine-tuning process. The models were first trained
for 25 epochs in the feature extraction stage and then for an additional 25 epochs of
fine-tuning.

Fig. 5  Training accuracy and loss for (A) ResNet152-V2 and (B) InceptionResnet-V2

Figure 5 presents the accuracy and loss curves obtained on the training and validation sets
of retinal images during both training stages. The curves show that both models converged
on the training set after a moderate number of initial epochs. The validation accuracy
achieved by ResNet152-V2 and InceptionResnet-V2 during the feature extraction phase is
64.84% and 65.45%, respectively. After fine-tuning, the accuracy increases notably,
reaching 94.92% for ResNet152-V2 and 94.14% for InceptionResnet-V2, while the loss
decreases to 0.14 for ResNet152-V2 and 0.15 for InceptionResnet-V2.
Furthermore, we observed no significant overfitting in either model, as the accuracy on the
training set converged to approximately the same level as on the validation set. The green
line in the graphs marks the start of the fine-tuning process.

5 Discussion

In this section, we initiate an analysis of the obtained results, comparing them to the exist-
ing advancements in the field. Furthermore, we will also provide a detailed description of
the graphical user interface.

5.1 Performance analysis

Figure 6 presents the confusion matrices for the two models. Both achieve excellent
performance on the four grades of diabetic retinopathy, and both accurately detect the
“No-DR” category (Class 0). If only two classes are considered, i.e., “DR” and “No-DR”,
the accuracy reaches 100%.

Fig. 6  Confusion Matrix for (A) ResNet152-V2 and (B) InceptionResnet-V2

In medical categorization research, several commonly used performance measures are
employed. These include accuracy, precision, recall, and F1 score, all of which are crucial in
evaluating the effectiveness of a method. Briefly, the equations used to calculate these metrics
are as follows:
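\[ \text{accuracy} = \frac{TP + TN}{TP + TN + FP + FN} \tag{1} \]

\[ \text{recall} = \frac{TP}{TP + FN} \tag{2} \]

\[ \text{precision} = \frac{TP}{TP + FP} \tag{3} \]

\[ \text{F1 score} = 2 \times \frac{\text{precision} \times \text{recall}}{\text{precision} + \text{recall}} \tag{4} \]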
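
For a five-class problem, the quantities TP, TN, FP, and FN in Eqs. (1) to (4) are usually obtained per class from the confusion matrix in a one-vs-rest fashion. The sketch below shows one way to do this in plain Python. It assumes macro averaging over classes for precision, recall, and F1, and it reports overall accuracy as the fraction of correctly classified samples; the paper does not state which averaging it used, and the function names and the small example matrix are hypothetical.

```python
def one_vs_rest_counts(cm, k):
    """TP, TN, FP, FN for class k; cm[i][j] counts true class i predicted as j."""
    tp = cm[k][k]
    fn = sum(cm[k]) - tp
    fp = sum(row[k] for row in cm) - tp
    tn = sum(sum(row) for row in cm) - tp - fn - fp
    return tp, tn, fp, fn


def safe_div(a, b):
    return a / b if b else 0.0


def evaluate(cm):
    """Overall accuracy plus macro-averaged precision, recall and F1."""
    n = len(cm)
    total = sum(sum(row) for row in cm)
    precisions, recalls, f1s = [], [], []
    for k in range(n):
        tp, _tn, fp, fn = one_vs_rest_counts(cm, k)
        p = safe_div(tp, tp + fp)            # Eq. (3)
        r = safe_div(tp, tp + fn)            # Eq. (2)
        precisions.append(p)
        recalls.append(r)
        f1s.append(safe_div(2 * p * r, p + r))  # Eq. (4)
    return {
        "accuracy": safe_div(sum(cm[k][k] for k in range(n)), total),
        "precision": sum(precisions) / n,
        "recall": sum(recalls) / n,
        "f1": sum(f1s) / n,
    }


# Hypothetical 3-class confusion matrix, used only to exercise the functions.
example = [[5, 0, 0],
           [0, 3, 1],
           [0, 1, 4]]
scores = evaluate(example)
```

A confusion matrix with no off-diagonal entries yields 1.0 for all four scores, which corresponds to the 100% values reported for ResNet152-V2.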

Table 1 shows the results of the performance analysis of the proposed methodology. The
ResNet152-V2 and InceptionResnet-V2 networks were both evaluated for diabetic
retinopathy classification using the metrics defined above. ResNet152-V2 performed best,
with a score of 100% on all evaluation metrics. The InceptionResnet-V2 network achieved
accuracy, recall, precision, and F1 scores that were very close to the ideal values for the
five diabetic retinopathy classes. When diagnosing diabetic retinopathy, recall is the most
crucial statistic to consider: a higher recall value signifies a stronger ability of the model
to recognize all diabetic retinopathy cases in the dataset. Both models were proficient at
recognizing the five diabetic retinopathy classes, with ResNet152-V2 slightly
outperforming InceptionResnet-V2.

Table 1  Performance results of the proposed approaches

Model               Accuracy  Precision  Recall  F1 Score
ResNet152-V2        100%      100%       100%    100%
InceptionResnet-V2  96.61%    98.30%     97.20%  97.80%

We conducted a performance comparison between our two proposed models and several
existing methods using different versions of the APTOS 2019 dataset. The results of this
comparison are summarized in Table 2. Notably, our proposed InceptionResnet-V2 and
ResNet152-V2 models outperformed other techniques, achieving accuracy rates of 96.61%
and 100%, respectively, in stark contrast to the 75.61% accuracy achieved by the CNN
model [28] on the APTOS 2019 Dataset. Additionally, when compared to other established
methods such as Hybrid [32], InceptionV3 [33], and InceptionResNet-V2 [34], the Incep-
tionResnet-V2 and ResNet152-V2 models demonstrated superior accuracy on the APTOS
2019 dataset.
To evaluate the models further, we tested both on identifying the severity grades of DR
using a test retinal dataset. Figure 7 demonstrates the efficiency of the classifiers, with
zero False-Negative cases for ResNet152-V2, indicating its robustness. In Fig. 8, however,
we observe one False-Negative case for InceptionResnet-V2, where the model predicted
the label “Moderate” while the correct label was “Severe”.
This highlights the superior performance of ResNet152-V2 in correctly identifying the
severity of DR cases in the test dataset.
According to the evaluation results and the analyzed parameters, we have decided to
adopt ResNet152-V2 as the primary model for implementation in the GUI.

5.2 The graphical user interface

This section describes the application built around the proposed system. The methodology
offers a simpler solution than those previously reported in the literature. The interface
allows the user to supply an input image of the eye, which is preprocessed and then fed
into the trained ResNet152-V2. Once processing is complete, the interface displays the
result: it states whether the individual has been diagnosed with diabetic retinopathy and
shows the severity of the condition so that the patient is aware of it. Finally, all
information on the patient’s identity and health is saved in a HeidiSQL database.
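
The last step, turning the model output into a severity label and storing it with the patient's details, can be sketched as follows. To keep the example self-contained it uses Python's built-in `sqlite3` module as a stand-in for the database that the authors manage through HeidiSQL; the table layout, the function names, and the sample patient are hypothetical. The grade-to-label mapping follows the class definitions in Section 3.2.1.

```python
import sqlite3

# DR grades as defined in Section 3.2.1.
SEVERITY = {0: "No DR", 1: "Mild", 2: "Moderate", 3: "Severe",
            4: "Proliferative DR"}


def open_db(path=":memory:"):
    """Open (or create) the predictions table; the schema is illustrative."""
    conn = sqlite3.connect(path)
    conn.execute(
        """CREATE TABLE IF NOT EXISTS predictions (
               patient_id TEXT PRIMARY KEY,
               name       TEXT NOT NULL,
               grade      INTEGER NOT NULL,
               severity   TEXT NOT NULL)""")
    return conn


def predicted_grade(probabilities):
    """Index of the largest softmax output, i.e. the predicted DR grade."""
    return max(range(len(probabilities)), key=probabilities.__getitem__)


def save_prediction(conn, patient_id, name, probabilities):
    """Store the prediction for one patient and return the severity label."""
    grade = predicted_grade(probabilities)
    with conn:  # commits on success, rolls back on error
        conn.execute(
            "INSERT OR REPLACE INTO predictions VALUES (?, ?, ?, ?)",
            (patient_id, name, grade, SEVERITY[grade]))
    return SEVERITY[grade]


conn = open_db()
label = save_prediction(conn, "P001", "Sample Patient",
                        [0.10, 0.70, 0.10, 0.05, 0.05])
```

In the GUI, the returned label would be the text shown to the user, while the stored row corresponds to the kind of record displayed in the database view.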

Table 2  Findings compared to each model on test accuracy

Method                     Accuracy
CNN [28]                   94.44%
Hybrid [32]                79.50%
InceptionV3 [33]           82.00%
InceptionResnet-V2 [34]    82.18%
InceptionResnet-V2         96.61%
ResNet152-V2               100%

Fig. 7  Prediction results for ResNet152-V2

The graphical user interface displays the patient’s input image. Figure 9 shows
the outcome; in this case, the patient has mild diabetic retinopathy. Figure 10
shows a screenshot of the database, where we keep each patient’s information as
well as the health condition.
Our technology offers several advantages: automated grading, an intuitive
graphical user interface that is accessible to all users, and faster processing
times than similar systems. The solution is also cost-effective, as it is more
economical than traditional manual strategies. Compared with previous predictive
models, our design stands out for its user-friendly and straightforward interface.
Its strength lies in meticulous preparation, using a reliable and informative
collection of images sourced directly from a dataset created individually by
ophthalmologists.
The primary benefit of our technology is, undoubtedly, its assistance to ophthal-
mologists in evaluating the patient’s condition. However, it is essential to acknowledge
certain limitations. For instance, unlike dependable fundus images, mobile-acquired
images may exhibit limitations, such as lack of clarity, which could potentially lead to
misclassifications.

Fig. 8  Prediction results for InceptionResnet-V2

Fig. 9  The graphical user interface and the displayed results

Fig. 10  HeidiSQL database

6 Conclusion

In this study, we proposed a novel deep learning technique for the detection and diagno-
sis of diabetic retinopathy (DR). The proposed method leverages a multi-stage training
procedure and integrates a graphical user interface with deep learning to achieve supe-
rior performance. Our approach yielded an average accuracy of 100% for ResNet152-
V2 and 96.61% for InceptionResnet-V2. These excellent accuracy, recall, and precision
results demonstrate the effectiveness of our technique for DR detection, especially when
compared to existing methods. Moreover, we considered the computation time of
diabetic retinopathy detection, ensuring that the graphical user interface responds
quickly to both patients and physicians, who can determine the presence of diabetic
retinopathy from retinal images supplied from a reliable source.
The results obtained in this research have encouraged the authors to conduct
further work. The primary focus will be to explore whether other solution processes
can improve the findings further and enable more implementations. In this regard, we
intend to use various image processing approaches to evaluate whether efficiency and
solution quality can be improved. Alternative health sources will also be employed
for certain diagnosis operations. Eventually, feedback will be collected from the
health facilities where the proposed computer-aided diagnostic system is to be
applied, and more effort will be put into employing the method in clinics to
diagnose various diseases, as the system is expected to be very beneficial for
physicians (and health workers) in terms of overall identification techniques.
In the future, our aim is to contribute novel ideas and focus on implementing the pro-
posed system in other imaging modalities, such as OCTA. These imaging modalities have
the capability to simultaneously capture different disease features, including diabetic retin-
opathy, glaucoma, and age-related macular degeneration.

Author contributions Conceptualization was done by SG and AE. All the literature reading and data
gathering were performed by SG. All the experiments and coding were performed by SG and AE. The formal
analysis was performed by AE. Manuscript writing and original draft preparation were done by SG and AE.
Review and editing were done by SG, AE, and HH. Visualization work was carried out by SG, AE, and HH.

Funding The authors received no financial support for the research, authorship, and/or publication of this article.

Data availability The dataset used in this study is public and can be found at the following link:
https://www.kaggle.com/datasets/sovitrath/diabetic-retinopathy-224x224-2019-data.

Declarations
Ethics approval and consent to participate Not applicable.

Consent for publication Not applicable.

Competing interests No conflict of interest to report.

References
1. Congdon NG, Friedman DS, Lietman T (2003) Important causes of visual impairment in the world
today. JAMA 290(15):2057–2060. https://​doi.​org/​10.​1001/​jama.​290.​15.​2057
2. Taylor HR, Keeffe JE (2001) World blindness: a 21st century perspective. Br J Ophthalmol
85(3):261–266. https://​doi.​org/​10.​1136/​bjo.​85.3.​261
3. Kaji Y (2018) Diabetic eye disease. Diabetes and aging-related complications. Springer, Singapore,
pp 19–29. https://​doi.​org/​10.​1007/​978-​981-​10-​4376-5_2
4. Antal B, Hajdu A (2012) An ensemble-based system for microaneurysm detection and diabetic
retinopathy grading. IEEE Trans Biomed Eng 59(6):1720. https://​doi.​org/​10.​1109/​TBME.​2012.​
21931​26
5. Islam M, Dinh AV, Wahid KA (2017) Automated diabetic retinopathy detection using bag of words
approach. J Biomed Sci Eng 10:86–96. https://​doi.​org/​10.​4236/​jbise.​2017.​105B0​10
6. Sutton J, Mahajan R, Akbilgic O, Kamaleswaran R (2018) PhysOnline: an online feature extraction
and machine learning pipeline for real-time analysis of streaming physiological data. IEEE J Biomed
Health Inf 10:11–12. https://​doi.​org/​10.​1109/​jbhi.​2018.​28326​10
7. Gencer C, Coskun A (2005) Robust speed control of permanent magnet synchronous motors using
adaptive neuro fuzzy inference system controllers. Asian J Inf Technol 4(10):918–919. https://​medwe​
lljou​rnals.​com/​abstr​act/?​doi=​ajit.​2005.​918.​919
8. Kose U, Arslan A (2017) Optimization of self-learning in Computer Engineering courses: an intel-
ligent software system supported by artificial neural network and vortex optimization algorithm. Com-
put Appl Eng Educ 25(1):142–156. https://​doi.​org/​10.​1002/​cae.​21787
9. Coskun A (2011) Optimization of a mini-golf game using the genetic algorithm. Electron Electr Eng
3(109):97–100. https://​doi.​org/​10.​5755/​j01.​eee.​109.3.​180
10. Hussein AF, ArunKumar N, Ramirez-Gonzalez G, Abdulhay E, Tavares JMR, de Albuquerque VHC
(2018) A medical records managing and securing blockchain based system supported by a genetic
algorithm and discrete wavelet transform. Cogn Syst Res 52:1–11. https://​doi.​org/​10.​1016/j.​cogsys.​
2018.​05.​004
11. Atanasov P, Gauthier A, Lopes R (2018) Applications of artificial intelligence technologies in health-
care: a systematic literature review. Value Health 21:S84. https://​doi.​org/​10.​1016/j.​jval.​2018.​07.​629
12. Wartman SA, Combs CD (2017) Medical education must move from the information age to the age of
artificial intelligence. Acad Med 93:1107–1109. https://​doi.​org/​10.​1097/​ACM.​00000​00000​002044
13. Hamet P, Tremblay J (2017) Artificial intelligence in medicine. Metabolism 69:S36–S40. https://​doi.​
org/​10.​1016/j.​metab​ol.​2017.​01.​011
14. Xing L, Krupinski EA, Cai J (2018) Artificial intelligence will soon change the landscape of medical
physics research and practice. Med Phys 45(5):1791–1793. https://​doi.​org/​10.​1002/​mp.​12831
15. Gupta D, Julka A, Jain S, Aggarwal T, Khanna A, Arunkumar N, de Albuquerque VHC (2018)
Optimized cuttlefish algorithm for diagnosis of Parkinson’s disease. Cogn Syst Res 52:36–48.
https://doi.org/10.1016/j.cogsys.2018.06.006
16. Hemanth JD, Kose U, Deperlioglu O, de Albuquerque VHC (2018) An augmented reality-sup-
ported mobile application for diagnosis of heart diseases. J Super Comput. https://​doi.​org/​10.​1007/​
s11227-​018-​2483-6
17. Moreira MW, Rodrigues JJ, Al-Muhtadi J, Korotaev VV, de Albuquerque VHC (2018) Neuro-fuzzy
model for HELLP syndrome prediction in mobile cloud computing environments. Concurr Comput.
https://​doi.​org/​10.​1002/​cpe.​4651
18. Rebouças Filho PP, Peixoto SA, da Nóbrega RVM, Hemanth DJ, Medeiros AG, Sangaiah AK, de
Albuquerque VHC (2018) Automatic histologically-closer classification of skin lesions. Comput
Med Imaging Graph 68:40–54. https://​doi.​org/​10.​1016/j.​compm​edimag.​2018.​05.​004
19. Rebouças EDS, Marques RC, Braga AM, Oliveira SA, de Albuquerque VHC, Rebouças Filho PP
(2018) New level set approach based on Parzen estimation for stroke segmentation in skull CT
images. Soft Comput. https://​doi.​org/​10.​1007/​s00500-​018-​3491-4
20. Reboucas Filho PP, Reboucas EDS, Marinho LB, Sarmento RM, Tavares JMR, de Albuquerque
VHC (2017) Analysis of human tissue densities: a new approach to extract features from medical
images. Pattern Recogn Lett 94:211–218. https://​doi.​org/​10.​1016/j.​patrec.​2017.​02.​005
21. Rodrigues MB, Da Nóbrega RVM, Alves SSA, Rebouças Filho PP, Duarte JBF, Sangaiah AK, De
Albuquerque VHC (2018) Health of things algorithms for malignancy level classification of lung
nodules. IEEE Access 6:18592–18601. https://​doi.​org/​10.​1109/​ACCESS.​2018.​28176​14
22. Guefrechi S, Jabra MB, Ammar A, Koubaa A, Hamam H (2021) Deep learning based detection of
COVID-19 from chest X-ray images. Multimed Tools Appl 80(21–23):31803–31820. https://​doi.​
org/​10.​1007/​s11042-​021-​11192-5
23. Ben Jabra M, Koubaa A, Benjdira B, Ammar A, Hamam H (2021) COVID-19 diagnosis in chest
X-rays using Deep Learning and Majority Voting. Appl Sci 11:2884. https://​doi.​org/​10.​3390/​app11​
062884
24. Echtioui A, Zouch W, Ghorbel M, Mhiri C, Hamam H (2020) Detection methods of coronavirus
disease (COVID-19). SLAS Technol Q1:1–7. https://​doi.​org/​10.​1177/​24726​30320​962002
25. Foeady AZ, Novitasari DCR, Asyhar AH, Firmansjah M (2018) Automated diagnosis system of
diabetic retinopathy using GLCM method and SVM classifier. Proceedings of the 2018 5th Interna-
tional Conference on Electrical Engineering, Computer Science and Informatics (EECSI); Malang,
Indonesia. 16–18 October, pp. 154–160. https://​doi.​org/​10.​1109/​EECSI.​2018.​87527​26
26. Acharya UR, Ng EYK, Tan JH, Sree SV, Ng KH (2012) An integrated index for the identification
of diabetic retinopathy stages using texture parameters. J Med Syst 36(3):2011–2020. https://​doi.​
org/​10.​1007/​s10916-​011-​9663-8
27. Rahim SS, Palade V, Shuttleworth J, Jayne C (2016) Automatic screening and classification of dia-
betic retinopathy and maculopathy using fuzzy image processing. Brain Inform 3:249–267. https://​
doi.​org/​10.​1007/​s40708-​016-​0045-3
28. Yasashvini R, Vergin Raja Sarobin M, Panjanathan R, Graceline Jasmine S, Jani
Anbarasi L (2022) Diabetic Retinopathy classification using CNN and Hybrid Deep Convolutional
neural networks. Symmetry 14:1932. https://doi.org/10.3390/sym14091932
29. Rahhal D, Alhamouri R, Albataineh I, Duwairi R (2022) Detection and classification of diabetic
retinopathy using artificial intelligence algorithms. In: Proceedings of the 2022 13th International
Conference on Information and Communication Systems (ICICS), Irbid, Jordan, 21–23 June 2022;
pp. 15–21. https://​doi.​org/​10.​1109/​ICICS​55353.​2022.​98111​97
30. da Rocha DA, Ferreira FMF, Peixoto ZMA (2022) Diabetic retinopathy classification using VGG16
neural network. Res Biomed Eng 38:761–772. https://​doi.​org/​10.​1007/​s42600-​022-​00200-8
31. Nadeem MW, Goh HG, Hussain M, Liew S-Y, Andonovic I, Khan MA (2022) Deep learning for
diabetic retinopathy analysis: a review, research challenges, and future directions. Sensors 22:6780.
https://​doi.​org/​10.​3390/​s2218​6780
32. Mohanty C, Mahapatra S, Acharya B, Kokkoras F, Gerogiannis VC, Karamitsos I, Kanavos A
(2023) Using Deep Learning architectures for detection and classification of Diabetic Retinopathy.
Sensors (Basel) 23(12):5726. https://doi.org/10.3390/s23125726
33. Kurup G, Jothi JAA, Kanadath A (2021) Diabetic retinopathy detection and classification using
pretrained inception-v3. Proceedings of the IEEE International Conference on Smart Generation
Computing, Communication and Networking (SMART GENCON); Pune, India. 29–30 October;
pp. 1–6. https://doi.org/10.1109/SMARTGENCON51891.2021.9645914
34. Gangwar AK, Ravi V (2021) Diabetic Retinopathy Detection using transfer learning and Deep
Learning. In: Bhateja V, Peng SL, Satapathy SC, Zhang YD (eds) Evolution in Computational
Intelligence. Advances in Intelligent systems and Computing, vol 1176. Springer, Singapore.
https://doi.org/10.1007/978-981-15-5788-0_64
35. Fayyaz AM, Sharif MI, Azam S, Karim A, El-Den J (2023) Analysis of Diabetic Retinopathy (DR)
based on the Deep Learning. Information 14:30. https://​doi.​org/​10.​3390/​info1​40100​30
36. Farag MM, Fouad M, Abdel-Hamid AT (2022) Automatic severity classification of Diabetic Retin-
opathy based on DenseNet and Convolutional Block attention Module. IEEE Access 10:38299–38308.
https://​doi.​org/​10.​1109/​ACCESS.​2022.​31651​93
37. Bodapati JD, Veeranjaneyulu N, Shareef SN, Hakak S, Bilal M, Maddikunta PKR, Jo O (2020)
Blended multi-modal deep convnet features for diabetic retinopathy severity prediction. Electronics
9(6):914. https://​doi.​org/​10.​3390/​elect​ronic​s9060​914
38. Gurcan OF, Beyca OF, Dogan O (2021) A comprehensive study of machine learning methods on dia-
betic retinopathy classification. Int J Comput Intell Syst 14(2):1132–1141. https://​doi.​org/​10.​2991/​
ijcis.d.​210316.​001
39. Hemanth DJ, Deperlioglu O, Kose U (2019) An enhanced diabetic retinopathy detection and classifica-
tion approach using deep convolutional neural network. Neural Comput Appl 31:1–15. https://​doi.​org/​
10.​1007/​s00521-​018-​03974-0
40. Dayana AM, Emmanuel WRS (2022) An enhanced swarm optimization-based deep neural network for
diabetic retinopathy classification in fundus images. Multimed Tools Appl 81:20611–20642. https://​
doi.​org/​10.​1007/​s11042-​022-​12492-0
41. Beniz D, Espindola A (2016) Using tkinter of python to create graphical user interface (GUI) for
scripts in LNLS. WEPOPRPO 25:25–28. https://​api.​seman​ticsc​holar.​org/​Corpu​sID:​18283​9114
42. Garner P, Mariani JA (2015) Learning SQL in steps. J Systemics Cybern Inf 13(4):19–24. https://​api.​
seman​ticsc​holar.​org/​Corpu​sID:​19259​283
43. Diabetic Retinopathy 224x224 (2019 Data) | Kaggle:
https://www.kaggle.com/datasets/sovitrath/diabetic-retinopathy-224x224-2019-data
44. He K, Zhang X, Ren S, Sun J (2016) Deep residual learning for image recognition. 2016 IEEE
Conference on Computer Vision and Pattern Recognition (CVPR), Las Vegas, NV, 2016, pp. 770–778.
https://doi.org/10.1109/CVPR.2016.90
45. Szegedy C, Ioffe S, Vanhoucke V, Alemi A (2017) Inception-v4, Inception-ResNet and the impact of
residual connections on Learning. In: AAAI. 4278–4284. https://​doi.​org/​10.​48550/​arXiv.​1602.​07261
46. Van Rossum G, Drake Jr FL (1995) Python Tutorial; Centrum voor Wiskunde en Informatica: Amster-
dam, The Netherlands

Publisher’s Note Springer Nature remains neutral with regard to jurisdictional claims in published maps and
institutional affiliations.

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under
a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted
manuscript version of this article is solely governed by the terms of such publishing agreement and applicable
law.
