Complex & Intelligent Systems (2022) 8:3121–3129
https://doi.org/10.1007/s40747-021-00474-y
    ORIGINAL ARTICLE
A federated approach for detecting the chest diseases using DenseNet
for multi-label classification
K. V. Priya1,2      · J. Dinesh Peter1
Received: 27 February 2021 / Accepted: 14 July 2021 / Published online: 28 July 2021
© The Author(s) 2021
Abstract
Multi-label disease classification algorithms help to predict various chronic diseases at an early stage. Diverse deep neural networks have been applied to multi-label classification problems to predict multiple, mutually non-exclusive classes or diseases. We propose a federated approach for detecting chest diseases using DenseNets for better accuracy in predicting various diseases. Chest X-ray images from the Kaggle repository are used as the dataset for the proposed model. The new model is tested with both a sample and the full chest X-ray dataset, and it outperforms existing models on various evaluation metrics. We adopt a transfer learning approach together with a pre-trained network to improve performance, integrating DenseNet121 into our framework. DenseNets have several strengths: they help to overcome vanishing-gradient issues, boost feature propagation and reuse, and reduce the number of parameters. Furthermore, grad-CAMs are used as a visualization method to highlight the affected regions on the chest X-ray. The proposed architecture thus helps predict various diseases from a single chest X-ray and can guide doctors and specialists in taking timely decisions.
Keywords Deep learning · Transfer learning · Multi-label classification
Introduction

Early diagnosis of chronic diseases aims at detecting the presence of disease as early as possible so that patients receive better treatment in time. When treatment is delayed, the chances of survival fall, and the condition may worsen. Early detection helps patients and doctors make important decisions about treatment and expenses. It also helps patients and their dependents obtain advice and proper guidance to face future challenges. The World Health Organization (WHO) recommends various screening programs, which follow a different methodology from early diagnosis. Screening is defined as the identification of unrecognized disease in an apparently healthy population using tests, examinations, or other strategies that can be applied quickly and effectively to the target population. A screening program should cover all the components of the screening cycle, from inviting the target population to providing effective treatment for people diagnosed with the disease. To support such screening programmes, we propose an optimized deep neural network framework.

Nowadays, deep learning is being explored in healthcare applications for disease prediction. However, a significant obstacle in the clinical domain is the absence of large datasets with reliable class labels. Because large datasets from hospitals or test centers are unavailable, most researchers depend on online datasets. Many researchers have already compared various machine learning algorithms for disease prediction using X-ray image datasets [1].

Deep learning models have already shown the capacity to derive more information from real evaluation data than conventional classification systems. High-level deep learning architectures can be used to identify the significance of various features in patient records, as well as to learn the contribution of each feature to a patient's risk for various diseases. In any

B    K. V. Priya
     priyakv06@gmail.com

1    Department of Computer Science and Engineering, Karunya Institute of Technology and Sciences, Coimbatore, Tamil Nadu, India

2    Department of Computer Science and Engineering, Sahrdaya College of Engineering and Technology, Thrissur, Kerala, India
case, accurate prediction of chronic disease risk remains a difficult problem that warrants further investigation. It is therefore essential to build a practical tool that helps doctors recognize patterns in patient data that indicate risks associated with particular types of chronic disease. The proposed model can be a better solution for classifying various diseases from chest X-ray images.

Transfer learning is widely accepted to speed up the training and improve the performance of deep neural networks. It is an approach to predictive modeling that reuses knowledge across different but related problems: the main idea is to move the knowledge acquired in one process to another [2].

There are two ways to apply transfer learning: (1) fine-tuning the parameters of the pre-trained network according to the given problem [2], and (2) using the pre-trained network as a feature extractor, with the extracted features then used to train a new classifier. In deep learning, transfer learning implies reusing the weights of at least one layer from a pre-trained network model. Depending on the requirements, the weights can be kept fixed, fine-tuned, or adopted exactly as in the pre-trained model when training the new classifier.

Chronic diseases are one of the major causes of death among adults in practically all countries, and the number of affected individuals is expected to rise by 17% over the next 10 years. It is reported that six out of every ten adults in the US have a chronic disease and four out of ten have at least two. Moreover, of the 58 million deaths in 2005, around 35 million were due to chronic diseases. Early examination of chronic diseases aims at recognizing the presence of disease as early as possible so that patients receive treatment on time; when treatment is delayed, the chances of survival fall, and the condition may worsen. Multi-label disease classification algorithms help to predict various chronic diseases at an early stage, whereas binary classification in this domain can predict only one chronic disease. In the proposed work, it is possible to predict various chronic diseases from a single chest X-ray, which will help doctors and specialists take accurate decisions.

For the implementation of this model, chest X-ray images provided by the NIH (National Institutes of Health) Clinical Center are used; the same dataset is openly available on Kaggle [3]. The chest X-ray dataset is composed of 112,120 frontal CXR images from 30,805 unique patients. All the images are labeled with one or more of 14 diseases, for example atelectasis, consolidation, infiltration, pneumothorax, edema, emphysema, and fibrosis. Samples from this chest X-ray 14 dataset are shown in Fig. 1.

Fig. 1 Sample X-ray images from chest X-ray 14

Related works

The greater part of the existing works have adopted convolutional networks for chest X-ray classification [4,5]. X-ray images have been used to recognize lung cancer and other lung illnesses using different deep learning models [6,7]. The timely diagnosis of other lung conditions, such as pneumothorax, atelectasis, pneumonia, pulmonary edema, COPD, and asthma, also needs to be addressed, and several deep neural network frameworks have already been developed for the early detection of these diseases. Lung nodules in chest CT scans were recognized by implementing a deep 3D CNN with multi-label prediction [7], which the authors compared with a 2D CNN [7]. They concluded that a 3D CNN can exploit spatial 3D contextual information and thereby generate discriminative features when trained on 3D images. To identify tiny nodules, they additionally proposed a variational nodule prediction procedure comprising cube prediction and clustering [7]. However, this framework cannot be used to classify different disease types. A convolutional neural network to minimize the false-positive rate in the classification of lung nodules is proposed in [8,20]; by investigating the attributes of CT scan images, the authors could reduce the false-positive rate of the classification [8].

A hybrid framework combining a spatial transformer network (STN), data augmentation, and VGG with a convolutional neural network has given better accuracy for the detection of lung diseases from X-rays [9]. This work was named hybrid CNN VGG Data STN (VDSNet). Based on input parameters such as age, X-rays, gender, and view position, a binary classification was performed [9] and disease was predicted. This model resized the input images to 64*64 for the classifier. The architecture contains three main stages: spatial transformer layers, feature extraction layers, and classification layers [9].

The spatial transformer layer incorporates λ (lambda) to shift the normal routing [− 0.5:0.5], batch normalization, and a spatial transformer to extract the most significant features for lung illness classification [9]. To separate key features, a localization network is also used in the spatial transformation layer. The metrics used are recall, precision, and F-score [9]. In the feature extraction stage, the pre-trained model VGG16 is used. VGG16 [10] gives 92.7% accuracy
rate on ImageNet, a dataset that contains millions of images across a thousand classes. The VGG16 architecture contains 13 convolutional layers, 5 max-pooling layers, and 3 dense layers. VGG16 was one of the best models assessed in ILSVRC 2014; it improves on AlexNet by replacing the large kernel-sized filters (11 in the first layer and 5 in the second) with stacks of smaller filters, one after another [10].

Beomhee Park et al. [11] built a curriculum learning strategy to improve the classification precision of diverse lesions in chest X-ray images, screening for pulmonary abnormalities including nodule[s], pneumothorax, pleural effusion, consolidation, etc., using chest X-rays of two different origins. The model was trained in two phases: first, examples of thoracic abnormalities were recognized; then, a ResNet-50 pre-trained on the ILSVRC dataset was used to tune the model using the entire images from the two different datasets [11].

The convolutional neural network (CNN) architecture ResNet has been widely used since its introduction at the ILSVRC (ImageNet Large Scale Visual Recognition Competition). Its authors showed that ResNet could perform better as the layers become deeper. In [11], a 50-layer architecture was used for the learning model, with a Softmax function for the multi-classification problem. The abnormal patterns are visualized with the help of class activation maps (CAM). The combined CAM and AUC results were misclassified in only a few patients with consolidation or nodule[s]. The CAM outcomes were extracted for all the trained classes and shown on the independent X-ray images; these highlighted disease regions can be used by experts for accurate decision making [11]. Measures such as sensitivity, specificity, and area under the curve (AUC) were computed separately for each of the two selected datasets: 85.4%, 99.8%, and 0.947 on the first dataset, and 97.9%, 100.0%, and 0.983 on the second. This model made it easy to train the system with large-scale CXR images and opened the door to detecting more diseases from chest X-rays. Sebastian Gundel et al. [12] came up with a deep neural network based on the DenseNet architecture [13], which incorporates five dense blocks and a total of 121 convolutional layers. Each dense block comprises several dense layers that incorporate batch normalization, rectified linear units, and convolution [13]. In the proposed network [12], a transition layer incorporating batch normalization, convolution, and pooling is added between dense blocks to reduce the dimensions; a global average pooling (GAP) layer is likewise included. The model is initialized from the ImageNet pre-trained DenseNet 121 [12]. The authors observed that this multi-task convolutional neural network could effectively classify an ample range of abnormalities in chest X-ray images. While learning abnormality detection, the framework was supported by extra features, for example spatial data and standardization, on a dataset of 297,541 images [13]. This framework could work well for the classification of 12 unique abnormalities, the characterization of their location, and the segmentation of heart and lung projections. With the extra lung and heart segmentation and spatial labels, together with an adaptive normalization technique, they could improve the abnormality classification performance to an average AUC of 0.883 on 12 diverse abnormalities [13].

Another DenseNet model, CheXNet, contains a dense convolutional network of 121 layers [13] trained on the chest X-ray dataset. DenseNets improve the flow of information and gradients, making the optimization of deep networks manageable [14]. The weights of this pre-trained model are adopted for our model to improve the classification accuracy.

Methodology

The overall architecture works in three stages. In the first stage, the required preprocessing is done to achieve accurate results. Second, the model is fine-tuned with the help of a pre-trained DenseNet, DenseNet 121, to detect multiple diseases from a single chest X-ray image. Finally, grad-CAMs are extracted to localize and visualize the abnormal patterns on the chest X-rays. The proposed model uses densely connected convolutional networks, especially DenseNet121 [13]. The major problem with conventional CNNs appears when the network gets deeper: the path from the input to the output layer becomes so long that gradients can vanish before reaching the output layer. Another problem encountered in deep networks is that many of the layers are redundant. DenseNets improve on the connectivity between layers found in other architectures. DenseNets require fewer parameters than a comparable convolutional neural network, since there is no need to learn redundant feature maps. The other commonly used CNN is ResNet, which has already demonstrated that many layers contribute very little; ResNets have more parameters than DenseNets because each layer has its own weights to learn. In contrast, the layers in DenseNets are very narrow (for example, 12 filters), and they only add new feature maps [13]. Another issue in deep networks arises during training because of the flow of information and gradients. In a DenseNet, each layer obtains the gradients directly through the input and the loss function. Each layer in a DenseNet generates k feature maps, which are concatenated with the features captured in the previous layers. The result of this concatenation operation turns out
to be the input to the following layer. Instead of summing the output feature maps with the new feature maps, as other deep networks do, DenseNet concatenates the output feature maps with the new feature maps. The following equation represents the concatenation of feature maps in DenseNets:

x_l = H_l([x_0, x_1, ..., x_{l−1}]).    (1)

A DenseNet basically consists of different dense blocks, and the feature maps have the same spatial dimensions inside each dense block, although the number of filters may change between blocks. The layers between the blocks are known as transition layers. Transition layers perform downsampling and include batch normalization, a 1 × 1 convolution, and a 2 × 2 pooling layer. Compared to VGG and ResNets, DenseNet performs better because of its dense number of connections.

Fig. 2 Class imbalance problem in the dataset

Fig. 3 After implementing loss entropy

In each dense layer, 32 new feature maps are added to the previous feature maps. As a result, the network moves from 64 feature maps to 256 feature maps after the sixth layer. As already mentioned, a transition block follows with a 1×1 convolution using 128 filters, pursued by a 2 × 2 pooling window with a stride of 2, which halves the feature map dimensions. The weights of our federated model are initialized with the weights of a DenseNet121 pre-trained on ImageNet.
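The feature-map bookkeeping described above can be sketched in plain Python (a toy illustration of DenseNet-style growth, assuming a growth rate k = 32, an initial 64 channels, and a transition layer that compresses to 128 channels before halving the spatial size):

```python
def dense_block_channels(c_in, growth_rate=32, num_layers=6):
    """Each dense layer adds `growth_rate` feature maps, which are
    concatenated with all previous feature maps (not summed)."""
    channels = c_in
    for _ in range(num_layers):
        channels += growth_rate  # concatenation grows the channel count
    return channels


def transition(channels, spatial, compression_filters=128):
    """Transition layer: 1x1 conv to `compression_filters` channels,
    then 2x2 pooling with stride 2 halves the spatial dimensions."""
    return compression_filters, spatial // 2


c = dense_block_channels(64)   # 64 + 6 * 32 = 256 feature maps
c, s = transition(c, 32)       # -> 128 channels, spatial size halved
print(c, s)
```

This makes concrete why concatenation (rather than summation) drives the channel count from 64 to 256 across six layers, and why the transition block is needed to keep the representation compact.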
As our proposed model is based on transfer learning [15,16], separate sets of training, testing, and validation samples are chosen from the NIH chest X-ray dataset. In the initial stage, it is necessary to check for data leakage, i.e., X-rays of the same patient appearing in the training, validation, and testing datasets. For preprocessing, data generator methods such as normalization using the mean and standard deviation have been used on both the validation and training datasets. In addition, a separate standardization is required on the testing set.
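A minimal sketch of such a leakage check, assuming each split is represented by the list of patient IDs it contains (the IDs below are illustrative, not taken from the dataset):

```python
def has_leakage(train_ids, valid_ids, test_ids):
    """Return True if any patient appears in more than one split,
    which would leak information between training and evaluation."""
    train, valid, test = set(train_ids), set(valid_ids), set(test_ids)
    return bool((train & valid) | (train & test) | (valid & test))


# Toy example: patient 30805 appears in both train and test -> leakage.
print(has_leakage([1, 2, 30805], [3, 4], [30805, 5]))  # True
```

Checking overlap at the patient level (not the image level) matters because one patient can contribute several X-rays, and near-identical images in different splits would inflate the reported metrics.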
Before feeding the images into the model, the input images are resized and normalized based on the standard deviation and mean. In each epoch, the input is shuffled to get accurate results. The main problem with the chest X-ray dataset is class imbalance. Figure 2 shows the class imbalance in the dataset: the Hernia pathology has the greatest imbalance, with the proportion of positive training cases being about 0.2%, and even the Infiltration pathology, which has the least imbalance, has only 17.5% of the training cases marked positive.

Fig. 4 After solving class imbalance problem

Cross-entropy is an appropriate cost function for classification problems. To express the probabilities or class distributions, it takes advantage of the activation function at the output layer, and it is a better choice when dealing with an imbalanced dataset. The fraction of the less dominant class is multiplied with the part associated with the dominant class, which facilitates reducing the loss to a smaller value; if there were only one class in the dataset, this loss value would become zero. If the ordinary cross-entropy loss function is used on this highly unbalanced dataset, the optimization will be pushed to focus on the majority class (negative, in this situation), since it contributes more to the loss. To overcome this issue, the average cross-entropy loss can be written over the whole training set D of size N as follows:
L_cross-entropy(D) = − (1/N) [ Σ_positive log(f(x_i)) + Σ_negative log(1 − f(x_i)) ].    (2)

Using this formula, it is noticeable that if there is a huge imbalance with very few positive training cases, for instance, then the loss will be dominated by the negative class. Equations (3) and (4) give the contribution of each class's probability distribution, which is used in the cross-entropy. Summing the contribution over all the training cases for each class (i.e., each pathological condition), the contribution of each class (positive or negative) is

freq_p = (number of positives) / N    (3)
freq_n = (number of negatives) / N    (4)

In Fig. 3, it can clearly be seen that the contribution of the positive cases is substantially lower than that of the negative ones. However, the main goal here is to make the contributions equal. One way of doing this is to multiply every example from each class by a class-specific weight factor, w_pos and w_neg, such that the overall contribution of each class is the same. To do this, we need

w_pos × freq_p = w_neg × freq_n,    (5)

which can be satisfied by

w_pos = freq_n    (6)
w_neg = freq_p.    (7)

Using the above equations, the contributions of the positive and negative labels can be balanced.

After solving the class imbalance problem, the model is trained with DenseNet 121, using the weights obtained from the pre-trained model. Figure 5 shows the proposed architecture, in which the preprocessed image is given to the pre-trained DenseNet 121 and then to a global average 2D pooling layer. The pooled features are then fed to a dense layer.

Localization using grad-CAM

After training the network for classification, gradient class activation maps (grad-CAMs) are generated [17,18]. Grad-CAM can be used to visualize what CNNs are actually looking at. The gradients derived from the output layer of the convolutional model are used to create heat maps. These heat maps are generally created through grad-CAM (gradient-weighted class activation mapping), which highlights the significant areas of an input image; grad-CAM is an excellent technique for interpreting convolutional neural networks. The CAM and AUC results are combined to produce significant results. The localization of the disease patterns is important for the easy analysis of the classification results. The outcomes for all test samples were independently drawn out with grad-CAMs for the predicted results, and highlighting these results on the X-rays will help the specialists in their analysis.

These visualization procedures help in understanding the network and may also be valuable because they nearly resemble the visual diagnosis presented to radiologists [19]. The classification results and localization using grad-CAMs for test illnesses are shown in Fig. 6. The steps involved in the visualization using grad-CAM are as follows:

1. Hook into the model output and last-layer activations.
2. Get the gradients of the last-layer activations with respect to the output.
3. Compute the value of the last layer and the gradients for an input image.
4. Compute weights from the gradients by global average pooling.
5. Compute the dot product between the last layer and the weights to get the score for each pixel.
6. Resize, take the ReLU, and return the CAM.
7. Show the labels with the top 4 AUC.

Results

Figure 7 depicts the training loss curve acquired by our model; it portrays the training process and the way the network learns. During an epoch, the loss function is computed across each data item, giving a quantitative loss measure at the given epoch.

The performance of this multi-label classification problem can be visualized using the receiver operating characteristic (ROC) curve, a graph that shows the performance of a classification model at all classification thresholds. The curve relates two parameters, the true-positive rate (TPR, also known as recall) and the false-positive rate (FPR), plotting TPR against FPR at different classification thresholds. In this model, ROC curves are plotted for every disease, as shown in Fig. 8. The ROC curve is plotted with true-
Fig. 5 A federated approach for detecting the chest diseases using DenseNet
Fig. 6 Grad-CAMs for different diseases of the same patient
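The core of the grad-CAM procedure (steps 4–6 of the list above) can be sketched in plain Python. This is a toy illustration on nested lists standing in for the last convolutional layer's activations and gradients; a real implementation would hook these tensors out of the trained network and resize the map to the input image:

```python
def grad_cam(activations, gradients):
    """activations, gradients: [C][H][W] lists from the last conv layer.
    Returns an H x W heat map (before resizing to the input image)."""
    c = len(activations)
    h, w = len(activations[0]), len(activations[0][0])
    # Step 4: global-average-pool the gradients -> one weight per channel.
    weights = [sum(sum(row) for row in gradients[k]) / (h * w)
               for k in range(c)]
    # Step 5: weighted sum of activations -> a score for each pixel.
    cam = [[sum(weights[k] * activations[k][i][j] for k in range(c))
            for j in range(w)] for i in range(h)]
    # Step 6: ReLU keeps only regions that positively support the class.
    return [[max(0.0, v) for v in row] for row in cam]


acts = [[[1.0, 0.0], [0.0, 2.0]], [[0.0, 1.0], [1.0, 0.0]]]   # C=2, H=W=2
grads = [[[1.0, 1.0], [1.0, 1.0]], [[-1.0, -1.0], [-1.0, -1.0]]]
print(grad_cam(acts, grads))  # [[1.0, 0.0], [0.0, 2.0]]
```

The channel with negative gradients is suppressed, so only pixels whose activations push the class score up survive the ReLU, which is what makes the heat map localize the disease-relevant region.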
Fig. 7 Training loss curve
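The class-weighted cross-entropy derived in Eqs. (2)–(7) can be illustrated with a small Python sketch (toy numbers; `p` stands for the model's sigmoid output for one pathology):

```python
import math


def weighted_bce(labels, probs):
    """Cross-entropy with the class weights of Eqs. (5)-(7):
    w_pos = freq_n and w_neg = freq_p, so both classes contribute equally."""
    n = len(labels)
    freq_p = sum(labels) / n        # Eq. (3): fraction of positives
    freq_n = 1 - freq_p             # Eq. (4): fraction of negatives
    w_pos, w_neg = freq_n, freq_p   # Eqs. (6)-(7)
    loss = 0.0
    for y, p in zip(labels, probs):
        if y == 1:
            loss += -w_pos * math.log(p)
        else:
            loss += -w_neg * math.log(1 - p)
    return loss / n


# Highly imbalanced toy batch: 1 positive, 9 negatives, all predicted 0.5.
labels = [1] + [0] * 9
probs = [0.5] * 10
print(round(weighted_bce(labels, probs), 4))  # 0.1248
```

With these weights the single positive example and the nine negative examples contribute exactly the same total loss, which is the balancing condition of Eq. (5).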
Fig. 8 ROC curve
Table 1 Prediction accuracy of various diseases
Disease                                 Prediction accuracy                    Disease                              Prediction accuracy
Atelectasis                             0.882                                  Pneumothorax                         0.886
Cardiomegaly                            0.809                                  Consolidation                        0.724
Effusion                                0.904                                  Edema                                0.985
Infiltration                            0.742                                  Emphysema                            0.959
Mass                                    0.872                                  Fibrosis                             0.918
Nodule                                  0.874                                  Pleural thickening                   0.583
Pneumonia                               0.811                                  Hernia                               0.966
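As a quick sanity check, the per-disease scores in Table 1 can be aggregated into a mean score with a few lines of Python (values copied from Table 1; the mean itself is not reported in the paper):

```python
# Per-disease prediction accuracies from Table 1.
accuracy = {
    "Atelectasis": 0.882, "Cardiomegaly": 0.809, "Effusion": 0.904,
    "Infiltration": 0.742, "Mass": 0.872, "Nodule": 0.874,
    "Pneumonia": 0.811, "Pneumothorax": 0.886, "Consolidation": 0.724,
    "Edema": 0.985, "Emphysema": 0.959, "Fibrosis": 0.918,
    "Pleural thickening": 0.583, "Hernia": 0.966,
}
mean_acc = sum(accuracy.values()) / len(accuracy)
print(round(mean_acc, 4))  # 0.8511
```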
Table 2 Comparison study of prediction accuracies with existing models
Disease                               Wang et al. [6]              Yao et al. [21]                CheXNet         Federated approach
Atelectasis                     0.716                                0.772                          0.8094          0.882
Cardiomegaly                    0.807                                0.904                          0.9248          0.809
Effusion                        0.784                                0.859                          0.8638          0.904
Infiltration                    0.609                                0.695                          0.7345          0.742
Mass                            0.706                                0.792                          0.8676          0.872
Nodule                          0.671                                0.717                          0.7802          0.874
Pneumonia                       0.633                                0.713                          0.7680          0.811
Pneumothorax                    0.806                                0.841                          0.8887          0.886
Consolidation                   0.708                                0.788                          0.7901          0.724
Edema                           0.835                                0.882                          0.8878          0.985
Emphysema                       0.815                                0.829                          0.9371          0.959
Fibrosis                        0.769                                0.767                          0.8047          0.918
Pleural thickening              0.708                                0.765                          0.8062          0.583
Hernia                          0.767                                0.914                          0.9164          0.966
positive rate (TPR) on the y-axis against the false-positive rate (FPR) on the x-axis.
    The accuracy of the different diseases obtained in our model is presented in Table 1.
    The proposed federated approach is compared with the existing results, and our model outperforms them in the prediction of chest diseases such as mass, nodule, infiltration, atelectasis, effusion, pneumonia, edema, emphysema, fibrosis and hernia. Table 2 compares the prediction accuracies of existing multi-label chest X-ray classification approaches with those of our proposed model, leading to the conclusion that our model performs better than the existing models.
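The per-disease ROC curves of Fig. 8 can be reproduced from the sigmoid outputs of the multi-label model. A minimal pure-Python sketch of the threshold sweep and the trapezoidal AUC, on toy scores with no ties (in practice each of the 14 disease columns gets its own curve):

```python
def roc_points(labels, scores):
    """(FPR, TPR) points swept over all score thresholds, high to low."""
    pos = sum(labels)
    neg = len(labels) - pos
    order = sorted(range(len(scores)), key=lambda i: -scores[i])
    tp = fp = 0
    points = [(0.0, 0.0)]
    for i in order:
        if labels[i]:
            tp += 1
        else:
            fp += 1
        points.append((fp / neg, tp / pos))
    return points

def auc(points):
    """Area under the ROC curve by the trapezoidal rule."""
    return sum((x2 - x1) * (y1 + y2) / 2
               for (x1, y1), (x2, y2) in zip(points, points[1:]))

labels = [1, 1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]
area = auc(roc_points(labels, scores))  # → 8/9 ≈ 0.889
```

Tied scores would need to be grouped before stepping; this toy sweep assumes distinct scores for clarity.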
Conclusion

This work presents a deep learning network architecture tailored to problems that need multi-label classification. The proposed network model can be trained from scratch and can also be fine-tuned with weights adopted from a model pre-trained on ImageNet. The proposed architecture demonstrated a performance improvement by using a weighted cross-entropy loss to counter the class imbalance problem; this loss performs particularly well on classes with relatively few instances, for example Hernia. The output of the model is visualized using grad-CAMs, giving a better visual explanation of the proposed classification architecture. The grad-CAM results will be useful to radiologists because the areas affected by multiple diseases are highlighted on the X-ray images, assisting them in auditing and interpreting the different diseases in a single X-ray image. Future work will focus on better techniques for pre-training on various datasets, especially for multiple disease prediction. The proposed model can be fine-tuned with appropriate optimization techniques such as stochastic gradient descent with momentum, AdaGrad, RMSProp and the Adam optimizer; optimization algorithms are responsible for reducing the loss and providing the most accurate results possible.

Declarations

Conflict of interest The authors declare that they have no conflict of interest.

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.

References

1. Baltruschat IM, Nickisch H, Grass M, Knopp T, Saalbach A (2019) Comparison of deep learning approaches for multi-label chest X-ray classification. Sci Rep 9:1–10
2. Ribani R, Marengoni M (2019) A survey of transfer learning for convolutional neural networks. In: Proceedings of SIBGRAPI-T. IEEE, pp 47–57
3. NIH sample chest X-rays dataset (2020) https://www.kaggle.com/nih-chest-xrays/sample. Accessed 5 June 2020
4. Sun W, Zheng B, Qian W (2017) Automatic feature learning using multichannel ROI based on deep structured algorithms for computerized lung cancer diagnosis. Comput Biol Med 89:530–539
5. Sun W, Zheng B, Qian W (2016) Computer aided lung cancer diagnosis with deep learning algorithms. In: Proceedings of SPIE, medical imaging, vol 9785
6. Wang X, Peng Y, Lu L, Lu Z, Bagheri M, Summers RM (2017) ChestX-Ray8: hospital-scale chest X-ray database and benchmarks on weakly-supervised classification and localization of common thorax diseases. In: 2017 IEEE conference on computer vision and pattern recognition (CVPR), pp 3462–3471
7. Gu Y, Lu X, Yang L, Zhang B, Yu D, Zhao Y, Gao L, Wu L, Zhou T (2018) Automatic lung nodule detection using a 3D deep convolutional neural network combined with a multi-scale prediction strategy in chest CTs. Comput Biol Med 103:220–231
8. Setio AAA, Traverso A, de Bel T, Berens MSN, van den Bogaard C, Cerello P, Chen H, Dou Q, Fantacci ME, Geurts B et al (2017) Validation, comparison, and combination of algorithms for automatic detection of pulmonary nodules in computed tomography images: the LUNA16 challenge. Med Image Anal 42:1–13
9. Bharati S, Podder P, Mondal MRH (2020) Hybrid deep learning for detecting lung diseases from X-ray images. Inform Med Unlocked 20:100391
10. Simonyan K, Zisserman A (2015) Very deep convolutional networks for large-scale image recognition. In: International conference on learning representations (ICLR)
11. Park B, Cho Y, Lee G, Lee SM, Cho Y-H, Lee ES, Lee KH, Seo JB, Kim N (2019) A curriculum learning strategy to enhance the accuracy of classification of various lesions in chest-PA X-ray screening for pulmonary abnormalities. Sci Rep 9:15352. https://doi.org/10.1038/s41598-019-51832-3
12. Gundel S, Ghesu FC, Grbic S, Gibson E, Georgescu B, Maier A, Comaniciu D (2019) Multi-task learning for chest X-ray abnormality classification on noisy labels. IEEE
13. Huang G, Liu Z, Maaten L, Weinberger KQ (2017) Densely connected convolutional networks. In: 2017 IEEE conference on computer vision and pattern recognition (CVPR), pp 2261–2269
14. Rajpurkar P, Irvin J, Zhu K, Yang B, Mehta H, Duan T, Ding D, Bagul A, Ball RL, Langlotz C, Shpanskaya K, Lungren MP, Ng AY (2017) CheXNet: radiologist-level pneumonia detection on chest X-rays with deep learning
15. Zhi W, Yueng H, Chen Z et al (2017) Using transfer learning with convolutional neural networks to diagnose breast cancer from histopathological images. In: Proceedings of ICONIP 2017, pp 669–676
16. Mehra R et al (2018) Breast cancer histology images classification: training from scratch or transfer learning? ICT Express 4(4):247–254
17. Selvaraju RR et al (2016) Grad-CAM: why did you say that? Visual explanations from deep networks via gradient-based localization, pp 1–5. arXiv:1610.02391v2
18. Zhou B, Khosla A, Lapedriza A, Oliva A, Torralba A (2016) Learning deep features for discriminative localization. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 2921–2929
19. Pasa F, Golkov V, Pfeifer F, Cremers D, Pfeifer D (2019) Efficient deep network architectures for fast chest X-ray tuberculosis screening and visualization. Sci Rep 9:1–9
20. Song Q, Zhao L, Luo X, Dou X (2017) Using deep learning for classification of lung nodules on computed tomography images. J Healthc Eng 2017:8314740. https://doi.org/10.1155/2017/8314740
21. Yao L, Poblenz E, Dagunts D, Covington B, Bernard D, Lyman K (2017) Learning to diagnose from scratch by exploiting dependencies among labels. arXiv:1710.10501

Publisher's Note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.