Alzheimer's Disease Detection Using Deep Learning and Machine Learning: A Review
https://doi.org/10.1007/s10462-025-11258-y
Saeed Mohsen1,2
saeed.mohsen@ksiu.edu.eg
1 Department of Electronics and Communications Engineering, Al-Madinah Higher Institute for Engineering and Technology, Giza 12947, Egypt
2 Department of Artificial Intelligence Engineering, Faculty of Computer Science and Engineering, King Salman International University (KSIU), South Sinai 46511, Egypt
Abstract
Alzheimer's disease (AD) is a progressive neurodegenerative disorder that significantly impacts cognitive function, posing challenges for early diagnosis and treatment. Advances in artificial intelligence (AI) have revolutionized medical image analysis, providing robust frameworks for accurate and automated AD detection. This paper reviews recent developments in deep learning (DL) and machine learning (ML) models for AD classification, such as convolutional neural networks (CNNs), transfer learning, hybrid architectures, and novel attention mechanisms. Additionally, applications of AI models to AD, datasets, preprocessing techniques, challenges, and recent studies in this field are discussed. The paper also covers the different medical imaging modalities, the factors that increase the risk of Alzheimer's disease, the progression stages of the disease, and several metrics for assessing AI models' performance, including accuracy, the Matthews correlation coefficient (MCC), F1-score, recall, precision, the area under the receiver operating characteristic (ROC) curve, the confusion matrix, and loss. Further, the paper presents several comparisons of different DL approaches for AD, along with limitations, new trends, suggestions, and future directions for this evolving field.
1 Introduction
Early detection of Alzheimer’s disease (AD) is essential for implementing effective pre-
ventative strategies. As the most prevalent chronic condition among the elderly, AD affects
a significant portion of the aging population, making timely diagnosis crucial to managing
its high incidence rate. AD is a leading cause of dementia worldwide, necessitating timely
and precise diagnostic techniques. AD has multiple potential causes and affects thinking, memory, and behavior in older individuals (Chaddad et al. 2018). Traditional
approaches rely on clinical assessments and neuroimaging analysis, often involving signifi-
cant manual intervention (Iqbal et al. 2023).
   Deep learning (DL), with its ability to automatically extract meaningful features from
data, has emerged as a transformative tool in medical imaging, particularly in detecting and
classifying AD from modalities such as positron emission tomography (PET), magnetic
resonance imaging (MRI), and computed tomography (CT) scans (Kaya and Çetın-Kaya
2024). Classification tasks in AD detection aim to categorize patients into healthy, mild
cognitive impairment (MCI), or AD groups. This process helps in early intervention, par-
ticularly in MCI cases where the disease might progress to AD (Pei et al. 2022).
   The progression of AD through brain stages is illustrated in Fig. 1, which highlights the differences between no impairment, i.e., normal cognition (NC), as the healthy stage; mild decline, i.e., mild cognitive impairment (MCI), as the intermediate stage; and severe decline, i.e., Alzheimer's disease (AD), as the clinical stage (Bellio 2021).
   The likelihood of developing AD grows with age, particularly among individuals over 65 (Grundman and Petersen 2004). Several factors may contribute to an increased risk of Alzheimer's disease, including:
●   Genetics: specific genetic variations have been linked to a heightened risk of Alzheimer's.
●   Environmental influences: exposure to harmful substances or head injuries can raise the likelihood of developing the disease.
●   Lifestyle choices: unhealthy habits such as poor nutrition, inactivity, and other detrimental behaviors may increase the risk.
●   Medical conditions: health issues such as hypertension, diabetes, and elevated cholesterol levels are also associated with a greater risk of Alzheimer's (Malik et al. 2024).
Figure 2 illustrates several factors that increase the risk of Alzheimer's disease.
    The commonly used neuroimaging modalities include MRI, which provides structural information and enables the detection of brain atrophy; PET, which offers functional insights by capturing metabolic changes; and CT, which helps in identifying structural abnormalities (Wu et al. 2019; AlSaeed and Omar 2022). Alzheimer's disease leads to a significant reduction in brain mass; Fig. 3 illustrates the human brain for a healthy individual and an Alzheimer's patient. In individuals
with normal cognition, brain shrinkage is minimal or negligible. However, those with MCI
experience an accelerated reduction in brain volume, losing around 1–2% annually—sig-
nificantly faster than typical aging (Huang et al. 2020). In AD, the rate of brain volume loss
increases further, reaching 3–5% per year. Regions like the hippocampus are particularly
affected, with shrinkage rates as high as 10–15% annually in advanced stages (Sungura et
al. 2021).
    The main contributions of this review are the following:
●   Reviewing recent DL and ML models for AD detection, together with datasets and preprocessing methods.
●   Discussing the potential bias in AI models and mitigation strategies for this bias.
●   Presenting a discussion on the challenges and limitations in this field of AD.
●   Discussing the computational cost and latency in clinical AI deployment.
●   Presenting new trends, suggestions, and future directions in DL for AD detection.
●   Comparing several studies in AD detection in terms of the performance, methodology,
    and key contributions.
The remainder of this paper is organized as follows: Sect. 2 presents AI models for AD classification and detection. Section 3 provides several applications of deep learning in Alzheimer's disease detection. Section 4 presents different datasets and preprocessing techniques for Alzheimer's disease detection. Section 5 presents recent studies in AD. Section 6 presents a comparison of AI approaches versus traditional AD diagnostic methods. Section 7 discusses real-world implementation of AI models and dataset adaptability. Section 8 explains the overfitting problem in DL models and its mitigation methods. Section 9 describes the medical modalities used with AD. Section 10 presents the evaluation metrics used in AD detection. The potential bias in AI models and mitigation strategies are presented in Sect. 11, while the challenges and limitations in this field and comparisons of DL approaches for AD are introduced in Sect. 12. Section 13 presents a discussion of computational cost and latency in clinical AI deployment. New trends and suggestions in DL for AD detection are presented in Sect. 14. Future directions in deep learning for AD detection are discussed in Sect. 15. Section 16 introduces a benchmark comparison of DL models for AD detection. Finally, the paper is concluded in Sect. 17.
This paper reviews state-of-the-art artificial intelligence (AI) models employed for AD
detection, highlighting classification methodologies, their performance, and the challenges
faced in clinical translation. Accordingly, several advancements in AI models for AD classification are presented as follows.
    Convolutional Neural Networks (CNNs) are widely used for image-based AD detection
due to their ability to extract hierarchical features efficiently (Feng et al. 2019). They consist
of convolutional layers that learn spatial patterns and pooling layers that reduce compu-
tational complexity while preserving essential information. CNNs have proven effective
in tasks like brain tumor classification (Mohsen et al. 2023a), object detection (Zou et al.
2023), and segmentation (Sahu et al. 2018; Mohsen et al. 2024).
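As a concrete illustration of this idea, the following minimal sketch (not taken from any reviewed study) builds a small Keras CNN for four-class AD staging from 2D MRI slices; the input size, class count, and layer widths are illustrative assumptions.

```python
import tensorflow as tf
from tensorflow.keras import layers, models

# Minimal illustrative CNN for 4-class AD staging from 2D MRI slices.
# Input shape and hyperparameters are assumptions, not values from the review.
def build_ad_cnn(input_shape=(176, 176, 1), num_classes=4):
    model = models.Sequential([
        layers.Input(shape=input_shape),
        layers.Conv2D(32, 3, activation="relu"),   # learn local spatial patterns
        layers.MaxPooling2D(),                      # reduce spatial resolution
        layers.Conv2D(64, 3, activation="relu"),
        layers.MaxPooling2D(),
        layers.Conv2D(128, 3, activation="relu"),
        layers.GlobalAveragePooling2D(),            # pool features before classification
        layers.Dense(64, activation="relu"),
        layers.Dense(num_classes, activation="softmax"),
    ])
    model.compile(optimizer="adam",
                  loss="sparse_categorical_crossentropy",
                  metrics=["accuracy"])
    return model

model = build_ad_cnn()
model.summary()
```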
    Long Short-Term Memory (LSTM) networks are a type of recurrent neural network (RNN) designed for sequential data processing, addressing RNN limitations with memory cells and gating mechanisms.
The forget gate removes irrelevant data, while input and output gates regulate information
storage and usage (Nagabushanam et al. 2020; Chakraborty et al. 2020). Their ability to cap-
ture long-term dependencies makes them effective in tasks like speech recognition, machine
translation, and stock price prediction.
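For illustration, the sketch below (with assumed, hypothetical input shapes) shows how an LSTM layer can classify a short sequence of longitudinal visit features; the gating mechanisms described above are handled internally by the layer.

```python
import tensorflow as tf
from tensorflow.keras import layers, models

# Illustrative LSTM classifier over a sequence of longitudinal visit features
# (e.g., cognitive scores or imaging-derived measures). Shapes are assumptions.
num_visits, num_features = 5, 32
model = models.Sequential([
    layers.Input(shape=(num_visits, num_features)),
    layers.LSTM(64),                        # gates decide what to keep, forget, and output
    layers.Dense(32, activation="relu"),
    layers.Dense(1, activation="sigmoid"),  # e.g., stable MCI vs. converter
])
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
```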
    Hybrid models integrate multiple deep learning paradigms, such as CNNs with attention
mechanisms or RNNs, to enhance interpretability and performance (Mohsen et al. 2021a).
They are valuable for complex tasks like medical imaging and disease diagnosis, where
13
Alzheimer’s disease detection using deep learning and machine…             Page 5 of 39   262
balancing accuracy and explainability is crucial. Examples include CNN + RNN models
for analyzing time-series MRI data to predict disease progression and multi-branch hybrid
models that combine CNNs trained on different modalities (e.g., MRI, PET) with a shared
attention mechanism for comprehensive insights (Prakash et al. 2023).
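A hedged sketch of such a hybrid is given below: a small CNN encodes each scan in a longitudinal series and an LSTM models the temporal dependencies across visits. The number of time points, image size, and class labels are assumptions for illustration only.

```python
import tensorflow as tf
from tensorflow.keras import layers, models

# Illustrative hybrid: a small CNN encodes each scan in a longitudinal series,
# and an LSTM models progression across time points. All sizes are assumptions.
num_timepoints, h, w = 3, 176, 176

cnn_encoder = models.Sequential([
    layers.Input(shape=(h, w, 1)),
    layers.Conv2D(32, 3, activation="relu"),
    layers.MaxPooling2D(),
    layers.Conv2D(64, 3, activation="relu"),
    layers.GlobalAveragePooling2D(),
])

inputs = layers.Input(shape=(num_timepoints, h, w, 1))
x = layers.TimeDistributed(cnn_encoder)(inputs)     # spatial features per time point
x = layers.LSTM(64)(x)                              # temporal dependencies across scans
outputs = layers.Dense(3, activation="softmax")(x)  # e.g., NC / MCI / AD
hybrid = models.Model(inputs, outputs)
hybrid.compile(optimizer="adam", loss="sparse_categorical_crossentropy",
               metrics=["accuracy"])
```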
    Transfer learning relies on DL models that are pre-trained on huge datasets and then fine-tuned for particular tasks. These models, such as BERT for language tasks or VGG16 for image tasks, adapt general knowledge to domain-specific problems (Minoofam et al. 2023). Pre-trained models save time and computational
resources while achieving high performance in tasks like medical imaging, sentiment analy-
sis, and natural language processing. Also, pre-trained models such as Inception and ResNet
have been adapted for AD detection. Fine-tuning these networks on medical imaging data-
sets reduces the need for extensive labeled data and accelerates model convergence. For
example, combining CNNs with pre-trained VGG16 networks significantly improved clas-
sification accuracy in Alzheimer's studies.
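The following sketch illustrates this kind of transfer learning with an ImageNet-pretrained VGG16 backbone in Keras; the input resolution, class count, and head design are assumptions rather than settings taken from the cited studies.

```python
import tensorflow as tf
from tensorflow.keras import layers, models
from tensorflow.keras.applications import VGG16

# Illustrative transfer-learning setup: an ImageNet-pretrained VGG16 backbone
# with a new classification head for AD classes. Sizes/classes are assumptions.
base = VGG16(weights="imagenet", include_top=False, input_shape=(224, 224, 3))
base.trainable = False  # freeze backbone first; optionally unfreeze top blocks later

model = models.Sequential([
    base,
    layers.GlobalAveragePooling2D(),
    layers.Dense(128, activation="relu"),
    layers.Dropout(0.3),
    layers.Dense(4, activation="softmax"),  # e.g., non-demented ... moderate AD
])
model.compile(optimizer=tf.keras.optimizers.Adam(1e-4),
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
```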
    A generative adversarial network (GAN) consists of two networks—a generator and a discriminator—working in opposition. The generator creates fake data, while the discriminator evaluates whether the data is fake or real. Through this adversarial training process, the generator learns to produce highly realistic data. GANs are utilized in several tasks, for example image synthesis, deepfake creation, and data augmentation for training other models (Tiwari
et al. 2020).
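The sketch below outlines the two GAN components for synthesizing small grayscale brain-slice-like images; the image size and layer widths are arbitrary assumptions, and the adversarial training loop is omitted for brevity.

```python
import tensorflow as tf
from tensorflow.keras import layers, models

# Illustrative GAN components for synthesizing small grayscale brain-slice-like
# images. Only the two networks are sketched; the training loop is omitted.
latent_dim = 100

generator = models.Sequential([
    layers.Input(shape=(latent_dim,)),
    layers.Dense(22 * 22 * 64, activation="relu"),
    layers.Reshape((22, 22, 64)),
    layers.Conv2DTranspose(32, 3, strides=2, padding="same", activation="relu"),  # 44x44
    layers.Conv2DTranspose(1, 3, strides=2, padding="same", activation="tanh"),   # 88x88
])

discriminator = models.Sequential([
    layers.Input(shape=(88, 88, 1)),
    layers.Conv2D(32, 3, strides=2, padding="same", activation="relu"),
    layers.Conv2D(64, 3, strides=2, padding="same", activation="relu"),
    layers.Flatten(),
    layers.Dense(1, activation="sigmoid"),  # real vs. generated
])
```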
    An autoencoder is a neural network designed to learn efficient representations of data
through compression and reconstruction. It has two main components: an encoder that
compresses the input into a latent representation and a decoder that reconstructs the input
from this representation. Autoencoders are useful for tasks such as noise reduction, anomaly
detection, and generating new data. Variational Autoencoders (VAEs) and Convolutional
Autoencoders are common variants used in different domains (Failed 2019).
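As a minimal illustration, the following convolutional autoencoder compresses a 2D slice into low-resolution feature maps and reconstructs it; the input size and filter counts are assumptions.

```python
import tensorflow as tf
from tensorflow.keras import layers, models

# Illustrative convolutional autoencoder for unsupervised feature learning or
# denoising of 2D slices. Input size and layer widths are assumptions.
inputs = layers.Input(shape=(128, 128, 1))
x = layers.Conv2D(32, 3, activation="relu", padding="same")(inputs)
x = layers.MaxPooling2D()(x)                       # 64x64
x = layers.Conv2D(16, 3, activation="relu", padding="same")(x)
encoded = layers.MaxPooling2D()(x)                 # 32x32 latent feature maps

x = layers.Conv2D(16, 3, activation="relu", padding="same")(encoded)
x = layers.UpSampling2D()(x)                       # 64x64
x = layers.Conv2D(32, 3, activation="relu", padding="same")(x)
x = layers.UpSampling2D()(x)                       # 128x128
decoded = layers.Conv2D(1, 3, activation="sigmoid", padding="same")(x)

autoencoder = models.Model(inputs, decoded)
autoencoder.compile(optimizer="adam", loss="mse")  # reconstruct the input slice
```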
    Vision transformer (ViT) introduces a transformer-based architecture to image process-
ing, replacing the traditional convolutional approach. It divides images into patches, which
are treated as input tokens for a transformer model. By applying self-attention mechanisms,
ViTs focus on relationships between patches, making them effective for tasks like object
detection, image classification, and segmentation. Their scalability allows them to excel
with large datasets, though they may require significant computational resources (Gong et
al. 2024).
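A heavily simplified, hedged sketch of the ViT idea is shown below: patches are embedded with a strided convolution and passed through a single self-attention encoder block (positional embeddings and repeated blocks are omitted); all sizes are assumptions.

```python
import tensorflow as tf
from tensorflow.keras import layers, models

# Illustrative, heavily simplified ViT-style classifier: convolutional patch
# embedding plus one self-attention encoder block. Positional embeddings and
# stacked blocks are omitted for brevity; all sizes are assumptions.
image_size, patch_size, embed_dim, num_classes = 144, 16, 64, 3
num_patches = (image_size // patch_size) ** 2   # 81 non-overlapping patches

inputs = layers.Input(shape=(image_size, image_size, 1))
x = layers.Conv2D(embed_dim, patch_size, strides=patch_size)(inputs)  # patch embedding
x = layers.Reshape((num_patches, embed_dim))(x)

# One transformer encoder block (pre-norm, residual connections)
y = layers.LayerNormalization()(x)
y = layers.MultiHeadAttention(num_heads=4, key_dim=embed_dim // 4)(y, y)
x = layers.Add()([x, y])
y = layers.LayerNormalization()(x)
y = layers.Dense(embed_dim * 2, activation="gelu")(y)
y = layers.Dense(embed_dim)(y)
x = layers.Add()([x, y])

x = layers.GlobalAveragePooling1D()(x)
outputs = layers.Dense(num_classes, activation="softmax")(x)
vit = models.Model(inputs, outputs)
```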
    Deep neural network (DNN) is a general framework for neural networks with multiple
layers. It includes an input layer, hidden layers, and an output layer, with each neuron in
one layer linked to each neuron in the next. DNNs use optimization algorithms like gradi-
ent descent and backpropagation to learn patterns from data. This versatile architecture is
applied across diverse domains, including image recognition and financial modeling, offer-
ing a robust foundation for many deep learning solutions (Wu et al. 2024).
    Attention-based models, such as dual-attention mechanisms, focus on clinically relevant regions, improving detection accuracy and reducing false positives. A common recent attention design is based on the squeeze-and-excitation (SE) block. Examples
include self-attention in transformers or spatial and channel attention in convolutional set-
tings. Attention models can highlight critical regions in brain MRI scans that contribute to
disease diagnosis, aiding clinicians in understanding model decisions (Li et al. 2024).
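A minimal sketch of an SE block in Keras is given below; the reduction ratio is an assumed hyperparameter, and the block would typically be inserted after a convolutional stage.

```python
import tensorflow as tf
from tensorflow.keras import layers

# Illustrative squeeze-and-excitation (SE) block: channel descriptors are
# "squeezed" by global pooling and used to re-weight the feature maps.
def se_block(feature_maps, reduction=8):
    channels = feature_maps.shape[-1]
    s = layers.GlobalAveragePooling2D()(feature_maps)            # squeeze
    s = layers.Dense(channels // reduction, activation="relu")(s)
    s = layers.Dense(channels, activation="sigmoid")(s)          # excitation weights
    s = layers.Reshape((1, 1, channels))(s)
    return layers.Multiply()([feature_maps, s])                  # channel re-weighting
```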
datasets (Vijay and Verma 2023). Figure 4 summarizes the most common AI models for AD classification.
Artificial Intelligence (AI) has revolutionized the field of medical imaging and diagnostic analysis. Its application in Alzheimer's disease (AD) detection spans a wide range of clinical, research, and technological domains, covering diagnostic, therapeutic, and research uses. By leveraging advanced AI architectures and integrating multimodal data, these models hold immense potential to improve early detection, streamline clinical workflows, and enhance personalized care for AD patients. We present many key applications, highlighting their impact, as follows:
1. Early Diagnosis and Risk Assessment: DL models enable early detection of MCI, a
    precursor to AD, from structural and functional brain imaging. This application enables timely interventions that slow disease progression and aids in personalized treatment planning (Shuvo et al. 2022).
2. Clinical Decision Support Systems (CDSS): Automated diagnostic tools powered by
    DL assist clinicians in identifying AD from MRI, PET, or CT scans with high accuracy.
    This application reduces diagnostic errors and inter-observer variability. Additionally, it saves time for clinicians by automating feature extraction and classification (Duarte de Almeida and Oliveira 2019).
3. Neuroimaging Biomarker Identification: DL models help identify biomarkers from
    imaging data, such as hippocampal atrophy, that correlate with AD. The benefit of this
    application is to enhance understanding of disease pathology and support the develop-
    ment of targeted therapies (Skolariki et al. 2020).
4. Disease Progression Prediction: Predicting the transition from MCI to AD using
    sequential imaging data analyzed by DL models (e.g., hybrid CNN-RNN architectures).
    This application helps in monitoring disease progression and informs long-term care strategies and research studies (Guarín et al. 2024).
5. Personalized Treatment Monitoring: DL analyzes neuroimaging and clinical data to
    assess the performance of treatments and interventions. This application enables dynamic adjustments to treatment plans and facilitates real-time monitoring of therapeutic outcomes (Koutkias et al. 2010).
6. Cognitive Function Prediction: Predicting cognitive scores (e.g., MMSE) based on
    imaging and non-imaging data using DL regression models. The benefit of this applica-
    tion is to offer a non-invasive proxy for cognitive testing and support routine clinical
    evaluations (Zhang et al. 2013).
7. Drug Development and Clinical Trials: DL aids in patient stratification by accurately
    classifying participants into healthy, MCI, and AD groups, ensuring homogeneous
    study cohorts. This reduces heterogeneity in clinical trials and facilitates
    drug discovery and efficacy testing (Sonka and Grunkin 2002).
8. Public Health Screening: Large-scale, automated AD screening using DL-powered
    platforms integrated with imaging tools (e.g., mobile MRI units or cloud-based sys-
    tems). This application is used to expand access to diagnostic services in underserved
    regions and support epidemiological studies and public health initiatives (Fletcher et al.
    2017).
9. Integration with Wearable Devices: Combining DL models with data from wearable
    sensors that track sleep, physical activity, or speech patterns to predict cognitive decline.
    This application offers non-invasive, continuous monitoring of at-risk individuals and promotes early-stage intervention (Alattar and Mohsen
    2023).
10. Real-Time Diagnosis in Telemedicine: DL models integrated into telemedicine plat-
    forms analyze remotely acquired imaging and clinical data to provide diagnostic
    insights. This application increases diagnostic reach, especially in rural areas, and reduces the patient burden of traveling for clinical assessments (Yan and Song 2010).
11. Cross-Modality Analysis: DL combines data from various imaging modalities (e.g.,
    MRI and PET) for more comprehensive diagnostics. This application has two effects: the first is improving classification accuracy, while the second is providing a holistic view of structural and functional changes in the brain (Feng et al. 2021).
12. Educational Tools for Medical Training: DL-powered visualizations (e.g., Grad-CAM,
    saliency maps) are used as teaching tools for medical students and radiologists to inter-
    pret AD-related changes. There are two benefits for this application: enhancing learning
    through real-world data and reducing the gap between practical application and theo-
    retical knowledge (Sakuma 2013).
13. Synthetic Data Generation: Generative Adversarial Networks (GANs) create synthetic
    neuroimaging data to augment limited datasets. It addresses data scarcity and class
    imbalance issues and facilitates training of more robust DL models (Failed 2023). Figure 5 shows several applications of DL in Alzheimer's disease detection.
There are publicly available datasets in the field of AD, such as ADNI (Alzheimer's Disease Neuroimaging Initiative), a comprehensive dataset with MRI and PET images; OASIS (Open Access Series of Imaging Studies), which focuses on aging and AD detection; and the AIBL (Australian Imaging, Biomarkers, and Lifestyle) dataset, which provides longitudinal neuroimaging data.
   The success of DL models in Alzheimer’s disease classification heavily relies on high-
quality, well-labeled datasets. Several public datasets have become the benchmark for
researchers in this field, allowing for reproducible results and fostering collaboration among
researchers.
HMS promotes collaboration and inclusivity by offering its dataset freely to the research
community. However, as a newer initiative compared to established datasets like ADNI
and OASIS, HMS may currently have a more limited data volume, potentially restricting
the depth of analysis. Furthermore, while the dataset is publicly accessible, specific details
regarding data types and access protocols might not be as well-documented or user-friendly
as those provided by ADNI and OASIS.
   Table 1 compares several of these datasets. Figure 6 illustrates the different datasets, preprocessing techniques, and processing tools used with AD.
   All these datasets require different preprocessing techniques, such as normalization and scaling to ensure consistent intensity across scans, data augmentation to expand dataset size and improve model robustness, and shuffling to prevent data leakage and
enhance generalization. Preprocessing techniques are therefore very important for obtaining highly effective AI models.
1. Data augmentation is a technique used to artificially enlarge the size and diversity
   of datasets by employing different transformations to the existing data. This process
   is especially valuable in scenarios where the available data is limited. For images,
   common augmentation methods include flipping, rotating, zooming, cropping, add-
   ing noise, and color shifting. For text, techniques such as random insertion, synonym
   replacement, deletion, or translation are used. The primary goal of data augmentation is
   to enhance a model's prediction ability, reduce overfitting, and improve robustness by exposing it to a broader range of variations in the input data (Nanthini et al. 2023); a brief preprocessing sketch is given after this list.
2. Normalization is a preprocessing step that changes the scale of data features to bring
   them within a consistent range, ensuring comparability. In image processing, normal-
   ization typically involves scaling values of pixels from 0–255 to a range such as 0–1 or
   -1 to 1. This technique is beneficial because it speeds up the training process, ensures a
   more uniform feature distribution, and helps machine learning models converge faster.
   A common normalization formula involves subtracting the average and dividing by the
   standard deviation of the dataset (Huang et al. 2023).
3. Skull stripping is a preprocessing step in medical imaging, particularly for MRI scans.
   This technique involves removing the skull and other non-brain tissues from the image,
   leaving only the brain region. Skull stripping enhances the focus on brain-specific areas
   by eliminating irrelevant features and noise, which, in turn, enhances the efficiency of
   models analyzing these images. Methods for skull stripping include threshold-based
   techniques, region-growing algorithms, and advanced deep learning-based automated
   approaches (Singh et al. 2023).
4. Shuffling is the process of randomizing the order of data samples before feeding them
   into a machine learning model. This step is crucial in preventing the model from learn-
   ing biases or patterns that are specific to the order of the data. By shuffling, we ensure
   that the training process does not inadvertently favor certain sequences or classes,
   enhancing the model's ability to predict. Shuffling is often used in conjunction with
   stochastic gradient descent (SGD) during model training to maintain randomness and
   variability in the input (Nehal et al. 2023).
5. Gradient-weighted class activation mapping (Grad-CAM) is a visualization tech-
   nique designed to provide insights into the decision-making process of deep learning
   models, particularly CNNs. It works by leveraging the gradients flowing into the last
   convolutional layer of the model to compute a weighted activation map. This map is
   then used to create a heatmap that highlights the regions in an input image that con-
   tribute most to the model's prediction. Grad-CAM is especially useful for debugging, understanding model behavior, and improving interpretability by revealing which parts of an image influenced the decision (Quach et al. 2023); a minimal Grad-CAM sketch is also provided after this list.
6. Salience maps are another visualization tool that highlights the most important regions
   of an input image influencing a model's output. These maps are generated by computing
     the gradients of the model's output with respect to the input image, showing the sen-
     sitivity of the output to changes in specific input features. Salience maps are widely
     used in tasks like feature importance analysis, object detection, and attention mecha-
     nisms, providing a clearer understanding of the model's focus and decision-making
     process (Mukherjee et al. 2015). Table 2 shows comparisons of several preprocessing
     techniques.
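To tie several of the techniques above together (normalization, augmentation, and shuffling), the following is a hedged sketch of a tf.data training pipeline; the dataset object, batch size, and augmentation strengths are assumptions.

```python
import tensorflow as tf
from tensorflow.keras import layers

# Illustrative preprocessing pipeline combining normalization, light augmentation,
# and shuffling for 2D slices; the input dataset and parameters are assumptions.
augment = tf.keras.Sequential([
    layers.RandomFlip("horizontal"),
    layers.RandomRotation(0.05),
    layers.RandomZoom(0.1),
])

def preprocess(image, label):
    image = tf.cast(image, tf.float32) / 255.0   # scale pixel values to [0, 1]
    return image, label

def make_training_pipeline(dataset, batch_size=32):
    # `dataset` is assumed to be a tf.data.Dataset of (image, label) pairs.
    return (dataset
            .shuffle(buffer_size=1000)           # randomize sample order
            .map(preprocess, num_parallel_calls=tf.data.AUTOTUNE)
            .batch(batch_size)
            .map(lambda x, y: (augment(x, training=True), y))
            .prefetch(tf.data.AUTOTUNE))
```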
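The following is a minimal Grad-CAM sketch for a Keras CNN; the trained model and the name of its last convolutional layer are assumed inputs, and the returned heatmap would still need to be resized and overlaid on the original slice.

```python
import numpy as np
import tensorflow as tf

# Illustrative Grad-CAM for a Keras CNN. `model` and `last_conv_name` are
# assumed inputs (any trained CNN with a named final convolutional layer).
def grad_cam(model, image, last_conv_name, class_index=None):
    grad_model = tf.keras.Model(
        model.inputs,
        [model.get_layer(last_conv_name).output, model.output],
    )
    with tf.GradientTape() as tape:
        conv_maps, preds = grad_model(image[None, ...])     # add batch dimension
        if class_index is None:
            class_index = int(tf.argmax(preds[0]))          # explain the top class
        score = preds[:, class_index]
    grads = tape.gradient(score, conv_maps)                 # d(score)/d(feature maps)
    weights = tf.reduce_mean(grads, axis=(1, 2))            # global-average-pooled grads
    cam = tf.reduce_sum(conv_maps * weights[:, None, None, :], axis=-1)[0]
    cam = tf.nn.relu(cam)                                   # keep positive influence only
    cam = cam / (tf.reduce_max(cam) + 1e-8)                 # normalize to [0, 1]
    return cam.numpy()                                      # upsample/overlay on the MRI slice
```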
Techniques like SHAP and LIME provide powerful ways to understand and explain AI
model decisions, increasing clinician trust and promoting AI adoption in healthcare. By
integrating these methods into decision-support systems, we can improve diagnostic accu-
racy and treatment outcomes while ensuring greater acceptance by both doctors and patients.
SHAP is based on Shapley values from game theory, which analyze the contribution of
each feature to a model’s prediction. The core idea is to compute the average impact of each
feature across all possible feature orderings, providing both global and local interpretability
of the model. SHAP has three advantages: first, it provides a consistent and fair explanation of feature importance; second, it can be applied to deep learning models, random forests, and regression models; and third, it offers global insights into the model's behavior as well as local explanations for individual predictions.
LIME focuses on local interpretability by creating a simplified model (e.g., linear regres-
sion) to approximate the predictions of the original model in a small neighborhood around a
specific data point. This is achieved by generating perturbed data and analyzing how feature
variations affect predictions. The advantages of LIME are that it is fast and effective in explaining individual predictions; it is model-agnostic, meaning it can be applied to any machine learning model; and it helps in understanding why a model made a particular decision for a specific patient.
In healthcare, clinicians rely on logical and explainable reasoning when making critical
decisions. If they understand how and why a model provides a certain diagnosis, they will
have more confidence in using it. The ability to justify predictions is crucial, as medical
professionals need to ensure that AI-driven recommendations align with established clinical
knowledge and patient history. Techniques like SHAP and LIME enhance trust by offering
clear explanations of model decisions, allowing doctors to compare AI-generated insights
with their own expertise. This transparency reassures clinicians that AI is a supportive tool
rather than an unpredictable "black-box" system. One of the biggest barriers to AI adoption
in healthcare is the concern that models operate as opaque "black boxes," making decisions
without clear reasoning. Interpretable AI models address this issue by revealing which fac-
tors contribute to specific predictions, increasing transparency and credibility.
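As a small, self-contained illustration of SHAP (using random placeholder data rather than real clinical features), the sketch below explains a simple tabular classifier with the model-agnostic KernelExplainer; a deep-learning-specific explainer could be substituted for imaging models.

```python
import numpy as np
import shap                                        # pip install shap
from sklearn.ensemble import RandomForestClassifier

# Illustrative SHAP usage on a tabular AD-vs-healthy classifier; the data are
# random placeholders standing in for clinical/volumetric features.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 6))                      # 6 hypothetical features
y = (X[:, 0] + 0.5 * X[:, 1] > 0).astype(int)      # synthetic binary label

clf = RandomForestClassifier(n_estimators=50, random_state=0).fit(X, y)

f = lambda data: clf.predict_proba(data)[:, 1]     # probability of the "AD" class
explainer = shap.KernelExplainer(f, shap.sample(X, 50))   # model-agnostic explainer
shap_values = explainer.shap_values(X[:5])         # per-feature contributions, shape (5, 6)
print(np.round(shap_values, 3))
```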
Several recent studies in the field of AD are presented in (Slimi et al. 2024; Ching et al. 2024; Alsubaie et al. 2024; Jo et al. 2019; Lu et al. 2019; Iriondo et al. 2203; Raza et al. 2023; Gnanasegar et al. 2020). In (Slimi et al. 2024), the hybrid model integrating Xception
and DenseNet121 architectures has demonstrated outstanding performance, achieving an
accuracy of 99.85% for AD classification. This success is attributed to the combination of
Xception’s efficient depthwise separable convolutions and DenseNet121’s connected lay-
ers, which enhance feature extraction and improve the model's representational capacity.
The dataset used consists of preprocessed MRI scans. The use of the Synthetic Minority Over-sampling Technique (SMOTE) effectively addressed class imbalance, ensuring better generalization
across underrepresented categories. However, this hybrid model comes with notable chal-
lenges. The integration of two complex architectures results in high computational cost and
increased processing overhead, which poses difficulties for real-world scalability. Addition-
ally, the model's complexity may limit its adaptability to settings with constrained computa-
tional resources, highlighting the trade-off between performance and practicality.
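For readers unfamiliar with SMOTE, the following hedged sketch shows its typical usage from the imbalanced-learn package on placeholder feature vectors; it is not the exact pipeline of Slimi et al.

```python
from collections import Counter
import numpy as np
from imblearn.over_sampling import SMOTE   # pip install imbalanced-learn

# Illustrative use of SMOTE to rebalance class counts before training a
# classifier on extracted features. The data below are random placeholders.
rng = np.random.default_rng(0)
X = rng.normal(size=(300, 20))                       # e.g., CNN-extracted features
y = np.array([0] * 250 + [1] * 50)                   # imbalanced labels

X_resampled, y_resampled = SMOTE(random_state=42).fit_resample(X, y)
print(Counter(y), "->", Counter(y_resampled))        # minority class synthetically expanded
```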
    In (Ching et al. 2024), the EfficientNet-B0 model, leveraging transfer learning, offers
a lightweight and efficient solution for AD detection, reaching an accuracy of 87.17%. Its
streamlined architecture makes it particularly appropriate for deployment on systems with
limited computational resources, like edge devices or smaller healthcare facilities. Despite
its advantages, the model has limitations in handling complex feature representations due to
13
Alzheimer’s disease detection using deep learning and machine…           Page 15 of 39   262
its reduced network depth, which can impact its ability to discern subtle patterns in medical
imaging data. Additionally, its performance is relatively lower compared to deeper archi-
tectures, especially in multi-class classification tasks where a greater level of feature dis-
crimination is required. These trade-offs highlight the balance between model efficiency and
accuracy, with EfficientNet-B0 excelling in resource-constrained environments but facing
challenges in more demanding classification scenarios. The datasets used are MRI scans divided into four classes (from non-demented to severe AD).
    In (Alsubaie et al. 2024), Capsule Networks (CapsNets) have shown promising potential
in Alzheimer’s disease classification because of their robust feature representation capa-
bilities. CapsNets excel in capturing spatial hierarchies and relationships between features,
which is particularly beneficial for analyzing complex brain imaging data. Their architec-
ture also helps reduce overfitting, even when applied to datasets with variations in segmen-
tation, enhancing model robustness. Despite these advantages, CapsNets face significant
challenges, including high computational demand and slower training times compared to
traditional CNNs. Additionally, their scalability remains limited, as handling large-scale
datasets efficiently is difficult. These trade-offs make CapsNets a compelling, yet computa-
tionally intensive, alternative for medical imaging tasks.
    In (Jo et al. 2019), deep Boltzmann machines (DBMs) have been applied to AD detection
by integrating multi-modal neuroimaging data, demonstrating their effectiveness in pre-
dicting the conversion from MCI to AD. By leveraging data from multiple sources, such
as MRI, PET, and clinical measures, DBMs capture complementary features that enhance
prediction accuracy and provide a holistic view of patient conditions. However, their effec-
tiveness depends heavily on the availability of high-quality, multi-modal datasets, which
are often challenging to collect in clinical practice. Additionally, the training of DBMs is
computationally intensive, requiring significant processing power and time. These limita-
tions, while notable, are balanced by their ability to process and synthesize complex, diverse
datasets, offering valuable insights in early diagnosis and disease progression modeling.
    In (Lu et al. 2019), the CNN-LSTM hybrid model combines the spatial feature extrac-
tion capabilities of CNNs with the temporal sequence processing strength of LSTM networks,
achieving a high accuracy of 98.5% on segmented datasets for Alzheimer’s disease detec-
tion. This integration allows the model to capture both spatial patterns from medical imaging
and temporal dependencies, such as progressive changes in brain scans over time. However,
this approach necessitates advanced pre-processing techniques, including segmentation
and normalization, to optimize data quality and ensure effective model training. Addition-
ally, the hybrid architecture introduces increased training complexity and higher memory
requirements, making it more computationally demanding compared to simpler models.
Despite these challenges, the CNN-LSTM hybrid model is a powerful tool for analyzing
time-sequenced medical data in Alzheimer’s research. This work utilized two MRI datasets
with Alzheimer’s disease stages.
    In (Iriondo et al. 2203), the DeepAD model leverages advanced DL techniques for pre-
dicting AD progression by integrating clinical information and 3D MRI scans from multiple
cohorts. A key strength of DeepAD lies in its ability to address domain adaptation and inter-
study biases through adversarial training, which enhances its robustness across diverse data-
sets. The model effectively combines clinical data with imaging features, utilizing mutual
information loss to promote domain generalization and improve prediction consistency
across varying data sources. Despite its strengths, DeepAD’s multi-network architecture
Traditional methods (MMSE, CDR, PET) remain valuable, but have limitations in early
detection, cost, and scalability, while AI models outperform them in accuracy, efficiency,
and early detection, especially when using deep learning on MRI and PET data. AI could
augment traditional methods rather than replace them, assisting clinicians in making faster,
more precise diagnoses.
   Traditional AD diagnostic methods include the Mini-Mental State Examination (MMSE), the Clinical Dementia Rating (CDR), and PET. The MMSE is a simple 30-question cognitive test used to assess memory, attention, and problem-solving; however, it lacks sensitivity to early-stage AD and is influenced by education level and language skills. The CDR is a clinician-led structured interview evaluating six domains (memory, orientation, problem-solving, etc.); its results depend on expert interpretation, making it prone to variability. PET scans, which detect amyloid-beta and tau protein deposits, biomarkers of AD, are highly accurate but expensive and not widely available.
   AI-based models can analyze MRI/PET changes and detect neurodegeneration years
before cognitive symptoms appear. Some models achieve > 90% accuracy, compared to
MMSE’s ~ 80%. DL models use MRI scans to objectively quantify brain atrophy in key
regions (e.g., hippocampus), reducing subjectivity. AI models are more accessible and less
costly, while still achieving high diagnostic performance. Some DL models trained on MRI
data achieve accuracy comparable to PET-based assessments. Table 4 shows a comparison
of AI Approaches versus Traditional AD Diagnostic Methods.
Application: AI models like Google's DeepMind and IBM Watson Health analyze X-rays,
MRIs, and CT scans to detect diseases such as cancer, pneumonia, and brain hemorrhages.
Model: Deep learning-based Convolutional Neural Networks (CNNs), use case: radiology
and pathology. Adaptability: AI models trained on large, diverse datasets like the ChestX-
ray14 dataset (NIH) can be fine-tuned with local hospital data to improve specificity. Also,
transfer learning enables adaptation to new datasets with fewer labeled images, allowing
hospitals in different regions to deploy these systems efficiently.
                                                                                                13
 262     Page 18 of 39                                                                        S. Mohsen
Application: AI-powered sepsis watch system at Duke University analyzes vital signs, lab
results, and medical history to predict sepsis risk in ICU patients. Model: Recurrent Neural
Networks (RNNs) and Long Short-Term Memory (LSTM) networks, use case: early sepsis
detection. Adaptability: the model integrates data from Electronic Health Records (EHR)
such as MIMIC-III (MIT). Also, continuous training on new patient data ensures adaptabil-
ity to different populations and hospital workflows.
13
Alzheimer’s disease detection using deep learning and machine…           Page 19 of 39   262
and adapts to different patient anatomies. Also, federated learning allows hospitals to share
anonymized data to improve model performance without compromising patient privacy.
   When evaluating the potential clinical adoption of different models, several key factors
must be considered alongside accuracy: clinical adoption of models depends on interpret-
ability, scalability, and efficiency. Explainable AI (XAI) methods enhance trust by provid-
ing clear insights into model decisions. Scalability factors, such as computational cost and
EHR integration, affect real-world deployment, with lighter or hybrid models being more
feasible. Balancing accuracy with efficiency is crucial, as high-performance models with
excessive computational demands may be impractical for widespread clinical use.
Scalability refers to an AI model’s ability to handle increasing amounts of data, users, and
healthcare facilities while maintaining efficiency. The challenges to scalability are as follows. Computational costs: large AI models require significant processing power, limiting deployment in resource-constrained hospitals. Integration with EHR systems: hospitals use different Electronic Health Record (EHR) formats (e.g., Epic, Cerner, Meditech), making AI integration complex. Regulatory and compliance hurdles: AI models must meet local regulations (e.g., HIPAA, GDPR) before large-scale deployment.
Overfitting occurs when a trained model becomes overly adapted to the training data, lead-
ing to excellent performance on training data but poor generalization to test or unseen data.
This means the model memorizes patterns specific to the training set rather than learning
general trends that can be applied to new data.
   Overfitting typically happens when the model is too complex, containing too many
parameters relative to the dataset size. It also occurs when the model fits the training data
too tightly, capturing noise or irrelevant patterns instead of useful trends. Additionally, if
the model relies on unnecessary or irrelevant features, it becomes ineffective when tested
on new datasets.
   There are several signs that indicate a model is overfitting. The most common sign is an
extremely high accuracy on training data but a significant drop in performance on test data.
Another indicator is a large gap between training accuracy and test accuracy, showing that
the model is not generalizing well. If a model is excessively complex without improving
real-world predictions, it is also likely to be overfitting.
To reduce overfitting, one effective method is increasing the dataset size. More data helps
the model generalize better instead of memorizing training samples. When additional data
is not available, data augmentation techniques, such as image rotation, flipping, brightness
adjustments, and adding noise, can artificially expand the dataset. Another approach is to
simplify the model. Reducing the number of layers in neural networks or selecting a less
complex algorithm can prevent overfitting. Similarly, reducing the number of parameters in
the neural network architecture helps in making the model more generalizable. Cross-vali-
dation is also a useful technique for mitigating overfitting. K-Fold Cross-Validation ensures
that the model is tested on multiple subsets of data, reducing the likelihood of overfitting
by providing a better estimate of its performance on unseen data. Regularization techniques
play a crucial role in preventing overfitting. L1 regularization (Lasso) penalizes the absolute
values of parameters, forcing some of them to become zero and effectively selecting only
important features. L2 regularization (Ridge), on the other hand, penalizes the squared val-
ues of parameters, reducing their magnitude without eliminating them, which results in a
more stable model. For neural networks, dropout is an effective regularization method that
randomly disables neurons during training to prevent the model from depending too much
on specific features. Feature selection is another key strategy. Removing unnecessary or
low-impact features reduces the complexity of the model, making it less prone to overfit-
ting. Dimensionality reduction techniques such as Principal Component Analysis (PCA)
can help retain only the most significant features while discarding irrelevant ones. Adding
noise to training data is another way to prevent overfitting. By introducing random varia-
tions, the model learns to generalize better instead of relying on minor patterns that may not
be present in real-world data. This is especially useful in neural networks to enhance their
robustness. Lastly, early stopping is a widely used technique to avoid overfitting. By moni-
toring the model's performance on a validation set, training can be stopped when validation
accuracy starts to decline, preventing the model from memorizing the training data instead
of generalizing.
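The sketch below combines several of these mitigation techniques (L2 weight penalties, dropout, and early stopping on a validation set) in a single Keras model; the architecture and hyperparameters are illustrative assumptions, and the training call is commented out because the dataset objects are placeholders.

```python
import tensorflow as tf
from tensorflow.keras import layers, models, regularizers

# Illustrative combination of overfitting mitigations: L2 regularization,
# dropout, and early stopping. Sizes and hyperparameters are assumptions.
model = models.Sequential([
    layers.Input(shape=(176, 176, 1)),
    layers.Conv2D(32, 3, activation="relu",
                  kernel_regularizer=regularizers.l2(1e-4)),
    layers.MaxPooling2D(),
    layers.Conv2D(64, 3, activation="relu",
                  kernel_regularizer=regularizers.l2(1e-4)),
    layers.GlobalAveragePooling2D(),
    layers.Dropout(0.5),                        # randomly disable units during training
    layers.Dense(4, activation="softmax"),
])
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])

early_stop = tf.keras.callbacks.EarlyStopping(
    monitor="val_loss", patience=5, restore_best_weights=True)  # stop before memorization

# `train_ds` and `val_ds` are placeholder tf.data datasets:
# model.fit(train_ds, validation_data=val_ds, epochs=100, callbacks=[early_stop])
```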
In this section, we explain the different modalities used for Alzheimer's disease detection. Each
modality has unique strengths, making them complementary tools for diagnosis and research
in various medical fields, especially neurology and oncology.
    Computed tomography (CT) is a medical imaging technique that utilizes X-rays to pro-
duce detailed cross-sectional images of the body. The patient lies in a rotating scanner where
X-rays capture data, which a computer reconstructs into 2D or 3D images. CT scans are
commonly used to detect tumors, fractures, and internal bleeding, making them invalu-
able in emergency diagnostics due to their speed. They give high-resolution images of both
bones and soft tissues, but CT scans involve exposure to ionizing radiation and are less
effective than MRI for visualizing soft tissue contrast (Gao and Hui 2016).
    Structural magnetic resonance imaging (sMRI) is a non-invasive imaging technique
that utilizes robust magnetic fields to produce highly detailed images of body structures.
It is particularly useful for visualizing brain anatomy, helping detect abnormalities such
as tumors or atrophy. sMRI plays a key role in studying structural changes in neurodegen-
erative diseases like Alzheimer’s. It offers excellent soft tissue contrast without exposing
patients to ionizing radiation. However, it is time-consuming, expensive, and unsuitable for
individuals with metal implants or severe claustrophobia (SSCS and B U 2023).
    Positron emission tomography (PET) is a functional imaging technique that reveals met-
abolic or biochemical activity in tissues. A radioactive tracer is injected into the body, and
as it decays, the PET scanner detects the emitted gamma rays to generate images. PET scans
are widely used to identify cancer, monitor its spread, and assess brain activity in condi-
tions like Alzheimer’s. They are also effective in evaluating heart function and blood flow.
While PET provides valuable functional data and is sensitive to abnormal cellular activity,
it involves exposure to radioactive materials and has lower spatial resolution compared to
MRI or CT (Salah et al. 2024).
    Diffusion tensor imaging (DTI) is a form of MRI that maps the diffusion of water mol-
ecules within tissues, primarily to study neural pathways in the brain. By measuring the
direction of water diffusion, DTI visualizes white matter tracts and brain connectivity. It is
particularly useful for diagnosing conditions like traumatic brain injury, multiple sclerosis,
and stroke, as well as for research in neural development and disorders. DTI is non-invasive
                                                                                   13
 262      Page 22 of 39                                                                  S. Mohsen
and effective for assessing white matter integrity but is sensitive to motion artifacts and
requires complex data analysis (Sang and Li 2024).
    Functional magnetic resonance imaging (fMRI) measures brain activity by detecting changes in blood flow associated with neural activity (Yang et al. 2024). This technique uses
the Blood Oxygen Level-Dependent (BOLD) signal to pinpoint active brain regions during
specific tasks or rest. fMRI is a vital tool in cognitive neuroscience, mapping functions like
motor and sensory areas, and aiding pre-surgical planning for brain tumor removal. While it
is non-invasive and offers high spatial resolution, fMRI is an indirect measure of neuronal
activity, relying on blood flow changes, and is highly sensitive to motion artifacts.
   Figure 7 illustrates the percentage utilization of each imaging modality for Alzheimer's disease.
In Alzheimer's disease (AD) classification, assessing the performance of deep learning mod-
els is essential to ensure their reliability and clinical applicability. Various evaluation metrics
are employed to assess model performance, each providing unique insights into how well
a model distinguishes between healthy and AD patients. Standard metrics for model assessment include precision, recall, accuracy, and F1-score to quantify classification performance; the area under the receiver operating characteristic curve (ROC-AUC) to analyze discriminative ability; and various loss metrics to evaluate convergence during training. Each metric has its advantages and is suitable for different types of analysis, based on the nature of the dataset, the issue at hand, and the consequences of model errors. The following are some of the most commonly used metrics, along with their equations (Mohsen et al. 2023b; Chicco et al. 2021; Patil and Nisha 2021).
10.1 Accuracy
Accuracy is one of the most basic and most frequently utilized metrics in classification tasks. It represents
the proportion of correctly predicted samples (both positive and negative) to the overall
number of instances. In the context of Alzheimer’s classification, accuracy calculates how
often the model correctly classifies both the Alzheimer's patients and healthy individuals.
Equation (1) represents the accuracy.
$$\text{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN} \quad (1)$$
10.2 Precision
In medical classification tasks like AD detection, where the consequences of false positives (FPs) and false negatives (FNs) are significant, additional measures like precision, recall, and F1-score are preferred. Precision (also known as the positive predictive value, PPV) measures the proportion of true positive predictions among
all the positive predictions made by a model. This is especially important when false posi-
tives (misdiagnosing healthy individuals as AD patients) are a concern. It is represented by
Eq. (2).
$$\text{Precision} = \frac{TP}{TP + FP} \quad (2)$$
10.3 Recall
Recall, also known as sensitivity, indicates the proportion of actual positive cases that were correctly identified by a model. Recall is particularly important when false negatives (failing to
identify AD patients) are a concern. Recall is calculated by Eq. (3).
$$\text{Recall} = \frac{TP}{TP + FN} \quad (3)$$
10.4 F1-score
It is the harmonic mean of precision and recall, providing a balance between them. The F1-score is especially beneficial when the dataset is unbalanced, as it provides a more balanced view of the model's performance than accuracy alone. Studies in AD detection often
prioritize F1-score because misdiagnosing AD patients (false negatives) can be more dan-
gerous than misclassifying healthy individuals. It is determined through Eq. (4).
$$\text{F1-score} = 2 \times \frac{\text{Precision} \times \text{Recall}}{\text{Precision} + \text{Recall}} \quad (4)$$
10.5 ROC curve and AUC
The ROC curve is another popular evaluation tool, particularly in binary classification tasks like Alzheimer's disease detection. It plots the true positive rate (recall) against the false positive rate (FPR), which is defined in Eq. (5).
$$FPR = \frac{FP}{FP + TN} \quad (5)$$
The area under the curve (AUC) quantifies the ability of the model to distinguish between
the two classes (AD vs. healthy). A value of 1 refers to perfect classification, while a value
of 0.5 suggests that the model performs no better than random guessing. AUC-ROC is
particularly useful when comparing models with different thresholds for decision-making.
Higher AUC values indicate better model discrimination between AD and non-AD cases.
AUC-ROC is estimated via Eq. (6).
$$AUC = \int_{0}^{1} TPR \, d(FPR) \quad (6)$$
10.6 Matthews correlation coefficient (MCC)
The Matthews correlation coefficient (MCC) is another metric that is valuable for evaluating binary classification models, especially
when dealing with imbalanced datasets. Unlike accuracy, MCC takes into account all four
confusion matrix elements (TP, TN, FP, FN) and provides a balanced metric of prediction
performance. MCC ranges from -1 (perfect inverse prediction) to +1 (perfect prediction), with 0 referring to random predictions. MCC is particularly useful when both classes in the dataset are of similar importance, as it considers the trade-offs between false positives (FPs) and false negatives (FNs). Equation (7) represents the MCC.
$$MCC = \frac{(TP \times TN) - (FP \times FN)}{\sqrt{(TP + FP)(TP + FN)(TN + FP)(TN + FN)}} \quad (7)$$
10.7 Confusion matrix
The confusion matrix is not a metric by itself, but a valuable tool for visualizing a model's performance. It dis-
plays the counts of true positive, true negative, false positive, and false negative values in a
matrix form, allowing for easy calculation of other measurements, e.g., accuracy, F1-score, precision, and recall. Table 5 illustrates the statistical values of the confusion matrix (CM). This matrix helps in understanding how well the model differentiates between two classes and where it makes errors, which is essential in medical diagnosis, where the consequences of false predictions can be significant.
10.8 Loss
It is often utilized to assess the performance of classification models that output proba-
bilities rather than discrete labels. This metric quantifies the accuracy of the probabilistic
predictions, penalizing incorrect classifications more heavily when the model is confident
about its wrong predictions. Equation (8) illustrates the log loss.
$$\text{Loss} = -\frac{1}{N} \sum_{i=1}^{N} \left[ y_i \log_2(\hat{y}_i) + (1 - y_i) \log_2(1 - \hat{y}_i) \right] \quad (8)$$
where $y_i$ is the true label of the ith instance (0 or 1), $\hat{y}_i$ is the predicted probability for the ith instance, and N is the total number of instances.
   Lower loss values indicate better model performance, with perfect predictions achieving a
log loss of 0.
   These evaluation metrics each offer distinct advantages according to the nature of the
Alzheimer’s disease classification problem. While accuracy provides a general sense of
model performance, metrics like precision, F1-score, AUC-ROC, recall, and MCC give a
more detailed understanding of model strengths and weaknesses, particularly in the case of
imbalanced datasets. The use of confusion matrices and log loss further enhances model evaluation, providing deeper insights into the specific types of errors a model makes. Understanding these metrics is crucial for implementing a DL model that is not only accurate but also reliable and safe for use in clinical settings.
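All of the above metrics can be computed directly with scikit-learn, as in the following sketch; the labels and probabilities are tiny placeholder arrays, not results from any study.

```python
import numpy as np
from sklearn.metrics import (accuracy_score, precision_score, recall_score,
                             f1_score, matthews_corrcoef, roc_auc_score,
                             log_loss, confusion_matrix)

# Illustrative computation of the metrics above for a binary AD-vs-healthy task.
# y_true and y_prob are small placeholder arrays.
y_true = np.array([0, 0, 1, 1, 1, 0, 1, 0])
y_prob = np.array([0.1, 0.4, 0.8, 0.7, 0.3, 0.2, 0.9, 0.6])   # predicted P(AD)
y_pred = (y_prob >= 0.5).astype(int)                           # thresholded labels

print("Accuracy :", accuracy_score(y_true, y_pred))
print("Precision:", precision_score(y_true, y_pred))
print("Recall   :", recall_score(y_true, y_pred))
print("F1-score :", f1_score(y_true, y_pred))
print("MCC      :", matthews_corrcoef(y_true, y_pred))
print("ROC-AUC  :", roc_auc_score(y_true, y_prob))             # uses probabilities
print("Log loss :", log_loss(y_true, y_prob))
print("Confusion matrix:\n", confusion_matrix(y_true, y_pred))
```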
Error analysis is a crucial step in evaluating AI models, especially in high-stakes fields such
as healthcare. By identifying common misclassification patterns, researchers and clinicians
can improve model performance, enhance interpretability, and mitigate risks in clinical deci-
sion-making. Misclassification patterns often fall into several categories: false positives (Type I errors), false negatives (Type II errors), confusion between similar
classes, bias towards majority class, and context-dependent errors. Moreover, several fac-
tors contribute to these misclassification errors: data imbalance, feature overlap, poor data
quality, limited training data, and bias in training data. Misclassification in AI models can have serious consequences in healthcare, including unnecessary interventions (false positives), missed diagnoses (false negatives), trust and adoption challenges, and ethical and legal concerns.
Bias in AI models for Alzheimer’s Disease (AD) diagnosis presents significant challenges
that can lead to disparities in healthcare outcomes. Ethical concerns, including fairness,
transparency, and accountability, must be addressed to ensure AI-driven diagnostic tools
benefit all patient populations equitably. By implementing robust bias mitigation strategies
such as improving data diversity, auditing algorithms, and enhancing model transparency,
we can reduce disparities and improve trust in AI-assisted healthcare.
   Fairness in AI-driven AD diagnosis requires a collaborative effort between clinicians,
data scientists, and policymakers to create models that are both accurate and ethically
responsible. AI should serve as a decision-support tool that enhances, rather than replaces,
human expertise. With ongoing monitoring and refinement, AI has the potential to revolu-
                                                                                         13
 262     Page 26 of 39                                                                       S. Mohsen
tionize early AD detection, ensuring timely and precise diagnosis for all patients, regardless
of their background. Also, AI tools should be designed to support equitable access to early
AD screening and intervention, regardless of a patient’s socioeconomic status or location.
This section presents several challenges in this field. AD datasets are often limited in size and skewed towards certain classes, impacting model training. Models trained on particular datasets may struggle to generalize across different populations and imaging protocols. In addition, although deep learning models achieve high accuracy, their black-box nature hinders clinical adoption. There are also key challenges in AD detection such as early detection, imaging, the heterogeneity of Alzheimer's disease, lack of ground truth, longitudinal monitoring, and the complexity of neurological data. Table 6 compares many of these challenges.
   Also, a comprehensive comparison of various deep learning (DL) models is presented, covering their architectures, datasets, preprocessing techniques, evaluation metrics, and challenges. The aim is to highlight the strengths and weaknesses of different approaches. Table 7 shows some comparisons of the models' architectures.
Deploying AI models in clinical settings involves computational cost and latency challenges.
Deep learning models for AD detection, especially those using MRI/PET, require high
computational power and may not be feasible for real-time use, whereas EEG- and speech-based
detection methods are better suited to real-time applications because of their lower inference
times (milliseconds to seconds) and lighter models. Cloud-based solutions introduce network
latency, while edge AI and hardware optimizations (e.g., model compression, FPGA acceleration)
can improve feasibility. Large-scale imaging models (e.g., CNNs and transformers with millions
of parameters) demand powerful GPUs/TPUs for inference on high-resolution MRI, PET, or EEG
data, increasing deployment costs, so edge AI solutions (e.g., models deployed on local
hospital servers or medical devices) must balance accuracy with computational efficiency.
Inference time ultimately depends on the hardware (CPU vs. GPU), the model size, and the input
complexity. Balancing accuracy, cost, and computational efficiency is therefore key to
successful deployment in real-world clinical environments.
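   To make the latency trade-off concrete, the following sketch (an illustration only,
assuming PyTorch is available and using a stand-in fully connected classifier over precomputed
imaging features rather than any model from the cited studies) applies post-training dynamic
quantization and compares CPU inference time before and after compression.

```python
import time
import torch
import torch.nn as nn

# Stand-in classifier: a small fully connected network over 4,096 precomputed
# imaging features; not any specific AD model from the literature.
model = nn.Sequential(
    nn.Linear(4096, 512), nn.ReLU(),
    nn.Linear(512, 128), nn.ReLU(),
    nn.Linear(128, 3),            # NC / MCI / AD
).eval()

# Post-training dynamic quantization: weights of Linear layers stored as int8.
quantized = torch.quantization.quantize_dynamic(model, {nn.Linear}, dtype=torch.qint8)

def cpu_latency_ms(m, runs=100):
    x = torch.randn(1, 4096)
    with torch.no_grad():
        start = time.perf_counter()
        for _ in range(runs):
            m(x)
    return (time.perf_counter() - start) / runs * 1e3

print(f"float32 model: {cpu_latency_ms(model):.2f} ms/inference")
print(f"int8 dynamic-quantized model: {cpu_latency_ms(quantized):.2f} ms/inference")
```

Dynamic quantization stores the linear-layer weights in int8, shrinking the model and
typically reducing CPU latency, which is the kind of optimization that edge deployments
rely on.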
In this section, we discuss several directions that aim to enhance accuracy, usability, and
clinical impact in Alzheimer's research. The main trends in this field are presented as follows:
1. Multi-modal data integration: Combining imaging (MRI, PET) with clinical, genetic,
   and behavioral data enhances model robustness and predictive power. Practical exam-
   ple: in Alzheimer's disease diagnosis, researchers combine MRI scans, PET imaging,
   genetic markers, and cognitive test scores to improve early detection. DL models inte-
   grate these diverse data types to enhance classification accuracy (a minimal fusion
   sketch is given after this list).
2. Transformers and attention mechanisms: Vision Transformers (ViT) and self-attention
   layers are increasingly used for improved feature extraction. Practical example: Vision
   Transformers (ViTs) are used to analyze MRI and PET scans for early detection of AD.
   ViTs use self-attention mechanisms to capture long-range dependencies in brain imag-
   ing data, improving feature extraction and classification accuracy.
3. Explainable AI (XAI): Efforts to make models interpretable, aiding clinicians in under-
   standing predictions and decisions. Practical example: in Alzheimer's disease detection
   from MRI scans, SHAP (SHapley Additive exPlanations) is used to highlight the most
   important pixels in an image that contribute to a model’s decision, helping radiologists
   trust AI-driven predictions.
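   As a minimal illustration of the multi-modal integration trend in point 1 (a sketch only,
with placeholder feature dimensions and PyTorch as an assumed framework), imaging features
extracted by a CNN can be fused with tabular clinical variables through a simple late-fusion
network:

```python
import torch
import torch.nn as nn

class MultiModalADClassifier(nn.Module):
    """Illustrative late-fusion model: imaging features + tabular clinical data.

    Assumes MRI/PET volumes have already been reduced to a 512-dim feature
    vector (e.g., by a pretrained CNN) and that clinical/genetic/cognitive
    variables form a 16-dim tabular vector; both sizes are placeholders.
    """

    def __init__(self, img_dim=512, clin_dim=16, n_classes=3):
        super().__init__()
        self.img_branch = nn.Sequential(nn.Linear(img_dim, 128), nn.ReLU())
        self.clin_branch = nn.Sequential(nn.Linear(clin_dim, 32), nn.ReLU())
        self.head = nn.Linear(128 + 32, n_classes)   # NC / MCI / AD

    def forward(self, img_feats, clin_feats):
        fused = torch.cat([self.img_branch(img_feats),
                           self.clin_branch(clin_feats)], dim=1)
        return self.head(fused)

model = MultiModalADClassifier()
logits = model(torch.randn(4, 512), torch.randn(4, 16))   # batch of 4 subjects
print(logits.shape)                                        # torch.Size([4, 3])
```

Concatenation-based late fusion is only one of several fusion strategies; attention-based or
intermediate fusion schemes are also used in the literature.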
By comparing the above trends in terms of models, datasets, techniques, and challenges, it
becomes evident that the choice of approach depends on the particular project goals, data
availability, and computational resources.
   Also, several suggestions in this field are introduced as follows:
1- Focus on early detection: Develop methods for early-stage diagnosis, such as MCI to
   Alzheimer’s transition prediction.
2- Enhanced interpretability: Integrate XAI tools to build trust in clinical applications.
3- Ethical AI: Address biases in datasets to avoid disparities in diagnosis and treatment
   recommendations.
4- Cross-domain collaboration: Encourage partnerships between AI experts, neuroscien-
   tists, and clinicians to refine methodologies.
5- Real-world validation: Validate models on diverse, real-world datasets to ensure scal-
   ability and generalization.
   In addition, several future directions are outlined as follows:
1- Integration of multi-modal data: Future research can focus on fusing various data modal-
   ities like MRI, PET, clinical records, and genomic data for holistic disease modeling.
2- Advanced transfer learning: Expanding pre-trained models on domain-specific tasks to
   improve early detection accuracy and reduce the need for large labeled datasets (see the
   fine-tuning sketch after this list).
3- Personalized models: Creating models tailored to individual patients by leveraging lon-
   gitudinal data and personal health records.
4- Real-time diagnosis tools: Developing lightweight models deployable on edge devices
   for real-time, point-of-care screening.
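   The transfer-learning direction in point 2 can be sketched as follows (illustrative only,
assuming torchvision 0.13 or later and treating 2D MRI slices as 3-channel 224 x 224 images;
this is not the pipeline of any particular cited study): an ImageNet-pretrained backbone is
frozen and only a new classification head is trained.

```python
import torch
import torch.nn as nn
from torchvision import models

# Load an ImageNet-pretrained ResNet-18 and freeze its convolutional backbone.
backbone = models.resnet18(weights=models.ResNet18_Weights.DEFAULT)
for param in backbone.parameters():
    param.requires_grad = False

# Replace the final layer with a new 3-class head (NC / MCI / AD) that stays trainable.
backbone.fc = nn.Linear(backbone.fc.in_features, 3)

optimizer = torch.optim.Adam(backbone.fc.parameters(), lr=1e-3)
criterion = nn.CrossEntropyLoss()

# One illustrative training step on dummy data shaped like 2D MRI slices.
images, labels = torch.randn(8, 3, 224, 224), torch.randint(0, 3, (8,))
optimizer.zero_grad()
loss = criterion(backbone(images), labels)
loss.backward()
optimizer.step()
print(f"loss: {loss.item():.3f}")
```

Freezing the backbone keeps the number of trainable parameters small, which is what makes
transfer learning attractive when labeled AD data are scarce.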
The following is a detailed benchmark comparison across various studies in the field of
AD classification, focusing on the deep learning models used, datasets, and accuracy per-
formance. In (Slimi et al. 2024), a hybrid DL model integrating DenseNet121 and Xcep-
tion networks has demonstrated exceptional performance in AD classification. Utilizing the
ADNI dataset comprising MRI images, this approach achieved an impressive overall accu-
racy of 99.85% through fivefold cross-validation. This model leverages the complementary
strengths of DenseNet121 and Xception for feature extraction, providing a robust frame-
work that enhances detection performance. Key innovations include the use of SMOTE-based
oversampling to balance the dataset and enhance generalization. Furthermore,
the model exhibits resilience against various image noise types, including Gaussian, Salt-
and-Pepper, and Speckle, making it a robust solution for practical applications in Alzheim-
er's disease detection.
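   The exact architecture of Slimi et al. is not reproduced here, but the general idea of
merging two pretrained backbones can be sketched in TensorFlow/Keras as follows (assuming
224 x 224 RGB-converted MRI slices and three output classes, both of which are placeholder
choices):

```python
import tensorflow as tf
from tensorflow.keras import layers, models

# Hypothetical setup: 2D MRI slices resized to 224x224 RGB, three classes (NC / MCI / AD).
inputs = layers.Input(shape=(224, 224, 3))

# Each backbone gets its own ImageNet preprocessing applied to the shared raw input.
d_in = tf.keras.applications.densenet.preprocess_input(inputs)
x_in = tf.keras.applications.xception.preprocess_input(inputs)

densenet = tf.keras.applications.DenseNet121(include_top=False, weights="imagenet", pooling="avg")
xcept = tf.keras.applications.Xception(include_top=False, weights="imagenet", pooling="avg")
densenet.trainable = False   # optionally freeze both backbones for pure feature extraction
xcept.trainable = False

features = layers.Concatenate()([densenet(d_in), xcept(x_in)])   # 1024 + 2048 features
x = layers.Dropout(0.3)(features)
outputs = layers.Dense(3, activation="softmax")(x)

model = models.Model(inputs, outputs)
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy", metrics=["accuracy"])
model.summary()
```

The published study additionally applied SMOTE-based balancing and noise-robustness tests,
which are omitted from this sketch.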
    In (Turrisi et al. 2024), CNNs have been extensively used for Alzheimer’s disease clas-
sification, leveraging various architectural designs to achieve robust performance. Utilizing
the ADNI dataset, these models have consistently delivered accuracy rates ranging between
95 and 97%, with slight variations based on the specific architecture and the inclusion of
data augmentation techniques. A key strength of CNN-based approaches is their focus on
reproducibility, ensuring that experimental setups and results can be consistently validated.
Moreover, the diversity in architectural choices—ranging from simple to highly complex
designs—enables researchers to tailor models to the nuances of their datasets. The consis-
tent application of cross-validation further reinforces the reliability of these models, making
CNNs a foundational tool in Alzheimer's disease detection and classification.
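   The reproducibility practices highlighted above, fixed seeds, stratified folds, and
repeated evaluation, can be illustrated with a short scikit-learn sketch (using synthetic
feature vectors and a simple classifier as stand-ins for a full CNN pipeline):

```python
import numpy as np
from sklearn.model_selection import StratifiedKFold, cross_val_score
from sklearn.linear_model import LogisticRegression

# Stand-in data: 300 subjects x 64 precomputed imaging features, 3 classes (NC/MCI/AD).
rng = np.random.default_rng(seed=42)          # fixed seed for reproducibility
X = rng.normal(size=(300, 64))
y = rng.integers(0, 3, size=300)

cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=42)
scores = cross_val_score(LogisticRegression(max_iter=1000), X, y, cv=cv, scoring="accuracy")
print(f"fold accuracies: {np.round(scores, 3)}, mean = {scores.mean():.3f}")
```

Reporting per-fold scores alongside the mean, as the cited studies do, makes the variability
of the estimate visible rather than hiding it behind a single number.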
    In (Tong et al. 2024), Multiple Instance Learning (MIL) has emerged as an approach for
dementia classification, particularly in scenarios involving weakly labeled or incomplete
data. Using the OASIS dataset, which includes MRI scans and clinical data, MIL achieves
accuracy rates between 85 and 90%, depending on the configuration. The model's strength
lies in its ability to handle missing annotations and ambiguous information, which are com-
mon challenges in clinical environments. By evaluating sets of instances (e.g., slices of MRI
scans or segments of clinical data) rather than requiring explicit labels for each instance,
MIL effectively classifies dementia even with limited or incomplete information. This capa-
bility makes it a practical and efficient tool for real-world applications where fully labeled
datasets are rare.
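   A minimal sketch of the MIL idea is given below (PyTorch, with attention-based pooling as
one common MIL variant and 128-dimensional placeholder slice features; the cited work's exact
formulation may differ): instance-level features are pooled into a single bag-level
prediction, so only a subject-level label is required.

```python
import torch
import torch.nn as nn

class MILAttentionClassifier(nn.Module):
    """Toy multiple-instance learner: one bag = one subject, instances = MRI slices.

    Instance features (here 128-dim placeholders) are combined with a learned
    attention-weighted average before a single bag-level prediction is made,
    so only a bag label (demented / non-demented) is needed.
    """

    def __init__(self, feat_dim=128):
        super().__init__()
        self.attention = nn.Sequential(nn.Linear(feat_dim, 64), nn.Tanh(), nn.Linear(64, 1))
        self.classifier = nn.Linear(feat_dim, 2)

    def forward(self, bag):                                    # bag: (n_instances, feat_dim)
        weights = torch.softmax(self.attention(bag), dim=0)    # (n_instances, 1)
        bag_embedding = (weights * bag).sum(dim=0)             # attention-weighted average
        return self.classifier(bag_embedding)                  # two bag-level logits

model = MILAttentionClassifier()
bag = torch.randn(30, 128)        # 30 slice-level feature vectors for one subject
print(model(bag))
```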
    In (Morris et al. 2024), explainable AI (XAI) has been increasingly integrated into CNNs
for AD diagnosis, focusing on enhancing the interpretability of AI-driven decisions. Using
the ADNI dataset of MRI images, these models achieve accuracy levels of 96–98%, compa-
rable to traditional CNNs. However, the distinguishing contribution of XAI-based models
lies in their use of methods such as saliency maps and attention mechanisms to provide
visual or conceptual explanations for their predictions. This capability helps build trust
among clinicians by offering insights into how the model identifies Alzheimer’s-related
patterns, such as brain atrophy. By addressing the “black box” nature of DL, XAI-driven
models promote the adoption of AI in clinical settings, fostering collaboration between AI
systems and medical practitioners through transparent and interpretable decision-making
processes.
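   Saliency maps, one of the XAI techniques mentioned above, can be generated directly from
input gradients; the sketch below (PyTorch, with a dummy CNN and a random MRI slice standing
in for a trained model and real data) shows the basic recipe:

```python
import torch
import torch.nn as nn

# Stand-in CNN; in practice this would be the trained AD classifier.
model = nn.Sequential(
    nn.Conv2d(1, 8, 3, padding=1), nn.ReLU(), nn.AdaptiveAvgPool2d(1),
    nn.Flatten(), nn.Linear(8, 3),
).eval()

mri_slice = torch.randn(1, 1, 128, 128, requires_grad=True)   # dummy single-channel slice

logits = model(mri_slice)
logits[0, logits.argmax()].backward()          # gradient of the predicted class score

# Saliency: magnitude of the gradient with respect to each input pixel.
saliency = mri_slice.grad.abs().squeeze()
print(saliency.shape)                          # torch.Size([128, 128])
```

High-magnitude pixels indicate regions whose intensity most strongly influences the
prediction; overlaying this map on the MRI slice is what gives clinicians the visual
explanation described above.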
    In (Khan et al. 2022), the study explored hybrid DL models—comprising CNN, Bidirec-
tional LSTM, and Stacked Deep Dense Neural Network (SDDNN)—for early Alzheimer’s
disease (AD) detection through text classification of clinical transcripts. Leveraging the
DementiaBank dataset, the models were trained and evaluated using both randomly initial-
ized weights and pre-trained GloVe embeddings. Extensive hyperparameter tuning was performed
using a grid-search method. Among these models, the SDDNN combined with GloVe
embeddings reached the highest accuracy of 93.31% and outperformed the others in metrics
such as AUC and recall. The results underscore the promise of automated approaches in
assisting clinicians with early AD diagnosis, while emphasizing the need for further research
to enhance performance on larger datasets.
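   A stripped-down version of such a text-based pipeline can be sketched as follows (PyTorch,
with a placeholder vocabulary and randomly initialized embeddings; the cited study's SDDNN
architecture and tuned hyperparameters are not reproduced): a BiLSTM reads a tokenized
transcript and outputs a binary prediction.

```python
import torch
import torch.nn as nn

class TranscriptClassifier(nn.Module):
    """Toy BiLSTM text classifier for clinical transcripts (AD vs. control).

    The vocabulary size and 100-dim embeddings are placeholders; pretrained
    GloVe vectors could be loaded into the embedding layer via
    embedding.weight.data.copy_(...) if available.
    """

    def __init__(self, vocab_size=20_000, emb_dim=100, hidden=64):
        super().__init__()
        self.embedding = nn.Embedding(vocab_size, emb_dim, padding_idx=0)
        self.bilstm = nn.LSTM(emb_dim, hidden, batch_first=True, bidirectional=True)
        self.head = nn.Linear(2 * hidden, 2)

    def forward(self, token_ids):                 # (batch, seq_len)
        embedded = self.embedding(token_ids)
        _, (h_n, _) = self.bilstm(embedded)       # h_n: (2, batch, hidden)
        final = torch.cat([h_n[0], h_n[1]], dim=1)
        return self.head(final)

model = TranscriptClassifier()
tokens = torch.randint(1, 20_000, (4, 50))       # 4 transcripts, 50 tokens each
print(model(tokens).shape)                        # torch.Size([4, 2])
```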
    In (Ramani et al. 2024), three-dimensional Convolutional Neural Networks (3D CNNs)
have proven highly effective for the early detection of AD by leveraging 3D volumetric
data from MRI scans. Utilizing the ADNI dataset, these models achieve accuracy rates of
93% to 97%, depending on their specific configurations and pre-processing approaches.
The primary strength of 3D CNNs is their ability to capture spatial dependencies across
the three-dimensional structure of the brain, enabling the detection of subtle changes in
brain regions associated with early Alzheimer's. Unlike traditional 2D CNNs, which analyze
slices of imaging data independently, 3D CNNs analyze the complete volumetric context,
offering improved sensitivity to early structural alterations. This capability makes 3D CNNs
particularly valuable for diagnosing Alzheimer's disease at its earliest stages, facilitating
timely interventions.
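   The core difference from 2D models is the use of volumetric convolutions; a minimal 3D CNN
sketch is shown below (PyTorch, with a dummy 64 x 64 x 64 volume and layer sizes chosen only
for illustration):

```python
import torch
import torch.nn as nn

# Minimal 3D CNN over a whole MRI volume (dummy 64x64x64 voxels, single channel).
model = nn.Sequential(
    nn.Conv3d(1, 8, kernel_size=3, padding=1), nn.ReLU(), nn.MaxPool3d(2),
    nn.Conv3d(8, 16, kernel_size=3, padding=1), nn.ReLU(), nn.MaxPool3d(2),
    nn.AdaptiveAvgPool3d(1), nn.Flatten(),
    nn.Linear(16, 3),                 # NC / MCI / AD
)

volume = torch.randn(2, 1, 64, 64, 64)   # batch of 2 skull-stripped, resampled volumes
print(model(volume).shape)                # torch.Size([2, 3])
```

The trade-off noted in the text applies directly here: processing whole volumes preserves
spatial context but multiplies memory and compute requirements relative to slice-wise 2D CNNs.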
    In (Shastry 2024), a custom CNN has been developed using the ADNI dataset, which
includes 10,000 MRI images categorized into three classes: Non-demented, MCI, and AD.
The methodology involved incorporating regularization methods, e.g. dropout, and employ-
ing data augmentation techniques to improve the model's robustness and generalizability.
These strategies significantly enhanced classification performance, achieving high accu-
racy while mitigating overfitting by expanding the effective training dataset with synthetic
variations. However, the custom CNN faces challenges, including limited interpretability,
which restricts its ability to provide explanations for its predictions—a critical factor for
clinical adoption. Additionally, the model is computationally expensive to train, requir-
ing substantial computational power and large memory, which may limit its scalability in
resource-constrained settings. Despite these limitations, the approach demonstrates a pow-
erful framework for AD detection.
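   The augmentation and regularization strategy described above can be sketched with
torchvision transforms and dropout (the parameter values below are placeholders; the actual
augmentations used by the cited study may differ):

```python
import torch
import torch.nn as nn
from torchvision import transforms

# Illustrative on-the-fly augmentation for single-channel 2D MRI slices.
train_transform = transforms.Compose([
    transforms.RandomRotation(degrees=10),
    transforms.RandomAffine(degrees=0, translate=(0.05, 0.05)),
    transforms.RandomHorizontalFlip(p=0.5),
])

slice_tensor = torch.randn(1, 128, 128)        # one slice in CHW layout
augmented = train_transform(slice_tensor)      # a new synthetic variation on every call
print(augmented.shape)                         # torch.Size([1, 128, 128])

# Regularization in the classifier head: dropout randomly zeroes activations during training.
head = nn.Sequential(nn.Linear(256, 64), nn.ReLU(), nn.Dropout(p=0.5), nn.Linear(64, 3))
print(head(torch.randn(4, 256)).shape)         # torch.Size([4, 3])
```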
    In (Nanthini et al. 2024), the Hybrid Deep Belief Network (DBN) approach integrates
imaging and non-imaging data using a multi-task learning framework, applied to a com-
bined dataset of ADNI and OASIS with 15,000 images. The data is categorized into 3
classes: Cognitive Normal, MCI, and AD. The methodology leverages the representational
power of Deep Belief Networks (DBNs) to process multi-modal data, improving the ability
to extract complex patterns and dependencies. This approach excels in robust multi-modal
learning and provides effective predictions of disease progression, particularly by synthe-
sizing diverse data types like MRI scans and clinical metrics. However, the integration
of multi-modal data imposes high computational demands, requiring substantial resources
for training and deployment. Despite these challenges, the Hybrid DBN method offers a
promising avenue for improving diagnostic accuracy and understanding Alzheimer’s dis-
ease progression.
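   Off-the-shelf DL frameworks do not ship a ready-made DBN, but its layer-wise pretraining
spirit can be approximated in scikit-learn by stacking restricted Boltzmann machines before a
readout classifier (a rough sketch with synthetic multi-modal features scaled to [0, 1]; it
omits the joint fine-tuning a full DBN would include and is not the cited authors' model):

```python
import numpy as np
from sklearn.neural_network import BernoulliRBM
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline

# Placeholder multi-modal features (imaging + clinical) for 200 subjects, values in [0, 1].
rng = np.random.default_rng(0)
X = rng.random((200, 256))
y = rng.integers(0, 3, size=200)          # Cognitive Normal / MCI / AD

# Two stacked RBMs trained layer-wise, followed by a logistic-regression readout.
dbn_like = Pipeline([
    ("rbm1", BernoulliRBM(n_components=128, learning_rate=0.05, n_iter=20, random_state=0)),
    ("rbm2", BernoulliRBM(n_components=64, learning_rate=0.05, n_iter=20, random_state=0)),
    ("clf", LogisticRegression(max_iter=1000)),
])
dbn_like.fit(X, y)
print(f"training accuracy: {dbn_like.score(X, y):.3f}")
```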
    Hussain et al. (2020) introduced a 12-layer CNN model to classify two categories
(Alzheimer/healthy) on MRI data from the OASIS dataset, achieving 97.75% accuracy and
97.50% F1-score, surpassing pre-trained architectures like InceptionV3 and VGG. Cui
et al. (2021) utilized adaptive LASSO logistic regression with particle swarm optimiza-
tion (PSO) on the ADNI dataset, reporting accuracy values of 96.27%, 84.81%, and 76.13%
for different binary classifications of Alzheimer's subtypes. Erdogmus and Kabakus (2023)
proposed a 12-layer CNN optimized using 12 hyperparameters for
the DARWIN dataset, which involved converting 1D data to 2D, yielding a classification
accuracy of 90.4%.
    Sun et al. (2021) enhanced ResNet50 with spatial transformer networks and attention
mechanisms, achieving 97.1% accuracy. Manimurugan (2020) fine-tuned VGG19 on
OASIS data to achieve 95.82% accuracy, while Sharma et al. (2022) employed a VGG16-
based model with ANN for Alzheimer’s classification, obtaining 90.4% accuracy for four-
class classification. Savas (2022) identified EfficientNetB0 as the top-performing model in
a comparative analysis with a 92.98% accuracy rate. Lahmiri (2023) combined CNN-based
feature extraction with KNN classification optimized using Bayesian optimization, achiev-
ing 94.96% accuracy on OASIS data.
    Several common challenges were encountered across the mentioned studies on
Alzheimer's disease classification using deep learning techniques. One of the primary issues
is data availability and quality. Many studies rely on publicly available datasets such as
ADNI, OASIS, and DementiaBank, which, while valuable, may not always be representa-
tive of diverse populations. Additionally, limited data can lead to overfitting, especially in
complex models requiring extensive training.
    Another significant challenge is class imbalance. Some datasets contain an uneven dis-
tribution of classes, with fewer samples for early-stage Alzheimer’s or Mild Cognitive
Impairment (MCI). This imbalance can bias models toward majority classes, reducing per-
formance for underrepresented groups. Researchers often apply techniques such as overs-
ampling, undersampling, or weighted loss functions to address this issue, but it remains a
persistent limitation.
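   One of the simplest mitigations, a class-weighted loss, can be sketched as follows
(PyTorch, with illustrative class counts that do not correspond to any particular dataset):

```python
import torch
import torch.nn as nn

# Example class counts: far fewer MCI and AD samples than cognitively normal ones.
class_counts = torch.tensor([900.0, 150.0, 120.0])      # NC, MCI, AD (illustrative numbers)

# Inverse-frequency weights so minority classes contribute more to the loss.
weights = class_counts.sum() / (len(class_counts) * class_counts)
criterion = nn.CrossEntropyLoss(weight=weights)

logits = torch.randn(16, 3)                # model outputs for a mini-batch
labels = torch.randint(0, 3, (16,))
print(f"weighted loss: {criterion(logits, labels).item():.3f}")
```

Oversampling (e.g., SMOTE) and undersampling operate on the data instead of the loss, but the
goal is the same: preventing the majority class from dominating training.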
    Interpretability and explainability also pose critical concerns. Deep learning models, par-
ticularly CNNs, often function as "black boxes," making it difficult for clinicians to trust
and interpret their decisions. Although explainable AI (XAI) methods like saliency maps
and attention mechanisms have been introduced, their effectiveness in real-world clinical
settings remains limited.
Table 8 (continued)

| Study | Methodology | Dataset | AI models | Accuracy (%) | Key contributions |
|---|---|---|---|---|---|
| Sun et al. (2021) | Enhancing ResNet50 with spatial transformer networks and attention mechanisms | ADNI | Hybrid of transfer learning | 97.10 | A new ResNet model incorporating the Mish activation function |
| Manimurugan (2020) | Applying transfer learning for multi-classification | OASIS | VGG19 | 94.82 | Fine-tuning of the hyperparameters |
| Sharma et al. (2022) | Employing a VGG16-based model with ANN | Kaggle | VGG16 | 90.4 | Features from MRI scan images are extracted using the VGG16 model |
| Savaş (2022) | Implementing EfficientNetB0 as the top-performing model | ADNI | Transfer learning | 92.98 | Applying several processing techniques on the dataset |
| Lahmiri (2023) | CNN-based feature extraction with KNN classification | OASIS | Developing a CNN with an optimization algorithm | 94.96 | Applying processing with both ML and DL models |
17 Conclusions
This paper examines the latest advancements in DL and ML models for AD classification.
It also explores various applications of AI in AD research, including datasets, preprocessing
methods, challenges, and notable recent studies in the field. Furthermore, the paper dis-
cusses medical imaging modalities, risk factors associated with AD, the disease's progres-
sion stages, and key metrics used to evaluate the performance of AI models. Additionally,
it provides comparative analyses of different DL approaches, highlights their limitations,
identifies emerging trends, and offers recommendations and future directions for this rapidly
evolving domain.
    In conclusion, deep learning has emerged as a transformative tool in the detection and
classification of AD, offering potential for early diagnosis and personalized treatment.
While advancements in multi-modal integration, transfer learning, and model interpretabil-
ity have been made, challenges such as data scarcity, computational complexity, and model
generalization remain.
    Deep learning has proven to be a game-changer in Alzheimer’s disease classifica-
tion, offering unprecedented accuracy and efficiency. While challenges remain, continued
advancements in model architectures, data availability, and interpretability promise to close
the gap between research and clinical practice. This review underscores
the potential of DL models as critical tools in the fight against AD.
    In the context of evaluation metrics for AD detection, it is essential to use a combination
of metrics to ensure reliable and accurate results. Proper evaluation allows for
the optimization of DL models, making them more effective in clinical practice and improv-
ing early detection of AD.
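   As a practical note, the metric combination discussed throughout this review can be
computed with scikit-learn in a few lines (synthetic labels and class probabilities are used
here purely for illustration):

```python
import numpy as np
from sklearn.metrics import (accuracy_score, precision_score, recall_score, f1_score,
                             matthews_corrcoef, roc_auc_score, confusion_matrix)

# Illustrative ground-truth labels and model outputs for a 3-class problem (NC/MCI/AD).
rng = np.random.default_rng(1)
y_true = rng.integers(0, 3, size=100)
y_prob = rng.dirichlet(np.ones(3), size=100)       # per-class probabilities
y_pred = y_prob.argmax(axis=1)

print("accuracy :", accuracy_score(y_true, y_pred))
print("precision:", precision_score(y_true, y_pred, average="macro", zero_division=0))
print("recall   :", recall_score(y_true, y_pred, average="macro", zero_division=0))
print("F1-score :", f1_score(y_true, y_pred, average="macro"))
print("MCC      :", matthews_corrcoef(y_true, y_pred))
print("ROC AUC  :", roc_auc_score(y_true, y_prob, multi_class="ovr"))
print("confusion matrix:\n", confusion_matrix(y_true, y_pred))
```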
    Future work should focus on refining these models, ensuring clinical applicability, and
promoting collaborations to enhance diagnostic accuracy and expand access to AD care,
ultimately enhancing patient outcomes and advancing healthcare solutions.
Author contributions The author contributed to the writing—the original draft of the manuscript, concepts,
methodology, resources, visualization, similarity reduction, and the editing of the manuscript, the review of
the writing and grammatical errors for the manuscript, and supervision of the proposed work.
Funding Open access funding provided by The Science, Technology & Innovation Funding Authority
(STDF) in cooperation with The Egyptian Knowledge Bank (EKB).
No funding was received for this manuscript.
Data availability No datasets were generated or analysed during the current study.
Declarations
Open Access   This article is licensed under a Creative Commons Attribution 4.0 International License,
which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as
you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons
licence, and indicate if changes were made. The images or other third party material in this article are
included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material.
If material is not included in the article’s Creative Commons licence and your intended use is not permitted
by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the
copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
References
Aaboub F, Chamlal H, Ouaderhman T (2023) Analysis of the prediction performance of decision tree-based
     algorithms, International Conference on Decision Aid Sciences and Applications (DASA), Annaba,
     Algeria, pp. 7–11.
Alattar AE, Mohsen S (2023) A survey on smart wearable devices for healthcare applications. Wireless Pers
     Commun 132:775–783
AlSaeed D, Omar SF (2022) Brain MRI analysis for Alzheimer’s disease diagnosis using CNN-based feature
     extraction and machine learning. Sensors 22(8):2911
Alsubaie MG, Luo S, Shaukat K (2024) Alzheimer’s disease detection using deep learning on neuroimaging:
     a systematic review. Mach Learn Knowl Extract 6(1):464–505
Assam M, Kanwal H, Farooq U, Shah SK, Mehmood A, Choi GS (2021) An efficient classification of MRI
     brain images. IEEE Access 9:33313–33322
Aswin KS, Purushothaman M, Sritharani P (2022) ANN and deep learning classifiers for BCI applications,
     Third International Conference on Intelligent Computing Instrumentation and Control Technologies
     (ICICICT), Kannur, India, 2022, pp. 1603–1607.
Bellio M (2021) Translating predictive models for Alzheimer’s disease to clinical practice: user research,
     adoption opportunities, and conceptual design of a decision support tool, Doctoral thesis (Ph.D), Uni-
     versity College London
BJ BN, Yadhukrishnan S (2023) A comparative study on document images classification using logistic
     regression and multiple linear regressions, Second International Conference on Augmented Intelligence
     and Sustainable Systems (ICAISS), Trichy, India, pp. 1096–1104.
Bloch L, Friedrich CM (2019) Classification of Alzheimer’s disease using volumetric features of multiple
     MRI scans, 41st Annual International Conference of the IEEE Engineering in Medicine and Biology
     Society (EMBC), Berlin, Germany, pp. 2396–2401.
Chaddad A, Desrosiers C, Niazi T (2018) Deep radiomic analysis of MRI related to Alzheimer’s disease.
     IEEE Access 6:58213–58221
Chakraborty S, Banik J, Addhya S, Chatterjee D (2020) Study of dependency on number of LSTM units for
     character based text generation models, International Conference on Computer Science, Engineering
     and Applications (ICCSEA), Gunupur, India, pp. 1–5.
Chicco D, Warrens MJ, Jurman G (2021) The Matthews correlation coefficient (MCC) is more informative
     than Cohen’s Kappa and Brier score in binary classification assessment. IEEE Access 9:78368–78381
Ching WP, Abdullah SS, Shapiai MI (2024) Transfer learning for Alzheimer’s disease diagnosis using effi-
     cientNet-B0 convolutional neural network. J Adv Res Appl Sci Eng Technol 35(1):181–191
Chu NN, Gebre-Amlak H (2021) Navigating neuroimaging datasets ADNI for Alzheimer’s disease. IEEE
      Consum Electron Mag 10(5):61–63
Cui X, Xiao R, Liu X, Qiao H, Zheng X, Zhang Y, Du J (2021) Adaptive LASSO logistic regression based
      on particle swarm optimization for Alzheimer’s disease early diagnosis. Chemom and Intell Lab Syst
      215:104316
de Almeida JRD, Oliveira JL (2019) GenericCDSS—a generic clinical decision support system, IEEE 32nd
      International Symposium on Computer-Based Medical Systems (CBMS), Cordoba, Spain, pp. 186–191
Erdogmus P, Kabakus AT (2023) The promise of convolutional neural networks for the early diagnosis of the
      Alzheimer’s disease. Eng Appl Artif Intell 123:106254
Feng C, Elazab A, Yang P, Wang T, Zhou F, Hu H, Xiao X, Lei B (2019) Deep learning framework for
      Alzheimer’s disease diagnosis via 3D-CNN and FSBi-LSTM. IEEE Access 7:63605–63618
Feng Y, Xu J, Ji YM, Wu F (2021) LLM: learning cross-modality person re-identification via low-rank local
      matching. IEEE Signal Process Lett 28:1789–1793
Fletcher R, Díaz XS, Bajaj H, Ghosh-Jerath S (2017) Development of smart phone-based child health screen-
      ing tools for community health workers, IEEE Global Humanitarian Technology Conference (GHTC),
      San Jose, CA, USA, pp. 1–9
Gao XW, Hui R (2016) A deep learning based approach to classification of CT brain images, SAI Computing
      Conference (SAI), London, UK, pp. 28–31
Gnanasegar SM, Bhasuran B, Natarajan J (2020) A long short-term memory deep learning network for MRI
      based Alzheimer’s disease dementia classification. J Appl Bioinf Comput Biol 9(6):1–7
Gong Z, Chanmean M, Gu W (2024) Multi-scale hybrid attention integrated with vision transformers for
      enhanced image segmentation, 2nd International Conference on Algorithm, Image Processing and
      Machine Vision (AIPMV), Zhenjiang, China, pp. 180–184.
Grundman M, Petersen RC et al (2004) Mild cognitive impairment can be distinguished from Alzheimer
      disease and normal aging for clinical trials. Arch Neurol 61(1):59
Guarín DL, Wong JK, McFarland NR, Ramirez-Zamora A (2024) Characterizing disease progression
      in Parkinson’s disease from videos of the finger tapping test. IEEE Trans Neural Syst Rehabil Eng
      32:2293–2301
Hashemifar S, Iriondo C, Casey E, Hejrati M (2022) DeepAD: a robust deep learning model of Alzheimer's
      disease progression for real-world clinical applications, Preprint at arXiv:2203.09096
Huang Z, Zhu X, Ding M, Zhang X (2020) Medical image classification using a light-weighted hybrid neural
      network based on PCANet and densenet. IEEE Access 8:24697–24712
Huang L, Qin J, Zhou Y, Zhu F, Liu L, Shao L (2023) Normalization techniques in training DNNs: methodol-
      ogy, analysis and application. IEEE Trans Pattern Anal Mach Intell 45(8):10173–10196
Hussain E, Hasan M, Hassan SZ, Azmi TH, Rahman MA, Parvez MZ (2020) Deep learning based binary clas-
      sification for Alzheimer’s disease detection using brain MRI images, 2020 15th IEEE Conference on
      Industrial Electronics and Applications (ICIEA), Kristiansand, Norway, pp. 1115–1120.
Iqbal S, Qureshi AN, Li J, Mahmood T (2023) On the analyses of medical images using traditional machine
      learning techniques and convolutional neural networks. Archiv Comput Methods Eng 30:3173–3233
Jo T, Nho K, Saykin AJ (2019) Deep learning in Alzheimer’s disease: diagnostic classification and prognostic
      prediction using neuroimaging data. Front Aging Neurosci 11(220):1–14
Kaya M, Çetın-Kaya Y (2024) A novel deep learning architecture optimization for multiclass classification of
      Alzheimer’s disease level. IEEE Access 12:46562–46581
Kazemi A, Boostani R, Odeh M, AL-Mousa MR (2022) Two-layer SVM, towards deep statistical learn-
      ing, International Engineering Conference on Electrical, Energy, and Artificial Intelligence (EICEEAI),
      Zarqa, Jordan, pp. 1–6.
Khan YF, Kaushik B, Rahmani MKI, Ahmed ME (2022) Stacked deep dense neural network model to predict
      Alzheimer’s dementia using audio transcript data. IEEE Access 10:32750–32765
Khan K, Husain S, Nauryzbayev G, Hashmi M (2023) Development and evaluation of ANN, ACOR-ANN,
      ALO-ANN based small-signal behavioral models for GaN-on-Si HEMT, 30th IEEE International Con-
      ference on Electronics, Circuits and Systems (ICECS), Istanbul, Turkiye, pp. 1–4.
Koutkias VG, Chouvarda I, Triantafyllidis A, Malousi A, Giaglis GD, Maglaveras N (2010) A personalized
      framework for medication treatment management in chronic care. IEEE Trans Inf Technol Biomed
      14(2):464–472
Lahmiri S (2023) Integrating convolutional neural networks, kNN, and Bayesian optimization for effi-
      cient diagnosis of Alzheimer’s disease in magnetic resonance images. Biomed Signal Process Control
      80:104375
Li B, Xu K, Feng D, Mi H, Wang H, Zhu J (2019) Denoising convolutional autoencoder based B-mode ultra-
      sound tongue image feature extraction, IEEE International Conference on Acoustics, Speech and Signal
      Processing (ICASSP), Brighton, UK, pp. 7130–7134
Li J, Liu H, Li K, Shan K (2024) Heart sound classification based on two-channel feature fusion and dual
      attention mechanism, 5th International Conference on Computer Engineering and Application (ICCEA),
      Hangzhou, China, pp. 1294–1297.
Lu J, Zhang Q, Yang Z, Tu M (2019) A hybrid model based on convolutional neural network and long short-
      term memory for short-term load forecasting, 2019 IEEE Power & Energy Society General Meeting
      (PESGM), Atlanta, GA, USA, pp. 1–5
Malik I, Iqbal A, Gu YH, Al-antari MA (2024) Deep learning for Alzheimer’s disease prediction: a compre-
      hensive review. Diagnostics 14(12):1281
Manimurugan S (2020) Classification of Alzheimer’s disease from MRI images using CNN based pre-trained
      VGG-19 model. J Comput Sci Intell Technol. 1(2):34–41
Marcus DS, Fotenos AF, Csernansky JG, Morris JC, Buckner RL (2010) Open access series of imaging studies:
      longitudinal MRI data in nondemented and demented older adults. J Cogn Neurosci 22(12):2677–2684
Minoofam SAH, Bastanfard A, Keyvanpour MR (2023) TRCLA: a transfer learning approach to reduce
      negative transfer for cellular learning automata. IEEE Trans Neural Netw Learn Syst 34(5):2480–2489
Mohsen S, Elkaseer A, Scholz SG (2021a) Industry 4.0-oriented deep learning models for human activity
      recognition. IEEE Access 9:150508–150521
Mohsen S, Elkaseer A, Scholz SG (2021) Human activity recognition using K-nearest neighbor machine
      learning algorithm, 8th KES International Conference on Sustainable Design and Manufacturing, Croa-
      tia, pp. 304–313.
Mohsen S, Ali AM, El-Rabaie E-SM, ElKaseer A, Scholz SG, Hassan AMA (2023a) Brain tumor classifica-
      tion using hybrid single image super-resolution technique with ResNext101_32× 8d and VGG19 pre-
      trained models. IEEE Access 11:55582–55595
Mohsen S, Bajaj M, Kotb H, Pushkarna M, Alphonse S, Ghoneim SSM (2023b) Efficient artificial neural
      network for smart grid stability prediction. Int Trans Electr Energy Syst 2023:1–13
Mohsen S, Ali AM, Emam A (2024) Automatic modulation recognition using CNN deep learning models.
      Multimed Tools Appl 83:7035–7056
Morris T, Liu Z, Liu L, Zhao X (2024) Using a convolutional neural network and explainable AI to diagnose
      dementia based on MRI scans. Preprint at arXiv:2406.18555
Mukherjee P, Lall B, Shah A (2015) Saliency map based improved segmentation, IEEE International Confer-
      ence on Image Processing (ICIP), Quebec City, QC, Canada, pp. 1290–1294.
Nagabushanam P, George ST, Radha S (2020) EEG signal classification using LSTM and improved neural
      network algorithms. Soft Comput 24(13):9981–10003
Nanthini K, Sivabalaselvamani D, Chitra K, Gokul P, KavinKumar S, Kishore S (2023) A survey on
      data augmentation techniques, 7th International Conference on Computing Methodologies and Com-
      munication, Erode, India, pp. 913–920.
Nanthini K, Tamilarasi A, Sivabalaselvamani D, Suresh P (2024) Automated classification of Alzheimer’s
      disease based on deep belief neural networks. Neural Comput Appl 36:7405–7419
Nehal TH, Khan AA, Shifa SA, Saiyara L, Hossain U, Islam AE (2023) A Shuffling building block and
      augmentation parameter tuning techniques to handle small medical dataset, International Conference
      on Electrical, Computer and Communication Engineering (ECCE), Chittagong, Bangladesh, 2023, pp.
      1–6.
Patil V, Nisha SL (2021) Detection of Alzheimer’s disease using machine learning and image processing,
      2021 International Conference on Smart Generation Computing, Communication and Networking
      (SMART GENCON), Pune, India, pp. 1–5.
Pei Z, Gou Y, Ma M, Guo M, Leng C, Chen Y, Li J (2022) Alzheimer’s disease diagnosis based on long-range
      dependency mechanism using convolutional neural network. Multimed Tools Appl 81(25):36053–36068
PJ K, RK, SN, SDP (2023) Secure and enhanced medical data repository, 6th International Conference on
      Contemporary Computing and Informatics (IC3I), Gautam Buddha Nagar, India, pp. 727–732.
Prakash S, Jalal AS, Pathak P (2023) Forecasting COVID-19 pandemic using prophet, LSTM, hybrid GRU-
      LSTM, CNN-LSTM, Bi-LSTM and stacked-LSTM for India, 6th International Conference on Informa-
      tion Systems and Computer Networks (ISCON), Mathura, India, pp. 1–6.
Quach L-D, Quoc KN, Quynh AN, Thai-Nghe N, Nguyen TG (2023) Explainable deep learning models with
      gradient-weighted class activation mapping for smart agriculture. IEEE Access 11:83752–83762
Rahmatillah I, Astuty E, Sudirman ID (2023) An improved decision tree model for forecasting consumer
      decision in a medium groceries store, IEEE 17th International Conference on Industrial and Information
      Systems (ICIIS), Peradeniya, Sri Lanka, pp. 245–250.
Ramani R, Ganesh SS, Rao SPVS, Aggarwal N (2024) Integrated multi-modal 3D-CNN and RNN approach
      with transfer learning for early detection of Alzheimer’s disease. Iran J Sci Technol Trans Electr Eng.
      https://doi.org/10.1007/s40998-024-00769-z
Ramteke N, Maidamwar P (2023) Cardiac patient data classification using ensemble machine learning tech-
     nique, 14th International Conference on Computing Communication and Networking Technologies
     (ICCCNT), Delhi, India, 2023, pp. 1–6.
Raza N, Naseer A, Tamoor M, Zafar K (2023) Alzheimer disease classification through transfer learning
     approach. Diagnostics 13(4):801
Resmi S, Singh T, Singh RP, Kumar P (2023) Skull stripping in magnetic resonance imaging of brain using
     semantic segmentation, 14th International Conference on Computing Communication and Networking
     Technologies (ICCCNT), Delhi, India, pp. 1–7.
Sahu S, Sarma H, Jyoti Bora D (2018) Image segmentation and its different techniques: an in-depth analysis,
     International Conference on Research in Intelligent and Computing in Engineering (RICE), San Salva-
     dor, El Salvador, pp. 1–7.
Sakuma I (2013) Education and training in regulatory science for medical device development, 35th Annual
     International Conference of the IEEE Engineering in Medicine and Biology Society (EMBC), Osaka,
     Japan, pp. 3155–3158.
Salah SBH, Chouchene M, Zayene MA, Sayadi FE (2024) Classification of Alzheimer's diseases from PET
     images using a convolutional neural network. International Conference on Control, Automation and
     Diagnosis (ICCAD), Paris, France, pp. 1–5.
Sang Y, Li W (2024) Classification study of Alzheimer’s disease based on self-attention mechanism and DTI
     imaging using GCN. IEEE Access 12:24387–24395
Savaş S (2022) Detecting the stages of Alzheimer’s disease with pre-trained deep learning architectures. Arab
     J Sci Eng 47(2):2201–2218
Sharma S, Guleria K, Tiwari S, Kumar S (2022) A deep learning based convolutional neural network model
     with VGG16 feature extractor for the detection of Alzheimer disease using MRI scans. Meas Sens
     24:100506
Shastry KA (2024) Deep learning-based classification of Alzheimer’s disease using MRI scans: a customized
     convolutional neural network approach. SN Comput Sci 5:917
Shuvo MMH, Ahmed N, Islam H, Alaboud K, Cheng J, Mosa ASM, Islam SK (2022) Machine learning
     embedded smartphone application for early-stage diabetes risk assessment, IEEE International Sympo-
     sium on Medical Measurements and Applications (MeMeA), Messina, Italy, pp. 1–6.
Skolariki K, Exarchos T, Vlamos P (2020) Contributing factors to Alzheimer’s disease and biomarker iden-
     tification techniques, 5th South-East Europe Design Automation, Computer Engineering, Computer
     Networks and Social Media Conference (SEEDA-CECNSM), Corfu, Greece, pp. 1–8.
Slimi H, Balti A, Abid S, Sayadi M (2024) A combinatorial deep learning method for Alzheimer’s disease
     classification-based merging pretrained networks. Front Comput Neurosci 18(1444019):1–13
Sonka M, Grunkin M (2002) Image processing and analysis in drug discovery and clinical trials. IEEE Trans
     Med Imaging 21(10):1209–1211
SS, CS, B. U, (2023) sMRI classification of Alzheimer's disease using genetic algorithm and multi-instance
     learning (GA+MIL), International Conference on Electrical, Electronics, Communication and Comput-
     ers (ELEXCOM), Roorkee, India, pp. 1–4.
Sun H, Wang A, Wang W, Liu C (2021) An improved deep residual network prediction model for the early
     diagnosis of Alzheimer’s disease. Sensors 21(12):4182
Sungura R, Onyambu C, Mpolya E, Sauli E, Vianney J-M (2021) The extended scope of neuroimaging and
     prospects in brain atrophy mitigation: a systematic review. Interdiscip Neurosurg 23:100875
Swami A, T. V, (2023) Multi-label tabular synthetic data generation for bundle recommendation problem,
     2023 IEEE 2nd International Conference on Data, Decision and Systems (ICDDS), Mangaluru, India,
     pp. 1–6.
Tiwari Y, Rasool A, Hajela G (2020) Machine learning with generative adverserial network, Second Inter-
     national Conference on Inventive Research in Computing Applications (ICIRCA), Coimbatore, India,
     pp. 543–548.
Tong T, Wolz R, Gao Q, Guerrero R, Hajnal JV, Rueckert D (2024) Multiple instance learning for classifica-
     tion of dementia in brain MRI. Med Image Anal 18(5):808–818
Turrisi R, Verri A, Barla A (2024) Deep learning-based Alzheimer’s disease detection: reproducibility and the
     effect of modeling choices. Front Comput Neurosci 18(1360095):1–13
Vijay V, Verma P (2023) Variants of Naïve bayes algorithm for hate speech detection in text documents,
     International Conference on Artificial Intelligence and Smart Communication (AISC), Greater Noida,
     India, pp. 18–21.
Wu J, Zhang Y, Wang K, Tang X (2019) Skip connection U-net for white matter hyperintensities segmenta-
     tion from MRI. IEEE Access 7:155194–155202
Wu Q, Xie Q, Xia D (2024) Application of computer vision and deep learning neural network in multi-modal
     information fusion, International Conference on Data Science and Network Security (ICDSNS), Tiptur,
     India, pp. 1–5.
Yan L, Song K (2010) Design of ARM-based telemedicine consultation system, International Conference on
     Biomedical Engineering and Computer Science, Wuhan, China, pp. 1–4.
Yang D, Wang R, Song C (2024) Classification of Alzheimer's disease using fMRI-based brain functional
     network data, 6th Asia Symposium on Image Processing, Tianjin, China, pp. 97–101.
Zhang S, Hu J, Bao Z, Wu J (2013) Prediction of spectrum based on improved RBF neural network in
     cognitive radio, International Conference on Wireless Information Networks and Systems (WINSYS),
     Reykjavik, Iceland, pp. 1–5.
Zou Z, Chen K, Shi Z, Guo Y, Ye J (2023) Object detection in 20 years: a survey. Proc IEEE 111(3):257–276
Publisher's Note Springer Nature remains neutral with regard to jurisdictional claims in published maps and
institutional affiliations.