DeepFake Detection Using DenseNet-121
Amruthendu S Menon, Rasmi Raju K S, Lakshmi Varma
Department of Computer Science and IT, School of Computing,
Amrita Vishwa Vidyapeetham, Kochi, India
kh.sc.i5mca20020@asas.kh.amrita.edu, kh.sc.i5mca20025@asas.kh.amrita.edu, kh.sc.i5mca20014@asas.kh.amrita.edu

Anisha G S
Department of Computer Science and IT, School of Computing,
Amrita Vishwa Vidyapeetham, Kochi, India
anisha@kh.amrita.edu

Abstract—Image manipulation with advanced editing and generation tools has become a serious problem. This paper presents a DenseNet-121-based method for classifying images as real or fake. The pipeline includes data preprocessing, dataset organization via CSV metadata, data augmentation, and training a fine-tuned convolutional neural network (CNN) to achieve high classification accuracy. Our implementation achieved a test accuracy of 97%.

Keywords—DeepFake Detection, DenseNet-121, Real Image, Fake Image, Image Classification

I. INTRODUCTION

With the rise of social media and affordable computing technologies, creating manipulated media such as fake images has become easier and more accessible. These manipulations can spread misinformation, with significant social, political, and financial implications. Developing efficient detection systems is critical to counter this problem. Traditional machine learning systems often depend on manually created features, which can make it hard for them to work well across different datasets. Neural networks can identify and learn key patterns at various levels, enabling high performance in image classification tasks.

This paper proposes a method using DenseNet121, a pre-trained CNN, fine-tuned for binary classification (real vs. fake). The framework involves preprocessing, dataset organization, data augmentation, and evaluation on a carefully split dataset.

II. RELATED WORK

Recent developments in deepfake technology have generated serious concerns related to cybersecurity, misinformation, and digital authenticity. Deepfakes, powered by advanced machine learning techniques such as Generative Adversarial Networks (GANs), make it incredibly difficult to determine what is real and what is fake. Countering deepfake-generated content calls for effective detection mechanisms, and numerous research efforts have been initiated for this purpose. This section critically reviews eleven studies that have contributed to deepfake image detection through different deep learning architectures and approaches.

One study combines Error Level Analysis (ELA) with a CNN for feature extraction and SVM/KNN for classification. Using a Residual Network, it achieves 89.5% accuracy in detecting manipulated images. The research emphasizes the need for robust detection to combat misinformation and its societal impact. [1]

Another paper introduces a Deep Fake Predictor (DFP) that integrates VGG16 with a CNN for enhanced deepfake detection. The model achieves 95% precision and 94% accuracy on a real and fake faces dataset. By combining VGG16's feature extraction with the CNN's classification strength, it effectively identifies manipulated images. The study emphasizes the need to integrate deepfake detection into cybersecurity frameworks to prevent misuse. [2]

A further study introduces an enhanced D-CNN model for deepfake detection, improving feature learning and efficiency. It achieves 94.67% accuracy on datasets such as AttGAN, GDWCT, and StyleGAN. The model enhances interpretability, robustness, and adaptability to evolving deepfake techniques. [3]

Another approach, CFFN with contrastive loss, compares real and fake images to identify unique manipulation patterns, improving precision and robustness. Future work aims to extend these methods to deepfake video detection. [4]

The paper introduces FF-LBPH, combining Fisher Face and Local Binary Pattern Histogram for deepfake detection.
It addresses inefficiencies in current methods by improving feature extraction and classification. Kalman filtering enhances facial recognition accuracy. The approach is validated across multiple datasets. [5]

This research explores Vision Transformers (ViTs) for deepfake detection, introducing a multiclass classification approach. ViTs outperform CNNs like ResNet-50 and VGG-16 by capturing global image features and improving robustness. Testing on 40,000 images shows high accuracy across different deepfake types. The research emphasizes the capability of Vision Transformers for large-scale, real-time recognition. [6]

The paper proposes a deepfake detection method using Decoupled Dynamic Convolution (DDC) for adaptive feature extraction. DDC improves accuracy while reducing computational costs, achieving 91.8% accuracy on ProGAN and StyleGAN2 datasets. The model adapts effectively to different deepfake techniques and scenarios and supports real-time applications. Future work aims to expand detection to video content. [7]

This study introduces EfficientNet-B3 for deepfake detection by analyzing facial edge artifacts. The model improves accuracy while reducing computational overhead and integrates Explainable AI (XAI) techniques like LRP and LIME for transparency. Trained on the FaceForensics++ and DeepFakeDetection datasets, it leverages transfer learning for optimal performance. The approach enhances both detection accuracy and interpretability for cybersecurity applications. [8]

This research applies Explainable AI (XAI) techniques to deepfake detection using XceptionNet. It highlights key facial regions responsible for classification, improving model transparency and reliability. Results show robustness against noise and transformations, confirming synthetic artifacts around the nose, mouth, and jawline as key indicators. Future work aims to extend XAI methods to video deepfake detection. [9]

The study presents a CNN-based deepfake detection framework using Xception Net and transfer learning. The model was trained on multiple datasets with a cyclical learning rate and Adam optimization, demonstrating strong performance. However, challenges remain in detecting high-quality deepfakes, emphasizing the evolving nature of AI-generated forgeries. Future research focuses on multimodal approaches and real-time detection. [10]

This approach explores deepfake detection using PCA for feature selection and SVM for classification. PCA reduces feature overlap, achieving 96.8% accuracy on a Kaggle dataset. However, preprocessing choices impact results, requiring careful integration for optimal performance. The study confirms PCA's effectiveness but emphasizes the need for refined preprocessing methods. [11]

III. PROPOSED APPROACH

Our study focuses on deepfake detection with a transfer learning-based CNN framework. The study begins with data preparation, where images are organized, labeled, and divided into training, validation, and test sets. Preprocessing includes rescaling, augmentation, and batching to increase model performance. The model is built using DenseNet121, a pre-trained deep learning architecture, with custom layers added for fine-tuning. The training phase involves optimizing the model using callbacks like early stopping and learning rate reduction. Finally, the evaluation phase assesses the model's accuracy, loss, and predictive performance using metrics like confusion matrices and classification reports, ensuring robust deepfake detection.

Fig. 1. Workflow diagram

A. Preprocessing

The dataset consists of images categorized into two classes: real and fake [Fig. 2]. Images are organized into separate folders for training, testing, and validation, with metadata stored in a CSV file for easy management. The CSV file includes the image path and a binary label ("real" or "fake"). A Python script is used to label the dataset and generate the CSV file, ensuring consistency by mapping folder names to standardized labels. The dataset is split into training (70%), validation (15%), and testing (15%) subsets using proportional sampling. To improve generalization, data augmentation is applied to the training set using TensorFlow's ImageDataGenerator, including rescaling (normalizing pixel values to [0,1]) and horizontal flipping for robustness. Validation and test sets undergo only rescaling to maintain integrity.

Fig. 2. Dataset

B. Dataset Description

The dataset, developed by the Computational Intelligence and Photography Lab at Yonsei University, is valuable for training and evaluating deepfake detection models. It is specifically designed to distinguish between authentic and manipulated facial images, aiding in image forensics. The dataset is structured into two primary directories: training_real, which contains genuine face images, and training_fake, which includes manipulated face images.

C. Model Description

DenseNet121, a pre-trained network for image recognition, is adapted for binary classification by adding custom layers. A GlobalAveragePooling2D layer replaces fully connected layers, reducing spatial dimensions and helping to limit overfitting. A Dense(512, ReLU) layer extracts high-level features, while BatchNormalization stabilizes learning and accelerates training. A Dropout(0.3) layer counters overfitting by randomly deactivating 30% of neurons during training. The final Dense(1, sigmoid) layer outputs a probability score for classification. The model is built with binary cross-entropy as the objective function to minimize classification errors. The Adam optimizer, which combines ideas from AdaGrad and RMSProp, supports faster and more stable convergence during training.

D. Training and Optimization

Our study uses a structured training methodology to optimize the performance of our real and fake face classification model, which is based on DenseNet121. We have integrated several techniques, including adaptive learning mechanisms, early stopping, dynamic learning rate adjustments, and model checkpointing, to achieve high accuracy while preventing overfitting.

An important optimization strategy in this work is the use of callbacks, which are automated functions executed at specific stages of training to enhance model efficiency. The first callback is ModelCheckpoint, which saves the best-performing model according to validation loss. This allows us to revert to the model with the lowest validation loss if overfitting occurs in later epochs; otherwise, the best version of the model could be lost through unnecessary continued training. The second callback is EarlyStopping, which monitors the validation loss and stops training when no improvement is seen for a defined number of epochs. In this experiment, we set a patience of five epochs; that is, if the model does not improve for five consecutive epochs, training stops. This eliminates unnecessary computation and reduces the chance of overfitting. The third callback is ReduceLROnPlateau, which dynamically reduces the learning rate when the model's performance has plateaued. If the validation loss does not improve for three consecutive epochs, the learning rate is scaled by a factor of 0.2, so that finer weight adjustments can be made in later stages of training, which helps the model escape local minima and converge more efficiently.

The model was trained with the binary cross-entropy (log loss) function, which is suitable for the binary task of distinguishing between real and fake faces. This function measures the difference between the estimated probabilities and the true labels, enabling the model to refine its predictions. The Adam optimizer is used to improve training: it dynamically adjusts the learning rate based on past gradients, which leads to stable convergence. Learning progress is monitored using accuracy as the primary evaluation metric. Together, these choices define a structured pipeline with explicit hyperparameters. The model was trained for 10 epochs with a batch size of 64 to keep computation efficient while conserving memory. The dataset is fed into the DenseNet121 architecture and forward propagation takes place, during which the model extracts hierarchical features from the images. The predicted values are then compared against the actual labels using binary cross-entropy loss, after which backpropagation updates the model weights based on the computed gradients. The Adam optimizer applies these weight adjustments without causing instability in the training process [Fig. 3].

Fig. 3. Key Training Parameters

E. Model Evaluation Metrics

The proposed model was evaluated using training, validation, and test datasets to assess its accuracy, loss, and classification performance. The confusion matrix offers a comprehensive analysis of classification outcomes: 998 fake images were correctly identified as fake (True Negatives), while 20 fake images were incorrectly classified as real (False Positives). Additionally, 981 real images were accurately classified as real (True Positives), whereas 42 real images were mistakenly identified as fake (False Negatives). To further assess classification performance, we calculate precision, recall, the F1-score, and accuracy:

$$P = \frac{TP}{TP + FP} \tag{1}$$

Precision (P) measures the percentage of predicted positive cases that are actually positive.

$$R = \frac{TP}{TP + FN} \tag{2}$$

Recall (R) represents the fraction of actual positive instances correctly identified.

$$F1\text{-}Score = \frac{2 \cdot P \cdot R}{P + R} \tag{3}$$

The F1-Score is the harmonic mean of precision and recall, maintaining a balanced trade-off between the two.

$$A = \frac{TP + TN}{TP + TN + FP + FN} \tag{4}$$

Accuracy (A) measures the overall correctness of the model's predictions.
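
As a quick arithmetic check, the four metrics defined above can be evaluated directly from the confusion-matrix counts stated in this section (TP = 981, TN = 998, FP = 20, FN = 42, with "real" as the positive class). The sketch below is illustrative only; the function name `classification_metrics` is an assumption and is not taken from the authors' code.

```python
def classification_metrics(tp: int, tn: int, fp: int, fn: int) -> dict:
    """Evaluate precision, recall, F1-score and accuracy from raw counts."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    return {"precision": precision, "recall": recall, "f1": f1, "accuracy": accuracy}


# Counts as stated in the text, treating "real" as the positive class.
metrics = classification_metrics(tp=981, tn=998, fp=20, fn=42)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

With these counts the accuracy evaluates to roughly 0.97, in line with the reported test accuracy. Values derived this way can differ somewhat from rounded figures quoted elsewhere in the paper.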
Confusion Matrix: The confusion matrix offers a detailed
breakdown of the model’s predictions, displaying the counts
of true positives (TP), true negatives (TN), false positives
(FP), and false negatives (FN).
                     Predicted Positive    Predicted Negative
Actual Positive              TP                    FN
Actual Negative              FP                    TN

Fig. 5. model accuracy
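
The four cells of the confusion matrix can be obtained by thresholding the model's sigmoid outputs and comparing them with the ground-truth labels. The following is a minimal sketch, assuming "real" is encoded as 1 (positive), "fake" as 0, and a decision threshold of 0.5. The label encoding, the threshold, the function name `confusion_counts`, and the example values are assumptions for illustration, not details confirmed by the paper.

```python
from typing import Sequence, Tuple


def confusion_counts(y_true: Sequence[int], y_prob: Sequence[float],
                     threshold: float = 0.5) -> Tuple[int, int, int, int]:
    """Return (TP, TN, FP, FN) for binary labels and predicted probabilities."""
    tp = tn = fp = fn = 0
    for label, prob in zip(y_true, y_prob):
        predicted = 1 if prob >= threshold else 0  # assumed decision rule
        if predicted == 1 and label == 1:
            tp += 1
        elif predicted == 0 and label == 0:
            tn += 1
        elif predicted == 1 and label == 0:
            fp += 1
        else:
            fn += 1
    return tp, tn, fp, fn


# Toy example with made-up values: 1 = real, 0 = fake.
labels = [1, 1, 0, 0, 1, 0]
probs = [0.92, 0.40, 0.08, 0.71, 0.66, 0.30]
print(confusion_counts(labels, probs))
```

Changing the threshold trades false positives against false negatives, which is why precision and recall are reported together rather than accuracy alone.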
F. Visualization and Interpretation
The model's behavior is visualized through accuracy and loss curves plotted with matplotlib.pyplot. These curves show how learning progresses over the training epochs. The accuracy graph shows training accuracy improving steadily, with validation accuracy stabilizing close to it, indicating effective generalization [Fig. 5]. The loss curve further supports this, showing a decline in both training and validation loss with minor fluctuations, suggesting that the model is learning without significant overfitting.
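
One simple way to quantify how closely the training and validation curves track each other is to compare the two losses epoch by epoch. The sketch below assumes a Keras-style history dictionary with `loss` and `val_loss` lists. The helper names `generalization_gap` and `best_epoch` and the per-epoch numbers are hypothetical placeholders, not values recorded in this study.

```python
from typing import Dict, List


def generalization_gap(history: Dict[str, List[float]]) -> List[float]:
    """Per-epoch difference between validation loss and training loss."""
    return [val - train for train, val in zip(history["loss"], history["val_loss"])]


def best_epoch(history: Dict[str, List[float]]) -> int:
    """1-based index of the epoch with the lowest validation loss."""
    val_loss = history["val_loss"]
    return val_loss.index(min(val_loss)) + 1


# Hypothetical curves for illustration only.
history = {
    "loss": [0.60, 0.35, 0.22, 0.15, 0.10],
    "val_loss": [0.55, 0.40, 0.30, 0.16, 0.12],
}
gaps = generalization_gap(history)
print([round(g, 2) for g in gaps])
print(best_epoch(history))
```

A small, stable gap is consistent with limited overfitting, whereas a gap that widens over the epochs would suggest the opposite.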
Fig. 6. model loss
The model loss graph depicts how the error decreases over
multiple epochs. The X-axis represents the number of epochs,
while the Y-axis indicates the loss value. The blue curve
corresponds to the training phase, while the orange curve
represents the validation phase.
By the fourth epoch, the validation error drops considerably, aligning closely with the training trend. This suggests an improvement in the model's ability to generalize effectively.

The consistently low loss values suggest that the model effectively reduces errors and captures meaningful patterns. The alignment of both curves implies minimal overfitting, indicating a balance between learning from training data and adapting to new, unseen inputs [Fig. 6].

Fig. 4. confusion matrix

The confusion matrix graphically illustrates the classification accuracy, showcasing correctly identified real and fake instances. Most predictions are accurate, with 981 true positives and 998 true negatives, while the errors comprise 20 false positives and 42 false negatives [Fig. 4].

The DenseNet121 model extracts features at multiple levels to effectively distinguish between real and fake images. At the low level, it identifies basic structural details such as edges, texture patterns, simple shapes, and color gradients, which help in detecting surface details and shading variations. Moving to the mid level, the model focuses on more intricate features like facial parts, complex textures, lighting variations, and artifacts, allowing it to recognize inconsistencies in skin texture and illumination effects. At the high level, it evaluates complete face structures, symmetry, and proportions while also detecting deepfake artifacts such as blurring, warping, and pixelation. The final step, known as feature fusion, enables the model to make a conclusive decision on whether an image is real or fake. These features are automatically extracted by the DenseNet121 model through multiple convolutional layers, supporting a detailed analysis for deepfake detection [Fig. 7].

Fig. 7. Features Extracted

IV. RESULTS

The model showed strong proficiency in distinguishing between real and fake images, achieving a training accuracy of 97.70%. This high accuracy demonstrates the model's ability to identify key patterns and features in the dataset. The recorded training loss of 0.0660 indicates that the predictions are well aligned with the actual labels. The test accuracy was approximately 97%, confirming that the model performs effectively on unseen data. A test accuracy close to the training accuracy indicates that the model is not overfitting, suggesting it has learned generalized patterns rather than memorizing training data. With a validation loss of 0.07, the model demonstrates effective training, avoiding both excessive overfitting and underfitting.

Additional evaluation metrics provide further insight into the model's effectiveness. The precision was recorded at 0.980, meaning that 98% of the images predicted as positive were actually positive. A high precision score signifies few false positives, reducing misclassification between real and fake images. The recall was recorded at 0.966, indicating that 96.6% of the actual positive images were correctly detected and showing the model's efficiency in identifying relevant cases. Additionally, the F1-score reached 0.973, acting as a combined measure of these two metrics and providing a holistic evaluation of the model's effectiveness, particularly in cases of class imbalance. This high F1-score confirms that the model maintains a good trade-off between precision and recall (sensitivity), so that misclassified positive and negative instances remain minimal.

V. CONCLUSION AND FUTURE WORKS

In this research, a deep learning-driven approach was designed to classify real and fake images using the DenseNet121 architecture. The model attained a test accuracy of 97.00%, showing its effectiveness in differentiating between genuine and manipulated images. As far as we know, this is the first time this real and fake face image detection dataset has been used with DenseNet121, marking a novel use of the architecture. The high test accuracy further emphasizes the model's reliability for detecting real and fake images. The model's success can be credited to systematic data preprocessing, data augmentation strategies, and the feature-sharing capability of DenseNet121. Furthermore, dropout layers and batch normalization contributed significantly to reducing overfitting and enhancing generalization.

Although the model delivers high accuracy, there is scope to improve its real-world applicability. One crucial direction for future research is real-time deepfake detection in video streams, where temporal inconsistencies need to be captured using methods such as Recurrent Neural Networks (RNNs) or 3D convolutional layers. Additionally, optimizing the model's speed and computational efficiency is essential to enable low-latency predictions for real-time use. In summary, this study underscores the role of deep learning in digital forensics and media integrity. By advancing real-time detection capabilities, expanding dataset diversity, and optimizing deployment for mobile platforms, the model has the potential to become a valuable asset in combating deepfake technology and protecting the authenticity of digital content across multiple domains.

REFERENCES

[1] R. Rafique, M. Nawaz, H. Kibriya, and M. Masood, “Deepfake detection using error level analysis and deep learning,” in 2021 4th International Conference on Computing & Information Sciences (ICCIS). IEEE, 2021, pp. 1–4.
[2] A. Raza, K. Munir, and M. Almutairi, “A novel deep learning approach for deepfake image detection,” Applied Sciences, vol. 12, no. 19, p. 9820, 2022.
[3] Y. Patel, S. Tanwar, P. Bhattacharya, R. Gupta, T. Alsuwian, I. E. Davidson, and T. F. Mazibuko, “An improved dense CNN architecture for deepfake image detection,” IEEE Access, vol. 11, pp. 22 081–22 095, 2023.
[4] C.-C. Hsu, Y.-X. Zhuang, and C.-Y. Lee, “Deep fake image detection based on pairwise learning,” Applied Sciences, vol. 10, no. 1, p. 370, 2020.
[5] S. Suganthi, M. U. A. Ayoobkhan, N. Bacanin, K. Venkatachalam, H. Štěpán, T. Pavel et al., “Deep learning model for deep fake face recognition and detection,” PeerJ Computer Science, vol. 8, p. e881, 2022.
[6] M. A. Arshed, S. Mumtaz, M. Ibrahim, C. Dewi, M. Tanveer, and S. Ahmed, “Multiclass AI-generated deepfake face detection using patch-wise deep learning model,” Computers, vol. 13, no. 1, p. 31, 2024.
[7] J. Li and B. Li, “Deep fake image detection based on decoupled dynamic convolution,” in 2022 4th International Conference on Frontiers Technology of Information and Computer (ICFTIC). IEEE, 2022, pp. 27–31.
[8] Z. Deng, B. Zhang, S. He, and Y. Wang, “Deepfake detection method based on face edge bands,” in 2022 9th International Conference on Digital Home (ICDH). IEEE, 2022, pp. 251–256.
[9] B. Malolan, A. Parekh, and F. Kazi, “Explainable deep-fake detection using visual interpretability methods,” in 2020 3rd International Conference on Information and Computer Technologies (ICICT). IEEE, 2020, pp. 289–293.
[10] P. Ranjan, S. Patil, and F. Kazi, “Improved generalizability of deep-fakes detection using transfer learning based CNN framework,” in 2020 3rd International Conference on Information and Computer Technologies (ICICT). IEEE, 2020, pp. 86–90.
[11] M. S. M. Altaei et al., “Detection of deep fake in face images based machine learning,” Al-Salam Journal for Engineering and Technology, vol. 2, no. 2, pp. 1–12, 2023.