Deep Fake Detection System

The document outlines a Deep Fake Detection System designed to identify AI-generated synthetic media using a combination of CNN, LSTM, and Ensemble Learning for high accuracy. It emphasizes the importance of preserving digital trust and authenticity in media, while detailing the system's architecture, methodologies, and applications across various sectors. The system features a user-friendly interface for media uploads and provides real-time analysis of content authenticity.


DEEP FAKE DETECTION SYSTEM

Safeguarding Truth in the Age of Deepfakes

Members:
Shashank Mishra (2201221540044)
Smriti Pandey (2201221540049)
INTRODUCTION
• Deepfakes are AI-generated synthetic media (images, audio,
or video) created using Generative Adversarial Networks
(GANs) and other deep learning techniques.
• They are highly realistic and can manipulate human faces,
voices, and actions convincingly.
• While deepfake technology has positive applications
(entertainment, education, accessibility), it also poses
serious threats such as misinformation, identity theft, and
non-consensual content.
• The need for an automated Deep Fake Detection System
has become crucial to preserve digital trust and ensure
media authenticity.
• Our system uses CNN (spatial analysis) + LSTM (temporal
analysis) + Ensemble Learning to detect manipulated
content effectively.
• Provides a user-friendly interface where users can upload
media and get results with confidence scores.
Literature Survey: Laying the Foundation
Our project builds upon a rich body of research, leveraging advancements in computer vision and deep learning. Key works include:

1. FaceForensics++ Dataset (Rossler et al., 2019) → Benchmark dataset and CNN-based detection of manipulated facial images.
2. MesoNet (Afchar et al., 2018) → Compact CNN for facial forgery detection, efficient enough for real-time use.
3. CNN + LSTM Hybrid (Güera & Delp, 2018) → Combines spatial (CNN) and temporal (LSTM) features for video deepfake detection.
4. DeepFake Detection Challenge (Dolhansky et al., 2020) → Large-scale dataset for testing model generalization on diverse manipulations.
5. GAN Fingerprints (Yu et al., 2019) → Revealed unique generative patterns in GAN-based images for improved detection.
Project Aims: Building a Robust System

• Reliable Detection → Develop an AI-based system that identifies manipulated videos and images with high accuracy.
• Smart Technologies (CNN + LSTM + Ensemble Learning) → Use deep learning models to analyze spatial and temporal features of media content.
• Dataset Utilization → Leverage benchmark datasets (FaceForensics++, DFDC) to train and evaluate detection models.
• Real-time Analysis → Ensure the system can process and detect deepfakes efficiently for instant results.
• User-friendly Interface → Provide a simple interface for users to upload media and view authenticity results.
Project Objectives: Key Milestones

• Develop CNN Model → Build a Convolutional Neural Network to extract spatial features from facial images.
• Integrate LSTM Model → Capture temporal inconsistencies in videos, such as lip-sync or blinking patterns.
• Ensemble Learning → Combine CNN and LSTM predictions for improved accuracy and robustness.
• Performance Evaluation → Validate the system using accuracy, precision, recall, and F1-score on benchmark datasets.
• User Interface Development → Design a simple platform for uploading media and displaying detection results.
• Evaluation Metrics → Establish robust metrics for system performance assessment.
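The evaluation metrics named above (accuracy, precision, recall, F1-score) can all be computed from confusion-matrix counts. A minimal sketch in plain Python, where the function name and the example counts are illustrative, not taken from the system:

```python
def evaluate(tp, fp, fn, tn):
    """Standard detection metrics from confusion-matrix counts.

    tp: fakes correctly flagged, fp: reals wrongly flagged,
    fn: fakes missed, tn: reals correctly passed.
    """
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}

# Hypothetical run: 90 fakes caught, 10 missed, 5 false alarms, 95 reals passed
metrics = evaluate(tp=90, fp=5, fn=10, tn=95)
```

Reporting all four metrics matters here because deepfake benchmarks are often imbalanced, so accuracy alone can look deceptively high.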
Proposed Methodology: A Step-by-Step Approach
Our system will follow a structured, multi-stage process to ensure accurate and efficient deepfake detection.

Data Collection
Gathering benchmark datasets (FaceForensics++, DFDC) of real and manipulated media.

Preprocessing
Extracting video frames and normalizing images for model input.

Feature Extraction (CNN)
Capturing spatial artifacts in individual frames.

Temporal Analysis (LSTM)
Detecting inconsistencies across frame sequences.

Ensemble & Scoring
Combining model predictions into an authenticity verdict with a confidence score.

Integration & Deployment
Developing user-facing tools and deploying the system.
Block Diagram
• Input Media: The system accepts images and video clips for
verification. Video data is divided into frames for analysis.
• Preprocessing: Includes resizing, normalization, and augmentation
to prepare data for model input. For videos, frames are extracted at
a fixed rate.
• Feature Extraction (CNN): CNN models (e.g., Xception, ResNet) are
used to extract spatial features from frames to detect visual
artifacts.
• Temporal Analysis (LSTM): LSTM networks analyze sequential
frames to identify temporal anomalies such as unnatural blinking or
inconsistent lip movements.
• Ensemble Classifier: Combines CNN and LSTM predictions using
ensemble techniques (e.g., weighted averaging) to improve
detection accuracy.
• Output Result: The system outputs whether the content is authentic
or fake, along with a confidence score.
• User Interface: A web/desktop-based interface that allows users to
upload files and view results in a user-friendly manner.
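The ensemble step above mentions weighted averaging of the CNN and LSTM outputs; a minimal sketch of that combination, where the weights and threshold are illustrative placeholders that would in practice be tuned on a validation set:

```python
def ensemble_score(cnn_prob, lstm_prob, w_cnn=0.6, w_lstm=0.4):
    """Weighted average of per-model 'fake' probabilities.

    The 0.6/0.4 weights are hypothetical; tune them on validation data.
    """
    return w_cnn * cnn_prob + w_lstm * lstm_prob

def classify(cnn_prob, lstm_prob, threshold=0.5):
    """Return a label and confidence score, as in the output stage."""
    score = ensemble_score(cnn_prob, lstm_prob)
    label = "FAKE" if score >= threshold else "REAL"
    return label, score
```

Weighted averaging lets a strong frame-level signal (CNN) outvote a weak temporal one (LSTM) or vice versa, which is why it tends to be more robust than either model alone.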
Technologies Used
Frontend Development
• HTML5, CSS3, JavaScript → For structuring and styling the user interface.
• React.js (or Angular) → Interactive and dynamic web application design.
• Flask/Django Templates → Simple UI integration with backend logic.

Backend Development
• Python → Core programming language for AI/ML model implementation.
• TensorFlow / PyTorch → Deep learning frameworks for CNN & LSTM model training.
• Scikit-learn → Ensemble learning, feature extraction, and evaluation.
• Flask / Django → Server-side framework to handle requests and model inference.
• SQLite / MySQL → Database for storing user data, logs, and results.
System Requirements
Software Stack
• Python: Version 3.8 or higher.
• Deep Learning Frameworks: TensorFlow or PyTorch.
• Libraries & Tools: OpenCV, NumPy, Pandas.
• Database: MySQL.
• Web Framework: Flask or FastAPI.

Hardware Specifications
• Processor: Intel Core i5 or i7 (10th Gen or newer).
• RAM: 8–16 GB for development; 32 GB+ recommended for large-scale models.
• GPU: NVIDIA GPU (RTX 2060 or better) for accelerated model training.
• Storage: 512 GB SSD minimum; 1 TB SSD recommended.
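OpenCV, listed in the software stack, is typically what performs the fixed-rate frame extraction described earlier. The index arithmetic can be sketched without the library itself; the function name and the 1 fps default are assumptions for illustration:

```python
def sample_frame_indices(total_frames, video_fps, target_fps=1.0):
    """Indices of frames to keep when sampling a video at target_fps.

    In practice, total_frames and video_fps would come from
    cv2.VideoCapture via CAP_PROP_FRAME_COUNT and CAP_PROP_FPS.
    """
    step = max(1, round(video_fps / target_fps))
    return list(range(0, int(total_frames), step))
```

For example, a 3-second clip at 30 fps sampled at 1 fps keeps frames 0, 30, and 60, which keeps the LSTM's input sequence short without losing the temporal cues it looks for.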
System Modules: User vs. Admin
The Deep Fake Detection System is designed with two primary modules to cater to distinct user roles: the User Module for media verification requests and the Admin Module for system management and enhancement.

User Module
• Upload Media → User uploads an image or video for verification.
• Validation & Storage → System validates the input and stores it temporarily.
• Deepfake Analysis → AI model processes the media for an authenticity check.
• Result Display → Shows the output (Real / Fake) with a confidence score.
• Download/Report Option → Option to export results if needed.

Admin Module
• Model Training & Evaluation → Train CNN, LSTM, and ensemble models.
• Image/Video Preprocessing → Normalize, resize, and augment datasets.
• Feature Extraction → Extract spatial & temporal features from media.
• Prediction Management → Monitor classification results and accuracy.
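The validation step in the User Module can be as simple as an extension and size check before the media is stored. A minimal sketch, where the allowed-extension set and the 200 MB cap are illustrative assumptions rather than fixed system parameters:

```python
import os

# Hypothetical whitelist and size cap for uploads
ALLOWED_EXTENSIONS = {".jpg", ".jpeg", ".png", ".mp4", ".avi", ".mov"}
MAX_SIZE_BYTES = 200 * 1024 * 1024  # 200 MB

def validate_upload(filename, size_bytes):
    """Return True if the upload looks like supported media of sane size."""
    ext = os.path.splitext(filename.lower())[1]
    return ext in ALLOWED_EXTENSIONS and 0 < size_bytes <= MAX_SIZE_BYTES
```

Rejecting unsupported files this early keeps malformed inputs away from the preprocessing and model-inference stages.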
Data Flow Diagram:
Level 0
A Level-0 DFD provides a high-level overview of the system's functionality. In this process, the user submits an image or video to the system, which acts as the central processing unit. The system preprocesses the media and runs the CNN and LSTM detection models to gather evidence of manipulation. Based on that evidence, the system analyzes the content and generates a verification result (Real / Fake with a confidence score) that is returned to the user. Additionally, the admin plays an important role by continuously monitoring and updating the models to enhance accuracy and system reliability. This Level-0 DFD highlights the overall flow of data between the main entities without delving into internal process details.
Data Flow Diagram: Level 1
The Level-1 Data Flow Diagram (DFD) provides a more detailed view of the internal processes. The system starts with the User Interface (Web/App), where users upload images or videos. The media undergoes preprocessing, where frames are extracted, resized, and normalized for further analysis. The Feature Extraction module then applies CNN models to individual frames to capture spatial artifacts, while the Temporal Analysis module uses LSTM networks to detect inconsistencies across frame sequences. The Ensemble Classifier combines both sets of predictions and labels the content Real or Fake, along with a confidence score. The Result Generation module organizes this output and presents it back to the user in a clear and understandable form. Meanwhile, Admin Control is responsible for training and evaluating models, managing users, and monitoring the system to ensure accuracy and efficiency. This Level-1 DFD highlights the detailed flow of data and the interaction between system components for effective deepfake detection.
Use Case Diagram:
The Use Case Diagram of the Deep Fake Detection System shows the interaction between the primary actors (User and Admin) and the system's functionalities. The User can upload images or videos, which undergo preprocessing and analysis by the detection models. Users can then view the authenticity results and generate reports for reference. The Admin, on the other hand, is responsible for maintaining the system by training and evaluating models, managing users, and monitoring processes to ensure smooth functioning. The diagram clearly highlights the roles of the different actors and the major use cases supported by the system.
Applications of the System:
• Media & Journalism → Prevents the spread of fake news and manipulated videos.
• Cybersecurity & Law Enforcement → Validates digital evidence in investigations.
• Social Media Platforms → Flags and removes manipulated content automatically.
• Forensics → Detects tampered images/videos in legal proceedings.
• Corporate Sector → Protects against impersonation in meetings and video calls.
• Public Awareness Tools → Allows individuals to verify the authenticity of media.
Advantages & Limitations
Advantages
• High Accuracy → Detects manipulations with a robust CNN + LSTM + Ensemble pipeline.
• User-Friendly → Simple interface for non-technical users.
• Scalable → Can handle large datasets and diverse inputs.
• Automation → Reduces human effort and provides real-time analysis.
• Flexible → Works for both images and videos.

Limitations
• High Computational Needs → Requires a GPU and large memory for training.
• Dataset Dependency → Accuracy may drop on unseen/new datasets.
• Adversarial Attacks → Advanced fakes can bypass detectors.
• Real-Time Constraints → Processing high-resolution videos can be slow.
Influential Research in Deepfake Detection
Our project builds upon prior research in computer vision, deep learning, and digital forensics.
Key works include:

1. Li et al. (2018) – Exposed irregularities in eye blinking patterns in deepfake videos.


2. Rossler et al. (2019) – Released the FaceForensics++ dataset and CNN-based detection methods.
3. Afchar et al. (2018) – Proposed MesoNet, a lightweight CNN for efficient forgery detection.
4. Güera & Delp (2018) – Combined CNN + LSTM for spatial and temporal detection in videos.
5. Dolhansky et al. (2020) – Introduced the DFDC dataset for benchmarking deepfake detectors.
6. Yu et al. (2019) – Investigated GAN fingerprints for cross-model detection.
THANK YOU !!
Any Questions ?
