Deepfake Detection Solution using Multimodal Approach.
Updated Jun 22, 2025 - Python
Experiments with Multi-Modal Causal Attention combined with Multi-Grouped Query Attention
Multimodal Agentic GenAI Workflow – Seamlessly blends retrieval and generation for intelligent storytelling
Multi-speaker diarization from video using SyncNet’s cross-modal embedding space to match multiple face tracks to corresponding audio tracks.
Deep learning app that cheers you up with awesome quotes when you are feeling down
[IROS 2023] GVCCI: Lifelong Learning of Visual Grounding for Language-Guided Robotic Manipulation
This repository implements temporal reasoning capabilities for vision-language models in simulated embodied environments, addressing the critical limitation of frame-by-frame processing in current multimodal AI systems.
A PyTorch implementation of a Transformer network for machine translation that incorporates image features to improve translation quality
Repository for the "Latent Multimodal Reconstruction for Misinformation Detection" paper
Multimodal benchmark for evaluating handwritten editorial correction in printed text.
Multimodal deep learning package that uses both categorical and text-based features in a single deep architecture for regression and binary classification use cases.
Deep learning utilities for multimodal research
Code and Models for Binding Text, Images, Graphs, and Audio for Music Representation Learning
The first-ever series of embedding models for olfaction-vision-language applications in robotics and embodied AI.
Mixed vision-language Attention Model that gets better by making mistakes
This repo contains the official PyTorch implementation of vLMIG: Improving Visual Commonsense in Language Models via Multiple Image Generation
Preprocessing and feature extraction for raw voice data of DAIC-WOZ
Kedro pipelines for preprocessing text and tabular data for multi-modal ML in TensorFlow.
Unofficial implementation of Sigmoid Loss for Language Image Pre-Training