Try X-Dub to sync any character in a video with any audio you like | Official repository for "From Inpainting to Editing: Unlocking Robust Mask-Free Visual Dubbing via Generative Bootstrapping"

Python 203 3 Updated May 15, 2026

EvoLinkAI / awesome-gpt-image-2-API-and-Prompts

GPT-Image-2 API and Prompts

Python 16,736 1,698 Updated Jun 16, 2026

ASLP-lab / Easy-Turn

Open-Source Turn-Taking Detection Model and Dataset for Full-Duplex Spoken Dialogue Systems

Python 112 8 Updated Jan 25, 2026

Soul-AILab / SoulX-Duplug

Plug-and-play streaming semantic VAD for real-time full-duplex spoken dialogue systems.

Python 246 25 Updated Mar 20, 2026

lifeiteng / OmniSenseVoice

Omni SenseVoice: High-Speed Speech Recognition with word timestamps 🗣️🎯

Python 894 38 Updated Dec 10, 2025

PCL-Voice / PengChengStarling

PengChengStarling is specifically designed for developing multilingual ASR models based on the icefall project, supporting a complete ASR pipeline that includes data processing, model training, inf…

Python 188 22 Updated Mar 6, 2025

SentiAvatar / SentiAvatar

Python 309 43 Updated Apr 15, 2026

k2-fsa / OmniVoice

High-Quality Voice Cloning TTS for 600+ Languages

Python 7,521 1,178 Updated Jun 11, 2026

NVIDIA / Audio2Face-3D-Samples

A service to convert audio to facial blendshapes for lipsyncing and facial performances.

Python 306 49 Updated Mar 11, 2026

weijielyu / FaceCam

[CVPR 2026] FaceCam: Portrait Video Camera Control via Scale-Aware Conditioning

Python 61 5 Updated Mar 26, 2026

NickTikhonov / shuo

Sub-500 ms latency phone agent orchestration

Python 661 67 Updated Mar 6, 2026

FireRedTeam / FireRed-OpenStoryline

FireRed-OpenStoryline is an AI video editing agent that transforms manual editing into intention-driven directing through natural language interaction, LLM-powered planning, and precise tool orches…

Python 2,935 344 Updated May 7, 2026

microsoft / VibeVoice

Open-Source Frontier Voice AI

Python 49,402 5,506 Updated May 6, 2026

Caxson / CosyVoice

Forked from FunAudioLLM/CosyVoice

Multilingual large voice generation model with full-stack support for inference, training, and deployment.

Python 5 1 Updated Mar 13, 2026