Stars
Universal LLM Deployment Engine with ML Compilation
🔥 MaxKB is a powerful, easy-to-use open-source platform for building enterprise-grade agents.
Faster Whisper transcription with CTranslate2
DeepFaceLab is the leading software for creating deepfakes.
Multi-lingual large voice generation model, providing full-stack inference, training, and deployment capabilities.
Let's make video diffusion practical!
Translate the video from one language to another and add dubbing.
An Industrial-Level Controllable and Efficient Zero-Shot Text-To-Speech System
🐫 CAMEL: The first and the best multi-agent framework. Finding the Scaling Law of Agents. https://www.camel-ai.org
Official code for "F5-TTS: A Fairytaler that Fakes Fluent and Faithful Speech with Flow Matching"
[CVPR 2023] SadTalker: Learning Realistic 3D Motion Coefficients for Stylized Audio-Driven Single Image Talking Face Animation
Free and Open Source Machine Translation API. Self-hosted, offline capable and easy to set up.
Automate Creation of YouTube Shorts using MoviePy.
ComfyUI-Manager is an extension designed to enhance the usability of ComfyUI. It offers management functions to install, remove, disable, and enable various custom nodes of ComfyUI. Furthermore, th…
YOLOv10: Real-Time End-to-End Object Detection [NeurIPS 2024]
[CVPR 2024] Official repository for "MagicAnimate: Temporally Consistent Human Image Animation using Diffusion Model"
A voice cloning tool with a web interface, using your voice or any sound to record audio.
AI-based tool for removing hard-coded subtitles and text-like watermarks from videos or pictures, producing output files at the original resolution. Runs locally; no third-party API required.
An open source implementation of Microsoft's VALL-E X zero-shot TTS model. A demo is available at https://plachtaa.github.io/vallex/
A simple local web interface that uses ChatTTS to synthesize text into speech, and also exposes an API for external use.
[SIGGRAPH Asia 2022] VideoReTalking: Audio-based Lip Synchronization for Talking Head Video Editing In the Wild
Multilingual Voice Understanding Model
[ICCV 2023] ProPainter: Improving Propagation and Transformer for Video Inpainting
StyleTTS 2: Towards Human-Level Text-to-Speech through Style Diffusion and Adversarial Training with Large Speech Language Models
Open-source, accurate and easy-to-use video speech recognition & clipping tool, with LLM-based AI clipping integrated.