Stars
AutoGPT is the vision of accessible AI for everyone, to use and to build on. Our mission is to provide the tools, so that you can focus on what matters.
Command-line program to download videos from YouTube.com and other video sites
Magnificent app which corrects your previous console command.
The most powerful and modular diffusion model GUI, API, and backend with a graph/nodes interface.
Robust Speech Recognition via Large-Scale Weak Supervision
Real-time face swap and one-click video deepfake with only a single image
Drop in a screenshot and convert it to clean code (HTML/Tailwind/React/Vue)
Clone a voice in 5 seconds to generate arbitrary speech in real-time
A collection of learning resources for curious software engineers
The definitive Web UI for local AI, with powerful features and easy setup.
🐸💬 - a deep learning toolkit for Text-to-Speech, battle-tested in research and production
Universal memory layer for AI Agents; Announcing OpenMemory MCP - local and secure memory management.
Developer-first error tracking and performance monitoring
DeepSpeed is a deep learning optimization library that makes distributed training and inference easy, efficient, and effective.
A generative speech model for daily dialogue.
🤗 Diffusers: State-of-the-art diffusion models for image, video, and audio generation in PyTorch.
Write scalable load tests in plain Python 🚗💨
Stable Diffusion with Core ML on Apple Silicon
A Fundamental End-to-End Speech Recognition Toolkit and Open Source SOTA Pretrained Models, Supporting Speech Recognition, Voice Activity Detection, Text Post-processing etc.
This repository contains the code for "A Lip Sync Expert Is All You Need for Speech to Lip Generation In the Wild", published at ACM Multimedia 2020. For the HD commercial model, please try out Sync Labs
Open Source DeepWiki: AI-Powered Wiki Generator for GitHub/Gitlab/Bitbucket Repositories. Join the discord: https://discord.gg/gMwThUMeme
💮 Amazing QR code generator in Python (supports animated GIFs)
🚀 PR-Agent: An AI-Powered 🤖 Tool for Automated Pull Request Analysis, Feedback, Suggestions and More! 💻🔍
Hallo: Hierarchical Audio-Driven Visual Synthesis for Portrait Image Animation
AI-based tool for removing hard-coded subtitles and text-like watermarks from images and videos, producing output files at the original resolution. Runs locally; no third-party API required.
Real time interactive streaming digital human
AniPortrait: Audio-Driven Synthesis of Photorealistic Portrait Animation
MuseTalk: Real-Time High Quality Lip Synchronization with Latent Space Inpainting
A community-driven way to read and chat with AI bots, powered by ChatGPT.
[CVPR 2025] EchoMimicV2: Towards Striking, Simplified, and Semi-Body Human Animation