Lists (32)
academic
acoustic echo cancellation
AIGC
audio codec
audio codecs
audio separation
audio tools
bandwidth extension
beamforming
computer vision
deep learning
diffusion
entertainments
hearing aid
LLM
microphone array
music tools
noise reduction
packet loss compensation
programming related
simulation tools
singing voice tools
sound source localization
spatial audio
speaker recognition
speech dereverberation
speech diarization
speech frontend
speech recognition
speech separation
speech signal processing
speech voice tools
Starred repositories
Official Code for DragGAN (SIGGRAPH 2023)
Instant voice cloning by MIT and MyShell. Audio foundation model.
Multi-agent framework, runtime and control plane. Built for speed, privacy, and scale.
Free ChatGPT & DeepSeek API key. Free access to the DeepSeek API and GPT4 API, with support for top-ranked, widely used large models such as gpt | deepseek | claude | gemini | grok.
Qlib is an AI-oriented Quant investment platform that aims to use AI tech to empower Quant Research, from exploring ideas to implementing productions. Qlib supports diverse ML modeling paradigms, i…
Easily train a good VC model with voice data <= 10 mins!
🚀🚀 "Large model": train a 26M-parameter GPT completely from scratch in just 2h! 🌏
Python SDK, Proxy Server (LLM Gateway) to call 100+ LLM APIs in OpenAI format - [Bedrock, Azure, OpenAI, VertexAI, Cohere, Anthropic, Sagemaker, HuggingFace, Replicate, Groq]
Pretrain, finetune ANY AI model of ANY size on 1 or 10,000+ GPUs with zero code changes.
Code and documentation to train Stanford's Alpaca models, and generate the data.
Real-time face swap for PC streaming or video calls
[EMNLP 2025 Demo] PDF scientific paper translation with preserved formats: AI-based full-text bilingual translation of PDF documents that fully preserves the layout, supports services such as Google/DeepL/Ollama/OpenAI, and provides CLI/GUI/MCP/Docker/Zotero
AIHawk aims to ease the job hunt by automating the job application process. Using artificial intelligence, it enables users to apply for multiple jobs in a tailored way.
Open-Sora: Democratizing Efficient Video Production for All
SoftVC VITS Singing Voice Conversion
Create Customized Software using Natural Language Idea (through LLM-powered Multi-Agent Collaboration)
An LLM-powered knowledge curation system that researches a topic and generates a full-length report with citations.
A generative world for general-purpose robotics & embodied AI learning.
State-of-the-art 2D and 3D Face Analysis Project
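A complete and graceful API for Wechat. A WeChat personal-account interface, WeChat bot, and command-line WeChat; a custom personal-account bot takes just thirty lines of code.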
Industry leading face manipulation platform
Open-source code for MiniGPT-4 and MiniGPT-v2 (https://minigpt-4.github.io, https://minigpt-v2.github.io/)
Fully open reproduction of DeepSeek-R1