A complete AI agency at your fingertips - From frontend wizards to Reddit community ninjas, from whimsy injectors to reality checkers. Each agent is a specialized expert with personality, processes…

Shell 113,912 18,595 Updated Jun 16, 2026

ultraworkers / claw-code

An agent-managed museum exhibit, built in Rust with Gajae-Code / LazyCodex — developed and maintained with no human intervention.

Rust 193,930 109,961 Updated Jun 8, 2026

alirezarezvani / claude-skills

337 Claude Code skills & agent skills & plugins (30+ Agents, 70+ custom commands, 330+ skills, customizable references, scripts) for Claude Code, Codex, Gemini CLI, Cursor, and 8 more coding agents …

Python 18,290 2,522 Updated Jun 12, 2026

zhao-kun / VibeVoiceFusion

VibeVoiceFusion is a full-stack, multi-speaker voice generation web system featuring LoRA fine-tuning, batch generation, and VRAM optimization. Based on Microsoft's VibeVoice (AR + diffusion archit…

Python 480 61 Updated Feb 23, 2026

stepfun-ai / NextStep-1

[🚀 ICLR 2026 Oral] NextStep-1: SOTA Autoregressive Image Generation with Continuous Tokens. A research project developed by StepFun’s Multimodal Intelligence team.

Python 690 27 Updated Feb 27, 2026

ayutaz / cc-g2pnp

Reimplementation of CC-G2PnP: Streaming Conformer-CTC based Japanese Grapheme-to-Phoneme and Prosody model (arXiv:2602.17157)

Python 9 1 Updated Jun 3, 2026

lsfhuihuiff / SongEcho_ICLR2026

Official code for SongEcho

Python 64 5 Updated Mar 3, 2026

jixiaozhong / Sonic

Official implementation of "Sonic: Shifting Focus to Global Audio Perception in Portrait Animation"

Python 3,253 290 Updated Jan 8, 2026

jianchang512 / pyvideotrans

Translate the video from one language to another and embed dubbing & subtitles.

Python 17,993 2,242 Updated Jun 16, 2026

resemble-ai / chatterbox

SoTA open-source TTS

Python 25,094 3,325 Updated Jun 10, 2026

FunAudioLLM / CosyVoice

Multilingual large voice generation model, providing full-stack inference, training, and deployment capabilities.

Python 21,683 2,498 Updated May 25, 2026

microsoft / VibeVoice

Open-Source Frontier Voice AI

Python 49,407 5,507 Updated May 6, 2026

affaan-m / ECC

The agent harness performance optimization system. Skills, instincts, memory, security, and research-first development for Claude Code, Codex, Opencode, Cursor and beyond.

JavaScript 216,790 33,295 Updated Jun 16, 2026

QwenLM / Qwen3-TTS

Qwen3-TTS is an open-source series of TTS models developed by the Qwen team at Alibaba Cloud, supporting stable, expressive, and streaming speech generation, free-form voice design, and vivid voice…

Python 11,983 1,557 Updated Mar 17, 2026

krillinai / KrillinAI

AI video translation & dubbing tool for humans and AI Agents, powered by LLMs. Full pipeline: download, transcribe, translate, TTS dub, reformat, cover generation. 100+ languages, optimized for You…

Go 10,309 959 Updated Jun 17, 2026

ASLP-lab / VoiceSculptor

An instruct text-to-speech solution based on LLaSA and CosyVoice2 developed by the ASLP lab and collaborators.

Python 250 12 Updated Feb 26, 2026

supertone-inc / supertonic

Lightning-Fast, On-Device, Multilingual TTS — running natively via ONNX.

Swift 12,364 1,268 Updated May 22, 2026

kyutai-labs / pocket-tts

A TTS that fits in your CPU (and pocket)

Python 4,616 512 Updated Jun 3, 2026

LqNoob / Neural-Codec-and-Speech-Language-Models

Awesome Neural Codec Models, Text-to-Speech Synthesizers & Speech Language Models

Python 243 14 Updated Dec 18, 2025

ByteDance-Seed / Bagel

Open-source unified multimodal model

Python 6,018 533 Updated May 4, 2026

tencent-ailab / MuCodec

Python 162 8 Updated Nov 22, 2024

AaronZ345 / TCSinger2

PyTorch Implementation of TCSinger 2 (ACL 2025): Customizable Multilingual Zero-shot Singing Voice Synthesis

Python 181 31 Updated Apr 19, 2026

gwx314 / STARS

STARS: A Unified Framework for Singing Transcription, Alignment, and Refined Style Annotation

Python 84 10 Updated Nov 11, 2025

OpenMOSS / MOSS-TTSD

MOSS-TTSD is a spoken dialogue generation model designed for expressive multi-speaker synthesis. It features long-context modeling, flexible speaker control, and multilingual support, while enablin…

Python 1,353 131 Updated Mar 23, 2026

fluxions-ai / vui

Real-time voice assistant — WebRTC streaming, faster-whisper ASR, local LLM, Vui Nano (300M) TTS. OpenAI Realtime API compatible. Voice cloning, barge-in, ~9× realtime on a 4090. Apache 2.0.

Python 701 72 Updated Jun 12, 2026

yiwei0730

Lists (12)

Big model list

Dataset

emotion

paper survey

singing synthesis

Text-to-audio

Tools

TTS-adapt

TTS-zero-shot

VITS

Vocoder

Voice conversion
