Miso TTS is an 8-billion-parameter, highly emotive text-to-speech model

Python 2,876 275 Updated Jun 9, 2026

💫 Toolkit to help you get started with Spec-Driven Development

Python 112,684 9,951 Updated Jun 16, 2026

X-Voice

Python 166 21 Updated Jun 5, 2026

In-depth research report on AI Agent source code

Python 5,773 1,648 Updated Apr 12, 2026

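Distills a doctoral supervisor's ten years of research experience into directly callable AI skills. From idea conception to paper submission, your AI research co-advisor.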

Python 2,813 203 Updated Apr 29, 2026

A complete AI agency at your fingertips - From frontend wizards to Reddit community ninjas, from whimsy injectors to reality checkers. Each agent is a specialized expert with personality, processes…

Shell 113,912 18,595 Updated Jun 16, 2026

An agent-managed museum exhibit, built in Rust with Gajae-Code / LazyCodex — developed and maintained with no human intervention.

Rust 193,930 109,961 Updated Jun 8, 2026

337 Claude Code skills & agent skills & plugins (30+ Agents, 70+ custom commands, 330+ skills, customizable references, scripts) for Claude Code, Codex, Gemini CLI, Cursor, and 8 more coding agents …

Python 18,290 2,522 Updated Jun 12, 2026

VibeVoiceFusion is a full-stack, multi-speaker voice generation web system featuring LoRA fine-tuning, batch generation, and VRAM optimization. Based on Microsoft's VibeVoice (AR + diffusion archit…

Python 480 61 Updated Feb 23, 2026

[🚀 ICLR 2026 Oral] NextStep-1: SOTA Autoregressive Image Generation with Continuous Tokens. A research project developed by StepFun’s Multimodal Intelligence team.

Python 690 27 Updated Feb 27, 2026

Reimplementation of CC-G2PnP: Streaming Conformer-CTC based Japanese Grapheme-to-Phoneme and Prosody model (arXiv:2602.17157)

Python 9 1 Updated Jun 3, 2026

Official code for SongEcho

Python 64 5 Updated Mar 3, 2026

Official implementation of "Sonic: Shifting Focus to Global Audio Perception in Portrait Animation"

Python 3,253 290 Updated Jan 8, 2026

Translate the video from one language to another and embed dubbing & subtitles.

Python 17,993 2,242 Updated Jun 16, 2026

SoTA open-source TTS

Python 25,094 3,325 Updated Jun 10, 2026

Multi-lingual large voice generation model, providing full-stack inference, training, and deployment capabilities.

Python 21,683 2,498 Updated May 25, 2026

Open-Source Frontier Voice AI

Python 49,407 5,507 Updated May 6, 2026

The agent harness performance optimization system. Skills, instincts, memory, security, and research-first development for Claude Code, Codex, Opencode, Cursor and beyond.

JavaScript 216,790 33,295 Updated Jun 16, 2026

Qwen3-TTS is an open-source series of TTS models developed by the Qwen team at Alibaba Cloud, supporting stable, expressive, and streaming speech generation, free-form voice design, and vivid voice…

Python 11,983 1,557 Updated Mar 17, 2026

AI video translation & dubbing tool for humans and AI Agents, powered by LLMs. Full pipeline: download, transcribe, translate, TTS dub, reformat, cover generation. 100+ languages, optimized for You…

Go 10,309 959 Updated Jun 17, 2026

An instruct text-to-speech solution based on LLaSA and CosyVoice2 developed by the ASLP lab and collaborators.

Python 250 12 Updated Feb 26, 2026

Lightning-Fast, On-Device, Multilingual TTS — running natively via ONNX.

Swift 12,364 1,268 Updated May 22, 2026

A TTS that fits in your CPU (and pocket)

Python 4,616 512 Updated Jun 3, 2026

Awesome Neural Codec Models, Text-to-Speech Synthesizers & Speech Language Models

Python 243 14 Updated Dec 18, 2025

Open-source unified multimodal model

Python 6,018 533 Updated May 4, 2026
Python 162 8 Updated Nov 22, 2024

PyTorch Implementation of TCSinger 2 (ACL 2025): Customizable Multilingual Zero-shot Singing Voice Synthesis

Python 181 31 Updated Apr 19, 2026

STARS: A Unified Framework for Singing Transcription, Alignment, and Refined Style Annotation

Python 84 10 Updated Nov 11, 2025

MOSS-TTSD is a spoken dialogue generation model designed for expressive multi-speaker synthesis. It features long-context modeling, flexible speaker control, and multilingual support, while enablin…

Python 1,353 131 Updated Mar 23, 2026

Real-time voice assistant — WebRTC streaming, faster-whisper ASR, local LLM, Vui Nano (300M) TTS. OpenAI Realtime API compatible. Voice cloning, barge-in, ~9× realtime on a 4090. Apache 2.0.

Python 701 72 Updated Jun 12, 2026