Shanghai Jiao Tong University & Shanghai Innovation Institute
Shanghai (UTC +08:00)
https://zhikangniu.github.io/
-
vllm-omni Public
Forked from vllm-project/vllm-omni
A framework for efficient model inference with omni-modality models
Python Apache License 2.0 Updated Apr 25, 2026
-
F5-TTS Public
Forked from SWivid/F5-TTS
Official code for "F5-TTS: A Fairytaler that Fakes Fluent and Faithful Speech with Flow Matching"
-
CLIProxyAPI Public
Forked from router-for-me/CLIProxyAPI
Wrap Gemini CLI, Antigravity, ChatGPT Codex, Claude Code, Qwen Code, iFlow as an OpenAI/Gemini/Claude/Codex compatible API service, allowing you to enjoy the free Gemini 2.5 Pro, GPT 5, Claude, Qwe…
Go MIT License Updated Mar 25, 2026
-
sub2api Public
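Forked from Wei-Shaw/sub2api
Sub2API-CRS2: a one-stop open-source relay service that unifies access to Claude, OpenAI, Gemini, and Antigravity subscriptions, supports shared ("carpool") use to split costs more efficiently, and works seamlessly with native tools.
Go MIT License Updated Mar 24, 2026
-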
Curator Public
Forked from NVIDIA-NeMo/Curator
Scalable data preprocessing and curation toolkit for LLMs
Python Apache License 2.0 Updated Mar 19, 2026
-
Qwen3-TTS Public
Forked from QwenLM/Qwen3-TTS
Qwen3-TTS is an open-source series of TTS models developed by the Qwen team at Alibaba Cloud, supporting stable, expressive, and streaming speech generation, free-form voice design, and vivid voice…
-
stable-audio-tools Public
Forked from Stability-AI/stable-audio-tools
Generative models for conditional audio generation
-
diffusers Public
Forked from huggingface/diffusers
🤗 Diffusers: State-of-the-art diffusion models for image, video, and audio generation in PyTorch.
Python Apache License 2.0 Updated Mar 7, 2026
-
VibeVoice Public
Forked from microsoft/VibeVoice
Open-Source Frontier Voice AI
Python MIT License Updated Mar 6, 2026
-
nanochat Public
Forked from karpathy/nanochat
The best ChatGPT that $100 can buy.
-
Idea2Paper Public
Forked from AgentAlphaAGI/Idea2Paper
Idea2Paper Official Demo
Python MIT License Updated Feb 1, 2026
-
Semantic-VAE Public
Official code for "Semantic-VAE: Semantic-Alignment Latent Representation for Better Speech Synthesis"
-
flux2 Public
Forked from black-forest-labs/flux2
Official inference repo for FLUX.2 models
Python Apache License 2.0 Updated Nov 25, 2025
-
DC-Speech-VAE Public
Forked from KdaiP/DC-Speech-VAE
5Hz Deep-Compression Speech VAE for AR-Diffusion and CALMs
Python Apache License 2.0 Updated Nov 19, 2025
-
CosyVoice Public
Forked from FunAudioLLM/CosyVoice
Multi-lingual large voice generation model, providing inference, training and deployment full-stack ability.
Python Apache License 2.0 Updated Nov 18, 2025
-
calm Public
Forked from shaochenze/calm
Official implementation of "Continuous Autoregressive Language Models"
Python MIT License Updated Nov 10, 2025
-
SAC Public
Forked from Soul-AILab/SAC
Training, inference, and testing of the SAC speech codec model.
-
Ming-UniAudio Public
Forked from inclusionAI/Ming-UniAudio
Ming-UniAudio: Speech LLM for Joint Understanding, Generation and Editing with Unified Representation
Python MIT License Updated Oct 28, 2025
-
metaquery Public
Forked from facebookresearch/metaquery
Official implementation of the paper "Transfer between Modalities with MetaQueries"
Python Other Updated Oct 12, 2025
-
NeMo-speech-data-processor Public
Forked from NVIDIA/NeMo-speech-data-processor
A toolkit for processing speech data and creating speech datasets
-
flux Public
Forked from black-forest-labs/flux
Official inference repo for FLUX.1 models
Python Apache License 2.0 Updated Jul 31, 2025