moshi Public
Forked from kyutai-labs/moshi
Moshi is a speech-text foundation model and full-duplex spoken dialogue framework. It uses Mimi, a state-of-the-art streaming neural audio codec.
Python Apache License 2.0 Updated Sep 22, 2025
MOSS-TTSD Public
Forked from OpenMOSS/MOSS-TTSD
MOSS-TTSD is a spoken dialogue generation model that enables expressive dialogue speech synthesis in both Chinese and English, supporting zero-shot multi-speaker voice cloning, and long-form speech…
Python Apache License 2.0 Updated Aug 25, 2025
AcademiCodec Public
Forked from yangdongchao/AcademiCodec
AcademiCodec: An Open Source Audio Codec Model for Academic Research
Python Updated Aug 23, 2025
XY-Tokenizer Public
Forked from gyt1145028706/XY-Tokenizer
This is the code for the paper "XY-Tokenizer: Mitigating the Semantic-Acoustic Conflict in Low-Bitrate Speech Codecs". Demos, technical insights and experimental results are presented on
Python Updated Jul 12, 2025
stable-diffusion Public
Forked from nlile/stablediffusion
High-Resolution Image Synthesis with Latent Diffusion Models
Python MIT License Updated Jun 25, 2025
vall-e Public
Forked from lifeiteng/vall-e
PyTorch implementation of VALL-E (Zero-Shot Text-To-Speech). Reproduced demo: https://lifeiteng.github.io/valle/index.html
Python Apache License 2.0 Updated Jun 23, 2025
voicebox-pytorch Public
Forked from lucidrains/voicebox-pytorch
Implementation of Voicebox, new SOTA Text-to-speech network from MetaAI, in Pytorch
Python MIT License Updated Oct 1, 2024
SpeechGPT Public
Forked from 0nutation/SpeechGPT
SpeechGPT Series: Speech Large Language Models
Python Apache License 2.0 Updated Jul 22, 2024
streaming-llm Public
Forked from mit-han-lab/streaming-llm
[ICLR 2024] Efficient Streaming Language Models with Attention Sinks
Python MIT License Updated Jul 11, 2024
SpeechTokenizer Public
Forked from ZhangXInFD/SpeechTokenizer
This is the code for the SpeechTokenizer presented in "SpeechTokenizer: Unified Speech Tokenizer for Speech Language Models". Samples are presented on
Python Apache License 2.0 Updated Jun 9, 2024
USLM Public
Forked from 0nutation/USLM
Unified Speech Language Model for paper "SpeechTokenizer: Unified Speech Tokenizer for Speech Large Language Models" (ICLR 2024)
Python Updated Sep 14, 2023