MOSS-Audio is an open-source foundation model for unified audio understanding, enabling speech, sound, music, captioning, QA, and reasoning in real-world scenarios.

Python 574 40 Updated Jun 2, 2026

MarvinRomson / voxtral-tts-codes-for-audio

Research about Voxtral, its codebooks and an attempt to reconstruct codes for a known audio

Python 12 1 Updated Apr 6, 2026

ysharma3501 / LavaSR

🌋LavaSR: Fast Speech restoration and enhancement

Python 551 49 Updated Jun 5, 2026

OpenMOSS / MOSS-TTS-Nano

MOSS-TTS-Nano is an open-source multilingual tiny speech generation model from MOSI.AI and the OpenMOSS team. With only 0.1B parameters, it is designed for realtime speech generation, can run direc…

Python 3,510 450 Updated Jun 2, 2026

Xiaobin-Rong / gtcrn

The official implementation of GTCRN, an ultra-lightweight SE model.

Python 670 111 Updated Jan 18, 2026

lifeiteng / mlx-vlm

Forked from Blaizzy/mlx-vlm

MLX-VLM is a package for inference and fine-tuning of Vision Language Models (VLMs) on your Mac using MLX.

Python 1 Updated May 21, 2026

VITA-MLLM / VITA-QinYu

VITA-QINYU: Expressive Spoken Language Model for Role-Playing and Singing

Python 121 7 Updated Apr 3, 2026

ultraworkers / claw-code

An agent-managed museum exhibit, built in Rust with Gajae-Code / LazyCodex — developed and maintained with no human intervention.

Rust 193,976 109,960 Updated Jun 8, 2026

rishikksh20 / voxtral-codec-pytoch

Voxtral Codec: Combining Semantic VQ and Acoustic FSQ for Ultra-Low Bitrate Speech Generation (Voxtral TTS Backbone)

Python 15 1 Updated Mar 27, 2026

HumeAI / tada

Open Source Speech Language Model

Jupyter Notebook 995 107 Updated May 11, 2026

openclaw / openclaw

Your own personal AI assistant. Any OS. Any Platform. The lobster way. 🦞

TypeScript 379,209 79,372 Updated Jun 17, 2026

beethogedeon / MulliVC

Python 2 1 Updated Oct 12, 2025

astral-sh / ty-vscode

A Visual Studio Code extension for ty.

TypeScript 363 17 Updated Jun 15, 2026

webmachinelearning / webmcp

🤖 WebMCP

Bikeshed 2,648 165 Updated Jun 17, 2026

antirez / voxtral.c

Pure C inference of Mistral Voxtral Realtime 4B speech to text model

C 1,692 118 Updated Feb 15, 2026

liyunlongaaa / MiMo-Tokenizer-Trainer

Unofficial implementation of the mimo-tokenizer training pipeline from "MiMo-Audio: Audio Language Models are Few-Shot Learners"

Python 3 Updated Nov 9, 2025

z-lab / dflash

DFlash: Block Diffusion for Flash Speculative Decoding

Python 5,164 373 Updated May 10, 2026

JIA-Lab-research / MGM-Omni

MGM-Omni: Scaling Omni LLMs to Personalized Long-Horizon Speech

Python 204 12 Updated Mar 26, 2026

jjery2243542 / flow-slm

Python 13 1 Updated Mar 18, 2026

locustio / locust

Write scalable load tests in plain Python 🚗💨

Python 27,906 3,220 Updated Jun 16, 2026

stepfun-ai / Step-Audio-EditX

A powerful 3B-parameter, LLM-based Reinforcement Learning audio edit model that excels at editing emotion, speaking style, and paralinguistics, and features robust zero-shot text-to-speech

Python 931 69 Updated Apr 9, 2026

narcotic-sh / senko

Very fast, accurate speaker diarization

Python 261 29 Updated Jun 11, 2026

xinkez

Stars

BayLing-Models / BayLing-Duplex

xzf-thu / Audio-Interaction

Soul-AILab / SoulX-Transcriber

sp-uhh / streamfm

taeyoun811 / Whisfusion

hyzhang24 / DuplexSLA

hsliuustc0106 / vllm-omni-skills

xiaomi-research / tts-prism

OpenMOSS / MOSS-Audio