
Native full-duplex speech dialogue inference for BayLing-Duplex.

Python 12 1 Updated Jun 17, 2026

An end-to-end framework for multi-speaker transcription that jointly models who spoke, when, and what.

Python 250 11 Updated Jun 4, 2026

Real-Time Streamable Generative Speech Restoration with Flow Matching

Python 41 7 Updated Jun 5, 2026

Whisfusion: Parallel ASR Decoding via a Diffusion Transformer

Python 30 3 Updated Aug 22, 2025

DuplexSLA: A Full-Duplex Spoken Language Model with Synchronized Speech, Language, and Action

80 Updated May 20, 2026

a collection of skills for vllm-omni

Python 78 24 Updated Jun 15, 2026
Python 47 4 Updated Apr 27, 2026

MOSS-Audio is an open-source foundation model for unified audio understanding, enabling speech, sound, music, captioning, QA, and reasoning in real-world scenarios.

Python 574 40 Updated Jun 2, 2026

Research about Voxtral, its codebooks and an attempt to reconstruct codes for a known audio

Python 12 1 Updated Apr 6, 2026

🌋LavaSR: Fast Speech restoration and enhancement

Python 551 49 Updated Jun 5, 2026

MOSS-TTS-Nano is an open-source multilingual tiny speech generation model from MOSI.AI and the OpenMOSS team. With only 0.1B parameters, it is designed for realtime speech generation, can run direc…

Python 3,509 450 Updated Jun 2, 2026

The official implementation of GTCRN, an ultra-lightweight SE model.

Python 670 111 Updated Jan 18, 2026

MLX-VLM is a package for inference and fine-tuning of Vision Language Models (VLMs) on your Mac using MLX.

Python 1 Updated May 21, 2026

VITA-QINYU: Expressive Spoken Language Model for Role-Playing and Singing

Python 121 7 Updated Apr 3, 2026

An agent-managed museum exhibit, built in Rust with Gajae-Code / LazyCodex — developed and maintained with no human intervention.

Rust 193,973 109,958 Updated Jun 8, 2026

Voxtral Codec : Combining Semantic VQ and Acoustic FSQ for Ultra-Low Bitrate Speech Generation (Voxtral TTS Backbone)

Python 15 1 Updated Mar 27, 2026

Open Source Speech Language Model

Jupyter Notebook 995 107 Updated May 11, 2026

Your own personal AI assistant. Any OS. Any Platform. The lobster way. 🦞

TypeScript 379,194 79,367 Updated Jun 17, 2026
Python 2 1 Updated Oct 12, 2025

A Visual Studio Code extension for ty.

TypeScript 363 17 Updated Jun 15, 2026

🤖 WebMCP

Bikeshed 2,648 165 Updated Jun 17, 2026

Pure C inference of Mistral Voxtral Realtime 4B speech to text model

C 1,692 118 Updated Feb 15, 2026

Unofficial implementation of training pipeline in mimo-tokenizer about "MiMo-Audio: Audio Language Models are Few-Shot Learners"

Python 3 Updated Nov 9, 2025

DFlash: Block Diffusion for Flash Speculative Decoding

Python 5,164 373 Updated May 10, 2026

MGM-Omni: Scaling Omni LLMs to Personalized Long-Horizon Speech

Python 204 12 Updated Mar 26, 2026
Python 13 1 Updated Mar 18, 2026

Write scalable load tests in plain Python 🚗💨

Python 27,906 3,220 Updated Jun 16, 2026

A powerful 3B-parameter, LLM-based reinforcement learning audio editing model that excels at editing emotion, speaking style, and paralinguistics, and features robust zero-shot text-to-speech

Python 931 69 Updated Apr 9, 2026

Very fast, accurate speaker diarization

Python 261 29 Updated Jun 11, 2026