View echocatzh's full-sized avatar
🤗
willing to share

OmX - Oh My codeX: Your codex is not alone. Add hooks, agent teams, HUDs, and so much more.

TypeScript 28,949 2,307 Updated May 18, 2026

Lightweight coding agent that runs in your terminal

Rust 83,442 12,093 Updated May 18, 2026

Fun-ASR is an end-to-end speech recognition large model launched by Tongyi Lab.

Python 1,154 111 Updated Feb 25, 2026

Mobile-Agent: The Powerful GUI Agent Family

Python 8,681 875 Updated May 14, 2026

Voice Activity Projection Models: Self-supervised learning of Turn-taking Events

Python 100 21 Updated May 29, 2024
Python 1,385 82 Updated Jan 29, 2026

Turn detection for full-duplex dialogue communication

Python 561 39 Updated Dec 26, 2025

Open-Source Turn-Taking Detection Model and Dataset for Full-Duplex Spoken Dialogue Systems

Python 110 8 Updated Jan 25, 2026

We Speech Toolkit, LLM based Speech Toolkit for Speech Understanding, Generation, and Interaction

Python 206 17 Updated Apr 7, 2026

OSUM & OSUM-EChat, open speech understanding model and empathetic spoken chatbot based on it, open-sourced by ASLP@NPU.

Python 492 32 Updated Nov 23, 2025

[ICML 2025 Tokenization Workshop] HH-Codec: High Compression High-fidelity Discrete Neural Codec for Spoken Language Modeling

Python 97 4 Updated Sep 28, 2025

LLM model quantization (compression) toolkit with HW acceleration support for Nvidia, AMD, Intel GPU and Intel/AMD/Apple CPU via HF, vLLM, and SGLang.

Python 1,151 184 Updated May 18, 2026

An easy-to-use LLMs quantization package with user-friendly apis, based on GPTQ algorithm.

Python 5,060 543 Updated Apr 11, 2025

TIGER: Time-frequency Interleaved Gain Extraction and Reconstruction for Efficient Speech Separation

Python 420 62 Updated Apr 20, 2026

GPT-4o-level, real-time spoken dialogue system.

Python 378 33 Updated Jan 27, 2025

Janus-Series: Unified Multimodal Understanding and Generation Models

Python 17,729 2,234 Updated Feb 1, 2025

[ACM CCS'24] SafeEar: Content Privacy-Preserving Audio Deepfake Detection

Python 185 22 Updated Mar 24, 2025

An AI-Powered Speech Processing Toolkit and Open Source SOTA Pretrained Models, Supporting Speech Enhancement, Separation, and Target Speaker Extraction, etc.

Python 4,144 341 Updated Aug 14, 2025

A family of state-of-the-art Transformer-based audio codecs for low-bitrate high-quality audio coding.

Python 430 29 Updated Feb 12, 2026

Multi-lingual large voice generation model, providing inference, training and deployment full-stack ability.

Python 21,090 2,431 Updated May 3, 2026

A Survey of Spoken Dialogue Models (60 pages)

318 18 Updated Nov 28, 2024

Ultra-low-bitrate Speech Codec for Speech Language Modeling Applications

Python 91 8 Updated Dec 20, 2024

✨✨[NeurIPS 2025] VITA-1.5: Towards GPT-4o Level Real-Time Vision and Speech Interaction

Python 2,512 181 Updated Mar 28, 2025

User-friendly AI Interface (Supports Ollama, OpenAI API, ...)

Python 137,563 19,640 Updated May 15, 2026

Run frontier AI locally.

Python 44,759 3,168 Updated May 15, 2026

A generative speech model for daily dialogue.

Python 39,277 4,258 Updated Apr 10, 2026

A Framework for Speech, Language, Audio, Music Processing with Large Language Model

Python 1,032 114 Updated Jan 15, 2026

[ICLR 2024] Official code for the paper 'Elucidating the Exposure Bias in Diffusion Models'

Python 49 2 Updated Jun 2, 2025

Daily tracking of awesome audio papers, including music generation, zero-shot tts, asr, audio generation

408 20 Updated Nov 2, 2025