crazyxixi

1 follower · 4 following

Stars

ASLP-lab / WenetSpeech-Chuan

Official repository for the WenetSpeech-Chuan dataset.

Python 201 6 Updated Feb 5, 2026

VoltAgent / awesome-openclaw-skills

The awesome collection of OpenClaw skills. 5,400+ skills filtered and categorized from the official OpenClaw Skills Registry.🦞

50,240 4,895 Updated Jun 8, 2026

openclaw / openclaw

Your own personal AI assistant. Any OS. Any Platform. The lobster way. 🦞

TypeScript 378,743 79,230 Updated Jun 15, 2026

xiaomi-research / r1-aqa

🤗 R1-AQA Model: mispeech/r1-aqa

Python 326 30 Updated Mar 28, 2025

OpenBMB / MiniCPM-V

A Pocket-Sized MLLM for Ultra-Efficient Image and Video Understanding on Your Phone

Python 25,630 2,007 Updated Jun 4, 2026

OpenBMB / MiniCPM

MiniCPM5-1B: A SOTA 1B on-device LLM, small yet powerful.

Jupyter Notebook 9,450 621 Updated Jun 12, 2026

ByteDance-Seed / VeOmni

VeOmni: Scaling Any Modality Model Training with Model-Centric Distributed Recipe Zoo

Python 2,014 212 Updated Jun 15, 2026

Tencent / TencentPretrain

Tencent Pre-training framework in PyTorch & Pre-trained Model Zoo

Python 1,089 147 Updated Aug 4, 2024

ASLP-lab / OSUM

OSUM & OSUM-EChat, open speech understanding model and empathetic spoken chatbot based on it, open-sourced by ASLP@NPU.

Python 497 32 Updated Nov 23, 2025

QwenLM / Qwen2.5-Omni

Qwen2.5-Omni is an end-to-end multimodal model by Qwen team at Alibaba Cloud, capable of understanding text, audio, vision, video, and performing real-time speech generation.

Jupyter Notebook 4,022 324 Updated Jun 12, 2025

baichuan-inc / Baichuan-Audio

Baichuan-Audio: A Unified Framework for End-to-End Speech Interaction

Python 223 15 Updated Feb 28, 2025

modelscope / FunASR

Industrial-grade speech recognition toolkit: 170x realtime, 50+ languages, speaker diarization, emotion detection, streaming, and OpenAI-compatible API.

Python 17,993 1,845 Updated Jun 11, 2026

hiyouga / LlamaFactory

Unified Efficient Fine-Tuning of 100+ LLMs & VLMs (ACL 2024)

Python 72,166 8,829 Updated Jun 15, 2026

MooreThreads / MooER

MooER: Moore-threads Open Omni model for speech-to-speech intERaction. MooER-omni includes a series of end-to-end speech interaction models along with training and inference code, covering but not …

Python 221 17 Updated Jan 8, 2025

FunAudioLLM / SenseVoice

Multilingual speech understanding: ASR + emotion recognition + audio event detection. 50+ languages, 15x faster than Whisper, non-autoregressive.

Python 8,570 780 Updated Jun 9, 2026

kkroening / ffmpeg-python

Python bindings for FFmpeg - with complex filtering support

Python 10,997 942 Updated Aug 4, 2024

vllm-project / vllm

A high-throughput and memory-efficient inference and serving engine for LLMs

Python 82,875 18,065 Updated Jun 15, 2026

QwenLM / Qwen2-Audio

The official repo of Qwen2-Audio chat & pretrained large audio language model proposed by Alibaba Cloud.

Python 2,079 165 Updated Apr 21, 2025

mem0ai / mem0

Universal memory layer for AI Agents

Python 58,578 6,729 Updated Jun 15, 2026

karpathy / LLM101n

LLM101n: Let's build a Storyteller

37,320 2,051 Updated Aug 1, 2024

zai-org / ChatGLM2-6B

ChatGLM2-6B: An Open Bilingual Chat LLM | 开源双语对话语言模型

Python 15,567 1,805 Updated Jun 27, 2024