Qwen3-ASR is an open-source series of ASR models developed by the Qwen team at Alibaba Cloud, supporting stable multilingual speech/music/song recognition, language detection and timestamp prediction.

Python 1 Updated Jan 30, 2026

zll961020 / DiariZen

Forked from BUTSpeechFIT/DiariZen

A toolkit for speaker diarization.

Jupyter Notebook 1 Updated Jan 31, 2026

zll961020 / Fun-ASR

Forked from FunAudioLLM/Fun-ASR

Python 1 Updated Feb 3, 2026

zll961020 / claude-code-source-code

Forked from sanbuphy/learn-coding-agent

Claude Code v2.1.88 Source Code

TypeScript 1 Updated Mar 31, 2026

zll961020 / claude-howto

Forked from luongnv89/claude-howto

A visual, example-driven guide to Claude Code — from basic concepts to advanced agents, with copy-paste templates that bring immediate value.

Python 1 Updated Apr 11, 2026

zll961020 / deepagents

Forked from langchain-ai/deepagents

Agent harness built with LangChain and LangGraph. Equipped with a planning tool, a filesystem backend, and the ability to spawn subagents - well-equipped to handle complex agentic tasks.

Python 1 Updated Apr 15, 2026

zll961020 / deer-flow

Forked from bytedance/deer-flow

An open-source long-horizon SuperAgent harness that researches, codes, and creates. With the help of sandboxes, memories, tools, skills, subagents, and a message gateway, it handles different levels of…

Python 1 Updated Apr 14, 2026

NiniAndy / Paraformer-V2

From the paper "Paraformer-v2: An improved non-autoregressive transformer for noise-robust speech recognition"

Python 29 4 Updated Nov 20, 2024

BUTSpeechFIT / VBx

Variational Bayes HMM over x-vectors diarization

Python 287 58 Updated Jan 15, 2024

zll961020 / ROLL

Forked from alibaba/ROLL

An Efficient and User-Friendly Scaling Library for Reinforcement Learning with Large Language Models

Python 1 Updated Dec 22, 2025

FunAudioLLM / Fun-ASR

End-to-end speech recognition large model: 31 languages, dialects, accents, lyrics, hotwords, timestamps, speaker diarization. Trained on tens of millions of hours.

C 1,299 126 Updated Jun 22, 2026

zll961020 / r1-aqa

Forked from xiaomi-research/r1-aqa

🤗 R1-AQA Model: mispeech/r1-aqa

Python 1 Updated Mar 28, 2025

zll961020 / SALMONN

Forked from bytedance/SALMONN

SALMONN family: A suite of advanced multi-modal LLMs

1 Updated Sep 28, 2025

zll961020 / Qwen3-Omni

Forked from QwenLM/Qwen3-Omni

Qwen3-Omni is a natively end-to-end, omni-modal LLM developed by the Qwen team at Alibaba Cloud, capable of understanding text, audio, images, and video, as well as generating speech in real time.

Jupyter Notebook 1 Updated Oct 9, 2025

zll961020 / HTGS

Forked from nerficg-project/HTGS

Official code release for "Efficient Perspective-Correct 3D Gaussian Splatting Using Hybrid Transparency"

Cuda 1 Updated Oct 16, 2025

zll961020 / gaussian-splatting

Forked from graphdeco-inria/gaussian-splatting

Original reference implementation of "3D Gaussian Splatting for Real-Time Radiance Field Rendering"

Python 1 Updated Oct 17, 2025

zll961020 / whisperX

Forked from m-bain/whisperX

WhisperX: Automatic Speech Recognition with Word-level Timestamps (& Diarization)

Python 1 Updated Oct 21, 2025

zll961020 / SLAM-LLM

Forked from X-LANCE/SLAM-LLM

A Framework for Speech, Language, Audio, Music Processing with Large Language Model

Python 1 Updated Oct 24, 2025

zll961020 / CityGaussian

Forked from Linketic/CityGaussian

[ECCV`24&ICLR`25] CityGaussian Series for High-quality Large-Scale Scene Reconstruction with Gaussians

Jupyter Notebook 1 Updated Oct 26, 2025

zll961020 / nanochat

Forked from karpathy/nanochat

The best ChatGPT that $100 can buy.

Python 1 Updated Oct 28, 2025

zll961020 / west

Forked from wenet-e2e/west

We Speech Toolkit, LLM based Speech Toolkit for Speech Understanding, Generation, and Interaction

Python 1 Updated Apr 8, 2026

zll961020 / ms-swift

Forked from modelscope/ms-swift

Use PEFT or Full-parameter to CPT/SFT/DPO/GRPO 500+ LLMs (Qwen3, Qwen3-MoE, Llama4, GLM4.5, InternLM3, DeepSeek-R1, ...) and 200+ MLLMs (Qwen3-VL, Qwen3-Omni, InternVL3.5, Ovis2.5, Llava, GLM4v, Ph…

Python 1 Updated May 13, 2026

zll961020 / omnilingual-asr

Forked from facebookresearch/omnilingual-asr

Omnilingual ASR: Open-Source Multilingual Speech Recognition for 1600+ Languages

Python 1 Updated Nov 10, 2025

zll961020 / conditional-flow-matching

Forked from atong01/conditional-flow-matching

TorchCFM: a Conditional Flow Matching library

Python 1 Updated Nov 11, 2025

lingling zll961020

Lists (15)

3d

agent

audio

diffusion

digital human

inference framework

llm

misc

ml

mllm

resource

rl

rl train framework

train

vision

Stars