University of Chinese Academy of Sciences
Beijing, China
Starred repositories
The repository provides code for running inference with the Meta Segment Anything Audio Model (SAM-Audio), links for downloading the trained model checkpoints, and example notebooks that show how t…
An Open Phone Agent Model & Framework. Unlocking the AI Phone for Everyone
GLM-TTS: Controllable & Emotion-Expressive Zero-shot TTS with Multi-Reward Reinforcement Learning
GLM-ASR-Nano: A robust, open-source speech recognition model with 1.5B parameters
Versatile audio super resolution (any -> 48kHz) with AudioSR.
AudioLDM: Generate speech, sound effects, music and beyond, with text.
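As a hedged illustration of how AudioLDM's text-to-audio generation is typically driven from Python: the sketch below assumes the `audioldm` pip package, and the `build_model`, `text_to_audio`, and `save_wave` names and arguments are taken from that package's API, so treat them as assumptions if your installed version differs.
```python
# Minimal sketch, assuming the `audioldm` pip package (pip install audioldm).
# Function names/arguments follow its published API; verify against your version.
from audioldm import build_model, text_to_audio, save_wave

model = build_model()  # loads the default AudioLDM checkpoint
waveform = text_to_audio(
    model,
    "A hammer is hitting a wooden surface",  # text prompt
    duration=5,          # seconds of audio to generate
    guidance_scale=2.5,  # classifier-free guidance strength
)
save_wave(waveform, "./output", name="hammer")  # write the result as a .wav file
```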
🤯 LobeHub - an open-source, modern-design AI Agent Workspace. Supports multiple AI providers, Knowledge Base (file upload / RAG), one-click install MCP Marketplace and Artifacts / Thinking. One-cl…
GELab: GUI Exploration Lab. One of the best GUI agent solutions in the galaxy, built by the StepFun-GELab team and powered by Step’s research capabilities.
Turn any PDF or image document into structured data for your AI. A powerful, lightweight OCR toolkit that bridges the gap between images/PDFs and LLMs. Supports 100+ languages.
Lightning-Fast, On-Device TTS — running natively via ONNX.
Official implementation of YingMusic-SVC.
τ²-Bench: Evaluating Conversational Agents in a Dual-Control Environment
We Speech Toolkit: an LLM-based toolkit for speech understanding, generation, and interaction
FLM-Audio is an audio-language sub-version of RoboEgo/FLM-Ego, an omnimodal model with native full duplexity.
Qwen3-omni is a natively end-to-end, omni-modal LLM developed by the Qwen team at Alibaba Cloud, capable of understanding text, audio, images, and video, as well as generating speech in real time.
The official implementation of OmniFlow: Any-to-Any Generation with Multi-Modal Rectified Flows
RAGFlow is a leading open-source Retrieval-Augmented Generation (RAG) engine that fuses cutting-edge RAG with Agent capabilities to create a superior context layer for LLMs
DeepResearchAgent is a hierarchical multi-agent system designed not only for deep research tasks but also for general-purpose task solving. The framework leverages a top-level planning agent to coo…
MiMo-Audio: Audio Language Models are Few-Shot Learners
A multimodal RAG application that enables semantic search on multimedia sources like audio, video and images
A lightweight, powerful framework for multi-agent workflows
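Assuming this entry refers to the OpenAI Agents SDK (openai/openai-agents-python), a minimal hedged sketch of a single-agent run looks like the following; the `Agent`/`Runner` names come from that SDK's quickstart, and the prompt text is purely illustrative.
```python
# Minimal sketch, assuming the OpenAI Agents SDK (pip install openai-agents)
# and an OPENAI_API_KEY set in the environment.
from agents import Agent, Runner

agent = Agent(
    name="Assistant",
    instructions="You are a concise, helpful assistant.",
)

# Runner.run_sync drives one agent loop to completion and returns the result.
result = Runner.run_sync(agent, "Summarize what a multi-agent workflow is.")
print(result.final_output)
```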
Official Repository of "OmniTry: Virtual Try-On Anything without Masks"
FlashCosyVoice: A lightweight vLLM implementation built from scratch for CosyVoice.
Accepted as a NeurIPS 2024 Spotlight Presentation paper.
LLM-powered framework for deep document understanding, semantic retrieval, and context-aware answers using RAG paradigm.