Shanghai Jiao Tong University & Shanghai Innovation Institute
- Shanghai
https://zhikangniu.github.io/
Lists (29)
Agent
ASR
Awesome List
Bench
Chinese LLM
Codec
CV
Dataset/Tools/Course
Diffusion
emotion
Framework
front
LLM
Music Generation
nano
nlp
other
pipeline
Podcast
PyTorch
RLHF
s2st
speaker diarization
T2V
TTS
tutorial
unify
V2A
Vocoder
Stars
LLaDA2.0-Uni: Understanding and Generation the World.
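Sharing AI Infra knowledge and coding exercises: getting started with the PyTorch/vLLM/SGLang frameworks ⚡️, performance acceleration 🚀, LLM fundamentals 🧠, AI software and hardware 🔧, and more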
A unified framework for easy reinforcement learning in Flow-Matching models
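Distills a doctoral advisor's ten years of research experience into directly callable AI skills. From idea conception to paper submission, your AI research co-advisor.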
Practical, Colab-friendly notebooks for fine-tuning and running audio AI models
Pipeline to help users extract content from long meeting recordings.
😼 Use a clash/mihomo-based proxy environment elegantly
Language modelling on RVQ tokens with minimal code
Awesome Multimodal Modeling [Covers MLLM, UMM, and NMM]
Some comprehensive papers about speaker diarization
📚 "Building Agents from Scratch": a from-scratch tutorial on agent principles and practice
🤗 ml-intern: an open-source ML engineer that reads papers, trains models, and ships ML models
Self-evolving agent: grows skill tree from 3.3K-line seed, achieving full system control with 6x less token consumption
A framework for efficient model inference with omni-modality models
Hy3 preview (295B A21B), a leading reasoning and agent model in its size, with great cost efficiency
X-VC: Zero-shot Streaming Voice Conversion in Codec Space
Research Writing Assistant
Repository for training models for music source separation.
Text-to-text alignment algorithm for speech recognition error analysis.
The project page of DiTReducio (Accepted by ACL 2026 Findings)
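🎓 All-in-one AI assistant for Shanghai Jiao Tong University: an OpenClaw-based SJTU campus Skill pack | 19 features covering deadlines (DDL), course selection, email, canteens, library, and PPT generation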
High-Quality Voice Cloning TTS for 600+ Languages
💥 Blazing fast terminal file manager written in Rust, based on async I/O.
Parameter-efficient text-to-audio generation for edge and low-memory deployment.