Zhikang Niu-SII ZhikangNiu

🎯

focus

Ph.D. Student, SJTU @X-LANCE & SII @sii-research | Intern @ MiniMax @ Shanghai AILab @ MSRA

391 followers · 640 following

Shanghai Jiao Tong University & Shanghai Innovation Institute
Shanghai
https://zhikangniu.github.io/

Achievements

x2 x3 x2

Stars

k2-fsa / OmniVoice

High-Quality Voice Cloning TTS for 600+ Languages

Python 2,862 431 Updated Apr 9, 2026

sxyazi / yazi

💥 Blazing fast terminal file manager written in Rust, based on async I/O.

Rust 36,239 803 Updated Apr 11, 2026

junxi25liu / TinyAudio

Parameter-efficient text-to-audio generation for edge and low-memory deployment.

7 Updated Apr 7, 2026

1EchA / how-to-vibecoding

Vibecoding tutorial series: from environment setup to multi-agent collaboration, covering MCP, Skills, and agent role division and governance

537 42 Updated Apr 9, 2026

AutoArk / GPA

[AutoArk] GPA (General Purpose Audio) can do ASR, TTS and voice conversion with one tiny 300M model!

Python 106 17 Updated Apr 7, 2026

freshe / poddl

Podcast Downloader - Download all podcasts / episodes from an RSS-feed

C++ 189 18 Updated Jan 13, 2026

Soul-AILab / SoulX-Duplug

Plug-and-play streaming semantic VAD for real-time full-duplex spoken dialogue systems.

Python 177 16 Updated Mar 20, 2026

RightNow-AI / autokernel

Autoresearch for GPU kernels. Give it any PyTorch model, go to sleep, wake up to optimized Triton kernels.

Python 1,175 112 Updated Mar 19, 2026

ZhikangNiu / terminal-setup

Shell 2 Updated Mar 31, 2026

Blaizzy / mlx-audio

A text-to-speech (TTS), speech-to-text (STT) and speech-to-speech (STS) library built on Apple's MLX framework, providing efficient speech analysis on Apple Silicon.

Python 6,647 543 Updated Apr 7, 2026

OpenMOSS / MOSS-Audio-Tokenizer

MOSS-Audio-Tokenizer is a Causal Transformer-based audio tokenizer built on the CAT architecture. Trained on 3M hours of diverse audio, it supports streaming and variable bitrates, delivering SOTA …

Python 183 11 Updated Mar 6, 2026

vercel-labs / skills

The open agent skills tool - npx skills

TypeScript 13,681 1,110 Updated Apr 6, 2026

wanshuiyin / Auto-claude-code-research-in-sleep

ARIS ⚔️ (Auto-Research-In-Sleep) — Lightweight Markdown-only skills for autonomous ML research: cross-model review loops, idea discovery, and experiment automation. No framework, no lock-in — works…

Python 6,115 557 Updated Apr 10, 2026

facebookresearch / fvcore

Collection of common code that's shared among different research projects in FAIR computer vision team.

Python 2,233 237 Updated Mar 15, 2026

xiquan-li / Resonate

Pre-training, SFT, DPO and GRPO for Text-to-Audio Generation

Python 44 6 Updated Mar 13, 2026

HeartMuLa / heartlib

HeartMuLa Official Repo: The Most Powerful Open-Source Music Generation Model of 2026

Python 4,021 368 Updated Apr 10, 2026

xiaomi-research / dasheng-tokenizer

State-of-the-art continuous audio tokenization

34 Updated Mar 9, 2026

duoan / TorchCode

🔥 LeetCode for PyTorch — practice implementing softmax, attention, GPT-2 and more from scratch with instant auto-grading. Jupyter-based, self-hosted or try online.

Jupyter Notebook 3,464 285 Updated Mar 27, 2026

FireRedTeam / FireRedVAD

A SOTA Industrial-Grade Voice Activity Detection & Audio Event Detection, supporting 100+ languages, outperforming Silero-VAD, TEN-VAD, FunASR-VAD and WebRTC-VAD

Python 347 26 Updated Apr 4, 2026

datawhalechina / musiclm-universe

Music Language Model Generation, Optimization, and Practice

Jupyter Notebook 41 5 Updated Apr 10, 2026

inclusionAI / Ming-omni-tts

Ming-omni-tts: Simple and Efficient Unified Generation of Speech, Music, and Sound with Precise Control

Python 219 16 Updated Feb 26, 2026

obra / superpowers

An agentic skills framework & software development methodology that works.

Shell 145,719 12,494 Updated Apr 10, 2026

FireRedTeam / FireRedASR2S

A SOTA Industrial-Grade All-in-One ASR system with ASR, VAD, LID, and Punc modules. FireRedASR2 supports Chinese (Mandarin, 20+ dialects/accents), English, code-switching, and both speech and singi…

Python 461 27 Updated Mar 24, 2026

Parakeet-Inc / J-HARD-TTS-Eval

Python 20 1 Updated Jan 28, 2026

Soul-AILab / SoulX-Singer

Official inference code for SoulX-Singer: Towards High-Quality Zero-Shot Singing Voice Synthesis

Python 541 59 Updated Mar 26, 2026

HamzaElshafie / gpt-oss-20B

A PyTorch implementation of the GPT-OSS-20B architecture. All components are coded from scratch: RoPE with YaRN, RMSNorm, SwiGLU with clamping and residual connection, Mixture-of-Experts (MoE), Sel…

Python 228 15 Updated Dec 2, 2025

MahmoudAshraf97 / ctc-forced-aligner

Text to speech alignment using CTC forced alignment

Python 480 83 Updated Feb 23, 2026

K-Dense-AI / claude-scientific-writer

A general purpose scientific writer

Python 1,457 181 Updated Mar 9, 2026

juhayna-zh / AudioControlNet

Official repository for the paper "Audio ControlNet for Fine-Grained Audio Generation and Editing".

Python 70 3 Updated Feb 7, 2026

Leey21 / awesome-ai-research-writing

Elevate your AI research writing, no more tedious polishing ✨

16,790 1,345 Updated Mar 25, 2026

Lists (28)

ASR

Awesome List

Bench

Chinese LLM

Codec

CV

Dataset/Tools/Course

Diffusion

emotion

Framework

front

LLM

Music Generation

nano

nlp

other

pipeline

Podcast

PyTorch

RLHF

s2st

speaker diarization

T2V

TTS

tutorial

unify

V2A

Vocoder
