
High-Quality Voice Cloning TTS for 600+ Languages

Python 2,862 431 Updated Apr 9, 2026

💥 Blazing fast terminal file manager written in Rust, based on async I/O.

Rust 36,239 803 Updated Apr 11, 2026

Parameter-efficient text-to-audio generation for edge and low-memory deployment.

7 Updated Apr 7, 2026

Vibecoding tutorial series: from environment setup to multi-agent collaboration, covering MCP, Skills, and agent division-of-labor governance

537 42 Updated Apr 9, 2026

[AutoArk] GPA (General Purpose Audio) can do ASR, TTS and voice conversion with one tiny 300M model!

Python 106 17 Updated Apr 7, 2026

Podcast Downloader - Download all podcasts / episodes from an RSS-feed

C++ 189 18 Updated Jan 13, 2026

Plug-and-play streaming semantic VAD for real-time full-duplex spoken dialogue systems.

Python 177 16 Updated Mar 20, 2026

Autoresearch for GPU kernels. Give it any PyTorch model, go to sleep, wake up to optimized Triton kernels.

Python 1,175 112 Updated Mar 19, 2026
Shell 2 Updated Mar 31, 2026

A text-to-speech (TTS), speech-to-text (STT) and speech-to-speech (STS) library built on Apple's MLX framework, providing efficient speech analysis on Apple Silicon.

Python 6,647 543 Updated Apr 7, 2026

MOSS-Audio-Tokenizer is a Causal Transformer-based audio tokenizer built on the CAT architecture. Trained on 3M hours of diverse audio, it supports streaming and variable bitrates, delivering SOTA …

Python 183 11 Updated Mar 6, 2026

The open agent skills tool - npx skills

TypeScript 13,681 1,110 Updated Apr 6, 2026

ARIS ⚔️ (Auto-Research-In-Sleep) — Lightweight Markdown-only skills for autonomous ML research: cross-model review loops, idea discovery, and experiment automation. No framework, no lock-in — works…

Python 6,115 557 Updated Apr 10, 2026

Collection of common code that's shared among different research projects in FAIR computer vision team.

Python 2,233 237 Updated Mar 15, 2026

Pre-training, SFT, DPO and GRPO for Text-to-Audio Generation

Python 44 6 Updated Mar 13, 2026

HeartMuLa Official Repo: The Most Powerful Open-Source Music Generation Model of 2026

Python 4,021 368 Updated Apr 10, 2026

State-of-the-art continuous audio tokenization

34 Updated Mar 9, 2026

🔥 LeetCode for PyTorch — practice implementing softmax, attention, GPT-2 and more from scratch with instant auto-grading. Jupyter-based, self-hosted or try online.

Jupyter Notebook 3,464 285 Updated Mar 27, 2026

A SOTA Industrial-Grade Voice Activity Detection & Audio Event Detection, supporting 100+ languages, outperforming Silero-VAD, TEN-VAD, FunASR-VAD and WebRTC-VAD

Python 347 26 Updated Apr 4, 2026

Music Language Model Generation, Optimization, and Practice

Jupyter Notebook 41 5 Updated Apr 10, 2026

Ming-omni-tts: Simple and Efficient Unified Generation of Speech, Music, and Sound with Precise Control

Python 219 16 Updated Feb 26, 2026

An agentic skills framework & software development methodology that works.

Shell 145,719 12,494 Updated Apr 10, 2026

A SOTA Industrial-Grade All-in-One ASR system with ASR, VAD, LID, and Punc modules. FireRedASR2 supports Chinese (Mandarin, 20+ dialects/accents), English, code-switching, and both speech and singi…

Python 461 27 Updated Mar 24, 2026
Python 20 1 Updated Jan 28, 2026

Official inference code for SoulX-Singer: Towards High-Quality Zero-Shot Singing Voice Synthesis

Python 541 59 Updated Mar 26, 2026

A PyTorch implementation of the GPT-OSS-20B architecture. All components are coded from scratch: RoPE with YaRN, RMSNorm, SwiGLU with clamping and residual connection, Mixture-of-Experts (MoE), Sel…

Python 228 15 Updated Dec 2, 2025

Text to speech alignment using CTC forced alignment

Python 480 83 Updated Feb 23, 2026

A general purpose scientific writer

Python 1,457 181 Updated Mar 9, 2026

Official repository for the paper "Audio ControlNet for Fine-Grained Audio Generation and Editing".

Python 70 3 Updated Feb 7, 2026

Elevate your AI research writing, no more tedious polishing ✨

16,790 1,345 Updated Mar 25, 2026