Stars
CMU 11-711 Advanced NLP https://cmu-l3.github.io/anlp-spring2026/
DEMON: Diffusion Engine for Musical Orchestrated Noise
Official code for "MIDI-Informed Singing Accompaniment Generation in a Compositional Song Pipeline"
Skills for Real Engineers. Straight from my .claude directory.
High-Quality Voice Cloning TTS for 600+ Languages
ASLP-lab / DiffRhythm2
Forked from xiaomi-research/diffrhythm2
Di♪♪Rhythm 2: Efficient And High Fidelity Song Generation Via Block Flow Matching
PyTorch implementation of Audio Flamingo: Series of Advanced Audio Understanding Language Models
MOSS-Music is an open-source music understanding model targeting musical captioning, lyrics ASR, structural analysis, chord / key / tempo reasoning, and long-form musical question answering.
A ComfyUI custom node integration for local multi-engine multi-language Text-to-Speech and Voice Conversion. Supports: RVC, Echo-TTS, Qwen3-TTS, Cozy Voice 3, Step Audio EditX, IndexTTS-2, Chatterb…
This package allows macOS Finder to display thumbnails, static QuickLook previews, cover art and metadata for most types of video files.
A small Python library that publishes the DNS-SD service for a plugin.
TouchDesigner CHOP plugins powered by Essentia for real-time audio analysis
PyTorch implementations of two transformer-based music source separation models.
A Semantically Consistent Dataset for Data-Efficient Query-Based Universal Sound Separation
AI agents running research on single-GPU nanochat training automatically
Local AI music generator with smart lyrics: Gradio web UI for HeartMuLa + Ollama/OpenAI, tags, history, and high-fidelity audio.
Official inference code for SoulX-Singer: Towards High-Quality Zero-Shot Singing Voice Synthesis
Contains SOTA methods that follow time-varying conditions for Text-to-Music
A Python library for extracting structured information from unstructured text using LLMs with precise source grounding and interactive visualization.
Variations of the L1 SNR loss function for training audio source separation machine learning models
The most powerful local music generation model that outperforms almost all commercial alternatives, supporting Mac, AMD, Intel, and CUDA devices.