timedomAIn
- Beijing
- seanweichat
Lists (28)
3d-rendering
unity or other 3D rendering related
AI_tricks
audio_framework
audio-generation
models for audio generation
bigData
blockchain
chatGPTxxx
dataset
DeepLearning—learning
dsp
game_framework
game_graphic
game_physics
image_generation
xxGAN, diffusion..
infra
interesting
large_model
MIR_ASR
ML_model deploy/optimization
music-generation
nlp
other_tools
server_dev
TTS_or_singing-sythesis
deep-learning papers for MIR, TTS for singing synthesis
ui_framework
vocoder
voice-conversion
webui
Stars
The repository provides code for running inference with the Meta Segment Anything Audio Model (SAM-Audio), links for downloading the trained model checkpoints, and example notebooks that show how t…
Efficient Triton Kernels for LLM Training
SGLang is a fast serving framework for large language models and vision language models.
Fused Qwen3 MoE layer for faster training, compatible with HF Transformers, LoRA, 4-bit quant, Unsloth
Generative models for conditional audio generation
An open agentic system built on smolagents, integrating multimodal state-of-the-art music AI models for understanding, generation, and interaction.
Use Claude Code as the foundation for coding infrastructure, allowing you to decide how to interact with the model while enjoying updates from Anthropic.
Context engineering is the new vibe coding - it's the way to actually make AI coding assistants work. Claude Code is the best for this so that's what this repo is centered around, but you can apply…
An open-source AI agent that lives in your terminal.
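A lightweight, flexible, and easy-to-use Python tool for generating and exporting JianYing (剪映) drafts, for building fully automated video editing and remixing pipelines. The CapCut version of this project is being developed at https://github.com/GuanYixuan/pyCapCut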
PDMX: A Large-Scale Public Domain MusicXML Dataset for Symbolic Music Processing
Simple and efficient pytorch-native transformer text generation in <1000 LOC of python.
The official code repository for SongBloom: Coherent Song Generation via Interleaved Autoregressive Sketching and Diffusion Refinement
No fortress, purely open ground. OpenManus is Coming.
FlowGram is an extensible workflow development framework with built-in canvas, form, variable, and materials that helps developers build AI workflow platforms faster and simpler.
Open-source industrial-grade ASR models supporting Mandarin, Chinese dialects and English, achieving a new SOTA on public Mandarin ASR benchmarks, while also offering outstanding singing lyrics rec…
ACE-Step: A Step Towards Music Generation Foundation Model
Dataset and code of GTSinger (NeurIPS 2024 Spotlight): A Global Multi-Technique Singing Corpus with Realistic Music Scores for All Singing Tasks
Kimi-Audio, an open-source audio foundation model excelling in audio understanding, generation, and conversation
Transforms complex documents like PDFs into LLM-ready markdown/JSON for your Agentic workflows.
PyTorch-implementations of Flow Models for toy data
Official implementation of the paper: "FlowEdit: Inversion-Free Text-Based Editing Using Pre-Trained Flow Models"
A Model Context Protocol server for searching and analyzing arXiv papers
Encode and decode audio samples to/from compressed latent representations!
Multilingual Voice Understanding Model