Stars
Official Python inference and LoRA trainer package for the LTX-2 audio–video generative model.
The most powerful training scripts for ACE-Step 1.5 including a Command Line Interface, a Terminal Wizard and a Graphical User Interface.
openDAW is a next-generation web-based Digital Audio Workstation (DAW)
An Industrial-Level Controllable and Efficient Zero-Shot Text-To-Speech System
Your own personal AI assistant. Any OS. Any Platform. The lobster way. 🦞
The most powerful local music generation model that outperforms most commercial alternatives, supporting Mac, AMD, Intel, and CUDA devices.
VoxCPM: Tokenizer-Free TTS for Context-Aware Speech Generation and True-to-Life Voice Cloning
A music repair method that converts lossy MP3-compressed music to lossless music.
Main reference implementation for NLWeb, implemented in Python.
ACE-Step: A Step Towards Music Generation Foundation Model
Robust Speech Recognition via Large-Scale Weak Supervision
🐸💬 - a deep learning toolkit for Text-to-Speech, battle-tested in research and production
🤗 LeRobot: Making AI for Robotics more accessible with end-to-end learning
Audio Plugin for Audio to MIDI transcription using deep learning.
⏩ Source-controlled AI checks, enforceable in CI. Powered by the open-source Continue CLI
A zero-config VS Code database extension with affordances to aid development and debugging.
This repository contains the code for "A Lip Sync Expert Is All You Need for Speech to Lip Generation In the Wild", published at ACM Multimedia 2020. For the HD commercial model, try Sync Labs.
[CVPR 2025] MMAudio: Taming Multimodal Joint Training for High-Quality Video-to-Audio Synthesis
Code for FLAVR: A fast and efficient frame interpolation technique.
Build and share delightful machine learning apps, all in Python. 🌟 Star to support our work!
Nodes related to video workflows