Starred repositories
A latent text-to-image diffusion model
Text-Prompted Generative Audio Model
Audiocraft is a library for audio processing and generation with deep learning. It features the state-of-the-art EnCodec audio compressor / tokenizer, along with MusicGen, a simple and controllable…
Grounded SAM: Marrying Grounding DINO with Segment Anything & Stable Diffusion & Recognize Anything - Automatically Detect, Segment and Generate Anything
StableLM: Stability AI Language Models
A multi-voice TTS system trained with an emphasis on quality
Foundational Models for State-of-the-Art Speech and Text Translation
Zero-Shot Speech Editing and Text-to-Speech in the Wild
Sweep: AI coding assistant for JetBrains
Official Code for Stable Cascade
Bringing stable diffusion models to web browsers. Everything runs inside the browser with no server support.
[CVPR 2024] 4D Gaussian Splatting for Real-Time Dynamic Scene Rendering
[SIGGRAPH Asia 2023] Rerender A Video: Zero-Shot Text-Guided Video-to-Video Translation
Deep Reinforcement Learning: Zero to Hero!
OneDiff: An out-of-the-box acceleration library for diffusion models.
Takagi and Nishimoto, CVPR 2023
152334H / tortoise-tts-fast
Forked from neonbjb/tortoise-tts
Fast TorToiSe inference (5x or your money back!)
PeRFlow: Piecewise Rectified Flow as Universal Plug-and-Play Accelerator (NeurIPS 2024)
N3RP (the NFT Rental Protocol) allows users to trustlessly rent out their ERC721-based assets.