Stars
docTR (Document Text Recognition) - a seamless, high-performing & accessible library for OCR-related tasks powered by Deep Learning.
OpenMMLab Text Detection, Recognition and Understanding Toolbox
Official implementation of Character Region Awareness for Text Detection (CRAFT)
SwarmUI (formerly StableSwarmUI), a modular Stable Diffusion web user interface, with an emphasis on easily accessible power tools, high performance, and extensibility.
Cosmos-Predict2 is a collection of general-purpose world foundation models for Physical AI that can be fine-tuned into customized world models for downstream applications.
A TTS model capable of generating ultra-realistic dialogue in one pass.
MuseTalk: Real-Time High Quality Lip Synchronization with Latent Space Inpainting
FP-Studio / framepack-studio
Forked from lllyasviel/FramePack
Expanding FramePack into a multifunction video creation tool
Let's make video diffusion practical!
AI PC starter app for doing AI image creation, image stylizing, and chatbot on a PC powered by an Intel® Arc™ GPU.
QualityScaler - image/video AI upscaler app
[CVPR 2025] MMAudio: Taming Multimodal Joint Training for High-Quality Video-to-Audio Synthesis
🔊 Text-Prompted Generative Audio Model
ViMax: Agentic Video Generation (Director, Screenwriter, Producer, and Video Generator All-in-One)
deepbeepmeep / Wan2GP
Forked from Wan-Video/Wan2.1
A fast AI Video Generator for the GPU Poor. Supports Wan 2.1/2.2, Qwen Image, Hunyuan Video, LTX Video and Flux.
Enjoy the magic of Diffusion models!
SkyReels-V2: Infinite-length Film Generative model
Open-Sora: Democratizing Efficient Video Production for All
The AI Toolkit for TypeScript. From the creators of Next.js, the AI SDK is a free open-source library for building AI-powered applications and agents
Prompt, run, edit, and deploy full-stack web applications. -- bolt.new -- Help Center: https://support.bolt.new/ -- Community Support: https://discord.com/invite/stackblitz
stackblitz-labs / bolt.diy
Forked from stackblitz/bolt.new
Prompt, run, edit, and deploy full-stack web applications using any LLM you want!
The FLARE team's open-source tool to identify capabilities in executable files.
<⚡️> SuperAGI - A dev-first open source autonomous AI agent framework. Enabling developers to build, manage & run useful autonomous agents quickly and reliably.
The most powerful and modular diffusion model GUI, API, and backend with a graph/nodes interface.
SiriuS: Self-improving Multi-agent Systems via Bootstrapped Reasoning