Stars
21 Lessons, Get Started Building with Generative AI
Real-ESRGAN aims at developing Practical Algorithms for General Image/Video Restoration.
🤗 Diffusers: State-of-the-art diffusion models for image, video, and audio generation in PyTorch.
[EMNLP 2025 Demo] PDF scientific paper translation with preserved formats - AI-based full-text bilingual translation of PDF documents with the layout fully preserved; supports Google/DeepL/Ollama/OpenAI and other services; provides CLI/GUI/MCP/Docker/Zotero
A curated collection of fun and creative examples generated with Nano Banana & Nano Banana Pro🍌, Gemini-2.5-flash-image based model. We also release Nano-consistent-150K openly to support the commu…
Janus-Series: Unified Multimodal Understanding and Generation Models
👶🏻 Encyclopedia of computer science fundamentals & technical interview questions for entry-level developers 📖
LLM-powered framework for deep document understanding, semantic retrieval, and context-aware answers using the RAG paradigm.
Streamlined interface for generating images with AI in Krita. Inpaint and outpaint with optional text prompt, no tweaking required.
🦙 LaMa Image Inpainting, Resolution-robust Large Mask Inpainting with Fourier Convolutions, WACV 2022
Using Low-rank adaptation to quickly fine-tune diffusion models.
The image prompt adapter is designed to enable a pretrained text-to-image diffusion model to generate images from an image prompt.
Segment Anything in Medical Images
Paper2Code: Automating Code Generation from Scientific Papers in Machine Learning
Everything about the SmolLM and SmolVLM family of models
A Python tool that uses GPT-4, FFmpeg, and OpenCV to automatically analyze videos, extract the most interesting sections, and crop them for an improved viewing experience.
Large Language-and-Vision Assistant for Biomedicine, built towards multimodal GPT-4 level capabilities.
Famous Vision Language Models and Their Architectures
📚 A collection of Deep Learning based Image Colorization and Video Colorization papers.
FaceScape (PAMI2023 & CVPR2020)
Select a portrait, click to move the head around (please use your own space / GPU!)
Medical SAM 2: Segment 3D Medical Images Via Segment Anything Model 2
[ECCV 2024] HiDiffusion: Increases the resolution and speed of your diffusion model by only adding a single line of code!
"DeepDPM: Deep Clustering With An Unknown Number of Clusters" [Ronen, Finder, and Freifeld, CVPR 2022]
[SIGGRAPH'24] CharacterGen: Efficient 3D Character Generation from Single Images with Multi-View Pose Canonicalization
Prompt-Free Diffusion: Taking "Text" out of Text-to-Image Diffusion Models, arxiv 2023 / CVPR 2024
A tool to divide a single illustration into a layered structure.