Stars
A latent text-to-image diffusion model
The repository provides code for running inference with the Segment Anything Model (SAM), links for downloading the trained model checkpoints, and example notebooks that show how to use the model.
CLIP (Contrastive Language-Image Pretraining): predict the most relevant text snippet given an image
The repository provides code for running inference with the Meta Segment Anything Model 2 (SAM 2), links for downloading the trained model checkpoints, and example notebooks that show how to use th…
Qwen3-VL is the multimodal large language model series developed by the Qwen team, Alibaba Cloud.
StableLM: Stability AI Language Models
High-Resolution Image Synthesis with Latent Diffusion Models
PyTorch code and models for the DINOv2 self-supervised learning method.
QLoRA: Efficient Finetuning of Quantized LLMs
Reference PyTorch implementation and models for DINOv3
FaceChain is a deep-learning toolchain for generating your Digital-Twin.
Implementation of Dreambooth (https://arxiv.org/abs/2208.12242) with Stable Diffusion
Using Low-rank adaptation to quickly fine-tune diffusion models.
Python code for the "Probabilistic Machine Learning" book by Kevin Murphy
Accepted as [NeurIPS 2024] Spotlight Presentation Paper
Segment Anything in High Quality [NeurIPS 2023]
Official codebase used to develop Vision Transformer, SigLIP, MLP-Mixer, LiT and more.
📚 Jupyter notebook tutorials for OpenVINO™
Official repository of 'Visual-RFT: Visual Reinforcement Fine-Tuning' & 'Visual-ARFT: Visual Agentic Reinforcement Fine-Tuning'
Paper2Agent is a multi-agent AI system that automatically transforms research papers into interactive AI agents with minimal human input.
Benchmarking Knowledge Transfer in Lifelong Robot Learning
Concept Sliders for Precise Control of Diffusion Models
[ECCV'2024] Gaussian Grouping for open-world Anything reconstruction, segmentation and editing.
[ICCV 2021 - Oral] Official PyTorch implementation for Generic Attention-model Explainability for Interpreting Bi-Modal and Encoder-Decoder Transformers, a novel method to visualize any Transformer-…
Erasing Concepts from Diffusion Models
[NeurIPS 2021] [T-PAMI] Global Filter Networks for Image Classification
Build your own visual reasoning model
CycleResearcher: Improving Automated Research via Automated Review
Code for ICML 2025 Paper "Highly Compressed Tokenizer Can Generate Without Training"