Stars
Datasets, Transforms and Models specific to Computer Vision
Automatically crawl arXiv papers daily, summarize them using AI, and present them using GitHub Pages.
The largest collection of PyTorch image encoders / backbones. Including train, eval, inference, export scripts, and pretrained weights -- ResNet, ResNeXt, EfficientNet, NFNet, Vision Transformer (V…
NumPy aware dynamic Python compiler using LLVM
Fast and memory-efficient exact attention
🐍 Geometric Computer Vision Library for Spatial AI
[CVPR 2026] Beyond Generation: Advancing Image Editing Priors for Depth and Normal Estimation
Advanced AI Explainability for computer vision. Support for CNNs, Vision Transformers, Classification, Object detection, Segmentation, Image similarity and more.
Implementation of Vision Transformer, a simple way to achieve SOTA in vision classification with only a single transformer encoder, in PyTorch
CLIP (Contrastive Language-Image Pretraining): predicts the most relevant text snippet given an image
A Curated List of Awesome Works in World Modeling, Aiming to Serve as a One-stop Resource for Researchers, Practitioners, and Enthusiasts Interested in World Modeling.
A Python library for self-supervised learning on images.
Awesome Unified Multimodal Models
Detectron2 is a platform for object detection, segmentation and other visual recognition tasks.
The repository provides code for running inference with the Meta Segment Anything Model 2 (SAM 2), links for downloading the trained model checkpoints, and example notebooks that show how to use th…
[NeurIPS'25] Improves CLIP's ability to capture visual detail by inverting the unCLIP generative model.
Blocks specific sites from appearing in Google search results
Collection of diffusion model papers categorized by their subareas
Collection of common code that's shared among different research projects in FAIR computer vision team.
This project aims to enhance the working environment on Windows
Most popular metrics used to evaluate object detection algorithms.
Resources and paper list for "Thinking with Images for LVLMs". This repository accompanies our survey on how LVLMs can leverage visual information for complex reasoning, planning, and generation.
[ECCV 2024] Official Repository for DiffiT: Diffusion Vision Transformers for Image Generation
Stable Diffusion web UI
[ICLR'25 Oral] No Pose, No Problem: Surprisingly Simple 3D Gaussian Splats from Sparse Unposed Images
This repository collects papers on VLLM applications. New papers are added irregularly.
PyTorch implementation of MAR+DiffLoss https://arxiv.org/abs/2406.11838
High-fidelity performance metrics for generative models in PyTorch