Stars
Datasets, Transforms and Models specific to Computer Vision
A highly efficient implementation of Gaussian Processes in PyTorch
PySlowFast: video understanding codebase from FAIR for reproducing state-of-the-art video models.
Cambrian-1 is a family of multimodal LLMs with a vision-centric design.
The largest collection of PyTorch image encoders / backbones. Including train, eval, inference, export scripts, and pretrained weights -- ResNet, ResNeXT, EfficientNet, NFNet, Vision Transformer (V…
A data augmentations library for audio, image, text, and video.
Implementation of Vision Transformer, a simple way to achieve SOTA in vision classification with only a single transformer encoder, in PyTorch
solo-learn: a library of self-supervised methods for visual representation learning powered by PyTorch Lightning
Official PyTorch Implementation of "Diffusion Transformers with Representation Autoencoders"
This repository includes the official implementation of our paper "Beyond Next-Token: Next-X Prediction for Autoregressive Visual Generation"
PyTorch implementation of MoCo: https://arxiv.org/abs/1911.05722
Acceptance rates for the major AI conferences
A set of examples around pytorch in Vision, Text, Reinforcement Learning, etc.
The easiest way to use deep metric learning in your application. Modular, flexible, and extensible. Written in PyTorch.
This repository provides the code and model checkpoints for AIMv1 and AIMv2 research projects.
Large-scale Self-supervised Pre-training Across Tasks, Languages, and Modalities
Most popular metrics used to evaluate object detection algorithms.
Fast and flexible image augmentation library. Paper about the library: https://www.mdpi.com/2078-2489/11/2/125
[MICCAI 2019 Young Scientist Award] [MEDIA 2020 Best Paper Award] Models Genesis
Official Implementation of Diffusion Step Annealing (DiSA) in Autoregressive Image Generation
[NeurIPS 2024 Best Paper Award][GPT beats diffusion🔥] [scaling laws in visual generation📈] Official impl. of "Visual Autoregressive Modeling: Scalable Image Generation via Next-Scale Prediction". A…
Code for "Diffusion Forcing: Next-token Prediction Meets Full-Sequence Diffusion"
[CVPR 2022] Pre-Training 3D Point Cloud Transformers with Masked Point Modeling
Codebase for evaluation of deep generative models as presented in "Exposing flaws of generative model evaluation metrics and their unfair treatment of diffusion models"
PyTorch code and models for V-JEPA self-supervised learning from video.
Deep Learning Visualization Toolkit (PaddlePaddle deep learning visualization tool)
Official Implementation for "Consistency Flow Matching: Defining Straight Flows with Velocity Consistency"