Stars
Official PyTorch Implementation of "Diffusion Transformers with Representation Autoencoders"
All-In-One VLM: Image + Video + Transfer to Other Languages / Domains (TPAMI 2023)
Official Implementation of Diffusion Step Annealing (DiSA) in Autoregressive Image Generation
This repository includes the official implementation of our paper "Beyond Next-Token: Next-X Prediction for Autoregressive Visual Generation"
This repository provides the code and model checkpoints for AIMv1 and AIMv2 research projects.
Codebase for evaluation of deep generative models as presented in "Exposing flaws of generative model evaluation metrics and their unfair treatment of diffusion models"
Code for "Diffusion Forcing: Next-token Prediction Meets Full-Sequence Diffusion"
Official Implementation for "Consistency Flow Matching: Defining Straight Flows with Velocity Consistency"
Cambrian-1 is a family of multimodal LLMs with a vision-centric design.
[NeurIPS'24] Official PyTorch Implementation of "Seeing the Image: Prioritizing Visual Correlation by Contrastive Alignment"
[NeurIPS 2024 Best Paper Award][GPT beats diffusion🔥] [scaling laws in visual generation📈] Official impl. of "Visual Autoregressive Modeling: Scalable Image Generation via Next-Scale Prediction". A…
PyTorch code and models for V-JEPA self-supervised learning from video.
Open-Source implementation of FlexPredict paper (https://arxiv.org/pdf/2308.00566.pdf)
[ICCV 2023 Oral] Official Implementation of "Denoising Diffusion Autoencoders are Unified Self-supervised Learners"
Implementation of Muse: Text-to-Image Generation via Masked Generative Transformers, in PyTorch
Code for Fast Training of Diffusion Models with Masked Transformers
A collection of literature after or concurrent with Masked Autoencoder (MAE) (Kaiming He et al.).
This is a PyTorch implementation of "Context AutoEncoder for Self-Supervised Representation Learning"
PyTorch Repo for DeepGCNs (ICCV'2019 Oral, TPAMI'2021), DeeperGCN (arXiv'2020) and GNN1000 (ICML'2021): https://www.deepgcns.org
This is an official PyTorch/GPU implementation of SupMAE.
Code for the ICML 2021 (long talk) paper: "ViLT: Vision-and-Language Transformer Without Convolution or Region Supervision"
Dynamic contexts empowered visual-linguistic representation learning
Official repo for Directional Self-supervised Learning for Heavy Image Augmentations [CVPR2022]
Official repository for the paper "Self-Supervised Models are Continual Learners" (CVPR 2022)