Stars
PyTorch implementation of InstructDiffusion, a unifying and generic framework for aligning computer vision tasks with human instructions.
LAVIS - A One-stop Library for Language-Vision Intelligence
✨✨Latest Advances on Multimodal Large Language Models
Large-scale Self-supervised Pre-training Across Tasks, Languages, and Modalities
[ICCV 2023] Efficient Diffusion Training via Min-SNR Weighting Strategy
This is an official implementation of our CVPR 2023 paper "Human Pose as Compositional Tokens" (https://arxiv.org/pdf/2303.11638.pdf)
🚀 A simple way to launch, train, and use PyTorch models on almost any device and distributed configuration, with automatic mixed precision (including fp8) and easy-to-configure FSDP and DeepSpeed support
Painlessly create beautiful matplotlib plots.
Matplotlib styles for scientific plotting
An open source implementation of CLIP.
Programmer's guide to cooking at home (Simplified Chinese only).
This is an official implementation for "SimMIM: A Simple Framework for Masked Image Modeling".
Code release for "Masked-attention Mask Transformer for Universal Image Segmentation"
Replication of Pix2Seq with Pretrained Model
Efficient 3D Backbone Network for Temporal Modeling
Semi-Supervised Learning, Object Detection, ICCV2021
Python Fire is a library for automatically generating command line interfaces (CLIs) from absolutely any Python object.
Deformable DETR: Deformable Transformers for End-to-End Object Detection.
Per-Pixel Classification is Not All You Need for Semantic Segmentation (NeurIPS 2021, spotlight)
PyTorch implementation of SimCLR: A Simple Framework for Contrastive Learning of Visual Representations
This is an official implementation for "Self-Supervised Learning with Swin Transformers".
This is an official implementation for "Video Swin Transformers".
PySlowFast: video understanding codebase from FAIR for reproducing state-of-the-art video models.
The largest collection of PyTorch image encoders / backbones. Including train, eval, inference, export scripts, and pretrained weights -- ResNet, ResNeXT, EfficientNet, NFNet, Vision Transformer (V…
Words that Chinese programmers commonly mispronounce