Stars
AutoGPT is the vision of accessible AI for everyone, to use and to build on. Our mission is to provide the tools, so that you can focus on what matters.
Detectron2 is a platform for object detection, segmentation and other visual recognition tasks.
Use ChatGPT to summarize arXiv papers. Accelerates the whole research workflow with ChatGPT: full-paper summarization, professional translation, polishing, reviewing, and drafting review responses.
A markdown version of the emoji cheat sheet
Pretrained ConvNets for pytorch: NASNet, ResNeXt, ResNet, InceptionV4, InceptionResnetV2, Xception, DPN, etc.
PyTorch implementation of MAE https://arxiv.org/abs/2111.06377
PointNet and PointNet++ implemented in PyTorch (pure Python), with experiments on ModelNet, ShapeNet, and S3DIS.
Awesome Incremental Learning
🚀 Efficient implementations of state-of-the-art linear attention models
A collection of loss functions for medical image segmentation (see the Dice loss sketch after this list)
[ICML 2024] Vision Mamba: Efficient Visual Representation Learning with Bidirectional State Space Model
Collects papers on transformers for vision. Awesome Transformer with Computer Vision (CV)
A list of medical imaging datasets 『An Index for Medical Imaging Datasets』
VMamba: Visual State Space Models; the code is based on Mamba.
DeepLab v3+ model in PyTorch. Support different backbones.
Pointcept: Perceive the world with sparse points, a codebase for point cloud perception research. Latest works: Concerto (NeurIPS'25), Sonata (CVPR'25 Highlight), PTv3 (CVPR'24 Oral)
[CVPR 2023 Highlight] InternImage: Exploring Large-Scale Vision Foundation Models with Deformable Convolutions
AcadHomepage: A Modern and Responsive Academic Personal Homepage
Implementations of label smoothing, AM-Softmax, partial FC, focal loss, triplet loss, and Lovász-softmax. Maybe useful (see the focal loss sketch after this list)
PaperBanana: Automating Academic Illustration For AI Scientists
SpikingJelly is an open-source deep learning framework for Spiking Neural Networks (SNNs) based on PyTorch.
[NeurIPS 2022 Spotlight] VideoMAE: Masked Autoencoders are Data-Efficient Learners for Self-Supervised Video Pre-Training
❄️🔥 Visual Prompt Tuning [ECCV 2022] https://arxiv.org/abs/2203.12119
Scaling Up Your Kernels to 31x31: Revisiting Large Kernel Design in CNNs (CVPR 2022)
A curated list of awesome prompt/adapter learning methods for vision-language models like CLIP.
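
As a companion to the medical image segmentation loss collection starred above, here is a minimal soft Dice loss sketch in PyTorch. It assumes binary masks; the class name and the `smooth` default are illustrative and not taken from that repository, which generalizes this idea to multi-class and compound losses.

```python
# Minimal soft Dice loss sketch (binary segmentation), assuming logits of
# shape (N, 1, H, W) and target masks of the same shape.
import torch
import torch.nn as nn

class SoftDiceLoss(nn.Module):
    """Soft Dice loss: 1 - 2|P∩G| / (|P| + |G|), computed on probabilities."""
    def __init__(self, smooth: float = 1.0):
        super().__init__()
        self.smooth = smooth  # avoids division by zero on empty masks

    def forward(self, logits: torch.Tensor, targets: torch.Tensor) -> torch.Tensor:
        probs = torch.sigmoid(logits).flatten(1)       # (N, H*W) probabilities
        targets = targets.flatten(1).float()           # (N, H*W) ground truth
        intersection = (probs * targets).sum(dim=1)
        union = probs.sum(dim=1) + targets.sum(dim=1)
        dice = (2 * intersection + self.smooth) / (union + self.smooth)
        return 1 - dice.mean()

# Usage: loss = SoftDiceLoss()(model(images), masks)
```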
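Likewise, for the loss-function entry mentioning label smoothing and focal loss, a minimal multi-class focal loss sketch with label smoothing in PyTorch; the function name, `gamma`, and `smoothing` defaults are illustrative rather than the repository's own settings.

```python
# Minimal focal loss with label smoothing, assuming logits of shape (N, C)
# and integer class targets of shape (N,).
import torch
import torch.nn.functional as F

def focal_loss(logits: torch.Tensor, targets: torch.Tensor,
               gamma: float = 2.0, smoothing: float = 0.1) -> torch.Tensor:
    num_classes = logits.size(1)
    log_probs = F.log_softmax(logits, dim=1)           # (N, C)
    probs = log_probs.exp()
    # Smoothed one-hot targets: (1 - eps) on the true class, eps/(C-1) elsewhere.
    with torch.no_grad():
        true_dist = torch.full_like(log_probs, smoothing / (num_classes - 1))
        true_dist.scatter_(1, targets.unsqueeze(1), 1.0 - smoothing)
    # Focal modulation down-weights well-classified examples.
    focal_weight = (1.0 - probs) ** gamma
    loss = -(true_dist * focal_weight * log_probs).sum(dim=1)
    return loss.mean()

# Usage: loss = focal_loss(model(x), y)
```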