- Tongji University, Tsinghua University
- Beijing
- @1284915396
Starred repositories
This project aims to provide algorithm engineers entering the VLA (Vision-Language-Action) field with an all-Chinese, practice-oriented study and interview handbook. Unlike generic CV/NLP interview guides, it focuses on challenges specific to Robotics.
Dexbotic: Open-Source Vision-Language-Action Toolbox
Collection of Unsupervised Learning Methods for Vision-Language Models (VLMs)
ReTA: Reliable Test-Time Adaptation. (ACM MM 2025)
Official implementation of the ICLR 2025 paper "Multi-Label Test-Time Adaptation with Bound Entropy Minimization."
[ICML 2025] DPCore: Dynamic Prompt Coreset for Continual Test-Time Adaptation
Let's train Vision Transformers (ViT) on CIFAR-10 / CIFAR-100!
PyTorch reimplementation of the Vision Transformer (An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale)
One Prompt Word is Enough to Boost Adversarial Robustness for Pre-trained Vision-Language Models
Interactive Multi-Label CNN Learning with Partial Labels @ CVPR20
Official implementation of CVPR 2024 paper "vid-TLDR: Training Free Token merging for Light-weight Video Transformer".
Open Images is a dataset of ~9 million images that have been annotated with image-level labels and bounding boxes spanning thousands of classes.
Provides free Clash subscriptions: free SSR nodes, free Trojan nodes, free VMess nodes, and free Hysteria2 proxy servers.
Subscription links 🚀 shared for free ♻️ updated regularly ✨ for bypassing censorship 🌈 please do not abuse 🚫 one-click subscription 📪 SSR/CLASH/V2RAY
A proxy setup adapted for AutoDL platform servers, using Clash as the proxy tool.
Downsampled Open Images Dataset V4 with 15.4M bounding boxes for 600 categories on 1.9M images
SLCA: Slow Learner with Classifier Alignment for Continual Learning on a Pre-trained Model @ ICCV 2023 **AND** SLCA++: Unleash the Power of Sequential Fine-tuning for Continual Learning with Pre-tr…
Official code repo for CVPR 2024 Paper: Semantically-Shifted Incremental Adapter-Tuning is A Continual ViTransformer
A Latex template for journal review response (initially designed for IEEE TGRS)
[CVPR2024] Continual-MAE: Adaptive Distribution Masked Autoencoders for Continual Test-Time Adaptation
The largest collection of PyTorch image encoders / backbones. Including train, eval, inference, export scripts, and pretrained weights -- ResNet, ResNeXT, EfficientNet, NFNet, Vision Transformer (V…
A community implementation of the paper "EcoTTA: Memory-Efficient Continual Test-time Adaptation via Self-distilled Regularization"
Code repository for the paper - "Neural Priming for Sample-Efficient Adaptation"
A Survey on Multimodal Retrieval-Augmented Generation
[Paper][AAAI2024]Structure-CLIP: Towards Scene Graph Knowledge to Enhance Multi-modal Structured Representations
Knowledge Graphs Meet Multi-Modal Learning: A Comprehensive Survey