Stars
MoralStack is a governance and safety layer for LLM applications. It analyzes user requests before generation, evaluates risk and intent, and decides whether the AI should answer normally, answer s…
[CVPR 2026 Workshop] Official code and models for Plain Mask Transformer (PMT).
[CVPR 2026 Oral] Official repository for the paper: "INSID3: Training-Free In-Context Segmentation with DINOv3"
Code, models, and data for the NeurIPS'25 paper "Jamais Vu: Exposing the Generalization Gap in Supervised Semantic Correspondence"
[CVPR 2026] Official code and models for Video Encoder-only Mask Transformer (VidEoMT).
Code for the paper "Attention Meets Post-hoc Interpretability: A Mathematical Perspective", ICML 2024
Official Repository for "Communication Efficient Federated Learning with Generalized Heavy-Ball Momentum", accepted at TMLR 2025
Official code for "To Match or Not to Match: Revisiting Image Matching for Reliable Visual Place Recognition" CVPR IMW 2025
This repo collects materials (papers, code, slides) about SAM2 (Segment Anything in images and videos). We are continuously improving the project. PRs are welcome for works (papers, repos) th…
Official implementation of "HiERO: understanding the hierarchy of human behavior enhances reasoning on egocentric videos", accepted at ICCV 2025.
Official implementation of "Hier-EgoPack: Hierarchical Egocentric Video Understanding with Diverse Task Perspectives" https://arxiv.org/abs/2502.02487
Official implementation of "A Backpack Full of Skills: Egocentric Video Understanding with Diverse Task Perspectives", accepted at CVPR 2024.
[NeurIPS 2025 Spotlight] "SANSA: Unleashing the Hidden Semantics in SAM2 for Few-Shot Segmentation."
Code for EarthMatch (CVPR 2024 IMW), an iterative coregistration pipeline to localize astronaut photos of Earth
[CVPR 2024] PEM: Prototype-based Efficient MaskFormer for Image Segmentation
[CVPR 2025 Highlight] Official code and models for Encoder-only Mask Transformer (EoMT).
🚀 Lightning-fast computer vision models. Fine-tune SOTA models with just a few lines of code. Ready for cloud ☁️ and edge 📱 deployment.
Wrapper of 50+ image matching models with a unified interface
[CVPR 2025 Highlight] Official repository for the paper: "SAMWISE: Infusing Wisdom in SAM2 for Text-Driven Video Segmentation"
Code for the paper "AMEGO: Active Memory from long EGOcentric videos" published at ECCV 2024
Official repository of the CVPR24 paper "The Unreasonable Effectiveness of Pre-Trained Features for Camera Pose Refinement"
A bunch of scripts helping with daily tasks in 3D vision research.
Official code for ICCV 2023 paper "EigenPlaces: Training Viewpoint Robust Models for Visual Place Recognition"
Prepare for success in Machine Learning (ML) interviews after completing your Ph.D. Dive into Python-based resources and code examples for mastering ML interview challenges. From algorithms to CV-b…
DROPO: Sim-to-Real Transfer with Offline Domain Randomization