Stars
A benchmark for cross-domain few-shot object detection (ECCV24 paper: Cross-Domain Few-Shot Object Detection via Enhanced Open-Set Object Detector)
The final version of the code for the NTIRE 2025 CDFSOD Challenge
[CVPRW'25] Official Code for “Enhance Then Search: An Augmentation-Search Strategy with Foundation Models for Cross-Domain Few-Shot Object Detection”
NTIRE 2026 Challenge on 2nd Cross-Domain Few-Shot Object Detection @ CVPR 2026
NTIRE 2025 Challenge on 1st Cross-Domain Few-Shot Object Detection @ CVPR 2025
[NeurIPS'24] A Simple Image Segmentation Framework via In-Context Examples
Official code for "No time to train! Training-Free Reference-Based Instance Segmentation"
[OV-DEIM] Real-time DETR-Style Open-Vocabulary Object Detection with GridSynthetic Augmentation
RF-DETR is a real-time object detection and segmentation model architecture developed by Roboflow, SOTA on COCO, designed for fine-tuning. [ICLR 2026]
[ICLR 2026] FSOD-VFM: Few-Shot Object Detection with Vision Foundation Models and Graph Diffusion
[CVPR 2026 Oral] "INSID3: Training-Free In-Context Segmentation with DINOv3"
keejkrej / sam3
Forked from facebookresearch/sam3. The repository provides code for running inference and finetuning with the Meta Segment Anything Model 3 (SAM 3), links for downloading the trained model checkpoints, and example notebooks that sho…
Muggled SAM: Segmentation without the magic
YOLOE-26: Open-Vocabulary Instance Segmentation
[NeurIPS 2025 Spotlight] A Generalist Diffusion Model for Vision Perception
[CVPR 2024 - Oral, Best Paper Award Candidate] Marigold: Repurposing Diffusion-Based Image Generators for Monocular Depth Estimation
Official implementation for the paper "Deep ViT Features as Dense Visual Descriptors".
Template rendering for the paper "ZS6D: Zero-shot 6D Object Pose Estimation using Vision Transformers"
An official implementation of Geo6D: Geometric-Constraints-Guided Direct Object 6D Pose Estimation Network
A distilled Segment Anything (SAM) model capable of running real-time with NVIDIA TensorRT
A project that optimizes OWL-ViT for real-time inference with NVIDIA TensorRT.
Introducing OWLv2: Google's Breakthrough in Zero-Shot Object Detection
Template-based Novel Object Detection and Segmentation
OS2D: One-Stage One-Shot Object Detection by Matching Anchor Features
Personalize Segment Anything Model (SAM) with 1 shot in 10 seconds