- Eindhoven University of Technology
- https://daandegeus.com
- @dcdegeus
Stars
[CVPR 2026] Official repository for the paper: "INSID3: Training-Free In-Context Segmentation with DINOv3"
[CVPR 2026 Workshop] Official code and models for Plain Mask Transformer (PMT).
[CVPR 2026] Official code and models for Video Encoder-only Mask Transformer (VidEoMT).
Fast and memory-efficient exact attention
Sa2VA-i is an improved version of the popular Sa2VA model
VisualOverload (CVPR 2026) is a VQA benchmark for image understanding in dense, high-resolution scenes.
Code for the paper 'Learning from Videos for 3D World: Enhancing MLLMs with 3D Vision Geometry Priors'
Official code of Franca: Nested Matryoshka Clustering for Scalable Visual Representation Learning
[ICCV 2025] DONUT: A Decoder-Only Model for Trajectory Prediction
[CVPR 2025 Best Paper Award] VGGT: Visual Geometry Grounded Transformer
🚀 Lightning-fast computer vision models. Fine-tune SOTA models with just a few lines of code. Ready for cloud ☁️ and edge 📱 deployment.
Code for on-the-fly creation of pseudo video datasets as described in "How Important are Videos for Training Video LLMs?"
[CVPR 2025 Highlight] Official code and models for Encoder-only Mask Transformer (EoMT).
[ICRA 2025] Interactive4D: Interactive 4D LiDAR Segmentation
[ECCV 2024] Improving 2D Feature Representations by 3D-Aware Fine-Tuning
[WACV'25 Oral] Fine-Tuning Image-Conditional Diffusion Models is Easier than You Think
[ICLR 2023 Spotlight] Vision Transformer Adapter for Dense Predictions
Official repository for "AM-RADIO: Reduce All Domains Into One"
The repository provides code for running inference with the Meta Segment Anything Model 2 (SAM 2), links for downloading the trained model checkpoints, and example notebooks that show how to use th…
[ECCV24] VISA: Reasoning Video Object Segmentation via Large Language Model
[CVPR 2025] Official PyTorch Implementation of MambaVision: A Hybrid Mamba-Transformer Vision Backbone
Official implementation of the CVPR 2024 paper ViT-CoMer: Vision Transformer with Convolutional Multi-scale Feature Interaction for Dense Predictions.
[CVPR 2024] Task-aligned Part-aware Panoptic Segmentation through Joint Object-Part Representations
[CVPR 2024] PEM: Prototype-based Efficient MaskFormer for Image Segmentation