Stars
A paper list for spatial reasoning
Survey: https://arxiv.org/pdf/2507.20198
Official Implementation for "MyVLM: Personalizing VLMs for User-Specific Queries" (ECCV 2024)
Official implementation of the paper: "StreamVLN: Streaming Vision-and-Language Navigation via SlowFast Context Modeling"
Detect Anything via Next Point Prediction (Based on Qwen2.5-VL-3B)
ProtoMotions is a GPU-accelerated simulation and learning framework for training physically simulated digital humans and humanoid robots.
[DEIMv2] Real Time Object Detection Meets DINOv3
[NeurIPS 2025 (Spotlight)] The implementation for the paper "4DGT: Learning a 4D Gaussian Transformer Using Real-World Monocular Videos"
Open Vision Agents by Stream. Build Vision Agents quickly with any model or video provider. Uses Stream's edge network for ultra-low latency.
Release repo for our SLAM Handbook
Code for "LiftFeat: 3D Geometry-Aware Local Feature Matching", ICRA 2025
[3DV 2026] ViSTA-SLAM: Visual SLAM with Symmetric Two-view Association
FastFlowNet: A Lightweight Network for Fast Optical Flow Estimation (ICRA 2021)
Fisheye-Calib-Adapter: An Easy Tool for Fisheye Camera Model Conversion
The official implementation of ICCV'25 paper "FlashDepth: Real-time Streaming Video Depth Estimation at 2K Resolution"
Panorama stitching based on asymmetric bidirectional optical flow
NumPy & PyTorch implementations of three image deformation algorithms using moving least squares. http://dl.acm.org/citation.cfm?doid=1179352.1141920
MapAnything: Universal Feed-Forward Metric 3D Reconstruction
SIFT detection and matching on (almost) any GPU with Vulkan
[ICCV 2025] SuperDec: 3D Scene Decomposition with Superquadric Primitives.
Barkour Robot: Agile Quadruped Robots by Google DeepMind
Official implementation of DA²: Depth Anything in Any Direction
Lightweight LiDAR-Inertial SLAM system for ROS 2. A minimal, dependency-free implementation for research and education.