View W-Ted's full-sized avatar


🔥 Official code repository for "Unlocking Dense Metric Depth Estimation in VLMs"

Python 128 6 Updated May 21, 2026

Penguin-VL: Exploring the Efficiency Limits of VLM with LLM-based Vision Encoders [Technical Report]

Jupyter Notebook 193 10 Updated Mar 30, 2026

Official code for CVPR 2026 paper: VGGT-Det: Mining VGGT Internal Priors for Sensor-Geometry-Free Multi-View Indoor 3D Object Detection

Python 131 4 Updated Apr 14, 2026

[CVPR 2025] Online Video Understanding: OVBench and VideoChat-Online

Python 95 6 Updated Oct 7, 2025

Official code for "Generalizable and Animatable 3D Full-Head Gaussian Avatar from a Single Image"

16 Updated Jan 22, 2026

Human-taught Computer-use Agent Designed for Real Windows and MacOS Desktops.

Python 310 37 Updated Jan 20, 2026

SPAgent, a foundation agent for understanding, reasoning over, and operating within the physical and spatial world.

Python 192 30 Updated May 20, 2026

The Open-Source Multimodal AI Agent Stack: Connecting Cutting-Edge AI Models and Agent Infra

TypeScript 36,413 3,671 Updated May 18, 2026

Youtu-Tip: Tap for Intelligence, Keep on Device.

Python 591 66 Updated Feb 27, 2026

RePlan: Reasoning-Guided Region Planning for Complex Instruction-Based Image Editing

Python 65 3 Updated Mar 19, 2026

DART-GUI: Efficient Multi-turn RL for GUI Agents via Decoupled Training and Adaptive Data Curation

Python 93 6 Updated Feb 26, 2026

[ECCV 2024] GGRt: Towards Pose-free Generalizable 3D Gaussian Splatting in Real-time

Python 105 4 Updated Apr 3, 2025

[ICCV 2025] CityGS-X: A Scalable Architecture for Efficient and Geometrically Accurate Large-Scale Scene Reconstruction

Python 167 15 Updated May 15, 2026

EvoCUA: Evolving Computer Use Agent

Python 325 24 Updated Mar 31, 2026

Official implementation of GDPO: Group reward-Decoupled Normalization Policy Optimization for Multi-reward RL Optimization

Python 472 32 Updated May 20, 2026

[NeurIPS 2025] LabelAny3D: Label Any Object 3D in the Wild

Python 130 9 Updated Jan 6, 2026

[EMNLP 2025] Repository for paper "DiMo-GUI: Advancing Test-time Scaling in GUI Grounding via Modality-Aware Visual Reasoning"

Python 30 3 Updated Jul 2, 2025

Official implementation of "RoboTracer: Mastering Spatial Trace with Reasoning in Vision-Language Models for Robotics"

Python 74 2 Updated Jan 19, 2026

🐫 CAMEL: The first and the best multi-agent framework. Finding the Scaling Law of Agents. https://www.camel-ai.org

Python 17,192 1,951 Updated Jun 14, 2026

[NeurIPS 2025] Official implementation of "RoboRefer: Towards Spatial Referring with Reasoning in Vision-Language Models for Robotics"

Python 263 11 Updated Dec 16, 2025

Accelerate VGGT with efficient descriptor-based global attention

Python 88 2 Updated Jun 3, 2026

[ICLR 2026] Mono4DGS-HDR: High Dynamic Range 4D Gaussian Splatting from Alternating-exposure Monocular Videos

28 1 Updated May 29, 2026

A part-based 3D generation framework & the largest and most comprehensively annotated 3D part dataset.

Jupyter Notebook 141 9 Updated Dec 15, 2025
Jupyter Notebook 223 14 Updated Jul 5, 2024

[CVPR 2024] This is the official implementation of our CVPR 2024 paper "Building a Strong Pre-Training Baseline for Universal 3D Large-Scale Perception" https://arxiv.org/abs/2405.07201

Python 17 Updated Jun 11, 2024

🍳 [CVPR'25] PanSplat: 4K Panorama Synthesis with Feed-Forward Gaussian Splatting

Python 227 15 Updated Apr 19, 2026

[CVPR 2026] An accurate and dense-annotated synthetic dataset for training SOTA detectors / segmentors / Grounding-VLMs.

Python 47 Updated Feb 23, 2026

Universal Monocular Metric Depth Estimation

Python 1,212 113 Updated May 18, 2025

[ICCV'25] 3D-MOOD: Lifting 2D to 3D for Monocular Open-Set Object Detection

Python 122 8 Updated Oct 14, 2025