Code for the paper "Fusing Satellite Imagery and Planimetric Maps for Cross-View Localization"

2 Updated Feb 27, 2026

VecLang source code

9 Updated Jun 10, 2026

DepthMaster: Unified Monocular Depth Estimation for Perspective and Panoramic Images

Python 16 Updated Jun 13, 2026

ARM: An AutoRegressive Large Multimodal Model with Discrete Representations

44 Updated Jun 10, 2026

Learning Geometric Representations from Videos for Spatial Intelligent Multimodal Large Language Models

Python 7 Updated Jun 10, 2026

🔥 Official code repository for "Unlocking Dense Metric Depth Estimation in VLMs"

Python 128 6 Updated May 21, 2026

A high performance 3DGS renderer

TypeScript 704 74 Updated Jun 13, 2026

[CVPR 2026 Oral] VGGT Omega

Python 2,967 122 Updated May 18, 2026

A feed-forward 3D foundation model for reconstructing scenes from streaming data

Python 7,210 712 Updated Jun 2, 2026
Python 35 1 Updated Jun 2, 2026

[CVPR 2026 (Highlight)] Scal3R: Scalable Test-Time Training for Large-Scale 3D Reconstruction

Python 483 37 Updated May 11, 2026

This is the repo for the paper "OccSim: Multi-kilometer Simulation with Long-horizon Occupancy World Models"

13 Updated May 14, 2026

(CVPR 2026) Sparsity-Aware Voxel Attention and Foreground Modulation for 3D Semantic Scene Completion

15 Updated Mar 8, 2026

A simple video streaming baseline that outperforms SOTAs.

Python 142 8 Updated May 1, 2026

[NeurIPS 2025] 3DRS: MLLMs Need 3D-Aware Representation Supervision for Scene Understanding

Python 157 Updated Dec 9, 2025

[CVPR 2026 Findings] Speed3R: Sparse Feed-forward 3D Reconstruction Models

Python 72 3 Updated Apr 7, 2026
Python 30 2 Updated Mar 20, 2026

[CVPR 2026] ZipMap: Linear-Time Stateful 3D Reconstruction via Test-Time Training

Python 449 11 Updated Jun 11, 2026

Qwen3-VL is the multimodal large language model series developed by Qwen team, Alibaba Cloud.

Jupyter Notebook 19,376 1,787 Updated Jan 30, 2026

[CVPR2026] SGDrive: Scene-to-Goal Hierarchical World Cognition for Autonomous Driving

Python 65 3 Updated Apr 15, 2026

[ECCV 2024] This is the official implementation of HRMapNet, maintaining and utilizing a low-cost global rasterized map to enhance online vectorized map perception.

Python 110 16 Updated Sep 25, 2024

🤗 Diffusers: State-of-the-art diffusion models for image, video, and audio generation in PyTorch.

Python 33,849 7,056 Updated Jun 14, 2026

[CVPR 26] Release repo of our work "Co-Me: Confidence-Guided Token Merging for Visual Geometric Transformers"

Python 188 8 Updated May 18, 2026

Reference PyTorch implementation and models for DINOv3

Jupyter Notebook 10,671 876 Updated Jun 12, 2026

[ICLR2026] WeTok: Powerful Discrete Tokenization for High-Fidelity Visual Reconstruction

Python 69 3 Updated Sep 3, 2025

Qwen-Image is a powerful image generation foundation model capable of complex text rendering and precise image editing.

Python 8,000 503 Updated Feb 10, 2026

[ICLR 2026] π^3: Permutation-Equivariant Visual Geometry Learning

Python 2,011 156 Updated May 18, 2026

Tooling for the Common Objects In 3D dataset.

Python 1,162 87 Updated Aug 14, 2024

The true story of the heartache and darkness behind the development of the Noah Pangu large model.

11,486 1,314 Updated Jul 9, 2025

MMaDA - Open-Sourced Multimodal Large Diffusion Language Models (dLLMs with block diffusion, mixed-CoT, unified RL)

Python 1,650 87 Updated Feb 14, 2026