Qijian Tian fangzhou2000

15 followers · 12 following

Stars

gca-spatial-reasoning / gca

Official Implementation of "Geometrically-Constrained Agent for Spatial Reasoning"

Python 18 Updated Dec 18, 2025

realsee-developer / RealSee3D

RealSee3D: A multi-view RGB-D dataset combining real-world captures and procedurally generated scenes, with extensible annotations for diverse 3D vision research.

Python 204 8 Updated Dec 18, 2025

UMass-Embodied-AGI / MindJourney

[NeurIPS 2025] Source codes for the paper "MindJourney: Test-Time Scaling with World Models for Spatial Reasoning"

Python 116 4 Updated Nov 4, 2025

ByteDance-Seed / Bagel

Open-source unified multimodal model

Python 5,500 481 Updated Oct 27, 2025

wufeim / SpatialReasonerDataGen

Synthetic VQA data generation code for SpatialReasoner.

Python 12 1 Updated Nov 25, 2025

johnson111788 / SpatialReasoner

Training recipe for SpatialReasoner

Python 26 1 Updated Sep 21, 2025

Visual-Agent / DeepEyesV2

Python 449 46 Updated Dec 22, 2025

VTool-R1 / VTool-R1

Code for the paper "VTool-R1: VLMs Learn to Think with Images via Reinforcement Learning on Multimodal Tool Use"

Python 144 3 Updated Aug 10, 2025

zhangquanchen / 3DThinker

Think with 3D: Geometric Imagination Grounded Spatial Reasoning from Limited Views

Python 123 4 Updated Dec 9, 2025

allenai / molmo

Code for the Molmo Vision-Language Model

Python 841 80 Updated Dec 12, 2024

WangYipu2002 / CrossPoint

Official implementation of “Towards Cross-View Point Correspondence in Vision-Language Models”.

Python 10 Updated Dec 8, 2025

Zhoues / RoboRefer

[NeurIPS 2025] Official implementation of "RoboRefer: Towards Spatial Referring with Reasoning in Vision-Language Models for Robotics"

Python 214 6 Updated Dec 16, 2025

LLaVA-VL / LLaVA-NeXT

Python 4,463 434 Updated Sep 14, 2025

facebookresearch / DepthLM_Official

Official implementation of DepthLM

Python 276 12 Updated Oct 7, 2025

zhaochen0110 / Awesome_Think_With_Images

Resources and paper list for "Thinking with Images for LVLMs". This repository accompanies our survey on how LVLMs can leverage visual information for complex reasoning, planning, and generation.

1,220 39 Updated Dec 23, 2025