Zewen Xu BITcats

✨

Focusing

Ph.D. at the Institute of Automation, Chinese Academy of Sciences

41 followers · 118 following

Institute of Automation Chinese Academy of Sciences
BEIJING, CHINA
https://bitcats.github.io/

Lists (6)

Embodied AI, VLN, VLA, VLM

48 repositories

Large 3D Model

LO & LIO

19 repositories

ToolBox

some tools for SLAM

11 repositories

VLIO/SLAM

VO & VIO

29 repositories

Stars

467 results for source starred repositories

KlabCMU / IDOL

Official implementation of "IDOL: Inertial Deep Orientation-estimation & Localization." AAAI 2021.

60 13 Updated Feb 2, 2021

RoyinaJayanth / EqNIO

This is the source code for our ICLR 2025 work EqNIO

Python 23 2 Updated Apr 25, 2025

Tencent-Hunyuan / HY-WorldPlay

HY-World 1.5: A Systematic Framework for Interactive World Modeling with Real-Time Latency and Geometric Consistency

Python 1,102 88 Updated Jan 13, 2026

CognitiveAISystems / 3DGraphLLM

[ICCV 2025] 3DGraphLLM is a model that uses a 3D scene graph and an LLM to perform 3D vision-language tasks.

Python 103 7 Updated Dec 10, 2025

facebookresearch / jepa

PyTorch code and models for V-JEPA self-supervised learning from video.

Python 3,503 354 Updated Feb 27, 2025

nvidia-cosmos / cosmos-transfer2.5

Cosmos-Transfer2.5, built on top of Cosmos-Predict2.5, produces high-quality world simulations conditioned on multiple spatial control inputs.

Python 442 63 Updated Feb 3, 2026

superxslam / SuperOdom

A highly robust and accurate LiDAR-only, LiDAR-inertial odometry

C++ 771 97 Updated Jan 27, 2026

ywh187 / XStreamVGGT

🚀 Official code for “XStreamVGGT: Extremely Memory-Efficient Streaming Vision Geometry Grounded Transformer with KV Cache Compression”, published at SID’s Display Week 2026.

Python 29 1 Updated Jan 27, 2026

snarktank / ralph

Ralph is an autonomous AI agent loop that runs repeatedly until all PRD items are complete.

TypeScript 9,497 1,107 Updated Feb 2, 2026

InternRobotics / AnySplat

[SIGGRAPH Asia 2025 (ACM TOG)] AnySplat: Feed-forward 3D Gaussian Splatting from Unconstrained Views

Python 717 38 Updated Dec 22, 2025

LTH14 / JiT

PyTorch implementation of JiT https://arxiv.org/abs/2511.13720

Python 2,083 137 Updated Dec 8, 2025

yifan123 / flow_grpo

[NeurIPS 2025] An official implementation of Flow-GRPO: Training Flow Matching Models via Online RL

Python 1,963 123 Updated Nov 4, 2025

AMAP-EAI / SocialNav

Official implementation for "SocialNav: Training Human-Inspired Foundation Model for Socially-Aware Embodied Navigation"

54 Updated Dec 1, 2025

intelligolabs / R2RIE-CE

[IROS 24] Official repository of "Mind the Error! Detection and Localization of Instruction Errors in Vision-and-Language Navigation". We present the first dataset - R2R-IE-CE - to benchmark instru…

Python 18 1 Updated Jan 8, 2025

InternRobotics / G2VLM

G2VLM: Geometry Grounded Vision Language Model with Unified 3D Reconstruction and Spatial Reasoning

Python 259 8 Updated Jan 15, 2026

HKUST-SAIL / sail-recon

[3DV 2026 Oral] Official Repo of "SAIL-Recon: Large SfM by Augmenting Scene Regression with Localization"

Python 285 17 Updated Dec 30, 2025

ByteDance-Seed / Depth-Anything-3

Depth Anything 3

Python 4,279 396 Updated Dec 12, 2025

MrZihan / Dynam3D

Official implementation of "Dynam3D: Dynamic Layered 3D Tokens Empower VLM for Vision-and-Language Navigation" (NeurIPS'25 Oral)

Python 75 5 Updated Dec 22, 2025

YvanYin / Metric3D

The repo for "Metric3D: Towards Zero-shot Metric 3D Prediction from A Single Image" and "Metric3Dv2: A Versatile Monocular Geometric Foundation Model..."

Python 2,097 158 Updated Mar 13, 2025

lifuguan / IGGT_official

IGGT: Instance-Grounded Geometry Transformer for Semantic 3D Reconstruction

Python 334 15 Updated Dec 1, 2025

ant-research / PlanarSplatting

[CVPR 2025 Highlight] PlanarSplatting: Accurate Planar Surface Reconstruction in 3 Minutes

Python 69 3 Updated Sep 22, 2025

lck666666 / plana3r

[NeurIPS 2025] Official project page for the paper "PLANA3R: Zero-shot Metric Planar 3D Reconstruction via Feed-Forward Planar Splatting"

Python 66 3 Updated Feb 1, 2026

NVlabs / planercnn

PlaneRCNN detects and reconstructs piece-wise planar surfaces from a single RGB image

Python 604 129 Updated Oct 9, 2022

AvaLovelace1 / BrickGPT

Official repository for BrickGPT, the first approach for generating physically stable toy brick models from text prompts.

Python 1,594 98 Updated Jan 26, 2026

facebookresearch / co-tracker

CoTracker is a model for tracking any point (pixel) on a video.

Jupyter Notebook 4,811 345 Updated Jan 21, 2025

QwenLM / Qwen3-VL

Qwen3-VL is the multimodal large language model series developed by Qwen team, Alibaba Cloud.

Jupyter Notebook 18,163 1,577 Updated Jan 30, 2026

facebookresearch / sonata

[CVPR'25 Highlight] Official repository of Sonata: Self-Supervised Learning of Reliable Point Representations

Python 675 36 Updated Jun 4, 2025

JuanBindez / pytubefix

Python3 library for downloading YouTube Videos.

Python 1,453 180 Updated Dec 7, 2025

gangweix / pixel-perfect-depth

[NeurIPS 2025] Pixel-Perfect Depth

Python 760 31 Updated Jan 13, 2026

facebookresearch / map-anything

MapAnything: Universal Feed-Forward Metric 3D Reconstruction

Python 2,869 199 Updated Jan 18, 2026