467 results for source starred repositories

Official implementation of "IDOL: Inertial Deep Orientation-estimation & Localization." AAAI 2021.

60 13 Updated Feb 2, 2021

This is the source code for our ICLR 2025 work EqNIO

Python 23 2 Updated Apr 25, 2025

HY-World 1.5: A Systematic Framework for Interactive World Modeling with Real-Time Latency and Geometric Consistency

Python 1,102 88 Updated Jan 13, 2026

[ICCV 2025] 3DGraphLLM is a model that uses a 3D scene graph and an LLM to perform 3D vision-language tasks.

Python 103 7 Updated Dec 10, 2025

PyTorch code and models for V-JEPA self-supervised learning from video.

Python 3,503 354 Updated Feb 27, 2025

Cosmos-Transfer2.5, built on top of Cosmos-Predict2.5, produces high-quality world simulations conditioned on multiple spatial control inputs.

Python 442 63 Updated Feb 3, 2026

A highly robust and accurate LiDAR-only, LiDAR-inertial odometry

C++ 771 97 Updated Jan 27, 2026

🚀 Official code for “XStreamVGGT: Extremely Memory-Efficient Streaming Vision Geometry Grounded Transformer with KV Cache Compression”, published at SID’s Display Week 2026.

Python 29 1 Updated Jan 27, 2026

Ralph is an autonomous AI agent loop that runs repeatedly until all PRD items are complete.

TypeScript 9,497 1,107 Updated Feb 2, 2026

[SIGGRAPH Asia 2025 (ACM TOG)] AnySplat: Feed-forward 3D Gaussian Splatting from Unconstrained Views

Python 717 38 Updated Dec 22, 2025

PyTorch implementation of JiT https://arxiv.org/abs/2511.13720

Python 2,083 137 Updated Dec 8, 2025

[NeurIPS 2025] An official implementation of Flow-GRPO: Training Flow Matching Models via Online RL

Python 1,963 123 Updated Nov 4, 2025

Official implementation for "SocialNav: Training Human-Inspired Foundation Model for Socially-Aware Embodied Navigation"

54 Updated Dec 1, 2025

[IROS 24] Official repository of "Mind the Error! Detection and Localization of Instruction Errors in Vision-and-Language Navigation". We present the first dataset - R2R-IE-CE - to benchmark instru…

Python 18 1 Updated Jan 8, 2025

G2VLM: Geometry Grounded Vision Language Model with Unified 3D Reconstruction and Spatial Reasoning

Python 259 8 Updated Jan 15, 2026

[3DV 2026 Oral] Official Repo of "SAIL-Recon: Large SfM by Augmenting Scene Regression with Localization"

Python 285 17 Updated Dec 30, 2025

Depth Anything 3

Python 4,279 396 Updated Dec 12, 2025

Official implementation of "Dynam3D: Dynamic Layered 3D Tokens Empower VLM for Vision-and-Language Navigation" (NeurIPS'25 Oral)

Python 75 5 Updated Dec 22, 2025

The repo for "Metric3D: Towards Zero-shot Metric 3D Prediction from A Single Image" and "Metric3Dv2: A Versatile Monocular Geometric Foundation Model..."

Python 2,097 158 Updated Mar 13, 2025

IGGT: Instance-Grounded Geometry Transformer for Semantic 3D Reconstruction

Python 334 15 Updated Dec 1, 2025

[CVPR 2025 Highlight] PlanarSplatting: Accurate Planar Surface Reconstruction in 3 Minutes

Python 69 3 Updated Sep 22, 2025

[NeurIPS 2025] the official project page of a paper, "PLANA3R: Zero-shot Metric Planar 3D Reconstruction via Feed-Forward Planar Splatting"

Python 66 3 Updated Feb 1, 2026

PlaneRCNN detects and reconstructs piece-wise planar surfaces from a single RGB image

Python 604 129 Updated Oct 9, 2022

Official repository for BrickGPT, the first approach for generating physically stable toy brick models from text prompts.

Python 1,594 98 Updated Jan 26, 2026

CoTracker is a model for tracking any point (pixel) on a video.

Jupyter Notebook 4,811 345 Updated Jan 21, 2025

Qwen3-VL is the multimodal large language model series developed by Qwen team, Alibaba Cloud.

Jupyter Notebook 18,163 1,577 Updated Jan 30, 2026

[CVPR'25 Highlight] Official repository of Sonata: Self-Supervised Learning of Reliable Point Representations

Python 675 36 Updated Jun 4, 2025

Python3 library for downloading YouTube Videos.

Python 1,453 180 Updated Dec 7, 2025

[NeurIPS 2025] Pixel-Perfect Depth

Python 760 31 Updated Jan 13, 2026

MapAnything: Universal Feed-Forward Metric 3D Reconstruction

Python 2,869 199 Updated Jan 18, 2026