- New York University
- https://yyyybq.github.io/BaiqiaoYIN.github.io/
Stars
A collection of papers on semantic correspondence, organized by year.
IGGT: Instance-Grounded Geometry Transformer for Semantic 3D Reconstruction
[Awesome-Spatial-VLMs] This repository is the official, community-maintained resource for the survey paper: Spatial Intelligence in Vision-Language Models: A Comprehensive Survey.
Lightning-Fast RL for LLM Reasoning and Agents. Made Simple & Flexible.
Official implementation of paper "Unified World Models: Memory-Augmented Planning and Foresight for Visual Navigation"
Official repo of paper "SRUM: Fine-Grained Self-Rewarding for Unified Multimodal Models". A post-training framework that creates a cost-effective, self-iterative optimization loop.
https://huggingface.co/datasets/multimodal-reasoning-lab/Zebra-CoT
Official Repository for “CoSpace: Benchmarking Continuous Space Perception Ability for Vision-Language Models” [CVPR2025]
InteriorGS: 3D Gaussian Splatting Dataset of Semantically Labeled Indoor Scenes
RAGEN leverages reinforcement learning to train LLM reasoning agents in interactive, stochastic environments.
Training VLM agents with multi-turn reinforcement learning
The first collection of academic iKUN papers in the world
GitHub repository for "Why Is Spatial Reasoning Hard for VLMs? An Attention Mechanism Perspective on Focus Areas" (ICML 2025)
Official code for Paper "Mantis: Multi-Image Instruction Tuning" [TMLR 2024]
A paper list for spatial reasoning
TheaterGen: Character Management with LLM for Consistent Multi-turn Image Generation
[ICME 2024] Implementation of the paper “HDBN: A Novel Hybrid Dual-branch Network for Robust Skeleton-based Action Recognition”.
(CVPR2024) RMT: Retentive Networks Meet Vision Transformer
[ICML 2024] Vision Mamba: Efficient Visual Representation Learning with Bidirectional State Space Model
Preventing Zero-Shot Transfer Degradation in Continual Learning of Vision-Language Models
Deep reinforcement learning (PPO) applied to FrozenLake-v1
List of papers on hallucination detection in LLMs.