Starred repositories
"Foundations of Computer Vision" Book
A feed-forward 3D foundation model for reconstructing scenes from streaming data
OKVIS2-X: Open Keyframe-based Visual-Inertial SLAM Configurable with Dense Depth or LiDAR, and GNSS
Code and implementations for the paper "AgentGym-RL: Training LLM Agents for Long-Horizon Decision Making through Multi-Turn Reinforcement Learning" by Zhiheng Xi et al.
An easy-to-use, scalable, and high-performance agentic RL framework based on Ray (PPO & DAPO & REINFORCE++ & VLM & TIS & vLLM & Ray & Async RL)
OpenClaw-RL: Train any agent simply by talking
τ-Bench: A Benchmark for Tool-Agent-User Interaction in Real-World Domains
A Curated List of Vision-Language-Action (VLA) and World Action Models (WAM) Research and Beyond
An open source platform for visual-inertial navigation research.
[CVPR'2026]: MoRe: Motion-aware Feed-forward 4D Reconstruction Transformer
Curated academic CV templates and guidelines for PhD students, researchers, and faculty job applicants.
A curated list of papers on feed-forward 3D reconstruction and novel view synthesis.
🌱 A little course on Reinforcement Learning Environments for evaluating and training Language Models
[CVPR 2026 (Highlight)] Scal3R: Scalable Test-Time Training for Large-Scale 3D Reconstruction
[NeurIPS 2025] Instant4D: 4D Gaussian Splatting in Minutes
Fine-tune Gemma 4 and 3n with audio, images and text on Apple Silicon, using PyTorch and Metal Performance Shaders.
Official code for "SKILL0: In-Context Agentic Reinforcement Learning for Skill Internalization"
verl: Volcano Engine Reinforcement Learning for LLMs
MapEx: Indoor Structure Exploration with Probabilistic Information Gain from Global Map Predictions
[CVPR 2026] Gen3R: 3D Scene Generation Meets Feed-Forward Reconstruction
[NeurIPS 2025 (Spotlight)] The implementation for the paper "4DGT: Learning a 4D Gaussian Transformer Using Real-World Monocular Videos"
Official code for "LagerNVS: Latent Geometry for Fully Neural Real-time Novel View Synthesis" (CVPR 2026)
A gallery that showcases on-device ML/GenAI use cases and allows people to try and use models locally.
Official codebase for Fast-WAM: Do World Action Models Need Test-time Future Imagination?
Claw-R1: Empowering OpenClaw with Advanced Agentic RL.