Starred repositories
[R]einforcement [L]earning from [M]odel-rewarded [T]hinking - code for the paper "Language Models That Think, Chat Better"
[ICCV 2025 Oral] DPoser-X: Diffusion Model as Robust 3D Whole-body Human Pose Prior
Building General-Purpose Robots Based on Embodied Foundation Model
Code for exploring surface electromyography (sEMG) data and training models associated with Reality Labs' paper
Rex-Thinker: Grounded Object Referring via Chain-of-Thought Reasoning
[ICCV2025] Referring to any person or object given a natural language description. Code base for RexSeek and HumanRef Benchmark
Code repository for the CVPR 2025 paper "From Sparse Signal to Smooth Motion: Real-Time Motion Generation with Rolling Prediction Models" and GORP dataset
[NeurIPS 2025] PartCrafter: Structured 3D Mesh Generation via Compositional Latent Diffusion Transformers
A PyTorch implementation of a method based on "Lightweight Multi-View 3D Pose Estimation through Camera-Disentangled Representation", applied to human pose estimation from stereo images.
Official implementation of "E3D-Bench: A Benchmark for End-to-End 3D Geometric Foundation Models"
Sequence-to-sequence network implementation in PyTorch
[CVPR 2025 Highlight] Official implementation of the solvers and estimators proposed in the paper "Relative Pose Estimation through Affine Corrections of Monocular Depth Priors"
This package contains the original 2012 AlexNet code.
[Lumina Embodied AI] Embodied AI Technical Guide (Embodied-AI-Guide)
Official repository of 'Visual-RFT: Visual Reinforcement Fine-Tuning' & 'Visual-ARFT: Visual Agentic Reinforcement Fine-Tuning'
Extend OpenRLHF to support LMM RL training for reproduction of DeepSeek-R1 on multimodal tasks.
An Easy-to-use, Scalable and High-performance RLHF Framework based on Ray (PPO & GRPO & REINFORCE++ & vLLM & Ray & Dynamic Sampling & Async Agentic RL)
[CVPR 2025] EgoLife: Towards Egocentric Life Assistant
A Python package that provides evaluation and visualization tools for the HO-Cap dataset
Official Code for "MITracker: Multi-View Integration for Visual Object Tracking"
[ICLR 2024] M/EEG-based image decoding with contrastive learning. (i) Proposes a contrastive learning framework to align images and EEG. (ii) Resolves brain activity for biological plausibility.
Solve Visual Understanding with Reinforced VLMs
MoBA: Mixture of Block Attention for Long-Context LLMs
[ICLR 2025] Official implementation of "DiffSplat: Repurposing Image Diffusion Models for Scalable 3D Gaussian Splat Generation".
🚀 Efficient implementations of state-of-the-art linear attention models
Meta Lingua: a lean, efficient, and easy-to-hack codebase to research LLMs.
Official Code for ECCV 2024 paper "EgoPoser: Robust Real-Time Egocentric Pose Estimation from Sparse and Intermittent Observations Everywhere"