- Zhejiang University
- Shanghai
Stars
Window Detection in Facade Using Heatmaps Fusion
🚀 A simple way to launch, train, and use PyTorch models on almost any device and distributed configuration, with automatic mixed precision (including fp8) and easy-to-configure FSDP and DeepSpeed support
SimpleVLA-RL: Scaling VLA Training via Reinforcement Learning
[NeurIPS 2024] Depth Anything V2. A More Capable Foundation Model for Monocular Depth Estimation
RoboBrain 2.0: Advanced version of RoboBrain. See Better. Think Harder. Do Smarter. 🎉🎉🎉
A fork to add multimodal model training to open-r1
Solve Visual Understanding with Reinforced VLMs
Official repository of 'Visual-RFT: Visual Reinforcement Fine-Tuning' & 'Visual-ARFT: Visual Agentic Reinforcement Fine-Tuning'
Fully open reproduction of DeepSeek-R1
[IROS 2025 Award Finalist] The Large-scale Manipulation Platform for Scalable and Intelligent Embodied Systems
[Lumina Embodied AI] Embodied AI Technical Guide (Embodied-AI-Guide)
Learn how to design large-scale systems. Prep for the system design interview. Includes Anki flashcards.
A generative world for general-purpose robotics & embodied AI learning.
Reference implementation for DPO (Direct Preference Optimization)
LAVIS - A One-stop Library for Language-Vision Intelligence
AutoGPT is the vision of accessible AI for everyone, to use and to build on. Our mission is to provide the tools, so that you can focus on what matters.
The repository provides code for running inference with the SegmentAnything Model (SAM), links for downloading the trained model checkpoints, and example notebooks that show how to use the model.
HOTA (and other) evaluation metrics for Multi-Object Tracking (MOT).
📊 Benchmark multiple object trackers (MOT) in Python
[ECCV 2022] This is the official implementation of BEVFormer, a camera-only framework for autonomous driving perception, e.g., 3D object detection and semantic map segmentation.
Thesis Assistant: a PDF reader that translates English text and displays the Chinese result in a side panel
World's first general-purpose 3D object detection codebase.
Vim-fork focused on extensibility and usability
A protocol for real-time transfer and visualization of autonomy data