SJTU
Lists (24)
3DOcc
CUDA
Datasets
Depth
Detection
Diffusion
Driving
Generative
Jina
LLM
Nerf
NTP
paperlist
PointCloud
PyTorch
RL
Robo
ROS
Scene
Scientific
Segmentation
tools
Transformer
WM
Starred repositories
Learning to Drive via Real-World Simulation at Scale
Official implementation of Don’t Blind Your VLA: Aligning Visual Representations for OOD Generalization. https://blind-vla-paper.github.io
WorldPlay: Interactive World Modeling with Real-Time Latency and Geometric Consistency
Native and Compact Structured Latents for 3D Generation
An implementation of chunked, compressed, N-dimensional arrays for Python.
Official implementation of "RoboTracer: Mastering Spatial Trace with Reasoning in Vision-Language Models for Robotics"
Code repository of "GenieDrive: Towards Physics-Aware Driving World Model with 4D Occupancy Guided Video Generation"
Official code of Motus: A Unified Latent Action World Model
VLA-0: Building State-of-the-Art VLAs with Zero Modification
A unified inference and post-training framework for accelerated video generation.
Any4D: Unified Feed-Forward Metric 4D Reconstruction
Visionary: The World Model Carrier Built on WebGPU-Powered Gaussian Splatting Platform
The official repository of "Astra: General Interactive World Model with Autoregressive Denoising"
[NeurIPS 2025] AutoSeg3D, online real-time 3D segmentation as instance tracking with long-short term query memory for embodied perception
This repo contains the python code as well as the webpage html files for the Lang3D-XL project.
"Paper2Slides: From Paper to Presentation in One Click"
verl: Volcano Engine Reinforcement Learning for LLMs
Official implementation of “4D LangVGGT: 4D Language-Visual Geometry Grounded Transformer”
Light-X: Generative 4D Video Rendering with Camera and Illumination Control
DGGT: Feedforward 4D Reconstruction of Dynamic Driving Scenes using Unposed Images
VLA-Adapter: An Effective Paradigm for Tiny-Scale Vision-Language-Action Model
Reward Forcing: Efficient Streaming Video Generation with Rewarded Distribution Matching Distillation
Official implementation of "C3G: Learning Compact 3D Representations with 2K Gaussians"
On-device Image Generation for Apple Silicon
Official implementation of "CAMEO: Correspondence-Attention Alignment for Multi-View Diffusion Models"