SJTU
Lists (24)
3DOcc
CUDA
Datasets
Depth
Detection
Diffusion
Driving
Generative
Jina
LLM
Nerf
NTP
paperlist
PointCloud
PyTorch
RL
Robo
ROS
Scene
Scientific
Segmentation
tools
Transformer
WM
Starred repositories
Skill package for ML/CV/NLP paper writing, curated and adapted from Prof. Peng Sida's open notes for Codex, Claude Code, and Gemini.
Official implementation of CVPR26 paper "Lifting Unlabeled Internet-level Data for 3D Scene Understanding"
Official code, models, and data for Vista4D: Video Reshooting with 4D Point Clouds (CVPR 2026 Highlight)
Fixes missing reasoning_content for DeepSeek V4
GR00T-VisualSim2Real: Open-source sim-to-real framework for humanoid visual loco-manipulation. Train in simulation, deploy zero-shot on real robots with RGB + proprioception for tasks like pick-and…
[ICLR 2026] Disentangled Robot Learning via Separate Forward and Inverse Dynamics Pretraining
Implementation of "Deep Implicit Templates for 3D Shape Representation"
[NeurIPS 2025] UniRelight: Learning Joint Decomposition and Synthesis for Video Relighting
Official Implementation of MultiWorld: Scalable Multi-Agent Multi-View Video World Models
[CVPR'26] TokenGS: Decoupling 3D Gaussian Prediction from Pixels with Learnable Tokens
A Large-Scale Multimodal Car Dataset with Computational Fluid Dynamics Simulations and Deep Learning Benchmarks
[CVPR 2026 Oral] Pixel Diffusion Transformers for Image Generation
[NeurIPS'23 Oral] Visual Instruction Tuning (LLaVA) built towards GPT-4V level capabilities and beyond.
A feed-forward 3D foundation model for reconstructing scenes from streaming data
HY-World 2.0: A Multi-Modal World Model for Reconstructing, Generating, and Simulating 3D Worlds
Masked Depth Modeling for Spatial Perception
Video-Action Models for Generalizable Robot Control Beyond VLAs
Reinforcement Learning environments based on the 1993 game Doom
Open Overleaf/ShareLaTeX projects in VS Code, with full collaboration support.
Being-H0: Vision-Language-Action Pretraining from Large-Scale Human Videos
HappyHorse AI turns text or images into remarkable 1080p cinematic video. Every HappyHorse AI video uses advanced motion synthesis — multi-shot storytelling, seamless transitions, and realism. Free…
[SIGGRAPH 2024] Official PyTorch Implementation of "BrepGen: A B-rep Generative Diffusion Model with Structured Latent Geometry".