Peking University
Lists (26)
3D/4D Generation
3DV
Accelerating
AD
Agent
AR
Awesome
AWS
Datasets
DiT based
Flow and tracking
Human Generation
Image Generation
LLM
Physical model
PixDiffusion
Prediction
RL
Spatial Reasoning
T2M
Token-prediction based gen
Tools
UniUndGen
Video Generation
VLA
VTON
Stars
🏛️ Three Departments and Six Ministries (三省六部制) · OpenClaw Multi-Agent Orchestration System: 9 specialized AI agents with real-time dashboard, model config, and full audit trails
Elevate your AI research writing, no more tedious polishing ✨
[Lumina Embodied AI Community] Embodied AI Technical Guide (Embodied-AI-Guide)
[CVPR 2026] VGGDrive: Empowering Vision-Language Models with Cross-View Geometric Grounding for Autonomous Driving
[ICLR 2026] The official implementation of "PTQ4ARVG: Post-Training Quantization for AutoRegressive Visual Generation Models"
NVIDIA FastGen: Fast Generation from Diffusion Models
Compositional Diffusion with Guided search for Long-Horizon Planning
StableWorld: Towards Stable and Consistent Long Interactive Video Generation
Official repository of paper "CoMoVi: Co-Generation of 3D Human Motions and Realistic Videos"
PhyGDPO: Physics-Aware Groupwise Direct Preference Optimization for Physically Consistent Text-to-Video Generation
Official code for StoryMem: Multi-shot Long Video Storytelling with Memory
StarVLA: A Lego-like Codebase for Vision-Language-Action Model Developing
Official Python inference and LoRA trainer package for the LTX-2 audio–video generative model.
Video generation Stage of InfiniCube, implemented in DiffSynth-Studio
[ICLR 2026] Official repo for paper "Video-As-Prompt: Unified Semantic Control for Video Generation"
[NeurIPS'23 Oral] Visual Instruction Tuning (LLaVA) built towards GPT-4V level capabilities and beyond.
[AAAI 2025] Generative Planning with 3D-vision Language Pre-training for End-to-End Autonomous Driving
[ECCV 2024] Embodied Understanding of Driving Scenarios
[CVPR 2026] Scaling Spatial Intelligence with Multimodal Foundation Models