- National University of Singapore
- Singapore
- https://ldkong.com
- in/ldkong
- @ldkong1205
Stars
🌐 Event Camera Vision in the Era of Large Models: A Survey
🌐 Forging Spatial Intelligence: A Roadmap of Multi-Modal Data Pre-Training for Autonomous Systems
🌐 Vision-Language-Action Models for Autonomous Driving: Past, Present, and Future
🌐 WorldLens: Full-Spectrum Evaluations of Driving World Models in Real World
U4D: Uncertainty-Aware 4D World Modeling from LiDAR Sequences
"Paper2Slides: From Paper to Presentation in One Click"
[NeurIPS 2025 DB Track] 3EED: Ground Everything Everywhere in 3D
[NeurIPS 2025] SPIRAL: Semantic-Aware Progressive LiDAR Scene Generation and Understanding
Official repo for the paper "EditMGT: Unleashing the Potential of Masked Generative Transformer in Image Editing"
[NeurIPS 2025] Deep Memory Backtracking for Long Video Understanding
A list of works on evaluation of visual generation models, including evaluation metrics, models, and systems
[AAAI 2026 Oral] LiDARCrafter: Dynamic 4D World Modeling from LiDAR Sequences
Official Competition Toolkit for The 2025 RoboSense Challenge
Official implementation of "Open-o3 Video: Grounded Video Reasoning with Explicit Spatio-Temporal Evidence"
[ICCV 2025] Perspective-Invariant 3D Object Detection
[ICCV 2025 Workshop Outstanding Paper Award] VChain: Chain-of-Visual-Thought for Reasoning in Video Generation
[ICLR 2026] RewardMap: Tackling Sparse Rewards in Fine-grained Visual Reasoning via Multi-Stage Reinforcement Learning
Qwen3-VL is the multimodal large language model series developed by the Qwen team, Alibaba Cloud.
[SIGGRAPH Asia 2025] WorldExplorer: Towards Generating Fully Navigable 3D Scenes
🌐 3D and 4D World Modeling: A Survey
🌐 A Roadmap for 3D Scene Understanding in the Wild
Exploring the Roles of Large Language Models in Reshaping Transportation Systems: A Survey, Framework, and Roadmap
[Accepted by TPAMI] Human Motion Video Generation: A Survey (https://ieeexplore.ieee.org/document/11106267)
[CVPR 25] Vid2Sim: Realistic and Interactive Simulation from Video for Urban Navigation
4DNeX: Feed-Forward 4D Generative Modeling Made Easy