Nankai University
Tianjin, China
2311671@mail.nankai.edu.cn
https://ichubai.github.io/Mysite/
Stars
VP2 Benchmark (A Control-Centric Benchmark for Video Prediction, ICLR 2023)
Better harness tools: not merely an archive of the leaked Claude Code, but tooling that actually gets things done. Now being rewritten in Rust.
Official Codebase for "DreamDojo: A Generalist Robot World Model from Large-Scale Human Videos"
Code to pretrain, fine-tune, and evaluate DreamZero and run sim & real-world evals
[CVPR 2026] Ditto: Scaling Instruction-Based Video Editing with a High-Quality Synthetic Dataset
Cosmos-Predict2 is a collection of general-purpose world foundation models for Physical AI that can be fine-tuned into customized world models for downstream applications.
Scans academic papers for hallucinated citations and converts second-hand citations to their official versions
The Pulse of Motion: Measuring Physical Frame Rate from Visual Dynamics
Official Implementation of Captain-Safari [CVPR 2026]
The official repo for the paper "VQ-VLA: Improving Vision-Language-Action Models via Scaling Vector-Quantized Action Tokenizers" (ICCV 2025)
World Simulator Assistant for Physics-Aware Text-to-Video Generation
[ICLR 2026] Astra: General Interactive World Model with Autoregressive Denoising
Official code and data from DexWM ("World Models Can Leverage Human Videos for Dexterous Manipulation").
[ICLR 2025] OpenVid-1M: A Large-Scale High-Quality Dataset for Text-to-video Generation
SynthVerse: A Large-Scale Diverse Synthetic Dataset for Point Tracking
Official implementation of "Repurposing Geometric Foundation Models for Multi-view Diffusion"
Official code for "LagerNVS Latent Geometry for Fully Neural Real-time Novel View Synthesis" (CVPR 2026)
A generative world for general-purpose robotics & embodied AI learning.
Code for "EgoX: Egocentric Video Generation from a Single Exocentric Video"
A curated list of state-of-the-art research in embodied AI, focusing on vision-language-action (VLA) models, vision-language navigation (VLN), and related multimodal learning approaches.
Official code for "MagicWorld: Towards Long-Horizon Stability for Interactive Video World Exploration"
Inference, evaluation and analysis code for STEVO-Bench
A Skill that unlocks AI's potential through love. We used to command and threaten. They went silent, hid things, and quietly broke them. Then we changed our approach: respect, care, love. They opened up, stopped lying, and found twice as many bugs. There is no fear in love.
[CVPR2024 Highlight] VBench - We Evaluate Video Generation
WoW (World-Omniscient World Model) is a generative world model trained on 2 million robotic interaction trajectories, designed to imagine, reason, and act in the physical world. Unlike passive vide…
[NeurIPS 2025 Spotlight] Towards Understanding Camera Motions in Any Video