SJTU
tools
[NeurIPS 2025 D&B] Open-source Multi-agent Poster Generation from Papers
[ICLR 2025] Official implementation and benchmark evaluation repository of <PhysBench: Benchmarking and Enhancing Vision-Language Models for Physical World Understanding>
[ICCV 2025] SpatialTrackerV2: 3D Point Tracking Made Easy
[NeurIPS 2025] TrackingWorld: World-centric Monocular 3D Tracking of Almost All Pixels
A comprehensive and critical synthesis of the emerging role of GenAI across the full autonomous driving stack
A paper list for spatial reasoning
[CVPR 2025] Parallel Sequence Modeling via Generalized Spatial Propagation Network
[ICLR 2025] CATCH: Channel-Aware Multivariate Time Series Anomaly Detection via Frequency Patching
📖 A repository for organizing papers, code, and other resources related to personalized video generation and editing.
A Benchmark for Evaluating Generalization for Robotic Manipulation
xDiT: A Scalable Inference Engine for Diffusion Transformers (DiTs) with Massive Parallelism
Reference PyTorch implementation and models for DINOv3
A collection of the application documents I used to apply to universities in the US.
[ICLR 2025 Spotlight] SVDQuant: Absorbing Outliers by Low-Rank Components for 4-Bit Diffusion Models
RAFT contains fundamental widely-used algorithms and primitives for machine learning and information retrieval. The algorithms are CUDA-accelerated and form building blocks for more easily writing …
An awesome curated collection of images and prompts generated by gemini-2.5-flash-image (aka Nano Banana), a state-of-the-art image generation and editing model. Explore AI-generated visuals created with…
Machine Learning Engineering Open Book
DeepDiff: Deep Difference and search of any Python object/data. DeepHash: Hash of any object based on its contents. Delta: Use deltas to reconstruct objects by adding deltas together.
A collection of CS PhD application fee waivers for schools in North America
MapAnything: Universal Feed-Forward Metric 3D Reconstruction
PyTorch code and models for VJEPA2 self-supervised learning from video.
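🥢 Cook like Laoxiangji 🐔. The main part was completed in 2024; this is not an official Laoxiangji repository. The text comes from the "Laoxiangji Dish Traceability Report" (《老乡鸡菜品溯源报告》) and has been summarized, edited, and organized. CookLikeHOC.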