Stars
🚀 [ICLR 2026] SenseFlow: Scaling Distribution Matching for Flow-based Text-to-Image Distillation
Official Codebase for our CVPR 2026 paper UniSH: Unifying Scene and Human Reconstruction in a Feed-Forward Pass
[CVPR 2026] ReasonMap: Towards Fine-Grained Visual Reasoning from Transit Maps
[TMLR 2025] Efficient Reasoning Models: A Survey
[IJCV 2025] Smaller But Better: Unifying Layout Generation with Smaller Large Language Models
[arXiv 25] OCRGenBench: A Comprehensive Benchmark for Evaluating OCR Generative Capabilities
A curated list of awesome prompt/adapter learning methods for vision-language models like CLIP.
Official Code for Paper: Beyond Matryoshka: Revisiting Sparse Coding for Adaptive Representation
Reconstructing spatiotemporal dynamics from spatial transcriptome snapshots
A library that integrates different MIL methods into a unified framework
[🏆AAAI'25] Official Repo for ChemVLM: Exploring the Power of Multimodal Large Language Models in Chemistry Area.
This is the repository of DEER, a Dynamic Early Exit in Reasoning method for Large Reasoning Language Models.
[🚀ICML 2025] "Taming Rectified Flow for Inversion and Editing" Using FLUX and HunyuanVideo for image and video editing!
[NeurIPS'25] Official Repository for the Paper "SQL-R1: Training Natural Language to SQL Reasoning Model By Reinforcement Learning"
[CVPR 2025] Noise-Consistent Siamese-Diffusion for Medical Image Synthesis and Segmentation
This repository contains the code for our ICML 2025 paper, LENSLLM: Unveiling Fine-Tuning Dynamics for LLM Selection 🎉
[ACL'25] UTBoost: Rigorous Evaluation of Coding Agents on SWE-Bench
[NeurIPS 2025 spotlight] Official implementation for "FutureSightDrive: Thinking Visually with Spatio-Temporal CoT for Autonomous Driving"
[ICLR 2025] The official implementation of "PSEC: Skill Expansion and Composition in Parameter Space", a new framework designed to facilitate efficient and flexible skill expansion and composition, …
🏆 Official implementation of LangCoop: Collaborative Driving with Natural Language
This repository is the official implementation of Human4DiT: 360-degree Human Video Generation with 4D Diffusion Transformer.
[NeurIPS D&B Track 2024] Official implementation of HumanVid
Generative Models by Stability AI
Reproduction of the paper "Consistent View Synthesis with Pose-Guided Diffusion Models"
[NeurIPS 2021] Moment-DETR code and QVHighlights dataset