Stars
Research code for pixel-based encoders of language (PIXEL)
It is said that Ilya Sutskever gave John Carmack this reading list of ~30 research papers on deep learning.
Data processing utilities in keras3
TripoSG: High-Fidelity 3D Shape Synthesis using Large-Scale Rectified Flow Models
[ICML2025] Aguvis: Unified Pure Vision Agents for Autonomous GUI Interaction
HunyuanVideo: A Systematic Framework For Large Video Generation Model
[NeurIPS'23 Spotlight] "Mind2Web: Towards a Generalist Agent for the Web" -- the first LLM-based web agent and benchmark for generalist web agents
Janus-Series: Unified Multimodal Understanding and Generation Models
Implementation of rectified flow and some of its follow-up research / improvements in PyTorch
Efficient vision foundation models for high-resolution generation and perception.
[ICLR 2025] Rectified Diffusion: Straightness Is Not Your Need
[ICLR 2025] Pyramidal Flow Matching for Efficient Video Generative Modeling
PyTorch implementation of MAR+DiffLoss https://arxiv.org/abs/2406.11838
Official PyTorch implementation of AdaFlow
[ECCV 2024] Bridging Different Language Models and Generative Vision Models for Text-to-Image Generation
[ECCV2024] "SlimFlow: Training Smaller One-Step Diffusion Models with Rectified Flow", Yuanzhi Zhu, Xingchao Liu, Qiang Liu
An open-source toolbox for fast sampling of diffusion models. Official implementations of our works published in ICML, NeurIPS, CVPR, J. Stat. Mech.
(CVPR 2024) 🧩 TokenCompose: Text-to-Image Diffusion with Token-level Supervision
Mixed continuous/categorical flow-matching model for de novo molecule generation.
[NeurIPS 2024] RectifID: Personalizing Rectified Flow with Anchored Classifier Guidance
Open-Sora: Democratizing Efficient Video Production for All
Human Preference Score v2: A Solid Benchmark for Evaluating Human Preferences of Text-to-Image Synthesis
🔥[IJCAI 2022, Official Code] for paper "Rethinking Image Aesthetics Assessment: Models, Datasets and Benchmarks". Official Weights and Demos provided. The first aesthetics assessment dataset, algorithm, and benchmark for multi-theme scenarios.
Official code for paper: Text-to-Image Rectified Flow as Plug-and-Play Priors [ICLR 2025]
The codebase of our paper "Improving the Training of Rectified Flows", NeurIPS 2024