- Zhejiang University
- Hangzhou
- xljh0520.github.io
Stars
HunyuanImage-3.0: A Powerful Native Multimodal Model for Image Generation
A unified inference and post-training framework for accelerated video generation.
Qwen3-VL is the multimodal large language model series developed by the Qwen team at Alibaba Cloud.
Qwen-Image is a powerful image generation foundation model capable of complex text rendering and precise image editing.
(CVPR 2025) From Slow Bidirectional to Fast Autoregressive Video Diffusion Models
Code for "Diffusion Model Alignment Using Direct Preference Optimization"
[NeurIPS'23 Oral] Visual Instruction Tuning (LLaVA) built towards GPT-4V level capabilities and beyond.
A generative world for general-purpose robotics & embodied AI learning.
A PyTorch library for implementing flow matching algorithms, featuring continuous and discrete flow matching implementations. It includes practical examples for both text and image modalities. (A minimal flow matching sketch follows this list.)
[NeurIPS 2024 Best Paper Award][GPT beats diffusion🔥] [scaling laws in visual generation📈] Official impl. of "Visual Autoregressive Modeling: Scalable Image Generation via Next-Scale Prediction". A…
HunyuanVideo: A Systematic Framework For Large Video Generation Model
Janus-Series: Unified Multimodal Understanding and Generation Models
NeurIPS 2024 Paper: A Unified Pixel-level Vision LLM for Understanding, Generating, Segmenting, Editing
Tencent Hunyuan3D-1.0: A Unified Framework for Text-to-3D and Image-to-3D Generation
The best OSS video generation models, created by Genmo
OmniGen: Unified Image Generation. https://arxiv.org/pdf/2409.11340
Official Implementation of Rectified Flow (ICLR2023 Spotlight)
TorchCFM: a Conditional Flow Matching library
[ICLR 2025] Pyramidal Flow Matching for Efficient Video Generative Modeling
[CVPR2024] SeeSR: Towards Semantics-Aware Real-World Image Super-Resolution
Repository for Meta Chameleon, a mixed-modal early-fusion foundation model from FAIR.
PyTorch implementation of Transfusion, "Predict the Next Token and Diffuse Images with One Multi-Modal Model", from MetaAI
The repository provides code for running inference with the Meta Segment Anything Model 2 (SAM 2), links for downloading the trained model checkpoints, and example notebooks that show how to use th…
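Several of the starred repositories above (flow_matching, TorchCFM, Rectified Flow, Pyramidal Flow) build on the same flow matching objective. As a rough orientation, here is a minimal sketch of that objective in plain PyTorch. It assumes a toy MLP velocity field and a straight-line interpolation path, and it does not use any of those libraries' actual APIs; treat it as an illustration, not a reference implementation.

```python
# Minimal flow matching sketch (illustrative only, not the API of any repo above):
# regress a velocity field v_theta(x_t, t) onto the straight-line velocity (x1 - x0)
# along the interpolant x_t = (1 - t) * x0 + t * x1, with x0 drawn from a Gaussian.
import torch
import torch.nn as nn


class VelocityNet(nn.Module):
    """Toy MLP velocity field v_theta(x, t) for low-dimensional data (hypothetical architecture)."""

    def __init__(self, dim: int = 2, hidden: int = 128):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(dim + 1, hidden), nn.SiLU(),
            nn.Linear(hidden, hidden), nn.SiLU(),
            nn.Linear(hidden, dim),
        )

    def forward(self, x: torch.Tensor, t: torch.Tensor) -> torch.Tensor:
        # Concatenate the time scalar onto each sample before the MLP.
        return self.net(torch.cat([x, t], dim=-1))


def flow_matching_loss(model: VelocityNet, x1: torch.Tensor) -> torch.Tensor:
    """Flow matching loss with a Gaussian source x0 and a linear interpolation path."""
    x0 = torch.randn_like(x1)                        # source samples (noise)
    t = torch.rand(x1.size(0), 1, device=x1.device)  # t ~ Uniform[0, 1]
    xt = (1.0 - t) * x0 + t * x1                     # point on the straight path
    target = x1 - x0                                 # constant velocity of that path
    return ((model(xt, t) - target) ** 2).mean()


if __name__ == "__main__":
    model = VelocityNet()
    opt = torch.optim.Adam(model.parameters(), lr=1e-3)
    for step in range(1000):
        x1 = torch.randn(256, 2) * 0.5 + 2.0         # stand-in "data" batch
        loss = flow_matching_loss(model, x1)
        opt.zero_grad()
        loss.backward()
        opt.step()
```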