GLM-Image: Auto-regressive for Dense-knowledge and High-fidelity Image Generation.

Python 743 43 Updated Feb 2, 2026

Official implementation of Pref-GRPO: Pairwise Preference Reward-based GRPO for Stable Text-to-Image Reinforcement Learning

Python 223 9 Updated Feb 10, 2026

SpaceTimePilot: Generative Rendering of Dynamic Scenes Across Space and Time

97 2 Updated Jan 1, 2026

DDT: Decoupled Diffusion Transformer

Python 361 17 Updated Aug 22, 2025

[CVPR 2025 Oral] Infinity ∞: Scaling Bitwise AutoRegressive Modeling for High-Resolution Image Synthesis

Python 11 Updated Dec 17, 2025

Training Large Language Models to Reason in a Continuous Latent Space

Python 1,500 166 Updated Aug 12, 2025

TurboDiffusion: 100–200× Acceleration for Video Diffusion Models

Python 3,315 229 Updated Jan 29, 2026

[ICLR 2026] Official repo of paper "Reconstruction Alignment Improves Unified Multimodal Models". Unlocking the Massive Zero-shot Potential in Unified Multimodal Models through Self-supervised Lear…

Python 363 14 Updated Feb 5, 2026

Adapting Self-Supervised Representations as a Latent Space for Efficient Generation

38 Updated Oct 17, 2025

Kandinsky 5.0: A family of diffusion models for Video & Image generation

Python 705 52 Updated Jan 31, 2026

[NeurIPS 2025] VideoREPA: Learning Physics for Video Generation through Relational Alignment with Foundation Models

Python 162 8 Updated Jan 7, 2026

Muon is an optimizer for hidden layers in neural networks

Python 2,287 106 Updated Jan 19, 2026

[ICLR 2026] Code for our paper "Next Visual Granularity Generation".

Python 49 1 Updated Jan 26, 2026

[NeurIPS'25 Spotlight] Boosting Generative Image Modeling via Joint Image-Feature Synthesis

Python 111 5 Updated Nov 3, 2025
Jupyter Notebook 117 3 Updated Nov 8, 2025

[NeurIPS 2025 Oral] Infinity⭐️: Unified Spacetime AutoRegressive Modeling for Visual Generation

Python 714 26 Updated Nov 27, 2025

[ICLR'26] Easier Painting Than Thinking: Can Text-to-Image Models Set the Stage, but Not Direct the Play?

Python 48 1 Updated Feb 10, 2026

[ICLR 2026] An early exploration that introduces Interleaving Reasoning to the text-to-image generation field and achieves SoTA benchmark performance. It also significantly improves the quality…

Python 87 Updated Jan 26, 2026
Python 1,776 78 Updated Dec 16, 2025

MoVQGAN - model for the image encoding and reconstruction

Jupyter Notebook 259 18 Updated Oct 31, 2023

[CVPR 2025🔥] Enhancing Video VAE by Wavelet-Driven Energy Flow for Latent Video Diffusion Model

Python 194 10 Updated May 11, 2025

NEO Series: Native Vision-Language Models from First Principles

Python 641 23 Updated Jan 9, 2026

Detect Anything via Next Point Prediction (Based on Qwen2.5-VL-3B)

Jupyter Notebook 1,127 80 Updated Jan 25, 2026

Official PyTorch Implementation of "Diffusion Transformers with Representation Autoencoders"

Python 1,755 66 Updated Jan 20, 2026

[CVPR 2025 Highlight] GEN3C: 3D-Informed World-Consistent Video Generation with Precise Camera Control

Jupyter Notebook 1,260 69 Updated Sep 24, 2025

[ICLR 2026 Oral] DiffusionNFT: Online Diffusion Reinforcement with Forward Process

Python 633 23 Updated Feb 10, 2026

[ICCV'25 Best Paper Finalist] ReCamMaster: Camera-Controlled Generative Rendering from A Single Video

Python 1,735 82 Updated Nov 28, 2025

DC-Gen: Post-Training Diffusion Acceleration with Deeply Compressed Latent Space

Python 345 9 Updated Oct 5, 2025

Fully Open Framework for Democratized Multimodal Training

Python 727 57 Updated Dec 27, 2025

[ICCV 2025] OpenVision: A Fully-Open, Cost-Effective Family of Advanced Vision Encoders for Multimodal Learning

Python 454 24 Updated Jan 29, 2026