Qijun Gan agnJason

Ph.D. from ZJU in China

56 followers · 23 following

Zhejiang University
agnjason.github.io

Stars

bytedance / Bernini

Bernini is a unified framework for video generation and editing that combines an MLLM-based semantic planner with a DiT-based renderer.

Python 756 57 Updated Jun 12, 2026

DCDmllm / InstructSAM

The code for "InstructSAM: Segment Any Instance with Any Instructions"

Python 80 6 Updated May 26, 2026

bytedance / Lance

A 3B-active-parameter native unified multimodal model for image and video understanding, generation, and editing.

Python 1,199 79 Updated Jun 13, 2026

hanxunyu / DepthVLM

🔥 Official code repository for "Unlocking Dense Metric Depth Estimation in VLMs"

Python 128 6 Updated May 21, 2026

IF-LAB-PKU / Pyramid-Forcing

Official codebase for "Pyramid Forcing: Head-Aware Pyramid KV Cache Policy for High-Quality Long Video Generation"

Python 12 Updated Jun 3, 2026

nikopueringer / CorridorKey

Perfect Green Screen Keys

Python 13,866 857 Updated May 28, 2026

luoxyhappy / CoInteract

Official Implementation of CoInteract: Spatially-Structured Co-Generation for Interactive Human-Object Video Synthesis

Python 156 11 Updated May 7, 2026

bytedance / GRN

Generative Refinement Networks for Visual Synthesis (Support C2I & T2I & T2V)

Python 133 3 Updated Jun 8, 2026

OmniForcing / OmniForcing

Official implementation of "OmniForcing: Unleashing Real-time Joint Audio-Visual Generation" [arXiv:2603.11647]. OmniForcing is the first framework to distill bidirectional audio-visual diffusion mo…

Python 156 2 Updated May 2, 2026

PKU-YuanGroup / Helios

Helios: Real Real-Time Long Video Generation Model

Python 1,906 149 Updated Jun 10, 2026

ModelTC / GenRL

Reinforcement Learning Framework for Visual Generation

Python 123 6 Updated Feb 13, 2026

FoundationVision / Alive

[Tech Report] Alive: A Unified Audio-Video Generation Model

457 30 Updated Mar 31, 2026

MeiGen-AI / Infinite-World

[ICML 2026] | Scaling Interactive World Models to 1000-Frame Horizons via Pose-Free Hierarchical Memory

Python 177 6 Updated May 4, 2026

Alibaba-Quark / LiveAvatar

Implementation of "Live Avatar: Streaming Real-time Audio-Driven Avatar Generation with Infinite Length"

Python 2,145 241 Updated May 31, 2026

FoundationVision / InfinityStar

[NeurIPS 2025 Oral] Infinity⭐️: Unified Spacetime AutoRegressive Modeling for Visual Generation

Python 765 28 Updated Apr 16, 2026

Playmate111 / Playmate2

[AAAI 2026] Playmate2: Training-Free Multi-Character Audio-Driven Animation via Diffusion Transformer with Reward Feedback

Python 299 28 Updated Nov 21, 2025

character-ai / Ovi

Python 1,723 200 Updated Nov 15, 2025

MeiGen-AI / InfiniteTalk

Unlimited-length talking video generation that supports image-to-video and video-to-video generation

Python 6,885 1,212 Updated May 22, 2026

Francis-Rings / StableAvatar

We present StableAvatar, the first end-to-end video diffusion transformer, which synthesizes infinite-length high-quality audio-driven avatar videos without any post-processing, conditioned on a re…

Python 1,238 110 Updated Jan 20, 2026

antgroup / echomimic_v3

[AAAI 2026] EchoMimicV3: 1.3B Parameters are All You Need for Unified Multi-Modal and Multi-Task Human Animation

Python 944 111 Updated Mar 18, 2026

FoundationVision / Waver

Industry-level video foundation model for unified Text-to-Video (T2V) and Image-to-Video (I2V) generation.

939 121 Updated Aug 27, 2025

Omni-Avatar / OmniAvatar

Python 1,830 167 Updated Aug 6, 2025

songw-zju / awesome-multimodal-agentic-reasoning

Resources list for multimodal agentic reasoning

6 Updated Aug 11, 2025

fudan-generative-vision / hallo2

[ICLR 2025] Hallo2: Long-Duration and High-Resolution Audio-driven Portrait Image Animation

Python 3,705 536 Updated Feb 27, 2025

Saiyan-World / goku

[CVPR2025 Highlight] Video Generation Foundation Models: https://saiyan-world.github.io/goku/

Python 2,909 310 Updated Feb 19, 2025

Lightricks / LTX-Video

Official repository for LTX-Video

Python 10,473 1,034 Updated Jan 5, 2026

OpenDCAI / DataFlow

Easy Data Preparation with latest LLMs-based Operators and Pipelines.

Python 4,870 548 Updated Jun 10, 2026

Wakals / GASCOL

Official implementation of HCoG: Apply Hierarchical-Chain-of-Generation to Complex Attributes Text-to-3D Generation [CVPR 2025]

Python 59 2 Updated Jul 28, 2025

MARS-EAI / RoboFactory

[ICCV 2025] RoboFactory: Exploring Embodied Agent Collaboration with Compositional Constraints

Python 134 16 Updated Sep 2, 2025

ritzz-ai / GUI-R1

Official implementation of GUI-R1: A Generalist R1-Style Vision-Language Action Model For GUI Agents

Python 248 18 Updated May 5, 2025