yangliu96

Yang Liu yangliu96

61 followers · 150 following

ZJU, BAAI

Achievements

Lists (7)

Agent

Dataset

dLLM

LLM

10 repositories

Segmentation

VFM

Video Gen

Stars

Anionex / banana-slides

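An AI-native PPT generator based on nano banana pro🍌, moving toward true "Vibe PPT": upload any template image; upload any material with intelligent parsing; generate slides automatically from a single sentence, an outline, or page descriptions; edit a specified region by voice; export in one click.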

TypeScript 5,321 589 Updated Dec 20, 2025

ByteDance-BandAI / CodeVision

Thinking with Programming Vision: Towards a Unified View for Thinking with Images

Python 30 Updated Dec 17, 2025

xinyu1205 / recognize-anything

Open-source and strong foundation image recognition models.

Jupyter Notebook 3,528 315 Updated Feb 18, 2025

vllm-project / vllm-omni

A framework for efficient model inference with omni-modality models

Python 1,050 142 Updated Dec 21, 2025

stepfun-ai / Step1X-Edit

A SOTA open-source image editing model, which aims to provide comparable performance against the closed-source models like GPT-4o and Gemini 2 Flash.

Python 2,070 88 Updated Dec 15, 2025

EvolvingLMMs-Lab / LLaVA-OneVision-1.5

Fully Open Framework for Democratized Multimodal Training

Python 659 50 Updated Dec 15, 2025

Tongyi-MAI / Z-Image

Python 7,511 444 Updated Dec 14, 2025

pytorch / torchtitan

A PyTorch native platform for training generative AI models

Python 4,861 645 Updated Dec 21, 2025

glidea / banana-prompt-quicker

🍌Awesome Prompts; Nano Banana; Banana Pro; Gemini; AI Studio; Prompt Quickly [advanced Sidebar features are in development, stay tuned]

JavaScript 1,827 144 Updated Dec 21, 2025

OpenGVLab / GenExam

GenExam: A Multidisciplinary Text-to-Image Exam

Python 50 3 Updated Dec 18, 2025

tyfeld / MMaDA-Parallel

Official Implementation of "MMaDA-Parallel: Multimodal Large Diffusion Language Models for Thinking-Aware Editing and Generation"

Python 279 7 Updated Nov 19, 2025

Tencent-Hunyuan / HunyuanVideo-1.5

HunyuanVideo-1.5: A leading lightweight video generation model

Python 2,025 98 Updated Dec 19, 2025

kandinskylab / kandinsky-5

Kandinsky 5.0: A family of diffusion models for Video & Image generation

Python 596 41 Updated Dec 19, 2025

yifan123 / flow_grpo

[NeurIPS 2025] An official implementation of Flow-GRPO: Training Flow Matching Models via Online RL

Python 1,762 104 Updated Nov 4, 2025

NVlabs / DiffusionNFT

DiffusionNFT: Online Diffusion Reinforcement with Forward Process

Python 502 15 Updated Sep 22, 2025

ByteDance-Seed / Depth-Anything-3

Depth Anything 3

Python 3,627 308 Updated Dec 12, 2025

EvolvingLMMs-Lab / NEO

NEO Series: Native Vision-Language Models from First Principles

Python 589 20 Updated Dec 17, 2025

bytetriper / RAE

Official PyTorch Implementation of "Diffusion Transformers with Representation Autoencoders"

Python 1,640 53 Updated Nov 15, 2025

TheNetAdmin / zjuthesis

Zhejiang University Graduation Thesis LaTeX Template

TeX 3,358 707 Updated Dec 8, 2025

FoundationVision / Waver

Industry-level video foundation model for unified Text-to-Video (T2V) and Image-to-Video (I2V) generation.

779 87 Updated Aug 27, 2025

FoundationVision / InfinityStar

[NeurIPS 2025 Oral] Infinity⭐️: Unified Spacetime AutoRegressive Modeling for Visual Generation

Python 658 24 Updated Nov 27, 2025

meituan-longcat / LongCat-Flash-Omni

This is the official repo for the paper "LongCat-Flash-Omni Technical Report"

Python 444 24 Updated Dec 15, 2025

baaivision / Emu3.5

Native Multimodal Models are World Learners

Python 1,367 52 Updated Nov 28, 2025

baaivision / URSA

🐻 Uniform Discrete Diffusion with Metric Path for Video Generation

Python 81 2 Updated Dec 19, 2025

Breakthrough / PySceneDetect

🎥 Python and OpenCV-based scene cut/transition detection program & library.

Python 4,409 473 Updated Dec 11, 2025

meituan-longcat / LongCat-Video

Python 1,547 200 Updated Dec 20, 2025

zerolllin / Delta-L-Normalization

Python 17 1 Updated Oct 11, 2025

aim-uofa / COSINE

[ICCV'25] Unified Open-World Segmentation with Multi-Modal Prompts

13 Updated Oct 10, 2025

Tencent-Hunyuan / HunyuanImage-3.0

HunyuanImage-3.0: A Powerful Native Multimodal Model for Image Generation

Python 2,599 123 Updated Oct 31, 2025

fudoki-hku / FUDOKI

The author's implementation of FUDOKI, a multimodal large language model purely based on discrete flow matching.

Python 65 3 Updated Sep 15, 2025