yangliu96

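An AI-native PPT generator based on nano banana pro🍌, moving toward a true "Vibe PPT". It supports uploading any template image, uploading arbitrary source material with intelligent parsing, generating a PPT automatically from a single sentence, an outline, or page descriptions, editing a specified region by spoken instruction, and one-click export.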

TypeScript 5,321 589 Updated Dec 20, 2025

Thinking with Programming Vision: Towards a Unified View for Thinking with Images

Python 30 Updated Dec 17, 2025

Open-source and strong foundation image recognition models.

Jupyter Notebook 3,528 315 Updated Feb 18, 2025

A framework for efficient model inference with omni-modality models

Python 1,050 142 Updated Dec 21, 2025

A SOTA open-source image editing model that aims for performance comparable to closed-source models such as GPT-4o and Gemini 2 Flash.

Python 2,070 88 Updated Dec 15, 2025

Fully Open Framework for Democratized Multimodal Training

Python 659 50 Updated Dec 15, 2025
Python 7,511 444 Updated Dec 14, 2025

A PyTorch native platform for training generative AI models

Python 4,861 645 Updated Dec 21, 2025

🍌 Awesome Prompts; Nano Banana; Banana Pro; Gemini; AI Studio; Prompt Quickly [advanced Sidebar features in development, stay tuned]

JavaScript 1,827 144 Updated Dec 21, 2025

GenExam: A Multidisciplinary Text-to-Image Exam

Python 50 3 Updated Dec 18, 2025

Official Implementation of "MMaDA-Parallel: Multimodal Large Diffusion Language Models for Thinking-Aware Editing and Generation"

Python 279 7 Updated Nov 19, 2025

HunyuanVideo-1.5: A leading lightweight video generation model

Python 2,025 98 Updated Dec 19, 2025

Kandinsky 5.0: A family of diffusion models for Video & Image generation

Python 596 41 Updated Dec 19, 2025

[NeurIPS 2025] An official implementation of Flow-GRPO: Training Flow Matching Models via Online RL

Python 1,762 104 Updated Nov 4, 2025

DiffusionNFT: Online Diffusion Reinforcement with Forward Process

Python 502 15 Updated Sep 22, 2025

Depth Anything 3

Python 3,627 308 Updated Dec 12, 2025

NEO Series: Native Vision-Language Models from First Principles

Python 589 20 Updated Dec 17, 2025

Official PyTorch Implementation of "Diffusion Transformers with Representation Autoencoders"

Python 1,640 53 Updated Nov 15, 2025

Zhejiang University Graduation Thesis LaTeX Template

TeX 3,358 707 Updated Dec 8, 2025

Industry-level video foundation model for unified Text-to-Video (T2V) and Image-to-Video (I2V) generation.

779 87 Updated Aug 27, 2025

[NeurIPS 2025 Oral] Infinity⭐️: Unified Spacetime AutoRegressive Modeling for Visual Generation

Python 658 24 Updated Nov 27, 2025

This is the official repo for the paper "LongCat-Flash-Omni Technical Report"

Python 444 24 Updated Dec 15, 2025

Native Multimodal Models are World Learners

Python 1,367 52 Updated Nov 28, 2025

🐻 Uniform Discrete Diffusion with Metric Path for Video Generation

Python 81 2 Updated Dec 19, 2025

🎥 Python and OpenCV-based scene cut/transition detection program & library.

Python 4,409 473 Updated Dec 11, 2025

[ICCV'25] Unified Open-World Segmentation with Multi-Modal Prompts

13 Updated Oct 10, 2025

HunyuanImage-3.0: A Powerful Native Multimodal Model for Image Generation

Python 2,599 123 Updated Oct 31, 2025

The author's implementation of FUDOKI, a multimodal large language model purely based on discrete flow matching.

Python 65 3 Updated Sep 15, 2025