YHWH666

Jason Li YHWH666

3 followers · 19 following

Stars

Image<->Text

19 repositories

cure-lab / PnPInversion

[ICLR2024] Official repo for paper "PnP Inversion: Boosting Diffusion-based Editing with 3 Lines of Code"

Jupyter Notebook 373 19 Updated Mar 12, 2024

lucidrains / parti-pytorch

Implementation of Parti, Google's pure attention-based text-to-image neural network, in Pytorch

Python 537 25 Updated Dec 8, 2023

haotian-liu / LLaVA

[NeurIPS'23 Oral] Visual Instruction Tuning (LLaVA) built towards GPT-4V level capabilities and beyond.

Python 24,191 2,684 Updated Aug 12, 2024

yzliu567 / sc-gan

Jupyter Notebook 38 3 Updated Jul 11, 2022

LTH14 / rcg

PyTorch implementation of RCG https://arxiv.org/abs/2312.03701

Python 935 43 Updated Sep 27, 2024

sled-group / InfEdit

[CVPR 2024] Official implementation of "Inversion-Free Image Editing with Natural Language"

Python 353 9 Updated May 28, 2024

salesforce / LAVIS

LAVIS - A One-stop Library for Language-Vision Intelligence

Jupyter Notebook 11,073 1,088 Updated Nov 18, 2024

LLaVA-VL / LLaVA-Plus-Codebase

LLaVA-Plus: Large Language and Vision Assistants that Plug and Learn to Use Skills

Python 763 58 Updated Feb 1, 2024

AILab-CVC / VL-GPT

VL-GPT: A Generative Pre-trained Transformer for Vision and Language Understanding and Generation

86 2 Updated Sep 12, 2024

CircleRadon / Osprey

[CVPR2024] The code for "Osprey: Pixel Understanding with Visual Instruction Tuning"

Python 837 43 Updated Aug 19, 2025

LiWentomng / gradio-osprey-demo

Gradio demo used in our "Osprey: Pixel Understanding with Visual Instruction Tuning".

Python 16 3 Updated Dec 19, 2023

tyxsspa / AnyText

Official implementation code of the paper <AnyText: Multilingual Visual Text Generation And Editing>

Python 4,823 307 Updated Mar 7, 2025

instruct-imagen / instruct-imagen.github.io

HTML 2 Updated Jan 8, 2024

ali-vilab / SCEdit

[CVPR 2024 Highlight] Official repo: SCEdit: Efficient and Controllable Image Diffusion Generation via Skip Connection Editing

53 1 Updated Apr 5, 2024

MoayedHajiAli / ElasticDiffusion-official

The official Pytorch Implementation for ElasticDiffusion: Training-free Arbitrary Size Image Generation through Global-Local Content Separation (CVPR 2024)

Python 159 8 Updated Dec 24, 2024

lishaoxu1994 / DiffStyler

Python 35 6 Updated Dec 16, 2025

bytedance / DiffusionEngine

Python 88 6 Updated Sep 17, 2023

hansam95 / OSASIS

Python 48 4 Updated Jul 17, 2025

QwenLM / Qwen2.5-Omni

Qwen2.5-Omni is an end-to-end multimodal model by Qwen team at Alibaba Cloud, capable of understanding text, audio, vision, video, and performing real-time speech generation.

Jupyter Notebook 3,854 303 Updated Jun 12, 2025
