  • Institute of Computing Technology, CAS


Awesome Unified Multimodal Models

990 31 Updated Aug 17, 2025

Automatically crawl arXiv papers daily, summarize them using AI, and present them using GitHub Pages.

JavaScript 2,205 734 Updated Dec 22, 2025

Resources and paper list for "Thinking with Images for LVLMs". This repository accompanies our survey on how LVLMs can leverage visual information for complex reasoning, planning, and generation.

1,220 39 Updated Dec 23, 2025
190 9 Updated Oct 16, 2025

[ICCV 2025] Code & Data for: SuperEdit - Rectifying and Facilitating Supervision for Instruction-Based Image Editing

Python 164 9 Updated Jun 26, 2025

[NeurIPS 2025] Image editing is worth a single LoRA! 0.1% training data for fantastic image editing! Surpasses GPT-4o in ID persistence~ MoE ckpt released! Only 4GB VRAM is enough to run!

Python 2,049 114 Updated Dec 19, 2025

VSCode extension that grammar-checks texts through a local LLM

TypeScript 25 7 Updated Oct 30, 2025

Official implementation of ICML 2024 Cascade-CLIP: Cascaded Vision-Language Embeddings Alignment for Zero-Shot Semantic Segmentation

Python 54 3 Updated Aug 15, 2024

[ICLR 2025] Diffusion Feedback Helps CLIP See Better

Python 298 15 Updated Jan 23, 2025

[NeurIPS'25] A work to improve CLIP's visual detail capturing ability by inverting the unCLIP generative model.

Python 18 Updated Jun 2, 2025

The official implementation of CVPR Workshop 2025 paper: Window Token Concatenation for Efficient Visual Large Language Models.

Python 10 Updated Apr 10, 2025

(ICCV 2025) Enhance CLIP and MLLM's fine-grained visual representations with generative models.

Python 74 4 Updated Jun 25, 2025

This repository collects papers on VLLM applications. New papers are added irregularly.

194 15 Updated Dec 18, 2025

Easily turn large sets of image URLs into an image dataset. Can download, resize and package 100M URLs in 20h on one machine.

Python 4,249 365 Updated Oct 19, 2025

Diffusion Classifier leverages pretrained diffusion models to perform zero-shot classification without additional training

Python 479 41 Updated Feb 28, 2024

[ICLR'25 Oral] No Pose, No Problem: Surprisingly Simple 3D Gaussian Splats from Sparse Unposed Images

Python 906 47 Updated Sep 2, 2025

LaTeX Thesis Template for the University of Chinese Academy of Sciences

TeX 3,734 943 Updated Feb 29, 2024
Python 1,411 134 Updated Jan 8, 2025

Official implementation of NeurIPS'24 paper Not All Diffusion Model Activations Have Been Evaluated as Discriminative Features

Python 35 5 Updated May 28, 2025

Utilities intended for use with Llama models.

Python 7,401 1,306 Updated Dec 16, 2025

Next-Token Prediction is All You Need

Python 2,270 91 Updated Nov 19, 2025

Collection of common code that's shared among different research projects in FAIR computer vision team.

Python 2,206 236 Updated Aug 27, 2025

High-resolution models for human tasks.

Python 5,254 310 Updated Nov 18, 2024

The repository provides code for running inference with the Meta Segment Anything Model 2 (SAM 2), links for downloading the trained model checkpoints, and example notebooks that show how to use th…

Jupyter Notebook 18,090 2,292 Updated Dec 25, 2024

PyTorch implementation of MAR+DiffLoss https://arxiv.org/abs/2406.11838

Python 1,829 113 Updated Sep 27, 2024

Cambrian-1 is a family of multimodal LLMs with a vision-centric design.

Python 1,975 134 Updated Nov 7, 2025

This repo contains the code for 1D tokenizer and generator

Jupyter Notebook 1,086 59 Updated Mar 20, 2025

SEED-Voken: A Series of Powerful Visual Tokenizers

Python 984 36 Updated Nov 25, 2025

Synthesizing Graphics Programs for Scientific Figures and Sketches with TikZ

Python 1,662 85 Updated Aug 12, 2025

DiffSeg is an unsupervised zero-shot segmentation method using attention information from a stable-diffusion model. This repo implements the main DiffSeg algorithm and additionally includes an expe…

Jupyter Notebook 329 25 Updated Jul 9, 2024