- Zhejiang University, China
- Hangzhou, China
- luohao.site
Stars
[MICCAI 2025] Official code for "Pre-Trained LLM is a Semantic-Aware and Generalizable Segmentation Booster"
Diffusion Models in Medical Imaging (Published in Medical Image Analysis Journal)
Frequency Autoregressive Image Generation with Continuous Tokens
Code and dataset for "Detecting Human Artifacts from Text-to-Image Models"
This repository is the official implementation of FLUX-CustomID. It is capable of generating images based on your face image at a level equivalent to real photographic quality. Our base model is FL…
Official inference repo for FLUX.1 models
The repository provides code for running inference with the Meta Segment Anything Model 2 (SAM 2), links for downloading the trained model checkpoints, and example notebooks that show how to use th…
VideoSys: An easy and efficient system for video generation
https://www.shoufachen.com/Awesome-Diffusion-Transformers/
A Collection of Papers and Codes for CVPR2025/ICCV2025/CVPR2024/ECCV2024 AIGC
Transparent Image Layer Diffusion using Latent Transparency
We introduce a novel approach for parameter generation, named neural network parameter diffusion (p-diff), which employs a standard latent diffusion model to synthesize a new set of parameters
Latent Consistency Models: Synthesizing High-Resolution Images with Few-Step Inference
A curated list of awesome research papers, projects, code, dataset, workshops etc. related to virtual try-on.
LightGlue: Local Feature Matching at Light Speed (ICCV 2023)
Adapting Meta AI's Segment Anything to Downstream Tasks with Adapters and Prompts
[NeurIPS 2023] Official implementation of the paper "Segment Everything Everywhere All at Once"
The repository provides code for running inference with the Segment Anything Model (SAM), links for downloading the trained model checkpoints, and example notebooks that show how to use the model.
Official repository for the General Robust Image Task (GRIT) Benchmark
A Semantic Controllable Self-Supervised Learning Framework to learn general human representations from massive unlabeled human images, which can benefit downstream human-centric tasks to the maximu…
BELLE: Be Everyone's Large Language model Engine (open-source Chinese conversational large language model)
Efficient Dataset Distillation by Representative Matching
LAVIS - A One-stop Library for Language-Vision Intelligence
SSL4EO-S12: a large-scale dataset for self-supervised learning in Earth observation
DAMO-YOLO: a fast and accurate object detection method with several new techniques, including NAS backbones, efficient RepGFPN, ZeroHead, AlignedOTA, and distillation enhancement.