Lists (32)
3D image generation
ComfyUI
Code related to ComfyUI
face
Face-related projects, including face editing, animation generation, etc.
GAN
Image Edit
Image editing
image generation
Image generation
NeRF
Neural radiance fields
OCR
SAM
sd-webui
WebUI applications for Stable Diffusion
stable-diffusion
Diffusion models
transformer video
video generation
Collection of video generation code and applications
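Optical flow estimation
Anime-related
Image restoration
Image-text matching
Image search / image matching
Diffusion-based image-text algorithms
Multi-task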
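A single model that can handle multiple different categories of tasks
Large-model related
Large-model application projects
Face swapping
Datasets
Object detection / style / pose
Includes object detection, segmentation, pose detection, and similar projects
Object tracking / video frame interpolation
Virtual try-on
Virtual characters
Includes face and human-body reconstruction, animation driving, etc.
Video interpolation
Audio processing (generation)
Style transfer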
Stars
A latent text-to-image diffusion model
🔊 Text-Prompted Generative Audio Model
Audiocraft is a library for audio processing and generation with deep learning. It features the state-of-the-art EnCodec audio compressor / tokenizer, along with MusicGen, a simple and controllable…
Instruct-tune LLaMA on consumer hardware
Grounded SAM: Marrying Grounding DINO with Segment Anything & Stable Diffusion & Recognize Anything - Automatically Detect, Segment and Generate Anything
This repository contains the source code for the paper First Order Motion Model for Image Animation
High-Resolution Image Synthesis with Latent Diffusion Models
PyTorch code and models for the DINOv2 self-supervised learning method.
LAVIS - A One-stop Library for Language-Vision Intelligence
Convert AI papers to GUI. Make it easy and convenient for everyone to use cutting-edge artificial intelligence technology.
FaceChain is a deep-learning toolchain for generating your Digital-Twin.
🦙 LaMa Image Inpainting, Resolution-robust Large Mask Inpainting with Fourier Convolutions, WACV 2022
[NeurIPS 2024 Best Paper Award][GPT beats diffusion🔥] [scaling laws in visual generation📈] Official impl. of "Visual Autoregressive Modeling: Scalable Image Generation via Next-Scale Prediction". A…
Zero-Shot Speech Editing and Text-to-Speech in the Wild
MiniCPM4 & MiniCPM4.1: Ultra-Efficient LLMs on End Devices, achieving 3+ generation speedup on reasoning tasks
Inpaint anything using Segment Anything and inpainting models.
Using Low-rank adaptation to quickly fine-tune diffusion models.
Official Code for Stable Cascade
Accepted as [NeurIPS 2024] Spotlight Presentation Paper
The image prompt adapter is designed to enable a pretrained text-to-image diffusion model to generate images with image prompt.
This is the official code for MobileSAM project that makes SAM lightweight for mobile applications and beyond!
CoTracker is a model for tracking any point (pixel) on a video.
OmniGen: Unified Image Generation. https://arxiv.org/pdf/2409.11340
Hunyuan-DiT : A Powerful Multi-Resolution Diffusion Transformer with Fine-Grained Chinese Understanding
[SIGGRAPH Asia 2022] VToonify: Controllable High-Resolution Portrait Video Style Transfer
[CVPR 2022] Thin-Plate Spline Motion Model for Image Animation.
[CVPR 2024] 4D Gaussian Splatting for Real-Time Dynamic Scene Rendering