- Ant Group
- Hangzhou, China
- https://zengyh1900.github.io/
- @zengyh1900
Starred repositories
A latent text-to-image diffusion model
The repository provides code for running inference with the SegmentAnything Model (SAM), links for downloading the trained model checkpoints, and example notebooks that show how to use the model.
🔊 Text-Prompted Generative Audio Model
Google Research
CLIP (Contrastive Language-Image Pretraining): predict the most relevant text snippet given an image
A game theoretic approach to explain the output of any machine learning model.
Audiocraft is a library for audio processing and generation with deep learning. It features the state-of-the-art EnCodec audio compressor / tokenizer, along with MusicGen, a simple and controllable…
A guidance language for controlling large language models.
A High-Quality Real Time Upscaler for Anime Video
Instruct-tune LLaMA on consumer hardware
Welcome to the Llama Cookbook! This is your go-to guide for building with Llama: getting started with inference, fine-tuning, and RAG. We also show you how to solve end-to-end problems using Llama mode…
This repository provides tutorials and implementations for various Generative AI Agent techniques, from basic to advanced. It serves as a comprehensive guide for building intelligent, interactive A…
Grounded SAM: Marrying Grounding DINO with Segment Anything & Stable Diffusion & Recognize Anything - Automatically Detect, Segment and Generate Anything
Qwen3-VL is the multimodal large language model series developed by Qwen team, Alibaba Cloud.
StableLM: Stability AI Language Models
This repository contains the source code for the paper First Order Motion Model for Image Animation
A tiny scalar-valued autograd engine and a neural net library on top of it with PyTorch-like API
High-Resolution Image Synthesis with Latent Diffusion Models
Chinese-language reinforcement learning tutorial (the "Mushroom Book" 🍄). Read online: https://datawhalechina.github.io/easy-rl/
LAVIS - A One-stop Library for Language-Vision Intelligence
FaceChain is a deep-learning toolchain for generating your Digital-Twin.
🦙 LaMa Image Inpainting, Resolution-robust Large Mask Inpainting with Fourier Convolutions, WACV 2022
[NeurIPS 2024 Best Paper Award][GPT beats diffusion🔥] [scaling laws in visual generation📈] Official impl. of "Visual Autoregressive Modeling: Scalable Image Generation via Next-Scale Prediction". A…
Zero-Shot Speech Editing and Text-to-Speech in the Wild
A series of large language models trained from scratch by developers @01-ai
Implementation of Dreambooth (https://arxiv.org/abs/2208.12242) with Stable Diffusion
Inpaint anything using Segment Anything and inpainting models.
Using Low-rank adaptation to quickly fine-tune diffusion models.