Stars
🤗 Transformers: the model-definition framework for state-of-the-art machine learning models across text, vision, audio, and multimodal tasks, for both inference and training.
Official Code for DragGAN (SIGGRAPH 2023)
Facebook AI Research Sequence-to-Sequence Toolkit written in Python.
🤗 Diffusers: State-of-the-art diffusion models for image, video, and audio generation in PyTorch.
Generative Models by Stability AI
Open-source code for MiniGPT-4 and MiniGPT-v2 (https://minigpt-4.github.io, https://minigpt-v2.github.io/)
Qwen3 is the large language model series developed by the Qwen team, Alibaba Cloud.
Official inference repo for FLUX.1 models
Implementation of Vision Transformer, a simple way to achieve SOTA in vision classification with only a single transformer encoder, in PyTorch
[NeurIPS'23 Oral] Visual Instruction Tuning (LLaVA) built towards GPT-4V level capabilities and beyond.
The official repo of Qwen (通义千问) chat & pretrained large language model proposed by Alibaba Cloud.
Let's make video diffusion practical!
Wan: Open and Advanced Large-Scale Video Generative Models
An open source implementation of CLIP.
Generate 3D objects conditioned on text or images
Official implementation of AnimateDiff.
Implementation of DALL-E 2, OpenAI's updated text-to-image synthesis neural network, in PyTorch
[ECCV 2024] Official implementation of the paper "Grounding DINO: Marrying DINO with Grounded Pre-Training for Open-Set Object Detection"
Python client for Baidu Yun (Baidu Netdisk, personal cloud storage)
Real-Time High-Resolution Background Matting
Track-Anything is a flexible and interactive tool for video object tracking and segmentation, based on Segment Anything, XMem, and E2FGVI.
Infinite Photorealistic Worlds using Procedural Generation
[ICCV 2023] ProPainter: Improving Propagation and Transformer for Video Inpainting
Qwen-Image is a powerful image generation foundation model capable of complex text rendering and precise image editing.
[SIGGRAPH Asia 2024, Journal Track] ToonCrafter: Generative Cartoon Interpolation