Stars
Robust Speech Recognition via Large-Scale Weak Supervision
Making large AI models cheaper, faster and more accessible
🤗 Diffusers: State-of-the-art diffusion models for image, video, and audio generation in PyTorch.
Pretrain, finetune ANY AI model of ANY size on 1 or 10,000+ GPUs with zero code changes.
Official inference repo for FLUX.1 models
An implementation of model parallel autoregressive transformers on GPUs, based on the Megatron and DeepSpeed libraries
Official repo for consistency models.
High-level library to help with training and evaluating neural networks in PyTorch flexibly and transparently.
Easily turn large sets of image URLs into an image dataset. Can download, resize, and package 100M URLs in 20h on one machine.
Karras et al. (2022) diffusion models for PyTorch
Implementation of 🦩 Flamingo, a state-of-the-art few-shot visual question answering attention net from DeepMind, in PyTorch
[ICML'23] StyleGAN-T: Unlocking the Power of GANs for Fast Large-Scale Text-to-Image Synthesis
Implementation of RETRO, DeepMind's retrieval-based attention net, in PyTorch
DataComp: In search of the next generation of multimodal datasets
Easily create large video datasets from video URLs
[CVPR 2022] StyleGAN-V: A Continuous Video Generator with the Price, Image Quality and Perks of StyleGAN2
Implementation of memory-efficient multi-head attention, as proposed in the paper "Self-attention Does Not Need O(n²) Memory"
Score-Based Generative Modeling with Critically-Damped Langevin Diffusion
Official PyTorch implementation of our CVPR 2023 paper: "Towards Accurate Image Coding: Improved Autoregressive Image Generation with Dynamic Vector Quantization"
Easily compute CLIP embeddings from video frames
Comparison between the Fréchet Video Distance implementations from StyleGAN-V and the original paper
Implementation of the paper Recurrent Independent Mechanisms (https://arxiv.org/pdf/1909.10893.pdf)