From scratch🛠️
Materials for the Hugging Face Diffusion Models Course
Distributed training (multi-node) of a Transformer model
Semantic segmentation models with 500+ pretrained convolutional and transformer-based backbones.
My implementation of the original transformer model (Vaswani et al.). I've additionally included the playground.py file for visualizing otherwise seemingly hard concepts. Currently included IWSLT p…
My implementation of the original GAT paper (Veličković et al.). I've additionally included the playground.py file for visualizing the Cora dataset, GAT embeddings, an attention mechanism, and entr…
Yet another PyTorch implementation of Stable Diffusion (probably easy to read)
Creating a diffusion model from scratch in PyTorch to learn exactly how they work.
Attention Is All You Need | a PyTorch Tutorial to Transformers
PyTorch implementation of Google AI's 2018 BERT
A PyTorch-based educational framework for exploring deep learning
🌎 Machine learning tutorials (mainly in Python 3)
Stable Diffusion implemented from scratch in PyTorch
Simple, minimal implementation of the Mamba SSM in one file of PyTorch.
Implementation of https://srush.github.io/annotated-s4
A simple and efficient Mamba implementation in pure PyTorch and MLX.
A minimal PyTorch re-implementation of the OpenAI GPT (Generative Pretrained Transformer) training
Complete implementation of Llama 2, with and without KV cache, plus inference 🚀
This library generates more helpful exception messages for matrix algebra expressions in numpy, pytorch, jax, tensorflow, keras, and fastai.
Meshed-Memory Transformer for Image Captioning. CVPR 2020
A walkthrough of transformer architecture code
An annotated implementation of the Transformer paper.
Autograd to GPT-2 completely from scratch
Minimal, clean code for the Byte Pair Encoding (BPE) algorithm commonly used in LLM tokenization (a toy sketch of the merge loop follows this list).
Repository of Jupyter notebook tutorials for teaching the Deep Learning Course at the University of Amsterdam (MSc AI), Fall 2023
Annotated version of the Mamba paper
An absolutely minimal implementation of a GPT-like transformer using only NumPy (<650 lines).
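For orientation on the BPE entry above: the core of BPE training is repeatedly counting adjacent token pairs and merging the most frequent one. The sketch below is a toy character-level version of that loop, not code from any of the listed repositories; `train_bpe`, its arguments, and the sample text are illustrative only (real tokenizers typically operate on bytes and handle pre-tokenization, special tokens, and ties more carefully).

```python
# Toy sketch of the BPE merge loop (illustrative only, not from the listed repos).
from collections import Counter

def train_bpe(text, num_merges):
    """Learn up to `num_merges` merges from `text`, starting from characters."""
    tokens = list(text)
    merges = []
    for _ in range(num_merges):
        # Count adjacent pairs and pick the most frequent one to merge.
        pairs = Counter(zip(tokens, tokens[1:]))
        if not pairs:
            break
        (a, b), _count = pairs.most_common(1)[0]
        merges.append((a, b))
        # Replace every occurrence of the pair (a, b) with the merged token.
        merged, i = [], 0
        while i < len(tokens):
            if i + 1 < len(tokens) and tokens[i] == a and tokens[i + 1] == b:
                merged.append(a + b)
                i += 2
            else:
                merged.append(tokens[i])
                i += 1
        tokens = merged
    return merges, tokens

merges, tokens = train_bpe("low lower lowest", num_merges=5)
print(merges)  # learned merge rules, e.g. [('l', 'o'), ('lo', 'w'), ...]
print(tokens)  # the text re-segmented with those merges applied
```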