This is my road to learning about transformers and state-of-the-art (SOTA) language models. It closely follows karpathy/makemore and is based on @karpathy's YouTube videos, with a few minor tweaks to match my personal learning preferences. Let's hope for an innovation 🤞🏻
- Bigram Language Model Using Probabilities
- Multi-Layer Perceptron Bigram LM
- Convolutional Neural Networks
- ... (TBC)