Implement all of the components (tokenizer, model architecture, optimizer) necessary to train a standard Transformer language model with a BPE tokenizer, then train and tune the LM.
Student version of Assignment 1 for Stanford CS336 - Language Modeling From Scratch
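
To give a feel for the tokenizer component, here is a minimal sketch of BPE merge learning. It is not the assignment's reference implementation: it assumes character-level symbols and whitespace pre-tokenization, and the function name `train_bpe` and the tie-breaking rule are illustrative choices. The assignment's own specification may differ (for example, by operating on bytes).

```python
from collections import Counter


def train_bpe(text: str, num_merges: int) -> list[tuple[str, str]]:
    """Learn up to `num_merges` BPE merges (hypothetical helper, character-level)."""
    # Assumption: split on whitespace; each word starts as a tuple of characters.
    words = Counter(tuple(word) for word in text.split())
    merges: list[tuple[str, str]] = []

    for _ in range(num_merges):
        # Count adjacent symbol pairs, weighted by how often each word occurs.
        pairs: Counter = Counter()
        for symbols, freq in words.items():
            for pair in zip(symbols, symbols[1:]):
                pairs[pair] += freq
        if not pairs:
            break  # every word is a single symbol; nothing left to merge

        # Most frequent pair wins; ties go to the lexicographically greater pair
        # (an assumed convention, chosen only to make the result deterministic).
        best = max(pairs, key=lambda p: (pairs[p], p))
        merges.append(best)

        # Rewrite every word, replacing occurrences of `best` with the merged symbol.
        new_words: Counter = Counter()
        for symbols, freq in words.items():
            out = []
            i = 0
            while i < len(symbols):
                if i + 1 < len(symbols) and (symbols[i], symbols[i + 1]) == best:
                    out.append(symbols[i] + symbols[i + 1])
                    i += 2
                else:
                    out.append(symbols[i])
                    i += 1
            new_words[tuple(out)] += freq
        words = new_words

    return merges


if __name__ == "__main__":
    print(train_bpe("ab ab ab bc", num_merges=5))
```

The learned merge list, applied in order, is what a BPE encoder would use to turn new text into tokens. Training stops early once no adjacent pairs remain.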