dl

Implementations of various deep learning architectures and algorithms, built using only the basic ops provided by PyTorch.

Experiment Results

transformer

Model	WikiText-103 perplexity	Closest public model (perplexity)
gpt2 12l	26.7	26.37 (gpt2-medium)
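
For readers unfamiliar with the metric, perplexity is the exponential of the average negative log-likelihood per token. The sketch below is illustrative only: the function name `perplexity` is hypothetical, and it assumes natural-log losses averaged per token. The README does not say which normalization (per token or per word) the numbers above use.

```python
import math


def perplexity(token_nlls):
    """Return exp(mean NLL) for a list of per-token negative log-likelihoods.

    Assumes the losses use the natural logarithm (an assumption, not
    something stated in this repository).
    """
    if not token_nlls:
        raise ValueError("need at least one token loss")
    return math.exp(sum(token_nlls) / len(token_nlls))


# A model that assigns probability 1/4 to every token has a
# perplexity of (approximately) 4.
uniform_nlls = [math.log(4.0)] * 10
print(perplexity(uniform_nlls))
```

Lower is better: a perplexity of 4 means the model is, on average, as uncertain as a uniform choice among four tokens.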

toy datasets

RNN vs LSTM vs GRU on toy dataset of "abcdef...": tensorboard
RNN vs LSTM vs GRU on toy dataset of "a...ab..bc..c...": tensorboard

TODOs

transformer

Implement transformer with self-attention
Implement sinusoidal position embeddings
Implement relative position bias a la T5
Implement RoPE embeddings
Add support for cross-attention, as used in NMT
Implement beam search decoding
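
As an illustration of one item in the list above, here is a minimal sketch of sinusoidal position embeddings written with basic PyTorch tensor ops. It is not taken from this repository: the function name, the argument names, and the base of 10000 (the value from the original Transformer paper) are assumptions.

```python
import math

import torch


def sinusoidal_position_embeddings(seq_len: int, d_model: int) -> torch.Tensor:
    """Return a (seq_len, d_model) table of fixed sinusoidal embeddings.

    Hypothetical helper; even columns hold sines, odd columns hold cosines.
    """
    if d_model % 2 != 0:
        raise ValueError("d_model must be even")
    # Column vector of positions 0..seq_len-1, shape (seq_len, 1).
    positions = torch.arange(seq_len, dtype=torch.float32).unsqueeze(1)
    # Inverse frequencies 1 / 10000^(2i / d_model), shape (d_model / 2,).
    inv_freq = torch.exp(
        torch.arange(0, d_model, 2, dtype=torch.float32)
        * (-math.log(10000.0) / d_model)
    )
    angles = positions * inv_freq  # broadcasts to (seq_len, d_model / 2)
    emb = torch.zeros(seq_len, d_model)
    emb[:, 0::2] = torch.sin(angles)
    emb[:, 1::2] = torch.cos(angles)
    return emb


table = sinusoidal_position_embeddings(seq_len=8, d_model=16)
print(table.shape)
```

The table is typically added to the token embeddings before the first attention layer. Because it has no learned parameters, it can be recomputed for any sequence length.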

rnn

Implement RNN
Implement LSTM
Implement GRU
Implement RWKV
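
To give a sense of what "basic ops only" can look like for the simplest item above, here is a hedged sketch of an Elman-style RNN built from matrix multiplies and `tanh` rather than `torch.nn.RNN`. The class name, parameter names, and initialization scale are illustrative assumptions and may not match the code in this repository.

```python
import torch


class ElmanRNN(torch.nn.Module):
    """Single-layer tanh RNN unrolled with an explicit Python loop."""

    def __init__(self, input_size: int, hidden_size: int):
        super().__init__()
        scale = hidden_size ** -0.5  # assumed init scale, chosen for illustration
        self.w_ih = torch.nn.Parameter(torch.randn(input_size, hidden_size) * scale)
        self.w_hh = torch.nn.Parameter(torch.randn(hidden_size, hidden_size) * scale)
        self.bias = torch.nn.Parameter(torch.zeros(hidden_size))

    def forward(self, x, h=None):
        # x has shape (batch, seq_len, input_size).
        batch, seq_len, _ = x.shape
        if h is None:
            h = x.new_zeros(batch, self.w_hh.shape[0])
        outputs = []
        for t in range(seq_len):
            # h_t = tanh(x_t W_ih + h_{t-1} W_hh + b)
            h = torch.tanh(x[:, t] @ self.w_ih + h @ self.w_hh + self.bias)
            outputs.append(h)
        # Stack per-step hidden states into (batch, seq_len, hidden_size).
        return torch.stack(outputs, dim=1), h


rnn = ElmanRNN(input_size=3, hidden_size=4)
out, last = rnn(torch.randn(2, 5, 3))
print(out.shape, last.shape)
```

LSTM and GRU follow the same unrolled structure, with gating added inside the per-step update.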

examples/wikitext

eval

Implement evaluation framework
Collect popular LM benchmarks and published metrics
