Stars
100M tokens. Infinite compute. Lowest val loss wins.
CIFAR-10 speedrun: Trains to 94% accuracy in 1.98 seconds on a single NVIDIA A100 GPU.
The Automated LLM Speedrunning Benchmark measures how well LLM agents can reproduce previous innovations and discover new ones in language modeling.
RWKV-7: Surpassing GPT
Meta Lingua: a lean, efficient, and easy-to-hack codebase to research LLMs.
A fusion of a linear layer and a cross-entropy loss, written for PyTorch in Triton.
Fast bare-bones BPE for modern tokenizer training
Release code for LA-MCTS and its application to Neural Architecture Search.
Machine Learning Engineering Open Book
Train to 94% on CIFAR-10 in <6.3 seconds on a single A100. Or ~95.79% in ~110 seconds (or less!)
Train ImageNet *fast* in 500 lines of code with FFCV
FFCV: Fast Forward Computer Vision (and other ML workloads!)
uBlock Origin - An efficient blocker for Chromium and Firefox. Fast and lean.
An Open Source alternative to Facebook, Reddit and Disqus
R package for plotting simple decision tree partitions
TriMap: Large-scale Dimensionality Reduction Using Triplets
Code etc. for the Hacker Dojo Deep Learning Study Group
A project template to simplify building and training deep learning models using Keras.
TensorFlow Implementation of Adversarial Attacks on Capsule Networks
A curated list of awesome resources related to capsule networks
OpenAI Request for Research - https://blog.openai.com/requests-for-research-2/
DAGAN: Data Augmentation Generative Adversarial Networks