Stars
MoE training for Me and You and maybe other people
Entropy Based Sampling and Parallel CoT Decoding
[ICLR 2024] Fine-tuning LLaMA to follow Instructions within 1 Hour and 1.2M Parameters
LLM papers I'm reading, mostly on inference and model compression
Run Llama-based LLMs in Unity entirely in compute shaders with no dependencies
The TinyLlama project is an open endeavor to pretrain a 1.1B Llama model on 3 trillion tokens.
Android ListView with drag-and-drop reordering.