8 stars written in C++
FlashMLA: Efficient Multi-head Latent Attention Kernels
CUDA Templates and Python DSLs for High-Performance Linear Algebra
A header-only C++ library for sketching in randomized linear algebra
Lab 5 project of MIT-6.5940, deploying LLaMA2-7B-chat on one's laptop with TinyChatEngine.