🤖 Working on ML and AI. Keep The Neurons Firing.
- Two Learn Tech
- Chattanooga, TN
- 21:19 (UTC -05:00)
- ORCID: https://orcid.org/0009-0003-4692-6861
- Reddit: u/Slight-Living-8098
- @badgids
- BackwoodsUncleBub
- Hugging Face: https://huggingface.co/Badgids
Starred repositories: 1 result written in CUDA
[ICLR2025, ICML2025, NeurIPS2025 Spotlight] Quantized Attention achieves speedup of 2-5x compared to FlashAttention, without losing end-to-end metrics across language, image, and video models.