blap

Bruno Pio blap

11 followers · 0 following

Achievements

x2 x2

Lists (2)

Build LLMs

316 repositories

LLMs

60 repositories

Stars

6 stars written in Cuda

karpathy / llm.c

LLM training in simple, raw C/CUDA

Cuda · 29,450 stars · 3,498 forks · Updated Jun 26, 2025

HazyResearch / ThunderKittens

Tile primitives for speedy kernels

Cuda · 3,307 stars · 275 forks · Updated Apr 8, 2026

thu-ml / SageAttention

[ICLR2025, ICML2025, NeurIPS2025 Spotlight] Quantized attention that achieves a 2-5x speedup over FlashAttention without losing end-to-end metrics across language, image, and video models.

Cuda · 3,281 stars · 389 forks · Updated Jan 17, 2026

thu-ml / SpargeAttn

[ICML2025] SpargeAttention: A training-free sparse attention that accelerates any model inference.

Cuda · 973 stars · 91 forks · Updated Feb 25, 2026

ColfaxResearch / cutlass-kernels

Cuda · 261 stars · 38 forks · Updated Jul 11, 2024

OpenBMB / CPM.cu

CPM.cu is a lightweight, high-performance CUDA implementation for LLMs, optimized for end-device inference and featuring cutting-edge techniques in sparse architecture, speculative sampling and qua…

Cuda · 236 stars · 22 forks · Updated Jan 14, 2026
