TheTom

Tom Turney TheTom

Working on LLM inference systems, KV cache compression, and kernel-level optimizations (TurboQuant).

546 followers · 2 following

Achievements

Organizations

Stars

3 results for forked starred repositories

TheTom / llama-cpp-turboquant

Forked from ggml-org/llama.cpp

LLM inference in C/C++

C++ · 1,822 stars · 321 forks · Updated Jun 13, 2026

spiritbuun / buun-llama-cpp

Forked from TheTom/llama-cpp-turboquant

LLAMA Turboquant implementation with CUDA support

C++ · 655 stars · 71 forks · Updated Jun 4, 2026

miolini / autoresearch-macos

Forked from karpathy/autoresearch

AI agents running research on single-GPU nanochat training automatically, adapted for macOS

Python · 2,233 stars · 328 forks · Updated Mar 17, 2026