ThomAub

Thomas ThomAub

Research Scientist - Machine Learning

59 followers · 370 following

Paris

Achievements

Starred repositories

11 stars written in Cuda

Clear filter

karpathy / llm.c

LLM training in simple, raw C/CUDA

Cuda 29,540 3,516 Updated Jun 26, 2025

HigherOrderCO / HVM2

A massively parallel, optimal functional runtime in Rust

Cuda 11,225 436 Updated Nov 21, 2024

HazyResearch / ThunderKittens

Tile primitives for speedy kernels

Cuda 3,312 275 Updated Apr 8, 2026

mirage-project / mirage

Mirage Persistent Kernel: Compiling LLMs into a MegaKernel

Cuda 2,187 194 Updated Apr 12, 2026

mit-han-lab / torchsparse

[MICRO'23, MLSys'22] TorchSparse: Efficient Training and Inference Framework for Sparse Convolution on GPUs.

Cuda 1,457 187 Updated Feb 24, 2025

openai / blocksparse

Efficient GPU kernels for block-sparse matrix multiplication and convolution

Cuda 1,065 198 Updated Jun 8, 2023

rapidsai / cuvs

cuVS - a library for vector search and clustering on the GPU

Cuda 732 180 Updated Apr 10, 2026

wangsiping97 / FastGEMV

High-speed GEMV kernels, at most 2.7x speedup compared to pytorch baseline.

Cuda 128 8 Updated Jul 13, 2024

aredden / torch-cublas-hgemm

PyTorch half precision gemm lib w/ fused optional bias + optional relu/gelu

Cuda 78 4 Updated Dec 3, 2024

spectral-compute / scale-examples

Cuda 67 2 Updated Jul 10, 2024

tqchen / flashinfer

Forked from flashinfer-ai/flashinfer

FlashInfer: Kernel Library for LLM Serving

Cuda 4 Updated Apr 17, 2024

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Thomas ThomAub

Achievements

Achievements

Block or report ThomAub

Starred repositories

karpathy / llm.c

HigherOrderCO / HVM2

HazyResearch / ThunderKittens

mirage-project / mirage

mit-han-lab / torchsparse

openai / blocksparse

rapidsai / cuvs

wangsiping97 / FastGEMV

aredden / torch-cublas-hgemm

spectral-compute / scale-examples

tqchen / flashinfer

Starred topics

Vim

Rust

Python