ThomAub

Thomas ThomAub

Research Scientist - Machine Learning

44 followers · 361 following

Paris

Starred repositories

10 results for source starred repositories written in Cuda

karpathy / llm.c

LLM training in simple, raw C/CUDA

Cuda · 28,437 stars · 3,334 forks · Updated Jun 26, 2025

HigherOrderCO / HVM

A massively parallel, optimal functional runtime in Rust

Cuda · 11,180 stars · 427 forks · Updated Nov 21, 2024

flashinfer-ai / flashinfer

FlashInfer: Kernel Library for LLM Serving

Cuda · 4,317 stars · 610 forks · Updated Dec 21, 2025

HazyResearch / ThunderKittens

Tile primitives for speedy kernels

Cuda · 3,008 stars · 217 forks · Updated Dec 9, 2025

mit-han-lab / torchsparse

[MICRO'23, MLSys'22] TorchSparse: Efficient Training and Inference Framework for Sparse Convolution on GPUs.

Cuda · 1,427 stars · 181 forks · Updated Feb 24, 2025

openai / blocksparse

Efficient GPU kernels for block-sparse matrix multiplication and convolution

Cuda · 1,063 stars · 198 forks · Updated Jun 8, 2023

rapidsai / cuvs

cuVS - a library for vector search and clustering on the GPU

Cuda · 596 stars · 147 forks · Updated Dec 20, 2025

wangsiping97 / FastGEMV

High-speed GEMV kernels, at most 2.7x speedup compared to pytorch baseline.

Cuda · 123 stars · 7 forks · Updated Jul 13, 2024

aredden / torch-cublas-hgemm

PyTorch half precision gemm lib w/ fused optional bias + optional relu/gelu

Cuda · 75 stars · 4 forks · Updated Dec 3, 2024

spectral-compute / scale-examples

Cuda · 65 stars · 2 forks · Updated Jul 10, 2024

Starred topics

Vim

Rust

Python