Starred repositories

10 results for source starred repositories written in Cuda

LLM training in simple, raw C/CUDA

Cuda · 28,437 stars · 3,334 forks · Updated Jun 26, 2025

A massively parallel, optimal functional runtime in Rust

Cuda · 11,180 stars · 427 forks · Updated Nov 21, 2024

FlashInfer: Kernel Library for LLM Serving

Cuda · 4,317 stars · 610 forks · Updated Dec 21, 2025

Tile primitives for speedy kernels

Cuda · 3,008 stars · 217 forks · Updated Dec 9, 2025

[MICRO'23, MLSys'22] TorchSparse: Efficient Training and Inference Framework for Sparse Convolution on GPUs.

Cuda · 1,427 stars · 181 forks · Updated Feb 24, 2025

Efficient GPU kernels for block-sparse matrix multiplication and convolution

Cuda · 1,063 stars · 198 forks · Updated Jun 8, 2023

cuVS - a library for vector search and clustering on the GPU

Cuda · 596 stars · 147 forks · Updated Dec 20, 2025

High-speed GEMV kernels, at most 2.7x speedup compared to pytorch baseline.

Cuda · 123 stars · 7 forks · Updated Jul 13, 2024

PyTorch half precision gemm lib w/ fused optional bias + optional relu/gelu

Cuda · 75 stars · 4 forks · Updated Dec 3, 2024