tuanavu

Tuan Vu tuanavu

488 followers · 21 following

Achievements

x3 x2

Stars

9 stars written in Cuda

karpathy / llm.c

LLM training in simple, raw C/CUDA

Cuda · 28,081 stars · 3,264 forks · Updated Jun 26, 2025

xlite-dev / LeetCUDA

📚LeetCUDA: Modern CUDA Learn Notes with PyTorch for Beginners🐑, 200+ CUDA Kernels, Tensor Cores, HGEMM, FA-2 MMA.🎉

Cuda · 8,324 stars · 825 forks · Updated Nov 6, 2025

mit-han-lab / torchsparse

[MICRO'23, MLSys'22] TorchSparse: Efficient Training and Inference Framework for Sparse Convolution on GPUs.

Cuda · 1,409 stars · 177 forks · Updated Feb 24, 2025

Liu-xiandong / How_to_optimize_in_GPU

This is a series of GPU optimization topics. Here we will introduce how to optimize the CUDA kernel in detail. I will introduce several basic kernel optimizations, including: elementwise, reduce, s…

Cuda · 1,183 stars · 172 forks · Updated Jul 29, 2023

NVIDIA / nvbench

CUDA Kernel Benchmarking Library

Cuda · 759 stars · 90 forks · Updated Oct 21, 2025

ulrichstern / cuda-convnet

Alex Krizhevsky's original code from Google Code

Cuda · 198 stars · 32 forks · Updated Mar 10, 2016

drkennetz / cuda_examples

Some CUDA example code with READMEs.

Cuda · 176 stars · 26 forks · Updated Mar 2, 2025

NVIDIA-Merlin / HierarchicalKV

HierarchicalKV is a part of NVIDIA Merlin and provides hierarchical key-value storage to meet RecSys requirements. The key capability of HierarchicalKV is to store key-value feature-embeddings on h…

Cuda · 175 stars · 30 forks · Updated Nov 2, 2025

Phoenix8215 / BuildCudaNeuralNetworkFromScratch

Build CUDA Neural Network From Scratch

Cuda · 21 stars · 1 fork · Updated Aug 28, 2024
