10 stars written in Cuda

NCCL Tests

Cuda 1,472 361 Updated Mar 11, 2026

Flash Attention in ~100 lines of CUDA (forward pass only)

Cuda 1,102 110 Updated Dec 30, 2024

Efficient GPU kernels for block-sparse matrix multiplication and convolution

Cuda 1,064 198 Updated Jun 8, 2023

Training materials associated with NVIDIA's CUDA Training Series (www.olcf.ornl.gov/cuda-training-series/)

Cuda 956 348 Updated Aug 19, 2024

A CUDNN minimal deep learning training code sample using LeNet.

Cuda 268 93 Updated Jul 30, 2023

Cuda 132 16 Updated Mar 19, 2026

Deep neural network framework for multiple GPUs

Cuda 34 15 Updated Jun 20, 2015

A GPU performance prediction toolkit for CUDA programs

Cuda 19 4 Updated Mar 25, 2019

Public data from research

Cuda 7 2 Updated Oct 10, 2016