Organizations

@pingcap @prism-river @onerepo @hw-standalonecomplex @tikv @b3fs @go-kratos

Starred repositories

12 stars written in Cuda

📚LeetCUDA: Modern CUDA Learn Notes with PyTorch for Beginners🐑, 200+ CUDA Kernels, Tensor Cores, HGEMM, FA-2 MMA.🎉

Cuda 10,828 1,093 Updated Apr 20, 2026

Fast parallel CTC.

Cuda 4,073 1,033 Updated Mar 4, 2024

Introduction to Parallel Programming class code

Cuda 1,349 1,146 Updated Jun 27, 2022

GPU database engine

Cuda 1,170 120 Updated Jan 30, 2017

Source code that accompanies The CUDA Handbook.

Cuda 572 197 Updated Mar 10, 2026

Static suckless single batch CUDA-only qwen3-0.6B mini inference engine

Cuda 554 48 Updated Sep 8, 2025

A simple GPU hash table implemented in CUDA using lock free techniques

Cuda 406 44 Updated Feb 7, 2024

Facebook's CUDA extensions.

Cuda 284 57 Updated Mar 27, 2019

Partial Order Pruning: for Best Speed/Accuracy Trade-off in Neural Architecture Search

Cuda 146 26 Updated Sep 23, 2020

CUDA implementation of the Floyd-Warshall all-pairs shortest path graph algorithm (with path reconstruction)

Cuda 39 15 Updated Sep 18, 2014

Python wrappers for fast NMF training using CUDA, MKL, and ATLAS

Cuda 10 1 Updated Feb 27, 2018

4x4 matrix sum game

Cuda 7 2 Updated Jun 30, 2016