
Organizations

@chapi-com @Mercuri-Inc

Starred repositories

11 results for starred source repositories written in Cuda

A massively parallel, optimal functional runtime in Rust

Cuda 11,205 434 Updated Nov 21, 2024

DeepGEMM: clean and efficient FP8 GEMM kernels with fine-grained scaling

Cuda 6,158 813 Updated Feb 3, 2026

Tile primitives for speedy kernels

Cuda 3,122 235 Updated Feb 5, 2026

CUDA Kernel Benchmarking Library

Cuda 808 102 Updated Feb 5, 2026

Fastest kernels written from scratch

Cuda 532 64 Updated Sep 18, 2025

State of the art sorting and segmented sorting, including OneSweep. Implemented in CUDA, D3D12, and Unity style compute shaders. Theoretically portable to all wave/warp/subgroup sizes.

Cuda 425 27 Updated Dec 14, 2024

Parrot is a C++ library for fused array operations using CUDA/Thrust. It provides efficient GPU-accelerated operations with lazy evaluation semantics, allowing for chaining of operations without un…

Cuda 247 15 Updated Jan 29, 2026

TileFusion is an experimental C++ macro kernel template library that elevates the abstraction level in CUDA C for tile processing.

Cuda 106 6 Updated Jun 28, 2025
Cuda 43 13 Updated May 21, 2021

FlashFFTStencil: Bridging Fast Fourier Transforms to Memory-Efficient Stencil Computations on Tensor Core Units (PPoPP'25)

Cuda 6 3 Updated Jan 9, 2025
Cuda 1 Updated Jul 30, 2023