PWhiddy

Peter Whidden PWhiddy

1.5k followers · 157 following

Seattle WA
transdimensional.xyz

Achievements

x4 x2

Stars

16 stars written in Cuda

NVlabs / instant-ngp

Instant neural graphics primitives: lightning fast NeRF and more

Cuda · 17,133 stars · 2,037 forks · Updated Dec 2, 2025

HigherOrderCO / HVM

A massively parallel, optimal functional runtime in Rust

Cuda · 11,175 stars · 425 forks · Updated Nov 21, 2024

luanfujun / deep-painterly-harmonization

Code and data for paper "Deep Painterly Harmonization": https://arxiv.org/abs/1804.03189

Cuda · 6,058 stars · 615 forks · Updated Aug 2, 2021

NVIDIA / cub

[ARCHIVED] Cooperative primitives for CUDA C++. See https://github.com/NVIDIA/cccl

Cuda · 1,808 stars · 464 forks · Updated Oct 9, 2023

k2-fsa / k2

FSA/FST algorithms, differentiable, with PyTorch compatibility.

Cuda · 1,293 stars · 232 forks · Updated Nov 19, 2025

openai / blocksparse

Efficient GPU kernels for block-sparse matrix multiplication and convolution

Cuda · 1,061 stars · 198 forks · Updated Jun 8, 2023

siboehm / SGEMM_CUDA

Fast CUDA matrix multiplication from scratch

Cuda · 977 stars · 146 forks · Updated Sep 2, 2025

b0nes164 / GPUSorting

State of the art sorting and segmented sorting, including OneSweep. Implemented in CUDA, D3D12, and Unity style compute shaders. Theoretically portable to all wave/warp/subgroup sizes.

Cuda · 408 stars · 25 forks · Updated Dec 14, 2024

vernamlab / cuFHE

CUDA-accelerated Fully Homomorphic Encryption Library

Cuda · 236 stars · 61 forks · Updated Jul 7, 2021

canonizer / halloc

A fast and highly scalable GPU dynamic memory allocator

Cuda · 110 stars · 9 forks · Updated Mar 11, 2015

oresths / tSparse

A GPU algorithm for sparse matrix-matrix multiplication

Cuda · 73 stars · 16 forks · Updated Oct 1, 2020

mark-poscablo / gpu-radix-sort

CUDA implementation of parallel radix sort using Blelloch scan

Cuda · 66 stars · 16 forks · Updated Feb 29, 2024

covexp / cuda-noise

Library of common noise functions for CUDA kernels

Cuda · 41 stars · 7 forks · Updated Aug 17, 2025

knotman90 / cuStreamComp

Efficient CUDA Stream Compaction Library

Cuda · 35 stars · 6 forks · Updated Jun 9, 2023

b0nes164 / OneSweep

A simple library-less CUDA implementation of the OneSweep sorting algorithm.

Cuda · 11 stars · Updated Feb 26, 2024

kevmo314 / prospero.vm

Cuda · 4 stars · Updated Mar 24, 2025
