Organizations

@linebender @gridwise-webgpu

Formally verified polygon intersection

Lean 42 3 Updated May 30, 2026

Build your own high performance LLM inference engine in C++ and CUDA - a smaller version of vLLM

C++ 793 51 Updated Apr 14, 2026

GPU-accelerated triangle mesh processing

Cuda 311 43 Updated Jun 7, 2026

Production-ready K-Means clustering for Apache Spark with pluggable Bregman divergences (KL, Itakura-Saito, L1, etc). 6 algorithms, 740 tests, cross-version persistence. Drop-in replacement for MLl…

Scala 342 53 Updated Feb 14, 2026

Benchmarks for locking algorithms as well as implementations of locking algorithms.

C++ 25 4 Updated Mar 6, 2018
Objective-C 3 3 Updated Oct 31, 2023

An implementation of a workgroup occupancy discovery protocol and an inter-workgroup barrier, with example applications.

C++ 6 1 Updated Sep 7, 2016

Automatic build of dawn (WebGPU) for Windows

Batchfile 65 13 Updated Jun 14, 2026

Apple GPU microarchitecture

Metal 613 29 Updated Sep 22, 2024

Small header-only C library to decompress any BC compressed image

C 187 24 Updated Oct 23, 2025

libcubwt is a library for GPU-accelerated suffix array and Burrows-Wheeler transform construction.

Cuda 40 2 Updated Aug 14, 2025

Core WebGPU shaders

TypeScript 16 1 Updated Aug 18, 2024

NHW: A Next-Generation Image Compression Codec

C 75 9 Updated Jun 10, 2026

[NeurIPS 2024 Spotlight] "LightGaussian: Unbounded 3D Gaussian Compression with 15x Reduction and 200+ FPS", Zhiwen Fan, Kevin Wang, Kairun Wen, Zehao Zhu, Dejia Xu, Zhangyang Wang

Python 802 76 Updated Dec 30, 2024

A family of header-only, very fast and memory-friendly hashmap and btree containers.

C++ 3,190 308 Updated Jun 10, 2026

Easy to integrate Vulkan memory allocation library

C 3,377 439 Updated Jun 4, 2026
Cuda 12 7 Updated Dec 17, 2023

Extremely fast non-cryptographic hash algorithm

C 11,073 901 Updated Jun 8, 2026

A GPU compute-centric 2D renderer.

Rust 4,092 259 Updated Jun 13, 2026

GPU implementation of a fast generalized ANS (asymmetric numeral system) entropy encoder and decoder, with extensions for lossless compression of numerical and other data types in HPC/ML applications.

Cuda 394 34 Updated Mar 18, 2026

Non-blocking screen capture example with asynchronous GPU readback

C# 405 48 Updated Apr 2, 2023

A shader system built using staged metaprogramming

Lua 15 3 Updated Jul 9, 2022

GPU-Accelerated Lossless Data Compressors Survey

Cuda 124 11 Updated Sep 10, 2020

A Visual Studio extension that provides enhanced support for editing High Level Shading Language (HLSL) files

HLSL 631 105 Updated Apr 9, 2026

A simple GPU hash table implemented in CUDA using lock-free techniques

Cuda 406 44 Updated Feb 7, 2024