A nearly complete collection of prefix sum algorithms implemented in CUDA, D3D12, Unity and WGPU. Theoretically portable to all wave/warp/subgroup sizes.

C++ 294 11 Updated Jan 29, 2025

callummcdougall / ARENA_3.0

Jupyter Notebook 1,081 698 Updated May 12, 2026

GiuseppeCesarano / pside

A modern causal profiler built leveraging Linux tracepoints

Zig 11 Updated May 13, 2026

visutwin / visutwin-canvas

VisuTwin Canvas

C++ 11 Updated May 7, 2026

protobuf-c / protobuf-c

Protocol Buffers implementation in C

C++ 2,965 767 Updated Apr 7, 2025

mrdoob / three.wasm

8x Faster JavaScript 3D Library.

TypeScript 538 34 Updated Apr 2, 2026

shuveb / zerohttpd

A simple HTTP server written from scratch as a tool for teaching Unix network program architectures

C 396 56 Updated Apr 28, 2019

jmaczan / tiny-vllm

Build your own high performance LLM inference engine in C++ and CUDA - a smaller version of vLLM

C++ 132 7 Updated Apr 14, 2026

floooh / oryol

A small, portable and extensible C++ 3D coding framework

C++ 2,062 203 Updated Feb 6, 2023

ikawrakow / ik_llamafile

Forked from mozilla-ai/llamafile

Distribute and run LLMs with a single file.

C++ 24 Updated May 13, 2025

xlite-dev / LeetCUDA

📚LeetCUDA: Modern CUDA Learn Notes with PyTorch for Beginners🐑, 200+ CUDA Kernels, Tensor Cores, HGEMM, FA-2 MMA.🎉

Cuda 10,995 1,108 Updated May 3, 2026

chenglou / pretext

Fast, accurate & comprehensive text measurement & layout

TypeScript 47,017 2,595 Updated May 11, 2026

flashinfer-ai / flashinfer-bench

Building the Virtuous Cycle for AI-driven LLM Systems

Python 227 40 Updated May 1, 2026

jaywyawhare / C-ML

Machine learning framework written in C.

C 104 13 Updated Apr 25, 2026

behdad / glyphy

GLyphy is an implementation of the Slug algorithm for GPU text rasterization

C++ 839 80 Updated Mar 30, 2026

ROCm / aiter

AI Tensor Engine for ROCm

Python 433 313 Updated May 16, 2026

diffusionstudio / slug-webgpu

WebGPU implementation of Eric Lengyel's Slug algorithm for resolution-independent vector text rendering on the GPU

TypeScript 190 3 Updated Mar 25, 2026

achal achalpandeyy

Stars