KernelBench: Can LLMs Write GPU Kernels? - Benchmark + Toolkit with Torch -> CUDA (+ more DSLs)

Jupyter Notebook 1,069 173 Updated Mar 24, 2026

A single CLAUDE.md file to improve Claude Code behavior, derived from Andrej Karpathy's observations on LLM coding pitfalls.

178,675 18,259 Updated Apr 20, 2026

A kernel library written in tilelang

Python 1,596 140 Updated Apr 23, 2026

C 101 24 Updated Jun 6, 2026

OpenCode plugin that uses your existing Claude Code credentials — no separate login needed.

TypeScript 1,094 139 Updated May 15, 2026

The open source coding agent.

TypeScript 176,268 21,461 Updated Jun 19, 2026

Open-source CUDA, Triton and HIP compiler targeting multiple GPU and CPU architectures.

C 1,699 87 Updated Jun 17, 2026

Documentation for the Mainboard and printable mechanical parts in the Framework Desktop

OpenSCAD 302 20 Updated Nov 30, 2025

A project trying to build a hoverboard controller without semiconductors

20 1 Updated Nov 24, 2025

Code, labs, and resources for O'Reilly AI Systems Performance Engineering: GPU optimization, distributed training, inference scaling, and full-stack tuning.

Python 1,597 226 Updated Jun 18, 2026

A machine learning accelerator core designed for energy-efficient AI at the edge.

Emacs Lisp 2,406 293 Updated Jun 17, 2026

The best ChatGPT that $100 can buy.

Python 55,222 7,586 Updated May 5, 2026

Memory Optimizations for Deep Learning (ICML 2023)

Python 122 15 Updated Mar 13, 2024

Exocompilation for productive programming of hardware accelerators

Python 731 55 Updated May 16, 2026

Mirage Persistent Kernel: Compiling LLMs into a MegaKernel

Cuda 2,322 222 Updated Jun 18, 2026

Rocq Prover 370 12 Updated Sep 20, 2025

Minimal reproduction of DeepSeek R1-Zero

Python 13,174 1,585 Updated Feb 27, 2026

NanoGPT (124M) in 90 seconds

Python 5,417 811 Updated Jun 19, 2026

Open-source high-performance RISC-V processor

Scala 7,074 913 Updated Jun 19, 2026

Type annotations and runtime checking for shape and dtype of JAX/NumPy/PyTorch/etc. arrays. https://docs.kidger.site/jaxtyping/

Python 1,830 90 Updated Jun 13, 2026

the official Rust and C implementations of the BLAKE3 cryptographic hash function

Assembly 6,290 462 Updated May 21, 2026

Minimalistic, extremely fast, and hackable researcher's toolbench for GPT models in 307 lines of code. Reaches <3.8 validation loss on wikitext-103 on a single A100 in <100 seconds. Scales to large…

Python 357 27 Updated Jul 29, 2024

Entropy Based Sampling and Parallel CoT Decoding

Python 3,436 321 Updated Nov 13, 2024

A free and strong UCI chess engine

C++ 15,870 2,921 Updated Jun 14, 2026

parallelized hyperdimensional tictactoe

Python 127 2 Updated Aug 25, 2024

Nvidia Instruction Set Specification Generator

Python 339 23 Updated Jul 9, 2024