Stars
Open-source CUDA compiler targeting AMD GPUs (and more in the future!). Compiles .cu to GFX11 machine code.
Documentation for the Mainboard and printable mechanical parts in the Framework Desktop
A project trying to build a hoverboard controller without semiconductors
A machine learning accelerator core designed for energy-efficient AI at the edge.
Memory Optimizations for Deep Learning (ICML 2023)
Exocompilation for productive programming of hardware accelerators
Mirage Persistent Kernel: Compiling LLMs into a MegaKernel
Minimal reproduction of DeepSeek R1-Zero
Open-source high-performance RISC-V processor
Type annotations and runtime checking for shape and dtype of JAX/NumPy/PyTorch/etc. arrays. https://docs.kidger.site/jaxtyping/
The official Rust and C implementations of the BLAKE3 cryptographic hash function
Minimalistic, extremely fast, and hackable researcher's toolbench for GPT models in 307 lines of code. Reaches <3.8 validation loss on wikitext-103 on a single A100 in <100 seconds. Scales to large…
Entropy Based Sampling and Parallel CoT Decoding
A free and strong UCI chess engine
Parallelized hyperdimensional tic-tac-toe
Nvidia Instruction Set Specification Generator
Simplifying reinforcement learning for complex game environments
Python script to stream EEG data from the Muse 2016 headset
Port of the leading 3DS emulator, Citra — designed for playing 3DS homebrew and personal game backups in 3D on the go with your Quest.
⚡ A Fast, Extensible Progress Bar for Python and CLI