Stars
Skills for writing tilelang and debugging with CUDA toolkits.
A kernel library written in tilelang
An always-running agent that is trustworthy and secure.
Control Claude Code from your Apple Watch
DFVG: A Heterogeneous Architecture for Speculative Decoding with Draft-on-FPGA and Verify-on-GPU.
A Rust OS kernel autonomously implemented by Claude Code Opus/Sonnet.
Tree-structured context management for Claude Code via MCP server
RAFT contains fundamental widely-used algorithms and primitives for machine learning and information retrieval. The algorithms are CUDA-accelerated and form building blocks for more easily writing …
Check your Claude/Codex/Gemini usage and learn how much money you save
Fast, small, and fully autonomous AI personal assistant infrastructure, ANY OS, ANY PLATFORM — deploy anywhere, swap anything 🦀
Tile-Based Runtime for Ultra-Low-Latency LLM Inference
A Next.js web application that integrates AI capabilities with draw.io diagrams. This app allows you to create, modify, and enhance diagrams through natural language commands and AI-assisted visual…
cuTile is a programming model for writing parallel kernels for NVIDIA GPUs
Helpful kernel tutorials, examples and SKILLs for tile-based GPU programming
[ICML 2026] A framework to compare low-bit integer and floating-point formats
GitHub Action to compile LaTeX documents
A machine learning accelerator core designed for energy-efficient AI at the edge.
Fast Hadamard transform in CUDA, with a PyTorch interface