Stars
Agentic Kernel Optimization for All — automated GPU kernel optimization for any kernel, any hardware, any language
Conveniently export torch.compile compiled products into self-contained Python files
cuDNN Frontend is NVIDIA's modern, open-source entry point to the cuDNN library and a growing collection of high-performance open-source kernels.
Wrap Antigravity, ChatGPT Codex, Claude Code, Grok Build as an OpenAI/Gemini/Claude/Codex compatible API service, allowing you to enjoy the free Gemini 3.1 Pro, GPT 5.5, Grok 4.3, Claude model thro…
Improve viewing Markdown in Neovim
A hackable Markdown, Typst, LaTeX, HTML (inline) & AsciiDoc previewer for Neovim
High-performance linear attention kernel library built on TileLang
Causal depthwise conv1d in CUDA, with a PyTorch interface
A kernel library written in TileLang
FlashKDA: high-performance Kimi Delta Attention kernels
CUDA kernels for linear attention variants, written in CuTe DSL and CUTLASS C++.
Warcraft III Peon voice notifications (+ more!) for Claude Code, Codex, IDEs, and any AI agent. Stop babysitting your terminal. Employ a Peon today.
Fast, Sharp & Reliable Agentic Intelligence
Ray is an AI compute engine. Ray consists of a core distributed runtime and a set of AI Libraries for accelerating ML workloads.
SQL databases in Python, designed for simplicity, compatibility, and robustness.
Submit stacked diffs to GitHub on the command line
Conditional Memory via Scalable Lookup: A New Axis of Sparsity for Large Language Models
If you live in the terminal, kitty is made for you! Cross-platform, fast, feature-rich, GPU based.
Python SDK, Proxy Server (AI Gateway) to call 100+ LLM APIs in OpenAI (or native) format, with cost tracking, guardrails, load balancing and logging. [Bedrock, Azure, OpenAI, VertexAI, Cohere, Anthr…
A simple, fast and user-friendly alternative to 'find'
A tiling window manager for macOS based on binary space partitioning
Accelerating MoE with IO and Tile-aware Optimizations