- Hamilton, Ontario, Canada
- https://brendanduke.ca
Stars
TurboQuant: Near-optimal KV cache quantization for LLM inference (3-bit keys, 2-bit values) with Triton kernels + vLLM integration
Adds /goal functionality to OpenCode, similar to that used in Codex and Claude Code.
Ideogram 4: Open image model at the forefront of design
NVIDIA FastGen: Fast Generation from Diffusion Models
Conveniently export torch.compile compiled products into self-contained Python files
TokenSpeed is a speed-of-light LLM inference engine.
Ready-to-use ML training recipes to help you build and deploy models on Baseten.
The lightweight framework for building agents
A fast type checker and language server for Python
Performant kernels, and other ML Systems integrations
From a+b to sparsemax(QK^T)V in Triton!
Anthropic's original performance take-home, now open for you to try!
Domain-specific language designed to streamline the development of high-performance GPU/CPU/accelerator kernels
A kernel library written in tilelang
CUDA kernels for linear attention variants, written in CuTe DSL and CUTLASS C++.
Use Codex from Claude Code to review code or delegate tasks.
Region-level profiling for CUDA kernels with trace, NVBit, CUPTI, NSys, and an interactive Explorer.
Dashboard for InferenceX™, Open Source Continuous Inference
Skills for Real Engineers. Straight from my .claude directory.
Train speculative decoding models effortlessly and port them smoothly to SGLang serving.
Agent Lattice: a knowledge graph for your codebase, written in markdown.
A markdown native slides tool for academics building with agents.
CLI proxy that reduces LLM token consumption by 60-90% on common dev commands. Single Rust binary, zero dependencies
Gemini auth plugin for opencode