View dukebw's full-sized avatar

Python 312 29 Updated Jun 15, 2026

TurboQuant: Near-optimal KV cache quantization for LLM inference (3-bit keys, 2-bit values) with Triton kernels + vLLM integration

Python 1,568 181 Updated Mar 27, 2026

Adds /goal functionality, similar to that used in Codex and Claude Code, to OpenCode.

JavaScript 92 7 Updated Jun 15, 2026

Ideogram 4: Open image model at the forefront of design

Python 2,084 206 Updated Jun 4, 2026

NVIDIA FastGen: Fast Generation from Diffusion Models

Python 812 64 Updated Jun 7, 2026

Conveniently export torch.compile compiled products into self-contained Python files

Python 33 2 Updated Jun 5, 2026

TokenSpeed is a speed-of-light LLM inference engine.

Python 1,447 158 Updated Jun 16, 2026

Ready-to-use ML training recipes to help you build and deploy models on Baseten.

Python 55 8 Updated Jun 12, 2026

The lightweight framework for building agents

Python 445 48 Updated Jun 16, 2026

AI Tensor Engine for ROCm

Python 463 353 Updated Jun 16, 2026

A modern alternative to ls

Rust 22,293 454 Updated May 31, 2026

A fast type checker and language server for Python

Rust 6,657 405 Updated Jun 16, 2026

Performant kernels and other ML systems integrations

Cuda 7 2 Updated Jun 8, 2026

From a+b to sparsemax(QK^T)V in Triton!

Jupyter Notebook 34 Updated Jun 19, 2025

Anthropic's original performance take-home, now open for you to try!

Python 3,893 888 Updated Jan 22, 2026

Domain-specific language designed to streamline the development of high-performance GPU/CPU/accelerator kernels

Python 6,506 604 Updated Jun 16, 2026

A kernel library written in tilelang

Python 1,591 139 Updated Apr 23, 2026

CUDA kernels for linear attention variants, written in CuTe DSL and CUTLASS C++.

Python 522 64 Updated Jun 12, 2026

Use Codex from Claude Code to review code or delegate tasks.

JavaScript 21,113 1,278 Updated Jun 14, 2026

Region-level profiling for CUDA kernels with trace, NVBit, CUPTI, NSys, and an interactive Explorer.

Python 118 11 Updated Apr 17, 2026

Dashboard for InferenceX™, Open Source Continuous Inference

TypeScript 30 9 Updated Jun 16, 2026
Python 232 8 Updated Oct 27, 2025

Skills for Real Engineers. Straight from my .claude directory.

Shell 131,839 11,477 Updated Jun 12, 2026

Train speculative decoding models effortlessly and port them smoothly to SGLang serving.

Python 892 255 Updated Jun 16, 2026

Agent Lattice: a knowledge graph for your codebase, written in markdown.

TypeScript 1,651 107 Updated Apr 2, 2026

A monitor of resources

C++ 32,880 1,054 Updated Jun 6, 2026

A markdown native slides tool for academics building with agents.

Python 207 11 Updated Jun 4, 2026

CLI proxy that reduces LLM token consumption by 60-90% on common dev commands. Single Rust binary, zero dependencies

Rust 62,965 3,884 Updated Jun 16, 2026

Gemini auth plugin for opencode

TypeScript 1,674 114 Updated Jun 3, 2026