View dukebw's full-sized avatar

Python 1,880 151 Updated Dec 21, 2025

Low overhead tracing library and trace visualizer for pipelined CUDA kernels

C 127 5 Updated Nov 26, 2025

AMD RAD's multi-GPU Triton-based framework for seamless multi-GPU programming

Python 140 27 Updated Dec 20, 2025

Open ABI and FFI for Machine Learning Systems

C++ 257 43 Updated Dec 20, 2025

A Quirky Assortment of CuTe Kernels

Python 705 64 Updated Dec 16, 2025

kernels, of the mega variety

Python 631 34 Updated Sep 28, 2025

A collection of GPU experiments and benchmarks for my personal understanding and research.

Cuda 16 3 Updated Dec 8, 2025
Cuda 43 10 Updated Dec 10, 2025

NUMA-aware multi-CPU multi-GPU data transfer benchmarks

C++ 26 3 Updated Oct 26, 2023
Python 17 1 Updated Oct 29, 2025

slime is an LLM post-training framework for RL Scaling.

Python 2,925 353 Updated Dec 21, 2025

A CPU+GPU Profiling library that provides access to timeline traces and hardware performance counters.

HTML 907 216 Updated Dec 19, 2025

An early research stage expert-parallel load balancer for MoE models based on linear programming.

Python 471 27 Updated Nov 19, 2025

An open-source AI agent that brings the power of Grok directly into your terminal.

TypeScript 2,166 285 Updated Nov 27, 2025

Modular RDMA Interface

C++ 67 15 Updated Dec 19, 2025

Fast and Furious AMD Kernels

C++ 324 40 Updated Dec 19, 2025

An open-source C++ library developed and used at Facebook.

C++ 30,160 5,825 Updated Dec 19, 2025

Everything you need to know about LLM inference

TypeScript 251 22 Updated Dec 17, 2025

Minimal effort CLIs derived from type hints and parse from command line, config files and environment variables

Python 408 62 Updated Dec 11, 2025

Open-source industrial-grade ASR models supporting Mandarin, Chinese dialects and English, achieving a new SOTA on public Mandarin ASR benchmarks, while also offering outstanding singing lyrics rec…

Python 1,678 153 Updated Dec 21, 2025

A high-performance inference engine for LLMs, optimized for diverse AI accelerators.

C++ 828 101 Updated Dec 19, 2025

NVSentinel is a cross-platform fault remediation service designed to rapidly remediate runtime node-level issues in GPU-accelerated computing environments

Go 128 29 Updated Dec 21, 2025

🤗 Diffusers: State-of-the-art diffusion models for image, video, and audio generation in PyTorch.

Python 32,158 6,626 Updated Dec 21, 2025

Distributed training framework for TensorFlow, Keras, PyTorch, and Apache MXNet.

Python 14,644 2,256 Updated Dec 1, 2025

One-for-All Multimodal Evaluation Toolkit Across Text, Image, Video, and Audio Tasks

Python 3,404 461 Updated Dec 18, 2025

The comprehensive WSGI web application library.

Python 6,826 1,753 Updated Dec 2, 2025

A tiny little JSON parsing library

C 1,423 37 Updated Sep 21, 2025

Autonomous coding agent right in your IDE, capable of creating/editing files, executing commands, using the browser, and more with your permission every step of the way.

TypeScript 56,292 5,470 Updated Dec 21, 2025