- Hamilton, Ontario, Canada
- https://brendanduke.ca
Stars
Low overhead tracing library and trace visualizer for pipelined CUDA kernels
AMD RAD's multi-GPU Triton-based framework for seamless multi-GPU programming
A collection of GPU experiments and benchmarks for my personal understanding and research.
NUMA-aware multi-CPU multi-GPU data transfer benchmarks
slime is an LLM post-training framework for RL Scaling.
A CPU+GPU Profiling library that provides access to timeline traces and hardware performance counters.
An early research stage expert-parallel load balancer for MoE models based on linear programming.
An open-source AI agent that brings the power of Grok directly into your terminal.
An open-source C++ library developed and used at Facebook.
Everything you need to know about LLM inference
Minimal-effort CLIs derived from type hints that parse from the command line, config files, and environment variables
Open-source industrial-grade ASR models supporting Mandarin, Chinese dialects and English, achieving a new SOTA on public Mandarin ASR benchmarks, while also offering outstanding singing lyrics rec…
A high-performance inference engine for LLMs, optimized for diverse AI accelerators.
NVSentinel is a cross-platform fault remediation service designed to rapidly remediate runtime node-level issues in GPU-accelerated computing environments
🤗 Diffusers: State-of-the-art diffusion models for image, video, and audio generation in PyTorch.
Distributed training framework for TensorFlow, Keras, PyTorch, and Apache MXNet.
One-for-All Multimodal Evaluation Toolkit Across Text, Image, Video, and Audio Tasks
The comprehensive WSGI web application library.
Autonomous coding agent right in your IDE, capable of creating/editing files, executing commands, using the browser, and more with your permission every step of the way.