  • Central Oregon
  • 15:01 (UTC -08:00)
  • X @epwalsh

Organizations

@allenai @ISU-DMC @structurely @rusty-celery


🚀 Efficient implementations of state-of-the-art linear attention models

Python 4,077 331 Updated Dec 18, 2025

Neovim plugin for a code outline window

Lua 2,166 107 Updated Nov 25, 2025

A Quirky Assortment of CuTe Kernels

Python 696 64 Updated Dec 16, 2025

Ship correct and fast LLM kernels to PyTorch

Python 126 15 Updated Dec 18, 2025

A Python-embedded DSL that makes it easy to write fast, scalable ML kernels with minimal boilerplate.

Python 688 89 Updated Dec 18, 2025

Simple and efficient DeepSeek V3 SFT using pipeline parallel and expert parallel, with both FP8 and BF16 training

Python 101 19 Updated Jul 27, 2025

macOS system monitor in your menu bar

Swift 35,323 1,128 Updated Dec 7, 2025

Lightweight yet powerful formatter plugin for Neovim

Lua 4,746 263 Updated Dec 14, 2025

Primary and community-submitted packages for webinstall.dev

Shell 2,655 291 Updated Oct 21, 2025

PyTorch bindings for CUTLASS grouped GEMM.

Cuda 174 46 Updated Dec 16, 2025

Configuration with Dataclasses+YAML+Argparse. Fork of Pyrallis

Python 74 16 Updated Oct 30, 2025

Elegant easy-to-use neural networks + scientific computing in JAX. https://docs.kidger.site/equinox/

Python 2,719 177 Updated Dec 5, 2025

A simple, performant and scalable Jax LLM!

Python 2,046 441 Updated Dec 18, 2025

PyTorch emulation library for Microscaling (MX)-compatible data formats

Python 326 41 Updated Jun 18, 2025

PyTorch building blocks for the OLMo ecosystem

Python 590 107 Updated Dec 18, 2025

Fault tolerance for PyTorch (HSDP, LocalSGD, DiLoCo, Streaming DiLoCo)

Python 457 53 Updated Dec 6, 2025

GPU programming related news and material links

1,872 110 Updated Sep 17, 2025

Ring attention implementation with flash attention

Python 947 90 Updated Sep 10, 2025

Efficient Triton Kernels for LLM Training

Python 5,955 451 Updated Dec 18, 2025

PyTorch bindings for CUTLASS grouped GEMM.

Cuda 134 79 Updated May 29, 2025

PyTorch implementation of models from the Zamba2 series.

Python 186 17 Updated Jan 23, 2025

Mamba SSM architecture

Python 16,756 1,541 Updated Nov 11, 2025

For optimization algorithm research and development.

Python 553 60 Updated Dec 16, 2025

Tips for Writing a Research Paper using LaTeX

TeX 3,636 404 Updated May 4, 2023

A library for accelerating Transformer models on NVIDIA GPUs, including using 8-bit and 4-bit floating point (FP8 and FP4) precision on Hopper, Ada and Blackwell GPUs, to provide better performance…

Python 3,011 581 Updated Dec 18, 2025

Quick implementation of nGPT, learning entirely on the hypersphere, from NvidiaAI

Python 294 24 Updated Jun 3, 2025
Python 1,510 219 Updated Jun 26, 2025

PyTorch native quantization and sparsity for training and inference

Python 2,579 386 Updated Dec 18, 2025

Simple, safe way to store and distribute tensors

Python 3,557 285 Updated Dec 18, 2025

Microsoft Automatic Mixed Precision Library

Python 630 49 Updated Dec 1, 2025