LinearKAN: A very fast implementation of Kolmogorov-Arnold Networks

Python 20 1 Updated Nov 12, 2025

CUDA Embedding Lookup Kernel Library

Cuda 43 5 Updated Feb 9, 2026

Development repository for the Triton language and compiler

MLIR 18,796 2,713 Updated Mar 30, 2026
Python 1,625 159 Updated Feb 23, 2026

A collection of full time roles in SWE, Quant, and PM for new grads.

16,589 1,271 Updated Mar 30, 2026

A high-throughput and memory-efficient inference and serving engine for LLMs

Python 74,696 14,952 Updated Mar 30, 2026

Tilus is a tile-level kernel programming language with explicit control over shared memory and registers.

Python 464 18 Updated Mar 29, 2026

LeetGPU Challenges

Python 719 69 Updated Mar 28, 2026

Efficient Triton Kernels for LLM Training

Python 6,243 507 Updated Mar 28, 2026

LM engine is a library for pretraining/finetuning LLMs

Python 162 28 Updated Mar 28, 2026

A machine learning compiler for GPUs, CPUs, and ML accelerators

C++ 4,119 772 Updated Mar 30, 2026

A curated list of papers of interesting empirical study and insight on deep learning. Continually updating...

395 18 Updated Jan 7, 2026

Kernels, of the mega variety :)

Python 696 52 Updated Mar 29, 2026

Home for "How To Scale Your Model", a short blog-style textbook about scaling LLMs on TPUs

HTML 883 127 Updated Mar 15, 2026

The AMFormer algorithm, accepted at AAAI-2024, for deep tabular learning

Python 41 10 Updated Jul 3, 2024

A modular framework for neural networks with Euclidean symmetry

Python 1,230 179 Updated Feb 13, 2026

Visualization and calculator for input & output for deep neural networks.

TypeScript 18 3 Updated Jul 28, 2025

CPU and GPU implementations of some 2D RNN layers

C++ 29 10 Updated Sep 23, 2017

You like pytorch? You like micrograd? You love tinygrad! ❤️

Python 32,018 4,011 Updated Mar 30, 2026

Chrome/Firefox extension that blocks access to distracting websites to improve your productivity.

TypeScript 393 52 Updated Nov 17, 2025

ASU-sparkysundevil-resume-template

TeX 35 19 Updated Oct 3, 2024

Tips and resources to prepare for Behavioral interviews.

8,021 1,632 Updated Aug 19, 2025

conv_visualizer

Processing 493 44 Updated Dec 1, 2024

Sample code for the Microsoft Cognitive Services Speech SDK

C# 3,414 1,993 Updated Mar 26, 2026

Flux diffusion model implementation using quantized fp8 matmul & remaining layers use faster half precision accumulate, which is ~2x faster on consumer devices.

Python 284 37 Updated Oct 12, 2024

Hypergradient descent

Python 146 21 Updated May 31, 2024

public facing repo of my algorithm running on platform

Python 4 Updated Nov 20, 2024

QuantSC Spring '23 Project

Jupyter Notebook 67 8 Updated May 18, 2023

🐙 Guides, papers, lessons, notebooks and resources for prompt engineering, context engineering, RAG, and AI Agents.

MDX 72,465 7,759 Updated Mar 11, 2026

Adaptive Quantile Activation (AQUA): A learnable activation function that dynamically adapts to input distribution

1 Updated Dec 12, 2024