LinearKAN: A very fast implementation of Kolmogorov-Arnold Networks

Python 20 1 Updated Nov 12, 2025

CUDA Embedding Lookup Kernel Library

Cuda 43 5 Updated Feb 9, 2026

Development repository for the Triton language and compiler

MLIR 18,720 2,687 Updated Mar 22, 2026
Python 1,626 159 Updated Feb 23, 2026

A collection of full time roles in SWE, Quant, and PM for new grads.

16,529 1,266 Updated Mar 22, 2026

A high-throughput and memory-efficient inference and serving engine for LLMs

Python 73,933 14,628 Updated Mar 22, 2026

Tilus is a tile-level kernel programming language with explicit control over shared memory and registers.

Python 457 18 Updated Mar 21, 2026

LeetGPU Challenges

Python 682 62 Updated Mar 21, 2026

Efficient Triton Kernels for LLM Training

Python 6,222 504 Updated Mar 20, 2026

LM engine is a library for pretraining/finetuning LLMs

Python 154 28 Updated Mar 18, 2026

A machine learning compiler for GPUs, CPUs, and ML accelerators

C++ 4,105 767 Updated Mar 22, 2026

A curated list of papers offering interesting empirical studies and insights on deep learning. Continually updated.

395 18 Updated Jan 7, 2026

Kernels, of the mega variety :)

Python 693 46 Updated Mar 22, 2026

Home for "How To Scale Your Model", a short blog-style textbook about scaling LLMs on TPUs

HTML 873 127 Updated Mar 15, 2026

The AMFormer algorithm, accepted at AAAI-2024, for deep tabular learning

Python 41 10 Updated Jul 3, 2024

A modular framework for neural networks with Euclidean symmetry

Python 1,227 177 Updated Feb 13, 2026

Visualization and calculator for input & output for deep neural networks.

TypeScript 18 3 Updated Jul 28, 2025

CPU and GPU implementations of some 2D RNN layers

C++ 29 10 Updated Sep 23, 2017

You like pytorch? You like micrograd? You love tinygrad! ❤️

Python 31,744 3,972 Updated Mar 22, 2026

Chrome/Firefox extension that blocks access to distracting websites to improve your productivity.

TypeScript 392 52 Updated Nov 17, 2025

ASU-sparkysundevil-resume-template

TeX 35 19 Updated Oct 3, 2024

Tips and resources to prepare for Behavioral interviews.

7,985 1,623 Updated Aug 19, 2025

conv_visualizer

Processing 493 44 Updated Dec 1, 2024

Sample code for the Microsoft Cognitive Services Speech SDK

C# 3,409 1,991 Updated Mar 16, 2026

Flux diffusion model implementation using quantized fp8 matmul & remaining layers use faster half precision accumulate, which is ~2x faster on consumer devices.

Python 284 37 Updated Oct 12, 2024

Hypergradient descent

Python 146 21 Updated May 31, 2024

public facing repo of my algorithm running on platform

Python 4 Updated Nov 20, 2024

QuantSC Spring '23 Project

Jupyter Notebook 64 7 Updated May 18, 2023

🐙 Guides, papers, lessons, notebooks and resources for prompt engineering, context engineering, RAG, and AI Agents.

MDX 72,081 7,697 Updated Mar 11, 2026

Adaptive Quantile Activation (AQUA): A learnable activation function that dynamically adapts to input distribution

1 Updated Dec 12, 2024