Stars
An Open Source Machine Learning Framework for Everyone
LevelDB is a fast key-value storage library written at Google that provides an ordered mapping from string keys to string values.
FlashMLA: Efficient Multi-head Latent Attention Kernels
A high-performance distributed file system designed to address the challenges of AI training and inference workloads.
CUDA Templates and Python DSLs for High-Performance Linear Algebra
oneAPI Deep Neural Network Library (oneDNN)
lmctfy is the open source version of Google’s container stack, which provides Linux application containers.
A cross-platform C99 library to get CPU features at runtime.
Tutorial code on how to build your own Deep Learning System in 2k Lines
Mirage Persistent Kernel: Compiling LLMs into a MegaKernel
FB (Facebook) + GEMM (General Matrix-Matrix Multiplication) - https://code.fb.com/ml-applications/fbgemm/
Collective communications library with various primitives for multi-machine training.
Matrix Shadow: Lightweight CPU/GPU Matrix and Tensor Template Library in C++/CUDA for (Deep) Machine Learning
A software library containing BLAS functions written in OpenCL
A benchmark for low-level CPU micro-architectural features
C, C++ and Python Code for Exercises and Solutions
Distributed-memory, arbitrary-precision, dense and sparse-direct linear algebra, conic optimization, and lattice reduction
STXXL: Standard Template Library for Extra Large Data Sets