Stars
The Torch-MLIR project aims to provide first-class support from the PyTorch ecosystem to the MLIR ecosystem.
Transformers-compatible library for applying various compression algorithms to LLMs for optimized deployment with vLLM
AIMET is a library that provides advanced quantization and compression techniques for trained neural network models.
Visualizer for neural network, deep learning and machine learning models
Fast, Flexible and Portable Structured Generation
High-performance automatic differentiation of LLVM and MLIR.
Open deep learning compiler stack for CPU, GPU, and specialized accelerators
CUDA Templates and Python DSLs for High-Performance Linear Algebra
[ICLR2025, ICML2025, NeurIPS2025 Spotlight] Quantized Attention achieves a 2-5x speedup over FlashAttention without losing end-to-end metrics across language, image, and video models.
Mirage Persistent Kernel: Compiling LLMs into a MegaKernel
Universal LLM Deployment Engine with ML Compilation
Learning Vim and Vimscript doesn't have to be hard. This is the guide that you're looking for
Original Apollo 11 Guidance Computer (AGC) source code for the command and lunar modules.
Path to a free self-taught education in Computer Science!
A curated list of Awesome LLM/VLM Inference Papers with Codes: Flash-Attention, Paged-Attention, WINT8/4, Parallelism, etc.
Repository which contains links and resources on different topics of Computer Science.
Hummingbird compiles trained ML models into tensor computation for faster inference.
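For context on the Hummingbird entry above, a minimal sketch of what such a conversion might look like; the scikit-learn classifier, the synthetic data, and the "pytorch" backend choice are illustrative assumptions, not taken from the repository description:

```python
# Illustrative sketch: compile a trained scikit-learn model into
# tensor computations (PyTorch backend) with Hummingbird.
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from hummingbird.ml import convert

# Train an ordinary scikit-learn model on synthetic data.
X, y = make_classification(n_samples=200, n_features=20, random_state=0)
clf = RandomForestClassifier(n_estimators=10, random_state=0).fit(X, y)

# Convert the fitted estimator into a tensor program for faster inference.
hb_model = convert(clf, "pytorch")

# The converted model keeps the familiar predict() interface.
preds = hb_model.predict(X)
```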
Production-tested AI infrastructure tools for efficient AGI development and community-driven innovation
A tool for visually modifying ONNX models, based on Netron and Flask.
GPGPU-Sim provides a detailed simulation model of contemporary NVIDIA GPUs running CUDA and/or OpenCL workloads. It includes support for features such as TensorCores and CUDA Dynamic Parallelism as…
A list of awesome compiler projects and papers for tensor computation and deep learning.
An extension of TVMScript to write simple and high-performance GPU kernels with Tensor Cores.