Stars
FlyDSL is the Python front‑end of the project: Flexible LaYout DSL.
✔ (Completed) Extremely comprehensive deep learning notes [Tudui PyTorch] [Mu Li, Dive into Deep Learning] [Andrew Ng, Deep Learning] [Dafei, LLM Agents]
A lightweight Triton-based General Matrix Multiplication (GEMM) library.
CUDA Tile IR is an MLIR-based intermediate representation and compiler infrastructure for CUDA kernel optimization, focusing on tile-based computation patterns and optimizations targeting NVIDIA te…
AMD RAD's multi-GPU Triton-based framework for seamless multi-GPU programming
Asterinas aims to be a production-grade Linux alternative—memory safe, high-performance, and more.
Universal LLM Deployment Engine with ML Compilation
Get up and running with Kimi-K2.5, GLM-5, MiniMax, DeepSeek, gpt-oss, Qwen, Gemma and other models.
FlagTree is a unified compiler supporting multiple AI chip backends for custom Deep Learning operations, which is forked from triton-lang/triton.
Allo Accelerator Design and Programming Framework (PLDI'24)
UCCL is an efficient communication library for GPUs, covering collectives, P2P (e.g., KV cache transfer, RL weight transfer), and EP (e.g., GPU-driven)
Mirage Persistent Kernel: Compiling LLMs into a MegaKernel
slime is an LLM post-training framework for RL Scaling.
NVIDIA Data Center GPU Manager (DCGM) is a project for gathering telemetry and measuring the health of NVIDIA GPUs
A fast GPU memory copy library based on NVIDIA GPUDirect RDMA technology
Distributed Compiler based on Triton for Parallel Systems
My learning notes for ML SYS.
The RL Bridge for LLM-based Agent Applications. Made Simple & Flexible.
Mooncake is the serving platform for Kimi, a leading LLM service provided by Moonshot AI.
TileFusion is an experimental C++ macro kernel template library that elevates the abstraction level in CUDA C for tile processing.
verl: Volcano Engine Reinforcement Learning for LLMs
A Datacenter Scale Distributed Inference Serving Framework
Staging repo for development of native port of TypeScript