Efficient Triton Kernels for LLM Training
FlagGems is an operator library for large language models implemented in the Triton Language.
Autoresearch for GPU kernels. Give it any PyTorch model, go to sleep, wake up to optimized Triton kernels.
A subset of PyTorch's neural network modules, written in Python using OpenAI's Triton.
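For orientation, here is a minimal sketch of what such a kernel looks like in practice: an elementwise add written with OpenAI Triton and launched from PyTorch. The function names and block size below are chosen for illustration and are not taken from any of the libraries listed here.

```python
# A minimal Triton kernel: elementwise add, launched from PyTorch.
import torch
import triton
import triton.language as tl

@triton.jit
def add_kernel(x_ptr, y_ptr, out_ptr, n_elements, BLOCK_SIZE: tl.constexpr):
    pid = tl.program_id(axis=0)                      # which block this program handles
    offsets = pid * BLOCK_SIZE + tl.arange(0, BLOCK_SIZE)
    mask = offsets < n_elements                      # guard the tail of the tensor
    x = tl.load(x_ptr + offsets, mask=mask)
    y = tl.load(y_ptr + offsets, mask=mask)
    tl.store(out_ptr + offsets, x + y, mask=mask)

def add(x: torch.Tensor, y: torch.Tensor) -> torch.Tensor:
    out = torch.empty_like(x)
    n = x.numel()
    grid = (triton.cdiv(n, 1024),)                   # one program instance per block
    add_kernel[grid](x, y, out, n, BLOCK_SIZE=1024)
    return out
```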
Trainable, fast, and memory-efficient sparse attention
Automatic ROP Chain Generation
Official Problem Sets / Reference Kernels for the GPU MODE Leaderboard!
SymGDB - a symbolic execution plugin for GDB
TritonParse: A Compiler Tracer, Visualizer, and Reproducer for Triton Kernels
FlashSinkhorn: IO-Aware Entropic Optimal Transport in PyTorch + Triton. Streaming Sinkhorn with O(nd) memory.
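FlashSinkhorn's contribution is the IO-aware streaming formulation; purely as a point of reference, a naive log-domain Sinkhorn for entropic optimal transport between uniform point clouds looks like the sketch below. Unlike the streaming version, it materializes the full n×m cost matrix, so its memory is O(nm) rather than O(nd); all names here are illustrative.

```python
# Dense reference sketch of entropic-OT Sinkhorn iterations in PyTorch.
# Materializes the full cost matrix, so memory is O(nm), not O(nd).
import math
import torch

def sinkhorn_logdomain(x, y, eps=0.05, n_iters=50):
    """Entropic OT between uniform measures on point clouds x (n,d) and y (m,d)."""
    n, m = x.shape[0], y.shape[0]
    C = torch.cdist(x, y) ** 2                 # (n, m) squared-distance cost
    log_a = torch.full((n,), -math.log(n))     # log of uniform source weights
    log_b = torch.full((m,), -math.log(m))     # log of uniform target weights
    f = torch.zeros(n)                         # dual potentials
    g = torch.zeros(m)
    for _ in range(n_iters):
        # alternate dual updates; logsumexp keeps the iteration numerically stable
        f = -eps * torch.logsumexp((g[None, :] - C) / eps + log_b[None, :], dim=1)
        g = -eps * torch.logsumexp((f[:, None] - C) / eps + log_a[:, None], dim=0)
    # recover the transport plan P_ij = a_i b_j exp((f_i + g_j - C_ij) / eps)
    P = torch.exp((f[:, None] + g[None, :] - C) / eps + log_a[:, None] + log_b[None, :])
    return P
```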
AMD RAD's Triton-based framework for seamless multi-GPU programming
A performance library for machine learning applications.
nanoRLHF: a from-scratch journey into how LLMs and RLHF really work.
Triton implementation of FlashAttention2 that adds custom masks.
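What "custom masks" means is easiest to see in plain PyTorch: a boolean (seq, seq) matrix decides which query/key pairs may attend. The repo's value is fusing this into a FlashAttention2-style Triton kernel so that neither the score matrix nor the mask is ever fully materialized; the sketch below is only a naive reference with illustrative names.

```python
# Naive reference for attention with an arbitrary boolean mask.
import math
import torch

def masked_attention(q, k, v, custom_mask):
    # q, k, v: (batch, heads, seq, head_dim); custom_mask: (seq, seq) bool,
    # True where attention is allowed (e.g. causal, block-sparse, per-document).
    scores = q @ k.transpose(-2, -1) / math.sqrt(q.shape[-1])
    scores = scores.masked_fill(~custom_mask, float("-inf"))
    return torch.softmax(scores, dim=-1) @ v
```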
ClearML - Model-Serving Orchestration and Repository Solution
Efficient sub-quadratic attention applied post-training, with no additional training required. Implemented with OpenAI Triton.
LLM inference engine from scratch — paged KV cache, continuous batching, chunked prefill, prefix caching, speculative decoding, CUDA graph, tensor parallelism, MoE expert parallelism, OpenAI-compatible serving
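As a toy illustration of the first feature in that list, a paged KV cache boils down to a shared pool of fixed-size blocks plus a per-sequence block table mapping logical token positions to physical blocks, so sequences can grow and release memory in block-sized units. All class names and sizes below are made up for the sketch and are not the engine's actual API.

```python
# Toy sketch of paged-KV-cache bookkeeping; names and sizes are illustrative.
import torch

BLOCK_SIZE = 16  # tokens per physical cache block

class PagedKVCache:
    def __init__(self, num_blocks, num_heads, head_dim):
        # physical pool shared by all sequences: (blocks, block_size, heads, dim)
        self.k_pool = torch.empty(num_blocks, BLOCK_SIZE, num_heads, head_dim)
        self.v_pool = torch.empty_like(self.k_pool)
        self.free_blocks = list(range(num_blocks))
        self.block_tables = {}          # seq_id -> list of physical block ids
        self.seq_lens = {}              # seq_id -> tokens written so far

    def append(self, seq_id, k, v):
        """Write one token's K/V tensors of shape (heads, dim) for a sequence."""
        pos = self.seq_lens.get(seq_id, 0)
        table = self.block_tables.setdefault(seq_id, [])
        if pos % BLOCK_SIZE == 0:       # current block is full: grab a fresh one
            table.append(self.free_blocks.pop())
        block, slot = table[pos // BLOCK_SIZE], pos % BLOCK_SIZE
        self.k_pool[block, slot] = k
        self.v_pool[block, slot] = v
        self.seq_lens[seq_id] = pos + 1

    def free(self, seq_id):
        """Return a finished sequence's blocks to the shared pool."""
        self.free_blocks.extend(self.block_tables.pop(seq_id, []))
        self.seq_lens.pop(seq_id, None)
```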
(WIP) A simple, lightweight, fast, pipelined deployment framework for algorithm services, designed for reliability, high concurrency, and scalability.