- AMD
- aaryaman.net
- @adyaman
- DRTK Public
  Forked from facebookresearch/DRTK
  Differentiable Rendering Toolkit
  C++ MIT License Updated Jun 12, 2026
- triton Public
  Forked from triton-lang/triton
  Development repository for the Triton language and compiler
  MLIR MIT License Updated Jun 2, 2026
- flash-attention Public
  Forked from Dao-AILab/flash-attention
  Fast and memory-efficient exact attention
  Python BSD 3-Clause "New" or "Revised" License Updated May 29, 2026
- SageAttention Public
  Forked from thu-ml/SageAttention
  [ICLR2025, ICML2025, NeurIPS2025 Spotlight] Quantized Attention achieves speedup of 2-5x compared to FlashAttention, without losing end-to-end metrics across language, image, and video models.
- pytorch Public
  Forked from pytorch/pytorch
  Tensors and Dynamic neural networks in Python with strong GPU acceleration
  Python Other Updated May 6, 2026
- lemonade Public
  Forked from lemonade-sdk/lemonade
  Lemonade helps users discover and run local AI apps by serving optimized LLMs right from their own GPUs and NPUs. Join our discord: https://discord.gg/5xXzkMu8Zk
  C++ Apache License 2.0 Updated Apr 21, 2026
- whisper.cpp Public
  Forked from ggml-org/whisper.cpp
  Port of OpenAI's Whisper model in C/C++
  C++ MIT License Updated Apr 20, 2026
- doomgeneric Public
  Forked from jhuber6/doomgeneric
  A GPU port of DOOM
  C GNU General Public License v2.0 Updated Apr 16, 2026
- FlashMoE Public
  Forked from osayamenja/FlashMoE
  AMD HIP port of Distributed MoE in a Single Kernel [NeurIPS '25]
- claude-rocm-workspace Public
  Forked from stellaraccident/claude-rocm-workspace
  Claude code workspace for developing ROCm
  Python Updated Mar 13, 2026
- aotriton Public
  Forked from ROCm/aotriton
  Ahead of Time (AOT) Triton Math Library
  Python MIT License Updated Feb 19, 2026
- TurboDiffusion Public
  Forked from thu-ml/TurboDiffusion
  TurboDiffusion: 100–200× Acceleration for Video Diffusion Models
  Python Apache License 2.0 Updated Jan 6, 2026
- SpargeAttn Public
  Forked from thu-ml/SpargeAttn
  [ICML2025] SpargeAttention: A training-free sparse attention that accelerates any model inference.
- onnxruntime Public
  Forked from microsoft/onnxruntime
  ONNX Runtime: cross-platform, high performance ML inferencing and training accelerator
  C++ MIT License Updated Nov 9, 2025
- llama.cpp Public
  Forked from ggml-org/llama.cpp
  Port of Facebook's LLaMA model in C/C++
  C MIT License Updated Oct 19, 2025
- nanochat Public
  Forked from karpathy/nanochat
  The best ChatGPT that $100 can buy.
  Python Updated Oct 15, 2025
- academic-kickstart Public
  Forked from HugoBlox/hugo-theme-academic-cv
  Easily create a beautiful website using Academic and Hugo
  Shell MIT License Updated Sep 9, 2025
- gloo Public
  Forked from pytorch/gloo
  Collective communications library with various primitives for multi-machine training.
  C++ Other Updated Aug 22, 2025
- flux-fast Public
  Forked from huggingface/flux-fast
  Making Flux go brrr on GPUs.
  Python Updated Jul 1, 2025
- audio Public
  Forked from pytorch/audio
  Data manipulation and transformation for audio signal processing, powered by PyTorch
  Python BSD 2-Clause "Simplified" License Updated Jun 26, 2025
- TheRock Public
  Forked from ROCm/TheRock
  The HIP Environment and ROCm Kit - A lightweight open source build system for HIP and ROCm
  Python MIT License Updated Jun 23, 2025
- flame Public
  Forked from HanGuo97/flame
  🔥 A minimal training framework for scaling FLA models
  Python MIT License Updated Jun 11, 2025
- flash-linear-attention Public
  Forked from fla-org/flash-linear-attention
  🚀 Efficient implementations of state-of-the-art linear attention models in Torch and Triton
  Python MIT License Updated Jun 11, 2025
- Phantom Public
  Forked from Phantom-video/Phantom
  Phantom: Subject-Consistent Video Generation via Cross-Modal Alignment
  Python Apache License 2.0 Updated Jun 10, 2025
- diff-triangle-rasterization Public
  Forked from trianglesplatting/diff-triangle-rasterization
  Cuda Apache License 2.0 Updated Jun 4, 2025