-
sglang Public
Forked from sgl-project/sglang
SGLang is a fast serving framework for large language models and vision language models.
Python Apache License 2.0 Updated Apr 7, 2026 -
claude-code Public
Forked from RishabhK103/claude-code
Claude Code is an agentic coding tool that lives in your terminal, understands your codebase, and helps you code faster by executing routine tasks, explaining complex code, and handling git workflo…
TypeScript Other Updated Mar 31, 2026 -
claw-code Public
Forked from ultraworkers/claw-code
Better Harness Tools, not merely storing the archive of leaked Claude Code but also make shit things done. Now rewriting in Rust.
Python Updated Mar 31, 2026 -
srt-slurm Public
Forked from ishandhanani/srt-slurm
Collection of SLURM deployment scripts for various hardware + sglang disaggregation variants
Python Updated Jan 15, 2026 -
FlashMLA Public
Forked from sgl-project/FlashMLA
FlashMLA: Efficient Multi-head Latent Attention Kernels
C++ MIT License Updated Jan 7, 2026 -
gpt-oss Public
Forked from openai/gpt-oss
gpt-oss-120b and gpt-oss-20b are two open-weight language models by OpenAI
Python Apache License 2.0 Updated Aug 26, 2025 -
TensorRT-LLM Public
Forked from NVIDIA/TensorRT-LLM
TensorRT-LLM provides users with an easy-to-use Python API to define Large Language Models (LLMs) and support state-of-the-art optimizations to perform inference efficiently on NVIDIA GPUs. TensorR…
C++ Apache License 2.0 Updated Aug 19, 2025 -
vllm Public
Forked from vllm-project/vllm
A high-throughput and memory-efficient inference and serving engine for LLMs
Python Apache License 2.0 Updated Aug 5, 2024 -
cutlass Public
Forked from NVIDIA/cutlass
CUDA Templates for Linear Algebra Subroutines
C++ Other Updated Oct 22, 2022 -
AITemplate_public Public
Forked from facebookincubator/AITemplate
AITemplate is a Python framework which renders neural network into high performance CUDA/HIP C++ code. Specialized for FP16 TensorCore (NVIDIA GPU) and MatrixCore (AMD GPU) inference.
-
minGPT Public
Forked from karpathy/minGPT
A minimal PyTorch re-implementation of the OpenAI GPT (Generative Pretrained Transformer) training
Jupyter Notebook MIT License Updated Jul 1, 2022 -
pytorch Public
Forked from pytorch/pytorch
Tensors and Dynamic neural networks in Python with strong GPU acceleration
-
tvm Public
Forked from apache/tvm
Open deep learning compiler stack for cpu, gpu and specialized accelerators
-
TASO Public
Forked from jiazhihao/TASO
A Tensor Algebra SuperOptimizer for Deep Learning
C++ Apache License 2.0 Updated Jan 21, 2020 -
KeepingYouAwake Public
Forked from newmarcel/KeepingYouAwake
Prevents your Mac from going to sleep.
Objective-C MIT License Updated Aug 13, 2019 -
dmlc-core Public
Forked from dmlc/dmlc-core
A common bricks library for building scalable and portable distributed machine learning.
C++ Other Updated May 8, 2019 -
QNNPACK Public
Forked from pytorch/QNNPACK
Quantized Neural Network PACKage - mobile-optimized implementation of quantized neural network operators
-
-
Open Neural Network Exchange
-
caffe2 Public
Forked from facebookarchive/caffe2
Caffe2 is a lightweight, modular, and scalable deep learning framework.
C++ Apache License 2.0 Updated Apr 2, 2018 -
models Public
Forked from facebookarchive/models
A repository for storing pre-trained Caffe2 models.
PureBasic Updated Nov 16, 2017 -
cpuinfo Public
Forked from pytorch/cpuinfo
CPU INFOrmation library (x86/ARM, Linux/Mach/NaCl)
Objective-C BSD 2-Clause "Simplified" License Updated Oct 23, 2017