- vllm Public
  Forked from vllm-project/vllm
  A high-throughput and memory-efficient inference and serving engine for LLMs
  Python · Apache License 2.0 · Updated Oct 19, 2025
- batch_invariant_ops Public
  Forked from thinking-machines-lab/batch_invariant_ops
  Python · MIT License · Updated Oct 8, 2025
- FBGEMM Public
  Forked from pytorch/FBGEMM
  FB (Facebook) + GEMM (General Matrix-Matrix Multiplication) - https://code.fb.com/ml-applications/fbgemm/
  C++ · Other · Updated Oct 1, 2025
- nano-vllm Public
  Forked from GeeeekExplorer/nano-vllm
  Nano vLLM
  Python · MIT License · Updated Jun 17, 2025
- AReaL Public
  Forked from inclusionAI/AReaL
  Distributed RL System for LLM Reasoning
  Python · Apache License 2.0 · Updated Jun 4, 2025
- sglang Public
  Forked from sgl-project/sglang
  SGLang is a fast serving framework for large language models and vision language models.
  Python · Apache License 2.0 · Updated May 30, 2025
- pplx-kernels Public
  Forked from perplexityai/pplx-kernels
  Perplexity GPU Kernels
  C++ · MIT License · Updated Apr 2, 2025
- flashinfer Public
  Forked from flashinfer-ai/flashinfer
  FlashInfer: Kernel Library for LLM Serving
  Cuda · Apache License 2.0 · Updated Feb 24, 2025
- picotron Public
  Forked from huggingface/picotron
  Minimalistic 4D-parallelism distributed training framework for education purpose
  Python · Apache License 2.0 · Updated Dec 20, 2024
- Awesome-LLM-Inference Public
  Forked from xlite-dev/Awesome-LLM-Inference
  📖A curated list of Awesome LLM/VLM Inference Papers with codes, such as FlashAttention, PagedAttention, Parallelism, etc. 🎉🎉
- pytorch Public
  Forked from pytorch/pytorch
  Tensors and Dynamic neural networks in Python with strong GPU acceleration
  C++ · Other · Updated Sep 24, 2024
- flash-attention Public
  Forked from Dao-AILab/flash-attention
  Fast and memory-efficient exact attention
  Python · BSD 3-Clause "New" or "Revised" License · Updated Aug 22, 2024
- llama-toolchain Public
  Forked from llamastack/llama-stack
  Model components of the Llama Stack APIs
  Python · Other · Updated Jul 25, 2024
- torchrec-3 Public
  Forked from meta-pytorch/torchrec
  Pytorch domain library for recommendation systems
  Python · BSD 3-Clause "New" or "Revised" License · Updated Feb 19, 2024
- TransformerEngine Public
  Forked from NVIDIA/TransformerEngine
  A library for accelerating Transformer models on NVIDIA GPUs, including using 8-bit floating point (FP8) precision on Hopper GPUs, to provide better performance with lower memory utilization in bot…
  Python · Apache License 2.0 · Updated Aug 22, 2023
- xformers Public
  Forked from facebookresearch/xformers
  Hackable and optimized Transformers building blocks, supporting a composable construction.
  Python · Other · Updated Aug 14, 2023
- param Public
  Forked from facebookresearch/param
  PArametrized Recommendation and Ai Model benchmark is a repository for development of numerous uBenchmarks as well as end to end nets for evaluation of training and inference platforms.
  Python · MIT License · Updated Nov 28, 2022
- cutlass Public
  Forked from NVIDIA/cutlass
  CUDA Templates for Linear Algebra Subroutines
  C++ · BSD 3-Clause "New" or "Revised" License · Updated Apr 3, 2022
- torchrec-1 Public
  Forked from terrorizer1980/torchrec
  Pytorch domain library for recommendation systems
  Python · BSD 3-Clause "New" or "Revised" License · Updated Feb 22, 2022
- effectivepython Public
  Forked from bslatkin/effectivepython
  Effective Python: Second Edition — Source Code and Errata for the Book
  Python · Updated May 19, 2021
- glow Public
  Forked from pytorch/glow
  Compiler for Neural Network hardware accelerators
  C++ · Apache License 2.0 · Updated Jan 29, 2020
- tutorials Public
  Forked from pytorch/tutorials
  PyTorch tutorials.
  Jupyter Notebook · BSD 3-Clause "New" or "Revised" License · Updated Dec 7, 2019
- pytext Public
  Forked from facebookresearch/pytext
  A natural language modeling framework based on PyTorch
  Python · Other · Updated Oct 29, 2019
- asmjit Public
  Forked from asmjit/asmjit
  Complete x86/x64 JIT and AOT Assembler for C++
  C++ · zlib License · Updated Aug 8, 2019
- tvm Public
  Forked from apache/tvm
  Open deep learning compiler stack for cpu, gpu and specialized accelerators
  Python · Apache License 2.0 · Updated Aug 5, 2019
- blislab Public
  Forked from flame/blislab
  BLISlab: A Sandbox for Optimizing GEMM
  C · Updated Jul 29, 2019