- vllm Public
  Forked from vllm-project/vllm
  A high-throughput and memory-efficient inference and serving engine for LLMs
  Python, Apache License 2.0, Updated May 18, 2026
- flashinfer Public
  Forked from flashinfer-ai/flashinfer
  FlashInfer: Kernel Library for LLM Serving
  Python, Apache License 2.0, Updated May 13, 2026
- dynamo Public
  Forked from ai-dynamo/dynamo
  A Datacenter Scale Distributed Inference Serving Framework
  Rust, Other, Updated Apr 22, 2026
- TensorRT-LLM Public
  Forked from NVIDIA/TensorRT-LLM
  TensorRT-LLM provides users with an easy-to-use Python API to define Large Language Models (LLMs) and build TensorRT engines that contain state-of-the-art optimizations to perform inference efficie…
  Python, Other, Updated Feb 7, 2026
- recipes Public
  Forked from vllm-project/recipes
  Common recipes to run vLLM
  Jupyter Notebook, Apache License 2.0, Updated Dec 3, 2025
- transformers Public
  Forked from huggingface/transformers
  🤗 Transformers: the model-definition framework for state-of-the-art machine learning models in text, vision, audio, and multimodal models, for both inference and training.
  Python, Apache License 2.0, Updated Oct 31, 2025
- lm-evaluation-harness Public
  Forked from EleutherAI/lm-evaluation-harness
  A framework for few-shot evaluation of language models.
  Python, MIT License, Updated Oct 29, 2025
- numba Public
  Forked from numba/numba
  NumPy aware dynamic Python compiler using LLVM
  Python, BSD 2-Clause "Simplified" License, Updated Oct 22, 2025
- llvmlite Public
  Forked from numba/llvmlite
  A lightweight LLVM python binding for writing JIT compilers
  Python, BSD 2-Clause "Simplified" License, Updated Oct 22, 2025
- tokenizers Public
  Forked from huggingface/tokenizers
  💥 Fast State-of-the-Art Tokenizers optimized for Research and Production
  Rust, Apache License 2.0, Updated Oct 16, 2025
- FlashMLA Public
  Forked from deepseek-ai/FlashMLA
  FlashMLA: Efficient MLA kernels
  C++, MIT License, Updated Oct 10, 2025
- cutlass Public
  Forked from NVIDIA/cutlass
  CUDA Templates for Linear Algebra Subroutines
  C++, Other, Updated Aug 6, 2025
- pytorch Public
  Forked from pytorch/pytorch
  Tensors and Dynamic neural networks in Python with strong GPU acceleration
  Python, Other, Updated Aug 4, 2025
- buck2 Public
  Forked from facebook/buck2
  Build system, successor to Buck
  Rust, Apache License 2.0, Updated Jun 23, 2025
- mscclpp Public
  Forked from microsoft/mscclpp
  MSCCL++: A GPU-driven communication stack for scalable AI applications
  C++, MIT License, Updated Apr 14, 2025
- dlpack Public
  Forked from dmlc/dlpack
  common in-memory tensor structure
  C++, Apache License 2.0, Updated Apr 14, 2025
- nvbench Public
  Forked from NVIDIA/nvbench
  CUDA Kernel Benchmarking Library
  Cuda, Apache License 2.0, Updated Apr 10, 2025
- rules_cuda Public
  Forked from bazel-contrib/rules_cuda
  Starlark implementation of bazel rules for CUDA.
  Starlark, MIT License, Updated Apr 4, 2025
- TensorRT-Model-Optimizer Public
  Forked from NVIDIA/Model-Optimizer
  nvidia-modelopt is a unified library of state-of-the-art model optimization techniques like quantization, pruning, distillation, speculative decoding, etc. It compresses deep learning models for do…
  Python, Other, Updated Apr 3, 2025
- bazel-compile-commands-extractor Public
  Forked from hedronvision/bazel-compile-commands-extractor
  Goal: Enable awesome tooling for Bazel users of the C language family.
  Python, Other, Updated Oct 8, 2024
- rules_cc Public
  Forked from bazelbuild/rules_cc
  C++ Rules for Bazel
  Starlark, Apache License 2.0, Updated Aug 25, 2024
- vscode-bazel Public
  Forked from bazel-contrib/vscode-bazel
  Bazel support for Visual Studio Code
  TypeScript, Apache License 2.0, Updated Jan 2, 2024
- tensorflow Public
  Forked from tensorflow/tensorflow
  An Open Source Machine Learning Framework for Everyone
  C++, Apache License 2.0, Updated Dec 5, 2023
- TensorRT Public
  Forked from NVIDIA/TensorRT
  NVIDIA® TensorRT™, an SDK for high-performance deep learning inference, includes a deep learning inference optimizer and runtime that delivers low latency and high throughput for inference applicat…
  C++, Apache License 2.0, Updated Dec 3, 2023