- Shanghai
- sglang Public
  Forked from sgl-project/sglang
  SGLang is yet another fast serving framework for large language models and vision language models.
- ThunderKittens Public
  Forked from HazyResearch/ThunderKittens
  Tile primitives for speedy kernels
  Cuda · MIT License · Updated Jul 23, 2025
- flash-linear-attention Public
  Forked from fla-org/flash-linear-attention
  🚀 Efficient implementations of state-of-the-art linear attention models in Torch and Triton
  Python · MIT License · Updated Jun 5, 2025
- verl Public
  Forked from volcengine/verl
  verl: Volcano Engine Reinforcement Learning for LLMs
  Python · Apache License 2.0 · Updated Apr 28, 2025
- flash-attention Public
  Forked from Dao-AILab/flash-attention
  Fast and memory-efficient exact attention
  Python · BSD 3-Clause "New" or "Revised" License · Updated Apr 17, 2025
- DeepGEMM Public
  Forked from deepseek-ai/DeepGEMM
  DeepGEMM: clean and efficient FP8 GEMM kernels with fine-grained scaling
- dynamo Public
  Forked from ai-dynamo/dynamo
  A Datacenter Scale Distributed Inference Serving Framework
  Rust · Apache License 2.0 · Updated Mar 19, 2025
- applied-ai Public
  Forked from meta-pytorch/applied-ai
  Applied AI experiments and examples for PyTorch
  Python · BSD 3-Clause "New" or "Revised" License · Updated Dec 31, 2024
- triton Public
  Forked from triton-lang/triton
  Development repository for the Triton language and compiler
  C++ · MIT License · Updated Dec 30, 2024
- flashinfer Public
  Forked from flashinfer-ai/flashinfer
  FlashInfer: Kernel Library for LLM Serving
  Cuda · Apache License 2.0 · Updated Dec 11, 2024
- EAGLE Public
  Forked from SafeAILab/EAGLE
  Official Implementation of EAGLE-1 (ICML'24) and EAGLE-2 (EMNLP'24)
- vllm Public
  Forked from vllm-project/vllm
  A high-throughput and memory-efficient inference and serving engine for LLMs
  Python · Apache License 2.0 · Updated Aug 27, 2024
- cuda_hgemm Public
  Forked from Bruce-Lee-LY/cuda_hgemm
  Several optimization methods for half-precision general matrix multiplication (HGEMM) using tensor cores with the WMMA API and MMA PTX instructions.
- Paddle Public
  Forked from PaddlePaddle/Paddle
  PArallel Distributed Deep LEarning: Machine Learning Framework from Industrial Practice (core framework of 飞桨 (PaddlePaddle): high-performance single-machine and distributed training and cross-platform deployment for deep learning and machine learning)
  Python · Apache License 2.0 · Updated Mar 9, 2021
- PaddleSlim Public
  Forked from PaddlePaddle/PaddleSlim
  PaddleSlim is an open-source library for deep model compression and architecture search.
  Python · Apache License 2.0 · Updated Feb 26, 2021
- Paddle-Lite Public
  Forked from PaddlePaddle/Paddle-Lite
  Multi-platform high performance deep learning inference engine (the multi-platform, high-performance deep learning inference engine of 飞桨 (PaddlePaddle))
- PaddleClas Public
  Forked from PaddlePaddle/PaddleClas
  A treasure chest for image classification powered by PaddlePaddle
  Python · Apache License 2.0 · Updated Feb 23, 2021
- PaddleSeg Public
  Forked from PaddlePaddle/PaddleSeg
  End-to-end image segmentation kit based on PaddlePaddle.
  Python · Apache License 2.0 · Updated Jan 22, 2021
- FluidDoc Public
  Forked from PaddlePaddle/docs
  Documentation for PaddlePaddle
  Shell · Updated Dec 3, 2020
- nni Public
  Forked from microsoft/nni
  An open source AutoML toolkit to automate the machine learning lifecycle, including feature engineering, neural architecture search, model compression and hyper-parameter tuning.
  Python · MIT License · Updated Nov 11, 2020
- CINN Public
  Forked from PaddlePaddle/CINN
  A Compiler Infrastructure for Neural Networks
  C++ · Apache License 2.0 · Updated Nov 9, 2020
- PaddleOCR Public
  Forked from PaddlePaddle/PaddleOCR
  OCR toolkit based on PaddlePaddle (an OCR toolkit built on PaddlePaddle that includes an ultra-lightweight Chinese OCR system with a total model size of only 8.6M, and supports training algorithms for multiple text detection and text recognition methods, server-side deployment, and on-device deployment)
  C++ · Apache License 2.0 · Updated Nov 4, 2020
- incubator-tvm Public
  Forked from apache/tvm
  Open deep learning compiler stack for CPU, GPU and specialized accelerators
  Python · Apache License 2.0 · Updated Nov 1, 2020