Stars
- ONNX Runtime: a cross-platform, high-performance ML inferencing and training accelerator.
- MNN is a blazing-fast, lightweight deep learning framework, battle-tested by business-critical use cases at Alibaba. Full multimodal LLM Android app: [MNN-LLM-Android](./apps/Android/MnnLlmChat/READ…
- NVIDIA® TensorRT™ is an SDK for high-performance deep learning inference on NVIDIA GPUs. This repository contains the open-source components of TensorRT.
- TensorRT-LLM provides users with an easy-to-use Python API to define Large Language Models (LLMs) and supports state-of-the-art optimizations to perform inference efficiently on NVIDIA GPUs. Tensor…
- A distributed, fast open-source graph database featuring horizontal scalability and high availability.
- A high-performance distributed file system designed to address the challenges of AI training and inference workloads.
- A machine learning compiler for GPUs, CPUs, and ML accelerators.
- Mirage Persistent Kernel: Compiling LLMs into a MegaKernel.