Stars
GPT4All: Run Local LLMs on Any Device. Open-source and available for commercial use.
Open-source alternative to Algolia + Pinecone and an easier-to-use alternative to ElasticSearch ⚡ 🔍 ✨ Fast, typo-tolerant, in-memory fuzzy search engine for building delightful search experiences
A fast, distributed, high-performance gradient boosting (GBT, GBDT, GBRT, GBM or MART) framework based on decision tree algorithms, used for ranking, classification and many other machine learning …
TensorRT-LLM provides users with an easy-to-use Python API to define Large Language Models (LLMs) and supports state-of-the-art optimizations to perform inference efficiently on NVIDIA GPUs (see the usage sketch after this list). Tensor…
FlashMLA: Efficient Multi-head Latent Attention Kernels
Unsupervised text tokenizer for Neural Network-based text generation (see the tokenizer sketch after this list).
A high-performance distributed file system designed to address the challenges of AI training and inference workloads.
CUDA Templates and Python DSLs for High-Performance Linear Algebra
High-speed Large Language Model Serving for Local Deployment
Lightweight, standalone C++ inference engine for Google's Gemma models.
Transformer-related optimizations, including BERT and GPT
Diffusion model (SD, Flux, Wan, Qwen Image, ...) inference in pure C/C++
Lightning-fast C++/CUDA neural network framework
Mooncake is the serving platform for Kimi, a leading LLM service provided by Moonshot AI.
fastllm is a high-performance LLM inference library with no backend dependencies. It supports tensor-parallel inference for dense models and mixed-mode inference for MoE models; any GPU with more than 10 GB of VRAM can run the full DeepSeek model. A dual-socket 9004/9005 server plus a single GPU can deploy the original full-size, full-precision DeepSeek model at 20 tps with single concurrency; the INT4-quantized model reaches 30 tps with single concurrency and 60+ tps under multiple concurrency.
Domain-specific language designed to streamline the development of high-performance GPU/CPU/accelerator kernels
Perforator is a cluster-wide continuous profiling tool designed for large data centers
Fast Open-Source Search & Clustering engine × for Vectors & Arbitrary Objects × in C++, C, Python, JavaScript, Rust, Java, Objective-C, Swift, C#, GoLang, and Wolfram 🔍
Distributed LLM inference. Connect home devices into a powerful cluster to accelerate LLM inference; more devices mean faster inference.
A collection of original, innovative ideas and algorithms towards Advanced Literate Machinery. This project is maintained by the OCR Team in the Language Technology Lab, Tongyi Lab, Alibaba Group.
SCUDA is a GPU-over-IP bridge that allows GPUs on remote machines to be attached to CPU-only machines.
A highly optimized LLM inference acceleration engine for Llama and its variants.
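
For the TensorRT-LLM entry above, a minimal sketch of what its high-level Python `LLM` API can look like; the model id and sampling values are placeholders, not recommendations:

```python
# Minimal TensorRT-LLM usage sketch (high-level LLM API).
from tensorrt_llm import LLM, SamplingParams

# Loads a Hugging Face checkpoint and builds/loads a TensorRT engine for it.
# The model id is a placeholder; any supported checkpoint works.
llm = LLM(model="TinyLlama/TinyLlama-1.1B-Chat-v1.0")

params = SamplingParams(temperature=0.8, top_p=0.95, max_tokens=64)

# generate() batches the prompts and runs optimized inference on the GPU.
for output in llm.generate(["The capital of France is"], params):
    print(output.outputs[0].text)
```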
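
And for the SentencePiece entry, a minimal sketch of training an unsupervised subword tokenizer and encoding text with it; the corpus path, model prefix, and vocabulary size are illustrative assumptions:

```python
# Minimal SentencePiece sketch: train an unsupervised subword model, then encode.
import sentencepiece as spm

# Train on a plain-text corpus (one sentence per line); paths/sizes are assumptions.
spm.SentencePieceTrainer.train(
    input="corpus.txt", model_prefix="tok", vocab_size=8000
)

# Load the trained model and tokenize raw text into pieces and ids.
sp = spm.SentencePieceProcessor(model_file="tok.model")
print(sp.encode("Hello world.", out_type=str))  # subword pieces
print(sp.encode("Hello world.", out_type=int))  # token ids
```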