Open Source alternative to Algolia + Pinecone and an Easier-to-Use alternative to ElasticSearch ⚡ 🔍 ✨ Fast, typo tolerant, in-memory fuzzy Search Engine for building delightful search experiences

C++ 24,662 829 Updated Nov 7, 2025

ml-explore / mlx

MLX: An array framework for Apple silicon

C++ 22,768 1,383 Updated Nov 8, 2025

microsoft / LightGBM

A fast, distributed, high performance gradient boosting (GBT, GBDT, GBRT, GBM or MART) framework based on decision tree algorithms, used for ranking, classification and many other machine learning …

C++ 17,820 3,958 Updated Nov 10, 2025

ggml-org / ggml

Tensor library for machine learning

C++ 13,527 1,385 Updated Nov 9, 2025

NVIDIA / TensorRT-LLM

TensorRT LLM provides users with an easy-to-use Python API to define Large Language Models (LLMs) and supports state-of-the-art optimizations to perform inference efficiently on NVIDIA GPUs. Tensor…

C++ 12,089 1,851 Updated Nov 10, 2025

deepseek-ai / FlashMLA

FlashMLA: Efficient Multi-head Latent Attention Kernels

C++ 11,856 898 Updated Sep 30, 2025

google / sentencepiece

Unsupervised text tokenizer for Neural Network-based text generation.

C++ 11,428 1,305 Updated Nov 6, 2025

deepseek-ai / 3FS

A high-performance distributed file system designed to address the challenges of AI training and inference workloads.

C++ 9,451 958 Updated Oct 24, 2025

NVIDIA / cutlass

CUDA Templates and Python DSLs for High-Performance Linear Algebra

C++ 8,751 1,521 Updated Nov 7, 2025

SJTU-IPADS / PowerInfer

High-speed Large Language Model Serving for Local Deployment

C++ 8,382 450 Updated Aug 2, 2025

google / gemma.cpp

Lightweight, standalone C++ inference engine for Google's Gemma models.

C++ 6,610 573 Updated Nov 7, 2025

NVIDIA / FasterTransformer

Transformer-related optimization, including BERT, GPT

C++ 6,344 920 Updated Mar 27, 2024

rapidsai / cuml

cuML - RAPIDS Machine Learning Library

C++ 5,002 600 Updated Nov 7, 2025

leejet / stable-diffusion.cpp

Diffusion model (SD, Flux, Wan, Qwen Image, ...) inference in pure C/C++

C++ 4,537 441 Updated Nov 9, 2025

NVlabs / tiny-cuda-nn

Lightning fast C++/CUDA neural network framework

C++ 4,298 530 Updated Oct 13, 2025

kvcache-ai / Mooncake

Mooncake is the serving platform for Kimi, a leading LLM service provided by Moonshot AI.

C++ 4,245 421 Updated Nov 10, 2025

ztxz16 / fastllm

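fastllm is a high-performance large-model inference library with no backend dependencies. It supports both tensor-parallel inference for dense models and mixed-mode inference for MoE models; any GPU with 10 GB or more of memory can run the full-size DeepSeek model. A dual-socket 9004/9005 server with a single GPU can serve the original full-size, full-precision DeepSeek model at 20 tps for a single request; the INT4-quantized model reaches 30 tps for a single request and 60+ tps with multiple concurrent requests.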

C++ 4,068 412 Updated Oct 28, 2025

tile-ai / tilelang

Domain-specific language designed to streamline the development of high-performance GPU/CPU/accelerator kernels

C++ 3,880 303 Updated Nov 10, 2025

cactus-compute / cactus

Kernels & AI inference engine for phones

C++ 3,692 216 Updated Nov 9, 2025

yandex / perforator

Perforator is a cluster-wide continuous profiling tool designed for large data centers

C++ 3,346 149 Updated Nov 10, 2025

unum-cloud / USearch

Fast Open-Source Search & Clustering engine × for Vectors & Arbitrary Objects × in C++, C, Python, JavaScript, Rust, Java, Objective-C, Swift, C#, GoLang, and Wolfram 🔍

C++ 3,245 233 Updated Oct 29, 2025

b4rtaz / distributed-llama

Distributed LLM inference. Connect home devices into a powerful cluster to accelerate LLM inference. More devices means faster inference.

C++ 2,733 191 Updated Nov 2, 2025

kpu / kenlm

KenLM: Faster and Smaller Language Model Queries

C++ 2,686 532 Updated Mar 30, 2025

PRBonn / kiss-icp

A LiDAR odometry pipeline that just works

C++ 1,977 399 Updated Oct 29, 2025

AlibabaResearch / AdvancedLiterateMachinery

A collection of original, innovative ideas and algorithms towards Advanced Literate Machinery. This project is maintained by the OCR Team in the Language Technology Lab, Tongyi Lab, Alibaba Group.

C++ 1,794 201 Updated Apr 9, 2025

kevmo314 / scuda

SCUDA is a GPU over IP bridge allowing GPUs on remote machines to be attached to CPU-only machines.

C++ 1,776 73 Updated Jun 16, 2025

zhihu / ZhiLight

A highly optimized LLM inference acceleration engine for Llama and its variants.

C++ 903 103 Updated Jul 10, 2025

SeshurajuP seshurajup

Lists (2)

Information Retrieval

Tuning

Stars

ggml-org / llama.cpp

nomic-ai / gpt4all

ggml-org / whisper.cpp

typesense / typesense