Starred repositories
VRAFT is a framework written in C++ that implements the RAFT protocol and the SEDA architecture. Distributed software, such as vectordb and distributed storage systems, can be developed easily on top of VRAFT.
Democratizing large model inference and training on any device.
C++ implementation of a fast hash map and hash set using robin hood hashing
Mooncake is the serving platform for Kimi, a leading LLM service provided by Moonshot AI.
Standardized Distributed Generative and Predictive AI Inference Platform for Scalable, Multi-Framework Deployment on Kubernetes
vLLM’s reference system for K8S-native cluster-wide deployment with community-driven performance optimization
A transformer-based LLM, written completely in Rust.
AKG (Auto Kernel Generator) is an optimizer for operators in Deep Learning Networks, which provides the ability to automatically fuse ops with specific patterns.
Embeddable Postgres with real-time, reactive bindings.
Integrates DuckDB with Google BigQuery, allowing direct querying and management of BigQuery datasets
Course to get into Large Language Models (LLMs) with roadmaps and Colab notebooks.
A Unified and Flexible Inference Engine with Hybrid Cache Acceleration and Parallelism for 🤗Diffusers.
Supercharge Your LLM with the Fastest KV Cache Layer
Official Repository of "LLM × DATA" Survey Paper
Scalable, fast, and disk-friendly vector search in Postgres, the successor of pgvecto.rs.
Yet Another Language Model: LLM inference in C++/CUDA, no libraries except for I/O
Disaggregated serving system for Large Language Models (LLMs).
🐸 Read Frog (陪读蛙) - Open Source Immersive Translate
The lance extensions for DuckDB enable reading and writing of lance tables.
A high-throughput and memory-efficient inference and serving engine for LLMs
Diffusion model (SD, Flux, Wan, Qwen Image, ...) inference in pure C/C++
DINOv2 inference engine written in C/C++ using ggml and OpenCV.
Eliminates the delay when activating Caps Lock on macOS