Alibaba Cloud
- hz
Stars
Persist and reuse KV Cache to speed up your LLM.
A course on LLM inference serving on Apple Silicon for systems engineers: build a tiny vLLM + Qwen.
A Datacenter Scale Distributed Inference Serving Framework
SGLang is a fast serving framework for large language models and vision language models.
A high-throughput and memory-efficient inference and serving engine for LLMs
Serverless LLM Serving for Everyone.
Apache Fluss is a streaming storage built for real-time analytics.
Supercharge Your LLM with the Fastest KV Cache Layer
Mooncake is the serving platform for Kimi, a leading LLM service provided by Moonshot AI.
Open Lakehouse Format for Multimodal AI. Convert from Parquet in 2 lines of code for 100x faster random access, vector index, and data versioning. Compatible with Pandas, DuckDB, Polars, Pyarrow, a…
JuiceFS is a distributed POSIX file system built on top of Redis and S3.
Apache Paimon is a lake format that enables building a Realtime Lakehouse Architecture with Flink and Spark for both streaming and batch operations.
DoctorK is a service for Kafka cluster auto healing and workload balancing
Chinese translation of Designing Data-Intensive Applications (DDIA), first and second editions
Kubernetes CSI driver for LVM on shared disks
Open source Java implementation of the Raft consensus protocol.
Redpanda is a streaming data platform for developers. Kafka API compatible. 10x faster. No ZooKeeper. No JVM!
Cruise-control is the first of its kind to fully automate the dynamic workload rebalance and self-healing of a Kafka cluster. It provides great value to Kafka users by simplifying the operation of …
AutoMQ is a diskless Kafka® on S3. 10x Cost-Effective. No Cross-AZ Traffic Cost. Autoscale in seconds. Single-digit ms latency. Multi-AZ Availability.
Learn how to design large-scale systems. Prep for the system design interview. Includes Anki flashcards.
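Immersive bilingual web page translation extension; supports input box translation, mouse hover translation, and translation of PDF, EPUB, subtitle, and TXT files.
🤖 Assemble, configure, and deploy autonomous AI Agents in your browser. Deploy your private AutoGPT web application for free with one click.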
A Java library to perform direct I/O in Linux, bypassing file page cache.
A Java Direct IO framework which is very simple to use.
Source code for the X Recommendation Algorithm