Stars
Techniques and numbers for estimating a system's performance from first principles
AI agents running research on single-GPU nanochat training automatically
Fully autonomous & self-evolving research from idea to paper. Chat an Idea. Get a Paper. 🦞
A vector indexing library to bring fast, fresh and filtered search to your database
Fast, small, and fully autonomous AI personal assistant infrastructure, any OS, any platform — deploy anywhere, swap anything 🦀
Get started here. The main repo for the whole open-rdma project, including an introduction, a hands-on guide, new events, and more.
Asterinas aims to be a production-grade Linux alternative—memory safe, high-performance, and more.
Provides pre-built flash-attention 2 and 3 package wheels for Linux and Windows using GitHub Actions
Manage your dotfiles across multiple diverse machines, securely.
A compact implementation of SGLang, designed to demystify the complexities of modern LLM serving systems.
A high-performance distributed file system designed to address the challenges of AI training and inference workloads.
A scheduling framework for multitasking over diverse XPUs, including GPUs, NPUs, ASICs, and FPGAs
FalconFS is a high-performance distributed file system (DFS) designed for AI workloads.
A GPU benchmark tool for evaluating GPUs and CPUs on mixed operational intensity kernels (CUDA, OpenCL, HIP, SYCL, OpenMP)
High-Throughput, Cost-Effective Billion-Scale Vector Search with a Single GPU [SIGMOD'26]
A low-latency, billion-scale, and updatable graph-based vector store on SSD.
PiKV: KV Cache Management System for Mixture of Experts [Efficient ML System]
Distributed KV cache scheduling & offloading libraries
Domain-specific language designed to streamline the development of high-performance GPU/CPU/accelerator kernels
FlashInfer: Kernel Library for LLM Serving