Stars
verl: Volcano Engine Reinforcement Learning for LLMs
Probably the fastest coroutine lib in the world!
An easy-to-use framework for large-scale recommendation algorithms.
My learning notes for ML SYS.
Golang implementation of the Raft consensus protocol
A highly customizable homepage (or startpage / application dashboard) with Docker and service API integrations.
Drogon: A C++14/17/20 based HTTP web application framework running on Linux/macOS/Unix/Windows
Gateway API Inference Extension
SGLang is a fast serving framework for large language models and multi-modality models.
FlashInfer: Kernel Library for LLM Serving
Transformer Explained Visually: Learn How LLM Transformer Models Work with Interactive Visualization
Implement a ChatGPT-like LLM in PyTorch from scratch, step by step
Ultimate camera streaming application with support for RTSP, RTMP, HTTP-FLV, WebRTC, MSE, HLS, MP4, MJPEG, HomeKit, FFmpeg, etc.
A flexible distributed key-value database that is optimized for caching and other realtime workloads.
C++ implementation of a fast hash map and hash set using robin hood hashing
We want to create a repo to illustrate the usage of transformers in Chinese.
A lightweight library for portable low-level GPU computation using WebGPU.
Hackable and optimized Transformers building blocks, supporting a composable construction.
LMDeploy is a toolkit for compressing, deploying, and serving LLMs.
Deploy a Kubernetes cluster step by step with me
A feature-complete, high-performance multi-group Raft library in Go.
An LLM playground you can run on your laptop
Swap GPT for any LLM by changing a single line of code. Xinference lets you run open-source, speech, and multimodal models on cloud, on-prem, or your laptop — all through one unified, production-re…
Cross-platform, customizable ML solutions for live and streaming media.