Stars
Collection of publicly available libraries
High-performance Rust benchmark client for vLLM serving endpoints.
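A skill pack for web fiction and novel writing, covering the full workflow for long- and short-form web novels: ranking-chart scanning, story breakdown, writing, removing AI-sounding prose, and cover art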
Flash OS images to SD cards & USB drives, safely and easily.
Build llama inference compute from scratch, using only torch/numpy base ops
A unified library for building, evaluating, and storing speculative decoding algorithms for LLM inference in vLLM
Persist and reuse KV Cache to speed up your LLM.
A browser automation framework and ecosystem.
LeaderWorkerSet: An API for deploying a group of pods as a unit of replication
A machine learning-based video super resolution and frame interpolation framework. Est. Hack the Valley II, 2018.
Development repository for the Triton language and compiler
Production-Grade Container Scheduling and Management
Use Claude Code as the foundation for coding infrastructure, allowing you to decide how to interact with the model while enjoying updates from Anthropic.
Qwen3-TTS is an open-source series of TTS models developed by the Qwen team at Alibaba Cloud, supporting stable, expressive, and streaming speech generation, free-form voice design, and vivid voice…
Apache Spark - A unified analytics engine for large-scale data processing
Conditional Memory via Scalable Lookup: A New Axis of Sparsity for Large Language Models
Deep learning for image processing, including classification, object detection, etc.
Trinity-RFT is a general-purpose, flexible, and scalable framework designed for reinforcement fine-tuning (RFT) of large language models (LLMs).
FlashInfer: Kernel Library for LLM Serving
A high-performance and light-weight router for vLLM large scale deployment
Open & Reproducible Research for Tracking VLAs
System Level Intelligent Router for Mixture-of-Models at Cloud, Data Center and Edge