Stars
openpilot is an operating system for robotics. Currently, it upgrades the driver assistance system on 300+ supported cars.
Community maintained hardware plugin for vLLM on Spyre
A framework for few-shot evaluation of language models.
Qwen3-omni is a natively end-to-end, omni-modal LLM developed by the Qwen team at Alibaba Cloud, capable of understanding text, audio, images, and video, as well as generating speech in real time.
Visual testing tool for MCP servers
GitHub Action for interacting with kubectl (k8s)
Runner Container Hooks for GitHub Actions
Real-time & local speech-to-text server.
DeepEP: an efficient expert-parallel communication library
Evaluate and Enhance Your LLM Deployments for Real-World Inference Needs
Potabk / vllm-ascend
Forked from vllm-project/vllm-ascend
Community maintained hardware plugin for vLLM on Ascend
Supercharge Your LLM with the Fastest KV Cache Layer
gpt-oss-120b and gpt-oss-20b are two open-weight language models by OpenAI
pkking / vllm-ascend
Forked from cllouud/vllm-ascend
Community maintained hardware plugin for vLLM on Ascend
Trojan Client for macOS, ported from ShadowsocksX-NG. Please use it in compliance with laws, regulations and rules.
🌟 100+ original LLM / RL principle diagrams 📚 from the author of 《大模型算法》 (Large Model Algorithms)! 💥 (100+ LLM/RL Algorithm Maps)
PyTorch Re-Implementation of "The Sparsely-Gated Mixture-of-Experts Layer" by Noam Shazeer et al. https://arxiv.org/abs/1701.06538
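Ascend DRA driver: a Kubernetes Dynamic Resource Allocation (DRA) driver implementation designed for Huawei Ascend AI processors. Community developers are welcome to use, contribute to, and improve it, and to help build a more efficient resource scheduling framework for AI accelerator cards. It supports flexible resource management from a single card to multi-card clusters and is suited to AI training and inference deployments.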
Ray is an AI compute engine. Ray consists of a core distributed runtime and a set of AI Libraries for accelerating ML workloads.
A static site generator for data apps, dashboards, reports, and more. Observable Framework combines JavaScript on the front-end for interactive graphics with any language on the back-end for data a…
Community maintained hardware plugin for vLLM on Ascend
A high-throughput and memory-efficient inference and serving engine for LLMs
A generative speech model for daily dialogue.