Stars
Fully local, private, and cross-platform speech-to-text with LLM post-processing
Run, manage, and scale AI workloads on any AI infrastructure. Use one system to access and manage all AI compute (Kubernetes, 20+ clouds, or on-prem).
Robyn is a super-fast async Python web framework with a Rust runtime.
HTTP routing and request-handling library for Rust that focuses on ergonomics and modularity
A tool for creating and running Linux containers using lightweight virtual machines on a Mac. It is written in Swift, and optimized for Apple silicon.
Kernels & AI inference engine for mobile devices.
An extremely fast Python type checker and language server, written in Rust.
Pruna is a model optimization framework built for developers, enabling you to deliver faster, more efficient models with minimal overhead.
On-device AI for PyTorch across mobile, embedded, and edge devices
Inference server benchmarking tool
A curated list of materials on AI efficiency
A language server for Zig supporting developers with features like autocomplete and goto definition
Lightpanda: the headless browser designed for AI and automation
vLLM’s reference system for K8S-native cluster-wide deployment with community-driven performance optimization