Repositories starred by prravda

Achieve state of the art inference performance with modern accelerators on Kubernetes

Shell 2,897 385 Updated Apr 3, 2026

vLLM’s reference system for K8S-native cluster-wide deployment with community-driven performance optimization

Python 2,256 382 Updated Mar 31, 2026

eBPF-based autoinstrumentation of web applications and network metrics

Go 1,962 169 Updated Apr 2, 2026

FlashMLA: Efficient Multi-head Latent Attention Kernels

C++ 12,550 1,005 Updated Mar 31, 2026

A high-performance distributed file system designed to address the challenges of AI training and inference workloads.

C++ 9,789 1,027 Updated Mar 30, 2026

rvLLM: High-performance LLM inference in Rust. Drop-in vLLM replacement.

Rust 356 34 Updated Apr 2, 2026

DeepGEMM: clean and efficient FP8 GEMM kernels with fine-grained scaling

Cuda 6,308 851 Updated Mar 22, 2026

An open-source long-horizon SuperAgent harness that researches, codes, and creates. With the help of sandboxes, memories, tools, skill, subagents and message gateway, it handles different levels of…

Python 57,086 7,041 Updated Apr 3, 2026

Mooncake is the serving platform for Kimi, a leading LLM service provided by Moonshot AI.

C++ 5,032 652 Updated Apr 3, 2026

Machine Learning Engineering Open Book

Python 17,606 1,117 Updated Mar 16, 2026

A Flexible Framework for Experiencing Heterogeneous LLM Inference/Fine-tune Optimizations

Python 16,888 1,251 Updated Apr 3, 2026

Lightpanda: the headless browser designed for AI and automation

Zig 26,856 1,094 Updated Apr 3, 2026

A vulnerability scanner for container images and filesystems

Go 11,936 770 Updated Apr 3, 2026

The easiest way to serve AI apps and models - Build Model Inference APIs, Job queues, LLM apps, Multi-model pipelines, and more!

Python 8,556 945 Updated Apr 2, 2026

ODIN [for Codex CLI as a plugin] - Outline Driven development approach for agentic INtelligence

Python 9 2 Updated Mar 20, 2026

AI agents running research on single-GPU nanochat training automatically

Python 64,710 9,195 Updated Mar 26, 2026

A high-throughput and memory-efficient inference and serving engine for LLMs

Python 75,119 15,122 Updated Apr 3, 2026

AI Observability & Evaluation

Jupyter Notebook 9,150 792 Updated Apr 3, 2026

Lightweight static analysis for many languages. Find bug variants with patterns that look like source code.

OCaml 14,653 903 Updated Apr 3, 2026

🔥 xCrash gives Android apps the ability to capture Java crashes, native crashes, and ANRs. No root or other system permissions are required.

C 3,931 651 Updated Jun 27, 2025

Debug, evaluate, and monitor your LLM applications, RAG systems, and agentic workflows with comprehensive tracing, automated evaluations, and production-ready dashboards.

Python 18,618 1,417 Updated Apr 3, 2026

⚡A CLI tool for code structural search, lint and rewriting. Written in Rust

Rust 13,267 338 Updated Apr 3, 2026

Web-based SQLite database browser written in Python

Python 4,059 394 Updated Mar 20, 2026

📑 PageIndex: Document Index for Vectorless, Reasoning-based RAG

Python 23,787 1,971 Updated Mar 29, 2026

A Python library for extracting structured information from unstructured text using LLMs with precise source grounding and interactive visualization.

Python 35,253 2,385 Updated Apr 2, 2026

Financial data platform for analysts, quants and AI agents.

Python 65,216 6,449 Updated Apr 3, 2026

Fastest enterprise AI gateway (50x faster than LiteLLM) with adaptive load balancer, cluster mode, guardrails, 1000+ models support & <100 µs overhead at 5k RPS.

Go 3,473 371 Updated Apr 3, 2026

System Level Intelligent Router for Mixture-of-Models at Cloud, Data Center and Edge

Go 3,591 594 Updated Apr 3, 2026

Python SDK, Proxy Server (AI Gateway) to call 100+ LLM APIs in OpenAI (or native) format, with cost tracking, guardrails, load balancing and logging. [Bedrock, Azure, OpenAI, VertexAI, Cohere, Anthr…

Python 42,042 6,961 Updated Apr 3, 2026

The open source coding agent.

TypeScript 136,241 14,867 Updated Apr 3, 2026