eBPF-powered network observability for Kubernetes. Indexes L4/L7 traffic with full K8s context, decrypts TLS without keys. Queryable by AI agents via MCP and humans via dashboard.
Updated Apr 10, 2026 - Go
Site reliability engineering (SRE) is a set of principles and practices that incorporates aspects of software engineering and applies them to infrastructure and operations problems. The main goal is to create scalable and highly reliable software systems. Site reliability engineering is closely related to DevOps, a set of practices that combines software development and IT operations, and SRE has also been described as a specific implementation of DevOps.
Terraform Pull Request Automation
Coroot is an open-source observability and APM tool with AI-powered Root Cause Analysis. It combines metrics, logs, traces, continuous profiling, and SLO-based alerting with predefined dashboards and inspections.
[Moved to cloudprober/cloudprober] Active monitoring software to detect failures before your customers do.
Layerform helps engineers create reusable environment stacks using plain .tf files. Ideal for multiple "staging" environments.
Kubernetes utility for exposing image versions in use, compared to latest available upstream, as metrics.
Infrastructure-as-Code Platform Built for the Future
Active monitoring software to detect failures before your customers do.
A blazing fast tool for building data pipelines: read, process and output events. Our community: https://t.me/file_d_community
Squzy is a high-performance open-source monitoring, incident, and alerting system written in Golang with Bazel and love. Welcome to free SRE.
Automatically capture and surface your team's tribal knowledge
Automatic SRE Superpowers within your Kubernetes cluster
Modern TCP tool and service for network performance observability.
preq is the community-driven problem detector for Common Reliability Enumerations (CREs)⚡️
Autonomous systems engineering CLI agent for any cloud environment: AWS, GCP, Cloudflare, etc.
Slo-exporter computes standardized SLI and SLO metrics based on events coming from various data sources.
Marmot workflow execution engine