Generate the complete reliability stack from a service spec in 5 minutes. Dashboards, alerts, SLOs, PagerDuty - zero toil.
Updated Apr 12, 2026 - Python
A Python-based application to provision stacks of SLO CloudWatch alarms on AWS services using AWS CDK.
Calculate SLI/SLO metrics from ZMON's timeseries data
[DEPRECATED] Moved to microsoft/agent-governance-toolkit
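End-to-end SLO (Service Level Objective) governance platform that integrates MeterSphere synthetic-monitoring data, automatically calculates error budgets, and produces AI-assisted structured diagnostic reports.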
Open-source uptime monitoring: multi-probe geographic correlation, browser scenario recorder, real-time dashboard, teams RBAC, SSO/OIDC, public status pages. One Docker command to deploy.
Elastic Observability SLO upgrade helper (8.11.x -> 8.12)
Mini observability platform for K8s: metrics, probes, SLA/SLO (Prometheus).
Ready-to-run observability starter (Prometheus/Alertmanager/Grafana + FastAPI RED demo). Includes dashboards, demo/prod alert rules, and SLO/Runbook templates for fast pilots.
Datadog AIOps observability stack: automated alert triage, anomaly detection, SLO/SLA dashboards, and LLM-powered RCA summaries.
Lightweight logging gateway prototype built with FastAPI. Accepts structured log messages over HTTP and provides a foundation for further DevOps observability experiments.
A complete data governance platform for managing data contracts, running tests, validating SLOs, and tracking data quality — all with dbt and Databricks
SLO evaluation, burn-rate alerting, incident management, and postmortem automation for a multi-node homelab.
CLI tool that generates load test plans (steady, burst, soak) from service profiles and SLOs, with metrics interpretation and evidence logging.
Production Engineering incident-response lab: SLOs, burn-rate alerts, runbooks, capacity planning, postmortems, change safety
Full-stack observability platform with Prometheus, Grafana, ELK, OpenTelemetry, SLI/SLO dashboards, and chaos engineering
Find and duplicate transformer reasoning circuits to boost LLM logic without training or weight changes
Python-based custom Agent checks (metrics) for Datadog that calculate the time elapsed since noon today, the beginning of the month (bom), and the beginning of the year (boy).