Senior Site Reliability Engineer · Intuit · San Diego, CA
SRE with 8+ years of experience designing and operating large-scale distributed systems. I focus on Kubernetes, chaos engineering, and GitOps, building platforms that keep millions of users online.
- Platform Reliability — Kubernetes-based failover platform serving 100+ microservices
- Chaos Engineering — GameDay orchestration framework testing resilience of 150+ applications
- Observability — SLO/SLI frameworks, Prometheus, Grafana, OpenTelemetry
- Met a 99.95% uptime SLA through 24/7 on-call coverage, incident response, and root cause analysis
Open to connecting about reliability engineering, Kubernetes, or chaos engineering.