Site Reliability Engineer • Platform / Infrastructure • Developer Experience
Designing, automating, and operating reliable systems for fast-moving teams.
I’m a Site Reliability Engineer focused on building and running platforms that let developers ship quickly without sacrificing reliability. Most of my work lives at the intersection of:
- Kubernetes-based platforms (cloud + on-prem + homelab)
- GitOps, infra-as-code, and supply-chain security
- Observability, SLOs, and incident response
- Developer experience and internal tooling
Recently, I’ve been:
- Building a developer platform (Catalyst Forge) for a small team
- Experimenting with multi-cluster Kubernetes across cloud and homelab hardware
- Tightening CI/CD pipelines around Go, Python, and platform code
- Exploring the Cosmos SDK and blockchain infra from an SRE perspective
Platform & SRE
- Design and operate Kubernetes clusters (dev/preprod/prod/preview)
- Implement GitOps workflows (Argo CD / Flux) and environment promotion
- Automate provisioning with Terraform / Pulumi and cloud-native tooling
Reliability & Observability
- Define and track SLOs and error budgets
- Build observability stacks with Prometheus, Loki, Tempo, Grafana, etc.
- Improve on-call quality via actionable alerts & runbooks
Developer Experience
- Build CLIs, templates, and golden paths for application teams
- Standardize CI/CD (GitHub Actions, pipelines, policy-as-code)
- Help teams adopt best practices without blocking delivery
Homelab & experimentation
- Run lab clusters on Minisforum and similar hardware with Proxmox and Talos
- Prototype ideas before they graduate to production platforms
How I work:
- Design first: I like to capture architecture, trade-offs, and failure modes before cementing them in code.
- Bias toward automation: Anything that’s done twice should probably become a script, a job, or a controller.
- Operational empathy: Tooling should make life better for the people on call, not worse.
- Measured change: Feature flags, progressive delivery, and clear rollback paths over “YOLO deploys”.
- Clear communication: I prefer detailed PRs, explicit docs, and transparent incident write-ups.
I maintain and contribute to open source projects when I can, especially around:
- Developer tooling / CLIs
- Platform / infra patterns
- Homelab and experimentation setups
If you use any of my projects and hit a problem or have an idea, issues and PRs are always welcome.
The best ways to get in touch:
- LinkedIn: linkedin.com/in/jmgilman
- Website: jmgilman.com
I’m open to discussions about SRE / platform / infra roles, especially where a team needs help building a robust, developer-friendly platform.