This is a mono repository for my home infrastructure and Kubernetes cluster. I try to adhere to Infrastructure as Code (IaC) and GitOps practices using tools like Kubernetes, Flux, Renovate, and GitHub Actions.
My Kubernetes cluster is deployed with Talos. It is a semi-hyper-converged cluster: workloads and block storage share the same resources on my nodes.
There is a template over at onedr0p/cluster-template if you want to try and follow along with some of the practices I use here.
- Networking & Service Mesh: cilium provides eBPF-based networking, while envoy gateway powers service-to-service communication with L7 proxying and traffic management. cloudflared secures ingress traffic via Cloudflare, and external-dns keeps DNS records in sync automatically. multus enables attaching multiple network interfaces to pods, making it possible to connect workloads to different VLANs or networks simultaneously.
- Security & Secrets: cert-manager automates SSL/TLS certificate management. For secrets, I use external-secrets with 1Password Connect to inject secrets into Kubernetes.
- Storage & Data Protection: rook provides distributed storage for persistent volumes, with volsync handling backups and restores. spegel improves reliability by running a stateless, cluster-local OCI image mirror.
- Automation & CI/CD: actions-runner-controller runs self-hosted GitHub Actions runners directly in the cluster for continuous integration workflows.
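As a rough illustration of the secrets flow mentioned in the list above, an ExternalSecret backed by 1Password Connect might look something like the sketch below. The store name `onepassword`, the app name, and the 1Password item title are placeholders, not values from this repository. The `external-secrets.io/v1` API version is also an assumption about the installed operator release; older releases serve the same resource as `v1beta1`.

```yaml
# Hypothetical ExternalSecret; all names are placeholders.
apiVersion: external-secrets.io/v1
kind: ExternalSecret
metadata:
  name: example-app
spec:
  refreshInterval: 1h
  secretStoreRef:
    kind: ClusterSecretStore
    name: onepassword          # assumed name of the 1Password Connect store
  target:
    name: example-app-secret   # Kubernetes Secret that gets created
  dataFrom:
    - extract:
        key: example-app       # 1Password item title (placeholder)
```

With `dataFrom.extract`, each field of the referenced item becomes a key in the resulting Secret, which keeps the manifest free of per-field mappings.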
Flux watches the clusters in my kubernetes folder (see Directories below) and makes the changes to my clusters based on the state of my Git repository.
Flux recursively searches the kubernetes/apps folder until it finds the top-most kustomization.yaml in each directory, then applies all the resources listed in it. That kustomization.yaml generally contains only a namespace resource and one or more Flux Kustomizations (ks.yaml). Each of those Flux Kustomizations in turn applies a HelmRelease or other resources related to the application.
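A minimal sketch of that layout, assuming a `default` namespace directory that contains a single app: the first document is the top-level kustomization.yaml, the second is the Flux Kustomization (ks.yaml) it points at. The paths, the `flux-system` GitRepository name, and the interval are illustrative assumptions rather than copies of the files in this repository.

```yaml
# kubernetes/apps/default/kustomization.yaml (hypothetical path)
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
resources:
  - ./namespace.yaml   # the Namespace resource
  - ./atuin/ks.yaml    # one Flux Kustomization per app
---
# kubernetes/apps/default/atuin/ks.yaml (hypothetical path)
apiVersion: kustomize.toolkit.fluxcd.io/v1
kind: Kustomization
metadata:
  name: atuin
  namespace: flux-system
spec:
  interval: 1h
  path: ./kubernetes/apps/default/atuin/app  # directory holding the HelmRelease
  prune: true                                # remove resources deleted from Git
  sourceRef:
    kind: GitRepository
    name: flux-system                        # assumed name of the Git source
  targetNamespace: default
  wait: false
```

Keeping the namespace-level kustomization.yaml this thin means adding an app is usually a one-line change there plus a new app directory.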
Renovate watches my entire repository for dependency updates; when one is found, a PR is automatically created. When PRs are merged, Flux applies the changes to my cluster.
This Git repository contains the following directories under kubernetes.
📁 kubernetes
├── 📁 apps           # applications
├── 📁 components     # reusable kustomize components
└── 📁 flux           # flux system configuration
This is a high-level look at how Flux deploys my applications with dependencies. In most cases a HelmRelease will depend on other HelmReleases, in other cases a Kustomization will depend on other Kustomizations, and in rare situations an app can depend on both a HelmRelease and a Kustomization. The example below shows that atuin won't be deployed or upgraded until the rook-ceph-cluster Helm release is installed and in a healthy state.
graph TD
A>Kustomization: rook-ceph] -->|Creates| B[HelmRelease: rook-ceph]
A>Kustomization: rook-ceph] -->|Creates| C[HelmRelease: rook-ceph-cluster]
C>HelmRelease: rook-ceph-cluster] -->|Depends on| B>HelmRelease: rook-ceph]
D>Kustomization: atuin] -->|Creates| E(HelmRelease: atuin)
E>HelmRelease: atuin] -->|Depends on| C>HelmRelease: rook-ceph-cluster]
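The "Depends on" edge from atuin to rook-ceph-cluster in the graph above could be declared roughly as follows. This is a sketch, not the actual manifest: the `rook-ceph` namespace, the OCIRepository chart source, and the interval are assumptions.

```yaml
# Hypothetical HelmRelease for atuin; only dependsOn mirrors the graph.
apiVersion: helm.toolkit.fluxcd.io/v2
kind: HelmRelease
metadata:
  name: atuin
spec:
  interval: 1h
  chartRef:
    kind: OCIRepository
    name: atuin                # assumed chart source name
  dependsOn:
    - name: rook-ceph-cluster  # must be ready before atuin is reconciled
      namespace: rook-ceph     # assumed namespace of that HelmRelease
```

Flux holds back reconciliation of a HelmRelease until every entry in `dependsOn` reports ready, which is what produces the ordering shown in the graph.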
High-level network diagram:
graph LR
%% Class Definitions
classDef isp fill:#f87171,stroke:#fff,stroke-width:2px,color:#fff,font-weight:bold;
classDef core fill:#60a5fa,stroke:#fff,stroke-width:2px,color:#fff,font-weight:bold;
classDef switch fill:#a78bfa,stroke:#fff,stroke-width:2px,color:#fff,font-weight:bold;
classDef device fill:#facc15,stroke:#fff,stroke-width:2px,color:#000,font-weight:bold;
classDef vlan fill:#1f2937,stroke:#fff,stroke-width:1px,color:#fff,font-size:12px;
%% Nodes
ISP[Brightspeed<br/>1Gbps WAN]:::isp
UDM[📦 UDM Pro]:::core
K8s[☸️ Kubernetes<br/>3 Nodes]:::device
USW[16 Port<br/>2.5G PoE]:::switch
DEV[💻 Devices]:::device
WIFI[📶 WiFi Clients]:::device
%% Subgraph for VLANs
subgraph VLANs [LAN +vlan]
direction TB
DEFAULT[DEFAULT<br/>192.168.1.0/24]:::vlan
TRUSTED[TRUSTED*<br/>192.168.10.0/24]:::vlan
SERVERS[SERVERS*<br/>192.168.42.0/24]:::vlan
GUEST[GUEST*<br/>192.168.50.0/24]:::vlan
SERVICES[SERVICES*<br/>192.168.69.0/24]:::vlan
IOT[IOT*<br/>192.168.70.0/24]:::vlan
WIREGUARD[WIREGUARD*<br/>192.168.80.0/24]:::vlan
end
style VLANs fill:#111,stroke:#fff,stroke-width:2px,rx:0,ry:0,padding:20px;
%% Links
SERVERS -.-> ISP
ISP -.->|WAN| UDM
UDM -- 10G --- USW
USW -- 2.5G --> K8s
USW --> DEV
USW --> WIFI
%% Keep SERVERS->ISP as a hidden layout constraint and style bonded links thicker
linkStyle 0 stroke:transparent,stroke-width:0px,color:transparent;
linkStyle 2 stroke-width:4px;
linkStyle 3 stroke-width:4px;
linkStyle 4 stroke-width:2px;
linkStyle 5 stroke-width:4px;
While most of my infrastructure and workloads are self-hosted, I do rely on the cloud for certain key parts of my setup. This saves me from having to worry about three things: (1) dealing with chicken/egg scenarios, (2) services I critically need whether my cluster is online or not, and (3) the "hit by a bus" factor: what happens to critical apps (e.g. email, password manager, photos) that my family relies on when I am no longer around.
Alternative solutions to the first two of these problems would be to host a Kubernetes cluster in the cloud and deploy applications like HCVault, Vaultwarden, ntfy, and Gatus; however, maintaining another cluster and monitoring another group of workloads would be more work and would probably cost as much as, or more than, the services described below.
| Service | Use | Cost |
|---|---|---|
| 1Password | Secrets with External Secrets | ~$65/yr |
| Cloudflare | Domain and S3 | ~$40/yr |
| GitHub | Hosting this repository and continuous integration/deployments | Free |
| Pushover | Kubernetes Alerts and application notifications | $5 one-time |
| UptimeRobot | Monitoring internet connectivity and external facing applications | Free |
| | | Total: ~$9/mo |
In my cluster there are two instances of ExternalDNS running. One syncs private DNS records to my UDM Pro using the ExternalDNS webhook provider for UniFi, while the other syncs public DNS records to Cloudflare. This setup is managed by creating ingresses with one of two classes: internal for private DNS and external for public DNS. Each external-dns instance then syncs the DNS records to its respective platform.
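As a hedged example of the class-based split described above, an Ingress intended for public DNS might look like this. The app name, namespace, hostname, and service port are placeholders, and each ExternalDNS instance is assumed to be configured to watch only its own class.

```yaml
# Hypothetical Ingress; only the class names come from the text above.
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: example-app
  namespace: default
spec:
  ingressClassName: external   # use "internal" to publish to private DNS instead
  rules:
    - host: app.example.com    # placeholder hostname
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: example-app
                port:
                  number: 80
```

Switching an app between public and private exposure is then a one-field change to `ingressClassName`.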
Dell OptiPlex 5080 Micro (i5-10600T) × 1 · 64 GB RAM · Talos / Kubernetes
- OS: 512 GB AirDisk P10 NVMe (2280)
- Rook-Ceph: 800 GB Micron 5100 PRO SATA SSD
Dell OptiPlex 3090 Micro (i5-10500T) × 1 · 64 GB RAM · Talos / Kubernetes
- OS: 256 GB Western Digital PC SN520 NVMe (2230)
- Rook-Ceph: 800 GB Micron 5100 PRO SATA SSD
Dell OptiPlex 3090 Micro (i5-10500T) × 1 · 64 GB RAM · Talos / Kubernetes
- OS: 500 GB PNY CS2140 NVMe (2280)
- Rook-Ceph: 800 GB Micron 5100 PRO SATA SSD
- UDM Pro: router & NVR · 1 × 10 TB HGST Ultrastar He10 HDD
- USW Pro Max 16 PoE: 2.5 G PoE++ switch
Thanks to all the people who donate their time to the Home Operations Discord community. Be sure to check out kubesearch.dev for ideas on how to deploy applications or on what you could deploy.