Tags: decocms/operator
feat(decoredirect): auto-heal failed certs and add retry-cert API route (#19)
* feat(decoredirect): auto-heal failed certs and add retry-cert API route
When a Certificate enters cert-manager's exponential backoff (Issuing=False, Reason=Failed), the controller now automatically detects it and checks whether the domain DNS is correctly pointing to Deco's redirect infrastructure. If both an HTTP check (X-Redirect-By: deco) and an AAAA check (no GCP 2600:1901::/32 range) pass, the Certificate is deleted so cert-manager retries without backoff.
Also adds POST /redirects/{domain}/retry-cert API route that performs the same DNS checks and forces an immediate retry for operators who don't want to wait for the next 30s reconcile cycle.
Root cause addressed: domains migrating from Deno Deploy sometimes retain AAAA records in GCP range (2600:1901::/32). cert-manager's self-check uses IPv4 and passes, but Let's Encrypt validates via IPv6, hits Deno Deploy, and fails, leaving the Certificate stuck in multi-hour backoff.
* fix(decoredirect): skip cert mutation while DeletionTimestamp is set
* fix(lint): use _ to discard resp.Body.Close error (errcheck)
* feat(decoredirect): remove retry-cert endpoint: controller auto-heals
* test(decoredirect): add auto-healing scenarios and make DNSReadyFunc injectable
Tests cover:
- cert Failed + DNS ready → cert deleted (healed)
- cert Failed + DNS wrong → cert untouched
- cert Issuing=True → cert untouched (noop)
- cert Ready=True → cert untouched (noop)
- cert doesn't exist → no error
Also skips healing when cert has DeletionTimestamp to avoid acting on a cert that is already being deleted.
* feat(decoredirect): make blocked IPv6 CIDRs configurable via --redirect-blocked-ipv6
Removes the hardcoded GCP/Deno Deploy IPv6 range (2600:1901::/32) and replaces it with a configurable list of blocked CIDRs. When empty (default), no AAAA check is performed.
Configure for Deco's deployment with: --redirect-blocked-ipv6=2600:1901::/32
Also accepts REDIRECT_BLOCKED_IPV6 env var.
* Revert "feat(decoredirect): make blocked IPv6 CIDRs configurable via --redirect-blocked-ipv6"
This reverts commit f626842.
* feat(decoredirect): make blocked IPv6 CIDRs configurable via --redirect-blocked-ipv6
Removes hardcoded GCP range. Configure blocked CIDRs via:
- --redirect-blocked-ipv6=2600:1901::/32 (flag)
- REDIRECT_BLOCKED_IPV6=2600:1901::/32 (env)
- redirect.blockedIPv6CIDRs in Helm values
Default is empty (no AAAA check).
* fix(helm): add blockedIPv6CIDRs arg to helm-generator
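The AAAA check described in #19 can be illustrated with a minimal Go sketch. This is not the operator's code: the function names `parseBlockedCIDRs` and `aaaaAllowed` are hypothetical, and the comma-separated list format for the flag value is an assumption. It only shows the idea that an empty list disables the check and that any AAAA address inside a blocked prefix fails it.

```go
package main

import (
	"fmt"
	"net/netip"
	"strings"
)

// parseBlockedCIDRs parses a list such as "2600:1901::/32".
// The comma-separated format is an assumption made for this sketch.
// An empty input yields an empty list, which disables the AAAA check.
func parseBlockedCIDRs(s string) ([]netip.Prefix, error) {
	var out []netip.Prefix
	for _, part := range strings.Split(s, ",") {
		part = strings.TrimSpace(part)
		if part == "" {
			continue
		}
		p, err := netip.ParsePrefix(part)
		if err != nil {
			return nil, fmt.Errorf("invalid CIDR %q: %w", part, err)
		}
		out = append(out, p.Masked())
	}
	return out, nil
}

// aaaaAllowed reports whether none of the resolved AAAA addresses
// fall inside a blocked prefix.
func aaaaAllowed(addrs []string, blocked []netip.Prefix) (bool, error) {
	for _, raw := range addrs {
		a, err := netip.ParseAddr(raw)
		if err != nil {
			return false, fmt.Errorf("invalid address %q: %w", raw, err)
		}
		for _, p := range blocked {
			if p.Contains(a) {
				return false, nil
			}
		}
	}
	return true, nil
}

func main() {
	blocked, err := parseBlockedCIDRs("2600:1901::/32")
	if err != nil {
		panic(err)
	}
	stale := []string{"2600:1901:0:6d85::"} // example address inside the blocked range
	ok, _ := aaaaAllowed(stale, blocked)
	fmt.Println(ok) // false: the stale AAAA record blocks healing
	ok, _ = aaaaAllowed(stale, nil)
	fmt.Println(ok) // true: empty list means no AAAA check
}
```

In a real controller the addresses would come from a DNS lookup of the redirect domain; they are passed in as strings here so the sketch stays self-contained.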
feat(decoredirect): add redirectCode field (301|307) and opt-in X-Redirect-By header (#17)
* docs: add spec for DecoRedirect redirectCode field and X-Redirect-By header
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* docs: make decoHeader opt-in in spec (open-source chart concern)
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* docs: make X-Redirect-By value configurable with default 'deco'
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* docs: add implementation plan for DecoRedirect redirectCode + X-Redirect-By header
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* feat(crd): add redirectCode field (enum 301|307, default 307) to DecoRedirect
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* feat(controller): set permanent-redirect-code annotation from spec.redirectCode
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* feat(api): add redirectCode to DecoRedirect request and response
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* docs(api): document that redirectCode validation is delegated to CRD schema
* feat(chart): add opt-in X-Redirect-By header via redirect.decoHeader
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* feat(chart): warn when decoHeader enabled but add-headers not configured
* fix(chart): generate configmap-redirect-custom-headers via helm generator
Prevents cleanTemplates() from deleting the hand-placed ConfigMap on every make generate run by adding an addRedirectCustomHeaders generator function.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* chore: remove superpowers plans and spec docs
* chore(chart): improve NOTES.txt for decoHeader feature
* feat(chart): generalize decoHeader to redirect.customHeaders map
* chore: remove values-local.yaml from tracking, add to .gitignore
* chore(chart): rename ConfigMap to redirect-response-headers
---------
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
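As a rough illustration of how spec.redirectCode from #17 could become an Ingress annotation, here is a hedged Go sketch. The function name `redirectAnnotations` is hypothetical, and the full annotation key with the ingress-nginx prefix is an assumption (the commit only says "permanent-redirect-code annotation"). The operator delegates enum validation to the CRD schema; the explicit check here exists only to keep the sketch self-contained.

```go
package main

import (
	"fmt"
	"strconv"
)

// defaultRedirectCode mirrors the CRD default named in the commit (307).
const defaultRedirectCode = 307

// redirectCodeAnnotation is an assumed key; the ingress-nginx prefix
// is not stated in the commit message.
const redirectCodeAnnotation = "nginx.ingress.kubernetes.io/permanent-redirect-code"

// redirectAnnotations returns the annotation for a redirect code.
// A zero value stands for "field omitted" and falls back to the default.
func redirectAnnotations(code int) (map[string]string, error) {
	if code == 0 {
		code = defaultRedirectCode
	}
	if code != 301 && code != 307 {
		return nil, fmt.Errorf("redirectCode must be 301 or 307, got %d", code)
	}
	return map[string]string{redirectCodeAnnotation: strconv.Itoa(code)}, nil
}

func main() {
	ann, err := redirectAnnotations(0)
	if err != nil {
		panic(err)
	}
	fmt.Println(ann[redirectCodeAnnotation]) // 307
}
```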
feat(decofile): cascade-delete via Revision ownerReference (#15)
* feat(decofile): cascade-delete via Revision ownerReference
The DecofileReconciler now adds the Knative Revision matching a Decofile's deploymentId as an ownerReference (controller=false) on the Decofile itself. When the Revision is later garbage-collected, either by Knative GC (maxNonActiveRevisions) or by the cluster's knative-clean-revisions CronJob, Kubernetes garbage collection cascades through Revision -> Decofile -> ConfigMap, eliminating orphan accumulation.
Mechanism:
  * Reconcile path: after fetching the Decofile, syncRevisionOwnerRefs lists Revisions in the same namespace whose app.deco/deploymentId label matches the Decofile's deploymentId (or metadata.name when spec.deploymentId is empty), and appends any not-yet-present Revision as an ownerReference. Refetch-before-update avoids optimistic concurrency conflicts with the status update that happens later in the same Reconcile. Failures are logged but non-fatal: the operator continues without owner refs rather than blocking ConfigMap creation.
  * Watch path: Revision Create events enqueue any Decofile in the same namespace whose effective deploymentId matches the new Revision's label, so newly-created Revisions trigger ownerRef sync even when the corresponding Decofile already existed (the common case where the admin/build pipeline creates the Decofile slightly before the KSvc).
Edge cases covered by the implementation and the unit tests:
  * No matching Revision yet: nothing is added; reconcile completes.
  * Revision in DeletionTimestamp: skipped (don't link to dying owners).
  * Multiple Revisions with same deploymentId (rollback): both become owners. Kubernetes GC waits for ALL owners to be deleted before reclaiming the Decofile.
  * Re-running Reconcile is idempotent: existing UIDs are detected and not duplicated.
  * Explicit spec.deploymentId is respected; decoys matching only the Decofile name are ignored.
Defense-in-depth: the controller-cluster CronJob (PR decocms/infra_applications#92, deferred) remains useful as a fallback for legacy Decofiles created before this change, and for the rare case where the Revision is deleted before the operator manages to patch the ownerRef. Those orphans are still detectable by the script in infra_applications.
Backfill of the existing fleet (~2276 Decofiles cluster-wide): the new operator will retroactively patch ownerRefs to any Decofile that still has a matching Revision, the next time each one is reconciled. Decofiles whose Revisions are already gone (likely the majority of the backlog) won't gain an owner; those require a one-off cleanup.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* fix(decofile): satisfy golangci-lint (prealloc, unparam)
- prealloc: pre-size toAdd slice to len(revs.Items)
- unparam: drop unused namespace param from test helpers (always "sites-foo")
---------
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
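The ownerReference sync in #15 can be sketched in plain Go. The types `OwnerRef` and `Revision` below are simplified stand-ins for the Kubernetes and Knative types, and `effectiveDeploymentID` and `missingOwnerRefs` are hypothetical names; the sketch only illustrates the fallback to metadata.name, the skip for Revisions being deleted, and the UID-based idempotency described above.

```go
package main

import "fmt"

// OwnerRef is a simplified stand-in for metav1.OwnerReference.
type OwnerRef struct {
	Name string
	UID  string
}

// Revision is a simplified stand-in for a Knative Revision.
type Revision struct {
	Name     string
	UID      string
	Deleting bool // true when DeletionTimestamp is set
}

// effectiveDeploymentID falls back to the object name when
// spec.deploymentId is empty.
func effectiveDeploymentID(specDeploymentID, name string) string {
	if specDeploymentID != "" {
		return specDeploymentID
	}
	return name
}

// missingOwnerRefs returns owner references for Revisions that are not
// yet owners and are not being deleted. Matching by UID keeps repeated
// calls idempotent.
func missingOwnerRefs(existing []OwnerRef, revs []Revision) []OwnerRef {
	seen := make(map[string]struct{}, len(existing))
	for _, o := range existing {
		seen[o.UID] = struct{}{}
	}
	toAdd := make([]OwnerRef, 0, len(revs)) // pre-sized, as in the prealloc fix
	for _, r := range revs {
		if r.Deleting {
			continue // don't link to dying owners
		}
		if _, ok := seen[r.UID]; ok {
			continue // already an owner
		}
		seen[r.UID] = struct{}{}
		toAdd = append(toAdd, OwnerRef{Name: r.Name, UID: r.UID})
	}
	return toAdd
}

func main() {
	existing := []OwnerRef{{Name: "rev-1", UID: "uid-1"}}
	revs := []Revision{
		{Name: "rev-1", UID: "uid-1"},
		{Name: "rev-2", UID: "uid-2"},
		{Name: "rev-3", UID: "uid-3", Deleting: true},
	}
	fmt.Println(len(missingOwnerRefs(existing, revs))) // 1 (only rev-2)
}
```

With several live Revisions sharing a deploymentId, every one of them ends up in the owner list, which matches the rollback case where Kubernetes GC waits for all owners before reclaiming the Decofile.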
feat: rename CRD from RedirectDomain to DecoRedirect (#14) Renames the CRD kind from RedirectDomain to DecoRedirect (group deco.sites/v1alpha1, plural decoredict). Updates all Go types, controller, API handlers, tests, RBAC markers, Helm templates, and sample manifests. Regenerated with make generate. Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
feat: add GET /redirects/{domain} and accept original domain in DELETE (#13)
* fix: accept original domain in DELETE path, return 404 for not found
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* feat: add GET /redirects/{domain} endpoint
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* fix: validate domain path param in GET and DELETE, return 400 for invalid
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* feat: return simplified response with certificateReady status from GET and LIST
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
---------
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
feat: redirect management HTTP API (#12)
* feat: add redirect management HTTP API
Exposes POST/DELETE/GET /redirects for CRUD of RedirectDomain CRs. Enabled via REDIRECT_API_USER+REDIRECT_API_PASSWORD env vars (or existingSecret). Set redirectApi.hostname in values to auto-create Ingress+Certificate via redirect-nginx.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* feat: decouple redirect API ingress from redirect-nginx
Add redirectApi.ingressClass (default empty = cluster default) and redirectApi.clusterIssuer so the management API routes through the cluster's default ingress instead of the redirect NLB.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* refactor: rename redirectApi → operatorApi, REDIRECT_API_* → OPERATOR_API_*
API is general-purpose (operator-api.deco.cx/redirects) not redirect-specific.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* fix: update operatorApi hostname example to api.infra.deco.cx
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* fix: preserve original domain in Spec.From, only sanitize k8s Name
sanitizeDomain was converting dots to dashes in both the resource name and Spec.From, breaking the CEL validation rule. Now domainToName() is used only for the k8s Name; Spec.From keeps the original domain. Also returns 422 instead of 500 for validation errors from the API server.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* chore: remove stale redirect-api templates (renamed to operator-api)
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* fix: address PR review issues in operator API
- Move OPERATOR_API env vars outside Valkey block (were injected under adminPassword branch)
- Require both username+password (not username alone) to enable Service/Ingress/env
- Tie Ingress creation to credentials being set (hostname alone no longer enough)
- Add HTTP timeouts to server (ReadHeader/Read/Write/Idle)
- Use release-name-prefixed TLS secret to avoid cross-release collisions
- Differentiate 422 (Invalid/AlreadyExists) from 500 in create handler
- Remove unused newTestServer helper from tests
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* feat: make default redirect namespace configurable via --redirect-namespace
Reads REDIRECT_NAMESPACE env var (default: deco-redirect-system). Chart injects it from redirect.namespace in values.yaml so the API default namespace stays in sync with the rest of the redirect config.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* refactor: remove inline username/password, credentials via existingSecret only
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* fix: make operatorApi existingSecret optional so pod starts without the k8s Secret
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* fix: return 409 Conflict for already-exists, keep 422 for validation errors
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* fix: fallback to deco-redirect-system when defaultNamespace is empty
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* refactor: use envFrom secretRef instead of individual secretKeyRef for operatorApi credentials
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* chore: update operatorApi hostname example to operator.infra.deco.cx
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
---------
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>