Keeper: changing volumeAttributesClassName triggers forbidden StatefulSet update + panic during recreate #1893

@BorisTyshkevich

What happened

When setting/changing spec.volumeClaimTemplates[*].spec.volumeAttributesClassName for ClickHouse Keeper, the operator attempts to update the Keeper StatefulSet spec.volumeClaimTemplates, which is immutable in Kubernetes.

This fails with:
StatefulSet.apps "keeper-primary-0-0" is invalid: spec: Forbidden: updates to statefulset spec for fields other than ... are forbidden

After that, the operator switches from Update to Recreate, and reconciliation panics with a nil pointer dereference inside the WaitHostStatefulSetReady poller callback.

Expected behavior

  • Operator should NOT attempt to update StatefulSet.spec.volumeClaimTemplates just because volumeAttributesClassName changed.
  • Instead, it should reconcile the existing PVC(s) directly (which is allowed for volumeAttributesClassName) and leave StatefulSet VCT untouched.
  • No panic during delete/recreate flows.
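The first two expectations could be sketched roughly as follows. This is a minimal illustration, not the operator's code: `ClaimTemplate`, `differOnlyInPVCReconcilableFields` and `templatesForUpdate` are hypothetical names, the struct is a trimmed stand-in for the real Kubernetes types, and only `volumeAttributesClassName` is treated as PVC-reconcilable here. The `gp3` storage class and `10Gi` size are made-up sample values.

```go
package main

import "fmt"

// ClaimTemplate is a hypothetical, trimmed-down stand-in for one entry of
// StatefulSet.spec.volumeClaimTemplates.
type ClaimTemplate struct {
	Name                      string
	StorageClassName          string
	StorageRequest            string
	VolumeAttributesClassName string // empty means unset
}

// differOnlyInPVCReconcilableFields reports whether the two template lists
// are identical once volumeAttributesClassName is ignored.
func differOnlyInPVCReconcilableFields(current, desired []ClaimTemplate) bool {
	if len(current) != len(desired) {
		return false
	}
	for i := range current {
		c, d := current[i], desired[i] // copies, safe to modify
		c.VolumeAttributesClassName, d.VolumeAttributesClassName = "", ""
		if c != d {
			return false
		}
	}
	return true
}

// templatesForUpdate picks the templates to send in a StatefulSet Update.
// When the only difference can be handled on the PVCs, it keeps the live
// templates so the immutable field is not touched.
func templatesForUpdate(current, desired []ClaimTemplate) []ClaimTemplate {
	if differOnlyInPVCReconcilableFields(current, desired) {
		return append([]ClaimTemplate(nil), current...)
	}
	return desired
}

func main() {
	current := []ClaimTemplate{{Name: "data", StorageClassName: "gp3", StorageRequest: "10Gi"}}
	desired := []ClaimTemplate{{Name: "data", StorageClassName: "gp3", StorageRequest: "10Gi",
		VolumeAttributesClassName: "gp3-medium-throughput"}}

	got := templatesForUpdate(current, desired)
	fmt.Printf("class sent in StatefulSet update: %q\n", got[0].VolumeAttributesClassName)
}
```

With this shape, the new class name never reaches the StatefulSet update and would instead be applied by the PVC-level reconciliation.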

Actual behavior

  • Update fails due to forbidden StatefulSet spec changes.
  • Operator falls back to recreate.
  • Operator panics (nil pointer dereference) while waiting for StatefulSet readiness.

Reproduction steps (high level)

  1. Deploy operator and a ClickHouseKeeperInstallation that creates a Keeper StatefulSet.
  2. Change spec.volumeClaimTemplates[0].spec.volumeAttributesClassName (example: gp3-medium-throughput).
  3. Observe operator reconciliation.

Relevant logs (from 2025-12-15)

E1215 13:20:22.818837 ... doUpdateStatefulSet(): ... StatefulSet update failed. err:
StatefulSet.apps "keeper-primary-0-0" is invalid: spec: Forbidden: updates to statefulset spec for fields other than 'replicas', 'ordinals', 'template', 'updateStrategy', 'persistentVolumeClaimRetentionPolicy' and 'minReadySeconds' are forbidden

... diff includes:
'.VolumeClaimTemplates[0].Spec.VolumeAttributesClassName' = '"gp3-medium-throughput"'

I1215 13:20:22.819669 ... switch from Update to Recreate

2025-12-15T13:20:22Z INFO Observed a panic in reconciler: runtime error: invalid memory address or nil pointer dereference
...
github.com/altinity/clickhouse-operator/pkg/controller/common/poller/domain.(*HostObjectsPoller).WaitHostStatefulSetReady.func1
pkg/controller/common/poller/domain/poller-host-objects.go:98

Why this looks like an operator bug

The operator already has PVC-level reconciliation for VolumeAttributesClassName:

  • pkg/controller/common/storage/storage-reconciler.go calls reconcileVolumeAttributeClass() which sets pvc.Spec.VolumeAttributesClassName when specified.

So the operator can support the change by patching the PVC, without modifying the StatefulSet’s volumeClaimTemplates.
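To illustrate what "patching the PVC" means here, the sketch below sets the class on a claim object and reports whether a write-back is needed. It is an assumption-laden stand-in, not the body of reconcileVolumeAttributeClass(): `PersistentVolumeClaim` is a simplified local type and `applyVolumeAttributesClass` is a hypothetical helper name.

```go
package main

import "fmt"

// PersistentVolumeClaim is a hypothetical, simplified stand-in for the
// core/v1 type; only the field relevant to this issue is modeled.
type PersistentVolumeClaim struct {
	Name                      string
	VolumeAttributesClassName *string
}

// applyVolumeAttributesClass sets the desired class on the PVC and returns
// true when the PVC changed and therefore needs to be updated in the API.
// An unset or empty desired value leaves the PVC untouched.
func applyVolumeAttributesClass(pvc *PersistentVolumeClaim, desired *string) bool {
	if pvc == nil || desired == nil || *desired == "" {
		return false
	}
	if pvc.VolumeAttributesClassName != nil && *pvc.VolumeAttributesClassName == *desired {
		return false // already up to date
	}
	value := *desired // copy so the PVC does not alias the caller's string
	pvc.VolumeAttributesClassName = &value
	return true
}

func main() {
	class := "gp3-medium-throughput"
	pvc := &PersistentVolumeClaim{Name: "example-pvc"} // name is illustrative

	fmt.Println("changed on first pass:", applyVolumeAttributesClass(pvc, &class))
	fmt.Println("changed on second pass:", applyVolumeAttributesClass(pvc, &class))
}
```

The second call is a no-op, which is the property that makes it safe to run on every reconcile loop.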

Proposed fix

  1. In StatefulSet update logic: do not attempt to update StatefulSet.Spec.VolumeClaimTemplates when only PVC-reconcilable fields changed (at least volumeAttributesClassName, plus labels/resources if applicable). Keep the current StatefulSet’s VCT as-is.

  2. Fix the nil dereference in the poller: WaitHostStatefulSetReady() has a second poll callback that doesn’t check sts == nil before calling k8s.IsStatefulSetReady(sts).
