Releases: Altinity/clickhouse-operator

release-0.27.1

04 Jun 13:42

NOTE: This is a major update to 0.27.0. Starting from 0.27.1, clickhouse-operator is FIPS-enabled.

Fixed

  • Fix permanent ArgoCD diff on resourceFieldRef.divisor by @pkieszcz in #1989
  • Fix inconsistent secret rendering in autogenerated remote_servers by @realyota in #1992. Closes #1991
  • Fixed skipped Delete when scale-to-0 Update fails by @dashashutosh80 in #1993. Closes #1990
  • Fixed .status.usedTemplates that were not properly cleaned when templates were removed
  • Bump dependent libraries to address CVEs

Full Changelog: release-0.27.0...release-0.27.1

release-0.27.0

08 May 13:58

Added

  • Reference CHK from CHI by a name rather than a service:
spec:
  configuration:
    zookeeper:
      keeper:
        name: my-keeper
  • [EXPERIMENTAL] Pre/post hooks. Allows injecting SQL commands on reconcile events.
    • supported events: Any, HostCreate, HostUpdate, HostStart, HostStop, HostConfigRestart, HostRollout, HostShutdown, HostDelete
    • target controls where to run SQL: FirstHost (default), AllHosts, AllShards
    • failurePolicy specifies what to do if hook execution fails with an error: Fail (default) or Ignore.
      See CRD for detailed description. Example:
  reconcile:
    host:
      hooks:
        pre:
          - sql:
              queries:
                - "SYSTEM STOP REPLICATION QUEUES"
            events:
              - HostShutdown
  • The operator now automatically restarts an aborted reconcile if the failed pod comes back online after the abort
  • Restart operator on operator configuration change. #1960. Closes #1930
  • Allow excluding certain metrics from export. Noisy OS/CPU metrics are now excluded by default. #1975. Closes #1876
    Exclusion rules are applied by default and can be changed in operator configuration:
    excludeMetricsRegexp:
      - "^metric\\.(OS.*CPU[0-9]+|CPUFrequencyMHz_[0-9]+)$"
  • Helm: Allow readiness and liveness probes for operator containers by @janeklb in #1976
  • Helm: Add security context for crdHook to enable security policy compliance by @qlevasseur-genetec in #1950
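
To see what the default excludeMetricsRegexp rule above covers, here is a minimal Python sketch that applies the same pattern to a few metric names. The sample names and the is_excluded helper are illustrative assumptions, not the operator's code; the operator is written in Go, and these notes do not state how it applies the pattern. Note that the YAML double-quoted `\\.` denotes a single escaped dot.

```python
import re

# Pattern from the release notes; the YAML "\\." unescapes to "\." (a literal dot).
EXCLUDE_PATTERNS = [r"^metric\.(OS.*CPU[0-9]+|CPUFrequencyMHz_[0-9]+)$"]


def is_excluded(name, patterns=EXCLUDE_PATTERNS):
    """Return True if the metric name matches any exclusion pattern (hypothetical helper)."""
    return any(re.search(p, name) for p in patterns)


# Illustrative metric names (assumed, not taken from the operator's output).
samples = [
    "metric.OSUserTimeCPU3",     # per-core OS metric, excluded
    "metric.CPUFrequencyMHz_0",  # per-core frequency, excluded
    "metric.OSUserTime",         # aggregate, no CPU<N> suffix, kept
    "metric.Query",              # unrelated metric, kept
]
kept = [name for name in samples if not is_excluded(name)]
print(kept)
```

Only the per-core series are filtered; aggregate metrics without a CPU index suffix pass through.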

Changed

  • Enabled async_replication and use_xid_64 in Keeper default configuration. Requires Keeper 25.3 or above.
  • Whitelist Keeper four letter commands by default
  • Switched Keeper probes to ruok
  • Kubernetes client library has been upgraded to 0.30.14 by @dcoppa in #1970

Fixed

  • Fix watch ClickHouseOperatorConfiguration in operator namespace. #1959.
  • Fix chk probe crashloop by @hananbs in #1961. Closes #1962
  • Fix secret-backed env upgrade rollout. #1966. Closes #1963
  • Helm: fix configmap too long by @madrisan in #1949. Closes #1911
  • build(deps): bump go.opentelemetry.io/otel/sdk from 1.42.0 to 1.43.0 by @dependabot[bot] in #1953
  • Fix: format string issues by @destinyoooo in #1955
  • Fix: Prevent CrashLoopBackOff during image upgrade with RollingUpdate (#1926) by @dashashutosh80 in #1956. Closes #1926
  • Fixed a bug where a host was not removed from the status fields hostsWithReplicaCaughtUp and hostsWithTablesCreated when it was removed from a cluster.
  • Prevent nil pointer panic in poll when CR Get fails by @wucm667 in #1974. Closes #1972
  • Scope schema discovery to target host's cluster in multi-cluster CHI by @lukas-pfannschmidt-tr in #1965. Closes #1964
  • Fix: use plain errors for clickhouse model messages by @immanuwell in #1980

Full Changelog: release-0.26.3...release-0.27.0

release-0.26.3

15 Apr 17:02

Fixed

  • Prevent CrashLoopBackOff during image upgrade with RollingUpdate #1956 by @dashashutosh80. Closes #1926
  • Address CVEs in dependent libraries

Full Changelog: release-0.26.2...release-0.26.3

release-0.26.2

24 Mar 18:41

Fixed

  • Fixed a race condition when updating ClickHouse configuration and version together, with a configuration setting that did not exist in the old version. Closes #1926
  • Fixed a bug where some Keeper nodes could be left offline after configuration changes
  • Fixed a bug where the operator did not respect watched namespaces for Keeper. Closes #1923
  • Fixed potential races in configuration hash calculation. May close #1907
  • Updated dependent libraries to address CVE-2026-24051

Changed

  • Deleting multi-node CHI and CHK is now much faster
  • Added asynchronous_metrics_keeper_metrics_only to default Keeper configuration

Full Changelog: release-0.26.1...release-0.26.2

release-0.26.1

13 Mar 13:07

Fixed

  • Fixed Keeper startup that was slow due to missing quorum. Closes #1931 and #1856
  • Fixed Keeper deletion logic that could previously leave PVCs undeleted
  • Fixed hostName generation in status that might result in excessive schema propagation cycles
  • Fixed FQDN normalization to prevent trailing-dot inconsistencies between internal hostname representations
  • Bump stdlib version to address CVE
  • Bump base image version by @Slach in #1941. Closes #1940
  • Document custom service template behaviour for CHK by @realyota in #1939

Full Changelog: release-0.26.0...release-0.26.1

release-0.26.0

20 Feb 13:22

IMPORTANT: Due to the upstream ClickHouse regression ClickHouse/ClickHouse#89693, DDL queries may not work on newly created ClickHouse pods. It affects Kubernetes deployments only, in some new ClickHouse versions (25.8.10 and above). The workaround is to restart ClickHouse pods. The problem is fixed by ClickHouse/ClickHouse#92339; see the backports for different release branches. The fix is backported to Altinity Stable 25.8.16.10001 as well.
Closes #1883 and #1913

Added

  • Added an option to abort reconcile if STS needs to be recreated. It can be configured in operator configuration or CHI.
# Reconcile StatefulSet scenario
reconcile:
  statefulSet:
    recreate:
      # What to do in case operator is in need to recreate StatefulSet?
      # Possible options:
      # 1. abort - abort the process, do nothing with the problematic StatefulSet, leave it as it is,
      #    do not try to fix or delete or update it, just abort reconcile cycle.
      #    Do not proceed to the next StatefulSet(s) and wait for an admin to assist.
      # 2. recreate - proceed and recreate StatefulSet.
      # Triggered when StatefulSet update fails or StatefulSet is not ready
      onUpdateFailure: recreate
  • Added an option to configure system tables for metrics scraping. The default is the system.metrics and system.custom_metrics tables, but this can be changed with a regular expression if needed:
    tablesRegexp: "^(metrics|custom_metrics)$"

Changed

  • The suspend flag now immediately aborts a running reconcile. Previously, it did not affect the one that was running
  • When suspend flag is set, any reconcile attempt automatically sets CHI/CHK status to aborted.
  • Add optional registry prefix for operator and metrics images in Helm chart by @lesandie in #1928
  • Improve ClickHouse Keeper Grafana Dashboard by @discostur in #1872
  • Add CRDHook annotations by @eyyu in #1914
  • Hotfix crdhook, add imagePullSecrets by @Slach in #1917
  • Fix installer to default template URL to OPERATOR_VERSION by @realyota in #1910
  • sort keys in Settings.Keys() method for consistent order (fix manifest reconcile issue) by @mastercactapus in #1900
  • Multiple documentation fixes
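
The Settings.Keys() change above addresses a general problem: when keys are emitted in whatever order a map yields them, the same settings can render to different text and look like a change. The operator is written in Go, where map iteration order is unspecified; the Python sketch below reproduces the idea with two dicts built in different insertion order. The render and fingerprint helpers and the setting names are hypothetical.

```python
import hashlib


def render(settings, sort_keys):
    """Render settings as 'key=value' lines, optionally in sorted key order."""
    keys = sorted(settings) if sort_keys else list(settings)
    return "\n".join(f"{k}={settings[k]}" for k in keys)


def fingerprint(text):
    """Hash rendered text; a different hash would look like a config change."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


# Same settings, different insertion order.
a = {"max_threads": 8, "log_queries": 1}
b = {"log_queries": 1, "max_threads": 8}

unsorted_equal = fingerprint(render(a, False)) == fingerprint(render(b, False))
sorted_equal = fingerprint(render(a, True)) == fingerprint(render(b, True))
print(unsorted_equal, sorted_equal)  # False True
```

Sorting before rendering makes the output a function of the content alone, which is what keeps a reconcile loop from reacting to ordering noise.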

Fixed

  • Fixed Keeper rolling update logic. Closes #1796 #1915
  • Fixed a bug where a replica was not added to monitoring until it caught up with the replication lag
  • Fixed version parsing for FIPS compatible builds of ClickHouse. Closes #1850
  • Fixed stop and suspend attributes for CHK that were previously ignored
  • Fix distributed_ddl.replicas_path mismatch that could prevent sharing (Zoo)Keeper between multiple clusters by @Elmo33 in #1922
  • Fixed a bug when defaults.storageManagement.reclaimPolicy was not respected
  • Fixed slow initial connectivity to newly created pods caused by DNS search list exhaustion (ndots:5). Added trailing dot to FQDN and increased connect timeout
  • Fixed a bug where reconcile settings specified at CHI level (e.g. spec.reconcile.statefulSet.recreate.onUpdateFailure) were not inherited by cluster-level reconcile configuration
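
The ndots:5 fix above relies on standard resolver behavior: a name with fewer dots than ndots is first tried with each search-list suffix, while a name ending in a dot is treated as absolute and looked up once. A minimal Python sketch of that lookup order follows; the host name and search list are assumed examples of a typical Kubernetes pod resolv.conf, and lookup_order is a hypothetical helper rather than operator code.

```python
def lookup_order(name, search, ndots=5):
    """Return the sequence of names a stub resolver would try for `name`."""
    if name.endswith("."):
        # Trailing dot: the name is absolute, the search list is skipped.
        return [name]
    expanded = [f"{name}.{suffix}." for suffix in search]
    absolute = f"{name}."
    if name.count(".") >= ndots:
        # Enough dots: try the name as-is first, then fall back to the search list.
        return [absolute] + expanded
    # Too few dots: walk the search list first, try the name as-is last.
    return expanded + [absolute]


# Assumed example values for a pod in the "default" namespace.
search = ["default.svc.cluster.local", "svc.cluster.local", "cluster.local"]
host = "chi-demo-main-0-0.default.svc.cluster.local"  # 4 dots, fewer than ndots=5

print(len(lookup_order(host, search)))   # 4
print(lookup_order(host + ".", search))
```

Without the trailing dot, the real name is only tried after three search-list misses; with it, the first query is the right one.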

Other

  • stdlib has been upgraded to 1.25.6 to address CVEs
  • Operator has been certified for 25.8.16.10001 Altinity.Stable.

Full Changelog: release-0.25.6...release-0.26.0

release-0.25.6

11 Dec 11:53

Changed

  • Hosts are not excluded from remote_servers anymore if restart is not needed. Previously, replicas in replicated cluster might be removed for a short time even if restart was not needed.
  • configuration.zookeeper section changes do not require restart anymore
  • Last reconciliation error and list of errors are now stored in CHI status

Fixed

  • actionPlan is now optional in status. That fixes operator upgrade problems that might happen in some environments.
  • Fixed excessive reconciles triggered by endpoint slices. Closes #1873
  • Fixed crash in CHK that might happen sometimes. Closes #1863
  • Fixed an issue with handling fractional requests/limits that would result in excessive reconcile. Closes #1849 #1821
  • Fixed a bug with operator crash in Terminating namespace. Closes #1871
  • stdlib was upgraded in order to address CVEs in dependent libraries

NOTE: Due to the upstream ClickHouse regression ClickHouse/ClickHouse#89693, schema propagation and DDL statements do not work with ClickHouse versions 25.8.10 and newer until it is resolved.

Full Changelog: release-0.25.5...release-0.25.6

release-0.25.5

24 Oct 09:52

Added

  • The latest applied ActionPlan is now stored in chi-storage ConfigMap
  • volumeClaimTemplate.volumeAttributeClass attribute. Closes #1818
  • Configuration for DROP REPLICA behavior:
reconcile:
  host:
    drop:
      replicas:
        # Whether the operator during reconcile procedure should drop replicas when replica is deleted
        onDelete: yes
        # Whether the operator during reconcile procedure should drop replicas when replica volume is lost
        onLostVolume: yes
        # Whether the operator during reconcile procedure should drop active replicas when replica is deleted or recreated
        active: no

Active replicas are now never dropped. This prevents a potential bug where a replica could be dropped on a multi-volume node if a newly added volume was not yet available.
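
One way to read the three DROP REPLICA settings above is as a small decision table: the reason a replica went away selects onDelete or onLostVolume, and active acts as a guard. The Python sketch below encodes that reading, using the values shown above as defaults. It is an interpretation of the release note, not the operator's implementation, and DropPolicy, should_drop, and the reason strings are hypothetical names.

```python
from dataclasses import dataclass


@dataclass
class DropPolicy:
    on_delete: bool = True        # reconcile.host.drop.replicas.onDelete
    on_lost_volume: bool = True   # reconcile.host.drop.replicas.onLostVolume
    active: bool = False          # reconcile.host.drop.replicas.active


def should_drop(policy, reason, replica_is_active):
    """Decide whether DROP REPLICA should run for a replica (assumed semantics)."""
    if replica_is_active and not policy.active:
        # Guard: active replicas are left alone unless explicitly allowed.
        return False
    if reason == "deleted":
        return policy.on_delete
    if reason == "lost_volume":
        return policy.on_lost_volume
    return False


policy = DropPolicy()
print(should_drop(policy, "deleted", replica_is_active=False))      # True
print(should_drop(policy, "lost_volume", replica_is_active=True))   # False
```

The second call mirrors the multi-volume case from the note: the replica is still active, so a volume that merely appears missing does not trigger a drop.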

Changed

  • Enabled ReadinessProbe for Keeper. Closes #1846
  • Enabled query_log for all DDL statements performed by operator
  • minor logging improvements by @janeklb in #1829

Fixed

  • Removed excessive logging isUpdatedEndpointSlice():unknown
  • Changed metrics collection query that could be broken in 0.25.4 for old ClickHouse versions

Full Changelog: release-0.25.4...release-0.25.5

release-0.25.4

26 Sep 14:06

Added

  • Operator configuration 'reconcile' section is now fully supported at CHI level under both 'reconcile' and old 'reconciling' name. Previously, only selected settings were available at CHI level.
  • Allow excluding namespaces from the set that the operator watches, by @AdheipSingh in #1770
spec:
  watch:
    namespaces:
      include: []
      exclude: [] # new
  • Option to choose which probe the operator should wait for during reconcile. Previously, it always waited for the pod to be ready. This can now be configured in the 'reconcile' section of the operator configuration or CHI:
spec:
  reconcile:
    host:
      wait:
        probes:
          startup: "yes"
          readiness: "no"

Changed

  • The system.custom_metrics table is now scraped for monitoring in addition to metrics and asynchronous_metrics. This allows injecting custom monitoring data from the ClickHouse side.
  • Deprecated Endpoints API has been replaced with EndpointSlice. Closes #1801

Fixed

  • Fixed a bug with long environment variables used for secrets being truncated. Closes #1804
  • Fixed a bug where the operator did not respect watched namespaces for CHK

Helm updates

  • Define values.schema.json by @Slach in #1815. Closes #1814
  • Added clickhouse-operator deployment strategy parameters to Helm chart by @Slach in #1789
  • Publish operator helm chart to helm.altinity.com in addition to artifacthub.io

Full Changelog: release-0.25.3...release-0.25.4

release-0.25.3

08 Aug 07:24

Added

  • Added support for pdbMaxUnavailable in CHK
  • Added .spec.configuration.clusters[].pdbManaged for CHI and CHK, which allows setting external PDBs, by @zrudzionis in #1768

Fixed

  • Add ZK error handling and logging by @wilkermichael in #1762
  • Fixed collision between PDBs for CHI and CHK with the same name. As a side effect PDB for CHI will be re-created with this upgrade
  • Fixed a rare panic in buildCRFromObj() when it cannot find the CR
  • Fixed update of storage configmap that might block in rare cases. Closes #1781
  • Fixed the check for host to be included in remote_servers, that did not work correctly in some network configurations. Closes #1782

Helm updates

  • feat(helm): add priority class to helm chart by @nobletooth in #1774
  • feat(helm): publish as an OCI helm package as well by @ogirardot in #1779
  • There is no 'helm repo upgrade', should be 'update' by @CaptTofu in #1780
  • Hotfix port names, to avoid warning during helm install by @Slach in #1784

Full Changelog: release-0.25.2...release-0.25.3