Releases: Altinity/clickhouse-backup
2.7.2
v2.7.2
IMPROVEMENTS
- include the data part file path and source object key (`fPath`, `srcKey`) in `object_disk` `CopyObject`/`CopyObjectStreaming` error messages during `create` and `restore`, so broken `object_disk` data (keys missing on remote storage) points to the exact failing file instead of just the shadow/table name
- document GCS Workload Identity authentication for the `gcs` remote storage in `Examples.md`
BUG FIXES
- fix `clickhouse.skip_table_engines` (env `CLICKHOUSE_SKIP_TABLE_ENGINES`) silently keeping some matching tables: the in-place slice removal advanced the cursor past the next element, so adjacent tables sharing a skipped engine were not all skipped; iterate in reverse so every match is dropped, fix #1416
- ensure `/backup/kill` (and context cancellation in general) promptly aborts an in-flight `download`/`restore` even when a read is stalled on a slow/half-open network or disk backpressure: the source reader is now force-closed on cancellation so blocked `Read` calls in `DownloadCompressedStream`/`DownloadPath`, the Azure `CopyObject` poll backoff, and `object_disk.CopyObjectStreaming` return instead of running to completion, and the resumable `.pid` file is removed, fix #1365
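The `skip_table_engines` bug described above is a common pitfall when removing elements from a slice while iterating over it. The following is a minimal sketch of the pattern, not the project's code: the `table` type and both function names are hypothetical and exist only for illustration.

```go
package main

import "fmt"

// table is a hypothetical stand-in for a table entry (not the project's type).
type table struct {
	Name   string
	Engine string
}

// dropSkippedForward reproduces the bug pattern: after removing element i the
// loop still increments i, so the element that shifted into slot i is never checked.
func dropSkippedForward(tables []table, skip map[string]bool) []table {
	out := append([]table(nil), tables...) // copy so the input stays intact
	for i := 0; i < len(out); i++ {
		if skip[out[i].Engine] {
			out = append(out[:i], out[i+1:]...)
		}
	}
	return out
}

// dropSkippedReverse walks from the end, so a removal only shifts elements
// that were already checked and every match is dropped.
func dropSkippedReverse(tables []table, skip map[string]bool) []table {
	out := append([]table(nil), tables...)
	for i := len(out) - 1; i >= 0; i-- {
		if skip[out[i].Engine] {
			out = append(out[:i], out[i+1:]...)
		}
	}
	return out
}

func main() {
	tables := []table{
		{"db.a", "Kafka"},
		{"db.b", "Kafka"}, // adjacent table sharing the skipped engine
		{"db.c", "MergeTree"},
	}
	skip := map[string]bool{"Kafka": true}
	fmt.Println(len(dropSkippedForward(tables, skip))) // 2: db.b survives
	fmt.Println(len(dropSkippedReverse(tables, skip))) // 1: only db.c remains
}
```

With two adjacent `Kafka` tables, the forward loop keeps `db.b` because it slides into the slot that was just examined, while the reverse loop removes both.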
2.7.1
v2.7.1
NEW FEATURES
- add `--fips-info` app-level flag: prints binary name, version, git commit, build date, Go version, and the FIPS module build/runtime state (`GOFIPS140` build setting, `GODEBUG fips140` default/runtime) then exits, without requiring a Go toolchain, fix #1402
- add Azure AD Workload Identity support for `azblob`: when `AZBLOB_USE_MANAGED_IDENTITY=true` and the `AZURE_TENANT_ID`/`AZURE_CLIENT_ID`/`AZURE_FEDERATED_TOKEN_FILE` env vars are injected (e.g. by the AAD Workload Identity webhook), the federated token is used to authenticate; see `Examples.md` for deployment, fix #1124
IMPROVEMENTS
- add `general.compression_use_multi_thread` (env `COMPRESSION_USE_MULTI_THREAD`, default `false`), `general.compression_threads` (env `COMPRESSION_THREADS`, default `0` = auto/GOMAXPROCS) and `general.compression_buffer_size` (env `COMPRESSION_BUFFER_SIZE`, default `0`) config options to tune per-stream zstd/gzip threading and the compression buffer (zstd encoder window / gzip DEFLATE window / pgzip block size); per-stream compression is now single-threaded by default to avoid CPU over-subscription, since `upload_concurrency`/`download_concurrency` already parallelize across tables, fix #1378
- migrate compression from the archived, frozen `github.com/mholt/archiver/v4 v4.0.0-alpha.8` to its maintained successor `github.com/mholt/archives v0.1.5`, removing the `replace` directive pinned in archiver#428
- migrate S3 storage from the deprecated `github.com/aws/aws-sdk-go-v2/feature/s3/manager` to `github.com/aws/aws-sdk-go-v2/feature/s3/transfermanager`; the forced CRC32 `aws-chunked` trailer that broke non-AWS S3-compatible providers is now disabled via `RequestChecksumCalculation=WhenRequired`, and the obsolete `s3.buffer_size`/`S3_BUFFER_SIZE` option was removed, fix #1409
- add buffer-size and HTTP-transport tuning for high-bandwidth (10Gbps+) S3/GCS transfers: `general.pipe_buffer_size` (env `PIPE_BUFFER_SIZE`, default 128KB), `general.download_copy_buffer_size` (env `DOWNLOAD_COPY_BUFFER_SIZE`), `gcs.upload_buffer_size` (env `GCS_UPLOAD_BUFFER_SIZE`), `s3.http_write_buffer_size` (env `S3_HTTP_WRITE_BUFFER_SIZE`) and `s3.http_read_buffer_size` (env `S3_HTTP_READ_BUFFER_SIZE`), fix #1376
- replace post-hoc sleep-based bandwidth throttling with a token-bucket rate limiter wrapped around the storage `Reader`/`Writer` interfaces, so the configured limit is enforced continuously during transfer instead of after each chunk, fix #934, #1377
- parallelize `ALTER TABLE ... UNFREEZE` after `backup create` instead of running it inline inside each table goroutine, so an UNFREEZE no longer holds an `upload_concurrency` slot and blocks the next table, fix #1381
- add `clickhouse.parts_columns_batch_size` (env `CLICKHOUSE_PARTS_COLUMNS_BATCH_SIZE`, default `25`) to batch the `system.parts` lookups when computing `hash_of_all_files`, avoiding `Max query size exceeded` failures on tables with very many parts, fix #1408
- resolve `required` data parts during `restore` the same way as during `download`: hardlink from the required backup on local disk when present, otherwise download the part from remote storage, fix #1023
- `ResumeOperationsAfterRestart` now ignores `create.state2`/`restore.state2` files instead of failing the API server startup with `unknown command`; only `upload` and `download` are auto-resumed after a server restart, fix #1083
- fail fast with a clear error instead of retrying for ~35s when a remote table metadata `.json` is missing during `download` (covers S3 `NoSuchKey`, GCS 404, Azure `BlobNotFound`, FTP/SFTP not-found); a missing table file is a permanent broken-backup condition, not a transient one, fix #1379
- emit a clear error on `--resume download` when the local backup exists but `download.state2` is missing (so it is unknown which parts are complete), instead of crashing or silently resuming on top of partial data, fix #1383
- harden FIPS 140-3 verification: native Go `GOFIPS140=v1.0.0` checks, ACVP reproducibility tests, outbound S3 TLS rejection checks and container cleanup in CI/CD, fix #1399, #1401, #1404
BUG FIXES
- don't kill `clickhouse-backup server` with `Fatal`/`os.Exit` when the resumable state DB can't be written or read (e.g. `no space left on device`); the error now propagates so the server stays alive and returns it to the API client, while the CLI exits with a non-zero code, fix #1172
- fix `backup create` failing with `part "<name>" not found in system.parts ... after FREEZE` when a ClickHouse cache disk (e.g. `s3_cache`) wraps an underlying S3 object disk: prefer the underlying disk name over the cache wrapper in `getDisksFromSystemDisks`, fix #1396
- fix `--hardlink-exists-files` to also match parts whose `hash_of_all_files` is identical but that now live under a renamed table, fix #1398
- fix `--table` combined with `--resume` on incremental backups: recursively downloading a required backup closed the parent `b.dst` connection and wiped the resumable state; the connection is now saved/restored, fix #1384
- improve detection when `clickhouse-backup` runs on a host whose disks differ from `clickhouse-server`, instead of silently warning `doesn't contain tables for restore`, fix #1037
2.7.0
v2.7.0
NEW FEATURES
- add `clean_broken_retention` CLI command: walks the top level of remote `path` and `object_disks_path` and batch-deletes (with retry) every entry that is not present in the live backup list and not matched by any `--exclude=<glob>` (and optionally scoped by `--include=<glob>`). Dry-run by default; pass `--commit` to actually delete. Useful for cleaning up orphans left by failed retention runs, fix #1371
- add `info` CLI command for per-table backup size breakdown: shows per-table size, part count, and disk breakdown for local and remote backups, supports `--tables=<db>.<table>` glob filter and `--format=text|json|yaml|csv|tsv`, accepts `all|local|remote` scope, fix #1388
- add `force_rebalance` config option (`clickhouse.force_rebalance`, env `CLICKHOUSE_FORCE_REBALANCE`): distribute restored data across multiple JBOD disks under the same storage policy even when the source disk name (e.g. `default`) exists on the target machine, fix #1350
- switch FIPS variant from FIPS 140-2 boringssl to native Go 1.24+ FIPS 140-3 (`GODEBUG=fips140=on`); embed an ACVP wrapper into the shipped `clickhouse-backup-fips` binary with dual entry points (`clickhouse-backup-acvp` argv0 dispatch and `clickhouse-backup acvp` subcommand) and ship a tracked public-scope ACVP reproducibility flow, fix #1341, #1364, #1391, #1395
- add safety check to `restore`/`restore_remote`: fail without `--rm`/`--drop` when target tables already exist and contain rows (checked via `clusterAllReplicas('{cluster}')` when `restore_schema_on_cluster` is set) to avoid dangerous accidental `DROP TABLE`, fix #1325
- add checksum verification during `upload --diff-from`/`--diff-from-remote` when part name matches, to avoid uploading mismatched data and to detect silent corruption, fix #1307
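The selection rule behind `clean_broken_retention` (not in the live backup list, not excluded, optionally included) can be sketched as follows. This is an illustration only: `selectOrphans` and `matchAny` are hypothetical helpers, and the use of Go's `path.Match` glob semantics is an assumption, since the release notes do not say which glob dialect the command uses.

```go
package main

import (
	"fmt"
	"path"
)

// matchAny reports whether name matches at least one glob pattern.
// Assumption: path.Match semantics; the real command may differ.
func matchAny(patterns []string, name string) (bool, error) {
	for _, p := range patterns {
		ok, err := path.Match(p, name)
		if err != nil {
			return false, err // malformed pattern
		}
		if ok {
			return true, nil
		}
	}
	return false, nil
}

// selectOrphans returns top-level entries that are not live backups, are not
// matched by any exclude glob and, when include globs are given, match one of them.
func selectOrphans(entries, live, include, exclude []string) ([]string, error) {
	liveSet := make(map[string]bool, len(live))
	for _, name := range live {
		liveSet[name] = true
	}
	var orphans []string
	for _, name := range entries {
		if liveSet[name] {
			continue
		}
		excluded, err := matchAny(exclude, name)
		if err != nil {
			return nil, err
		}
		if excluded {
			continue
		}
		if len(include) > 0 {
			included, err := matchAny(include, name)
			if err != nil {
				return nil, err
			}
			if !included {
				continue
			}
		}
		orphans = append(orphans, name)
	}
	return orphans, nil
}

func main() {
	entries := []string{"backup1", "backup2", "tmp_upload", "keep_me"}
	orphans, err := selectOrphans(entries, []string{"backup1"}, nil, []string{"keep_*"})
	if err != nil {
		fmt.Println("bad pattern:", err)
		return
	}
	// Mirror the dry-run default: report candidates, delete nothing.
	for _, name := range orphans {
		fmt.Println("dry-run: would delete", name)
	}
}
```

Keeping selection separate from deletion makes the dry-run default cheap: the same list is either printed or handed to a batch delete.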
IMPROVEMENTS
- speed up `restore_remote` from S3 incremental chains: cache the backup list and avoid redundant `ListObjects` calls per table (a restore that previously took 8h on 280GB / 3500 tables now takes minutes), fix #1362, #1361
- reduce backup memory footprint for databases with thousands of tables (regression introduced in v2.6.42), fix #1360
- wrap S3 credentials with `aws.NewCredentialsCache()` to avoid resolving credentials on every API call (IMDS/STS), reducing latency and throttling in IRSA + AssumeRole flows, fix #1335
- simplify `hash_of_all_files` computation via a single post-FREEZE `SELECT` from `system.parts` instead of per-file hashing; this also enables `--hardlink-exists-files` to consult `system.parts` checksums during download, fix #1338
- isolate FREEZE shadow directory per backup as `/var/lib/clickhouse/shadow/backup-{uuid}` so concurrent backups and cleanup-after-failure don't clobber each other's shadow data, fix #1345
- add option to skip persisting `list` calls into the API server `actions` state: prevents unbounded growth of actions state when `/backup/list` is used as a monitoring endpoint during long-running backups, fix #1359
- improve `kill` command to ensure all in-flight operations really finish and to remove leftover `.pid` files, fix #1365
- document missing/incorrect concurrency defaults in `ReadMe.md` (`download_concurrency`, `s3.concurrency`, `cos.concurrency`, `sftp.concurrency`, `ftp.concurrency`), fix #1346
- migrate integration tests to testcontainers-go for better parallelism and isolation, fix #1336
- fix the `list_duration` log field formatting in `pkg/storage/general.go` (was emitting raw nanoseconds), fix #1337
BUG FIXES
- fix `restore_remote` for tables using sparse-column serialization: accept empty sparse metadata files instead of treating `StorageObjectCount=0` as corruption, affects ClickHouse 23.8+, fix #1372
- fix `restore_remote` aborting the entire restore when an incremental backup contains a table absent from the required full backup; the missing table is now skipped with a warning, fix #1373
- fix `object_disk` backup on S3 sources with SSE-C: handle 404 from server-side `CopyObject` by falling back to streaming and stop issuing `HeadObject` (returns 400) on SSE-C source objects before `GetObject`, fix #1374
- fix `--rbac-only` backup failing with "is empty backup" when the database contains RBAC objects but no tables and `allow_empty_backups=false`, fix #1355
- fix nested `ssh` consuming stdin from the `while read` loop in the rsync helper (use `ssh -n`) so all backup metadata files are processed instead of only the first, fix #1368
- fix backup retention logic in the rsync helper: correct line counting, numeric comparison and arithmetic handling so old backups are properly cleaned up, fix #1369
2.6.44
2.6.43
v2.6.43
NEW FEATURES
- add `S3_REQUEST_CONTENT_MD5` option for S3-compatible storage backends that require `Content-MD5` header on uploads (e.g. Huawei S3), fix 1324, fix 1329
IMPROVEMENTS
- add ClickHouse 26.1, 26.2 to test matrix
- refactor integration tests: split monolithic test file and reorganize configs for better maintainability
BUG FIXES
- fix GCS upload performance regression by letting SDK manage transport instead of custom HTTP client
- fix restore `--partitions` option error related to Embedded backup
- fix 1328, `restore --data --partitions` now correctly uses `DROP PARTITION ID '<id>'` for partition IDs and `DROP PARTITION (<tuple>)` for tuple-format partitions instead of always using `DROP PARTITION`, also fix the same issue for Embedded backup/restore
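The two statement forms named in the fix above can be sketched like this. The helper `dropPartitionSQL` is hypothetical, and the rule used to tell the formats apart (a value wrapped in parentheses is a tuple, anything else is a partition ID) is an assumption made for the example; it also performs no identifier or string escaping.

```go
package main

import (
	"fmt"
	"strings"
)

// dropPartitionSQL builds the ALTER statement for one partition value.
// Assumption: "(...)" means tuple format, everything else is a partition ID.
func dropPartitionSQL(database, table, partition string) string {
	if strings.HasPrefix(partition, "(") && strings.HasSuffix(partition, ")") {
		// Tuple-format partition: pass the expression through as-is.
		return fmt.Sprintf("ALTER TABLE %s.%s DROP PARTITION %s", database, table, partition)
	}
	// Partition ID: use the ID form with a quoted literal.
	return fmt.Sprintf("ALTER TABLE %s.%s DROP PARTITION ID '%s'", database, table, partition)
}

func main() {
	fmt.Println(dropPartitionSQL("db", "events", "202401"))
	fmt.Println(dropPartitionSQL("db", "events", "(2024,'eu')"))
}
```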
2.6.42
v2.6.42
NEW FEATURES
- apply object disk key rewriting for remapped tables during restore to prevent data corruption when using `--restore-database-mapping` or `--restore-table-mapping` with object disks (S3/GCS/Azure), fix 1278
- add GCS customer-supplied encryption key (CSEK) support for client-side encryption where the encryption key is controlled by the user, not Google. Use `GCS_ENCRYPTION_KEY` environment variable or `gcs.encryption_key` config option with base64-encoded 256-bit key, fix 1316
- add TLS support for Keeper connections, allows secure connections to ClickHouse Keeper with SSL/TLS certificates, fix 1312
- add `--skip-empty-tables` option to `restore` and `restore_remote` commands to skip tables with no data during restore, available in CLI, API handlers, and server mode, fix 1265
IMPROVEMENTS
- batch key deletions to speed up removal of old backups during backup retention, fix 1066
- add ClickHouse 25.12 to test matrix
- add example for minimal grants for backup user in Examples.md
- improve GCS connection handling: properly close readers on error to prevent goroutine leaks, change retry logging from Debug to Warn level
- use partitionId directly instead of INSERT INTO temp table for ClickHouse 21.8+, improves partition handling performance, fix 1315
- refactor table column type checking to use a single query before the freeze operation instead of per-table queries, fix 1194
- add KEEPER_TLS_ENABLES=1 by default in integration tests
- improve TestKeeperTLS, TestReplicatedCopyToDetached, and TestRestoreDistributedCluster test stability
- update GitHub Actions workflows and GOROOT configuration
- explain S3_FORCE_PATH_STYLE configuration option in documentation
BUG FIXES
- fix S3 multipart operations (upload, download, copy) to respect `S3_MAX_PARTS_COUNT` instead of hardcoded 10000 value, allows S3-compatible backends with stricter limits
- fix GCS credential conflicts when multiple authentication options are provided, refactor GCS Connect to avoid conflicting client options
- fix COS upload for large files, fix 1318
- fix list status update in server mode, fix 1317
- fix TestServerAPI and TestReplicatedCopyToDetached test failures in CI/CD
- fix OpenSSL/client config parsing for Keeper TLS connections, add comprehensive integration tests
- fix partition filtering when using `--restore-database-mapping`, `--restore-table-mapping` together with `--partitions` option
- security: update dependencies to fix CVE-2025-61729, CVE-2025-61727
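To see why a configurable parts limit matters for the `S3_MAX_PARTS_COUNT` fix above, here is a rough sketch of how a multipart part size can be derived from an object size and a maximum part count. `partSize` is a hypothetical helper, not the project's function; the 5 MiB floor is the minimum part size documented for AWS S3 multipart uploads, and other backends may use different limits.

```go
package main

import "fmt"

const minPartSize = 5 * 1024 * 1024 // 5 MiB, the AWS S3 minimum for non-final parts

// partSize returns the smallest part size that fits objectSize into at most
// maxParts parts, never going below minPartSize. maxParts must be positive.
func partSize(objectSize, maxParts int64) int64 {
	size := (objectSize + maxParts - 1) / maxParts // ceiling division
	if size < minPartSize {
		size = minPartSize
	}
	return size
}

func main() {
	const gib = int64(1024 * 1024 * 1024)
	// Same 100 GiB object, two different backend limits.
	fmt.Println(partSize(100*gib, 10000))
	fmt.Println(partSize(100*gib, 1000))
}
```

A backend that allows only 1000 parts needs parts roughly ten times larger for the same object, so a hardcoded 10000 would produce too many parts for it.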
2.6.41
2.6.40
v2.6.40
IMPROVEMENTS
- add ClickHouse 25.10 and 25.11 support to CI/CD test matrix
BUG FIXES
- properly handle `operationId` in `create_remote` and `restore_remote` HTTP handlers, fix 1272
- improve `--tables` parameter to automatically adjust according to `--restore-table-mapping` logic, fix 1278, fix 1302
- fix Download and Upload commands to properly close the resumable state, avoiding an infinite bolt lock in server mode when an upload or download command fails, fix 1304
- fix ApplyMacros behavior for Embedded backup/restore
- fix config race conditions in server mode
- fix GCS transient errors causing corruption, fix 1292
- change GCS default chunk size to 16MB, fix 1292
- fix support for `object_disk.VersionFullObjectKey=5` in ClickHouse 25.10+, fix 1290
- fix restore of refreshable materialized view, fix 1271