[WIP] Antalya-26.3 Added support for TTL EXPORT #1810

Draft
mkmkme wants to merge 11 commits into antalya-26.3 from mkmkme/antalya-26.3/ttl-export-partition

Conversation

Collaborator

@mkmkme mkmkme commented May 19, 2026

Fixes #1793

Changelog category (leave one):

  • New Feature

Changelog entry (a user-readable short description of the changes that goes to CHANGELOG.md):

TBD

Documentation entry for user-facing changes

TBD

CI/CD Options

Exclude tests:

  • Fast test
  • Integration Tests
  • Stateless tests
  • Stateful tests
  • Performance tests
  • All with ASAN
  • All with TSAN
  • All with MSAN
  • All with UBSAN
  • All with Coverage
  • All with Aarch64
  • All Regression
  • Disable CI Cache

Regression jobs to run:

  • Fast suites (mostly <1h)
  • Aggregate Functions (2h)
  • Alter (1.5h)
  • Benchmark (30m)
  • ClickHouse Keeper (1h)
  • Iceberg (2h)
  • LDAP (1h)
  • Parquet (1.5h)
  • RBAC (1.5h)
  • SSL Server (1h)
  • S3 (2h)
  • S3 Export (2h)
  • Swarms (30m)
  • Tiered Storage (2h)

@mkmkme mkmkme added the antalya, port-antalya (PRs to be ported to all new Antalya releases), and antalya-26.3 labels May 19, 2026

github-actions Bot commented May 19, 2026

Workflow [PR], commit [3e6e29e]

Collaborator Author

mkmkme commented May 19, 2026

With the new commit, 04206 (syntax check) is now passing.

@mkmkme mkmkme force-pushed the mkmkme/antalya-26.3/ttl-export-partition branch from 0f90fd6 to 1d0c1a7 May 19, 2026 11:14
mkmkme and others added 9 commits May 20, 2026 15:02
Tests describe the contract for the upcoming `TTL ... EXPORT TO db.table`
action. They are added before the C++ implementation so they double as the
acceptance criteria.

Stateless (tests/queries/0_stateless):
- 04206_ttl_export_partition_syntax: parser/metadata round-trip and rejection
  of (a) two EXPORT TTLs to the same destination and (b) EXPORT TTL on a
  table without a partition key.
- 04207_ttl_export_partition_basic: happy path, plus an in-line assertion
  that a future-dated partition is not exported.
- 04208_ttl_export_partition_skip_already_exported: re-triggering after a
  partition has been exported does not duplicate it.

Integration (tests/integration/test_ttl_export_partition):
- test_basic_to_iceberg, test_only_one_replica_submits,
  test_failure_and_backoff, test_serial_across_partitions,
  test_replica_restart_mid_export,
  test_modify_ttl_picks_up_with_materialize, test_disabled_replica,
  test_dedup_via_high_water_mark.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Introduce parser, AST, and metadata plumbing for `TTL ... EXPORT TO db.table`.
No background scheduler or per-part TTL info yet — those land in follow-up
commits. The clause is recognised, round-trips through `SHOW CREATE TABLE`,
and the resulting `TTLDescription` is collected into `TTLTableDescription`'s
new `export_ttl` list (exposed via `StorageInMemoryMetadata::getExportTTLs`).

Validation in `TTLTableDescription::getTTLForTableFromAST`:
  * reject two `EXPORT` clauses to the same destination,
  * reject `EXPORT` TTL on a table with no partition key.

The destination-specific override of `TTLDescription::result_column`
(`"_export_" + db + "." + table`) is required so that future per-part TTL
info (keyed by `result_column`) keeps separate clocks per destination.
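
The two validation rules and the destination-specific column name can be sketched in isolation. This is an illustration only, not the code from this commit: `ExportClause`, `exportResultColumn`, and `validateExportClauses` are hypothetical names, and the error strings are made up.

```cpp
#include <cassert>
#include <optional>
#include <set>
#include <string>
#include <vector>

// Hypothetical stand-in for one parsed EXPORT clause (destination only).
struct ExportClause
{
    std::string database;
    std::string table;
};

// Destination-specific result column, following the naming scheme quoted above.
std::string exportResultColumn(const ExportClause & clause)
{
    return "_export_" + clause.database + "." + clause.table;
}

// Returns an error message if either rule is violated, std::nullopt otherwise.
std::optional<std::string> validateExportClauses(
    const std::vector<ExportClause> & clauses, bool has_partition_key)
{
    if (!clauses.empty() && !has_partition_key)
        return "EXPORT TTL requires a partition key";

    std::set<std::string> seen;
    for (const auto & clause : clauses)
    {
        // insert().second is false when this destination was already seen.
        if (!seen.insert(exportResultColumn(clause)).second)
            return "duplicate EXPORT destination " + clause.database + "." + clause.table;
    }
    return std::nullopt;
}
```

Because the result column embeds the destination, two clauses that target different tables always get distinct keys, which is what lets per-part TTL info track them separately.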

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Stores per-part TTL info under `MergeTreeDataPartTTLInfos::export_ttl`,
keyed by `TTLDescription::result_column`. The map is:

  * populated at write time in `MergeTreeDataWriter` for every TTL
    returned by `getExportTTLs`,
  * recomputed during `MATERIALIZE TTL` and merge-time TTL recompute via
    `TTLCalcTransform` / `TTLTransform` (a new `TTLUpdateField::EXPORT_TTL`
    finalizes into the right map),
  * serialized in JSON under the `"export"` key (mirroring the
    `recompression` entry),
  * propagated across merges through the existing `update` aggregation,
  * surfaced through `hasAnyNonFinishedTTLs` and `checkAllTTLCalculated`
    so old parts that predate the TTL are flagged for `MATERIALIZE TTL`.

Adds the partition-wide helper `getPartitionExportTTLMax`: returns the
max `export_ttl.max` across all parts of a partition, or `nullopt` if
any part is missing the entry (with optional `missing_parts_out` for
the scheduler to log). Deliberate: no on-the-fly evaluation — the user
runs `ALTER TABLE ... MATERIALIZE TTL` to backfill, same UX as moves
and recompression TTLs.
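
The "max across parts, or nothing if any part lacks the entry" contract can be sketched as follows. This is a simplified model under assumptions: `PartSketch` and `partitionExportTTLMax` are hypothetical, and real parts carry far more state than a name and a map.

```cpp
#include <algorithm>
#include <cassert>
#include <ctime>
#include <map>
#include <optional>
#include <string>
#include <vector>

// Hypothetical, simplified view of a data part: its name plus the maximum
// export TTL timestamp per destination, keyed by result column.
struct PartSketch
{
    std::string name;
    std::map<std::string, std::time_t> export_ttl_max;
};

// Returns the largest export TTL across all parts, or std::nullopt if any
// part has no entry for `result_column`. Names of parts without an entry are
// appended to `missing_parts_out` when it is provided.
std::optional<std::time_t> partitionExportTTLMax(
    const std::vector<PartSketch> & parts,
    const std::string & result_column,
    std::vector<std::string> * missing_parts_out = nullptr)
{
    std::optional<std::time_t> result;
    bool any_missing = false;

    for (const auto & part : parts)
    {
        auto it = part.export_ttl_max.find(result_column);
        if (it == part.export_ttl_max.end())
        {
            any_missing = true;
            if (missing_parts_out)
                missing_parts_out->push_back(part.name);
            continue; // keep scanning so every missing part is reported
        }
        result = result ? std::max(*result, it->second) : it->second;
    }

    if (any_missing)
        return std::nullopt;
    return result;
}
```

Returning no value instead of a partial maximum keeps the caller from exporting a partition whose oldest parts have not been through `MATERIALIZE TTL` yet.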

Also pulls `export_ttl` into `hasAnyTableTTL` / `hasOnlyRowsTTL` and the
`getColumnDependencies` TTL-column-set walk.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Align the TTL syntax with `ALTER ... EXPORT PARTITION TO TABLE`: the
keyword is now `EXPORT TO TABLE <db.table>` instead of `EXPORT TO
<db.table>`. Parser, formatter, exception messages, and tests updated to
match.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Adds an `ExportOrigin` enum (`alter` | `ttl`) to the manifest body so
manifests submitted manually (`ALTER ... EXPORT PARTITION`) can be told
apart from manifests submitted by the upcoming TTL scheduler. Surfaced
as `system.replicated_partition_exports.export_origin
Enum8('alter' = 0, 'ttl' = 1)`. Existing manifests in ZooKeeper that
don't carry the field read back as `alter` for backwards compatibility.
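
The backwards-compatible read can be illustrated with a small sketch. It assumes the origin is stored as a string field that may be absent; the actual manifest encoding is not shown in this PR text, and `parseExportOrigin` is a hypothetical helper.

```cpp
#include <cassert>
#include <optional>
#include <stdexcept>
#include <string>

// Numeric values mirror the Enum8 definition quoted above.
enum class ExportOrigin
{
    alter = 0,
    ttl = 1
};

// `field` is the raw origin value from the manifest, or std::nullopt when an
// older manifest does not carry the field at all (assumed representation).
ExportOrigin parseExportOrigin(const std::optional<std::string> & field)
{
    if (!field || *field == "alter")
        return ExportOrigin::alter; // a missing field reads back as `alter`
    if (*field == "ttl")
        return ExportOrigin::ttl;
    throw std::invalid_argument("unknown export origin: " + *field);
}
```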

`ttl`-origin manifests are skipped by manifest-TTL eviction: the
background cleanup in `ExportPartitionManifestUpdatingTask` and the
overwrite path in `StorageReplicatedMergeTree::exportPartitionToTable`
both refuse to consider them expired. The existing
`export_merge_tree_partition_force_export` setting still overrides via
the unchanged gate.
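
One way to picture the eviction rule is shown below. It is a sketch under assumptions: the function names are hypothetical, expiry is modelled as a plain age comparison, and the force flag is modelled as a simple override of the overwrite check.

```cpp
#include <cassert>
#include <chrono>

enum class ExportOrigin
{
    alter,
    ttl
};

// A ttl-origin manifest is never treated as expired, whatever its age.
bool isManifestExpired(
    ExportOrigin origin, std::chrono::seconds age, std::chrono::seconds manifest_ttl)
{
    if (origin == ExportOrigin::ttl)
        return false;
    return age > manifest_ttl;
}

// Overwrite gate: the force flag bypasses the expiry check for either origin.
bool mayOverwriteManifest(
    ExportOrigin origin,
    std::chrono::seconds age,
    std::chrono::seconds manifest_ttl,
    bool force_export)
{
    return force_export || isManifestExpired(origin, age, manifest_ttl);
}
```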

The write site keeps the default `ExportOrigin::alter`; the TTL
submitter writes `ExportOrigin::ttl` in a follow-up commit.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Adds a new query-level setting `export_merge_tree_partition_mark_as_ttl`
(default false). When set on `ALTER ... EXPORT PARTITION`, the resulting
manifest is written with `export_origin = ttl` (same as what the TTL
scheduler will write in a follow-up). The TTL scheduler always sets this
implicitly when it submits.

Enforces the "at most one ttl-origin manifest per (src, dest)"
invariant at submission time: when a ttl-origin manifest is being
created, scan siblings under `<zk_root>/exports/` for an existing
ttl-origin marker at a different `partition_id`. If one is found at
`P_old` and the new partition sorts before it (`new < P_old`), the
submission is rejected as a back-fill unless
`export_merge_tree_partition_force_export` is set. In all other cases
the old marker is removed with a best-effort `tryRemoveRecursive`
before the new one is created. Same-key collisions continue to be
handled by the existing block.
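
The decision taken at submission time can be sketched as a pure function. The names are hypothetical, and comparing partition IDs as strings is an assumption made for the example, not something this commit message states.

```cpp
#include <cassert>
#include <optional>
#include <string>

enum class TtlMarkerDecision
{
    NoConflict,     // no ttl-origin sibling at a different partition
    ReplaceOld,     // remove the old marker (best effort), then create the new one
    RejectBackfill  // new partition is older than the existing marker
};

// `existing_ttl_partition` is the partition_id of a ttl-origin sibling, if any.
TtlMarkerDecision decideTtlMarker(
    const std::optional<std::string> & existing_ttl_partition,
    const std::string & new_partition,
    bool force_export)
{
    // Same-key collisions are handled elsewhere, so they are not a conflict here.
    if (!existing_ttl_partition || *existing_ttl_partition == new_partition)
        return TtlMarkerDecision::NoConflict;

    // Assumption: partition IDs are ordered by plain string comparison.
    if (new_partition < *existing_ttl_partition && !force_export)
        return TtlMarkerDecision::RejectBackfill;

    return TtlMarkerDecision::ReplaceOld;
}
```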

A plain `alter` over a ttl marker at a different partition is allowed
without friction — alter manifests coexist with the ttl marker, and
the TTL scheduler will filter by `export_origin = ttl` when reading
its own state.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Extracts the readable-vs-insertable column diff and the partition-key
AST compare from `StorageReplicatedMergeTree::exportPartitionToTable`
into `ExportPartitionUtils::verifyExportDestinationCompatibility`, and
calls it from `TTLTableDescription::getTTLForTableFromAST` for every
`TTLMode::EXPORT` clause when not attaching. The destination is resolved
through `DatabaseCatalog::getTable`, matching the manual
`ALTER ... EXPORT PARTITION` flow (throws `UNKNOWN_TABLE` if missing).

The check is skipped under `is_attach=true` because the destination
table may not yet be loaded at server startup; submission-time
validation in `exportPartitionToTable` still covers that path.

Iceberg destinations skip the partition-key AST compare here; the
existing `verifyIcebergPartitionCompatibility` runs against the runtime
iceberg metadata at submission time.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Introduces `TTLExportScheduler`, a per-`StorageReplicatedMergeTree`
background driver that submits partition exports for tables with
`TTL ... EXPORT TO TABLE db.table`. The scheduler is stateless across
restarts: it reads the latest `export_origin = ttl` manifest from
ZooKeeper on every tick and acts on its status — no manifest →
submit the smallest eligible partition; PENDING → wait; COMPLETED →
walk forward to `partition_id > completed`; FAILED → resubmit with
`force_export=1` after per-partition exponential backoff; KILLED →
idle with a `LOG_WARNING` carrying the recovery recipe.
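
The per-tick decision reads naturally as a small state machine. The sketch below models only the mapping from manifest status to action; the enum and function names are hypothetical, and waiting while the backoff has not yet elapsed is an assumption about how the FAILED case behaves between retries.

```cpp
#include <cassert>
#include <optional>

enum class ManifestStatus
{
    Pending,
    Completed,
    Failed,
    Killed
};

enum class SchedulerAction
{
    SubmitSmallestEligible,   // no ttl-origin manifest exists yet
    Wait,                     // export in flight, or backoff still running
    SubmitNextAfterCompleted, // walk forward to partition_id > completed
    ResubmitWithForce,        // retry the failed partition with force_export=1
    IdleAndWarn               // killed: log a warning and wait for the operator
};

// `latest` is the status of the latest ttl-origin manifest, if one exists.
SchedulerAction nextAction(const std::optional<ManifestStatus> & latest, bool backoff_elapsed)
{
    if (!latest)
        return SchedulerAction::SubmitSmallestEligible;

    switch (*latest)
    {
        case ManifestStatus::Pending:
            return SchedulerAction::Wait;
        case ManifestStatus::Completed:
            return SchedulerAction::SubmitNextAfterCompleted;
        case ManifestStatus::Failed:
            return backoff_elapsed ? SchedulerAction::ResubmitWithForce : SchedulerAction::Wait;
        case ManifestStatus::Killed:
            return SchedulerAction::IdleAndWarn;
    }
    return SchedulerAction::Wait; // unreachable for valid enum values
}
```

Since the action depends only on what is read from ZooKeeper at that tick, a restarted replica arrives at the same decision without any local state.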

`submit` classifies outcomes as `Submitted | Transient | Failure` so
ZK CAS races and `UNKNOWN_TABLE` (destination dropped post-DDL) do
not bump backoff, while genuine submission errors do.

Adds three table-level settings used by the scheduler:
`export_merge_tree_partition_ttl_poll_interval_seconds` (default 5),
`export_merge_tree_partition_ttl_min_backoff_seconds` (default 1),
`export_merge_tree_partition_ttl_max_backoff_seconds` (default 60).
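
How the outcome classes and the two backoff bounds might interact is sketched below. The doubling factor, the reset on success, and the names `BackoffState`, `recordOutcome`, and `backoffDelay` are assumptions for illustration; only the three outcome names and the default bounds (1 s and 60 s) come from the text above.

```cpp
#include <algorithm>
#include <cassert>
#include <chrono>

enum class SubmitOutcome
{
    Submitted,
    Transient, // e.g. a ZK CAS race or UNKNOWN_TABLE: does not bump backoff
    Failure    // genuine submission error: bumps backoff
};

struct BackoffState
{
    unsigned consecutive_failures = 0;
};

void recordOutcome(BackoffState & state, SubmitOutcome outcome)
{
    switch (outcome)
    {
        case SubmitOutcome::Submitted:
            state.consecutive_failures = 0; // assumption: success resets the counter
            break;
        case SubmitOutcome::Transient:
            break; // leave the counter untouched
        case SubmitOutcome::Failure:
            ++state.consecutive_failures;
            break;
    }
}

// Delay before the next attempt: min_backoff doubled per extra failure, capped at max_backoff.
std::chrono::seconds backoffDelay(
    const BackoffState & state,
    std::chrono::seconds min_backoff = std::chrono::seconds{1},
    std::chrono::seconds max_backoff = std::chrono::seconds{60})
{
    if (state.consecutive_failures == 0)
        return std::chrono::seconds{0};

    std::chrono::seconds delay = min_backoff;
    // Stop doubling once the cap is reached so the value cannot overflow.
    for (unsigned i = 1; i < state.consecutive_failures && delay < max_backoff; ++i)
        delay *= 2;
    return std::min(delay, max_backoff);
}
```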

The scheduler is not yet wired into the background task pool; that
follows in a separate commit.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Declare the scheduler and its background task next to the other
`export_merge_tree_partition_*` task holders. Under the
`allow_experimental_export_merge_tree_partition` server gate:

- Construct the scheduler and create the `TTLExport` task, logging any
  exceptions from `run` via `tryLogCurrentException`.
- `ReplicatedMergeTreeRestartingThread::tryStartup` activates the task
  alongside the other export tasks; `partialShutdown` deactivates it.
- `alter` calls `ttl_export_task->schedule` when any `MODIFY TTL`
  command is in the alter so newly added EXPORT TTLs take effect
  immediately.
- `TTLExportScheduler::run` reschedules itself with
  `export_merge_tree_partition_ttl_poll_interval_seconds * jitter25`
  on the polling path. Early returns (shutdown, readonly, no EXPORT
  TTL, experimental gate off) intentionally skip the reschedule —
  deactivation, the `alter` hook, and server-level config drive those
  paths instead.
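
The jittered reschedule can be sketched as follows. The commit message does not define `jitter25`; the sketch assumes it means a multiplicative factor drawn uniformly from [0.75, 1.25), and `jitteredInterval` is a hypothetical helper.

```cpp
#include <cassert>
#include <chrono>
#include <random>

// Scales the poll interval by a random factor so replicas do not tick in lockstep.
// Assumption: "jitter25" is taken to mean a factor in [0.75, 1.25).
std::chrono::milliseconds jitteredInterval(std::chrono::seconds base, std::mt19937_64 & rng)
{
    std::uniform_real_distribution<double> factor(0.75, 1.25);
    const auto base_ms = std::chrono::duration_cast<std::chrono::milliseconds>(base);
    const double scaled = static_cast<double>(base_ms.count()) * factor(rng);
    return std::chrono::milliseconds(static_cast<std::chrono::milliseconds::rep>(scaled));
}
```

With the default poll interval of 5 seconds, this would spread ticks over roughly 3.75 to 6.25 seconds.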

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@mkmkme mkmkme force-pushed the mkmkme/antalya-26.3/ttl-export-partition branch from 1d0c1a7 to f7d8a27 May 20, 2026 13:10