cdouglas/iceberg

Compaction Maps for Apache Iceberg

This branch is a research prototype built on Apache Iceberg 1.10.1 (tag apache-iceberg-1.10.1). It adds compaction maps: a compact data structure that records the position transformations applied by a compaction so that concurrent transactions writing position deletes (or deletion vectors) can be rebased onto the new layout instead of restarted.

Compactions and concurrent updates logically commute — compaction does not change table contents — but in current table formats they conflict on direct file references. A compaction map captures, per run of rows, the move from a source file to one or more target files. Either side of a conflict can use the map to rewrite its position-delete references and commit, with no global coordination beyond Iceberg's existing snapshot pointer swap.
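As a rough illustration of the idea above, the following is a minimal sketch of a compaction map and a position lookup. The class, field, and method names (`CompactionMapSketch`, `Run`, `addRun`, `remap`) are hypothetical and are not the prototype's API; the sketch also assumes runs within one source file do not overlap and that each run lands contiguously in a single target file.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;
import java.util.TreeMap;

/** Hypothetical sketch of a compaction map; not the prototype's actual classes. */
final class CompactionMapSketch {

  /** A contiguous run of rows moved from a source file into a target file. */
  static final class Run {
    final long sourceStart; // first row position of the run in the source file
    final long length;      // number of rows in the run
    final String targetFile;
    final long targetStart; // first row position of the run in the target file

    Run(long sourceStart, long length, String targetFile, long targetStart) {
      this.sourceStart = sourceStart;
      this.length = length;
      this.targetFile = targetFile;
      this.targetStart = targetStart;
    }
  }

  /** A rewritten (file, position) reference for a position delete. */
  static final class Position {
    final String file;
    final long pos;

    Position(String file, long pos) {
      this.file = file;
      this.pos = pos;
    }
  }

  // Source file path -> runs of that file, keyed by sourceStart (assumed non-overlapping).
  private final Map<String, TreeMap<Long, Run>> runsByFile = new HashMap<>();

  void addRun(String sourceFile, long sourceStart, long length,
              String targetFile, long targetStart) {
    runsByFile.computeIfAbsent(sourceFile, f -> new TreeMap<>())
        .put(sourceStart, new Run(sourceStart, length, targetFile, targetStart));
  }

  /** Translates a delete on (sourceFile, pos) into the compacted layout, if a run covers it. */
  Optional<Position> remap(String sourceFile, long pos) {
    TreeMap<Long, Run> runs = runsByFile.get(sourceFile);
    if (runs == null) {
      return Optional.empty(); // file was not an input of this compaction
    }
    Map.Entry<Long, Run> entry = runs.floorEntry(pos); // last run starting at or before pos
    if (entry == null) {
      return Optional.empty();
    }
    Run run = entry.getValue();
    if (pos >= run.sourceStart + run.length) {
      return Optional.empty(); // pos falls in a gap between runs
    }
    return Optional.of(new Position(run.targetFile, run.targetStart + (pos - run.sourceStart)));
  }
}
```

In this sketch, a writer holding a delete on row 7 of a compacted source file would call `remap` and, if a run covers that row, commit the delete against the returned target file and position instead. An empty result stands for a position the map does not cover; how the prototype treats that case is not specified here.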

Paper

Chris Douglas and Joseph M. Hellerstein. Commutative Compaction. 1st International Workshop on Data FORMATS for Modern Architectures and Workloads (FORMATS '26), May 31–June 5, 2026, Bengaluru, India. ACM. https://doi.org/10.1145/3802514.3809174

The paper introduces compaction maps, the rebase operation, and the remapping policy, and evaluates an Apache Iceberg prototype (this branch) on Apache Spark 3.5 and 4.0 with both v2 position delete files and v3 deletion vectors.

Benchmark Results

End-to-end remapping was measured against object storage in three clouds (AWS S3 us-west-2, Azure ADLSv2 westus2, GCS us-west1) on commodity VMs (4 vCPU, 16 GiB), varying the number of runs in the compaction map (10 to 10K) and the size of the position delete file or deletion vector (1K to 1M deletes):

[Figure: total latency heatmap for deletion vectors across AWS, Azure, and GCP]

  • Repairing a 1M-delete commit against a 10K-run compaction map completes in under 2.5 seconds in every cloud, including all I/O: 0.34–0.45 s for deletion vectors and 1.8–2.3 s for position delete files.
  • At 10K deletes (typical commit size) latency never exceeds half a second in any cloud, regardless of run count.
  • Deletion vectors outperform position delete files across the board; Parquet encode dominates the PD cost while the DV roaring-bitmap region is only a few KiB.
  • The compaction map itself is small: a 10-run map occupies 2.6 KiB on disk and a 10K-run map occupies 8.9 KiB.

This cost is negligible relative to the compaction it commutes with, since compactions typically run for minutes to hours.

See docs/docs/compaction_maps_bench.md for the full benchmark methodology and the JMH microbenchmark suite that drives the remapping-algorithm policy.

Status

Implementation is on the cmpmap branch:

  • Core data structure, builder, Avro storage, and manifest-list reference.
  • Conflict detection and SERIALIZABLE-isolation integration in BaseRowDelta / MergingSnapshotProducer.
  • Automatic remapping in RewriteDataFilesCommitManager for Spark 3.5 and 4.0, for both position delete files (v2) and deletion vectors (v3).
  • Empirically tuned remapping policy (IntervalTree / RangeQuery / StreamJoin) selected at runtime from the shape of the inputs.
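
To make the idea of a shape-dependent remapping policy concrete, here is a minimal sketch that remaps the sorted delete positions of one source file in two ways: a per-delete binary search over run starts, and a single merge-style pass over deletes and runs together. Everything in it is an assumption for illustration: the names (`RemapPolicySketch`, `Runs`, `pointLookup`, `streamJoin`), the omission of target file identifiers, and especially the toy cost comparison in `remap`, which stands in for the empirically tuned policy and does not reproduce it. The mapping between these two methods and the prototype's IntervalTree, RangeQuery, and StreamJoin strategies is likewise not implied.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical sketch; names and cost model are illustrative, not the prototype's. */
final class RemapPolicySketch {

  /** Runs of one source file: run i covers [srcStart[i], srcStart[i] + len[i]). */
  static final class Runs {
    final long[] srcStart; // sorted ascending, runs assumed non-overlapping
    final long[] len;
    final long[] dstStart; // position of the run in its target (target file id omitted)

    Runs(long[] srcStart, long[] len, long[] dstStart) {
      this.srcStart = srcStart;
      this.len = len;
      this.dstStart = dstStart;
    }

    int size() {
      return srcStart.length;
    }
  }

  /** One binary search per delete: O(d log r) for d deletes and r runs. */
  static List<Long> pointLookup(long[] deletes, Runs runs) {
    List<Long> out = new ArrayList<>();
    for (long pos : deletes) {
      int i = Arrays.binarySearch(runs.srcStart, pos);
      if (i < 0) {
        i = -i - 2; // index of the last run starting at or before pos
      }
      if (i >= 0 && pos < runs.srcStart[i] + runs.len[i]) {
        out.add(runs.dstStart[i] + (pos - runs.srcStart[i]));
      }
    }
    return out;
  }

  /** Merge-style pass over sorted deletes and sorted runs: O(d + r). */
  static List<Long> streamJoin(long[] sortedDeletes, Runs runs) {
    List<Long> out = new ArrayList<>();
    int r = 0;
    for (long pos : sortedDeletes) {
      // Skip runs that end at or before pos; later deletes cannot hit them either.
      while (r < runs.size() && runs.srcStart[r] + runs.len[r] <= pos) {
        r++;
      }
      if (r == runs.size()) {
        break; // no runs left, so no remaining delete is covered
      }
      if (pos >= runs.srcStart[r]) {
        out.add(runs.dstStart[r] + (pos - runs.srcStart[r]));
      }
    }
    return out;
  }

  /** Toy chooser (assumption): compare d * log2(r) against d + r. */
  static List<Long> remap(long[] sortedDeletes, Runs runs) {
    long d = sortedDeletes.length;
    long r = Math.max(1, runs.size());
    long log2r = 64 - Long.numberOfLeadingZeros(r); // bit length of r, a rough log2
    return d * log2r < d + r ? pointLookup(sortedDeletes, runs) : streamJoin(sortedDeletes, runs);
  }
}
```

Both methods return the same remapped positions for sorted input, so a chooser like this only affects cost. A few deletes against many runs favor the per-delete lookup, while a large delete set against comparatively few runs favors the single pass.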

Documentation

Detailed documentation lives in docs/docs/.

Building

This branch builds with the standard Iceberg toolchain (Gradle, Java 11/17/21):

./gradlew :iceberg-core:compileJava
./gradlew :iceberg-core:test --tests "*CompactionMap*"
./gradlew :iceberg-core:test --tests "*Remapping*"
./gradlew spotlessApply

For general Iceberg build, engine-compatibility, and contribution information, see the upstream project at https://iceberg.apache.org.
