Destroy row groups in parallel #19053

Merged: Mytherin merged 3 commits into duckdb:main from Mytherin:paralleldestructors on Sep 19, 2025

@Mytherin
Collaborator

When shutting down or closing a database, we need to destroy all loaded row groups. Very large databases can have many row groups, each with many column data objects and column segments loaded. Relying on regular C++ destructors to destroy them results in a single-threaded pass over all of this data. On many-core machines, this can be sped up significantly by destroying the row groups in parallel.
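
As a rough illustration of the idea (not the code in this PR, which is not shown in this conversation), the sketch below splits a vector of owned objects into contiguous slices and lets each worker thread run the destructors for its own slice. `RowGroup` here is a hypothetical stand-in rather than DuckDB's class, `DestroyInParallel` is an invented name, and plain `std::thread` is assumed in place of whatever scheduler the real change uses.

```cpp
#include <algorithm>
#include <atomic>
#include <cstddef>
#include <memory>
#include <thread>
#include <vector>

// Hypothetical stand-in for a loaded row group; NOT DuckDB's actual class.
struct RowGroup {
    explicit RowGroup(std::atomic<std::size_t> &counter) : destroyed(counter) {}
    ~RowGroup() {
        // Count destructions so the effect can be observed from outside.
        destroyed.fetch_add(1, std::memory_order_relaxed);
    }

    std::atomic<std::size_t> &destroyed;
    std::vector<char> payload = std::vector<char>(4096); // simulated column data
};

// Destroys every element of `groups` using up to `thread_count` threads.
// Each worker resets a disjoint, contiguous slice, so no locking is needed.
void DestroyInParallel(std::vector<std::unique_ptr<RowGroup>> &groups, std::size_t thread_count) {
    if (groups.empty()) {
        return;
    }
    // Never start more threads than there are elements, and always at least one.
    thread_count = std::max<std::size_t>(1, std::min(thread_count, groups.size()));
    const std::size_t chunk = (groups.size() + thread_count - 1) / thread_count;

    std::vector<std::thread> workers;
    workers.reserve(thread_count);
    for (std::size_t t = 0; t < thread_count; t++) {
        const std::size_t begin = std::min(t * chunk, groups.size());
        const std::size_t end = std::min(begin + chunk, groups.size());
        workers.emplace_back([&groups, begin, end] {
            for (std::size_t i = begin; i < end; i++) {
                groups[i].reset(); // runs ~RowGroup on this worker thread
            }
        });
    }
    for (auto &worker : workers) {
        worker.join();
    }
    groups.clear(); // only null pointers remain, so this is cheap
}
```

Without such a step, clearing the vector would invoke every destructor on the calling thread, one after another, which is the single-threaded pass described above.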

@Mytherin Mytherin requested a review from lnkuiper September 18, 2025 16:42
Member

@lnkuiper lnkuiper left a comment

This looks good! I built this locally, and timed how long it takes to quit the CLI on my laptop after running TPC-H Q1 at SF1000, but the time was similar (4s vs 3.5s).

Is the speedup only expected on Linux, or only on machines with many more cores?

@Mytherin
Collaborator Author

Mytherin commented Sep 19, 2025

On my machine with hits, timing the DETACH:

ATTACH 'hits.db';
.mode trash
FROM hits.hits; -- load entire db into memory
.timer on
DETACH hits;

Old:

Run Time (s): real 1.648 user 0.119399 sys 1.506338

New:

Run Time (s): real 0.813 user 1.067256 sys 5.999461
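
For reading the two timings side by side, here is a small sketch of the arithmetic. The struct and function names are made up for illustration, and the numbers are copied from the two `Run Time` lines above.

```cpp
#include <cstdio>

// One "Run Time" line from the CLI timer, in seconds.
struct RunTime {
    double real; // wall-clock time
    double user; // CPU time spent in user space
    double sys;  // CPU time spent in the kernel
};

// Total CPU time consumed across all threads.
double CpuSeconds(const RunTime &t) {
    return t.user + t.sys;
}

// Wall-clock speedup of `after` relative to `before`.
double WallSpeedup(const RunTime &before, const RunTime &after) {
    return before.real / after.real;
}

// Values copied from the "Old" and "New" timings above.
const RunTime kOld{1.648, 0.119399, 1.506338};
const RunTime kNew{0.813, 1.067256, 5.999461};

void PrintSummary() {
    std::printf("wall-clock speedup: %.2fx\n", WallSpeedup(kOld, kNew));
    std::printf("CPU seconds: %.3f -> %.3f\n", CpuSeconds(kOld), CpuSeconds(kNew));
}
```

The DETACH finishes in roughly half the wall-clock time (about 2.03x faster), while total CPU time rises from about 1.6 s to about 7.1 s, most of it `sys` time. That is the usual trade when work is spread over several threads: more CPU is spent overall so that the user waits less.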

@lnkuiper
Member

Thanks for sharing the DETACH time, that's a solid improvement! :)

@Mytherin Mytherin merged commit 74f64b6 into duckdb:main Sep 19, 2025
52 checks passed
krlmlr added a commit to krlmlr/duckdb-r that referenced this pull request Oct 21, 2025
krlmlr added a commit to krlmlr/duckdb-r that referenced this pull request Nov 1, 2025
krlmlr added a commit to krlmlr/duckdb-r that referenced this pull request Nov 2, 2025
@Mytherin Mytherin deleted the paralleldestructors branch December 4, 2025 11:32