Skip to content

Conversation

@Tmonster
Copy link
Contributor

@Tmonster Tmonster commented Nov 28, 2025

Disable sampling over the Allocator::MAXIMUM_ALLOC_SIZE.

Error is thrown in the same place as what might happen if someone samples -1 rows.

Fixes #17780
Fixes https://github.com/duckdblabs/duckdb-internal/issues/5065

auto rows = sample_value.GetValue<int64_t>();
if (rows < 0) {
throw ParserException("Sample rows %lld out of range, must be bigger than or equal to 0", rows);
if (rows < 0 || sample_value.GetValue<uint64_t>() >= Allocator::MAXIMUM_ALLOC_SIZE) {
Copy link
Collaborator

@Mytherin Mytherin Nov 29, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The sample size is in rows but the allocator max size is in bytes - I suspect that sampling for 281474976710656-1 rows will still throw the allocation error. Should we reduce the limit further / add a test for this?

Alternatively, we might want to not allocate all the space for the sample up-front if the sample size is very large - but that's obviously a larger change.

@duckdb-draftbot duckdb-draftbot marked this pull request as draft December 1, 2025 13:00
@Tmonster
Copy link
Contributor Author

Tmonster commented Dec 1, 2025

Have updated it so that no more than 1 billion rows can be requested.

No to get an internal error every row sampled would need 281kb of data for the allocation limit to be reached. I can reduce the max row limit if desired

@Tmonster Tmonster marked this pull request as ready for review December 1, 2025 13:32
@pdet pdet merged commit e555239 into duckdb:v1.4-andium Dec 1, 2025
60 checks passed
github-actions bot pushed a commit to duckdb/duckdb-r that referenced this pull request Dec 1, 2025
github-actions bot added a commit to duckdb/duckdb-r that referenced this pull request Dec 1, 2025
No sampling over 281TB (duckdb/duckdb#19978)

Co-authored-by: krlmlr <krlmlr@users.noreply.github.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants