-
Notifications
You must be signed in to change notification settings - Fork 2.8k
Windows must sample less #20018
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Windows must sample less #20018
Conversation
| # group: [sample] | ||
|
|
||
| # sample max on windows 32 bit is 10000000, otherwise 1000000000 | ||
| require notwindows |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this should be require ram 16gb or similar - given the reservoir sample down below is pre-allocating loads of memory in the current implementation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've added require ram 16gb. Makes se wonder if I should lower the sample limit even further? 1,000,000,000 (1 billion) rows at 1 byte per row is already one GB. Maybe assume 1kb per row max? Which would make the max reservoir sample size something like 100,000 rows?
The error message could then include that you should use a percentage if you want more. Reservoir percentage samples are allocated as the table scan executes, and not all at once.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we should just not pre-allocate the entire reservoir if the reservoir exceeds some threshold instead of lowering the limit, but let's do that at a later point.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree, have made a ticket https://github.com/duckdblabs/duckdb-internal/issues/6757
|
First checking CI locally here |
|
I'll mirror this PR internally so we can track it for the release (it is related to InvokeCI failing, right)? |
|
Looks like MacOS failed int he extensions, but that is due to a download failure here |
|
Or is this error also happening on other PRs / something broke on our side? |
|
I can't selectively rerun that one CI run right now, but I think it will be possible once more CI runs are done |
|
Thanks! |
Windows must sample less (duckdb/duckdb#20018) Bump MySQL scanner (duckdb/duckdb#20025) Fix INSERT OR REPLACE BY NAME with partial columns(duckdb/duckdb#19845) (duckdb/duckdb#19989)
Windows must sample less (duckdb/duckdb#20018) Bump MySQL scanner (duckdb/duckdb#20025) Fix INSERT OR REPLACE BY NAME with partial columns(duckdb/duckdb#19845) (duckdb/duckdb#19989) Co-authored-by: krlmlr <krlmlr@users.noreply.github.com>
Fixes nightly failure found here https://github.com/duckdb/duckdb/actions/runs/19842531250/job/56857957914#step:6:4105