[Compression] Prevent overriding COMPRESSION_EMPTY with COMPRESSION_CONSTANT
#19913
…filter logic doesn't work otherwise
We should assert that
I don't think that's necessary - the
Thanks!
More testing for appender and attach-detach (duckdb/duckdb#19708)
Make `make tidy-check-diff` compare against base branch, instead of always comparing against `origin/main` (duckdb/duckdb#19917)
[Compression] Prevent overriding `COMPRESSION_EMPTY` with `COMPRESSION_CONSTANT` (duckdb/duckdb#19913)
free disk space in Upload Extensions job (duckdb/duckdb#19912)
Co-authored-by: krlmlr <krlmlr@users.noreply.github.com>
This PR fixes duckdblabs/duckdb-internal#6797
#19913 turned out to be a band-aid.
If the stats say it's constant, then it's actually constant; otherwise it wouldn't get there.
This PR fixes https://github.com/duckdblabs/duckdb-internal/issues/6656
`COMPRESSION_EMPTY` was added by #15591.
`COMPRESSION_EMPTY` is used when we detect that the compression algorithm used for the data can also encode the validity of the column, so the validity does not need to be compressed separately.
`COMPRESSION_CONSTANT` is used when we detect that all values in the segment are the same. It is also used for validity when everything is null or everything is non-null, so the data does not need to be compressed.
They are semantically similar, but some of the callbacks on the `CompressionFunction` behave differently, for example `filter`.
This callback is used to implement filter pushdown at the storage level.
For a constant-compressed validity segment, a null/non-null filter matches either zero rows or all rows.
For an empty-compressed validity segment, the callback should be a no-op, because the filter is implemented by the column's compression function.
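
To make the difference in `filter` behavior concrete, here is a minimal sketch. It is not DuckDB's actual code: `ValidityCompression`, `NullFilter`, `ValiditySegment`, and `FilterValidity` are hypothetical names invented for illustration, and the real `CompressionFunction` callback has a different signature.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-ins, not DuckDB's real types.
enum class ValidityCompression { CONSTANT, EMPTY };
enum class NullFilter { IS_NULL, IS_NOT_NULL };

struct ValiditySegment {
    ValidityCompression compression;
    // Only meaningful for CONSTANT: true = no NULLs, false = all NULLs.
    bool all_valid;
};

// Narrows `selection` (row indices) for a null/non-null filter and
// returns how many rows remain selected.
std::size_t FilterValidity(const ValiditySegment &segment, NullFilter filter,
                           std::vector<std::uint32_t> &selection) {
    if (segment.compression == ValidityCompression::EMPTY) {
        // The validity is encoded by the data column's compression, so this
        // segment has nothing to decide: leave the selection untouched.
        return selection.size();
    }
    assert(segment.compression == ValidityCompression::CONSTANT);
    // Every row has the same validity, so the filter keeps all rows or none.
    const bool keep_all = (filter == NullFilter::IS_NOT_NULL) == segment.all_valid;
    if (!keep_all) {
        selection.clear();
    }
    return selection.size();
}
```

Under these assumptions, treating an empty-compressed segment as constant would let the validity segment answer the filter from a flag that does not describe the data, which is why the two types must not be confused.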