@Tishj Tishj commented Nov 24, 2025

This PR fixes https://github.com/duckdblabs/duckdb-internal/issues/6656

COMPRESSION_EMPTY was added by #15591

COMPRESSION_EMPTY is used when we detect that the compression algorithm used for the data can also encode the validity of the column, so the validity does not need to be compressed separately.

COMPRESSION_CONSTANT is used when we detect that all values in the segment are the same, so the data does not need to be compressed. It is also used for validity when everything is null or everything is non-null.
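The distinction between the two can be sketched with a toy model. This is only an illustration under assumptions: `ValidityCompression`, `ChooseValidityCompression` and the `data_encodes_validity` flag are hypothetical names, not DuckDB's actual types or API, and the check order (EMPTY first, so CONSTANT never overrides it) is an assumption based on the title of this PR.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for the validity compression choice; not DuckDB's real enum.
enum class ValidityCompression { EMPTY, CONSTANT, OTHER };

// Toy decision: how would the validity of one segment be stored?
// `data_encodes_validity` is an assumed flag meaning "the data's compression
// algorithm already carries the NULL information".
ValidityCompression ChooseValidityCompression(bool data_encodes_validity,
                                              const std::vector<bool> &is_valid) {
    if (data_encodes_validity) {
        // Checked first (assumption), so CONSTANT can never override EMPTY.
        return ValidityCompression::EMPTY;
    }
    // All rows valid or all rows null: a single constant describes the segment.
    bool all_same = true;
    for (std::size_t i = 1; i < is_valid.size(); i++) {
        if (is_valid[i] != is_valid[0]) {
            all_same = false;
            break;
        }
    }
    return all_same ? ValidityCompression::CONSTANT : ValidityCompression::OTHER;
}
```

In this sketch a segment whose validity happens to be all-valid still reports EMPTY when the data compression encodes validity, which is the situation the two modes need to keep apart.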

They are semantically similar, but some of the callbacks on the CompressionFunction behave differently, for example `filter`.

This callback is used to implement filter pushdown at the storage level.

For a constant-compressed validity segment, a null/non-null filter selects either 0 rows or all rows.
For an empty-compressed validity segment, this should be a no-op; the filter is implemented by the column's compression function.
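A minimal sketch of the two filter behaviors described above, assuming a simplified model where a filter just returns how many of the currently selected rows survive. `NullFilter`, `FilterConstantValidity` and `FilterEmptyValidity` are hypothetical names for illustration, not the real callback signatures.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical filter kinds; the real storage-level filters are richer than this.
enum class NullFilter { IS_NULL, IS_NOT_NULL };

// Constant validity: every row in the segment is valid, or every row is null,
// so a null/non-null filter keeps either all selected rows or none of them.
std::size_t FilterConstantValidity(bool all_valid, NullFilter filter, std::size_t selected) {
    bool matches = (filter == NullFilter::IS_NOT_NULL) == all_valid;
    return matches ? selected : 0;
}

// Empty validity: the validity segment holds no information of its own.
// The selection is left untouched; in this model the data column's
// compression function is the one that applies the filter.
std::size_t FilterEmptyValidity(NullFilter /*filter*/, std::size_t selected) {
    return selected;
}
```

If an empty-compressed segment were treated as constant here, an `IS NULL` filter would drop every row even though the data column may still contain nulls, which is why the two cases must not be conflated.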

Tishj commented Nov 24, 2025

We should assert that Filter is implemented if the compression function claims that it can include the validity information.

@Mytherin
Collaborator

> We should assert that Filter is implemented if the compression function claims that it can include the validity information.

I don't think that's necessary - the Filter optimization shouldn't be required for any compression algorithm. It will just fall back to scan + filter if it is not implemented and not touch this code.
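The fallback described here can be sketched roughly as follows. This is a toy model under assumptions: `ToyCompressionFunction`, its `scan` and `filter` members and `SelectEqual` are hypothetical and do not mirror DuckDB's actual CompressionFunction interface.

```cpp
#include <cassert>
#include <functional>
#include <vector>

// Hypothetical compression function with a mandatory scan and an optional filter.
struct ToyCompressionFunction {
    // Decompresses a segment into plain values.
    std::function<std::vector<int>(const std::vector<int> &)> scan;
    // Optional optimization: returns only the values equal to the constant.
    // Left empty when the algorithm does not implement filter pushdown.
    std::function<std::vector<int>(const std::vector<int> &, int)> filter;
};

// Returns the values equal to `constant`, using the filter callback when it
// exists and falling back to scan + filter otherwise.
std::vector<int> SelectEqual(const ToyCompressionFunction &fn,
                             const std::vector<int> &segment, int constant) {
    if (fn.filter) {
        return fn.filter(segment, constant);
    }
    std::vector<int> result;
    for (int value : fn.scan(segment)) {
        if (value == constant) {
            result.push_back(value);
        }
    }
    return result;
}
```

Because the fallback path never calls `filter`, an algorithm that leaves it unset still produces correct results in this model; it only misses the optimization.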

@duckdb-draftbot duckdb-draftbot marked this pull request as draft November 24, 2025 20:50
@Tishj Tishj marked this pull request as ready for review November 24, 2025 20:51
@Mytherin Mytherin merged commit f019064 into duckdb:v1.4-andium Nov 25, 2025
81 of 82 checks passed
@Mytherin
Collaborator

Thanks!

github-actions bot pushed a commit to duckdb/duckdb-r that referenced this pull request Nov 27, 2025
More testing for appender and attach-detach (duckdb/duckdb#19708)
Make `make tidy-check-diff` compare against base branch, instead of always comparing against `origin/main` (duckdb/duckdb#19917)
[Compression] Prevent overriding `COMPRESSION_EMPTY` with `COMPRESSION_CONSTANT` (duckdb/duckdb#19913)
free disk space in Upload Extensions job (duckdb/duckdb#19912)
github-actions bot added a commit to duckdb/duckdb-r that referenced this pull request Nov 27, 2025
More testing for appender and attach-detach (duckdb/duckdb#19708)
Make `make tidy-check-diff` compare against base branch, instead of always comparing against `origin/main` (duckdb/duckdb#19917)
[Compression] Prevent overriding `COMPRESSION_EMPTY` with `COMPRESSION_CONSTANT` (duckdb/duckdb#19913)
free disk space in Upload Extensions job (duckdb/duckdb#19912)

Co-authored-by: krlmlr <krlmlr@users.noreply.github.com>
Tishj added a commit to Tishj/duckdb that referenced this pull request Dec 9, 2025
Mytherin added a commit that referenced this pull request Dec 9, 2025
This PR fixes duckdblabs/duckdb-internal#6797

#19913 turned out to be a band-aid.
If the stats say it's constant, then it's actually constant; otherwise it wouldn't get there.
NiclasHaderer pushed a commit to NiclasHaderer/duckdb that referenced this pull request Dec 22, 2025