Skip to content

Conversation

@kryonix
Copy link
Contributor

@kryonix kryonix commented Dec 3, 2025

Mark joins without correlation on the left side could not be handled correctly without this PR. Information regarding the correlated columns was simply lost, which could lead to wrong query behavior. This PR fixes this and introduces correct decorrelation of mark joins without correlation on the left side.


Fix: #19504

Mark joins without correlation on the left side could not be handled
correctly without this PR. Information regarding the correlated columns
was simply lost, which could lead to wrong query behavior. This PR fixes
this and introduces correct decorrelation of mark joins without
correlation on the left side.

----

fix: duckdb#19504
@kryonix kryonix force-pushed the fix_decorrelation_mark_join branch from 8e752a0 to 55ae961 Compare December 3, 2025 14:37
@kryonix kryonix changed the base branch from main to v1.4-andium December 3, 2025 14:37
@kryonix kryonix marked this pull request as draft December 3, 2025 14:38
@kryonix kryonix marked this pull request as ready for review December 3, 2025 14:38
… side have correlation

Kind of a weird case, maybe we should leave the join decorrelation early
here. Will have to investigate some time.
@duckdb-draftbot duckdb-draftbot marked this pull request as draft December 3, 2025 17:43
@kryonix kryonix marked this pull request as ready for review December 4, 2025 05:47
Copy link
Collaborator

@lnkuiper lnkuiper left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the PR! Looks good to me. I just have one question, then this can be merged.

}
} else if (join.join_type == JoinType::MARK) {
if (!left_has_correlation && right_has_correlation) {
// found a MARK join where the left side has no correlation
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This covers the left/right {0, 1} correlation case. Are we sure all other cases are covered by the existing code?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To be quite honest—I'm not 100% sure if we will need another special case, when the left and right sides are correlated, like is the case here:

SELECT query,
      (WITH t AS MATERIALIZED (SELECT query)
       SELECT *
       FROM (SELECT query) AS _(x)
       WHERE x IN (SELECT * FROM t)) AS broken
FROM   (VALUES (('cat')), (('dog')), (('duck'))) AS queries(query);

However, the query does return the correct result 🤔 So, we may be fine, I think?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, that's good enough for me, for now. v1.4 is LTS, so if we run into another issue we can fix it, and we can reference this discussion.

@lnkuiper lnkuiper merged commit 8a2fc1d into duckdb:v1.4-andium Dec 4, 2025
102 checks passed
github-actions bot pushed a commit to duckdb/duckdb-r that referenced this pull request Dec 4, 2025
Fix mark join decorrelation (duckdb/duckdb#20033)
Enable windows_arm64 arch for main extensions (duckdb/duckdb#20004)
github-actions bot added a commit to duckdb/duckdb-r that referenced this pull request Dec 4, 2025
Fix mark join decorrelation (duckdb/duckdb#20033)
Enable windows_arm64 arch for main extensions (duckdb/duckdb#20004)

Co-authored-by: krlmlr <krlmlr@users.noreply.github.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Regression due to optimization in 1.4.0

3 participants