Skip to content

Conversation

@Mytherin
Copy link
Collaborator

This PR reworks how we name subqueries without aliases. Previously we used the table index we got from the binder, e.g.:

D select unnamed_subquery1.* FROM (SELECT 42 a, 43 b);
┌───────┬───────┐
│   a   │   b   │
│ int32 │ int32 │
├───────┼───────┤
│    4243 │
└───────┴───────┘

The problem is that this numbering is unstable and unpredictable. From the perspective of the user the name seems completely arbitrary, e.g.:

-- now it is 7
D select unnamed_subquery7.* FROM (FROM (SELECT 42 a, 43 b) t);
┌───────┬───────┐
│   a   │   b   │
│ int32 │ int32 │
├───────┼───────┤
│    4243 │
└───────┴───────┘
-- now it is 8
D select unnamed_subquery8.* FROM (SELECT 84) t, (SELECT 42 a, 43 b);
┌───────┬───────┐
│   a   │   b   │
│ int32 │ int32 │
├───────┼───────┤
│    4243 │
└───────┴───────┘

This causes problems also because queries are no longer portable, e.g. this query works:

select unnamed_subquery1.* FROM (SELECT 42 a, 43 b);
┌───────┬───────┐
│   a   │   b   │
│ int32 │ int32 │
├───────┼───────┤
│    4243 │
└───────┴───────┘

But this fails:

D select (select unnamed_subquery1.* FROM (SELECT 42 a, 43 b));
-- Error: Binder Error: Referenced table "unnamed_subquery1" not found!
-- Candidate tables: "unnamed_subquery8"

This PR reworks the naming of unnamed subqueries so that they are predictable - they are named starting at unnamed_subquery (without a number) - and then numbers are added as more unnamed subqueries are found within the same level of the FROM clause. For example:

-- standard case, single unnamed subquery, the table is called "unnamed_subquery"
D select unnamed_subquery.* FROM (SELECT 42 a, 43 b);
┌───────┬───────┐
│   a   │   b   │
│ int32 │ int32 │
├───────┼───────┤
│    4243 │
└───────┴───────┘
-- two unnamed subqueries, the second one is called "unnamed_subquery2"
D select unnamed_subquery.a, unnamed_subquery2.b FROM (SELECT 42 a), (SELECT 43 b);
┌───────┬───────┐
│   a   │   b   │
│ int32 │ int32 │
├───────┼───────┤
│    4243 │
└───────┴───────┘
-- unnamed subqueries/subqueries WITHIN do not affect naming at the current level
D select unnamed_subquery.a, unnamed_subquery2.b FROM (FROM (SELECT 42 a) t), (SELECT 43 b);
┌───────┬───────┐
│   a   │   b   │
│ int32 │ int32 │
├───────┼───────┤
│    4243 │
└───────┴───────┘
-- similarly names are stable and queries are portable
D select (select unnamed_subquery.a from (select 42 a, 43 b)) AS res;
┌───────┐
│  res  │
│ int32 │
├───────┤
│    42 │
└───────┘

This also fixes an issue with using unnamed subqueries in the UNPIVOT, found here

@Mytherin Mytherin merged commit 8ee70c5 into duckdb:main Feb 20, 2024
github-actions bot pushed a commit to duckdb/duckdb-r that referenced this pull request Mar 15, 2024
Merge pull request duckdb/duckdb#10765 from Mytherin/unpivoterrormessage
@Mytherin Mytherin deleted the unpivoterrormessage branch July 5, 2024 11:30
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant