Skip to content

new feature: EXPLAIN (format sql) ..query..#11

Open
peterboncz wants to merge 2 commits into
mainfrom
pb/explain-format-sql
Open

new feature: EXPLAIN (format sql) ..query..#11
peterboncz wants to merge 2 commits into
mainfrom
pb/explain-format-sql

Conversation

@peterboncz

@peterboncz peterboncz commented Jun 15, 2026

Copy link
Copy Markdown

Add EXPLAIN (FORMAT SQL)

Summary

Adds a new EXPLAIN format, SQL, that returns the query's optimized logical plan rendered back as equivalent CTE-based SQL (the LPTS output) — analogous to how Umbra can print a plan as SQL.

D EXPLAIN (FORMAT SQL) SELECT name FROM users WHERE age > 25;

┌─────────────────────────────┐
│┌───────────────────────────┐│
││  Optimized Logical Plan   ││
│└───────────────────────────┘│
└─────────────────────────────┘

WITH scan_0 (t0_name) AS (SELECT "name" FROM memory.main.users WHERE age>25),
projection_1 (t1_name) AS (SELECT t0_name FROM scan_0)
SELECT t1_name AS "name" FROM projection_1;

It behaves like a genuine EXPLAIN for every client: the CLI renders it borderless (raw multi-line text, no duckbox, no escaped \n), and JDBC/Python/ADBC receive the same two-column (explain_key, explain_value)
result they get from any EXPLAIN, with the SQL in explain_value.

Motivation

DuckDB supports EXPLAIN (FORMAT json|html|text|…) but has no sql format. The goal was for EXPLAIN (FORMAT SQL) Q to be treated identically to a native EXPLAIN by all clients and by the CLI renderer — not a
look-alike result set that the CLI boxes.

How it works

EXPLAIN (FORMAT SQL) Q is not a libpg syntax error — libpg parses it, then transform throws Invalid Input Error: "sql" is not a valid FORMAT argument. So the parse_function parser-extension flow never fires. Two
cooperating hooks handle it instead, with no changes to the duckdb/ submodule:

  1. Parser override (FALLBACK mode): matches the EXPLAIN ( FORMAT SQL ) token prefix (via Parser::Tokenize, so comments/whitespace are handled correctly) and returns a real ExplainStatement. Because the override
    has no ClientContext (and therefore can't run LPTS), it carries the inner query to the next stage via a sentinel string constant embedded in the explain's inner statement. Declines everything else, so all other
    queries — including normal EXPLAIN — are untouched.
  2. Optimizer extension (pre_optimize): when the plan is a LogicalExplain carrying the sentinel, it runs the existing LPTS pipeline (PlanQuery → LogicalPlanToAst → AstToCteList → ToQuery, identical to PRAGMA
    lpts) and replaces the plan with a LogicalColumnDataGet emitting a single ("logical_opt", ) row.

Since the statement type stays EXPLAIN_STATEMENT, the CLI's ModeExplainRenderer prints the value verbatim and clients see the standard EXPLAIN result shape. Conversion failures (catalog errors, unsupported
operators) propagate as normal query errors.

The override requires allow_parser_override_extension = FALLBACK, which the extension enables on load.

Testing

  • New test/sql/explain_format_sql.test: verifies the two-column result contract via the connection API, case-insensitive keywords / whitespace / comments, round-trip correctness (lpts_check), and no regression
    for native EXPLAIN, EXPLAIN (FORMAT JSON), unknown formats, and normal queries.
  • Full unit suite: 70 cases / 3078 assertions pass.
  • SQLStorm (--tpch_sf 0.001): 15781/17036 success, incorrect=0, lpts_error=0, crash=0 — no regression.

will now explain the query as a SQL stetement (similar to how this works in Umbra)
@peterboncz peterboncz requested a review from ila June 17, 2026 22:38
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant