Skip to content

Support simple dynamic expressions on DOLT_DIFF #11187

@NoTuxNoBux

Description

@NoTuxNoBux

I would like to look at the commit history of two branches, and then see if the same airspace (table row) was modified in one of the unique commits on branch X and one of the unique commits on branch Y. The idea is to see if someone has been working on the same entity on these two branches. I wrote the following query for this:

SELECT id, entity_type

FROM (
    -- Modified on Y directly.
    SELECT DISTINCT(COALESCE(to_id, from_id)) AS id, 'airspace' AS entity_type, 'y' AS change_origin

    FROM DOLT_DIFF('INT...branch-x', 'airspaces')

    UNION

    -- Modified on X directly.
    SELECT DISTINCT(COALESCE(to_id, from_id)) AS id, 'airspace' AS entity_type, 'x' AS change_origin

    FROM DOLT_DIFF('branch-x...INT', 'airspaces')
) modified_entity

GROUP BY id, entity_type

HAVING COUNT(change_origin) > 1

This query succeeds as expected. I would now like to expand this query to run for a dynamic set of branches on just one side: branch-x will always be fixed, but I would like to query the dolt_branches table and select a few branches where I can then do this same comparison with respect to branch-x.

In SQL terms this could look like this:

SELECT
    branch.name,
    (
        SELECT COUNT(CONCAT(id, entity_type))
        
        FROM (
            -- Modified on Y directly.
            SELECT DISTINCT(COALESCE(to_id, from_id)) AS id, 'airspace' AS entity_type, 'y' AS change_origin
        
            FROM DOLT_DIFF(CONCAT(branch.name, '...branch-x'), 'airblocks')
        
            UNION
        
            -- Modified on X directly.
            SELECT DISTINCT(COALESCE(to_id, from_id)) AS id, 'airspace' AS entity_type, 'x' AS change_origin
        
            FROM DOLT_DIFF(CONCAT('branch-x...', branch.name), 'airblocks')
        ) modified_entity
        
        GROUP BY id, entity_type
        
        HAVING COUNT(change_origin) > 1
    ) AS warning_count

FROM dolt_branches branch

WHERE branch.name != 'branch-x'

However, this yields the following error:

Invalid argument to dolt_diff: concat(branch.name,'...branch-x') – only literal values supported 

I need a three-dot diff in this case, but using the simplified syntax DOLT_DIFF(branch.name, 'branch-x', 'airspaces') for the two-dot diff yields another error:

unable to find field with index 1 in row of 0 columns. This is a bug. Please file an issue here: https://github.com/dolthub/dolt/issues 

Is this something that can be supported, even if in a simple form? I've explored some other avenues:

  • Scan the dolt_diff_* tables instead, but those depend on the active checked out branch and won't let me easily identify common commits.
  • Fetch branches up front, loop over them, and perform multiple queries, each with fixed values. This will work, but I would later on like to add a filter on branches WHERE warning_count > 0 and paginate with LIMIT, which becomes somewhat painful as I will just need to keep querying until I meet the expected page sizes (as I don't know which branches will be filtered out or not) and providing a total estimate of all results becomes difficult.

Metadata

Metadata

Assignees

No one assigned

    Labels

    bugSomething isn't workingcustomer issueenhancementNew feature or requestsystem tablesIssues and feature requests related to system tables

    Type

    No type
    No fields configured for issues without a type.

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions