Filtering with by ignores the group by condition #1680

@samklonaris

Creating a Blaze expression that filters a table on the result of by drops the GROUP BY clause from the generated SQL.

Example:

create table orders (
    id integer primary key,
    product text,
    timestamp time
);

>>> from blaze import data, by, compute
>>> url = "postgresql:///testdb"
>>> db = data(url)
>>> orders = db['orders']
>>> pairs = by(orders['product'], max_timestamp=orders['timestamp'].max())

# Computing pairs produces the following SQL:
SELECT orders.product, max(orders.timestamp) as max_timestamp
FROM orders
GROUP BY orders.dataset limit 10001

>>> expression = orders[
...     (orders['product'] == pairs['product'])
...     & (orders['timestamp'] == pairs['max_timestamp'])
... ]

# Computing expression produces the following SQL:
SELECT orders.id, orders.product, orders.timestamp FROM orders
WHERE orders.product =  orders.product
AND  orders.timestamp = max(orders.timestamp) limit 10001

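^ The GROUP BY clause is missing from the query above: max(orders.timestamp) appears in WHERE with no grouping, and pairs['product'] has collapsed to orders.product, so the first condition compares the column with itself.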
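
For comparison, here is one reading of the query the filter was meant to produce: the grouped aggregate stays in a subquery with its GROUP BY, and orders is joined against it. This is only a sketch of the intended result, not output from Blaze. It assumes an in-memory SQLite database in place of the PostgreSQL one, made-up sample rows, and timestamps stored as zero-padded text.

```python
import sqlite3

# In-memory SQLite stands in for the PostgreSQL database from the report.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (id INTEGER PRIMARY KEY, product TEXT, timestamp TEXT)"
)

# Made-up sample rows, for illustration only.
conn.executemany(
    "INSERT INTO orders (id, product, timestamp) VALUES (?, ?, ?)",
    [
        (1, "apple", "09:00:00"),
        (2, "apple", "11:30:00"),
        (3, "pear", "08:15:00"),
        (4, "pear", "07:45:00"),
    ],
)

# The aggregate keeps its GROUP BY inside a subquery (aliased "pairs" to
# mirror the Blaze expression), and the outer query filters orders by it.
EXPECTED_SQL = """
SELECT orders.id, orders.product, orders.timestamp
FROM orders
JOIN (
    SELECT product, max(timestamp) AS max_timestamp
    FROM orders
    GROUP BY product
) AS pairs
  ON orders.product = pairs.product
 AND orders.timestamp = pairs.max_timestamp
ORDER BY orders.id
"""

latest_orders = conn.execute(EXPECTED_SQL).fetchall()
print(latest_orders)
```

With the sample rows this keeps only the most recent order per product, which is what the filter on pairs appears to be aiming for.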
