Tags: rerun-io/rerun
Add support for server side filtering of DataFusion DataFrames (#12147)

This PR adds support for server-side (push-down) filtering of DataFrames from a dataset query. Specifically, it adds support for filtering on `rerun_segment_id` and on the time index.

With this PR, users can write compound filters such as the following:

```python
from datafusion import col

# `view` is a dataset view obtained elsewhere.
df = view.reader(index="log_tick").filter(
    (col("rerun_segment_id") == "recording_0") & (col("log_tick") == 123456)
)
```

The filter is converted into an update to the dataset query, or into multiple queries when needed. The results of all the queries are combined to find the unique set of required chunks, which is then sent to the appropriate DataFusion partitions so they can request those chunks from the server.

This PR adds support for boolean logic (AND/OR) to combine the segment and time parameters. It also supports `functions.in_list` and `Expr.between`. If the filters include an unsupported expression, we fall back to querying all data.

---------

Co-authored-by: Antoine Beyeler <49431240+abey79@users.noreply.github.com>
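To make the translation from a filter expression to dataset queries more concrete, here is a minimal sketch in plain Python. It is not the implementation from this PR. The node classes (`Eq`, `InList`, `Between`, `And`, `Or`), the `Query` record, and the `plan` function are all hypothetical names invented for illustration, and the time index is assumed to be an integer. The sketch only models the behavior described above: OR produces several queries, AND narrows them, and any unsupported expression causes a fallback to querying all data.

```python
from __future__ import annotations

from dataclasses import dataclass
from itertools import product
from typing import Optional

SEGMENT_COL = "rerun_segment_id"


# Hypothetical expression nodes for illustration; not DataFusion or Rerun types.
@dataclass(frozen=True)
class Eq:
    column: str
    value: object


@dataclass(frozen=True)
class InList:
    column: str
    values: tuple


@dataclass(frozen=True)
class Between:
    column: str
    low: int
    high: int


@dataclass(frozen=True)
class And:
    left: object
    right: object


@dataclass(frozen=True)
class Or:
    left: object
    right: object


@dataclass(frozen=True)
class Query:
    """One server-side query; None in a field means 'no restriction'."""

    segments: Optional[frozenset] = None
    time_range: Optional[tuple] = None  # inclusive (low, high)


def _intersect(a: Query, b: Query) -> Optional[Query]:
    """Combine two queries with AND; return None if nothing can match."""
    if a.segments is None or b.segments is None:
        segments = a.segments if b.segments is None else b.segments
    else:
        segments = a.segments & b.segments
        if not segments:
            return None
    if a.time_range is None or b.time_range is None:
        time_range = a.time_range if b.time_range is None else b.time_range
    else:
        low = max(a.time_range[0], b.time_range[0])
        high = min(a.time_range[1], b.time_range[1])
        if low > high:
            return None
        time_range = (low, high)
    return Query(segments, time_range)


def plan(expr, index: str) -> Optional[list]:
    """Return the queries to issue, or None to fall back to querying all data."""
    if isinstance(expr, Eq):
        return plan(InList(expr.column, (expr.value,)), index)
    if isinstance(expr, InList):
        if expr.column == SEGMENT_COL:
            return [Query(segments=frozenset(expr.values))]
        if expr.column == index:
            return [Query(time_range=(v, v)) for v in expr.values]
        return None
    if isinstance(expr, Between):
        if expr.column == index:
            return [Query(time_range=(expr.low, expr.high))]
        return None
    if isinstance(expr, (And, Or)):
        left, right = plan(expr.left, index), plan(expr.right, index)
        if left is None or right is None:
            return None  # unsupported sub-expression: query everything
        if isinstance(expr, Or):
            return left + right
        pairs = (_intersect(a, b) for a, b in product(left, right))
        return [q for q in pairs if q is not None]
    return None  # any other node is treated as unsupported
```

In this sketch, the filter from the example above corresponds to `plan(And(Eq("rerun_segment_id", "recording_0"), Eq("log_tick", 123456)), "log_tick")`, which yields a single query restricted to one segment and one time value. Deduplicating the chunks returned by multiple queries is left out.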