Add iterator-returning variant of promql.Query engine interface #17276

@ringerc

Proposal

I'd like to add an extension to the Prometheus engine Go interface that can stream out a result instead of accumulating all series and samples into a single promql.Result whose parser.Value holds the whole matrix.

This would probably use a Go iterator (the range-over-func iter.Seq types added in Go 1.23), though a channel would be an option too.

Rationale

This improvement would be useful for the existing /api/v1/query and /api/v1/query_range. Prometheus could discard each result series and its samples as soon as the JSON serialisation of that series was built, so it doesn't have to accumulate both the whole result vector and the whole JSON serialisation in memory at once. This can somewhat lower peak memory use for high-cardinality queries, reducing OOM hazards and slow-downs caused by forced, aggressive Go GC runs when memory use approaches GOMEMLIMIT.

The main motivation is to use this in conjunction with the proposed Streaming Result for Query/QueryRange HTTP API #10040 to allow the client to receive the result without Prometheus having to store the entire final result at all. This can dramatically lower peak memory use for high-cardinality queries and/or range queries with many steps. Even where the query is nontrivial, so the executor constructs large interim vectors/matrices during its execution, it still means the final result doesn't have to fit in memory at the same time as the working memory needed to create it.

Limitations

To be clear, none of this fully solves Prometheus's issues with spiky peak memory requirements when serving high cardinality queries. But it'd be a significant step towards that, and would help unlock further improvements as subsequent work on streaming in the executor would have more direct and immediate benefits.

Implementation

promql.Query is used as an interface by other Prometheus engine implementations like https://github.com/thanos-io/promql-engine. Go interfaces do not support default implementations for methods, so adding a method to promql.Query would break every external implementation until it was updated. It's therefore generally not reasonable to add methods to an existing interface.

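Instead, Go's implicit (structural) interface satisfaction should be used: define a superset interface like promql.StreamingQuery that includes everything in promql.Query and adds the new method, say ExecStreaming. Callers can then check for the new interface with a type assertion and fall back to Exec when an engine doesn't provide it.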

The PromQL engine implementation of promql.Query.Exec could then be implemented in terms of the promql.StreamingQuery's ExecStreaming. The underlying promql.Engine's exec(...) and execEvalStmt(...) methods would be changed to yield results via an iterator, rather than appending them to a slice that is accumulated and then returned.

Concerns

Some testing should be done to determine if there is a meaningful CPU overhead for yielding results through an iterator (or channel, if a channel is selected for the API) instead of appending to a slice. It is expected that this cost will be negligible, and probably more than offset by savings in slice reallocation etc, but this should be tested at least to some extent.
