Proposal
I'd like to add an extension to the Prometheus engine Go interface that can stream out a result instead of accumulating all series/samples into one promql.Result's giant parser.Value as a matrix.
This would probably use a Go iterator, though chan would be an option too.
Rationale
This improvement would be useful for the existing /api/v1/query and /api/v1/query_range. Prometheus could discard the result series and samples as soon as the JSON serialisation of each one was built, so it doesn't have to accumulate both the whole result vector and the whole JSON serialisation in memory at once. This can somewhat lower peak memory use for high-cardinality queries, reducing OOM hazards and slow-downs due to forced aggressive Go GC runs when approaching GOMEMLIMIT.
The main motivation is to use this in conjunction with the proposed Streaming Result for Query/QueryRange HTTP API #10040 to allow the client to receive the result without Prometheus having to store the entire final result at all. This can dramatically lower peak memory use for high-cardinality queries and/or range queries with many steps. Even where the query is nontrivial, so the executor constructs large interim vectors/matrices during its execution, the final result still doesn't have to fit in memory at the same time as the working memory needed to create it.
Limitations
To be clear, none of this fully solves Prometheus's issues with spiky peak memory requirements when serving high-cardinality queries. But it'd be a significant step towards that, and would help unlock further improvements, because subsequent work on streaming in the executor would have more direct and immediate benefits.
Implementation
promql.Query is used as an interface by other Prometheus engine implementations like https://github.com/thanos-io/promql-engine. Go interfaces do not support default implementations for methods, so adding a method to an existing interface breaks every external implementation until it is updated. It's therefore generally not reasonable to add methods to an existing interface.
Instead, Go's structural ("duck") typing should be used to define a superset interface like promql.StreamingQuery that adds the new method, say ExecStreaming.
The PromQL engine implementation of promql.Query.Exec could then be implemented in terms of the promql.StreamingQuery's ExecStreaming. The underlying promql.Engine's exec(...) method and execEvalStmt(...) would be changed to yield results via an iterator, not append them to a slice that is accumulated then returned.
Concerns
Some testing should be done to determine whether there is a meaningful CPU overhead for yielding results through an iterator (or channel, if a channel is selected for the API) instead of appending to a slice. It is expected that this cost will be negligible, and probably more than offset by savings in slice reallocation etc., but this should be tested at least to some extent.
Related
- Streaming Result for Query/QueryRange HTTP API #10040
- Remote read adding external labels might lead to unsorted response #12605
- sidecar: Greatly increased Thanos sidecar memory usage from 0.32.2 to 0.32.3, still exists in 0.35.0 thanos-io/thanos#7395
- Add sidecar flag bypass Prometheus response buffering and re-sorting thanos-io/thanos#8487