Skip to content

Macro for aggregating per-second rates - need to simplified SQL query which allows get rates with dimensions but show only one time series with sum/avg #386

@oplehto

Description

@oplehto

One common pattern in our metrics is that we want to calculate the rate of a large number of counters and then aggregate those series, typically using a sum. A good example of this would be the total transmit bandwidth for all interfaces grouped by datacenter.

There may be hundreds of series here so we really need to push down the query to ClickHouse so doing workarounds with transforms would not really help. It is doable but

I've included below the InfluxDB and ClickHouse versions of the raw query. Having a macro would simplify the CH version considerably. I was thinking this could be a $perSecond* variant with 2 extra parameters the aggregation function (typically sum) and the columns to aggregate over.

InfluxDB:

SELECT sum(x) FROM (SELECT non_negative_derivative(max("PortXmitData_bits"), 1s) AS x FROM "netstats" WHERE ("datacenter" =~ /^$datacenter$/) AND env = 'prod' AND $timeFilter GROUP BY time($__interval),host,interface) GROUP BY time($__interval) fill(null)

ClickHouse:

SELECT
    t,
    datacenter,
    groupArray((datacenter, max_0_Rate_aggregated)) AS groupArr 
FROM
(
  SELECT 
    t,
    datacenter,
    sum(max_0_Rate) AS max_0_Rate_aggregated
    FROM (
    SELECT
        t,
        datacenter,
        perSecondColumns,
        if(runningDifference(max_0) < 0, nan, runningDifference(max_0) / runningDifference(t / 1000)) AS max_0_Rate
    FROM
(
        SELECT
            $timeSeries as t,
            concat(host, interface) AS perSecondColumns,
            max(PortXmitData_bits) AS max_0,
            datacenter
        FROM default.netstats

        WHERE
            $timeFilter
            AND datacenter in ($datacenter) AND env = 'prod'
        GROUP BY
            t,
            datacenter,
            perSecondColumns
        ORDER BY
            perSecondColumns,
            datacenter,
            t
)
)
GROUP BY datacenter, t
ORDER BY datacenter, t
)
GROUP BY datacenter, t
ORDER BY datacenter, t

Metadata

Metadata

Assignees

Labels

p1Priority 1

Type

No type

Projects

No projects

Milestone

Relationships

None yet

Development

No branches or pull requests

Issue actions