-
Notifications
You must be signed in to change notification settings - Fork 122
Closed
Description
One common pattern in our metrics is that we want to calculate the rate of a large number of counters and then aggregate those series, typically using a sum. A good example of this would be the total transmit bandwidth for all interfaces grouped by datacenter.
There may be hundreds of series here so we really need to push down the query to ClickHouse so doing workarounds with transforms would not really help. It is doable but
I've included below the InfluxDB and ClickHouse versions of the raw query. Having a macro would simplify the CH version considerably. I was thinking this could be a $perSecond* variant with 2 extra parameters the aggregation function (typically sum) and the columns to aggregate over.
InfluxDB:
SELECT sum(x) FROM (SELECT non_negative_derivative(max("PortXmitData_bits"), 1s) AS x FROM "netstats" WHERE ("datacenter" =~ /^$datacenter$/) AND env = 'prod' AND $timeFilter GROUP BY time($__interval),host,interface) GROUP BY time($__interval) fill(null)
ClickHouse:
SELECT
t,
datacenter,
groupArray((datacenter, max_0_Rate_aggregated)) AS groupArr
FROM
(
SELECT
t,
datacenter,
sum(max_0_Rate) AS max_0_Rate_aggregated
FROM (
SELECT
t,
datacenter,
perSecondColumns,
if(runningDifference(max_0) < 0, nan, runningDifference(max_0) / runningDifference(t / 1000)) AS max_0_Rate
FROM
(
SELECT
$timeSeries as t,
concat(host, interface) AS perSecondColumns,
max(PortXmitData_bits) AS max_0,
datacenter
FROM default.netstats
WHERE
$timeFilter
AND datacenter in ($datacenter) AND env = 'prod'
GROUP BY
t,
datacenter,
perSecondColumns
ORDER BY
perSecondColumns,
datacenter,
t
)
)
GROUP BY datacenter, t
ORDER BY datacenter, t
)
GROUP BY datacenter, t
ORDER BY datacenter, t
Metadata
Metadata
Assignees
Labels
p1Priority 1Priority 1