@PSeitz-dd PSeitz-dd commented Dec 4, 2025

In this refactoring, a collector knows which bucket of the parent its data belongs to. This lets us replace the previous approach of one collector per bucket with one collector per request.
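To illustrate the idea, here is a minimal sketch of a single collector that keeps state for every parent bucket in one `Vec` indexed by bucket id. The names `AvgCollector`, `AvgState`, `collect`, and the `value_of` closure are assumptions made for this example, not the types or signatures used in the PR.

```rust
// Hypothetical types for illustration only; not the actual API of this PR.
type BucketId = u32;
type DocId = u32;

/// Running sum and count for one parent bucket.
#[derive(Default, Clone, Copy, Debug, PartialEq)]
struct AvgState {
    sum: f64,
    count: u64,
}

/// One collector per request: the state of every parent bucket lives in a
/// single Vec indexed by BucketId, instead of one collector per bucket.
#[derive(Default)]
struct AvgCollector {
    per_bucket: Vec<AvgState>,
}

impl AvgCollector {
    /// Collect a block of docs that all belong to `bucket`.
    /// `value_of` stands in for a column lookup (an assumption of this sketch).
    fn collect(&mut self, bucket: BucketId, docs: &[DocId], value_of: impl Fn(DocId) -> f64) {
        let idx = bucket as usize;
        // Grow the state vector on demand when a new bucket id shows up.
        if idx >= self.per_bucket.len() {
            self.per_bucket.resize(idx + 1, AvgState::default());
        }
        let state = &mut self.per_bucket[idx];
        for &doc in docs {
            state.sum += value_of(doc);
            state.count += 1;
        }
    }

    /// Average for `bucket`, or None if the bucket never received a doc.
    fn average(&self, bucket: BucketId) -> Option<f64> {
        let state = self.per_bucket.get(bucket as usize)?;
        (state.count > 0).then(|| state.sum / state.count as f64)
    }
}
```

Because there is only one instance holding all per-bucket state, nothing in this sketch needs to be cloned per bucket.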

Add PagedTermMap as another TermAggregationMap to reduce memory usage compared to a HashMap

It contains an optimization for low-cardinality bucket ids
Remove Clone on the collector (we only have one instance now)
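The description does not spell out how PagedTermMap is laid out internally. As a generic illustration of why a paged structure can need less memory than a hash map (no stored keys, pages allocated only where terms occur), here is a sketch that maps dense term ordinals to counts. `PagedCounts`, `PAGE_SIZE`, and all method names are hypothetical and are not taken from the PR.

```rust
// Hypothetical page size, chosen only for this example.
const PAGE_SIZE: usize = 1024;

/// Maps dense term ordinals to counts. Pages are allocated lazily, so
/// ranges of ordinals that never occur cost only one empty slot each.
#[derive(Default)]
struct PagedCounts {
    pages: Vec<Option<Box<[u32; PAGE_SIZE]>>>,
}

impl PagedCounts {
    /// Increment the count of `term_ord`, allocating its page if needed.
    fn increment(&mut self, term_ord: u64) {
        let page_idx = (term_ord as usize) / PAGE_SIZE;
        let slot = (term_ord as usize) % PAGE_SIZE;
        if page_idx >= self.pages.len() {
            self.pages.resize_with(page_idx + 1, || None);
        }
        let page = self.pages[page_idx].get_or_insert_with(|| Box::new([0u32; PAGE_SIZE]));
        page[slot] += 1;
    }

    /// Count for `term_ord`; 0 if its page was never allocated.
    fn get(&self, term_ord: u64) -> u32 {
        let page_idx = (term_ord as usize) / PAGE_SIZE;
        let slot = (term_ord as usize) % PAGE_SIZE;
        match self.pages.get(page_idx) {
            Some(Some(page)) => page[slot],
            _ => 0,
        }
    }

    /// Number of pages that actually hold data.
    fn allocated_pages(&self) -> usize {
        self.pages.iter().filter(|p| p.is_some()).count()
    }
}
```

The key is implied by the position inside a page, so no key or hash is stored per entry; the trade-off is that sparse ordinals waste the unused slots of each allocated page.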

Future Work

  • Fetch all values for all buckets once per collector (currently each collect call fetches its own data per bucket)
  • Improve the performance of grouping by bucket id in the caching layer
  • Improve low-cardinality detection
  • Remove PerRequestAggSegCtx; we can now store everything in the collector

Performance

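The heaviest benchmarks use drastically less memory and CPU (for example, terms_all_unique_with_avg_sub_agg: -39.11% memory, -74.43% average time).
Term aggs with many terms use a lot less memory (for example, terms_150_000: -55.96%).
We use some additional buffers to pass docs, which increases memory consumption for some aggs.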

The biggest regression is
terms_zipf_1000_with_avg_sub_agg Avg: 9.1190ms (+39.79%)
which should be fixed once we fetch all values for all buckets at once.

full
average_u64                                    Memory: 21.7 KB (-2.30%)      Avg: 2.9480ms (-3.64%)      Median: 2.9303ms (-3.69%)      [2.8115ms .. 3.1867ms]       
average_f64                                    Memory: 21.7 KB (-2.30%)      Avg: 3.1238ms (-3.17%)      Median: 3.1146ms (-2.94%)      [3.0220ms .. 3.3337ms]       
average_f64_u64                                Memory: 23.0 KB (-0.48%)      Avg: 5.7650ms (-1.43%)      Median: 5.7655ms (-0.97%)      [5.6324ms .. 6.0755ms]       
stats_f64                                      Memory: 21.8 KB (-2.29%)      Avg: 3.1220ms (-2.72%)      Median: 3.1131ms (-2.49%)      [3.0185ms .. 3.3488ms]       
extendedstats_f64                              Memory: 23.0 KB (+3.22%)      Avg: 3.2849ms (-1.79%)      Median: 3.2591ms (-1.38%)      [3.1558ms .. 3.5704ms]       
percentiles_f64                                Memory: 39.4 KB (+37.37%)     Avg: 6.9987ms (-3.80%)      Median: 6.9900ms (-2.82%)      [6.9396ms .. 7.1972ms]       
terms_7                                        Memory: 36.7 KB (+3.00%)      Avg: 2.3431ms (+0.96%)      Median: 2.3425ms (+1.63%)      [2.2808ms .. 2.4372ms]       
terms_all_unique                               Memory: 14.7 MB (-50.24%)     Avg: 6.9163ms (-58.72%)     Median: 6.8885ms (-57.92%)     [6.7612ms .. 7.3554ms]       
terms_150_000                                  Memory: 3.0 MB (-55.96%)      Avg: 6.2613ms (-37.68%)     Median: 6.2406ms (-35.99%)     [6.1016ms .. 6.4347ms]       
terms_many_top_1000                            Memory: 5.2 MB (-33.38%)      Avg: 9.3138ms (-29.50%)     Median: 9.2938ms (-27.81%)     [9.1127ms .. 9.8866ms]       
terms_many_order_by_term                       Memory: 3.0 MB (-55.96%)      Avg: 4.9930ms (-58.18%)     Median: 4.9816ms (-57.86%)     [4.8966ms .. 5.2057ms]       
terms_many_with_top_hits                       Memory: 50.0 MB (-11.50%)     Avg: 96.4471ms (-40.65%)    Median: 94.6858ms (-41.67%)    [91.5090ms .. 113.9275ms]    
terms_all_unique_with_avg_sub_agg              Memory: 56.7 MB (-39.11%)     Avg: 17.5857ms (-74.43%)    Median: 17.4918ms (-74.10%)    [17.0980ms .. 19.1589ms]     
terms_many_with_avg_sub_agg                    Memory: 13.5 MB (-34.50%)     Avg: 15.2693ms (-45.67%)    Median: 15.1739ms (-44.65%)    [14.9133ms .. 17.1725ms]     
terms_status_with_avg_sub_agg                  Memory: 101.3 KB (+65.18%)    Avg: 6.2418ms (+13.02%)     Median: 6.2257ms (+13.12%)     [6.1074ms .. 6.4710ms]       
terms_status_with_histogram                    Memory: 137.1 KB (+28.02%)    Avg: 6.1062ms (+12.70%)     Median: 6.0941ms (+13.20%)     [6.0223ms .. 6.3640ms]       
terms_zipf_1000                                Memory: 69.2 KB (-12.86%)     Avg: 2.2407ms (+5.06%)      Median: 2.2369ms (+5.27%)      [2.2020ms .. 2.3248ms]       
terms_zipf_1000_with_histogram                 Memory: 1.2 MB (+20.39%)      Avg: 24.0328ms (+5.64%)     Median: 23.9883ms (+5.50%)     [23.8058ms .. 25.1477ms]     
terms_zipf_1000_with_avg_sub_agg               Memory: 463.4 KB (+27.36%)    Avg: 9.1190ms (+39.79%)     Median: 9.0712ms (+41.72%)     [8.8888ms .. 9.8575ms]       
terms_many_json_mixed_type_with_avg_sub_agg    Memory: 20.6 MB (-20.72%)     Avg: 25.2443ms (-41.91%)    Median: 25.1315ms (-41.44%)    [24.6717ms .. 26.7138ms]     
cardinality_agg                                Memory: 3.7 MB (-0.01%)       Avg: 30.6559ms (+3.37%)     Median: 29.7482ms (+0.62%)     [28.5122ms .. 35.9346ms]     
terms_status_with_cardinality_agg              Memory: 5.5 MB (+0.78%)       Avg: 74.4157ms (+4.42%)     Median: 72.5500ms (+2.05%)     [69.0952ms .. 87.0695ms]     
range_agg                                      Memory: 25.2 KB (-5.33%)      Avg: 3.4807ms (+7.94%)      Median: 3.3977ms (+5.44%)      [3.2741ms .. 4.0961ms]       
range_agg_with_avg_sub_agg                     Memory: 95.5 KB (+81.82%)     Avg: 7.2579ms (+3.35%)      Median: 7.1267ms (+1.81%)      [6.9252ms .. 8.1604ms]       
range_agg_with_term_agg_status                 Memory: 109.4 KB (+62.59%)    Avg: 6.7002ms (-69.18%)     Median: 6.5399ms (-69.70%)     [6.3694ms .. 7.7095ms]       
range_agg_with_term_agg_many                   Memory: 6.9 MB (+0.32%)       Avg: 14.4014ms (-53.85%)    Median: 14.3027ms (-54.12%)    [13.6193ms .. 15.9501ms]     
histogram                                      Memory: 22.6 KB (+0.67%)      Avg: 3.1483ms (-2.77%)      Median: 3.1182ms (-2.91%)      [3.0445ms .. 3.3583ms]       
histogram_hard_bounds                          Memory: 20.6 KB (+3.46%)      Avg: 1.6982ms (+3.78%)      Median: 1.6709ms (+2.33%)      [1.6003ms .. 1.9916ms]       
histogram_with_avg_sub_agg                     Memory: 122.7 KB (+68.32%)    Avg: 9.6292ms (+5.66%)      Median: 9.5913ms (+6.64%)      [9.3812ms .. 9.9690ms]       
histogram_with_term_agg_status                 Memory: 492.5 KB (+4.78%)     Avg: 12.9491ms (-34.46%)    Median: 12.8700ms (-34.61%)    [12.6297ms .. 13.8100ms]     
avg_and_range_with_avg_sub_agg                 Memory: 82.5 KB (+96.81%)     Avg: 10.1590ms (+3.49%)     Median: 10.0878ms (+3.37%)     [9.9766ms .. 10.6046ms]      
filter_agg_all_query_count_agg                 Memory: 139.3 KB (+21.71%)    Avg: 4.5550ms (+16.74%)     Median: 4.5375ms (+16.39%)     [4.4388ms .. 4.8239ms]       
filter_agg_term_query_count_agg                Memory: 140.0 KB (+21.85%)    Avg: 7.0402ms (+9.54%)      Median: 7.0113ms (+12.72%)     [6.9111ms .. 7.3698ms]       
filter_agg_all_query_with_sub_aggs             Memory: 157.1 KB (+19.28%)    Avg: 9.6685ms (+7.96%)      Median: 9.6228ms (+8.09%)      [9.4179ms .. 10.1717ms]      
filter_agg_term_query_with_sub_aggs            Memory: 157.5 KB (+19.23%)    Avg: 12.1100ms (+5.88%)     Median: 12.0301ms (+5.82%)     [11.8841ms .. 12.4835ms]     

@PSeitz-dd PSeitz-dd force-pushed the bucket_id_agg branch 2 times, most recently from a16d0ff to d97bda9 Compare December 5, 2025 06:53
@PSeitz PSeitz force-pushed the bucket_id_agg branch 4 times, most recently from 59e9f5a to 856e5d5 Compare December 8, 2025 00:43
PSeitz and others added 5 commits December 8, 2025 10:20
In this refactoring a collector knows in which bucket of the parent
their data is in. This allows to convert the previous approach of one
collector per bucket to one collector per request.

low card bucket optimization
use paged term map in term agg
use special no sub agg term map impl
@PSeitz PSeitz force-pushed the bucket_id_agg branch 2 times, most recently from 3356fc8 to d76b315 Compare December 8, 2025 02:34
@PSeitz PSeitz force-pushed the bucket_id_agg branch 2 times, most recently from 1f717da to 29add85 Compare December 8, 2025 02:44
remove clone
move data in term req, single doc opt for stats
Comment on lines 25 to 34
/// Only used when LOWCARD is true.
/// Cache doc ids per bucket for sub-aggregations.
///
/// The outer Vec is indexed by BucketId.
per_bucket_docs: Vec<Vec<DocId>>,
/// Only used when LOWCARD is false.
/// For higher cardinalities we use a partitioned approach to store
///
/// partitioned Vec<(BucketId, DocId)> pairs to improve grouping locality.
partitions: [PartitionEntry; NUM_PARTITIONS],
Collaborator

Why use a boolean for this? I don't understand.

Contributor Author

What boolean? Do you mean array?

It's done as a cheap inexact group_by on bucket_id
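A rough sketch of what such a cheap, inexact group-by could look like follows. Pairs are routed to a partition by bucket id, so docs of the same bucket stay close together, but a partition can still interleave several buckets. The value of `NUM_PARTITIONS`, the modulo routing, the `pairs` field, and the `flush` method are assumptions for illustration; only the names `PartitionEntry`, `BucketId`, `DocId`, and `partitions` appear in the reviewed snippet.

```rust
type BucketId = u32;
type DocId = u32;

// Hypothetical partition count, chosen only for this example.
const NUM_PARTITIONS: usize = 4;

#[derive(Default)]
struct PartitionEntry {
    pairs: Vec<(BucketId, DocId)>,
}

struct PartitionedDocs {
    partitions: [PartitionEntry; NUM_PARTITIONS],
}

impl PartitionedDocs {
    fn new() -> Self {
        Self {
            partitions: std::array::from_fn(|_| PartitionEntry::default()),
        }
    }

    /// Route the pair to a partition derived from the bucket id (assumed: modulo).
    fn push(&mut self, bucket: BucketId, doc: DocId) {
        let p = bucket as usize % NUM_PARTITIONS;
        self.partitions[p].pairs.push((bucket, doc));
    }

    /// Drain all partitions and hand out runs of consecutive pairs that share
    /// a bucket id. The grouping is inexact: a bucket may be reported more
    /// than once if other buckets were interleaved in its partition.
    fn flush(&mut self, mut f: impl FnMut(BucketId, &[DocId])) {
        let mut run: Vec<DocId> = Vec::new();
        for partition in &mut self.partitions {
            let mut current: Option<BucketId> = None;
            for (bucket, doc) in partition.pairs.drain(..) {
                if current != Some(bucket) {
                    if let Some(prev) = current {
                        f(prev, &run);
                    }
                    run.clear();
                    current = Some(bucket);
                }
                run.push(doc);
            }
            if let Some(prev) = current {
                f(prev, &run);
                run.clear();
            }
        }
    }
}
```

With few distinct bucket ids, the per-bucket `Vec<Vec<DocId>>` from the snippet gives exact grouping directly. A partitioned layout like this one avoids allocating one `Vec` per bucket when there are many buckets.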
