Max memory hardcoded for Trace Matching Feature #9707

@marciorgb

Description

In what situation are you experiencing subpar performance?

Hardcoded memory limit in the Trace Matching feature

How to reproduce

  1. Run high-volume queries (~140 million traces per 5 minutes) using Trace Matching with Indirect Dependent

Your Environment

  • Linux
  • Mac
  • Windows

Additional context

CHI Configs:

    # Threading - optimize for ingestion
    default/max_threads: 120 # https://clickhouse.com/docs/operations/settings/settings#max_threads - def auto
    default/max_insert_threads: 80 # https://clickhouse.com/docs/operations/settings/settings#max_insert_threads - def 0
    default/background_pool_size: 90 # https://clickhouse.com/docs/operations/server-configuration-parameters/settings#background_pool_size - def 16 # t7 - setting is not propagated to user scope, so we set it here / t8 80 -> 90 to force test

    # Block size settings (USER-LEVEL) - correct here!
    default/max_block_size: 1048576 # https://clickhouse.com/docs/operations/settings/settings#max_block_size - def 65409 # 1M rows
    default/max_insert_block_size: 134217728 # https://clickhouse.com/docs/operations/settings/settings#max_insert_block_size - def 1048449 # t13 - Note: need to reduce insert fragmentation, 8M -> 32M / t19 32 -> 128
    default/max_query_size: 8388608 # https://clickhouse.com/docs/operations/settings/settings#max_query_size - def 262144
    default/min_insert_block_size_rows: 1048449 # https://clickhouse.com/docs/operations/settings/settings#min_insert_block_size_rows - def 1048449
    default/min_insert_block_size_bytes: 1073741824 # https://clickhouse.com/docs/operations/settings/settings#min_insert_block_size_bytes - def 268402944 # t13 256MB -> 1GB
    default/distributed_directory_monitor_split_batch_on_failure: 1 # https://docs.altinity.com/releasenotes/altinity-stable-release-notes/20.8/ - def 0

    # Memory management (USER-LEVEL)
    default/max_memory_usage: 150000000000 # https://clickhouse.com/docs/operations/settings/settings#max_memory_usage - def 0 - per-query limit / t14 speed up merges - TODO: remove after stabilization / t41: 80Gi -> 150Gi
    default/max_memory_usage_for_user: 150000000000 # https://clickhouse.com/docs/operations/settings/settings#max_memory_usage_for_user - def 0 - per-user limit / t14 - TODO: remove after stabilization / 80Gi -> 150Gi

    # Cache settings (USER-LEVEL)
    default/use_uncompressed_cache: 1 # https://clickhouse.com/docs/operations/settings/settings#use_uncompressed_cache -def 0

    # External sort settings
    default/max_bytes_before_external_group_by: 200000000000 # https://clickhouse.com/docs/operations/settings/settings#max_bytes_before_external_group_by - def 0 - recommended at about half of available memory - enables spill to disk if exceeded / t40 30Gi -> 200GB
    default/max_bytes_before_external_sort: 100000000000 # https://clickhouse.com/docs/operations/settings/settings#max_bytes_before_external_sort - def 0 / t41: 32Gi -> 100GB

    # Async insert tuned for high load
    default/async_insert: 1 # https://clickhouse.com/docs/operations/settings/settings#async_insert - def 0
    default/async_insert_max_data_size: 209715200 # https://clickhouse.com/docs/operations/settings/settings#async_insert_max_data_size - def 10485760 - max buffer size before async flush / t13 100MB -> 200MB
    default/async_insert_busy_timeout_ms: 2000 # Ref: https://clickhouse.com/docs/optimize/asynchronous-inserts - t35
    default/async_insert_stale_timeout_ms: 30000 # Ref: https://clickhouse.com/docs/optimize/asynchronous-inserts - t29 10000 -> 30000 / t35
    default/wait_for_async_insert: 0 # Non-blocking - we need to watch retries / enable if we are losing data
    default/async_insert_deduplicate: 1 # t10 - previously disabled on suspicion it reduces size

    # Distributed settings (USER-LEVEL)
    default/insert_distributed_sync: 1 # https://clickhouse.com/docs/operations/settings/settings#insert_distributed_sync - def auto
    default/insert_quorum: 1 # https://clickhouse.com/docs/operations/settings/settings#insert_quorum - def auto
    default/insert_quorum_timeout: 60000 # https://clickhouse.com/docs/operations/settings/settings#insert_quorum_timeout - def 60000

    # t12 - Settings in user scope to gain priority
    default/distributed_directory_monitor_sleep_time_ms: 5 #
    default/distributed_directory_monitor_max_sleep_time_ms: 25 #
    default/distributed_directory_monitor_batch_inserts: 1 #

    # Experimental features (non-obsolete only)
    default/allow_nondeterministic_mutations: 1 # https://clickhouse.com/docs/operations/settings/settings#allow_nondeterministic_mutations - def 0
    default/allow_asynchronous_read_from_io_pool_for_merge_tree: 1 # ref: https://clickhouse.com/docs/knowledgebase/async_vs_optimize_read_in_order - def 0 - improves I/O performance
    default/allow_experimental_parallel_reading_from_replicas: 0 # ref: https://clickhouse.com/docs/operations/settings/settings#allow_experimental_parallel_reading_from_replicas - def 0 - beta; disabled after failure cases
    # default/cluster_for_parallel_replicas: "cluster" # https://clickhouse.com/docs/operations/settings/settings#cluster_for_parallel_replicas - def empty - set to the cluster name to enable parallel reading from replicas
    default/allow_experimental_window_functions: "1"
    default/allow_experimental_analyzer: 0 # It RESOLVES the CATS bug! DON'T REMOVE THIS!
    # default/enable_analyzer: 1 # TEST

    # Network timeouts
    default/tcp_keep_alive_timeout: 7200 # https://clickhouse.com/docs/operations/settings/settings#tcp_keep_alive_timeout - def 290

    signozprd/allow_experimental_s3queue: 1
    signozprd/allow_experimental_window_functions: "1"
    signozprd/allow_nondeterministic_mutations: "1"
    signozprd/allow_experimental_parallel_reading_from_replicas: "1"
    signozprd/allow_asynchronous_read_from_io_pool_for_merge_tree: 1
    signozprd/max_bytes_before_external_group_by: 4000000000
    signozprd/max_bytes_before_external_sort: 4000000000
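
For context, the `default/...` and `signozprd/...` keys above follow the Altinity clickhouse-operator convention of `<profile>/<setting>`; in a ClickHouseInstallation manifest they would typically sit under `spec.configuration.profiles`. A minimal sketch, with the resource name hypothetical and the values copied from the config above:

```yaml
# Sketch of a ClickHouseInstallation (Altinity clickhouse-operator).
# The metadata name is hypothetical; the profile keys mirror the
# "default/..." user-level settings pasted above.
apiVersion: "clickhouse.altinity.com/v1"
kind: "ClickHouseInstallation"
metadata:
  name: "signoz-clickhouse"   # hypothetical name
spec:
  configuration:
    profiles:
      # Settings under "default/" apply to the "default" user profile
      default/max_memory_usage: "150000000000"
      default/max_memory_usage_for_user: "150000000000"
      default/max_bytes_before_external_group_by: "200000000000"
      default/max_bytes_before_external_sort: "100000000000"
```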

CHI Resources:

  resources:
    requests:
      memory: "256Gi"
      cpu: "120"
    limits:
      memory: "256Gi"
      cpu: "120"
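
One way to confirm which limits are actually in effect at query time, as opposed to the value the Trace Matching code hardcodes per query, is to inspect `system.settings`; a sketch:

```sql
-- Effective user-level limits for the current session;
-- changed = 1 means the value differs from the ClickHouse default.
SELECT name, value, changed
FROM system.settings
WHERE name IN ('max_memory_usage', 'max_memory_usage_for_user',
               'max_bytes_before_external_group_by',
               'max_bytes_before_external_sort');
```

Note that if the application sends its own `max_memory_usage` with each query, that per-query value overrides the profile values above, which is why the hardcoded limit wins despite the CHI config.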

Hardcoded config
