
Allow QueueSetting to set min and max number of consumers #2261

Open
rakyll opened this issue Dec 7, 2020 · 2 comments

rakyll (Contributor) commented Dec 7, 2020

In connection pooling, it's canonical practice to let users set a minimum and a maximum number of concurrent connections. That way users start with a small number of initial connections, provision new ones as load grows, and cap the total. Currently we only allow users to set a NumConsumers, which doesn't give much flexibility to fine-tune the consumers. By providing both a min and a max, users can start with minimal resources and increase the number of workers based on load. Depending on how the workers behave, this option could matter a lot: for example, if each worker needs to open a long-lived connection, min/max lets users avoid opening a large number of connections until load actually requires it.
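A minimal sketch of what such a setting could look like (the MinConsumers/MaxConsumers field names and the pool below are hypothetical, not an existing collector API): workers start at the minimum, and new ones are added, up to the maximum, only while the queue is backing up.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

// QueueSettings is a hypothetical extension of the current struct: instead of
// a single NumConsumers there is a lower and an upper bound on workers.
type QueueSettings struct {
	QueueSize    int // capacity of the bounded queue
	MinConsumers int // workers started up front
	MaxConsumers int // hard cap on workers spawned under load
}

type pool struct {
	set     QueueSettings
	queue   chan func()
	wg      sync.WaitGroup
	workers int32
}

func newPool(set QueueSettings) *pool {
	p := &pool{set: set, queue: make(chan func(), set.QueueSize)}
	for i := 0; i < set.MinConsumers; i++ {
		p.spawn()
	}
	return p
}

// spawn starts one consumer goroutine that drains the queue until it closes.
func (p *pool) spawn() {
	atomic.AddInt32(&p.workers, 1)
	p.wg.Add(1)
	go func() {
		defer p.wg.Done()
		for item := range p.queue {
			item()
		}
	}()
}

// Produce enqueues an item; if the queue is non-empty and we are still below
// MaxConsumers, it adds another worker to absorb the load.
func (p *pool) Produce(item func()) {
	if len(p.queue) > 0 && int(atomic.LoadInt32(&p.workers)) < p.set.MaxConsumers {
		p.spawn()
	}
	p.queue <- item
}

// Shutdown stops accepting work and waits for the consumers to drain the queue.
func (p *pool) Shutdown() {
	close(p.queue)
	p.wg.Wait()
}

func main() {
	p := newPool(QueueSettings{QueueSize: 100, MinConsumers: 2, MaxConsumers: 8})
	for i := 0; i < 10; i++ {
		i := i
		p.Produce(func() { fmt.Println("exported batch", i) })
	}
	p.Shutdown()
}
```

A real implementation would also need to retire idle workers back toward the minimum; this sketch only grows.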

Filing this as an issue so it is considered before the APIs are finalized.

cc @bogdandrutu @tigrannajaryan

bogdandrutu (Member) commented

@rakyll this is interesting, but we do things a bit differently here. We re-use the connection between workers in the case of OTLP, for example (gRPC does a better job when the connection is shared). We apply the same logic for HTTP as well, because the HTTP client handles opening, closing, and keeping connections alive.
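To illustrate the sharing described above, here is a self-contained sketch using a plain net/http client as a stand-in (this is not the collector's exporter code): many queue consumers issue requests through one shared client, so the transport's connection pool, not the number of workers, decides how many TCP connections are opened.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
	"sync"
)

func main() {
	// Throwaway backend so the example runs without any external endpoint.
	backend := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		w.WriteHeader(http.StatusOK)
	}))
	defer backend.Close()

	// One shared client for all workers. MaxConnsPerHost caps how many TCP
	// connections the transport will open to the backend, independently of
	// how many consumer goroutines are sending through it.
	client := &http.Client{Transport: &http.Transport{
		MaxConnsPerHost:     4,
		MaxIdleConnsPerHost: 4, // keep those connections alive for reuse
	}}

	var wg sync.WaitGroup
	for worker := 0; worker < 10; worker++ { // 10 consumers, at most 4 connections
		wg.Add(1)
		go func(id int) {
			defer wg.Done()
			resp, err := client.Get(backend.URL)
			if err != nil {
				fmt.Println("worker", id, "error:", err)
				return
			}
			resp.Body.Close()
			fmt.Println("worker", id, "status:", resp.StatusCode)
		}(worker)
	}
	wg.Wait()
}
```

gRPC behaves similarly: a single shared *grpc.ClientConn multiplexes the requests of all workers over its underlying connection(s).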

cforce (Contributor) commented Sep 15, 2023

Could you please direct me to the documentation that explains how to configure and optimize the settings for the number of connections, worker threads, and queue size (if applicable) for an OpenTelemetry collector, specifically also for HTTP receivers?

In our load testing environment, which is built on Azure Container Apps, we've identified an optimal point at 50 requests per second per node. This configuration utilizes neither significant CPU nor RAM resources. However, when we attempt to handle a higher volume of requests beyond this threshold, we observe an increase in error rates.

It appears that the OpenTelemetry collector's connection handling might be limited by default settings. We aim to make better use of the available CPU and RAM resources, potentially by choosing larger virtual machines, so we can efficiently manage our expected workloads without the need to scale to hundreds of nodes.

Furthermore, when an even larger number of clients opens more connections at the same overall throughput (which we currently simulate with a small number of load-generating nodes), the pressure to handle more requests per collector pod only increases.

Could you kindly provide guidance or point us towards the relevant documentation to adjust and fine-tune these settings for optimal performance in our OpenTelemetry collector setup for HTTP and gRPC receivers?
