Closed
Labels
- kind/feature: Categorizes issue or PR as related to a new feature.
- priority/important-longterm: Important over the long term, but may not be staffed and/or may need multiple releases to complete.
- sig/instrumentation: Categorizes an issue or PR as relevant to SIG Instrumentation.
- sig/node: Categorizes an issue or PR as relevant to SIG Node.
- triage/accepted: Indicates an issue or PR is ready to be actively worked on.
Description
We need to propose and instrument metrics for https://kep.k8s.io/1287. We get the "total resize request count" (apiserver_request_total{resource="pods",subresource="resize"}) for free through the API server, but some additional metrics might be useful, such as:
- resize requests at the pod level (taken as an aggregate across all containers), by whether the request is for cpu or memory, and by whether it is an increase or decrease
- the same as the previous one, but at the container level
- the latency between when a resize is marked as in progress and when it completes
- if a resize is infeasible, the reason
- the resize actuation error count or error rate (open question)
- after #131612 ([FG:InPlacePodVerticalScaling] Move resize allocation logic out of the sync loop) merges, how often a resize request is accepted through the periodic retry as opposed to being explicitly signaled (a high share of retry-accepted resizes would indicate that we missed a signal and that deferred resizes are waiting unnecessarily for the next retry)
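
To make the first two bullets concrete, here is a minimal sketch of how a resize could be classified into counter labels at both the container level and the pod level (aggregated across containers). Everything in it is an assumption for illustration: the `Resources` and `ResizeKey` types, the label names and values, and the in-memory map are hypothetical, and this is not the kubelet's code. A real implementation would presumably register a counter vector through `k8s.io/component-base/metrics` and read `resource.Quantity` values from the pod spec.

```go
package main

import "fmt"

// Resources is a hypothetical, simplified view of one container's resource
// requests; the real API stores resource.Quantity values in a ResourceList.
type Resources struct {
	CPUMilli    int64 // CPU request in millicores
	MemoryBytes int64 // memory request in bytes
}

// ResizeKey is a hypothetical label set for a resize counter.
type ResizeKey struct {
	Level     string // "pod" or "container"
	Resource  string // "cpu" or "memory"
	Direction string // "increase" or "decrease"
}

// direction returns the label value for a change, or "" when nothing changed.
func direction(before, after int64) string {
	switch {
	case after > before:
		return "increase"
	case after < before:
		return "decrease"
	default:
		return ""
	}
}

// record increments one counter for each resource whose value changed.
func record(counts map[ResizeKey]int, level string, before, after Resources) {
	if d := direction(before.CPUMilli, after.CPUMilli); d != "" {
		counts[ResizeKey{level, "cpu", d}]++
	}
	if d := direction(before.MemoryBytes, after.MemoryBytes); d != "" {
		counts[ResizeKey{level, "memory", d}]++
	}
}

// sum aggregates per-container resources into a pod-level total.
func sum(containers []Resources) Resources {
	var total Resources
	for _, c := range containers {
		total.CPUMilli += c.CPUMilli
		total.MemoryBytes += c.MemoryBytes
	}
	return total
}

// ObserveResize counts one resize at the container level and the pod level.
// before and after must list the same containers in the same order.
func ObserveResize(counts map[ResizeKey]int, before, after []Resources) {
	for i := range before {
		record(counts, "container", before[i], after[i])
	}
	record(counts, "pod", sum(before), sum(after))
}

func main() {
	counts := map[ResizeKey]int{}
	before := []Resources{
		{CPUMilli: 500, MemoryBytes: 1 << 30},
		{CPUMilli: 250, MemoryBytes: 1 << 29},
	}
	after := []Resources{
		{CPUMilli: 1000, MemoryBytes: 1 << 30}, // CPU increased
		{CPUMilli: 250, MemoryBytes: 1 << 28},  // memory decreased
	}
	ObserveResize(counts, before, after)
	for k, v := range counts {
		fmt.Println(k.Level, k.Resource, k.Direction, v)
	}
}
```

One thing the sketch surfaces: with pod-level aggregation, an increase in one container and an equal decrease in another cancel out, so the pod-level counter would not record a change even though two container-level resizes happened. Whether that is the desired semantics is a design question for the proposal.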
Metadata
Status
Done