-
Notifications
You must be signed in to change notification settings - Fork 41.6k
[FG:InPlacePodVerticalScaling] Add extended resources to ContainerStatuses[i].Resources #124227
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[FG:InPlacePodVerticalScaling] Add extended resources to ContainerStatuses[i].Resources #124227
Conversation
|
Please note that we're already in Test Freeze for the Fast forwards are scheduled to happen every 6 hours, whereas the most recent run was: Mon Apr 8 01:42:04 UTC 2024. |
|
/test pull-kubernetes-e2e-inplace-pod-resize-containerd-main-v2 |
|
/cc @vinaykul |
|
/priority important-longterm |
|
/triage accepted |
|
Changelog suggestion -Extended resources to be reflected as part of the pod.Status.ContainerStatuses[i].Resources struct.
+Added status for extended Pod resources within the `status.containerStatuses[].resources` field. |
1 similar comment
This comment was marked as duplicate.
This comment was marked as duplicate.
Done! |
|
@iholder101 Thanks for the PR! 😄 If it is not too much to ask, can you please extend the E2E tests test/e2e/node/pod_resize.go:doPodResizeTests() by adding a test case that requests and verifies extended resources? Eyeballing the code, verifyPodAllocations() and verifyPodStatusResources() should already check for it but please confirm that it indeed verifies extended resources as well. I think the nodes need to be patched for extended resources .. a possible example for reference: kubernetes/test/e2e/scheduling/preemption.go Lines 139 to 144 in 9791f0d
|
|
Also adding a few more sharp eyes for this PR. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A few minor comments..
pkg/kubelet/kubelet_pods.go
Outdated
| } | ||
|
|
||
| convertCustomResources := func(inResources, outResources v1.ResourceList) { | ||
| for extendedResourceName, extendedResourceQuantity := range inResources { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
resourceName, resourceQuantity sounds more appropriate instead of extendedResourceName, extendedResourceQuantity
| extendedResourceName == v1.ResourceStorage || extendedResourceName == v1.ResourceEphemeralStorage { | ||
| continue | ||
| } | ||
|
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: remove this blank line please
| }, | ||
| }, | ||
| }, | ||
| } { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Perhaps a test case with only ExendedResources?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Isn't "BestEffort QoSPod with extended resources" covering that already? Can you please elaborate?
|
Who is the intended consumer for these resources? I think the status is supposed to reflect the actual state of resources in-use, so it seems like if we're including extended resources there should be some sort of confirmation from device drivers that the resources are in fact in use. Also, AFAICT this isn't including extended resources in the allocated resources. Should it? |
@tallclair AFAIK extended resources are not managed by the CRI. It is used for scheduling and kubelet is admission gatekeeper. If a pod requesting extended resources is admitted and successfully started, it can be assumed that the pod's containers got the extended resources they requested. And since it is immutable (at this time), reflecting it back in status just like ephemeral-storage would be accurate. The AllocatedResources part is a bit non-obvious. It gets checkpointed upon admission and resize: kubernetes/pkg/kubelet/kubelet.go Lines 2576 to 2592 in 4bb4345
And it is retrieved from checkpoint during APIStatus generation: kubernetes/pkg/kubelet/kubelet_pods.go Lines 2074 to 2083 in 4bb4345
I suppose you can see the fragility of this code! Some added context: node-local checkpointing was much debated topic during design discussions.. Dawn and I eventually capitulated in order to break the impasse and make progress. But at that time we didn’t have KEP https://github.com/kubernetes/enhancements/tree/master/keps/sig-architecture/2527-clarify-status-observations-vs-rbac and the precedent PR #119665. I feel we can drop all of the checkpointing code and rely on pod.Status but the APF aspect needs a closer look. |
d8383be to
d8950a7
Compare
Signed-off-by: Itamar Holder <iholder@redhat.com>
Signed-off-by: Itamar Holder <iholder@redhat.com>
Signed-off-by: Itamar Holder <iholder@redhat.com>
Signed-off-by: Itamar Holder <iholder@redhat.com>
Signed-off-by: Itamar Holder <iholder@redhat.com>
Signed-off-by: Itamar Holder <iholder@redhat.com>
Signed-off-by: Itamar Holder <iholder@redhat.com>
Signed-off-by: Itamar Holder <iholder@redhat.com>
Signed-off-by: Itamar Holder <iholder@redhat.com>
d8950a7 to
85068bc
Compare
|
@tallclair @vinaykul @SergeyKanzhelev Sorry for the delay. I was out of office for a while. |
|
/lgtm |
|
LGTM label has been added. Git tree hash: 61ff3111e0919afd45d849861589f775340c8f6f
|
|
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: iholder101, tallclair, vinaykul The full list of commands accepted by this bot can be found here. The pull request process is described here
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
What type of PR is this?
/kind feature
/sig node
What this PR does / why we need it:
In-place pod resize feature adds Resources field to ContainerStatuses struct. This is used to report CPU and memory requests & limits configured for the containers.
After this PR, extended resources would also be reflected as part of
ContainerStatuses[i].Resources.Sample output:
Add a dummy extended resource to the node:
Add a pod using this resource:
Create the pod and check its
ContainerStatuses[i].Resources:Which issue(s) this PR fixes:
Fixes #114159
Special notes for your reviewer:
Does this PR introduce a user-facing change?
Additional documentation e.g., KEPs (Kubernetes Enhancement Proposals), usage docs, etc.: