move pod admission and resize logic into the allocation manager #131801
e4f98db to 2ddac8f
/assign @tallclair
Thanks for splitting this out. I didn't quite get through everything, will take another pass tomorrow.
8bd6cd3 to e59802a
/retest
e59802a to c4421fc
/retest
/triage accepted
lgtm, just a nit & question. Sorry for the review delay!
kubelet.admitHandlers.AddPodAdmitHandler(lifecycle.NewPredicateAdmitHandler(kubelet.getNodeAnyWay, lifecycle.NewAdmissionFailureHandlerStub(), kubelet.containerManager.UpdatePluginResources))
handlers = append(handlers, lifecycle.NewPredicateAdmitHandler(kubelet.getNodeAnyWay, lifecycle.NewAdmissionFailureHandlerStub(), kubelet.containerManager.UpdatePluginResources))

if !excludeAdmitHandlers {
why is this needed?
In its current form, TestHandlePluginResources clears all the existing admit handlers and adds a custom one:
kubernetes/pkg/kubelet/kubelet_test.go, lines 1099 to 1100 in 849a82b:
kl.admitHandlers = lifecycle.PodAdmitHandlers{}
kl.admitHandlers.AddPodAdmitHandler(lifecycle.NewPredicateAdmitHandler(kl.getNodeAnyWay, lifecycle.NewAdmissionFailureHandlerStub(), updatePluginResourcesFunc))
To preserve the original intent of the test, the test needs a mechanism to do the same thing; in this case instead of clearing the existing admit handlers I gave it a mechanism to create a kubelet that doesn't have any to begin with. Without this, some of the handlers added by default end up changing the admission results of this test.
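For readers following along, here is a minimal sketch of the idea described above: a test constructor that can skip the default admit handlers so the test installs only its own. Everything in it (`fakeKubelet`, `newFakeKubelet`, the `admitHandler` type, and the pod names) is hypothetical and heavily simplified; it is not the code from this PR, only an illustration of why an `excludeAdmitHandlers` flag keeps default handlers from changing the test's admission results.

```go
package main

import "fmt"

// admitHandler is a hypothetical stand-in for a pod admit handler:
// it returns true when the pod should be admitted.
type admitHandler func(podName string) bool

// fakeKubelet is a hypothetical, heavily simplified test kubelet.
type fakeKubelet struct {
	handlers []admitHandler
}

// newFakeKubelet builds the test kubelet. When excludeAdmitHandlers is true,
// no default handlers are registered, so the test starts from a clean slate.
func newFakeKubelet(excludeAdmitHandlers bool) *fakeKubelet {
	kl := &fakeKubelet{}
	if !excludeAdmitHandlers {
		// Assumed default handler: rejects one specific pod name.
		kl.handlers = append(kl.handlers, func(podName string) bool {
			return podName != "too-big"
		})
	}
	return kl
}

// admit runs every registered handler; any rejection rejects the pod.
func (kl *fakeKubelet) admit(podName string) bool {
	for _, h := range kl.handlers {
		if !h(podName) {
			return false
		}
	}
	return true
}

func main() {
	withDefaults := newFakeKubelet(false)
	bare := newFakeKubelet(true)

	// The test under discussion would add only its own handler to `bare`.
	bare.handlers = append(bare.handlers, func(podName string) bool { return true })

	fmt.Println("with defaults:", withDefaults.admit("too-big"))
	fmt.Println("without defaults:", bare.admit("too-big"))
}
```

With the default handler present, the pod is rejected before the custom handler gets a say; with the flag set, only the handler the test installs decides the outcome.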
pkg/kubelet/kubelet.go
Outdated
if !kl.podWorkers.IsPodTerminationRequested(pod.UID) && !podutil.IsPodPhaseTerminal(pod.Status.Phase) {
	// We failed pods that we rejected, so allocatedPods include all admitted
	// pods that are alive.
	allocatedPods := kl.getAllocatedPods()
I just realized: I think there's a race condition here if a resize happens between this point and the call to allocationManager.AddPod.
Allocation manager is already the source of truth for the allocations though, so I think the fix should be to simply pass the active pods to allocationManager.AddPod, and handle the conversion to the allocated resources within the allocation manager while holding the lock.
fixed, thanks for the catch
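As an illustration of the suggestion in this thread, the sketch below has the caller pass the active pods to `AddPod`, and the manager converts them to allocated resources while holding its own lock, so a concurrent resize cannot slip in between the snapshot and the admission decision. The types, fields, and signatures here (`pod`, `allocationManager`, `AddPod`, `Resize`, the CPU-only accounting) are assumptions made for the example and do not reflect the real kubelet API or the actual fix.

```go
package main

import (
	"fmt"
	"sync"
)

// pod is a hypothetical stand-in for a pod spec with a single CPU request.
type pod struct {
	UID      string
	CPUMilli int64
}

// allocationManager is a hypothetical, simplified allocation manager.
// It is the source of truth for what each pod has been allocated.
type allocationManager struct {
	mu            sync.Mutex
	allocated     map[string]int64 // pod UID -> allocated CPU (millicores)
	capacityMilli int64
}

func newAllocationManager(capacityMilli int64) *allocationManager {
	return &allocationManager{
		allocated:     map[string]int64{},
		capacityMilli: capacityMilli,
	}
}

// AddPod admits newPod against the allocations of activePods. The conversion
// from pod specs to allocated resources happens under the lock, so it cannot
// interleave with Resize.
func (m *allocationManager) AddPod(activePods []pod, newPod pod) bool {
	m.mu.Lock()
	defer m.mu.Unlock()

	var used int64
	for _, p := range activePods {
		if a, ok := m.allocated[p.UID]; ok {
			used += a // prefer the recorded allocation over the spec
		} else {
			used += p.CPUMilli
		}
	}
	if used+newPod.CPUMilli > m.capacityMilli {
		return false
	}
	m.allocated[newPod.UID] = newPod.CPUMilli
	return true
}

// Resize updates an existing allocation under the same lock.
// Capacity checks are omitted to keep the sketch short.
func (m *allocationManager) Resize(uid string, cpuMilli int64) {
	m.mu.Lock()
	defer m.mu.Unlock()
	if _, ok := m.allocated[uid]; ok {
		m.allocated[uid] = cpuMilli
	}
}

func main() {
	m := newAllocationManager(1000)
	a := pod{UID: "a", CPUMilli: 400}

	fmt.Println("admit a:", m.AddPod(nil, a))
	m.Resize("a", 800) // the caller's copy of a still says 400

	// AddPod reads the current allocation for a, not the caller's stale spec.
	fmt.Println("admit b:", m.AddPod([]pod{a}, pod{UID: "b", CPUMilli: 400}))
}
```

Had the caller computed the allocated pods itself before calling `AddPod`, the resize of `a` could land between the two steps and the second admission would be decided on stale numbers.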
@natasha41575: The following tests failed, say
/lgtm
The mutex changes in the add pod flow make me a little nervous, but I don't see any other issues. Let's proceed, so we can unblock the follow-up PRs.
LGTM label has been added. Git tree hash: 00d31506ef8c0d5a411f1b713541a6c0c58d3205
[APPROVALNOTIFIER] This PR is APPROVED
This pull-request has been approved by: natasha41575, tallclair
What type of PR is this?
/kind cleanup
What this PR does / why we need it:
Move pod admission and resize logic into the allocation manager. This is broken out of this discussion: #131612 (comment). My goal in this one was to change as little business logic as possible, just trying to untangle some dependencies in preparation for #131612.
Does this PR introduce a user-facing change?