-
Notifications
You must be signed in to change notification settings - Fork 1.2k
Add hierarchical queue support for nodegroup plugin #4455
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add hierarchical queue support for nodegroup plugin #4455
Conversation
Co-authored-by: googs1025 <googs1025@gmail.com> Signed-off-by: JesseStutler <chenzicong4@huawei.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Summary of Changes
Hello @JesseStutler, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!
This pull request significantly enhances the nodegroup scheduler plugin by implementing comprehensive support for hierarchical queues. This feature enables child queues to automatically inherit node group affinity settings from their parent or nearest ancestor queues when they lack explicit configurations, streamlining the management of complex queue structures and ensuring efficient resource allocation based on inherited policies. The changes involve core logic updates, new configuration options, and improved testing utilities.
Highlights
- Hierarchical Queue Support: Introduced the capability for the
nodegroupplugin to support hierarchical queues, allowing child queues to inherit node group affinity configurations from their nearest ancestor if they do not specify their own. - Configuration Option: Added a new
enableHierarchyconfiguration option to thenodegroupplugin. When set totrue, this option activates the hierarchical queue functionality, enabling affinity inheritance. - Refactored Affinity Logic: The internal representation and processing of queue group affinities within the
nodegroupplugin have been significantly refactored. This includes new data structures (queueAttr,queueGroupAffinity) and helper functions (initQueueAttrs,buildAncestors,updateAffinityFromAncestor) to manage and apply hierarchical affinity rules. - Centralized Hierarchy Check: A new
HierarchyEnabledmethod was added to theframework.Sessionto centralize the check for whether a specific plugin has hierarchical queue support enabled. This promotes consistency and reusability across plugins. - Improved Error Handling: Dedicated error constants (
errNodeGroupAffinityNotFound,errNodeGroupLabelNotFound,errNodeUnsatisfied) and a helper function (newFitErr) were introduced in a newerrors.gofile within thenodegroupplugin. This provides more specific and consistent error messages for node group affinity failures. - Enhanced Testing Utilities: A new
QueueWrapperfluent builder was added topkg/scheduler/util/test_utils.go. This utility simplifies the creation and configuration ofschedulingv1beta1.Queueobjects in tests, improving test readability and maintainability, and enabling comprehensive testing of the new hierarchical features.
Using Gemini Code Assist
The full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips.
Invoking Gemini
You can request assistance from Gemini at any point in your pull request via creating an issue comment (i.e. comment on the pull request page) using either /gemini <command> or @gemini-code-assist <command>. Below is a summary of the supported commands.
| Feature | Command | Description |
|---|---|---|
| Code Review | /gemini review |
Performs a code review for the current pull request in its current state. |
| Pull Request Summary | /gemini summary |
Provides a summary of the current pull request in its current state. |
| Comment | @gemini-code-assist | Responds in comments when explicitly tagged, both in issue comments and review comments. |
| Help | /gemini help |
Displays a list of available commands. |
Customization
To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a .gemini/ folder in the base of the repository. Detailed instructions can be found here.
Limitations & Feedback
Gemini Code Assist is currently in preview and may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments to provide feedback.
You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension.
Footnotes
-
Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution. ↩
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Code Review
This pull request introduces hierarchical queue support for the nodegroup plugin. The changes are well-structured, but there are two main issues that should be addressed: a critical logic error in the predicate function that could lead to incorrect scheduling, and a high-severity issue regarding the lack of cycle detection in the recursive hierarchy-building logic, which could cause the scheduler to crash.
| func (np *nodeGroupPlugin) buildAncestors(ssn *framework.Session, attr *queueAttr) { | ||
| if attr.queueID == rootQueueID { | ||
| return | ||
| } | ||
|
|
||
| queueInfo := ssn.Queues[attr.queueID] | ||
| parentID := api.QueueID(rootQueueID) | ||
| if queueInfo.Queue.Spec.Parent != "" { | ||
| parentID = api.QueueID(queueInfo.Queue.Spec.Parent) | ||
| } | ||
|
|
||
| parentInfo := ssn.Queues[parentID] | ||
| if parentInfo == nil { | ||
| klog.Warningf("Parent queue %s not found for queue %s, unable to build hierarchy.", queueInfo.Queue.Spec.Parent, queueInfo.Name) | ||
| return | ||
| } | ||
|
|
||
| parentAttr, found := np.queueAttrs[parentInfo.UID] | ||
| if !found { | ||
| klog.Warningf("Parent queue attribute not found for %s, which should not happen.", parentInfo.Name) | ||
| return | ||
| } | ||
| if _, ok := task.Pod.Annotations[batch.QueueNameKey]; ok { | ||
| return task.Pod.Annotations[batch.QueueNameKey] | ||
|
|
||
| // Recursively build ancestors for the parent queue. | ||
| np.buildAncestors(ssn, parentAttr) | ||
|
|
||
| // A queue's ancestors are its parent plus all the parent's ancestors. | ||
| attr.ancestors.PushBack(parentAttr) | ||
| attr.ancestors.PushBackList(parentAttr.ancestors) | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The buildAncestors function uses recursion to build the queue hierarchy. If a cyclic dependency exists in the queue configuration (e.g., queue 'A' is the parent of 'B', and 'B' is the parent of 'A'), this will result in infinite recursion and a stack overflow, potentially crashing the scheduler. To prevent this, add cycle detection by tracking visited queues during recursion. This can be achieved by passing a map to track the queues in the current recursion path and checking for cycles before recursing.
func (np *nodeGroupPlugin) buildAncestors(ssn *framework.Session, attr *queueAttr, visited map[api.QueueID]bool) {
if visited == nil {
visited = make(map[api.QueueID]bool)
}
if _, ok := visited[attr.queueID]; ok {
klog.Warningf("Cycle detected in queue hierarchy at queue %s, skipping ancestor building.", attr.queueID)
return
}
visited[attr.queueID] = true
defer delete(visited, attr.queueID)
if attr.queueID == rootQueueID {
return
}
queueInfo := ssn.Queues[attr.queueID]
parentID := api.QueueID(rootQueueID)
if queueInfo.Queue.Spec.Parent != "" {
parentID = api.QueueID(queueInfo.Queue.Spec.Parent)
}
parentInfo := ssn.Queues[parentID]
if parentInfo == nil {
klog.Warningf("Parent queue %s not found for queue %s, unable to build hierarchy.", queueInfo.Queue.Spec.Parent, queueInfo.Name)
return
}
parentAttr, found := np.queueAttrs[parentInfo.UID]
if !found {
klog.Warningf("Parent queue attribute not found for %s, which should not happen.", parentInfo.Name)
return
}
// Recursively build ancestors for the parent queue.
np.buildAncestors(ssn, parentAttr, visited)
// A queue's ancestors are its parent plus all the parent's ancestors.
attr.ancestors.PushBack(parentAttr)
attr.ancestors.PushBackList(parentAttr.ancestors)
}There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Well, it's a good suggestion, although potentially users won't configure the cyclic dependency queues
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added
|
Besides, this PR should also be merged: volcano-sh/apis#186, otherwise the root queue will miss the affinity field even if I add it |
Signed-off-by: JesseStutler <chenzicong4@huawei.com>
318458a to
268703a
Compare
|
/cc @kingeasternsun |
| func GetPodQueue(task *api.TaskInfo) string { | ||
| if _, ok := task.Pod.Labels[batch.QueueNameKey]; ok { | ||
| return task.Pod.Labels[batch.QueueNameKey] | ||
| func (np *nodeGroupPlugin) buildAncestors(ssn *framework.Session, attr *queueAttr, visited map[api.QueueID]struct{}) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I’m wondering if it’s really necessary to maintain Ancestors eagerly. We could just perform a recursive lookup when needed, since the queue hierarchy is unlikely to be very deep.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Because when the Child queue has affinity , just use it's own affinity (no need inheritance).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reasonable.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
But it's more simple in this way, I think this can also be accepted.
| #### Inheritance Rules | ||
| 1. **Direct Inheritance**: A child queue without affinity configuration inherits from its immediate parent queue. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
“most recent ancestor” is better.
|
Will fire another pr to fix some minor nits, we can merge this first, as @JesseStutler is busy for other work. @kingeasternsun |
|
/approve |
|
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: Monokaix The full list of commands accepted by this bot can be found here. The pull request process is described here
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
|
/lgtm |
ok |
What type of PR is this?
/kind feature
What this PR does / why we need it:
Inspired by #4426, in the scenario of hierarchical queues, it is hoped that the following capabilities can be supported:
The parent queue is associated with the nodegroup, and the jobs submitted by the child queue are automatically associated with the nodegroup associated with the parent queue. Therefore we can implement it in this way: if a child queue has no affinity, it will inherit the affinity configuration from the nearest ancestor queue
Which issue(s) this PR fixes:
Fixes #4426
Special notes for your reviewer:
Validation:
We have such queues: [root, q1, q2, q2-sub]
The parent of q1 and q2 is the root queue. Neither q1 nor q2-sub specifies affinity. The affinity specified by the root queue is:
, and the affinity and antiaffinity of q2 are completely opposite to those of the root. You can see the printout when the nodegroup plugin OnSessionOpen is:

, q1 inherits the configuration of the root, while q2-sub inherits the configuration of q2
Besides, this PR should also be merged: volcano-sh/apis#186, otherwise the root queue will miss the affinity field even if I add it
Does this PR introduce a user-facing change?