-
Notifications
You must be signed in to change notification settings - Fork 1.1k
OCPBUGS-61220: internal/cgmgr: fix CPU weight setting for cgroup v2 #9456
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
@sohankunkerkar: This pull request references Jira Issue OCPBUGS-61220, which is invalid:
Comment The bug has been updated to refer to the pull request using the external bug tracker. In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository. |
Skipping CI for Draft Pull Request. |
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: sohankunkerkar The full list of commands accepted by this bot can be found here. The pull request process is described here
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
001f5e1
to
b926c0d
Compare
Codecov Report❌ Patch coverage is Additional details and impacted files@@ Coverage Diff @@
## main #9456 +/- ##
==========================================
+ Coverage 66.96% 67.05% +0.09%
==========================================
Files 202 202
Lines 27997 28016 +19
==========================================
+ Hits 18747 18786 +39
+ Misses 7673 7656 -17
+ Partials 1577 1574 -3 🚀 New features to boost your workflow:
|
@sohankunkerkar: This pull request references Jira Issue OCPBUGS-61220. The bug has been updated to no longer refer to the pull request using the external bug tracker. All external bug links have been closed. The bug has been moved to the NEW state. In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository. |
@sohankunkerkar: This pull request references Jira Issue OCPBUGS-61220, which is invalid:
Comment The bug has been updated to refer to the pull request using the external bug tracker. In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository. |
b926c0d
to
2537bdc
Compare
Name: "CPUShares", | ||
Value: dbus.MakeVariant(resources.CPU.Shares), | ||
}) | ||
if node.CgroupIsV2() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm curious why the IPPR test fails when conmon cgroup doesn't have cpu weight set. it's checking the pod cgroup weight and that's the culmination of its child cgroups?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the pod-level cgroup aggregates CPU weights from all its children, including conmon. If conmon’s cgroup has no cpu.weight, it defaults to 100, which can make the pod-level weight inconsistent. This will avoid that issue.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looking at the log, it checked /sys/fs/cgroup/cpu.weight
in the target container (c1-init) and it was empty.
Is it a pod level cgroup? It looks like container-level.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yeah sounds like container level. @sohankunkerkar this fix did fix the test for you though?
2537bdc
to
9ad7250
Compare
2bf48c2
to
ed7dc8d
Compare
24334cc
to
aba57f1
Compare
eb7712a
to
c424158
Compare
This resolves Pod InPlace Resize Container test failure where cpu.weight file was empty for Guaranteed QoS pods with restartable init containers. Sometimes, systemd fails to convert CPUShares → CPUWeight for this specific scenario, likely due to how it handles complex pod hierarchies with concurrent containers, but the exact internal mechanism is unclear. This acts more as a safeguard in case systemd doesn’t set the value. Signed-off-by: Sohan Kunkerkar <sohank2602@gmail.com>
c424158
to
2172b77
Compare
It resolves Pod InPlace Resize Container test failure where cpu.weight file was empty for Guaranteed QoS pods with restartable init containers. Sometimes, systemd fails to convert CPUShares → CPUWeight for this specific scenario, likely due to how it handles complex pod hierarchies with concurrent containers, but the exact internal mechanism is unclear. This acts more as a safeguard in case systemd doesn’t set the value.
What type of PR is this?
/kind bug
/kind flake
What this PR does / why we need it:
Which issue(s) this PR fixes:
Fixes https://issues.redhat.com/browse/OCPBUGS-61220
Special notes for your reviewer:
Does this PR introduce a user-facing change?