Skip to content

Conversation

sohankunkerkar
Copy link
Member

This change validates memory limits against current usage from cgroup files instead of stale cAdvisor cache.

What type of PR is this?

/kind bug

What this PR does / why we need it:

Which issue(s) this PR fixes:

Special notes for your reviewer:

Does this PR introduce a user-facing change?

server: add real-time memory validation for limit updates

@openshift-ci openshift-ci bot added release-note Denotes a PR that will be considered when it comes time to generate release notes. do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. labels Aug 6, 2025
Copy link
Contributor

openshift-ci bot commented Aug 6, 2025

Skipping CI for Draft Pull Request.
If you want CI signal for your change, please convert it to an actual PR.
You can still manually trigger a test run with /test all

@openshift-ci openshift-ci bot added kind/bug Categorizes issue or PR as related to a bug. dco-signoff: yes Indicates the PR's author has DCO signed all their commits. labels Aug 6, 2025
@openshift-merge-robot openshift-merge-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Aug 6, 2025
Copy link
Contributor

openshift-ci bot commented Aug 6, 2025

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: sohankunkerkar

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-ci openshift-ci bot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Aug 6, 2025
@sohankunkerkar sohankunkerkar force-pushed the fix-memory-limit-decrease branch from 2b64053 to e2db63a Compare August 6, 2025 04:17
@openshift-merge-robot openshift-merge-robot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Aug 6, 2025
@sohankunkerkar sohankunkerkar force-pushed the fix-memory-limit-decrease branch from e2db63a to 60c8932 Compare August 6, 2025 04:24
Copy link

codecov bot commented Aug 6, 2025

Codecov Report

❌ Patch coverage is 54.54545% with 20 lines in your changes missing coverage. Please review.
✅ Project coverage is 67.28%. Comparing base (06e7803) to head (a6e9311).
⚠️ Report is 15 commits behind head on main.

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #9385      +/-   ##
==========================================
- Coverage   67.31%   67.28%   -0.03%     
==========================================
  Files         201      202       +1     
  Lines       27720    27760      +40     
==========================================
+ Hits        18660    18679      +19     
- Misses       7515     7529      +14     
- Partials     1545     1552       +7     
🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

@sohankunkerkar sohankunkerkar force-pushed the fix-memory-limit-decrease branch 5 times, most recently from 91632fb to 87fed4e Compare August 6, 2025 13:47
@sohankunkerkar sohankunkerkar force-pushed the fix-memory-limit-decrease branch 2 times, most recently from 0a9a319 to 7f2d633 Compare August 6, 2025 15:37
@sohankunkerkar sohankunkerkar changed the title [WIP] server: add real-time memory validation for limit updates server: add real-time memory validation for limit updates Aug 6, 2025
@sohankunkerkar sohankunkerkar marked this pull request as ready for review August 6, 2025 16:39
@sohankunkerkar sohankunkerkar requested a review from mrunalp as a code owner August 6, 2025 16:39
@openshift-ci openshift-ci bot removed the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Aug 6, 2025
@openshift-ci openshift-ci bot requested review from klihub and QiWang19 August 6, 2025 16:40
@sohankunkerkar sohankunkerkar force-pushed the fix-memory-limit-decrease branch 3 times, most recently from f16a01d to 5a37807 Compare August 11, 2025 19:09
@sohankunkerkar sohankunkerkar force-pushed the fix-memory-limit-decrease branch from 5a37807 to ef34e5c Compare August 11, 2025 19:14
This change validates memory limits against current usage
from cgroup files instead of stale cAdvisor cache.

Signed-off-by: Sohan Kunkerkar <sohank2602@gmail.com>
@sohankunkerkar sohankunkerkar force-pushed the fix-memory-limit-decrease branch from ef34e5c to a6e9311 Compare August 11, 2025 19:56
@sohankunkerkar
Copy link
Member Author

/retest

Comment on lines +47 to +53
currentUsage := int64(usageBytes.GetValue())

// Check if new limit is below current usage.
if newMemoryLimit < currentUsage {
return fmt.Errorf("cannot decrease memory limit to %d bytes: current usage is %d bytes",
newMemoryLimit, currentUsage)
}
Copy link
Contributor

@bitoku bitoku Aug 12, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How should we handle if currentUsage is 0 (= unlimited?)?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IIRC, currentUsage = 0 doesn't mean unlimited, it means the container hasn't allocated any memory yet (or is using negligible memory). This is different from newMemoryLimit = 0 which means unlimited. When currentUsage = 0, it either means the container just started and hasn't allocated significant memory or the container is idle. I think we should allow setting any reasonable memory limit when the container isn't using memory yet.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ah sorry I got the actual usage and the current limit mixed up. thanks for clarification.

@bitoku
Copy link
Contributor

bitoku commented Aug 13, 2025

/lgtm

@openshift-ci openshift-ci bot added the lgtm Indicates that a PR is ready to be merged. label Aug 13, 2025
@openshift-merge-bot openshift-merge-bot bot merged commit 3d8e440 into cri-o:main Aug 13, 2025
72 of 74 checks passed
@sohankunkerkar
Copy link
Member Author

/cherry-pick release-1.33

@openshift-cherrypick-robot

@sohankunkerkar: new pull request created: #9404

In response to this:

/cherry-pick release-1.33

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. dco-signoff: yes Indicates the PR's author has DCO signed all their commits. kind/bug Categorizes issue or PR as related to a bug. lgtm Indicates that a PR is ready to be merged. release-note Denotes a PR that will be considered when it comes time to generate release notes.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants