Skip to content

Conversation

saschagrunert
Copy link
Member

What type of PR is this?

/kind ci

What this PR does / why we need it:

Add a memory comparison test between conmon and conmon-rs.

Which issue(s) this PR fixes:

None

Special notes for your reviewer:

None

Does this PR introduce a user-facing change?

None

@saschagrunert saschagrunert requested a review from mrunalp as a code owner August 7, 2025 08:28
@openshift-ci openshift-ci bot added do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. release-note-none Denotes a PR that doesn't merit a release note. dco-signoff: yes Indicates the PR's author has DCO signed all their commits. kind/ci Categorizes issue or PR as related to CI labels Aug 7, 2025
@openshift-ci openshift-ci bot requested review from bitoku and hasan4791 August 7, 2025 08:28
Copy link
Contributor

openshift-ci bot commented Aug 7, 2025

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: saschagrunert

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-ci openshift-ci bot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Aug 7, 2025
@saschagrunert saschagrunert force-pushed the conmon-vs-conmon-rs branch 2 times, most recently from 0a606e6 to 6f1e8d5 Compare August 7, 2025 09:00
Copy link

codecov bot commented Aug 7, 2025

Codecov Report

❌ Patch coverage is 22.72727% with 17 lines in your changes missing coverage. Please review.
✅ Project coverage is 52.21%. Comparing base (e8f7645) to head (90f2aeb).
⚠️ Report is 79 commits behind head on main.

Additional details and impacted files
@@             Coverage Diff             @@
##             main    #9388       +/-   ##
===========================================
- Coverage   67.22%   52.21%   -15.02%     
===========================================
  Files         201      202        +1     
  Lines       27718    27781       +63     
===========================================
- Hits        18634    14506     -4128     
- Misses       7533    11789     +4256     
+ Partials     1551     1486       -65     
🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

@saschagrunert
Copy link
Member Author

saschagrunert commented Aug 7, 2025

The numbers don't look to good for conmon-rs:

  • compare conmon vs conmonrs using a single container without exec
    conmon: 217088 byte, conmonrs: 368640 byte (diff: +151552 byte)

  • compare conmon vs conmonrs using five containers in a pod without exec
    conmon: 1089536 byte, conmonrs: 1130496 byte (diff: +40960 byte)

  • compare conmon vs conmonrs using a single container with exec
    conmon: 217088 byte, conmonrs: 1306624 byte (diff: +1089536 byte)

  • compare conmon vs conmonrs using five containers in a pod with exec
    conmon: 1044480 byte, conmonrs: 3710976 byte (diff: +2666496 byte)

@saschagrunert
Copy link
Member Author

saschagrunert commented Aug 7, 2025

Looks like the whole tracing stack is consuming a bunch of memory in conmon-rs:

screenshot

@bitoku
Copy link
Contributor

bitoku commented Aug 7, 2025

@saschagrunert Can we turn off the tracing feature for the fare comparison?

@saschagrunert
Copy link
Member Author

@saschagrunert Can we turn off the tracing feature for the fare comparison?

We can make it compile optional, yes.

@saschagrunert
Copy link
Member Author

@saschagrunert Can we turn off the tracing feature for the fare comparison?

We can make it compile optional, yes.

Actually the effort is pretty high for this since the logging is wired into tracing. We already disable tracing by default, but I guess the logging creates vectors all over the place. I propose to disable logging for now, which should impact the memory usage: containers/conmon-rs#2661

@haircommander
Copy link
Member

note: to have a fully fair comparison we'll need to take cri-o's memory into account too because conmon-rs does some stuff for cri-o that conmon doesn't. that may make the picture look better for conmon-rs. but I agree, let's drop logging for now to see if that makes the picture better

@haircommander
Copy link
Member

maybe if the test moves cri-o instance to a cgroup the tests creates, and then has the conmon cgroup be that cgroup as well we'd have a picture of what each are in conjunction with cri-o

@saschagrunert
Copy link
Member Author

Numbers are way better now. We cal also further optimize with containers/conmon-rs#2674

@saschagrunert saschagrunert force-pushed the conmon-vs-conmon-rs branch 2 times, most recently from f8d4402 to 067548d Compare August 13, 2025 05:57
@saschagrunert
Copy link
Member Author

saschagrunert commented Aug 13, 2025

coming closer:

ok 1 compare conmon vs conmonrs using a single container without exec:
  conmon: 274432 byte, conmonrs: 282624 byte (diff: 8 kb)

ok 2 compare conmon vs conmonrs using a single container with exec
  conmon: 274432 byte, conmonrs: 917504 byte (diff: 628 kb)

ok 3 compare conmon vs conmonrs using five containers in a pod without exec
  conmon: 1257472 byte, conmonrs: 1175552 byte (diff: -80 kb)

ok 4 compare conmon vs conmonrs using five containers in a pod with exec
  conmon: 1298432 byte, conmonrs: 3018752 byte (diff: 1680 kb)

@saschagrunert saschagrunert force-pushed the conmon-vs-conmon-rs branch 3 times, most recently from 5adead9 to cd4f4b9 Compare August 13, 2025 12:02
Reuse `monitor_env` to pass down the options to the pod runtime (conmon-rs).

Signed-off-by: Sascha Grunert <sgrunert@redhat.com>
@saschagrunert saschagrunert force-pushed the conmon-vs-conmon-rs branch 4 times, most recently from 8d1d8bc to 4b06084 Compare August 15, 2025 05:51
@saschagrunert
Copy link
Member Author

saschagrunert commented Aug 15, 2025

A new set of test results:

Test results using 1 containers and 0 execs per container:
conmon:			248kb	conmonrs:			272kb	(diff: 24kb)
CRI-O (conmon):	2660kb	CRI-O (conmonrs):	3852kb	(diff: 1192kb)
Both (conmon):	2908kb	Both (conmonrs):	4124kb	(diff: 1216kb)

Test results using 1 containers and 50 execs per container:
conmon:			236kb	conmonrs:			884kb	(diff: 648kb)
CRI-O (conmon):	6444kb	CRI-O (conmonrs):	8048kb	(diff: 1604kb)
Both (conmon):	6680kb	Both (conmonrs):	8932kb	(diff: 2252kb)

Test results using 5 containers and 0 execs per container:
conmon:			1220kb	conmonrs:			1124kb	(diff: -96kb)
CRI-O (conmon):	5720kb	CRI-O (conmonrs):	5708kb	(diff: -12kb)
Both (conmon):	6940kb	Both (conmonrs):	6832kb	(diff: -108kb)

Test results using 5 containers and 50 execs per container:
conmon:			1228kb	conmonrs:			2928kb	(diff: 1700kb)
CRI-O (conmon):	11788kb	CRI-O (conmonrs):	11288kb	(diff: -500kb)
Both (conmon):	13016kb	Both (conmonrs):	14216kb	(diff: 1200kb)

Test results using 50 containers and 0 execs per container:
conmon:			12448kb	conmonrs:			9704kb	(diff: -2744kb)
CRI-O (conmon):	19584kb	CRI-O (conmonrs):	17520kb	(diff: -2064kb)
Both (conmon):	32032kb	Both (conmonrs):	27224kb	(diff: -4808kb)

Test results using 50 containers and 50 execs per container:
conmon:			12192kb	conmonrs:			24144kb	(diff: 11952kb)
CRI-O (conmon):	35112kb	CRI-O (conmonrs):	30220kb	(diff: -4892kb)
Both (conmon):	47304kb	Both (conmonrs):	54364kb	(diff: 7060kb)

@saschagrunert saschagrunert force-pushed the conmon-vs-conmon-rs branch 2 times, most recently from d264be0 to 15b22f9 Compare August 18, 2025 11:55
Signed-off-by: Sascha Grunert <sgrunert@redhat.com>
Copy link
Contributor

openshift-ci bot commented Aug 18, 2025

@saschagrunert: The following tests failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

Test name Commit Details Required Rerun command
ci/prow/ci-fedora-integration 90f2aeb link true /test ci-fedora-integration
ci/prow/ci-cgroupv2-integration 90f2aeb link true /test ci-cgroupv2-integration
ci/prow/ci-fedora-kata 90f2aeb link true /test ci-fedora-kata
ci/prow/e2e-aws-ovn 90f2aeb link true /test e2e-aws-ovn

Full PR test history. Your PR dashboard.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.

@openshift-merge-robot openshift-merge-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Aug 30, 2025
@openshift-merge-robot
Copy link
Contributor

PR needs rebase.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

Copy link

A friendly reminder that this PR had no activity for 30 days.

@github-actions github-actions bot added lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. and removed lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. labels Sep 30, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. dco-signoff: yes Indicates the PR's author has DCO signed all their commits. do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. kind/ci Categorizes issue or PR as related to CI needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. release-note-none Denotes a PR that doesn't merit a release note.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants