Skip to content

Conversation

danwinship
Copy link
Contributor

The new iptables-and-nftables MetaHostPortManager code was falling victim to a classic golang gotcha. If newHostportManagerIPTables returned a nil value (of type *hostportManagerIPTables), then that value would get assigned to mh.managers[utilnet.IPv4].iptables, but that value is a HostPortManager, so the value would be typecast, becoming a non-nil HostPortManager value wrapping the nil *hostportManagerIPTables value, and then later, this code

if managers.iptables != nil {
err := managers.iptables.Remove(id, mappingsForFamily)
// Ignore iptables errors if we're primarily using nftables
if err != nil && managers.nftables == nil {
errstrings = append(errstrings, err.Error())
}
}

would see that it was non-nil and attempt to call a method on it, which would then SEGV because the underlying *hostportManagerIPTables was nil.

There are a few possible ways to fix this... this is one of them...

This wasn't caught by unit tests because the unit tests don't use NewMetaHostportManager (since it's not fully mock-able) and it wasn't caught by e2e tests because the e2e environment has both iptables and nftables installed.

I also pulled in a cri-o-ified version of kubernetes/kubernetes@b031258 to get rid of the superfluous error messages also seen in #9220.

Fixes #9220
/kind bug
/kind cleanup

Does this PR introduce a user-facing change?

Fixes a crash introduced in 1.33.0 when cleaning up a pod that uses HostPorts
on a system that has either just iptables (but not nftables) or just nftables
(but not iptables).

@danwinship danwinship requested a review from mrunalp as a code owner May 26, 2025 15:59
@openshift-ci openshift-ci bot added release-note Denotes a PR that will be considered when it comes time to generate release notes. dco-signoff: yes Indicates the PR's author has DCO signed all their commits. kind/bug Categorizes issue or PR as related to a bug. kind/cleanup Categorizes issue or PR as related to cleaning up code, process, or technical debt. labels May 26, 2025
Copy link
Contributor

openshift-ci bot commented May 26, 2025

Hi @danwinship. Thanks for your PR.

I'm waiting for a cri-o member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@openshift-ci openshift-ci bot added the needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. label May 26, 2025
@openshift-ci openshift-ci bot requested review from hasan4791 and klihub May 26, 2025 16:00
Copy link

codecov bot commented May 26, 2025

Codecov Report

Attention: Patch coverage is 78.94737% with 8 lines in your changes missing coverage. Please review.

Project coverage is 66.52%. Comparing base (4abe996) to head (b3f1394).
Report is 9 commits behind head on main.

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #9222      +/-   ##
==========================================
+ Coverage   66.50%   66.52%   +0.01%     
==========================================
  Files         198      198              
  Lines       27164    27176      +12     
==========================================
+ Hits        18065    18078      +13     
+ Misses       7612     7611       -1     
  Partials     1487     1487              
🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

@bitoku
Copy link
Contributor

bitoku commented May 26, 2025

/ok-to-test

@openshift-ci openshift-ci bot added ok-to-test Indicates a non-member PR verified by an org member that is safe to test. and removed needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. labels May 26, 2025
Comment on lines 239 to 246
if err != nil {
// The only likely error is "no such file or directory", in which case any
// further commands will fail the same way, so we don't need to do
// anything special here.
return runner
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I might be too sensitive, but I want to log something every time we get error, especially when it interacts with something not managed by us.
But debug log is enough, I think.

@bitoku
Copy link
Contributor

bitoku commented May 27, 2025

Thank you for the quick fix!

If `iptables --version` failed, iptables.New() would log a warning
and assume that the problem was that you had an implausibly ancient
version of iptables installed. Change it to instead assume that the
problem is that you don't have iptables installed at all (and only
log at debug).

Signed-off-by: Dan Winship <danwinship@redhat.com>
Add an internal constructor so that the unit tests share the same
logic as the real constructor, and fix it to handle nil sub-managers
correctly.

Signed-off-by: Dan Winship <danwinship@redhat.com>
@danwinship danwinship force-pushed the hostport-no-iptables branch from cbb5c36 to b3f1394 Compare May 27, 2025 11:55
@bitoku
Copy link
Contributor

bitoku commented May 30, 2025

/retest

Copy link
Contributor

openshift-ci bot commented May 30, 2025

@danwinship: The following test failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

Test name Commit Details Required Rerun command
ci/prow/ci-e2e-evented-pleg b3f1394 link false /test ci-e2e-evented-pleg

Full PR test history. Your PR dashboard.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.

@bitoku
Copy link
Contributor

bitoku commented May 30, 2025

/skip
/lgtm
@cri-o/cri-o-maintainers PTAL

@openshift-ci openshift-ci bot added the lgtm Indicates that a PR is ready to be merged. label May 30, 2025
@haircommander
Copy link
Member

/approve

thank you for the quick fix!

Copy link
Contributor

openshift-ci bot commented May 30, 2025

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: danwinship, haircommander

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-ci openshift-ci bot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label May 30, 2025
@openshift-merge-bot openshift-merge-bot bot merged commit b31fbf3 into cri-o:main May 30, 2025
70 of 89 checks passed
@danwinship danwinship deleted the hostport-no-iptables branch May 31, 2025 14:29
@haircommander
Copy link
Member

/cherry-pick release-1.33

@openshift-cherrypick-robot

@haircommander: new pull request created: #9241

In response to this:

/cherry-pick release-1.33

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. dco-signoff: yes Indicates the PR's author has DCO signed all their commits. kind/bug Categorizes issue or PR as related to a bug. kind/cleanup Categorizes issue or PR as related to cleaning up code, process, or technical debt. lgtm Indicates that a PR is ready to be merged. ok-to-test Indicates a non-member PR verified by an org member that is safe to test. release-note Denotes a PR that will be considered when it comes time to generate release notes.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

SIGSEGV when deleting pod with host port
4 participants