Skip to content

Conversation

@umohnani8
Copy link
Member

Users can add default sysctls that they want all cri-o containers
to have in the crio.conf file now.
Adding net.ipv4.ping_group_range as a default so we can drop the
NET_RAW capability.

Remove sysctl parsing code from cri-o as we extract the sysctls from k8s directly.
Had a TODO from a while back to remove it, so removing.

Drop the NET_RAW capability from the list of cri-o defaults
as we are using the sysctl net.ipv4.ping_group_range instead
to allow users to do a ping.

Signed-off-by: umohnani8 umohnani@redhat.com

@k8s-ci-robot k8s-ci-robot added cncf-cla: yes size/L Denotes a PR that changes 100-499 lines, ignoring generated files. labels Aug 6, 2018
@umohnani8
Copy link
Member Author

@rhatdan PTAL. For now the default sysctl is net.ipv4.ping_group_range=0 0, what should I change the value of this sysctl to be?

@mrunalp
Copy link
Member

mrunalp commented Aug 6, 2018

@umohnani8 You will have to restrict it to /proc/1/gid_map values on the extreme else sysctl will fail.

lib/config.go Outdated

// DefaultSysctls for the default-sysctls option in the crio.conf file
var DefaultSysctls = []string{
"net.ipv4.ping_group_range=0 0",
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We cannot hard code this. For user namespaces, the gid_map could be different so will have to populate the range and validate it against user ns group mappings.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess we need to see if these are relative to the USERNS. IE Does this happen in runc after the USERNS is setup, then it should work.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it should happen from inside the container once the user namespace is configured. Some sysctl are not allowed from a namespace and the OCI runtime will fail to set these, but I don't think this PR has to worry of it.

Copy link
Member

@mrunalp mrunalp left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We need a test to check that we can ping without cap_net_raw.

@rhatdan
Copy link
Contributor

rhatdan commented Aug 7, 2018

Why do you think we need to check against GIDMAP? If the sysctl is executed within the userns which it must be then the only issue might be if 0 was not mapped? I guess we could attempt to set that up and see if it fails.

@giuseppe PTAL

@umohnani8 umohnani8 force-pushed the sysctl branch 3 times, most recently from 5de0ca1 to 037a50f Compare August 7, 2018 14:40
@umohnani8
Copy link
Member Author

@mrunalp the tests that ping a pod from another pod should be testing this change automatically - https://github.com/kubernetes-incubator/cri-o/blob/master/test/network.bats#L103.

@umohnani8 umohnani8 force-pushed the sysctl branch 2 times, most recently from e985e5e to 25b669b Compare August 8, 2018 14:11
@umohnani8
Copy link
Member Author

/test all

@umohnani8
Copy link
Member Author

/test critest_rhel

@umohnani8
Copy link
Member Author

/test integration_rhel

@umohnani8
Copy link
Member Author

/test kata-containers

@umohnani8
Copy link
Member Author

/test all

@umohnani8 umohnani8 force-pushed the sysctl branch 3 times, most recently from 736c8aa to 32fa9e7 Compare August 13, 2018 15:52
@umohnani8
Copy link
Member Author

/test all

@umohnani8 umohnani8 force-pushed the sysctl branch 2 times, most recently from eca1b8f to f89f82e Compare August 20, 2018 19:20
@mrunalp
Copy link
Member

mrunalp commented Aug 20, 2018

/test all

@umohnani8
Copy link
Member Author

/test integration_rhel

@mrunalp
Copy link
Member

mrunalp commented Aug 21, 2018

We can skip the test for RHEL 7 by checking for the setting on the host.

@umohnani8
Copy link
Member Author

/test all

@umohnani8 umohnani8 force-pushed the sysctl branch 2 times, most recently from be35082 to d243d65 Compare August 21, 2018 19:43
@mrunalp
Copy link
Member

mrunalp commented Aug 21, 2018

/test all

@umohnani8 umohnani8 force-pushed the sysctl branch 3 times, most recently from 6a1d44a to 130d02e Compare August 22, 2018 17:12
@umohnani8
Copy link
Member Author

/test all

@mrunalp
Copy link
Member

mrunalp commented Aug 22, 2018

Just debugged this with @umohnani8. This sysctl isn't available when using a user namespace on RHEL. We should use some other sysctl for testing such as net.ipv4.ip_forward.

@umohnani8 umohnani8 force-pushed the sysctl branch 3 times, most recently from 73d80b4 to 58b0316 Compare August 23, 2018 14:13
@umohnani8
Copy link
Member Author

/test all

@umohnani8
Copy link
Member Author

Tests are green! @mrunalp PTAL


// Add default sysctls given in crio.conf
for _, defaultSysctl := range s.config.DefaultSysctls {
split := strings.Split(defaultSysctl, "=")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this be a SplitN(defaultSysctl, "=",2)
TO make sure the sysctl does not include a "=", I know it is doubtful, but...

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed.

split := strings.Split(defaultSysctl, "=")
if len(split) == 2 {
if err := validateSysctl(split[0], hostNetwork, hostIPC); err != nil {
logrus.Warnf("%v - skipping...", err)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

skipping %s ..., err, defaultSysctl
Should this be an error and refuse to start the container?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was thinking why error out completely when it is just possible that the user made a typo or something in the sysctls they have enabled. Made more sense to log it and just continue to create the error.
If you think it is better to fail completely I can change it.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well we at lease need more info on which sysctl failed.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, yeah adding that.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed.

Users can add default sysctls that they want all cri-o containers
to have in the crio.conf file now.
Adding net.ipv4.ping_group_range as a default so we can drop the
NET_RAW capability.

Signed-off-by: umohnani8 <umohnani@redhat.com>
We extract the sysctls from k8s so we don't need
to extract it in cri-o separately.
Had a TODO from a while back to remove it, so removing.

Signed-off-by: umohnani8 <umohnani@redhat.com>
@rhatdan
Copy link
Contributor

rhatdan commented Aug 24, 2018

/test all

@rhatdan
Copy link
Contributor

rhatdan commented Aug 24, 2018

LGTM

@umohnani8
Copy link
Member Author

@mrunalp this is ready to merge :)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

size/L Denotes a PR that changes 100-499 lines, ignoring generated files.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants