
[FG:InPlacePodVerticalScaling] Pod CPU limit is not configured to cgroups as calculated if systemd cgroup driver is used #129357

@hshiina


What happened?

As a result of #124216, which was introduced in v1.32, a pod CPU limit calculated in ResourceConfigForPod() is rounded up to the nearest 10ms by libcontainer when the pod is resized:

  • Resize a pod:
    $ kubectl patch pod resize-pod --subresource=resize --patch '{"spec":{"containers":[{"name":"resize-container", "resources":{"limits":{"cpu":"417m"}}}]}}'
    pod/resize-pod patched
    
  • The container cgroup value is set with 1ms precision:
    $ kubectl exec resize-pod -- cat /sys/fs/cgroup/cpu.max
    41700 100000
    
  • The pod cgroup value is rounded up:
    $ cat /sys/fs/cgroup/kubelet.slice/kubelet-kubepods.slice/kubelet-kubepods-burstable.slice/kubelet-kubepods-burstable-pod68a17b59_0d31_40b2_ba86_ea43f3b2f05c.slice/cpu.max
    42000 100000
    
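For reference, the container-level value above matches the usual millicpu-to-quota arithmetic. The following is a minimal standalone sketch of that calculation (not the actual kubelet code; the helper in pkg/kubelet/cm/helpers_linux.go is the authoritative version):

package main

import "fmt"

const (
	quotaPeriod    = 100000 // default CFS period in microseconds (100ms)
	minQuotaPeriod = 1000   // smallest quota the kubelet will set
)

// milliCPUToQuota converts a milli-CPU limit into a CFS quota for the given
// period, mirroring the kubelet's conversion (sketch only).
func milliCPUToQuota(milliCPU, period int64) int64 {
	if milliCPU == 0 {
		return 0
	}
	quota := (milliCPU * period) / 1000
	if quota < minQuotaPeriod {
		quota = minQuotaPeriod
	}
	return quota
}

func main() {
	// A 417m CPU limit yields a 41700us quota per 100000us period,
	// which is what the container cgroup shows above ("41700 100000").
	fmt.Println(milliCPUToQuota(417, quotaPeriod)) // 41700
}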

When the systemd cgroup driver is used, libcontainer passes the CPU quota to systemd after rounding it up:

// systemd converts CPUQuotaPerSecUSec (microseconds per CPU second) to CPUQuota
// (integer percentage of CPU) internally. This means that if a fractional percent of
// CPU is indicated by Resources.CpuQuota, we need to round up to the nearest
// 10ms (1% of a second) such that child cgroups can set the cpu.cfs_quota_us they expect.
cpuQuotaPerSecUSec = uint64(quota*1000000) / period
if cpuQuotaPerSecUSec%10000 != 0 {
	cpuQuotaPerSecUSec = ((cpuQuotaPerSecUSec / 10000) + 1) * 10000
}
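
Applying the same conversion to the 417m example above makes the observed value concrete (my own arithmetic in a standalone sketch, not output from runc):

package main

import "fmt"

func main() {
	var quota, period uint64 = 41700, 100000 // values from the 417m example above

	// Same conversion and round-up as the runc snippet above.
	cpuQuotaPerSecUSec := quota * 1000000 / period // 417000
	if cpuQuotaPerSecUSec%10000 != 0 {
		cpuQuotaPerSecUSec = ((cpuQuotaPerSecUSec / 10000) + 1) * 10000 // 420000
	}

	// systemd applies CPUQuotaPerSecUSec per one second of wall time, which
	// shows up in the pod cgroup as 42000 per 100000us period.
	fmt.Println(cpuQuotaPerSecUSec, cpuQuotaPerSecUSec*period/1000000) // 420000 42000
}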

In addition, there seems to be a race in libcontainer: after passing the rounded value to systemd, it also writes the unrounded value directly to the cgroup file:

if err := setUnitProperties(m.dbus, getUnitName(m.cgroups), properties...); err != nil {
	return fmt.Errorf("unable to set unit properties: %w", err)
}
return m.fsMgr.Set(r)
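
To check which of the two writes ended up in effect for a given pod, I read the pod's cpu.max and compared the quota against a 1% (10ms) boundary. A small helper along these lines illustrates the check (a sketch only; the cgroup path is the one from the example above and differs per pod, and a quota of "max" is not handled):

package main

import (
	"fmt"
	"os"
	"strconv"
	"strings"
)

func main() {
	// Hypothetical pod cgroup path, copied from the example above; adjust per pod.
	path := "/sys/fs/cgroup/kubelet.slice/kubelet-kubepods.slice/kubelet-kubepods-burstable.slice/kubelet-kubepods-burstable-pod68a17b59_0d31_40b2_ba86_ea43f3b2f05c.slice/cpu.max"

	data, err := os.ReadFile(path)
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}

	// cpu.max has the form "<quota> <period>", e.g. "42000 100000".
	fields := strings.Fields(string(data))
	if len(fields) != 2 {
		fmt.Fprintln(os.Stderr, "unexpected cpu.max contents:", string(data))
		os.Exit(1)
	}
	quota, _ := strconv.ParseUint(fields[0], 10, 64)
	period, _ := strconv.ParseUint(fields[1], 10, 64)

	perSecUSec := quota * 1000000 / period
	if perSecUSec%10000 == 0 {
		fmt.Println("quota is a whole percentage of a CPU; systemd's rounded value is in effect")
	} else {
		fmt.Println("quota has sub-1% precision; the direct cgroupfs write is in effect")
	}
}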

So, there are also cases where the cgroup value is set exactly as calculated. In my testing, decreasing the CPU limit usually hits this case, though I'm not sure why:

  • Decrease the CPU limits:

    $ kubectl patch pod resize-pod --subresource=resize --patch '{"spec":{"containers":[{"name":"resize-container", "resources":{"limits":{"cpu":"365m"}}}]}}'
    pod/resize-pod patched
    
  • Both the container and the pod cgroup values are set with 1ms precision:

    $ kubectl exec resize-pod -- cat /sys/fs/cgroup/cpu.max
    36500 100000
    $ cat /sys/fs/cgroup/kubelet.slice/kubelet-kubepods.slice/kubelet-kubepods-burstable.slice/kubelet-kubepods-burstable-pod68a17b59_0d31_40b2_ba86_ea43f3b2f05c.slice/cpu.max
    36500 100000
    

What did you expect to happen?

This round-up looks like the intended behavior of the systemd cgroup driver, because the CPU quota is also rounded up when a pod is initially created with a 1ms-precision CPU limit. However, I have the following concerns:

How can we reproduce it (as minimally and precisely as possible)?

  1. Use the systemd cgroup driver and enable InPlacePodVerticalScaling.
  2. Resize a pod's CPU limits to a value with 1ms precision.

Anything else we need to know?

No response

Kubernetes version

v1.32

$ kubectl version
Client Version: v1.31.4
Kustomize Version: v5.4.2
Server Version: v1.32.0

Cloud provider

N/A

OS version

# On Linux:
$ cat /etc/os-release
# paste output here
$ uname -a
# paste output here

# On Windows:
C:\> wmic os get Caption, Version, BuildNumber, OSArchitecture
# paste output here

Install tools

Container runtime (CRI) and version (if applicable)

Related plugins (CNI, CSI, ...) and versions (if applicable)

Labels

kind/bug, sig/node, triage/accepted
