[FG:InPlacePodVerticalScaling] Handle pod resize even if the pod has not started yet #126527

@hshiina

What happened?

If a pod resize that cannot be accepted (one that results in a Deferred or Infeasible resize status) is requested before the container has started (for example, while an init container is running), the container is started with the unacceptable spec.

$ kubectl create -f pod.yaml; sleep 5; kubectl patch pod resize-pod --patch '{"spec": {"containers": [{"name": "resize-container", "resources":{"requests": {"cpu": "100"}, "limits": {"cpu": "100"}}}]}}'
pod/resize-pod created
pod/resize-pod patched
$ kubectl get pod resize-pod -o jsonpath='spec: {.spec.containers[0].resources}{"\nallocatedResources: "}{.status.containerStatuses[0].allocatedResources}{"\nstatus: "}{.status.containerStatuses[0].resources}{"\nresize: "}{.status.resize}{"\n"}'
spec: {"limits":{"cpu":"100","memory":"200Mi"},"requests":{"cpu":"100","memory":"200Mi"}}
allocatedResources: {"cpu":"200m","memory":"200Mi"}
status: {"limits":{"cpu":"100","memory":"200Mi"},"requests":{"cpu":"100","memory":"200Mi"}}
resize: Infeasible

The pod is admitted with its initial spec when it is created. The resized spec is then never checked for admission because the pod is not running yet:

func (kl *Kubelet) handlePodResourcesResize(pod *v1.Pod) *v1.Pod {
	if pod.Status.Phase != v1.PodRunning {
		return pod
	}

As a result, the container is started with the unacceptable spec. Once the pod is running, its resize status eventually becomes Infeasible because the allocated resources, which were never updated, differ from the resized pod spec.
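The sequence described above can be sketched with a small, self-contained model. This is only an illustration under stated assumptions: the `Pod` struct, `handleResize`, `startContainer`, and the 4-CPU node capacity are hypothetical stand-ins, not kubelet code, and the Deferred case is left out.

```go
package main

import "fmt"

// Phase and ResizeStatus are simplified stand-ins for the v1 API types.
type Phase string
type ResizeStatus string

const (
	PodPending Phase = "Pending"
	PodRunning Phase = "Running"

	ResizeNone       ResizeStatus = ""
	ResizeInfeasible ResizeStatus = "Infeasible"
)

// Pod is a hypothetical, CPU-only model of a pod with a single container.
type Pod struct {
	Phase         Phase
	SpecMilliCPU  int64 // CPU request in the (possibly patched) spec
	AllocMilliCPU int64 // CPU request accepted at admission
	Resize        ResizeStatus
}

// handleResize mimics the early return quoted above: a resize is only
// evaluated once the pod is Running.
func handleResize(p Pod, nodeMilliCPU int64) Pod {
	if p.Phase != PodRunning {
		return p // the resize is not checked, so no status is recorded
	}
	if p.SpecMilliCPU != p.AllocMilliCPU && p.SpecMilliCPU > nodeMilliCPU {
		p.Resize = ResizeInfeasible // allocation stays at the admitted value
	}
	return p
}

// startContainer returns the CPU value the container is created with.
// In this model it is read from the spec, matching the reported behavior.
func startContainer(p Pod) int64 {
	return p.SpecMilliCPU
}

func main() {
	const nodeMilliCPU = 4000 // assumed node capacity: 4 CPUs

	// Admitted with 200m, then patched to 100 CPUs while the init container runs.
	pod := Pod{Phase: PodPending, SpecMilliCPU: 100000, AllocMilliCPU: 200}

	pod = handleResize(pod, nodeMilliCPU)
	started := startContainer(pod)
	fmt.Printf("before start: resize=%q, container starts with %dm\n", pod.Resize, started)

	pod.Phase = PodRunning
	pod = handleResize(pod, nodeMilliCPU)
	fmt.Printf("after start:  resize=%q, allocated=%dm\n", pod.Resize, pod.AllocMilliCPU)
}
```

In this model the resize is silently skipped while the pod is Pending, the container is created from the patched spec, and Infeasible is reported only after the pod is Running, which mirrors the output shown at the top of this report.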

This issue does not appear to affect actual resource consumption in a way similar to #126033. Because the pod cgroup is not updated in this case, the container's resources remain limited. In addition, because AllocatedResources in the container status is not updated, the infeasible resize does not affect the scheduler's pod resource calculation.
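The point about the scheduler can be illustrated with a rough sketch. It assumes a simplified rule, namely that an Infeasible resize is counted at the allocated value and that otherwise the larger of the spec request and the allocated value is used. That rule, the `containerView` type, and `schedulerMilliCPU` are assumptions made for illustration and are not taken from this report or copied from the scheduler source.

```go
package main

import "fmt"

// containerView is a hypothetical, CPU-only view of one container.
type containerView struct {
	specMilliCPU      int64 // request in the pod spec (already patched)
	allocatedMilliCPU int64 // AllocatedResources reported in the container status
}

// schedulerMilliCPU returns the CPU request counted for scheduling under
// the assumed rule: an Infeasible resize counts only the allocated value;
// otherwise the larger of spec and allocated is used.
func schedulerMilliCPU(c containerView, resize string) int64 {
	if resize == "Infeasible" {
		return c.allocatedMilliCPU
	}
	if c.allocatedMilliCPU > c.specMilliCPU {
		return c.allocatedMilliCPU
	}
	return c.specMilliCPU
}

func main() {
	// Values from this report: spec patched to 100 CPUs, allocation still 200m.
	infeasible := containerView{specMilliCPU: 100000, allocatedMilliCPU: 200}
	fmt.Println("infeasible resize counts:", schedulerMilliCPU(infeasible, "Infeasible"))

	// A hypothetical resize that is still in progress, for contrast.
	growing := containerView{specMilliCPU: 300, allocatedMilliCPU: 200}
	fmt.Println("in-progress resize counts:", schedulerMilliCPU(growing, "InProgress"))
}
```

Under that assumption, the 100-CPU request from the reproduction contributes only the 200m that was actually allocated once the resize status is Infeasible.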

What did you expect to happen?

The pod should either start with the initial spec and enter the Infeasible resize status, or fail to start.
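A minimal sketch of the first expected outcome follows, assuming the resize is evaluated regardless of pod phase and the container is created from the allocated resources rather than the patched spec. The `pod` struct, `checkResize`, and `effectiveMilliCPU` are hypothetical names; this is not a proposed patch, and it ignores the Deferred case.

```go
package main

import "fmt"

type resizeStatus string

const (
	resizeNone       resizeStatus = ""
	resizeInfeasible resizeStatus = "Infeasible"
)

// pod is a hypothetical, CPU-only model; it is not the v1.Pod type.
type pod struct {
	running       bool
	specMilliCPU  int64 // request in the (possibly patched) spec
	allocMilliCPU int64 // request accepted at admission
	resize        resizeStatus
}

// checkResize evaluates a pending resize whether or not the pod is running.
func checkResize(p pod, nodeMilliCPU int64) pod {
	if p.specMilliCPU == p.allocMilliCPU {
		p.resize = resizeNone
		return p
	}
	if p.specMilliCPU > nodeMilliCPU {
		p.resize = resizeInfeasible // keep the admitted allocation
		return p
	}
	p.allocMilliCPU = p.specMilliCPU // the resize fits, so accept it
	p.resize = resizeNone
	return p
}

// effectiveMilliCPU is the value the container would be started with:
// the allocation, not the raw spec.
func effectiveMilliCPU(p pod) int64 {
	return p.allocMilliCPU
}

func main() {
	const nodeMilliCPU = 4000 // assumed node capacity: 4 CPUs

	// Patched to 100 CPUs while the init container is still running.
	p := pod{running: false, specMilliCPU: 100000, allocMilliCPU: 200}
	p = checkResize(p, nodeMilliCPU)

	fmt.Printf("resize=%q, container starts with %dm\n", p.resize, effectiveMilliCPU(p))
}
```

With this ordering the infeasible request is flagged before the container exists, and the container is created with the 200m that was admitted.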

How can we reproduce it (as minimally and precisely as possible)?

  1. Enable InPlacePodVerticalScaling.

  2. Create a pod with an init container that takes a few seconds to complete:

    apiVersion: v1
    kind: Pod
    metadata:
      creationTimestamp: null
      labels:
        run: resize-pod
      name: resize-pod
    spec:
      initContainers:
      - image: busybox
        name: init-container
        command:
          - sleep
          - "10"
        resources:
          requests:
            cpu: 100m
            memory: 100Mi
          limits:
            cpu: 100m
            memory: 100Mi
      containers:
      - image: busybox
        name: resize-container
        command:
          - sh
          - -c
          - trap "exit 0" SIGTERM; while true; do sleep 1; done
        resources:
          requests:
            cpu: 200m
            memory: 200Mi
          limits:
            cpu: 200m
            memory: 200Mi
        resizePolicy:
        - resourceName: cpu
          restartPolicy: NotRequired
        - resourceName: memory
          restartPolicy: NotRequired
      restartPolicy: Always
    
  3. While the init container is running, patch the pod with an infeasible resize request:

    $ kubectl create -f pod.yaml; sleep 5; kubectl patch pod resize-pod --patch '{"spec": {"containers": [{"name": "resize-container", "resources":{"requests": {"cpu": "100"}, "limits": {"cpu": "100"}}}]}}'
    
  4. Watch the pod:

    $ kubectl get pod resize-pod -o jsonpath='spec: {.spec.containers[0].resources}{"\nallocatedResources: "}{.status.containerStatuses[0].allocatedResources}{"\nstatus: "}{.status.containerStatuses[0].resources}{"\nresize: "}{.status.resize}{"\n"}' -w
    

Anything else we need to know?

No response

Kubernetes version

$ kubectl version
Client Version: v1.30.3
Kustomize Version: v5.0.4-0.20230601165947-6ce0bf390ce3
Server Version: v1.30.2

Cloud provider

N/A

Labels

kind/bug: Categorizes issue or PR as related to a bug.
priority/important-longterm: Important over the long term, but may not be staffed and/or may need multiple releases to complete.
sig/node: Categorizes an issue or PR as relevant to SIG Node.
triage/accepted: Indicates an issue or PR is ready to be actively worked on.

Projects

Status

Done
