InPlace VPA: wrong CRI updates after lack of resources limits  #120709

@LingyanYin

Description

What happened?

Starting from test-burstable1.yaml below: after I delete the resource limits and the in-place VPA resize completes, the container cgroup's memory limit still holds the old value (see the patch command after the manifest).

apiVersion: v1
kind: Pod
metadata:
  name: test-burstable1
  namespace: ly-test
spec:
  containers:
  - name: test-burstable1
    image: nginx:1.14.2
    resources:
      requests:
        memory: "64Mi"
        cpu: "250m"
      limits:
        memory: "128Mi"
        cpu: "500m"
    ports:
    - containerPort: 80

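The exact command used to drop the limits isn't recorded in the issue; one way to do it in place (assuming the InPlacePodVerticalScaling feature gate is enabled on the cluster) is a JSON patch against the running pod:

$ kubectl -n ly-test patch pod test-burstable1 --type=json \
    -p='[{"op":"remove","path":"/spec/containers/0/resources/limits"}]'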
After deleting the limits, the pod reads:

apiVersion: v1
kind: Pod
metadata:
  annotations:
    kubectl.kubernetes.io/last-applied-configuration: |
      {"apiVersion":"v1","kind":"Pod","metadata":{"annotations":{},"name":"test-burstable1","namespace":"ly-test"},"spec":{"containers":[{"image":"nginx:1.14.2","name":"test-burstable1","ports":[{"containerPort":80}],"resources":{"limits":{"cpu":"500m","memory":"128Mi"},"requests":{"cpu":"250m","memory":"64Mi"}}}]}}
  creationTimestamp: "2023-08-22T02:53:30Z"
  name: test-burstable1
  namespace: ly-test
  resourceVersion: "4028"
  uid: 3ad61d54-d812-4519-b927-9e9190fdfd83
spec:
  containers:
  - image: nginx:1.14.2
    imagePullPolicy: IfNotPresent
    name: test-burstable1
    ports:
    - containerPort: 80
      protocol: TCP
    resizePolicy:
    - resourceName: cpu
      restartPolicy: NotRequired
    - resourceName: memory
      restartPolicy: NotRequired
    resources:
      requests:
        cpu: 250m
        memory: 64Mi
    terminationMessagePath: /dev/termination-log
    terminationMessagePolicy: File
    volumeMounts:
    - mountPath: /var/run/secrets/kubernetes.io/serviceaccount
      name: kube-api-access-65vsw
      readOnly: true
  dnsPolicy: ClusterFirst
  enableServiceLinks: true
  nodeName: 127.0.0.1
  preemptionPolicy: PreemptLowerPriority
  priority: 0
  restartPolicy: Always
  schedulerName: default-scheduler
  securityContext: {}
  serviceAccount: default
  serviceAccountName: default
  terminationGracePeriodSeconds: 30
  tolerations:
  - effect: NoExecute
    key: node.kubernetes.io/not-ready
    operator: Exists
    tolerationSeconds: 300
  - effect: NoExecute
    key: node.kubernetes.io/unreachable
    operator: Exists
    tolerationSeconds: 300
  volumes:
  - name: kube-api-access-65vsw
    projected:
      defaultMode: 420
      sources:
      - serviceAccountToken:
          expirationSeconds: 3607
          path: token
      - configMap:
          items:
          - key: ca.crt
            path: ca.crt
          name: kube-root-ca.crt
      - downwardAPI:
          items:
          - fieldRef:
              apiVersion: v1
              fieldPath: metadata.namespace
            path: namespace
status:
  conditions:
  - lastProbeTime: null
    lastTransitionTime: "2023-08-22T02:53:30Z"
    status: "True"
    type: Initialized
  - lastProbeTime: null
    lastTransitionTime: "2023-08-22T02:53:33Z"
    status: "True"
    type: Ready
  - lastProbeTime: null
    lastTransitionTime: "2023-08-22T02:53:33Z"
    status: "True"
    type: ContainersReady
  - lastProbeTime: null
    lastTransitionTime: "2023-08-22T02:53:30Z"
    status: "True"
    type: PodScheduled
  containerStatuses:
  - allocatedResources:
      cpu: 250m
      memory: 64Mi
    containerID: containerd://c2d0a778ab88b0c6b20192ab501ace1a1ead72a2dbe9d2169f4637d3c564065d
    image: docker.io/library/nginx:1.14
    imageID: docker.io/library/nginx@sha256:f7988fb6c02e0ce69257d9bd9cf37ae20a60f1df7563c3a2a6abe24160306b8d
    lastState: {}
    name: test-burstable1
    ready: true
    resources:
      requests:
        cpu: 250m
        memory: 64Mi
    restartCount: 0
    started: true
    state:
      running:
        startedAt: "2023-08-22T02:53:33Z"
  hostIP: 127.0.0.1
  phase: Running
  podIP: 10.88.0.80
  podIPs:
  - ip: 10.88.0.80
  - ip: 2001:4860:4860::50
  qosClass: Burstable
  startTime: "2023-08-22T02:53:30Z"
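The API side accepted the change: allocatedResources and status.resources above now carry only requests, with no limits. A quick way to inspect just those fields (a sketch; the jsonpath expression is my own):

$ kubectl -n ly-test get pod test-burstable1 \
    -o jsonpath='{.status.containerStatuses[0].resources}'

If the resize had been applied end to end, the node-level cgroup would match this view; the check below shows it does not.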

Checking the container cgroup's memory.limit_in_bytes, it is still 128 MiB (134217728 bytes), which is WRONG:

root:/sys/fs/cgroup/memory/kubepods/burstable/pod3ad61d54-d812-4519-b927-9e9190fdfd83/c2d0a778ab88b0c6b20192ab501ace1a1ead72a2dbe9d2169f4637d3c564065d# cat memory.limit_in_bytes
134217728
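For comparison, a container with no memory limit at all reads back the kernel maximum under cgroup v1, not a finite value; this is what a correct resize should leave behind (the exact number assumes x86_64 with 4 KiB pages):

$ cat memory.limit_in_bytes
9223372036854771712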

What did you expect to happen?

The cgroup's memory limit should not stay at the old 128Mi value; once the limits are removed, the kubelet should update the container's cgroup via CRI so that the memory limit is effectively unbounded (constrained only by the node).

How can we reproduce it (as minimally and precisely as possible)?

1. Create the pod from test-burstable1.yaml above, with both requests and limits set.
2. Remove the resource limits from the running pod, triggering an in-place resize.
3. After the resize completes, read memory.limit_in_bytes for the container on the node: it still shows the old 128Mi value. A full command sequence is sketched below.
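A minimal end-to-end sequence, assuming the InPlacePodVerticalScaling feature gate is enabled and the node uses cgroup v1 with the cgroupfs driver (pod UID and container ID will differ per run):

# 1. Create the pod with both requests and limits.
$ kubectl apply -f test-burstable1.yaml

# 2. Remove the limits from the running pod (in-place resize).
$ kubectl -n ly-test patch pod test-burstable1 --type=json \
    -p='[{"op":"remove","path":"/spec/containers/0/resources/limits"}]'

# 3. On the node, read the container's memory limit from the cgroup.
$ POD_UID=$(kubectl -n ly-test get pod test-burstable1 -o jsonpath='{.metadata.uid}')
$ CID=$(kubectl -n ly-test get pod test-burstable1 \
    -o jsonpath='{.status.containerStatuses[0].containerID}' | sed 's|containerd://||')
$ cat /sys/fs/cgroup/memory/kubepods/burstable/pod${POD_UID}/${CID}/memory.limit_in_bytes
134217728    # still the old 128Mi limit -> the bug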

Anything else we need to know?

No response

Kubernetes version

$ kubectl version

WARNING: This version information is deprecated and will be replaced with the output from kubectl version --short.  Use --output=yaml|json to get the full version.
Client Version: version.Info{Major:"1", Minor:"27", GitVersion:"v1.27.1", GitCommit:"4c9411232e10168d7b050c49a1b59f6df9d7ea4b", GitTreeState:"clean", BuildDate:"2023-04-14T13:21:19Z", GoVersion:"go1.20.3", Compiler:"gc", Platform:"linux/amd64"}
Kustomize Version: v5.0.1
Server Version: version.Info{Major:"", Minor:"", GitVersion:"v0.0.0-master+$Format:%H$", GitCommit:"ecfb391633f9cbe634e014eb5fba4fb0b8b9e5eb", GitTreeState:"dirty", BuildDate:"2023-08-23T21:54:59Z", GoVersion:"go1.20.7", Compiler:"gc", Platform:"linux/amd64"}
error: could not parse pre-release/metadata (-master+$Format:%H$) in version "v0.0.0-master+$Format:%H$

Cloud provider

N/A

OS version

$ cat /etc/os-release
PRETTY_NAME="Debian GNU/Linux 10 (buster)"
NAME="Debian GNU/Linux"
VERSION_ID="10"
VERSION="10 (buster)"
VERSION_CODENAME=buster
ID=debian
HOME_URL="https://www.debian.org/"
SUPPORT_URL="https://www.debian.org/support"
BUG_REPORT_URL="https://bugs.debian.org/"
$ uname -a
Linux n231-228-034 5.4.56.bsk.10-amd64 #5.4.56.bsk.10 SMP Debian 5.4.56.bsk.10 Fri Sep 24 12:17:03 UTC  x86_64 GNU/Linux

Install tools

Container runtime (CRI) and version (if applicable)

Related plugins (CNI, CSI, ...) and versions (if applicable)


Labels

kind/bug, lifecycle/rotten, needs-triage, sig/node
