Surface image pull errors in service status #4192

@jonjohnsonjr

In what area(s)?

/area API

What version of Knative?

HEAD

Expected Behavior

When a service is deployed and its image cannot be pulled, we should provide some indication in the service status that the pod is stuck in ImagePullBackOff.

Actual Behavior

In the revision, everything is Unknown because we are deploying.

status:
  conditions:
  - lastTransitionTime: "2019-05-29T22:48:43Z"
    message: Requests to the target are being buffered as resources are provisioned.
    reason: Queued
    severity: Info
    status: Unknown
    type: Active
  - lastTransitionTime: "2019-05-29T22:48:43Z"
    status: "True"
    type: BuildSucceeded
  - lastTransitionTime: "2019-05-29T22:48:43Z"
    reason: Deploying
    status: Unknown
    type: ContainerHealthy
  - lastTransitionTime: "2019-05-29T22:48:43Z"
    reason: Deploying
    status: Unknown
    type: Ready
  - lastTransitionTime: "2019-05-29T22:48:43Z"
    reason: Deploying
    status: Unknown
    type: ResourcesAvailable

In the service, we are just waiting for the revision to become Ready.

Interestingly, we know that the deployment is not Progressing:

deployment:
  status:
    conditions:
    - lastTransitionTime: "2019-05-29T22:48:43Z"
      lastUpdateTime: "2019-05-29T22:48:43Z"
      message: Deployment does not have minimum availability.
      reason: MinimumReplicasUnavailable
      status: "False"
      type: Available
    - lastTransitionTime: "2019-05-29T22:50:44Z"
      lastUpdateTime: "2019-05-29T22:50:44Z"
      message: ReplicaSet "autoscale-go-d8flr-deployment-7bc5c75ff4" has timed out progressing.
      reason: ProgressDeadlineExceeded
      status: "False"
      type: Progressing
    observedGeneration: 1
    replicas: 1
    unavailableReplicas: 1
    updatedReplicas: 1
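
As a rough illustration of how the timeout can be read off the deployment status shown above, here is a minimal sketch. `DeploymentCondition` is a local stand-in for the Kubernetes apps/v1 type (only the fields used here), and `progressDeadlineExceeded` is a hypothetical helper name, not the controller's actual function.

```go
package main

import "fmt"

// DeploymentCondition is a local stand-in (assumption) for the Kubernetes
// apps/v1 DeploymentCondition, reduced to the fields used in this sketch.
type DeploymentCondition struct {
	Type    string
	Status  string
	Reason  string
	Message string
}

// progressDeadlineExceeded is a hypothetical helper. It reports whether the
// Progressing condition is False with reason ProgressDeadlineExceeded and,
// if so, returns that condition's message.
func progressDeadlineExceeded(conds []DeploymentCondition) (string, bool) {
	for _, c := range conds {
		if c.Type == "Progressing" && c.Status == "False" && c.Reason == "ProgressDeadlineExceeded" {
			return c.Message, true
		}
	}
	return "", false
}

func main() {
	// Conditions copied from the deployment status in this issue.
	conds := []DeploymentCondition{
		{Type: "Available", Status: "False", Reason: "MinimumReplicasUnavailable",
			Message: "Deployment does not have minimum availability."},
		{Type: "Progressing", Status: "False", Reason: "ProgressDeadlineExceeded",
			Message: `ReplicaSet "autoscale-go-d8flr-deployment-7bc5c75ff4" has timed out progressing.`},
	}
	msg, timedOut := progressDeadlineExceeded(conds)
	fmt.Printf("timedOut=%v message=%q\n", timedOut, msg)
}
```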

But because the revision's Active condition is Unknown:

func (rs *RevisionStatus) IsActivationRequired() bool {
	if c := revCondSet.Manage(rs).GetCondition(RevisionConditionActive); c != nil {
		return c.Status != corev1.ConditionTrue
	}
	return false
}

We assume activation is required, so we don't propagate the deployment ProgressDeadlineExceeded condition:

if hasDeploymentTimedOut(deployment) && !rev.Status.IsActivationRequired() {

That seems like it might be a separate bug?
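
To make the interaction concrete, here is a small self-contained model of the two excerpts above. The types and the `shouldPropagateTimeout` name are assumptions made for illustration; only the boolean logic is taken from the quoted code.

```go
package main

import "fmt"

// ConditionStatus and Condition are local stand-ins (assumptions) for the
// corev1 and Knative condition types referenced in the excerpts.
type ConditionStatus string

const (
	ConditionTrue    ConditionStatus = "True"
	ConditionFalse   ConditionStatus = "False"
	ConditionUnknown ConditionStatus = "Unknown"
)

type Condition struct {
	Status ConditionStatus
}

// isActivationRequired mirrors the quoted method: with no Active condition
// it returns false, otherwise it returns true for any status other than True.
func isActivationRequired(active *Condition) bool {
	if active != nil {
		return active.Status != ConditionTrue
	}
	return false
}

// shouldPropagateTimeout models the quoted guard (hypothetical name).
func shouldPropagateTimeout(deploymentTimedOut bool, active *Condition) bool {
	return deploymentTimedOut && !isActivationRequired(active)
}

func main() {
	cases := []struct {
		name   string
		active *Condition
	}{
		{"Active=Unknown", &Condition{Status: ConditionUnknown}},
		{"Active=False", &Condition{Status: ConditionFalse}},
		{"Active=True", &Condition{Status: ConditionTrue}},
		{"Active unset", nil},
	}
	for _, tc := range cases {
		fmt.Printf("%-15s propagate=%v\n", tc.name, shouldPropagateTimeout(true, tc.active))
	}
}
```

In this model a timed-out deployment is never propagated while Active is Unknown or False, which matches the behavior described in this issue.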

Regardless, in this case, we can check to see if the pod's user-container is in state waiting and surface an error if the deployment has also timed out with ProgressDeadlineExceeded:

pod:
  status:
    containerStatuses:
    - image: gcr.io/jonjohnson-test/autoscale-go@sha256:e5e89c5fd57c717b49d41be89faebc526bdcda017e898ae86c2bf20f5cd339b5
      imageID: ""
      lastState: {}
      name: user-container
      ready: false
      restartCount: 0
      state:
        waiting:
          message: Back-off pulling image "gcr.io/jonjohnson-test/autoscale-go@sha256:e5e89c5fd57c717b49d41be89faebc526bdcda017e898ae86c2bf20f5cd339b5"
          reason: ImagePullBackOff
    hostIP: 10.128.0.50
    phase: Pending
    podIP: 10.60.0.9
    qosClass: Burstable
    startTime: "2019-05-29T22:48:43Z"
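
The proposed check could look roughly like the sketch below. It is not the controller's implementation: the types are local stand-ins for the corev1 container status types, `userContainerWaiting` is a hypothetical helper, and how the result would be written onto the revision conditions is left out.

```go
package main

import "fmt"

// ContainerStateWaiting and ContainerStatus are local stand-ins (assumptions)
// for the corev1 types, reduced to the fields used in this sketch.
type ContainerStateWaiting struct {
	Reason  string
	Message string
}

type ContainerStatus struct {
	Name    string
	Waiting *ContainerStateWaiting // nil when the container is not waiting
}

// userContainerName is the container name shown in the pod status above.
const userContainerName = "user-container"

// userContainerWaiting is a hypothetical helper. It returns the waiting
// reason and message of the user-container, but only when the deployment
// has also timed out, as proposed in this issue.
func userContainerWaiting(statuses []ContainerStatus, deploymentTimedOut bool) (reason, message string, ok bool) {
	if !deploymentTimedOut {
		return "", "", false
	}
	for _, s := range statuses {
		if s.Name == userContainerName && s.Waiting != nil {
			return s.Waiting.Reason, s.Waiting.Message, true
		}
	}
	return "", "", false
}

func main() {
	// Values copied from the pod status in this issue.
	statuses := []ContainerStatus{{
		Name: userContainerName,
		Waiting: &ContainerStateWaiting{
			Reason:  "ImagePullBackOff",
			Message: `Back-off pulling image "gcr.io/jonjohnson-test/autoscale-go@sha256:e5e89c5fd57c717b49d41be89faebc526bdcda017e898ae86c2bf20f5cd339b5"`,
		},
	}}
	if reason, msg, ok := userContainerWaiting(statuses, true); ok {
		fmt.Printf("surface error: reason=%s message=%s\n", reason, msg)
	}
}
```

Gating on the deployment timeout keeps a briefly waiting container (for example, during a normal image pull) from being reported as a failure.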

Steps to Reproduce the Problem

Deploy a ksvc with a non-existent image referenced by a (valid) digest, so that tag-to-digest resolution skips over it. The pod will never become ready because it can't pull the image.

Labels

area/API (API objects and controllers)
kind/bug (Categorizes issue or PR as related to a bug.)