test: add e2e test case for nodepool upgrades#4505

Merged
openshift-merge-bot[bot] merged 8 commits into Azure:main from machi1990:e2e/y-stream-node-pool-upgrade-test
Mar 26, 2026

Conversation

@machi1990

@machi1990 machi1990 commented Mar 18, 2026

Collaborator

Special notes:

What

test: add e2e test case for nodepool upgrades

Why

To verify that the feature works end to end.

The strategy employed here for a minor upgrade is:

  • create a control plane at a minor, say X.Y
  • find the latest z in the previous minor, call it X.(Y-1).z, and install the nodepool on it
  • find the latest release images of the nodes after the nodepool is installed and call them the previous images
  • find the lowest installed control plane version in the version history and use it as the target to find the next nodepool version to upgrade to
  • trigger the nodepool upgrade
  • verify that the upgrade succeeded
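The second step (picking the latest z in the previous minor) can be sketched as follows. This is a minimal illustration and not the code from the PR: the `version` type and the `parseVersion` and `latestPatchInPreviousMinor` helpers are hypothetical names, and the sketch assumes plain X.Y.Z version strings supplied by the caller.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// version is a hypothetical X.Y.Z triple used only for this sketch.
type version struct{ major, minor, patch int }

// parseVersion accepts "X.Y.Z" strings; anything with a pre-release or
// build suffix is rejected so that it is never picked as an install target.
func parseVersion(s string) (version, error) {
	parts := strings.Split(s, ".")
	if len(parts) != 3 {
		return version{}, fmt.Errorf("expected X.Y.Z, got %q", s)
	}
	var n [3]int
	for i, p := range parts {
		v, err := strconv.Atoi(p)
		if err != nil || v < 0 {
			return version{}, fmt.Errorf("invalid component %q in %q", p, s)
		}
		n[i] = v
	}
	return version{n[0], n[1], n[2]}, nil
}

// latestPatchInPreviousMinor returns the highest X.(Y-1).z among candidates
// for a control plane running X.Y.z. Unparsable candidates are skipped.
func latestPatchInPreviousMinor(controlPlane string, candidates []string) (string, error) {
	cp, err := parseVersion(controlPlane)
	if err != nil {
		return "", err
	}
	if cp.minor == 0 {
		return "", fmt.Errorf("%s has no previous minor within the same major", controlPlane)
	}
	best, found := version{}, false
	for _, c := range candidates {
		v, err := parseVersion(c)
		if err != nil || v.major != cp.major || v.minor != cp.minor-1 {
			continue
		}
		if !found || v.patch > best.patch {
			best, found = v, true
		}
	}
	if !found {
		return "", fmt.Errorf("no candidate found in %d.%d", cp.major, cp.minor-1)
	}
	return fmt.Sprintf("%d.%d.%d", best.major, best.minor, best.patch), nil
}
```

Patches are compared numerically, so 4.20.11 is preferred over 4.20.3, which a plain string comparison would get wrong.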

The strategy for a z-stream upgrade is:
- install the cluster at the latest version
- find an older patch release and install the nodepool on it
- find the latest patch release that we can upgrade to and perform the upgrade
- verify that the upgrade succeeded
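A rough sketch of choosing the install and target patch versions for the z-stream case is shown below. The helper names are hypothetical, and the choice of the second-highest patch as the "older" release is an assumption made for illustration; the PR does not state which older patch it picks, and the sketch ignores whether an upgrade edge between the two versions actually exists.

```go
package main

import (
	"strconv"
	"strings"
)

// patchOf extracts z from "X.Y.z" when v belongs to the given "X.Y" minor.
func patchOf(v, minor string) (int, bool) {
	prefix := minor + "."
	if !strings.HasPrefix(v, prefix) {
		return 0, false
	}
	z, err := strconv.Atoi(strings.TrimPrefix(v, prefix))
	if err != nil || z < 0 {
		return 0, false
	}
	return z, true
}

// pickPatchUpgrade returns the version to install the nodepool on (from) and
// the version to upgrade to (to) within one minor. ASSUMPTION: "older patch"
// is taken to mean the highest patch strictly below the latest one.
func pickPatchUpgrade(available []string, minor string) (from, to string, found bool) {
	latest, older := -1, -1
	for _, v := range available {
		z, valid := patchOf(v, minor)
		if !valid {
			continue
		}
		switch {
		case z > latest:
			latest, older = z, latest
		case z < latest && z > older:
			older = z
		}
	}
	if older < 0 {
		return "", "", false // fewer than two distinct patches in this minor
	}
	return minor + "." + strconv.Itoa(older), minor + "." + strconv.Itoa(latest), true
}
```

With the inputs 4.20.3, 4.20.11 and 4.20.7, the nodepool would be installed on 4.20.7 and upgraded to 4.20.11.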

To verify that the upgrade succeeded, we wait for:

  • the nodes to be ready
  • the release images of the new nodes to change
  • the nodes to be installed on the desired X.Y minor version
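The first two checks can be expressed as a single predicate that a polling loop (for example Gomega's `Eventually`) could evaluate. This is only a sketch: `nodeState` and `upgradeObserved` are hypothetical names, and how the release image is read from a node is not described here, so it is modelled as a plain string field. The third check (desired minor) is left out because the source does not say how the minor is derived from a node.

```go
package main

// nodeState is a hypothetical, minimal view of a node for this sketch.
type nodeState struct {
	Name         string
	Ready        bool
	ReleaseImage string
}

// upgradeObserved reports whether every node is ready and none of them still
// runs one of the release images recorded before the upgrade was triggered.
func upgradeObserved(nodes []nodeState, previousImages []string) bool {
	if len(nodes) == 0 {
		return false // nothing to observe yet, keep waiting
	}
	previous := make(map[string]struct{}, len(previousImages))
	for _, img := range previousImages {
		previous[img] = struct{}{}
	}
	for _, n := range nodes {
		if !n.Ready {
			return false
		}
		if _, old := previous[n.ReleaseImage]; old {
			return false
		}
	}
	return true
}
```

Treating an empty node list as "not yet upgraded" avoids a false positive while old nodes have been drained and replacements have not registered.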

Special notes for your reviewer

  1. The verification is done this way because the Node CR doesn't include the X.Y.Z version information that would have let us simply check "targetVersion==NodeObservedVersion". Only the kubelet version is available, which isn't what we want to verify here. The current ocpVersion of the nodepool is only available in the NodePool CR, but that is a service-only CR that is accessible only from the management cluster, not from within the HCP guest cluster.
  2. We need Automated - Update component image digests #4491, feat: node pool upgrades frontend handler #4224, and fix: use Cosmos desired version for nodepool in merge so API reflects requested version #4537 to be merged and deployed to prod.
  3. This is an initial attempt covering the 4.20 and 4.21 releases to see how the test does timing-wise; it can be extended to include other versions.
  4. A possible future improvement is to verify that the workload isn't disrupted while the upgrade is ongoing.

@openshift-ci

openshift-ci Bot commented Mar 18, 2026


Skipping CI for Draft Pull Request.
If you want CI signal for your change, please convert it to an actual PR.
You can still manually trigger a test run with /test all

Comment thread test/e2e/nodepool_version_upgrade.go Outdated
@machi1990

Collaborator Author

/test e2e-parallel

@machi1990 machi1990 marked this pull request as ready for review March 18, 2026 18:38
@machi1990

Collaborator Author

/hold

Hold against merge until #4491 and #4224 have been promoted all the way to prod. We might need other changes as well, which the e2e will tell us.

@machi1990 machi1990 force-pushed the e2e/y-stream-node-pool-upgrade-test branch 3 times, most recently from d5a8cce to 9e4e5bd Compare March 19, 2026 10:10
@machi1990

Collaborator Author

/test e2e-parallel

Comment thread test/e2e/nodepool_minor_upgrade.go Outdated
Comment thread test/e2e/nodepool_minor_upgrade.go Outdated

var _ = Describe("Customer", func() {
	DescribeTable("should upgrade a nodepool from",
		func(ctx context.Context, nodePoolMinor string, nextMinor string) {

Collaborator Author

A follow-up is to ensure that the sample app remains available during the upgrade.

@machi1990

Collaborator Author

/test e2e-parallel

@machi1990 machi1990 force-pushed the e2e/y-stream-node-pool-upgrade-test branch 2 times, most recently from d63d1c5 to 99239be Compare March 20, 2026 23:21
machi1990 added a commit to machi1990/ARO-HCP that referenced this pull request Mar 21, 2026
…nodepool creation

While working on Azure#4505 and further reviewing Azure#4554, I realised that the frontend was missing skew validation when creating the node pool.

The PR adds the skew validation by applying the following rules:
- if the version is set
  - validate that the node pool desired minor version is at most 2 minors lower than, and not higher than, the control plane minor
  - validate that the desired node pool version isn't higher than the lowest active control plane version, if this is available (it is only available post install)
  - validate that when the control plane major != the nodepool major version, only valid skews are allowed
     - for now the definition of valid skew is:
       if cp is on 5.0 then the node pool can be on 4.21 or 4.22
       if cp is on 5.1 then the node pool can be on 4.21, 4.22 or 4.23
       if cp is on 5.2 then the node pool can be only on 4.23
       I left a TODO to put this behind an AFEC flag once Azure#4479 is merged; it'll be done as a follow-up
- if the lowest control plane version is available and the node pool version is set, perform the same minor skew validation as done for the cluster desired version, but using the actual cluster version minor

The change also introduces some defaulting logic in case the version isn't set: in this case we default to the lowest active control plane version.
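As an illustration of the minor skew rules listed in this commit message, here is a small sketch. It is not the frontend code: `minorVersion`, `crossMajorSkew`, `maxMinorSkew` and `validateSkew` are hypothetical names, the cross-major table simply copies the three cases quoted above, and the AFEC flag, the check against the lowest active control plane version, and the defaulting logic are all left out.

```go
package main

import "fmt"

// minorVersion is a hypothetical X.Y pair used only for this sketch.
type minorVersion struct{ major, minor int }

// maxMinorSkew is the "at most 2 minors lower" rule from the commit message.
const maxMinorSkew = 2

// crossMajorSkew copies the valid cross-major combinations quoted in the
// commit message: control plane minor -> allowed node pool minors.
var crossMajorSkew = map[minorVersion][]minorVersion{
	{5, 0}: {{4, 21}, {4, 22}},
	{5, 1}: {{4, 21}, {4, 22}, {4, 23}},
	{5, 2}: {{4, 23}},
}

// validateSkew checks a node pool minor against a control plane minor.
func validateSkew(cp, np minorVersion) error {
	if np.major == cp.major {
		if np.minor > cp.minor {
			return fmt.Errorf("node pool %d.%d is newer than control plane %d.%d",
				np.major, np.minor, cp.major, cp.minor)
		}
		if cp.minor-np.minor > maxMinorSkew {
			return fmt.Errorf("node pool %d.%d is more than %d minors behind control plane %d.%d",
				np.major, np.minor, maxMinorSkew, cp.major, cp.minor)
		}
		return nil
	}
	// Different majors: only the explicitly listed combinations are valid.
	for _, allowed := range crossMajorSkew[cp] {
		if allowed == np {
			return nil
		}
	}
	return fmt.Errorf("node pool %d.%d is not a valid skew for control plane %d.%d",
		np.major, np.minor, cp.major, cp.minor)
}
```

Keeping the cross-major cases in a lookup table rather than deriving them arithmetically mirrors the "for now" wording of the commit message, since the list is expected to change.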
@machi1990 machi1990 force-pushed the e2e/y-stream-node-pool-upgrade-test branch from 99239be to 0e88194 Compare March 21, 2026 13:08
@machi1990 machi1990 changed the title test: add e2e test case for minor nodepool upgrades test: add e2e test case for patch & minor nodepool upgrades Mar 21, 2026
@machi1990 machi1990 changed the title test: add e2e test case for patch & minor nodepool upgrades test: add e2e test case for nodepool upgrades Mar 21, 2026
The strategy employed here is:
- create a control plane at a minor, say X.Y
- find the latest z in the previous minor, call it X.(Y-1).z, and install the nodepool on it
- find the latest release images of the nodes after the nodepool is installed and call them the previous images
- find the lowest installed control plane version in the version history and use it as the target to find the next nodepool version to upgrade to
- trigger the nodepool upgrade
- verify that the upgrade succeeded by waiting for:
  - the nodes to be ready
  - the release images of the new nodes to change
  - the nodes to be installed on the X.Y control plane minor

Special notes: The verification is done this way because the Node CR doesn't include the X.Y.Z version information that would have let us simply check "targetVersion==NodeObservedVersion". Only the kubelet version is available, which isn't what we want to verify here. The current ocpVersion of the nodepool is only available in the NodePool CR, but that is a service-only CR that is accessible only from the management cluster, not from within the HCP guest cluster.
@machi1990

Collaborator Author

Some new tests have been merged, so I have to rebase.

machi1990 and others added 7 commits March 26, 2026 09:07
strategy:
 - install the cluster at the latest version
 - find an older patch release and install the nodepool on it
 - find the latest patch release that we can upgrade to and perform the upgrade
 - verify that the upgrade succeeded
Full node objects can be 10KB+ due to annotations and cause Gomega to truncate output when assertions fail. Replace full node JSON dump with a compact nodeSummary struct containing only relevant fields: name, ready status, containerRuntimeVersion, and release images.
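A compact summary of this kind might look like the sketch below. The struct name `nodeSummary` and its four fields come from the commit message; the JSON tags, the `formatSummaries` helper, and the way the release images would be collected from a node are assumptions made for illustration.

```go
package main

import "encoding/json"

// nodeSummary keeps only the fields named in the commit message.
// The JSON tag names are an assumption of this sketch.
type nodeSummary struct {
	Name                    string   `json:"name"`
	Ready                   bool     `json:"ready"`
	ContainerRuntimeVersion string   `json:"containerRuntimeVersion"`
	ReleaseImages           []string `json:"releaseImages"`
}

// formatSummaries renders the summaries as compact JSON so that a failed
// assertion prints a short message instead of full node objects.
func formatSummaries(nodes []nodeSummary) (string, error) {
	b, err := json.Marshal(nodes)
	if err != nil {
		return "", err
	}
	return string(b), nil
}
```

The resulting string can be passed as the optional description argument of a Gomega assertion, which keeps the failure output small.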
@machi1990 machi1990 force-pushed the e2e/y-stream-node-pool-upgrade-test branch from 1b7f8ae to ec8c17a Compare March 26, 2026 08:08
@miquelsi

Collaborator

/lgtm

@openshift-ci openshift-ci Bot added the lgtm label Mar 26, 2026
@openshift-ci

openshift-ci Bot commented Mar 26, 2026


[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: ahitacat, cgiradkar, JameelB, machi1990, miquelsi, mvacula02


@openshift-merge-bot openshift-merge-bot Bot merged commit 41d52e6 into Azure:main Mar 26, 2026
17 checks passed
@machi1990 machi1990 deleted the e2e/y-stream-node-pool-upgrade-test branch March 26, 2026 13:13

6 participants