fix: add serviceProviderCluster fixtures and fix test step ordering#5634
fix: add serviceProviderCluster fixtures and fix test step ordering#5634cgiradkar wants to merge 2 commits into
Conversation
|
[APPROVALNOTIFIER] This PR is NOT APPROVED This pull-request has been approved by: cgiradkar The full list of commands accepted by this bot can be found here. DetailsNeeds approval from an approver in each of these files:Approvers can indicate their approval by writing |
9ad11d5 to
7235a9b
Compare
Added missing serviceProviderCluster test fixtures required for nodepool version validation and corrected test step execution order to load these fixtures before nodepool creation attempts. AI assisted code Signed-off-by: Chetan Giradkar <cgiradka@redhat.com>
7235a9b to
314ba71
Compare
ServiceProviderCluster documents must use deterministic UUIDs as document IDs, generated by arm.ResourceIDStringToCosmosID(resourceID), not subscription IDs or legacy pipe-escaped resource paths. Why these values needed to be changed: 1. Document ID consistency: Cosmos DB lookups in production code use arm.ResourceIDStringToCosmosID() to generate document IDs. Test fixtures with incorrect IDs (subscription IDs, pipe-escaped strings, or random UUIDs) create documents that real CRUD operations cannot find, causing version validation and other lookups to fail. 2. Storage layer compatibility: The deterministic UUID scheme ensures the same resource ID always maps to the same document ID across all code paths (test helpers, production storage layer, backend controllers). Using wrong IDs breaks this contract. 3. Test helper portability: Test helpers were using concrete CosmosIntegrationTestInfo type assertions, which prevented them from working with mock storage implementations. Switching to the StorageIntegrationTestInfo interface allows tests to run with both real Cosmos and mock databases. Changes: - Fixed ServiceProviderCluster fixture files with wrong document IDs - Fixed subscription ID mismatches - Fixed cluster name mismatches - Updated test helpers to use interface LoadContent() instead of concrete types - Added resourceGroupName parameter to avoid hard-coded values AI assisted code Signed-off-by: Chetan Giradkar <cgiradka@redhat.com>
|
@cgiradkar: The following test failed, say
Full PR test history. Your PR dashboard. DetailsInstructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here. |
Summary
This PR fixes integration test failures caused by missing test data required for nodepool version validation. The core issue was that nodepool creation tests lacked serviceProviderCluster fixtures containing cluster version information, causing validation to attempt dereferencing nil pointers and crashing the test server.
Context: Why These Changes Are Needed
When nodepool version validation was introduced (to prevent version skew between nodepools and control plane), the validation logic began requiring serviceProviderCluster data to check active cluster versions. Many existing integration tests predated this requirement and never had this fixture data, causing them to fail with cryptic EOF errors when the validation code panicked.
Problem Investigation Summary
The EOF errors were masking the real issue - the test HTTP server was panicking and closing connections. The panic occurred at
validators.go:960when dereferencing a nil*semver.Versionpointer. This pointer was nil because:serviceProviderClusterfixtures entirely, ORChanges Explained
1. Added Missing Test Fixtures
What to review: New
03-loadCosmos-serviceProviderCluster/serviceProviderCluster.jsonfiles across multiple test scenarios.Why this approach: Each test scenario needs cluster version data that matches its test cluster. The fixture structure mirrors what the backend creates when clusters are created via HTTP API - it contains:
active_versionsarray with semantic version strings (e.g., "4.20.0")desired_versionin the spec2. Fixed Test Step Execution Order
What to review: Directory renames from
03-httpCreate-*to04-httpCreate-*(or05-in some cases).Why this matters: The test framework processes steps in lexicographical order. When both
03-httpCreate-nodepooland03-loadCosmos-serviceProviderClusterexisted, the former executed first because 'h' < 'l' in ASCII. This meant nodepool validation ran before the required cluster version data was loaded.Solution rationale: Rather than changing the test framework's ordering logic (which would be risky), I renumbered directories to enforce correct execution order: load serviceProviderCluster at step 03, then create nodepool at step 04+.
Tests with multiple nodepools: Some tests (like
create-current) have multiple04-httpCreate-*steps. This is intentional and safe - as long as they all run AFTER the03-loadCosmosstep, the alphabetical ordering within step 04 doesn't matter.3. Programmatic Fixture Creation in
version_compliance_test.goWhat to review: New
createServiceProviderClusterForVersionCompliancehelper function.Why this pattern: The
version_compliance_test.gocreates clusters programmatically rather than from fixtures. To maintain consistency with its testing approach, it also creates serviceProviderCluster documents programmatically rather than loading from fixture files.Alternative considered: I could have made this test load from fixture files like FrontendCRUD tests do, but that would break the test's design pattern of being fully self-contained and not relying on external JSON files.
Testing & Verification
Manual verification performed:
make test-integration- all tests passmissing-required-fields,invalid-version,create-current,delete-not-last(these were failing for the modified code for skew validation PR feat: add node pool vs control plane version skew validations during nodepool creation #4566 )Rollout Considerations
No production impact: These are test-only changes. No business logic modified, no API behavior changed.
Review Focus Areas