Validate and tighten macOS dependency cache gating in CI#539
Conversation
fedorov
left a comment
There was a problem hiding this comment.
Review
The fix is correct and directly addresses the root cause. Logic is sound across all cache-hit/cache_ok combinations — approve with minor suggestions.
What this gets right
CACHE_OK=falsedefault ensurescache_okis always well-defined even on cache miss, so both configure conditions evaluate correctly.- Uses
>> "$GITHUB_OUTPUT"(current non-deprecated syntax). - Validation step runs unconditionally, which is necessary since both configure steps depend on its output.
- Logging on validation failure is clear and actionable.
Suggestions
1. Stale comment — The comment above the key: line still references the old cache name pattern:
# macOS-<arch>-build-cache-dcmtk-itk-<hash>
Should read cache-dcmtk-itk-zlib-v2 to match the new env var.
2. Use $GITHUB_WORKSPACE instead of template substitution in shell — Using ${{ github.workspace }} inside run: works but $GITHUB_WORKSPACE is the idiomatic form and avoids template expansion inside scripts:
if [ ! -f "$GITHUB_WORKSPACE/dcmqi-build/ITK-build/ITKConfig.cmake" ]; then3. restore-keys removal tradeoff — Removing the fallback means any PR touching CMakeExternals/DCMTK.cmake, ITK.cmake, or zlib.cmake gets zero cache reuse and triggers a full superbuild. If build time is a concern, a version-scoped restore-key would still benefit from partial matches while keeping the namespace tight:
restore-keys: |
${{ runner.os }}-${{ matrix.arch }}-build-${{ env.cache-name }}-(with cache-name: cache-dcmtk-itk-zlib-v2 this is now safe since v2 entries will always be valid builds — the validation step handles any edge cases). That said, the conservative no-fallback approach is fine given the corruption history.
4. Cache bust on first run — Bumping to -v2 will cold-start all caches on merge. Expected and unavoidable; just worth knowing the first CI run after merge will be slow.
Overall: this is a focused, correct fix. Happy to see it merged.
ReviewThe fix is correct and directly addresses the root cause. Logic is sound across all cache-hit/cache_ok combinations. What this gets right
Suggestions1. Stale comment — The comment above the Should read 2. Use if [ ! -f "$GITHUB_WORKSPACE/dcmqi-build/ITK-build/ITKConfig.cmake" ]; then3. restore-keys: |
${{ runner.os }}-${{ matrix.arch }}-build-${{ env.cache-name }}-The validation step would catch any bad entries anyway. That said, the conservative no-fallback approach is fine given the corruption history. 4. Cold-cache on first run — Bumping to Overall: focused, correct fix. Happy to see it merged. |
|
The macOS CI job could take the cached dependency path even when restored cache contents were incomplete/corrupt, leading to missing
ITKConfig.cmakeand configure failure. This change makes cached-path selection contingent on cache usability, not cache-hit alone.Cache robustness
Validate restored cachestep that emitscache_ok=true|falseafter checking:dcmqi-build/ITK-build/ITKConfig.cmakedcmqi-build/DCMTK-build/DCMTKConfig.cmakedcmqi-build/zlib-install/includedcmqi-build/zlib-install/libConfigure-path gating
cache-hit == 'true'andcache_ok == 'true'Cache key hardening
cache-dcmtk-itk-zlibtocache-dcmtk-itk-zlib-v2restore-keysfallback to avoid reusing arbitrary older cache entriesOriginal prompt
The macOS CI workflow has a failing job in repository
QIICR/dcmqicaused by the dependency cache being treated as valid when the restored cache contents are incomplete or corrupt.Context:
C/C++ CI macOS.github/workflows/cmake-macos.ymlhttps://github.com/QIICR/dcmqi/actions/runs/26633738915/job/78517899566495da82985f89a039fd8816ed23b83396c5be809Observed failure:
DCMTK,ITK, andzlibdirectories and then configures dcmqi using the cached dependency path.WARNING: Cache entry deserialization failed, entry ignoredand later CMake fails insideSlicerExecutionModelbecauseITKConfig.cmakecannot be found.CMakeLists.txtrequiresfind_package(ITK NO_MODULE REQUIRED)andfind_package(DCMTK NO_MODULE REQUIRED), so the cached path is only valid if the restored build directories actually contain those package config files.Please create a pull request that updates
.github/workflows/cmake-macos.ymlto make the cache usage robust.Required changes:
${{ github.workspace }}/dcmqi-build/ITK-build/ITKConfig.cmake${{ github.workspace }}/dcmqi-build/DCMTK-build/DCMTKConfig.cmake${{ github.workspace }}/dcmqi-build/zlib-install/includeand${{ github.workspace }}/dcmqi-build/zlib-install/libexistcache_ok=true|falseConfigure dcmqi w/ cached DCMTK/ITK/ZLIBstep so it only runs when both:steps.cache-dcmtk-itk-zlib.outputs.cache-hit == 'true'Configure dcmqi w/o cached DCMTK/ITK/ZLIBstep so it runs when either:cache-dcmtk-itk-zlibto something versioned likecache-dcmtk-itk-zlib-v2restore-keysfallback, or otherwise avoid restoring arbitrary older caches from the same prefixPlease make the PR title concise and descriptive, explaining that it validates and tightens the macOS dependency cache in CI.
The following is the prior conversation context from the user's chat exploration (may be truncated):
User: Please find a solution for this failing job. Use the logs, job definition, and any referenced files where the failure occurred. Keep your response focused on the solution and include code suggestions when appropriate.
Assistant: The failure is caused by an invalid cache restore that reports a cache hit but does not restore a usable ITK package layout.
What failed
The macOS workflow takes the cached-deps path when
cache-hit == 'true'and configures dcmqi with:-DDCMTK_DIR=${{ github.workspace }}/dcmqi-build/DCMTK-build-DITK_DIR=${{ github.workspace }}/dcmqi-build/ITK-build-DZLIB_ROOT=${{ github.workspace }}/dcmqi-build/zlib-installBut during the build,
SlicerExecutionModelfails in CMake with:By not providing "FindITK.cmake"... asked CMake to find a package configuration file provided by "ITK"ITKConfig.cmake/itk-config.cmakeThat means the restored cache did not contain a valid
ITK-buildtree, even though the workflow treated it as usable. The log also shows:WARNING: Cache entry deserialization failed, entry ignoredSo the job is effectively using a broken/partial cache state.
Relevant files:
Why this happens
In
CMakeLists.txt, non-superbuild configuration requires a valid ITK package:find_package(ITK NO_MODULE REQUIRED)atCMakeLists.txt:276The cached workflow path assumes
${{ github.workspace }}/dcmqi-build/ITK-buildcontainsITKConfig.cmake, but the cache restore warning indicates that assumption is false for this run.Solution
Make the workflow validate the cache contents before taking the cached-deps path. If
ITKConfig.cmakeorDCMTKConfig.cmakeis missing, force the full superbuild instead of using the cache.Recommended workflow change
In
.github/workflows/cmake-macos.yml, replace the current cache-hit-only gating with a validation step.Add a validation step after
Report cache status