Prevent run matrix to stop if one job fails #47

Open
martinberoiz wants to merge 1 commit into main from fail-fast

@martinberoiz

Contributor

This closes #24.

@martinberoiz martinberoiz requested a review from duboism April 11, 2025 20:33
@martinberoiz

Contributor Author

This is mostly "vibe coding". I didn't fully test it.

@duboism

duboism commented Apr 11, 2025

Contributor

What I had in mind for #24 was to test all the notebooks and report all errors. After a few tests, I concluded that what you did here is a first step but not enough, since the "individual" action still stops after the first error.

I tried the following:

--- a/.github/workflows/run.yml
+++ b/.github/workflows/run.yml
@@ -18,6 +18,7 @@ jobs:
       run:
         shell: bash -l {0}
     strategy:
+      fail-fast: false
       matrix:
         python-version: [3.9]
         day: [1, 2, 3]
@@ -32,11 +33,13 @@ jobs:
         conda env create -f environment.yml
     - name: Test notebooks execution
       run: |
+        exit_code=0
         conda activate igwn-py39-lw
         for file in Tutorials/Day_${{ matrix.day }}/*ipynb; do
             echo "Checking ${file}";
             if ! ./tests/check_run.sh "${file}"; then
-                echo "::error file=${file},title=Error";
-                exit 1;
+                echo "::error file=${file},title=Error"
+                exit_code=1
             fi
         done
+        exit "${exit_code}"

but for some reason it didn't work as expected.

Any idea?
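
The accumulate-then-exit pattern in the diff above can be tried outside of CI. The following is a minimal sketch, not the repository's code: `check_run` is a hypothetical stand-in for `./tests/check_run.sh`, and `run_all` is an illustrative name. The annotation line uses the `::error file=...,title=...::message` form of GitHub's workflow commands, with the message placed after the second `::`.

```shell
#!/usr/bin/env bash

# Hypothetical stand-in for ./tests/check_run.sh:
# fails for any name containing "bad", succeeds otherwise.
check_run() {
    case "$1" in
        *bad*) return 1 ;;
        *) return 0 ;;
    esac
}

# Check every argument, remember whether any check failed,
# and return that status only after the whole loop has run.
run_all() {
    local exit_code=0
    local file
    for file in "$@"; do
        echo "Checking ${file}"
        if ! check_run "${file}"; then
            echo "::error file=${file},title=Error::Notebook check failed"
            exit_code=1
        fi
    done
    return "${exit_code}"
}

# Example: the second name fails, but the third is still checked.
run_all good.ipynb bad.ipynb other.ipynb || echo "At least one check failed"
```

Because the status is only returned after the loop, every file is checked and reported even when an earlier one fails.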

@martinberoiz

Contributor Author

I honestly just asked ChatGPT; you can see my conversation there.

You can try using continue-on-error: true as chad suggests.

@duboism

duboism commented Apr 11, 2025

Contributor

Yes, I remember that the doc was not super clear. I might think about this again soon.

@martinberoiz

Contributor Author

I just revisited this and now I understand the problem better. I tried the same code you tried:

        conda activate odw-py311
        exit_code=0
        for file in Tutorials/Day_${{ matrix.day }}/*ipynb; do
            echo "Checking ${file}";
            if ! ./tests/check_run.sh "${file}"; then
                echo "::error file=${file},title=Error"
                exit_code=1
            fi
        done
        exit $exit_code

I added the line a=1/0 to tutorial 1.1. My expectation was that (3.11, 1) would fail but that (3.11, 2) and (3.11, 3) would pass.
Instead, none of the tutorials passed on any day.

Days 2 and 3 also return exit code 1, for a reason I don't understand. I didn't even touch the notebooks for days 2 and 3.
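
One way to narrow down which file produces the non-zero status on a given day is to print the exit status of every check. This is a debugging sketch, not a diagnosis of the failure above. `report_statuses` is a hypothetical helper: it takes a directory followed by the checker command, and it also reports the case where the glob matches no notebook at all, since an unmatched pattern is otherwise passed to the checker as a literal string.

```shell
#!/usr/bin/env bash

# Hypothetical helper: run a checker command on every notebook in a
# directory and print one status line per file.
# Usage: report_statuses DIR CHECKER [CHECKER_ARGS...]
report_statuses() {
    local dir="$1"
    shift                        # the remaining arguments are the checker command
    local exit_code=0
    local found=0
    local status
    local file
    for file in "${dir}"/*ipynb; do
        # An unmatched glob is left as the literal pattern; skip it.
        [ -e "${file}" ] || continue
        found=1
        status=0
        "$@" "${file}" || status=$?
        echo "${file}: exit status ${status}"
        [ "${status}" -eq 0 ] || exit_code=1
    done
    if [ "${found}" -eq 0 ]; then
        echo "No notebooks found in ${dir}"
        return 2
    fi
    return "${exit_code}"
}
```

In the workflow step this could be called as `report_statuses "Tutorials/Day_${{ matrix.day }}" ./tests/check_run.sh`, assuming the paths from the snippet above; the per-file lines then show whether a day fails on a specific notebook or on every one.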

Development

Successfully merging this pull request may close these issues.

Don't stop CI actions on first error

2 participants