Conversation

tcornell-bus
Collaborator

@tcornell-bus tcornell-bus commented Sep 15, 2025

Add beaker.panic-watchdog option to hardware provision plugin.

When beaker.panic-watchdog is True, beaker-watchdog aborts the job and returns the host to Beaker when a kernel panic is detected. In this case, <watchdog panic="ignore"/> should NOT be in the recipe XML.

The default value is False, which sets beaker-watchdog to ignore kernel panics. In this case, <watchdog panic="ignore"/> should be in the recipe XML. It belongs in the general part of the recipe, not under hostRequires, because it is not a true hardware requirement for Beaker.
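The behaviour described above can be sketched in a few lines of Python. This is only an illustration, not the actual tmt or mrack code: `build_recipe` and its `panic_watchdog` parameter are hypothetical names, and the recipe is reduced to the two elements that matter here.

```python
import xml.etree.ElementTree as ET


def build_recipe(panic_watchdog: bool = False) -> ET.Element:
    """Hypothetical helper: build a minimal Beaker <recipe> element."""
    recipe = ET.Element("recipe", {"whiteboard": "", "ks_meta": ""})
    ET.SubElement(recipe, "hostRequires")
    if not panic_watchdog:
        # Default: ask beaker-watchdog to ignore kernel panics.
        # The element is a direct child of <recipe>, not of <hostRequires>.
        ET.SubElement(recipe, "watchdog", {"panic": "ignore"})
    return recipe


default_xml = ET.tostring(build_recipe(), encoding="unicode")
panic_xml = ET.tostring(build_recipe(panic_watchdog=True), encoding="unicode")
print(default_xml)
print(panic_xml)
```

With the default, the serialized recipe contains the watchdog element next to hostRequires. With `panic_watchdog=True`, the element is omitted entirely.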

  1. The corresponding mrack change has been merged: Add watchdog panic="ignore" option in mrack #4004
  2. Waiting on testing to complete in Bodhi for the mrack packages [1] [2].

Fixes: #3926

Assisted by: Cursor AI

Pull Request Checklist

  • implement the feature
  • write the documentation
  • extend the test coverage
  • update the specification
  • adjust plugin docstring
  • modify the json schema
  • mention the version
  • include a release note

@tcornell-bus tcornell-bus added step | provision Stuff related to the provision step plugin | mrack The beaker provision plugin labels Sep 15, 2025
@tcornell-bus tcornell-bus changed the title Draft: Add return-on-panic option to beaker provision plugin Add return-on-panic option to beaker provision plugin Sep 16, 2025
@thrix thrix added this to planning Sep 17, 2025
@github-project-automation github-project-automation bot moved this to backlog in planning Sep 17, 2025
@thrix thrix moved this from backlog to review in planning Sep 17, 2025
@psss
Collaborator

psss commented Sep 18, 2025

One of the last blockers for proper panic detection, raised today at the stakeholder meeting; tentatively proposing it for the next sprint.

@psss psss moved this from review to implement in planning Sep 25, 2025
@tcornell-bus tcornell-bus force-pushed the tcornell-beaker-watchdog branch from c1a208f to 47c5f4b Compare September 25, 2025 19:39
@tcornell-bus tcornell-bus changed the title Add return-on-panic option to beaker provision plugin Add beaker.panic-watchdog option to hardware provision plugin Sep 25, 2025
@tcornell-bus tcornell-bus added the area | hardware Implementation of hardware requirements label Sep 25, 2025
@tcornell-bus tcornell-bus moved this from implement to review in planning Sep 25, 2025
@tcornell-bus tcornell-bus requested review from psss and happz September 25, 2025 19:43
Collaborator

@skycastlelily skycastlelily left a comment

We'd better give karma to the mrack packages [1][2] so that they become stable.
This MR can't be released until the mrack packages are stable. Although they will automatically become stable after seven days, giving them karma sooner would allow reviewers to easily test the changes locally.

@tcornell-bus
Collaborator Author

This is the dry-run output for the first test in dry.sh, running tmt run --dry provision --how beaker --image Fedora-42 plan --default in the data directory (note that the command has no hardware settings):

<job retention_tag="audit" product="[internal]">
  <whiteboard>tmt-467-CDEOudXz</whiteboard>
  <recipeSet priority="Normal">
    <recipe whiteboard="" ks_meta="">
      <distroRequires>
        <and>
          <distro_name op="=" value="Fedora-42"/>
          <distro_variant op="=" value="BaseOS"/>
          <distro_arch op="=" value="x86_64"/>
        </and>
      </distroRequires>
      <hostRequires/>
      <repos/>
      <partitions/>
      <reservesys duration="86400"/>
      <watchdog panic="ignore"/>
      <task name="/distribution/dummy" role="STANDALONE">
        <params/>
      </task>
    </recipe>
  </recipeSet>
</job>

By default we want to ignore kernel panics. The watchdog element is deliberately not in hostRequires; it goes in the general section of the Beaker recipe, following the Beaker documentation.


This is the output of the command where we set beaker.panic-watchdog to True, tmt run --dry provision --how beaker --hardware beaker.panic-watchdog=True --image Fedora-42 plan --default:

<job retention_tag="audit" product="[internal]">
  <whiteboard>tmt-469-UMPtIVRu</whiteboard>
  <recipeSet priority="Normal">
    <recipe whiteboard="" ks_meta="">
      <distroRequires>
        <and>
          <distro_name op="=" value="Fedora-42"/>
          <distro_variant op="=" value="BaseOS"/>
          <distro_arch op="=" value="x86_64"/>
        </and>
      </distroRequires>
      <hostRequires>
        <and/>
      </hostRequires>
      <repos/>
      <partitions/>
      <reservesys duration="86400"/>
      <task name="/distribution/dummy" role="STANDALONE">
        <params/>
      </task>
    </recipe>
  </recipeSet>
</job>

There should be no watchdog element.
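The two dry-run outputs above could also be checked mechanically. The following is a minimal sketch using only the Python standard library. `watchdog_panic_setting` is a hypothetical helper, not part of tmt or mrack, and the two job strings are trimmed versions of the outputs shown above.

```python
import xml.etree.ElementTree as ET
from typing import Optional


def watchdog_panic_setting(job_xml: str) -> Optional[str]:
    """Return the panic attribute of the recipe-level <watchdog>, or None."""
    root = ET.fromstring(job_xml)
    recipe = root.find("./recipeSet/recipe")
    if recipe is None:
        raise ValueError("no <recipe> element found")
    # The element must never be nested under <hostRequires>.
    if recipe.find("./hostRequires//watchdog") is not None:
        raise ValueError("<watchdog> must not be under <hostRequires>")
    watchdog = recipe.find("./watchdog")
    return None if watchdog is None else watchdog.get("panic")


# Trimmed copy of the default dry-run output.
DEFAULT_JOB = """<job><recipeSet><recipe>
  <hostRequires/>
  <watchdog panic="ignore"/>
</recipe></recipeSet></job>"""

# Trimmed copy of the output with beaker.panic-watchdog=True.
PANIC_JOB = """<job><recipeSet><recipe>
  <hostRequires><and/></hostRequires>
</recipe></recipeSet></job>"""

print(watchdog_panic_setting(DEFAULT_JOB))  # ignore
print(watchdog_panic_setting(PANIC_JOB))    # None
```

The check deliberately looks only at direct children of the recipe, so a watchdog element placed under hostRequires by mistake is reported as an error rather than accepted.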

@teemtee teemtee deleted a comment from happz Sep 29, 2025
Collaborator

@skycastlelily skycastlelily left a comment

Basically LGTM:)

@happz happz moved this from review to merge in planning Oct 1, 2025
@happz happz force-pushed the tcornell-beaker-watchdog branch from ee6683d to f67894a Compare October 1, 2025 08:05
@happz happz added the ci | full test Pull request is ready for the full test execution label Oct 1, 2025
@happz
Collaborator

happz commented Oct 1, 2025

@tcornell-bus a test is failing, please take a look.

@tcornell-bus
Collaborator Author

I believe it is failing because the mrack package is still in testing on bodhi.

@psss
Collaborator

psss commented Oct 2, 2025

> I believe it is failing because the mrack package is still in testing on bodhi.

Let's give it some karma to move it forward.

@psss psss added this to the 1.59 milestone Oct 2, 2025
tcornell-bus and others added 7 commits October 3, 2025 08:07
to hardware provision plugin. When `beaker.panic-watchdog` is True, beaker-watchdog will abort the job and return the host to Beaker when a kernel panic is detected. Default value is False, setting beaker-watchdog to ignore kernel panics.
Add unit tests and beaker dry tests
Add release note and update spec
Simplify conditional
Fix "beaker watchdog" to be "beaker-watchdog" since the latter is in beaker docs
so it matches the schema option
Co-authored-by: Miloš Prchlík <mprchlik@redhat.com>
@psss psss force-pushed the tcornell-beaker-watchdog branch from 7f75a63 to 8061f99 Compare October 3, 2025 06:08
@psss
Collaborator

psss commented Oct 3, 2025

Packages are now in stable, let's give it another try!

@happz
Collaborator

happz commented Oct 3, 2025

Unrelated timeout, merging.

@happz happz merged commit 2eea6da into main Oct 3, 2025
25 of 26 checks passed
@happz happz deleted the tcornell-beaker-watchdog branch October 3, 2025 11:23
@github-project-automation github-project-automation bot moved this from merge to done in planning Oct 3, 2025
Labels
area | hardware Implementation of hardware requirements
ci | full test Pull request is ready for the full test execution
plugin | mrack The beaker provision plugin
step | provision Stuff related to the provision step
Projects
Status: done
Development

Successfully merging this pull request may close these issues.

Disable the Beaker watchdog in the mrack plugin
4 participants