Keep workers healthy when switching a site's PHP runtime #524
Merged
Switching a site between FPM and FrankenPHP left its worker units pointing at the previous container. Going back to FPM removed the per-site FrankenPHP container while the queue and schedule units still had BindsTo and exec'd into it, so the workers could not start and heal could not recover them, since heal only resets and restarts a unit and never rewrites one. The runtime switch now stops the site's running workers and recreates them against the new container in both directions. A second issue compounded it: WorkerStartForSite rewrote a changed unit but never asked systemd to re-read it, so enable and start acted on the stale cached unit. It now runs a daemon-reload whenever the unit content changed, before enabling and starting.
Switching a site between FPM and FrankenPHP left its worker units pointing at the previous container. Going back to FPM removed the per-site FrankenPHP container while the queue and schedule units still had BindsTo and exec'd into it, so the workers could not start and healing could not recover them, since the healer only resets and restarts a unit and never rewrites one. The runtime switch now stops the site's running workers and recreates them against the new container in both directions, so they always exec into the container that actually exists.
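The stop-then-recreate flow can be illustrated with a minimal Python sketch. It is not the project's code: `Runtime`, `worker_unit_name`, `render_worker_unit`, `switch_runtime`, the unit naming scheme, and the use of `docker exec` are all assumptions made for illustration. Only the `BindsTo=` relationship and the exec into the container come from the description above.

```python
from dataclasses import dataclass
from typing import Callable, Mapping


@dataclass(frozen=True)
class Runtime:
    """A PHP runtime and the container that serves it (hypothetical names)."""
    container_unit: str  # systemd unit that runs the container
    container_name: str  # name handed to the container runtime's exec


def worker_unit_name(site: str, kind: str) -> str:
    # Hypothetical naming scheme, e.g. "example-queue.service".
    return f"{site}-{kind}.service"


def render_worker_unit(site: str, kind: str, runtime: Runtime, command: str) -> str:
    """Render a worker unit bound to, and exec'ing into, the given container."""
    return "\n".join([
        "[Unit]",
        f"Description={kind} worker for {site}",
        f"BindsTo={runtime.container_unit}",
        f"After={runtime.container_unit}",
        "",
        "[Service]",
        # Assumption: workers run through `docker exec`; the source does not say.
        f"ExecStart=/usr/bin/docker exec {runtime.container_name} {command}",
        "Restart=always",
        "",
        "[Install]",
        "WantedBy=multi-user.target",
        "",
    ])


def switch_runtime(
    site: str,
    running_workers: Mapping[str, str],  # worker kind -> command to run
    new_runtime: Runtime,
    stop: Callable[[str], None],
    start: Callable[[str, str], None],
) -> None:
    """Stop the site's running workers, then recreate each against new_runtime."""
    # Stop everything first so no unit stays bound to a container that is going away.
    for kind in running_workers:
        stop(worker_unit_name(site, kind))
    # Recreate each worker with a unit that points at the container that now exists.
    for kind, command in running_workers.items():
        name = worker_unit_name(site, kind)
        start(name, render_worker_unit(site, kind, new_runtime, command))
```

The `stop` and `start` callbacks are injected so the sketch runs without systemd. Because the unit is re-rendered from the new runtime on every switch, the same function covers both directions.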
A second issue compounded it. WorkerStartForSite rewrote a changed unit but never asked systemd to re-read it, so enable and start operated on the stale cached unit and a re-pointed worker kept failing on the old container even though the file on disk was already correct. It now runs a daemon-reload whenever the unit content changed, before enabling and starting, which also makes any on-demand worker start robust to a content change.
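The ordering described here (rewrite, reload only on change, then enable and start) might look roughly like the following Python sketch. `write_unit_if_changed`, `start_worker`, and the injected `run` callback are hypothetical names, not the project's API. The `systemctl daemon-reload`, `enable`, and `start` subcommands are standard systemd commands.

```python
from pathlib import Path
from typing import Callable, List

# Hypothetical command runner; a real one might wrap subprocess.run.
Runner = Callable[[List[str]], None]


def write_unit_if_changed(path: Path, content: str) -> bool:
    """Write the unit file only when its content differs; return True if written."""
    if path.exists() and path.read_text() == content:
        return False
    path.write_text(content)
    return True


def start_worker(unit_path: Path, content: str, run: Runner) -> None:
    """Rewrite the unit if needed, reload systemd on change, then enable and start."""
    changed = write_unit_if_changed(unit_path, content)
    if changed:
        # Without this, systemd keeps acting on its cached copy of the old unit.
        run(["systemctl", "daemon-reload"])
    run(["systemctl", "enable", unit_path.name])
    run(["systemctl", "start", unit_path.name])
```

Skipping the reload when nothing changed keeps repeated starts cheap, while a changed unit is always re-read before systemd acts on it.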