Feature/local resource healthchecks #2932
Local sites have no Newt agent that can drive the targetHealthCheck
probes, so health checks were silently disabled for them. The Pangolin
server can reach those targets directly, so we now run the probes
ourselves on a small interval-based scheduler:
- New server/routers/target/localHealthChecker.ts: HTTP and TCP probes
with per-check interval, timeout, expected status, follow-redirects,
custom headers, and healthy/unhealthy thresholds. Status changes go
through the same fireHealthCheck{Healthy,Unhealthy}Alert helpers used
by the Newt-driven flow, so the UI, alerts, and history are unchanged.
- ws/messageHandlers.ts: start the local poller alongside the existing
Newt/Olm offline checkers (every build, including saas).
- updateTarget.ts: stop forcing hcHealth='unknown' for local sites; they
are now treated like newt sites (wireguard sites still pass through
the unknown branch since nothing probes them).
Re-uses the existing targetHealthCheck schema and TargetHealthCheck
type, so no migration is needed.
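
The healthy/unhealthy thresholds described above can be sketched as a small pure function. This is only an illustration of the general technique, not the code in localHealthChecker.ts: the names Health, Thresholds, ProbeState, and applyProbeResult are hypothetical, and the sketch assumes that a threshold counts consecutive probe results and that a status change is reported only when the stored health actually flips.

```typescript
// Hypothetical names throughout; this is not taken from localHealthChecker.ts.
type Health = "unknown" | "healthy" | "unhealthy";

interface Thresholds {
    healthyThreshold: number;   // consecutive successes needed to report healthy
    unhealthyThreshold: number; // consecutive failures needed to report unhealthy
}

interface ProbeState {
    health: Health;
    successes: number; // current run of consecutive successful probes
    failures: number;  // current run of consecutive failed probes
}

// A check that has never been probed (or was just re-enabled) starts here.
const initialState: ProbeState = { health: "unknown", successes: 0, failures: 0 };

// Fold one probe result into the state. `changed` is true only when the
// reported health flips, which is the point where an alert would be fired.
function applyProbeResult(
    state: ProbeState,
    ok: boolean,
    t: Thresholds
): { next: ProbeState; changed: boolean } {
    const successes = ok ? state.successes + 1 : 0;
    const failures = ok ? 0 : state.failures + 1;

    let health: Health = state.health;
    if (ok && successes >= t.healthyThreshold) health = "healthy";
    if (!ok && failures >= t.unhealthyThreshold) health = "unhealthy";

    return { next: { health, successes, failures }, changed: health !== state.health };
}
```

A scheduler tick would run the HTTP or TCP probe, pass its boolean outcome to applyProbeResult, and call the healthy or unhealthy alert helper only when changed is true. Keeping the transition logic free of I/O makes the threshold behaviour easy to unit test.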
The healthcheck configuration button was hardcoded to show only for Newt sites (siteType === 'newt'). This change adds 'local' to that condition, so healthchecks for local resources can be configured from both the resource proxy settings page and the resource creation page. The server-side local health checker (localHealthChecker.ts) was already implemented and operational: it probes local targets over HTTP or TCP directly from the Pangolin server and updates the hcHealth status in the database. The only missing piece was this UI gate.
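
As a rough sketch of the UI gate described above, the condition can be expressed as a single predicate so that both pages share it. The helper name canConfigureHealthCheck, the SiteType union, and the 'wireguard' literal are assumptions made for illustration; only the 'newt' and 'local' values appear in this PR's description.

```typescript
// Hypothetical type: 'newt' and 'local' come from the PR text,
// 'wireguard' is an assumed literal for WireGuard sites.
type SiteType = "newt" | "local" | "wireguard";

// Site types whose targets are actively probed (by Newt or by the server).
const HEALTH_CHECK_SITE_TYPES: ReadonlySet<SiteType> = new Set<SiteType>(["newt", "local"]);

// Returns true when the healthcheck configuration button should be shown.
function canConfigureHealthCheck(siteType: SiteType): boolean {
    return HEALTH_CHECK_SITE_TYPES.has(siteType);
}
```

Centralizing the check in one helper avoids the two pages drifting apart if another site type gains healthcheck support later.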
|
So pangolin itself does the healthchecks at that point or you leverage traefik's own built in healthchecks and query that? |
|
Currently, the Pangolin process runs the health checks. This is in line with how healthchecks are already done with Gerbil. This can be expanded to cover Wireguard hosts in the same way. |
|
Actually, Pangolin doesn't do the healthchecks; right now Newt does them. |
|
Hey thank you so much for this PR! I think right now we are going to keep the hc functionality in newt. I appreciate you opening this though! |
|
Hi @oschwartz10612, thanks for taking a look at this! I completely understand wanting to centralize health check functionality in Newt. However, I’d love for the team to reconsider this approach for local sites. This feature has been highly requested by the community in issues like #1835 and #1873. A major driving factor here is that many users experience performance bottlenecks with Newt (as tracked in #512). For environments where the Pangolin server can reach the target directly, allowing native server-side health checks removes that unnecessary overhead and provides a much more stable experience. Since the PR is already written and limits the scope strictly to local/server-reachable sites without breaking Newt's existing flow, would you be open to revisiting this as an optional/opt-in feature? |
|
1.19 is locked but willing to take a look for a different release |
|
Thank you. |
Community Contribution License Agreement
By creating this pull request, I grant the project maintainers an unlimited,
perpetual license to use, modify, and redistribute these contributions under any terms they
choose, including both the AGPLv3 and the Fossorial Commercial license terms. I
represent that I have the right to grant this license for all contributed content.
Description
This change adds healthcheck support for local sites in Pangolin.
How to test?
You need a Pangolin instance with at least one Local site configured — a site where the target is directly reachable by the Pangolin server over the network (not tunnelled through Newt).
Test 1: Healthcheck turns healthy
Test 2: Healthcheck detects a failure
Test 3: TCP mode
Repeat tests 1 and 2 using TCP mode instead of HTTP to verify the TCP probe path works independently
Test 4: Config options
Test 5: WireGuard sites are unaffected
On a WireGuard site, verify the healthcheck button still does not appear and no probing happens
Test 6: Newt sites are unaffected
On a Newt site, verify healthchecks still work exactly as before — the Newt agent should still be doing the probing, not the server
Test 7: Enable/disable toggle
Disable an active healthcheck and verify the status resets to Unknown and polling stops. Re-enable it and verify it resumes from Unknown → Unhealthy → Healthy.