Auto-recover subscription sync after a PDS migration #291

Merged
disnet merged 1 commit into main from radial/3mo6qwdv/fix-subscription-sync-silently-breaking-
Jun 13, 2026
disnet (Owner) commented Jun 13, 2026

Problem

A user migrated their PDS (changed PDS host) and their RSS subscriptions silently stopped syncing. Toggling Atmospheric sync off and back on appeared to fix it.

Root cause: the PDS host is resolved once at login into sessions.pds_url and did_pds_cache (24h TTL) and never re-resolved. PDSClient.request binds both the request URL and the DPoP htu claim to session.pdsUrl. After a migration the DID document's service endpoint changes but the session keeps the old host, so every authenticated PDS request fails (network / 401 / 403) and sync stops. Only the login/callback flow ever evicted + re-resolved the cache — the sync path and refreshSession reused the stale host.
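To make the binding concrete, here is a minimal sketch of how both the request URL and the DPoP `htu` value follow from the stored host. `SessionLike` and `buildTarget` are hypothetical names used only for illustration (not the project's actual code), and the host names are placeholders. The one outside fact relied on is that RFC 9449 defines `htu` as the request URI without query or fragment.

```typescript
// Hypothetical minimal session shape; the real session carries more fields.
interface SessionLike {
  pdsUrl: string;
}

// Derive the request URL and the DPoP `htu` claim from the stored host.
// Both values are pinned to session.pdsUrl, so a stale host poisons both.
function buildTarget(
  session: SessionLike,
  pathAndQuery: string,
): { url: string; htu: string } {
  const base = session.pdsUrl.replace(/\/+$/, ""); // drop trailing slashes
  const url = `${base}${pathAndQuery}`;
  const htu = url.replace(/[?#].*$/, ""); // htu excludes query and fragment
  return { url, htu };
}

// A session created before the migration still points at the old host.
const staleTarget = buildTarget(
  { pdsUrl: "https://old-pds.example/" },
  "/xrpc/com.atproto.repo.listRecords?repo=did:plc:abc",
);
console.log(staleTarget.url, staleTarget.htu);
```

Because both values come from the same stored field, refreshing tokens alone cannot help: the request still goes to, and the proof is still signed for, the old host.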

Phase 1 (checking whether the toggle really performs the recovery): /api/sync/full reads sessions.pds_url verbatim and never re-resolves it, so the toggle alone could not repair the host. The fix therefore makes recovery happen in the request path itself, covering both possible migration shapes: host-only (existing tokens still valid → full auto-recovery) and a full account migration (tokens rejected by the new auth server → an explicit re-auth signal).

Fix

One-shot stale-endpoint recovery centralized in PDSClient:

  • A new optional PDSRecoveryContext ({ env, sessionId }) enables self-healing. When a request fails in a way that looks like a moved host — network throw / 401 / 403 / invalid_token, excluding RecordNotFound, 429, 5xx, and use_dpop_nonce — the client evicts the DID cache, re-resolves the host from the DID doc, and only if the host actually changed persists it to the session, resets the per-host DPoP nonce, and retries once.
  • New updateSessionPdsUrl in oauth.ts persists the migrated host (storeSession's ON CONFLICT deliberately doesn't touch pds_url).
  • A one-shot guard + the "host unchanged ⇒ don't retry" check prevent any re-resolve loop and stop transient blips from being misread as migrations.
  • If the new host still rejects the tokens, the failure carries needsReauth, which flows through syncSubscriptions / /api/sync/full → the frontend, where Settings shows a calm Atmosphere-voice prompt to sign in again.
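The decision rules in the list above can be sketched as pure functions, separated from any I/O. This is an illustration under assumptions, not the code in this PR: `PdsFailure`, `looksLikeMovedHost`, `nextStep`, and `needsReauth` are hypothetical names, the failure shape is simplified, and the re-resolved host is passed in eagerly, whereas the real client would presumably re-resolve only after the failure has been classified.

```typescript
// Hypothetical failure shape for illustration; not the project's real types.
type PdsFailure =
  | { kind: "network" }
  | { kind: "http"; status: number; error?: string };

// Assumption: trailing slashes and letter case do not distinguish hosts.
const normalizeHost = (h: string): string =>
  h.replace(/\/+$/, "").toLowerCase();

// True when the failure is consistent with the PDS host having moved.
function looksLikeMovedHost(f: PdsFailure): boolean {
  if (f.kind === "network") return true;
  // Nonce challenges and missing records are never migration signals.
  if (f.error === "use_dpop_nonce" || f.error === "RecordNotFound") return false;
  // Rate limits and server errors are treated as transient.
  if (f.status === 429 || f.status >= 500) return false;
  return f.status === 401 || f.status === 403 || f.error === "invalid_token";
}

type RecoveryStep = "retry-on-new-host" | "give-up";

// Decide what to do after a failed request.
function nextStep(
  failure: PdsFailure,
  sessionHost: string,
  resolvedHost: string | null, // host from the freshly resolved DID doc
  alreadyRecovered: boolean, // one-shot guard
): RecoveryStep {
  if (alreadyRecovered) return "give-up";
  if (!looksLikeMovedHost(failure)) return "give-up";
  if (resolvedHost === null) return "give-up";
  // Host unchanged: a transient blip, not a migration, so do not retry.
  if (normalizeHost(resolvedHost) === normalizeHost(sessionHost)) return "give-up";
  return "retry-on-new-host";
}

// After the single retry: an auth rejection from the new host means the
// tokens are not accepted there, so the user has to sign in again.
function needsReauth(retryFailure: PdsFailure): boolean {
  return retryFailure.kind === "http" && looksLikeMovedHost(retryFailure);
}
```

Keeping the classification pure makes each rule in the list checkable in isolation; persisting the new host and resetting the DPoP nonce would happen only on the `"retry-on-new-host"` branch.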

Recovery is wired through syncSubscriptions / /api/sync/full using the request's session id, so the same sync the Settings toggle fires now self-heals automatically — no manual toggle needed.
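As a rough illustration of how the needsReauth flag could travel from the sync layer to Settings, the sketch below maps a sync outcome to an API response and then to a frontend decision. Everything here is assumed for the example: `SyncOutcome`, `SyncResponse`, `toSyncResponse`, `shouldPromptSignIn`, the status codes, and the field names are hypothetical and may not match the actual /api/sync/full contract.

```typescript
// Hypothetical result of one sync run.
interface SyncOutcome {
  ok: boolean;
  needsReauth?: boolean;
  error?: string;
}

// Hypothetical response shape for /api/sync/full.
interface SyncResponse {
  status: number;
  body: { success: boolean; needsReauth: boolean; error?: string };
}

// Map the sync outcome to a response without losing the re-auth signal.
function toSyncResponse(outcome: SyncOutcome): SyncResponse {
  if (outcome.ok) {
    return { status: 200, body: { success: true, needsReauth: false } };
  }
  const reauth = outcome.needsReauth === true;
  return {
    // Assumed status codes: 401 for re-auth, 502 for other upstream failures.
    status: reauth ? 401 : 502,
    body: {
      success: false,
      needsReauth: reauth,
      error: outcome.error ?? "sync failed",
    },
  };
}

// Frontend side: show the sign-in prompt only for the explicit signal.
function shouldPromptSignIn(body: SyncResponse["body"]): boolean {
  return !body.success && body.needsReauth;
}
```

The point of the explicit flag is that a generic failure and a rejected-token failure look different to the frontend, so only the latter asks the user to sign in again.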

Tests

  • pds-client.spec.ts — re-resolve + persist + retry + success; needsReauth with no retry loop; 5xx and RecordNotFound don't trigger recovery; unchanged re-resolved host doesn't retry; no recovery context is a no-op.
  • subscription-sync.spec.ts — sync-path migration: a seeded stale session recovers and pulls records without a toggle; new host still rejecting → needsReauth.

All affected backend suites pass; npm run check is clean on backend and frontend.

Subscriptions silently stopped syncing after a user migrated their PDS:
the host is resolved once at login into sessions.pds_url + did_pds_cache and
never re-resolved, so authenticated PDS calls kept hitting the dead/old host
and failing. Only login/callback evicted+re-resolved; the sync path didn't.

Add one-shot stale-endpoint recovery in PDSClient. When a request fails in a
way that looks like the host moved (network throw / 401 / 403 / invalid_token,
but NOT RecordNotFound / 429 / 5xx / use_dpop_nonce), and a recovery context
is supplied, the client evicts the DID cache, re-resolves the host from the
DID doc, and — only if the host actually changed — persists it to the session
(via new updateSessionPdsUrl, since storeSession's upsert doesn't touch
pds_url), resets the per-host DPoP nonce, and retries once. If the new host
still rejects the tokens, it surfaces needsReauth (no loop) so Settings can
prompt the user to reconnect in Atmosphere voice.

Wired through syncSubscriptions / /api/sync/full via the request's session id,
so the same sync the toggle triggers now self-heals with no manual toggle.

Regression coverage: PDSClient recovery unit tests (re-resolve+persist+retry+
success; needsReauth + no-loop; 5xx/not-found don't trigger; unchanged host
doesn't retry; no-context is a no-op) and sync-path tests on the migration.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
disnet merged commit 0e4b689 into main on Jun 13, 2026
6 checks passed
disnet deleted the radial/3mo6qwdv/fix-subscription-sync-silently-breaking- branch on Jun 13, 2026, 18:24