Tags: hookdeck/outpost

v1.0.6

Verified

This commit was created on GitHub.com and signed with GitHub’s verified signature.
feat(alert): default/disable semantics for consecutive-failure & exhausted-retries alerts (#964)

* feat(alert): add Settings + enable gates for consecutive/exhausted alerts

Introduce alert.Settings (the resolved, operational alert config) plus two
monitor gates: WithConsecutiveFailureEnabled and WithExhaustedRetriesEnabled.

Both default to true, so behavior is unchanged until a caller opts out. When
consecutive-failure alerting is gated off the monitor neither tracks failures
nor auto-disables; when exhausted-retries is gated off it never emits, even
with retries enabled. Extracts the consecutive-failure path into a helper to
keep the replay/ordering semantics identical on the enabled path.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
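
A minimal sketch of the two gates described above, written as functional options. Only the names WithConsecutiveFailureEnabled and WithExhaustedRetriesEnabled come from the commit message; the Monitor fields, the option signatures, and the RecordFailure/ShouldEmitExhausted helpers are hypothetical and exist only to illustrate the default-on, opt-out behavior.

```go
package main

// Monitor is a hypothetical, stripped-down stand-in for the alert monitor.
type Monitor struct {
	consecutiveFailureEnabled bool
	exhaustedRetriesEnabled   bool
	failures                  map[string]int // consecutive failures per destination
}

// Option configures a Monitor at construction time.
type Option func(*Monitor)

// WithConsecutiveFailureEnabled gates consecutive-failure tracking (signature assumed).
func WithConsecutiveFailureEnabled(enabled bool) Option {
	return func(m *Monitor) { m.consecutiveFailureEnabled = enabled }
}

// WithExhaustedRetriesEnabled gates exhausted-retries alerts (signature assumed).
func WithExhaustedRetriesEnabled(enabled bool) Option {
	return func(m *Monitor) { m.exhaustedRetriesEnabled = enabled }
}

// NewMonitor turns both gates on by default, so behavior only changes
// when a caller explicitly opts out.
func NewMonitor(opts ...Option) *Monitor {
	m := &Monitor{
		consecutiveFailureEnabled: true,
		exhaustedRetriesEnabled:   true,
		failures:                  map[string]int{},
	}
	for _, opt := range opts {
		opt(m)
	}
	return m
}

// RecordFailure returns the running consecutive-failure count for a
// destination. When the gate is off it tracks nothing and returns 0.
func (m *Monitor) RecordFailure(destinationID string) int {
	if !m.consecutiveFailureEnabled {
		return 0
	}
	m.failures[destinationID]++
	return m.failures[destinationID]
}

// ShouldEmitExhausted reports whether an exhausted-retries alert may be emitted.
func (m *Monitor) ShouldEmitExhausted() bool {
	return m.exhaustedRetriesEnabled
}
```

Under these assumptions, `NewMonitor()` behaves as before, while `NewMonitor(WithConsecutiveFailureEnabled(false))` neither counts failures nor has anything to auto-disable on.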

* feat(config): resolve alert config to alert.Settings with unset/empty/value rule

AlertConfig.ConsecutiveFailureCount and ExhaustedRetriesWindowSeconds become
*string so the parse layer can tell three states apart: unset uses the default
(100 / 3600), an empty string disables that alert dimension, and any other value
must parse to a non-negative integer.

AlertConfig.ToConfig resolves the raw values into the operational alert.Settings
(domain-owned, so nothing downstream imports config). Validate rejects malformed
values at startup. builder wires the resolved gates into the monitor and only
builds the exhausted-retries suppression window when enabled with a positive
window (0 = alert on every exhaustion).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
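
The unset/empty/value rule can be sketched as a single resolver over a *string. The Dimension type and both function names are hypothetical; the real alert.Settings shape is not shown in this commit message.

```go
package main

import (
	"fmt"
	"strconv"
)

// Dimension is a hypothetical resolved setting for one alert dimension.
type Dimension struct {
	Enabled bool
	Value   int
}

// resolve applies the three-state rule:
//   nil pointer  -> enabled with the default
//   empty string -> disabled
//   other value  -> must parse to a non-negative integer
func resolve(raw *string, def int) (Dimension, error) {
	switch {
	case raw == nil:
		return Dimension{Enabled: true, Value: def}, nil
	case *raw == "":
		return Dimension{}, nil
	}
	n, err := strconv.Atoi(*raw)
	if err != nil || n < 0 {
		return Dimension{}, fmt.Errorf("invalid alert value %q: want a non-negative integer", *raw)
	}
	return Dimension{Enabled: true, Value: n}, nil
}

// needsSuppressionWindow mirrors the builder rule for exhausted retries:
// a window is only built when the alert is enabled and the window is
// positive; 0 means "alert on every exhaustion".
func needsSuppressionWindow(d Dimension) bool {
	return d.Enabled && d.Value > 0
}
```

With the defaults from the message, `resolve(nil, 100)` yields an enabled dimension with value 100, and a pointer to `""` yields a disabled one.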

* docs(config): document alert default/disable behavior in config reference

Update the Alerts section of the self-hosting config reference for the new
unset/empty/value rule: ALERT_CONSECUTIVE_FAILURE_COUNT defaults to 100 (empty
disables), and document the previously-undocumented
ALERT_EXHAUSTED_RETRIES_WINDOW_SECONDS (default 3600, empty disables, 0 = no
suppression).

Also correct stale entries: drop the removed ALERT_CALLBACK_URL, fix the
ALERT_AUTO_DISABLE_DESTINATION default (false, not true), and fix the YAML
example key (alert, not alerts).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* docs(openapi): describe alert behavior in ManagedConfig

Document the unset/empty/value behavior for ALERT_CONSECUTIVE_FAILURE_COUNT,
ALERT_EXHAUSTED_RETRIES_WINDOW_SECONDS and ALERT_AUTO_DISABLE_DESTINATION in the
ManagedConfig schema. Descriptions only — the properties are already typed as
string. SDKs are regenerated from this schema at release time.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* fix(config): make empty-string alert disable work on the env-var surface

The *string representation had two problems found by manual QA:
1. caarlos0/env ignores a present-but-empty env var, so `ALERT_..._COUNT=`
   resolved to the default instead of disabling — the empty=off rule only
   worked via YAML, not env vars (the primary surface for the cloud product).
2. caarlos0/env crashes ("expected a pointer to a Struct") on any non-nil
   *string it walks, so setting these in a YAML config file would crash startup
   (env.Parse runs after the YAML load).

Replace *string with an OptionalString value type that implements both
TextUnmarshaler (bound by caarlos0/env as a scalar — no crash) and
yaml.Unmarshaler (so `key: ""` expresses the empty/off state). The one case
caarlos0/env cannot surface — a present-but-empty env var — is handled
explicitly via OSInterface.LookupEnv, which also gives env precedence over YAML.

Net: unset -> default, empty -> disabled, value -> value, identically on both
env and YAML, with env > yaml. Adds a full parse-path test covering the matrix
and precedence.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
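
A rough sketch of the OptionalString idea, using only the standard library. The field names and the applyEnv helper are assumptions; the YAML unmarshaler is omitted here, and the lookup parameter stands in for OSInterface.LookupEnv (it has the same signature as os.LookupEnv).

```go
package main

import "encoding"

// OptionalString distinguishes "never set" from "set to the empty string".
// Field names are hypothetical.
type OptionalString struct {
	Set   bool
	Value string
}

// UnmarshalText lets env/config parsers bind the type as a scalar.
func (o *OptionalString) UnmarshalText(text []byte) error {
	o.Set = true
	o.Value = string(text)
	return nil
}

// Compile-time check that the pointer type satisfies the interface.
var _ encoding.TextUnmarshaler = (*OptionalString)(nil)

// applyEnv covers the case the message calls out: a present-but-empty
// env var. If the variable exists at all, it wins over whatever was
// loaded earlier (for example from YAML), even when its value is "".
func applyEnv(o *OptionalString, key string, lookup func(string) (string, bool)) {
	if v, ok := lookup(key); ok {
		o.Set = true
		o.Value = v
	}
}
```

In this sketch a caller would pass `os.LookupEnv` as `lookup` in production and a map-backed fake in tests, which is one way to cover the env/YAML precedence matrix without touching the process environment.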

* fix(e2e): use OptionalString for ConsecutiveFailureCount in regression test

Missed call site when migrating AlertConfig fields to OptionalString;
the raw int assignment broke the cmd/e2e build.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

v1.0.5

fix(destregistry): treat event format errors as failed deliveries, not DLQ (#957)

* fix(destregistry): treat event format errors as failed deliveries, not DLQ

A Format/key-template failure (e.g. an S3 key_template referencing a field
absent from the event) returned a nil delivery, which the registry turned into
a nil attempt and the deliverymq handler classified as a PreDeliveryError →
nack → Pub/Sub DLQ. The failure was never logged, invisible to the customer,
and paged us instead of surfacing as an actionable delivery error.

Add destregistry.NewFormatErrorDelivery, returning a non-nil failed Delivery
plus an ErrDestinationPublishAttempt, so the registry records a failed attempt,
acks the message, and retries via the scheduler. The customer-facing response
is a generic message; the raw Go error stays on the error for logs/telemetry
and is not persisted on the attempt.

Apply it across all providers with a Format step: s3, sqs, azure_servicebus,
gcp_pubsub, webhook, webhook_standard (previously `return nil, err`) and
kinesis, kafka (previously nil-delivery ErrDestinationPublishAttempt).

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>

* test(e2e): regression for format error delivered as failed attempt, not DLQ

Standalone e2e test reproducing the production incident: an aws_s3 destination
whose key_template references a field missing from the event. Asserts the fixed
behavior end to end — nothing is written to S3, each delivery is recorded as a
failed attempt carrying the format error, and retries run on the normal schedule
and exhaust their budget rather than being nacked/dead-lettered.

Verified as a real guard: reverting the destawss3 fix makes this test fail
(0 attempts logged, message dead-lettered) instead of recording 3 attempts.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>

* refactor(destregistry): rename NewFormatErrorDelivery to NewFormatError

The helper returns the (*Delivery, error) pair a publisher returns on a format
failure, not just a delivery — name it accordingly. Behavior unchanged.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>

* docs(e2e): trim format-error regression test comment

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>

---------

Co-authored-by: Claude Fable 5 <noreply@anthropic.com>

sdks/outpost-typescript/v1.4.1

chore: 🐝 Update SDK - Generate OUTPOST-TS 1.4.1 (#963)

* `outpost.configuration.getManagedConfig()`:  `response` **Changed** (Breaking ⚠️)
* `outpost.configuration.updateManagedConfig()`: 
  *  `request` **Changed** (Breaking ⚠️)
  *  `response` **Changed** (Breaking ⚠️)

Co-authored-by: speakeasybot <bot@speakeasyapi.dev>

sdks/outpost-python/v1.4.1

chore: 🐝 Update SDK - Generate OUTPOST-PYTHON 1.4.1 (#962)

* `outpost.configuration.get_managed_config()`:  `response` **Changed** (Breaking ⚠️)
* `outpost.configuration.update_managed_config()`: 
  *  `request` **Changed** (Breaking ⚠️)
  *  `response` **Changed** (Breaking ⚠️)

Co-authored-by: speakeasybot <bot@speakeasyapi.dev>

sdks/outpost-go/v1.4.1

chore: 🐝 Update SDK - Generate OUTPOST-GO 1.4.1 (#960)

* `Outpost.Configuration.GetManagedConfig()`:  `response` **Changed** (Breaking ⚠️)
* `Outpost.Configuration.UpdateManagedConfig()`: 
  *  `request.Request` **Changed** (Breaking ⚠️)
  *  `response` **Changed** (Breaking ⚠️)

Co-authored-by: speakeasybot <bot@speakeasyapi.dev>

v1.0.4

ci(spec-sdk-tests-vs-release): test PR's SDK against latest released Outpost (#927)

* ci(spec-sdk-tests-vs-release): test PR's SDK against latest released Outpost

Closes #926.

Trigger: PRs touching sdks/outpost-typescript/** (where the Speakeasy
bot regen PRs land). Resolves the latest non-prerelease Outpost tag
dynamically via the GitHub releases API (the repo uses namespaced tags
like sdks/outpost-typescript/v1.3.0 for SDK releases, so the bare
vX.Y.Z pattern correctly picks out the Outpost release).

Question this answers: "Will the newly-regen'd SDK in this PR work
against the version of Outpost that customers are already running?"
Distinct from the existing spec-sdk-tests.yml workflow which asks
"does this PR's spec match this PR's server" — both are needed, neither
subsumes the other.

Job shape: pull hookdeck/outpost:<tag> as a docker image, run it
alongside the same service containers as the sibling workflow
(Postgres, redis-stack-server for RediSearch, RabbitMQ), build the
SDK from the PR with no regen step (the regen IS the PR), run the
contract suite.

Not dogfooded on this PR — the trigger filter only matches SDK paths,
which this PR doesn't touch. First real run will be on the next
Speakeasy bot regen PR after this lands.

* ci(spec-sdk-tests-vs-release): support sdk_version + outpost_version dispatch overrides

Lets you trigger the workflow from the Actions UI with optional inputs
for ad-hoc compat testing:
  sdk_version     pins the SDK to a specific release tag (or uses the
                  dispatch branch's contents if empty).
  outpost_version pins the server to a specific Outpost release (or
                  resolves the latest non-prerelease release if empty).

Both accept "1.3.0" or "v1.3.0" — leading "v" is normalized.

Inputs only affect workflow_dispatch runs; pull_request triggers
ignore them, so the gate behaviour for bot regen PRs is unchanged.

Single workflow rather than a sibling file — the job body is ~95%
identical between PR gate and compat testing; the only material
differences are two variables (which SDK, which Outpost).
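
The workflow itself is YAML and shell, but the two string rules it relies on (bare vX.Y.Z tags identify Outpost releases, and a leading "v" is normalized) can be sketched in Go for clarity. Both function names are hypothetical and this is not the workflow's actual implementation.

```go
package main

import (
	"regexp"
	"strings"
)

// bareRelease matches tags of the form vX.Y.Z with no namespace prefix
// and no prerelease suffix.
var bareRelease = regexp.MustCompile(`^v\d+\.\d+\.\d+$`)

// isOutpostReleaseTag reports whether a tag names an Outpost release
// rather than a namespaced SDK release such as
// sdks/outpost-typescript/v1.3.0.
func isOutpostReleaseTag(tag string) bool {
	return bareRelease.MatchString(tag)
}

// normalizeVersion accepts "1.3.0" or "v1.3.0" and always returns the
// "v"-prefixed form.
func normalizeVersion(input string) string {
	return "v" + strings.TrimPrefix(strings.TrimSpace(input), "v")
}
```

Under this sketch, an override of `1.3.0` and one of `v1.3.0` both resolve to the tag `v1.3.0`, and SDK tags never match the Outpost release pattern.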

* ci(spec-sdk-tests-vs-release): guard inputs.* references with workflow_dispatch event check

Defensive pattern flagged by Copilot review on #927: inputs.* context
is officially only populated on workflow_dispatch (and workflow_call).
Practically this works on PR events too — inputs.x evaluates to null
which compares as empty — but the explicit guard is unambiguous and
costs almost nothing.

Two changes:
* OVERRIDE env in the tag resolver uses the short-circuit ternary
  (github.event_name == 'workflow_dispatch' && inputs.x || '').
* SDK override step's if: prepends event_name == 'workflow_dispatch'
  so the inputs.sdk_version check is only evaluated on dispatch runs.

* ci(spec-sdk-tests-vs-release): don't self-trigger on workflow file edits

PRs that touch only this workflow file would fire it against main's
state — currently NEW tests + OLD SDK (regen still pending) + OLD
released Outpost — and fail at TS compile with 'type does not exist
in type DestinationUpdate'. That's predicted transitional-state noise,
not a real bug, but it leaves a permanently-red dogfood result that
future reviewers have to recognize as expected.

Drop the workflow file from its own trigger paths. The actual scenario
this workflow exists for — Speakeasy bot regen PRs — always touches
sdks/outpost-typescript/**, so the gate still catches them. Local
iteration on the workflow file itself uses 'gh workflow run --ref'.

Spotted while inspecting failing PR runs on #927.

sdks/outpost-typescript/v1.4.0

chore: 🐝 Update SDK - Generate OUTPOST-TS 1.4.0 (#938)

* `outpost.destinations.update()`:  `request.body` **Changed** (Breaking ⚠️)

Co-authored-by: speakeasybot <bot@speakeasyapi.dev>

sdks/outpost-python/v1.4.0

chore: 🐝 Update SDK - Generate OUTPOST-PYTHON 1.4.0 (#937)

* `outpost.destinations.update()`:  `request.body` **Changed** (Breaking ⚠️)

Co-authored-by: speakeasybot <bot@speakeasyapi.dev>

sdks/outpost-go/v1.4.0

chore: 🐝 Update SDK - Generate OUTPOST-GO 1.4.0 (#935)

* `Outpost.Destinations.Update()`:  `request.Body` **Changed** (Breaking ⚠️)

Co-authored-by: speakeasybot <bot@speakeasyapi.dev>

v1.0.3

fix: increase consumer error tolerance for transient infra outages (#900)

Previously the consumer gave up after 5 consecutive receive errors with
a 5s backoff cap (~3s total tolerance), permanently killing the worker
with no recovery path. A brief broker hiccup (e.g. GCP OAuth/DNS blip,
managed broker restart) was enough to take down logmq/deliverymq workers
across deployments until containers were manually restarted.

Mirrors the same fix applied to the retrymq scheduler in #881. Increase
to 10 errors with 15s backoff cap (~1 min tolerance window).

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
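
The tolerance figures above can be reproduced with a small calculation, assuming an exponential backoff that starts at 200ms, doubles after each error, is clamped to the cap, and that the consumer gives up on the final error without sleeping again. The starting delay and the doubling policy are assumptions chosen because they match the quoted numbers; the commit message does not state them.

```go
package main

import "time"

// toleranceWindow sums the backoff sleeps a consumer would perform
// before giving up on the maxErrors-th consecutive error. It sleeps
// after each of the first maxErrors-1 errors, doubling the delay each
// time and clamping it to maxBackoff.
func toleranceWindow(maxErrors int, base, maxBackoff time.Duration) time.Duration {
	var total time.Duration
	delay := base
	for i := 1; i < maxErrors; i++ {
		if delay > maxBackoff {
			delay = maxBackoff
		}
		total += delay
		delay *= 2
	}
	return total
}
```

Under these assumptions, 5 errors with a 5s cap gives 0.2 + 0.4 + 0.8 + 1.6 = 3s, and 10 errors with a 15s cap gives 55.4s, which is consistent with the "~3s" and "~1 min" figures in the message.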