Skip to content

feat(api): adopt RFC 9457 problem+json as the canonical HTTP error format#1442

Open
JesseTheRobot wants to merge 3 commits into
masterfrom
feat/http-problem-details-foundation
Open

feat(api): adopt RFC 9457 problem+json as the canonical HTTP error format#1442
JesseTheRobot wants to merge 3 commits into
masterfrom
feat/http-problem-details-foundation

Conversation

@JesseTheRobot

@JesseTheRobot JesseTheRobot commented Jun 4, 2026

Copy link
Copy Markdown
Member

Summary

Step 1 of the HTTP interface conventions work (design doc): every error response from the API server now has exactly one shape — RFC 9457 Problem Details, served as application/problem+json — regardless of where in the stack the error originated.

{
  "type": "https://docs.irys.xyz/api/errors/block-not-found",
  "title": "Block not found",
  "status": 404,
  "detail": "Block 0xabc123 was not found in the block tree or database",
  "code": "BLOCK_NOT_FOUND",
  "traceId": "0af7651916cd43dd8448eb211c80319c",
  "error": "Block 0xabc123 was not found in the block tree or database"
}

Changes

  • irys-types: new shared Problem/ProblemItem types (serde-only) — the server and every client import one shape. errors[] + RFC 6901 pointer carry multi-field validation failures.
  • ApiError: every variant maps to (status, type, title, code); error_response() renders problem+json. The ad-hoc (String, StatusCode) tuple errors inherit a code derived from their status. New Validation { errors } variant for field-level failures.
  • 5xx hygiene: detail is genericized on server errors so internal messages never leak; the real error is logged server-side keyed by traceId (W3C trace id from the active OTEL span, also surfaced as a traceparent response header on all responses).
  • Enforcement across all response origins (conventions.rs): JSON-body/path/query extractor failures and unmatched routes route through ApiError (fixes the JSON-decode handler that returned plain text); a normalize_problem middleware backstops everything else (framework 405/413, panics). The backstop only synthesizes for empty/text/* bodies — structured application/* bodies are never clobbered, preserving the /execution-rpc JSON-RPC passthrough verbatim.
  • Migration: a deprecated top-level error string mirrors detail for one release so legacy parsers of {"error": "..."} keep working.

Behavior changes (intentional)

  • All error bodies change from ad-hoc shapes to problem+json (legacy error field retained for the deprecation window).
  • Malformed path/query params now return 400 (INVALID_PATH_PARAMETER/INVALID_QUERY_PARAMETER) instead of actix's defaults (404/422). No in-repo consumer asserted the old codes.

Test plan

  • 19 new unit/integration tests: shape serialization/roundtrip, ApiErrorProblem mapping, 5xx non-leak, extractor/default-service/middleware behavior, proxy-passthrough preservation
  • cargo test -p irys-api-server --lib → 95 passed; cargo test -p irys-types --lib problem → 3 passed
  • cargo fmt + cargo clippy --tests clean on touched crates

Roadmap (documented, not in this PR)

  1. utoipa code-first OpenAPI: /v1/openapi.json + Swagger UI at /v1/docs, committed spec + CI drift check
  2. Annotate remaining routes + route-coverage contract test
  3. Collapse the TUI's duplicate client onto irys-api-client

🤖 Generated with Claude Code

Summary by CodeRabbit

  • New Features

    • RFC 9457-compliant Problem error responses across the API, with field-level validation details and stable machine-readable codes.
    • Automatic trace correlation: traceId in error bodies and traceparent header on all responses.
    • Response-normalizing middleware that converts non-problem error responses into application/problem+json and preserves protocol headers.
    • Unmatched routes and malformed/invalid request parameters produce standardized problem responses.
  • Documentation

    • Added HTTP interface conventions design spec describing the Problem contract, trace correlation, and OpenAPI/export expectations.

…rmat

Foundation for the HTTP interface conventions work (step 1 of 4):

- Add shared Problem/ProblemItem types to irys-types so the server and
  every client import one error shape
- Map every ApiError variant to (status, type, title, code) and render
  application/problem+json with a traceId (W3C trace id from the active
  OTEL span) plus a deprecated `error` mirror of `detail` for legacy
  parsers (removed in a future release)
- Genericize 5xx detail so internal error messages never leak; the real
  error is logged server-side keyed by traceId
- Route extractor failures (JSON body, path, query) and unmatched routes
  through ApiError. Note: malformed path/query params now return 400
  (INVALID_PATH_PARAMETER / INVALID_QUERY_PARAMETER) instead of the
  actix defaults 404/422
- Add normalize_problem middleware backstop: synthesizes problem+json
  for empty/text framework error bodies (405/413/panics) and stamps the
  traceparent header on all responses. Structured application/* bodies
  are never clobbered, preserving the /execution-rpc JSON-RPC passthrough
- Design doc with the steps 2-4 roadmap (utoipa OpenAPI generation +
  /v1/openapi.json + docs UI, remaining route annotation, TUI client
  collapse) in docs/superpowers/specs/
@coderabbitai

coderabbitai Bot commented Jun 4, 2026

Copy link
Copy Markdown
Contributor

Review Change Stack

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Repository UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: 10add97c-668f-4b52-9869-f2eddd7f72d7

📥 Commits

Reviewing files that changed from the base of the PR and between 9c042f3 and 617f766.

📒 Files selected for processing (1)
  • crates/api-server/src/conventions.rs

📝 Walkthrough

Walkthrough

This PR implements RFC 9457 (Problem Details) error handling for the Irys API server. It adds shared Problem and ProblemItem types to crates/types, extends ApiError with structured variants and RFC 9457 serialization, introduces a normalize_problem middleware that synthesizes RFC 9457 JSON for unstructured error responses, extracts OpenTelemetry trace context via new helpers, and wires everything into the Actix app.

Changes

RFC 9457 Problem Details Error Handling

Layer / File(s) Summary
Shared RFC 9457 Problem types
crates/types/src/problem.rs, crates/types/src/lib.rs
Problem and ProblemItem structs provide the canonical RFC 9457 serialization shape with required error metadata, optional validation errors array, and deprecated error mirror. Serde field renames (typer#type, traceIdtrace_id) and omission rules for optional fields are enforced. Tests validate serialization, pointer values, and deserialization from wire format.
OpenTelemetry trace context extraction
crates/api-server/src/trace.rs, crates/api-server/Cargo.toml
New trace.rs module exposes current_trace_id() and current_traceparent() to derive correlation identifiers from the active tracing span's OpenTelemetry context. tracing-opentelemetry dependency is added to support context extraction.
ApiError RFC 9457 serialization
crates/api-server/src/error.rs
ApiError enum gains five new variants (Validation, MalformedBody, InvalidPathParameter, InvalidQueryParameter, RouteNotFound) carrying structured details. New to_problem(trace_id) method and helpers (code(), title(), problem_type(), status_code_token(), problem_from_status()) generate RFC 9457 responses with stable codes/titles. ResponseError::error_response() now emits application/problem+json with serialized Problem. Tests validate RFC 9457 output fields, error array presence, and trace id inclusion.
Error conventions and normalize_problem middleware
crates/api-server/src/conventions.rs
Three extractor config functions (json_error_config, path_error_config, query_error_config) map parse/unmarshal failures to ApiError variants. normalize_problem middleware stamps traceparent headers on all responses and synthesizes RFC 9457 JSON for non-structured error bodies (status ≥ 400, non-application content type), preserving structured application bodies. route_not_found handler converts unmatched routes to ApiError::RouteNotFound. Integration tests validate malformed JSON/path/query handling, backstop normalization, and preservation of existing structured responses.
App-level integration and module exposure
crates/api-server/src/lib.rs
Exposes the new conventions module, updates Actix Web imports to remove direct response/error types, integrates extractor configs and normalize_problem middleware into app setup, and simplifies tracing imports to tracing::info.
HTTP interface conventions design specification
docs/superpowers/specs/2026-06-03-http-interface-conventions-design.md
Design document specifying RFC 9457 error contract, W3C Trace Context correlation via traceId/traceparent, enforcement via five response origins and normalizing middleware, OpenAPI generation/serving infrastructure with utoipa, and phased rollout plan covering foundation, utoipa scaffolding, route documentation, and client consolidation.

🎯 3 (Moderate) | ⏱️ ~25 minutes

🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title accurately summarizes the main change: adopting RFC 9457 problem+json as the canonical HTTP error format. It is concise, clear, and directly reflects the primary objective of the changeset.
Docstring Coverage ✅ Passed Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
📝 Generate docstrings
  • Create stacked PR
  • Commit on current branch
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch feat/http-problem-details-foundation

Comment @coderabbitai help to get the list of available commands and usage tips.

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 3

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@crates/api-server/src/conventions.rs`:
- Around line 95-103: The synthesized RFC 9457 response built in the
should_synthesize_problem path loses protocol headers because res.into_parts()
discards the original response; update the code after calling res.into_parts()
(where you create problem via problem_from_status and build the new HttpResponse
with PROBLEM_JSON) to copy headers from _discarded.headers() into the new
response.headers_mut() before stamp_traceparent, skipping or overriding
Content-Type and Content-Length (and any other intentionally replaced headers)
so protocol-significant headers from the original response are preserved when
returning ServiceResponse::new(req, response).
- Around line 158-159: Replace the non-standard #[actix_web::test] attributes on
async tests (e.g., the malformed_json_body_renders_problem test and the other
async test functions in this file currently annotated with #[actix_web::test])
with the repository-standard async test macro: use #[tokio::test] for plain
async tests, or #[test_log::test(tokio::test)] if you want test-logging/tracing
output; update the test attribute lines accordingly so the tests run under the
crate's preferred test harness.

In `@crates/api-server/src/lib.rs`:
- Line 5: The crate currently exposes `trace` publicly via `pub mod trace;`
which expands the public API unnecessarily; change the declaration to a private
module (`mod trace;`) so internal references like `crate::trace` in `error.rs`
continue to work while preventing external consumers from depending on
`irys_api_server::trace`.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: bfe464b2-576b-4714-913d-f12d7366ea37

📥 Commits

Reviewing files that changed from the base of the PR and between b099906 and 004d740.

⛔ Files ignored due to path filters (1)
  • Cargo.lock is excluded by !**/*.lock
📒 Files selected for processing (8)
  • crates/api-server/Cargo.toml
  • crates/api-server/src/conventions.rs
  • crates/api-server/src/error.rs
  • crates/api-server/src/lib.rs
  • crates/api-server/src/trace.rs
  • crates/types/src/lib.rs
  • crates/types/src/problem.rs
  • docs/superpowers/specs/2026-06-03-http-interface-conventions-design.md

Comment thread crates/api-server/src/conventions.rs
Comment on lines +158 to +159
#[actix_web::test]
async fn malformed_json_body_renders_problem() {

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

🧩 Analysis chain

🏁 Script executed:

# Check the file and read the relevant lines
head -250 crates/api-server/src/conventions.rs | tail -100

Repository: Irys-xyz/irys

Length of output: 3980


🏁 Script executed:

# Get more context around the lines mentioned
sed -n '150,250p' crates/api-server/src/conventions.rs

Repository: Irys-xyz/irys

Length of output: 4042


🏁 Script executed:

# Check if there are any other test macros used in the same file
rg -n '#\[.*test.*\]' crates/api-server/src/conventions.rs

Repository: Irys-xyz/irys

Length of output: 320


🏁 Script executed:

# Check CLAUDE.md or similar documentation
find . -name "CLAUDE.md" -o -name "README*.md" | head -5

Repository: Irys-xyz/irys

Length of output: 199


🏁 Script executed:

# Search for test macro usage patterns across the api-server crate
rg '#\[(actix_web::test|tokio::test|test_log::test)' crates/api-server/

Repository: Irys-xyz/irys

Length of output: 578


🏁 Script executed:

# Check if there are other actix-web integration tests in the codebase
rg 'test::init_service|test::TestRequest' --type rust | head -20

Repository: Irys-xyz/irys

Length of output: 1976


🏁 Script executed:

# Look at what the actix-web::test macro actually does by checking Cargo.toml
grep -A 5 -B 5 'actix-web' crates/api-server/Cargo.toml

Repository: Irys-xyz/irys

Length of output: 694


🏁 Script executed:

# Check CLAUDE.md for specific guidance on test macros
cat CLAUDE.md | grep -A 20 -B 5 "test"

Repository: Irys-xyz/irys

Length of output: 4007


🏁 Script executed:

# Check if the project has any documentation about actix-web testing conventions
rg -i "actix.*test|integration.*test" README.md CLAUDE.md

Repository: Irys-xyz/irys

Length of output: 39


🏁 Script executed:

# Check the test in error.rs that uses #[actix_web::test]
sed -n '1,50p' crates/api-server/src/error.rs | grep -A 30 "test"

Repository: Irys-xyz/irys

Length of output: 39


🏁 Script executed:

# Check the commented-out tests in block.rs to see why they're commented out
grep -B 5 -A 15 '#\[tokio::test\]' crates/api-server/src/routes/block.rs

Repository: Irys-xyz/irys

Length of output: 1706


🏁 Script executed:

# Check if there's any guidance about actix-web tests specifically
rg -i "actix" CLAUDE.md

Repository: Irys-xyz/irys

Length of output: 218


🏁 Script executed:

# Let's check what #[actix_web::test] actually is and whether it's compatible with standard tokio::test
rg -B 5 -A 10 'actix_web::test' crates/api-server/src/error.rs

Repository: Irys-xyz/irys

Length of output: 639


🏁 Script executed:

# Check if there are any other examples in the codebase using #[tokio::test] with actix test utilities
rg -B 2 '#\[tokio::test\]' crates/api-server/ -A 5

Repository: Irys-xyz/irys

Length of output: 1794


🏁 Script executed:

# Check Cargo.toml to understand actix-web version and features
cat crates/api-server/Cargo.toml

Repository: Irys-xyz/irys

Length of output: 1111


Use the repo-standard async test macro in these tests.

These are async tests but they use #[actix_web::test] instead of the crate convention. Prefer #[tokio::test], or #[test_log::test(tokio::test)] where you want tracing/warning-path logs surfaced during test failures.

Also applies to: 183-184, 195-196, 217-218, 237-238

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@crates/api-server/src/conventions.rs` around lines 158 - 159, Replace the
non-standard #[actix_web::test] attributes on async tests (e.g., the
malformed_json_body_renders_problem test and the other async test functions in
this file currently annotated with #[actix_web::test]) with the
repository-standard async test macro: use #[tokio::test] for plain async tests,
or #[test_log::test(tokio::test)] if you want test-logging/tracing output;
update the test attribute lines accordingly so the tests run under the crate's
preferred test harness.

Comment thread crates/api-server/src/lib.rs Outdated
@github-actions

github-actions Bot commented Jun 4, 2026

Copy link
Copy Markdown
Contributor

- Preserve protocol-significant headers (e.g. Allow on 405) when the
  normalize_problem backstop replaces an error body; only the
  body-descriptive headers (Content-Type/Length, Transfer-Encoding) are
  superseded by the synthesized Problem
- Make the trace module crate-private; it is an internal helper and was
  needlessly part of the public API

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@crates/api-server/src/conventions.rs`:
- Line 19: The header-preservation loop currently keeps body-encoding metadata
which causes clients to mis-decode the replacement problem+json body; update the
header-copying logic to also exclude CONTENT_ENCODING (alongside CONTENT_LENGTH,
CONTENT_TYPE and TRANSFER_ENCODING) so the new body isn't labeled as
gzip/deflate; locate the header-copying/remove section that references
CONTENT_LENGTH, CONTENT_TYPE and TRANSFER_ENCODING and add CONTENT_ENCODING to
the excluded set or call headers.remove(CONTENT_ENCODING) when replacing the
response body.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: 89309d0b-af14-42b6-ab00-f08e1d8f21ff

📥 Commits

Reviewing files that changed from the base of the PR and between 004d740 and 9c042f3.

📒 Files selected for processing (3)
  • crates/api-server/src/conventions.rs
  • crates/api-server/src/lib.rs
  • crates/api-server/src/trace.rs

error::Error,
http::{
StatusCode,
header::{CONTENT_LENGTH, CONTENT_TYPE, HeaderName, HeaderValue, TRANSFER_ENCODING},

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Drop Content-Encoding when replacing the body.

This loop still preserves body-encoding metadata from the discarded response. If the original 4xx/5xx carried Content-Encoding: gzip, clients will try to decode the new plain problem+json body as gzip and fail. Exclude CONTENT_ENCODING alongside the other body-descriptive headers.

Suggested fix
-        header::{CONTENT_LENGTH, CONTENT_TYPE, HeaderName, HeaderValue, TRANSFER_ENCODING},
+        header::{
+            CONTENT_ENCODING, CONTENT_LENGTH, CONTENT_TYPE, HeaderName, HeaderValue,
+            TRANSFER_ENCODING,
+        },
@@
-            if name != CONTENT_TYPE && name != CONTENT_LENGTH && name != TRANSFER_ENCODING {
+            if name != CONTENT_TYPE
+                && name != CONTENT_LENGTH
+                && name != TRANSFER_ENCODING
+                && name != CONTENT_ENCODING
+            {
                 response.headers_mut().append(name.clone(), value.clone());
             }

Also applies to: 102-107

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@crates/api-server/src/conventions.rs` at line 19, The header-preservation
loop currently keeps body-encoding metadata which causes clients to mis-decode
the replacement problem+json body; update the header-copying logic to also
exclude CONTENT_ENCODING (alongside CONTENT_LENGTH, CONTENT_TYPE and
TRANSFER_ENCODING) so the new body isn't labeled as gzip/deflate; locate the
header-copying/remove section that references CONTENT_LENGTH, CONTENT_TYPE and
TRANSFER_ENCODING and add CONTENT_ENCODING to the excluded set or call
headers.remove(CONTENT_ENCODING) when replacing the response body.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant