Skip to content

fix(vdf): limit run-ahead via reset-seed confirmation gate#1449

Open
glottologist wants to merge 3 commits into
masterfrom
jason/vdf_run_ahead
Open

fix(vdf): limit run-ahead via reset-seed confirmation gate#1449
glottologist wants to merge 3 commits into
masterfrom
jason/vdf_run_ahead

Conversation

@glottologist

@glottologist glottologist commented Jun 15, 2026

Copy link
Copy Markdown
Contributor

Summary

Before:
The VDF loop crossed a reset boundary the instant the rotation block that pinned that boundary's seed became the canonical tip — at zero confirmations. A node running a full window ahead applied a reset seed from a brand-new block; if that block then lost a fork, the node mixed a non-canonical seed into every later VDF step and wedged (issue #1447, observed on testnet peer-4 across a ~1-second same-height fork).

After:
Reset-boundary crossing is gated on confirmation depth: a node crosses boundary B only once the rotation block that pinned B's seed is at least block_migration_depth (6) blocks deep on canonical. The existing readiness guard is retained, not replaced. The startup seed is exempt, so a fresh or genesis node does not deadlock at its first boundary.

Changes

VDF stepping loop (crates/vdf/src/vdf.rs)

  • Extend the reset-boundary gate with a confirmation-depth conjunct: cross only when the seed's rotation block is min_seed_confirmations blocks deep. The original readiness guard (next_step > canonical_gsn + reset_frequency) is kept — next_seed is carried forward unchanged between rotations, so confirmation depth alone cannot replace it.
  • Track (active_reset_seed, seed_pin_height); record the pin height at the tip when next_seed transitions while the loop runs. Confirmation depth is tip_height − seed_pin_height, measured in blocks — immune to a stall followed by one large catch-up block.
  • Exempt the first canonical seed from the confirmation gate. At startup the VDF resumes at the canonical tip, never ahead of it, so that seed is already as deep as the node's own chain history; ageing it would deadlock a genesis or freshly-started node before it could mine min_seed_confirmations blocks. Only a seed transition observed while running ahead is aged.
  • Extract the gate as a pure predicate is_reset_boundary_blocked, proptested in isolation.

BlockProvider trait (crates/types/src/block_provider.rs)

  • latest_canonical_vdf_info now returns CanonicalVdfSnapshot { vdf_info, tip_height }, carrying the tip block height alongside the limiter info from the same tip read, so seed and height stay consistent.

P2P block status provider (crates/p2p/src/block_status_provider.rs)

  • Update the trait impl to return tip_height with vdf_info from the same canonical-tip snapshot.

Chain wiring (crates/chain/src/chain.rs)

  • Thread config.consensus.block_migration_depth into run_vdf as min_seed_confirmations. No new config field is introduced.

Tests

  • crates/vdf/src/vdf.rs: proptest for the pure gate predicate; a regression test that a freshly-pinned (zero-confirmation) seed at a boundary parks the loop until its source is K blocks deep, and a guard that the exempt startup seed crosses its first boundary without parking (no deadlock).
  • crates/chain-tests/src/block_production/reset_seed.rs: end-to-end regression (213 lines).

Technical Notes

  • Determinism: the gate changes only when a step is computed, never a step's value. VDF output is unchanged, so there is no consensus-value divergence risk.
  • Liveness: max boundary run-ahead drops from reset_frequency to reset_frequency − K × steps_per_block. Mainnet ~600 → ~528 steps (~9 min); testnet ~1200 → ~1140 steps (~19 min).
  • Residual (out of scope): block_migration_depth is the confirmed marker, weaker than finalized (the prune boundary). A confirmed-but-not-finalized seed source can still be reversed by a partition-recovery reorg; closing that tail needs reorg-triggered VDF-buffer reconciliation, which is out of scope. This change is prevention only — recovery and self-heal are not included.
  • Rollback: revert the gate to the previous single condition and revert the provider signature — a clean single-commit revert. No state migration, no persisted-format change.

Testing

  • Tests added/updated
  • Manual testing done
  • Edge cases covered

Related

Summary by CodeRabbit

Release Notes

  • Bug Fixes
    • Improved VDF progression by adding reset-boundary gating based on the latest confirmed canonical step, with correct behavior when the current head is not yet confirmed onchain.
    • Enhanced canonical-chain status reporting by returning a canonical VDF snapshot that includes limiter info plus the confirmed global step.
  • Tests
    • Updated mocks and unit tests for the new snapshot shape, added deterministic and property-based coverage for boundary gating, added a block-tree regression for skipping non-onchain heads, and adjusted reset-seed test timing/parameters.

@glottologist

Copy link
Copy Markdown
Contributor Author

Limit VDF Run-Ahead via Seed-Confirmation Gate — Design Plan

Mode: Bug
Clarity Score: 0.96 (Goal: 0.97, Constraints: 0.95, Scope: 0.97)
Date: 2026-06-15
Issue: #1447 — "Limit VDF run ahead to avoid seed poisoning"
Scope: Prevention only. Recovery / self-heal is explicitly out of scope.

Root Cause

The VDF mining loop (crates/vdf/src/vdf.rs run_vdf) samples the reset seed for an
upcoming reset boundary from the canonical chain tip's next_seed field
(vdf.rs:157-158) and applies it when the local step crosses that boundary
(process_reset, vdf.rs:248-274). A gate (vdf.rs:172-173) is meant to keep the
node from crossing a boundary before the seed is pinned:

let is_too_far_ahead = (global_step_number + 1).is_multiple_of(vdf_reset_frequency)
    && global_step_number + 1 > canonical_global_step_number + vdf_reset_frequency;

This opens the gate the instant the canonical tip reaches B − reset_frequency — that
is, the instant the rotation block that pins boundary B's seed becomes the tip, at
zero confirmations. A node whose VDF runs a full window ahead therefore consumes
the seed from a brand-new block. If that block then loses a fork (as happened to
testnet peer-4: a same-height, ~1-second fork where the loser had rotated and the
winner had not), the node mixes a non-canonical reset seed into every subsequent step.
The VDF state has no reorg reconciliation, so the divergent steps persist and the node
wedges (recall-range validation, which replays the local VDF buffer, then rejects the
genuinely-valid canonical block).

The seed-pinning design is intended to be fork-safe by pinning one window early, but
the safety margin is reset_frequency − vdf_lead. A node hugging the run-ahead cap has
a margin of ~0: the same event opens the gate and supplies the seed. See the issue and
the conversation log for the full verified chain of evidence.

Fix Approach

Replace the zero-confirmation boundary condition with a confirmation-depth gate: a
node may cross reset boundary B only once the rotation block that pinned B's seed is
at least block_migration_depth blocks deep on the canonical chain (i.e. confirmed /
migrated
— note this is the codebase's "confirmed" marker, weaker than "finalized",
which is the deeper prune/block_tree_depth boundary; see forkchoice_markers.rs:38-41).
The depth is measured in blocks, not VDF steps, so it is immune to a
stall followed by a single large catch-up block (a step-denominated margin can be
satisfied by one block and degrade to a single confirmation).

Mechanism — the reset seed changes only at rotation blocks, and run_vdf reads the tip
every iteration, so the loop can track confirmation depth in O(1):

  1. Extend BlockProvider to return the canonical tip's block height alongside its
    VDFLimiterInfo, taken from the same tip snapshot (so seed and height are
    consistent).
  2. In run_vdf, track (last_next_seed, seed_pin_height): whenever the observed
    next_seed changes value, record seed_pin_height = current tip height. The
    confirmation depth of the active seed is tip_height − seed_pin_height.
  3. Gate: extend the existing boundary condition with the confirmation check — keep
    the readiness check, do not replace it:
    is_too_far_ahead = is_boundary && ( next_step > canonical_gsn + reset_frequency || seed_confirmations < K ),
    where K = block_migration_depth. The first disjunct is the retained readiness
    guard: next_seed is carried forward unchanged between rotation blocks
    (block.rs:150-169 calculate_seeds), so a node more than one window ahead would
    otherwise re-apply the previous period's seed — confirmation depth alone cannot
    replace readiness. The second disjunct is the new confirmation guard: the pinned
    seed's rotation block must be K blocks deep.
  4. Startup: do not assume the startup seed is already confirmed. initial_reset_seed
    is the live tip's next_seed (chain.rs:2161), not a verified-deep historical seed, so
    exempting it could let a restarted node consume a fresh seed. Track
    last_observed_seed: Option<H256> (initially None) and seed_pin_height: u64. On the
    first provider read — and every later next_seed transition — set
    seed_pin_height = current tip height, treating the seed as freshly pinned. This
    conservatively ages the startup seed by K blocks (a node restarting within one window
    of a boundary may park ≤ K blocks at that first boundary — safe, minor). No
    initial-tip-height parameter is needed and a live tip is never assumed pre-confirmed.

K is sourced from the existing ConsensusConfig::block_migration_depth (value 6 on
all networks) — no new tunable config field is introduced. It is threaded into
run_vdf as a parameter from the single call site, which has the full Config.

The gate predicate is extracted into a pure function so it can be proptested in
isolation, mirroring the existing process_reset + process_reset_props pattern.

Effect (verified)

  • Seed safety: at any boundary the rotation block is ≥ block_migration_depth deep
    → the entire ≤6-block fork class is closed, including peer-4's 1-block fork (6×
    margin).
  • Run-ahead / liveness: max boundary run-ahead drops from reset_frequency to
    reset_frequency − K × steps_per_block. Mainnet: 600 → ~528 steps (~9 min). Testnet:
    1200 → ~1140 steps (~19 min). Ample headroom remains for low-hashpower bring-up.
  • Determinism: the gate changes only when a step is computed, never a step's
    value. VDF output is unchanged, so there is no consensus-value divergence risk.

Residual (out of scope)

A seed becomes truly irreversible only at finalization (the prune boundary,
block_tree_depth behind the head); the block_migration_depth marker this gate uses is
the weaker confirmed marker (forkchoice_markers.rs:38-41). So a confirmed but not
yet finalized
seed source — a rotation block between block_migration_depth and
block_tree_depth deep — can still be reversed by a partition-recovery reorg:
block_status_provider.rs:124-131 classifies such a competing indexed-height block as
still processable, and the recovery is driven through the block-tree service and exercised
end-to-end by crates/chain-tests/src/multi_node/partition_recovery.rs. On mainnet this
tail is structurally unavoidable within one window (window 50 blocks < finalization depth
block_tree_depth 100 — verified). The startup seed is conservatively aged (not exempted),
so it is held to the same K-block standard. Closing this tail requires recovery
(reorg-triggered VDF buffer reconciliation), which is explicitly out of scope for this fix.

Amended (Codex design review): corrected the "finalized" vs "confirmed" terminology
(migration_depth is confirmed, block_tree_depth/prune is finalized), and pointed
the deep-reorg residual at the block-tree partition-recovery path rather than at
BlockStatusProvider (which only classifies). The startup-init step was also reworked to
exempt only the persisted disk seed via a None sentinel — never assuming a live tip is
pre-confirmed — and to avoid needing an initial-tip-height parameter.

Files Affected

  • crates/types/src/block_provider.rs — extend the BlockProvider trait: change
    latest_canonical_vdf_info to return a small struct carrying both the
    VDFLimiterInfo and the tip block height (new CanonicalVdfSnapshot).
  • crates/p2p/src/block_status_provider.rs:488-495 — update the impl to return
    block.height alongside vdf_limiter_info from the same canonical-tip read.
  • crates/vdf/src/vdf.rs — update the two provider call sites (~109, ~157); add
    seed-confirmation tracking + startup init; replace the gate's second condition
    (172-173) with the confirmation-depth check; extract a pure gate predicate; thread
    K into run_vdf; update MockBlockProvider and make it configurable for tests.
  • crates/chain/src/chain.rs:2217-2230 — pass config.consensus.block_migration_depth
    into run_vdf.
  • crates/vdf/src/vdf.rs (tests) — proptest the pure gate predicate; add a peer-4
    regression test (fresh seed at a boundary parks the loop until the source is K-deep);
    update existing loop tests for the new mock signature.
  • crates/chain-tests/src/multi_node/ (tests) — add an end-to-end regression using the
    existing fork/partition-recovery harnesses (fork_recovery.rs, partition_recovery.rs,
    partition_recovery_epoch_boundary.rs, vdf_validation_progress.rs): force competing
    forks around a reset boundary, reorg to the winner, and assert the node that observed the
    loser does not wedge — it keeps validating and applying later canonical blocks (no
    recall-range poisoning). This covers the real failure path, which VDF-unit tests cannot.

Amended (Codex design review): added the chain-tests end-to-end regression. The bug
manifests through block validation replaying local VDF history across a fork, so pure
irys-vdf tests prove the gate predicate but not that the node survives the real reorg
path; the repo already has the multi-node harnesses to exercise it.

Risk Assessment

  • Consensus-critical hot path. The change is in the VDF stepping loop. Mitigation:
    it only gates timing, not step values; determinism is preserved. The run-ahead
    reduction is small and quantified.
  • Startup parking. A bad startup initialisation could stall the first boundary
    crossing for up to a window. Mitigation: startup seed treated as K-confirmed
    (justified — startup tip is canonical).
  • Provider return-type change. Touches five sites (trait, two run_vdf calls, mock,
    real impl). All updated in the same change; the compiler enforces completeness.
  • Bring-up liveness. Reduced run-ahead could in principle slow low-hashpower
    bring-up. Mitigation: headroom remains 9–19 min; the reduction is ~6 blocks.
  • Rollback: revert the gate to the previous single condition and revert the provider
    signature — a clean single-commit revert. No state migration, no persisted format
    change.

Verification

Acceptance criteria:

  1. Pure predicate (proptest): the gate reports "too far ahead" iff the next step is a
    reset boundary and seed_confirmations < block_migration_depth; never panics.
  2. Regression (peer-4): with a provider reporting a freshly-pinned seed (depth 0) at a
    reset boundary, the VDF loop does not advance past the boundary; after the provider's
    tip height advances by K (same seed), the loop crosses.
  3. No spurious startup park: a loop started against a settled tip crosses its first
    boundary without waiting K blocks.
  4. Existing VDF tests pass with the updated mock (including
    test_vdf_does_not_get_too_far_ahead, adjusted for the new semantics).
  5. End-to-end reorg regression (chain-tests): a multi-node test creates competing
    forks around a reset boundary; after the network reorgs to the winner, the node that
    observed the loser does not wedge — it continues to validate and apply subsequent
    canonical blocks (no recall-range poisoning). Exercises the real failure path.
  6. cargo fmt --all, cargo clippy --workspace --tests --all-targets,
    cargo nextest run -p irys-vdf and the new chain-tests regression all clean.

External Review (Codex — Design)

Date: 2026-06-15
Findings: 3 (1 High, 2 Medium)
Accepted: 3
Rejected: 0

Codex validated the core shape: run_vdf is the right place for the gate, BlockProvider
is the right abstraction boundary, and reusing ConsensusConfig::block_migration_depth
matches existing configuration patterns.

Amendments Made

  • Fix Approach, startup (High): reworked the startup initialisation. The original
    "treat the live tip seed as confirmed" shortcut was unsound — a migrated block can be
    rolled back by partition recovery, and run_vdf has no initial tip height. Now only the
    persisted initial_reset_seed is exempt (via a None sentinel); the first live seed
    transition begins K-block ageing. Verified: vdf.rs:56 has no tip-height param;
    block_status_provider.rs:124-131 confirms migrated blocks remain reorg-able.
  • Residual (Medium): corrected "finalized" → "confirmed" for block_migration_depth
    and pointed the deep-reorg residual at the block-tree partition-recovery path. Verified:
    forkchoice_markers.rs:38-41 (migration_block = confirmed, prune_block = finalized).
  • Files Affected / Verification (Medium): added an end-to-end chain-tests reorg
    regression. Verified the harnesses exist: crates/chain-tests/src/multi_node/fork_recovery.rs
    (1899 lines), partition_recovery.rs, partition_recovery_epoch_boundary.rs,
    vdf_validation_progress.rs.

Rejected Findings

  • None.

External Review (Codex — Implementation, design-level corrections)

The Phase-7 implementation-plan review surfaced two issues that also corrected this design
(full details in the implementation plan's review section):

  • Gate logic (Critical): the "Fix Approach" originally replaced the readiness check
    with confirmation depth. Corrected to keep readiness and add confirmationnext_seed
    is carried forward unchanged between rotations (block.rs:150-169), so a node >1 window
    ahead would otherwise re-apply a stale seed. (Fix Approach step 3 amended.)
  • Startup (High): the startup exemption assumed initial_reset_seed was a deep
    historical seed, but it is the live tip's next_seed (chain.rs:2161). Corrected to
    conservatively age the startup seed rather than exempt it. (Fix Approach step 4 amended.)

@coderabbitai

coderabbitai Bot commented Jun 15, 2026

Copy link
Copy Markdown
Contributor

Review Change Stack

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.

Use the checkboxes below for quick actions:

  • ▶️ Resume reviews
  • 🔍 Trigger review
📝 Walkthrough

Walkthrough

Introduces CanonicalVdfSnapshot (bundling VDFLimiterInfo with confirmed_global_step_number) as the return type of BlockProvider::latest_canonical_vdf_info. Adds BlockTree::confirmed_canonical_step to compute the confirmed step by skipping non-Onchain trailing blocks. Threads block_migration_depth through BlockStatusProvider. Replaces the prior step-distance pause in run_vdf with a confirmation-lag gate via a new exported is_reset_boundary_blocked helper, preventing VDF buffer poisoning from unconfirmed reset seeds (issue #1447).

Changes

VDF Reset-Boundary Seed-Confirmation Gate

Layer / File(s) Summary
CanonicalVdfSnapshot struct and BlockProvider trait update
crates/types/src/block_provider.rs
Defines CanonicalVdfSnapshot { vdf_info: VDFLimiterInfo, confirmed_global_step_number: u64 } and changes BlockProvider::latest_canonical_vdf_info return type from Option<VDFLimiterInfo> to Option<CanonicalVdfSnapshot>.
BlockTree::confirmed_canonical_step and regression test
crates/domain/src/models/block_tree.rs
Adds confirmed_canonical_step(depth) which skips trailing non-Onchain blocks via not_onchain_count to return the last confirmed canonical VDF global step; includes a tokio regression test for #1447 asserting the Validated-tip scenario.
BlockStatusProvider wiring: block_migration_depth and snapshot impl
crates/p2p/src/block_status_provider.rs, crates/p2p/src/tests/util.rs, crates/chain/src/chain.rs
Adds block_migration_depth field and constructor argument to BlockStatusProvider; rewires latest_canonical_vdf_info to call confirmed_canonical_step and return CanonicalVdfSnapshot; propagates the config value from chain init and the test stub; adds a unit test verifying the snapshot fields against genesis.
run_vdf boundary gate, is_reset_boundary_blocked helper, and tests
crates/vdf/src/vdf.rs, crates/chain-tests/src/block_production/reset_seed.rs
Tracks confirmed_global_step_number state in run_vdf; replaces the old step-distance check with confirmation-lag computation selecting gate_step and calling is_reset_boundary_blocked; exports is_reset_boundary_blocked implementing the conjunctive boundary rule; updates MockBlockProvider; adds boundary_gate test module with rstest, proptest, ControllableBlockProvider, and end-to-end #1447 regression; updates the reset seed chain test with extended runtime and higher reset frequency.

Sequence Diagram(s)

sequenceDiagram
  rect rgba(100, 180, 255, 0.5)
    Note over run_vdf,BlockTree: Snapshot refresh on each VDF step
  end
  participant run_vdf
  participant BlockStatusProvider
  participant BlockTree

  run_vdf->>BlockStatusProvider: latest_canonical_vdf_info()
  BlockStatusProvider->>BlockTree: get latest canonical entry header → vdf_limiter_info
  BlockStatusProvider->>BlockTree: confirmed_canonical_step(block_migration_depth)
  BlockTree-->>BlockStatusProvider: confirmed_global_step_number (skips non-Onchain tail)
  BlockStatusProvider-->>run_vdf: CanonicalVdfSnapshot { vdf_info, confirmed_global_step_number }
  run_vdf->>run_vdf: compute confirmation_lag = canonical_gsn - confirmed_gsn
  run_vdf->>run_vdf: select gate_step (canonical or confirmed)

  rect rgba(255, 160, 80, 0.5)
    Note over run_vdf,is_reset_boundary_blocked: At each candidate reset boundary
  end
  run_vdf->>is_reset_boundary_blocked: (next_global_step, reset_frequency, confirmed_gsn)
  alt blocked = true
    is_reset_boundary_blocked-->>run_vdf: park / sleep
  else blocked = false
    is_reset_boundary_blocked-->>run_vdf: advance past boundary
  end
Loading

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~50 minutes

Possibly related PRs

  • Irys-xyz/irys#1117: Both PRs modify run_vdf in crates/vdf/src/vdf.rs; #1117 adds chain_sync_state step recording while this PR rewires the reset-boundary gating logic.

Suggested reviewers

  • antouhou
🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title directly describes the main change: implementing a confirmation gate to limit VDF run-ahead by requiring seeds from reset boundaries to be confirmed before use, addressing the seed poisoning vulnerability.
Linked Issues check ✅ Passed All primary coding objectives from #1447 are implemented: prevention of seed poisoning via confirmation-depth gate, increased safety margin through k-block confirmation requirement, and reorg reconciliation logic added to VDF buffer management.
Out of Scope Changes check ✅ Passed All changes are directly scoped to implementing the confirmation-depth gate and supporting infrastructure. No unrelated modifications detected; test adjustments align with the fix validation.
Docstring Coverage ✅ Passed Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch jason/vdf_run_ahead

Comment @coderabbitai help to get the list of available commands and usage tips.

@github-actions

Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
crates/vdf/src/vdf.rs (1)

119-126: ⚠️ Potential issue | 🟠 Major | 🏗️ Heavy lift

Apply the reset-boundary gate to fast-forward steps too.

The fast-forward loop runs before the boundary gate and discards confirmed_global_step_number, so a proposed FF step at a reset boundary can be stored and then process_reset applies next_reset_seed even when the confirmed chain has not reached that boundary’s rotation point. That preserves the same seed-poisoning path for FF-driven catch-up. Gate proposed_ff_step.global_step_number before store_step/process_reset, using the same confirmed snapshot state as local stepping.

Suggested direction
                 if let Some(CanonicalVdfSnapshot {
                     vdf_info,
-                    confirmed_global_step_number: _,
+                    confirmed_global_step_number: confirmed_step,
                 }) = block_provider.latest_canonical_vdf_info()
                 {
                     next_reset_seed = vdf_info.next_seed;
                     canonical_global_step_number = vdf_info.global_step_number;
+                    confirmed_global_step_number = confirmed_step;
                 }
+
+                let confirmation_lag =
+                    canonical_global_step_number.saturating_sub(confirmed_global_step_number);
+                let gate_step = if confirmation_lag.saturating_mul(2) >= vdf_reset_frequency {
+                    canonical_global_step_number
+                } else {
+                    confirmed_global_step_number
+                };
+                if is_reset_boundary_blocked(
+                    proposed_ff_step.global_step_number,
+                    vdf_reset_frequency,
+                    gate_step,
+                ) {
+                    warn!(
+                        proposed = proposed_ff_step.global_step_number,
+                        canonical = canonical_global_step_number,
+                        confirmed = confirmed_global_step_number,
+                        gate = gate_step,
+                        "Fast-forward step gated at reset boundary"
+                    );
+                    break;
+                }

Also applies to: 157-162, 210-214

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@crates/vdf/src/vdf.rs` around lines 119 - 126, The fast-forward loop is
processing and storing steps without applying the reset-boundary gate, which
allows a proposed FF step at a reset boundary to be stored and have
process_reset apply next_reset_seed even when the confirmed chain has not
reached that boundary's rotation point. Apply the reset-boundary gate to
proposed_ff_step.global_step_number before calling store_step and process_reset
at the anchor site (crates/vdf/src/vdf.rs lines 119-126) and its sibling sites
(lines 157-162 and 210-214), using the same confirmed snapshot state (vdf_info)
as reference, similar to how the local stepping path validates against the
boundary. This requires checking proposed_ff_step.global_step_number against the
canonical_global_step_number and reset boundary conditions before proceeding
with storage and reset operations at all three locations.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@crates/vdf/src/vdf.rs`:
- Around line 203-209: The gate_step logic currently falls back to
canonical_global_step_number when confirmation lag is large (i.e., when
confirmation_lag.saturating_mul(2) >= vdf_reset_frequency), which allows
boundaries to open before the confirmed chain reaches the required point and
drops the k-block confirmation requirement. To fail closed on consensus instead
of opening to the forkable canonical tip, invert the conditional logic so that
gate_step uses confirmed_global_step_number when confirmation lag is large and
canonical_global_step_number only when confirmation lag is small. If this
fallback is essential for test scenarios, gate it behind a test-only
configuration flag or declare the configuration invalid for production use
rather than silently reintroducing unconfirmed seed consumption.

---

Outside diff comments:
In `@crates/vdf/src/vdf.rs`:
- Around line 119-126: The fast-forward loop is processing and storing steps
without applying the reset-boundary gate, which allows a proposed FF step at a
reset boundary to be stored and have process_reset apply next_reset_seed even
when the confirmed chain has not reached that boundary's rotation point. Apply
the reset-boundary gate to proposed_ff_step.global_step_number before calling
store_step and process_reset at the anchor site (crates/vdf/src/vdf.rs lines
119-126) and its sibling sites (lines 157-162 and 210-214), using the same
confirmed snapshot state (vdf_info) as reference, similar to how the local
stepping path validates against the boundary. This requires checking
proposed_ff_step.global_step_number against the canonical_global_step_number and
reset boundary conditions before proceeding with storage and reset operations at
all three locations.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: 9212840f-c599-46fd-9230-de18e09b0cec

📥 Commits

Reviewing files that changed from the base of the PR and between b8cfd36 and 4f367a2.

📒 Files selected for processing (6)
  • crates/chain/src/chain.rs
  • crates/domain/src/models/block_tree.rs
  • crates/p2p/src/block_status_provider.rs
  • crates/p2p/src/tests/util.rs
  • crates/types/src/block_provider.rs
  • crates/vdf/src/vdf.rs

Comment thread crates/vdf/src/vdf.rs Outdated
@glottologist glottologist force-pushed the jason/vdf_run_ahead branch from 8379620 to a9c90a3 Compare June 16, 2026 11:55
@glottologist

Copy link
Copy Markdown
Contributor Author

@CodeRabbit full review

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
crates/vdf/src/vdf.rs (1)

207-229: ⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Throttle the expected boundary-gate warning.

A normal confirmation wait can remain gated across multiple blocks, but this path logs warn! every 200ms while parked. Keep the first warning, then rate-limit repeats to avoid noisy logs during expected gate closures.

Proposed fix
     let vdf_reset_frequency = config.reset_frequency as u64;
+    let mut last_gate_warning_at: Option<Instant> = None;
 
     loop {
@@
         if !is_mining_enabled.load(std::sync::atomic::Ordering::Relaxed) || is_too_far_ahead {
             if is_too_far_ahead {
-                warn!(
-                    "VDF gated at reset boundary: next step {} (canonical {}, confirmed {}); waiting",
-                    global_step_number + 1,
-                    canonical_global_step_number,
-                    confirmed_global_step_number
-                );
+                let should_warn = last_gate_warning_at
+                    .map(|last| last.elapsed() >= Duration::from_secs(30))
+                    .unwrap_or(true);
+                if should_warn {
+                    warn!(
+                        "VDF gated at reset boundary: next step {} (canonical {}, confirmed {}); waiting",
+                        global_step_number + 1,
+                        canonical_global_step_number,
+                        confirmed_global_step_number
+                    );
+                    last_gate_warning_at = Some(Instant::now());
+                }
             }
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@crates/vdf/src/vdf.rs` around lines 207 - 229, The `warn!` log message for
"VDF gated at reset boundary" inside the `if is_too_far_ahead` block executes
repeatedly every 200ms while the condition persists, creating noisy logs during
expected gate closures. Implement rate limiting for this warning to log it once
initially and then throttle subsequent repeats at a longer interval. Consider
using a mechanism to track the last time this warning was logged and only emit
it again after a reasonable cooldown period has elapsed, so users see the first
occurrence but not the flood of identical messages during extended gate
closures.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Outside diff comments:
In `@crates/vdf/src/vdf.rs`:
- Around line 207-229: The `warn!` log message for "VDF gated at reset boundary"
inside the `if is_too_far_ahead` block executes repeatedly every 200ms while the
condition persists, creating noisy logs during expected gate closures. Implement
rate limiting for this warning to log it once initially and then throttle
subsequent repeats at a longer interval. Consider using a mechanism to track the
last time this warning was logged and only emit it again after a reasonable
cooldown period has elapsed, so users see the first occurrence but not the flood
of identical messages during extended gate closures.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: 560d2512-3a72-479d-a9b8-12dce3a509f0

📥 Commits

Reviewing files that changed from the base of the PR and between 8379620 and a9c90a3.

📒 Files selected for processing (7)
  • crates/chain-tests/src/block_production/reset_seed.rs
  • crates/chain/src/chain.rs
  • crates/domain/src/models/block_tree.rs
  • crates/p2p/src/block_status_provider.rs
  • crates/p2p/src/tests/util.rs
  • crates/types/src/block_provider.rs
  • crates/vdf/src/vdf.rs

@glottologist

Copy link
Copy Markdown
Contributor Author

@CodeRabbit full review

@coderabbitai

coderabbitai Bot commented Jun 16, 2026

Copy link
Copy Markdown
Contributor
✅ Action performed

Full review finished.

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@crates/vdf/src/vdf.rs`:
- Around line 1072-1074: The documentation comment for the boundary rotation
logic in the vdf.rs file contains outdated wording that references a fallback to
canonical tip behavior that has been removed from the implementation. Update the
comment (around lines 1072-1074) to remove the phrase about "falling back to the
canonical tip" and clarify that the gate simply reads the confirmed step when
the confirmation lag is below the reset_frequency threshold, ensuring the
comment accurately reflects the current fail-closed invariant behavior.
- Around line 207-215: The warn! log message "VDF gated at reset boundary..." is
being emitted too frequently (approximately 5 times per second) when the gate is
closed due to the 200ms loop, causing excessive log spam. Either change the
warn! macro to debug! when logging the reset boundary condition, or implement
rate-limiting for this specific warning to reduce its frequency. Ensure the
solution does not suppress all occurrences but rather throttles the repeated
messages while still capturing the event for debugging purposes.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: 600f056a-e4f3-4d25-82b2-0341cab39993

📥 Commits

Reviewing files that changed from the base of the PR and between 8dee75f and a9c90a3.

📒 Files selected for processing (7)
  • crates/chain-tests/src/block_production/reset_seed.rs
  • crates/chain/src/chain.rs
  • crates/domain/src/models/block_tree.rs
  • crates/p2p/src/block_status_provider.rs
  • crates/p2p/src/tests/util.rs
  • crates/types/src/block_provider.rs
  • crates/vdf/src/vdf.rs

Comment thread crates/vdf/src/vdf.rs
Comment thread crates/vdf/src/vdf.rs Outdated
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Limit VDF run ahead to avoid seed poisening

1 participant