Tags: JetBrains/youtrackdb

0.5.0-20260428.132443-44cf982-SNAPSHOT

Verified

This commit was created on GitHub.com and signed with GitHub’s verified signature.
Bump io.youtrackdb:gremlin-core from 3.8.1-fccfc5a-SNAPSHOT to 3.8.1-af9db90-SNAPSHOT (#1009)

#### Motivation:

Picks up the latest changes from the io.youtrackdb TinkerPop fork
(commit `af9db90`, replacing `fccfc5a`).

The `gremlin.version` property is shared by all `io.youtrackdb`
gremlin-* artifacts declared in `<dependencyManagement>` (gremlin-core,
gremlin-groovy, tinkergraph-gremlin, gremlin-test, gremlin-driver,
gremlin-server, gremlin-util, gremlin-console, gremlin-language), so a
single property bump moves them all in lockstep — which is required
because the fork ships them as a coordinated set.

#### Test plan:

- [ ] CI unit tests pass
- [ ] CI integration tests pass
- [ ] TinkerPop Cucumber feature tests pass in `core` and `embedded`
modules

0.5.0-20260423.204727-9802bc9-SNAPSHOT

Split /execute-tracks plan from implementation backlog (#1007)

## Motivation

`/execute-tracks` re-read the full `implementation-plan.md` at every
session start, including per-track `**What / How / Constraints /
Interactions**` subsections and track-level Mermaid diagrams whose
detail is only consumed in one phase of one track per session. On larger
plans this pushed main-agent context into the degradation zone before
any work began.

This PR splits that detail into a companion file,
`implementation-backlog.md`, and threads the current track's description
through the step file (`tracks/track-N.md`'s new `## Description`
section) during Phase A. Sessions still read the plan at startup, but
the plan is now a thin checklist for pending tracks. The backlog is read
only when a track enters Phase A or when a track is skipped. Plans
authored before the split (no backlog file) continue executing
unmodified via file-existence fallback — no migration required.
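
The file-existence fallback described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the actual rule lives in the workflow's Markdown files, and the helper name `description_source` and the single-directory layout are hypothetical. Only the two file names come from this PR.

```python
from pathlib import Path

PLAN_FILE = "implementation-plan.md"
BACKLOG_FILE = "implementation-backlog.md"


def description_source(plan_dir: Path) -> Path:
    """Return the file a track description should be read from.

    Hypothetical helper: if the backlog file exists (even when it holds
    nothing but a header), read from it; otherwise treat the plan as a
    legacy plan and read the description from the plan itself.
    """
    backlog = plan_dir / BACKLOG_FILE
    if backlog.is_file():
        return backlog
    return plan_dir / PLAN_FILE
```

Because the check is repeated per operation rather than cached, a legacy plan needs no marker or migration step: the absence of the backlog file is the signal.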

## Summary

- **Plan stays thin at startup** — pending-track entries carry title +
intro + Scope + Depends-on only; strategic sections (Goals, Constraints,
Architecture Notes, Decision Records, Component Map) and track status /
episodic memory are unchanged.
- **New `implementation-backlog.md`** (untracked, load-bearing marker)
holds `**W/H/C/I**` + optional track-level diagrams for pending tracks;
shrinks monotonically and stays on disk even when header-only so the
file-existence detection keeps evaluating true.
- **Description travels with the work** — Phase A atomically writes the
current track's description into `tracks/track-N.md`'s new `##
Description` section and then removes the section from the backlog;
Phase B / C sub-agents see it via the step file they already read.
- **Backward-compatible** — one-line file-existence detection
(per-operation) routes legacy plans through the old read-from-plan path
with no user action. The wire contract carries `backlog_path` with a
`(none — legacy plan)` sentinel when absent, so spawn-site shapes don't
fork.
- **Phase 2 orchestration** (`review-plan` SKILL) is authoritative for
path-passing across all four review spawn sites; Phase A consolidates
the six shared sub-agent inputs into one section to eliminate drift.
- **Phase 4 artifacts committed**: `docs/adr/thin-workflow/adr.md` and
`docs/adr/thin-workflow/design-final.md` — the only tracked workflow
files per the untracked-file invariant.

## Test plan

This PR touches only Claude workflow documentation (`.claude/skills/**`,
`.claude/workflow/**`) and ADR artifacts (`docs/adr/thin-workflow/**`).
No production Java code, no tests, no build files are modified.

- [x] Reviewed diff for unintended changes outside `.claude/**` and
`docs/adr/thin-workflow/**`
- [x] `adr.md` records all five original goals as landed, all five
original constraints as held, and no descoped goals
- [x] Legacy-plan path exercised by the file-existence fallback rule
(`conventions.md` §1.2) at every decision point
- [ ] Dry-run `/execute-tracks` on a fresh feature branch to confirm
startup context stays within the `safe` threshold on a representative
plan
- [ ] Dry-run `/execute-tracks` on a pre-split legacy plan to confirm
the fallback path still completes a track end-to-end

0.5.0-20260423.080514-cca739f-SNAPSHOT

YTDB-650: Back-reference hash join for MATCH patterns (#946)

#### PR Title:

`YTDB-650: Back-reference hash join for MATCH patterns`

#### Motivation:

PR #918 (hash-join-match-patterns) optimized MATCH sub-patterns that are
independent of `$matched` — NOT patterns, multi-branch joins, WHILE
hierarchies. But a different class of bottleneck remained:
**back-reference edges** like `where: (@Rid = $matched.person.@Rid)` and
`where: ($currentMatch NOT IN $matched.start.out('KNOWS'))`. These edges
depend on a previously-bound alias, so the engine re-traverses the
source vertex's link bag for every upstream row — O(upstream_rows ×
link_bag_size). In LDBC IC5, this means ~700K link bag entry scans of
which 96% are rejected; in IC10, ~39K redundant edge reads.

This PR converts those per-row scans into a one-time hash table build
per distinct binding + O(1) probes, via three recognized patterns:

| Pattern | Shape | Example | Build → Probe |
|---|---|---|---|
| **A** — single edge | `.out('E'){where: @Rid = $matched.X.@Rid}` | IC5 final edge | reverse link bag → `Set<RID>`, probe source ∈ set |
| **B** — outE+inV chain | `.outE('E'){...}.inV(){where: @Rid = $matched.X.@Rid}` | IC5 with edge properties | reverse link bag + optional index → `Map<RID, List<Edge>>`, probe source → edges |
| **D** — NOT IN | `where: ($currentMatch NOT IN $matched.X.out('E'))` | IC10 exclusion | forward link bag → `Set<RID>`, probe candidate ∈ set (anti-join) |
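
To make the build/probe idea concrete, here is a minimal Python sketch of Patterns A and D next to the nested-loop baseline. It is not the engine's code: the graph is modelled as plain dictionaries of adjacency lists, and the function names, the `(source, bound)` row shape, and the `reverse` helper are all assumptions made for illustration.

```python
def reverse(out_edges):
    """Derive reverse link bags (target -> sources) from forward ones."""
    in_edges = {}
    for source, targets in out_edges.items():
        for target in targets:
            in_edges.setdefault(target, []).append(source)
    return in_edges


def nested_loop_semi_join(rows, out_edges):
    """Baseline: scan the source's forward link bag for every upstream row."""
    kept = []
    for source, bound in rows:
        # Linear scan: O(link_bag_size) work per upstream row.
        if any(target == bound for target in out_edges.get(source, ())):
            kept.append((source, bound))
    return kept


def hash_semi_join(rows, in_edges):
    """Pattern A sketch: one set per distinct binding, then O(1) probes."""
    tables = {}
    kept = []
    for source, bound in rows:
        if bound not in tables:
            # Build once from the reverse link bag of the bound vertex.
            tables[bound] = set(in_edges.get(bound, ()))
        if source in tables[bound]:
            kept.append((source, bound))
    return kept


def hash_anti_join(rows, out_edges):
    """Pattern D sketch: keep candidates NOT in the bound vertex's out set."""
    tables = {}
    kept = []
    for candidate, bound in rows:
        if bound not in tables:
            # Build once from the forward link bag of the bound vertex.
            tables[bound] = set(out_edges.get(bound, ()))
        if candidate not in tables[bound]:
            kept.append((candidate, bound))
    return kept
```

The semi-join variants return the same rows as the baseline; the difference is that the scan cost is paid once per distinct binding instead of once per upstream row.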

Key design decisions:

- **Per-binding LRU cache** (capacity 256) avoids rebuilding the hash
table when the same binding recurs — critical for IC5 where ~58 distinct
persons produce ~5K triples each.
- **Sealed `SemiJoinDescriptor` interface** (records:
`SingleEdgeSemiJoin`, `ChainSemiJoin`, `AntiSemiJoin`) carries planner
metadata to runtime via `EdgeTraversal`, with exhaustive `switch`
dispatch in `BackRefHashJoinStep`.
- **Edge collapsing** (Pattern B): marks the `.outE()` edge as
`consumed` so `addStepsFor()` skips it — one `BackRefHashJoinStep`
covers both edges.
- **NOT IN stripping** (Pattern D): removes the NOT IN from the
`MatchStep`'s WHERE clause at plan time to avoid the expensive O(degree)
per-row evaluation; stores the condition in
`AntiSemiJoin.notInCondition` for runtime fallback.
- **Type-safe detection**: Pattern D uses `instanceof SQLNotInCondition`
with AST unwrapping instead of fragile `toString()` pattern matching.
- **Runtime fallback**: if the hash table build fails (threshold
exceeded, missing data), Patterns A/B fall back to `MatchEdgeTraverser`
nested-loop; Pattern D evaluates the stored NOT IN condition per row.
`BUILD_FAILED` sentinel is cached per binding to avoid re-attempting.

The optimization is purely additive — when any eligibility check fails,
the existing nested-loop path is used unchanged.
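
The per-binding cache and the cached failure sentinel can be sketched as follows. This is a simplified Python model under assumptions, not the `BackRefHashJoinStep` implementation: the class name `PerBindingCache`, the `make_builder` helper, and the "return `None` when the threshold is exceeded" convention are hypothetical. Only the capacity of 256, the LRU policy, and the idea of caching `BUILD_FAILED` come from the description above.

```python
from collections import OrderedDict

BUILD_FAILED = object()  # sentinel cached so a failed build is not retried


class PerBindingCache:
    """LRU cache of hash tables keyed by the bound vertex (hypothetical)."""

    def __init__(self, build, capacity=256):
        self._build = build          # binding -> set, or None on failure
        self._capacity = capacity
        self._entries = OrderedDict()
        self.builds = 0              # number of build attempts, for inspection

    def get(self, binding):
        if binding in self._entries:
            self._entries.move_to_end(binding)   # mark as most recently used
            return self._entries[binding]
        self.builds += 1
        table = self._build(binding)
        value = BUILD_FAILED if table is None else table
        self._entries[binding] = value
        if len(self._entries) > self._capacity:
            self._entries.popitem(last=False)    # evict least recently used
        return value


def make_builder(in_edges, threshold):
    """Build a set from the reverse link bag unless it exceeds the threshold."""
    def build(binding):
        bag = in_edges.get(binding, ())
        return None if len(bag) > threshold else set(bag)
    return build


def probe(cache, source, bound, fallback):
    """Probe the cached table, or use the nested-loop fallback on failure."""
    table = cache.get(bound)
    if table is BUILD_FAILED:
        return fallback(source, bound)
    return source in table
```

With a workload like the IC5 case described above, where a small number of distinct bindings recur across many rows, the build counter stays close to the number of distinct bindings as long as they fit in the cache.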

## Test plan

- [x] 17 new tests in `HashJoinPlannerIntegrationTest`: EXPLAIN plan
shape assertions + correctness for all three patterns, including
fan-out, cycles, fallback on threshold, and NOT IN stripping
verification
- [x] `MatchStatementExecutionTest` — 146 tests pass (0 regressions)
- [x] Run full `./mvnw clean package` to verify no cross-module
regressions
- [x] LDBC JMH benchmarks on Hetzner to measure IC5/IC10 improvement

0.5.0-20260421.201112-bd4a360-SNAPSHOT

Dispatch CI Fix Agent on macOS integration-test failures (#995)

## Summary
- Include `test-macos` in the `Dispatch CI Fix Agent` condition in
`maven-integration-tests-pipeline.yml` so macOS-only failures now
trigger `ci-failure-fix-agent.yml`, matching how Linux and Windows
failures are handled.
- Update stale comments that stated the agent is intentionally not
dispatched for macOS failures.
- No change needed in `maven-pipeline.yml` — its `notify-failure` uses
job-level `failure()` with `test-macos` already in `needs:`, so macOS
failures on develop pushes were already dispatching the agent there.

## Motivation
macOS regressions on the nightly integration-tests pipeline were not
surfacing to the CI Fix Agent, so failures on that leg had to be
investigated manually. The agent can still analyze failure logs even if
it cannot reproduce macOS-specific issues on its Linux runner, so giving
it a chance to triage is strictly better than silence.

## Test plan
- [ ] Verify that a macOS-only failure on the next nightly integration
run dispatches `ci-failure-fix-agent.yml`.
- [ ] Verify that Linux/Windows failure behavior is unchanged.

0.5.0-20260421.125440-43e4e2e-SNAPSHOT

Reduce AsyncReadCacheTestIT iterations to fit macOS arm CI watchdog (#998)

## Summary

- `testEvenDistribution` and `testZiphianDistribution` now run 8M ops
per worker × 8 workers (64M total page accesses) instead of 32M × 8 =
256M. Working set, cache cap, Zipfian skew, and thread count are
unchanged.
- Added a comment at both sites documenting the CI-watchdog rationale
and the preserved invariants (eviction pressure, access-pattern skew).

## Motivation

The `macOS arm - JDK 25 - temurin` leg of the [integration-tests
pipeline](https://github.com/JetBrains/youtrackdb/actions/runs/24672100968)
failed. Surefire reported a "forked VM terminated without properly
saying goodbye", but the `deadlock-report.txt` artifact shows the
real cause: the per-test watchdog in `JUnitTestListener` fired with
`TEST TIMEOUT (testEvenDistribution ... running for 3617 seconds)` and
called `Runtime.halt(1)`.

The thread dump shows forward progress on every worker — this is not a
deadlock, and `findDeadlockedThreads()` returned null. The test is
just doing far more work than the weakest CI runner can finish inside
the 60-minute watchdog:

| Runner | AsyncReadCacheTestIT (both tests) |
|---|---|
| Linux arm JDK 21 (self-hosted) | 1471 s (~25 min) ✅ |
| macOS arm JDK 25 (GitHub-hosted) | `testEvenDistribution` alone hit 60 min ❌ |

The GitHub-hosted Apple Silicon runner introduced in #987 is
substantially slower than the Hetzner self-hosted nodes. 256M page
accesses is much more than needed to surface any concurrency bug in
`LockFreeReadCache` (the cache still churns ~244× its capacity after
the cut), so the fix is to bring the iteration budget back in line with
what the slowest runner can finish.

The fix is in the test (not the production code or the CI config)
because:
- Production code is fine — no deadlock, no hang, no wrong assertion.
- Raising the watchdog on macOS would slow every future run without
adding coverage.
- 64M total ops is still two orders of magnitude above what any
deterministic assertion needs.
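
The iteration budget can be checked with a few lines of arithmetic. This is a back-of-the-envelope Python sketch, not code from the test: it reads "M" as 10**6 (an assumption) and derives the pre-cut churn figure from the statement that the cache capacity is unchanged.

```python
# Figures quoted in this PR; "M" is read as 10**6, which is an assumption.
WORKERS = 8
OPS_PER_WORKER_BEFORE = 32_000_000
OPS_PER_WORKER_AFTER = 8_000_000
CHURN_AFTER = 244  # "~244x its capacity after the cut"

total_before = WORKERS * OPS_PER_WORKER_BEFORE
total_after = WORKERS * OPS_PER_WORKER_AFTER
reduction = total_before / total_after

# Cache capacity is unchanged, so churn scales linearly with total accesses.
churn_before = CHURN_AFTER * reduction

print(f"before: {total_before:,} accesses (~{churn_before:.0f}x capacity)")
print(f"after:  {total_after:,} accesses (~{CHURN_AFTER}x capacity)")
print(f"reduction: {reduction:.0f}x fewer accesses")
```

The 4× reduction says nothing precise about wall-clock time on the macOS runner; it only shows how much eviction pressure remains after the cut.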

Verified locally: `./mvnw -pl core -am verify -P ci-integration-tests
-Dit.test=AsyncReadCacheTestIT` passes both tests in 126.6 s on x86
Linux. Spotless clean.

## Test plan

- [ ] macOS arm JDK 25 leg of the integration-tests pipeline completes
`AsyncReadCacheTestIT` inside the 60-min per-test watchdog.
- [ ] Linux x86/arm and Windows x64 legs still pass; no regression in
`AsyncReadCacheTestIT` on any platform.
- [ ] No new failures in neighbouring concurrency tests
(`LockFreeReadCacheConcurrentTestIT`,
`ConcurrentLongIntHashMap*Test`).

0.5.0-20260420.202159-4f12b23-SNAPSHOT

Improve JMH compare workflow: configurable base branch and history preservation (#990)

#### Motivation:

The JMH LDBC compare workflow had two usability limitations that this PR
addresses:

1. **Hard-coded base branch**: The fork-point was always computed
against `origin/develop`, so there was no way to benchmark a branch that
was forked from a different base (e.g. a feature branch or `main`). The
base branch is now a workflow input (default `develop`) and is plumbed
through to `jmh-compare.py` so the comparison comment shows the actual
base branch it was compared against.

2. **Overwritten comment history**: The previous logic looked for an
existing `<!-- jmh-ldbc-compare -->` comment and updated it in place,
which silently destroyed the history of prior benchmark runs on the PR.
Each run now posts a new comment so every benchmark result is preserved.
The HEADER marker is retained so future tooling can still locate the
benchmark comments (e.g. to collapse older ones into `<details>`
blocks).

Also includes a small robustness fix: the `$FORK_POINT` variable is now
quoted when passed to `git rev-parse --short`.

## Summary

- Add `base_branch` workflow_dispatch input (default `develop`); use it
in `git fetch` and `git merge-base`.
- Thread the resolved base branch into `jmh-compare.py` via a new
`--base-branch` flag and surface it in the comparison comment header.
- Always post a new benchmark comment on the associated PR instead of
updating the existing one, preserving run history. Log the created
comment id.
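
How the base branch might flow from the workflow input into the script can be sketched in Python. This is an illustration under assumptions, not the contents of `jmh-compare.py` or the workflow file: only the `--base-branch` flag, its `develop` default, the use of `git fetch` and `git merge-base`, and the `<!-- jmh-ldbc-compare -->` marker come from this PR. The function names, the exact git arguments, and the header wording are hypothetical.

```python
import argparse

HEADER = "<!-- jmh-ldbc-compare -->"  # marker kept so tooling can find comments


def parse_args(argv=None):
    """Parse the (hypothetical) subset of CLI flags relevant to this change."""
    parser = argparse.ArgumentParser(description="Compare JMH LDBC results")
    parser.add_argument(
        "--base-branch",
        default="develop",
        help="branch the fork point is computed against",
    )
    return parser.parse_args(argv)


def fork_point_commands(base_branch):
    """Return git invocations as argument lists, assuming a remote named origin.

    Passing arguments as a list (for example to subprocess.run) avoids the
    shell word-splitting issue that quoting "$FORK_POINT" fixes in the workflow.
    """
    return [
        ["git", "fetch", "origin", base_branch],
        ["git", "merge-base", "HEAD", f"origin/{base_branch}"],
    ]


def comment_header(base_branch, fork_point):
    """Build a comment header that names the base branch (wording assumed)."""
    return (
        f"{HEADER}\n"
        f"## JMH LDBC comparison against `{base_branch}` "
        f"(fork point `{fork_point}`)"
    )
```

For example, `parse_args(["--base-branch", "main"]).base_branch` yields `main`, and `fork_point_commands("main")` computes the merge base against `origin/main` instead of `origin/develop`.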

## Test plan

- [ ] Trigger the workflow manually on a branch with default inputs —
verify comparison runs against `origin/develop` and a new comment is
posted.
- [ ] Trigger the workflow with a non-default `base_branch` — verify the
fork-point is computed against that branch and the comment header shows
it.
- [ ] Re-run the workflow on the same PR — verify a second comment is
created rather than overwriting the first.

spotless-baseline

Fix Windows CI: fetch full history for Spotless ratcheting

The Spotless plugin's ratchetFrom requires the spotless-baseline git tag,
which is not available in a shallow clone (the default fetch-depth: 1).
Linux CI already uses fetch-depth: 0; apply the same to Windows.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>