Tags: sirixdb/sirix
Bump version to 1.0.0-beta2 (#1039)

Seven PRs merged since the sirix-1.0.0-beta1 tag:
- reader use-after-free + O(revisions) open-cost fix (#1031)
- HOT projection storage corruption + structural swizzle fix (#1032)
- crash-window/recovery/auto-commit/REST robustness wave (#1033)
- CodeFactor checkstyle config (#1034)
- vectorized sparse-field/FpCmp/mixed-key correctness (#1035)
- dense-anchor multi-key group-by (#1036)
- the first working GraalVM native-image write path (#1037)

brackit pinned to the released 1.0-alpha5.

Co-authored-by: Johannes Lichtenberger <johannes.lichtenberger@sirix.io>
SirixDB 1.0.0-beta1: V0 disk format, write-through commits, typed vectorized analytics, ops hardening (#1028)

* SirixDB 1.0.0-beta1: V0 disk format, write-through commits, typed vectorized analytics, ops hardening

Storage engine
- V0 disk layout finalized and contract-documented: superblocks, dual uber-beacon slots with checksummed trailers, 32-byte LE revision records; format contract + crash-window semantics in docs/DISK_FORMAT.md.
- Write-through commit protocol: revisions channel O_SYNC, beacon channel O_DSYNC (in-place slot writes are ordering + ack), ONE explicit fsync per commit (data tail write-ahead). Writer.writeUberPageReference now carries a durable-on-return contract; async pendingFsync machinery removed.
- Power-loss simulation harness (test-only, io/sirix/crash/): records every write/force, materializes torn/dropped/metadata-split crash states, cold-opens each one. Acked revisions always survive; rejections are clean.
- Interrupted-first-commit auto-heal: re-bootstraps only when provably nothing was committed; sector-loss and garbage-beacon states still fail loudly.
- Explicit rollback (truncateTo) now rewrites beacons + invalidates caches (was a silent no-op TODO).
- Legacy io/file (RandomAccessFile) backend removed.

Performance
- Session open flat in history: O(R) index-controller probe removed (was 50M access() syscalls over a 10k-commit build); 4.64ms → 0.18ms at 10k revs.
- Commit rate flat ~570/s at depth 10k (was declining 296→154/s); 7→4→1 explicit syncs per commit. Full causal chain in docs/BENCHMARKS.md.
- REST reads unserialized (executeBlocking ordered=false): c=16 throughput 1,042 → 18,361 req/s on a 1.9k-revision resource, p99 245 → 1.84ms.
- RevisionIndex: amortized append (shared capacity-doubling arrays + bounded binary search + deferred Eytzinger rebuild).

Query engine (with brackit 1.0-alpha4)
- Typed vectorized group-by: NumberRegion/BooleanRegion columnar page kernels (zone-map shortcut, popcount booleans), typed multi-key grouping via the new capability-gated SPI, type-probe routing. Numeric-key group-by at 1M: 18.3s interpreted → 1.4s; two-key 20.4s → 1.6s; results byte-identical.
- Wrong-results family fixed fail-closed: string-only group kernels silently returned EMPTY for numeric/boolean keys; negative-hash nameKeys ('active', 'amount') treated as missing by group-by/count-distinct/aggregates; sum over double columns truncated 14% short; integer avg emitted xs:double instead of xs:decimal. All gated by a new differential suite (19 shapes, vectorized must equal interpreted byte-for-byte).
- Diff endpoint: enriched include-data cache, atomic file writes, stampede guard (607ms → 153ms per revision-scrub step on the demo dataset).

Correctness
- Serializer sweep: fused-node startNodeKey wrapping (invalid JSON), multi-revision record-serializer envelope, key-cache cross-revision leak, surrogate pairing in the UTF-8 sink; XML attribute/content escaping round-trips.
- Cache architecture: random persisted database ids with collision re-keying, wipe-on-close removed, truncate-time invalidation.
- TestNG removed (4 classes converted to JUnit 5, dependency dropped).

Operations & usability
- Backup/restore: io.sirix.backup.BackupManager (writer-lock consistency, verify-on-restore) + CLI backup/restore commands + docs/BACKUP.md.
- Auth-lite for evaluation: SIRIX_AUTH_MODE=none (NoneAuth, loud banner, fail-fast Keycloak error otherwise) + docs/QUICKSTART.md (three verified setup paths).
- OpenAPI spec served at /openapi.yaml; /metrics fixed (per-route series were silently dropped) with SLO buckets + JVM binders; docs/API_COMPATIBILITY.md.
- Dockerfile: dead -Ddisable.single.threaded.check flag removed, AlwaysPreTouch removed, SIRIX_JAVA_OPTS passthrough.
- New docs: WHY-SIRIX.md (honest positioning incl. where it loses), COMPARISON_POSTGRES.md (durability-parity benchmark vs PG 17), STORAGE_COST.md (per-commit bytes fully attributed), BENCHMARKS.md.

Gates: sirix-core 9187/0, sirix-query 809/0 (incl. the new differential and power-loss suites), sirix-rest-api 161/0; brackit 1124/0.

* Add fresh-executor-per-iteration bench harness for engine comparisons

ScaleBenchMain's query phase shares one executor across iterations, so its min/avg figures partly measure the executor-level result caches. This harness creates a fresh SirixVectorizedExecutor per timed iteration (store, page caches and projection index stay shared, the moral equivalent of another engine's loaded table without result caching) for apples-to-apples per-query numbers against external systems.

* Add projection on/off flag to the comparison bench harness

* Declare $doc external in the comparison harness queries

* Projection-backed multi-key group-by: composite dict-id counting over columnar leaves

The generalized (multi-key / renamed-output) group-by previously always took the typed slot-walk kernel, correct but page-walk-bound (~100x a columnar sweep at 100M). When a covering projection carries every group field as a STRING_DICT column and the predicate (if any) is a supported conjunction, the grouping now runs as a composite dict-id sweep over the in-memory leaves: per-leaf packed cell ids, per-cell lazy composite-key compose (char-count encoded segments, matching the typed kernel's decoder), per-worker maps merged across leaf chunks. Falls back to the typed kernel on any mismatch (missing column, non-dict kind, unsupported predicate, kind drift).

Differential suite extended with projection-mode tests (two-key, predicated two-key, renamed single key via the multi path, and the mixed-type fallback) - 23/23, vectorized identical to interpreted.

* Projection-backed numeric aggregates + integrality-gated value-exact fast paths

sum/avg/min/max previously always page-walked the NumberRegions (~59s at 100M for a column DuckDB sweeps in 10ms). When a covering projection carries the field as NUMERIC_LONG, aggregates now fold the column directly over the in-memory leaves (full-word fast path per 64-row mask block), parallel across leaf chunks, for both the unpredicated and predicated entry points.

Correctness gate for everything value-exact: the projection builder truncates non-integral numbers into NUMERIC_LONG (Number#longValue), so the builder now tracks per-column integrality evidence, the registry handle exposes it, and (a) aggregate fast paths require a PROVABLY integral column, (b) numeric comparisons decline columns with KNOWN non-integral values (a `score > 2` would otherwise match 2.5 stored as 2); unknown provenance (re-encoded persisted leaves) keeps legacy predicate behavior and never serves the new aggregate path.

Bench projection now also covers city/amount/score so the composite multi-key kernel and the aggregate paths engage on the scale benchmark; differential suite extended to 27 cases (projection aggregates incl. the must-decline double column); all vectorized results identical to interpreted.

* Bench: projection persist toggle for force-rebuilt wider column sets

* Add DuckDB head-to-head comparison doc; link from WHY-SIRIX

* Fix WHY-SIRIX analytics paragraph: measured DuckDB-relative numbers, drop stale cache-artifact claim

* DuckDB comparison: add PGO native binary column (3 of 9 shapes ahead of DuckDB)

* Sync WHY-SIRIX analytics claims with the PGO-native results

* DuckDB comparison: add measured scan-path column and native-image findings

* Interim: pin brackit 1.0-alpha4-SNAPSHOT so CI validates ahead of the release pin

brackit master (sirixdb/brackit#100, merged) auto-publishes the snapshot; flip to the 1.0-alpha4 release coordinates once the release workflow runs.

* Pin brackit 1.0-alpha4 release (Central live)

---------

Co-authored-by: Johannes Lichtenberger <johannes.lichtenberger@sirix.io>
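The integrality gate from the beta1 notes (non-integral numbers truncated into NUMERIC_LONG via Number#longValue) can be illustrated with a minimal sketch. The class and method names below are hypothetical and are not SirixDB's projection builder; the sketch only shows that a truncated long column and the exact values can disagree on a comparison, and how per-column integrality evidence could be used to decline the fast path.

```java
import java.util.List;

/** Hypothetical illustration of an integrality gate; not SirixDB code. */
final class IntegralityGateSketch {

  /** Encodes values as longs; Number#longValue drops any fractional part. */
  static long[] encodeAsLong(List<? extends Number> values) {
    long[] column = new long[values.size()];
    for (int i = 0; i < column.length; i++) {
      column[i] = values.get(i).longValue();
    }
    return column;
  }

  /** Integrality evidence: true only if every value is a finite whole number. */
  static boolean isProvablyIntegral(List<? extends Number> values) {
    for (Number n : values) {
      double d = n.doubleValue();
      // NaN fails the first test (NaN != NaN); infinities are rejected explicitly.
      if (d != Math.rint(d) || Double.isInfinite(d)) {
        return false;
      }
    }
    return true;
  }

  /** Counts rows with value > threshold on the truncated long column. */
  static long countGreaterThan(long[] column, long threshold) {
    long hits = 0;
    for (long v : column) {
      if (v > threshold) {
        hits++;
      }
    }
    return hits;
  }

  /** Reference answer computed on the untruncated values. */
  static long countGreaterThanExact(List<? extends Number> values, double threshold) {
    long hits = 0;
    for (Number n : values) {
      if (n.doubleValue() > threshold) {
        hits++;
      }
    }
    return hits;
  }
}
```

On the values 1, 2.5 and 3, the truncated column and the exact evaluation give different counts for `> 2`, which is the kind of divergence a gate on integrality evidence is meant to prevent.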
release: 1.0.0-alpha22 - correctness, durability, and HFT hardening across core, query, and REST API (#1024)

* Correctness, durability, and HFT hardening across core, query, and REST API

Storage/durability (FileChannelWriter + FileWriter):
- Revisions-file records written at the deterministic slot FIRST_BEACON + 16*revision instead of append-at-size, so a failed commit or torn record can no longer shift every later revision offset
- Ordered dual-copy uber-page writes (secondary at 512 -> fsync -> primary at 0) and a data-tail fsync before publishing the uber reference
- truncateTo validates lengths/short reads and truncates the revisions file, invalidating the per-resource revision-file cache
- Uber-slot size guard (> UBER_PAGE_BYTE_ALIGN now fails fast)
- LZ4 decompression validates the embedded size header (256 MiB cap, MIN_VALUE/short-payload guards, size mismatch throws instead of WARN)
- ByteHandlerPipeline segment path layers handlers in stream order
- OSS io_uring backend removed; StorageType.IO_URING resolves an external provider (enterprise FFM implementation) and fails fast otherwise
- Dead MMFileWriter deleted (MMStorage writes through FileChannelWriter)

Transactions/hashing (JSON + XML):
- Bulk-insert hash maintenance is mode-coherent: auto-committing imports keep incremental per-insert adaptation by default; new opt-in ResourceConfiguration.repairBulkInsertHashes does a single postorder repair at the end for very large imports
- Rolling-hash updates chain correctly per ancestor level
- Fused OBJECT_NAMED_* nodes rejected as direct array/object-child inserts
- XML: duplicate-attribute insert returns existing node; ASFIRSTCHILD text-merge captures siblings before surgery; dewey-ID attribute/namespace numbering; hash-gated descendant decrements on remove
- Multi-resource rollback iterates the correct transaction list
- Resource removal cleans the name<->id bimap; max resource ID persisted before configuration serialization

Path summary / indexes:
- Array path entries de-indexed on removal (two-layer refcounts)
- Object-key rename keeps CAS/name/path index listeners notified (DELETE+INSERT bracketing) and is refcount-aware for shared path classes
- RBTree LOWER lookup descends left on exact match; listeners clone NodeReferences before mutation; empty-path guards on INSERT/DELETE
- CAS filter converts atomic values only on type mismatch (zero-conversion hot path); iterative stale-key skip in index streams (no recursion)

Query (sirix-query):
- jn:diff validates revisions; jn:open-revisions null-doc/duplicate handling
- sdb:is-deleted/sdb:item-history close transactions properly; item-history honors storeNodeHistory
- Six temporal predicates de-inverted on JsonDBObject/XmlDBNode
- Array wrappers no longer close the shared read transaction; negative index access fixed

REST API (sirix-rest-api):
- Query with nodeId 404s on a missing node instead of serializing the whole resource; XML responses use application/xml
- If-Match honored on update/delete paths; orphan cleanup on failed create; structured logging in create handlers; history/path-summary/head handler correctness fixes

Observability:
- sirix_allocator_physical_memory_bytes reads the ACTIVE allocator via Allocators.getInstance() (was pinned to the legacy pool allocator and flatlined under the default FrameSlotAllocator); getPhysicalMemoryBytes added to the MemorySegmentAllocator interface
- Commit-path TIL diagnostics downgraded WARN -> debug

Tests:
- New gates: PredicateOverUnwrappedArrayTest, NumericComparisonRegressionTest, ObjectNamedNumberNodeHashTest, FusedNumberEditPersistTest, ArrayIndexAccessOptimizerTest, and the opt-in Chicago scale gate ChicagoImportAndParallelTraversalTest (-Dsirix.chicago.run=true): 4.4 GiB import in 111 s, 382,159,848 nodes across 86 auto-commit epochs, 20 parallel DescendantAxis scans + PostOrderAxis walk all agree exactly
- Full suites green: sirix-core 9141/0, sirix-query 789/0
- brackit pinned to released 1.0-alpha3

* Restore create-handler branch order: body-less PUT /:database must create the database

The createMultipleResources-first reorder routed plain database creation (PUT /:database with no :resource) into createMultipleResources(), which invokes BodyHandler mid-handler and fails with "BodyHandler invoked after the request has ended" on routes without a route-level BodyHandler; every database create 500'd. Multipart bulk upload is served by the dedicated CreateMultipleResources handler on POST /:database, so the flag branch in this handler is unreachable dead code on current routes; document that instead of resurrecting it.

* Reduce method complexity flagged by static analysis

- XmlUpdate.update: delegate the inline legacy-ETag check to the shared checkHashCode precondition from AbstractUpdateHandler; this also makes XML updates honor the standard If-Match header like JSON updates
- JsonUpdate.update: extract the scalar-token insertion switch into a named insertScalar helper
- SirixVerticle.createRouter: extract the logout handler body into handleLogout

* build: bump version to 1.0.0-alpha22

---------

Co-authored-by: Johannes Lichtenberger <johannes.lichtenberger@sirix.io>
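The deterministic-slot idea in the alpha22 storage notes (a record for a revision lives at FIRST_BEACON + 16*revision rather than at the current end of file) can be sketched as follows. This is an illustration under assumptions: the value of FIRST_BEACON and the record layout (two longs) are placeholders chosen here, not taken from SirixDB.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;

/** Hypothetical sketch of fixed-slot revision records; not SirixDB code. */
final class RevisionSlotSketch {

  /** Assumed placeholder; the real constant's value is not stated in the notes. */
  static final long FIRST_BEACON = 1024L;

  /** Record width used by the alpha22 formula (16 bytes per revision). */
  static final int RECORD_BYTES = 16;

  /** The slot depends only on the revision number, never on the file size. */
  static long slotOffset(int revision) {
    return FIRST_BEACON + (long) RECORD_BYTES * revision;
  }

  /**
   * Writes one record at its fixed slot. The two-long layout (page offset,
   * timestamp) is an assumption made for this sketch.
   */
  static void writeRecord(FileChannel channel, int revision, long pageOffset, long timestamp)
      throws IOException {
    ByteBuffer record = ByteBuffer.allocate(RECORD_BYTES);
    record.putLong(pageOffset);
    record.putLong(timestamp);
    record.flip();
    long position = slotOffset(revision);
    // Positional writes do not move the channel position; loop until fully written.
    while (record.hasRemaining()) {
      position += channel.write(record, position);
    }
  }
}
```

Because the offset is a pure function of the revision number, a torn or failed write for one revision cannot change where any later revision's record is found, which is the property the notes call out.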
release: 1.0.0-alpha21 - flaky UberPage test + invalid-JSON named-projection fixes (#1023)

* fix(io): bound page data-length header to file size (fixes flaky UberPageCorruptionTest)

FileReader/FileChannelReader read a 4-byte page data-length header straight from the file and used it directly to size new byte[dataLength] / a ByteBuffer. On a corrupt or garbled beacon that length can be any 32-bit value; a large positive one (e.g. random bytes read as ~2 GiB) triggered a multi-gigabyte allocation that OOM'd or stalled the JVM in GC instead of failing fast, which timed out the CI worker on lower-memory runners and made UberPageCorruptionTest flaky. A page can never be longer than the file that contains it, so both readers now bound the length by the data-file size and throw SirixIOException fast.

Test: adds a deterministic huge-length-header case and wraps the beacon-garbage tests in assertTimeoutPreemptively so a regression fails fast instead of hanging the suite.

* fix(serialize): wrap a single named-member query result so the JSON stays valid

A query projecting a single fused named member (e.g. jn:doc(...).products[1].id, an OBJECT_NAMED_STRING) serializes inline as a bare "name":value fragment. The REST result wrapper emits it straight after the "revision": key, producing invalid JSON: {"revisionNumber":N,...,"revision":"id":"A"} (two colons), which the GUI query editor then fails to parse. JsonSerializer now wraps such a result in an object -> "revision":{"id":"A"}. Gated to the non-metadata path (metadata mode already emits a full {key,metadata,value} object) and it peeks at the start node's kind, since the framework only positions rtx on the result node after emitRevisionStartNode runs.

Test: NamedProjectionSerializationTest serializes via JsonDBSerializer like the REST API and asserts valid JSON for named string/number projections, with the whole-object case unchanged.

* release: 1.0.0-alpha21 - flaky UberPage test + invalid-JSON named-projection fixes

---------

Co-authored-by: Johannes Lichtenberger <johannes.lichtenberger@sirix.io>
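The bounded length-header check from the fix(io) note can be sketched as below. It is a minimal illustration, not the FileReader/FileChannelReader code: the class and method names are hypothetical, and UncheckedIOException stands in for SirixIOException.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;

/** Hypothetical sketch of validating a length header before allocating. */
final class BoundedLengthHeaderSketch {

  /**
   * Reads the 4-byte length header and rejects values that cannot be valid:
   * negative lengths, or lengths larger than the file that holds the page.
   */
  static int checkedDataLength(ByteBuffer header, long fileSize) {
    int dataLength = header.getInt();
    if (dataLength < 0 || dataLength > fileSize) {
      // Stand-in for SirixIOException; fail before any large allocation happens.
      throw new UncheckedIOException(new IOException(
          "Corrupt page header: length " + dataLength + " exceeds file size " + fileSize));
    }
    return dataLength;
  }

  /** Allocates the page buffer only after the header has been validated. */
  static byte[] allocatePage(ByteBuffer header, long fileSize) {
    return new byte[checkedDataLength(header, fileSize)];
  }
}
```

The point of the ordering is that the allocation size is checked against a cheap, trusted bound (the file size) before `new byte[...]` runs, so garbage header bytes produce an exception rather than a multi-gigabyte allocation.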
release: 1.0.0-alpha19 - jn:open/xml:open before first revision returns empty sequence (#1020)

jn:open(db, resource, pointInTime) (and the XML twin xml:open) clamped a point in time that predates the resource's first revision to revision 1, anachronistically returning data for a moment when the resource did not exist. JsonDBCollection/XmlDBCollection#getDocumentInternal now return the empty sequence in that case; timestamps at/after the first commit are unaffected.

Adds DocByPointInTimeJsonTest covering the pre-creation (empty) and post-creation (document) cases.

Co-authored-by: Johannes Lichtenberger <johannes.lichtenberger@sirix.io>
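The point-in-time rule above can be sketched as a small lookup. This is an illustration only: the names are hypothetical, and the choice of "latest revision committed at or before the timestamp" for later timestamps is an assumption made here, since the note says only that those cases are unaffected.

```java
import java.time.Instant;
import java.util.List;
import java.util.OptionalInt;

/** Hypothetical sketch of point-in-time revision lookup; not SirixDB code. */
final class PointInTimeSketch {

  /**
   * Returns the 1-based revision visible at pointInTime, or empty when the
   * timestamp predates the first commit (instead of clamping to revision 1).
   * commitTimes must be sorted ascending, one entry per revision.
   */
  static OptionalInt revisionAt(List<Instant> commitTimes, Instant pointInTime) {
    if (commitTimes.isEmpty() || pointInTime.isBefore(commitTimes.get(0))) {
      return OptionalInt.empty();
    }
    // Binary search for the last commit time that is not after pointInTime.
    int lo = 0;
    int hi = commitTimes.size() - 1;
    while (lo < hi) {
      int mid = (lo + hi + 1) >>> 1;
      if (commitTimes.get(mid).isAfter(pointInTime)) {
        hi = mid - 1;
      } else {
        lo = mid;
      }
    }
    return OptionalInt.of(lo + 1);
  }
}
```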
sirix-1.0.0-alpha16 - int/double comparison fix (XPTY0004) via brackit 1.0-alpha2
release: 1.0.0-alpha15 - fix streaming-shredder back-pressure deadlock (#1005)

Resume the paused JSON parser when the streaming shredder drains its bounded channel empty below the 50k resume threshold (low-water fallback), preventing a CPU-idle back-pressure deadlock. Adds a deterministic regression test + a production-scale async test.
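A low-water fallback of the kind described above can be modelled in a few lines. The sketch is single-threaded and hypothetical: the note does not describe how the original deadlock arose, so the `pauseProducer` hook below simply simulates a pause that lands when the channel is already under the resume threshold, where a purely edge-triggered resume would never fire.

```java
import java.util.ArrayDeque;

/** Hypothetical single-threaded model of pause/resume back-pressure. */
final class LowWaterResumeSketch {
  private final ArrayDeque<String> channel = new ArrayDeque<>();
  private final int capacity;
  private final int resumeThreshold;
  private boolean producerPaused;

  LowWaterResumeSketch(int capacity, int resumeThreshold) {
    this.capacity = capacity;
    this.resumeThreshold = resumeThreshold;
  }

  /** Producer side: enqueue one event and pause once the channel is full. */
  boolean offer(String event) {
    if (channel.size() >= capacity) {
      producerPaused = true;
      return false;
    }
    channel.addLast(event);
    if (channel.size() >= capacity) {
      producerPaused = true;
    }
    return true;
  }

  /** Consumer side: take one event (or null) and decide whether to resume. */
  String poll() {
    int before = channel.size();
    String event = channel.pollFirst();
    // Edge-triggered resume: the size just crossed below the threshold.
    boolean crossedThreshold = before >= resumeThreshold && channel.size() < resumeThreshold;
    // Low-water fallback: an empty channel always resumes a paused producer.
    boolean drainedEmpty = channel.isEmpty();
    if (producerPaused && (crossedThreshold || drainedEmpty)) {
      producerPaused = false;
    }
    return event;
  }

  /** Simulates a pause that arrives after the threshold was already crossed. */
  void pauseProducer() {
    producerPaused = true;
  }

  boolean isProducerPaused() {
    return producerPaused;
  }
}
```

Without the `drainedEmpty` branch, a producer paused while the channel is already below the threshold would stay paused with nothing left to consume, which matches the CPU-idle symptom the note mentions.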