mcp-datahub-v1.7.0

@github-actions github-actions released this 29 Mar 23:54
7a406a8

mcp-datahub v1.7.0 — GraphQL Schema Alignment & Validation Infrastructure

Corrects GraphQL query field paths across four client modules by validating every query against the upstream DataHub schema source files. Adds automated schema validation infrastructure to prevent future drift — all 59 query/mutation constants are now checked against the official .graphql definitions from datahub-project/datahub.

+25,426 lines | -318 lines | 46 files changed


Highlights

GraphQL Query Corrections (#121)

Cross-referenced all GraphQL queries with the upstream DataHub schema files (datahub-graphql-core/src/main/resources/*.graphql) and corrected field paths that did not match the actual API:

| Module | Issue | Fix |
| --- | --- | --- |
| documents.go | DocumentRelatedAsset, DocumentRelatedDocument, DocumentParentDocument queried a direct urn field that doesn't exist on these wrapper types | Changed to relatedAssets { asset { urn } }, relatedDocuments { document { urn } }, parentDocument { document { urn } } per upstream documents.graphql |
| structured_properties.go | Fragment targeted non-existent type EntityStructuredPropertiesResult | Changed to StructuredProperties per upstream entity.graphql |
| data_contracts.go | Queried contract { result(refresh: false) { type assertionResults { ... } } }; the result field and its nested structure don't exist on DataContract | Rewrote to contract { properties { freshness/schema/dataQuality { assertion { urn } } } status { state } } per upstream contract.graphql |
| semantic_search.go | Used non-existent input type SemanticSearchInput | Changed to SearchAcrossEntitiesInput with searchAcrossEntities query per upstream search.graphql |

All corrections were verified against both the upstream .graphql source files (v1.4.0.3 and v1.5.0.1) and a live DataHub v1.4.0.3 instance.
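
For callers wondering what the wrapper-type correction means on the decoding side, the following is a minimal sketch and not the library's actual code. It assumes that relatedAssets is a list and that the response fragment has the JSON shape implied by relatedAssets { asset { urn } }. The type name relatedAssetsPayload, the helper assetURNs, and the sample URN are hypothetical.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// relatedAssetsPayload mirrors the corrected selection
// relatedAssets { asset { urn } }. Treating relatedAssets as a list
// is an assumption made for illustration.
type relatedAssetsPayload struct {
	RelatedAssets []struct {
		Asset struct {
			URN string `json:"urn"`
		} `json:"asset"`
	} `json:"relatedAssets"`
}

// assetURNs unwraps each wrapper element and returns the nested URNs.
func assetURNs(raw []byte) ([]string, error) {
	var p relatedAssetsPayload
	if err := json.Unmarshal(raw, &p); err != nil {
		return nil, fmt.Errorf("decode related assets: %w", err)
	}
	urns := make([]string, 0, len(p.RelatedAssets))
	for _, ra := range p.RelatedAssets {
		urns = append(urns, ra.Asset.URN)
	}
	return urns, nil
}

func main() {
	// Placeholder payload; the URN is not taken from a real instance.
	sample := []byte(`{"relatedAssets":[{"asset":{"urn":"urn:li:dataset:example"}}]}`)
	urns, err := assetURNs(sample)
	if err != nil {
		panic(err)
	}
	fmt.Println(urns)
}
```

The point of the sketch is that the URN sits one level below the wrapper, so a struct that expects urn directly on the wrapper element would decode to an empty string.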

Schema Validation Infrastructure (#121)

Adds automated, offline-capable validation of GraphQL queries against the upstream DataHub schema:

  • testdata/datahub-schema/ — 31 .graphql schema files synced from datahub-project/datahub at tag v1.5.0.1, checked into the repo for CI without network access
  • testdata/datahub-schema/sync.sh — downloads schema files from any tagged DataHub release
  • pkg/client/schema_validation_test.go — validates all 59 query/mutation constants against the schema: checks fragment targets, top-level query/mutation fields, inline fragment type names, and input type references
  • make schema-sync — download schema files for a target version
  • make schema-check — run schema validation (now part of make verify)

Workflow for targeting a new DataHub version:

DATAHUB_VERSION=v1.5.0.1 make schema-sync   # pull schema files
make schema-check                             # validate all queries
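
To give a feel for one of the checks listed above (fragment targets and inline fragment type names), here is a rough sketch of how such a check could be written. It is not the contents of pkg/client/schema_validation_test.go: it uses regular expressions instead of a real GraphQL parser, and the function names definedTypes and unknownTypeConditions, as well as the toy schema text, are hypothetical.

```go
package main

import (
	"fmt"
	"regexp"
)

// typeDefRe matches named definitions in GraphQL SDL, e.g. "type Foo" or "input Bar".
var typeDefRe = regexp.MustCompile(`(?m)^\s*(?:extend\s+)?(?:type|input|interface|union|enum)\s+(\w+)`)

// typeCondRe matches type conditions in a query document:
// "fragment name on Foo" and "... on Foo".
var typeCondRe = regexp.MustCompile(`(?:\.\.\.\s*|fragment\s+\w+\s+)on\s+(\w+)`)

// definedTypes collects every type name declared in the SDL text.
func definedTypes(sdl string) map[string]bool {
	types := make(map[string]bool)
	for _, m := range typeDefRe.FindAllStringSubmatch(sdl, -1) {
		types[m[1]] = true
	}
	return types
}

// unknownTypeConditions returns the type names that the query uses as
// fragment or inline-fragment targets but that the schema does not define.
func unknownTypeConditions(query string, types map[string]bool) []string {
	var missing []string
	for _, m := range typeCondRe.FindAllStringSubmatch(query, -1) {
		if !types[m[1]] {
			missing = append(missing, m[1])
		}
	}
	return missing
}

func main() {
	// Toy schema: the field is a placeholder, not the real definition.
	sdl := "type StructuredProperties {\n  placeholder: String\n}\n"
	query := "fragment props on EntityStructuredPropertiesResult { placeholder }"
	fmt.Println(unknownTypeConditions(query, definedTypes(sdl)))
}
```

A regex pass like this can catch a misnamed fragment target of the kind fixed in structured_properties.go, but it cannot verify field paths; that would need a proper schema-aware parser.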

DataHub Version Compatibility Matrix

Updated CLAUDE.md with a verified compatibility matrix:

| DataHub Version | Features Available | Schema Validated |
| --- | --- | --- |
| 1.3.x+ (minimum) | All read tools, all write operations except documents | No (pre-dates schema sync) |
| 1.4.x+ (full) | + Documents (create/update/delete), semantic search | Yes (v1.4.0.3) |
| 1.5.x+ (current) | + Batch data product operations | Yes (v1.5.0.1) |

Schema files were diffed between v1.4.0.3 and v1.5.0.1. The only change is the addition of the batchAddToDataProducts and batchRemoveFromDataProducts mutations in entity.graphql. All types used by this library are identical across both versions.
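
An application that talks to several DataHub deployments might encode the matrix above as a small lookup. The sketch below is one possible way to do that; the features type and featuresFor function are hypothetical and not part of this library, and treating any major version above 1 as having every feature is an assumption.

```go
package main

import "fmt"

// features lists the version-gated capabilities from the matrix.
// The type and its field names are hypothetical.
type features struct {
	Documents         bool
	SemanticSearch    bool
	BatchDataProducts bool
}

// featuresFor maps a DataHub major/minor version to the capabilities the
// matrix lists. The second return value is false below the 1.3 minimum.
func featuresFor(major, minor int) (features, bool) {
	if major < 1 || (major == 1 && minor < 3) {
		return features{}, false
	}
	// Assumption: any future major version keeps every 1.x feature.
	atLeast := func(m int) bool { return major > 1 || minor >= m }
	return features{
		Documents:         atLeast(4),
		SemanticSearch:    atLeast(4),
		BatchDataProducts: atLeast(5),
	}, true
}

func main() {
	f, ok := featuresFor(1, 4)
	fmt.Println(f, ok)
}
```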


Breaking Changes

types.AssertionResult simplified

The AssertionResult type in pkg/types/data_contracts.go was simplified to match the actual DataHub DataContract schema:

Removed fields:

  • ResultType string — the real API does not expose per-assertion result types through the contract query
  • NativeResults map[string]string — the real API does not expose native result details through the contract query

Before:

type AssertionResult struct {
    AssertionURN  string            `json:"assertion_urn"`
    Type          string            `json:"type"`
    ResultType    string            `json:"result_type"`
    NativeResults map[string]string `json:"native_results,omitempty"`
}

After:

type AssertionResult struct {
    AssertionURN string `json:"assertion_urn"`
    Type         string `json:"type"`
}

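If you were reading ResultType or NativeResults from AssertionResult, those fields were never populated by the actual DataHub API, so they only ever held zero values. Code that references them will no longer compile against v1.7.0 and should drop those reads.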
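
If you have persisted AssertionResult values as JSON with an earlier version, the sketch below shows how they would decode into the simplified struct. It relies on the standard encoding/json behavior of ignoring unknown keys. The struct is redeclared locally so the example is self-contained; the decodeAssertionResult helper and the sample values are hypothetical.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// AssertionResult is a local copy of the simplified v1.7.0 struct shown above.
type AssertionResult struct {
	AssertionURN string `json:"assertion_urn"`
	Type         string `json:"type"`
}

// decodeAssertionResult decodes a serialized result. encoding/json ignores
// unknown keys by default, so legacy result_type and native_results entries
// are silently dropped.
func decodeAssertionResult(raw []byte) (AssertionResult, error) {
	var r AssertionResult
	err := json.Unmarshal(raw, &r)
	return r, err
}

func main() {
	// Placeholder payload in the pre-1.7.0 shape; values are illustrative.
	legacy := []byte(`{"assertion_urn":"urn:li:assertion:example","type":"freshness","result_type":"","native_results":{}}`)
	r, err := decodeAssertionResult(legacy)
	if err != nil {
		panic(err)
	}
	fmt.Printf("%+v\n", r)
}
```

Decoding only fails on the removed keys if the caller opts into strict mode with json.Decoder's DisallowUnknownFields.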

DataContract.Status values changed

The Status field now contains the DataContractState enum value from the status.state field (e.g., "ACTIVE", "PENDING") rather than the previously unpopulated result.type field (which was intended to contain "PASSING" / "FAILING").
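
Code that previously compared Status against "PASSING" or "FAILING" needs to switch to the state values. A minimal sketch of such a check follows, assuming Status is read as a plain string. The describeStatus helper is hypothetical, and only the values named in these notes are handled explicitly; the DataContractState enum may define others.

```go
package main

import "fmt"

// describeStatus classifies a DataContract.Status string. Only the values
// mentioned in the release notes are recognized explicitly.
func describeStatus(status string) string {
	switch status {
	case "ACTIVE", "PENDING":
		return "contract state: " + status
	case "PASSING", "FAILING":
		// These were the intended result.type values; v1.7.0 no longer uses them.
		return "legacy result type: " + status
	default:
		return "other status: " + status
	}
}

func main() {
	for _, s := range []string{"ACTIVE", "PASSING", ""} {
		fmt.Println(describeStatus(s))
	}
}
```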


Compatibility

| Requirement | Version |
| --- | --- |
| Go | 1.25+ |
| DataHub (minimum) | 1.3.x |
| DataHub (full feature set incl. documents) | 1.4.x+ |
| DataHub (schema validated against) | v1.5.0.1 |

Installation

Claude Desktop (macOS/Windows)

Download the .mcpb bundle for your platform and double-click to install:

  • macOS Apple Silicon (M1/M2/M3/M4): mcp-datahub_1.7.0_darwin_arm64.mcpb
  • macOS Intel: mcp-datahub_1.7.0_darwin_amd64.mcpb
  • Windows: mcp-datahub_1.7.0_windows_amd64.mcpb

Homebrew (macOS)

brew install txn2/tap/mcp-datahub

Claude Code CLI

claude mcp add datahub \
  -e DATAHUB_URL=https://your-datahub.example.com/api/graphql \
  -e DATAHUB_TOKEN=your-token \
  -- mcp-datahub

Docker

docker pull ghcr.io/txn2/mcp-datahub:v1.7.0

Verification

All release artifacts are signed with Cosign. Verify with:

cosign verify-blob --bundle mcp-datahub_1.7.0_linux_amd64.tar.gz.sigstore.json \
  mcp-datahub_1.7.0_linux_amd64.tar.gz