feat(dataplex): add tools to support metadata enrichment workflow #3270
harmonisha-wq wants to merge 16 commits
Code Review
This pull request introduces support for Dataplex Data Insights by adding four new tools: generate_data_insights, get_data_insights, get_operation, and get_run_status. These tools allow for asynchronous generation and retrieval of data documentation for BigQuery resources. The reviewer identified several performance and security improvements in the internal/sources/dataplex implementation, including the need for input validation on operation names to prevent path traversal, more efficient token management, and optimizing the retrieval of the latest scan job by using server-side ordering and pagination instead of client-side iteration.
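The operation-name validation mentioned in the review could look roughly like the sketch below. It is not the code from this PR: the function name `validateOperationName` and the pattern are hypothetical, and the pattern assumes operation names follow the `projects/{project}/locations/{location}/operations/{id}` shape with deliberately restrictive characters in each segment.

```go
package main

import (
	"fmt"
	"regexp"
)

// operationNamePattern encodes an ASSUMED resource-name shape:
// projects/{project}/locations/{location}/operations/{id}.
// Each segment allows only letters, digits, '_' and '-', so "." and "/"
// can never appear inside a segment and ".." traversal is rejected.
var operationNamePattern = regexp.MustCompile(
	`^projects/[A-Za-z0-9_-]+/locations/[A-Za-z0-9_-]+/operations/[A-Za-z0-9_-]+$`)

// validateOperationName (hypothetical helper) returns an error when the
// name does not match the assumed pattern.
func validateOperationName(name string) error {
	if !operationNamePattern.MatchString(name) {
		return fmt.Errorf("invalid operation name %q: expected projects/{project}/locations/{location}/operations/{id}", name)
	}
	return nil
}

func main() {
	names := []string{
		"projects/my-project/locations/us-central1/operations/op-123",
		"projects/my-project/locations/us-central1/operations/../../secrets",
	}
	for _, name := range names {
		fmt.Printf("%q -> %v\n", name, validateOperationName(name))
	}
}
```

An allowlist pattern is used here instead of stripping `..` sequences because it rejects anything unexpected by default. The character set is stricter than some real project IDs may require, so it would need adjusting before use.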
83016d5 to 2e89480
e3a91fa to ce3547a
776be3f to 651f2ce
Yuan325 left a comment
Hi @harmonisha-wq, thank you for adding these tools. Please also update (1) the docs and (2) the integration tests for these tools. Thank you!
Yuan325 left a comment
Hi, please see the following feedback on top of the comments:
- Please update the prebuilt-config docs as well in docs/en/integrations/
- Please apply the comments across all newly added tools.
b2384ea to 37b6cb9
This commit adds 6 new MCP tools to support Dataplex Data Profile, Data Discovery, and Data Quality workflows, completely integrated using standard gRPC client connections:
- generate_data_profile / get_data_profile
- generate_data_discovery / get_data_discovery
- generate_data_quality / get_data_quality

It also reuses get_operation and get_run_status to track these scans, and refactors GetDataInsights to a generic GetDataScan method.

TAG=agy CONV=74c80935-9552-4038-b5b9-5c0d69b81a8d
Corrected the reference to 'get_run_status' for clarity.
This commit addresses reviewer feedback for the Dataplex enrichment workflow:
1. Moves the 'enrich' toolset configuration block to the bottom of dataplex.yaml.
2. Adds documentation pages (markdown files) under docs/en/integrations/knowledge-catalog/tools/ for all 10 newly introduced/modified Dataplex tools.
3. Updates dataplex_integration_test.go to register the new tools, verify parameters of get endpoints, and implements runDataplexDataScanLifecycleIntegrationTest to verify the asynchronous scan creation, LRO polling (get_operation), job run checking (get_run_status), and scan result retrieving (get_data_profile) end-to-end.

TAG=agy CONV=74c80935-9552-4038-b5b9-5c0d69b81a8d
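The LRO polling step of that lifecycle can be illustrated with a small sketch. This is not the integration test from the PR: `operationStatus`, `getOperationFunc`, `waitForOperation`, and `runDemo` are hypothetical names, and the fake getter stands in for a real get_operation call so the example runs without any cloud access.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"time"
)

// operationStatus is a hypothetical, simplified view of a long-running operation.
type operationStatus struct {
	Done bool
	Err  string // non-empty when the operation finished with an error
}

// getOperationFunc stands in for a get_operation-style lookup (assumption).
type getOperationFunc func(ctx context.Context, name string) (operationStatus, error)

// waitForOperation polls get until the operation is done, the lookup fails,
// or ctx is cancelled. interval must be positive (time.NewTicker panics otherwise).
func waitForOperation(ctx context.Context, get getOperationFunc, name string, interval time.Duration) error {
	ticker := time.NewTicker(interval)
	defer ticker.Stop()
	for {
		st, err := get(ctx, name)
		if err != nil {
			return fmt.Errorf("polling %s: %w", name, err)
		}
		if st.Done {
			if st.Err != "" {
				return errors.New(st.Err)
			}
			return nil
		}
		// Wait for the next tick, or give up when the context ends.
		select {
		case <-ctx.Done():
			return ctx.Err()
		case <-ticker.C:
		}
	}
}

// runDemo polls a fake operation that reports done on the doneAfter-th call
// and returns how many lookups were made.
func runDemo(doneAfter int) (int, error) {
	calls := 0
	fake := func(ctx context.Context, name string) (operationStatus, error) {
		calls++
		return operationStatus{Done: calls >= doneAfter}, nil
	}
	ctx, cancel := context.WithTimeout(context.Background(), time.Second)
	defer cancel()
	err := waitForOperation(ctx, fake, "projects/p/locations/l/operations/op-1", 10*time.Millisecond)
	return calls, err
}

func main() {
	calls, err := runDemo(3)
	fmt.Println(calls, err)
}
```

Bounding the loop with a context deadline keeps a test from hanging when an operation never completes. After the operation reports done, the same pattern would apply to checking the job run status before fetching scan results.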
This commit registers and adds complete end-to-end lifecycle integration tests for:
- generate_data_insights / get_data_insights
- discover_metadata / get_discovery_results
- check_data_quality / get_data_quality_results

This covers all 10 newly introduced Dataplex tools in the integration test suite.

TAG=agy CONV=74c80935-9552-4038-b5b9-5c0d69b81a8d
…eference tables across all Knowledge Catalog tools
…rmissions heading across all Knowledge Catalog tools
… from Dataplex source
…generation and lookup methods across Dataplex source and tools
…undant 'Required.' prefixes and ensuring optional parameters start with 'Optional.'
…ation, ensure correct annotations, deduplicate path normalization into dataplexcommon, and streamline integration tests into a table-driven suite
…onfigBase

This refactors the remaining 10 Dataplex write tools (checkdataquality, discovermetadata, generatedatainsights, generatedataprofile, getdatainsights, getdataprofile, getdataqualityresults, getdiscoveryresults, getoperation, getrunstatus) to match the new BaseTool framework design, and fixes the configuration initialization in their corresponding unit test files.

TAG=agy CONV=74c80935-9552-4038-b5b9-5c0d69b81a8d
eb9ddd3 to 8f56181
8f56181 to c30a8db
Description
PR Checklist
CONTRIBUTING.md
bug/issue
before writing your code! That way we can discuss the change, evaluate
designs, and agree on the general idea
! if this involves a breaking change
🛠️ Fixes #3269