feat(llmobs): add tool_definitions support to Tagger #8082

Merged
PROFeNoM merged 7 commits into master from alex/llmobs-tool-definitions on May 5, 2026

@PROFeNoM PROFeNoM commented Apr 23, 2026

What does this PR do?

Introduces _ml_obs.meta.tool_definitions for LLMObs spans, mirroring the shape dd-trace-py's infra PR #14159 already ships. Each definition is { name, description, schema }.

This PR is infrastructure only, same scope as the Python counterpart.
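
As a rough illustration of the `{ name, description, schema }` shape described above, here is a minimal sketch. The tool name, description text, and schema contents are invented for the example, and `hasToolDefinitionShape` is a hypothetical helper, not something this PR adds.

```javascript
'use strict'

// Illustrative payload only: the tool name, description, and schema contents
// below are made up for this example.
const toolDefinitions = [
  {
    name: 'get_weather',
    description: 'Look up the current weather for a city.',
    schema: {
      type: 'object',
      properties: { city: { type: 'string' } },
      required: ['city'],
    },
  },
]

// Hypothetical helper (not part of dd-trace): checks that an entry only uses
// the three fields named in the PR description and has a string name.
function hasToolDefinitionShape (entry) {
  if (entry === null || typeof entry !== 'object') return false
  const allowed = new Set(['name', 'description', 'schema'])
  return Object.keys(entry).every(key => allowed.has(key)) && typeof entry.name === 'string'
}

console.log(toolDefinitions.every(hasToolDefinitionShape))
```

Requiring a string `name` is an assumption made for the sketch; the PR text only says that entries are validated.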

Motivation

Prerequisite for bringing Node's Bedrock Converse spans (#8079) to parity with Python, which emits tool_definitions from toolConfig.tools. Opens the door for parity follow-ups on openai / anthropic / langchain / ai-sdk too.

Test plan

github-actions Bot commented Apr 23, 2026

Overall package size

Self size: 5.68 MB
Deduped: 6.53 MB
No deduping: 6.53 MB

Dependency sizes

| name | version | self size | total size |
|------|---------|-----------|------------|
| import-in-the-middle | 3.0.1 | 82.56 kB | 817.39 kB |
| dc-polyfill | 0.1.10 | 26.73 kB | 26.73 kB |

🤖 This report was automatically generated by heaviest-objects-in-the-universe

codecov Bot commented Apr 23, 2026

Codecov Report

❌ Patch coverage is 75.00000% with 2 lines in your changes missing coverage. Please review.
✅ Project coverage is 89.62%. Comparing base (24339c2) to head (b452a1f).
⚠️ Report is 67 commits behind head on master.

| Files with missing lines | Patch % | Lines |
|--------------------------|---------|-------|
| packages/dd-trace/src/llmobs/tagger.js | 60.00% | 2 Missing ⚠️ |

Additional details and impacted files
@@             Coverage Diff             @@
##           master    #8082       +/-   ##
===========================================
+ Coverage   73.79%   89.62%   +15.83%     
===========================================
  Files         782      829       +47     
  Lines       36366    43365     +6999     
  Branches        0     7918     +7918     
===========================================
+ Hits        26835    38866    +12031     
+ Misses       9531     4499     -5032     
Flag Coverage Δ
aiguard-integration-active 41.31% <0.00%> (?)
aiguard-integration-latest 41.26% <0.00%> (?)
aiguard-integration-maintenance 41.31% <0.00%> (?)
aiguard-macos 35.81% <0.00%> (-0.62%) ⬇️
aiguard-ubuntu 35.92% <0.00%> (-0.62%) ⬇️
aiguard-windows 35.72% <0.00%> (-0.62%) ⬇️
apm-capabilities-tracing-macos 48.16% <0.00%> (+0.05%) ⬆️
apm-capabilities-tracing-ubuntu-active 48.23% <0.00%> (+0.09%) ⬆️
apm-capabilities-tracing-ubuntu-latest 48.20% <0.00%> (+0.09%) ⬆️
apm-capabilities-tracing-ubuntu-maintenance 48.22% <0.00%> (+0.08%) ⬆️
apm-capabilities-tracing-ubuntu-oldest 48.22% <0.00%> (+0.09%) ⬆️
apm-capabilities-tracing-windows 47.50% <0.00%> (-0.44%) ⬇️
apm-integrations-aerospike-18-gte.5.2.0 34.95% <0.00%> (?)
apm-integrations-aerospike-20-gte.5.5.0 34.97% <0.00%> (?)
apm-integrations-aerospike-22-gte.5.12.1 34.97% <0.00%> (?)
apm-integrations-aerospike-22-gte.6.0.0 34.97% <0.00%> (?)
apm-integrations-aerospike-eol- 34.87% <0.00%> (?)
apm-integrations-child-process 36.13% <0.00%> (+0.04%) ⬆️
apm-integrations-confluentinc-kafka-javascript-18 41.82% <0.00%> (?)
apm-integrations-confluentinc-kafka-javascript-20 41.84% <0.00%> (?)
apm-integrations-confluentinc-kafka-javascript-22 41.84% <0.00%> (?)
apm-integrations-confluentinc-kafka-javascript-24 41.77% <0.00%> (?)
apm-integrations-couchbase-18 35.16% <0.00%> (+0.03%) ⬆️
apm-integrations-couchbase-eol 35.21% <0.00%> (+0.03%) ⬆️
apm-integrations-dns 34.98% <0.00%> (?)
apm-integrations-elasticsearch 35.56% <0.00%> (?)
apm-integrations-http-latest 42.97% <0.00%> (?)
apm-integrations-http-maintenance 43.03% <0.00%> (?)
apm-integrations-http-oldest 43.04% <0.00%> (?)
apm-integrations-http2 40.33% <0.00%> (?)
apm-integrations-kafkajs-latest 41.70% <0.00%> (?)
apm-integrations-kafkajs-oldest 41.76% <0.00%> (?)
apm-integrations-net 35.65% <0.00%> (?)
apm-integrations-next-11.1.4 29.44% <0.00%> (?)
apm-integrations-next-13.2.0 31.29% <0.00%> (?)
apm-integrations-next-gte.10.2.0.and.lt.11 23.33% <ø> (?)
apm-integrations-next-gte.11.0.0.and.lt.13 31.30% <0.00%> (?)
apm-integrations-next-gte.13.0.0.and.lt.14 31.55% <0.00%> (?)
apm-integrations-next-gte.14.0.0.and.lte.14.2.6 31.36% <0.00%> (?)
apm-integrations-next-gte.14.2.7.and.lt.15 31.36% <0.00%> (?)
apm-integrations-next-gte.15.0.0 31.42% <0.00%> (?)
apm-integrations-oracledb 35.23% <0.00%> (+0.08%) ⬆️
apm-integrations-prisma-18-gte.6.16.0.and.lt.7.0.0 35.53% <0.00%> (?)
apm-integrations-prisma-latest-all 35.85% <0.00%> (?)
apm-integrations-restify 37.08% <0.00%> (?)
apm-integrations-sharedb 34.56% <0.00%> (?)
apm-integrations-tedious 35.12% <0.00%> (?)
appsec-express 52.78% <0.00%> (+0.04%) ⬆️
appsec-fastify 49.30% <0.00%> (+0.07%) ⬆️
appsec-graphql 49.47% <0.00%> (-0.03%) ⬇️
appsec-integration-active 37.60% <0.00%> (?)
appsec-integration-latest 37.57% <0.00%> (?)
appsec-integration-maintenance 37.59% <0.00%> (?)
appsec-integration-oldest 37.59% <0.00%> (?)
appsec-kafka 42.07% <0.00%> (+0.04%) ⬆️
appsec-ldapjs 41.28% <0.00%> (-0.06%) ⬇️
appsec-lodash 41.40% <0.00%> (+0.02%) ⬆️
appsec-macos 56.81% <0.00%> (+0.01%) ⬆️
appsec-mongodb-core 45.66% <0.00%> (-0.02%) ⬇️
appsec-mongoose 46.56% <0.00%> (+0.01%) ⬆️
appsec-mysql 48.73% <0.00%> (+0.02%) ⬆️
appsec-next-latest-11.1.4 29.61% <0.00%> (?)
appsec-next-latest-13.2.0 31.50% <0.00%> (?)
appsec-next-latest-gte.10.2.0.and.lt.11 28.55% <ø> (?)
appsec-next-latest-gte.11.0.0.and.lt.13 31.48% <0.00%> (?)
appsec-next-latest-gte.13.0.0.and.lt.14 31.68% <0.00%> (?)
appsec-next-latest-gte.14.0.0.and.lte.14.2.6 31.52% <0.00%> (?)
appsec-next-latest-gte.14.2.7.and.lt.15 31.52% <0.00%> (?)
appsec-next-latest-gte.15.0.0 31.52% <0.00%> (?)
appsec-next-oldest-11.1.4 29.63% <0.00%> (?)
appsec-next-oldest-13.2.0 31.73% <0.00%> (?)
appsec-next-oldest-gte.10.2.0.and.lt.11 28.68% <ø> (?)
appsec-next-oldest-gte.11.0.0.and.lt.13 31.50% <0.00%> (?)
appsec-next-oldest-gte.13.0.0.and.lt.14 31.93% <0.00%> (?)
appsec-next-oldest-gte.14.0.0.and.lte.14.2.6 31.78% <0.00%> (?)
appsec-next-oldest-gte.14.2.7.and.lt.15 31.78% <0.00%> (?)
appsec-next-oldest-gte.15.0.0 31.78% <0.00%> (?)
appsec-node-serialize 40.58% <0.00%> (+0.03%) ⬆️
appsec-passport 44.56% <0.00%> (+0.04%) ⬆️
appsec-postgres 48.31% <0.00%> (-0.10%) ⬇️
appsec-sourcing 40.07% <0.00%> (+0.02%) ⬆️
appsec-stripe 42.32% <0.00%> (-0.05%) ⬇️
appsec-template 40.74% <0.00%> (+0.03%) ⬆️
appsec-ubuntu 56.88% <0.00%> (+0.01%) ⬆️
appsec-windows 56.70% <0.00%> (+0.03%) ⬆️
debugger-ubuntu-active 44.39% <0.00%> (?)
debugger-ubuntu-latest 44.34% <0.00%> (?)
debugger-ubuntu-maintenance 44.76% <0.00%> (?)
debugger-ubuntu-oldest 44.77% <0.00%> (?)
instrumentations-instrumentation-bluebird 29.88% <0.00%> (+0.07%) ⬆️
instrumentations-instrumentation-body-parser 37.72% <0.00%> (+0.03%) ⬆️
instrumentations-instrumentation-child_process 35.51% <0.00%> (+0.05%) ⬆️
instrumentations-instrumentation-cookie-parser 31.80% <0.00%> (+0.07%) ⬆️
instrumentations-instrumentation-express 32.02% <0.00%> (+0.07%) ⬆️
instrumentations-instrumentation-express-mongo-sanitize 31.92% <0.00%> (+0.07%) ⬆️
instrumentations-instrumentation-express-session 37.35% <0.00%> (+0.03%) ⬆️
instrumentations-instrumentation-fs 29.56% <0.00%> (+0.08%) ⬆️
instrumentations-instrumentation-generic-pool 30.59% <ø> (+0.06%) ⬆️
instrumentations-instrumentation-http 36.97% <0.00%> (+0.06%) ⬆️
instrumentations-instrumentation-knex 29.85% <0.00%> (+0.07%) ⬆️
instrumentations-instrumentation-light-my-request 37.28% <0.00%> (+0.03%) ⬆️
instrumentations-instrumentation-mongoose 30.93% <0.00%> (+0.06%) ⬆️
instrumentations-instrumentation-multer 37.49% <0.00%> (+0.03%) ⬆️
instrumentations-instrumentation-mysql2 35.48% <0.00%> (+0.04%) ⬆️
instrumentations-instrumentation-passport 41.15% <0.00%> (-0.04%) ⬇️
instrumentations-instrumentation-passport-http 40.93% <0.00%> (+0.05%) ⬆️
instrumentations-instrumentation-passport-local 41.43% <0.00%> (+0.06%) ⬆️
instrumentations-instrumentation-pg 35.02% <0.00%> (+0.05%) ⬆️
instrumentations-instrumentation-promise 29.81% <0.00%> (+0.07%) ⬆️
instrumentations-instrumentation-promise-js 29.82% <0.00%> (+0.07%) ⬆️
instrumentations-instrumentation-q 29.85% <0.00%> (+0.07%) ⬆️
instrumentations-instrumentation-url 29.82% <0.00%> (+0.08%) ⬆️
instrumentations-instrumentation-when 29.83% <0.00%> (+0.07%) ⬆️
instrumentations-integration-esbuild-active 19.74% <ø> (?)
instrumentations-integration-esbuild-latest 19.72% <ø> (?)
instrumentations-integration-esbuild-maintenance 19.74% <ø> (?)
instrumentations-integration-esbuild-oldest 19.72% <ø> (?)
llmobs-ai 38.46% <50.00%> (+0.05%) ⬆️
llmobs-anthropic 37.96% <50.00%> (+0.09%) ⬆️
llmobs-bedrock 37.15% <50.00%> (+0.05%) ⬆️
llmobs-google-genai 37.59% <50.00%> (+0.05%) ⬆️
llmobs-langchain 37.12% <50.00%> (+0.11%) ⬆️
llmobs-openai 41.28% <50.00%> (+0.06%) ⬆️
llmobs-sdk-active 45.99% <100.00%> (?)
llmobs-sdk-latest 45.92% <100.00%> (?)
llmobs-sdk-maintenance 45.99% <100.00%> (?)
llmobs-sdk-oldest 45.97% <100.00%> (?)
llmobs-vertex-ai 37.77% <50.00%> (+0.05%) ⬆️
openfeature-macos 38.86% <0.00%> (?)
openfeature-ubuntu 38.94% <0.00%> (?)
openfeature-unit-active 50.43% <ø> (?)
openfeature-unit-latest 50.28% <ø> (?)
openfeature-unit-maintenance 50.43% <ø> (?)
openfeature-unit-oldest 50.43% <ø> (?)
openfeature-windows 38.67% <0.00%> (?)
platform-core 36.53% <ø> (+6.49%) ⬆️
platform-esbuild 40.80% <ø> (+7.97%) ⬆️
platform-instrumentations-misc 31.34% <0.00%> (-8.67%) ⬇️
platform-integration-active 48.02% <0.00%> (?)
platform-integration-latest 47.98% <0.00%> (?)
platform-integration-maintenance 48.03% <0.00%> (?)
platform-integration-oldest 48.20% <0.00%> (?)
platform-shimmer 42.11% <ø> (+6.39%) ⬆️
platform-unit-guardrails 35.88% <ø> (+4.48%) ⬆️
platform-webpack 20.79% <ø> (+0.08%) ⬆️
plugins-azure-durable-functions 37.74% <0.00%> (+12.37%) ⬆️
plugins-azure-event-hubs 35.60% <0.00%> (+10.09%) ⬆️
plugins-azure-service-bus 36.06% <0.00%> (+11.13%) ⬆️
plugins-bullmq 40.79% <0.00%> (+0.17%) ⬆️
plugins-cassandra 35.39% <0.00%> (+0.10%) ⬆️
plugins-cookie 26.47% <ø> (ø)
plugins-cookie-parser 26.28% <ø> (ø)
plugins-crypto 27.32% <ø> (+1.62%) ⬆️
plugins-dd-trace-api 35.48% <0.00%> (+0.07%) ⬆️
plugins-express-mongo-sanitize 26.42% <ø> (ø)
plugins-express-session 26.24% <ø> (ø)
plugins-fastify 39.40% <0.00%> (+0.10%) ⬆️
plugins-fetch 35.86% <0.00%> (+0.06%) ⬆️
plugins-fs 35.75% <0.00%> (+0.06%) ⬆️
plugins-generic-pool 25.40% <ø> (ø)
plugins-google-cloud-pubsub 43.06% <0.00%> (+0.03%) ⬆️
plugins-grpc 38.11% <0.00%> (+0.05%) ⬆️
plugins-handlebars 26.46% <ø> (ø)
plugins-hapi 37.34% <0.00%> (-0.09%) ⬇️
plugins-hono 37.58% <0.00%> (+0.03%) ⬆️
plugins-ioredis 35.80% <0.00%> (+0.06%) ⬆️
plugins-knex 26.14% <ø> (ø)
plugins-langgraph 35.13% <0.00%> (+0.09%) ⬆️
plugins-ldapjs 24.02% <ø> (ø)
plugins-light-my-request 25.88% <ø> (ø)
plugins-limitd-client 30.13% <0.00%> (+0.08%) ⬆️
plugins-lodash 25.47% <ø> (ø)
plugins-mariadb 36.67% <0.00%> (+0.06%) ⬆️
plugins-memcached 35.46% <0.00%> (+0.06%) ⬆️
plugins-microgateway-core 36.43% <0.00%> (-0.03%) ⬇️
plugins-modelcontextprotocol-sdk 34.37% <0.00%> (+0.04%) ⬆️
plugins-moleculer 38.13% <0.00%> (+0.05%) ⬆️
plugins-mongodb 36.64% <0.00%> (+0.09%) ⬆️
plugins-mongodb-core 36.27% <0.00%> (+0.08%) ⬆️
plugins-mongoose 36.13% <0.00%> (+0.07%) ⬆️
plugins-multer 26.24% <ø> (ø)
plugins-mysql 36.53% <0.00%> (+0.06%) ⬆️
plugins-mysql2 36.51% <0.00%> (+0.05%) ⬆️
plugins-node-serialize 26.51% <ø> (ø)
plugins-opensearch 35.10% <0.00%> (+0.05%) ⬆️
plugins-passport-http 26.30% <ø> (ø)
plugins-pino 31.85% <0.00%> (+<0.01%) ⬆️
plugins-postgres 34.60% <0.00%> (+0.15%) ⬆️
plugins-process 27.32% <ø> (+1.62%) ⬆️
plugins-pug 26.47% <ø> (ø)
plugins-redis 36.01% <0.00%> (+0.06%) ⬆️
plugins-router 39.75% <0.00%> (-0.20%) ⬇️
plugins-sequelize 25.18% <ø> (ø)
plugins-test-and-upstream-amqp10 35.77% <0.00%> (+0.06%) ⬆️
plugins-test-and-upstream-amqplib 40.91% <0.00%> (+0.02%) ⬆️
plugins-test-and-upstream-apollo 36.60% <0.00%> (+0.04%) ⬆️
plugins-test-and-upstream-avsc 35.09% <0.00%> (-0.39%) ⬇️
plugins-test-and-upstream-bunyan 31.25% <0.00%> (+0.06%) ⬆️
plugins-test-and-upstream-connect 37.94% <0.00%> (+0.05%) ⬆️
plugins-test-and-upstream-graphql 37.29% <0.00%> (+0.04%) ⬆️
plugins-test-and-upstream-koa 37.53% <0.00%> (+0.03%) ⬆️
plugins-test-and-upstream-protobufjs 35.30% <0.00%> (-0.39%) ⬇️
plugins-test-and-upstream-rhea 40.98% <0.00%> (+0.01%) ⬆️
plugins-undici 36.62% <0.00%> (+0.06%) ⬆️
plugins-url 27.32% <ø> (+1.62%) ⬆️
plugins-valkey 35.47% <0.00%> (+0.05%) ⬆️
plugins-vm 27.32% <ø> (+1.62%) ⬆️
plugins-winston 31.72% <0.00%> (+0.23%) ⬆️
plugins-ws 39.05% <0.00%> (+0.03%) ⬆️
profiling-macos 43.89% <0.00%> (+6.00%) ⬆️
profiling-ubuntu 44.40% <0.00%> (+6.35%) ⬆️
profiling-windows 41.10% <0.00%> (+1.68%) ⬆️
serverless-aws-sdk-latest-aws-sdk 35.38% <0.00%> (?)
serverless-aws-sdk-latest-bedrockruntime 33.85% <0.00%> (?)
serverless-aws-sdk-latest-client 22.12% <ø> (?)
serverless-aws-sdk-latest-dynamodb 36.27% <0.00%> (?)
serverless-aws-sdk-latest-eventbridge 29.44% <0.00%> (?)
serverless-aws-sdk-latest-kinesis 39.19% <0.00%> (?)
serverless-aws-sdk-latest-lambda 36.50% <0.00%> (?)
serverless-aws-sdk-latest-s3 34.45% <0.00%> (?)
serverless-aws-sdk-latest-serverless-peer-service 40.81% <0.00%> (?)
serverless-aws-sdk-latest-sns 40.49% <0.00%> (?)
serverless-aws-sdk-latest-sqs 39.47% <0.00%> (?)
serverless-aws-sdk-latest-stepfunctions 35.10% <0.00%> (?)
serverless-aws-sdk-latest-util 47.80% <ø> (?)
serverless-aws-sdk-oldest-aws-sdk 35.45% <0.00%> (?)
serverless-aws-sdk-oldest-bedrockruntime 33.90% <0.00%> (?)
serverless-aws-sdk-oldest-client 22.50% <ø> (?)
serverless-aws-sdk-oldest-dynamodb 36.32% <0.00%> (?)
serverless-aws-sdk-oldest-eventbridge 29.49% <0.00%> (?)
serverless-aws-sdk-oldest-kinesis 39.27% <0.00%> (?)
serverless-aws-sdk-oldest-lambda 36.54% <0.00%> (?)
serverless-aws-sdk-oldest-s3 34.51% <0.00%> (?)
serverless-aws-sdk-oldest-serverless-peer-service 40.88% <0.00%> (?)
serverless-aws-sdk-oldest-sns 40.43% <0.00%> (?)
serverless-aws-sdk-oldest-sqs 39.53% <0.00%> (?)
serverless-aws-sdk-oldest-stepfunctions 35.14% <0.00%> (?)
serverless-aws-sdk-oldest-util 48.12% <ø> (?)
serverless-azure-functions-client 39.27% <0.00%> (+14.01%) ⬆️
serverless-azure-functions-eventhubs 38.92% <0.00%> (+13.67%) ⬆️
serverless-azure-functions-servicebus 38.98% <0.00%> (+13.73%) ⬆️
serverless-lambda 33.56% <0.00%> (?)
test-optimization-cucumber-latest-7.0.0 50.19% <0.00%> (?)
test-optimization-cucumber-latest-latest 52.71% <0.00%> (?)
test-optimization-cucumber-oldest-7.0.0 50.21% <0.00%> (?)
test-optimization-cypress-latest-10.2.0-commonJS 47.39% <0.00%> (?)
test-optimization-cypress-latest-10.2.0-esm 47.42% <0.00%> (?)
test-optimization-cypress-latest-14.5.4-commonJS 48.02% <0.00%> (?)
test-optimization-cypress-latest-14.5.4-esm 48.05% <0.00%> (?)
test-optimization-cypress-latest-latest-commonJS 48.51% <0.00%> (?)
test-optimization-cypress-latest-latest-esm 48.54% <0.00%> (?)
test-optimization-cypress-oldest-10.2.0-commonJS 47.43% <0.00%> (?)
test-optimization-cypress-oldest-10.2.0-esm 47.46% <0.00%> (?)
test-optimization-cypress-oldest-14.5.4-commonJS 48.06% <0.00%> (?)
test-optimization-cypress-oldest-14.5.4-esm 48.09% <0.00%> (?)
test-optimization-jest-latest-24.8.0 53.54% <0.00%> (?)
test-optimization-jest-latest-latest 54.40% <0.00%> (?)
test-optimization-jest-oldest-24.8.0 53.57% <0.00%> (?)
test-optimization-jest-oldest-latest 54.41% <0.00%> (?)
test-optimization-mocha-latest-5.2.0 48.64% <0.00%> (?)
test-optimization-mocha-latest-latest 52.95% <0.00%> (?)
test-optimization-mocha-oldest-5.2.0 48.64% <0.00%> (?)
test-optimization-mocha-oldest-latest 53.07% <0.00%> (?)
test-optimization-playwright-latest-latest-playwright-active-test-span 44.29% <0.00%> (?)
test-optimization-playwright-latest-latest-playwright-atr 42.97% <0.00%> (?)
test-optimization-playwright-latest-latest-playwright-efd 43.27% <0.00%> (?)
test-optimization-playwright-latest-latest-playwright-impacted-tests 42.93% <0.00%> (?)
test-optimization-playwright-latest-latest-playwright-reporting 43.07% <0.00%> (?)
test-optimization-playwright-latest-latest-playwright-test-management 44.71% <0.00%> (?)
test-optimization-playwright-latest-oldest-playwright-active-test-span 44.34% <0.00%> (?)
test-optimization-playwright-latest-oldest-playwright-atr 43.16% <0.00%> (?)
test-optimization-playwright-latest-oldest-playwright-efd 43.30% <0.00%> (?)
test-optimization-playwright-latest-oldest-playwright-impacted-tests 42.97% <0.00%> (?)
test-optimization-playwright-latest-oldest-playwright-reporting 43.13% <0.00%> (?)
test-optimization-playwright-latest-oldest-playwright-test-management 44.76% <0.00%> (?)
test-optimization-playwright-oldest-latest-playwright-active-test-span 44.31% <0.00%> (?)
test-optimization-playwright-oldest-latest-playwright-atr 43.01% <0.00%> (?)
test-optimization-playwright-oldest-latest-playwright-efd 43.29% <0.00%> (?)
test-optimization-playwright-oldest-latest-playwright-impacted-tests 42.98% <0.00%> (?)
test-optimization-playwright-oldest-latest-playwright-reporting 43.09% <0.00%> (?)
test-optimization-playwright-oldest-latest-playwright-test-management 44.72% <0.00%> (?)
test-optimization-playwright-oldest-oldest-playwright-active-test-span 44.38% <0.00%> (?)
test-optimization-playwright-oldest-oldest-playwright-atr 43.20% <0.00%> (?)
test-optimization-playwright-oldest-oldest-playwright-efd 43.32% <0.00%> (?)
test-optimization-playwright-oldest-oldest-playwright-impacted-tests 43.01% <0.00%> (?)
test-optimization-playwright-oldest-oldest-playwright-reporting 43.14% <0.00%> (?)
test-optimization-playwright-oldest-oldest-playwright-test-management 44.78% <0.00%> (?)
test-optimization-selenium-latest 46.10% <0.00%> (?)
test-optimization-selenium-oldest 45.56% <0.00%> (?)
test-optimization-testopt-active 47.38% <0.00%> (?)
test-optimization-testopt-latest 47.34% <0.00%> (?)
test-optimization-testopt-maintenance 47.38% <0.00%> (?)
test-optimization-testopt-oldest 48.14% <0.00%> (?)
test-optimization-vitest-latest 51.06% <0.00%> (?)
test-optimization-vitest-oldest 48.08% <0.00%> (?)


datadog-official Bot commented Apr 23, 2026

Tests

🎉 All green!

❄️ No new flaky tests detected
🧪 All tests passed

🎯 Code Coverage (details)
Patch Coverage: 100.00%
Overall Coverage: 86.48% (+18.32%)

This comment will be updated automatically if new data arrives.
🔗 Commit SHA: b452a1f | Docs | Datadog PR Page | Give us feedback!

pr-commenter Bot commented Apr 23, 2026

Benchmarks

Benchmark execution time: 2026-05-04 06:45:26

Comparing candidate commit b452a1f in PR branch alex/llmobs-tool-definitions with baseline commit 24339c2 in branch master.

Found 0 performance improvements and 0 performance regressions! Performance is the same for 1347 metrics, 97 unstable metrics.

@PROFeNoM PROFeNoM marked this pull request as ready for review April 23, 2026 14:17
@PROFeNoM PROFeNoM requested a review from a team as a code owner April 23, 2026 14:17
@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 21f766e1cd

Comment thread packages/dd-trace/src/llmobs/span_processor.js Outdated
sabrenner previously approved these changes Apr 24, 2026
@sabrenner sabrenner left a comment

one comment that can be for a follow up but lgtm!!

Comment thread packages/dd-trace/src/llmobs/tagger.js

feat(llmobs): add tool_definitions support to Tagger

Introduces `_ml_obs.meta.tool_definitions` for LLM spans, mirroring the
shape already produced by dd-trace-py. Each definition is
`{ name, description, schema }`.

- `constants/tags.js`: new `TOOL_DEFINITIONS` tag key.
- `tagger.js`: new `tagToolDefinitions(span, toolDefinitions)` that
  validates entries and stores them in the registry. Matches the
  `#filterToolCalls` validation style.
- `span_processor.js`: forwards the registry entry to
  `meta.tool_definitions` on LLM spans.
- `test/llmobs/util.js`: assertion support for a new
  `toolDefinitions` expected field.

No integration plugin emits tool definitions yet; this unblocks
Bedrock Converse (see follow-up PR) and opens the door for
parity follow-ups on openai/anthropic/langchain/ai-sdk.
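
The flow described in this commit (store on the tagger side, forward on the span-processor side) can be sketched roughly as follows. This is a simplified model, not the merged code. The `WeakMap` registry, the `buildMeta` function, and the value of `TOOL_DEFINITIONS` are assumptions made so the example runs on its own.

```javascript
'use strict'

// Assumed value: the commit only says a TOOL_DEFINITIONS key was added.
const TOOL_DEFINITIONS = '_ml_obs.meta.tool_definitions'

// Hypothetical stand-in for the tagger's per-span registry.
const registry = new WeakMap()

function tagToolDefinitions (span, toolDefinitions) {
  const tags = registry.get(span) ?? {}
  tags[TOOL_DEFINITIONS] = toolDefinitions
  registry.set(span, tags)
}

// Hypothetical stand-in for the span processor step: copy the registry entry
// into the event's `meta` only when the span kind is `llm`.
function buildMeta (span, spanKind) {
  const meta = {}
  const toolDefinitions = registry.get(span)?.[TOOL_DEFINITIONS]
  if (spanKind === 'llm' && toolDefinitions) {
    meta.tool_definitions = toolDefinitions
  }
  return meta
}

const span = {}
tagToolDefinitions(span, [{ name: 'get_weather', description: 'Weather lookup', schema: {} }])
console.log(buildMeta(span, 'llm'))
console.log(buildMeta(span, 'workflow'))
```

Keying the registry by span object keeps tagging separate from event construction, which is the split the commit message describes between `tagger.js` and `span_processor.js`.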

fix(llmobs): sanitize tool_definitions through serialization guard

Route `tool_definitions` through `#addObject` before assigning to
`meta`, matching the existing `metadata` path. Direct assignment
bypassed the circular-ref / BigInt substitution done via
`UNSERIALIZABLE_VALUE_TEXT`, so a user-provided schema with a cycle
or unserializable value would make `JSON.stringify(event)` throw in
the span writer and the whole LLMObs span would get dropped.

This is a generic-infra safety fix: the current Bedrock caller is
safe by construction (the AWS SDK JSON.stringifies the request
before send), but `tagToolDefinitions` is meant to be reused by
other plugins and eventually the public `llmobs.annotate()` surface
where the upstream-already-serialized guarantee doesn't hold.
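
To make the failure mode concrete, the sketch below shows why a schema with a cycle or a BigInt breaks `JSON.stringify`, and how a substitution pass avoids it. The `sanitize` function is a hypothetical stand-in for what `#addObject` is described as doing, and the placeholder string is an assumption; the real `UNSERIALIZABLE_VALUE_TEXT` value may differ.

```javascript
'use strict'

// Placeholder text is an assumption; the real UNSERIALIZABLE_VALUE_TEXT
// constant lives in dd-trace and may differ.
const UNSERIALIZABLE_VALUE_TEXT = 'Unserializable value'

// Hypothetical sanitizer: returns a JSON-safe copy of `value`, replacing
// circular references and BigInt values with the placeholder.
function sanitize (value, ancestors = new Set()) {
  if (typeof value === 'bigint') return UNSERIALIZABLE_VALUE_TEXT
  if (value === null || typeof value !== 'object') return value
  if (ancestors.has(value)) return UNSERIALIZABLE_VALUE_TEXT

  ancestors.add(value)
  const copy = Array.isArray(value) ? [] : {}
  for (const [key, child] of Object.entries(value)) {
    copy[key] = sanitize(child, ancestors)
  }
  ancestors.delete(value) // only objects on the current path count as cycles
  return copy
}

const schema = { type: 'object', maxItems: 10n }
schema.self = schema // deliberate cycle

// The raw schema cannot be serialized: JSON.stringify throws a TypeError.
let threw = false
try { JSON.stringify(schema) } catch { threw = true }
console.log(threw)

// The sanitized copy serializes without throwing.
const safe = sanitize([{ name: 'lookup', description: 'demo', schema }])
console.log(JSON.stringify(safe))
```

Tracking only the ancestors on the current path, rather than every object seen, lets the same sub-object appear twice in a schema without being mistaken for a cycle.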

fix(llmobs): handleFailure in tagToolDefinitions for malformed input

Add an `else` branch that calls `#handleFailure` for non-array /
empty input. Matches the validation pattern used by other tagger
methods. Flagged on review.
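
A minimal sketch of that branch, under stated assumptions: `handleFailure` here is a plain function that records the failure (the real `#handleFailure` is a private Tagger method whose behaviour is not shown in this PR text), the error message is invented, and the `invalid_tool_definitions` label is taken from the follow-up commit message.

```javascript
'use strict'

// Hypothetical stand-in for the private #handleFailure method: it only
// records what it was called with.
const failures = []
function handleFailure (message, errorTag) {
  failures.push({ message, errorTag })
}

const tags = {}

function tagToolDefinitions (toolDefinitions) {
  if (Array.isArray(toolDefinitions) && toolDefinitions.length > 0) {
    tags.tool_definitions = toolDefinitions
  } else {
    // Non-array, empty array, and undefined all end up here.
    handleFailure('Tool definitions must be a non-empty array.', 'invalid_tool_definitions')
  }
}

for (const bad of ['not-an-array', [], undefined]) tagToolDefinitions(bad)
console.log(failures.length)
```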

test(llmobs): cover tagToolDefinitions happy path and malformed input

Adds a `tagToolDefinitions` describe block in the tagger spec mirroring
the `tagMetrics` style: one happy-path test asserting the tag is set,
and one negative test asserting the malformed-input branch (non-array,
empty array, undefined) routes through `#handleFailure`.
@PROFeNoM PROFeNoM force-pushed the alex/llmobs-tool-definitions branch from 1cd9ac5 to 1451e6c Compare April 27, 2026 13:20
PROFeNoM added a commit that referenced this pull request Apr 27, 2026
When a ConverseCommand or ConverseStreamCommand request includes a
`toolConfig.tools` array, map each `toolSpec` to LLMObs'
`{ name, description, schema }` shape and attach it to the span via
the new `tagToolDefinitions` API (see parent PR #8082).

- utils.js: new `extractConverseToolDefinitions` helper.
- llmobs/plugins/bedrockruntime.js: call it on Converse requests.
- spec: assert on the `toolDefinitions` round-trip.

Brings Node parity with dd-trace-py's Bedrock Converse integration
for the "Available Tools" UI section.

Ref: MLOB-3509
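
The mapping described in this commit message might look roughly like the sketch below. Only the helper name and the `{ name, description, schema }` target shape come from the message; reading the schema from `toolSpec.inputSchema.json` is an assumption based on the Bedrock Converse request format, and the merged helper may differ.

```javascript
'use strict'

// Sketch of the helper named in the commit message. The `inputSchema.json`
// lookup is an assumption, not confirmed by the PR text.
function extractConverseToolDefinitions (request) {
  const tools = request?.toolConfig?.tools
  if (!Array.isArray(tools)) return []

  const definitions = []
  for (const tool of tools) {
    const toolSpec = tool?.toolSpec
    if (!toolSpec) continue // skip entries that are not toolSpec-based
    definitions.push({
      name: toolSpec.name,
      description: toolSpec.description,
      schema: toolSpec.inputSchema?.json,
    })
  }
  return definitions
}

// Example request; the model id and tool are placeholders.
const request = {
  modelId: 'example-model-id',
  toolConfig: {
    tools: [{
      toolSpec: {
        name: 'get_weather',
        description: 'Look up the current weather for a city.',
        inputSchema: { json: { type: 'object', properties: { city: { type: 'string' } } } },
      },
    }],
  },
}

console.log(extractConverseToolDefinitions(request))
console.log(extractConverseToolDefinitions({ messages: [] }))
```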
PROFeNoM added a commit that referenced this pull request Apr 27, 2026
…sent

`tagToolDefinitions` now logs a failure for non-array / empty input
(see #8082). The Bedrock plugin previously called it unconditionally
on every Converse request — `extractConverseToolDefinitions` returns
`[]` when there's no `toolConfig`, which would now produce noisy
`invalid_tool_definitions` logs on every tool-less Converse call.
Gate the call on `length > 0`.
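
The gating described here can be sketched as follows. The `tagger` object, the simplified `extractConverseToolDefinitions`, and `tagConverseRequest` are hypothetical stubs so the example runs on its own; only the `length > 0` check reflects the commit message.

```javascript
'use strict'

// Hypothetical stubs: a tagger that records its calls and a simplified
// extractor that returns [] when there is no toolConfig.
const calls = []
const tagger = { tagToolDefinitions: (span, defs) => calls.push(defs) }
const extractConverseToolDefinitions = request =>
  (request.toolConfig?.tools ?? []).map(({ toolSpec }) => ({ name: toolSpec.name }))

function tagConverseRequest (span, request) {
  const toolDefinitions = extractConverseToolDefinitions(request)
  // Skip the call for tool-less requests so no failure is logged for them.
  if (toolDefinitions.length > 0) {
    tagger.tagToolDefinitions(span, toolDefinitions)
  }
}

tagConverseRequest({}, { messages: [] })
tagConverseRequest({}, { toolConfig: { tools: [{ toolSpec: { name: 'get_weather' } }] } })
console.log(calls.length)
```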
Comment thread packages/dd-trace/src/llmobs/span_processor.js
@PROFeNoM PROFeNoM merged commit a329f1f into master May 5, 2026
881 of 882 checks passed
@PROFeNoM PROFeNoM deleted the alex/llmobs-tool-definitions branch May 5, 2026 06:47
@dd-octo-sts dd-octo-sts Bot mentioned this pull request May 6, 2026
sabrenner pushed a commit that referenced this pull request May 6, 2026
* feat(llmobs): add tool_definitions support to Tagger

Introduces `_ml_obs.meta.tool_definitions` for LLM spans, mirroring the
shape already produced by dd-trace-py. Each definition is
`{ name, description, schema }`.

- `constants/tags.js`: new `TOOL_DEFINITIONS` tag key.
- `tagger.js`: new `tagToolDefinitions(span, toolDefinitions)` that
  validates entries and stores them in the registry. Matches the
  `#filterToolCalls` validation style.
- `span_processor.js`: forwards the registry entry to
  `meta.tool_definitions` on LLM spans.
- `test/llmobs/util.js`: assertion support for a new
  `toolDefinitions` expected field.

No integration plugin emits tool definitions yet; this unblocks
Bedrock Converse (see follow-up PR) and opens the door for
parity follow-ups on openai/anthropic/langchain/ai-sdk.

* fix(llmobs): sanitize tool_definitions through serialization guard

Route `tool_definitions` through `#addObject` before assigning to
`meta`, matching the existing `metadata` path. Direct assignment
bypassed the circular-ref / BigInt substitution done via
`UNSERIALIZABLE_VALUE_TEXT`, so a user-provided schema with a cycle
or unserializable value would make `JSON.stringify(event)` throw in
the span writer and the whole LLMObs span would get dropped.

This is a generic-infra safety fix: the current Bedrock caller is
safe by construction (the AWS SDK JSON.stringifies the request
before send), but `tagToolDefinitions` is meant to be reused by
other plugins and eventually the public `llmobs.annotate()` surface
where the upstream-already-serialized guarantee doesn't hold.

* fix(llmobs): handleFailure in tagToolDefinitions for malformed input

Add an `else` branch that calls `#handleFailure` for non-array /
empty input. Matches the validation pattern used by other tagger
methods. Flagged on review.

* test(llmobs): cover tagToolDefinitions happy path and malformed input

Adds a `tagToolDefinitions` describe block in the tagger spec mirroring
the `tagMetrics` style: one happy-path test asserting the tag is set,
and one negative test asserting the malformed-input branch (non-array,
empty array, undefined) routes through `#handleFailure`.

* test(llmobs): cover tool_definitions sanitization in span processor

* test(llmobs): clarify intent of nested cycle in tool_definitions test

* test(llmobs): simplify tool_definitions test to forwarding only

3 participants