Reduce MCP tool token footprint #59

@Gabriel-Darbord

Goal

Minimize the token footprint of MCP-Pharo tool calls while keeping tool results useful, structured, and predictable for agents.

This is an umbrella issue. Concrete changes should reduce what the model has to read by default, avoid duplicate representations, and make expensive or verbose detail explicit rather than automatic.

What we learned

  • Reputable paginated APIs keep the default response compact and provide a continuation signal instead of repeating request metadata in every response.
  • Google AIP-158 uses page_size, page_token, and next_page_token; total size is optional rather than mandatory.
  • Slack cursor pagination uses response_metadata.next_cursor; completion is driven by the cursor, not by guessing from returned item count.
  • Stripe keeps list pages small by default and exposes has_more with cursors.
  • GitHub's REST API keeps pagination navigation out of the JSON body by using Link headers, which reinforces the same principle: item payloads should not carry navigation noise.
  • Kubernetes exposes continue and only reports remaining counts when useful; it also emphasizes stable, bounded slices.
  • For MCP-Pharo, the MCP client context is the scarce resource. Default tool output should optimize for the agent reading less, not for mirroring internal implementation detail.
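The continuation-signal pattern shared by these APIs could look roughly like the sketch below. It is illustrative only and is not MCP-Pharo code: `paginate`, `encode_cursor`, `decode_cursor`, and the `next_cursor` field name are hypothetical, and the cursor is assumed to be a simple encoded offset. The response carries only the items plus a cursor when more results exist; it does not echo the page size, the offset, or a total count.

```python
import base64
import json
from typing import Any, Optional, Sequence


def encode_cursor(offset: int) -> str:
    """Wrap an offset in an opaque token (hypothetical cursor format)."""
    return base64.urlsafe_b64encode(json.dumps({"o": offset}).encode()).decode()


def decode_cursor(cursor: Optional[str]) -> int:
    """Return the offset stored in a cursor, or 0 for the first page."""
    if not cursor:
        return 0
    return int(json.loads(base64.urlsafe_b64decode(cursor.encode()))["o"])


def paginate(items: Sequence[Any], page_size: int = 20,
             cursor: Optional[str] = None) -> dict:
    """Return one bounded page; add next_cursor only if more items remain."""
    if page_size < 1:
        raise ValueError("page_size must be at least 1")
    start = decode_cursor(cursor)
    end = start + page_size
    page = {"items": list(items[start:end])}
    if end < len(items):
        # The cursor is the only continuation signal in the response.
        page["next_cursor"] = encode_cursor(end)
    return page


if __name__ == "__main__":
    names = [f"Class{i}" for i in range(45)]
    page = paginate(names)
    while "next_cursor" in page:
        page = paginate(names, cursor=page["next_cursor"])
    print(len(page["items"]))  # size of the final page
```

An agent keeps calling with the returned cursor until the field is absent, so completion never has to be inferred from the number of items returned.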

Direction

  • Prefer compact structured results over verbose summaries.
  • Avoid duplicate fields that repeat the same fact in multiple ways.
  • Keep legacy or human-oriented text out of observability and high-volume traces when structured data is enough.
  • Use small defaults for broad searches and lists. Different tools can have different defaults based on expected payload size.
  • Push detail behind focused follow-up calls where possible: list/search tools should identify candidates; get/read/detail tools should return full payloads.
  • Treat pagination, truncation, and result limits as core token-efficiency mechanisms, not only UI conveniences.
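One way to read the list/detail split and the small-default guidance above is sketched below. Everything here is an assumption made for illustration: `METHODS` is a tiny in-memory stand-in for an image's method index, the source strings are made up, and `search_methods` and `get_method` are hypothetical tool names rather than existing MCP-Pharo tools. The search tool returns only identifiers and marks truncation with a single flag that is omitted when false; the full payload is available only through the focused detail call.

```python
from typing import Dict

# Hypothetical stand-in data; the source text is illustrative only.
METHODS: Dict[str, dict] = {
    "OrderedCollection>>add:": {
        "class": "OrderedCollection",
        "selector": "add:",
        "source": "add: newObject\n\t^ self addLast: newObject",
    },
    "OrderedCollection>>addAll:": {
        "class": "OrderedCollection",
        "selector": "addAll:",
        "source": "addAll: aCollection\n\t^ self addAllLast: aCollection",
    },
    "Set>>add:": {
        "class": "Set",
        "selector": "add:",
        "source": "add: newObject\n\t\"illustrative body\"\n\t^ newObject",
    },
    "Set>>remove:": {
        "class": "Set",
        "selector": "remove:",
        "source": "remove: oldObject\n\t\"illustrative body\"\n\t^ oldObject",
    },
}


def search_methods(query: str, limit: int = 5) -> dict:
    """Compact path: return matching identifiers only, bounded by limit."""
    matches = sorted(key for key in METHODS if query.lower() in key.lower())
    result = {"items": matches[:limit]}
    if len(matches) > limit:
        # Emit the flag only when results were actually cut off.
        result["truncated"] = True
    return result


def get_method(method_id: str) -> dict:
    """Detail path: return the full payload for a single candidate."""
    return METHODS[method_id]


if __name__ == "__main__":
    candidates = search_methods("add", limit=2)
    print(candidates)
    print(get_method(candidates["items"][0])["source"])
```

The agent pays for source text only on the candidates it chooses to inspect, and each fact appears in the result once.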

Acceptance criteria

  • High-volume tools have bounded default outputs sized for agent use.
  • Repeated pagination and truncation behavior is factored into reusable logic instead of hand-rolled per tool.
  • Result schemas avoid redundant page metadata by default.
  • Tools that can return large details have a compact default path and a focused detail path.
  • Observability data is sufficient to identify oversized payloads and guide follow-up reductions without duplicating large fields unnecessarily.
  • Documentation explains that MCP-Pharo tool output should be optimized for minimal token usage by default.
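The observability criterion above could be approached by recording sizes rather than payloads. The sketch below is a rough illustration under stated assumptions: `PayloadStats`, `measure_payload`, and the `token_budget` default are hypothetical, and the four-characters-per-token estimate is a crude heuristic, not a real tokenizer count.

```python
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class PayloadStats:
    tool: str
    size_bytes: int
    approx_tokens: int
    largest_field: str
    oversized: bool


def _compact(value) -> str:
    """Serialize without extra whitespace, as a tool result might be sent."""
    return json.dumps(value, separators=(",", ":"), ensure_ascii=False)


def measure_payload(tool: str, result: dict, token_budget: int = 1000) -> PayloadStats:
    """Summarize a tool result's size without copying its contents."""
    encoded = _compact(result)
    # Assumption: roughly 4 characters per token, rounded up.
    approx_tokens = (len(encoded) + 3) // 4
    field_sizes = {key: len(_compact(value)) for key, value in result.items()}
    largest = max(field_sizes, key=field_sizes.get) if field_sizes else ""
    return PayloadStats(
        tool=tool,
        size_bytes=len(encoded.encode("utf-8")),
        approx_tokens=approx_tokens,
        largest_field=largest,
        oversized=approx_tokens > token_budget,
    )


if __name__ == "__main__":
    result = {"items": ["a" * 10] * 100, "next_cursor": "abc"}
    print(measure_payload("search_methods", result, token_budget=100))
```

A trace entry built this way stays a few dozen bytes regardless of result size, while still pointing at the field that most needs reduction.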

Implementation readiness

AFK after each concrete slice is scoped. This umbrella should stay open until the main high-volume outputs have been audited and compacted.

Labels: enhancement, llm-effectiveness, mcp, token-efficiency
