The Context Window Problem

When AI Assistants Eat Their Own Memory

A technical presentation for an Adobe Commerce (Magento) Meetup in Vienna, Austria, exploring context window limitations in AI-powered development workflows and practical solutions for developers working with AI coding assistants.

🔥 The Problem

When setting up Claude Code with MCP (Model Context Protocol) servers for Jira, Confluence, and GitLab integration, 91% of the context window was consumed at startup—before writing a single line of code.

Total Context:    200,000 tokens
Used at Startup:  181,000 tokens (91%)
Available:         19,000 tokens (9%)

That's enough for:
→ 1 medium file (~500 lines)
→ OR 5 minutes of conversation
→ OR 3-4 tool outputs

Basically... nothing useful.

The irony: With this setup, it's possible to access Jira to check for bugs, but there's insufficient context remaining to read the files needed to implement fixes. The tools defeat their own purpose.


📊 How It Escalated

Baseline (no MCP servers)      68k  (34%)  ███████░░░░░░░░░░░░░
✅ Healthy - plenty of room

↓ Add Jira (3 essential tools)

Jira partial (3 tools)         70k  (35%)  ███████░░░░░░░░░░░░░
✅ Still fine - barely changed

↓ Enable full Jira (8 tools)

Jira full (8 tools)           104k  (52%)  ██████████░░░░░░░░░░
⚠️  Noticeable - but workable

↓ Add GitLab (109 tools!)

GitLab + Jira                 148k  (74%)  ███████████████░░░░░
⚠️  Problematic - getting cramped

↓ Add Confluence + other servers

All servers combined          181k  (91%)  ██████████████████░░
🔴 BROKEN - unusable

🔍 Where Did 114,000 Tokens Go?

Total Context: 200,000 tokens

├─ Baseline (Claude + config):  68,000 (34%)
└─ MCP Tool Definitions:       114,300 (57.2%)
    ├─ GitLab:   ~77,000 tokens (109 tools)
    ├─ Jira:      37,000 tokens (8 tools)
    └─ Other:        300 tokens

Key insight: Tool definitions consumed 57% of context BEFORE any tools were even USED. That's like paying rent on an entire warehouse just to store the instruction manuals.

Per-tool cost (see the quick check below):

  • Jira: ~4,625 tokens/tool
  • GitLab: ~706 tokens/tool
  • Jira tools are 6.5x more expensive per tool!
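
A quick sanity check of those numbers as a runnable sketch (the token counts are this repo's measurements; the rest is just division):

const jira   = { tokens: 37_000, tools: 8 };   // full Jira server
const gitlab = { tokens: 77_000, tools: 109 }; // GitLab server

const perTool = (s: { tokens: number; tools: number }) => s.tokens / s.tools;

console.log(Math.round(perTool(jira)));   // 4625 tokens per tool
console.log(Math.round(perTool(gitlab))); // 706 tokens per tool
console.log((perTool(jira) / perTool(gitlab)).toFixed(1)); // 6.5x more expensive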

✅ Solution 1: Strategic Tool Selection

What it is: Load only the tools you actually need

When to use: When your MCP server supports individual tool configuration

Example: Atlassian Partial vs Full

Before: Full Jira (8 tools)

{
  "mcpServers": {
    "mcp-atlassian": {
      "command": "npx",
      "args": ["-y", "@modelcontextprotocol/server-atlassian"]
    }
  }
}

→ 104k tokens (52%)

After: Partial Jira (3 tools)

{
  "mcpServers": {
    "mcp-atlassian": {
      "command": "npx",
      "args": ["-y", "@modelcontextprotocol/server-atlassian"],
      "env": {
        "ENABLED_TOOLS": "jira_get_issue,jira_search,jira_add_comment"
      }
    }
  }
}

→ 70k tokens (35%)

Result:

  • ✅ Saved 34,000 tokens
  • ✅ Dropped 17 percentage points
  • ✅ Only loaded tools actually needed

The Limitation

This works great for Atlassian MCP Server, but GitLab MCP Server doesn't support individual tool configuration—it's all-or-nothing with 109 tools consuming 77,000 tokens.

Even with optimized Jira: 70k (baseline + partial Jira) + 77k (GitLab tools) = 147k tokens (74%)

Still broken.


✨ Solution 2: Lazy Loading with mcp-funnel

What it is: A proxy that loads MCP tools on-demand instead of at startup

When to use: When Solution 1 isn't enough or isn't available

How It Works

Traditional MCP:

[Startup]
    ↓
Load ALL 117 tool definitions
    ↓
114,300 tokens consumed
    ↓
[Ready to work]
    ↓
19k tokens available 🔴

mcp-funnel:

[Startup]
    ↓
Load bridge tools only
    ↓
9,600 tokens consumed
    ↓
[Ready to work]
    ↓
123k tokens available ✅
    ↓
[When you need a tool]
    ↓
Discover by keyword (~1k tokens)
    ↓
Load specific tool
    ↓
Use it
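
Conceptually, the proxy keeps the full tool catalog on its own side and exposes only a handful of small bridge tools to the model. A minimal sketch of that idea in TypeScript (this is not mcp-funnel's actual implementation; the class and method names are illustrative):

// Sketch of the lazy-loading pattern, NOT mcp-funnel's real code.
type ToolDef = { name: string; description: string; inputSchema: object };

class LazyToolBridge {
  // Full definitions live here, on the proxy side, outside the model's context.
  private registry = new Map<string, ToolDef>();

  register(server: string, tools: ToolDef[]): void {
    for (const t of tools) this.registry.set(`${server}__${t.name}`, t);
  }

  // Bridge tool 1: cheap keyword discovery. Returns only names and one-line
  // descriptions, a few tokens each, never the full JSON schemas.
  discoverToolsByWords(words: string): { name: string; description: string }[] {
    const needle = words.toLowerCase();
    return [...this.registry.entries()]
      .filter(([name, t]) => `${name} ${t.description}`.toLowerCase().includes(needle))
      .map(([name, t]) => ({ name, description: t.description }));
  }

  // Bridge tool 2: load ONE full schema, only when it is actually needed.
  getToolSchema(name: string): ToolDef {
    const def = this.registry.get(name);
    if (!def) throw new Error(`Unknown tool: ${name}`);
    return def;
  }

  // Bridge tool 3: forward the call to the real MCP server (stubbed here).
  async bridgeToolRequest(name: string, args: object): Promise<unknown> {
    this.getToolSchema(name); // fail fast if the tool was never registered
    return { forwarded: name, args }; // a real proxy would relay over stdio/HTTP
  }
}

Only the three bridge-tool definitions reach the model at startup; the 100+ real definitions stay in the proxy's registry until a discovery call asks for them.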

The Results

BEFORE (Direct MCP)

Context Usage:  181,000 tokens (91%)
MCP Tools:      114,300 tokens
Available:       19,000 tokens

Tool Count: 117 tools (all pre-loaded)

Status: 🔴 UNUSABLE
"Can access Jira but can't read files"

AFTER (mcp-funnel)

Context Usage:   77,000 tokens (38%)
MCP Tools:        9,600 tokens
Available:      123,000 tokens

Tool Count: 12 pre-loaded, 100+ available on-demand

Status: ✅ HEALTHY
"Back to normal development workflow"

The Impact:

  • 104,000 tokens saved
  • 53 percentage point improvement (91% → 38%)
  • 6.5x more working space (19k → 123k tokens)
  • 11.9x reduction in MCP overhead (114.3k → 9.6k tokens)

Setup

Step 1: Install

npm install -g mcp-funnel

Step 2: Update config (~/.claude/claude_desktop_config.json)

See .mcp.json for a working example with mcp-funnel:

{
  "mcpServers": {
    "mcp-funnel": {
      "command": "npx",
      "args": ["-y", "mcp-funnel", ".mcp-funnel.json"]
    }
  }
}

Step 3: Create mcp-funnel config (.mcp-funnel.json)

See .mcp-funnel.json for a complete example with GitLab and Sequential Thinking servers:

{
  "servers": {
    "mcp-gitlab": {
      "command": "docker",
      "args": ["run", "-i", "--rm", "-e", "GITLAB_PERSONAL_ACCESS_TOKEN", "iwakitakuma/gitlab-mcp"]
    },
    "mcp-atlassian": {
      "command": "npx",
      "args": ["-y", "@modelcontextprotocol/server-atlassian"]
    }
  },
  "exposeCoreTools": [
    "discover_tools_by_words",
    "get_tool_schema",
    "bridge_tool_request"
  ]
}

Step 4: Restart Claude Code

That's it. Tools now load on demand.
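
In practice the on-demand flow looks roughly like this (the three core tools come from the config above; the discovered GitLab tool names and arguments are illustrative):

You: "List open merge requests for the shop project"
    ↓
Claude: discover_tools_by_words("merge request")
    → mcp-gitlab__list_merge_requests, mcp-gitlab__create_merge_request, ...
    ↓
Claude: get_tool_schema("mcp-gitlab__list_merge_requests")
    ↓
Claude: bridge_tool_request("mcp-gitlab__list_merge_requests", { ... })
    ↓
Result comes back, with a couple of thousand tokens spent on discovery
instead of 77k at startup.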

Trade-offs

✅ Pros:

  • Massive context savings (91% → 38%)
  • All tools remain available
  • Works with ANY MCP server (GitLab included!)
  • Simple configuration
  • Free and open source
  • Discovery is cheap (~1k tokens)

⚠️ Cons:

  • Extra discovery step ("discover Jira tools")
  • Small latency on first tool use (barely noticeable)
  • Slightly less "magical" (Claude must explicitly discover)
  • One more dependency to manage

The Reality: Trading "auto-magical" for "actually usable" is an easy choice.


💡 Bonus Tips: Context-Aware Workflows

Monitor Your Context

Use /context or /usage to check token consumption periodically.

The 70% rule:

  • < 70%: Work normally
  • > 70%: Consider /compact or /clear
  • > 85%: Definitely clear soon

Use CLAUDE.md Files

Store durable project knowledge (architecture, conventions, commands) in CLAUDE.md files. Claude reads them automatically at session start, so the project context costs a small, predictable slice of tokens once instead of being re-explained in every conversation.
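
An illustrative CLAUDE.md for a Magento project (the contents are an example, not taken from this repo):

# CLAUDE.md

## Project
Magento 2.4 store; custom modules live in app/code/

## Conventions
- PSR-12; wire dependencies via di.xml, never use ObjectManager directly

## Common commands
- bin/magento setup:di:compile
- vendor/bin/phpunit -c dev/tests/unit/phpunit.xml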

Break Large Tasks into Phases

Phase 1: Research → Save summary → /clear
Phase 2: Implement → Commit → /clear
Phase 3: Test → Save results → /clear

🌍 Why This Matters Beyond Claude

This is not just a Claude problem—every AI coding assistant faces finite context:

AI Tool           Context Window   Same Issues?
Claude Code       200K             ✓ MCP servers
GitHub Copilot    8K               ✓ Extensions
ChatGPT           128K             ✓ Plugins
Cursor            Varies           ✓ Integrations
Cody              100K             ✓ Context fetching

The Universal Truth:

  • Every AI coding assistant has finite context
  • Every tool integration consumes tokens
  • The more ambitious your workflow, the bigger the problem
  • Context is the new bottleneck

What Tool Builders Should Learn

  • → Lazy loading should be default
  • → Token costs should be transparent (see the estimate sketch below)
  • → Give users control over what loads
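
On the transparency point, even a rough estimate beats silence. A sketch using the common ~4 characters/token heuristic (illustrative, not a real tokenizer):

// Rough token estimate for a set of MCP tool definitions.
// The ~4 chars/token rule of thumb is approximate; real tokenizers differ.
function estimateToolTokens(toolDefs: object[]): number {
  return Math.round(JSON.stringify(toolDefs).length / 4);
}

// 109 tools at ~2.8 KB of schema JSON each lands near the ~77k measured above.
const defs = Array.from({ length: 109 }, () => ({ schema: "x".repeat(2_800) }));
console.log(estimateToolTokens(defs)); // ≈ 76,600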

🎯 Key Takeaways

  1. Context is Finite and Fills Fast

    • 200K tokens sounds like a lot
    • MCP servers can consume 57% at startup
    • Measure your actual usage
  2. Strategic Tool Selection Helps (If Available)

    • Load only what you need
    • Can save 30-40k tokens
    • But not all servers support this
  3. Lazy Loading is the Real Solution

    • mcp-funnel: 91% → 38%
    • 104k tokens saved
    • Works with any MCP server
    • Gives you your workflow back
  4. This Affects Everyone

    • All AI coding assistants face this
    • Context is the new bottleneck
    • Plan your workflows accordingly
  5. It's Fixable

    • The tools exist today
    • Free, open source, easy to set up
    • You don't have to live with 91% context usage

📚 Resources


Configuration Examples

This repository contains three configuration scenarios:

  1. .mcp.json - mcp-funnel setup with Atlassian (partial tools)
  2. .mcp-funnel.json - GitLab + Sequential Thinking servers through mcp-funnel
  3. mcp-direct.json - Direct MCP configuration with Atlassian + GitLab (the 91% problem)

Screenshots

See measurements/screenshots/ for visual documentation from all 6 test configurations showing the context usage progression.


📊 Full Measurement Data

Configuration    Total   %     MCP       Available   Tools
Baseline          68k   34%      0k        132k      0
Jira Partial      70k   35%    3.6k        130k      3
Jira Full        104k   52%     37k         96k      8
GitLab + Jira    148k   74%   80.9k         52k      112
All Combined     181k   91%  114.3k         19k      117
mcp-funnel        77k   38%    9.6k        123k      12 pre + 100+ lazy

📝 License

This presentation and associated materials are shared for educational purposes. Individual tools and projects referenced maintain their respective licenses:

  • mcp-funnel: Check repository for license
  • mcp-atlassian: Check repository for license
  • mcp-gitlab: Check repository for license

From broken to working in one config change.

You don't have to accept broken workflows. The solution exists today—it's free, it's easy, and it works.
