A concept demonstration of how intelligent caching can dramatically accelerate browser automation agents by storing and reusing action plans, eliminating the need for repeated LLM "thinking" about common tasks.
The main files are in `/bday`.
Instead of having the LLM plan every action from scratch, this system:
- Caches Action Plans: Stores successful action sequences in a database
- Retrieves Similar Plans: Finds cached plans for similar tasks using semantic similarity
- Executes Directly: Skips the planning phase and goes straight to execution
- Accelerates Performance: Reduces latency and token costs by avoiding redundant LLM calls
- Action Plan Caching: Stores granular subgoals and browser actions for reuse
- Semantic Retrieval: Finds similar cached plans using vector similarity (see the retrieval sketch after this list)
- Zero-Thinking Execution: Bypasses LLM planning for cached action sequences
- Multi-Level Caching: LLM responses, subgoal plans, and execution results
- Browser Automation: Uses Playwright to navigate Wikipedia pages
- Provider Flexibility: Supports OpenAI, Lightpanda, and OpenLLM providers
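To make the retrieval step concrete, here is a minimal sketch of matching a new question against cached plans by embedding similarity. The `embed_text` helper, the in-memory `cache` structure, and the 0.85 threshold are illustrative assumptions, not the repo's actual implementation.

```python
import numpy as np
from openai import OpenAI  # assumed embedding provider; any embedding model would do

client = OpenAI()  # reads OPENAI_API_KEY from the environment

def embed_text(text: str) -> np.ndarray:
    """Return a unit-length embedding for `text` (illustrative helper, not the repo's)."""
    resp = client.embeddings.create(model="text-embedding-3-small", input=text)
    vec = np.array(resp.data[0].embedding)
    return vec / np.linalg.norm(vec)

def find_cached_plan(question: str, cache: list[dict], threshold: float = 0.85):
    """Return the most similar cached action plan, or None on a cache miss.

    `cache` holds entries shaped like {"embedding": np.ndarray, "actions": [...]}.
    """
    query = embed_text(question)
    best, best_score = None, threshold
    for entry in cache:
        score = float(np.dot(query, entry["embedding"]))  # cosine similarity of unit vectors
        if score >= best_score:
            best, best_score = entry, score
    return best["actions"] if best else None
```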
- Install dependencies:

  ```bash
  pip install -r requirements.txt
  playwright install chromium
  ```

- Set up environment variables (create a `.env` file):

  ```bash
  # Required: Choose one LLM provider
  OPENAI_API_KEY=sk-your-key-here
  OPENAI_MODEL=gpt-4o-mini

  # Optional: Lightpanda browser service
  LIGHTPANDA_TOKEN=your-token-here
  ```

- Run the agent:

  ```bash
  cd bday
  python t_agent.py "When was Marie Curie born?"
  ```
More example invocations:

```bash
# Basic research question
python t_agent.py "What year was Einstein born?"

# Force new plan (bypass cache)
python t_agent.py "Compare Taylor Swift and Beyoncé's Grammy wins" --force-plan

# Preview stored plan without execution
python t_agent.py "When did World War II end?" --plan-preview

# Run in headless mode
python t_agent.py "Who invented the telephone?" --headless

# Show cache statistics
python t_agent.py "What is photosynthesis?" --show-counts
```

On the first run (cache miss), the agent works through the full pipeline (a Playwright execution sketch follows this list):

- Planning Phase: LLM breaks down the question into specific subgoals
- Action Generation: LLM creates concrete browser actions for each subgoal
- Execution: Playwright automates the browser to collect information
- Caching: Successful action sequences are stored in the database
- Answer Extraction: LLM synthesizes collected data into final answer
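The execution phase can be sketched with Playwright's sync API as below; the action schema (`op`, `url`, `selector`) is an illustrative format, not necessarily what `t_agent.py` stores.

```python
from playwright.sync_api import sync_playwright

def execute_actions(actions: list[dict], headless: bool = True) -> list[str]:
    """Replay a cached list of browser actions and return any extracted text."""
    collected = []
    with sync_playwright() as p:
        browser = p.chromium.launch(headless=headless)
        page = browser.new_page()
        for step in actions:
            if step["op"] == "goto":
                page.goto(step["url"])
            elif step["op"] == "click":
                page.click(step["selector"])
            elif step["op"] == "extract_text":
                collected.append(page.inner_text(step["selector"]))
        browser.close()
    return collected

# Illustrative cached plan for a birth-date question
plan = [
    {"op": "goto", "url": "https://en.wikipedia.org/wiki/Marie_Curie"},
    {"op": "extract_text", "selector": ".infobox"},
]
print(execute_actions(plan))
```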
On subsequent runs (cache hit), the agent short-circuits the planning work:

- Cache Lookup: System finds similar cached action plans using semantic similarity
- Direct Execution: Skips LLM planning, executes cached actions immediately
- Answer Extraction: LLM synthesizes collected data into final answer
Result: Dramatically faster execution with reduced token usage and lower costs
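Tying the two paths together, a cache-first loop might look like the sketch below. It reuses `client`, `embed_text`, `find_cached_plan`, and `execute_actions` from the earlier sketches; `plan_actions_with_llm` and the prompts are hypothetical stand-ins for the repo's planning calls.

```python
import json

def plan_actions_with_llm(question: str) -> list[dict]:
    """Hypothetical fallback planner: ask the LLM for a JSON list of browser actions."""
    resp = client.chat.completions.create(
        model="gpt-4o-mini",
        response_format={"type": "json_object"},
        messages=[{
            "role": "user",
            "content": f'Return JSON {{"actions": [...]}} with Wikipedia browser steps for: {question}',
        }],
    )
    return json.loads(resp.choices[0].message.content)["actions"]

def answer_question(question: str, cache: list[dict]) -> str:
    """Cache-first loop: reuse a similar stored plan, otherwise plan, execute, and cache."""
    actions = find_cached_plan(question, cache)      # semantic lookup (cache-hit path)
    if actions is None:                              # cache miss: pay for LLM planning once
        actions = plan_actions_with_llm(question)
        cache.append({"embedding": embed_text(question), "actions": actions})
    evidence = execute_actions(actions)              # Playwright replay of the plan
    final = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[{"role": "user",
                   "content": f"Question: {question}\nCollected text: {evidence}\nAnswer concisely."}],
    )
    return final.choices[0].message.content
```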
The system implements a three-tier caching strategy to maximize acceleration:
- LLM Cache: Stores and reuses LLM responses for similar prompts
- Subgoal Cache: Core innovation - Reuses complete action plans for similar research tasks
- Answer Cache: Stores final answers (currently disabled for fresh results)
- Cloud-Based Storage: Stores the cache in a structured Weaviate vector database (an illustrative schema sketch follows this list)
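To show how these tiers could be laid out in a simple embedded store (the `cachedb/` directory is described below as SQLite with vector embeddings; a Weaviate collection could hold the same fields), here is an illustrative schema. Table names, columns, and the file path are assumptions.

```python
import sqlite3

# Illustrative schema: one table per cache tier (names and path are assumptions).
SCHEMA = """
CREATE TABLE IF NOT EXISTS llm_cache (
    prompt_hash TEXT PRIMARY KEY,   -- exact or near-exact match key for repeated prompts
    response    TEXT NOT NULL
);
CREATE TABLE IF NOT EXISTS subgoal_cache (
    question  TEXT NOT NULL,
    embedding BLOB NOT NULL,        -- vector used for semantic similarity lookups
    actions   TEXT NOT NULL         -- JSON-encoded browser action plan
);
CREATE TABLE IF NOT EXISTS answer_cache (
    question  TEXT NOT NULL,
    embedding BLOB NOT NULL,
    answer    TEXT NOT NULL         -- tier currently disabled so results stay fresh
);
"""

conn = sqlite3.connect("cachedb/cache.sqlite3")  # hypothetical path
conn.executescript(SCHEMA)
conn.commit()
conn.close()
```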
Question: "When was Einstein born?"
Cache Lookup: Finds similar cached plan for "birth date research"
Result: Executes cached actions directly, skipping 2-3 LLM planning calls
Speed Improvement: ~70% faster execution, ~60% fewer tokens used
```bash
# Purge cache for specific question
python t_agent.py "Your question here" --purge

# Emergency cleanup of wrong answers
python t_agent.py --emergency-purge
```

The system automatically selects the best available LLM provider (see the sketch after the list below):
- OpenAI (recommended): Reliable JSON responses and usage tracking
- Lightpanda: Cloud-based browser automation
- OpenLLM: Self-hosted models
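A minimal sketch of what that auto-selection could look like: `FORCE_PROVIDER`, `OPENAI_API_KEY`, and `LIGHTPANDA_TOKEN` appear in this README, while the fallback order and `OPENLLM_ENDPOINT` are assumptions.

```python
import os

def select_provider() -> str:
    """Pick an LLM provider: honor FORCE_PROVIDER, otherwise fall back by availability."""
    forced = os.getenv("FORCE_PROVIDER")
    if forced:
        return forced
    if os.getenv("OPENAI_API_KEY"):      # recommended: reliable JSON and usage tracking
        return "openai"
    if os.getenv("LIGHTPANDA_TOKEN"):    # cloud-based browser service
        return "lightpanda"
    if os.getenv("OPENLLM_ENDPOINT"):    # assumed variable for a self-hosted OpenLLM server
        return "openllm"
    raise RuntimeError("No LLM provider configured")

print(select_provider())
```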
Force a specific provider:
```bash
export FORCE_PROVIDER=openai
export OPENAI_API_KEY=sk-...
export OPENAI_MODEL=gpt-4o-mini
```

Project structure:

- `t_agent.py` - Main automation script with caching logic
- `agent_core.py` - Core browser automation logic
- `llm_client.py` - Provider-agnostic LLM interface
- `cachedb/` - SQLite database with vector embeddings for semantic caching
- `cachedb_integrations/` - Cache adapters and integrations
This caching approach provides significant advantages for browser automation:
- Speed: 60-70% faster execution on cache hits
- Cost: 50-60% reduction in token usage
- Reliability: Proven action sequences reduce execution errors
- Scalability: Cache grows smarter with each successful execution
- Python 3.9+
- Playwright with Chromium
- LLM API access (OpenAI, Lightpanda, or OpenLLM)
- Internet connection for Wikipedia access