Skip to content

Lyadalachanchu/action-cache

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

23 Commits
 
 
 
 
 
 
 
 
 
 

Repository files navigation

Action Cache - Accelerated Browser Agent with Intelligent Caching

A concept demonstration of how intelligent caching can dramatically accelerate browser automation agents by storing and reusing action plans, eliminating the need for repeated LLM "thinking" about common tasks.

Main files are in /bday

Core Concept

Instead of having the LLM plan every action from scratch, this system:

  1. Caches Action Plans: Stores successful action sequences in a database
  2. Retrieves Similar Plans: Finds cached plans for similar tasks using semantic similarity
  3. Executes Directly: Skips the planning phase and goes straight to execution
  4. Accelerates Performance: Reduces latency and token costs by avoiding redundant LLM calls

Key Features

  • Action Plan Caching: Stores granular subgoals and browser actions for reuse
  • Semantic Retrieval: Finds similar cached plans using vector similarity
  • Zero-Thinking Execution: Bypasses LLM planning for cached action sequences
  • Multi-Level Caching: LLM responses, subgoal plans, and execution results
  • Browser Automation: Uses Playwright to navigate Wikipedia pages
  • Provider Flexibility: Supports OpenAI, Lightpanda, and OpenLLM providers

Quick Start

  1. Install dependencies:

    pip install -r requirements.txt
    playwright install chromium
  2. Set up environment variables (create a .env file):

    # Required: Choose one LLM provider
    OPENAI_API_KEY=sk-your-key-here
    OPENAI_MODEL=gpt-4o-mini
    
    # Optional: Lightpanda browser service
    LIGHTPANDA_TOKEN=your-token-here
  3. Run the agent:

    cd bday
    python t_agent.py "When was Marie Curie born?"

Usage Examples

# Basic research question
python t_agent.py "What year was Einstein born?"

# Force new plan (bypass cache)
python t_agent.py "Compare Taylor Swift and Beyoncé's Grammy wins" --force-plan

# Preview stored plan without execution
python t_agent.py "When did World War II end?" --plan-preview

# Run in headless mode
python t_agent.py "Who invented the telephone?" --headless

# Show cache statistics
python t_agent.py "What is photosynthesis?" --show-counts

How It Works

First Run (Cache Miss)

  1. Planning Phase: LLM breaks down question into specific subgoals
  2. Action Generation: LLM creates concrete browser actions for each subgoal
  3. Execution: Playwright automates the browser to collect information
  4. Caching: Successful action sequences are stored in the database
  5. Answer Extraction: LLM synthesizes collected data into final answer

Subsequent Runs (Cache Hit)

  1. Cache Lookup: System finds similar cached action plans using semantic similarity
  2. Direct Execution: Skips LLM planning, executes cached actions immediately
  3. Answer Extraction: LLM synthesizes collected data into final answer

Result: Dramatically faster execution with reduced token usage and lower costs

Cache Architecture

The system implements a three-tier caching strategy to maximize acceleration:

  • LLM Cache: Stores and reuses LLM responses for similar prompts
  • Subgoal Cache: Core innovation - Reuses complete action plans for similar research tasks
  • Answer Cache: Stores final answers (currently disabled for fresh results)
  • Cloud Based Database Stores cache in a structured Weaviates vector database

Cache Hit Example

Question: "When was Einstein born?"
Cache Lookup: Finds similar cached plan for "birth date research"
Result: Executes cached actions directly, skipping 2-3 LLM planning calls
Speed Improvement: ~70% faster execution, ~60% fewer tokens used
# Purge cache for specific question
python t_agent.py "Your question here" --purge

# Emergency cleanup of wrong answers
python t_agent.py --emergency-purge

Configuration

The system automatically selects the best available LLM provider:

  • OpenAI (recommended): Reliable JSON responses and usage tracking
  • Lightpanda: Cloud-based browser automation
  • OpenLLM: Self-hosted models

Force a specific provider:

export FORCE_PROVIDER=openai
export OPENAI_API_KEY=sk-...
export OPENAI_MODEL=gpt-4o-mini

Project Structure

  • t_agent.py - Main automation script with caching logic
  • agent_core.py - Core browser automation logic
  • llm_client.py - Provider-agnostic LLM interface
  • cachedb/ - SQLite database with vector embeddings for semantic caching
  • cachedb_integrations/ - Cache adapters and integrations

Performance Benefits

This caching approach provides significant advantages for browser automation:

  • Speed: 60-70% faster execution on cache hits
  • Cost: 50-60% reduction in token usage
  • Reliability: Proven action sequences reduce execution errors
  • Scalability: Cache grows smarter with each successful execution

Requirements

  • Python 3.9+
  • Playwright with Chromium
  • LLM API access (OpenAI, Lightpanda, or OpenLLM)
  • Internet connection for Wikipedia access

About

Making web agents faster by remembering past actions

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Contributors 3

  •  
  •  
  •  

Languages