🐮 Cache Cow

Zero-config LLM caching for integration tests via DNS interception

Cache Cow is a transparent caching proxy that intercepts LLM API calls (OpenAI, Anthropic, etc.) via DNS override, automatically caching responses to make your test suite 50-100x faster with absolutely zero code changes.

Features

  • Truly zero config - No code changes, no environment variables, no proxy configuration
  • Blazing fast - Redis-backed caching with ~50ms response time for cache hits
  • 🎯 Smart caching - TTL-based with optional probabilistic bypass to catch API changes
  • 📊 Beautiful dashboard - Real-time metrics showing hit rate, latency, and per-provider stats
  • 🐳 Easy deployment - Docker Compose setup, up and running in 1 minute
  • 🔧 GitHub CI ready - Perfect for self-hosted integration test caching
  • 🌐 General purpose - Works with any API, not just LLMs

Quick Start

1. Setup (one command)

./scripts/setup.sh

This automatically:

  • Generates SSL certificate
  • Starts Docker containers (Redis + Cache Cow proxy)
  • Configures DNS overrides in /etc/hosts
  • Installs certificate in Python's certifi bundle

Note: You'll need sudo for DNS configuration, but not for certificate installation.

2. Run Demos

Try these demos with zero code changes - they use standard production code:

# OpenAI SDK demo
uv run python demos/demo_openai.py

# Anthropic SDK demo
uv run python demos/demo_anthropic.py

# PydanticAI demo (uses OpenAI and Anthropic)
uv run python demos/demo_pydantic_ai.py

# HTTPBin demo (shows it works with any API)
uv run python demos/demo_httpbin.py

Each demo shows:

  • First call: ~0.1-2s (cache MISS → real API)
  • Second call: ~0.01s (cache HIT → Redis)
  • 30-600x speedup!

3. Use in Your Code

Your production code works unchanged:

# OpenAI SDK
from openai import OpenAI
client = OpenAI()
response = client.chat.completions.create(
    model="gpt-5.2",
    messages=[{"role": "user", "content": "Hello!"}]
)

# PydanticAI
from pydantic_ai import Agent
agent = Agent('openai:gpt-5.2')
result = agent.run_sync("Hello!")

# Any HTTP library
import requests
response = requests.get("https://api.example.com/data")

First run: ~2 seconds (cache miss; the response is then cached)
Second run: ~0.05 seconds (cache hit) ⚡

4. View Dashboard

Open http://localhost:8001 to see:

  • Real-time cache hit rate
  • Latency comparison (cached vs uncached)
  • Per-provider statistics
  • Request distribution

5. Teardown (one command)

./scripts/teardown.sh

This automatically:

  • Removes certificate from certifi
  • Removes DNS overrides from /etc/hosts
  • Stops Docker containers

Your system returns to normal - API calls go directly to real endpoints.

How It Works

Your Code → api.openai.com
                ↓
          /etc/hosts override (127.0.0.1)
                ↓
         [Cache Cow Proxy]
                ↓
         [Check Redis Cache]
                ↓
    ├─ Cache HIT → Return in ~50ms ⚡
    └─ Cache MISS → Forward to real API → Cache → Return in ~2s

DNS Interception:

  1. /etc/hosts makes api.openai.com resolve to 127.0.0.1; HTTPS clients then connect to port 443 on that address
  2. Cache Cow listens on port 443, receives all requests
  3. Checks Redis cache, serves or forwards to real API
  4. Your application has no idea Cache Cow exists!
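
The serve-or-forward step can be sketched as follows. This is a minimal illustration, not the code in src/reverse_proxy.py: a plain dict stands in for Redis, TTL handling is left out, and `serve_or_forward` and `fake_upstream` are hypothetical names.

```python
from typing import Callable, Dict, Tuple


def serve_or_forward(
    cache: Dict[str, bytes], key: str, forward: Callable[[], bytes]
) -> Tuple[bytes, str]:
    """Return (body, status), where status is "HIT" or "MISS"."""
    cached = cache.get(key)
    if cached is not None:
        return cached, "HIT"
    body = forward()   # cache miss: call the real upstream API
    cache[key] = body  # store the response for later requests
    return body, "MISS"


# Stand-in for the real API so the example runs offline.
upstream_calls = []


def fake_upstream() -> bytes:
    upstream_calls.append(1)
    return b'{"ok": true}'


cache: Dict[str, bytes] = {}
first = serve_or_forward(cache, "k1", fake_upstream)
second = serve_or_forward(cache, "k1", fake_upstream)
print(first[1], second[1], len(upstream_calls))
```

Only the first call reaches the upstream function; the second is answered from the cache.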

Cache Key Generation:

  • Namespace + host + path + normalized prompt + model + max_tokens
  • Ignores: temperature, seed, streaming flags (non-deterministic params)
  • Normalizes whitespace in prompts for better hit rate
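
A key built from those fields might look like the sketch below. It assumes an OpenAI-style JSON body with a `messages` list and uses SHA-256 over a canonical JSON string; the actual field extraction and hash used by Cache Cow are not documented here, and `make_cache_key` is a hypothetical name.

```python
import hashlib
import json
import re
from typing import Any, Dict


def normalize_prompt(text: str) -> str:
    """Collapse runs of whitespace so cosmetic differences share one key."""
    return re.sub(r"\s+", " ", text).strip()


def make_cache_key(namespace: str, host: str, path: str, body: Dict[str, Any]) -> str:
    messages = [
        {"role": m.get("role"), "content": normalize_prompt(str(m.get("content", "")))}
        for m in body.get("messages", [])
    ]
    # Only the listed fields are hashed, so temperature, seed and
    # streaming flags never influence the key.
    material = {
        "namespace": namespace,
        "host": host,
        "path": path,
        "messages": messages,
        "model": body.get("model"),
        "max_tokens": body.get("max_tokens"),
    }
    canonical = json.dumps(material, sort_keys=True)
    digest = hashlib.sha256(canonical.encode("utf-8")).hexdigest()
    return f"{namespace}:{host}:{digest}"


base = {"model": "gpt-5.2", "max_tokens": 64,
        "messages": [{"role": "user", "content": "Hello!"}]}
noisy = {"model": "gpt-5.2", "max_tokens": 64, "temperature": 0.9, "stream": True,
         "messages": [{"role": "user", "content": "  Hello!\n"}]}
other_model = dict(base, model="another-model")

key_base = make_cache_key("default", "api.openai.com", "/v1/chat/completions", base)
key_noisy = make_cache_key("default", "api.openai.com", "/v1/chat/completions", noisy)
key_other = make_cache_key("default", "api.openai.com", "/v1/chat/completions", other_model)
```

Here `key_base` and `key_noisy` are identical, while `key_other` differs because the model is part of the key.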

Configuration

Edit config/cacheCow.yml to customize:

defaults:
  ttl: 86400  # 24 hours
  namespace: default

domains:
  # OpenAI
  - host: api.openai.com
    target: api.openai.com
    port: 443
    ttl: 86400
    cache: true

  # Anthropic
  - host: api.anthropic.com
    target: api.anthropic.com
    port: 443
    ttl: 86400
    cache: true

  # Add your own domains!
  - host: your-api.example.com
    target: your-api.example.com
    port: 443
    ttl: 3600
    cache: true

# Optional: Probabilistic bypass for testing
probabilistic:
  enabled: false
  bypass_rate: 0.1  # 10% of requests bypass cache
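
How a per-domain `ttl` might fall back to `defaults.ttl` can be sketched like this. The dict mirrors what a YAML loader would return for the file above; the fallback rule, the "unlisted hosts are not cached" rule, and the `no-ttl.example.com` entry are assumptions made for illustration.

```python
from typing import Any, Dict, Optional

# Shape a YAML loader would produce for config/cacheCow.yml.
CONFIG: Dict[str, Any] = {
    "defaults": {"ttl": 86400, "namespace": "default"},
    "domains": [
        {"host": "api.openai.com", "target": "api.openai.com",
         "port": 443, "ttl": 86400, "cache": True},
        {"host": "your-api.example.com", "target": "your-api.example.com",
         "port": 443, "ttl": 3600, "cache": True},
        # Hypothetical entry without its own ttl, to show the fallback.
        {"host": "no-ttl.example.com", "target": "no-ttl.example.com",
         "port": 443, "cache": True},
    ],
}


def ttl_for_host(config: Dict[str, Any], host: str) -> Optional[int]:
    """Return the TTL in seconds, or None if the host should not be cached."""
    for domain in config.get("domains", []):
        if domain.get("host") == host:
            if not domain.get("cache", True):
                return None
            return domain.get("ttl", config["defaults"]["ttl"])
    return None  # assumption: hosts that are not listed are not cached


print(ttl_for_host(CONFIG, "your-api.example.com"))
```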

Probabilistic Bypass Mode

Catch API changes while still getting caching benefits:

probabilistic:
  enabled: true
  bypass_rate: 0.1

This will:

  • Cache 90% of requests (fast)
  • Bypass cache for 10% of requests (tests real API)
  • Use deterministic hashing (same request → same decision)
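
One way to get "same request → same decision" is to hash the cache key into a number in [0, 1) and compare it with `bypass_rate`. The sketch below assumes SHA-256 and a hypothetical `should_bypass` helper; the hashing scheme Cache Cow actually uses is not specified here.

```python
import hashlib


def should_bypass(cache_key: str, bypass_rate: float) -> bool:
    """Deterministically decide whether this key skips the cache."""
    digest = hashlib.sha256(cache_key.encode("utf-8")).digest()
    # Map the first 8 bytes of the digest to a float in [0, 1).
    bucket = int.from_bytes(digest[:8], "big") / 2**64
    return bucket < bypass_rate


keys = [f"default:api.openai.com:request-{i}" for i in range(10_000)]
bypassed_fraction = sum(should_bypass(k, 0.1) for k in keys) / len(keys)
repeatable = all(should_bypass(k, 0.1) == should_bypass(k, 0.1) for k in keys[:100])
print(f"bypassed fraction: {bypassed_fraction:.3f}")
```

Because the decision depends only on the key, a given request either always bypasses the cache or never does for a fixed `bypass_rate`.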

API Endpoints

Get Statistics

curl http://localhost:8001/api/stats/default
{
  "namespace": "default",
  "cache_hits": 45,
  "cache_misses": 5,
  "bypassed": 0,
  "total_requests": 50,
  "hit_rate": 90.0,
  "avg_cached_latency_ms": 52,
  "avg_uncached_latency_ms": 2340,
  "speedup": 45
}
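
The derived fields in this response are consistent with simple ratios of the raw counters. The sketch below reproduces them under the assumption that `total_requests` is hits plus misses plus bypassed and that `speedup` is the uncached latency divided by the cached latency; `summarize` is a hypothetical helper, not part of the API.

```python
from typing import Dict, Union

Number = Union[int, float]


def summarize(hits: int, misses: int, bypassed: int,
              avg_cached_ms: Number, avg_uncached_ms: Number) -> Dict[str, Number]:
    total = hits + misses + bypassed
    hit_rate = round(100.0 * hits / total, 1) if total else 0.0
    speedup = round(avg_uncached_ms / avg_cached_ms) if avg_cached_ms else 0
    return {"total_requests": total, "hit_rate": hit_rate, "speedup": speedup}


summary = summarize(hits=45, misses=5, bypassed=0,
                    avg_cached_ms=52, avg_uncached_ms=2340)
print(summary)
```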

Get Per-Provider Statistics

curl http://localhost:8001/api/stats/default/providers
{
  "namespace": "default",
  "providers": [
    {
      "provider": "api.openai.com",
      "hits": 30,
      "misses": 5,
      "total": 35,
      "hit_rate": 85.7
    }
  ]
}

Clear Cache

curl -X DELETE http://localhost:8001/api/cache/default

Health Check

curl http://localhost:8001/api/health

Supported Providers

Out of the box:

  • ✅ OpenAI (api.openai.com)
  • ✅ Anthropic (api.anthropic.com)
  • ✅ Google AI (generativelanguage.googleapis.com)
  • ✅ Cohere (api.cohere.ai)
  • ✅ Together AI (api.together.xyz)

Add any domain by editing config/cacheCow.yml!

Project Structure

cache-cow/
├── README.md              # This file
├── docker-compose.yml     # Multi-container orchestration
├── Dockerfile            # Container image definition
├── requirements.txt      # Production dependencies
│
├── src/                  # Source code
│   ├── reverse_proxy.py  # Core caching proxy
│   ├── api.py           # FastAPI management server
│   └── dashboard.html   # Web dashboard UI
│
├── config/              # Configuration
│   └── cacheCow.yml     # Domain whitelist & settings
│
├── scripts/             # Setup scripts
│   ├── setup.sh         # Complete setup (one command)
│   ├── teardown.sh      # Complete teardown (one command)
│   ├── generate_cert.sh # Generate SSL certificate
│   ├── install_cert.sh  # Install cert in certifi
│   ├── cleanup_cert.sh  # Remove cert from certifi
│   ├── setup_hosts.sh   # Setup DNS overrides
│   ├── cleanup_hosts.sh # Remove DNS overrides
│   └── start.sh         # Container startup (internal)
│
└── demos/               # Demo scripts
    ├── demo_openai.py      # OpenAI SDK demo
    ├── demo_anthropic.py   # Anthropic SDK demo
    ├── demo_pydantic_ai.py # PydanticAI demo (uses Anthropic)
    └── demo_httpbin.py     # HTTPBin (non-LLM) demo

Performance

Architecture Benefits

vs Forward Proxy (HTTP_PROXY):

  • ✅ No environment variables needed
  • ✅ Works with ALL HTTP clients (no cooperation needed)
  • ✅ True zero-config

vs Code Mocking:

  • ✅ No code changes
  • ✅ Tests real HTTP stack
  • ✅ Catches integration issues

vs Recording Fixtures:

  • ✅ No manual fixture management
  • ✅ Automatic cache management
  • ✅ Easy cache invalidation

Troubleshooting

Not caching?

  1. Verify DNS override:

    ping api.openai.com
    # Should show 127.0.0.1
  2. Check Cache Cow is running:

    docker-compose ps
  3. Check logs:

    docker-compose logs proxy
  4. Check dashboard:

    open http://localhost:8001
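
The DNS check from step 1 can also be run from Python, which is handy inside a test suite's setup. This is a small sketch using only the standard library; `resolves_to` is a hypothetical helper, and the resolver is injectable so the function can be exercised without touching real DNS.

```python
import socket
from typing import Callable


def resolves_to(host: str, expected_ip: str,
                resolver: Callable[[str], str] = socket.gethostbyname) -> bool:
    """Return True if host resolves to expected_ip, False on mismatch or lookup error."""
    try:
        return resolver(host) == expected_ip
    except OSError:  # socket.gaierror is a subclass of OSError
        return False


# Offline demonstration with a fake resolver standing in for /etc/hosts.
fake_hosts = {"api.openai.com": "127.0.0.1"}
override_active = resolves_to("api.openai.com", "127.0.0.1", resolver=fake_hosts.__getitem__)
print(override_active)
```

In a real check, call `resolves_to("api.openai.com", "127.0.0.1")` with the default resolver.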

Connection errors?

Make sure Cache Cow is running:

docker-compose ps
# Should show cacheCow-proxy and cacheCow-redis running

Temporarily disable

./scripts/teardown.sh

Production Deployment for CI

Cache Cow can be deployed as a centralized caching service for all your CI test runs. Here's how to make it production-ready:

Note: The current implementation is Python-specific. Certificate installation targets Python's certifi bundle. For other languages (Node.js, Ruby, Go, etc.), you would need system-wide certificate installation (requires sudo), which is not currently implemented or tested.

Architecture

┌─────────────────────────────────────────────────┐
│  Your Infrastructure (AWS/GCP/Azure/Self-hosted) │
│                                                  │
│  ┌────────────────────────────────────────────┐ │
│  │  Cache Cow Service (Single Instance)       │ │
│  │                                             │ │
│  │  ┌─────────────┐   ┌──────────────────┐  │ │
│  │  │   Redis     │   │  Reverse Proxy    │  │ │
│  │  │  (Persistent│◄──┤  (Cache Logic)    │  │ │
│  │  │   Volume)   │   │                   │  │ │
│  │  └─────────────┘   └──────────────────┘  │ │
│  │                                             │ │
│  │  Accessible at: cache-cow.yourcompany.com  │ │
│  └────────────────────────────────────────────┘ │
│                       ▲                          │
└───────────────────────┼──────────────────────────┘
                        │
          ┌─────────────┴─────────────┐
          │                           │
   ┌──────▼───────┐          ┌────────▼──────┐
   │  CI Job #1   │          │  CI Job #2    │
   │              │          │               │
   │ /etc/hosts:  │          │ /etc/hosts:   │
   │ api.openai   │          │ api.openai    │
   │   .com →     │          │   .com →      │
   │ cache-cow IP │          │ cache-cow IP  │
   └──────────────┘          └───────────────┘

1. SSL/TLS Termination

Required for production HTTPS interception.

Option A: Self-Signed Certificate (Internal CI)

# Generate CA certificate
openssl req -x509 -newkey rsa:4096 -keyout ca-key.pem -out ca-cert.pem \
  -days 3650 -nodes -subj "/CN=Cache Cow CA"

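# Generate server certificate
openssl req -newkey rsa:4096 -keyout server-key.pem -out server-csr.pem \
  -nodes -subj "/CN=cache-cow.yourcompany.com"

# Sign it with the CA. Clients connect using the intercepted hostnames, so each
# one must appear as a subjectAltName (list every domain from config/cacheCow.yml)
printf "subjectAltName=DNS:api.openai.com,DNS:api.anthropic.com\n" > san.ext
openssl x509 -req -in server-csr.pem -CA ca-cert.pem -CAkey ca-key.pem \
  -CAcreateserial -out server-cert.pem -days 365 -extfile san.ext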

Update docker-compose.yml:

services:
  proxy:
    environment:
      - SSL_CERT_FILE=/certs/server-cert.pem
      - SSL_KEY_FILE=/certs/server-key.pem
    volumes:
      - ./certs:/certs:ro

CI Setup (Python):

# .github/workflows/test.yml
- name: Install Cache Cow CA cert
  run: |
    curl -o /tmp/cache-cow-ca.pem https://cache-cow.yourcompany.com/ca-cert.pem
    # Python uses certifi bundle, NOT system certificates
    CERTIFI_PATH=$(python -c "import certifi; print(certifi.where())")
    echo "" >> "$CERTIFI_PATH"
    echo "# Cache Cow Certificate" >> "$CERTIFI_PATH"
    cat /tmp/cache-cow-ca.pem >> "$CERTIFI_PATH"

Note: Python's httpx/requests use the certifi bundle and ignore system certificate stores. For other languages (Node.js, Ruby, Go), you may need system-wide cert installation with sudo, but this is not tested.
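
The shell step above appends the CA every time it runs. A Python equivalent that skips the append when the certificate is already present could look like the sketch below; `install_ca` is a hypothetical helper, and the demo writes to a temporary file rather than the real certifi bundle.

```python
import tempfile
from pathlib import Path

MARKER = "# Cache Cow Certificate"


def install_ca(bundle_path: Path, ca_pem: str) -> bool:
    """Append ca_pem to the bundle unless it is already there. Returns True if written."""
    ca_pem = ca_pem.strip()
    if ca_pem in bundle_path.read_text():
        return False
    with bundle_path.open("a") as fh:
        fh.write(f"\n{MARKER}\n{ca_pem}\n")
    return True


# Demo against a throwaway file. In CI the path would come from certifi.where().
with tempfile.TemporaryDirectory() as tmp:
    bundle = Path(tmp) / "cacert.pem"
    bundle.write_text("-----BEGIN CERTIFICATE-----\nEXISTING\n-----END CERTIFICATE-----\n")
    fake_ca = "-----BEGIN CERTIFICATE-----\nCACHECOW\n-----END CERTIFICATE-----"
    first_install = install_ca(bundle, fake_ca)
    second_install = install_ca(bundle, fake_ca)
    marker_count = bundle.read_text().count(MARKER)

print(first_install, second_install, marker_count)
```

Keeping the operation idempotent matters on self-hosted runners, where the same certifi bundle can persist between jobs.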

Option B: Let's Encrypt (Public-facing)

Use nginx as SSL termination proxy:

# docker-compose.prod.yml
services:
  nginx:
    image: nginx:alpine
    ports:
      - "443:443"
    volumes:
      - ./nginx.conf:/etc/nginx/nginx.conf:ro
      - /etc/letsencrypt:/etc/letsencrypt:ro
    depends_on:
      - proxy

  proxy:
    expose:
      - "8443"  # Reachable by nginx on the Compose network only, not published to the host

nginx.conf:

server {
    listen 443 ssl;
    server_name cache-cow.yourcompany.com;

    ssl_certificate /etc/letsencrypt/live/cache-cow.yourcompany.com/fullchain.pem;
    ssl_certificate_key /etc/letsencrypt/live/cache-cow.yourcompany.com/privkey.pem;

    # Forward all domains to Cache Cow
    location / {
        proxy_pass http://proxy:8443;
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
    }
}

2. Persistent Redis Storage

Preserve cache across restarts.

Update docker-compose.yml:

services:
  redis:
    image: redis:7-alpine
    command: redis-server --save 60 1 --loglevel warning
    volumes:
      - redis_data:/data
    restart: always

volumes:
  redis_data:
    driver: local

For cloud deployments:

  • AWS: Use EBS volume or ElastiCache
  • GCP: Use Persistent Disk or Memorystore
  • Azure: Use Azure Disk or Azure Cache for Redis

3. Namespace Isolation

Isolate cache per team/project.

# Team A's tests
- name: Configure Cache Cow
  run: |
    # CACHE_COW_IP must be set in the job or workflow env
    echo "${CACHE_COW_IP} api.openai.com" | sudo tee -a /etc/hosts
    curl -X POST http://cache-cow.yourcompany.com:8001/api/namespace \
      -H "Content-Type: application/json" \
      -d '{"namespace": "team-a"}'

# Team B's tests
- name: Configure Cache Cow
  run: |
    # CACHE_COW_IP must be set in the job or workflow env
    echo "${CACHE_COW_IP} api.openai.com" | sudo tee -a /etc/hosts
    curl -X POST http://cache-cow.yourcompany.com:8001/api/namespace \
      -H "Content-Type: application/json" \
      -d '{"namespace": "team-b"}'

Update config/cacheCow.yml:

defaults:
  namespace: "${NAMESPACE:-default}"  # Environment variable support
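
YAML itself does not expand `${NAMESPACE:-default}`, so the loader has to substitute it after parsing. The sketch below shows one way to do that with shell-like `:-` semantics (unset or empty falls back to the default); `expand_env` is a hypothetical helper, and whether Cache Cow's loader behaves exactly this way is an assumption.

```python
import os
import re
from typing import Mapping

_PLACEHOLDER = re.compile(
    r"\$\{(?P<name>[A-Za-z_][A-Za-z0-9_]*)(?::-(?P<default>[^}]*))?\}"
)


def expand_env(value: str, env: Mapping[str, str] = os.environ) -> str:
    """Replace ${VAR} and ${VAR:-default} placeholders in value."""

    def replace(match: "re.Match[str]") -> str:
        current = env.get(match.group("name"))
        if current:  # unset or empty falls through to the default
            return current
        default = match.group("default")
        return default if default is not None else ""

    return _PLACEHOLDER.sub(replace, value)


unset_result = expand_env("${NAMESPACE:-default}", env={})
set_result = expand_env("${NAMESPACE:-default}", env={"NAMESPACE": "team-a"})
print(unset_result, set_result)
```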

4. GitHub Actions Integration

Complete CI workflow example:

# .github/workflows/integration-tests.yml
name: Integration Tests with Cache Cow

on: [push, pull_request]

env:
  CACHE_COW_HOST: cache-cow.yourcompany.com
  CACHE_COW_IP: 10.0.1.100  # Your Cache Cow server IP

jobs:
  test:
    runs-on: ubuntu-latest

    steps:
      - uses: actions/checkout@v3

      - name: Configure DNS for Cache Cow
        run: |
          echo "${{ env.CACHE_COW_IP }} api.openai.com" | sudo tee -a /etc/hosts
          echo "${{ env.CACHE_COW_IP }} api.anthropic.com" | sudo tee -a /etc/hosts

      - name: Verify Cache Cow connectivity
        run: |
          curl -f http://${{ env.CACHE_COW_HOST }}:8001/api/health || exit 1

      - name: Run integration tests
        env:
          OPENAI_API_KEY: ${{ secrets.OPENAI_API_KEY }}
          ANTHROPIC_API_KEY: ${{ secrets.ANTHROPIC_API_KEY }}
        run: |
          pytest tests/integration/
          # First run: ~5 minutes
          # Subsequent runs: ~30 seconds ⚡

      - name: View Cache Stats
        if: always()
        run: |
          curl http://${{ env.CACHE_COW_HOST }}:8001/api/stats/default

5. Monitoring & Observability

Prometheus Metrics

Add to src/api.py:

from fastapi import Response
from prometheus_client import CONTENT_TYPE_LATEST, Counter, Histogram, generate_latest

# Increment these from the proxy's hit/miss code paths
cache_hits = Counter('cache_cow_hits_total', 'Total cache hits')
cache_misses = Counter('cache_cow_misses_total', 'Total cache misses')
request_latency = Histogram('cache_cow_request_duration_seconds', 'Request latency')

# `app` is the FastAPI instance already defined in src/api.py
@app.get("/metrics")
def metrics():
    return Response(generate_latest(), media_type=CONTENT_TYPE_LATEST)

Grafana Dashboard

services:
  grafana:
    image: grafana/grafana
    ports:
      - "3000:3000"
    volumes:
      - grafana_data:/var/lib/grafana
      - ./grafana/dashboards:/etc/grafana/provisioning/dashboards

# Named volumes must be declared at the top level
volumes:
  grafana_data:

6. Cost Tracking

Track API cost savings.

Add to config/cacheCow.yml:

cost_tracking:
  enabled: true
  pricing:
    api.openai.com:
      gpt-4: 0.03  # per 1K tokens
      gpt-3.5-turbo: 0.002
    api.anthropic.com:
      claude-3-opus: 0.015
      claude-3-sonnet: 0.003
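
With prices expressed per 1K tokens, the saving for a cache hit is the token count divided by 1000, times the price. The sketch below applies that to the table above; `estimated_savings` is a hypothetical helper, and how Cache Cow counts tokens for cached responses is not specified here.

```python
from typing import Dict

# Prices per 1K tokens, copied from the cost_tracking config above.
PRICING_PER_1K: Dict[str, Dict[str, float]] = {
    "api.openai.com": {"gpt-4": 0.03, "gpt-3.5-turbo": 0.002},
    "api.anthropic.com": {"claude-3-opus": 0.015, "claude-3-sonnet": 0.003},
}


def estimated_savings(host: str, model: str, cached_tokens: int) -> float:
    """Cost avoided by serving cached_tokens from the cache instead of the API."""
    price = PRICING_PER_1K.get(host, {}).get(model)
    if price is None:
        return 0.0  # unknown model: no price, so no estimate
    return cached_tokens / 1000 * price


gpt4_savings = estimated_savings("api.openai.com", "gpt-4", 200_000)
print(f"{gpt4_savings:.2f}")
```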

7. Deployment Checklist

  • Set up SSL/TLS termination
  • Configure persistent Redis storage
  • Implement namespace isolation
  • Set up monitoring (Prometheus/Grafana)
  • Configure log aggregation
  • Set up automated backups for Redis
  • Document CA certificate distribution
  • Test failover behavior
  • Set up alerts for high error rates
  • Configure rate limiting (optional)
  • Set up cost tracking dashboard
  • Document DNS configuration for CI

8. Infrastructure as Code

Terraform Example (AWS):

resource "aws_ecs_service" "cache_cow" {
  name            = "cache-cow"
  cluster         = aws_ecs_cluster.main.id
  task_definition = aws_ecs_task_definition.cache_cow.arn
  desired_count   = 1

  load_balancer {
    target_group_arn = aws_lb_target_group.cache_cow.arn
    container_name   = "cache-cow-proxy"
    container_port   = 443
  }
}

resource "aws_elasticache_cluster" "cache_cow_redis" {
  cluster_id           = "cache-cow-redis"
  engine               = "redis"
  node_type            = "cache.t3.micro"
  num_cache_nodes      = 1
  parameter_group_name = "default.redis7"
}

9. Security Considerations

  • API Key Security: Cache Cow sees API keys in requests. Use network isolation and encryption at rest.
  • Access Control: Restrict dashboard access with authentication
  • Audit Logging: Log all cache operations for compliance
  • Rate Limiting: Prevent abuse of the caching service
  • Network Policies: Use VPC/subnet isolation in cloud environments

10. Expected Savings

For a typical team running 100 CI jobs/day:

  • Without Cache Cow: 100 jobs × 5 min = 500 minutes
  • With Cache Cow: 100 jobs × 0.5 min = 50 minutes
  • Time Saved: 450 minutes/day = 7.5 hours/day
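
The same arithmetic as a small function, so the numbers can be swapped for your own job counts and durations. The inputs are the illustrative figures from the list above, not measurements, and `daily_time_saved` is a hypothetical helper.

```python
from typing import Tuple


def daily_time_saved(jobs_per_day: int, minutes_uncached: float,
                     minutes_cached: float) -> Tuple[float, float]:
    """Return (minutes saved per day, hours saved per day)."""
    saved_minutes = jobs_per_day * (minutes_uncached - minutes_cached)
    return saved_minutes, saved_minutes / 60


saved_minutes, saved_hours = daily_time_saved(100, 5.0, 0.5)
print(saved_minutes, saved_hours)
```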

License

MIT

Credits

Built with:


MOOOOOOO! 🐮
