Skip to content

mbhatt1/COALESCE

Folders and files

NameName
Last commit message
Last commit date

Latest commit

Β 

History

16 Commits
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 

Repository files navigation

COALESCE: Cost-Optimized Agent Labor Exchange via Skill-based Competence Estimation

A comprehensive framework implementing the COALESCE system described in the research paper "A Proposed Framework for Cost-Optimized and Secure Task Outsourcing Among Autonomous LLM Agents via Skill-based Competence Estimation"

Overview

COALESCE is an advanced multi-agent framework that enables autonomous LLM agents to optimize resource utilization and operational costs through intelligent task outsourcing. The framework combines sophisticated multi-criteria decision algorithms, comprehensive cost modeling, and agent-to-agent market dynamics with both theoretical simulation and real-world LLM agent validation.

Key Innovation: The framework includes both mathematical simulation for theoretical validation and real LLM agent implementation using actual API calls to OpenAI GPT-4 and Anthropic Claude-3.5-Sonnet for empirical validation.

Key Features

🧠 Advanced Decision Engine

  • TOPSIS-based Multi-Criteria Analysis: Comprehensive evaluation of cost, reliability, latency, security, and skill compatibility
  • Epsilon-Greedy Exploration: 10% exploration rate for discovering beneficial contractor relationships
  • Game-Theoretic Optimization: Nash equilibrium strategies for optimal resource allocation
  • Dynamic Weight Learning: Adaptive weight adjustment based on historical performance

πŸ’° Comprehensive Cost Modeling

  • Internal Cost Analysis: Compute, memory, energy, opportunity, and depreciation costs
  • External Cost Assessment: Dynamic pricing, communication, verification, integration, risk, and latency penalties
  • Real-time Calibration: EWMA-based cost adjustment using historical performance data
  • Multi-dimensional Risk Assessment: Security, reliability, and quality risk quantification

πŸ”’ Security Framework

  • Multi-layer Security Model: Cryptographic protocols and data protection
  • Risk-based Cost Adjustment: Security risk quantification in decision-making
  • Privacy-preserving Communication: Secure agent-to-agent interaction protocols

🎯 Skill-based Competence Estimation

  • Ontological Skill Matching: Jaccard similarity for skill compatibility
  • Embedding-based Similarity: Cosine similarity for semantic skill matching
  • Historical Performance Integration: Performance-based contractor evaluation
  • Dynamic Skill Assessment: Adaptive skill compatibility scoring

Validated Performance Results

🎯 Key Findings

  • Exploration is Critical: Without epsilon-greedy exploration, real implementation achieved only 1.9% cost reduction
  • Proper Ξ΅-greedy Works: With working exploration (10% rate), performance improved to 20.3% cost reduction
  • API Validation: Confirmed HTTP requests to OpenAI and Anthropic APIs validate real LLM processing
  • Economic Rationality: Framework maintains cost efficiency while enabling beneficial contractor discovery

Installation

  1. Clone or download the COALESCE project
  2. Install dependencies:
pip install -r requirements.txt
  1. Set up API keys for real agent validation (optional):
export OPENAI_API_KEY="your-openai-api-key"
export ANTHROPIC_API_KEY="your-anthropic-api-key"

Quick Start

Mathematical Simulation

Run theoretical validation with mathematical models:

python main.py

Run comprehensive simulation replicating paper results:

python simple_table_replication.py

Real LLM Agent Validation

Run validation with actual LLM API calls:

python real_paper_validation.py

Run epsilon-greedy exploration validation:

python real_paper_validation_epsilon_greedy.py

Custom Configuration

python main.py --duration 7 --agents 25 --contractors 100 --verbose

Real Agent Implementation

πŸ€– Real LLM Contractors

The framework includes actual LLM agent implementations:

  • GPT-4-Real: OpenAI GPT-4 with actual API calls ($2.00/task)
  • Claude-3-Real: Anthropic Claude-3.5-Sonnet with API calls ($1.50/task)
  • Budget-Cloud-Real: Simulated budget provider ($0.80/task)

🏠 Local Agent Execution

  • Real Local Computation: Actual hardware-based cost calculation
  • Hardware Specifications: NVIDIA RTX 3080, Intel Xeon E5-2690 v4
  • Cost Model: $0.45/hour compute, $0.15/hour memory

πŸ“‹ Task Types

Real validation includes:

  • Financial RAG: Document analysis with varying complexity
  • Risk Assessment: Portfolio risk evaluation
  • Portfolio Optimization: Investment allocation optimization

Architecture

Core Modules

  1. Decision Engine (src/decision/decision_engine.py)

    • TOPSIS-based multi-criteria decision analysis
    • Epsilon-greedy exploration (Ξ΅=0.1)
    • Skill compatibility assessment
    • Game-theoretic optimization
    • Dynamic weight learning
  2. Cost Calculator (src/cost_model/cost_calculator.py)

    • Internal cost modeling (compute, memory, energy, opportunity, depreciation)
    • External cost assessment (pricing, communication, verification, integration, risk)
    • Real-time cost calibration using EWMA
    • Multi-dimensional risk quantification
  3. Agent Types (src/agents/agent_types.py)

    • Client and contractor agent definitions
    • Hardware configurations and capabilities
    • Task specifications and requirements
  4. Real Agents (real_agents/)

    • llm_agents.py: Real LLM contractor implementations
    • base_agent.py: Base agent interface
    • marketplace.py: Agent marketplace simulation
  5. Simulation Engine (src/simulation/simulation_engine.py)

    • Market dynamics modeling
    • Performance tracking
    • Statistical analysis

Validation Scripts

πŸ“Š Mathematical Simulation

  • simple_table_replication.py: Replicates paper Table I results (260 runs)
  • main.py: Core simulation engine
  • final_comprehensive.py: Comprehensive analysis framework

πŸ”¬ Real Agent Validation

  • real_paper_validation.py: Basic real agent validation
  • real_paper_validation_epsilon_greedy.py: Proper epsilon-greedy validation
  • real_paper_validation_with_exploration.py: Forced exploration analysis

πŸ“ˆ Analysis Tools

  • analyze_variation.py: Performance variation analysis
  • comprehensive_analysis.py: Statistical validation framework
  • debug_skills.py: Skill compatibility debugging

Configuration

YAML Configuration

The simulation supports YAML configuration files:

simulation_duration_days: 7
num_client_agents: 15
num_contractor_agents: 30
exploration_rate: 0.1
topsis_threshold: 0.6
confidence_threshold: 0.8

Command Line Options

usage: main.py [-h] [--config CONFIG] [--output-dir OUTPUT_DIR] 
               [--duration DURATION] [--agents AGENTS] 
               [--contractors CONTRACTORS] [--verbose]

COALESCE: Cost-Optimized Agent Labor Exchange

optional arguments:
  -h, --help            show this help message and exit
  --config CONFIG       Path to simulation configuration file
  --output-dir OUTPUT_DIR
                        Directory for simulation output and reports
  --duration DURATION   Simulation duration in days
  --agents AGENTS       Number of client agents
  --contractors CONTRACTORS
                        Number of contractor agents
  --verbose             Enable verbose logging

Output and Reports

Simulation Output

  • output/reports/executive_summary.md: High-level performance summary
  • output/data/time_series.csv: Time series performance data
  • output/data/decisions.csv: Individual decision records
  • output/charts/dashboard.png: Performance visualization

Real Agent Results

  • real_paper_validation_*.json: Detailed validation results
  • API call logs and cost calculations
  • Performance metrics and decision analysis

Performance Metrics

Economic Metrics

  • Cost Reduction: Percentage savings from outsourcing
  • Time Savings: Execution time improvement
  • ROI: Return on investment analysis
  • Resource Utilization: Efficiency metrics

Decision Quality Metrics

  • TOPSIS Score: Multi-criteria decision quality (0-1)
  • Confidence Level: Decision confidence assessment
  • Exploration Rate: Percentage of exploration decisions
  • Outsourcing Rate: Contractor engagement frequency

Market Dynamics

  • Supply-Demand Balance: Market equilibrium analysis
  • Price Volatility: Cost variation tracking
  • Contractor Utilization: Resource usage patterns
  • Market Concentration: Competition analysis

Research Applications

Academic Research

  • Multi-agent system optimization
  • Economic mechanism design
  • Autonomous agent coordination
  • Cost optimization algorithms

Industry Applications

  • Cloud resource optimization
  • Distributed computing coordination
  • Supply chain optimization
  • Service marketplace design

Experimental Validation

πŸ”¬ Mathematical Validation

  • Total Runs: 260 successful simulations across 13 configurations
  • Statistical Method: Multiple runs per configuration for robust validation
  • Confidence Level: High statistical confidence with comprehensive error analysis
  • Universal Success: 100% of scenarios achieve substantial performance gains

πŸ€– Real Agent Validation

  • API Integration: Confirmed HTTP requests to OpenAI and Anthropic
  • Cost Validation: Actual token-based cost calculations
  • Performance Verification: Real LLM processing and response analysis
  • Exploration Impact: Demonstrated critical importance of epsilon-greedy exploration

πŸ“Š Key Insights

  1. Exploration is Essential: Without exploration, real agents achieve only 1.9% cost reduction
  2. Proper Ξ΅-greedy Works: 10% exploration rate achieves 20.3% cost reduction
  3. Theory-Practice Gap: Mathematical simulation (50.2%) vs real implementation (20.3%)
  4. Economic Rationality: Framework maintains cost efficiency while enabling discovery

Contributing

The COALESCE framework is designed for research and educational purposes. Contributions welcome:

  1. Fork the repository
  2. Create feature branch
  3. Add tests for new functionality
  4. Submit pull request with detailed description

License

This project implements the research described in the COALESCE paper and is intended for academic and research use.

Caveats and Limitations

Mathematical Simulation

  • Theoretical Models: Uses statistical distributions to model agent behavior
  • Idealized Conditions: Perfect information and simplified market dynamics
  • Parameter Sensitivity: Performance depends on configuration tuning

Real Agent Implementation

  • API Costs: Real validation incurs actual API costs from OpenAI/Anthropic
  • Network Dependencies: Requires stable internet connection for API calls
  • Rate Limiting: Subject to API provider rate limits and availability

Framework Limitations

  • Exploration Dependency: Performance critically depends on working exploration mechanisms
  • Cost Model Accuracy: Real-world costs may differ from theoretical calculations
  • Scale Constraints: Large-scale deployment may face practical limitations

Data Interpretation

  • Zero-Value Results: May indicate restrictive parameters rather than system failure
  • Super-Efficiency: Cost reductions >100% reflect idealized conditions
  • Variance Interpretation: High variance may indicate parameter sensitivity

Contact

For questions about the framework, research collaboration, or technical issues, please use GitHub issues or contact the research team.


COALESCE Framework - Advancing autonomous agent coordination through intelligent task outsourcing and cost optimization.

About

No description, website, or topics provided.

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Languages