AnyTool: Universal Tool-Use Layer for AI Agents

✨ One Line of Code to Supercharge Any Agent with Fast, Scalable and Powerful Tool Use ✨

| ⚡ Fast - Lightning Tool Retrieval | 📈 Self-Evolving Tool Orchestration | ⚡ Universal Tool Automation |

🎯 What is AnyTool?

AnyTool is a Universal Tool-Use Layer that transforms how AI agents interact with tools. It solves three fundamental challenges that prevent reliable agent automation: overwhelming tool contexts, unreliable community tools, and limited capability coverage -- delivering the first truly intelligent tool orchestration system for production AI agents.

💡 Research Highlights

⚡ Fast - Lightning Tool Retrieval

  • Smart Context Management: Progressive tool filtering delivers exact tools in milliseconds through a multi-stage pipeline, eliminating context pollution while maintaining speed.

  • Zero-Waste Processing: Pre-computed embeddings and lazy initialization eliminate redundant processing - tools are instantly ready across all executions.

📈 Scalable - Self-Evolving Tool Orchestration

  • Adaptive MCP Tool Selection: Smart caching and selective re-indexing maintain constant performance from 10 to 10,000 tools with optimal resource usage.

  • Self-Evolving Tool Optimization: System continuously improves through persistent memory, becoming more efficient as your tool ecosystem expands.

🌍 Powerful - Universal Tool Automation

  • Quality-Aware Selection: Built-in reliability tracking and safety controls deliver production-ready automation through persistent learning and execution safeguards.

  • Universal Tool-Use Capability: Multi-backend architecture seamlessly extends beyond web APIs to system operations, GUI automation, and deep research through a unified interface.

⚡ Easy-to-Use and Effortless Integration

One line to get intelligent tool orchestration. Zero-config setup transforms complex multi-tool workflows into a single API call.

import asyncio
from anytool import AnyTool

async def main():
    # One line to get intelligent tool orchestration
    async with AnyTool() as tool_layer:
        result = await tool_layer.execute(
            "Research trending AI coding tools from GitHub and tech news, "
            "collect their features and user feedback, analyze adoption patterns, "
            "then create a comparison report with insights"
        )

# `async with` is only valid inside a coroutine, so run it through asyncio
asyncio.run(main())


🎯 Quick Start

1. Environment Setup

# Clone repository
git clone https://github.com/HKUDS/AnyTool.git
cd AnyTool

# Create and activate conda environment
conda create -n anytool python=3.12 -y
conda activate anytool

# Install dependencies
pip install -r requirements.txt

Note

Create a .env file and add your API keys (refer to anytool/.env.example).

2. Start Local Server

The local server is a lightweight Flask service that enables AnyTool to interact with your computer (GUI automation, Python/Bash execution, file operations, screen capture, etc.).

Note

See anytool/local_server/README.md for complete API documentation and advanced configuration.

Important

Platform-specific setup required: Different operating systems need different dependencies for desktop control. Please install the required dependencies for your OS before starting the local server:

macOS Setup
# Install macOS-specific dependencies
pip install pyobjc-core pyobjc-framework-cocoa pyobjc-framework-quartz atomacos

Permissions Required: macOS will automatically prompt for permissions when you first run the local server. Grant the following:

  • Accessibility (for GUI control)
  • Screen Recording (for screenshots and video capture)

If prompts don't appear, manually grant permissions in System Settings → Privacy & Security.

Linux Setup
# Install Linux-specific dependencies
pip install python-xlib pyatspi numpy

# Install system packages
sudo apt install at-spi2-core python3-tk scrot

Windows Setup
# Install Windows-specific dependencies
pip install pywinauto pywin32 PyGetWindow

After installing the platform-specific dependencies, start the local server:

python -m anytool.local_server.main

Tip

Local server is required for GUI automation and Python/Bash execution. Without it, only MCP servers and web research capabilities are available.

3. Quick Integration

AnyTool is a plug-and-play Universal Tool-Use Layer for any AI agent. The task passed to execute() can come from your agent's planning module, user input, or any workflow system.

import asyncio
from anytool import AnyTool
from anytool.tool_layer import AnyToolConfig

async def main():
    config = AnyToolConfig(
        enable_recording=True,
        recording_backends=["gui", "shell", "mcp", "web"],
        enable_screenshot=True,
        enable_video=True,
    )
    
    async with AnyTool(config=config) as tool_layer:
        result = await tool_layer.execute(
            "Research trending AI coding tools from GitHub and tech news, "
            "collect their features and user feedback, analyze adoption patterns, "
            "then create a comparison report with insights"
        )
        print(result["response"])

asyncio.run(main())

Tip

MCP Server Configuration: For tasks requiring specific tools, add relevant MCP servers to anytool/config/config_mcp.json. Unsure which servers to add? Simply add all potentially useful ones; AnyTool's Smart Tool RAG will automatically select the appropriate tools for your task. See MCP Configuration for details.


Technical Innovation & Implementation

🧩 Challenge 1: MCP Tool Context Overload

The Problem. Current MCP agents suffer from a fundamental design flaw: they load ALL configured servers and tools at every execution step, creating an overwhelming action space. This causes three critical issues:

  • ⚡ Slow Performance with Massive Context Loading
    Complete tool set from all pre-configured servers loaded simultaneously at every step, degrading execution speed

  • 🎯 Poor Accuracy from Blind Tool Setup
    Users cannot preview tools before connecting, leading to over-setup "just in case" and confusing tool selection

  • 💸 Resource Waste with No Memory
    Same tools reloaded at every execution step with no caching, causing redundant loading

✅ AnyTool's Solution: Tool Context Management Framework

Motivation: "Load Everything" → "Retrieve What's Needed"
Improvement: Faster tool selection, cleaner context, and efficient resource usage through smart retrieval and memory.

Technical Innovation:

🎯 Multi-Stage Tool Retrieval Pipeline

  • Progressive MCP Tool Filtering: server selection → tool name matching → tool semantic search → LLM ranking
  • Reduces MCP Tool Search Space: Each stage narrows the set of candidate tools, improving both precision and speed
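
The four stages above can be illustrated with a small, self-contained sketch. It is not AnyTool's implementation: `Tool`, `retrieve`, and the token-overlap scoring are hypothetical stand-ins, a bag-of-words cosine similarity takes the place of real embeddings, and the final LLM ranking stage is left as a comment.

```python
import math
import re
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Tool:
    """Hypothetical tool record: the server it belongs to, its name, its description."""
    server: str
    name: str
    description: str


def tokens(text: str) -> list[str]:
    """Lowercase alphanumeric tokens; underscores act as separators."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two bag-of-words vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def retrieve(query: str, tools: list[Tool], top_k: int = 3) -> list[Tool]:
    q = Counter(tokens(query))
    # Stage 1: server selection. Keep servers with at least one tool that shares a query token.
    servers = {t.server for t in tools if q.keys() & set(tokens(f"{t.server} {t.description}"))}
    candidates = [t for t in tools if t.server in servers]
    # Stage 2: tool name matching. Prefer tools whose name shares a token with the query.
    named = [t for t in candidates if q.keys() & set(tokens(t.name))]
    candidates = named or candidates
    # Stage 3: semantic search (stand-in: bag-of-words cosine instead of embeddings).
    ranked = sorted(
        candidates,
        key=lambda t: cosine(q, Counter(tokens(f"{t.name} {t.description}"))),
        reverse=True,
    )
    # Stage 4: an LLM would re-rank `ranked` here; omitted in this sketch.
    return ranked[:top_k]


catalog = [
    Tool("github", "search_repositories", "Search GitHub repositories by keyword"),
    Tool("github", "create_issue", "Open a new issue in a repository"),
    Tool("weather", "get_forecast", "Get the weather forecast for a city"),
]
result = retrieve("search github repositories for AI coding tools", catalog)
print([t.name for t in result])
```

Each stage only sees what the previous one kept, so the most expensive step (the LLM call in a real system) runs on the smallest candidate set.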

💾 Long-Term Tool Memory

  • Save Once, Use Forever: Pre-compute tool embeddings once and save them to disk for instant reuse
  • Zero Waste Processing: No more redundant processing - tools are ready to use immediately across all execution steps
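
A minimal sketch of the "save once, reuse" idea, assuming a JSON file keyed by a hash of each tool's name and description. `EmbeddingCache`, `content_key`, and `toy_embed` are hypothetical names for illustration; the real cache format and embedding model are not described here.

```python
import hashlib
import json
import tempfile
from pathlib import Path
from typing import Callable


def content_key(name: str, description: str) -> str:
    """Stable key that changes whenever the tool's name or description changes."""
    return hashlib.sha256(f"{name}\n{description}".encode("utf-8")).hexdigest()


class EmbeddingCache:
    """Disk-backed cache: an embedding is computed once, then reloaded on later runs."""

    def __init__(self, path: Path, embed: Callable[[str], list[float]]):
        self.path = path
        self.embed = embed
        self.computed = 0  # how many embeddings this instance actually had to compute
        self.store: dict[str, list[float]] = (
            json.loads(path.read_text()) if path.exists() else {}
        )

    def get(self, name: str, description: str) -> list[float]:
        key = content_key(name, description)
        if key not in self.store:
            self.store[key] = self.embed(f"{name}: {description}")
            self.computed += 1
            self.path.write_text(json.dumps(self.store))
        return self.store[key]


def toy_embed(text: str) -> list[float]:
    """Placeholder for a real embedding model (assumption, for illustration only)."""
    return [float(len(text)), float(sum(map(ord, text)) % 997)]


cache_file = Path(tempfile.mkdtemp()) / "embedding_cache.json"

first_run = EmbeddingCache(cache_file, toy_embed)
v1 = first_run.get("search_repositories", "Search GitHub repositories")

second_run = EmbeddingCache(cache_file, toy_embed)  # simulates a later execution
v2 = second_run.get("search_repositories", "Search GitHub repositories")

print(first_run.computed, second_run.computed)
```

Keying on a content hash rather than the tool name means an edited description naturally misses the cache and is re-embedded, while unchanged tools are never reprocessed.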

🧠 Adaptive Tool Selection

  • Adaptive MCP Tool Ranking: LLM-based tool selection refinement triggered only when MCP tool results are large or ambiguous
  • Tool Selection Efficiency: Balances MCP tool accuracy with computational efficiency
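
One way to picture the "only when large or ambiguous" trigger is a small predicate over similarity scores. This is a sketch under assumptions: `needs_llm_rerank` and `min_margin` are hypothetical, and the default of 50 candidates simply mirrors the `llm_filter_threshold` default listed in the configuration guide.

```python
def needs_llm_rerank(
    scores: list[float],
    max_candidates: int = 50,
    min_margin: float = 0.05,
) -> bool:
    """Decide whether to spend an LLM call on re-ranking.

    `scores` are similarity scores sorted in descending order.
    """
    # Large result set: too many candidates to hand to the agent directly.
    if len(scores) > max_candidates:
        return True
    # Ambiguous result set: the top two candidates are nearly tied.
    if len(scores) >= 2 and scores[0] - scores[1] < min_margin:
        return True
    return False


print(needs_llm_rerank([0.90, 0.40]))   # clear winner
print(needs_llm_rerank([0.62, 0.61]))   # near tie
```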

🚀 On-Demand Resource Management

  • Lazy MCP Server Startup: MCP server initialization triggered only when specific tools are needed
  • Selective Tool Updates: Incremental re-indexing of only changed MCP tools, not the entire tool set
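
Both ideas can be sketched in a few lines. `LazyServerPool` and `changed_tools` are hypothetical helpers, not AnyTool classes: the pool starts a server only on first use, and the diff function picks out the tools whose content digest changed so that only those need re-indexing.

```python
from typing import Callable


class LazyServerPool:
    """Starts a server session the first time one of its tools is requested."""

    def __init__(self, factories: dict[str, Callable[[], object]]):
        self._factories = factories          # server name -> function that starts it
        self._running: dict[str, object] = {}

    def get(self, server: str) -> object:
        if server not in self._running:
            self._running[server] = self._factories[server]()
        return self._running[server]

    @property
    def started(self) -> list[str]:
        return sorted(self._running)


def changed_tools(old: dict[str, str], new: dict[str, str]) -> set[str]:
    """Names of tools that are new or whose digest differs from the indexed one."""
    return {name for name, digest in new.items() if old.get(name) != digest}


pool = LazyServerPool({
    "github": lambda: "github-session",    # stand-ins for real server startup
    "weather": lambda: "weather-session",
})
before = pool.started
session = pool.get("github")
after = pool.started

to_reindex = changed_tools(
    {"search": "a1", "create_issue": "b2"},
    {"search": "a1", "create_issue": "b3", "list_prs": "c4"},
)
print(before, after, sorted(to_reindex))
```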

🚨 Challenge 2: MCP Tool Quality Issues

The Problem. Current MCP servers suffer from community contribution challenges that create three critical issues:

  • 🔍 Poor Tool Descriptions
    Misleading claims, non-existent advertised tools, and vague capability specifications lead to wrong tool selection.

  • 📊 No Reliability Signals
    Cannot assess MCP tool quality before use, causing blind selection decisions.

  • ⚠️ Security and Safety Gaps
    Unvetted community tools may execute dangerous operations without proper safeguards.

✅ AnyTool Solution: Self-Contained Quality Management

Motivation: "Blind Tool Trust" → "Smart Quality Assessment"
Improvement: Reliable tool selection, safe execution, and autonomous recovery through quality tracking and safety controls.

Technical Innovation:

🎯 Quality-Aware Tool Selection

  • 📝 Description Quality Check: LLM-based evaluation of MCP tool description clarity and completeness.
  • 📈 Performance-Based Ranking: Track call/success rates for each MCP tool in persistent memory to prioritize reliable options.

💾 Learning-Based Tool Memory

  • 🧠 Track Tool Performance: Remember which MCP tools work well and which fail over time.
  • ⚡ Smart Tool Prioritization: Automatically rank tools based on past success rates and description quality.
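
As an illustration of ranking by past success, the sketch below blends a relevance score with a smoothed success rate. The names (`ToolStats`, `rank_tools`), the 0.3 weight, and the Laplace smoothing are assumptions chosen for the example, not the scoring AnyTool actually uses.

```python
from dataclasses import dataclass


@dataclass
class ToolStats:
    """Call and success counters for one tool."""
    calls: int = 0
    successes: int = 0

    def record(self, ok: bool) -> None:
        self.calls += 1
        self.successes += int(ok)

    @property
    def success_rate(self) -> float:
        # Laplace smoothing: an unseen tool starts at 0.5 instead of 0 or 1.
        return (self.successes + 1) / (self.calls + 2)


def rank_tools(
    relevance: dict[str, float],
    stats: dict[str, ToolStats],
    quality_weight: float = 0.3,
) -> list[str]:
    """Order tool names by a weighted mix of relevance and observed reliability."""
    def score(name: str) -> float:
        rate = stats.get(name, ToolStats()).success_rate
        return (1 - quality_weight) * relevance[name] + quality_weight * rate

    return sorted(relevance, key=score, reverse=True)


history = {"flaky": ToolStats(calls=10, successes=1), "steady": ToolStats(calls=10, successes=9)}
ranking = rank_tools({"flaky": 0.80, "steady": 0.75}, history)
print(ranking)
```

Here the slightly less relevant but far more reliable tool is ranked first, which is the behavior the bullets above describe.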

🛡️ Safety-First Execution

  • 🚫 Block Dangerous Operations: Prevent arbitrary code execution and require user approval for sensitive MCP tool operations.
  • 🔒 Execution Safeguards: Built-in safety controls for all MCP tool executions.

🚀 Self-Healing Tool Management

  • 🎯 Autonomous Tool Switching: Switch failed MCP tools locally without restarting expensive planning loops.
  • 🔄 Local Failure Recovery: Automatically switch to alternative MCP tools on failure without escalating to upper-level agents.
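
Local failure recovery can be pictured as trying ranked alternatives in order and only raising once every candidate has failed. `call_with_fallback` and `AllToolsFailed` are hypothetical names for this sketch; the tool callables are stand-ins for real MCP tool invocations.

```python
from typing import Callable, Sequence


class AllToolsFailed(RuntimeError):
    """Raised only after every candidate tool has been tried."""


def call_with_fallback(
    candidates: Sequence[tuple[str, Callable[[], str]]],
) -> tuple[str, str]:
    """Try each (name, call) pair in order; return the first success."""
    errors = []
    for name, call in candidates:
        try:
            return name, call()
        except Exception as exc:  # a failed tool should not abort the whole task
            errors.append(f"{name}: {exc}")
    raise AllToolsFailed("; ".join(errors))


def broken_search() -> str:
    raise TimeoutError("no response")


used, output = call_with_fallback([
    ("primary_search", broken_search),
    ("backup_search", lambda: "3 results"),
])
print(used, output)
```

The caller (for example, a planning agent) sees only the successful result, so the failure never has to be escalated unless all alternatives are exhausted.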

🔄 Challenge 3: Limited MCP Capability Scope

The Problem. The current MCP ecosystem focuses primarily on Web APIs and online services, creating significant automation gaps that prevent comprehensive task completion:

  • 🖥️ Missing System Operations
    No native support for file manipulation, process management, or command execution on local systems.

  • 🖱️ No Desktop Automation
    Cannot control GUI applications that lack APIs, limiting automation to web-only scenarios.

  • 📊 Incomplete Tool Coverage
    Limited server categories in the community and incomplete tool sets within existing servers create workflow bottlenecks.

✅ AnyTool Solution: Universal Capability Extension
(MCP + System Commands + GUI Control ≈ Universal Task Completion)

Motivation: "Web-Only MCP" → "Universal Task Completion"
Improvement: Complete automation coverage through multi-backend architecture that seamlessly extends MCP capabilities beyond web APIs.

🏗️ Multi-Backend Architecture

  • MCP Backend: Community servers for Web APIs and online services
  • Shell Backend: Bash/Python execution for system-level operations and file management
  • GUI Backend: Pixel-level automation for any visual application without API requirements
  • Web Backend: Deep web research and data extraction capabilities

💡 Self-Evolving Capability Discovery

  • Intelligent Gap Detection: Planning agent identifies when MCP tools are insufficient for task requirements
  • Automatic Backend Selection: Shell/GUI backends automatically fill capability gaps without manual intervention
  • Dynamic Capability Expansion: Previously impossible tasks become achievable through backend combination

🎭 Unified Tool Orchestration

  • Uniform Tool Schema: All backends expose an identical interface for seamless agent tool selection
  • Transparent Backend Switching: Agents select optimal tools across backend types without knowing implementation details
  • Intelligent Tool Routing: Automatic routing to the most appropriate backend based on task requirements
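
A uniform schema means the caller never branches on backend type. The sketch below is an assumption-laden illustration, not AnyTool's classes: `UnifiedTool` and `ToolRegistry` are hypothetical, and the two registered tools are trivial stand-ins for a shell command and an MCP call.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class UnifiedTool:
    """One schema for every backend: name, backend tag, description, callable."""
    name: str
    backend: str
    description: str
    run: Callable[[dict], str]


class ToolRegistry:
    """Routes a call by tool name; the caller never sees which backend serves it."""

    def __init__(self) -> None:
        self._tools: dict[str, UnifiedTool] = {}

    def register(self, tool: UnifiedTool) -> None:
        self._tools[tool.name] = tool

    def call(self, name: str, arguments: dict) -> str:
        return self._tools[name].run(arguments)

    def backends(self) -> list[str]:
        return sorted({t.backend for t in self._tools.values()})


registry = ToolRegistry()
registry.register(UnifiedTool("echo", "shell", "Echo text back", lambda a: a["text"]))
registry.register(UnifiedTool("search_repositories", "mcp", "Search repositories",
                              lambda a: f"results for {a['query']}"))

print(registry.call("echo", {"text": "hi"}))
print(registry.call("search_repositories", {"query": "agents"}))
print(registry.backends())
```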

🚀 Seamless Integration Layer

  • Single Tool Interface: Unified API that abstracts away backend complexity from AI agents.
  • Cross-Backend Coordination: Enable complex workflows that span multiple backend capabilities.
  • Consistent Safety Controls: Apply security and safety measures uniformly across all backend types.

🔧 Configuration Guide

Configuration Overview

AnyTool uses a layered configuration system:

  • config_dev.json (highest priority): Local development settings that override all other configuration files
  • config_agents.json: Agent definitions and backend access control
  • config_mcp.json: MCP server registry
  • config_grounding.json: Backend-specific settings and Smart Tool RAG configuration
  • config_security.json: Security policies with runtime user confirmation for sensitive operations

Agent Configuration

Path: anytool/config/config_agents.json

Purpose: Define agent roles, control backend access scope, and set execution limits to prevent infinite loops.

Example configuration:

{
  "agents": [
    {
      "name": "GroundingAgent",
      "class_name": "GroundingAgent",
      "backend_scope": ["gui", "shell", "mcp", "system", "web"],
      "max_iterations": 20
    }
  ]
}

Key Fields:

| Field | Description | Options/Example |
|---|---|---|
| backend_scope | Accessible backends | [] or any combination of ["gui", "shell", "mcp", "system", "web"] |
| max_iterations | Maximum execution cycles | Any integer (e.g., 15, 20, 50) or null (unlimited) |
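
Before starting an agent, the two key fields can be sanity-checked with a short script. This is a sketch, not part of AnyTool: `check_agents` is a hypothetical helper, and rejecting values below 1 for `max_iterations` is an assumption on top of the "any integer or null" rule above.

```python
import json

ALLOWED_BACKENDS = {"gui", "shell", "mcp", "system", "web"}

EXAMPLE = """
{
  "agents": [
    {
      "name": "GroundingAgent",
      "class_name": "GroundingAgent",
      "backend_scope": ["gui", "shell", "mcp", "system", "web"],
      "max_iterations": 20
    }
  ]
}
"""


def check_agents(raw: str) -> list[str]:
    """Return a list of human-readable problems; an empty list means the config looks fine."""
    problems = []
    for agent in json.loads(raw)["agents"]:
        name = agent.get("name", "<unnamed>")
        unknown = set(agent.get("backend_scope", [])) - ALLOWED_BACKENDS
        if unknown:
            problems.append(f"{name}: unknown backends {sorted(unknown)}")
        limit = agent.get("max_iterations")  # null / missing means unlimited
        if limit is not None and (isinstance(limit, bool) or not isinstance(limit, int) or limit < 1):
            problems.append(f"{name}: max_iterations must be a positive integer or null")
    return problems


print(check_agents(EXAMPLE))
print(check_agents('{"agents": [{"name": "x", "backend_scope": ["browser"]}]}'))
```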

MCP Configuration

Path: anytool/config/config_mcp.json

Purpose: Register MCP servers with connection details. AnyTool automatically discovers tools from all registered servers and makes them available through Smart Tool RAG.

Example configuration:

{
  "mcpServers": {
    "github": {
      "command": "npx",
      "args": ["-y", "@modelcontextprotocol/server-github"],
      "env": {
        "GITHUB_PERSONAL_ACCESS_TOKEN": "${GITHUB_TOKEN}"
      }
    }
  }
}
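
The `${GITHUB_TOKEN}` value above is a placeholder for an environment variable. This README does not state how or when AnyTool resolves it, so the following is only a sketch of one way such placeholders could be expanded; `expand_env` is a hypothetical helper, and leaving unknown variables untouched is a design assumption.

```python
import os
import re
from typing import Any, Mapping, Optional

_PLACEHOLDER = re.compile(r"\$\{([A-Za-z_][A-Za-z0-9_]*)\}")


def expand_env(value: Any, env: Optional[Mapping[str, str]] = None) -> Any:
    """Recursively replace ${VAR} in strings; unknown variables are left as-is."""
    env = os.environ if env is None else env
    if isinstance(value, str):
        return _PLACEHOLDER.sub(lambda m: env.get(m.group(1), m.group(0)), value)
    if isinstance(value, dict):
        return {k: expand_env(v, env) for k, v in value.items()}
    if isinstance(value, list):
        return [expand_env(v, env) for v in value]
    return value


mcp_config = {
    "mcpServers": {
        "github": {
            "command": "npx",
            "args": ["-y", "@modelcontextprotocol/server-github"],
            "env": {"GITHUB_PERSONAL_ACCESS_TOKEN": "${GITHUB_TOKEN}"},
        }
    }
}

resolved = expand_env(mcp_config, {"GITHUB_TOKEN": "example-token"})
unresolved = expand_env(mcp_config, {})
print(resolved["mcpServers"]["github"]["env"])
```

Keeping the token in the environment (for example, in the .env file mentioned in Quick Start) avoids committing secrets to config_mcp.json.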

Runtime Configuration (AnyToolConfig)

Complete example:

from anytool import AnyTool
from anytool.tool_layer import AnyToolConfig

config = AnyToolConfig(
    # LLM Configuration
    llm_model="anthropic/claude-sonnet-4-5",
    llm_enable_thinking=False,
    llm_timeout=120.0,
    llm_max_retries=3,
    llm_rate_limit_delay=0.0,
    llm_kwargs={},  # Additional LiteLLM parameters
    
    # Grounding Configuration
    grounding_config_path=None,  # Path to custom config file
    grounding_max_iterations=20,
    grounding_system_prompt=None,  # Custom system prompt
    
    # Backend Configuration
    backend_scope=["gui", "shell", "mcp", "web", "system"],
    
    # Workspace Configuration
    workspace_dir=None,  # Auto-create temp dir if None
    
    # Recording Configuration
    enable_recording=True,
    recording_backends=["gui", "shell", "mcp"],
    recording_log_dir="./logs/recordings",
    enable_screenshot=True,
    enable_video=True,
    
    # Logging Configuration
    log_level="INFO",
    log_to_file=False,
    log_file_path=None,
)

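# `async with` must run inside a coroutine (for example, one started with asyncio.run)
async def main():
    async with AnyTool(config=config) as tool_layer:
        result = await tool_layer.execute("Your task here")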

Other Configuration Files

Backend Configuration

Path: anytool/config/config_grounding.json

Purpose: Configure backend-specific behaviors, timeouts, Smart Tool RAG system for efficient tool selection, and Tool Quality Tracking for self-evolving tool intelligence.

Key Fields:

| Backend | Field | Description | Options/Default |
|---|---|---|---|
| shell | timeout | Command timeout (seconds) | Any integer (default: 60) |
| shell | conda_env | Auto-activate conda environment | Environment name or null (default: "anytool") |
| shell | working_dir | Working directory for command execution | Any valid path (default: current directory) |
| shell | default_shell | Shell to use | "/bin/bash", "/bin/zsh", etc. |
| gui | timeout | Operation timeout (seconds) | Any integer (default: 90) |
| gui | screenshot_on_error | Capture screenshot on failure | true or false (default: true) |
| gui | driver_type | GUI automation driver | "pyautogui" or other supported drivers |
| mcp | timeout | Request timeout (seconds) | Any integer (default: 30) |
| mcp | sandbox | Run in E2B sandbox | true or false (default: false) |
| mcp | eager_sessions | Pre-connect all servers at startup | true or false (default: false, lazy connection) |
| tool_search | search_mode | Tool retrieval strategy | "semantic", "hybrid" (semantic + LLM filter), or "llm" (default: "hybrid") |
| tool_search | max_tools | Maximum tools to return from search | Any integer (default: 20) |
| tool_search | enable_llm_filter | Enable LLM-based tool pre-filtering | true or false (default: true) |
| tool_search | llm_filter_threshold | Enable LLM filter when tools exceed this count | Any integer (default: 50) |
| tool_search | enable_cache_persistence | Persist embedding cache to disk | true or false (default: true) |
| tool_quality | enabled | Enable tool quality tracking | true or false (default: true) |
| tool_quality | enable_persistence | Persist quality data to disk | true or false (default: true) |
| tool_quality | cache_dir | Directory for quality cache | Path string or null (default: ~/.anytool/tool_quality) |
| tool_quality | auto_evaluate_descriptions | Automatically evaluate tool descriptions using LLM | true or false (default: true) |
| tool_quality | enable_quality_ranking | Incorporate quality scores in tool ranking | true or false (default: true) |
| tool_quality | evolve_interval | Trigger self-evolution every N tool executions | Any integer 1-100 (default: 5) |

Security Configuration

Path: anytool/config/config_security.json

Purpose: Define security policies with command filtering and access control. When sensitive operations are detected, AnyTool will prompt for user confirmation at runtime before execution.

Key Fields:

| Section | Field | Description | Options |
|---|---|---|---|
| global | allow_shell_commands | Enable shell command execution | true or false (default: true) |
| global | allow_network_access | Enable network operations | true or false (default: true) |
| global | allow_file_access | Enable file system operations | true or false (default: true) |
| global | blocked_commands | Platform-specific command blacklist | Object with common, linux, darwin, windows arrays |
| global | sandbox_enabled | Enable sandboxing for all operations | true or false (default: false) |
| global | require_user_approval | Prompt user before sensitive operations | true or false (default: false) |
| backend | shell, mcp, gui, web | Per-backend security overrides | Same fields as global, backend-specific |

Example blocked commands: rm -rf, shutdown, reboot, mkfs, dd, format, iptables

Behavior:

  • Blocked commands are rejected automatically
  • When require_user_approval is true, sensitive operations pause execution and prompt for user confirmation
  • Sandbox mode isolates operations in secure environments (E2B sandbox for MCP)
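
The blocked-command check can be sketched as a word-level match against the `common` list plus the list for the current platform. This is an illustration only: `is_blocked` is hypothetical, the split of the example commands across platforms is an assumption, and a real filter would need to handle variants such as `rm -r -f` that this simple match misses.

```python
import shlex

# Shape follows the blocked_commands field (common, linux, darwin, windows arrays).
# Which command goes under which platform is an assumption for this example.
BLOCKED = {
    "common": ["rm -rf", "shutdown", "reboot"],
    "linux": ["mkfs", "dd", "iptables"],
    "darwin": [],
    "windows": ["format"],
}


def is_blocked(command: str, platform_key: str, blocked: dict = BLOCKED) -> bool:
    """True if any blocked pattern appears as consecutive whole words in the command."""
    words = shlex.split(command)
    patterns = blocked.get("common", []) + blocked.get(platform_key, [])
    for pattern in patterns:
        needle = pattern.split()
        for i in range(len(words) - len(needle) + 1):
            if words[i:i + len(needle)] == needle:
                return True
    return False


print(is_blocked("sudo rm -rf /tmp/x", "linux"))
print(is_blocked("echo add", "linux"))
print(is_blocked("format C:", "windows"))
```

Matching whole words avoids false positives such as `add` triggering the `dd` rule, at the cost of also rejecting harmless commands that merely mention a blocked word as an argument.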

Developer Configuration

Path: anytool/config/config_dev.json (copy from config_dev.json.example)

Loading Priority: config_grounding.json → config_security.json → config_dev.json (config_dev.json is loaded last and overrides the others)
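
The loading order can be read as "later files win". A minimal sketch of that behavior, assuming nested keys are merged recursively (whether AnyTool merges deeply or replaces whole sections is not stated here); `deep_merge` and `load_layers` are hypothetical helpers:

```python
def deep_merge(base: dict, override: dict) -> dict:
    """Return a new dict where `override` wins, merging nested dicts key by key."""
    merged = dict(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = deep_merge(merged[key], value)
        else:
            merged[key] = value
    return merged


def load_layers(*layers: dict) -> dict:
    """Apply layers in order, so the last one has the highest priority."""
    result: dict = {}
    for layer in layers:
        result = deep_merge(result, layer)
    return result


grounding = {"shell": {"timeout": 60, "conda_env": "anytool"}}
security = {"global": {"require_user_approval": False}}
dev = {"shell": {"timeout": 120}}  # local override

config = load_layers(grounding, security, dev)
print(config)
```

With a recursive merge, the dev file only needs to list the keys it changes; everything else falls through from the earlier layers.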


📖 Code Structure

📖 Quick Overview

Legend: ⚡ Core modules | 🔧 Supporting modules

AnyTool/
├── anytool/
│   ├── __init__.py                       # Package exports
│   ├── tool_layer.py                     # AnyTool main class
│   │
│   ├── ⚡ agents/                         # Agent System
│   ├── ⚡ grounding/                      # Unified Backend System
│   │   ├── core/                         # Core abstractions
│   │   └── backends/                     # Backend implementations
│   │       ├── shell/                    # Shell command execution
│   │       ├── gui/                      # Anthropic Computer Use
│   │       ├── mcp/                      # Model Context Protocol
│   │       └── web/                      # Web search & browsing
│   │
│   ├── 🔧 llm/                           # LLM Integration
│   ├── 🔧 config/                        # Configuration System
│   ├── 🔧 local_server/                  # GUI Backend Server
│   ├── 🔧 recording/                     # Execution Recording
│   ├── 🔧 platform/                      # Platform Integration
│   └── 🔧 utils/                         # Utilities
│
├── .anytool/                             # Runtime cache
│   ├── embedding_cache/                  # Tool embeddings for Smart Tool RAG
│   └── tool_quality/                     # Persistent tool quality tracking data
│
├── logs/                                 # Execution logs
│
├── requirements.txt                      # Python dependencies
├── pyproject.toml                        # Package configuration
└── README.md

📂 Detailed Module Structure

⚡ agents/ - Agent System

agents/
├── __init__.py
├── base.py                         # Base agent class with common functionality
└── grounding_agent.py              # Execution Agent (tool calling & iteration control)

Key Responsibilities: Task execution with intelligent tool selection and iteration control.

⚡ grounding/ - Unified Backend System (Core Integration Layer)

Key Responsibilities: Unified tool abstraction, backend routing, session pooling, Smart Tool RAG, and Self-Evolving Quality Tracking.

Core Abstractions

grounding/core/
├── grounding_client.py             # Unified interface across all backends
├── provider.py                     # Abstract provider base class
├── session.py                      # Session lifecycle management
├── search_tools.py                 # Smart Tool RAG for semantic search
├── exceptions.py                   # Custom exception definitions
├── types.py                        # Shared type definitions
│
├── tool/                           # Tool abstraction layer
│   ├── base.py                     # Tool base class
│   ├── local_tool.py               # Local tool implementation
│   └── remote_tool.py              # Remote tool implementation
│
├── quality/                        # Self-evolving tool quality tracking
│   ├── manager.py                  # Quality manager with adaptive ranking
│   ├── store.py                    # Persistent quality data storage
│   └── types.py                    # Quality record data types
│
├── security/                       # Security & sandboxing 🔧
│   ├── policies.py                 # Security policy enforcement
│   ├── sandbox.py                  # Sandbox abstraction
│   └── e2b_sandbox.py              # E2B sandbox integration
│
├── system/                         # System-level provider
│   ├── provider.py
│   └── tool.py
│
└── transport/                      # Transport layer abstractions 🔧
    ├── connectors/
    │   ├── base.py
    │   └── aiohttp_connector.py
    └── task_managers/
        ├── base.py
        ├── async_ctx.py
        ├── aiohttp_connection_manager.py
        └── placeholder.py

Backend Implementations

Shell Backend - Command execution via local server

backends/shell/
├── provider.py                     # Shell provider implementation
├── session.py                      # Shell session management
└── transport/
    └── connector.py                # HTTP connector to local server

GUI Backend - Anthropic Computer Use integration

backends/gui/
├── provider.py                     # GUI provider implementation
├── session.py                      # GUI session management
├── tool.py                         # GUI-specific tools
├── anthropic_client.py             # Anthropic API client wrapper
├── anthropic_utils.py              # Utility functions
├── config.py                       # GUI configuration
└── transport/
    ├── connector.py                # Computer Use API connector
    └── actions.py                  # Action execution logic

MCP Backend - Model Context Protocol servers

backends/mcp/
├── provider.py                     # MCP provider implementation
├── session.py                      # MCP session management
├── client.py                       # MCP client
├── config.py                       # MCP configuration loader
├── installer.py                    # MCP server installer
├── tool_converter.py               # Convert MCP tools to unified format
└── transport/
    ├── connectors/                 # Multiple transport types
    │   ├── base.py
    │   ├── stdio.py                # Standard I/O connector
    │   ├── http.py                 # HTTP connector
    │   ├── websocket.py            # WebSocket connector
    │   ├── sandbox.py              # Sandboxed connector
    │   └── utils.py
    └── task_managers/              # Protocol-specific managers
        ├── stdio.py
        ├── sse.py
        ├── streamable_http.py
        └── websocket.py

Web Backend - Search and browsing

backends/web/
├── provider.py                     # Web provider implementation
└── session.py                      # Web session management

🔧 llm/ - LLM Integration

llm/
├── __init__.py
└── client.py                       # LiteLLM wrapper with retry logic

🔧 config/ - Configuration System

config/
├── __init__.py
├── loader.py                       # Configuration file loader
├── constants.py                    # System constants
├── grounding.py                    # Grounding configuration dataclasses
├── utils.py                        # Configuration utilities
│
├── config_grounding.json           # Backend-specific settings
├── config_agents.json              # Agent configurations
├── config_mcp.json                 # MCP server definitions
├── config_security.json            # Security policies
└── config_dev.json.example         # Development config template

🔧 local_server/ - GUI Backend Server

local_server/
├── __init__.py
├── main.py                         # Flask application entry point
├── config.json                     # Server configuration
├── feature_checker.py              # Platform feature detection
├── health_checker.py               # Server health monitoring
├── platform_adapters/              # OS-specific implementations
│   ├── macos_adapter.py            # macOS automation (atomacos, pyobjc)
│   ├── linux_adapter.py            # Linux automation (pyatspi, xlib)
│   ├── windows_adapter.py          # Windows automation (pywinauto)
│   └── pyxcursor.py                # Custom cursor handling
├── utils/
│   ├── accessibility.py            # Accessibility tree utilities
│   └── screenshot.py               # Screenshot capture
└── README.md

Purpose: Lightweight Flask service enabling computer control (GUI, Shell, Files, Screen capture).

🔧 recording/ - Execution Recording

recording/
├── __init__.py
├── recorder.py                     # Main recording manager
├── manager.py                      # Recording lifecycle management
├── action_recorder.py              # Action-level logging
├── video.py                        # Video capture integration
├── viewer.py                       # Trajectory viewer and analyzer
└── utils.py                        # Recording utilities

Purpose: Execution audit with trajectory recording and video capture.

🔧 platform/ - Platform Integration

platform/
├── __init__.py
├── config.py                       # Platform-specific configuration
├── recording.py                    # Recording integration
├── screenshot.py                   # Screenshot utilities
└── system_info.py                  # System information gathering

🔧 utils/ - Shared Utilities

utils/
├── logging.py                      # Structured logging system
├── ui.py                           # Terminal UI components
├── display.py                      # Display formatting utilities
├── cli_display.py                  # CLI-specific display
├── ui_integration.py               # UI integration helpers
└── telemetry/                      # Usage analytics (opt-in)
    ├── __init__.py
    ├── events.py
    ├── telemetry.py
    └── utils.py

📊 logs/ - Execution Logs & Recordings

logs/
├── <script_name>/                        # Main application logs
│   └── anytool_YYYY-MM-DD_HH-MM-SS.log   # Timestamped log files
│
└── recordings/                           # Execution recordings
    └── task_<id>/                        # Individual recording session
        ├── trajectory.json               # Complete execution trajectory
        ├── screenshots/                  # Visual execution record (GUI backend)
        │   ├── tool_<name>_<timestamp>.png
        │   ├── tool_<name>_<timestamp>.png
        │   └── ...                       # Sequential screenshots
        ├── workspace/                    # Task workspace
        │   └── [generated files]         # Files created during execution
        └── screen_recording.mp4          # Video recording (if enabled)

Recording Control: Enable via AnyToolConfig(enable_recording=True), filter backends with recording_backends=["gui", "shell", ...]


🔗 Related Projects

AnyTool builds upon excellent open-source projects. We sincerely thank their authors and contributors:

  • OSWorld: Comprehensive benchmark for evaluating computer-use agents across diverse operating system tasks.
  • mcp-use: Platform that simplifies MCP agent development with client SDKs.

🌟 If this project helps you, please give us a Star!

🤖 Empower AI agents with intelligent tool orchestration!


❤️ Thanks for visiting ✨ AnyTool!
