WebEVA
The Next Impact Toward Smarter Web Agents testing

Preview

Description

WebEVA is a multimodal web agent that achieved a state-of-the-art 80.3% success rate on the WebVoyager dataset. It supports dynamic navigation, task refinement, and locating page elements without visual cues or set-of-mark prompting. The code is in index.js and the prompts are in messages.js. Currently under review.

Demo
Check the IMDb score of the movie Inception and then find its budget and box office on Wikipedia

Prerequisites

Before running the project, make sure you have Node.js installed. You can download it from the official Node.js website.

Additionally, you will need a .env file to store your environment variables securely.

Setup

Follow these steps to set up the project:

  1. Clone the repository:

    git clone https://github.com/brotherspavel/WebEVA
    cd WebEVA
  2. Install dependencies: First, install the required Node.js packages:

    npm install

    Install Playwright and its required browsers:

    npx playwright install
  3. Create the .env file: In the root of the project, create a .env file with the following environment variables:

    model="gpt-4o"
    OPENAI_API_KEY=YOUR_OPENAI_API_KEY
    OPENAI_API_URL=YOUR_OPENAI_API_URL

    Replace YOUR_OPENAI_API_KEY with your OpenAI API key and YOUR_OPENAI_API_URL with the URL of the API endpoint you want to use.
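
Before launching the agent, it can help to confirm that the .env file defines all three variables. The following is a minimal sketch and is not part of WebEVA: parseEnv, missingVars, and checkEnvFile are hypothetical helpers, and the project itself may load the file differently (for example, through a package such as dotenv).

```javascript
// check-env.js: hypothetical helper, not part of the WebEVA repository.
const fs = require('fs');

// Variable names taken from the .env example above.
const REQUIRED = ['model', 'OPENAI_API_KEY', 'OPENAI_API_URL'];

// Parse simple KEY=VALUE lines. Blank lines and lines starting with #
// are skipped, and one pair of surrounding quotes is stripped.
function parseEnv(text) {
  const env = {};
  for (const rawLine of text.split(/\r?\n/)) {
    const line = rawLine.trim();
    if (!line || line.startsWith('#')) continue;
    const eq = line.indexOf('=');
    if (eq === -1) continue;
    const key = line.slice(0, eq).trim();
    let value = line.slice(eq + 1).trim();
    const quoted = value.match(/^(["'])(.*)\1$/);
    if (quoted) value = quoted[2];
    env[key] = value;
  }
  return env;
}

// Return the required names that are absent or empty.
function missingVars(env, required = REQUIRED) {
  return required.filter((name) => !env[name]);
}

// Read a .env file from disk and report which required names are missing.
function checkEnvFile(path = '.env') {
  return missingVars(parseEnv(fs.readFileSync(path, 'utf8')));
}
```

Calling checkEnvFile() from the project root returns an empty array when all three variables are set, and otherwise returns the names that still need a value.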

Running the Project

Once the setup is complete, you can run the script by providing the task description as a command-line argument.

Example Usage:

node index.js "Check the IMDb score of the movie Inception and then find its budget and box office on Wikipedia"

If the task requires interaction with a particular website (like WebVoyager tasks), provide the website URL:

node index.js "Find the last composition by Mozart" "https://example.com"
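
The two invocations above differ only in the optional second argument. A minimal sketch of how such arguments could be read is shown below. The name parseCliArgs is hypothetical, and the actual handling in index.js may differ.

```javascript
// Hypothetical sketch of reading the task and the optional start URL.
// This is an assumption about the CLI shape, not code from index.js.
function parseCliArgs(argv) {
  // argv[0] is the node binary and argv[1] is the script path.
  const [task, startUrl] = argv.slice(2);
  if (!task || !task.trim()) {
    throw new Error('Usage: node index.js "<task description>" [start URL]');
  }
  if (startUrl !== undefined) {
    // The URL constructor throws a TypeError on malformed input.
    new URL(startUrl);
  }
  return { task: task.trim(), startUrl: startUrl ?? null };
}
```

Inside a script this would be called as parseCliArgs(process.argv). The task must be quoted on the command line so that the shell passes it as a single argument.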

Authors (To be edited)
