WebEVA is a multimodal web agent that achieved a state-of-the-art 80.3% success rate on the WebVoyager dataset. It supports dynamic navigation, task refinement, and locating elements without visual cues or set-of-mark prompting.
Our code and prompts can be found in index.js and messages.js, respectively. Currently under review.
Example task: Check the IMDb score of the movie Inception and then find its budget and box office on Wikipedia
Before running the project, make sure you have Node.js installed. You can download it from the official Node.js website.
Additionally, you will need a .env file to store your environment variables securely.
Follow these steps to set up the project:
- Clone the repository:
git clone https://github.com/brotherspavel/WebEVA
cd WebEVA
- Install dependencies: Install the required Node.js packages:
npm install
Install Playwright and its required browsers:
npx playwright install
- Create the .env file: In the root of the project, create a .env file with the following environment variables:
model="gpt-4o"
OPENAI_API_KEY=YOUR_OPENAI_API_KEY
OPENAI_API_URL=YOUR_OPENAI_API_URL
Replace YOUR_OPENAI_API_KEY and YOUR_OPENAI_API_URL with your actual OpenAI API credentials.
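Before the first run, it can help to confirm that the .env file defines all three variables. The snippet below is a minimal, standalone sketch of such a check. It is not taken from index.js, and the helper names `parseEnv` and `missingKeys` are hypothetical. The project itself may load the file differently (for example, through a library), so treat this only as an illustration of the expected KEY=VALUE format.

```javascript
// Hypothetical helper (not part of WebEVA): parse KEY=VALUE lines
// from the text of a .env file into a plain object.
function parseEnv(text) {
  const env = {};
  for (const rawLine of text.split(/\r?\n/)) {
    const line = rawLine.trim();
    // Skip blank lines and comments.
    if (line === "" || line.startsWith("#")) continue;
    const eq = line.indexOf("=");
    if (eq === -1) continue; // not a KEY=VALUE line
    const key = line.slice(0, eq).trim();
    let value = line.slice(eq + 1).trim();
    // Strip one pair of matching surrounding quotes, as in model="gpt-4o".
    const quoted =
      value.length >= 2 &&
      (value[0] === '"' || value[0] === "'") &&
      value[value.length - 1] === value[0];
    if (quoted) value = value.slice(1, -1);
    env[key] = value;
  }
  return env;
}

// The three variables named in the setup step above.
const REQUIRED_KEYS = ["model", "OPENAI_API_KEY", "OPENAI_API_URL"];

// Return the required keys that are absent or empty.
function missingKeys(env) {
  return REQUIRED_KEYS.filter((key) => !env[key]);
}

// Sample input that deliberately omits OPENAI_API_URL.
const sample = [
  'model="gpt-4o"',
  "OPENAI_API_KEY=YOUR_OPENAI_API_KEY",
].join("\n");

const parsed = parseEnv(sample);
console.log(parsed.model); // gpt-4o
console.log(missingKeys(parsed)); // reports OPENAI_API_URL as missing
```

To check a real file, the sample string could be replaced with the contents of .env read via `fs.readFileSync(".env", "utf8")`.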
Once the setup is complete, you can run the script by providing the task description as a command-line argument.
node index.js "Check the IMDb score of the movie Inception and then find its budget and box office on wikipedia"
If the task requires interaction with a particular website (like WebVoyager tasks), provide the website URL as a second argument:
node index.js "Find the last composition by Mozart" "https://example.com"
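The two invocation forms above amount to one required argument (the task) and one optional argument (the starting URL). The following sketch shows one way such arguments could be read and validated in Node.js. It is an assumption about the calling convention, not the actual code in index.js, and `parseCliArgs` is a hypothetical name.

```javascript
// Hypothetical sketch (not WebEVA's actual code): interpret
//   node index.js "<task description>" ["<start URL>"]
function parseCliArgs(argv) {
  // argv[0] is the node binary and argv[1] the script path,
  // so user-supplied arguments start at index 2.
  const [task, url] = argv.slice(2);
  if (!task || task.trim() === "") {
    throw new Error('Usage: node index.js "<task description>" ["<start URL>"]');
  }
  let startUrl = null;
  if (url !== undefined) {
    // The URL constructor throws a TypeError on malformed input,
    // which surfaces a bad URL before any browser is launched.
    startUrl = new URL(url).href;
  }
  return { task: task.trim(), startUrl };
}

// Simulated argv for the second command above.
const example = parseCliArgs([
  "node",
  "index.js",
  "Find the last composition by Mozart",
  "https://example.com",
]);
console.log(example.task);
console.log(example.startUrl);
```

In a real script, `process.argv` would be passed in place of the simulated array. When the URL is omitted, `startUrl` is `null` and the agent would have to choose its own starting page.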