Agentflare

This project provides an agent, running on Cloudflare Workers plus Containers, that can use a web browser and a computer to fulfill tasks specified by a user.

Why?

General

This was motivated by the ChatGPT Agent, whose rather simple capabilities come with tight usage limits (the below is from their announcement):

Pro users have 400 messages per month, while other paid users get 40 messages monthly, with additional usage available via flexible credit-based options.

Expensive

While Cloudflare does offer a product for headless browsers, its pricing can be a bit steep, so here we run a Chromium container (see here for the inspiration).

Usage

You can view the app (till we sunset this on August 31st) at: https://agentflare.yev-81d.workers.dev/

Note: Some of the recordings below have their loading sections cut out for viewability. Each task was still completed end to end in one shot, and various caching or other optimizations can be applied to make live usage closer to the GIFs shown below.

Search example

Here's an example of doing web search to look up news related to Cloudflare and bots:

Browser example

Here's an example of using a web browser to get page content in order to summarize:

Computer example

Here's an example of using a computer to run a command inside the terminal that's opened using a NoVNC server hosted on a Cloudflare container:

Search and browser example

Here's an example of using search results to fetch links and then a browser to obtain their content and summarize it:

Architecture

There are plenty of examples of multi-agent architectures, but the underlying premise is simple: why overwhelm the context of a single LLM when you can scope tasks into smaller, accomplishable steps?

flowchart TD
    A[Prompt] -->|Deconstruct| B{Top-level agent}
    B -->|Delegate to| C[Search tool]
    C -->|Return result| B
    B -->|Delegate to| D[Browser agent]
    D -->|Return result| B
    B -->|Delegate to| E[Computer agent]
    E -->|Return result| B

Otherwise you end up trying to one-shot like you're drawing an owl.
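
As a rough illustration of the delegation in the flowchart above, here is a minimal sketch of a top-level agent handing scoped steps to tools and collecting their results. Every name in it (`Step`, `runPlan`, `stubTools`) is hypothetical and not taken from the Agentflare source, and it assumes the prompt has already been deconstructed into steps, which in practice would be done by the model.

```typescript
// Hypothetical names throughout: none of these identifiers come from the Agentflare source.
type ToolName = "search" | "browser" | "computer";

// One scoped unit of work produced by deconstructing the prompt.
interface Step {
  tool: ToolName;
  input: string;
}

type Tool = (input: string) => Promise<string>;

// The top-level agent runs each step with the matching tool and collects
// the results, so no single call has to carry the whole task.
async function runPlan(steps: Step[], tools: Record<ToolName, Tool>): Promise<string[]> {
  const results: string[] = [];
  for (const step of steps) {
    const result = await tools[step.tool](step.input);
    results.push(result);
  }
  return results;
}

// Stub tools standing in for the search tool, browser agent and computer agent.
const stubTools: Record<ToolName, Tool> = {
  search: async (query) => `search results for: ${query}`,
  browser: async (url) => `page content of: ${url}`,
  computer: async (command) => `terminal output of: ${command}`,
};
```

Keeping each tool behind the same small signature means the top-level agent only ever sees short result strings, not the full context each sub-agent worked with.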

Using Chromium

Due to a security measure that prevents Chromium from being arbitrarily accessed or controlled remotely, the image proxies requests through nginx to mask the actual traffic origin. This allows you to run or debug from a machine that's not part of your Cloudflare deployment, like so:

// The below code can be run on your local machine while a browser is run in a Cloudflare container
import puppeteer from 'puppeteer';

(async () => {
  const result = await fetch("https://<your deployment identifier>.workers.dev/json/version").then(res => res.json());

  // Connect to the remote browser, rewriting its local debugger URL to the public deployment host
  const browser = await puppeteer.connect({
    browserWSEndpoint: result.webSocketDebuggerUrl.replaceAll('ws://localhost', 'wss://<your deployment identifier>.workers.dev')
  });

  const page = await browser.newPage();

  // Navigate the page to a URL
  await page.goto('https://news.ycombinator.com/');

  // Set screen size
  await page.setViewport({ width: 1080, height: 1024 });

  // Wait for the page's <title> element
  const textSelector = await page.waitForSelector(
    'title',
  );
  const fullTitle = await textSelector?.evaluate(el => el.textContent);

  // Print the full title
  console.log('The title of this page is "%s".', fullTitle);

  await browser.close();
})();
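
The endpoint rewrite in the snippet above can be pulled out into a small helper. This is only a sketch: `toPublicWsEndpoint` is a hypothetical name that does not appear in the Agentflare source, and it assumes the deployment is reachable over TLS on the default port.

```typescript
// Hypothetical helper, not part of the Agentflare source.
// Chromium reports a debugger URL that points at the container's own
// localhost; this swaps the scheme and host (including any local port)
// for the public deployment host and keeps the path untouched.
function toPublicWsEndpoint(webSocketDebuggerUrl: string, publicHost: string): string {
  return webSocketDebuggerUrl.replace(/^wss?:\/\/[^/]+/, `wss://${publicHost}`);
}

// Example with a placeholder host and browser id.
const endpoint = toPublicWsEndpoint(
  "ws://localhost/devtools/browser/abc",
  "example.workers.dev",
);
```

The result can then be passed as `browserWSEndpoint` to `puppeteer.connect`, in place of the inline `replaceAll` call.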

Using computer

If you're unfamiliar with Anthropic's computer use, here's a simplified version of how it looks under the hood (minus the non-editable configuration).

flowchart TD
    A[Goal] -->|Deconstruct| B(Plan)
    B -->|Handled by| C{Computer Agent}
    C -->|Call if relevant| D[Keyboard tool]
    D -->|Return result| C
    C -->|Call if relevant| E[Mouse tool]
    E -->|Return result| C
    C -->|Call if relevant| F[Screenshot tool]
    F -->|Return result| C

So, first, there needs to be a computer that can be tinkered with, hence spinning up a VNC server that can be connected to from a web browser using NoVNC (as shown below) at https://agentflare.<your_account_id>.workers.dev/vnc.html:

Alternatively, connect with a JavaScript client; see computer/app.ts for usage.
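
The flowchart in this section can also be read as a loop: the agent picks an action, the matching tool runs it, and the result is fed back until the agent decides it is done. Below is a minimal sketch of that loop. The types and names (`Action`, `Computer`, `runComputerAgent`, `fakeComputer`) are hypothetical and do not mirror Anthropic's actual tool schema or the code in computer/app.ts, and `decide` is a scripted stand-in for the model.

```typescript
// Hypothetical types that mirror the flowchart, not Anthropic's actual tool schema.
type Action =
  | { kind: "key"; text: string }
  | { kind: "click"; x: number; y: number }
  | { kind: "screenshot" }
  | { kind: "done"; summary: string };

// The three tools from the flowchart: keyboard, mouse and screenshot.
interface Computer {
  typeText(text: string): Promise<string>;
  click(x: number, y: number): Promise<string>;
  screenshot(): Promise<string>;
}

// Stand-in for the model: given the tool results so far, pick the next action.
type Decide = (history: string[]) => Action;

async function runComputerAgent(computer: Computer, decide: Decide, maxSteps = 10): Promise<string> {
  const history: string[] = [];
  for (let i = 0; i < maxSteps; i++) {
    const action = decide(history);
    switch (action.kind) {
      case "key":
        history.push(await computer.typeText(action.text));
        break;
      case "click":
        history.push(await computer.click(action.x, action.y));
        break;
      case "screenshot":
        history.push(await computer.screenshot());
        break;
      case "done":
        return action.summary;
    }
  }
  return "stopped: step limit reached";
}

// A fake computer and a scripted decision function, for illustration only.
const fakeComputer: Computer = {
  typeText: async (text) => `typed ${text}`,
  click: async (x, y) => `clicked ${x},${y}`,
  screenshot: async () => "screenshot taken",
};

const script: Action[] = [
  { kind: "screenshot" },
  { kind: "click", x: 100, y: 200 },
  { kind: "key", text: "ls" },
];

const scriptedDecide: Decide = (history) =>
  script[history.length] ?? { kind: "done", summary: history.join("; ") };
```

The step limit is a design choice of this sketch: it keeps a confused agent from clicking around indefinitely.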

Running

First, install dependencies.

$ yarn install

Log in to your account if this is your first time interacting with the wrangler CLI.

$ yarn wrangler login

Then, simply deploy.

$ yarn wrangler deploy

At the end of the output, you should see the URL you can open in your browser to view the application.

...
Deployed agentflare triggers (0.34 sec)
    https://agentflare.<your_account_id>.workers.dev      <---- This
  Current Version ID: some-pretty-cool-uuid
  Cloudflare collects anonymous telemetry about your usage of Wrangler. Learn more at https://github.com/cloudflare/workers-sdk/tree/main/packages/wrangler/telemetry.md
  Done in 225.40s.
🏁 Wrangler Action completed
