Browser Agent

CDP-native browser automation runtime for agents, with editable skills and reliable Gemini image workflows.

• Real Chrome Control • Skill-Driven Automation • Upload-Verified Image Generation •

Quick Navigation

Tip

I'm a human -> Read this README for install, setup, and safe workflows.

I'm an agent -> Read SKILL.md for operation rules and execution patterns. (Recommended)

browser-agent is a minimal runtime that lets agents control your real Chrome session directly over the Chrome DevTools Protocol (CDP), while keeping task logic editable in-repo.

For operators: one command surface for browser actions, diagnostics, and updates.
For agents: stable helper APIs (new_tab, js, click_at_xy, upload_file, raw cdp).
For reliability: interaction skills and domain skills to encode repeatable mechanics.

Quick Start

For Agent (Recommended)

Tell your coding agent:

Install Browser Agent from https://github.com/PaulClawX/browser-agent and set it up to control my Chrome

For Human

git clone https://github.com/PaulClawX/browser-agent && cd browser-agent && uv tool install -e . && browser-agent --doctor

Browser Connection

Attach to your normal Chrome profile

Open chrome://inspect/#remote-debugging
Enable Allow remote debugging for this browser instance
Accept the Chrome allow popup when prompted

Then confirm the connection by running:

browser-agent -c 'print(page_info())'

See install.md for full setup.
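
The connection check above can also be scripted, for example from a setup or CI script. The following is a minimal sketch, assuming `browser-agent` is on your PATH and that `-c` accepts a Python snippet exactly as shown above. The names `build_command` and `check_connection` are hypothetical and not part of this project.

```python
import shutil
import subprocess
from typing import Optional


def build_command(snippet: str) -> list[str]:
    """Build the argv for `browser-agent -c <snippet>` (the form used in this README)."""
    return ["browser-agent", "-c", snippet]


def check_connection(timeout_s: float = 30.0) -> Optional[str]:
    """Run the page_info() check and return its stdout.

    Returns None if the CLI is not installed, times out, or exits non-zero.
    """
    if shutil.which("browser-agent") is None:
        return None  # CLI not installed or not on PATH
    try:
        result = subprocess.run(
            build_command("print(page_info())"),
            capture_output=True,
            text=True,
            timeout=timeout_s,
        )
    except subprocess.TimeoutExpired:
        return None
    if result.returncode != 0:
        return None
    return result.stdout.strip()
```

A `None` result usually means the remote debugging steps above still need to be completed, or that the install did not finish; `browser-agent --doctor` is the documented place to look next.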

Workflows

Tier	Workflow	Expected Behavior
Stable	General browser automation	Deterministic tab + DOM + input operations through CDP helpers
Stable	Upload-driven tasks	Upload confirmation before submit; fail-fast if upload isn't verifiable
Stable	Gemini image generation/editing	Prompt + reference flow with strict upload-first gating and export
Stable	Diagnostics and lifecycle	`--doctor`, daemon auto-start, update checks
Best-effort	Complex anti-bot sites	Fallback to coordinate actions, retries, and skill-specific patterns
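
The "upload confirmation before submit" behavior in the table can be pictured as a small gate: upload, poll until the page confirms the file, and only then submit. The following is a sketch of that pattern, not the project's implementation. The function name, the exception, and the three callables are hypothetical; in practice they would wrap helpers such as upload_file and js, whose exact signatures are not shown in this README.

```python
import time
from typing import Callable


class UploadNotVerified(RuntimeError):
    """Raised when the upload cannot be confirmed before the deadline."""


def submit_after_verified_upload(
    upload: Callable[[], None],
    is_upload_confirmed: Callable[[], bool],
    submit: Callable[[], None],
    timeout_s: float = 10.0,
    poll_s: float = 0.5,
) -> None:
    """Upload, wait for confirmation, then submit; fail fast otherwise."""
    upload()
    deadline = time.monotonic() + timeout_s
    while True:
        if is_upload_confirmed():
            submit()  # reached only after the upload was confirmed
            return
        if time.monotonic() >= deadline:
            # Never submit on an unverified upload.
            raise UploadNotVerified("upload was not confirmed before the timeout")
        time.sleep(poll_s)
```

Passing the three steps in as callables keeps the gate independent of any particular site, so the same check could be reused by different domain skills.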

Project Layout

src/browser_harness/ - core runtime modules
SKILL.md - operator rules for day-to-day use
install.md - first-time install and connection
docs/interaction-skills/ - reusable browser mechanics playbooks
src/agent-workspace/agent_helpers.py - task-specific helper extensions
docs/domain-skills/ - site-specific playbooks

Core Contributors and Maintainers

Panwang Pan (paulpanwang@gmail.com)
Jingjing Zhao (jingjingbudlet@gmail.com)

📧 Contact

Feel free to open an issue if you have any questions or suggestions. If this project helps you, please give it a ⭐ Star!

Acknowledgements

This project builds on and is inspired by the following open-source work:

browser-use/browser-harness - the primary code and architecture source.
OpenClaudex/openreview-agent - OpenReview dry-run workflow inspiration.
