A simple browser agent that transforms unstructured course content from any MOOC website into clean, structured data.
Updated Apr 22, 2025 - Jupyter Notebook
This dataset contains 3,167 completed tasks of human-computer interactions captured with video, screenshots, DOM snapshots, and detailed interaction events. Created by Paradigm Shift AI for advancing computer use AI agent research.
An unofficial Nix Flake package for the BrowserOS AI-Powered Web Browser
Screen recording and computer interaction capture tool that records keyboard/mouse input, screen video, DOM snapshots, and accessibility trees. Perfect for creating datasets to train and evaluate computer-use AI models.
Heybro transforms Chrome into an intelligent AI agent that executes browser tasks through natural language commands. Powered by Google Gemini, this open-source side-panel extension interprets your requests, analyzes page DOM structures, and autonomously performs clicks, form fills, and multi-page navigation, eliminating manual browser interactions.
Agent-CE is a containerized continuous evaluation (CE) platform for web browsing agents. It provides production-ready Docker images and CI/CD pipelines for running and evaluating multiple agent frameworks including Browser Use, Notte, Anthropic Computer Use, and OpenAI Computer Use.
AI Browser Agent is an advanced Browser AI tool developed by Oxylabs AI Studio that automates real user browsing tasks using natural language instructions.
Boilerplate templates for LLM-powered computer use with various models and browser automation tools.
🎥 Capture screen recordings and interactions on macOS, including inputs and accessibility data, to create datasets for AI model training and evaluation.
An AI-powered multi-agent system that automatically captures, documents, and visualizes step-by-step UI workflows for any web application. Powered by Gemini planning, Playwright automation, and Claude vision validation.
A browser agent capable of performing tasks on behalf of the user.
Antibot Browser Agent
Your form filling AI assistant
Serverless AI browser agent
G-Coder is a command-line AI agent designed to be your partner in software development, DevOps, and system administration. Built on the powerful and fluid [Google Agent Development Kit (ADK)](https://google.github.io/adk-docs/), G-Coder is engineered for speed, reliability, and effectiveness.
Build your own AI operators like OpenAI
An AI-powered browser automation tool built with Next.js and Gemini 2.0 Vision AI. Transform natural language into browser automation with visual understanding.