browser-use

Open Source

<picture>
  <source media="(prefers-color-scheme: light)" srcset="https://github.com/user-attachments/assets/2ccdb752-22fb-41c7-8948-857fc1ad7e24">
  <source media="(prefers-color-scheme: dark)" srcset="https://github.com/user-attachments/assets/774a46d5-27a0-490c-b7d0-e65fcbbfa358">
  <img alt="Shows a black Browser Use Logo in light color mode and a white one in dark color mode." src="https://github.com/user-attachments/assets/2ccdb752-22fb-41c7-8948-857fc1ad7e24" width="full">
</picture>

<div align="center">
  <picture>
    <source media="(prefers-color-scheme: light)" srcset="https://github.com/user-attachments/assets/9955dda9-ede3-4971-8ee0-91cbc3850125">
    <source media="(prefers-color-scheme: dark)" srcset="https://github.com/user-attachments/assets/6797d09b-8ac3-4cb9-ba07-b289e080765a">
    <img alt="The AI browser agent." src="https://github.com/user-attachments/assets/9955dda9-ede3-4971-8ee0-91cbc3850125" width="400">
  </picture>
</div>

<div align="center">
  <a href="https://cloud.browser-use.com?utm_source=github&utm_medium=readme-badge-downloads"><img src="https://media.browser-use.tools/badges/package" height="48" alt="Browser-Use Package Download Statistics"></a>
</div>

---

<div align="center">
<a href="#demos"><img src="https://media.browser-use.tools/badges/demos" alt="Demos"></a>
<img width="16" height="1" alt="">
<a href="https://docs.browser-use.com"><img src="https://media.browser-use.tools/badges/docs" alt="Docs"></a>
<img width="16" height="1" alt="">
<a href="https://browser-use.com/posts"><img src="https://media.browser-use.tools/badges/blog" alt="Blog"></a>
<img width="16" height="1" alt="">
<a href="https://browsermerch.com"><img src="https://media.browser-use.tools/badges/merch" alt="Merch"></a>
<img width="100" height="1" alt="">
<a href="https://github.com/browser-use/browser-use"><img src="https://media.browser-use.tools/badges/github" alt="Github Stars"></a>
<img width="4" height="1" alt="">
<a href="https://x.com/intent/user?screen_name=browser_use"><img src="https://media.browser-use.tools/badges/twitter" alt="Twitter"></a>
<img width="4" height="1" alt="">
<a href="https://link.browser-use.com/discord"><img src="https://media.browser-use.tools/badges/discord" alt="Discord"></a>
<img width="4" height="1" alt="">
<a href="https://cloud.browser-use.com?utm_source=github&utm_medium=readme-badge-cloud"><img src="https://media.browser-use.tools/badges/cloud" height="48" alt="Browser-Use Cloud"></a>
</div>

<br/>

🌤️ Want to skip the setup? Use our <b>[cloud](https://cloud.browser-use.com?utm_source=github&utm_medium=readme-skip-setup)</b> for faster, scalable, stealth-enabled browser automation!

# 🤖 LLM Quickstart

1. Direct your favorite coding agent (Cursor, Claude Code, etc.) to [Agents.md](https://docs.browser-use.com/llms-full.txt)
2. Prompt away!

<br/>

# 👋 Human Quickstart

Browser Use 0.13 introduces a new beta agent powered by a Rust core and a browser harness built for current frontier models. It gives the model a real browser/computer action space, persistent tools, and recovery loops inspired by coding agents.

```text
Python API -> Rust core -> Browser harness -> Web task done
```

**1. Install Browser Use with the native core runtime (Python>=3.11):**

```bash
uv add "browser-use[core]"
# or: pip install "browser-use[core]"
```

The `[core]` extra installs the native Browser Use runtime for your platform.

**2. [Optional] Get your API key from [Browser Use Cloud](https://cloud.browser-use.com/new-api-key?utm_source=github&utm_medium=readme-quickstart-api-key):**

```
# .env
BROWSER_USE_API_KEY=your-key
# GOOGLE_API_KEY=your-key
# ANTHROPIC_API_KEY=your-key
```

**3. Run your first agent:**

```python
from browser_use.beta import Agent, BrowserProfile, ChatBrowserUse
# from browser_use.beta import ChatOpenAI  # ChatOpenAI(model='gpt-5.5')
# from browser_use.beta import ChatAnthropic  # ChatAnthropic(model='claude-opus-4-8')
import asyncio

async def main():
    agent = Agent(
        task="Find the number of stars of the browser-use repo",
        llm=ChatBrowserUse(model='bu-3-max'),
        # llm=ChatOpenAI(model='gpt-5.5'),
        # llm=ChatAnthropic(model='claude-opus-4-8'),  # Sonnet also works well.
        browser_profile=BrowserProfile(
            headless=False,
            allowed_domains=["*.github.com"],
        ),
    )
    history = await agent.run()
    print(history.final_result())

if __name__ == "__main__":
    asyncio.run(main())
```

Existing Python agent users can keep using `from browser_use import Agent`. The new Rust-powered beta agent is `from browser_use.beta import Agent`.

Check out the [library docs](https://docs.browser-use.com/open-source/introduction) and the [cloud docs](https://docs.cloud.browser-use.com?utm_source=github&utm_medium=readme-cloud-docs) for more!

<br/>

# Open Source vs Cloud

<picture>
  <source media="(prefers-color-scheme: light)" srcset="static/accuracy_by_model_light.png">
  <source media="(prefers-color-scheme: dark)" srcset="static/accuracy_by_model_dark.png">
  <img alt="BU Bench V1 - LLM Success Rates" src="static/accuracy_by_model_light.png" width="100%">
</picture>

We benchmark Browser Use across 100 real-world browser tasks. Full benchmark is open source: **[browser-use/benchmark](https://github.com/browser-use/benchmark)**.

**Use the Open-Source Agent**

- You need [custom tools](https://docs.browser-use.com/customize/tools/basics) or deep code-level integration
- We recommend pairing with our [cloud browsers](https://docs.browser-use.com/open-source/customize/browser/remote) for leading stealth, proxy rotation, and scaling
- Or self-host the open-source agent fully on your own machines

**Use the [Fully-Hosted Cloud Agent](https://cloud.browser-use.com?utm_source=github&utm_medium=readme-hosted-agent) (recommended)**

- Much more powerful agent for complex tasks (see plot above)
- Easiest way to start and scale
- Best stealth with proxy rotation and captcha solving
- 1000+ integrations (Gmail, Slack, Notion, and more)
- Persistent filesystem and memory

<br/>

# Demos

### 📋 Form-Filling

#### Task = "Fill in this job application with my resume and information."

![Job Application Demo](https://github.com/user-attachments/assets/57865ee6-6004-49d5-b2c2-6dff39ec2ba9)

[Example code ↗](https://github.com/browser-use/browser-use/blob/main/examples/use-cases/apply_to_job.py)

### 🍎 Grocery-Shopping

#### Task = "Put this list of items into my instacart."

https://github.com/user-attachments/assets/a6813fa7-4a7c-40a6-b4aa-382bf88b1850

[Example code ↗](https://github.com/browser-use/browser-use/blob/main/examples/use-cases/buy_groceries.py)

### 💻 Personal-Assistant

#### Task = "Help me find parts for a custom PC."

https://github.com/user-attachments/assets/ac34f75c-057a-43ef-ad06-5b2c9d42bf06

[Example code ↗](https://github.com/browser-use/browser-use/blob/main/examples/use-cases/pcpartpicker.py)

### 💡 See [more examples here ↗](https://docs.browser-use.com/examples) and give us a star!

<br/>

# 🚀 Template Quickstart

**Want to get started even faster?** Generate a ready-to-run template:

```bash
uvx browser-use init --template default
```

This creates a `browser_use_default.py` file with a working example.

Available templates:

- `default` - Minimal setup to get started quickly
- `advanced` - All configuration options with detailed comments
- `tools` - Examples of custom tools and extending the agent

You can also specify a custom output path:

```bash
uvx browser-use init --template default --output my_agent.py
```

<br/>

# 💻 CLI

Fast, persistent browser automation from the command line:

```bash
browser-use open https://example.com    # Navigate to URL
browser-use state                       # See clickable elements
browser-use click 5                     # Click element by index
browser-use type "Hello"                # Type text
browser-use screenshot page.png         # Take screenshot
browser-use close                       # Close browser
```

The CLI keeps the browser running between commands for fast iteration. See [CLI docs](browser_use/skill_cli/README.md) for all commands.

### Claude Code Skill

For [Claude Code](https://claude.ai/code), install the skill to enable AI-assisted browser automation:

```bash
mkdir -p ~/.claude/skills/browser-use
curl -o ~/.claude/skills/browser-use/SKILL.md \
  https://raw.githubusercontent.com/browser-use/browser-use/main/skills/browser-use/SKILL.md
```

<br/>

## Integrations, hosting, custom tools, MCP, and more on our [Docs ↗](https://docs.browser-use.com)

<br/>

# FAQ

<details>
<summary><b>What's the best model to use?</b></summary>

We optimized **ChatBrowserUse()** specifically for browser automation tasks. On average, it completes tasks 3-5x faster than other models with SOTA accuracy.

**bu-3 pricing (per 1M tokens):**

- Input tokens: $2.00
- Cached input tokens: $0.20
- Output tokens: $11.00

**bu-3-max pricing (per 1M tokens):**

- Input tokens: $2.50
- Cached input tokens: $0.25
- Output tokens: $50.00

For other LLM providers, see our [supported models documentation](https://docs.browser-use.com/supported-models).

</details>

<details>
<summary><b>Should I use the Browser Use system prompt with the open-source preview model?</b></summary>

Yes. If you use `ChatBrowserUse(model='browser-use/bu-30b-a3b-preview')` with a normal `Agent(...)`, Browser Use still sends its default agent system prompt for you.

You do **not** need to add a separate custom "Browser Use system message" just because you switched to the open-source preview model. Only use `extend_system_message` or `override_system_message` when you intentionally want to customize the default behavior for your task.

If you want the best default speed/accuracy, we still recommend the newer hosted `bu-*` models. If you want the open-source preview model, the setup stays the same apart from the `model=` value.

</details>

<details>
<summary><b>Can I use custom tools with the agent?</b></summary>

Yes! You can add custom tools to extend the agent's capabilities:

```python
from browser_use import Agent, Tools

tools = Tools()

@tools.action(description='Description of what this tool does.')
def custom_tool(param: str) -> str:
    return f"Result: {param}"

# `llm` and `browser` are assumed to be created elsewhere in your script.
agent = Agent(
    task="Your task",
    llm=llm,
    browser=browser,
    tools=tools,
)
```

</details>

<details>
<summary><b>Can I use this for free?</b></summary>

Yes! Browser-Use is open source and free to use. You only need to choose an LLM provider (like OpenAI, Google, ChatBrowserUse, or run local models with Ollama).

</details>

<details>
<summary><b>Terms of Service</b></summary>

This open-source library is licensed under the MIT License. For Browser Use services & data policy, see our [Terms of Service](https://browser-use.com/legal/terms-of-service) and [Privacy Policy](https://browser-use.com/privacy/).

</details>

<details>
<summary><b>How do I handle authentication?</b></summary>

Check out our authentication examples:

- [Using real browser profiles](https://github.com/browser-use/browser-use/blob/main/examples/browser/real_browser.py): reuse your existing Chrome profile with saved logins
- If you want to use temporary accounts with an inbox, choose AgentMail
- To sync your auth profile with the remote browser, run `curl -fsSL https://browser-use.com/profile.sh | BROWSER_USE_API_KEY=XXXX sh` (replace XXXX with your API key)

These examples show how to maintain sessions and handle authentication seamlessly.

</details>

<details>
<summary><b>How do I solve CAPTCHAs?</b></summary>

For CAPTCHA handling, you need better browser fingerprinting and proxies. Use [Browser Use Cloud](https://cloud.browser-use.com?utm_source=github&utm_medium=readme-faq-captcha), which provides stealth browsers designed to avoid detection and CAPTCHA challenges.

</details>

<details>
<summary><b>How do I go into production?</b></summary>

Chrome can consume a lot of memory, and running many agents in parallel can be tricky to manage.

For production use cases, use our [Browser Use Cloud API](https://cloud.browser-use.com?utm_source=github&utm_medium=readme-faq-production), which handles:

- Scalable browser infrastructure
- Memory management
- Proxy rotation
- Stealth browser fingerprinting
- High-performance parallel execution

</details>

<br/>

<div align="center">

**Tell your computer what to do, and it gets it done.**

<img src="https://github.com/user-attachments/assets/06fa3078-8461-4560-b434-445510c1766f" width="400"/>

[![Twitter Follow](https://img.shields.io/twitter/follow/Magnus?style=social)](https://x.com/intent/user?screen_name=mamagnus00)
&emsp;&emsp;&emsp;
[![Twitter Follow](https://img.shields.io/twitter/follow/Gregor?style=social)](https://x.com/intent/user?screen_name=gregpr07)

</div>

<div align="center"> Made with ❤️ in Zurich and San Francisco </div>
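
As a worked example of the per-1M-token prices quoted in the FAQ above, the following sketch estimates the cost of a single run. It assumes that uncached input, cached input, and output tokens are each billed at their own listed rate, which the FAQ does not spell out. The function name `estimate_cost` and the token counts are made up for illustration.

```python
# Prices in USD per 1M tokens, copied from the FAQ pricing lists.
PRICING = {
    "bu-3": {"input": 2.00, "cached_input": 0.20, "output": 11.00},
    "bu-3-max": {"input": 2.50, "cached_input": 0.25, "output": 50.00},
}


def estimate_cost(model, input_tokens, cached_input_tokens, output_tokens):
    """Estimate the USD cost of one run (hypothetical helper).

    Assumption: uncached input, cached input, and output tokens are each
    billed at their own listed per-1M-token rate.
    """
    prices = PRICING[model]
    cost = (
        input_tokens * prices["input"]
        + cached_input_tokens * prices["cached_input"]
        + output_tokens * prices["output"]
    ) / 1_000_000
    return round(cost, 2)


if __name__ == "__main__":
    # Hypothetical token counts for a single task.
    for model in PRICING:
        print(model, estimate_cost(model, 400_000, 600_000, 50_000))
```

With these made-up counts the estimate comes to $1.47 for `bu-3` and $3.65 for `bu-3-max`; most of the difference comes from the output-token rate.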

AI Agents Browser Automation

97.9K Github Stars

Open Source

browser-harness

Browser Harness is self-healing software that connects Large Language Models directly to a real browser via a thin, editable Chrome DevTools Protocol harness. Designed for complex browser automation tasks requiring complete freedom, it operates with a single websocket to Chrome, eliminating intermediate layers. The system is unique because the agent dynamically writes missing helper code during execution, allowing the harness to improve itself with every run. Setup involves enabling remote debugging in Chrome and pasting a setup prompt into an LLM interface. It supports connection to personal browsers as well as a free cloud tier offering concurrent browsers, proxies, and captcha solving. The lightweight architecture consists of approximately 1,000 lines across four core files, including protected core packages and an agent workspace for editable helpers and domain-specific skills. Users can enable community-contributed per-site playbooks to guide the agent on specific websites. The software learns from s

AI Agents Browser Automation

14.6K Github Stars

Open Source

web-ui

<img src="./assets/web-ui.png" alt="Browser Use Web UI" width="full"/>

<br/>

[![GitHub stars](https://img.shields.io/github/stars/browser-use/web-ui?style=social)](https://github.com/browser-use/web-ui/stargazers)
[![Discord](https://img.shields.io/discord/1303749220842340412?color=7289DA&label=Discord&logo=discord&logoColor=white)](https://link.browser-use.com/discord)
[![Documentation](https://img.shields.io/badge/Documentation-📕-blue)](https://docs.browser-use.com)
[![WarmShao](https://img.shields.io/twitter/follow/warmshao?style=social)](https://x.com/warmshao)

This project builds upon the foundation of [browser-use](https://github.com/browser-use/browser-use), which is designed to make websites accessible for AI agents.

We would like to officially thank [WarmShao](https://github.com/warmshao) for his contribution to this project.

**WebUI:** Built on Gradio, it supports most `browser-use` functionality. The UI is designed to be user-friendly and enables easy interaction with the browser agent.

**Expanded LLM Support:** We've integrated support for various Large Language Models (LLMs), including Google, OpenAI, Azure OpenAI, Anthropic, DeepSeek, and Ollama. We plan to add support for even more models in the future.

**Custom Browser Support:** You can use your own browser with our tool, eliminating the need to re-login to sites or deal with other authentication challenges. This feature also supports high-definition screen recording.

**Persistent Browser Sessions:** You can choose to keep the browser window open between AI tasks, allowing you to see the complete history and state of AI interactions.

<video src="https://github.com/user-attachments/assets/56bc7080-f2e3-4367-af22-6bf2245ff6cb" controls="controls">Your browser does not support playing this video!</video>

## Installation Guide

### Option 1: Local Installation

Read the [quickstart guide](https://docs.browser-use.com/quickstart#prepare-the-environment) or follow the steps below to get started.

#### Step 1: Clone the Repository

```bash
git clone https://github.com/browser-use/web-ui.git
cd web-ui
```

#### Step 2: Set Up Python Environment

We recommend using [uv](https://docs.astral.sh/uv/) for managing the Python environment.

Using uv (recommended):

```bash
uv venv --python 3.11
```

Activate the virtual environment:

- Windows (Command Prompt):

```cmd
.venv\Scripts\activate
```

- Windows (PowerShell):

```powershell
.\.venv\Scripts\Activate.ps1
```

- macOS/Linux:

```bash
source .venv/bin/activate
```

#### Step 3: Install Dependencies

Install Python packages:

```bash
uv pip install -r requirements.txt
```

Install browsers with Playwright:

```bash
playwright install --with-deps
```

Or install a specific browser by running:

```bash
playwright install chromium --with-deps
```

#### Step 4: Configure Environment

1. Create a copy of the example environment file:

   - Windows (Command Prompt):

   ```cmd
   copy .env.example .env
   ```

   - macOS/Linux/Windows (PowerShell):

   ```bash
   cp .env.example .env
   ```

2. Open `.env` in your preferred text editor and add your API keys and other settings.

#### Step 5: Enjoy the web-ui

1. **Run the WebUI:**

   ```bash
   python webui.py --ip 127.0.0.1 --port 7788
   ```

2. **Access the WebUI:** Open your web browser and navigate to `http://127.0.0.1:7788`.

3. **Using Your Own Browser (Optional):**

   - Set `BROWSER_PATH` to the executable path of your browser and `BROWSER_USER_DATA` to the user data directory of your browser. Leave `BROWSER_USER_DATA` empty if you want to use local user data.

     - Windows

       ```env
       BROWSER_PATH="C:\Program Files\Google\Chrome\Application\chrome.exe"
       BROWSER_USER_DATA="C:\Users\YourUsername\AppData\Local\Google\Chrome\User Data"
       ```

       > Note: Replace `YourUsername` with your actual Windows username.

     - Mac

       ```env
       BROWSER_PATH="/Applications/Google Chrome.app/Contents/MacOS/Google Chrome"
       BROWSER_USER_DATA="/Users/YourUsername/Library/Application Support/Google/Chrome"
       ```

   - Close all Chrome windows.
   - Open the WebUI in a non-Chrome browser, such as Firefox or Edge. This is important because the persistent browser context will use the Chrome data when running the agent.
   - Check the "Use Own Browser" option within the Browser Settings.

### Option 2: Docker Installation

#### Prerequisites

- Docker and Docker Compose installed
  - [Docker Desktop](https://www.docker.com/products/docker-desktop/) (for Windows/macOS)
  - [Docker Engine](https://docs.docker.com/engine/install/) and [Docker Compose](https://docs.docker.com/compose/install/) (for Linux)

#### Step 1: Clone the Repository

```bash
git clone https://github.com/browser-use/web-ui.git
cd web-ui
```

#### Step 2: Configure Environment

1. Create a copy of the example environment file:

   - Windows (Command Prompt):

   ```cmd
   copy .env.example .env
   ```

   - macOS/Linux/Windows (PowerShell):

   ```bash
   cp .env.example .env
   ```

2. Open `.env` in your preferred text editor and add your API keys and other settings.

#### Step 3: Docker Build and Run

```bash
docker compose up --build
```

For ARM64 systems (e.g., Apple Silicon Macs), run the following command instead:

```bash
TARGETPLATFORM=linux/arm64 docker compose up --build
```

#### Step 4: Enjoy the web-ui and vnc

- Web-UI: Open `http://localhost:7788` in your browser
- VNC Viewer (for watching browser interactions): Open `http://localhost:6080/vnc.html`
  - Default VNC password: "youvncpassword"
  - Can be changed by setting `VNC_PASSWORD` in your `.env` file

## Changelog

- [x] **2025/01/26:** Thanks to @vvincent1234. browser-use-webui can now be combined with DeepSeek-r1 to engage in deep thinking!
- [x] **2025/01/10:** Thanks to @casistack. We now have a Docker setup option and support for keeping the browser open between tasks. [Video tutorial demo](https://github.com/browser-use/web-ui/issues/1#issuecomment-2582511750).
- [x] **2025/01/06:** Thanks to @richard-devbot. A new and well-designed WebUI is released. [Video tutorial demo](https://github.com/warmshao/browser-use-webui/issues/1#issuecomment-2573393113).

AI Agents Browser Automation

16K Github Stars

Open Source

bux

# Browser Use Box ♞

<img src="docs/hero.jpg" alt="Browser Use Box hero" width="100%" />

## Your 24/7 Claude Code agent with a real browser, on any box you own.

Rent any $5 VPS (Hetzner, DigitalOcean, Mac mini, Raspberry Pi, or anything else that runs Ubuntu), point one install script at it, and text your agent from anywhere.

```
$ curl -fsSL https://raw.githubusercontent.com/browser-use/bux/main/install.sh \
    | sudo BROWSER_USE_API_KEY=bu_xxx bash
```

Three minutes from a blank VPS to *"hey claude, check my email and summarize the unread ones"* via Telegram.

[Watch the 14-second Browser Use Box demo on TikTok](https://www.tiktok.com/@browser_use/video/7639824093721758989).

More launch links:

- [Public demo release](https://github.com/browser-use/bux/releases/tag/box-demo-2026-05-14)
- [Pinned announcement discussion](https://github.com/browser-use/bux/discussions/181)
- [Browser Use Box wiki](https://github.com/browser-use/bux/wiki)
- [Launch page](https://browser-use.github.io/bux/)
- [Managed pilot for Telegram-heavy teams](https://browser-use.github.io/bux/pilot.html)
- [Managed pilot playbook](https://browser-use.github.io/bux/managed-pilot-playbook.html)
- [Managed pilot demo transcript](https://browser-use.github.io/bux/managed-pilot-demo.html)
- [Managed Telegram AI operator](https://browser-use.github.io/bux/telegram-ai-operator.html)
- [Telegram AI operator for agencies](https://browser-use.github.io/bux/telegram-ai-operator-agencies.html)
- [n8n Telegram AI operator](https://browser-use.github.io/bux/n8n-telegram-ai-operator.html)
- [Telegram AI operator for crypto and fintech teams](https://browser-use.github.io/bux/telegram-operator-crypto.html)
- [Managed pilot for AI automation agencies](https://browser-use.github.io/bux/managed-pilot-partners.html)
- [Managed pilot proof report demo](https://browser-use.github.io/bux/managed-pilot-proof-report.html)
- [Managed pilot terms](https://browser-use.github.io/bux/managed-pilot-terms.html)

## Setup prompt

Paste into Claude Code (on your laptop) and it will set up your VPS for you:

```text
Set up https://github.com/browser-use/bux on my remote box. SSH into it (I'll paste the host below), run install.sh with my BROWSER_USE_API_KEY, and optionally wire up a Telegram bot if I give you a token from @BotFather.

Read install.md first. After the install completes, verify the services are running (systemctl is-active bux-browser-keeper bux-ttyd), then become the `bux` user and run `claude /login` so I can complete the OAuth flow. Once logged in, test the setup by asking claude to visit https://browser-use.com and report the page title.
```

## What you get

- **Claude Code** logged in and always on
- A real **Chromium** session via [browser-harness](https://github.com/browser-use/browser-harness): cookies persist, logins stick
- A **Telegram bot** so you can text your agent (pass `TG_BOT_TOKEN=xxx` to the installer to enable)
- A **web terminal** bound to localhost for when SSH is too much
- When claude hits a login wall / 2FA / CAPTCHA, it hands you a **live view URL** and waits: no credential-stuffing, no brittle workarounds

## Requirements

- **A box**: Ubuntu 22.04+ with ≥2GB RAM. A $5/mo VPS is fine.
- **[Browser Use Cloud API key](https://cloud.browser-use.com/new-api-key)**: the free tier includes 3 concurrent browsers, proxies, and CAPTCHA solving.
- An Anthropic API key *or* Claude Max subscription (claude asks on first `/login`).
- *(optional)* A Telegram bot token from [@BotFather](https://t.me/BotFather).

## How it works

```
telegram ──► telegram_bot.py ─┐
                              ├──► claude -p ──► browser-harness ──► BU Cloud browser
         ──► ttyd ────────────┘      │                          (cdp over wss)
                                     ▼
                          /home/bux (persistent state)
```

Three small services under systemd. Agent state lives in `/home/bux`, so reboots keep your cookies, skills, and chat history.

## Docs

- [install.md](install.md): full install guide and troubleshooting
- [agent/CLAUDE.md](agent/CLAUDE.md): the context claude loads on every session. Edit it to customize behavior (working dir layout, skill policies, allowed tools), then rerun `install.sh`; it's idempotent and the next claude turn picks up the change.

## Managed offering

If you'd rather not run your own VPS: [cloud.browser-use.com](https://cloud.browser-use.com) provisions a box for you in ~60s, with the same software, zero setup, and one-command `bux up` via a Claude Code skill.

For teams that already run sales, support, partner onboarding, or operations in Telegram, there is also a [$1,000/month managed pilot](https://browser-use.github.io/bux/pilot.html): one scoped private operator workflow, launched in 7 days, with human handoff and weekly tuning.
The pilot scope, handoff format, and example operator rules are in the [managed pilot playbook](https://browser-use.github.io/bux/managed-pilot-playbook.html).
The [demo transcript](https://browser-use.github.io/bux/managed-pilot-demo.html) shows the first workflow: Telegram lead triage into a clean human handoff.
The [managed Telegram AI operator page](https://browser-use.github.io/bux/telegram-ai-operator.html) is the focused entry point for lead qualification, support triage, and partner onboarding use cases.
The [Telegram AI operator for agencies page](https://browser-use.github.io/bux/telegram-ai-operator-agencies.html) gives AI automation agencies a managed implementation path for one Telegram-heavy client workflow.
The [n8n Telegram AI operator page](https://browser-use.github.io/bux/n8n-telegram-ai-operator.html) gives n8n builders a managed Telegram-first operator that hands off cleanly into existing workflows and human approvals.
The [Telegram AI operator for crypto and fintech teams](https://browser-use.github.io/bux/telegram-operator-crypto.html) packages the same managed pilot for partner, support, ecosystem, and ops teams that triage Telegram requests by hand.
The [partner pilot page](https://browser-use.github.io/bux/managed-pilot-partners.html) packages the same $1,000/month workflow as a delivery layer for AI automation agencies with Telegram-heavy clients.
The [proof report demo](https://browser-use.github.io/bux/managed-pilot-proof-report.html) shows the weekly artifact that makes the retainer easy to renew: handled threads, clean handoffs, checks completed, blockers, tuning, and estimated time saved.
The [managed pilot terms](https://browser-use.github.io/bux/managed-pilot-terms.html) summarize the $1,000/month scope, first-week deliverables, and acceptance criteria.
Pilot inquiries can start on [Telegram](https://t.me/Magnus_Mueller) or through the [managed pilot issue form](https://github.com/browser-use/bux/issues/new?template=managed-pilot.yml).

## Contributing

PRs welcome: bug fixes, docs tweaks, and new features are all appreciated. Open an issue first if you're planning something larger.

## License

MIT. See [LICENSE](LICENSE).

AI Agents Browser Automation

377 Github Stars

Open Source

workflow-use

<picture>
  <img alt="Workflow Use logo - a product by Browser Use." src="./static/workflow-use.png" width="full">
</picture>

<br />

<h1 align="center">Deterministic, Self Healing Workflows (RPA 2.0)</h1>

[![GitHub stars](https://img.shields.io/github/stars/browser-use/workflow-use?style=social)](https://github.com/browser-use/workflow-use/stargazers)
[![Discord](https://img.shields.io/discord/1303749220842340412?color=7289DA&label=Discord&logo=discord&logoColor=white)](https://link.browser-use.com/discord)
[![Cloud](https://img.shields.io/badge/Cloud-☁️-blue)](https://cloud.browser-use.com)
[![Twitter Follow](https://img.shields.io/twitter/follow/Gregor?style=social)](https://x.com/gregpr07)
[![Twitter Follow](https://img.shields.io/twitter/follow/Magnus?style=social)](https://x.com/mamagnus00)

⚙️ **Workflow Use** is the easiest way to create and execute deterministic workflows with variables that fall back to [Browser Use](https://github.com/browser-use/browser-use) if a step fails. You just _show_ the recorder the workflow, and we automatically generate it.

❗ This project is in very early development, so we don't recommend using it in production. Lots of things will change and we don't have a release schedule yet. Originally, the project was born out of customer demand to make Browser Use more reliable and deterministic.

## 🚀 NEW: Generation Mode

Automatically generate workflows from natural language! Describe your task, and we run browser-use once, then create a reusable semantic workflow stored in a database.
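
The core idea stated above, deterministic steps with variables that fall back to an agent when a step fails, can be sketched in plain Python. This is a minimal illustration and not the workflow-use implementation: `Step`, `run_workflow`, `execute`, and `fallback` are hypothetical names, and the browser is simulated by a function that raises on one step.

```python
from dataclasses import dataclass


@dataclass
class Step:
    """One deterministic step; `template` may contain {variables}."""
    description: str
    template: str


def run_workflow(steps, variables, execute, fallback):
    """Run each step deterministically and hand a failing step to `fallback`.

    `execute` and `fallback` are hypothetical callables standing in for a
    recorded browser action and an LLM-driven agent, respectively.
    """
    log = []
    for step in steps:
        command = step.template.format(**variables)
        try:
            execute(command)
            log.append(("deterministic", command))
        except Exception:
            # The recorded action failed, so defer to the agent for this step.
            fallback(step.description, variables)
            log.append(("fallback", command))
    return log


def demo():
    steps = [
        Step("Open the repository page", "open https://github.com/{repo}"),
        Step("Read the star count", "read #stars"),
    ]

    def execute(command):
        # Simulate a selector that no longer exists on the page.
        if command.startswith("read"):
            raise LookupError("element not found")

    def fallback(description, variables):
        pass  # a real fallback would ask an agent to complete the step

    return run_workflow(steps, {"repo": "browser-use/browser-use"}, execute, fallback)


if __name__ == "__main__":
    for mode, command in demo():
        print(mode, command)
```

In this sketch the first step runs deterministically with the `repo` variable filled in, while the second step fails and is routed to the fallback, so only the broken step would pay for an LLM call.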

### Quick Commands

```bash
# Generate workflow from task description
python cli.py generate-workflow "Find GitHub stars for browser-use repo"

# List all workflows
python cli.py list-workflows

# Filter by generation mode
python cli.py list-workflows --generation-mode browser_use

# Run stored workflow
python cli.py run-stored-workflow <workflow-id> --prompt "Find stars for playwright repo"

# View workflow details
python cli.py workflow-info <workflow-id>

# Delete workflow
python cli.py delete-workflow <workflow-id>
```

### How It Works

1. **Describe**: Give a task in natural language
2. **Execute**: Browser-use completes the task once
3. **Generate**: Execution history → semantic workflow with parameters
4. **Store**: Save to database with metadata
5. **Reuse**: Run the workflow with different inputs, no AI needed

### Advanced Options

```bash
# Custom models for generation
python cli.py generate-workflow "Your task" \
  --agent-model "gpt-4.1-mini" \
  --extraction-model "gpt-4.1-mini" \
  --workflow-model "gpt-4o"

# Use Browser-Use Cloud browser
python cli.py generate-workflow "Your task" --use-cloud

# Save to custom location
python cli.py generate-workflow "Your task" --output-file ./my-workflow.json

# Skip database storage
python cli.py generate-workflow "Your task" --no-save-to-storage
```

### Storage

Workflows are stored at `workflows/storage/`:

- `metadata.json` - Searchable index of all workflows
- `workflows/<id>.workflow.json` - Individual workflow files

### Programmatic Usage

```python
from workflow_use.healing.service import HealingService
from workflow_use.storage.service import WorkflowStorageService
from browser_use.llm import ChatOpenAI

healing_service = HealingService(llm=ChatOpenAI(model='gpt-4.1'))
storage_service = WorkflowStorageService()

# Generate workflow (`await` must run inside an async function)
workflow = await healing_service.generate_workflow_from_prompt(
    prompt="Fill contact form on example.com",
    agent_llm=ChatOpenAI(model='gpt-4.1-mini'),
    extraction_llm=ChatOpenAI(model='gpt-4.1-mini'),
    use_cloud=True  # Optional: use Browser-Use Cloud
)

# Save to storage
metadata = storage_service.save_workflow(
    workflow=workflow,
    generation_mode='browser_use',
    original_task="Fill contact form on example.com"
)

# Retrieve and execute
loaded_workflow = storage_service.get_workflow(metadata.id)
```

# Quick start

```bash
git clone https://github.com/browser-use/workflow-use
```

## Build the extension

```bash
cd extension && npm install && npm run build
```

## Setup workflow environment

```bash
cd .. && cd workflows
uv sync
source .venv/bin/activate # for mac / linux
playwright install chromium
cp .env.example .env # add your OPENAI_API_KEY to the .env file
```

## Run workflow as tool

```bash
python cli.py run-as-tool examples/example.workflow.json --prompt "fill the form with example data"
```

## Run workflow with predefined variables

```bash
python cli.py run-workflow examples/example.workflow.json
```

## Record your own workflow

```bash
python cli.py create-workflow
```

## See all commands

```bash
python cli.py --help
```

# Usage from Python

Running the workflow files is as simple as:

```python
import asyncio

from workflow_use import Workflow

workflow = Workflow.load_from_file("example.workflow.json")
result = asyncio.run(workflow.run_as_tool("I want to search for 'workflow use'"))
```

## Cloud Browser Support

Run workflows in [Browser-Use Cloud](https://cloud.browser-use.com) (new signups get $10 free credits via OAuth or $1 via email) with semantic abstraction (no AI).

NOTE: Set the `BROWSER_USE_API_KEY` environment variable.

```python
from workflow_use import Workflow

# `llm` is assumed to be created elsewhere; `await` must run inside an async function
workflow = Workflow.load_from_file("workflow.json", llm, use_cloud=True)
result = await workflow.run_with_no_ai()  # No LLM calls, uses semantic mapping
```

Examples:

- `examples/cloud_browser_demo.py` - Load recorded workflow and run on cloud

## Launch the GUI

The Workflow UI provides a visual interface for managing, viewing, and executing workflows.
### Option 1: Using the CLI command (Recommended) The easiest way to start the GUI is with the built-in CLI command: ```bash cd workflows python cli.py launch-gui ``` This command will: - Start the backend server (FastAPI) - Start the frontend development server - Automatically open http://localhost:5173 in your browser - Capture logs to the `./tmp/logs` directory Press Ctrl+C to stop both servers when you're done. ### Option 2: Start servers separately Alternatively, you can start the servers individually: #### Start the backend server ```bash cd workflows uvicorn backend.api:app --reload ``` #### Start the frontend development server ```bash cd ui npm install npm run dev ``` Once both servers are running, you can access the Workflow GUI at http://localhost:5173 in your browser. The UI allows you to: - Visualize workflows as interactive graphs - Execute workflows with custom input parameters - Monitor workflow execution logs in real-time - Edit workflow metadata and details # Demos ## Workflow Use filling out form instantly https://github.com/user-attachments/assets/cf284e08-8c8c-484a-820a-02c507de11d4 ## Gregor's explanation https://github.com/user-attachments/assets/379e57c7-f03e-4eb9-8184-521377d5c0f9 # Features - 🔁 **Record Once, Reuse Forever**: Record browser interactions once and replay them indefinitely. - ⏳ **Show, don't prompt**: No need to spend hours prompting Browser Use to do the same thing over and over again. - ⚙️ **Structured & Executable Workflows**: Converts recordings into deterministic, fast, and reliable workflows which automatically extract variables from forms. - 🪄 **Human-like Interaction Understanding**: Intelligently filters noise from recordings to create meaningful workflows. - 🔒 **Enterprise-Ready Foundation**: Built for future scalability with features like self-healing and workflow diffs. # Vision and roadmap Show computer what it needs to do once, and it will do it over and over again without any human intervention. 
## Workflows - [ ] Nice way to use the `.json` files inside python code - [ ] Improve LLM fallback when step fails (currently really bad) - [ ] Self healing, if it fails automatically agent kicks in and updates the workflow file - [ ] Better support for LLM steps - [ ] Take output from previous steps and use it as input for next steps - [ ] Expose workflows as MCP tools - [ ] Use Browser Use to automatically create workflows from websites ## Developer experience - [ ] Improve CLI - [ ] Improve extension - [ ] Step editor ## Agent - [ ] Allow Browser Use to use the workflows as MCP tools - [ ] Use workflows as website caching layer
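
Returning to the "How It Works" steps earlier in this README: the reuse step depends on a stored workflow carrying named parameters, so a later run only supplies new inputs. The sketch below is a toy model of that idea. The step dictionaries, the `{name}` placeholder syntax, and the `fill_parameters` helper are all hypothetical and are not the workflow-use file format or API.

```python
from typing import Any


def fill_parameters(steps: list[dict[str, Any]], inputs: dict[str, str]) -> list[dict[str, Any]]:
    """Return a copy of `steps` with `{name}` placeholders replaced by values from `inputs`."""
    filled = []
    for step in steps:
        new_step = {}
        for key, value in step.items():
            if isinstance(value, str):
                # Replace each known placeholder; unknown braces are left untouched.
                for name, replacement in inputs.items():
                    value = value.replace("{" + name + "}", replacement)
            new_step[key] = value
        filled.append(new_step)
    return filled


# Hypothetical recorded steps; real workflow files will look different.
recorded_steps = [
    {"action": "navigate", "url": "https://github.com/{owner}/{repo}"},
    {"action": "extract", "target": "star count"},
]

# The same recording is reused with different inputs, with no model call involved.
first_run = fill_parameters(recorded_steps, {"owner": "browser-use", "repo": "browser-use"})
second_run = fill_parameters(recorded_steps, {"owner": "microsoft", "repo": "playwright"})
```

The recorded steps are never mutated, so one stored workflow can serve any number of runs.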

AI Agents Browser Automation

4K Github Stars

Open Source

stress-tests

# [Browser-Use Stress Test](https://browser-use.github.io/stress-tests)

> - Guided Evaluation Challenge: https://browser-use.github.io/stress-tests/challenge.html
> - All the Form Libraries All At Once: https://browser-use.github.io/stress-tests/index.html
> - All the CAPTCHAs: https://2captcha.com/demo
> - All the browser fingerprints: https://abrahamjuliot.github.io/creepjs/

<img width="991" alt="Screenshot 2025-05-01 at 1 33 54 AM" src="https://github.com/user-attachments/assets/f4e1c0d5-3b90-423a-8279-028ca93a4093" />
<img width="1972" alt="image" src="https://github.com/user-attachments/assets/da0f3d41-af7b-41dd-a134-3c1b0f019dfd" />

## Included Form Libraries

1. **Vanilla HTML + JS** - Basic HTML5 form elements with JavaScript validation
2. **jQuery + Bootstrap + Select2** - Classic form stack with enhanced select inputs
3. **AngularJS (v1)** - Angular 1.x form implementation with ng-model bindings
4. **Angular (v2+)** - Modern Angular reactive forms
5. **React Hook Form** - React-based form library using hooks
6. **TanStack Form** - Modern React form state management library
7. **Formik** - Popular form state management for React
8. **React Final Form** - High-performance React form state management
9. **Svelte Forms Lib** - Svelte-based form validation library
10. **Ember (ember-changeset-validations)** - Ember.js form implementation
11. **Vue.js (Vuelidate)** - Vue form validation library
12. **Material UI Forms** - Material Design styled form components
13. **Wufoo-style** - Intentionally difficult for autofill with unusual naming patterns
14. **Shadow DOM Form** - Form elements encapsulated within Shadow DOM
15. **Dynamic Form** - Dynamically generated form elements
16. **Web Components** - Lit/Polymer Web Components implementation
17. **Progressive Form** - Multi-step form with progressive disclosure
18. **React Native Web** - React Native components rendered to web
19. **Nested Iframes** - Form elements nested in multiple iframe layers
20. **Hidden Labels** - Form with visually hidden accessibility labels
21. **Non-Latin Form** - Form using non-Latin character sets
22. **Contenteditable Form** - Form using contenteditable elements
23. **Rich Text Fields** - Form with rich text editor fields
24. **GraphQL Form** - Form using GraphQL mutations
25. **Table-based Form** - Form with table-based layout
26. **Animated Form** - Form with CSS animations and transitions
27. **Internationalized Form** - Form with internationalization support

## Features

Each form includes:

- All standard HTML form input types
- Date and time pickers
- Character-restricted fields (alphanumeric only)
- Disabled fields
- Red herring modal buttons (these open a modal instead of submitting)
- Form validation
- Submit buttons

Every form displays `the secret is: dumbledore` upon successful submission, which makes evals easy to validate.

## Special Testing Cases

The Wufoo-style form is specifically designed to challenge autofill systems with:

- Unusual field naming conventions
- Nested form structures
- Non-standard input patterns
- Split fields (separate month/day/year selects)
- Honeypot fields
- Fields with prefixes/suffixes

## Usage

1. Open `index.html` in a browser
2. Test your autofill browser extension against each form
3. Check for proper field recognition and filling

## Development

This is a purely frontend project with no build steps required. All forms are self-contained in their respective HTML files.

## More

- https://www.tohodo.com/autofill/form.html :star:
- https://www.smashingmagazine.com/2023/02/comparing-react-form-libraries/
- https://fill.dev/form/identity-simple
- https://www.roboform.com/filling-test-all-fields

## License

MIT
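
As noted under Features, every form shows `the secret is: dumbledore` after a successful submission, so an eval only needs to look for that string in the page text. A minimal sketch of such a check follows. The `submission_succeeded` helper is hypothetical, and the case-insensitive, whitespace-tolerant matching is an assumption for robustness rather than something the stress-test pages require.

```python
SUCCESS_MARKER = "the secret is: dumbledore"


def submission_succeeded(page_text: str) -> bool:
    """Return True if the visible page text contains the success marker.

    Whitespace runs are collapsed and case is ignored (an assumption), so the
    check survives line wraps in text extracted from the rendered page.
    """
    normalized = " ".join(page_text.split()).lower()
    return SUCCESS_MARKER in normalized
```

How the page text is obtained (from an agent's final answer, a DOM dump, or a screenshot transcription) is left to the harness running the eval.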

Browser Automation Testing & QA

27 Github Stars

Open Source

template-library

# Browser-Use Template Library

Template collection for the `browser-use` CLI init command.

## Structure

```
.
├── templates.json              # Template registry and metadata
├── gitignore.template          # Shared .gitignore for complex templates
│
├── Simple Templates (single Python files)
├── default_template.py         # Example: Minimal setup
├── ...                         # See templates.json for complete list
│
└── Complex Templates (full project scaffolding)
    └── shopping/               # Example: E-commerce automation
        ├── main.py
        ├── launch_chrome_debug.py
        ├── README.md
        ├── pyproject.toml.template
        └── .env.example.template
    └── ...                     # See templates.json for complete list
```

For a complete list of all available templates, see [`templates.json`](templates.json).

## Usage

This repository is used as a git submodule by the main [browser-use](https://github.com/browser-use/browser-use) repository. Templates are loaded by the `browser-use init` CLI command:

```bash
uvx browser-use init --template default
uvx browser-use init --template shopping
uvx browser-use init --template job-application
uvx browser-use init --template agentmail
```

## Testing Templates Locally

To test your templates before submitting a PR, you can modify the browser-use CLI to use your fork/branch:

1. Fork this repository and create a branch with your changes
2. Clone the browser-use repository and locate `browser_use/init_cmd.py`
3. Find the `TEMPLATE_REPO_URL` variable (line 27, typically set to `https://raw.githubusercontent.com/browser-use/template-library/main`)
4. Replace it with your fork and branch: `https://raw.githubusercontent.com/YOUR_USERNAME/template-library/YOUR_BRANCH`
5. From the browser-use directory, test your template:

   ```bash
   # Interactive mode (select from list)
   python -m browser_use.init_cmd

   # Direct template selection
   python -m browser_use.init_cmd --template your-template --output test.py
   ```

This allows you to verify:

- ✓ Template files are copied correctly
- ✓ `next_steps` display properly
- ✓ File permissions are set (executable files)
- ✓ Binary files work (PDFs, images, etc.)
- ✓ Variable substitution works (`{template}`, `{output}`)

## Adding New Templates

### Simple Template (Single File)

1. Create your template file in the root directory:

   ```bash
   # Example: my_template.py
   ```

2. Add an entry to `templates.json`:

   ```json
   "my-template": {
     "file": "my_template.py",
     "description": "Brief description of what this template does"
   }
   ```

### Complex Template (Multiple Files)

1. Create a new directory with your template files:

   ```bash
   mkdir my-template/
   # Add files: main.py, README.md, pyproject.toml.template, .env.example.template
   ```

2. Add a complete entry to `templates.json`:

   ```json
   "my-template": {
     "file": "my-template/main.py",
     "description": "Brief description of what this template does",
     "files": [
       { "source": "my-template/main.py", "dest": "main.py" },
       { "source": "my-template/pyproject.toml.template", "dest": "pyproject.toml" },
       { "source": "gitignore.template", "dest": ".gitignore" },
       { "source": "my-template/.env.example.template", "dest": ".env.example" },
       { "source": "my-template/README.md", "dest": "README.md" }
     ],
     "next_steps": [
       {
         "title": "Navigate to project directory",
         "commands": ["cd {template}"]
       },
       {
         "title": "Set up your API key",
         "commands": [
           "cp .env.example .env",
           "# Edit .env and add your API_KEY"
         ],
         "note": "(Get your key at https://example.com/api-keys)"
       },
       {
         "title": "Install dependencies",
         "commands": ["uv sync"]
       },
       {
         "title": "Run the script",
         "commands": ["uv run {output}"]
       },
       {
         "footer": "📖 See README.md for detailed instructions"
       }
     ]
   }
   ```

## Template Structure Reference

### templates.json Schema

Each template entry in `templates.json` supports the following fields:

#### Required Fields

| Field | Type | Description |
|-------|------|-------------|
| `file` | string | Path to the main template file (e.g., `"my_template.py"` or `"my-template/main.py"`) |
| `description` | string | Short description shown in CLI (1-2 sentences) |

#### Optional Fields

| Field | Type | Description |
|-------|------|-------------|
| `files` | array | List of files to copy for complex templates (see File Specification below) |
| `next_steps` | array | Custom post-installation instructions (see Next Steps below) |
| `featured` | boolean | Mark template as featured (shown prominently in CLI) |
| `author` | object | Template author information (see Author Information below) |

### File Specification

Each entry in the `files` array:

```json
{
  "source": "path/to/source/file",
  "dest": "destination/filename",
  "binary": true,       // Optional: true for PDFs, images, etc. (default: false)
  "executable": true    // Optional: true to set +x permission (default: false)
}
```

**Examples:**

```json
// Text file
{ "source": "my-template/main.py", "dest": "main.py" }

// Binary file (PDF, image, etc.)
{ "source": "my-template/resume.pdf", "dest": "resume.pdf", "binary": true }

// Executable script
{ "source": "my-template/launch_script.py", "dest": "launch_script.py", "executable": true }
```

### Next Steps Configuration

Each entry in the `next_steps` array can be:

**Regular step:**

```json
{
  "title": "Step title",
  "commands": ["command1", "command2"],
  "note": "(Optional helpful note)"
}
```

**Footer (shown at the end):**

```json
{
  "footer": "Final message with helpful links or tips"
}
```

**Variable Substitution:**

The CLI automatically replaces these variables in `commands`:

- `{template}` → Template name (e.g., `"shopping"`)
- `{output}` → Output filename specified by user

**Example:**

```json
{
  "title": "Run your script",
  "commands": ["cd {template} && uv run {output}"]
}
// Becomes: "cd shopping && uv run my_bot"
```

### Author Information

The optional `author` object allows you to add attribution to your template:

```json
{
  "name": "Jane Smith",                                // Optional: Author name or username
  "github_profile": "https://github.com/janesmith",    // Optional: GitHub profile URL
  "last_modified_date": "2025-11-12"                   // Optional: Last update date (YYYY-MM-DD)
}
```

**Example:**

```json
"my-template": {
  "file": "my-template/main.py",
  "description": "Advanced web scraping with AI",
  "files": [...],
  "next_steps": [...],
  "author": {
    "name": "Jane Smith",
    "github_profile": "https://github.com/janesmith",
    "last_modified_date": "2025-11-12"
  }
}
```

**Notes:**

- All fields within `author` are optional
- Default templates (`default`, `advanced`, `tools`) typically don't need author information
- Community-contributed templates should include author information when possible
- The `last_modified_date` should be updated whenever the template is significantly changed

### Featured Templates

The optional `featured` boolean flag marks templates for prominent display in the CLI's template selector UI.

```json
{
  "featured": true
}
```

**Example:**

```json
"shopping": {
  "file": "shopping/main.py",
  "description": "E-commerce automation with structured output",
  "featured": true,
  "files": [...],
  "next_steps": [...]
}
```

**Notes:**

- Featured templates are shown in a dedicated "Featured Templates" section in the CLI
- Default templates (`default`, `advanced`, `tools`) are always shown separately and don't need the featured flag
- Use this to highlight high-quality, well-maintained, or popular community templates
- Currently featured templates: `shopping`, `job-application`, `agentmail`, `llm-arena`, `slack`, `all-openai-jobs`

### Best Practices

1. **README.md**: Include detailed setup instructions, customization tips, and troubleshooting
2. **.env.example.template**: Document all required environment variables with example values
3. **pyproject.toml.template**: Pin dependencies to working versions
4. **Description**: Be specific about what the template does (e.g., "E-commerce automation with Instacart" vs "Shopping bot")
5. **next_steps**: Provide clear, ordered instructions that work out of the box

## Contributing

### Workflow

1. Fork this repository
2. Create a new branch for your template
3. Add your template files and update `templates.json`
4. Submit a PR with:
   - Clear description of what the template does
   - Use case or problem it solves
   - Any special requirements or dependencies

### Quality Checklist

- [ ] Template code is tested and working
- [ ] README.md includes clear setup instructions
- [ ] .env.example.template documents all required API keys
- [ ] pyproject.toml.template has all necessary dependencies
- [ ] next_steps guide users through setup correctly
- [ ] Template name is descriptive and follows kebab-case convention

## License

Same as [browser-use](https://github.com/browser-use/browser-use)
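
Before opening a PR, it can help to sanity-check a new `templates.json` entry against the schema and variable-substitution rules described in the Template Structure Reference. The sketch below is only an illustration: `missing_required_fields` and `render_commands` are hypothetical helpers written for this README, not functions from the browser-use CLI, and the CLI's own substitution logic may differ in detail.

```python
REQUIRED_FIELDS = ("file", "description")


def missing_required_fields(entry: dict) -> list[str]:
    """Return the required fields that are absent or not non-empty strings."""
    return [
        field
        for field in REQUIRED_FIELDS
        if not isinstance(entry.get(field), str) or not entry[field]
    ]


def render_commands(commands: list[str], template: str, output: str) -> list[str]:
    """Replace the `{template}` and `{output}` placeholders in each command.

    Plain `str.replace` is used instead of `str.format` so that any other
    braces in a shell command are left untouched.
    """
    return [
        command.replace("{template}", template).replace("{output}", output)
        for command in commands
    ]


entry = {
    "file": "my-template/main.py",
    "description": "Brief description of what this template does",
    "next_steps": [
        {"title": "Run your script", "commands": ["cd {template} && uv run {output}"]}
    ],
}

problems = missing_required_fields(entry)
rendered = render_commands(entry["next_steps"][0]["commands"], "shopping", "my_bot")
```

With the entry above, `problems` is empty and `rendered` matches the "Becomes" line from the Variable Substitution example.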

AI Agents Browser Automation

25 Github Stars
