# browserclaw
<p align="center">
  <img src="assets/logo.png" alt="BrowserClaw" width="400" />
</p>

<p align="center">
  <a href="https://browserclaw.org"><img src="https://img.shields.io/badge/Live-browserclaw.org-orange" alt="Live" /></a>
  <a href="https://www.npmjs.com/package/browserclaw"><img src="https://img.shields.io/npm/v/browserclaw.svg" alt="npm version" /></a>
  <a href="./LICENSE"><img src="https://img.shields.io/badge/License-MIT-blue.svg" alt="License: MIT" /></a>
  <a href="https://www.npmjs.com/package/browserclaw"><img src="https://img.shields.io/npm/dw/browserclaw" alt="npm downloads" /></a>
  <a href="https://github.com/idan-rubin/browserclaw/stargazers"><img src="https://img.shields.io/github/stars/idan-rubin/browserclaw" alt="GitHub stars" /></a>
</p>

> **DISCLAIMER: This project is NOT affiliated with browserclaw.com in any form. We have no connection to that site and recommend treating it with caution.**

The AI-native browser automation library — born from [OpenClaw](https://github.com/openclaw/openclaw), built for agents.

**Snapshot + ref targeting** — no CSS selectors, no XPath, no vision, just numbered refs that map to interactive elements.

```typescript
import { BrowserClaw } from 'browserclaw';

const browser = await BrowserClaw.launch({ url: 'https://demo.playwright.dev/todomvc' });
const page = await browser.currentPage();

// Snapshot — the core feature
const { snapshot, refs } = await page.snapshot();
// snapshot: AI-readable text tree
// refs: { "e1": { role: "textbox", name: "What needs to be done?" }, "e2": { role: "link", name: "Playwright" } }

await page.type('e1', 'Buy groceries', { submit: true }); // Type by ref
await page.click('e2'); // Click by ref

await browser.stop();
```

## Why browserclaw?

Most browser automation tools were built for humans writing test scripts.
AI agents need something different:

- **Vision-based tools** (screenshot → click coordinates) are slow, expensive, and probabilistic
- **Selector-based tools** (CSS/XPath) are brittle and meaningless to an LLM
- **browserclaw** gives the AI a **text snapshot** with numbered refs — the AI reads text (what it's best at) and returns a ref ID (deterministic targeting)

The snapshot + ref pattern means:

1. **Deterministic** — refs resolve to exact elements via Playwright locators, no guessing
2. **Fast** — text snapshots are tiny compared to screenshots
3. **Cheap** — no vision API calls, just text in/text out
4. **Reliable** — built on Playwright, the most robust browser automation engine

## Comparison with Other Tools

The AI browser automation space is moving fast. Here's how browserclaw compares to the major alternatives.

|                                          | [browserclaw](https://github.com/idan-rubin/browserclaw) | [browser-use](https://github.com/browser-use/browser-use) | [Stagehand](https://github.com/browserbase/stagehand) | [Playwright MCP](https://github.com/microsoft/playwright-mcp) |
| :--------------------------------------- | :------------------------------------------------------: | :-------------------------------------------------------: | :---------------------------------------------------: | :-----------------------------------------------------------: |
| Ref → exact element, no guessing         | :white_check_mark: | :heavy_minus_sign: | :x: | :white_check_mark: |
| No vision model in the loop              | :white_check_mark: | :heavy_minus_sign: | :white_check_mark: | :white_check_mark: |
| Survives redesigns (semantic, not pixel) | :white_check_mark: | :heavy_minus_sign: | :white_check_mark: | :white_check_mark: |
| Fill 10 form fields in one call          | :white_check_mark: | :x: | :x: | :x: |
| Interact with cross-origin iframes       | :white_check_mark: | :white_check_mark: | :x: | :x: |
| Playwright engine (auto-wait, locators)  | :white_check_mark: | :x: | :white_check_mark: | :white_check_mark: |
| Embeddable in your own JS/TS agent loop  | :white_check_mark: | :x: | :heavy_minus_sign: | :x: |

:white_check_mark: = Yes  :heavy_minus_sign: = Partial  :x: = No

**browserclaw is the only tool that checks every box.** It combines the precision of accessibility snapshots with Playwright's battle-tested engine, batch operations, cross-origin iframe access, and zero framework lock-in — in a single embeddable library.

### The key distinction: browser tool vs. AI agent

Most tools in this space are **AI agents that happen to control a browser**. They own the intelligence layer: they take a task, call an LLM, decide what actions to take, and execute them. That's a complete agent.

browserclaw is different. It's a **browser tool** — just the eyes and hands. It takes a snapshot and returns refs. It executes actions on refs. The LLM, the reasoning, the task planning — that all lives in your code, in your agent, wherever you want it. browserclaw doesn't have opinions about any of that.

This distinction matters if you're building an agent platform, a product with its own AI layer, or anything where you need to control the intelligence loop. You can't compose an agent-first tool into a system that already has an agent. You end up with two brains fighting over who's in charge.

### How each tool works under the hood

- **browserclaw** — Accessibility snapshot with numbered refs → Playwright locator (`aria-ref` in default mode, `getByRole()` in role mode). One ref, one element. No vision model, no LLM in the targeting loop. You bring the brain.
- **browser-use** — A complete AI agent: takes a task, calls an LLM, decides actions, executes them. The LLM loop is inside the library. Great for standalone automation scripts; incompatible with platforms that already own the agent loop. Python-only.
- **Stagehand** — Accessibility tree + natural language primitives (`page.act("click login")`).
  Convenient, but the LLM re-interprets which element to target on every single call — non-deterministic by design.
- **Playwright MCP** — Same snapshot philosophy as browserclaw, but locked to the MCP protocol. Great for chat-based agents, but not embeddable as a library — you can't compose it into your own agent loop or call it from application code.

### Why this matters for repeated complex UI tasks

When you're running the same multi-step workflow hundreds of times — filling forms, navigating dashboards, processing queues — the differences compound:

- **Cost**: ~4x fewer tokens per run than vision-based tools. A 20-step task repeated 100 times: ~3M tokens vs ~12M+.
- **Speed**: No vision API round-trips. A 20-step workflow finishes in seconds, not minutes.
- **Reliability**: Ref-based targeting is deterministic. Same page state → same refs → same result. No coordinate guessing, no LLM re-interpretation.
- **Simplicity**: No framework opinions, no agent loop, no hosted platform. Just `snapshot()` → read refs → act. Compose it into whatever agent architecture you want.

## Try It Live — Or On Your Machine

[browserclaw.org](https://browserclaw.org) is an open-source playground where you can type a prompt and watch an AI agent use browserclaw in a real browser — live. No setup, no API keys, just a text box and a browser stream.

Want to run it yourself? The source is at [github.com/idan-rubin/browserclaw-agent](https://github.com/idan-rubin/browserclaw-agent) — spin it up with Docker or Node.js. Supports Groq, Gemini, OpenAI, and Anthropic out of the box.

## Install

```bash
npm install browserclaw
```

Requires a Chromium-based browser installed on the system (Chrome, Brave, Edge, or Chromium). browserclaw auto-detects your installed browser — no need to install Playwright browsers separately.
## How It Works

```
┌─────────────┐     snapshot()     ┌──────────────────────────────────────────┐
│  Web Page   │ ──────────────►    │  AI-readable text tree                   │
│             │                    │                                          │
│  [buttons]  │                    │  - heading "todos"                       │
│  [links]    │                    │  - textbox "What needs to be done?" [e1] │
│  [inputs]   │                    │  - link "Playwright" [e2]                │
└─────────────┘                    └──────────────┬───────────────────────────┘
                                                  │
                               AI reads snapshot, decides: type in e1
                                                  │
┌─────────────┐   type('e1',...)   ┌──────────────▼──────────────────┐
│  Web Page   │ ◄──────────────    │  Ref "e1" resolves to a         │
│  (updated)  │                    │  Playwright locator — one ref,  │
│             │                    │  one exact element              │
└─────────────┘                    └─────────────────────────────────┘
```

1. **Snapshot** a page → get an AI-readable text tree with numbered refs (`e1`, `e2`, `e3`...)
2. **AI reads** the snapshot text and picks a ref to act on
3. **Actions target refs** → browserclaw resolves each ref to a Playwright locator and executes the action

> **Note:** Refs are scoped to the snapshot that created them. After navigation or DOM changes, old refs become invalid — actions will fail with an error (timeout in aria mode, `"Unknown ref"` in role mode). Always re-snapshot before acting on a changed page.
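
As a rough illustration of step 2, the sketch below stands in for the model with a deterministic lookup: it scans a `refs` map shaped like the one in the quick-start comment and returns the first ref whose role and name match. `RefInfo` and `pickRef` are hypothetical names used only for this example; they are not part of browserclaw's API.

```typescript
// Hypothetical shape, inferred from the `refs` comment in the quick-start example.
interface RefInfo {
  role: string;
  name?: string;
}

// Stand-in for the "AI reads the snapshot" step: return the first ref ID
// whose role and accessible name match, or undefined if nothing matches.
function pickRef(
  refs: Record<string, RefInfo>,
  role: string,
  name: string,
): string | undefined {
  for (const [id, info] of Object.entries(refs)) {
    if (info.role === role && info.name === name) return id;
  }
  return undefined;
}

const exampleRefs: Record<string, RefInfo> = {
  e1: { role: 'textbox', name: 'What needs to be done?' },
  e2: { role: 'link', name: 'Playwright' },
};

const target = pickRef(exampleRefs, 'textbox', 'What needs to be done?');
console.log(target); // e1
```

In a real loop the returned ID would be passed to an action such as `page.type(target, ...)`, and a fresh `page.snapshot()` would be taken before the next decision, since refs do not survive navigation or DOM changes.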
## API

### Launch & Connect

```typescript
// Launch a new Chrome instance (auto-detects Chrome/Brave/Edge/Chromium)
const browser = await BrowserClaw.launch({
  url: 'https://demo.playwright.dev/todomvc', // navigate initial tab (no extra tabs)
  headless: false, // default: false (visible window)
  executablePath: '...', // optional: specific browser path
  cdpPort: 9222, // default: 9222
  noSandbox: false, // default: false (set true for Docker/CI)
  ignoreHTTPSErrors: false, // default: false (set true for expired local dev certs)
  userDataDir: '...', // optional: custom user data directory
  profileName: 'browserclaw', // profile name in Chrome title bar
  profileColor: '#FF4500', // profile accent color (hex)
  chromeArgs: ['--start-maximized'], // additional Chrome flags
  isolated: true, // fresh per-run profile, auto-cleaned on stop()
});

// Connect to an already-running Chrome instance
const browser = await BrowserClaw.connect('http://localhost:9222');

// Auto-discovery: scans common CDP ports (9222-9226, 9229)
const browser = await BrowserClaw.connect();
```

`connect()` checks that Chrome is reachable, then the internal CDP connection retries 3 times with increasing timeouts (5 s, 7 s, 9 s) — safe for Docker/CI where Chrome starts slowly.

**Anti-detection:** `launch()` always passes Chrome the flag that disables the `AutomationControlled` Blink feature. `connect()` attaches to an already-running Chrome, so it cannot add launch flags retroactively. To inject JavaScript stealth patches for `navigator.webdriver`, plugins, WebGL vendor, and related browser signals, pass `stealth: true` to `launch()` or `connect()`.

#### Isolated profiles (per-run, per-process)

Pass `isolated: true` (or `isolated: 'some-label'`) to launch in a dedicated per-run profile under `$TMPDIR/browserclaw/isolated/`:

- A run-scoped random suffix is **always** appended — including when you pass a label string.
  Two concurrent launches with the same label (`isolated: 'my-run'`) each get a unique directory and never collide on Chrome's SingletonLock. The label is for identification only; it does not produce a stable profile across runs.
- `stop()` removes the isolated user-data directory on exit (best-effort; silent on failure). If the process crashes before `stop()`, leftover directories remain under `$TMPDIR/browserclaw/isolated/` and can be deleted safely when no Chrome process is using them.
- When `isolated` is set, `profileName` and `userDataDir` options are ignored.
- Any cookies, logins, extensions, or localStorage from prior runs are not available — by design.

For a stable, shared profile across runs (persistent login state, preserved history), omit `isolated` and use `profileName` / `userDataDir` instead.

#### SSRF policy (navigating agent-supplied URLs)

By default, browserclaw permits navigation to **any** address — including private/loopback ranges such as `127.0.0.1`, `10.0.0.0/8`, and cloud metadata endpoints like `169.254.169.254`. This "trusted-network" default is convenient for local development and dev-tunnel workflows.

If your agent navigates to URLs it received from an untrusted source (LLM output, user input, external API), you should opt into strict public-only enforcement:

```typescript
const browser = await BrowserClaw.launch({
  ssrfPolicy: {
    dangerouslyAllowPrivateNetwork: false, // block loopback, RFC1918, link-local, metadata endpoints
    hostnameAllowlist: ['*.example.com'], // optional allowlist
    allowedHostnames: ['internal.myapp.com'], // optional private-IP exceptions
  },
});
```

Under strict mode browserclaw resolves DNS up front, pins the result, validates every resolved address against the policy, and re-checks redirect chains. The DNS cache is keyed by policy, so a permissive call does not leak cached private IPs to a later strict call.
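
To make the address classes concrete, here is a small sketch of the kind of check a strict, public-only policy implies for IPv4. It is an illustration under stated assumptions, not browserclaw's implementation: `isBlockedIPv4` is a hypothetical helper, it covers only the loopback, RFC 1918, and link-local ranges named above, and it ignores IPv6 and DNS resolution entirely.

```typescript
// Hypothetical helper: returns true for IPv4 addresses in loopback,
// RFC 1918 private, or link-local ranges. Malformed input also returns
// true, so a caller using this as a block-list fails closed.
function isBlockedIPv4(ip: string): boolean {
  const parts = ip.split('.');
  if (parts.length !== 4) return true;
  const octets = parts.map((p) => (/^\d{1,3}$/.test(p) ? Number(p) : NaN));
  if (octets.some((o) => Number.isNaN(o) || o > 255)) return true;

  const [a, b] = octets;
  if (a === 127) return true; // loopback, 127.0.0.0/8
  if (a === 10) return true; // RFC 1918, 10.0.0.0/8
  if (a === 172 && b >= 16 && b <= 31) return true; // RFC 1918, 172.16.0.0/12
  if (a === 192 && b === 168) return true; // RFC 1918, 192.168.0.0/16
  if (a === 169 && b === 254) return true; // link-local, includes 169.254.169.254
  return false;
}

console.log(isBlockedIPv4('169.254.169.254')); // true
console.log(isBlockedIPv4('8.8.8.8')); // false
```

A check like this only means something when it runs against the resolved address rather than the hostname, which is why the strict mode described above resolves and pins DNS before validating.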
### Pages & Tabs

```typescript
const page = await browser.open('https://demo.playwright.dev/todomvc');
const current = await browser.currentPage(); // get first usable (non-blank) tab
const tabs = await browser.tabs(); // list all tabs
const handle = browser.page(tabs[0].targetId); // wrap existing tab
const appPage = await browser.waitForTab({ urlContains: 'app-web' });
await browser.focus(tabId); // bring tab to front
await browser.close(tabId); // close a tab
await browser.stop(); // stop browser + cleanup

page.id; // CDP target ID (use with focus/close/page)
await page.url(); // current page URL
await page.title(); // current page title
browser.url; // CDP endpoint URL
```

#### Recovering tab handles

Tab handles can get out of sync if the app rewrites its URL aggressively or replaces the top-level target. Use the recovery primitives to re-bind a `CrawlPage` without having to restart the session:

```typescript
// Attempts to refresh the cached targetId, optionally falling back to the
// best-effort resolver if the original target is gone.
await page.refreshTargetId();
await page.refreshTargetId({ fallback: 'active' });

// Rebind the handle using the best-effort resolver: prefers the old
// targetId, then the old URL, then a non-blank tab, then any tab.
await page.reacquire();
```

> **Contract — heuristic by design:** These resolvers do not query Chrome's focused tab; CDP doesn't expose that cleanly over connect-over-CDP. They apply a fixed preference order — old targetId → old URL → first non-blank accessible tab → any accessible tab — and that order is the contract. Use them for recovery after a target has been lost; don't use them to "ask which tab the human is looking at." When you need deterministic tab selection, capture the `targetId` up front via `browser.open()` / `browser.waitForTab()` / `browser.tabs()` and keep using that handle.
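
The preference order in the contract can be written down as a small pure function. This is a sketch of the documented order only; `TabInfo`, `resolveTab`, the `url` field, and the use of `about:blank` to detect blank tabs are assumptions made for the example, not browserclaw internals.

```typescript
// Hypothetical tab shape; only `targetId` appears in the API examples above.
interface TabInfo {
  targetId: string;
  url: string;
}

// Apply the documented order: old targetId -> old URL -> first non-blank tab -> any tab.
function resolveTab(
  tabs: TabInfo[],
  previous: { targetId: string; url: string },
): TabInfo | undefined {
  return (
    tabs.find((t) => t.targetId === previous.targetId) ??
    tabs.find((t) => t.url === previous.url) ??
    tabs.find((t) => t.url !== '' && t.url !== 'about:blank') ??
    tabs[0]
  );
}

const openTabs: TabInfo[] = [
  { targetId: 'A', url: 'about:blank' },
  { targetId: 'B', url: 'https://demo.playwright.dev/todomvc' },
];

// The old target 'X' is gone, but its URL is still open in tab 'B'.
const recovered = resolveTab(openTabs, {
  targetId: 'X',
  url: 'https://demo.playwright.dev/todomvc',
});
console.log(recovered?.targetId); // B
```

Because the order is fixed, the same set of open tabs always yields the same result, which makes it usable for recovery but not for guessing which tab a human has focused.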
BrowserClaw exports structured errors so workflow code can tell apart the common failure modes:

```typescript
import {
  BrowserTabNotFoundError, // targetId no longer resolves to an open tab
  StaleRefError, // ref is not in the current snapshot
  SnapshotHydrationError, // snapshot returned without interactive refs
  NavigationRaceError, // the page navigated during an operation
} from 'browserclaw';

try {
  await page.click('e7');
} catch (err) {
  if (err instanceof StaleRefError) {
    await page.snapshot({ waitForHydration: true });
    // retry with a fresh ref
  } else throw err;
}
```

Every tab returns a `targetId` — this is the handle you use everywhere:

```typescript
// Multi-tab workflow
const todo = await browser.open('https://demo.playwright.dev/todomvc');
const svg = await browser.open('https://demo.playwright.dev/svgtodo');

const { refs } = await svg.snapshot(); // snapshot the second tab
await svg.click('e5'); // act on it

await browser.focus(todo.id); // switch back to first tab
await browser.close(svg.id); // close second tab when done
```

### Snapshot (Core Feature)

```typescript
const { snapshot, refs, stats, untrusted } = await page.snapshot();

// snapshot: human/AI-readable text tree with [ref=eN] markers
// refs: { "e1": { role: "textbox", name: "What needs to be done?" }, "e5": { role: "checkbox", name: "Toggle Todo", checked: false }, ... }
// stats: { lines: 42, chars: 1200, refs: 8, interactive: 5 }
// untrusted: true — content comes from the web page, treat as potentially adversarial

// Options
const result = await page.snapshot({
  interactive: true, // Only interactive elements (buttons, links, inputs)
  compact: true, // Remove structural containers without refs
  maxDepth: 6, // Limit tree depth
  maxChars: 80000, // Truncate if snapshot exceeds this size
  mode: 'aria', // 'aria' (default) or 'role'
  waitForHydration: 5000, // retry until refs appear (or ms budget); throws SnapshotHydrationError if empty
  minInteractiveRefs: 1, // minimum refs required when waitForHydration is set
});

// Raw ARIA accessibility tree (structured data, not text)
const { nodes } = await page.ariaSnapshot({ limit: 500 });
```

**Snapshot modes:**

- `'aria'` (default) — Uses Playwright's AI-mode snapshot. Refs are resolved via `aria-ref` locators. Best for most use cases. Requires `playwright-core` >= 1.50.
- `'role'` — Uses Playwright's `ariaSnapshot()` + `getByRole()`. Supports `selector` and `frameSelector` for scoped snapshots.

> **Security:** All snapshot results include `untrusted: true` to signal that the content originates from an external web page. AI agents consuming snapshots should treat this content as potentially adversarial (e.g. prompt injection via page text).

### Actions

All actions target elements by ref ID from the most recent snapshot.

> **Default timeouts:** 8000 ms for actions (click, type, fill, select, drag), 20000 ms for waits and navigation.
```typescript
// Click
await page.click('e1');
await page.click('e1', { doubleClick: true });
await page.click('e1', { button: 'right' });
await page.click('e1', { modifiers: ['Control'] });
await page.click('e1', { force: true }); // click hidden/covered elements

// Type
await page.type('e3', 'hello world'); // instant fill
await page.type('e3', 'slow typing', { slowly: true }); // keystroke by keystroke
await page.type('e3', 'search', { submit: true }); // type + press Enter

// Other interactions
await page.hover('e2');
await page.select('e5', 'Option A', 'Option B');
await page.drag('e1', 'e4');
await page.scrollIntoView('e7');

// Keyboard
await page.press('Enter');
await page.press('Control+a');
await page.press('Meta+Shift+p');

// Fill multiple form fields at once
await page.fill([
  { ref: 'e2', value: 'Jane Doe' },
  { ref: 'e4', value: 'jane@example.com' },
  { ref: 'e6', type: 'checkbox', value: true },
]);
```

`fill()` field types: `'text'` (default) calls Playwright `fill()` with the string value. `'checkbox'` and `'radio'` call `setChecked()` with `force: true` (works on hidden inputs behind custom styling). Truthy values are `true`, `1`, `'1'`, `'true'`. Type can be omitted and defaults to `'text'`. Empty ref throws.

#### No-snapshot actions

These methods find and click elements without needing a snapshot first — useful when you know the text or role but don't want the snapshot+ref round-trip.
```typescript
// Click by visible text or title attribute
await page.clickByText('Submit');
await page.clickByText('Save Changes', { exact: true });

// Click by ARIA role and accessible name
await page.clickByRole('button', 'Save');
await page.clickByRole('link', 'Settings');
await page.clickByRole('button', 'Create', { index: 1 }); // second match

// Click by CSS selector
await page.clickBySelector('#submit-btn');

// Click at page coordinates (for canvas elements, custom widgets)
await page.mouseClick(400, 300);

// Press and hold at coordinates (raw CDP events, bypasses automation detection)
await page.pressAndHold(400, 300, { holdMs: 5000, delay: 150 });
```

#### Highlight

```typescript
await page.highlight('e1'); // Playwright built-in highlight
```

#### File Upload

Upload paths are confined to a sandboxed directory: `$TMPDIR/browserclaw/uploads` (e.g. `/tmp/browserclaw/uploads` on Linux). Files must exist inside this directory before uploading — paths outside it are rejected. Stage the file first, then reference it by path:

```typescript
import { DEFAULT_UPLOAD_DIR } from 'browserclaw';
import { copyFile, mkdir } from 'node:fs/promises';
import { join } from 'node:path';

// Stage the file inside the sandboxed uploads directory
await mkdir(DEFAULT_UPLOAD_DIR, { recursive: true });
const staged = join(DEFAULT_UPLOAD_DIR, 'file.pdf');
await copyFile('/path/to/file.pdf', staged);

// Direct: set files on an <input type="file">
await page.uploadFile('e3', [staged]);

// Arm pattern: for non-input file pickers
// Awaiting the call resolves once the listener is armed; awaiting `done`
// resolves after files have been set on the chooser.
const { done } = await page.armFileUpload([staged]);
await page.click('e3'); // triggers the file chooser
await done;
```

#### Dialog Handling

Handle JavaScript dialogs (alert, confirm, prompt). Arm the handler _before_ the action that triggers the dialog.
```typescript
const dialogDone = page.armDialog({ accept: true });
await page.click('e5'); // triggers confirm()
await dialogDone;

// With prompt text
const promptDone = page.armDialog({ accept: true, promptText: 'my answer' });
await page.click('e6'); // triggers prompt()
await promptDone;

// Persistent handler: called for every dialog until cleared
await page.onDialog((event) => {
  console.log(`${event.type}: ${event.message}`);
  event.accept(); // or event.dismiss()
});
await page.onDialog(undefined); // clear the handler
```

By default, unexpected dialogs are auto-dismissed to prevent `ProtocolError` crashes.

### Navigation & Waiting

```typescript
await page.goto('https://demo.playwright.dev/todomvc');
await page.reload(); // reload the current page
await page.goBack(); // navigate back in history
await page.goForward(); // navigate forward in history

await page.waitFor({ loadState: 'networkidle' });
await page.waitFor({ text: 'Welcome' });
await page.waitFor({ textGone: 'Loading...' });
await page.waitFor({ url: '**/dashboard' });
await page.waitFor({ selector: '.loaded' }); // wait for CSS selector
await page.waitFor({ fn: '() => document.readyState === "complete"' }); // custom JS (string)
await page.waitFor({ fn: () => document.title === 'Done' }); // custom JS (function)
await page.waitFor({ fn: (name) => document.querySelector('button')?.textContent === name, arg: 'Save' }); // with arg
await page.waitFor({ timeMs: 1000 }); // sleep
await page.waitFor({ text: 'Ready', timeoutMs: 5000 }); // custom timeout
```

### Capture

```typescript
// Screenshots
const screenshot = await page.screenshot(); // viewport PNG → Buffer
const fullPage = await page.screenshot({ fullPage: true }); // full scrollable page
const element = await page.screenshot({ ref: 'e1' }); // specific element by ref
const bySelector = await page.screenshot({ element: '.hero' }); // by CSS selector
const jpeg = await page.screenshot({ type: 'jpeg' }); // JPEG format

// PDF
const pdf = await page.pdf(); // PDF export (headless only)

// Labeled screenshot — numbered badges on each ref for visual debugging
const { buffer, labels, skipped } = await page.screenshotWithLabels(['e1', 'e2', 'e3']);
// buffer: PNG with numbered overlays
// labels: [{ ref: 'e1', index: 1, box: { x, y, width, height } }, ...]
// skipped: refs that couldn't be found or had no bounding box
```

Both `screenshot()` and `pdf()` return a `Buffer`. Write to file with `fs.writeFileSync('out.png', screenshot)`.

#### Trace Recording

Capture Playwright traces (screenshots, DOM snapshots, network) for debugging.

```typescript
await page.traceStart({ screenshots: true, snapshots: true });
// ... perform actions ...
await page.traceStop('trace.zip');
// Open with: npx playwright show-trace trace.zip
```

#### Response Body

Intercept a network response and read its body.

```typescript
const resp = await page.responseBody('/api/data');
console.log(resp.status, resp.body);
// { url, status, headers, body, truncated }
```

Options: `timeoutMs` (default 30 s), `maxChars` (truncate body).

#### Wait For Request

Wait for a network request matching a URL pattern and get full request + response details, including POST body.

```typescript
const reqPromise = page.waitForRequest('/api/submit', { method: 'POST' });
await page.click('e5'); // submit a form
const req = await reqPromise;
console.log(req.method, req.postData); // 'POST', '{"name":"Jane"}'
console.log(req.status, req.ok); // 200, true
console.log(req.responseBody); // '{"id":123}'
// { url, method, postData?, status, ok, responseBody?, truncated? }
```

Options: `method` (filter by HTTP method), `timeoutMs` (default 30 s), `maxChars` (truncate response body).

### Activity Monitoring

Console messages, errors, and network requests are buffered automatically.
```typescript
const logs = await page.consoleLogs(); // all messages
const errors = await page.consoleLogs({ level: 'error' }); // errors only
const recent = await page.consoleLogs({ clear: true }); // read and clear buffer
const pageErrors = await page.pageErrors(); // uncaught exceptions
const requests = await page.networkRequests({ filter: '/api' }); // filter by URL
const fresh = await page.networkRequests({ clear: true }); // read and clear buffer
```

### Storage

```typescript
// Cookies
const cookies = await page.cookies();
await page.setCookie({ name: 'token', value: 'abc', url: 'https://demo.playwright.dev' });
await page.clearCookies();

// localStorage / sessionStorage
const values = await page.storageGet('local');
const token = await page.storageGet('local', 'authToken');
await page.storageSet('local', 'key', 'value');
await page.storageClear('session');
```

### Downloads

```typescript
// Click a download link and save the file
const result = await page.download('e7', '/tmp/report.pdf');
console.log(result.suggestedFilename); // 'report.pdf'
// Returns: { url, suggestedFilename, path }

// Arm pattern: wait for next download (call before triggering)
const dlPromise = page.waitForDownload({ path: '/tmp/file.pdf' });
await page.click('e8'); // triggers download
const dl = await dlPromise;
```

### Emulation

```typescript
// Device emulation (viewport + user agent)
await page.setDevice('iPhone 13');

// Color scheme
await page.emulateMedia({ colorScheme: 'dark' });

// Geolocation
await page.setGeolocation({ latitude: 48.8566, longitude: 2.3522 }); // Paris
await page.setGeolocation({ clear: true }); // reset

// Locale & timezone
await page.setLocale('fr-FR');
await page.setTimezone('Europe/Paris');

// Network
await page.setOffline(true);
await page.setExtraHeaders({ 'X-Custom': 'value' });
await page.setHttpCredentials({ username: 'admin', password: 'secret' });
await page.setHttpCredentials({ clear: true }); // remove
```

### Evaluate

Run JavaScript directly in the browser page context.

```typescript
const title = await page.evaluate('() => document.title');
const text = await page.evaluate('(el) => el.textContent', { ref: 'e1' });
const count = await page.evaluate('() => document.querySelectorAll("img").length');
```

#### `evaluateInAllFrames(fn)`

Run JavaScript in ALL frames on the page, including cross-origin iframes. Playwright bypasses the same-origin policy via CDP, making this essential for interacting with embedded payment forms (Stripe, etc.).

```typescript
const results = await page.evaluateInAllFrames(`() => {
  const el = document.querySelector('input[name="cardnumber"]');
  return el ? 'found' : null;
}`);
// Returns: [{ frameUrl: '...', frameName: '...', result: 'found' }, ...]
```

### Viewport

```typescript
await page.resize(1280, 720);
```

## Examples

See the [`examples/`](./examples) directory for runnable demos:

- **[basic.ts](./examples/basic.ts)** — Navigate, snapshot, click a ref
- **[form-fill.ts](./examples/form-fill.ts)** — Fill a multi-field form using refs
- **[ai-agent.ts](./examples/ai-agent.ts)** — AI agent loop pattern with Claude/GPT

Run from the source tree:

```bash
npx tsx examples/basic.ts
```

## Requirements

- **Node.js** >= 18
- **Chromium-based browser** installed (Chrome, Brave, Edge, or Chromium)
- **playwright-core** >= 1.50 (installed automatically as a dependency)

No need to install Playwright browsers — browserclaw uses your system's existing Chrome installation via CDP.

## Contributing

Contributions welcome! Please:

1. Fork the repository
2. Create a feature branch (`git checkout -b my-feature`)
3. Make your changes
4. Run `npm run typecheck && npm run build` to verify
5. Submit a pull request

## Related Projects

- **[chrome-relay](https://chrome-relay.kushalsm.com/)** — A native messaging host that exposes your already-running, logged-in Chrome over CDP, so any harness (a local CLI, Claude Code, Codex, a remote machine) can drive that same session.
  browserclaw is the snapshot + ref layer; chrome-relay handles the transport. In principle a browserclaw-style snapshot+refs loop can run over chrome-relay as the underlying CDP surface.

## Acknowledgments

browserclaw was born from the browser automation module in [OpenClaw](https://github.com/openclaw/openclaw), built by [Peter Steinberger](https://github.com/steipete) and an [amazing community of contributors](https://github.com/openclaw/openclaw?tab=readme-ov-file#community). The snapshot + ref system, CDP connection management, and Playwright integration originate from that project.

## License

[MIT](./LICENSE)