BrowserKing
Open-source browser agent Chrome extension powered by any LLM provider
Use GPT-4o, Claude, Gemini, Grok, DeepSeek, Mistral, Llama, and 15+ other models as your browser agent — all from one extension.
What is BrowserKing?
BrowserKing is a Chrome extension that gives any LLM the ability to see and control your browser. It works like a human assistant that can:
- Take screenshots and understand what's on screen
- Click, type, scroll and navigate web pages
- Read page content and extract information
- Open tabs and work across multiple pages
- Record and replay workflows for repetitive tasks
Unlike other browser agents locked to a single provider, BrowserKing lets you bring your own API key from any OpenAI-compatible provider — or use the Anthropic API directly.
Supported Providers
| Provider | Models | Vision |
|---|---|---|
| OpenAI | GPT-4o, GPT-4.1, o3, o4-mini | Yes |
| Anthropic | Claude 4 Sonnet, Claude 4 Opus | Yes |
| Gemini 2.5 Pro/Flash | Yes | |
| xAI | Grok 3, Grok 3 Mini | Yes |
| DeepSeek | DeepSeek-V3, DeepSeek-R1 | No |
| Mistral | Mistral Large, Pixtral Large | Pixtral only |
| Groq | Llama 4 Scout/Maverick, Llama 3.3 | Scout/Maverick |
| Cerebras | Llama 3.3 70B | No |
| Perplexity | Sonar Pro, Sonar Huge | No |
| Z.AI | GLM-4.6V, GLM-4.5V | Yes |
| Custom | Any OpenAI-compatible endpoint | Configurable |
Each provider gets its own theme color throughout the UI — the sidebar, send button, glow effects, and page border all match your active provider.
Run Local Models
BrowserKing works with any OpenAI-compatible API, which means you can run it with local models too. Point it at Ollama, LM Studio, LocalAI, or any other local inference server that exposes an OpenAI-compatible endpoint.
Setup:
- Start your local server (e.g.,
ollama serveor launch LM Studio) - In BrowserKing settings, select the Custom provider
- Set the base URL to your local endpoint (e.g.,
http://localhost:11434/v1for Ollama) - Enter any string as the API key (local servers usually don't require one, but the field can't be empty)
- Select or type your model name
Best models for local browser agents:
llama-4-scoutorllama-4-maverick(vision + tool use)qwen2.5-vl(strong vision)- Any model with vision and function calling support
Note: Local model support is experimental. Performance depends heavily on your hardware and the model's ability to handle tool calls and vision inputs. Cloud providers with dedicated function calling support will generally give the most reliable results. If you run into issues, please open an issue — we'd love to hear what works and what doesn't.
Features
- Multi-provider support — Switch between providers and models mid-conversation from the dropdown
- Automatic vision detection — Knows which models support images and which don't, with safe fallback
- Provider-themed UI — Every color in the extension adapts to your active provider
- Page glow border — Pulsing colored border around the page while the agent is working
- No account required — No sign-up, no subscription. Just add your API key and go
- Workflow recording — Record browser actions and replay them later
- GIF export — Export recordings as GIFs
- Tool use — Models that support function calling get collapsible tool-use blocks in the chat
Installation
- Download or clone this repo
- Open
chrome://extensionsin Chrome - Enable Developer mode (toggle in the top right)
- Click Load unpacked and select this folder
- Click the BrowserKing icon in your toolbar to open the side panel
- Go to Settings and add your API key for at least one provider
Quick Start
- Open any web page
- Open BrowserKing from the side panel
- Select your provider and model from the dropdown at the top
- Type a task like "Find the cheapest flight from Calgary to Tokyo next month"
- Watch the agent work — it takes screenshots, clicks, scrolls, and reports back
Configuration
Open the Provider Settings page from the extension options to:
- Enable/disable providers
- Enter API keys
- Set custom base URLs (for self-hosted or proxy endpoints)
- Choose default models
- Fetch available models from provider APIs
Architecture
BrowserKing works by intercepting the stock extension's Anthropic API calls and translating them to OpenAI-compatible chat/completions requests. This means:
- The full browser automation toolkit (screenshots, clicks, navigation) works with any provider
- Tool calls are translated between Anthropic and OpenAI formats in real-time
- SSE streaming is translated on the fly
- Vision payloads are automatically downgraded for text-only models
Key files:
| File | Purpose |
|---|---|
provider-registry.js |
Provider definitions, model lists, vision detection |
api-adapter.js |
API translation layer (Anthropic ↔ OpenAI) |
ui-branding.js |
Dynamic theme colors in the side panel |
brand-overlay.js |
Page glow border and stop button theming |
sidepanel-provider-menu.js |
Provider/model dropdown UI |
provider-settings.js |
Settings page for API keys and configuration |
Roadmap
- [ ] Persistent page glow — Fix the colored glow border so it reliably pulses for all providers during agent activity
- [ ] Conversation history — Save and resume past agent sessions
- [ ] Multi-tab workflows — Coordinate actions across multiple tabs in a single task
- [ ] Prompt templates — Pre-built task templates for common workflows (form filling, data extraction, price comparison)
- [ ] Better local model support — Improve tool call translation for models with non-standard function calling formats
- [ ] Export to Playwright/Puppeteer — Convert recorded workflows to automation scripts
- [ ] Provider cost tracking — Track token usage and estimated cost per conversation
- [ ] Firefox & Edge support — Port the extension to other Chromium and non-Chromium browsers
- [ ] Mobile support — Bring browser agent capabilities to mobile browsers via Kiwi Browser or Firefox Android extensions
Have a feature idea? Open an issue or submit a PR.
Acknowledgments
BrowserKing is built on top of Anthropic's Claude for Chrome extension. We extended it with multi-provider support, local model compatibility, and a provider-themed UI.
License
MIT
Contributing
PRs welcome. If you add a new provider, add it to the PROVIDERS object in provider-registry.js with:
id,label,colorbaseUrland model list- Vision support flags per model
Then add vision detection rules in inferVisionSupport() in the same file.