agent-browser - Headless Browser for AI Agents
When to use this skill
- Web automation and E2E testing
- Scraping data from modern web apps
- Deterministic element interaction using accessibility tree refs
- Isolated browser sessions for different agent tasks
1. Installation
npx skills add vercel-labs/agent-browser
# or
npm install -g agent-browser
agent-browser install
2. Core Workflow (Deterministic Interaction)
AI agents should use the snapshot + ref workflow for best results:
- Navigate: `agent-browser open <url>`
- Snapshot: `agent-browser snapshot -i` (returns a tree with refs like @e1, @e2)
- Interact: `agent-browser click @e1` or `agent-browser fill @e2 "text"`
- Repeat: Snapshot again if the page changes
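
The snapshot step above hands back text that the agent has to mine for refs. A minimal sketch of that step follows. `extract_refs` is a hypothetical helper, not part of agent-browser, and it assumes refs show up in the snapshot text either as `@e1` or as `ref=e1`; check your actual snapshot output and adjust the pattern if it differs.

```shell
# Hypothetical helper (not part of agent-browser): read snapshot text on stdin
# and print one ref per line in the @eN form that click/fill accept.
# ASSUMPTION: refs are printed as "@e1" or "ref=e1" in the snapshot output.
extract_refs() {
  # 1. find every ref token, 2. normalize to @eN,
  # 3. drop duplicates while keeping first-seen order
  grep -oE '(@|ref=)e[0-9]+' \
    | sed -E 's/^(@|ref=)/@/' \
    | awk '!seen[$0]++'
}

# Usage with the commands documented above (requires agent-browser installed):
#   agent-browser open example.com
#   refs=$(agent-browser snapshot -i | extract_refs)
#   agent-browser click "$(printf '%s\n' "$refs" | head -n 1)"
```

Because refs can change when the page changes, it is safer to rerun the snapshot and re-extract refs after each interaction than to reuse an old list.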
3. Key Commands
| Command | Description |
|---|---|
| `open <url>` | Navigate to a URL |
| `snapshot` | Get accessibility tree with refs |
| `click <sel>` | Click element (by ref or CSS) |
| `fill <sel> <text>` | Clear and fill input |
| `screenshot [path]` | Take page screenshot |
| `close` | Quit browser session |
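
One way to combine these commands is to capture a screenshot whenever a step fails. The sketch below is illustrative only: `run_step`, the `AGENT_BROWSER` override, and the `FAILURE_SHOT` default path are hypothetical names, and it assumes the CLI exits with a non-zero status when a command fails.

```shell
# Hypothetical wrapper (not part of agent-browser): run one command and, if it
# fails, take a screenshot for debugging and close the session.
# ASSUMPTIONS: AGENT_BROWSER and FAILURE_SHOT are illustrative variables, and
# the CLI signals failure through a non-zero exit status.
run_step() {
  local ab="${AGENT_BROWSER:-agent-browser}"
  local shot="${FAILURE_SHOT:-failure.png}"
  if ! "$ab" "$@"; then
    "$ab" screenshot "$shot"   # documented command: screenshot [path]
    "$ab" close                # documented command: close
    return 1
  fi
}

# Usage (requires agent-browser installed):
#   run_step open example.com && run_step click @e1
```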
4. Advanced Features
- Isolated Sessions: Use `--session <name>` to isolate cookies/storage.
- Persistent Profiles: Use `--profile <path>` to persist login sessions.
- Semantic Locators: `find role button click --name "Submit"`
- JavaScript Execution: `eval "window.scrollTo(0, 100)"`
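
To keep several agent tasks apart, each one can route its commands through its own named session. The wrapper below is a sketch under assumptions: `ab_session` and the `AGENT_BROWSER` override are hypothetical, and it assumes `--session <name>` is accepted before the subcommand, which you should verify against your installed version.

```shell
# Hypothetical wrapper (not part of agent-browser): prefix a command with
# --session <name> so each agent task gets its own cookies/storage.
# ASSUMPTION: the flag may be placed before the subcommand.
ab_session() {
  local name="$1"
  shift
  "${AGENT_BROWSER:-agent-browser}" --session "$name" "$@"
}

# Usage (requires agent-browser installed):
#   ab_session checkout open example.com
#   ab_session search   open example.org
```

Setting `AGENT_BROWSER=echo` gives a dry run that prints the arguments instead of launching a browser.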
Quick Reference
# Optimal AI Workflow
agent-browser open example.com
agent-browser snapshot -i --json
# (AI parses refs)
agent-browser click @e2
