Browser-Use Skill
Build AI agents that autonomously browse the web using LLMs and the Chrome DevTools Protocol.
When to Use
- Building AI agents that need to interact with web pages
- Automating web tasks (form filling, data extraction, navigation)
- Creating autonomous web research agents
- Testing web applications with AI-driven exploration
- Scraping dynamic content that requires JavaScript execution
Core Concepts
Architecture
User Task → LLM (Claude/GPT) → Browser-Use Controller → Chrome CDP → Web Page
             ↑                                                          ↓
             └────────────── Page State (HTML/Screenshot) ──────────────┘
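The feedback loop in the diagram can be sketched in a few lines. This is a conceptual illustration only, not Browser-Use's internal implementation: `Action`, `agent_loop`, and the three callables (`observe`, `decide`, `act`) are hypothetical names standing in for the browser, the LLM, and the controller.

```python
import asyncio
from dataclasses import dataclass
from typing import Awaitable, Callable, Optional


@dataclass
class Action:
    """One step chosen by the model, e.g. click, type, or done."""
    name: str
    argument: Optional[str] = None


# Hypothetical callables standing in for the real components.
Observe = Callable[[], Awaitable[str]]            # returns the page state
Decide = Callable[[str, str], Awaitable[Action]]  # (task, page_state) -> next action
Act = Callable[[Action], Awaitable[None]]         # applies the action to the browser


async def agent_loop(task: str, observe: Observe, decide: Decide, act: Act,
                     max_steps: int = 10) -> Optional[str]:
    """Repeat observe -> decide -> act until the model reports it is done."""
    for _ in range(max_steps):
        state = await observe()
        action = await decide(task, state)
        if action.name == "done":
            return action.argument
        await act(action)
    return None  # step budget exhausted without a final answer
```

The `max_steps` bound is what keeps a confused model from looping forever; the page state is re-read on every iteration because each action may change the page.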
Installation
pip install browser-use
playwright install chromium
Basic Agent
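import asyncio
from browser_use import Agent
from langchain_anthropic import ChatAnthropic

async def main():
    agent = Agent(
        task="Find the latest Python release version on python.org",
        llm=ChatAnthropic(model="claude-sonnet-4-20250514"),
    )
    # agent.run() is a coroutine, so it must be awaited inside an async function
    result = await agent.run()
    print(result)

asyncio.run(main())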
Advanced Configuration
from browser_use import Agent, Browser, BrowserConfig

browser = Browser(
    config=BrowserConfig(
        headless=True,
        disable_security=False,
        # --no-sandbox is often needed in containers, but it weakens browser isolation
        extra_chromium_args=["--no-sandbox"],
    )
)

# llm is a chat model instance, e.g. the ChatAnthropic object from the Basic Agent example
agent = Agent(
    task="Navigate to example.com and extract all links",
    llm=llm,
    browser=browser,
    max_actions_per_step=5,
    use_vision=True,  # Use screenshots for better understanding
)
Key Patterns
1. Multi-Step Task Automation
agent = Agent(
    task="""
    1. Go to github.com
    2. Search for 'browser-use'
    3. Click on the first repository result
    4. Extract the star count and description
    """,
    llm=llm,
)
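When the steps come from code rather than a hand-written string, a small helper can build the numbered task text. This is a minimal sketch; `build_task` is a hypothetical helper, not part of Browser-Use.

```python
from typing import Sequence


def build_task(steps: Sequence[str]) -> str:
    """Join individual instructions into one numbered task description."""
    if not steps:
        raise ValueError("at least one step is required")
    return "\n".join(f"{i}. {step.strip()}" for i, step in enumerate(steps, start=1))
```

The returned string can be passed as the `task` argument in the example above.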
2. Data Extraction
agent = Agent(
    task="Go to news.ycombinator.com and extract the top 10 story titles with their URLs",
    llm=llm,
    max_actions_per_step=3,
)
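Free-form answers are awkward to consume downstream. One option is to ask for a JSON array of objects with `title` and `url` keys in the task text, then parse the agent's final answer. The sketch below assumes you already have that final answer as a plain string (how to obtain it depends on the installed Browser-Use version); `Story` and `parse_stories` are hypothetical names.

```python
import json
from typing import TypedDict


class Story(TypedDict):
    title: str
    url: str


def parse_stories(raw: str) -> list[Story]:
    """Extract a JSON array of {title, url} objects from the agent's final text."""
    # Models often wrap JSON in prose, so slice from the first '[' to the last ']'.
    start, end = raw.find("["), raw.rfind("]")
    if start == -1 or end < start:
        raise ValueError("no JSON array found in agent output")
    items = json.loads(raw[start:end + 1])
    stories: list[Story] = []
    for item in items:
        if not isinstance(item, dict) or "title" not in item or "url" not in item:
            raise ValueError(f"unexpected item: {item!r}")
        stories.append({"title": str(item["title"]), "url": str(item["url"])})
    return stories
```

The bracket slicing is deliberately simple and will fail if the surrounding prose itself contains square brackets; in that case `json.loads` raises and the run can be retried.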
3. Form Interaction
agent = Agent(
    task="""
    Go to the contact form at example.com/contact
    Fill in: Name='Test User', Email='test@example.com', Message='Hello'
    Submit the form
    """,
    llm=llm,
)
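The values above are harmless test data. For real form values such as logins, a safer pattern is to keep the values in environment variables and refer to them by placeholder in the task text. The sketch below is an assumption-laden illustration: `load_secrets`, `assert_no_leak`, and the placeholder naming are hypothetical, and how placeholders get substituted at fill time depends on the Browser-Use version you have installed.

```python
import os
from typing import Mapping


def load_secrets(names: Mapping[str, str]) -> dict[str, str]:
    """Map placeholder names to values read from environment variables.

    `names` maps a placeholder (used in the task text) to the environment
    variable that holds the real value. Raises if a variable is unset.
    """
    secrets: dict[str, str] = {}
    for placeholder, env_var in names.items():
        value = os.environ.get(env_var)
        if not value:
            raise KeyError(f"environment variable {env_var} is not set")
        secrets[placeholder] = value
    return secrets


def assert_no_leak(task: str, secrets: Mapping[str, str]) -> None:
    """Fail fast if a real secret value was pasted into the task text."""
    for placeholder, value in secrets.items():
        if value in task:
            raise ValueError(f"task text contains the value of {placeholder}")
```

Calling `assert_no_leak` before constructing the agent catches the common mistake of pasting a real value into the prompt, where it would be sent to the LLM provider.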
Best Practices
- Be specific in task descriptions: clear, step-by-step instructions yield better results
- Use vision mode for complex UIs: use_vision=True sends screenshots to the LLM
- Set reasonable action limits: max_actions_per_step caps the actions taken in a single step and helps contain runaway loops
- Handle authentication carefully: never hardcode credentials in task descriptions
- Use headless mode in production: set headless=True for server deployments
- Implement error handling: wrap agent runs in try/except for graceful failures
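The error-handling advice can be sketched as a wrapper that adds a timeout and a bounded number of retries. `run_safely` is a hypothetical helper and makes no assumption about Browser-Use beyond `agent.run()` being awaitable; it takes a factory so that each attempt starts from a fresh agent.

```python
import asyncio
import logging
from typing import Any, Awaitable, Callable, Optional

logger = logging.getLogger(__name__)


async def run_safely(make_run: Callable[[], Awaitable[Any]],
                     timeout_s: float = 120.0,
                     attempts: int = 2) -> Optional[Any]:
    """Run an agent with a timeout and a bounded number of retries.

    `make_run` is a zero-argument callable returning a fresh awaitable, for
    example `lambda: Agent(task=..., llm=llm).run()`. Returns None if every
    attempt fails.
    """
    for attempt in range(1, attempts + 1):
        try:
            return await asyncio.wait_for(make_run(), timeout=timeout_s)
        except asyncio.TimeoutError:
            logger.warning("attempt %d timed out after %.0fs", attempt, timeout_s)
        except Exception:
            # Broad on purpose: the exception types raised depend on the installed version.
            logger.exception("attempt %d failed", attempt)
    return None
```

Returning `None` instead of re-raising keeps batch jobs alive when one task fails; callers that need to fail loudly can check for `None` and raise themselves.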
Common Pitfalls
- CAPTCHAs: Browser-Use cannot solve CAPTCHAs; use authenticated sessions instead
- Rate limiting: Add delays between actions for sensitive websites
- Dynamic content: Set use_vision=True for SPAs with heavy JavaScript
- Memory usage: Close browser instances after use to prevent memory leaks
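Two of these pitfalls, pacing and cleanup, can be handled by one small runner. This is a sketch under stated assumptions: `run_paced` is a hypothetical helper, `close` stands for whatever coroutine shuts down your browser instance (for example a bound `browser.close` method, if your version exposes one), and the delay is applied between whole agent runs, since pacing of individual actions is controlled inside the library.

```python
import asyncio
from typing import Any, Awaitable, Callable, Iterable


async def run_paced(jobs: Iterable[Callable[[], Awaitable[Any]]],
                    close: Callable[[], Awaitable[None]],
                    delay_s: float = 2.0) -> list[Any]:
    """Run jobs one at a time with a pause between them, then always clean up."""
    results: list[Any] = []
    try:
        for index, job in enumerate(jobs):
            if index > 0:
                await asyncio.sleep(delay_s)  # pause between runs, not before the first
            results.append(await job())
    finally:
        await close()  # runs even if a job raises
    return results
```

The `try`/`finally` is the important part: the browser is closed whether the jobs succeed, fail, or are cancelled, which is what prevents orphaned Chromium processes from accumulating.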
