agent-browser

Use when troubleshooting UI issues, testing UI changes, writing bash-based e2e tests, or automating browser interactions. Provides headless browser CLI with accessibility snapshots and deterministic refs for AI-friendly element selection.

Updated 2/2/2026

SKILL.md

Agent Browser

Headless browser automation CLI optimized for AI agents.

Overview

Agent Browser provides deterministic browser control through accessibility snapshots and refs. Instead of fragile CSS selectors, agents take a snapshot to get refs (e.g., @e2) that point to exact elements. This snapshot-then-act workflow suits LLM-based automation because each ref identifies exactly one element from the most recent snapshot.

When to Use

Use for:

  • Troubleshooting UI issues (visual inspection, element state)
  • Testing UI changes with before/after comparisons
  • Writing bash-based end-to-end tests
  • Automating form submissions or multi-step workflows
  • Debugging visibility, focus, or interaction problems

Do NOT use for:

  • API testing (use curl or API tools)
  • Non-visual backend operations
  • Static file generation
  • Tasks that don't require browser rendering

Quick Reference

Task              Command
Open page         agent-browser open <url>
Get elements      agent-browser snapshot
Click element     agent-browser click @e2
Fill input        agent-browser fill @e3 "text"
Get text          agent-browser get text @e1
Screenshot        agent-browser screenshot [path]
Wait for element  agent-browser wait <selector>
Close browser     agent-browser close

Snapshot-First Pattern

The recommended interaction pattern follows a snapshot-first approach:

  1. Navigate to the target URL
  2. Snapshot to get accessibility tree with refs
  3. Identify target elements from ref output
  4. Interact using refs (not CSS selectors)
  5. Re-snapshot after DOM changes

This pattern ensures deterministic element selection and avoids fragile selectors.
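
The five steps above can be sketched as two small shell functions. This is a minimal illustration, not the tool's own script: the function names are hypothetical, and the refs passed in are placeholders that would come from reading the snapshot output (step 3). Only subcommands listed in this document are used.

```shell
# Steps 1-2: navigate, then list interactive elements with their refs.
open_and_snapshot() {
  local url="$1"
  agent-browser open "$url"
  agent-browser snapshot -i
}

# Steps 4-5: act on refs chosen from the snapshot, then re-snapshot,
# because earlier refs may be stale once the DOM has changed.
act_and_resnapshot() {
  local input_ref="$1" text="$2" button_ref="$3"
  agent-browser fill "$input_ref" "$text"
  agent-browser click "$button_ref"
  agent-browser snapshot -i
}
```

A run might look like `open_and_snapshot https://example.com/login`, followed by `act_and_resnapshot @e3 "user@example.com" @e2` once the refs have been read from the first snapshot.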

Selectors

Refs (Recommended)

Refs from snapshots provide deterministic element targeting:

agent-browser snapshot
# Output: - button "Submit" [ref=e2]
#         - textbox "Email" [ref=e3]

agent-browser click @e2       # Click button
agent-browser fill @e3 "x"    # Fill textbox

Why refs? Deterministic (exact element from snapshot), fast (no re-query), AI-friendly.
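
In a bash script, a ref can be looked up by role and name rather than hard-coded. The helper below is a hypothetical sketch that assumes snapshot lines are shaped exactly like the sample output above (`- button "Submit" [ref=e2]`); if the real output differs, the pattern would need adjusting.

```shell
# ref_for ROLE NAME: read snapshot text on stdin and print the first
# matching ref in @eN form. Prints nothing when there is no match.
ref_for() {
  local role="$1" name="$2"
  grep -F -- "- $role \"$name\"" \
    | sed -n 's/.*\[ref=\(e[0-9][0-9]*\)\].*/@\1/p' \
    | head -n 1
}
```

For example, `ref=$(agent-browser snapshot -i | ref_for button "Submit")` followed by `agent-browser click "$ref"` would click the Submit button without assuming its ref number.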

CSS, Text, XPath

agent-browser click "#submit"
agent-browser click "text=Submit"
agent-browser click "xpath=//button"

Semantic Locators

agent-browser find role button click --name "Submit"
agent-browser find label "Email" fill "test@test.com"
agent-browser find text "Sign In" click

Actions: click, fill, check, hover, text

Command Reference

Navigation

agent-browser open <url>          # Navigate (aliases: goto, navigate)
agent-browser back                # Go back
agent-browser forward             # Go forward
agent-browser reload              # Reload page

Interaction

agent-browser click <sel>         # Click
agent-browser dblclick <sel>      # Double-click
agent-browser fill <sel> <text>   # Clear and fill
agent-browser type <sel> <text>   # Type into element
agent-browser press <key>         # Press key (Enter, Tab, Control+a)
agent-browser hover <sel>         # Hover
agent-browser focus <sel>         # Focus
agent-browser select <sel> <val>  # Select dropdown
agent-browser check <sel>         # Check checkbox
agent-browser uncheck <sel>       # Uncheck
agent-browser scroll <dir> [px]   # Scroll (up/down/left/right)
agent-browser drag <src> <tgt>    # Drag and drop
agent-browser upload <sel> <files> # Upload files

Get Information

agent-browser get text <sel>      # Get text content
agent-browser get html <sel>      # Get innerHTML
agent-browser get value <sel>     # Get input value
agent-browser get attr <sel> <attr> # Get attribute
agent-browser get title           # Get page title
agent-browser get url             # Get current URL
agent-browser get count <sel>     # Count matching elements
agent-browser get box <sel>       # Get bounding box
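
For bash-based e2e tests, the get commands can feed simple assertions. The sketch below assumes `agent-browser get title` prints only the title on stdout; `assert_eq` and `check_title` are hypothetical helper names, not part of the CLI.

```shell
# assert_eq EXPECTED ACTUAL LABEL: print PASS, or print FAIL to stderr
# and return non-zero so the calling script can stop.
assert_eq() {
  if [ "$1" = "$2" ]; then
    echo "PASS: $3"
  else
    echo "FAIL: $3 (expected '$1', got '$2')" >&2
    return 1
  fi
}

# check_title EXPECTED: compare the current page title with EXPECTED.
check_title() {
  assert_eq "$1" "$(agent-browser get title)" "page title"
}
```

The same `assert_eq` helper could wrap `get text`, `get value`, or `get url` in the same way.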

State Checks

agent-browser is visible <sel>    # Check visibility
agent-browser is enabled <sel>    # Check enabled state
agent-browser is checked <sel>    # Check checkbox state

Wait

agent-browser wait <selector>     # Wait for element visible
agent-browser wait <ms>           # Wait milliseconds
agent-browser wait --text "text"  # Wait for text
agent-browser wait --url "**/path" # Wait for URL pattern
agent-browser wait --load networkidle # Wait for load state
agent-browser wait --fn "window.ready === true" # Wait for JS condition
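
As a sketch of condition-based waiting, the hypothetical function below navigates and then waits for two observable conditions instead of sleeping for a fixed time. The URL and expected text are supplied by the caller.

```shell
# open_ready URL TEXT: open a page, wait for the network to go idle,
# then wait until TEXT has actually been rendered.
open_ready() {
  local url="$1" expected_text="$2"
  agent-browser open "$url"
  agent-browser wait --load networkidle
  agent-browser wait --text "$expected_text"
}
```

Waiting on text after the load state helps with pages that render content after the network has already gone quiet.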

Snapshot Options

agent-browser snapshot            # Full accessibility tree
agent-browser snapshot -i         # Interactive elements only
agent-browser snapshot -c         # Compact (remove empty elements)
agent-browser snapshot -d 3       # Limit depth
agent-browser snapshot -s "#main" # Scope to selector
agent-browser snapshot --json     # Machine-readable output
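
To judge whether a snapshot is small enough to hand to an agent, the number of refs in its output can be counted. This hypothetical helper assumes refs appear as `[ref=eN]` in the text output, as in the sample shown under Refs.

```shell
# count_refs: read snapshot text on stdin and print how many refs it contains.
count_refs() {
  grep -o '\[ref=e[0-9][0-9]*\]' | wc -l | tr -d ' '
}
```

Comparing `agent-browser snapshot | count_refs` with `agent-browser snapshot -i -c | count_refs` gives a rough sense of how much the interactive and compact flags trim a given page.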

Screenshot & PDF

agent-browser screenshot [path]   # Screenshot (--full for full page)
agent-browser pdf <path>          # Save as PDF

Browser Settings

agent-browser set viewport <w> <h> # Set viewport size
agent-browser set device <name>   # Emulate device ("iPhone 14")
agent-browser set geo <lat> <lng> # Set geolocation
agent-browser set offline [on|off] # Toggle offline
agent-browser set headers <json>  # Extra HTTP headers
agent-browser set media [dark|light] # Emulate color scheme

Cookies & Storage

agent-browser cookies             # Get all cookies
agent-browser cookies set <n> <v> # Set cookie
agent-browser cookies clear       # Clear cookies
agent-browser storage local       # Get localStorage
agent-browser storage local set <k> <v> # Set value
agent-browser storage local clear # Clear all

Network

agent-browser network route <url> # Intercept requests
agent-browser network route <url> --abort # Block requests
agent-browser network route <url> --body <json> # Mock response
agent-browser network unroute [url] # Remove routes
agent-browser network requests    # View tracked requests
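
A mocked response can be combined with navigation to test a page against canned data. The sketch below is illustrative only: the function name is hypothetical, and the glob-style URL pattern in the usage note is an assumption modeled on the `**/path` form shown for `wait --url`.

```shell
# open_with_mock PATTERN JSON_BODY PAGE_URL: serve JSON_BODY for requests
# matching PATTERN, load the page, then remove that route again.
open_with_mock() {
  local pattern="$1" body="$2" page="$3"
  agent-browser network route "$pattern" --body "$body"
  agent-browser open "$page"
  agent-browser wait --load networkidle
  agent-browser network unroute "$pattern"
}
```

A call such as `open_with_mock "**/api/status" '{"ok":true}' https://example.com` would load the page with the status endpoint stubbed; quoting the pattern keeps the shell from expanding it.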

Tabs & Frames

agent-browser tab                 # List tabs
agent-browser tab new [url]       # New tab
agent-browser tab <n>             # Switch to tab
agent-browser tab close [n]       # Close tab
agent-browser frame <sel>         # Switch to iframe
agent-browser frame main          # Back to main frame

Debug

agent-browser trace start [path]  # Start recording
agent-browser trace stop [path]   # Stop and save
agent-browser console             # View console messages
agent-browser errors              # View page errors
agent-browser highlight <sel>     # Highlight element
agent-browser state save <path>   # Save auth state
agent-browser state load <path>   # Load auth state

Sessions

agent-browser --session <name> open <url> # Named session
agent-browser session list        # List active sessions
agent-browser session             # Show current session
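
Named sessions lend themselves to before/after comparisons. The sketch below assumes that `--session` can prefix any subcommand, which this document only shows for `open`; the function name, session names, and output paths are hypothetical.

```shell
# capture_in_session SESSION URL OUT: open URL in its own named session,
# wait for the page to settle, save a screenshot, and close that session.
capture_in_session() {
  local session="$1" url="$2" out="$3"
  agent-browser --session "$session" open "$url"
  agent-browser --session "$session" wait --load networkidle
  agent-browser --session "$session" screenshot "$out"
  agent-browser --session "$session" close
}
```

Calling it once with a `before` session and once with an `after` session yields two screenshots that can be compared side by side.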

Setup

agent-browser install             # Download Chromium
agent-browser install --with-deps # With system deps (Linux)

Common Mistakes

Mistake                              Solution
Using CSS selectors instead of refs  Take snapshot first, use @eN refs
Not waiting for page load            Add wait --load networkidle after navigation
Stale refs after page change         Take new snapshot after interactions that change DOM
Full snapshot on complex pages       Use snapshot -i -c for interactive/compact mode
Forgetting to close browser          Always close when done to free resources
Hard-coded waits                     Use wait <selector> or wait --text instead of wait <ms>
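
One way to avoid leaving a browser open is to route every test through a wrapper that closes it regardless of the outcome. This is a minimal sketch with a hypothetical function name; a top-level script could get a similar effect with `trap 'agent-browser close' EXIT`.

```shell
# run_with_cleanup CMD [ARGS...]: run CMD, always close the browser
# afterwards, and return CMD's original exit status.
run_with_cleanup() {
  local status=0
  "$@" || status=$?
  agent-browser close
  return "$status"
}
```

Here `run_with_cleanup my_test_fn` would run a test function (the name is a placeholder) and still close the browser if that function fails.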

Install

Requires askill CLI v1.0+

AI Quality Score

95/100 (analyzed 2/10/2026)

Excellent skill documentation providing a comprehensive CLI reference and a clear 'Snapshot-First' workflow for AI-driven browser automation. Includes detailed command lists and common pitfalls.


Metadata

License: unknown
Version: -
Updated: 2/2/2026
Publisher: dnlopes

Tags

api, github-actions, llm, security, testing