Browser Tools
The browser automation system exposes 30+ tools through @playwright/mcp. These tools cover navigation, element interaction, page inspection, input handling, tab management, and advanced operations.
Tool Reference
Each tool below is available to the AI model during browser automation sessions. The model selects which tools to invoke based on the task at hand and the current page state.
Navigation Tools
| Tool | Description |
|---|---|
| navigate | Navigate to a URL. Waits for the page to load before returning. |
| navigate_back | Go back in the browser history. Equivalent to clicking the back button. |
| wait_for | Wait for a specific condition (element visible, text appears, network idle). |
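As a rough illustration of how a navigation sequence might look on the wire, the sketch below builds MCP `tools/call` requests (JSON-RPC 2.0) for the three tools in the table above. The tool names are taken from this page; the argument names (`url`, `text`) and the `buildToolCall` helper are assumptions made for illustration, so check the schema the server actually advertises before relying on them.

```typescript
// Shape of an MCP `tools/call` request (JSON-RPC 2.0).
interface ToolCallRequest {
  jsonrpc: "2.0";
  id: number;
  method: "tools/call";
  params: { name: string; arguments: Record<string, unknown> };
}

let nextRequestId = 1;

// Hypothetical helper: wraps a tool name and its arguments in a request envelope.
function buildToolCall(
  toolName: string,
  args: Record<string, unknown>
): ToolCallRequest {
  return {
    jsonrpc: "2.0",
    id: nextRequestId++,
    method: "tools/call",
    params: { name: toolName, arguments: args },
  };
}

// Argument names below (`url`, `text`) are assumptions, not confirmed by this page.
const navigationSteps: ToolCallRequest[] = [
  buildToolCall("navigate", { url: "https://example.com" }),
  buildToolCall("wait_for", { text: "Example Domain" }),
  buildToolCall("navigate_back", {}),
];

for (const step of navigationSteps) {
  console.log(JSON.stringify(step));
}
```

In practice the model issues these calls itself; the sketch only shows the request shape a client would forward to the server.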
Interaction Tools
| Tool | Description |
|---|---|
| click | Click on an element identified by selector or accessibility reference. |
| hover | Hover over an element to trigger tooltips or dropdown menus. |
| drag | Drag an element from one position to another. |
| select_option | Select an option from a dropdown or select element. |
Inspection Tools
| Tool | Description |
|---|---|
| snapshot | Capture the page accessibility tree. This is how the AI "sees" page structure. |
| screenshot | Take a visual screenshot of the current viewport. |
| console_messages | Retrieve console log output from the browser developer tools. |
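To make the idea of a text-based accessibility tree concrete, here is a minimal sketch that pulls element references out of snapshot output. It assumes each node appears on its own line in the form `- role "name" [ref=id]`; that line format, the `SnapshotNode` type, and the `parseSnapshot` helper are illustrative assumptions rather than a documented contract.

```typescript
interface SnapshotNode {
  role: string;
  label: string;
  ref: string;
}

// Assumed line format: `- role "accessible name" [ref=id]`, optionally indented.
function parseSnapshot(snapshotText: string): SnapshotNode[] {
  const linePattern = /^\s*-\s+(\w+)\s+"([^"]*)"\s+\[ref=([^\]]+)\]/;
  const found: SnapshotNode[] = [];
  for (const line of snapshotText.split("\n")) {
    const match = linePattern.exec(line);
    if (match) {
      found.push({ role: match[1], label: match[2], ref: match[3] });
    }
  }
  return found;
}

// Hand-written sample, not real tool output.
const sampleSnapshot = [
  '- heading "Sign in" [ref=e1]',
  '- textbox "Email" [ref=e2]',
  '- button "Continue" [ref=e3]',
].join("\n");

const parsedNodes = parseSnapshot(sampleSnapshot);
const continueButton = parsedNodes.find(
  (node) => node.role === "button" && node.label === "Continue"
);
console.log(continueButton?.ref); // prints "e3"
```

A reference recovered this way is the kind of accessibility reference that an interaction tool such as click could then target.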
Snapshot vs. Screenshot
The snapshot tool returns a text-based accessibility tree that the model can parse efficiently. The screenshot tool captures a visual image. The model typically uses snapshots for understanding page structure and screenshots for visual verification.
Input Tools
| Tool | Description |
|---|---|
| type | Type text into the currently focused element. |
| press_key | Press a keyboard key or key combination (e.g., Enter, Ctrl+A). |
| file_upload | Upload a local file to a file input element. |
Tab Management
| Tool | Description |
|---|---|
| tabs | List all open browser tabs with their URLs and titles. Used for tab switching. |
The model can open new tabs by navigating to URLs that trigger new windows, or by using JavaScript evaluation. Each tab is independently addressable and maintains its own navigation history. Parallel tab support means the model can have multiple pages open simultaneously.
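The following sketch shows one way a client might pick a tab out of a tabs listing before switching to it. The `TabInfo` shape (index, URL, title) mirrors the fields mentioned above, but the exact result format of the tool is an assumption, and `findTabByUrlFragment` is a hypothetical helper.

```typescript
// Assumed shape of one entry in the tabs listing.
interface TabInfo {
  index: number;
  url: string;
  title: string;
}

// Hypothetical helper: returns the first tab whose URL contains the fragment.
function findTabByUrlFragment(
  tabList: TabInfo[],
  fragment: string
): TabInfo | undefined {
  return tabList.find((tab) => tab.url.includes(fragment));
}

// Hand-written sample data standing in for a tabs result.
const openTabs: TabInfo[] = [
  { index: 0, url: "https://example.com/cart", title: "Cart" },
  { index: 1, url: "https://docs.example.org/guide", title: "Guide" },
];

const docsTab = findTabByUrlFragment(openTabs, "docs.example.org");
console.log(docsTab?.index); // prints 1
```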
Advanced Tools
| Tool | Description |
|---|---|
| evaluate | Execute arbitrary JavaScript in the page context. Returns the result. |
JavaScript evaluation
The evaluate tool executes arbitrary JavaScript in the page context. This is powerful but carries risk: it can read page data, modify the DOM, or make network requests. This tool always requires explicit approval in chat mode.
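The approval rule can be pictured as a small gate in front of tool execution. The sketch below is only a model of that rule under stated assumptions: the function names, the `"chat"` mode string, and the set of gated tools are hypothetical, and it says nothing about how other tools or other modes are handled.

```typescript
// Hypothetical list of tools that always need sign-off in chat mode.
const ALWAYS_APPROVE_IN_CHAT = new Set<string>(["evaluate"]);

// True when the tool must be explicitly approved before it runs.
function alwaysRequiresApproval(tool: string, mode: string): boolean {
  return mode === "chat" && ALWAYS_APPROVE_IN_CHAT.has(tool);
}

// A gated tool may run only once the user has approved it.
function mayRun(tool: string, mode: string, userApproved: boolean): boolean {
  return !alwaysRequiresApproval(tool, mode) || userApproved;
}

console.log(mayRun("evaluate", "chat", false)); // prints false
console.log(mayRun("evaluate", "chat", true)); // prints true
```

Keeping the check in one function means the gate is applied the same way wherever evaluate is invoked.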