Browser Tools

The browser automation system exposes 30+ tools through @playwright/mcp. These tools cover navigation, element interaction, page inspection, input handling, tab management, and advanced operations.

Tool Reference

Each tool below is available to the AI model during browser automation sessions. The model selects which tools to invoke based on the task at hand and the current page state.

Navigation Tools

Tool	Description
navigate	Navigate to a URL. Waits for the page to load before returning.
navigate_back	Go back in the browser history. Equivalent to clicking the back button.
wait_for	Wait for a specific condition (element visible, text appears, network idle).
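
As a rough illustration of what a tool invocation looks like on the wire, the sketch below builds a Model Context Protocol `tools/call` request (JSON-RPC 2.0) for the navigate tool. The tool name is taken from the table above. The `url` argument key and the `tool_call` helper are assumptions made for this example, not confirmed parameter names.

```python
import json
from itertools import count

# Monotonic request ids; JSON-RPC only requires that each id be unique per session.
_request_ids = count(1)


def tool_call(name: str, arguments: dict) -> dict:
    """Build an MCP `tools/call` request envelope (hypothetical helper)."""
    return {
        "jsonrpc": "2.0",
        "id": next(_request_ids),
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments},
    }


# Assumption: the navigate tool accepts its target under a "url" key.
request = tool_call("navigate", {"url": "https://example.com"})
print(json.dumps(request, indent=2))
```

In practice the model emits the tool name and arguments, and the MCP client wraps them in an envelope like this one.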

Interaction Tools

Tool	Description
click	Click on an element identified by selector or accessibility reference.
hover	Hover over an element to trigger tooltips or dropdown menus.
drag	Drag an element from one position to another.
select_option	Select an option from a dropdown or select element.

Inspection Tools

Tool	Description
snapshot	Capture the page accessibility tree. This is how the AI "sees" page structure.
screenshot	Take a visual screenshot of the current viewport.
console_messages	Retrieve console log output from the browser developer tools.

Snapshot vs. Screenshot

The snapshot tool returns a text-based accessibility tree that the model can parse efficiently, while the screenshot tool captures a visual image of the current viewport. The model typically uses snapshots to understand page structure and screenshots for visual verification.
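
To make the "text-based" point concrete, here is a minimal sketch of looking up an element reference in a snapshot. The snapshot text, its line format (`- role "name" [ref=...]`), and the `find_ref` helper are illustrative assumptions. The real output may differ in detail.

```python
import re
from typing import Optional

# Hypothetical snapshot text; the exact format is an assumption for this sketch.
SNAPSHOT = """\
- heading "Sign in" [level=1] [ref=e2]
- textbox "Email" [ref=e3]
- button "Submit" [ref=e4]
"""

# One node per line: a role, a quoted accessible name, and a [ref=...] marker.
NODE = re.compile(r'-\s+(?P<role>\w+)\s+"(?P<name>[^"]*)".*\[ref=(?P<ref>\w+)\]')


def find_ref(snapshot: str, role: str, name: str) -> Optional[str]:
    """Return the ref of the first node matching role and name, else None."""
    for line in snapshot.splitlines():
        match = NODE.search(line)
        if match and match["role"] == role and match["name"] == name:
            return match["ref"]
    return None


print(find_ref(SNAPSHOT, "button", "Submit"))
```

A lookup like this is cheap on text, whereas the same question asked of a screenshot would require interpreting pixels.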

Input Tools

Tool	Description
type	Type text into the currently focused element.
press_key	Press a keyboard key or key combination (e.g., Enter, Ctrl+A).
file_upload	Upload a local file to a file input element.

Tab Management

Tool	Description
tabs	List all open browser tabs with their URLs and titles. Used for tab switching.

The model can open new tabs by navigating to URLs that trigger new windows, or by using JavaScript evaluation. Each tab is independently addressable and maintains its own navigation history. Parallel tab support means the model can have multiple pages open simultaneously.
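
The idea that each tab keeps its own navigation history can be pictured with a small model. This is only a conceptual sketch: the `Tab` class and its methods are hypothetical and do not describe how the browser or the tools are implemented.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Tab:
    """Toy model of one tab with an independent history (hypothetical)."""
    history: List[str] = field(default_factory=list)
    position: int = -1  # index of the current entry; -1 means nothing loaded

    @property
    def current(self) -> Optional[str]:
        return self.history[self.position] if self.position >= 0 else None

    def navigate(self, url: str) -> None:
        # A new navigation discards any forward entries, as a browser does.
        del self.history[self.position + 1:]
        self.history.append(url)
        self.position += 1

    def navigate_back(self) -> Optional[str]:
        # Going back never moves before the first entry.
        if self.position > 0:
            self.position -= 1
        return self.current


tabs = [Tab(), Tab()]
tabs[0].navigate("https://example.com/a")
tabs[0].navigate("https://example.com/b")
tabs[1].navigate("https://example.org/")
tabs[0].navigate_back()  # only the first tab's history moves
print([tab.current for tab in tabs])
```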

Advanced Tools

Tool	Description
evaluate	Execute arbitrary JavaScript in the page context. Returns the result.

JavaScript evaluation

The evaluate tool executes arbitrary JavaScript in the browser context. This is powerful but carries risk: the script can read page data, modify the DOM, or make network requests. For that reason, the tool always requires explicit approval in chat mode.
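
The approval rule can be sketched as a gate in front of tool dispatch. Everything here is hypothetical: `run_tool`, the `approve` and `dispatch` callbacks, and the `expression` argument key are invented for illustration. The sketch gates only the case stated above (evaluate in chat mode). Other tools and modes may have their own rules that this page does not describe.

```python
from typing import Any, Callable, Dict

Arguments = Dict[str, Any]


def run_tool(
    mode: str,
    tool: str,
    arguments: Arguments,
    dispatch: Callable[[str, Arguments], Any],
    approve: Callable[[str, Arguments], bool],
) -> Any:
    """Dispatch a tool call, asking for approval first where the docs require it."""
    # Grounded in the text: evaluate always needs explicit approval in chat mode.
    if mode == "chat" and tool == "evaluate":
        if not approve(tool, arguments):
            raise PermissionError(f"{tool} was not approved")
    return dispatch(tool, arguments)


def fake_dispatch(tool: str, arguments: Arguments) -> str:
    # Stand-in for the real tool backend.
    return f"ran {tool}"


# Assumption: the script is passed under an "expression" key.
script = {"expression": "document.title"}
print(run_tool("chat", "evaluate", script, fake_dispatch, lambda t, a: True))
```

Keeping the check in one place means a denied request never reaches the browser.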
