Guardrails

Guardrails are hard rules that govern what the AI agent can and cannot do. Critical rules cannot be overridden. Strict rules require explicit user confirmation to bypass.

Rule System

The guardrail system uses a three-tier hierarchy. This page details the rules in the two enforceable tiers, Critical and Strict, which prevent unsafe behavior.

Critical Rules

Critical rules are absolute hard stops. They cannot be overridden by user configuration, model behavior, or any other mechanism. A violation of a Critical rule causes the tool call to be rejected immediately.

C01: No Data Exfiltration
The AI agent must not transmit user data, files, or session content to any external server, API, or endpoint without explicit user approval. This includes HTTP requests, email sends, webhook calls, and any other form of data transmission. Every outbound data transfer requires a separate confirmation prompt showing exactly what data will be sent and where.

C01: No Data Exfiltration

This is the most fundamental security rule. The AI cannot send your code, documents, or any other data to an external server without you seeing and approving the specific data being sent. This prevents malicious prompt injection attacks that attempt to exfiltrate sensitive data through tool calls.
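A confirmation gate of this kind could be sketched as follows. This is a minimal illustration, not the tool's actual implementation; the names OutboundRequest and guarded_send are hypothetical.

```python
from dataclasses import dataclass

@dataclass
class OutboundRequest:
    destination: str   # where the data would go
    payload: bytes     # the exact bytes that would leave the machine

def guarded_send(req: OutboundRequest, ask_user) -> bool:
    """Transmit only if the user approves the exact data and destination."""
    preview = req.payload[:200].decode("utf-8", errors="replace")
    prompt = (f"Send {len(req.payload)} bytes to {req.destination}?\n"
              f"Data preview: {preview!r}")
    if not ask_user(prompt):
        return False  # C01: declined; nothing is transmitted
    # ... the actual network send would happen only past this point ...
    return True
```

The key property is that the approval prompt is built from the concrete payload and destination, so an injected instruction cannot smuggle data out without the user seeing it.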

C02: Destructive Commands Need Confirmation
Commands that delete files, drop databases, reset repositories, or perform other irreversible operations must receive explicit user confirmation before execution. The confirmation prompt must clearly describe the destructive action.

C02: Destructive Confirmation

Destructive commands include but are not limited to: rm -rf, git reset --hard, DROP TABLE, docker system prune. The daemon identifies destructive patterns in command arguments and escalates them to the user regardless of the current permission mode.
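A minimal sketch of this kind of pattern matching, assuming a regex-based rule list (the daemon's real detection is presumably richer than these four patterns):

```python
import re

# Illustrative destructive-command patterns; the real rule set is larger.
DESTRUCTIVE_PATTERNS = [
    r"\brm\s+(-[a-z]*r[a-z]*f|-[a-z]*f[a-z]*r)\b",  # rm -rf / rm -fr
    r"\bgit\s+reset\s+--hard\b",
    r"\bdrop\s+table\b",
    r"\bdocker\s+system\s+prune\b",
]

def is_destructive(command: str) -> bool:
    """Return True if the command matches a known destructive pattern."""
    return any(re.search(p, command, re.IGNORECASE)
               for p in DESTRUCTIVE_PATTERNS)
```

A match escalates the command to a user confirmation prompt regardless of the current permission mode.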

C03: No External Posts Without Approval
The AI agent must not create, modify, or delete content on external services (GitHub issues, Slack messages, social media posts, API calls) without explicit user approval. Each external mutation requires its own confirmation.

C03: No External Posts

This rule prevents the AI from taking actions on your behalf on external platforms. Even if the model has access to a GitHub MCP server, it cannot create a pull request, post a comment, or push code without your explicit approval for each action.

C08: Daemon Localhost Only
The daemon must bind exclusively to 127.0.0.1. It must not accept connections from external network interfaces under any circumstances. This rule is enforced at the network binding level and cannot be overridden by configuration.

C08: Daemon Localhost Only

The daemon listens on 127.0.0.1:9999 and will refuse to bind to 0.0.0.0 or any external interface. This prevents remote exploitation of the daemon API, which has broad system access through its tool execution capabilities.
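The binding check might look something like this sketch (the bind_daemon name and the PermissionError choice are assumptions for illustration, not the daemon's actual code):

```python
import socket

LOOPBACK = "127.0.0.1"

def bind_daemon(host: str, port: int) -> socket.socket:
    """Refuse any bind address other than the loopback interface (C08)."""
    if host != LOOPBACK:
        raise PermissionError(f"C08 violation: refusing to bind to {host}")
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    sock.bind((LOOPBACK, port))
    sock.listen()
    return sock
```

Because the check happens before the socket is ever created, no configuration path can produce an externally reachable listener.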

Strict Rules

Strict rules are enforced by default but can be overridden on a per-instance basis with explicit user confirmation. They represent important security practices that have legitimate exceptions.

S01: Permission Check

Every tool invocation must pass through the permission check pipeline. No tool call can bypass the permission system. This ensures that the permission table documented in the Permissions page is always consulted before any tool executes.
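A toy version of such a pipeline, assuming a table that maps tool names to allow/ask/deny decisions (the table contents here are invented; the real one is documented on the Permissions page):

```python
# Hypothetical permission table: tool name -> decision.
PERMISSION_TABLE = {
    "read_file": "allow",
    "run_command": "ask",
    "send_email": "deny",
}

def check_permission(tool: str) -> str:
    # Unknown tools default to "ask" so nothing bypasses the check.
    return PERMISSION_TABLE.get(tool, "ask")

def invoke_tool(tool, args, execute, ask_user):
    """Every invocation passes through check_permission first (S01)."""
    decision = check_permission(tool)
    if decision == "deny":
        raise PermissionError(f"S01: {tool} is denied")
    if decision == "ask" and not ask_user(tool, args):
        raise PermissionError(f"S01: user declined {tool}")
    return execute(tool, args)
```

Defaulting unknown tools to "ask" rather than "allow" is what makes the check a guarantee rather than a convention.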

S02: No Tokens in Logs

Authentication tokens, API keys, passwords, and other secrets must never appear in log output, terminal display, or session recordings. The daemon sanitizes all log messages to replace sensitive strings with redacted placeholders before they are written to any output.
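A simplified sanitizer along these lines, with illustrative patterns (production sanitizers typically also use entropy heuristics and known provider token prefixes):

```python
import re

# Illustrative secret shapes; the real pattern list is broader.
SECRET_PATTERNS = [
    re.compile(r"(?i)(api[_-]?key|token|password)\s*[=:]\s*\S+"),
    re.compile(r"\bsk-[A-Za-z0-9]{16,}\b"),  # example token prefix shape
]

def sanitize(message: str) -> str:
    """Replace anything that looks like a secret with a placeholder."""
    for pattern in SECRET_PATTERNS:
        message = pattern.sub("[REDACTED]", message)
    return message
```

Sanitization runs before the message reaches any sink, so logs, terminal output, and session recordings all see the redacted form.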

S04: No Cloud Without Opt-In

No component may transmit data to any cloud service without explicit user opt-in. The default configuration makes zero outbound connections. Cloud features (Anthropic API fallback, Cloudflare Tunnel, Supabase auth) must be explicitly configured by the user.

S06: Max Tool Calls

A single turn is limited to a maximum of 30 tool calls. This prevents runaway agent loops where the model repeatedly invokes tools in an infinite cycle. If the limit is reached, the turn is terminated and the user is notified.

Configurable limit
The 30-tool-call limit is the default. It can be adjusted via configuration, but increasing it requires explicit user action and awareness of the implications.
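One way to sketch the per-turn counter (the names Turn and ToolCallLimitExceeded are hypothetical):

```python
MAX_TOOL_CALLS = 30  # default per-turn limit (S06); adjustable via config

class ToolCallLimitExceeded(Exception):
    pass

class Turn:
    """Counts tool calls within one turn and stops runaway loops."""
    def __init__(self, limit: int = MAX_TOOL_CALLS):
        self.limit = limit
        self.calls = 0

    def call_tool(self, execute, *args):
        if self.calls >= self.limit:
            # Terminate the turn and surface the condition to the user.
            raise ToolCallLimitExceeded(f"S06: {self.limit} tool calls reached")
        self.calls += 1
        return execute(*args)
```

Because the counter lives in the turn object rather than the model's context, the model cannot talk its way past the limit.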