
Support Bundle

Capture a complete diagnostics artifact for support triage.

macOS
OUT=~/Desktop/opta-support-$(date +%Y%m%d-%H%M%S).txt
{
  echo "# Opta Support Bundle"
  echo "# Generated: $(date -u +%Y-%m-%dT%H:%M:%SZ)"
  echo
  opta doctor
  echo
  opta daemon status
  echo
  opta daemon logs --lines 200
  echo
  opta config list
} > "$OUT" && echo "Saved $OUT"

Introduction to Opta Local

Opta Local is a local-first AI operating stack for engineering teams that require private inference on Apple Silicon, with governed cloud fallback only when it is explicitly enabled.

Use the platform selector at the top of docs pages to switch between macOS and Windows views. Commands, file paths, and service instructions adapt to the selected operating system.

What is Opta Local?

Opta Local is a vertically integrated execution system that unifies a command-line interface, a local inference runtime, and operator surfaces under one coherent contract. It supports interactive chat, autonomous coding workflows, session governance, and runtime observability.

Unlike cloud-only tooling, Opta Local defaults to local execution. Prompts and session data remain on your infrastructure unless cloud providers are intentionally configured for fallback or burst capacity.
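
If you want to confirm that no cloud provider is currently wired in, the CLI's configuration listing (the same opta config list command used in the support bundle above) shows the active settings. The keys it prints are deployment-specific, so treat the comment below as guidance rather than documented output:

macOS
# List the active CLI configuration and look for any provider or
# fallback entries before sending sensitive prompts.
opta config list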

The Three Core Apps

Opta Local is packaged as three core apps. Internally, these apps are powered by layered services:

  • Opta CLI -- Terminal-first control surface for chat, task execution, sessions, permissions, and tool routing (internally powered by the local daemon). Runs on your workstation (MacBook or desktop).
  • Opta LMX + Dashboard -- Local inference engine (LMX) plus its dashboard for monitoring and control. Runs on a dedicated, high-memory Apple Silicon host.
  • Opta Code Desktop -- Native desktop execution surface for Opta workflows, backed by the daemon. Runs on macOS, Windows, and Linux workstations.
Product vs architecture
The product taxonomy has three core apps: Opta CLI, Opta LMX + Dashboard, and Opta Code Desktop. Surface websites (Home, Init, Help, Accounts) are entry points, not core apps.
How they connect
The CLI daemon runs on your development machine and proxies requests to the LMX inference server over your local network. The web dashboard connects directly to LMX for monitoring and chat. All communication stays on your LAN unless you explicitly configure a Cloudflare Tunnel for remote access.
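
A quick way to sanity-check this wiring from a workstation: confirm the daemon is up with opta daemon status (shown in the support bundle above), then probe the LMX host over the LAN. Since LMX is OpenAI-compatible, it likely also serves the standard /v1/models listing, though only /v1/chat/completions is confirmed on this page, so treat that path as an assumption and adjust the host and port to match your deployment:

macOS
# Confirm the local daemon is running on this workstation.
opta daemon status

# Probe the LMX inference host over the LAN. /v1/models is the
# standard OpenAI-compatible model listing (assumed, not confirmed
# by this page); substitute your own hostname and port.
curl -s http://lmx-host.local:1234/v1/models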

Who is it For?

Opta Local is optimized for operators who prioritize control, privacy, and predictable execution:

  • Developers with Apple Silicon hardware -- particularly high-memory hosts with 96GB+ unified memory, capable of running 70B+ parameter models locally.
  • Privacy-conscious engineers who want AI assistance without sending proprietary code or sensitive data to cloud providers.
  • Power users who want full control over model selection, inference parameters, and tool permissions.
  • Teams who want to share a local inference server across multiple workstations on a LAN.

Key Benefits

Privacy

Every prompt, response, and session stays on your local network. There is no telemetry, no cloud logging, and no data retention by third parties. Your code and conversations are yours alone.

Speed

Apple Silicon's unified memory architecture lets models load directly into GPU-accessible memory without PCIe bottlenecks. A dedicated Apple Silicon host with 192GB of unified memory can run 70B parameter models at 40+ tokens per second -- comparable to or faster than many cloud API endpoints.

Control

You choose which models to run, which tools to enable, and what permissions to grant. The CLI's permission system lets you approve or deny individual tool invocations. There are no opaque safety filters -- you set the guardrails.

No Recurring Costs

After the initial hardware investment, inference is free. No per-token pricing, no API rate limits, no monthly subscriptions. Run as many queries as your hardware can handle.

Architecture Overview

The following diagram shows how the three components connect:

Stack Architecture
opta chat / opta do / opta tui        CLI commands (your terminal)
        |
        v
opta daemon  127.0.0.1:9999            Background orchestration service
        |   HTTP v3 REST + WS streaming
        v
Opta LMX  lmx-host.local:1234          Apple Silicon inference server
        |   OpenAI-compatible /v1/chat/completions
        v
Opta Local Web  localhost:3004         Browser dashboard + chat UI

The CLI is your primary interface. When you run opta chat or opta do, the CLI connects to the daemon (starting it automatically if needed). The daemon manages sessions, enforces permissions, and proxies inference requests to LMX over your LAN. The web dashboard provides a visual interface for the same stack, connecting to LMX directly for monitoring and chat.
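
Because LMX speaks the OpenAI-compatible /v1/chat/completions protocol shown in the diagram, any OpenAI-style HTTP client can reach it directly, which is also how the dashboard's chat works. A minimal sketch; the model name is a placeholder for whatever your LMX host has loaded:

macOS
# Send a chat completion request straight to LMX over the LAN.
# "MODEL_NAME" is a placeholder; substitute the model loaded on
# your LMX host, and adjust the hostname and port as needed.
curl -s http://lmx-host.local:1234/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "MODEL_NAME",
    "messages": [
      {"role": "user", "content": "Say hello from Opta LMX."}
    ]
  }'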

Next Steps

Proceed to installation to provision the CLI runtime and verify host readiness before connecting to inference.