Chat
The Opta Local Web chat interface provides a full streaming conversation experience in your browser, backed by the same models running on your LMX inference server.
Chat Interface
The chat view presents a familiar conversational layout: your messages on one side, model responses on the other, rendered in a scrolling message list. Messages are composed in an input area at the bottom of the screen with keyboard shortcuts for submission.
The interface supports multi-turn conversations with full context retention. Each message is displayed with the model identity and timestamp, and the entire conversation history is maintained in the browser session.
Model Picker
Before starting a conversation (or at any point during one), you can select which loaded model to use via the model picker. The picker shows only models currently loaded in LMX memory -- it pulls the active model list from the same SSE stream that powers the dashboard.
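Consuming that stream amounts to parsing standard Server-Sent Events. The sketch below, in Python, shows the general shape; the `models` event name and `loaded` payload field are assumptions for illustration, not the documented wire format.

```python
import json

def parse_sse_events(raw: str):
    """Split a raw SSE stream into (event, data) pairs."""
    events = []
    event_name, data_lines = "message", []
    for line in raw.splitlines():
        if line.startswith("event:"):
            event_name = line[len("event:"):].strip()
        elif line.startswith("data:"):
            data_lines.append(line[len("data:"):].strip())
        elif line == "":  # a blank line terminates one event
            if data_lines:
                events.append((event_name, "\n".join(data_lines)))
            event_name, data_lines = "message", []
    return events

# Hypothetical "models" event as the dashboard stream might emit it
raw = (
    "event: models\n"
    'data: {"loaded": ["qwen3-72b", "llama3-8b"]}\n'
    "\n"
)
loaded = [
    json.loads(data)["loaded"]
    for name, data in parse_sse_events(raw)
    if name == "models"
][0]
print(loaded)  # ['qwen3-72b', 'llama3-8b']
```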
Selecting a different model mid-conversation switches inference to that model for subsequent messages. Previous messages in the conversation are preserved and re-sent as context to the new model.
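In other words, a mid-conversation switch just changes the `model` field on the next request while the `messages` array keeps growing. A minimal Python sketch (model names and the assistant reply are illustrative, not real output):

```python
# Conversation so far; the assistant turn is a placeholder, not real model output
history = [
    {"role": "user", "content": "Explain unified memory architecture"},
    {"role": "assistant", "content": "Unified memory lets CPU and GPU share one pool..."},
]

def next_request(history, new_model, user_message):
    """Build the next /v1/chat/completions payload: the full prior
    history is re-sent as context, routed to the newly selected model."""
    return {
        "model": new_model,
        "messages": history + [{"role": "user", "content": user_message}],
        "stream": True,
    }

req = next_request(history, "llama3-8b", "Summarize that in one sentence")
```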
Streaming Responses
Responses stream token by token as they are generated by LMX. The web client connects to the standard OpenAI-compatible streaming endpoint and renders tokens incrementally as they arrive.
POST /v1/chat/completions
Content-Type: application/json

{
  "model": "qwen3-72b",
  "messages": [
    { "role": "user", "content": "Explain unified memory architecture" }
  ],
  "stream": true
}

During streaming, a typing indicator and token counter are visible. The response text renders with progressive markdown formatting -- headings, code blocks, and lists appear as soon as enough tokens have been received to parse them.
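On the wire, OpenAI-compatible streaming delivers `data:` lines carrying JSON chunks whose `choices[0].delta.content` holds the next text fragment, terminated by `data: [DONE]`. A minimal Python consumer:

```python
import json

def extract_tokens(sse_lines):
    """Yield incremental text from OpenAI-style streaming chunks."""
    for line in sse_lines:
        if not line.startswith("data: "):
            continue
        payload = line[len("data: "):]
        if payload == "[DONE]":  # end-of-stream sentinel
            break
        delta = json.loads(payload)["choices"][0]["delta"]
        if "content" in delta:  # first chunk may carry only the role
            yield delta["content"]

chunks = [
    'data: {"choices": [{"delta": {"role": "assistant"}}]}',
    'data: {"choices": [{"delta": {"content": "Unified"}}]}',
    'data: {"choices": [{"delta": {"content": " memory"}}]}',
    "data: [DONE]",
]
print("".join(extract_tokens(chunks)))  # Unified memory
```

A renderer would feed each yielded fragment into the markdown parser as it arrives, which is what produces the progressive formatting described above.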
Tool Execution Feedback
When the model requests tool execution (file reads, web searches, code execution), the chat interface displays tool call cards inline with the conversation. Each card shows:
- Tool name -- which tool the model invoked
- Arguments -- the parameters passed to the tool
- Result -- the output returned to the model
- Duration -- how long the tool execution took
Tool cards are collapsible -- click to expand or collapse the details. This keeps the conversation readable while still providing full transparency into what tools the model used and what data they returned.
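The four fields above map naturally onto a small record with a collapse toggle. This Python sketch shows one plausible shape; the field and method names are assumptions, not the web client's actual implementation.

```python
from dataclasses import dataclass

@dataclass
class ToolCallCard:
    """Inline card for one tool invocation in the conversation."""
    tool_name: str      # which tool the model invoked
    arguments: dict     # parameters passed to the tool
    result: str         # output returned to the model
    duration_ms: float  # how long the execution took
    collapsed: bool = True  # cards start collapsed for readability

    def toggle(self) -> None:
        self.collapsed = not self.collapsed

    def summary(self) -> str:
        """One-line header shown while the card is collapsed."""
        return f"{self.tool_name} ({self.duration_ms:.0f} ms)"

card = ToolCallCard("web_search", {"query": "unified memory"}, "3 results", 412.0)
card.toggle()  # expand to show arguments and result
```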
Session Continuity
Sessions started in the Opta CLI can be continued in the web interface, and vice versa. The daemon maintains session state independently of the client, so you can:
- Start a coding conversation in your terminal with opta chat
- Switch to the web dashboard to continue the same session visually
- Return to the CLI and pick up where you left off
Session continuity works because both the CLI and web dashboard connect to the same daemon session store. The session ID is the key -- any client that knows the session ID can resume the conversation.
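Conceptually, the daemon's session store is a map from session ID to message history, and "resuming" is just a lookup. A minimal Python sketch (the class and method names are illustrative, not the daemon's API):

```python
class SessionStore:
    """Daemon-side store: any client that knows the session ID
    can append to or resume the same conversation."""

    def __init__(self):
        self._sessions: dict[str, list[dict]] = {}

    def append(self, session_id: str, message: dict) -> None:
        self._sessions.setdefault(session_id, []).append(message)

    def resume(self, session_id: str) -> list[dict]:
        """Return a copy of the history for this session ID."""
        return list(self._sessions.get(session_id, []))

store = SessionStore()
# From the CLI:
store.append("sess-42", {"role": "user", "content": "refactor this function"})
# Later, from the web dashboard, using the same session ID:
history = store.resume("sess-42")
```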
Markdown Rendering
Model responses are rendered with full markdown support, including:
- Headings (h1 through h6)
- Fenced code blocks with syntax highlighting
- Inline code spans
- Bold, italic, and strikethrough text
- Ordered and unordered lists
- Blockquotes
- Tables
- Links (opening in new tabs)
Code blocks include a copy button for extracting snippets. The markdown renderer matches the Opta design system -- dark backgrounds for code blocks, violet accent for links, and consistent typography throughout.