Introduction

nexo-rs is a Rust framework for building multi-agent LLM systems that live on real messaging channels — WhatsApp, Telegram, email — instead of a chat webapp. Event-driven (NATS, or a built-in local broker — zero external infra), per-agent tool sandboxes, drop-in configuration for private vs. public agents.

Coming from OpenClaw? That's the closest reference point — TypeScript, Node, single-process. nexo-rs keeps the "agents on real channels" idea and trades JS familiarity for: a single static binary, a fault-tolerant broker layer (NATS or a local stdio bridge — no NATS server required), per-agent capability sandboxes, durable workflows, secrets audit, plugin SDKs in 4 languages, and Termux-first portability. See vs OpenClaw for the full side-by-side.

One process, many agents, many channels. Kate handles your personal Telegram; Ana works the WhatsApp sales line; a cron-style poller sweeps Gmail for leads — all sharing one broker, one tool registry, and one memory layer.

Boots with zero config. nexo runs against documented defaults when no config dir exists — 0 agents, local broker, admin RPCs + health endpoint live. Add a persona, a YAML, or nexo init's 19 commented sample files when you're ready. Switch broker mode at runtime with nexo set-broker {local,nats}.

Single binary, ~90 MB. No Node, no npm, no Docker required. The release tarball you download is ~15 MB (xz-compressed); the .deb is ~18 MB, the .rpm ~25 MB. Runs on a fresh VPS, on Termux without root, or as a systemd unit. Pre-built binaries ship for Linux (x86_64 / aarch64, static musl), macOS (Intel / Apple Silicon), and Windows; .deb / .rpm / Termux .deb too.

flowchart LR
    WA[WhatsApp] --> NATS[(NATS broker)]
    TG[Telegram] --> NATS
    MAIL[Email / Gmail poller] --> NATS
    BROWSER[Browser CDP] --> NATS
    NATS --> ANA[Agent: Ana]
    NATS --> KATE[Agent: Kate]
    NATS --> OPS[Agent: ops-bot]
    ANA --> TOOLS[Tools & extensions]
    KATE --> TOOLS
    OPS --> TOOLS
    TOOLS --> MEM[(Memory: SQLite + sqlite-vec)]
    TOOLS --> LLM{{LLM providers}}

Why it exists

Most "agent frameworks" assume one LLM talking to one user through one UI. Real deployments are not shaped that way:

Several agents with different personas, models, and skills
Multiple channels (WA + Telegram + mail) feeding the same agents
Business logic that is not LLM-driven (scheduled tasks, regex email triage, lead notifications) running next to the LLM loop
Private prompts and pricing tables alongside an open-source core

nexo-rs is opinionated toward that shape.

What's in the box

Area	What ships
Runtime	Multi-agent core, SessionManager, Heartbeat, CircuitBreaker. Boots zero-config; `nexo init` scaffolds 19 commented sample YAMLs
Broker	NATS (`async-nats`) + disk queue + DLQ + backpressure, or `local` — stdio-bridge for subprocess plugins, no external server. Flip at runtime: `nexo set-broker {local,nats}`
LLMs	MiniMax M2.5 (primary), Anthropic (OAuth + API), OpenAI-compat, Gemini, DeepSeek
Plugins	WhatsApp, Telegram, Email, Browser (CDP), Google (Gmail/Calendar/Drive/Sheets), Web Search (Brave/Tavily/DuckDuckGo/Perplexity). Install with `nexo plugin install <owner>/<repo>` (GitHub Releases tarball) or `cargo install nexo-plugin-{whatsapp,telegram,email,browser,google,web-search}` from crates.io
Memory	Short-term in-memory, long-term SQLite, vector via sqlite-vec
Extensions	TOML manifest, stdio + NATS runtimes, CLI, 20+ skills shipped
MCP	Client (stdio + HTTP), agent as MCP server, hot-reload
TaskFlow	Durable multi-step flow runtime with wait/resume
Soul	Identity, MEMORY.md, dreaming, workspace-git, transcripts
Personas	Out-of-tree agent definitions installed via `nexo persona install <owner>/<repo>` (v2 manifest, GitHub Releases). Cody is the reference pack.

Who it is for

Developers who want to run real agents — not a ChatGPT demo with retrieval.
Multi-tenant single-install — several agents, several channels, isolated by config.
Fault-tolerance-first teams — disk queue, DLQ, circuit breakers, single-instance lock, no message drop on reconnect.
Anyone extending with their own stack — stdio extensions in any language, MCP, drop-in private agents.

What it is not

Not a chatbot, not a webapp. It has no UI of its own.
Not a replacement for LangChain/LlamaIndex as a "primitives library". It is an operational runtime.
Not a channel-abstraction layer. WhatsApp behaves like WhatsApp, Telegram like Telegram. The runtime exposes each channel as it is; it does not flatten them into one uniform interface.

Three minutes to a running agent

# 1. Install nexo-rs — pre-built binary, no Rust toolchain needed:
curl -fsSL https://lordmacu.github.io/nexo-rs/install.sh | bash
#    (or `cargo install nexo-rs` from crates.io, or download the
#     .deb / .rpm / Termux .deb from GitHub Releases.)

# 2. Add a channel plugin (GitHub Releases tarball OR crates.io):
nexo plugin install lordmacu/nexo-plugin-whatsapp
#    Built-ins: nexo-plugin-{whatsapp,telegram,email,browser,google,web-search}.
#    crates.io path: `cargo install nexo-plugin-web-search` (etc.).

# 3. Install the Cody programmer-pair persona (or any other v2 pack):
nexo persona install lordmacu/nexo-persona-cody

# 4. Boot. The daemon picks up the persona + plugins automatically.
nexo

nexo (no subcommand) runs the daemon against documented defaults for every YAML when no config dir exists; nexo plugin install drops a channel plugin under <state_dir>/plugins/; nexo persona install lays down a ready-to-run agent + plugin bindings under <state_dir>/personas/. To tune from a documented baseline instead of the bare defaults, run nexo init first to scaffold 19 commented sample YAMLs. Other install channels (Docker, Nix, native packages, Termux): see the installation guide.

Build your own persona pack? See Installing personas for the v2 manifest shape + GitHub Releases wire convention.

Zero-config quickstart (30s)
Installation
Quick start (10min walkthrough)
Installing personas
Architecture overview
API reference (rustdoc) — every public type in the workspace

Why nexo-rs

If you've tried to build a real agent system, you know the gap.

Most "agent frameworks" assume one LLM talking to one user through one UI. Real deployments are not shaped that way.

You have several agents with different personas, models, and skills. Multiple channels (WhatsApp + Telegram + mail) feed the same agents. Business logic that is not LLM-driven (scheduled tasks, regex email triage, lead notifications) runs next to the LLM loop. Private prompts and pricing tables live alongside an open-source core. Customer support agents need a different tool sandbox than your billing bot.

nexo-rs is opinionated toward that shape.

What you get

One process, many agents, many channels. Kate handles your personal Telegram. Ana works the WhatsApp sales line. A cron-style poller sweeps Gmail for leads — all sharing one broker, one tool registry, and one memory layer.
Single binary, ~90 MB. No Node, no npm, no Docker required. The release tarball you actually download is ~15 MB (xz-compressed); the .deb is ~18 MB, the .rpm ~25 MB. Runs on a fresh VPS, on Termux without root, or as a systemd unit.
Production-grade by default. Event bus with disk fallback — NATS for multi-host, or a built-in local broker (stdio-bridge for subprocess plugins, no external server) for single-host; nexo set-broker flips between them. Per-agent capability sandboxes. Cosign-verified plugin marketplace. Multi-tenant SaaS-ready.
Zero-config boot. nexo runs against documented defaults with no config dir — admin RPCs + health live, 0 agents. Add a persona, a YAML, or nexo init's 19 commented samples when you're ready.

Three layers of extensibility

When you're ready to add functionality, pick the right layer:

Layer	Use it when	Ships as
Plugin	You want a new channel (Discord, Slack, custom protocol) or to expose new tools to agents.	Subprocess in your favorite language (Rust, Python, TypeScript, PHP), tarball published via GitHub Releases.
Extension	You're bundling tools, advisors, skills, MCP servers as a self-contained unit — typical for SaaS verticals (sales, support, marketing).	Local-path tarball; operator runs `nexo ext install`.
Microapp	You're building a SaaS product on top of nexo-rs. The framework runs in the background; your app owns the UI and the multi-tenant story.	Your own application, talking to nexo-rs over admin RPC (NATS).

What it's not

Not a chat webapp. There's no built-in UI; you bring your own channels (and your own UI for microapps).
Not a single-tenant prototype. Multi-tenant SaaS is a first-class shape, not an afterthought.
Not a research toy. Every release is signed; the install pipeline has cosign verification + sha256 checks; the broker has disk fallback; the test surface covers all four SDK languages.

How it compares

The closest reference point is OpenClaw (TypeScript, Node). If that's where you're coming from, here's the trade: you give up JS familiarity, you get —

A single static Rust binary (vs Node + node_modules); pre-built for Linux / macOS / Windows + .deb / .rpm / Termux
A fault-tolerant broker layer — NATS for multi-host or a local stdio bridge with no external server (vs in-memory only)
Zero-config boot + nexo init documented YAML scaffolds
Per-agent capability sandboxes
Durable workflows (TaskFlow) + secrets audit
Termux / mobile-first portability
Plugin SDKs in 4 languages — Rust, Python, TypeScript, PHP (vs TS only)

See vs OpenClaw for the full side-by-side.

Where to next

New here? → Quickstart: install + first agent in 5 minutes.
Want to write a plugin? → Plugin contract.
Building a SaaS? → Microapps · getting started.
Curious about internals? → Browse the Advanced section in the sidebar — architecture deep-dives, ADRs, design notes.

What you can build

A non-exhaustive gallery of products people are shipping (or could ship by next week) on top of nexo-rs. Each card links to the recipe / template that gets you 80 % of the way.

If you're scanning to decide whether nexo-rs fits your use case, read this page top-to-bottom. If something matches your shape, follow the link.

Channel agents

Installing a channel plugin — each card below needs one. They ship as GitHub Releases tarballs; install with nexo plugin install <owner>/<repo>:
nexo plugin install lordmacu/nexo-plugin-whatsapp
nexo plugin install lordmacu/nexo-plugin-telegram
nexo plugin install lordmacu/nexo-plugin-email
nexo plugin install lordmacu/nexo-plugin-browser
nexo plugin install lordmacu/nexo-plugin-google      # Gmail/Calendar/Drive/Sheets
nexo plugin install lordmacu/nexo-plugin-web-search  # Brave/Tavily/DuckDuckGo/Perplexity
nexo plugin list
All six also ship to crates.io: cargo install nexo-plugin-web-search (etc.) drops the binary in $HOME/.cargo/bin/ and the daemon's discovery walker picks it up automatically.

(Or build from source: cargo install --git https://github.com/lordmacu/nexo-plugin-whatsapp.) Then reference the channel in your agent YAML, as shown below.

WhatsApp sales agent — qualify leads + book demos

⏱ Build time · 1 afternoon · ⚙️ Layer · agent + WhatsApp plugin

Ana takes inbound WhatsApp messages, qualifies the prospect with a tool that calls your CRM, and books a calendar slot. Persona prompt + 2 tools (crm_lookup, calendar_book) + a YAML: that's it.

# config/agents.d/ana-sales.yaml
agents:
  - id: ana-sales
    persona_path: ./personas/ana-sales.md
    llm: minimax-m2.5
    channels:
      - whatsapp:sales-line
    tools: [crm_lookup, calendar_book, send_quote]
    memory: { long_term: true, vector: true }

→ WhatsApp sales agent recipe → WhatsApp plugin docs

Email triage agent — auto-reply + escalate

⏱ Build time · 1 day · ⚙️ Layer · agent + email plugin + skill bundle

Sweeps Gmail every 5 minutes, classifies inbound messages (invoice / support / spam / sales), auto-replies to the easy buckets, escalates the rest to a human via Telegram with a 1-paragraph summary.

agents:
  - id: triage-bot
    persona_path: ./personas/triage.md
    channels: [email:inbox, telegram:ops-team]
    skills: [classify-email, draft-reply, escalate-to-human]
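The classify-then-route step described above can be sketched in a few lines of Rust. This is only an illustration under stated assumptions: `Bucket`, `Action`, and `route` are hypothetical names, not nexo-rs APIs, and which buckets count as "easy" is left to the caller because the recipe does not say.

```rust
// Illustrative only: these types are hypothetical, not nexo-rs APIs.

/// The four buckets named in the recipe text.
#[allow(dead_code)]
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Bucket {
    Invoice,
    Support,
    Spam,
    Sales,
}

/// What happens to a message once it has been classified.
#[derive(Debug, PartialEq, Eq)]
enum Action {
    AutoReply,
    /// Carries the one-paragraph summary sent to the human.
    Escalate(String),
}

/// Buckets listed in `easy` get an automatic reply; everything else is escalated.
fn route(bucket: Bucket, easy: &[Bucket], summary: &str) -> Action {
    if easy.contains(&bucket) {
        Action::AutoReply
    } else {
        Action::Escalate(summary.to_string())
    }
}

fn main() {
    // Assumption: which buckets count as "easy" is the operator's choice.
    let easy = [Bucket::Invoice, Bucket::Spam];
    println!("{:?}", route(Bucket::Invoice, &easy, "invoice received"));
    println!("{:?}", route(Bucket::Support, &easy, "customer cannot log in"));
}
```

Keeping the "easy" set as data rather than hard-coding it means the same routing works whether an operator trusts the agent with one bucket or three.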

→ Email plugin docs → Skill catalog

Google Workspace agent — Gmail + Calendar + Drive

⏱ Build time · 1 afternoon · ⚙️ Layer · agent + Google plugin

OAuth-authenticated agent that can search Gmail, schedule calendar events, and pull docs from Drive — all through the generic google_call tool that wraps any *.googleapis.com endpoint. Token state lives in the agent's workspace; access tokens auto-refresh.

cargo install nexo-plugin-google
nexo                                        # daemon discovers + spawns
nexo-plugin-google --oauth-once <agent_id> \
    --client-id-file ./secrets/google_client_id.txt \
    --client-secret-file ./secrets/google_client_secret.txt \
    --token-file ./data/workspace/<agent_id>/google_tokens.json \
    --scopes gmail.readonly,calendar,drive.readonly \
    --workspace-dir ./data/workspace/<agent_id>

agents:
  - id: gws
    persona_path: ./personas/gws.md
    google_auth:
      client_id_file:     ./secrets/google_client_id.txt
      client_secret_file: ./secrets/google_client_secret.txt
      scopes: [gmail.readonly, calendar, drive.readonly]
    tools: [google_auth_status, google_call]

→ Google plugin docs → Source · github.com/lordmacu/nexo-rs-plugin-google

Customer support copilot — Telegram bot with KB + handoff

⏱ Build time · 2-3 days · ⚙️ Layer · agent + Telegram + vector memory + MCP

Telegram bot answers from your knowledge base (sqlite-vec). When the LLM's confidence drops, it hands off to a human and posts the transcript to your support channel.

agents:
  - id: support-copilot
    persona_path: ./personas/support.md
    channels: [telegram:support-bot]
    memory:
      vector: true
      vector_collections: [kb-faqs, kb-troubleshooting]
    tools: [escalate_to_human, search_kb]

→ Telegram plugin → Vector search

Multi-agent systems

Multi-agent CRM — intake, qualifier, closer

⏱ Build time · 3-5 days · ⚙️ Layer · 3 agents + agent-to-agent delegation

Three coordinated agents over NATS:

Intake picks up inbound on every channel, normalizes, hands off
Qualifier scores the lead (BANT or your framework), tags
Closer (only on hot leads) drafts proposal + books call

The three agents communicate over agent.route.<target_id> topics; each request carries a correlation_id so the sender can match the response to it.
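To make the correlation idea concrete, here is a minimal sketch of the bookkeeping a delegating agent might do. Only the `agent.route.<target_id>` topic shape and the `correlation_id` concept come from the text; `route_topic`, `Pending`, and the id format are hypothetical, and no real broker is involved.

```rust
use std::collections::HashMap;

/// Builds the delegation topic for a target agent.
fn route_topic(target_id: &str) -> String {
    format!("agent.route.{target_id}")
}

/// Tracks in-flight requests so replies can be paired with their origin.
/// Hypothetical helper, not part of nexo-rs.
#[derive(Default)]
struct Pending {
    // correlation_id -> description of the outstanding request
    waiting: HashMap<String, String>,
}

impl Pending {
    fn register(&mut self, correlation_id: &str, request: &str) {
        self.waiting
            .insert(correlation_id.to_string(), request.to_string());
    }

    /// Returns the original request if the reply's correlation_id is known,
    /// and forgets it so a duplicate reply is ignored.
    fn resolve(&mut self, correlation_id: &str) -> Option<String> {
        self.waiting.remove(correlation_id)
    }
}

fn main() {
    let mut pending = Pending::default();
    let topic = route_topic("qualifier");
    pending.register("req-1", "score lead 42");
    println!("publish to {topic}");
    println!("{:?}", pending.resolve("req-1"));
    println!("{:?}", pending.resolve("req-1")); // second reply finds nothing
}
```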

flowchart LR
    WA[WhatsApp] --> INTAKE[Intake]
    EMAIL[Email] --> INTAKE
    INTAKE -->|hot lead| QUAL[Qualifier]
    QUAL -->|score >= 70| CLOSER[Closer]
    CLOSER --> CALENDAR[Calendar tool]
    QUAL -->|score < 70| NURTURE[(Drip campaign queue)]

→ Agent-to-agent delegation → Multi-agent coordination

Internal ops bot — Slack via MCP + AWS tools + cron

⏱ Build time · 1-2 days · ⚙️ Layer · agent + MCP + cron skills

A bot in your team's Slack (via MCP server) that answers "what's broken in prod", schedules nightly DB snapshots, and posts the daily cost report at 9 AM.

agents:
  - id: ops-bot
    persona_path: ./personas/ops.md
    channels: [mcp:slack-team]
    tools: [aws_logs, aws_cost, db_snapshot]
    cron:
      - "0 9 * * *"  # daily cost report
      - "0 2 * * *"  # nightly DB snapshot
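For readers less familiar with the five-field cron syntax used above, the following sketch shows how such an expression can be checked against a point in time. It is a deliberately small matcher, not the scheduler nexo-rs uses: it understands only `*`, `*/n`, and plain numbers, and it requires every field to match (real cron treats day-of-month and day-of-week specially).

```rust
/// Matches one cron field against a value.
/// Supports `*`, `*/n`, and a plain number only.
fn field_matches(field: &str, value: u32) -> bool {
    if field == "*" {
        return true;
    }
    if let Some(step) = field.strip_prefix("*/") {
        return match step.parse::<u32>() {
            Ok(n) if n > 0 => value % n == 0,
            _ => false,
        };
    }
    field.parse::<u32>().map_or(false, |n| n == value)
}

/// `when` is [minute, hour, day_of_month, month, day_of_week].
fn cron_matches(expr: &str, when: [u32; 5]) -> bool {
    let fields: Vec<&str> = expr.split_whitespace().collect();
    fields.len() == 5
        && fields
            .iter()
            .zip(when.iter())
            .all(|(f, v)| field_matches(*f, *v))
}

fn main() {
    // Hypothetical instant: minute 0, hour 9, day 15, month 6, weekday 1.
    let nine_am = [0, 9, 15, 6, 1];
    println!("{}", cron_matches("0 9 * * *", nine_am)); // daily cost report
    println!("{}", cron_matches("0 2 * * *", nine_am)); // nightly DB snapshot
}
```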

→ MCP channels → Cron schedule tools

SaaS products

WhatsApp meta-creator SaaS — clients build their own agents

⏱ Build time · 4-8 weeks · ⚙️ Layer · microapp + multi-tenant + WhatsApp Web UI

A SaaS where end-users sign up and build their own WhatsApp agent through a WhatsApp-Web-style React UI. Each client gets isolated state, their own agents, their own knowledge base. The framework runs out of view; the microapp owns the UX.

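// Provision a tenant from the microapp backend.
// `admin`, `TenantSpec`, and `Quotas` are assumed to be in scope; the call is
// async and fallible, so it belongs inside an async fn that returns a Result.
admin.create_tenant(TenantSpec {
    id: "client-acme".into(),
    plan: "pro".into(),
    quotas: Quotas { agents: 10, llm_tokens_month: 5_000_000 },
}).await?;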
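A microapp backend will usually want to check a tenant's quotas before acting. The sketch below shows one way that check could look. The field names `agents` and `llm_tokens_month` mirror the snippet above; the local `Quotas` and `Usage` structs and both methods are hypothetical stand-ins, not the framework's types.

```rust
/// Local stand-in for the per-tenant limits (field names follow the snippet above).
struct Quotas {
    agents: u32,
    llm_tokens_month: u64,
}

/// Hypothetical record of what the tenant has consumed so far this month.
struct Usage {
    agents: u32,
    llm_tokens_month: u64,
}

impl Quotas {
    /// True while the tenant is still below its agent limit.
    fn allows_new_agent(&self, usage: &Usage) -> bool {
        usage.agents < self.agents
    }

    /// True if spending `requested` more tokens stays within the monthly budget.
    fn allows_tokens(&self, usage: &Usage, requested: u64) -> bool {
        usage.llm_tokens_month.saturating_add(requested) <= self.llm_tokens_month
    }
}

fn main() {
    let quotas = Quotas { agents: 10, llm_tokens_month: 5_000_000 };
    let usage = Usage { agents: 10, llm_tokens_month: 4_999_000 };
    println!("{}", quotas.allows_new_agent(&usage));
    println!("{}", quotas.allows_tokens(&usage, 1_000));
    println!("{}", quotas.allows_tokens(&usage, 1_001));
}
```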

→ Microapps · getting started → agent-creator reference → Multi-tenant SaaS

Vertical SaaS — sales / support / marketing extension pack

⏱ Build time · 2-3 weeks · ⚙️ Layer · extension + multi-tenant

Bundle your domain expertise as an extension: 5 tools + 3 advisors + 8 skills + an MCP server adapter. Operators run nexo ext install ./your-pack and your vertical lights up across all their tenants.

→ Extension manifest → Extension templates

Personas — pre-built agent packs

Persona packs bundle a ready-to-run agent (system prompt + plugin bindings + workspace seed + secrets templates) you install with one command. Distinct from plugins (plugins register CODE; personas register CONFIG). Authored against the v2 manifest schema, published as GitHub Releases, installed via nexo persona install.

# Browse + install:
nexo persona install lordmacu/nexo-persona-cody
nexo persona list

Available today:

Pack	Persona	Channels	Use case
`lordmacu/nexo-persona-cody`	Cody — programmer pair	Telegram, WhatsApp	Drives Claude Code goals from chat. Reads PHASES.md, dispatches one phase at a time, audits the diff before declaring done. Self-modify by default (with git-worktree isolation); production opts out via `NEXO_DISALLOW_SELF_MODIFY=1`.
`lordmacu/nexo-persona-ana-template`	Ana — sales / lead capture	WhatsApp	Hardened single-tool template for inbound WhatsApp lead capture. `allowed_tools` whitelist + `outbound_allowlist.whatsapp` defense-in-depth: a jailbroken prompt cannot exfiltrate leads to an attacker number. Operator customizes the advisor phone + sales script before going live.
`lordmacu/nexo-persona-marketing-multiclient-template`	Multi-client marketing	configurable	Three distinct agents (intake / retention / exec) on one daemon, each with its own LLM (MiniMax M2.5 / Claude Haiku 4.5 / DeepSeek v4 flash) + own proactive cadence + own daily turn budget. Demonstrates the multi-tenant single-install pattern.
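The defense-in-depth idea behind the Ana template (a tool whitelist plus an outbound allowlist) can be sketched as two independent checks that must both pass. The names `allowed_tools` and the WhatsApp outbound allowlist come from the table; the `Policy` struct, the tool name `notify_advisor`, and the phone numbers are hypothetical placeholders.

```rust
use std::collections::HashSet;

/// Hypothetical stand-in for the persona's policy section.
struct Policy {
    allowed_tools: HashSet<String>,
    outbound_whatsapp: HashSet<String>,
}

impl Policy {
    fn new(tools: &[&str], numbers: &[&str]) -> Self {
        Policy {
            allowed_tools: tools.iter().map(|t| t.to_string()).collect(),
            outbound_whatsapp: numbers.iter().map(|n| n.to_string()).collect(),
        }
    }

    /// A send is permitted only if BOTH layers agree: the tool is whitelisted
    /// and the recipient is on the outbound allowlist.
    fn may_send(&self, tool: &str, recipient: &str) -> bool {
        self.allowed_tools.contains(tool) && self.outbound_whatsapp.contains(recipient)
    }
}

fn main() {
    // Placeholder tool name and numbers, for illustration only.
    let policy = Policy::new(&["notify_advisor"], &["+15550100"]);
    println!("{}", policy.may_send("notify_advisor", "+15550100"));
    println!("{}", policy.may_send("notify_advisor", "+15550199"));
    println!("{}", policy.may_send("shell_exec", "+15550100"));
}
```

Because the recipient check does not depend on what the prompt says, a jailbroken model that picks an allowed tool still cannot address a number outside the list.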

More on the way. See the Cody README for the v2 manifest shape if you want to publish your own. For inner-loop dev, nexo persona run /path/to/local/pack boots the daemon against an unpackaged dir.

→ Installing personas (full guide)

Specialized agents

Browser scraping agent — URL → structured data

⏱ Build time · 1-2 days · ⚙️ Layer · agent + browser plugin

Receives URLs (via webhook / Telegram / API), uses the browser plugin (Chrome DevTools Protocol) to render JS-heavy pages, extracts structured data, publishes results back to a topic. Useful for price monitoring, competitive intel, lead enrichment.

agents:
  - id: scraper
    persona_path: ./personas/scraper.md
    channels: [webhook:scrape-requests]
    tools: [browser_navigate, browser_screenshot, browser_extract_text]

→ Browser plugin

Lead notification poller — RSS / API → Telegram alert

⏱ Build time · half a day · ⚙️ Layer · poller + Telegram

A cron-style poller hits an external RSS feed / API every N minutes, dedupes against state, and pings your sales team in Telegram when something matches. Pure config — no LLM call needed on the hot path.

# config/pollers.yaml
pollers:
  - id: linkedin-jobs
    cron: "*/15 * * * *"
    url: "https://linkedin.example/.../feed.atom"
    filter:
      keyword: ["VP Sales", "Head of Growth"]
    publish: plugin.inbound.telegram.sales-alerts
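The "dedupe against state, then filter" behavior of a poller can be sketched without any LLM involved. In this illustration `Item`, `Poller`, and `poll` are hypothetical names, the seen-set lives in memory rather than in durable state, and the keyword match is a plain substring test; the real poller module may differ on all three points.

```rust
use std::collections::HashSet;

/// One entry from the feed (hypothetical shape).
struct Item {
    id: String,
    title: String,
}

struct Poller {
    keywords: Vec<String>,
    seen: HashSet<String>,
}

impl Poller {
    fn new(keywords: &[&str]) -> Self {
        Poller {
            keywords: keywords.iter().map(|k| k.to_string()).collect(),
            seen: HashSet::new(),
        }
    }

    /// Returns titles worth alerting on: unseen ids whose title contains a keyword.
    fn poll(&mut self, feed: Vec<Item>) -> Vec<String> {
        let mut alerts = Vec::new();
        for Item { id, title } in feed {
            // `insert` returns false when the id was already recorded.
            if !self.seen.insert(id) {
                continue;
            }
            if self.keywords.iter().any(|k| title.contains(k.as_str())) {
                alerts.push(title);
            }
        }
        alerts
    }
}

fn item(id: &str, title: &str) -> Item {
    Item { id: id.to_string(), title: title.to_string() }
}

fn main() {
    let mut poller = Poller::new(&["VP Sales", "Head of Growth"]);
    let first = poller.poll(vec![item("1", "Acme hires VP Sales"), item("2", "Office party")]);
    let second = poller.poll(vec![item("1", "Acme hires VP Sales")]);
    println!("{:?} {:?}", first, second);
}
```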

→ Build a poller module → Pollers config

MCP server from Claude Desktop — expose your tools to Claude

⏱ Build time · 1 hour · ⚙️ Layer · agent as MCP server

Run nexo-rs as an MCP server. Claude Desktop (or any MCP client) sees every tool / agent / skill you've configured as if they were native. Build internal Claude integrations without writing TS.

→ MCP server from Claude Desktop → Agent as MCP server

Where to next

Pick the closest match → follow its link → adapt to your data.
Read the Quickstart first if you don't already have a binary running.
Browse the Recipes section in the sidebar for end-to-end deploy walkthroughs.
If you're building a SaaS, jump straight to Microapps · getting started.

Install nexo-rs (LLM-friendly guide)

Share this URL with any AI assistant to teach it how to install nexo-rs end-to-end on Linux or Termux: https://lordmacu.github.io/nexo-rs/install-for-ai.html

The page is intentionally linear: copy-paste each block in order. There are no menus to navigate and no marketing, and every command is deterministic.

What you are installing

nexo-rs: a Rust multi-agent framework. Single agent binary + a cloudflared tunnel for the web admin UI.
Source: https://github.com/lordmacu/nexo-rs
Docs (full): https://lordmacu.github.io/nexo-rs/
License: dual MIT / Apache-2.0

Pick your platform

Linux (Debian / Ubuntu / Arch / Fedora) → §A
Termux (Android, no root) → §B

Skip the section that doesn't apply.

§A — Linux install

A.1. System packages

Debian / Ubuntu:

sudo apt update
sudo apt install -y build-essential pkg-config libsqlite3-dev git curl

Arch:

sudo pacman -Syu --needed base-devel pkgconf sqlite git curl

Fedora:

sudo dnf install -y @development-tools pkgconf-pkg-config sqlite-devel git curl

A.2. Rust toolchain

curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh -s -- -y
source "$HOME/.cargo/env"
rustup component add rustfmt clippy

A.3. Clone + build

git clone https://github.com/lordmacu/nexo-rs
cd nexo-rs
cargo build --release --bin agent

The compiled binary is at ./target/release/agent. Copy it into PATH (optional):

sudo install -m 0755 target/release/agent /usr/local/bin/agent

A.4. First-run wizard

agent setup

Follow the interactive prompts. Defaults are sane. The wizard writes config/agents.d/<your-agent>.yaml, IDENTITY.md, SOUL.md, and any channel YAMLs you opt into.

A.5. Run

agent

Or, for the web admin (loopback HTTP + Cloudflare tunnel):

agent admin

The admin command prints a one-time URL + password to stdout. Open the URL, log in, and configure from the browser.

A.6. (Optional) systemd service

sudo useradd -r -s /bin/false -d /srv/nexo-rs nexo
sudo mkdir -p /srv/nexo-rs
sudo cp -r config target/release/agent /srv/nexo-rs/
sudo chown -R nexo:nexo /srv/nexo-rs

sudo tee /etc/systemd/system/nexo-rs.service > /dev/null <<'EOF'
[Unit]
Description=nexo-rs agent
After=network.target

[Service]
Type=simple
User=nexo
WorkingDirectory=/srv/nexo-rs
ExecStart=/srv/nexo-rs/agent --config /srv/nexo-rs/config
Restart=on-failure
RestartSec=5

[Install]
WantedBy=multi-user.target
EOF

sudo systemctl daemon-reload
sudo systemctl enable --now nexo-rs

Logs: journalctl -u nexo-rs -f.

A.7. (Optional) NATS broker

A single-process install does not need NATS — the runtime falls back to in-process channels. Add NATS only when scaling beyond one host:

curl -L -o /tmp/nats.tar.gz \
  https://github.com/nats-io/nats-server/releases/download/v2.10.20/nats-server-v2.10.20-linux-amd64.tar.gz
tar -xzf /tmp/nats.tar.gz -C /tmp
sudo mv /tmp/nats-server-*/nats-server /usr/local/bin/
sudo systemctl enable --now nats-server  # if you have a unit file

Then in config/broker.yaml set kind: nats and url: nats://127.0.0.1:4222.

§B — Termux install (Android, no root)

B.1. Termux from F-Droid

Install Termux from https://f-droid.org/en/packages/com.termux/. Do not install from the Google Play Store — that build is outdated.

Open Termux. Then:

pkg update
pkg upgrade -y

B.2. Build dependencies

pkg install -y rust git curl sqlite openssl clang pkg-config

Optional extras (only the ones you'll use):

# media transcoding + OCR + youtube downloads
pkg install -y ffmpeg tesseract yt-dlp
# tmux for long-running tunnels and ssh
pkg install -y tmux openssh
# headless Chromium for the browser plugin
pkg install -y tur-repo
pkg install -y chromium
# Termux:API for sensors / SMS / clipboard
pkg install -y termux-api
# (also install the Termux:API companion app from F-Droid)

B.3. Clone + build

cd ~
git clone https://github.com/lordmacu/nexo-rs
cd nexo-rs
cargo build --release --bin agent

B.4. First-run wizard

./target/release/agent setup

B.5. Run

./target/release/agent

Or with the admin UI (the cloudflared tunnel works on Termux):

./target/release/agent admin

B.6. Keep running with the screen off

Android's Doze mode kills Termux processes unless you disable battery optimizations and acquire a wake lock:

Disable optimizations: Android Settings → Apps → Termux → Battery → Unrestricted.
Wake lock: in Termux, type:
```
termux-wake-lock
```

(Optional) auto-restart on boot: install Termux:Boot from F-Droid, then create ~/.termux/boot/00-nexo-rs:

mkdir -p ~/.termux/boot
cat > ~/.termux/boot/00-nexo-rs <<'EOF'
#!/data/data/com.termux/files/usr/bin/sh
termux-wake-lock
cd ~/nexo-rs
./target/release/agent --config ./config >> ~/nexo-rs/agent.log 2>&1
EOF
chmod +x ~/.termux/boot/00-nexo-rs

B.7. Termux-specific tip — Chromium flags

The browser plugin (plugins: [browser]) needs Termux-specific Chromium launch flags. The defaults already cover Android, so there is nothing extra to set; just make sure chromium is on PATH (it is, after pkg install chromium).

Config layout (both platforms)

After agent setup runs, the project tree looks like:

nexo-rs/
├── config/
│   ├── agents.yaml          # opt-in dev defaults
│   ├── agents.d/            # your agents land here
│   │   └── <slug>.yaml
│   ├── broker.yaml          # NATS or local
│   ├── llm.yaml             # provider keys + model
│   └── plugins/             # one YAML per channel plugin
├── secrets/                 # mode 0600 token files (gitignored)
├── data/                    # SQLite databases (memory, taskflow, transcripts)
├── target/release/agent     # the built binary
└── agent.log                # if you redirected stdout

Edit YAML by hand or use the web admin (agent admin).

Troubleshooting

cargo build fails with linker errors on Linux — install build-essential and pkg-config (§A.1).
cargo build hits out of memory on Termux — close other apps, or build with one job: cargo build --release -j 1.
agent exits immediately with failed to load config — run agent setup first; the wizard creates the missing files.
WhatsApp QR pairing fails on Termux — make sure the device is on the same network as your phone, then open the QR pairing URL the daemon prints.
Admin tunnel URL doesn't respond — Cloudflare's quick tunnel occasionally rotates; restart agent admin and copy the new URL.

Useful commands after install

agent --help                                  # all subcommands
agent doctor capabilities --json              # which env toggles are armed
agent setup doctor                            # audit configured secrets
agent ext doctor --json                       # extension health
agent flow list                               # taskflow admin
agent dlq list                                # dead-letter queue

Full reference: https://lordmacu.github.io/nexo-rs/cli/reference.html

When asking an AI for help

Paste this URL into your prompt:

Install nexo-rs from https://lordmacu.github.io/nexo-rs/install-for-ai.html
on this machine. The OS is <Linux distro / Termux>. Stop after each
section to confirm output looks right.

The page above is the canonical, copy-paste-friendly install path. The full mdBook (https://lordmacu.github.io/nexo-rs/) covers the same ground in more depth — link there once the agent is up.

Installation

Pick the channel that matches your environment. Every channel produces the same nexo binary; the differences are in how it gets onto your machine and which dependencies come bundled.

Channel matrix

Channel	When to pick it	Time to first run	Needs Rust?
Pre-built binary — Linux/macOS (`install.sh`)	Just want `nexo` on PATH, fast	~10 s	No
Pre-built binary — Windows (`install.ps1`)	Native Windows, PowerShell	~10 s	No
crates.io (`cargo install nexo-rs`)	You already have a Rust toolchain	~3-5 min build	Yes
Debian / Ubuntu (`.deb`)	systemd host, `apt` integration	~10 s	No
Fedora / RHEL / Rocky (`.rpm`)	systemd host, `dnf` integration	~10 s	No
Termux (`.deb`, aarch64)	Phone-hosted personal agent	~10 s	No
Docker (GHCR)	Production, CI, "just works" + bundled Chrome/ffmpeg/…	~30 s	No
Nix flake	NixOS, reproducible dev shell	~3-5 min cold	(Nix)
Native (no Docker), from source	Track `main`, full control	~10-15 min	Yes

Quickest path — pre-built binary

Linux / macOS (also Windows from Git Bash):

curl -fsSL https://lordmacu.github.io/nexo-rs/install.sh | bash

Windows (PowerShell):

irm https://lordmacu.github.io/nexo-rs/install.ps1 | iex

Detects your OS + arch (Linux x86_64 / aarch64 static-musl, macOS Intel / Apple Silicon, Windows x86_64-MSVC), downloads the matching release artifact from GitHub Releases, verifies its sha256, drops nexo (nexo.exe on Windows) onto your PATH, then installs the bundled channel plugins + a persona. Falls back to cargo install nexo-rs → cargo install --git if there's no pre-built binary for your platform. Every release artifact is cosign-signed.

Then:

nexo                 # boots the daemon — zero config required

Windows users can also download the .zip from Releases, or run install.sh under WSL (where it detects Linux).
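The platform detection the installer performs can be pictured as a lookup from OS and architecture to a build target. The sketch below re-expresses that idea in Rust; it is not the installer's code (the installer is a shell / PowerShell script), and using Rust target triples as artifact labels is an assumption based on the platform list above.

```rust
/// Maps an OS/arch pair to the Rust target triple for the matching pre-built
/// binary. Returns None when no pre-built binary is listed for the platform.
fn target_for(os: &str, arch: &str) -> Option<&'static str> {
    match (os, arch) {
        ("linux", "x86_64") => Some("x86_64-unknown-linux-musl"),
        ("linux", "aarch64") => Some("aarch64-unknown-linux-musl"),
        ("macos", "x86_64") => Some("x86_64-apple-darwin"),
        ("macos", "aarch64") => Some("aarch64-apple-darwin"),
        ("windows", "x86_64") => Some("x86_64-pc-windows-msvc"),
        _ => None,
    }
}

fn main() {
    // std reports the platform this program was compiled for.
    use std::env::consts::{ARCH, OS};
    match target_for(OS, ARCH) {
        Some(t) => println!("would download the {t} artifact"),
        None => println!("no pre-built binary; fall back to cargo install nexo-rs"),
    }
}
```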

From crates.io

If you already have a Rust toolchain:

cargo install nexo-rs

Builds + installs the nexo binary into $CARGO_HOME/bin. The whole nexo-* workspace ships to crates.io, so this resolves cleanly without a git checkout.

From source

For contributors and operators who want to track main directly:

git clone https://github.com/lordmacu/nexo-rs
cd nexo-rs
cargo build --release --bin nexo
./target/release/nexo --help

The workspace compiles ~45 crates and produces the nexo binary plus two example smoke-test bins (integration-browser-check, llm_smoke). Toolchain is pinned to Rust 1.80 (MSRV) via rust-toolchain.toml — no manual channel selection needed. For faster iterative builds use cargo build --profile release-fast (same opt-level, no LTO, ~50 % quicker).

Prerequisites

Rust 1.80+ (rustup recommended)
NATS is optional — the daemon defaults to broker.type: local (an in-process stdio bridge, no external server). Run a NATS server only for multi-host clusters:
```
docker run -p 4222:4222 nats:2.10-alpine
nexo set-broker nats --url nats://localhost:4222
```
See broker shapes / broker.yaml.
Git (the memory subsystem uses per-agent workspace-git)
Chrome / Chromium (only if you plan to use the browser plugin)

Verification

./target/release/nexo --version
cargo test --workspace --lib

nexo --version prints the build provenance line (commit + build timestamp) so a bug report carries enough context to reproduce.

Bootstrap script

For native or Termux installs, ./scripts/bootstrap.sh automates the whole process — installs the system deps, downloads NATS if not present, scaffolds config/, and runs the setup wizard.

./scripts/bootstrap.sh           # interactive
./scripts/bootstrap.sh --yes     # accept all defaults

The script auto-detects Termux ($PREFIX set) and switches to pkg install + broker.type: local so you don't need root or NATS on a phone.

Next steps

Quick start — first agent running in five minutes
Setup wizard — pair channels and wire secrets
Docker — compose stack, secrets, GHCR pulls
Nix flake — nix run, dev shell
Native install — detailed no-Docker setup
Termux install — phone-hosted personal agent

Zero-config quickstart

The fastest path from an installed nexo binary to a running daemon — no YAML editing required, no NATS server, no API key needed to boot. (Install: curl … install.sh | bash, cargo install nexo-rs, or a .deb/.rpm — see Installation.) Once running, configure incrementally via nexo init (scaffold sample YAMLs), nexo set-broker (switch broker mode), or the operator UI (admin RPCs).

Total wall-clock time: 30 seconds from installed binary to live daemon serving health + admin RPCs.

This page reflects the post-Phase-92/93/94/95 ergonomics. For the classical full-control walkthrough (manually edit each YAML, pair channels, talk to a Telegram bot end-to-end), see Quickstart.

1. Run the daemon

nexo

That's it. The daemon discovers its config dir per Phase 92.9 precedence, falls back to baked-in defaults for any YAML that's missing (Phase 93), and starts serving on the default health + admin endpoints.

Expected log:

WARN  nexo_config: config dir not found — booting with
                   Default::default() for every YAML
                   (0 agents, BrokerKind::Local, 0 llm
                    providers, sqlite memory at default path)
INFO  nexo: broker ready kind=Local url=
INFO  nexo: long-term memory ready path=./data/memory.db
INFO  plugins.discovery: plugin registry wire complete loaded=0
INFO  nexo: pairing initialised
INFO  nexo: agent ready — waiting for shutdown signal

What's happening:

0 agents — daemon waits for nexo/admin/agents/upsert via the operator UI.
BrokerKind::Local — in-process tokio::mpsc. No NATS server needed. Subprocess plugins bridge through stdio (Phase 92).
0 LLM providers — any tool call to an LLM fails loud with provider_not_found; the daemon stays up.
SQLite memory at ./data/memory.db — auto-created.

The daemon is now ready to accept admin RPCs to populate state. The next sections walk through adding config either via the operator UI (recommended) or by hand-editing YAMLs.

2. Operator UI (recommended)

If you have the agent-creator-microapp extension installed, it exposes a web UI for managing agents, LLM providers, channels, and credentials. Every UI action is an admin RPC behind the scenes:

UI route	Admin RPC
New agent	`nexo/admin/agents/upsert`
Add LLM provider + key	`nexo/admin/llm_providers/upsert`
Pair WhatsApp	`nexo/admin/pairing/start`
Register credentials	`nexo/admin/credentials/register`
Set marketing rules	`nexo/admin/marketing/rules/upsert`

Each RPC writes to the corresponding YAML on disk and notifies the running daemon via config_watch, so changes take effect without restart for hot-reloadable subsystems.

If you don't yet have a microapp installed, see agent-creator-microapp or skip ahead to YAML scaffolding.

3. Scaffold sample YAMLs

When you want to configure something specific and don't want to read the source for field shapes, ask the daemon to write heavily-commented templates:

nexo init                        # writes 19 sample YAMLs to ~/.config/nexo/
nexo init --yaml broker          # only broker.yaml
nexo init --yaml broker,llm      # comma-separated names
nexo init --yaml plugins         # all plugins/*.yaml templates
nexo init --output /etc/nexo     # custom target dir
nexo init --force                # overwrite existing files
nexo init --yaml llm --stdout    # emit to stdout (for piping)

Templates cover the four required configs (agents, broker, llm, memory) plus every optional subsystem (extensions, mcp, runtime, pollers, taskflow, transcripts, pairing, webhook_receiver) and the plugin + persona dirs.

Edit any of them, then restart the daemon (or let config_watch pick up the change). Empty fields stay at their defaults — you only fill in what you actually want.

Example: adding a MiniMax LLM provider after nexo init:

nexo init --yaml llm --output ~/.config/nexo
# Edit ~/.config/nexo/llm.yaml — uncomment the minimax block
# Set MINIMAX_API_KEY in your env (the YAML uses ${MINIMAX_API_KEY})
export MINIMAX_API_KEY=your-key
nexo

4. Switch the broker at runtime

broker.yaml type: is the single most operator-tweaked field. Skip the YAML edit and use the dedicated subcommand:

nexo set-broker local                            # stdio bridge (default)
nexo set-broker nats --url nats://localhost:4222 # multi-host cluster
nexo set-broker local --no-signal                # edit YAML only, no respawn

The subcommand edits broker.yaml in your resolved config dir (auto-creating the file with defaults if missing) and sends SIGTERM to the running daemon by default. The supervisor loop (dev-daemon.sh, systemd, etc.) respawns and picks up the new config — ~3 second blackout.

--no-signal skips the kill; you control the restart timing yourself.

For guidance on which broker shape to pick, see the broker shapes architecture doc.

5. Override config via env vars (12-factor)

For Docker / Kubernetes / CI deployments where YAMLs live in secret mounts at non-canonical paths:

NEXO_BROKER_YAML=/run/secrets/broker.yaml \
NEXO_LLM_YAML=/run/secrets/llm.yaml \
NEXO_AGENTS_YAML=/etc/cfg/agents.yaml \
  nexo

Each NEXO_<NAME>_YAML env points at an absolute path. If set, that file wholesale replaces the YAML the daemon would otherwise load from the config dir.

Currently supported (Phase 94): NEXO_AGENTS_YAML, NEXO_BROKER_YAML, NEXO_LLM_YAML, NEXO_MEMORY_YAML.
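
The per-file override rule can be sketched as a small lookup. This is an illustrative sketch, not the daemon's actual code: the function name `yaml_path` and the `lookup` closure (standing in for `std::env::var`) are assumptions made for the example; only the env var names and the four supported YAMLs come from the text above.

```rust
use std::path::{Path, PathBuf};

/// YAML names the text lists as overridable today.
const OVERRIDABLE: [&str; 4] = ["agents", "broker", "llm", "memory"];

/// Pick the file to load for `name`: the NEXO_<NAME>_YAML path when set,
/// otherwise `<config_dir>/<name>.yaml`.
/// `lookup` stands in for `std::env::var` so the logic is easy to test.
fn yaml_path(
    name: &str,
    config_dir: &Path,
    lookup: impl Fn(&str) -> Option<String>,
) -> PathBuf {
    if OVERRIDABLE.contains(&name) {
        let var = format!("NEXO_{}_YAML", name.to_uppercase());
        if let Some(path) = lookup(&var) {
            // The override replaces the file wholesale; no merging happens here.
            return PathBuf::from(path);
        }
    }
    config_dir.join(format!("{name}.yaml"))
}

fn main() {
    // Hypothetical environment: only the broker YAML is overridden.
    let env = |key: &str| match key {
        "NEXO_BROKER_YAML" => Some("/run/secrets/broker.yaml".to_string()),
        _ => None,
    };
    let dir = Path::new("/etc/nexo");
    println!("{}", yaml_path("broker", dir, env).display());
    println!("{}", yaml_path("llm", dir, env).display());
}
```

Because the override swaps the whole file, fields missing from the overriding YAML are not inherited from the config dir copy.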

6. Layered overrides (Kustomize-style)

For ConfigMap base + Secret overlay deployments where you want to override specific fields (not whole files):

nexo --config /etc/nexo --override-from /run/secrets

The daemon loads each YAML from /etc/nexo/<name>.yaml, then deep-merges the same-named file from /run/secrets/<name>.yaml on top (per-field for mappings; wholesale replace for sequences and scalars).

Example:

# /etc/nexo/broker.yaml  (committed to git, in ConfigMap)
broker:
  type: nats
  url: nats://placeholder
  persistence:
    enabled: true

# /run/secrets/broker.yaml  (mounted from Kubernetes Secret)
broker:
  url: nats://prod-cluster.example.com:4222

Effective config the daemon sees:

broker:
  type: nats                                    # from base
  url: nats://prod-cluster.example.com:4222     # from overlay
  persistence:
    enabled: true                               # from base (overlay didn't touch)

The same chain applies to all four required YAMLs. Both env vars and --override-from compose with Phase 93 defaults: when neither layer has a value, the daemon's Default::default() takes over.
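
The merge rule (per-field for mappings, wholesale replace for sequences and scalars) can be sketched with a toy value type. Everything here is hypothetical and for illustration only: the `Value` enum and `deep_merge` function are not the daemon's types, and a real implementation would operate on parsed YAML.

```rust
use std::collections::BTreeMap;

/// Minimal stand-in for a parsed YAML document (hypothetical type).
#[derive(Debug, Clone, PartialEq)]
enum Value {
    Scalar(String),
    Seq(Vec<Value>),
    Map(BTreeMap<String, Value>),
}

/// Mappings merge key by key; anything else is replaced by the overlay.
fn deep_merge(base: Value, overlay: Value) -> Value {
    match (base, overlay) {
        (Value::Map(mut merged), Value::Map(over)) => {
            for (key, over_val) in over {
                let value = match merged.remove(&key) {
                    Some(base_val) => deep_merge(base_val, over_val),
                    None => over_val,
                };
                merged.insert(key, value);
            }
            Value::Map(merged)
        }
        // Scalars and sequences: the overlay wins wholesale.
        (_, overlay) => overlay,
    }
}

fn scalar(s: &str) -> Value {
    Value::Scalar(s.to_string())
}

fn map(entries: Vec<(&str, Value)>) -> Value {
    Value::Map(entries.into_iter().map(|(k, v)| (k.to_string(), v)).collect())
}

fn main() {
    // Mirrors the broker.yaml example above (contents of the `broker:` key).
    let base = map(vec![
        ("type", scalar("nats")),
        ("url", scalar("nats://placeholder")),
        ("persistence", map(vec![("enabled", scalar("true"))])),
    ]);
    let overlay = map(vec![("url", scalar("nats://prod-cluster.example.com:4222"))]);
    println!("{:#?}", deep_merge(base, overlay));
}
```

One consequence of this rule worth keeping in mind: an overlay cannot append to a list, it can only replace the whole list.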

Config dir discovery

When --config <dir> is not passed explicitly, the daemon resolves the config dir in this order:

NEXO_CONFIG_DIR env var
./config relative to cwd (legacy, only when present)
$XDG_CONFIG_HOME/nexo or $HOME/.config/nexo
./config as a last-resort error path

Subcommands that read or write config (nexo init, nexo set-broker) auto-create the directory and any missing files when they need to — operators don't run mkdir first.
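
The precedence list above can be read as a chain of early returns. The sketch below is an assumption-laden illustration, not the daemon's resolver: `resolve_config_dir` and its closure parameters are hypothetical names, and it assumes the XDG / HOME candidate is returned without checking that it exists.

```rust
use std::path::{Path, PathBuf};

/// Resolve the config dir following the documented order.
/// `env` stands in for `std::env::var`, `exists` for a filesystem check.
fn resolve_config_dir(
    explicit: Option<&str>,
    env: impl Fn(&str) -> Option<String>,
    exists: impl Fn(&Path) -> bool,
) -> PathBuf {
    // --config <dir> always wins.
    if let Some(dir) = explicit {
        return PathBuf::from(dir);
    }
    // 1. NEXO_CONFIG_DIR env var.
    if let Some(dir) = env("NEXO_CONFIG_DIR") {
        return PathBuf::from(dir);
    }
    // 2. Legacy ./config, only when present.
    let legacy = PathBuf::from("./config");
    if exists(&legacy) {
        return legacy;
    }
    // 3. $XDG_CONFIG_HOME/nexo or $HOME/.config/nexo.
    if let Some(xdg) = env("XDG_CONFIG_HOME") {
        return PathBuf::from(xdg).join("nexo");
    }
    if let Some(home) = env("HOME") {
        return PathBuf::from(home).join(".config").join("nexo");
    }
    // 4. Last resort: ./config, even though it does not exist.
    legacy
}

fn main() {
    let dir = resolve_config_dir(None, |k| std::env::var(k).ok(), |p| p.exists());
    println!("{}", dir.display());
}
```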

Composition summary

The four ergonomic phases stack like this:

┌────────────────────────────────────────────────────────────┐
│  Phase 95: nexo init                                       │
│    Scaffolds sample YAMLs with field-level docs.           │
│                          ↓                                 │
│  Operator edits YAML  OR  uses admin RPC  OR  set-broker   │
│                          ↓                                 │
│  Phase 94: env / override-from                             │
│    NEXO_<NAME>_YAML overrides; --override-from deep-merges.│
│                          ↓                                 │
│  Phase 92.9: --config / NEXO_CONFIG_DIR / XDG default      │
│    Base config dir resolution.                             │
│                          ↓                                 │
│  Phase 93: zero-config defaults                            │
│    Any YAML still missing → Default::default().            │
│                          ↓                                 │
│  Phase 92: subprocess broker bridge                        │
│    `broker.yaml type: local` works even with extracted     │
│    subprocess plugins. No NATS server required.            │
└────────────────────────────────────────────────────────────┘

Each layer is optional; pick the ones your deployment needs.

Where to next

Want the classical walkthrough? → Quickstart shows manual YAML editing + Telegram bot end-to-end.
Deploy targets in detail → .deb, .rpm, Termux, Nix, native.
Add an agent at runtime → agents.yaml or the operator UI's /agents/new page.
Broker shapes → broker-shapes.md covers when to pick NATS vs local vs embedded.
Microapp framework → Microapps · getting started for building operator-facing UIs on top of the daemon.

Quickstart

Goal: by the end of this page you have a running nexo-rs daemon with one agent that replies on Telegram (or WhatsApp) when you send it a message.

Total wall-clock time on a fresh laptop: ~10 minutes. The first cargo build is the slow step — pre-built binaries skip it entirely.

In a hurry? Phase 92-95 added a much shorter path: nexo alone now boots a working daemon with zero YAMLs + zero external broker. See zero-config quickstart for the 30-second version. This page covers the classical full-control walkthrough.

What you'll have at the end

You (in Telegram)        →  "what's the weather in Bogotá?"
Your agent (Ana)         →  "Looking it up..."  (via tool)
Your agent (Ana)         →  "Currently 18 °C, light rain."

Plus everything wired together — the broker (local stdio bridge by default, or NATS), an LLM provider, a channel plugin, the agent runtime, memory. From here you can swap personas, add tools, pair more channels, or move to a multi-tenant deployment.

1. Install the binary

Pick one — the one-liner is the fastest, no Rust toolchain needed:

# Pre-built binary (Linux x86_64/aarch64, macOS Intel/Apple Silicon).
# Detects your platform, verifies sha256, drops `nexo` on PATH.
curl -fsSL https://lordmacu.github.io/nexo-rs/install.sh | bash
nexo --version

Other paths:

# From crates.io (needs a Rust toolchain)
cargo install nexo-rs

# Debian/Ubuntu (.deb), Fedora/RHEL (.rpm), Termux (aarch64 .deb) —
# grab the file for your arch from the latest release, e.g.:
#   https://github.com/lordmacu/nexo-rs/releases/latest
sudo apt install ./nexo-rs_0.1.6_amd64.deb        # Debian/Ubuntu
sudo dnf install ./nexo-rs-0.1.6-1.x86_64.rpm     # Fedora/RHEL
pkg install ./nexo-rs_0.1.6_aarch64.deb           # Termux

# Docker
docker pull ghcr.io/lordmacu/nexo-rs:latest

# From source (track main)
git clone https://github.com/lordmacu/nexo-rs.git
cd nexo-rs && cargo build --release && ./target/release/nexo --version

→ More installers: Installation, .deb, .rpm, Termux, Nix.

2. Start NATS (optional)

From Phase 92 onwards, NATS is optional. Subprocess plugins bridge through the daemon's stdio JSON-RPC channel when broker.yaml type: local, so single-host dev deployments don't need an external broker server at all. Skip this step unless you're building a multi-host cluster.

For multi-host / prod-like setups, start NATS:

docker run -d --name nexo-nats -p 4222:4222 nats:2.10-alpine
# OR native install — see broker-shapes architecture doc

Then later, after the daemon is running:

nexo set-broker nats --url nats://localhost:4222

→ Broker shapes explains when to pick which.

3. Provide an LLM key

Pick one provider. MiniMax M2.5 is the primary; Anthropic and OpenAI-compatible APIs are first-class alternatives.

# Option A — MiniMax (default in shipped config)
export MINIMAX_API_KEY=your-key
export MINIMAX_GROUP_ID=your-group-id

# Option B — Anthropic
export ANTHROPIC_API_KEY=sk-ant-...

# Option C — any OpenAI-compatible endpoint
export OPENAI_API_KEY=sk-...
export OPENAI_BASE_URL=https://api.openai.com/v1

The shipped config/llm.yaml reads each via ${ENV_VAR} — no hardcoded keys.

4. Install the channel plugin + pair it

Channels are subprocess plugins (Phase 81.18 onward). Easiest is Telegram — no QR code, no Signal protocol, just a bot token from BotFather:

# Cargo install drops the binary in ~/.cargo/bin/ — the daemon's
# discovery walker scans that directory on boot, no YAML edit
# required (Phase 81.33 Stage 8 auto-detection).
cargo install nexo-plugin-telegram
nexo plugin list

# Tell BotFather to /newbot, save the token:
export TELEGRAM_BOT_TOKEN=123456:ABC-DEF...

For WhatsApp: cargo install nexo-plugin-whatsapp, then the setup wizard walks you through QR pairing.

For Google (Gmail / Calendar / Drive): cargo install nexo-plugin-google, then run nexo-plugin-google --oauth-once <agent_id> --device (or omit --device to use the loopback browser flow). See the Google plugin docs for the full CLI flag list.

For Web Search (Brave / Tavily / DuckDuckGo / Perplexity): cargo install nexo-plugin-web-search, then populate <config_dir>/plugins/web-search.yaml::instances[].providers with API key file refs. See the Web Search plugin docs. DuckDuckGo works with no API key as the fallback provider.

Six canonical plugins live on crates.io: whatsapp/telegram/email/browser/google/web-search.

How the daemon finds your plugin

The discovery walker (Phase 81.33 Stage 8) probes every search path on boot. Defaults out of the box:

Path	Use
`$HOME/.cargo/bin`	`cargo install nexo-plugin-X` lands here
`$HOME/.local/share/nexo/plugins`	XDG-style per-user install
`/usr/local/libexec/nexo/plugins`	system-wide install

In each path the walker looks for two shapes:

A directory containing a nexo-plugin.toml manifest + bin/<plugin-id> entrypoint (classic layout — used when you want to ship multiple files together).
A bare executable named nexo-plugin-<id>. The walker invokes the binary with --print-manifest (2s timeout), parses stdout as TOML, and registers the plugin if validation passes. This is the layout cargo install produces.
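
A rough sketch of how a walker could tell the two shapes apart is shown below. It is not the real discovery code: the `Candidate` enum and `classify` function are hypothetical, and the sketch deliberately stops before the steps the text describes next (checking bin/<plugin-id>, invoking --print-manifest, validating the TOML).

```rust
use std::path::Path;

/// The two layouts described above (hypothetical type for illustration).
#[derive(Debug, PartialEq)]
enum Candidate {
    /// Directory containing a nexo-plugin.toml manifest.
    ManifestDir,
    /// Bare executable named nexo-plugin-<id>; carries the <id> part.
    BareBinary(String),
}

/// Classify one entry found in a search path, or return None to skip it.
fn classify(path: &Path) -> Option<Candidate> {
    if path.is_dir() {
        return path
            .join("nexo-plugin.toml")
            .is_file()
            .then_some(Candidate::ManifestDir);
    }
    let name = path.file_name()?.to_str()?;
    let id = name.strip_prefix("nexo-plugin-")?;
    (!id.is_empty() && path.is_file()).then(|| Candidate::BareBinary(id.to_string()))
}

fn main() {
    // Scan the first default search path; the others work the same way.
    let home = std::env::var("HOME").unwrap_or_default();
    let dir = Path::new(&home).join(".cargo").join("bin");
    if let Ok(entries) = std::fs::read_dir(&dir) {
        for entry in entries.flatten() {
            if let Some(candidate) = classify(&entry.path()) {
                println!("{:?}", candidate);
            }
        }
    }
}
```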

Operators can append paths via config/plugins/discovery.yaml:

discovery:
  search_paths:
    - /opt/nexo-plugins        # site-specific install root
  # Default paths above are STILL scanned — supply
  # `search_paths: []` to opt out entirely.
  auto_detect_binaries: true   # opt out by setting to false
  disabled: []                 # plugin ids to skip
  allowlist: []                # whitelist when non-empty
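
How `disabled` and `allowlist` might combine can be sketched as a single predicate. The function name `is_selected` is hypothetical, and the precedence (a disabled id is skipped even if it also appears in the allowlist) is an assumption of this sketch rather than documented behaviour.

```rust
/// Decide whether a discovered plugin id should be loaded.
/// Assumption: `disabled` takes precedence over `allowlist`.
fn is_selected(id: &str, disabled: &[&str], allowlist: &[&str]) -> bool {
    if disabled.contains(&id) {
        return false;
    }
    // An empty allowlist means "no restriction".
    allowlist.is_empty() || allowlist.contains(&id)
}

fn main() {
    let disabled = ["browser"];
    let allowlist = ["telegram", "browser"];
    for id in ["telegram", "browser", "email"] {
        println!("{id}: {}", is_selected(id, &disabled, &allowlist));
    }
}
```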

→ Authoring your own plugin: Plugin SDKs → Rust SDK documents the print_manifest_if_requested(MANIFEST) call that makes binaries discoverable.

5. Drop a minimal `agents.yaml`

Scaffold every YAML the daemon knows (heavily commented, sane defaults filled in):

nexo init --output ./config
# Writes 19 sample YAMLs you can edit in place.
# Or just the ones you need: nexo init --yaml broker,llm,agents

Then add an agent to config/agents.yaml (or drop it in config/agents.d/ana.yaml — the runtime merges that directory in, alphabetical, and hot-reloads it):

agents:
  - id: ana
    model:
      provider: minimax          # minimax | anthropic | openai | gemini | deepseek
      model: MiniMax-M2.5
    plugins: [telegram]          # the plugin you installed in step 4
    inbound_bindings:
      - plugin: telegram         # which channel may trigger this agent
    system_prompt: |
      You are Ana, a helpful assistant. You answer concisely. You
      speak Spanish if the user does, English otherwise. When you
      don't know something, say so — don't make it up.

(YAML config uses #[serde(deny_unknown_fields)] — a typo'd key fails fast at boot rather than being silently ignored. Full field list: agents.yaml reference.)

6. Run the daemon

nexo --config ./config

First boot prints a startup summary. With the defaults from nexo init (broker type: local), look for something like:

✓ broker ready — kind=Local (stdio bridge, no NATS server)
✓ plugin telegram — registered remote tools (registered_count=6)
✓ Telegram bot @YourBotName online
✓ Loaded 1 agent(s): ana
✓ LLM provider: minimax-m2.5 ready
✓ Memory: SQLite at ./data/memory.db
nexo-rs v0.1.6 ready

(If you'd run nexo set-broker nats … in step 2, the first line reads broker ready — kind=Nats url=nats://… instead.) If anything is missing, the log line tells you exactly what to fix — missing env var, wrong YAML key, channel pair failure.

7. Talk to it

Open Telegram, search for your bot's name, send hola. Within seconds you'll see Ana's reply — the LLM round-trip plus any tools the agent decided to call.

You: hola
Ana: ¡Hola! ¿En qué te puedo ayudar?
You: ¿qué clima hace en Bogotá?
Ana: Déjame revisarlo...
Ana: En Bogotá ahora hay 18 °C con lluvia ligera.

(Weather requires a web_fetch or weather tool — see agents.yaml to wire one up.)

What you just ran

sequenceDiagram
    participant U as You
    participant CH as Telegram plugin (subprocess)
    participant B as Broker (local stdio bridge, or NATS)
    participant A as Ana (agent runtime)
    participant L as MiniMax M2.5

    U->>CH: "hola"
    CH->>B: publish plugin.inbound.telegram
    B->>A: deliver to ana
    A->>L: chat.completion(messages, tools)
    L-->>A: assistant turn
    A->>B: publish plugin.outbound.telegram
    B->>CH: deliver
    CH-->>U: "¡Hola! ¿En qué te puedo ayudar?"

Every arrow is observable: nexo doctor plugins, the daemon log (plugin.inbound.* / plugin.outbound.* lines), and — in NATS mode — topic subscribers.

Where to next

You picked the simplest possible path. Common next moves:

See real product shapes → What you can build — gallery of 10 deployable use cases.
Multiple agents on multiple channels → drop more YAML files in config/agents.d/. Hot-reload picks them up without a restart. → Drop-in agents
Add tools your agent can call → wire a built-in tool, write a custom one, or install an extension pack. → agents.yaml reference
Build a plugin in your language → Plugin contract (Rust, Python, TypeScript, PHP).
Build a SaaS on top → Microapps · getting started.
Production deploy → Hetzner, Fly.io, AWS EC2.

Platform support

Honest matrix of what runs on what, plus the prerequisites each operating system needs for the optional voice / browser / WhatsApp features.

Daemon binary (`nexo`)

The core daemon — the agent loop, NATS bus, plugin supervisor, admin API, MCP client/server, memory layer, taskflow runtime — ships as a single static binary. It compiles against pure-Rust TLS (rustls) and a bundled SQLite C source, so no system OpenSSL or libsqlite is required at runtime.

Platform	Arch	Daemon	How to install
Linux (any glibc / musl distro)	x86_64	✅	`curl -fsSL https://lordmacu.github.io/nexo-rs/install.sh | bash` · or `.deb` / `.rpm` · or `cargo install nexo-rs`
Linux (any glibc / musl distro)	aarch64	✅	`curl -fsSL https://lordmacu.github.io/nexo-rs/install.sh | bash` · or `.deb` / `.rpm` · or `cargo install nexo-rs`
macOS	x86_64 (Intel)	✅	`curl -fsSL https://lordmacu.github.io/nexo-rs/install.sh | bash` · or `cargo install nexo-rs`
macOS	aarch64 (Apple Silicon)	✅	`curl -fsSL https://lordmacu.github.io/nexo-rs/install.sh | bash` · or `cargo install nexo-rs`
Windows	x86_64	✅	Download the `.zip` from Releases, or `cargo install nexo-rs` (the bash installer doesn't run natively)
Windows (WSL)	x86_64	✅	Same `install.sh` one-liner as the Linux rows
Docker (any host)	amd64 + arm64	✅	`docker pull ghcr.io/lordmacu/nexo-rs:latest`
Android (Termux)	aarch64	✅	`pkg install ./nexo-rs_<ver>_aarch64.deb` (download from Releases), or `pkg install rust && curl -fsSL https://lordmacu.github.io/nexo-rs/install.sh | bash` to build from source

Installer. The install.sh one-liner detects your OS + arch and downloads the matching pre-built tarball from the latest GitHub release (Linux x86_64 / aarch64 static-musl, macOS Intel / Apple Silicon), verifies its sha256, and drops nexo on your PATH — no Rust toolchain needed. It falls back to cargo install nexo-rs → cargo install --git for platforms with no pre-built binary. Every release artifact (tarball, .deb, .rpm) carries a .sha256 sidecar and a cosign signature.

Native Windows (cmd.exe / PowerShell, no WSL): grab the release .zip or cargo install nexo-rs. The shell installer is bash-only by design — use it under WSL if you prefer the one-liner.

Optional features — what compiles per OS

The daemon's default feature set works on every platform above. A microapp built on top of nexo-microapp-sdk can opt into extra features that pull additional system dependencies; this is what changes per OS.

Feature	What it enables	Linux	macOS	Windows	Termux
`stt-candle`	Default-track — inbound voice-note transcription via HuggingFace Candle (pure Rust)	✅	✅	✅	✅
`stt`	Legacy — same surface via whisper.cpp C++ binding (`whisper-rs`)	✅	✅	⚠️ needs VS Build Tools 2022 + CMake	⚠️ needs `cmake` + `clang` packages
`stt-cloud`	Cloud STT (native variant) — `SttProvider` trait + OpenAI Whisper-1 + Groq Whisper-large-v3 (REST). `CompositeProvider` fallback chain. Pulls reqwest with `rustls-tls`	✅	✅	✅	✅
`stt-cloud-wasm`	Cloud STT (wasm32 variant) — same trait + REST providers as `stt-cloud`, but reqwest pulled without `rustls-tls` (browser fetch API handles TLS). Use this for `wasm32-unknown-unknown` microapps	— (use `stt-cloud`)	— (use `stt-cloud`)	— (use `stt-cloud`)	— (use `stt-cloud`)
`stt-cloud-anthropic`	Adds Anthropic `voice_stream` WebSocket leg on top of `stt-cloud` (Claude.ai OAuth-gated; conversation engine + Deepgram Nova 3)	✅	✅	✅	✅
`stt-cloud-local-candle`	Bridge — `LocalCandleProvider` so the Candle backend joins a `CompositeProvider` chain as the offline fallback leg + `*_then_candle` convenience constructors	✅	✅	✅	✅
`voice`	Outbound voice replies via Microsoft Edge TTS + pure-Rust opus encoder	✅	✅	✅	✅
`wizard`	First-run LLM key probe via `reqwest` (rustls-tls only)	✅	✅	✅	✅
`enrichment`	Disposable-domain classifier + tenant-keyed cache	✅	✅	✅	✅
`tracking`	HMAC-signed message + link tokens	✅	✅	✅	✅
`email-template`	Block-based email composer + render + asset store	✅	✅	✅	✅

STT backend choice (`stt-candle` vs `stt`)

Phase 91 introduced the pure-Rust Candle backend (stt-candle) as the default track. The legacy whisper-rs path (stt) is retained for one stability window — Phase 91.12 drops it once telemetry confirms the migration.

Pick the right one:

stt-candle (recommended for every target) — HuggingFace Candle ML framework, no C++ build chain. Works out of the box on Linux, macOS, Windows, Termux / Android NDK. Model format is HuggingFace SafeTensors (openai/whisper-tiny and friends); the SDK auto-fetches the weights + tokenizer + config from HF Hub on first call when TranscribeConfig::model_id is set, or loads from a local directory pinned via TranscribeConfig::model_path (air-gapped deployments).
stt (legacy) — whisper-rs binding to whisper.cpp. Slightly faster on CPU, but the C++ build chain requires a per-target toolchain and breaks Android NDK / WASM cross-compile entirely. Keep it only if you've already shipped GGML .bin models you can't easily migrate yet.

Both backends share the audio-decode pipeline (ogg-opus → s16 PCM → f32) and the public TranscribeConfig / transcribe_file signature, so swapping is a Cargo feature change with no code edits at consumer sites.
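
The last hop of that pipeline (s16 PCM to f32) is a plain normalisation step. The sketch below shows one common way to do it; the function name is hypothetical and the 1/32768 scale factor is an assumption about the SDK, which may use a different convention.

```rust
/// Convert signed 16-bit PCM samples to f32 in the range [-1.0, 1.0).
/// Assumes the common 1/32768 scaling; the SDK's exact factor is not documented here.
fn pcm_s16_to_f32(samples: &[i16]) -> Vec<f32> {
    samples.iter().map(|&s| f32::from(s) / 32768.0).collect()
}

fn main() {
    let pcm: [i16; 4] = [0, 16384, -32768, 32767];
    println!("{:?}", pcm_s16_to_f32(&pcm));
}
```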

GPU acceleration (opt-in, `stt-candle-*` sub-features)

The default stt-candle build is CPU-only pure-Rust so it cross-compiles to every target the workspace ships. Hardware acceleration is opt-in per build target:

Cargo feature	Backend	Platform
`stt-candle-metal`	Apple Metal	macOS / iOS
`stt-candle-cuda`	NVIDIA CUDA	Linux + Windows
`stt-candle-accelerate`	Apple Accelerate (BLAS)	macOS

Enable at most one of these per build. The audio decode + tokenizer pipeline stays identical; only the Tensor backend swaps.

Migration from a `stt` (whisper-rs) deployment

If you already ship a GGML .bin file and want to switch to stt-candle:

# 1. Download the equivalent SafeTensors model from HF Hub.
huggingface-cli download openai/whisper-tiny \
  --local-dir ./data/whisper-tiny

# 2. Point your microapp config at the new directory.
#    Either:
#      TranscribeConfig.model_path = "./data/whisper-tiny"
#    or, to auto-fetch on first call (HF Hub cache):
#      TranscribeConfig.model_id   = Some("openai/whisper-tiny")

# 3. Flip the Cargo feature.
#    Before: nexo-microapp-sdk = { features = ["stt"] }
#    After:  nexo-microapp-sdk = { features = ["stt-candle"] }

The whisper-rs path keeps working unchanged during the transition. Do not enable both features at once in a production build — stt-candle wins the public re-export when both are on, so the legacy path becomes effectively unreachable through the default API.

`stt` (legacy) — when you still need the C++ toolchain

If you stay on the stt feature, the original platform caveats still apply:

Linux: apt install clang cmake (or your distro's equivalent). Most dev machines already have it.
macOS: Xcode Command Line Tools — xcode-select --install. Provides clang + cmake.
Windows: Visual Studio Build Tools 2022 (the "Desktop development with C++" workload, or just MSVC + CMake from the individual components page) — no full Visual Studio IDE required. Plus cmake from https://cmake.org/download/. After install, open a "Developer Command Prompt for VS 2022" the first time so cl.exe is on PATH.
Termux: pkg install cmake clang from inside the Termux shell. Note that whisper.cpp performance on Android / Termux is noticeably lower than desktop CPUs; for production STT in Termux, consider stt-candle (which compiles trivially in Termux) or routing transcription to an upstream daemon.

Once the C++ build succeeds the first time, subsequent rebuilds are cached — operators usually pay this cost once during initial setup and never again.

Cloud STT (`stt-cloud*`) — REST + WebSocket backends

For deployments where on-device inference isn't a good fit (SaaS hot path, WASM browser microapps, metered cellular devices) the SDK ships a cloud STT path. Three providers, one fallback chain primitive, three one-line convenience constructors:

Cargo feature	What it adds
`stt-cloud`	`SttProvider` trait + `CompositeProvider` fallback chain + `OpenAiProvider` (Whisper-1 REST) + `GroqProvider` (Whisper-large-v3 REST) + `transcribe_file_with_chain` helper
`stt-cloud-anthropic`	Adds `AnthropicVoiceStream` — full WebSocket client for `wss://api.anthropic.com/api/ws/speech_to_text/voice_stream` (OAuth-gated; the same conversation engine + Deepgram Nova 3 stack Claude Code itself uses for voice input)
`stt-cloud-local-candle`	Adds `LocalCandleProvider` so the local Candle backend joins fallback chains as the offline-backup leg, plus `anthropic_then_candle` / `openai_then_candle` / `groq_then_candle` convenience constructors

Cloud-first with local fallback — one line

When stt-cloud-local-candle is on, compose any cloud primary with a local Candle backup in one call:

// Runs inside an async fn that returns a Result (note the `.await?` below).
// `oauth_token` and `api_key` are assumed to be in scope.
use std::sync::Arc;
use nexo_microapp_sdk::stt::{TranscribeConfig, cloud};

// Local Candle backend config, shared by every chain via Arc.
let candle_cfg = Arc::new(TranscribeConfig {
    model_id: Some("openai/whisper-tiny".into()),
    lang_hint: Some("es".into()),
    ..Default::default()
});

// Anthropic voice_stream → Candle fallback:
let chain = cloud::anthropic_then_candle(oauth_token, candle_cfg.clone());

// Or OpenAI / Groq REST → Candle fallback:
// let chain = cloud::openai_then_candle(api_key, candle_cfg.clone());
// let chain = cloud::groq_then_candle(api_key, candle_cfg);

// Try each leg of the chain in order until one returns a transcript.
let transcript = cloud::transcribe_file_with_chain(
    std::path::Path::new("/tmp/voice-note.ogg"),
    &chain,
    Some("es"),
).await?;

The fallback fires on transport errors (HTTP 5xx, network unreachable, WebSocket disconnect). Hard audio errors (EmptyAudio, UnsupportedFormat, Decode) short-circuit — the next leg would hit the same problem on the same bytes.
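
The fall-through rule can be illustrated with a toy chain. This is a synchronous sketch with hypothetical names (`Provider`, `SttError`, `transcribe_with_chain`, `Fixed`); it is not the SDK's `SttProvider` / `CompositeProvider` API, which is async and whose exact signatures are not shown in this document.

```rust
/// Error classes named in the text, modelled as a toy enum.
#[derive(Debug, Clone, PartialEq)]
enum SttError {
    /// HTTP 5xx, network unreachable, WebSocket disconnect.
    Transport(String),
    EmptyAudio,
    UnsupportedFormat,
    Decode,
}

impl SttError {
    /// Only transport failures are worth retrying on the next leg.
    fn is_retryable(&self) -> bool {
        matches!(self, SttError::Transport(_))
    }
}

trait Provider {
    fn transcribe(&self, audio: &[u8]) -> Result<String, SttError>;
}

/// Try each leg in order; stop at the first success or the first hard error.
fn transcribe_with_chain(
    chain: &[Box<dyn Provider>],
    audio: &[u8],
) -> Result<String, SttError> {
    let mut last_err = SttError::Transport("empty chain".into());
    for provider in chain {
        match provider.transcribe(audio) {
            Ok(text) => return Ok(text),
            Err(err) if err.is_retryable() => last_err = err,
            // Same bytes would fail the same way on the next leg.
            Err(err) => return Err(err),
        }
    }
    Err(last_err)
}

/// Stub provider that always returns a fixed result.
struct Fixed(Result<String, SttError>);

impl Provider for Fixed {
    fn transcribe(&self, _audio: &[u8]) -> Result<String, SttError> {
        self.0.clone()
    }
}

fn main() {
    let chain: Vec<Box<dyn Provider>> = vec![
        Box::new(Fixed(Err(SttError::Transport("HTTP 503".into())))),
        Box::new(Fixed(Ok("hola".into()))),
    ];
    println!("{:?}", transcribe_with_chain(&chain, b"fake-audio"));
}
```

In the stub chain above the first leg fails with a transport error, so the second leg answers; swapping the first error for `EmptyAudio` would return that error without calling the second leg.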

Anthropic `voice_stream` — Claude.ai OAuth required

AnthropicVoiceStream connects to the same endpoint Claude Code uses internally: wss://api.anthropic.com/api/ws/speech_to_text/voice_stream. Requires a Claude.ai subscriber OAuth token (not a regular Anthropic API key — different auth surface).

Wire format: linear16 PCM @ 16 kHz mono, JSON control frames, binary audio frames. The SDK collapses the streaming endpoint to a one-shot call: open WS, send buffer, send {"type":"CloseStream"}, then drain until the 4-trigger finalize state machine resolves (PostCloseStreamEndpoint @ ~300 ms / NoDataTimeout @ 1.5 s / SafetyTimeout @ 5 s / WsClose). Live push-to-talk streaming is a deferred follow-up; see FOLLOWUPS.md 91.x.wasm.phase-4b.streaming.

WASM (`wasm32-unknown-unknown`) — REST cloud works, voice_stream deferred

The local backends (stt-candle via Candle, stt via whisper-rs) don't compile for wasm32-unknown-unknown today: the inference stack depends on crates that need kernel networking (mio) or aren't WASM-clean (opus-wave, tokenizers with onig, Candle's GEMM kernels).

REST cloud STT works on wasm32. Enable stt-cloud-wasm, the wasm-clean sibling of stt-cloud (reqwest pulled without rustls-tls; the browser fetch API handles TLS). OpenAI Whisper-1, Groq Whisper-large-v3, and the CompositeProvider fallback chain are fully supported in browser microapps. On wasm32 the SttProvider trait drops its Send + Sync bounds and uses async_trait(?Send), because the wasm-bindgen fetch backend returns futures holding js-sys types that aren't Send; the execution model is single-threaded, so the bounds only matter on native targets.

Cross-target microapps select the right feature per-target in their own Cargo.toml:

[target.'cfg(target_arch = "wasm32")'.dependencies]
nexo-microapp-sdk = { workspace = true, features = ["stt-cloud-wasm"] }

[target.'cfg(not(target_arch = "wasm32"))'.dependencies]
nexo-microapp-sdk = { workspace = true, features = ["stt-cloud", "stt-cloud-anthropic", "stt-cloud-local-candle"] }

stt-cloud-anthropic (voice_stream WebSocket) is still native-only: tokio-tungstenite pulls in TCP types that are absent on wasm32. Browser microapps wanting voice_stream would need a web-sys::WebSocket-based swap-in (filed as 91.x.wasm.phase-4c).

Voice (TTS) is portable everywhere

The voice feature uses pure-Rust crates (opus-wave, symphonia, ogg) plus a websocket call to Microsoft Edge's TTS endpoint. No C/C++ build, no system audio framework — works the same on Linux, macOS, Windows, and Termux.

Channels — what Rust + the host OS support

Channels (WhatsApp / Telegram / browser / email) ship as standalone subprocess plugins. Each plugin is its own Rust binary and inherits the same OS support matrix as the daemon:

Channel	Linux	macOS	Windows	Termux	Notes
WhatsApp	✅	✅	✅	✅	Uses Signal Protocol via the `wa-agent` upstream crate; pure Rust, all-platform
Telegram	✅	✅	✅	✅	Bot API long-poll; pure Rust
Browser	✅	✅	✅	⚠️	Chrome must be in `PATH`; Termux needs `pkg install chromium`
Email	✅	✅	✅	✅	IMAP poll + lettre SMTP; rustls-tls everywhere

Browser channel caveat — Chromium availability

The browser plugin spawns a Chromium instance via Chrome DevTools Protocol. The plugin doesn't bundle Chromium; it shells out to whatever Chrome / Chromium / Edge is in PATH:

macOS: brew install --cask google-chrome or use an existing Chrome install (/Applications/Google Chrome.app/... path is auto-detected).
Windows: install Chrome from https://www.google.com/chrome/ and let the plugin auto-detect at default install path.
Linux servers (headless): install via your distro (apt install chromium) — the plugin runs Chromium headless by default.
Termux: pkg install chromium — note that Termux's chromium package is significantly older than upstream and some CDP features may misbehave.

What's intentionally NOT in scope today

Wanted by users?	Why deferred
Homebrew formula (`brew install nexo-rs`)	Requires the macOS targets to land first + a release of the binary on those targets. The tap repo is created; the formula auto-publish will turn on as part of the Phase 27.2 follow-up.
`npm install -g @nexo-rs/cli`	The `@nexo-rs/cli` npm scope is reserved with a placeholder; the real CLI shim ships when `cargo dist` re-enables npm in `dist-workspace.toml` `installers`.
Native Windows MSI / PowerShell installer	Same dist-workspace dependency. The `.zip` from GH Releases works in the meantime.
Apple Silicon / Intel Mac via Homebrew	Tap exists, formula not auto-pushed yet. Curl installer covers both Intel + Apple Silicon directly.

Reporting platform-specific issues

If nexo --version runs but a particular feature breaks on your OS, file an issue with the version line + the relevant build channel (printed by nexo version in verbose mode):

nexo version | head -5
# nexo 0.1.6
# git_sha:  …
# channel:  tarball-x86_64-apple-darwin
# target:   x86_64-apple-darwin

Tag the issue with os:macos, os:windows, os:termux, etc., so we can track per-platform regressions across releases.

Setup wizard

The setup wizard is the recommended way to configure nexo-rs on a fresh install. It pairs channels, writes secrets, and patches the YAML config files so the runtime boots with everything it needs.

./target/release/agent setup

Run it from the repo root (or wherever your config/ directory lives).

What the wizard does

flowchart TD
    START([agent setup]) --> MENU{Menu}
    MENU --> LLM[LLM provider]
    MENU --> WA[WhatsApp pairing]
    MENU --> TG[Telegram bot]
    MENU --> GOOG[Google OAuth]
    MENU --> MEM[Memory DB location]
    MENU --> INFRA[NATS + runtime]
    MENU --> SKILLS[Enable / disable skills]

    LLM --> WRITE1[Write secrets/<br/>patch llm.yaml]
    WA --> QR[Scan QR<br/>write session dir]
    TG --> TOKEN[Ask bot token<br/>write secret]
    GOOG --> OAUTH[Device-code consent<br/>write token file]
    MEM --> WRITE2[Patch memory.yaml]
    INFRA --> WRITE3[Patch broker.yaml]
    SKILLS --> WRITE4[Patch extensions.yaml]

    WRITE1 --> DONE([Done])
    QR --> DONE
    TOKEN --> DONE
    OAUTH --> DONE
    WRITE2 --> DONE
    WRITE3 --> DONE
    WRITE4 --> DONE

Every step is optional. You can run setup repeatedly — each section is idempotent.

Steps in detail

LLM provider

Prompts for the default provider (MiniMax, Anthropic, OpenAI-compat, Gemini). Writes the API key to ./secrets/<provider>_api_key.txt and ensures config/llm.yaml references it via ${file:...} or the corresponding env var.

WhatsApp pairing (multi-instance)

Per-agent. Asks which agent you are pairing and which instance label to use (personal, work, …). Each instance gets its own session dir under ./data/workspace/<agent>/whatsapp/<instance> and an allow_agents list (defense-in-depth ACL). The wizard:

Normalises config/plugins/whatsapp.yaml to sequence form (legacy single-mapping entries are auto-converted on first edit).
Upserts the entry by instance label.
Writes credentials.whatsapp: <instance> on the chosen agent's YAML — agents.yaml if the agent lives there, otherwise the matching agents.d/*.yaml.
Launches the pairing loop and renders the QR as Unicode blocks. Scan with WhatsApp → Settings → Linked Devices.
Runs the credential gauntlet so any drift surfaces immediately.

Re-run the wizard once per number you want to pair; instance labels are append-friendly.

Telegram bot (multi-instance)

Same shape as WhatsApp. Asks for instance label (default <agent>_bot) and bot token from @BotFather. Token lands at ./secrets/<instance>_telegram_token.txt with mode 0o600; the YAML references it via ${file:...} so secrets never live in telegram.yaml directly. Adds credentials.telegram: <instance> on the agent.
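
The write-then-reference pattern above can be sketched as follows. This is an illustration in Python, not the wizard's Rust implementation: the helper names write_secret and file_placeholder are hypothetical, and the sketch assumes a POSIX filesystem.

```python
import os


def write_secret(path: str, value: str) -> None:
    """Write `value` to `path` so that only the owner can read or write it."""
    # Create with 0o600 up front so the file is never briefly world-readable.
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    with os.fdopen(fd, "w", encoding="utf-8") as handle:
        handle.write(value.strip() + "\n")
    # The mode given to os.open only applies to newly created files,
    # so tighten a pre-existing file as well.
    os.chmod(path, 0o600)


def file_placeholder(path: str) -> str:
    """Build the reference that goes into the YAML instead of the secret."""
    return "${file:" + path + "}"
```

The YAML then carries only the placeholder string, so the plugin config can be committed or shared without exposing the token.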

Google OAuth

The wizard writes one entry per agent in config/plugins/google-auth.yaml:

google_auth:
  accounts:
    - id: ana@google
      agent_id: ana
      client_id_path:     ./secrets/ana_google_client_id.txt
      client_secret_path: ./secrets/ana_google_client_secret.txt
      token_path:         ./secrets/ana_google_token.json
      scopes: [https://www.googleapis.com/auth/gmail.modify]

Two consent flows are offered after the YAML is written:

Device-code (default — works headless / over SSH): the wizard prints verification_url + a 6-character user_code. Open the URL on any device, type the code, approve. The wizard polls oauth2.googleapis.com/token until approval and persists the refresh_token at token_path (mode 0o600).
Skip and consent later via the google_auth_start LLM tool — uses the loopback PKCE flow, requires a local browser.

Scopes are comma-separated at the prompt; defaults to gmail.modify. Re-running with a different id adds a second account; re-running with the same id overwrites in place.
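
The device-code polling described above can be sketched as a small loop. This is a hedged illustration, not the wizard's code: fetch_token is a hypothetical callable that performs one POST to the token endpoint and returns the parsed JSON, and the error names (authorization_pending, slow_down) follow the OAuth 2.0 device authorization grant (RFC 8628). The interval and attempt limits are assumptions.

```python
import time


def poll_for_token(fetch_token, interval_s=5.0, max_attempts=60, sleep=time.sleep):
    """Poll until the user approves the device code or the flow fails."""
    for _ in range(max_attempts):
        reply = fetch_token()
        if "access_token" in reply:
            # Approved: the reply carries the tokens to persist at token_path.
            return reply
        error = reply.get("error")
        if error == "authorization_pending":
            pass  # user has not finished yet, keep the same interval
        elif error == "slow_down":
            interval_s += 5.0  # RFC 8628: back off by 5 seconds
        else:
            raise RuntimeError(f"device flow failed: {error}")
        sleep(interval_s)
    raise TimeoutError("user did not approve the device code in time")
```

Passing sleep as a parameter keeps the loop testable without real waiting.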

Memory DB location

Lets you pick where the SQLite long-term memory file lives. Default is ./data/memory.db. Per-agent isolation is on by default — each agent gets its own DB file under its workspace.

Infrastructure (NATS + runtime)

Asks for the NATS URL, optional user/password, and timeouts. Patches config/broker.yaml.

Skills on/off

Lets you selectively disable shipped extensions you don't plan to use (reduces tool surface exposed to the LLM).

Files the wizard touches

Target	What it writes
`config/llm.yaml`	Provider entries, base_url, auth mode
`config/plugins/whatsapp.yaml`	`session_dir`, `media_dir`
`config/plugins/telegram.yaml`	`token` (via `${file:...}`), allow-list
`config/plugins/google-auth.yaml`	Per-agent account entries: credential paths, token path, scopes
`config/memory.yaml`	DB location
`config/broker.yaml`	NATS URL, creds
`config/extensions.yaml`	enabled/disabled list
`./secrets/*`	Plaintext secret files (gitignored)

Every YAML patch preserves existing keys and comments via the yaml_patch module — your hand edits survive.

Re-running

Re-run agent setup as many times as you want. Paired channels are detected and skipped unless you explicitly ask to re-pair. To wipe a paired session:

./target/release/agent setup wipe whatsapp --agent ana

Troubleshooting

WhatsApp QR expires too fast → the QR refreshes every ~20s; the wizard re-renders. Scan from the phone with a stable network.
Google OAuth fails with redirect_uri_mismatch → the loopback PKCE flow binds to 127.0.0.1:<port>; make sure your OAuth client allows http://127.0.0.1 as a redirect URI. The device-code flow does not use a redirect URI.
NATS unreachable → the wizard will warn but still write config. The runtime's disk queue will drain once NATS comes back.

Verifying releases

Every Nexo release artifact is signed with Sigstore Cosign using keyless OIDC — no long-lived private key, no PGP key management, no out-of-band trust establishment. The signature is tied to the GitHub Actions workflow run that produced the artifact, and a public record lives in the Rekor transparency log.

Why keyless

Traditional signing requires a long-lived signing key. If it leaks, every past release becomes suspect. Keyless signing instead anchors each signature to:

The GitHub Actions OIDC identity of the workflow run (https://token.actions.githubusercontent.com)
The specific repo + workflow file that ran (https://github.com/lordmacu/nexo-rs/.github/workflows/...)
The commit + ref the workflow built from

A short-lived certificate (10 min validity) is issued by Sigstore's Fulcio CA, the artifact is signed with it, and the whole bundle is recorded in Rekor (immutable). To forge a signature, an attacker would need to compromise GitHub's OIDC infra and the exact workflow path, and even then the forgery shows up in the public log.

Install Cosign

# macOS:
brew install cosign

# Linux (Debian/Ubuntu, x86_64); /usr/local/bin is root-owned, hence sudo:
sudo curl -L "https://github.com/sigstore/cosign/releases/latest/download/cosign-linux-amd64" \
  -o /usr/local/bin/cosign
sudo chmod +x /usr/local/bin/cosign

# Linux (Fedora/RHEL):
sudo dnf install cosign

# Verify the install:
cosign version

Verify a Docker image

Every image at ghcr.io/lordmacu/nexo-rs is cosign-signed by the docker.yml workflow. Verify any tag with:

cosign verify ghcr.io/lordmacu/nexo-rs:latest \
  --certificate-identity-regexp 'https://github.com/lordmacu/nexo-rs/.*' \
  --certificate-oidc-issuer https://token.actions.githubusercontent.com

A successful verification prints the full certificate + the Rekor entry URL. Anything else (signature missing, identity mismatch, broken cert chain) means don't trust this image — check the release notes, file an issue.

Verify a downloaded binary / .deb / .rpm / .tar.gz

The sign-artifacts.yml workflow attaches three files next to every release asset:

<asset>.sig — the raw signature
<asset>.pem — the leaf certificate
<asset>.bundle — combined Sigstore bundle (preferred; carries the inclusion proof)

Verify with the bundle (recommended, single command):

cosign verify-blob \
  --bundle nexo-rs_0.1.1_amd64.deb.bundle \
  --certificate-identity-regexp 'https://github.com/lordmacu/nexo-rs/.*' \
  --certificate-oidc-issuer https://token.actions.githubusercontent.com \
  nexo-rs_0.1.1_amd64.deb

Or with the standalone .sig + .pem if you prefer:

cosign verify-blob \
  --signature nexo-rs_0.1.1_amd64.deb.sig \
  --certificate nexo-rs_0.1.1_amd64.deb.pem \
  --certificate-identity-regexp 'https://github.com/lordmacu/nexo-rs/.*' \
  --certificate-oidc-issuer https://token.actions.githubusercontent.com \
  nexo-rs_0.1.1_amd64.deb

Verify in CI / scripted contexts

Drop this in a deploy pipeline:

#!/usr/bin/env bash
set -euo pipefail

ASSET="${1:?usage: $0 <asset-path>}"
BUNDLE="${ASSET}.bundle"

if [ ! -f "$BUNDLE" ]; then
    echo "ERROR: $BUNDLE missing — refusing to deploy unsigned artifact" >&2
    exit 1
fi

cosign verify-blob \
  --bundle "$BUNDLE" \
  --certificate-identity-regexp 'https://github.com/lordmacu/nexo-rs/.*' \
  --certificate-oidc-issuer https://token.actions.githubusercontent.com \
  "$ASSET" \
  || { echo "ERROR: signature verification failed for $ASSET" >&2; exit 2; }

Inspecting the transparency log

Every signature is searchable on Rekor:

# List the signatures, attestations, and SBOMs attached to an image:
cosign tree ghcr.io/lordmacu/nexo-rs:latest

The output shows every cosign-related artifact attached to the image (signatures, attestations, SBOMs) plus the Rekor log index where each was recorded.

What if verification fails

Identity regex doesn't match: the asset may have been built from a fork / unofficial workflow. Re-download from the GitHub release page directly.
Bundle file missing: older releases (pre-Phase 27.3) don't have signatures. Tag v0.1.1 is the first signed release.
Cert chain expired / revoked: Sigstore's Fulcio root CA has a long lifespan, but the leaf cert is short-lived. cosign automatically fetches the right TUF root; if you see chain errors, run cosign initialize to refresh local trust roots.
Network errors talking to Rekor / Fulcio: both sit behind a CDN. Retry, or use --insecure-ignore-tlog for local verification (this drops the transparency log check, so it is only safe in air-gapped trust contexts).

Out of scope (for now)

Long-lived PGP keys for the apt / yum repos — needs Phase 27.4 signed-repo work to consume them on the user side. Until that ships, .deb / .rpm signatures live in the Cosign world only.
A Homebrew bottle-signing path that lets brew validate without the OIDC chain — Phase 27.6 follow-up.

Configuration layout

nexo-rs loads configuration from a single directory (passed via --config <path>, default ./config). The runtime reads a small set of required YAML files and a handful of optional ones.

Source: crates/config/src/lib.rs::AppConfig::load.

Directory tree

config/
├── agents.yaml              # required — base agent catalog
├── agents.d/                # optional — drop-in agents, merged in alpha order
│   ├── ana.example.yaml     # template (committed)
│   └── *.yaml               # real definitions (gitignored)
├── broker.yaml              # required — NATS / local broker + disk queue
├── llm.yaml                 # required — LLM providers
├── memory.yaml              # required — short-term + long-term + vector
├── extensions.yaml          # optional — extension search paths, toggles
├── mcp.yaml                 # optional — MCP servers the agent consumes
├── mcp_server.yaml          # optional — expose this agent as an MCP server
├── tool_policy.yaml         # optional — per-tool / per-agent policy
├── runtime.yaml             # optional — hot-reload watcher settings
├── plugins/
│   ├── whatsapp.yaml
│   ├── telegram.yaml
│   ├── email.yaml
│   ├── browser.yaml
│   ├── google.yaml
│   └── gmail-poller.yaml
└── docker/                  # optional — overrides for containerized runs
    ├── agents.yaml
    ├── llm.yaml
    └── …

Required vs optional

The loader fails startup if any required file is missing or malformed. Optional files return None when absent and unlock related features only if present.

File	Kind
`agents.yaml`	required
`broker.yaml`	required
`llm.yaml`	required
`memory.yaml`	required
`extensions.yaml`	optional
`mcp.yaml`	optional
`mcp_server.yaml`	optional
`tool_policy.yaml`	optional
`runtime.yaml`	optional — process runtime knobs: hot-reload + cron policy (one-shot retries and optional cron tool-call execution). Defaults enable reload at 500 ms debounce, one-shot retries (3 attempts, exponential backoff), and keep cron tool-calls disabled. See Config hot-reload.
`plugins/*.yaml`	optional (only needed for plugins you enable)

Drop-in agents

Files under config/agents.d/*.yaml are merged into the base agents.yaml in lexicographic filename order. Each file has the same top-level shape (agents: [...]); entries append to the base list.

Common patterns:

00-dev.yaml / 10-prod.yaml — control override order by numeric prefix
Keep agents.yaml public-safe and drop sensitive business content (sales prompts, pricing, phone numbers) into gitignored config/agents.d/ana.yaml
Ship config/agents.d/<name>.example.yaml as a template so the shape stays discoverable

Details in Drop-in agents.
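
The merge rule can be sketched in a few lines. This is an illustration only (the loader itself is Rust): merge_agents is a hypothetical helper, and the inputs are assumed to be already-parsed agents lists rather than YAML text.

```python
def merge_agents(base, drop_ins):
    """Append drop-in agent entries to the base list in filename order.

    base:     list of agent dicts parsed from agents.yaml
    drop_ins: mapping of filename -> list of agent dicts from agents.d/
    """
    merged = list(base)
    # sorted() on strings is lexicographic, so "00-dev.yaml" < "10-prod.yaml".
    for filename in sorted(drop_ins):
        merged.extend(drop_ins[filename])
    return merged
```

With base entry ops and drop-ins 10-prod.yaml (ana) and 00-dev.yaml (kate), the merged order is ops, kate, ana, which is why numeric prefixes give predictable ordering.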

Docker layout

config/docker/ mirrors the main layout and is consumed when the compose file mounts it at /app/config/docker:

# docker-compose.yml
command: ["agent", "--config", "/app/config/docker"]

Secrets inside Docker containers live at /run/secrets/<name> — the compose definitions use ${file:/run/secrets/...} references. See LLM config — auth for the full secret resolution rules.

Env vars and secrets in YAML

YAML values can reference env vars and files:

Syntax	Meaning
`${VAR}`	read env var, fail if unset or empty
`${VAR:-fallback}`	env var if set and non-empty, else `fallback`
`${VAR-fallback}`	env var if set (even empty), else `fallback`
`${file:./secrets/x}`	read file contents, trimmed of whitespace
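
The three env-var forms in the table differ only in how they treat an unset versus an empty variable. A minimal sketch of those semantics, assuming variable names match [A-Za-z_][A-Za-z0-9_]* (the function name resolve_env and the regex are illustrative, not the loader's actual code; ${file:...} references are deliberately left untouched here):

```python
import re

# ${VAR}, ${VAR:-fallback}, ${VAR-fallback}
_PLACEHOLDER = re.compile(r"\$\{([A-Za-z_][A-Za-z0-9_]*)(?:(:?-)([^}]*))?\}")


def resolve_env(text, env):
    """Replace env placeholders in `text` using the mapping `env`."""

    def substitute(match):
        name, operator, fallback = match.group(1), match.group(2), match.group(3)
        value = env.get(name)
        if operator == ":-":
            # Fallback applies when the variable is unset OR empty.
            return value if value else fallback
        if operator == "-":
            # Fallback applies only when the variable is unset.
            return value if value is not None else fallback
        if not value:
            raise ValueError(f"env var {name} not set")
        return value

    return _PLACEHOLDER.sub(substitute, text)
```

Note that ${MINIMAX_API_KEY:-} resolves to an empty string when the variable is missing, which is how a config can make a key optional.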

Path-traversal rules for ${file:...}:

Relative paths are rooted at the current working directory
.. segments are rejected outright
Absolute paths must sit under one of these whitelisted roots:
- /run/secrets/ (Docker secrets)
- /var/run/secrets/ (Kubernetes projected volumes)
- ./secrets/ (project-local)
- the directory pointed at by $CONFIG_SECRETS_DIR (operator-defined)

Everything else is refused at parse time with an explicit error naming the invalid path and the allowed roots.
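
One possible reading of these rules, sketched with pure path arithmetic (no filesystem access). The function name check_file_ref is hypothetical, and the sketch assumes that relative paths are first resolved against the working directory and then held to the same allowed roots, with ./secrets/ meaning <cwd>/secrets; the real loader may differ in detail.

```python
from pathlib import PurePosixPath


def check_file_ref(raw, cwd, secrets_dir=None):
    """Return the absolute path for a ${file:...} reference or raise ValueError."""
    path = PurePosixPath(raw)
    if ".." in path.parts:
        raise ValueError(f"{raw}: `..` not allowed in file reference")
    # Relative paths are rooted at the current working directory.
    absolute = path if path.is_absolute() else PurePosixPath(cwd) / path
    roots = [
        PurePosixPath("/run/secrets"),      # Docker secrets
        PurePosixPath("/var/run/secrets"),  # Kubernetes projected volumes
        PurePosixPath(cwd) / "secrets",     # project-local ./secrets/
    ]
    if secrets_dir:
        roots.append(PurePosixPath(secrets_dir))  # $CONFIG_SECRETS_DIR
    # A file sits under a root when that root is one of its ancestors.
    if any(root in absolute.parents for root in roots):
        return absolute
    allowed = ", ".join(str(root) for root in roots)
    raise ValueError(f"{raw}: outside allowed roots ({allowed})")
```

Rejecting .. before resolving keeps the check independent of symlinks and of what actually exists on disk.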

Validation

All config structs deserialize with #[serde(deny_unknown_fields)], so typos fail fast:

unknown field `modl`, expected `model`
at line 4, column 5 in config/agents.yaml

Missing required fields produce the same kind of message:

missing field `model`
at line 5, column 3 in config/agents.yaml

Env / file resolution errors identify the placeholder and the file:

env var MINIMAX_API_KEY not set (referenced in llm.yaml)

${file:../etc/passwd}: `..` not allowed in file reference (in broker.yaml)

Boot sequence

flowchart TD
    START([agent --config path]) --> LOAD[AppConfig::load]
    LOAD --> REQ{required files<br/>present & parseable?}
    REQ -->|no| FAIL([fail fast, exit 1])
    REQ -->|yes| OPT[read optional files]
    OPT --> DROP[merge config/agents.d/]
    DROP --> RESOLVE[resolve env / file placeholders]
    RESOLVE --> VAL[struct-level validation<br/>deny_unknown_fields]
    VAL --> SEM[semantic validation<br/>validate_agents, MCP headers]
    SEM --> READY([AppConfig ready])

agents.yaml — full agent schema
llm.yaml — LLM provider schema + auth modes
broker.yaml — NATS + disk queue
memory.yaml — short/long/vector
Drop-in agents — merge order and patterns

agents.yaml

The agent catalog. One entry per agent; each entry declares the model, channels, tools, sandboxing, and behavioral knobs for that agent.

Source: crates/config/src/types/agents.rs.

Top-level shape

agents:
  - id: ana
    model:
      provider: minimax
      model: MiniMax-M2.5
    plugins: [whatsapp]
    inbound_bindings:
      - plugin: whatsapp
    allowed_tools:
      - whatsapp_send_message
    outbound_allowlist:
      whatsapp:
        - "573000000000"
    system_prompt: |
      You are Ana, …

Full field reference

Every struct in this schema is declared with #[serde(deny_unknown_fields)], so typos fail fast.

Identity & model

Field	Type	Required	Default	Purpose
`id`	string	✅	—	Unique agent id. Used as session key, subject suffix, workspace dir name.
`model.provider`	string	✅	—	Provider key in `llm.yaml` (e.g. `minimax`, `anthropic`).
`model.model`	string	✅	—	Model id understood by that provider.
`description`	string	—	`""`	Human-readable role. Injected into `# PEERS` for delegation discovery.

Channels

Field	Type	Default	Purpose
`plugins`	`[string]`	`[]`	Plugin ids this agent wants to expose tools for (`whatsapp`, `telegram`, `browser`, …).
`inbound_bindings`	array	`[]`	Per-plugin binding list. Empty = receive nothing from `plugin.inbound.*` (strict mode).

Each inbound_bindings[] entry can override the agent-level defaults for that channel: allowed_tools, outbound_allowlist, skills, model, system_prompt_extra, sender_rate_limit, allowed_delegates, language. Useful for running the same agent on two channels with different rules. See Per-binding capability override below for the full override surface and merge rules.

Binding match rules are strict on (plugin, instance):

instance omitted/null only matches plugin.inbound.<plugin>
instance: foo only matches plugin.inbound.<plugin>.foo

Tool sandboxing

Field	Type	Default	Purpose
`allowed_tools`	`[string]`	`[]`	Build-time pruning of the tool registry. Glob suffix `*` allowed. Empty = all tools registered.
`tool_rate_limits`	object	`null`	Per-tool rate limit patterns. Glob-matched.
`tool_args_validation.enabled`	bool	`true`	Toggle JSON-schema validation of tool arguments.
`outbound_allowlist`	object	`{}`	Per-plugin recipient allowlist (e.g. phone numbers, chat ids). Defense-in-depth for `send` tools.

allowed_tools semantics:

For legacy agents (no inbound_bindings) the allowlist is applied at registry-build time — tools not matching the patterns are removed from the registry before the LLM sees them.
For agents with inbound_bindings the base registry keeps every tool and enforcement happens per-binding at turn time (see Per-binding capability override) so a binding's override can both narrow AND expand within the registry. Defense-in-depth: the LLM only receives tools allowed by the matched binding, and the tool-call execution path rejects any hallucinated name outside the same allowlist.

In both modes the LLM never receives disallowed tool definitions; the difference is where the filter is applied.
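
The pattern semantics (exact names, a trailing * as prefix wildcard, empty list meaning no filtering) can be sketched as below. The function names are illustrative and the matcher is an assumption based on the field description, not the runtime's actual implementation.

```python
def tool_allowed(name, patterns):
    """True when `name` passes the allowed_tools patterns."""
    if not patterns:
        return True  # empty list = every registered tool
    for pattern in patterns:
        if pattern.endswith("*"):
            # Glob suffix: "memory_*" matches any name starting with "memory_".
            if name.startswith(pattern[:-1]):
                return True
        elif name == pattern:
            return True
    return False


def visible_tools(registry, patterns):
    """Tool names whose definitions would be sent to the LLM."""
    return [name for name in registry if tool_allowed(name, patterns)]
```

The same predicate can serve both enforcement points: once when building the tool list for the model, and again when a tool call arrives.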

System prompt & workspace

Field	Type	Default	Purpose
`system_prompt`	string	`""`	Prepended to every LLM turn. Defines persona, rules, examples.
`workspace`	path	`""`	Directory with `IDENTITY.md`, `SOUL.md`, `USER.md`, `AGENTS.md`, `MEMORY.md`. Loaded at turn start. See Soul, identity & learning.
`extra_docs`	`[path]`	`[]`	Workspace-relative markdown files appended as `# RULES — <filename>`.
`transcripts_dir`	path	`""`	Directory for per-session JSONL transcripts. Empty = disabled.
`skills_dir`	path	`"./skills"`	Base directory for local skill files.
`skills`	`[string]`	`[]`	Local skill ids to inject into the system prompt. Resolved from `skills_dir`.
`language`	string	`null`	Output language for the LLM's reply. ISO code (`"es"`, `"en"`, `"en-US"`) or human name (`"Spanish"`, `"español"`). When set, the runtime renders a `# OUTPUT LANGUAGE` system block telling the model to keep workspace docs in English (single source of truth, plays nicely with recall + dreaming) but reply to the user in the configured language. Per-binding `language` overrides this for the matched channel. See Output language.

Heartbeat

heartbeat:
  enabled: true
  interval: 30s

Field	Type	Default	Purpose
`heartbeat.enabled`	bool	`false`	Turn heartbeat on for this agent.
`heartbeat.interval`	humantime	`"5m"`	Interval between `on_heartbeat()` fires.

See Agent runtime — Heartbeat.

Runtime knobs

config:
  debounce_ms: 2000
  queue_cap: 32

Field	Type	Default	Purpose
`config.debounce_ms`	u64	`2000`	Debounce window for burst-of-messages coalescing.
`config.queue_cap`	usize	`32`	Per-agent mailbox capacity.
`sender_rate_limit.rps`	f64	—	Per-sender token-bucket refill rate.
`sender_rate_limit.burst`	u64	—	Bucket size.
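
sender_rate_limit describes a token bucket: rps is the refill rate and burst is the bucket size. A minimal sketch of that mechanism, with an explicit clock argument so it can be reasoned about deterministically (the class name and method are illustrative, not the runtime's types):

```python
class TokenBucket:
    """Token bucket holding at most `burst` tokens, refilled at `rps` per second."""

    def __init__(self, rps, burst, now=0.0):
        self.rps = rps
        self.burst = burst
        self.tokens = float(burst)  # start full so a short burst is accepted
        self.last = now

    def try_acquire(self, now):
        """Consume one token if available; return whether the message passes."""
        elapsed = max(0.0, now - self.last)
        self.tokens = min(self.burst, self.tokens + elapsed * self.rps)
        self.last = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False
```

With rps: 0.5 and burst: 3, a sender can fire three messages immediately and then one more every two seconds. Keeping one bucket per sender stops a single noisy sender from exhausting the quota of others.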

Agent-to-agent delegation

Field	Type	Default	Purpose
`allowed_delegates`	`[glob]`	`[]`	Peers this agent may delegate to. Empty = no restriction.
`accept_delegates_from`	`[glob]`	`[]`	Inverse gate: peers allowed to delegate to this agent.

Routing uses agent.route.<target_id> over NATS with a correlation_id. See Event bus — Agent-to-agent routing.

Dreaming (memory consolidation)

dreaming:
  enabled: false
  interval_secs: 86400
  min_score: 0.35
  min_recall_count: 3
  min_unique_queries: 2
  max_promotions_per_sweep: 20
  weights:
    frequency: 0.24
    relevance: 0.30
    recency: 0.15
    diversity: 0.15
    consolidation: 0.10

Defaults shown. See Soul — Dreaming.

Workspace-git

workspace_git:
  enabled: false
  author_name: "agent"
  author_email: "agent@localhost"

When enabled, the agent's workspace directory is a git repo that the runtime commits to after dream sweeps, forge_memory_checkpoint, and session close. Good for forensic replay.

Google auth (per-agent OAuth)

google_auth:
  client_id: ${GOOGLE_CLIENT_ID}
  client_secret: ${file:./secrets/google_secret.txt}
  scopes:
    - https://www.googleapis.com/auth/gmail.readonly
  token_file: ./data/workspace/ana/google_token.json
  redirect_port: 17653

Used by crates/plugins/google to run OAuth PKCE per agent.

Deprecated in Phase 17: prefer declaring Google accounts in a dedicated config/plugins/google-auth.yaml and binding them from credentials.google (see next section). Inline google_auth still boots with a warning so existing deployments keep working; it is auto-migrated into the credential store at startup.

Credentials (per-agent WhatsApp / Telegram / Google)

Pins each agent to the plugin instance / Google account it may use for outbound traffic. The runtime resolves the target at publish time from the agent id — the LLM cannot pick the instance via tool args, closing the prompt-injection vector.

credentials:
  whatsapp: personal          # must match whatsapp.yaml instance label
  telegram: ana_bot           # must match telegram.yaml instance label
  google:   ana@gmail.com     # must match google-auth.yaml accounts[].id
  # Silence the "inbound ≠ outbound" warning when intentional:
  # telegram_asymmetric: true

Validated at boot by the gauntlet (agent --check-config runs the same checks without starting the daemon). Omitting credentials: keeps the legacy single-account behavior for back-compat.

Full schema + migration guide: config/credentials.md.

Relationship diagram

flowchart LR
    AG[agent entry] --> MOD[model provider]
    AG --> PL[plugins list]
    AG --> IB[inbound_bindings]
    AG --> AT[allowed_tools]
    AG --> OA[outbound_allowlist]
    AG --> WS[workspace]
    AG --> HB[heartbeat]
    AG --> DEL[delegation gates]
    IB -->|per-binding override| AT
    IB -->|per-binding override| OA
    MOD -->|resolved from| LLM[llm.yaml]
    PL -->|tools from| PLUG[plugins/*.yaml]
    WS -->|files| SOUL[SOUL.md /<br/>IDENTITY.md /<br/>MEMORY.md]

Per-binding capability override

A single agent can expose distinct capability surfaces per InboundBinding without running two agent processes. Typical use: the same Ana agent answers WhatsApp with a narrow sales-only surface and Telegram with the full catalogue.

Schema

Every inbound_bindings[] entry accepts the following optional overrides. Unset fields inherit the agent-level value.

Field	Type	Strategy	Notes
`allowed_tools`	`[string]`	replace	`["*"]` = every registered tool
`outbound_allowlist`	object	replace (whole)	WhatsApp/Telegram recipient lists
`skills`	`[string]`	replace	Resolved from agent-level `skills_dir`
`model`	object	replace	Must keep the same `provider`
`system_prompt_extra`	string	append	Rendered as `# CHANNEL ADDENDUM` block
`sender_rate_limit`	`inherit` \| `disable` \| `{rps, burst}`	3-way	Untagged enum
`allowed_delegates`	`[string]`	replace	Peer allowlist for the `delegate` tool
`language`	string	replace	Output language for replies on this channel. Falls through to the agent-level `language` field when omitted. See Output language.

Anything else (workspace, transcripts_dir, heartbeat, memory, workspace_git, google_auth) stays at the agent level — identity and persistent state do not change per channel.
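
The three merge strategies in the table (replace, append, 3-way) can be sketched as a single resolution function. This is an illustration over plain dicts; effective_policy is a hypothetical name and the runtime's actual types are not shown in this document.

```python
REPLACE_FIELDS = (
    "allowed_tools", "outbound_allowlist", "skills",
    "model", "allowed_delegates", "language",
)


def effective_policy(agent, binding):
    """Combine agent-level defaults with one binding's overrides."""
    policy = {}
    # replace: a field set on the binding wins wholesale, even if it is empty.
    for field in REPLACE_FIELDS:
        policy[field] = binding[field] if field in binding else agent.get(field)
    # append: the base prompt is untouched, the binding only adds an addendum.
    policy["system_prompt"] = agent.get("system_prompt", "")
    policy["system_prompt_extra"] = binding.get("system_prompt_extra", "")
    # 3-way: inherit (default) | disable | explicit {rps, burst}.
    limit = binding.get("sender_rate_limit", "inherit")
    if limit == "inherit":
        policy["sender_rate_limit"] = agent.get("sender_rate_limit")
    elif limit == "disable":
        policy["sender_rate_limit"] = None
    else:
        policy["sender_rate_limit"] = limit
    return policy
```

Note the difference between omitting skills (inherit the agent's list) and setting skills: [] (replace it with nothing).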

Example

agents:
  - id: ana
    model: { provider: anthropic, model: claude-haiku-4-5 }
    plugins: [whatsapp, telegram]
    workspace: ./data/workspace/ana
    skills_dir: ./skills
    system_prompt: |
      You are Ana.
    allowed_tools: []            # agent-level = permissive; bindings narrow
    outbound_allowlist: {}
    inbound_bindings:
      - plugin: whatsapp
        allowed_tools: [whatsapp_send_message]
        outbound_allowlist:
          whatsapp: ["573115728852"]
        skills: []
        sender_rate_limit: { rps: 0.5, burst: 3 }
        system_prompt_extra: |
          Channel: WhatsApp sales. Follow the ETB/Claro lead flow.
      - plugin: telegram
        instance: ana_tg
        allowed_tools: ["*"]
        outbound_allowlist:
          telegram: [1194292426]
        skills: [browser, github, openstreetmap]
        model: { provider: anthropic, model: claude-sonnet-4-5 }
        allowed_delegates: ["*"]
        sender_rate_limit: disable
        system_prompt_extra: |
          Channel: private Telegram. Full tool access allowed.

Boot-time validation

The runtime rejects configs with:

Duplicate (plugin, instance) tuples in the same agent.
Telegram instance referenced by a binding but not declared in config/plugins/telegram.yaml.
Binding model.provider different from the agent-level provider (the LLM client is wired once per agent).
Skills listed in a binding whose directory does not exist under skills_dir.

A binding that sets no overrides is allowed but logs a warning.

Matching order

Bindings are evaluated top-to-bottom; the first match wins. Because matching is strict on the instance axis, {plugin: telegram, instance: null} does not capture plugin.inbound.telegram.admin traffic.
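
A short sketch of strict matching on (plugin, instance) with first-match-wins ordering. The function name match_binding is illustrative, and the sketch assumes instance labels contain no dots so that the subject splits cleanly.

```python
def match_binding(subject, bindings):
    """Return the first binding matching an inbound subject, or None."""
    parts = subject.split(".")
    if len(parts) not in (3, 4) or parts[:2] != ["plugin", "inbound"]:
        return None
    plugin = parts[2]
    # plugin.inbound.<plugin> carries no instance; a fourth segment is the label.
    instance = parts[3] if len(parts) == 4 else None
    for binding in bindings:  # top-to-bottom, first match wins
        if binding["plugin"] == plugin and binding.get("instance") == instance:
            return binding
    return None
```

Because the comparison on instance is exact, a binding without an instance never catches plugin.inbound.telegram.admin; that traffic needs its own instance: admin binding.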

Runtime isolation

Tool list shown to the LLM is filtered through the binding's allowed_tools; tools hidden on WhatsApp remain invisible even if the LLM hallucinates the name.
Tool-call execution re-checks the allowlist and returns not_allowed for anything outside — stops hallucination loops without executing the forbidden tool.
Outbound tools (whatsapp_send_message, telegram_send_message) read outbound_allowlist from the matched binding, so WhatsApp sends on the sales channel cannot reach numbers that only the private channel allows.
Sender rate limit buckets are keyed per binding; flood on one channel cannot drain the quota on another.

Back-compat

Agents without inbound_bindings do not consume plugin inbound events. Internal runtime paths that are not plugin inbound (for example heartbeat/delegation paths) still synthesize an effective policy from agent-level defaults.

Output language

Operators pin the language an agent replies in without rewriting workspace markdown. Workspace docs (IDENTITY, SOUL, MEMORY, USER, AGENTS) and tool descriptions stay in English — the single source of truth that recall, dreaming, vector search, and developer tooling all read. The runtime injects a # OUTPUT LANGUAGE system block right after the agent's system_prompt, telling the model to read those docs as-is but reply to the user in the configured language.

Where to set it

agents:
  - id: ana
    language: es                # default for every binding on this agent
    inbound_bindings:
      - plugin: whatsapp
        # → uses Spanish (inherits from the agent)
      - plugin: telegram
        instance: support_intl
        language: en            # → uses English on this channel only
      - plugin: telegram
        instance: bilingual_qa
        language: ""            # → no directive (model picks)

Resolution

Precedence (first explicitly set value wins):

inbound_bindings[i].language: per-channel override.
language: agent-level default.
null: no # OUTPUT LANGUAGE block emitted; the model decides from the user's input.

An empty or whitespace-only string counts as explicitly set and resolves to no directive on either layer. It does not fall through to the next layer, which is how you turn the directive off on one binding even though the agent has one.
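
One reading of these rules, consistent with the language: "" example above, sketched over plain dicts. resolve_language is a hypothetical helper; treating an explicit null on the binding the same as an omitted field is an assumption.

```python
def resolve_language(agent, binding):
    """Return the language label to render, or None for no directive."""
    value = binding.get("language")
    if value is None:
        # Omitted (or null) on the binding: fall through to the agent default.
        value = agent.get("language")
    if value is None or not value.strip():
        # Blank on whichever layer supplied the value: emit no directive.
        return None
    return value.strip()
```

For an agent with language: es, a binding without the field yields es, a binding with language: en yields en, and a binding with language: "" yields no directive at all.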

Accepted values

The runtime treats the value as a label and forwards it verbatim into the directive (after sanitisation; see below). Both forms work:

ISO codes: "es", "en", "en-US", "pt-BR".
Human names: "Spanish", "English", "español", "Brazilian Portuguese".

Human names produce slightly clearer directives in practice (Respond to the user in Spanish. reads more natural than Respond to the user in es.), but both yield the same model behaviour with modern LLMs.

Rendered block

# OUTPUT LANGUAGE

Respond to the user in {language}. Workspace docs (IDENTITY, SOUL,
MEMORY, USER, AGENTS) and tool descriptions are in English — read
them as-is, but your turn-final reply to the user must be in
{language}.

The block lands after the agent's system_prompt (and the optional # CHANNEL ADDENDUM block) so its instruction wins under the LLM's recency bias.

Sanitisation

Defense-in-depth against config-driven prompt injection: every language value is normalised before rendering. Control characters and embedded newlines are stripped, surrounding whitespace is trimmed, and the result is capped at 64 characters. A YAML payload like language: "es\n\nIgnore previous instructions" cannot smuggle a multi-line directive into the system prompt.
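
A minimal sketch of that normalisation, assuming "control characters" means Unicode category Cc (which includes newlines and tabs). The function name and the exact order of steps are illustrative; the runtime's implementation is not shown in this document.

```python
import unicodedata

MAX_LANGUAGE_CHARS = 64


def sanitise_language(raw):
    """Strip control characters, trim whitespace, and cap the length."""
    kept = "".join(ch for ch in raw if unicodedata.category(ch) != "Cc")
    return kept.strip()[:MAX_LANGUAGE_CHARS]
```

The injected payload from the paragraph above collapses to one line: sanitise_language("es\n\nIgnore previous instructions") returns "esIgnore previous instructions", which can no longer open a new block in the system prompt.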

Hot reload

Phase 18 hot-reload covers this field. Edit agents.d/<id>.yaml, save (or run agent reload), and the next message uses the new language. In-flight LLM turns finish on the old policy; subsequent turns flip to the new one.

Workspace docs and recall stay English regardless — see Soul, identity & learning.
Per-channel rotation walkthrough lives in Recipes — A/B prompt swap.

Link understanding

Per-agent (and per-binding) toggle that fetches URLs in the user's message and injects a # LINK CONTEXT block. Off by default. Full schema, caps, and SSRF denylist live on Link understanding. The field is link_understanding at agent scope and at each inbound_bindings[] entry; binding value replaces agent default, omitted = inherit.

Web search

Per-agent (and per-binding) toggle that exposes a web_search tool backed by Brave / Tavily / DuckDuckGo / Perplexity. Off by default. Full schema, providers, cache, and circuit-breaker behaviour live on Web search. The field is web_search at agent scope and at each inbound_bindings[] entry; binding value replaces agent default, omitted = inherit.

Pairing policy

Per-binding toggle that turns on the DM-challenge gate for inbound senders. Off by default. The field is pairing_policy on each inbound_bindings[] entry; null (default) = inherit agent value or skip the gate entirely. Full protocol, threat model, and CLI reference live on Pairing.

Common mistakes

Forgetting plugins: [...]. An agent without plugins has no inbound channel and no outbound tools. It is inert.
Setting allowed_tools without a wildcard. ["memory_*"] allows the full memory_* family; ["memory_store"] allows only one. Check the glob before assuming.
Large system_prompt duplication across agents. Use inbound_bindings[].system_prompt_extra to add per-channel content without duplicating the whole prompt.
Sharing a WhatsApp session across agents. Each agent's workspace should contain its own whatsapp/default session; the wizard does this automatically, but pointing two agents at the same session dir will cause message cross-delivery.
Translating the workspace markdown to match language. Don't. Workspace docs are the single source of truth read by recall, dreaming, and developer tooling — keep them in English. The # OUTPUT LANGUAGE block tells the model to translate the reply on its way out.

Drop-in agents — merging multiple agent files
llm.yaml — where model.provider is resolved
Skills catalog — names that go in allowed_tools

MiniMax M2.5

MiniMax, with its MiniMax-M2.5 model, is the primary LLM provider for nexo-rs. It was the first provider implemented and is the recommended default for new agents.

Source: crates/llm/src/minimax.rs, crates/llm/src/minimax_auth.rs.

Why it's primary

Strong tool-calling support on both the OpenAI-compat wire and the Anthropic Messages wire
Token Plan auth lets you run agents on a subscription without per-request billing headaches
Aggressive price/performance for multi-agent deployments

If you don't have a specific reason to pick another provider, start with MiniMax.

Configuration

# config/llm.yaml
providers:
  minimax:
    api_key: ${MINIMAX_API_KEY:-}
    group_id: ${MINIMAX_GROUP_ID:-}
    base_url: https://api.minimax.io
    rate_limit:
      requests_per_second: 2.0
      quota_alert_threshold: 100000

Per-agent selection:

# config/agents.d/ana.yaml
agents:
  - id: ana
    model:
      provider: minimax
      model: MiniMax-M2.5

Wire formats (`api_flavor`)

MiniMax exposes two HTTP shapes. The client auto-detects from base_url but can be overridden via api_flavor.

`api_flavor`	Endpoint	Shape	When
`openai_compat` (default)	`{base_url}/text/chatcompletion_v2`	OpenAI chat completions	Regular API keys, most use cases
`anthropic_messages`	`{base_url}/v1/messages`	Anthropic Messages	Token Plan / Coding keys served at `api.minimax.io/anthropic`

Auto-detection: if base_url ends in /anthropic, the client picks anthropic_messages automatically.
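The selection rule and the endpoint table can be sketched as two small functions. This is an illustration, not the client's code: `ApiFlavor`, `resolve_flavor`, and `endpoint` are hypothetical names, and tolerating a trailing slash on base_url is an assumption.

```rust
#[derive(Debug, Clone, Copy, PartialEq)]
enum ApiFlavor {
    OpenAiCompat,
    AnthropicMessages,
}

/// An explicit api_flavor wins; otherwise a base_url ending in
/// `/anthropic` selects the Anthropic Messages wire.
fn resolve_flavor(base_url: &str, override_flavor: Option<ApiFlavor>) -> ApiFlavor {
    if let Some(flavor) = override_flavor {
        return flavor;
    }
    // Assumption: a trailing slash is ignored before the suffix check.
    if base_url.trim_end_matches('/').ends_with("/anthropic") {
        ApiFlavor::AnthropicMessages
    } else {
        ApiFlavor::OpenAiCompat
    }
}

/// Builds the request URL from the endpoint column of the table.
fn endpoint(base_url: &str, flavor: ApiFlavor) -> String {
    let base = base_url.trim_end_matches('/');
    match flavor {
        ApiFlavor::OpenAiCompat => format!("{base}/text/chatcompletion_v2"),
        ApiFlavor::AnthropicMessages => format!("{base}/v1/messages"),
    }
}
```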

Authentication

Static API key

Simple path: put the key in env or a secrets file.

Key lookup order (first match wins):

MINIMAX_CODE_PLAN_KEY
MINIMAX_CODING_API_KEY
./secrets/minimax_code_plan_key.txt
api_key field in llm.yaml
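The lookup order above can be sketched as a chain of fallbacks. This is a hedged illustration: `resolve_api_key` is a hypothetical function, the environment and file readers are injected as closures to keep it testable, and treating an empty value as "not set" is an assumption.

```rust
/// Assumption: blank values count as missing.
fn non_empty(s: String) -> Option<String> {
    let t = s.trim();
    if t.is_empty() { None } else { Some(t.to_string()) }
}

/// Walks the four sources in the documented order; the first hit wins.
fn resolve_api_key(
    env: impl Fn(&str) -> Option<String>,
    read_file: impl Fn(&str) -> Option<String>,
    yaml_api_key: Option<&str>,
) -> Option<String> {
    env("MINIMAX_CODE_PLAN_KEY")
        .and_then(non_empty)
        .or_else(|| env("MINIMAX_CODING_API_KEY").and_then(non_empty))
        .or_else(|| read_file("./secrets/minimax_code_plan_key.txt").and_then(non_empty))
        .or_else(|| yaml_api_key.map(str::to_string).and_then(non_empty))
}
```

In a real process the closures would typically be `|n| std::env::var(n).ok()` and `|p| std::fs::read_to_string(p).ok()`.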

Token Plan OAuth bundle

For subscription-based access. The wizard writes a bundle to ./secrets/minimax_token_plan.json:

{
  "access_token": "...",
  "refresh_token": "...",
  "expires_at": "2026-05-01T12:00:00Z",
  "region": "https://api.minimax.io"
}

Auto-refresh: 60 seconds before expires_at, a background task POSTs to {region}/oauth/token with grant_type=refresh_token and rewrites the bundle atomically. Concurrent refreshes are serialized behind a mutex — you never get two refresh calls in flight.

Mid-flight 401: if an API call returns 401 while holding what we thought was a valid token (clock skew, revocation), the client force-refreshes once and retries the request. A second 401 is surfaced as a credential error.
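The two refresh rules above (proactive refresh close to expiry, one forced refresh after a 401) can be sketched as follows. This is an illustration only: `needs_refresh` and `call_with_refresh` are hypothetical names, the HTTP call is reduced to a closure returning a status code, and the mutex that serializes refreshes is left out.

```rust
use std::time::{Duration, SystemTime};

/// Refresh margin taken from the text: 60 seconds before expires_at.
const REFRESH_MARGIN: Duration = Duration::from_secs(60);

/// True once `now` is inside the margin, or the token has already expired.
fn needs_refresh(now: SystemTime, expires_at: SystemTime) -> bool {
    match expires_at.duration_since(now) {
        Ok(remaining) => remaining < REFRESH_MARGIN,
        Err(_) => true, // expires_at is in the past
    }
}

/// Runs `call`; on a 401 it forces exactly one refresh and retries once.
/// A second 401 is handed back to the caller unchanged.
fn call_with_refresh(mut call: impl FnMut() -> u16, mut force_refresh: impl FnMut()) -> u16 {
    let status = call();
    if status != 401 {
        return status;
    }
    force_refresh();
    call()
}
```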

Shared OAuth client id for the MiniMax Portal flow: 78257093-7e40-4613-99e0-527b14b39113.

Request / response flow

sequenceDiagram
    participant A as Agent loop
    participant RL as RateLimiter
    participant C as MiniMaxClient
    participant AU as AuthSource
    participant MX as MiniMax API

    A->>C: chat(ChatRequest)
    C->>RL: acquire()
    C->>AU: fresh_bearer()
    AU->>AU: refresh if <60s to expiry
    AU-->>C: access_token
    C->>MX: POST chatcompletion_v2 / v1/messages
    alt 200
        MX-->>C: ChatResponse
    else 401
        C->>AU: force_refresh()
        C->>MX: retry once
    else 429
        MX-->>C: Retry-After
        C-->>A: LlmError::RateLimit
    else 5xx
        MX-->>C: error body
        C-->>A: LlmError::ServerError
    end

Supported features

Feature	OpenAI-compat	Anthropic-messages
Chat completions	✅	✅
Tool calling	✅	✅
Streaming (SSE)	✅	✅
Token usage in stream	✅ (`stream_options.include_usage`)	✅ native
Multimodal (images)	✅	✅
JSON mode	✅	limited

Rate limiting

Per-provider token bucket. requests_per_second: 2.0 refills one slot every 500 ms. Acquired before every request.

An optional quota_alert_threshold emits a structured warn log when the remaining quota (if the provider reports it) crosses the threshold. Useful for Prometheus alerting.
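To make the refill arithmetic concrete, here is a minimal token-bucket sketch. It assumes a bucket capacity of one slot (the real limiter's burst size is not stated here), and `TokenBucket` is a hypothetical type that takes the clock as a parameter so the behaviour is easy to reason about.

```rust
use std::time::{Duration, Instant};

/// Hypothetical single-slot bucket: requests are spaced `interval` apart.
struct TokenBucket {
    interval: Duration, // time needed to refill one slot
    next_free: Instant, // earliest moment the next request may go out
}

impl TokenBucket {
    fn new(requests_per_second: f64, now: Instant) -> Self {
        Self {
            // 2.0 requests per second gives a 500 ms refill interval.
            interval: Duration::from_secs_f64(1.0 / requests_per_second),
            next_free: now,
        }
    }

    /// Books the next slot and returns how long the caller has to wait.
    fn acquire(&mut self, now: Instant) -> Duration {
        let start = self.next_free.max(now);
        self.next_free = start + self.interval;
        start - now
    }
}
```

With `requests_per_second: 2.0`, two back-to-back calls at the same instant yield waits of 0 ms and 500 ms.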

Error classification

Response	Mapping	Behavior
429	`LlmError::RateLimit { retry_after_ms }`	Retried by the LLM retry layer (up to 5 attempts)
5xx	`LlmError::ServerError { status, body }`	Retried (up to 3 attempts)
401	Internal auth refresh + single retry, then `LlmError::CredentialInvalid`	Fail-fast after refresh attempt
Other 4xx	`LlmError::Other`	Fail fast

See Retry & rate limiting.
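The classification table can be read as a pair of match expressions. The variant names below come from the table; the field types, the `classify` and `max_attempts` helpers, and passing the parsed Retry-After value as an argument are assumptions made for this sketch.

```rust
#[derive(Debug, PartialEq)]
enum LlmError {
    RateLimit { retry_after_ms: u64 },
    ServerError { status: u16, body: String },
    CredentialInvalid,
    Other,
}

/// Maps a failed response to an error. For 401 this is the outcome after
/// the internal refresh and single retry have already failed.
fn classify(status: u16, retry_after_ms: u64, body: &str) -> LlmError {
    match status {
        429 => LlmError::RateLimit { retry_after_ms },
        500..=599 => LlmError::ServerError { status, body: body.to_string() },
        401 => LlmError::CredentialInvalid,
        _ => LlmError::Other,
    }
}

/// Attempt budget used by the retry layer; 1 means fail fast.
fn max_attempts(err: &LlmError) -> u32 {
    match err {
        LlmError::RateLimit { .. } => 5,
        LlmError::ServerError { .. } => 3,
        LlmError::CredentialInvalid | LlmError::Other => 1,
    }
}
```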

Common mistakes

Forgetting group_id. MiniMax requires a group id alongside the key for most endpoints. The wizard sets this; manual configs often miss it.
Pointing base_url at /anthropic with a regular API key. That endpoint is for Token Plan / Coding keys only — regular keys will 401. Leave base_url at https://api.minimax.io.
Refreshing the bundle manually mid-flight. The client already serializes refreshes. Editing the file while the agent runs can race with the client's own atomic rewrite of the bundle: stop the agent, edit, restart.

Short-term memory

Per-session conversational buffer held entirely in memory. Tracks the turns of the ongoing conversation so the LLM has context on every completion request.

Source: crates/core/src/session/ (types.rs, manager.rs) — the Session struct owns the short-term buffer.

What lives in a session

Each Session stores:

Field	Type	Purpose
`history`	`Vec<Interaction>`	FIFO of turns (role + content + timestamp)
`context`	`serde_json::Value`	Free-form JSON blob for per-session state
`last_access`	timestamp	Used by TTL sweeper and cap eviction

An Interaction is {role: User | Assistant | Tool, content, timestamp}.

Sliding window — `max_history_turns`

short_term:
  max_history_turns: 50

Hard cap, sliding FIFO. When a push takes history.len() above max_history_turns, the oldest entry is removed as part of that push:

flowchart LR
    MSG[new turn] --> PUSH[history.push]
    PUSH --> CHECK{len > max?}
    CHECK -->|no| DONE[done]
    CHECK -->|yes| DROP[history.remove(0)]
    DROP --> DONE

Old content is lost, not promoted. If you need long-term persistence, the agent must explicitly call the memory tool with action remember. See Long-term memory.
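The push-then-trim behaviour in the flowchart can be sketched in a few lines. This is an illustration with turns reduced to plain strings; it uses a VecDeque so dropping the oldest entry is cheap, whereas the source describes a Vec with remove(0). `push_turn` is a hypothetical name.

```rust
use std::collections::VecDeque;

/// Appends a turn, then drops the oldest entries until the cap holds.
fn push_turn(history: &mut VecDeque<String>, turn: String, max_history_turns: usize) {
    history.push_back(turn);
    while history.len() > max_history_turns {
        // The dropped turn is lost; nothing promotes it to long-term memory.
        history.pop_front();
    }
}
```

Pushing five turns with a cap of three leaves only the last three in the buffer.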

Session cap and eviction

short_term:
  max_sessions: 10000

Soft cap across the whole process. On overflow, the oldest-idle session (lowest last_access) is evicted to make room. Eviction fires the on_expire callbacks — used by workspace-git to checkpoint before tearing down the session.

max_sessions: 0 disables the cap (unbounded). Leave it at the default unless you have a specific reason — the cap is DoS protection against a spammer rotating chat_ids.
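A minimal sketch of the cap and its eviction rule, assuming sessions are tracked as a map from session key to last_access (in seconds). `insert_with_cap` is a hypothetical helper; the real manager also fires the on_expire callbacks at the point marked in the comment.

```rust
use std::collections::HashMap;

/// Inserts (or touches) a session and returns the evicted key, if any.
/// `max_sessions == 0` disables the cap.
fn insert_with_cap(
    sessions: &mut HashMap<String, u64>,
    key: &str,
    now: u64,
    max_sessions: usize,
) -> Option<String> {
    let mut evicted = None;
    let is_new = !sessions.contains_key(key);
    if max_sessions != 0 && is_new && sessions.len() >= max_sessions {
        // Oldest-idle means the lowest last_access value.
        let victim = sessions
            .iter()
            .min_by_key(|(_, last)| **last)
            .map(|(k, _)| k.clone());
        if let Some(k) = victim {
            sessions.remove(&k); // on_expire callbacks would fire here
            evicted = Some(k);
        }
    }
    sessions.insert(key.to_string(), now);
    evicted
}
```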

TTL sweeper

short_term:
  session_ttl: 24h

Sessions expire after session_ttl of inactivity. The sweeper runs every ttl / 4 (so every 6 h with the default 24 h TTL) and drops expired sessions.

stateDiagram-v2
    [*] --> Active: first message
    Active --> Active: message / event<br/>(last_access updated)
    Active --> Expired: idle > session_ttl
    Active --> Evicted: cap exceeded
    Expired --> [*]: sweeper
    Evicted --> [*]: on_expire callbacks fire

Expiry also fires on_expire — good place to hook session-close commits to a workspace-git repo.
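The sweeper's cadence and expiry test can be sketched as below. This is an illustration under the same assumption as before (a map of session key to last_access in seconds); `sweep_interval` and `sweep` are hypothetical names, and the returned keys mark where on_expire would fire.

```rust
use std::collections::HashMap;
use std::time::Duration;

/// Sweep cadence from the text: a quarter of the TTL.
fn sweep_interval(session_ttl: Duration) -> Duration {
    session_ttl / 4
}

/// Removes sessions idle for longer than the TTL and returns their keys.
fn sweep(sessions: &mut HashMap<String, u64>, now: u64, ttl: Duration) -> Vec<String> {
    let ttl_secs = ttl.as_secs();
    let mut expired: Vec<String> = sessions
        .iter()
        .filter(|(_, last)| now.saturating_sub(**last) > ttl_secs)
        .map(|(k, _)| k.clone())
        .collect();
    expired.sort(); // stable order for logging and tests
    for key in &expired {
        sessions.remove(key);
    }
    expired
}
```

With the default 24 h TTL, `sweep_interval` returns 6 h.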

Relationship to other memory layers

flowchart LR
    STM[short-term<br/>in-memory Vec] -.->|tool call:<br/>memory.remember| LTM[(long-term<br/>SQLite)]
    LTM -.->|vector enabled| VEC[(sqlite-vec)]
    STM -.->|transcripts_dir| TR[(JSONL transcripts)]
    STM -.->|session close| WSG[(workspace-git)]

STM does not auto-promote to LTM. Promotion happens via:

Explicit memory.remember tool call from the agent
Dream sweeps (Phase 10.6) that scan recall-event signals and promote hot memories
Session-close commits to workspace-git if enabled

Gotchas

Lost turns are gone. Once a turn falls off the sliding window it is not recoverable. If it mattered, save it to LTM before the next turn.
max_sessions: 0 has no DoS guard. Only do this in single-tenant setups where you control the sender id space.
last_access updates on any access. That includes heartbeat ticks if they read the session — effectively keeping a session alive past its TTL as long as the agent is alive.

WhatsApp

End-to-end WhatsApp channel: Signal Protocol pairing, inbound message bridge, outbound send/reply/reaction/media tools, optional voice transcription.

Source: standalone repo at nexo-rs-plugin-whatsapp (extracted from crates/plugins/whatsapp/ per Phase 81.19.a; see PHASES.md for the migration notes). The crate ships as a lib + bin Shape B package: the lib re-exports WhatsappPlugin for a future embedded Android host, and the bin is the subprocess entrypoint the daemon spawns per cfg.plugins.whatsapp entry (Phase 81.18.b.2). Internally the plugin wraps the wa-agent (a.k.a. whatsapp-rs) crate for Signal Protocol session lifecycle, QR pairing and the Bot API surface.

Install (Phase 81.18.b.2 — operator action required)

The daemon stopped constructing WhatsappPlugin in-tree as of Phase 81.18.b.2; it spawns the standalone subprocess binary per cfg entry. Operators with cfg.plugins.whatsapp populated must install the binary somewhere the discovery walker scans (the default plugin directory, or a directory listed in plugins.discovery.search_paths) before starting the daemon; otherwise the walker logs a clear warning and the plugin never boots:

# Recommended — download the pre-built tarball from the plugin's
# GitHub Releases into the daemon's plugin dir:
nexo plugin install lordmacu/nexo-plugin-whatsapp
nexo plugin list

# Or build from source:
cargo install --git https://github.com/lordmacu/nexo-plugin-whatsapp

nexo plugin install lands the binary + plugin.toml under <state_dir>/plugins/whatsapp/, which the daemon's discovery walker scans by default — no search_paths edit needed. If you build with cargo install --git instead, point discovery at the install dir in agents.yaml:

plugins:
  discovery:
    search_paths:
      - ~/.cargo/bin   # or wherever you installed the binary

Each cfg.plugins.whatsapp[] entry maps to one subprocess; per-instance state (session_dir Signal Protocol creds, media_dir, instance topic suffix, bridge.response_timeout_ms, acl.allow_list) is seeded into the child via NEXO_PLUGIN_WHATSAPP_* env vars at spawn time. Multi-account operators get true process isolation: one bot's creds.json corruption can't take down the others.

The admin RPC /whatsapp/<instance>/pair* HTTP endpoints keep working: a daemon-side broker subscriber (spawn_whatsapp_pairing_state_subscriber) listens on plugin.inbound.whatsapp.> and mirrors the subprocess's Connected / Disconnected / Reconnecting / Qr events into a daemon-owned PairingState per instance.

Known limitation (Phase 81.20.c follow-up)

Subprocess whatsapp instances do not currently surface AgentEventKind::PeerTyping events on the SSE live transcript stream. The daemon's AgentEventEmitter Arc doesn't cross the process boundary; bridging typing events through the broker ships in follow-up 81.20.c.typing-presence-rpc. Inbound message routing, outbound dispatch, pairing UI, and reconnect telemetry are unaffected.

Topics

Direction	Subject	Notes
Inbound	`plugin.inbound.whatsapp`	Legacy single-account
Inbound	`plugin.inbound.whatsapp.<instance>`	Multi-account routing
Outbound	`plugin.outbound.whatsapp`	Legacy single-account
Outbound	`plugin.outbound.whatsapp.<instance>`	Multi-account routing

During pairing the plugin also publishes qr lifecycle events on the inbound topic so the wizard can render the QR.
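The subject scheme in the table reduces to one formatting rule. A minimal sketch, with `subject` as a hypothetical helper:

```rust
/// Builds the broker subject for a WhatsApp instance.
/// `None` keeps the legacy bare topic; `Some(name)` appends the instance.
fn subject(direction: &str, instance: Option<&str>) -> String {
    match instance {
        Some(name) => format!("plugin.{direction}.whatsapp.{name}"),
        None => format!("plugin.{direction}.whatsapp"),
    }
}
```

For example, `subject("outbound", Some("sales"))` gives `plugin.outbound.whatsapp.sales`.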

Config

# config/plugins/whatsapp.yaml
whatsapp:
  enabled: true
  session_dir: ""            # empty → per-agent default
  media_dir: ./data/media/whatsapp
  instance: default
  acl:
    allow_list: []           # empty + empty env = open ACL
    from_env: WA_AGENT_ALLOW
  behavior:
    ignore_chat_meta: true
    ignore_from_me: true
    ignore_groups: false
  bridge:
    response_timeout_ms: 30000
    on_timeout: noop         # noop | apology_text
  transcriber:
    enabled: false
    skill: whisper
  public_tunnel:
    enabled: false
    only_until_paired: true

Key fields:

Field	Default	Purpose
`session_dir`	per-agent	Signal Protocol state. Each account needs its own dir.
`instance`	`None`	Label for multi-account routing. Unlabelled keeps the legacy bare topic.
`allow_agents`	`[]`	Agents permitted to publish from this instance. Empty = accept any agent holding a resolver handle. Defense-in-depth for the per-agent `credentials` binding.
`acl.allow_list`	`[]`	Bare JIDs allowed to reach the agent. Empty + empty env = open.
`behavior.ignore_chat_meta`	`true`	Skip muted / archived / locked chats on the phone.
`behavior.ignore_from_me`	`true`	Drop the agent's own replies to prevent loops.
`behavior.ignore_groups`	`false`	Skip group chats entirely when `true`.
`bridge.response_timeout_ms`	`30000`	Per-message handler deadline.
`bridge.on_timeout`	`noop`	`noop` (no reply) or `apology_text`.
`transcriber.enabled`	`false`	Voice → text via `skill`.
`public_tunnel.enabled`	`false`	Expose `/whatsapp/pair` through a Cloudflare tunnel.
`public_tunnel.only_until_paired`	`true`	Tear down the tunnel after `Connected`.

Pairing

Pairing is setup-time only. The runtime refuses to start without paired credentials.

sequenceDiagram
    participant U as Operator
    participant W as agent setup
    participant WA as whatsapp-rs Client
    participant P as Phone

    U->>W: setup pair whatsapp --agent ana
    W->>WA: new_in_dir(session_dir)
    WA-->>W: QR image
    W-->>U: render QR (Unicode blocks)
    U->>P: Settings → Linked Devices → scan
    P->>WA: pair
    WA-->>W: Connected
    W->>W: persist creds to session_dir/.whatsapp-rs/creds.json

Credentials at <session_dir>/.whatsapp-rs/creds.json
Daemon-collision check at <session_dir>/.whatsapp-rs/daemon.json blocks a second process on the same account
Multi-account via Client::new_in_dir() — no XDG_DATA_HOME mutation
Credential expiry mid-run (401 loop) → operator must re-pair; no runtime QR fallback

Tools exposed to the LLM

Tool	Signature	Notes
`whatsapp_send_message`	`(to, text)`	Send to arbitrary JID.
`whatsapp_send_reply`	`(chat, reply_to_msg_id, text)`	Quote a specific inbound message.
`whatsapp_send_reaction`	`(chat, msg_id, emoji)`	Emoji tap-back.
`whatsapp_send_media`	`(to, file_path, caption?, mime?)`	File attachment.

All tools honor the per-binding outbound_allowlist.whatsapp — empty list = unrestricted, populated = hard allowlist.
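The allowlist rule is small enough to state as a predicate. This sketch assumes entries are compared as exact JID strings; `outbound_allowed` is a hypothetical name.

```rust
/// Empty list means unrestricted; a populated list is a hard allowlist.
/// Assumption: entries are matched as exact JID strings.
fn outbound_allowed(allowlist: &[&str], to: &str) -> bool {
    allowlist.is_empty() || allowlist.contains(&to)
}
```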

Event shapes

Inbound payloads (on plugin.inbound.whatsapp[.<instance>]):

// message
{
  "kind": "message",
  "from": "573000000000@s.whatsapp.net",
  "chat": "573000000000@s.whatsapp.net",
  "text": "hi",
  "reply_to": null,
  "is_group": false,
  "timestamp": 1714000000,
  "msg_id": "3EB0..."
}

// media_received
{
  "kind": "media_received",
  "from": "...",
  "chat": "...",
  "msg_id": "...",
  "local_path": "./data/media/whatsapp/abc.jpg",
  "mime": "image/jpeg",
  "caption": null
}

// qr  (pairing only)
{"kind": "qr", "ascii": "...", "png_base64": "...", "expires_at": ...}

// lifecycle
{"kind": "connected" | "disconnected" | "reconnecting" | "credentials_expired"}

// observability
{"kind": "bridge_timeout", "msg_id": "...", "waited_ms": 30000}

Presence indicators

While the agent prepares a reply, the WhatsApp plugin pulses the <chatstate> stanza on the peer phone so the user sees a live "typing…" / "recording audio…" indicator ("escribiendo…" / "grabando audio…" on a Spanish-language phone) instead of dead silence. The wire shape matches what WhatsApp Web emits natively:

<!-- text reply (default) -->
<chatstate to="JID"><composing/></chatstate>

<!-- voice note about to be sent -->
<chatstate to="JID"><composing media="audio"/></chatstate>

<!-- pulse stops -->
<chatstate to="JID"><paused/></chatstate>

The plugin switches the media attr automatically based on the outbound OutboundReplyKind:

Text reply → <composing/> for the LLM round-trip; pauses before the message lands.
Voice note (PTT) → <composing/> while the LLM thinks, flips to <composing media="audio"/> ~250 ms before the upload + ack so the peer client has time to repaint "recording audio…", then pauses.
Image / video / document → not media-flagged in v1 (queued as follow-up).

Proactive voice notes (microapp-driven, no inbound trigger) get the same recording-presence wrap via the outbound dispatcher, so the indicator is consistent regardless of who initiated the send.
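The stanza sequence per reply kind can be sketched as a small planner. OutboundReplyKind is named in the text, but its variants here are assumptions, as is the choice to give non-audio media the plain text sequence in v1; `presence_plan` is a hypothetical helper and the ~250 ms timing is not modelled.

```rust
#[derive(Debug, Clone, Copy, PartialEq)]
enum OutboundReplyKind {
    Text,
    VoiceNote,
    OtherMedia, // image, video, document
}

/// Returns the chatstate stanzas in the order they would be sent.
fn presence_plan(kind: OutboundReplyKind, jid: &str) -> Vec<String> {
    let composing = format!("<chatstate to=\"{jid}\"><composing/></chatstate>");
    let recording = format!("<chatstate to=\"{jid}\"><composing media=\"audio\"/></chatstate>");
    let paused = format!("<chatstate to=\"{jid}\"><paused/></chatstate>");
    match kind {
        // Voice notes flip to the audio flag shortly before the upload.
        OutboundReplyKind::VoiceNote => vec![composing, recording, paused],
        // Assumption: non-audio media is not media-flagged in v1.
        OutboundReplyKind::Text | OutboundReplyKind::OtherMedia => vec![composing, paused],
    }
}
```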

`typing_mode` knob

Plugin-instance YAML override. Default reproduces the historic behaviour.

whatsapp:
  enabled: true
  session_dir: ...
  typing_mode: instant   # default; see table below

Value	v1 behaviour
`instant`	Heartbeat starts the moment the handler is invoked. Recommended default.
`thinking`	Documented for parity with future reasoning-stream support; v1 falls back to `instant` + warn-log.
`message`	Documented for parity with future first-text-delta support; v1 falls back to `instant` + warn-log.
`never`	Skips the heartbeat entirely. Use when the bot should stay invisible (no presence cycling at all).

Unknown values warn-degrade to instant rather than failing boot, so a YAML typo cannot wedge the daemon.
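The v1 fallback rules in the table can be sketched as a single match. `TypingMode` and `resolve_typing_mode` are hypothetical names, and the warning strings are made up for illustration; only the mapping itself comes from the table.

```rust
#[derive(Debug, Clone, Copy, PartialEq)]
enum TypingMode {
    Instant,
    Never,
}

/// Returns the effective v1 mode plus an optional warning to log.
fn resolve_typing_mode(raw: Option<&str>) -> (TypingMode, Option<String>) {
    match raw {
        None | Some("instant") => (TypingMode::Instant, None),
        Some("never") => (TypingMode::Never, None),
        // Documented but not implemented in v1: degrade to instant.
        Some(m @ ("thinking" | "message")) => (
            TypingMode::Instant,
            Some(format!("typing_mode `{m}` falls back to `instant` in v1")),
        ),
        // A typo must not block boot: warn and degrade.
        Some(other) => (
            TypingMode::Instant,
            Some(format!("unknown typing_mode `{other}`, using `instant`")),
        ),
    }
}
```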

The keepalive cadence (10 s), TTL safety cap (60 s) and consecutive-failure circuit breaker (2 strikes) are not exposed as YAML knobs in v1; the defaults are meant to suit every agent. Crate consumers that need other values can pass a PresenceHeartbeatConfig through Session::chat_presence_heartbeat_with directly.

Old-client compatibility

Pre-2021 WhatsApp clients ignore the media attribute and paint "typing…" regardless. The degradation is harmless: the voice note still arrives; only the indicator is wrong. Affects <0.5 % of installs.

Agent language and voice (BCP-47 locale)

The agent's language field accepts a full BCP-47 locale — es-AR, es-ES, es-US, en-GB, pt-BR, etc. — and the runtime honours both the language and the region for three things on every turn:

Per-locale system addendum locks the LLM into the regional register: voseo for es-AR (vos, tenés, podés), tuteo + castellano vocab for es-ES (vosotros, vale, coger), Spanglish-aware for es-US (loanwords like email/parking not auto-translated), British spelling + vocab for en-GB, etc. Operators shipping language: "es" (no region) get a Latam-neutral tuteo template.
Voice-mode SSML tutorial — when voice mode is toggled for the conversation, the marker tutorial appended to the system prompt uses the locale's native register (so the examples don't teach the LLM a dialect it shouldn't speak).

Default Edge voice — when the per-conversation voice_id is the install-wide default, the picker resolves a region-matched voice:

Locale	Voice
`es-AR`	`es-AR-ElenaNeural`
`es-MX`	`es-MX-DaliaNeural`
`es-ES`	`es-ES-ElviraNeural`
`es-CO`	`es-CO-SalomeNeural`
`es-PE`	`es-PE-CamilaNeural`
`es-CL`	`es-CL-CatalinaNeural`
`es-US`	`es-US-PalomaNeural`
`en-US`	`en-US-AriaNeural`
`en-GB`	`en-GB-SoniaNeural`
`en-AU`	`en-AU-NatashaNeural`
`en-CA`	`en-CA-ClaraNeural`
`pt-BR`	`pt-BR-FranciscaNeural`
`pt-PT`	`pt-PT-RaquelNeural`
`fr-FR`	`fr-FR-DeniseNeural`
`fr-CA`	`fr-CA-SylvieNeural`
`it-IT`	`it-IT-ElsaNeural`
`de-DE`	`de-DE-KatjaNeural`
`ja-JP`	`ja-JP-NanamiNeural`
`zh-CN`	`zh-CN-XiaoxiaoNeural`

Language-only locales fall back to the canonical region (es → es-MX, en → en-US, pt → pt-BR, …). Operators with a manually-picked voice_id keep their choice; the picker only fires when the stored voice is the install default.
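The picker's behaviour can be sketched as a table lookup with a language-only fallback. Only a few table rows and the three fallbacks named in the text are reproduced; `canonical`, `default_voice`, and `pick_voice` are hypothetical names, and returning the install default when no row matches is an assumption.

```rust
/// A few rows copied from the locale table above.
const VOICES: &[(&str, &str)] = &[
    ("es-AR", "es-AR-ElenaNeural"),
    ("es-MX", "es-MX-DaliaNeural"),
    ("en-US", "en-US-AriaNeural"),
    ("en-GB", "en-GB-SoniaNeural"),
    ("pt-BR", "pt-BR-FranciscaNeural"),
];

/// Canonical region for language-only locales (the three named in the text).
fn canonical(locale: &str) -> &str {
    match locale {
        "es" => "es-MX",
        "en" => "en-US",
        "pt" => "pt-BR",
        other => other,
    }
}

/// Region-matched default voice, if the locale is in the table.
fn default_voice(locale: &str) -> Option<&'static str> {
    let full = canonical(locale);
    VOICES.iter().find(|(l, _)| *l == full).map(|(_, v)| *v)
}

/// The picker only fires when the stored voice is the install default.
fn pick_voice(stored: &str, install_default: &str, locale: &str) -> String {
    if stored != install_default {
        return stored.to_string(); // the operator's manual choice is kept
    }
    default_voice(locale).unwrap_or(install_default).to_string()
}
```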

The supported locale set is closed (lives in nexo_microapp_sdk::Locale); unsupported strings (klingon, es-419, zh-Hant) are rejected by the admin RPC with invalid_locale so a YAML typo cannot reach the daemon.

Behaviour change — `language: "es"` agents

Before this change, language: "es" agents inherited an Argentine voseo flavour from the legacy voice-mode addendum constant. The new behaviour routes language: "es" to the Latam-neutral template (tuteo, no voseo). Operators who want the previous Argentine flavour set language: "es-AR" explicitly.

Gotchas

Shared session_dir across agents = cross-delivery. Each agent should point at its own <workspace>/whatsapp/default. The wizard does this automatically; manual configs need care.
ignore_chat_meta: true silently skips muted/archived chats. If a user archives a chat on the phone, the agent never sees it again until they unarchive.
Credential expiry is irreversible without re-pair. whatsapp-rs will loop on 401. Watch for credentials_expired lifecycle events and alert.

See Setup wizard — WhatsApp pairing.

Skills catalog

nexo-rs uses "skill" to mean two different things. Both are covered on this page; gating semantics for each live in Gating by env / bins.

Extension skills — shipped under extensions/ in the repo, discovered and spawned like any other stdio extension. 22 of them landed in Phase 13.
Local skills — markdown files under an agent's skills_dir/ that get injected into the system prompt at turn start.

The two overlap in name but not in mechanism:

	Extension skill	Local skill
Where it lives	`extensions/<id>/` with `plugin.toml`	`skills/<name>/SKILL.md`
How it's loaded	Extension discovery → stdio spawn	`SkillLoader` at turn time
What it produces	Tools in `ToolRegistry`	Text injected into the prompt
Gating	Warn + continue, tools still registered	Warn + skip entirely

Extension skills (Phase 13)

All shipped as stdio extensions written in Rust. _common is a shared Rust library (circuit-breaker primitives), not an extension itself.

Core utilities

Id	Purpose	Requires
`weather`	Current + forecast via Open-Meteo (no auth).	—
`openstreetmap`	Forward / reverse geocoding via Nominatim.	—
`wikipedia`	Article search + summaries.	—
`fetch-url`	HTTP GET / POST with SSRF guard, retries, circuit breaker.	—
`rss`	Fetch & parse RSS / Atom / JSON feeds.	—
`dns-tools`	A/AAAA/MX/TXT/NS/SOA/SRV + reverse + whois.	—
`endpoint-check`	HTTP probe (status + latency) + TLS cert inspection.	—
`pdf-extract`	Extract text from PDFs.	—
`translate`	LibreTranslate self-hosted or DeepL API.	—
`summarize`	Chat-based text/file summary via OpenAI-compat endpoint.	—
`openai-whisper`	Audio transcription via OpenAI-compat `/audio/transcriptions`.	—

Search & knowledge

Id	Purpose	Requires
`brave-search`	Web search.	env `BRAVE_SEARCH_API_KEY`
`goplaces`	Google Places text search + details.	—
`wolfram-alpha`	Computational queries (short + full pods).	env `WOLFRAM_APP_ID`

Infra & ops

Id	Purpose	Requires	Write-gate
`github`	REST API: PRs, checks, issues.	env `GITHUB_TOKEN`	—
`cloudflare`	DNS, zones, cache purge.	env `CLOUDFLARE_API_TOKEN`	—
`docker-api`	`ps`, `inspect`, `logs`, `stats`, `start`, `stop`, `restart`.	bin `docker`	env `DOCKER_API_ALLOW_WRITE`
`proxmox`	Proxmox VE: nodes, VMs, containers, lifecycle.	env `PROXMOX_TOKEN`	env `PROXMOX_ALLOW_WRITE`, env `PROXMOX_INSECURE_TLS` for self-signed certs
`onepassword`	1Password secrets metadata; reveal gated.	bin `op`, env `OP_SERVICE_ACCOUNT_TOKEN`	env `OP_ALLOW_REVEAL`
`ssh-exec`	Remote command execution with host allowlist.	bin `ssh`, `scp`	host allowlist in config
`tmux-remote`	Drive tmux sessions (create, send keys, capture, kill).	bin `tmux`	—

Media & content

Id	Purpose	Requires
`msedge-tts`	Text-to-speech via Edge Read Aloud.	—
`rtsp-snapshot`	Frames / clips from RTSP or HTTP camera streams.	bin `ffmpeg`
`video-frames`	Extract frames + audio from videos.	bin `ffmpeg`, `ffprobe`
`tesseract-ocr`	OCR with language packs + PSM modes.	bin `tesseract`
`yt-dlp`	Download video / audio / metadata.	bin `yt-dlp`
`spotify`	Now-playing, search, play, pause, skip.	env `SPOTIFY_ACCESS_TOKEN`

Google (phase 13.18)

Single google extension covering 32 tools across Gmail, Calendar, Tasks, Drive, People, and Photos. Uses OAuth refresh-token flow. Writes gated by five independent env flags:

GOOGLE_ALLOW_SEND — Gmail send
GOOGLE_ALLOW_CALENDAR_WRITE
GOOGLE_ALLOW_DRIVE_WRITE
GOOGLE_ALLOW_TASKS_WRITE
GOOGLE_ALLOW_PEOPLE_WRITE

See Plugins — Google for the OAuth setup and the generic google_call tool that fronts the extension.

LLM providers (phase 13.19)

anthropic and gemini are native LLM clients living under crates/llm/, not extensions. See LLM providers and children.

Templates

Id	Purpose	Language
`template-rust`	Copy-and-edit skeleton (`ping`, `add`).	Rust
`template-python`	stdlib-only skeleton.	Python

See Extensions — Templates.

Local skills

Local skills are markdown files loaded by SkillLoader and injected into the system prompt at turn time. Defined in the agent config:

# agents.yaml
agents:
  - id: kate
    skills_dir: ./skills
    skills:
      - weather
      - github
      - summarize
      - google-auth

Each entry resolves to <skills_dir>/<name>/SKILL.md:

---
name: "Weather"
description: "Current conditions and forecasts"
requires:
  bins: ["curl"]
  env: ["WEATHER_API_KEY"]
max_chars: 5000
---
# Weather skill

Call `weather_forecast(city)` to get a 3-day forecast.
Use metric units. Default to the user's locale when unspecified.

Bundled local skills currently shipped in this repo:

Id	Purpose
`loop`	Bounded auto-iteration: run a prompt up to `max_iters` until `until_predicate` matches (`regex`, `exit`, or `judge`).
`stuck`	Bounded auto-debug for repeated `cargo build` / `cargo test` failures via `failing_command`, `max_rounds`, `focus_pattern`, and evidence-first diagnosis.
`simplify`	Bounded code simplification for a file/hunk via `target`, `scope`, `max_passes`, `preserve_behavior` (dead code, redundant guards, duplication, naming).
`verify`	Bounded acceptance verification via `acceptance_criterion`, `candidate_commands`, `max_rounds`, `judge_mode` (command evidence + explicit judge decision).
`skillify`	Capture a repeatable workflow and convert it into a reusable local `SKILL.md` with explicit inputs, steps, guardrails, and output contract.
`remember`	Memory-hygiene review flow: classify/promote/dedupe/conflict-resolve memory artifacts before applying any changes.
`update-config`	Safe config-edit skill for Nexo: map behavior changes to `config/*.yaml`, apply read-before-write merges, and surface hot-reload vs restart requirements.

All seven bundled skills (loop, stuck, simplify, verify, skillify, remember, update-config) can be attached from the setup wizard like any other skill (nexo setup → Configurar agente → Skills): each is registered in the setup skill catalog and requires no secrets.

Loading flow

flowchart TD
    CFG[agents.yaml skills: list] --> LOOP[for each name]
    LOOP --> READ[read skills_dir/name/SKILL.md]
    READ --> FM[parse YAML frontmatter]
    FM --> GATE{bins on PATH<br/>AND env set?}
    GATE -->|no| SKIP[warn + skip<br/>not injected]
    GATE -->|yes| RENDER[render into prompt:<br/>heading + blockquote + body]
    RENDER --> TRUNC[truncate to max_chars]
    TRUNC --> INJECT[inject into system prompt]

Why local skills skip-on-miss (vs extensions warn-and-continue)

A local skill is a text instruction to the LLM describing a capability. If the backing bin/env isn't available the tool will fail — but worse, the LLM was told the capability exists and will repeatedly try to use it. Skipping the skill prevents lying to the model.

An extension is a registered tool. If the LLM invokes it and the backing bin is missing, the tool returns an error — the LLM observes and adapts. Warn-and-continue is fine.

See Gating for the full semantics.
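The loading flow and the skip-on-miss rule can be sketched as a gate plus a truncation step. This is an illustration, not SkillLoader itself: `Requires`, `gate`, and `truncate` are hypothetical names, the PATH and environment checks are injected as closures, and max_chars is assumed to count characters rather than bytes.

```rust
/// Hypothetical mirror of the `requires` block in the SKILL.md frontmatter.
struct Requires<'a> {
    bins: &'a [&'a str],
    env: &'a [&'a str],
}

/// A local skill is injected only when every bin and every env var is present.
fn gate(
    req: &Requires,
    has_bin: impl Fn(&str) -> bool,
    has_env: impl Fn(&str) -> bool,
) -> bool {
    req.bins.iter().all(|b| has_bin(*b)) && req.env.iter().all(|e| has_env(*e))
}

/// Cuts the rendered skill to `max_chars`, never splitting a character.
fn truncate(body: &str, max_chars: usize) -> String {
    body.chars().take(max_chars).collect()
}
```

When `gate` returns false the skill is skipped entirely, so the model is never told about a capability it cannot use.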

How to pick

Need the LLM to know how to do something (usage pattern, style rules, examples)? → local skill.
Need the LLM to do something (make a call, return data)? → extension skill.
Both? → ship the extension and write a local skill next to it that explains when to use it.

Plugin quickstart — zero to installed in 10 minutes

Phase 31.9. Linear, copy-paste path from empty directory to a plugin running inside an operator's daemon. Work through this page once end-to-end before reading Plugin authoring overview, the Rust SDK reference, or the Plugin contract; those documents make sense faster after you have shipped one toy plugin.

Single-language path (Rust). Python / TypeScript / PHP quickstarts share the exact same shell commands; only the --lang flag and the in-repo source tree differ. Pointers to the sister SDKs appear at the bottom.

What you build

A plugin called hello_plugin that echoes every event arriving on its inbound topic onto plugin.inbound.hello_plugin_echo. By the end of this page:

A new GitHub repository under your account holds the plugin's source.
An optionally signed GitHub release ships per-target tarballs.
An operator on a separate host runs nexo plugin install <you>/<repo> and the daemon spawns your plugin within the next 10 seconds, then logs the handshake.

This is the same pipeline that ships the in-house plugins — see github.com/lordmacu/nexo-plugin-browser for a real-world output of the same quickstart, scaled up to 12 tools.

Prerequisites

Tool	Version	Why
Rust toolchain	1.80+ (`rustup` recommended)	Build the plugin binary.
`nexo` CLI	0.1.6+ on PATH	`plugin new`, `plugin run`, `plugin install`.
`git`	any	Push to GitHub.
GitHub account + a repo you can push to	—	Releases host the install artifacts.
`cosign` (optional)	2.x	Sign releases for operators on `--require-signature`. Skip until step 9.

Verify each:

cargo --version          # cargo 1.80+
nexo --version           # nexo 0.1.6+
git --version
gh auth status           # if using `gh` CLI for the repo create

If nexo is not yet on PATH: curl -fsSL https://lordmacu.github.io/nexo-rs/install.sh | bash (or cargo install nexo-rs) — then make sure ~/.cargo/bin is on PATH.

1. Scaffold

The CLI bundles the same Rust template that produces the in-tree template-plugin-rust — one command lands a fresh project on disk:

nexo plugin new hello_plugin --lang rust --owner alice
cd hello_plugin

Flags that matter:

<id> — the plugin's globally unique id. Must satisfy ^[a-z][a-z0-9_]{0,31}$. It becomes the prefix for any tool name, channel kind, or config namespace your plugin contributes.
--lang rust — switch to python, typescript, or php for the matching template. The remaining steps are identical.
--owner alice — your GitHub username. Used in the generated README + CI workflow's release URL.
--description "..." — optional one-liner; flows into the manifest + README + Cargo description.
--git — runs git init for you and stages the initial commit.
--dest /custom/path — emit elsewhere (default is ./<id>/).

Re-running the command on a non-empty directory aborts; pass --force only if you mean it.
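The id rule `^[a-z][a-z0-9_]{0,31}$` can be checked without a regex crate. A minimal sketch, with `valid_plugin_id` as a hypothetical helper (the CLI's own validator is not shown here):

```rust
/// Checks `^[a-z][a-z0-9_]{0,31}$`: a lowercase letter, then up to 31
/// lowercase letters, digits, or underscores.
fn valid_plugin_id(id: &str) -> bool {
    let mut chars = id.chars();
    // First character must be a lowercase ASCII letter.
    let first_ok = matches!(chars.next(), Some('a'..='z'));
    first_ok
        && id.len() <= 32 // byte length equals char count once all chars are ASCII
        && chars.all(|c| c.is_ascii_lowercase() || c.is_ascii_digit() || c == '_')
}
```

`hello_plugin` passes; `Hello`, `1abc`, and an empty string do not.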

2. Inspect what landed

hello_plugin/
├── Cargo.toml               # name = "hello_plugin", bin name matches plugin.id
├── nexo-plugin.toml         # manifest — read by the daemon at handshake
├── README.md                # operator-facing docs (edit these later)
├── scripts/
│   └── pack-tarball.sh      # the per-target tarball packer the CI uses
├── src/
│   └── main.rs              # tokio::main + PluginAdapter
└── tests/
    └── pack_tarball.rs      # regression test for the asset shape

Two files you must know intimately. Open both:

nexo-plugin.toml:

[plugin]
id = "hello_plugin"
version = "0.1.0"
name = "Hello Plugin"
description = "Echoes inbound events back onto the broker."
min_nexo_version = ">=0.1.0"

[plugin.requires]
nexo_capabilities = ["broker"]

[[plugin.channels.register]]
kind = "hello_plugin_inbound"
description = "Inbound events the plugin emits onto the broker."

[plugin.entrypoint]
command = "./bin/hello_plugin"   # resolved relative to this file

plugin.entrypoint.command = "./bin/hello_plugin" is the Phase 31.1.c install convention. The daemon's discovery walker reads the manifest, then spawns whatever command resolves to, relative to the manifest's containing directory. The pack-tarball step (step 8) copies your release binary into bin/hello_plugin so this entrypoint resolves on the operator host.

src/main.rs (truncated):

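use nexo_broker::Event;
use nexo_microapp_sdk::plugin::{BrokerSender, PluginAdapter};

// Manifest text is embedded at compile time and handed to the adapter.
const MANIFEST: &str = include_str!("../nexo-plugin.toml");

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    // (Truncated here: the scaffold's print_manifest_if_requested(MANIFEST)
    // call, covered in step 4.a, is the first line of main().)

    // Logs go to stderr; stdout is reserved for JSON-RPC frames.
    tracing_subscriber::fmt()
        .with_writer(std::io::stderr)
        .init();

    PluginAdapter::new(MANIFEST)?
        .on_broker_event(handle_event)
        .on_shutdown(|| async { Ok(()) })
        .run_stdio()
        .await?;
    Ok(())
}

// Echoes every inbound event onto the plugin's echo topic.
async fn handle_event(topic: String, event: Event, broker: BrokerSender) {
    let echo = Event::new(
        "plugin.inbound.hello_plugin_echo",
        "hello_plugin",
        serde_json::json!({
            "echoed_from": topic,
            "echoed_payload": event.payload,
        }),
    );
    // Publish errors are deliberately ignored in this toy plugin.
    let _ = broker.publish("plugin.inbound.hello_plugin_echo", echo).await;
}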

The contract is small: build a PluginAdapter from your manifest text, register handlers, call run_stdio().await. The SDK owns the JSON-RPC envelope — you only see decoded Events.

Stdout discipline — every byte on stdout must be a JSON-RPC frame. Use eprintln! / tracing::* for plugin-side logs; a stray println! will corrupt the wire and the daemon will tear the subprocess down at handshake.

3. Build

cargo build

A debug binary lands at target/debug/hello_plugin in well under a second on a warm cache. Faster linkers and build caches such as mold and sccache are optional; vanilla cargo works too.

4. Smoke-test the handshake

The daemon performs two probes at boot. Both must pass before the plugin shows up in nexo plugin list.

4.a — `--print-manifest`

The discovery walker (Phase 81.33 Stage 8) invokes each nexo-plugin-* binary with --print-manifest and reads the embedded TOML from stdout. Confirm yours obeys:

./target/debug/hello_plugin --print-manifest

Expected output: verbatim contents of nexo-plugin.toml, followed by exit 0. The scaffold wires the print_manifest_if_requested(MANIFEST) call into the first line of main(); if you see logs, JSON-RPC frames, or empty stdout, the helper is missing.
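For intuition, the early-exit check can be sketched with the standard library alone. This is not the SDK's print_manifest_if_requested implementation; the function name manifest_if_requested and the exact flag handling are assumptions made for illustration.

```rust
/// Hypothetical stand-in for the SDK helper: returns the manifest text
/// when `--print-manifest` appears among the CLI arguments.
fn manifest_if_requested<'a, I>(args: I, manifest: &'a str) -> Option<&'a str>
where
    I: IntoIterator<Item = String>,
{
    // Skip argv[0] (the program name), then look for the flag.
    args.into_iter()
        .skip(1)
        .any(|arg| arg == "--print-manifest")
        .then_some(manifest)
}
```

A caller would pass std::env::args() as the first statement of main(), write the returned text to stdout with print!, and return before any logging or JSON-RPC setup runs.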

4.b — `initialize` handshake

Hand-feed a JSON-RPC initialize frame to verify the wire shape:

echo '{"jsonrpc":"2.0","id":1,"method":"initialize","params":{}}' \
    | ./target/debug/hello_plugin

Expected output (one line of JSON, formatted here for readability):

{
  "jsonrpc": "2.0",
  "id": 1,
  "result": {
    "manifest": { "plugin": { "id": "hello_plugin", ... } },
    "server_version": "hello_plugin-0.1.0",
    "tools": []
  }
}

If you see anything else — extra blank lines, panic backtrace, malformed JSON — fix that before moving on. The daemon will reject the same frame your terminal saw.

5. Local dev loop

Boot a daemon with this directory injected at the head of plugins.discovery.search_paths. No install, no GitHub round trip, no signature verification — pure inner-loop dev:

nexo plugin run .

Expected stderr trace:

INFO local plugin override applied (plugin_id=hello_plugin)
INFO subprocess plugin spawned (id=hello_plugin, pid=...)
INFO hello_plugin starting
INFO subprocess plugin handshake ok (id=hello_plugin, version=0.1.0)

The plugin is now live inside the daemon. Ctrl+C tears both processes down cleanly.

Edit src/main.rs, re-run cargo build, and the daemon's hot-reload walker (Phase 81.10) re-spawns the subprocess automatically — no daemon restart.

For isolated contract debugging without your real agents.yaml, add --no-daemon-config. See Local dev loop conventions for the inner-loop reference.

6. Customize the handler

Replace the body of handle_event with whatever your plugin should do — call a third-party API, persist to disk, trigger a downstream agent. Re-publish the API's reply back through broker.publish so agents can observe it.

Eight common shapes (channel plugin, poller, hybrid bridge, etc.) are pre-baked in Patterns. Browse those once before designing — most plugins fit one of those shapes exactly.

7. Push to GitHub

gh repo create alice/hello_plugin --public --source=. --push

Or, with plain git:

git init && git add . && git commit -m "initial plugin"
git branch -M main   # make sure the local branch is named main
git remote add origin git@github.com:alice/hello_plugin.git
git push -u origin main

The scaffolded repo already contains .github/workflows/release.yml (the Phase 31.2 template). The workflow fires on tag push (v*).

8. Cut a release

Bump the version in Cargo.toml and nexo-plugin.toml (must match), then tag and push:

# bump both files first — keep them in lock-step
git add Cargo.toml nexo-plugin.toml
git commit -m "v0.1.0"
git tag v0.1.0
git push origin main v0.1.0

Watch the workflow in the GitHub Actions tab. The template ships four jobs:

validate-tag — refuses to release if the manifest version does not match the tag.
build — compiles the plugin for linux-x64 and macos-arm64 (extend the matrix yourself for more targets), runs pack-tarball.sh to produce the bin/hello_plugin + nexo-plugin.toml archive layout operators expect, and uploads <id>-<version>-<target>.tar.gz plus .sha256 sidecars.
sign (optional, gated by repo variable COSIGN_ENABLED) — cosign keyless signature against the GitHub Actions OIDC identity. See Signing & publishing for the full keyless tutorial.
release — creates the GitHub Release, attaches every tarball, every .sha256, the bare nexo-plugin.toml (so nexo plugin install can resolve manifest URL early), and the optional cosign material.

When the workflow turns green, the release page contains the exact asset shape nexo plugin install resolves against. See Publishing a plugin for the full asset naming convention.
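The validate-tag rule is easy to check locally before pushing. The sketch below assumes the tag is the manifest version with a leading v, as in the commands above; the function name validate_tag is hypothetical and the template's real check runs in CI, not in Rust.

```rust
/// Returns Ok(()) when the git tag (e.g. "v0.1.0") names the same version
/// as the manifest (e.g. "0.1.0"); otherwise an explanatory error.
fn validate_tag(tag: &str, manifest_version: &str) -> Result<(), String> {
    // The workflow fires on `v*` tags, so a missing prefix is itself an error.
    let tag_version = tag
        .strip_prefix('v')
        .ok_or_else(|| format!("tag `{tag}` does not start with `v`"))?;
    if tag_version == manifest_version {
        Ok(())
    } else {
        Err(format!(
            "tag version `{tag_version}` does not match manifest version `{manifest_version}`"
        ))
    }
}
```

Running a check like this before git tag catches the common case where Cargo.toml was bumped but nexo-plugin.toml was not.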

9. Install on an operator host

Three install routes, depending on how you shipped the plugin:

9.a — `cargo install` (zero config)

Recommended for crates.io-published plugins. The daemon's discovery defaults already cover $HOME/.cargo/bin:

cargo install nexo-plugin-hello_plugin
nexo plugin list

The discovery walker invokes the binary with --print-manifest on next daemon boot (or next hot-reload tick under Phase 81.10), extracts the embedded TOML, and registers the plugin. No operator YAML edit. Authoring detail: Auto-discovery quickstart.

9.b — `nexo plugin install` (GitHub releases)

When you ship tarballs to GitHub releases instead of (or in addition to) crates.io:

nexo plugin install alice/hello_plugin@v0.1.0

The installer (Phase 31.1.c):

Hits https://api.github.com/repos/alice/hello_plugin/releases/tags/v0.1.0.
Picks the tarball matching the host's target triple.
Downloads tarball + .sha256, verifies the digest.
Optionally verifies cosign signature (default off; flip on with --require-signature after configuring trusted keys — see Plugin trust).
Extracts to <state_root>/plugins/hello_plugin-0.1.0/{nexo-plugin.toml, bin/hello_plugin, .nexo-install.json}.
Records the install in the per-host install ledger.
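The asset-selection step can be pictured as a plain string match against the <id>-<version>-<target>.tar.gz convention. This is a sketch under that assumption only: the helper names are hypothetical, the target is treated as an opaque string, and the real installer may normalize versions or targets differently.

```rust
/// Builds the expected asset name: <id>-<version>-<target>.tar.gz
fn asset_name(id: &str, version: &str, target: &str) -> String {
    format!("{id}-{version}-{target}.tar.gz")
}

/// Picks the release asset for the given target, or returns an error
/// worded like the one in the troubleshooting table.
fn pick_asset<'a>(
    assets: &'a [&'a str],
    id: &str,
    version: &str,
    target: &str,
) -> Result<&'a str, String> {
    let wanted = asset_name(id, version, target);
    assets
        .iter()
        .copied()
        .find(|name| *name == wanted)
        .ok_or_else(|| format!("no asset matching target {target}"))
}
```

If the lookup fails for a host you care about, the fix is on the author side: add that target to the workflow matrix and re-tag.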

The daemon's plugins.discovery.search_paths defaults include $HOME/.local/share/nexo/plugins and /usr/local/libexec/nexo/plugins. Move (or symlink) the extracted directory under one of those to skip the YAML edit, or point a custom search_paths entry at <state_root>/plugins/.

9.c — Drop-in (manual)

Copy the binary into any default search path:

cp ./target/release/hello_plugin ~/.cargo/bin/nexo-plugin-hello_plugin
chmod +x ~/.cargo/bin/nexo-plugin-hello_plugin

Re-discovery happens on next daemon boot (or hot-reload).

10. Verify

nexo plugin list

Expected output:

ID            VERSION  TARGET                       CHANNEL  INSTALLED
hello_plugin  0.1.0    x86_64-unknown-linux-gnu     latest   2026-05-07T18:20:42+00:00

Daemon stderr should show the handshake within ~5 seconds:

INFO subprocess plugin spawned (id=hello_plugin, pid=...)
INFO subprocess plugin handshake ok (id=hello_plugin, version=0.1.0)

Publish anything onto an inbound topic the plugin subscribed to (for example, through the broker publish call inside your microapp) and the echo lands on plugin.inbound.hello_plugin_echo.

Iterate

# bump in source, push tag
git tag v0.1.1 && git push origin v0.1.1

# operator-side
nexo plugin upgrade hello_plugin            # pulls latest tag
# or pin: nexo plugin upgrade hello_plugin --version v0.1.1

nexo plugin upgrade (Phase 31.8) atomically swaps the on-disk copy and restarts the subprocess inside the daemon. To roll back, re-run nexo plugin install against the older tag.

To remove:

nexo plugin remove hello_plugin

Troubleshooting

Symptom	Likely cause	Fix
`nexo plugin install` errors `no asset matching target <triple>`.	CI matrix did not build for the operator's host.	Add the missing triple to the workflow matrix, re-tag.
Daemon stderr shows `subprocess exited at handshake (status=...)`.	Plugin wrote non-JSON-RPC bytes to stdout (most likely a stray `println!`) or panicked before the handshake.	Re-run `./target/debug/<id>` against a synthetic frame from step 4 — the panic is reproducible there.
`nexo plugin list` does not show your plugin after install.	Daemon's `plugins.discovery.search_paths` does not include `<state_root>/plugins/`.	Add it to `config/plugins/discovery.yaml` and restart.
`nexo plugin install` errors `signature required`.	Operator runs with `--require-signature` and your release was unsigned.	Sign with cosign — see Signing & publishing.
Plugin runs locally with `nexo plugin run` but the published binary panics on the operator host.	Per-target build skipped a runtime dep (e.g. dynamically linked OpenSSL that is present on the build machine but not on the operator's distro).	Vendor OpenSSL (for example, the `openssl` crate's `vendored` feature) or link statically in `Cargo.toml`; rebuild.
Operator on `--require-signature` rejects your release with `cosign verify failed`.	Trusted-keys file does not include your identity issuer.	Operator adds your GitHub identity to `config/extensions/trusted_keys.toml`. See Plugin trust.

Going deeper

You shipped one plugin. From here:

Plugin authoring overview — the full picture (plugin vs extension vs microapp, plugin config dir, sandboxing, contributing tools / channel kinds / LLM providers / memory backends / hooks).
Rust SDK reference — full PluginAdapter surface, manifest schema, per-target tarball convention.
Plugin contract — the wire spec every SDK implements. Read this once and you can debug any plugin in any language.
Patterns (8 common shapes) — pre-baked designs for channels, pollers, hybrid bridges.
Publishing a plugin — full asset naming convention and the 4-job CI workflow shape.
Signing & publishing — cosign keyless tutorial.

Other languages

Same flow, swap step 1's --lang:

SDK	Scaffold	Reference
Python	`nexo plugin new hello --lang python --owner alice`	Python SDK
TypeScript	`nexo plugin new hello --lang typescript --owner alice`	TypeScript SDK
PHP	`nexo plugin new hello --lang php --owner alice`	PHP SDK

Steps 2–10 read identically; the source tree differs (no Cargo.toml, language-appropriate runtime). The wire contract is the same.

Plugin authoring overview

Phase 31.9. Entry point for authors building anything that extends nexo-rs from the outside. This page gets you to the right deeper guide in 60 seconds.

Read this when

You want to add capability to nexo-rs and have not yet picked between a plugin, an extension, or a microapp.
You have picked "plugin" and need to know which language SDK to start with.
You want a 5-minute end-to-end smoke test before committing to a language choice.

Plugin vs Extension vs Microapp

nexo-rs ships three extension surfaces. They differ in who owns the runtime, who owns the UI, and how operators install them.

You're building	Use	Owns UI?	Owns auth/billing?	Common languages
New channel (Slack, Discord, IRC) or poller	Plugin	No (daemon owns I/O)	No (operator config)	Rust, Python, TypeScript, PHP
Bundle of skills, advisors, prompts, or YAML config that operators `nexo ext install`	Extension	No	No	YAML + small Rust stubs
End-product on top of nexo-rs (multi-tenant SaaS, internal tool, white-label deploy)	Microapp	✅ yes	✅ yes	Any language with a NATS client

If you are still unsure:

Plugin if your code is reactive (broker.event fires → you do something) and ships as a binary the daemon spawns.
Extension if your code is declarative (skills + agents + prompts) and ships as a tarball operators install with nexo ext install.
Microapp if your code is the product. End users see your UI, your domain, your billing — nexo-rs is invisible infrastructure.

This page covers plugins. For extensions, jump to Manifest reference. For microapps, jump to Microapps · getting started.

Pick a language

All four SDKs implement the same wire contract — your choice is purely about ergonomics. Operators don't care which SDK you picked; they just run nexo plugin install <owner>/<repo>.

Language	Best for	Runtime deps	Per-target binaries?	SDK reference
Rust	Performance, single static binary, zero runtime deps.	None — `cargo build` produces a static ELF/Mach-O.	✅ yes (one tarball per Rust target)	Rust SDK
Python	Existing scripts, ML ecosystem, fast iteration.	`python3.11+` on operator host.	No (`noarch` — single tarball)	Python SDK
TypeScript	Existing Node servers, npm ecosystem, frontend devs.	`node 20+` on operator host.	No (`noarch`)	TypeScript SDK
PHP	Existing Composer / Symfony / Laravel codebase.	`php 8.1+` (Fibers required) on operator host.	No (`noarch`)	PHP SDK

Cross-cutting reference: Plugin contract is the wire spec all four SDKs implement. Read it once and you understand every SDK.

SDK packages

nexo plugin new --lang <lang> vendors the SDK for you, so you don't normally install it by hand — but if you're wiring the SDK into an existing project, the published packages are:

Language	Package (registry)	Add to a project
Rust	`nexo-microapp-sdk` (feature `plugin`) + `nexo-broker`	`cargo add nexo-microapp-sdk -F plugin && cargo add nexo-broker`
Python	`nexoai` — import name stays `nexo_plugin_sdk`	`pip install nexoai`
TypeScript	`nexo-plugin-sdk`	`npm install nexo-plugin-sdk`
PHP	`nexo/plugin-sdk`	`composer require nexo/plugin-sdk`

The Python / TypeScript / PHP SDKs live in one mono-repo — lordmacu/nexo-plugin-sdks (per-language release tags python-v* / ts-v* / php-v*). The Rust SDK ships from this repo (crates/microapp-sdk, feature plugin).

Microapp, not plugin? The product layer uses the same nexo-microapp-sdk crate without the plugin feature (its admin / voice / stt / wizard / events modules instead), plus the @lordmacu/nexo-microapp-ui-react React kit for the frontend. See Microapps · getting started and the agent-creator reference microapp.

5-min quickstart

The shortest path from zero to a running plugin uses Rust because the toolchain ships with cargo. Adapt the nexo plugin new --lang <other> step for Python / TypeScript / PHP — the rest is identical.

For the full zero-to-installed flow (scaffold → publish to crates.io → operator-side cargo install nexo-plugin-X → daemon auto-discovery), see the linear Plugin quickstart (10 min). This section is the abridged inner-loop version.

# 1. Scaffold from the bundled template (Phase 31.6).
nexo plugin new my_plugin --lang rust --owner alice
cd my_plugin

# 2. Build (under a second on a warm cache).
cargo build

# 3. Boot the daemon with this directory injected at the head
#    of plugins.discovery.search_paths. No install, no verify,
#    no GitHub round-trip — pure inner-loop dev.
nexo plugin run .

Expected stderr trace from step 3:

INFO local plugin override applied (plugin_id=my_plugin)
INFO subprocess plugin spawned (id=my_plugin, pid=...)
INFO my_plugin starting
INFO subprocess plugin handshake ok (id=my_plugin, version=0.1.0)

The plugin is now live. Publishing any event on a topic the plugin's manifest registers (default plugin.inbound.my_plugin_echo) reaches the handler in src/main.rs::handle_event.

To exit, send Ctrl+C — the daemon issues a shutdown request, the plugin's on_shutdown runs, and both processes return cleanly.

Plugin config dir

Phase 81.4 — operators place per-plugin YAML config under <config_dir>/plugins/<plugin_id>/. The daemon reads every *.yaml / *.yml file in that directory at boot, deep-merges them alphabetically, resolves ${ENV_VAR} placeholders, and (when your manifest declares a schema_path) validates the merged tree against your JSONSchema before calling init(). Validation failure aborts plugin load with InitOutcome::Failed; the daemon continues without the plugin.
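To make the placeholder step concrete, here is a minimal sketch of ${ENV_VAR} resolution over a single string. The function name expand_env is hypothetical, and treating an unset variable or an unterminated placeholder as an error is an assumption; the text above does not say how the daemon handles those cases.

```rust
/// Replaces every `${NAME}` in `input` with the value returned by `lookup`.
/// Unset variables and unterminated placeholders are reported as errors
/// (an assumption made for this sketch).
fn expand_env<F>(input: &str, lookup: F) -> Result<String, String>
where
    F: Fn(&str) -> Option<String>,
{
    let mut out = String::new();
    let mut rest = input;
    while let Some(start) = rest.find("${") {
        // Copy everything before the placeholder unchanged.
        out.push_str(&rest[..start]);
        let after = &rest[start + 2..];
        let end = after
            .find('}')
            .ok_or_else(|| "unterminated placeholder".to_string())?;
        let name = &after[..end];
        let value = lookup(name)
            .ok_or_else(|| format!("environment variable `{name}` is not set"))?;
        out.push_str(&value);
        rest = &after[end + 1..];
    }
    out.push_str(rest);
    Ok(out)
}
```

In a real process the lookup closure would wrap std::env::var(name).ok(); injecting it keeps the function testable without touching the environment.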

Multi-file sharding lets operators split sensitive settings from declarative ones:

<config_dir>/plugins/slack/
  01-credentials.yaml   # api_token: "${SLACK_BOT_TOKEN}"
  02-channels.yaml      # channels: [...]
  03-allowlist.yaml     # rate limits per channel

Mappings deep-merge across files (later wins per key). Arrays are replaced in full rather than concatenated, so an operator override file completely substitutes the array from earlier files. Comment-only files and files without a .yaml / .yml extension are ignored.
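The merge rule is small enough to sketch with a toy value type. The Value enum and the map helper below are stand-ins invented for this example (the daemon works on serde_yaml values); only the rule itself, mappings merge per key and arrays are replaced, comes from the text.

```rust
use std::collections::BTreeMap;

/// Minimal stand-in for a YAML value, for illustration only.
#[derive(Clone, Debug, PartialEq)]
enum Value {
    Scalar(String),
    Seq(Vec<Value>),
    Map(BTreeMap<String, Value>),
}

/// Convenience constructor for mappings.
fn map(entries: Vec<(&str, Value)>) -> Value {
    Value::Map(entries.into_iter().map(|(k, v)| (k.to_string(), v)).collect())
}

/// Merges `over` into `base`: mappings merge key by key; scalars and
/// arrays are replaced wholesale by the later value.
fn deep_merge(base: &mut Value, over: Value) {
    match (base, over) {
        (Value::Map(b), Value::Map(o)) => {
            for (key, value) in o {
                match b.get_mut(&key) {
                    Some(slot) => deep_merge(slot, value),
                    None => {
                        b.insert(key, value);
                    }
                }
            }
        }
        // Any other combination: the later file wins outright.
        (slot, other) => *slot = other,
    }
}
```

Applying deep_merge once per file, in alphabetical filename order, reproduces the "later wins per key" behavior: an override file that sets only channels leaves api_token from an earlier file untouched but discards the earlier channels array entirely.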

Declare your config schema in nexo-plugin.toml:

[plugin.config]
schema_path = "config.schema.json"   # relative to plugin root
hot_reload = true                    # parsed; wiring lands in 81.4.b

The schema validator currently supports this JSON Schema subset: type / required / properties / additionalProperties / enum. Richer validation for oneOf / $ref / pattern is planned for a future 81.4.c slice; for now, those keywords are accepted but not enforced.

Inside your plugin, consume ctx.plugin_config (an Arc<serde_yaml::Value>):

let api_token = ctx
    .plugin_config
    .get("api_token")
    .and_then(serde_yaml::Value::as_str)
    .ok_or_else(|| anyhow::anyhow!("api_token missing"))?;

When the operator hasn't placed any config files, the value is an empty mapping — your plugin sees Value::Mapping(empty), not Null. Plugins with all-optional fields boot cleanly without operator action.

Contributing channel kinds

Phase 81.24 — subprocess plugins that declare [plugin.extends].channels = [...] automatically get a host-side RemoteChannelAdapter registered for each kind. The daemon's ChannelAdapterRegistry routes outbound dispatches to your subprocess via three JSON-RPC methods:

channel.start { kind, instance } — subscribe outbound topics + begin publishing inbound (default 30 s timeout)
channel.stop { kind } — release resources (30 s)
channel.send_outbound { kind, msg } — send one outbound message; reply with { message_id, sent_at_unix } (60 s)

Wire spec + error codes: Plugin contract §5.x.

Sketch (Rust subprocess plugin) — handle each request from your adapter's reader loop:

match method {
    "channel.start" => reply_ok(id, serde_json::json!({ "ok": true })),
    "channel.stop"  => reply_ok(id, serde_json::json!({ "ok": true })),
    "channel.send_outbound" => {
        let msg = params.get("msg").cloned().unwrap_or_default();
        // Forward `msg` to your provider's API; map the API's
        // response into OutboundAck.
        let ack = send_to_slack(msg).await?;
        reply_ok(id, serde_json::json!({
            "message_id": ack.id,
            "sent_at_unix": ack.ts,
        }));
    }
    _ => reply_method_not_found(id, method),
}

For typed errors (rate-limit, recipient invalid, etc.), reply with the channel-specific error codes from the contract table — the host's adapter maps them to ChannelAdapterError variants the agent runtime understands.

The matching SDK helpers (handle_channel_start / handle_channel_send_outbound etc.) ship in Phase 81.24.b. Until then, hand-handle the JSON-RPC frames using the SDK's existing primitives.

Contributing LLM providers

Phase 81.25 — subprocess plugins that declare [plugin.extends].llm_providers = [...] get one host-side RemoteLlmFactory registered into LlmRegistry per provider name. When the agent runtime resolves model.provider = "<name>", the factory builds a RemoteLlmClient that translates trait calls into llm.chat JSON-RPC requests over your subprocess plugin's stdio pipe.

[plugin.extends]
llm_providers = ["cohere", "mistral"]

Two modes supported on the wire:

Sync — params.stream = false; reply once with WireChatResponse (default 60 s timeout).
Streaming — params.stream = true; emit zero or more llm.chat.delta { request_id, chunk } notifications + one final response carrying usage / finish_reason (default 300 s timeout).

Wire spec + error codes: Plugin contract §5.y.

Sketch (Rust subprocess plugin) — handle llm.chat from your adapter's reader loop:

match method {
    "llm.chat" => {
        let provider = params["provider"].as_str().unwrap_or("");
        let stream = params["stream"].as_bool().unwrap_or(false);
        let request = serde_json::from_value::<WireChatRequest>(
            params["request"].clone()
        )?;
        if stream {
            // Emit zero or more deltas...
            send_notification("llm.chat.delta", serde_json::json!({
                "request_id": id,
                "chunk": { "type": "text_delta", "delta": "Hello" },
            }));
            // ...then the final response.
            reply_ok(id, /* WireChatResponse with usage + finish_reason */);
        } else {
            // Sync: call your provider's API, build WireChatResponse.
            let resp = call_my_provider(provider, &request).await?;
            reply_ok(id, resp);
        }
    }
    _ => reply_method_not_found(id, method),
}

For typed errors (rate-limit, auth failed, model not found), reply with the LLM-specific error codes from the contract table — the host's RemoteLlmClient surfaces them as anyhow::Error with operator-greppable messages.

The matching SDK helpers (PluginAdapter::handle_llm_chat, streaming sender, etc.) ship in Phase 81.25.b. Until then, hand-handle the JSON-RPC frames.

Contributing hook handlers

Phase 81.27 — subprocess plugins that declare [plugin.extends].hooks = [...] get one host-side RemoteHookHandler registered into HookRegistry per hook name. When the daemon fires that hook, the handler translates the call into a hook.on_hook JSON-RPC request over your subprocess plugin's stdio pipe.

[plugin.extends]
hooks = ["before_message", "after_message"]

Wire spec + error semantics: Plugin contract §5.z.

The reply shape is the existing HookResponse struct:

HookResponse {
    abort: bool,                      // legacy block signal
    reason: Option<String>,           // operator-readable
    override: Option<Value>,          // key-by-key mutation
    decision: Option<String>,         // "allow" | "block" | "transform"
    transformed_body: Option<String>, // for "transform"
    do_not_reply_again: bool,         // anti-loop signal
}

Sketch (Rust subprocess plugin) — handle each hook by name:

match method {
    "hook.on_hook" => {
        let hook_name = params["hook_name"].as_str().unwrap_or("");
        let event = params["event"].clone();
        let response = match hook_name {
            "before_message" => check_pii(&event)?,    // your logic
            "after_message"  => log_audit(&event)?,
            _ => HookResponse::default(),               // Continue
        };
        reply_ok(id, serde_json::to_value(&response)?);
    }
    _ => reply_method_not_found(id, method),
}

Continue-on-error semantics: the host swallows every dispatch failure (timeout, malformed reply, JSON-RPC error) and returns HookResponse::default() so the registry's fire loop keeps iterating. Failures are logged via tracing::warn! for operator debugging but never break the agent flow. This means:

Returning -32601 method_not_found for an unknown hook is fine — host logs + continues.
A hung subprocess hook eventually times out (5s default; NEXO_PLUGIN_HOOK_TIMEOUT_MS env override) and the agent proceeds.
Returning a malformed HookResponse still continues; only well-formed responses with abort: true or decision: "block" actually block.
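The rules above reduce to two small functions. The sketch below uses a local HookReply struct that mirrors only the two fields relevant to blocking; the struct, the function names, and the String error type are all invented for illustration and are not the host's actual types.

```rust
/// Local mirror of the reply fields that decide blocking. The real
/// HookResponse carries more fields; this one is illustrative only.
#[derive(Debug, Default, PartialEq)]
struct HookReply {
    abort: bool,
    decision: Option<String>,
}

/// Host-side view: any dispatch failure degrades to the default reply,
/// so a timeout or malformed frame can never block the agent.
fn effective_reply(dispatch: Result<HookReply, String>) -> HookReply {
    dispatch.unwrap_or_else(|err| {
        // Stand-in for the tracing::warn! the host emits.
        eprintln!("warn: hook dispatch failed, continuing: {err}");
        HookReply::default()
    })
}

/// Only a well-formed reply with `abort: true` or `decision: "block"` blocks.
fn blocks(reply: &HookReply) -> bool {
    reply.abort || reply.decision.as_deref() == Some("block")
}
```

The practical consequence for plugin authors: a hook that must block has to answer correctly and in time, because every failure mode is treated as "continue".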

Hooks fire on the message hot path — keep handler latency low (<50 ms typical). Use the decision: "transform" path sparingly: every transform rewrites the event payload for subsequent handlers.

Contributing memory backends

Phase 81.26 — subprocess plugins that declare [plugin.extends].memory_backends = [...] get one host-side RemoteVectorBackend registered into the daemon's VectorBackendRegistry per backend name. v1 covers VECTOR storage only — short/long-term memory keep their SQLite implementation; plugins replace only the vector index. Primary use case: Pinecone / Qdrant / Weaviate / pgvector.

[plugin.extends]
memory_backends = ["pinecone"]

Three wire methods (default timeouts 30s upsert/delete, 10s search):

memory.vector_upsert { backend, collection, records } → { count }
memory.vector_search { backend, collection, query } → { matches: [...] }
memory.vector_delete { backend, collection, ids } → { count }

NEXO_PLUGIN_MEMORY_TIMEOUT_MS env overrides all three.

Wire spec + error codes: Plugin contract §5.w.

Sketch (Rust subprocess plugin) — handle each method by name:

match method {
    "memory.vector_upsert" => {
        let collection = params["collection"].as_str().unwrap_or("");
        let records: Vec<VectorRecord> = serde_json::from_value(
            params["records"].clone()
        )?;
        let count = my_pinecone_client.upsert(collection, records).await?;
        reply_ok(id, serde_json::json!({"count": count}));
    }
    "memory.vector_search" => {
        let collection = params["collection"].as_str().unwrap_or("");
        let query: VectorQuery = serde_json::from_value(
            params["query"].clone()
        )?;
        let matches = my_pinecone_client.search(collection, query).await?;
        reply_ok(id, serde_json::json!({"matches": matches}));
    }
    "memory.vector_delete" => {
        let collection = params["collection"].as_str().unwrap_or("");
        let ids: Vec<String> = serde_json::from_value(
            params["ids"].clone()
        )?;
        let count = my_pinecone_client.delete(collection, ids).await?;
        reply_ok(id, serde_json::json!({"count": count}));
    }
    _ => reply_method_not_found(id, method),
}

For typed errors (collection-not-found, dimension-mismatch, rate-limited, write-failed), reply with the memory-specific error codes from the contract table — the host's RemoteVectorBackend surfaces them as anyhow::Error with operator-greppable messages.

v1 limitation: registered backends are NOT yet consumed at runtime — LongTermMemory.recall_vector still uses sqlite-vec. Operators audit registered backends today via wire.vector_backend_registry.names(). Consumer-side dispatch (agents.yaml.<id>.vector_backend = "pinecone") lands in Phase 81.26.b.

Contributing tools

Phase 81.29. Subprocess plugins can expose tools that the daemon's LLM picks via function-calling. Each tool name is listed in [plugin.extends].tools = [...], and the subprocess advertises the matching schema at handshake.

[plugin]
id = "browser"
# ... other manifest fields ...

[plugin.extends]
tools = ["browser_navigate", "browser_click"]

The subprocess MUST advertise these tools in the initialize-reply's tools array:

{
  "jsonrpc": "2.0",
  "id": 1,
  "result": {
    "manifest": { "plugin": { "id": "browser", ... } },
    "server_version": "browser-0.1.1",
    "tools": [
      {
        "name": "browser_navigate",
        "description": "Navigate to a URL",
        "input_schema": {
          "type": "object",
          "properties": { "url": { "type": "string" } },
          "required": ["url"]
        }
      }
    ]
  }
}

When the agent's LLM picks a tool the daemon issues a tool.invoke JSON-RPC request to the subprocess:

{
  "jsonrpc": "2.0",
  "id": 80,
  "method": "tool.invoke",
  "params": {
    "plugin_id": "browser",
    "tool_name": "browser_navigate",
    "args": { "url": "https://example.com" },
    "agent_id": "shopper"
  }
}

The plugin replies with the tool's result (typically a ToolResponse-shaped object). Errors use the -33401..=-33405 band documented in Plugin contract §5.t.

Validation rules (host-side):

Tool names MUST satisfy the per-plugin namespace policy from Phase 81.3 (<plugin_id>_* or ext_<plugin_id>_*).
The advertised tools array MUST be a subset of extends.tools — drift in this direction is a hard failure at handshake.
Manifest entries WITHOUT an advertised counterpart are tolerated but logged at warn; runtime calls yield -33401 ToolNotFound.
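The three host-side rules can be sketched as one check that runs at handshake. Function names and error strings below are hypothetical, and requiring at least one character after the namespace prefix is an assumption; the authoritative rules live in the plugin contract.

```rust
/// Namespace policy from the text: `<plugin_id>_*` or `ext_<plugin_id>_*`.
fn name_in_namespace(plugin_id: &str, tool: &str) -> bool {
    let prefixes = [format!("{plugin_id}_"), format!("ext_{plugin_id}_")];
    // Assumption: at least one character must follow the prefix.
    prefixes
        .iter()
        .any(|p| tool.len() > p.len() && tool.starts_with(p.as_str()))
}

/// Returns the declared tools that were never advertised (warn-only),
/// or an error for the hard failures.
fn check_handshake(
    plugin_id: &str,
    declared: &[&str],
    advertised: &[&str],
) -> Result<Vec<String>, String> {
    for tool in declared {
        if !name_in_namespace(plugin_id, tool) {
            return Err(format!("tool `{tool}` violates the namespace policy"));
        }
    }
    // Advertised tools must be a subset of extends.tools.
    for tool in advertised {
        if !declared.contains(tool) {
            return Err(format!("tool `{tool}` advertised but missing from extends.tools"));
        }
    }
    Ok(declared
        .iter()
        .copied()
        .filter(|d| !advertised.contains(d))
        .map(String::from)
        .collect())
}
```

With the browser manifest above, advertising only browser_navigate passes the check and reports browser_click as declared-but-unadvertised, which is the case the host logs at warn.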

Plugin-side responsibilities:

Validate args against the published input_schema before executing (defense in depth: the host already validates, but plugins should re-check).
Return -33402 ToolArgumentInvalid with details: <Value> pointing to the offending field if validation fails.
Return -33404 ToolUnavailable with data: { retry_after_ms: <u64> } for transient failures (rate-limits, locked resources).

Default timeout: 60 s (matches the LLM band — tools span fast browser_click to slow browser_navigate). Operator override via NEXO_PLUGIN_TOOL_TIMEOUT_MS.
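The *_TIMEOUT_MS overrides all follow the same pattern: a millisecond value from the environment, or a built-in default. The sketch below shows one plausible resolution rule; falling back to the default on unparsable or zero values is an assumption, and resolve_timeout is a hypothetical name.

```rust
use std::time::Duration;

/// Resolves a timeout from an optional env-var value in milliseconds,
/// falling back to `default` when the value is absent, unparsable, or
/// zero (the fallback policy is an assumption of this sketch).
fn resolve_timeout(raw: Option<&str>, default: Duration) -> Duration {
    raw.and_then(|s| s.trim().parse::<u64>().ok())
        .filter(|ms| *ms > 0)
        .map(Duration::from_millis)
        .unwrap_or(default)
}
```

A caller would feed it std::env::var("NEXO_PLUGIN_TOOL_TIMEOUT_MS").ok().as_deref() together with Duration::from_secs(60) for the tool band.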

v1 limitations — see follow-ups in FOLLOWUPS.md: streaming tools (chunked outputs via tool.invoke.delta), per-tool timeout knobs in manifest, SDK helper PluginAdapter::on_tool(name, handler).

Sandboxing your plugin

Phase 81.22 — Linux subprocess plugins can opt into bubblewrap isolation by declaring [plugin.sandbox] in nexo-plugin.toml. Default is disabled, so existing plugins keep today's behavior; opt in when you want defense-in-depth.

[plugin.sandbox]
enabled = true
network = "deny"                       # "deny" | "host"
fs_read_paths = ["/etc/ssl/certs"]    # absolute paths only
fs_write_paths = ["${state_dir}"]     # ${state_dir} = per-plugin
                                       # state root, the only safe
                                       # writable place by default
drop_user = true                       # nobody:nogroup uid mapping

Linux prereq: install bubblewrap (apt install bubblewrap on Debian/Ubuntu, available on Arch + Alpine + Fedora). The daemon discovers bwrap once at boot via PATH lookup. Without bwrap, sandbox-enabled plugins log a warning and run unsandboxed unless NEXO_PLUGIN_SANDBOX_REQUIRE=1 is set (in which case boot fails).

macOS: not yet enforced. The daemon logs a tracing::warn! at every spawn and runs the plugin without sandbox. Native sandbox-exec integration is deferred to follow-up 81.22.macos (deprecated-API risk noted).

Network policy:

network = "deny" → plugin runs in an isolated network namespace with only loopback. Use the host's daemon-mediated RPCs (llm.complete, memory.recall, broker.publish) for any external IO. Recommended default.
network = "host" → plugin shares the daemon's network. Operator must opt in via NEXO_PLUGIN_SANDBOX_HOST_NET_ALLOW=1; manifest validation rejects this otherwise. Use only when daemon-mediated RPCs cannot satisfy your plugin's IO needs.

Filesystem allowlist:

fs_read_paths = host paths bound read-only into the sandbox (bwrap --ro-bind). Common: /etc/ssl/certs for outbound TLS verification.
fs_write_paths = host paths bound read-write (bwrap --bind). The literal ${state_dir} token expands at spawn time to <state_root>/plugins/<plugin_id>, your plugin's per-instance scratch space. It is the only token recognized, and it is valid only in fs_write_paths.
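As a rough sketch of the path rules, the two functions below expand ${state_dir} for write paths and reject any token in read paths. The function names, the error wording, and the /var/lib/nexo state root used in the usage note are assumptions; only the token semantics and the absolute-path requirement come from the text.

```rust
/// Expands the `${state_dir}` token in a write path. Any other `${...}`
/// token is rejected, since `${state_dir}` is the only one recognized.
fn expand_write_path(entry: &str, state_root: &str, plugin_id: &str) -> Result<String, String> {
    let state_dir = format!("{state_root}/plugins/{plugin_id}");
    let expanded = entry.replace("${state_dir}", &state_dir);
    if expanded.contains("${") {
        return Err(format!("unknown token in `{entry}`"));
    }
    if !expanded.starts_with('/') {
        return Err(format!("`{entry}` is not an absolute path"));
    }
    Ok(expanded)
}

/// Read paths accept no tokens at all and must be absolute.
fn check_read_path(entry: &str) -> Result<&str, String> {
    if entry.contains("${") {
        return Err(format!("tokens are not allowed in fs_read_paths: `{entry}`"));
    }
    if !entry.starts_with('/') {
        return Err(format!("`{entry}` is not an absolute path"));
    }
    Ok(entry)
}
```

For a plugin with id slack and a hypothetical state root of /var/lib/nexo, the entry ${state_dir} would expand to /var/lib/nexo/plugins/slack.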

Hard denylist (compile-time, not configurable): allowlist entries that equal or include any of these paths are rejected at manifest validation:

/etc/shadow, /etc/sudoers, /etc/sudoers.d
/proc/sys, /proc/kcore, /proc/kallsyms
/sys/firmware, /sys/kernel
/dev/mem, /dev/kmem, /dev/port
/var/run/docker.sock, /run/docker.sock, /private/var/run/docker.sock
/root, /boot
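The "equal or include" rule maps naturally onto component-wise path comparison. The sketch below implements exactly that stated rule with std::path; whether the daemon also rejects entries located inside a denied directory (for example /root/.ssh) is not stated, so this sketch does not cover it. The names DENYLIST and is_denied are hypothetical.

```rust
use std::path::Path;

/// The denylist from the text, copied verbatim.
const DENYLIST: &[&str] = &[
    "/etc/shadow", "/etc/sudoers", "/etc/sudoers.d",
    "/proc/sys", "/proc/kcore", "/proc/kallsyms",
    "/sys/firmware", "/sys/kernel",
    "/dev/mem", "/dev/kmem", "/dev/port",
    "/var/run/docker.sock", "/run/docker.sock", "/private/var/run/docker.sock",
    "/root", "/boot",
];

/// True when `entry` equals a denied path or is an ancestor of one, so
/// that binding it into the sandbox would expose the denied path.
fn is_denied(entry: &str) -> bool {
    let entry = Path::new(entry);
    // Path::starts_with compares whole components, so "/etc/sha" does
    // not count as a prefix of "/etc/shadow".
    DENYLIST
        .iter()
        .any(|denied| Path::new(denied).starts_with(entry))
}
```

Under this rule /etc/ssl/certs is accepted, while /etc is rejected because it includes /etc/shadow.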

Operator capability env knobs:

Env var	Effect
`NEXO_PLUGIN_SANDBOX_REQUIRE=1`	Refuse to spawn any plugin without `sandbox.enabled = true`.
`NEXO_PLUGIN_SANDBOX_HOST_NET_ALLOW=1`	Permit `network = "host"` manifests. Default off.

Recommended pattern: enabled = true, network = "deny", fs_write_paths = ["${state_dir}"], no fs_read_paths unless your plugin truly needs to read host config (e.g. CA bundles for TLS). Use daemon-mediated RPCs for everything else.

v1 out of scope — see follow-ups in FOLLOWUPS.md: granular network egress allowlist (81.22.b), per-syscall seccomp filters (81.22.c), nexo agent doctor plugins sandbox section (81.22.d), native macOS sandbox-exec (81.22.macos).

Future capability extensions

Phase 81.28 — subprocess plugins that contribute new channel kinds, LLM providers, memory backends, or HookInterceptor IDs declare them via an additive [plugin.extends] manifest section:

[plugin.extends]
channels         = ["slack"]              # paired with Phase 81.24 wrapper
llm_providers    = ["cohere"]             # paired with Phase 81.25
memory_backends  = ["pinecone"]           # paired with Phase 81.26
hooks            = ["pii_redact"]         # paired with Phase 81.27

Each list names the IDs the plugin contributes. Validation rules + the canonical schema live in Plugin contract §2.1. Daemon dispatch wiring (actually populating the matching registry slots) ships per-registry across Phase 81.24-27 — the schema is shipped today so subprocess plugin authors can declare intent ahead of those wrappers landing.

Local dev loop conventions

nexo plugin run <path> — boots the daemon with one local plugin overriding discovery; the rest of the system (broker, agents, channels) runs as configured.
nexo plugin run <path> --no-daemon-config — same, but clears cfg.agents.agents so the plugin runs in isolation for contract debugging.
Rebuild → respawn — Phase 81.10 hot-reload re-walks search_paths periodically, so a fresh cargo build triggers the daemon to respawn the subprocess automatically. No --watch flag yet (Phase 31.7.b deferred).

Next steps

Rust SDK — full Rust API + manifest example.
Python SDK, TypeScript SDK, PHP SDK — language-specific references with the same shape.
Plugin contract — wire spec; read this once and you can debug any SDK.
Patterns (8 common shapes) — pre-baked designs for channel plugins, pollers, hybrid bridges.
Publishing a plugin — asset naming convention + 4-job CI workflow shape.
Signing & publishing — cosign keyless tutorial that operators on --require-signature need.
Plugin trust (trusted_keys.toml) — operator-side verification policy your readers will configure to trust your releases.

Plugin manifest (Phase 81.13 unified)

Phase 81.13 unified the framework's two manifest parsers (nexo-extensions::manifest Phase 11 + nexo-plugin-manifest Phase 31.5+) into a single source of truth. Plugin authors now ship one TOML manifest at the plugin root that declares both the legacy contributions (tools / hooks / channels / providers / pollers) AND the modern admin RPC + HTTP server capabilities.

Filename

The canonical filename is plugin.toml. The framework also accepts nexo-plugin.toml as a legacy fallback for one deprecation cycle so existing plugins keep loading without an immediate rename. When both files are present in the same plugin root, plugin.toml wins and the daemon emits a warning.

Plugins authored after 81.13 should ship plugin.toml only.
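
The filename precedence above can be sketched as a small pure function. This is an illustrative sketch only: `ManifestChoice` and `choose_manifest` are hypothetical names, not part of the framework's API, and the real loader inspects the filesystem rather than taking booleans.

```rust
/// Which manifest file a loader would pick for one plugin root.
/// (Hypothetical type, introduced only for this illustration.)
#[derive(Debug, PartialEq, Eq)]
pub struct ManifestChoice {
    pub filename: &'static str,
    /// True when both files exist and the legacy one is being ignored,
    /// which is the case where the daemon is described as emitting a warning.
    pub warn_shadowed_legacy: bool,
}

/// Apply the documented precedence: `plugin.toml` wins, `nexo-plugin.toml`
/// is the legacy fallback, and no manifest means nothing to load.
pub fn choose_manifest(has_plugin_toml: bool, has_legacy_toml: bool) -> Option<ManifestChoice> {
    match (has_plugin_toml, has_legacy_toml) {
        (true, legacy) => Some(ManifestChoice {
            filename: "plugin.toml",
            warn_shadowed_legacy: legacy,
        }),
        (false, true) => Some(ManifestChoice {
            filename: "nexo-plugin.toml",
            warn_shadowed_legacy: false,
        }),
        (false, false) => None,
    }
}
```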

Versioning

The TOML root may carry a manifest_version integer:

omitted or 1 → legacy v1 shape (flat [capabilities], [transport], [meta], [mcp_servers], [outbound_bindings], [context], [requires]). The parser auto-translates to v2 in memory and emits a one-shot deprecation warn per plugin.
2 → canonical Phase 81.13 shape. New plugins should set this explicitly to opt out of the deprecation warn.

Unknown values produce a clear parse error.
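
A minimal sketch of the version dispatch described above, assuming the TOML root has already been parsed and `manifest_version` extracted as an optional integer. `ManifestShape` and `detect_shape` are hypothetical names, and the error text is illustrative rather than the framework's actual message.

```rust
/// The two manifest shapes the parser distinguishes (hypothetical enum).
#[derive(Debug, PartialEq, Eq)]
pub enum ManifestShape {
    /// Omitted or `1`: flat legacy layout, auto-translated with a deprecation warn.
    LegacyV1,
    /// `2`: canonical Phase 81.13 layout.
    CanonicalV2,
}

/// Map the optional `manifest_version` integer to a shape, rejecting unknown values.
pub fn detect_shape(manifest_version: Option<i64>) -> Result<ManifestShape, String> {
    match manifest_version {
        None | Some(1) => Ok(ManifestShape::LegacyV1),
        Some(2) => Ok(ManifestShape::CanonicalV2),
        Some(other) => Err(format!(
            "unsupported manifest_version {other}; expected 1 or 2"
        )),
    }
}
```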

ID regex

Plugin ids match ^[a-z][a-z0-9_-]{0,63}$ (lowercase, starts with a letter, followed by up to 63 letters, digits, underscores, or hyphens, for a maximum length of 64). Both agent_creator and agent-creator styles are valid; the framework normalises neither, so plugin authors get to pick.

Reserved ids that no plugin can claim (defended at boot): agent, browser, core, email, heartbeat, memory, telegram, whatsapp.
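
The id rule and the reserved list can be checked without a regex dependency. The sketch below hand-rolls the pattern from this section; the function names are hypothetical and the real validator may be structured differently.

```rust
/// Ids the section lists as reserved for the framework itself.
const RESERVED_IDS: [&str; 8] = [
    "agent", "browser", "core", "email", "heartbeat", "memory", "telegram", "whatsapp",
];

/// Hand-rolled equivalent of `^[a-z][a-z0-9_-]{0,63}$`.
pub fn matches_id_pattern(id: &str) -> bool {
    let mut chars = id.chars();
    // The first character must exist and be a lowercase ASCII letter.
    let first_ok = matches!(chars.next(), Some(c) if c.is_ascii_lowercase());
    // The remaining characters must come from the allowed body set.
    let body_ok = chars.all(|c| c.is_ascii_lowercase() || c.is_ascii_digit() || c == '_' || c == '-');
    // All accepted characters are ASCII, so byte length equals character count.
    first_ok && body_ok && id.len() <= 64
}

/// True when a plugin may claim the id: well-formed and not reserved.
pub fn is_claimable_id(id: &str) -> bool {
    matches_id_pattern(id) && !RESERVED_IDS.contains(&id)
}
```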

Where the legacy fields land

Pre-81.13 plugins kept their plugin.toml flat (Phase 11 shape). Those still parse — the compat layer translates each section as follows:

v1 location	v2 location
`[plugin]`	`[plugin]` (renames `min_agent_version` → `min_nexo_version`)
`[capabilities]`	`[plugin.capabilities]`
`[capabilities.admin]`	`[plugin.capabilities.admin]`
`[capabilities.http_server]`	`[plugin.capabilities.http_server]`
`[transport]` (`kind = "stdio"`)	`[plugin.entrypoint]`
`[transport]` (`kind = "nats"` \| `"http"`)	DROPPED with warn
`[meta]`	`[plugin.meta]`
`[requires]` (`bins`/`env`)	DROPPED with warn (preserved in 81.13.b)
`[mcp_servers]` (top-level)	DROPPED with warn
`[outbound_bindings]` (top-level)	DROPPED with warn
`[context]`	DROPPED with warn
`[plugin] priority`	DROPPED with warn
`[capabilities] tools/hooks/channels/providers/pollers`	DROPPED with warn

The "DROPPED" entries don't break boot — the parser logs the list of legacy fields it saw + skipped per plugin. Consumers that needed those fields keep reading them via the legacy nexo-extensions::manifest::ExtensionManifest::from_path path, which still parses the v1 shape directly.
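
The translation table can be read as a lookup from a v1 section to its v2 home. The sketch below encodes the section-level rows; `V2Target` and `translate_v1_section` are hypothetical names, and the nats/http transport outcome is taken from the deferred list for 81.13.b, which names `transport.kind=nats|http` among the fields the migrator currently drops.

```rust
/// Where a v1 section ends up after the compat translation (hypothetical enum).
#[derive(Debug, PartialEq, Eq)]
pub enum V2Target {
    /// Section is carried over under this v2 table path.
    Moved(&'static str),
    /// Section is skipped with a warning.
    Dropped,
}

/// Look up the v2 destination for a v1 section name.
/// `transport_kind` only matters for the `transport` section.
/// Returns `None` for names the table does not cover.
pub fn translate_v1_section(section: &str, transport_kind: Option<&str>) -> Option<V2Target> {
    use V2Target::{Dropped, Moved};
    let target = match section {
        "plugin" => Moved("plugin"),
        "capabilities" => Moved("plugin.capabilities"),
        "capabilities.admin" => Moved("plugin.capabilities.admin"),
        "capabilities.http_server" => Moved("plugin.capabilities.http_server"),
        "transport" => match transport_kind {
            Some("stdio") => Moved("plugin.entrypoint"),
            Some("nats") | Some("http") => Dropped,
            _ => return None,
        },
        "meta" => Moved("plugin.meta"),
        "requires" | "mcp_servers" | "outbound_bindings" | "context" => Dropped,
        _ => return None,
    };
    Some(target)
}
```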

Single-file canonical example

manifest_version = 2

[plugin]
id               = "agent_creator"
version          = "0.0.35"
name             = "Agent Creator"
description      = "Operator UI microapp."
min_nexo_version = ">=0.1.0"

[plugin.entrypoint]
command = "./agent-creator"
args    = []

[plugin.capabilities.admin]
required = ["agents_crud", "skills_crud", "llm_keys_crud"]
optional = ["channels_crud", "auth_rotate", "secrets_write"]

[plugin.capabilities.http_server]
port        = 8765
bind        = "127.0.0.1"
token_env   = "AGENT_CREATOR_TOKEN"
health_path = "/healthz"

[plugin.meta]
author     = "Cristian García"
license    = "MIT OR Apache-2.0"
homepage   = "https://example.com"
repository = "https://github.com/x/y"

Pre-81.13 example (still valid via compat)

[plugin]
id          = "agent-creator"
version     = "0.0.34"
name        = "Agent Creator"
description = "Operator UI microapp."

[capabilities]
tools = ["agent_list", "agent_get"]
hooks = ["before_message"]

[capabilities.admin]
required = ["agents_crud"]
optional = ["channels_crud"]

[transport]
kind    = "stdio"
command = "./agent-creator"

[meta]
author = "Cristian"

The framework parses this as manifest_version = 1 (auto-detected), translates it to v2 in memory, and emits a deprecation warn once at boot. Operators can migrate at their own pace.

Deferred (sub-phase 81.13.b)

Preserve legacy mcp_servers / outbound_bindings / context / requires.bins+env / capabilities.tools+hooks+channels+providers+pollers / transport.kind=nats|http / plugin.priority in the canonical v2 shape so the migrator stops dropping them.
Hard removal of nexo-plugin.toml filename + manifest_version = 1 mode (target: 0.2.0).
JSON-Schema export for editor autocomplete (mirrors OpenClaw's openclaw.plugin.json).

`[plugin.config_schema]` (Phase 93.1)

Plugins ship their config contract inside their own manifest so the daemon never hardcodes a per-plugin field. Optional — plugins without a config block (or those still on typed cfg.plugins.X through the Phase 93.5 deprecation window) omit the section.

# Multi-instance plugin (telegram, whatsapp, …).
[plugin.config_schema]
shape = "array"
schema = """{
  "type": "object",
  "properties": {
    "instance":      { "type": "string" },
    "bot_token_env": { "type": "string" },
    "enabled":       { "type": "boolean" }
  },
  "required": ["instance", "bot_token_env"]
}"""

# Single-instance plugin (email, browser, google).
[plugin.config_schema]
shape = "object"
schema = """{
  "type": "object",
  "properties": {
    "imap_host":    { "type": "string" },
    "smtp_host":    { "type": "string" },
    "username_env": { "type": "string" }
  },
  "required": ["imap_host", "smtp_host", "username_env"]
}"""

Fields

Field	Type	Default	Meaning
`schema`	JSON-Schema string	—	Draft-07 subset (see `config_schema`). Root MUST be `"type":"object"` even when `shape = "array"` — the schema describes ONE element.
`shape`	`"object"` \| `"array"`	—	YAML wire shape at `cfg.plugins.<plugin_id>`. `object` = single map; `array` = `Vec<map>` for multi-instance plugins.
`hot_reload`	boolean	`true`	Mirrors [`ConfigSection::hot_reload`]; plugins set `false` only if config touches state that requires a restart.

Static validation

cargo run --bin nexo manifest validate rejects malformed schemas with ManifestError::PluginConfigInvalidSchema { plugin_id, reason }:

Empty schema string.
schema is not valid JSON.
schema parses to a JSON value that is not an object.
Root object's "type" is not "object".

Validation of operator YAML against the schema runs at boot (Phase 93.2), using the same lightweight validator that already ships for install-time microapp config (Phase 83.17).

Runtime delivery (Phase 93.2)

Once schema validation passes, the host calls the plugin's NexoPlugin::configure(value) async hook with the operator's YAML slice. The trait method has a default no-op so plugins that haven't migrated keep working through the Phase 93.5 deprecation window.

Subprocess plugins receive the same value over their stdio JSON-RPC channel as a plugin.configure request:

{
  "jsonrpc": "2.0",
  "id": 3,
  "method": "plugin.configure",
  "params": { "value": <operator-YAML-as-JSON> }
}

The host BUFFERS the value during the brief window between configure(value) and the child's spawn completing; the buffered value is delivered automatically after initialize acks. Plugin SDKs should treat plugin.configure as re-entrant — hot-reload sends a fresh request when the operator's YAML changes.
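
The buffering behaviour can be modelled as a tiny state machine. This is a sketch under stated assumptions, not the host's implementation: `ConfigureBuffer` is a hypothetical type, the value is carried as a plain string for brevity, and "the latest buffered value wins" is an assumption the text above does not spell out.

```rust
/// Holds a configure value until the child has acknowledged `initialize`.
/// (Hypothetical type; the real host state is not shown in this document.)
#[derive(Default)]
pub struct ConfigureBuffer {
    initialized: bool,
    pending: Option<String>,
}

impl ConfigureBuffer {
    /// Called when the host wants to configure the plugin.
    /// Returns `Some(value)` when it can be sent immediately, or `None`
    /// when the child is still spawning and the value was buffered.
    pub fn configure(&mut self, value: String) -> Option<String> {
        if self.initialized {
            Some(value)
        } else {
            // Assumption: a newer value replaces an older buffered one.
            self.pending = Some(value);
            None
        }
    }

    /// Called once the child acks `initialize`.
    /// Returns the buffered value, if any, so the caller can deliver it.
    pub fn on_initialize_ack(&mut self) -> Option<String> {
        self.initialized = true;
        self.pending.take()
    }
}
```

After the ack, every later `configure` call returns its value straight away, which matches the re-entrant hot-reload path.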

Three error categories from PluginConfigureError:

Variant	Source	Meaning
`SchemaValidation`	host	Operator YAML failed `[plugin.config_schema]` walker before the plugin ran.
`PluginRejected`	plugin	Plugin's own runtime check (typed deserialise, secret resolve, probe) failed.
`SubprocessRpc`	host	Subprocess plugin didn't ack `plugin.configure` (transport, timeout, error).

Configure-then-init is the boot order: init's registrations may inspect what configure accepted, so the plugin sees a consistent world from the first init call onward.

Operator config delivery (Phase 93.3)

The daemon walks <config_dir>/plugins/*.yaml at boot and feeds each file into cfg.plugins.entries.<plugin_id> keyed by the filename stem. A new community plugin lands by dropping <config_dir>/plugins/slack.yaml; the daemon discovers it, matches the file's stem to manifest.plugin.id == "slack", and routes the parsed value into NexoPlugin::configure(value). Zero daemon-side edits.

Outer-wrapper-key stripping is conservative; it applies only when the YAML's single top-level key matches the filename stem:

# config/plugins/slack.yaml
slack:                            # ← stripped
  token_env: SLACK_BOT_TOKEN

becomes entries["slack"] == { token_env: SLACK_BOT_TOKEN }. If the operator's top-level key doesn't match the stem, the whole mapping is preserved verbatim and the plugin decides how to interpret it.
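
The strip rule is easy to get subtly wrong, so here is a minimal sketch over a toy YAML value type. `Yaml`, `map_of`, and `strip_wrapper` are hypothetical names introduced for illustration; the daemon works on real parsed YAML values.

```rust
use std::collections::BTreeMap;

/// Toy stand-in for a parsed YAML value (hypothetical, for illustration only).
#[derive(Debug, Clone, PartialEq)]
pub enum Yaml {
    Scalar(String),
    Map(BTreeMap<String, Yaml>),
}

/// Convenience constructor for a mapping.
pub fn map_of(entries: Vec<(&str, Yaml)>) -> Yaml {
    Yaml::Map(entries.into_iter().map(|(k, v)| (k.to_string(), v)).collect())
}

/// Strip the outer key only when it is the single top-level key
/// and equals the filename stem; otherwise return the value untouched.
pub fn strip_wrapper(stem: &str, parsed: Yaml) -> Yaml {
    match parsed {
        Yaml::Map(mut map) if map.len() == 1 && map.contains_key(stem) => {
            map.remove(stem).expect("key presence checked by the guard")
        }
        other => other,
    }
}
```

A mapping with two top-level keys is left alone even when one of them matches the stem, which is the conservative part of the rule.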

discovery.yaml is filtered out (framework-internal). Parse failures log tracing::warn! on the plugins.config target and skip the file; other plugins still boot. The init loop emits a tracing::info! when both entries.<id> and the legacy plugins/<id>/*.yaml subdir are populated, which makes the deprecation-window state visible to operators; entries always wins.
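
How a config file maps to a plugin id can be sketched with `std::path` alone. `plugin_id_for_config` is a hypothetical helper; it assumes only the `.yaml` extension is walked (per the `*.yaml` glob above) and that `discovery.yaml` is the single filtered name.

```rust
use std::path::Path;

/// Derive the `cfg.plugins.entries` key for one file under `<config_dir>/plugins/`.
/// Returns `None` for non-YAML files and for the framework-internal `discovery.yaml`.
pub fn plugin_id_for_config(path: &Path) -> Option<String> {
    // Only `*.yaml` files are considered.
    if path.extension()?.to_str()? != "yaml" {
        return None;
    }
    // The filename stem is the plugin id the entry is keyed by.
    let stem = path.file_stem()?.to_str()?;
    if stem == "discovery" {
        return None;
    }
    Some(stem.to_string())
}
```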

`[plugin.pairing.adapter]` — manifest-driven pairing adapter

Phase 81.33.b.real (Stage 1 of the plugin auto-discovery design). Status: shipped 2026-05-15.

Plugins that expose a channel with a DM-pairing flow declare the broker dispatch contract for the daemon's PairingChannelAdapter in nexo-plugin.toml. The daemon constructs a GenericBrokerPairingAdapter from the manifest and registers it into the shared PairingAdapterRegistry. This replaces the previous design where the daemon had to hardcode Arc::new(<plugin>PairingAdapter::new(broker)) blocks for every canonical plugin.

Manifest section

[plugin.pairing.adapter]
channel_id           = "whatsapp"
broker_topic_prefix  = "plugin.whatsapp"
# Optional knobs:
# format_challenge_text_kind = "broker"   # default: trait's built-in formatter
# normalize_cache_ttl_seconds = 3600      # default: unbounded

Required fields:

channel_id — stable string id matching what the gate stores in pairing_pending.channel and pairing_allow_from.channel. The registry uses this as the key under register(). Plugins that also ship [plugin.pairing] UI (kind = "qr" | "form" | "info" | "custom") should use the same channel_id so the registry + UI agree.
broker_topic_prefix — broker subject prefix the daemon dispatches JSON-RPC requests under (see contract below).

Optional fields:

format_challenge_text_kind — "default" (the formatter the trait already supplies) or "broker" (asks the plugin per challenge). The default value is "default"; channels needing custom challenge formatting (e.g. Telegram MarkdownV2 escape) flip to "broker".
normalize_cache_ttl_seconds — TTL for normalize_sender cache entries. None (default) = unbounded (cache lives the daemon's lifetime). Set it when the plugin can return different canonical forms for the same raw sender over time.

Broker JSON-RPC dispatch contract

Daemon publishes JSON-RPC request messages on <broker_topic_prefix>.pairing.<method>. Plugin subscribes, handles, replies. All payloads are JSON.

`<prefix>.pairing.normalize_sender`

Request. { "raw": "<raw-sender>" }.

Reply. { "normalized": "<canonical>" } to accept, or { "normalized": null } to reject (the gate treats reject as Decision::Drop).

Examples:

Plugin	Input	Output
whatsapp	`573001112222@c.us`	`+573001112222`
whatsapp	`573001112222@s.whatsapp.net`	`+573001112222`
telegram	`@User_Name`	`@user_name`
telegram	`12345678`	`12345678`
telegram	`not_a_handle`	`null` (reject)
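
The example rows can be reproduced with two small functions. These are illustrative approximations written to match the table only; the canonical plugins' normalization logic is not shown in this document and may accept or reject more inputs than this sketch does.

```rust
/// Sketch of a WhatsApp normalizer: `<digits>@c.us` or `<digits>@s.whatsapp.net`
/// becomes `+<digits>`; anything else is rejected.
pub fn normalize_whatsapp(raw: &str) -> Option<String> {
    let (user, domain) = raw.split_once('@')?;
    let known_domain = domain == "c.us" || domain == "s.whatsapp.net";
    let all_digits = !user.is_empty() && user.chars().all(|c| c.is_ascii_digit());
    if known_domain && all_digits {
        Some(format!("+{user}"))
    } else {
        None
    }
}

/// Sketch of a Telegram normalizer: `@Handle` is lowercased, a numeric id
/// passes through unchanged, and anything else is rejected.
pub fn normalize_telegram(raw: &str) -> Option<String> {
    if let Some(handle) = raw.strip_prefix('@') {
        let valid = !handle.is_empty()
            && handle.chars().all(|c| c.is_ascii_alphanumeric() || c == '_');
        return valid.then(|| format!("@{}", handle.to_ascii_lowercase()));
    }
    let numeric = !raw.is_empty() && raw.chars().all(|c| c.is_ascii_digit());
    numeric.then(|| raw.to_string())
}
```

A `None` result corresponds to the `{ "normalized": null }` reply, which the gate treats as a drop.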

The daemon caches (raw → normalized) in memory. Cache hits do NOT round-trip the broker. With normalize_cache_ttl_seconds unset, the cache never evicts, so its size is bounded only by the distinct-sender count; typically < 10⁴ entries.
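
A minimal sketch of such a cache with the optional TTL, assuming the broker round trip is abstracted as a closure and the clock is passed in so the behaviour is testable. `NormalizeCache` is a hypothetical type, and caching rejections alongside accepts is an assumption of this sketch.

```rust
use std::collections::HashMap;
use std::time::{Duration, Instant};

/// In-memory (raw -> normalized) cache with an optional TTL.
pub struct NormalizeCache {
    ttl: Option<Duration>,
    entries: HashMap<String, (Option<String>, Instant)>,
}

impl NormalizeCache {
    /// `ttl_seconds = None` mirrors the unset manifest knob: entries never expire.
    pub fn new(ttl_seconds: Option<u64>) -> Self {
        Self {
            ttl: ttl_seconds.map(Duration::from_secs),
            entries: HashMap::new(),
        }
    }

    /// Return the cached value when fresh; otherwise call `fetch`
    /// (standing in for the broker RPC) and store its result.
    pub fn get_or_fetch(
        &mut self,
        raw: &str,
        now: Instant,
        fetch: impl FnOnce(&str) -> Option<String>,
    ) -> Option<String> {
        if let Some((value, stored_at)) = self.entries.get(raw) {
            let fresh = match self.ttl {
                None => true,
                Some(ttl) => now.duration_since(*stored_at) < ttl,
            };
            if fresh {
                return value.clone();
            }
        }
        let value = fetch(raw);
        self.entries.insert(raw.to_string(), (value.clone(), now));
        value
    }
}
```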

`<prefix>.pairing.send_reply`

Request. { "account": "<inst>", "to": "<sender>", "text": "<challenge>" }.

Reply. { "ok": true } or { "ok": false, "error": "<message>" }.

Delivers the pairing challenge text. account is the multi-instance discriminator (operators set it via config/plugins/<channel>.yaml's instance field).

`<prefix>.pairing.send_qr_image`

Request. { "account": "<inst>", "to": "<sender>", "png_base64": "<base64>" }.

Reply. { "ok": true } or { "ok": false, "error": "<message>" }.

Plugin decodes the base64 PNG and delivers it as a media message. Plugins whose channel cannot send media should reply { "ok": false, "error": "channel does not support media" }. The trait's default impl already bails for plugins that haven't overridden it, but an explicit failure with a clear error helps operators diagnose.

`<prefix>.pairing.format_challenge_text` (only when `format_challenge_text_kind = "broker"`)

Request. { "code": "<setup-code>" }.

Reply. { "text": "<formatted-challenge>" }.

When the manifest sets format_challenge_text_kind = "broker" the daemon issues this RPC instead of using the trait's built-in formatter. Used for channels that need plugin-specific escaping (Telegram MarkdownV2, Discord embed JSON, …).

Migration path

Plugins migrate by adding the manifest section in their next release. Until then the daemon's legacy hardcoded build_known_pairing_registry() registration (gated by #[cfg(feature = "plugin-<id>")]) continues to serve. When a plugin ships the manifest section, the registry's register() overwrites by channel_id so the generic adapter wins without operator action.

Canonical plugins

nexo-plugin-whatsapp — pending. When shipped, the daemon's L1654 Arc::new(WhatsappPairingAdapter::new(broker)) becomes dead code (still cfg-gated, removable in a follow-up).
nexo-plugin-telegram — pending. Same shape, L1657.
Future plugins (signal, matrix, sms, discord) — drop manifest in plugins.discovery.search_paths; no daemon edit needed.

Implementing the broker side (plugin authors)

Subprocess plugins built on the nexo-microapp-sdk register the three handlers (normalize_sender, send_reply, send_qr_image) during PluginAdapter::init. Reference implementations land in the next nexo-plugin-whatsapp + nexo-plugin-telegram release; until then plugins can mirror the canonical <channel>PairingAdapter::normalize_sender Rust logic into the broker handler verbatim.

Daemon-side wiring

SubprocessNexoPlugin::build_pairing_adapter() (crates/core/src/agent/nexo_plugin_registry/subprocess.rs) checks self.cached_manifest.plugin.pairing.adapter. When present, returns Some(Arc::new(GenericBrokerPairingAdapter::from_manifest(broker, section))).

The daemon's boot loop (src/main.rs:6416) iterates every loaded plugin handle and calls build_pairing_adapter(broker.clone()). Same loop runs in the hot-spawn path (src/main.rs:7224+). Dispatch-ctx mode (boot_dispatch_ctx_if_enabled / autonomous-worker) uses only the legacy hardcoded registry today; threading plugin_handles_cell into that function is a follow-up if dispatch-ctx flows grow pairing-aware hooks.

Validation

cargo build --profile release-fast --bin nexo (default) — 3m09s.
cargo build --profile release-fast --bin nexo --no-default-features — 2m50s (nexo-plugin-whatsapp + nexo-plugin-telegram absent from the compile graph; manifest-section parsing still compiles because it's in nexo-plugin-manifest).
cargo nextest run --workspace — 6280/6280 (4 new tests for GenericBrokerPairingAdapter).
cargo nextest run -p nexo-pairing — 67/67.
cargo nextest run --no-default-features -p nexo-rs — 105/105.

Trade-offs

Concern	Decision
Sync `normalize_sender` blocking on broker	`tokio::task::block_in_place` + `Handle::block_on`. Requires multi-threaded runtime (every daemon hot path qualifies). Cache makes the cost a one-time miss per unique sender.
Cache eviction	Default unbounded; TTL knob if needed. Pairing-only volume keeps unbounded safe at scale.
Custom challenge text per channel	Optional `format_challenge_text_kind = "broker"`. Default uses the trait's built-in formatter (covers 90% of channels).
`channel_id` `'static` lifetime	`Box::leak` at construction. One-time leak per plugin per daemon run; bounded by plugin count.
Manifest schema change without daemon recompile	New optional fields use `#[serde(default)]`; backward compatible. New required fields = manifest schema version bump.

`[plugin.pairing.trigger]` — manifest-driven pairing trigger

Phase 81.20.x Stage 7 Phase 2. Status: introduced 2026-05-16.

Plugins that own a QR-pairing pump (e.g. WhatsApp wa-agent, future Signal QR-link) declare which admin RPC methods the daemon should forward pairing/start and pairing/cancel to. The daemon constructs a BrokerPairingTrigger (in nexo-pairing) from the manifest and inserts it into the dispatcher's PairingChannelTriggers map under the same channel_id as the sibling [plugin.pairing.adapter] section.

This replaces the previous design where the daemon had to import nexo_plugin_whatsapp::pairing_trigger::WhatsappPairingTrigger and call from_configs(...) — daemon no longer needs to link the plugin crate to drive its pump.

Manifest section

[plugin.pairing]
kind  = "qr"
label = "WhatsApp"

[plugin.pairing.trigger]
start_method  = "nexo/admin/whatsapp/pairing/start"
cancel_method = "nexo/admin/whatsapp/pairing/cancel"
# Optional knob:
# timeout_seconds = 120   # default: PAIRING_DEFAULT_TIMEOUT (180s)

Required fields:

start_method — full admin RPC method name the daemon forwards the pump-start request to. MUST live under the plugin's own [plugin.admin] method_prefix (e.g. "nexo/admin/<plugin_id>/pairing/start"). Forwarded via PluginAdminRouter so the plugin's existing admin subscriber pipeline serves it.
cancel_method — full admin RPC method name for pump cancellation. Same routing rules as start_method.

Optional fields:

timeout_seconds — per-call broker forward timeout. Defaults to PAIRING_DEFAULT_TIMEOUT (180s), the upper bound for the whole pairing handshake.

Compatibility constraints

[plugin.pairing.trigger] is only valid with kind = "qr". Form and Info kinds are operator-driven and need no remote pump. Custom kinds use their own nexo/notify/<rpc_namespace>/status_changed channel. Manifest validation rejects mismatched combinations at boot.
start_method and cancel_method MUST be non-empty.
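
The constraints on the trigger section can be sketched as one validation pass. `TriggerError`, `validate_trigger`, and `forward_timeout` are hypothetical names; the error variants are not the framework's, and whether the real validator enforces the method-prefix rule at manifest time (rather than leaving it to routing) is an assumption here.

```rust
use std::time::Duration;

/// Illustrative error categories for a `[plugin.pairing.trigger]` check.
#[derive(Debug, PartialEq, Eq)]
pub enum TriggerError {
    KindNotQr,
    EmptyMethod,
    OutsideAdminPrefix,
}

/// Check the documented rules: kind must be "qr", both methods must be
/// non-empty, and both must live under the plugin's admin method prefix.
pub fn validate_trigger(
    kind: &str,
    admin_method_prefix: &str,
    start_method: &str,
    cancel_method: &str,
) -> Result<(), TriggerError> {
    if kind != "qr" {
        return Err(TriggerError::KindNotQr);
    }
    for method in [start_method, cancel_method] {
        if method.trim().is_empty() {
            return Err(TriggerError::EmptyMethod);
        }
        if !method.starts_with(admin_method_prefix) {
            return Err(TriggerError::OutsideAdminPrefix);
        }
    }
    Ok(())
}

/// Resolve the per-call forward timeout, falling back to the 180s default.
pub fn forward_timeout(timeout_seconds: Option<u64>) -> Duration {
    Duration::from_secs(timeout_seconds.unwrap_or(180))
}
```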

Broker JSON-RPC dispatch contract

The daemon's BrokerPairingTrigger.start(ctx):

Forwards start_method via PluginAdminRouter, passing { challenge_id, agent_id, instance } as params.
Plugin's admin handler spawns its pump (wa-agent QR pump for whatsapp). The handler returns immediately — the pump runs in the subprocess.
The plugin publishes QR and state updates on plugin.inbound.<channel_id>.<instance>.pairing.qr and plugin.inbound.<channel_id>.<instance>.pairing.state (new contract). The daemon's single generic subscriber consumes both and calls ctx.store.update_qr + notify_status.
On pairing/cancel, daemon forwards cancel_method via the same router. Plugin tears down the pump cleanly.

Inbound topics (plugin → daemon)

plugin.inbound.<channel_id>.<instance>.pairing.qr
plugin.inbound.<channel_id>.<instance>.pairing.state

QR payload: { "challenge_id": "<uuid>", "png_base64": "<base64>", "rotates_at": "<rfc3339>" }.

State payload: { "challenge_id": "<uuid>", "state": "Linked" | "Error" | "Pending", "error": "<msg-if-Error>" }.
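
For plugin authors, the two subjects and the state values reduce to a formatter and a small enum. The names below (`pairing_topic`, `PairingState`, `parse_state`) are hypothetical helpers for illustration, not SDK API.

```rust
/// Build an inbound pairing subject; `leaf` is "qr" or "state".
pub fn pairing_topic(channel_id: &str, instance: &str, leaf: &str) -> String {
    format!("plugin.inbound.{channel_id}.{instance}.pairing.{leaf}")
}

/// The three values the state payload's `state` field may carry.
#[derive(Debug, PartialEq, Eq)]
pub enum PairingState {
    Linked,
    Error,
    Pending,
}

/// Parse the `state` string; unknown values are rejected rather than guessed.
pub fn parse_state(state: &str) -> Option<PairingState> {
    match state {
        "Linked" => Some(PairingState::Linked),
        "Error" => Some(PairingState::Error),
        "Pending" => Some(PairingState::Pending),
        _ => None,
    }
}
```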

Migration path

Plugins migrate by adding the manifest section in their next release. Until a plugin ships the section, the daemon falls back to its legacy hardcoded WhatsappPairingTrigger import (gated by #[cfg(feature = "plugin-whatsapp")]). Once the plugin declares the section in its manifest, the boot loop's generic registration overwrites by channel_id, so the broker trigger wins without operator action.

Canonical plugins

nexo-plugin-whatsapp — pending v0.4.4. When shipped, the daemon's WhatsappPairingTrigger::from_configs registration in src/main.rs (cfg-gated) becomes dead code; removable once v0.4.4 lands on crates.io.
Future pairing channels (signal, matrix, sms with QR-link) — drop manifest into plugins.discovery.search_paths; no daemon edit needed.

Implementing the broker side (plugin authors)

Subprocess plugins built on the nexo-microapp-sdk register the start_method and cancel_method handlers via their existing [plugin.admin] subscriber (the broker topic prefix is <plugin.admin.broker_topic_prefix>.<suffix>, where <suffix> is the trailing portion of the admin method after the plugin's prefix).

Reference impl lands in nexo-plugin-whatsapp v0.4.4. The handler should:

Read { challenge_id, agent_id, instance } from params.
Spawn an async task that drives the pump (wa-agent's pair_with_callback).
As QR frames rotate, publish to plugin.inbound.whatsapp.<instance>.pairing.qr.
On connect success, publish state Linked. On terminal error, publish state Error with the error message.
Return { "ok": true } from the admin RPC reply (the start was accepted; success/failure flows through the inbound topics).

Cancellation: the daemon forwards cancel_method with { challenge_id }. Handler aborts the spawned pump task and returns { "ok": true }.

Validation

cargo nextest run -p nexo-plugin-manifest — manifest schema + validator tests.
cargo nextest run -p nexo-pairing — BrokerPairingTrigger unit tests.
cargo nextest run -p nexo-rs — daemon boot-loop integration.

Trade-offs

Concern	Decision
Daemon import of plugin crate for trigger	Eliminated — the broker trigger forwards via existing `PluginAdminRouter`.
Reuse of `[plugin.admin]` topic vs. dedicated `[plugin.pairing.trigger]` topic	Reuse. Cleaner: one less manifest section, one less subscriber to maintain in the plugin. Admin handler already exists in canonical plugins.
Plugin needs to push QR back to daemon	New `plugin.inbound.<channel>.<inst>.pairing.{qr,state}` broker topics. Daemon's single generic subscriber consumes both.
Backwards compat for plugins without the section	Daemon falls back to the legacy hardcoded trigger registration. Removed after the canonical plugin ships the new manifest.

`[plugin.public_tunnel]` — manifest-driven Cloudflare quick tunnel

Phase 81.20.x Stage 7 Phase 2. Status: introduced 2026-05-16.

Plugins that expose HTTP routes the operator might want to reach from outside the LAN (e.g. WhatsApp pairing while the daemon runs on a desktop and the QR is scanned on a phone) declare this section. The daemon spawns a Cloudflare quick tunnel pointed at its HTTP port; plugin routes exposed via [plugin.http] become reachable at https://*.trycloudflare.com/<plugin-mount-prefix>/....

Two-key opt-in

The daemon opens a public tunnel only when both are true:

Plugin manifest: [plugin.public_tunnel] enabled = true.
Operator capability env: NEXO_PLUGIN_PUBLIC_TUNNEL_ALLOW=1.

A manifest declaration alone is not enough — the daemon still honours the operator's hardening choice. Declaring the section with enabled = false (or omitting it) keeps the plugin forever-private even if the operator flips the env on for another plugin.
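
The two-key rule is a plain conjunction, sketched below. `tunnel_permitted` is a hypothetical helper, and treating only the literal value `1` as "on" is an assumption based on how the env var is written in this section.

```rust
/// Decide whether a tunnel may be opened for one plugin.
/// `manifest_enabled` is `None` when the section is omitted.
/// `allow_env` is the value of `NEXO_PLUGIN_PUBLIC_TUNNEL_ALLOW`, if set.
pub fn tunnel_permitted(manifest_enabled: Option<bool>, allow_env: Option<&str>) -> bool {
    // Key 1: the plugin opted in through its manifest.
    let plugin_opt_in = manifest_enabled.unwrap_or(false);
    // Key 2: the operator opted in through the capability env var.
    let operator_opt_in = allow_env == Some("1");
    plugin_opt_in && operator_opt_in
}
```

In a daemon the second argument would typically come from `std::env::var("NEXO_PLUGIN_PUBLIC_TUNNEL_ALLOW").ok().as_deref()`.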

Manifest section

[plugin.public_tunnel]
enabled = true
# Optional — when set, the daemon subscribes to this exact broker
# subject and closes the tunnel on the first published message.
# Wildcards (`*`, `>`) are rejected at manifest validation.
close_on_event = "plugin.lifecycle.whatsapp.tunnel_done"

Fields

enabled (bool, default false) — plugin-side opt-in. When false, the section behaves identically to "no section declared".
close_on_event (string, optional) — literal broker subject. When set, the daemon subscribes once and tears the tunnel down on the first inbound message. Typical use: a pairing-channel plugin publishes <plugin>.tunnel_done after the operator completes pairing so the public URL stops responding immediately. When None, the tunnel stays up for the daemon's lifetime.

Validation

Rejected at boot:

close_on_event is the empty string or whitespace (PublicTunnelCloseEventEmpty).
close_on_event contains a NATS wildcard (* or >). The daemon refuses wildcards so a stray plugin event can't race-close a healthy tunnel (PublicTunnelCloseEventWildcard).
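
Both rejection rules fit in a few lines. The variant names come from this section; the enclosing enum name `PublicTunnelError` and the function `validate_close_on_event` are hypothetical.

```rust
/// Boot-time rejections for `close_on_event` (enum name is illustrative).
#[derive(Debug, PartialEq, Eq)]
pub enum PublicTunnelError {
    PublicTunnelCloseEventEmpty,
    PublicTunnelCloseEventWildcard,
}

/// Validate the optional `close_on_event` subject.
/// `None` (field omitted) is valid: the tunnel then stays up for the daemon's lifetime.
pub fn validate_close_on_event(close_on_event: Option<&str>) -> Result<(), PublicTunnelError> {
    let subject = match close_on_event {
        Some(subject) => subject,
        None => return Ok(()),
    };
    // Empty or whitespace-only subjects are rejected.
    if subject.trim().is_empty() {
        return Err(PublicTunnelError::PublicTunnelCloseEventEmpty);
    }
    // NATS wildcards would let unrelated events close the tunnel.
    if subject.contains('*') || subject.contains('>') {
        return Err(PublicTunnelError::PublicTunnelCloseEventWildcard);
    }
    Ok(())
}
```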

URL sidecar file

When a tunnel comes up, the daemon writes the public URL to $NEXO_HOME/state/tunnel.url (or ~/.nexo/state/tunnel.url). The nexo pair start CLI reads this file directly so the operator doesn't have to copy/paste from logs.
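
Tools that want to read the sidecar file need the same path resolution. A minimal sketch, assuming `$NEXO_HOME` takes precedence and `~/.nexo` is the fallback; `tunnel_url_path` is a hypothetical helper and takes both directories as arguments so it stays free of environment lookups.

```rust
use std::path::{Path, PathBuf};

/// Resolve the sidecar file location.
/// `nexo_home` is the value of `$NEXO_HOME` when set; `user_home` is the
/// operator's home directory used for the `~/.nexo` fallback.
pub fn tunnel_url_path(nexo_home: Option<&Path>, user_home: &Path) -> PathBuf {
    let base = match nexo_home {
        Some(dir) => dir.to_path_buf(),
        None => user_home.join(".nexo"),
    };
    base.join("state").join("tunnel.url")
}
```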

Migration from `wa_tunnel_cfg`

Earlier daemon revisions extracted public_tunnel from each WhatsApp YAML entry under config/plugins/whatsapp.yaml and spawned a Cloudflare tunnel inline. That orchestration was removed (Phase 81.20.x Stage 7 Phase 2) — plugins now declare the intent in their manifest, the daemon iterates uniformly, and the operator's env flag stays authoritative.

Daemon-side wiring

main.rs iterates wire.plugin_handles after wire_plugin_registry returns. For every plugin with [plugin.public_tunnel] enabled = true, the daemon:

Spawns nexo_tunnel_quick::TunnelManager::new(8080).start().
Logs the URL + writes it to the sidecar file.
If close_on_event is set, spawns a broker subscriber that awaits one message then calls tunnel.shutdown().await.

The capability gate is checked once at the top of the iterator block. When OFF, the iterator logs a single informational line ("declared by at least one plugin but env is not set") and skips every tunnel spawn — useful for hardened deployments that want a visible audit trail.

Threat model

https://*.trycloudflare.com/<plugin-prefix>/... is reachable from anywhere the URL is shared. Cloudflare provides DDoS protection + edge TLS but does NOT enforce authentication. The plugin's HTTP handler is responsible for any access control on the exposed paths.

Pairing pages are time-limited (the QR rotates roughly every 20s, and the challenge expires in 5 minutes). When close_on_event is set, the tunnel is torn down immediately after pairing, so the public URL stops responding the moment the plugin signals completion.

Validation commands

cargo nextest run -p nexo-plugin-manifest — manifest schema + validator tests.
Manual smoke: NEXO_PLUGIN_PUBLIC_TUNNEL_ALLOW=1 cargo run --bin nexo with a manifest declaring the section. Daemon log should show "public tunnel up" with the assigned URL.

`[plugin.http]` — daemon-proxied HTTP routes

Phase 81.33.b.real Stage 2 (Layer 8 of the plugin auto-discovery design). Status: shipped 2026-05-15.

Plugins that need to expose HTTP endpoints to operators or external callers declare a mount prefix in their nexo-plugin.toml. The daemon's HTTP server (port :8080) matches every request against the registered prefixes and forwards matches to the plugin's subprocess via broker JSON-RPC. The plugin handles internal routing under the prefix.

Distinct from [plugin.http_server], which advertises a plugin-bound port the daemon does NOT proxy.

Manifest section

[plugin.http]
mount_prefix     = "/whatsapp"
# Optional:
# timeout_seconds = 60     # default: 30

Fields:

mount_prefix (required) — path prefix the daemon mounts. Must start with /. Cannot contain ? or #. The daemon matches path.starts_with(mount_prefix).
timeout_seconds (optional) — per-request broker RPC timeout. Default 30s. Plugins serving slow flows (image generation, OAuth dances) extend.
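
The field rules translate directly into a small check. `validate_mount_prefix` and `http_timeout_seconds` are hypothetical helpers, and the error strings are illustrative rather than the daemon's actual messages.

```rust
/// Check the syntactic rules for `mount_prefix`.
pub fn validate_mount_prefix(prefix: &str) -> Result<(), &'static str> {
    if !prefix.starts_with('/') {
        return Err("mount_prefix must start with '/'");
    }
    if prefix.contains('?') || prefix.contains('#') {
        return Err("mount_prefix cannot contain '?' or '#'");
    }
    Ok(())
}

/// Resolve the per-request broker RPC timeout, falling back to the 30s default.
pub fn http_timeout_seconds(timeout_seconds: Option<u64>) -> u64 {
    timeout_seconds.unwrap_or(30)
}
```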

Broker JSON-RPC contract

Daemon → plugin on plugin.<id>.http.request:

{
  "method": "GET",
  "path": "/whatsapp/pair",
  "query": "instance=default",
  "headers": [["Host", "127.0.0.1:8080"], ["User-Agent", "..."]],
  "body_base64": ""
}

Plugin replies:

{
  "status": 200,
  "headers": [["Content-Type", "text/html; charset=utf-8"]],
  "body_base64": "<base64-encoded body bytes>"
}

Body bytes are base64-encoded so binary payloads (PNGs, PDFs, small file uploads) round-trip cleanly through JSON. The daemon decodes server-side and writes raw bytes back to the TCP stream.
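
To make the encoding concrete, here is a dependency-free sketch of standard base64 (RFC 4648 alphabet with `=` padding) that a plugin could use to fill `body_base64`. Assuming the standard padded alphabet is an inference from the examples in this document; in practice a maintained crate is the more sensible choice than hand-rolled code.

```rust
/// Standard base64 alphabet (RFC 4648, section 4).
const ALPHABET: &[u8; 64] = b"ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";

/// Encode raw bytes as padded base64.
pub fn base64_encode(bytes: &[u8]) -> String {
    let mut out = String::with_capacity((bytes.len() + 2) / 3 * 4);
    for chunk in bytes.chunks(3) {
        // Pack up to three bytes into a 24-bit group, zero-filling the tail.
        let b0 = chunk[0] as u32;
        let b1 = *chunk.get(1).unwrap_or(&0) as u32;
        let b2 = *chunk.get(2).unwrap_or(&0) as u32;
        let group = (b0 << 16) | (b1 << 8) | b2;
        // Emit four 6-bit symbols, replacing unused ones with '=' padding.
        out.push(ALPHABET[((group >> 18) & 63) as usize] as char);
        out.push(ALPHABET[((group >> 12) & 63) as usize] as char);
        if chunk.len() > 1 {
            out.push(ALPHABET[((group >> 6) & 63) as usize] as char);
        } else {
            out.push('=');
        }
        if chunk.len() > 2 {
            out.push(ALPHABET[(group & 63) as usize] as char);
        } else {
            out.push('=');
        }
    }
    out
}
```

An empty body encodes to the empty string, which matches the `"body_base64": ""` field in the request example.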

Route matching

The router stores all prefixes in longest-first order. A request matches the most-specific prefix:

Registered prefixes	Request	Matched plugin
`/api`, `/api/v1`	`/api/v1/users`	`/api/v1` plugin
`/api`, `/api/v1`	`/api/v2/users`	`/api` plugin
`/api`	`/health`	none (fallthrough)

A miss falls through to the daemon's legacy hardcoded paths (/health, /metrics, etc.).
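
Longest-first matching can be sketched with a sorted vector and a linear scan. `PrefixRouter` is a hypothetical type for illustration; it uses the same `starts_with` test the daemon is described as using and leaves reserved-prefix checks out.

```rust
/// Minimal longest-prefix router: (mount_prefix, plugin_id) pairs,
/// kept sorted so longer prefixes are tried first.
pub struct PrefixRouter {
    routes: Vec<(String, String)>,
}

impl PrefixRouter {
    pub fn new() -> Self {
        Self { routes: Vec::new() }
    }

    /// Register a prefix and re-sort so the most specific prefix wins.
    pub fn register(&mut self, mount_prefix: &str, plugin_id: &str) {
        self.routes.push((mount_prefix.to_string(), plugin_id.to_string()));
        self.routes.sort_by(|a, b| b.0.len().cmp(&a.0.len()));
    }

    /// Return the plugin id owning `path`, or `None` to fall through
    /// to the daemon's own handlers.
    pub fn route(&self, path: &str) -> Option<&str> {
        self.routes
            .iter()
            .find(|(prefix, _)| path.starts_with(prefix.as_str()))
            .map(|(_, plugin_id)| plugin_id.as_str())
    }
}
```

Registration order does not matter: registering `/api` before or after `/api/v1` yields the same match for `/api/v1/users`.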

Reserved prefixes. The router refuses to register any plugin mount under these daemon-internal paths:

/health — liveness checks
/metrics — Prometheus scrape
/pair — daemon's WS pairing companion
/admin — admin RPC surface (port 9091, not 8080, but reserved here too for safety against future shared-port designs)
/.well-known — protocol probes (RFC 8615)

A plugin manifest declaring any of these (or a sub-path like /health/foo) is rejected at registration with a warn-level log; the plugin's broker handler stays unhooked for that prefix and the daemon's internal handler serves uninterrupted.

Plugins choosing non-reserved prefixes (/whatsapp, /oauth, /api/v1/..., /healthy, etc.) register normally.
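
The difference between `/health/foo` (rejected) and `/healthy` (accepted) comes down to matching on path segments rather than raw string prefixes. A minimal sketch, with `is_reserved_mount` as a hypothetical helper:

```rust
/// Daemon-internal paths listed as reserved in this section.
const RESERVED_PREFIXES: [&str; 5] = ["/health", "/metrics", "/pair", "/admin", "/.well-known"];

/// True when `mount_prefix` equals a reserved path or is a sub-path of one.
/// A sibling such as `/healthy` is not reserved because the remainder after
/// the reserved prefix does not begin a new path segment.
pub fn is_reserved_mount(mount_prefix: &str) -> bool {
    RESERVED_PREFIXES.iter().any(|reserved| {
        mount_prefix == *reserved
            || mount_prefix
                .strip_prefix(*reserved)
                .map_or(false, |rest| rest.starts_with('/'))
    })
}
```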

Error rendering

Failure	Status	Body
Broker timeout (plugin slow / unresponsive)	504 Gateway Timeout	`{"error":"plugin gateway timeout"}`
Plugin replied with malformed JSON	502 Bad Gateway	`{"error":"plugin reply malformed"}`
Plugin replied with `status: 500` etc.	passes through	plugin's body verbatim

The daemon writes the plugin's headers + body_base64 verbatim for non-error replies — the plugin owns the response shape.
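
The error table maps onto a three-way match. `ProxyOutcome` and `render` are hypothetical names for illustration, and headers are omitted for brevity; the status codes and JSON bodies are the ones listed in the table.

```rust
/// What happened when the daemon forwarded one request (hypothetical enum).
pub enum ProxyOutcome {
    /// The broker RPC timed out.
    BrokerTimeout,
    /// The plugin replied, but the reply was not valid JSON.
    MalformedReply,
    /// The plugin replied with a well-formed envelope (body already decoded).
    PluginReply { status: u16, body: Vec<u8> },
}

/// Turn an outcome into the (status, body) pair written to the client.
pub fn render(outcome: ProxyOutcome) -> (u16, Vec<u8>) {
    match outcome {
        ProxyOutcome::BrokerTimeout => {
            (504, br#"{"error":"plugin gateway timeout"}"#.to_vec())
        }
        ProxyOutcome::MalformedReply => {
            (502, br#"{"error":"plugin reply malformed"}"#.to_vec())
        }
        // Plugin-chosen statuses, including 5xx, pass through untouched.
        ProxyOutcome::PluginReply { status, body } => (status, body),
    }
}
```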

Limitations (Stage 2)

No streaming responses. SSE / chunked transfer / progressive rendering NOT supported. Plugin must buffer the full response before replying. Use [plugin.http_server] (own port) for streaming endpoints.
No WebSocket upgrades. The daemon's /pair WS handshake is daemon-internal (Phase 87 pairing companion). Plugins wanting WS endpoints bind their own port.
Body size cap. Daemon reads up to 16KB upfront. Larger bodies require streaming, which Stage 2 does not support.
No request-id header injection. Plugins should generate their own trace ids if needed; the daemon does not auto-stamp X-Request-Id.

Implementing the plugin side

Subprocess plugins built on the Rust nexo-microapp-sdk register a broker handler on plugin.<plugin_id>.http.request:

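// Sketch (final SDK helpers ship alongside the next plugin release).
// Assumes `subscribe` yields a stream of broker messages; `next()` comes
// from `futures::StreamExt`.
let mut requests = ctx.broker
    .subscribe("plugin.<id>.http.request")
    .await?;
while let Some(msg) = requests.next().await {
    // Decode the daemon's request envelope.
    let req: HttpRequest = serde_json::from_value(msg.payload)?;
    // Route under the mount prefix; `my_router` is owned by the plugin.
    let res = my_router.dispatch(&req).await;
    // Answer on the reply subject so the daemon can complete the HTTP call.
    if let Some(reply_to) = msg.reply_to {
        ctx.broker
            .publish(&reply_to, Event::new(/* ... */, res))
            .await?;
    }
}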

Reference impl lands with the next nexo-plugin-whatsapp release; until then plugins copy the dispatch logic from the daemon's current nexo_plugin_whatsapp::pairing::dispatch_route.

Migration status

Canonical plugin crates have NOT yet shipped a manifest revision adding [plugin.http]. Daemon's legacy hardcoded if let Some(rest) = path.strip_prefix("/whatsapp/") block (gated by #[cfg(feature = "plugin-whatsapp")] via 93.12.c.2) continues to serve. When a plugin ships the section, the router matches before the legacy block and the plugin's broker handler takes over without operator action; the legacy block becomes dead code (still cfg-gated, removable in a follow-up after the canonical plugins migrate).

Validation

cargo build --profile release-fast --bin nexo (default) — 3m11s.
cargo build --profile release-fast --bin nexo --no-default-features — 2m53s.
cargo nextest run --workspace — 6294/6294 (8 new tests in plugin_http::tests covering router matching, response decoding, broker error path).
cargo nextest run --no-default-features -p nexo-rs — 105/105.
cargo nextest run -p nexo-pairing — 75/75.

Trade-offs

Concern	Decision
Sync HTTP handler blocking on broker	Daemon-side handler is `async` already; broker RPC stays async. No sync→async bridge needed.
Binary body via base64	Acceptable for pairing pages (≤100KB) + OAuth callbacks. Streaming follow-up if large-upload demand appears.
Header passthrough	Daemon forwards all request headers; reply sets `Content-Type` from plugin response. Cookies need explicit `Set-Cookie` from plugin.
Mount prefix conflicts with daemon routes	Daemon reserves `/health`, `/metrics`, `/pair`, `/admin`, `/.well-known`. Router registration rejects any plugin asking for these prefixes or sub-paths (logged at warn level). A plugin cannot hijack health checks or admin RPCs even if its manifest declares the matching prefix.

`[plugin.admin]` — daemon-forwarded admin RPC commands

Phase 81.33.b.real Stage 4 (Layer 9 of the plugin auto-discovery design). Status: shipped 2026-05-15.

Plugins that expose admin RPC commands (bot inspectors, account listers, channel-specific control planes) declare a method prefix in their nexo-plugin.toml. The daemon's admin dispatcher matches every incoming method against the registered prefixes and forwards matches to the plugin's subprocess via broker JSON-RPC. The plugin handles internal dispatch.

Replaces the previous pattern where each plugin needed a hardcoded .with_<plugin>_handle(Arc<dyn XxxHandle>) builder method on the dispatcher (e.g. the legacy with_wa_bot_handle integration for nexo/admin/whatsapp/bot/*).

Manifest section

[plugin.admin]
method_prefix       = "nexo/admin/whatsapp/"
broker_topic_prefix = "plugin.whatsapp.admin"
# Optional:
# timeout_seconds = 30

Fields:

method_prefix (required) — admin RPC method prefix the plugin owns. Must start with nexo/admin/ and end with /. Example: nexo/admin/whatsapp/.
broker_topic_prefix (required) — broker subject prefix the daemon forwards under. Example: plugin.whatsapp.admin.
timeout_seconds (optional) — per-method broker RPC timeout. Default 30s.

Broker JSON-RPC contract

Daemon → plugin on <broker_topic_prefix>.<verb>:

{
  "method": "nexo/admin/whatsapp/bot/list",
  "params": { "agent_id": "kate" }
}

Plugin replies:

{ "ok": true,  "result": { "bots": [...] } }
{ "ok": false, "error": "session not connected" }

The daemon emits the broker subject by stripping method_prefix from the incoming method, replacing / with ., and appending to broker_topic_prefix. So nexo/admin/whatsapp/bot/list → bot/list → bot.list → plugin.whatsapp.admin.bot.list.
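The translation above can be sketched as a small pure function. The name method_to_subject and the choice to return None for a non-matching or empty remainder are assumptions made for illustration, not the daemon's actual API:

```rust
/// Hypothetical helper: derive the broker subject for an admin RPC method
/// by following the strip / replace / append steps described above.
fn method_to_subject(
    method_prefix: &str,
    broker_topic_prefix: &str,
    method: &str,
) -> Option<String> {
    // Step 1: strip the plugin's method prefix. A method outside the
    // prefix does not belong to this plugin.
    let rest = method.strip_prefix(method_prefix)?;
    // Assumption: a bare prefix with no verb is treated as "no match".
    if rest.is_empty() {
        return None;
    }
    // Step 2: replace `/` with `.`. Step 3: append to the topic prefix.
    Some(format!("{}.{}", broker_topic_prefix, rest.replace('/', ".")))
}
```

Calling it with the prefixes from the manifest example and the method nexo/admin/whatsapp/bot/list yields plugin.whatsapp.admin.bot.list.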

Reserved prefixes

The router refuses to register any plugin whose method_prefix collides with daemon-internal admin handlers. The reserved list:

nexo/admin/agents/
nexo/admin/credentials/
nexo/admin/pairing/
nexo/admin/llm/
nexo/admin/channels/
nexo/admin/tenants/
nexo/admin/memory/
nexo/admin/sessions/
nexo/admin/snapshots/
nexo/admin/policy/

Comparison is bidirectional — nexo/admin/agents/sneaky/ (subpath of reserved) AND nexo/admin/ (super-prefix that would shadow reserved) are both rejected. Plugins choosing non-reserved prefixes (nexo/admin/whatsapp/, nexo/admin/signal/, nexo/admin/oauth/) register normally.

Rejected registrations log at warn level; the daemon's internal handlers continue to serve uninterrupted.
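A minimal sketch of the bidirectional comparison, assuming plain string-prefix matching on the reserved list above. The function and constant names are hypothetical:

```rust
/// Reserved admin prefixes, copied from the list above.
const RESERVED_ADMIN_PREFIXES: &[&str] = &[
    "nexo/admin/agents/",
    "nexo/admin/credentials/",
    "nexo/admin/pairing/",
    "nexo/admin/llm/",
    "nexo/admin/channels/",
    "nexo/admin/tenants/",
    "nexo/admin/memory/",
    "nexo/admin/sessions/",
    "nexo/admin/snapshots/",
    "nexo/admin/policy/",
];

/// Hypothetical check: true when `method_prefix` is a subpath of a
/// reserved prefix OR a super-prefix that would shadow one.
fn collides_with_reserved(method_prefix: &str) -> bool {
    RESERVED_ADMIN_PREFIXES.iter().any(|reserved| {
        // Subpath of reserved: nexo/admin/agents/sneaky/
        method_prefix.starts_with(reserved)
            // Super-prefix of reserved: nexo/admin/
            || reserved.starts_with(method_prefix)
    })
}
```

Checking both directions in one predicate keeps the rule symmetric: an exact match satisfies both branches and is rejected as well.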

Capability enforcement

Plugin admin methods that match the router are gated by the existing channels_crud capability (reused so operators already granted channel admin can call plugin admin without a fresh capability). Per-plugin capability grants are a follow-up when finer control is needed.

Error rendering

Failure	Status
Broker timeout (plugin slow / unresponsive)	`AdminRpcError::Internal("plugin admin forward failed: broker error: ...")` → JSON-RPC error code `-32603`
Plugin replied with malformed JSON	same
Plugin replied with `ok: false`	`AdminRpcError::Internal("<plugin error message>")`

The daemon does NOT translate plugin error strings; the typed error surfaces verbatim through the JSON-RPC response.

Migration status

The legacy with_wa_bot_handle builder on AdminRpcDispatcher is preserved. Until nexo-plugin-whatsapp ships a manifest revision with [plugin.admin], the daemon's hardcoded nexo/admin/whatsapp/bot/list + bot/send match arms continue to serve. When whatsapp ships the section, the router matches BEFORE the typed match arms (generic forward wins) and the legacy block becomes dead code (cfg-gated, removable in follow-up).

Implementing the plugin side

Subprocess plugins built on nexo-microapp-sdk subscribe to <broker_topic_prefix>.> and dispatch:

// Sketch (final SDK helpers ship with the next plugin release):
ctx.broker
    .subscribe("plugin.whatsapp.admin.>")
    .await?
    .for_each(|msg| async {
        let topic = msg.topic.strip_prefix("plugin.whatsapp.admin.").unwrap();
        match topic {
            "bot.list" => handle_bot_list(msg.payload).await,
            "bot.send" => handle_bot_send(msg.payload).await,
            _ => reply_err(msg, "method not found"),
        }
    });

Validation

cargo build --release-fast --bin nexo (default) — 3m10s clean.
cargo build --release-fast --bin nexo --no-default-features — 2m53s clean.
cargo nextest run --workspace — 6312/6312 (9 new tests in plugin_admin::tests).
cargo nextest run -p nexo-pairing — covers router matching, reserved-prefix rejection, subpath/super-prefix safety, method-to-broker-suffix translation, broker error path.
cargo nextest run --no-default-features -p nexo-rs — 105/105.
cargo nextest run --no-default-features -p nexo-setup — 317/317.

Trade-offs

Concern	Decision
Per-plugin capability gates	Reuse `channels_crud`. Follow-up if finer grant needed.
Plugin reply schema	`{ ok: bool, result: Value, error: String }`. Single envelope; plugin owns the `result` shape per method.
Method-to-topic translation	`/` → `.` mechanical. Plugin's broker handler design is straightforward.
Reserved-prefix safety	Bidirectional collision check (both subpath AND super-prefix rejected).
Interior-mutability router	`Arc<PluginAdminRouter>` with internal `RwLock<Vec<Route>>` so daemon can populate AFTER `wire_plugin_registry` returns without rebuilding the dispatcher.

`[plugin.metrics]` — Prometheus scrape contribution

Phase 81.33.b.real Stage 5 (Layer 11 of the plugin auto-discovery design). Status: shipped 2026-05-15.

Plugins exposing Prometheus metrics declare a broker topic the daemon scrapes on every /metrics HTTP request. The plugin's subprocess handles the scrape, returns Prometheus text, and the daemon concatenates it into the aggregate response.

Replaces the previous pattern where each plugin's metrics call was hardcoded inside src/main.rs::run_metrics_server (e.g. the legacy nexo_plugin_email::metrics::render_prometheus(...) direct call).

Manifest section

[plugin.metrics]
prometheus          = true
broker_topic_prefix = "plugin.email"
# Optional:
# timeout_seconds = 5

Fields:

prometheus (default false) — opt the plugin into the /metrics scrape loop.
broker_topic_prefix (required when prometheus = true) — daemon publishes to <broker_topic_prefix>.metrics.scrape.
timeout_seconds (optional, default 5s) — per-scrape broker RPC timeout. Scrapes happen per /metrics HTTP request so the daemon-side latency budget is tight; plugins exceeding the timeout warn-log + contribute nothing for that scrape.

Broker JSON-RPC contract

Daemon → plugin on <broker_topic_prefix>.metrics.scrape:

{}

Plugin replies:

{ "text": "# HELP <metric> ...\n# TYPE ...\n<metric> <value>\n..." }

Empty / missing text is treated as a successful scrape with no metrics. The daemon does NOT validate Prometheus text shape — plugin owns the surface entirely. Adding a trailing newline is optional; the daemon appends \n if missing.

Daemon-side aggregation

run_metrics_server (src/main.rs:15097+) concatenates from:

nexo_core::telemetry::render_prometheus(nats_open) — daemon-internal counters.
nexo_llm::telemetry::render_prometheus() — LLM provider stats.
nexo_mcp::telemetry::render_prometheus() + server-side dispatch metrics.
nexo_poller::telemetry::render_prometheus() — Gmail / generic poller counters.
nexo_plugin_email::metrics::render_prometheus(...) — legacy direct call, kept until email plugin migrates to broker scrape.
nexo_tunnel_quick::metrics::render_prometheus_for(...) — tunnel supervisor counters.
Stage 5: nexo_pairing::plugin_metrics::scrape_all(...) — every plugin that declared [plugin.metrics] prometheus = true.

Order matters for Prometheus scrape — duplicate metric names across sources are not deduplicated; the LAST occurrence wins when the scraper rebuilds its state. Plugins should namespace their metrics with a prefix (my_plugin_<metric>) to avoid collisions.

Failure isolation

One slow / unresponsive plugin does NOT stall the /metrics response indefinitely. Each scrape has its own timeout (default 5s), so a failing plugin delays the response by at most that timeout. On failure (timeout, broker error, malformed reply) the daemon:

Logs a warn-level event with plugin id + error string.
Contributes empty string for that plugin in the aggregate.
Continues with the remaining plugins.

This trades immediate observability of plugin metric outages for operator UX — a watchdog scraping every 15s sees gaps when a plugin is unhealthy, but the daemon's own metrics (CPU, memory, LLM, MCP, tunnels) keep flowing.
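The isolation rule can be illustrated with a pure aggregation step. This is a sketch under assumptions: the scrape outcomes are already collected as Result values (the broker round-trip and timeout are out of scope here), and aggregate_plugin_metrics is a hypothetical name, not the daemon's scrape_all:

```rust
/// Outcome of one plugin scrape: Ok(text) on success, Err(reason) for a
/// timeout, broker error, or malformed reply.
type ScrapeOutcome = Result<String, String>;

/// Hypothetical aggregation: concatenate successful scrapes, skip failures.
fn aggregate_plugin_metrics(outcomes: &[(&str, ScrapeOutcome)]) -> String {
    let mut out = String::new();
    for (plugin_id, outcome) in outcomes {
        match outcome {
            // Empty text counts as a successful scrape with no metrics.
            Ok(text) if text.is_empty() => {}
            Ok(text) => {
                out.push_str(text);
                // Append the trailing newline if the plugin omitted it.
                if !text.ends_with('\n') {
                    out.push('\n');
                }
            }
            // A failure contributes nothing; log and continue with the rest.
            Err(reason) => {
                eprintln!("warn: metrics scrape failed plugin={plugin_id} error={reason}");
            }
        }
    }
    out
}
```

A failed plugin leaves no trace in the returned text, which is why an external watchdog sees a gap rather than an error line.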

Implementing the plugin side

Subprocess plugins subscribe to <broker_topic_prefix>.metrics.scrape and reply:

// Sketch (final SDK helpers ship with the next plugin release):
ctx.broker
    .subscribe("plugin.<id>.metrics.scrape")
    .await?
    .for_each(|msg| async {
        // Render the plugin's own Prometheus text for this scrape.
        let text = my_metrics_module::render_prometheus(...);
        ctx.broker.publish(
            &msg.reply_to.unwrap(),
            json!({ "text": text }),
        ).await
    });

Reference impl lands with the next nexo-plugin-email release; until then the daemon's legacy hardcoded call keeps email metrics flowing.

Migration status

nexo-plugin-email — NOT migrated. Legacy in-process call at src/main.rs:15295 continues to serve. When email ships the manifest section, BOTH paths fire (legacy direct call AND broker scrape) until the legacy call is retired in a follow-up.
Other canonical plugins — none currently expose Prometheus metrics. New plugins opting into metrics declare the manifest section directly with no legacy fallback to maintain.

Validation

cargo build --release-fast --bin nexo (default) — 3m clean.
cargo build --release-fast --bin nexo --no-default-features — 3m01s clean.
cargo nextest run --workspace — 6321/6321 (5 new tests in plugin_metrics::tests covering descriptor construction, empty-descriptors short-circuit, failure isolation across multiple plugins, broker error path).
mdbook build docs clean.

Trade-offs

Concern	Decision
Sequential vs concurrent scrape	Sequential. Concurrent would shave latency for `n > 3` plugins but adds a `futures` dep edge. Acceptable at current scale (≤10 plugins typical).
Per-scrape timeout	5s default. Plugins exceeding this contribute empty (warn-log). Trades immediate visibility for daemon `/metrics` SLO.
Duplicate metric name collisions	Daemon does NOT deduplicate. Plugins namespace with `my_plugin_<metric>` prefix per Prometheus convention.
Plugin reply shape	`{ text: String }`. Simple envelope; daemon appends newline if missing. Adding `labels` / `timestamps` would be a follow-up if a plugin needs them.
Email migration fallback	Legacy `nexo_plugin_email::metrics::render_prometheus` call kept. When email ships manifest section both paths run until cleanup follow-up retires the legacy.

`[plugin.dashboard]` — setup wizard surface

Phase 81.33.b.real Stage 6 (Layer 10 polish of the plugin auto-discovery design). Status: shipped 2026-05-15.

Plugins that want to appear in the setup wizard's channel dashboard declare how the daemon detects their instances + auth state via this manifest section. A generic ManifestDashboardSource in nexo-setup consumes the section and runs the right discovery / auth check, eliminating the need for per-channel hardcoded crates/setup/src/services/channels_dashboard.rs impls (the 3 canonical ones — telegram, whatsapp, email — stay as fallback until canonical plugin crates ship manifest revisions).

Manifest section

[plugin.dashboard]
# Instance enumeration strategy:
[plugin.dashboard.layout]
kind = "single"                          # | "workspace_walk"

# Auth-state probe strategy:
[plugin.dashboard.auth_check]
kind = "file_presence"                   # | "session_dir_files"
path = "telegram_bot_token.txt"          # for file_presence (relative to secrets_dir)

Telegram shape (single instance, file-presence auth)

[plugin.dashboard.layout]
kind = "single"

[plugin.dashboard.auth_check]
kind = "file_presence"
path = "telegram_bot_token.txt"

Email shape

[plugin.dashboard.layout]
kind = "single"

[plugin.dashboard.auth_check]
kind = "file_presence"
path = "email_password.txt"

WhatsApp shape (multi-instance via workspace walk, session-dir auth)

[plugin.dashboard.layout]
kind = "workspace_walk"
subdir = "whatsapp"

[plugin.dashboard.auth_check]
kind = "session_dir_files"
candidates = ["session.db", "state.db", "device.json", "registration.json"]

Field reference

`layout`

kind = "single" — exactly one instance labelled "default". Used by channels with one account per agent (telegram, email).
kind = "workspace_walk", subdir = "<name>" — walk <workspace>/<agent>/<subdir>/<instance>/ for every directory entry. Used by channels with multi-instance per-agent layouts (whatsapp). subdir must be a single segment (no /).

`auth_check`

kind = "file_presence", path = "<rel>" — authenticated if <secrets_dir>/<rel> exists + is non-empty. Path must be relative (no leading /). <secrets_dir> is the operator's secrets root (typically ~/.nexo/secrets or $NEXO_HOME/secrets).
kind = "session_dir_files", candidates = [...] — authenticated if the per-instance directory contains ANY of the listed filenames. Only meaningful with layout.kind = "workspace_walk"; the per-instance dir is <workspace>/<agent>/<subdir>/<instance>/. If the directory exists with OTHER files (but none of the candidates), reports Stale; empty dir reports NotAuthenticated.
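The session_dir_files rule reduces to a three-way classification over the file names found in the per-instance directory. The sketch below assumes the directory listing has already been read; the Stale and NotAuthenticated names come from the text above, while Authenticated and classify_session_dir are assumed names:

```rust
#[derive(Debug, PartialEq)]
enum AuthState {
    Authenticated,
    Stale,
    NotAuthenticated,
}

/// Hypothetical classifier for `auth_check.kind = "session_dir_files"`.
/// `entries` are the file names present in the per-instance directory.
fn classify_session_dir(entries: &[&str], candidates: &[&str]) -> AuthState {
    if entries.iter().any(|name| candidates.contains(name)) {
        // ANY listed candidate is enough.
        AuthState::Authenticated
    } else if entries.is_empty() {
        // Empty directory: nothing was ever written.
        AuthState::NotAuthenticated
    } else {
        // Other files exist, but none of the candidates.
        AuthState::Stale
    }
}
```

How a missing directory is reported is not covered by this sketch.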

Daemon-side dispatch

crates/setup/src/services/channels_dashboard.rs ships:

pub trait ChannelDashboardSource — channel_id() + discover().
pub struct ManifestDashboardSource — generic impl that consumes a parsed PluginDashboardSection + plugin id.
pub fn dashboard_sources_from_manifests(manifests) -> Vec<Box<dyn ChannelDashboardSource>> — helper that filters manifests + builds a source per declaring plugin.
pub fn default_dashboard_sources() -> Vec<Box<dyn ChannelDashboardSource>> — the 3 hardcoded canonical impls (telegram, whatsapp, email). Kept for backwards compat until canonical plugin crates ship manifest revisions.

Operators combining both:

// Start from the hardcoded canonical sources, then add manifest-driven ones.
let mut sources = default_dashboard_sources();
sources.extend(dashboard_sources_from_manifests(&discovered_manifests));
let entries = detect_channels_with_sources(&sources, &config_dir, &secrets_dir)?;

Migration

Canonical plugin crates currently ship NO [plugin.dashboard] section. The 3 hardcoded sources in nexo-setup continue to serve the wizard. When a plugin ships the section AND the wizard caller discovers manifests + extends the source list, the manifest-driven source contributes alongside the hardcoded one. Once all 3 canonical plugins migrate, the hardcoded sources can be retired in a Stage 7 cleanup follow-up.

New canonical channels added in the future (signal, sms, …) ship the manifest section directly + skip the hardcoded path entirely.

Validation

cargo build --release-fast --bin nexo (default) — 3m13s.
cargo build --release-fast --bin nexo --no-default-features — 2m54s.
cargo nextest run --workspace — 6334/6334 (13 new tests: 8 in nexo-plugin-manifest::dashboard::tests, 5 in nexo-setup::services::channels_dashboard::manifest_dashboard_tests).
mdbook build docs clean.

Trade-offs

Concern	Decision
Schema enumerates known shapes	2 layouts (single / workspace_walk) + 2 auth checks (file_presence / session_dir_files). Covers the 3 canonical channels. New shapes = schema extension (typed enum variant + interpreter branch).
Plugin-side auth check via broker (alternative)	Rejected: the wizard runs WITHOUT the plugin subprocess alive in many scenarios (initial setup, secret rotation, plugin-binary-not-yet-installed). The daemon performing the FS check directly is more robust.
`channel_id` `'static` lifetime	Process-wide intern table keyed by plugin id; one-time leak per plugin per process. Bounded by plugin count.
Workspace walk path resolution	`<config_dir>.parent() / data/workspace` — matches the layout used by canonical plugins today. Manifest does NOT let plugins override the workspace root path (security: prevent arbitrary FS reads via a malicious manifest).
Symbol exports	`ManifestDashboardSource` + `dashboard_sources_from_manifests` are public so callers (admin wizard, setup CLI, future microapp) can wire them without re-implementing.

nexo Plugin Contract

Field	Value
`contract_version`	`1.10.0`
Status	Stable
Authoritative reference	This document
Reference implementations	Host: `crates/core/src/agent/nexo_plugin_registry/subprocess.rs`. Rust child: `crates/microapp-sdk/src/plugin.rs` (feature `plugin`). Python / TypeScript / PHP children: `github.com/lordmacu/nexo-plugin-sdks`. See §11.

This contract describes how an out-of-tree plugin binary communicates with the nexo daemon. A conforming plugin can be written in any language — Rust, Python, TypeScript, Go, etc. — as long as it implements the protocol defined here.

1. Transport

Plugin runs as a child process of the daemon.
Daemon writes to the child's stdin. Child writes to its stdout.
stderr is discarded by the daemon (currently redirected to /dev/null; Phase 81.23 will collect it into structured tracing).
Each direction is a stream of newline-delimited UTF-8 lines.
Each line is exactly one JSON-RPC 2.0 message — request, response, or notification.
Lines must not exceed the platform pipe buffer (typically 4 KiB on Linux); fragmenting one JSON object across multiple lines is not supported.
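A minimal sketch of the framing on the child side, using only the standard library. It shows the one-message-per-line loop and leaves JSON parsing to the handler closure; pump_lines and the blank-line tolerance are assumptions, not part of the contract:

```rust
use std::io::{self, BufRead, Write};

/// Hypothetical framing loop: read one message per line from `input`,
/// pass it to `handle`, and write any reply as exactly one line.
fn pump_lines<R, W, F>(input: R, mut output: W, mut handle: F) -> io::Result<()>
where
    R: BufRead,
    W: Write,
    F: FnMut(&str) -> Option<String>,
{
    for line in input.lines() {
        let line = line?;
        // Assumption: blank lines are skipped rather than treated as errors.
        if line.trim().is_empty() {
            continue;
        }
        // `None` models a notification, which must not be answered.
        if let Some(reply) = handle(&line) {
            // A reply must stay on one line, so it may not embed a raw newline.
            debug_assert!(!reply.contains('\n'));
            output.write_all(reply.as_bytes())?;
            output.write_all(b"\n")?;
            // Flush so the daemon sees the frame immediately.
            output.flush()?;
        }
    }
    Ok(())
}
```

In a real child, input would be a locked stdin and output a locked stdout; the generic parameters make the loop testable against in-memory buffers.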

2. Manifest

The plugin ships a nexo-plugin.toml file — schema defined by the nexo-plugin-manifest crate. The fields relevant to this contract are:

[plugin]
id = "slack"                       # ASCII slug, ^[a-z][a-z0-9_]{0,31}$
version = "0.2.0"                  # semver
name = "Slack Channel"
description = "..."
min_nexo_version = ">=0.1.0"

[plugin.requires]
nexo_capabilities = ["broker"]

# Phase 81.14 — subprocess entrypoint.
[plugin.entrypoint]
command = "/usr/local/bin/plugin-slack"  # absolute path or PATH binary
args = ["--mode", "stdio"]               # optional
env = { "RUST_LOG" = "info" }            # optional, MUST NOT begin with "NEXO_"

# Phase 81.8 — channel kinds the plugin exposes. Drives the
# broker subscribe / publish allowlist (see §6).
[[plugin.channels.register]]
kind = "slack"
adapter = "SlackChannelAdapter"

The host parses this manifest at boot and uses plugin.id to verify the child's identity in the initialize handshake (§4.1). It uses plugin.entrypoint.command to spawn the child process. Any env key beginning with NEXO_ is rejected at boot — those names are reserved for the daemon's own runtime configuration.

2.1 Extends section (Phase 81.28)

A subprocess plugin that contributes to a daemon-side registry beyond [plugin.channels.register] declares its capabilities in an additive [plugin.extends] section:

[plugin.extends]
channels         = ["slack", "discord"]    # paired with Phase 81.24 wrapper
llm_providers    = ["cohere", "mistral"]   # paired with Phase 81.25
memory_backends  = ["pinecone", "qdrant"]  # paired with Phase 81.26
hooks            = ["pii_redact"]          # paired with Phase 81.27

Each list names the IDs the plugin contributes to the matching registry. Validation rules:

Each id MUST match ^[a-z][a-z0-9_]{0,31}$.
No duplicates within a single list.
No cross-list duplicates — an id MUST occupy at most one of the four lists within a single plugin.
All four fields default to empty; legacy manifests parse unchanged.
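The rules above can be sketched without a regex crate. The function names are hypothetical and the error strings are illustrative only; the real validator surfaces typed manifest errors:

```rust
use std::collections::HashSet;

/// Matches ^[a-z][a-z0-9_]{0,31}$ by hand.
fn is_valid_id(id: &str) -> bool {
    let mut chars = id.chars();
    // First character must be a lowercase ASCII letter.
    match chars.next() {
        Some(c) if c.is_ascii_lowercase() => {}
        _ => return false,
    }
    // At most 32 characters in total, the rest from [a-z0-9_].
    id.len() <= 32
        && chars.all(|c| c.is_ascii_lowercase() || c.is_ascii_digit() || c == '_')
}

/// Hypothetical validator over (section name, ids) pairs. One shared set
/// catches duplicates within a list and across lists.
fn validate_extends(lists: &[(&str, &[&str])]) -> Result<(), String> {
    let mut seen: HashSet<&str> = HashSet::new();
    for (section, ids) in lists {
        for id in ids.iter() {
            if !is_valid_id(id) {
                return Err(format!("{section}: invalid id `{id}`"));
            }
            if !seen.insert(*id) {
                return Err(format!("{section}: duplicate id `{id}`"));
            }
        }
    }
    Ok(())
}
```

An empty slice of lists validates, which matches the default of all four fields being empty.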

The four canonical sections are fixed in code (EXTENDS_SECTIONS); adding a new capability surface requires a manifest-crate change. This is intentional — the closed schema keeps serde(deny_unknown_fields) defense intact and gates new extension points behind a coordinated rollout.

[plugin.extends] is the declarative half of the capability story. Daemon dispatch wiring — actually populating LlmClientRegistry / memory backend store / HookInterceptor registry — ships with Phase 81.24 (channels), 81.25 (LLM providers), 81.26 (memory backends), and 81.27 (hooks). Capability-negotiation handshake (verifying the subprocess's initialize reply matches the declared extensions) is a follow-up (81.28.b).

[plugin.extends].channels exists in parallel with [plugin.channels.register] (§6 — topic allowlist). Use extends for subprocess plugins routed through the future remote ChannelAdapter wrapper; use register for in-tree adapters that link directly into the daemon binary. Both surfaces stay independent.

2.2 Sandbox section (Phase 81.22)

Subprocess plugins on Linux can opt into bubblewrap-based isolation via an additive [plugin.sandbox] section. Default = disabled — every existing manifest parses unchanged; the daemon spawns the plugin as a normal child process.

[plugin.sandbox]
enabled = true                        # default false (opt-in)
network = "deny"                       # "deny" | "host"
fs_read_paths = ["/etc/ssl/certs"]    # absolute, ro-bind into sandbox
fs_write_paths = ["${state_dir}"]     # absolute, rw-bind. ${state_dir}
                                       # token expands to the plugin's
                                       # per-instance state root.
drop_user = true                       # default true; bwrap maps the
                                       # child to nobody:nogroup (uid
                                       # 65534) via --unshare-user.

When enabled, the daemon wraps the spawn Command with bwrap flags:

Process hardening: --die-with-parent --unshare-pid --unshare-uts --unshare-ipc --new-session.
Filesystem skeleton: --proc /proc --dev /dev --tmpfs /tmp plus read-only binds for /usr /bin /sbin /lib /lib64 /etc/ssl. The plugin command's parent dir is also auto-bound read-only so the binary is reachable inside the sandbox.
Network: --unshare-net for network = "deny". For network = "host" the operator must set the NEXO_PLUGIN_SANDBOX_HOST_NET_ALLOW=1 capability env var; the manifest validator otherwise rejects the field.
User: --unshare-user --uid 65534 --gid 65534 when drop_user = true.
Allowlist: each fs_read_paths entry becomes --ro-bind <path> <path>; each fs_write_paths entry becomes --bind <path> <path> after ${state_dir} expansion.
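To make the flag assembly concrete, here is a sketch that builds the bwrap argument list from a simplified config. SandboxConfig, bwrap_args, and the network_deny boolean are illustrative stand-ins for the manifest types, and the flag order is an assumption:

```rust
/// Simplified stand-in for the parsed [plugin.sandbox] section.
struct SandboxConfig {
    network_deny: bool, // true for network = "deny"
    fs_read_paths: Vec<String>,
    fs_write_paths: Vec<String>,
    drop_user: bool,
}

fn push_all(args: &mut Vec<String>, parts: &[&str]) {
    args.extend(parts.iter().map(|p| p.to_string()));
}

/// Hypothetical builder for the flags placed before the plugin command.
fn bwrap_args(cfg: &SandboxConfig, state_dir: &str, command_dir: &str) -> Vec<String> {
    let mut args = Vec::new();
    // Process hardening.
    push_all(&mut args, &["--die-with-parent", "--unshare-pid", "--unshare-uts",
                          "--unshare-ipc", "--new-session"]);
    // Filesystem skeleton plus the read-only bind of the command's parent dir.
    push_all(&mut args, &["--proc", "/proc", "--dev", "/dev", "--tmpfs", "/tmp"]);
    for dir in ["/usr", "/bin", "/sbin", "/lib", "/lib64", "/etc/ssl", command_dir] {
        push_all(&mut args, &["--ro-bind", dir, dir]);
    }
    // Network isolation only for network = "deny".
    if cfg.network_deny {
        args.push("--unshare-net".to_string());
    }
    // Map the child to nobody:nogroup.
    if cfg.drop_user {
        push_all(&mut args, &["--unshare-user", "--uid", "65534", "--gid", "65534"]);
    }
    // Operator allowlist: read-only binds, then read-write binds.
    for path in &cfg.fs_read_paths {
        push_all(&mut args, &["--ro-bind", path.as_str(), path.as_str()]);
    }
    for path in &cfg.fs_write_paths {
        let expanded = path.replace("${state_dir}", state_dir);
        push_all(&mut args, &["--bind", expanded.as_str(), expanded.as_str()]);
    }
    args
}
```

The returned vector would be passed to bwrap ahead of the plugin's own command and args.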

Operators control sandbox enforcement via two env knobs:

Env var	Purpose
`NEXO_PLUGIN_SANDBOX_REQUIRE`	When `1`, the daemon refuses to spawn any plugin without `sandbox.enabled = true`. Strict-mode operator gate.
`NEXO_PLUGIN_SANDBOX_HOST_NET_ALLOW`	When `1`, manifests declaring `network = "host"` validate. Default off.

Hard denylist (compile-time const) — operator-supplied allowlists that equal or include any of these paths are rejected at manifest load:

/etc/shadow, /etc/sudoers, /etc/sudoers.d
/proc/sys, /proc/kcore, /proc/kallsyms
/sys/firmware, /sys/kernel
/dev/mem, /dev/kmem, /dev/port
/var/run/docker.sock, /run/docker.sock, /private/var/run/docker.sock
/root, /boot

Validation errors surface as ManifestError::Sandbox* variants (SandboxAllowlistTouchesDenylist, SandboxRelativePath, SandboxInvalidStateDirInterpolation, SandboxHostNetworkWithoutCapability).
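A sketch of the "equal or include" test, assuming component-wise path comparison. The names are hypothetical, and whether entries located beneath a denied directory (for example a subdirectory of /root) are also rejected is not stated above, so this sketch does not cover that case:

```rust
use std::path::Path;

/// Hard denylist, copied from the list above.
const HARD_DENYLIST: &[&str] = &[
    "/etc/shadow", "/etc/sudoers", "/etc/sudoers.d",
    "/proc/sys", "/proc/kcore", "/proc/kallsyms",
    "/sys/firmware", "/sys/kernel",
    "/dev/mem", "/dev/kmem", "/dev/port",
    "/var/run/docker.sock", "/run/docker.sock", "/private/var/run/docker.sock",
    "/root", "/boot",
];

/// Hypothetical check: true when an allowlist entry equals a denied path
/// or is an ancestor directory that would include it.
fn touches_denylist(allow_entry: &str) -> bool {
    let entry = Path::new(allow_entry);
    // Path::starts_with compares whole components, so "/etc/shadowx"
    // is not mistaken for "/etc/shadow".
    HARD_DENYLIST
        .iter()
        .any(|denied| Path::new(denied).starts_with(entry))
}
```

Comparing components rather than raw strings is the design choice that matters here: a plain string prefix test would wrongly flag unrelated paths that merely share leading characters.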

Platform support: Linux requires bubblewrap in PATH (apt install bubblewrap). macOS is currently a no-op + tracing::warn! log per spawn — native sandbox-exec integration is deferred to follow-up 81.22.macos. With NEXO_PLUGIN_SANDBOX_REQUIRE=1 on macOS, the daemon refuses to spawn (treats macOS as unsupported).

Out of scope for v1:

Granular network egress allowlist (network = "allowlist", network_allowlist = ["host:port"]) — defers to 81.22.b (slirp4netns + nftables).
Per-syscall seccomp filters — defers to 81.22.c.
Cgroup / rlimit resource caps — Phase 81.21.c.
Doctor CLI surface — defers to 81.22.d.

3. JSON-RPC envelope

All frames are valid JSON-RPC 2.0:

Request

{
  "jsonrpc": "2.0",
  "id": <integer or string>,
  "method": "<method-name>",
  "params": <object | null>
}

Response (success)

{
  "jsonrpc": "2.0",
  "id": <same as request>,
  "result": <object | null>
}

Response (error)

{
  "jsonrpc": "2.0",
  "id": <same as request, null if request was un-parseable>,
  "error": {
    "code": <integer>,
    "message": "<string>"
  }
}

Notification

A notification is a request without an id field. The peer must not reply.

{
  "jsonrpc": "2.0",
  "method": "<method-name>",
  "params": <object | null>
}

The contract uses notifications for unidirectional broker events — see §5.

4. Lifecycle

4.1 `initialize` (host → child request)

After spawning the child, the daemon writes one initialize request and awaits the response. The child must respond before NEXO_PLUGIN_INIT_TIMEOUT_MS (default 5000) elapses or the daemon kills it and surfaces PluginInitError::Other.

Request:

{
  "jsonrpc": "2.0",
  "id": 1,
  "method": "initialize",
  "params": { "nexo_version": "0.1.5" }
}

Response:

{
  "jsonrpc": "2.0",
  "id": 1,
  "result": {
    "manifest": { "plugin": { "id": "slack", "version": "0.2.0", ... } },
    "server_version": "slack-0.2.0"
  }
}

The child must echo a manifest whose plugin.id matches the id the daemon expected (the id under which the plugin was registered in the factory registry). Mismatch is a hard failure — the daemon kills the child and refuses to load the plugin. This defends against an out-of-tree binary impersonating a different plugin.

server_version is a free-form string identifying the running binary; the SDK defaults it to <id>-<version> from the manifest.

4.1.1 Tool catalog advertisement (Phase 81.29, optional)

Plugins declaring [plugin.extends].tools = [...] MUST include a tools array in the initialize-reply result. Each entry is a RemoteToolDef:

{
  "jsonrpc": "2.0",
  "id": 1,
  "result": {
    "manifest": { "plugin": { "id": "browser", ... } },
    "server_version": "browser-0.1.1",
    "tools": [
      {
        "name": "browser_navigate",
        "description": "Navigate to a URL",
        "input_schema": {
          "type": "object",
          "properties": { "url": { "type": "string" } },
          "required": ["url"]
        }
      }
    ]
  }
}

Validation rules at the host:

The tools field is OPTIONAL when extends.tools is empty. Required (non-empty) when the manifest declares any tool.
Every advertised name MUST appear in manifest.plugin.extends.tools. Drift in this direction (advertised but not declared) is a hard failure: the daemon kills the child and refuses to load.
Manifest entries WITHOUT an advertised counterpart are tolerated but logged at warn — runtime calls to those tools yield -33401 ToolNotFound.
name must satisfy the per-plugin namespace rule (<plugin_id>_* or ext_<plugin_id>_*).
input_schema is an arbitrary JSONSchema object; the daemon caches it for arg validation before each tool.invoke.
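The drift rules can be sketched as one function that returns either a hard failure or the list of tools to warn about. validate_tool_catalog and the error strings are hypothetical; only the rules themselves come from the list above:

```rust
/// Hypothetical host-side check of the advertised tool catalog.
/// Ok(warnings) lists declared tools that were not advertised.
/// Err(reason) is a hard failure: the daemon would kill the child.
fn validate_tool_catalog(
    plugin_id: &str,
    declared: &[&str],
    advertised: &[&str],
) -> Result<Vec<String>, String> {
    // `tools` is required (non-empty) once the manifest declares any tool.
    if !declared.is_empty() && advertised.is_empty() {
        return Err("manifest declares tools but none were advertised".to_string());
    }
    let ns_plain = format!("{plugin_id}_");
    let ns_ext = format!("ext_{plugin_id}_");
    for name in advertised {
        // Per-plugin namespace rule: <plugin_id>_* or ext_<plugin_id>_*.
        if !(name.starts_with(&ns_plain) || name.starts_with(&ns_ext)) {
            return Err(format!("tool `{name}` violates the namespace rule"));
        }
        // Advertised but not declared is a hard failure.
        if !declared.contains(name) {
            return Err(format!("tool `{name}` advertised but not declared"));
        }
    }
    // Declared but not advertised is tolerated and only warned about.
    Ok(declared
        .iter()
        .filter(|d| !advertised.contains(*d))
        .map(|d| d.to_string())
        .collect())
}
```

Returning the warnings instead of logging them keeps the check pure, so the caller decides how to surface them.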

4.2 `shutdown` (host → child request)

The daemon sends shutdown when it wants the plugin to exit gracefully. The child should flush state, then reply.

Request:

{
  "jsonrpc": "2.0",
  "id": 2,
  "method": "shutdown",
  "params": { "reason": "host requested" }
}

Response:

{
  "jsonrpc": "2.0",
  "id": 2,
  "result": { "ok": true }
}

Reply with an error object instead of result if shutdown fails — the host surfaces PluginShutdownError::Other to the operator.

After the reply, the daemon waits 1 second for the process to exit on its own. If the child is still alive, the daemon sends SIGKILL. So: reply, then exit.

5. Broker bridge

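The wire-level shape of the broker bridge is two notifications, broker.event (§5.1) and broker.publish (§5.4). The child → host requests memory.recall (§5.2) and llm.complete (§5.3) share the same transport.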

5.1 `broker.event` (host → child)

Whenever the daemon's broker delivers an event on a topic matching one of the plugin's outbound subscriptions (derived from manifest.channels.register[].kind — see §6), the daemon sends:

{
  "jsonrpc": "2.0",
  "method": "broker.event",
  "params": {
    "topic": "plugin.outbound.slack.team_a",
    "event": {
      "id": "01940000-0000-0000-0000-000000000001",
      "timestamp": "2026-05-01T00:00:00Z",
      "topic": "plugin.outbound.slack.team_a",
      "source": "agent.coordinator",
      "session_id": "01940000-0000-0000-0000-000000000099",
      "payload": { "text": "hello", ... }
    }
  }
}

The event field is a serialised nexo_broker::Event. The plugin processes the event (e.g. forwards payload.text to Slack's API) and may reply with a broker.publish notification (§5.4), but it is not required to reply.

5.2 `memory.recall` (child → host request) <Phase 81.20.a>

When the plugin needs to look up agent memory entries, it issues a JSON-RPC request to the daemon. Unlike broker.event / broker.publish which are notifications, this is a request-response flow: the child sends with an id and awaits the matching reply.

Child → host request:

{
  "jsonrpc": "2.0",
  "id": 42,
  "method": "memory.recall",
  "params": {
    "agent_id": "ventas_v1",
    "query": "user prefers concise answers",
    "limit": 5
  }
}

Host → child reply (success):

{
  "jsonrpc": "2.0",
  "id": 42,
  "result": {
    "entries": [
      {
        "id": "01940000-0000-0000-0000-000000000001",
        "agent_id": "ventas_v1",
        "content": "user prefers concise answers",
        "tags": ["preference"],
        "concept_tags": [],
        "created_at": "2026-04-30T18:22:31Z",
        "memory_type": null
      }
    ]
  }
}

Host → child reply (error):

-32601 method not found (only memory.recall wired in 81.20.a; llm.complete / tool.dispatch ship in 81.20.b/.c).
-32602 invalid params (missing agent_id / wrong type for query).
-32603 memory not configured (operator hasn't enabled long-term memory) OR memory backend returned an error.

limit defaults to 10, capped hard at 1000. The handler calls LongTermMemory::recall(agent_id, query, limit) which already expands the query with up to 3 derived concept tags so FTS5 hits memories whose stored content diverges from the query surface.
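The default-and-cap rule for limit amounts to the following, assuming an over-large value is clamped rather than rejected. The constant and function names are hypothetical:

```rust
const DEFAULT_RECALL_LIMIT: u64 = 10;
const MAX_RECALL_LIMIT: u64 = 1000;

/// Hypothetical helper: resolve the effective `limit` for memory.recall.
fn effective_limit(requested: Option<u64>) -> u64 {
    // Missing limit falls back to the default; anything above the cap is clamped.
    requested.unwrap_or(DEFAULT_RECALL_LIMIT).min(MAX_RECALL_LIMIT)
}
```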

5.3 `llm.complete` (child → host request) <Phase 81.20.b>

When the plugin needs an LLM completion, it issues a request and awaits the response.

Child → host request:

{
  "jsonrpc": "2.0",
  "id": 50,
  "method": "llm.complete",
  "params": {
    "provider": "minimax",
    "model": "minimax-m2.5",
    "messages": [
      {"role": "user", "content": "summarize this in one line: ..."}
    ],
    "max_tokens": 1024,
    "temperature": 0.7,
    "system_prompt": "You answer concisely."
  }
}

messages[].role is one of system, user, assistant, tool. max_tokens defaults to 4096; temperature defaults to 0.7; system_prompt is optional.

Host → child reply (success):

{
  "jsonrpc": "2.0",
  "id": 50,
  "result": {
    "content": "Concise reply text.",
    "finish_reason": "stop",
    "usage": {
      "prompt_tokens": 25,
      "completion_tokens": 8
    }
  }
}

finish_reason is one of stop, length, tool_use, other:<reason>.

Host → child reply (errors):

-32602 invalid params (missing provider / model / messages, malformed message, empty messages array).
-32603 LLM not configured (operator hasn't wired the registry to the subprocess pipeline) OR client build failed (provider name not registered, config invalid) OR chat() call returned an error.
-32601 provider returned tool calls instead of text — MVP surfaces this as not_implemented. The tool-call wire shape (which lets the child re-submit tool_result follow-ups) lands in a future contract bump.

Daemon-side caps max_tokens at u32::MAX. Streaming via llm.complete.delta notifications is opt-in via params.stream = true (Phase 81.20.b.c).

Streaming flow

When the request includes "stream": true, the host calls LlmClient::stream instead of chat. Each text chunk arrives as a notification correlated to the original request id:

{
  "jsonrpc": "2.0",
  "method": "llm.complete.delta",
  "params": { "request_id": 50, "chunk": "hello" }
}
{
  "jsonrpc": "2.0",
  "method": "llm.complete.delta",
  "params": { "request_id": 50, "chunk": " world" }
}

The final reply matches the original id but carries only finish_reason + usage — content is omitted because the child reassembled it from deltas:

{
  "jsonrpc": "2.0",
  "id": 50,
  "result": {
    "finish_reason": "stop",
    "usage": { "prompt_tokens": 12, "completion_tokens": 7 }
  }
}

Tool-call deltas in streaming mode are dropped (same scope as the non-streaming MVP). If the provider returns ONLY tool calls during a stream (no text), the final reply is -32601 not_implemented.
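On the child side, the streaming flow amounts to buffering chunks per `request_id` and joining them when the final reply arrives. The sketch below assumes frames have already been parsed into dicts. `StreamCollector` is a hypothetical helper, not part of any SDK, and error replies are deliberately left out.

```python
from collections import defaultdict


class StreamCollector:
    """Reassembles llm.complete text from llm.complete.delta notifications."""

    def __init__(self):
        self._chunks = defaultdict(list)  # request_id -> list of text chunks

    def feed(self, frame):
        """Consume one frame; return the finished result dict on a final reply, else None."""
        if frame.get("method") == "llm.complete.delta":
            params = frame["params"]
            self._chunks[params["request_id"]].append(params["chunk"])
            return None
        if "id" in frame and "result" in frame:
            result = dict(frame["result"])
            chunks = self._chunks.pop(frame["id"], [])
            # Streaming replies omit content; non-streaming replies already carry it.
            result.setdefault("content", "".join(chunks))
            return result
        return None
```

Feeding the two delta frames and the final reply shown above yields a result whose `content` is "hello world", alongside the `finish_reason` and `usage` from the final frame.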

5.4 `broker.publish` (child → host)

When the plugin wants to push an event onto the broker (e.g. delivering an inbound message from Slack), it writes:

{
  "jsonrpc": "2.0",
  "method": "broker.publish",
  "params": {
    "topic": "plugin.inbound.slack.team_a",
    "event": {
      "id": "01940000-0000-0000-0000-000000000002",
      "timestamp": "2026-05-01T00:01:00Z",
      "topic": "plugin.inbound.slack.team_a",
      "source": "slack",
      "session_id": null,
      "payload": { "from": "U01ABC", "text": "hi", ... }
    }
  }
}

The host validates the topic against the allowlist (§6) before forwarding to the broker. Topics outside the allowlist are dropped with a tracing::warn! log and never reach the broker.

5.x Channel methods (Phase 81.24)

Subprocess plugins that contribute new channel kinds (declared in [plugin.extends].channels, §2.1) implement three host-initiated request methods. The host's RemoteChannelAdapter wraps each ChannelAdapter trait method into a JSON-RPC request; the child replies with the corresponding result or a typed error.

Every payload carries kind so a single subprocess advertising multiple kinds (extends.channels = ["slack", "discord"]) can dispatch via one request handler.

`channel.start`

// host → child
{
  "jsonrpc": "2.0",
  "id": 42,
  "method": "channel.start",
  "params": {
    "kind": "slack",
    "instance": "primary"   // null when no per-instance multiplexing
  }
}

// child → host
{ "jsonrpc": "2.0", "id": 42, "result": { "ok": true } }

Subscribe to plugin.outbound.<kind> (or per-instance plugin.outbound.<kind>.<instance> when instance is set) and begin publishing inbound events. Default host-side timeout 30 seconds.

`channel.stop`

// host → child
{
  "jsonrpc": "2.0",
  "id": 43,
  "method": "channel.stop",
  "params": { "kind": "slack" }
}

// child → host
{ "jsonrpc": "2.0", "id": 43, "result": { "ok": true } }

Release resources, drop subscriptions, stop publishing inbound. Idempotent. Default host-side timeout 30 seconds.
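A child-side sketch of the start/stop pair, showing how the outbound topic is derived from `kind` and `instance` and why stop is safe to call twice. `ChannelState` and `outbound_topic` are hypothetical names. Dropping every instance of a kind on stop is an assumption, based on the `channel.stop` example carrying only `kind`.

```python
def outbound_topic(kind, instance=None):
    """Topic the child listens on for a channel kind and optional instance."""
    base = "plugin.outbound." + kind
    return base if instance is None else base + "." + instance


class ChannelState:
    """Tracks which outbound topics are active, grouped by channel kind."""

    def __init__(self):
        self.topics = {}  # kind -> set of subscribed topics

    def start(self, params):
        topic = outbound_topic(params["kind"], params.get("instance"))
        self.topics.setdefault(params["kind"], set()).add(topic)
        return {"ok": True}

    def stop(self, params):
        # pop with a default keeps stop idempotent: a repeated call is a no-op.
        self.topics.pop(params["kind"], None)
        return {"ok": True}
```

A real plugin would open and close its connection to the external service in `start` and `stop`; only the bookkeeping is shown here.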

`channel.send_outbound`

// host → child
{
  "jsonrpc": "2.0",
  "id": 44,
  "method": "channel.send_outbound",
  "params": {
    "kind": "slack",
    "msg": { "kind": "text", "to": "U123", "body": "hi" }
  }
}

// child → host (success)
{
  "jsonrpc": "2.0",
  "id": 44,
  "result": { "message_id": "1234.5678", "sent_at_unix": 1741032000 }
}

msg.kind is one of text, media, or custom (see OutboundMessage in §3 for the full enum). Default host-side timeout 60 seconds. Operator override via NEXO_PLUGIN_CHANNEL_TIMEOUT_MS env (single value applied to all 3 methods).
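The timeout rules for the three channel methods can be summarized in a few lines. This is a sketch of the documented behaviour, not the host's code: `channel_timeout_secs` is a hypothetical name, and falling back to the default on an unparsable or non-positive override is an assumption.

```python
DEFAULT_TIMEOUT_SECS = {
    "channel.start": 30.0,
    "channel.stop": 30.0,
    "channel.send_outbound": 60.0,
}


def channel_timeout_secs(method, env):
    """Resolve the host-side timeout for a channel method, honoring the env override."""
    raw = env.get("NEXO_PLUGIN_CHANNEL_TIMEOUT_MS")
    if raw is not None:
        try:
            ms = int(raw)
            if ms > 0:
                return ms / 1000.0  # one value applies to all three methods
        except ValueError:
            pass  # assumption: a bad override falls back to the default
    return DEFAULT_TIMEOUT_SECS[method]
```

Passing `os.environ` as `env` gives the live behaviour; passing a plain dict keeps the function easy to test.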

Channel-specific error codes

In addition to the JSON-RPC standard codes (§7), channel.* methods MAY return:

Code	Meaning	Maps to `ChannelAdapterError`
`-33001`	`channel.connection_failed`	`Connection { source: <message> }`
`-33002`	`channel.authentication_failed`	`Authentication { reason: <message> }`
`-33003`	`channel.recipient_invalid`	`Recipient { recipient: <data.recipient>, reason: <data.reason | message> }`
`-33004`	`channel.rate_limited`	`RateLimited { retry_after_secs: <data.retry_after_secs> }`
`-33005`	`channel.unsupported_feature`	`Unsupported { feature: <data.feature | message> }`

Error example:

{
  "jsonrpc": "2.0",
  "id": 44,
  "error": {
    "code": -33004,
    "message": "rate limited",
    "data": { "retry_after_secs": 42 }
  }
}

-32601 method_not_found from a child means the plugin declared the kind in extends.channels but did not implement the requested method; the host surfaces this as ChannelAdapterError::Unsupported { feature: "<method>" }.
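The code-to-variant mapping in the table, including the -32601 case, can be written out as a lookup. The real mapping lives in the Rust host; this Python sketch only mirrors the table. `map_channel_error` and the catch-all `Other` variant are hypothetical.

```python
def map_channel_error(method, error):
    """Map a JSON-RPC error object to a (variant, fields) pair per the table above."""
    code = error.get("code")
    message = error.get("message", "")
    data = error.get("data") or {}
    if code == -33001:
        return "Connection", {"source": message}
    if code == -33002:
        return "Authentication", {"reason": message}
    if code == -33003:
        # reason prefers data.reason and falls back to the message
        return "Recipient", {"recipient": data.get("recipient"),
                             "reason": data.get("reason", message)}
    if code == -33004:
        return "RateLimited", {"retry_after_secs": data.get("retry_after_secs")}
    if code == -33005:
        return "Unsupported", {"feature": data.get("feature", message)}
    if code == -32601:
        # Kind was declared but the method is not implemented by the child.
        return "Unsupported", {"feature": method}
    return "Other", {"code": code, "message": message}  # hypothetical catch-all
```

Applied to the error example above, this yields `("RateLimited", {"retry_after_secs": 42})`.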

5.y LLM provider methods (Phase 81.25)

Subprocess plugins that contribute LLM providers (declared in [plugin.extends].llm_providers, §2.1) implement one host-initiated request method with two modes (sync + streaming). The host's RemoteLlmClient wraps each LlmClient trait call into a JSON-RPC request; the child replies with the corresponding result or a typed error.

Every payload carries provider so a single subprocess advertising multiple providers (extends.llm_providers = ["cohere", "mistral"]) can dispatch via one request handler.

`llm.chat` (non-streaming)

// host → child
{
  "jsonrpc": "2.0",
  "id": 50,
  "method": "llm.chat",
  "params": {
    "provider": "cohere",
    "model": "command-r",
    "stream": false,
    "request": {
      "model": "command-r",
      "messages": [{ "role": "user", "content": "hi" }],
      "max_tokens": 1024,
      "temperature": 0.7
    }
  }
}

// child → host
{
  "jsonrpc": "2.0",
  "id": 50,
  "result": {
    "content": { "type": "text", "text": "Hello world" },
    "usage": { "prompt_tokens": 12, "completion_tokens": 4 },
    "finish_reason": { "kind": "stop" }
  }
}

The full request schema mirrors nexo_llm::types::ChatRequest fields (messages / tools / max_tokens / temperature / system_prompt / stop_sequences / tool_choice / system_blocks / cache_tools). tool_choice serializes as {"kind":"auto"|"any"|"none"|"specific","name":"<n>"?}.

The full result schema:

content — {type:"text", text:"..."} OR {type:"tool_calls", tool_calls:[{id, name, arguments}]}
usage — {prompt_tokens, completion_tokens}
finish_reason — {kind:"stop"|"tool_use"|"length"|"other","reason":"<r>"?}
cache_usage — optional {cache_read_input_tokens, cache_creation_input_tokens, input_tokens, output_tokens}

Default host-side timeout 60 seconds.

`llm.chat` (streaming)

// host → child
{
  "jsonrpc": "2.0",
  "id": 51,
  "method": "llm.chat",
  "params": {
    "provider": "cohere",
    "model": "command-r",
    "stream": true,
    "request": { "...": "as above" }
  }
}

// child → host: zero or more deltas
{
  "jsonrpc": "2.0",
  "method": "llm.chat.delta",
  "params": {
    "request_id": 51,
    "chunk": { "type": "text_delta", "delta": "Hello" }
  }
}
{
  "jsonrpc": "2.0",
  "method": "llm.chat.delta",
  "params": {
    "request_id": 51,
    "chunk": { "type": "text_delta", "delta": " world" }
  }
}

// child → host: final response (id matches request)
{
  "jsonrpc": "2.0",
  "id": 51,
  "result": {
    "content": { "type": "text", "text": "" },
    "usage": { "prompt_tokens": 12, "completion_tokens": 4 },
    "finish_reason": { "kind": "stop" }
  }
}

chunk.type values:

text_delta — { delta: "<text>" }
tool_call_start — { id, name }
tool_call_args_delta — { id, delta }
tool_call_end — { id }
usage — { usage: {prompt_tokens, completion_tokens} }
end — { finish_reason: {kind, reason?} }

Default host-side stream timeout 300 seconds. Operator override via NEXO_PLUGIN_LLM_TIMEOUT_MS env (single value applied to both sync + streaming).
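A consumer of `llm.chat.delta` notifications needs one branch per `chunk.type`. The sketch below folds chunks into accumulated state. `ChatStream` is a hypothetical name, and two behaviours are assumptions: `tool_call_args_delta.delta` is treated as a string fragment to concatenate, and unknown chunk types are ignored.

```python
class ChatStream:
    """Accumulates llm.chat.delta chunks for a single request."""

    def __init__(self):
        self.text = ""
        self.tool_calls = {}       # tool call id -> {"name", "arguments", "done"}
        self.usage = None
        self.finish_reason = None

    def apply(self, chunk):
        kind = chunk["type"]
        if kind == "text_delta":
            self.text += chunk["delta"]
        elif kind == "tool_call_start":
            self.tool_calls[chunk["id"]] = {
                "name": chunk["name"], "arguments": "", "done": False}
        elif kind == "tool_call_args_delta":
            # Assumption: argument deltas are string fragments.
            self.tool_calls[chunk["id"]]["arguments"] += chunk["delta"]
        elif kind == "tool_call_end":
            self.tool_calls[chunk["id"]]["done"] = True
        elif kind == "usage":
            self.usage = chunk["usage"]
        elif kind == "end":
            self.finish_reason = chunk["finish_reason"]
        # Assumption: unknown chunk types are skipped for forward compatibility.
```

The host would call `apply(frame["params"]["chunk"])` for each notification whose `request_id` matches the pending request.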

LLM-specific error codes

In addition to the JSON-RPC standard codes (§7), llm.chat MAY return:

Code	Meaning
`-33101`	`llm.connection_failed`
`-33102`	`llm.authentication_failed`
`-33103`	`llm.rate_limited` (`data.retry_after_secs`)
`-33104`	`llm.model_not_found`
`-33105`	`llm.context_overflow`

Error example:

{
  "jsonrpc": "2.0",
  "id": 50,
  "error": {
    "code": -33103,
    "message": "rate limited",
    "data": { "retry_after_secs": 30 }
  }
}

The host surfaces these as anyhow::Error with messages operators can grep ("rate limited", "authentication failed", etc.). Structured retry-after info lands in the message string for v1; future contract bumps may add a typed LlmProviderError enum.

5.z Hook methods (Phase 81.27)

Subprocess plugins that contribute hook handlers (declared in [plugin.extends].hooks, §2.1) implement one host-initiated request method.

`hook.on_hook`

// host → child
{
  "jsonrpc": "2.0",
  "id": 60,
  "method": "hook.on_hook",
  "params": {
    "plugin_id": "compliance_plugin",
    "hook_name": "before_message",
    "event": {
      "sender": "alice",
      "body": "ping"
    }
  }
}

// child → host (block)
{
  "jsonrpc": "2.0",
  "id": 60,
  "result": {
    "abort": true,
    "reason": "PII detected",
    "decision": "block"
  }
}

// child → host (transform — rewrite payload)
{
  "jsonrpc": "2.0",
  "id": 60,
  "result": {
    "abort": false,
    "decision": "transform",
    "transformed_body": "[REDACTED]"
  }
}

// child → host (allow / no-op)
{
  "jsonrpc": "2.0",
  "id": 60,
  "result": {}
}

The result shape is HookResponse (defined in crates/extensions/src/runtime/mod.rs). Fields:

abort: bool (legacy block signal — Phase 11.6)
reason: Option<String> (operator-readable explanation)
override: Option<JsonValue> (key-by-key event mutation; non-object values logged + ignored)
decision: Option<"allow" | "block" | "transform"> (Phase 83.3 — richer audit signal)
transformed_body: Option<String> (only meaningful with decision: "transform")
do_not_reply_again: bool (anti-loop signal — host suppresses pending auto-replies for the conversation)

Default host-side timeout: 5 seconds (lower than channel 30s and LLM 60s — hooks fire on the message hot path; long timeouts block agent flow). Operator override via NEXO_PLUGIN_HOOK_TIMEOUT_MS env.
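To make the `HookResponse` fields concrete, here is one way a consumer might interpret a reply. The contract does not spell out precedence between fields, so the ordering below is an assumption: block wins, then `override` is merged, then `transformed_body` replaces the event's `body` key. `apply_hook_response` is a hypothetical name.

```python
def apply_hook_response(event, resp):
    """Return (action, event) after applying one hook reply to an event dict."""
    # Assumption: the legacy abort flag and decision "block" are equivalent.
    if resp.get("abort") or resp.get("decision") == "block":
        return "block", event
    out = dict(event)
    override = resp.get("override")
    if isinstance(override, dict):
        out.update(override)  # key-by-key mutation
    # Non-object override values are ignored, as the field list describes.
    if resp.get("decision") == "transform" and resp.get("transformed_body") is not None:
        # Assumption: transformed_body replaces the event's "body" field.
        out["body"] = resp["transformed_body"]
        return "transform", out
    return "allow", out
```

With the three example replies above, the empty result leaves the event untouched, the block reply stops it, and the transform reply swaps the body for "[REDACTED]".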

Continue-on-error semantic

Every dispatch failure (transport closed, subprocess crash, timeout, JSON-RPC error, malformed reply) returns HookResponse::default() (Continue) on the host side. The HookRegistry::fire loop continues iterating remaining handlers and the agent flow does NOT break on subprocess misbehavior.

This is the explicit philosophy from hook_registry.rs:

"extension misbehavior must not take down agent flow."

Operators see the failures via tracing::warn! (target plugins.init and the handler's own dispatch logs).

-32601 method_not_found from a child means the plugin declared extends.hooks = [...] but did not implement the wire method; the host treats this as Continue (no hard failure).
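The continue-on-error rule is easiest to see as a loop. This is a minimal sketch rather than the code in hook_registry.rs: `fire` here is a hypothetical stand-in that models every failure as an exception and the default response as an empty dict. It does not short-circuit on `abort`, which is an assumption.

```python
def fire(handlers, event):
    """Call every handler; any failure is replaced by the default (Continue) response."""
    responses = []
    for handler in handlers:
        try:
            resp = handler(event)
            if not isinstance(resp, dict):
                raise TypeError("malformed reply")
        except Exception:
            # Transport closed, crash, timeout, JSON-RPC error, bad reply:
            # all collapse to the default response and the loop keeps going.
            resp = {}
        responses.append(resp)
    return responses
```

A handler that raises never prevents the handlers after it from running, which is the property the quoted philosophy asks for.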

5.w Memory backend methods (Phase 81.26)

Subprocess plugins that contribute vector store backends (declared in [plugin.extends].memory_backends, §2.1) implement three host-initiated request methods. The host's RemoteVectorBackend wraps each VectorBackend trait method into a JSON-RPC request; the child replies with the corresponding result or a typed error.

Every payload carries backend so a single subprocess advertising multiple backends (extends.memory_backends = ["pinecone", "qdrant"]) can dispatch via one request handler.

v1 ships the wire surface + registry only — operator-side consumer wiring (LongTermMemory.recall_vector reading from wire.vector_backend_registry) lands in 81.26.b. Operators can audit registered backends today via wire.vector_backend_registry.names().

`memory.vector_upsert`

// host → child
{
  "jsonrpc": "2.0",
  "id": 70,
  "method": "memory.vector_upsert",
  "params": {
    "backend": "pinecone",
    "collection": "kb",
    "records": [
      {
        "id": "r1",
        "content": "hello",
        "embedding": [0.1, 0.2, 0.3],
        "metadata": {"source": "kb"}
      }
    ]
  }
}

// child → host
{ "jsonrpc": "2.0", "id": 70, "result": { "count": 1 } }

embedding is a pre-computed dense vector (host-side embedder or LLM provider produces it; backend stores). metadata is opaque JSON the backend may filter against. Default host-side timeout 30 seconds.

`memory.vector_search`

// host → child
{
  "jsonrpc": "2.0",
  "id": 71,
  "method": "memory.vector_search",
  "params": {
    "backend": "pinecone",
    "collection": "kb",
    "query": {
      "embedding": [0.1, 0.2, 0.3],
      "limit": 10,
      "filter": {"namespace": "tenant-1"}
    }
  }
}

// child → host
{
  "jsonrpc": "2.0",
  "id": 71,
  "result": {
    "matches": [
      {
        "id": "r1",
        "content": "hello",
        "score": 0.97,
        "metadata": {"source": "kb"}
      }
    ]
  }
}

filter is opaque — backend interprets per its native convention (Pinecone metadata filter, Qdrant filter expression, Weaviate where, etc.). The host does NOT validate or rewrite. score uses the backend's native scale (cosine vs dot-product vs distance). Default host-side timeout 10 seconds (search is hot-path).

`memory.vector_delete`

// host → child
{
  "jsonrpc": "2.0",
  "id": 72,
  "method": "memory.vector_delete",
  "params": {
    "backend": "pinecone",
    "collection": "kb",
    "ids": ["r1", "r2"]
  }
}

// child → host
{ "jsonrpc": "2.0", "id": 72, "result": { "count": 2 } }

Default host-side timeout 30 seconds. Operator override via NEXO_PLUGIN_MEMORY_TIMEOUT_MS env (single value applied to all 3 methods).

Memory-specific error codes

In addition to the JSON-RPC standard codes (§7), memory.* methods MAY return:

Code	Meaning	`data` fields
`-33301`	`memory.collection_not_found`	`collection`
`-33302`	`memory.dimension_mismatch`	`expected`, `got`
`-33303`	`memory.rate_limited`	`retry_after_secs`
`-33304`	`memory.write_failed`	`(message)`

Error example:

{
  "jsonrpc": "2.0",
  "id": 70,
  "error": {
    "code": -33302,
    "message": "dimension mismatch",
    "data": { "expected": 768, "got": 2 }
  }
}

The host surfaces these as anyhow::Error with messages operators can grep ("dimension mismatch: expected 768, got 2", "rate limited; retry after 60s", etc.).
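For plugin authors, a toy in-memory backend shows how the three methods and the error codes fit together. It is a sketch under several assumptions: `InMemoryBackend` and `BackendError` are hypothetical names, scores are cosine similarity, `filter` is ignored, and `delete` reports only the ids it actually removed.

```python
import math

COLLECTION_NOT_FOUND = -33301
DIMENSION_MISMATCH = -33302


class BackendError(Exception):
    """Hypothetical carrier for a memory.* error code plus its data fields."""

    def __init__(self, code, message, data):
        super().__init__(message)
        self.code, self.message, self.data = code, message, data


def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


class InMemoryBackend:
    def __init__(self, dim):
        self.dim = dim
        self.collections = {}  # collection name -> {record id -> record}

    def _check_dim(self, embedding):
        if len(embedding) != self.dim:
            raise BackendError(DIMENSION_MISMATCH, "dimension mismatch",
                               {"expected": self.dim, "got": len(embedding)})

    def upsert(self, params):
        for rec in params["records"]:
            self._check_dim(rec["embedding"])  # validate all before writing any
        coll = self.collections.setdefault(params["collection"], {})
        for rec in params["records"]:
            coll[rec["id"]] = rec
        return {"count": len(params["records"])}

    def search(self, params):
        name = params["collection"]
        if name not in self.collections:
            raise BackendError(COLLECTION_NOT_FOUND, "collection not found",
                               {"collection": name})
        query = params["query"]
        self._check_dim(query["embedding"])
        matches = [{"id": r["id"], "content": r["content"],
                    "score": cosine(query["embedding"], r["embedding"]),
                    "metadata": r.get("metadata", {})}
                   for r in self.collections[name].values()]
        matches.sort(key=lambda m: m["score"], reverse=True)
        return {"matches": matches[:query["limit"]]}

    def delete(self, params):
        coll = self.collections.get(params["collection"], {})
        removed = sum(1 for i in params["ids"] if coll.pop(i, None) is not None)
        return {"count": removed}
```

A dispatcher would catch `BackendError` and copy `code`, `message`, and `data` into the JSON-RPC error object, giving the shape of the error example above.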

5.t Tool methods (Phase 81.29)

Plugins declaring [plugin.extends].tools = [...] get a host-initiated tool.invoke request per agent-loop tool call. The daemon's LLM picks a tool name from the cached function-calling spec (built from initialize-reply's tools array, see §4.1.1), the agent's tool registry routes the call to a RemoteToolHandler, and the handler serializes the call into a tool.invoke JSON-RPC frame over the existing stdio bridge.

Default timeout: 60 s. Operator override via NEXO_PLUGIN_TOOL_TIMEOUT_MS.

`tool.invoke`

Host → child request:

{
  "jsonrpc": "2.0",
  "id": 80,
  "method": "tool.invoke",
  "params": {
    "plugin_id": "browser",
    "tool_name": "browser_navigate",
    "args": { "url": "https://example.com" },
    "agent_id": "shopper"
  }
}

Child → host reply on success — the result body is whatever JSON shape the daemon's ToolHandler::call returns to the agent loop. The conventional shape mirrors the in-tree ToolResponse:

{
  "jsonrpc": "2.0",
  "id": 80,
  "result": {
    "content": [
      { "type": "text", "text": "Navigated to https://example.com" }
    ],
    "is_error": false
  }
}

Plugins MAY return any JSON Value — the daemon does not validate the result shape beyond the JSON-RPC envelope. Tool authors using the Rust SDK return ToolResponse directly.

Tool-specific error codes

Code	Variant	Semantic
`-33401`	`ToolNotFound`	Plugin doesn't actually implement the declared tool name (drift between manifest and implementation)
`-33402`	`ToolArgumentInvalid`	Args failed plugin-side schema validation; surface `details: <Value>` for the offending fields
`-33403`	`ToolExecutionFailed`	Tool executed but raised a typed error (network failure, CDP hung, etc.)
`-33404`	`ToolUnavailable`	Resource exhausted, rate-limited, or otherwise transient. Optional `data: { retry_after_ms: <u64> }`
`-33405`	`ToolDenied`	Plugin's per-tenant authorization rejected the call (auth-style)
`-32601`	`MethodNotFound`	Plugin does not implement `tool.invoke` — manifest declared `extends.tools` but child doesn't handle the method

The host surfaces these as anyhow::Error with messages operators can grep ("tool not found", "argument invalid", "unavailable; retry after 5s", etc.). The agent loop receives the error and decides what to do (LLM retry, abort tool plan, escalate).
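A plugin written without an SDK has to build these error frames by hand. The helper below is a sketch of that: `tool_error_frame` is a hypothetical name, and placing `details` and `retry_after_ms` under `error.data` follows the table above and the conformance notes in §11.1.

```python
TOOL_ERROR_CODES = {
    "ToolNotFound": -33401,
    "ToolArgumentInvalid": -33402,
    "ToolExecutionFailed": -33403,
    "ToolUnavailable": -33404,
    "ToolDenied": -33405,
}


def tool_error_frame(request_id, variant, message, details=None, retry_after_ms=None):
    """Build the JSON-RPC error reply for a failed tool.invoke call."""
    error = {"code": TOOL_ERROR_CODES[variant], "message": message}
    data = {}
    if variant == "ToolArgumentInvalid" and details is not None:
        data["details"] = details                # offending fields
    if variant == "ToolUnavailable" and retry_after_ms is not None:
        data["retry_after_ms"] = retry_after_ms  # optional retry hint
    if data:
        error["data"] = data
    return {"jsonrpc": "2.0", "id": request_id, "error": error}
```

For example, `tool_error_frame(80, "ToolUnavailable", "unavailable", retry_after_ms=5000)` produces a frame with code -33404 and `data.retry_after_ms` set to 5000.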

6. Topic allowlist

The host derives subscribe + publish patterns from the manifest's [[plugin.channels.register]] entries.

For each entry with kind = K:

Direction	Patterns
Outbound (daemon → child)	`plugin.outbound.K`, `plugin.outbound.K.>`
Inbound (child → daemon)	`plugin.inbound.K`, `plugin.inbound.K.>`

Wildcard semantics follow nexo_broker::topic::topic_matches:

* matches exactly one path segment.
> matches one or more trailing segments (must have ≥1).
Plain segments match literally.

So plugin.inbound.slack.> matches plugin.inbound.slack.team_a and plugin.inbound.slack.team_a.thread_42 but not plugin.inbound.slack (no trailing segments). That's why both exact and wildcard patterns are in the allowlist for each kind.

A child publish to a topic that does not match any pattern in the allowlist is dropped — this is the host's primary defense against a plugin attempting to hijack core nexo topics like agent.route.* or command.*.
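The wildcard rules and the publish check can be sketched in a few lines. This is a Python re-expression of the behaviour described above, not the source of nexo_broker::topic::topic_matches. `publish_allowlist` and `may_publish` are hypothetical helper names.

```python
def topic_matches(pattern, topic):
    """Match a dotted topic against a pattern with * and > wildcards."""
    p = pattern.split(".")
    t = topic.split(".")
    for i, seg in enumerate(p):
        if seg == ">":
            # > must be the last pattern segment and needs at least one topic segment.
            return i == len(p) - 1 and len(t) > i
        if i >= len(t):
            return False
        if seg != "*" and seg != t[i]:
            return False  # plain segments match literally
    return len(p) == len(t)


def publish_allowlist(kinds):
    """Inbound (child to daemon) patterns for the registered channel kinds."""
    patterns = []
    for kind in kinds:
        patterns.append("plugin.inbound." + kind)
        patterns.append("plugin.inbound." + kind + ".>")
    return patterns


def may_publish(topic, kinds):
    return any(topic_matches(pat, topic) for pat in publish_allowlist(kinds))
```

With `kinds = ["slack"]`, a publish to `plugin.inbound.slack.team_a` passes, while `agent.route.x` matches no pattern and would be dropped.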

7. Error codes

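-32768..-32000 is the range JSON-RPC 2.0 reserves; within it, -32000..-32099 is implementation-defined, and nexo uses it for the codes below. The -31xxx band is set aside for nexo extensions (none used yet). The method-specific codes defined in §5 (channel -33001..-33005, LLM -33101..-33105, memory -33301..-33304, tool -33401..-33405) sit outside the reserved range and are listed with their methods.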

Code	Meaning
`-32700`	Parse error — line is not valid JSON
`-32600`	Invalid request — well-formed JSON but not JSON-RPC 2.0
`-32601`	Method not found
`-32602`	Invalid params
`-32603`	Internal error
`-32000`	nexo: shutdown handler returned an error
`-32001..-32099`	Reserved for future nexo error variants

The host translates each of these into a structured PluginInitError or PluginShutdownError variant for operator diagnostics.

8. Backpressure

The host's stdin writer feeds the child via a bounded mpsc channel of depth 64. When the channel is full (the child is processing more slowly than the broker is delivering events to it), new broker.event notifications are dropped with a warn-level log rather than blocking the daemon's broker.

This matches the at-most-once delivery semantics the broker itself promises: no plugin should rely on every event arriving. Plugins that need durable delivery should subscribe to a NATS JetStream stream out-of-band, which is outside the scope of this contract.
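The drop-on-full policy can be illustrated with a bounded queue from the Python standard library. The host itself uses a Rust mpsc channel, so this is an analogy. `BoundedWriter` is a hypothetical name.

```python
import queue


class BoundedWriter:
    """Bounded buffer that drops new frames instead of blocking the producer."""

    def __init__(self, depth=64):
        self.q = queue.Queue(maxsize=depth)
        self.dropped = 0

    def offer(self, frame):
        """Enqueue a frame; return False if it was dropped because the buffer is full."""
        try:
            self.q.put_nowait(frame)
            return True
        except queue.Full:
            self.dropped += 1  # the host emits a warn-level log at this point
            return False
```

If the child stops reading and 70 events arrive, the first 64 are buffered and the remaining 6 are dropped, while the producer never blocks.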

9. Examples

9.1 Rust

Using the nexo-microapp-sdk crate with the plugin feature (Phase 81.15.a):

use nexo_microapp_sdk::plugin::{PluginAdapter, BrokerSender};
use nexo_broker::Event;

const MANIFEST: &str = include_str!("../nexo-plugin.toml");

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    PluginAdapter::new(MANIFEST)?
        .on_broker_event(|topic: String, event: Event, broker: BrokerSender| async move {
            // Outbound: deliver to the external service.
            // (Pseudocode; replace with your channel client.)
            let payload = event.payload.clone();
            let text = payload.get("text").and_then(|v| v.as_str()).unwrap_or("");
            send_to_slack(text).await;

            // Inbound: relay any reply back via the broker.
            let reply = Event::new(
                "plugin.inbound.slack",
                "slack",
                serde_json::json!({"echo": text}),
            );
            let _ = broker.publish("plugin.inbound.slack", reply).await;
        })
        .on_shutdown(|| async { Ok(()) })
        .run_stdio()
        .await?;
    Ok(())
}

async fn send_to_slack(_text: &str) {}

9.2 Python — `nexoai`

pip install nexoai (the nexo-plugin-sdk name was taken on PyPI; the importable module stays nexo_plugin_sdk). Source: nexo-plugin-sdks/python/.

import asyncio
from nexo_plugin_sdk import PluginAdapter, Event

MANIFEST = open("nexo-plugin.toml").read()

async def on_event(topic: str, event: Event, broker) -> None:
    # call back into the host (memory.recall §5.2 / llm.complete §5.3):
    entries = await broker.memory_recall(agent_id="my_agent", query="user prefers concise answers", limit=5)
    result = await broker.llm_complete(provider="minimax", model="minimax-m2.5",
                                       messages=[{"role": "user", "content": "summarize: ..."}])
    await broker.publish("plugin.inbound.slack",
                         Event.new("plugin.inbound.slack", "slack", {"summary": result.content}))

async def main() -> None:
    await PluginAdapter(manifest_toml=MANIFEST, on_event=on_event).run()

asyncio.run(main())

9.3 TypeScript / Node — `nexo-plugin-sdk`

npm install nexo-plugin-sdk. Source: nexo-plugin-sdks/typescript/.

import { readFileSync } from "node:fs";
import { PluginAdapter, Event } from "nexo-plugin-sdk";

const adapter = new PluginAdapter({
  manifestToml: readFileSync("nexo-plugin.toml", "utf-8"),
  onEvent: async (topic, event, broker) => {
    const entries = await broker.memoryRecall({ agentId: "my_agent", query: "user prefers concise answers", limit: 5 });
    const result = await broker.llmComplete({ provider: "minimax", model: "minimax-m2.5",
      messages: [{ role: "user", content: "summarize: ..." }] });
    await broker.publish("plugin.inbound.slack",
      Event.new("plugin.inbound.slack", "slack", { summary: result.content }));
  },
});
await adapter.run();

9.4 PHP — `nexo/plugin-sdk`

composer require nexo/plugin-sdk (PHP ≥ 8.1 — uses Fibers). Source: nexo-plugin-sdks/php/ (mirrored to nexo-plugin-sdk-php for Packagist).

<?php declare(strict_types=1);
require __DIR__ . '/vendor/autoload.php';
use Nexo\Plugin\Sdk\{PluginAdapter, BrokerSender, Event};

$adapter = new PluginAdapter([
    'manifestToml' => file_get_contents(__DIR__ . '/nexo-plugin.toml'),
    'onEvent' => function (string $topic, Event $event, BrokerSender $broker): void {
        $entries = $broker->memoryRecall(['agentId' => 'my_agent', 'query' => 'user prefers concise answers', 'limit' => 5]);
        $r = $broker->llmComplete(['provider' => 'minimax', 'model' => 'minimax-m2.5',
            'messages' => [['role' => 'user', 'content' => 'summarize: ...']]]);
        $broker->publish('plugin.inbound.slack', Event::new('plugin.inbound.slack', 'slack', ['summary' => $r->content]));
        // streaming: $broker->llmCompleteStream($opts, fn(string $chunk) => /* ... */);
    },
]);
$adapter->run();

9.5 Tools — host-initiated `tool.invoke` (Phase 81.29)

A plugin that declares [plugin.extends].tools = ["myplugin_weather"] in its manifest advertises a tool catalog at handshake (the initialize reply's tools array, §4.1.1) and handles one tool.invoke request per agent-loop tool call (§5.t). All four SDKs expose the same surface: a catalog of tool definitions, one dispatch handler (optionally with a context giving broker access mid-invocation), and a typed -33401..-33405 error band.

Rust — crates/microapp-sdk with feature plugin:

use nexo_microapp_sdk::plugin::{PluginAdapter, ToolDef, ToolInvocation, ToolInvocationError};

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    PluginAdapter::new(include_str!("../nexo-plugin.toml"))?
        .declare_tools(vec![ToolDef {
            name: "myplugin_weather".into(),
            description: "Current weather for a city".into(),
            input_schema: serde_json::json!({
                "type": "object", "properties": { "city": { "type": "string" } }, "required": ["city"]
            }),
        }])
        .on_tool(|inv: ToolInvocation| async move {
            let city = inv.args.get("city").and_then(|v| v.as_str())
                .ok_or_else(|| ToolInvocationError::ArgumentInvalid("missing `city`".into()))?;
            Ok(serde_json::json!({ "content": [{ "type": "text", "text": format!("Sunny in {city}") }], "is_error": false }))
        })
        .run_stdio().await?;
    Ok(())
}

Python — nexoai:

import asyncio
from nexo_plugin_sdk import PluginAdapter, ToolDef, ToolInvocation, ToolArgumentInvalid, text_result

MANIFEST = open("nexo-plugin.toml").read()

async def on_tool(inv: ToolInvocation):
    city = (inv.args or {}).get("city")
    if not city:
        raise ToolArgumentInvalid("missing `city`")
    return text_result(f"Sunny in {city}")

async def main() -> None:
    await PluginAdapter(
        manifest_toml=MANIFEST,
        tools=[ToolDef("myplugin_weather", "Current weather for a city",
                       {"type": "object", "properties": {"city": {"type": "string"}}, "required": ["city"]})],
        on_tool=on_tool,           # or on_tool_with_context=fn(inv, ctx) — ctx.broker = the on_event broker handle
    ).run()

asyncio.run(main())

TypeScript — nexo-plugin-sdk:

import { readFileSync } from "node:fs";
import { PluginAdapter, ToolArgumentInvalidError, textResult } from "nexo-plugin-sdk";

const adapter = new PluginAdapter({
  manifestToml: readFileSync("nexo-plugin.toml", "utf-8"),
  tools: [{ name: "myplugin_weather", description: "Current weather for a city",
    inputSchema: { type: "object", properties: { city: { type: "string" } }, required: ["city"] } }],
  onTool: (inv) => {
    const city = (inv.args as { city?: string } | null)?.city;
    if (!city) throw new ToolArgumentInvalidError("missing `city`");
    return textResult(`Sunny in ${city}`);
  },
  // or onToolWithContext: (inv, ctx) => { ... ctx.broker.memoryRecall(...) ... }
});
await adapter.run();

PHP — nexo/plugin-sdk:

<?php declare(strict_types=1);
require __DIR__ . '/vendor/autoload.php';
use Nexo\Plugin\Sdk\{PluginAdapter, Tool, ToolArgumentInvalid, ToolDef, ToolInvocation};

$adapter = new PluginAdapter([
    'manifestToml' => file_get_contents(__DIR__ . '/nexo-plugin.toml'),
    'tools' => [new ToolDef('myplugin_weather', 'Current weather for a city',
        ['type' => 'object', 'properties' => ['city' => ['type' => 'string']], 'required' => ['city']])],
    'onTool' => function (ToolInvocation $inv) {
        $city = $inv->args['city'] ?? null;
        if (!$city) { throw new ToolArgumentInvalid('missing `city`'); }
        return Tool::text("Sunny in {$city}");
    },
    // or 'onToolWithContext' => fn(ToolInvocation $inv, ToolContext $ctx) => /* $ctx->broker->memoryRecall(...) */
]);
$adapter->run();

Throwing one of the typed errors maps to the matching -33401..-33405 code: ToolNotFound (-33401), ToolArgumentInvalid (-33402, carries details), ToolExecutionFailed (-33403 — also the catch-all for an uncaught generic exception), ToolUnavailable (-33404, carries retry_after_ms), ToolDenied (-33405). A tool.invoke arriving when no handler is registered replies -32601. Declaring a tool whose name is not in the manifest's [plugin.extends].tools is a hard failure at construction (the daemon would otherwise kill the plugin — see §4.1.1).

10. Versioning + compatibility

This contract uses semver. It was first published as 1.0.0; the changelog (§13) records each later revision.

Change kind	Semver bump
Add a new optional manifest field	minor
Add a new optional method (host or child)	minor
Add a new optional notification	minor
Add a new error code in `-32000..-32099`	minor
Remove or rename a method / notification / field	major
Change the JSON shape of a method's params or result	major
Tighten validation (e.g. rejecting previously-allowed input)	major

Plugins should declare the contract version they target via the manifest's min_nexo_version field plus a future contract_version field (Phase 81.16 follow-up). The host rejects plugins targeting a major version it does not support.
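The bump rules and the major-version gate reduce to simple arithmetic on version triples. In this sketch `parse_semver`, `bump`, and `host_accepts` are hypothetical names, and the gate checks only the major component because that is all the text above specifies.

```python
def parse_semver(version):
    """Split 'MAJOR.MINOR.PATCH' into a tuple of ints."""
    major, minor, patch = (int(part) for part in version.split("."))
    return major, minor, patch


def bump(version, kind):
    """Apply a 'minor' or 'major' bump as listed in the table above."""
    major, minor, _patch = parse_semver(version)
    if kind == "major":
        return "{}.0.0".format(major + 1)
    if kind == "minor":
        return "{}.{}.0".format(major, minor + 1)
    raise ValueError("unknown bump kind: " + kind)


def host_accepts(plugin_target, supported_majors):
    """The host rejects plugins that target a major version it does not support."""
    return parse_semver(plugin_target)[0] in supported_majors
```

Adding an optional method to a 1.8.0 contract gives `bump("1.8.0", "minor") == "1.9.0"`, and a host that supports only major version 1 would reject a plugin targeting 2.0.0.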

11. Reference implementations

Host adapter: crates/core/src/agent/nexo_plugin_registry/subprocess.rs (SubprocessNexoPlugin) — Phase 81.14 + 81.14.b.
Rust child SDK: crates/microapp-sdk/src/plugin.rs (PluginAdapter, feature plugin) — Phase 81.15.a.
Python / TypeScript / PHP child SDKs: github.com/lordmacu/nexo-plugin-sdks — python/ (PyPI nexoai), typescript/ (npm nexo-plugin-sdk), php/ (Packagist nexo/plugin-sdk, via the nexo-plugin-sdk-php mirror). All implement initialize / broker.event / shutdown / broker.publish, the child→host calls memory.recall / llm.complete (+ llm.complete.delta streaming), and the host→child tool.invoke request + the initialize-reply tool catalog (§4.1.1 / §5.t — Python nexoai ≥ 0.4.0, TypeScript / PHP ≥ 0.3.0).
Go SDK: not yet planned.

11.1 Conformance kit

nexo-plugin-sdks/conformance/ is a cross-language conformance kit made of three parts: one Python mock-host (conformance/mock_host.py), a set of declarative scenarios (conformance/scenarios/*.json, whose expect* steps are the golden), and one config-driven fixture per SDK.

An SDK is conformant iff python conformance/run.py --lang <lang> passes for it. The kit drives the fixture through every exchange this contract defines (initialize / shutdown / broker.event / broker.publish / memory.recall / llm.complete (+ streaming) / tool.invoke + the §4.1.1 catalog) and diffs the frames structurally: methods, ids, result/error shapes, error codes, error.data shape, and key presence/absence, but not message text.

The nexo-plugin-sdks CI runs the {python, typescript, php} matrix. The Rust SDK runs the same kit in this repo's CI via a shallow clone (--lang rust --fixture <built-binary> --check-contract-version docs/src/plugins/contract.md). The version check ties §13's top entry to the kit's SCENARIOS_TARGET, so a contract bump that lands without updating the kit fails CI.

The kit does not replace the per-SDK test suites, which cover language-specific robustness (async readers, Fiber scheduling, the stdout guard, signal handling).

Added in Phase 31.12. Follow-up 31.12.b covers the Rust-fixture wiring and reconciling the divergences the kit surfaces: the Rust child SDK does not yet emit error.data.details / error.data.retry_after_ms, and its nexo_broker::Event serializes with extra id / timestamp / session_id fields the scripting SDKs omit.

12. Out of contract scope

The following are part of the broader plugin platform but are deliberately out of THIS document's scope:

memory.recall / llm.complete / tool.dispatch RPC bridges (Phase 81.20) — let the child invoke daemon-mediated framework services.
Supervisor + respawn + resource limits (Phase 81.21).
Sandbox (network + filesystem allowlist via manifest, Phase 81.22).
Stdio → tracing bridge (Phase 81.23).
Plugin marketplace + signing (Phase 31).

Each of these will either extend this contract additively (in which case contract_version bumps minor) or live in a separate contract document.

13. Changelog

Version	Date	Changes
`1.0.0`	2026-05-01	Initial publication. Lifecycle (`initialize` / `shutdown`) + broker bridge (`broker.event` / `broker.publish`) + manifest `[plugin.entrypoint]` section. Host adapter shipped in Phase 81.14 + 81.14.b; Rust child SDK in Phase 81.15.a.
`1.1.0`	2026-05-01	Phase 81.20.a — `memory.recall` request-response added. Additive; existing 1.0.0 plugins continue to work unchanged. Manifest `[plugin.supervisor]` section (Phase 81.21.b) — additive. Host-side activation: Phase 81.17.b boot wire. Phase 81.21 supervisor + 81.21.b stderr tail capture.
`1.2.0`	2026-05-01	Phase 81.20.b — `llm.complete` request-response added. Additive. MVP supports text responses only; tool-call responses surface as `-32601 not_implemented`. Streaming (`llm.complete.delta` notifications) on roadmap as 81.20.b.b. Host-side runtime threading deferred to 81.20.b.b — daemon today returns `-32603 "llm not configured"` until main.rs threads `LlmServices` into the subprocess pipeline.
`1.3.0`	2026-05-01	Phase 81.20.b.b runtime threading shipped (memory + llm both flow end-to-end through production daemon path). Phase 81.20.b.c streaming added — `llm.complete` accepts `stream: true` opt-in; chunks emit as `llm.complete.delta { request_id, chunk }` notifications, final reply omits `content`. Additive — non-streaming requests unchanged.
`1.4.0`	2026-05-04	Phase 81.28 — `[plugin.extends]` manifest section added (`channels` / `llm_providers` / `memory_backends` / `hooks` lists). Schema-only this revision: parser + validator ship; daemon dispatch wiring per registry lands in Phase 81.24 (channels), 81.25 (LLM providers), 81.26 (memory backends), 81.27 (hooks). Additive — manifests without `[plugin.extends]` parse and validate unchanged.
`1.5.0`	2026-05-04	Phase 81.24 — `channel.start` / `channel.stop` / `channel.send_outbound` host-initiated request methods added (§5.x). Subprocess plugins declaring `[plugin.extends].channels = [...]` get one `RemoteChannelAdapter` per kind registered into the daemon's `ChannelAdapterRegistry`. Channel-specific error codes `-33001` through `-33005` map onto typed `ChannelAdapterError` variants. Default host-side timeouts: 30 s for start/stop, 60 s for send_outbound; `NEXO_PLUGIN_CHANNEL_TIMEOUT_MS` overrides all three. Additive — plugins not declaring channels are unaffected.
`1.6.0`	2026-05-04	Phase 81.25 — `llm.chat` host-initiated request method (sync + streaming via `params.stream` flag) + `llm.chat.delta` streaming notifications added (§5.y). Subprocess plugins declaring `[plugin.extends].llm_providers = [...]` get one `RemoteLlmFactory` per provider name registered into the daemon's `LlmRegistry`. LLM-specific error codes `-33101` through `-33105`. Default timeouts: 60 s sync chat, 300 s streaming; `NEXO_PLUGIN_LLM_TIMEOUT_MS` overrides both. Additive — plugins not declaring llm_providers are unaffected.
`1.7.0`	2026-05-04	Phase 81.27 — `hook.on_hook` host-initiated request method added (§5.z). Subprocess plugins declaring `[plugin.extends].hooks = [...]` get one `RemoteHookHandler` per hook name registered into the daemon's `HookRegistry`. Reply shape is the existing `HookResponse` (already serde-derived); reused directly as wire type. Continue-on-error semantic: every dispatch failure (transport, timeout, JSON-RPC err, decode) returns `HookResponse::default()` so `HookRegistry::fire` keeps iterating + agent flow doesn't break. Default 5s timeout (lower than channels/LLM); `NEXO_PLUGIN_HOOK_TIMEOUT_MS` env override. Additive — plugins not declaring hooks are unaffected.
`1.8.0`	2026-05-04	Phase 81.26 — `memory.vector_upsert` / `memory.vector_search` / `memory.vector_delete` host-initiated request methods added (§5.w). Subprocess plugins declaring `[plugin.extends].memory_backends = [...]` get one `RemoteVectorBackend` per name registered into the daemon's `VectorBackendRegistry`. Memory-specific error codes `-33301..=-33304`. Default timeouts: 30s upsert/delete, 10s search; `NEXO_PLUGIN_MEMORY_TIMEOUT_MS` env override. v1 ships wire + registry only — consumer-side wiring (`LongTermMemory.recall_vector` reading from registry) lands in 81.26.b. Vector-only scope: short/long-term memory keep SQLite; plugin replaces ONLY the vector index. Additive — plugins not declaring memory_backends are unaffected.
`1.9.0`	2026-05-04	Phase 81.22 — `[plugin.sandbox]` manifest section added (§2.2). Linux-only bubblewrap-based isolation: 5 fields (`enabled`, `network`, `fs_read_paths`, `fs_write_paths`, `drop_user`). Hard denylist enforced via `SANDBOX_DENYLIST_HOST_PATHS` const — operator-supplied allowlists that cover or equal denylisted paths are rejected at validate time. Two operator capability env knobs: `NEXO_PLUGIN_SANDBOX_REQUIRE` (strict-mode rejection of sandbox-disabled plugins) + `NEXO_PLUGIN_SANDBOX_HOST_NET_ALLOW` (gate for `network = "host"`). macOS no-op + warn (native sandbox-exec deferred to 81.22.macos). Default off — every existing manifest parses and runs unchanged. Additive — plugins without `[plugin.sandbox]` are unaffected.
`1.10.0`	2026-05-04	Phase 81.29 — `tool.invoke` host-initiated request method added (§5.t) + initialize-reply `tools` array extension (§4.1.1). Subprocess plugins declaring `[plugin.extends].tools = [...]` advertise a tool catalog (`name`/`description`/`input_schema`) at handshake; daemon caches the schemas + builds typed function-calling defs for the LLM without per-call round-trip. Each agent-loop tool call becomes a single `tool.invoke { plugin_id, tool_name, args, agent_id }` request. Tool-specific error codes `-33401..=-33405` map onto typed failures (`ToolNotFound` / `ToolArgumentInvalid` / `ToolExecutionFailed` / `ToolUnavailable` / `ToolDenied`). Default timeout 60 s; `NEXO_PLUGIN_TOOL_TIMEOUT_MS` env override. Subset check: advertised tools MUST be subset of `extends.tools` (drift detection). New `extends.tools` field is the 5th list in `[plugin.extends]` (joining channels/llm_providers/memory_backends/hooks). Tool name MUST satisfy per-plugin namespace policy from 81.3 (`<plugin_id>_` or `ext_<plugin_id>_`). Completes the 5-wrapper subprocess fleet (channels 81.24 + LLM 81.25 + hooks 81.27 + memory 81.26 + tools 81.29) — subprocess plugins can now contribute every category of host-side capability. Additive — plugins not declaring `extends.tools` are unaffected.
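
The default timeouts and env overrides listed in the rows above can be summarised as a small lookup. This is a sketch under stated assumptions, not the daemon's code: the operation labels (`chat`, `chat_stream`, and so on) are shorthand chosen here, and falling back to the default on an unparsable or non-positive override is an assumed policy.

```python
import os

# Defaults in milliseconds, transcribed from the changelog rows above.
# The operation keys are labels chosen for this sketch.
DEFAULT_TIMEOUTS_MS = {
    "channel": {"start": 30_000, "stop": 30_000, "send_outbound": 60_000},
    "llm": {"chat": 60_000, "chat_stream": 300_000},
    "hook": {"on_hook": 5_000},
    "memory": {"vector_upsert": 30_000, "vector_delete": 30_000, "vector_search": 10_000},
    "tool": {"invoke": 60_000},
}

# One env knob per family; it overrides every operation in that family.
ENV_OVERRIDES = {
    "channel": "NEXO_PLUGIN_CHANNEL_TIMEOUT_MS",
    "llm": "NEXO_PLUGIN_LLM_TIMEOUT_MS",
    "hook": "NEXO_PLUGIN_HOOK_TIMEOUT_MS",
    "memory": "NEXO_PLUGIN_MEMORY_TIMEOUT_MS",
    "tool": "NEXO_PLUGIN_TOOL_TIMEOUT_MS",
}

def resolve_timeout_ms(family, op, env=None):
    """Return the effective timeout for one host-initiated request."""
    env = os.environ if env is None else env
    raw = env.get(ENV_OVERRIDES[family])
    if raw is not None:
        try:
            value = int(raw)
        except ValueError:
            value = 0
        if value > 0:  # assumed policy: ignore junk or non-positive overrides
            return value
    return DEFAULT_TIMEOUTS_MS[family][op]
```

Passing `env` explicitly keeps the function easy to test without touching the real process environment.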

Plugin patterns

Common shapes for nexo subprocess plugins. Each pattern is a template you adapt — pick the closest match to what you're building, copy the skeleton, modify.

All patterns work in any of the 4 SDK languages (Rust / Python / TypeScript / PHP). Examples below use the language that's clearest for the pattern.

Pattern 1 · Echo channel

When to use · You're learning the SDK or wiring a brand-new channel and want a smoke-test before adding logic.

The plugin echoes every inbound broker.event back as broker.publish on a mirrored topic. Useful for verifying the wire format end-to-end before you write business logic.

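import asyncio
from nexo_plugin_sdk import PluginAdapter, Event

async def on_event(topic, event, broker):
    # Mirror the outbound topic onto the matching inbound topic.
    out_topic = topic.replace("plugin.outbound.", "plugin.inbound.")
    await broker.publish(out_topic, Event.new(out_topic, "my_plugin", event.payload))

async def main():
    # MANIFEST holds the body of nexo-plugin.toml.
    await PluginAdapter(manifest_toml=MANIFEST, on_event=on_event).run()

asyncio.run(main())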

→ Used in every template (extensions/template-plugin-{rust,python,typescript,php}/)

Pattern 2 · Webhook receiver

When to use · An external service POSTs JSON; you want the daemon to see it as a plugin.inbound.<kind> event.

Plugin runs an HTTP server (or listens on a Unix socket) for inbound POST requests. Each request becomes a broker publish. Plugin's manifest declares an http_server capability so the daemon's reverse-proxy / port-allocator wires the route.

use std::sync::Arc;

use axum::{extract::State, routing::post, Json, Router};
use nexo_microapp_sdk::plugin::{BrokerSender, Event, PluginAdapter};
use serde_json::Value;

// Each POST body becomes one broker publish on plugin.inbound.webhook.
async fn webhook(State(broker): State<Arc<BrokerSender>>, Json(body): Json<Value>) {
    let event = Event::new("plugin.inbound.webhook", "my_plugin", body);
    let _ = broker.publish("plugin.inbound.webhook", event).await;
}

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    let adapter = PluginAdapter::new(MANIFEST)?;
    let broker = adapter.broker();
    // Example bind address; use whatever the daemon's port allocator assigns.
    let listener = tokio::net::TcpListener::bind("127.0.0.1:8080").await?;
    tokio::spawn(async move {
        let app = Router::new().route("/webhook", post(webhook)).with_state(broker);
        axum::serve(listener, app).await
    });
    adapter.run_stdio().await?;
    Ok(())
}

Manifest declares the inbound topic the plugin will publish to:

[[plugin.channels.register]]
kind = "webhook"
adapter = "WebhookAdapter"

Pattern 3 · RPC bridge to an external API

When to use · You're exposing a third-party service (Stripe, Twilio, internal CRM) as a tool the agent can call.

Plugin doesn't deal with channels — it registers as a tool provider. The agent sends a tool.call request; the plugin forwards to the external API and replies.

import { Event, PluginAdapter } from "nexo-plugin-sdk";

const adapter = new PluginAdapter({
  manifestToml: MANIFEST,
  onEvent: async (topic, event, broker) => {
    if (topic === "plugin.tool.stripe.create_invoice") {
      // Forward the tool arguments to Stripe, then publish the result as the reply.
      const inv = await stripeClient.invoices.create(event.payload);
      await broker.publish("plugin.tool.stripe.create_invoice.reply",
        Event.new("plugin.tool.reply", "stripe-bridge", { result: inv }));
    }
  },
});

await adapter.run();

Manifest contributes the tool:

[[plugin.tools.expose]]
name = "stripe.create_invoice"
schema_path = "./tools/create_invoice.json"

Pattern 4 · Scheduled poller

When to use · You need to poll an external feed every N minutes and publish only changes (deltas) to the broker.

Plugin holds local state (last-seen IDs / etag / cursor), re-polls on a timer, dedupes against state, publishes new items. Persist state to <state_dir>/<plugin_id>/state.json so restarts don't re-emit historical items.

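import asyncio, json, aiohttp
from pathlib import Path
from nexo_plugin_sdk import PluginAdapter, Event

# Replace with <state_dir>/<plugin_id>/state.json in a real deployment.
STATE = Path(".nexo-state/poller.json")
seen_ids: set[str] = set(json.loads(STATE.read_text())) if STATE.exists() else set()

async def poll_loop(broker):
    async with aiohttp.ClientSession() as s:  # one session reused across polls
        while True:
            async with s.get("https://example.com/feed.json") as resp:
                items = await resp.json()
            for item in items:
                if item["id"] in seen_ids:
                    continue  # already published on an earlier poll
                seen_ids.add(item["id"])
                await broker.publish("plugin.inbound.feed",
                    Event.new("plugin.inbound.feed", "feed_poller", item))
            # Persist after every poll so a restart doesn't re-emit old items.
            STATE.parent.mkdir(parents=True, exist_ok=True)
            STATE.write_text(json.dumps(sorted(seen_ids)))
            await asyncio.sleep(300)  # 5 min

async def main():
    adapter = PluginAdapter(manifest_toml=MANIFEST)
    # create_task needs a running loop, so start the poller inside main().
    poller = asyncio.create_task(poll_loop(adapter.broker))
    await adapter.run()
    poller.cancel()

asyncio.run(main())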

→ See Build a poller module for the YAML-only path that doesn't need a plugin at all.
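
The dedupe-and-persist step described above can be isolated from the network and broker code. The sketch below is one possible shape, assuming each item carries a string `id`; the helper names are hypothetical and not part of the SDK. It writes the state file through a temporary file plus `os.replace`, so a crash mid-write leaves the previous state intact.

```python
import json
import os
import tempfile
from pathlib import Path

def load_seen(state_path: Path) -> set:
    """Read previously seen IDs; an absent file means a first run."""
    if not state_path.exists():
        return set()
    return set(json.loads(state_path.read_text()))

def save_seen(state_path: Path, seen: set) -> None:
    """Atomically persist the seen IDs (temp file, then rename)."""
    state_path.parent.mkdir(parents=True, exist_ok=True)
    fd, tmp = tempfile.mkstemp(dir=state_path.parent, suffix=".tmp")
    with os.fdopen(fd, "w") as fh:
        json.dump(sorted(seen), fh)
    os.replace(tmp, state_path)

def new_items(items, seen: set) -> list:
    """Return only unseen items, recording their IDs in `seen`."""
    fresh = []
    for item in items:
        if item["id"] not in seen:
            seen.add(item["id"])
            fresh.append(item)
    return fresh
```

A poll loop would call `new_items` on each fetched batch, publish the result, and then call `save_seen`.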

Pattern 5 · Long-running connection (websocket / SSE)

When to use · The external service is push-based (Slack RTM, Discord gateway, MQTT broker, custom WebSocket).

Plugin opens the persistent connection at startup. Inbound messages from the external side become broker.publish events. On disconnect, the plugin reconnects with exponential backoff.

// Fragment: runs inside an async fn that returns a Result and has `broker` in scope.
use futures_util::StreamExt; // provides split() and next()
use tokio_tungstenite::connect_async;

let (ws, _) = connect_async("wss://gateway.discord.gg/").await?;
let (_write, mut read) = ws.split();
// Auth handshake omitted...

while let Some(msg) = read.next().await {
    let evt = parse_discord(msg?)?;
    broker.publish("plugin.inbound.discord", evt).await?;
}
// On disconnect: reconnect with backoff.

The SDK's signal handling (default-on) lets the daemon shut the plugin down cleanly even mid-connection.
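
The reconnect-with-backoff step is left as a comment in the fragment above. A minimal sketch of one way to structure it follows, written in Python for brevity. The base delay, factor, cap, and jitter range are illustrative choices, and `connect_and_pump` is a hypothetical coroutine that opens the connection and returns or raises when it drops.

```python
import asyncio
import random

def backoff_delays(base=1.0, factor=2.0, cap=60.0):
    """Yield base, base*factor, ... capped at `cap`, forever."""
    delay = base
    while True:
        yield min(delay, cap)
        delay = min(delay * factor, cap)

async def run_with_reconnect(connect_and_pump, max_attempts=None,
                             sleep=asyncio.sleep, jitter=random.random):
    """Call connect_and_pump repeatedly, sleeping with backoff between tries."""
    delays = backoff_delays()
    attempts = 0
    while max_attempts is None or attempts < max_attempts:
        attempts += 1
        try:
            await connect_and_pump()
            delays = backoff_delays()  # session ended cleanly: reset backoff
        except ConnectionError:
            pass  # keep the current backoff position
        # Jitter scales the delay into [50%, 100%) to avoid thundering herds.
        await sleep(next(delays) * (0.5 + jitter() / 2))
```

Injecting `sleep` and `jitter` keeps the loop deterministic under test; production code would use the defaults.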

Pattern 6 · Stateful conversation glue

When to use · The external channel sends fragments (audio chunks, typing indicators, partial messages) and you want to assemble them before the agent sees a complete event.

Plugin maintains a per-conversation buffer; only emits a broker.publish when the message is "complete" (final chunk, silence timeout, or explicit done marker).

import asyncio
from nexo_plugin_sdk import Event

buffer: dict[str, list[str]] = {}
timers: dict[str, asyncio.Task] = {}

async def on_chunk(conv_id, fragment, broker):
    buffer.setdefault(conv_id, []).append(fragment)
    # Each new fragment restarts the silence timer for its conversation.
    if conv_id in timers:
        timers[conv_id].cancel()
    timers[conv_id] = asyncio.create_task(flush_after(conv_id, broker, delay=2.0))

async def flush_after(conv_id, broker, delay):
    await asyncio.sleep(delay)
    timers.pop(conv_id, None)  # timer fired; drop the finished task
    text = "".join(buffer.pop(conv_id, []))
    await broker.publish("plugin.inbound.assembled",
        Event.new("plugin.inbound.assembled", "voice_glue", {"text": text}))
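
The example above covers the silence-timeout trigger only. The other two completion triggers (final chunk, explicit done marker) can be sketched as a small synchronous buffer. The class name, the `final` flag, and the `"[DONE]"` marker are hypothetical choices for illustration; a real channel defines its own signals.

```python
from typing import Optional

class FragmentAssembler:
    """Per-conversation buffer that returns text once a message is complete."""

    def __init__(self, done_marker: str = "[DONE]"):
        self._buffers = {}
        self._done_marker = done_marker  # hypothetical marker value

    def add(self, conv_id: str, fragment: str, final: bool = False) -> Optional[str]:
        """Buffer a fragment; return the assembled text when complete."""
        if fragment == self._done_marker:
            return self.flush(conv_id)
        self._buffers.setdefault(conv_id, []).append(fragment)
        if final:
            return self.flush(conv_id)
        return None

    def flush(self, conv_id: str) -> Optional[str]:
        """Emit and clear whatever is buffered (also usable from a timer)."""
        parts = self._buffers.pop(conv_id, None)
        if not parts:
            return None
        return "".join(parts)
```

A silence timer can call `flush` directly, so all three triggers share one exit path.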

Pattern 7 · Outbound-only adapter

When to use · The plugin only sends (Twilio SMS sender, push notification dispatcher, Slack outbound webhook).

Plugin subscribes to plugin.outbound.<kind> events from the daemon, calls the external API, and publishes a delivery_status event back so the agent knows whether it landed.

import { Event, PluginAdapter } from "nexo-plugin-sdk";

const adapter = new PluginAdapter({
  manifestToml: MANIFEST,
  onEvent: async (topic, event, broker) => {
    if (topic.startsWith("plugin.outbound.sms")) {
      const result = await twilio.messages.create({
        to: event.payload.to,
        body: event.payload.body,
        from: TWILIO_FROM,
      });
      // Report the delivery outcome back so the agent can observe it.
      await broker.publish("plugin.delivery.sms",
        Event.new("plugin.delivery.sms", "twilio-out",
                  { sid: result.sid, status: result.status }));
    }
  },
});

await adapter.run();

Pattern 8 · Provider abstraction (multi-instance)

When to use · Operator wants 3 different Telegram bots, each isolated. Or 5 WhatsApp accounts.

Plugin manifest declares instance support. Operator's config spawns N copies, each with a distinct instance label. Topics become plugin.inbound.<kind>.<instance> so agent bindings can target a specific one.

# operator's pollers.yaml
plugins:
  telegram:
    - instance: support-bot
      bot_token_env: TG_SUPPORT_TOKEN
    - instance: sales-bot
      bot_token_env: TG_SALES_TOKEN

The plugin reads instance from args or env at startup and publishes to plugin.inbound.telegram.<instance>.
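
A sketch of that startup step, under loudly stated assumptions: the `--instance` flag and the `NEXO_PLUGIN_INSTANCE` variable are hypothetical names picked for illustration, since this page does not specify how the daemon passes the label.

```python
import argparse

def resolve_instance(argv, env):
    """Pick the instance label from CLI args first, then the environment."""
    parser = argparse.ArgumentParser(add_help=False)
    parser.add_argument("--instance")  # hypothetical flag name
    args, _unknown = parser.parse_known_args(argv)
    instance = args.instance or env.get("NEXO_PLUGIN_INSTANCE")  # hypothetical var
    if not instance:
        raise ValueError("no instance label in args or environment")
    return instance

def inbound_topic(kind, instance):
    """Build the per-instance inbound topic described above."""
    return f"plugin.inbound.{kind}.{instance}"
```

With the sample config above, the `support-bot` copy would publish to `plugin.inbound.telegram.support-bot`.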

Choosing between patterns

If you...	Use
Are wiring a brand-new channel for the first time	Echo (pattern 1)
Need to receive HTTP from an external service	Webhook receiver (2)
Are exposing an external API as a tool	RPC bridge (3)
Need to poll something on a timer	Scheduled poller (4)
Have a push-based external service	Long-running connection (5)
Receive fragmented inputs (chunks, partials)	Stateful glue (6)
Only need to send (no receive)	Outbound-only (7)
Want N copies of the same plugin	Provider abstraction (8)

Rust plugin SDK

Phase 31.9. Author plugins in Rust that the daemon spawns as subprocesses, talking the same JSON-RPC 2.0 wire format used by the Python / TypeScript / PHP SDKs.

The SDK lives in crates/microapp-sdk/ behind the plugin Cargo feature; the reference plugin template is at extensions/template-plugin-rust/. Use nexo plugin new <id> --lang rust to scaffold a fresh out-of-tree project from that template.

Read this when

You picked Rust from the language picker in Plugin authoring overview and want the SDK reference.
You are porting an in-tree plugin (crates/plugins/<id>) into an out-of-tree subprocess and need the wire-API mapping.
You want the canonical Rust handler signature for broker.event notifications.

Why subprocess + Rust

Running Rust plugins as separate processes — instead of crates linked into the daemon — gives you:

Isolation — a panic in your plugin terminates one process, not the daemon.
One contract, every language — the daemon treats your binary the same way it treats Python or TypeScript plugins. Switching languages later is an SDK choice, not a daemon recompile.
No link-time coupling — your plugin can use any Rust toolchain or tokio version that compiles; the daemon does not care.
Single static binary — cargo build --release produces one file the publish workflow uploads as the per-target tarball.

Daemon-side spawn code in crates/core/src/agent/nexo_plugin_registry/subprocess.rs treats the plugin as an opaque executable; Rust plugins re-use that path without modification.

Architecture

Operator host                              Plugin process
┌──────────────────┐    stdin   ┌─────────────────────────────┐
│ daemon (Rust)    │──JSON-RPC──▶│ target/release/<id>         │
│ subprocess host  │             │   tokio::main async runtime │
│                  │◀──JSON-RPC──│   PluginAdapter.run_stdio() │
└──────────────────┘    stdout   └─────────────────────────────┘

The daemon writes newline-delimited JSON-RPC requests to your binary's stdin; you write replies + outbound broker.publish notifications back on stdout. stderr is collected by the operator's tracing pipeline (Phase 81.23 fold pending) — use it freely for plugin-side logs.
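
Newline-delimited framing means one JSON document per line, with the newline as the only separator. The sketch below shows the idea in Python, independent of any SDK; the function names are made up for illustration.

```python
import json

def encode_frame(message: dict) -> bytes:
    """Serialize one JSON-RPC message as a single newline-terminated line."""
    # json.dumps escapes newlines inside strings, so the frame stays on one line.
    return (json.dumps(message, separators=(",", ":")) + "\n").encode("utf-8")

def decode_frames(buffer: bytes):
    """Split a byte buffer into complete messages plus the unread remainder."""
    *complete, rest = buffer.split(b"\n")
    messages = [json.loads(line) for line in complete if line.strip()]
    return messages, rest
```

A reader loop appends incoming bytes to the remainder and calls `decode_frames` again, so a frame split across two reads is handled naturally.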

Public API

use nexo_broker::Event;
use nexo_microapp_sdk::plugin::{BrokerSender, PluginAdapter};

PluginAdapter builder methods:

Method	Required	Description
`PluginAdapter::new(manifest_toml: &str)`	✅	Body of `nexo-plugin.toml`. Read once at startup; the SDK validates `plugin.id` + `plugin.version` and surfaces `ManifestError` on parse failure.
`.on_broker_event(handler)`	⬜	`async fn(topic: String, event: Event, broker: BrokerSender)`. Invoked for every `broker.event` notification. Each handler call is spawned on the runtime; the dispatch loop continues reading stdin without blocking.
`.on_shutdown(handler)`	⬜	`async fn() -> Result<(), Box<dyn Error + Send + Sync>>`. Awaited before the SDK replies `{ok: true}` to the host's `shutdown` request. In-flight `on_broker_event` tasks are awaited too.
`.run_stdio().await`	✅	Single-shot — calling it twice returns `PluginError::AlreadyRunning`. Drives the JSON-RPC loop until stdin closes or the host sends `shutdown`.

Event (re-exported from nexo-broker) carries topic, source, payload: serde_json::Value, and optional correlation_id + metadata. Construct it with Event::new(topic, source, payload), which stamps a fresh UUID + RFC3339 timestamp.
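
To make the envelope concrete, here is a Python sketch of a type with the same shape. It is an illustration, not the nexo-broker definition: the `id` and `timestamp` field names are assumptions, as is the choice of a version-4 UUID.

```python
import uuid
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Any, Optional

@dataclass
class SketchEvent:
    topic: str
    source: str
    payload: Any
    correlation_id: Optional[str] = None
    metadata: dict = field(default_factory=dict)
    id: str = ""         # assumed field name for the stamped UUID
    timestamp: str = ""  # assumed field name for the RFC 3339 timestamp

    @classmethod
    def new(cls, topic, source, payload):
        """Build an event with a fresh UUID and a UTC RFC 3339 timestamp."""
        return cls(
            topic=topic,
            source=source,
            payload=payload,
            id=str(uuid.uuid4()),
            timestamp=datetime.now(timezone.utc).isoformat(),
        )
```

Stamping the ID and timestamp in the constructor keeps handlers from having to supply them on every publish.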

BrokerSender::publish(topic: &str, event: Event) -> Result<(), WireError> serializes a broker.publish notification to stdout under an internal write lock. The daemon's bridge re-checks the topic against the manifest's [[plugin.channels.register]] allowlist before forwarding to the broker.

Manifest example

[plugin]
id = "my_plugin"
version = "0.1.0"
name = "My Plugin"
description = "Forwards inbound events to a third-party API."
min_nexo_version = ">=0.1.0"

[plugin.requires]
nexo_capabilities = ["broker"]

[[plugin.channels.register]]
kind = "my_plugin_inbound"
description = "Inbound events the plugin emits onto the broker."

plugin.id MUST match ^[a-z][a-z0-9_]{0,31}$. Cargo's [[bin]] name MUST equal plugin.id so the publish workflow's pack-tarball.sh finds the artifact at target/<target>/release/<id>.
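
The slug rule above is easy to check before the daemon ever sees the manifest. A minimal Python sketch using the same regular expression (the function name is illustrative):

```python
import re

# Same pattern the manifest rule states: a lowercase letter, then up to
# 31 more lowercase letters, digits, or underscores (32 characters max).
PLUGIN_ID_RE = re.compile(r"^[a-z][a-z0-9_]{0,31}$")

def is_valid_plugin_id(plugin_id: str) -> bool:
    """True when plugin_id satisfies the manifest's id pattern."""
    return PLUGIN_ID_RE.fullmatch(plugin_id) is not None
```

Hyphens and uppercase letters are the most common rejections: `my-plugin` and `My_Plugin` both fail, while `my_plugin` passes.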

See Plugin contract for the full manifest schema and the JSON-RPC envelope every method exchanges.

Quickstart

Scaffold + build + run, copy-paste:

nexo plugin new my_plugin --lang rust --owner alice
cd my_plugin
cargo build
nexo plugin run .

nexo plugin run boots the daemon with your plugin injected at the head of plugins.discovery.search_paths, bypassing the install pipeline. See Local dev loop for the inner-loop conventions and --no-daemon-config.

The handler in the scaffolded src/main.rs echoes every inbound event back on plugin.inbound.<id>_echo:

use nexo_broker::Event;
use nexo_microapp_sdk::plugin::{print_manifest_if_requested, BrokerSender, PluginAdapter};

const MANIFEST: &str = include_str!("../nexo-plugin.toml");

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    // First line — honour the daemon's plugin auto-discovery probe.
    // When invoked with `--print-manifest`, dump the embedded TOML
    // to stdout and exit 0 before constructing any runtime state.
    print_manifest_if_requested(MANIFEST);

    tracing_subscriber::fmt()
        .with_writer(std::io::stderr)
        .init();

    PluginAdapter::new(MANIFEST)?
        .on_broker_event(handle_event)
        .on_shutdown(|| async {
            tracing::info!("plugin shutdown handler invoked");
            Ok(())
        })
        .run_stdio()
        .await?;
    Ok(())
}

async fn handle_event(topic: String, event: Event, broker: BrokerSender) {
    let echo = Event::new(
        "plugin.inbound.my_plugin_echo",
        "my_plugin",
        serde_json::json!({
            "echoed_from": topic,
            "echoed_payload": event.payload,
        }),
    );
    let _ = broker
        .publish("plugin.inbound.my_plugin_echo", echo)
        .await;
}

Replace the body of handle_event with your channel's real outbound logic (forward to a third-party API, persist to disk, trigger a downstream agent, etc.) and re-publish the API's reply back through broker so agents can observe it.

Smoke test

Hand-run the binary against a synthetic JSON-RPC frame to confirm the handshake is well-formed:

echo '{"jsonrpc":"2.0","id":1,"method":"initialize","params":{}}' \
    | ./target/debug/my_plugin

The plugin should print one JSON-RPC response containing your manifest's id, version, name, and the SDK's server_version. If you see anything other than a single line of valid JSON on stdout, check that you have not added stray println!s in the handler — every byte on stdout must be a JSON-RPC frame. Use eprintln! / tracing::* for logs.
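
The "single line of valid JSON" check can be automated. The sketch below validates captured stdout against generic JSON-RPC 2.0 rules only; it deliberately does not assert where the manifest fields sit inside `result`, since this page does not spell out that nesting. The function name is hypothetical.

```python
import json

def check_handshake_stdout(stdout_text: str, request_id=1) -> dict:
    """Raise ValueError unless stdout holds exactly one JSON-RPC reply."""
    lines = [ln for ln in stdout_text.splitlines() if ln.strip()]
    if len(lines) != 1:
        raise ValueError(f"expected exactly one stdout line, got {len(lines)}")
    reply = json.loads(lines[0])  # JSONDecodeError is a ValueError subclass
    if not isinstance(reply, dict):
        raise ValueError("reply is not a JSON object")
    if reply.get("jsonrpc") != "2.0" or reply.get("id") != request_id:
        raise ValueError("reply has wrong jsonrpc version or id")
    if ("result" in reply) == ("error" in reply):
        raise ValueError("reply must carry exactly one of result / error")
    return reply
```

Feed it the output captured from the smoke-test command; a stray log line on stdout shows up as a line-count failure.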

Auto-discovery probe

The daemon's discovery walker checks each candidate binary with --print-manifest (Phase 81.33 Stage 8). Verify your plugin answers it correctly:

./target/debug/my_plugin --print-manifest

The expected output is the verbatim contents of nexo-plugin.toml followed by exit 0. The print_manifest_if_requested(MANIFEST) call in main() handles this for you — if the smoke test prints anything else (logs, empty stdout, a JSON-RPC frame) the helper is missing from your entry point.

Per-target tarball convention

Operators install Rust plugins via the same nexo plugin install <owner>/<repo>[@<tag>] CLI. The resolver expects per-target tarballs:

<id>-<version>-<target>.tar.gz
├── nexo-plugin.toml
└── bin/<id>           # static binary, mode 0755

Targets follow Rust's standard target triples (x86_64-unknown-linux-gnu, aarch64-apple-darwin, x86_64-unknown-linux-musl, etc.). The shipped CI workflow in extensions/template-plugin-rust/.github/workflows/release.yml covers Linux musl + macOS by default; add additional matrix entries to support more.

CI publish workflow

The shipped workflow has 4 jobs: validate-tag → build (matrix) → optional sign (cosign keyless, gated by repo variable COSIGN_ENABLED) → release (uploads all tarballs + sha256 sidecars + signing material + a copy of nexo-plugin.toml). See Publishing a plugin for the full asset naming convention and Signing & publishing for the end-to-end signed-release tutorial.

Local validation

Before pushing a tag, dry-run the pack step:

cargo build --release --target x86_64-unknown-linux-gnu
bash scripts/pack-tarball.sh x86_64-unknown-linux-gnu
ls dist/
# my_plugin-0.1.0-x86_64-unknown-linux-gnu.tar.gz
# my_plugin-0.1.0-x86_64-unknown-linux-gnu.tar.gz.sha256

The Rust integration test extensions/template-plugin-rust/tests/pack_tarball.rs covers this end-to-end against a synthetic binary; copy it when you fork the template to keep the convention regression-tested.

SDK tests

cargo test -p nexo-microapp-sdk --features plugin

Covers handshake, manifest validation, dispatch (including non-blocking reader proof), shutdown lifecycle, unknown-method handling, oversized-frame rejection.

Python plugin SDK

Author plugins in Python that the daemon spawns as subprocesses, talking the same JSON-RPC 2.0 wire format used by the Rust SDK in crates/microapp-sdk/. The robustness defaults (stdout guard, frame cap, signal handling) match the TypeScript and PHP SDKs (sub-phase 31.4.c).

Reference template: extensions/template-plugin-python/ (or run nexo plugin new --lang python). The SDK package lives in the nexo-plugin-sdks repo (python/ subdir) and ships on PyPI as nexoai — pip install nexoai (the nexo-plugin-sdk name was taken; the importable module is still nexo_plugin_sdk).

Why subprocess + Python instead of an embedded interpreter

Running Python plugins as separate processes:

Keeps the daemon language-agnostic; one wire contract, many SDK languages.
Isolates plugin failures (a runaway Python plugin cannot panic the daemon).
Sidesteps GIL coordination + PyO3 link-time complexity.

Daemon-side spawn code in crates/core/src/agent/nexo_plugin_registry/subprocess.rs treats the plugin as an opaque executable; Python plugins re-use it without modification.

Architecture summary

Operator host                         Plugin process
┌──────────────────┐    stdin   ┌──────────────────────────┐
│ daemon (Rust)    │──JSON-RPC──▶│ bin/<id> (bash launcher) │
│ subprocess host  │             │   exec python3 main.py   │
│                  │◀──JSON-RPC──│   PluginAdapter.run()    │
└──────────────────┘    stdout   └──────────────────────────┘

The bash launcher in bin/<id> sets PYTHONPATH to lib/ and exec's python3, so the plugin's vendored deps in lib/ are found ahead of anything in site-packages.

Public API

from nexo_plugin_sdk import (
    PluginAdapter,
    BrokerSender,
    Event, EventHandler, ShutdownHandler,
    PluginError, ManifestError, WireError,
    read_manifest,
    install_stdout_guard, uninstall_stdout_guard, is_stdout_guard_installed,
    STDOUT_GUARD_MARKER,
    MAX_FRAME_BYTES, JSONRPC_VERSION,
    serialize_frame, build_response, build_error_response, build_notification,
)

PluginAdapter constructor (all keyword-only):

Parameter	Default	Description
`manifest_toml: str`	required	Body of `nexo-plugin.toml`. Parsed + validated once at construction; the SDK checks `plugin.id` (incl. the `^[a-z][a-z0-9_]{0,31}$` slug regex the host enforces) and `plugin.version`. A failed construction leaves no stdout guard installed.
`server_version: str`	`"0.1.0"`	Returned in the `initialize` reply alongside the manifest.
`on_event`	`None`	`async (topic, Event, BrokerSender) -> None`. Invoked for every `broker.event` notification. Handler runs in a detached task; the dispatch loop continues reading stdin without blocking.
`on_shutdown`	`None`	`async () -> None`. Awaited before the SDK replies `{ok: true}` to the host's `shutdown` request. In-flight `on_event` (and `tool.invoke`) tasks are also awaited before returning.
`tools`	`None`	`list[ToolDef]` — the tool catalog advertised in the `initialize` reply's `tools` array (contract §4.1.1). Also settable post-construction via `.declare_tools([...])`. Every `name` must appear in the manifest's `[plugin.extends].tools` — a name that doesn't raises `ManifestError` at construction (mirrors the host's hard-failure). Omit → no `tools` array in the reply.
`on_tool`	`None`	`(ToolInvocation) -> Any`, sync or async. Dispatch handler for `tool.invoke` (contract §5.t). Runs on a detached task tracked by the shutdown drain. Mutually exclusive with `on_tool_with_context`.
`on_tool_with_context`	`None`	`(ToolInvocation, ToolContext) -> Any`, sync or async. Like `on_tool`, but `ctx.broker` is the same `BrokerSender` `on_event` gets — a tool body can `memory_recall` / `llm_complete` mid-invocation. Wins over `on_tool` when both are set.
`enable_stdout_guard: bool`	`True`	Replace `sys.stdout` with a line-buffering proxy that diverts non-JSON lines (a stray `print`) to stderr tagged `[stdout-guard]`. Blessed replies / `broker.publish` frames write through the captured original stdout, bypassing the guard.
`max_frame_bytes: int`	`MAX_FRAME_BYTES` (1 MiB)	Inbound JSON-RPC frames larger than this are rejected with a `WireError` log; dispatch continues.
`handle_process_signals: bool`	`True`	SIGTERM / SIGINT → graceful shutdown: drain in-flight handlers, then exit 0. `loop.add_signal_handler` is the primary path, falling back to `signal.signal` where unavailable (Windows ProactorEventLoop / non-main-thread).

Calling run() twice raises PluginError. The stdin reader is fully async (loop.connect_read_pipe + asyncio.StreamReader) — no threadpool worker.
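
The "detached task, drained at shutdown" behaviour described in the table can be pictured with plain asyncio. This is a sketch of the general technique, not the SDK's internals; `HandlerTracker` is a hypothetical name.

```python
import asyncio

class HandlerTracker:
    """Track detached handler tasks so shutdown can wait for them."""

    def __init__(self):
        self._tasks = set()

    def spawn(self, coro):
        """Start a handler without blocking the caller's read loop."""
        task = asyncio.create_task(coro)
        self._tasks.add(task)
        task.add_done_callback(self._tasks.discard)  # self-cleaning
        return task

    async def drain(self):
        """Wait until every in-flight handler has finished."""
        while self._tasks:
            await asyncio.gather(*list(self._tasks), return_exceptions=True)

async def demo():
    tracker = HandlerTracker()
    done = []

    async def handler(n):
        await asyncio.sleep(0.01)
        done.append(n)

    for n in range(3):
        tracker.spawn(handler(n))  # dispatch loop would keep reading here
    await tracker.drain()          # what a shutdown path would await
    return sorted(done)
```

`return_exceptions=True` keeps one failing handler from aborting the drain for the rest.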

Event is a dataclass with topic, source, payload, optional correlation_id + metadata. BrokerSender.publish(topic, event) serializes a JSON-RPC notification to the captured original stdout under an asyncio write lock.

Stdout guard limitation

The guard only intercepts the text-stream API (print, sys.stdout.write). A C extension or subprocess that writes to file descriptor 1 directly bypasses it and can corrupt the JSON-RPC stream. Route any human-readable output through print() / sys.stdout.write() so the guard can divert it, or write to stderr directly.
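
For intuition, a line-buffering guard of this kind can be sketched in a few lines. This is a simplified illustration rather than the SDK's implementation: it treats any line that parses as a JSON object as a frame, and it uses the literal `[stdout-guard]` tag mentioned above.

```python
import io
import json

MARKER = "[stdout-guard]"

class StdoutGuard(io.TextIOBase):
    """Pass JSON-object lines through; divert everything else to stderr."""

    def __init__(self, real_stdout, stderr):
        self._real = real_stdout
        self._err = stderr
        self._pending = ""

    def writable(self):
        return True

    def write(self, text):
        # Buffer until a full line is available, then route it.
        self._pending += text
        while "\n" in self._pending:
            line, self._pending = self._pending.split("\n", 1)
            self._route(line)
        return len(text)

    def _route(self, line):
        try:
            is_frame = isinstance(json.loads(line), dict)
        except ValueError:
            is_frame = False
        if is_frame:
            self._real.write(line + "\n")
        else:
            self._err.write(f"{MARKER} {line}\n")
```

Because it wraps a Python text stream, the sketch has the same blind spot as the real guard: bytes written straight to file descriptor 1 never pass through `write`.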

Tool dispatch (`tool.invoke`, contract §4.1.1 + §5.t)

A plugin that declares [plugin.extends].tools = ["myplugin_weather"] advertises a catalog of ToolDef(name, description, input_schema) and handles one tool.invoke request per agent-loop tool call:

import asyncio
from nexo_plugin_sdk import (
    PluginAdapter, ToolDef, ToolInvocation, ToolContext,
    ToolNotFound, ToolArgumentInvalid, ToolExecutionFailed, ToolUnavailable, ToolDenied,
    text_result,
)

async def on_tool(inv: ToolInvocation, ctx: ToolContext):
    if inv.tool_name != "myplugin_weather":
        raise ToolNotFound(inv.tool_name)
    city = (inv.args or {}).get("city")
    if not city:
        raise ToolArgumentInvalid("missing `city`", details={"field": "city"})
    # ctx.broker is the on_event broker handle, so host calls work mid-invocation:
    # _ = await ctx.broker.memory_recall(agent_id=inv.agent_id or "", query=city)
    return text_result(f"Sunny in {city}")     # any JSON value is fine; this is the conventional shape

async def main():
    await PluginAdapter(
        manifest_toml=MANIFEST,
        tools=[ToolDef("myplugin_weather", "Current weather for a city",
                       {"type": "object", "properties": {"city": {"type": "string"}}, "required": ["city"]})],
        on_tool_with_context=on_tool,   # or on_tool=fn(inv) when you don't need the broker
    ).run()

asyncio.run(main())

The handler's return value becomes the JSON-RPC result verbatim (non-JSON-serializable → -33403). Raising one of the ToolInvocationError subclasses maps to the matching -33401..-33405 code (with error.data.details / error.data.retry_after_ms when set); an uncaught generic exception maps to -33403; a tool.invoke with no handler registered replies -32601. (PyPI nexoai ≥ 0.4.0.)
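
The exception-to-error mapping can be pictured as follows. This sketch defines its own stand-in classes so it runs without the SDK. The per-class code assignment is an assumption inferred from the order in which the contract lists the failures; only the -33403 fallback for generic exceptions is stated on this page.

```python
class ToolInvocationError(Exception):
    code = -33403

    def __init__(self, message, details=None, retry_after_ms=None):
        super().__init__(message)
        self.details = details
        self.retry_after_ms = retry_after_ms

# Assumed code assignment (list order in the contract), not confirmed here.
class ToolNotFound(ToolInvocationError): code = -33401
class ToolArgumentInvalid(ToolInvocationError): code = -33402
class ToolExecutionFailed(ToolInvocationError): code = -33403
class ToolUnavailable(ToolInvocationError): code = -33404
class ToolDenied(ToolInvocationError): code = -33405

def error_reply(request_id, exc):
    """Build a JSON-RPC error reply for an exception raised by a tool handler."""
    data = {}
    if isinstance(exc, ToolInvocationError):
        code = exc.code
        if exc.details is not None:
            data["details"] = exc.details
        if exc.retry_after_ms is not None:
            data["retry_after_ms"] = exc.retry_after_ms
    else:
        code = -33403  # uncaught generic exception, per the text above
    error = {"code": code, "message": str(exc)}
    if data:
        error["data"] = data
    return {"jsonrpc": "2.0", "id": request_id, "error": error}
```

Keeping `data` absent when empty avoids sending `"data": {}` for plain failures.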

Tarball convention (`noarch`)

Operators install Python plugins via the same nexo plugin install <owner>/<repo>[@<tag>] CLI. The resolver in nexo-ext-installer falls back to noarch when no per-target tarball matches the daemon's host triple:

<id>-<version>-noarch.tar.gz
├── nexo-plugin.toml
├── bin/<id>           # bash launcher, mode 0755
└── lib/
    ├── plugin/main.py
    └── nexo_plugin_sdk/
        └── ...

The launcher (~5 LOC) reads:

#!/usr/bin/env bash
DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")/.." && pwd)"
exec env PYTHONPATH="$DIR/lib" python3 "$DIR/lib/plugin/main.py" "$@"

Pure-Python deps constraint

noarch requires that vendored deps work on every operator's CPU. Native extensions (*.so, *.pyd, *.dylib) invalidate the claim. The publish workflow's audit step runs scripts/verify-pure-python.sh post-vendor and rejects any tree containing those suffixes.
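
The audit amounts to a suffix scan over the vendored tree. The following Python sketch shows the idea; it is not the contents of scripts/verify-pure-python.sh, and the function name is hypothetical.

```python
from pathlib import Path

# Suffixes that indicate a compiled extension, as listed above.
NATIVE_SUFFIXES = {".so", ".pyd", ".dylib"}

def find_native_files(root):
    """Return every file under root whose suffix marks a native extension."""
    return sorted(
        p for p in Path(root).rglob("*")
        if p.is_file() and p.suffix in NATIVE_SUFFIXES
    )
```

A CI step would fail the build when the returned list is non-empty and print the offending paths.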

If your plugin needs a native dep, per-target Python tarballs (<id>-<version>-py312-x86_64-linux.tar.gz etc.) are tracked as Phase 31.4.b and not yet shipped.

CI publish workflow

The shipped workflow in extensions/template-plugin-python/.github/workflows/release.yml has the same 4-job shape as the Rust template (see Publishing a plugin) but:

Build matrix has a single noarch entry.
Build step uses actions/setup-python@v5 + pip install --target lib/ instead of cargo zigbuild.
Vendor audit step calls scripts/verify-pure-python.sh to enforce the pure-Python constraint.

Sign + release jobs are identical to the Rust template; cosign keyless OIDC ships .sig + .pem + .bundle per asset when the COSIGN_ENABLED repo variable is "true".

Operator install flow (no changes for Python)

nexo plugin install your-handle/your-plugin@v0.2.0

Identical pipeline to the Rust install path:

Resolve release JSON.
Try <id>-0.2.0-<host-triple>.tar.gz (miss for noarch plugins).
Fall back to <id>-0.2.0-noarch.tar.gz (Phase 31.4 addition).
Verify sha256.
Cosign verify per trusted_keys.toml (Phase 31.3).
Extract under <dest_root>/<id>-0.2.0/.
Daemon picks it up at next boot or hot-reload; spawns bin/<id> which exec's python3 lib/plugin/main.py.
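
The asset-selection and checksum steps above can be sketched as pure functions. This illustrates the documented order only, not the nexo-ext-installer code; the function names are hypothetical, and the sidecar layout (hex digest as the first whitespace-separated token) is an assumption.

```python
import hashlib

def candidate_assets(plugin_id, version, host_triple):
    """Asset names in the order the install steps try them."""
    return [
        f"{plugin_id}-{version}-{host_triple}.tar.gz",  # per-target build
        f"{plugin_id}-{version}-noarch.tar.gz",          # noarch fallback
    ]

def pick_asset(release_assets, plugin_id, version, host_triple):
    """Return the first candidate present in the release, else None."""
    available = set(release_assets)
    for name in candidate_assets(plugin_id, version, host_triple):
        if name in available:
            return name
    return None

def sha256_matches(tarball_bytes, sidecar_text):
    """Compare the tarball digest with the sidecar's first token."""
    parts = sidecar_text.split()
    if not parts:
        return False
    # Assumption: the sidecar starts with the hex digest (sha256sum style).
    return hashlib.sha256(tarball_bytes).hexdigest() == parts[0].lower()
```

For a pure-Python plugin the per-target lookup misses and `pick_asset` returns the noarch name, matching steps 2 and 3.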

Local smoke test

echo '{"jsonrpc":"2.0","id":1,"method":"initialize"}' \
    | python3 src/main.py

Should print one JSON-RPC response with your manifest + server_version.

End-to-end test for the pack pipeline:

python3 -m unittest extensions/template-plugin-python/tests/test_pack_tarball.py -v

SDK tests

In a clone of nexo-plugin-sdks:

cd python
PYTHONPATH=. python3 -m unittest discover -v tests/

21 tests: handshake (incl. unknown-method -32601), manifest validation (missing id, invalid TOML, id-regex violation), dispatch (incl. non-blocking reader proof + oversized frame rejected with continued dispatch), stdout guard (idempotent install, divert vs passthrough, handler-print diverted while the blessed frame stays clean), broker.publish back channel, lifecycle (double run() rejected, SIGTERM exits 0, SIGTERM drains an in-flight handler).

TypeScript plugin SDK

Author plugins in TypeScript (or plain JavaScript) that the daemon spawns as subprocesses, talking the same JSON-RPC 2.0 wire format used by the Rust SDK in crates/microapp-sdk/ and the Python / PHP SDKs.

Reference template: extensions/template-plugin-typescript/ (or run nexo plugin new --lang typescript). The SDK package lives in the nexo-plugin-sdks repo (typescript/ subdir) and ships on npm as nexo-plugin-sdk — npm install nexo-plugin-sdk.

Why subprocess + Node instead of an embedded runtime

Running TypeScript plugins as separate Node processes:

Keeps the daemon language-agnostic; one wire contract, four SDK languages (Rust, Python, TypeScript, PHP).
Isolates plugin failures (a runaway plugin cannot crash the daemon).
Sidesteps V8 embedding complexity.

Daemon-side spawn code in crates/core/src/agent/nexo_plugin_registry/subprocess.rs treats the plugin as an opaque executable; TypeScript plugins re-use it without modification.

Architecture summary

Operator host                         Plugin process
┌──────────────────┐    stdin   ┌──────────────────────────┐
│ daemon (Rust)    │──JSON-RPC──▶│ bin/<id> (bash launcher) │
│ subprocess host  │             │   exec node main.js      │
│                  │◀──JSON-RPC──│   PluginAdapter.run()    │
└──────────────────┘    stdout   └──────────────────────────┘

The bash launcher in bin/<id> sets NODE_PATH to the tarball's lib/node_modules and exec's node, so the plugin's deps resolve from the vendored lib/ tree only, with no interference from a global node_modules.

Public API

import {
  PluginAdapter,
  BrokerSender,
  Event,
  PluginError, ManifestError, WireError,
  installStdoutGuard, parseManifest,
  STDOUT_GUARD_MARKER,
} from "nexo-plugin-sdk";

PluginAdapter constructor options:

Option	Required	Description
`manifestToml: string`	✅	Body of `nexo-plugin.toml`. Read once at startup; the SDK validates `plugin.id` (regex `/^[a-z][a-z0-9_]{0,31}$/`), `plugin.version`, `plugin.name`, `plugin.description`.
`serverVersion?: string`	⬜	Returned in the `initialize` reply. Default `"0.1.0"`.
`onEvent?: EventHandler`	⬜	`async (topic, Event, BrokerSender) => Promise<void>`. Invoked for every `broker.event` notification. Handler runs in a detached task; the dispatch loop continues reading stdin without blocking.
`onShutdown?: ShutdownHandler`	⬜	`async () => Promise<void>`. Awaited before `{ok: true}` reply to the host's `shutdown` request. In-flight `onEvent` (and `tool.invoke`) tasks are also awaited before returning.
`tools?: ToolDef[]`	⬜	`{ name, description, inputSchema }[]` — the tool catalog advertised in the `initialize` reply's `tools` array (contract §4.1.1; serialized with the wire key `input_schema`). Every `name` must appear in the manifest's `[plugin.extends].tools` — otherwise the constructor throws `ManifestError`.
`onTool?: (inv) => unknown \| Promise<unknown>`	⬜	Dispatch handler for `tool.invoke` (contract §5.t). Runs as a detached task tracked by the shutdown drain. Set either this or `onToolWithContext`; if both are set, `onToolWithContext` wins.
`onToolWithContext?: (inv, ctx) => unknown \| Promise<unknown>`	⬜	Like `onTool`, but `ctx.broker` is the same `BrokerSender` `onEvent` gets — a tool body can `memoryRecall` / `llmComplete` mid-invocation. Wins over `onTool` when both are set.
`enableStdoutGuard?: boolean`	⬜ default `true`	Patches `process.stdout.write` so any stray `console.log` from your handler (or a chatty transitive dep) is diverted to stderr tagged with `STDOUT_GUARD_MARKER` instead of corrupting the JSON-RPC frame stream.
`maxFrameBytes?: number`	⬜ default 1 MiB	Reject inbound frames larger than this with a `WireError` log; dispatch continues.
`handleProcessSignals?: boolean`	⬜ default `true`	Listen for SIGTERM + SIGINT and trigger graceful shutdown (drain in-flight, exit 0).

Event is a value object with topic, source, payload, optional correlation_id + metadata. BrokerSender.publish(topic, event) serializes a JSON-RPC notification to stdout under a Promise-chain write lock so concurrent handler tasks never interleave half-written frames.
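
The plugin.id rule from the options table can be checked locally before a release. A minimal sketch in Python, using only the regex quoted in the table; the helper name is hypothetical and this is not the SDK's own validator:

```python
import re

# Regex quoted in the options table for `plugin.id`.
PLUGIN_ID_RE = re.compile(r"^[a-z][a-z0-9_]{0,31}$")


def is_valid_plugin_id(plugin_id: str) -> bool:
    """Hypothetical helper: True when the id satisfies the documented regex."""
    return PLUGIN_ID_RE.fullmatch(plugin_id) is not None


for candidate in ["my_plugin", "MyPlugin", "9lives", "a" * 33]:
    print(candidate[:12], is_valid_plugin_id(candidate))
```

The regex allows 1 to 32 characters: a lowercase letter followed by up to 31 lowercase letters, digits, or underscores.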

Tool dispatch (`tool.invoke`, contract §4.1.1 + §5.t)

import { PluginAdapter, ToolNotFoundError, ToolArgumentInvalidError, textResult } from "nexo-plugin-sdk";

const adapter = new PluginAdapter({
  manifestToml: readFileSync("nexo-plugin.toml", "utf-8"),
  tools: [{ name: "myplugin_weather", description: "Current weather for a city",
    inputSchema: { type: "object", properties: { city: { type: "string" } }, required: ["city"] } }],
  onToolWithContext: async (inv, ctx) => {
    if (inv.toolName !== "myplugin_weather") throw new ToolNotFoundError(inv.toolName);
    const city = (inv.args as { city?: string } | null)?.city;
    if (!city) throw new ToolArgumentInvalidError("missing `city`", { field: "city" });
    // ctx.broker is the onEvent broker handle — e.g. await ctx.broker.memoryRecall({ agentId: inv.agentId ?? "", query: city });
    return textResult(`Sunny in ${city}`);     // any JSON value is fine; this is the conventional shape
  },
  // or onTool: (inv) => ... when you don't need the broker
});
await adapter.run();

The handler's return value becomes the JSON-RPC result verbatim (non-serializable → -33403). Throwing ToolNotFoundError / ToolArgumentInvalidError (.details) / ToolExecutionFailedError / ToolUnavailableError (.retryAfterMs) / ToolDeniedError maps to the matching -33401..-33405 code; an uncaught throw maps to -33403; a tool.invoke with no handler registered replies -32601. (npm nexo-plugin-sdk ≥ 0.3.0.)

Tarball convention (`noarch`)

Operators install TypeScript plugins via the same nexo plugin install <owner>/<repo>[@<tag>] CLI. The resolver in nexo-ext-installer falls back to noarch when no per-target tarball matches the daemon's host triple (Phase 31.4):

<id>-<version>-noarch.tar.gz
├── nexo-plugin.toml
├── bin/<id>           # bash launcher, mode 0755
└── lib/
    ├── plugin/main.js   # compiled from src/main.ts via tsc
    └── node_modules/
        ├── nexo-plugin-sdk/dist/...
        └── ...           # pure-JS production deps

The launcher (~5 LOC) reads:

#!/usr/bin/env bash
set -euo pipefail
DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")/.." && pwd)"
exec env NODE_PATH="$DIR/lib/node_modules" node "$DIR/lib/plugin/main.js" "$@"

Pure-JS deps constraint

noarch requires that vendored deps work on every operator's CPU. Native node addons (*.node, *.so, *.dylib, *.dll) invalidate the claim. The publish workflow's audit step runs scripts/verify-pure-js.sh post-vendor and rejects any tree containing those suffixes.
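
To preview what the audit would flag before pushing a tag, the suffix check can be approximated locally. This is a sketch in Python that only mirrors the suffix list above; it is not scripts/verify-pure-js.sh, and the function name and demo tree are made up for illustration:

```python
import os
import tempfile

# Suffixes the text lists as native build artifacts.
NATIVE_SUFFIXES = (".node", ".so", ".dylib", ".dll")


def find_native_artifacts(root: str) -> list[str]:
    """Return relative paths under `root` whose names end in a native suffix."""
    hits = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if name.endswith(NATIVE_SUFFIXES):
                hits.append(os.path.relpath(os.path.join(dirpath, name), root))
    return sorted(hits)


# Demo against a throwaway tree: one pure-JS file, one native addon.
with tempfile.TemporaryDirectory() as tmp:
    os.makedirs(os.path.join(tmp, "left-pad"))
    os.makedirs(os.path.join(tmp, "fastlib", "build"))
    open(os.path.join(tmp, "left-pad", "index.js"), "w").close()
    open(os.path.join(tmp, "fastlib", "build", "binding.node"), "w").close()
    print(find_native_artifacts(tmp))
```

An empty result means the tree contains none of the listed suffixes. The sketch matches on file name only, so it says nothing about how the real script treats other cases such as versioned shared objects.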

If your plugin needs a native dep, per-target TypeScript tarballs (<id>-<version>-node20-x86_64-linux.tar.gz etc.) are tracked as Phase 31.5.b and not yet shipped.

Stdout guard — the robustness multiplier

Plugin authors invariably console.log("debug") at some point, or import a chatty dep (dotenv banners, transitive logging libs). Without protection, the very first non-JSON line on stdout corrupts the daemon's JSON-RPC parser mid-stream — no recovery path, the host disconnects.

The default-on stdout guard wraps process.stdout.write and:

Buffers writes until a newline arrives.
Tests each complete line with JSON.parse.
Forwards lines that parse to the real stdout.
Diverts lines that don't parse to stderr, tagged as [stdout-guard] <line>.

The blessed write path (BrokerSender and the SDK's own response helpers) always emits valid JSON so frames pass through unchanged. Operator log scraping picks up the [stdout-guard] marker so debug output stays visible without breaking the wire format.
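
The guard's line classification can be sketched outside Node. The Python below is an illustration of the buffering and divert logic described above, not the SDK's implementation; the LineGuard class is hypothetical, and json.loads stands in for JSON.parse:

```python
import io
import json

GUARD_MARKER = "[stdout-guard]"  # marker text quoted in the section above


class LineGuard:
    """Buffer writes, forward JSON lines to `out`, divert the rest to `err`."""

    def __init__(self, out, err):
        self._out = out
        self._err = err
        self._buf = ""

    def write(self, chunk: str) -> None:
        self._buf += chunk
        # Only complete lines are classified; a partial tail stays buffered.
        while "\n" in self._buf:
            line, self._buf = self._buf.split("\n", 1)
            if self._is_json(line):
                self._out.write(line + "\n")
            else:
                self._err.write(f"{GUARD_MARKER} {line}\n")

    @staticmethod
    def _is_json(line: str) -> bool:
        try:
            json.loads(line)
            return True
        except ValueError:
            return False


out, err = io.StringIO(), io.StringIO()
guard = LineGuard(out, err)
guard.write('{"jsonrpc":"2.0",')         # partial frame, nothing emitted yet
guard.write('"id":1,"result":{}}\n')     # completes a valid JSON line
guard.write("debug: handler started\n")  # stray log line
print(repr(out.getvalue()))
print(repr(err.getvalue()))
```

In this sketch a stray line that happens to be valid JSON (a bare number, for instance) would pass through, so the guard is a safety net rather than a substitute for logging to stderr.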

Set enableStdoutGuard: false only if you have another guard layer (e.g. process-level isolation). Leaving the guard on is the single strongest recommendation in the SDK.

CI publish workflow

The shipped workflow in extensions/template-plugin-typescript/.github/workflows/release.yml has the same 4-job shape as the Rust + Python templates but:

Build matrix has a single noarch entry.
Build step uses actions/setup-node@v4 + npm ci + npm run typecheck + npm run build (tsc to dist/).
Pre-vendor: npm prune --omit=dev strips dev deps so only runtime deps land in the tarball.
Vendor audit step calls scripts/verify-pure-js.sh .audit/lib/node_modules to enforce pure-JS.

Sign + release jobs are identical to the Rust + Python templates; cosign keyless OIDC ships .sig + .pem + .bundle per asset when the COSIGN_ENABLED repo variable is "true".

Operator install flow (no changes for TypeScript)

nexo plugin install your-handle/your-plugin@v0.2.0

Identical pipeline to the Rust + Python install paths:

Resolve release JSON.
Try <id>-0.2.0-<host-triple>.tar.gz (miss for noarch plugins).
Fall back to <id>-0.2.0-noarch.tar.gz (Phase 31.4 addition).
Verify sha256.
Cosign verify per trusted_keys.toml (Phase 31.3).
Extract under <dest_root>/<id>-0.2.0/.
Daemon picks it up at next boot or hot-reload; spawns bin/<id> which exec's node lib/plugin/main.js with NODE_PATH=lib/node_modules.
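
The per-target lookup and noarch fallback in the steps above amount to a two-candidate search. A sketch in Python, assuming the release's asset names are already available as a list; the real resolver lives in nexo-ext-installer, and the function name and LookupError (standing in for the TargetNotFound condition) are illustrative:

```python
def pick_tarball(plugin_id: str, version: str, host_triple: str, assets: list[str]) -> str:
    """Prefer the per-target tarball, fall back to noarch, else raise."""
    per_target = f"{plugin_id}-{version}-{host_triple}.tar.gz"
    noarch = f"{plugin_id}-{version}-noarch.tar.gz"
    for candidate in (per_target, noarch):
        if candidate in assets:
            return candidate
    raise LookupError(f"no tarball for {host_triple} and no noarch fallback")


release_assets = [
    "nexo-plugin.toml",
    "my_plugin-0.2.0-noarch.tar.gz",
    "my_plugin-0.2.0-noarch.tar.gz.sha256",
]
print(pick_tarball("my_plugin", "0.2.0", "x86_64-unknown-linux-gnu", release_assets))
```

For a TypeScript plugin the per-target candidate is always absent, so the noarch name is the one returned.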

Local smoke test

echo '{"jsonrpc":"2.0","id":1,"method":"initialize"}' \
    | node dist/main.js

Should print one JSON-RPC response with your manifest + server_version.

End-to-end test for the pack pipeline:

node --test tests/pack-tarball.test.mjs

SDK tests

In a clone of nexo-plugin-sdks:

cd typescript
npm install
npm run build
npm test

13 tests across handshake, manifest validation, dispatch, stdout-guard, wire, lifecycle. All run via stdlib node:test so there is zero install friction beyond the SDK's runtime dep on smol-toml.

PHP plugin SDK

Author plugins in PHP 8.1+ that the daemon spawns as subprocesses, talking the same JSON-RPC 2.0 wire format used by the Rust + Python + TypeScript SDKs.

Reference template: extensions/template-plugin-php/ (or run nexo plugin new --lang php). The SDK package lives in the nexo-plugin-sdks repo (php/ subdir, mirrored to nexo-plugin-sdk-php for Packagist) and ships on Packagist as nexo/plugin-sdk — composer require nexo/plugin-sdk.

Why PHP 8.1+

The SDK uses Fibers (introduced in PHP 8.1) to run each broker.event handler as a cooperative coroutine. Without Fibers the dispatch loop would block on slow handlers, breaking the "reader does not block on handler" invariant that the TS and Python SDKs also uphold.

Architecture summary

Operator host                          Plugin process
┌──────────────────┐    stdin    ┌────────────────────────────┐
│ daemon (Rust)    │──JSON-RPC──▶│ bin/<id> (bash launcher)   │
│ subprocess host  │             │   exec php main.php        │
│                  │◀──JSON-RPC──│   PluginAdapter::run()     │
└──────────────────┘    stdout   │   Fiber scheduler ticks    │
                                 │   between stdin polls      │
                                 └────────────────────────────┘

The bash launcher in bin/<id> runs:

exec env php -d display_errors=stderr -d log_errors=0 \
    "$DIR/lib/plugin/main.php" "$@"

-d display_errors=stderr is critical — without it, PHP's default behavior writes errors to stdout, which would corrupt the JSON-RPC frame stream.

Daemon-side spawn code in crates/core/src/agent/nexo_plugin_registry/subprocess.rs treats the plugin as an opaque executable; PHP plugins re-use it without modification.

Public API

use Nexo\Plugin\Sdk\PluginAdapter;     // async dispatch loop
use Nexo\Plugin\Sdk\BrokerSender;      // write-only broker handle
use Nexo\Plugin\Sdk\Event;             // value object
use Nexo\Plugin\Sdk\Manifest;          // standalone TOML parser
use Nexo\Plugin\Sdk\StdoutGuard;       // defensive guard
use Nexo\Plugin\Sdk\Wire;              // JSON-RPC frame helpers + MAX_FRAME_BYTES
use Nexo\Plugin\Sdk\PluginError;       // base exception
use Nexo\Plugin\Sdk\ManifestError;     // raised when manifest malformed
use Nexo\Plugin\Sdk\WireError;         // raised on malformed/oversized frames

PluginAdapter constructor options:

Option	Required	Description
`manifestToml: string`	✅	Body of `nexo-plugin.toml`. Read once at startup; the SDK validates `plugin.id` (regex `/^[a-z][a-z0-9_]{0,31}$/`), `plugin.version`, `plugin.name`, `plugin.description`.
`serverVersion?: string`	⬜	Returned in the `initialize` reply. Default `"0.1.0"`.
`onEvent?: callable(string, Event, BrokerSender): void`	⬜	Invoked for every `broker.event` notification. Runs in a Fiber so the dispatch loop continues.
`onShutdown?: callable(): void`	⬜	Awaited before `{ok: true}` reply to the host's `shutdown` request. In-flight Fibers (`onEvent` + `tool.invoke`) also drained first.
`tools?: ToolDef[]`	⬜	`new ToolDef($name, $description, $inputSchema)[]` — the tool catalog advertised in the `initialize` reply's `tools` array (contract §4.1.1; serialized with the wire key `input_schema`). Every `$name` must appear in the manifest's `[plugin.extends].tools` — otherwise the constructor throws `ManifestError`.
`onTool?: callable(ToolInvocation): mixed`	⬜	Dispatch handler for `tool.invoke` (contract §5.t). Runs in a Fiber tracked by the scheduler's drain set. Set either this or `onToolWithContext`; if both are set, `onToolWithContext` wins.
`onToolWithContext?: callable(ToolInvocation, ToolContext): mixed`	⬜	Like `onTool`, but `$ctx->broker` is the same `BrokerSender` `onEvent` gets — a tool body can `memoryRecall` / `llmComplete` mid-invocation. Wins over `onTool` when both are set.
`enableStdoutGuard?: bool`	⬜ default `true`	Installs an `ob_start` callback that diverts non-JSON `echo`/`print`/`printf`/`var_dump` output to stderr tagged with `[stdout-guard]`.
`maxFrameBytes?: int`	⬜ default `1048576`	Reject inbound frames larger than this with `WireError`; dispatch continues.
`handleProcessSignals?: bool`	⬜ default `true`	Listen for SIGTERM + SIGINT via `pcntl_async_signals` and trigger graceful shutdown (drain in-flight, exit 0).

Tool dispatch (`tool.invoke`, contract §4.1.1 + §5.t)

The tool classes live in src/Tool.php (loaded via the files autoload entry alongside src/Host.php):

use Nexo\Plugin\Sdk\{PluginAdapter, Tool, ToolDef, ToolInvocation, ToolContext,
    ToolNotFound, ToolArgumentInvalid, ToolExecutionFailed, ToolUnavailable, ToolDenied};

$adapter = new PluginAdapter([
    'manifestToml' => file_get_contents(__DIR__ . '/nexo-plugin.toml'),
    'tools' => [new ToolDef('myplugin_weather', 'Current weather for a city',
        ['type' => 'object', 'properties' => ['city' => ['type' => 'string']], 'required' => ['city']])],
    'onToolWithContext' => function (ToolInvocation $inv, ToolContext $ctx): mixed {
        if ($inv->toolName !== 'myplugin_weather') { throw new ToolNotFound($inv->toolName); }
        $city = $inv->args['city'] ?? null;
        if (!$city) { throw new ToolArgumentInvalid('missing `city`', ['field' => 'city']); }
        // $ctx->broker is the onEvent broker handle — e.g. $ctx->broker->memoryRecall(['agentId' => $inv->agentId ?? '', 'query' => $city]);
        return Tool::text("Sunny in {$city}");   // any JSON value is fine; this is the conventional shape
    },
    // or 'onTool' => fn(ToolInvocation $inv) => ... when you don't need the broker
]);
$adapter->run();

The handler's return value becomes the JSON-RPC result verbatim (non-encodable → -33403). Throwing ToolNotFound / ToolArgumentInvalid ($details) / ToolExecutionFailed / ToolUnavailable ($retryAfterMs) / ToolDenied maps to the matching -33401..-33405 code (the code is carried via parent::__construct($msg, $code) like RpcServerError — read it with getCode()); an uncaught \Throwable maps to -33403; a tool.invoke with no handler registered replies -32601. (Packagist nexo/plugin-sdk ≥ 0.3.0.)

Tarball convention (`noarch`)

Operators install PHP plugins via the same nexo plugin install <owner>/<repo>[@<tag>] CLI. The resolver in nexo-ext-installer falls back to noarch when no per-target tarball matches the daemon's host triple (Phase 31.4):

<id>-<version>-noarch.tar.gz
├── nexo-plugin.toml
├── bin/<id>           # bash launcher mode 0755
└── lib/
    ├── plugin/main.php
    └── vendor/        # composer install --no-dev output
        ├── autoload.php
        ├── nexo/plugin-sdk/...
        ├── yosymfony/toml/...
        └── composer/...

Composer integration

Templates consume the in-tree SDK via a path repository:

"repositories": [
  {
    "type": "path",
    "url": "../sdk-php",
    "options": { "symlink": false }
  }
]

symlink: false is critical — without it Composer creates a symlink in vendor/nexo/plugin-sdk/ pointing at the path repo. When the tarball is packed, that symlink would break on the operator host. With symlink: false Composer copies the SDK files physically — the tarball stays self-contained.

The publish workflow runs:

composer install --no-dev --optimize-autoloader --classmap-authoritative

This produces a deterministic, minimal vendor tree. The operator host does NOT need Composer installed: the vendor/autoload.php shipped in the tarball is plain PHP and works with just php-cli.

composer.lock is checked in for the template (reproducibility analogous to Cargo.lock for binary projects). The SDK itself omits the lockfile so consumers resolve fresh against their own constraints.

Pure-PHP deps constraint

noarch requires that vendored deps work on every operator's CPU. Native PHP extensions (*.so, *.dylib, *.dll) are normally loaded via php.ini from /usr/lib/php/<version>/, NOT vendored. If a Composer dep smuggles in a native build artifact under vendor/, the publish workflow's scripts/verify-pure-php.sh audit step rejects the tarball.

If your plugin needs a native dep, per-target tarballs are tracked as Phase 31.5.c.b and not yet shipped.

Stdout guard — what's guarded vs not

API	Behavior
`echo $x;`	✅ Guarded — non-JSON lines diverted to stderr.
`print $x;`	✅ Guarded.
`printf("%s", $x);`	✅ Guarded.
`var_dump($x);`	✅ Guarded.
`fwrite(STDOUT, $x);`	❌ NOT guarded — bypasses `ob_start`. The SDK's own `BrokerSender::publish()` uses this deliberately so blessed JSON frames always reach the host.

Plugin authors who need stdout output should use echo / print / printf — those are guarded. Calling fwrite(STDOUT, ...) directly from author code is undefined behavior; the operator's daemon will see the raw bytes and disconnect on parser failure.

CI publish workflow

The shipped workflow in extensions/template-plugin-php/.github/workflows/release.yml has the same 4-job shape as the Rust + Python + TS templates but:

Build matrix has a single noarch entry.
Build step uses shivammathur/setup-php@v2 with php-version: "8.3" + tools: composer:v2.
composer validate --strict gates the build.
composer install --no-dev --optimize-autoloader --classmap-authoritative produces the vendor tree.
Pack step calls scripts/pack-tarball-php.sh with SKIP_COMPOSER=1 (composer ran already).
Vendor audit step calls scripts/verify-pure-php.sh .audit/lib/vendor to enforce pure-PHP.

Sign + release jobs are identical to the other templates; cosign keyless OIDC ships .sig + .pem + .bundle per asset when the COSIGN_ENABLED repo variable is "true".

Operator install flow (no changes for PHP)

nexo plugin install your-handle/your-plugin@v0.2.0

Identical pipeline to the Rust + Python + TS install paths:

Resolve release JSON.
Try <id>-0.2.0-<host-triple>.tar.gz (miss for noarch plugins).
Fall back to <id>-0.2.0-noarch.tar.gz (Phase 31.4 addition).
Verify sha256.
Cosign verify per trusted_keys.toml (Phase 31.3).
Extract under <dest_root>/<id>-0.2.0/.
Daemon picks it up at next boot or hot-reload; spawns bin/<id> which exec's php lib/plugin/main.php.

Local smoke test

echo '{"jsonrpc":"2.0","id":1,"method":"initialize"}' \
    | php src/main.php

Should print one JSON-RPC response with your manifest + server_version.

End-to-end test for the pack pipeline:

php tests/test_pack_tarball.php

SDK tests

In a clone of nexo-plugin-sdks:

cd php
composer install
php tests/run-all.php

14 test cases across handshake, manifest validation, dispatch (incl. Fiber-based slow-handler proof + drain), stdout-guard, wire-format hardening, lifecycle, event round-trip. All run via plain PHP scripts using proc_open — zero PHPUnit / Pest dep, mirroring the TS SDK's node:test choice and the Python SDK's unittest choice.

Plugin author constraint: cooperative scheduling

The Fiber scheduler preserves the "reader does not block on handler" invariant only at SDK boundaries. If your handler calls a synchronous blocking I/O function:

$result = file_get_contents("https://example.com/slow");  // blocks

…the dispatch loop blocks for the duration of the call. Cooperative scheduling cannot interrupt blocking I/O. Two mitigations:

Keep handlers fast — typical channel plugins do work in <10ms.
Split long-running work into chunks and call Fiber::suspend() between them to yield. The SDK doesn't auto-suspend; that's an explicit author decision.

This matches the Python and TypeScript SDKs' contract — long blocking work is the author's responsibility to break up.
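
The effect of yielding between chunks can be shown with a language-neutral toy. The Python below uses generators as an analogy: `yield` plays the role of Fiber::suspend(), and the round-robin loop is a stand-in for a cooperative scheduler. It is not the SDK's scheduler, and every name in it is hypothetical:

```python
from collections import deque


def handler(name: str, steps: int, log: list[str]):
    """A cooperative handler: does one unit of work, then yields control."""
    for i in range(steps):
        log.append(f"{name}:{i}")
        yield  # plays the role of Fiber::suspend()


def run_round_robin(tasks) -> None:
    """Resume each task in turn until all have finished."""
    queue = deque(tasks)
    while queue:
        task = queue.popleft()
        try:
            next(task)
            queue.append(task)  # not done: back of the line
        except StopIteration:
            pass  # done: drop it


log: list[str] = []
run_round_robin([handler("slow", 3, log), handler("fast", 1, log)])
print(log)  # ['slow:0', 'fast:0', 'slow:1', 'slow:2']
```

The fast handler gets its turn after the slow handler's first chunk only because the slow handler yields. A handler that did all three steps in one blocking call would hold the loop until it finished.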

Publishing a plugin (CI workflow)

Phase 31.2. Operators install plugins via:

nexo plugin install <owner>/<repo>[@<tag>]

The CLI hits the GitHub Releases API of <owner>/<repo> and expects a fixed asset naming convention. This page documents the convention so plugin authors can publish releases that the operator-side install path consumes without translation.

The reference Rust plugin template extensions/template-plugin-rust/ ships a drop-in workflow plus helper scripts. Copy them to your own plugin repo and you are done.

Asset naming convention

For every release tag v<semver> (e.g. v0.2.0) the workflow uploads the following assets to the GitHub Release:

Asset	Required	Contents
`nexo-plugin.toml`	✅	The plugin manifest. Operator's CLI fetches first to learn `plugin.id`.
`<id>-<version>-<target>.tar.gz`	✅	One per supported target. Layout: `bin/<id>` + `nexo-plugin.toml` at the root, no top-level wrapping dir. Binary mode `0755` on Unix.
`<id>-<version>-<target>.tar.gz.sha256`	✅	Single line of lowercase hex (64 chars).
`<id>-<version>-<target>.tar.gz.sig`	⬜	Cosign keyless signature blob.
`<id>-<version>-<target>.tar.gz.pem`	⬜	Cosign certificate.
`<id>-<version>-<target>.tar.gz.bundle`	⬜	Cosign Sigstore bundle.

Targets follow Rust's standard target triple notation (x86_64-unknown-linux-gnu, aarch64-apple-darwin, etc.).
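
The naming rules in the table reduce to string templates plus one digest. A sketch in Python for checking names by hand; the helper names are hypothetical, and whether the sidecar file ends with a trailing newline is not specified here, so the sketch returns the bare hex string:

```python
import hashlib


def asset_names(plugin_id: str, version: str, target: str) -> dict[str, str]:
    """Build the tarball name and its sidecar names for one target."""
    tarball = f"{plugin_id}-{version}-{target}.tar.gz"
    return {
        "tarball": tarball,
        "sha256": f"{tarball}.sha256",
        "sig": f"{tarball}.sig",
        "pem": f"{tarball}.pem",
        "bundle": f"{tarball}.bundle",
    }


def sha256_hex(tarball_bytes: bytes) -> str:
    """Lowercase hex digest, 64 characters, as the table requires."""
    return hashlib.sha256(tarball_bytes).hexdigest()


names = asset_names("my_plugin", "0.2.0", "x86_64-unknown-linux-gnu")
print(names["tarball"])
print(names["sha256"])
print(len(sha256_hex(b"example tarball bytes")))
```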

Publish workflow shape

The shipped workflow has four jobs:

validate-tag — checks tag format ^v[0-9]+\.[0-9]+\.[0-9]+(-[a-zA-Z0-9.]+)?$, asserts the tag matches the version declared in nexo-plugin.toml. Hard fails on mismatch (no partial release).
build — matrix over targets. For each:
- cargo zigbuild --release --target <target> for linux musl entries (cross-compiled from ubuntu-latest).
- cargo build --release --target <target> for darwin entries (run on macos-latest).
- bash scripts/pack-tarball.sh <target> produces the tarball + sha256 sidecar following the convention above.
sign (optional, gated on repo variable COSIGN_ENABLED == "true") — keyless cosign signs each tarball using the workflow's OIDC token, producing .sig / .pem / .bundle per asset.
release — creates the GitHub Release if missing, uploads all artifacts including nexo-plugin.toml. Uses --clobber so re-runs of the same tag overwrite stale assets.
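
The validate-tag checks can be reproduced locally before pushing. A sketch in Python built on the tag pattern quoted above; the function name is hypothetical and the error strings only imitate the workflow's messages. One assumption is labeled in the code: the quoted pattern on its own accepts a leading zero such as v01.0.0, which the Troubleshooting list says is rejected, so the sketch adds a separate leading-zero check.

```python
import re

# Tag pattern quoted in the validate-tag job description.
TAG_RE = re.compile(r"^v[0-9]+\.[0-9]+\.[0-9]+(-[a-zA-Z0-9.]+)?$")


def validate_tag(tag: str, manifest_version: str) -> str:
    """Return the version carried by `tag`, or raise ValueError."""
    if TAG_RE.fullmatch(tag) is None:
        raise ValueError(f"tag '{tag}' does not match v<semver>")
    version = tag[1:]  # drop the leading "v"
    # Assumption: leading zeros are caught by a separate semver check,
    # because the quoted pattern alone would accept "v01.0.0".
    core = version.split("-", 1)[0]
    if any(len(part) > 1 and part.startswith("0") for part in core.split(".")):
        raise ValueError(f"tag '{tag}' does not match v<semver>")
    if version != manifest_version:
        raise ValueError(f"nexo-plugin.toml version {manifest_version} != tag {tag}")
    return version


for tag in ["v0.2.0", "v1.0.0-beta.3", "0.2.0", "v0.2", "v01.0.0"]:
    try:
        print(tag, "->", validate_tag(tag, tag.lstrip("v")))
    except ValueError as exc:
        print(tag, "rejected:", exc)
```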

Required permissions

permissions:
  contents: write   # gh release upload
  id-token: write   # cosign keyless OIDC

GITHUB_TOKEN is auto-provided. No additional secrets required for the unsigned path. Cosign keyless does not need any secret either — it uses Sigstore/Fulcio with the workflow's OIDC token.

Enabling cosign signing

gh variable set COSIGN_ENABLED --body true

After signing is enabled, every tag push produces signing material that operators with config/extensions/trusted_keys.toml (Phase 31.3) can verify against your GitHub identity.

Constraint: cargo bin name = plugin id

Cargo's [[bin]] name MUST equal nexo-plugin.toml [plugin] id. The convention is bin/<id> inside the tarball, and pack-tarball.sh looks for the binary at target/<target>/release/<id>. Mismatch fails the pack step (built binary missing at target/...).

Local validation

Before pushing a tag, dry-run the pack step:

cargo build --release --target x86_64-unknown-linux-gnu
bash scripts/pack-tarball.sh x86_64-unknown-linux-gnu
ls dist/
# my_plugin-0.2.0-x86_64-unknown-linux-gnu.tar.gz
# my_plugin-0.2.0-x86_64-unknown-linux-gnu.tar.gz.sha256

The Rust integration test tests/pack_tarball.rs covers this end to end against a synthetic binary; copy it when you fork the template to keep the convention regression-tested.

Troubleshooting

tag 'X' does not match v<semver> — the workflow rejects any tag that is not a v followed by a valid semver. Accepted: v0.2.0, v1.0.0-beta.3. Rejected: 0.2.0 (missing v), v0.2 (missing patch component), v01.0.0 (leading zero).
nexo-plugin.toml version <X> != tag <Y> — the workflow enforces that the tag and the manifest version match. Update one before retagging.
built binary missing at target/... — cargo produced a binary at a path other than what pack-tarball.sh expected. Check [[bin]] name in Cargo.toml matches [plugin] id in nexo-plugin.toml.
Operator hits TargetNotFound — your matrix did not build for the operator's target triple. Re-enable the matrix entry and re-run; operator can also pass --target to override.

Signing & publishing your plugin

Phase 31.9. End-to-end tutorial: take a freshly scaffolded plugin from nexo plugin new, ship it as a public GitHub release that operators can install signed, and confirm an operator with --require-signature accepts it.

This page is the how-to. For reference material:

Publishing a plugin — asset naming convention + workflow-job shape.
Plugin trust (trusted_keys.toml) — operator-side verification policy and troubleshooting.

Read this when

You finished a plugin and want to publish your first release.
You want operators on --require-signature to trust your releases via cosign keyless signing.
You want a concrete checklist before tagging v0.1.0.

Prerequisites

A GitHub repo containing the plugin scaffolded by nexo plugin new <id> --lang <lang>. Repo must use the shipped .github/workflows/release.yml from the matching extensions/template-plugin-<lang>/ template (the scaffolder copies it for you).
gh CLI authenticated against the repo (gh auth status).
git configured to push tags to origin.
(Optional, for signing) Nothing to install: cosign is not required on your host, because keyless cosign runs inside GitHub Actions using the workflow's OIDC token.

1. Publish your first release (unsigned)

The shortest path. Tag, push, watch CI.

# Pick a semver tag matching plugin.version in nexo-plugin.toml.
# The validate-tag job will reject any mismatch.
git tag v0.1.0
git push origin v0.1.0

The shipped workflow runs three jobs by default (validate-tag → build → release; sign is gated and stays inactive until you opt in):

gh run watch                # tail the latest run
gh release view v0.1.0      # confirm assets uploaded

Expected assets (the manifest once, plus one tarball and sha256 sidecar per <target>):

nexo-plugin.toml
my_plugin-0.1.0-x86_64-unknown-linux-gnu.tar.gz
my_plugin-0.1.0-x86_64-unknown-linux-gnu.tar.gz.sha256

Operators can already install at this point with default trust mode (warn):

nexo plugin install your-handle/my_plugin@v0.1.0

The CLI prints ! No signature in release; trust mode is 'warn' — proceeding unverified. and extracts the plugin.

2. Add cosign keyless signing

Cosign keyless does not need any secret on your end — it uses Sigstore + Fulcio with the GitHub Actions OIDC token. Enable it with one command:

gh variable set COSIGN_ENABLED --body true

Re-tag (or move the existing tag) and re-run the workflow:

git tag -d v0.1.0
git tag v0.1.0
git push --force origin v0.1.0

The sign job now runs and produces three extra assets per tarball:

my_plugin-0.1.0-x86_64-unknown-linux-gnu.tar.gz.sig
my_plugin-0.1.0-x86_64-unknown-linux-gnu.tar.gz.pem
my_plugin-0.1.0-x86_64-unknown-linux-gnu.tar.gz.bundle

The certificate's Subject Alternative Name (SAN) encodes the workflow URL plus the ref:

https://github.com/your-handle/my_plugin/.github/workflows/release.yml@refs/tags/v0.1.0

Operators with --require-signature will allowlist this SAN shape via a regex — that's what step 3 is about.

3. Operator-side trust setup

Operators who want to enforce signatures add an [[authors]] entry to <config_dir>/extensions/trusted_keys.toml:

schema_version = "1.0"
default = "warn"

[[authors]]
owner = "your-handle"
identity_regexp = "^https://github\\.com/your-handle/[^/]+/\\.github/workflows/release\\.yml@.*$"
oidc_issuer = "https://token.actions.githubusercontent.com"
mode = "require"

Notes for the operator (link this paragraph from your plugin's README):

owner matches the <owner> segment of nexo plugin install <owner>/<repo> invocations.
identity_regexp should be specific to your owner and loose on tag so it survives release-tag bumps. The example above accepts every repo under your-handle/ that ships release.yml from its default workflow path.
Anchored ^…$ is intentional — leaving anchors off makes the regex match substrings of unrelated SANs.

The full sample with comments lives at config/extensions/trusted_keys.toml.example in the nexo-rs repo.
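
The effect of the anchors can be tested against a sample SAN before editing trusted_keys.toml. A sketch using Python's re module; the engine nexo uses for identity_regexp is not stated here, so treat this as a portability check on the pattern. The spoofed SAN and helper name are made up for illustration:

```python
import re

# The TOML string "\\." decodes to the regex token `\.` (a literal dot).
IDENTITY_REGEXP = r"^https://github\.com/your-handle/[^/]+/\.github/workflows/release\.yml@.*$"


def san_is_trusted(san: str, pattern: str = IDENTITY_REGEXP) -> bool:
    """True when the certificate SAN matches the allowlist regex."""
    return re.search(pattern, san) is not None


good = "https://github.com/your-handle/my_plugin/.github/workflows/release.yml@refs/tags/v0.1.0"
spoof = "https://evil.example/?u=" + good  # trusted URL embedded in another SAN

print(san_is_trusted(good))   # True
print(san_is_trusted(spoof))  # False: the ^ anchor rejects the prefix

unanchored = IDENTITY_REGEXP.strip("^$")
print(san_is_trusted(spoof, unanchored))  # True: substring match slips through
```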

4. Verify the round trip

On a host with cosign installed, an operator runs:

nexo plugin install your-handle/my_plugin@v0.1.0 --require-signature

Expected human output:

→ Resolving your-handle/my_plugin@v0.1.0 (target: x86_64-unknown-linux-gnu)
✓ Found release v0.1.0 (x86_64-unknown-linux-gnu, 4.1 MB, sha256 ab12cd34ef56…)
→ Downloading
✓ sha256 verified
→ Verifying signature against trusted_keys.toml
✓ Signature verified (identity: https://github.com/your-handle/my_plugin/.github/workflows/release.yml@refs/tags/v0.1.0)
→ Extracting to /var/lib/nexo/plugins
✓ Plugin installed at /var/lib/nexo/plugins/my_plugin-0.1.0
✓ Lifecycle event emitted (broker)

JSON output (--json) carries the full report including signature_verified, signature_identity, signature_issuer, trust_mode, and trust_policy_matched:

nexo plugin install your-handle/my_plugin@v0.1.0 --require-signature --json

{
  "ok": true,
  "id": "my_plugin",
  "version": "0.1.0",
  "target": "x86_64-unknown-linux-gnu",
  "plugin_dir": "/var/lib/nexo/plugins/my_plugin-0.1.0",
  "binary_path": "/var/lib/nexo/plugins/my_plugin-0.1.0/bin/my_plugin",
  "sha256": "ab12cd34ef56...",
  "size_bytes": 4194304,
  "was_already_present": false,
  "lifecycle_event_emitted": true,
  "signature_verified": true,
  "signature_identity": "https://github.com/your-handle/my_plugin/.github/workflows/release.yml@refs/tags/v0.1.0",
  "signature_issuer": "https://token.actions.githubusercontent.com",
  "trust_mode": "require",
  "trust_policy_matched": "your-handle"
}
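
A deployment script can gate on this report instead of scraping the human output. A sketch in Python that reads only field names shown in the sample above; the function name and the exact gating policy are illustrative choices:

```python
import json


def signature_ok(report_json: str) -> bool:
    """True only for a successful install verified under trust mode `require`."""
    report = json.loads(report_json)
    return (
        report.get("ok") is True
        and report.get("signature_verified") is True
        and report.get("trust_mode") == "require"
    )


sample = json.dumps({
    "ok": True,
    "id": "my_plugin",
    "signature_verified": True,
    "trust_mode": "require",
    "trust_policy_matched": "your-handle",
})
print(signature_ok(sample))
```

In practice the input would be the stdout of the --json invocation shown above.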

5. Troubleshooting

Symptom	Cause	Fix
`CosignNotFound`	Operator host lacks `cosign` binary.	Install via `brew install cosign`, `apt install cosign`, or download from https://github.com/sigstore/cosign/releases.
`PolicyRequiresSig`	Trust mode is `require` but release has no `.sig` / `.pem`.	Re-run the workflow after `gh variable set COSIGN_ENABLED --body true`.
`CosignFailed`	Cert SAN does not match `identity_regexp`.	Compare the SAN reported in the error against the regex. Common cause: regex too tight on tag (`v0\.1\.0` instead of `.*`).
`Sha256Mismatch`	Tarball corrupted in transit or rebuilt out-of-band.	Re-tag and re-run; uploads are reproducible from the same commit.
`TargetNotFound`	Operator's host triple has no matching tarball.	Add the missing entry to the `build` matrix in `release.yml` and re-tag.

For full operator-side troubleshooting (cosign discovery fallbacks, identity_regexp examples, manual cosign verify-blob invocation), see Plugin trust.

Plugin supervisor (auto-respawn)

Subprocess plugins are isolated child processes. When one crashes, the daemon supervisor can either pause + log (default) or auto-respawn it with exponential backoff up to a bounded number of attempts. This page documents the manifest knobs that control that behaviour, the broker lifecycle events the supervisor publishes, and the edge cases operators should plan for.

Manifest knobs

[plugin.supervisor]
respawn = false              # opt-in. Default: false (Phase 81.21.b semantics)
max_attempts = 3             # cap on respawns before "gave_up". Default: 3
backoff_ms = 1000            # initial backoff; doubles per attempt, capped 60s. Default: 1000
stderr_tail_lines = 32       # ring buffer per running child for crash forensics. Default: 32

respawn is opt-in — community-tier plugins should not silently keep restarting if they're broken. Operators that trust their plugin (in-house adapters, well-tested community plugins) flip the toggle on; everything else stays paused-on-crash.

max_attempts is the hard ceiling. After that many consecutive respawn attempts the supervisor publishes gave_up and stops. The operator must restart the daemon (or fix the plugin + redeploy) to recover.

backoff_ms is the initial wait before the first retry. Each subsequent attempt doubles the wait, capped at 60 seconds. Example with backoff_ms = 1000:

Attempt	Wait
1	1s
2	2s
3	4s
4	8s
5	16s
6	32s
7+	60s (capped)
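
The table follows from one formula: the initial wait doubled per attempt and clamped at 60 seconds. A sketch in Python that reproduces it; the function name is hypothetical and the supervisor itself is Rust:

```python
def backoff_schedule(backoff_ms: int, attempts: int, cap_ms: int = 60_000) -> list[int]:
    """Wait before each attempt (1-indexed): doubles each time, capped at cap_ms."""
    return [min(backoff_ms * 2 ** (attempt - 1), cap_ms) for attempt in range(1, attempts + 1)]


print(backoff_schedule(1000, 7))
# [1000, 2000, 4000, 8000, 16000, 32000, 60000]
```

With the default max_attempts = 3 only the first three rows are ever reached; the cap matters once max_attempts is raised to 7 or more.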

stderr_tail_lines is the per-running-plugin ring buffer of recent stderr lines. On crash the supervisor drains it into the stderr_tail field of the lifecycle events for forensic context. Hard-capped at 512 by manifest validation.
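
A bounded ring buffer of this kind keeps the newest lines and silently drops the oldest. A sketch in Python with collections.deque; the helper is hypothetical, and rejecting out-of-range values (including the lower bound of 0) is an assumption about how validation behaves:

```python
from collections import deque


def make_stderr_tail(stderr_tail_lines: int = 32, hard_cap: int = 512) -> deque:
    """Bounded buffer that keeps only the most recent stderr lines."""
    # Assumption: values outside 0..hard_cap are rejected rather than clamped.
    if not 0 <= stderr_tail_lines <= hard_cap:
        raise ValueError(f"stderr_tail_lines must be between 0 and {hard_cap}")
    return deque(maxlen=stderr_tail_lines)


tail = make_stderr_tail(3)
for n in range(1, 6):
    tail.append(f"line {n}")
print(list(tail))  # ['line 3', 'line 4', 'line 5'] (oldest first)
```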

Lifecycle events (broker)

Every transition publishes a best-effort event on the daemon's broker (NATS-style topic). Subscribers can stream these into audit logs, dashboards, or alerts.

Topic	When	Payload
`plugin.lifecycle.<id>.crashed`	Child exit detected (non-zero)	`{plugin_id, exit_code, stderr_tail: Vec<String>}`
`plugin.lifecycle.<id>.respawning`	Before each backoff sleep	`{plugin_id, attempt: u32 (1-indexed), backoff_ms: u64}`
`plugin.lifecycle.<id>.respawned`	After successful re-handshake	`{plugin_id, attempt, total_uptime_ms}`
`plugin.lifecycle.<id>.gave_up`	After `attempts >= max_attempts`	`{plugin_id, attempts, last_exit_code, stderr_tail}`
`plugin.lifecycle.<id>.restarted_manually`	After `force_restart` completes	`{plugin_id, previous_uptime_ms: u64, restarted_at_ms: i64, new_pid?: u32}`

Every event carries source = "plugin.supervisor". stderr_tail is chronological (oldest first), capped at the manifest's stderr_tail_lines.

respawned.total_uptime_ms carries the previous Inner's uptime in milliseconds (Phase 90 audit fix — was always 0). Subscribers diffing crashed→respawned timestamps can now consume the field directly.

gave_up.last_exit_code = -1 (sentinel) indicates a spawn failure: the supervisor never reached the handshake. A real child exit code (e.g. 1, 127, 139) means the child started but crashed; the per-attempt stderr_tail carries forensics. Spawn-failure paths emit an empty stderr_tail because there was no process to read from.

restarted_manually is published only by operator-initiated nexo/admin/plugins/restart calls. Auto-respawn cycles emit crashed+respawning+respawned/gave_up instead. new_pid is Some when Tokio could read the freshly spawned child's PID (almost always the case); None for pathological spawns where Child::id() returned None.
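A subscriber usually needs two small pieces of logic: splitting the topic into plugin id and event name, and interpreting the `-1` sentinel on `gave_up`. The sketch below shows one way to do both. The names `parse_lifecycle_topic`, `classify_gave_up` and `GaveUpCause` are hypothetical, and the parser assumes plugin ids contain no dots.

```rust
/// Hypothetical interpretation of `gave_up.last_exit_code`.
#[derive(Debug, PartialEq)]
pub enum GaveUpCause {
    /// -1 sentinel: the supervisor never reached the handshake.
    SpawnFailure,
    /// The child started and exited with this code.
    ChildExited(i32),
}

/// Split `plugin.lifecycle.<id>.<event>` into `(id, event)`.
/// Assumes the plugin id itself contains no '.' characters.
pub fn parse_lifecycle_topic(topic: &str) -> Option<(&str, &str)> {
    let rest = topic.strip_prefix("plugin.lifecycle.")?;
    let (plugin_id, event) = rest.rsplit_once('.')?;
    if plugin_id.is_empty() || event.is_empty() {
        return None;
    }
    Some((plugin_id, event))
}

pub fn classify_gave_up(last_exit_code: i32) -> GaveUpCause {
    if last_exit_code == -1 {
        GaveUpCause::SpawnFailure
    } else {
        GaveUpCause::ChildExited(last_exit_code)
    }
}
```

An alerting consumer could page on `SpawnFailure` (likely a bad entrypoint or missing binary) and attach `stderr_tail` only for `ChildExited`, since spawn failures carry an empty tail.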

Auto-respawn flow

Initial init() — spawn_one_attempt + handshake
                  │
                  ▼
            (child running)
                  │ ───── NormalExit (clean shutdown) ──── return
                  │
                  ▼ Crashed
       publish "crashed" event
                  │
                  │  ┌── respawn=false ──── return (Phase 81.21.b semantics)
                  │  │
                  │  ▼ respawn=true
       maybe reset attempt counter (heuristic)
                  │
                  │  ┌── attempt >= max_attempts ──── publish "gave_up" + return
                  │  │
                  │  ▼
       publish "respawning {attempt+1, backoff_ms}"
                  │
       sleep next_backoff(attempt) (or shutdown short-circuit)
                  │
       drain pending oneshots with "plugin restarted; retry"
                  │
       spawn_one_attempt + handshake
                  │
                  │  ┌── Err ──── attempt += 1; loop continues
                  │  │
                  │  ▼ Ok
       check shutdown_signaled (kill child if shutdown fired race)
                  │
       install new Inner; publish "respawned"
                  │
                  ▼
       attempt += 1; loop continues

Reset attempt counter heuristic

If the most recent child survived ≥ backoff_ms × max_attempts × 2 milliseconds after a respawn, the supervisor treats the next crash as a transient blip rather than a continuation of a respawn loop, and the attempt counter resets to 0. This permits recovery from network blips / OAuth token refreshes / occasional segfaults without masking real crash loops.

The window is hard-capped at 10 × 60s = 600s so an over-tuned manifest can't disable the heuristic entirely.

The window is not an operator knob; it is derived from backoff_ms and max_attempts. Operators who want a longer window bump backoff_ms (which also slows down respawns); that trade-off is intentional. A future follow-up may expose restart_window_secs as an explicit field if real-world demand emerges.
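The window arithmetic is small enough to check by hand. The following sketch uses hypothetical function names (`reset_window_ms`, `should_reset_attempts`); only the formula and the 600-second cap are taken from this section.

```rust
/// Reset window in milliseconds: backoff_ms × max_attempts × 2, capped at 600 s.
/// Hypothetical helper illustrating the documented formula.
pub fn reset_window_ms(backoff_ms: u64, max_attempts: u32) -> u64 {
    const CAP_MS: u64 = 600_000;
    backoff_ms
        .saturating_mul(u64::from(max_attempts))
        .saturating_mul(2)
        .min(CAP_MS)
}

/// True when the previous child lived long enough for the next crash
/// to count as a fresh incident rather than part of a crash loop.
pub fn should_reset_attempts(previous_uptime_ms: u64, backoff_ms: u64, max_attempts: u32) -> bool {
    previous_uptime_ms >= reset_window_ms(backoff_ms, max_attempts)
}
```

With the default manifest values (`backoff_ms = 1000`, `max_attempts = 3`) the window is 6000 ms, so a child that stays up for six seconds after a respawn gets a fresh attempt budget on its next crash.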

Shutdown semantics

shutdown() flips a per-plugin atomic flag and notifies the supervisor immediately. A supervisor parked in backoff sleep wakes within milliseconds (no waiting up to 60s for the natural deadline).
A shutdown that races a respawn handshake will kill the just-spawned child if shutdown fires between spawn_one_attempt returning Ok and the new Inner installation. No orphaned processes.
The daemon-wide ctx_shutdown cancellation token is also observed. Either signal makes the supervisor return cleanly.
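The key property is that the backoff sleep is a wait on a shutdown signal with a deadline, not a plain sleep. The daemon is Tokio-based, so the sketch below is only a std-library stand-in for the idea: it swaps in `std::sync::mpsc` and `recv_timeout`, and the names `backoff_sleep` and `SleepOutcome` are hypothetical.

```rust
use std::sync::mpsc::{Receiver, RecvTimeoutError};
use std::time::Duration;

#[derive(Debug, PartialEq)]
pub enum SleepOutcome {
    /// The full backoff elapsed; go ahead and respawn.
    Elapsed,
    /// Shutdown fired (or the sender vanished); stop the loop.
    Shutdown,
}

/// Sleep for `wait`, but wake immediately if a shutdown message arrives.
pub fn backoff_sleep(shutdown_rx: &Receiver<()>, wait: Duration) -> SleepOutcome {
    match shutdown_rx.recv_timeout(wait) {
        Ok(()) => SleepOutcome::Shutdown,
        // Treat a dropped sender as shutdown too, so the loop cannot hang.
        Err(RecvTimeoutError::Disconnected) => SleepOutcome::Shutdown,
        Err(RecvTimeoutError::Timeout) => SleepOutcome::Elapsed,
    }
}
```

A supervisor built this way never has to wait out a 60-second backoff before noticing that the daemon is stopping.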

Manual restart

Operators can force-restart any subprocess plugin from the admin UI without restarting the daemon. Useful after a gave_up event (auto-respawn loop exhausted) or to apply config changes that only take effect at boot.

Topic	Capability	Behaviour
`nexo/admin/plugins/restart { plugin_id }`	`plugin_restart`	Force-kill + fresh spawn + new respawn_loop

The restart is distinct from auto-respawn:

Publishes plugin.lifecycle.<id>.restarted_manually (NOT crashed+respawned) — operator dashboards can distinguish intentional restarts from crash recovery.
Capability plugin_restart is separate from plugin_doctor (read-only). Security review can grant write+destructive separately from read access.
Bypasses respawn=false — even with auto-respawn disabled, the manual restart spawns a fresh child + respawn_loop. After manual restart, the new respawn_loop respects the manifest's respawn setting again.

Flow

operator clicks "Restart" in plugin admin UI
  ↓
RPC nexo/admin/plugins/restart { plugin_id }
  ↓
LivePluginRestarter.restart() — lookup + downcast + force_restart()
  ↓
SubprocessNexoPlugin::force_restart()
  ├─ capture previous_uptime_ms (Inner.spawned_at.elapsed())
  ├─ drain pending oneshots with "plugin restarted by operator"
  ├─ cancel.cancel() (cascade tears down writer/reader/forwarders/supervisor)
  ├─ wait up to 2s for supervisor task to drain
  ├─ force-kill child if still alive
  ├─ tokio::time::timeout(60s, spawn_one_attempt(...))
  ├─ capture new_pid from child.id()
  ├─ install new Inner
  ├─ spawn fresh respawn_loop
  ├─ publish "restarted_manually" event
  └─ return PluginsRestartResponse { plugin_id, previous_uptime_ms,
                                     restarted_at_ms, new_pid }

Errors

Error	Maps to	Operator action
`plugin {id} not found`	`InvalidParams`	Refresh admin UI; plugin removed from manifest
`plugin {id} is in-tree`	`InvalidParams`	Use daemon restart for in-tree plugins
`restart timed out` (60s)	`Internal`	Plugin in degraded state; inspect logs + fix manifest
`plugin handles not yet populated; daemon still booting`	`Internal`	Retry after 1-2s; daemon finishing `wire_plugin_registry`

Limitations

Subprocess plugins only — in-tree plugins (assistant, dispatch-tools) cannot be hot-restarted. Operator restarts the daemon.
Manifest unchanged — force_restart uses the cached manifest; operator-edited manifest.entrypoint.command won't take effect until daemon restart. Manifest hot-reload is a deferred follow-up.
No coalesce — concurrent restart calls (two operators clicking simultaneously) execute sequentially via self.inner.lock(). Functional but with funny intermediate state for ~1s. Add explicit coalesce only if abuse seen.
No restart cooldown / rate-limiting — capability gate is the gate. Add cooldown only if abuse seen.

Limitations + open follow-ups

No Prometheus counter — nexo_plugin_respawn_total{plugin_id, outcome} pending the general metrics pipeline.
No multi-recipient encrypt for stderr_tail — captured plaintext only. A plugin that prints secrets to stderr will leak them via lifecycle events.
Per-attempt timeout is the same NEXO_PLUGIN_INIT_TIMEOUT_MS used by the initial spawn. A respawn handshake that hangs beyond the timeout counts as a failed attempt.

Operator checklist

Decide respawn per-plugin. Default false is safer; flip on for plugins you trust.
Tune backoff_ms to your plugin's recovery characteristics. OAuth refresh blips: 1-5s. Network outages: 5-30s. Heavy boot plugins: 5s+ to avoid wasting CPU on tight retry loops.
Subscribe to plugin.lifecycle.> from a downstream system (audit log, alerting). The gave_up topic is the operator's clearest signal that human action is needed.
Read stderr_tail on crashed events for a quick crash triage before tailing log files manually.

Web Search plugin

Multi-provider web search (Brave / Tavily / DuckDuckGo / Perplexity) for Nexo agents. Subprocess binary; daemon discovers + spawns via [plugin.entrypoint].

Phase 95 — extracted from crates/web-search/ to standalone subprocess plugin nexo-rs-plugin-web-search v0.1.0. Daemon's web_search_router field on AgentContext / AgentRuntime removed (nexo-core 0.2.0 breaking).

Install

cargo install nexo-plugin-web-search

The binary lands at $HOME/.cargo/bin/nexo-plugin-web-search. Discovery walker probes it with --print-manifest and auto-registers.

Operator config

<config_dir>/plugins/web-search.yaml:

instances:
  - id: default                     # required, unique
    # agent_id omitted → shared across all agents
    providers:
      brave:
        api_key_path: ./secrets/brave_api_key.txt
        timeout_ms:   8000
      tavily:
        api_key_path: ./secrets/tavily_api_key.txt
        timeout_ms:   10000
      duckduckgo:
        timeout_ms:   12000          # no API key required
    cache:
      enabled: true
      path:    ./data/web_search_cache.db
      ttl_secs: 3600
    default_order: [brave, tavily, duckduckgo]

Multi-instance × multi-agent

Power-users with several agents each wanting their own search profile declare multiple instances: entries. Optional agent_id per instance scopes it to that single agent:

instances:
  - id: default                     # shared baseline
    providers: { duckduckgo: {} }
    default_order: [duckduckgo]
  - id: research                    # private for ana
    agent_id: ana
    providers:
      perplexity:
        api_key_path: ./secrets/ana_perplexity.txt
    cache: { path: ./data/ana_research.db }
    default_order: [perplexity]
  - id: news                        # another private for ana
    agent_id: ana
    providers:
      brave:
        api_key_path: ./secrets/ana_brave.txt
    cache: { enabled: false }
    default_order: [brave]

Resolution per agent's web_search call:

args.instance if operator-supplied.
Agent's first private instance from by_agent map.
First shared instance (no agent_id).
Error if none.
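The four resolution steps can be sketched as a single lookup function. This is an illustration under stated assumptions, not the plugin's source: the `Instance` struct and `resolve_instance` are hypothetical, "first" is taken to mean declaration order, and the sketch does not model whether an explicit `args.instance` may name another agent's private profile.

```rust
/// Hypothetical, trimmed-down view of one `instances:` entry.
#[derive(Debug, Clone, PartialEq)]
pub struct Instance {
    pub id: String,
    pub agent_id: Option<String>,
}

pub fn resolve_instance<'a>(
    instances: &'a [Instance],
    agent_id: &str,
    requested: Option<&str>,
) -> Result<&'a Instance, String> {
    // 1. An explicit args.instance wins.
    if let Some(id) = requested {
        return instances
            .iter()
            .find(|i| i.id == id)
            .ok_or_else(|| format!("unknown web_search instance: {}", id));
    }
    // 2. Otherwise the agent's first private instance.
    if let Some(private) = instances
        .iter()
        .find(|i| i.agent_id.as_deref() == Some(agent_id))
    {
        return Ok(private);
    }
    // 3. Otherwise the first shared instance; 4. error if there is none.
    instances
        .iter()
        .find(|i| i.agent_id.is_none())
        .ok_or_else(|| format!("no web_search instance for agent {}", agent_id))
}
```

Applied to the three-instance example above, `ana` resolves to `research` by default and can still reach `news` by passing `instance`, while any other agent falls through to the shared `default`.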

Tool surface

web_search arguments:

Field	Required	Description
`query`	yes	Search query string.
`count`	no	1-10; defaults from per-binding policy.
`instance`	no	Search profile id. Absent → agent's default.
`provider`	no	Provider override: `brave`/`tavily`/`duckduckgo`/`perplexity`.
`freshness`	no	Time window: `day`/`week`/`month`/`year`.
`country`	no	ISO-3166 alpha-2.
`language`	no	ISO-639-1.
`expand`	no	v0.1.0 no-op; v0.2.0 follow-up.

Per-binding policy fields (agents.yaml::inbound_bindings[].web_search):

Field	Default	Effect
`enabled`	`false`	Gate. False blocks all `web_search` calls on this binding (returns Denied).
`provider`	`"auto"`	Default provider override. `args.provider` wins.
`default_count`	`5`	Default `count` when LLM omits it.
`cache_ttl_secs`	`600`	Per-router cache TTL hint.
`expand_default`	`false`	Default `expand` arg.
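The interaction between the per-binding policy and the call arguments can be sketched as follows. The types and the function `apply_policy` are hypothetical. Clamping an out-of-range `count` into 1-10 is an assumption made for the sketch; the real plugin may reject such calls instead.

```rust
/// Hypothetical subset of the per-binding `web_search` policy.
#[derive(Debug, Clone)]
pub struct WebSearchPolicy {
    pub enabled: bool,
    pub provider: String,
    pub default_count: u8,
}

impl Default for WebSearchPolicy {
    fn default() -> Self {
        // Defaults copied from the policy table.
        WebSearchPolicy { enabled: false, provider: "auto".to_string(), default_count: 5 }
    }
}

#[derive(Debug, PartialEq)]
pub enum Decision {
    Denied,
    Allowed { provider: String, count: u8 },
}

pub fn apply_policy(
    policy: &WebSearchPolicy,
    arg_provider: Option<&str>,
    arg_count: Option<u8>,
) -> Decision {
    // The `enabled` gate blocks every call on this binding.
    if !policy.enabled {
        return Decision::Denied;
    }
    // args.provider wins over the policy default.
    let provider = arg_provider.unwrap_or(policy.provider.as_str()).to_string();
    // Assumption: out-of-range counts are clamped to the documented 1-10.
    let count = arg_count.unwrap_or(policy.default_count).clamp(1, 10);
    Decision::Allowed { provider, count }
}
```

Because `enabled` defaults to `false`, a binding with no `web_search` block denies every call until the operator opts in.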

Admin RPCs

Method	Params	Reply
`nexo/admin/web_search/bot_info`	`{}`	plugin metadata + instance counts
`nexo/admin/web_search/cache_stats`	`{instance?}`	per-instance status
`nexo/admin/web_search/cache_clear`	`{instance?}`	placeholder (v0.2.0)
`nexo/admin/web_search/provider_status`	`{}`	per-instance configured providers
`nexo/admin/web_search/list_instances`	`{}`	full instances + by_agent + shared map

Metrics

Metrics are exposed in Prometheus exposition format through a broker scrape on plugin.web_search.metrics.scrape. The daemon's /metrics aggregator appends the scraped output.

Source

github.com/lordmacu/nexo-rs-plugin-web-search — crates.io: nexo-plugin-web-search 0.1.0.

Installing personas — `nexo persona install`

A persona pack bundles an out-of-tree agent definition (system prompt + plugin bindings + workspace seed + secrets templates) that operators install into their nexo daemon. Distinct from a plugin (plugins register CODE; personas register CONFIG that consumes that code). Authored as a v2 manifest pack and published as a GitHub Release; the daemon resolves + downloads + verifies + extracts under the operator's configured search path.

v1 vs v2. The legacy install.sh-driven flow (v1 manifest) stays supported for airgapped hosts + CI. nexo persona install only consumes v2 manifests (manifest_version = 2); a v1 pack errors with a clear migration hint pointing at install.sh.

Quickstart

# Install the latest release of a persona from GitHub:
nexo persona install lordmacu/nexo-persona-cody

# Pin to a specific release tag:
nexo persona install lordmacu/nexo-persona-cody@v0.2.0

# JSON output for CI:
nexo persona install lordmacu/nexo-persona-cody --json

# List every installed persona:
nexo persona list

# Remove (with confirmation gate):
nexo persona remove cody             # prints what WOULD be removed
nexo persona remove cody --yes       # actually removes

Subcommands

Command	Purpose
`nexo persona install <owner>/<repo>[@<tag>]`	Resolve + verify + extract a v2 persona pack.
`nexo persona list`	Walk every configured search path, render every installed persona.
`nexo persona remove <id> [--yes]`	Atomic removal of the install dir for `<id>`.
`nexo persona get <id>`	Print the full manifest + computed contributes paths for `<id>`.
`nexo persona upgrade <id>`	Re-resolve the installed persona's source repo at `latest` + install if newer. Refuses to downgrade.
`nexo persona run <path>`	Inner-loop dev: validate a local persona pack + boot the daemon with its parent dir prepended to `personas.discovery.search_paths`. Mirror of `nexo plugin run`.
`nexo persona help`	Print the help text inline.
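The `<owner>/<repo>[@<tag>]` coordinate accepted by `install` splits into three parts. The parser below is a sketch of that shape only; `PersonaCoords` and `parse_coords` are hypothetical names, and the real CLI may accept or reject edge cases differently.

```rust
/// Hypothetical parsed form of `<owner>/<repo>[@<tag>]`.
#[derive(Debug, PartialEq)]
pub struct PersonaCoords {
    pub owner: String,
    pub repo: String,
    pub tag: Option<String>,
}

pub fn parse_coords(input: &str) -> Result<PersonaCoords, String> {
    // Split off the optional "@<tag>" suffix first.
    let (path, tag) = match input.split_once('@') {
        Some((_, t)) if t.is_empty() => return Err("empty tag after '@'".to_string()),
        Some((p, t)) => (p, Some(t.to_string())),
        None => (input, None),
    };
    // What remains must be exactly "<owner>/<repo>".
    let (owner, repo) = path
        .split_once('/')
        .ok_or_else(|| format!("expected <owner>/<repo>, got {:?}", path))?;
    if owner.is_empty() || repo.is_empty() || repo.contains('/') {
        return Err(format!("expected <owner>/<repo>, got {:?}", path));
    }
    Ok(PersonaCoords { owner: owner.to_string(), repo: repo.to_string(), tag })
}
```

A missing tag maps to `None`, which corresponds to installing the latest release as in the first Quickstart command.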

Flags

`install`

Flag	Default	Effect
`--dest <dir>`	`cfg.personas.discovery.search_paths[0]` (or `<state_dir>/personas/`)	Override the install root. Must be absolute.
`--target <triple>`	Daemon's host triple (`NEXO_INSTALL_TARGET` env wins)	Asset-matching target. Persona packs typically publish `noarch` only; the resolver falls back automatically.
`--json`	off	Emit a JSON envelope instead of human-readable lines. CI-friendly.

`list`

Flag	Effect
`--json`	JSON array under `{ "personas": [...] }`.

`remove`

Flag	Effect
`--yes`	Required — without it the command prints what it WOULD remove and exits 0.
`--json`	Same JSON envelope as `install`.

`get`

Prints id / version / description / homepage / install_root + every contributes.agent_configs and contributes.plugin_configs_partial path resolved to absolute. JSON variant returns the full manifest sections (requires, meta) too — CI can grep specific fields without re-parsing the on-disk TOML.

Flag	Effect
`--json`	Emit the typed manifest payload as JSON instead of human lines.

`upgrade`

Inspects cfg.personas.discovery.search_paths, finds the installed persona by id, extracts its source GitHub repo from manifest.persona.homepage, hits the GitHub Releases API at /releases/latest, and re-runs the install pipeline if the resolved version is strictly newer than the on-disk one. Refuses to downgrade (use nexo persona install <coords>@<tag> to pin if intentional).

Flag	Effect
`--json`	Same JSON envelope as `install`.
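The "strictly newer, never downgrade" rule reduces to a version comparison. The sketch below handles only plain `MAJOR.MINOR.PATCH` versions with an optional leading `v`; pre-release and build metadata from the full semver grammar are deliberately left out, and the names `parse_version` and `should_upgrade` are hypothetical.

```rust
/// Parse "1.2.3" or "v1.2.3" into a comparable tuple.
/// Simplified: no pre-release or build metadata support.
pub fn parse_version(tag: &str) -> Option<(u64, u64, u64)> {
    let core = tag.strip_prefix('v').unwrap_or(tag);
    let mut parts = core.split('.');
    let major = parts.next()?.parse::<u64>().ok()?;
    let minor = parts.next()?.parse::<u64>().ok()?;
    let patch = parts.next()?.parse::<u64>().ok()?;
    // Reject trailing components such as "1.2.3.4".
    if parts.next().is_some() {
        return None;
    }
    Some((major, minor, patch))
}

/// `Some(true)` only when `latest` is strictly newer than `installed`;
/// `None` when either side does not parse.
pub fn should_upgrade(installed: &str, latest: &str) -> Option<bool> {
    let installed = parse_version(installed)?;
    let latest = parse_version(latest)?;
    // Tuples compare lexicographically: major, then minor, then patch.
    Some(latest > installed)
}
```

Equal versions and older releases both return `Some(false)`, which matches the documented refusal to reinstall or downgrade.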

`run`

Inner-loop dev — point the daemon at a local persona pack without going through the install + verify pipeline. Validates the path's persona.toml, prepends the pack's parent dir to cfg.personas.discovery.search_paths (so the boot-time F5 discovery picks it up as <parent>/<id>-<version>/), then falls through to the daemon boot path.

# Develop a persona locally:
mkdir -p /tmp/dev/cody-0.99.0
$EDITOR /tmp/dev/cody-0.99.0/persona.toml
nexo persona run /tmp/dev/cody-0.99.0

Flag	Effect
`--json`	Emit the override payload as JSON before daemon boot starts streaming logs.

Configuration — `personas/discovery.yaml`

Lives at <config_dir>/personas/discovery.yaml. Optional — absent file means no scan happens (the daemon boots with an empty persona catalog).

discovery:
  search_paths:
    - /var/lib/nexo/personas        # default for system installs
    - /home/operator/.nexo/personas # default for user installs
  disabled: []                      # ids skipped even when found
  allowlist: []                     # empty = accept any; non-empty = whitelist

The CLI consumes the same config: nexo persona list walks search_paths and applies the disabled / allowlist filters.

Layout on disk

After a successful install, the pack lives under:

<install_root>/
  <id>-<version>/
    persona.toml
    agents.d/
      <agent>.yaml
    plugins/
      <plugin>.partial.yaml
    secrets/
      <secret>.txt.template
    data/
      workspace/...

The <id>-<version> shape mirrors the plugin install layout (Phase 31.1.b) so operators familiar with one immediately read the other. Re-installing the same id+version short-circuits via the idempotency check (no re-download, returns was_already_present: true).

Boot-time discovery

When the daemon starts, after plugins.start_all it walks cfg.personas.discovery.search_paths, parses + validates every <id>-<version>/persona.toml, applies the disabled / allowlist filters, and registers each survivor in an in-memory persona catalog. Discovery is best-effort: malformed / unparseable packs are logged at WARN and skipped rather than aborting boot.

Kill switch — `NEXO_DISABLE_BUNDLED_PERSONAS`

Set to 1 / true / on to skip discovery entirely, regardless of cfg.personas.discovery.search_paths. The daemon's in-memory catalog stays empty; the CLI still works against the on-disk dirs (it re-runs discovery itself).

export NEXO_DISABLE_BUNDLED_PERSONAS=1
nexo daemon

Useful for hardened deployments that want to refuse all out-of-tree persona packs at the daemon level even when the search paths config still references dirs. Surfaces in nexo doctor capabilities as a Medium risk toggle (Phase F7 of cody-cli-install).
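Reading the switch amounts to checking the variable against the three accepted values. In the sketch below both function names are hypothetical, and exact, case-sensitive matching is an assumption; the daemon may be more lenient about case or whitespace.

```rust
/// True for the documented truthy values "1", "true" and "on".
/// Assumption: matching is exact and case-sensitive.
pub fn is_kill_switch_on(value: Option<&str>) -> bool {
    match value {
        Some(v) => matches!(v, "1" | "true" | "on"),
        None => false,
    }
}

/// Reads NEXO_DISABLE_BUNDLED_PERSONAS from the process environment.
pub fn bundled_personas_disabled() -> bool {
    let raw = std::env::var("NEXO_DISABLE_BUNDLED_PERSONAS").ok();
    is_kill_switch_on(raw.as_deref())
}
```

Keeping the value check separate from the environment read makes the truthiness rule easy to unit-test without mutating process state.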

Wire shape — release JSON conventions

A v2 persona release on GitHub must publish these assets at the release tag:

Asset	Required	Purpose
`persona.toml`	yes	The v2 manifest.
`<id>-<version>-<target>.tar.gz` OR `<id>-<version>-noarch.tar.gz`	yes (one of)	The pack tarball. `noarch` is the fallback when no per-target asset exists.
`<tarball>.sha256`	yes	Single line of lowercase hex (64 chars).
`<tarball>.sig` + `<tarball>.cert`	optional	Cosign material — when both present, the resolver records them in the resolved entry (verification gates land in a follow-up wave).

The naming convention mirrors nexo plugin install (Phase 31.1.c) so a single CI workflow can publish both flavors with the same tooling (cargo dist, gh release upload).
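The `.sha256` asset format is strict enough to validate before any download comparison happens. This sketch checks the "single line of 64 lowercase hex characters" rule; `parse_sha256_line` is a hypothetical name, and tolerating one trailing newline is an assumption.

```rust
/// Return the digest if `contents` is exactly 64 lowercase hex characters,
/// optionally followed by a single trailing newline (assumed tolerance).
pub fn parse_sha256_line(contents: &str) -> Option<&str> {
    let line = contents.strip_suffix('\n').unwrap_or(contents);
    let is_lower_hex = line
        .bytes()
        .all(|b| matches!(b, b'0'..=b'9' | b'a'..=b'f'));
    if line.len() == 64 && is_lower_hex {
        Some(line)
    } else {
        None
    }
}
```

Rejecting uppercase and `hash  filename` style lines up front gives publishers an early format error instead of a confusing `Sha256Mismatch`-style failure later in the install.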

Errors

Symptom	Cause	Fix
`release tag does not parse as semver`	Tag uses `release-1.2.3` or another non-semver shape.	Re-tag as `vX.Y.Z`.
`release is missing required asset persona.toml`	The release JSON has no `persona.toml` asset.	Upload the manifest as a release asset matching the convention.
`persona id violates id regex`	The manifest's `[persona] id` has uppercase / spaces / etc.	Rename so the id matches `^[a-z0-9][a-z0-9-]{2,63}$`.
`v1 packs install via the persona's install.sh`	Manifest declares `manifest_version = 1`.	Bump to `2` (no field-shape changes); same TOML re-parses.
`tar entry path contains ..; rejected for safety`	Malicious / malformed tarball.	Re-pack ensuring every entry path is relative + traversal-free.
`persona install root must be an absolute path`	`--dest <relative>`.	Pass an absolute path.
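Two of the checks in the table, the id pattern and the tarball path rule, are easy to express without a regex engine. The sketch below is illustrative only; `is_valid_persona_id` and `is_safe_tar_entry` are hypothetical names, and the path check follows Unix path semantics.

```rust
use std::path::{Component, Path};

/// Mirrors `^[a-z0-9][a-z0-9-]{2,63}$`: 3 to 64 bytes in total,
/// first byte alphanumeric, the rest alphanumeric or '-'.
pub fn is_valid_persona_id(id: &str) -> bool {
    let b = id.as_bytes();
    (3..=64).contains(&b.len())
        && (b[0].is_ascii_lowercase() || b[0].is_ascii_digit())
        && b[1..]
            .iter()
            .all(|c| c.is_ascii_lowercase() || c.is_ascii_digit() || *c == b'-')
}

/// Accept only relative, traversal-free entry paths.
pub fn is_safe_tar_entry(entry: &str) -> bool {
    if entry.is_empty() {
        return false;
    }
    // Any RootDir, Prefix or ParentDir component makes the entry unsafe.
    Path::new(entry)
        .components()
        .all(|c| matches!(c, Component::Normal(_) | Component::CurDir))
}
```

Pack authors can run equivalent checks in CI before publishing, so a bad id or a stray `../` entry is caught before operators hit it at install time.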

Persona pack manifest schema (persona.toml) — see the Cody pack README for the v2 manifest shape (a dedicated docs page is a TBD follow-up).
Plugin install (nexo plugin install) — sister CLI surface; the persona installer reuses ~60 % of the resolve + download + sha256-verify plumbing.
Broker shapes — local vs. NATS vs. embedded (orthogonal, but referenced by personas declaring [persona.requires] features).

Manifest (`plugin.toml`)

Every extension ships a plugin.toml at its root. It declares identity, transport, capabilities, runtime requirements, and any bundled MCP servers. The runtime parses and validates the manifest before spawning anything.

Source: crates/extensions/src/manifest.rs.

Minimal example

[plugin]
id = "weather"
version = "0.1.0"
name = "Weather"
description = "Fetch weather by city name."
min_agent_version = "0.1.0"
priority = 0

[capabilities]
tools = ["get_weather"]
hooks = []

[transport]
type = "stdio"
command = "./weather"
args = []

[requires]
bins = ["curl"]
env = ["WEATHER_API_KEY"]

[context]
passthrough = false

[meta]
author = "you"
license = "MIT OR Apache-2.0"

Sections

`[plugin]`

Field	Required	Purpose
`id`	✅	Unique id. Regex `^[a-z][a-z0-9_-]*$`, ≤ 64 chars. Must not be a reserved id (see below).
`version`	✅	Semver.
`name`	—	Human-readable label.
`description`	—	≤ 512 UTF-8 chars.
`min_agent_version`	—	Semver. Checked against the running agent version at load time.
`priority`	—	`i32`, default `0`. Lower fires first in hook chains.

Reserved ids: agent, browser, core, email, heartbeat, memory, telegram, whatsapp. The host may register more via register_reserved_ids().
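The id rules combine a shape check with the reserved list. The sketch below is not the code in crates/extensions/src/manifest.rs; `validate_plugin_id` is a hypothetical name, the pattern is read as a lowercase letter followed by any number of `[a-z0-9_-]` characters (as the 64-character limit implies), and ids added at runtime through `register_reserved_ids()` are not modelled.

```rust
/// Built-in reserved ids listed in this section.
const RESERVED_IDS: [&str; 8] = [
    "agent", "browser", "core", "email", "heartbeat", "memory", "telegram", "whatsapp",
];

pub fn validate_plugin_id(id: &str) -> Result<(), String> {
    let bytes = id.as_bytes();
    // Shape: 1 to 64 bytes, lowercase first letter, then [a-z0-9_-].
    let shape_ok = !bytes.is_empty()
        && bytes.len() <= 64
        && bytes[0].is_ascii_lowercase()
        && bytes[1..].iter().all(|c| {
            c.is_ascii_lowercase() || c.is_ascii_digit() || *c == b'_' || *c == b'-'
        });
    if !shape_ok {
        return Err(format!("invalid plugin id: {:?}", id));
    }
    if RESERVED_IDS.contains(&id) {
        return Err(format!("reserved plugin id: {}", id));
    }
    Ok(())
}
```

The `weather` id from the minimal example passes, while `telegram` fails on the reserved list and `Weather` fails on shape.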

`[capabilities]`

[capabilities]
tools = ["get_weather", "get_forecast"]
hooks = ["before_message", "after_tool_call"]
channels = []
providers = []

At least one capability list must be non-empty. Names match ^[a-z][a-z0-9_]*$, ≤ 64 chars, no duplicates.

`[transport]`

One of three forms:

# stdio — spawn a child process
[transport]
type = "stdio"
command = "./my-extension"
args = ["--verbose"]

# nats — talk over a NATS subject prefix
[transport]
type = "nats"
subject_prefix = "ext.myext"

# http — call over HTTP
[transport]
type = "http"
url = "https://localhost:8080"

Validation: command, subject_prefix, url non-empty; url must be http(s)://.
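The three transport forms map naturally onto an enum with one validation method. This is a sketch of the stated rules only; the `Transport` type below is hypothetical and is not claimed to match the definition in the extensions crate.

```rust
/// Hypothetical model of the `[transport]` section.
pub enum Transport {
    Stdio { command: String, args: Vec<String> },
    Nats { subject_prefix: String },
    Http { url: String },
}

impl Transport {
    /// Apply the documented rules: the required field must be non-empty,
    /// and an HTTP url must use the http:// or https:// scheme.
    pub fn validate(&self) -> Result<(), String> {
        match self {
            Transport::Stdio { command, .. } if command.is_empty() => {
                Err("stdio transport: command is empty".to_string())
            }
            Transport::Nats { subject_prefix } if subject_prefix.is_empty() => {
                Err("nats transport: subject_prefix is empty".to_string())
            }
            Transport::Http { url } if url.is_empty() => {
                Err("http transport: url is empty".to_string())
            }
            Transport::Http { url }
                if !(url.starts_with("http://") || url.starts_with("https://")) =>
            {
                Err(format!("http transport: unsupported scheme in {:?}", url))
            }
            _ => Ok(()),
        }
    }
}
```

Modelling the forms as an enum means a manifest can never carry, say, both a `command` and a `url`, which removes a class of ambiguous configurations before validation even runs.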

`[requires]`

[requires]
bins = ["ffmpeg", "imagemagick"]
env  = ["OPENAI_API_KEY"]

Declarative preconditions used for gating: when the runtime discovers the extension, it calls Requires::missing(). If any entry in bins is not on $PATH or any variable in env is unset, the extension is skipped (warn, not fail) and its tools are not registered.

See Stdio runtime — Gating.
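The gating check can be approximated with the standard library alone. The sketch below reuses the name `Requires::missing` for readability, but its signature, return type and the `bin:` / `env:` labels are assumptions, not the crate's real API. It also only checks that a file with the right name exists on `$PATH`, not that it is executable.

```rust
use std::env;

/// Hypothetical stand-in for the manifest's `[requires]` section.
pub struct Requires {
    pub bins: Vec<String>,
    pub env: Vec<String>,
}

impl Requires {
    /// List every unmet precondition; an empty result means "load it".
    pub fn missing(&self) -> Vec<String> {
        let path_var = env::var_os("PATH").unwrap_or_default();
        let mut missing = Vec::new();
        for bin in &self.bins {
            // Look for a regular file named `bin` in each PATH directory.
            let found = env::split_paths(&path_var).any(|dir| dir.join(bin).is_file());
            if !found {
                missing.push(format!("bin:{}", bin));
            }
        }
        for name in &self.env {
            // Assumption: a variable set to the empty string counts as set.
            if env::var_os(name).is_none() {
                missing.push(format!("env:{}", name));
            }
        }
        missing
    }
}
```

Returning the full list rather than the first failure lets the runtime log every unmet requirement in a single warning.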

`[context]`

[context]
passthrough = true

When true, every tool call sent to this extension has _meta = { agent_id, session_id } injected into the JSON args. Lets the extension tell calls apart per-agent without the runtime having to encode the split into every tool signature.

`[mcp_servers]` (phase 12.7)

Inline MCP server declarations bundled with the extension:

[mcp_servers.gmail]
type = "stdio"
command = "./gmail-mcp"
args = []

[mcp_servers.calendar]
type = "streamable_http"
url = "https://mcp.example.com/calendar"

Each server name must match ^[a-z][a-z0-9_-]*$, ≤ 32 chars. Alternatively, drop a sidecar .mcp.json next to plugin.toml if the manifest has no [mcp_servers] section.

Validation at a glance

flowchart TD
    READ[read plugin.toml] --> PARSE[parse TOML]
    PARSE --> ID{id valid?<br/>regex + length<br/>+ not reserved}
    ID --> VER{version<br/>valid semver?}
    VER --> MIN{min_agent_version<br/>satisfied?}
    MIN --> CAPS{at least one<br/>capability declared?}
    CAPS --> NAMES{capability names<br/>valid + unique?}
    NAMES --> TRANS{transport<br/>non-empty +<br/>http scheme valid?}
    TRANS --> MCP{mcp_server names<br/>valid?}
    MCP --> OK([Manifest accepted])
    ID --> FAIL([Diagnostic: Error])
    VER --> FAIL
    MIN --> FAIL
    CAPS --> FAIL
    NAMES --> FAIL
    TRANS --> FAIL
    MCP --> FAIL

Any failure produces a DiagnosticLevel::Error in the discovery report — the candidate is dropped but scanning continues so an operator sees every broken manifest at once.

Agent-version gating

[plugin]
min_agent_version = "0.2.0"

On load the runtime compares against the agent build version. A mismatch logs a diagnostic and drops the candidate. Useful for shipping a manifest that relies on a newer host API without crash-looping older deployments. The host can override the reported version for tests via set_agent_version().

Discovery and NATS runtime — how the manifest drives spawn
CLI — agent ext validate <path> checks a manifest without touching the registry
Templates — prebuilt skeletons to copy

Extension patterns

Common shapes for nexo extensions. An extension is a self-contained directory with a manifest.toml that declares contributed tools, advisors, skills, MCP servers, channel adapters, and config schemas. Operators install with nexo ext install ./your-extension.

Pick the closest match; copy the skeleton; modify.

Pattern 1 · Tool bundle

When to use · You have 3-10 related tools (e.g. CRM ops: crm_lookup, crm_create_contact, crm_update_deal, crm_close_deal) and you want to ship them as a unit.

A tool bundle is the simplest extension. Each tool gets its own JSON schema + handler binary (or in-process Rust function). The manifest enumerates them; the daemon registers all on nexo ext install.

[extension]
id = "crm-tools"
version = "0.2.0"
description = "Salesforce-style CRM operations"

[[tools]]
name = "crm_lookup"
schema_path = "tools/crm_lookup.json"
binary = "./bin/crm-tools"

[[tools]]
name = "crm_create_contact"
schema_path = "tools/crm_create_contact.json"
binary = "./bin/crm-tools"

[[tools]]
name = "crm_close_deal"
schema_path = "tools/crm_close_deal.json"
binary = "./bin/crm-tools"

The binary is a single executable that dispatches by tool name. Operators add the tool names to agents.yaml once installed.

Pattern 2 · Advisor pack

When to use · You're shipping domain-specific personas (sales, legal-review, customer-support escalation) that other operators can drop into their agents.

Each advisor is a markdown system-prompt file the agent prepends to its base persona when handling specific topics. Bundle 3-8 together for a vertical.

[extension]
id = "sales-advisor-pack"
version = "0.1.0"
description = "BANT-style qualification + handoff prompts"

[[advisors]]
id = "bant-qualifier"
prompt_path = "advisors/bant_qualifier.md"

[[advisors]]
id = "objection-handler"
prompt_path = "advisors/objection_handler.md"

[[advisors]]
id = "demo-booker"
prompt_path = "advisors/demo_booker.md"

advisors/bant_qualifier.md:

You are a BANT-trained sales qualifier. For every inbound message,
internally score:
- Budget: ...
- Authority: ...
- Need: ...
- Timeline: ...
Only progress to demo-booker advisor when score >= 70.

Pattern 3 · Skill bundle

When to use · You have multi-step workflows (send-quote, escalate-to-human, handoff-to-team) that aren't single LLM turns — they need scripted sequences with branching.

Skills are YAML-defined workflows the agent can invoke. Multi-step with conditionals and tool calls. The extension ships YAML + referenced templates.

[extension]
id = "support-skills"
version = "0.3.1"

[[skills]]
id = "escalate-to-human"
yaml_path = "skills/escalate.yaml"

[[skills]]
id = "schedule-followup"
yaml_path = "skills/followup.yaml"

skills/escalate.yaml:

id: escalate-to-human
description: "Hand off to a human on a Telegram channel"
steps:
  - tool: format_transcript
    args: { last_n: 10 }
  - tool: telegram_post
    args:
      channel: ${ESCALATION_CHANNEL}
      message: |
        ⚠ Escalation request from ${user_id}
        Summary: ${summary}
        Transcript: ${transcript_url}
  - reply: "Te conecto con un agente humano. Te responderá pronto."

Pattern 4 · MCP server bundle

When to use · You're wrapping an external service as an MCP server so multiple agents can use it.

The extension ships a binary that speaks MCP (stdio or HTTP+SSE). Operators register the MCP server via the manifest; agents see its tools as native ones.

[extension]
id = "github-mcp"
version = "1.0.0"

[[mcp_servers]]
id = "github"
command = "./bin/github-mcp"
transport = "stdio"
env_passthrough = ["GITHUB_TOKEN"]

→ See Building an MCP server extension for the full walkthrough.

Pattern 5 · Multi-tenant SaaS extension

When to use · You're building a vertical SaaS (sales / support / marketing) where each tenant gets the same toolkit but isolated state, scoped credentials, per-tenant audit logs.

The extension declares multi_tenant.isolated_state = true. The framework partitions tool state, credentials, and skill output per tenant_id. Agents bound to a tenant only see that tenant's data.

[extension]
id = "sales-saas"
version = "1.2.0"

[[tools]]
name = "crm_lookup"
schema_path = "tools/crm_lookup.json"

[[advisors]]
id = "bant-qualifier"
prompt_path = "advisors/bant.md"

[multi_tenant]
isolated_state = true       # state stored under tenant scope
per_tenant_secrets = true   # secrets resolved per tenant
audit_per_tenant = true     # audit log scoped per tenant

[multi_tenant.quotas]
default = { llm_tokens_month = 1_000_000, agents = 3 }

The microapp layer (above) provisions tenants + assigns this extension to them via admin RPC.

→ Multi-tenant SaaS guide

Pattern 6 · Channel adapter pack

When to use · You're contributing a new channel kind that's not a subprocess plugin (e.g. a stdlib-friendly one that fits inline as a daemon module).

The extension declares a channel adapter implementation. The framework registers it with the channel registry; agents reference it via channels: [<kind>:<instance>] in agents.yaml.

[extension]
id = "discord-channel"
version = "0.1.0"

[[channel_adapters]]
kind = "discord"
adapter_module = "discord_adapter"   # rust crate path or shared lib
config_schema_path = "discord_config.json"

Most channels ship as plugins (subprocess), not extensions. Use this pattern only when the adapter must run in-process for performance or to share daemon state directly.

Pattern 7 · Config schema extension

When to use · You want to expose a new YAML config block that operators set in agents.yaml or a new file under config/.

The extension declares a JSON Schema for the new config; the daemon merges it into nexo doctor config validation and nexo agent doctor reports.

[extension]
id = "billing-config"
version = "0.1.0"

[[config_schemas]]
section = "billing"
schema_path = "billing.schema.json"
yaml_files = ["billing.yaml"]

Operator's config/billing.yaml:

billing:
  provider: stripe
  webhook_secret: ${STRIPE_WEBHOOK_SECRET}
  default_plan: pro

nexo doctor config will validate the file against your schema.

Pattern 8 · Knowledge-base loader

When to use · You're shipping a curated KB (FAQs, runbooks, playbooks) that should land in the operator's vector store.

The extension ships markdown / JSON documents + a kb_loader hook that imports them into the configured vector store on install.

[extension]
id = "support-kb"
version = "1.0.0"

[[kb_collections]]
id = "support-faqs"
loader = "./bin/load-faqs"     # binary that reads docs/ and emits chunks
docs_dir = "docs/"
embedding_model = "minimax-embed"

The loader runs once at install time + re-runs whenever the operator updates the extension version. Output lands in <state_dir>/<tenant_id>/vector/support-faqs/.

Choosing between patterns

If you...	Use
Have related tools to ship together	Tool bundle (1)
Have domain-specific persona prompts	Advisor pack (2)
Have multi-step scripted workflows	Skill bundle (3)
Wrap an external service as MCP	MCP server bundle (4)
Build a vertical SaaS	Multi-tenant SaaS (5)
Add a new in-process channel kind	Channel adapter (6)
Add a new config section	Config schema (7)
Ship a curated knowledge base	KB loader (8)

Plugin vs Extension — quick decision

If you find yourself between Plugin and Extension:

Choose Plugin when: the work is a separate process, runs in a non-Rust language, or interacts with an external service that has its own connection lifecycle (WebSocket, gateway, push).
Choose Extension when: the work is in-process Rust, ships with curated assets (advisors / skills / KBs), or needs tight multi-tenant state isolation.

Templates

The repo ships two extension templates as starting points. Copy one, rename it, fill in the tools, done.

Location: extensions/template-rust/ and extensions/template-python/.

What's shared

Both templates follow the same wire protocol and directory shape:

<your-ext>/
├── plugin.toml        # manifest (see ./manifest.md)
├── README.md          # what the extension does
├── <binary or script> # stdio-RPC entry point
└── ...                # build files specific to the language

The agent talks to both in the same JSON-RPC 2.0 shape:

initialize — handshake; returns {server_version, tools, hooks}
tools/<name> — tool invocation; returns the tool's result
hooks/<name> — hook invocation (when any hook is declared)

Line-delimited JSON over stdin/stdout. stderr is forwarded to the agent's tracing output — that's your debug log.

Rust template (`extensions/template-rust/`)

Standalone Cargo project outside the agent workspace — its own Cargo.toml, own Cargo.lock, own target/. Keeps your extension's deps independent of the agent's.

template-rust/
├── Cargo.toml
├── Cargo.lock
├── plugin.toml
├── README.md
├── src/
│   └── main.rs        # JSON-RPC loop
└── target/            # (gitignore)

src/main.rs implements:

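use std::io::{self, BufRead, Write};
use serde_json::{json, Value};

// Sketch of the template's loop; serde_json and the stub handler bodies are assumptions.
fn handshake_info() -> Value { json!({ "server_version": "0.1.0", "tools": ["ping", "add"], "hooks": ["before_message"] }) }
fn ping(_params: &Value) -> Value { json!({ "pong": true }) }
fn add(p: &Value) -> Value { json!({ "sum": p["a"].as_f64().unwrap_or(0.0) + p["b"].as_f64().unwrap_or(0.0) }) }
fn pass() -> Value { json!({}) }

fn main() -> io::Result<()> {
    let mut stdout = io::stdout().lock();
    for line in io::stdin().lock().lines() {
        // One request per line; skip lines that are not valid JSON.
        let Ok(req) = serde_json::from_str::<Value>(&line?) else { continue };
        let result = match req["method"].as_str().unwrap_or("") {
            "initialize" => Some(handshake_info()),
            "tools/ping" => Some(ping(&req["params"])),
            "tools/add" => Some(add(&req["params"])),
            "hooks/before_message" => Some(pass()),
            _ => None,
        };
        // Echo the request id so the agent can correlate the reply.
        let reply = match result {
            Some(r) => json!({ "jsonrpc": "2.0", "id": req["id"], "result": r }),
            None => json!({ "jsonrpc": "2.0", "id": req["id"], "error": { "code": -32601, "message": "method not found" } }),
        };
        writeln!(stdout, "{reply}")?;
        stdout.flush()?;
    }
    Ok(())
}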

Build with cargo build --release; the release binary at ./target/release/template-rust is what plugin.toml::transport.command points at.

Python template (`extensions/template-python/`)

template-python/
├── plugin.toml
├── main.py       # #!/usr/bin/env python3
└── README.md

stdlib only (no pip install). Same JSON-RPC loop over stdin/stdout. Logs to stderr via print(..., file=sys.stderr).

Good for quick extensions where Python's startup time and slower execution are acceptable (batch workloads, cron-ish tasks, one-off scripting). The interpreter is started once per extension, not once per tool call.

Promoting a template to your own extension

flowchart LR
    COPY[copy template-rust<br/>to my-extension] --> EDIT[edit plugin.toml<br/>id, version, tools]
    EDIT --> CODE[implement tools/...]
    CODE --> BUILD[cargo build --release]
    BUILD --> VAL[agent ext validate<br/>./my-extension/plugin.toml]
    VAL --> INSTALL[agent ext install<br/>./my-extension --link --enable]
    INSTALL --> DOCTOR[agent ext doctor<br/>--runtime]

Conventions in the shipped templates

plugin.toml declares the minimum required capabilities — no phantom hooks or tools
requires.bins / requires.env left empty; add your own
[context] passthrough = false — opt in explicitly when you need per-agent / per-session state
License left blank — pick one and add it to [meta]

Gotchas

Rust template builds in its own workspace. Don't cargo add from the repo root — that edits the agent workspace, not the extension.
Python template spawns a new interpreter per extension, not per tool call. Stdin/stdout stay open for the life of the process. Don't exit after one tool call.
JSON-RPC ids must echo back. If your handler drops the id field, the agent can't correlate the reply.
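To make the id-echo and one-line-per-message rules concrete, here is a minimal std-only sketch of reply framing. A real extension would normally build replies with a JSON library; the helper names `success_line`, `error_line`, and `escape_json` are hypothetical, and `-32601` is the standard JSON-RPC 2.0 "method not found" code.

```rust
/// Escape a string for embedding inside a JSON string literal.
pub fn escape_json(s: &str) -> String {
    let mut out = String::with_capacity(s.len());
    for c in s.chars() {
        match c {
            '"' => out.push_str("\\\""),
            '\\' => out.push_str("\\\\"),
            '\n' => out.push_str("\\n"),
            '\r' => out.push_str("\\r"),
            '\t' => out.push_str("\\t"),
            c if (c as u32) < 0x20 => out.push_str(&format!("\\u{:04x}", c as u32)),
            c => out.push(c),
        }
    }
    out
}

/// Build a one-line JSON-RPC 2.0 success reply.
/// `raw_id` and `raw_result` must already be valid JSON tokens
/// (for example `7`, `"abc"`, or `{"pong":true}`); the id is copied
/// back verbatim so the caller can correlate the reply.
pub fn success_line(raw_id: &str, raw_result: &str) -> String {
    format!(
        "{{\"jsonrpc\":\"2.0\",\"id\":{},\"result\":{}}}\n",
        raw_id, raw_result
    )
}

/// Build a one-line JSON-RPC 2.0 error reply with the same id echo.
pub fn error_line(raw_id: &str, code: i32, message: &str) -> String {
    format!(
        "{{\"jsonrpc\":\"2.0\",\"id\":{},\"error\":{{\"code\":{},\"message\":\"{}\"}}}}\n",
        raw_id,
        code,
        escape_json(message)
    )
}
```

Each reply ends in exactly one newline and contains no embedded line breaks, which is what keeps the stream line-delimited.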

CLI (`agent ext`)

Operator-facing commands for discovering, installing, validating, and toggling extensions. Subcommands that print a report accept --json for scripting (see the synopsis below).

Source: crates/extensions/src/cli/.

Subcommands

agent ext list                           [--json]
agent ext info <id>                      [--json]
agent ext enable <id>
agent ext disable <id>
agent ext validate <path>
agent ext doctor                         [--runtime] [--json]
agent ext install <path>                 [--update] [--enable] [--dry-run] [--link] [--json]
agent ext uninstall <id> --yes           [--json]

`list` — discovered extensions

Walks the configured search_paths, prints each candidate, its transport, and its enabled/disabled state.

`info <id>` — manifest + status

Prints the full parsed manifest, the runtime state if the agent is currently running, and any diagnostics attached to the candidate.

`enable` / `disable` — toggle in `extensions.yaml`

Rewrites the disabled list in config/extensions.yaml:

extensions:
  disabled: [weather]

No runtime side effect; operator must restart the agent to apply.

`validate <path>` — manifest check without registering

Parses and validates a plugin.toml at <path>. Good for CI checks on an extension's manifest before shipping.

`doctor` — preflight checks

Runs the same Requires::missing() logic as discovery, plus transport-specific checks:

flowchart TB
    START([agent ext doctor]) --> DISC[discover candidates]
    DISC --> REQ[check requires.bins + requires.env]
    REQ --> RUNT{--runtime?}
    RUNT -->|yes| SPAWN[spawn each stdio extension<br/>and handshake]
    RUNT -->|no| DONE([report table])
    SPAWN --> DONE

--runtime actually spawns each stdio extension and runs the handshake — useful to catch a broken binary before production boot.
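The static half of the preflight (the requires.bins and requires.env check) can be pictured with the sketch below. It is not the real `Requires::missing()` implementation: the function name, its signature, and the choice to treat an empty env var as missing are all assumptions made for illustration.

```rust
use std::path::Path;

/// Hypothetical stand-in for the requires check (not the nexo API).
/// Returns (missing binaries, missing env vars).
/// A binary counts as present when a regular file with that name
/// exists in any of `path_dirs`; an env var counts as present when
/// `env` returns a non-empty value for it.
pub fn missing_requirements(
    bins: &[&str],
    env_vars: &[&str],
    path_dirs: &[&Path],
    env: impl Fn(&str) -> Option<String>,
) -> (Vec<String>, Vec<String>) {
    let mut missing_bins = Vec::new();
    for bin in bins {
        let found = path_dirs.iter().any(|dir| dir.join(*bin).is_file());
        if !found {
            missing_bins.push(bin.to_string());
        }
    }

    let mut missing_env = Vec::new();
    for name in env_vars {
        let present = env(*name).map_or(false, |v| !v.is_empty());
        if !present {
            missing_env.push(name.to_string());
        }
    }

    (missing_bins, missing_env)
}
```

Passing the environment lookup as a closure keeps the check testable; in a real binary it could be `|name| std::env::var(name).ok()`, with `path_dirs` taken from `std::env::split_paths` over `PATH`.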

`install <path>` — copy or symlink

Adds an extension to the active search_paths:

agent ext install ./extensions/weather
agent ext install /abs/path/to/my-ext --link --enable

--update replaces an existing extension with the same id
--enable adds it to extensions.yaml enabled (default: disabled until you enable)
--dry-run prints what would happen without writing
--link creates a symlink instead of copying — requires an absolute source path. Good for dev loops.

`uninstall <id> --yes`

Removes the extension's directory from the active search path (or the symlink, in --link installs). --yes is mandatory — no accidental destruction.

Exit codes

Code	Meaning
0	Success
1	Extension not found / `--update` target missing
2	Invalid manifest / invalid source / `--link` needs absolute path
3	Config write failed
4	Invalid id (reserved or empty)
5	Target exists (use `--update`)
6	Id collision across roots
7	`uninstall` missing `--yes` confirmation
8	Copy / atomic swap failed
9	Runtime check(s) failed (`doctor --runtime`)

Non-zero codes are stable for scripting.
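For scripts that wrap the CLI, the table above can be turned into a small lookup. The sketch below only restates the documented codes; the function names and the retry policy in `should_retry_with_update` are hypothetical examples of how a deploy script might branch on them.

```rust
/// Map an `agent ext` exit code to the meaning given in the table.
/// Returns `None` for codes the table does not document.
pub fn describe_exit(code: i32) -> Option<&'static str> {
    match code {
        0 => Some("success"),
        1 => Some("extension not found / --update target missing"),
        2 => Some("invalid manifest / invalid source / --link needs absolute path"),
        3 => Some("config write failed"),
        4 => Some("invalid id (reserved or empty)"),
        5 => Some("target exists (use --update)"),
        6 => Some("id collision across roots"),
        7 => Some("uninstall missing --yes confirmation"),
        8 => Some("copy / atomic swap failed"),
        9 => Some("runtime check(s) failed (doctor --runtime)"),
        _ => None,
    }
}

/// Hypothetical policy: an install that failed because the target
/// already exists (code 5) can be retried with `--update`.
pub fn should_retry_with_update(code: i32) -> bool {
    code == 5
}
```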

JSON mode

Every subcommand that produces human output also supports --json for machine consumption. Fields are stable per code-phase; schema is not officially frozen yet — pin to a specific agent version in CI.

Common ops flows

Ship an extension to staging

agent ext validate ./my-ext/plugin.toml
agent ext install ./my-ext --link --enable
agent ext doctor --runtime

Disable a flapping extension without redeploying

agent ext disable weather   # writes to extensions.yaml
systemctl reload agent       # or restart, depending on deployment

CI gate

# .github/workflows/extension.yml
- run: cargo build --release
- run: agent ext validate ./plugin.toml

Building a multi-tenant SaaS microapp (Phase 82 walkthrough)

This page connects the dots across Phase 82's primitives so a microapp author can ship a multi-tenant SaaS extension without re-deriving the architecture from each sub-phase doc. Every section maps directly to a primitive that's already built; the work is wiring them together for your specific shape.

What you get from Phase 82

Primitive	Doc
`BindingContext` propagation (per-call agent + binding identity)	`agents.md`
Webhook receiver (single HTTP entry, YAML-routed to NATS)	`ops/webhook-receiver.md`
Outbound dispatch from extension (`nexo/dispatch`)	`extensions/stdio.md`
NATS event subject → agent turn binding	`config/agents.md`
Per-binding tool rate-limit	`ops/per-binding-rate-limits.md`
Per-extension state directory	`extensions/state-management.md`
Multi-tenant audit log filter (Phase 82.8)	inline below
Admin RPC (CRUD agents/credentials/pairing/llm/channels)	`microapps/admin-rpc.md`
Agent events firehose	`microapps/admin-rpc.md`
HTTP server capability	`microapps/admin-rpc.md`
Operator chat takeover	`microapps/admin-rpc.md`
Agent escalation	`microapps/admin-rpc.md`

Reference scaffold

agent-creator is the reference SaaS-shaped microapp (out-of-tree repo: see your operator's microapp registry for the URL). It uses every primitive in this list and is the recommended starting point for clone-and-adapt. The rest of this page assumes you've checked it out alongside the daemon source.

Tenant onboarding flow

Operator creates a row in your microapp's tenants table (see migrations/0001_tenants.sql). Each tenant carries an account_id: TEXT PRIMARY KEY that becomes the cross-cutting identifier through:
- BindingContext.account_id on every inbound + tool call
- goal_turns.account_id for audit isolation (Phase 82.8)
- ProcessingScope::Conversation { account_id, … } for pause/resume (Phase 82.13)
- EscalationEntry { agent_id, scope, … } where scope carries the account_id (Phase 82.14)

The microapp creates per-tenant artifacts under state_dir_for(extension_id)/tenants/<account_id>/:

~/.nexo/extensions/agent-creator/state/tenants/acme/
  ├── leads.sqlite
  ├── opt_outs.sqlite
  └── credentials.json    # encrypted at rest

Operator binds the tenant to a channel via nexo/admin/credentials/register (Phase 82.10.d) — the same bearer token gets both the channel's outbound write capability AND the per-tenant audit scope.
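Because account_id ends up as a directory name under the per-tenant artifacts tree, it is worth validating before joining it onto a path. The following is a minimal sketch of such a guard; `tenant_dir` and its allowed character set are hypothetical choices, not something the framework enforces.

```rust
use std::path::{Path, PathBuf};

/// Hypothetical guard (not a nexo API): build
/// `<state_root>/tenants/<account_id>` only when `account_id` is safe
/// to use as a single path component. Anything outside
/// `[A-Za-z0-9_-]` is rejected, which rules out `..`, separators,
/// and empty ids.
pub fn tenant_dir(state_root: &Path, account_id: &str) -> Option<PathBuf> {
    let safe = !account_id.is_empty()
        && account_id
            .chars()
            .all(|c| c.is_ascii_alphanumeric() || c == '-' || c == '_');
    if !safe {
        return None;
    }
    Some(state_root.join("tenants").join(account_id))
}
```

With the state root from the example above, `tenant_dir(root, "acme")` resolves to the `tenants/acme` directory, while an id such as `../etc` is refused.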

Channel binding

agents.yaml.<id>.inbound_bindings lists which channels the agent answers. Each binding inherits the tenant's account_id via the channel plugin's inbound shape (Phase 82.5 InboundMessageMeta). Provider plugins (whatsapp, telegram, email, slack-mcp) are responsible for stamping account_id onto the inbound — this is what threads tenancy through to the audit log + rate-limit buckets + escalation scopes.

Credential vault pattern

Credentials are filesystem-backed (Phase 82.10.h.3 FilesystemCredentialStore):

secrets/<channel>/<instance>/payload.json

For multi-tenant, use <instance> = <account_id> so the operator UI can rotate one tenant's bearer without touching others. The Phase 82.12 token_hash helper lets the daemon notify a microapp of rotation without putting the cleartext old token on the wire.

Drip scheduler (or whatever cron-like flow you need)

Phase 82.4 + 82.4.b ship the NATS event subscriber runtime: extensions subscribe to a NATS subject and the daemon binds each event to an agent turn. For a per-tenant drip:

Microapp publishes marketing.drip.fire.<account_id> on NATS at the cron tick.
agents.yaml.<agent_id>.event_subscribers includes marketing.drip.fire.* (glob).
Per-binding rate-limit (Phase 82.7, tool_rate_limits.<binding_id>.send_drip = 10/min) caps the per-tenant outbound velocity so a runaway tenant doesn't starve the others.
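The subject scheme in the steps above relies on wildcard matching. In NATS, `*` matches exactly one dot-separated token and `>` matches one or more trailing tokens; the sketch below assumes event_subscribers follows those standard semantics. Both function names are hypothetical.

```rust
/// Match a subject against a pattern using NATS wildcard rules:
/// `*` matches exactly one token, `>` matches one or more trailing
/// tokens and is only valid as the last pattern token.
pub fn subject_matches(pattern: &str, subject: &str) -> bool {
    let mut pat = pattern.split('.');
    let mut sub = subject.split('.');
    loop {
        match (pat.next(), sub.next()) {
            (Some(">"), Some(_)) => return pat.next().is_none(),
            (Some("*"), Some(tok)) => {
                if tok.is_empty() {
                    return false;
                }
            }
            (Some(p), Some(s)) => {
                if p != s {
                    return false;
                }
            }
            (None, None) => return true,
            // One side ran out of tokens before the other.
            _ => return false,
        }
    }
}

/// Hypothetical helper: build the per-tenant drip subject, refusing
/// ids that would change the token structure or act as wildcards.
pub fn drip_subject(account_id: &str) -> Option<String> {
    let bad = account_id.is_empty()
        || account_id
            .chars()
            .any(|c| c == '.' || c == '*' || c == '>' || c.is_whitespace());
    if bad {
        None
    } else {
        Some(format!("marketing.drip.fire.{}", account_id))
    }
}
```

The practical consequence is that an account_id containing a dot would produce an extra token and silently stop matching `marketing.drip.fire.*`, which is why the helper rejects it.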

Compliance hooks

Redactor (Phase 10.4) runs inside TranscriptWriter::append_entry BEFORE persistence. Body bytes that hit disk are already redacted; the firehose emits the same redacted body. Microapps don't have to implement their own redaction — operator config in transcripts.yaml is the single point of control.
Audit retention (Phase 82.10.h.1) — operators set NEXO_MICROAPP_ADMIN_AUDIT_RETENTION_DAYS / NEXO_MICROAPP_ADMIN_AUDIT_MAX_ROWS. Boot sweep enforces both.
Operator takeover (Phase 82.13) — pause a single conversation with nexo/admin/processing/pause; agent goes silent while operator types a manual reply via nexo/admin/processing/intervention. Compliance teams use this for high-risk tenants.

Audit queries

For per-tenant billing / support, query the audit log scoped to one tenant:

use chrono::{Duration, Utc};
use nexo_agent_registry::SqliteTurnLogStore;

// Runs inside an async fn that returns a Result;
// `store` is an already-opened SqliteTurnLogStore.
let rows = store
    .tail_for_account("acme", Utc::now() - Duration::days(30), 500)
    .await?;

The store filters strictly by account_id and excludes legacy NULL rows. Cross-tenant probes return an empty list (not an error), as defense in depth against existence oracles. Operator-scoped tools (tail, tail_since) keep returning every row including legacy NULL.
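The filtering rules can be modelled in memory as follows. This is a sketch of the semantics only, not the SQLite-backed store: the `TurnRow` shape, the Unix-timestamp field, and the newest-first ordering are assumptions.

```rust
/// Simplified, hypothetical row shape; legacy rows have no account.
#[derive(Debug, Clone)]
pub struct TurnRow {
    pub account_id: Option<String>,
    pub at_unix: i64,
}

/// Convenience constructor for examples.
pub fn row(account_id: Option<&str>, at_unix: i64) -> TurnRow {
    TurnRow { account_id: account_id.map(str::to_string), at_unix }
}

/// Tenant-scoped tail: only rows whose account_id equals the
/// requested one, never legacy `None` rows, newest first (assumed),
/// capped at `limit`. An unknown tenant yields an empty list rather
/// than an error, so callers cannot probe for tenant existence.
pub fn tail_for_account<'a>(
    rows: &'a [TurnRow],
    account_id: &str,
    since_unix: i64,
    limit: usize,
) -> Vec<&'a TurnRow> {
    let mut hits: Vec<&'a TurnRow> = rows
        .iter()
        .filter(|r| r.account_id.as_deref() == Some(account_id) && r.at_unix >= since_unix)
        .collect();
    hits.sort_by(|a, b| b.at_unix.cmp(&a.at_unix));
    hits.truncate(limit);
    hits
}
```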

For admin RPC audit (Phase 82.10.h SQLite writer):

nexo microapp admin audit tail \
    --microapp-id agent-creator \
    --since-mins 60 \
    --format json | jq '.[] | select(.method | startswith("nexo/admin/agents/"))'

Live event firehose

Microapps that need a real-time UI (chat, dashboard) hold the transcripts_subscribe capability and receive nexo/notify/agent_event notifications on their stdio. The boot subscriber loop (Phase 82.11) handles fan-out, lag recovery, and per-microapp filtering — the microapp just reads JSON-RPC frames as they arrive. See microapps/admin-rpc.md for the wire shape.

Going to production

Ship the microapp binary alongside its plugin.toml.
Operator drops it into extensions/<id>/ and runs nexo ext install <path>.

Operator grants capabilities in extensions.yaml.entries.<id>.capabilities_grant. Common shape for a multi-tenant chat SaaS:

extensions:
  entries:
    agent-creator:
      capabilities_grant:
        - agents_crud
        - credentials_crud
        - pairing_initiate
        - llm_keys_crud
        - transcripts_read
        - transcripts_subscribe
        - operator_intervention
        - escalations_read
        - escalations_resolve

Operator runs nexo doctor capabilities to confirm every INVENTORY toggle is on.
Boot — the daemon validates the grants, spawns the microapp, threads the admin RPC dispatcher into the extension's stdio, and starts the firehose subscribe tasks for every microapp that holds the capability.

What's NOT in v0

These are framework-supported but not wired in main.rs yet (see FOLLOWUPS.md under the 82.x sections):

Pairing notifier wire — microapps poll pairing/status instead of receiving live pairing_status_changed frames.
EventForwarder does not yet thread account_id from BindingContext into live writes (the audit reader is correct; the writer always emits None today).
escalate_to_human built-in tool registration in ToolRegistry — microapps that want escalations today have to call the admin RPC directly.
processing_state_changed / escalation_requested / escalation_resolved event variants on the firehose.

All of these are framework-level deferreds, not microapp-level work. They land in the same boot-order refactor that's tracked across the FOLLOWUPS entries.

Per-extension state directory (Phase 82.6)

Extensions need a stable place to put SQLite databases, vault files, and per-tenant artifacts. Phase 82.6 formalises the convention and ships a CLI helper so authors and operators agree on the path layout.

Canonical path

$NEXO_HOME/extensions/<extension-id>/state/

NEXO_HOME falls back to $HOME/.nexo when unset, then to the current working directory if even $HOME is missing (rare; covers minimal CI containers).

For an extension agent-creator on a typical install:

~/.nexo/extensions/agent-creator/state/
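The fallback chain described above can be written as a pure function. This is a sketch, not the body of `state_dir_for`: the name `resolve_state_dir` is hypothetical, inputs are passed in explicitly instead of being read from the process environment, and treating an empty variable as unset is an assumption.

```rust
use std::path::{Path, PathBuf};

/// Hypothetical model of the documented resolution order:
/// 1. `$NEXO_HOME` if set,
/// 2. `$HOME/.nexo` otherwise,
/// 3. the current working directory if `$HOME` is missing too.
/// The result is always `<base>/extensions/<extension_id>/state`.
pub fn resolve_state_dir(
    nexo_home: Option<&str>,
    home: Option<&str>,
    cwd: &Path,
    extension_id: &str,
) -> PathBuf {
    let base = match (nexo_home, home) {
        (Some(dir), _) if !dir.is_empty() => PathBuf::from(dir),
        (_, Some(dir)) if !dir.is_empty() => Path::new(dir).join(".nexo"),
        _ => cwd.to_path_buf(),
    };
    base.join("extensions").join(extension_id).join("state")
}
```

Keeping the function free of side effects mirrors the base form of the CLI helper, which resolves the path without touching the filesystem.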

CLI

# Print the path (no filesystem touch).
nexo ext state-dir agent-creator
# /home/operator/.nexo/extensions/agent-creator/state

# Create the directory if missing (idempotent).
nexo ext state-dir agent-creator --ensure

Operators pipe the output into cd, sqlite3 .backup, etc. The base form is pure path resolution — useful in scripts that want to compute paths without side effects. --ensure is the moral equivalent of mkdir -p.

Programmatic access

nexo-extensions exposes:

use nexo_extensions::{ensure_state_dir, state_dir_for};

// Compute the path without touching disk.
let path = state_dir_for("agent-creator");

// Materialise it (idempotent).
let path = ensure_state_dir("agent-creator")?;

The daemon calls ensure_state_dir when it first spawns an extension, so microapps can rely on the directory existing by the time their initialize handshake runs. The path is also exposed via the NEXO_EXTENSION_STATE_ROOT env var injected into the extension's process environment (constant EXTENSION_STATE_ROOT_ENV in the same module).

Backup procedure

The state dir is a regular filesystem location — operators back it up with the same tooling they use for other on-disk state:

# Whole-extension snapshot.
tar czf agent-creator-state-$(date +%F).tgz \
    -C "$(nexo ext state-dir agent-creator)" .

# SQLite-aware online backup (preferred for live DBs).
sqlite3 "$(nexo ext state-dir agent-creator)/db.sqlite" \
    ".backup '/var/backups/agent-creator-$(date +%F).db'"

Isolation

Each extension owns its own subtree. nexo does not enforce namespacing inside state/ — that's the extension's responsibility. v1 microapps that store per-tenant artifacts typically sub-divide as state/tenants/<tenant-id>/…. The framework treats the whole subtree as opaque.

Getting started: build a microapp in 1 hour

This page walks through the first hour of building a nexo microapp end to end. Goal: by the end of this page you have a working hello-world microapp running against a local nexo daemon, with one tool the LLM can call.

For the language-agnostic protocol spec, see contract.md. For the full Rust SDK reference, see rust.md. For a complete, shipping example — React UI + HTTP backend over the admin RPC + firehose SSE, consuming the @lordmacu/nexo-microapp-ui-react theme preset — see lordmacu/agent-creator-microapp and its write-up in the agent-creator reference microapp.

Prerequisites

✅ Rust 1.80+ (`rustup default stable`)
✅ The `template-microapp-rust/` directory (from a `git clone` of
   nexo-rs, or copied out — it depends on `nexo-microapp-sdk` from
   crates.io, so the copy builds standalone)
✅ A configured nexo daemon (one agent, one channel binding)

You don't need crates.io publish keys, npm, or a CI pipeline. Local files only.

Step 1 — copy the template (5 min)

# From your work directory (a `git clone` of nexo-rs gives you the
# template under extensions/):
cp -r /path/to/nexo-rs/extensions/template-microapp-rust ./mi-microapp
cd ./mi-microapp

# Rename inside Cargo.toml + plugin.toml + src/main.rs:
sed -i 's/template-microapp-rust/mi-microapp/g' Cargo.toml plugin.toml src/main.rs

git init && git add -A && git commit -m "scaffold from nexo template"

# Sanity-check it builds (no path-dep surgery needed — the SDK
# resolves from crates.io):
cargo build

Now you have:

mi-microapp/
├── Cargo.toml          # depends on nexo-microapp-sdk = "0.1" (crates.io)
├── plugin.toml         # capabilities + transport declaration
├── README.md           # rename checklist + porting guide
└── src/main.rs         # ~100 LOC including comments

Step 2 — write your first tool (15 min)

Open src/main.rs. Replace the greet_tool body with your domain logic:

// `Value` and `json!` come from serde_json; ToolCtx, ToolReply and
// ToolError come from the SDK imports already present in the template.
async fn buscar_cliente(args: Value, ctx: ToolCtx) -> Result<ToolReply, ToolError> {
    let phone = args
        .get("phone")
        .and_then(|v| v.as_str())
        .ok_or_else(|| ToolError::wire("phone required"))?;

    // BindingContext threads the agent + channel + account
    // (Phase 82.1) through every call.
    let agent = ctx.binding().map(|b| b.agent_id.clone()).unwrap_or_default();

    Ok(ToolReply::ok_json(json!({
        "agent": agent,
        "phone": phone,
        "found": false,
        "lead_id": null,
    })))
}

Then register the handler on the Microapp builder under its namespaced tool name:

let app = Microapp::new("mi-microapp", env!("CARGO_PKG_VERSION"))
    .with_tool("mi_microapp_buscar_cliente", buscar_cliente);

Build:

cargo build --release

The binary lands in ./target/release/mi-microapp.

Step 3 — smoke test the wire (5 min)

The microapp speaks line-delimited JSON-RPC over stdio. You can exercise it without the daemon:

echo '{"jsonrpc":"2.0","id":1,"method":"initialize","params":{}}' \
  | ./target/release/mi-microapp

Expected output (the microapp emits it as a single line of JSON; it is wrapped here for readability):

{"jsonrpc":"2.0","id":1,"result":{
  "tools":["mi_microapp_buscar_cliente"],
  "hooks":["before_message"],
  "server_info":{"name":"mi-microapp","version":"0.1.0"}
}}

tools/call works the same way:

printf '%s\n%s\n' \
  '{"jsonrpc":"2.0","id":1,"method":"initialize","params":{}}' \
  '{"jsonrpc":"2.0","id":2,"method":"tools/call","params":{"name":"mi_microapp_buscar_cliente","arguments":{"phone":"+57311"}}}' \
  | ./target/release/mi-microapp

If both calls return clean JSON, your microapp speaks the contract.

Step 4 — install into the daemon (15 min)

Copy the build artifact + plugin.toml into the daemon's extensions/ directory:

mkdir -p ~/.nexo/extensions/mi-microapp
cp target/release/mi-microapp ~/.nexo/extensions/mi-microapp/
cp plugin.toml ~/.nexo/extensions/mi-microapp/

Reference the microapp from ~/.nexo/config/extensions.yaml:

extensions:
  entries:
    mi-microapp:
      enabled: true
      capabilities_grant:
        - dispatch_outbound       # if your tools call nexo/dispatch
        # add more as your microapp needs them

Reference its tool from ~/.nexo/config/agents.yaml:

agents:
  - id: ana
    extensions: [mi-microapp]
    allowed_tools:
      - mi_microapp_buscar_cliente   # appears in the LLM tool catalogue

Restart the daemon:

nexo daemon restart
# or for dev: kill the process and re-run `nexo daemon start`

Step 5 — verify the LLM sees your tool (10 min)

Send a test message through your bound channel. The LLM should see mi_microapp_buscar_cliente in its tool catalogue and call it on relevant prompts.

Check the daemon logs:

nexo logs --tail | grep mi-microapp

You should see:

extensions: spawned mi-microapp pid=...
extensions: mi-microapp -> initialize ok
tools/call mi_microapp_buscar_cliente {"phone": "..."}

If the tool is being called but the LLM doesn't use it or present its result correctly, the tool description in the prompt is probably not rich enough. Add a description to your tool registration.

Step 6 — add per-agent config (10 min)

Different agents may need different microapp behaviour. Use Phase 83.1 (see proyecto/PHASES.md) extensions_config:

agents:
  - id: ana
    extensions: [mi-microapp]
    extensions_config:
      mi-microapp:
        regional: bogota
        api_token_env: ANA_ETB_TOKEN

  - id: maria
    extensions: [mi-microapp]
    extensions_config:
      mi-microapp:
        regional: cali
        api_token_env: MARIA_ETB_TOKEN

In your handler, the BindingContext.agent_id lets you key into a per-agent config map you build at initialize time. Until 83.1.b ships the JSON-RPC propagation, the operator can also pass the config via env vars and your microapp reads them on boot.
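One way to structure the per-agent config map mentioned above is sketched here. The type and method names are hypothetical, the two fields simply mirror the YAML example, and the optional fallback entry stands in for whatever defaults the microapp reads from env vars on boot.

```rust
use std::collections::HashMap;

/// Hypothetical per-agent settings mirroring the YAML example.
#[derive(Debug, Clone, PartialEq)]
pub struct AgentConfig {
    pub regional: String,
    pub api_token_env: String,
}

/// Map built once at initialize time and consulted on every tool call.
#[derive(Default)]
pub struct AgentConfigMap {
    by_agent: HashMap<String, AgentConfig>,
    fallback: Option<AgentConfig>,
}

impl AgentConfigMap {
    /// Start with a default used for agents that have no entry
    /// (for example, values read from env vars on boot).
    pub fn with_fallback(fallback: AgentConfig) -> Self {
        Self { by_agent: HashMap::new(), fallback: Some(fallback) }
    }

    /// Register the settings for one agent id.
    pub fn insert(&mut self, agent_id: &str, regional: &str, api_token_env: &str) {
        self.by_agent.insert(
            agent_id.to_string(),
            AgentConfig {
                regional: regional.to_string(),
                api_token_env: api_token_env.to_string(),
            },
        );
    }

    /// Look up by the agent id taken from BindingContext; falls back
    /// to the default entry when the agent has no specific config.
    pub fn for_agent(&self, agent_id: &str) -> Option<&AgentConfig> {
        self.by_agent.get(agent_id).or(self.fallback.as_ref())
    }
}
```

In a handler, the key would be the `agent_id` read from `ctx.binding()`, so a call routed through ana resolves to the bogota settings and one routed through maria to cali.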

Common patterns

Multi-tenant SaaS

You're shipping a single microapp binary that serves multiple tenants. See extensions/multi-tenant-saas.md. Key idea: every tool call carries BindingContext.account_id (Phase 82.1) — key your per-tenant SQLite tables on it.

Compliance enforcement

Drop in nexo-compliance-primitives for anti-loop, anti-manipulation, opt-out, PII redaction, rate limiting, and consent tracking. Wire each primitive into a Phase 83.3 hook that votes Block or Transform before the LLM sees the inbound.

Outbound dispatch

Need your microapp to send a WhatsApp / Telegram / email reply? Use the nexo-microapp-sdk outbound feature:

[dependencies]
nexo-microapp-sdk = { path = "...", features = ["outbound"] }

Then ctx.outbound().dispatch(...) from inside any tool handler. See extensions/stdio.md.

Troubleshooting

Symptom	Fix
`extensions: mi-microapp -> initialize timed out`	Microapp didn't reply within 30 s. Check stderr; missing tokio runtime is the most common cause.
`tool 'mi_microapp_x' not in catalogue`	Tool name missing the `<extension_id>_` prefix. Daemon enforces the namespacing.
`capability denied: dispatch_outbound`	Operator forgot to add the capability to `extensions.yaml.entries.<id>.capabilities_grant`.
`404 unknown method: hooks/before_message`	The hook name in your `with_hook(...)` call doesn't match a daemon-emitted hook. Check `crates/extensions/src/runtime/mod.rs::HOOK_NAMES`.
Build fails: `nexo-microapp-sdk = "0.1"` not found	The SDK version you requested does not resolve from crates.io (publication is tracked as Phase 83.14). Use `path = "..."` against your nexo-rs checkout.
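The namespacing row above can be checked locally before installing. The sketch below infers the prefix rule from this walkthrough's own example (extension `mi-microapp`, tool `mi_microapp_buscar_cliente`), so the hyphen-to-underscore mapping is an assumption about the daemon's behaviour, and both function names are hypothetical.

```rust
/// Derive the expected tool-name prefix from an extension id.
/// Assumption: hyphens in the id become underscores, followed by `_`.
pub fn tool_prefix(extension_id: &str) -> String {
    let mut prefix: String = extension_id
        .chars()
        .map(|c| if c == '-' { '_' } else { c })
        .collect();
    prefix.push('_');
    prefix
}

/// True when `tool` starts with the extension's prefix and has a
/// non-empty name after it.
pub fn is_namespaced(extension_id: &str, tool: &str) -> bool {
    let prefix = tool_prefix(extension_id);
    tool.len() > prefix.len() && tool.starts_with(&prefix)
}
```

Running such a check in a unit test over every name passed to `with_tool` would surface a missing prefix at build time instead of at daemon boot.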

Next steps

You have a working microapp. Now:

Read contract.md end-to-end — the wire spec is short, and every detail matters for compat.
Read rust.md for the full SDK reference.
For multi-tenant SaaS: extensions/multi-tenant-saas.md.
For compliance gating: pull in nexo-compliance-primitives and wire its primitives into your Phase 83.3 hooks.

Microapp patterns

Common shapes for nexo microapps. A microapp is a complete product that consumes nexo-rs as its agent runtime — your microapp owns the UI, the multi-tenant story, the billing; the framework runs out of view.

Microapps talk to the framework through admin RPC over NATS: provisioning tenants, configuring agents, managing knowledge bases, rotating API keys.

Pattern 1 · Single-tenant deploy

When to use · You're building an internal tool for one team or one company. Multi-tenancy is overkill.

Microapp configures one tenant at boot, never creates more. Used for: an internal sales bot, a personal AI assistant, a single-org customer-support system.

// `Quotas` is assumed to live alongside TenantSpec in the admin module.
use nexo_microapp_sdk::admin::{AdminClient, AgentSpec, Quotas, TenantSpec};

let admin = AdminClient::connect("nats://localhost:4222").await?;

// Bootstrap on first run; idempotent on subsequent boots.
admin.ensure_tenant(TenantSpec {
    id: "default".into(),
    plan: "internal".into(),
    quotas: Quotas::unlimited(),
}).await?;

admin.ensure_agent("default", AgentSpec {
    id: "ana".into(),
    persona_path: "./personas/ana.md".into(),
    channels: vec!["whatsapp:internal".into()],
    llm: "minimax-m2.5".into(),
}).await?;

Microapp's UI is a thin admin panel. Most config lives in YAML; microapp tweaks runtime knobs.

Pattern 2 · Multi-tenant SaaS

When to use · You're selling to multiple customers. Each gets isolated state, their own agents, their own KB.

Microapp creates a tenant per signup. The framework partitions state per tenant_id. Microapp owns the auth / billing / UI; framework runs the agent loop.

async fn handle_signup(req: SignupRequest, admin: &AdminClient) -> Result<TenantId> {
    let tenant_id = format!("client-{}", uuid::Uuid::new_v4());

    // Derive the quotas before `req.plan` is moved into the spec.
    let quotas = quotas_for_plan(&req.plan);
    admin.create_tenant(TenantSpec {
        id: tenant_id.clone(),
        plan: req.plan,
        quotas,
    }).await?;

    // Provision the customer's first agent.
    admin.create_agent(&tenant_id, AgentSpec {
        id: "default-agent".into(),
        persona_path: req.persona.unwrap_or_else(default_persona),
        channels: vec![],   // customer pairs channels via UI later
        llm: "minimax-m2.5".into(),
    }).await?;

    Ok(tenant_id)
}

agent-creator-microapp (the reference implementation) is built exactly this way — every signup gets a tenant, end-users build their own WhatsApp agents through a WhatsApp-Web-style UI.

→ agent-creator reference → Multi-tenant SaaS guide

Pattern 3 · BYO-UI

When to use · You're building a SaaS but want full control over the user-facing interface (custom React app, mobile app, Tauri desktop).

Microapp exposes its own HTTP / GraphQL / gRPC API. The frontend calls the microapp; the microapp calls the framework via admin RPC. The framework never serves UI directly.

// React frontend
async function pairWhatsApp(agentId: string): Promise<{ qr: string }> {
  return fetch(`/api/agents/${agentId}/whatsapp/pair`, { method: "POST" })
    .then(r => r.json());
}

// Microapp backend (Rust + Axum)
async fn pair_whatsapp(
    State(admin): State<AdminClient>,
    Path(agent_id): Path<String>,
    auth: AuthSession,  // resolves tenant_id
) -> Result<Json<PairQrResponse>, StatusCode> {
    let qr = admin
        .pair_channel(&auth.tenant_id, &agent_id, ChannelKind::Whatsapp)
        .await
        // Report a pairing failure as 502 instead of panicking the handler.
        .map_err(|_| StatusCode::BAD_GATEWAY)?;
    Ok(Json(PairQrResponse { qr }))
}

The microapp can be in any language — Rust, Python, TypeScript, PHP, Go — as long as it speaks NATS to the framework.

Pattern 4 · Knowledge-as-a-Service

When to use · Customers upload documents (PDFs, MD, URLs); your microapp ingests them into a per-tenant vector store; agents answer from the KB.

Microapp owns the upload UI + ingestion pipeline. Framework exposes vector-store admin RPC; microapp uses it to populate each tenant's KB.

async fn ingest_document(
    tenant_id: &str,
    doc: UploadedDoc,
    admin: &AdminClient,
) -> Result<()> {
    let chunks = chunk_document(&doc.content);
    for chunk in chunks {
        let embedding = embed(&chunk).await?;
        admin.vector_upsert(tenant_id, VectorRecord {
            id: uuid::Uuid::new_v4().to_string(),
            collection: "kb".into(),
            content: chunk.text,
            embedding,
            metadata: doc.metadata.clone(),
        }).await?;
    }
    Ok(())
}

Agents in that tenant query via a search_kb tool that the framework wires automatically when vector_collections: [kb] is declared in their agents.yaml.

Pattern 5 · Webhook-driven SaaS

When to use · External services (Stripe, GitHub, Shopify) push events to your SaaS; you trigger agent workflows from those events.

Microapp accepts webhooks at POST /webhook/<provider>. Each webhook becomes a RemoteTrigger published to the framework, which routes to the right agent based on tenant + provider.

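// The body extractor goes last, and the handler returns a Result so
// that `?` can be used; each failure maps to an explicit status code.
async fn stripe_webhook(
    State((admin, secret)): State<(AdminClient, String)>,
    headers: HeaderMap,
    body: Bytes,
) -> Result<StatusCode, StatusCode> {
    let event = stripe::verify_webhook(&body, &headers, &secret)
        .map_err(|_| StatusCode::BAD_REQUEST)?;
    let tenant_id = lookup_tenant_by_stripe_customer(&event.customer)
        .await
        .map_err(|_| StatusCode::NOT_FOUND)?;

    admin.publish_remote_trigger(&tenant_id, RemoteTrigger {
        kind: "stripe.charge.failed".into(),
        target_agent: "billing-bot".into(),
        payload: serde_json::to_value(&event)
            .map_err(|_| StatusCode::INTERNAL_SERVER_ERROR)?,
    })
    .await
    .map_err(|_| StatusCode::BAD_GATEWAY)?;

    Ok(StatusCode::OK)
}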

→ RemoteTrigger outbound publisher

Pattern 6 · Background workers + scheduled jobs

When to use · Microapp needs to run periodic tasks (digest emails, lead nurturing campaigns, billing reconciliation) that don't fit naturally into the agent loop.

Microapp uses its own job runner (Sidekiq / Celery / cron). When a job fires, it talks to the framework via admin RPC to dispatch the agent task.

# Microapp's celery worker
@celery.task
def daily_digest(tenant_id: str):
    admin = AdminClient.connect("nats://...")
    leads = fetch_new_leads(tenant_id)
    if not leads:
        return
    admin.dispatch_agent_task(
        tenant_id=tenant_id,
        agent_id="digest-bot",
        prompt=f"Build a 3-line summary of {len(leads)} new leads",
        context={"leads": leads},
    )

The framework ships cron_schedule tools too, but microapp-side jobs can cover what those tools can't (DB queries, third-party API calls, multi-step orchestration).

Pattern 7 · White-label deploy

When to use · You're selling the same microapp to multiple customers, each with their own branding / domain.

Microapp reads its branding (logo, name, primary color) from the tenant's config. Each tenant's domain points to the same microapp deploy with a header (X-Tenant-Slug: acme) that resolves to the right tenant.

async fn extract_tenant(headers: &HeaderMap) -> Result<TenantId> {
    let slug = headers.get("X-Tenant-Slug")
        .and_then(|v| v.to_str().ok())
        .ok_or(BadRequest)?;
    Ok(tenant_id_for_slug(slug).await?)
}

The framework's per-tenant secrets + audit logs handle the isolation; microapp handles the branding.

Pattern 8 · Hybrid (your stack + framework)

When to use · You have an existing product (Rails / Django / Laravel SaaS) and want to add agent capability without rebuilding.

Microapp keeps its existing UI / DB / auth. It only delegates the agent loop to nexo-rs. The integration is one admin RPC client in your existing backend.

// Existing Laravel SaaS adds an agent endpoint
class AgentController extends Controller
{
    public function ask(Request $req): JsonResponse
    {
        $admin = app(AdminClient::class);
        $reply = $admin->dispatchAgentTask(
            tenantId: auth()->user()->tenant_id,
            agentId: 'support-copilot',
            prompt: $req->input('message'),
        );
        return response()->json(['reply' => $reply]);
    }
}

Your existing app stays as-is; nexo-rs becomes a backend service your code calls when it needs an agent.

Choosing between patterns

If you...	Use
Build for one team / one company	Single-tenant deploy (1)
Sell to multiple customers	Multi-tenant SaaS (2)
Want a custom UI (React / mobile / Tauri)	BYO-UI (3)
Customers upload docs to query	Knowledge-as-a-Service (4)
External services push events to you	Webhook-driven (5)
Need scheduled tasks beyond cron tools	Background workers (6)
Sell to multiple resellers	White-label (7)
Have an existing SaaS to augment	Hybrid (8)

Microapp vs Extension — quick decision

If you're between Microapp and Extension:

Choose Microapp when: you own the UI, the auth, the billing, and the framework runs out of view. End-users never see nexo.
Choose Extension when: you're contributing functionality into the framework that operators install with nexo ext install. End-users may see your tool / advisor / skill output but not your code's UI.

A SaaS often combines both: a multi-tenant microapp + one or two custom extensions for the vertical.

Admin RPC

Phase 82.10 ships a bidirectional JSON-RPC layer that lets microapps perform admin operations on the daemon without leaving the existing stdio transport. Today the daemon → microapp direction is tools/call + hooks/<name>; the inverse is nexo/admin/<domain>/<method>.

A microapp with an operator UI (e.g. agent-creator-microapp) uses this surface to:

CRUD agents (agents.yaml.<id>)
Register / revoke channel credentials (many-to-many)
Initiate WhatsApp QR pairing flows
Manage LLM provider entries (llm.yaml.providers.* global, llm.yaml.tenants.<id>.providers.* per-tenant — Phase 83.8.12.5)
Approve / revoke MCP-channel servers per agent
CRUD tenants (config/tenants.yaml) for SaaS deployments hosting N companies / workspaces from one daemon (Phase 83.8.12: nexo/admin/tenants/{list,get,upsert,delete})
Force a hot-reload after batch mutations

Layered grant model

Admin RPC uses two layers of opt-in:

plugin.toml [capabilities.admin] — what the microapp needs:
```
[capabilities.admin]
required = ["agents_crud", "credentials_crud", "pairing_initiate"]
optional = ["llm_keys_crud", "channels_crud"]
```
- required — boot fails if operator did not grant.
- optional — boot OK; runtime calls return -32004 capability_not_granted until granted.

extensions.yaml.entries.<id>.capabilities_grant — what the operator allows:

extensions:
  entries:
    agent-creator:
      capabilities_grant:
        - agents_crud
        - credentials_crud
        - pairing_initiate
        # llm_keys_crud not granted → calls return -32004

Boot diff produces a CapabilityBootReport:

Diff outcome	Severity	Behaviour
Required not granted	error	Boot fails
Optional not granted	warn	Runtime returns -32004
Granted but not declared	warn	Allowed (forward-compat)
All matched	ok	No log
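
The diff table above can be sketched as a pure function over three capability lists. This is a minimal illustration, not the daemon's code: `boot_diff`, `DiffEntry`, and `Severity` are hypothetical names, and the real CapabilityBootReport type is not shown in this document. An empty result stands for the "All matched" row.

```rust
use std::collections::BTreeSet;

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Severity {
    Warn,
    Error,
}

#[derive(Debug, PartialEq, Eq)]
pub struct DiffEntry {
    pub capability: String,
    pub severity: Severity,
    pub note: &'static str,
}

/// Compare what plugin.toml declares against what the operator granted.
pub fn boot_diff(required: &[&str], optional: &[&str], granted: &[&str]) -> Vec<DiffEntry> {
    let granted: BTreeSet<&str> = granted.iter().copied().collect();
    let declared: BTreeSet<&str> = required.iter().chain(optional).copied().collect();
    let mut report = Vec::new();

    for cap in required {
        if !granted.contains(cap) {
            report.push(DiffEntry {
                capability: cap.to_string(),
                severity: Severity::Error,
                note: "required not granted: boot fails",
            });
        }
    }
    for cap in optional {
        if !granted.contains(cap) {
            report.push(DiffEntry {
                capability: cap.to_string(),
                severity: Severity::Warn,
                note: "optional not granted: runtime returns -32004",
            });
        }
    }
    for cap in &granted {
        if !declared.contains(cap) {
            report.push(DiffEntry {
                capability: cap.to_string(),
                severity: Severity::Warn,
                note: "granted but not declared: allowed",
            });
        }
    }
    report
}

/// Boot fails as soon as one entry carries error severity.
pub fn boot_fails(report: &[DiffEntry]) -> bool {
    report.iter().any(|entry| entry.severity == Severity::Error)
}
```

Keeping the diff free of I/O means the same function can back both the boot check and a dry-run report.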

Wire shape

Microapp → daemon request (over the existing stdio):

{
  "jsonrpc": "2.0",
  "id": "app:01HXXX...",
  "method": "nexo/admin/agents/list",
  "params": { "active_only": true }
}

Daemon → microapp response:

{
  "jsonrpc": "2.0",
  "id": "app:01HXXX...",
  "result": {
    "agents": [
      { "id": "ana", "active": true, "model_provider": "minimax", "bindings_count": 2 }
    ]
  }
}

ID prefix app: distinguishes microapp-initiated requests from daemon-initiated tools/call. Daemon-initiated IDs use random UUIDs without that prefix; the runtime asserts the invariant at boot.

Capability denial

When the capability gate refuses a call:

{
  "jsonrpc": "2.0",
  "id": "app:01HXXX...",
  "error": {
    "code": -32004,
    "message": "capability_not_granted",
    "data": {
      "capability": "agents_crud",
      "microapp_id": "agent-creator",
      "method": "nexo/admin/agents/upsert"
    }
  }
}

SDK side maps this to AdminError::CapabilityNotGranted { capability, method }.

Domains + methods

Method	Capability	Domain	Wraps
`nexo/admin/agents/list`	`agents_crud`	agents	yaml read
`nexo/admin/agents/get`	`agents_crud`	agents	yaml read
`nexo/admin/agents/upsert`	`agents_crud`	agents	yaml mutate + reload
`nexo/admin/agents/delete`	`agents_crud`	agents	yaml remove + reload
`nexo/admin/credentials/list`	`credentials_crud`	credentials	filesystem + yaml join
`nexo/admin/credentials/register`	`credentials_crud`	credentials	filesystem write + yaml mutate (many-to-many)
`nexo/admin/credentials/revoke`	`credentials_crud`	credentials	filesystem unlink + yaml mutate
`nexo/admin/pairing/start`	`pairing_initiate`	pairing	session_store insert + plugin trigger
`nexo/admin/pairing/status`	`pairing_initiate`	pairing	session_store read
`nexo/admin/pairing/cancel`	`pairing_initiate`	pairing	session_store mutate + notification
`nexo/admin/llm_providers/list`	`llm_keys_crud`	llm_providers	llm.yaml read
`nexo/admin/llm_providers/upsert`	`llm_keys_crud`	llm_providers	env var validation + llm.yaml mutate
`nexo/admin/llm_providers/delete`	`llm_keys_crud`	llm_providers	refuse if agent uses + llm.yaml remove
`nexo/admin/channels/list`	`channels_crud`	channels	yaml read
`nexo/admin/channels/approve`	`channels_crud`	channels	yaml mutate (idempotent)
`nexo/admin/channels/revoke`	`channels_crud`	channels	yaml mutate
`nexo/admin/channels/doctor`	`channels_crud`	channels	static yaml verdicts
`nexo/admin/reload`	`agents_crud`	meta	force Phase 18 hot-reload
`nexo/admin/llm/complete`	`llm_complete`	llm	one-shot completion (admin debugger)
`nexo/admin/agent_events/list`	`transcripts_read`	agent_events	transcript pagination
`nexo/admin/agent_events/read`	`transcripts_read`	agent_events	single transcript fetch
`nexo/admin/agent_events/search`	`transcripts_read`	agent_events	full-text search
`nexo/admin/microapp_audit/tail`	`audit_read`	audit	per-microapp audit log tail
`nexo/admin/processing/pause`	`operator_intervention`	processing	pause autonomous loop
`nexo/admin/processing/resume`	`operator_intervention`	processing	resume after pause
`nexo/admin/processing/intervention`	`operator_intervention`	processing	inject operator turn
`nexo/admin/processing/state`	`operator_intervention`	processing	read pause/intervention state
`nexo/admin/escalations/list`	`escalations_read`	escalations	pending escalation queue
`nexo/admin/escalations/resolve`	`escalations_resolve`	escalations	mark escalation handled
`nexo/admin/skills/list`	`skills_crud`	skills	filesystem walk + manifest read
`nexo/admin/skills/get`	`skills_crud`	skills	single skill manifest
`nexo/admin/skills/upsert`	`skills_crud`	skills	filesystem write + reload
`nexo/admin/skills/delete`	`skills_crud`	skills	filesystem unlink + reload
`nexo/admin/tenants/list`	`tenants_crud`	tenants	tenants.yaml read
`nexo/admin/tenants/get`	`tenants_crud`	tenants	tenants.yaml lookup
`nexo/admin/tenants/upsert`	`tenants_crud`	tenants	tenants.yaml mutate + reload
`nexo/admin/tenants/delete`	`tenants_crud`	tenants	tenants.yaml remove + reload
`nexo/admin/mcp/list`	`mcp_crud`	mcp	mcp.yaml read
`nexo/admin/mcp/get`	`mcp_crud`	mcp	mcp.yaml lookup
`nexo/admin/mcp/upsert`	`mcp_crud`	mcp	mcp.yaml mutate + reload
`nexo/admin/mcp/delete`	`mcp_crud`	mcp	mcp.yaml remove + reload
`nexo/admin/plugins/doctor`	`plugin_doctor`	plugins	discovery snapshot (manifests + capabilities)
`nexo/admin/plugins/restart`	`plugin_restart`	plugins	force-restart subprocess plugin (Phase 81.21.b.b)
`nexo/admin/memory/query`	`memory_query`	memory	LongTermMemory recall
`nexo/admin/memory/list_snapshots`	`memory_snapshot`	memory	snapshot bundle list
`nexo/admin/memory/delete_snapshot`	`memory_snapshot`	memory	snapshot bundle delete (idempotent)
`nexo/admin/memory/create_snapshot`	`memory_snapshot`	memory	capture bundle (server forces redact_secrets+admin-ui provenance)
`nexo/admin/memory/restore_snapshot`	`memory_snapshot`	memory	restore from snapshot_id (server resolves bundle path; auto_pre_snapshot=true)
`nexo/admin/secrets/write`	`secrets_write`	secrets	per-microapp secret store mutate
`nexo/admin/auth/rotate_token`	`auth_rotate`	auth	bearer + cookie HMAC rotation
`nexo/admin/whatsapp/bot/list`	`channels_crud`	whatsapp	bot enumeration
`nexo/admin/whatsapp/bot/send`	`channels_crud`	whatsapp	one-off send

Live methods: the table above lists 52 across 20 capabilities. Phase 81.21.b.b added plugin_restart (write+destructive, distinct from read-only plugin_doctor). Phase 90.x.memory-snapshot.create-restore added memory_snapshot, covering the four snapshot-bundle methods (list, create, restore, delete).

Many-to-many credentials

A single channel credential can serve N agents simultaneously:

# agents.yaml — both agents bind to the shared credential
agents:
  - id: ana
    inbound_bindings:
      - { plugin: whatsapp, instance: shared }
  - id: carlos
    inbound_bindings:
      - { plugin: whatsapp, instance: shared }

Operators rebind from either side:

Credential side — nexo/admin/credentials/register {channel, instance, agent_ids: ["ana","carlos"], payload: {...}} writes the credential file and appends {plugin: channel, instance} to each agent's inbound_bindings (skipping duplicates).
Agent side — nexo/admin/agents/upsert {id, inbound_bindings: [...]} replaces the binding list directly.

nexo/admin/credentials/revoke {channel, instance} removes the binding from every agent that was using it AND deletes the credential file.
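
The register and revoke semantics above can be sketched over an in-memory map standing in for agents.yaml. The `Agents` alias and both functions are hypothetical; skipping unknown agent ids on register is an assumption, since the document does not say how they are handled, and credential file writes are left out.

```rust
use std::collections::BTreeMap;

#[derive(Debug, Clone, PartialEq, Eq)]
pub struct Binding {
    pub plugin: String,
    pub instance: String,
}

/// agent id -> inbound_bindings; a toy stand-in for agents.yaml.
pub type Agents = BTreeMap<String, Vec<Binding>>;

/// Credential side: append the binding to each listed agent, skipping duplicates.
pub fn register(agents: &mut Agents, channel: &str, instance: &str, agent_ids: &[&str]) {
    let binding = Binding {
        plugin: channel.to_string(),
        instance: instance.to_string(),
    };
    for id in agent_ids {
        // Assumption: ids that do not exist are ignored rather than created.
        if let Some(list) = agents.get_mut(*id) {
            if !list.contains(&binding) {
                list.push(binding.clone());
            }
        }
    }
}

/// Revoke: remove the binding from every agent; returns how many agents changed.
pub fn revoke(agents: &mut Agents, channel: &str, instance: &str) -> usize {
    let mut changed = 0;
    for list in agents.values_mut() {
        let before = list.len();
        list.retain(|b| !(b.plugin == channel && b.instance == instance));
        if list.len() != before {
            changed += 1;
        }
    }
    changed
}
```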

Framework is channel-agnostic; v1 microapp UIs scope to WhatsApp only.

Channel credential persisters (Phase 82.10.n)

credentials/register does not only write the opaque credential blob: it also bridges into the per-channel plugin's runtime state (yaml accounts list, secret file, in-memory store) via the ChannelCredentialPersister trait. Channel plugins register a persister at boot; the dispatcher routes per input.channel.

Lifecycle on register (when a persister is registered):

validate_shape(payload, metadata) — synchronous, network-free shape check. Bad shape → -32602 invalid_params.
Opaque blob write (CredentialStore::write_credential).
persist(instance, payload, metadata).await — writes the per-channel runtime state. Failure leaves the opaque blob on disk so the operator can retry.
Agent bindings + reload signal (existing).
probe(instance, payload, metadata).await — best-effort connectivity check. Errors NEVER abort register; outcome is surfaced to the caller as validation.
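
The ordering of the five steps matters more than any single step, so here is a synchronous sketch of it. The real trait is async and takes a metadata argument; `Persister`, `DemoPersister`, `RegisterError`, and the `Vec<String>` blob store below are simplified, hypothetical stand-ins.

```rust
#[derive(Debug, PartialEq, Eq)]
pub struct Validation {
    pub probed: bool,
    pub reason_code: &'static str,
}

#[derive(Debug, PartialEq, Eq)]
pub enum RegisterError {
    InvalidParams(String), // surfaces as -32602 invalid_params
    PersistFailed(String), // the opaque blob stays on disk for a retry
}

pub trait Persister {
    fn validate_shape(&self, payload: &str) -> Result<(), String>;
    fn persist(&self, instance: &str, payload: &str) -> Result<(), String>;
    /// Default mirrors the documented opt-out: no probe, reason `not_probed`.
    fn probe(&self, _instance: &str, _payload: &str) -> Validation {
        Validation { probed: false, reason_code: "not_probed" }
    }
}

/// Demo persister: rejects empty payloads, optionally fails on persist.
pub struct DemoPersister {
    pub fail_persist: bool,
}

impl Persister for DemoPersister {
    fn validate_shape(&self, payload: &str) -> Result<(), String> {
        if payload.is_empty() {
            Err("payload must not be empty".to_string())
        } else {
            Ok(())
        }
    }
    fn persist(&self, _instance: &str, _payload: &str) -> Result<(), String> {
        if self.fail_persist {
            Err("runtime state write failed".to_string())
        } else {
            Ok(())
        }
    }
}

pub fn register(
    persister: &dyn Persister,
    blobs: &mut Vec<String>,
    instance: &str,
    payload: &str,
) -> Result<Validation, RegisterError> {
    // 1. Shape check before anything touches storage.
    persister.validate_shape(payload).map_err(RegisterError::InvalidParams)?;
    // 2. Opaque blob write.
    blobs.push(format!("{instance}:{payload}"));
    // 3. Per-channel runtime state; a failure here does not roll back step 2.
    persister.persist(instance, payload).map_err(RegisterError::PersistFailed)?;
    // 4. Agent bindings and the reload signal are omitted from this sketch.
    // 5. Best-effort probe: the outcome is data for the caller, never an error.
    Ok(persister.probe(instance, payload))
}
```

Because `probe` returns a plain `Validation` instead of a `Result`, a failed connectivity check cannot abort the call by construction.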

Response shape:

{
  "summary": { "channel": "telegram", "instance": "kate",
                "agent_ids": ["kate"] },
  "validation": {
    "probed": true,
    "healthy": true,
    "detail": "authenticated as @kate_bot",
    "reason_code": "ok"
  }
}

validation is null when no persister is registered for the channel (back-compat: pre-82.10.n callers see only summary-shaped data inside the wrapper).

Stable reason codes

reason_code mirrors the pattern in research/docs/auth-credential-semantics.md:

Code	Meaning
`ok`	Probe completed; channel reachable + authenticated
`unsupported_channel`	No persister registered for the channel
`invalid_payload`	Persister rejected payload shape
`invalid_metadata`	Persister rejected metadata shape
`connectivity_failed`	Network failure (DNS, TCP, timeout)
`auth_failed`	Provider rejected credentials (401, IMAP `NO`)
`tls_failed`	TLS handshake failed
`not_probed`	Persister opted out of probing (whatsapp default)

Bundled persisters

Channel	Yaml file	Secret layout	Probe
`telegram`	`<config_dir>/plugins/telegram.yaml`	`<secrets>/telegram_<instance>_token.txt` (mode 0600)	`GET https://api.telegram.org/bot<TOKEN>/getMe` (5s timeout)
`email`	`<config_dir>/plugins/email.yaml`	`<secrets>/email/<instance>.toml` (mode 0600)	TCP connect + TLS handshake to IMAP host (5s timeout)
`whatsapp`	n/a (pairing flow owns it)	n/a	`not_probed` (pairing has its own probe surface)

Telegram persister metadata fields (all optional, defaults applied):

{
  "polling": { "enabled": true, "interval_ms": 1000 },
  "allow_agents": ["kate"],
  "allowed_chat_ids": [123, 456]
}

Email persister payload + metadata shape (all required unless noted):

{
  "channel": "email",
  "instance": "ops",
  "agent_ids": ["ana"],
  "payload": {
    "address": "ops@example.com",
    "password": "..."          // OR "xoauth2_token", exactly one
  },
  "metadata": {
    "imap": { "host": "imap.example.com", "port": 993, "tls": "implicit_tls" },
    "smtp": { "host": "smtp.example.com", "port": 587, "tls": "starttls" },
    "provider": "gmail"        // optional
  }
}

Audit redaction

payload.token, payload.password, payload.xoauth2_token are replaced with "<redacted>" before the audit row's args_hash is computed. Defense-in-depth: any token / password / xoauth2_token / api_key / secret key inside metadata.* (including nested objects) is also redacted.
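
A recursive key-based redaction pass can be sketched as follows. To stay dependency-free the example uses a tiny hypothetical `Value` enum instead of serde_json, and it applies the wider metadata key list to the whole tree, which is a simplification of the payload / metadata split described above.

```rust
use std::collections::BTreeMap;

/// Minimal JSON-like value so the sketch needs no external crates.
#[derive(Debug, Clone, PartialEq)]
pub enum Value {
    Str(String),
    Num(i64),
    Obj(BTreeMap<String, Value>),
    Arr(Vec<Value>),
}

const SENSITIVE_KEYS: [&str; 5] = ["token", "password", "xoauth2_token", "api_key", "secret"];

/// Replace the value under any sensitive key with "<redacted>", at any depth.
pub fn redact(value: &mut Value) {
    match value {
        Value::Obj(map) => {
            for (key, child) in map.iter_mut() {
                if SENSITIVE_KEYS.iter().any(|k| *k == key.as_str()) {
                    *child = Value::Str("<redacted>".to_string());
                } else {
                    redact(child);
                }
            }
        }
        Value::Arr(items) => {
            for item in items.iter_mut() {
                redact(item);
            }
        }
        Value::Str(_) | Value::Num(_) => {}
    }
}
```

Running this before hashing means two calls that differ only in the secret produce the same args_hash input, and the secret never reaches the audit row.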

Adding a new channel persister

Implement ChannelCredentialPersister in crates/setup/src/persisters/<channel>.rs.
Add to nexo_setup::persisters re-exports.
Push into AdminBootstrapInputs.persisters in src/main.rs.
Document the payload + metadata schema + reason codes here.

The trait + dispatcher registry lives in nexo-core; the trait is #[async_trait] and probe has a default implementation returning not_probed so a persister can opt out.

Async pairing flow

Microapp                                   Daemon
   |--- pairing/start (agent_id, channel) ---->|
   |<-- {challenge_id, expires_at_ms, ...} ----|
   |                                            |
   | (out-of-band: channel plugin starts QR)    |
   |                                            |
   |<-- nexo/notify/pairing_status_changed -----|
   |    {challenge_id, state: "qr_ready", data: {qr_ascii, qr_png_base64}}
   |                                            |
   | (operator scans QR on phone)               |
   |                                            |
   |<-- nexo/notify/pairing_status_changed -----|
   |    {challenge_id, state: "linked", data: {device_jid}}
   |                                            |
   | (microapp calls credentials/register to    |
   |  complete the binding)                     |

Notification topic: nexo/notify/pairing_status_changed (no id field — server-pushed).

States: pending → qr_ready → awaiting_user → linked | expired | cancelled. Microapp may also poll nexo/admin/pairing/status or cancel via nexo/admin/pairing/cancel.
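
A microapp UI usually needs to know which states are terminal and which transitions to expect. The sketch below encodes the state line above; the enum name and the exact transition rule (forward along the happy path, or out to expired / cancelled from any non-terminal state) are assumptions, not the daemon's definition.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum PairingState {
    Pending,
    QrReady,
    AwaitingUser,
    Linked,
    Expired,
    Cancelled,
}

impl PairingState {
    /// Terminal states never change again, so polling can stop.
    pub fn is_terminal(self) -> bool {
        matches!(self, Self::Linked | Self::Expired | Self::Cancelled)
    }

    /// Assumed rule: one step forward, or expiry / cancellation at any time
    /// before a terminal state is reached.
    pub fn can_transition_to(self, next: PairingState) -> bool {
        use PairingState::*;
        if self.is_terminal() {
            return false;
        }
        match (self, next) {
            (Pending, QrReady) | (QrReady, AwaitingUser) | (AwaitingUser, Linked) => true,
            (_, Expired) | (_, Cancelled) => true,
            _ => false,
        }
    }
}
```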

Audit log

Every dispatched call appends one row regardless of outcome (ok / error / denied):

struct AdminAuditRow {
    microapp_id: String,
    method: String,
    capability: String,
    args_hash: String,        // SHA-256 of canonicalized params
    started_at_ms: u64,
    result: AdminAuditResult, // ok / error / denied
    error_code: Option<i32>,
    duration_ms: u64,
    tenant_id: Option<String>, // Phase 83.8.12.7; None when the call has no tenant scope
}

args_hash lets operator audit pipelines detect repeated identical calls (potential abuse) without storing PII payloads.

Two writer implementations:

InMemoryAuditWriter — default, used in tests and as a fallback when no on-disk path is configured. Resets on restart.
SqliteAdminAuditWriter (Phase 82.10.h.1) — writes the microapp_admin_audit table (idempotent CREATE TABLE IF NOT EXISTS + WAL + 3 indices on microapp_id, method, and tenant_id). sweep_retention(retention_days, max_rows) runs at boot to enforce age + cap limits via the NEXO_MICROAPP_ADMIN_AUDIT_RETENTION_DAYS / _MAX_ROWS toggles. Library-level tail(&AuditTailFilter) query (Phase 82.10.h.2) backs the nexo microapp admin audit tail CLI — format_rows_as_table and format_rows_as_json helpers ship in the same module.
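
The age + cap retention policy can be illustrated on an in-memory vector. The real sweep_retention runs against the SQLite table; the `Row` struct and the explicit `now_ms` parameter here are hypothetical, added to keep the example deterministic.

```rust
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct Row {
    pub started_at_ms: u64,
    pub method: String,
}

const MS_PER_DAY: u64 = 24 * 60 * 60 * 1000;

/// Drop rows older than `retention_days`, then keep only the newest `max_rows`.
/// Returns the number of rows removed.
pub fn sweep_retention(
    rows: &mut Vec<Row>,
    now_ms: u64,
    retention_days: u64,
    max_rows: usize,
) -> usize {
    let before = rows.len();
    let cutoff = now_ms.saturating_sub(retention_days.saturating_mul(MS_PER_DAY));
    // Age limit first.
    rows.retain(|row| row.started_at_ms >= cutoff);
    // Then the cap: sort newest first and cut the tail.
    rows.sort_by(|a, b| b.started_at_ms.cmp(&a.started_at_ms));
    rows.truncate(max_rows);
    before - rows.len()
}
```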

Phase 83.8.12.6.runtime + .b — skills resolution chain + migration

The runtime SkillLoader resolves a skill name in this order:

<root>/<tenant_id>/<name>/SKILL.md (when the agent has tenant_id set)
<root>/__global__/<name>/SKILL.md
<root>/<name>/SKILL.md (legacy pre-83.8.12.6 layout — logs a deprecation warning when used)

Per-tenant skills override the global namespace, and the global namespace fills in for tenants that don't have their own copy. The legacy fallback keeps existing deployments working without any migration; the deprecation log nudges operators toward the new layout.
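
The lookup order can be written down as a list of candidate paths. This is a sketch of the chain described above, not the SkillLoader source; the function names and the injected `exists` predicate are assumptions made so the example runs without touching the filesystem.

```rust
use std::path::{Path, PathBuf};

/// Candidate SKILL.md locations, highest priority first.
pub fn skill_candidates(root: &Path, tenant_id: Option<&str>, name: &str) -> Vec<PathBuf> {
    let mut candidates = Vec::new();
    if let Some(tenant) = tenant_id {
        // 1. Tenant-scoped copy, only when the agent has a tenant_id.
        candidates.push(root.join(tenant).join(name).join("SKILL.md"));
    }
    // 2. Global namespace.
    candidates.push(root.join("__global__").join(name).join("SKILL.md"));
    // 3. Legacy flat layout (the real loader logs a deprecation warning here).
    candidates.push(root.join(name).join("SKILL.md"));
    candidates
}

/// Return the first candidate for which `exists` reports true.
pub fn resolve_skill(
    root: &Path,
    tenant_id: Option<&str>,
    name: &str,
    exists: impl Fn(&Path) -> bool,
) -> Option<PathBuf> {
    skill_candidates(root, tenant_id, name)
        .into_iter()
        .find(|path| exists(path.as_path()))
}
```

In production the predicate would be something like `|p| p.is_file()`.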

For a clean cutover, nexo_setup::skills_migrate::migrate_legacy_skills_to_global moves every legacy <root>/<name>/SKILL.md into <root>/__global__/<name>/SKILL.md. Idempotent, leaves tenant-scoped layouts untouched, reports filename conflicts.

Phase 83.8.12.4.b — per-tenant event firehose + escalations filter

AgentEventKind::TranscriptAppended events carry the agent's tenant_id whenever the runtime knows it (agent.tenant_id from agents.yaml). The framework writer (TranscriptWriter::with_tenant_id) and reader (TranscriptReaderFs::with_tenant_id) both stamp the field on emit; firehose subscribers can filter per-tenant without a per-event lookup against agents.yaml. Untagged deployments (single-tenant) emit tenant_id: null — back compat preserved.

agent_events/list and escalations/list honour filter.tenant_id as defense-in-depth: cross-tenant queries return an empty result (no leak of existence). Agents lacking a tenant_id field in agents.yaml are excluded from any non-null tenant filter.

Phase 83.8.12.7 — per-tenant audit scope

Every audit row carries an Option<String> tenant_id that the dispatcher sniffs from params.tenant_id (string-typed only — non-string values yield None defensively). Calls that lack a tenant scope (echo, pairing/*, credentials/*) leave the column NULL so existing pre-83.8.12.7 deployments keep working. Operators can filter the tail by tenant for SaaS billing or compliance reviews:

# CLI — restrict to one tenant scope
nexo microapp admin audit tail --tenant acme --limit 100

# combine with other filters
nexo microapp admin audit tail --tenant acme --result denied --since-mins 60

// Library side (Rust): tail_for_tenant(tenant, since_ms?, limit)
let rows = writer.tail_for_tenant("acme", None, 50).await?;

Schema migrates forward-only on open(): the inline CREATE TABLE IF NOT EXISTS adds tenant_id for fresh DBs, and ALTER TABLE ... ADD COLUMN tenant_id TEXT runs idempotently on legacy DBs (the duplicate-column-name error is the green path). Existing audit rows keep NULL and are excluded from any tenant-scoped tail.

INVENTORY env toggles

Per-domain global kill switches in crates/setup/src/capabilities.rs::INVENTORY:

Env var	Default	Disable effect
`NEXO_MICROAPP_ADMIN_AGENTS_ENABLED`	`1`	All `agents/*` return `-32601`
`NEXO_MICROAPP_ADMIN_CREDENTIALS_ENABLED`	`1`	All `credentials/*` return `-32601`
`NEXO_MICROAPP_ADMIN_PAIRING_ENABLED`	`1`	All `pairing/*` return `-32601`
`NEXO_MICROAPP_ADMIN_LLM_KEYS_ENABLED`	`1`	All `llm_providers/*` return `-32601`
`NEXO_MICROAPP_ADMIN_CHANNELS_ENABLED`	`1`	All `channels/*` return `-32601`

Capability grants are the per-microapp check; INVENTORY is the operator-global kill switch (e.g. an enterprise operator disables pairing entirely while keeping agents CRUD).
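
The two checks compose into one gate per call. The sketch below is illustrative only: treating exactly "0" as disabled, and checking the kill switch before the capability grant, are both assumptions that the table above does not spell out.

```rust
/// Interpret a toggle value; an unset variable falls back to the default of "1".
/// Assumption: only the literal "0" disables the domain.
pub fn toggle_enabled(raw: Option<&str>) -> bool {
    match raw.map(str::trim) {
        Some("0") => false,
        _ => true,
    }
}

/// Read the toggle from the process environment.
pub fn domain_enabled_from_env(var: &str) -> bool {
    toggle_enabled(std::env::var(var).ok().as_deref())
}

/// Gate a call: the global kill switch answers -32601, a missing grant -32004.
pub fn gate(domain_enabled: bool, capability_granted: bool) -> Result<(), i32> {
    if !domain_enabled {
        return Err(-32601);
    }
    if !capability_granted {
        return Err(-32004);
    }
    Ok(())
}
```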

SDK side

Microapp Rust code uses the SDK's AdminClient (gated by the admin cargo feature):

[dependencies]
nexo-microapp-sdk = { version = "0.1", features = ["admin"] }

use nexo_microapp_sdk::admin::{AdminClient, AdminError};
use nexo_tool_meta::admin::agents::AgentsListFilter;

async fn list_active_agents(client: &AdminClient) -> Result<usize, AdminError> {
    let response: nexo_tool_meta::admin::agents::AgentsListResponse =
        client.call(
            "nexo/admin/agents/list",
            AgentsListFilter { active_only: true, plugin_filter: None },
        ).await?;
    Ok(response.agents.len())
}

Each call generates a fresh app:<uuid-v7> request id, registers a oneshot receiver, writes the JSON-RPC frame, and awaits the response (default 30 s timeout). Capability denial maps to the typed AdminError::CapabilityNotGranted { capability, method }.
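
The request / response correlation described above can be sketched with standard-library channels. This is not the SDK's implementation: `PendingCalls` is a hypothetical type, a counter stands in for the UUID v7, and `std::sync::mpsc` stands in for an async oneshot channel.

```rust
use std::collections::HashMap;
use std::sync::mpsc::{channel, Receiver, RecvTimeoutError, Sender};
use std::time::Duration;

/// Table of in-flight calls keyed by request id.
#[derive(Default)]
pub struct PendingCalls {
    next: u64,
    waiters: HashMap<String, Sender<String>>,
}

impl PendingCalls {
    /// Allocate an `app:`-prefixed id and a receiver for its response.
    pub fn register(&mut self) -> (String, Receiver<String>) {
        self.next += 1;
        let id = format!("app:{}", self.next); // the SDK uses a UUID v7 here
        let (tx, rx) = channel();
        self.waiters.insert(id.clone(), tx);
        (id, rx)
    }

    /// Called by the reader loop when a response frame arrives.
    /// Returns false for unknown or already-forgotten ids.
    pub fn resolve(&mut self, id: &str, response: &str) -> bool {
        match self.waiters.remove(id) {
            Some(tx) => tx.send(response.to_string()).is_ok(),
            None => false,
        }
    }

    /// Drop the waiter after a timeout so late responses are ignored.
    pub fn forget(&mut self, id: &str) {
        self.waiters.remove(id);
    }
}

/// Block until the response arrives or the timeout elapses.
pub fn wait_for(rx: &Receiver<String>, timeout: Duration) -> Result<String, RecvTimeoutError> {
    rx.recv_timeout(timeout)
}
```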

Operator identity stamping (Phase 82.10.m)

A handful of admin methods carry an operator_token_hash: String field in their wire shape — processing/{pause, resume, intervention} and escalations/resolve. The canonical list lives at nexo_tool_meta::admin::operator_stamping::OPERATOR_STAMPED_METHODS.

Microapps register a closure-based source via AdminClient::set_operator_token_hash; the SDK then transparently stamps the field on every outbound stamped call. The override is unconditional (defense-in-depth): any caller-supplied value is replaced with the value the closure returns.

use std::sync::Arc;
use arc_swap::ArcSwap;
use nexo_microapp_sdk::admin::AdminClient;

// Hot-swappable identity source: rotation updates the ArcSwap
// in place; the next stamped call re-reads it.
let live_hash = Arc::new(ArcSwap::from_pointee(
    "deadbeef0123cafe".to_string(),
));

fn install(client: &AdminClient, source: Arc<ArcSwap<String>>) {
    client.set_operator_token_hash(move || (*source.load_full()).clone());
}

The closure is invoked once per outbound stamped call, so a post-rotation pause request lands the new identity without any re-registration. Non-stamped methods (agents/list, escalations/list, etc.) pass through untouched.

This pattern replaces the legacy "HTTP middleware injection" approach where each microapp duplicated the method list locally. Single source of truth lives in nexo-tool-meta.

Production wiring

Three production adapters ship in nexo_setup::admin_adapters (Phase 82.10.h.3) — they close the cycle between core (which declares the traits) and setup (which holds the concrete yaml_patch + filesystem code):

use nexo_setup::admin_adapters::{
    AgentsYamlPatcher, FilesystemCredentialStore, LlmYamlPatcherFs,
};

let agents = AgentsYamlPatcher::new(config_dir.join("agents.yaml"));
let llm    = LlmYamlPatcherFs::new(config_dir.join("llm.yaml"));
let creds  = FilesystemCredentialStore::new(secrets_root);
let audit  = SqliteAdminAuditWriter::open(state_dir.join("admin_audit.db")).await?;

let dispatcher = AdminRpcDispatcher::new()
    .with_capabilities(capability_set)
    .with_audit_writer(audit)
    .with_agents_domain(agents.clone(), reload_signal.clone())
    .with_credentials_domain(agents, creds)
    .with_llm_providers_domain(llm);

AgentsYamlPatcher is Clone and feeds both the agents and the credentials domain (the latter mutates inbound_bindings on each agent). serde_yaml::Value ↔ serde_json::Value conversion happens inside the adapter, so trait callers stay JSON-typed (matching what microapps see on the wire).

Bootstrap helper (Phase 82.10.h.b.5)

nexo_setup::admin_bootstrap::AdminRpcBootstrap::build wraps the full wire path so operators don't hand-thread every adapter into the dispatcher:

use nexo_setup::admin_bootstrap::{AdminBootstrapInputs, AdminRpcBootstrap};

let bootstrap = AdminRpcBootstrap::build(AdminBootstrapInputs {
    config_dir: &config_dir,
    secrets_root: &secrets_root,
    audit_db: std::env::var_os("NEXO_MICROAPP_ADMIN_AUDIT_DB")
        .as_ref()
        .map(std::path::Path::new),
    extensions_cfg: &extensions_cfg,
    admin_capabilities: &per_extension_admin_caps,
    reload_signal,
})
.await?;

build returns Ok(None) when no microapp declares [capabilities.admin] so the daemon pays zero overhead in the common case. When it returns Some(bootstrap), the spawn loop threads the per-microapp AdminRouter through StdioSpawnOptions::admin_router and post-spawn binds the live outbound writer:

let opts = bootstrap
    .spawn_options_for(&extension_id, default_opts)
    .unwrap_or(default_opts);
let runtime = StdioRuntime::spawn_with(&manifest, opts).await?;
bootstrap.bind_writer(&extension_id, runtime.outbox_sender());

A periodic 30 s task prunes the in-memory pairing store.

In-memory pairing challenge store (Phase 82.10.h.b.1)

InMemoryPairingChallengeStore is a DashMap<Uuid, …> + TTL adapter — same pattern as OpenClaw's activeLogins map. read_challenge lazily flips entries past their TTL to PairingState::Expired with an operator-readable data.error, so polls converge to the terminal state without waiting for the prune cadence. Daemon restart drops in-flight challenges (the WhatsApp QR client-side expires in ~30 s anyway, so a SQLite-backed store would be wasted work).
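
Lazy expiry on read plus a periodic prune can be sketched with a plain HashMap. The real store is a DashMap keyed by Uuid; the `u64` ids, the two-variant `State`, and the explicit `now_ms` arguments are simplifications chosen so the example is deterministic.

```rust
use std::collections::HashMap;

#[derive(Debug, Clone, PartialEq, Eq)]
pub enum State {
    Pending,
    Expired { error: String },
}

pub struct Challenge {
    pub state: State,
    pub expires_at_ms: u64,
}

#[derive(Default)]
pub struct ChallengeStore {
    entries: HashMap<u64, Challenge>,
}

impl ChallengeStore {
    pub fn insert(&mut self, id: u64, now_ms: u64, ttl_ms: u64) {
        let challenge = Challenge {
            state: State::Pending,
            expires_at_ms: now_ms + ttl_ms,
        };
        self.entries.insert(id, challenge);
    }

    /// Reads flip overdue entries to Expired, so pollers see the terminal
    /// state even if the prune task has not run yet.
    pub fn read_challenge(&mut self, id: u64, now_ms: u64) -> Option<State> {
        let entry = self.entries.get_mut(&id)?;
        if now_ms >= entry.expires_at_ms && entry.state == State::Pending {
            entry.state = State::Expired {
                error: "pairing challenge expired".to_string(),
            };
        }
        Some(entry.state.clone())
    }

    /// Periodic prune: drop entries past their TTL; returns how many were removed.
    pub fn prune(&mut self, now_ms: u64) -> usize {
        let before = self.entries.len();
        self.entries.retain(|_, challenge| now_ms < challenge.expires_at_ms);
        before - self.entries.len()
    }
}
```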

Pairing notifier (deferred)

StdioPairingNotifier ships as a building block but is not yet wired into AdminRpcBootstrap. Microapps fall back to polling pairing/status until a follow-up exposes a separate notification queue independent of the response writer.

Agent events firehose (Phase 82.11)

agent_events is the cross-app surface microapps use to stream and query agent activity. v0 emits one variant — TranscriptAppended — but the wire shape is a discriminated #[non_exhaustive] enum so future kinds (batch job completion, image-gen output, custom) land non-breaking.

Backfill RPC (`nexo/admin/agent_events/*`)

nexo/admin/agent_events/list { agent_id, kind?, since_ms?, limit? } — newest-first window query, default since_ms = now - 30d, limit = 500 clamped to 1000.
nexo/admin/agent_events/read { agent_id, session_id, since_seq?, limit? } — one-scope ascending tail, exclusive since_seq (a microapp that received seq=4 live re-issues read with since_seq=4 and gets seq=5,6,7,…). Unknown scope returns events: [], NOT -32601.
nexo/admin/agent_events/search { agent_id, query, kind?, limit? } — FTS5 query over the redacted body. Backed by the existing transcripts_fts virtual table.

All three require capability transcripts_read.
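
The exclusive since_seq contract and the list limit clamp are easy to get off by one, so here is a small in-memory model. `Event`, `read_since`, and `effective_list_limit` are hypothetical names; the real methods query the transcript store.

```rust
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct Event {
    pub seq: u64,
    pub body: String,
}

/// Ascending tail of one session. `since_seq` is exclusive: passing 4
/// yields seq 5, 6, 7, and so on. An unknown scope is modelled as an
/// empty slice and simply returns an empty vector.
pub fn read_since(events: &[Event], since_seq: Option<u64>, limit: usize) -> Vec<Event> {
    let mut out: Vec<Event> = events
        .iter()
        .filter(|event| since_seq.map_or(true, |since| event.seq > since))
        .cloned()
        .collect();
    out.sort_by_key(|event| event.seq);
    out.truncate(limit);
    out
}

/// `list` limit: defaults to 500 and is clamped to 1000.
pub fn effective_list_limit(requested: Option<usize>) -> usize {
    requested.unwrap_or(500).min(1000)
}
```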

Live notifications (`nexo/notify/agent_event`)

JSON-RPC notification frame, no id:

{"jsonrpc":"2.0","method":"nexo/notify/agent_event",
 "params":{"kind":"transcript_appended","agent_id":"ana",
           "session_id":"…","seq":7,"role":"user",
           "body":"[REDACTED:phone] hello","sent_at_ms":…,
           "sender_id":"wa.55","source_plugin":"whatsapp"}}

Body is always already-redacted at emit time — the hook fires inside TranscriptWriter::append_entry AFTER the redactor (Phase 10.4) replaces secrets with [REDACTED:label]. Defense-in-depth: a microapp without transcripts_read cannot recover the raw body either.

There is no explicit subscribe RPC — AdminRpcBootstrap inspects the operator's grant matrix at boot:

Microapp granted transcripts_subscribe → receives every TranscriptAppended frame.
Microapp granted agent_events_subscribe_all → receives every kind. Reserved for audit / compliance microapps that need full visibility (v0 emits only TranscriptAppended so the two caps are equivalent today; the slot future-proofs for batch / output kinds).
Microapp without either cap → receives no frames; backfill RPC still gated on transcripts_read.
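
The grant matrix reduces to a small delivery predicate. This is a sketch under assumptions: `EventKind::Other` is a hypothetical placeholder for any future kind, and grants are modelled as plain string slices.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum EventKind {
    TranscriptAppended,
    /// Placeholder for kinds that do not exist in v0.
    Other,
}

/// Decide whether a live frame of `kind` goes to a microapp with `grants`.
pub fn should_deliver(grants: &[&str], kind: EventKind) -> bool {
    if grants.contains(&"agent_events_subscribe_all") {
        return true; // full visibility: every kind
    }
    // transcripts_subscribe covers TranscriptAppended only.
    kind == EventKind::TranscriptAppended && grants.contains(&"transcripts_subscribe")
}
```

Note that transcripts_read alone yields no live frames; it only gates the backfill RPC.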

seq discipline: per-session_id monotonic counter that advances by 1 per TranscriptAppended frame. Live + backfill agree on seq values, so a microapp that misses live frames (broadcast lag, transient stdin block) re-issues agent_events/read with since_seq = last_seen to resync.

INVENTORY toggle

NEXO_MICROAPP_AGENT_EVENTS_ENABLED (default 1). Off → broadcast emitter is replaced with a no-op AND no subscribe tasks spawn. Backfill RPC continues to work (so a microapp with transcripts_read keeps querying past sessions). Useful for hardened deployments that want only on-demand history.

Lag handling

tokio::sync::broadcast channel with default capacity 256. Subscribers that fall behind get RecvError::Lagged(n) — boot wires this as a single warn log and the receiver re-syncs to the next surviving frame. Microapps that need gap-free history call agent_events/read from last_seen_seq.

HTTP server capability (Phase 82.12)

Microapps that ship their own HTTP UI / API (meta-microapp, dashboard, settings panel) declare it in plugin.toml:

[capabilities.http_server]
port = 9001
bind = "127.0.0.1"             # default — loopback only
token_env = "AGENT_CREATOR_TOKEN"
health_path = "/healthz"        # default

Boot supervisor

HttpServerSupervisor::probe(decl) polls GET <bind>:<port><health_path> every 250 ms until 200 OK or the 30 s ready timeout. Typed errors:

Timeout { url } — no listener after 30 s.
BadStatus { url, status } — listener responds non-200.

Once probed, spawn_monitor_loop(decl) polls every 60 s. Failures log at warn and flip a watch::Receiver<bool> so nexo extension status / admin-ui can surface the live health state. Monitor handle aborts on drop.
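
The readiness probe is a poll loop with a deadline. In the sketch below the HTTP request is abstracted into a `check` closure so the example runs without a network; `probe_until_ready` and `ProbeError` are hypothetical, and reporting BadStatus only once the deadline passes (rather than on the first non-200) is an assumption.

```rust
use std::time::{Duration, Instant};

#[derive(Debug, PartialEq, Eq)]
pub enum ProbeError {
    Timeout { url: String },
    BadStatus { url: String, status: u16 },
}

/// Poll `check` every `interval` until it reports 200 or `ready_timeout` elapses.
/// `check` returns None while nothing is listening, Some(status) otherwise.
pub fn probe_until_ready(
    url: &str,
    interval: Duration,
    ready_timeout: Duration,
    mut check: impl FnMut() -> Option<u16>,
) -> Result<(), ProbeError> {
    let deadline = Instant::now() + ready_timeout;
    let mut last_status = None;
    loop {
        match check() {
            Some(200) => return Ok(()),
            Some(status) => last_status = Some(status),
            None => {}
        }
        if Instant::now() >= deadline {
            return Err(match last_status {
                // A listener answered, but never with 200.
                Some(status) => ProbeError::BadStatus { url: url.to_string(), status },
                // Nothing ever answered.
                None => ProbeError::Timeout { url: url.to_string() },
            });
        }
        std::thread::sleep(interval);
    }
}
```

With the documented values this would be called with a 250 ms interval and a 30 s timeout.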

Bind policy

bind defaults to 127.0.0.1. Anything else (0.0.0.0, public IP, …) requires the operator to flip extensions.yaml.<id>.allow_external_bind = true. The AdminRpcBootstrap::build validator checks this BEFORE spawning the extension; mismatches surface as AdminBootstrapError::ExternalBindNotAllowed { microapp_id, bind }. Defense in depth against accidentally world-exposed services.
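
The bind policy amounts to one comparison made before the extension is spawned. A minimal sketch, assuming the check is a literal match against the default address; `validate_bind` and `BindCheckError` are hypothetical names modelled on the error variant above.

```rust
pub const DEFAULT_BIND: &str = "127.0.0.1";

#[derive(Debug, PartialEq, Eq)]
pub enum BindCheckError {
    ExternalBindNotAllowed { microapp_id: String, bind: String },
}

/// Returns the effective bind address, or an error when a non-default
/// address is declared without the operator opt-in.
pub fn validate_bind(
    microapp_id: &str,
    declared_bind: Option<&str>,
    allow_external_bind: bool,
) -> Result<String, BindCheckError> {
    let bind = declared_bind.unwrap_or(DEFAULT_BIND);
    if bind == DEFAULT_BIND || allow_external_bind {
        Ok(bind.to_string())
    } else {
        Err(BindCheckError::ExternalBindNotAllowed {
            microapp_id: microapp_id.to_string(),
            bind: bind.to_string(),
        })
    }
}
```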

Shared bearer token

The microapp reads <token_env> at boot (the daemon passes it through via the initialize env block). All inbound HTTP requests must include Authorization: Bearer <token> or X-Nexo-Token: <token>. Token rotation arrives as a JSON-RPC notification — the daemon emits nexo/notify/token_rotated { old_hash, new } after the operator changes the env + reloads. Microapps compare old_hash against token_hash(<their current token>) (sha256-hex truncated to 16 chars) before swapping, so a stale notification hitting an already-restarted microapp is ignored.

INVENTORY toggle

NEXO_MICROAPP_HTTP_SERVERS_ENABLED (default 1). Off → boot supervisor skips the probe + monitor loop entirely. Microapps still spawn; the daemon just doesn't gate ready on the HTTP endpoint. Useful for hardened deployments that ban embedded HTTP servers or run them out-of-band.

Operator processing pause + intervention (Phase 82.13)

Operators sometimes need to suspend agent autonomy on a specific scope and step in manually. v0 ships chat-takeover (per-conversation pause + manual reply); the wire shape is generalised across every agent shape so future variants (batch override, event injection, image-gen output edit) plug in without breaking the surface.

Wire shapes

#[non_exhaustive]
enum ProcessingScope {
    Conversation { agent_id, channel, account_id, contact_id, mcp_channel_source? },
    AgentBinding { ... },   // reserved
    Agent { ... },          // reserved
    EventStream { ... },    // reserved
    BatchQueue { ... },     // reserved
    Custom { ... },         // forward-compat
}

#[non_exhaustive]
enum InterventionAction {
    Reply { channel, account_id, to, body, msg_kind, attachments?, reply_to_msg_id? },
    SkipItem { ... },        // reserved
    OverrideOutput { ... },  // reserved
    InjectInput { ... },     // reserved
    Custom { ... },          // forward-compat
}

#[non_exhaustive]
enum ProcessingControlState {
    AgentActive,
    PausedByOperator { scope, paused_at_ms, operator_token_hash, reason? },
}

operator_token_hash is the Phase 82.12 token_hash shape (sha256-hex truncated to 16 chars) — audits correlate without storing the cleartext bearer.

Methods

nexo/admin/processing/pause { scope, reason?, operator_token_hash } → ProcessingAck { changed, correlation_id }. Idempotent.
nexo/admin/processing/resume { scope, operator_token_hash } → ack.
nexo/admin/processing/intervention { scope, action, operator_token_hash } → ack. Rejects calls on a non-paused scope (-32004 not_paused) so operators never double-respond.
nexo/admin/processing/state { scope } → ProcessingStateResponse { state }.

All four gated on the operator_intervention capability. Per-scope sub-gates (operator_intervention_conversation, _batch, …) are a future-proofing slot.

v0 surface

Only the Conversation + Reply combination routes end-to-end. Non-v0 scopes / actions surface as -32601 not_implemented so callers can probe the wire shape today without the daemon pretending to support unimplemented shapes.

Notification (Phase 82.13.b.firehose)

Pause and resume transitions are emitted on the agent event firehose (nexo/notify/agent_event) as AgentEventKind::ProcessingStateChanged. Operator UIs render the pause indicator in real time without polling processing/state. The constant PROCESSING_STATE_CHANGED_NOTIFY_METHOD is reserved for any future dedicated subject; today the variant rides on the same firehose channel as every other agent event.

{
    "jsonrpc": "2.0",
    "method": "nexo/notify/agent_event",
    "params": {
        "kind": "processing_state_changed",
        "agent_id": "ana",
        "scope": { "kind": "conversation", ... },
        "prev_state": { "state": "agent_active" },
        "new_state": {
            "state": "paused_by_operator",
            "scope": { "kind": "conversation", ... },
            "paused_at_ms": 1700000000000,
            "operator_token_hash": "abcdef0123456789",
            "reason": "investigating"
        },
        "at_ms": 1700000000000
    }
}

Idempotent retries (a second pause on an already-paused scope, a resume on agent_active) skip the emit so subscribers do not see phantom transitions. Reply intervention does NOT emit ProcessingStateChanged — state stays paused; the TranscriptAppended emit on the operator stamp signals operator activity instead.
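The emit-only-on-real-transition rule can be sketched with a toy in-memory store. Everything here is hypothetical (`ProcessingStore`, string scope keys, a list standing in for the firehose); it only illustrates that a retry returns `changed = False` and produces no event.

```python
from dataclasses import dataclass, field


@dataclass
class ProcessingStore:
    """Toy store: scope key -> paused flag. Not the daemon's ProcessingControlStore."""
    paused: dict = field(default_factory=dict)
    emitted: list = field(default_factory=list)  # stands in for the firehose

    def _emit(self, scope: str, prev: str, new: str) -> None:
        self.emitted.append({
            "kind": "processing_state_changed",
            "scope": scope,
            "prev_state": prev,
            "new_state": new,
        })

    def pause(self, scope: str) -> bool:
        if self.paused.get(scope):
            return False  # idempotent retry: nothing changed, nothing emitted
        self.paused[scope] = True
        self._emit(scope, "agent_active", "paused_by_operator")
        return True

    def resume(self, scope: str) -> bool:
        if not self.paused.get(scope):
            return False  # resume on an active scope is also a no-op
        self.paused[scope] = False
        self._emit(scope, "paused_by_operator", "agent_active")
        return True
```

The boolean return value plays the role of `ProcessingAck.changed`, so a caller can tell a real transition from a retry without subscribing to events.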

Transcript stamping (Phase 82.13.b.1)

When the operator dispatches a reply via nexo/admin/processing/intervention, the daemon optionally stamps the reply onto the agent transcript so the agent sees it on its next turn (after resume). To opt in, the microapp passes the active session_id in the params:

{
    "method": "nexo/admin/processing/intervention",
    "params": {
        "scope": { "kind": "conversation", "agent_id": "ana", ... },
        "action": {
            "kind": "reply",
            "channel": "whatsapp",
            "account_id": "wa.0",
            "to": "wa.55",
            "body": "ya te resuelvo, dame 1 minuto",
            "msg_kind": "text"
        },
        "operator_token_hash": "abcdef0123456789",
        "session_id": "33333333-3333-4333-8333-333333333333"
    }
}

After the channel send acks, the daemon appends one entry to the session transcript:

Field	Value
`role`	`Assistant` (so the agent reads it as natural continuity on its next turn)
`content`	The reply body, run through the standard redactor
`source_plugin`	`intervention:<channel>` (e.g. `intervention:whatsapp`) — distinguishes operator stand-in from native LLM output
`sender_id`	`operator:<token_hash>` — identifies the operator without exposing PII
`message_id`	Channel-side provider id when the plugin acked one

The entry goes through the same redactor, FTS index, and Phase 82.11 firehose pipeline as native agent appends, so subscribers of nexo/notify/agent_event see the operator's reply with the discriminator above.

The ack includes a transcript_stamped hint:

Value	Meaning
`Some(true)`	Reply persisted on transcript. Agent will see it on next turn.
`Some(false)`	Channel send happened, transcript was NOT modified. Either no `session_id` in params, no transcript appender wired in boot, or persistence failed (logged).
`None` (omitted)	Field not applicable (e.g. for non-Reply interventions).

When transcript_stamped is false and the operator UI knows the active session, prompt the operator to reopen the conversation and retry; otherwise the agent resumes "blind", without seeing what was said during takeover.

The SDK helper threads this through fluently:

use nexo_microapp_sdk::admin::{HumanTakeover, SendReplyArgs};

let takeover = HumanTakeover::engage(&admin, scope, token_hash, None).await?;
takeover
    .send_reply(
        "whatsapp",
        "wa.0",
        "wa.55",
        SendReplyArgs::text("ya te resuelvo")
            .with_session(active_session_id),
    )
    .await?;
takeover.release(None).await?;

Operator summary on resume (Phase 82.13.b.2)

The operator can hand the agent a free-text summary of what happened during takeover. The daemon stamps it as a System transcript entry just after the resume flip, so the agent reads it as a system directive on its next turn:

{
    "method": "nexo/admin/processing/resume",
    "params": {
        "scope": { "kind": "conversation", "agent_id": "ana", ... },
        "operator_token_hash": "abcdef0123456789",
        "session_id": "33333333-3333-4333-8333-333333333333",
        "summary_for_agent": "cliente confirmó dirección, IA puede continuar con confirmación de envío"
    }
}

The stamped entry shape:

Field	Value
`role`	`System`
`content`	`[operator_summary] <body>` (body trimmed; prefix added server-side)
`source_plugin`	`intervention:summary`
`sender_id`	`operator:<token_hash>`
`message_id`	`None`

Validation (handler-side, all -32602 invalid_params):

Code	When
`session_id_required_with_summary`	`summary_for_agent` set but `session_id` missing
`empty_summary`	summary trims to zero length
`summary_too_long`	summary > 4096 chars (matches `TranscriptsIndex` FTS5 doc cap)

Validation runs BEFORE the state flip, so a rejected call keeps the scope paused. Stamping itself is best-effort — appender errors leave the scope AgentActive (resume still succeeds) and surface only via ack.transcript_stamped: Some(false).
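A minimal sketch of that validation, assuming the checks run in the order the table lists them and that the length limit applies to the trimmed summary (neither is confirmed above). `validate_resume_params` and `stamped_content` are illustrative names, not daemon functions.

```python
from typing import Optional

SUMMARY_MAX_CHARS = 4096  # limit taken from the table above


def validate_resume_params(params: dict) -> Optional[str]:
    """Return the rejection code, or None when the resume call may proceed."""
    summary = params.get("summary_for_agent")
    if summary is None:
        return None  # no summary: nothing to validate
    if not params.get("session_id"):
        return "session_id_required_with_summary"
    trimmed = summary.strip()
    if not trimmed:
        return "empty_summary"
    if len(trimmed) > SUMMARY_MAX_CHARS:  # assumption: measured after trimming
        return "summary_too_long"
    return None


def stamped_content(summary: str) -> str:
    """Content of the System entry: trimmed body with the server-side prefix."""
    return f"[operator_summary] {summary.strip()}"
```

A handler following this sketch would return -32602 with the code as detail and leave the scope paused whenever the function returns a non-None value.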

The SDK helper takes the summary on release() after pinning the session via with_session():

let takeover = HumanTakeover::engage(&admin, scope, token_hash, None)
    .await?
    .with_session(active_session_id);
// ... operator types replies via takeover.send_reply ...
takeover
    .release(Some(
        "cliente confirmó dirección, IA puede continuar con envío".into(),
    ))
    .await?;

The pinned session is reused by both send_reply (transcript stamping) and release (summary injection) — set once, forget. Per-call SendReplyArgs.with_session() overrides the pinned one when both are present.

Pending inbounds during pause (Phase 82.13.b.3)

While a scope is PausedByOperator, inbound user messages arriving on the channel are buffered server-side instead of firing an agent turn. On resume, the buffer is drained and each inbound is stamped on the agent transcript as a User entry with its ORIGINAL timestamp — so the agent reads real chronology of what the customer said during takeover.

Field	Value
`role`	`User`
`content`	Original (already-redacted) inbound body
`source_plugin`	Channel that produced the inbound (`whatsapp`, etc.)
`sender_id`	Counterparty id (e.g. WA jid)
`message_id`	Channel-side provider id when present

The cap is configured via NEXO_PROCESSING_PENDING_QUEUE_CAP (default 50, set to 0 to disable buffering entirely). When the cap is exceeded, the OLDEST entry is evicted FIFO and an AgentEventKind::PendingInboundsDropped firehose event fires so operator UIs can surface the drop.

// Firehose frame on cap-exceeded eviction:
{
    "jsonrpc": "2.0",
    "method": "nexo/notify/agent_event",
    "params": {
        "kind": "pending_inbounds_dropped",
        "agent_id": "ana",
        "scope": { "kind": "conversation", "agent_id": "ana", ... },
        "dropped": 1,
        "at_ms": 1700000000000
    }
}

ProcessingAck.drained_pending: Some(N) on the resume call reports how many entries were drained — None when the queue was empty (no field on the wire). Operator UIs render "replay: 3 messages" so the operator knows what the agent will see on its next turn.
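The buffering rules above (bounded queue, FIFO eviction, a drop event per eviction, `drained_pending` omitted when empty) can be sketched as follows. `PendingQueue` is a hypothetical stand-in for the per-scope queue; with a cap of 0 the sketch simply refuses to buffer, and what the daemon does with the inbound in that case is not specified above.

```python
from collections import deque


class PendingQueue:
    """Toy per-scope buffer for inbounds received while paused."""

    def __init__(self, cap: int = 50):
        self.cap = cap
        self.items = deque()
        self.dropped_events = []  # stands in for the firehose

    def push(self, inbound: dict) -> bool:
        """Buffer one inbound; returns False when buffering is disabled (cap 0)."""
        if self.cap == 0:
            return False
        self.items.append(inbound)
        if len(self.items) > self.cap:
            self.items.popleft()  # evict the OLDEST entry
            self.dropped_events.append(
                {"kind": "pending_inbounds_dropped", "dropped": 1}
            )
        return True

    def drain(self):
        """Return (entries oldest-first, drained_pending); the count is None when empty."""
        drained = list(self.items)
        self.items.clear()
        return drained, (len(drained) or None)
```

Draining oldest-first keeps the original chronology, which is what lets the resume path stamp each entry with its own timestamp.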

Round-trip end-to-end (Phase 82.13.c, 2026-05-02): the inbound dispatcher push hook now lives in runtime.rs, gated on a shared Arc<dyn ProcessingControlStore> that boot wires to both the admin RPC dispatcher and every AgentRuntime. When the operator pauses via nexo/admin/processing/pause, the very next inbound channel message is buffered onto the per-scope queue (cap = NEXO_PROCESSING_PENDING_QUEUE_CAP, default 50, FIFO eviction). The body is redacted at push time so the queue never holds raw PII. Resume drains the queue onto the transcript as User entries with their original timestamps, so the agent resumes coherently with the full chronology.

Smoke recipe (manual end-to-end):

# 1. Pause a conversation via admin RPC.
curl -X POST localhost:.../admin -d '{
    "method": "nexo/admin/processing/pause",
    "params": {
        "scope": { "kind": "conversation", "agent_id": "ana",
                   "channel": "whatsapp", "account_id": "wa.0",
                   "contact_id": "wa.55" },
        "operator_token_hash": "..."
    }
}'

# 2. Send 3 WhatsApp inbounds while paused.
#    The agent does NOT reply (intake hook buffers them).

# 3. Resume with optional summary.
curl -X POST localhost:.../admin -d '{
    "method": "nexo/admin/processing/resume",
    "params": {
        "scope": { ... },
        "session_id": "...",
        "summary_for_agent": "cliente confirmó dirección",
        "operator_token_hash": "..."
    }
}'

# 4. Verify the transcript JSONL contains 3 fresh `User`
#    entries with their ORIGINAL timestamps (not now()),
#    plus a `[operator_summary] cliente confirmó dirección`
#    System entry just after the resume.

# 5. Send 1 more WhatsApp inbound → agent replies normally,
#    seeing the 3 buffered + 1 fresh user messages on its
#    next turn.

Boot activation still depends on src/main.rs building the AdminRpcBootstrap (deferred follow-up — same boot-order refactor that gates the rest of the admin RPC surface). Until then, the pause check + buffer infra exist but are dormant in production. Once that lands, this round-trip works without any further changes.

Agent escalations (Phase 82.14)

Cross-app primitive for the "I need help here" channel: agents flag work items they cannot complete autonomously, operators see a list and dismiss / take over. v0 ships the admin RPC surface (read + resolve) plus the auto-resolve hook on processing/pause; the escalate_to_human built-in tool that raises new escalations is deferred to 82.14.b.

Wire shapes

enum EscalationReason {
    OutOfScope, MissingData, NeedsHumanJudgment,
    Complaint, Error, Ambiguity, PolicyViolation, Other,
}
enum EscalationUrgency { Low, Normal, High }

#[non_exhaustive]
enum ResolvedBy {
    OperatorTakeover,
    OperatorDismissed { reason: String },
    AgentResolved,
}

#[non_exhaustive]
enum EscalationState {
    None,
    Pending {
        scope: ProcessingScope,   // 82.13 enum
        summary, reason, urgency,
        context: BTreeMap<String, Value>,
        requested_at_ms,
    },
    Resolved { scope, resolved_at_ms, by },
}

context is free-form per agent shape: chat agents emit {"question": …, "customer_phone": …}, batch agents emit {"job_id": …, "invalid_rows": 47}, image-gen emits {"prompt": …, "policy": "nudity"}. Keeps the schema stable while letting each agent surface meaningful detail.

Methods

nexo/admin/escalations/list { filter (default pending), agent_id?, scope_kind?, limit } → EscalationsListResponse { entries }. Newest-first by requested_at_ms / resolved_at_ms; default cap 100, max 1000.
nexo/admin/escalations/resolve { scope, by, dismiss_reason?, operator_token_hash } → EscalationsResolveResponse { changed, correlation_id }. by = "dismissed" requires a dismiss_reason; by = "takeover" is the same outcome the auto-resolve hook produces.

Two granular capabilities:

escalations_read — gates list. Read-only dashboards hold this.
escalations_resolve — gates resolve. Strictly stronger grant for operator UIs that act on escalations.

Auto-resolve on pause

When nexo/admin/processing/pause fires on a scope with a matching Pending escalation AND both the processing + escalation stores are wired, the dispatcher auto-flips the escalation to Resolved { OperatorTakeover } BEFORE applying the pause. Failures in the auto-resolve path log at warn and never block the pause itself — operator intent (pause) takes priority over side-effects.

Notification literals

escalation_requested and escalation_resolved are pinned as pub const in the wire crate; the emit site lands in 82.14.b alongside the escalate_to_human built-in tool + the BindingContext→scope derivation.

Limitations

Bidirectional flow over single stdio: app: ID prefix disambiguates microapp-initiated requests from daemon-initiated ones. Daemon must not use app: prefix for its own request IDs.
Audit log writer choice: InMemoryAuditWriter resets on daemon restart; pick SqliteAdminAuditWriter::open(path) for durable retention + the boot-time sweep_retention() sweeper.
channels/doctor static-only: live MCP probe stays in nexo channel doctor --runtime CLI.
Live operator approval: every grant is yaml-static. v1 has no ask interactive flow (deferred to 82.10.i).

Microapp contract (Phase 83.6)

This page is the language-agnostic specification for what makes a program a nexo microapp. Every microapp — whether built with the Rust SDK, hand-written in Python, or shipped as a Go binary — implements the wire protocol below. If your code passes this contract, the daemon will load it.

Companion pages:

Building microapps in Rust — the Rust SDK shortcut that hides the wire details when you don't need them.
Admin RPC — the operator surface for managing agents/credentials/pairing/transcripts from inside a microapp.

Wire protocol overview

A microapp is a child process the daemon launches once at boot and keeps alive across multiple agent turns. Communication is line-delimited JSON-RPC 2.0 over stdio:

stdin (daemon → microapp): one JSON-RPC frame per line, UTF-8.
stdout (microapp → daemon): same shape; mixed responses + notifications + outbound requests.
stderr: free-form log lines forwarded to the daemon's tracing subscriber. Microapps SHOULD prefix log lines with [INFO], [WARN], [ERROR] so the daemon can map them.

Every JSON-RPC frame is exactly one line (no embedded newlines in the JSON). The daemon's reader splits on \n. A microapp MUST flush stdout after every frame.

Framing rules

Direction	Shape	Notes
Daemon → microapp request	`{"jsonrpc":"2.0","id":<int>,"method":...,"params":...}`	Numeric id (incrementing).
Microapp → daemon response	`{"jsonrpc":"2.0","id":<int>,"result":...}` or `{...,"error":{"code":...,"message":...}}`	id MUST echo the request's.
Microapp → daemon outbound request	`{"jsonrpc":"2.0","id":"app:<uuid>","method":...,"params":...}`	id MUST start with `"app:"` to disambiguate from daemon-initiated.
Daemon → microapp response to outbound	`{"jsonrpc":"2.0","id":"app:<uuid>","result":...}`	Echoes the microapp's id.
Either direction notification	`{"jsonrpc":"2.0","method":...,"params":...}` (no `id`)	Fire-and-forget; never gets a response.
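Seen from the microapp's side of stdin, the table above reduces to three frame kinds. The sketch below classifies an incoming line and builds outbound frames; `classify`, `encode_frame`, and `new_outbound_id` are hypothetical helper names, and the use of a random UUID4 for the outbound id is an assumption (the contract only requires the `app:` prefix).

```python
import json
import uuid


def encode_frame(frame: dict) -> str:
    """Serialize one frame as exactly one line.

    json.dumps escapes newlines inside strings, so the only raw newline
    is the terminator appended here.
    """
    return json.dumps(frame, separators=(",", ":")) + "\n"


def new_outbound_id() -> str:
    """Build an id for a microapp-initiated request (assumed: random UUID4)."""
    return f"app:{uuid.uuid4()}"


def classify(line: str) -> str:
    """Classify a frame read from stdin (daemon -> microapp)."""
    frame = json.loads(line)
    if "method" in frame:
        # A method with an id expects a response; without one it is fire-and-forget.
        return "request" if "id" in frame else "notification"
    frame_id = frame.get("id")
    if isinstance(frame_id, str) and frame_id.startswith("app:"):
        return "outbound_response"  # the daemon answering one of our requests
    return "unexpected"
```

A read loop can switch on the returned kind: answer requests, ignore or handle notifications silently, and route outbound responses to whichever call is waiting on that id.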

Methods (daemon → microapp)

These are the methods the daemon will call on your microapp. Implement them all. Methods not in this list are reserved for future versions; respond with error code -32601 (method not found) for forward-compat.

`initialize`

Called once per microapp lifetime, immediately after spawn. Returns the microapp's tool catalogue + declared capabilities.

{"method":"initialize","params":{
  "extension_id":"agent-creator",
  "state_dir":"/path/to/.nexo/extensions/agent-creator/state",
  "config":{"...microapp-specific config from extensions.yaml..."}
}}

Result:

{
  "tools":[
    {"name":"agent_creator_create","description":"...","input_schema":{...}}
  ],
  "version":"0.1.0"
}

`tools/list`

Re-queried on every binding refresh. Same return shape as initialize.tools. Microapps SHOULD return identical bytes across calls so the daemon's tool-cache prefix matcher stays warm.

`tools/call`

The core agent-loop entry point. Carries the effective BindingContext (the agent / channel / account triple) and the LLM's tool-call args.

{"method":"tools/call","params":{
  "tool":"agent_creator_create",
  "args":{"name":"alice"},
  "binding_context":{...},
  "inbound":{...}
}}

Result {"output":<JSON>} (success) or {"error":"description"} (microapp-side failure — distinct from JSON-RPC error which signals a protocol-level fault).
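One way to keep the two failure channels apart is sketched below. It reads the paragraph above as "a microapp-side failure travels inside `result`", and it picks -32602 for a tool that was never advertised; both the helper name `handle_tools_call` and that code choice are assumptions of this sketch.

```python
def handle_tools_call(req: dict, handlers: dict) -> dict:
    """Build the response frame for one tools/call request."""
    rid = req["id"]
    params = req.get("params") or {}
    tool = params.get("tool")
    if tool not in handlers:
        # Protocol-level fault: reported through the JSON-RPC error member.
        return {"jsonrpc": "2.0", "id": rid,
                "error": {"code": -32602, "message": f"unknown tool: {tool}"}}
    try:
        output = handlers[tool](params.get("args") or {})
        result = {"output": output}
    except Exception as exc:
        # Microapp-side failure: still a successful JSON-RPC exchange.
        result = {"error": str(exc)}
    return {"jsonrpc": "2.0", "id": rid, "result": result}
```

With this split, a handler that raises never tears down the protocol exchange; the failure is carried in the result payload instead.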

`agents/updated`

Notification (no id). Fired when the daemon's agents.yaml hot-reload picked up a change that affects this microapp's binding surface. Payload includes the new agent IDs visible to this microapp.

`hooks/<name>`

Called when the daemon dispatches a hook the microapp registered during initialize (Phase 83.3). Reply with a HookDecision.

`shutdown`

Called once before the daemon SIGTERMs the process. Microapps should flush state and reply with {"ok":true} within 5 s. The daemon will SIGKILL after 10 s regardless.

Methods (microapp → daemon)

Outbound calls — capability-gated. The operator's extensions.yaml lists which capabilities this microapp may use.

`nexo/dispatch`

Phase 82.3. Send an outbound message via a channel plugin (e.g. WhatsApp). Requires dispatch_outbound capability.

{"id":"app:<uuid>","method":"nexo/dispatch","params":{
  "to":"+573000000000",
  "channel":"whatsapp",
  "body":"Hello"
}}

`nexo/admin/*`

Phase 82.10. Operator-surface admin RPC: agents CRUD, credentials, pairing, LLM keys, channels. Each method is gated by a separate capability (agents_crud, credentials_crud, pairing_initiate, llm_keys_crud, channels_crud). See admin-rpc.md for the full surface.

Notifications (daemon → microapp)

Fire-and-forget messages the daemon pushes when an event lands. Microapps subscribe by holding the matching capability.

Method	Capability	Payload	Phase
`nexo/notify/transcript_appended`	`transcripts_subscribe`	`{session_id, role, body, ts_ms}`	82.11
`nexo/notify/pairing_status_changed`	`pairing_initiate`	`{channel, instance, status}`	82.10
`nexo/notify/token_rotated`	`credentials_crud`	`{old_hash, new}`	82.12
`nexo/notify/agent_event`	`transcripts_subscribe`	`{kind, agent_id, payload}`	82.11

Shapes

Binding context

Phase 82.1. Every tools/call carries this triple so the microapp knows which agent / channel / account fired the tool.

{
  "binding_context":{
    "agent_id":"ana",
    "channel":"whatsapp",
    "account_id":"acme",
    "binding_id":"whatsapp:acme",
    "binding_index":0
  }
}

account_id is the multi-tenant key. Multi-tenant SaaS microapps key their per-tenant SQLite tables on this field. See multi-tenant SaaS walkthrough.

Inbound message reference

Phase 82.5. Carries the original inbound message metadata (sender, timestamp, kind) so a tool handler can correlate to the trigger.

{
  "inbound":{
    "kind":"whatsapp_message",
    "from":"+573000000000",
    "ts_ms":1735689600000,
    "session_id":"..."
  }
}

Extension config

Loaded from extensions.yaml.entries.<id>.config and threaded through initialize.params.config. Opaque to the daemon — microapps validate their own schema (Phase 83.17 will add boot-time schema validation as opt-in).

Hook decision

Phase 83.3. The microapp's vote on whether a hook should proceed.

{"vote":"allow|deny|abstain","reason":"...","metadata":{...}}

abstain is the default — microapps that don't know about a particular hook should abstain rather than vote.
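A small dispatcher makes the abstain-by-default rule concrete. The registry shape and the name `decide` are hypothetical; only the `vote` / `reason` / `metadata` keys come from the shape above.

```python
def decide(hook_name: str, payload: dict, registered: dict) -> dict:
    """Return a HookDecision, abstaining when no handler knows the hook."""
    handler = registered.get(hook_name)
    if handler is None:
        return {
            "vote": "abstain",
            "reason": f"no handler for {hook_name}",
            "metadata": {},
        }
    return handler(payload)


def allow_everything(payload: dict) -> dict:
    """Example handler that always lets the hook proceed."""
    return {"vote": "allow", "reason": "no objection", "metadata": {}}
```

Usage: `decide("before_message", payload, {"before_message": allow_everything})` votes allow, while any hook name missing from the registry yields an abstain.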

Tool call request / response

Already shown above under tools/call. The output field on success is opaque JSON; the LLM sees its stringified form.

Error envelope

JSON-RPC error field follows the standard:

{"code":-32000,"message":"...","data":{"...optional structured info..."}}

The range -32000 to -32099 is reserved for nexo. Codes below -32099 and the standard JSON-RPC codes (-32700 parse error, -32600 invalid request, -32601 method not found, -32602 invalid params, -32603 internal error) keep the meaning defined by the JSON-RPC 2.0 specification.

Conventions

Tool name namespacing

Tools MUST be prefixed with the extension id followed by an underscore: <extension_id>_<tool>. Examples:

✅ agent_creator_create
✅ acme_billing_charge
❌ create (unprefixed)
❌ agent-creator/create (wrong separator)

The daemon validates the prefix on every initialize / tools/list and rejects unprefixed tools so the LLM never sees two microapps' send tools competing.
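A microapp can run the same check on itself before shipping. The sketch assumes hyphens in the extension id map to underscores in the prefix, which is inferred from the `agent-creator` / `agent_creator_create` example and not stated as a rule; `invalid_tools` is a hypothetical helper.

```python
def expected_prefix(extension_id: str) -> str:
    """Prefix every tool must carry (assumption: '-' becomes '_')."""
    return extension_id.replace("-", "_") + "_"


def invalid_tools(extension_id: str, tool_names: list) -> list:
    """Return the tool names that would fail the prefix rule."""
    prefix = expected_prefix(extension_id)
    return [
        name for name in tool_names
        # A bare prefix with nothing after it is not a usable tool name either.
        if not name.startswith(prefix) or len(name) == len(prefix)
    ]
```

Running this over the `tools` array in a unit test catches an unprefixed tool before it reaches the daemon.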

Reserved JSON-RPC error codes

-32000 to -32099 are reserved. Common codes microapps SHOULD emit:

Code	Meaning
`-32000`	Capability not granted
`-32001`	Tool input failed schema validation
`-32002`	Backend service unavailable
`-32003`	Rate limit (the microapp's own per-tool limit)
`-32004`	Auth error talking to the microapp's external service
`-32099`	Microapp internal error (catchall)
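The table translates naturally into named constants plus a helper that builds the error envelope. The constant names and `error_frame` are hypothetical; the numeric values and the envelope keys are the ones documented above.

```python
CAPABILITY_NOT_GRANTED = -32000
SCHEMA_VALIDATION_FAILED = -32001
BACKEND_UNAVAILABLE = -32002
RATE_LIMITED = -32003
EXTERNAL_AUTH_ERROR = -32004
INTERNAL_ERROR_CATCHALL = -32099


def is_nexo_reserved(code: int) -> bool:
    """True when the code falls in the -32000..-32099 range."""
    return -32099 <= code <= -32000


def error_frame(req_id, code: int, message: str, data=None) -> dict:
    """Build a JSON-RPC error response; `data` is attached only when given."""
    error = {"code": code, "message": message}
    if data is not None:
        error["data"] = data
    return {"jsonrpc": "2.0", "id": req_id, "error": error}
```

For example, `error_frame(12, RATE_LIMITED, "too many calls", {"retry_after_s": 5})` yields a frame whose structured `data` a caller can inspect without parsing the message text.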

Timeouts

The daemon's default per-call timeout is 30 seconds. extensions.yaml.entries.<id>.timeout_secs overrides per microapp. A timeout closes the in-flight call but leaves the process alive; the daemon will retry the next call normally.

Backward compatibility

The contract evolves under these rules:

Additive fields always. New fields on existing shapes appear behind #[serde(default)] (Rust) / "missing key is default" (other langs). Microapps MUST NOT reject unknown fields.
Deprecation requires N + N+1. To remove a method or field, the daemon emits a tracing::warn! + admin-ui notice in release N. The actual removal lands in N+1.
Capability matrix grows monotonically. New capabilities default to false for existing microapps; old capabilities never silently change semantics.
Wire format MUST stay UTF-8 line-JSON. A switch to length-prefixed framing or binary protocol would be a breaking change requiring an explicit major-version bump coordinated with all SDK languages.

Worked example: Python hello-world

A volunteer should be able to ship a working microapp in Python using only this doc and the standard library:

#!/usr/bin/env python3
import json
import sys

# Single catalogue so initialize and tools/list return identical tools.
TOOLS = [{
    "name": "hello_world_greet",
    "description": "Echo a greeting",
    "input_schema": {"type": "object", "properties": {
        "name": {"type": "string"}
    }, "required": ["name"]}
}]

def send(frame):
    # One frame per line, flushed immediately (see the framing rules).
    sys.stdout.write(json.dumps(frame) + "\n")
    sys.stdout.flush()

def respond(req_id, result):
    send({"jsonrpc": "2.0", "id": req_id, "result": result})

for line in sys.stdin:
    req = json.loads(line)
    if "id" not in req:
        continue  # notification (e.g. agents/updated): never answered
    rid = req["id"]
    method = req["method"]
    if method == "initialize":
        respond(rid, {"tools": TOOLS, "version": "0.1.0"})
    elif method == "tools/call":
        name = req["params"]["args"]["name"]
        respond(rid, {"output": {"greeting": f"hello, {name}"}})
    elif method == "tools/list":
        respond(rid, {"tools": TOOLS})
    elif method == "shutdown":
        respond(rid, {"ok": True})
        break
    else:
        send({"jsonrpc": "2.0", "id": rid,
              "error": {"code": -32601, "message": f"unknown method: {method}"}})

Drop this in extensions/hello/main.py, mark it executable, add extensions.yaml.entries.hello: { path: "extensions/hello/main.py" }, and run nexo ext install ./extensions/hello. The daemon will load it and the LLM will see hello_world_greet in its tool catalogue.

Worked example: Go skeleton

Same protocol, idiomatic Go I/O:

package main

import (
    "bufio"
    "encoding/json"
    "fmt"
    "os"
)

type RPC struct {
    JSONRPC string          `json:"jsonrpc"`
    ID      interface{}     `json:"id,omitempty"`
    Method  string          `json:"method,omitempty"`
    Params  json.RawMessage `json:"params,omitempty"`
    Result  interface{}     `json:"result,omitempty"`
    Error   *RPCError       `json:"error,omitempty"`
}

type RPCError struct {
    Code    int    `json:"code"`
    Message string `json:"message"`
}

func main() {
    scanner := bufio.NewScanner(os.Stdin)
    enc := json.NewEncoder(os.Stdout)
    for scanner.Scan() {
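        var req RPC
        if err := json.Unmarshal(scanner.Bytes(), &req); err != nil {
            // Malformed frame: log to stderr and keep serving.
            fmt.Fprintln(os.Stderr, "[ERROR] bad frame:", err)
            continue
        }
        if req.ID == nil {
            continue // notification: never gets a response
        }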
        switch req.Method {
        case "initialize":
            enc.Encode(RPC{JSONRPC: "2.0", ID: req.ID, Result: map[string]interface{}{
                "tools":   []map[string]interface{}{{
                    "name":         "hello_go_greet",
                    "description":  "Echo a greeting",
                    "input_schema": map[string]interface{}{"type": "object"},
                }},
                "version": "0.1.0",
            }})
        // tools/call, tools/list, shutdown … same pattern
        default:
            enc.Encode(RPC{JSONRPC: "2.0", ID: req.ID, Error: &RPCError{
                Code: -32601, Message: fmt.Sprintf("unknown: %s", req.Method),
            }})
        }
    }
}

Worked example: TypeScript / Node skeleton

import * as readline from 'readline';

const rl = readline.createInterface({ input: process.stdin });

function respond(id: any, result: any) {
  process.stdout.write(JSON.stringify({ jsonrpc: '2.0', id, result }) + '\n');
}

rl.on('line', (line) => {
  const req = JSON.parse(line);
  switch (req.method) {
    case 'initialize':
      respond(req.id, {
        tools: [{
          name: 'hello_ts_greet',
          description: 'Echo a greeting',
          input_schema: { type: 'object' }
        }],
        version: '0.1.0'
      });
      break;
    // tools/call, tools/list, shutdown — same pattern
    default:
      process.stdout.write(JSON.stringify({
        jsonrpc: '2.0', id: req.id,
        error: { code: -32601, message: `unknown: ${req.method}` }
      }) + '\n');
  }
});

Reference: Rust SDK shortcut

For Rust microapps, the nexo-microapp-sdk crate (Phase 83.4) hides the wire details. See Building microapps in Rust for the high-level API. The SDK implements this contract verbatim — anything you can do via the SDK you can do by hand, but the SDK is the recommended path because it stays in lockstep with the daemon's contract version.

`agent-creator` — SaaS meta-microapp (Phase 83.8)

A reference microapp that drives the framework as a multi-tenant SaaS meta-creator of WhatsApp agents. Operators (the SaaS owner) provision one daemon per company; clients (tenants) CRUD their own agents, skills, LLM keys, and conversation views through the microapp.

Lives outside the workspace at /home/familia/chat/agent-creator-microapp/. Pulls nexo-microapp-sdk + nexo-tool-meta + nexo-compliance-primitives via path deps during dev; switch to crates.io once published.

Tool surface (22 tools)

Agents — Phase 83.8.8

Tool	Backed by
`agent_list`	`nexo/admin/agents/list`
`agent_get`	`nexo/admin/agents/get`
`agent_upsert`	`nexo/admin/agents/upsert`
`agent_delete`	`nexo/admin/agents/delete`

Skills — Phase 83.8.8

Tool	Backed by
`skill_list`	`nexo/admin/skills/list`
`skill_get`	`nexo/admin/skills/get`
`skill_upsert`	`nexo/admin/skills/upsert`
`skill_delete`	`nexo/admin/skills/delete`

The skill body lands at <root>/<name>/SKILL.md — the runtime SkillLoader reads it on every agent turn, so a CRUD round-trip shows up in the agent's prompt without a daemon restart.

LLM providers — Phase 83.8.8

Tool	Backed by
`llm_provider_list`	`nexo/admin/llm_providers/list`
`llm_provider_upsert`	`nexo/admin/llm_providers/upsert`
`llm_provider_delete`	`nexo/admin/llm_providers/delete`

Pairing — Phase 83.8.9

Tool	Backed by
`whatsapp_pair_start`	`nexo/admin/pairing/start`
`whatsapp_pair_status`	`nexo/admin/pairing/status`
`whatsapp_pair_cancel`	`nexo/admin/pairing/cancel`

Conversations — Phase 83.8.9

Tool	Backed by
`conversation_list`	`nexo/admin/agent_events/list`
`conversation_read`	`nexo/admin/agent_events/read`
`conversation_search`	`nexo/admin/agent_events/search`

The live firehose (nexo/notify/agent_event) is consumed by the SDK TranscriptStream::filter_by_agent helper — multi-tenant defense-in-depth drops events whose agent_id is not in the tenant's allowed set before the microapp ever sees the frame.
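The tenant filter can be pictured as a generator over firehose frames. This is a Python analogue written for illustration, not the SDK's `TranscriptStream::filter_by_agent` implementation; it assumes `agent_id` sits directly under `params`, as in the notification examples elsewhere in these docs.

```python
from typing import Iterable, Iterator, Set


def filter_by_agent(frames: Iterable[dict], allowed: Set[str]) -> Iterator[dict]:
    """Yield only agent_event frames whose agent_id is in the tenant's allowed set."""
    for frame in frames:
        if frame.get("method") != "nexo/notify/agent_event":
            continue  # not a firehose frame
        agent_id = (frame.get("params") or {}).get("agent_id")
        if agent_id in allowed:
            yield frame
        # Anything else is dropped before tenant code ever sees it.
```

Because the filter fails closed (a frame without `agent_id` is dropped), a malformed event cannot leak across tenants.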

Operator takeover — Phase 83.8.9

Tool	Backed by
`takeover_engage`	SDK `HumanTakeover::engage` → `nexo/admin/processing/pause`
`takeover_send`	`HumanTakeover::send_reply` → `nexo/admin/processing/intervention`
`takeover_release`	`HumanTakeover::release` → `nexo/admin/processing/resume`

takeover_send flows operator-typed replies through the ChannelOutboundDispatcher trait wired in Phase 83.8.4.a — Phase 83.8.4.b ships the production BrokerOutboundDispatcher (nexo_setup::admin_adapters) that publishes to the per-channel plugin.outbound.<channel>[.<account>] topic each plugin's existing dispatcher already listens on. WhatsApp translator ships in v1; Telegram + Email translators are TBD per-channel follow-ups (83.8.4.b.tg / 83.8.4.b.em).

Escalations — Phase 83.8.9

Tool	Backed by
`escalation_list`	`nexo/admin/escalations/list`
`escalation_resolve`	`nexo/admin/escalations/resolve`

EscalationReason::UnknownQuery (Phase 83.8.5) covers the "agent doesn't know" UI notification path.

Compliance hook — Phase 83.8.10

The before_message hook chains:

OptOutMatcher (Spanish + English keywords) → Abort.
AntiLoopDetector (3 repetitions in 60 s) → Abort.
PiiRedactor (cards / phones / emails) → log redaction stats.

Defaults-on. Per-agent override propagation through extensions_config.compliance is logged in FOLLOWUPS.md as a framework follow-up — needs the wire shape on BindingContext.
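The anti-loop step can be approximated with a sliding window per sender and message. The sketch assumes a "repetition" means the same normalized body from the same sender, and that the third identical message inside 60 s is the one that aborts; both are guesses at the detector's exact rule, and this class is not the nexo-compliance-primitives implementation.

```python
from collections import defaultdict, deque


class AntiLoopDetector:
    """Flag a message once it has been seen max_repeats times within window_s."""

    def __init__(self, max_repeats: int = 3, window_s: float = 60.0):
        self.max_repeats = max_repeats
        self.window_s = window_s
        self._seen = defaultdict(deque)  # (sender, body) -> recent timestamps

    def check(self, sender: str, body: str, now_s: float) -> bool:
        """Record the message and return True when the hook should abort."""
        key = (sender, body.strip().lower())
        times = self._seen[key]
        times.append(now_s)
        # Forget sightings that have slid out of the window.
        while times and now_s - times[0] > self.window_s:
            times.popleft()
        return len(times) >= self.max_repeats
```

Passing the clock in as `now_s` keeps the detector deterministic under test; a real hook would feed it the inbound timestamp.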

Capabilities (`plugin.toml`)

[capabilities.admin]
required = [
    "agents_crud", "skills_crud", "llm_keys_crud",
    "pairing_initiate", "transcripts_read",
    "operator_intervention",
    "escalations_read", "escalations_resolve",
]
optional = ["credentials_crud", "channels_crud"]

The operator grants these in extensions.yaml.<id>.capabilities_grant. Missing required → boot-time fail-fast; missing optional → handler-time -32004.

SDK opt-in

Microapp::new(APP_NAME, env!("CARGO_PKG_VERSION"))
    .with_admin()                       // Phase 83.8.8.a
    .with_hook("before_message", hooks::compliance::before_message)
    .with_tool("agent_list", tools::agents::agent_list)
    // … 21 more tools
    .run_stdio()
    .await

with_admin() wires the SDK AdminClient through the same stdout writer the daemon-reply path uses, intercepts inbound app: correlation IDs, and exposes the client through ToolCtx::admin() / HookCtx::admin(). Tool handlers do no hand-rolled JSON-RPC plumbing — every admin call is one ctx.admin()?.call("nexo/admin/<method>", &params).await.

Stress-test methodology

This microapp exists to stress-test the framework. Friction encountered during construction triggers a framework fix (agnostic + reusable by other microapps), not a microapp-side workaround. Five gaps closed during the v1 build:

nexo/admin/skills/* CRUD missing → end-to-end shipped.
processing.intervention did not dispatch outbound → ChannelOutboundDispatcher trait + handler wire.
SDK AdminClient had no runtime integration → Microapp::with_admin() builder + ToolCtx accessor.
Operator UI needed EscalationReason::UnknownQuery → variant added.
SDK lacked HumanTakeover + TranscriptStream::filter_by_agent helpers → both shipped.

See FOLLOWUPS.md (workspace root) for the active deferred-follow-up list.

Templates — language-by-language reference

This page lists the starting points for authoring a nexo microapp in each supported language.

The contract (contract.md) is the source of truth — line-delimited JSON-RPC over stdio. Every template below ships a working initialize → tools/list → tools/call → shutdown loop against that contract. They differ only in ergonomics and per-language idioms.

Rust (recommended) — `nexo-microapp-sdk`

Where: extensions/template-microapp-rust/ in the nexo-rs repo.

Why use the SDK: the daemon's contract version evolves under N+N+1 deprecation rules. The Rust SDK lives in lockstep with the daemon, so an additive field on the wire becomes an additive field on ToolCtx / HookCtx automatically. Hand-rolled parsers risk silent drift.

Quick start:

cp -r /path/to/nexo-rs/extensions/template-microapp-rust ./mi-microapp
cd ./mi-microapp
# rename in Cargo.toml + plugin.toml + src/main.rs
cargo build --release

See rust.md for the full SDK reference and getting-started.md for the 1-hour walkthrough.

SDK feature flags:

Feature	Adds
(default)	`Microapp` builder + tool/hook handlers
`outbound`	`OutboundDispatcher` for `nexo/dispatch` outbound calls
`admin`	`AdminClient` for `nexo/admin/*` calls (capability-gated)
`test-harness`	`MicroappTestHarness` + `MockBindingContext` for unit tests

Python — hand-rolled (stdlib only)

No SDK ships today. Authors implement the wire protocol directly using sys.stdin / sys.stdout / json. The contract doc has a full worked example.

Skeleton:

#!/usr/bin/env python3
import json
import sys

# Single catalogue so initialize and tools/list return identical tools.
TOOLS = [{
    "name": "myapp_greet",
    "description": "Echo a greeting",
    "input_schema": {"type": "object", "properties": {
        "name": {"type": "string"}
    }, "required": ["name"]}
}]

def send(frame):
    # One frame per line, flushed immediately.
    sys.stdout.write(json.dumps(frame) + "\n")
    sys.stdout.flush()

def respond(req_id, result):
    send({"jsonrpc": "2.0", "id": req_id, "result": result})

for line in sys.stdin:
    req = json.loads(line)
    if "id" not in req:
        continue  # notification: never answered
    rid = req["id"]
    method = req["method"]

    if method == "initialize":
        respond(rid, {"tools": TOOLS, "version": "0.1.0"})
    elif method == "tools/call":
        name = req["params"]["args"]["name"]
        respond(rid, {"output": {"greeting": f"hello, {name}"}})
    elif method == "tools/list":
        respond(rid, {"tools": TOOLS})
    elif method == "shutdown":
        respond(rid, {"ok": True})
        break
    else:
        send({"jsonrpc": "2.0", "id": rid,
              "error": {"code": -32601, "message": f"unknown method: {method}"}})

plugin.toml:

[plugin]
id = "my-python-microapp"
version = "0.1.0"
name = "My Python Microapp"

[capabilities]
tools = ["myapp_greet"]

[transport]
kind = "stdio"
command = "python3"
args    = ["./main.py"]

Library tips:

pydantic for the JSON-RPC envelopes if you want typed parsing.
anyio if you need async tool handlers.
For tests, run the script as a subprocess and pipe JSON-RPC frames in and out.
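The subprocess approach from the last tip can be sketched as follows. This is a minimal illustration, not something nexo ships: run_session is a hypothetical helper name, and it assumes the microapp writes exactly one reply line per request.

```python
import json
import subprocess
import sys


def run_session(argv, requests):
    """Spawn a microapp, send one JSON line per request, collect one reply each."""
    proc = subprocess.Popen(
        argv, stdin=subprocess.PIPE, stdout=subprocess.PIPE, text=True
    )
    replies = []
    try:
        for req in requests:
            proc.stdin.write(json.dumps(req) + "\n")
            proc.stdin.flush()
            line = proc.stdout.readline()
            if not line:
                raise RuntimeError("microapp closed stdout before replying")
            replies.append(json.loads(line))
    finally:
        # Closing stdin signals EOF so a line-reading loop can exit.
        proc.stdin.close()
        proc.wait(timeout=5)
        proc.stdout.close()
    return replies
```

A test would call it with something like run_session(["python3", "./main.py"], [...]) and assert on the returned frames.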

TypeScript / Node — hand-rolled

Same shape as Python; Node's readline does the line-splitting.

Skeleton:

import * as readline from 'readline';

const rl = readline.createInterface({ input: process.stdin });

// Advertised by both `initialize` and `tools/list`.
const tools = [{
  name: 'myapp_greet',
  description: 'Echo a greeting',
  input_schema: { type: 'object' }
}];

function send(frame: unknown, done?: () => void) {
  process.stdout.write(JSON.stringify(frame) + '\n', done);
}

function respond(id: unknown, result: unknown, done?: () => void) {
  send({ jsonrpc: '2.0', id, result }, done);
}

rl.on('line', (line) => {
  if (!line.trim()) return; // ignore blank lines
  const req = JSON.parse(line);
  switch (req.method) {
    case 'initialize':
      respond(req.id, { tools, version: '0.1.0' });
      break;
    case 'tools/list':
      respond(req.id, { tools });
      break;
    case 'tools/call':
      respond(req.id, { output: { greeting: `hello, ${req.params.args.name}` } });
      break;
    case 'shutdown':
      // Exit only after the reply has been flushed to stdout.
      respond(req.id, { ok: true }, () => process.exit(0));
      break;
    default:
      send({
        jsonrpc: '2.0', id: req.id,
        error: { code: -32601, message: `unknown: ${req.method}` }
      });
  }
});

plugin.toml:

[plugin]
id = "my-ts-microapp"

[transport]
kind = "stdio"
command = "node"
args    = ["./dist/main.js"]

Library tips:

@types/node for stdio types.
zod for tool input schema validation server-side.
bun works as a drop-in for node and gives faster startup.

Go — hand-rolled

Same shape; bufio.Scanner for line reading.

Skeleton:

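package main

import (
    "bufio"
    "encoding/json"
    "fmt"
    "os"
)

type RPC struct {
    JSONRPC string          `json:"jsonrpc"`
    ID      interface{}     `json:"id,omitempty"`
    Method  string          `json:"method,omitempty"`
    Params  json.RawMessage `json:"params,omitempty"`
    Result  interface{}     `json:"result,omitempty"`
    Error   *RPCError       `json:"error,omitempty"`
}

type RPCError struct {
    Code    int    `json:"code"`
    Message string `json:"message"`
}

// Advertised by both initialize and tools/list.
var tools = []map[string]interface{}{{
    "name":         "myapp_greet",
    "description":  "Echo a greeting",
    "input_schema": map[string]interface{}{"type": "object"},
}}

func main() {
    scanner := bufio.NewScanner(os.Stdin)
    // The default Scanner limit is 64 KiB per line; allow frames up to 1 MiB.
    scanner.Buffer(make([]byte, 0, 64*1024), 1<<20)
    enc := json.NewEncoder(os.Stdout)
    for scanner.Scan() {
        var req RPC
        if err := json.Unmarshal(scanner.Bytes(), &req); err != nil {
            fmt.Fprintln(os.Stderr, "skipping malformed frame:", err)
            continue
        }
        switch req.Method {
        case "initialize":
            enc.Encode(RPC{JSONRPC: "2.0", ID: req.ID, Result: map[string]interface{}{
                "tools":   tools,
                "version": "0.1.0",
            }})
        case "tools/list":
            enc.Encode(RPC{JSONRPC: "2.0", ID: req.ID, Result: map[string]interface{}{"tools": tools}})
        case "tools/call":
            var p struct {
                Args struct {
                    Name string `json:"name"`
                } `json:"args"`
            }
            if err := json.Unmarshal(req.Params, &p); err != nil {
                enc.Encode(RPC{JSONRPC: "2.0", ID: req.ID, Error: &RPCError{
                    Code: -32602, Message: "invalid params",
                }})
                continue
            }
            enc.Encode(RPC{JSONRPC: "2.0", ID: req.ID, Result: map[string]interface{}{
                "output": map[string]string{"greeting": "hello, " + p.Args.Name},
            }})
        case "shutdown":
            enc.Encode(RPC{JSONRPC: "2.0", ID: req.ID, Result: map[string]bool{"ok": true}})
            return
        default:
            enc.Encode(RPC{JSONRPC: "2.0", ID: req.ID, Error: &RPCError{
                Code: -32601, Message: fmt.Sprintf("unknown: %s", req.Method),
            }})
        }
    }
}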

plugin.toml:

[transport]
kind = "stdio"
command = "./my-go-microapp"   # the compiled binary

Choosing a language

Use case	Recommended stack
Multi-tenant SaaS, performance-sensitive	Rust + SDK
Quick prototype / glue to existing Python data pipeline	Python + stdlib
TypeScript shop, integration with web ecosystem	TypeScript + stdlib
Single-binary distribution to ops, no runtime dep	Go + stdlib

Rule of thumb: if your microapp is the product, use Rust + SDK so contract evolution is automatic. If your microapp glues to another runtime you already maintain, use the host language and pin the contract version explicitly in your code.

Contract version pinning

Whichever language you pick, your microapp MUST know the contract version it was tested against. The Rust SDK pins it through the SDK dependency version in Cargo.toml ("0.1"); hand-rolled microapps MUST embed a constant and assert on it at boot.

NEXO_CONTRACT_VERSION = "0.1"
# Future: read daemon's `initialize` response for a contract_version
# field and warn if it disagrees.

The contract doc's backward compat rules apply: additive fields always, deprecation N + N+1, wire format frozen.
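A boot-time check for a hand-rolled microapp could look like the sketch below. It assumes the daemon's initialize message will eventually carry a contract_version field, which the comment above marks as future work; check_contract_version is a hypothetical helper name.

```python
import sys

NEXO_CONTRACT_VERSION = "0.1"  # version this microapp was tested against


def check_contract_version(initialize_payload, pinned=NEXO_CONTRACT_VERSION):
    """Return True when versions agree or the daemon reports none."""
    # Assumed field name; the contract does not define it today.
    reported = initialize_payload.get("contract_version")
    if reported is None:
        return True
    if reported != pinned:
        print(
            f"warning: daemon contract {reported!r} != pinned {pinned!r}",
            file=sys.stderr,
        )
        return False
    return True
```

Warnings go to stderr because stdout is reserved for JSON-RPC frames.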

CLI reference

Single source of truth for every agent subcommand, flag, exit code, and env var. agent is the one binary you'll ever run in production — this is everything it can do.

Source: src/main.rs (Mode enum + parse_args), crates/extensions/src/cli/, crates/setup/src/.

Invocation

agent [--config <dir>] [<subcommand> ...]

Arg parser: hand-rolled, not clap. --help / -h work; -c is not an alias for --config (case-sensitive exact match).
No subcommand → run the daemon (default).
Global flag: --config <dir> (default ./config).
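The ordering rule for the global flag can be pictured with a small sketch. This is not the parse_args code from src/main.rs; split_global_args is a hypothetical function that only illustrates why --config is recognised before the subcommand and left alone after it.

```python
def split_global_args(argv, default_config="./config"):
    """Consume a leading --config <dir>; everything else belongs to the subcommand."""
    config = default_config
    i = 0
    while i < len(argv) and argv[i] == "--config":
        if i + 1 >= len(argv):
            raise ValueError("--config needs a directory")
        config = argv[i + 1]
        i += 2
    # Anything after the first non-global token is passed through untouched,
    # so a trailing --config reaches the subcommand as an unknown argument.
    return config, argv[i:]
```

With this shape, ["--config", "./x", "ext", "list"] yields ("./x", ["ext", "list"]), while ["ext", "list", "--config", "./x"] keeps the default directory and hands --config to ext.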

Global environment variables

Variable	Values	Purpose
`RUST_LOG`	tracing-subscriber filter	Log level (e.g. `info,agent=debug`). Default `info`.
`AGENT_LOG_FORMAT`	`pretty` \| `compact` \| `json`	Log format. Default `pretty`.
`AGENT_ENV`	`production` (or `prod`)	Triggers JSON logs unless `AGENT_LOG_FORMAT` overrides.
`TASKFLOW_DB_PATH`	file path	Flow CLI DB (default `./data/taskflow.db`).
`CONFIG_SECRETS_DIR`	dir path	Whitelists an extra root for `${file:...}` YAML refs.
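The precedence between AGENT_LOG_FORMAT and AGENT_ENV in the table can be read as the following sketch. resolve_log_format is a hypothetical name, and treating an unrecognised AGENT_LOG_FORMAT value as unset is an assumption, not documented behaviour.

```python
def resolve_log_format(env):
    """Pick the log format from an environment mapping."""
    explicit = env.get("AGENT_LOG_FORMAT")
    if explicit in ("pretty", "compact", "json"):
        return explicit  # an explicit format always wins
    if env.get("AGENT_ENV") in ("production", "prod"):
        return "json"  # production switches to JSON logs
    return "pretty"  # documented default
```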

Exit codes (generic)

Code	Meaning
`0`	Success
`1`	General failure (not found, config invalid, connection refused)
`2`	Warnings-only outcome for `--check-config` in non-strict mode. `init`, `set-broker`, and `reload` define their own meaning for `2` (see each subcommand).

The ext subcommand has its own, richer exit-code table; see below.

Subcommand index

Subcommand	Purpose
(default)	Run the agent daemon
`init`	Scaffold sample YAMLs (Phase 95)
`set-broker`	Switch broker.yaml between `local` and `nats` (Phase 92.9)
`setup`	Interactive credential wizard
`status`	Query running agent instances
`dlq`	Dead-letter queue inspection
`ext`	Extension management
`flow`	TaskFlow operations
`mcp-server`	Run as MCP stdio server
`admin`	Run the web admin UI behind a Cloudflare quick tunnel
`reload`	Trigger config hot-reload on a running daemon
`--check-config`	Pre-flight config validation
`--dry-run`	Load config and print the plan

Daemon (default)

agent [--config ./config]

Boots every configured agent runtime, connects to the broker (NATS or local fallback), starts metrics (:9090), health (:8080), and admin (:9091 loopback) servers.

Exit codes:

0 — clean shutdown via SIGTERM / Ctrl+C
1 — config load failed, broker unreachable at startup, plugin failed to initialize

Logs to: stderr. See Logging.

`init`

Scaffold sample YAMLs into the config dir. Templates are baked into the binary at compile time (include_str!), so this works on a fresh install with zero network access.

agent init                                       # all 19 templates → ${XDG_CONFIG_HOME:-~/.config}/nexo
agent init --output /etc/nexo-rs                 # custom dir
agent init --yaml broker,llm                     # shorthand: only those two
agent init --yaml plugins/whatsapp               # plugin subdir templates
agent init --force                               # overwrite existing files
agent init --stdout --yaml broker                # print one template to stdout (no file write)

Yaml filter shorthand: bare names (broker, agents, llm, memory, extensions, mcp, mcp_server, runtime, pollers, taskflow, transcripts, pairing, webhook_receiver) resolve to top-level YAMLs. Plugin subdir templates: plugins/whatsapp, plugins/telegram, plugins/email, plugins/browser, plugins/discovery. Persona templates: personas/discovery.

Exit codes: 0 on write, 1 on filter mismatch, 2 if --force not passed and target exists.

Postinst scripts in the .deb / .rpm / Termux packages call agent init --output <CONFIG_DIR> automatically on first install so a fresh-from-package operator never starts from a blank dir.

`set-broker`

Switch the broker mode without editing broker.yaml by hand. Rewrites broker.yaml to the requested kind, then (by default) sends SIGHUP to every running daemon that loaded this config dir — the daemon respawns with the new broker (~3s blackout for in-flight messages, drained from the persistence layer).

agent set-broker local                                # stdio bridge (no NATS server)
agent set-broker nats --url nats://localhost:4222     # multi-host mode
agent set-broker local --no-signal                    # edit YAML only, daemon stays on old broker until restart
agent --config /etc/nexo-rs set-broker nats --url nats://10.0.0.5:4222

local mode uses the daemon-derived stdio_bridge transport for subprocess plugins — no NATS server required. nats mode requires a reachable NATS server at --url; subprocess plugins inherit NEXO_BROKER_URL and connect via async-nats.

See broker shapes for the full architectural picture and zero-config quickstart for the typical operator flow.

Exit codes: 0 on success, 1 if nats requested without --url or YAML write failed, 2 if no daemon matched (YAML still updated; user must start the daemon manually).

`setup`

Interactive credential wizard. Launches a prompt-driven flow for every service you want to enable — LLM keys, WhatsApp QR, Telegram bot token, Google OAuth, etc.

agent setup                    # full interactive wizard
agent setup list               # list installable service ids
agent setup <service>          # configure one service (e.g. minimax, whatsapp)
agent setup doctor             # validate every credential / token (also runs the Phase 70.6 pairing-store audit)
agent setup telegram-link      # print Telegram bot link-to-chat URL

Exit codes: 0 on completion; 1 on error.

See Setup wizard for the step-by-step.

`status`

Query the running daemon via the loopback admin console.

agent status                                   # every agent, table
agent status ana                               # one agent, table
agent status --json                            # raw JSON
agent status --endpoint http://remote:9091     # override endpoint

Table output columns: ID | MODEL | BINDINGS | DELEGATES | DESCRIPTION

Exit codes:

0 — query succeeded
1 — endpoint unreachable or agent id not found

`dlq`

Dead-letter queue inspection. See DLQ operations for the full picture.

agent dlq list                 # plain-text table, up to 1000 entries
agent dlq replay <id>          # move back to pending_events for retry
agent dlq purge                # drop every entry (destructive)

Exit codes: 0 success; 1 failure (entry not found, DB error).

list columns: id | topic | failed_at | reason.

`ext`

Extension management. See Extensions — CLI for details and workflows.

agent ext list                         [--json]
agent ext info <id>                    [--json]
agent ext enable <id>
agent ext disable <id>
agent ext validate <path>
agent ext doctor                       [--runtime] [--json]
agent ext install <path>               [--update] [--enable] [--dry-run] [--link] [--json]
agent ext uninstall <id> --yes         [--json]

Flags:

Flag	Where	Purpose
`--json`	list / info / doctor / install / uninstall	Machine-readable output
`--runtime`	`doctor`	Also spawn stdio extensions to verify handshake
`--update`	`install`	Overwrite if already installed
`--enable`	`install`	Flip to `enabled: true` in `extensions.yaml`
`--link`	`install`	Symlink source (absolute path required) instead of copy
`--dry-run`	`install`	Validate without writing
`--yes`	`uninstall`	Required confirmation

Exit codes (extension-specific):

Code	Meaning
0	Success
1	Extension not found / `--update` target missing
2	Invalid manifest / invalid source / `--link` needs absolute path
3	Config write failed
4	Invalid id (reserved or empty)
5	Target exists (use `--update`)
6	Id collision across roots
7	`uninstall` missing `--yes` confirmation
8	Copy / atomic swap failed
9	Runtime check(s) failed (`doctor --runtime`)
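Scripts that wrap agent ext can turn these codes into readable messages. The sketch below copies the table into a lookup; describe_ext_exit is a hypothetical helper name.

```python
EXT_EXIT_CODES = {
    0: "success",
    1: "extension not found / --update target missing",
    2: "invalid manifest / invalid source / --link needs absolute path",
    3: "config write failed",
    4: "invalid id (reserved or empty)",
    5: "target exists (use --update)",
    6: "id collision across roots",
    7: "uninstall missing --yes confirmation",
    8: "copy / atomic swap failed",
    9: "runtime check(s) failed (doctor --runtime)",
}


def describe_ext_exit(code):
    """Map an `agent ext` exit code to the meaning listed in the table."""
    return EXT_EXIT_CODES.get(code, f"unknown exit code {code}")
```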

`flow`

TaskFlow operations. See TaskFlow — FlowManager.

agent flow list                [--json]
agent flow show <id>           [--json]
agent flow cancel <id>
agent flow resume <id>

Env var: TASKFLOW_DB_PATH (default ./data/taskflow.db).

Exit codes: 0 success; 1 on error (flow not found, wrong state, DB inaccessible).

list sorts by updated_at DESC; show includes every recorded step; resume only works on Manual or ExternalEvent waits.

`mcp-server`

Run the agent as an MCP stdio server so MCP clients (Claude Desktop, Cursor, Zed) can consume its tools.

agent mcp-server

Reads JSON-RPC from stdin, writes responses to stdout
Does not boot a daemon or broker
Requires config/mcp_server.yaml with enabled: true

Exit codes: 0 on clean exit; 1 if mcp_server.yaml disabled.

See MCP — Agent as MCP server for deployment recipes (Claude Desktop config, allowlist, auth token).

`admin`

Run the web admin UI behind a fresh Cloudflare quick tunnel. A new ephemeral trycloudflare.com URL is minted on every launch — no account, no DNS, no TLS setup.

agent admin                  # listen on 127.0.0.1:9099 (default)
agent admin --port 9199      # pick a different loopback port
agent admin --port=9199      # same thing, equals form

What happens on launch:

Install cloudflared if missing. The tunnel crate detects the host OS/arch and downloads the matching cloudflared binary into the platform data dir. Subsequent launches reuse the cached copy.
Mint a fresh random password. 24 URL-safe characters from the OS RNG. Printed once to stdout — copy it now; there is no recovery short of relaunching agent admin.
Start a loopback HTTP server. Listens on 127.0.0.1:<port> and serves the React bundle embedded at Rust compile time (see admin-ui/) behind HTTP Basic Auth. A bundle-missing fallback page is served if admin-ui/dist/ was empty when cargo build ran.
Open a quick tunnel. cloudflared tunnel --url http://127.0.0.1:<port> returns an ephemeral https://…trycloudflare.com URL, which the command prints to stdout alongside the username (admin) and the freshly-minted password.
Wait for Ctrl+C / SIGTERM. Graceful shutdown kills the cloudflared child and stops the HTTP listener.
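The password and Basic Auth steps above can be sketched in a few lines. This illustrates the idea only and is not the daemon's Rust code; the function names are hypothetical, and using 18 random bytes to obtain 24 URL-safe characters is one possible way to meet the stated length.

```python
import base64
import hmac
import secrets


def mint_password():
    # 18 random bytes from the OS RNG encode to exactly 24 URL-safe characters.
    return secrets.token_urlsafe(18)


def basic_auth_header(username, password):
    token = base64.b64encode(f"{username}:{password}".encode()).decode("ascii")
    return f"Basic {token}"


def is_authorized(header, username, password):
    # Constant-time comparison avoids leaking how many characters matched.
    expected = basic_auth_header(username, password)
    return hmac.compare_digest(header.encode(), expected.encode())
```

A server would mint the password once at launch, print it, and compare each request's Authorization header with is_authorized.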

Exit codes:

0 — clean shutdown
1 — cloudflared install failed, port already bound, or tunnel negotiation failed

Notes:

URL is re-generated every launch. If you need a stable URL, switch to a named Cloudflare tunnel (requires a Cloudflare account and a cloudflared tunnel config; out of scope for this command).
Auth is HTTP Basic for now; the browser prompts for admin / <password> on first load. Username is fixed; password is fresh every launch. Keep the shell scrollback if you need to re-paste it.
The password is never persisted — losing it means stopping agent admin and starting again (which also rotates the tunnel URL).

`reload`

Triggers a config hot-reload on a running daemon. The command subscribes to control.reload.ack first, then publishes control.reload on the broker the daemon is listening to (resolved from broker.yaml), waits up to 5 s for the ack, and prints the outcome.

agent reload                 # human-readable summary
agent reload --json          # serialized ReloadOutcome

Example output:

$ agent reload
reload v7: applied=2 rejected=0 elapsed=18ms
  ✓ ana
  ✓ bob

Exit codes:

0 — at least one agent reloaded
1 — no ack within 5 s (daemon not running)
2 — every agent rejected

Full semantics — what's reloaded, apply-on-next-message, failure modes — in Config hot-reload.
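Why the ack subscription has to exist before the publish can be shown with an in-memory stand-in for the broker. ToyBroker and request_reload are hypothetical names, the ack payload shape is invented for illustration, and real NATS or local-broker clients have their own APIs.

```python
import queue


class ToyBroker:
    """Delivers a message only to subscribers that exist at publish time."""

    def __init__(self):
        self._queues = {}
        self._handlers = {}

    def subscribe(self, topic):
        q = queue.Queue()
        self._queues.setdefault(topic, []).append(q)
        return q

    def on(self, topic, handler):
        self._handlers.setdefault(topic, []).append(handler)

    def publish(self, topic, message):
        for q in self._queues.get(topic, []):
            q.put(message)
        for handler in self._handlers.get(topic, []):
            handler(message)


def request_reload(broker, timeout=5.0):
    acks = broker.subscribe("control.reload.ack")  # 1. subscribe first
    broker.publish("control.reload", {})           # 2. then publish
    try:
        return acks.get(timeout=timeout)           # 3. wait for the ack
    except queue.Empty:
        return None                                # no daemon answered
```

If the two numbered steps were swapped, a daemon that acks immediately would publish before anyone listens and the ack would be lost.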

`--check-config`

Pre-flight validation. Loads every YAML file, resolves env vars, checks schema, validates credentials. No broker, no daemon. Meant for CI.

agent --check-config                    # warnings-only mode
agent --check-config --strict           # warnings become errors

Exit codes:

0 — all clear
1 — hard errors (missing required creds, invalid schema)
2 — warnings only (non-strict mode)
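A CI job can map these three codes to a pipeline verdict. The sketch below is one way to do it; ci_verdict and run_check_config are hypothetical names, and the binary name and config path are placeholders to adjust.

```python
import subprocess


def ci_verdict(returncode, fail_on_warnings=False):
    """Translate a --check-config exit code into a CI outcome."""
    if returncode == 0:
        return "pass"
    if returncode == 2:
        return "fail" if fail_on_warnings else "pass-with-warnings"
    return "fail"  # 1 and anything unexpected


def run_check_config(binary="agent", config_dir="./config"):
    # --config goes first, matching the documented flag ordering.
    proc = subprocess.run([binary, "--config", config_dir, "--check-config"])
    return ci_verdict(proc.returncode)
```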

`--dry-run`

Load the config and print a plan. Doesn't connect to the broker or start any runtime task.

agent --dry-run
agent --dry-run --json

Output (plain text):

Config directory
Broker kind (nats | local)
Plugin list
Agent directory table (id, model, bindings, delegates, description)

Exit codes: 0 valid; 1 on error.

Daemon admin endpoints

Reference for status --endpoint and anyone wiring a custom dashboard:

Endpoint	Method	Bind	Purpose
`/admin/agents`	GET	`127.0.0.1:9091`	List every agent (JSON)
`/admin/agents/<id>`	GET	`127.0.0.1:9091`	Single agent (JSON)
`/admin/tool-policy`	GET	`127.0.0.1:9091`	Tool policy queries
`/admin/credentials/reload`	POST	`127.0.0.1:9091`	Phase 17 — re-read agents/plugins YAML and atomically swap the credential resolver. Returns `ReloadOutcome` JSON. See `config/credentials.md`.
`/health`	GET	`0.0.0.0:8080`	Liveness probe
`/ready`	GET	`0.0.0.0:8080`	Readiness probe
`/metrics`	GET	`0.0.0.0:9090`	Prometheus
`/whatsapp/pair*`	GET	`0.0.0.0:8080`	WhatsApp pairing QR (first instance)
`/whatsapp/<instance>/pair*`	GET	`0.0.0.0:8080`	Multi-instance WhatsApp pairing
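For a custom dashboard, reading the agent list takes a few lines. This sketch assumes only what the table states, a GET on /admin/agents that returns JSON; the shape of that JSON is not documented here, so the result is returned unparsed beyond json.loads. fetch_agents is a hypothetical name.

```python
import json
import urllib.request


def fetch_agents(endpoint="http://127.0.0.1:9091", opener=urllib.request.urlopen):
    """GET <endpoint>/admin/agents and return the decoded JSON body."""
    url = f"{endpoint.rstrip('/')}/admin/agents"
    with opener(url) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

The opener parameter exists so tests can substitute a fake transport instead of a live daemon.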

Gotchas

Hand-rolled parser. The parser is less forgiving than clap-based CLIs: unexpected flag ordering can produce "unknown argument" errors. Stick to the form shown in each subcommand.
Global --config must come before the subcommand. agent --config ./x ext list works; agent ext list --config ./x does not.
Admin console is loopback-only. It never listens on a public interface, so status --endpoint against a remote host requires a tunnel.

Background agents — `agent run --bg` / `agent ps` / `agent attach` / `agent discover`

Operator-side CLI for spawning, listing, and inspecting goals that should outlive the spawning shell. Pairs with assistant mode: the agent runs in the background while you go do something else; you check in via agent ps / agent attach whenever convenient.

SessionKind

Every goal handle carries a SessionKind enum identifying how it was spawned and how it should survive a daemon restart:

Kind	Meaning	Survives restart
`Interactive`	User-driven REPL turn or chat-channel inbound (default)	No — Phase 71 reattach flips `Running` → `LostOnRestart`
`Bg`	Operator spawned a detached goal via `agent run --bg`	Yes — keeps `Running`
`Daemon`	Persistent supervised goal (e.g. assistant_mode binding's always-on agent loop)	Yes
`DaemonWorker`	Worker spawned BY a `Daemon` goal — short-lived sub-agent	Yes (treated like `Bg` for reattach)

Schema migration v5 adds the kind column to the agent_handles SQLite table; pre-80.10 rows default to Interactive automatically.

`agent run [--bg] <prompt>`

Spawns a new goal handle. With --bg, sets kind = Bg + phase_id = "cli-bg" + returns the goal_id immediately so the operator can detach. Without --bg, sets kind = Interactive + phase_id = "cli-run".

$ nexo agent run --bg "review the latest commits and post a summary"
[agent-run] goal_id=a9f62654-688b-4e41-95c9-1ec2a1a39f6d
[agent-run] kind=bg
[agent-run] status=running (queued for daemon pickup)
[agent-run] prompt: review the latest commits and post a summary
[agent-run] detached — re-attach later with `nexo agent attach a9f62654-...`

Validates that the prompt is non-empty. JSON output via --json.

Note: the slim MVP inserts the row into agent_handles but the daemon-side pickup of queued goals is deferred to 80.10.g. For now, the row sits Running until manually transitioned via agent attach (Phase 80.16) or a future supervisor.

`agent ps [--all] [--kind=...] [--db=<path>] [--json]`

Reads agent_handles.db read-only and renders a markdown table.

$ nexo agent ps
# Agent runs (db: /home/.../state/agent_handles.db)

| ID       | Kind | Status  | Phase    | Started             | Ended |
|----------|------|---------|----------|---------------------|-------|
| a9f62654 | bg   | Running | cli-bg   | 2026-04-30T19:03:01 | -     |

1 rows shown.

Default filter: only Running goals. Use --all to include terminal rows. Use --kind=bg (or interactive / daemon / daemon_worker) to narrow.

JSON output for scripting:

$ nexo agent ps --json | jq '.[] | select(.kind == "bg")'

Empty / missing-DB case prints a friendly message and exits 0:

(no agent runs recorded yet — db not found at /home/.../agent_handles.db)

`agent discover [--include-interactive] [--db=<path>] [--json]`

Operator's "what is running detached?" view. Filtered to Bg / Daemon / DaemonWorker kinds by default. Pass --include-interactive to broaden.

$ nexo agent discover
# Discoverable goals (db: /home/.../state/agent_handles.db)

| ID       | Kind | Phase  | Started             | Last activity        |
|----------|------|--------|---------------------|----------------------|
| a9f62654 | bg   | cli-bg | 2026-04-30T19:03:01 | 2026-04-30T19:25:42  |

1 goal(s).

Sort: started_at descending (newest first).

Empty result includes a hint:

(no detached / daemon goals running; pass --include-interactive to broaden)

`agent attach <goal_id> [--db=<path>] [--json]`

Read-only viewer of a goal's latest persisted snapshot. Live event streaming via NATS lands in 80.16.b — for now, the command shows the most recent AgentSnapshot from the registry.

$ nexo agent attach a9f62654-688b-4e41-95c9-1ec2a1a39f6d
# Agent Goal a9f62654-688b-4e41-95c9-1ec2a1a39f6d

- **kind**: bg
- **status**: Running
- **phase_id**: cli-bg
- **started_at**: 2026-04-30 19:03:01 UTC

## Last progress
Reviewed last 12 commits, drafted summary, awaiting outbound channel hookup.

- **turn_index**: 4/30
- **last_event_at**: 2026-04-30 19:25:42 UTC

[attach] Live event stream requires daemon connection — re-run with NATS available (Phase 80.16.b follow-up).

For terminal goals, the hint changes:

[attach] Goal is in terminal state Done; no further updates expected.

Validates the UUID upfront (exit 1 with "is not a valid UUID"); exits 1 with "no agent handle found" when the row is absent.

Database path resolution

All four commands resolve the agent_handles.db path the same 3-tier way (mirrors agent dream from Phase 80.1.d):

--db <path> (explicit override, beats everything)
NEXO_STATE_ROOT env → <state_root>/agent_handles.db
XDG default ~/.local/share/nexo/state/agent_handles.db

The YAML tier is intentionally absent — agents.state_root is not a config field today; state_root flows into BootDeps directly per Phase 80.1.b.b.b documentation. Set NEXO_STATE_ROOT to align the CLI with your daemon's actual data dir.
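The three tiers translate directly into a lookup order. The sketch below mirrors the list above; resolve_db_path is a hypothetical name, and the default is built from the literal path shown rather than from any XDG environment variable.

```python
import os


def resolve_db_path(cli_db=None, env=None):
    """Resolve agent_handles.db: --db, then NEXO_STATE_ROOT, then the default."""
    env = os.environ if env is None else env
    if cli_db:
        return cli_db  # tier 1: explicit override
    state_root = env.get("NEXO_STATE_ROOT")
    if state_root:
        return os.path.join(state_root, "agent_handles.db")  # tier 2
    # tier 3: ~/.local/share/nexo/state/agent_handles.db
    return os.path.join(
        os.path.expanduser("~"), ".local", "share", "nexo", "state",
        "agent_handles.db",
    )
```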

Kind-aware reattach

Phase 71 reattach (boot-time recovery) has been SessionKind-aware since 80.10:

kind == Interactive + Running pre-restart → flip to LostOnRestart (the user is gone; no caller waiting)
kind ∈ {Bg, Daemon, DaemonWorker} + Running pre-restart → keep Running (the operator expects them to survive)

Use reattach_running_kind_aware() from SqliteAgentRegistryStore. The legacy, non-kind-aware reattach_running remains for existing callers.
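The two rules reduce to a small decision function. This is a Python illustration of the behaviour described above, not the Rust implementation; status_after_restart is a hypothetical name.

```python
DETACHED_KINDS = {"Bg", "Daemon", "DaemonWorker"}


def status_after_restart(kind, status):
    """Return the status a goal handle should have after a daemon restart."""
    if status != "Running":
        return status  # terminal rows are left untouched
    if kind == "Interactive":
        return "LostOnRestart"  # the caller is gone
    if kind in DETACHED_KINDS:
        return "Running"  # operator expects these to survive
    raise ValueError(f"unknown SessionKind: {kind}")
```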

Deferred follow-ups

80.10.b = Phase 80.16 — nexo agent attach TTY re-attach (already shipped in DB-only viewer mode)
80.10.c — Daemon supervisor process for Daemon / DaemonWorker kinds (separate process lifecycle distinct from the interactive daemon)
80.10.d — nexo agent kill <goal_id> graceful abort signal
80.10.e — nexo agent logs <goal_id> re-stream goal output without attaching
80.10.f — Phase 77.17 schema-migration system integration
80.10.g — Daemon-side pickup of queued goals: today the CLI inserts the row but no daemon worker consumes it automatically
80.16.b — Live event streaming via NATS subscribe (agent.registry.snapshot.<goal_id> + agent.driver.> filtered by goal_id payload)
80.16.c — User input piping via agent.inbox.<goal_id> (already wired in Phase 80.11 — multi-agent coordination uses the same channel; CLI input piping is the user-facing consumer)

Migrations CLI

Versioned YAML schema migrations are now available for operator config files under config/.

Commands

nexo setup migrate --dry-run (default behavior) — reports pending file migrations and target schema version without writing files.
nexo setup migrate --apply — applies pending migrations in place.

Each migrated file carries a top-level schema_version marker. The loader tolerates this metadata field and strips it before strict typed deserialization.

Boot and hot-reload behavior

runtime.yaml accepts:

migrations:
  auto_apply: true

auto_apply: true makes boot + Phase 18 hot-reload apply pending config schema migrations before loading the runtime snapshot.
auto_apply: false (default) leaves files untouched and prints a pending-migrations warning with file/version pairs.

Notes

The migration functions are idempotent and versioned.
setup migrate --apply is the safest path for explicit review-driven upgrades in production environments.
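What "idempotent and versioned" means in practice can be sketched as below. The rename step is invented purely for illustration and is not a real nexo migration, and treating a file without schema_version as version 0 is an assumption.

```python
def _v0_to_v1(doc):
    # Illustrative step only: rename a made-up key.
    doc = dict(doc)
    if "old_name" in doc:
        doc["new_name"] = doc.pop("old_name")
    return doc


MIGRATIONS = {0: _v0_to_v1}  # key = version the step upgrades from
TARGET_VERSION = 1


def migrate(doc):
    """Apply every pending step, then stamp the resulting version."""
    doc = dict(doc)
    version = doc.get("schema_version", 0)
    while version < TARGET_VERSION:
        doc = MIGRATIONS[version](doc)
        version += 1
    doc["schema_version"] = version
    return doc


def strip_marker(doc):
    """Drop the metadata field before strict typed deserialization."""
    return {k: v for k, v in doc.items() if k != "schema_version"}
```

Running migrate on an already migrated document changes nothing, which is what makes auto_apply safe to run on every boot.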

Docker

Production deployment as a compose stack: nats broker + nexo runtime, Docker secrets for credentials, persistent volumes for SQLite data and the disk queue.

Source: docker-compose.yml, Dockerfile, config/docker/.

Pre-built image at GHCR

Every push to main and every v* tag publishes a multi-arch image (linux/amd64 + linux/arm64) at:

ghcr.io/lordmacu/nexo-rs:latest          # latest tagged release
ghcr.io/lordmacu/nexo-rs:v0.1.1          # exact version
ghcr.io/lordmacu/nexo-rs:edge            # latest main commit
ghcr.io/lordmacu/nexo-rs:main-<sha>      # pinned to a specific commit

Pull and run:

docker pull ghcr.io/lordmacu/nexo-rs:latest
docker run --rm \
  -v "$(pwd)/config:/app/config:ro" \
  -v "$(pwd)/data:/app/data" \
  -p 8080:8080 -p 9090:9090 \
  ghcr.io/lordmacu/nexo-rs:latest

Build pipeline: .github/workflows/docker.yml. Tags + labels follow OCI image spec and are generated by docker/metadata-action. Image carries SBOM and SLSA provenance attestations (verify with docker buildx imagetools inspect).

Compose layout

flowchart LR
    subgraph STACK[docker-compose]
        NATS[nats:2.10<br/>:4222 client<br/>:8222 monitoring]
        AG[nexo<br/>:8080 health<br/>:9090 metrics]
    end
    AG --> NATS

    VOL1[(./config RO)] --> AG
    VOL2[(./data RW)] --> AG
    VOL3[(./extensions RO)] --> AG
    SEC["/run/secrets/..."] --> AG

    IDE[MCP clients] -.->|port 8080| AG
    PROM[Prometheus] -.->|port 9090| AG

`docker-compose.yml`

Two services, healthchecks on both, shared volumes:

nats — nats:2.10-alpine, exposes :4222 for agent clients and :8222 for monitoring (healthcheck hits :8222/healthz)
nexo — the main runtime
- Ports: :8080 (health), :9090 (metrics)
- Environment: RUST_LOG=info, AGENT_ENV=production
- shm_size: 1gb — required for Chrome processes (browser plugin)
- Bind mounts: ./config:/app/config:ro, ./data:/app/data:rw, ./extensions:/app/extensions:ro
- depends_on: { nats: { condition: service_healthy } }

Dockerfile

Multi-stage:

Builder — Rust cargo build --release --locked
Runtime — debian:bookworm-slim with operational tools baked in:
- ca-certificates, libsqlite3-0
- Python + ffmpeg + tmux + yt-dlp + tesseract (for skills that need them)
- Google Chrome on amd64 (OAuth + Widevine work); falls back to Chromium on arm64
- cloudflared (downloaded per TARGETARCH at build time)
- dumb-init as PID 1

Entry point: /usr/local/bin/nexo --config /app/config.

Exposed ports: 8080, 9090.

Config overrides — `config/docker/`

Mirrors the main config layout. The compose service mounts the production overrides path:

command: ["nexo", "--config", "/app/config/docker"]

Key differences in the docker overrides:

broker.yaml — NATS URL points at the Docker service name (nats://nats:4222); persistence at /app/data/queue/broker.db
llm.yaml — reads API keys from /run/secrets/<name>
Other files (agents.yaml, memory.yaml, extensions.yaml) override defaults for container paths

Secrets

The compose file declares Docker secrets and the config overrides reference them:

services:
  nexo:
    secrets:
      - minimax_api_key
      - minimax_group_id
      - google_client_id
      - google_client_secret
secrets:
  minimax_api_key:
    file: ./secrets/minimax_api_key.txt
  minimax_group_id:
    file: ./secrets/minimax_group_id.txt
  ...

Config reads them via the ${file:/run/secrets/...} syntax. Secrets appear as mode-0400 files inside the container — nothing ever touches env vars.
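The ${file:...} substitution can be pictured as a regex replace guarded by a list of allowed roots, similar in spirit to what CONFIG_SECRETS_DIR whitelists. This is a sketch under assumptions: resolve_file_refs is a hypothetical name, the default root and the trimming of trailing whitespace are guesses, and the real loader may behave differently.

```python
import os
import re

FILE_REF = re.compile(r"\$\{file:([^}]+)\}")


def resolve_file_refs(value, allowed_roots=("/run/secrets",)):
    """Replace each ${file:PATH} with the file's contents if PATH is allowed."""
    roots = [os.path.realpath(root) for root in allowed_roots]

    def _read(match):
        path = os.path.realpath(match.group(1))
        # realpath resolves symlinks, so a link pointing outside is rejected.
        if not any(os.path.commonpath([path, root]) == root for root in roots):
            raise PermissionError(f"{path} is outside the allowed secret roots")
        with open(path, encoding="utf-8") as fh:
            return fh.read().strip()

    return FILE_REF.sub(_read, value)
```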

See Configuration — layout.

Operating the stack

docker compose up -d           # start
docker compose logs -f nexo   # follow logs
docker compose exec nexo nexo ext list
docker compose exec nexo nexo dlq list
docker compose restart nexo   # rolling reload (SIGTERM → 5 s grace)
docker compose down            # stop (preserves volumes)

Scaling

Horizontal scaling needs an external NATS cluster. Running the compose stack with two agent replicas pointed at a single NATS server works for isolated workloads, but the compose file does nothing to prevent duplicate delivery across agents subscribed to the same topic. The single-instance lockfile (see Fault tolerance) assumes one agent process per data directory.
For real scale: one NATS cluster + N agent processes, each with its own ./data/ volume.

Health checks for orchestration

services:
  nexo:
    healthcheck:
      test: ["CMD", "curl", "-f", "http://127.0.0.1:8080/ready"]
      interval: 10s
      timeout: 3s
      retries: 3
      start_period: 30s

Readiness gate is /ready (covered in metrics + health). start_period needs to cover first-boot extension discovery + all agent runtimes attaching to their topics.
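Outside compose, a deploy script can gate on the same endpoint. The sketch below polls /ready a bounded number of times; http_ready and wait_until_ready are hypothetical names, the defaults echo the healthcheck values above, and any 2xx response is assumed to mean ready.

```python
import time
import urllib.error
import urllib.request


def http_ready(url="http://127.0.0.1:8080/ready", timeout=3.0):
    """Return True when the readiness endpoint answers with a 2xx status."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return 200 <= resp.status < 300
    except (urllib.error.URLError, OSError):
        return False  # refused, timed out, or non-2xx


def wait_until_ready(probe=http_ready, attempts=3, interval=10.0, sleep=time.sleep):
    """Call probe up to `attempts` times, sleeping `interval` between tries."""
    for attempt in range(attempts):
        if probe():
            return True
        if attempt < attempts - 1:
            sleep(interval)
    return False
```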

Gotchas

Volume ownership. Don't mount ./data as root-owned if your container runs as non-root. The runtime will fail to write the SQLite files and you'll only see cryptic readonly database errors.
Chrome needs /dev/shm space. The shm_size: 1gb is not optional when the browser plugin is active — Chrome processes silently corrupt their state if starved.
config/docker/ is committed, secrets are not. ./secrets/ is gitignored. Populate it before the first compose up.

Slim daemon builds (Cargo feature-gates)

Phase 93.12.a (2026-05-15) introduced Cargo feature-gates for canonical plugin crates so operators targeting embedded or mobile (Android Flutter FFI, slim Docker images) can ship a daemon binary without the optional plugin crates in its compile graph.

Available features

Feature	Default	Drops crate
`plugin-telegram`	✅ on	`nexo-plugin-telegram`
`plugin-whatsapp`	✅ on	`nexo-plugin-whatsapp`
`plugin-browser`	off	(no-op placeholder; browser already has no Cargo dep)

email is NOT feature-gated — structurally in-process by design (Phase 93.11 audit, bucket D). Autonomous worker + EmailToolContext + /metrics rendering all hold Arc<EmailPlugin> in-process. No subprocess driver today.

whatsapp gate (93.12.c.1 + 93.12.c.2, shipped)

Both halves shipped — slim daemon binary can be built without nexo-plugin-whatsapp in its compile graph:

cargo build --release --bin nexo --no-default-features
cargo tree --no-default-features -i nexo-plugin-whatsapp
# expected: error: package ID specification ... did not match any packages

Gated sites:

File	Site	Detail
`src/main.rs`	`RuntimeHealth.wa_pairing`	typed `BTreeMap<String, SharedPairingState>` field
`src/main.rs`	`spawn_whatsapp_pairing_state_subscriber`	broker subscriber fn
`src/main.rs`	`spawn_whatsapp_typing_presence_subscriber`	typing-presence broker bridge fn
`src/main.rs`	`build_known_pairing_registry`	`WhatsappPairingAdapter::new`
`src/main.rs`	admin pairing trigger map	`WhatsappPairingTrigger::from_configs`
`src/main.rs`	instance loop	`wa_pairing` + `wa_tunnel_cfg` population
`src/main.rs`	subscriber spawn block	`spawn_whatsapp_pairing_state_subscriber` call
`src/main.rs`	tunnel auto-open	`/whatsapp/pair` Cloudflare quick tunnel
`src/main.rs`	tool fallback (boot)	`register_whatsapp_tools`
`src/main.rs`	tool fallback (hot-spawn)	`register_whatsapp_tools`
`src/main.rs`	HTTP handler	`/whatsapp/*` route dispatcher
`crates/setup/src/writer.rs`	pairing flow	`session::pair_once` + helpers + dual-shape `wipe_channel_session`
`crates/setup/src/admin_bootstrap.rs`	admin RPC	`with_wa_bot_handle` + outbound translator
`crates/setup/src/admin_adapters.rs`	outbound translator	`WhatsAppTranslator` struct + impl + tests
`crates/setup/tests/channel_outbound_end_to_end.rs`	e2e test	file-level `#![cfg]`

Runtime impact when --no-default-features:

Admin RPC /whatsapp/* returns channel-unavailable.
HTTP /whatsapp/* route returns 404 (handler block absent).
Auto-open Cloudflare quick tunnel for pairing is skipped.
Pairing trigger map has no whatsapp entry — admin pairing/start returns "channel not supported".
Outbound dispatcher rejects whatsapp routes with typed TranslationError::UnsupportedChannel.

WhatsApp still runs as a discovered subprocess if its manifest sits in plugins.discovery.search_paths and the binary is installed — the gate removes only compile-time imports. Subprocess broker path is unaffected.
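A minimal sketch of what "the gate removes only compile-time imports" can look like in code. Only the plugin-whatsapp feature name and TranslationError::UnsupportedChannel come from this page; the translate entry point, the String payload, and the return type are hypothetical and not the daemon's actual code.

```rust
/// Typed rejection used by the outbound dispatcher. The variant name is from
/// the docs; the String payload is an assumption made for this sketch.
#[derive(Debug, PartialEq)]
pub enum TranslationError {
    UnsupportedChannel(String),
}

/// Compiled only when the `plugin-whatsapp` feature is enabled.
#[cfg(feature = "plugin-whatsapp")]
fn translate_whatsapp(body: &str) -> Result<String, TranslationError> {
    // Hypothetical: a default build would call into nexo-plugin-whatsapp here.
    Ok(format!("whatsapp:{}", body))
}

/// Compiled into slim builds instead: no plugin import, just a typed error.
#[cfg(not(feature = "plugin-whatsapp"))]
fn translate_whatsapp(_body: &str) -> Result<String, TranslationError> {
    Err(TranslationError::UnsupportedChannel("whatsapp".to_string()))
}

/// Hypothetical dispatcher entry point shared by both build shapes.
pub fn translate(channel: &str, body: &str) -> Result<String, TranslationError> {
    match channel {
        "whatsapp" => translate_whatsapp(body),
        other => Err(TranslationError::UnsupportedChannel(other.to_string())),
    }
}
```

Both variants keep the same signature, so call sites compile unchanged in either build and only the behaviour behind the gate differs.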

Building a telegram-less daemon

cargo build --release --bin nexo --no-default-features

Verify the crate dropped from the dep graph:

cargo tree --no-default-features -i nexo-plugin-telegram
# expected: error: package ID specification `nexo-plugin-telegram` did not match any packages

cargo tree -i nexo-plugin-telegram (without --no-default-features) prints the canonical nexo-rs v0.1.x parent — proving the gate is the only thing keeping telegram in.

Runtime behaviour

A feature-gated build still runs telegram as a discovered subprocess if its manifest sits in plugins.discovery.search_paths and the nexo-plugin-telegram binary is installed (via cargo install nexo-plugin-telegram or release tarball). The gate removes only the daemon's compile-time imports (pairing adapter constructor + outbound-tool fallback registration). The subprocess path uses broker JSON-RPC, not direct Rust imports, so it is unaffected.

Tradeoff: the feature-disabled daemon loses the daemon-side fallback that registers telegram_* outbound tools into the agent's ToolRegistry if the plugin manifest does not yet declare [[plugin.tools.outbound]]. Standalone telegram v0.3.0+ ships the manifest section, so the fallback is dead weight for any operator running a current plugin binary.

CI matrix

The release workflow validates both shapes:

cargo build --bin nexo                        # default (telegram in)
cargo build --bin nexo --no-default-features  # slim (telegram out)

Both targets must compile clean for release-fast and release profiles before the binary ships.

When to add a new feature-gate

Add plugin-<id> = ["dep:nexo-plugin-<id>"] when all of the following hold:

1. The plugin has a non-trivial Cargo dep with transitive cost (binary size, link time, native dep like OpenSSL).
2. The plugin is genuinely optional for the target audience (Android, embedded, slim Docker).
3. The compile-time integration points are localised, with no cross-crate admin-RPC entanglement that would force the gate to bubble through crates/setup or crates/core.

If any of (1)-(3) fail, prefer subprocess discovery over a feature-gate — manifest-driven runtime decoupling avoids the conditional-compilation noise.

Metrics & health

Prometheus metrics on :9090/metrics, health/readiness on :8080, admin console on 127.0.0.1:9091. Everything an operator or orchestrator needs to decide "is the agent healthy?" without reading logs.

Source: crates/core/src/telemetry.rs, src/main.rs.

Ports at a glance

Port	Binding	Purpose
`:9090`	`0.0.0.0`	Prometheus `/metrics` scrape
`:8080`	`0.0.0.0`	Health `/health`, readiness `/ready`, WhatsApp pairing pages
`:9091`	`127.0.0.1`	Admin console (loopback only)

Ports are not configurable yet — if you need to remap, port-forward outside the agent (Docker, k8s service).

`/metrics` (Prometheus)

Exposed metrics:

Name	Type	Labels	What
`llm_requests_total`	counter	`agent`, `provider`, `model`	Every LLM completion request
`llm_latency_ms`	histogram	`agent`, `provider`, `model`	Buckets 50, 100, 250, 500, 1000, 2500, 5000, 10000 ms
`messages_processed_total`	counter	`agent`	Inbound messages that reached an agent
`nexo_extensions_discovered`	counter	`status={ok,disabled,invalid}`	Emitted on every discovery sweep
`nexo_tool_calls_total`	counter	`agent`, `outcome={ok,error,blocked,unknown}`, `tool`	Tool invocations
`nexo_tool_cache_events_total`	counter	`agent`, `event={hit,miss,put,evict}`, `tool`	Tool-level memoization
`nexo_tool_latency_ms`	histogram	`agent`, `tool`	Per-tool latency
`circuit_breaker_state`	gauge	`breaker`	`0 = Closed`, `1 = Open`; always includes `nats`
`credentials_accounts_total`	gauge	`channel`	Per-channel labelled instance count (Phase 17)
`credentials_bindings_total`	gauge	`agent`, `channel`	`1` when the agent has a credential bound, `0` otherwise
`channel_account_usage_total`	counter	`agent`, `channel`, `direction={inbound,outbound}`, `instance`	Every credential use
`channel_acl_denied_total`	counter	`agent`, `channel`, `instance`	Outbound calls rejected by `allow_agents`
`credentials_resolve_errors_total`	counter	`channel`, `reason`	Resolver failures (`unbound`, `not_found`, `not_permitted`)
`credentials_breaker_state`	gauge	`channel`, `instance`	`0=closed`, `1=half-open`, `2=open`. Per-(channel, instance) circuit breaker — a 429 from one number cannot trip the breaker for a sibling account.
`credentials_boot_validation_errors_total`	counter	`kind`	Gauntlet errors by kind at boot
`credentials_insecure_paths_total`	gauge	—	Credential files with lax permissions at boot
`credentials_google_token_refresh_total`	counter	`account_fp`, `outcome={ok,err}`	Google OAuth refresh attempts (fp = sha256[..8], not raw email)
`pairing_inbound_challenged_total`	counter	`channel`, `result={delivered_via_adapter,delivered_via_broker,publish_failed,no_adapter_no_broker_topic}`	DM-challenge dispatch attempts (Phase 26.x)
`pairing_approvals_total`	counter	`channel`, `result={ok,expired,not_found}`	`nexo pair approve` outcomes (Phase 26.y)
`pairing_codes_expired_total`	counter	—	Setup codes pruned past TTL or rejected as expired on approve
`pairing_bootstrap_tokens_issued_total`	counter	`profile`	Bootstrap tokens minted by `BootstrapTokenIssuer::issue`
`pairing_requests_pending`	gauge	`channel`	Pending pairing requests (push-tracked; `PairingStore::refresh_pending_gauge` exposed for drift recovery after a daemon restart)

Circuit-breaker state for the nats breaker is sampled at scrape time from broker readiness, so a stalled publish path shows up in the next scrape without needing an eager push.

The credentials_* and channel_* series are documented with full schema examples in config/credentials.md. account_fp is always an 8-byte sha256 fingerprint of the account id, never the raw JID or email, so scraped metrics stay safe to share.

Useful alerts

LLM provider flapping

- alert: LlmError5xxHigh
  expr: sum(rate(llm_requests_total{outcome="error"}[5m])) by (provider) > 0.1
  for: 5m

NATS circuit open

- alert: NatsBreakerOpen
  expr: circuit_breaker_state{breaker="nats"} == 1
  for: 1m

Tool call failures

- alert: ToolErrorSpike
  expr: |
    sum(rate(nexo_tool_calls_total{outcome="error"}[5m])) by (tool) > 0.5
  for: 10m

Health endpoints

flowchart LR
    GET1[GET /health] --> OK[200 OK<br/>always<br/>{status:ok}]
    GET2[GET /ready] --> CHK{broker ready<br/>AND agents > 0?}
    CHK -->|yes| RDY[200 OK<br/>{status:ready,<br/>agents_running:N}]
    CHK -->|no| NOT[503 Service Unavailable<br/>{status:not_ready,<br/>broker_ready,<br/>agents_running}]

GET /health — liveness probe. Returns 200 as long as the process is accepting connections. Don't use this as a traffic gate.
GET /ready — readiness probe. Returns 200 only when the broker is ready and at least one agent runtime is attached to inbound topics. Returns 503 during boot, shutdown, or broker outage.
GET /whatsapp/* — QR pairing pages and the /whatsapp/pair tunnel endpoint; see WhatsApp plugin.
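The decision behind the two probes can be sketched as a pure function. The status codes and JSON field names follow the flowchart above; the ProbeResponse type, the function names, and the exact serialization are assumptions made for illustration.

```rust
/// Outcome of a probe: HTTP status code plus a JSON body.
#[derive(Debug, PartialEq)]
pub struct ProbeResponse {
    pub status: u16,
    pub body: String,
}

/// Liveness: 200 whenever the process can answer at all.
pub fn health() -> ProbeResponse {
    ProbeResponse {
        status: 200,
        body: r#"{"status":"ok"}"#.to_string(),
    }
}

/// Readiness: 200 only when the broker is ready AND at least one agent
/// runtime is attached; 503 otherwise, with both inputs echoed back.
pub fn ready(broker_ready: bool, agents_running: usize) -> ProbeResponse {
    if broker_ready && agents_running > 0 {
        ProbeResponse {
            status: 200,
            body: format!(r#"{{"status":"ready","agents_running":{}}}"#, agents_running),
        }
    } else {
        ProbeResponse {
            status: 503,
            body: format!(
                r#"{{"status":"not_ready","broker_ready":{},"agents_running":{}}}"#,
                broker_ready, agents_running
            ),
        }
    }
}
```

The 503 body carries both inputs, so an orchestrator log line is enough to tell a broker outage from a fleet with no agents attached.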

Kubernetes probes

livenessProbe:
  httpGet: { path: /health, port: 8080 }
  initialDelaySeconds: 10
  periodSeconds: 10
readinessProbe:
  httpGet: { path: /ready, port: 8080 }
  initialDelaySeconds: 30
  periodSeconds: 5

initialDelaySeconds: 30 for readiness covers extension discovery and every agent runtime attaching its subscriptions.

Admin console (`:9091`)

Loopback-only. Exposes:

Path	Purpose
`/admin/agents`	Agent directory with live status, session counts
`/admin/tool-policy`	Query the tool-policy registry

The agent status [--endpoint URL] [--agent-id ID] [--json] CLI subcommand hits this endpoint and prints a table or JSON; good for scripting ops without grepping logs.

Remote access requires an explicit tunnel — the port is never exposed publicly by default.

Scrape config sample

# prometheus.yml
scrape_configs:
  - job_name: nexo-rs
    scrape_interval: 15s
    static_configs:
      - targets: ['agent:9090']

For Docker compose: the service name is agent. For k8s: use the service DNS.

Gotchas

circuit_breaker_state is labelled by breaker only, not by provider. Each LLM provider has its own breaker instance, which surfaces as a distinct breaker label value. If you expected {provider="anthropic"} you'll need a label rename in your Prometheus relabel config.
Histograms are non-configurable. Buckets are compiled in. If your SLO requires fine-grained buckets below 50 ms, it is worth opening an issue.
/ready 503 during shutdown is expected. Don't alert on 503 bursts of around 5 s; alert only when /ready has been failing for more than 30 s.
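To see why the compiled-in buckets limit sub-50 ms SLOs, the sketch below maps an observation to the smallest bucket bound that contains it. The bounds are the llm_latency_ms buckets from the metrics table; the constant and function names are hypothetical.

```rust
/// Upper bounds (ms) of the compiled-in `llm_latency_ms` buckets.
const LLM_LATENCY_BUCKETS_MS: [f64; 8] =
    [50.0, 100.0, 250.0, 500.0, 1000.0, 2500.0, 5000.0, 10000.0];

/// Returns the smallest bucket bound `le` with `value_ms <= le`, or `None`
/// when the value only fits the implicit `+Inf` bucket.
pub fn smallest_bucket(value_ms: f64) -> Option<f64> {
    LLM_LATENCY_BUCKETS_MS
        .iter()
        .copied()
        .find(|le| value_ms <= *le)
}
```

A 3 ms and a 49 ms observation both land in the 50 ms bucket, so the histogram cannot tell them apart.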

Logging

tracing under the hood. Human-readable in dev, JSON in production, always to stderr (stdout is reserved for wire protocols like MCP JSON-RPC).

Source: src/main.rs::init_tracing.

Quick reference

Env var	Default	Meaning
`RUST_LOG`	`info`	`EnvFilter` syntax (`nexo_core=debug,async_nats=warn,*=info`)
`AGENT_LOG_FORMAT`	`pretty` (`json` in `AGENT_ENV=production`)	`pretty` \| `compact` \| `json`
`AGENT_ENV`	unset	Set to `production` to default to JSON logs
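How the two variables in the table combine can be sketched as follows. The LogFormat enum and function name are hypothetical, and the fallback for an unrecognised AGENT_LOG_FORMAT value is an assumption; the real logic lives in src/main.rs::init_tracing.

```rust
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum LogFormat {
    Pretty,
    Compact,
    Json,
}

/// Resolves the effective log format from the two environment variables.
pub fn resolve_log_format(agent_log_format: Option<&str>, agent_env: Option<&str>) -> LogFormat {
    // The default depends on AGENT_ENV: json in production, pretty otherwise.
    let default = if agent_env == Some("production") {
        LogFormat::Json
    } else {
        LogFormat::Pretty
    };
    match agent_log_format {
        Some("pretty") => LogFormat::Pretty,
        Some("compact") => LogFormat::Compact,
        Some("json") => LogFormat::Json,
        // Assumption: unset or unrecognised values fall back to the default.
        _ => default,
    }
}
```

A caller would typically feed it std::env::var("AGENT_LOG_FORMAT").ok().as_deref() and the same for AGENT_ENV.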

Levels

Pick the lowest verbosity that still surfaces the signal you care about:

Level	Use
`error`	Unrecoverable — operator action needed
`warn`	Degraded but running (circuit open, retry budget burning)
`info`	Lifecycle (startup, shutdown, reconnects)
`debug`	Per-turn detail (tool invoked, session created)
`trace`	Per-event firehose — only when chasing a bug

Log formats

`pretty` (dev default)

Coloured, multi-line. Good at the terminal, bad in log pipelines.

2026-04-24T17:22:13Z  INFO agent::runtime: agent runtime ready
    at src/main.rs:1243
    in agent_boot with agent="ana"

`compact`

One line per event. Middle ground.

2026-04-24T17:22:13Z INFO agent="ana" agent runtime ready

`json`

Structured. One JSON object per line. Default when AGENT_ENV=production.

{"ts_unix_ms":1714000000000,"level":"INFO","target":"agent::runtime","thread_id":"ThreadId(3)","file":"src/main.rs","line":1243,"spans":[{"name":"agent_boot","agent":"ana"}],"message":"agent runtime ready"}

Every entry carries:

ts_unix_ms — milliseconds since epoch (stable for ingestion)
level, target
thread_id, file, line — for pinpointing
spans — span hierarchy with attached fields
Any structured fields passed via tracing::info!(agent = %id, ...)

Correlating across agents

Cross-agent work lands on agent.route.<target_id> with a correlation_id. In logs, the correlation id shows up as a field on every event that happened inside a delegation span.

flowchart LR
    A[agent A<br/>info: tool_call agent.route.ops] --> MSG[NATS message<br/>correlation_id=req-123]
    MSG --> B[agent B<br/>info: handling agent.route with correlation_id=req-123]
    B --> REPLY[reply on agent.route.A<br/>correlation_id=req-123]
    REPLY --> A2[agent A<br/>info: delegation returned correlation_id=req-123]

Grep logs by correlation_id to see the whole fan-out+in as a single thread.

Structured-field conventions

Convention for fields that show up across the codebase:

Field	Where
`agent`	Any log tied to a specific agent runtime
`session`	Any log inside a session context (usually UUID)
`extension` (or `ext`)	Any log from extension runtimes
`tool`	Any tool invocation log
`provider`, `model`	LLM client logs
`correlation_id`	Delegation-related logs
`topic`	Broker publish/subscribe logs

When adding new code, reuse these names — log pipelines can count on them.

Where stdout goes

stdout is reserved for:

MCP server mode (agent mcp serve) — JSON-RPC traffic
CLI subcommands that return data (agent ext list --json, agent flow show --json, agent dlq list)

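Everything else, including normal log output, goes to stderr. Don't pipe agent … 2>&1 | jq: merging stderr into stdout mixes log lines into the data stream, and jq fails on the first non-JSON line. Pipe stdout alone unless you know the logs are JSON too.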

Practical setups

Local dev

export RUST_LOG=agent=debug,nexo_core=debug,info
cargo run --bin agent -- --config ./config

Production (Docker)

services:
  agent:
    environment:
      AGENT_ENV: production
      RUST_LOG: info,async_nats=warn

Everything lands on stderr → container runtime picks it up → your log pipeline ingests JSON directly.

Chasing a specific agent

export RUST_LOG=agent=info
# then filter by span field; --no-log-prefix keeps each line valid JSON for jq
docker compose logs --no-log-prefix agent | jq 'select(any(.spans[]?; .agent == "ana"))'

Gotchas

tracing output is filtered at runtime by RUST_LOG. If you grep logs for a debug-level event and see nothing, verify RUST_LOG covers the module.
JSON mode drops ANSI colors. Rightly so — but don't pipe it through a TTY colorizer and then be confused by escape sequences.
stderr ordering isn't guaranteed against stdout. Never assume a log line printed right after a println! happens in log order — pipes buffer independently.

Dead-letter queue operations

The DLQ is where events end up when they exhaust their retry budget or fail to deserialize at all. The runtime never silently drops an event — if it can't be delivered, it lands here for an operator to inspect or replay.

Source: crates/broker/src/disk_queue.rs, src/main.rs (agent dlq ... subcommands).

When items land there

flowchart LR
    PUB[publish event] --> NATS{NATS up?}
    NATS -->|yes| OK[delivered]
    NATS -->|no| DQ[pending_events]
    DQ --> DRAIN[disk queue drain]
    DRAIN -->|attempts < 3| DQ
    DRAIN -->|attempts >= 3| DLQ[dead_letters]
    DQ -.->|deserialization error| DLQ

3 attempts (DEFAULT_MAX_ATTEMPTS) without success → row moves to dead_letters
Unparseable payload → moves immediately (a poison pill is not worth retrying)
Circuit-breaker-open on publish counts as an attempt — if the breaker stays open, the queue will eventually flush into DLQ
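The routing rules above can be sketched as a single decision function. DEFAULT_MAX_ATTEMPTS and its value of 3 come from the docs; the DrainResult and NextStep types and the function name are hypothetical and do not mirror crates/broker/src/disk_queue.rs.

```rust
/// Retry budget named in the docs.
pub const DEFAULT_MAX_ATTEMPTS: u32 = 3;

/// Result of one drain attempt for a pending event.
#[derive(Debug, PartialEq)]
pub enum DrainResult {
    Delivered,
    /// Payload could not be parsed: a poison pill.
    DeserializationError(String),
    /// Publish failed, including "circuit breaker open".
    PublishFailed(String),
}

/// Where the row goes next.
#[derive(Debug, PartialEq)]
pub enum NextStep {
    Done,
    Requeue { attempts: u32 },
    DeadLetter { reason: String },
}

pub fn next_step(attempts_so_far: u32, result: DrainResult) -> NextStep {
    match result {
        DrainResult::Delivered => NextStep::Done,
        // Unparseable payloads skip the retry budget entirely.
        DrainResult::DeserializationError(reason) => NextStep::DeadLetter { reason },
        DrainResult::PublishFailed(reason) => {
            // Every failed publish counts as one attempt.
            let attempts = attempts_so_far + 1;
            if attempts >= DEFAULT_MAX_ATTEMPTS {
                NextStep::DeadLetter { reason }
            } else {
                NextStep::Requeue { attempts }
            }
        }
    }
}
```

Because a breaker-open publish is an ordinary PublishFailed here, a breaker that stays open walks every pending row to the DLQ after three drains.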

See Fault tolerance for the full retry flow.

The `DeadLetter` row

struct DeadLetter {
    id: String,          // UUID
    topic: String,       // NATS subject
    payload: String,     // JSON event body
    failed_at: i64,      // unix timestamp (ms)
    reason: String,      // error text
}

Storage: SQLite table dead_letters in the broker DB (typically ./data/queue/broker.db).

CLI

agent dlq list              # list up to 1000 entries
agent dlq replay <id>       # move one entry back to pending_events
agent dlq purge             # delete every entry

`list` output

Columns: id | topic | failed_at | reason. Plain text, one entry per line, suitable for grep / awk piping.

2f9c2e4a-...  plugin.inbound.whatsapp  2026-04-24T17:22:13Z  circuit breaker open
b1a3a9f5-...  plugin.outbound.telegram 2026-04-24T17:23:01Z  deserialization error: unexpected field `...`
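For tooling that wants typed rows instead of awk, a line of list output can be split as sketched below. It assumes the columns are whitespace-separated exactly as in the sample above and that only the reason column may contain spaces; the DlqListRow type and function name are hypothetical.

```rust
/// One parsed line of `agent dlq list` output.
#[derive(Debug, PartialEq)]
pub struct DlqListRow {
    pub id: String,
    pub topic: String,
    pub failed_at: String,
    pub reason: String,
}

/// Splits a line into id, topic, failed_at and a free-text reason.
/// Returns `None` when fewer than three columns are present.
pub fn parse_dlq_line(line: &str) -> Option<DlqListRow> {
    let mut parts = line.split_whitespace();
    let id = parts.next()?.to_string();
    let topic = parts.next()?.to_string();
    let failed_at = parts.next()?.to_string();
    // Everything left is the reason; runs of whitespace collapse to one space.
    let reason = parts.collect::<Vec<_>>().join(" ");
    Some(DlqListRow { id, topic, failed_at, reason })
}
```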

`replay`

Moves the row back to pending_events with attempts = 0:

$ agent dlq replay 2f9c2e4a-...
replayed 2f9c2e4a-... → pending_events (next daemon drain will retry it)

The retry happens on the next drain() cycle of the running agent — replay itself does not attempt delivery. That way a running agent in a different shell picks it up; a stopped agent leaves the event safely in pending_events for its next startup.

`purge`

Destructive. Drops every row in dead_letters:

$ agent dlq purge
purged 42 dead-letter entries

Use with care — there is no per-topic filter. If you need a scoped purge, inspect with list, selectively replay what you want to keep, then purge the rest.

Exit codes

Code	Meaning
0	Success
1	Failure (event not found for replay, DB access error, etc.)

Common workflows

Post-outage triage

# See what piled up during the NATS outage
agent dlq list | wc -l

# Spot-check
agent dlq list | head
agent dlq list | awk '{print $2}' | sort | uniq -c

# If reasons look transient (circuit open, timeouts):
agent dlq list | awk '{print $1}' | while read id; do
  agent dlq replay "$id"
done

Poison-pill cleanup

If reason mentions deserialization errors, the payload is malformed — no amount of retry will help. Collect the offenders, fix the producer side, then:

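agent dlq list | grep deserialization | awk '{print $1}' > /tmp/poison.txt
# ... verify they're truly poison ...
# purge is global, so replay the entries you want to keep first
agent dlq list | grep -v deserialization | awk '{print $1}' | while read -r id; do agent dlq replay "$id"; done
agent dlq purge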

Preview without modifying

The CLI has no --dry-run flag today. Use agent dlq list to preview first; the DB rows are stable until you explicitly replay or purge.

Monitoring

There is no dedicated DLQ metric yet. Approximations:

Sustained time with circuit_breaker_state{breaker="nats"} == 1 strongly predicts DLQ growth, so alert on it.
Consider wrapping agent dlq list | wc -l in a cron job that pushes the count to Prometheus via the textfile collector if you want a direct gauge.

Gotchas

replay doesn't wake a stopped agent. If no agent is running against the same data directory, the row just moves back to pending_events and waits for the next startup drain.
No replay deduplication. Replaying an event that was already successfully delivered later will deliver it again. If your consumer isn't idempotent, spot-check downstream state before replaying.
purge is global. Scope it with list | replay selectively if you need to preserve a subset.

Config hot-reload

Operators rotate per-agent knobs (allowlists, model strings, prompts, rate limits, delegation gates) without restarting the daemon. Sessions currently handling a message finish their turn on the old snapshot; the next event picks up the new one (apply-on-next-message). Plugin configs (whatsapp.yaml, telegram.yaml, …) are not hot-reloadable yet — see limitations.

What triggers a reload

Trigger	Source
File save under `config/`	`notify`-based watcher, debounced 500 ms
`agent reload` CLI	Publishes `control.reload` on the broker
Direct broker publish	Any integration can emit `control.reload`

What's reloaded

Files watched by default (paths relative to the config dir):

agents.yaml
agents.d/ (recursive)
llm.yaml
runtime.yaml

Extra paths listed under runtime.reload.extra_watch_paths are appended to the list.

The fields that apply live without a restart:

Field	Location	Effect
`allowed_tools` (agent + binding)	`agents.d/*.yaml`	Tool list visible to the LLM + per-call guard
`outbound_allowlist`	same	Defense-in-depth in `whatsapp_send_` / `telegram_send_`
`skills`	same	Skill blocks rendered into the system prompt
`model.model` (binding-level)	same	LLM model string on next turn
`system_prompt` + `system_prompt_extra`	same	System block composition
`sender_rate_limit`	same	Per-binding token bucket
`allowed_delegates`	same	Delegation ACL
`providers.<name>.api_key`	`llm.yaml`	Rotated via a fresh `LlmClient` on next turn
`lsp.languages`, `lsp.idle_teardown_secs`, `lsp.prewarm` (agent + binding)	`agents.d/*.yaml`	LSP tool reads policy per call (C2)
`team.max_members`, `team.max_concurrent`, `team.idle_timeout_secs`, `team.worktree_per_member` (agent + binding)	same	Team* tools read policy per call (C2)
`config_tool.allowed_paths`, `config_tool.approval_timeout_secs`	same	Read on the next ConfigTool call (M11 follow-up promotes the rest)
`repl.allowed_runtimes` (agent + binding)	same	ReplTool gates spawn on the per-call allowlist (C2)
`remote_triggers` (agent + binding)	same	RemoteTriggerTool reads allowlist per call
`cron_*` model fields	same	CronCreateTool reads `effective.model` per call
`proactive.tick_interval_secs`, `proactive.jitter_pct`, `proactive.max_idle_secs`	same	Proactive driver reads on the next tick
All Phase 16 binding overrides (`allowed_tools`, `outbound_allowlist`, `skills`, `model.model`, `system_prompt_extra`, `sender_rate_limit`, `allowed_delegates`, `language`, `link_understanding`, `web_search`, `pairing_policy`, `dispatch_policy`, `remote_triggers`, `proactive`, `repl`, `lsp`, `team`, `config_tool`)	`agents.d/*.yaml`, `inbound_bindings[].<field>`	Resolved fresh per snapshot build; consumed at handler entry via `ctx.effective_policy()`

Fields that require a restart (logged as warn during reload):

id, plugins, workspace, skills_dir, transcripts_dir
heartbeat.enabled, heartbeat.interval
config.debounce_ms, config.queue_cap
model.provider (binding-level provider must match agent provider — the LlmClient is wired once per agent)
broker.yaml, memory.yaml, mcp.yaml, extensions.yaml
Boolean enable flips: lsp.enabled, team.enabled, repl.enabled, config_tool.self_edit, proactive.enabled (any per-binding override of these). Flipping false → true requires registering the tool in the per-agent tool_base (immutable post-boot — Arc<ToolRegistry>); flipping true → false would leave a registered-but-refused tool that the LLM still sees in its catalogue. The handler refuses with a <feature>Disabled error in the second case, but operators should restart for clean semantics.
Subsystem actor lifecycle: LspManager child processes, ReplRegistry subprocess pool, TeamMessageRouter broker subscriptions stay alive across reloads. Operator restart is required to recycle child processes (e.g. after a toolchain update for rust-analyzer).

The "boolean enable flips" + "subsystem actor lifecycle" limitations match prior art: upstream agent CLI useManageMCPConnections.ts:624 does invalidate-and-refetch without killing the MCP child stdio process; OpenClaw research/src/plugins/services.ts:33-78 boots plugin services once per process and keeps them resident across config changes.

Adding or removing an agent also requires a restart in this release; see limitations.

Configuration

config/runtime.yaml is optional. Defaults:

reload:
  enabled: true           # master switch
  debounce_ms: 500        # notify-debouncer-full window
  extra_watch_paths: []   # appended to the built-in list
cron:
  one_shot_retry:
    max_retries: 3
    base_backoff_secs: 30
    max_backoff_secs: 1800

Set enabled: false to turn off the file watcher + the control.reload subscriber. The CLI agent reload still works — the daemon never opens a privileged socket, it just listens on the shared broker.

The reload pipeline

file save / CLI / broker
        │
        ▼
  debouncer (500 ms)
        │
        ▼
  AppConfig::load (YAML + env resolution)
        │
        ▼
  validate_agents_with_providers  ──fail──▶  log warn, bump
        │                                    config_reload_rejected_total,
        ▼                                    keep old snapshot
  RuntimeSnapshot::build (per agent)
        │
        ▼
  ArcSwap::store  (atomic per agent)
        │
        ▼
  events.runtime.config.reloaded

Validation failure never swaps. The daemon always serves a snapshot that passed its boot gauntlet.
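The validate-then-swap step can be sketched with the standard library alone. Here RwLock<Arc<...>> stands in for ArcSwap, and the SnapshotCell type, the fields on RuntimeSnapshot, and the validation rule are all hypothetical.

```rust
use std::sync::{Arc, RwLock};

/// Simplified stand-in for the per-agent runtime snapshot.
#[derive(Debug, PartialEq)]
pub struct RuntimeSnapshot {
    pub version: u64,
    pub allowed_tools: Vec<String>,
}

/// Holds the snapshot currently served to an agent.
pub struct SnapshotCell {
    current: RwLock<Arc<RuntimeSnapshot>>,
}

impl SnapshotCell {
    pub fn new(initial: RuntimeSnapshot) -> Self {
        SnapshotCell { current: RwLock::new(Arc::new(initial)) }
    }

    /// Cheap read used on the intake path.
    pub fn load(&self) -> Arc<RuntimeSnapshot> {
        Arc::clone(&self.current.read().unwrap())
    }

    /// Validates the candidate first; the swap happens only on success,
    /// so a rejected reload leaves the old snapshot in place.
    pub fn try_reload(&self, candidate: RuntimeSnapshot) -> Result<u64, String> {
        validate(&candidate)?;
        let version = candidate.version;
        *self.current.write().unwrap() = Arc::new(candidate);
        Ok(version)
    }
}

/// Hypothetical validation rule, standing in for the real gauntlet.
fn validate(snapshot: &RuntimeSnapshot) -> Result<(), String> {
    if snapshot.allowed_tools.iter().any(|tool| tool.is_empty()) {
        Err("empty tool name in allowed_tools".to_string())
    } else {
        Ok(())
    }
}
```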

CLI

# Human-readable output
$ agent reload
reload v7: applied=2 rejected=0 elapsed=18ms
  ✓ ana
  ✓ bob

# Machine-readable
$ agent reload --json
{
  "version": 7,
  "applied": ["ana", "bob"],
  "rejected": [],
  "elapsed_ms": 18
}

Exit codes:

0 — at least one agent reloaded.
1 — no control.reload.ack within 5 s (daemon not running).
2 — every agent rejected (partial-fail signal for CI).
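The exit-code mapping can be sketched as below. The ReloadOutcome struct is trimmed to the two lists that matter, the function name is hypothetical, and mapping an ack with both lists empty to 2 is an assumption the source does not settle.

```rust
/// Trimmed-down view of the reload ack (agent ids only).
pub struct ReloadOutcome {
    pub applied: Vec<String>,
    pub rejected: Vec<String>,
}

/// `ack` is `None` when no control.reload.ack arrived within the timeout.
pub fn exit_code(ack: Option<&ReloadOutcome>) -> i32 {
    match ack {
        // Daemon not running or not answering.
        None => 1,
        // At least one agent reloaded.
        Some(outcome) if !outcome.applied.is_empty() => 0,
        // Nothing applied: every agent was rejected (assumption: an ack
        // with both lists empty is treated the same way).
        Some(_) => 2,
    }
}
```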

Broker contract

Topic	Direction	Payload
`control.reload`	→ daemon	`{requested_by: string}`
`control.reload.ack`	← daemon	serialized `ReloadOutcome`

ReloadOutcome JSON shape:

{
  "version": 7,
  "applied": ["bob"],
  "rejected": [
    {"agent_id": "ana", "reason": "snapshot build: ..."}
  ],
  "elapsed_ms": 18
}

Telemetry

Metric	Type	Labels
`config_reload_applied_total`	counter	—
`config_reload_rejected_total`	counter	—
`config_reload_latency_ms`	histogram	—
`runtime_config_version`	gauge	`agent_id`

Scrape via the metrics endpoint (ops/metrics).

Apply-on-next-message semantics

A reload does not interrupt sessions that are currently handling a message. Specifically:

The LLM turn in flight keeps its captured Arc<RuntimeSnapshot> for the life of the turn — tool calls inside that turn all see the same policy, even if several reloads land during the turn.
The next event delivered to the agent reads the latest snapshot via snapshot.load() on the intake hot path.
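A minimal sketch of these two rules, again with RwLock<Arc<...>> standing in for ArcSwap. The Policy type and the function name are hypothetical; the point is only that a turn reads the Arc it captured at intake, not the shared slot.

```rust
use std::sync::{Arc, RwLock};

/// Simplified per-agent policy carried by a snapshot.
pub struct Policy {
    pub model: String,
}

/// Simulates one turn with a reload landing while it is still running.
/// Returns (model seen by the in-flight turn, model seen by the next event).
pub fn turn_with_reload_midway(slot: &RwLock<Arc<Policy>>, new_model: &str) -> (String, String) {
    // Intake: capture the snapshot once for the whole turn.
    let captured: Arc<Policy> = Arc::clone(&slot.read().unwrap());

    // A reload swaps the shared slot mid-turn.
    *slot.write().unwrap() = Arc::new(Policy { model: new_model.to_string() });

    // Later tool calls in the same turn still read the captured snapshot.
    let seen_in_flight = captured.model.clone();

    // The next event loads whatever is in the slot now.
    let seen_by_next_event = slot.read().unwrap().model.clone();

    (seen_in_flight, seen_by_next_event)
}
```

The old snapshot is freed once the last turn holding its Arc finishes, so several reloads during one long turn cost nothing extra.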

If you need a "force-apply now" semantic (terminate in-flight sessions, respawn), there is no supported way to get it short of restarting the daemon: an agent reload --kick-sessions flag is not implemented yet and is tracked in Phase 19.

Security model

control.reload topic has no application-level auth. Anyone with broker publish rights can trigger a reload. In production with NATS, restrict the control.> subject pattern via NATS account permissions; see NATS with TLS + auth. The local-broker fallback is in-process only — no remote attack surface.
File-watcher trust = filesystem write. Whoever can edit config/agents.d/*.yaml can change capability surface. Treat the config dir as a privileged resource: 0600 on YAML files, 0700 on the directory.
events.runtime.config.reloaded payload includes agent ids and rejection reasons. Subscribers see them. Single-process deployments are fine; in multi-tenant setups, gate the events.runtime.> pattern in NATS auth.
Outbound allowlist scope. The Phase 16 outbound allowlist governs WhatsApp + Telegram tools only. Google tools are gated by the OAuth scopes granted at credential creation (see Per-agent credentials) — there is no per-recipient list for Google.
Apply-on-next-message and tightening reloads. A reload that narrows an allowlist for security reasons does not affect in-flight sessions until they next receive an event. If you need the change to take effect immediately, restart the daemon (or wait for the upcoming agent reload --kick-sessions flag in Phase 19).

Failure modes

Bad YAML: AppConfig::load fails. Old snapshot keeps serving. config_reload_rejected_total bumps. The warn log names the file + line.
Validation errors: aggregate — every problem across every agent shows in one warn block. Fix them in one edit instead of restart-and-repeat.
Unknown provider: rejected at boot + at reload by KnownProviders check. Boot validation lists what's registered.
Missing tool in binding's allowed_tools: caught by the post-registry validation pass during reload.
Agent added / removed: Phase 18 rejects these with a clear message; restart the daemon to reshape the fleet.

Limitations

Intentional scope gaps for Phase 18, tracked for Phase 19:

Add / remove agent at runtime. The coordinator rejects new ids and left-over registered handles with an actionable message. Restart needed.
Plugin config hot-reload (whatsapp.yaml, telegram.yaml, browser.yaml, email.yaml). Plugin daemons own I/O (QR pairing, long-polling). Reshaping them live requires a dedicated lifecycle refactor.
config_reloaded hook for extensions to react. Pending.
SIGHUP trigger as an extra UX path. Deferred — use the broker topic or the CLI.

Plugin trust (cosign + `trusted_keys.toml`)

Phase 31.3. Operators control which plugin authors are trusted by maintaining <config_dir>/extensions/trusted_keys.toml. The nexo plugin install CLI reads this file before extracting any tarball; cosign verification of .sig + .cert (+ optional .bundle) assets gates the install.

The framework's own release signing precedent — see Verifying releases — uses the same Sigstore keyless flow. Plugin trust applies that flow per author, with operator-side allowlisting.

Trust modes

Mode	What happens
`ignore`	Skip cosign verification entirely. Useful for dev / CI / installing a plugin you built locally.
`warn` (default)	Verify when `.sig` + `.cert` are present in the release; if absent, log a stderr warning and proceed unverified.
`require`	Reject any install whose tarball does not produce a valid allowlisted signature.

Mode resolution precedence on each install:

CLI flag (--require-signature / --skip-signature-verify).
Per-author [[authors]] mode field, when the install's owner matches.
Global default field.
Built-in fallback (warn).

Mutually exclusive flags --require-signature + --skip-signature-verify fail the install at parse time.
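The precedence chain and the flag conflict can be sketched as one function. The mode names and FlagsConflict come from this page; the TrustMode and ResolveError types and the function signature are hypothetical.

```rust
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum TrustMode {
    Ignore,
    Warn,
    Require,
}

#[derive(Debug, PartialEq)]
pub enum ResolveError {
    /// --require-signature and --skip-signature-verify were both passed.
    FlagsConflict,
}

/// Resolves the effective trust mode for one install.
/// `author_mode` is the mode of the matching [[authors]] entry, if any;
/// `global_default` is the top-level `default` field, if set.
pub fn resolve_trust_mode(
    require_signature: bool,
    skip_signature_verify: bool,
    author_mode: Option<TrustMode>,
    global_default: Option<TrustMode>,
) -> Result<TrustMode, ResolveError> {
    // 1. CLI flags win, and the two flags are mutually exclusive.
    match (require_signature, skip_signature_verify) {
        (true, true) => return Err(ResolveError::FlagsConflict),
        (true, false) => return Ok(TrustMode::Require),
        (false, true) => return Ok(TrustMode::Ignore),
        (false, false) => {}
    }
    // 2. per-author mode, 3. global default, 4. built-in fallback (warn).
    Ok(author_mode.or(global_default).unwrap_or(TrustMode::Warn))
}
```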

Sample `trusted_keys.toml`

schema_version = "1.0"
default = "warn"

# Optional override; falls back to $PATH walk + well-known
# locations (/usr/local/bin/cosign, /opt/homebrew/bin/cosign,
# ~/go/bin/cosign).
# cosign_binary = "/usr/local/bin/cosign"

[[authors]]
owner = "lordmacu"
identity_regexp = "^https://github\\.com/lordmacu/[^/]+/\\.github/workflows/release\\.yml@.*$"
oidc_issuer = "https://token.actions.githubusercontent.com"
mode = "require"

A copy with comments lives at config/extensions/trusted_keys.toml.example in the repo root.

How `identity_regexp` is matched

Every cosign keyless signature carries a Subject Alternative Name (SAN) on its certificate. In the GitHub Actions flow, the SAN encodes the workflow URL plus the ref:

https://github.com/<owner>/<repo>/.github/workflows/release.yml@refs/tags/v0.2.0

The operator regex must match that string. Make it specific enough to lock in the workflow path but loose enough to tolerate ref / repo additions. Examples:

Goal	Regex
Trust everything from this owner via `release.yml`	`^https://github\.com/lordmacu/[^/]+/\.github/workflows/release\.yml@.*$`
Trust a specific repo only	`^https://github\.com/lordmacu/nexo-plugin-slack/\.github/workflows/release\.yml@.*$`
Trust any workflow in any repo under this owner	`^https://github\.com/lordmacu/.*$`

Required prerequisite: `cosign` on the host

The verifier shells out to cosign verify-blob. Install before using any non-ignore trust mode:

brew install cosign           # macOS
sudo apt install cosign       # Debian/Ubuntu
sudo dnf install cosign       # Fedora/RHEL

The framework pins to cosign 2.4.1 (matching its own release-signing workflow). Any ≥ 2.4 should work; older versions predate the keyless argv shape used here.

CLI flags

# Use the trusted_keys.toml default for this install:
nexo plugin install lordmacu/nexo-plugin-slack@v0.2.0

# Force `Require` for this call regardless of config:
nexo plugin install lordmacu/nexo-plugin-slack@v0.2.0 --require-signature

# Force `Ignore` (skip verification) for this call:
nexo plugin install lordmacu/nexo-plugin-slack@v0.2.0 --skip-signature-verify

JSON output additions

Every install report (--json) now includes:

Field	Value
`signature_verified`	`true` when cosign verification succeeded.
`signature_identity`	SAN string parsed from cosign output (`Subject:` line). Omitted when verification was skipped.
`signature_issuer`	OIDC issuer the cert was minted by.
`trust_mode`	`"ignore"` / `"warn"` / `"require"` — the effective mode used.
`trust_policy_matched`	Repo owner that matched a `[[authors]]` entry, or omitted.

The error report (PluginInstallErrorReport) gains seven new kind values: CosignNotFound, CosignFailed, VerifyIo, PolicyRequiresSig, AssetIncomplete, TrustedKeysParse, IdentityRegexpInvalid, plus FlagsConflict for the parse-time conflict between the mutually exclusive flags.

Troubleshooting

cosign binary not found — install cosign. Or set cosign_binary in your trust file. Or pass --skip-signature-verify for a one-off install of trusted bytes you already vetted.
trust policy requires signature for <owner> — your mode = "require" rejected an unsigned plugin. Ask the author to enable COSIGN_ENABLED=true on their publish workflow (see Publishing a plugin), or relax the per-author mode to warn.
cosign verify-blob exited non-zero — the cert SAN did not match your identity_regexp. Check the publisher's workflow URL (it appears in their release's actions log) and update the regex. Capture the full cosign stderr from the error message for the exact mismatch.
identity_regexp ... invalid: your regex did not compile. Common cause: forgetting to escape . or /. See the syntax documentation of the Rust regex crate.

Capability toggles

Several bundled extensions ship with dangerous capabilities off by default — write paths, secret reveal, cache purges. Each capability is gated by a single environment variable. The operator flips it on by exporting the var in the agent process's environment.

agent doctor capabilities enumerates every known toggle, its current state, and a hint for enabling it.

$ agent doctor capabilities
Capability toggles
──────────────────────────────────────────────────────────────────
EXT          ENV VAR                       STATE     RISK     EFFECT
onepassword  OP_ALLOW_REVEAL               disabled  HIGH     Reveal raw secret values…
onepassword  OP_INJECT_COMMAND_ALLOWLIST   disabled  HIGH     Allow `inject_template` to pipe…
cloudflare   CLOUDFLARE_ALLOW_WRITES       disabled  HIGH     Create / update / delete DNS…
cloudflare   CLOUDFLARE_ALLOW_PURGE        disabled  CRITICAL Purge zone cache…
docker-api   DOCKER_API_ALLOW_WRITE        disabled  HIGH     Start / stop / restart…
proxmox      PROXMOX_ALLOW_WRITE           disabled  CRITICAL VM / container lifecycle…
ssh-exec     SSH_EXEC_ALLOWED_HOSTS        disabled  HIGH     Allow `ssh_run` against…
ssh-exec     SSH_EXEC_ALLOW_WRITES         disabled  CRITICAL Allow `scp_upload`…

Pass --json for machine-readable output (admin UI, dashboards):

agent doctor capabilities --json

Toggle reference

Env var	Extension	Kind	Risk	Effect
`OP_ALLOW_REVEAL`	onepassword	bool	high	Returns secret values verbatim instead of fingerprints
`OP_INJECT_COMMAND_ALLOWLIST`	onepassword	allowlist	high	Enables `inject_template` exec mode for the listed commands
`CLOUDFLARE_ALLOW_WRITES`	cloudflare	bool	high	Authorizes `create_dns_record`, `update_dns_record`, `delete_dns_record`
`CLOUDFLARE_ALLOW_PURGE`	cloudflare	bool	critical	Authorizes `purge_cache`
`DOCKER_API_ALLOW_WRITE`	docker-api	bool	high	Authorizes `start_container`, `stop_container`, `restart_container`
`PROXMOX_ALLOW_WRITE`	proxmox	bool	critical	Authorizes VM/container lifecycle actions
`SSH_EXEC_ALLOWED_HOSTS`	ssh-exec	allowlist	high	Hosts the agent may target with `ssh_run`
`SSH_EXEC_ALLOW_WRITES`	ssh-exec	bool	critical	Authorizes `scp_upload`

Boolean kinds accept true, 1, or yes (case-insensitive). Anything else — including unset — counts as disabled.

Allowlist kinds are comma-separated. Empty / whitespace-only inputs count as disabled. The agent never falls back to "anything goes" when the variable is unset.
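
The two parsing rules can be sketched in POSIX shell. The helper names `toggle_enabled` and `allowlist_contains` are hypothetical, and trimming whitespace around allowlist entries is an assumption; the real checks live inside each extension.

```shell
# Succeeds only for true / 1 / yes (any case); unset or anything else is disabled.
toggle_enabled() {
    value=$(printf '%s' "${1:-}" | tr '[:upper:]' '[:lower:]')
    case "$value" in
        true|1|yes) return 0 ;;
        *)          return 1 ;;
    esac
}

# Succeeds when $2 is an entry of the comma-separated list $1.
# An empty or whitespace-only list never matches anything.
allowlist_contains() {
    list=$1
    wanted=$2
    old_ifs=$IFS
    IFS=,
    set -f                      # no globbing while splitting on commas
    for entry in $list; do
        # trim surrounding whitespace (assumed behaviour)
        entry=$(printf '%s' "$entry" | sed 's/^[[:space:]]*//; s/[[:space:]]*$//')
        if [ -n "$entry" ] && [ "$entry" = "$wanted" ]; then
            IFS=$old_ifs
            set +f
            return 0
        fi
    done
    IFS=$old_ifs
    set +f
    return 1
}

# Example: gate a write path on the documented variable.
if toggle_enabled "${CLOUDFLARE_ALLOW_WRITES:-}"; then
    echo "writes enabled"
else
    echo "writes disabled"
fi
```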

When to enable

The default is off because every toggle moves the agent from "informational" to "consequential" — failures are no longer just a bad reply, they can mutate real systems or leak secrets.

Enable a toggle only when:

The agent will provably need that capability for the next session.
The operator (you) is present and the session is observed.
There is a way to revert quickly — a wrapper script, a per-shell .envrc, or a systemd unit drop-in you can comment out.

Avoid enabling toggles globally in ~/.profile. Scope them to the specific shell or systemd unit that runs the agent.
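
A minimal sketch of process-level scoping: prefixing a command with `VAR=value` puts the variable in that one child's environment and nowhere else. `sh -c` stands in for the agent binary here.

```shell
# Start from a clean state in this shell.
unset CLOUDFLARE_ALLOW_WRITES

# The variable exists only in the environment of this one child process.
scoped=$(CLOUDFLARE_ALLOW_WRITES=true sh -c 'printf "%s" "${CLOUDFLARE_ALLOW_WRITES:-unset}"')

# The parent shell, and anything started from it later, never saw it.
after=$(sh -c 'printf "%s" "${CLOUDFLARE_ALLOW_WRITES:-unset}"')

echo "child saw: $scoped, later processes see: $after"
```

Because nothing is exported, there is nothing to clean up once the child exits.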

How to revoke

Boolean: unset CLOUDFLARE_ALLOW_WRITES (or restart the shell / service).
Allowlist: unset OP_INJECT_COMMAND_ALLOWLIST to disable, or export OP_INJECT_COMMAND_ALLOWLIST= (empty string) to keep the intent visible while still treating the feature as disabled.

The agent reads these on each call (no caching), so revocation takes effect immediately, without a restart, on most paths. OP_INJECT_COMMAND_ALLOWLIST is worth calling out: it is read at tool-call time rather than at extension-spawn time, so it also picks up changes live.

Adding a new toggle

When a future extension introduces a new write/reveal env var, add a matching CapabilityToggle to crates/setup/src/capabilities.rs::INVENTORY. Without that entry, agent doctor capabilities is silently incomplete — the inventory is the operator-facing source of truth.

Backup + restore

Nexo state lives under NEXO_HOME (default ~/.nexo/ for native installs, /var/lib/nexo-rs/ for the systemd package, /app/data/ in the Docker image). Backing it up + restoring it is the operator's responsibility today; a proper nexo backup / nexo restore subcommand is tracked under Phase 36.

Quickest path — `scripts/nexo-backup.sh`

The repo ships a shell script that does the right thing without stopping the daemon:

# Single-shot, output to ./
NEXO_HOME=/var/lib/nexo-rs sudo -E scripts/nexo-backup.sh

# Custom output dir, exclude secrets (default)
scripts/nexo-backup.sh --out /backups/

# Include secret/ for full recovery (encrypt the archive yourself)
scripts/nexo-backup.sh --include-secrets

What it does:

Hot snapshot every SQLite DB via sqlite3 .backup — the official online-backup mechanism. Captures a consistent point-in-time image even with concurrent writers; no daemon stop required.
rsync non-DB state — JSONL transcripts, the agent workspace-git dir if Phase 10.9 is enabled, any operator files dropped under NEXO_HOME. Skips *.tmp, *.lock, and the queue/ disk-queue dir (replays on next boot from NATS, no need to back up).
secret/ excluded by default. Re-run with --include-secrets to include them; encrypt the resulting tarball before transit (use age, gpg, or push to an encrypted bucket).
sha256 manifest at MANIFEST.sha256 inside the archive so restore can verify integrity.
zstd-19 compression — typical 10× ratio over raw SQLite.
Sidecar <archive>.sha256 with the archive's outer hash so backup pipelines can detect transit corruption.
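
Checking the sidecar before extracting an archive takes a few lines of shell. This sketch assumes the sidecar uses the `sha256sum` output format (`<hex digest>  <file name>`); the script's actual sidecar layout is not specified here, and `verify_sidecar` is a hypothetical helper name.

```shell
# verify_sidecar <archive>
# Returns 0 when the archive matches <archive>.sha256, 1 on mismatch,
# 2 when either file is missing (treated as a hard failure).
verify_sidecar() {
    archive=$1
    sidecar="$archive.sha256"
    [ -f "$archive" ] && [ -f "$sidecar" ] || return 2
    expected=$(awk '{print $1; exit}' "$sidecar")      # first field = hex digest
    actual=$(sha256sum "$archive" | awk '{print $1}')
    [ "$expected" = "$actual" ]
}

# Usage:
#   verify_sidecar nexo-backup-20260426T121500Z.tar.zst || exit 1
```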

Restore

# Pull the archive locally first
scp ops@host:/backups/nexo-backup-20260426T121500Z.tar.zst .

# Extract
zstd -dc nexo-backup-20260426T121500Z.tar.zst | tar -xf -

# Verify the manifest
cd nexo-backup-20260426T121500Z
sha256sum -c MANIFEST.sha256

# Stop the daemon (state must not be mid-write)
sudo systemctl stop nexo-rs

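# Replace state. --delete removes anything absent from the archive, so if the
# backup was taken without --include-secrets, add --exclude='secret/' to keep
# the live credentials in place.
sudo rsync -a --delete --chown=nexo:nexo \
  ./ /var/lib/nexo-rs/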

# Start
sudo systemctl start nexo-rs
sudo journalctl -u nexo-rs -f

The daemon must be stopped during the rsync — SQLite WAL files do not survive a parallel-write replacement.

Cron schedule

Drop in /etc/cron.daily/nexo-backup:

#!/bin/sh
set -eu
ARCHIVE_DIR=/backups/nexo
mkdir -p "$ARCHIVE_DIR"

# Snapshot, retain locally
NEXO_HOME=/var/lib/nexo-rs \
    /opt/nexo-rs/scripts/nexo-backup.sh --out "$ARCHIVE_DIR"

# Push to remote (Backblaze, S3, Wasabi, etc.)
rclone copy --include '*.tar.zst*' "$ARCHIVE_DIR" remote:nexo-backups/

# Retain 30 days locally + 90 days remote
find "$ARCHIVE_DIR" -name 'nexo-backup-*.tar.zst*' -mtime +30 -delete
rclone delete --min-age 90d remote:nexo-backups/

Make the script executable with chmod +x /etc/cron.daily/nexo-backup. Single-host operators get a tested daily backup pipeline in a handful of lines.

What survives a backup

Component	In backup	Notes
Long-term memory (vector + relational)	✅	`memory.db`
Transcripts	✅	`transcripts/` JSONL + `transcripts.db` FTS
TaskFlow state	✅	`taskflow.db`
Pairing store + setup-code key	⚠️	DB included; key only with `--include-secrets`
LLM credentials	⚠️	`secret/` only with `--include-secrets`
Per-agent SOUL.md + MEMORY.md	✅	rsync from workspace
Agent workspace git	✅	full `.git` dir included if Phase 10.9 is on
Disk-queue (NATS replay buffer)	❌	regenerates from NATS on boot
Process logs	❌	journalctl handles those separately

Migrations

Schema migrations across Nexo versions are still ad-hoc — ALTER TABLE … .ok() patterns inside the runtime. Phase 36 adds:

nexo migrate status — show the applied vs available migration set
nexo migrate up [target] — apply pending migrations forward
nexo migrate down [target] — roll back if a release ships reversible migrations
A migrations/ dir with versioned, checksummed SQL files

Until then, pin to a specific Nexo version per deployment and test upgrades on a copy of the backup before applying to production.

Status

Tracked as Phase 36 — Backup, restore, migrations.

Sub-phase	Status
`scripts/nexo-backup.sh` shell bridge	✅ shipped
Operator doc (this page)	✅ shipped
`nexo backup --out <dir>` subcommand	⬜ deferred
`nexo restore --from <archive>` subcommand	⬜ deferred
`nexo migrate up/down/status` versioned migrations	⬜ deferred
Encrypted archive output (age / gpg)	⬜ deferred
CI test that backup → restore round-trips on a fixture	⬜ deferred

The shell script + this doc are the bridge. Once the runtime subcommands ship, this page rewrites to point at them and the script gets retired.

Agent memory snapshots

Atomic point-in-time snapshots of an agent's full memory state, packaged as a single verifiable bundle. Built for rollback after a corrupt dream, forensic audit ("what did the agent know at T?"), portable export between hosts, and pre-restore safety nets in autonomous mode.

What goes in a bundle

Layer	Source	In-bundle path
Memory git repo	`<memdir>/.git/`	`git/**`
Operator-curated files	`<memdir>/MEMORY.md` + topic files	`memory_files/**`
Long-term SQLite	`<sqlite>/long_term.sqlite`	`sqlite/long_term.sqlite`
Vector SQLite	`<sqlite>/vector.sqlite`	`sqlite/vector.sqlite`
Concepts	`<sqlite>/concepts.sqlite`	`sqlite/concepts.sqlite`
Compactions	`<sqlite>/compactions.sqlite`	`sqlite/compactions.sqlite`
Extractor cursor	runtime state provider	`state/extract_cursor.json`
Last dream run row	agent registry	`state/dream_run.json`
Manifest	seal	`manifest.json`

Bundle layout on disk

<state_root>/tenants/<tenant>/snapshots/<agent_id>/
├── <id>.tar.zst           # bundle body (or .tar.zst.age when encrypted)
└── <id>.tar.zst.sha256    # whole-file SHA-256 sibling

Two independent integrity checks ride together:

Manifest seal — manifest.bundle_sha256 = SHA-256 of every per-artifact hex digest concatenated in declared order. Verifiable from the manifest alone, no recursion on the tar bytes.
File-level seal — sibling .sha256 text file = SHA-256 of the bundle file as it lives on disk (post-encryption when encrypted). Detects bit-flips during transit / cold storage even when the body is age-wrapped.

Both must pass for verify to report ok.
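
The manifest seal can be reproduced by hand with standard tools. This is a sketch under two assumptions: the digests are joined as lowercase hex text with no separator, and the stand-in digests below replace real per-artifact hashes. `manifest_seal` is a hypothetical helper name.

```shell
# manifest_seal <digest>...
# Hash the per-artifact hex digests, concatenated in the order given.
manifest_seal() {
    joined=""
    for digest in "$@"; do
        joined="$joined$digest"
    done
    printf '%s' "$joined" | sha256sum | awk '{print $1}'
}

# Two stand-in artifact digests (hashes of the strings "a" and "b").
d1=$(printf 'a' | sha256sum | awk '{print $1}')
d2=$(printf 'b' | sha256sum | awk '{print $1}')

seal=$(manifest_seal "$d1" "$d2")
swapped=$(manifest_seal "$d2" "$d1")
echo "seal:    $seal"
echo "swapped: $swapped"    # differs: the declared order is part of the seal
```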

CLI

nexo memory snapshot --agent <id> [--tenant <t>] [--label <s>]
                     [--redact-secrets] [--encrypt age:<recipient>]

nexo memory restore  --agent <id> [--tenant <t>] --from <bundle>
                     [--dry-run] [--no-auto-pre-snapshot]
                     [--decrypt-identity <path>]

nexo memory list     --agent <id> [--tenant <t>] [--json]
nexo memory diff     --agent <id> [--tenant <t>] <id-a> <id-b>
nexo memory export   --agent <id> [--tenant <t>] --id <snapshot-id> --to <path>
nexo memory verify   --bundle <path>
nexo memory delete   --agent <id> [--tenant <t>] --id <snapshot-id>

--tenant defaults to default for single-tenant deployments. Multi-tenant SaaS deployments require explicit values aligned with the canonicalized identifier rules described in capabilities.

nexo memory restore is gated on NEXO_MEMORY_RESTORE_ALLOW=true (see capabilities). Without the flag the subcommand refuses, even with --yes.

Configuration

Lives in config/memory.yaml under memory.snapshot:

memory:
  snapshot:
    enabled: true
    root: ${NEXO_HOME}/state
    auto_pre_dream: false              # opt-in safety net before autoDream
    auto_pre_restore: true             # always snapshot before restore
    auto_pre_mutating_tool: false      # opt-in: pre-Plan-mode mutating tool
    lock_timeout_secs: 60
    redact_secrets_default: true
    encryption:
      enabled: false
      recipients: []                   # age public keys (age1...)
      identity_path: ${NEXO_HOME}/secret/snapshot-identity.txt
    retention:
      keep_count: 30
      max_age_days: 90
      gc_interval_secs: 3600
    events:
      mutation_subject_prefix: "nexo.memory.mutated"
      lifecycle_subject_prefix: "nexo.memory.snapshot"
      mutation_publish_enabled: true

Hot-reload via the standard ConfigReloadCoordinator path: edit YAML and the retention worker picks up the new policy at the next tick.

Lifecycle events (NATS)

Best-effort published when a broker is wired. Subjects are formed from EventsSection.lifecycle_subject_prefix (default nexo.memory.snapshot) — operators that override the prefix in YAML get the override on every event topic.

LifecycleEvent is serde(tag = "kind", rename_all = "snake_case"), so every payload below carries an extra "kind": "<verb>" discriminator field flattened alongside the documented fields:

Subject	Trigger	Payload (after `serde(flatten)`)
`<prefix>.<agent_id>.created`	snapshot success	`{kind:"created", ...SnapshotMeta}` — flattened: `id`, `agent_id`, `tenant`, `label?`, `created_at_ms`, `bundle_path`, `bundle_size_bytes`, `bundle_sha256`, `git_oid?`, `schema_versions`, `encrypted`, `redactions_applied`
`<prefix>.<agent_id>.restored`	restore success	`{kind:"restored", ...RestoreReport}` — flattened: `agent_id`, `from`, `pre_snapshot?`, `git_reset_oid?`, `sqlite_restored_dbs[]`, `state_files_restored[]`, `workers_restarted`, `dry_run`
`<prefix>.<agent_id>.deleted`	delete success	`{kind:"deleted", agent_id, tenant, snapshot_id, ts_ms}`
`<prefix>._all.gc`	retention sweep	`{kind:"gc", ts_ms, report:{bundles_deleted, orphan_staging_dirs_removed, agents_visited, errors}}`

The _all segment in the gc subject is a sentinel — gc events are cross-agent and have no single agent_id to fan-out on. Subscribers filtering with nexo.memory.snapshot.<agent>.> therefore miss gc; use nexo.memory.snapshot.> (or the configured equivalent) to catch both.
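
To see why the per-agent filter misses gc, here is a minimal NATS-style subject matcher in POSIX shell: `*` matches exactly one token and `>` matches one or more trailing tokens. It is an illustration of the wildcard rules only, not broker code, and `subject_matches` is a hypothetical helper name.

```shell
# subject_matches <pattern> <subject>
subject_matches() {
    pattern=$1
    subject=$2
    while [ -n "$pattern" ] && [ -n "$subject" ]; do
        p=${pattern%%.*}      # first token of the pattern
        s=${subject%%.*}      # first token of the subject
        if [ "$p" = ">" ]; then
            return 0          # at least one subject token remains
        fi
        if [ "$p" != "*" ] && [ "$p" != "$s" ]; then
            return 1
        fi
        # drop the first token (and its dot) from both sides
        case "$pattern" in *.*) pattern=${pattern#*.} ;; *) pattern="" ;; esac
        case "$subject" in *.*) subject=${subject#*.} ;; *) subject="" ;; esac
    done
    # match only when both were consumed together
    [ -z "$pattern" ] && [ -z "$subject" ]
}

for filter in 'nexo.memory.snapshot.ana.>' 'nexo.memory.snapshot.>'; do
    if subject_matches "$filter" 'nexo.memory.snapshot._all.gc'; then
        echo "$filter receives gc"
    else
        echo "$filter misses gc"
    fi
done
```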

Mutation events (one per memory write) flow to <events.mutation_subject_prefix>.<agent_id> (default prefix nexo.memory.mutated) when memory.snapshot.events.mutation_publish_enabled = true. Subscribers can stream them into an audit log without forking memory writes.

Encryption

Optional, behind the snapshot-encryption Cargo feature:

cargo build --features snapshot-encryption
nexo memory snapshot --agent ana --encrypt age:age1xyz...
nexo memory restore --agent ana --from <bundle>.tar.zst.age \
                    --decrypt-identity ~/.nexo/secret/snapshot-identity.txt

The body is wrapped in an age stream; the manifest stays plaintext inside the encrypted payload but the per-artifact hashes commit to it. The sibling .sha256 file always covers the bytes that land on disk (post-encryption), so transit integrity stays verifiable without the identity.

Multi-recipient encryption (admin UI)

Phase 90 follow-up — when the snapshot is captured via the admin UI (nexo/admin/memory/create_snapshot { encrypt: true }), the daemon wraps the bundle for every recipient listed under memory.snapshot.encryption.recipients, not just the first. Each operator with a matching identity file can independently restore the bundle.

memory:
  snapshot:
    encryption:
      enabled: true
      recipients:
        - "age1backupadmin..."   # backup operator's age public key
        - "age1dradmin..."       # disaster-recovery operator's key
      identity_path: ${NEXO_HOME}/secret/snapshot-identity.txt

Both recipients above receive a header section in every admin-UI snapshot. Either operator's identity file can decrypt it. Duplicate recipient strings (operator paste-twice typo) are silently deduplicated.

The CLI's single-recipient --encrypt age:age1xyz... flag is unchanged — it remains the power-user / scripted path. To capture a multi-recipient bundle from the CLI today, use the admin RPC via nexo/admin/memory/create_snapshot.

Boot-time validation: at daemon startup the runtime parses every recipient string. A typo (e.g. age1xyz truncated by accident) fails the daemon boot with a clear recipients[N] failed to parse error so operators discover the issue before relying on the encryption.

Threat model

Loss of identity → encrypted bundle is unrecoverable. Mirror identity files into your operator-credential store with the same retention as your other long-lived secrets.
Sibling .sha256 missing → verify reports bundle_sha256_ok = false but does not error. Operators must treat this as a hard fail before restore.
Bundle smaller than the live state → expected: restore overwrites whatever was there, including untracked files in the memdir. Use --dry-run first.
Cross-tenant restore → blocked at path validation. A bundle whose tenant string does not match the request errors with CrossTenantError before any disk mutation.
Last snapshot deletion → delete refuses to drop the agent's only remaining bundle. Retention sweeps obey the same floor.
Auto-pre-snapshot during restore → on by default. Disable with --no-auto-pre-snapshot only when the rollback anchor is unwanted (e.g. you are restoring into a fresh agent with no prior state).
Encrypted bundles + verify → without the identity the per-artifact hashes inside the body cannot be checked; the report's manifest_ok and per_artifact_ok are reported as true by convention while age_protected is set. Operators who must verify the manifest of an encrypted bundle should run verify after a decrypt + restore round-trip.

Retention

A background worker sweeps every gc_interval_secs:

Orphan staging cleanup — any .staging-<id>/ or .restore-staging-<id>/ directory left behind by a process kill is deleted at startup and at every tick.
Per-agent count + age — bundles older than max_age_days or exceeding keep_count are deleted oldest-first via the same delete() path the CLI uses, so the "never delete the last snapshot" floor is respected.
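
The count + age rule can be sketched as a filter over a newest-first listing. This is one reading of the policy, not the worker's implementation: the input format, the `retention_candidates` name, and the choice to protect the newest bundle as the floor are all assumptions.

```shell
# retention_candidates <keep_count> <max_age_days>
# stdin:  "<snapshot-id> <age-in-days>" per line, newest first
# stdout: ids to delete, oldest first
retention_candidates() {
    awk -v keep="$1" -v max_age="$2" '
        { id[NR] = $1; age[NR] = $2 }
        END {
            # Walk oldest-first. Index 1 (the newest) is never a candidate,
            # which preserves the "keep at least one bundle" floor.
            for (i = NR; i > 1; i--)
                if (i > keep || age[i] > max_age) print id[i]
        }'
}

# keep_count=3, max_age_days=90: s1 and s2 exceed the count, oldest goes first.
printf '%s\n' 's5 1' 's4 10' 's3 40' 's2 95' 's1 120' | retention_candidates 3 90
```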

Restore mechanics

The full sequence for a real restore (one run without --dry-run):

verify the bundle. Schema-too-new and checksum mismatch fail here without touching live state.
auto_pre_snapshot (default on): take a snapshot labelled auto:pre-restore-<orig_id> so the operation is reversible.
Acquire the per-agent lock. Concurrent snapshot/restore for the same agent will fail with Concurrent.
Unpack to .restore-staging-<uuid>/.
Tag the live HEAD with pre-restore-<id> so prior state stays reachable via git log pre-restore-<id>.
SQLite swap: each live DB is renamed to <name>.sqlite.pre-restore.bak and the staging copy moves into place. The .bak files survive the restore for manual recovery.
Memdir replace: live memdir is renamed to <memdir>-pre-restore-<id>/ and the staging contents are written on top. Failures roll the rename back.
State provider replay: extractor cursor + last dream-run row.
Drop staging dir + lock.
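
The SQLite swap step is a pair of renames. The sketch below shows the idea for a single database file; rolling back when the second rename fails is an assumption (the text only describes rollback for the memdir step), and `swap_db` is a hypothetical helper name.

```shell
# swap_db <live-db> <staged-db>
swap_db() {
    live=$1                          # e.g. <sqlite>/long_term.sqlite
    staged=$2                        # same file inside .restore-staging-<uuid>/
    backup="$live.pre-restore.bak"   # kept afterwards for manual recovery
    mv "$live" "$backup" || return 1
    if ! mv "$staged" "$live"; then
        mv "$backup" "$live"         # roll back so the live DB is never missing
        return 1
    fi
}

# Demo on throwaway files.
demo=$(mktemp -d)
mkdir "$demo/staging"
printf 'old' > "$demo/long_term.sqlite"
printf 'new' > "$demo/staging/long_term.sqlite"
swap_db "$demo/long_term.sqlite" "$demo/staging/long_term.sqlite"
ls "$demo"    # long_term.sqlite  long_term.sqlite.pre-restore.bak  staging
```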

Admin RPC surface (Phase 90.x.memory-snapshot + .create-restore)

The nexo-plugin-admin SPA at /m/memory drives four admin RPCs that mirror the CLI's list, delete, snapshot, and restore verbs. All four are gated by the memory_snapshot capability, so operators that already grant the original pair (list_snapshots + delete_snapshot) automatically extend create and restore access through the same trust boundary.

Method	Capability	Behaviour
`nexo/admin/memory/list_snapshots`	`memory_snapshot`	Newest-first list + `encryption_available` flag
`nexo/admin/memory/delete_snapshot`	`memory_snapshot`	Idempotent removal by `snapshot_id`
`nexo/admin/memory/create_snapshot`	`memory_snapshot`	Capture fresh bundle (`label?`, `encrypt?`)
`nexo/admin/memory/restore_snapshot`	`memory_snapshot`	Restore by `snapshot_id` (`dry_run?`)

Defaults forced server-side

Unlike the CLI, the admin path forces a fixed contract so operator mistakes via the SPA don't leak secrets or skip the safety net:

redact_secrets = true — UI-driven snapshots always run the secret-guard scanner. The CLI keeps --no-redact for power users who want raw bundles.
auto_pre_snapshot = true — every UI restore captures a pre-restore bundle so the operation is reversible. The CLI keeps --no-auto-pre-snapshot for fresh-agent restores.
created_by = "admin-ui" — provenance trace lands in the bundle manifest's created_by column for audit reads.

Restore by `snapshot_id`, not `bundle_path`

The wire never carries a filesystem path. The daemon resolves snapshot_id → bundle_path via its own list() lookup before opening the bundle. This forecloses on accidentally turning the admin endpoint into an arbitrary-file-read primitive.
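
The lookup-before-open idea can be illustrated in shell: the caller's id is only ever compared against ids derived from the directory listing, so a crafted value such as `../secret` never becomes part of a path. This is a sketch of the principle, not the daemon's list() code, and `resolve_bundle` is a hypothetical helper name.

```shell
# resolve_bundle <snapshot-dir> <snapshot-id>
# Prints the bundle path when the id matches a listed bundle, fails otherwise.
resolve_bundle() {
    dir=$1
    wanted=$2
    for f in "$dir"/*.tar.zst "$dir"/*.tar.zst.age; do
        [ -f "$f" ] || continue          # unmatched glob stays literal: skip it
        name=${f##*/}                    # strip the directory part
        id=${name%%.tar.zst*}            # strip .tar.zst or .tar.zst.age
        if [ "$id" = "$wanted" ]; then
            printf '%s\n' "$f"
            return 0
        fi
    done
    return 1
}

# Usage:
#   bundle=$(resolve_bundle "$snapshot_dir" "$snapshot_id") || echo "unknown snapshot id"
```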

Defensive tenant validation

restore_snapshot requires tenant in the params. The adapter reads the bundle manifest's recorded tenant and refuses if they disagree, with both tenants quoted in the error. Operator typos that would have crossed staging ↔ prod accidentally are caught before any disk mutation.

Encryption recipient resolution

When create_snapshot is invoked with encrypt: true, the daemon resolves the age recipients from memory.snapshot.encryption.recipients (every configured entry, as described under Multi-recipient encryption). The wire never carries a recipient string, and operators rotate recipients via YAML + restart. The same EncryptionSection clone surfaces encryption_available on every list response so the SPA can grey out the encrypt toggle when no recipients are configured.

For restore of an encrypted bundle the adapter resolves identity_path from the same EncryptionSection. Missing identity_path with an encrypted bundle errors with "encrypted but no identity_path configured; restore via CLI".

Dry-run UX

restore_snapshot { dry_run: true } runs the full validation pipeline (tenant check + bundle resolution + identity resolution) but stops short of mutating live state. The returned RestoreReportWire { dry_run: true } carries the sqlite_restored_dbs[] and state_files_restored[] the SPA renders as a preview table — the operator inspects the diff before flipping the toggle and re-issuing destructively.

Lock semantics

Restore takes the same per-agent AgentLockMap lock the CLI uses. A restore against an agent already holding the lock (concurrent snapshot, retention sweep, second restore) will time out with Concurrent after lock_timeout_secs. The handler bubbles the error through; the SPA renders it as a retryable warning.

Health checks

Three layers of health probes for a Nexo deployment, each tuned for a different consumer:

/health — liveness. Cheap (atomic flag check). HTTP 200 means the process is up; doesn't guarantee it can serve work.
/ready — readiness. Expensive (verifies broker connection, agents loaded, snapshot warm). HTTP 200 means the runtime can accept inbound traffic. Use this for load-balancer health checks.
scripts/nexo-health.sh — operator + monitoring. JSON summary with counter snapshots. Bridge until nexo doctor health (Phase 44) ships.

Liveness — `/health`

Returns HTTP 200 + ok body when the agent process is alive. The runtime sets a RUNNING flag at startup and clears it on graceful shutdown. Does not verify any subsystem — useful for "is the daemon there at all" probes.

curl -fsSL http://127.0.0.1:8080/health
# ok

Kubernetes liveness probe:

livenessProbe:
  httpGet:
    path: /health
    port: 8080
  initialDelaySeconds: 30
  periodSeconds: 10
  timeoutSeconds: 3
  failureThreshold: 3

A failing liveness probe should restart the container. Be generous on initialDelaySeconds — first-boot extension discovery + memory open + agent runtime spin-up can take 15-25s.

Readiness — `/ready`

Returns 200 only when all of:

Broker (NATS or local) is reachable
Every configured agent has loaded its tool registry
The hot-reload snapshot has been warmed (Phase 18)
Pairing store is open (if pairing_policy.auto_challenge is on)

Returns 503 with a JSON body listing the failing subsystem otherwise:

{
  "ready": false,
  "reasons": [
    {"subsystem": "broker", "detail": "nats://localhost:4222: connection refused"}
  ]
}

Use this for load-balancer / service-mesh routing decisions. A node that's live but not ready should not receive traffic.

readinessProbe:
  httpGet:
    path: /ready
    port: 8080
  periodSeconds: 5
  timeoutSeconds: 2
  failureThreshold: 1

Operator one-shot — `scripts/nexo-health.sh`

Single-shot JSON summary intended for watch -n 5 nexo-health.sh during ops, cron health-mailers, and uptime monitors that want one structured payload covering everything.

# Default — pretty human output
scripts/nexo-health.sh

# JSON only (cron, monitoring scrapers)
scripts/nexo-health.sh --json

# Custom hosts (e.g., probing through a service mesh)
scripts/nexo-health.sh --host nexo.internal:8080 \
                      --metrics-host nexo.internal:9090

# Strict mode — open circuit breaker counts as unhealthy.
# Default mode tolerates breaker-open (degraded-but-up).
scripts/nexo-health.sh --strict

Pretty output:

============================================================
 nexo-rs health  ·  2026-04-26T15:30:00Z
============================================================

  overall:      ok
  admin:        127.0.0.1:8080
  metrics:      127.0.0.1:9090

  probes:
    ✓ live       ok
    ✓ ready      ok
    ✓ metrics    ok

  counters:
    tool_calls_total              4711
    llm_stream_chunks_total       28391
    web_search_breaker_open_total 0

JSON shape (for monitoring scrapers):

{
  "overall": "ok",
  "timestamp": "2026-04-26T15:30:00Z",
  "endpoints": { "admin": "127.0.0.1:8080", "metrics": "127.0.0.1:9090" },
  "probes": [
    {"name": "live",    "status": "ok", "detail": "ok"},
    {"name": "ready",   "status": "ok", "detail": "{...}"},
    {"name": "metrics", "status": "ok", "detail": "# HELP nexo_..."}
  ],
  "counters": {
    "tool_calls_total":              4711,
    "llm_stream_chunks_total":       28391,
    "web_search_breaker_open_total": 0
  }
}

Exit codes:

0 — overall healthy
1 — at least one probe failed (or --strict and a breaker is open)

Cron health mailer

# /etc/cron.d/nexo-health
# cron does not support backslash line continuations: keep the entry on one line.
*/5 * * * * nexo /opt/nexo-rs/scripts/nexo-health.sh --json --strict >> /var/log/nexo-rs/health.jsonl 2>&1 || (tail -1 /var/log/nexo-rs/health.jsonl | mail -s "nexo unhealthy" ops@yourorg)

Five-minute resolution, one line of JSONL per check, mail on failure.

Uptime monitor integration

UptimeRobot / BetterStack / Pingdom:

URL:        https://nexo.example.com/ready
Interval:   60s
Timeout:    5s
Expected:   HTTP 200

That's all most monitors need. The JSON body of /ready explains the failure when the alert fires.

What `nexo-health.sh` adds beyond `/ready`

Signal	`/ready`	`nexo-health.sh`
Process up + accepting traffic	✅	✅
Counter snapshot (tool calls, LLM chunks)	❌	✅
Web-search breaker state	❌	✅
Single JSON payload	❌ (HTTP 200/503)	✅
Suitable for HTTP probe	✅	❌ (shells out)

Use /ready for the orchestrator. Use nexo-health.sh for the operator's eyeballs and the alerting pipeline.

Status

Tracked as Phase 44 — Auxiliary observability surfaces.

Capability	Status
`/health` liveness endpoint	✅ shipped (Phase 9)
`/ready` readiness endpoint	✅ shipped (Phase 9)
`scripts/nexo-health.sh` operator one-shot	✅ shipped
Operator runbook (this page)	✅ shipped
`nexo doctor health` aggregating subcommand	⬜ deferred
`nexo inspect <session_id>` state-transition pretty-print	⬜ deferred
Per-session structured event log under `data/events/`	⬜ deferred

Cost & quota controls

Operator runbook for tracking + capping LLM spend. Today the runtime emits enough Prometheus metrics for an operator to build their own picture; the proper nexo costs subcommand + budget caps land in Phase 45.

Estimating spend — `scripts/nexo-cost-report.sh`

Aggregates nexo_llm_stream_chunks_total by provider, multiplies by a price table, prints (or emits JSON) per-provider rolling totals.

# Human-readable report against the local /metrics endpoint
scripts/nexo-cost-report.sh

# JSON for monitoring / dashboards
scripts/nexo-cost-report.sh --json

# Custom price table (your negotiated enterprise rates)
scripts/nexo-cost-report.sh --prices ~/our-enterprise-rates.tsv

# Probe a remote daemon
scripts/nexo-cost-report.sh --metrics-host nexo.internal:9090

Pretty output:

============================================================
 nexo-rs cost report  ·  2026-04-26T15:30:00Z
============================================================

  PROVIDER                    CHUNKS     EST_TOKENS    EST_USD
  anthropic                    28391          85173    $0.7666
  minimax                       4711          14133    $0.0042
  ollama                        1208           3624    $0.0000

  total estimated: $0.7708

  disclaimer: heuristic estimate. Calibrate
    NEXO_TOKENS_PER_CHUNK once you have a measured baseline.
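
The arithmetic behind each row is chunks × tokens-per-chunk × price ÷ 1,000,000. In the sketch below, the blended rate of $9.00 per 1M tokens is an assumption (the midpoint of the claude-sonnet-4 input and output list prices). It happens to reproduce the anthropic row above, but how the script actually blends input and output prices is not stated here. `est_usd` is a hypothetical helper name.

```shell
# est_usd <chunks> <tokens_per_chunk> <usd_per_1m_tokens>
est_usd() {
    awk -v c="$1" -v t="$2" -v p="$3" \
        'BEGIN { printf "%.4f\n", c * t * p / 1000000 }'
}

est_usd 28391 3 9.00    # 85173 estimated tokens -> 0.7666
est_usd 1208 3 0        # local ollama models    -> 0.0000
```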

Calibration

The default tokens-per-chunk = 3 is a heuristic. To get an accurate number for your deployment:

Find a typical conversation in transcripts (session_logs tool output).
Sum the usage.total_tokens from the chat.completion end event(s).
Divide by the total chunk count emitted during that conversation (visible in nexo_llm_stream_chunks_total{provider="...",kind="text_delta"}).
Set NEXO_TOKENS_PER_CHUNK env to the result.

Example:

# Anthropic typical: 4-token granularity per delta
NEXO_TOKENS_PER_CHUNK=4 scripts/nexo-cost-report.sh

# OpenAI typical: 1 token per delta on streaming
NEXO_TOKENS_PER_CHUNK=1 scripts/nexo-cost-report.sh
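
The division in the calibration steps is simple enough to script. The token and chunk counts below are made-up sample values, and `tokens_per_chunk` is a hypothetical helper name.

```shell
# tokens_per_chunk <total_tokens> <chunk_count>
tokens_per_chunk() {
    awk -v tokens="$1" -v chunks="$2" 'BEGIN {
        if (chunks <= 0) exit 1          # no chunks observed: nothing to calibrate
        printf "%.2f\n", tokens / chunks
    }'
}

# Sample conversation: 1840 total tokens over 460 streamed chunks.
tokens_per_chunk 1840 460    # prints: 4.00
```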

When the runtime ships nexo_llm_tokens_total{provider,model,direction} (Phase 45 deliverable), the heuristic is replaced by direct token counts and the calibration step disappears.

Built-in price table

Provider	Model	$/1M in	$/1M out
anthropic	claude-opus-4	15.00	75.00
anthropic	claude-sonnet-4	3.00	15.00
anthropic	claude-haiku-4	0.80	4.00
openai	gpt-4o	2.50	10.00
openai	gpt-4o-mini	0.15	0.60
minimax	abab6.5s	0.20	0.60
minimax	M2.5	0.30	1.50
gemini	gemini-1.5-pro	1.25	5.00
gemini	gemini-1.5-flash	0.075	0.30
deepseek	deepseek-chat	0.14	0.28
ollama	*	0.00	0.00

These are public list prices as of 2026-04. Operators with enterprise contracts override via --prices:

provider	model	in_per_1m	out_per_1m
anthropic	claude-sonnet-4	2.40	12.00
openai	gpt-4o	2.00	8.00

(One row per provider×model. * model = applies to any model from that provider.)

Daily budget alerts via cron

Snapshot every 24h, mail the operator if estimated spend > cap:

#!/bin/sh
# /etc/cron.daily/nexo-cost-alert
set -eu
CAP=10.00            # $/day soft cap

REPORT=$(/opt/nexo-rs/scripts/nexo-cost-report.sh --json)
TOTAL=$(echo "$REPORT" | jq -r '.total_estimated_usd')

if awk -v t="$TOTAL" -v c="$CAP" 'BEGIN { exit !(t > c) }'; then
    echo "$REPORT" | mail -s "nexo daily spend over \$$CAP: \$$TOTAL" \
        ops@yourorg.com
fi

This is alerting only, not enforcement — the runtime keeps serving traffic. For hard caps, wait for Phase 45.

Hard quota caps (deferred)

Phase 45 will ship per-agent daily and monthly budget caps:

# config/agents.yaml — once 45.x lands
agents:
  - id: kate
    cost_cap_usd:
      monthly: 50.00
      daily: 5.00
      action: refuse_new_turns   # or: warn_only, throttle
      warn_topic: alerts.kate.budget

When hit:

refuse_new_turns — agent returns a fixed response ("I've reached my budget for the period; please ask the operator to extend.") to every new inbound. Existing in-flight turns finish.
warn_only — log + telemetry but keep serving.
throttle — switch to a cheaper model variant (claude-haiku-4 instead of claude-opus-4) for the rest of the period.

Per-binding token rate limits (e.g. "WhatsApp sales binding capped at 5k tokens/hour") layer on top of the existing sender_rate_limit. Phase 45.x.

Inspecting the metrics directly

If the script is too coarse:

# Providers ranked by cumulative chunk count (raw counter values, not a rate)
curl -sS http://127.0.0.1:9090/metrics | \
    awk '/^nexo_llm_stream_chunks_total/{gsub(/.*provider="/, "", $1); gsub(/".*/, "", $1); n[$1]+=$2} END{for (p in n) print n[p], p}' | \
    sort -rn

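# TTFT p95 by provider. Requires promtool and a Prometheus server that scrapes nexo;
# the URL below must be that Prometheus server, not nexo's own /metrics port.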
promtool query instant http://127.0.0.1:9090 \
    'histogram_quantile(0.95, sum by (provider, le) (rate(nexo_llm_stream_ttft_seconds_bucket[5m])))'

The full metric inventory lives in Grafana dashboards → metric coverage (in repo as ops/grafana/README.md).

Status

Tracked as Phase 45 — Cost & quota controls.

Capability	Status
`scripts/nexo-cost-report.sh` heuristic estimator	✅ shipped
Operator runbook (this page)	✅ shipped
`nexo_llm_tokens_total{provider,model,direction}` metric	⬜ deferred
Per-agent monthly budget cap (config + enforcement)	⬜ deferred
`agents.<id>.cost_cap_usd` schema	⬜ deferred
Per-binding token rate limit	⬜ deferred
Pre-flight token-count predictor in agent prompt	⬜ deferred
`nexo costs` CLI rolling 24h/7d/30d aggregator	⬜ deferred
`/api/costs` admin endpoint	⬜ deferred

Privacy toolkit

GDPR-style operator workflows for handling user data requests until the proper nexo forget / nexo export-user subcommands ship (tracked under Phase 50).

Right to be forgotten

scripts/nexo-forget-user.sh performs a cascading delete across every SQLite DB and JSONL transcript under NEXO_HOME, then VACUUMs the databases so the deleted rows don't survive in free pages.

# Stop the daemon first so the script is the only writer and VACUUM can run
sudo systemctl stop nexo-rs

# DRY RUN — shows what would be deleted, doesn't change anything
NEXO_HOME=/var/lib/nexo-rs sudo -E scripts/nexo-forget-user.sh \
  --id "+5491155556666"

# When the dry-run looks right, re-run with --apply
NEXO_HOME=/var/lib/nexo-rs sudo -E scripts/nexo-forget-user.sh \
  --id "+5491155556666" \
  --apply

# Restart
sudo systemctl start nexo-rs

What gets deleted (cascading across all DBs):

Table column	Match	Source DB
`user_id`	exact	every DB
`sender_id`	exact	every DB (used in pairing, transcripts)
`account_id`	exact	every DB (used in WA / TG plugins)
`contact_id`	exact	memory + transcripts
`peer_id`	exact	agent-to-agent routing

Plus JSONL transcript lines where any of those keys equals the target id.
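The matching rule for transcript lines can be illustrated in a few lines of Python. This is a sketch of the rule as described here, not the script's actual implementation; `forget_lines` is a hypothetical helper, and keeping unparseable lines for manual review is an assumption.

```python
import json

# The id columns listed in the table above
ID_KEYS = ("user_id", "sender_id", "account_id", "contact_id", "peer_id")

def forget_lines(lines, target_id):
    """Return (kept_lines, removed_count) for one JSONL transcript."""
    kept, removed = [], 0
    for line in lines:
        try:
            record = json.loads(line)
        except json.JSONDecodeError:
            kept.append(line)  # assumption: leave damaged lines for manual review
            continue
        # Exact match on ANY of the id keys removes the line
        if isinstance(record, dict) and any(record.get(k) == target_id for k in ID_KEYS):
            removed += 1
        else:
            kept.append(line)
    return kept, removed

transcript = [
    '{"sender_id": "+5491155556666", "text": "hi"}',
    '{"sender_id": "+5491100000000", "text": "hello"}',
    '{"user_id": "ana", "peer_id": "+5491155556666"}',
]
kept, removed = forget_lines(transcript, "+5491155556666")
print(removed, len(kept))
```

The removed count is the kind of number that ends up in the deletion manifest.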

The script emits forget-user-<id>-<timestamp>.json with the exact deletion counts. This is the operator's GDPR audit trail; send it back to the requester as proof of compliance.

`--keep-audit` flag

Strict GDPR says even the admin-audit row recording the deletion should be removed (the user has the right to no trace). But that breaks operator audit chains. Use --keep-audit to opt out of that single specific erasure:

nexo-forget-user.sh --id "<id>" --apply --keep-audit

The script keeps the admin_audit table row showing that the deletion happened, with the user id stored only as a hash rather than in clear text. All other tables are fully wiped either way.

Right to data export

Until nexo export-user --id <id> ships, manual SQL works:

USER_ID="+5491155556666"
OUT_DIR="export-${USER_ID}-$(date -u +%Y%m%dT%H%M%SZ)"
mkdir -p "$OUT_DIR"

# Stop the daemon for a consistent point-in-time export
sudo systemctl stop nexo-rs

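# Per-DB extraction: one JSON file per table that carries an id column.
# Uses the same id columns the forget script matches on.
ID_COLS="'user_id','sender_id','account_id','contact_id','peer_id'"
ID_SQL=$(printf '%s' "$USER_ID" | sed "s/'/''/g")   # escape quotes for SQL literals
for db in /var/lib/nexo-rs/*.db; do
    name=$(basename "$db" .db)
    # Tables that have at least one id column
    sqlite3 "$db" "SELECT DISTINCT m.name
                     FROM sqlite_master m, pragma_table_info(m.name) p
                    WHERE m.type = 'table' AND p.name IN ($ID_COLS);" |
    while IFS= read -r table; do
        # OR together only the id columns this table actually has
        where=""
        for col in $(sqlite3 "$db" "SELECT name FROM pragma_table_info('$table')
                                     WHERE name IN ($ID_COLS);"); do
            where="${where:+$where OR }\"$col\" = '$ID_SQL'"
        done
        # -json needs sqlite3 >= 3.33 (same requirement as .mode json)
        sqlite3 -json "$db" "SELECT * FROM \"$table\" WHERE $where;" \
            > "$OUT_DIR/${name}.${table}.json"
    done
done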

# Per-JSONL extraction (a line matches if ANY id key equals the target)
for f in /var/lib/nexo-rs/transcripts/*.jsonl; do
    name=$(basename "$f")
    jq -c \
        --arg id "$USER_ID" \
        'select([.user_id, .sender_id, .account_id, .contact_id, .peer_id] | any(. == $id))' \
        "$f" > "$OUT_DIR/$name"
done

# Restart
sudo systemctl start nexo-rs

# Tar + zstd, optionally encrypt
tar -C "$(dirname "$OUT_DIR")" -cf - "$(basename "$OUT_DIR")" | \
    zstd -19 -T0 > "${OUT_DIR}.tar.zst"

# (Recommended) age-encrypt before transit
age -r age1xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx \
    -o "${OUT_DIR}.tar.zst.age" \
    "${OUT_DIR}.tar.zst"
shred -u "${OUT_DIR}.tar.zst"

The result is a tarball the operator hands to the requester — JSON files per DB + filtered transcript JSONLs — encrypted with the requester's age public key.

When nexo export-user --id <id> ships, this whole shell pipeline collapses into one command with built-in encryption.

Retention policy

Operator-defined per deployment. Recommended defaults:

Surface	Retention	Why
Transcripts	90 days	Enough for ops debugging + agent recall
Memory (long-term)	indefinite	Agent's working memory; pruned by recall signals
TaskFlow finished flows	30 days	Audit trail for completed work
TaskFlow failed flows	365 days	Forensics
Admin audit log	365 days	Compliance
Disk-queue (NATS replay)	7 days	Disaster recovery
Pairing pending requests	60 min	TTL-enforced by the store

Apply via cron (until nexo retention apply ships):

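#!/bin/sh
# /etc/cron.daily/nexo-retention
set -eu
DB=/var/lib/nexo-rs/transcripts.db

# 90-day rolling window on transcripts (timestamp assumed to be Unix epoch seconds;
# the CAST keeps the comparison numeric even if the column has no declared type)
sqlite3 "$DB" "DELETE FROM transcripts
                WHERE timestamp < CAST(strftime('%s', 'now', '-90 days') AS INTEGER);"
sqlite3 "$DB" 'VACUUM;'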

# Same for taskflow finished + failed
DB=/var/lib/nexo-rs/taskflow.db
sqlite3 "$DB" "DELETE FROM flows
                WHERE status='Finished'
                  AND finished_at < datetime('now', '-30 days');"
sqlite3 "$DB" "DELETE FROM flows
                WHERE status='Failed'
                  AND finished_at < datetime('now', '-365 days');"
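Before scheduling destructive deletes, it can help to count what a retention window would remove. The sketch below does that against an in-memory database; `count_expired` is a hypothetical helper, and the single-column `transcripts` table is a stand-in that assumes `timestamp` holds Unix epoch seconds, as the cron script implies.

```python
import sqlite3
import time

def count_expired(conn, table, column, days, now=None):
    """Count rows older than the retention window without deleting anything."""
    now = int(time.time()) if now is None else now
    cutoff = now - days * 86400
    # table/column are operator-supplied constants, never user input
    (n,) = conn.execute(
        f"SELECT COUNT(*) FROM {table} WHERE {column} < ?", (cutoff,)
    ).fetchone()
    return n

# Stand-in schema for the demo; the real transcripts table has more columns
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE transcripts (id INTEGER PRIMARY KEY, timestamp INTEGER)")
now = 1_700_000_000
conn.executemany(
    "INSERT INTO transcripts (timestamp) VALUES (?)",
    [(now - 91 * 86400,), (now - 89 * 86400,), (now,)],
)
print(count_expired(conn, "transcripts", "timestamp", 90, now=now))  # only the 91-day-old row
```

Binding the cutoff as an integer parameter sidesteps any text-versus-number comparison surprises.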

PII detection (deferred)

Phase 50 plans inbound PII flagging — separate from the existing outbound redactor. The rough shape:

Regex pre-screen for SSN-shape, credit-card-shape (Luhn-checked), phone-number-shape per locale.
Optional LLM-backed second-pass via the future Phase 68 local tier (gemma3-270m).
Hits land in data/pii-flags.jsonl for operator review; agent dialog continues unimpeded.
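The credit-card pre-screen in the first bullet is a well-known two-step check: a loose regex for the shape, then the Luhn checksum to discard random digit runs. The sketch below shows that technique in isolation; the regex and function names are illustrative choices, not the planned Phase 50 implementation.

```python
import re

# 13 to 19 digits, optionally separated by single spaces or dashes
CARD_SHAPE = re.compile(r"\b(?:\d[ -]?){13,19}\b")

def luhn_ok(digits: str) -> bool:
    """Luhn checksum: double every second digit from the right, sum, test mod 10."""
    total = 0
    for i, ch in enumerate(reversed(digits)):
        d = int(ch)
        if i % 2 == 1:
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return total % 10 == 0

def find_card_numbers(text: str):
    """Return digit strings that look like card numbers AND pass Luhn."""
    hits = []
    for m in CARD_SHAPE.finditer(text):
        digits = re.sub(r"[ -]", "", m.group())
        if 13 <= len(digits) <= 19 and luhn_ok(digits):
            hits.append(digits)
    return hits

print(find_card_numbers("pay with 4111 1111 1111 1111 please"))  # well-known test number
print(find_card_numbers("order 4111 1111 1111 1112"))            # fails the checksum
```

Luhn cuts false positives by roughly a factor of ten, which matters when hits go to a human review queue.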

Today: nothing automated. The outbound redactor in crates/core/src/redaction.rs (regex-based) catches the obvious shapes before they reach long-term memory or the LLM, but doesn't emit a queue for operator review.

Encryption at rest

Two roads, both deferred to Phase 50.x:

Application-level — sqlcipher build of libsqlite3-sys with a key fed from secrets/. Every page encrypted; backups need the same key to restore.
Filesystem-level — dm-crypt / LUKS on the volume hosting NEXO_HOME. Operator does it once at provision, no Nexo changes required.

The native install + Hetzner / Fly recipes assume filesystem-level crypto handled by the host (LUKS on Hetzner, encrypted EBS on AWS, Fly volumes are encrypted at rest by default). When sqlcipher is ready we'll document switching tiers.

Status

Capability	Status
`scripts/nexo-forget-user.sh` cascading delete	✅ shipped
Operator data-export shell pipeline (above)	✅ documented
Retention policy + cron template	✅ documented
`nexo forget --user <id>` subcommand	⬜ deferred
`nexo export-user --id <id>` subcommand	⬜ deferred
Inbound PII detection + review queue	⬜ deferred
`sqlcipher` encryption at rest	⬜ deferred
Admin-action audit log (separate from this script's manifest)	⬜ deferred

Tracked as Phase 50 — Privacy toolkit.

Anonymous telemetry (opt-in)

Nexo can emit a weekly heartbeat with anonymous, aggregated deployment shape so the project knows what configurations are actually in production. The heartbeat is disabled by default — nothing leaves your host until you explicitly opt in.

This page documents exactly what's sent, what isn't, and how to inspect the payload before enabling it.

What is sent

Every 7 days (drift-resistant — 7d ± 1h jitter), if telemetry is enabled, Nexo POSTs a single JSON document to https://telemetry.lordmacu.dev/nexo over HTTPS:

{
  "schema_version": 1,
  "instance_id": "0fa3...",
  "version": "0.1.1",
  "rust_version": "1.80.1",
  "os": "linux",
  "arch": "aarch64",
  "uptime_days": 14,

  "agents": {
    "total": 3,
    "active_24h": 2
  },

  "channels": {
    "whatsapp": 1,
    "telegram": 1,
    "email": 0,
    "browser": 1
  },

  "llm_providers": [
    "minimax",
    "anthropic"
  ],

  "memory_backend": "sqlite-vec",

  "sessions": {
    "average_per_agent_24h": 12,
    "p95_per_agent_24h": 28
  },

  "extensions_loaded": 4,

  "broker_kind": "nats"
}

What is not sent

❌ Message content. Not a single byte of any conversation, prompt, response, or tool call ever leaves the host.
❌ Identifiers. No phone numbers, email addresses, contact names, agent names, channel handles. The instance_id is a random UUID generated on first opt-in and stored in ~/.nexo/telemetry-id; it can't be tied to anything except a rerun of the same install.
❌ API keys / tokens / secrets. None. The provider list is the literal string "minimax", never the key.
❌ IP addresses. The receiving server (telemetry.lordmacu.dev) drops the source IP at ingress before the payload hits any database. The HTTP access log retains only the country code derived from a one-way hash of the IP, used solely to plot the geographic distribution gauge on the public dashboard.
❌ Hostname. Not in the payload. Not derived from anything in the payload.
❌ Time of day. The heartbeat is jittered so the timestamp doesn't reveal a pattern.
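Operators who want to verify these claims mechanically can audit a payload against the documented schema. The sketch below assumes the payload is available as JSON (for example, captured from the preview command, whose exact output format is not specified here); `audit_payload` and the "suspicious string" heuristic are hypothetical.

```python
import re

# Top-level keys of schema_version 1, as documented above
ALLOWED_TOP_LEVEL = {
    "schema_version", "instance_id", "version", "rust_version", "os", "arch",
    "uptime_days", "agents", "channels", "llm_providers", "memory_backend",
    "sessions", "extensions_loaded", "broker_kind",
}
# Crude heuristic: email-like strings or long digit runs (phone numbers)
SUSPICIOUS = re.compile(r"@|\+?\d{8,}")

def audit_payload(payload: dict):
    """Return a list of problems; an empty list means the payload matches the spec."""
    problems = [f"unexpected key: {k}" for k in sorted(set(payload) - ALLOWED_TOP_LEVEL)]

    def walk(value, path):
        if isinstance(value, dict):
            for k, v in value.items():
                walk(v, f"{path}.{k}")
        elif isinstance(value, list):
            for i, v in enumerate(value):
                walk(v, f"{path}[{i}]")
        elif isinstance(value, str) and path != "$.instance_id":
            # instance_id is a random UUID and may contain long digit runs
            if SUSPICIOUS.search(value):
                problems.append(f"suspicious string at {path}")

    walk(payload, "$")
    return problems

sample = {"schema_version": 1, "instance_id": "0fa3...", "os": "linux",
          "llm_providers": ["minimax", "anthropic"], "broker_kind": "nats"}
print(audit_payload(sample))
print(audit_payload({**sample, "hostname": "vps-1"}))
```

An empty list for the first call and a flagged `hostname` key for the second is the behavior to expect from this heuristic.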

Why opt in

It's the only honest signal the project has about what's actually deployed. Without it, every roadmap discussion is guessing. With it, prioritization improves: if 80% of opt-in deployments use Anthropic + WhatsApp, then a regression on that combo gets a hot-fix; a niche feature goes to maintenance mode.

The aggregate dashboard at https://lordmacu.github.io/nexo-rs/usage/ (published once Phase 41 fully ships) shows everyone what everyone else is doing in aggregate — same data the maintainers see.

Enable / disable

# Show current state + what would be sent right now
nexo telemetry status

# Enable (writes to /etc/nexo-rs/telemetry.yaml or ~/.nexo/telemetry.yaml)
nexo telemetry enable

# Inspect exactly what the next heartbeat will contain
nexo telemetry preview

# Disable + remove the instance_id file
nexo telemetry disable

Hot-reload aware (Phase 18) — toggling doesn't require a daemon restart. The runtime watches the telemetry config; the next heartbeat tick respects whatever is currently on disk.

On first nexo boot in a fresh install, the daemon prints once to the journal:

========================================================================
  nexo telemetry is DISABLED.
  Enabling it sends an anonymous, aggregated weekly heartbeat
  describing your deployment shape (channel mix, LLM provider mix,
  agent count). No message content, no identifiers, no API keys.
  Inspect the payload:        nexo telemetry preview
  Enable:                     nexo telemetry enable
  Read the full spec:         https://lordmacu.github.io/nexo-rs/ops/telemetry.html
========================================================================

Subsequent boots stay silent. Toggling on or off prints a one-line confirmation.

Server-side guarantees

The receiving endpoint at telemetry.lordmacu.dev:

Drops the source IP at the load balancer, before the request reaches any application code or log aggregator.
Stores the JSON document verbatim with no enrichment.
Aggregates documents per instance_id only to compute the active_install_count cardinality on the public dashboard.
Retains raw documents for 90 days, then aggregates and deletes the originals.
Does not correlate documents across instance_id rotations — if you nexo telemetry disable && nexo telemetry enable, you become a fresh install in the dataset.

The server source code lives at https://github.com/lordmacu/nexo-telemetry-server (deferred — opens once Phase 41 finishes server side). Reproducible build, verifiable signatures.

Inspecting in transit

The request is a plain HTTPS POST with the JSON payload above as the body, so it is easy to intercept with a TLS-inspecting proxy in a corporate environment:

mitmproxy -p 8888 -s drop_telemetry.py &
NEXO_TELEMETRY_PROXY=http://127.0.0.1:8888 nexo telemetry preview

The runtime respects HTTPS_PROXY / HTTP_PROXY / standard proxy env vars for the heartbeat HTTP client (it goes through the same reqwest client every other Nexo egress uses).

Disabling at the firewall

If you just want to make sure no telemetry can leave even if it gets accidentally enabled:

sudo iptables -A OUTPUT -d telemetry.lordmacu.dev -j REJECT

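The runtime will see a network error in its logs every 7 days (rate-limited to once per week so it does not flood the log). It does not retry indefinitely: there is one attempt per scheduled tick. Note that iptables resolves the hostname once, when the rule is added, so re-add the rule if the endpoint's address changes.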

Compliance notes

GDPR: anonymous aggregate data with no identifiers and no PII falls outside Article 4(1) "personal data". The instance_id is technical metadata, not a pseudonym — it can't be re-tied to a natural person via any data the project holds.
HIPAA: no PHI is collected; the field set is infrastructure metadata only.
Corporate sec teams: the receiving endpoint speaks only HTTPS, no fallback to HTTP. The server cert is publicly pinnable. The payload schema is documented + versioned; new fields require bumping schema_version and a documented changelog entry below.

Schema changelog

Version	Released	What changed
1	TBD when Phase 41 ships	Initial schema as documented above

Future schema changes append a row here. Old clients are not forced to upgrade — the server accepts every advertised schema_version indefinitely (rolled-up dashboard panels include only the fields a given schema carries).

Out of scope

Per-agent / per-binding metrics — that's the Prometheus /metrics endpoint, scraped locally by your own Prometheus (see Grafana dashboards). The telemetry heartbeat is deployment-shape only.
Crash reports — Nexo emits anyhow backtraces to the local journal but never sends them off-host.
Real-time analytics — heartbeat is once weekly. There's no call-home for live metrics, ever.

Recipes

End-to-end walkthroughs that wire multiple subsystems together. Each recipe runs against a clean checkout of nexo-rs — prerequisites are at the top.

Recipe	What you build
WhatsApp sales agent	A drop-in agent that greets WhatsApp leads, asks qualifying questions, and notifies a human on hot leads.
Agent-to-agent delegation	Route work from one agent to another using `agent.route.*` with correlation ids.
Python extension	Write a stdlib-only extension that adds a custom tool to any agent.
MCP server from Claude Desktop	Expose the agent's tools to the Anthropic desktop client.
NATS with TLS + auth	Harden the broker for a multi-node deployment.
Rotating config without downtime	Three Phase 18 hot-reload scenarios: API key rotation, A/B prompt swap, narrowing an outbound allowlist mid-incident.
Future marketing plugin (multi-client)	Prepare multi-client autonomous marketing agents with strict instance/model isolation before plugin implementation.

If a recipe drifts from reality, open an issue — it means the docs didn't get updated alongside a code change.

WhatsApp sales agent

Build a drop-in agent that handles a sales line on WhatsApp:

Greets the lead with the right operator (ETB / Claro / generic)
Qualifies via a short scripted flow (address, package, budget)
Notifies a human on hot leads, with the tool surface narrowed so the LLM only ever sees the lead-notification tool

This is the production shape of the shipped ana agent.

Prerequisites

agent built (cargo build --release)
NATS running (docker run -p 4222:4222 nats:2.10-alpine)
A MiniMax M2.5 key
A phone with WhatsApp ready to scan a QR

1. Provide the LLM key

export MINIMAX_API_KEY=...
export MINIMAX_GROUP_ID=...

2. Create a gitignored agent file

config/agents.d/ana.yaml is gitignored; put the business-sensitive content there.

agents:
  - id: ana
    model:
      provider: minimax
      model: MiniMax-M2.5
    plugins: [whatsapp]
    inbound_bindings:
      - plugin: whatsapp
    allowed_tools:
      - notify_lead                        # only this tool is visible
    outbound_allowlist:
      whatsapp:
        - "573000000000@s.whatsapp.net"    # human advisor's WA
    workspace: ./data/workspace/ana
    workspace_git:
      enabled: true
    heartbeat:
      enabled: false
    system_prompt: |
      You are Ana, a sales advisor for ETB and Claro. Help customers
      choose the best internet, TV, and phone package.

      On the first incoming message:
      - If it contains "etb" -> route directly to the ETB flow.
      - If it contains "claro" -> route directly to the Claro flow.
      - Otherwise, ask which operator they prefer.

      Capture: name, address, socioeconomic stratum, preferred package
      (internet only / internet+TV / triple play).

      When the lead is ready, invoke `notify_lead` with JSON containing:
      {name, phone, address, operator, package, notes}. Do not call any
      other tool — this is your only tool.

3. Pair WhatsApp for this agent

./target/release/agent setup whatsapp

The wizard creates ./data/workspace/ana/whatsapp/default/, flips config/plugins/whatsapp.yaml::whatsapp.session_dir to point at it, and renders a QR. Scan from the WhatsApp app.

4. Ship the `notify_lead` tool as an extension

Copy the Rust template and rename:

cp -r extensions/template-rust extensions/notify-lead
cd extensions/notify-lead

Edit plugin.toml:

[plugin]
id = "notify-lead"
version = "0.1.0"

[capabilities]
tools = ["notify_lead"]

[transport]
type = "stdio"
command = "./target/release/notify-lead"

Implement tools/notify_lead in src/main.rs. It should publish to plugin.outbound.whatsapp.default, with the recipient set to the human advisor number you listed in outbound_allowlist.

Build and install:

cargo build --release
cd ../..
./target/release/agent ext install ./extensions/notify-lead --link --enable
./target/release/agent ext doctor --runtime

5. Run

./target/release/agent --config ./config

Flow diagram

sequenceDiagram
    participant U as Lead
    participant WA as WhatsApp
    participant N as NATS
    participant A as Ana
    participant H as Human advisor

    U->>WA: "Hi, I want internet service"
    WA->>N: plugin.inbound.whatsapp
    N->>A: deliver
    A->>A: qualify (address, package)
    A->>A: invoke notify_lead(json)
    A->>N: plugin.outbound.whatsapp (advisor number)
    N->>WA: deliver
    WA->>H: "🚨 New lead — Luis, 573111111111, triple play"

Why this shape works

allowed_tools: [notify_lead] prevents the LLM from hallucinating other actions — the model literally cannot see other tools.
outbound_allowlist.whatsapp is defense-in-depth: even if the LLM crafts a send to an unexpected number, the runtime rejects it.
workspace_git.enabled: true lets you audit what Ana remembered over time via memory_history — useful for reviewing tough calls.
Gitignored agents.d/ana.yaml keeps price lists and business content out of the public repo.
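The two gates in the list above are independent checks, which is what makes them defense-in-depth. A minimal model of both follows; the function names are hypothetical, the use of `fnmatch`-style globs for `allowed_tools` is an assumption, and so is treating a channel with no allowlist entry as "deny".

```python
from fnmatch import fnmatchcase

def visible_tools(registered, allowed_tools):
    """Gate 1: the LLM is only shown registry entries matched by allowed_tools."""
    return [t for t in registered if any(fnmatchcase(t, pat) for pat in allowed_tools)]

def may_send(outbound_allowlist, channel, recipient):
    """Gate 2: applied at send time, regardless of what the LLM asked for."""
    # Assumption: a channel with no allowlist entry rejects everything
    return recipient in outbound_allowlist.get(channel, [])

registered = ["notify_lead", "memory_history", "delegate"]
ana = {
    "allowed_tools": ["notify_lead"],
    "outbound_allowlist": {"whatsapp": ["573000000000@s.whatsapp.net"]},
}
print(visible_tools(registered, ana["allowed_tools"]))
print(may_send(ana["outbound_allowlist"], "whatsapp", "573111111111@s.whatsapp.net"))
```

Even if the first gate were misconfigured, the second would still stop a message to the unlisted number.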

Testing

Open WhatsApp on a second phone and send "hi, ETB"
Watch agent status ana for session activity
Watch docker compose logs agent | jq 'select(.agent == "ana")' for turn-by-turn reasoning

Agent-to-agent delegation

Route work from one agent to another using agent.route.<target_id> with a correlation id. Typical shapes:

Kate delegates research to ops and waits for the reply
Ana fans out lead data to crm-bot, ticket-bot, and logger
A supervisor agent orchestrates specialist subagents

Prerequisites

Two agents configured in config/agents.yaml (and/or agents.d/)
NATS running
Either agent can be the caller or callee; the topology is symmetric

Agent config

agents:
  - id: kate
    model: { provider: minimax, model: MiniMax-M2.5 }
    plugins: [telegram]
    inbound_bindings: [{ plugin: telegram }]
    allowed_delegates: [ops, crm-bot]
    description: "Personal assistant; delegates research to ops."

  - id: ops
    model: { provider: minimax, model: MiniMax-M2.5 }
    accept_delegates_from: [kate]
    description: "Operations agent; answers factual questions about systems."

Key fields:

allowed_delegates (on the caller) — globs of peer ids this agent may route to. Empty = no restriction.
accept_delegates_from (on the callee) — inverse gate. Empty = no restriction.
description — injected into both sides' # PEERS block so the LLM knows who can do what.

Both gates are glob lists and can be set on either side or both.
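A compact way to reason about the two gates is to model them as one predicate. The sketch below uses Python's `fnmatch` globs as a stand-in for the runtime's glob syntax (an assumption), and `can_delegate` plus its reason codes are hypothetical names. The self-delegation and unknown-target checks mirror the guardrails this recipe describes.

```python
from fnmatch import fnmatchcase

def gate_allows(patterns, peer_id):
    """An empty list means no restriction; otherwise any glob may match."""
    return not patterns or any(fnmatchcase(peer_id, p) for p in patterns)

def can_delegate(agents, caller, target):
    """Return (allowed, reason) for a delegation from caller to target."""
    if caller == target:
        return False, "self"
    if target not in agents:
        return False, "unknown_target"
    if not gate_allows(agents[caller].get("allowed_delegates", []), target):
        return False, "caller_gate"
    if not gate_allows(agents[target].get("accept_delegates_from", []), caller):
        return False, "callee_gate"
    return True, "ok"

agents = {
    "kate": {"allowed_delegates": ["ops", "crm-bot"]},
    "ops": {"accept_delegates_from": ["kate"]},
    "ana": {},
}
print(can_delegate(agents, "kate", "ops"))
print(can_delegate(agents, "ana", "ops"))
```

In the second call Ana's own list is empty, so the refusal comes from the callee's gate alone.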

Wire shape

sequenceDiagram
    participant K as Kate
    participant B as NATS
    participant O as Ops

    Note over K: LLM decides to delegate
    K->>B: publish agent.route.ops<br/>{correlation_id: "req-abc", body: "what's the latest DB migration status?"}
    B->>O: deliver
    O->>O: on_message + LLM turn
    O->>B: publish agent.route.kate<br/>{correlation_id: "req-abc", body: "migration 0042 is running..."}
    B->>K: deliver
    K->>K: correlate reply by req-abc

Correlation ids are caller-chosen strings. The callee echoes the id back on the reply; the caller uses it to match replies to requests (especially for fan-out + reassemble patterns).

Using the `delegate` tool

The runtime exposes a delegate tool whenever allowed_delegates is non-empty. LLM call shape:

{
  "name": "delegate",
  "args": {
    "to": "ops",
    "body": "what's the latest DB migration status?"
  }
}

The runtime:

Generates a fresh correlation_id
Publishes to agent.route.ops with that id
Waits (bounded) for the reply on agent.route.kate
Returns the body as the tool result

Timeouts and retry policy match the broker defaults — the circuit breaker on the target topic protects against an unreachable callee.

Fan-out

To fan out to multiple peers, the LLM can issue several delegate calls in one turn. The runtime issues each with a unique correlation_id and gathers the replies in parallel.
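The correlate-and-gather pattern can be sketched with asyncio futures keyed by correlation id. Everything here is illustrative: `Router`, `fake_send`, and the `req-` id format are hypothetical, and the in-process `fake_send` stands in for the broker round trip.

```python
import asyncio
import uuid

class Router:
    def __init__(self):
        self.pending = {}  # correlation_id -> Future awaiting the reply

    async def delegate(self, send, to, body, timeout=5.0):
        cid = f"req-{uuid.uuid4().hex[:8]}"          # fresh id per call
        fut = asyncio.get_running_loop().create_future()
        self.pending[cid] = fut                      # register BEFORE publishing
        try:
            await send(to, {"correlation_id": cid, "body": body})
            return await asyncio.wait_for(fut, timeout)  # bounded wait
        finally:
            self.pending.pop(cid, None)

    def on_reply(self, msg):
        fut = self.pending.get(msg["correlation_id"])
        if fut and not fut.done():
            fut.set_result(msg["body"])

async def demo():
    router = Router()

    async def fake_send(to, msg):
        # Stand-in for the broker: the callee echoes the id back on its reply
        reply = {"correlation_id": msg["correlation_id"], "body": f"{to} got: {msg['body']}"}
        asyncio.get_running_loop().call_soon(router.on_reply, reply)

    peers = ["crm-bot", "ticket-bot", "logger"]
    return await asyncio.gather(*(router.delegate(fake_send, p, "lead #42") for p in peers))

print(asyncio.run(demo()))
```

`asyncio.gather` returns results in call order, so replies line up with the peers list even when they arrive out of order.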

Guardrails

Self-delegation is rejected at the manager level.
Unknown target id → tool returns an error result, no broker traffic.
An empty allowed_delegates imposes no constraint, so the agent can delegate to any peer. Prefer an explicit list in production.

Observability

Every delegation emits two log lines (dispatch + reply) with structured fields:

{"agent": "kate", "target": "ops", "correlation_id": "...", "event": "delegate_dispatch"}
{"agent": "kate", "target": "ops", "correlation_id": "...", "event": "delegate_reply", "latency_ms": 1342}

Filter on correlation_id to trace a single delegation end to end.

Python extension

Ship a custom tool written in Python — no dependencies beyond stdlib. The agent spawns your script, handshakes with it over stdin/stdout, and exposes your tool to the LLM.

Prerequisites

python3 on the host $PATH
A running nexo-rs install with extensions.enabled: true

1. Copy the template

cp -r extensions/template-python extensions/word-count
cd extensions/word-count

2. Edit `plugin.toml`

[plugin]
id = "word-count"
version = "0.1.0"
description = "Count words in a piece of text."
priority = 0

[capabilities]
tools = ["count_words"]

[transport]
type = "stdio"
command = "python3"
args = ["./main.py"]

[requires]
bins = ["python3"]

[meta]
license = "MIT OR Apache-2.0"

[requires] bins = ["python3"] gates the extension: if Python isn't on $PATH, the runtime skips the extension with a warn log instead of crash-looping.

3. Write `main.py`

#!/usr/bin/env python3
import sys, json

def reply(id, result=None, error=None):
    msg = {"jsonrpc": "2.0", "id": id}
    if error is None:
        msg["result"] = result
    else:
        msg["error"] = error
    sys.stdout.write(json.dumps(msg) + "\n")
    sys.stdout.flush()

def log(*args):
    print(*args, file=sys.stderr, flush=True)

HANDSHAKE = {
    "server_version": "0.1.0",
    "tools": [{
        "name": "count_words",
        "description": "Count whitespace-separated words in a string.",
        "input_schema": {
            "type": "object",
            "properties": {"text": {"type": "string"}},
            "required": ["text"]
        }
    }],
    "hooks": []
}

def main():
    log("word-count starting")
    for line in sys.stdin:
        try:
            req = json.loads(line)
        except json.JSONDecodeError:
            continue
        method = req.get("method", "")
        rid = req.get("id")
        if method == "initialize":
            reply(rid, HANDSHAKE)
        elif method == "tools/count_words":
            params = req.get("params", {}) or {}
            text = params.get("text", "")
            count = len(text.split())
            reply(rid, {"count": count})
        else:
            reply(rid, error={"code": -32601, "message": f"unknown method: {method}"})

if __name__ == "__main__":
    main()

Make it executable:

chmod +x main.py

4. Validate and install

cd ../..
./target/release/agent ext validate ./extensions/word-count/plugin.toml
./target/release/agent ext install ./extensions/word-count --link --enable
./target/release/agent ext doctor --runtime

--link creates a symlink instead of a copy — good for the edit-test loop. doctor --runtime actually spawns the extension and runs the handshake, so a Python error that kills the interpreter during init surfaces here rather than in production logs.

5. Allow the tool per agent

The registered tool name is ext_word-count_count_words. Add it to the right agent's allowed_tools (or use a glob):

agents:
  - id: kate
    allowed_tools:
      - ext_word-count_*
      # ...

6. Run

./target/release/agent --config ./config

Send a message that would prompt the LLM to use the tool; watch the logs for tools/count_words on stderr.

Debugging

stderr of the Python process is forwarded to the agent's log pipeline. print(..., file=sys.stderr) lines show up in the agent's tracing output with the extension=word-count field.
Handshake failures are visible in ext doctor --runtime and prevent the tool from being registered at all.
Per-tool latency shows up in the nexo_tool_latency_ms{tool="ext_word-count_count_words"} Prometheus histogram.

Productionizing

Pin command to an absolute path or a virtualenv-local interpreter; python3 on $PATH may vary across hosts.
Pick your dependency strategy carefully: the template is stdlib only. If you need requests or similar, ship a requirements.txt plus a bootstrap script, or switch to the Rust template.
If the extension holds a connection to a remote service, add a heartbeat loop so you can detect liveness.
For long-running tool calls, print status events to stderr — they become structured log entries and help debug hung tools.

Build a poller module (V1 — deprecated)

⚠ Deprecated since Phase 96 (nexo-poller 0.2.0). The in-tree builtins this page documents (gmail, rss, google_calendar) have been extracted to standalone subprocess plugin repos. New pollers should follow Build a poller plugin (V2). The OutboundDelivery / TickOutcome types referenced below are replaced by PollerHost::broker_publish + TickAck as of Phase 96. Treat this page as historical reference.

Three steps. No main.rs edit, no scheduler, no breaker, no SQLite work. The runner gives you all of that — your code only describes what to fetch, what to dispatch, and (optionally) what kind-specific LLM tools to expose.

Reference (post-Phase-96): crates/poller/src/builtins/ for the two remaining in-tree examples (webhook_poll.rs + agent_turn.rs). Phase 96 extractions live in standalone repos: nexo-rs-poller-rss, nexo-rs-poller-google-calendar, nexo-rs-poller-gmail.

Step 1 — implement the trait

// crates/poller/src/builtins/jira.rs
use std::sync::Arc;

use nexo_poller::{
    OutboundDelivery, PollContext, Poller, PollerError, TickOutcome,
};
use async_trait::async_trait;
use serde::Deserialize;
use serde_json::{json, Value};

#[derive(Debug, Deserialize, Clone)]
#[serde(deny_unknown_fields)]
struct JiraConfig {
    base_url: String,
    project_key: String,
    deliver: nexo_poller::builtins::gmail::DeliverCfg,
}

pub struct JiraPoller;

#[async_trait]
impl Poller for JiraPoller {
    fn kind(&self) -> &'static str { "jira" }

    fn description(&self) -> &'static str {
        "Polls Jira for newly assigned issues in a project."
    }

    fn validate(&self, config: &Value) -> Result<(), PollerError> {
        serde_json::from_value::<JiraConfig>(config.clone())
            .map(drop)
            .map_err(|e| PollerError::Config {
                job: "<jira>".into(),
                reason: e.to_string(),
            })
    }

    async fn tick(&self, ctx: &PollContext) -> Result<TickOutcome, PollerError> {
        let cfg: JiraConfig = serde_json::from_value(ctx.config.clone())
            .map_err(|e| PollerError::Config {
                job: ctx.job_id.clone(),
                reason: e.to_string(),
            })?;

        // 1. Pull data. Use ctx.cursor for incremental fetches.
        // 2. Decide what to dispatch.
        // 3. Build OutboundDelivery items; the runner publishes them
        //    via Phase 17 credentials so you never touch the broker.

        let payload = json!({ "text": "(jira tick: replace with real fetch)" });
        Ok(TickOutcome {
            items_seen: 0,
            items_dispatched: 1,
            deliver: vec![OutboundDelivery {
                channel: nexo_auth::handle::TELEGRAM,
                recipient: cfg.deliver.to.clone(),
                payload,
            }],
            next_cursor: None,
            next_interval_hint: None,
        })
    }
}

If Poller::validate returns Err(PollerError::Config { … }), that job fails at boot; sibling jobs keep going.

Poller::tick returns:

Ok(TickOutcome) — the runner persists next_cursor, increments counters, dispatches every OutboundDelivery via the agent's Phase 17 binding, and sleeps until next slot.
Err(PollerError::Transient(…)) — counts toward the breaker; next tick retries with backoff.
Err(PollerError::Permanent(…)) — auto-pauses the job and fires the failure_to alert.
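The three outcomes map onto a small state machine on the runner side. The Python model below is language-neutral pseudologic, not the runner's code: the class names, the base and cap values for the backoff, and the choice to keep the old cursor when `next_cursor` is absent are all assumptions made for illustration.

```python
from dataclasses import dataclass

class Transient(Exception): pass
class Permanent(Exception): pass

@dataclass
class JobState:
    cursor: object = None
    failures: int = 0
    paused: bool = False

def backoff_secs(failures, base=5, cap=300):
    """Exponential backoff: base, 2*base, 4*base, ... capped (values are hypothetical)."""
    return min(cap, base * 2 ** (failures - 1))

def run_tick(state, tick, interval=300):
    """Run one tick; return seconds until the next one, or None if the job paused."""
    try:
        outcome = tick(state.cursor)
    except Transient:
        state.failures += 1           # counts toward the breaker
        return backoff_secs(state.failures)
    except Permanent:
        state.paused = True           # auto-pause; the alert would fire here
        return None
    state.failures = 0
    if outcome.get("next_cursor") is not None:
        state.cursor = outcome["next_cursor"]  # persisted for the next tick
    return interval

def flaky(cursor):
    raise Transient()

state = JobState()
print(run_tick(state, flaky), run_tick(state, flaky))
print(run_tick(state, lambda cursor: {"next_cursor": "page-2"}), state.cursor)
```

Because the runner owns the delay, a tick never needs to sleep on its own to back off.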

PollContext.stores exposes the credential stores when your module needs paths (e.g., Gmail / Calendar built-ins read client_id_path from there). Plain ctx.credentials.resolve(…) is enough when you only need a CredentialHandle.

Step 2 — register

// crates/poller/src/builtins/mod.rs
use std::sync::Arc;

use crate::PollerRunner;

pub mod gmail;
pub mod google_calendar;
pub mod jira;          // ← new
pub mod rss;
pub mod webhook_poll;

pub fn register_all(runner: &PollerRunner) {
    runner.register(Arc::new(gmail::GmailPoller::new()));
    runner.register(Arc::new(rss::RssPoller::new()));
    runner.register(Arc::new(webhook_poll::WebhookPoller::new()));
    runner.register(Arc::new(google_calendar::GoogleCalendarPoller::new()));
    runner.register(Arc::new(jira::JiraPoller));   // ← new
}

That is the only place wiring is touched. main.rs already calls register_all.

Step 3 — declare a job

# config/pollers.yaml
pollers:
  jobs:
    - id: ana_jira_assigned
      kind: jira
      agent: ana
      schedule: { every_secs: 300 }
      config:
        base_url: https://company.atlassian.net
        project_key: ENG
        deliver:
          channel: telegram
          to: "1194292426"

Run the daemon. Verify with:

agent pollers list                # ana_jira_assigned shows up
agent pollers run ana_jira_assigned   # tick on demand

Add per-kind LLM tools

Your module can ship its own tools alongside the generic pollers_* ones. Override Poller::custom_tools:

fn custom_tools(&self) -> Vec<nexo_poller::CustomToolSpec> {
    use nexo_llm::ToolDef;
    use nexo_poller::{CustomToolHandler, CustomToolSpec, PollerRunner};
    use async_trait::async_trait;

    struct JiraSearch;
    #[async_trait]
    impl CustomToolHandler for JiraSearch {
        async fn call(
            &self,
            runner: Arc<PollerRunner>,
            args: Value,
        ) -> anyhow::Result<Value> {
            // Use `runner` to inspect / mutate jobs the same way
            // built-in `pollers_*` tools do: list_jobs, run_once,
            // set_paused, reset_cursor are all available.
            let id = args["id"]
                .as_str()
                .ok_or_else(|| anyhow::anyhow!("`id` required"))?;
            let outcome = runner.run_once(id).await?;
            Ok(json!({ "matching": outcome.items_seen }))
        }
    }

    vec![CustomToolSpec {
        def: ToolDef {
            name: "jira_search".into(),
            description: "Run the Jira poll job once without persisting state.".into(),
            parameters: json!({
                "type": "object",
                "properties": {
                    "id": { "type": "string" }
                },
                "required": ["id"]
            }),
        },
        handler: Arc::new(JiraSearch),
    }]
}

The agent then sees jira_search automatically — no extra registration step. The adapter in nexo-poller-tools::register_all walks every registered Poller's custom_tools() and wires each spec into the per-agent ToolRegistry.

What the runner gives you for free

Per-job tokio task with every | cron | at schedule + jitter.
Cross-process atomic lease in SQLite (lease takeover after TTL expiry — daemon crash mid-tick is recoverable).
Cursor persistence — your next_cursor is the next tick's ctx.cursor. Survives restarts. agent pollers reset <id> clears it.
Exponential backoff on Transient, auto-pause on Permanent.
Per-job circuit breaker keyed on ("poller", job_id).
Outbound dispatch via Phase 17 — OutboundDelivery lands at plugin.outbound.<channel>.<instance> resolved from the agent's binding. You never touch the broker.
7 Prometheus series labelled by kind, agent, job_id, status. Audit log under target=credentials.audit.
Admin endpoints + CLI subcommands (agent pollers …).
Six generic LLM tools (pollers_list, pollers_show, pollers_run, pollers_pause, pollers_resume, pollers_reset).
Hot-reload via POST /admin/pollers/reload — add | replace | remove | keep plan applied atomically.
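
The backoff and auto-pause behaviour in the list above can be pictured with a small std-only sketch. The enum names, the base delay, and the cap are assumptions made for illustration; the real runner's schedule, jitter, and types may differ.

```rust
use std::time::Duration;

/// Simplified stand-in for the two error classes named above.
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum TickError {
    Transient,
    Permanent,
}

/// What a scheduler does next for the job.
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum NextAction {
    /// Retry after the given delay.
    RetryAfter(Duration),
    /// Stop scheduling until an operator resumes the job.
    Paused,
}

/// Exponential backoff: base * 2^(failures - 1), capped at `max`.
/// `consecutive_failures` includes the failure being handled (>= 1).
/// Jitter is left out to keep the arithmetic visible.
pub fn backoff_delay(base: Duration, max: Duration, consecutive_failures: u32) -> Duration {
    let exponent = consecutive_failures.saturating_sub(1).min(31);
    let factor = 1u32 << exponent;
    // An overflowing multiplication falls back to the cap.
    base.checked_mul(factor).map_or(max, |delay| delay.min(max))
}

/// Transient errors back off; permanent errors pause the job.
pub fn on_error(
    err: TickError,
    base: Duration,
    max: Duration,
    consecutive_failures: u32,
) -> NextAction {
    match err {
        TickError::Transient => {
            NextAction::RetryAfter(backoff_delay(base, max, consecutive_failures))
        }
        TickError::Permanent => NextAction::Paused,
    }
}
```

With a 5 s base and a 300 s cap, consecutive transient failures wait 5 s, 10 s, 20 s, and so on until the cap; a single permanent error pauses the job regardless of the failure count.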

Tests pattern

use serde_json::json;

#[tokio::test]
async fn validate_accepts_minimal() {
    let p = JiraPoller;
    let cfg = json!({
        "base_url": "https://x.atlassian.net",
        "project_key": "ENG",
        "deliver": { "channel": "telegram", "to": "1" },
    });
    p.validate(&cfg).unwrap();
}

#[tokio::test]
async fn validate_rejects_unknown_field() {
    let p = JiraPoller;
    let cfg = json!({ "wat": true, "deliver": { "channel": "x", "to": "1" }});
    assert!(p.validate(&cfg).is_err());
}

Cursor / dispatch tests follow the same pattern as the in-tree built-ins (gmail.rs, rss.rs, webhook_poll.rs).

Anti-patterns

Don't publish to the broker directly from tick. Return OutboundDelivery so the runner uses Phase 17 + audit log.
Don't share global state across modules. Use cursors for per-job state; use DashMap inside your struct for per-account caches (gmail does this for GoogleAuthClient).
Don't sleep inside tick for backoff. Return PollerError::Transient and let the runner own the backoff schedule — that way agent pollers reset and hot-reload still cancel cleanly.
Don't auto-create jobs from inside an LLM tool. The runner intentionally exposes only read + control on existing jobs. Operators own pollers.yaml.

Build a poller plugin (V2 — out-of-tree subprocess)

Phase 96 introduced the [plugin.poller] manifest section. Out-of-tree poller plugins ship as standalone Cargo crates published to crates.io; the daemon spawns them as subprocesses and they communicate with the runtime via broker JSON-RPC. The daemon's nexo-poller runtime stays provider-agnostic: pollers reach the world through a single egress trait (PollerHost) for outbound, credentials, logs, metrics, and LLM invocations.

If you maintained an in-tree builtin under crates/poller/src/builtins/ before Phase 96, migrate to this recipe. The legacy nexo-poller-ext StdioRuntime bridge is deprecated since v0.2.0 and slated for deletion two release cycles after Phase 96 ships.

Three steps

Scaffold a new Cargo crate that depends on nexo-microapp-sdk with the poller feature.
Implement PollerHandler::tick — fetch, parse, dispatch via host.broker_publish, return a TickAck.
Write a nexo-plugin.toml declaring [plugin.poller] plus the broker topics your plugin needs to subscribe / publish on.

Step 1 — Cargo.toml

[package]
name = "nexo-poller-jira"
version = "0.1.0"
edition = "2021"

[[bin]]
name = "nexo-poller-jira"
path = "src/main.rs"

[lib]
name = "nexo_poller_jira"
path = "src/lib.rs"

[dependencies]
nexo-microapp-sdk = { version = "0.2", features = ["plugin", "poller"] }
nexo-poller       = "0.2"
nexo-broker       = "0.1"
nexo-config       = "0.1"

tokio              = { version = "1", features = ["macros", "rt-multi-thread", "sync", "time", "io-util", "io-std"] }
async-trait        = "0.1"
serde              = { version = "1", features = ["derive"] }
serde_json         = "1"
anyhow             = "1"
tracing            = "0.1"
tracing-subscriber = { version = "0.3", features = ["env-filter"] }
reqwest            = { version = "0.12", default-features = false, features = ["rustls-tls"] }

Step 2 — `PollerHandler` implementation

// src/lib.rs
use std::sync::Arc;

use async_trait::async_trait;
use serde::Deserialize;
use serde_json::json;

use nexo_microapp_sdk::poller::{PollerHandler, TickRequest};
use nexo_poller::{PollerError, PollerHost, TickAck, TickMetrics};

#[derive(Debug, Deserialize, Clone)]
#[serde(deny_unknown_fields)]
pub struct JiraJobConfig {
    pub base_url: String,
    pub project_key: String,
    pub deliver: DeliverCfg,
}

#[derive(Debug, Deserialize, Clone)]
#[serde(deny_unknown_fields)]
pub struct DeliverCfg {
    pub channel: String,
    #[serde(alias = "recipient")]
    pub to: String,
}

pub struct JiraHandler {
    http: reqwest::Client,
}

impl JiraHandler {
    pub fn new() -> Self {
        Self {
            http: reqwest::Client::builder()
                .timeout(std::time::Duration::from_secs(30))
                .build()
                .expect("reqwest"),
        }
    }
}

#[async_trait]
impl PollerHandler for JiraHandler {
    async fn tick(
        &self,
        req: TickRequest,
        host: Arc<dyn PollerHost>,
    ) -> Result<TickAck, PollerError> {
        let cfg: JiraJobConfig = serde_json::from_value(req.config.clone())
            .map_err(|e| PollerError::Config { job: req.job_id.clone(), reason: e.to_string() })?;

        // Fetch from Jira (replace with real API call).
        let resp = self.http.get(&format!("{}/rest/api/3/search?jql=project={}", cfg.base_url, cfg.project_key))
            .send().await
            .map_err(|e| PollerError::Transient(anyhow::Error::from(e)))?;
        if !resp.status().is_success() {
            return Err(PollerError::Transient(anyhow::anyhow!("HTTP {}", resp.status())));
        }
        let _body: serde_json::Value = resp.json().await
            .map_err(|e| PollerError::Transient(anyhow::Error::from(e)))?;

        // Resolve the outbound channel's account_id via reverse-RPC.
        let cred = host.credentials_get(cfg.deliver.channel.clone()).await
            .map_err(|e| PollerError::Permanent(anyhow::anyhow!("credentials_get: {e}")))?;
        let account_id = cred.get("account_id")
            .and_then(|v| v.as_str())
            .ok_or_else(|| PollerError::Permanent(anyhow::anyhow!("no account_id")))?
            .to_string();
        let topic = format!("plugin.outbound.{}.{}", cfg.deliver.channel, account_id);

        // Dispatch a single placeholder message. A real plugin would parse
        // `_body` and publish one message per new issue.
        let payload = json!({ "to": cfg.deliver.to, "text": "new Jira issue" });
        let payload_bytes = serde_json::to_vec(&payload)
            .map_err(|e| PollerError::Transient(anyhow::Error::from(e)))?;
        host.broker_publish(topic, payload_bytes).await
            .map_err(|e| PollerError::Transient(anyhow::anyhow!("broker_publish: {e}")))?;

        Ok(TickAck {
            next_cursor: None,
            next_interval_hint: None,
            metrics: Some(TickMetrics { items_seen: 1, items_dispatched: 1 }),
        })
    }
}

Step 3 — `nexo-plugin.toml`

manifest_version = 2

[plugin]
id               = "jira"
version          = "0.1.0"
name             = "Jira Poller"
description      = "Jira issues poller — fetches new issues, dispatches via deliver channel."
min_nexo_version = ">=0.2.0"

[plugin.entrypoint]
command = "nexo-poller-jira"

[plugin.requires]
nexo_capabilities = ["broker"]

[plugin.capabilities.broker]
subscribe = [
    "plugin.poller.jira.tick",
    "_inbox.>",
]
publish = [
    "daemon.rpc.jira",
    "plugin.outbound.whatsapp.>",
    "plugin.outbound.telegram.>",
    "_inbox.>",
]

[plugin.poller]
kinds                = ["jira"]
broker_topic_prefix  = "plugin.poller.jira"
lifecycle            = "long_lived"
max_concurrent_ticks = 1
tick_timeout_secs    = 60

Operator config

# pollers.yaml
jobs:
  - id: backend_jira
    kind: jira
    agent: ana
    schedule: { every: 15m }
    config:
      base_url: "https://acme.atlassian.net"
      project_key: "ENG"
      deliver:
        channel: telegram
        to: "-1001234567"

Install + boot

cargo install nexo-poller-jira
agent run

The daemon discovers the plugin via its [plugin.entrypoint] line, registers the jira kind in the PluginPollerRouter, and routes matching jobs through broker JSON-RPC. The plugin's broker subscriber receives ticks on plugin.poller.jira.tick, dispatches to your PollerHandler::tick, encodes the TickAck into the wire reply, and publishes back on the message's reply_to topic.
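
The topics in the manifest use NATS-style wildcards, so it helps to see how a concrete outbound topic lines up with the `publish` list. The sketch below is std-only and illustrative: the function names are hypothetical, and whether the daemon enforces the allowlist exactly this way is an assumption. The matching rules themselves are the standard NATS ones (`*` matches one token, `>` matches one or more trailing tokens).

```rust
/// Build the outbound topic a plugin publishes to.
pub fn outbound_topic(channel: &str, account_id: &str) -> String {
    format!("plugin.outbound.{channel}.{account_id}")
}

/// NATS-style subject match: `*` matches exactly one token,
/// `>` matches one or more trailing tokens.
pub fn subject_matches(pattern: &str, subject: &str) -> bool {
    let mut pat = pattern.split('.');
    let mut sub = subject.split('.');
    loop {
        match (pat.next(), sub.next()) {
            // `>` needs at least one remaining subject token.
            (Some(">"), Some(_)) => return true,
            (Some("*"), Some(_)) => continue,
            (Some(p), Some(s)) if p == s => continue,
            // Both exhausted at the same time: full match.
            (None, None) => return true,
            _ => return false,
        }
    }
}

/// True when any pattern in the allowlist covers the topic.
pub fn may_publish(allowlist: &[&str], topic: &str) -> bool {
    allowlist.iter().any(|pattern| subject_matches(pattern, topic))
}
```

With the `publish` list from the manifest above, a Telegram delivery topic is covered by `plugin.outbound.telegram.>`, while an email topic would not be; adding a new delivery channel therefore means extending the manifest as well as pollers.yaml.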

What `PollerHost` exposes

The poller reaches the runtime through one trait. Five methods:

Method	Use case
`broker_publish(topic, payload)`	Outbound — direct to broker (Phase 92 path)
`credentials_get(channel)`	Resolve `{ account_id, … }` for the outbound channel
`log(level, message, fields)`	Structured log forwarded to daemon tracing
`metric_inc(name, labels)`	Counter increment forwarded to daemon Prometheus
`llm_invoke(request)`	LLM completion through daemon's `LlmRegistry`

No OutboundDelivery, no Channel enum, no credential bundle types in your code — your plugin owns its own outbound logic and topic construction.

Migrating from V1 (in-tree builtin)

If you maintained a builtin under crates/poller/src/builtins/:

Create the standalone repo from the recipe above.
Copy your Poller::tick body into PollerHandler::tick. Three rename rules:
- ctx.credentials.resolve(agent, channel) → host.credentials_get(channel).await
- OutboundDelivery { channel, recipient, payload } push → build the topic yourself (plugin.outbound.<channel>.<account_id>) and call host.broker_publish(topic, payload_bytes)
- TickOutcome { items_seen, items_dispatched, deliver, next_cursor, next_interval_hint } → TickAck { next_cursor, next_interval_hint, metrics: Some(TickMetrics { items_seen, items_dispatched }) }
Drop your entry from crates/poller/src/builtins/mod.rs::register_all.
Publish your new crate to crates.io. The daemon's [plugin.poller] manifest discovery picks it up at boot.
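
The third rename rule is mostly mechanical, which a small std-only sketch makes visible. The structs below are local stand-ins that reuse the field names from the rule; the field types (and the `String` placeholder for deliveries) are assumptions, not the real nexo_poller definitions.

```rust
/// Stand-in for the V1 return value, with the fields named in the rule.
#[derive(Debug, Clone, PartialEq)]
pub struct TickOutcome {
    pub items_seen: u32,
    pub items_dispatched: u32,
    /// Placeholder for the V1 `OutboundDelivery` values.
    pub deliver: Vec<String>,
    pub next_cursor: Option<Vec<u8>>,
    pub next_interval_hint: Option<u64>,
}

#[derive(Debug, Clone, PartialEq)]
pub struct TickMetrics {
    pub items_seen: u32,
    pub items_dispatched: u32,
}

/// Stand-in for the V2 reply.
#[derive(Debug, Clone, PartialEq)]
pub struct TickAck {
    pub next_cursor: Option<Vec<u8>>,
    pub next_interval_hint: Option<u64>,
    pub metrics: Option<TickMetrics>,
}

/// Split a V1 outcome into the V2 ack plus the deliveries that the
/// plugin now has to publish itself (there is no `deliver` field in V2).
pub fn migrate_outcome(outcome: TickOutcome) -> (TickAck, Vec<String>) {
    let ack = TickAck {
        next_cursor: outcome.next_cursor,
        next_interval_hint: outcome.next_interval_hint,
        metrics: Some(TickMetrics {
            items_seen: outcome.items_seen,
            items_dispatched: outcome.items_dispatched,
        }),
    };
    (ack, outcome.deliver)
}
```

The only field without a V2 home is `deliver`: those messages move out of the return value and into explicit `host.broker_publish` calls made before the ack is returned.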

The reference Phase 96 extractions live at nexo-rs-poller-rss, nexo-rs-poller-google-calendar, and nexo-rs-poller-gmail — see those repos for end-to-end examples with broker subscriber boot, reverse-RPC credential refresh, and serde-driven config parsing.

Deploy on Hetzner Cloud (CX22)

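A concrete recipe for a single-VPS production deploy. CX22 is the Hetzner sweet spot: €3.79/mo, 2 vCPU, 4 GB RAM, 40 GB SSD, 20 TB transfer included. Note that CX22 is an x86_64 (Intel) type; Hetzner's ARM64 machines are the CAX line, and CAX11 has the same 2 vCPU / 4 GB shape. The commands below download arm64 packages, so either provision a CAX11 or substitute the x86_64 build of each package on a CX22. Either size runs the Nexo daemon + an internal NATS broker comfortably with headroom for the browser plugin (Chrome).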

This recipe targets a single-tenant personal-agent deploy. For multi-tenant or multi-process see Phase 32.

What you end up with

Nexo daemon under systemd, auto-start on boot
NATS broker on the same host (nats-server from the upstream .deb release), auto-start
Cloudflare Tunnel for inbound HTTPS without opening ports
UFW firewall: inbound denied except SSH from your IP; cloudflared only dials out
Unattended security upgrades
TLS handled by Cloudflare; no Let's Encrypt cert renewal to babysit

Estimated cost: ~€4/month (CX22 only; Cloudflare Tunnel is free).

0. Prerequisites

Hetzner Cloud account with API token
Cloudflare account with a domain pointed at it
SSH key uploaded to Hetzner (hcloud ssh-key create --name ops --public-key-from-file ~/.ssh/id_ed25519.pub)

1. Provision the VPS

Via Hetzner Cloud console: New Server → Location: any close to your users → Image: Debian 12 → Type: CX22 (shared vCPU, x86_64; choose CAX11 for ARM64 to match the arm64 packages used below). Add your SSH key. Name it nexo-1.

CLI alternative:

hcloud server create \
  --name nexo-1 \
  --type cx22 \
  --image debian-12 \
  --ssh-key ops \
  --location nbg1

Wait ~30s, grab the IPv4 from the dashboard.

2. Initial hardening (one-time)

SSH in as root, then drop privileges to a sudo user:

ssh root@<ip>
adduser ops
usermod -aG sudo ops
rsync --archive --chown=ops:ops ~/.ssh /home/ops
exit

ssh ops@<ip>
sudo apt update && sudo apt full-upgrade -y
sudo apt install -y unattended-upgrades ufw fail2ban
sudo dpkg-reconfigure -p low unattended-upgrades

# Firewall: deny inbound, allow outbound + ssh from your IP only
sudo ufw default deny incoming
sudo ufw default allow outgoing
sudo ufw allow from <your-home-ip> to any port 22 proto tcp
sudo ufw enable

# Disable root SSH + password auth
sudo sed -i 's/^#\?PermitRootLogin.*/PermitRootLogin no/' /etc/ssh/sshd_config
sudo sed -i 's/^#\?PasswordAuthentication.*/PasswordAuthentication no/' /etc/ssh/sshd_config
sudo systemctl restart ssh

3. Install Nexo from the .deb

Once Phase 27.4 ships and a release exists with an arm64 .deb:

curl -LO https://github.com/lordmacu/nexo-rs/releases/latest/download/nexo-rs_arm64.deb

# Verify the signature first (Phase 27.3):
curl -LO https://github.com/lordmacu/nexo-rs/releases/latest/download/nexo-rs_arm64.deb.bundle
cosign verify-blob \
  --bundle nexo-rs_arm64.deb.bundle \
  --certificate-identity-regexp 'https://github.com/lordmacu/nexo-rs/.*' \
  --certificate-oidc-issuer https://token.actions.githubusercontent.com \
  nexo-rs_arm64.deb \
  || { echo "REFUSING TO INSTALL UNSIGNED PACKAGE"; exit 1; }

sudo apt install ./nexo-rs_arm64.deb

The post-install script creates the nexo user, gives it ownership of /var/lib/nexo-rs/, and prints next steps. It does not auto-start the service; that comes after the config is wired.

4. Install + enable NATS

# Hetzner Debian repo doesn't ship nats-server; use the upstream .deb
NATS_VERSION=2.10.20
curl -LO "https://github.com/nats-io/nats-server/releases/download/v${NATS_VERSION}/nats-server-v${NATS_VERSION}-linux-arm64.deb"
sudo apt install ./nats-server-v${NATS_VERSION}-linux-arm64.deb
sudo systemctl enable --now nats-server

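Nexo only needs NATS on 127.0.0.1:4222 (loopback), and only Nexo running on the same host should reach it. The upstream package binds 0.0.0.0:4222 by default; the UFW rules from step 2 already block inbound traffic, but to make the listener itself loopback-only, set host: 127.0.0.1 in /etc/nats-server/nats-server.conf and run sudo systemctl restart nats-server.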
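
Whether a listener is reachable only from the local host comes down to its bind address. As a small illustration, the std-only helper below (a hypothetical name, not part of Nexo or NATS) classifies a listen address such as the one `ss -ltn` prints for port 4222.

```rust
use std::net::{AddrParseError, SocketAddr};

/// Returns Ok(true) when the listen address accepts connections only
/// from the local host (127.0.0.0/8 or ::1), Ok(false) otherwise.
pub fn is_loopback_only(listen_addr: &str) -> Result<bool, AddrParseError> {
    let addr: SocketAddr = listen_addr.parse()?;
    Ok(addr.ip().is_loopback())
}
```

`127.0.0.1:4222` and `[::1]:4222` count as loopback-only; `0.0.0.0:4222` does not, because the wildcard address accepts connections on every interface and relies on the firewall alone to keep outsiders away.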

5. Wire Nexo config

sudo -u nexo nexo setup

The wizard asks for:

LLM provider keys (Anthropic / MiniMax / etc.) — paste them; they land in /var/lib/nexo-rs/secret/ mode 0600 owned by nexo:nexo
WhatsApp / Telegram pairing — defer if not needed yet
Memory backend — pick sqlite-vec (default for single-host)

The wizard writes /etc/nexo-rs/{agents,broker,llm,memory}.yaml. Verify broker.yaml points at nats://127.0.0.1:4222.

6. Cloudflare Tunnel for HTTPS

The Nexo admin port (8080) shouldn't be exposed directly. Use a tunnel:

# Install cloudflared
curl -LO https://github.com/cloudflare/cloudflared/releases/latest/download/cloudflared-linux-arm64.deb
sudo apt install ./cloudflared-linux-arm64.deb

# Authenticate (opens a browser link — visit it on your laptop)
cloudflared tunnel login

# Create tunnel
cloudflared tunnel create nexo-1

# Route a hostname
cloudflared tunnel route dns nexo-1 nexo.yourdomain.com

# Config
sudo mkdir -p /etc/cloudflared
sudo tee /etc/cloudflared/config.yml >/dev/null <<EOF
tunnel: nexo-1
credentials-file: /home/ops/.cloudflared/<UUID>.json

ingress:
  - hostname: nexo.yourdomain.com
    service: http://127.0.0.1:8080
  - service: http_status:404
EOF

# Run as a service
sudo cloudflared service install
sudo systemctl enable --now cloudflared

Now https://nexo.yourdomain.com reaches the Nexo admin via Cloudflare's edge — TLS terminated at Cloudflare, no cert renewal, DDoS protection bundled.

7. Start Nexo

sudo systemctl enable --now nexo-rs
sudo journalctl -u nexo-rs -f

You should see the boot sequence: config validated → broker connected → agents loaded → ready.

8. Verify

# Local health check (over the loopback)
curl -fsSL http://127.0.0.1:8080/health

# External via the tunnel
curl -fsSL https://nexo.yourdomain.com/health

# Metrics endpoint
curl -fsSL http://127.0.0.1:9090/metrics | head -20
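
If you would rather script the local check than eyeball curl output, the same idea fits in a few lines of std-only Rust. This is a sketch under stated assumptions: it speaks plain HTTP/1.1 over loopback only (no TLS, no redirects), treats any 2xx status as healthy, and the function names are hypothetical.

```rust
use std::io::{BufRead, BufReader, Write};
use std::net::TcpStream;
use std::time::Duration;

/// Parse the status code out of an HTTP/1.x status line such as
/// "HTTP/1.1 200 OK".
pub fn parse_status(status_line: &str) -> Option<u16> {
    let mut parts = status_line.split_whitespace();
    let version = parts.next()?;
    if !version.starts_with("HTTP/") {
        return None;
    }
    parts.next()?.parse().ok()
}

/// Send a plain-HTTP GET and report whether the response status is 2xx.
/// `addr` is "host:port", for example "127.0.0.1:8080".
pub fn health_ok(addr: &str, path: &str, timeout: Duration) -> std::io::Result<bool> {
    let mut stream = TcpStream::connect(addr)?;
    stream.set_read_timeout(Some(timeout))?;
    stream.set_write_timeout(Some(timeout))?;

    // Write the whole request in one call so the server sees it at once.
    let request = format!("GET {path} HTTP/1.1\r\nHost: {addr}\r\nConnection: close\r\n\r\n");
    stream.write_all(request.as_bytes())?;

    // Only the first response line matters for a health probe.
    let mut status_line = String::new();
    BufReader::new(stream).read_line(&mut status_line)?;
    Ok(matches!(parse_status(&status_line), Some(200..=299)))
}
```

Calling `health_ok("127.0.0.1:8080", "/health", Duration::from_secs(2))` mirrors the first curl command above; the external check through the tunnel needs TLS and is better left to curl.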

9. Backups

The state lives in /var/lib/nexo-rs/. Daily snapshot to S3 / Backblaze:

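#!/bin/sh
# /etc/cron.daily/nexo-backup
set -eu
TIMESTAMP=$(date -u +%Y%m%dT%H%M%SZ)
BACKUP="/tmp/nexo-${TIMESTAMP}.tar.zst"

# Pause the runtime briefly so SQLite isn't mid-write. The trap restarts
# it on exit even if tar fails; the explicit start below is then a no-op.
trap 'systemctl start nexo-rs' EXIT
systemctl stop nexo-rs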

tar -I 'zstd -19 -T0' \
    -cf "$BACKUP" \
    -C /var/lib/nexo-rs \
    --exclude='./queue/*.tmp' \
    .

systemctl start nexo-rs

# Upload — adjust to your storage backend
rclone copy "$BACKUP" remote:nexo-backups/
rm "$BACKUP"

# Retain the last 30 days
rclone delete --min-age 30d remote:nexo-backups/

Make the script executable: sudo chmod +x /etc/cron.daily/nexo-backup

For a backup that does not stop the service, use SQLite's VACUUM INTO-based hot backup; track Phase 36 (backup, restore, migrations) for the upcoming nexo backup subcommand.
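
The backup script prunes by age with rclone. If you prefer a count-based policy (keep the newest N archives), the selection logic is short because the `%Y%m%dT%H%M%SZ` timestamp in the file name sorts lexicographically. The sketch below is std-only and illustrative; `archives_to_prune` is a hypothetical helper, and it only chooses names, it does not delete anything.

```rust
/// Given archive names of the form `nexo-<UTC timestamp>.tar.zst`,
/// return the names that fall outside the newest `keep` entries.
/// Names that do not match the pattern are ignored entirely.
pub fn archives_to_prune(names: &[&str], keep: usize) -> Vec<String> {
    let mut archives: Vec<&str> = names
        .iter()
        .copied()
        .filter(|n| n.starts_with("nexo-") && n.ends_with(".tar.zst"))
        .collect();
    // Newest first: the timestamp format sorts lexicographically.
    archives.sort_unstable_by(|a, b| b.cmp(a));
    archives.into_iter().skip(keep).map(str::to_owned).collect()
}
```

Feeding it the listing of the backup remote and passing the result to your storage tool's delete command gives "retain the last N" semantics, which behaves better than an age cutoff when a few nightly runs are skipped.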

10. Updates

# Pull the latest .deb
curl -LO https://github.com/lordmacu/nexo-rs/releases/latest/download/nexo-rs_arm64.deb
# Verify (always)
cosign verify-blob ...
# Install (apt restarts the service automatically)
sudo apt install ./nexo-rs_arm64.deb

Or wire the apt repo (Phase 27.4 follow-up) and run apt upgrade nexo-rs like any other system package.

Limits + escape hatches

Browser plugin uses ~300 MB RAM per Chrome process. CX22 has 4 GB; budget 2 instances tops. Bump to CX32 (€7/mo, 4 vCPU, 8 GB) when you start hitting OOM.
NATS on the same host is fine for single-tenant; for multi-host, run NATS on its own VM (CX12, €3.29/mo).
TLS terminates at Cloudflare, so the last hop from cloudflared to 127.0.0.1:8080 on the VPS is plain HTTP. The tunnel itself is encrypted at the transport layer (QUIC + mTLS to Cloudflare) and the plain-HTTP hop never leaves the host, so this is fine. If you want defense-in-depth, terminate TLS again locally with caddy or nginx.

Troubleshooting

Tunnel disconnects after reboot — systemctl status cloudflared. The credentials file moved if you reinstalled cloudflared with a different service install. Re-run cloudflared service install after cloudflared tunnel login.
NATS refuses connections — the upstream .deb binds 0.0.0.0:4222 by default. Edit /etc/nats-server/nats-server.conf to set host: 127.0.0.1 and systemctl restart nats-server.
Nexo can't write to /var/lib/nexo-rs/ — sudo chown -R nexo:nexo /var/lib/nexo-rs && sudo chmod 0750 /var/lib/nexo-rs.

Docker compose — single-machine but containerized (vs systemd-native here)
Native install — the underlying mechanics of step 3 if you skip the .deb
Phase 27.4 (Debian / RPM packages) — source of the .deb this recipe consumes

Deploy on Fly.io

Recipe for a single-region Fly.io deploy. Fly's strengths fit Nexo well: persistent volumes (for the SQLite state), health checks, free TLS, easy multi-region scale-out, and a generous free tier (up to 3 shared-1x VMs free) that covers a personal agent.

What you end up with

Nexo daemon + bundled local NATS broker on a single Fly machine
Persistent volume mounted at /app/data (the NEXO_HOME set in fly.toml)
Free TLS via fly.io subdomain (custom domain optional)
Auto-redeploy on every git push to main (via Fly GitHub Action)
Fly's built-in metrics + log streaming

Estimated cost: $0–$5/mo (free tier covers shared-1x VM + small volume; bigger Chrome workloads = $5-15/mo on a performance-1x).

0. Prerequisites

# Install flyctl
curl -L https://fly.io/install.sh | sh
fly auth signup     # first time only
fly auth login      # existing account

# Confirm:
fly version

1. Initialize the app

From the repo root:

fly launch \
  --name nexo-yourname \
  --region <closest-region>  \
  --vm-cpu-kind shared       \
  --vm-cpus 1                \
  --vm-memory 1024           \
  --no-deploy

--no-deploy lets us tweak the generated fly.toml before the first build.

2. `fly.toml`

Replace the auto-generated fly.toml with this:

app = "nexo-yourname"
primary_region = "ams"           # or whichever closest

# Use the published GHCR image instead of building per-deploy.
[build]
  image = "ghcr.io/lordmacu/nexo-rs:latest"

# Persistent state — Fly volumes survive restarts and are
# mounted into the VM. SQLite + transcripts + secret/ live here.
[mounts]
  source = "nexo_data"
  destination = "/app/data"

# Environment for the container. NEXO_HOME points at the volume
# mount (/app/data) so everything writable lands on the volume.
[env]
  RUST_LOG = "info"
  NEXO_HOME = "/app/data"

# `services` block tells Fly which container ports to expose.
[[services]]
  internal_port = 8080
  protocol = "tcp"
  auto_stop_machines = false   # keep the agent running 24/7
  auto_start_machines = true
  min_machines_running = 1

  [[services.ports]]
    port = 80
    handlers = ["http"]
    force_https = true

  [[services.ports]]
    port = 443
    handlers = ["tls", "http"]

  [services.concurrency]
    type = "connections"
    soft_limit = 200
    hard_limit = 250

  [[services.tcp_checks]]
    interval = "15s"
    timeout = "2s"
    grace_period = "30s"

# Metrics endpoint — Fly scrapes Prometheus-style automatically.
[metrics]
  port = 9090
  path = "/metrics"

# VM sizing — bump to performance-1x when the browser plugin is on.
[[vm]]
  cpu_kind = "shared"
  cpus = 1
  memory_mb = 1024

3. Create the volume

fly volumes create nexo_data --region ams --size 3

3 GB covers SQLite + a few months of transcripts. Bump as needed.

4. Set secrets

Fly's secret store injects them as env vars at runtime. Reference them from config/llm.yaml via ${ENV_VAR} placeholders:

fly secrets set ANTHROPIC_API_KEY=sk-ant-...
fly secrets set MINIMAX_API_KEY=...
fly secrets set MINIMAX_GROUP_ID=...
# Anything else your llm.yaml references via ${...}

The Nexo config loader resolves ${ANTHROPIC_API_KEY} placeholders from the process env — works the same whether the env vars come from /run/secrets/, ~/.bashrc, or Fly secrets.
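
To illustrate what `${...}` resolution involves, here is a minimal std-only sketch. It is not the Nexo loader: the function name is hypothetical, it takes the environment as a map so it stays testable, and the choice to fail on a missing variable (rather than substitute an empty string) is an assumption.

```rust
use std::collections::HashMap;

/// Replace every `${NAME}` in `input` with its value from `env`.
/// Returns Err(name) for the first placeholder that has no value.
pub fn resolve_placeholders(
    input: &str,
    env: &HashMap<String, String>,
) -> Result<String, String> {
    let mut out = String::with_capacity(input.len());
    let mut rest = input;
    while let Some(start) = rest.find("${") {
        out.push_str(&rest[..start]);
        let after = &rest[start + 2..];
        match after.find('}') {
            Some(end) => {
                let name = &after[..end];
                let value = env.get(name).ok_or_else(|| name.to_string())?;
                out.push_str(value);
                rest = &after[end + 1..];
            }
            // Unterminated placeholder: keep the text verbatim.
            None => {
                out.push_str(&rest[start..]);
                rest = "";
            }
        }
    }
    out.push_str(rest);
    Ok(out)
}
```

To run it against the real process environment, build the map with `std::env::vars().collect()`. Failing loudly on a missing name is usually what you want for API keys, since an empty key tends to surface later as a confusing authentication error.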

5. Pre-bake the config

Fly mounts /app/data from the volume but /app/config lives inside the image. Two options:

Option A — bake config into a custom image (recommended). Wrap the GHCR image in a tiny Dockerfile:

# Dockerfile.fly
FROM ghcr.io/lordmacu/nexo-rs:latest

# Copy your operator config tree into the image. Adjust to
# whatever your setup needs — just don't ship secrets here, use
# fly secrets for those.
COPY ./config/fly /app/config

# The image is expected to start with `--config /app/config`; the
# fly.toml above does not override CMD.

Then change fly.toml:

[build]
  dockerfile = "Dockerfile.fly"

Option B — write config to the volume on first boot. Use a Fly machine init script that runs nexo setup --non-interactive --from-env once, then exits.

6. Deploy

fly deploy

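The first deploy creates the machine and attaches the volume from step 3. Subsequent deploys replace the image with a rolling restart; with a single volume-backed machine, expect a brief restart rather than true zero downtime.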

7. Verify

# Health
fly status
curl https://nexo-yourname.fly.dev/health

# Metrics (over the Fly internal network)
fly proxy 9090:9090 -a nexo-yourname &
curl http://127.0.0.1:9090/metrics | head -20

# Logs
fly logs

# SSH in if something looks off
fly ssh console

8. Custom domain

fly certs add nexo.yourdomain.com
# Add the CNAME to your DNS as instructed
fly certs check nexo.yourdomain.com

9. Continuous deploy on push

Drop this into .github/workflows/fly-deploy.yml:

name: fly-deploy
on:
  push:
    branches: [main]
permissions:
  contents: read
jobs:
  deploy:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: superfly/flyctl-actions/setup-flyctl@master
      - run: flyctl deploy --remote-only
        env:
          FLY_API_TOKEN: ${{ secrets.FLY_API_TOKEN }}

Get a token: fly tokens create deploy -x 999999h. Drop in repo secrets as FLY_API_TOKEN.

10. Backups

# Manual snapshot
fly volumes snapshots create nexo_data
fly volumes snapshots list  nexo_data

# Restore (creates a new volume from the snapshot)
fly volumes create nexo_data_restored \
  --snapshot-id vs_xxxxxxxxxxxx \
  --region ams

For automated backups, set up a daily Fly cron machine that runs fly volumes snapshots create against the data volume.

Limits + escape hatches

Free tier shared-1x has 1 vCPU + 256 MB RAM — too small for the browser plugin. Disable Chrome (plugins.browser.enabled: false) on shared-1x; or bump to performance-1x ($15/mo, 1 vCPU + 2 GB).
Single-region by default — Fly has a multi-region story but the broker (NATS) doesn't speak Fly's distributed primitives. For multi-region, run NATS on a dedicated VM with NatsBroker cluster mode and pin Nexo machines to the same region as their broker.
Volume snapshots cost $0.15/GB/month — small but adds up if you keep many. Auto-prune via the snapshot cron.

Troubleshooting

Volume mount fails on machine start — fly volumes list must show the volume in the same region as the machine. Mismatch = create the volume in the right region or move the machine.
Out of memory + machine cycles — most likely the browser plugin loaded Chrome on a shared-1x. Check fly logs for OOM killer messages; bump VM size or disable the browser plugin.
Secrets not picked up after deploy — Fly redacts them in logs but they're in the env. SSH in (fly ssh console), run printenv | grep ANTHROPIC to verify.

Docker GHCR — same image Fly pulls
Hetzner deploy — bare-VM alternative if you outgrow Fly's free tier or want full control
Phase 27.5 (Docker GHCR) — source of the image this recipe pulls

Deploy on AWS (EC2)

Recipe for a single-AZ AWS deploy on t4g.small (ARM Graviton). Fits a personal-agent or small team; production multi-AZ scale-out needs Phase 32 multi-host orchestration.

What you end up with

Nexo daemon under systemd on EC2 + EBS gp3 for state
Nginx + Let's Encrypt cert (via certbot) for TLS termination (free)
Route53 hostname pointing at the instance
IAM role granting only SES send + S3 backup-bucket access (no console / no read of other AWS resources)
Daily snapshot of the EBS volume + lifecycle policy retaining 30
CloudWatch agent shipping /var/log/nexo-rs/*.log + metrics

Estimated cost (us-east-1, on-demand):

t4g.small instance: ~$13.43/mo
gp3 16 GB EBS: ~$1.28/mo
Route53 hosted zone: $0.50/mo
Let's Encrypt cert: free
SES outbound (5k emails/mo on free tier first 12 months): free then $0.10/1k
Total: ~$15-20/mo
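
For reference, the fixed line items above add up as follows. The sketch works in integer cents to avoid floating-point rounding; the figures are copied from the list, and usage-based charges (SES beyond the free allowance, data transfer, snapshots) are deliberately left out, which is why the quoted total is a range.

```rust
/// Fixed monthly line items in US cents, taken from the estimate above.
pub const LINE_ITEMS_CENTS: [(&str, u32); 3] = [
    ("t4g.small instance", 1343),
    ("gp3 16 GB EBS", 128),
    ("Route53 hosted zone", 50),
];

/// Sum the cents column.
pub fn monthly_total_cents(items: &[(&str, u32)]) -> u32 {
    items.iter().map(|(_, cents)| cents).sum()
}

/// Render cents as a dollar amount with two decimals.
pub fn format_usd(cents: u32) -> String {
    format!("${}.{:02}", cents / 100, cents % 100)
}
```

The fixed items come to $15.21, the low end of the range; the remaining headroom is for the usage-based charges.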

Cheaper alternative for personal-agent budgets: use Hetzner's CX22 at €4/mo if you don't need AWS-specific integrations.

0. Prerequisites

AWS account with billing alarms set
Route53 hosted zone for your domain
AWS CLI installed and aws configure'd locally
Terraform 1.5+ if you want infra-as-code (recommended)

1. Provision via Terraform (recommended)

The repo will eventually ship deploy/terraform/aws/ (Phase 40 follow-up). Until then, here's a minimal main.tf:

terraform {
  required_providers {
    aws = { source = "hashicorp/aws", version = "~> 5.0" }
  }
}

provider "aws" {
  region = "us-east-1"
}

# --- VPC + subnet -----------------------------------------------------
resource "aws_vpc" "nexo" {
  cidr_block           = "10.0.0.0/16"
  enable_dns_support   = true
  enable_dns_hostnames = true
  tags = { Name = "nexo" }
}

resource "aws_subnet" "nexo_public" {
  vpc_id                  = aws_vpc.nexo.id
  cidr_block              = "10.0.1.0/24"
  availability_zone       = "us-east-1a"
  map_public_ip_on_launch = true
}

resource "aws_internet_gateway" "nexo" {
  vpc_id = aws_vpc.nexo.id
}

resource "aws_route_table" "nexo_public" {
  vpc_id = aws_vpc.nexo.id
  route {
    cidr_block = "0.0.0.0/0"
    gateway_id = aws_internet_gateway.nexo.id
  }
}

resource "aws_route_table_association" "nexo_public" {
  subnet_id      = aws_subnet.nexo_public.id
  route_table_id = aws_route_table.nexo_public.id
}

# --- security group ----------------------------------------------------
resource "aws_security_group" "nexo" {
  name   = "nexo"
  vpc_id = aws_vpc.nexo.id

  # SSH only from your home IP — replace 1.2.3.4/32 with yours.
  ingress {
    from_port   = 22
    to_port     = 22
    protocol    = "tcp"
    cidr_blocks = ["1.2.3.4/32"]
  }

  # 443 open to the world, terminated at nginx on the instance.
  ingress {
    from_port   = 443
    to_port     = 443
    protocol    = "tcp"
    cidr_blocks = ["0.0.0.0/0"]
  }

  # 80 only to redirect to https.
  ingress {
    from_port   = 80
    to_port     = 80
    protocol    = "tcp"
    cidr_blocks = ["0.0.0.0/0"]
  }

  egress {
    from_port   = 0
    to_port     = 0
    protocol    = "-1"
    cidr_blocks = ["0.0.0.0/0"]
  }
}

# --- IAM role: SES + S3 backups, nothing else --------------------------
resource "aws_iam_role" "nexo" {
  name = "nexo-instance"
  assume_role_policy = jsonencode({
    Version = "2012-10-17"
    Statement = [{
      Action    = "sts:AssumeRole"
      Effect    = "Allow"
      Principal = { Service = "ec2.amazonaws.com" }
    }]
  })
}

resource "aws_iam_role_policy" "nexo" {
  name = "nexo-instance-policy"
  role = aws_iam_role.nexo.id
  policy = jsonencode({
    Version = "2012-10-17"
    Statement = [
      { Effect = "Allow", Action = ["ses:SendEmail","ses:SendRawEmail"], Resource = "*" },
      { Effect = "Allow", Action = ["s3:PutObject","s3:GetObject","s3:DeleteObject","s3:ListBucket"], Resource = ["arn:aws:s3:::your-nexo-backups","arn:aws:s3:::your-nexo-backups/*"] }
    ]
  })
}

resource "aws_iam_instance_profile" "nexo" {
  name = "nexo-instance"
  role = aws_iam_role.nexo.name
}

# --- AMI lookup: latest Debian 12 arm64 -------------------------------
data "aws_ami" "debian" {
  most_recent = true
  owners      = ["136693071363"]   # Debian official
  filter {
    name   = "name"
    values = ["debian-12-arm64-*"]
  }
}

# --- instance ----------------------------------------------------------
resource "aws_instance" "nexo" {
  ami                    = data.aws_ami.debian.id
  instance_type          = "t4g.small"
  subnet_id              = aws_subnet.nexo_public.id
  vpc_security_group_ids = [aws_security_group.nexo.id]
  iam_instance_profile   = aws_iam_instance_profile.nexo.name
  key_name               = "your-existing-aws-keypair-name"

  root_block_device {
    volume_size = 16
    volume_type = "gp3"
    encrypted   = true
  }

  tags = {
    Name = "nexo-1"
  }
}

# --- Route53 DNS -------------------------------------------------------
data "aws_route53_zone" "main" {
  name = "yourdomain.com."
}

resource "aws_route53_record" "nexo" {
  zone_id = data.aws_route53_zone.main.zone_id
  name    = "nexo.yourdomain.com"
  type    = "A"
  ttl     = 300
  records = [aws_instance.nexo.public_ip]
}

output "nexo_ip" {
  value = aws_instance.nexo.public_ip
}

Then:

terraform init
terraform apply
# review the plan; type 'yes'

2. Hardening + install (post-provision)

SSH in:

ssh admin@nexo.yourdomain.com
sudo apt update && sudo apt full-upgrade -y
sudo apt install -y unattended-upgrades ufw fail2ban nginx certbot python3-certbot-nginx
sudo dpkg-reconfigure -p low unattended-upgrades

# UFW — defense in depth on top of the security group
sudo ufw default deny incoming
sudo ufw default allow outgoing
sudo ufw allow 22/tcp
sudo ufw allow 80/tcp
sudo ufw allow 443/tcp
sudo ufw enable

# Disable root SSH + password auth
sudo sed -i 's/^#\?PermitRootLogin.*/PermitRootLogin no/' /etc/ssh/sshd_config
sudo sed -i 's/^#\?PasswordAuthentication.*/PasswordAuthentication no/' /etc/ssh/sshd_config
sudo systemctl restart ssh

Install Nexo (when 27.4 .deb is available):

curl -LO https://github.com/lordmacu/nexo-rs/releases/latest/download/nexo-rs_arm64.deb
# Verify Cosign signature first (Phase 27.3) — see verify.md
sudo apt install ./nexo-rs_arm64.deb

NATS:

NATS_VERSION=2.10.20
curl -LO "https://github.com/nats-io/nats-server/releases/download/v${NATS_VERSION}/nats-server-v${NATS_VERSION}-linux-arm64.deb"
sudo apt install ./nats-server-v${NATS_VERSION}-linux-arm64.deb
sudo systemctl enable --now nats-server

3. nginx + Let's Encrypt via certbot

sudo tee /etc/nginx/sites-available/nexo >/dev/null <<'EOF'
server {
    listen 80;
    server_name nexo.yourdomain.com;
    return 301 https://$server_name$request_uri;
}

server {
    listen 443 ssl http2;
    server_name nexo.yourdomain.com;

    # Cert paths populated after `certbot --nginx`
    ssl_certificate     /etc/letsencrypt/live/nexo.yourdomain.com/fullchain.pem;
    ssl_certificate_key /etc/letsencrypt/live/nexo.yourdomain.com/privkey.pem;
    ssl_protocols       TLSv1.2 TLSv1.3;

    # Health check — proxied through to the daemon
    location /health    { proxy_pass http://127.0.0.1:8080; access_log off; }
    location /ready     { proxy_pass http://127.0.0.1:8080; access_log off; }

    # Admin surface (auth via the daemon's session token)
    location /api/      { proxy_pass http://127.0.0.1:8080; }
    location /admin/    { proxy_pass http://127.0.0.1:8080; }

    # Block /metrics from public — scrape internally only
    location /metrics   { return 403; }
}
EOF
sudo ln -s /etc/nginx/sites-available/nexo /etc/nginx/sites-enabled/nexo
sudo nginx -t

# Issue cert (ACME via Let's Encrypt)
sudo certbot --nginx -d nexo.yourdomain.com --non-interactive --agree-tos -m ops@yourdomain.com
sudo systemctl reload nginx

If you want AWS ACM specifically (instead of Let's Encrypt), front the EC2 with an ALB and attach an ACM cert there — adds ~$18/mo for the ALB. Most personal deploys don't need it.

4. Wire SES for outbound email

The IAM role grants ses:SendEmail. Configure in config/llm.yaml:

plugins:
  email:
    provider: ses
    aws_region: us-east-1
    # Credentials come from the EC2 instance profile — no keys
    # in the YAML.
    sender: "agent@nexo.yourdomain.com"

Verify the sender domain in SES first:

aws ses verify-domain-identity --domain yourdomain.com
# Add the printed TXT record to Route53
aws ses set-identity-mail-from-domain --identity yourdomain.com \
    --mail-from-domain mail.yourdomain.com

If your SES account is still in sandbox, request production access via the SES console — required to send to non-verified recipients.

5. EBS snapshots + lifecycle

# Daily snapshot via DLM (Data Lifecycle Manager) — set up once
# in Terraform or via the console:

aws dlm create-lifecycle-policy \
    --description "nexo daily snapshots, retain 30" \
    --state ENABLED \
    --execution-role-arn arn:aws:iam::ACCT:role/AWSDataLifecycleManagerDefaultRole \
    --policy-details '{...}'   # see DLM docs

Or the cheap way: cron + aws ec2 create-snapshot on the instance itself, with the same job deleting snapshots older than 30 days.

6. CloudWatch logs + metrics

sudo apt install -y amazon-cloudwatch-agent
sudo /opt/aws/amazon-cloudwatch-agent/bin/amazon-cloudwatch-agent-config-wizard
# Point at /var/log/nexo-rs/*.log + 9090/metrics scrape

The Prometheus metrics endpoint can be pulled by CloudWatch Container Insights via the EMF agent if you go in that direction. For most personal deploys, journalctl + a Grafana Cloud free-tier scrape is cheaper.

Limits + escape hatches

t4g.small RAM (2 GB) is tight if the browser plugin is on. Bump to t4g.medium (4 GB, ~$26/mo) before turning on Chrome.
Single AZ. AZ outage = full downtime. Multi-AZ needs Phase 32 + an external NATS cluster. Acceptable for personal agents; not for SLAs.
SES sandbox limit (200 emails/day) until you request production. Plan for this if email channel is primary.
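EIP not allocated. Stop/start the instance and the public IP changes. Allocate an Elastic IP if the Route53 record can't auto-update. Note that since February 2024 AWS bills every public IPv4 address, attached Elastic IPs included, at $0.005/hour (about $3.60/mo).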

Troubleshooting

Nexo can't send email — aws sts get-caller-identity from the instance must show the nexo-instance role. If empty, the instance profile is missing.
certbot --nginx fails — DNS hasn't propagated yet. Wait 5-10 min after the Route53 record creation.
/health returns 503 — broker not ready. systemctl status nats-server; if good, check journalctl -u nexo-rs for credential errors (instance profile didn't propagate, or config/llm.yaml references a key the instance can't reach).

Hetzner Cloud — bare-VM, cheaper
Fly.io — easier scaling, less AWS lock-in
Phase 27.4 (Debian package) — source of the .deb this recipe consumes
Phase 27.3 (Cosign) — signature verification before install

NATS with TLS + auth

Harden the broker for a multi-node deployment: mTLS on the client connection, NKey-based authentication, and a separate NATS server process (not the throwaway Docker-compose one).

Prerequisites

A NATS server ≥ 2.10
nsc CLI for generating NKeys
The agent binary deployed where it will run

1. Generate NKeys

nsc add operator --generate-signing-key nexo-ops
nsc add account --name nexo-prod
nsc add user --name agent-kate --account nexo-prod
nsc generate creds --account nexo-prod --name agent-kate > secrets/agent-kate.nkey

secrets/agent-kate.nkey is a single-file credential that contains both the NKey seed and the signed JWT. Treat it like any other secret — gitignored, Docker-secret, k8s-secret.

2. Configure the NATS server

nats-server.conf:

listen: 0.0.0.0:4222
http: 0.0.0.0:8222

tls {
  cert_file: "/etc/nats/tls/server.crt"
  key_file:  "/etc/nats/tls/server.key"
  ca_file:   "/etc/nats/tls/ca.crt"
  verify:    true       # require client certs too (mTLS)
}

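# Operator mode is configured at the top level of the file, not inside
# an authorization {} block.
operator: "/etc/nats/nsc/operator.jwt"
resolver: MEMORY
resolver_preload: {
  # <account public key>: <account JWT>, as emitted by
  # `nsc generate config --mem-resolver`. Both values below are placeholders.
  ACCOUNT_PUBLIC_KEY: "ACCOUNT_JWT"
}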

Start the server:

nats-server -c nats-server.conf

3. Configure the agent

config/broker.yaml:

broker:
  type: nats
  url: tls://nats.example.com:4222
  auth:
    enabled: true
    nkey_file: ./secrets/agent-kate.nkey
  persistence:
    enabled: true
    path: ./data/queue
  fallback:
    mode: local_queue
    drain_on_reconnect: true

The agent reads nkey_file at startup and presents it on every connection.

4. Verify the client

Before starting the full agent, smoke-test the credentials with the nats CLI:

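# verify: true on the server means the client must also present a cert;
# the client.crt / client.key paths below are placeholders.
nats --creds ./secrets/agent-kate.nkey \
     --tlsca /etc/nats/tls/ca.crt \
     --tlscert /etc/nats/tls/client.crt \
     --tlskey /etc/nats/tls/client.key \
     -s tls://nats.example.com:4222 \
     pub test.topic "hello"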

If this works, the agent will too.

5. Deploy

Start the agent as usual:

agent --config ./config

On boot:

The agent opens a TLS connection to the broker
The agent presents its JWT and proves possession of the NKey by signing the server's nonce
The server validates the JWT against the operator/account JWTs
The agent subscribes only to subjects its account is allowed to access

6. Multi-agent isolation

Give each agent its own NKey and an export/import declaration in the NSC account so agents can talk to each other on specific subjects only. Example policy:

# allow kate to publish agent.route.ops
# deny kate from publishing plugin.outbound.* (only the WA plugin should)

The agent does not enforce NATS auth itself — it just presents credentials. The broker enforces. That's the point: you can revoke a compromised agent without touching the agent's code or config.

Observability

circuit_breaker_state{breaker="nats"} flips to 1 if the broker rejects the credentials on startup or after a refresh
disk queue buffers every publish while the circuit is open — see Event bus — disk queue
nats-server --trace (-V) on the server side logs every auth failure with the rejected subject

Gotchas

verify: true (mTLS) combined with operator-mode auth means every client needs both a client certificate and NKey credentials. Requiring one or both is a policy choice, but configure each one completely rather than halfway.
JWT expiry. Account JWTs expire; NSC's push command renews them against the resolver.
Disk queue on client side. Even with auth misconfigured, the agent keeps running on the local fallback; operators may miss the outage without alerting on circuit_breaker_state.

Cross-links

Rotating config without downtime

Three practical hot-reload scenarios. Each shows the YAML edit, how to trigger the swap, and what the operator should see in the logs and on the metrics endpoint. Reference: Config hot-reload.

Prerequisites

A running daemon (agent in another terminal or under systemd).
Broker reachable from the same host (broker.yaml).
Phase 16 + Phase 18 features enabled (default since 0.x of nexo-rs).

A quick sanity check:

$ agent reload
reload v1: applied=1 rejected=0 elapsed=14ms
  ✓ ana

If you get exit 1 with "no control.reload.ack received within 5s", the daemon isn't running or runtime.reload.enabled is false — fix that first.

1. Rotate an LLM API key

The Anthropic key on production rotates every 90 days. The old key stays valid for an hour after the rotation.

Edit

config/llm.yaml:

 providers:
   anthropic:
-    api_key: ${file:./secrets/anthropic_old.txt}
+    api_key: ${file:./secrets/anthropic_new.txt}
     base_url: https://api.anthropic.com

Apply

# Drop the new key first, THEN trigger the reload. The file watcher
# would also pick it up 500 ms after the save; the CLI just makes it explicit.
$ printf '%s' "sk-ant-..." > secrets/anthropic_new.txt
$ chmod 600 secrets/anthropic_new.txt
$ agent reload
reload v2: applied=2 rejected=0 elapsed=22ms
  ✓ ana
  ✓ bob

Verify

# The aggregate counter bumped:
$ curl -s localhost:9090/metrics | grep config_reload_applied_total
config_reload_applied_total 2

# Per-agent versions advanced:
$ curl -s localhost:9090/metrics | grep runtime_config_version
runtime_config_version{agent_id="ana"} 2
runtime_config_version{agent_id="bob"} 2

# Watch one agent's next turn — the new key is used by the LlmClient
# rebuilt inside RuntimeSnapshot::build:
$ tail -f agent.log | grep "llm request"

In-flight LLM calls keep using the old client (the in-flight Arc<dyn LlmClient> is captured per-turn). They land in <30 s; the old key is still valid for the hour the auth team gave you.

2. A/B test a system prompt

You want to roll out a friendlier sales pitch on Ana's WhatsApp binding without touching the Telegram one (which has a longer support persona).

Edit

config/agents.d/ana.yaml:

 inbound_bindings:
   - plugin: whatsapp
     allowed_tools: [whatsapp_send_message]
     outbound_allowlist:
       whatsapp: ["573115728852"]
-    system_prompt_extra: |
-      Channel: WhatsApp sales. Follow the ETB/Claro lead-capture flow.
+    system_prompt_extra: |
+      Channel: WhatsApp sales (variant B — warmer tone).
+      Follow the ETB/Claro lead-capture flow but lead with a personal
+      greeting and use first names.
   - plugin: telegram
     instance: ana_tg
     allowed_tools: ["*"]
     ...

Apply

The file watcher picks the save up automatically:

$ tail -f agent.log
INFO config reload applied version=3 applied=["ana"] rejected_count=0 elapsed_ms=18

Or trigger manually:

$ agent reload
reload v3: applied=1 rejected=0 elapsed=18ms
  ✓ ana

Verify

Send one message on each channel and tail the LLM request log to see which prompt block went to the model.

$ grep "snapshot_version=3" agent.log
INFO inbound matched binding agent_id=ana plugin=whatsapp \
  binding_index=0 snapshot_version=3

The Telegram binding's system_prompt_extra is unchanged; only the WA binding picks up variant B.

Roll back

If variant B underperforms, git revert the YAML and agent reload. Sessions in flight finish their turn on B; the next inbound is back on A.

3. Tighten an outbound allowlist after an incident

A jailbroken prompt almost made Ana send WhatsApp messages to arbitrary numbers (Phase 16's defense-in-depth caught it). Until you investigate, narrow the allowlist to the on-call advisor only.

Edit

config/agents.d/ana.yaml:

 inbound_bindings:
   - plugin: whatsapp
     allowed_tools: [whatsapp_send_message]
     outbound_allowlist:
       whatsapp:
-        - "573115728852"
-        - "573215555555"
-        - "573009999999"
+        - "573115728852"   # incident-only: on-call advisor

Apply

$ agent reload
reload v4: applied=1 rejected=0 elapsed=15ms
  ✓ ana

Verify

Try the previously-allowed-but-now-blocked number from a test message. The LLM will try; the tool will reject:

ERROR tool_call rejected reason="recipient 573215555555 is not in \
  this agent's whatsapp outbound allowlist"

The session's Arc<RuntimeSnapshot> is captured at the start of each turn, so even mid-conversation the next user reply re-loads from the new snapshot and the allowlist update takes effect immediately.
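
The per-turn capture described above can be sketched with a swappable `Arc`. This is a minimal illustration, not the daemon's code: `SnapshotCell`, `recipient_allowed`, and the two fields on `RuntimeSnapshot` are hypothetical names chosen for the example.

```rust
use std::sync::{Arc, RwLock};

/// Hypothetical stand-in for the runtime's snapshot type.
#[derive(Debug)]
pub struct RuntimeSnapshot {
    pub version: u64,
    pub whatsapp_allowlist: Vec<String>,
}

/// Holds the current snapshot; a reload swaps the Arc, readers clone it.
pub struct SnapshotCell {
    current: RwLock<Arc<RuntimeSnapshot>>,
}

impl SnapshotCell {
    pub fn new(initial: RuntimeSnapshot) -> Self {
        Self { current: RwLock::new(Arc::new(initial)) }
    }

    /// Called at the start of each turn: a cheap pointer clone.
    pub fn capture(&self) -> Arc<RuntimeSnapshot> {
        Arc::clone(&self.current.read().unwrap())
    }

    /// Called by a reload: later turns see the new snapshot, while
    /// in-flight turns keep the Arc they already captured.
    pub fn swap(&self, next: RuntimeSnapshot) {
        *self.current.write().unwrap() = Arc::new(next);
    }
}

/// Allowlist check against whichever snapshot the turn captured.
pub fn recipient_allowed(snap: &RuntimeSnapshot, recipient: &str) -> bool {
    snap.whatsapp_allowlist.iter().any(|n| n == recipient)
}
```

A turn that captured version 3 before the reload still sees the wider allowlist until it finishes; the next turn captures version 4 and rejects the removed numbers.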

What you cannot reload (yet)

Adding or removing agents — restart the daemon. Phase 19.
Plugin instances (whatsapp.yaml, telegram.yaml instance blocks) — restart the daemon. Plugin sessions own QR pairing / long-polling state that needs lifecycle plumbing. Phase 19.
broker.yaml, memory.yaml — restart the daemon. Long-lived connections + storage handles aren't safe to swap mid-flight.
workspace, skills_dir, transcripts_dir on an agent — restart that agent.

The daemon logs every restart-required field that changed during a reload at warn level, so you don't have to remember which knob lives where.

Architecture overview

nexo-rs is a single-process multi-agent runtime. One binary (agent) hosts every agent, every channel plugin, every extension, and the persistence layer. Coordination between components happens over NATS (with a local tokio-mpsc fallback when NATS is offline).

Why single-process: shared in-memory caches, zero IPC overhead between agent and tool invocations, simpler ops. The broker and disk queue give us the durability a multi-process layout would provide, without the coordination cost.

High-level layout

flowchart TB
    subgraph PROC[agent process]
        direction TB

        subgraph PLUGINS[Channel plugins]
            WA[WhatsApp]
            TG[Telegram]
            MAIL[Email / Gmail poller]
            BR[Browser CDP]
            GOOG[Google APIs]
        end

        subgraph BUS[Event bus]
            NATS[(NATS)]
            LOCAL[(Local mpsc fallback)]
            DQ[(Disk queue + DLQ)]
        end

        subgraph AGENTS[Agent runtimes]
            A1[Agent: ana]
            A2[Agent: kate]
            A3[Agent: ops]
        end

        subgraph STORE[Persistence]
            STM[(Short-term sessions<br/>in-memory)]
            LTM[(Long-term memory<br/>SQLite + sqlite-vec)]
            WS[(Workspace-git<br/>per agent)]
        end

        subgraph TOOLS[Tools & integrations]
            EXT[Extensions<br/>stdio / NATS]
            MCP[MCP client / server]
            LLM[LLM providers]
        end

        PLUGINS --> BUS
        BUS --> AGENTS
        AGENTS --> BUS
        AGENTS --> STORE
        AGENTS --> TOOLS
        TOOLS --> LLM
    end

    USERS[End users] <--> PLUGINS

Workspace crates

The Cargo.toml workspace defines these member crates:

Crate	Responsibility
`crates/core`	Agent runtime, `AgentBehavior` trait, `SessionManager`, `HookRegistry`, heartbeat, tool registry
`crates/broker`	NATS client, local fallback, disk queue, DLQ, backpressure
`crates/llm`	LLM clients (MiniMax, Anthropic, OpenAI-compat, Gemini), retry, rate limiter
`crates/memory`	Short-term sessions, long-term SQLite, vector search via sqlite-vec
`crates/config`	YAML parsing, env-var resolution, secrets loading
`crates/extensions`	Manifest parser, discovery, stdio + NATS runtimes, watcher, CLI
`crates/mcp`	MCP client (stdio + HTTP), server mode, tool catalog, hot-reload
`crates/taskflow`	Durable flow state machine with wait/resume
`crates/resilience`	`CircuitBreaker` three-state machine
`crates/setup`	Interactive wizard, YAML patcher, pairing flows
`crates/tunnel`	Public HTTPS tunnel for pairing / webhooks
`crates/plugins/browser`	Chrome DevTools Protocol client
`crates/plugins/whatsapp`	Wrapper over `whatsapp-rs` (Signal Protocol)
`crates/plugins/telegram`	Bot API client
`crates/plugins/email`	IMAP / SMTP
`crates/plugins/gmail-poller`	Cron-style Gmail → broker bridge
`crates/plugins/google`	Gmail / Calendar / Drive / Sheets tools

Binaries

Defined in Cargo.toml:

Binary	Entry	Purpose
`agent`	`src/main.rs`	Main daemon; also exposes `setup`, `dlq`, `ext`, `flow`, `status` subcommands
`browser-test`	`src/browser_test.rs`	CDP integration smoke test
`integration-browser-check`	`src/integration_browser_check.rs`	End-to-end browser flow validation
`llm_smoke`	`src/bin/llm_smoke.rs`	LLM provider smoke test

Runtime topology

agent runs a single tokio multi-thread runtime. Work is split into independent tasks:

flowchart LR
    MAIN[main tokio runtime]
    MAIN --> PA[Per-agent runtime task]
    MAIN --> PI[Plugin intake loops]
    MAIN --> HB[Heartbeat scheduler]
    MAIN --> MCP[MCP runtime manager]
    MAIN --> EXT[Extension stdio runtimes]
    MAIN --> MET[Metrics server :9090]
    MAIN --> HEALTH[Health server :8080]
    MAIN --> ADMIN[Admin console :9091]
    MAIN --> LOCK[Single-instance lock watcher]

Each agent runtime owns its own subscription to inbound topics, its own session manager view, its own LLM-loop state. Agents do not share mutable in-memory state — coordination between agents happens over the event bus (agent.route.<target_id>).

What lives where — quick mental model

A message arrives → lands on plugin.inbound.<channel> (NATS)
Agent runtime consumes it → SessionManager attaches or creates a session, HookRegistry fires before_message
LLM loop runs → tools invoked through the registry, which calls into extensions / MCP / built-ins, each wrapped by CircuitBreaker
Tool result flows back → after_tool_call hooks fire, LLM decides next turn
Agent emits reply → publishes to plugin.outbound.<channel>
Channel plugin delivers → physical message goes to the user

Details per subsystem:

Agent runtime

The agent runtime is the per-agent machinery that consumes inbound events, drives the LLM loop, invokes tools, and emits outbound events. One AgentRuntime is instantiated per configured agent at boot; each runs as its own async task.

Source: crates/core/src/agent/ (behavior.rs, agent.rs, runtime.rs, hook_registry.rs), boot in src/main.rs.

AgentBehavior trait

Every agent implements AgentBehavior (crates/core/src/agent/behavior.rs). The trait is intentionally small — default no-ops let built-in types (like LlmAgentBehavior) override only what they need.

Method	Fires on	Default
`on_message(ctx, msg)`	Inbound message from a plugin	no-op
`on_event(ctx, event)`	Any event on a subscribed topic	no-op
`on_heartbeat(ctx)`	Periodic tick (if heartbeat enabled)	no-op
`decide(ctx, msg)`	LLM-reasoning hook (stub for custom flows)	empty string

The shipped LlmAgentBehavior implements the full chat-completion loop with tool calls, streaming, rate-limited retry, and hook fan-out.
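
To show why default no-ops keep implementations small, here is a synchronous sketch of a trait with the same four methods. The real trait lives in crates/core and is presumably async; `Ctx`, `Message`, `Event`, and `CountingBehavior` are hypothetical types, and the signatures are assumptions.

```rust
/// Hypothetical context and payload types for this sketch.
pub struct Ctx {
    pub agent_id: String,
}
pub struct Message {
    pub text: String,
}
pub struct Event {
    pub topic: String,
}

/// Every method has a default, so implementors override only what they need.
pub trait AgentBehavior {
    fn on_message(&mut self, _ctx: &Ctx, _msg: &Message) {}
    fn on_event(&mut self, _ctx: &Ctx, _event: &Event) {}
    fn on_heartbeat(&mut self, _ctx: &Ctx) {}
    fn decide(&mut self, _ctx: &Ctx, _msg: &Message) -> String {
        String::new()
    }
}

/// Overrides on_message only; the other three stay at their defaults.
#[derive(Default)]
pub struct CountingBehavior {
    pub seen: Vec<String>,
}

impl AgentBehavior for CountingBehavior {
    fn on_message(&mut self, ctx: &Ctx, msg: &Message) {
        self.seen.push(format!("{}: {}", ctx.agent_id, msg.text));
    }
}
```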

Boot sequence

sequenceDiagram
    participant Main as src/main.rs
    participant Cfg as AppConfig
    participant Disc as Extension discovery
    participant SM as SessionManager
    participant TR as ToolRegistry
    participant LLM as LLM client
    participant AR as AgentRuntime
    participant Bus as Broker

    Main->>Cfg: load(config_dir)
    Main->>Disc: run_extension_discovery()
    Main->>SM: with_cap(ttl, max_sessions)
    Main->>TR: register built-ins + extensions + MCP
    Main->>LLM: build per provider (w/ CircuitBreaker)
    loop per agent in config
        Main->>AR: new(agent_id, behavior, tools, sm, llm, broker)
        AR->>Bus: subscribe plugin.inbound.<channel>+
        AR->>Bus: subscribe agent.route.<agent_id>
        AR-->>Main: ready
    end
    Main->>Main: install signal handlers
    Main->>Main: serve forever

Request/response lifecycle

A single inbound message drives the following flow inside one agent runtime:

sequenceDiagram
    participant Bus as NATS
    participant AR as AgentRuntime
    participant SM as SessionManager
    participant HR as HookRegistry
    participant LLM as LLM
    participant TR as ToolRegistry
    participant Ext as Extension / MCP / built-in

    Bus->>AR: plugin.inbound.<ch>
    AR->>SM: get_or_create(session_key)
    AR->>HR: fire("before_message")
    loop LLM turn loop
        AR->>LLM: completion(messages, tools)
        LLM-->>AR: assistant turn (text or tool_calls)
        alt tool_calls present
            AR->>HR: fire("before_tool_call", name, args)
            AR->>TR: invoke(tool_name, args)
            TR->>Ext: call
            Ext-->>TR: result
            TR-->>AR: result
            AR->>HR: fire("after_tool_call", name, result)
        else text only
            AR->>Bus: publish plugin.outbound.<ch>
        end
    end
    AR->>HR: fire("after_message")

SessionManager

Defined in crates/core/src/session/manager.rs. Tracks per-user conversational state in memory.

Key: SessionKey derived from (agent_id, channel, sender_id); group chats get one session per group
Storage: DashMap<SessionKey, Session>, a sharded concurrent map (per-shard locks, no global lock)
TTL: configured via memory.short_term.session_ttl (default 30 min); each access updates last_access
Cap: soft limit DEFAULT_MAX_SESSIONS = 10,000; on overflow the oldest-idle session is evicted before insert
Sweeper: background task scans every 1 s, removes expired entries
Callbacks: on_expire() fires via tokio::spawn when a session is dropped — used by the MCP runtime to tear down per-session children

stateDiagram-v2
    [*] --> Active: first message
    Active --> Active: on_message / on_event<br/>(last_access updated)
    Active --> Expired: idle > TTL
    Active --> Evicted: cap exceeded,<br/>oldest-idle chosen
    Expired --> [*]: sweeper removes
    Evicted --> [*]: on_expire() fires
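
The TTL and cap rules can be sketched with a plain `HashMap`. This is a single-threaded illustration under stated assumptions: the `Sessions` type and its `touch` / `sweep` methods are hypothetical, and the real manager uses a concurrent map plus `on_expire()` callbacks that are omitted here.

```rust
use std::collections::HashMap;
use std::time::{Duration, Instant};

#[derive(Clone, Debug, PartialEq, Eq, Hash)]
pub struct SessionKey {
    pub agent_id: String,
    pub channel: String,
    pub sender_id: String,
}

pub struct Session {
    pub last_access: Instant,
}

pub struct Sessions {
    map: HashMap<SessionKey, Session>,
    ttl: Duration,
    max_sessions: usize,
}

impl Sessions {
    pub fn new(ttl: Duration, max_sessions: usize) -> Self {
        Self { map: HashMap::new(), ttl, max_sessions }
    }

    /// Attach to an existing session or create one. When the cap is
    /// reached, the oldest-idle entry is evicted before the insert.
    /// Returns the evicted key, if any.
    pub fn touch(&mut self, key: SessionKey, now: Instant) -> Option<SessionKey> {
        if let Some(s) = self.map.get_mut(&key) {
            s.last_access = now;
            return None;
        }
        let mut evicted = None;
        if self.map.len() >= self.max_sessions {
            let oldest = self
                .map
                .iter()
                .min_by_key(|(_, s)| s.last_access)
                .map(|(k, _)| k.clone());
            if let Some(k) = oldest {
                self.map.remove(&k);
                evicted = Some(k);
            }
        }
        self.map.insert(key, Session { last_access: now });
        evicted
    }

    /// One sweeper pass: drop sessions idle for longer than the TTL.
    /// Returns how many were removed.
    pub fn sweep(&mut self, now: Instant) -> usize {
        let ttl = self.ttl;
        let before = self.map.len();
        self.map.retain(|_, s| now.duration_since(s.last_access) <= ttl);
        before - self.map.len()
    }

    pub fn len(&self) -> usize {
        self.map.len()
    }
}
```

Passing `now` in explicitly keeps the sketch deterministic; a production sweeper would read the clock itself.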

HookRegistry

Defined in crates/core/src/agent/hook_registry.rs. Lets extensions inject behavior at well-known points in the lifecycle without patching the runtime.

Hook names: arbitrary strings. In practice the runtime fires: before_message, after_message, before_tool_call, after_tool_call, on_session_start, on_session_end
Fan-out: sequential by priority (lower first), insertion order breaks ties
Cap: 128 handlers per hook name — defensive guard against a buggy extension re-registering on every reload
Errors: logged, treated as Continue — one misbehaving hook does not cascade into the rest
Override: a hook may return Override(new_args) to mutate what the next hook (or the runtime itself) sees
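
The fan-out rules above (priority order, insertion-order ties, the 128-handler cap, errors treated as Continue, Override feeding the next hook) fit in a short sketch. The types here are hypothetical: arguments are modeled as a plain string, and the handler signature is an assumption rather than the registry's real one.

```rust
use std::collections::HashMap;

pub enum HookOutcome {
    Continue,
    Override(String),
}

pub type Handler = Box<dyn Fn(&str) -> Result<HookOutcome, String>>;

pub const MAX_HANDLERS_PER_HOOK: usize = 128;

#[derive(Default)]
pub struct HookRegistry {
    // (priority, handler) pairs, kept sorted by priority.
    hooks: HashMap<String, Vec<(i32, Handler)>>,
}

impl HookRegistry {
    pub fn register(&mut self, name: &str, priority: i32, handler: Handler) -> Result<(), String> {
        let list = self.hooks.entry(name.to_string()).or_default();
        if list.len() >= MAX_HANDLERS_PER_HOOK {
            return Err(format!("hook '{}' is at its handler cap", name));
        }
        list.push((priority, handler));
        // Stable sort: equal priorities keep their insertion order.
        list.sort_by_key(|(p, _)| *p);
        Ok(())
    }

    /// Runs handlers lowest priority first and returns the final args.
    pub fn fire(&self, name: &str, args: &str) -> String {
        let mut current = args.to_string();
        if let Some(list) = self.hooks.get(name) {
            for (_, handler) in list {
                match handler(&current) {
                    Ok(HookOutcome::Override(next)) => current = next,
                    Ok(HookOutcome::Continue) => {}
                    // A failing hook is logged and treated as Continue.
                    Err(e) => eprintln!("hook {} failed: {}", name, e),
                }
            }
        }
        current
    }
}
```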

Heartbeat

# per-agent config
heartbeat:
  enabled: true
  interval: 30s

Scheduled per agent if heartbeat.enabled: true
Interval parsed via humantime — any humantime duration works
Each tick:
1. Fires AgentBehavior::on_heartbeat(ctx)
2. Publishes agent.events.<agent_id>.heartbeat
Typical uses: proactive messages ("good morning"), reminders, external state syncs (pull Gmail, scan calendar), liveness pings

Graceful shutdown

src/main.rs installs SIGTERM / Ctrl+C handlers. On signal, the process tears down in a specific order so in-flight work finishes cleanly:

flowchart TD
    SIG[SIGTERM / Ctrl+C] --> C1[Cancel dream-sweep loops<br/>5 s grace]
    C1 --> C2[Mark /ready = false<br/>stop new traffic]
    C2 --> C3[Stop plugin intake<br/>no new inbound]
    C3 --> C4[Shutdown MCP runtime manager<br/>5 s clean close]
    C4 --> C5[Shutdown extensions<br/>5 s grace then kill_on_drop]
    C5 --> C6[Stop agent runtimes<br/>drain buffered messages]
    C6 --> C7[Abort metrics + health tasks]
    C7 --> EXIT([exit 0])

This order is enforced in src/main.rs around lines 1389–1458. Extensions get a grace period before kill_on_drop because stdio children can be mid-tool-call; the disk queue absorbs any events that the plugins couldn't finish publishing.

Why this shape

One tokio runtime, many tasks: lets you run 10 agents on one CPU core when idle, saturates cores under load. No thread-per-agent bloat.
No shared mutable state across agents: each agent holds its own registry views, its own session map. Cross-agent communication goes over the bus → visible, replayable, testable.
Hooks instead of inheritance: extensions customize behavior without recompiling the core. Every insertion point is named, sequenced, and capped.

Event bus (NATS)

Every piece of communication between plugins, agents, and the broker layer itself flows over NATS (async-nats = 0.35). When NATS is offline, a local tokio::mpsc bus takes over and a SQLite-backed disk queue holds events until reconnection, so a broker outage on its own does not lose events. The exceptions are the load-shedding cases described under Backpressure.

Source: crates/broker/ (nats.rs, local.rs, disk_queue.rs, topic.rs).

Why NATS

Subject-based routing fits the "N plugins × M agents" fan-out naturally (plugin.inbound.* wildcards)
Low-latency pub/sub with no broker-side state to manage
Cluster-ready without rewriting the data plane
async-nats is mature and offers JetStream if we ever need it

The design doc discusses the alternatives (RabbitMQ, Redis streams) that were rejected; see proyecto/design-agent-framework.md.

Subject namespace

Pattern	Direction	Example	Who publishes	Who subscribes
`plugin.inbound.<plugin>`	plugin → agent	`plugin.inbound.whatsapp`	Channel plugins	Agent runtimes
`plugin.inbound.<plugin>.<instance>`	plugin → agent	`plugin.inbound.telegram.sales_bot`	Multi-instance plugins (WA, TG)	Agent runtimes
`plugin.outbound.<plugin>`	agent → plugin	`plugin.outbound.whatsapp`	Agent tools (send, reply…)	Channel plugins
`plugin.outbound.<plugin>.<instance>`	agent → plugin	`plugin.outbound.whatsapp.ana`	Agent tools	Specific plugin instance
`plugin.health.<plugin>`	plugin → runtime	`plugin.health.browser`	Plugins	Health server
`agent.events.<agent_id>`	internal	`agent.events.ana`	Runtime internals	Dashboards, tests
`agent.events.<agent_id>.heartbeat`	scheduler → agent	`agent.events.kate.heartbeat`	Heartbeat scheduler	That agent
`agent.route.<target_id>`	agent → agent	`agent.route.ops`	Sending agent's `delegate` tool	Target agent runtime
`taskflow.resume`	external → flow	`taskflow.resume`	Anything (other agents, services, ops)	TaskFlow resume bridge

Multi-instance plugins append an .<instance> suffix so two WhatsApp accounts (e.g. Ana's line and Kate's line) can run side by side without subject collisions.

Agent-to-agent routing

sequenceDiagram
    participant Ana
    participant Bus as NATS
    participant Ops

    Ana->>Ana: LLM decides to delegate
    Ana->>Bus: publish agent.route.ops<br/>(correlation_id=X)
    Bus->>Ops: deliver
    Ops->>Ops: on_message handler runs
    Ops->>Bus: publish agent.route.ana<br/>(correlation_id=X)
    Bus->>Ana: deliver
    Ana->>Ana: correlate reply by ID

The sender always includes a correlation_id in the event envelope; the receiver echoes it on the reply. That's how one agent can fan out to several agents and reassemble results.
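
Reassembling a fan-out by correlation_id can be sketched as a small bookkeeping table on the sending side. `Correlator` and its methods are hypothetical helpers for illustration; replies are modeled as `(agent, payload)` string pairs rather than real event envelopes.

```rust
use std::collections::HashMap;

/// Replies collected so far for one fan-out.
struct PendingFanOut {
    expected: usize,
    replies: Vec<(String, String)>, // (replying agent, payload)
}

/// Hypothetical helper that groups replies by correlation_id.
#[derive(Default)]
pub struct Correlator {
    pending: HashMap<String, PendingFanOut>,
}

impl Correlator {
    /// Register a fan-out that expects `expected` replies.
    pub fn start(&mut self, correlation_id: &str, expected: usize) {
        self.pending.insert(
            correlation_id.to_string(),
            PendingFanOut { expected, replies: Vec::new() },
        );
    }

    /// Record one reply. Returns every reply once the last one arrives;
    /// unknown or already-completed IDs are ignored.
    pub fn on_reply(
        &mut self,
        correlation_id: &str,
        from: &str,
        payload: &str,
    ) -> Option<Vec<(String, String)>> {
        let entry = self.pending.get_mut(correlation_id)?;
        entry.replies.push((from.to_string(), payload.to_string()));
        if entry.replies.len() < entry.expected {
            return None;
        }
        self.pending.remove(correlation_id).map(|p| p.replies)
    }
}
```

A real implementation would also want a timeout so a target that never answers does not pin the entry forever; that part is left out of the sketch.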

Broker abstraction

crates/broker exposes a Broker trait implemented by two backends:

NatsBroker — real NATS connection wrapped in a CircuitBreaker
LocalBroker — in-process tokio::mpsc for tests and offline mode

Switching between them is driven by config. The local broker matches NATS subject semantics (including . segments and > wildcards), which keeps the test surface identical to production.
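
For readers unfamiliar with NATS subject semantics, the matching rule a local broker has to reproduce is small: tokens are separated by `.`, `*` matches exactly one token, and `>` matches one or more trailing tokens. The function below is an illustrative sketch of that rule, not the code in crates/broker; `subject_matches` is a hypothetical name.

```rust
/// Returns true when `subject` matches `pattern` under NATS rules.
/// `*` matches exactly one token; `>` must be the last pattern token
/// and matches one or more remaining tokens.
pub fn subject_matches(pattern: &str, subject: &str) -> bool {
    let mut pat = pattern.split('.');
    let mut sub = subject.split('.');
    loop {
        match (pat.next(), sub.next()) {
            // `>` swallows the rest, but only if it ends the pattern.
            (Some(">"), Some(_)) => return pat.next().is_none(),
            (Some("*"), Some(tok)) => {
                if tok.is_empty() {
                    return false;
                }
            }
            (Some(p), Some(s)) => {
                if p != s || s.is_empty() {
                    return false;
                }
            }
            // Both exhausted at the same time: full match.
            (None, None) => return true,
            // One side ran out before the other.
            _ => return false,
        }
    }
}
```

With this rule, `plugin.inbound.*` matches `plugin.inbound.whatsapp` but not the multi-instance subject `plugin.inbound.telegram.sales_bot`, which needs `plugin.inbound.>` or an explicit instance token.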

Disk queue

When a publish to NATS fails — circuit breaker open, connection lost, transient 5xx — the event is persisted to the disk queue instead of being dropped.

Property	Value
Storage	SQLite
Default path	`./data/` (configurable via `broker.persistence.path`)
Tables	`pending_events`, `dead_letters`
Event format	JSON serialization of `Event { id, topic, payload, enqueued_at, attempts }`
Drain order	FIFO by `enqueued_at`
Batch size	up to 100 per `drain()` call
Max attempts before DLQ	3 (`DEFAULT_MAX_ATTEMPTS`)

flowchart LR
    PUB[publish] --> OK{NATS up?}
    OK -->|yes| NATS[(NATS)]
    OK -->|no| ENQ[disk_queue.enqueue]
    ENQ --> SQLITE[(pending_events)]
    RECON[NATS reconnect] --> DRAIN[disk_queue.drain]
    SQLITE --> DRAIN
    DRAIN --> NATS
    DRAIN -.->|3 attempts failed| DLQ[(dead_letters)]
    DRAIN -.->|deserialization error| DLQ

Drain on reconnect

When NatsBroker detects reconnection, it calls disk_queue.drain():

Read up to 100 oldest events from pending_events
Republish each to NATS
On success: delete row
On failure: increment attempts, leave row in place
Once attempts >= 3: move to dead_letters
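
The drain steps above can be sketched against an in-memory stand-in for the two tables. This is an illustration under stated assumptions: the real queue is SQLite-backed, and `DiskQueue::drain` taking a `publish` closure is a hypothetical shape chosen to keep the example self-contained.

```rust
use std::collections::VecDeque;

pub const DEFAULT_MAX_ATTEMPTS: u32 = 3;
pub const DRAIN_BATCH: usize = 100;

#[derive(Clone, Debug, PartialEq)]
pub struct Event {
    pub id: u64,
    pub topic: String,
    pub attempts: u32,
}

/// In-memory stand-in for pending_events and dead_letters.
#[derive(Default)]
pub struct DiskQueue {
    pub pending_events: VecDeque<Event>, // FIFO by enqueue order
    pub dead_letters: Vec<Event>,
}

impl DiskQueue {
    /// One drain pass over up to DRAIN_BATCH oldest events.
    /// `publish` returns true on success. Returns the delivered count.
    pub fn drain(&mut self, mut publish: impl FnMut(&Event) -> bool) -> usize {
        let batch = self.pending_events.len().min(DRAIN_BATCH);
        let mut delivered = 0;
        let mut retry = Vec::new();
        for _ in 0..batch {
            let mut ev = match self.pending_events.pop_front() {
                Some(ev) => ev,
                None => break,
            };
            if publish(&ev) {
                delivered += 1; // success: the row is gone
            } else {
                ev.attempts += 1;
                if ev.attempts >= DEFAULT_MAX_ATTEMPTS {
                    self.dead_letters.push(ev);
                } else {
                    retry.push(ev); // failure: stays pending
                }
            }
        }
        // Put failed events back at the front, preserving their order.
        for ev in retry.into_iter().rev() {
            self.pending_events.push_front(ev);
        }
        delivered
    }
}
```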

Dead-letter queue (DLQ)

Events that exhaust retries, or fail to deserialize at all, land in dead_letters. They're not silently discarded: the CLI lets you inspect and replay them.

agent dlq list              # show all dead events
agent dlq replay <event_id> # move one back to pending_events
agent dlq purge             # drop the table (destructive!)

Replay moves the entry back to pending_events; the next drain cycle retries it with attempts reset.

Backpressure

Two independent mechanisms:

Local broker channels are 256-capacity tokio::mpsc per subscriber. If a subscriber is slow and its channel fills, events are dropped with a slow consumer warning, but the subscription stays alive.
Disk queue applies proportional sleep at >50% capacity (scaled from 0 ms up to MAX_BACKPRESSURE_MS = 500 ms). At the hard cap it additionally drops the oldest event and sleeps 500 ms — an intentional "shed load, don't block the producer forever" stance.

The disk queue's backpressure only matters when NATS is down for a long time and the producer is faster than real time. In normal operation the disk queue stays near-empty.
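
The proportional sleep can be expressed as a one-line ramp. The sketch below assumes the scaling is linear from 0 ms at 50% full to `MAX_BACKPRESSURE_MS` at 100%; the text only says "proportional", so the exact curve and the `backpressure_delay` name are assumptions.

```rust
use std::time::Duration;

pub const MAX_BACKPRESSURE_MS: u64 = 500;

/// Delay to apply before an enqueue, given the queue length and its cap.
/// Assumed linear ramp: 0 ms at or below 50% full, 500 ms at the cap.
pub fn backpressure_delay(len: usize, capacity: usize) -> Duration {
    if capacity == 0 {
        return Duration::from_millis(MAX_BACKPRESSURE_MS);
    }
    let half = capacity / 2;
    if len <= half {
        return Duration::ZERO;
    }
    // Clamp so an over-full queue never exceeds the maximum delay.
    let over = (len.min(capacity) - half) as u64;
    let span = (capacity - half) as u64;
    Duration::from_millis(MAX_BACKPRESSURE_MS * over / span)
}
```

With a capacity of 1000, a queue holding 750 events yields a 250 ms delay and a full queue yields the 500 ms maximum.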

Local fallback

When NATS is unreachable or the circuit breaker on the publish path is Open, the runtime degrades gracefully:

Inbound events from local plugins (e.g. a Telegram webhook fielded in-process) go through LocalBroker and reach agents immediately
Outbound events that target a plugin hosted in the same process (which is every shipped plugin) also go through LocalBroker
Anything that would have crossed a real NATS hop sits in the disk queue until reconnection

In practice, single-machine deployments keep working even with no NATS at all — the disk queue and the local broker together are sufficient for one process. NATS starts earning its keep the moment you scale to multiple processes, machines, or regions.

Broker deployment shapes

The nexo daemon supports three broker deployment shapes: multi-host production clusters, single-host servers, and embedded mobile builds. Choosing between them takes one broker.yaml switch and, in some cases, a different build mode. Phase 92's stdio bridge closed the gap that forced operators to install NATS for any single-host deployment.

Picking a shape

┌────────────────────────────────────────────────────────────┐
│  Multiple daemons across hosts?                            │
│                                                            │
│   YES ──→ Shape 1: NATS                                    │
│           broker.yaml type: nats                           │
│           Plugins: subprocess (extracted out-of-tree)      │
│           Infra:   NATS server / cluster on the network    │
│                                                            │
│   NO  ──→ Single host. Mobile / embedded?                  │
│                                                            │
│           YES (Android / iOS / WASM / Flutter FFI)         │
│            ──→ Shape 3: Embedded                           │
│                broker.yaml type: local                     │
│                Plugins: lib-linked Rust crates (no spawn)  │
│                Infra:   none                               │
│                                                            │
│           NO (laptop dev / server deb / desktop app)       │
│            ──→ Shape 2: Server single-host                 │
│                broker.yaml type: local                     │
│                Plugins: subprocess (extracted)             │
│                Infra:   none (stdio bridge, cross-process) │
│                                                            │
└────────────────────────────────────────────────────────────┘

Shape 1 — Server multi-host (NATS)

The classical setup. NATS server runs on the network; every daemon in the cluster connects to it. Subprocess plugins connect directly to the same NATS, addressed by URL.

┌──────────────┐    ┌──────────────┐    ┌──────────────┐
│  daemon A    │    │  daemon B    │    │  daemon C    │
│  (host 1)    │    │  (host 2)    │    │  (host 3)    │
└──────┬───────┘    └──────┬───────┘    └──────┬───────┘
       │                   │                   │
       ▼                   ▼                   ▼
  ┌─────────────────────────────────────────────────┐
  │            NATS cluster (TLS-mTLS)              │
  └─────────────────────────────────────────────────┘
       ▲                   ▲                   ▲
       │                   │                   │
┌──────┴───────┐    ┌──────┴───────┐    ┌──────┴───────┐
│  whatsapp    │    │  marketing   │    │  telegram    │
│  subprocess  │    │  subprocess  │    │  subprocess  │
│  (host 1)    │    │  (host 2)    │    │  (host 3)    │
└──────────────┘    └──────────────┘    └──────────────┘

Configuration:

# broker.yaml
broker:
  type: nats
  url: "nats://nats.example.com:4222"
  auth:
    enabled: true
    nkey_file: /etc/nexo/nats.nkey

Daemon stamps subprocess plugins with NEXO_BROKER_KIND=nats + NEXO_BROKER_URL=<url>, so each plugin connects to the same NATS server independently. Cross-host fanout is NATS's job.
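
On the plugin side, reading those stamped variables comes down to a small decision. The sketch below is hypothetical: `BrokerTarget` and `broker_target` are illustrative names, and falling back to the stdio bridge when the kind is absent or not `nats` is an assumption of this example, not documented SDK behavior.

```rust
#[derive(Debug, PartialEq)]
pub enum BrokerTarget {
    Nats { url: String },
    StdioBridge,
}

/// Decide which transport to use from the stamped values.
/// Assumption: anything other than kind == "nats" means stdio bridge.
pub fn broker_target(kind: Option<&str>, url: Option<&str>) -> Result<BrokerTarget, String> {
    match (kind, url) {
        (Some("nats"), Some(u)) if !u.is_empty() => {
            Ok(BrokerTarget::Nats { url: u.to_string() })
        }
        (Some("nats"), _) => {
            Err("NEXO_BROKER_KIND=nats but NEXO_BROKER_URL is missing".to_string())
        }
        _ => Ok(BrokerTarget::StdioBridge),
    }
}
```

A plugin would call it with `std::env::var("NEXO_BROKER_KIND").ok().as_deref()` and the matching lookup for `NEXO_BROKER_URL`; keeping the decision in a pure function makes it testable without touching the process environment.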

Shape 2 — Server single-host (stdio bridge)

Daemon and plugins share one host. No NATS server installed; the daemon's in-process Local broker (tokio::mpsc) handles every event. Subprocess plugins reach the local broker through a JSON-RPC stdio bridge piggybacking on the channel the daemon already opens for tool.invoke.

┌─────────────────────────────────────────────────────────────────┐
│  daemon process (single host)                                   │
│                                                                 │
│  ┌─────────────────────────────────┐                           │
│  │  LocalBroker (tokio::mpsc)      │                           │
│  └──┬───────────────────────────┬──┘                           │
│     │                           │                              │
│     │ broker.publish            │ broker.subscribe forwarder   │
│     │ broker.event              │   pre-subscribed from        │
│     │                           │   manifest at boot           │
│     ▼                           ▼                              │
│  ┌─────────────────────────────────┐                           │
│  │  subprocess.rs JSON-RPC dispatch │                           │
│  │   - tool.invoke (existing)       │                           │
│  │   - broker.publish (81.14.b)     │                           │
│  │   - broker.event (81.14.b)       │                           │
│  └──┬─────────────────────────────┬─┘                           │
└─────┼─────────────────────────────┼─────────────────────────────┘
      │ stdin ◀──┐         ┌──▶ stdout
      ▼          │         │
┌─────────────┐  │         │  ┌─────────────┐
│  whatsapp   │──┤         ├──│  marketing  │
│  subprocess │  │         │  │  subprocess │
│             │  │         │  │             │
│  Stdio-     │  │         │  │  SDK's      │
│  Bridge-    │  │         │  │  Broker-    │
│  Broker     │  │         │  │  Sender +   │
│  (Phase 92) │  │         │  │  on_broker_ │
│             │  │         │  │  event      │
└─────────────┘  │         │  └─────────────┘
                 │         │
                 ▼         ▼
            (each plugin's StdioBridgeBroker holds an
             mpsc::Sender<Value> the SDK's PluginAdapter
             drains onto its single async stdout writer)

Configuration:

# broker.yaml
broker:
  type: local
  url: ""

Daemon stamps subprocess plugins with NEXO_BROKER_KIND=stdio_bridge and omits NEXO_BROKER_URL (no network endpoint). Each plugin's main.rs reads the env, calls PluginAdapter::with_stdio_bridge_broker(), and wraps the returned StdioBridgeBroker in AnyBroker::stdio_bridge. From there the plugin's existing BrokerHandle::publish / .subscribe code keeps working unchanged.
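The env-driven selection described above can be sketched as a pure function. This is a minimal illustration, not the plugin's actual main.rs: `BrokerChoice`, `choose_broker`, and `choose_broker_from_env` are hypothetical names, and treating an unset or unknown NEXO_BROKER_KIND as an error is an assumption. Only the two variable names and the values `nats` and `stdio_bridge` come from this page.

```rust
/// Which broker transport the plugin should construct at startup.
/// (Hypothetical type for illustration; not part of the SDK.)
#[derive(Debug, PartialEq, Eq)]
pub enum BrokerChoice {
    /// Connect to a NATS server at the given URL.
    Nats { url: String },
    /// Reach the daemon's local broker over the existing stdio channel.
    StdioBridge,
}

/// Decide the transport from the two stamped variables.
/// `kind` and `url` are the values of NEXO_BROKER_KIND and NEXO_BROKER_URL,
/// passed in explicitly so the logic can be tested without touching the
/// process environment.
pub fn choose_broker(kind: Option<&str>, url: Option<&str>) -> Result<BrokerChoice, String> {
    match kind {
        Some("nats") => match url {
            Some(u) if !u.is_empty() => Ok(BrokerChoice::Nats { url: u.to_string() }),
            // Assumption: a NATS kind without a URL is a configuration error.
            _ => Err("NEXO_BROKER_KIND=nats requires NEXO_BROKER_URL".to_string()),
        },
        // No URL is expected here: the bridge has no network endpoint.
        Some("stdio_bridge") => Ok(BrokerChoice::StdioBridge),
        Some(other) => Err(format!("unknown NEXO_BROKER_KIND: {other}")),
        None => Err("NEXO_BROKER_KIND is not set".to_string()),
    }
}

/// Thin wrapper that reads the real process environment.
pub fn choose_broker_from_env() -> Result<BrokerChoice, String> {
    let kind = std::env::var("NEXO_BROKER_KIND").ok();
    let url = std::env::var("NEXO_BROKER_URL").ok();
    choose_broker(kind.as_deref(), url.as_deref())
}
```

In a real plugin the `StdioBridge` arm is where PluginAdapter::with_stdio_bridge_broker() would be called. Keeping the decision separate from the environment read makes each branch easy to unit-test.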

Marketing uses the SDK's BrokerSender + on_broker_event hooks directly (it never went through AnyBroker::from_config); those hooks already route through the same stdout writer, so marketing needs zero migration to participate in this shape.
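The "single stdout writer" idea can be sketched with the standard library alone. This is an assumption-laden illustration: the real SDK uses a Tokio mpsc channel carrying JSON values, whereas this sketch uses std::sync::mpsc with pre-serialized String frames, and the one-frame-per-line framing is assumed rather than taken from the SDK. `drain_frames` is a hypothetical name.

```rust
use std::io::{self, Write};
use std::sync::mpsc::Receiver;

/// Drain every queued frame onto `out`, one frame per line.
/// Returns the number of frames written once all senders are dropped.
/// Because only this function touches `out`, frames queued by different
/// producers (tool replies, broker publishes) never interleave mid-line.
pub fn drain_frames<W: Write>(rx: Receiver<String>, out: &mut W) -> io::Result<usize> {
    let mut written = 0;
    // `recv` blocks until a frame arrives and returns Err once every
    // Sender has been dropped, which ends the loop.
    while let Ok(frame) = rx.recv() {
        out.write_all(frame.as_bytes())?;
        out.write_all(b"\n")?;
        out.flush()?;
        written += 1;
    }
    Ok(written)
}
```

Each producer keeps a cloned Sender and never writes to stdout itself, so the writer is the only place where ordering and flushing have to be reasoned about.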

Shape 3 — Embedded (Android / iOS / WASM)

Daemon and plugins linked into a single binary as Rust crates. No subprocess spawns at all; the LocalBroker works directly because every plugin shares the same process memory.

┌─────────────────── APK / Bundle / WASM module ──────────────────┐
│                                                                 │
│  Single process:                                                │
│                                                                 │
│  ┌──────────────┐  ┌──────────────┐  ┌──────────────┐          │
│  │ daemon core  │  │ WhatsappPlugin│  │ MarketingPlugin│        │
│  │              │  │   (crate)    │  │   (crate)    │          │
│  └──────┬───────┘  └──────┬───────┘  └──────┬───────┘          │
│         │                  │                  │                  │
│         └──────────────────┴──────────────────┘                 │
│                            │                                     │
│                            ▼                                     │
│                ┌──────────────────────┐                          │
│                │  LocalBroker         │                          │
│                │  (tokio::mpsc        │                          │
│                │   in-process)        │                          │
│                └──────────────────────┘                          │
└─────────────────────────────────────────────────────────────────┘

Configuration:

# broker.yaml
broker:
  type: local

No env vars stamped — there's no subprocess. The host (Android JNI, Flutter FFI shim, WASM glue) injects an Arc<AnyBroker::Local> directly into each plugin factory. Use of stdio-bridge is impossible in this shape because there's nothing on the other end of stdin.

Plugin crates already expose the surface for this; the build flip is a feature flag on the daemon and a different main entrypoint in the host shim. See Phase 90 (Android embed) for the concrete build pipeline.

Daemon env vars stamped on each subprocess plugin

Env var	When stamped	Plugin reads when
`NEXO_BROKER_KIND`	always for `whatsapp` + `telegram` instance factories (Phase 92.4)	constructing the broker in `main.rs`
`NEXO_BROKER_URL`	only when KIND is `nats` (Phase 92.4 omits for stdio_bridge)	constructing the NATS BrokerInner

For non-instance plugins (marketing today, plugins discovered via plugins.discovery.search_paths without a per-instance factory) the env clear path isn't applied; they inherit NEXO_BROKER_KIND from the daemon's own env if the operator exported it before launching the daemon. The Phase 92 dev-daemon.sh template seeds NEXO_BROKER_KIND in the daemon's environment for this reason.
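The daemon-side stamping summarised in the table can be sketched with std::process::Command. `StampMode`, `stamp_broker_env`, and `stamped_env` are hypothetical names for illustration (the real helpers are the `seed_*_subprocess_env_for` functions listed in the source map). Using `env_remove` to guarantee the URL is absent in stdio_bridge mode is an assumption about how "omits" is implemented.

```rust
use std::process::Command;

/// Broker mode the daemon is running in (illustrative; not the real config type).
pub enum StampMode<'a> {
    Nats { url: &'a str },
    StdioBridge,
}

/// Apply the broker env vars to a plugin command before it is spawned.
pub fn stamp_broker_env(cmd: &mut Command, mode: &StampMode<'_>) {
    match mode {
        StampMode::Nats { url } => {
            cmd.env("NEXO_BROKER_KIND", "nats");
            cmd.env("NEXO_BROKER_URL", *url);
        }
        StampMode::StdioBridge => {
            cmd.env("NEXO_BROKER_KIND", "stdio_bridge");
            // Assumption: explicitly drop any URL inherited from the daemon's
            // own environment, since the bridge has no network endpoint.
            cmd.env_remove("NEXO_BROKER_URL");
        }
    }
}

/// List what was explicitly set (Some) or removed (None) on the command.
pub fn stamped_env(cmd: &Command) -> Vec<(String, Option<String>)> {
    cmd.get_envs()
        .map(|(k, v)| {
            (
                k.to_string_lossy().into_owned(),
                v.map(|s| s.to_string_lossy().into_owned()),
            )
        })
        .collect()
}
```

An explicit removal matters for the non-instance case above: a plugin that merely inherits the daemon's environment could otherwise see a stale NEXO_BROKER_URL next to a stdio_bridge kind.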

Migration path from a pre-92 deployment

A single-host operator that installed NATS only because subprocess plugins forced it (the pre-92 default behaviour) follows this sequence after pulling a 92-or-later release:

Upgrade the daemon binary to one that contains Phase 92.
Upgrade each subprocess plugin binary (whatsapp, telegram) to one that contains the matching 92.6 migration.
Stop NATS: sudo systemctl stop nats-server.
Switch broker.yaml back to type: local.
Restart the daemon. Verify plugins.discovery: plugin registry wire complete loaded=N invalid=… init_failed_total=0 in the log.
Exercise an end-to-end flow (e.g. send a WhatsApp message, expect the bot reply). The whole pipeline now runs without any external broker.

Cluster operators (Shape 1) are unaffected — type: nats continues to work identically.

Source map

Concern	File
`BrokerKind::StdioBridge` enum	`crates/config/src/types/broker.rs`
`StdioBridgeBroker` impl	`crates/broker/src/stdio_bridge.rs`
`AnyBroker::StdioBridge` variant	`crates/broker/src/any.rs`
Daemon-side `broker.publish` handler	`crates/core/src/agent/nexo_plugin_registry/subprocess.rs` (Phase 81.14.b)
Daemon-side `broker.event` forwarder	same file (auto-subscribe from manifest)
`seed_*_subprocess_env_for` helpers	`proyecto/src/main.rs`
SDK `with_stdio_bridge_broker` helper	`crates/microapp-sdk/src/plugin.rs`
Plugin migrations (consumers)	`nexo-rs-plugin-whatsapp/src/main.rs`, `nexo-rs-plugin-telegram/src/main.rs`

Phase 92 (the stdio-bridge broker) shipped in v0.1.6 — see the release notes. Remaining sub-phases (an end-to-end integration test, Prometheus metrics for the bridge) are tracked as follow-ups.

Phase 93.11 — Compile-Time Plugin Decoupling Audit

Status: closed 2026-05-16. All 47 anchor sites cleared.

Phase 95 close-out — 6/N plugin extraction milestone

Status: closed 2026-05-17. Web-search joins the canonical subprocess plugin set: browser (81.17.c) → telegram (81.18) → whatsapp (81.19.a) → email (81.19.b) → google (94) → web-search (95).

nexo-rs-plugin-web-search lives in a standalone repo and ships as nexo-plugin-web-search 0.1.0. The daemon's nexo-core 0.2.0 breaking release removes web_search_router from AgentContext / AgentRuntime / AgentSpawnConfig / McpServerBootContext; the WebSearchTool in-process ToolHandler (crates/core/src/agent/web_search_tool.rs) is deleted entirely. crates/web-search/ survives as a workspace member for direct consumers (microapp embeds, future MCP standalone tools); the daemon's compile graph no longer pulls it in.

Phase 95 also adds the agnostic tool.invoke.params.policy framework contract (microapp-sdk 0.1.19 + nexo-core 0.2.0): the daemon's RemoteToolHandler stamps the per-binding EffectivePolicy::for_tool(tool_name) slice onto every JSON-RPC envelope. Future subprocess tools needing per-binding gating (lsp, dream, fork) reuse the same envelope without daemon-side changes.

Phase 94 close-out — 5/5 plugin extraction milestone

Status: closed 2026-05-16. The Phase 81 plugin-extraction lineage is complete: browser (81.17.c) → telegram (81.18) → whatsapp (81.19.a) → email (81.19.b) → google (94).

nexo-rs-plugin-google lives in a standalone repo and ships as nexo-plugin-google 0.2.0 (subprocess binary). The daemon (nexo-rs) no longer imports nexo_plugin_google::* from main.rs nor from crates/setup/. crates/plugins/google/ survives as the lib dep for nexo-poller's google_calendar + gmail builtins (in-process callers); future cleanup migrates the poller to the published lib crate so the in-tree dir can be deleted.

Result

cargo tree -i nexo-plugin-{whatsapp,telegram,email,browser} returns "did not match any packages" against either the default daemon build or --no-default-features. The daemon binary compiles with zero direct or transitive dependency on any canonical channel-plugin crate; pairing, outbound dispatch, HTTP routes, admin RPC, metrics scrape, dashboard sources, and pairing triggers all flow through manifest-declared broker contracts:

[plugin.pairing.adapter] (Phase 81.33.b.real Stage 1)
[plugin.http] (Stage 2)
[plugin.admin] (Stage 4)
[plugin.metrics] (Stage 5)
[plugin.dashboard] (Stage 6)
[plugin.pairing.trigger] (Phase 81.20.x Stage 7 Phase 2)
[plugin.public_tunnel] (Phase 81.20.x Stage 7 Phase 2)

Operators wanting an embedded in-process build link the canonical plugin crate directly from a custom binary — the daemon's published default ships with zero hardcoded plugin imports.

Historical TL;DR (pre-close)

Daemon binary (nexo-daemon) cannot build without the four canonical plugin crates as Cargo dependencies. Phase 93's opaque-config + PluginsConfig.entries work eliminated runtime YAML coupling but did not touch the compile-time import graph. 47 distinct import sites (43 in code, 4 in Cargo) anchor the daemon to nexo-plugin-whatsapp, nexo-plugin-telegram, nexo-plugin-email, nexo-plugin-browser.

Recommendation (closed via Phase 81.20.x Stage 7 Phase 2): Hybrid path adopted. Cargo feature-gates landed first (whatsapp/telegram/browser ~Stage 7 Phase 1) and then the gates themselves were deleted once each plugin shipped its manifest sections — cargo tree returns unmatched for all four canonical plugins as of 2026-05-16.

Inventory: 47 sites

Cargo dependency anchors (4)

File	Dep
`Cargo.toml:376`	`nexo-plugin-whatsapp = "0.1.3"` (root workspace dep)
`Cargo.toml:377`	`nexo-plugin-email = "0.1.3"` (root workspace dep)
`Cargo.toml:410`	`nexo-plugin-whatsapp = { workspace = true }` (daemon bin)
`Cargo.toml:415`	`nexo-plugin-telegram = "0.1.1"` (daemon bin)
`Cargo.toml:416`	`nexo-plugin-email = { workspace = true }` (daemon bin)
`crates/setup/Cargo.toml:25`	`nexo-plugin-email = { workspace = true }`
`crates/setup/Cargo.toml:69`	`nexo-plugin-whatsapp = { workspace = true }`

nexo-plugin-browser is NOT a Cargo dep: it is wired in only through env-var seeding via a copy-pasted env_config reference (no compile-time coupling). Already decoupled in practice.

Code import sites (43) — by classification bucket

Bucket A — dies after 81.32.c7.c (full-parity tools registry extract). The register_<channel>_tools(&tools) blocks at agent boot + hot-spawn path. Plugins already expose NexoPlugin::register_outbound_tools(&self, &ToolRegistry) (trait method exists). Daemon currently hardcodes per-plugin calls as fallback. Migration: drop the hardcoded calls; loop over plugin_handles and dispatch register_outbound_tools.

src/main.rs:5128  register_whatsapp_tools     (boot path)
src/main.rs:5132  register_telegram_tools     (boot path)
src/main.rs:5145  filter_from_allowed_patterns
src/main.rs:5146  register_email_tools_filtered
src/main.rs:5150  EMAIL_TOOL_NAMES.len()
src/main.rs:6917  register_whatsapp_tools     (hot-spawn path)
src/main.rs:6920  register_telegram_tools     (hot-spawn path)
src/main.rs:7025  filter_from_allowed_patterns
src/main.rs:7026  register_email_tools_filtered

9 sites. Already tracked by Phase 81.32.c7.c, not new debt. Effort: ~4h.

Bucket B — dies after 81.33.b.real (manifest-driven pairing-adapter). Daemon hardcodes XxxPairingAdapter::new(broker) because SubprocessNexoPlugin::build_pairing_adapter() still defaults to None — manifest schema for [plugin.pairing.adapter] not yet finalised (see crates/core/src/agent/plugin_host.rs:113-119).

src/main.rs:1654  WhatsappPairingAdapter::new(broker)
src/main.rs:1657  TelegramPairingAdapter::new(broker)

2 sites. New follow-up surfaced: Phase 81.33.b.real manifest schema + GenericBrokerPairingAdapter. Effort: ~5h.

Bucket C — dies after 93.5.d (WhatsApp orchestration generalisation). Daemon-owned pairing state map + tunnel config + HTTP UI dispatcher.

src/main.rs:710     SharedPairingState (fn sig)
src/main.rs:1038    SharedPairingState (fn sig)
src/main.rs:1126    QrSnapshot
src/main.rs:2492    pairing_trigger::CHANNEL_ID
src/main.rs:2494    WhatsappPairingTrigger::from_configs
src/main.rs:3226    SharedPairingState
src/main.rs:3244    PairingState::new
src/main.rs:15167   dispatch_route + WhatsappRoute
src/main.rs:15174   PAIR_PAGE_HTML
crates/setup/src/admin_adapters.rs:3728   dispatch::TOPIC_OUTBOUND
crates/setup/src/admin_bootstrap.rs:713   WhatsappBotHandle
crates/setup/src/writer.rs:789            session::pair_once

12 sites. Already tracked as 93.5.d DEFERRED-strict. Trigger: 2nd pairing-based channel with daemon-owned tunnel + admin RPC. Effort when triggered: ~5h.

Bucket D — email in-process integration (autonomous_worker + tool ctx + metrics + wizard validators). Email is the only canonical plugin that has NOT undergone subprocess flip. Daemon holds Arc<EmailPlugin> to expose:

dispatcher_handle() for outbound tool routing
bounce_store_handle() for delivery receipts
attachments_dir() for MCP server attachment paths
health_map() for /metrics Prometheus rendering
WorkerState enum for /health HTML rendering
MCP autonomous_worker mode embeds EmailToolContext in-process to share Arc<HealthMap>

src/main.rs:715,3289,3299                EmailPlugin construction (boot)
src/main.rs:3698,3766,3767               EmailToolContext (boot)
src/main.rs:14129,14153,14154,14174      EmailPlugin (autonomous_worker)
src/main.rs:15117,15273,15286-15289      metrics + render_email_health + WorkerState
crates/setup/src/services/email.rs:28-31 ImapConnection / provider_hint /
                                         SmtpClient / spf_dkim (wizard validators)

15 sites. Email is structurally in-process by design. Subprocess-flip would require a Phase 81.20.x plan with:

Broker RPC for dispatcher_handle outbound (latency impact)
Cross-process Arc<HealthMap> sync (state-replication design)
Subprocess-side MCP server merged into autonomous_worker mode (or autonomous_worker stays in-tree as "core service")
Setup wizard validators (ImapConnection / SmtpClient probe) invocable via wizard-only crate or broker RPC

Estimated effort: 20-30h. No active trigger today — email works in-process, autonomous_worker depends on it.

Imports already dead-after-something (zero today, classify-only)

None. Every import site has a live consumer. Cat C audit 93.5.c (2026-05-15) confirmed zero zombies in the related typed-config sites; same conclusion holds for plugin imports.

Decision matrix

Option 1 — Cargo feature-gates

[features]
default = ["plugin-whatsapp", "plugin-telegram", "plugin-email", "plugin-browser"]
plugin-whatsapp = ["dep:nexo-plugin-whatsapp"]
plugin-telegram = ["dep:nexo-plugin-telegram"]
plugin-email    = ["dep:nexo-plugin-email"]
plugin-browser  = [] # already no Cargo dep; flag gates env seeding code

Each import site wrapped with #[cfg(feature = "plugin-X")]. Build slim binary with cargo build --no-default-features --features plugin-whatsapp.
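A minimal sketch of what one wrapped site could look like, assuming the feature names from the [features] table above. The `register_*` functions here are stand-ins that push a channel name. They are not the daemon's real registration calls, which would reference the plugin crates inside the gated bodies.

```rust
// Compiled only when the feature is on; this is where the plugin crate
// would be referenced in a real build.
#[cfg(feature = "plugin-whatsapp")]
fn register_whatsapp(registered: &mut Vec<&'static str>) {
    registered.push("whatsapp");
}

// No-op twin so call sites need no cfg of their own.
#[cfg(not(feature = "plugin-whatsapp"))]
fn register_whatsapp(_registered: &mut Vec<&'static str>) {}

#[cfg(feature = "plugin-telegram")]
fn register_telegram(registered: &mut Vec<&'static str>) {
    registered.push("telegram");
}

#[cfg(not(feature = "plugin-telegram"))]
fn register_telegram(_registered: &mut Vec<&'static str>) {}

/// Channels whose registration code was compiled into this build.
pub fn registered_channels() -> Vec<&'static str> {
    let mut registered = Vec::new();
    register_whatsapp(&mut registered);
    register_telegram(&mut registered);
    registered
}
```

Pairing each gated function with a no-op twin keeps the #[cfg] noise at the definition instead of at every call site, which is one way to limit the main.rs clutter listed under Cons.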

Pros.

Cheap: ~6h for the three subprocess-flipped plugins (whatsapp + telegram + browser). 9-12 cfg wrap sites (the bucket-A code is already feature-shaped by the if-plugin-present check).
Compile-time enforcement: cargo build --no-default-features --features plugin-whatsapp proves "daemon compiles without telegram crate".
Unlocks Android/embedded slim builds today (whatsapp-only daemon is real demand per project_android_flutter_target memory).
Composable with subprocess-flip: a feature-disabled plugin can still run as a discovered subprocess if its manifest is in search_paths/. The daemon never imports the crate; the subprocess does.

Cons.

Doesn't help 3rd-party plugins (still need to be Cargo deps of daemon or run as subprocesses).
Doesn't decouple email (in-process by design).
Adds #[cfg] noise in main.rs (~9-12 blocks).
Workspace default-features quirks already biting (per feedback_rustls_default_features_off); each new feature needs default-features = false discipline.

Effort. ~6h for whatsapp + telegram + browser gates + test matrix (build --no-default-features per single-channel combo). Email NOT gated (keeps current shape).

Option 2 — Full dynamic-loading via manifest + broker

Daemon drops all nexo_plugin_X:: imports. Every integration becomes:

Pairing: NexoPlugin::build_pairing_adapter() → GenericBrokerPairingAdapter (81.33.b.real manifest).
Tools: NexoPlugin::register_outbound_tools(&tool_registry) (already trait, just drop hardcoded fallbacks — 81.32.c7.c).
Pairing trigger map: NexoPlugin::pairing_trigger() → dyn PairingTrigger (new opt-in trait method).
Tunnel + session_dir: NexoPlugin::orchestration_descriptor() → typed OrchestrationRequirements (new — covers tunnel port, pairing-state HTTP route, instance loop driver).
Email in-process: broker RPC OR keep gated in-tree.

Pros.

True self-describing: drop plugin in search_paths/, it works without daemon recompile.
Android-friendly (no compile-time plugin baggage on lib target).
Eliminates the 30+ import-site debt entirely.
Forces clean trait surfaces (good architectural pressure).

Cons.

Cost: ~30-50h across multiple sub-phases (manifest schema design, GenericBrokerPairingAdapter, OrchestrationDescriptor, email broker-RPC bridge OR feature-gate, wizard validator extraction).
Speculative trait shapes without driver: 93.5.d locked DEFERRED-strict for exactly this reason. 93.10 channels_dashboard same shape.
Email subprocess-flip is 20-30h on its own with real latency tradeoffs (in-process Arc<HealthMap> → broker cross-process sync).
Migration risk: each trait method that mis-fits the real-world 2nd-channel driver = churn.

Effort. 30-50h. No discrete trigger today (no 3rd-party plugin, no email subprocess driver).

Option 3 — Hybrid (recommended)

Ship feature-gates now (~6h) for whatsapp + telegram + browser. Email stays default-on. This delivers the Android/embedded slim-daemon win NOW.
Defer dynamic-loading trait surface until trigger: 3rd-party plugin demand OR email subprocess driver lands.
Land 81.32.c7.c (tool registry full-parity extract, already tracked, ~4h) — kills 9 bucket-A imports regardless of which long-term path wins.
Land 81.33.b.real (manifest-driven pairing adapter schema, already implied by L113-119 comment in plugin_host.rs, ~5h) — kills 2 bucket-B imports.

Combined immediate work ~15h, drops ~11 of 43 code imports + adds compile-time enforcement that whatsapp|telegram|browser are optional. Leaves the email-in-process bucket (15 sites) intact as documented design choice, not silent debt.

Recommendation

Ship Option 3 Hybrid. Phase 93 closes with:

93.5.d DEFERRED-strict (already done)
93.10 DEFERRED until non-canon dashboard driver (already done)
93.11 = this memo (concrete data + decision)
93.12 (new) = ship feature-gates per above
Bucket-B follow-up = 81.33.b.real (existing implied)

Phase 93 then status = shipped + audit-complete. Email in-process is acknowledged design (not bug). Daemon can ship slim builds for embedded targets immediately. Dynamic-loading becomes a Phase 100+ candidate triggered by real 3rd-party demand, not speculative.

Trigger watchlist (re-open this audit if)

3rd-party plugin (slack/discord/sms) requests to ship as pure subprocess without daemon recompile.
Android Flutter integration finalised + daemon binary size becomes a release-blocker.
Email subprocess-flip plan opens (would invalidate the "email stays in-process" recommendation).
2nd pairing-based channel lands and 93.5.d unblocks — forces revisit of trait shape consensus.

Plugin Auto-Discovery — Design Memo

Status: design memo (no code change). Produced 2026-05-15 to anchor the next 2-4 sessions of work toward the goal:

Adding a new plugin to nexo should be a drop-in operation. The operator places the plugin binary and its manifest in plugins.discovery.search_paths and the daemon picks up EVERY capability the plugin declares — outbound tools, credentials, HTTP routes, pairing flows, dashboard surface, per-instance orchestration — without any daemon-side code change.

Reference mining

OpenClaw (/home/familia/chat/research/):

src/channels/plugins/types.plugin.ts:47-96 — ChannelPlugin declarative top: id, meta, capabilities, gatewayMethods, configSchema, reload.
src/channels/plugins/types.adapters.ts:76-858 — imperative handler split (gateway.startAccount / pairing / auth.login / outbound.send* / messaging.normalizeTarget / directory.self / lifecycle.onAccountConfigChanged).
src/channels/plugins/types.core.ts:100 — webhookPath as per-account declarative HTTP mount point.
src/gateway/server-channels.ts:285-449 — daemon-managed per-account lifecycle loop (AbortController, exponential backoff 5s→5min, ≤10 retries, status snapshot).
src/plugins/inspect-shape.ts:36-127 — runtime introspection classifies plugins as plain-capability / hybrid-capability / hook-only / non-capability by counting channelIds, providerIds, gatewayMethodCount, httpRouteCount.
docs/channels/pairing.md:41-49 — pairing state lives in ~/.openclaw/credentials/<channel>-pairing.json + <channel>-allowFrom.json; pairing adapter is the only per-channel custom logic surface.

claude-code-leak/ is absent from /home/familia/chat/. Mining absence declared explicitly.

Current Rust shape (crates/core/src/agent/plugin_host.rs):

L66-199 — NexoPlugin trait. Already has the auto-discovery shape: manifest(), init(&ctx), shutdown(), build_pairing_adapter(broker), register_outbound_tools(&reg), configure(&yaml), credential_store(), as_any(). Defaults let new plugins opt in.
PluginInitContext (L204-300+) — hands plugins tool_registry, advisor_registry, hook_registry, broker, llm_registry, reload_coord, sessions, long_term_memory, shutdown, channel_adapter_registry, plugin_config. Plenty of extension points already.

The trait + context is already mostly self-describing. What's missing is daemon-side dispatch — code in src/main.rs that iterates plugin_handles instead of hardcoding per-plugin blocks.

Inventory of 12 capability layers

Layer	Auto-discoverable today?	What blocks it
Config schema	✅ done (Phase 93.1-93.4)	—
Manifest discovery	✅ done	—
Subprocess lifecycle	✅ done	—
Broker RPC integration	✅ done	—
Credential store	✅ done (Phase 93.6-93.9)	—
Outbound tools	✅ partial	Phase 81.32.c7.c — daemon-side hardcoded fallbacks (`register_whatsapp_tools` etc.) coexist with trait method.
Pairing adapter	❌	Phase 81.33.b.real — trait method exists but no daemon dispatch; subprocess plugins can't supply Rust trait obj across process boundary.
HTTP routes	❌	Daemon hardcodes `/whatsapp/pair`. No trait method for plugins to declare routes.
Admin RPC commands	❌ partial	Daemon hardcodes `with_wa_bot_handle`. No generic admin-RPC registration.
Channel dashboard	✅ partial (Phase 93.10)	`ChannelDashboardSource` lives in `nexo-setup`, NOT exposed via `NexoPlugin` trait. Plugins can't auto-register a dashboard surface.
Metrics / health endpoints	❌ partial	Daemon hardcodes `/email/health`, `/metrics` whatsapp-instances JSON.
Orchestration	❌	Phase 93.5.d — daemon hardcodes whatsapp instance loop, tunnel auto-open, pairing-state map.

Seven layers need work to reach "drop plugin → daemon discovers everything".

Architectural principles (non-negotiable)

Manifest is the single source of truth. Anything the daemon needs to know about a plugin is in nexo-plugin.toml. Daemon never inspects plugin Cargo features, plugin source code, or plugin runtime state to discover capabilities.
Subprocess boundary is honoured. Rust trait objects do not cross process boundaries. Anywhere the daemon would need to call into the plugin per-message, the dispatch goes through broker JSON-RPC (with caches at hot paths).
In-tree plugins use trait dispatch; subprocess plugins use broker dispatch. NexoPlugin trait methods stay valid for in-tree plugins (Phase 81.20 candidates). For subprocess plugins, SubprocessNexoPlugin translates trait calls into broker RPCs against a generic adapter constructed from manifest data.
Per-channel custom logic stays in the plugin process. normalize_sender, auth_check, instance-discovery — every per-channel rule executes inside the subprocess, never in the daemon. Daemon stays generic.
Hardcoded canonical-plugin paths are deprecation-tracked, not deleted opportunistically. Out-of-tree plugin crates ship on their own release cadence; daemon ships fallbacks until canonical plugins opt into the generic path via their own next manifest revision.

Patterns

Two patterns repeat across all 7 remaining layers. Pin them once in the framework; reuse for each layer.

Pattern A: broker-RPC dispatch with cache

For per-event hot paths that need plugin-side logic.

Manifest declares the broker topic shape:

[plugin.pairing.adapter]
channel_id = "whatsapp"
broker_topic_prefix = "plugin.whatsapp"
# daemon will call: <broker_topic_prefix>.pairing.normalize_sender
#                   <broker_topic_prefix>.pairing.send_reply
#                   <broker_topic_prefix>.pairing.send_qr_image

Daemon-side adapter:

pub struct GenericBrokerPairingAdapter {
    channel_id: &'static str,
    broker: AnyBroker,
    topic_prefix: String,
    // Cache: raw sender → normalized form. Pairing volume is
    // low; cache grows bounded by unique senders.
    normalize_cache: Arc<RwLock<HashMap<String, Option<String>>>>,
}

normalize_sender(raw) checks cache, on miss does broker.request("<prefix>.pairing.normalize_sender", raw) with a short timeout, then caches result.
send_reply/send_qr_image are already async — direct broker RPC.

Trade-off. First-sighting of every sender pays a broker round-trip (~1-5ms local). Subsequent lookups are O(1) cache. For pairing flows, this is acceptable because handshakes are rare. For high-throughput hot paths (every inbound message), upfront broadcast-of-known-normalizations would be required — design that into the manifest as a separate batch RPC if a layer needs it.
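The check-cache-then-round-trip flow can be sketched without a broker. In this illustration `NormalizeCache` and `get_or_fetch` are hypothetical names, a closure stands in for the broker request, and std::sync::RwLock is assumed (the memo does not say which RwLock the real adapter uses).

```rust
use std::collections::HashMap;
use std::sync::RwLock;

/// Read-through cache: raw sender -> normalized form.
/// A cached `None` records that the plugin returned no normalized form.
pub struct NormalizeCache {
    entries: RwLock<HashMap<String, Option<String>>>,
}

impl NormalizeCache {
    pub fn new() -> Self {
        Self { entries: RwLock::new(HashMap::new()) }
    }

    /// Return the cached answer, or call `fetch` once and remember its result.
    /// `fetch` stands in for the broker round-trip to the plugin.
    pub fn get_or_fetch<F>(&self, raw: &str, fetch: F) -> Option<String>
    where
        F: FnOnce(&str) -> Option<String>,
    {
        // Fast path: shared read lock, released at the end of this statement.
        let cached = self.entries.read().unwrap().get(raw).cloned();
        if let Some(hit) = cached {
            return hit;
        }
        // Slow path: do the round-trip without holding any lock.
        let fetched = fetch(raw);
        self.entries
            .write()
            .unwrap()
            .insert(raw.to_string(), fetched.clone());
        fetched
    }

    /// Number of distinct senders seen so far.
    pub fn len(&self) -> usize {
        self.entries.read().unwrap().len()
    }
}
```

Two threads that miss on the same sender at the same moment may both perform the round-trip. For rare pairing handshakes that duplicate cost is likely acceptable, and it avoids holding a lock across the broker call.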

Sync trait → async broker. PairingChannelAdapter::normalize_sender is a synchronous fn. The generic adapter either uses tokio::runtime::Handle::block_on inside an inherent block-on-cache-miss helper (note that Handle::block_on panics when called from inside an async task), or the trait is migrated to async fn first (preferred if downstream callers are already in async contexts).

Pattern B: declarative interpreter

For boot-time setup that needs plugin-side logic but only fires once per startup or per config-reload.

Manifest declares the data; daemon interprets:

[plugin.orchestration]
per_instance_state = true            # daemon allocates a state-map keyed by instance
public_tunnel.enabled = true         # daemon offers an auto-tunnel knob
public_tunnel.route = "/whatsapp/pair"  # daemon mounts the tunneled prefix
inbound_state_topic = "plugin.inbound.whatsapp"  # daemon subscribes here for state events

[plugin.http]
mount_prefix = "/whatsapp"            # daemon mounts a proxy under this prefix
# requests get forwarded via broker as:
#   plugin.<id>.http.<method>.<path-encoded>

Daemon iterates plugin_handles.iter().filter_map(|h| h.manifest().http.as_ref()) and mounts proxies generically.

Trade-off. The manifest schema enumerates known orchestration shapes — adding a NEW shape (e.g. "websocket pairing" vs current "HTTP-poll-for-QR") requires extending the schema. That's a breaking change to the manifest contract, NOT to daemon code. Plugin authors get a compile-time deserialization error pointing at the missing field. Schema evolution is centralized + versioned.

Per-layer design

Layer 6 — outbound tools (already partially generic)

Today. NexoPlugin::register_outbound_tools(&registry) trait method exists with default no-op. Plugins like nexo-plugin-whatsapp override to call register_whatsapp_tools(&tools). Daemon ALSO has hardcoded fallbacks (src/main.rs:5128, 6917) gated on cfg.plugins.iter().any(|p| p == "whatsapp") — these fire IN ADDITION TO the trait method, scoped by feature gate.

To close (Phase 81.32.c7.c). Remove hardcoded fallbacks once canonical plugins ship a manifest declaring [[plugin.tools.outbound]] per tool. Daemon iterates plugin_handles[..].register_outbound_tools(&registry) only; delete the fallback if plugin == "whatsapp" blocks.

Effort. ~3-4h. Touches main.rs boot loop + hot-spawn loop. Plugin crates must publish a release with the manifest section first — coordinate via release notes.

Layer 7 — pairing adapter (Phase 81.33.b.real)

Today. NexoPlugin::build_pairing_adapter(broker) trait method exists with default None. Daemon hardcodes build_known_pairing_registry() (src/main.rs:1651-1660) that constructs whatsapp + telegram adapters by Rust type, both cfg-gated.

To close. Pattern A. New manifest section [plugin.pairing.adapter] (channel_id, broker_topic_prefix). GenericBrokerPairingAdapter in nexo-pairing reads manifest + owns cache. SubprocessNexoPlugin::build_pairing_adapter() returns Some(Arc::new(GenericBrokerPairingAdapter::from_manifest(self.manifest(), broker))) when manifest declares the section, else None.

Daemon build_known_pairing_registry becomes a loop:

for handle in &plugin_handles {
    if let Some(adapter) = handle.build_pairing_adapter(broker.clone()) {
        registry.register(adapter);
    }
}

Canonical plugins (whatsapp, telegram) ship next manifest revision adding the section + handle the broker RPCs in their subprocess. Until then daemon falls back to legacy hardcoded registrations (already cfg-gated).

Trade-off accepted. normalize_sender cache miss = one broker round-trip per unique sender. Pairing flows are low volume; cost is invisible in practice.

Effort. ~5h: manifest schema + adapter impl + subprocess plugin RPC handler stubs + integration test.

Layer 8 — HTTP routes

Today. Daemon run_health_server (src/main.rs:~15140+) hardcodes /whatsapp/* route handler using nexo_plugin_whatsapp::pairing::dispatch_route. Email and other channels with HTTP needs would each add hardcoded blocks.

To close. Pattern B. New manifest section:

[plugin.http]
mount_prefix = "/whatsapp"
# daemon forwards every request under this prefix via broker

Daemon-side proxy: a single generic handle_plugin_http_route function that matches request.path against registered prefixes, then issues a broker RPC plugin.<id>.http.request with serialized request bundle. Plugin subprocess implements its own internal router under that prefix.

Trade-off. Every HTTP request to a plugin pays a broker round-trip (~1-2ms local) + serialization. For human-facing pages (pairing QR, OAuth callbacks) this is invisible. For machine-to-machine high-throughput webhooks, consider whether the plugin should listen on its own port directly (avoid the proxy entirely) and only register a "I have a port" descriptor for the dashboard. Add a mount_kind: "proxy" | "direct" knob in the manifest section if needed.
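The prefix-matching step of the proxy can be sketched as follows. `HttpMount`, `under_prefix`, and `route_to_topic` are hypothetical names. Only the mount_prefix field and the plugin.<id>.http.request topic shape come from the text above. Longest-prefix-wins and matching on a `/` boundary are assumptions about how overlapping mounts would be resolved.

```rust
/// One `[plugin.http]` declaration: which plugin owns which URL prefix.
/// Assumes prefixes are written without a trailing slash, e.g. "/whatsapp".
pub struct HttpMount {
    pub plugin_id: String,
    pub mount_prefix: String,
}

/// True when `path` is the prefix itself or sits below it on a `/` boundary,
/// so "/whatsapp" matches "/whatsapp/pair" but not "/whatsapp-admin".
fn under_prefix(path: &str, prefix: &str) -> bool {
    match path.strip_prefix(prefix) {
        Some(rest) => rest.is_empty() || rest.starts_with('/'),
        None => false,
    }
}

/// Pick the longest matching mount and return the broker topic that the
/// serialized request would be forwarded to.
pub fn route_to_topic(mounts: &[HttpMount], path: &str) -> Option<String> {
    mounts
        .iter()
        .filter(|m| under_prefix(path, &m.mount_prefix))
        .max_by_key(|m| m.mount_prefix.len())
        .map(|m| format!("plugin.{}.http.request", m.plugin_id))
}
```

Returning None for unmatched paths leaves the daemon's own routes (health, metrics) untouched. The boundary check keeps one plugin's prefix from swallowing a sibling route that merely starts with the same characters.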

Effort. ~6h: manifest schema + daemon proxy handler + broker RPC contract + subprocess router scaffolding + integration test (round-trip a pairing GET through the proxy).

Layer 9 — admin RPC commands

Today. Setup wizard's admin RPC dispatcher (crates/setup/src/admin_bootstrap.rs:712) hardcodes .with_wa_bot_handle(Arc::new(WhatsappBotHandle)). Only whatsapp currently has plugin-specific admin commands but the pattern extrapolates poorly.

To close. Pattern A (broker-RPC) for admin command dispatch. Manifest section:

[[plugin.admin.command]]
namespace = "whatsapp"     # admin RPC method prefix
methods = ["pair_start", "pair_status", "pair_revoke", "bot_status"]

Daemon's admin dispatcher iterates registered plugin admin namespaces; on admin.<namespace>.<method> call, forwards via broker to plugin subprocess.

Removes WhatsappBotHandle typed integration entirely. Other plugins (telegram bot-info, email account-info) auto-declare their own admin namespaces.
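
A rough sketch of the generic routing step, under stated assumptions: `AdminNamespaces` and `AdminRoute` are hypothetical names, the declared methods are held in a plain map, and the actual broker forward is omitted. It only shows how an `admin.<namespace>.<method>` call could be checked against what a manifest declared.

```rust
use std::collections::HashMap;

/// Hypothetical view of the declared `[[plugin.admin.command]]` entries:
/// namespace -> allowed method names.
#[derive(Default)]
pub struct AdminNamespaces {
    declared: HashMap<String, Vec<String>>,
}

#[derive(Debug, PartialEq)]
pub enum AdminRoute<'a> {
    /// Forward to the plugin subprocess that owns `namespace`.
    Forward { namespace: &'a str, method: &'a str },
    UnknownNamespace,
    UndeclaredMethod,
    NotAPluginCall,
}

impl AdminNamespaces {
    pub fn declare(&mut self, namespace: &str, methods: &[&str]) {
        let methods = methods.iter().map(|m| m.to_string()).collect();
        self.declared.insert(namespace.to_string(), methods);
    }

    pub fn route<'a>(&self, rpc_method: &'a str) -> AdminRoute<'a> {
        // Split into at most three parts: "admin", namespace, method.
        let mut parts = rpc_method.splitn(3, '.');
        match (parts.next(), parts.next(), parts.next()) {
            (Some("admin"), Some(ns), Some(method)) if !method.is_empty() => {
                match self.declared.get(ns) {
                    None => AdminRoute::UnknownNamespace,
                    Some(methods) if methods.iter().any(|m| m == method) => {
                        AdminRoute::Forward { namespace: ns, method }
                    }
                    Some(_) => AdminRoute::UndeclaredMethod,
                }
            }
            _ => AdminRoute::NotAPluginCall,
        }
    }
}
```

Rejecting undeclared methods daemon-side is a design choice of this sketch; the memo does not say whether the daemon or the subprocess validates method names.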

Effort. ~5h: manifest schema + admin dispatcher generic routing + broker RPC contract + remove with_wa_bot_handle + integration test.

Layer 10 — channel dashboard (Phase 93.10 polish)

Today. Phase 93.10 shipped ChannelDashboardSource trait in nexo-setup with 3 hardcoded canonical impls. New canonical channel = new impl in nexo-setup = framework code change.

To close. Pattern B. Move ChannelDashboardSource data into manifest:

[plugin.dashboard]
auth_check_kind = "file_presence"     # | "session_dir_with_files" | "broker_probe"
auth_check_args = { path = "telegram_bot_token.txt" }
multi_instance_layout = "single"      # | "workspace_walk" | "broker_list"

Daemon-side generic interpreter reads the section + dispatches to the matching auth-check / instance-discovery handler. For shapes the interpreter doesn't recognise (rare), fall back to a broker RPC plugin.<id>.dashboard.discover that the subprocess implements.

Trade-off. Schema enumerates known auth-check + layout shapes. A 5th channel with a wholly new auth shape (e.g. OAuth-token-presence-with-refresh-due-check) requires extending the enumeration. This is the SAME trade-off as Pattern B elsewhere: a schema extension is preferred over a framework code change.

Move the 3 canonical sources from nexo-setup to manifest data on the canonical plugin crates (next release each).

Effort. ~4h: interpreter + manifest schema + migrate 3 canonical impls + integration test.

Layer 11 — metrics / health endpoints

Today. Daemon hardcodes /email/health, /metrics whatsapp-instances JSON output, etc.

To close. Pattern B + Pattern A combined. Manifest declares which metrics surfaces a plugin owns:

[plugin.metrics]
prometheus = true                  # daemon scrapes plugin's broker RPC
health_endpoint = "/email/health"  # exposed as proxy

/metrics aggregator on daemon already collects from registered sources. Add a generic BrokerScrapeSource that issues plugin.<id>.metrics.scrape per scrape interval, parses Prometheus text response, merges into aggregate.

Trade-off. Per-scrape broker RPC cost (~1ms × number of plugins, ≤10ms total at typical scale). Cache-with-TTL if scrape is high-frequency.

Effort. ~4h.

Layer 12 — orchestration (Phase 93.5.d)

Today. Daemon hardcodes whatsapp orchestration in src/main.rs (instance loop L3219+, tunnel auto-open L3833+, pairing-state subscriber spawn L3608+).

To close. Pattern B with the orchestration schema:

[plugin.orchestration]
per_instance_state = true
inbound_state_topic = "plugin.inbound.whatsapp"
inbound_state_events = ["connected", "disconnected", "reconnecting", "qr"]

[plugin.orchestration.public_tunnel]
offer = true
mount_route = "/whatsapp/pair"
only_until_paired = true

Daemon iterates plugin_handles[..].manifest().orchestration and runs the orchestration loop generically:

Allocates per-instance state map (opaque Value indexed by instance label).
Subscribes the broker bridge that mirrors inbound_state_events into the state map.
Auto-opens public tunnel via nexo-tunnel-quick if offer = true and config allows.

State map is opaque from daemon's POV — it just stores JSON payloads keyed by instance. Plugin subprocess writes events with its own internal schema. HTTP layer (Layer 8) proxies queries into the state map.
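
To make the "opaque" part concrete, here is a minimal sketch of such a state map. `InstanceStateMap`, `record`, and `latest` are hypothetical names, and payloads are kept as raw JSON text so the example needs only the standard library; the memo itself describes the stored value as `serde_json::Value`.

```rust
use std::collections::HashMap;
use std::sync::RwLock;

/// Hypothetical per-plugin state map keyed by instance label.
/// The daemon stores whatever the plugin published and never parses it.
#[derive(Default)]
pub struct InstanceStateMap {
    inner: RwLock<HashMap<String, String>>,
}

impl InstanceStateMap {
    /// Called by the subscriber bridge for each event seen on
    /// `inbound_state_topic`. The latest payload per instance wins.
    pub fn record(&self, instance: &str, raw_json: &str) {
        self.inner
            .write()
            .unwrap()
            .insert(instance.to_string(), raw_json.to_string());
    }

    /// Called by the HTTP layer to answer a status query.
    pub fn latest(&self, instance: &str) -> Option<String> {
        self.inner.read().unwrap().get(instance).cloned()
    }
}
```

Because nothing here inspects the payload, a schema change on the plugin side requires no daemon change, which is exactly the trade-off listed for this layer.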

Trade-off. State payloads are opaque JSON daemon-side. No typed access; daemon can't enforce schema. Plugin contract is "whatever you publish on inbound_state_topic is what callers get back from /whatsapp/<inst>/status". Plugin authors test the round-trip themselves.

This is the LARGEST single piece. Probably split:

12a — opaque state map + subscriber bridge (~5h)
12b — public tunnel auto-open generalised (~3h)
12c — remove whatsapp-specific blocks from daemon (~2h, after whatsapp ships orchestration manifest section)

Migration plan

Execution order matters because layers depend on each other:

Stage 1 — Layer 7 (pairing adapter) — closes Phase 81.33.b.real. Smallest deliverable. Validates Pattern A end-to-end with a real subprocess. ~5h. First.
Stage 2 — Layer 8 (HTTP routes) — unblocks the orchestration-tunnel work. The orchestration tunnel needs to know how plugins expose pairing pages; once HTTP-via-proxy is the contract, the tunnel just mounts the proxy prefix. ~6h.
Stage 3 — Layer 12a + 12b (orchestration core + tunnel) — closes Phase 93.5.d main mass. Depends on Layer 8. ~8h.
Stage 4 — Layer 9 (admin RPC) — orthogonal; can interleave with Stage 3. Removes with_wa_bot_handle typed path. ~5h.
Stage 5 — Layer 11 (metrics) — small, independent. Can ship anywhere. ~4h.
Stage 6 — Layer 10 (dashboard polish) — move sources from nexo-setup to manifest data. Last because plugin crates need 2 prior releases first (Pattern B precedent + Stage 1's manifest format). ~4h.
Stage 7 — Layer 6 cleanup + Layer 12c — remove all remaining hardcoded plugin-name fallbacks from daemon (register_whatsapp_tools fallbacks, whatsapp orchestration block). Only after canonical plugin crates have shipped the manifest revisions for layers 1-6. ~3h.

Total: ~35h (~7 sessions of 5h each, more realistic than the optimistic earlier estimates).

Critical dependency. Each stage that needs a new manifest section blocks on a coordinated release of the 3 canonical plugin crates (whatsapp, telegram, email). The daemon ships fallbacks until the plugin manifest revisions are out. Plan plugin releases AHEAD of removing the daemon fallback.

Trade-offs we are explicitly accepting

Layer	Trade-off
7 — pairing	Broker RPC per unique sender. Cache after first sighting. ≤5ms one-time per pairing handshake.
8 — HTTP	Broker round-trip per request. ≤2ms. Unacceptable for high-throughput webhooks — those keep direct ports.
9 — admin RPC	Broker round-trip per admin command. ≤3ms. Admin commands are human-initiated, latency invisible.
10 — dashboard	Schema enumerates auth-check + layout shapes. New shapes = schema extension, not framework code change.
11 — metrics	Broker scrape per plugin per scrape interval. Cache-with-TTL if frequency is sub-second.
12 — orchestration	State map daemon-side is opaque `serde_json::Value`. Plugin owns schema entirely.

Open questions

Trait async migration. Several trait methods are sync today (PairingChannelAdapter::normalize_sender, ChannelDashboardSource::discover). Generic broker-RPC dispatch needs async. Migrate trait to async or wrap with sync→async bridges? Lean: migrate to async, callers are already in async contexts.
Plugin manifest schema version. Each new manifest section bumps an implicit schema version. Should we add an explicit nexo_manifest_version field that the daemon checks for forward compatibility? Lean: yes, add nexo_manifest_version = 2 in this design wave, daemon refuses to load v1 plugins after transition window.
In-tree plugin migration. Email is still in-process (Phase 93.11 bucket D). Does it adopt the same manifest sections, or does in-process keep using direct trait dispatch? Lean: same manifest sections, but EmailPlugin overrides each build_pairing_adapter / mount_http / ... to return Rust impls directly. Subprocess plugins return generic adapters. Trait method is the unifying API.
Hot-reload. OpenClaw supports plugin config hot-reload (reload.configPrefixes). Rust's static linking + subprocess model makes this harder. Lean: defer — each section's reload semantics get spec'd when the section ships. For now, config reload triggers subprocess restart of affected plugins.
Plugin permission model. Once plugins can declare HTTP routes + admin commands + metrics endpoints, the daemon needs to enforce per-plugin permissions (a malicious plugin shouldn't register /admin/dangerous-thing). Lean: prefix every plugin's declared routes with /plugins/<plugin_id>/ mandatory. No plugin can mount at /admin or /health directly. Add the namespace constraint in this design wave.
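
The namespace constraint from the last question could look roughly like this. It is a sketch only: `namespaced_mount` is a hypothetical helper, and the allowed id charset and the rejection of `.` and `..` segments are assumptions, not decisions recorded in this memo.

```rust
/// Returns the path the daemon would actually mount, or an error when the
/// declared route or plugin id could escape `/plugins/<plugin_id>/`.
pub fn namespaced_mount(plugin_id: &str, declared: &str) -> Result<String, String> {
    // Assumed id charset: ASCII alphanumerics, '-' and '_'.
    let id_ok = !plugin_id.is_empty()
        && plugin_id
            .chars()
            .all(|c| c.is_ascii_alphanumeric() || c == '-' || c == '_');
    if !id_ok {
        return Err(format!("invalid plugin id: {plugin_id:?}"));
    }
    // Reject traversal segments so "/../admin" cannot climb out of the prefix.
    if declared.split('/').any(|seg| seg == ".." || seg == ".") {
        return Err(format!("path traversal in route: {declared:?}"));
    }
    let suffix = declared.trim_matches('/');
    Ok(if suffix.is_empty() {
        format!("/plugins/{plugin_id}/")
    } else {
        format!("/plugins/{plugin_id}/{suffix}")
    })
}
```

A plugin that declares `/admin/dangerous-thing` ends up mounted at `/plugins/<plugin_id>/admin/dangerous-thing`, so it can never shadow the daemon's own `/admin` or `/health` routes.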

Validation strategy

Each stage gets:

Unit tests in the affected crate for the new types + interpreters.
Integration test spinning up a real subprocess plugin declaring the new manifest section, exercising the round-trip via broker.
Build matrix preservation — every stage keeps cargo build --no-default-features clean. Slim daemon does not need any plugin manifest section to compile.
Documentation — one docs/src/plugins/<section>.md page per manifest section, covering what a plugin author needs to know.

Non-goals

Hot-reload of compiled plugin binaries. Subprocess restart is the reload story.
Wasm plugin runtime. Out of scope. If/when added, this manifest-driven contract is what Wasm modules speak.
3rd-party plugin distribution (registry, signing). Out of scope. Operator-managed paths only.
Web UI auto-generation from manifest. Phase 83 microapp consumes the manifest for its own UI but the auto-discovery contract is daemon-side only.

Next session: brainstorm + spec + plan for Stage 1

Per the project's /forge flow, the actual execution begins with /forge brainstorm 81.33.b.real → spec → plan → ejecutar. This memo is the architectural anchor that every brainstorm must reference.

Update 2026-05-15 — Stages 1+2+4+5+6 + reference plugin shipped

Five of the seven pending stages closed in a single session:

Stage 1 (pairing adapter) — PR #65.
Stage 2 (HTTP routes) — PR #66.
Stage 4 (admin RPC) — PR #67.
Stage 5 (Prometheus metrics) — PR #68.
Stage 6 (dashboard surface) — PR #69.
Reference plugin demo + tests — PR #70.

Stage 3 (orchestration tunnel) skipped after re-evaluation: the generic state-map / subscriber-bridge originally scoped became redundant once Stage 2 routed HTTP through broker, and the remaining tunnel auto-open is daemon-side polish that operators can already trigger via nexo admin --tunnel. Stage 7 (cleanup hardcoded fallbacks) deferred pending coordinated releases of the 3 canonical plugin crates adopting the new manifest sections — daemon-side legacy paths cannot be retired until plugin-side migration ships.

Reference plugin. crates/test-fixtures/reference-plugin/ exercises every manifest section in one place. Pure-function broker handlers (no I/O) so each contract is unit-testable without spinning up a real subprocess. Operators / plugin authors copy the crate as a starting template.

The user-visible auto-discovery goal is met today: any new plugin can declare the 5 manifest sections + ship broker handlers, and the daemon auto-discovers every capability with zero framework code change.

Cargo-install ergonomics (2026-05-16)

Stage 8 of auto-discovery: closing the last operator-side friction. Before today, cargo install nexo-plugin-X deposited a binary in ~/.cargo/bin/ but the daemon still required the operator to edit config/plugins/discovery.yaml and add the directory to search_paths. Out-of-the-box discovery was empty.

The fix is two-part:

PluginDiscoveryConfig::default() populates standard install paths. The defaults now expand to $HOME/.cargo/bin, $HOME/.local/share/nexo/plugins, and /usr/local/libexec/nexo/plugins. Missing dirs are tolerated (Warn diagnostic, walker continues) so a clean machine boots without errors. Operator-supplied paths append to the defaults rather than replacing them — supply an explicit empty search_paths: [] to opt out.
Binary-mode discovery branch. When auto_detect_binaries is true (default), the walker also scans each search root's immediate children for executables whose filename matches nexo-plugin-<id> (.exe accepted on Windows). Each candidate is spawned with --print-manifest (2s timeout, killed on overshoot); stdout is parsed as TOML and treated as the plugin's manifest. The discovered binary path is stamped into manifest.plugin.entrypoint.command so the subprocess factory can spawn it directly — the manifest's own ./bin/<id> placeholder is ignored.
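
The filename check in the binary-mode branch can be illustrated as below. `plugin_id_from_filename` is a hypothetical helper and the id charset is an assumption; the real walker may be stricter or looser. Spawning the candidate with `--print-manifest` and enforcing the timeout are out of scope here.

```rust
/// Extracts `<id>` from a candidate filename that follows the
/// `nexo-plugin-<id>` convention. `windows` toggles `.exe` stripping.
pub fn plugin_id_from_filename(file_name: &str, windows: bool) -> Option<&str> {
    let stem = if windows {
        file_name.strip_suffix(".exe").unwrap_or(file_name)
    } else {
        file_name
    };
    let id = stem.strip_prefix("nexo-plugin-")?;
    // Assumed id charset: ASCII alphanumerics, '-' and '_'.
    let valid = !id.is_empty()
        && id
            .chars()
            .all(|c| c.is_ascii_alphanumeric() || c == '-' || c == '_');
    valid.then_some(id)
}
```

A binary named `awesome-channel` returns `None`, which is why the naming convention acts as the discovery contract.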

The SDK gains nexo_microapp_sdk::plugin::print_manifest_if_requested. Plugin authors call it as the first statement of main(); it writes the bundled manifest to stdout and exits 0 when the flag is present, otherwise returns normally. Two lines on the plugin side, zero framework knowledge required.

Trust boundary. This opens the door to executing arbitrary binaries during daemon boot. The trust root is whoever owns the search-path directory (typically the operator's own ~/.cargo/bin). Operators in hardened environments can opt out via discovery.auto_detect_binaries: false and pin discovery back to filesystem-resident nexo-plugin.toml manifests only.

Limitations / deferred work.

No probe-result cache. Every boot re-spawns each binary. With N=5 plugins and ~20ms-per-probe this is ~100ms total — under the noise floor of LLM-bound startup, so cache deferred. If cold-boot latency becomes a constraint, key by (path, mtime, size) and persist at <state_root>/plugin-discovery-cache.json.
The nexo-plugin-<id> naming convention is the contract. Plugins that ship as awesome-channel (no prefix) will never be auto-detected. Documented in the plugin author guide.
One probe failure (timeout / non-zero exit) does not block other plugins. The failed candidate is emitted as a ManifestParseError diagnostic and the walker continues.

Fault tolerance

Every external call goes through a CircuitBreaker. Every retryable error has a bounded retry policy with jittered exponential backoff. Every event survives a NATS outage. A second process cannot race the first onto the same bus.

This page collects all of those guardrails in one place.

CircuitBreaker

Source: crates/resilience/src/lib.rs.

A three-state machine wrapped around any fallible external call. Once a dependency is failing, the breaker fails fast instead of piling up calls against a dead endpoint; periodic probes let it recover without human intervention.

stateDiagram-v2
    [*] --> Closed
    Closed --> Open: 5 consecutive failures
    Open --> HalfOpen: backoff elapsed
    HalfOpen --> Closed: 2 consecutive successes
    HalfOpen --> Open: any failure<br/>(backoff × 2, capped)

Defaults

Field	Default	Meaning
`failure_threshold`	5	consecutive failures before opening
`success_threshold`	2	consecutive successes in HalfOpen before closing
`initial_backoff`	10 s	wait time on first open
`max_backoff`	120 s	cap on exponential backoff

Where it wraps

LLM calls — one circuit per provider (MiniMax, Anthropic, OpenAI-compat, Gemini). A provider outage doesn't cascade to others.
NATS publish — one circuit over the broker. When it opens the disk queue absorbs writes.
CDP commands — one circuit per browser session. A dead Chrome doesn't freeze the agent loop.
Extension stdio — implicit via the StdioRuntime lifecycle (crashed child → respawn, bounded).

Signals

CircuitBreaker exposes the usual methods (allow(), on_success(), on_failure()) plus two explicit overrides:

trip() — force Open from outside (e.g. a health check decided the dep is down before a call fails)
reset() — force Closed (e.g. the operator just restored the dep and doesn't want to wait for the probe window)
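
For readers who want the state machine in code form, here is a compact sketch using the defaults from the table. It is not the contents of crates/resilience/src/lib.rs: the `Breaker` type is hypothetical, and passing `now` explicitly (instead of reading the clock inside) is an assumption made so the transitions are easy to test.

```rust
use std::time::{Duration, Instant};

#[derive(Debug, Clone, Copy, PartialEq)]
pub enum State {
    Closed,
    Open,
    HalfOpen,
}

// Defaults from the table above.
const FAILURE_THRESHOLD: u32 = 5;
const SUCCESS_THRESHOLD: u32 = 2;
const INITIAL_BACKOFF: Duration = Duration::from_secs(10);
const MAX_BACKOFF: Duration = Duration::from_secs(120);

pub struct Breaker {
    state: State,
    failures: u32,
    successes: u32,
    backoff: Duration,
    opened_at: Option<Instant>,
}

impl Breaker {
    pub fn new() -> Self {
        Breaker { state: State::Closed, failures: 0, successes: 0, backoff: INITIAL_BACKOFF, opened_at: None }
    }

    pub fn state(&self) -> State {
        self.state
    }

    /// May the caller attempt the call at `now`? Moves Open -> HalfOpen
    /// once the current backoff has elapsed.
    pub fn allow(&mut self, now: Instant) -> bool {
        if self.state == State::Open {
            let waited = self
                .opened_at
                .map_or(true, |t| now.saturating_duration_since(t) >= self.backoff);
            if waited {
                self.state = State::HalfOpen;
                self.successes = 0;
            }
        }
        self.state != State::Open
    }

    pub fn on_success(&mut self) {
        match self.state {
            State::Closed => self.failures = 0,
            State::HalfOpen => {
                self.successes += 1;
                if self.successes >= SUCCESS_THRESHOLD {
                    self.reset();
                }
            }
            State::Open => {}
        }
    }

    pub fn on_failure(&mut self, now: Instant) {
        match self.state {
            State::Closed => {
                self.failures += 1;
                if self.failures >= FAILURE_THRESHOLD {
                    self.open(now, INITIAL_BACKOFF);
                }
            }
            // Any failure while probing reopens with a doubled, capped backoff.
            State::HalfOpen => {
                let next = (self.backoff * 2).min(MAX_BACKOFF);
                self.open(now, next);
            }
            State::Open => {}
        }
    }

    /// Force Open from outside.
    pub fn trip(&mut self, now: Instant) {
        self.open(now, INITIAL_BACKOFF);
    }

    /// Force Closed and forget the accumulated backoff.
    pub fn reset(&mut self) {
        *self = Breaker::new();
    }

    fn open(&mut self, now: Instant, backoff: Duration) {
        self.state = State::Open;
        self.failures = 0;
        self.successes = 0;
        self.backoff = backoff;
        self.opened_at = Some(now);
    }
}
```

A caller wraps each external call as: check `allow`, run the call if permitted, then report `on_success` or `on_failure`.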

Retry policies

Retries live at a layer above the circuit breaker — they handle transient failures (429, 5xx, network blips) that don't warrant flipping the breaker. Every retry policy uses jittered exponential backoff to avoid thundering-herd reconnection storms.

Component	Max attempts	Backoff range
LLM 429 (rate limit)	5	1 s → 60 s, jittered exponential
LLM 5xx (server error)	3	1 s → 30 s, jittered exponential
NATS publish drain	3 per event	disk queue drain cycle
CDP	via circuit only	backoff = circuit's open window

These live in crates/llm/src/retry.rs (LLM) and crates/broker/src/disk_queue.rs (NATS drain).
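
As an illustration of the backoff ranges in the table, the sketch below computes one jittered delay. It assumes the "full jitter" variant (a random point between zero and the capped exponential ceiling); the text above only says "jittered exponential", so the exact formula in crates/llm/src/retry.rs may differ. `jittered_backoff` is a hypothetical name, and the random fraction is supplied by the caller so no RNG crate is needed.

```rust
use std::time::Duration;

/// Delay before retry number `attempt` (0-based):
/// unit_random * min(cap, base * 2^attempt).
/// `unit_random` is expected in [0.0, 1.0] and is clamped to that range.
pub fn jittered_backoff(base: Duration, cap: Duration, attempt: u32, unit_random: f64) -> Duration {
    // Saturating math keeps very large attempt numbers from overflowing.
    let factor = 2u32.saturating_pow(attempt);
    let ceiling = base.saturating_mul(factor).min(cap);
    ceiling.mul_f64(unit_random.clamp(0.0, 1.0))
}
```

For the LLM 429 row, `base` would be 1 s and `cap` 60 s; the ceiling grows 1 s, 2 s, 4 s, and so on until it hits the cap, and the jitter spreads concurrent clients across that window.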

Error classification

Retries only trigger on retryable errors. A 4xx other than 429 — missing key, invalid model, malformed request — fails fast. The rationale: retrying a misconfigured call wastes budget and still fails. Fail loudly, fix the config.
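
The classification rule reduces to a small match on the HTTP status. This is a sketch with hypothetical names (`RetryClass`, `classify`); it covers only non-success HTTP statuses and leaves out network-level errors, which the real classifier presumably also handles.

```rust
#[derive(Debug, PartialEq)]
pub enum RetryClass {
    /// 429: retry with the rate-limit policy.
    RateLimited,
    /// 5xx: retry with the server-error policy.
    ServerError,
    /// Any other error status: do not retry.
    FailFast,
}

/// Classifies a non-success HTTP status code.
pub fn classify(status: u16) -> RetryClass {
    match status {
        429 => RetryClass::RateLimited,
        500..=599 => RetryClass::ServerError,
        _ => RetryClass::FailFast,
    }
}

impl RetryClass {
    /// Attempt budgets taken from the retry table on this page.
    pub fn max_attempts(&self) -> u32 {
        match self {
            RetryClass::RateLimited => 5,
            RetryClass::ServerError => 3,
            RetryClass::FailFast => 1,
        }
    }
}
```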

No message drop

The broker layer guarantees at-least-once delivery for publishes that reach the runtime:

flowchart LR
    P[publisher] --> TRY{NATS healthy?}
    TRY -->|yes| NATS[(NATS)]
    TRY -->|no| DQ[(disk queue)]
    DQ --> WAIT{reconnect?}
    WAIT -->|yes| DRAIN[drain FIFO]
    DRAIN --> NATS
    DQ -->|3 failed attempts| DLQ[(dead letters)]
    DLQ --> CLI[agent dlq replay]

In the absolute worst case — NATS down forever, disk full — the disk queue starts shedding oldest events at its hard cap, but the producer never crashes and never silently drops.
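
The drain and dead-letter path in the flowchart can be modelled in memory as follows. This is a simplified sketch: the real queue is SQLite-backed, and `SpillQueue`, `spill`, and `drain_cycle` are hypothetical names. Counting one attempt per drain cycle is also an assumption; the hard-cap shedding described above is not modelled.

```rust
use std::collections::VecDeque;

const MAX_ATTEMPTS: u32 = 3;

struct Queued {
    id: u64,
    attempts: u32,
}

#[derive(Default)]
pub struct SpillQueue {
    pending: VecDeque<Queued>,
    pub dead_letters: Vec<u64>,
}

impl SpillQueue {
    /// Called when a publish cannot reach the broker.
    pub fn spill(&mut self, id: u64) {
        self.pending.push_back(Queued { id, attempts: 0 });
    }

    pub fn pending_len(&self) -> usize {
        self.pending.len()
    }

    /// One drain cycle: offer each pending event to `send` in FIFO order.
    /// Returns the ids that were delivered. Events that fail for the third
    /// time move to `dead_letters`; the rest stay queued in their order.
    pub fn drain_cycle(&mut self, mut send: impl FnMut(u64) -> bool) -> Vec<u64> {
        let mut delivered = Vec::new();
        let mut retry = VecDeque::new();
        while let Some(mut ev) = self.pending.pop_front() {
            if send(ev.id) {
                delivered.push(ev.id);
                continue;
            }
            ev.attempts += 1;
            if ev.attempts >= MAX_ATTEMPTS {
                self.dead_letters.push(ev.id);
            } else {
                retry.push_back(ev);
            }
        }
        self.pending = retry;
        delivered
    }
}
```

Because a send can succeed on the broker while the acknowledgement is lost, a drained event may be delivered more than once, which is why consumers are expected to dedupe.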

Single-instance lockfile

A second agent process pointed at the same data directory would double-subscribe every topic, delivering every message twice. To prevent that, boot acquires a lockfile and kicks out any stale or racing instance.

Source: src/main.rs::acquire_single_instance_lock.

flowchart TD
    START[agent boot] --> READ[read data/agent.lock]
    READ --> EXIST{file exists?}
    EXIST -->|no| WRITE[write our PID]
    EXIST -->|yes| PID[parse PID]
    PID --> ALIVE{/proc/PID/ exists?}
    ALIVE -->|no| WRITE
    ALIVE -->|yes| SIGTERM[send SIGTERM]
    SIGTERM --> WAIT[wait up to 5 s<br/>50 × 100 ms polls]
    WAIT --> DEAD{process gone?}
    DEAD -->|yes| WRITE
    DEAD -->|no| SIGKILL[send SIGKILL]
    SIGKILL --> WRITE
    WRITE --> LOCK[RAII handle alive]

The SingleInstanceLock RAII struct stores our own PID. On drop it removes the lockfile only if the PID recorded in the file still matches that stored PID, so a process that has since taken over the lock does not have it wiped by the original owner on its way out.
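
The drop-time ownership check can be sketched like this. It is an illustration, not the code in src/main.rs: the `acquire` signature is hypothetical and takes the PID as a parameter so the behaviour can be demonstrated in one process, and the stale-owner detection plus SIGTERM / SIGKILL takeover from the flowchart are omitted.

```rust
use std::fs;
use std::path::{Path, PathBuf};

pub struct SingleInstanceLock {
    path: PathBuf,
    pid: u32,
}

impl SingleInstanceLock {
    /// Writes `pid` into the lockfile, overwriting any previous owner.
    pub fn acquire(path: &Path, pid: u32) -> std::io::Result<Self> {
        fs::write(path, pid.to_string())?;
        Ok(Self { path: path.to_path_buf(), pid })
    }
}

impl Drop for SingleInstanceLock {
    fn drop(&mut self) {
        // Remove the file only if it still names us. If another process has
        // taken over, its PID is in the file and we leave it alone.
        let still_ours = fs::read_to_string(&self.path)
            .ok()
            .and_then(|s| s.trim().parse::<u32>().ok())
            == Some(self.pid);
        if still_ours {
            let _ = fs::remove_file(&self.path);
        }
    }
}
```

The read-then-remove pair is not atomic, so a takeover landing exactly between the two calls could still lose its lockfile; the sketch accepts that window for simplicity.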

Graceful shutdown

See Agent runtime — Graceful shutdown for the ordered teardown sequence. Key points from a fault-tolerance angle:

Dream-sweep loops and MCP sessions get explicit grace windows so in-flight work doesn't produce partial state
Plugin intake is stopped before agent runtimes — the runtimes drain anything already in their mailboxes before exiting
If the disk queue has unflushed events on SIGTERM, they survive to the next boot

Operator guardrails

Beyond the automatic mechanisms:

Skill gating — an extension declaring requires.env = ["FOO"] is skipped at discovery when FOO is unset, instead of being registered and failing on every invocation. See Extensions — manifest.
Inbound filter — events with neither text nor media (receipts, typing indicators, reactions-only) are dropped before they reach the LLM, saving cost and avoiding noisy turns.
Health endpoints — :8080/ready and :8080/live expose lifecycle state for k8s liveness / readiness probes.
Metrics — :9090/metrics (Prometheus) exposes everything from inbound event counts to circuit breaker state; see Metrics.

Transcripts (FTS + redaction)

Per-session JSONL transcripts under agents.<id>.transcripts_dir are the canonical record of every turn. Two optional layers wrap that record:

FTS5 index — a SQLite virtual table that mirrors transcript content for MATCH queries. Backs the session_logs tool's search action when present.
Redaction — a regex pre-processor that rewrites entry content before it ever reaches disk. Patterns target common credentials and home-directory paths.

Source: crates/core/src/agent/transcripts_index.rs, crates/core/src/agent/redaction.rs, crates/core/src/agent/transcripts.rs.

Configuration

config/transcripts.yaml (optional; absent → defaults below):

fts:
  enabled: true                       # default
  db_path: ./data/transcripts.db      # default

redaction:
  enabled: false                      # default — opt in
  use_builtins: true                  # only relevant if enabled
  extra_patterns:
    - { regex: "TENANT-[0-9]+", label: "tenant_id" }

JSONL is the source of truth. The FTS index is derivable; if the DB is corrupted or deleted, agent transcripts reindex (planned) can rebuild it from disk.

FTS schema

CREATE VIRTUAL TABLE transcripts_fts USING fts5(
    content,
    agent_id        UNINDEXED,
    session_id      UNINDEXED,
    timestamp_unix  UNINDEXED,
    role            UNINDEXED,
    source_plugin   UNINDEXED,
    tokenize = 'unicode61 remove_diacritics 2'
);

The DB is shared across agents; isolation is enforced at query time by WHERE agent_id = ?. User queries are escaped as a single FTS5 phrase so operators (OR, NOT, :) in the user input never reach the engine as syntax.
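
Phrase escaping for FTS5 is short enough to show in full. The sketch below follows the FTS5 string rule (wrap in double quotes, double any embedded double quote); `fts5_phrase` is a hypothetical name, and the real helper in transcripts_index.rs may do additional normalisation.

```rust
/// Turns arbitrary user input into a single FTS5 phrase so that
/// OR, NOT, ':' and similar tokens are matched as text, not parsed as syntax.
pub fn fts5_phrase(user_input: &str) -> String {
    format!("\"{}\"", user_input.replace('"', "\"\""))
}
```

The result is then bound as the MATCH parameter alongside the `agent_id` filter, never spliced into the SQL text.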

`session_logs` integration

When the index is available, the search action returns:

{
  "ok": true,
  "query": "reembolso",
  "backend": "fts5",
  "count": 3,
  "hits": [
    {
      "session_id": "…",
      "timestamp": "2026-04-25T18:00:00Z",
      "role": "user",
      "source_plugin": "wa",
      "preview": "...quería un [reembolso] del pedido..."
    }
  ]
}

If the index is None (FTS disabled or init failed), the action falls back to the legacy substring scan over JSONL. The shape is the same minus backend: "fts5".

Redaction patterns

Label	Detects	Example match
`bearer_jwt`	`Bearer eyJ…` JWT triplets	`Bearer eyJhbGc.eyJzdWI.dGVzdA`
`anthropic_key`	Anthropic API keys	`sk-ant-abcdef…`
`openai_key`	`sk-` prefix API keys (OpenAI etc.)	`sk-abc123…`
`aws_access_key`	AWS access key id	`AKIAIOSFODNN7EXAMPLE`
`hex_token_32`	Long hex strings	`5d41402abc4b2a76b9719d911017c592`
`home_path`	Linux/macOS home dirs	`/home/familia`, `/Users/alice`

Each match is replaced with [REDACTED:<label>]. Patterns run in the order above, so more specific shapes (Bearer JWT, Anthropic) win over generic catch-alls below.

A 40-char base64 pattern targeting AWS secret keys was deliberately omitted — it produces too many false positives on legitimate hashes and opaque ids. Operators who need it can add it scoped via extra_patterns.

Custom patterns

redaction:
  enabled: true
  extra_patterns:
    - { regex: "TENANT-[0-9]+",   label: "tenant_id" }
    - { regex: "internal\\.acme", label: "internal_host" }

Custom patterns run after built-ins. Invalid regex aborts boot with a message naming the offending index and label.

What redaction does not do

It does not maintain a reverse map. Once content is redacted on disk the original is gone — by design. A reversible mapping would recreate the leak surface this feature is meant to close.
It does not rewrite previously-written JSONL files. New entries redact going forward; historical content stays as-is.
It does not redact tracing logs — that's a separate concern.
The FTS index stores the redacted text, so search results never surface the original secrets either.

Operational notes

The FTS index uses WAL journaling and a capped pool size of 4, the same idiom as the long-term memory DB.
Insert is best-effort. If an FTS write fails (disk full, lock contention) the tool logs at warn and the JSONL append still succeeds. The source of truth is never compromised.
Boot logs include transcripts FTS index ready (or the warn that it fell back) and transcripts redaction active when the redactor has any rule loaded.

nexo-rs vs OpenClaw

OpenClaw is the closest reference point in the multi-channel-agent-gateway space. nexo-rs mined OpenClaw's plugin SDK, channel boundaries, and skills layout for ideas, then rebuilt the runtime in Rust with stricter operational guarantees. This page lays out the differences honestly — including where OpenClaw still has the edge.

Substrate

Dimension	OpenClaw	nexo-rs
Language	TypeScript	Rust
Runtime	Node 22+	none — single statically-linked binary
Install footprint	`pnpm install` over ~42 runtime deps + 24 dev deps	one binary, ~90 MB built; ~15 MB to download (xz tarball), ~18 MB `.deb`, ~25 MB `.rpm`
Cold-start	`node` boot + module resolution	direct `exec` — sub-100ms to `agent serve`
Mobile target	feasible with Termux + Node	first-class on Termux, no root, no Docker
Memory safety	runtime errors	Rust ownership: data races, use-after-free, null deref refused at compile

The single-binary shape is the reason nexo-rs runs comfortably on a phone (Termux) and on a fresh VPS without a Node ecosystem underneath. cargo build --release and ship target/release/agent — that is the whole deliverable.

Process & messaging

Dimension	OpenClaw	nexo-rs
Process model	single Node process	multi-process via NATS, in-process `LocalBroker` fallback when NATS is offline
Subject namespace	n/a (in-process buses)	`plugin.inbound.<plugin>[.instance]` / `plugin.outbound.…` / `agent.route.<id>` / `taskflow.resume`
Fault tolerance	best-effort	`NatsBroker` wraps every publish in a `CircuitBreaker`; failures spill to a SQLite-backed disk queue and drain on reconnect
At-least-once delivery	n/a	drain path documented as at-least-once; consumers dedupe by `event.id`
DLQ	n/a	failed events land in `dead_letters` after 3 attempts; `agent dlq list/replay/purge` from the CLI
Subscription survival	restart	NATS subscriptions auto-resubscribe on reconnect with backoff (250 ms → 10 s)

Hot reload

Dimension	OpenClaw	nexo-rs
Config change	restart	`agent reload` (or file-watcher trigger) swaps a `RuntimeSnapshot` via `ArcSwap` — in-flight turns finish on the old snapshot, the next event picks up the new one
Watched files	—	`agents.yaml`, `agents.d/*.yaml`, `llm.yaml` (extra paths via `runtime.yaml`)
Per-agent reload channel	—	mpsc to each `AgentRuntime`, the coordinator drains acks to confirm

Per-agent capability sandbox

OpenClaw's plugin allowlist is global to the gateway. nexo-rs pushes the allowlist down to the agent and the binding (the inbound channel surface):

agents:
  - id: kate
    plugins: [whatsapp, telegram, browser, taskflow]
    allowed_tools: ["whatsapp_*", "browser_navigate", "memory_*"]
    outbound_allowlist:
      whatsapp: ["+57…"]
      telegram: [123456789]
    skill_overrides:
      ffmpeg-tools: warn
    accept_delegates_from: ["ana"]
    inbound_bindings:
      - plugin: whatsapp
        instance: kate_wa
        # per-binding overrides for the same agent
        allowed_tools: ["whatsapp_*"]
        outbound_allowlist:
          whatsapp: ["+57…"]

What that buys:

An LLM running under kate cannot send messages to a number not in outbound_allowlist, even if a prompt injection asks it to.
Two channels exposed to the same agent (sales WA, private TG) carry different capability surfaces — the sales binding doesn't get the private one's tool set.
Skill modes (strict / warn / disable) are decided per agent, with explicit requires.bin_versions semver constraints (probed at boot, process-cached).
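
A minimal sketch of how such an allowlist could be evaluated. Two things are assumptions here and not confirmed by this page: that patterns support only a trailing `*` wildcard, and that a per-binding list replaces the agent-level list rather than intersecting with it. `tool_allowed` and `effective_allowlist` are hypothetical names.

```rust
/// True when `tool` matches any pattern. A pattern ending in '*' matches
/// by prefix; anything else must match exactly.
pub fn tool_allowed(patterns: &[&str], tool: &str) -> bool {
    patterns.iter().any(|p| match p.strip_suffix('*') {
        Some(prefix) => tool.starts_with(prefix),
        None => *p == tool,
    })
}

/// Assumed override rule: a binding-level list, when present, replaces
/// the agent-level list for events arriving on that binding.
pub fn effective_allowlist<'a>(
    agent: &'a [&'a str],
    binding: Option<&'a [&'a str]>,
) -> &'a [&'a str] {
    binding.unwrap_or(agent)
}
```

With the `kate` config above, a turn arriving on the `kate_wa` binding would see only `whatsapp_*`, so a `memory_*` tool call is refused there even though the agent-level list allows it.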

Secrets

Dimension	OpenClaw	nexo-rs
Credential resolution	env vars	`agents.<id>.credentials` block per channel; resolver maps to per-channel stores (gauntlet validates at boot)
1Password	n/a	`op` CLI extension + `inject_template` tool: render `{{ op://Vault/Item/field }}` and pipe to allowlisted commands without exposing the secret
Audit log	n/a	append-only JSONL at `OP_AUDIT_LOG_PATH`: every `read_secret` and `inject_template` records `agent_id`, `session_id`, fingerprint, `reveal_allowed` — never the value
Capability inventory	n/a	`agent doctor capabilities [--json]` enumerates every write/reveal env toggle (`OP_ALLOW_REVEAL`, `CLOUDFLARE_`, `DOCKER_API_`, `PROXMOX_`, `SSH_EXEC_`) with state + risk

Transcripts

OpenClaw stores transcripts as JSONL and greps them. nexo-rs keeps the JSONL (source of truth) and adds:

SQLite FTS5 index (data/transcripts.db) — write-through from TranscriptWriter::append_entry. The session_logs search agent tool uses MATCH queries with phrase-escaped user input so operator strings can't inject FTS operators.
Pre-persistence redactor (opt-in) — regex pass over content before write. 6 built-in patterns (Bearer JWT, sk-…, sk-ant-…, AWS access keys, long hex tokens, home paths) plus operator-defined extra_patterns. JSONL and FTS receive the same redacted text.
Atomic header writes — OpenOptions::create_new(true) so 16 concurrent first-appends to the same session result in exactly one header line.
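
The atomic-header trick relies on `create_new`, which fails with `AlreadyExists` if the file is already there, so exactly one caller wins the creation. A minimal sketch, with `write_header_once` as a hypothetical stand-in for the relevant part of the transcript writer:

```rust
use std::fs::OpenOptions;
use std::io::{self, Write};
use std::path::Path;

/// Returns Ok(true) if this caller created the file and wrote the header,
/// Ok(false) if another writer created it first.
pub fn write_header_once(path: &Path, header_line: &str) -> io::Result<bool> {
    match OpenOptions::new().write(true).create_new(true).open(path) {
        Ok(mut file) => {
            writeln!(file, "{header_line}")?;
            Ok(true)
        }
        // Someone else won the race; they own the header.
        Err(e) if e.kind() == io::ErrorKind::AlreadyExists => Ok(false),
        Err(e) => Err(e),
    }
}
```

Losers of the race then open the file in append mode for their entry, which this sketch leaves out.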

Durable workflows

OpenClaw doesn't ship a durable-flow primitive. nexo-rs has TaskFlow:

taskflow LLM tool with actions start | status | advance | wait | finish | fail | cancel | list_mine.
Three wait conditions: Timer { at }, ExternalEvent { topic, correlation_id }, Manual.
Single global WaitEngine ticks every 5 s (configurable), resumes flows whose deadlines have passed.
taskflow.resume NATS subject lets external services wake external_event flows: publish {flow_id, topic, correlation_id, payload} and the bridge calls try_resume_external.
agent flow list/show/cancel/resume from the CLI.
Guardrails: timer_max_horizon (default 30 days) blocks unbounded waits; non-empty topic + correlation_id required for external_event.
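
The wait conditions and their guardrails can be sketched as a small validation function. The variant names follow the list above; `validate_wait`, the error strings, and the use of `SystemTime` for `at` are assumptions, and the 30-day horizon is the documented default rather than a value read from config.

```rust
use std::time::{Duration, SystemTime};

pub enum WaitCondition {
    Timer { at: SystemTime },
    ExternalEvent { topic: String, correlation_id: String },
    Manual,
}

/// Documented default for `timer_max_horizon`: 30 days.
pub const TIMER_MAX_HORIZON: Duration = Duration::from_secs(30 * 24 * 60 * 60);

pub fn validate_wait(cond: &WaitCondition, now: SystemTime) -> Result<(), &'static str> {
    match cond {
        WaitCondition::Timer { at } => match at.duration_since(now) {
            Ok(delta) if delta > TIMER_MAX_HORIZON => Err("timer exceeds timer_max_horizon"),
            // Within the horizon, or already in the past (resumes on the next tick).
            _ => Ok(()),
        },
        WaitCondition::ExternalEvent { topic, correlation_id } => {
            if topic.is_empty() || correlation_id.is_empty() {
                Err("external_event needs a non-empty topic and correlation_id")
            } else {
                Ok(())
            }
        }
        WaitCondition::Manual => Ok(()),
    }
}
```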

LLM auth

Dimension	OpenClaw	nexo-rs
Anthropic	API key	API key and `claude_subscription` OAuth PKCE flow — uses the operator's Claude Code subscription quota instead of API billing
MiniMax	API key	API key and Token Plan / Coding Plan OAuth bundle (`api_flavor: anthropic_messages`)
OpenAI-compat	API key	API key + DeepSeek wired out of the box (OpenAI-compat reuse)
Gemini	not in core	first-class client

MCP

OpenClaw supports MCP as a client. nexo-rs is both:

Client — stdio and HTTP transports, full tool / resource / prompt catalog, tools/list_changed hot-reload.
Server — agent mcp-server exposes the agent's own tools (filtered by allowlist) over stdio for Claude Desktop / Cursor / any MCP-aware host. Proxy tools (ext_*, mcp_*) are unconditionally hidden so the agent doesn't become an open relay.

Build size

target/release/nexo              ~90 MB   (built binary)
nexo-rs-<target>.tar.xz          ~12-16 MB  (release download, xz -9)
nexo-rs_<ver>_<arch>.deb         ~14-18 MB
nexo-rs-<ver>-1.<arch>.rpm       ~20-25 MB

The binary has grown from ~34 MB at v0.1.0 as the feature surface expanded (whisper STT, sqlite-vec, embedded config templates, CDP, the driver subsystem, …). What you actually fetch is the compressed artifact — ~15 MB for the musl tarball. For comparison, an OpenClaw install (Node + node_modules after pnpm install) sits in the hundreds of megabytes — most of it needed at runtime, not just build-time.

Where OpenClaw is still ahead

Honest list:

Installer & onboarding flow — OpenClaw's openclaw doctor family and the bundled installer give a smoother first-run UX than nexo-rs's agent setup wizard, especially for non-Rust developers.
TS familiarity — the JS / TS audience for plugin authors is larger than the Rust audience; if your team writes mostly TypeScript, contributing back to OpenClaw is faster.
Track record — OpenClaw has a longer release history, more maintainers, and more shipped extensions in the wild.
Apps surface — OpenClaw ships iOS / Android / macOS companion apps; nexo-rs only ships the daemon and the loopback web admin (admin-ui Phase A0–A11 still in progress).

Summary

If you want operational guarantees (single binary, fault-tolerant broker, per-agent sandbox, durable workflows, secrets audit) and you're OK with Rust, nexo-rs.

If you want fast onboarding, a TS plugin ecosystem, and the OpenClaw apps, OpenClaw.

The two projects share enough vocabulary that moving an extension between them is mostly a port, not a rewrite. The plugin SDK shape (stdio-spoken JSON-RPC + a plugin.toml manifest) is deliberately compatible.

Driver subsystem (Phase 67)

The driver subsystem turns the nexo-rs agent runtime into the "human in the loop" for another agent — typically the Claude Code CLI. It runs a goal-bound experiment: spawn the external CLI, watch its tool-use stream, decide allow/deny on every action, feed back acceptance failures, and stop only when the CLI claims "done" AND objective verification passes.

This page describes the architectural shape; concrete impl details live with each sub-phase.

Why

Claude Code (or any other local CLI agent) is excellent at writing code, but it sometimes:

over-claims completion — says "done" when tests are red;
proposes destructive shell commands when stuck;
forgets which approaches it already tried and failed.

A second agent — driven by nexo-rs, backed by a different LLM (MiniMax M2.5), with persistent memory — closes those gaps.

Architecture

nexo-rs daemon
│
├─ "claude-driver" agent
│   ├─ LLM: MiniMax M2.5
│   ├─ memory: short_term + long_term + vector + transcripts
│   └─ skills: claude_cli, git_checkpoint, test_runner,
│              acceptance_eval, escalate
│
└─ MCP server (in-process)
    └─ tool: permission_prompt(tool_name, input) → {allow|deny, message}

claude  (subprocess, one per turn)
└─ claude --resume <id>
          --output-format stream-json
          --permission-prompt-tool mcp__nexo-driver__permission_prompt
          --add-dir <worktree>
          --allowedTools "Read,Grep,Glob,LS,WebFetch"
          -p "<turn prompt>"

Termination model

Claude says "done" — driver does NOT trust it. Driver runs the goal's acceptance criteria (cargo build, cargo test, cargo clippy, PHASES marker, custom verifiers). Only when all pass is the goal declared Done. Otherwise the failures are folded into the next turn's prompt: "you said done, but here's what still fails — fix it".

The driver also stops on budget exhaustion: max turns, wall-time, tokens, or consecutive denies. On exhaustion the driver escalates to the operator (WhatsApp / Telegram via existing channel plugins) with a state dump.
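
The termination rule above can be sketched as a pure function. This is an illustration under stated assumptions, not the driver's code: `Budget`, `TurnVerdict`, and `decide` are hypothetical names, only two of the four budget axes are modelled, and the real `AcceptanceCriterion` / `BudgetGuards` types in nexo-driver-types may be shaped differently.

```rust
/// Hypothetical budget counters; the real BudgetGuards type may differ.
#[derive(Debug, Clone, Copy)]
pub struct Budget {
    pub turns_used: u32,
    pub max_turns: u32,
    pub consecutive_denies: u32,
    pub max_consecutive_denies: u32,
}

/// Hypothetical per-turn verdict used only by this sketch.
#[derive(Debug, PartialEq, Eq)]
pub enum TurnVerdict {
    Done,
    Retry { failures: Vec<String> },
    Escalate { axis: &'static str },
}

/// `claimed_done` is what the CLI reported; `failures` names the acceptance
/// checks that did not pass (empty means every check passed).
pub fn decide(claimed_done: bool, failures: Vec<String>, budget: Budget) -> TurnVerdict {
    // A "done" claim only counts when objective verification also passes.
    if claimed_done && failures.is_empty() {
        return TurnVerdict::Done;
    }
    // Budget exhaustion stops the loop and hands control to the operator.
    if budget.turns_used >= budget.max_turns {
        return TurnVerdict::Escalate { axis: "turns" };
    }
    if budget.consecutive_denies >= budget.max_consecutive_denies {
        return TurnVerdict::Escalate { axis: "consecutive_denies" };
    }
    // Otherwise the failures are folded into the next turn's prompt.
    TurnVerdict::Retry { failures }
}
```

The point of the shape is that a "done" claim and a green acceptance run are separate inputs, so an over-claiming turn lands in `Retry` rather than `Done`.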

Foundational types — `nexo-driver-types`

The contract — AgentHarness trait + Goal / Attempt / Decision / AcceptanceCriterion / BudgetGuards types — lives in the leaf crate nexo-driver-types. Every value is serde-serializable so the contract can travel through NATS, get re-imported by extensions, and power admin-ui dashboards without dragging in the daemon.

How a turn flows (Phase 67.1)

use std::time::Duration;
use nexo_driver_claude::{ClaudeCommand, spawn_turn};
use nexo_driver_types::CancellationToken;

async fn doc(session_id: String) -> anyhow::Result<()> {
    // Build the per-turn command: resume the bound session, restrict the
    // static allowlist, and route every other tool call to the MCP prompt.
    let cmd = ClaudeCommand::discover("Implement Phase 26.z")?
        .resume(session_id)
        .allowed_tools(["Read", "Grep", "Glob", "LS"])
        .permission_prompt_tool("mcp__nexo-driver__permission_prompt")
        .cwd("/tmp/claude-runs/26-z");

    let cancel = CancellationToken::new();
    let mut turn = spawn_turn(cmd, &cancel, Duration::from_secs(600), Duration::from_secs(1)).await?;

    while let Some(ev) = turn.next_event().await? {
        // dispatch on ev (Assistant tool_use → permission_prompt; Result → done check)
        let _ = ev;
    }
    // Always shut down explicitly; Drop is only the panic safety net.
    let _exit = turn.shutdown().await?;
    Ok(())
}

next_event cooperatively races three signals via tokio::select!: the cancel token, the per-turn deadline, and the JSONL stream. Errors land as Cancelled, Timeout, ParseLine, etc. Cleanup is always shutdown() — ChildHandle::Drop is the panic safety net.

Persistence (Phase 67.2)

SqliteBindingStore keeps (goal_id → claude session_id) plus timestamps in a single claude_session_bindings table. Two filters are applied on get:

idle TTL — last_active_at must be within idle_ttl of now;
max age — created_at + max_age must be in the future.

Either filter can be None (no filter) or Duration::ZERO (an alias for None).
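
The two filters can be sketched as a single predicate. This is a minimal std-only illustration, assuming `Duration::ZERO` means "no filter" and that timestamps are plain epoch seconds; `binding_is_live` and its parameters are hypothetical names, not the `SqliteBindingStore` API.

```rust
use std::time::Duration;

/// Returns true when a binding row passes both filters.
/// Timestamps are seconds since the epoch to keep the sketch std-only.
pub fn binding_is_live(
    now: u64,
    created_at: u64,
    last_active_at: u64,
    idle_ttl: Option<Duration>,
    max_age: Option<Duration>,
) -> bool {
    // Assumption: Duration::ZERO behaves like None (no filter).
    let active = |d: Option<Duration>| d.filter(|d| !d.is_zero());

    if let Some(ttl) = active(idle_ttl) {
        // idle TTL: last_active_at must be within idle_ttl of now.
        if now.saturating_sub(last_active_at) > ttl.as_secs() {
            return false;
        }
    }
    if let Some(age) = active(max_age) {
        // max age: created_at + max_age must still be in the future.
        if created_at.saturating_add(age.as_secs()) <= now {
            return false;
        }
    }
    true
}
```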

Three soft-delete-friendly operations live alongside clear:

mark_invalid(goal_id) flips last_session_invalid = 1 instead of deleting the row. Phase 67.8 (replay-policy) calls this when Claude rejects a session id mid-turn; the row stays for forensics.
touch(goal_id) bumps last_active_at only. Driver loop calls it per observed event so the idle filter doesn't need a structural upsert per turn.
purge_older_than(cutoff) reaps rows the operator no longer cares about. Phase 67.6 (worktree janitor) calls it nightly.

Schema migrations: PRAGMA user_version = 1 is the sentinel; every open() runs CREATE TABLE/INDEX IF NOT EXISTS. Future v2 will extend that helper.

Permission flow (Phase 67.3)

Every Claude tool call that isn't on the static allowlist (Read,Grep,Glob,LS,WebFetch) goes through the MCP server before execution:

Claude Code ─── tools/call mcp__nexo-driver__permission_prompt ───▶
                                                                    │
                                                          stdio JSON-RPC
                                                                    │
                                                                    ▼
                                              nexo-driver-permission-mcp (child)
                                                                    │
                                                            calls PermissionDecider
                                                                    │
                                                                    ▼
                                                     {behavior: allow|deny, ...}

PermissionMcpServer exposes one tool, permission_prompt. The in-process AllowSession cache keyed on (tool_name, hash(input)) short-circuits repeat calls (a Claude turn that re-reads the same file pays the decider once).
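
A cache keyed on `(tool_name, hash(input))` might look like the sketch below. Everything here is an assumption for illustration: `SessionCache`, `Behavior`, the use of `DefaultHasher`, and the choice to cache only allow verdicts are not taken from the nexo-rs source.

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::HashSet;
use std::hash::{Hash, Hasher};

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Behavior {
    Allow,
    Deny,
}

/// Hypothetical stand-in for the in-process AllowSession cache.
#[derive(Default)]
pub struct SessionCache {
    allowed: HashSet<(String, u64)>,
    /// Counts how often the (expensive) decider was actually consulted.
    pub decider_calls: usize,
}

fn hash_input(input: &str) -> u64 {
    let mut h = DefaultHasher::new();
    input.hash(&mut h);
    h.finish()
}

impl SessionCache {
    /// Returns the cached allow when present, otherwise asks `decider`.
    pub fn decide<F>(&mut self, tool_name: &str, input: &str, decider: F) -> Behavior
    where
        F: FnOnce(&str, &str) -> Behavior,
    {
        let key = (tool_name.to_string(), hash_input(input));
        if self.allowed.contains(&key) {
            return Behavior::Allow; // repeat call: the decider is skipped
        }
        self.decider_calls += 1;
        let verdict = decider(tool_name, input);
        // Assumption: only allow verdicts are remembered for the session.
        if verdict == Behavior::Allow {
            self.allowed.insert(key);
        }
        verdict
    }
}
```

Hashing the input rather than storing it keeps the key small when a tool call carries a large payload; the trade-off is that a hash collision would reuse a verdict, so a real implementation may prefer a cryptographic hash.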

Outcomes Claude receives are always one of two shapes:

{ "behavior": "allow" }                   // optional updatedInput
{ "behavior": "deny", "message": "..." }

Internally the driver tracks five outcomes — AllowOnce, AllowSession{scope}, Deny, Unavailable, Cancelled — collapsing the last three to deny on the wire. Unavailable (timeout) is fail-closed by design.
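
The five-to-two collapse can be sketched as one `match`. The variant names come from the paragraph above; the payload fields, the `Wire` type, and the message strings are assumptions made for this illustration.

```rust
/// Internal outcomes as named in the text; field shapes are assumed.
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum Outcome {
    AllowOnce,
    AllowSession { scope: String },
    Deny { message: String },
    Unavailable,
    Cancelled,
}

/// Hypothetical wire shape: the two behaviours Claude can receive.
#[derive(Debug, PartialEq, Eq)]
pub enum Wire {
    Allow,
    Deny { message: String },
}

pub fn to_wire(outcome: Outcome) -> Wire {
    match outcome {
        Outcome::AllowOnce | Outcome::AllowSession { .. } => Wire::Allow,
        Outcome::Deny { message } => Wire::Deny { message },
        // Fail-closed: a decider timeout is reported as a deny.
        Outcome::Unavailable => Wire::Deny { message: "decider unavailable".to_string() },
        Outcome::Cancelled => Wire::Deny { message: "cancelled".to_string() },
    }
}
```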

Phase 67.3 ships the bin in placeholder modes (--allow-all for dev, --deny-all <reason> for shadow). Phase 67.4 will swap those flags for --socket <path> so the bin asks the daemon's LlmDecider (MiniMax + memory) for each decision.

Goal lifecycle (Phase 67.4)

nexo-driver run goal.yaml
        │
        ▼
DriverOrchestrator::run_goal
        │
        ├─ workspace_manager.ensure(&goal)        ─┐
        │                                          │
        ├─ write_mcp_config(workspace,             ├─ side-effects in
        │     bin_path, socket_path)               │   <workspace>/
        │                                          │
        ├─ DriverSocketServer (already running) ──┘
        │     spawned by builder, owned via JoinHandle
        │
        └─ for each turn:
             ├─ budget.is_exhausted? → BudgetExhausted{axis}
             ├─ AttemptStarted event
             ├─ run_attempt(ctx, params)
             │     spawn `claude --resume <id> ... --mcp-config ...`
             │     event-loop on stream-json
             │     binding_store.upsert(session_id)
             │     acceptance.evaluate(criteria, workspace)
             │     return AttemptResult { outcome }
             ├─ AttemptCompleted event
             └─ match outcome:
                Done            → break, GoalCompleted{Done}
                NeedsRetry{f}   → next turn with prior_failures
                Continue{...}   → next turn (e.g. session-invalid retry)
                Cancelled       → break
                BudgetExhausted → break
                Escalate{r}     → emit Escalate event, break

AttemptOutcome::Continue covers two cases the loop treats the same: the stream ended without Result::Success (Claude crashed early), and a "session not found" reply that triggered binding_store.mark_invalid so the next turn starts fresh.

NATS subjects emitted (when feature = "nats" and emit_nats_events: true):

agent.driver.goal.{started,completed}
agent.driver.attempt.{started,completed}
agent.driver.decision (Phase 67.7 will populate when LlmDecider records its rationale)
agent.driver.acceptance
agent.driver.budget.exhausted
agent.driver.escalate
agent.driver.replay (Phase 67.8 — replay-policy verdict)
agent.driver.compact (Phase 67.9 — compact-policy scheduled a /compact <focus> turn)

Compact policy (Phase 67.9)

Long agentic runs let Claude's context grow without bound. The orchestrator runs a CompactPolicy after every successful work turn: when running tokens cross threshold * context_window, the next iteration is rewritten as a /compact <focus> slash command turn so Claude Code shrinks its own context before the next work turn. Compact turns absorb token usage but do not bump the goal's turn counter, so they don't burn the budget. min_turns_between_compacts prevents back-to-back compacts. Set context_window: 0 (or enabled: false) in compact_policy: to disable.
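
The trigger condition can be sketched as follows. The field names mirror the config keys mentioned above, but the struct layout and the `should_compact` method are assumptions for illustration, not the real `CompactPolicy` API.

```rust
/// Hypothetical mirror of the compact_policy config block.
#[derive(Debug, Clone, Copy)]
pub struct CompactPolicy {
    pub enabled: bool,
    pub context_window: u64,
    /// Fraction of the context window, e.g. 0.7.
    pub threshold: f64,
    pub min_turns_between_compacts: u32,
}

impl CompactPolicy {
    /// Decides whether the next iteration should become a /compact turn.
    pub fn should_compact(&self, running_tokens: u64, turns_since_last_compact: u32) -> bool {
        // context_window: 0 or enabled: false disables the policy.
        if !self.enabled || self.context_window == 0 {
            return false;
        }
        // Prevent back-to-back compacts.
        if turns_since_last_compact < self.min_turns_between_compacts {
            return false;
        }
        (running_tokens as f64) >= self.threshold * (self.context_window as f64)
    }
}
```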

Sub-phases

Phase	What	Status
67.0	`AgentHarness` trait + types	✅
67.1	`claude_cli` skill (spawn + stream-json + resume)	✅
67.2	Session-binding store (SQLite)	✅
67.3	MCP `permission_prompt` in-process	✅
67.4	Driver agent loop + budget guards	✅
67.5	Acceptance evaluator	✅
67.6	Git worktree sandboxing + per-turn checkpoint	✅
67.7	Semantic memory of decisions	✅
67.8	Replay-policy (resume after mid-turn crash)	✅
67.9	Opportunistic compact	✅
67.10	Escalation to WhatsApp/Telegram	⬜
67.11	Shadow mode (calibration)	⬜
67.12	Parallel multi-goal	⬜
67.13	Cost dashboard + admin-ui A4 tile	⬜

Project tracker + multi-agent dispatch (Phase 67.A–H)

The project-tracker subsystem lets a nexo-rs agent answer "which phase is development at?" through Telegram / WhatsApp / a shell, and lets it dispatch async programmer agents that ship phases on its behalf.

The implementation is layered:

Layer	Crate	Responsibility
Project files	`nexo-project-tracker`	Parse `PHASES.md` + `FOLLOWUPS.md`, watch for changes, expose read tools.
Multi-agent state	`nexo-agent-registry`	DashMap + SQLite store of every in-flight goal, cap + queue + reattach.
Goal control	`nexo-driver-loop`	`spawn_goal` / `pause_goal` / `resume_goal` / `cancel_goal` per-goal.
Tool surface	`nexo-dispatch-tools`	`program_phase`, `dispatch_followup`, hook system, agent control + query, admin.
Capability gate	`nexo-config` + `nexo-core`	`DispatchPolicy` per agent / binding, `ToolRegistry` filter.

Project tracker (Phase 67.A)

FsProjectTracker reads <root>/PHASES.md (required) and <root>/FOLLOWUPS.md (optional) at startup, caches parsed state behind a parking-lot RwLock with a 60 s TTL, and starts a notify watcher on the parent directory that invalidates the cache on Modify | Create | Remove events.

Read tools register through nexo_dispatch_tools::READ_TOOL_NAMES (project_status, project_phases_list, followup_detail, git_log_for_phase).

Set ${NEXO_PROJECT_ROOT} to point at a workspace other than the daemon's cwd.

Multi-agent registry (Phase 67.B)

AgentRegistry is the single source of truth for every goal the driver has admitted. Each entry holds an ArcSwap<AgentSnapshot> (turn N/M, last acceptance, last decision summary, diff_stat) so list_agents / agent_status readers never block writers.

admit(handle, enqueue) enforces the global cap. Beyond the cap, enqueue=true parks the goal as Queued; enqueue=false rejects.
release(goal_id, terminal) returns the next-up queued goal so the orchestrator can promote it via promote_queued once the worktree / binding is ready.
apply_attempt(AttemptResult) refreshes the live snapshot. Idempotent against out-of-order replay (lower turn_index ignored).
Reattach (Phase 67.B.4) walks the SQLite store at boot and rehydrates Running rows. With resume_running=false they flip to LostOnRestart and surface to the operator.
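
The cap-and-queue behaviour of `admit` / `release` can be sketched with plain collections. `Slots` and `Admit` are hypothetical names; the real `AgentRegistry` takes handles and a terminal state, persists to SQLite, and uses concurrent maps, none of which is modelled here.

```rust
use std::collections::VecDeque;

#[derive(Debug, PartialEq, Eq)]
pub enum Admit {
    Running,
    Queued,
    Rejected,
}

/// Hypothetical single-threaded model of the global cap plus queue.
pub struct Slots {
    cap: usize,
    running: Vec<String>,
    queued: VecDeque<String>,
}

impl Slots {
    pub fn new(cap: usize) -> Self {
        Self { cap, running: Vec::new(), queued: VecDeque::new() }
    }

    /// Under the cap the goal runs; beyond it, `enqueue` decides
    /// between parking the goal and rejecting it.
    pub fn admit(&mut self, goal_id: &str, enqueue: bool) -> Admit {
        if self.running.len() < self.cap {
            self.running.push(goal_id.to_string());
            Admit::Running
        } else if enqueue {
            self.queued.push_back(goal_id.to_string());
            Admit::Queued
        } else {
            Admit::Rejected
        }
    }

    /// Frees the goal's slot and returns the next-up queued goal, if any.
    /// The caller promotes it once the worktree / binding is ready.
    pub fn release(&mut self, goal_id: &str) -> Option<String> {
        let before = self.running.len();
        self.running.retain(|g| g.as_str() != goal_id);
        if self.running.len() < before {
            self.queued.pop_front()
        } else {
            None // nothing was freed, so nothing may be promoted
        }
    }

    pub fn promote_queued(&mut self, goal_id: String) {
        self.running.push(goal_id);
    }
}
```

Returning the next-up goal from `release` instead of starting it directly leaves the promotion timing with the orchestrator, which matches the split described in the list above.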

LogBuffer keeps a per-goal ring of recent driver events for the agent_logs_tail tool — bounded so a chatty goal cannot OOM the process.
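
A bounded per-goal ring can be sketched with a `VecDeque` per goal. `RingLog` is a hypothetical stand-in; the real `LogBuffer` stores driver events rather than strings and must be safe for concurrent access.

```rust
use std::collections::{HashMap, VecDeque};

/// Hypothetical per-goal ring of recent log lines.
pub struct RingLog {
    capacity: usize,
    per_goal: HashMap<String, VecDeque<String>>,
}

impl RingLog {
    pub fn new(capacity: usize) -> Self {
        Self { capacity, per_goal: HashMap::new() }
    }

    /// Appends a line, evicting the oldest one once the ring is full.
    pub fn push(&mut self, goal_id: &str, line: &str) {
        if self.capacity == 0 {
            return;
        }
        let buf = self.per_goal.entry(goal_id.to_string()).or_default();
        if buf.len() == self.capacity {
            buf.pop_front();
        }
        buf.push_back(line.to_string());
    }

    /// Returns up to `n` most recent lines, oldest first.
    pub fn tail(&self, goal_id: &str, n: usize) -> Vec<String> {
        match self.per_goal.get(goal_id) {
            Some(buf) => buf.iter().skip(buf.len().saturating_sub(n)).cloned().collect(),
            None => Vec::new(),
        }
    }
}
```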

Persistence wiring (Phase 71)

The bin reads agent_registry.store from config/project-tracker/project_tracker.yaml and opens SqliteAgentRegistryStore when the resolved path is non-empty. Env placeholders (${NEXO_AGENT_REGISTRY_DB:-./data/agents.db}) are expanded before the open. Path open failures fall back to MemoryAgentRegistryStore with a warn so a corrupt sqlite file never bricks boot.

When the registry is sqlite-backed and reattach_on_boot: true, the bin runs the reattach sweep with resume_running=false. Every prior-run Running row flips to LostOnRestart, and any notify_origin / notify_channel hook attached to that goal fires once with an [abandoned] summary so the originating chat learns the goal could not be resumed. Subprocess respawn is intentionally not attempted — restoring a Claude Code worktree the daemon no longer owns is unsafe to do silently and lives under Phase 67.C.1.

Shutdown drain (Phase 71.3)

On SIGTERM the bin runs nexo_dispatch_tools::drain_running_goals before plugin teardown so notify_origin reaches WhatsApp / Telegram while their adapters are still alive. Each Running goal's Cancelled hooks fire with a [shutdown] summary; per-hook dispatch is bounded by a 2 s timeout so a stuck publish cannot hold shutdown hostage. The row then flips to LostOnRestart so the next boot's reattach sweep does not re-fire the same notification.

[shutdown] daemon stopping — goal `<id>` was running and has
been marked abandoned. Re-dispatch with `program_phase
phase_id=<phase>` if you still need it.

SIGKILL still bypasses this — the boot-time reattach sweep is the safety net for that case.

Turn-level audit log (Phase 72)

Live state (AgentSnapshot) only carries the latest decision / diff / acceptance per goal. Once a turn rolls forward the previous turn's data is gone. To answer "what did the agent actually do across its 40 turns?" the runtime now writes a durable row per turn into a goal_turns table on the same agents.db:

goal_turns(
    goal_id      TEXT,
    turn_index   INTEGER,
    recorded_at  INTEGER,
    outcome      TEXT,        -- done | continue | needs_retry | …
    decision     TEXT,        -- last Decision rendered as
                              --   "<tool> (allow|deny:msg|observe:note) — rationale"
    summary      TEXT,        -- mirror of AgentSnapshot.last_progress_text
    diff_stat    TEXT,
    error        TEXT,        -- pre-rendered for needs_retry / escalate / budget
    raw_json     TEXT,        -- full AttemptResult payload
    PRIMARY KEY (goal_id, turn_index)
);

EventForwarder writes a row on every AttemptResult event, upsert-on-conflict so a replay can't dup history. The new chat tool agent_turns_tail goal_id=<uuid> [n=20] returns a markdown table of the last N rows (default 20, capped at 1000):

showing 20 of 40 turn(s) for `…`

| turn | outcome | decision | summary | error |
|---|---|---|---|---|
| 21 | continue | Edit (allow) — patch crate slack | wired Plugin trait | - |
| 22 | needs_retry | Bash (allow) — cargo build | … | E0432 in slack/src/lib.rs |
…

Best-effort writes: an append failure logs a warn but never blocks the driver loop. When the registry isn't sqlite-backed (memory fallback), the tool reports "set agent_registry.store in project_tracker.yaml" rather than silently returning empty.
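
The upsert-on-conflict and capped tail semantics can be modelled in memory. This sketch swaps the SQLite table for a `BTreeMap` keyed on `(goal_id, turn_index)`; `TurnLog`, `record`, and `tail` are hypothetical names, and only the outcome column is kept.

```rust
use std::collections::BTreeMap;

const DEFAULT_TAIL: usize = 20;
const MAX_TAIL: usize = 1000;

/// In-memory stand-in for the goal_turns table.
#[derive(Default)]
pub struct TurnLog {
    rows: BTreeMap<(String, u32), String>,
}

impl TurnLog {
    /// Same primary key replaces the row, so a replayed event
    /// cannot duplicate history.
    pub fn record(&mut self, goal_id: &str, turn_index: u32, outcome: &str) {
        self.rows.insert((goal_id.to_string(), turn_index), outcome.to_string());
    }

    /// Last `n` rows for a goal in turn order (default 20, capped at 1000).
    pub fn tail(&self, goal_id: &str, n: Option<usize>) -> Vec<(u32, String)> {
        let n = n.unwrap_or(DEFAULT_TAIL).min(MAX_TAIL);
        let mut rows: Vec<(u32, String)> = self
            .rows
            .iter()
            .filter(|((g, _), _)| g.as_str() == goal_id)
            .map(|((_, t), o)| (*t, o.clone()))
            .collect();
        let skip = rows.len().saturating_sub(n);
        rows.split_off(skip)
    }
}
```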

Async dispatch (Phase 67.C + 67.E)

DriverOrchestrator::spawn_goal(self: Arc<Self>, goal) returns a tokio::task::JoinHandle so the calling tool returns the goal id instantly without waiting for the run to finish. Per-goal pause / cancel signals (watch<bool> and CancellationToken::child_token) let pause_agent / cancel_agent target one goal without taking down the rest of the orchestrator.

program_phase_dispatch is the heart of the dispatch surface: it reads the sub-phase out of PHASES.md, runs DispatchGate::check, constructs a Goal with the dispatcher / origin metadata, asks the registry for a slot, and either spawns the goal or returns Queued / Forbidden / NotFound. dispatch_followup is the mirror that pulls the description from a FOLLOWUPS.md item.

Capability gate (Phase 67.D)

DispatchPolicy { mode, max_concurrent_per_dispatcher, allowed_phase_ids, forbidden_phase_ids } lives on AgentConfig and (as Option<DispatchPolicy>) on InboundBinding. The per-binding override fully replaces the agent-level value so an operator can be precise per channel ("asistente is none everywhere except this Telegram chat where it is full").

DispatchGate::check short-circuits in this order:

capability None → CapabilityNone (every kind).
ReadOnly capability + write kind → CapabilityReadOnly.
write + require_trusted + !sender_trusted → SenderNotTrusted. Read tools bypass the trust gate so list_agents stays open for unpaired senders.
forbidden_phase_ids match → PhaseForbidden.
non-empty allowed_phase_ids + no match → PhaseNotAllowed.
dispatcher / sender / global caps. Global cap with queue_when_full=true is admitted; the orchestrator queues it. Without queue → GlobalCapReached.
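
The short-circuit order can be sketched as a chain of early returns. The denial names come from the list above; `Request`, `Capability`, `Kind`, and the field layout are assumptions. The sketch models only the global cap (not the dispatcher / sender caps) and applies the phase and cap checks to every kind, which the text does not specify.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Capability { None, ReadOnly, Full }

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Kind { Read, Write }

#[derive(Debug, PartialEq, Eq)]
pub enum Denial {
    CapabilityNone,
    CapabilityReadOnly,
    SenderNotTrusted,
    PhaseForbidden,
    PhaseNotAllowed,
    GlobalCapReached,
}

/// Hypothetical flattened view of everything the gate looks at.
#[derive(Debug, Clone, Copy)]
pub struct Request<'a> {
    pub capability: Capability,
    pub kind: Kind,
    pub require_trusted: bool,
    pub sender_trusted: bool,
    pub phase_id: &'a str,
    pub forbidden_phase_ids: &'a [&'a str],
    pub allowed_phase_ids: &'a [&'a str],
    pub global_running: usize,
    pub global_cap: usize,
    pub queue_when_full: bool,
}

pub fn check(r: &Request) -> Result<(), Denial> {
    if r.capability == Capability::None {
        return Err(Denial::CapabilityNone);
    }
    if r.capability == Capability::ReadOnly && r.kind == Kind::Write {
        return Err(Denial::CapabilityReadOnly);
    }
    // Read tools bypass the trust gate.
    if r.kind == Kind::Write && r.require_trusted && !r.sender_trusted {
        return Err(Denial::SenderNotTrusted);
    }
    if r.forbidden_phase_ids.contains(&r.phase_id) {
        return Err(Denial::PhaseForbidden);
    }
    if !r.allowed_phase_ids.is_empty() && !r.allowed_phase_ids.contains(&r.phase_id) {
        return Err(Denial::PhaseNotAllowed);
    }
    // At the cap, the request passes only when the orchestrator may queue it.
    if r.global_running >= r.global_cap && !r.queue_when_full {
        return Err(Denial::GlobalCapReached);
    }
    Ok(())
}
```

Because the forbidden check runs before the allowed check, a phase id that appears in both lists is denied.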

ToolRegistry::apply_dispatch_capability(policy, is_admin) prunes the registry of dispatch tool names not allowed by the resolved policy. ToolRegistryCache::get_or_build_with_dispatch builds the per-binding filtered registry that respects both allowed_tools and dispatch_policy. Hot reload (Phase 18) constructs a fresh ToolRegistryCache per snapshot, so a new dispatch_policy lands on the next intake without restart; in-flight goals keep their pre-reload tool surface so a hot reload never preempts.

Completion hooks (Phase 67.F)

Each hook is (on: HookTrigger, action: HookAction, id). Triggers fire on Done | Failed | Cancelled | Progress { every_turns }. Actions:

notify_origin — publish a markdown summary to the chat that triggered the goal. No-op when origin.plugin == "console".
notify_channel { plugin, instance, recipient } — publish to an explicit channel different from the origin (escalate to ops).
dispatch_phase { phase_id, only_if } — chain another goal when only_if matches the firing transition. Implemented via a pluggable DispatchPhaseChainer so the runtime owns program_phase_dispatch plumbing.
nats_publish { subject } — JSON payload to a custom subject.
shell { cmd, timeout } — opt-in via allow_shell_hooks. Capability PROGRAM_PHASE_ALLOW_SHELL_HOOKS registered with the setup inventory so agent doctor capabilities flags it the moment the operator exports the env var. Receives NEXO_HOOK_GOAL_ID / PHASE_ID / TRANSITION / PAYLOAD_JSON env vars.

HookIdempotencyStore (SQLite) keeps (goal_id, transition, action_kind, action_id) UNIQUE so at-least-once NATS replay or a mid-hook restart cannot fire a hook twice.
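
The uniqueness guarantee can be modelled with a set of claimed keys. This in-memory sketch stands in for the SQLite UNIQUE constraint; `HookLedger` and `claim` are hypothetical names.

```rust
use std::collections::HashSet;

/// In-memory stand-in for the (goal_id, transition, action_kind, action_id)
/// UNIQUE constraint.
#[derive(Default)]
pub struct HookLedger {
    seen: HashSet<(String, String, String, String)>,
}

impl HookLedger {
    /// Returns true only the first time a key is claimed; the caller
    /// fires the hook when this returns true and skips it otherwise.
    pub fn claim(&mut self, goal_id: &str, transition: &str, action_kind: &str, action_id: &str) -> bool {
        self.seen.insert((
            goal_id.to_string(),
            transition.to_string(),
            action_kind.to_string(),
            action_id.to_string(),
        ))
    }
}
```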

HookRegistry (in-memory DashMap<GoalId, Vec<CompletionHook>>) backs add_hook / remove_hook / agent_hooks_list.

NATS subjects (Phase 67.H.2)

Subject	Producer
`agent.dispatch.spawned`	`program_phase_dispatch` admitted
`agent.dispatch.denied`	`DispatchGate::check` denied
`agent.tool.hook.dispatched`	hook fired ok
`agent.tool.hook.failed`	hook attempt errored
`agent.registry.snapshot.<goal_id>`	per-goal periodic beacon
`agent.driver.progress`	every Nth completed work-turn

Plus the existing Phase 67.0–67.9 subjects: agent.driver.{goal,attempt}.{started,completed}, agent.driver.{decision,acceptance,budget.exhausted,escalate,replay,compact}.

CLI (Phase 67.H.1)

nexo-driver-tools mirrors the chat tool surface for shell use:

nexo-driver-tools status [--phase <id> | --followups]
nexo-driver-tools dispatch <phase_id>
nexo-driver-tools agents list [--filter running|queued|...]
nexo-driver-tools agents show <goal_id>
nexo-driver-tools agents cancel <goal_id> [--reason "…"]

origin.plugin = "console" so notify_origin is a no-op (the operator sees stdout, not a chat reply).

Built-in registration (`nexo` daemon)

The default nexo agent binary registers every dispatch tool definition at boot via nexo_core::agent::dispatch_handlers::register_dispatch_tools_into. The LLM sees program_phase, list_agents, agent_status, etc. in its toolset; per-binding dispatch_capability (config/agents.yaml) prunes the write tools for bindings that opted out.

What's NOT bundled by default is the runtime DispatchToolContext: the orchestrator + registry + tracker references the handlers consult. Without it, a tool call returns a clean "dispatch tools require AgentContext.dispatch to be set at boot" error instead of pretending success. Two integration paths from there:

In-process orchestrator: boot a DriverOrchestrator alongside the agents, share one AgentRegistry. See the next section for the wiring sample.
NATS-based dispatch: the agent bin publishes a message to agent.driver.dispatch.request that a separate nexo-driver daemon consumes. This is the topology to use when the Claude subprocess needs hardware (GPU box) the agent daemon doesn't have. The dispatch tool surface only changes in the registry it consults; operators can swap the in-process AgentRegistry for one that mirrors a NATS-backed registry without touching the handlers.

Boot wiring (B8)

The integrator's main.rs ties everything together. Minimal shape:

use std::sync::Arc;
use nexo_agent_registry::{AgentRegistry, MemoryAgentRegistryStore, LogBuffer};
use nexo_core::agent::{
    dispatch_handlers::{register_dispatch_tools_into, DispatchToolContext},
    tool_registry::ToolRegistry,
};
use nexo_dispatch_tools::{
    event_forwarder::EventForwarder,
    hooks::{DefaultHookDispatcher, HookRegistry, NoopNatsHookPublisher},
    policy_gate::CapSnapshot,
    NoopTelemetry,
};
use nexo_pairing::PairingAdapterRegistry;
use nexo_project_tracker::FsProjectTracker;

// 1. Project tracker.
let tracker: Arc<dyn nexo_project_tracker::ProjectTracker> =
    Arc::new(FsProjectTracker::open(std::env::current_dir().unwrap())?);

// 2. Agent registry + log buffer.
let registry = Arc::new(AgentRegistry::new(
    Arc::new(MemoryAgentRegistryStore::default()),
    4,
));
let log_buffer = Arc::new(LogBuffer::new(200));
let hook_registry = Arc::new(HookRegistry::new());

// 3. Hook dispatcher with the channel adapters that Phase 26
//    registered (whatsapp / telegram).
let pairing = PairingAdapterRegistry::new();
// pairing.register(WhatsappPairingAdapter::new(...));
// pairing.register(TelegramPairingAdapter::new(...));
let hook_dispatcher = Arc::new(DefaultHookDispatcher::new(
    pairing,
    Arc::new(NoopNatsHookPublisher),
));

// 4. Orchestrator with EventForwarder so registry / log_buffer /
//    hooks see every driver event.
let inner_sink: Arc<dyn nexo_driver_loop::DriverEventSink> =
    Arc::new(nexo_driver_loop::NoopEventSink);
let event_sink: Arc<dyn nexo_driver_loop::DriverEventSink> =
    Arc::new(EventForwarder::new(
        registry.clone(),
        log_buffer.clone(),
        hook_registry.clone(),
        hook_dispatcher.clone(),
        inner_sink,
    ));
// (orchestrator builder consumes event_sink)

// 5. Bundle for AgentContext.dispatch.
let dispatch_ctx = Arc::new(DispatchToolContext {
    tracker,
    orchestrator: orch.clone(),
    registry,
    hooks: hook_registry,
    log_buffer,
    default_caps: CapSnapshot {
        queue_when_full: true,
        ..Default::default()
    },
    require_trusted: true,
    telemetry: Arc::new(NoopTelemetry),
});

// 6. Register the handlers into the base ToolRegistry. The
//    per-binding cache prunes write tools when capability=None
//    or read_only.
let base = ToolRegistry::new();
register_dispatch_tools_into(&base);

// 7. Per-session AgentContext.with_dispatch(dispatch_ctx)
//    + .with_sender_trusted(true) + .with_inbound_origin(plugin,
//    instance, sender).

Without step 6 the handlers exist but aren't reachable by the LLM. Without step 4 the registry / log_buffer / hooks stay inert. Without step 5 the handlers return MissingDispatchCtx.

Plan mode (Phase 79.1)

Plan mode is a per-goal toggle that puts the agent into a read-only "exploration + design" phase. While active, every mutating tool call is short-circuited at the dispatcher with a structured PlanModeRefusal, and the model is expected to call ExitPlanMode { final_plan } once it has a coherent plan. The operator approves (or rejects) the plan via the pairing channel, and plan mode flips back to off so the agent can implement.

The feature ports two prior agent CLI tools (EnterPlanModeTool + ExitPlanModeV2Tool) with three deliberate diffs from the upstream CLI:

Decision	upstream	nexo-rs (Phase 79.1)
Approval channel	Local TUI dialog; `KAIROS_CHANNELS` flag DISABLES plan mode under chat channels	Pairing-friendly: every approval flows through the chat channel itself via `[plan-mode] approve\|reject plan_id=…`
Refusal payload	Free-form string from `validateInput`	Structured `PlanModeRefusal { tool_name, tool_kind, hint, entered_at, entered_reason }`
Plan body	Read from disk via `getPlanFilePath(agentId)`	`final_plan: String` arg, capped at 8 KiB — disk fallback parked as a follow-up

YAML knobs

agents:
  - id: cody
    plan_mode:
      enabled: true                    # tool registered + reachable
      auto_enter_on_destructive: false # opt-in pairing with Phase 77.8
      default_active: ~                 # role-aware default (see below)
      approval_timeout_secs: 86400      # 24 h; goal stops with ApprovalTimeout if exceeded
      require_approval: false           # default safe rollout — flip to `true` in production

    inbound_bindings:
      - plugin: whatsapp
        instance: ops
        role: coordinator               # used by the role-aware `default_active`
        plan_mode:                      # full per-binding override
          require_approval: true

default_active is null by default. The runtime resolves it through PlanModePolicy::compute_default_active(role):

Binding role	Default `default_active`
`coordinator`	`true` (these bindings drive non-trivial work)
`worker`	`false` (workers receive sub-goals from a coordinator that already planned)
`proactive`	`false` (Phase 77.20 ticks would be disrupted by a blocking approval flow)
unset / unknown	`false` (safest opt-out)

Operators can pin the value with default_active: true | false to override the role-aware default.
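
The resolution rule in the table can be sketched as a small function. The signature below is an assumption: the real `PlanModePolicy::compute_default_active(role)` is a method, and how the pinned value reaches it is not shown in this page.

```rust
/// `pinned` is the operator's explicit `default_active` (None when left as ~);
/// `role` is the binding role, if any.
pub fn compute_default_active(pinned: Option<bool>, role: Option<&str>) -> bool {
    // An explicit true / false always wins over the role-aware default.
    if let Some(value) = pinned {
        return value;
    }
    // Only coordinators default to plan mode; worker, proactive,
    // unset, and unknown roles all resolve to false.
    matches!(role, Some("coordinator"))
}
```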

Tools

`EnterPlanMode`

Zero parameters except an optional reason: string. Returns

{
  "entered_plan_mode": true,
  "already_in_plan_mode": false,
  "entered_at": 1700000000,
  "reason": "explore auth flow",
  "instructions": "..."
}

Hard guard: rejects with PermissionDenied when called from a sub-agent / cron / poller / heartbeat-spawned goal. Lifted from the upstream agent CLI and refined with OpenClaw research/src/acp/session-interaction-mode.ts:4-15: only chat-rooted goals qualify, because only they have a path to deliver the operator approval that ends plan mode.

`ExitPlanMode { final_plan: String }`

Submits the plan and (when require_approval: true) waits for operator decision. Returns

{
  "exited_plan_mode": true,
  "unlocked_at": 1700000123,
  "entered_at": 1700000000,
  "plan_bytes": 412,
  "plan_chars": 411,
  "plan_id": "01J…",
  "approval_required": true
}

`plan_mode_resolve { plan_id, decision: approve|reject, reason? }`

Operator-side resolver. The pairing parser (future wiring) calls this when it sees [plan-mode] approve plan_id=… or [plan-mode] reject plan_id=… reason=… on an inbound message. It is also directly callable: tests and CLI ops use it without the parser layer.

reject requires reason (non-empty); the rejection causes the awaiting ExitPlanMode to fail with the canonical follow-up prompt (lift from research/src/agents/bash-tools.exec-approval-followup.ts:27-40):

Plan rejected by operator. Reason: <reason>.
Do not call ExitPlanMode again with the same plan.
Adjust the plan based on the rejection reason and present a revised plan.

Plan-mode stays ON after a reject so the model must revise.

Mutating tools blocked while plan mode is on

Canonical list lives in crates/core/src/plan_mode.rs::MUTATING_TOOLS. Adding a tool to the runtime registry without classifying it (mutating OR read-only) makes ToolRegistry::assert_plan_mode_classified() panic at boot in strict deployments — ensures no tool silently bypasses the gate.

Currently classified mutating:

Bash (gated when the next-shipped Phase 77.8 destructive classifier returns is_mutating: true; until 77.8 lands, every Bash call refuses fail-safe)
FileWrite, FileEdit, NotebookEdit
program_phase, delegate_to, dispatch_followup
TeamCreate, TeamDelete
ScheduleCron, RemoteTrigger
Config (the apply op only — read and propose stay read-only)
Plugin outbound names following the <channel>.<verb> convention (e.g. whatsapp.send, browser.click)

Classified read-only (always callable):

FileRead, Glob, Grep, WebSearch, WebFetch
ListMcpResources, ReadMcpResource, ToolSearch
AskUserQuestion, Sleep
EnterPlanMode, ExitPlanMode
Memory + observability tools (memory_search, agent_query, agent_turns_tail, session_logs, what_do_i_know, who_am_i, my_stats)
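
The boot-time classification check can be sketched as a lookup against the two lists above. This is an illustration only: the constants copy the names listed in this section, `unclassified` and `is_plugin_outbound` are hypothetical helpers, and the per-operation split for `Config` and the Phase 77.8 `Bash` classifier are not modelled.

```rust
/// Names copied from the "classified mutating" list above.
const MUTATING: &[&str] = &[
    "Bash", "FileWrite", "FileEdit", "NotebookEdit", "program_phase",
    "delegate_to", "dispatch_followup", "TeamCreate", "TeamDelete",
    "ScheduleCron", "RemoteTrigger", "Config",
];

/// Names copied from the "classified read-only" list above.
const READ_ONLY: &[&str] = &[
    "FileRead", "Glob", "Grep", "WebSearch", "WebFetch", "ListMcpResources",
    "ReadMcpResource", "ToolSearch", "AskUserQuestion", "Sleep",
    "EnterPlanMode", "ExitPlanMode", "memory_search", "agent_query",
    "agent_turns_tail", "session_logs", "what_do_i_know", "who_am_i", "my_stats",
];

/// Plugin outbound tools follow the <channel>.<verb> convention.
fn is_plugin_outbound(name: &str) -> bool {
    matches!(name.split_once('.'), Some((channel, verb)) if !channel.is_empty() && !verb.is_empty())
}

/// Returns every registered tool that is neither mutating nor read-only.
/// A strict deployment would refuse to boot when this is non-empty.
pub fn unclassified<'a>(registered: &[&'a str]) -> Vec<&'a str> {
    registered
        .iter()
        .copied()
        .filter(|name| {
            !MUTATING.contains(name) && !READ_ONLY.contains(name) && !is_plugin_outbound(name)
        })
        .collect()
}
```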

Notify-line formats (frozen)

Every transition emits a canonical line via tracing::info!. The formats are frozen — operator dashboards and parsers read them. Any change here is a breaking change.

[plan-mode] entered at <RFC3339> — reason: <model[: <text>]|operator|auto-destructive: <check>>
[plan-mode] awaiting approval plan_id=<UUID> (resolve via plan_mode_resolve { plan_id, decision: approve|reject })
[plan-mode] approved plan_id=<UUID>
[plan-mode] rejected plan_id=<UUID> reason=<…>
[plan-mode] approval timed out plan_id=<UUID>
[plan-mode] exited — plan: <first 200 chars>… (full plan in turn log #<turn_idx>)
[plan-mode] refused tool=<name> kind=<bash|file_edit|outbound|delegate|dispatch|schedule|config|read_only>

Future work: pipe these to notify_origin so the pairing channel sees them directly (today they live in stdout / structured logs).

Persistence

agent_registry.goals.plan_mode is a TEXT column carrying the JSON-serialised PlanModeState. It survives daemon restart via the Phase 71 reattach path: a goal that was in plan mode when the daemon died comes back with the same state, and the per-turn system-prompt hint resumes.

Follow-ups

Tracked in proyecto/FOLLOWUPS.md::Phase 79.1:

Operator-approval scope check (port from OpenClaw roleScopesAllow pattern when 79.10 ships).
final_plan_path variant for plans larger than 8 KiB.
Acceptance retry policy for flaky test suites.

References

PRIMARY: upstream agent CLI (prepareContextForPlanMode).
SECONDARY: research/src/acp/session-interaction-mode.ts:4-15 (interactive vs background sessions), research/src/agents/bash-tools.exec-approval-followup.ts:27-40 (canonical reject follow-up prompt).
Plan + spec: proyecto/PHASES.md::79.1.

`TodoWrite` (Phase 79.4)

TodoWrite is the model's intra-turn scratch list. Every call replaces the entire list (full-replace semantics). The runtime wipes the stored list to [] whenever every item is completed, so the next planning cycle starts fresh.

The tool is always callable — including while plan mode is on, since it never touches workspace, broker, or external state. Lift from upstream agent CLI.

Diff vs Phase 14 TaskFlow

Trait	`TodoWrite` (this)	TaskFlow (Phase 14)
Lifetime	Per goal, in-memory	Persistent, cross-session
Owner	Model	Operator + model + flows
Shape	Flat array	DAG with deps + waits
Semantics	Full-replace, wipe-on-all-completed	Partial mutations, manual close
Survives daemon restart	No	Yes
When to use	Coordinating sub-steps inside a long Phase 67 driver-loop turn without spawning sub-goals	Multi-day work programs, cross-session state

Tool shape

{
  "todos": [
    {
      "content": "Run cargo test",
      "status": "pending",
      "activeForm": "Running cargo test"
    }
  ]
}

Every item must carry both content (imperative) and activeForm (present continuous). activeForm is what the upstream CLI shows, and what dashboards render, while the item is in_progress, so they get a natural progress string without grammar fixup. Snake-case active_form is also accepted for consistency with the rest of nexo-rs.

status is one of pending | in_progress | completed.

Bounds

Max 50 items per goal (defensive — the upstream CLI does not enforce one; nexo-rs adds the cap so a runaway model cannot grow the list unbounded).
Max 200 UTF-8 bytes per content and active_form field.
A bad write rejects without clobbering the existing list.

Response

{
  "old_todos": [...],
  "new_todos": [...],
  "wiped_on_all_completed": false,
  "in_progress_count": 1,
  "instructions": "Todos updated. Keep exactly one item `in_progress` at a time. Mark completed IMMEDIATELY after finishing each task; do not batch completions."
}

old_todos echoes the previous list so the model sees the diff in the same turn. wiped_on_all_completed: true flags that the runtime just cleared the stored list.
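
The full-replace, validate-first, and wipe-on-all-completed rules can be sketched together. The bounds come from the Bounds section; the types, the error strings, and the choice to echo the submitted list as `new_todos` after a wipe are assumptions for this illustration.

```rust
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum Status { Pending, InProgress, Completed }

#[derive(Debug, Clone, PartialEq, Eq)]
pub struct Todo {
    pub content: String,
    pub active_form: String,
    pub status: Status,
}

const MAX_ITEMS: usize = 50;
const MAX_FIELD_BYTES: usize = 200;

/// Hypothetical subset of the tool response.
#[derive(Debug)]
pub struct WriteResult {
    pub old_todos: Vec<Todo>,
    pub new_todos: Vec<Todo>,
    pub wiped_on_all_completed: bool,
}

/// Hypothetical per-goal store.
#[derive(Default)]
pub struct TodoList { items: Vec<Todo> }

impl TodoList {
    pub fn items(&self) -> &[Todo] { &self.items }

    pub fn write(&mut self, todos: Vec<Todo>) -> Result<WriteResult, String> {
        // Validate before touching state so a bad write cannot clobber the list.
        if todos.len() > MAX_ITEMS {
            return Err(format!("too many items: {} > {}", todos.len(), MAX_ITEMS));
        }
        for t in &todos {
            // String::len counts UTF-8 bytes, matching the documented bound.
            if t.content.len() > MAX_FIELD_BYTES || t.active_form.len() > MAX_FIELD_BYTES {
                return Err(format!("field exceeds {} bytes", MAX_FIELD_BYTES));
            }
        }
        // A non-empty list where everything is completed wipes the stored list.
        let wiped = !todos.is_empty() && todos.iter().all(|t| t.status == Status::Completed);
        let stored = if wiped { Vec::new() } else { todos.clone() };
        let old_todos = std::mem::replace(&mut self.items, stored);
        Ok(WriteResult { old_todos, new_todos: todos, wiped_on_all_completed: wiped })
    }
}
```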

When the model should use it

The tool description ships the canonical "use proactively for 3+ step tasks" guidance lifted from upstream agent CLI. In short: multi-step coding tasks → seed a list, mark exactly one item in_progress, mark it completed the moment it finishes (don't batch), tear the list down once everything is done.

References

SECONDARY: OpenClaw research/ — no equivalent (grep -rln "todo" research/src/ returns only unrelated cron / delivery files).

`ToolSearch` (Phase 79.2 — MVP)

ToolSearch is the discovery surface for deferred tools — tools whose full JSONSchema lives behind a single ToolSearch(...) lookup instead of inline in the system prompt. The savings are real once the tool surface gets wide (40+ tools after Phase 13 + 77 + 79); the MVP shipped here lays the foundation.

Lift from upstream agent CLI (input schema, select: prefix, keyword search with +token required prefix, scoring weights for name parts vs description vs searchHint).

How a tool becomes "deferred"

The runtime does not infer this from the tool itself. Callers opt in when registering:

use nexo_core::agent::tool_registry::{ToolMeta, ToolRegistry};

registry.register_with_meta(
    MyTool::tool_def(),
    MyTool,
    ToolMeta::deferred_with_hint("send a slack message"),
);

ToolMeta has two fields:

Field	Default	Effect
`deferred: bool`	`false`	When `true`, surfaces only via `ToolSearch`
`search_hint: Option<String>`	`None`	Curated phrase used by keyword ranking — beats raw description scoring

Existing register(...) calls keep working unchanged — the side channel is opt-in.

MVP caveat

The four LLM provider clients (anthropic, minimax, gemini, openai_compat) still emit every registered tool's full schema in the request body. The actual token-cost savings land when a follow-up wires those four clients to consult ToolRegistry::deferred_tools() and filter accordingly. Tracked in FOLLOWUPS.md::Phase 79.2. Until then, ToolSearch is useful as a discovery API the model can use today, and the registration path is correct so the upgrade is a four-file change.

Query forms

Lifted verbatim from the upstream CLI:

select:Read,Edit,Grep      → fetch these exact tools by name (comma-separated)
notebook jupyter           → keyword search, up to max_results best matches
+slack send                → require "slack" in name/desc/hint, rank by remaining terms

Tokens prefixed with + are required: tools that don't match all required tokens are filtered out before scoring. Other tokens are optional but still contribute to the score.

Scoring (mirror of the upstream CLI)

Match site	Non-MCP score	MCP score
Exact name part	10	12
Substring within name part	5	6
`search_hint` contains term	+4	+4
`description` contains term	+2	+2

Matches with score == 0 are dropped. Results sorted by score desc then name asc; truncated to max_results (default 5, hard cap 25).

mcp__server__action-shaped names are tokenised by splitting on __ and _; CamelCase names split on case boundaries; dotted plugin names (whatsapp.send) split on the "." character. So a query "slack" matches mcp__slack__send_message on the exact server-name part.
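The tokenisation and scoring rules above can be sketched in a few lines. This is an illustrative approximation, not the shipped ranker: `Tool`, `name_parts`, `score`, and `search` are hypothetical names, the sketch keeps the literal `mcp` prefix as a name part, and it assumes per-term scores are summed (the source only gives the per-match weights).

```rust
/// Hypothetical view of a deferred tool, for illustration only.
pub struct Tool<'a> {
    pub name: &'a str,
    pub description: &'a str,
    pub search_hint: Option<&'a str>,
}

/// Split a tool name on `_`, `.`, and lower-to-upper case boundaries.
pub fn name_parts(name: &str) -> Vec<String> {
    let mut parts = Vec::new();
    let mut cur = String::new();
    let mut prev_lower = false;
    for ch in name.chars() {
        if ch == '_' || ch == '.' {
            if !cur.is_empty() {
                parts.push(std::mem::take(&mut cur));
            }
            prev_lower = false;
            continue;
        }
        // CamelCase boundary: an uppercase letter right after a lowercase one.
        if ch.is_uppercase() && prev_lower && !cur.is_empty() {
            parts.push(std::mem::take(&mut cur));
        }
        prev_lower = ch.is_lowercase() || ch.is_ascii_digit();
        cur.extend(ch.to_lowercase());
    }
    if !cur.is_empty() {
        parts.push(cur);
    }
    parts
}

/// Score one tool against a query; 0 means "drop".
pub fn score(tool: &Tool<'_>, query: &str) -> u32 {
    let is_mcp = tool.name.starts_with("mcp__");
    let (exact, partial): (u32, u32) = if is_mcp { (12, 6) } else { (10, 5) };
    let parts = name_parts(tool.name);
    let desc = tool.description.to_lowercase();
    let hint = tool.search_hint.unwrap_or("").to_lowercase();
    let mut total = 0u32;
    for raw in query.split_whitespace() {
        let required = raw.starts_with('+');
        let term = raw.trim_start_matches('+').to_lowercase();
        if term.is_empty() {
            continue;
        }
        let mut s = 0u32;
        if parts.iter().any(|p| *p == term) {
            s += exact;
        } else if parts.iter().any(|p| p.contains(&term)) {
            s += partial;
        }
        if hint.contains(&term) {
            s += 4;
        }
        if desc.contains(&term) {
            s += 2;
        }
        // A required (+) token that matches nowhere filters the tool out.
        if required && s == 0 {
            return 0;
        }
        total += s;
    }
    total
}

/// Rank tools: score desc, then name asc, truncated to max_results (hard cap 25).
pub fn search<'a>(tools: &'a [Tool<'a>], query: &str, max_results: usize) -> Vec<(&'a str, u32)> {
    let mut hits: Vec<(&'a str, u32)> = tools
        .iter()
        .map(|t| (t.name, score(t, query)))
        .filter(|(_, s)| *s > 0)
        .collect();
    hits.sort_by(|a, b| b.1.cmp(&a.1).then(a.0.cmp(b.0)));
    hits.truncate(max_results.min(25));
    hits
}
```

With this sketch, a query of "slack" against mcp__slack__send_message hits the exact-part MCP weight plus the hint bonus when the hint mentions slack, which is the behaviour the paragraph above describes.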

Response shape

{
  "query": "...",
  "query_kind": "select" | "keyword",
  "total_deferred_tools": 7,
  "matches": [
    {
      "name": "FileEdit",
      "score": 12,            // omitted in `select` responses
      "description": "...",
      "parameters": { ... }   // full JSONSchema
    }
  ],
  "missing": ["UnknownTool"]   // only present in `select` responses
}

The model can read the schema directly out of parameters and call the tool on the next turn. When the runtime starts filtering deferred tools out of the request body (follow-up), the matched tool will also be auto-injected into that turn's available tool list.

References

PRIMARY: upstream agent CLI, upstream agent CLI,62-108 (isDeferredTool).
SECONDARY: OpenClaw research/ — no equivalent. Single-process TS reference does not face the wide-surface MCP token cost that motivates this tool.
Plan + spec: proyecto/PHASES.md::79.2.

`SyntheticOutput` (Phase 79.3)

SyntheticOutput forces a goal to terminate with a JSON value that matches a caller-provided JSONSchema. Closes the gap between "model produces free prose" and "downstream consumer needs a struct" — direct input for Phase 19/20 pollers, Phase 51 eval harness, and any future contract-shaped goal.

Lift from upstream agent CLI.

Diff vs upstream

The upstream CLI builds one tool per schema via createSyntheticOutputTool(jsonSchema) so the model's input is the schema. Nexo-rs runs as a daemon — building a fresh tool per call breaks tool-registry semantics. We ship a single tool whose input carries BOTH the schema and the value:

{
  "schema": { "type": "object", "properties": { "name": { "type": "string" } }, "required": ["name"] },
  "value":  { "name": "ana" }
}

Pollers and eval harnesses inject the schema via prompt template; ad-hoc callers pass it inline. The terminal_schema follow-up (tracked in FOLLOWUPS.md) lets the runtime carry the schema and only the model's value flows on the wire — closer to the upstream single-input shape.

Tool shape

Arg	Type	Required	Notes
`schema`	object	yes	JSONSchema (Draft 7 / 2019-09 / 2020-12). Must be a JSON object.
`value`	any	yes	The value to validate. Object / array / scalar — any shape the schema permits.

Response

{
  "ok": true,
  "structured_output": <value>,
  "instructions": "Output validated. The goal can terminate now — do not call any other tool this turn unless the goal contract calls for it explicitly."
}

On failure the call returns an error whose body lists every violation with the JSON Pointer path of the offending value:

SyntheticOutput: value does not match schema (2 errors): /age: 30 is not of type "string"; /tags/0: "purple" is not one of ["red","green"]

Validation

Uses jsonschema = "0.20" — already an optional dep on nexo-core (default-on via the schema-validation Cargo feature that Phase 9.2 introduced). Builds without the feature compile, but SyntheticOutput returns a clear "feature disabled" error rather than silently passing through — synthesised output without validation is worse than no synthesis.

Plan-mode classification

Classified ReadOnly in nexo_core::plan_mode::READ_ONLY_TOOLS. The tool only validates and echoes; it never touches workspace, broker, or external state. Safe to call while plan mode is on.

References

PRIMARY: upstream agent CLI.
SECONDARY: OpenClaw research/ — no equivalent. Single-process TS reference shapes its outputs via Zod parsing inline; no separate "force structured output" tool.
Plan + spec: proyecto/PHASES.md::79.3.

`NotebookEdit` (Phase 79.13)

Cell-level edits on Jupyter .ipynb notebooks. Pure-Rust round-trip through serde_json::Value — no jupyter binary, no Python dependency. Unknown top-level fields survive untouched (forward-compat with newer nbformat).

Lift from upstream agent CLI.

Tool shape

{
  "notebook_path": "/abs/path/to/nb.ipynb",
  "cell_id":      "alpha",            // UUID-style id, or `cell-N` numeric fallback
  "new_source":   "x = 1",
  "cell_type":    "code",             // optional for replace, required for insert
  "edit_mode":    "replace"           // replace | insert | delete (default: replace)
}

Edit mode	Behaviour
`replace`	Overwrite `cells[i].source`. Code cells get `execution_count: null` and `outputs: []` (the diff stays sane). Markdown cells preserve all metadata.
`insert`	Add a new cell AFTER the anchor `cell_id`. Empty `cell_id` inserts at position 0. `cell_type` required. nbformat ≥ 4.5 gets a fresh 12-char base-36 id.
`delete`	Remove the cell at `cell_id`.

cell_id resolution: literal cells[i].id match first; falls back to cell-N numeric index (matches the upstream parseCellId). Failure lists up to 10 available ids in the error message.
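The resolution order can be sketched as follows. This is a hedged illustration rather than the shipped code: `resolve_cell` is a hypothetical name, cells are represented only by their optional `id`, and the `cell-N` fallback is assumed to be a 0-based index into `cells`.

```rust
/// Resolve `cell_id` to an index into the notebook's `cells` array.
/// `ids[i]` is the `id` field of cell i, or None when the cell has no id.
pub fn resolve_cell(ids: &[Option<&str>], cell_id: &str) -> Result<usize, String> {
    // 1. A literal id match always wins.
    if let Some(i) = ids.iter().position(|id| *id == Some(cell_id)) {
        return Ok(i);
    }
    // 2. Fall back to the `cell-N` numeric form (assumed 0-based here).
    if let Some(n) = cell_id
        .strip_prefix("cell-")
        .and_then(|n| n.parse::<usize>().ok())
    {
        if n < ids.len() {
            return Ok(n);
        }
    }
    // 3. Fail with a short list of ids the caller could have used.
    let available: Vec<String> = ids
        .iter()
        .enumerate()
        .take(10)
        .map(|(i, id)| id.map(str::to_owned).unwrap_or_else(|| format!("cell-{}", i)))
        .collect();
    Err(format!(
        "cell '{}' not found; available: {}",
        cell_id,
        available.join(", ")
    ))
}
```

Checking the literal id first means a cell whose id happens to look like `cell-0` is still addressable by that id, regardless of its position.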

Defensive behaviour

Absolute path required. Refuses relative paths to avoid ambiguity across daemon cwd changes.
.ipynb extension required. Other file types fall to FileEdit.
Replace at end-of-cells auto-converts to insert. Lift from the upstream CLI (NotebookEditTool.ts:372-377). Requires cell_type in that path.
Bad writes leave the file untouched. Validation runs before write; any error returns before std::fs::write.

Plan-mode classification

Classified FileEdit (mutating) in nexo_core::plan_mode::MUTATING_TOOLS. Plan-mode-on goals get a PlanModeRefusal rather than a silent edit.

Output

{
  "notebook_path": "/abs/path/...",
  "edit_mode": "replace" | "insert" | "delete",
  "cell_id": "alpha",
  "cell_type": "code" | "markdown",
  "language": "python",        // from notebook.metadata.language_info.name
  "total_cells": 3,
  "cells_delta": 1             // -1 for delete, +1 for insert, 0 for replace
}

Out of scope (deferred)

Read-before-Edit guard. The upstream CLI requires Read to have been called on the file in the same session before NotebookEdit is allowed. Nexo-rs does not have a shared file-state cache yet; the guard becomes useful when Phase 67 driver-loop adds one.
Attribution / file-history tracking. The upstream fileHistoryTrackEdit records edits to a per-session ledger. Skipped — the workspace-git layer (Phase 10.9) covers the same use case.
Multi-cell batch edits. One operation per call by design; callers loop over cell ids.

References

PRIMARY: upstream agent CLI, upstream agent CLI.
SECONDARY: OpenClaw research/ — no equivalent (grep -rln "ipynb|jupyter|nbformat" research/src/ returns nothing).
Plan + spec: proyecto/PHASES.md::79.13.

`RemoteTrigger` (Phase 79.8)

RemoteTrigger lets the model publish a JSON payload to a pre-configured outbound destination — webhook (HTTP POST) or NATS subject. Destinations live in the agent's YAML allowlist; the model passes only name + payload, never URLs or subjects.

Diff vs upstream

The upstream RemoteTriggerTool (upstream agent CLI) is a CRUD client for claude.ai's hosted scheduled-agent API (/v1/code/triggers). Different concept entirely — Anthropic uses "trigger" to mean "scheduled remote agent". Nexo-rs adopts the name and ships a generic outbound publisher per our PHASES.md spec. The two are conceptually unrelated; we cite the upstream CLI as naming reference only.

Configuration

agents:
  - id: cody
    remote_triggers:
      - kind: webhook
        name: ops-pager
        url: https://hooks.example.com/abc
        secret_env: OPS_PAGER_SECRET   # optional — HMAC-SHA256 signs body
        timeout_ms: 5000               # default 5000
        rate_limit_per_minute: 10      # default 10; 0 = unlimited

      - kind: nats
        name: internal-ops
        subject: agent.outbound.ops
        rate_limit_per_minute: 30

Empty list (the default) keeps the tool registered but every call refuses with "no destination named X in this agent's allowlist".

Tool shape

{
  "name": "ops-pager",
  "payload": { "level": "warn", "msg": "build red on main" }
}

payload accepts any JSON shape (object / array / scalar). Cap is 256 KiB serialised — oversize is rejected before any network call.

Webhook headers

When dispatched as a webhook, every request carries:

Header	Value
`Content-Type`	`application/json`
`X-Nexo-Trigger-Name`	trigger name (allowlist key)
`X-Nexo-Timestamp`	unix-seconds at dispatch
`X-Nexo-Signature`	`sha256=<hex>` HMAC of body using `secret_env` value (only when `secret_env` is set)

Receivers MUST verify the signature when configured. Compute HMAC-SHA256(body, secret) and compare against the X-Nexo-Signature header in constant time.
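On the receiving side, the comparison step might look like the sketch below. It covers only header parsing and the constant-time comparison: computing HMAC-SHA256 over the raw body is assumed to happen elsewhere (with whatever vetted HMAC implementation the receiver already uses) and arrives here as `expected_mac`. The function names are hypothetical, and production code would normally prefer an audited constant-time helper over a hand-rolled loop.

```rust
/// Decode a lowercase or uppercase hex string; None on any malformed input.
fn decode_hex(s: &str) -> Option<Vec<u8>> {
    if s.len() % 2 != 0 || !s.bytes().all(|b| b.is_ascii_hexdigit()) {
        return None;
    }
    (0..s.len())
        .step_by(2)
        .map(|i| u8::from_str_radix(&s[i..i + 2], 16).ok())
        .collect()
}

/// Compare two byte strings without returning early on the first mismatch.
pub fn constant_time_eq(a: &[u8], b: &[u8]) -> bool {
    if a.len() != b.len() {
        return false;
    }
    let mut diff = 0u8;
    for (x, y) in a.iter().zip(b) {
        diff |= x ^ y;
    }
    diff == 0
}

/// Check an `X-Nexo-Signature` header value (`sha256=<hex>`) against the
/// MAC the receiver computed itself over the raw request body.
pub fn signature_matches(header_value: &str, expected_mac: &[u8]) -> bool {
    let hex = match header_value.strip_prefix("sha256=") {
        Some(h) => h,
        None => return false,
    };
    match decode_hex(hex) {
        Some(received) => constant_time_eq(&received, expected_mac),
        None => false,
    }
}
```

The MAC must be computed over the exact bytes received, before any JSON parsing or re-serialisation, otherwise whitespace or key-order changes will make valid signatures fail.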

Rate limit

Sliding-window token bucket per trigger name, 1-minute window, default 10 calls / minute. Set to 0 for unlimited (no bucket). Bucket lives in process memory — restarts reset.
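A per-name limiter with a 1-minute window can be sketched as below. Note the swap: this is a plain sliding-window log (one timestamp per accepted call), chosen for brevity, and not necessarily the bucket structure the runtime uses. `RateLimiter` and `try_acquire` are hypothetical names, and time is passed in as milliseconds so the behaviour is easy to test.

```rust
use std::collections::{HashMap, VecDeque};

/// In-memory limiter keyed by trigger name. State is lost on restart,
/// matching the "bucket lives in process memory" note above.
#[derive(Default)]
pub struct RateLimiter {
    calls: HashMap<String, VecDeque<u64>>,
}

const WINDOW_MS: u64 = 60_000;

impl RateLimiter {
    /// Returns true if the call is allowed and records it.
    /// `limit_per_minute == 0` means unlimited (no state is kept).
    pub fn try_acquire(&mut self, name: &str, limit_per_minute: u32, now_ms: u64) -> bool {
        if limit_per_minute == 0 {
            return true;
        }
        let window = self.calls.entry(name.to_owned()).or_default();
        // Forget calls that have aged out of the 1-minute window.
        while let Some(&oldest) = window.front() {
            if now_ms.saturating_sub(oldest) >= WINDOW_MS {
                window.pop_front();
            } else {
                break;
            }
        }
        if window.len() as u64 >= u64::from(limit_per_minute) {
            return false;
        }
        window.push_back(now_ms);
        true
    }
}
```

Rejected calls are not recorded, so a model that keeps retrying while limited does not extend its own lockout.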

Plan-mode

Classified Outbound (mutating) in nexo_core::plan_mode::MUTATING_TOOLS. Plan-mode-on goals receive PlanModeRefusal rather than a silent publish.

Security model

Allowlist. The model sees only destination names; URLs and subjects are operator-owned in YAML. No way to coerce a trigger to a model-supplied URL.
HMAC sign. Optional but recommended. secret_env resolves at call time — secrets never enter YAML.
Refuses unsigned when secret missing. If secret_env is set but the env var is empty, the call refuses rather than send unsigned (defence in depth — shipping unsigned could bypass receiver auth).
Body cap + rate limit. Capacity controls bound the blast radius if a model goes haywire.
Plan-mode gate. A goal in plan mode cannot publish.

Out of scope (deferred)

Per-binding override. Today the canonical source is agents[].remote_triggers. A binding.remote_triggers override would let an operator scope per channel; not yet wired.
Circuit breaker per trigger. Phase 2.5 CircuitBreaker is available but not yet wired in. Add when transient outbound failures become noisy enough to justify.
Telemetry counters. nexo_remote_trigger_calls_total{name, result} + nexo_remote_trigger_latency_ms{name} are spec'd but not emitted. Wire when the tool is in active use.

Diff vs upstream (summary)

Aspect	upstream	Nexo-rs
Purpose	claude.ai CCR scheduled-agent CRUD	Generic outbound publisher
Auth	Anthropic OAuth	HMAC-SHA256 (operator-shared secret)
Destinations	hardcoded `/v1/code/triggers`	YAML allowlist (webhook / NATS)
Rate limit	Anthropic-side	Per-trigger token bucket in-process

References

PRIMARY: PHASES.md::79.8 spec (own design). upstream agent CLI cited for naming + dispatcher shape only — semantics differ.
SECONDARY: OpenClaw research/ — no equivalent. Single-process TS reference uses plugin outbound paths directly; no allowlisted generic publisher exists.

`Repl` (Phase 79.12)

Stateful REPL tool — spawn persistent Python, Node.js, or bash subprocesses whose interpreter survives across LLM turns inside the same goal. Variables, imports, and definitions persist; the model executes code iteratively without restarting the interpreter each turn.

Feature-gated behind repl-tool. Off by default — arbitrary code execution is dangerous and must be explicitly opted into per agent / per binding.

Lift from upstream agent CLI + src/services/sandbox/.

Tool shape

{
  "action":     "spawn | exec | read | kill | list",
  "session_id": "<uuid>",       // required for exec, read, kill
  "runtime":    "python | node | bash",  // required for spawn
  "code":       "print(1+1)",   // required for exec
  "cwd":        "/tmp"          // optional for spawn (defaults to agent workspace)
}

Action	Behaviour
`spawn`	Launch a new persistent REPL session. Returns `session_id`.
`exec`	Run code in a running session. Returns `stdout`, `stderr`, `timed_out`, `exit_code`. Output is the difference since the last exec/read — only new bytes.
`read`	Read current output buffers without sending code.
`kill`	Terminate a session by id.
`list`	List all active sessions with runtime, cwd, spawned_at, output_len.

Output

{
  "stdout":    "2\n",
  "stderr":    "",
  "timed_out": false,
  "exit_code": null
}

timed_out: true when exec exceeds timeout_secs (default 30 s). Exit code is Some(n) only when the child process has terminated.

Configuration

# Agent-level default
agents:
  - id: ana
    repl:
      enabled: true
      allowed_runtimes: ["python", "node"]
      max_sessions: 3
      timeout_secs: 30
      max_output_bytes: 65536

# Per-binding override (replaces the whole struct)
inbound_bindings:
  - plugin: whatsapp
    repl:
      enabled: true
      allowed_runtimes: ["python"]

Field	Type	Default	Description
`enabled`	bool	`false`	Gate the Repl tool. Off by default.
`allowed_runtimes`	`[string]`	`["python","node","bash"]`	Runtimes the agent may spawn.
`max_sessions`	u32	`3`	Maximum concurrent REPL sessions per agent.
`timeout_secs`	u32	`30`	Seconds before an exec returns `timed_out: true`.
`max_output_bytes`	u64	`65536`	Per-session output buffer cap. Oldest bytes dropped when cap reached.

Runtimes

Runtime	Binary	Flags
`python`	`python3`	`-u` (unbuffered), `-q` (quiet), `-i` (interactive — required when stdin is piped)
`node`	`node`	`-i` (interactive), `--no-warnings`
`bash`	`bash`	`--norc`, `--noprofile`

Session lifecycle

Sessions are keyed by UUID (returned by spawn).
Output is buffered per-session with oldest-first truncation at max_output_bytes.
Background reader threads (blocking I/O via std::thread) read stdout/stderr.
exec snapshots buffer lengths before sending code and returns only the new bytes.
wait_for_new_output polls every 50 ms until the buffer grows beyond the snapshot or the process dies.
Dead-process detection (child.try_wait()) on every exec/read — returns a clear error if the session has terminated.
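The interaction between the output cap and "only new bytes" can be sketched as a small ring buffer. This is an illustration under stated assumptions, not the registry's code: `OutputBuffer` is a hypothetical type, and instead of snapshotting the buffer length it tracks a cumulative byte counter so the read position stays valid after old bytes are dropped.

```rust
use std::collections::VecDeque;

/// Capped per-stream buffer that drops the oldest bytes first.
pub struct OutputBuffer {
    cap: usize,
    data: VecDeque<u8>,
    /// Total bytes ever pushed, including ones already dropped.
    total_written: u64,
    /// Absolute offset up to which output has already been returned.
    read_cursor: u64,
}

impl OutputBuffer {
    pub fn new(cap: usize) -> Self {
        Self { cap, data: VecDeque::new(), total_written: 0, read_cursor: 0 }
    }

    /// Called by the reader thread whenever the child writes output.
    pub fn push(&mut self, bytes: &[u8]) {
        self.total_written += bytes.len() as u64;
        if self.cap == 0 {
            return;
        }
        for &b in bytes {
            if self.data.len() == self.cap {
                self.data.pop_front(); // oldest-first truncation
            }
            self.data.push_back(b);
        }
    }

    /// Return everything written since the previous call that is still retained.
    pub fn take_new(&mut self) -> Vec<u8> {
        // Absolute offset of the first byte still held in `data`.
        let first_retained = self.total_written - self.data.len() as u64;
        let start = self.read_cursor.max(first_retained);
        let skip = (start - first_retained) as usize;
        let out: Vec<u8> = self.data.iter().skip(skip).copied().collect();
        self.read_cursor = self.total_written;
        out
    }
}
```

If a single exec produces more than the cap, the caller sees only the retained tail; the earlier bytes are gone by design.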

Plan-mode classification

Classified Bash (mutating) in nexo_core::plan_mode::MUTATING_TOOLS. Plan-mode-on goals receive a PlanModeRefusal rather than a silent exec.

Sandbox note

Sandbox enforcement (bubblewrap / firejail / macOS sandbox-exec) is tracked in the Phase 79.12 spec but deferred to a future sub-phase. Current implementation trusts the operator to enable repl-tool only for agents that need it, and limits blast radius via allowed_runtimes and max_output_bytes.

Out of scope (deferred)

Sandbox integration. bwrap / firejail / sandbox-exec probing and enforcement. Tracked in PHASES.md 79.12 sandbox matrix.
allow_unsandboxed per-binding toggle. Locked behind capability gate — only via direct YAML edit, not self-config.
Last-expression value. The spec's value: Option<Value> field (JSON-encoded last expression) is deferred.
bash language is included as a pragmatic convenience for debugging; the PHASES.md spec says "no shell language" but the implementation ships it as the session-sandbox risk is lower than an ad-hoc BashTool invocation (no filesystem side effects beyond the session cwd).

References

PRIMARY: upstream agent CLI + upstream agent CLI
SECONDARY: OpenClaw research/ — no equivalent (grep -rln "repl\|REPL\|stateful.*sandbox" research/src/ returns nothing).
Implementation: crates/core/src/agent/repl_registry.rs, crates/core/src/agent/repl_tool.rs, crates/config/src/types/repl.rs.
Plan + spec: proyecto/PHASES.md::79.12.

LSP tool (Phase 79.5)

The Lsp tool gives the model in-process access to four Language Server Protocol servers — rust-analyzer, pylsp, typescript-language-server, gopls — for code intelligence queries (go-to-definition, hover, references, workspace symbol, diagnostics) without spawning a sub-shell or shelling out to grep.

Built-in language matrix

Language	Binary	Install hint
rust	`rust-analyzer`	`rustup component add rust-analyzer`
python	`pylsp`	`pip install python-lsp-server`
typescript / javascript	`typescript-language-server`	`npm i -g typescript-language-server`
go	`gopls`	`go install golang.org/x/tools/gopls@latest`

At boot the daemon probes PATH for each binary. Missing binaries get a single tracing::warn! line that includes the install hint. Discovered binaries are cached for the lifetime of the process; nexo setup reload re-probes if the operator installs a missing one.

Tool surface

A single tool, Lsp, dispatched by the kind field. All positional fields (line, character) are 1-based to match editor UX — the underlying LSP wire is 0-based but the tool description and handler perform the conversion.

{ "kind": "go_to_def",        "file": "src/foo.rs", "line": 42, "character": 8 }
{ "kind": "hover",            "file": "src/foo.rs", "line": 42, "character": 8 }
{ "kind": "references",       "file": "src/foo.rs", "line": 42, "character": 8 }
{ "kind": "workspace_symbol", "query": "Foo" }
{ "kind": "diagnostics",      "file": "src/foo.rs" }

workspace_symbol accepts an empty query ("all symbols"). diagnostics returns the latest publishDiagnostics snapshot for the file; if the file was just opened and no snapshot has arrived yet, the tool waits up to 2 seconds and then returns a "no diagnostics" result.
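The 1-based to 0-based conversion mentioned above is small but easy to get wrong at the boundary. A minimal sketch, with hypothetical function names, assuming the tool rejects a 0 line or character rather than wrapping:

```rust
/// Convert the tool's 1-based (line, character) into the 0-based pair
/// used on the LSP wire. Returns None when either value is 0.
pub fn to_wire_position(line: u32, character: u32) -> Option<(u32, u32)> {
    Some((line.checked_sub(1)?, character.checked_sub(1)?))
}

/// Convert a 0-based wire position back to the 1-based form shown to the model.
pub fn to_tool_position(line: u32, character: u32) -> (u32, u32) {
    (line.saturating_add(1), character.saturating_add(1))
}
```

So the request `"line": 42, "character": 8` goes out as line 41, character 7, and a server result at line 119, character 4 is reported back as 120:5.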

Per-binding policy

Opt in per agent in agents.yaml:

agents:
  - id: cody
    lsp:
      enabled: true
      languages: [rust]              # whitelist; empty = all discovered
      prewarm: [rust]                # warmed at boot
      idle_teardown_secs: 600        # 10 min

enabled: false (the default) keeps the Lsp tool unregistered for the agent — the model never sees it advertised. languages: [] (the default when enabled: true) permits every discovered language; provide an explicit list to sandbox a binding to a single language.

Lifecycle

Lazy spawn — the binary is launched on first tool call to a given (workspace_root, language) tuple. Cold start: ~500 ms for rust-analyzer, lower for the others.
Pre-warm — listing a language under lsp.prewarm spawns it during boot so the first model call hits a warm session.
Idle teardown — sessions with no model activity for idle_teardown_secs are shut down cleanly (shutdown → exit LSP requests, then bounded kill_on_drop). Phase 19/20 poller calls do not count as activity, so a workspace with only synthetic traffic still releases the server's RAM.
Crash recovery — a session that exits non-zero increments a restart counter (cap max_restarts = 3); the next call re-spawns. After the cap, the session reports SessionDead and never auto-restarts again.
-32801 ContentModified retries — rust-analyzer returns this code while it is mid-index; the session retries with exponential backoff (500 ms / 1 s / 2 s) before surfacing the error.
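The ContentModified retry schedule can be sketched as a generic wrapper. This is a simplified, synchronous illustration: `call_with_retry` and `retry_delay` are hypothetical names, errors are reduced to a bare JSON-RPC code, and the sleep function is injected so the schedule can be exercised without waiting.

```rust
use std::time::Duration;

/// JSON-RPC error code for ContentModified.
pub const CONTENT_MODIFIED: i64 = -32801;

/// Delay before retry number `attempt` (0-based): 500 ms, 1 s, 2 s, then give up.
pub fn retry_delay(attempt: u32) -> Option<Duration> {
    if attempt < 3 {
        Some(Duration::from_millis(500u64 << attempt))
    } else {
        None
    }
}

/// Run `call`, retrying only on ContentModified and only within the schedule.
pub fn call_with_retry<T>(
    mut call: impl FnMut() -> Result<T, i64>,
    mut sleep: impl FnMut(Duration),
) -> Result<T, i64> {
    let mut attempt = 0;
    loop {
        match call() {
            Err(code) if code == CONTENT_MODIFIED => match retry_delay(attempt) {
                Some(delay) => {
                    sleep(delay);
                    attempt += 1;
                }
                // Schedule exhausted: surface the error to the caller.
                None => return Err(code),
            },
            // Success or any other error is returned immediately.
            other => return other,
        }
    }
}
```

In a real session the `sleep` argument would be the runtime's timer; passing a closure that records durations is enough to check the schedule.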

Plan-mode

All five MVP ops are read-only — Lsp is in READ_ONLY_TOOLS so plan-mode never refuses an Lsp tool call.

Output format

Results are formatted as path:line:col with workspace-relative paths when shorter, mirroring upstream agent CLI formatters.ts. A label is appended where the LSP server provides one ("Definition: src/bar.rs:120:5 (struct Bar)").

Per-result cap: 100 KB total. Workspace-symbol queries that exceed the cap are truncated with a +N more results note.

Errors

The handler returns {"ok": false, "error", "kind"} instead of panicking. Stable kind discriminators:

ServerUnavailable — binary missing on PATH (error includes the install hint).
LanguageNotEnabled — binding's lsp.languages whitelist excludes the target.
NoServerForExtension — file extension not in the matrix.
FileNotFound / NotAFile / FileTooLarge (10 MB cap).
RequestTimeout — server stalled past 30 seconds; the session receives $/cancelRequest and stays alive.
SessionDead — max-restart cap exceeded.
CapabilityMissing — the running server doesn't advertise the kind's capability (the dynamic description should hide this case in practice).
Wire — JSON-RPC framing or argument parse error.

References

PRIMARY: upstream agent CLI (LSPTool.ts, schemas.ts, formatters.ts, prompt.ts) and upstream agent CLI (LSPClient.ts, LSPServerInstance.ts, LSPServerManager.ts, manager.ts, passiveFeedback.ts).
SECONDARY: research/src/agents/pi-bundle-lsp-runtime.ts (OpenClaw's hand-rolled JSON-RPC framing and capability-filtered tool registration).
Spec + plan: proyecto/PHASES.md::79.5.

`cron_create` / `cron_list` / `cron_delete` / `cron_pause` / `cron_resume`

LLM-time scheduling: from inside a turn, the model registers a cron entry that fires a future goal. Complements Phase 7 Heartbeat (config-time only) and Phase 20 agent_turn poller (config-time only) — this is the only path where the model itself mutates the schedule.

Lift from upstream agent CLI (5-field cron schema, recurring + durable flags, 50-entry cap). OpenClaw research/src/cron/schedule.ts provides the parallel naming convention — we use Rust's cron = "0.12" crate (already a transitive workspace dep).

Diff vs Phase 7 Heartbeat vs Phase 20 `agent_turn` poller

Mechanism	Trigger source	Mutable at runtime	Persists
Phase 7 Heartbeat	YAML `heartbeat.interval_secs`	No (hot-reload only)	Config
Phase 20 `agent_turn` poller	YAML cron spec	No (hot-reload only)	Config
Phase 79.7 ScheduleCron	LLM tool call mid-turn	Yes (model-driven)	SQLite

Tool surface and constraints

cron_create { cron, prompt, channel?, recipient?, recurring? } — schedule a recurring or one-shot prompt. recipient is the to address for outbound publish (JID for WhatsApp, chat id for Telegram, email for SMTP); without it the dispatcher only logs the LLM response.
cron_list — read-only, returns the binding's entries.
cron_delete { id } — remove an entry.
cron_pause { id } — soft-disable an entry (paused = true).
cron_resume { id } — re-enable a paused entry (paused = false).
5-field cron expression (M H DoM Mon DoW); 6-field also accepted (passthrough).
60-second minimum interval — sub-minute schedules refuse with a clear message.
Cap 50 entries per binding (lift from upstream).
Origin-tagged binding namespace: entries from a whatsapp:ops goal stay isolated from telegram:bot entries. binding_id resolves from inbound origin (plugin:instance) with agent_id fallback for non-interactive turns.
SQLite-backed (nexo_cron_entries table); survives daemon restart.
Model pinning at schedule time: cron_create stores model_provider + model_name from effective binding policy so each fire can resolve the same provider/model pair later.

Runtime firing — shipped (end-to-end)

crates/core/src/cron_runner.rs::CronRunner polls store.due_at(now) every 5 s and dispatches due entries through an Arc<dyn CronDispatcher>. State advance is policy-driven:

recurring entries always advance (even on dispatch failure) so a broken downstream never hot-loops one row forever.
one-shot entries delete on success; on failure they retry with bounded exponential backoff (runtime.cron.one_shot_retry) and are deleted only after the retry budget is exhausted.

Production wiring at boot uses LlmCronDispatcher (crates/core/src/llm_cron_dispatcher.rs): builds a ChatRequest from entry.prompt, resolves the LLM client from the entry's pinned model_provider/model_name (with legacy fallback for old rows), logs the response with id + binding + cron expression and a 200-char preview, then forwards the body to the user-facing channel via BrokerChannelPublisher when the entry carries both a channel and a recipient.

Tool-call execution is now available as an explicit opt-in: runtime.cron.tool_calls.enabled: true. In that mode, the dispatcher advertises the binding-filtered tool set, executes returned tool calls, feeds tool_result messages back to the model, and repeats up to runtime.cron.tool_calls.max_iterations.

Fallback: when no agents are configured or the LLM-client build fails, the runner falls back to LoggingCronDispatcher so cron fires stay observable in degraded boot.

Outbound publish

BrokerChannelPublisher parses <plugin>:<instance> from entry.channel and emits an event on plugin.outbound.<plugin>.<instance> carrying:

{ "kind": "text", "to": "<recipient>", "text": "<llm body>" }

This is the same envelope the WhatsApp / Telegram / Email outbound tools already speak — the receiving plugin's dispatcher delivers the message to the user.
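The channel-to-topic mapping and the text envelope can be sketched with the standard library alone. The function names are hypothetical, the hand-written JSON escaping stands in for whatever serialiser the runtime actually uses, and splitting on the first ":" is an assumption about how instance names containing colons would be treated.

```rust
/// Map "<plugin>:<instance>" to "plugin.outbound.<plugin>.<instance>".
pub fn outbound_topic(channel: &str) -> Option<String> {
    let (plugin, instance) = channel.split_once(':')?;
    if plugin.is_empty() || instance.is_empty() {
        return None;
    }
    Some(format!("plugin.outbound.{}.{}", plugin, instance))
}

/// Minimal JSON string escaping (quotes, backslashes, control characters).
fn json_escape(s: &str) -> String {
    let mut out = String::with_capacity(s.len());
    for c in s.chars() {
        match c {
            '"' => out.push_str("\\\""),
            '\\' => out.push_str("\\\\"),
            '\n' => out.push_str("\\n"),
            '\r' => out.push_str("\\r"),
            '\t' => out.push_str("\\t"),
            c if (c as u32) < 0x20 => out.push_str(&format!("\\u{:04x}", c as u32)),
            c => out.push(c),
        }
    }
    out
}

/// Build the `{ "kind": "text", "to": ..., "text": ... }` envelope.
pub fn text_envelope(recipient: &str, body: &str) -> String {
    format!(
        "{{\"kind\":\"text\",\"to\":\"{}\",\"text\":\"{}\"}}",
        json_escape(recipient),
        json_escape(body)
    )
}
```

A channel string without a ":" yields no topic, which lines up with the log-only behaviour described for entries that lack a usable channel.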

Failure mode: a publish error is logged via tracing::warn! but never fails fire(). The runner still advances state, so a stuck downstream channel (NATS down, plugin not subscribed) cannot deadlock the cron loop. Set both channel and recipient on cron_create to enable user-facing delivery — either missing → the dispatcher only logs.

Tool shapes

`cron_create`

{
  "cron": "*/5 * * * *",
  "prompt": "Check the build queue and report",
  "channel": "whatsapp:default",
  "recipient": "5511999999999@s.whatsapp.net",
  "recurring": true
}

Returns:

{
  "ok": true,
  "id": "01J...",
  "binding_id": "whatsapp:default",
  "cron": "*/5 * * * *",
  "recurring": true,
  "next_fire_at": 1700000300,
  "instructions": "Entry persisted. The runtime fires it on schedule. Use cron_list to inspect, cron_pause/cron_resume to temporarily stop/restart, and cron_delete to cancel."
}

One-shot retry policy

Process-level policy in config/runtime.yaml:

cron:
  one_shot_retry:
    max_retries: 3        # 0 => drop on first failure
    base_backoff_secs: 30 # attempt #1 delay
    max_backoff_secs: 1800
  tool_calls:
    enabled: false        # default: log-only for tool calls
    max_iterations: 6
    allowlist: []         # optional extra narrowing (glob syntax)

Attempt delays are exponential (base * 2^(attempt-1)), capped by max_backoff_secs.
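As a worked example of the formula, the sketch below computes the delay for a given attempt. `one_shot_delay_secs` is a hypothetical helper, and treating attempts beyond `max_retries` as "no further retry" is an assumption about how the budget is counted.

```rust
/// Delay before retry `attempt` (1-based), or None once the budget is spent.
/// delay = min(base * 2^(attempt - 1), max_backoff)
pub fn one_shot_delay_secs(
    attempt: u32,
    max_retries: u32,
    base_backoff_secs: u64,
    max_backoff_secs: u64,
) -> Option<u64> {
    if attempt == 0 || attempt > max_retries {
        return None;
    }
    // checked_shl guards against shifting past the width of u64.
    let factor = 1u64.checked_shl(attempt - 1).unwrap_or(u64::MAX);
    Some(base_backoff_secs.saturating_mul(factor).min(max_backoff_secs))
}
```

With the defaults shown (base 30, cap 1800, three retries) the delays are 30 s, 60 s, and 120 s; the 1800 s cap only comes into play from the seventh attempt onward, so it matters only when max_retries is raised.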

`cron_list`

{}

Returns the binding's full entry list, sorted by next_fire_at asc.

`cron_delete`

{ "id": "01J..." }

`cron_pause`

{ "id": "01J..." }

`cron_resume`

{ "id": "01J..." }

Cron expression semantics

Standard 5-field UTC: M H DoM Mon DoW. Examples:

Expression	Means
`*/5 * * * *`	Every 5 minutes
`0 9 * * *`	Daily 09:00 UTC
`30 14 28 2 *`	Feb 28 14:30 UTC (one-shot if `recurring: false`)
`0 */2 * * *`	Every 2 hours on the hour

The 60-second minimum is enforced by checking that two consecutive fires are ≥ 60 seconds apart. Sub-minute expressions like `*/30 * * * * *` (every 30 s, 6-field) are rejected.

Plan-mode classification

cron_create, cron_delete, cron_pause, and cron_resume → Schedule (mutating). Plan mode refuses with PlanModeRefusal.
cron_list → ReadOnly. Stays callable while plan mode is on.

References

PRIMARY: upstream agent CLI (schema, validation, 50-entry cap), plus the sibling CronListTool.ts / CronDeleteTool.ts / CronPauseTool.ts / CronResumeTool.ts.
SECONDARY: research/src/cron/schedule.ts (OpenClaw — croner JS lib + cache pattern, semantically compatible).
Plan + spec: proyecto/PHASES.md::79.7.

Poller V2 — Laravel-style dispatch

Phase 96 refactored the poller subsystem around a single principle: the runner is a dumb scheduler. It knows nothing about channels, credentials, outbound topics, or LLMs. Pollers reach the world through one egress trait, PollerHost. Everything else is the runner's business: schedule, lease, retry, breaker, cursor persistence, telemetry.

If you've used Laravel's queue, the cut is familiar: Queue::push takes an opaque job, Worker pops + invokes — the queue never introspects what the job does or what it returns.

Why

V1 (pre-Phase-96) leaked too much. The runner needed to know:

That outbound goes to a Channel enum (whatsapp / telegram / google).
That every poller might want a CredentialsBundle + per-channel AgentCredentialResolver.
That agent_turn specifically needs LlmRegistry + LlmConfig.
How to translate OutboundDelivery { channel, recipient, payload } into plugin.outbound.<channel>.<account_id> topic publishes (dispatch.rs, ~200 LOC).

Every new poller kind risked widening the runner's surface — and out-of-tree pollers couldn't escape the in-tree types at all. Phase 96 cut all of it.

Contract

#[async_trait]
pub trait Poller: Send + Sync + 'static {
    fn kind(&self) -> &'static str;
    async fn tick(&self, ctx: &PollContext) -> Result<TickAck, PollerError>;
}

pub struct PollContext {
    pub job_id: String,
    pub agent_id: String,
    pub kind: &'static str,
    pub config: Value,
    pub cursor: Option<Vec<u8>>,
    pub now: DateTime<Utc>,
    pub interval_hint: Duration,
    pub cancel: CancellationToken,
    pub host: Arc<dyn PollerHost>,
}

pub struct TickAck {
    pub next_cursor: Option<Vec<u8>>,
    pub next_interval_hint: Option<Duration>,
    pub metrics: Option<TickMetrics>,
}

PollerHost is the single egress:

#[async_trait]
pub trait PollerHost: Send + Sync + 'static {
    async fn broker_publish(&self, topic: String, payload: Vec<u8>) -> Result<(), HostError>;
    async fn credentials_get(&self, channel: String) -> Result<Value, HostError>;
    async fn log(&self, level: LogLevel, message: String, fields: Value) -> Result<(), HostError>;
    async fn metric_inc(&self, name: String, labels: Value) -> Result<(), HostError>;
    async fn llm_invoke(&self, request: LlmInvokeRequest) -> Result<LlmInvokeResponse, HostError>;
}

Two host implementations

Adapter	Use case	Crate
`InProcessHost`	In-tree builtins (`webhook_poll`, `agent_turn`)	`nexo-poller` (private use)
`BrokerPollerHost`	Subprocess plugin pollers	`nexo-microapp-sdk::poller`

InProcessHost calls directly into the daemon's AnyBroker, AgentCredentialResolver, LlmRegistry. BrokerPollerHost pipes the same trait methods through broker reverse-RPC on daemon.rpc.<plugin_id>. Both produce identical poller-visible behavior — the trait surface is the contract, not the impl.

Dispatch topology

                      ┌─────────────────────────────────────┐
                      │ daemon: PollerRunner                │
                      │  ├─ webhook_poll  (in-tree)         │
                      │  ├─ agent_turn    (in-tree)         │
                      │  └─ plugin proxies (Arc<dyn Poller>) │
                      └──────────────┬──────────────────────┘
                                     │
                ┌────────────────────┼────────────────────────┐
                │                    │                        │
       broker_publish         broker tick request         broker reverse-RPC
   plugin.outbound.X.Y    plugin.poller.<kind>.tick      daemon.rpc.<plugin_id>
                │                    │                        │
                ▼                    ▼                        │
      ┌──────────────┐     ┌──────────────┐                  │
      │ wa / tg / …  │     │ subprocess   │──────────────────┘
      │ channel      │     │ plugin       │
      │ plugin       │     │  (PollerHandler)
      └──────────────┘     └──────────────┘

The daemon's PluginPollerRouter owns (plugin_id, kinds, topic_prefix) mappings. For each (handle, kind) it wraps a PluginPollerProxy that implements Poller and forwards every tick through broker JSON-RPC. The runner's registry is homogeneous — it cannot tell which entries are in-tree and which forward over the wire.

`[plugin.poller]` manifest section

Seventh manifest section closing the Phase 81.33.b.real lineage (pairing → http → admin → metrics → dashboard → ✶ → poller).

[plugin.poller]
kinds                = ["google_calendar"]
broker_topic_prefix  = "plugin.poller.google_calendar"
lifecycle            = "long_lived"        # or "ephemeral"
max_concurrent_ticks = 1                   # 1..=64, default 1
tick_timeout_secs    = 60                  # 1..=3600, default 60

Boot-time validation:

kinds non-empty + unique within the plugin.
Each kind matches ^[a-z][a-z0-9_]+$.
broker_topic_prefix non-empty + no trailing dot + no spaces.
Cross-plugin uniqueness — two plugins declaring the same kind fail boot loud (PollerRouteRegistrationError::DuplicateKind).
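The per-plugin checks above can be sketched with the standard library alone. This is a minimal illustration, not the daemon's code: `PollerManifest`, `ManifestError`, and `validate` are hypothetical names, the kind pattern is hand-rolled instead of using a regex crate, and the two range checks are an assumption drawn from the comments in the manifest sample. Cross-plugin uniqueness is out of scope here because it needs the whole plugin set.

```rust
use std::collections::HashSet;

/// Hypothetical mirror of the `[plugin.poller]` fields.
pub struct PollerManifest {
    pub kinds: Vec<String>,
    pub broker_topic_prefix: String,
    pub max_concurrent_ticks: u32,
    pub tick_timeout_secs: u32,
}

/// Hypothetical error type; the real daemon may classify these differently.
#[derive(Debug, PartialEq)]
pub enum ManifestError {
    EmptyKinds,
    DuplicateKind(String),
    InvalidKind(String),
    InvalidPrefix,
    OutOfRange(&'static str),
}

/// Hand-rolled equivalent of `^[a-z][a-z0-9_]+$` (at least two characters).
pub fn kind_is_valid(kind: &str) -> bool {
    let mut chars = kind.chars();
    let first_ok = matches!(chars.next(), Some('a'..='z'));
    let rest = chars.as_str();
    first_ok
        && !rest.is_empty()
        && rest.chars().all(|c| matches!(c, 'a'..='z' | '0'..='9' | '_'))
}

pub fn validate(m: &PollerManifest) -> Result<(), ManifestError> {
    if m.kinds.is_empty() {
        return Err(ManifestError::EmptyKinds);
    }
    // Each kind must match the pattern and appear only once in this plugin.
    let mut seen = HashSet::new();
    for kind in &m.kinds {
        if !kind_is_valid(kind) {
            return Err(ManifestError::InvalidKind(kind.clone()));
        }
        if !seen.insert(kind.as_str()) {
            return Err(ManifestError::DuplicateKind(kind.clone()));
        }
    }
    // Prefix: non-empty, no trailing dot, no spaces.
    let prefix = &m.broker_topic_prefix;
    if prefix.is_empty() || prefix.ends_with('.') || prefix.contains(' ') {
        return Err(ManifestError::InvalidPrefix);
    }
    // Assumed ranges, taken from the comments in the manifest sample.
    if !(1..=64).contains(&m.max_concurrent_ticks) {
        return Err(ManifestError::OutOfRange("max_concurrent_ticks"));
    }
    if !(1..=3600).contains(&m.tick_timeout_secs) {
        return Err(ManifestError::OutOfRange("tick_timeout_secs"));
    }
    Ok(())
}
```

Note that the pattern's `+` quantifier makes a single-letter kind such as `a` invalid.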

Lifecycle

long_lived (default): the daemon spawns the plugin subprocess once at boot. The subprocess subscribes to its tick topic and replies through the message's reply_to. Best for pollers with warm state (OAuth tokens, HTTP connection pools, parsed feeds).
ephemeral: the manifest accepts the value, but the daemon currently rejects it with a config error. Tracked as a Phase 96 follow-up: the spawn-per-tick path requires new stdio JSON-RPC primitives (no broker subscription, direct stdin/stdout dispatch + SIGTERM on reply).

Reverse-RPC

Subprocess pollers call back to the daemon via broker request-reply on daemon.rpc.<plugin_id>. Methods:

Method	Daemon response
`credentials_get(channel, agent_id)`	`{ account_id, … }` plus typed Google fields (`client_id_path`, `token_path`) when channel == google
`log(level, message, fields)`	Forwards to daemon tracing
`metric_inc(name, labels)`	Forwards to daemon tracing (Prometheus aggregator is a follow-up)
`llm_invoke(request)`	Proxies to `LlmRegistry::build(...)::chat(...)`, returns `{ content, model_id, usage }`

Error envelopes use JSON-RPC codes that mirror PollerError classification: -32001 transient (retry with backoff), -32002 permanent (auto-pause job until agent pollers reset <id>), -32602 config (bad config — operator fixes YAML), -32601 method not found.
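A small sketch of how the code mapping might look on either side of the wire. The four numeric codes come from the paragraph above; the `ErrorClass` enum, its variant names, and the `should_retry` helper are hypothetical and only illustrate the classification.

```rust
/// Hypothetical classification enum; variant names are assumptions.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum ErrorClass {
    Transient,
    Permanent,
    Config,
    MethodNotFound,
}

impl ErrorClass {
    /// JSON-RPC error code carried in the envelope.
    pub fn code(self) -> i32 {
        match self {
            ErrorClass::Transient => -32001,
            ErrorClass::Permanent => -32002,
            ErrorClass::Config => -32602,
            ErrorClass::MethodNotFound => -32601,
        }
    }

    /// Inverse mapping; unknown codes yield `None`.
    pub fn from_code(code: i32) -> Option<Self> {
        match code {
            -32001 => Some(ErrorClass::Transient),
            -32002 => Some(ErrorClass::Permanent),
            -32602 => Some(ErrorClass::Config),
            -32601 => Some(ErrorClass::MethodNotFound),
            _ => None,
        }
    }

    /// Only transient failures are retried with backoff.
    pub fn should_retry(self) -> bool {
        matches!(self, ErrorClass::Transient)
    }
}
```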

What this unlocks

New poller kinds (Jira, Linear, Stripe, custom internal APIs) ship as standalone crates published to crates.io. No fork of nexo-poller, no PR to the framework.
The framework's nexo-poller dep tree no longer carries nexo-plugin-google — gmail + google_calendar moved out to nexo-rs-poller-gmail + nexo-rs-poller-google-calendar standalone repos. Closes the Phase 94 close-out follow-up.
LLM-using pollers (agent_turn + future custom prompts) no longer need llm_registry / llm_config fields baked into PollContext. Any subprocess plugin can call host.llm_invoke and get the daemon's configured provider stack for free.

References

Phase 81.33.b.real lineage: Stage 1 pairing → 2 http → 4 admin → 5 metrics → 6 dashboard → ✶ → Phase 96 poller. Same RwLock interior mutability pattern, same broker-RPC forwarder shape, same "construct empty at boot, populate after wire" rule.
OpenClaw cron service (research/src/cron/service/locked.ts:11-21, service.restart-catchup.test.ts:79-116) informed lease + restart semantics.
claude-code-leak MCP elicitation handler (src/services/mcp/elicitationHandler.ts:77-106) shaped the reverse-RPC pattern: server (here, subprocess) sends a request UP to the client (here, daemon), client responds via the same channel.

`ListMcpResources` + `ReadMcpResource` (Phase 79.11 — MVP)

Router-shaped MCP tools: a single discovery surface for agents talking to many MCP servers, instead of registering N×2 per-server tools (which still ship via the Phase 12.5 catalog).

Pattern derived from prior CLI work + ReadMcpResourceTool/. The upstream CLI ships these as the LLM-driven introspection layer over the per-server McpClient trait.

Diff vs Phase 12.5 per-server tools

Aspect	Phase 12.5 per-server	Phase 79.11 router
Tool surface	`mcp__<server>__list_resources` per server	Single `ListMcpResources { server: Option<String> }`
Token cost	O(2 × N servers) tools in prompt	2 fixed tools
Use when	1–2 MCP servers connected	3+ MCP servers, surface gets noisy

Both surfaces coexist — operators can keep the per-server tools for the agents that need them and let the router-shaped tools handle wide deployments.

Tool shapes

`ListMcpResources`

{ "server": "github", "max": 100 }

server is optional — omit to enumerate every connected server. max overrides the default 200-entry cap. Returns:

{
  "resources": [
    { "server": "github", "uri": "...", "name": "...", "description": "...", "mime_type": "..." }
  ],
  "truncated": false,
  "errors": [],
  "count": 12
}

errors carries per-server failures so a single bad server doesn't drop the whole call. truncated: true flags that the cap was hit before all resources were enumerated.
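The cap, `truncated`, and `errors` semantics can be sketched as a plain aggregation loop. This is an assumption-laden illustration: `Resource`, `ListOutcome`, and `aggregate` are hypothetical names, each server's listing is modelled as an already-collected `Result`, and only the `server` and `uri` fields are kept.

```rust
/// Hypothetical resource row; field names follow the JSON shape above.
#[derive(Debug, Clone, PartialEq)]
pub struct Resource {
    pub server: String,
    pub uri: String,
}

#[derive(Debug, Default, PartialEq)]
pub struct ListOutcome {
    pub resources: Vec<Resource>,
    pub truncated: bool,
    /// (server, message) pairs for servers that failed to list.
    pub errors: Vec<(String, String)>,
}

/// Default cap from the text above.
pub const DEFAULT_MAX: usize = 200;

pub fn aggregate(
    per_server: Vec<(String, Result<Vec<String>, String>)>,
    max: Option<usize>,
) -> ListOutcome {
    let cap = max.unwrap_or(DEFAULT_MAX);
    let mut out = ListOutcome::default();
    for (server, listing) in per_server {
        match listing {
            // A failing server is recorded and does not abort the call.
            Err(message) => out.errors.push((server, message)),
            Ok(uris) => {
                for uri in uris {
                    if out.resources.len() == cap {
                        // More resources exist than the cap allows.
                        out.truncated = true;
                        break;
                    }
                    out.resources.push(Resource { server: server.clone(), uri });
                }
            }
        }
    }
    out
}
```

The `count` field in the response would simply be `resources.len()` under this model.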

`ReadMcpResource`

{ "server": "github", "uri": "github://owner/repo/file", "max_bytes": 65536 }

server + uri required. max_bytes overrides the default 256 KiB cap. Returns:

{
  "server": "github",
  "uri": "...",
  "contents": [
    { "uri": "...", "mime_type": "text/plain", "text": "...", "blob_length": null, "blob": null }
  ],
  "truncated": false
}

Truncation respects UTF-8 char boundaries — never splits a multi-byte sequence. Binary blob bodies are returned verbatim (base64) without mid-string truncation; the blob_length field is reported so the model can decide whether to ask for a smaller slice.
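Boundary-safe truncation is short to write with the standard library. The helper name `truncate_utf8` is hypothetical; the sketch only shows the technique of backing off to the nearest char boundary and makes no claim about how the tool itself is implemented.

```rust
/// Returns the longest prefix of `text` that fits in `max_bytes` without
/// splitting a multi-byte sequence, plus a flag saying whether anything was cut.
pub fn truncate_utf8(text: &str, max_bytes: usize) -> (&str, bool) {
    if text.len() <= max_bytes {
        return (text, false);
    }
    let mut end = max_bytes;
    // Index 0 is always a char boundary, so this loop terminates.
    while !text.is_char_boundary(end) {
        end -= 1;
    }
    (&text[..end], true)
}
```

For example, `truncate_utf8("héllo", 2)` backs off to `"h"` because byte 2 falls inside the two-byte `é`.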

Plan-mode classification

Both tools are ReadOnly — they query but never mutate, so they stay callable while plan mode is on.

Out of scope (deferred)

McpAuth. The McpClient trait does not expose a refresh hook (refresh is currently transparent inside the client). Tracked in FOLLOWUPS.md::Phase 79.11 — once the trait grows the method (lift from upstream agent CLI), a third tool wires into the same module.
Operator overrides for the caps. Today the caps are module constants; a follow-up can read them from the MCP YAML (mcp.list_max_resources, mcp.resource_max_bytes).

References

SECONDARY: OpenClaw research/ — no equivalent router-shaped MCP tool.
Plan + spec: proyecto/PHASES.md::79.11.

ConfigTool — gated self-config (Phase 79.10)

The Config tool lets an agent read and propose changes to its own YAML configuration from inside a chat-driven turn. The flow is intentionally two-step (propose, then apply) so a remote operator can approve or reject the change with a regular message on the same channel that originated the proposal. There is no host-side 'ask' permission prompt of the kind the upstream agent CLI shows.

Cargo feature gate

ConfigTool ships behind the config-self-edit Cargo feature (off by default). Build with:

cargo build --workspace --features config-self-edit

A binary distributed without the feature cannot expose the tool even when an operator sets config_tool.self_edit: true in YAML — the gate is a hard ship-control until security review.

Per-agent YAML

agents:
  - id: cody
    config_tool:
      self_edit: true            # default false; opt-in per agent
      allowed_paths:              # empty = every SUPPORTED_SETTINGS key
        - "model.model"
        - "language"
      approval_timeout_secs: 86400  # 24 h

allowed_paths is intersected with SUPPORTED_SETTINGS (12 keys at MVP: model.{provider,model}, language, system_prompt, heartbeat.{enabled,interval_secs}, link_understanding.enabled, web_search.enabled, lsp.{enabled,languages,idle_teardown_secs,prewarm}) and then filtered by the hard-coded denylist.

Three operations

{ "op": "read",    "key": "model.model" }
{ "op": "propose", "key": "model.model", "value": "claude-opus-4-7", "justification": "operator asked" }
{ "op": "apply",   "patch_id": "01J7HVK..." }

op: read is read-only and passes plan-mode gating. propose and apply are mutating — plan-mode refuses them.

Hard-coded denylist

crates/setup/src/capabilities.rs::CONFIG_SELF_EDIT_DENYLIST holds 13 globs that the tool MUST NEVER touch:

Glob	Intent
`*_token`, `*_secret`, `*_password`, `*_key`	credential-shaped suffixes
`pairing.*`	pairing internals — touching these revokes the operator's grip
`capabilities.*`	cannot widen own capabilities
`mcp.servers.*.auth.*`, `mcp.servers.*.command`	MCP auth + spawn args (running arbitrary binaries via config self-edit is game-over)
`binding.*.role`	cannot self-promote to coordinator
`binding.*.plan_mode.*`	cannot drop plan-mode guardrails
`remote_triggers[].url`, `remote_triggers[].secret_env`	outbound webhook URLs + signing keys
`cron.user_max_entries`	operator-only
`agent_registry.store.*`	changing the store under a running goal is unsafe

Source-of-truth lives in code, not YAML — a model that proposes a patch widening the denylist cannot succeed because that's a code change requiring review. Validation runs at BOTH propose (early reject) and apply (defense-in-depth: the staging file may have been edited externally between propose and apply).
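A sketch of the key gate follows, under several stated assumptions: `*` in a glob is taken to match any run of characters (dots included), the `SUPPORTED` and `DENYLIST` slices are illustrative subsets rather than the real constants, and the check order (supported, then allowed, then denylist) is one plausible reading of the text. `glob_match`, `check_key`, and `GateError` are hypothetical names.

```rust
/// Illustrative subset of SUPPORTED_SETTINGS.
const SUPPORTED: &[&str] = &["model.provider", "model.model", "language", "system_prompt"];
/// Illustrative subset of the denylist globs.
const DENYLIST: &[&str] = &["*_token", "*_secret", "pairing.*", "mcp.servers.*.command"];

#[derive(Debug, PartialEq)]
pub enum GateError {
    UnknownKey,
    PathNotAllowed,
    ForbiddenKey(&'static str),
}

/// Wildcard match where `*` matches any (possibly empty) run of bytes.
pub fn glob_match(pattern: &str, text: &str) -> bool {
    let (p, t) = (pattern.as_bytes(), text.as_bytes());
    let (mut pi, mut ti) = (0usize, 0usize);
    // (pattern index after the last '*', text index it was tried at)
    let mut backtrack: Option<(usize, usize)> = None;
    while ti < t.len() {
        if pi < p.len() && p[pi] == b'*' {
            backtrack = Some((pi + 1, ti));
            pi += 1;
        } else if pi < p.len() && p[pi] == t[ti] {
            pi += 1;
            ti += 1;
        } else if let Some((bp, bt)) = backtrack {
            // Let the last '*' swallow one more byte and retry.
            pi = bp;
            ti = bt + 1;
            backtrack = Some((bp, bt + 1));
        } else {
            return false;
        }
    }
    while pi < p.len() && p[pi] == b'*' {
        pi += 1;
    }
    pi == p.len()
}

/// An empty `allowed_paths` means every supported key is allowed.
pub fn check_key(key: &str, allowed_paths: &[&str]) -> Result<(), GateError> {
    if !SUPPORTED.iter().any(|s| *s == key) {
        return Err(GateError::UnknownKey);
    }
    if !allowed_paths.is_empty() && !allowed_paths.iter().any(|p| *p == key) {
        return Err(GateError::PathNotAllowed);
    }
    if let Some(glob) = DENYLIST.iter().copied().find(|g| glob_match(g, key)) {
        return Err(GateError::ForbiddenKey(glob));
    }
    Ok(())
}
```

Running the same `check_key` at both propose and apply time is what gives the defense-in-depth behaviour described above.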

Approval flow

Model emits Config { op: "propose", key, value, justification }.
Handler validates the triple gate (capability on, key in SUPPORTED_SETTINGS ∩ allowed_paths, key NOT in denylist), runs the per-key validator, generates a patch_id, persists the proposal under .nexo/config-proposals/<patch_id>.yaml, parks an approval oneshot with the ApprovalCorrelator, writes a proposed row to the audit store, fires notify_origin on the binding's channel with:
```
Operator: reply [config-approve patch_id=<id>]
or [config-reject patch_id=<id> reason=...] within 24 h.
```
Operator replies with the bracketed command on the SAME (channel, account_id) that originated the proposal. Cross-binding messages are rejected (the entry stays parked).
Model emits Config { op: "apply", patch_id }. Handler awaits the correlator decision (Approved / Rejected / Expired):
- Approved: snapshot agents.yaml, apply the patch, trigger Phase 18 hot-reload. On reload-Err: restore the snapshot + record rolled_back. On Ok: cleanup staging + record applied.
- Rejected: record rejected with the reason.
- Expired (24 h elapsed): record expired.
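The apply step reduces to a small mapping from the correlator decision (and, for approvals, the reload result) to the audit status that gets recorded. In this sketch the `Decision` and `AuditStatus` enums and the `apply_outcome` function are hypothetical, and the closure stands in for "snapshot, write the patch, trigger hot-reload".

```rust
/// Hypothetical correlator decision.
#[derive(Debug, PartialEq)]
pub enum Decision {
    Approved,
    Rejected { reason: String },
    Expired,
}

/// Status values as listed in the audit schema.
#[derive(Debug, PartialEq)]
pub enum AuditStatus {
    Applied,
    RolledBack,
    Rejected,
    Expired,
}

/// `reload` is only invoked for approved patches; an error from it means the
/// snapshot is restored and the patch is recorded as rolled back.
pub fn apply_outcome(
    decision: Decision,
    reload: impl FnOnce() -> Result<(), String>,
) -> AuditStatus {
    match decision {
        Decision::Approved => match reload() {
            Ok(()) => AuditStatus::Applied,
            Err(_) => AuditStatus::RolledBack,
        },
        Decision::Rejected { .. } => AuditStatus::Rejected,
        Decision::Expired => AuditStatus::Expired,
    }
}
```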

Audit log

<state_dir>/config_changes.db stores every state transition (idempotent on (patch_id, status)). Read-only LLM tool config_changes_tail { n: 20 } (always available, regardless of the Cargo feature) returns a markdown table suitable for the agent's post-mortem.

Schema:

patch_id    TEXT
status      TEXT  -- proposed | applied | rolled_back | rejected | expired
binding_id  TEXT
agent_id    TEXT
op          TEXT  -- propose | apply | reject | expire
key         TEXT
value       TEXT  -- pre-redacted: secret-suffix paths render as "<REDACTED>"
error       TEXT
created_at  INTEGER
applied_at  INTEGER
PRIMARY KEY (patch_id, status)
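Two properties of this schema, idempotency on (patch_id, status) and pre-redaction of secret-shaped keys, can be modelled in memory. The sketch below is not the SQLite store: `AuditLog`, `record`, and `redact` are hypothetical names, and the suffix list is the one quoted elsewhere in this document.

```rust
use std::collections::BTreeMap;

const SECRET_SUFFIXES: &[&str] = &["_token", "_secret", "_password", "_key"];

/// Values for secret-suffix keys are replaced before they are stored.
pub fn redact(key: &str, value: &str) -> String {
    if SECRET_SUFFIXES.iter().any(|s| key.ends_with(*s)) {
        "<REDACTED>".to_string()
    } else {
        value.to_string()
    }
}

/// In-memory stand-in for the audit table, keyed like its primary key.
#[derive(Default)]
pub struct AuditLog {
    rows: BTreeMap<(String, String), String>,
}

impl AuditLog {
    /// Returns true if a new row was written, false if the
    /// (patch_id, status) pair was already recorded.
    pub fn record(&mut self, patch_id: &str, status: &str, key: &str, value: &str) -> bool {
        let pk = (patch_id.to_string(), status.to_string());
        if self.rows.contains_key(&pk) {
            return false;
        }
        self.rows.insert(pk, redact(key, value));
        true
    }

    pub fn len(&self) -> usize {
        self.rows.len()
    }
}
```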

Plan-mode behaviour

Config is in MUTATING_TOOLS but the dispatcher inspects args.op at call time. op: "read" short-circuits the gate; op: "propose" and op: "apply" refuse with a PlanModeRefusal { tool_kind: Config }.

config_changes_tail is always in READ_ONLY_TOOLS — never refuses.

Error kinds

The tool returns { "ok": false, "error", "kind", ... } on failure. Stable kind discriminators:

UnknownKey — key not in SUPPORTED_SETTINGS.
PathNotAllowed — key not in this agent's allowed_paths.
ForbiddenKey — denylist hit (returns the matched glob).
ValidationFailed — per-key validator rejected the value.
NoPending — apply called for a patch_id that was never proposed or already consumed.
Rejected — operator replied [config-reject ...].
Expired — 24 h elapsed without approval.
RolledBack — reload rejected the post-apply config; the snapshot was restored.
Yaml / Io / InternalError — fall-through.

References

PRIMARY: upstream agent CLI (ConfigTool.ts, supportedSettings.ts, prompt.ts, constants.ts).
Spec: proyecto/PHASES.md::79.10.

Team tools (Phase 79.6)

Five LLM tools that let an agent form a named team of up to 8 sub-agents that operate in parallel, share a Phase 14 TaskFlow task list, and communicate via broker DMs. Distinct from the existing 1-to-1 delegate (Phase 8) which is request/reply between named agents — teams are coordinated multi-member work rooted under one lead.

Five tools

Tool	Op kind	Plan-mode
`TeamCreate`	mutating	Delegate (refused)
`TeamDelete`	mutating	Delegate (refused)
`TeamSendMessage`	mutating	Delegate (refused)
`TeamList`	read-only	always callable
`TeamStatus`	read-only	always callable

MUTATING_TOOLS and READ_ONLY_TOOLS in crates/core/src/plan_mode.rs are the source of truth.

Per-agent YAML

agents:
  - id: cody
    team:
      enabled: true            # default false; opt-in
      max_members: 8           # clamped at 8
      max_concurrent: 4        # clamped at 4
      idle_timeout_secs: 3600  # 1 h stale-team threshold
      worktree_per_member: false  # default for TeamCreate;
                                   # per-call override accepted

When enabled: false (the default) the 5 tools are not registered for this agent — the model never sees them advertised.

SQL store

Three tables in <state_dir>/teams.db (idempotent CREATE):

CREATE TABLE teams (
    team_id              TEXT PRIMARY KEY,
    display_name         TEXT NOT NULL,
    description          TEXT,
    lead_agent_id        TEXT NOT NULL,
    lead_goal_id         TEXT NOT NULL,
    flow_id              TEXT NOT NULL,
    worktree_per_member  INTEGER NOT NULL,
    created_at           INTEGER NOT NULL,
    deleted_at           INTEGER,
    last_active_at       INTEGER NOT NULL
);

CREATE TABLE team_members (
    team_id        TEXT NOT NULL,
    name           TEXT NOT NULL,        -- human-readable handle
    agent_id       TEXT NOT NULL,        -- internal UUID
    agent_type     TEXT,
    model          TEXT,
    goal_id        TEXT NOT NULL,
    worktree_path  TEXT,
    joined_at      INTEGER NOT NULL,
    is_active      INTEGER NOT NULL,
    last_active_at INTEGER NOT NULL,
    PRIMARY KEY (team_id, name)
);

CREATE TABLE team_events (
    event_id          TEXT PRIMARY KEY,
    team_id           TEXT NOT NULL,
    kind              TEXT NOT NULL,
    actor_member_name TEXT,
    payload_json      TEXT NOT NULL,
    created_at        INTEGER NOT NULL
);

team_id is the sanitised name (lowercase + non-alnum → -). Composite PK on (team_id, name) enforces unique member names within a team. deleted_at IS NOT NULL ⇒ soft-deleted.
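The sanitisation rule is a one-liner in Rust. This sketch assumes "alnum" means ASCII alphanumerics and that each offending character maps to its own `-` (no collapsing of runs); the function name is hypothetical.

```rust
/// Lowercases ASCII letters and digits, replaces everything else with '-'.
pub fn sanitize_team_id(name: &str) -> String {
    name.chars()
        .map(|c| {
            if c.is_ascii_alphanumeric() {
                c.to_ascii_lowercase()
            } else {
                '-'
            }
        })
        .collect()
}
```

Under these assumptions two display names that differ only in case or punctuation collapse to the same `team_id`, which is one way a `TeamNameTaken` error can arise.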

Broker topics

Topic	Direction	Payload
`team.<team_id>.dm.<member_name>`	point-to-point	`DmFrame` JSON
`team.<team_id>.broadcast`	fan-out (lead only)	`DmFrame` JSON

pub struct DmFrame {
    pub team_id: String,
    pub to: String,         // member_name or "broadcast"
    pub from: String,
    pub body: serde_json::Value,
    pub correlation_id: Option<String>,
}

The router subscribes once per process to team.>; per-team in-memory tokio::sync::broadcast::Sender channels deliver to member runtimes. When a member's goal is Pending (idle between turns), Phase 67's wake-on hook flips it to Running on the first DM.
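Choosing the topic for a frame follows directly from the table: `to` is either a member name or the literal `"broadcast"`, and broadcast is reserved for the lead. The function and error names below are hypothetical, and the sender's role is passed in as a plain boolean for brevity.

```rust
#[derive(Debug, PartialEq)]
pub enum RouteError {
    OnlyLeadCanBroadcast,
}

/// Builds the broker topic for a DM frame addressed to `to`.
pub fn topic_for(team_id: &str, to: &str, sender_is_lead: bool) -> Result<String, RouteError> {
    if to == "broadcast" {
        if !sender_is_lead {
            return Err(RouteError::OnlyLeadCanBroadcast);
        }
        Ok(format!("team.{team_id}.broadcast"))
    } else {
        Ok(format!("team.{team_id}.dm.{to}"))
    }
}
```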

Lifecycle

+-----------+     TeamCreate      +-----------+
| no team   |  ---------------->  | team      |
+-----------+                     | (1 lead)  |
                                  +-----+-----+
                                        |
                                        | (operator/79.6.b adds members)
                                        v
                                  +-----------+
                                  | team      |
                                  | (N>=2)    |
                                  +-----+-----+
                                        |
              +-------------------------+--------------------------+
              |                         |                          |
              v                         v                          v
      TeamSendMessage          TeamSendMessage             TeamDelete
      to: <member>             to: "broadcast"             (zero running)
      DM                       fan-out                     soft delete
      ↻ wake idle              ↻ wake all                  + drop router

Caps

TEAM_MAX_MEMBERS = 8 (incl. lead).
TEAM_MAX_CONCURRENT_DEFAULT = 4 per agent.
TEAM_NAME_MAX_LEN = 64, MEMBER_NAME_MAX_LEN = 32.
TEAM_IDLE_TIMEOUT_SECS = 3600 — reaper marks teams stale.
SHUTDOWN_DRAIN_SECS = 30 — TeamDelete drain budget.
DM_BODY_MAX_BYTES = 64 * 1024 per TeamSendMessage.

Per-agent YAML can lower max_members / max_concurrent but never raise above the constants.
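The "lower but never raise" rule amounts to a clamp against the hard constant. In this sketch the constants carry the values listed above, while `effective_cap`, `check_join`, `check_body`, and the trimmed `TeamError` enum are hypothetical helpers.

```rust
pub const TEAM_MAX_MEMBERS: u32 = 8;
pub const TEAM_MAX_CONCURRENT_DEFAULT: u32 = 4;
pub const DM_BODY_MAX_BYTES: usize = 64 * 1024;

/// YAML may lower a cap; values above the constant are clamped down to it.
pub fn effective_cap(configured: Option<u32>, hard_cap: u32) -> u32 {
    configured.map_or(hard_cap, |v| v.min(hard_cap))
}

#[derive(Debug, PartialEq)]
pub enum TeamError {
    TeamFull { count: u32, cap: u32 },
    BodyTooLarge { actual: usize, max: usize },
}

/// Refuses a new member once the team already holds `cap` members.
pub fn check_join(count: u32, cap: u32) -> Result<(), TeamError> {
    if count >= cap {
        Err(TeamError::TeamFull { count, cap })
    } else {
        Ok(())
    }
}

/// Refuses message bodies above the per-message byte budget.
pub fn check_body(body: &[u8]) -> Result<(), TeamError> {
    if body.len() > DM_BODY_MAX_BYTES {
        Err(TeamError::BodyTooLarge { actual: body.len(), max: DM_BODY_MAX_BYTES })
    } else {
        Ok(())
    }
}
```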

Error kinds

{ ok: false, kind: "...", error: "..." } shape:

TeamingDisabled — team.enabled = false for this agent.
InvalidName / InvalidMemberName.
TeamNameTaken { existing_team_id }.
TeamNotFound / TeamDeleted.
MemberNotFound.
TeamFull { count, cap }.
ConcurrentCapExceeded { count, cap }.
BodyTooLarge { actual, max }.
NotLeader — only the team lead can TeamDelete / broadcast.
OnlyLeadCanBroadcast — non-lead member tried to: "broadcast".
NotMember — caller is neither the lead nor a current member; TeamStatus refuses without confirming team existence.
BlockedByActiveMembers { names: [...] } — TeamDelete while members are still Running.
TeammateCannotSpawnTeammate — caller's AgentContext has team_member_name = Some(...). Single-level fan-out only.
Wire — missing required arg or malformed shape.

Plan-mode behaviour

TeamCreate, TeamDelete, TeamSendMessage are in MUTATING_TOOLS. Under an active plan-mode they refuse with PlanModeRefusal { tool_kind: Delegate }.
TeamList, TeamStatus are in READ_ONLY_TOOLS. Always callable.

Comparison vs `delegate` (Phase 8)

	`delegate` (Phase 8)	Team* (Phase 79.6)
Topology	1 → 1	1 lead + N parallel members
Lifecycle	request/reply	persistent team + audit log
Storage	none (broker only)	SQLite (3 tables)
Comms	`agent.route.{target}`	`team.{id}.dm.{name}` + `team.{id}.broadcast`
Idle/wake	n/a	goal `Pending` ↔ `Running`
Capability gate	`allowed_delegates`	`team.enabled` + caps
Best for	quick request to known peer	research fan-out, multi-source verify, full-stack work

Deferred to 79.6.b

Spawn-as-teammate via Phase 67 dispatch (AgentContext.team_id injection is wired; the actual sub-goal spawn that registers in team_members is the missing piece).
Phase 14 TaskFlow link (flow_id is currently a placeholder equal to team_id).
nexo team list / status / drop operator CLI.
Force-kill drain in TeamDelete (today blocks; 79.6.b cancels in-flight goals after SHUTDOWN_DRAIN_SECS).
MCP server mode (run_mcp_server) exposes the 5 tools — part of the 79.M MCP exposure parity sweep.

References

Spec: proyecto/PHASES.md::79.6.

MCP server exposable catalog (Phase 79.M)

nexo mcp-server advertises a curated subset of the runtime tool registry to external MCP clients (Claude Desktop, Cursor, Zed, etc.). The subset is defined in code by a static slice; operators pick which entries to enable via mcp_server.expose_tools.

Source-of-truth: `EXPOSABLE_TOOLS`

// crates/config/src/types/mcp_exposable.rs

pub static EXPOSABLE_TOOLS: &[ExposableToolEntry] = &[
    // ...
    ExposableToolEntry {
        name: "cron_list",
        tier: SecurityTier::ReadOnly,
        boot_kind: BootKind::Always,
        feature_gate: None,
    },
    // ...
];

Adding a tool to this slice does not expose it — the operator must still list the name in mcp_server.expose_tools. The slice controls what is legal to expose; YAML controls what is actually exposed.

YAML

# config/mcp_server.yaml
mcp_server:
  enabled: true
  name: "kate"
  expose_tools:
    - cron_list
    - cron_create
    - ListMcpResources
    - ReadMcpResource
    - config_changes_tail
    - web_search
    - web_fetch
    - EnterPlanMode
    - ExitPlanMode
    - ToolSearch
    - TodoWrite
    - NotebookEdit
  expose_denied_tools:
    - Heartbeat
  denied_tools_profile:
    enabled: true
    require_auth: true
    require_delegate_allowlist: true
    require_remote_trigger_targets: true
    allow:
      heartbeat: true
      delegate: false
      remote_trigger: false

Three-bucket policy

Bucket	`BootKind`	Behaviour
Expose	`Always`	Boot helper constructs the tool from `McpServerBootContext`; missing handle → labelled skip.
Expose (gated)	`FeatureGated`	Skipped unless the named Cargo feature is enabled. `Config` is the only entry today.
Deny by default	`DeniedByPolicy { reason }`	Dispatcher denies by default (`Heartbeat`, `delegate`, `RemoteTrigger`). `run_mcp_server` can optionally override selected entries via `mcp_server.expose_denied_tools` plus extra safety checks.
Defer	`Deferred { phase, reason }`	Wiring postponed to a follow-up sub-phase. `Lsp`, `Team*`.

Boot dispatch flow

expose_tools (YAML) ┐
                    ├──► EXPOSABLE_TOOLS lookup
                    │     │
                    │     ├──► Always       → boot helper → Registered | SkippedInfraMissing
                    │     ├──► FeatureGated → cfg!(feature) check → Registered | SkippedFeatureGated
                    │     ├──► DeniedByPolicy → SkippedDenied (or override path in run_mcp_server)
                    │     └──► Deferred    → SkippedDeferred
                    └──► (typo / removed)  → UnknownName

Every outcome lands in two telemetry counters:

mcp_server_tool_registered_total{name, tier}
mcp_server_tool_skipped_total{name, reason}

reason ∈ {denied_by_policy, deferred, feature_gate_off, infra_missing, unknown_name}.
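The dispatch flow can be sketched as a lookup followed by a match on the boot kind. Everything here is simplified and hypothetical: the four-entry `CATALOG` is an illustrative subset, `BootKind` drops its payload fields, feature and infra availability are passed as booleans, and the denied-override path in `run_mcp_server` is left out. The reason strings are the labels listed above.

```rust
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum BootKind {
    Always,
    FeatureGated,
    DeniedByPolicy,
    Deferred,
}

#[derive(Debug, PartialEq)]
pub enum BootOutcome {
    Registered,
    /// Carries the `reason` label used by the skipped counter.
    Skipped(&'static str),
}

/// Illustrative subset of the exposable catalog.
const CATALOG: &[(&str, BootKind)] = &[
    ("cron_list", BootKind::Always),
    ("Config", BootKind::FeatureGated),
    ("Heartbeat", BootKind::DeniedByPolicy),
    ("Lsp", BootKind::Deferred),
];

pub fn lookup(name: &str) -> Option<BootKind> {
    CATALOG.iter().find(|(n, _)| *n == name).map(|(_, kind)| *kind)
}

pub fn dispatch(name: &str, feature_on: bool, infra_present: bool) -> BootOutcome {
    match lookup(name) {
        // Typo or removed tool name in expose_tools.
        None => BootOutcome::Skipped("unknown_name"),
        Some(BootKind::Always) if infra_present => BootOutcome::Registered,
        Some(BootKind::Always) => BootOutcome::Skipped("infra_missing"),
        Some(BootKind::FeatureGated) if feature_on => BootOutcome::Registered,
        Some(BootKind::FeatureGated) => BootOutcome::Skipped("feature_gate_off"),
        Some(BootKind::DeniedByPolicy) => BootOutcome::Skipped("denied_by_policy"),
        Some(BootKind::Deferred) => BootOutcome::Skipped("deferred"),
    }
}
```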

Boot context

// crates/core/src/agent/mcp_server_bridge/context.rs

pub struct McpServerBootContext {
    pub agent_id: String,
    pub broker: AnyBroker,
    pub cron_store: Option<Arc<dyn CronStore>>,
    pub mcp_runtime: Option<Arc<SessionMcpRuntime>>,
    pub config_changes_store: Option<Arc<dyn ConfigChangesStore>>,
    pub web_search_router: Option<Arc<WebSearchRouter>>,
    pub link_extractor: Option<Arc<LinkExtractor>>,
    pub agent_context: Arc<AgentContext>,
}

run_mcp_server builds the context best-effort: it tries to open ./data/cron.db, ./data/config_changes.db, and the env-driven web-search providers when the corresponding entry is in expose_tools. If a handle cannot be constructed the relevant tool is skipped with a labelled warn line; the server still boots.

Safe profile for denied overrides

Denied-by-default tools now require two explicit opt-ins:

Tool name in mcp_server.expose_denied_tools.
Matching allow-bit in mcp_server.denied_tools_profile.allow.* with denied_tools_profile.enabled: true.

Default profile is fail-closed (enabled: false, all allow bits false).

Additional hardening gates in the profile:

require_auth (default true): requires mcp_server.auth_token_env or mcp_server.http.auth.
require_delegate_allowlist (default true): delegate only boots when agents.<id>.allowed_delegates is non-empty and not ["*"].
require_remote_trigger_targets (default true): RemoteTrigger only boots when agents.<id>.remote_triggers has at least one entry.

Adding a new tool

Implement the tool somewhere in nexo-core::agent::* so it has a tool_def() -> ToolDef and a ToolHandler impl.
Add an ExposableToolEntry to EXPOSABLE_TOOLS with the appropriate tier + boot_kind.
Add a match arm in boot_always (or per-bucket helper) that constructs the tool from the boot context and returns BootResult::Registered.
Add a unit test in crates/core/src/agent/mcp_server_bridge/dispatch.rs::tests covering the missing-handle and present-handle cases.
The conformance suite in crates/core/tests/exposable_catalog_test.rs will automatically pick it up via the every_always_entry_boots_* tests.

Comparison vs `nexo run`

	`nexo run`	`nexo mcp-server`
Tool registry	full (~31 tools, per-binding)	curated subset of `EXPOSABLE_TOOLS`
Plan-mode gating	yes (`MUTATING_TOOLS` / `READ_ONLY_TOOLS`)	yes — same gates apply
Capability YAML	per-agent `team.enabled`, `lsp.enabled`, etc.	`mcp_server.expose_tools` allowlist
Auth	local trust + binding policy	optional `auth_token_env` / `http.auth.kind`

Threat model — Config self-edit via MCP

The Config tool is the only entry that lets an external MCP client mutate the agent's YAML at runtime. It is gated by four locks that all must be open before the boot dispatcher registers it:

Lock	Where	Failure →
1. Cargo feature `config-self-edit`	compile-time	`SkippedFeatureGated`
2. `mcp_server.auth_token_env` or `http.auth` set	boot-time	`SkippedDenied { config-requires-auth-token }`
3. `agents.<id>.config_tool.self_edit = true`	per-agent YAML	`SkippedDenied { config-self-edit-policy-disabled }`
4. `agents.<id>.config_tool.allowed_paths` non-empty	per-agent YAML	`SkippedDenied { config-allowed-paths-must-be-explicit }`

Plus the inherent denylist (crates/setup/src/capabilities.rs::CONFIG_SELF_EDIT_DENYLIST) which permanently blocks credentials, allowed_delegates, outbound_allowlist, system_prompt, plugins, mcp_server.*, and broker.*. The denylist is hard-coded in code, not operator-editable from inside a Config call.
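The four locks can be read as an ordered series of early returns, each producing the failure label from the table. This is a sketch only: `ConfigLocks` and `config_boot_check` are hypothetical, the compile-time feature is modelled as a runtime boolean, and the order of evaluation is assumed to follow the table.

```rust
/// Hypothetical snapshot of the four lock inputs.
pub struct ConfigLocks {
    pub feature_compiled: bool,
    pub auth_configured: bool,
    pub self_edit: bool,
    pub allowed_paths: Vec<String>,
}

/// Ok means the Config tool may be registered; Err carries the skip label.
pub fn config_boot_check(locks: &ConfigLocks) -> Result<(), &'static str> {
    if !locks.feature_compiled {
        return Err("SkippedFeatureGated");
    }
    if !locks.auth_configured {
        return Err("config-requires-auth-token");
    }
    if !locks.self_edit {
        return Err("config-self-edit-policy-disabled");
    }
    // Unlike in `nexo run`, an empty list is not treated as "everything".
    if locks.allowed_paths.is_empty() {
        return Err("config-allowed-paths-must-be-explicit");
    }
    Ok(())
}
```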

Approval flow:

Model calls Config { op: "propose", key, value, justification }.
ConfigTool stages the patch under <state_dir>/config-proposals/<patch_id>.yaml.
ApprovalCorrelator parks a oneshot::Receiver keyed by patch_id.
Operator sends [config-approve patch_id=<id>] on any plugin inbound topic the daemon subscribes to (works because mcp-server's correlator subscribes to plugin.inbound.> if NATS is shared with the operator's nexo run daemon).
Model calls Config { op: "apply", patch_id }. If approved, the YAML write happens; ConfigChangesStore records the row; ReloadTrigger fires.

In mcp-server mode the McpServerReloadTrigger is a stub that returns Ok with a log line. The mutated YAML is durable on disk; the operator's nexo run daemon picks it up via Phase 18 file watcher. The mcp-server process itself does not run a ConfigReloadCoordinator — same-process reload only happens in nexo run.

Audit:

Every read/propose/apply lands in config_changes SQLite (<state_dir>/config_changes.db) via ConfigChangesStore.
Tail Config { op: ... } events with config_changes_tail (read-only, exposable).
Secret values redacted via DefaultSecretRedactor (matches *_token, *_secret, *_password, *_key suffixes).

What an MCP client cannot do, even with all locks open:

Change credentials, API keys, OAuth tokens (denylist).
Add/remove agent bindings (denylist on inbound_bindings).
Modify allowed_delegates, outbound_allowlist, system_prompt (denylist).
Toggle plugins (denylist on plugins).
Self-elevate mcp_server.expose_tools (denylist on mcp_server.*).
Bypass approval — apply always blocks until correlator gets a matching [config-approve patch_id=<id>] from inbound.
Read secret values without redaction.

References

PRIMARY: upstream agent CLI.
SECONDARY: research/docs/cli/mcp.md:30-120 (openclaw mcp serve curated catalog).
Spec: proyecto/PHASES.md::79.M.

Fork subagent (Phase 80.19)

crates/fork/ — fork-with-cache-share subagent infrastructure. A fork is a lightweight in-process LLM turn loop that:

Shares the parent goal's prompt-cache key (system prompt, tools, model, message prefix) so cache hits transfer across the fork boundary.
Runs as a single LLM turn loop (LlmClient::chat + tool dispatch + loop), NOT through Phase 67's heavyweight goal-flow driver-loop (which spawns claude subprocesses and runs acceptance + workspace checks).
Optionally writes a transcript / agent-handle row, or stays invisible to agent ps when skip_transcript: true.

Fork is the primitive that downstream sub-phases consume:

Sub-phase	Use of fork
80.1 autoDream consolidation	`ForkAndForget` + `AutoMemFilter` (80.20) + 4-phase prompt
80.14 AWAY_SUMMARY	`ForkAndForget` + read-only memory whitelist + transcript scan
Phase 51 eval harness	`Sync` mode + scripted prompts
Refactored `delegation_tool.rs`	`Sync` mode replacing the bespoke sync delegate

The upstream runForkedAgent (upstream agent CLI) is the verbatim reference. nexo's adaptation collapses 17 isolation fields down to the handful that actually matter in Rust, because Arc<...> shared state is already isolated by construction.

Public surface

use std::{sync::Arc, time::Duration};

use nexo_fork::{
    DefaultForkSubagent, ForkSubagent, ForkParams, ForkOverrides,
    DelegateMode, QuerySource, CacheSafeParams, AllowAllFilter,
};

// 1. Snapshot the parent's last LLM request.
let cache_safe = CacheSafeParams::from_parent_request(&parent_chat_request);

// 2. Build a fork. The handle is mutable because take_completion() borrows it mutably.
let mut handle = DefaultForkSubagent::new()
    .fork(ForkParams {
        parent_ctx,
        llm,
        tool_dispatcher,
        prompt_messages: vec![/* fork's first-turn user message */],
        cache_safe,
        tool_filter: Arc::new(AllowAllFilter),
        query_source: QuerySource::Custom("docs_example"),
        fork_label: "docs_example".into(),
        overrides: None,
        max_turns: 10,
        on_message: None,
        skip_transcript: true,
        mode: DelegateMode::ForkAndForget,
        timeout: Duration::from_secs(300),
        external_abort: None,
    })
    .await?;

// 3. Await completion when ready (or never, for true fire-and-forget).
let result = handle.take_completion().unwrap().await?;

Cache-key invariant (CRITICAL)

CacheSafeParams::fork_context_messages MUST preserve any incomplete tool_use blocks from the parent. Filtering them strips the paired tool_result rows and breaks Anthropic's API (400 error), AND breaks the cache prefix. nexo's crates/llm repairs missing pairings in transport, same as the main thread, so the identical post-repair prefix keeps the cache hit.

Reference: upstream agent CLI.

// CORRECT: pass through unchanged
let cs = CacheSafeParams::from_parent_request(&req);

// WRONG: never do this
// cs.fork_context_messages.retain(|m| !has_dangling_tool_use(m));

The test cache_safe::tests::from_parent_request_preserves_message_prefix_with_partial_tool_use verifies bit-for-bit pass-through.

Isolation strategy

KAIROS (TypeScript) clones 17 mutable fields per fork: readFileState, abortController, getAppState, setAppState, setResponseLength, nestedMemoryAttachmentTriggers, toolDecisions, etc. Most of these are mutable closures or mutable maps that JavaScript needs to deep-clone manually.

In nexo, every analogous field on AgentContext is either an Arc (shared) or wrapped in Arc<RwLock<...>> (interior mutability with explicit locking). Rust's ownership model already guarantees forks cannot mutate the parent's state without going through the locks.

We therefore only override the fields whose isolation actually matters:

Field	Default	Override
`agent_id`	parent's value	`ForkOverrides::agent_id`
`critical_system_reminder`	none	`ForkOverrides::critical_system_reminder` (consumed by `run_turn_loop`)
`abort`	new child token; parent → child cascade only	`ForkParams::external_abort` (caller supplies)
`tool_filter`	`AllowAllFilter`	`ForkParams::tool_filter` (e.g. `AutoMemFilter` for 80.1)

DelegateMode

pub enum DelegateMode {
    Sync,            // block until completion
    ForkAndForget,   // tokio::spawn + return ForkHandle immediately
}

ForkAndForget is right when the caller (autoDream, AWAY_SUMMARY) does not need the result inline. If the handle is dropped without take_completion ever being called, its Drop impl fires the abort signal automatically, so the spawned tokio task is cancelled rather than leaked.
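
The drop-time cleanup can be sketched with the standard library alone. This is an illustration under assumptions: `Handle` is a hypothetical stand-in for `ForkHandle`, an `AtomicBool` replaces the real abort token, and a plain `Option<u32>` replaces the completion future.

```rust
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::Arc;

/// Hypothetical stand-in for a fork handle; not nexo's real `ForkHandle`.
pub struct Handle {
    abort: Arc<AtomicBool>,
    completion: Option<u32>, // stands in for the completion future
}

impl Handle {
    pub fn new(abort: Arc<AtomicBool>) -> Self {
        Handle { abort, completion: Some(42) }
    }

    /// Taking the completion marks the handle as consumed.
    pub fn take_completion(&mut self) -> Option<u32> {
        self.completion.take()
    }
}

impl Drop for Handle {
    fn drop(&mut self) {
        // Completion never taken: signal the background task to stop.
        if self.completion.is_some() {
            self.abort.store(true, Ordering::SeqCst);
        }
    }
}

/// Returns whether the abort flag is set after the handle goes out of scope.
pub fn aborted_after_drop(consume: bool) -> bool {
    let flag = Arc::new(AtomicBool::new(false));
    {
        let mut h = Handle::new(flag.clone());
        if consume {
            let _ = h.take_completion();
        }
    } // `h` is dropped here
    flag.load(Ordering::SeqCst)
}
```

In this sketch the abort flag is raised only when the completion was never taken, which is the property that keeps abandoned forks from running on.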

Telemetry

Every fork emits a tracing span fork.subagent with fields:

fork_run_id — uuid v4
parent_agent — parent_ctx.agent_id
fork_label — caller-supplied tag (e.g. auto_dream)
query_source — QuerySource variant
mode — Sync | ForkAndForget
skip_transcript — bool
cache_key_hash — u64 from CacheSafeParams::cache_key_hash

The turn loop additionally emits:

fork.cache_break_detected (level WARN) when the cache hit ratio drops below 0.5 on the first turn. This is the Phase 77.4 cache-break heuristic, an actionable signal that the fork's CacheSafeParams does not match the parent's.
fork.tool_filter (level DEBUG) when the filter denies a tool call.

AutoMemFilter (Phase 80.20)

crates/fork::AutoMemFilter is the canonical ToolFilter for forked memory-consolidation work: autoDream (Phase 80.1), AWAY_SUMMARY (Phase 80.14), and the eval harness (Phase 51, future). It is a verbatim port of the upstream agent CLI.

What it allows

Tool	Allowed when
`REPL`	always (inner primitives re-gate via this same filter; required for cache-key parity per upstream `:171-180`)
`FileRead`, `Glob`, `Grep`	always (inherently read-only)
`Bash`	`nexo_driver_permission::is_read_only(command)` — composes Phase 77.8 destructive-cmd warning + Phase 77.9 sed-in-place + a positive whitelist of ~45 read-only utilities + redirect / subshell / heredoc detection
`FileEdit`, `FileWrite`	`file_path` (post-canonicalize) starts with the filter's `memory_dir`
anything else	denied with structured `tool_result` body so the model can recover within the same turn

Defense in depth

Allow-list: only the seven tool names above are accepted; everything else is rejected at the filter layer.
Bash classifier — composes existing Phase 77.x classifiers + a conservative whitelist that intentionally drops tee, awk, perl, python, node, ruby because they can shell out via system(...). Operators add them back per-call only if a pipe-only no-side-effects shape can be validated.
Path canonicalize at construction (memory_dir resolved once) AND per-call (file_path resolved before starts_with). Defeats symlink swaps and .. traversal.
Post-fork audit in 80.1 — auto_dream independently re-checks files_touched paths after the fork completes, so a filter bypass would still be caught.
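
The path check in the third layer can be approximated as follows. This is a sketch under assumptions, not the filter's actual code: `is_inside_memory_dir` and `resolve_target` are hypothetical names, and the fallback for files that do not exist yet (canonicalize the parent, then re-attach the file name) is one plausible way to handle new files.

```rust
use std::io;
use std::path::{Path, PathBuf};

/// Resolve `file_path`; if the file does not exist yet, canonicalize its
/// parent directory and re-attach the file name.
fn resolve_target(file_path: &Path) -> io::Result<PathBuf> {
    match file_path.canonicalize() {
        Ok(resolved) => Ok(resolved),
        Err(e) if e.kind() == io::ErrorKind::NotFound => {
            let parent = file_path.parent().unwrap_or_else(|| Path::new("."));
            let name = file_path.file_name().ok_or_else(|| {
                io::Error::new(io::ErrorKind::InvalidInput, "path has no file name")
            })?;
            Ok(parent.canonicalize()?.join(name))
        }
        Err(e) => Err(e),
    }
}

/// True when `file_path` resolves to a location under `memory_dir`.
/// Any resolution error is treated as a denial.
pub fn is_inside_memory_dir(memory_dir: &Path, file_path: &Path) -> bool {
    let Ok(root) = memory_dir.canonicalize() else {
        return false;
    };
    match resolve_target(file_path) {
        Ok(target) => target.starts_with(&root),
        Err(_) => false,
    }
}
```

Because both sides are canonicalized before `starts_with`, a path such as `<memory_dir>/../escape.md` resolves outside the root and is denied. A dangling symlink would still pass this simplified check, which is one reason a second, post-fork audit is worthwhile.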

Provider-agnostic

The filter operates on tool name + JSON args. It does NOT depend on any specific LlmClient impl, so it works under Anthropic, OpenAI, MiniMax, Gemini, DeepSeek, or any future provider that implements the trait. Tool names are canonical nexo strings (tool_filter::tool_names::*); provider clients translate to and from native wire formats.

The filter expects flat top-level args. If a provider client wraps args in a nested envelope (e.g. {"arguments": {...}}), the client MUST unwrap before dispatch — the filter denies nested shapes explicitly so a missing unwrap surfaces immediately.

Example

use std::sync::Arc;
use nexo_fork::{
    AutoMemFilter, DefaultForkSubagent, DelegateMode, ForkParams,
    ForkSubagent, QuerySource, CacheSafeParams,
};

let memory_dir = std::path::PathBuf::from("/var/lib/nexo/memory/agent_a");
std::fs::create_dir_all(&memory_dir)?;
let filter = Arc::new(AutoMemFilter::new(&memory_dir)?);

let handle = DefaultForkSubagent::new()
    .fork(ForkParams {
        parent_ctx,
        llm,
        tool_dispatcher,
        prompt_messages: vec![/* /dream prompt */],
        cache_safe: CacheSafeParams::from_parent_request(&parent_request),
        tool_filter: filter,           // ← whitelist applied here
        query_source: QuerySource::AutoDream,
        fork_label: "auto_dream".into(),
        overrides: None,
        max_turns: 30,
        on_message: None,
        skip_transcript: true,
        mode: DelegateMode::ForkAndForget,
        timeout: std::time::Duration::from_secs(300),
        external_abort: None,
    })
    .await?;

Cross-process forks

Out of scope for 80.19. When Phase 32 multi-host orchestration lands, a NatsForkSubagent impl will publish on agent.fork.<run_id>.events so a fork can run on a remote daemon sharing the parent's prompt cache via the upstream LLM provider's cache plane.

Agent event firehose (Phase 82.11)

Operator UIs (and any microapp with the right capability) need real-time visibility into what agents are doing: chat lines, pause state changes, escalations to humans, batch-job results, future custom kinds. The agent event firehose is the single architectural seam that delivers them.

The wire format — AgentEventKind — is a #[non_exhaustive] discriminated enum on nexo/notify/agent_event (admin RPC reference). This page documents the runtime composition that gets a frame from a producer (transcript writer, processing handler, escalation handler) onto every interested subscriber.

Trait

Every producer holds a single Arc<dyn AgentEventEmitter>:

#[async_trait]
pub trait AgentEventEmitter: Send + Sync + Debug {
    async fn emit(&self, event: AgentEventKind);
}

Implementations are best-effort: failures log and drop. The contract is that emit MUST NOT block the producer. Boot is free to swap in any composition without touching emit sites.

Source: crates/core/src/agent/agent_events.rs.

Implementations

`BroadcastAgentEventEmitter` — live in-process

A tokio::sync::broadcast::Sender<AgentEventKind> with a 256-frame ring buffer. Subscribers that lag past the buffer receive RecvError::Lagged(n) instead of panicking, and are expected to call nexo/admin/agent_events/list to resync.

Single-daemon installs run happily with just this. No durability, no cross-host.

`SqliteAgentEventLog` — durable backfill

Append-only log keyed by autoincrement id. Denormalised columns (kind, agent_id, tenant_id, at_ms) so the common filter axes hit indexed paths; full AgentEventKind round-trips as JSON in payload_json so future enum variants land non-breaking.

Doubles as AgentEventEmitter so it slots into the composition without a separate wiring path. sweep_retention(retention_days, max_rows) mirrors the admin-audit sweep so a single boot scheduler runs both.

Read API (AgentEventLog::list_recent) supports agent_id + kind + tenant_id + since_ms + limit filters with parameterised SQL.

Source: crates/core/src/agent/admin_rpc/agent_events_sqlite.rs.

`NatsAgentEventEmitter` — multi-host bridge

Publishes serialised AgentEventKind frames to <prefix>.<agent_id>.<kind> (default prefix nexo.agent_events). Subscribers route per-agent (nexo.agent_events.ana.>), per-kind (nexo.agent_events.*.processing_state_changed), or both at the broker.

The pure helper agent_event_subject(prefix, &event) exposes the routing key without a live client — useful for boot-time validation and for tests. agent_id is sanitised at emit-site (./*/>/whitespace → _) so a malformed config can't break wildcard subscriptions.
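
A minimal sketch of the subject scheme described above. The function names `sanitize_agent_id` and `subject_for` are hypothetical (the real helper is `agent_event_subject(prefix, &event)`, whose exact signature is not reproduced here), and the kind is passed as a plain string for simplicity.

```rust
/// Replace NATS-significant characters in an agent id with `_`:
/// `.` (token separator), `*` and `>` (wildcards), and whitespace.
pub fn sanitize_agent_id(agent_id: &str) -> String {
    agent_id
        .chars()
        .map(|c| {
            if matches!(c, '.' | '*' | '>') || c.is_whitespace() {
                '_'
            } else {
                c
            }
        })
        .collect()
}

/// Build `<prefix>.<agent_id>.<kind>` with the agent id sanitised.
pub fn subject_for(prefix: &str, agent_id: &str, kind: &str) -> String {
    format!("{}.{}.{}", prefix, sanitize_agent_id(agent_id), kind)
}
```

With the default prefix, agent `ana` and kind `processing_state_changed` map to `nexo.agent_events.ana.processing_state_changed`, which both the per-agent and per-kind wildcard subscriptions match.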

Failure mode is best-effort: publish errors log and drop. The broker crate's circuit breaker + disk queue protect the daemon when NATS is unreachable.

Source: crates/core/src/agent/agent_events.rs (NatsAgentEventEmitter, agent_event_subject).

`TeeAgentEventEmitter` — fan-out

Composes several inner emitters into a single Arc<dyn AgentEventEmitter>. Boot wires:

Tee([
    BroadcastAgentEventEmitter,   // live JSON-RPC notifications
    SqliteAgentEventLog,          // durable backfill across restart
    NatsAgentEventEmitter,        // multi-host bridge for SaaS
])

Per-sink failures stay isolated by trait contract. Tee preserves that guarantee — emit returns after every inner has been polled sequentially. Production keeps each inner non-blocking (broadcast try_send, NATS publish, async SQLite append) so a slow sink cannot throttle the whole tee.
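
The fan-out shape can be illustrated with a simplified, synchronous trait. Everything here is an assumption made for the sketch: `Emitter`, `Recorder`, `Failing`, and `Tee` are hypothetical types, events are plain strings, and the real trait is async.

```rust
use std::sync::{Arc, Mutex};

/// Simplified, synchronous stand-in for the async `AgentEventEmitter` trait.
pub trait Emitter: Send + Sync {
    /// Best-effort: implementations swallow their own failures.
    fn emit(&self, event: &str);
}

/// Records every event it sees.
#[derive(Default)]
pub struct Recorder(pub Mutex<Vec<String>>);

impl Emitter for Recorder {
    fn emit(&self, event: &str) {
        self.0.lock().unwrap().push(event.to_string());
    }
}

/// A sink that always fails internally; the failure is dropped.
pub struct Failing;

impl Emitter for Failing {
    fn emit(&self, _event: &str) {
        let result: Result<(), &str> = Err("sink unavailable");
        if let Err(_e) = result {
            // A real implementation would log here; nothing is propagated.
        }
    }
}

/// Fans one event out to every inner emitter, in order.
pub struct Tee(pub Vec<Arc<dyn Emitter>>);

impl Emitter for Tee {
    fn emit(&self, event: &str) {
        for inner in &self.0 {
            inner.emit(event);
        }
    }
}
```

Because `emit` returns nothing, a failing sink has no way to short-circuit the loop, so sinks placed after it still receive the event.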

Boot composition state

AdminBootstrapInputs (in nexo-setup) accepts an optional agent_event_log: Option<Arc<SqliteAgentEventLog>>. When Some, build_with_firehose composes Tee([Broadcast, Log]) internally — every emit through bootstrap.event_emitter() lands in the durable log. The NATS bridge is library-side ready (NatsAgentEventEmitter::new(client)) but not yet stitched by boot — adding it is one line in the same composition once the broker handle is threaded into bootstrap inputs.

`NoopAgentEventEmitter`

Default for headless installs and tests. Useful as an explicit "no-op, by design" instead of None plumbed through every emit site.

Subscribers reach events through three doors:

Door	When	How
Live JSON-RPC notifications	Operator UI online during the emit	Microapp holds `transcripts_subscribe` / `agent_events_subscribe_all`; daemon delivers `nexo/notify/agent_event` frames automatically.
Backfill RPC	Operator UI starts after the emit	`nexo/admin/agent_events/list` reads from the `MergingAgentEventReader` — transcripts JSONL for `transcript_appended`, durable SQLite log for non-transcript kinds, merged by `at_ms` desc.
External NATS subscriber	Operator dashboard runs off-daemon	Subscribe directly at `<prefix>.<agent_id>.<kind>`.

MergingAgentEventReader (in crates/core/src/agent/admin_rpc/domains/agent_events.rs) respects the kind filter:

kind=Some("transcript_appended") → transcripts JSONL only.
kind=Some(other) → durable log only.
kind=None → both, merged by at_ms desc, capped at the caller's limit.

Boot wires the SQLite log as a Tee sink alongside the broadcast emitter — meaning the log captures TranscriptAppended too. The merger drops those on the log side so the JSONL reader stays canonical for chat history; subscribers never see duplicates.
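
The kind filter and de-duplication rules can be sketched as one pure function. `Row` and `merge_recent` are hypothetical names for illustration; the real reader works on `AgentEventKind` values and reads from disk.

```rust
/// Hypothetical flattened event row; the real reader works on `AgentEventKind`.
#[derive(Clone, Debug, PartialEq)]
pub struct Row {
    pub kind: String,
    pub at_ms: u64,
}

const TRANSCRIPT: &str = "transcript_appended";

/// Merge rows from the transcripts JSONL and the durable log, newest first.
pub fn merge_recent(
    transcripts: &[Row],
    durable_log: &[Row],
    kind: Option<&str>,
    limit: usize,
) -> Vec<Row> {
    let mut out: Vec<Row> = Vec::new();
    // Transcript rows come only from the JSONL source.
    if kind.is_none() || kind == Some(TRANSCRIPT) {
        out.extend(transcripts.iter().filter(|r| r.kind == TRANSCRIPT).cloned());
    }
    // Everything else comes from the durable log; its transcript copies are dropped.
    if kind != Some(TRANSCRIPT) {
        out.extend(
            durable_log
                .iter()
                .filter(|r| r.kind != TRANSCRIPT)
                .filter(|r| kind.map_or(true, |k| r.kind == k))
                .cloned(),
        );
    }
    out.sort_by(|a, b| b.at_ms.cmp(&a.at_ms));
    out.truncate(limit);
    out
}
```

A transcript row that exists in both sources appears once in the merged result, because the copy in the durable log is skipped before sorting.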

Variants today

Variant	Producer	Notes
`TranscriptAppended`	`TranscriptWriter::append_entry` (Phase 82.11)	Body already-redacted at emit.
`PendingInboundsDropped`	inbound dispatcher under `processing/pause` (Phase 82.13.b.3)	Fired only on cap eviction.
`EscalationRequested`	(deferred) `escalate_to_human` built-in tool	Variant + emit shape pinned in 82.14.b.firehose.
`EscalationResolved`	`escalations::resolve` + `auto_resolve_on_pause` (Phase 82.14.b.firehose)	Same shape from both call sites so subscribers can't tell paths apart.
`ProcessingStateChanged`	`processing::pause` + `processing::resume` (Phase 82.13.b.firehose)	Carries `prev_state` + `new_state` so subscribers render correct deltas. Idempotent retries skip the emit.

Adding a new kind

AgentEventKind is #[non_exhaustive] with #[serde(tag = "kind")], so a new variant lands non-breaking in three steps:

Add the variant in crates/tool-meta/src/admin/agent_events.rs. Mirror the conventions: agent_id denormalised, optional tenant_id (skip-when-None on the wire), an at_ms field for ordering.
Wire the producer to call emitter.emit(...) from the place the event becomes true. Pre-fetch any "previous" state before the mutation so the wire frame carries both ends of the transition.
Extend agent_events_sqlite::extract_metadata and agent_events::event_at_ms so the durable log + the merger know how to project the new variant. Unknown future variants fall through to a warn-skip on the durable side and the live broadcast still surfaces them — failure stays graceful.

No FTS schema change is required: search remains TranscriptAppended-only today. Future revs that want full-text search over non-transcript kinds add an AgentEventLog::search_events method without touching existing emit sites.

Pure-Rust quick tunnel

nexo-tunnel-quick exposes a local TCP port over https://*.trycloudflare.com without the cloudflared Go subprocess. It is the public-HTTPS plumbing the daemon needs for WhatsApp QR pairing and dev-time webhook receivers.

Phase 92 lineage:

cloudflare-quick-tunnel (crates.io, upstream) — QUIC + Cap'n Proto-RPC client against Cloudflare's argotunnel edge.
nexo-tunnel-quick (this crate) — workspace wrapper that adds the sidecar URL file accessor, the lifecycle metrics module, and surfaces the supervisor knobs through a stable API.
nexo-tunnel (legacy alias, v0.3.x) — re-exports nexo-tunnel-quick::* verbatim until Phase 92.11 retires it.

Public API

use std::time::Duration;
use nexo_tunnel_quick::{TunnelManager, DEFAULT_GRACE_PERIOD};

let handle = TunnelManager::new(8080)
    .with_timeout(Duration::from_secs(30))
    .start()
    .await?;

println!("public URL: {}", handle.url);
println!("edge POP : {}", handle.location);
println!("tunnel id: {}", handle.tunnel_id);

if let Some(m) = handle.metrics().await {
    println!(
        "streams={} in={} out={} reconnects={}",
        m.streams_total, m.bytes_in, m.bytes_out, m.reconnects,
    );
}

handle.shutdown_with(DEFAULT_GRACE_PERIOD).await;

Supervisor

Heartbeat, reconnect-with-backoff and graceful unregisterConnection all run inside the upstream supervisor (a Tokio task owned by QuickTunnelHandle). Visible knobs:

Constant	Value	Role
`DEFAULT_HANDSHAKE_TIMEOUT`	30 s	wait for edge to register before failing
`DEFAULT_GRACE_PERIOD`	30 s	drain in-flight streams during shutdown
`MAX_RECONNECT_ATTEMPTS`	10	consecutive supervisor reconnect failures

MAX_RECONNECT_ATTEMPTS exhaustion surfaces as TunnelError::PermanentFailure(attempts) on the next metrics() poll (the supervisor task closes itself; the handle still answers shutdown() cleanly).

Telemetry (Phase 92.10)

Process-wide lifecycle counters live in nexo_tunnel_quick::metrics and follow the same lock-free pattern as Phase 86 (nexo-memory::metrics): LazyLock<AtomicU64> / LazyLock<DashMap> storage, hand-rolled Prometheus text rendering, no prometheus crate dep, no metrics server here. Operators stitch the renderer output into the runtime's /metrics aggregator.

Counter	Labels	Meaning
`tunnel_starts_total`	—	successful `TunnelManager::start` calls
`tunnel_starts_failed_total`	`reason`	`api` / `discovery` / `quic_dial` / `register` / …
`tunnel_shutdowns_total`	—	graceful `TunnelHandle::shutdown_with` invocations
`tunnel_streams_total`	`tunnel_id`	streams proxied (per tunnel, supervisor counter)
`tunnel_bytes_in_total`	`tunnel_id`	bytes edge → local
`tunnel_bytes_out_total`	`tunnel_id`	bytes local → edge
`tunnel_reconnects_total`	`tunnel_id`	supervisor reconnect cycles

Two render entry points:

// Lifecycle counters only; no per-handle snapshot.
let text = nexo_tunnel_quick::metrics::render_prometheus();

// Lifecycle + per-tunnel supervisor counters from live handles.
let text = nexo_tunnel_quick::metrics::render_prometheus_for(&[&h1, &h2]).await;

reason cardinality is capped at 16 distinct labels — extra ones collapse to "other" so a misbehaving error path can't blow up Prometheus storage.
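
The cardinality cap can be sketched like this. `LabelCap` is a hypothetical type for illustration; the crate's own storage is described as lock-free (`DashMap`), whereas this sketch uses a plain `HashSet` behind `&mut self` to keep the logic visible.

```rust
use std::collections::HashSet;

/// Caps the number of distinct label values; overflow collapses to "other".
pub struct LabelCap {
    seen: HashSet<String>,
    max: usize,
}

impl LabelCap {
    pub fn new(max: usize) -> Self {
        LabelCap { seen: HashSet::new(), max }
    }

    /// Returns the label value to record for `reason`.
    pub fn admit(&mut self, reason: &str) -> String {
        if self.seen.contains(reason) {
            return reason.to_string();
        }
        if self.seen.len() < self.max {
            self.seen.insert(reason.to_string());
            return reason.to_string();
        }
        "other".to_string()
    }
}
```

Labels admitted before the cap was reached keep their own series; only values first seen after the cap are folded into `other`.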

Sidecar URL file

nexo pair start is a separate process from the daemon, so the active URL is published to a file at:

$NEXO_HOME/state/tunnel.url when NEXO_HOME is set, otherwise
~/.nexo/state/tunnel.url.

The daemon writes atomically (<path>.tmp + rename) on tunnel-up and removes on shutdown:

use nexo_tunnel_quick::{write_url_file, read_url_file, clear_url_file};

write_url_file(&handle.url)?;
let active = read_url_file();
clear_url_file()?;

No daemon connection, no broker round-trip, no shared library state — the CLI reads the file directly.
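
For readers unfamiliar with the tmp-plus-rename pattern, here is a standard-library sketch. The function names and the explicit `path` parameter are assumptions made for the example; the crate's own `write_url_file`, `read_url_file`, and `clear_url_file` resolve the path themselves.

```rust
use std::fs;
use std::io;
use std::path::{Path, PathBuf};

/// Write `url` to `path` via a sibling `<path>.tmp` file plus rename,
/// so a concurrent reader sees either the old file or the new one.
pub fn write_url_atomic(path: &Path, url: &str) -> io::Result<()> {
    if let Some(dir) = path.parent() {
        fs::create_dir_all(dir)?;
    }
    let mut tmp = path.as_os_str().to_owned();
    tmp.push(".tmp");
    let tmp = PathBuf::from(tmp);
    fs::write(&tmp, url)?;
    fs::rename(&tmp, path)
}

/// Read the URL back; a missing file means "no active tunnel".
pub fn read_url(path: &Path) -> Option<String> {
    fs::read_to_string(path).ok().map(|s| s.trim().to_string())
}

/// Remove the file; an already-missing file is not an error.
pub fn clear_url(path: &Path) -> io::Result<()> {
    match fs::remove_file(path) {
        Err(e) if e.kind() != io::ErrorKind::NotFound => Err(e),
        _ => Ok(()),
    }
}
```

The rename is what makes the write safe for a reader in another process: the CLI never observes a half-written URL.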

llm.yaml

LLM provider registry. Each agent's model.provider must resolve to a key in this file.

Source: crates/config/src/types/llm.rs.

Shape

providers:
  minimax:
    api_key: ${MINIMAX_API_KEY:-}
    group_id: ${MINIMAX_GROUP_ID:-}
    base_url: https://api.minimax.io
    rate_limit:
      requests_per_second: 2.0
      quota_alert_threshold: 100000
  anthropic:
    api_key: ${ANTHROPIC_API_KEY:-}
    base_url: https://api.anthropic.com
    rate_limit:
      requests_per_second: 2.0
    auth:
      mode: oauth_bundle
      bundle: ./secrets/anthropic_oauth.json
retry:
  max_attempts: 5
  initial_backoff_ms: 1000
  max_backoff_ms: 60000
  backoff_multiplier: 2.0

Per-provider fields

Field	Type	Required	Default	Purpose
`api_key`	string	✅	—	API key. Supports `${ENV_VAR}` and `${file:…}`.
`base_url`	url	✅	—	API endpoint. Override to use a proxy or a local server.
`group_id`	string	—	—	MiniMax-only. Group identifier.
`rate_limit.requests_per_second`	f64	—	`2.0`	Outbound throttle.
`rate_limit.quota_alert_threshold`	u64	—	—	Optional soft-alarm tokens-per-day threshold.
`api_flavor`	enum	—	`openai_compat`	`openai_compat` or `anthropic_messages`. Lets MiniMax expose the Anthropic wire.
`embedding_model`	string	—	—	Override model used for embeddings (e.g. Gemini's `text-embedding-004`).
`safety_settings`	JSON	—	—	Gemini-only; attached verbatim to requests.

Top-level retry block

Applies to every provider that doesn't define its own:

Field	Default	Purpose
`max_attempts`	`5`	Total attempts including the first try.
`initial_backoff_ms`	`1000`	First backoff.
`max_backoff_ms`	`60000`	Cap.
`backoff_multiplier`	`2.0`	Exponential factor.

Retries are jittered to avoid thundering-herd reconnects. See Fault tolerance — Retry policies.
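
The un-jittered schedule implied by the defaults can be worked out with a small sketch. `RetryPolicy`, `delay_ms`, and `schedule` are hypothetical names mirroring the YAML fields; how nexo applies jitter on top of these values is not shown and is not assumed here.

```rust
/// Mirror of the top-level `retry` block; field names follow the YAML.
pub struct RetryPolicy {
    pub max_attempts: u32,
    pub initial_backoff_ms: u64,
    pub max_backoff_ms: u64,
    pub backoff_multiplier: f64,
}

impl RetryPolicy {
    /// Un-jittered delay (ms) to wait after failed attempt `attempt` (1-based).
    pub fn delay_ms(&self, attempt: u32) -> u64 {
        let exp = attempt.saturating_sub(1) as i32;
        let raw = self.initial_backoff_ms as f64 * self.backoff_multiplier.powi(exp);
        raw.min(self.max_backoff_ms as f64) as u64
    }

    /// Delays between attempts: one fewer than `max_attempts`.
    pub fn schedule(&self) -> Vec<u64> {
        (1..self.max_attempts).map(|a| self.delay_ms(a)).collect()
    }
}
```

With the defaults (5 attempts, 1000 ms initial, multiplier 2.0, cap 60000 ms) the waits are 1000, 2000, 4000, and 8000 ms. The cap only starts to matter from the seventh attempt, where 64000 ms is clamped to 60000 ms.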

Auth modes

auth:
  mode: auto | static | token_plan | oauth_bundle
  bundle: ./secrets/anthropic_oauth.json
  setup_token_file: ./secrets/anthropic_setup.json
  refresh_endpoint: https://auth.example.com/refresh
  client_id: your-oauth-client

`mode`	When
`auto`	Let the provider client decide from available credentials.
`static`	Use `api_key` verbatim.
`token_plan`	MiniMax "Token Plan" OAuth bundle.
`oauth_bundle`	Anthropic PKCE OAuth bundle written by `agent setup`.

Supported providers

Key	Notes
`minimax`	Primary provider. MiniMax M2.5. OpenAI-compat or Anthropic-flavour wire.
`anthropic`	Claude models. API key or OAuth subscription.
`openai`	OpenAI API and anything speaking its wire (Ollama, Groq, local proxies).
`gemini`	Google Gemini, including embedding support.

Common mistakes

api_key: sk-… committed to git. Use ${ENV_VAR} or ${file:./secrets/…}; the secrets/ directory is gitignored.
Mismatched embedding_model dimensions. The vector store asserts embedding.dimensions matches the model output. A mismatch aborts startup with an explicit message.
Setting both api_key and auth.mode: oauth_bundle. The auth mode wins. The api_key is kept as a fallback for tools that bypass the OAuth path.

Input-token reduction (`context_optimization`)

Four independent kill switches for prompt caching, online history compaction, pre-flight token counting, and the workspace bundle cache. Full schema, defaults, and rollout guidance in Operations → Context optimization.

broker.yaml

Broker topology, disk persistence, and fallback behavior.

Source: crates/config/src/types/broker.rs.

Shape

broker:
  type: nats          # nats | local
  url: nats://localhost:4222
  auth:
    enabled: false
    nkey_file: ./secrets/nats.nkey
  persistence:
    enabled: true
    path: ./data/queue
  limits:
    max_payload: 4MB
    max_pending: 10000
  fallback:
    mode: local_queue
    drain_on_reconnect: true

Fields

Field	Type	Default	Purpose
`type`	`nats` \| `local`	`local`	`local` keeps the whole bus in-process; `nats` uses a real NATS server.
`url`	url	—	NATS connection URL (ignored when `type: local`).
`auth.enabled`	bool	`false`	Turn on NKey authentication.
`auth.nkey_file`	path	—	Path to the NKey file when `auth.enabled`.
`persistence.enabled`	bool	`true`	Turn on the SQLite disk queue.
`persistence.path`	path	`./data/queue`	Directory for the disk queue SQLite DB.
`limits.max_payload`	size	`4MB`	Reject events larger than this.
`limits.max_pending`	u64	`10000`	Hard cap on the disk queue; past this, oldest events are shed.
`fallback.mode`	`local_queue` \| `drop`	`local_queue`	What to do when NATS is unreachable.
`fallback.drain_on_reconnect`	bool	`true`	Replay the disk queue when NATS returns.

Operational notes

type: local for single-machine dev (and small prod). You don't need NATS running just to try the agent. The local broker matches NATS subject semantics, so everything works the same.
Subprocess plugins work in local mode too (Phase 92). When type: local, the daemon derives a stdio_bridge transport and pipes broker.publish / broker.event for the extracted subprocess plugins (WhatsApp, Telegram, marketing) through the stdio JSON-RPC channel it already uses for tool calls — no NATS server required. stdio_bridge is daemon-derived; operators never pick it in YAML. Full picture: broker shapes.
Switch at runtime with nexo set-broker. Rewrites this file + SIGHUPs the running daemon (~3 s blackout; in-flight messages drained from the persistence layer):
```
nexo set-broker nats --url nats://localhost:4222   # → multi-host
nexo set-broker local                              # → stdio bridge
```
Disk queue always on in production. Even on a single machine. It's the guarantee against losing events on a NATS blip.
drain_on_reconnect: true replays the disk queue in FIFO order. See the Disk queue section of Event bus.

memory.yaml

Short-term sessions, long-term SQLite storage, and optional vector search.

Source: crates/config/src/types/memory.rs.

Shape

short_term:
  max_history_turns: 50
  session_ttl: 24h
  max_sessions: 10000

long_term:
  backend: sqlite
  sqlite:
    path: ./data/memory.db

vector:
  enabled: false
  backend: sqlite-vec
  default_recall_mode: hybrid
  embedding:
    provider: http
    base_url: https://api.openai.com/v1
    model: text-embedding-3-small
    api_key: ${OPENAI_API_KEY}
    dimensions: 1536
    timeout_secs: 30

Short-term

Per-session conversation buffer held in memory by SessionManager.

Field	Default	Purpose
`max_history_turns`	`50`	Turns kept before oldest are pruned into long-term memory.
`session_ttl`	`24h`	How long a session lives idle before eviction. humantime syntax.
`max_sessions`	`10000`	Soft cap. On overflow the oldest-idle session is evicted (fires `on_expire`). `0` = unbounded.

Long-term

Persisted memory, durable across restarts.

Field	Options	Default	Purpose
`backend`	`sqlite` \| `redis`	`sqlite`	Storage engine.
`sqlite.path`	path	`./data/memory.db`	SQLite file (with `sqlite-vec` extension loaded when vector enabled).
`redis.url`	url	—	Redis connection string (when `backend: redis`).

Vector

Opt-in semantic memory.

Field	Default	Purpose
`enabled`	`false`	Opt-in.
`backend`	`sqlite-vec`	Zero-extra-infrastructure vector index.
`default_recall_mode`	`hybrid`	Used when the `memory` tool call omits `mode`. Options: `keyword`, `vector`, `hybrid`.
`embedding.provider`	`http`	Where to fetch embeddings. `http` = any OpenAI-compatible embeddings server.
`embedding.base_url`	—	Embeddings endpoint.
`embedding.model`	—	Model id, e.g. `text-embedding-3-small`, `nomic-embed-text`.
`embedding.api_key`	—	Key for the embeddings server. Supports `${ENV_VAR}` / `${file:…}`.
`embedding.dimensions`	—	Must match the model output (1536 for OpenAI 3-small; 768 for nomic). Mismatch aborts startup.
`embedding.timeout_secs`	`30`	Embeddings request timeout.

Memory layers

flowchart LR
    MSG[incoming message] --> STM[short-term<br/>in-memory buffer]
    STM -->|turns exceed max| PRUNE[prune]
    PRUNE --> LTM[(long-term<br/>SQLite)]
    LTM --> EMB{vector<br/>enabled?}
    EMB -->|yes| VEC[(sqlite-vec index)]
    TOOL[memory tool] --> RECALL{recall mode}
    RECALL -->|keyword| LTM
    RECALL -->|vector| VEC
    RECALL -->|hybrid| LTM
    RECALL -->|hybrid| VEC

Per-agent isolation

Each agent's memory DB lives under its workspace when workspace_git is enabled — keeps memories forensically reviewable and prevents one agent from reading another's history.

Drop-in agents

config/agents.d/*.yaml is a merge-directory for agent definitions that should not live in agents.yaml — typically anything with business content (sales prompts, pricing tables, internal phone numbers, customer-facing identities).

Source: crates/config/src/lib.rs (merge logic).

Why it exists

Keep agents.yaml public-safe and checked into git
Keep sensitive content gitignored and loaded at runtime
Compose layered configs (00-dev.yaml, 10-prod.yaml) without editing a single monolithic file
Ship .example.yaml templates so the shape stays discoverable

.gitignore rules include:

config/agents.d/*.yaml
!config/agents.d/*.example.yaml

The .example.yaml files are committed and serve as templates; the real .yaml files are not.

Merge order

Files are loaded in lexicographic filename order and their agents arrays are concatenated to the base agents.yaml:

flowchart TD
    BASE[agents.yaml] --> MERGE[merged catalog]
    D1[agents.d/00-shared.yaml] --> MERGE
    D2[agents.d/10-ana.yaml] --> MERGE
    D3[agents.d/20-kate.yaml] --> MERGE
    EX[agents.d/ana.example.yaml] -.->|committed template<br/>usually not loaded| MERGE

Every file must have the top-level agents: [...] shape:

# config/agents.d/10-ana.yaml
agents:
  - id: ana
    model:
      provider: minimax
      model: MiniMax-M2.5
    plugins: [whatsapp]
    inbound_bindings:
      - plugin: whatsapp
    system_prompt: |
      …private content…

Agent id collisions

Two files cannot define the same agent.id. On collision the loader fails fast with a clear message. If you want to override an agent, either:

Replace the entry (rename or remove the original)
Use inbound_bindings[] per-binding overrides inside a single entry
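
The merge order and the fail-fast collision rule can be sketched together. `AgentFile` and `merge_agents` are hypothetical names, files are reduced to the agent ids they declare, and the error text is illustrative rather than the loader's actual message.

```rust
use std::collections::HashMap;

/// One config file reduced to its name and the agent ids it declares.
pub struct AgentFile {
    pub name: String,
    pub agent_ids: Vec<String>,
}

/// Merge `agents.yaml` with the drop-in files, loading drop-ins in
/// lexicographic filename order and failing fast on a duplicate id.
pub fn merge_agents(
    base: AgentFile,
    mut drop_ins: Vec<AgentFile>,
) -> Result<Vec<String>, String> {
    drop_ins.sort_by(|a, b| a.name.cmp(&b.name));
    let mut owner: HashMap<String, String> = HashMap::new();
    let mut merged = Vec::new();
    for file in std::iter::once(base).chain(drop_ins) {
        for id in file.agent_ids {
            if let Some(first) = owner.get(&id) {
                return Err(format!(
                    "duplicate agent id '{}' in {} (first defined in {})",
                    id, file.name, first
                ));
            }
            owner.insert(id.clone(), file.name.clone());
            merged.push(id);
        }
    }
    Ok(merged)
}
```

Numeric filename prefixes such as `10-` and `20-` therefore control the final order of the catalog, and a repeated id is reported together with both file names.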

Common patterns

Public vs. private split

config/agents.yaml                  # committed, only support/ops agents
config/agents.d/ana.yaml            # gitignored, full sales prompt
config/agents.d/kate.yaml           # gitignored, personal assistant
config/agents.d/ana.example.yaml    # committed, empty template

Environment layering

config/agents.d/00-common.yaml      # shared defaults
config/agents.d/10-dev.yaml         # dev-only overrides (loaded only on dev box)

Swap the 10-*.yaml file per environment. Docker compose can mount the right one from a secret volume.

Validation

#[serde(deny_unknown_fields)] still applies to every file
validate_agents() runs after the merge — checks duplicate ids, missing plugin references, invalid skill directories
Errors name the file and the offending agent id

Per-agent credentials

Bind each agent to specific WhatsApp / Telegram / Google accounts so outbound traffic originates from the right number, bot, or mailbox — never from a shared pool.

Mental model

Three layers:

Plugin instance — a labelled WhatsApp session or Telegram bot in config/plugins/{whatsapp,telegram}.yaml. Each instance owns its own token / session_dir and an optional allow_agents list.
Google account — an entry in the optional config/plugins/google-auth.yaml. Each account is 1:1 with an agent_id.
Agent binding — in config/agents.d/<agent>.yaml, the credentials: block pins the agent to the instance / account it may use for outbound tool calls.

The runtime runs a boot-time gauntlet that cross-checks all three layers before any plugin boots. Every invariant violation surfaces in a single report so you can fix the full YAML in one edit.

Config schemas

`config/agents.d/ana.yaml`

agents:
  - id: ana
    credentials:
      whatsapp: personal        # must match whatsapp.yaml instance
      telegram: ana_bot         # must match telegram.yaml instance
      google:   ana@gmail.com   # must match google-auth.yaml accounts[].id
      # Opt-out for the symmetric-binding warning when inbound bot and
      # outbound bot are intentionally different:
      # telegram_asymmetric: true
    inbound_bindings:
      - { plugin: whatsapp, instance: personal }
      - { plugin: telegram, instance: ana_bot }

`config/plugins/whatsapp.yaml`

whatsapp:
  - instance: personal
    session_dir: ./data/workspace/ana/whatsapp/personal
    media_dir:   ./data/media/whatsapp/personal
    allow_agents: [ana]           # defense-in-depth ACL
  - instance: work
    session_dir: ./data/workspace/kate/whatsapp/work
    media_dir:   ./data/media/whatsapp/work
    allow_agents: [kate]

`config/plugins/telegram.yaml`

telegram:
  - instance: ana_bot
    token: ${file:./secrets/telegram/ana_token.txt}
    allow_agents: [ana]
    allowlist:
      chat_ids: [1194292426]
  - instance: kate_bot
    token: ${file:./secrets/telegram/kate_token.txt}
    allow_agents: [kate]

`config/plugins/google-auth.yaml`

google_auth:
  accounts:
    - id: ana@gmail.com
      agent_id: ana                       # 1:1 — the gauntlet enforces it
      client_id_path:     ./secrets/google/ana_client_id.txt
      client_secret_path: ./secrets/google/ana_client_secret.txt
      token_path:         ./secrets/google/ana_token.json
      scopes:
        - https://www.googleapis.com/auth/gmail.modify

Agents that still declare the legacy inline google_auth block are auto-migrated into this store on boot (a warning tells you to migrate).

What the gauntlet validates

Check	Lenient	Strict
Duplicate `session_dir` across instances	error	error
`session_dir` that is a parent of another	error	error
Credential file with lax permissions (Linux: any bit in mask 0o077 set)	error	error
`credentials.<ch>` points to an instance that does not exist	error	error
Agent listens on >1 instance without declaring `credentials.<ch>`	error	error
Instance `allow_agents` excludes a binding agent	error	error
Inbound instance ≠ outbound instance (no `<ch>_asymmetric`)	warn	error
Inline `agents.<id>.google_auth` without matching `google-auth.yaml`	warn	error

Linux permission check is skipped for /run/secrets/* (Docker secrets) and can be disabled entirely with CHAT_AUTH_SKIP_PERM_CHECK=1.

Topics

Outbound tool calls land on instance-suffixed topics when the resolver has a binding:

plugin.outbound.whatsapp.<instance>
plugin.outbound.telegram.<instance>

Unlabelled (instance: None) plugin entries keep publishing to the legacy bare topic plugin.outbound.whatsapp / plugin.outbound.telegram for full back-compat.

CLI gate

# Run the full gauntlet without booting the daemon. Exits 0 clean,
# 1 on errors, 2 on warnings-only.
agent --config ./config --check-config

# Promote warnings to errors (CI lane).
agent --config ./config --check-config --strict

The gate scans agents.yaml, every agents.d/*.yaml, whatsapp.yaml, telegram.yaml, and google-auth.yaml. Sample failure:

credentials: FAILED with 1 error(s):
   1. agent 'ana_per_binding_example' binds credentials.telegram='ana_tg' but no such telegram instance exists (available: [])

Secrets in logs

The credential layer never logs a raw account id. Every reference is via an 8-byte sha256(account_id) fingerprint rendered as hex:

2025-04-24T16:03:42Z INFO credentials.audit agent="ana" channel="whatsapp" fp=a3f2…7c direction=outbound

The fingerprint is pinned — switching the algorithm is an explicit breaking change tracked by crates/auth/tests/fingerprint_stability.rs.

Observability

Nine Prometheus series land at /metrics:

Series	Type	Labels
`credentials_accounts_total`	gauge	`channel`
`credentials_bindings_total`	gauge	`agent`, `channel`
`channel_account_usage_total`	counter	`agent`, `channel`, `direction`, `instance`
`channel_acl_denied_total`	counter	`agent`, `channel`, `instance`
`credentials_resolve_errors_total`	counter	`channel`, `reason`
`credentials_breaker_state`	gauge	`channel`, `instance`
`credentials_boot_validation_errors_total`	counter	`kind`
`credentials_insecure_paths_total`	gauge	—
`credentials_google_token_refresh_total`	counter	`account_fp`, `outcome`

Back-compat

Configs without a credentials: block keep working — the resolver infers outbound from the single inbound_bindings entry when it is unambiguous; otherwise outbound tools are marked unbound and fall back to the legacy bare topic.
Plugin entries with instance: None stay on the legacy bare topic.
agents.<id>.google_auth still registers google_* tools for that agent; google-auth.yaml is preferred going forward.

Hot-reload (no daemon restart)

Edit agents.d/*.yaml, plugins/whatsapp.yaml, plugins/telegram.yaml, or plugins/google-auth.yaml, then trigger a reload via the loopback admin endpoint:

curl -fsSX POST http://127.0.0.1:9091/admin/credentials/reload | jq

{
  "accounts_wa": 2,
  "accounts_tg": 2,
  "accounts_google": 1,
  "warnings": [],
  "version": 4
}

The resolver runs the gauntlet against the fresh files, then atomically swaps bindings in place. Plugin tools holding Arc<…> references see the new state on their next call. Failure mode: gauntlet errors return HTTP 400 with the error list; the previous bindings stay active so a typo in YAML does not knock out the runtime.

CredentialHandles already issued to in-flight tool calls keep working: handles are by-value clones, and the resolver only mediates lookups for future calls.
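
The validate-then-swap behaviour can be pictured with a minimal sketch. Everything here (`Bindings`, `Resolver`, the toy `validate` check) is a hypothetical stand-in, not the real resolver; it only shows why a failed reload leaves the previous state active and why snapshots already held stay valid.

```rust
use std::sync::{Arc, RwLock};

/// Hypothetical snapshot of resolved bindings.
#[derive(Debug, Clone, PartialEq)]
pub struct Bindings {
    pub version: u64,
    pub accounts: Vec<String>,
}

/// Hypothetical resolver holding the active snapshot behind a lock.
pub struct Resolver {
    active: RwLock<Arc<Bindings>>,
}

impl Resolver {
    pub fn new(initial: Bindings) -> Self {
        Self { active: RwLock::new(Arc::new(initial)) }
    }

    /// Readers clone the Arc; a snapshot they hold is unaffected by later swaps.
    pub fn current(&self) -> Arc<Bindings> {
        let guard = self.active.read().unwrap();
        Arc::clone(&*guard)
    }

    /// Validate first, swap second. On error the previous snapshot stays active.
    pub fn reload(&self, accounts: Vec<String>) -> Result<u64, Vec<String>> {
        let errors = validate(&accounts);
        if !errors.is_empty() {
            return Err(errors);
        }
        let mut guard = self.active.write().unwrap();
        let next = Bindings { version: guard.version + 1, accounts };
        *guard = Arc::new(next);
        Ok(guard.version)
    }
}

/// Toy stand-in for the real gauntlet: reject blank account ids.
fn validate(accounts: &[String]) -> Vec<String> {
    accounts
        .iter()
        .filter(|a| a.trim().is_empty())
        .map(|_| "empty account id".to_string())
        .collect()
}
```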

What the reload does NOT cover

Adding a brand-new WhatsApp / Telegram instance still requires a restart for the plugin (each instance owns its own session_dir + websocket). The resolver picks up the new account, but the plugin side stays as it was until the next boot.
Removing an account leaks its breaker entry in BreakerRegistry until restart. No correctness impact.

Google client_id / client_secret rotation

Rewriting the secret files (./secrets/<agent>_google_client_id.txt, ..._client_secret.txt) is picked up automatically on the next google_* tool call: GoogleAuthClient checks the file mtime before each network hop and re-reads the files when it has advanced. No reload call is required for that case. Audit log line:

INFO credentials.audit event="google_secrets_refreshed" \
  google_*: re-read client_id/client_secret after on-disk rotation

Strict mode

agent --check-config --strict promotes warnings to errors. Two checks behave differently under strict:

Condition	Lenient	Strict
Inline `agents.<id>.google_auth` block (legacy)	warn + auto-migrate	`BuildError::LegacyInlineGoogleAuth`, fail boot
Asymmetric inbound ≠ outbound (no `<ch>_asymmetric: true`)	warn	error

Run --strict in CI to gate PRs that touch credential YAML.

Migrating

Add instance: + allow_agents: to each entry in whatsapp.yaml / telegram.yaml.
Create config/plugins/google-auth.yaml with one accounts[] per agent that needs Gmail.
Add credentials: to each agents.d/*.yaml.
Run agent --check-config --strict. Fix every listed error.
Commit.

pollers.yaml

The Phase 19 generic poller subsystem. One runner orchestrates N modules — each module is an impl Poller (gmail, rss, calendar, webhook_poll, or anything you write yourself) — and every module shares the same scheduler, lease, breaker, cursor persistence, and outbound dispatch via Phase 17 credentials.

Source: crates/poller/, crates/config/src/types/pollers.rs.

Top-level shape

pollers:
  enabled: true
  state_db: ./data/poller.db
  default_jitter_ms: 5000
  lease_ttl_factor: 2.0
  failure_alert_cooldown_secs: 3600
  breaker_threshold: 5
  jobs:
    - id: ana_leads
      kind: gmail
      agent: ana
      schedule: { every_secs: 60 }
      config:
        query: "is:unread subject:lead"
        deliver: { channel: whatsapp, to: "57300...@s.whatsapp.net" }
        message_template: |
          New lead 🚨
          {snippet}

Absent file → subsystem off (no jobs spawn, no admin endpoint).

Top-level fields

Field	Default	Purpose
`enabled`	`true`	Master switch. `false` skips everything below.
`state_db`	`./data/poller.db`	SQLite path for `poll_state` + `poll_lease`. Created if missing.
`default_jitter_ms`	`5000`	Random offset added to `next_run_at` when a job's schedule does not declare its own. Avoids thundering herd.
`lease_ttl_factor`	`2.0`	Lease TTL = `factor × interval` (min 30s). A daemon that crashes mid-tick releases the lease via expiry, and another worker takes over. Side effects are not repeated as long as your module is idempotent.
`failure_alert_cooldown_secs`	`3600`	Per-job cooldown for `failure_to` alerts. Persisted in `poll_state.last_failure_alert_at` so it survives restarts.
`breaker_threshold`	`5`	Consecutive `Transient` errors before the per-job circuit breaker opens.
`jobs`	`[]`	Per-job entries (see below).
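
The lease TTL rule in the table works out as follows. This is a small sketch of the arithmetic only; `lease_ttl` and `MIN_LEASE_TTL` are illustrative names, not the crate's API.

```rust
use std::time::Duration;

/// Floor taken from the table above: the lease never drops below 30 s.
pub const MIN_LEASE_TTL: Duration = Duration::from_secs(30);

/// Lease TTL = factor × interval, clamped to the 30 s minimum.
/// `factor` is expected to be a positive, finite number.
pub fn lease_ttl(interval: Duration, factor: f64) -> Duration {
    interval.mul_f64(factor).max(MIN_LEASE_TTL)
}
```

With the default factor of 2.0, a 60-second job holds a 120-second lease, while a 10-second job is raised to the 30-second floor.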

Per-job fields

Field	Required	Purpose
`id`	✅	Unique. Used as session key for state, metrics, admin endpoints, lease.
`kind`	✅	Discriminator. Must match a registered `Poller::kind()` (see Built-ins and Build a poller).
`agent`	✅	Agent whose Phase 17 credentials this job uses. The runner looks up the binding for whatever channel the module needs (Google for fetch, WhatsApp/Telegram for outbound, etc).
`schedule`	✅	One of `every_secs`, `cron`, `at` (see Schedules).
`config`	—	Module-specific options. Validated by `Poller::validate` at boot. Bad config rejects this job only — siblings keep loading.
`failure_to`	—	`{ channel, to }` for an alert when consecutive_errors crosses `breaker_threshold`. Optional — omit to log only.
`paused_on_boot`	`false`	Persist `paused = 1` in state at startup. Useful for staged rollouts.

Schedules

# Repeat every N seconds. Most common.
schedule: { every_secs: 60 }

# 6-field cron: sec min hour dom mon dow.
schedule:
  cron: "0 */5 * * * *"          # every 5 minutes on the boundary
  tz: "America/Bogota"           # accepted; evaluated in UTC unless cron-tz feature on
  stagger_jitter_ms: 2000        # local override for this job

# One-shot at an RFC3339 instant. After it fires the job stays paused.
schedule: { at: "2026-04-26T15:00:00Z" }

Built-ins

`kind`	Purpose	Cursor	Auth
`gmail`	Search Gmail, regex extract, dispatch	Reserved (Gmail UNREAD + mark_read does dedup)	Phase 17 Google
`rss`	RSS / Atom feeds	ETag + bounded seen-id ring	None
`webhook_poll`	Generic JSON GET / POST	Bounded seen-id ring	None / custom headers
`google_calendar`	Calendar v3 events incremental sync	`nextSyncToken`	Phase 17 Google

`gmail`

- id: ana_leads
  kind: gmail
  agent: ana
  schedule: { every_secs: 60 }
  config:
    query: "is:unread subject:(lead OR interesado)"
    newer_than: "1d"             # avoids back-filling years on first deploy
    max_per_tick: 20
    dispatch_delay_ms: 1000      # throttle between dispatches in same tick
    sender_allowlist: ["@mycompany.com"]
    extract:
      name: "Nombre:\\s*(.+)"
      phone: "Tel:\\s*(\\+?\\d+)"
    require_fields: [name, phone]
    message_template: |
      New lead 🚨 {name} — {phone}
      {snippet}
    mark_read_on_dispatch: true
    deliver: { channel: whatsapp, to: "57300...@s.whatsapp.net" }

Multiple gmail jobs for the same agent share a cached GoogleAuthClient — token refreshes happen once across all jobs.

google_* errors are classified: 401 / invalid_grant / revoked → Permanent (auto-pause), 5xx / network → Transient (backoff).

`rss`

- id: ana_blog_watch
  kind: rss
  agent: ana
  schedule: { every_secs: 600 }
  config:
    feed_url: https://example.com/feed.xml
    max_per_tick: 5
    message_template: "{title}\n{link}"
    deliver: { channel: telegram, to: "1194292426" }

ETag from the previous response is sent as If-None-Match. 304 Not Modified produces a zero-cost tick.
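
The "bounded seen-id ring" cursor used by rss and webhook_poll can be sketched as a fixed-capacity FIFO plus a set for fast membership checks. `SeenRing` is a hypothetical name, and evicting the oldest id first is an assumption about how the bound is enforced.

```rust
use std::collections::{HashSet, VecDeque};

/// Hypothetical bounded dedup ring: remembers at most `capacity` ids.
pub struct SeenRing {
    capacity: usize,
    order: VecDeque<String>,
    members: HashSet<String>,
}

impl SeenRing {
    pub fn new(capacity: usize) -> Self {
        assert!(capacity > 0, "capacity must be positive");
        Self {
            capacity,
            order: VecDeque::with_capacity(capacity),
            members: HashSet::with_capacity(capacity),
        }
    }

    /// Returns true if the id is new (the caller should dispatch it),
    /// false if it is still remembered from an earlier tick.
    pub fn insert(&mut self, id: &str) -> bool {
        if self.members.contains(id) {
            return false;
        }
        // Ring is full: forget the oldest id to make room.
        if self.order.len() == self.capacity {
            if let Some(oldest) = self.order.pop_front() {
                self.members.remove(&oldest);
            }
        }
        self.order.push_back(id.to_string());
        self.members.insert(id.to_string());
        true
    }
}
```

The bound keeps cursor storage constant; the trade-off is that an id older than the ring's capacity would be treated as new if the feed ever re-emits it.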

`webhook_poll`

- id: ana_jira_assigned
  kind: webhook_poll
  agent: ana
  schedule: { every_secs: 300 }
  config:
    url: https://company.atlassian.net/rest/api/3/search
    method: GET
    headers:
      Authorization: "Bearer ${JIRA_TOKEN}"
      Accept: "application/json"
    items_path: "issues"        # dotted path to the array; "" for root
    id_field: "id"              # field used for dedup
    max_per_tick: 10
    message_template: "[{key}] {fields}"
    deliver: { channel: telegram, to: "1194292426" }
    # SSRF guard — must opt in to hit private / loopback hosts:
    # allow_private_networks: true

401 / 403 and every other 4xx → Permanent. 5xx → Transient.

`google_calendar`

- id: ana_calendar_sync
  kind: google_calendar
  agent: ana
  schedule: { every_secs: 300 }
  config:
    calendar_id: primary
    skip_cancelled: true
    message_template: "📅 {summary} — {start}\n{html_link}"
    deliver: { channel: telegram, to: "1194292426" }

First tick captures nextSyncToken and dispatches nothing (baseline). Subsequent ticks use syncToken=... and dispatch the diff. 410 Gone (token expired) is classified Permanent — operator runs agent pollers reset <id> to re-baseline.

Multi-job per built-in

Same agent + same kind, multiple jobs — completely independent. The runner gives each its own cursor, breaker, schedule, metrics, and pause/resume controls. The GoogleAuthClient is the only thing shared (intentional, so quota and refresh costs aren't multiplied).

# Three Gmail polls for Ana, all independent
- id: ana_leads
  kind: gmail
  agent: ana
  schedule: { every_secs: 60 }
  config:
    query: "is:unread label:lead"
    deliver: { channel: whatsapp, to: "57300...@s.whatsapp.net" }
    # …

- id: ana_invoices
  kind: gmail
  agent: ana
  schedule: { every_secs: 600 }
  config:
    query: "is:unread label:invoice"
    deliver: { channel: telegram, to: "1194292426" }
    # …

- id: ana_alerts
  kind: gmail
  agent: ana
  schedule: { cron: "0 */15 * * * *" }
  config:
    query: "is:unread from:monitor@infra.com"
    deliver: { channel: telegram, to: "9876543210" }
    # …

Pause ana_invoices independently with agent pollers pause ana_invoices.

CLI

agent pollers list                 # plain table; --json for machine output
agent pollers show ana_leads      # detail of one job
agent pollers run ana_leads       # manual tick (bypasses schedule + lease)
agent pollers pause ana_invoices  # paused = 1
agent pollers resume ana_invoices
agent pollers reset ana_calendar_sync --yes  # destructive; clears cursor
agent pollers reload              # re-read pollers.yaml + diff

The daemon must be running (CLI hits the loopback admin server at 127.0.0.1:9091).

Admin endpoints

GET  /admin/pollers
GET  /admin/pollers/<id>
POST /admin/pollers/<id>/run
POST /admin/pollers/<id>/pause
POST /admin/pollers/<id>/resume
POST /admin/pollers/<id>/reset
POST /admin/pollers/reload

reload returns a ReloadPlan JSON: { add, replace, remove, keep }. Validation runs across every job in the new file before any task is touched — a typo never knocks healthy siblings offline.
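
One way to picture the four buckets of the plan is a plain diff between the running jobs and the jobs in the new file. The sketch below is an assumption about the classification rule (same id and identical config means keep, same id and different config means replace); `plan_reload` and the string-keyed maps are hypothetical.

```rust
use std::collections::BTreeMap;

/// Mirrors the `{ add, replace, remove, keep }` shape; each bucket holds job ids.
#[derive(Debug, Default, PartialEq)]
pub struct ReloadPlan {
    pub add: Vec<String>,
    pub replace: Vec<String>,
    pub remove: Vec<String>,
    pub keep: Vec<String>,
}

/// Both maps go from job id to a serialized form of the job config.
pub fn plan_reload(
    old: &BTreeMap<String, String>,
    new: &BTreeMap<String, String>,
) -> ReloadPlan {
    let mut plan = ReloadPlan::default();
    for (id, cfg) in new {
        match old.get(id) {
            None => plan.add.push(id.clone()),
            Some(prev) if prev == cfg => plan.keep.push(id.clone()),
            Some(_) => plan.replace.push(id.clone()),
        }
    }
    // Anything running that is absent from the new file gets removed.
    for id in old.keys() {
        if !new.contains_key(id) {
            plan.remove.push(id.clone());
        }
    }
    plan
}
```

Computing the whole plan before touching any task is what lets validation reject a bad file without disturbing jobs in the keep bucket.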

Agent tools

When the poller subsystem is up, every agent gets six LLM-callable tools registered on its ToolRegistry:

Tool	Effect
`pollers_list`	List every job + status
`pollers_show`	Inspect one job
`pollers_run`	Trigger a tick out-of-band
`pollers_pause`	Set `paused = 1`
`pollers_resume`	Set `paused = 0`
`pollers_reset`	Wipe cursor + errors (destructive)

Each registered Poller impl can also expose per-kind custom tools via Poller::custom_tools() — gmail ships gmail_count_unread out of the box. See Build a poller.

Create / delete are intentionally not exposed: prompt-injection could plant a webhook_poll aimed at internal infra. Operators own pollers.yaml + agent pollers reload.

Failure-destination

- id: ana_leads
  kind: gmail
  # …
  failure_to:
    channel: telegram
    to: "1194292426"     # alerts on the operator's chat

When the per-job circuit breaker trips (consecutive_errors >= breaker_threshold), the runner publishes a text message to the configured channel (resolved via Phase 17 just like the happy path) and records the timestamp for cooldown gating. The cooldown is the global failure_alert_cooldown_secs value; a per-job override is planned for a future revision.
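
The gating logic amounts to two checks, sketched below under the assumption that timestamps are Unix seconds and that a job that has never alerted is always eligible. `should_alert` is a hypothetical name.

```rust
/// Decide whether a failure alert should be sent on this tick.
/// `last_alert_at` and `now` are Unix timestamps in seconds.
pub fn should_alert(
    consecutive_errors: u32,
    breaker_threshold: u32,
    last_alert_at: Option<u64>,
    now: u64,
    cooldown_secs: u64,
) -> bool {
    // The breaker has not tripped yet: nothing to report.
    if consecutive_errors < breaker_threshold {
        return false;
    }
    match last_alert_at {
        None => true,
        // Suppress repeats until the cooldown window has fully elapsed.
        Some(prev) => now.saturating_sub(prev) >= cooldown_secs,
    }
}
```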

Observability

Seven Prometheus series exposed under /metrics:

Series	Type	Labels
`poller_ticks_total`	counter	`kind`, `agent`, `job_id`, `status={ok,transient,permanent,skipped}`
`poller_latency_ms`	histogram	`kind`, `agent`, `job_id`
`poller_items_seen_total`	counter	`kind`, `agent`, `job_id`
`poller_items_dispatched_total`	counter	`kind`, `agent`, `job_id`
`poller_consecutive_errors`	gauge	`job_id`
`poller_breaker_state`	gauge	`job_id` (`0=closed`, `1=half-open`, `2=open`)
`poller_lease_takeovers_total`	counter	`job_id`

Migrating from `gmail-poller.yaml`

The legacy crate nexo-plugin-gmail-poller keeps its YAML schema but no longer drives its own loop. On boot the wizard auto-translates every legacy job into a kind: gmail entry, folds it into cfg.pollers.jobs, and logs a deprecation warning. Explicit entries in pollers.yaml win on id collision, so a manual migration is never clobbered.

To migrate cleanly:

Run agent --check-config to print every translated id.
Copy each into config/pollers.yaml under pollers.jobs, adjusting the agent: field if the legacy agent_id was inferred.
Delete config/plugins/gmail-poller.yaml.

Anthropic / Claude

Native Anthropic client with multiple authentication paths: static API key, setup tokens, full OAuth PKCE subscription flow, or automatic import from the local Claude Code CLI.

Source: crates/llm/src/anthropic.rs, crates/llm/src/anthropic_auth.rs. Phase 15 added the subscription flow end-to-end.

Configuration

# config/llm.yaml
providers:
  anthropic:
    api_key: ${ANTHROPIC_API_KEY:-}
    base_url: https://api.anthropic.com
    rate_limit:
      requests_per_second: 2.0
    auth:
      mode: oauth_bundle
      bundle: ./secrets/anthropic_oauth.json

Per-agent selection:

model:
  provider: anthropic
  model: claude-haiku-4-5

Authentication modes

`auth.mode`	Credential	Header
`static`	`api_key` (`sk-ant-…`)	`x-api-key: <key>`
`setup_token`	`sk-ant-oat01-…` (min 80 chars)	`Authorization: Bearer <key>` + `anthropic-beta: oauth-2025-04-20`
`oauth_bundle`	`{access_token, refresh_token, expires_at}` JSON	`Authorization: Bearer <access_token>`
`auto`	tries each source in the resolution order shown below	n/a

`auto` resolution order

Used when auth.mode: auto or omitted:

flowchart TD
    START[anthropic client build] --> B1{oauth_bundle<br/>file exists?}
    B1 -->|yes| USE1[use OAuth bundle]
    B1 -->|no| B2{Claude Code CLI<br/>credentials found?}
    B2 -->|yes| USE2[import from<br/>~/.claude/.credentials.json]
    B2 -->|no| B3{setup_token<br/>file exists?}
    B3 -->|yes| USE3[use setup token]
    B3 -->|no| B4{api_key<br/>set?}
    B4 -->|yes| USE4[use static key]
    B4 -->|no| FAIL([fail: no credentials])
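
The same decision chain, written as a first-match-wins function. The `Probe` struct and `resolve_auto` are hypothetical names for illustration; only the ordering is taken from the flowchart.

```rust
#[derive(Debug, PartialEq, Eq)]
pub enum AuthSource {
    OAuthBundle,
    ClaudeCodeCli,
    SetupToken,
    StaticKey,
}

/// Hypothetical result of probing the host for each credential source.
pub struct Probe {
    pub oauth_bundle_exists: bool,
    pub cli_credentials_found: bool,
    pub setup_token_exists: bool,
    pub api_key_set: bool,
}

/// First match wins, in the order drawn in the flowchart.
pub fn resolve_auto(p: &Probe) -> Result<AuthSource, &'static str> {
    if p.oauth_bundle_exists {
        Ok(AuthSource::OAuthBundle)
    } else if p.cli_credentials_found {
        Ok(AuthSource::ClaudeCodeCli)
    } else if p.setup_token_exists {
        Ok(AuthSource::SetupToken)
    } else if p.api_key_set {
        Ok(AuthSource::StaticKey)
    } else {
        Err("no credentials")
    }
}
```

Note that a static key is the last resort: any of the three subscription-style sources, if present, takes precedence over it.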

OAuth bundle

The wizard runs a PKCE flow in the browser and writes the bundle to ./secrets/anthropic_oauth.json:

{
  "access_token": "...",
  "refresh_token": "...",
  "expires_at": "2026-05-01T12:00:00Z"
}

Refresh endpoint: https://console.anthropic.com/v1/oauth/token
Refresh cadence: a background task POSTs grant_type=refresh_token 60 seconds before expires_at
Concurrency: all refreshes serialize behind a mutex
Shared OAuth client id: 9d1c250a-e61b-44d9-88ed-5944d1962f5e
Stale-token handling: a 401 mid-flight marks the token stale so the next refresh fires immediately instead of waiting for the expiry window
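
The refresh trigger described above reduces to one predicate. This is a sketch assuming Unix-second timestamps; `needs_refresh` and `REFRESH_LEAD_SECS` are illustrative names, and the mutex that serializes refreshes is left out.

```rust
/// Refresh this many seconds before the token expires.
pub const REFRESH_LEAD_SECS: u64 = 60;

/// True when a refresh should fire now: either a 401 marked the token
/// stale, or the current time is inside the 60 s window before expiry.
pub fn needs_refresh(now: u64, expires_at: u64, stale: bool) -> bool {
    stale || now.saturating_add(REFRESH_LEAD_SECS) >= expires_at
}
```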

CLI credentials import

If you're already running Claude Code CLI on the same host, the client auto-detects and imports ~/.claude/.credentials.json. Zero config — if it exists and is valid, it's used.

Tool calling

Native Anthropic shape:

Tool definitions: {name, description, input_schema}
Tool invocation: tool_use blocks with id, name, input
Tool result: tool_result blocks correlated via tool_use_id

Streaming uses native SSE; a dedicated parser in crates/llm/src/stream.rs handles message_start, content_block_*, and message_delta events.

Error classification

Response	Mapping	Behavior
429	`LlmError::RateLimit { retry_after_ms }` (fallback 60s)	Retried
401 / 403	`LlmError::CredentialInvalid` with context (API vs OAuth)	Marks OAuth token stale; fails fast so the operator sees it
5xx	`LlmError::ServerError`	Retried
Other 4xx	`LlmError::Other`	Fail fast
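
The table can be read as a single match on the HTTP status. The enum below is a simplified stand-in for the real `LlmError` (it drops the API-vs-OAuth context), and treating every status not listed in the table as `Other` is an assumption. The function expects a non-success status.

```rust
/// Simplified stand-in for the real error type.
#[derive(Debug, PartialEq, Eq)]
pub enum LlmErrorSketch {
    RateLimit { retry_after_ms: u64 },
    CredentialInvalid,
    ServerError,
    Other,
}

/// Fallback wait when a 429 carries no usable retry hint (60 s per the table).
pub const RATE_LIMIT_FALLBACK_MS: u64 = 60_000;

/// Map a non-2xx status (plus an optional retry-after value in seconds).
pub fn classify(status: u16, retry_after_secs: Option<u64>) -> LlmErrorSketch {
    match status {
        429 => LlmErrorSketch::RateLimit {
            retry_after_ms: retry_after_secs
                .map(|s| s.saturating_mul(1000))
                .unwrap_or(RATE_LIMIT_FALLBACK_MS),
        },
        401 | 403 => LlmErrorSketch::CredentialInvalid,
        500..=599 => LlmErrorSketch::ServerError,
        _ => LlmErrorSketch::Other,
    }
}

/// Only rate limits and server errors are retried; the rest fail fast.
pub fn is_retryable(err: &LlmErrorSketch) -> bool {
    matches!(err, LlmErrorSketch::RateLimit { .. } | LlmErrorSketch::ServerError)
}
```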

OAuth subscription request shape

Anthropic gates Opus 4.x and Sonnet 4.x behind a Claude-Code identity claim when the request is authenticated with a Bearer token (setup token or OAuth bundle). Without the claim, only Haiku passes — every other model returns a 4xx that surfaces as a vague "no quota" error.

When AnthropicAuth::is_subscription() is true (SetupToken or OAuth variants), the client adds:

Header anthropic-beta: claude-code-20250219, oauth-2025-04-20, fine-grained-tool-streaming-2025-05-14 (cache betas merged in on top).
Header anthropic-dangerous-direct-browser-access: true.
Header User-Agent: claude-cli/<version>.
Header x-app: cli.
A first system block whose text is exactly: You are Claude Code, Anthropic's official CLI for Claude.

The user's system_prompt (and any structured system_blocks) follow the spoof block, preserving the original instructions verbatim.

User-Agent version: defaults to the value of CLAUDE_CLI_DEFAULT_VERSION in crates/llm/src/anthropic_auth.rs. Operators can override it without rebuilding by exporting:

export NEXO_CLAUDE_CLI_VERSION=2.1.99

The API-key path is unchanged — none of these headers or the spoof block are added when AnthropicAuth::ApiKey is in use.

Mirrors OpenClaw's anthropic-transport-stream.ts:558-641. Reference implementation lives in research/src/agents/.

Supported features

Chat completions ✅
Tool calling ✅
Streaming (SSE) ✅
Multimodal (images) ✅
Prompt caching ✅ (via Anthropic beta headers)
Extended thinking ✅ (model-dependent)
OAuth subscription (Pro / Max plans) ✅ — Opus / Sonnet require the Claude-Code request shape documented above.

Prompt Cache Break Diagnostics (Phase 77.4)

Global detector (all providers/models) in crates/core/src/agent/llm_behavior.rs:

After each parsed response, the client compares cache_read_input_tokens against the previous turn in the same session.
If cache-read drops by more than 50%, it emits a warning log: llm.cache_break.
The log includes a suspected_breaker hint: provider_swap, model_swap, system_prompt_mutation, or unknown.

Anthropic-specific enrichment in crates/llm/src/anthropic.rs:

Emits anthropic.cache_break with the same >50% drop trigger.
The log includes a suspected_breaker hint based on request drift: model_swap, system_prompt_mutation, beta_header_drift, or unknown.
First turn is baseline only (no comparison/log).
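
The ">50% drop" trigger can be sketched as a tiny per-session detector. `CacheBreakDetector` is a hypothetical name, and skipping the comparison when the previous turn read zero cached tokens is an assumption (there is nothing to drop from).

```rust
/// Hypothetical per-session detector for cache-read drops.
#[derive(Default)]
pub struct CacheBreakDetector {
    previous: Option<u64>,
}

impl CacheBreakDetector {
    /// Feed the `cache_read_input_tokens` of each parsed response.
    /// Returns true when the value fell by more than 50% versus the
    /// previous turn. The first turn only records a baseline.
    pub fn observe(&mut self, cache_read_input_tokens: u64) -> bool {
        let broke = match self.previous {
            // drop > 50%  <=>  current < previous / 2  <=>  2 * current < previous
            Some(prev) if prev > 0 => {
                (cache_read_input_tokens as u128) * 2 < prev as u128
            }
            _ => false,
        };
        self.previous = Some(cache_read_input_tokens);
        broke
    }
}
```

A drop of exactly 50% does not fire, matching the "more than 50%" wording.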

Common mistakes

Setup-token string under 80 chars. The setup-token validator refuses it at parse time. Make sure you pasted the full string.
api_key + oauth_bundle both set. The auth mode wins. The static key is kept only as a fallback the auto-resolver may pick up if the bundle is missing.
Claude Code CLI credentials being used unintentionally. If auto mode is on and the Claude Code CLI is installed on the host, that path wins before api_key. Set auth.mode: static to pin the static key.

OpenAI-compatible

Client for OpenAI itself and for any upstream that speaks the same wire: Ollama, Groq, OpenRouter, LM Studio, vLLM, Azure OpenAI, or your own proxy.

Source: crates/llm/src/openai_compat.rs.

Configuration

# config/llm.yaml
providers:
  openai:
    api_key: ${OPENAI_API_KEY:-}
    base_url: https://api.openai.com/v1
    rate_limit:
      requests_per_second: 2.0

Per-agent:

model:
  provider: openai
  model: gpt-4o

Known-working upstreams

Point base_url at any of these and it works out of the box (Azure OpenAI is the exception; see Common mistakes):

Upstream	`base_url`
OpenAI	`https://api.openai.com/v1`
Ollama	`http://localhost:11434/v1`
Groq	`https://api.groq.com/openai/v1`
OpenRouter	`https://openrouter.ai/api/v1`
LM Studio	`http://localhost:1234/v1`
vLLM	`http://<host>:<port>/v1`
Azure OpenAI	Azure resource URL (watch for differences)
MiniMax (compat mode)	`https://api.minimax.io`

Authentication

Single mode: static API key sent as Authorization: Bearer <key>. Some upstreams ignore the key entirely (Ollama, local vLLM) — supply any non-empty string to satisfy the config validator.

Features & gaps

Feature	Status
Chat completions	✅
Tool calling	✅ (OpenAI function-calling shape)
Streaming	✅
`tool_choice: auto \| required \| none \| {type:function}`	✅
JSON mode / structured outputs	upstream-dependent
Multimodal	upstream-dependent
Embeddings	supported for OpenAI proper; other upstreams may vary

Feature gating when the upstream lacks support: we do not pre-probe features — a call that requires a feature the upstream doesn't speak will fail with the upstream's own error (typically a 400). The error bubbles up as LlmError::Other and does not retry, so you notice quickly.

Error classification

Response	Mapping	Behavior
429	`LlmError::RateLimit` (fallback 30s)	Retried
5xx	`LlmError::ServerError`	Retried
Other 4xx	`LlmError::Other`	Fail fast

Common mistakes

Trailing slash in base_url. Some upstreams are lenient, some are not. Stick to the form shown in the table.
Using Azure OpenAI without the deployment path. Azure requires an extra segment (/openai/deployments/<name>/chat/completions) that the vanilla OpenAI path doesn't. Currently not supported out of the box; use a proxy or a custom provider if you need Azure.
Relying on JSON mode everywhere. Many local servers don't enforce schemas. Validate the response yourself when using Ollama / LM Studio for critical tool args.

DeepSeek

Connector for DeepSeek's hosted models. The API is OpenAI-compatible end to end (same /v1/chat/completions shape, same SSE streaming, same Bearer auth) so the connector is a thin factory that wraps OpenAiClient with DeepSeek's default endpoint.

Source: crates/llm/src/deepseek.rs.

Configuration

# config/llm.yaml
providers:
  deepseek:
    api_key: ${DEEPSEEK_API_KEY}
    # base_url defaults to https://api.deepseek.com/v1 when blank.
    # Override only for self-hosted gateways or testing fixtures.
    base_url: ""
    rate_limit:
      requests_per_second: 2.0
      quota_alert_threshold: 100000

Pin the agent to it:

agents:
  - id: ana
    model:
      provider: deepseek
      model: deepseek-chat

Models

Model id	Use case
`deepseek-chat`	General-purpose. Supports tool calling.
`deepseek-reasoner`	Long-form reasoning. No tool calling in current API revision.

deepseek-reasoner agents must therefore leave allowed_tools empty (or list only tools the agent never plans to invoke). Tool calls fired against the reasoner endpoint return an error from upstream.

Streaming

Identical to OpenAI's SSE format, so OpenAiClient::chat_stream parses it without per-provider code. nexo_llm_stream_ttft_seconds and nexo_llm_stream_chunks_total Prometheus series labelled with provider="deepseek" show up automatically.

Tool calling

deepseek-chat follows OpenAI's tool-calling spec verbatim. JSON arguments deserialise the same way; parallel_tool_calls is honoured.

Rate limits

DeepSeek returns standard 429 with a retry-after header. The existing retry plumbing (crates/llm/src/retry.rs) consumes that header so 429s back off cleanly without touching the connector.

Quota / cost

DeepSeek's pricing is per-1M-tokens; the TokenUsage returned by each ChatResponse is forwarded to the standard agent_llm_tokens_total counter (labels: provider="deepseek", model, usage_kind).

Known limitations

No native embeddings client: DeepSeek does not currently publish an embeddings endpoint. Use a different provider for embedding_model if your agent needs vector search.
Reasoner tool-call gap: see Models. Guard against it by leaving allowed_tools: [] on agents pinned to deepseek-reasoner.
Cache awareness: DeepSeek's KV-cache hit information is surfaced through the same cache_usage field that the OpenAI client reports.

Multi-instance providers + secret-backed keys

Phase 82.10.s ships a long-overdue split between factory type (the crates/llm/src/<id>.rs client the daemon dispatches against) and provider instance (the YAML key under llm.yaml.providers.*).

Phase 82.10.t adds dynamic model discovery via /v1/models so SPA wizards show the live list a key actually has access to instead of a hardcoded catalog that drifts.

Why

Pre-82.10.s, providers.minimax was both the YAML id AND the factory name — there was exactly one MiniMax per daemon. Two problems:

Two microapps in the same daemon couldn't have separate MiniMax keys. The key was an env var (MINIMAX_API_KEY), and env vars are process-global. Microapp B would overwrite microapp A's key.
A single tenant couldn't run two MiniMaxes with different keys for billing isolation between their own clients.

Post-82.10.s, the YAML can name as many instances of the same factory as the operator wants, each with its own key:

providers:
  # Legacy single-instance path still works (factory_type omitted →
  # the YAML key IS the factory id).
  minimax:
    api_key: ${MINIMAX_API_KEY}
    base_url: https://api.minimax.chat/v1

  # Multi-instance: name the instance whatever you want, point
  # factory_type at a registered factory, supply a per-instance
  # secret reference instead of a shared env var.
  minimax-cliente-a:
    factory_type: minimax
    base_url: https://api.minimax.chat/v1
    api_key_secret_id: LLM_MINIMAX_CLIENTE_A

  minimax-cliente-b:
    factory_type: minimax
    base_url: https://api.minimax.chat/v1
    api_key_secret_id: LLM_MINIMAX_CLIENTE_B

Agents then point at the instance id, not the factory:

agents:
  ana:
    model:
      provider: minimax-cliente-a   # ← instance id
      model: MiniMax-M2.5
  pedro:
    model:
      provider: minimax-cliente-b   # ← different instance, different key
      model: MiniMax-M2.5

Each agent dispatches against its own key. Quota / rate-limit / billing all separate.

API key sources — exactly one of three

LlmProviderConfig accepts the API key from one of three sources, and the upsert RPC + boot resolver enforce exactly one:

Source	Where it lives	When to use
`api_key` (inline)	YAML literal — usually `${ENV_VAR}`	Dev / single-tenant single-instance
`api_key_secret_id`	Reference to `<state_root>/secrets/<ID>.txt` mode 0600	Production multi-instance
`api_key_env` (legacy)	Env var name — daemon resolves at boot	Pre-82.10.s back-compat

Setting two of the above at once → loud boot failure with the provider id and the conflicting sources listed.
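
A minimal sketch of the exactly-one check follows. `KeySources` and `check_key_sources` are hypothetical names, and the error strings are illustrative rather than the daemon's actual wording.

```rust
/// Hypothetical view of the three key-source fields on one provider entry.
#[derive(Default)]
pub struct KeySources {
    pub api_key: Option<String>,
    pub api_key_secret_id: Option<String>,
    pub api_key_env: Option<String>,
}

/// Returns the name of the single configured source, or an error that
/// names the provider and lists what is missing or conflicting.
pub fn check_key_sources(provider_id: &str, s: &KeySources) -> Result<&'static str, String> {
    let mut set: Vec<&'static str> = Vec::new();
    if s.api_key.is_some() {
        set.push("api_key");
    }
    if s.api_key_secret_id.is_some() {
        set.push("api_key_secret_id");
    }
    if s.api_key_env.is_some() {
        set.push("api_key_env");
    }
    match set.as_slice() {
        [one] => Ok(*one),
        [] => Err(format!("{}: no API key source", provider_id)),
        many => Err(format!(
            "{}: conflicting API key sources: {}",
            provider_id,
            many.join(", ")
        )),
    }
}
```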

Boot resolution

After AppConfig::load, main.rs walks every provider instance (global + tenant-scoped) and:

Resolves api_key via LlmConfig::resolve_all_keys(&secrets).
- Errors are collected per instance (not fail-fast), so the operator sees every broken provider in one diagnostic instead of a fix-restart loop.
- FsSecretsStore implements SecretsSource (sync read), so config-load reads <secrets_dir>/<id>.txt without async machinery.
Validates factory_type via LlmRegistry::validate_config.
- Each instance's resolved factory id (explicit factory_type or fallback to the YAML key) MUST be a registered factory.
- Aggregates errors the same way; loud boot fail beats a runtime LLM dispatch error mid-traffic.

Sample boot failure:

Error: LLM provider API-key resolution failed for 2 instance(s):
  · minimax-cliente-a: secret 'cliente-a-key' read failed: No such file
  · openai: no API key source (set `api_key` inline or `api_key_secret_id`)

Admin RPC — `nexo/admin/llm_providers/upsert`

The admin handler now accepts:

{
  "id": "minimax-cliente-a",
  "factory_type": "minimax",                  // optional — defaults to id
  "base_url": "https://api.minimax.chat/v1",
  "api_key_secret_value": "sk-...",           // write-through (audit-redacted)
  // mutually exclusive with:
  //   "api_key_env": "MINIMAX_API_KEY"       // legacy
  //   "api_key_secret_id": "PRE_STAGED_ID"   // pre-staged via secrets/write
  "tenant_id": "acme"                         // optional tenant scope
}

When api_key_secret_value is supplied, the daemon:

Stamps the value into the SecretsStore under a derived id (LLM_<INSTANCE_UPPERCASE>) — atomic file write mode 0600.
Sets api_key_secret_id: LLM_<INSTANCE> on the YAML.
Triggers reload signal so the rebuilt LlmRegistry picks up the key without daemon restart.

Audit redactor masks api_key_secret_value as <redacted> so the cleartext only persists in the SecretsStore, never on disk in admin_audit.db. api_key_secret_id (a name, not a value) stays visible for diagnostics.
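
The derived id can be sketched as a simple normalisation of the instance id. The exact rule is not spelled out here; mapping every non-alphanumeric character to an underscore is an assumption inferred from the sample pair minimax-cliente-a / LLM_MINIMAX_CLIENTE_A, and `derived_secret_id` is a hypothetical name.

```rust
/// Hypothetical derivation of the secret id for a provider instance:
/// uppercase ASCII letters and digits, everything else becomes '_'.
pub fn derived_secret_id(instance_id: &str) -> String {
    let slug: String = instance_id
        .chars()
        .map(|c| {
            if c.is_ascii_alphanumeric() {
                c.to_ascii_uppercase()
            } else {
                '_'
            }
        })
        .collect();
    format!("LLM_{}", slug)
}
```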

Admin RPC — `nexo/admin/llm_providers/catalog`

Returns the list of registered factories with their default base URL + env var + curated model list. SPA wizards use this to render strict provider/model dropdowns without a hardcoded catalog drifting from the framework. Plugin-registered remote providers (Phase 81.25) appear here too as long as they registered before bootstrap.

Admin RPC — `nexo/admin/llm_providers/probe`

Phase 82.10.t extended the probe response with a model_names field parsed from data[].id of an OpenAI-compat /v1/models payload:

{
  "ok": true,
  "status": 200,
  "latency_ms": 142,
  "model_count": 47,
  "model_names": ["gpt-4o", "gpt-4o-mini", "gpt-4-turbo", "..."]
}

model_names is null when:

The provider doesn't expose /v1/models (Anthropic, Gemini).
The body isn't OpenAI-compat shaped.
No data[].id strings could be extracted.

UI fallback in that case: the static factory catalog from llm_providers/catalog. Names are capped at 200 to bound RPC payload against pathological providers returning thousands of variants.

Frontend behaviour (agent-creator microapp ≥ 0.0.44)

The Agents page surfaces both flows:

Top section — list of configured LLM instances. "Nueva instancia" CTA opens a modal:
- Factory dropdown (from llm_providers/catalog).
- Instance id (validates slug, rejects duplicates client-side).
- Base URL auto-filled from the catalog, editable.
- API key (password input) — write-through via api_key_secret_value.
Edit modal per agent — provider dropdown lists the configured instances (minimax-cliente-a, minimax-cliente-b), not the factory types. Model dropdown:
- Probes the instance's /v1/models after open.
- Live names → green "✓ N modelos en vivo" indicator.
- Probe failure / non-OpenAI shape → static catalog fallback with a hint explaining the provider doesn't expose /v1/models.
- 60 s in-memory cache per instance; concurrent calls deduped.

Edge cases — defensive design notes

Empty factory_type: "" is treated as absent (defensive against YAML typos that would otherwise match an empty-string factory).
Empty secret value in the SecretsStore is treated as NotFound (an operator's echo "" > file doesn't half-succeed).
Same factory_type + same key across instances is allowed — the operator owns fair-share quota when they explicitly clone.
Tenant-scoped instance + global instance with the same id: the tenant-scoped instance already wins (Phase 83.8.12); this layer doesn't change that.
Plugin-registered remote providers appear in llm_providers/catalog after their register call. The catalog snapshot used by the admin RPC is taken at AdminRpcBootstrap::build time, so providers registered after that don't show up until restart.

Migration from legacy YAML

No migration needed: YAML files without factory_type keep working under the back-compat path (the instance id IS the factory id). Operators only touch their YAML when they want a second instance of the same factory with a different key.

Credential schema (Phase 82.10.u)

Each LLM provider factory declares its credential field schema. The admin RPC llm_providers/catalog surfaces it; the SPA wizard renders one input per descriptor; the upsert handler validates the operator's payload against the same schema before persistence. Single source of truth — no drift.

Why

Pre-82.10.u, the operator's llm_providers/upsert always boiled down to a single api_key. Two problems:

MiniMax also needs group_id — without it, /v1/models returns the provider's empty default list and the SPA shows nothing. There was no way to surface the field through the admin RPC.
Anthropic supports OAuth — but the wire shape couldn't express "auth_mode dropdown + setup_token field that only appears when mode=setup_token + bundle JSON paste alternative".

Phase 82.10.u introduces a declarative CredentialFieldDescriptor shape every factory advertises. The SPA renders dynamically. The handler validates the same schema server-side.

Schema shape

pub struct CredentialFieldDescriptor {
    pub name: String,           // yaml key + secret-store id suffix
    pub label: String,          // operator-facing
    pub kind: FieldKind,        // Text | Password | Select { options }
    pub required: bool,
    pub secret: bool,           // → SecretsStore vs yaml inline
    pub default: Option<String>,
    pub help: Option<String>,
    pub validation: Option<FieldValidation>,  // Regex | Length
    pub depends_on: Option<DependsOn>,        // visibility predicate
}

Persistence rule

secret == true → value lands in the SecretsStore under derived id LLM_<INSTANCE>_<FIELD_UPPER>. Yaml carries only <field>_secret_id reference.
secret == false → value lands inline in llm.yaml.providers.<id>.<field>.
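
As a rough illustration of the id derivation, the sketch below builds LLM_<INSTANCE>_<FIELD_UPPER> from an instance id and a field name. The helper names derive_secret_id and yaml_reference_key are hypothetical, and the normalisation (non-alphanumerics become underscores) is an assumption inferred from the LLM_ANTHROPIC_PERSONAL_OAUTH_BUNDLE example elsewhere on this page, not the framework's confirmed logic.

```rust
/// Hypothetical helper: derive the SecretsStore id for a secret field.
/// Assumption: every non-alphanumeric character becomes `_` and the
/// result is upper-cased.
pub fn derive_secret_id(instance_id: &str, field: &str) -> String {
    let norm = |s: &str| -> String {
        s.chars()
            .map(|c| if c.is_ascii_alphanumeric() { c.to_ascii_uppercase() } else { '_' })
            .collect()
    };
    format!("LLM_{}_{}", norm(instance_id), norm(field))
}

/// Hypothetical helper: the YAML key that references the stored secret.
pub fn yaml_reference_key(field: &str) -> String {
    format!("{field}_secret_id")
}
```

With these assumptions, the instance anthropic-personal and the field oauth_bundle map to the secret id quoted in the oauth_finish response.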

Validation

Rule	Server check	SPA check
`required` + `depends_on.satisfied`	`MISSING_FIELD` if absent	red border on blur
`Regex { pattern, hint }`	`INVALID_FORMAT` with hint	hint shown inline
`Length { min, max }`	`INVALID_FORMAT` length n not in [min,max]	char count
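
A minimal sketch of the server-side check for a required field with a Length rule, assuming simplified types. FieldError, LengthRule and validate_length are illustrative names, not the crate's API, and the depends_on predicate is left out.

```rust
/// Illustrative error type mirroring the MISSING_FIELD / INVALID_FORMAT codes.
#[derive(Debug, PartialEq)]
pub enum FieldError {
    MissingField { field: String },
    InvalidFormat { field: String, hint: String },
}

/// Illustrative stand-in for the `Length { min, max }` rule.
pub struct LengthRule {
    pub min: usize,
    pub max: usize,
}

/// Check presence first, then the length bounds (counted in characters).
pub fn validate_length(
    field: &str,
    required: bool,
    value: Option<&str>,
    rule: &LengthRule,
) -> Result<(), FieldError> {
    let value = match value {
        Some(v) if !v.is_empty() => v,
        // Absent or empty: an error only when the field is required.
        _ if required => {
            return Err(FieldError::MissingField { field: field.to_string() })
        }
        _ => return Ok(()),
    };
    let n = value.chars().count();
    if n < rule.min || n > rule.max {
        return Err(FieldError::InvalidFormat {
            field: field.to_string(),
            hint: format!("length {n} not in [{},{}]", rule.min, rule.max),
        });
    }
    Ok(())
}
```

Because the SPA renders from the same descriptor, the hint string can be shown inline without a second source of truth.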

Per-factory schemas

Factory	Fields	Auth modes	`/v1/models`?
`minimax`	api_key (secret) · group_id (10-20 digits) · region (select) · key_kind (api/plan)	api_key, oauth_device_code	✓
`anthropic`	auth_mode (select) · api_key (depends_on auth_mode = api_key) · setup_token (depends_on auth_mode = setup_token)	api_key, setup_token, oauth_auth_code, oauth_bundle_import	✗ (static catalog)
`openai` / `deepseek` / `gemini`	api_key	api_key	✓ / ✓ / via Gemini-specific path

Wire flow — operator creates a MiniMax instance

// 1) GET catalog
"nexo/admin/llm_providers/catalog" → {
  "providers": [{
    "id": "minimax",
    "credential_schema": [
      {"name":"api_key","kind":{"type":"password"},"required":true,"secret":true,...},
      {"name":"group_id","kind":{"type":"text"},"required":true,"secret":false,
       "validation":{"type":"regex","pattern":"^[0-9]{10,20}$","hint":"10-20 digits"}},
      {"name":"region","kind":{"type":"select","options":[...]},"default":"global"},
      {"name":"key_kind","kind":{"type":"select","options":[...]},"default":"api"}
    ],
    "supported_auth_modes":["api_key","oauth_device_code"],
    "supports_models_probe": true
  }, ...]
}

// 2) Validate without persisting (Phase 82.10.u probe_draft)
"nexo/admin/llm_providers/probe_draft" {
  "factory_type":"minimax",
  "base_url":"https://api.minimax.io/v1",
  "auth_mode":"api_key",
  "fields":{"api_key":"sk-...", "group_id":"1234567890123", "region":"global", "key_kind":"api"}
}
→ { "ok":true, "status":200, "model_count":12, "model_names":[...] }

// 3) Persist
"nexo/admin/llm_providers/upsert" {
  "id":"minimax-cliente-a",
  "factory_type":"minimax",
  "base_url":"https://api.minimax.io/v1",
  "auth_mode":"api_key",
  "fields":{ /* same as probe */ }
}
→ summary

Error taxonomy

LlmProviderError rides in AdminRpcError::data so the SPA discriminates by code:

pub enum LlmProviderError {
    // Payload field types are elided in this listing.
    MissingField { field },
    UnknownField { field },
    InvalidFormat { field, hint },
    InvalidAuthMode { factory, mode },
    SessionExpired,           // OAuth: TTL elapsed
    SessionNotFound,          // OAuth: never issued or replayed
    OAuthExchangeFailed { upstream_status, message },
    ProbeFailed { upstream_status, message },
    YamlWriteFailed { detail },
    SecretWriteFailed { detail },
}

Audit

Schema-driven payloads are walked by redact_secret_keys so any field whose name matches api_key, setup_token, access_token, refresh_token, oauth_bundle, password, token, secret is masked as <redacted>. Non-secret identifiers (group_id, region, key_kind) stay literal in the audit log for diagnostics.
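
The masking rule can be sketched over a flat field map. This is a simplified stand-in for redact_secret_keys, which walks full payloads; the function name redact_fields and the exact-match comparison are assumptions made for the example.

```rust
use std::collections::BTreeMap;

/// Field names that are masked in the audit log (list taken from the text above).
const SECRET_KEYS: [&str; 8] = [
    "api_key", "setup_token", "access_token", "refresh_token",
    "oauth_bundle", "password", "token", "secret",
];

/// Return a copy of `fields` that is safe to write to the audit log.
/// Assumption: names are compared by exact match.
pub fn redact_fields(fields: &BTreeMap<String, String>) -> BTreeMap<String, String> {
    fields
        .iter()
        .map(|(name, value)| {
            let masked = SECRET_KEYS.contains(&name.as_str());
            let shown = if masked { "<redacted>".to_string() } else { value.clone() };
            (name.clone(), shown)
        })
        .collect()
}
```

Identifiers such as group_id pass through untouched, which is what keeps the audit log useful for diagnostics.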

Back-compat

Pre-82.10.u microapps that send api_key_env / api_key_secret_value keep working — the handler picks the legacy path when fields is empty.
Pre-82.10.u daemons that don't carry credential_schema in the catalog response → the SPA falls back to the legacy single-api_key UI (the missing field defaults to [] via ?? []).

OAuth flows (Phase 82.10.u)

Two-step admin RPC flow that lets an operator authorise a Claude subscription or MiniMax Token Plan from the SPA wizard without the SPA ever touching the PKCE verifier or refresh tokens.

Why an admin RPC and not just stdin

Pre-82.10.u, OAuth lived inside crates/setup/src/services/: interactive stdin paste, claude login style. That works for single-tenant operators who own the daemon shell, but for a multi-tenant SaaS the operator is a browser tab — there is no stdin.

Phase 82.10.u extracts the PKCE primitives (crates/llm-auth) and exposes them over admin RPC so a SPA can drive the same flow. The framework owns the verifier across the two HTTP requests via InMemoryVerifierStore; the SPA only sees opaque session ids.

Endpoints

`nexo/admin/llm_providers/oauth_start`

// req
{
  "factory_type": "anthropic",
  "auth_mode": "oauth_auth_code",
  "tenant_id": null
}
// resp (auth_code)
{
  "session_id": "f2c1...",
  "authorize_url": "https://claude.ai/oauth/authorize?...",
  "expires_at_ms": 1714776600000,
  "flow_kind": "auth_code"
}
// resp (device_code) — for `(minimax, oauth_device_code)`
{
  "session_id": "9a3e...",
  "authorize_url": "https://api.minimax.io/...",
  "expires_at_ms": ...,
  "flow_kind": "device_code",
  "user_code": "ABC123",
  "polling_interval_ms": 2000
}

`nexo/admin/llm_providers/oauth_finish`

// req — auth_code
{
  "session_id": "f2c1...",
  "instance_id": "anthropic-personal",
  "code": "abc#def"        // operator pasted from callback page
}
// req — device_code (no code; daemon polls)
{
  "session_id": "9a3e...",
  "instance_id": "minimax-cliente-a"
}
// resp
{
  "ok": true,
  "account_email": "user@example.com",  // Anthropic only
  "expires_at_ms": 1714780200000,
  "secret_id": "LLM_ANTHROPIC_PERSONAL_OAUTH_BUNDLE"
}

State machine

[oauth_start]                                    [oauth_finish]
─────────────                                    ─────────────
                          (10 min TTL)
PKCE gen → store.put → ...........  → take →  exchange/poll → bundle → SecretsStore
                                                                     ↓
                                                                yaml patch:
                                                                  auth.mode = oauth_bundle
                                                                  auth.bundle = <secret path>
                                                                     ↓
                                                                reload_signal()

Defensive design

Concern	Mitigation
CSRF	`state` is checked against the stashed PKCE state inside `exchange_code`
Replay	`take()` removes the entry BEFORE exchange; second call → `SESSION_NOT_FOUND`
Expired sessions	`peek_status` discriminates `Live` / `Expired` / `Missing` so the SPA gets accurate diagnostics
Memory bloat	Background sweep every 60 s drops stale entries; capacity 100 with FIFO eviction
Verifier leak	Verifier never travels to the SPA — only opaque `session_id`
Audit	`code`, `access_token`, `refresh_token`, `oauth_bundle` masked via `redact_secret_keys`
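
The replay, expiry and memory-bloat rows can be illustrated with a small in-memory store. This is a sketch under stated assumptions, not the InMemoryVerifierStore source: the type name VerifierStore, the method signatures, and the explicit now_ms clock parameter (used so the behaviour is deterministic) are all hypothetical.

```rust
use std::collections::{HashMap, VecDeque};

#[derive(Debug, PartialEq)]
pub enum SessionStatus { Live, Expired, Missing }

/// Hypothetical stand-in for the verifier store.
pub struct VerifierStore {
    ttl_ms: u64,
    capacity: usize,
    entries: HashMap<String, (String, u64)>, // session_id -> (verifier, expires_at_ms)
    order: VecDeque<String>,                 // insertion order for FIFO eviction
}

impl VerifierStore {
    pub fn new(ttl_ms: u64, capacity: usize) -> Self {
        Self { ttl_ms, capacity, entries: HashMap::new(), order: VecDeque::new() }
    }

    pub fn put(&mut self, session_id: &str, verifier: &str, now_ms: u64) {
        if self.entries.len() >= self.capacity {
            // Evict the oldest entry that is still present.
            while let Some(oldest) = self.order.pop_front() {
                if self.entries.remove(&oldest).is_some() { break; }
            }
        }
        self.entries.insert(session_id.to_string(), (verifier.to_string(), now_ms + self.ttl_ms));
        self.order.push_back(session_id.to_string());
    }

    pub fn peek_status(&self, session_id: &str, now_ms: u64) -> SessionStatus {
        match self.entries.get(session_id) {
            None => SessionStatus::Missing,
            Some((_, exp)) if *exp <= now_ms => SessionStatus::Expired,
            Some(_) => SessionStatus::Live,
        }
    }

    /// Removes the entry before any exchange, so a replayed call finds nothing.
    pub fn take(&mut self, session_id: &str, now_ms: u64) -> Option<String> {
        let (verifier, exp) = self.entries.remove(session_id)?;
        if exp <= now_ms { None } else { Some(verifier) }
    }

    /// Periodic sweep: drop entries whose TTL has elapsed.
    pub fn sweep(&mut self, now_ms: u64) {
        self.entries.retain(|_, (_, exp)| *exp > now_ms);
        let entries = &self.entries;
        self.order.retain(|id| entries.contains_key(id));
    }
}
```

A handler built on this would call peek_status first to pick the right error code, then take; the 10 minute TTL and capacity of 100 from the table would be passed as new(600_000, 100).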

Client SDK

crates/llm-auth exposes the primitives so any consumer (admin RPC, CLI wizard, future MCP server) shares the same crypto + HTTP shape:

use nexo_llm_auth::{gen_pkce, StateEncoding};
use nexo_llm_auth::anthropic::{build_authorize_url, exchange_code, TOKEN_URL};

// Runs inside an async fn that returns a Result.
let pkce = gen_pkce(StateEncoding::HexOnly);
let url = build_authorize_url(&pkce);
// ... operator pastes `<code>#<state>` ...
// `code` and `state` are the two halves of the pasted string.
let bundle = exchange_code(&pkce, &code, &state, TOKEN_URL).await?;

CLI flow (unchanged)

agent setup anthropic → oauth_login mode still uses crates/setup/src/services/anthropic_oauth.rs which now wraps the same nexo-llm-auth primitives. Operators with shell access keep the stdin paste UX.

Microapp wizard

The SPA-side UI lives in agent-creator-microapp/frontend/src/components/OAuthPane.tsx and the zustand store agent-creator-microapp/frontend/src/lib/oauthFlow.ts. State machine: idle → starting → awaiting_user → exchanging → success | error. Auth-code variant renders a paste box; device-code variant renders the user_code + verification_uri with a Confirm button (the daemon polls upstream).

Rate limiting & retry

Every LLM provider client sits behind a token bucket and a bounded retry policy with decorrelated jittered exponential backoff. This page is the definitive reference for those two mechanisms.

Source: crates/llm/src/retry.rs, crates/llm/src/rate_limiter.rs, crates/llm/src/quota_tracker.rs.

Rate limiter

Token bucket, acquired before every outbound request.

interval = 1 / requests_per_second
One token per request
One slot refills every interval
Per-provider, per-agent — each client has its own bucket, so one noisy agent can't starve another even when they share a provider

rate_limit:
  requests_per_second: 2.0
  quota_alert_threshold: 100000   # optional

At 2.0 rps, the bucket tops up a slot every 500 ms. A burst of 3 requests will wait briefly on the third.
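
The arithmetic above can be checked with a small token bucket. This is a sketch, not the contents of rate_limiter.rs: the TokenBucket type, the explicit now_ms clock, and a capacity of 2 (chosen so that the third request of a burst is the first to wait) are assumptions.

```rust
/// Hypothetical token bucket driven by an explicit millisecond clock.
pub struct TokenBucket {
    capacity: u64,
    interval_ms: u64,
    tokens: u64,
    last_refill_ms: u64,
}

impl TokenBucket {
    pub fn new(requests_per_second: f64, capacity: u64, now_ms: u64) -> Self {
        // interval = 1 / requests_per_second, expressed in whole milliseconds.
        let interval_ms = ((1000.0 / requests_per_second).round() as u64).max(1);
        Self { capacity, interval_ms, tokens: capacity, last_refill_ms: now_ms }
    }

    fn refill(&mut self, now_ms: u64) {
        let elapsed = now_ms.saturating_sub(self.last_refill_ms);
        let earned = elapsed / self.interval_ms;
        if self.tokens.saturating_add(earned) >= self.capacity {
            self.tokens = self.capacity;
            self.last_refill_ms = now_ms;
        } else {
            self.tokens += earned;
            self.last_refill_ms += earned * self.interval_ms;
        }
    }

    /// Ok(()) when a token was taken, Err(wait_ms) when the caller must wait.
    pub fn try_acquire(&mut self, now_ms: u64) -> Result<(), u64> {
        self.refill(now_ms);
        if self.tokens > 0 {
            self.tokens -= 1;
            Ok(())
        } else {
            let since_refill = now_ms.saturating_sub(self.last_refill_ms);
            Err(self.interval_ms.saturating_sub(since_refill))
        }
    }
}
```

With requests_per_second = 2.0 the interval is 500 ms, so a third immediate request is told to wait 500 ms and succeeds once that time has passed.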

Quota tracker

Optional. When a provider returns remaining-quota info (in a header or the response body), quota_tracker records it via record_usage() on the token response. If the remaining quota drops below quota_alert_threshold, a structured warn log is emitted:

WARN quota threshold crossed  provider=minimax remaining=99500 threshold=100000

Pair with a Prometheus log-scraping rule for an alert.

Retry policy

Retries live above the circuit breaker. They handle transient failures that don't warrant flipping the breaker.

Error class	Max attempts	Backoff curve
429 (rate limit)	5	`max(retry-after, jittered_backoff)`
5xx (server)	3	`jittered_backoff`
401 (auth)	1 refresh + 1 retry	(internal to the client)
Other 4xx	0 (fail fast)	—

Decorrelated jittered backoff

Not simple exponential — the next backoff is a uniform random draw in a growing range:

next = uniform(base, max(base, last × multiplier))

Defaults from llm.yaml retry block:

Field	Default
`initial_backoff_ms`	1000
`max_backoff_ms`	60000
`backoff_multiplier`	2.0

Why decorrelated jitter: multiple clients hitting the same 429 don't re-fire in lockstep. Desynchronization is built-in.
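
A minimal sketch of the draw, using the defaults from the table. The names RetryConfig, Lcg and next_backoff_ms are hypothetical; the small linear congruential generator stands in for whatever random source retry.rs uses, and capping the upper bound at max_backoff_ms is an assumption about where the ceiling applies.

```rust
/// Illustrative mirror of the llm.yaml retry block.
pub struct RetryConfig {
    pub initial_backoff_ms: u64,
    pub max_backoff_ms: u64,
    pub backoff_multiplier: f64,
}

impl Default for RetryConfig {
    fn default() -> Self {
        Self { initial_backoff_ms: 1000, max_backoff_ms: 60_000, backoff_multiplier: 2.0 }
    }
}

/// Tiny deterministic PRNG so the example needs no external crate.
pub struct Lcg(pub u64);

impl Lcg {
    /// Uniform value in [0, 1).
    pub fn next_unit(&mut self) -> f64 {
        self.0 = self.0
            .wrapping_mul(6364136223846793005)
            .wrapping_add(1442695040888963407);
        ((self.0 >> 11) as f64) / ((1u64 << 53) as f64)
    }
}

/// next = uniform(base, max(base, last * multiplier)), capped at max_backoff_ms.
pub fn next_backoff_ms(last_ms: u64, cfg: &RetryConfig, rng: &mut Lcg) -> u64 {
    let base = cfg.initial_backoff_ms as f64;
    let grown = (last_ms as f64 * cfg.backoff_multiplier).max(base);
    // Assumption: the ceiling clamps the top of the range.
    let upper = grown.min(cfg.max_backoff_ms as f64).max(base);
    let draw = base + rng.next_unit() * (upper - base);
    draw as u64
}
```

Each caller feeds its own previous delay back in as last_ms, so two clients that hit the same 429 diverge after the first draw.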

flowchart LR
    REQ[request] --> API{API response}
    API -->|200| OK[return ChatResponse]
    API -->|429| RL[RateLimit]
    API -->|5xx| SE[ServerError]
    API -->|401| AU[CredentialInvalid]
    API -->|4xx| F[Other fail fast]

    RL --> D1{attempts<br/>< 5?}
    SE --> D2{attempts<br/>< 3?}
    AU --> REF[auth refresh<br/>+ single retry]
    D1 -->|yes| BO1[wait max(retry_after,<br/>jittered_backoff)]
    D1 -->|no| F
    D2 -->|yes| BO2[wait jittered_backoff]
    D2 -->|no| F
    BO1 --> REQ
    BO2 --> REQ
    REF --> REQ

Error classification per provider

The providers classify HTTP responses into a shared LlmError so the retry layer can be common code:

HTTP	`LlmError` variant	Retried?
200	`Ok(ChatResponse)`	—
429	`RateLimit { retry_after_ms }`	✅ up to 5
5xx	`ServerError { status, body }`	✅ up to 3
401 / 403	`CredentialInvalid`	❌ (client handles refresh internally)
Other 4xx	`Other`	❌
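
The two tables on this page can be combined into one classification sketch. The variant names follow the table; the payload types (Option<u64>, u16, String) and the helper names classify, max_retries and wait_ms are assumptions, and any status not listed in the table falls through to Other here.

```rust
#[derive(Debug, PartialEq)]
pub enum LlmError {
    RateLimit { retry_after_ms: Option<u64> },
    ServerError { status: u16, body: String },
    CredentialInvalid,
    Other,
}

/// Map an HTTP status onto the shared error type (Ok stands in for ChatResponse).
pub fn classify(status: u16, retry_after_ms: Option<u64>, body: &str) -> Result<(), LlmError> {
    match status {
        200 => Ok(()),
        429 => Err(LlmError::RateLimit { retry_after_ms }),
        500..=599 => Err(LlmError::ServerError { status, body: body.to_string() }),
        401 | 403 => Err(LlmError::CredentialInvalid),
        _ => Err(LlmError::Other),
    }
}

/// Retry budget of the common retry layer for each error class.
pub fn max_retries(err: &LlmError) -> u32 {
    match err {
        LlmError::RateLimit { .. } => 5,
        LlmError::ServerError { .. } => 3,
        // Credential refresh happens inside the client, not in this layer.
        LlmError::CredentialInvalid | LlmError::Other => 0,
    }
}

/// Wait before the next attempt: 429 honours retry-after when it is longer.
pub fn wait_ms(err: &LlmError, jittered_backoff_ms: u64) -> u64 {
    match err {
        LlmError::RateLimit { retry_after_ms } => {
            retry_after_ms.unwrap_or(0).max(jittered_backoff_ms)
        }
        _ => jittered_backoff_ms,
    }
}
```

Keeping the classification per provider and the budget in one shared function is what lets the retry loop stay provider-agnostic.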

Tuning

Bursty workloads: bump requests_per_second cautiously; the upstream's own rate limits won't move, so you'll just pay more 429s to find the ceiling.
Flaky networks: raise max_attempts for 5xx; keep max_backoff_ms bounded so slow agents don't spiral.
Subscription plans: lower requests_per_second to keep daily usage under caps; pair with quota_alert_threshold.

Install (Phase 81.18.b.1 — operator action required)

The daemon stopped constructing TelegramPlugin in-tree as of Phase 81.18.b.1; it now spawns the standalone subprocess binary per cfg entry. Operators with cfg.plugins.telegram populated must install the binary and surface its directory through plugins.discovery.search_paths before starting the daemon, or the discovery walker logs a clear warning and the plugin never boots:

# Recommended — download the pre-built tarball from the plugin's
# GitHub Releases into the daemon's plugin dir:
nexo plugin install lordmacu/nexo-plugin-telegram
nexo plugin list

# Or build from source:
cargo install --git https://github.com/lordmacu/nexo-plugin-telegram

nexo plugin install lands the binary + plugin.toml under <state_dir>/plugins/telegram/, which the daemon's discovery walker scans by default — no search_paths edit needed. If you build with cargo install --git instead, point discovery at the install dir in agents.yaml:

plugins:
  discovery:
    search_paths:
      - ~/.cargo/bin   # or wherever you installed the binary

Each cfg.plugins.telegram[] entry maps to one subprocess; per-instance state (offset_path, media_dir, instance topic suffix, bot token) is seeded into the child via NEXO_PLUGIN_TELEGRAM_* env vars at spawn time, so multi-bot operators get true process isolation.

Topics

Direction	Subject	Notes
Inbound	`plugin.inbound.telegram`	Legacy single-bot
Inbound	`plugin.inbound.telegram.<instance>`	Per-bot routing
Outbound	`plugin.outbound.telegram`	Legacy single-bot
Outbound	`plugin.outbound.telegram.<instance>`	Per-bot routing

Each instance subscribes only to its own outbound topic, so two bots under the same daemon don't cross-wire.
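
The subject naming in the table reduces to a small helper. The function telegram_topic and the Direction enum are hypothetical names for this sketch; treating an empty label like a missing one is also an assumption.

```rust
pub enum Direction { Inbound, Outbound }

/// Build the broker subject for a Telegram instance.
/// No label keeps the legacy bare topic.
pub fn telegram_topic(direction: Direction, instance: Option<&str>) -> String {
    let dir = match direction {
        Direction::Inbound => "inbound",
        Direction::Outbound => "outbound",
    };
    match instance {
        Some(label) if !label.is_empty() => format!("plugin.{dir}.telegram.{label}"),
        _ => format!("plugin.{dir}.telegram"),
    }
}
```

A bot configured with instance: sales_bot would therefore listen on plugin.outbound.telegram.sales_bot only.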

Config

# config/plugins/telegram.yaml
telegram:
  token: ${file:./secrets/telegram_token.txt}
  instance: sales_bot
  polling:
    enabled: true
    interval_ms: 25000
    offset_path: ./data/media/telegram/sales_bot.offset
  allowlist:
    chat_ids: []        # empty = accept all
  auto_transcribe:
    enabled: false
    command: ./extensions/openai-whisper/target/release/openai-whisper
    language: es
  bridge_timeout_ms: 120000

Key fields:

Field	Default	Purpose
`token`	— (required)	Bot API token from @BotFather.
`instance`	`None`	Label for multi-bot routing. Unlabelled keeps the legacy bare topic.
`allow_agents`	`[]`	Agents permitted to publish from this bot. Empty = accept any agent holding a resolver handle. Defense-in-depth for the per-agent `credentials` binding.
`polling.enabled`	`true`	Long-polling intake. Webhook not yet supported.
`polling.interval_ms`	`25000`	Long-poll timeout hint. Telegram clamps to [1 s, 50 s].
`polling.offset_path`	`./data/media/telegram/offset`	File to persist update offset across restarts.
`allowlist.chat_ids`	`[]`	Numeric chat ids allowed. Empty = accept all.
`auto_transcribe.enabled`	`false`	Voice → text.
`auto_transcribe.command`	`./extensions/openai-whisper/.../openai-whisper`	Path to whisper binary.
`bridge_timeout_ms`	`120000`	Handler deadline before a `bridge_timeout` event fires.

Auth

Single mode: static bot token. No OAuth. Store it under ./secrets/ and reference via ${file:...}.

flowchart LR
    SETUP[agent setup] --> ASK[ask for bot token]
    ASK --> F[./secrets/telegram_token.txt]
    F -.->|${file:...}| CFG[config/plugins/telegram.yaml]
    CFG --> RUN[runtime: HTTP Bot API with long-poll]

Tools exposed to the LLM

Tool	Notes
`telegram_send_message`	Send text to chat id (negative for groups/channels).
`telegram_send_reply`	Quote a specific prior message.
`telegram_send_reaction`	Emoji on a message.
`telegram_edit_message`	Modify a prior message's text.
`telegram_send_location`	GPS coordinates.
`telegram_send_media`	File upload with caption and mime hint.

All tools enforce outbound_allowlist.telegram per binding.

Event shapes

// message
{
  "kind": "message",
  "from": "12345",
  "chat": "12345",
  "chat_type": "private",
  "text": "hi",
  "reply_to": null,
  "is_group": false,
  "timestamp": 1714000000,
  "msg_id": "42",
  "username": "jdoe",
  "media": [],
  "latitude": null,
  "longitude": null,
  "forward": null
}

// media item (inside `media`)
{
  "kind": "voice" | "photo" | "video" | "document" | "audio",
  "local_path": "./data/media/telegram/....ogg",
  "file_id": "AgACAgEA...",
  "mime_type": "audio/ogg",
  "duration_s": 4,
  "width": null,
  "height": null,
  "file_name": null
}

// callback_query  (inline-keyboard button press, auto-ACKed)
{"kind": "callback_query", "from": "...", "chat": "...", "data": "buy"}

// chat_membership
{"kind": "chat_membership", "chat": "...", "status": "added" | "kicked" | ...}

// lifecycle
{"kind": "connected" | "disconnected"}
{"kind": "bridge_timeout", "msg_id": "...", "waited_ms": ...}

Forwarded messages include a forward object:

"forward": {
  "source": "user" | "channel" | "chat",
  "from_user_id": 12345,
  "from_chat_id": null,
  "date": 1714000000
}

Gotchas

Webhook mode is not supported yet. Long-polling only.
polling.interval_ms is clamped by Telegram. Values outside [1000, 50000] are clamped server-side; the default of 25000 is a good middle ground.
Negative chat ids are groups/channels. Telegram uses negative ids for group chats; positive for private. Don't strip the sign.
Auto-transcribe requires the whisper skill extension. The command path must point at a working binary, otherwise inbound voice messages arrive without text.

Email plugin

Multi-account IMAP/SMTP channel for Nexo agents. Receives messages through IMAP IDLE (with a 60 s polling fallback for servers that don't speak IDLE), sends through SMTP under a circuit-breaker, and exposes six tools (email_send, email_reply, email_archive, email_move_to, email_label, email_search) so an agent can read and act on a mailbox.

Status (Phase 81.20.x shipped 2026-05-16). Email is now a standalone subprocess plugin distributed via crates.io. Install with cargo install nexo-plugin-email; the daemon's binary-mode discovery walker auto-detects the binary, probes --print-manifest, and wires all five auto-discovery stages (pairing adapter, HTTP routes, admin RPC, Prometheus metrics, dashboard) without any daemon-side code change. The daemon binary no longer compiles nexo-plugin-email in-tree (cargo tree -i nexo-plugin-email returns "did not match any packages").

Install

cargo install nexo-plugin-email          # latest crates.io release

The binary lands in $HOME/.cargo/bin/nexo-plugin-email. The daemon's PluginDiscoveryConfig::default() already includes that directory in its search_paths, so a fresh nexo daemon boot finds the plugin without manifest editing. The walker spawns nexo-plugin-email --print-manifest, captures the bundled TOML, and registers the plugin's 12 tools + 5 manifest sections via the generic auto-discovery contract.

If your environment hardens against arbitrary binary execution during boot, set plugins.discovery.auto_detect_binaries: false in config/discovery.yaml and add an explicit nexo-plugin.toml reference under search_paths instead.

Configuration

config/plugins/email.yaml — multi-account schema. Credentials live in nexo-auth (Phase 17), not in this YAML; see Per-account credentials below.

email:
  enabled: true
  max_body_bytes: 32768           # body_text truncation
  max_attachment_bytes: 26214400  # 25 MiB; oversized attachments are
                                  # written truncated and flagged
  attachments_dir: data/email-attachments
  outbound_queue_dir: data/email-outbound
  poll_fallback_seconds: 60       # used when IDLE isn't supported
  idle_reissue_minutes: 28        # < RFC 2177's 29-minute ceiling
  spf_dkim_warn: true             # boot-time DNS check, non-fatal

  loop_prevention:
    auto_submitted: true          # RFC 3834
    list_headers: true            # List-Id / List-Unsubscribe / Precedence
    self_from: true               # bounce-back from our own outbound

  accounts:
    - instance: ops
      address: ops@example.com
      provider: custom            # gmail | outlook | yahoo | icloud | custom
      imap: { host: imap.example.com, port: 993, tls: implicit_tls }
      smtp: { host: smtp.example.com, port: 587, tls: starttls }
      folders:
        inbox:   INBOX
        sent:    Sent
        archive: Archive
      filters:
        from_allowlist: []
        from_denylist:  []

Topics: plugin.inbound.email.<instance> (parsed inbound), plugin.outbound.email.<instance> (commands you publish to send), plugin.outbound.email.<instance>.ack (per-message ack), and email.bounce.<instance> (DSNs).

Per-account credentials

secrets/email/<instance>.toml — chmod 0o600 enforced at boot. Three auth kinds are supported.

# Password (app password works fine for Outlook / iCloud / Yahoo).
[auth]
kind = "password"
username = "ops@example.com"
password = "${EMAIL_OPS_PASSWORD}"

# Pre-issued OAuth2 bearer (bring-your-own-token).
[auth]
kind = "oauth2_static"
username = "ops@gmail.com"
access_token  = "${EMAIL_OPS_TOKEN}"
refresh_token = "${EMAIL_OPS_REFRESH}"   # optional
expires_at    = 1735689600                # optional unix sec

# Reuse an account already in `config/plugins/google-auth.yaml`.
[auth]
kind = "oauth2_google"
username = "ops@gmail.com"
google_account_id = "ops"

${ENV} placeholders are resolved at boot via nexo_config::env::resolve_placeholders. The OAuth2-Google variant delegates token reads to the Google credential store and shares its per-account refresh mutex so concurrent IMAP IDLE workers never race a token rotation.
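
To make the placeholder step concrete, here is a simplified stand-in for the resolver. It is not nexo_config::env::resolve_placeholders: the function name resolve_env_placeholders is hypothetical, variables come from a map instead of the process environment so the example is deterministic, and failing on an unset variable is an assumption.

```rust
use std::collections::HashMap;

/// Replace every `${NAME}` in `input` with the value of NAME from `env`.
/// Assumption: an unset variable or an unterminated placeholder is an error.
pub fn resolve_env_placeholders(
    input: &str,
    env: &HashMap<String, String>,
) -> Result<String, String> {
    let mut out = String::new();
    let mut rest = input;
    while let Some(start) = rest.find("${") {
        out.push_str(&rest[..start]);
        let after = &rest[start + 2..];
        let end = after
            .find('}')
            .ok_or_else(|| "unterminated placeholder".to_string())?;
        let name = &after[..end];
        let value = env
            .get(name)
            .ok_or_else(|| format!("unset variable: {name}"))?;
        out.push_str(value);
        rest = &after[end + 1..];
    }
    out.push_str(rest);
    Ok(out)
}
```

Resolving at boot means a missing EMAIL_OPS_PASSWORD surfaces as a startup error instead of a failed IMAP login later.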

Provider auto-detect

The setup helper provider_hint(domain) recognises five families out of the box:

Domain	Provider	IMAP host	SMTP host
`gmail.com`, `googlemail.com`	Gmail	`imap.gmail.com:993`	`smtp.gmail.com:587`
`outlook.com`, `hotmail.com`, `live.com`, `msn.com`	Outlook	`outlook.office365.com:993`	`smtp.office365.com:587`
`yahoo.com`, `yahoo.co.uk`, `ymail.com`, `rocketmail.com`	Yahoo	`imap.mail.yahoo.com:993`	`smtp.mail.yahoo.com:587`
`icloud.com`, `me.com`, `mac.com`	iCloud	`imap.mail.me.com:993`	`smtp.mail.me.com:587`
anything else	Custom	(prompt)	(prompt)

Gmail addresses also get a suggest_oauth_google = true hint so the wizard offers to reuse google-auth.yaml instead of asking for an app password.
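
The lookup table above translates directly into a match. The sketch below is illustrative only: provider_hint_sketch, Provider and ProviderHint are hypothetical names, and the real provider_hint(domain) may return a different shape.

```rust
#[derive(Debug, PartialEq)]
pub enum Provider { Gmail, Outlook, Yahoo, Icloud, Custom }

#[derive(Debug, PartialEq)]
pub struct ProviderHint {
    pub provider: Provider,
    pub imap: Option<&'static str>, // None means the wizard prompts
    pub smtp: Option<&'static str>,
    pub suggest_oauth_google: bool,
}

/// Map a mail domain to a provider family and its default hosts.
pub fn provider_hint_sketch(domain: &str) -> ProviderHint {
    let d = domain.trim().to_ascii_lowercase();
    let (provider, imap, smtp) = match d.as_str() {
        "gmail.com" | "googlemail.com" => {
            (Provider::Gmail, Some("imap.gmail.com:993"), Some("smtp.gmail.com:587"))
        }
        "outlook.com" | "hotmail.com" | "live.com" | "msn.com" => {
            (Provider::Outlook, Some("outlook.office365.com:993"), Some("smtp.office365.com:587"))
        }
        "yahoo.com" | "yahoo.co.uk" | "ymail.com" | "rocketmail.com" => {
            (Provider::Yahoo, Some("imap.mail.yahoo.com:993"), Some("smtp.mail.yahoo.com:587"))
        }
        "icloud.com" | "me.com" | "mac.com" => {
            (Provider::Icloud, Some("imap.mail.me.com:993"), Some("smtp.mail.me.com:587"))
        }
        _ => (Provider::Custom, None, None),
    };
    // Only Gmail addresses get the google-auth.yaml reuse suggestion.
    let suggest_oauth_google = provider == Provider::Gmail;
    ProviderHint { provider, imap, smtp, suggest_oauth_google }
}
```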

Tools

The agent gets six tools when the email plugin is active:

Tool	Purpose
`email_send`	Send a new message. `from` is pinned to the account address (anti-spoof).
`email_reply`	Fetch the parent by UID, derive recipients (`reply_all` adds `parent.To/Cc` minus own), inherit `In-Reply-To` / `References`.
`email_archive`	UID MOVE to the configured archive folder; falls back to `COPY + STORE \Deleted + EXPUNGE`.
`email_move_to`	Same as archive but to an arbitrary folder (no auto-create).
`email_label`	Gmail-only: `STORE +X-GM-LABELS` / `-X-GM-LABELS`. Errors on non-Gmail.
`email_search`	Portable JSON DSL → IMAP SEARCH atoms. Default limit 50, max 200.

Every result is wrapped in a { ok: bool, ... } envelope. Errors become { ok: false, error: "..." } rather than thrown exceptions so the agent doesn't have to branch on exception types.

email_search query shape:

{
  "instance": "ops",
  "folder": "INBOX",
  "query": {
    "from": "alice@x", "to": "bob@x",
    "subject": "report", "body": "kpi",
    "since": "2024-01-01", "before": "2024-12-31",
    "unseen": true, "seen": false
  },
  "limit": 50
}

User-controlled strings pass through imap_quote (RFC 3501 quoted-string + CR/LF collapse) before reaching the wire — that's the security boundary against atom injection.
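
A sketch of the quoting step, assuming the usual RFC 3501 quoted-string escapes (backslash and double quote) plus the CR/LF collapse described above. The name imap_quote_sketch is hypothetical and the plugin's own imap_quote may differ in detail, for example in how it replaces line breaks.

```rust
/// Wrap `input` as an IMAP quoted string.
/// Assumption: each run of CR/LF characters collapses to a single space.
pub fn imap_quote_sketch(input: &str) -> String {
    let mut out = String::with_capacity(input.len() + 2);
    out.push('"');
    let mut in_break = false;
    for c in input.chars() {
        match c {
            '\r' | '\n' => {
                if !in_break {
                    out.push(' ');
                }
                in_break = true;
                continue;
            }
            // Quoted specials must be backslash-escaped.
            '"' | '\\' => {
                out.push('\\');
                out.push(c);
            }
            _ => out.push(c),
        }
        in_break = false;
    }
    out.push('"');
    out
}
```

A value such as x" OR ALL stays a single quoted argument, so it cannot introduce extra SEARCH keys.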

Outbound attachments are referenced by file path; the dispatcher reads the bytes at enqueue time so a missing file fails fast with ack: Failed instead of parking a doomed job:

{
  "instance": "ops",
  "to": ["alice@x"],
  "subject": "Report",
  "body": "see attached",
  "attachments": [
    { "data_path": "/tmp/q3.pdf", "filename": "q3.pdf" }
  ]
}

Inbound events

Published as JSON on plugin.inbound.email.<instance>:

{
  "account_id": "ops@example.com",
  "instance": "ops",
  "uid": 42,
  "internal_date": 1700000000,
  "raw_bytes": "<.eml bytes (binary-safe via serde_bytes)>",
  "meta": {
    "message_id": "<abc@x>",
    "in_reply_to": "<parent@x>",
    "references": ["<root@x>", "<parent@x>"],
    "from": { "address": "alice@x", "name": "Alice Doe" },
    "to":   [{ "address": "ops@example.com" }],
    "cc":   [],
    "subject": "Re: hi",
    "body_text": "...",
    "body_html": null,
    "date": 1700000000,
    "headers_extra": { "list-id": "<l@x>" },
    "body_truncated": false
  },
  "attachments": [
    {
      "sha256": "abc...",
      "local_path": "data/email-attachments/abc...",
      "size_bytes": 4096,
      "mime_type": "application/pdf",
      "filename": "report.pdf",
      "disposition": "attachment",
      "truncated": false
    }
  ],
  "thread_root_id": "<root@x>"
}

thread_root_id is the canonical session key — pass it through session_id_for_thread() (UUIDv5) to bridge into nexo-core's session map.

Bounce events

Delivery reports never reach the LLM as conversational content. They publish on email.bounce.<instance>:

{
  "account_id": "ops@example.com",
  "instance": "ops",
  "original_message_id": "<our-outbound@example.com>",
  "recipient": "ghost@unknown.com",
  "status_code": "5.1.1",
  "action": "failed",
  "reason": "smtp; 550 5.1.1 user unknown",
  "classification": "permanent"
}

classification follows SMTP convention: 5.x.x → permanent, 4.x.x → transient, anything else → unknown. The detector fires on a Content-Type: multipart/report; report-type=delivery-status envelope; legacy Postfix / sendmail bounces without that marker are caught via a From localpart heuristic (MAILER-DAEMON, mail-daemon, mail.daemon, postmaster).
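
Both rules fit in a few lines. The names classify_status and looks_like_legacy_bounce are hypothetical, and comparing the localpart case-insensitively is an assumption.

```rust
#[derive(Debug, PartialEq)]
pub enum BounceClass { Permanent, Transient, Unknown }

/// Classify an enhanced status code such as "5.1.1".
pub fn classify_status(status_code: &str) -> BounceClass {
    let code = status_code.trim();
    if code.starts_with("5.") {
        BounceClass::Permanent
    } else if code.starts_with("4.") {
        BounceClass::Transient
    } else {
        BounceClass::Unknown
    }
}

/// Fallback heuristic for bounces that lack the multipart/report marker.
pub fn looks_like_legacy_bounce(from_address: &str) -> bool {
    let local = from_address
        .split('@')
        .next()
        .unwrap_or("")
        .to_ascii_lowercase();
    matches!(
        local.as_str(),
        "mailer-daemon" | "mail-daemon" | "mail.daemon" | "postmaster"
    )
}
```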

Loop-prevention

After parse, before publish, the worker walks LoopPreventionCfg in priority order and short-circuits on the first match:

Reason	Trigger
`auto_submitted`	`Auto-Submitted` header is anything other than `no` (RFC 3834).
`list_mail`	`List-Id` or `List-Unsubscribe` present (RFC 2369).
`precedence_bulk`	`Precedence: bulk\|junk\|list` (RFC 2076).
`self_from`	Inbound `From` matches the account's own address.
`dsn_inbound`	`parse_bounce` returned `Some` (handled before loop walk).

Each suppressed message advances the IMAP cursor — it has been processed, just not surfaced.
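
The priority walk can be sketched as a chain of early returns. This is an illustration under assumptions: header names are expected to be lower-cased keys in a map, the struct mirrors the three loop_prevention flags from the YAML, the function name suppression_reason is hypothetical, and the dsn_inbound case is omitted because it is handled before the walk.

```rust
use std::collections::HashMap;

/// Mirrors the `loop_prevention` block of config/plugins/email.yaml.
pub struct LoopPreventionCfg {
    pub auto_submitted: bool,
    pub list_headers: bool,
    pub self_from: bool,
}

/// Return the first matching suppression reason, or None to publish.
pub fn suppression_reason(
    cfg: &LoopPreventionCfg,
    headers: &HashMap<String, String>, // keys assumed lower-case
    from: &str,
    own_address: &str,
) -> Option<&'static str> {
    if cfg.auto_submitted {
        if let Some(v) = headers.get("auto-submitted") {
            if !v.trim().eq_ignore_ascii_case("no") {
                return Some("auto_submitted");
            }
        }
    }
    if cfg.list_headers {
        if headers.contains_key("list-id") || headers.contains_key("list-unsubscribe") {
            return Some("list_mail");
        }
        if let Some(p) = headers.get("precedence") {
            let p = p.trim().to_ascii_lowercase();
            if matches!(p.as_str(), "bulk" | "junk" | "list") {
                return Some("precedence_bulk");
            }
        }
    }
    if cfg.self_from && from.eq_ignore_ascii_case(own_address) {
        return Some("self_from");
    }
    None
}
```

Returning the reason as a tag makes it easy to log why a message was consumed without being surfaced.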

SPF / DKIM boot warns

When spf_dkim_warn: true, each account triggers a 3 s non-blocking DNS lookup at start. WARN lines are operator-actionable:

Tag	Means
`email.spf.missing`	No `v=spf1` TXT record at the apex of the From domain.
`email.spf.misalignment`	SPF policy exists but doesn't authorise the configured SMTP host.
`email.dkim.missing`	No TXT at `default._domainkey.<domain>`. Try selectors `default`, `google`, `selector1`, `mail`.
`email.spf_dkim.dns_unavailable`	The DNS lookup itself failed. Often transient.

DMARC, multi-selector DKIM rotation, and signature verification are deliberately out of scope for v1.

Troubleshooting

email.idle.unsupported — the server doesn't advertise IDLE; the worker is permanently in 60 s polling mode. Yahoo Plus and some legacy IMAP servers behave this way.
email.uidvalidity.changed — the mailbox was recreated server-side; the cursor reset to last_uid=0 and every existing message will be processed again.
Outbound DLQ growing — inspect data/email-outbound/<instance>.dlq.jsonl. After 5 transient attempts (or any 5xx) jobs land here; there's no auto-purge.
email.auth.xoauth2_failed — the OAuth2 token was rejected. The worker retries once with a forced refresh; if it still fails the SMTP / IMAP circuit-breaker opens.
EMAIL_INSECURE_TLS=1 — disables TLS cert verification. Logged at WARN; only safe for fake servers / loopback.

Limitations

Deferred	Tracked in
Persistent bounce history	`proyecto/FOLLOWUPS.md`
Interactive setup wizard	`proyecto/FOLLOWUPS.md`
greenmail e2e test harness	`proyecto/FOLLOWUPS.md`
Email-specific Prometheus metrics	`proyecto/FOLLOWUPS.md`
Phase 16 binding-policy auto-filter	`proyecto/FOLLOWUPS.md`
HTML body in outbound	(text/plain only in v1)
`.ics` calendar invites	Phase 65
Vision OCR over attached images	Phase 49

Deployment (Phase 81.19.b)

The email plugin is shipped as a standalone repo: nexo-rs-plugin-email (nexo-plugin-email v0.1.2+ on crates.io). The crate is dual-mode:

Mode	Used for	Wire path
In-process	Default — daemon registers a singleton factory	`factory_registry.register("email", email_plugin_factory(...))`
Subprocess	Operator drops manifest in `search_paths` and removes the in-tree factory	discovery walker auto-spawns the binary via JSON-RPC stdio

By default the daemon runs the email plugin in-process, exactly as before the extract. The factory wins over discovery's auto-subprocess fallback (init_loop.rs:417), so an email manifest in plugins.discovery.search_paths does NOT spawn the subprocess unless the operator strips the in-tree factory registration.

Subprocess opt-in (advanced)

For deployments that want process-level isolation of the IMAP/SMTP work, install the binary and remove the in-tree factory:

cargo install nexo-plugin-email
mkdir -p ~/.config/nexo/plugins.d/
cp $(which nexo-plugin-email) ~/.config/nexo/plugins.d/
# Copy the manifest from $CARGO_HOME/.../nexo-plugin-email-0.1.2/
# nexo-plugin.toml into the same dir.

Then in agents.yaml:

plugins:
  discovery:
    search_paths:
      - ~/.config/nexo/plugins.d

And strip the in-tree email_plugin_factory registration from the daemon source (proyecto/src/main.rs Phase 81.19.b block). Without that strip, both paths are visible but the factory wins.

The subprocess advertises zero tool defs in its initialize reply — tool dispatch (email_send / email_reply / …) requires the in-process surface and currently doesn't work in pure subprocess mode. Follow-up 81.19.b.tool-dispatch-subprocess tracks closing that gap.

Browser (Chrome DevTools Protocol)

Drives a real Chrome/Chromium instance via CDP. Agents can navigate, click, fill, screenshot, and run JS — with stable element refs that work across DOM mutations within a single turn.

Phase 81.17.c (2026-05-07). The browser plugin now ships as a standalone subprocess (nexo-rs-plugin-browser), loaded by the daemon via discovery + auto-subprocess fallback (Phase 81.17.b). The 12 browser_* tools route through 81.29 RemoteToolHandler over JSON-RPC stdio. The in-tree crates/plugins/browser/ source stays in the workspace dormant for one migration window; deletion is tracked in follow-up 81.17.c.in-tree-removal.

Source-of-truth: standalone repo github.com/lordmacu/nexo-plugin-browser (local: /home/familia/chat/nexo-rs-plugin-browser/). In-tree mirror at crates/plugins/browser/ is dormant; the daemon does NOT instantiate it in-process anymore.

Out-of-tree subprocess install

cd /path/to/nexo-rs-plugin-browser
cargo build --release

# Copy binary + manifest into a discovery search path.
mkdir -p ~/.local/share/nexo/plugins/browser
cp target/release/nexo-plugin-browser ~/.local/share/nexo/plugins/browser/
cp nexo-plugin.toml                   ~/.local/share/nexo/plugins/browser/

In plugins.yaml:

plugins:
  discovery:
    search_paths:
      - ~/.local/share/nexo/plugins

The discovery walker picks up the manifest on next boot; auto-subprocess fallback spawns the binary; tool handlers register in the agent's scoped registry via RemoteToolHandler. ENV vars flow from cfg.plugins.browser YAML via the daemon's seed_browser_subprocess_env helper.

The standalone repo's README covers ENV var reference, sandbox notes, and the latency budget.

Topics

Direction	Subject	Notes
Outbound	`plugin.outbound.browser`	Tool invocations
Events	`plugin.events.browser.<method_suffix>`	Mirrored CDP notifications

Browser is an outbound-only plugin — there is no unsolicited inbound event from a web page to the agent.

Config

Two shapes accepted: a single map (0.2.x back-compat — keeps the legacy per-agent_id profile fan-out from Phase 81.17.c.multi-profile) or a sequence of maps (0.3.0+ declared multi-instance — operator names each session, every instance has its own Chrome process + user_data_dir).

Single-map (legacy)

# config/plugins/browser.yaml
browser:
  headless: false
  executable: ""                     # empty → search PATH
  cdp_url: ""                        # empty → launch new Chrome
  user_data_dir: ./data/browser/profile
  window_width: 1280
  window_height: 800
  connect_timeout_ms: 10000
  command_timeout_ms: 15000
  args: []                           # extra CLI flags for Chrome

Multi-instance (0.3.0+)

# config/plugins/browser.yaml
browser:
  - instance: marketing
    headless: false
    user_data_dir: ""               # empty → ${state_dir}/instances/marketing/
    allow_agents: [ana]             # empty = accept any agent
  - instance: research
    headless: true
    allow_agents: [juan, marketing]

Every browser_* tool gains an optional instance: string argument. Resolution:

Explicit instance matches a declared label → routes there.
No instance + exactly 1 declared instance → uses it (compat shim).
No instance + 0 declared instances → falls back to the legacy per-agent_id profile path.
No instance + N>1 declared instances → ArgumentInvalid (the caller must name an instance).
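
The four resolution rules map onto a single match. The names Route, ArgumentInvalid (as a struct) and resolve_instance are assumptions made for this sketch; the text does not say what happens when an explicit instance matches no declared label, so the sketch treats that as an error.

```rust
#[derive(Debug, PartialEq)]
pub enum Route {
    Instance(String),
    LegacyPerAgentProfile,
}

#[derive(Debug, PartialEq)]
pub struct ArgumentInvalid(pub String);

/// Decide which browser instance a tool call should use.
pub fn resolve_instance(
    explicit: Option<&str>,
    declared: &[&str],
) -> Result<Route, ArgumentInvalid> {
    match (explicit, declared) {
        (Some(name), _) if declared.contains(&name) => Ok(Route::Instance(name.to_string())),
        // Assumption: an unknown explicit label is rejected.
        (Some(name), _) => Err(ArgumentInvalid(format!("unknown instance: {name}"))),
        (None, []) => Ok(Route::LegacyPerAgentProfile),
        (None, [only]) => Ok(Route::Instance(only.to_string())),
        (None, _) => Err(ArgumentInvalid(
            "multiple instances declared; pass `instance`".to_string(),
        )),
    }
}
```

With the multi-instance config above, a call without instance would be rejected because both marketing and research are declared.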

Pairing flow: the dashboard surfaces each declared instance with its .nexo-paired sentinel status. Operator clicks "open Chrome" via the admin RPC nexo/admin/browser/launch_visible, logs in to the sites manually, then "mark paired" persists the sentinel so the wizard flips green. The runtime headless: true → false toggle on launch_visible is a deferred follow-up (browser.launch_visible.runtime).

Cost: ~200-500 MB RAM per declared Chrome instance. Single-instance shared-profile mode stays available via the legacy single-map shape or a 1-element array.

Field	Default	Purpose
`headless`	`false`	Launch Chrome without a UI.
`executable`	`""`	Chrome binary path. Empty = search `PATH`.
`cdp_url`	`""`	Connect to an existing Chrome DevTools server (e.g. `http://127.0.0.1:9222`). Empty = launch a new instance.
`user_data_dir`	`./data/browser/profile`	Chrome profile cache. Keeps cookies, logins.
`window_width` / `window_height`	`1280` / `800`	Viewport.
`connect_timeout_ms`	`10000`	How long to wait for Chrome startup / remote connect.
`command_timeout_ms`	`15000`	Per-CDP-command execution timeout.
`args`	`[]`	Extra CLI flags forwarded verbatim to the spawned Chrome. Ignored when `cdp_url` is set. Later args win — use this to override built-in flags when a restricted environment needs it (e.g. `--no-sandbox` on Termux).

Auth

None. CDP is an unauthenticated protocol — use cdp_url only with a loopback / firewalled Chrome.

Tools exposed to the LLM

Tool	Purpose
`browser_navigate`	Load URL and wait for `load` event.
`browser_click`	Click by element ref (`@e12`) or CSS selector.
`browser_fill`	Type into input / textarea / contenteditable. Replaces content.
`browser_screenshot`	Base64 PNG of the viewport.
`browser_evaluate`	Run JS, return value as JSON.
`browser_snapshot`	Text DOM tree with stable element refs.
`browser_scroll_to`	Scroll a target element into view.
`browser_current_url`	Current page URL.
`browser_wait_for`	Poll for an element to appear.
`browser_go_back` / `browser_go_forward`	Navigation history.
`browser_press_key`	Keyboard events.

All tools are prefixed browser_* for glob filtering in allowed_tools.

Element refs

browser_snapshot emits a text tree where every actionable element has a ref like @e12. Those refs are stable within the snapshot turn but invalidated by any subsequent DOM mutation:

sequenceDiagram
    participant A as Agent
    participant B as Browser plugin
    participant C as Chrome

    A->>B: browser_snapshot
    B->>C: DOM.describeNode(..)
    C-->>B: tree
    B-->>A: "Login @e12\nEmail @e13\n..."
    A->>B: browser_fill(@e13, "user@…")
    B->>C: DOM.focus + Input.dispatch
    A->>B: browser_click(@e12)
    Note over A,B: refs still valid<br/>(same snapshot turn)
    A->>B: browser_snapshot
    Note over B: refs from prior snapshot<br/>now INVALID

Rule: take a snapshot, act on refs from that snapshot, take a new snapshot before acting again.

Gotchas

browser_fill replaces content. There is no append mode. To add text to existing content, read the current value first (via browser_evaluate), then send the merged string.
Connecting to an existing Chrome (cdp_url) skips the profile setup. Any login state is whatever that Chrome already has.
Element refs expire on DOM mutation. The plugin does not auto-refresh — refs from a stale snapshot will error or misfire.
Headless mode breaks some sites. Some sites detect headless Chrome and behave differently. Use headless: false for those.

Google (OAuth, Gmail, Calendar, Drive) + gmail-poller

Two related subsystems:

google plugin — per-agent OAuth client plus a generic google_call tool that lets an agent hit any Google API the granted scopes allow
gmail-poller plugin — cron-style scheduler that polls Gmail, matches subjects/bodies with regex, and dispatches results to any outbound topic (WhatsApp, Telegram, another agent)

Phase 94 — extracted to standalone subprocess plugin. The agent-callable surface (google_auth_start, google_auth_status, google_call, google_auth_revoke) now lives in nexo-rs-plugin-google, packaged as a separate binary the daemon spawns via discovery. Operator install:
cargo install nexo-plugin-google
The binary self-publishes its manifest at boot (nexo-plugin-google --print-manifest) and exposes a --oauth-once <agent_id> CLI subcommand the setup wizard uses for initial consent (loopback by default; --device for headless).

The in-tree crates/plugins/google/ lib survives as the dependency for nexo-poller's google_calendar + gmail builtins, which call the OAuth client in-process. Future cleanup: migrate the poller to the published nexo-plugin-google 0.2.0 lib crate.

Sources: nexo-rs-plugin-google/ (standalone repo) and the legacy in-tree crates/plugins/google/ (poller-only).

`google` — per-agent OAuth

Config

Two shapes supported:

Preferred (Phase 17) — declare accounts in a dedicated store and bind them from the agent via credentials.google:

# config/plugins/google-auth.yaml
google_auth:
  accounts:
    - id: ana@gmail.com
      agent_id: ana                     # 1:1; gauntlet enforces the binding
      client_id_path:     ./secrets/google/ana_client_id.txt
      client_secret_path: ./secrets/google/ana_client_secret.txt
      token_path:         ./secrets/google/ana_token.json
      scopes:
        - https://www.googleapis.com/auth/gmail.modify

Gmail-poller picks these up automatically; agents see google_* tools when the store has an entry matching their agent_id.

Legacy inline (still works, logs a migration warn):

# agents.yaml
google_auth:
  client_id: ${GOOGLE_CLIENT_ID}
  client_secret: ${file:./secrets/google_secret.txt}
  scopes:
    - gmail.readonly
    - gmail.send
    - calendar
    - drive.file
  token_file: ./data/workspace/ana/google_token.json
  redirect_port: 17653

Field	Default	Purpose
`client_id` / `client_secret`	—	OAuth app creds from Google Cloud Console.
`scopes`	—	OAuth scopes. Short-form (`gmail.readonly`) auto-expanded to full URL.
`token_file`	`google_tokens.json`	Persistent refresh-token JSON. Relative paths resolve from workspace.
`redirect_port`	`8765`	Loopback callback port. Must match the "Authorized redirect URI" in the OAuth client.
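
The short-form scope expansion mentioned in the table could look roughly like this. `expand_scope` is a hypothetical helper, not the plugin's API; the prefix is taken from the full-form scope shown in the preferred config, and passing URL-shaped values through unchanged is an assumption.

```rust
/// Expand a short-form scope such as `gmail.readonly` to its full URL.
/// Assumption: anything that already looks like a URL passes through unchanged.
fn expand_scope(scope: &str) -> String {
    const PREFIX: &str = "https://www.googleapis.com/auth/";
    if scope.starts_with("https://") || scope.starts_with("http://") {
        scope.to_string()
    } else {
        format!("{}{}", PREFIX, scope)
    }
}
```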

Pairing flow

sequenceDiagram
    participant A as Agent LLM
    participant T as google_auth_start
    participant B as Browser
    participant L as Loopback listener<br/>127.0.0.1:<port>/callback
    participant G as Google OAuth

    A->>T: invoke
    T->>L: start listener
    T-->>A: return consent URL
    A->>B: ask user to open URL
    B->>G: consent flow
    G->>L: redirect w/ code
    L->>G: exchange code → tokens
    L->>L: persist refresh_token<br/>(mode 0o600)
    L-->>A: success

The wizard wraps this as a one-shot step, but runtime tools expose the same primitives for re-auth.

Device-code flow (headless setup)

agent setup google offers a second consent path that does not require a local browser — useful for servers, CI, and SSH-only environments. The wizard:

POSTs to oauth2.googleapis.com/device/code with the account's client_id and scopes.
Prints a user_code (for example HBQM-WLNF) and a verification_url to the terminal.
Polls oauth2.googleapis.com/token (default every 5 s) until the operator approves on any device.
Persists the resulting refresh_token at token_path with mode 0o600.

╭─ Device-code OAuth ───────────────────────────────────────
│  Open in any browser:          https://www.google.com/device
│  Code to enter:                HBQM-WLNF
│  (valid for 1800s)
╰───────────────────────────────────────────────────────────

Waiting for approval...
✔ Tokens persisted at ./secrets/ana_google_token.json.

The Google Cloud Console OAuth client must be type "TVs and Limited Input devices" for this flow — Desktop/Web clients reject device-code with client_type_disabled.
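
The polling step of the device-code flow follows the usual OAuth device grant convention: `authorization_pending` means keep waiting, `slow_down` means wait longer, anything else ends the flow. A minimal sketch of that decision, with hypothetical names (`PollStep`, `next_poll_step`) and the 5 s `slow_down` increment from RFC 8628 as the assumed policy:

```rust
use std::time::Duration;

#[derive(Debug, PartialEq)]
enum PollStep {
    /// Keep polling after waiting this long.
    Wait(Duration),
    /// Stop: the operator denied the request, the code expired, or the client is misconfigured.
    Abort(String),
}

/// Decide what to do after a token poll returned an OAuth `error` code.
fn next_poll_step(error: &str, interval: Duration) -> PollStep {
    match error {
        // The operator has not approved yet: retry at the same cadence.
        "authorization_pending" => PollStep::Wait(interval),
        // The server asks for a slower cadence: add 5 s to the interval.
        "slow_down" => PollStep::Wait(interval + Duration::from_secs(5)),
        // access_denied, expired_token and everything else are terminal.
        other => PollStep::Abort(other.to_string()),
    }
}
```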

Lazy-refresh of `client_id` / `client_secret`

GoogleAuthClient.config is ArcSwap<GoogleAuthConfig>. Every network call (exchange_code, request_device_code, poll_device_token, refresh_token) first invokes refresh_secrets_if_changed, which compares mtime on client_id_path and client_secret_path and re-reads them when they advance. Rotating the secret files (e.g. quarterly key rotation in Google Cloud Console) takes effect on the next tool call without a daemon restart.

Steady-state cost: one fs::metadata call per outbound request. Audit trail (target credentials.audit):

INFO event="google_secrets_refreshed" \
  google_*: re-read client_id/client_secret after on-disk rotation
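
The mtime comparison behind refresh_secrets_if_changed can be sketched as below. `SecretStamp` is a hypothetical type, not the real struct; in practice `current` would come from `std::fs::metadata(path)?.modified()?`.

```rust
use std::time::SystemTime;

/// Tracks the last modification time seen for one secret file.
struct SecretStamp {
    last_mtime: Option<SystemTime>,
}

impl SecretStamp {
    /// Returns true (and remembers `current`) when the file must be re-read,
    /// i.e. on first use or when the mtime has advanced.
    fn advanced(&mut self, current: SystemTime) -> bool {
        let changed = match self.last_mtime {
            Some(prev) => current > prev,
            None => true,
        };
        if changed {
            self.last_mtime = Some(current);
        }
        changed
    }
}
```

Comparing with `>` rather than `!=` means a file restored with an older mtime is not re-read, which matches the "when they advance" wording above.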

Tools exposed

Tool	Purpose
`google_auth_start`	Start OAuth, return the consent URL.
`google_auth_status`	Report `{authenticated, expires_in_secs, has_refresh, scopes}`. Safe to poll.
`google_call`	Generic `{method, url, body?}` against any `*.googleapis.com` endpoint. Auto-refreshes access token.
`google_auth_revoke`	Revoke the refresh token; forces full re-auth.

Supported APIs

Anything under *.googleapis.com that the granted scopes permit. Common call shapes:

Gmail v1 — https://gmail.googleapis.com/gmail/v1/users/me/messages?q=is:unread
Calendar v3 — https://www.googleapis.com/calendar/v3/calendars/primary/events
Drive v3 — https://www.googleapis.com/drive/v3/files?q=mimeType='application/pdf'
Sheets v4 — https://sheets.googleapis.com/v4/spreadsheets/<id>/values/A1:D10

Gotchas

401 means the refresh token was revoked. Re-auth via google_auth_start.
403 means a scope wasn't granted. Add the scope, revoke, re-auth.
Token file leaks → revoke immediately. The file holds a refresh token with the granted scopes.

`gmail-poller` — cron-style Gmail bridge

Poll Gmail, extract fields via regex, render a template, dispatch to any outbound topic. Multi-account, allowlisted by sender substring, rate-limited per dispatch.

Config

# config/plugins/gmail-poller.yaml
gmail_poller:
  enabled: true
  interval_secs: 60
  accounts:
    - id: default
      agent_id: ana           # Phase 17 — binds the account to an agent; defaults to `id` when omitted
      token_path: ./data/workspace/ana/google_token.json
      client_id_path: ./secrets/google_client_id.txt
      client_secret_path: ./secrets/google_client_secret.txt
  jobs:
    - name: lead_forward
      account: default
      query: "is:unread subject:(lead OR interesado)"
      newer_than: 1d
      interval_secs: 120
      forward_to_subject: plugin.outbound.whatsapp.default
      forward_to: "573000000000@s.whatsapp.net"
      extract:
        name: "Nombre:\\s*(.+)"
        phone: "Tel:\\s*(\\+?\\d+)"
      require_fields: [name, phone]
      message_template: |
        New lead 🚨
        {name} — {phone}
        Subject: {subject}
        {snippet}
      mark_read_on_dispatch: true
      max_per_tick: 20
      dispatch_delay_ms: 1000
      sender_allowlist: ["@mycompany.com", "partners@"]

Per-job fields

Field	Default	Purpose
`name`	— (required)	Job id.
`account`	`"default"`	Which OAuth account to use.
`query`	— (required)	Gmail search (`is:unread`, etc.).
`newer_than`	—	Gmail `newer_than:` suffix (`1d`, `2h`) — avoids back-filling.
`interval_secs`	root interval	Override per-job poll cadence.
`forward_to_subject`	—	Broker topic to publish dispatched message.
`forward_to`	—	Recipient passed through (JID, chat id, phone).
`extract`	`{}`	Map of field name to regex, applied to the email body. The first capture group becomes the field value.
`require_fields`	`[]`	Skip dispatch if any listed extracted field is empty.
`message_template`	— (required)	Template with `{field}`, `{subject}`, `{snippet}` placeholders.
`mark_read_on_dispatch`	`true`	Mark the thread as read after successful dispatch.
`dispatch_delay_ms`	`1000`	Sleep between multi-match dispatches.
`max_per_tick`	`20`	Hard cap per poll cycle.
`sender_allowlist`	`[]`	Substring/domain filter on `From:` header. Empty = accept all.
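
How require_fields and message_template interact can be sketched as follows. `render_message` is a hypothetical helper; it assumes `fields` already holds the extracted values plus `subject` and `snippet`, and it does a naive string replace rather than whatever the poller actually uses.

```rust
use std::collections::HashMap;

/// Render `{key}` placeholders. Returns None when a required field is
/// missing or empty, which corresponds to skipping the dispatch.
fn render_message(
    template: &str,
    fields: &HashMap<String, String>,
    require_fields: &[&str],
) -> Option<String> {
    let missing = require_fields
        .iter()
        .any(|key| fields.get(*key).map_or(true, |v| v.trim().is_empty()));
    if missing {
        return None;
    }
    let mut out = template.to_string();
    for (key, value) in fields {
        // "{{{}}}" formats to a literal "{key}" placeholder.
        out = out.replace(&format!("{{{}}}", key), value);
    }
    Some(out)
}
```

Unknown placeholders are left untouched in this sketch.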

Event shape

{
  "to": "<forward_to>",
  "kind": "text",
  "text": "<rendered message_template>",
  "subject": "<email subject>",
  "<extract key>": "<captured group>"
}

Published to <forward_to_subject>.

Error backoff

Sustained errors are backed off: [0, 0, 0, 30, 60, 120, 300] seconds (caps at 300). Transient failures don't stop the poll loop.
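
One way to read that schedule is as a lookup indexed by the number of consecutive errors, where the last entry is the cap. The indexing convention (1-based error count) and the function name are assumptions; only the seven values come from the text above.

```rust
use std::time::Duration;

/// Schedule from the text above, in seconds; the last entry is the cap.
const BACKOFF_SECS: [u64; 7] = [0, 0, 0, 30, 60, 120, 300];

/// Delay before the next poll after `consecutive_errors` failures in a row.
fn backoff_delay(consecutive_errors: usize) -> Duration {
    if consecutive_errors == 0 {
        return Duration::ZERO;
    }
    // Errors past the end of the table keep using the final 300 s entry.
    let idx = (consecutive_errors - 1).min(BACKOFF_SECS.len() - 1);
    Duration::from_secs(BACKOFF_SECS[idx])
}
```

Under this reading the first three failures retry immediately, which is why transient errors do not slow the loop.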

Gotchas

Gmail API only — no IMAP. This plugin is Google-specific. For generic IMAP triage, use a custom extension.
sender_allowlist is substring, not regex. Simpler to read, simpler to get wrong. Include boundary characters (such as the leading @ of a domain) explicitly.
extract regex must compile. Invalid regex fails the whole job at boot with an error naming the field.
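
To make the substring gotcha concrete, here is a minimal sketch of substring matching on the From: header. `sender_allowed` is a hypothetical name, and case-sensitive matching is an assumption.

```rust
/// True when `from` passes the allowlist; an empty list accepts every sender.
fn sender_allowed(from: &str, allowlist: &[&str]) -> bool {
    allowlist.is_empty() || allowlist.iter().any(|needle| from.contains(*needle))
}
```

With this logic the entry `mycompany.com` also accepts `x@evil-mycompany.com`, while `@mycompany.com` does not. Even the stricter entry still matches a longer domain that starts with the same text, so substring filters are a convenience, not a security boundary.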

Long-term memory (SQLite)

Durable memory shared by every agent in the process. One SQLite file, multi-tenant via an agent_id column on every row. Survives restarts.

Source: crates/memory/src/long_term.rs.

Storage location

long_term:
  backend: sqlite
  sqlite:
    path: ./data/memory.db

One file for all agents. Per-agent isolation is enforced by WHERE agent_id = ? on every query — not by separate DB files. An idx_memories_agent(agent_id, created_at DESC) index keeps those queries fast.

If you want per-agent file separation, override sqlite.path per agent via an inbound_bindings[] override or a per-agent config directory.

Schema

The runtime creates these tables at boot if they don't exist.

`memories` — atomic facts

CREATE TABLE memories (
  id            TEXT PRIMARY KEY,  -- UUID
  agent_id      TEXT NOT NULL,
  content       TEXT NOT NULL,
  tags          TEXT DEFAULT '[]', -- JSON array
  concept_tags  TEXT DEFAULT '[]', -- auto-derived (phase 10.7)
  created_at    INTEGER NOT NULL   -- ms since epoch
);
CREATE INDEX idx_memories_agent ON memories(agent_id, created_at DESC);

`memories_fts` — full-text search (FTS5)

CREATE VIRTUAL TABLE memories_fts USING fts5(
  content,
  id        UNINDEXED,
  agent_id  UNINDEXED
);

Powers the keyword recall mode with BM25 ranking.

`interactions` — conversation archive

CREATE TABLE interactions (
  id          TEXT PRIMARY KEY,
  session_id  TEXT NOT NULL,
  agent_id    TEXT NOT NULL,
  role        TEXT,
  content     TEXT,
  created_at  INTEGER
);
CREATE INDEX idx_interactions_session ON interactions(session_id, created_at DESC);

`reminders` — phase 7 heartbeat reminders

CREATE TABLE reminders (
  id            TEXT PRIMARY KEY,
  agent_id      TEXT NOT NULL,
  session_id    TEXT NOT NULL,
  plugin        TEXT,
  recipient     TEXT,
  message       TEXT,
  due_at        INTEGER,
  claimed_at    INTEGER,
  delivered_at  INTEGER,
  created_at    INTEGER
);
CREATE INDEX idx_reminders_due
  ON reminders(agent_id, delivered_at, due_at ASC);

`recall_events` — signal tracking (phase 10.5)

CREATE TABLE recall_events (
  id         INTEGER PRIMARY KEY AUTOINCREMENT,
  agent_id   TEXT,
  memory_id  TEXT,
  query      TEXT,
  score      REAL,
  ts_ms      INTEGER
);

Every recall() hit records a row. Dream sweeps read this to decide what to promote.

`memory_promotions` — dreaming ledger (phase 10.6)

CREATE TABLE memory_promotions (
  memory_id    TEXT PRIMARY KEY,
  agent_id     TEXT,
  promoted_at  INTEGER,
  score        REAL,
  phase        TEXT
);

Prevents double-promotion across sweeps.

`vec_memories` — vector index (phase 5.4, optional)

Created on demand when vector.enabled: true. See Vector search.

What gets written when

Action	Writes to
Agent calls `memory.remember(content, tags)`	`memories`, `memories_fts`, `vec_memories` (if enabled)
Every turn	`interactions` (used for transcripts, not promoted into `memories`)
Agent calls `forge_reminder(...)`	`reminders`
Every `recall()` hit	`recall_events` (one row per result returned)
Dream sweep promotes hot memory	`memory_promotions`

Memory tool

Single unified tool with three actions, visible to the LLM as memory:

Action	Required	Optional	Returns
`remember`	`content`	`tags[]`, `context`	`{ok, id}`
`recall`	`query`	`limit` (default 5), `mode` (`keyword` | `vector` | `hybrid`)	`{ok, results: [{id, content, tags}]}`
`forget`	`id`	—	`{ok}`

Results do not include similarity scores — only content and tags. Scores are used internally for dreaming signal tracking but aren't surfaced to the LLM to avoid encouraging score-gaming prompts.

Other memory-related tools:

forge_memory_checkpoint — snapshot the workspace-git repo (phase 10.9)
memory_history — git log + optional unified diff (phase 10.9)

Per-agent isolation

flowchart TB
    subgraph PROC[agent process]
        DB[(./data/memory.db<br/>single SQLite file)]
    end
    A1[agent: ana] -->|WHERE agent_id = 'ana'| DB
    A2[agent: kate] -->|WHERE agent_id = 'kate'| DB
    A3[agent: ops] -->|WHERE agent_id = 'ops'| DB

One LongTermMemory instance per process, shared across agents via Arc. The MemoryTool attached to each agent passes ctx.agent_id to every query.

Workspace-git (phase 10.9)

A separate per-agent git repo lives in the agent's workspace directory (not inside the memory DB). When workspace_git.enabled: true, the runtime commits after:

Dream sweeps (Phase 10.6)
forge_memory_checkpoint tool calls
Session close (on_expire)

Good for forensic replay — you can git log to see the memory state at any point. See Soul — MEMORY.md.

Gotchas

One DB, multi-tenant. A query missing its agent_id filter would leak across agents. All runtime code goes through the LongTermMemory API which injects it automatically.
Vacuum is manual. SQLite does not auto-compact after deletes. Run VACUUM; periodically, or set PRAGMA auto_vacuum=incremental from day one and run PRAGMA incremental_vacuum; to reclaim free pages.
recall_events grows unboundedly. Dream sweeps periodically prune, but a dreaming-disabled agent's table will grow forever. Add a retention job if you run without dreaming.

Vector search

Optional semantic memory via sqlite-vec — a virtual table inside the same SQLite file used for long-term memory. No separate service, no extra process, no migration.

Source: crates/memory/src/vector.rs, crates/memory/src/embedding/.

Turning it on

vector:
  enabled: true
  backend: sqlite-vec
  default_recall_mode: hybrid
  embedding:
    provider: http
    base_url: https://api.openai.com/v1
    model: text-embedding-3-small
    api_key: ${OPENAI_API_KEY}
    dimensions: 1536
    timeout_secs: 30

Dimension must match the model output:

Model	Dimensions
`text-embedding-3-small`	1536
`text-embedding-3-large`	3072
`nomic-embed-text`	768
Gemini `text-embedding-004`	768

A mismatch aborts startup with an explicit error. If you already have vectors at a different dimension, you must delete the DB (or the vector table) and rebuild the index.

Storage

CREATE VIRTUAL TABLE vec_memories USING vec0(
  memory_id TEXT PRIMARY KEY,
  embedding FLOAT[<dimensions>]
);

The virtual table lives in the same SQLite file as memories. A join on memory_id brings you back the content and tags.

Embedding provider

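// `Result` is assumed to be a crate-level alias (for example anyhow::Result).
trait EmbeddingProvider {
    /// Length of every embedding vector this provider returns.
    fn dimension(&self) -> usize;
    /// Embeds a batch of texts in one call.
    async fn embed(&self, texts: &[String]) -> Result<Vec<Vec<f32>>>;
}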

Phase 5.4 ships one provider: http — any OpenAI-compatible /embeddings endpoint. That covers OpenAI, Gemini (via its API), Ollama, LM Studio, and self-hosted inference.

Local-only providers (fastembed, candle) are intentional follow-ups — the HTTP provider is enough to unblock everything downstream.

Recall modes

Set the default in memory.yaml and override per tool call with the mode argument.

`keyword` — FTS5 + concept expansion

flowchart LR
    Q[query] --> CT[derive 3 concept tags]
    Q --> M[FTS5 MATCH<br/>query OR tag1 OR tag2 OR tag3]
    CT --> M
    M --> R[rank by BM25]
    R --> RES[top N]

Fast, no embedding cost
Misses semantic neighbors that don't share vocabulary
The extra concept tags are auto-derived from the query and OR-ed into the FTS5 match, which widens it to related concepts

`vector` — nearest-neighbor

flowchart LR
    Q[query] --> EMB[embed]
    EMB --> VEC[vec_memories<br/>MATCH k=N*2]
    VEC --> JOIN[join memories<br/>filter by agent_id]
    JOIN --> RES[top N by distance]

Catches paraphrases and cross-vocabulary matches
Embedding request on every call — watch costs and latency
Provider errors surface to the caller in pure vector mode; only hybrid mode falls back to keyword

`hybrid` — Reciprocal Rank Fusion

The default recommendation. Runs both keyword and vector, then fuses ranks with the RRF formula 1 / (K + rank + 1) where K = 60:

flowchart LR
    Q[query] --> K[keyword search]
    Q --> V[vector search]
    K --> RRF[RRF fusion<br/>K=60]
    V --> RRF
    RRF --> RES[top N by fused score]

Vector errors degrade gracefully to keyword-only without raising.
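
A minimal sketch of the fusion step, assuming `rank` in the formula is 0-based (so the best hit in each list contributes 1 / 61) and that both inputs are memory ids ordered best first. `rrf_fuse` is a hypothetical name, not the crate's API.

```rust
use std::collections::HashMap;

/// RRF constant from the text above.
const K: f64 = 60.0;

/// Fuse two ranked id lists (best first) and return the top `limit` ids.
fn rrf_fuse(keyword: &[&str], vector: &[&str], limit: usize) -> Vec<String> {
    let mut scores: HashMap<&str, f64> = HashMap::new();
    for list in [keyword, vector] {
        for (rank, id) in list.iter().enumerate() {
            // `rank` is 0-based here, so the best hit contributes 1 / (K + 1).
            *scores.entry(*id).or_insert(0.0) += 1.0 / (K + rank as f64 + 1.0);
        }
    }
    let mut fused: Vec<(&str, f64)> = scores.into_iter().collect();
    // Highest fused score first; ties broken by id so the output is deterministic.
    fused.sort_by(|a, b| b.1.partial_cmp(&a.1).unwrap().then_with(|| a.0.cmp(b.0)));
    fused.into_iter().take(limit).map(|(id, _)| id.to_string()).collect()
}
```

Passing an empty `vector` list reproduces the keyword-only degradation: the fused order is simply the keyword order.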

Tool interaction

The memory tool takes an optional mode param:

{
  "action": "recall",
  "query": "what's the client's address?",
  "limit": 5,
  "mode": "hybrid"
}

If omitted, default_recall_mode is used.

Cost and latency profile

Mode	Per recall
`keyword`	1 SQL query, no LLM call
`vector`	1 embedding HTTP call + 1 SQL query
`hybrid`	1 embedding HTTP call + 2 SQL queries + fusion

For high-throughput agents that recall on every turn, start with keyword and upgrade to hybrid only where the miss rate matters.

Gotchas

Changing embedding model = full reindex. The dimension check catches the obvious case, but even same-dimension model swaps produce semantically different vectors; the old index becomes stale.
sqlite3_auto_extension registers once per process. Not a problem in production, but test suites that instantiate multiple SQLite connections across tests may hit edge cases.
Vector returns distance, not similarity. Lower is closer. Hybrid fusion normalizes across both, so callers don't see this directly unless they bypass the tool.

Stdio runtime + Discovery

The stdio runtime is the default way extensions run: a child process speaking line-delimited JSON-RPC over stdin/stdout. This page covers how the runtime discovers, spawns, supervises, and registers tools from a stdio extension.

Source: crates/extensions/src/discovery.rs, crates/extensions/src/runtime/stdio.rs.

Discovery

# config/extensions.yaml
extensions:
  enabled: true
  search_paths: [./extensions]
  ignore_dirs: [node_modules, .git, target]
  disabled: []
  allowlist: []            # empty = all allowed
  max_depth: 4
  follow_links: false
  watch:
    enabled: false
    debounce_ms: 500

ExtensionDiscovery walks each search path, looking for plugin.toml files:

flowchart TD
    ROOT[search_paths root] --> WALK[walkdir max_depth]
    WALK --> IGNORE{dir in<br/>ignore_dirs?}
    IGNORE -->|yes| SKIP[skip]
    IGNORE -->|no| FIND[find plugin.toml]
    FIND --> PARSE[parse + validate manifest]
    PARSE --> SIDE[sidecar .mcp.json if manifest<br/>has no mcp_servers]
    SIDE --> PRUNE[prune nested candidates]
    PRUNE --> DEDUP[dedupe by id]
    DEDUP --> DIS[apply disabled filter]
    DIS --> ALLOW[apply allowlist filter]
    ALLOW --> SORT[sort by root_index, id]
    SORT --> CANDS[DiscoveryReport<br/>candidates + diagnostics]

Prune-nested removes any candidate whose root_dir is a strict descendant of another — avoids registering an extension twice if it happens to live inside another extension's tree. Algorithm is O(N × depth).
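
One way to get the O(N × depth) bound is to put every candidate root in a set and walk each candidate's ancestors. This is a sketch under the assumption that roots are already normalized paths; `prune_nested` here is illustrative, not the code in discovery.rs.

```rust
use std::collections::HashSet;
use std::path::{Path, PathBuf};

/// Drop every root that is a strict descendant of another candidate root.
fn prune_nested(roots: &[PathBuf]) -> Vec<PathBuf> {
    let known: HashSet<&Path> = roots.iter().map(|p| p.as_path()).collect();
    roots
        .iter()
        .filter(|root| {
            // ancestors() yields the path itself first, so skip(1) keeps the check strict.
            !root.ancestors().skip(1).any(|a| known.contains(a))
        })
        .cloned()
        .collect()
}
```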

follow_links = false is the default (monorepo-safe). When enabled, symlink escapes out of the root raise DiagnosticLevel::Error.

Gating

Before spawn, Requires::missing() runs:

flowchart LR
    CAND[candidate] --> REQ[requires.bins<br/>+ requires.env]
    REQ --> BINS{all on $PATH?}
    BINS -->|no| SKIP1[warn + skip]
    BINS -->|yes| ENV{all env set?}
    ENV -->|no| SKIP2[warn + skip]
    ENV -->|yes| SPAWN[spawn runtime]

A skipped extension does not register any tools. The warn log names exactly which bin or env var was missing.
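
The gate can be sketched as a pure function that reports what is missing. `Missing` and `missing_requirements` are hypothetical names; the two lookups are injected so the logic stays testable. In a real host, `env_is_set` could be `|k| std::env::var_os(k).is_some()` and `bin_on_path` could scan the directories from `std::env::split_paths`.

```rust
/// Names that are missing, split by kind, mirroring the two checks in the diagram.
#[derive(Debug, Default, PartialEq)]
struct Missing {
    bins: Vec<String>,
    env: Vec<String>,
}

impl Missing {
    /// An extension is spawned only when nothing is missing.
    fn is_empty(&self) -> bool {
        self.bins.is_empty() && self.env.is_empty()
    }
}

fn missing_requirements(
    bins: &[&str],
    env: &[&str],
    bin_on_path: impl Fn(&str) -> bool,
    env_is_set: impl Fn(&str) -> bool,
) -> Missing {
    Missing {
        bins: bins.iter().filter(|b| !bin_on_path(**b)).map(|b| (*b).to_string()).collect(),
        env: env.iter().filter(|e| !env_is_set(**e)).map(|e| (*e).to_string()).collect(),
    }
}
```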

Spawn model

sequenceDiagram
    participant H as Host (agent)
    participant S as StdioRuntime
    participant C as Child process

    H->>S: spawn(manifest, cwd)
    S->>C: tokio::process::Command
    S->>C: {"jsonrpc":"2.0","method":"initialize",<br/>"params":{"agent_version","extension_id"},"id":0}
    C-->>S: {"result":{"server_version","tools":[...],"hooks":[...]}}
    S-->>H: HandshakeInfo
    H->>H: register each tool as ExtensionTool
    H->>H: register each hook as ExtensionHook

Child is spawned with the extension's directory as cwd
stdin + stdout is the RPC channel (line-delimited JSON)
stderr is routed to the agent's tracing output
Handshake timeout: default 10 s

Tool descriptors

{
  "name": "get_weather",
  "description": "Look up weather by city.",
  "input_schema": { "type": "object", "properties": { "city": { "type": "string" } }, "required": ["city"] }
}

The host wraps each descriptor in an ExtensionTool:

Registered name: ext_{plugin_id}_{tool_name} (truncated with hash suffix if it exceeds 64 chars)
Description prefixed with [ext:{id}] so the LLM knows the origin
input_schema copied to the registered tool
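
The naming rule could be implemented along these lines. The hash function and suffix format are assumptions (the text only says "hash suffix"); this sketch uses std's DefaultHasher, whose output is not guaranteed stable across Rust releases, so a real implementation would likely pick a fixed hash.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

const MAX_NAME_LEN: usize = 64;

/// Build `ext_{plugin_id}_{tool_name}`, truncating with a hash suffix past 64 chars.
fn registered_name(plugin_id: &str, tool_name: &str) -> String {
    let full = format!("ext_{}_{}", plugin_id, tool_name);
    if full.chars().count() <= MAX_NAME_LEN {
        return full;
    }
    let mut hasher = DefaultHasher::new();
    full.hash(&mut hasher);
    // 8 hex chars of the hash; the real suffix format is an assumption.
    let suffix = format!("{:08x}", hasher.finish() as u32);
    let keep = MAX_NAME_LEN - suffix.len() - 1;
    // Truncate by chars so a non-ASCII id cannot split a code point.
    let head: String = full.chars().take(keep).collect();
    format!("{}_{}", head, suffix)
}
```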

Context passthrough

If the manifest sets context.passthrough = true, every call() injects:

{ "_meta": { "agent_id": "...", "session_id": "..." }, ...user_args }

The extension can decide how to split state per agent or session.

Env injection

The host passes through most env vars to the child, but blocks secret-like names via substring/suffix rules:

Suffixes: _TOKEN, _KEY, _SECRET, _PASSWORD, _CREDENTIAL, _PAT, _AUTH, _APIKEY, _BEARER, _SESSION
Substrings: PASSWORD, SECRET, CREDENTIAL, PRIVATE_KEY

Extensions that need a secret should read it from a file path the host passes by argument, or declare the variable in their own requires.env entry, which whitelists it as a conscious operator decision.
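
A sketch of the name-based filter, using the suffix and substring lists above. The function names are hypothetical, and case-insensitive matching is an assumption. `passes_to_child` models the requires.env whitelist described in the Gotchas below.

```rust
const BLOCKED_SUFFIXES: [&str; 10] = [
    "_TOKEN", "_KEY", "_SECRET", "_PASSWORD", "_CREDENTIAL",
    "_PAT", "_AUTH", "_APIKEY", "_BEARER", "_SESSION",
];
const BLOCKED_SUBSTRINGS: [&str; 4] = ["PASSWORD", "SECRET", "CREDENTIAL", "PRIVATE_KEY"];

/// True when the variable name looks like it carries a secret.
fn is_secret_like(name: &str) -> bool {
    // Assumption: matching is case-insensitive; the text does not say.
    let upper = name.to_ascii_uppercase();
    BLOCKED_SUFFIXES.iter().any(|s| upper.ends_with(*s))
        || BLOCKED_SUBSTRINGS.iter().any(|s| upper.contains(*s))
}

/// A variable reaches the child if it is declared in requires.env or is not secret-like.
fn passes_to_child(name: &str, requires_env: &[&str]) -> bool {
    requires_env.contains(&name) || !is_secret_like(name)
}
```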

Supervision

stateDiagram-v2
    [*] --> Spawning
    Spawning --> Ready: handshake ok
    Ready --> Restarting: child crash
    Restarting --> Ready: handshake ok again
    Restarting --> Failed: max attempts<br/>in restart_window
    Ready --> Shutdown: graceful signal
    Failed --> Shutdown
    Shutdown --> [*]

Supervisor policy:

Max restart attempts within a sliding restart_window
Exponential backoff base_backoff → max_backoff
Each transport is wrapped in a CircuitBreaker named ext:stdio:{id} so hung children don't freeze the agent loop
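
The backoff part of that policy can be sketched as a doubling delay clamped to max_backoff. The doubling factor and the 0-based attempt counter are assumptions; the text only names base_backoff and max_backoff.

```rust
use std::time::Duration;

/// Delay before restart attempt `attempt` (0-based): base * 2^attempt, capped at max.
fn restart_backoff(attempt: u32, base: Duration, max: Duration) -> Duration {
    // Saturate instead of overflowing for very large attempt counts.
    let factor = 2u32.checked_pow(attempt).unwrap_or(u32::MAX);
    base.checked_mul(factor).map_or(max, |d| d.min(max))
}
```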

Graceful shutdown sends an empty message, waits shutdown_grace (default 3 s), then kills the child.

Watcher (phase 11.2 follow-up)

With extensions.watch.enabled: true the runtime watches search_paths for changes to any plugin.toml. Change-set is debounced (debounce_ms) and compared by SHA-256 of the file to squash spurious writes.

On change the runtime logs — it does not auto-reload. The operator restarts the agent to pick up the new manifest. Hot reload is a future phase.

Gotchas

Blocked env vars surprise extensions. If an extension expected OPENAI_API_KEY to come through and it wasn't declared in requires.env, the name-based block may silently strip it. Declare the env you need — that whitelists it.
follow_links: true + symlinked monorepo layouts can cause discovery to traverse out of the search root. Keep follow_links: false unless you know the layout is bounded.
Children crashing during handshake. You get a single DiagnosticLevel::Error per candidate, not a retry loop. Fix the binary, restart the host.

NATS runtime

For extensions that run out-of-process and manage their own lifecycle — a long-lived service on another machine, a container in an orchestrator, an operator-maintained daemon. The agent talks to them over NATS RPC instead of stdin/stdout.

Source: crates/extensions/src/runtime/nats.rs.

When to pick NATS over stdio

Use stdio	Use NATS
Extension is a binary you ship with the agent	Extension is a separate service you operate
Lifecycle is tied to the agent	Lifecycle is independent (k8s, systemd)
Fast local startup; co-resident on same host	Might be remote or shared between hosts
Dev-loop: install once and forget	Sensitive deployment — deploy independently of the agent

Stdio is the default. Reach for NATS when the extension's failure domain must be separated from the agent's.

Manifest

[plugin]
id = "heavy-compute"
version = "0.3.0"

[capabilities]
tools = ["long_running_job"]

[transport]
type = "nats"
subject_prefix = "ext.heavy-compute"

Wire shape

Single request/reply subject:

{subject_prefix}.rpc

With the manifest above, that resolves to ext.heavy-compute.rpc.

sequenceDiagram
    participant A as Agent
    participant N as NATS
    participant E as Extension service

    A->>N: publish ext.heavy-compute.rpc<br/>{method:"initialize", ...}
    N->>E: deliver
    E->>N: reply HandshakeInfo
    N-->>A: tools + hooks
    A->>A: register ExtensionTool per tool
    Note over A,E: steady state
    loop tool call
        A->>N: {method:"tools/long_running_job", params, id}
        N->>E: deliver
        E-->>N: result
        N-->>A: reply
    end

The JSON-RPC shape is identical to stdio — only the transport changes. Extensions don't need to know which form the host chose.

Liveness

Instead of supervising a child process, the NATS runtime uses heartbeats:

Field	Default	Purpose
`heartbeat_interval`	`15 s`	Expected beacon cadence from the extension.
`heartbeat_grace_factor`	`3`	Mark failed after `grace_factor × interval` silence.

A failed extension logs a warn and is marked unavailable. Tools stay registered in the registry but calls error out immediately. When the extension starts beaconing again, it's automatically marked available.
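
With the defaults in the table, an extension is marked failed after 3 × 15 s = 45 s of silence. A minimal sketch of that check (`is_available` is a hypothetical name; whether the boundary itself counts as alive is an assumption):

```rust
use std::time::Duration;

/// True while the last beacon is within `grace_factor × interval`.
fn is_available(since_last_beacon: Duration, interval: Duration, grace_factor: u32) -> bool {
    since_last_beacon <= interval * grace_factor
}
```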

Circuit breaker

Same pattern as stdio: one CircuitBreaker per extension, ext:nats:{id}, wrapping every RPC. Prevents a flapping extension from piling up outstanding calls against it.

Deployment recipes

Docker compose side service

services:
  agent:
    image: nexo-rs:latest
    depends_on: [nats, heavy-compute]
  nats:
    image: nats:2.10-alpine
  heavy-compute:
    image: my-ext:0.3.0
    command: ["--nats-url", "nats://nats:4222",
              "--subject-prefix", "ext.heavy-compute"]

Kubernetes

Run the extension as its own Deployment with its own resource limits, rollouts, and observability. Share the NATS cluster via a Service. Scale extensions independently of agents.

Gotchas

subject_prefix collisions. Two extensions with the same prefix will step on each other. Enforce uniqueness in your ops convention.
Latency. NATS over LAN is sub-millisecond, but any network hop is orders of magnitude slower than stdio's pipe. Don't pick NATS for a 1 kHz tool call pattern.
Auth on the broker. NATS auth applies to extensions too: if you turn on NKey or mTLS auth, every extension service must be enrolled.

1Password extension

A bundled stdio extension that wraps the op CLI with a service-account token. Read-only: it never creates or edits secrets. Two main use cases:

Look up a secret you don't already have in env (read_secret).
Use a secret in a command without ever exposing it to the agent (inject_template).

Source: extensions/onepassword/. Skill prompt: skills/onepassword/SKILL.md.

Tools

Tool	Reveals secret?	Audited
`status`	no	no
`whoami`	no	no
`list_vaults`	no	no
`list_items`	no — strips field values	no
`read_secret`	only if `OP_ALLOW_REVEAL=true`	yes
`inject_template`	template-only mode reveals only with `OP_ALLOW_REVEAL=true`; exec mode never reveals to the LLM	yes

`read_secret`

{ "action": "read_secret", "reference": "op://Prod/Stripe/api_key" }

Default response (reveal off):

{
  "ok": true,
  "reference": "op://Prod/Stripe/api_key",
  "vault": "Prod", "item": "Stripe", "field": "api_key",
  "length": 26,
  "fingerprint_sha256_prefix": "3f9a7c2e1b48d5a0",
  "reveal": false
}

With OP_ALLOW_REVEAL=true|1|yes set on the agent process, the response also contains { "value": "...", "reveal": true }.

`inject_template`

Resolves {{ op://Vault/Item/field }} placeholders via op inject. Two execution paths:

Template-only

{ "action": "inject_template",
  "template": "Authorization: Bearer {{ op://Prod/API/token }}\n" }

Reveal off → { length, fingerprint_sha256_prefix, reveal: false }
Reveal on → { rendered: "Authorization: Bearer abc…", reveal: true }

Exec (piped to a command)

{ "action": "inject_template",
  "template": "Bearer {{ op://Prod/API/token }}",
  "command": "curl",
  "args": ["-H", "@-", "https://api.example.com/me"] }

command must be in OP_INJECT_COMMAND_ALLOWLIST (comma-separated). Default empty → exec mode disabled.
Rendered template is never returned to the LLM. Only the downstream command's exit_code, stdout (capped at max_stdout_bytes, default 4096, max 16384), and stderr.
Both stdout and stderr are redacted before being returned — Bearer JWT, sk-…, sk-ant-…, AKIA…, and 32+ char hex tokens are replaced with [REDACTED:<label>].
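
The allowlist check and the stdout cap from the list above can be sketched as follows. All function names are hypothetical. Exact, whole-string command matching is an assumption, as is clamping an oversized `max_stdout_bytes` down to the 16384 ceiling rather than rejecting it.

```rust
const DEFAULT_MAX_STDOUT_BYTES: usize = 4096;
const HARD_MAX_STDOUT_BYTES: usize = 16384;

/// Parse the comma-separated OP_INJECT_COMMAND_ALLOWLIST value.
fn parse_allowlist(raw: &str) -> Vec<String> {
    raw.split(',')
        .map(str::trim)
        .filter(|s| !s.is_empty())
        .map(String::from)
        .collect()
}

/// Exec mode is allowed only for commands listed verbatim; an empty list disables it.
fn command_allowed(command: &str, allowlist: &[String]) -> bool {
    allowlist.iter().any(|c| c == command)
}

/// Clamp the requested cap to the hard maximum and truncate stdout to it.
fn cap_stdout(stdout: &[u8], requested: Option<usize>) -> &[u8] {
    let cap = requested
        .unwrap_or(DEFAULT_MAX_STDOUT_BYTES)
        .min(HARD_MAX_STDOUT_BYTES);
    &stdout[..stdout.len().min(cap)]
}
```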

Dry run

{ "action": "inject_template",
  "template": "{{ op://A/B/c }} {{ op://X/Y/z }}",
  "dry_run": true }

Validates each op:// reference's shape without resolving values. Returns references_validated.
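
A shape check of this kind might look like the sketch below. It only accepts the three-segment op://Vault/Item/field form used on this page. The op CLI also accepts a section segment, which this sketch rejects, and the extension's real validation rules are not shown here. `OpReference` and `parse_op_reference` are hypothetical names.

```rust
#[derive(Debug, PartialEq)]
struct OpReference {
    vault: String,
    item: String,
    field: String,
}

/// Validate the shape of an `op://Vault/Item/field` reference without resolving it.
fn parse_op_reference(reference: &str) -> Option<OpReference> {
    let rest = reference.strip_prefix("op://")?;
    let parts: Vec<&str> = rest.split('/').collect();
    // Exactly three non-empty segments: vault, item, field.
    if parts.len() != 3 || parts.iter().any(|p| p.is_empty()) {
        return None;
    }
    Some(OpReference {
        vault: parts[0].to_string(),
        item: parts[1].to_string(),
        field: parts[2].to_string(),
    })
}
```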

Configuration

Environment variables consumed by the extension:

Var	Purpose	Default
`OP_SERVICE_ACCOUNT_TOKEN`	1Password service-account token; required	—
`OP_ALLOW_REVEAL`	`true`/`1`/`yes` to allow value reveal	off
`OP_AUDIT_LOG_PATH`	JSONL audit log path	`./data/secrets-audit.jsonl`
`OP_INJECT_COMMAND_ALLOWLIST`	comma-separated allowed exec commands	empty (exec disabled)
`OP_INJECT_TIMEOUT_SECS`	per-call timeout (capped at `MAX_TIMEOUT_SECS`)	30
`OP_TIMEOUT_SECS`	per-call timeout for non-inject commands	15
`AGENT_ID`	injected by the host on spawn — appears in audit	—
`AGENT_SESSION_ID`	injected by the host on spawn	—
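The table's flag and list formats can be sketched as below. These helpers are hypothetical, not the extension's code: the names are invented, and treating the reveal flag as case-insensitive is an assumption (the table only lists `true`, `1`, and `yes`).

```rust
/// Hypothetical parser for `OP_ALLOW_REVEAL`: only `true`, `1`, or `yes`
/// enable reveal; anything else (including an unset variable) leaves it off.
pub fn reveal_allowed(raw: Option<&str>) -> bool {
    matches!(
        raw.map(|v| v.trim().to_ascii_lowercase()).as_deref(),
        Some("true" | "1" | "yes")
    )
}

/// Hypothetical parser for `OP_INJECT_COMMAND_ALLOWLIST`:
/// comma-separated, whitespace-trimmed, empty entries dropped.
pub fn parse_allowlist(raw: &str) -> Vec<String> {
    raw.split(',')
        .map(str::trim)
        .filter(|entry| !entry.is_empty())
        .map(String::from)
        .collect()
}

/// An empty allowlist rejects every command, which is what
/// "exec disabled" means in the table above.
pub fn command_allowed(allowlist: &[String], command: &str) -> bool {
    allowlist.iter().any(|allowed| allowed == command)
}
```

In a real process the inputs would come from `std::env::var`, for example `reveal_allowed(std::env::var("OP_ALLOW_REVEAL").ok().as_deref())`.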

Audit log

read_secret and inject_template append one JSON line per call to OP_AUDIT_LOG_PATH. The log is append-only and contains only metadata — never the secret value.

{"ts":"2026-04-25T18:00:00Z","action":"read_secret","agent_id":"kate","session_id":"f1...","op_reference":"op://Prod/Stripe/token","fingerprint_sha256_prefix":"a1b2c3d4e5f6789a","reveal_allowed":false,"ok":true}
{"ts":"2026-04-25T18:00:05Z","action":"inject_template","agent_id":"kate","session_id":"f1...","references":["op://Prod/Stripe/token"],"command":"curl","args_count":4,"dry_run":false,"ok":true,"exit_code":0,"stdout_total_bytes":124,"stdout_returned_bytes":124,"stdout_truncated":false}
{"ts":"2026-04-25T18:00:10Z","action":"inject_template","agent_id":"kate","session_id":null,"references":["op://Bad/Ref"],"command":"rm","args_count":0,"dry_run":false,"ok":false,"error":"command_not_in_allowlist"}

Failures writing the log are reported to stderr and never block the tool: by that point the secret has already been read or piped, so refusing the call over a logging failure would give the worst of both worlds.

Rotate with logrotate or any append-aware rotator. Keeping the log on a partition with limited write access (separate user, AppArmor, or dedicated tmpfs) reduces forensic tampering surface.
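The append-only, never-blocking behaviour described in this section can be sketched with the standard library alone. The function name `append_audit_line` is hypothetical, and the caller is assumed to hand in an already serialized JSON line; this is an illustration of the pattern, not the extension's code.

```rust
use std::fs::OpenOptions;
use std::io::Write;
use std::path::Path;

/// Append one pre-serialized JSON line to the audit log.
/// Returns whether the write succeeded. On failure the error goes to
/// stderr and the caller carries on, so logging never blocks the tool.
pub fn append_audit_line(path: &Path, json_line: &str) -> bool {
    let result = OpenOptions::new()
        .create(true)
        .append(true) // never truncate: existing lines stay intact
        .open(path)
        .and_then(|mut file| writeln!(file, "{json_line}"));
    match result {
        Ok(()) => true,
        Err(err) => {
            eprintln!("audit log write failed ({}): {err}", path.display());
            false
        }
    }
}
```

Opening in append mode on every call keeps the sketch compatible with external rotation, since a rotated file is simply recreated on the next write.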

Threat model

The agent process is trusted. Reveal is gated by an env var the operator controls; once on, the value is just a string in memory that flows through the LLM, transcripts, and any tool that touches it.
Exec mode is the recommended path for any operation that does not require the agent to see the secret. The LLM only knows that the operation succeeded, not what the credential looked like.
Redaction is best-effort. Stdout from a poorly-behaved command could still leak a secret in a shape we don't recognize. Cap the max_stdout_bytes aggressively when in doubt.
The audit log is not encrypted. It contains references and fingerprints, not values. If even the references are sensitive, put the log on a permissioned filesystem.

Building microapps in Rust

A microapp is an external program that talks to the Nexo daemon over a stable wire contract. It can be a single JSON-RPC stdio extension (Phase 11), a NATS subscriber, an HTTP service consuming the webhook envelope, or any combination.

This page lists the helper crates published from the framework that take care of the wire-shape boilerplate so you can focus on the microapp's actual logic.

Tier A — publishable utility crates

`nexo-tool-meta`

Wire-shape types shared between the daemon and any consumer. Slim, four-dependency, sub-second compile.

[dependencies]
nexo-tool-meta = "0.1"

What's inside:

BindingContext — (channel, account_id, agent_id, session_id, binding_id, mcp_channel_source) tuple stamped on every tool call. Read it from params._meta.nexo.binding. Stable across turns within a binding.
InboundMessageMeta — per-turn metadata about the message that triggered the agent turn (kind, sender_id, msg_id, inbound_ts, reply_to_msg_id, has_media, origin_session_id). Read it from params._meta.nexo.inbound. Provider-agnostic shape; same for whatsapp / future channels / webhook / event-subscriber / delegation / heartbeat.
InboundKind — 3-way discriminator (external_user / internal_system / inter_session) surfacing the origin of the turn so microapps can branch handlers without re-deriving from sender presence alone.
build_meta_value / parse_binding_from_meta / parse_inbound_from_meta — the inverse trio around the dual-write _meta payload. The daemon emits, the microapp parses.
WebhookEnvelope — typed JSON envelope the daemon publishes to NATS after every accepted webhook request.
format_webhook_source — Phase 72 turn-log marker helper.

Round-trip example:

#![allow(unused)]
fn main() {
use nexo_tool_meta::{
    parse_binding_from_meta, parse_inbound_from_meta,
    BindingContext, InboundKind,
};

// Inside a JSON-RPC `tools/call` handler.
fn handle_call(args: &serde_json::Value) {
    let meta = &args["_meta"];
    if let Some(binding) = parse_binding_from_meta(meta) {
        // Route the work to the right tenant.
        match binding.channel.as_deref() {
            Some("whatsapp") => { /* WA-specific */ }
            _ => { /* future channels */ }
        }
    } else {
        // Bindingless path: delegation receive, heartbeat
        // bootstrap, tests. Microapps that don't care still
        // see the legacy flat block at `meta["agent_id"]` etc.
    }

    // Per-turn metadata: who sent what, when, replying to which
    // earlier message, with media or not.
    if let Some(inbound) = parse_inbound_from_meta(meta) {
        match inbound.kind {
            InboundKind::ExternalUser => {
                // Real end-user — apply per-sender rate limits,
                // anti-loop heuristics, etc.
                let _sender = inbound.sender_id.as_deref();
                let _msg_id = inbound.msg_id.as_deref();
            }
            InboundKind::InternalSystem => {
                // Cron tick / scheduler / yaml-declared internal
                // event — skip user-facing checks.
            }
            InboundKind::InterSession => {
                // Peer-agent delegation — `origin_session_id`
                // carries the calling peer's request token.
                let _origin = inbound.origin_session_id;
            }
            _ => { /* future kinds */ }
        }
    }
}
}

Wire layout

Both buckets live as siblings under _meta.nexo.*:

{
  "_meta": {
    "agent_id": "ana",
    "session_id": "00000000-0000-0000-0000-000000000000",
    "nexo": {
      "binding": {
        "agent_id": "ana",
        "channel": "whatsapp",
        "account_id": "personal",
        "binding_id": "whatsapp:personal"
      },
      "inbound": {
        "kind": "external_user",
        "sender_id": "+5491100",
        "msg_id": "wa.ABCD1234",
        "inbound_ts": "2026-05-01T12:34:56Z",
        "reply_to_msg_id": "wa.PREV0001",
        "has_media": false
      }
    }
  }
}

Either bucket can be absent: binding is omitted on bindingless paths (delegation receive, heartbeat, tests), inbound is omitted when the producer didn't populate it (legacy paths predating Phase 82.5). A microapp must tolerate either being missing.

Producers

Path	`kind`	`sender_id`	`msg_id`	Source
whatsapp inbound	`external_user`	E.164 phone	`wa.<id>`	core runtime intake
event-subscriber	yaml-declared	JSONPath extract	event id	core runtime synthesizer
webhook receiver	yaml-declared (via subscriber)	header/body extract	request id	webhook receiver → subscriber
delegation receive	`inter_session`	None	None	core runtime route_sub
proactive tick	`internal_system`	None	None	core runtime heartbeat_sub
email-followup tick	`internal_system`	None	None	llm_behavior

`nexo-webhook-receiver`

Provider-agnostic per-source webhook verification primitives. HMAC-SHA256 / HMAC-SHA1 / raw-token signature verify + event kind extraction (header or JSON path) + NATS publish topic rendering. No HTTP listener — pure-fn surface.

[dependencies]
nexo-webhook-receiver = "0.1"

`nexo-webhook-server`

Axum-based HTTP listener that mounts the receiver behind a 5-gate defense pipeline (method / body cap / per-source concurrency / (source, client_ip) rate limit / signature). Suitable as a standalone webhook ingestion service in any Rust daemon.

[dependencies]
nexo-webhook-server = "0.1"

`nexo-resilience`

Circuit breaker + retry + rate-limit primitives. Nothing nexo-specific — drop-in for any Rust service that needs them.
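For readers new to the pattern, a circuit breaker can be sketched in a few lines. This is a generic illustration and not the `nexo-resilience` API: the type, its fields, and its method names are all assumptions, and time is passed in explicitly so the behaviour is easy to test.

```rust
use std::time::{Duration, Instant};

/// Generic circuit-breaker sketch (hypothetical, not the crate's types).
pub struct CircuitBreaker {
    failure_threshold: u32,
    cooldown: Duration,
    consecutive_failures: u32,
    opened_at: Option<Instant>,
}

impl CircuitBreaker {
    pub fn new(failure_threshold: u32, cooldown: Duration) -> Self {
        Self {
            failure_threshold,
            cooldown,
            consecutive_failures: 0,
            opened_at: None,
        }
    }

    /// True when a call may proceed at `now`: either the breaker is closed,
    /// or it opened at least `cooldown` ago and a probe call is allowed.
    pub fn allow(&self, now: Instant) -> bool {
        match self.opened_at {
            Some(opened) => now.saturating_duration_since(opened) >= self.cooldown,
            None => true,
        }
    }

    /// A success closes the breaker and resets the failure streak.
    pub fn record_success(&mut self) {
        self.consecutive_failures = 0;
        self.opened_at = None;
    }

    /// A failure extends the streak; reaching the threshold opens the breaker.
    pub fn record_failure(&mut self, now: Instant) {
        self.consecutive_failures += 1;
        if self.consecutive_failures >= self.failure_threshold {
            self.opened_at = Some(now);
        }
    }
}
```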

[dependencies]
nexo-resilience = "0.1"

`nexo-driver-permission`

Bash safety classifier — destructive-command warning, sed-in-place detection, read-only validation, sandbox heuristic. Useful for any tool that lets an LLM (or any other untrusted source) emit shell commands.

[dependencies]
nexo-driver-permission = "0.1"

Tier B — runtime helpers (Phase 83.4)

nexo-microapp-sdk (planned) will package the JSON-RPC stdio loop, the BindingContext parser, and the webhook envelope consumer behind ergonomic helpers, replacing the ~200 LOC of boilerplate every microapp would otherwise rewrite. Watch Phase 83 in proyecto/PHASES-microapps.md.

Forward-compatibility

Every Tier A type that crosses the wire is either #[non_exhaustive] (microapps cannot rely on field exhaustivity when reading) or has a documented field-add policy. Field additions are deliberate semver-minor: a microapp built against 0.1.0 keeps working when the daemon emits a 0.2.x-shaped payload because:

Read-side: serde's permissive default ignores unknown keys.
Write-side: the daemon never removes fields without bumping major.

Reference microapp

agent-creator-microapp (out-of-tree at https://github.com/lordmacu/agent-creator-microapp) is a working microapp that demonstrates:

JSON-RPC stdio loop (initialize / tools/list / tools/call / shutdown / hooks).
Wire-contract integration test that spawns the binary as a subprocess and asserts the daemon-side payload shape.
parse_binding_from_meta consumption from nexo-tool-meta.

Use it as the starting template until the dedicated crates/template-rust/ microapp scaffold lands in Phase 83.7.

Testing microapps

Microapps you build on nexo-microapp-sdk get a full in-process test harness so tool / hook handlers run without a daemon. Two pieces:

MicroappTestHarness drives a Microapp builder through the JSON-RPC dispatch loop end-to-end, returning the parsed result frame. Tools and hooks see the same ToolCtx / HookCtx they would in production.
MockAdminRpc is a programmable stand-in for the daemon side of nexo/admin/*. Register canned responses per method, hand the mock to the harness, and your tools that call ctx.admin().call(...) see the canned values. The mock also records every request so tests assert on shape.

Both ship behind the SDK's test-harness cargo feature; the MockAdminRpc additionally requires the admin feature.

# In your microapp's Cargo.toml
[dev-dependencies]
nexo-microapp-sdk = { version = "0.1", features = ["admin", "test-harness"] }

The reference test in extensions/template-microapp-rust/src/main.rs exercises every piece below; copy it as a starting template.

Smoke test (no admin, no binding)

#![allow(unused)]
fn main() {
use nexo_microapp_sdk::{Microapp, MicroappTestHarness, ToolCtx, ToolError, ToolReply};
use serde_json::{json, Value};

async fn ping(_args: Value, _ctx: ToolCtx) -> Result<ToolReply, ToolError> {
    Ok(ToolReply::ok_json(json!({ "pong": true })))
}

#[tokio::test]
async fn ping_returns_pong() {
    let app = Microapp::new("my-microapp", "0.1.0").with_tool("ping", ping);
    let h = MicroappTestHarness::new(app);
    let out = h.call_tool("ping", json!({})).await.unwrap();
    assert_eq!(out["pong"], true);
}
}

The harness consumes the Microapp once per call. Tests that need multiple calls build a fresh app each time, or factor the builder into a build_app() helper (see the template).

Tool with `BindingContext`

ctx.binding() returns the (agent_id, channel, account_id, …) the daemon resolved for this turn. In production it's threaded through _meta.nexo.binding; tests inject a MockBindingContext through the same path.

use nexo_microapp_sdk::{MicroappTestHarness, MockBindingContext};
use serde_json::json;

// `build_app()` is the shared builder helper mentioned above.
#[tokio::test]
async fn tool_reads_agent_id_from_binding() {
    let binding = MockBindingContext::new()
        .with_agent("ana")
        .with_channel("whatsapp")
        .with_account("acme")
        .build();
    let h = MicroappTestHarness::new(build_app());
    let out = h
        .call_tool_with_binding("greet", json!({ "name": "world" }), binding)
        .await
        .unwrap();
    assert_eq!(out["agent_id"], "ana");
}

MockBindingContext::new().build() panics if agent_id is unset — the daemon never delivers a tool call without one, so the panic surfaces test wiring mistakes immediately.

Tool that calls `nexo/admin/*`

When a tool calls ctx.admin().call(...) the production path talks JSON-RPC over stdio to the daemon. The harness installs the MockAdminRpc's AdminClient instead:

use nexo_microapp_sdk::admin::MockAdminRpc;
use nexo_microapp_sdk::{AdminError, MicroappTestHarness, MockBindingContext};
use serde_json::json;

#[tokio::test]
async fn whoami_calls_admin_and_surfaces_detail() {
    let mock = MockAdminRpc::new();

    // Register a canned `Ok(value)` response.
    mock.on(
        "nexo/admin/agents/get",
        json!({ "id": "ana", "active": true, "model": { "provider": "minimax" } }),
    );

    let binding = MockBindingContext::new().with_agent("ana").build();
    let h = MicroappTestHarness::new(build_app())
        .with_admin_mock(&mock)
        .await;

    let out = h
        .call_tool_with_binding("whoami", json!({}), binding)
        .await
        .unwrap();
    assert_eq!(out["queried_agent"], "ana");

    // The mock recorded the request; assert on its shape.
    let calls = mock.requests_for("nexo/admin/agents/get");
    assert_eq!(calls.len(), 1);
    assert_eq!(calls[0].params["agent_id"], "ana");
}

Three flavours of `on*`

Method	Signature	When
`on(method, value)`	`&self, &str, Value`	Static `Ok(value)`
`on_err(method, err)`	`&self, &str, AdminError`	Static `Err(err)`
`on_with(method, F)`	`&self, &str, F: Fn(Value) -> Result<Value, AdminError>`	Closure responder — receives the request params, returns the result. Use this when the response depends on input or the test wants to count invocations

A method without a registered responder returns AdminError::MethodNotFound. The mock is fail-loud on purpose — tests that forget to wire a response see a clear error rather than hanging on a default response.
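The fail-loud, request-recording design can be sketched with the standard library. This is not `MockAdminRpc` itself: the type name `MockRpc` is hypothetical, and params and results are plain strings here to stay dependency-free, where the SDK uses JSON values and `AdminError`.

```rust
use std::collections::HashMap;
use std::sync::Mutex;

type Responder = Box<dyn Fn(&str) -> Result<String, String> + Send + Sync>;

/// Hypothetical stand-in for a programmable RPC mock.
#[derive(Default)]
pub struct MockRpc {
    responders: Mutex<HashMap<String, Responder>>,
    requests: Mutex<Vec<(String, String)>>,
}

impl MockRpc {
    /// Closure responder: receives the params, returns the result.
    pub fn on_with<F>(&self, method: &str, responder: F)
    where
        F: Fn(&str) -> Result<String, String> + Send + Sync + 'static,
    {
        self.responders
            .lock()
            .unwrap()
            .insert(method.to_string(), Box::new(responder));
    }

    /// Static `Ok(value)` responder, built on top of `on_with`.
    pub fn on(&self, method: &str, value: &str) {
        let value = value.to_string();
        self.on_with(method, move |_| Ok(value.clone()));
    }

    /// Record the request, then answer it. Unregistered methods fail loudly.
    pub fn call(&self, method: &str, params: &str) -> Result<String, String> {
        self.requests
            .lock()
            .unwrap()
            .push((method.to_string(), params.to_string()));
        match self.responders.lock().unwrap().get(method) {
            Some(respond) => respond(params),
            None => Err(format!("method not found: {method}")),
        }
    }

    /// Params of every recorded request for `method`, in call order.
    pub fn requests_for(&self, method: &str) -> Vec<String> {
        self.requests
            .lock()
            .unwrap()
            .iter()
            .filter(|(m, _)| m.as_str() == method)
            .map(|(_, params)| params.clone())
            .collect()
    }
}
```

Recording happens before dispatch, so even a call to an unregistered method leaves a trace the test can inspect.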

Asserting on errors

The error round-trip is variant-preserving. A daemon that returns CapabilityNotGranted on the wire shows up as the same typed variant on the microapp side, and the mock matches that shape:

#![allow(unused)]
fn main() {
mock.on_err(
    "nexo/admin/agents/upsert",
    AdminError::CapabilityNotGranted {
        capability: "agents_crud".into(),
        method: "nexo/admin/agents/upsert".into(),
    },
);
}

The tool's ctx.admin().call(...) returns Err(AdminError::CapabilityNotGranted { .. }) verbatim — so the tool's error-mapping logic gets exercised exactly as it would against the live daemon.

Counting invocations from a closure

on_with captures any state the closure needs:

#![allow(unused)]
fn main() {
use std::sync::Arc;
use std::sync::atomic::{AtomicUsize, Ordering};

let count = Arc::new(AtomicUsize::new(0));
let count_clone = Arc::clone(&count);
mock.on_with("nexo/admin/ping", move |_| {
    count_clone.fetch_add(1, Ordering::SeqCst);
    Ok(json!({}))
});
// ... drive the harness ...
assert_eq!(count.load(Ordering::SeqCst), 3);
}

Hooks

fire_hook(hook_name, args) returns the parsed HookOutcome. Same harness, different surface:

#![allow(unused)]
fn main() {
let h = MicroappTestHarness::new(build_app());
let outcome = h
    .fire_hook("before_message", json!({ "body": "hi" }))
    .await
    .unwrap();
assert!(matches!(outcome, HookOutcome::Continue));
}

For Abort cases, match on the variant and inspect reason.

What the harness does NOT do

Boot a real daemon. No NATS, no agents.yaml, no live agent loop. Use the harness for tool / hook unit tests; reach for an end-to-end test (a real daemon process spawned from the test) when you need the full pipeline.
Subscribe to the firehose. nexo/notify/agent_event delivery is daemon-side; the harness exits after one request/response. Future helper lands in 83.15.b.b.
Persist anything. Every harness call gets a fresh Handlers registry; admin mock state is the MockAdminRpc you explicitly hand it. Tests are isolated by construction.

Reference

The template microapp ships every pattern above as runnable tests:

cargo test -p template-microapp-rust

See extensions/template-microapp-rust/src/main.rs#tests for the source. Copy whichever tests apply when you start a new microapp.

Compliance primitives — when to use which

nexo-compliance-primitives (Phase 83.5) ships six reusable primitives every conversational microapp needs. This page maps each primitive to the decision it makes and the symptom that tells you to wire it in.

Primitive	Decision	Symptom that demands it
`AntiLoopDetector`	Block when same body N+ times in window OR auto-reply signature seen	Bot is talking to a bot; "Recibido / Mensaje automático" replies bouncing
`AntiManipulationMatcher`	Block on prompt-injection / role-hijack	"Ignore previous instructions", "Act as", system-prompt extraction
`OptOutMatcher`	Block + `do_not_reply_again` on opt-out keyword	User says "STOP", "no me escribas más", "unsubscribe"
`PiiRedactor`	Transform: strip PII before LLM sees text	Compliance / data-handling rules forbid passing card / phone / email through LLM
`RateLimitPerUser`	Block on bucket exhausted	One user spamming; protect downstream from runaway senders
`ConsentTracker`	Gate outbound: only send when `OptedIn`	GDPR / CAN-SPAM / WhatsApp Business cold-outbound rules

All six plug into the Phase 83.3 hook interceptor. The microapp casts the verdict into a HookOutcome::{Block, Transform, Continue} and the daemon acts.

AntiLoopDetector — anti-loop

When to wire it:

Channel is bidirectional and the user-agent could be another bot (WhatsApp Business with auto-replies, email out-of-office responders, etc.).
You see your own bot's messages echoed back in the inbound feed.

Wire shape:

use std::sync::{LazyLock, Mutex};
use std::time::Duration;

// Shared across hook invocations: trips after 3 identical bodies in 5 minutes.
static DETECTOR: LazyLock<Mutex<AntiLoopDetector>> =
    LazyLock::new(|| Mutex::new(AntiLoopDetector::new(3, Duration::from_secs(300))));

async fn before_message(args: Value, _ctx: HookCtx) -> Result<HookOutcome, ToolError> {
    let body = args.get("body").and_then(|v| v.as_str()).unwrap_or("");
    match DETECTOR.lock().unwrap().record_and_evaluate(body) {
        LoopVerdict::Repetition { count } => Ok(HookOutcome::Block {
            reason: format!("loop: same message {count}× in 5 min"),
            do_not_reply_again: true,
        }),
        LoopVerdict::AutoReplySignature { phrase } => Ok(HookOutcome::Block {
            reason: format!("auto-reply signature: `{phrase}`"),
            do_not_reply_again: true,
        }),
        LoopVerdict::Clear => Ok(HookOutcome::Continue),
    }
}

Tunables: threshold (count to trip) + window (rolling duration). Custom signature lists via with_signatures(...).
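How threshold and window interact can be seen in a small rolling-window counter. This is a sketch under assumptions, not `AntiLoopDetector`: the type `RepetitionWindow` is hypothetical, signature detection is left out, and the clock is passed in so the behaviour is deterministic.

```rust
use std::collections::VecDeque;
use std::time::{Duration, Instant};

/// Hypothetical rolling-window repetition counter.
pub struct RepetitionWindow {
    threshold: usize,
    window: Duration,
    seen: VecDeque<(Instant, String)>,
}

impl RepetitionWindow {
    pub fn new(threshold: usize, window: Duration) -> Self {
        Self { threshold, window, seen: VecDeque::new() }
    }

    /// Record `body` at `now`. Returns `Some(count)` when the number of
    /// identical bodies inside the window has reached the threshold.
    pub fn record(&mut self, body: &str, now: Instant) -> Option<usize> {
        let window = self.window;
        // Evict entries that have aged out of the rolling window.
        while self
            .seen
            .front()
            .is_some_and(|(ts, _)| now.saturating_duration_since(*ts) > window)
        {
            self.seen.pop_front();
        }
        self.seen.push_back((now, body.to_string()));
        let count = self.seen.iter().filter(|(_, b)| b.as_str() == body).count();
        (count >= self.threshold).then_some(count)
    }
}
```

With a threshold of 3 and a 5 minute window, the third identical message inside 5 minutes trips the check, while the same message sent again an hour later starts from a count of 1.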

AntiManipulationMatcher — prompt-injection

When to wire it:

User-controlled inbound text reaches the LLM verbatim.
Compliance rules prohibit roles or instruction overrides.

Wire shape:

use std::sync::LazyLock;

// `evaluate` only reads the phrase list, so one shared matcher is enough.
static MATCHER: LazyLock<AntiManipulationMatcher> =
    LazyLock::new(AntiManipulationMatcher::default);

async fn before_message(args: Value, _ctx: HookCtx) -> Result<HookOutcome, ToolError> {
    let body = args.get("body").and_then(|v| v.as_str()).unwrap_or("");
    match MATCHER.evaluate(body) {
        ManipulationVerdict::Matched { phrase } => Ok(HookOutcome::Block {
            reason: format!("manipulation phrase: `{phrase}`"),
            do_not_reply_again: false,
        }),
        ManipulationVerdict::Clear => Ok(HookOutcome::Continue),
    }
}

Tunables: add domain-specific phrases via with_extra_phrases(...). Replace the entire list via with_phrases(...) — useful if you only care about a small subset.

OptOutMatcher — unsubscribe

When to wire it:

Channel is regulated (CAN-SPAM, GDPR, WhatsApp Business).
User can text a keyword to stop replies.

Wire shape:

use std::sync::{LazyLock, Mutex};

static OPT_OUT: LazyLock<OptOutMatcher> = LazyLock::new(OptOutMatcher::default);
static CONSENT: LazyLock<Mutex<ConsentTracker>> =
    LazyLock::new(|| Mutex::new(ConsentTracker::new()));

async fn before_message(args: Value, ctx: HookCtx) -> Result<HookOutcome, ToolError> {
    let body = args.get("body").and_then(|v| v.as_str()).unwrap_or("");
    match OPT_OUT.evaluate(body) {
        OptOutVerdict::OptOut { keyword } => {
            // Persist the opt-out via ConsentTracker so future
            // outbound is also gated.
            let account = ctx
                .binding()
                .and_then(|b| b.account_id.as_deref())
                .unwrap_or("default");
            CONSENT.lock().unwrap().opt_out(account, "stop_keyword");
            Ok(HookOutcome::Block {
                reason: format!("opt-out keyword: `{keyword}`"),
                do_not_reply_again: true,
            })
        }
        OptOutVerdict::Clear => Ok(HookOutcome::Continue),
    }
}

Whole-word matching avoids substring false positives — "baja" in "darse de baja" matches; "baja" inside "bajaron" does not.
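A minimal sketch of whole-word matching, assuming single-word keywords and case-insensitive comparison (the function name is hypothetical, and multi-word phrases such as "no me escribas más" would need phrase-level matching that this sketch does not attempt):

```rust
/// Hypothetical helper: true when `keyword` appears in `text` as a whole
/// word, compared case-insensitively. Words are maximal runs of
/// alphanumeric characters, so punctuation and spaces act as boundaries.
pub fn contains_whole_word(text: &str, keyword: &str) -> bool {
    let keyword = keyword.to_lowercase();
    text.to_lowercase()
        .split(|c: char| !c.is_alphanumeric())
        .any(|word| word == keyword)
}
```

Splitting on non-alphanumeric characters is what keeps "bajaron" from matching "baja": the whole token is compared, never a substring.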

PiiRedactor — strip PII before LLM

When to wire it:

Compliance rules say card / phone / email cannot leave the trust boundary.
LLM provider's terms of service constrain what you can send.

Wire shape:

use std::sync::LazyLock;

// Built once; Luhn validation cuts false positives on 16-digit runs.
static REDACTOR: LazyLock<PiiRedactor> =
    LazyLock::new(|| PiiRedactor::new().with_luhn(true));

async fn before_message(args: Value, _ctx: HookCtx) -> Result<HookOutcome, ToolError> {
    let body = args.get("body").and_then(|v| v.as_str()).unwrap_or("");
    let (clean, stats) = REDACTOR.redact(body);
    if stats.total() == 0 {
        return Ok(HookOutcome::Continue);
    }
    Ok(HookOutcome::Transform {
        transformed_body: clean,
        reason: Some(format!(
            "redacted: {} cards, {} phones, {} emails",
            stats.cards_redacted, stats.phones_redacted, stats.emails_redacted
        )),
        do_not_reply_again: false,
    })
}

Tunables: with_luhn(true) filters Luhn-invalid 16-digit runs out of the card path (cuts false positives). skip_phones() / skip_cards() / skip_emails() turn off individual categories.
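For reference, the Luhn checksum that such a filter relies on looks like this. It is the standard algorithm written as a standalone function; the name `luhn_valid` is illustrative and not taken from the crate.

```rust
/// Standard Luhn checksum over a string of ASCII digits.
/// Returns false for empty input or any non-digit character.
pub fn luhn_valid(digits: &str) -> bool {
    let mut sum = 0u32;
    let mut count = 0usize;
    // Walk right to left; double every second digit.
    for (i, ch) in digits.chars().rev().enumerate() {
        let Some(d) = ch.to_digit(10) else { return false };
        let value = if i % 2 == 1 {
            let doubled = d * 2;
            // Doubled values above 9 contribute the sum of their digits.
            if doubled > 9 { doubled - 9 } else { doubled }
        } else {
            d
        };
        sum += value;
        count += 1;
    }
    count > 0 && sum % 10 == 0
}
```

A random 16-digit run passes this check only about one time in ten, which is why enabling it removes most false positives such as order numbers or tracking codes.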

RateLimitPerUser — token bucket

When to wire it:

One user can spam.
Protect downstream services (LLM provider, your DB, your outbound channel API).

Wire shape:

use std::sync::{LazyLock, Mutex};
use std::time::Duration;

// 20 messages per user per minute, shared across hook invocations.
static LIMITER: LazyLock<Mutex<RateLimitPerUser>> =
    LazyLock::new(|| Mutex::new(RateLimitPerUser::flat(20, Duration::from_secs(60))));

async fn before_message(_args: Value, ctx: HookCtx) -> Result<HookOutcome, ToolError> {
    let user_key = ctx
        .inbound()
        .map(|m| m.from.clone())
        .unwrap_or_else(|| "anon".into());
    match LIMITER.lock().unwrap().try_acquire(&user_key) {
        RateLimitVerdict::Allowed { .. } => Ok(HookOutcome::Continue),
        RateLimitVerdict::Denied { retry_after } => Ok(HookOutcome::Block {
            reason: format!("rate-limited; retry in {retry_after:?}"),
            do_not_reply_again: false,
        }),
    }
}

Tunables: RateLimitPerUser::new(rate, window, max) lets you specify a different burst max from the long-run rate. The constructor clamps max >= rate so the long-run rate is honoured.
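The relationship between rate, window, and burst max is easiest to see in a single token bucket. The sketch below is illustrative only and not the crate's implementation: `TokenBucket` is a hypothetical type, it assumes `rate > 0` and a non-zero window, and a per-user limiter would keep one bucket per key in a map.

```rust
use std::time::{Duration, Instant};

/// Hypothetical single-bucket sketch. Assumes `rate > 0` and `window > 0`.
pub struct TokenBucket {
    capacity: f64,
    tokens: f64,
    refill_per_sec: f64,
    last: Instant,
}

impl TokenBucket {
    /// `rate` tokens per `window`, with burst capacity `max` clamped to be
    /// at least `rate`.
    pub fn new(rate: u32, window: Duration, max: u32, now: Instant) -> Self {
        let capacity = f64::from(max.max(rate));
        Self {
            capacity,
            tokens: capacity, // start full so an initial burst is allowed
            refill_per_sec: f64::from(rate) / window.as_secs_f64(),
            last: now,
        }
    }

    /// Take one token, or report how long until one becomes available.
    pub fn try_acquire(&mut self, now: Instant) -> Result<(), Duration> {
        let elapsed = now.saturating_duration_since(self.last).as_secs_f64();
        self.tokens = (self.tokens + elapsed * self.refill_per_sec).min(self.capacity);
        self.last = self.last.max(now);
        if self.tokens >= 1.0 {
            self.tokens -= 1.0;
            Ok(())
        } else {
            let wait_secs = (1.0 - self.tokens) / self.refill_per_sec;
            Err(Duration::from_secs_f64(wait_secs))
        }
    }
}
```

Refill speed is set by `rate / window` alone; `max` only decides how many requests can be served back to back before the bucket runs dry.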

ConsentTracker — opt-in gate for outbound

When to wire it:

Cold outbound (you message users who didn't message first).
Regulatory: GDPR, CAN-SPAM, WhatsApp Business policy.

Wire shape:

use std::sync::{LazyLock, Mutex};

static TRACKER: LazyLock<Mutex<ConsentTracker>> =
    LazyLock::new(|| Mutex::new(ConsentTracker::new()));

// On every outbound dispatch:
fn can_send(user_key: &str) -> bool {
    TRACKER.lock().unwrap().allows_outbound(user_key)
}

// On opt-in form submission:
fn record_opt_in(user_key: &str) {
    TRACKER.lock().unwrap().opt_in(user_key, "web_form");
}

// On opt-out keyword:
fn record_opt_out(user_key: &str) {
    TRACKER.lock().unwrap().opt_out(user_key, "stop_keyword");
}

Default Unknown status means no outbound — CAN-SPAM "express consent" default-deny. The audit log (history_for_user(...)) gives you a per-user timestamped record of every consent change.
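The default-deny rule can be sketched as a small state map plus an append-only history. The types below are hypothetical and not the crate's: `ConsentSketch` and `Consent` are invented names, and timestamps are omitted to keep the example deterministic.

```rust
use std::collections::HashMap;

#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum Consent {
    Unknown,
    OptedIn,
    OptedOut,
}

/// Hypothetical tracker: current status per user plus an append-only
/// history of (user, new status, source) entries.
#[derive(Default)]
pub struct ConsentSketch {
    status: HashMap<String, Consent>,
    history: Vec<(String, Consent, String)>,
}

impl ConsentSketch {
    pub fn set(&mut self, user: &str, status: Consent, source: &str) {
        self.status.insert(user.to_string(), status);
        self.history
            .push((user.to_string(), status, source.to_string()));
    }

    /// Users never seen before are `Unknown`.
    pub fn status_of(&self, user: &str) -> Consent {
        self.status.get(user).copied().unwrap_or(Consent::Unknown)
    }

    /// Default-deny: only an explicit opt-in allows outbound messages.
    pub fn allows_outbound(&self, user: &str) -> bool {
        self.status_of(user) == Consent::OptedIn
    }

    /// Every recorded consent change for `user`, oldest first.
    pub fn history_for(&self, user: &str) -> Vec<&(String, Consent, String)> {
        self.history.iter().filter(|(u, _, _)| u.as_str() == user).collect()
    }
}
```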

Composition

In production you wire the five inbound primitives into one before-message hook (ConsentTracker gates outbound separately, as shown above):

#![allow(unused)]
fn main() {
async fn before_message(args: Value, ctx: HookCtx) -> Result<HookOutcome, ToolError> {
    let body = extract_body(&args);
    let user_key = extract_user_key(&ctx);

    // Order matters: cheapest checks first.
    if matches!(opt_out.evaluate(body), OptOutVerdict::OptOut { .. }) { return block_with_do_not_reply(); }
    if matches!(manipulation.evaluate(body), ManipulationVerdict::Matched { .. }) { return block(); }
    if let RateLimitVerdict::Denied { .. } = limiter.try_acquire(user_key) { return block_rate_limited(); }
    if let LoopVerdict::Repetition { .. } | LoopVerdict::AutoReplySignature { .. } = loop_detector.record_and_evaluate(body) {
        return block_loop();
    }

    // PII redaction LAST so the redacted body goes to the agent.
    let (clean, stats) = pii.redact(body);
    if stats.total() > 0 {
        return Ok(HookOutcome::Transform { transformed_body: clean, reason: None, do_not_reply_again: false });
    }

    Ok(HookOutcome::Continue)
}
}

Publishing the SDK crates

This page documents the publish sequence for the framework's microapp-author-facing crates. Operators run these in order when cutting a 0.x release.

Publishable crates (Tier A)

The framework publishes four crates microapp authors consume:

Crate	Phase	Depends on
`nexo-tool-meta`	82.2.b	(no nexo deps)
`nexo-plugin-manifest`	81	(no nexo deps)
`nexo-compliance-primitives`	83.5	(no nexo deps)
`nexo-microapp-sdk`	83.4	`nexo-tool-meta`

Other framework crates (nexo-core, nexo-config, nexo-broker, etc.) are NOT publishable — they are daemon-internal and microapp authors should not import them.

Publish order

Because nexo-microapp-sdk depends on nexo-tool-meta, the publish sequence is:

# 1. Standalone crates (any order)
cargo publish -p nexo-tool-meta
cargo publish -p nexo-plugin-manifest
cargo publish -p nexo-compliance-primitives

# 2. Dependent crate (after tool-meta lands on crates.io)
cargo publish -p nexo-microapp-sdk

Verify each step with a dry-run first:

cargo publish -p nexo-tool-meta --dry-run

A clean dry-run prints Uploading … warning: aborting upload due to dry run and exits 0. Anything else (path-dep complaint, missing license, etc.) aborts before upload.

Path-dep elision

The workspace's root Cargo.toml declares each crate with both version = "..." AND path = "...". Cargo strips the path segment when packaging for crates.io, so the published artifact contains version-only deps. Operators do not need to edit Cargo.toml between local dev and publish.

Versioning policy

Pre-1.0 (current state):

Breaking changes ALLOWED with a minor bump (0.1 → 0.2).
Per-crate CHANGELOG.md describes the migration.
One release of grace before removing a deprecated symbol — i.e. release N marks the symbol #[deprecated], release N+1 removes it.
Wire-format changes propagate together: bumping nexo-tool-meta triggers a coordinated bump on nexo-microapp-sdk.

Post-1.0 (future):

Strict semver. Breaking changes require a major bump.
Wire format changes require coordinated multi-language release (Rust SDK + any future Python/TS SDK).

Out-of-tree microapp migration

After publish, out-of-tree microapps swap their Cargo.toml:

 [dependencies]
-nexo-microapp-sdk        = { path = "../nexo-rs/crates/microapp-sdk" }
-nexo-tool-meta           = { path = "../nexo-rs/crates/tool-meta" }
-nexo-compliance-primitives = { path = "../nexo-rs/crates/compliance-primitives" }
+nexo-microapp-sdk        = "0.1"
+nexo-tool-meta           = "0.1"
+nexo-compliance-primitives = "0.1"

A microapp that depends only on published versions can build without any nexo-rs source on disk — strictly the published artifact.

CI integration (deferred)

Auto-publish on tag via release-plz is the operator-side deliverable per Phase 83.14:

.github/workflows/publish.yml runs on v*.*.* tags.
Reads CARGO_REGISTRY_TOKEN from secrets.
Calls cargo publish per crate in the order documented above.
Sequencing: publishes nexo-tool-meta first, waits for crates.io index propagation (typically <60 s), then publishes the rest.

The release-plz integration itself lands when the operator actually tags v0.1.0; until then the per-crate Cargo.toml + CHANGELOG.md is publish-ready and the dry-run checks pass.

Model Context Protocol (MCP)

nexo-rs is both an MCP client (consumes tools from external MCP servers) and an MCP server (exposes its own tools so editors like Claude Desktop, Cursor, Zed can use them). Same wire, different directions.

Source: crates/mcp/, bridges in crates/core/src/agent/mcp_*.

The two directions

flowchart LR
    subgraph IDE[MCP clients]
        CD[Claude Desktop]
        CUR[Cursor]
        ZED[Zed]
    end
    subgraph AGENT[agent process]
        AS[Agent-as-server<br/>stdio bridge]
        AC[Agent-as-client<br/>session runtime]
    end
    subgraph EXT[External MCP servers]
        GS[Gmail MCP]
        DB[DB MCP]
        WF[Workflow MCP]
    end

    IDE --> AS
    AS --> AR[Agent tools registry]
    AC --> EXT
    AR --> AC

Server side — an MCP client (e.g. Claude Desktop) runs agent mcp serve. The agent's internal tools appear as MCP tools in that client.
Client side — the agent spawns external MCP servers (stdio or HTTP) and registers their tools into its own ToolRegistry, so agents can call them exactly like built-ins or extensions.

Phase map

Phase	What it adds
12.1	MCP client over stdio
12.2	MCP client over HTTP (streamable + SSE fallback)
12.3	Tool catalog — merge MCP tools with extensions and built-ins
12.4	Session runtime — per-session child spawn, sentinel-shared default
12.5	Resources — `resources/list` + `resources/read` with optional LRU cache
12.6	Agent as MCP server (stdio)
12.7	MCP servers declared by extensions
12.8	`tools/list_changed` debounced hot-reload

All eight landed. See PHASES.md.

Why both sides

Being a client lets agents tap any MCP ecosystem without needing a custom extension per service — if the thing you want speaks MCP, you can reach it today.

Being a server lets the carefully-sandboxed tool surface of nexo-rs (allowed_tools, outbound_allowlist, etc.) be reused from any MCP-speaking client. Your LLM-driven IDE gets access to WhatsApp send, Gmail poll, browser CDP, and everything else — without you wiring each one into the IDE's config.

Wire shape (both directions)

JSON-RPC 2.0. For transports:

stdio — child process, line-delimited JSON on stdin/stdout
streamable HTTP — modern MCP 2024-11-05 shape
SSE — legacy; used as automatic fallback

sequenceDiagram
    participant H as Host (agent or IDE)
    participant S as MCP server

    H->>S: initialize (id=0)
    S-->>H: InitializeResult (capabilities, serverInfo)
    H->>S: notifications/initialized (fire-and-forget)
    loop steady state
        H->>S: tools/list
        S-->>H: tools[]
        H->>S: tools/call {name, args}
        S-->>H: content blocks
    end
    alt tool list changes
        S-->>H: notifications/tools/list_changed
        H->>S: tools/list (debounced refresh)
    end

Where to go next

Client (stdio + HTTP) — consuming external MCP servers from agents
Agent as MCP server — exposing the agent's tools over MCP

MCP client (stdio + HTTP)

How nexo-rs consumes tools from external MCP servers. Every MCP tool ends up in the same ToolRegistry that hosts built-ins and extensions — the LLM calls them identically.

Source: crates/mcp/src/client.rs, crates/mcp/src/http/client.rs, crates/mcp/src/manager.rs, crates/mcp/src/session.rs, crates/core/src/agent/mcp_catalog.rs.

Config

# config/mcp.yaml
mcp:
  enabled: true
  session_ttl: 30m
  idle_reap_interval: 60s
  connect_timeout_ms: 10000
  call_timeout_ms: 30000
  shutdown_grace_ms: 3000
  servers:
    gmail:
      transport:
        type: stdio
        command: ./mcp-gmail
        args: []
      env:
        GMAIL_TOKEN: ${file:./secrets/gmail_token.json}
    workflow:
      transport:
        type: http
        url: https://mcp.example.com/workflow
        mode: auto          # streamable_http | sse | auto
        headers:
          Authorization: Bearer ${WORKFLOW_TOKEN}
  resource_cache:
    enabled: true
    ttl: 30s
    max_entries: 256
  resource_uri_allowlist: []   # empty = permissive
  strict_root_paths: false
  context:
    passthrough: true
  sampling:
    enabled: false
  watch:
    enabled: false
    debounce_ms: 200

Transports

stdio

Child process per server. Line-delimited JSON-RPC 2.0 over stdin/stdout. stderr is routed to the agent's tracing output.

sequenceDiagram
    participant M as McpRuntimeManager
    participant S as Server (child process)

    M->>S: spawn Command(cmd, args, env)
    M->>S: {"method":"initialize","id":0, ...}
    S-->>M: capabilities + serverInfo
    M->>S: notifications/initialized (no-reply)
    Note over M,S: steady state — tools/list, tools/call, resources/*
    M->>S: notifications/cancelled (per in-flight id)<br/>then shutdown_grace

HTTP — streamable vs SSE

Three modes selectable per server:

`mode`	Behavior
`streamable_http`	MCP 2025-03-26 spec (modern)
`sse`	Legacy Server-Sent Events fallback
`auto` (default)	Try `streamable_http`; on 404/405/415, fall back to SSE
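The fallback rule in the table can be sketched as a small decision function. This is a minimal illustration, not the code in crates/mcp/src/http/client.rs; the enum and function names (HttpMode, Transport, pick_transport) are hypothetical.

```rust
/// Configured `mode` for an HTTP MCP server (illustrative names).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum HttpMode {
    StreamableHttp,
    Sse,
    Auto,
}

/// Transport actually used for the connection.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Transport {
    StreamableHttp,
    Sse,
}

/// Decide the transport given the HTTP status of the first
/// streamable-HTTP attempt. Only 404/405/415 trigger the SSE
/// fallback, and only when the mode is `auto`.
pub fn pick_transport(mode: HttpMode, first_status: u16) -> Transport {
    match mode {
        HttpMode::StreamableHttp => Transport::StreamableHttp,
        HttpMode::Sse => Transport::Sse,
        HttpMode::Auto => match first_status {
            404 | 405 | 415 => Transport::Sse,
            _ => Transport::StreamableHttp,
        },
    }
}
```

Other failures (for example a 500 or a 401) are not treated as "server does not speak streamable HTTP" in this sketch, so they would surface as errors rather than silently downgrading the transport.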

Each connection gets an mcp-session-id header. Additional headers (auth, routing) pass through a HeaderMap; values are env-resolved at config load.

Session runtime

A single McpRuntimeManager lives per process. Inside, a SessionMcpRuntime per conversation session keeps its own map of live MCP clients:

flowchart TB
    MGR[McpRuntimeManager<br/>one per process]
    MGR --> SENT[Sentinel session<br/>UUID = nil<br/>shared by all agents]
    MGR --> S1[session A runtime]
    MGR --> S2[session B runtime]
    SENT --> C1[mcp client: gmail]
    SENT --> C2[mcp client: workflow]
    S1 --> CX[session-scoped clients<br/>for stateful servers]

Sentinel session (UUID = nil) is the default shared namespace — all agents see the same clients, avoiding duplicate child processes for servers that don't need per-session isolation
Per-session runtimes are spawned when a server genuinely needs independent state (example: a workflow engine that tracks its own context per user)
Idle reap — every idle_reap_interval, the manager disposes sessions unused for longer than session_ttl, shutting their clients down gracefully
Config fingerprinting — changes to the servers set produce a new fingerprint; runtimes are rebuilt on request; concurrent requests de-dupe so only one rebuild happens

Tool catalog

McpToolCatalog::build() calls tools/list on every configured server in parallel and merges the results:

flowchart LR
    LIST[tools/list per server<br/>parallel] --> PREFIX[prefix names:<br/>server_toolname]
    PREFIX --> MERGE[merge into ToolRegistry]
    MERGE --> LLM[tools visible to LLM]
    LIST -.->|single-server error| ERR[non-fatal:<br/>server visible with error=...]

Names are always prefixed {server_name}_{tool_name} so collisions across servers can't happen
Duplicates within the same server → first wins, warn log
input_schema is passed through verbatim
Server capability resources unlocks two meta-tools for reading resources
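The prefixing and first-wins rules above can be sketched as follows. This is an assumption-laden illustration: merge_tools is a hypothetical helper, tools are reduced to bare names, and the real McpToolCatalog carries schemas and emits a warn log where this sketch only records the dropped name.

```rust
use std::collections::BTreeMap;

/// Merge per-server tool lists into one registry keyed by
/// `{server_name}_{tool_name}`. Returns the registry (prefixed name
/// mapped to its server) plus the prefixed names dropped as duplicates.
pub fn merge_tools(
    per_server: &[(&str, Vec<&str>)],
) -> (BTreeMap<String, String>, Vec<String>) {
    let mut registry = BTreeMap::new();
    let mut dropped = Vec::new();
    for (server, tools) in per_server {
        for tool in tools {
            let name = format!("{server}_{tool}");
            if registry.contains_key(&name) {
                // Duplicate within the same server: the first one wins.
                dropped.push(name);
            } else {
                registry.insert(name, server.to_string());
            }
        }
    }
    (registry, dropped)
}
```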

Tool call flow

sequenceDiagram
    participant A as Agent
    participant C as McpCatalog tool
    participant R as SessionMcpRuntime
    participant S as MCP server
    participant CB as CircuitBreaker

    A->>C: invoke gmail_list_messages(...)
    C->>R: call(server=gmail, tool=list_messages, args)
    R->>CB: allow?
    CB-->>R: yes
    R->>S: tools/call {name, args, _meta}
    S-->>R: content blocks
    R-->>C: content
    C-->>A: result

Every RPC goes through a per-server CircuitBreaker. If the breaker is open, the call fails fast instead of hanging on a dead server.

Context passthrough

When mcp.context.passthrough: true, tools/call injects:

{ "_meta": { "agent_id": "ana", "session_id": "..." }, ...args }

Server-side code can use this to scope state per agent without the schema leaking that concern.

Resources

Servers advertising resources capability unlock:

resources/list (paginated via cursor, max 64 pages)
resources/read (optionally cached via LRU)
resources/templates/list (URI templates)

Cache config:

resource_cache:
  enabled: true
  ttl: 30s
  max_entries: 256

Cache invalidates on notifications/resources/list_changed. Optional per-scheme allowlist (resource_uri_allowlist: ["file", "db"]) rejects unknown URI schemes before dispatch.

Hot reload (phase 12.8)

flowchart LR
    S[server notifies<br/>tools/list_changed] --> DBC[200 ms debounce]
    DBC --> REL[catalog rebuild]
    REL --> REG[ToolRegistry re-populated<br/>with new schema]

Same flow for resources. Agents in flight at the moment of the rebuild keep their references to the old tool definitions — next turn uses the refreshed registry.

Gotchas

One MCP child per server by default. Turn on per-session isolation only for servers that genuinely need it; spawning a child per session multiplies resource cost.
notifications/initialized is fire-and-forget. If the server insists on acknowledging it, you have a broken server.
SSE is a last resort. It's in auto for compatibility; new server deployments should speak streamable HTTP.
Circuit breakers are per-server. One bad server doesn't freeze the catalog, but a flapping one still slows the agent loop via backoff waits.

Agent as MCP server

Expose the agent's tools over MCP so Claude Desktop, Cursor, Zed, or any other MCP-speaking client can use them. Stdio transport; the agent runs as a child process of the consuming client.

Source: crates/mcp/src/server/, crates/core/src/agent/mcp_server_bridge.rs.

Config

# config/mcp_server.yaml
enabled: true
name: agent
allowlist: []            # empty = every native tool; populated = strict allowlist
expose_proxies: false    # set true to also expose ext_* and mcp_* proxy tools
auth_token_env: ""       # optional env var holding a shared bearer token

Field	Default	Purpose
`enabled`	`false`	Must be `true` for the server subcommand to start.
`name`	`"agent"`	Reported as `serverInfo.name` in handshake.
`allowlist`	`[]`	Empty = all native tools. Populated = only these names reach the MCP client. Globs (`memory_*`) supported.
`expose_proxies`	`false`	Whether `ext_` (extension) and `mcp_` (upstream MCP) proxy tools are surfaced.
`auth_token_env`	`""`	If set, the `initialize` request must present this token; unauthenticated clients get rejected.

Running it

agent mcp serve --config ./config

The process reads JSON-RPC from stdin and writes responses to stdout — exactly the shape Claude Desktop, Cursor, etc. expect.

Claude Desktop example

~/Library/Application Support/Claude/claude_desktop_config.json:

{
  "mcpServers": {
    "nexo": {
      "command": "/usr/local/bin/agent",
      "args": ["mcp", "serve", "--config", "/srv/nexo-rs/config"],
      "env": {
        "ANTHROPIC_API_KEY": "sk-ant-..."
      }
    }
  }
}

The Anthropic client spawns the agent, handshakes, and then every agent tool shows up in the conversation's tool list.

Wire flow

sequenceDiagram
    participant IDE as MCP client (Claude Desktop)
    participant A as agent mcp serve
    participant TR as ToolRegistry
    participant AG as Agent tools

    IDE->>A: initialize (auth_token if configured)
    A-->>IDE: capabilities + serverInfo (name, version)
    IDE->>A: notifications/initialized
    loop every turn
        IDE->>A: tools/list
        A->>TR: filtered by allowlist + expose_proxies
        A-->>IDE: tool defs
        IDE->>A: tools/call {name, args}
        A->>AG: invoke tool
        AG-->>A: result
        A-->>IDE: content blocks
    end

Tool exposure rules

flowchart TD
    ALL[every tool registered in ToolRegistry]
    ALL --> FILT1{allowlist<br/>empty?}
    FILT1 -->|yes| NATIVE[keep native tools only]
    FILT1 -->|no| GLOB[keep tools matching allowlist]
    NATIVE --> FILT2{expose_proxies?}
    GLOB --> FILT2
    FILT2 -->|yes| OUT[include ext_* and mcp_* too]
    FILT2 -->|no| SKIP[drop ext_* and mcp_*]
    OUT --> EMIT[tools/list response]
    SKIP --> EMIT

Native tools — memory_*, whatsapp_*, telegram_*, browser_*, forge_*, etc.
Proxy tools — ext_<id>_<tool> for extensions, <server>_<tool> for upstream MCP. Hidden by default to avoid proxying an external server through to another external client.

Capabilities advertised

tools — always
resources — advertised only if the agent exposes any via the server handler (phase 12.5 puts the groundwork in, consumer features follow)
prompts — reserved, not advertised yet
logging — conditional on handler implementation

Auth

When auth_token_env is set, the initialize request must present the token (via a server-specific header convention or as an _meta field). Clients that don't know the token get rejected before anything else happens. Useful when the agent is launched through a shared-host proxy rather than a local command: spawn.

Security model

Read-only by default? No — the server exposes whatever the allowlist permits. Model it explicitly:

allowlist:
  - memory_recall    # read memory
  - memory_store     # write memory  (remove for read-only)

Outbound channels (whatsapp_send_message, telegram_send_message) will send real messages from the agent's configured accounts. Include them in the allowlist only if the IDE user should be able to do that.
expose_proxies: true is transitive power. It gives the IDE the full tool set of every extension and upstream MCP server too.

Gotchas

Allowlist globs match tool names, not prefixes. memory_* matches memory_recall and memory_store but not memory_history (phase 10.9 tool). Write the pattern to match the real set.
No per-IDE-user identity. The server has one identity = the agent's configured credentials. If multiple humans share the IDE, they share the agent's blast radius.
Proxies forward the agent's rate limits. Calling whatsapp_send_message through the MCP server is the same as an agent calling it — counts against the same WhatsApp rate bucket.

MCP channels — inbound surfaces from Slack / Telegram / iMessage

An MCP channel is any MCP server that declares the experimental['nexo/channel'] capability and pushes user messages into the agent via notifications/nexo/channel. The runtime treats those messages as trusted inbound: it wraps them in <channel source="...">…</channel> XML and delivers them through the same intake lane as a paired WhatsApp / Telegram / email message.

Outbound is the mirror image: the agent invokes the server's send_message tool (or the operator-configured equivalent) via the channel_send LLM tool. Per-server permission relay lets a user approve risky tools from their phone via a structured yes <id> / no <id> reply.

This page covers the operator-facing surface. For the schema details see agents.channels in the YAML reference.

Why channels

Channels turn the agent from something you query at a terminal into something that lives in the platforms your team already uses. The same primitives that drive chat-side intake (pairing, dispatch policy, per-binding rate limits) apply to channel inbound: channels get no special treatment from the gates that decide whether a sender is trusted.

YAML shape

agents:
  - id: kate
    channels:
      enabled: true
      max_content_chars: 16000
      default_rate_limit:
        rps: 5.0
        burst: 20
      approved:
        - server: slack
          plugin_source: slack@anthropic
          outbound_tool_name: chat.postMessage
          rate_limit:
            rps: 10.0
            burst: 50
        - server: telegram
          # plugin_source omitted — accept any installed source
          # outbound_tool_name omitted — defaults to "send_message"
          # rate_limit omitted — inherits default_rate_limit
    inbound_bindings:
      - plugin: telegram
        instance: kate_tg
        allowed_channel_servers:
          - slack
          - telegram

The 5-step gate

Every channel registration runs through a 5-step filter:

Capability — server declared experimental['nexo/channel'].
Killswitch — agents.channels.enabled = true. Hot reloadable.
Per-binding session allowlist — server name is in the binding's allowed_channel_servers.
Plugin source verification — when the approved entry declares plugin_source, the runtime's stamp must match exactly. Catches a malicious plugin clone with a different source.
Approved allowlist — server appears in agents.channels.approved. Operators can separate "binding may route through this server" (gate 3) from "we vetted the server itself" (gate 5).

Each gate emits a typed Skip { kind, reason } on failure so debug output points at the exact YAML knob to fix.
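The gate order can be sketched as a single function that returns the first failing step. Everything here is hypothetical (Candidate, Approved, gate, and the SkipKind variant names); the real Skip also carries a reason string, which this sketch omits.

```rust
/// Which gate rejected the registration (variant names are illustrative).
#[derive(Debug, PartialEq, Eq)]
pub enum SkipKind {
    Capability,
    Killswitch,
    BindingAllowlist,
    PluginSource,
    NotApproved,
}

/// A server asking to register as a channel.
pub struct Candidate<'a> {
    pub server: &'a str,
    pub declares_channel_capability: bool,
    pub plugin_source: Option<&'a str>,
}

/// One entry of `agents.channels.approved`.
pub struct Approved<'a> {
    pub server: &'a str,
    pub plugin_source: Option<&'a str>,
}

/// Run the five gates in order; the first failure wins.
pub fn gate(
    c: &Candidate,
    channels_enabled: bool,
    binding_allowed: &[&str],
    approved: &[Approved],
) -> Result<(), SkipKind> {
    if !c.declares_channel_capability {
        return Err(SkipKind::Capability);
    }
    if !channels_enabled {
        return Err(SkipKind::Killswitch);
    }
    if !binding_allowed.contains(&c.server) {
        return Err(SkipKind::BindingAllowlist);
    }
    let entry = approved.iter().find(|a| a.server == c.server);
    // Gate 4 only applies when the approved entry pins a plugin_source.
    if let Some(Approved { plugin_source: Some(expected), .. }) = entry {
        if c.plugin_source != Some(*expected) {
            return Err(SkipKind::PluginSource);
        }
    }
    if entry.is_none() {
        return Err(SkipKind::NotApproved);
    }
    Ok(())
}
```

Returning the kind of the first failing gate is what lets debug output name one YAML knob instead of a generic "rejected".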

Threading

Each (server, meta) pair maps to a stable agent session uuid via ChannelSessionKey::derive. Threading priority goes thread_ts (Slack) → thread_id → chat_id (Telegram, Discord) → conversation_id → room_id → channel_id → to. Without any matching key the session collapses to one per server.
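The key priority can be sketched as a first-match scan over the metadata. This is an illustration only: thread_scope is a hypothetical helper that returns a string scope, standing in for whatever input ChannelSessionKey::derive hashes into the session uuid.

```rust
use std::collections::HashMap;

/// Metadata keys in the priority order listed above.
const THREAD_KEYS: [&str; 7] = [
    "thread_ts",
    "thread_id",
    "chat_id",
    "conversation_id",
    "room_id",
    "channel_id",
    "to",
];

/// Pick the scope that identifies the session for one inbound message.
/// With no matching key, the scope is the server alone, so all traffic
/// from that server shares one session.
pub fn thread_scope(server: &str, meta: &HashMap<String, String>) -> String {
    for key in THREAD_KEYS {
        if let Some(value) = meta.get(key) {
            return format!("{server}:{key}:{value}");
        }
    }
    server.to_string()
}
```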

The mapping persists through the SQLite-backed SqliteSessionRegistry so daemon restarts don't reset Slack threads — the bot doesn't have to re-introduce itself every reboot.

Outbound + permission relay

channel_send(server, content, arguments?) resolves the server's outbound tool from the RegisteredChannel snapshot (default send_message, configurable per-server) and invokes it through the existing MCP runtime. arguments is passed verbatim; content populates a text key when the operator hasn't supplied one.

When a tool requires approval AND the agent's binding has a channel server with experimental['nexo/channel/permission'], the runtime emits notifications/nexo/channel/permission_request to the server and races every channel reply against the local prompt. The first decision wins. The server parses replies of the following form and forwards them as a structured event:

^\s*(y|yes|n|no)\s+([a-km-z]{5})\s*$

The 5-letter ID uses the alphabet a-z minus l (visually confusable with 1 / I in many fonts). Phone autocorrect's capitalisation of the prefix is tolerated.
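A hand-rolled equivalent of the reply pattern, using only the standard library, might look like the sketch below. It assumes that "capitalisation of the prefix is tolerated" means the yes/no word is matched case-insensitively while the ID stays lowercase; parse_permission_reply and PermissionReply are hypothetical names.

```rust
/// Parsed permission reply: the decision and the 5-letter request id.
#[derive(Debug, PartialEq, Eq)]
pub struct PermissionReply {
    pub approve: bool,
    pub id: String,
}

/// Accepts `<y|yes|n|no> <id>` with surrounding whitespace ignored.
pub fn parse_permission_reply(text: &str) -> Option<PermissionReply> {
    let mut parts = text.split_whitespace();
    // Assumption: only the verb is case-folded (phone autocorrect "Yes").
    let verb = parts.next()?.to_ascii_lowercase();
    let id = parts.next()?;
    if parts.next().is_some() {
        return None; // trailing tokens are not part of the format
    }
    let approve = match verb.as_str() {
        "y" | "yes" => true,
        "n" | "no" => false,
        _ => return None,
    };
    // Exactly five letters from a-z, excluding the confusable `l`.
    let id_ok = id.len() == 5 && id.bytes().all(|b| b.is_ascii_lowercase() && b != b'l');
    id_ok.then(|| PermissionReply { approve, id: id.to_string() })
}
```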

Rate limit

Per-server token bucket throttles inbound before parsing. When the bucket is empty the message is dropped with a structured warn — a noisy server cannot blow up memory or flood the conversation context. Configure via default_rate_limit (global ceiling) and per-server rate_limit (override). 0/0 means unthrottled; the validator caps rps at 1000 to catch typos.
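A token bucket with these semantics can be sketched in a few lines. The clock is passed in explicitly so the example stays deterministic; TokenBucket and try_take are illustrative names, not the runtime's types.

```rust
/// Token bucket: refills at `rps` tokens per second up to `burst`.
pub struct TokenBucket {
    rps: f64,
    burst: f64,
    tokens: f64,
    last_secs: f64,
}

impl TokenBucket {
    /// Starts full, so a quiet server can send an initial burst.
    pub fn new(rps: f64, burst: f64) -> Self {
        Self { rps, burst, tokens: burst, last_secs: 0.0 }
    }

    /// Returns true if the message may proceed, false if it should be
    /// dropped. `now_secs` is a monotonic timestamp in seconds.
    pub fn try_take(&mut self, now_secs: f64) -> bool {
        if self.rps == 0.0 && self.burst == 0.0 {
            return true; // 0/0 means unthrottled
        }
        let elapsed = (now_secs - self.last_secs).max(0.0);
        self.tokens = (self.tokens + elapsed * self.rps).min(self.burst);
        self.last_secs = now_secs;
        if self.tokens >= 1.0 {
            self.tokens -= 1.0;
            true
        } else {
            false
        }
    }
}
```

Because the check runs before parsing, a dropped message costs one subtraction and a log line, whatever its payload size.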

Hot-reload

Flipping channels.enabled or removing a server from approved triggers a re-evaluation of every active registration via ChannelRegistry::reevaluate. Entries that no longer pass the gate get unregistered with a typed SkipKind reason; surviving entries stay live without a daemon restart.

LLM tools the agent gets

channel_list — list active registrations for the agent's current binding (read-only, auto-approve-friendly).
channel_send — outbound wrapper.
channel_status [server?] — diagnostic surface (registered? plugin source? permission relay? registered-at-ms?). When server is omitted, returns one row per registered server.

All three resolve binding_id from ctx.effective.binding_index at call time, falling back to agent_id for paths without a binding match.

Audit

Every turn driven by a channel inbound writes source: "channel:<server>" into the Phase 72 turn-log (goal_turns table). Operators can answer "what came in via Slack today?" with a single SQL filter on the indexed source column.

MCP server (HTTP + SSE)

The agent can expose its own tools as an MCP server so other clients (Claude Desktop, Cursor, Zed, custom IDE plugins, remote consumers, third-party plugins like the upcoming nexo-marketing extension) can call them. The transport ships in two flavours, both backed by the same Dispatcher and so both share identical wire-level behaviour:

Transport	Status	Path	Use case
stdio	shipped (Phase 12.6)	`agent mcp-server` over the process stdio	Local IDE plugins that spawn the agent as a subprocess
HTTP+SSE (Streamable)	shipped (Phase 76.1)	`POST /mcp`, `GET /mcp`, `DELETE /mcp`	Remote clients, multi-process consumers, browser-based tools
Legacy SSE alias	optional (Phase 76.1)	`GET /sse`, `POST /messages?sessionId=…`	Older Claude Desktop builds still on the 2024-11-05 spec

Phase 76.1 only ships the transport layer. Pluggable auth (Phase 76.3), multi-tenant isolation (76.4), per-tool rate-limit (76.5), durable sessions + SSE replay (76.8 — see "Session resumption" below), and TLS-in-process (76.13) are tracked separately. For production today, terminate TLS at nginx/caddy/Traefik in front of the loopback bind.

Enabling HTTP

Edit config/mcp_server.yaml:

mcp_server:
  enabled: true
  http:
    enabled: true
    bind: "127.0.0.1:7575"
    auth_token_env: "NEXO_MCP_HTTP_TOKEN"
    allow_origins:
      - "http://localhost"
      - "http://127.0.0.1"
    body_max_bytes: 1048576
    request_timeout_secs: 30
    session_idle_timeout_secs: 300
    max_sessions: 1000
    enable_legacy_sse: false

Start the daemon as usual; agent mcp-server boots both stdio and the HTTP listener when http.enabled: true.

Authentication (Phase 76.3)

The HTTP transport supports four pluggable authentication modes. All modes share an anti-enumeration response shape: every rejection returns the same 401 body ({"jsonrpc":"2.0","error":{"code":-32001,"message":"unauthorized"}}) so a probing client cannot distinguish missing token, wrong token, expired token, unknown kid, etc. The reason is logged via tracing::warn! only.

Configure via mcp_server.http.auth. The block is mutually exclusive with the legacy auth_token_env; set one or the other.

`kind: none`

Disables authentication. The runtime refuses to boot if bind is not a loopback address (127.0.0.0/8 or ::1). For local dev only.

`kind: static_token`

Constant-time-compared bearer token.

mcp_server:
  http:
    enabled: true
    auth:
      kind: static_token
      token_env: "NEXO_MCP_TOKEN"

The env var must resolve to a non-empty string at boot. Clients present the token via either Authorization: Bearer <token> or Mcp-Auth-Token: <token>. Comparison runs through the subtle crate's ConstantTimeEq::ct_eq to defeat timing side-channels; a length mismatch returns false immediately (the length channel is not protected, so pick a fixed-length token).
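The idea behind a constant-time comparison can be illustrated with the standard library alone. This sketch is not the subtle crate and is not a substitute for it: an optimising compiler is free to reintroduce early exits here, which is exactly what subtle is designed to prevent. ct_eq_sketch is a hypothetical name.

```rust
/// Compare two byte strings without exiting at the first differing byte.
/// The length check is an early return, so the length itself is not hidden.
pub fn ct_eq_sketch(a: &[u8], b: &[u8]) -> bool {
    if a.len() != b.len() {
        return false;
    }
    // Accumulate every byte difference; the loop always runs to the end.
    let mut diff = 0u8;
    for (x, y) in a.iter().zip(b.iter()) {
        diff |= x ^ y;
    }
    diff == 0
}
```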

`kind: bearer_jwt`

JWT validated against a remote JWKS endpoint with cache + stale-OK fallback.

mcp_server:
  http:
    enabled: true
    auth:
      kind: bearer_jwt
      jwks_url: "https://idp.example.com/.well-known/jwks.json"
      jwks_ttl_secs: 300
      jwks_refresh_cooldown_secs: 10
      algorithms: ["RS256"]
      issuer: "https://idp.example.com/"
      audiences: ["nexo-mcp"]
      tenant_claim: "tenant_id"
      scopes_claim: "scope"
      leeway_secs: 30

Boot-time validation rejects:

Empty algorithms list.
algorithms containing none.
Mixing HMAC (HS*) and asymmetric (RS*/ES*/PS*) algorithms in the same list — the algorithm-confusion CVE class.
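The three boot-time checks can be sketched as one validation function. The names (validate_algorithms, AlgError) are hypothetical, and classifying algorithm families by their two-letter prefix is an assumption made for the sketch.

```rust
/// Why an `algorithms` list was rejected at boot.
#[derive(Debug, PartialEq, Eq)]
pub enum AlgError {
    Empty,
    NoneAlgorithm,
    MixedFamilies,
}

/// Reject empty lists, `none`, and HMAC mixed with asymmetric algorithms.
pub fn validate_algorithms(algs: &[&str]) -> Result<(), AlgError> {
    if algs.is_empty() {
        return Err(AlgError::Empty);
    }
    if algs.iter().any(|a| a.eq_ignore_ascii_case("none")) {
        return Err(AlgError::NoneAlgorithm);
    }
    let has_hmac = algs.iter().any(|a| a.starts_with("HS"));
    let has_asym = algs
        .iter()
        .any(|a| ["RS", "ES", "PS"].iter().any(|p| a.starts_with(*p)));
    if has_hmac && has_asym {
        // Accepting both lets an attacker sign with HS* using the public key.
        return Err(AlgError::MixedFamilies);
    }
    Ok(())
}
```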

JWKS robustness:

The cache uses single-flight refresh (one in-flight HTTP fetch per kid, others wait on tokio::sync::Notify).
Refresh attempts are rate-limited by jwks_refresh_cooldown_secs.
If a refresh fails and a previously-cached key for the same kid exists, the stale key is reused and a warn! line is emitted (the IdP is allowed transient outages).
If no usable cached key is available, the request returns HTTP 503 (-32099 authentication backend unavailable) rather than 401, since the failure is on our side.

The Principal produced by a successful JWT validation carries tenant_id, subject, and scopes — those flow into DispatchContext.principal and are available to handlers.

`kind: mutual_tls` (mode: `from_header`)

mTLS terminated by a reverse proxy (nginx, Caddy, Traefik). The proxy validates the client cert and forwards the CN/SAN via a trusted header.

mcp_server:
  http:
    enabled: true
    bind: "127.0.0.1:7575"   # MUST be loopback in this mode
    auth:
      kind: mutual_tls
      mode: from_header
      header_name: "X-Client-Cert-Cn"
      cn_allowlist:
        - "agent-1.internal"
        - "agent-2.internal"

The runtime refuses to boot when bind is not loopback in this mode — without that constraint any internet client could forge the header. cn_allowlist is exact-match (no glob, no substring).

Backward compatibility

The legacy mcp_server.http.auth_token_env field still works. When set with no auth block, the runtime promotes it to AuthConfig::StaticToken and emits a tracing::warn! with a deprecation hint. Setting both auth and auth_token_env simultaneously fails fast at boot.

Tenant isolation (Phase 76.4)

Every authenticated request carries a validated TenantId on its Principal. The tenant flows from the auth boundary into DispatchContext::tenant(), and from there into helpers that namespace filesystem paths and SQLite databases.

Origin of the tenant id

The tenant id is always server-derived from the Principal. A tool must never read tenant_id from its own arguments, because that would let a caller forge a tenant tag. The pattern is ported from the upstream agent CLI: the client passes only repo, and the organizationId is validated on the server side from the Bearer token. Nexo follows the same discipline.

How each auth mode derives the tenant

Mode	Source	Default	Failure
`none`	hardcoded `"local"`	—	—
`static_token`	YAML `tenant:` field	`"default"`	invalid id → boot fail
`bearer_jwt`	JWT claim named by `tenant_claim`	reject if missing	invalid format → 401 (`TenantClaimMissing`)
`mutual_tls` (`from_header`)	`cn_to_tenant` map → CN itself	—	dotted CN without remap → 401

mcp_server:
  http:
    enabled: true
    auth:
      kind: static_token
      token_env: NEXO_MCP_TOKEN
      tenant: prod-corp     # 76.4 — pin the tenant for this token

mcp_server:
  http:
    enabled: true
    auth:
      kind: mutual_tls
      mode: from_header
      cn_allowlist: [agent-1.internal, agent-2.internal]
      cn_to_tenant:                       # 76.4 — required for dotted CNs
        agent-1.internal: tenant-a
        agent-2.internal: tenant-b

Dotted CNs (e.g. agent-1.internal) cannot be parsed as tenant ids on their own: the strict TenantId validator rejects the dot character (.). Provide cn_to_tenant to remap, or rename the CN. We deliberately do not silently rewrite CNs (no automatic . → -); silent rewrites of identity claims are a security smell.

`TenantId` validation

TenantId::parse(raw) enforces:

No NUL bytes (C-syscall truncation vector).
Input must already be in NFKC canonical form — fullwidth-form bypasses (e.g. Ｔｅｎａｎｔ, ．．／) are rejected.
Percent-decode-and-recheck: %2e%2e%2f smuggling is rejected.
Length: 1–64 bytes.
Charset: [a-z0-9_-] only (lowercase ASCII; no dot, slash, uppercase, or whitespace).
No leading or trailing _ or -.

These rules are ported directly from the upstream agent CLI (sanitizePathKey).
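A minimal validator covering the length, charset, and edge rules could look like this. It is a sketch, not TenantId::parse: the names are hypothetical, and it relies on the observation that an ASCII-only allowlist already excludes NUL bytes, percent signs, dots, slashes, and fullwidth characters, so the separate NFKC and percent-decode passes are not reproduced.

```rust
/// Why a raw tenant id was rejected.
#[derive(Debug, PartialEq, Eq)]
pub enum TenantIdError {
    Empty,
    TooLong,
    BadChar(char),
    BadEdge,
}

/// Validate `raw` against: 1 to 64 bytes, `[a-z0-9_-]` only,
/// no leading or trailing `_` or `-`.
pub fn parse_tenant_id(raw: &str) -> Result<&str, TenantIdError> {
    fn is_edge(c: char) -> bool {
        c == '_' || c == '-'
    }
    if raw.is_empty() {
        return Err(TenantIdError::Empty);
    }
    if raw.len() > 64 {
        return Err(TenantIdError::TooLong);
    }
    // Report the first character outside the allowlist.
    if let Some(c) = raw
        .chars()
        .find(|&c| !matches!(c, 'a'..='z' | '0'..='9' | '_' | '-'))
    {
        return Err(TenantIdError::BadChar(c));
    }
    if raw.starts_with(is_edge) || raw.ends_with(is_edge) {
        return Err(TenantIdError::BadEdge);
    }
    Ok(raw)
}
```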

Path scoping

use nexo_mcp::server::auth::{tenant_db_path, tenant_scoped_canonicalize, tenant_scoped_path};

// New writes: non-canonicalising, fast.
let p = tenant_scoped_path(&root, ctx.tenant(), "memory/notes.txt");

// Reads: symlink-aware; ported from the upstream agent CLI
// (validateTeamMemWritePath).
let p = tenant_scoped_canonicalize(&root, ctx.tenant(), "memory/notes.txt")?;

tenant_scoped_canonicalize performs a two-pass containment check:

Lexical resolution rejects .. and absolute suffixes.
realpath() on the deepest existing ancestor follows symlinks and asserts the resolved path is strictly under <root>/tenants/<tenant>/. Symlink loops (ELOOP), dangling symlinks, and sibling-tenant traversal (tenants/t-evil/... trying to pass as tenants/t/...) all surface as distinct TenantPathError variants.
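The first (lexical) pass can be sketched with std::path alone. This covers only the rejection of .. and absolute suffixes; the realpath-based symlink pass is not shown. lexical_scope is a hypothetical name, not the crate's API.

```rust
use std::path::{Component, Path, PathBuf};

/// Lexical pass only: join `suffix` under `<root>/tenants/<tenant>/`,
/// rejecting `..` and absolute suffixes. No filesystem access happens here.
pub fn lexical_scope(root: &Path, tenant: &str, suffix: &str) -> Option<PathBuf> {
    let mut out = root.join("tenants").join(tenant);
    for part in Path::new(suffix).components() {
        match part {
            Component::Normal(seg) => out.push(seg),
            Component::CurDir => {}
            // `..`, a leading `/`, or a Windows drive prefix would escape the scope.
            Component::ParentDir | Component::RootDir | Component::Prefix(_) => return None,
        }
    }
    Some(out)
}
```

Rejecting every .. outright, rather than resolving it, keeps the lexical pass simple: a suffix such as a/../b is refused even though it would stay inside the tenant directory.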

Symlink defense is gated on cfg(unix): on Windows, std::fs::canonicalize returns extended-length (\\?\-prefixed) paths that break the prefix check. Phase 76.4 production targets are Linux musl + Termux; full Windows port is a follow-up.

`TenantScoped<T>` trip-wire

use nexo_mcp::server::auth::TenantScoped;

let db = TenantScoped::new(tenant_a.clone(), open_db_for("tenant-a"));
let raw = db.try_into_inner(&tenant_b)?; // → CrossTenantError

Thin wrapper that pairs a value with the tenant it was constructed for. try_into_inner is the trip-wire: extracting under a wrong tenant returns CrossTenantError rather than silently leaking. Not a load-bearing security boundary on its own — the actual isolation comes from path scoping at construction time — but cheap defense in depth against future bugs.

SQLite layout

tenant_db_path(root, tenant) returns <root>/tenants/<tenant>/state.sqlite3. One DB per tenant is the strongest isolation rusqlite makes easy: a corrupted DB affects exactly one tenant. The production reference in the upstream agent CLI is file-based with server-side scope enforcement; one-DB-per-tenant in nexo goes a step beyond that and suits the in-process MCP server shape.

Per-principal rate-limit (Phase 76.5)

A second rate-limit layer sits inside the dispatcher, keyed on (tenant_id, tool_name). It complements the per-IP layer (Phase 76.1, HTTP middleware): the per-IP layer rejects broad floods at the HTTP level (429 + Retry-After); the per-principal layer protects individual tools from a single authenticated tenant exhausting them (200 + JSON-RPC -32099 + data.retry_after_ms).

Wire shape

The per-IP and per-principal layers return different wire shapes — intentional, since they fire at different stack levels:

Layer	Status	Body
Per-IP (76.1, before parsing)	`429 Too Many Requests` + `Retry-After: <secs>` header	minimal
Per-principal (76.5, inside dispatcher)	`200 OK` + JSON-RPC error	`{"jsonrpc":"2.0","error":{"code":-32099,"message":"rate limit exceeded","data":{"retry_after_ms":<n>}},"id":<request_id>}`

A client that handles both sees one shape (HTTP 429) for "you're hitting the public IP gate too hard" and another (JSON-RPC -32099) for "this tenant has used its tool quota". retry_after_ms is the time until one token refills.

The Retry-After header parsing pattern (seconds → milliseconds) is ported from upstream agent CLI getRetryAfterMs.

Configuration

mcp_server:
  http:
    enabled: true
    per_principal_rate_limit:
      enabled: true                         # default
      default: { rps: 100.0, burst: 200.0 } # applies to any tool not in per_tool
      per_tool:
        agent_turn:    { rps: 10.0, burst: 20.0 }   # heavier tool, lower limit
        memory_search: { rps: 50.0, burst: 100.0 }
      max_buckets: 50000                     # hard cap on the bucket map
      stale_ttl_secs: 300                    # prune buckets idle > 5 min
      warn_threshold: 0.8                    # log when utilization ≥ 80%

When the per_principal_rate_limit block is omitted entirely, the limiter is not built (zero overhead in the dispatcher hot path). When the block is present but enabled: false, the limiter is built but check() short-circuits.

What gets rate-limited

JSON-RPC method	Gated by 76.5?
`tools/call`	yes
`tools/list`	no — list calls are cheap, no abuse vector beyond per-IP
`initialize`	no — once per session, gated by auth + per-IP
`shutdown`	no
`resources/*`	no (Phase 76.7 may add a separate gate)

Stdio principals (auth_method: stdio) bypass the limiter entirely — stdio is single-tenant by construction, so a self-throttling agent makes no sense.

Bucket eviction

The bucket map is bounded by max_buckets (default 50 000) with two eviction strategies running in parallel:

Hard cap: when len() ≥ max_buckets and a fresh key is about to be inserted, the limiter evicts ~1% of the cap from the buckets with the smallest last_seen timestamp (LRU).
Background sweeper: a tokio::spawn task wakes every 60 s and prunes any bucket with last_seen older than stale_ttl_secs. The task holds a Weak<Self> so it dies when the limiter is dropped.
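The hard-cap path can be sketched over a plain HashMap. This is illustrative only: the real limiter is described as a DashMap of token buckets, evict_if_full is a hypothetical helper, buckets are reduced to their last_seen timestamp, and evicting at least one entry when 1% rounds down to zero is an assumption.

```rust
use std::collections::HashMap;

/// When the map has reached `max_buckets`, evict roughly 1% of the cap
/// (at least one entry), oldest `last_seen` first. Returns the number of
/// entries removed. Call this just before inserting a fresh key.
pub fn evict_if_full(buckets: &mut HashMap<String, u64>, max_buckets: usize) -> usize {
    if buckets.len() < max_buckets {
        return 0;
    }
    let n = (max_buckets / 100).max(1);
    // Sort by (last_seen, key) so the oldest entries come first.
    let mut by_age: Vec<(u64, String)> = buckets
        .iter()
        .map(|(key, seen)| (*seen, key.clone()))
        .collect();
    by_age.sort();
    let victims: Vec<String> = by_age.into_iter().take(n).map(|(_, key)| key).collect();
    for key in &victims {
        buckets.remove(key);
    }
    victims.len()
}
```

Evicting a batch instead of a single entry amortises the sort: after one eviction pass the next inserts proceed without another scan until the cap is reached again.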

This pattern is ported from OpenClaw research/src/gateway/control-plane-rate-limit.ts:6-7,101-110 (10 k cap + 5-min stale-TTL pruner). The upstream agent CLI is client-side only and does not implement server-side rate limiting itself; we port the wire shape from the upstream CLI and the eviction policy from OpenClaw.

Early-warning log

When a bucket's utilization crosses warn_threshold (default 0.8), the limiter emits a tracing::warn! with tenant, tool, and the current utilization. Useful as an "approaching saturation" signal so operators can pre-emptively raise a per-tool override before clients start hitting -32099. Pattern from upstream agent CLI EARLY_WARNING_CONFIGS, simplified to a single fixed threshold.

Per-principal concurrency cap + per-call timeout (Phase 76.6)

The third gate in the dispatch path. Sits after the rate-limit layer (76.5) and protects against a different failure mode: not "too many requests per second" but "too many requests in flight at once" — typical when handlers are slow and a client keeps firing.

Layer	Measures	Wire when exceeded
76.1 per-IP (HTTP middleware)	requests / second per source IP	HTTP 429
76.5 per-principal rate-limit	requests / second per (tenant, tool)	JSON-RPC `-32099`
76.6 per-principal concurrency cap	in-flight requests per (tenant, tool)	JSON-RPC `-32002`
76.6 per-call timeout	wall-clock duration of a single call	JSON-RPC `-32001`

A request must clear all four to reach the handler.

Wire shape

Outcome	Code	Body `data`
Concurrency cap exceeded (queue wait expired)	`-32002`	`{"max_in_flight": <n>, "queue_wait_ms_exceeded": <n>}`
Per-call timeout exceeded	`-32001`	`{"timeout_ms": <n>}`

-32002 is reserved for "operator-side overload" — distinct from -32099 which means "you, the client, asked too much".

Configuration

mcp_server:
  http:
    enabled: true
    per_principal_concurrency:
      enabled: true                       # default
      default: { max_in_flight: 10 }      # per-(tenant, tool) default
      per_tool:
        agent_turn:    { max_in_flight: 5,  timeout_secs: 300 }
        memory_search: { max_in_flight: 20, timeout_secs: 5 }
      default_timeout_secs: 30            # fallback when per-tool omits
      queue_wait_ms: 5000                 # how long to wait for a permit
      max_buckets: 50000                  # hard cap on the semaphore map
      stale_ttl_secs: 300                 # prune buckets idle > 5 min

When the block is omitted entirely, the cap is not built (zero overhead). When enabled: false, the cap is built but acquire short-circuits to a no-op permit.

What gets capped

JSON-RPC method	Capped by 76.6?
`tools/call`	yes
`tools/list`	no
`initialize`	no
`shutdown`	no
`resources/*`	no

Stdio principals (auth_method: stdio) bypass the cap entirely (single-tenant by construction).

How permits work

Each (tenant, tool) pair gets a tokio::sync::Semaphore with max_in_flight permits. The dispatcher acquires one permit before calling the handler and drops it (RAII) on:

successful return,
handler error,
per-call timeout firing,
client/session cancellation.

The permit is always released — there is no path that strands one. Verified by tests/http_concurrency_load_test.rs and the test fixture in PHASES.md (handler sleeps 60 s with timeout 5 s → returns -32001 within ~5 s, semaphore back to full permits).

Queue wait

When all permits are taken, a new request waits up to queue_wait_ms for one to free up. If the wait expires, the request is rejected with -32002. queue_wait_ms: 0 means "reject immediately if no permit is available" (no queueing).

Cancellation during the wait (HTTP client disconnect, session shutdown, tokio::select! on the caller side) propagates: the acquire returns Cancelled → dispatcher returns -32800 request cancelled rather than waiting out the full queue interval.

Per-call timeout

Independent of the concurrency cap. Wraps the handler future in tokio::time::timeout(timeout_for(tool), ...). On elapse the inner future is dropped at its next .await (cooperative cancellation), the permit is released, and the dispatcher returns -32001 with data.timeout_ms. Lookup priority for the timeout:

per_tool[<name>].timeout_secs
default.timeout_secs
default_timeout_secs

Hard cap on any timeout is 600 s (mirrors http_config::MAX_REQUEST_TIMEOUT_SECS).
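The lookup order can be sketched as a chain of fallbacks. The function name mirrors the timeout_for mentioned above but the signature is hypothetical, and clamping to 600 s (rather than rejecting larger values at config load) is an assumption.

```rust
/// Upper bound on any per-call timeout, in seconds.
pub const MAX_TIMEOUT_SECS: u64 = 600;

/// Resolve the timeout for one tool call:
/// 1. `per_tool[<name>].timeout_secs`
/// 2. `default.timeout_secs`
/// 3. `default_timeout_secs`
/// The result is clamped to `MAX_TIMEOUT_SECS` (assumed behaviour).
pub fn timeout_for(
    per_tool: Option<u64>,
    default_block: Option<u64>,
    default_timeout_secs: u64,
) -> u64 {
    per_tool
        .or(default_block)
        .unwrap_or(default_timeout_secs)
        .min(MAX_TIMEOUT_SECS)
}
```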

Bucket eviction

Same shape as 76.5: a hard cap (max_buckets, default 50 000) with LRU eviction at insert + a background sweeper that runs every 60 s and prunes entries with last_seen older than stale_ttl_secs. The sweeper only drops entries whose semaphore has all permits available — it never strands an in-flight permit. Worst case: a tenant that always has at least one call in flight never gets its entry pruned, bounded by the hard cap LRU at insert time.

Reference patterns

RAII permit + cancel-aware acquire — in-tree crates/mcp/src/client.rs:873-899 (76.1 client side).
DashMap + sweeper + hard-cap eviction — Phase 76.5 per_principal_rate_limit.rs. We mirror the same shape with Semaphore in place of TokenBucket.
tokio::select! cancellation — Phase 76.2 dispatch.rs:201-205 (biased; cancel; do_dispatch).
AbortSignal/AbortController equivalent — upstream agent CLI and src/services/tools/toolExecution.ts:415-416. The upstream CLI does not implement server-side concurrency caps (it's a client), so only the cancellation propagation idea is portable.
Anti-pattern (NOT ported): OpenClaw research/src/acp/control-plane/session-actor-queue.ts:6-37 uses an unbounded keyed-async-queue. Phase 76.6 explicitly rejects unbounded queues (max_buckets + queue_wait_ms together bound both memory and tail latency).

Server-side notifications + streaming (Phase 76.7)

Phase 76.7 closes the server→client notification loop on top of the per-session SSE channel that Phase 76.1 already wired. Three JSON-RPC notifications are now emitted by the in-tree dispatcher, plus a fourth (notifications/progress) that tools opt into via a streaming-aware handler method.

Notification	Trigger	Wire shape
`notifications/tools/list_changed`	`HttpServerHandle::notify_tools_list_changed()`	`{"jsonrpc":"2.0","method":"notifications/tools/list_changed"}`
`notifications/resources/list_changed`	`HttpServerHandle::notify_resources_list_changed()`	`{"jsonrpc":"2.0","method":"notifications/resources/list_changed"}`
`notifications/resources/updated`	`HttpServerHandle::notify_resource_updated(uri, contents)`	`{"jsonrpc":"2.0","method":"notifications/resources/updated","params":{"uri":<…>,"contents":<…>?}}`
`notifications/progress`	tool calls `progress.report(progress, total?, message?)`	`{"jsonrpc":"2.0","method":"notifications/progress","params":{"progressToken":<echoed>,"progress":<n>,"total":<n>?,"message":<…>?}}`

Capability advertisement

The default McpServerHandler::capabilities() now returns:

{
  "tools":     { "listChanged": true },
  "resources": { "listChanged": true, "subscribe": true }
}

Implementors that don't support subscriptions can override the method.

Progress reporter

A tool that wants to emit progress overrides call_tool_streaming on its McpServerHandler (the default delegates to call_tool and ignores the reporter):

async fn call_tool_streaming(
    &self,
    name: &str,
    args: Value,
    progress: ProgressReporter,
) -> Result<McpToolResult, McpError> {
    for i in 1..=100 {
        progress.report(i as f64, Some(100.0), Some(format!("step {i}")));
        do_one_step().await;
    }
    Ok(/* result */)
}

progress.report is non-blocking. Drop-oldest on broadcast overflow; sender never panics if the SSE consumer disconnected.
A 20 ms coalescing gate (per reporter) collapses storms — a tool that calls report 1 000 times in a tight loop produces ≤ 50 events/sec on the wire, with the most recent values emitted on each gate fire.
The reporter is a noop when the originating request did not include params._meta.progressToken. Tools call report unconditionally without branching.
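
To make the 20 ms gate concrete, here is a small model of the coalescing arithmetic. It is an assumption-laden sketch: CoalescingGate is a hypothetical type, time is passed in as milliseconds rather than read from a clock, and a value reported after the last gate fire stays pending (the real reporter may flush it differently).

```rust
/// Hypothetical model of the per-reporter coalescing gate.
pub struct CoalescingGate {
    gate_ms: u64,
    last_fire_ms: Option<u64>,
    latest: Option<f64>,
}

impl CoalescingGate {
    pub fn new(gate_ms: u64) -> Self {
        Self { gate_ms, last_fire_ms: None, latest: None }
    }

    /// Record the newest progress value. Returns `Some(value)` only when the
    /// gate is open, i.e. at least `gate_ms` have passed since the last emit.
    pub fn report(&mut self, now_ms: u64, progress: f64) -> Option<f64> {
        self.latest = Some(progress);
        let open = match self.last_fire_ms {
            None => true,
            Some(t) => now_ms.saturating_sub(t) >= self.gate_ms,
        };
        if open {
            self.last_fire_ms = Some(now_ms);
            self.latest.take()
        } else {
            None
        }
    }
}
```

Feeding this gate one report per millisecond for a full second yields 50 emits (one per 20 ms window), which matches the ≤ 50 events/sec bound stated above.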

`resources/subscribe` semantics

→ {"jsonrpc":"2.0","method":"resources/subscribe","params":{"uri":"file:///x"},"id":1}
← {"jsonrpc":"2.0","result":{},"id":1}

Subscriptions are stored in a DashSet<String> on the session, cleared when the session is removed. The host pushes notifications/resources/updated via HttpServerHandle::notify_resource_updated(uri, contents); only sessions whose subscription set contains uri receive the event.

Reference patterns

upstream agent CLI — client-side consumption of tools/list_changed. The upstream CLI is client-side and does NOT implement server-side notifications; we port the wire shape and build the server-side broadcast ourselves on top of the existing broadcast::Sender<SessionEvent> per session (Phase 76.1, crates/mcp/src/server/http_session.rs:39-46).
crates/mcp/src/server/http_transport.rs:815-820 — Lagged event handling on SSE overflow. Reused as-is for notifications/progress storm scenarios.

Session resumption + SSE replay (Phase 76.8)

The HTTP transport persists every server-pushed SSE frame to a SQLite event store so a reconnecting client can replay the gap via the Last-Event-ID header instead of re-initializing from scratch.

Wire contract

SSE frames carry id: <seq> (per-session monotonic, starting at 1) plus event: message / data: <json-rpc-frame>.
Reconnect: GET /mcp with Mcp-Session-Id: <uuid> + Last-Event-ID: <seq>. The server replays persisted frames with seq > <Last-Event-ID> (capped at max_replay_batch) before the live broadcast loop attaches.
Header absent → no replay (live only). Header present (any numeric value, including 0) → replay everything above.
Unknown Mcp-Session-Id → HTTP 404 + JSON-RPC body {"error":{"code":-32001,"message":"Session not found"}}. This matches the prior agent CLI client's isMcpSessionExpiredError contract: a permanent failure that the client must recover from by re-initializing.
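
The replay selection rule in the contract above can be sketched as a pure function. The Frame alias and function name are hypothetical, and the sketch assumes the store hands back frames in ascending seq order; the SQLite query itself is not shown.

```rust
/// One persisted SSE frame: (seq, JSON-RPC payload). Illustrative alias.
pub type Frame = (u64, String);

/// Frames to send before attaching the live loop.
/// Header absent -> nothing; header present -> every frame with
/// seq > Last-Event-ID, capped at `max_replay_batch`.
pub fn frames_to_replay(
    store: &[Frame],
    last_event_id: Option<u64>,
    max_replay_batch: usize,
) -> Vec<Frame> {
    match last_event_id {
        None => Vec::new(),
        Some(last) => store
            .iter()
            .filter(|(seq, _)| *seq > last)
            .take(max_replay_batch)
            .cloned()
            .collect(),
    }
}
```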

Configuration

mcp_server:
  http:
    session_event_store:
      enabled: true                     # opt-in; default off when block omitted
      db_path: "data/mcp_sessions.db"   # absolute path recommended in prod
      max_events_per_session: 10000     # ring cap; oldest pruned every 1000 emits
      max_replay_batch: 1000            # hard ceiling per replay (max 10000)
      purge_interval_secs: 60           # background prune older than session_max_lifetime_secs

The session_max_lifetime_secs (default 24 h) gates how long events live in the store. The background purge worker stops on parent shutdown; SIGTERM does not block on it.

What does not survive a daemon restart

The in-memory HttpSession (broadcast channel + cancellation token) is gone after a restart. Only events + subscriptions persist on disk. A client that reconnects with its old session-id gets the 404 + -32001 contract above and is expected to re-initialize. Full session reattach (rehydrating the entire HttpSession) is parked as 76.8.b until a real client asks for it; the upstream client treats expired sessions as permanent failure, so the parity gap is intentional.

Observability

The same mcp_requests_total{outcome} and mcp_request_duration_seconds metrics from 76.10 cover replay path requests transparently. Replay-specific counters (mcp_replay_rows_total, mcp_replay_skipped_total{reason="cap"}) are deferred to a follow-up — file an issue if you need them sooner.

Reference patterns

upstream agent CLI — wire format SSE id: + Last-Event-ID reconnect.
upstream agent CLI — HTTP 404 + JSON-RPC -32001 permanent-failure contract.
crates/agent-registry/src/turn_log.rs:64-89 — in-tree TurnLogStore pattern mirrored verbatim for the SessionEventStore trait shape (Phase 72 alignment).

Observability + health (Phase 76.10)

The server emits Prometheus metrics for every dispatch path plus enriched /healthz + /readyz responses. Metrics are hand-rolled (LazyLock<DashMap<Key, AtomicU64>> module globals) following the in-tree pattern (crates/web-search/src/telemetry.rs, crates/llm/src/telemetry.rs) — render-on-scrape, no prometheus crate dependency.

Metric inventory

Metric	Type	Labels	Bumped at
`mcp_requests_total`	counter	`tenant`, `tool`, `outcome`	`Dispatcher` post-call (every `tools/call` outcome)
`mcp_request_duration_seconds`	histogram (8 buckets: 50/100/250/500/1k/2.5k/5k/10k ms)	`tenant`, `tool`	`Dispatcher` post-call
`mcp_in_flight`	gauge (signed)	`tenant`, `tool`	RAII `InFlightGuard` — increment on entry, decrement on every exit path (incl. panic unwind)
`mcp_rate_limit_hits_total`	counter	`tenant`, `tool`	76.5 rate-limit reject
`mcp_timeouts_total`	counter	`tenant`, `tool`	76.6 per-call timeout reject (-32001)
`mcp_concurrency_rejections_total`	counter	`tenant`, `tool`	76.6 concurrency cap reject (-32002)
`mcp_progress_notifications_total`	counter	`outcome` (ok\|drop)	76.7 reporter emit / drop-oldest overflow

Cardinality discipline

Tool labels are bounded by MAX_DISTINCT_TOOLS = 256. Beyond that, every new tool name collapses to "other". Pattern ported from upstream agent CLI (mcp__* tools collapsed to 'mcp'). Tenant labels are bounded by TenantId::parse ([a-z0-9_-]{1,64}) — even a misconfigured deployment can't blow up the metric.
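
A sketch of the tool-label bound, assuming a simple first-come allowlist: the first 256 distinct names keep their own label and everything after collapses to "other". ToolLabels is a hypothetical type; the in-tree version lives behind concurrent module globals.

```rust
use std::collections::HashSet;

pub const MAX_DISTINCT_TOOLS: usize = 256;

/// Hypothetical tracker for the set of tool names seen so far.
pub struct ToolLabels {
    seen: HashSet<String>,
}

impl ToolLabels {
    pub fn new() -> Self {
        Self { seen: HashSet::new() }
    }

    /// Returns the label to put on the metric for `tool`.
    pub fn label<'a>(&mut self, tool: &'a str) -> &'a str {
        if self.seen.contains(tool) {
            return tool;
        }
        if self.seen.len() < MAX_DISTINCT_TOOLS {
            self.seen.insert(tool.to_string());
            return tool;
        }
        "other"
    }
}
```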

`correlation_id` propagation

The HTTP transport extracts X-Request-ID from request headers (or generates a UUIDv4 when absent), echoes it in the response header, and stamps it on DispatchContext.correlation_id. The dispatcher logs it on every mcp.dispatch span:

INFO mcp.dispatch{tenant=acme tool=agent_turn correlation_id=4d8c...} ...

Client-supplied values longer than 128 chars are replaced with a fresh UUIDv4 — don't trust unbounded headers.

`/healthz` vs `/readyz`

/healthz (ported from Phase 9.3): liveness only, returns 200 {"status":"ok"} as long as the process is alive.

/readyz: structured readiness check with cached snapshot (TTL 5 s — absorbs scrape thundering-herd):

{
  "ready": true,
  "checks": {
    "broker": true,
    "sessions_capacity_ok": true
  }
}

Returns HTTP 200 when ready is true, 503 otherwise. Operators should hit /readyz from k8s readinessProbe and /healthz from livenessProbe.

Reference patterns

Cardinality bounding — upstream agent CLI (MCP tool collapsing) and :281-299 (model-name normalisation). Direct port: 256-tool allowlist + "other" collapse.
In-tree precedent — crates/web-search/src/telemetry.rs:14-260 (8-bucket histogram layout), crates/core/src/telemetry.rs:483-557 (aggregator).
Anti-pattern flagged — crates/poller/src/telemetry.rs:74-94 uses user-provided job_id: String as a label, which can grow unboundedly. Phase 76.10 deliberately avoids unbounded labels.

Defaults and hardening

HttpTransportConfig::validate() refuses to boot the HTTP listener when the operator picks an insecure combination:

Non-loopback bind without auth_token_env.
Non-loopback bind with empty allow_origins.
Non-loopback bind with allow_origins: ["*"].
body_max_bytes above the 16 MiB hard cap.
session_idle_timeout_secs above 86 400 s (24 h hard cap).
request_timeout_secs above 600 s.
session_max_lifetime_secs < session_idle_timeout_secs.
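
The refusal rules above can be read as a single validation pass. The following is a sketch only: the struct mirrors the option names mentioned in this chapter, but the bind field, the error type, and the order of checks are assumptions rather than the actual HttpTransportConfig::validate().

```rust
use std::net::SocketAddr;

/// Hypothetical subset of the HTTP transport config.
#[derive(Clone)]
pub struct HttpTransportConfig {
    pub bind: SocketAddr,
    pub auth_token_env: Option<String>,
    pub allow_origins: Vec<String>,
    pub body_max_bytes: u64,
    pub session_idle_timeout_secs: u64,
    pub session_max_lifetime_secs: u64,
    pub request_timeout_secs: u64,
}

/// Returns the first violated rule, or Ok(()) when the listener may boot.
pub fn validate(cfg: &HttpTransportConfig) -> Result<(), &'static str> {
    if !cfg.bind.ip().is_loopback() {
        if cfg.auth_token_env.is_none() {
            return Err("non-loopback bind without auth_token_env");
        }
        if cfg.allow_origins.is_empty() {
            return Err("non-loopback bind with empty allow_origins");
        }
        if cfg.allow_origins.iter().any(|o| o.as_str() == "*") {
            return Err("non-loopback bind with wildcard allow_origins");
        }
    }
    if cfg.body_max_bytes > 16 * 1024 * 1024 {
        return Err("body_max_bytes above 16 MiB");
    }
    if cfg.session_idle_timeout_secs > 86_400 {
        return Err("session_idle_timeout_secs above 24 h");
    }
    if cfg.request_timeout_secs > 600 {
        return Err("request_timeout_secs above 600 s");
    }
    if cfg.session_max_lifetime_secs < cfg.session_idle_timeout_secs {
        return Err("session_max_lifetime_secs < session_idle_timeout_secs");
    }
    Ok(())
}
```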

Body parsing is hardened against pathological inputs:

JSON nesting beyond depth 64 is rejected (-32600) BEFORE serde_json allocates — defends against stack-overflow payloads.
Batch (array) requests are rejected (MCP 2025-11-25 forbids them).
method and params.name strings beyond 64 KiB are rejected.
Notifications (id absent) yield 202 Accepted with an empty body and never produce a response body.
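
One way to enforce a nesting limit before any JSON parser runs is a single byte scan that tracks bracket depth outside string literals. The function below is an illustrative sketch of that idea, not the in-tree check; its name and signature are hypothetical.

```rust
/// Returns true when `body` nests objects/arrays deeper than `max_depth`.
/// Brackets inside JSON strings are ignored, including after escaped quotes.
/// No allocation and no recursion, so the scan itself cannot overflow the stack.
pub fn exceeds_depth(body: &[u8], max_depth: usize) -> bool {
    let mut depth = 0usize;
    let mut in_string = false;
    let mut escaped = false;
    for &b in body {
        if in_string {
            if escaped {
                escaped = false;
            } else if b == b'\\' {
                escaped = true;
            } else if b == b'"' {
                in_string = false;
            }
            continue;
        }
        match b {
            b'"' => in_string = true,
            b'{' | b'[' => {
                depth += 1;
                if depth > max_depth {
                    return true;
                }
            }
            b'}' | b']' => depth = depth.saturating_sub(1),
            _ => {}
        }
    }
    false
}
```

A caller would reject the request with -32600 when this returns true for a limit of 64, and only then hand the bytes to the parser.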

Endpoints

`POST /mcp`

JSON-RPC over HTTP. initialize allocates a new session — the response carries Mcp-Session-Id: <uuid>. Every subsequent request MUST include the same header; missing or unknown session id returns 404.

curl -i -H "Authorization: Bearer ${TOKEN}" \
     -H 'Content-Type: application/json' \
     -d '{"jsonrpc":"2.0","method":"initialize","params":{},"id":1}' \
     http://127.0.0.1:7575/mcp

`GET /mcp` (SSE)

Opens a Server-Sent Events stream for unsolicited notifications (tools/list_changed, future progress events). Required header is Mcp-Session-Id. Stream events:

event: message — JSON-RPC envelope from server to client.
event: lagged — payload {"dropped": <n>} when the per-session buffer (default 256) overflows due to a slow consumer.
event: shutdown — payload {"reason": "<…>"} on graceful daemon shutdown.
event: end — payload {"reason": "session_closed" | "max_age" | "expired"}.

`DELETE /mcp`

Tears down the session referenced by Mcp-Session-Id. Returns 204 on success, 404 if the id is unknown. SSE consumers listening on the same session receive event: end with reason: "session_closed".

`GET /healthz` and `GET /readyz`

Always reachable, never authenticated, no origin check. /healthz returns 200 ok while the listener is alive. /readyz returns 503 until the first successful initialize, then 200 for the rest of the process lifetime.

Legacy SSE alias (`enable_legacy_sse: true`)

GET /sse — opens an SSE stream and emits a single event: endpoint whose data is the absolute URL the client must POST to (http://<host>/messages?sessionId=<uuid>). Subsequent server→client events come through the same stream.
POST /messages?sessionId=X — equivalent to POST /mcp, but the JSON-RPC response is delivered on the SSE stream as an event: message rather than in the HTTP body. The HTTP response itself is 202 Accepted with an empty body.

Reverse-proxy guidance

In production, terminate TLS in front of the agent. Three recipes below.

Nginx

server {
    listen 443 ssl http2;
    server_name mcp.example.com;
    ssl_certificate /etc/letsencrypt/live/mcp.example.com/fullchain.pem;
    ssl_certificate_key /etc/letsencrypt/live/mcp.example.com/privkey.pem;

    location /mcp {
        proxy_pass http://127.0.0.1:7575;
        proxy_http_version 1.1;
        proxy_buffering off;          # keep SSE responsive
        proxy_read_timeout 1h;        # SSE long-poll
        proxy_set_header Host $host;
        proxy_set_header X-Forwarded-For $remote_addr;
        proxy_set_header X-Forwarded-Proto $scheme;
    }

    location /healthz {
        proxy_pass http://127.0.0.1:7575;
        proxy_http_version 1.1;
    }

    location /readyz {
        proxy_pass http://127.0.0.1:7575;
        proxy_http_version 1.1;
    }
}

Caddy (v2)

Caddy auto-provisions Let's Encrypt certificates. Minimal Caddyfile:

mcp.example.com {
    reverse_proxy /mcp*     127.0.0.1:7575
    reverse_proxy /healthz  127.0.0.1:7575
    reverse_proxy /readyz   127.0.0.1:7575

    # SSE needs these tuned:
    @sse path /mcp
    header @sse Cache-Control no-store
    header @sse X-Accel-Buffering no
}

Traefik (v3)

YAML static config snippet:

entryPoints:
  websecure:
    address: ":443"
    http:
      tls:
        certResolver: letsencrypt

http:
  routers:
    mcp:
      rule: "Host(`mcp.example.com`)"
      entryPoints: ["websecure"]
      service: mcp-backend
      tls:
        certResolver: letsencrypt

  services:
    mcp-backend:
      loadBalancer:
        servers:
          - url: "http://127.0.0.1:7575"

With Docker labels (Compose):

services:
  nexo-mcp:
    labels:
      - "traefik.enable=true"
      - "traefik.http.routers.mcp.rule=Host(`mcp.example.com`)"
      - "traefik.http.routers.mcp.entrypoints=websecure"
      - "traefik.http.routers.mcp.tls.certresolver=letsencrypt"
      - "traefik.http.services.mcp.loadbalancer.server.port=7575"
      # SSE: disable buffering on the MCP route
      - "traefik.http.middlewares.mcp-sse.buffering.maxRequestBodyBytes=0"
      - "traefik.http.routers.mcp.middlewares=mcp-sse"

mTLS (mutual TLS)

For in-VPC or zero-trust deployments where the MCP server must authenticate the client via certificate:

server {
    listen 443 ssl http2;
    server_name mcp.internal.example.com;

    ssl_certificate     /etc/mcp/server.crt;
    ssl_certificate_key /etc/mcp/server.key;
    ssl_client_certificate /etc/mcp/client_ca.crt;
    ssl_verify_client on;
    ssl_verify_depth 2;

    error_page 495 /_mtls_fail;
    location /_mtls_fail {
        internal;
        return 400 "client certificate required\n";
    }

    location /mcp {
        proxy_pass http://127.0.0.1:7575;
        proxy_http_version 1.1;
        proxy_buffering off;
        proxy_read_timeout 1h;
        proxy_set_header Host $host;
        proxy_set_header X-Forwarded-For $remote_addr;
        proxy_set_header X-Client-Cert-Subject $ssl_client_s_dn;
    }
}

Caddy mTLS:

mcp.internal.example.com {
    tls /etc/mcp/server.crt /etc/mcp/server.key {
        client_auth {
            mode require_and_verify
            trusted_ca_cert_file /etc/mcp/client_ca.crt
        }
    }
    reverse_proxy 127.0.0.1:7575
}

Note: mTLS provides transport-level authentication. When the proxy enforces client certificates, the MCP server's application-layer token/auth requirement can be relaxed (validate accepts tls.client_ca_path as a substitute for auth_token).

In-process TLS (`server-tls` feature)

For deployments that can't/won't run a reverse proxy, the crate ships an optional server-tls feature:

# Cargo.toml
nexo-mcp = { version = "...", features = ["server-tls"] }

# config/mcp_server.yaml
mcp_server:
  enabled: true
  http:
    tls:
      cert_path: /etc/mcp/server.crt
      key_path: /etc/mcp/server.key
      client_ca_path: /etc/mcp/client_ca.crt  # optional: enables mTLS

Current status: the YAML schema and config validation accept the tls block. The runtime in-process TLS listener is blocked on axum 0.7's serve() which only accepts TcpListener; full support lands with the axum 0.8 upgrade (generic Listener trait). Today, use the reverse-proxy recipes above and leave the tls block empty.

The agent's per-IP rate limiter trusts X-Forwarded-For only when the listener is bound to loopback (operator behind a proxy); otherwise the direct peer IP is authoritative.

Exposing additional tools (Phase 76.16)

By default the MCP server exposes the five agent introspection tools (who_am_i, what_do_i_know, my_stats, memory, session_logs). To surface any subset of the Phase 79 agentic tools to external MCP clients, add them to expose_tools in config/mcp_server.yaml:

mcp_server:
  expose_tools:
    - EnterPlanMode   # puts the session into read-only plan review mode
    - ExitPlanMode    # lifts plan-mode; requires operator approval
    - ToolSearch      # on-demand schema fetch for deferred tools
    - TodoWrite       # ephemeral intra-turn checklist
    - SyntheticOutput # typed/structured output forcing
    - NotebookEdit    # Jupyter cell-level edits
    - RemoteTrigger   # webhook / NATS publish from inside a turn

Unknown names and the two gated tools (Config, Lsp) are skipped with a tracing::warn! log at startup — the daemon continues normally. The existing allowlist field in mcp_server.yaml still applies on top of expose_tools, letting operators further restrict which of the registered tools each client session may call.

Denied-by-default tools (Heartbeat, delegate, RemoteTrigger) require an additional safe profile:

List the tool in expose_denied_tools.
Enable denied_tools_profile.enabled.
Set the matching denied_tools_profile.allow.* = true.

Example (safe minimal override for reminders only):

mcp_server:
  auth_token_env: MCP_SERVER_TOKEN
  expose_tools: ["Heartbeat"]
  expose_denied_tools: ["Heartbeat"]
  denied_tools_profile:
    enabled: true
    require_auth: true
    require_delegate_allowlist: true
    require_remote_trigger_targets: true
    allow:
      heartbeat: true
      delegate: false
      remote_trigger: false

Security note: Config (self-config write-back) and Lsp (in-process rust-analyzer / pylsp) require additional infrastructure and are deferred to a later sub-phase. They are intentionally not enabled via expose_tools today.

Testing the server

Run the full conformance + fuzz suite (Phase 76.12):

cargo test -p nexo-mcp --features server-conformance

This runs:

5 proptest cases over parse_jsonrpc_frame — arbitrary bytes, strings, methods, depths, and batch arrays. Invariant: no panic.
11 HTTP conformance cases — MCP 2025-11-25 spec fixtures via HTTP transport.
11 stdio conformance cases — same fixtures via stdio transport, verifying transport parity.

For the load smoke test (50 sessions × 200 requests = 10 000 calls, p99 gate < 500 ms; takes ~5 s):

cargo test -p nexo-mcp --features server-conformance \
    -- --include-ignored load_smoke

Coming in later sub-phases

76.13 ✅ — TLS config schema + feature flag + nginx/caddy/Traefik/mTLS reverse-proxy recipes. In-process TLS listener deferred to axum 0.8 upgrade.
76.14 ✅ — nexo mcp-server CLI ops: inspect, bench, tail-audit. All three subcommands wired and smoke-tested.

Track the rollout in PHASES.md and the public surface diff in CLAUDE.md.

Building an MCP server extension

Phase 76.15 — operator-friendly walk-through for forking the template-mcp-server skeleton into a domain-specific MCP server (e.g. nexo-marketing, nexo-crm). The companion chapter HTTP+SSE transport documents the production knobs (auth, multi-tenant, rate-limit, audit, resume); this chapter is the developer's quickstart.

When to build an MCP server extension

You want one when:

You have a domain (marketing, CRM, billing, ops) with its own tools, types, and access policy that should NOT live inside the agent process.
You want separate deployment + auth — for example, the marketing team owns the marketing MCP server and exposes it on their VPC; the agent process is shared infrastructure.
You want third-party access: Claude Code, Cursor, custom scripts, or another agent can connect over HTTP+SSE while the agent proxies through the same surface.

You DON'T want one when:

The capability is shared by every agent in the workspace — ship it as a built-in tool inside crates/core/.
The capability is a thin wrapper over one HTTP API — ship it as an agent extension (stdio JSON-RPC) instead. MCP servers carry per-call auth + rate-limit + audit overhead that's wasted on a single-tenant private endpoint.

The skeleton — `extensions/template-mcp-server/`

template-mcp-server/
├── Cargo.toml             # depends on nexo-mcp (path dep in-tree, crates.io after copy)
├── plugin.toml            # extension manifest (id, capabilities)
├── config.example.yaml    # documented HTTP block ready to paste
├── README.md              # quickstart + production checklist
└── src/
    ├── main.rs            # boot stdio + optional HTTP via env var
    └── tools.rs           # one typed Echo tool using McpServerBuilder

The whole skeleton is under 250 LOC of Rust. It deliberately stops short of multi-tenant + audit + rate-limit so the diff stays readable; everything you need to enable those is in config.example.yaml plus pointers to the operator chapter.

The 5-step fork

1. Copy + rename

cp -r extensions/template-mcp-server ~/code/nexo-marketing
cd ~/code/nexo-marketing

2. `Cargo.toml`

Bump name to nexo-marketing (or whatever).
Drop publish = false if you intend to release.
Switch the nexo-mcp path dep to a published version:

nexo-mcp = "0.1.1"   # was: { path = "../../crates/mcp" }

3. `plugin.toml`

Bump id, name, description.
List your tools under [capabilities].tools.
Decide whether the agent's extension supervisor should fork the binary directly (keep transport.command) or whether you'll run it as a long-lived service (drop the line, register the URL in the agent's mcp_server.http block).

4. `src/tools.rs`

Replace the Echo tool with your domain logic. A typed tool is three structs + one impl Tool:

#[derive(Deserialize, JsonSchema)]
pub struct SendEmailArgs {
    pub to: String,
    pub subject: String,
    pub body: String,
}

#[derive(Serialize, JsonSchema)]
pub struct SendEmailOut {
    pub message_id: String,
}

pub struct SendEmail { /* config: API key, smtp, etc. */ }

#[async_trait]
impl Tool for SendEmail {
    type Args = SendEmailArgs;
    type Output = SendEmailOut;

    fn name(&self) -> &str { "send_email" }
    fn description(&self) -> &str { "Send a transactional email." }

    async fn call(&self, args: Self::Args, _ctx: ToolCtx<'_>) -> Result<Self::Output, McpError> {
        // Call your provider; propagate errors as McpError variants.
        Ok(SendEmailOut { message_id: "...".into() })
    }
}

The schema for SendEmailArgs is derived once at registration and cached; runtime cost per tools/call is one BTreeMap::get on the tool name. The _ctx: ToolCtx<'_> parameter exposes tenant, correlation_id, session_id, progress, cancel — use them when you need multi-tenant routing or want to emit notifications/progress for long-running calls.

5. Wire the agent

Two patterns:

Pattern A — child process (simplest). The agent's extension supervisor forks your binary as a child and pipes stdio. Add to the agent's config/agents.yaml:

extensions:
  - id: nexo-marketing
    command: "/path/to/nexo-marketing"
    transport: stdio

Pattern B — long-lived HTTP service. Run the binary as a systemd unit on its own host. Configure the agent's mcp_server.http block in config/mcp_server.yaml to point at it. This unlocks per-tenant auth, audit, rate-limit, and lets non-agent clients (Claude Code, Cursor) hit the same server.

Production checklist

Before exposing a forked server beyond loopback:

Knob	Phase	Why
`auth.kind: static_token` or `bearer_jwt`	76.3	Non-loopback bind without auth refuses to boot
`allow_origins: [...]` (no `*`)	76.1	CORS hard-rejected on non-loopback bind
`audit_log.enabled: true`	76.11	Per-call durable trail; survives restart
`per_principal_rate_limit.enabled: true`	76.5	Cap noisy tenants before they exhaust paid APIs
`per_principal_concurrency.enabled: true`	76.6	Keep one tenant from starving others
`session_event_store.enabled: true`	76.8	SSE consumers can resume via `Last-Event-ID`
TLS in front (nginx/caddy/Traefik)	76.13 (deferred)	Direct `rustls` parked; treat binary as cleartext

Each row maps to a config block in config.example.yaml. Uncomment and fill in.

Quickstart smoke

# Fresh checkout, in-tree build:
cargo build -p template-mcp-server

# Stdio (what the agent's supervisor sees):
echo '{"jsonrpc":"2.0","method":"initialize","params":{},"id":1}' | \
  ./target/debug/template-mcp-server

# HTTP for direct curl/claude-mcp testing:
MCP_TEMPLATE_HTTP_BIND=127.0.0.1:7676 \
  cargo run -p template-mcp-server &

curl -s -X POST http://127.0.0.1:7676/mcp \
  -H 'Content-Type: application/json' \
  -d '{"jsonrpc":"2.0","method":"initialize","params":{},"id":1}'

What's NOT in the template

These are deliberately out of scope to keep the skeleton small:

Tool with notifications/progress — see crates/mcp/src/server/progress.rs and Phase 76.7 docs.
Tool with notifications/tools/list_changed for hot-reload — the runtime can broadcast it via HttpServerHandle::notify_tools_list_changed(), but the template doesn't ship a sample of when to fire it.
Resources / prompts surface — only tools are wired. See crates/mcp/src/bin/mock_mcp_server.rs for the full resources/list + resources/read shape.
Custom error types — the template returns McpError via ?. Map your provider errors to specific JSON-RPC error codes (-32602, -32603, custom application codes ≥ -32000) when you need clients to distinguish them.

Reference patterns mined for this template

upstream agent CLI — minimal stdio MCP server in the prior agent CLI source. The upstream CLI uses imperative setRequestHandler(SchemaName, async (req) => …) per spec method; we collapse that into one McpServerBuilder::tool(impl Tool) chain.
upstream agent CLI — the upstream factory pattern returning a configured Server. Our build_handler closure plays the same role per transport.
crates/mcp/src/bin/mock_mcp_server.rs — exhaustive in-tree reference for protocol corner cases (initialize, paginate, errors, notifications, resources, sampling). Read it when this template's surface stops being enough.
research/extensions/firecrawl/ — OpenClaw extension layout. Different model (TypeScript provider contracts) but informs the plugin.toml shape.

Gating by env / bins

Both kinds of skills (extension skills under extensions/ and local skills under skills_dir) declare what they need to work. The runtime checks those preconditions at load time and reacts differently depending on skill kind.

The declaration

Both kinds use the same shape. For an extension, it lives in plugin.toml:

[requires]
bins = ["ffmpeg", "ffprobe"]
env  = ["OPENAI_API_KEY"]

For a local skill it lives in the YAML frontmatter of SKILL.md:

---
name: "Whisper transcription"
requires:
  bins: ["ffmpeg"]
  env: ["OPENAI_API_KEY"]
---

Check semantics (source: crates/extensions/src/manifest.rs Requires::missing(), crates/core/src/agent/skills.rs):

bins — each name looked up on $PATH. On Windows also <bin>.exe.
env — each name must be set and non-empty.
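
The check semantics can be sketched as below. This is not the code in manifest.rs: the Requires and Missing structs and the missing function signature are assumptions, the PATH directories and environment lookup are injected so the logic is testable, and the sketch only tests that a file exists (it does not check the executable bit).

```rust
use std::path::PathBuf;

/// Hypothetical mirror of the [requires] block.
pub struct Requires {
    pub bins: Vec<String>,
    pub env: Vec<String>,
}

#[derive(Debug, Default, PartialEq)]
pub struct Missing {
    pub bins: Vec<String>,
    pub env: Vec<String>,
}

/// `path_dirs` are the directories from $PATH; `get_env` looks up one variable.
pub fn missing(
    req: &Requires,
    path_dirs: &[PathBuf],
    get_env: impl Fn(&str) -> Option<String>,
) -> Missing {
    // A bin counts as present if any PATH dir holds it (plus <bin>.exe on Windows).
    let on_path = |bin: &str| {
        path_dirs.iter().any(|dir| {
            dir.join(bin).is_file()
                || (cfg!(windows) && dir.join(format!("{bin}.exe")).is_file())
        })
    };
    Missing {
        bins: req.bins.iter().filter(|b| !on_path(b.as_str())).cloned().collect(),
        // Unset and empty are treated the same way.
        env: req
            .env
            .iter()
            .filter(|name| get_env(name.as_str()).map_or(true, |v| v.is_empty()))
            .cloned()
            .collect(),
    }
}
```

In a real process the inputs would come from std::env::split_paths over the PATH variable and from std::env::var.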

Two reactions, one mechanism

flowchart TD
    CHECK[Requires::missing] --> ANY{missing bin<br/>or env?}
    ANY -->|no| OK[proceed]
    ANY -->|yes| KIND{skill kind}
    KIND -->|extension| WARN[warn<br/>continue<br/>tools still registered]
    KIND -->|local skill| SKIP[warn<br/>skip<br/>not injected into prompt]

Skill kind	On missing preconditions
Extension	Warn log, still spawn + register tools. A subsequent tool call will fail visibly when the bin/env is absent.
Local skill	Warn log, do not inject into the system prompt. The LLM never hears the skill existed.

Why the difference

A local skill is a description the LLM reads and internalizes — "you have a transcription skill, call whisper_transcribe." If the backing binary is missing, the tool call will fail. But the LLM was told the capability exists, so it will keep trying. Not injecting the skill prevents promising capabilities that can't be delivered.

An extension tool is observable: the LLM calls it, gets a concrete error back ("command tesseract not found on PATH"), and can adapt in the same turn. Warn-and-continue is the friendlier behavior — the operator sees the warning and can fix the config without the agent crash-looping.

Where this is logged

Both kinds emit the same structured warn log fields:

WARN skill=weather missing_bins=[] missing_env=[WEATHER_API_KEY]
     "skill disabled: required env vars unset or empty"

WARN extension=docker-api missing_bins=[docker] missing_env=[]
     "extension preflight: declared requires not satisfied (continuing anyway)"

Filter on missing_env or missing_bins to alert proactively.

Pre-deploy verification

Use the CLI:

agent ext doctor --runtime

This runs Requires::missing() for every discovered extension; with --runtime it also spawns each stdio extension and runs the handshake.

For local skills, a failing agent turn logs all skipped skills — a dry run against the smallest scripted input gives you the same signal without needing a separate command.

Reserved env for secrets

Extensions receive a filtered copy of the host's env. Names matching the secret-like patterns below are stripped before spawn (crates/extensions/src/runtime/stdio.rs):

Suffixes: _TOKEN, _KEY, _SECRET, _PASSWORD, _PASSWD, _PWD, _CREDENTIAL, _CREDENTIALS, _PAT, _AUTH, _APIKEY, _BEARER, _SESSION
Substrings: PASSWORD, SECRET, CREDENTIAL, PRIVATE_KEY

Declaring an env in requires.env whitelists it past the blocklist. That's the only supported way for an extension to receive a secret env var. Gating and whitelisting come from the same field — preconditions you declare travel alongside the value you want.
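
The interaction between the blocklist and requires.env can be sketched like this. The pattern lists are copied from the text above; the function names are hypothetical and the case-insensitive comparison is an assumption, since the text does not say how casing is handled.

```rust
const SUFFIXES: &[&str] = &[
    "_TOKEN", "_KEY", "_SECRET", "_PASSWORD", "_PASSWD", "_PWD", "_CREDENTIAL",
    "_CREDENTIALS", "_PAT", "_AUTH", "_APIKEY", "_BEARER", "_SESSION",
];
const SUBSTRINGS: &[&str] = &["PASSWORD", "SECRET", "CREDENTIAL", "PRIVATE_KEY"];

/// True when the variable name matches a secret-like pattern.
/// Assumption: matching is done on the upper-cased name.
pub fn is_secret_like(name: &str) -> bool {
    let upper = name.to_ascii_uppercase();
    SUFFIXES.iter().any(|s| upper.ends_with(*s))
        || SUBSTRINGS.iter().any(|s| upper.contains(*s))
}

/// Env handed to the child: secret-like names are stripped unless the
/// extension declared them in requires.env.
pub fn filter_env(
    host: &[(String, String)],
    requires_env: &[String],
) -> Vec<(String, String)> {
    host.iter()
        .filter(|(k, _)| requires_env.contains(k) || !is_secret_like(k))
        .cloned()
        .collect()
}
```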

Write-gating in practice

Some shipped extensions gate destructive operations behind dedicated flags — separate from requires.env:

Extension	Write gate env var
`docker-api`	`DOCKER_API_ALLOW_WRITE`
`proxmox`	`PROXMOX_ALLOW_WRITE`
`onepassword`	`OP_ALLOW_REVEAL` (reveal vs metadata-only)
`google`	`GOOGLE_ALLOW_SEND`, `GOOGLE_ALLOW_CALENDAR_WRITE`, `GOOGLE_ALLOW_DRIVE_WRITE`, `GOOGLE_ALLOW_TASKS_WRITE`, `GOOGLE_ALLOW_PEOPLE_WRITE`

These are not handled by the generic gating layer — the extension reads them itself and refuses destructive methods when unset. Good pattern to adopt when your own extension wraps an API with destructive endpoints.

Gotchas

Empty env counts as missing. EXAMPLE_KEY= is treated the same as EXAMPLE_KEY unset. This is intentional — empty strings rarely mean "use the default" for a secret.
requires.bins checks $PATH at discovery. A binary installed after the agent starts won't be picked up until restart — or until you run agent ext doctor --runtime as a secondary gate.
Local-skill skip is silent to the LLM. If you expected a skill to be present and you don't see it in the system prompt, check the warn logs for the skip reason before debugging agent behavior.

Dependencies — modes and bin versions

A skill that depends on a CLI tool or an environment variable can declare those needs in requires. The runtime resolves the declarations at load time and decides whether to expose the skill, hide it, or expose it with a visible warning the LLM can see.

---
name: ffmpeg-tools
requires:
  bins: [ffmpeg]
  env:  [TRANSCODE_OUTPUT_DIR]
  bin_versions:
    ffmpeg: ">=4.0"
  mode: strict          # default
---

Modes

Mode	When deps are missing	LLM sees the skill?
`strict` (default)	Skill is dropped	No
`warn`	Skill loads with a `> ⚠️ MISSING DEPS …` banner prepended to its body	Yes — with the warning inline
`disable`	Skill is always dropped, even when deps are satisfied	No

Per-agent override

Operators override a skill's declared mode without editing the skill file:

agents:
  - id: kate
    skills: [ffmpeg-tools]
    skill_overrides:
      ffmpeg-tools: warn

Resolution order:

agents.<id>.skill_overrides[<name>] (operator wins)
Skill frontmatter requires.mode
strict (built-in default)
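A minimal sketch of the resolution order and the mode table, assuming hypothetical type and function names (`Mode`, `Exposure`, `resolve_mode`, `exposure`) that are not taken from the codebase:

```rust
#[derive(Debug, Clone, Copy, PartialEq)]
enum Mode {
    Strict,
    Warn,
    Disable,
}

#[derive(Debug, PartialEq)]
enum Exposure {
    Hidden,
    Visible,
    VisibleWithBanner,
}

/// Operator override wins, then skill frontmatter, then the built-in default.
fn resolve_mode(operator_override: Option<Mode>, frontmatter: Option<Mode>) -> Mode {
    operator_override.or(frontmatter).unwrap_or(Mode::Strict)
}

/// What the LLM sees for a given mode and dependency check result.
fn exposure(mode: Mode, deps_satisfied: bool) -> Exposure {
    match (mode, deps_satisfied) {
        (Mode::Disable, _) => Exposure::Hidden, // dropped even when deps are fine
        (_, true) => Exposure::Visible,
        (Mode::Strict, false) => Exposure::Hidden,
        (Mode::Warn, false) => Exposure::VisibleWithBanner,
    }
}
```

In the kate example, the override `warn` beats the skill's declared `strict`, so a missing ffmpeg yields `VisibleWithBanner` rather than `Hidden`.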

Bin versions

requires.bin_versions adds a semver constraint on top of mere bin presence. Failing the constraint is treated like a missing dep — the active mode decides whether to skip or warn.

Constraint syntax

Constraints are semver requirement strings:

Want	Constraint
At least 4.0	`">=4.0"`
Any 4.x compatible release	`"^4.0"`
4.x but no 5	`">=4.0, <5.0"`
Exact 4.2.1	`"=4.2.1"`
Patch-compatible to 5.1.3	`"~5.1.3"`

Versions like 4.2 are normalized to 4.2.0 before comparison so constraint matching works against partial outputs.
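To make the normalization step concrete, here is a hand-rolled sketch that pads partial versions and checks a small subset of comparators (`>=`, `<`, `=`). It is not the runtime's parser: the function names are hypothetical, and caret and tilde constraints are deliberately left to a real semver library.

```rust
/// Parse "4", "4.2" or "4.2.1" into (major, minor, patch), padding with zeros.
fn normalize(version: &str) -> Option<(u64, u64, u64)> {
    let mut parts = version.trim().split('.');
    let major: u64 = parts.next()?.parse().ok()?;
    let minor: u64 = match parts.next() {
        Some(p) => p.parse().ok()?,
        None => 0,
    };
    let patch: u64 = match parts.next() {
        Some(p) => p.parse().ok()?,
        None => 0,
    };
    if parts.next().is_some() {
        return None; // more than three components
    }
    Some((major, minor, patch))
}

/// Check a comma-separated constraint. Returns None when the version or the
/// constraint cannot be handled by this simplified subset.
fn satisfies(found: &str, constraint: &str) -> Option<bool> {
    let found = normalize(found)?;
    for clause in constraint.split(',') {
        let clause = clause.trim();
        let ok = if let Some(v) = clause.strip_prefix(">=") {
            found >= normalize(v)?
        } else if let Some(v) = clause.strip_prefix('<') {
            found < normalize(v)?
        } else if let Some(v) = clause.strip_prefix('=') {
            found == normalize(v)?
        } else {
            return None; // ^ and ~ are out of scope for this sketch
        };
        if !ok {
            return Some(false);
        }
    }
    Some(true)
}
```

Because the probe output `4.2` becomes `(4, 2, 0)` first, it compares cleanly against `>=4.0`.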

Custom probe

Defaults: <bin> --version, regex \d+\.\d+(?:\.\d+)?. Override when a tool emits something idiosyncratic:

requires:
  bin_versions:
    curl:
      constraint: ">=8.0"
      command: "--help"
      regex: 'curl (\d+\.\d+(?:\.\d+)?)'

The shorthand form bin: ">=4.0" and the long form bin: { constraint: …, command: …, regex: … } are both accepted.
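For intuition about what the default pattern `\d+\.\d+(?:\.\d+)?` picks out of a tool's output, the following sketch reproduces its leftmost-match behavior with a manual scan (the standard library has no regex engine). The function name `first_version` is hypothetical, and the real probe presumably uses a regex library instead.

```rust
/// Return the first substring shaped like digits.digits, optionally
/// followed by .digits, scanning left to right.
fn first_version(output: &str) -> Option<&str> {
    let bytes = output.as_bytes();
    // Index just past the run of ASCII digits starting at `i`.
    let digits_from = |mut i: usize| {
        while i < bytes.len() && bytes[i].is_ascii_digit() {
            i += 1;
        }
        i
    };
    let mut start = 0;
    while start < bytes.len() {
        let major_end = digits_from(start);
        if major_end > start && bytes.get(major_end) == Some(&b'.') {
            let minor_end = digits_from(major_end + 1);
            if minor_end > major_end + 1 {
                let mut end = minor_end;
                // Optional third component.
                if bytes.get(end) == Some(&b'.') {
                    let patch_end = digits_from(end + 1);
                    if patch_end > end + 1 {
                        end = patch_end;
                    }
                }
                return Some(&output[start..end]);
            }
        }
        start += 1;
    }
    None
}
```

A `None` result corresponds to the `parse_failed` reason in the table that follows.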

Probe fail modes

Reason	When
`bin_not_found`	Binary not on PATH
`probe_failed`	Spawn errored or timed out (5 s cap)
`parse_failed`	The default regex (or override) didn't match
`constraint_unsatisfied`	Found version doesn't match the constraint
`invalid_constraint`	Constraint string couldn't be parsed as semver

Invalid constraints log at error level; the skill is treated as having a missing dep — boot continues so a typo in one skill doesn't take the whole agent down. Probes are cached process-wide by absolute path so a bin shared across skills only spawns once.

When mode: warn and any dep is missing, the skill body is rendered to the LLM with this prefix:

> ⚠️ MISSING DEPS for skill `ffmpeg-tools`:
>   - bin not found: ffmpeg
>   - env unset: TRANSCODE_OUTPUT_DIR
>   - version mismatch: ffmpeg requires >=4.0 (found 3.4.2)
> Calls into this skill may fail.

The LLM treats this like any other markdown context, so it has the information it needs to either avoid the skill or report a useful error to the user when a tool call fails.
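The banner layout shown above can be reproduced with a small formatter. This is a sketch under assumptions: the `MissingDep` enum and `render_banner` function are hypothetical names, and only the line shapes come from the example banner.

```rust
/// One unmet dependency, mirroring the three line shapes in the banner.
enum MissingDep {
    Bin(String),
    Env(String),
    Version { bin: String, constraint: String, found: String },
}

/// Build the markdown blockquote that is prepended to the skill body.
fn render_banner(skill: &str, missing: &[MissingDep]) -> String {
    let mut out = format!("> ⚠️ MISSING DEPS for skill `{skill}`:\n");
    for dep in missing {
        let line = match dep {
            MissingDep::Bin(bin) => format!("bin not found: {bin}"),
            MissingDep::Env(var) => format!("env unset: {var}"),
            MissingDep::Version { bin, constraint, found } => {
                format!("version mismatch: {bin} requires {constraint} (found {found})")
            }
        };
        out.push_str(&format!(">   - {line}\n"));
    }
    out.push_str("> Calls into this skill may fail.\n");
    out
}
```

Keeping the banner as plain markdown means no special handling is needed on the prompt side.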

Backwards compatibility

Skills without requires.mode, requires.bin_versions, or agents.<id>.skill_overrides keep the prior behavior (strict, no version checks). The defaults are chosen so an unmodified skill catalog and existing agents.yaml continue to work unchanged.

TaskFlow model

TaskFlow is a durable, multi-step flow runtime that survives process restarts and external waits. It's designed for work that spans more LLM turns than a single conversation buffer can hold — approvals, data pipelines, delegated subtasks, scheduled actions.

Source: crates/taskflow/ (types.rs, store.rs, engine.rs).

When to use it

Use TaskFlow when any of the following apply:

A task needs to pause and resume later (hours, days)
Multiple agents collaborate on one outcome
You need a full audit trail of what happened and when
You need recovery from a crash mid-task

If it's a one-shot turn, don't reach for TaskFlow — the runtime's normal session buffer is enough.

Flow shape

A flow is an opaque state_json (free-form JSON) plus metadata:

Field	Purpose
`id`	UUID generated on creation.
`controller_id`	String label identifying the flow definition (e.g. `kate/inbox-triage`).
`goal`	Human-readable statement of intent.
`owner_session_key`	`agent:<id>:session:<session_id>` — hard tenancy gate.
`requester_origin`	Who asked (user id, external system id).
`current_step`	String label for the current phase (`"classify"`, `"await_approval"`, …).
`state_json`	Free-form JSON owned by the flow — the LLM mutates this over time.
`wait_json`	Current wait condition while `status = Waiting`.
`status`	See state machine below.
`cancel_requested`	Sticky flag that forces the next valid transition to `Cancelled`.
`revision`	Monotonic integer; increments on every update. Used for optimistic concurrency.
`created_at` / `updated_at`	Timestamps.

state_json is shallow-merged on updates: a patch { "foo": 1 } replaces only the foo key, everything else is preserved.
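A sketch of what shallow merge means in practice. The `Value` enum is a minimal stand-in for a JSON value (a real implementation would use a JSON library), and `shallow_merge` is a hypothetical name:

```rust
use std::collections::BTreeMap;

/// Minimal stand-in for a JSON value, enough to show nesting.
#[derive(Debug, Clone, PartialEq)]
enum Value {
    Num(i64),
    Text(String),
    Object(BTreeMap<String, Value>),
}

type State = BTreeMap<String, Value>;

/// Replace each top-level key named in the patch; leave all other keys alone.
/// Nested objects are replaced wholesale, never merged key by key.
fn shallow_merge(state: &mut State, patch: State) {
    for (key, value) in patch {
        state.insert(key, value);
    }
}
```

The consequence is that patching a nested object with only one of its keys drops the sibling keys, so callers send the full replacement object for that top-level key.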

State machine

stateDiagram-v2
    [*] --> Created
    Created --> Running: start_running
    Running --> Waiting: set_waiting(condition)
    Waiting --> Running: resume
    Running --> Finished: finish
    Running --> Failed: fail
    Waiting --> Failed: fail
    Created --> Cancelled: cancel
    Running --> Cancelled: cancel
    Waiting --> Cancelled: cancel
    Finished --> [*]
    Failed --> [*]
    Cancelled --> [*]

Terminal states: Finished, Failed, Cancelled. No further transitions allowed.
Sticky cancel: cancel_requested = true forces the next allowed transition to land on Cancelled. The flag survives restart and is idempotent — multiple cancel requests converge on the same outcome.
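The diagram and the two rules above can be written as a pure transition function. This is a sketch, not the engine's code: `Status`, `Op`, and `next_status` are hypothetical names, and applying the sticky flag to every allowed transition is one reading of "forces the next allowed transition to land on Cancelled".

```rust
#[derive(Debug, Clone, Copy, PartialEq)]
enum Status {
    Created,
    Running,
    Waiting,
    Finished,
    Failed,
    Cancelled,
}

#[derive(Debug, Clone, Copy, PartialEq)]
enum Op {
    StartRunning,
    SetWaiting,
    Resume,
    Finish,
    Fail,
    Cancel,
}

/// Returns the next status, or None when the transition is not in the diagram
/// (which includes every transition out of a terminal state).
fn next_status(current: Status, op: Op, cancel_requested: bool) -> Option<Status> {
    use Status::*;
    let target = match (current, op) {
        (Created, Op::StartRunning) => Running,
        (Running, Op::SetWaiting) => Waiting,
        (Waiting, Op::Resume) => Running,
        (Running, Op::Finish) => Finished,
        (Running | Waiting, Op::Fail) => Failed,
        (Created | Running | Waiting, Op::Cancel) => Cancelled,
        _ => return None,
    };
    // Sticky cancel: any allowed transition lands on Cancelled instead.
    Some(if cancel_requested { Cancelled } else { target })
}
```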

Persistence

SQLite-backed via sqlx, pool size 5. Default path ./data/taskflow.db, override with TASKFLOW_DB_PATH.

Tables

CREATE TABLE flows (
  id                  TEXT PRIMARY KEY,
  controller_id       TEXT,
  goal                TEXT,
  owner_session_key   TEXT,
  requester_origin    TEXT,
  current_step        TEXT,
  state_json          TEXT,
  wait_json           TEXT,
  status              TEXT,
  cancel_requested    BOOLEAN,
  revision            INTEGER,
  created_at          INTEGER,
  updated_at          INTEGER
);

CREATE TABLE flow_steps (
  id                  TEXT PRIMARY KEY,
  flow_id             TEXT NOT NULL,
  runtime             TEXT,              -- Managed | Mirrored
  child_session_key   TEXT,
  run_id              TEXT,
  task                TEXT,
  status              TEXT,
  result_json         TEXT,
  created_at          INTEGER,
  updated_at          INTEGER,
  UNIQUE (flow_id, run_id)
);

CREATE TABLE flow_events (
  id          INTEGER PRIMARY KEY AUTOINCREMENT,
  flow_id     TEXT NOT NULL,
  kind        TEXT,
  payload_json TEXT,
  at          INTEGER
);

flows.revision drives optimistic concurrency (see FlowManager).
flow_events is append-only — every transition leaves a trail.
flow_steps.(flow_id, run_id) UNIQUE catches duplicate observations at the DB layer, not in a race-prone check inside the manager.

Wait conditions

Persisted in wait_json while status = Waiting.

enum WaitCondition {
    Timer { at: DateTime<Utc> },                        // auto-resume at time
    ExternalEvent { topic: String, correlation_id: String }, // resume when matching event arrives
    Manual,                                              // resume only via explicit call
}

Condition	Resumed by
`Timer`	`WaitEngine::tick()` when `now >= at`
`ExternalEvent`	`try_resume_external(flow_id, topic, correlation_id, payload)`
`Manual`	`FlowManager::resume(id, patch)` — typically via CLI or a deliberate LLM turn

There is no timeout built into the wait itself. To get one, pair any wait with a Timer fallback (e.g. fan out "wait for approval OR 24 h elapsed") via orchestration in the flow's step logic.

Audit trail

Every transition writes a flow_events row with:

kind: created, started, waiting, resumed, finished, failed, cancelled, state_updated, step_observed, ...
payload_json: contextual data (wait condition, result, reason, step info)
at: timestamp

The audit append happens inside the same SQLite transaction as the state update — you can never see a flow state that doesn't have a matching audit event, even after a crash mid-operation.

Mirrored flows

Beyond Managed flows (owned by FlowManager), you can create Mirrored flows that just observe externally-driven work:

create_mirrored(input) inserts a flow already in Running state
record_step_observation(StepObservation) upserts into flow_steps by (flow_id, run_id) — new observations merge with existing rows
Emits step_observed audit events

Useful for tracking tasks executed elsewhere — a delegation to another agent, a subprocess spawned out-of-band — while keeping one unified audit surface.

FlowManager — the mutation API, revision retry, and agent-facing tools

FlowManager, tools, and CLI

FlowManager owns the mutation API for flows: it wraps the FlowStore with revision-checked atomic updates. This page also covers the pieces built around it: the agent-facing taskflow tool, the WaitEngine, and the agent flow CLI.

Source: crates/taskflow/src/manager.rs, crates/taskflow/src/engine.rs, crates/core/src/agent/taskflow_tool.rs.

Responsibilities

flowchart LR
    subgraph FM[FlowManager]
        CREATE[create_managed<br/>create_mirrored]
        RUN[start_running<br/>set_waiting<br/>resume<br/>finish<br/>fail<br/>cancel]
        PATCH[update_state<br/>request_cancel]
        QUERY[get / list_by_owner / list_by_status / list_steps]
        OBS[record_step_observation]
    end
    FM --> STORE[FlowStore<br/>SQLite]
    FM --> ENG[WaitEngine]
    TOOL[taskflow tool<br/>agent-facing] --> FM
    CLI[agent flow CLI] --> FM
    ENG --> STORE

One manager per store, typically one per process. The same database file can safely be opened by multiple managers as long as each goes through the revision protocol.

Optimistic concurrency

Every mutation follows this loop:

flowchart TD
    START[mutation requested] --> FETCH[fetch current flow]
    FETCH --> APPLY[apply closure:<br/>transition, patch, etc.]
    APPLY --> SAVE[store.update_and_append<br/>WHERE id=? AND revision=?]
    SAVE --> RES{result}
    RES -->|ok| DONE([return updated flow])
    RES -->|RevisionMismatch| REFETCH[refetch + retry]
    REFETCH --> LIMIT{attempts >= 2?}
    LIMIT -->|no| APPLY
    LIMIT -->|yes| ERR([surface RevisionMismatch])

revision is a monotonic integer on every flow
Update runs UPDATE ... WHERE id=? AND revision=? — only one writer wins per revision
Retry budget is 2 attempts (1 fetch + 1 refetch); persistent conflict bubbles up to the caller
Update and audit-event append happen inside a single SQLite transaction — crash mid-operation cannot produce a desync between state and audit trail
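The retry loop above can be sketched against an in-memory store. Everything here is illustrative: `Store`, `compare_and_update`, and `mutate` are hypothetical names, the id is a plain integer rather than a UUID, and the audit-event append is omitted.

```rust
use std::collections::HashMap;

#[derive(Debug, Clone, PartialEq)]
struct Flow {
    id: u32,
    revision: u64,
    step: String,
}

#[derive(Debug, PartialEq)]
enum StoreError {
    NotFound,
    RevisionMismatch,
}

#[derive(Default)]
struct Store {
    flows: HashMap<u32, Flow>,
}

impl Store {
    /// Mirrors `UPDATE ... WHERE id=? AND revision=?`: the write lands only
    /// if nobody else bumped the revision since it was read.
    fn compare_and_update(&mut self, updated: Flow, expected: u64) -> Result<Flow, StoreError> {
        let slot = self.flows.get_mut(&updated.id).ok_or(StoreError::NotFound)?;
        if slot.revision != expected {
            return Err(StoreError::RevisionMismatch);
        }
        *slot = Flow { revision: expected + 1, ..updated };
        Ok(slot.clone())
    }
}

const MAX_ATTEMPTS: usize = 2; // one fetch plus one refetch

/// Fetch, apply the closure, save; refetch and retry once on a conflict.
fn mutate(store: &mut Store, id: u32, apply: impl Fn(&mut Flow)) -> Result<Flow, StoreError> {
    for _ in 0..MAX_ATTEMPTS {
        let mut flow = store.flows.get(&id).cloned().ok_or(StoreError::NotFound)?;
        let seen = flow.revision;
        apply(&mut flow);
        match store.compare_and_update(flow, seen) {
            Err(StoreError::RevisionMismatch) => continue, // refetch and retry
            other => return other,
        }
    }
    Err(StoreError::RevisionMismatch)
}
```

The closure is re-applied to the freshly fetched flow on retry, which is why it should be a pure description of the change rather than something with side effects.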

WaitEngine

Broker-agnostic scheduler. Pull-based tick() advances any flow whose wait condition has fired.

flowchart LR
    TICK[WaitEngine::tick_at] --> SCAN[scan all Waiting flows]
    SCAN --> EVAL{evaluate wait}
    EVAL -->|Timer expired| RESUME1[resume]
    EVAL -->|still future| STAY1[stay waiting]
    EVAL -->|ExternalEvent / Manual| STAY2[stay waiting]
    EVAL -->|cancel_requested| CAN[transition to Cancelled]
    EXT[try_resume_external<br/>topic + correlation_id] --> MATCH{wait condition<br/>matches?}
    MATCH -->|yes| RESUME2[resume + merge payload into<br/>state.resume_event]
    MATCH -->|no| NOOP[no-op]

tick_at(now) — a single scan. Returns a TickReport with counters: scanned, resumed, cancelled, still waiting, errors.
run(interval, shutdown_token) — long-running loop; drive from heartbeat or a dedicated tokio task.
try_resume_external(flow_id, topic, correlation_id, payload) — called by a NATS subscriber or the CLI when an external event arrives; matches against the flow's persisted wait_json and resumes if it fits.

Correlation ids are caller-chosen strings. Typical pattern: when a flow delegates to another agent via agent.route.<target_id>, include the flow's id or a fresh UUID as the correlation id, and have the receiver echo it on reply.
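A simplified sketch of one scan and of the external-event match. Names and shapes are hypothetical: timestamps are plain integers instead of `DateTime<Utc>`, the error counter is left out, and checking `cancel_requested` before the timer is an assumption about ordering.

```rust
#[derive(Debug, Clone, PartialEq)]
enum Wait {
    Timer { at: u64 }, // seconds since epoch, a simplification
    ExternalEvent { topic: String, correlation_id: String },
    Manual,
}

struct WaitingFlow {
    wait: Wait,
    cancel_requested: bool,
}

#[derive(Debug, Default, PartialEq)]
struct TickReport {
    scanned: usize,
    resumed: usize,
    cancelled: usize,
    still_waiting: usize,
}

/// One pull-based scan: only expired timers and cancel intents move a flow.
fn tick_at(now: u64, flows: &[WaitingFlow]) -> TickReport {
    let mut report = TickReport::default();
    for flow in flows {
        report.scanned += 1;
        match &flow.wait {
            _ if flow.cancel_requested => report.cancelled += 1,
            Wait::Timer { at } if now >= *at => report.resumed += 1,
            _ => report.still_waiting += 1, // future timers, external, manual
        }
    }
    report
}

/// An external event resumes a flow only on an exact (topic, correlation_id) match.
fn matches_external(wait: &Wait, topic: &str, correlation_id: &str) -> bool {
    matches!(
        wait,
        Wait::ExternalEvent { topic: t, correlation_id: c }
            if t == topic && c == correlation_id
    )
}
```

Because the scan only reads the persisted wait condition and the clock, running it twice in a row with the same `now` reports the same decisions.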

Agent-facing tool

Single taskflow tool with dispatch by action:

Action	Params	Result
`start`	`controller_id`, `goal`, optional `current_step` (default `"init"`), optional `state`	`{ok, flow}` — auto-transitions Created → Running
`status`	`flow_id`	`{ok, flow}` or `{ok:false, error:"not_found"}`
`advance`	`flow_id`, optional `patch`, optional `current_step`	`{ok, flow}` with merged state
`cancel`	`flow_id`	`{ok, flow}`
`list_mine`	—	`{ok, count, flows: [...]}`

Session tenancy

Every call derives owner_session_key = "agent:<id>:session:<session_id>". The manager rejects any mutation whose owner does not match the flow's owner_session_key and returns a "belongs to a different session" error. Cross-session access from the LLM is not possible.
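As a small illustration of the tenancy gate, assuming hypothetical helper names (`owner_session_key`, `check_owner`) and a plain string error:

```rust
/// Build the key in the documented `agent:<id>:session:<session_id>` shape.
fn owner_session_key(agent_id: &str, session_id: &str) -> String {
    format!("agent:{agent_id}:session:{session_id}")
}

/// Reject a mutation when the caller's key differs from the flow's owner.
fn check_owner(flow_owner: &str, caller: &str) -> Result<(), String> {
    if flow_owner == caller {
        Ok(())
    } else {
        Err("belongs to a different session".to_string())
    }
}
```

Since the key is derived from the call context rather than supplied as a tool parameter, the LLM has no field it could set to impersonate another session.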

Revision hidden from the LLM

The tool fetches the flow before every mutation and uses the live revision internally. The LLM never sees or reasons about revision numbers — fewer tokens, fewer mistakes.

CLI

agent flow list          [--json]
agent flow show <id>     [--json]
agent flow cancel <id>
agent flow resume <id>

list prints a table sorted by updated_at DESC
show prints the flow plus every recorded step
cancel calls manager.cancel(id)
resume is a manual unblock for Manual or ExternalEvent waits — useful in ops / testing when an expected event never arrived

All commands honor TASKFLOW_DB_PATH (default ./data/taskflow.db).

End-to-end example

From crates/taskflow/tests/e2e_test.rs:

// 1. Create + run + park.
let f = manager.create_managed(input).await?;
let f = manager.start_running(f.id).await?;
let f = manager.set_waiting(f.id, json!({"kind": "manual"})).await?;

// 2. Process exits. Reopen the SAME database file from a fresh manager.
let reloaded = manager.get(f.id).await?.unwrap();
assert_eq!(reloaded.status, FlowStatus::Waiting);
assert_eq!(reloaded.state_json["verses_done"], 10);  // partial work survived

// 3. Resume picks up where we left off.
let resumed = manager.resume(reloaded.id, None).await?;
assert_eq!(resumed.status, FlowStatus::Running);

Shipped shape of CreateManagedInput:

{
  "controller_id": "kate/inbox-triage",
  "goal": "triage inbox",
  "owner_session_key": "agent:kate:session:abc",
  "requester_origin": "user-1",
  "current_step": "classify",
  "state_json": { "messages": 10, "processed": 0 }
}

There is no YAML flow-definition format — flows are built in code (or driven by the taskflow tool's start action).

Garbage collection

store.prune_terminal_flows(retain_days) deletes flows whose terminal state is older than the retention window. Wire this into a scheduled job when your flows pile up — audit trails accumulate forever otherwise.

Gotchas

state_json is shallow-merged. Nested updates require the caller to build the full replacement object for the key being changed.
revision conflicts get only two attempts (one fetch, one refetch). If two callers fight over a flow continuously, the loser keeps surfacing RevisionMismatch. Treat that as a signal to either serialize at a higher level or have the loser retry at the app layer.
No flow-level mutex. The DB-level UNIQUE (flow_id, run_id) on steps keeps step-observation races safe; revision checks keep mutation races safe. But two observers can read a flow simultaneously — don't rely on read-time consistency for decisions.
wait_json is cleared on resume. If you need to remember the wait condition for audit purposes, the flow_events table has it.

Wait / resume

Durable flows can park themselves between steps. The runtime drives parked flows back to Running on a wall-clock deadline (timer), when an external signal arrives (NATS), or when an operator resumes them by hand (manual).

Two pieces wire this together:

WaitEngine — single global tokio task. Every tick_interval it scans Waiting flows and resumes any whose timer has fired or whose cancel intent has been set.
taskflow.resume bridge — single broker subscriber that translates incoming events into WaitEngine::try_resume_external calls.

Source: crates/taskflow/src/engine.rs, src/main.rs::spawn_taskflow_resume_bridge.

Wait conditions

The wait_json column on a flow stores one of:

Kind	Shape	Resumed by
`timer`	`{kind:"timer", at:"<RFC3339>"}`	`WaitEngine.tick()` once `now >= at`
`external_event`	`{kind:"external_event", topic:"…", correlation_id:"…"}`	`taskflow.resume` bridge with matching `(topic, correlation_id)`
`manual`	`{kind:"manual"}`	Explicit `manager.resume(...)` (CLI / ops)

Timer.at is validated by the tool against taskflow.timer_max_horizon (default 30 days). Past deadlines and topics/correlation_ids that are empty are rejected before the flow ever enters Waiting.
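The guardrails described above might look like the following. This is a sketch with hypothetical names (`Wait`, `Rejected`, `validate`), Unix-second timestamps instead of RFC 3339 strings, and an assumed rule that a deadline equal to "now" is still accepted.

```rust
use std::time::Duration;

#[derive(Debug, PartialEq)]
enum Wait<'a> {
    Timer { at_unix: u64 },
    ExternalEvent { topic: &'a str, correlation_id: &'a str },
    Manual,
}

#[derive(Debug, PartialEq)]
enum Rejected {
    PastDeadline,
    BeyondHorizon,
    EmptyTopic,
    EmptyCorrelationId,
}

/// Documented default for taskflow.timer_max_horizon: 30 days.
const DEFAULT_MAX_HORIZON: Duration = Duration::from_secs(30 * 24 * 60 * 60);

/// Check a wait condition before the flow is allowed to enter Waiting.
fn validate(wait: &Wait<'_>, now_unix: u64, max_horizon: Duration) -> Result<(), Rejected> {
    match wait {
        Wait::Timer { at_unix } if *at_unix < now_unix => Err(Rejected::PastDeadline),
        Wait::Timer { at_unix } if *at_unix - now_unix > max_horizon.as_secs() => {
            Err(Rejected::BeyondHorizon)
        }
        Wait::ExternalEvent { topic, .. } if topic.is_empty() => Err(Rejected::EmptyTopic),
        Wait::ExternalEvent { correlation_id, .. } if correlation_id.is_empty() => {
            Err(Rejected::EmptyCorrelationId)
        }
        _ => Ok(()),
    }
}
```

Validating before persisting means a malformed wait never reaches the WaitEngine, which can then assume every stored condition is well formed.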

Tool actions

The taskflow tool exposes the LLM-facing surface. Beyond the existing start | status | advance | cancel | list_mine, three actions drive the wait/resume lifecycle:

`wait`

{
  "action": "wait",
  "flow_id": "…uuid…",
  "wait_condition": {"kind": "timer", "at": "2026-04-26T09:00:00Z"}
}

Move flow Running → Waiting. Validates wait_condition shape and guardrails before persisting.

`finish`

{
  "action": "finish",
  "flow_id": "…uuid…",
  "final_state": {"result": "ok"}
}

Move flow → Finished. final_state (optional) is shallow-merged into state_json before transition.

`fail`

{
  "action": "fail",
  "flow_id": "…uuid…",
  "reason": "downstream-error"
}

Move flow → Failed. reason is required. The reason is stamped under state_json.failure.reason and recorded in the audit event.

NATS resume bridge

A single subscriber lives at taskflow.resume. Anything that wants to wake a parked flow publishes a JSON message there:

{
  "flow_id": "f5e0…",
  "topic": "agent.delegate.reply",
  "correlation_id": "corr-42",
  "payload": {"answer": 42}
}

The bridge calls WaitEngine::try_resume_external(flow_id, topic, correlation_id, payload). If the flow is Waiting with a matching external_event condition, it resumes; the payload (if any) is merged into state_json.resume_event. Mismatches and unknown flow ids are ignored, leaving only a debug-level log line.

Example with the nats CLI:

nats pub taskflow.resume '{
  "flow_id": "f5e0…",
  "topic": "agent.delegate.reply",
  "correlation_id": "corr-42",
  "payload": {"answer": 42}
}'

Single subject (no flow_id in suffix) is intentional — it keeps the subject namespace flat and avoids per-flow subscription churn. Volume is expected to be low (<10/s); if that ever changes, the bridge can shard internally without protocol changes.

Configuration

config/taskflow.yaml (optional; absent → defaults):

tick_interval: 5s        # WaitEngine cadence
timer_max_horizon: 30d   # max future Timer.at allowed by tool
db_path: ./data/taskflow.db   # also honored via TASKFLOW_DB_PATH

agents.yaml enables the tool per agent:

agents:
  - id: kate
    plugins: [taskflow, memory]

Without taskflow in plugins, the agent does not see the tool — the engine and bridge still run process-wide.

Tick interval guidance

5s (default) is plenty for human-scale timers.
Bring it down to 1s only if you have sub-minute timers and care about the worst-case lag.
The tick is idempotent and pull-based; missing a tick is harmless.

Telemetry

Each tick logs at debug level when scanned > 0:

DEBUG wait engine tick scanned=3 resumed=1 cancelled=0 still_waiting=2 errors=0

The bridge logs at info on each successful resume:

INFO taskflow resumed via NATS flow_id=… topic=…

Identity & workspace

Every agent has a workspace directory — a small set of markdown files that describe who it is, what it knows, and how it's meant to behave. The runtime loads those files at session start and injects them into the system prompt. The agent reads them; some of them, the agent also writes back to.

Source: crates/core/src/agent/workspace.rs, crates/core/src/agent/self_report.rs.

Workspace files

<workspace>/
├── IDENTITY.md        # 10.1 — persona facts (name, vibe, emoji)
├── SOUL.md            # 10.2 — prompt-like character document
├── USER.md            # who the human is (if single-user)
├── AGENTS.md          # peers this agent knows about
├── MEMORY.md          # 10.3 — self-curated facts index
├── DREAMS.md          # dreaming diary (10.6)
├── notes/             # per-day notes
└── .git/              # 10.9 — per-agent repo for forensics

Configured per agent:

agents:
  - id: kate
    workspace: ./data/workspace/kate
    workspace_git:
      enabled: true

IDENTITY.md (phase 10.1)

Short, structured. Five optional fields parsed from a markdown bullet list:

- **Name:** Kate
- **Creature:** octopus
- **Vibe:** warm but sharp
- **Emoji:** 🐙
- **Avatar:** https://.../kate.png

The parser:

Silently skips template placeholders in parens (e.g. _(pick something)_) so the bootstrap template never leaks into the persona
Produces an AgentIdentity { name, creature, vibe, emoji, avatar } struct, all fields Option<String>
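A sketch of how such a parser could work. Only the struct shape comes from the description above; the function name `parse_identity`, the exact placeholder test (values wrapped in `_(` and `)_`), and case-insensitive labels are assumptions.

```rust
#[derive(Debug, Default, PartialEq)]
struct AgentIdentity {
    name: Option<String>,
    creature: Option<String>,
    vibe: Option<String>,
    emoji: Option<String>,
    avatar: Option<String>,
}

/// Parse `- **Label:** value` bullets; unknown labels and placeholders are skipped.
fn parse_identity(markdown: &str) -> AgentIdentity {
    let mut identity = AgentIdentity::default();
    for line in markdown.lines() {
        let Some(rest) = line.trim().strip_prefix("- **") else { continue };
        let Some((label, value)) = rest.split_once(":**") else { continue };
        let value = value.trim();
        // Template placeholders such as `_(pick something)_` never become persona facts.
        if value.is_empty() || (value.starts_with("_(") && value.ends_with(")_")) {
            continue;
        }
        let slot = match label.trim().to_ascii_lowercase().as_str() {
            "name" => &mut identity.name,
            "creature" => &mut identity.creature,
            "vibe" => &mut identity.vibe,
            "emoji" => &mut identity.emoji,
            "avatar" => &mut identity.avatar,
            _ => continue,
        };
        *slot = Some(value.to_string());
    }
    identity
}
```

Since every field is optional, a freshly bootstrapped template parses to an all-`None` identity instead of an error.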

Rendered into the system prompt as a single line:

# IDENTITY
name=Kate, emoji=🐙, vibe=warm but sharp

SOUL.md (phase 10.2)

Free-form markdown. No parsing. Injected verbatim after the IDENTITY block. This is where long-form character, operating principles, tone, and hard rules live.

Loaded on every session start. Main and shared sessions both see SOUL.md — the privacy boundary is MEMORY.md, not SOUL.md (shared groups should never leak private memories, but the persona is fine to surface).

MEMORY.md (phase 10.3)

The agent's self-curated index of things it remembers. Markdown sections with bullet lists — no special schema:

## People

- Luis prefers Spanish but is fine switching to English.
- Ana uses a Samsung, not an iPhone.

## Dreamed 2026-04-23 03:00 UTC

- User's timezone is America/Bogota _(score=0.42, hits=5, days=3)_
- Prefers short replies on WhatsApp _(score=0.38, hits=4, days=2)_

## Open questions

- What phone carrier does Luis use?

Scope rules:

Loaded only in main (DM-style) sessions. Group and broadcast sessions never see MEMORY.md — per-user facts must not leak into multi-user chats.
Appended automatically by dreaming sweeps (Phase 10.6)
Truncation: each file is capped at 12 000 chars, and the whole workspace at a total budget of 60 000 chars. Files that exceed the cap get a [truncated] marker.
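One way the two limits could interact is sketched below. The numbers and the marker text come from the rule above; the rest is assumed for illustration: files are processed in order, the total budget shrinks as files are loaded, lengths are counted in chars, and the marker is appended after the cut.

```rust
const PER_FILE_CAP: usize = 12_000;
const TOTAL_BUDGET: usize = 60_000;
const MARKER: &str = "[truncated]";

/// Keep at most `max_chars` characters and flag the cut with the marker.
fn truncate_chars(text: &str, max_chars: usize) -> String {
    if text.chars().count() <= max_chars {
        return text.to_string();
    }
    let mut out: String = text.chars().take(max_chars).collect();
    out.push_str(MARKER);
    out
}

/// Apply the per-file cap and the shared budget to (name, body) pairs in order.
fn apply_budget(files: &[(&str, &str)]) -> Vec<(String, String)> {
    let mut remaining = TOTAL_BUDGET;
    files
        .iter()
        .map(|(name, body)| {
            let allowed = PER_FILE_CAP.min(remaining);
            let kept = body.chars().count().min(allowed);
            remaining -= kept;
            (name.to_string(), truncate_chars(body, allowed))
        })
        .collect()
}
```

Under these assumptions, five files at the per-file cap use up the whole budget, so a sixth is reduced to the marker alone.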

USER.md and AGENTS.md

USER.md — who this agent is talking to. Loaded in main sessions only.
AGENTS.md — which peers this agent can delegate to. Pairs with allowed_delegates in agents.yaml.

Both are free-form markdown read into the prompt.

Transcripts (phase 10.4)

Per-session, append-only JSONL files in transcripts_dir:

{"type":"session","version":1,"id":"<uuid>","timestamp":"2026-04-24T...","agent_id":"kate","source_plugin":"telegram"}
{"type":"entry","timestamp":"...","role":"user","content":"hello","message_id":"...","source_plugin":"telegram","sender_id":"user123"}
{"type":"entry","timestamp":"...","role":"assistant","content":"hello Luis","source_plugin":""}

One file per session at <transcripts_dir>/<session_id>.jsonl
No time-based rotation (session close = file close)
First line is a session header with metadata, every subsequent line is a turn

Transcripts are write-only from the runtime's point of view — they're for replay, audit, and human review, not read-back into the prompt.

Self-report tools (phase 10.8)

Four tools let the agent inspect its own state:

Tool	Returns	Use
`who_am_i`	`{agent_id, model, workspace_dir, identity{…}, soul_excerpt}`	When asked "who are you?"
`what_do_i_know`	`{sections: [{heading, bullets}], truncated}` with optional filter	Search MEMORY.md by section name
`my_stats`	`{sessions_total, memories_stored, memories_promoted, last_dream_ts, recall_events_7d, top_concept_tags_7d, workspace_files_present}`	Meta-awareness
`session_logs`	`{ok, sessions/entries/hits, …}` — actions: `list_sessions`, `read_session`, `search`, `recent`	Inspect own JSONL transcripts for self-reflection, debugging, cross-session search

The first three return concise JSON designed for the LLM to consume in one turn. Soul excerpt in who_am_i is truncated to 2 048 chars; what_do_i_know caps at 6 144 bytes serialized with at most 10 bullets per section.

session_logs is registered automatically when the agent has a non-empty transcripts_dir. It is scoped to that directory — agents cannot read each other's transcripts. Default limits: 50 entries per call (max 500), 200 chars per content preview (max 4 000). When recent is invoked without session_id, it defaults to the current session. If the agent's allowed_tools patterns exclude session_logs, it is filtered after registration like every other tool.

Load flow

flowchart TD
    SESSION[new session] --> LOADER[WorkspaceLoader.load scope]
    LOADER --> SCOPE{scope}
    SCOPE -->|Main| FULL[load IDENTITY + SOUL + USER +<br/>AGENTS + daily notes + MEMORY]
    SCOPE -->|Shared| SHARED[load IDENTITY + SOUL +<br/>AGENTS only]
    FULL --> TRUNC[enforce 12k/file, 60k total]
    SHARED --> TRUNC
    TRUNC --> RENDER[render_system_blocks<br/>into prompt]
    RENDER --> PROMPT[# IDENTITY<br/># SOUL<br/># USER<br/># AGENTS<br/># MEMORY]

MEMORY.md — write cadence and promotion rules
Dreaming — how sleeps turn recall signals into MEMORY.md entries

MEMORY.md + recall signals + workspace-git

This page covers how the agent's knowledge evolves over time: the MEMORY.md index, the recall signals that drive dreaming, how concept tags are derived, and how the workspace-git repo captures a full audit history.

For the underlying storage mechanics (tables, queries, vector index), see Memory — long-term.

What goes where

flowchart LR
    subgraph DB[SQLite data/memory.db]
        MEM[memories]
        FTS[memories_fts]
        REC[recall_events]
        PROM[memory_promotions]
    end
    subgraph WS[workspace dir]
        MD[MEMORY.md]
        DRM[DREAMS.md]
        GIT[.git]
    end

    TOOL[memory.remember] --> MEM
    TOOL --> FTS
    MEM -. recall hits .-> REC
    REC --> DRM2[dream sweep]
    DRM2 --> PROM
    DRM2 --> MD
    DRM2 --> DRM
    CHK[forge_memory_checkpoint] --> GIT
    DRM2 --> GIT

Three layers (SQLite tables, workspace markdown files, and the workspace git repo), each with a different update cadence:

Layer	Write trigger	Consumer
`memories` table	Agent calls `memory.remember`	Next turn's `memory.recall`
`recall_events` table	Every `memory.recall` hit	Dream sweep (10.6)
`memory_promotions` table	Promotion during dream	Prevents double-promote across sweeps
`MEMORY.md`	Dream sweep (10.6)	Next session's system prompt (main scope only)
`DREAMS.md`	Dream sweep (10.6)	Historical diary for humans + `my_stats`
`.git`	Dream finish, session close, `forge_memory_checkpoint`	`memory_history` tool, post-mortem via `git log`

Recall signals (phase 10.5)

The recall_events table captures every hit of memory.recall:

CREATE TABLE recall_events (
  id         INTEGER PRIMARY KEY AUTOINCREMENT,
  agent_id   TEXT,
  memory_id  TEXT,
  query      TEXT,  -- the search string that surfaced this memory
  score      REAL,  -- relevance score from the recall call
  ts_ms      INTEGER
);

Aggregation over a per-memory window produces the signals struct consumed by dreaming:

Signal	Meaning
`frequency`	Log-normalized count of hits
`relevance`	Mean score across hits
`recency`	Exponential decay from last-hit timestamp
`diversity`	Distinct query strings, normalized (saturates at 5+)
`recall_count`	Raw hit count — used by gates
`unique_days`	Distinct UTC days the memory was surfaced

Each signal is weighted and summed into the score that drives promotion (see Dreaming).

Concept tags (phase 10.7)

Every memory row has a concept_tags JSON column populated at insert time — not via TF-IDF but via a deterministic pipeline:

Glossary match. Hard-coded list of protected tech terms (multilingual) — backup, openai, migration, etc.
Compound tokens. Regex preserves file paths and identifiers (src/main.rs, camelCaseNames).
Unicode word segmentation. UAX #29 word boundaries split the rest.
Per-token rules:
- NFKC normalization + lowercase
- 32-char max; 3-char min for Latin, 2-char min for CJK
- Reject pure digits, ISO dates, and 100+ shared stop-words across English, Spanish, and path noise
- Underscores converted to dashes

Output capped at 8 tags per memory. Stored as JSON array on the memories row; expanded into keyword recall searches as part of the FTS5 MATCH query.
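The per-token rules and the 8-tag cap can be illustrated with a heavily simplified sketch. It is not the shipped pipeline: whitespace splitting stands in for UAX #29 segmentation, NFKC normalization, the glossary, compound-token handling, and ISO-date rejection are omitted, the stop-word list is a tiny placeholder, and de-duplication is an assumption.

```rust
const MAX_TAGS: usize = 8;
const MAX_LEN: usize = 32;
/// Placeholder list; the real one is described as 100+ shared stop-words.
const STOP_WORDS: &[&str] = &["the", "and", "for", "with", "que", "para"];

/// Rough CJK test: CJK Unified Ideographs, Hiragana, Katakana.
fn is_cjk(c: char) -> bool {
    ('\u{4E00}'..='\u{9FFF}').contains(&c) || ('\u{3040}'..='\u{30FF}').contains(&c)
}

/// Apply the per-token rules; None means the token is rejected.
fn normalize_token(raw: &str) -> Option<String> {
    let token = raw.to_lowercase().replace('_', "-");
    let len = token.chars().count();
    let min_len = if token.chars().any(is_cjk) { 2 } else { 3 };
    if len < min_len || len > MAX_LEN {
        return None;
    }
    if token.chars().all(|c| c.is_ascii_digit()) {
        return None; // pure digits
    }
    if STOP_WORDS.iter().any(|w| *w == token) {
        return None;
    }
    Some(token)
}

/// Derive up to MAX_TAGS distinct tags, in order of first appearance.
fn derive_tags(text: &str) -> Vec<String> {
    let mut tags: Vec<String> = Vec::new();
    for raw in text.split_whitespace() {
        let raw = raw.trim_matches(|c: char| !c.is_alphanumeric() && c != '_');
        if let Some(tag) = normalize_token(raw) {
            if !tags.contains(&tag) {
                tags.push(tag);
            }
        }
        if tags.len() == MAX_TAGS {
            break;
        }
    }
    tags
}
```

The same input always yields the same tags, which is the property the deterministic pipeline is chosen for.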

Dream sweeps backfill tags for older memories that were created before the tagging pipeline existed.

MEMORY.md write cadence

Dreaming sweeps append blocks:

## Dreamed 2026-04-24 03:00 UTC

- Luis lives in Bogota and prefers Spanish _(score=0.42, hits=5, days=3)_
- Kate should default to short WhatsApp replies _(score=0.38, hits=4, days=2)_

One block per sweep
Promoted memories shown as bullets with score, hit count, unique days
Existing sections preserved; the file is only ever appended to (manual editing by humans is fine — the dream sweep appends a new block rather than rewriting anything)

Privacy rules:

MEMORY.md is injected into main-scope sessions only. Groups / broadcasts never see it.
transcripts_dir is separate from workspace and is not committed to workspace-git by default.

Workspace-git (phase 10.9)

When workspace_git.enabled: true, the agent's workspace directory is a git repo. Commits happen automatically at three moments:

flowchart LR
    T1[dream sweep finishes] --> C[commit_all promote]
    T2[session close<br/>on_expire callback] --> C2[commit_all session-close]
    T3[forge_memory_checkpoint<br/>tool call] --> C3[commit_all checkpoint:note]
    C --> LOG[.git history]
    C2 --> LOG
    C3 --> LOG

Mechanics (crates/core/src/agent/workspace_git.rs):

Staged: every non-ignored file (respects auto-generated .gitignore)
Skipped: files larger than 1 MiB (MAX_COMMIT_FILE_BYTES)
Idempotent: no-op commit when the tree is clean
Author: {agent_id} <agent@localhost> (configurable via workspace_git.author_name / author_email)
Auto .gitignore excludes transcripts/, media/, *.tmp, *.swp, .DS_Store
No remote configured by default; operators add one if forensic archival matters

Tools that touch git

Tool	Purpose	Returns
`forge_memory_checkpoint(note)`	Commit right now with `checkpoint: <note>` subject	`{ok, oid(short), subject, skipped}`
`memory_history(limit?, include_diff?)`	`git log` of the last `limit` commits (max 100); optional unified diff oldest→HEAD	`{commits: [...], diff?}`

Good uses of explicit checkpoints:

Before a risky update sequence the agent is about to perform
After receiving a non-obvious instruction from the user
As bookends around a taskflow step boundary

Gotchas

MEMORY.md can grow unbounded over years. Workspace-git keeps the full history, but the in-prompt view is truncated at the 12 000-char per-file cap. Keep an eye on size, and prune old ## Dreamed blocks if they stop being useful.
Concept-tag derivation is deterministic per content. Editing a memory's content in-place does not re-derive tags — the tags that were computed at insert stick. Re-insert to refresh.
git log replays tell the truth. If you're debugging a surprising agent behavior, calling memory_history with include_diff enabled is the fastest way to see what the agent wrote to itself and when.

Dreaming

"Dreaming" is a scheduled offline sweep that consolidates an agent's memory. It reads recall signals, scores each memory that was recently surfaced, promotes the strongest ones into MEMORY.md, and commits the workspace-git repo.

Source: crates/core/src/agent/dreaming.rs.

When it runs

# agents.yaml
agents:
  - id: kate
    heartbeat:
      enabled: true
      interval: 30s
    dreaming:
      enabled: false
      interval_secs: 86400        # 24 h
      min_score: 0.35
      min_recall_count: 3
      min_unique_queries: 2
      max_promotions_per_sweep: 20
      weights:
        frequency: 0.24
        relevance: 0.30
        recency: 0.15
        diversity: 0.15
        consolidation: 0.10

Dreaming is heartbeat-driven: it ticks inside the heartbeat loop and actually sweeps when interval_secs has elapsed since the last sweep. Disable the heartbeat and dreaming stops firing.

Default interval_secs: 86400 (24 hours). Run nightly or tune down for high-throughput agents.

Three phases (Light / REM / Deep)

The three-phase model is borrowed conceptually from the OpenClaw design; nexo-rs currently ships two of the phases, Light → Deep:

flowchart LR
    START[sweep tick] --> LIGHT[Light:<br/>gather memories with<br/>>=1 recall event]
    LIGHT --> DEEP[Deep:<br/>score + gate + promote]
    DEEP --> WRITE[append MEMORY.md block]
    WRITE --> DIARY[append DREAMS.md entry]
    DIARY --> GIT[commit workspace]

(REM — thematic summarization with an LLM — is intentionally deferred.)

Scoring

For each candidate memory:

score = w.frequency × frequency
      + w.relevance × relevance
      + w.recency   × recency
      + w.diversity × diversity
      + w.consolidation × consolidation

Each signal (frequency, relevance, recency, diversity, consolidation) is derived from recall_events.

Consolidation is a modest bias toward memories that recurred in diverse queries over multiple days — taking the memory from "hit once" to "actually load-bearing."
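The weighted sum above can be sketched in a few lines of Rust. The struct and function names are hypothetical, and the sketch assumes every signal is already normalized to the range 0.0..=1.0 (the source does not state the normalization). The weight values are the ones from the agents.yaml example.

```rust
/// Per-signal weights; field names mirror the `weights:` YAML block.
#[derive(Clone, Copy)]
struct Weights {
    frequency: f64,
    relevance: f64,
    recency: f64,
    diversity: f64,
    consolidation: f64,
}

/// Per-memory signals, assumed to be normalized to 0.0..=1.0.
#[derive(Clone, Copy)]
struct Signals {
    frequency: f64,
    relevance: f64,
    recency: f64,
    diversity: f64,
    consolidation: f64,
}

/// Weights from the example config earlier on this page.
fn default_weights() -> Weights {
    Weights { frequency: 0.24, relevance: 0.30, recency: 0.15, diversity: 0.15, consolidation: 0.10 }
}

/// Weighted composite score: one product per signal, summed.
fn score(w: &Weights, s: &Signals) -> f64 {
    w.frequency * s.frequency
        + w.relevance * s.relevance
        + w.recency * s.recency
        + w.diversity * s.diversity
        + w.consolidation * s.consolidation
}
```

Under the normalization assumption, the example weights sum to 0.94, so that is the highest score a memory can reach; keep this in mind when raising min_score.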

Gates

A candidate is promoted only if all of these hold:

Gate	Default	Meaning
`recall_count >= min_recall_count`	3	Surfaced at least 3 times
`unique_days >= 1`	1	Not all hits on the same day
`distinct_queries >= min_unique_queries`	2	More than one query style hit it
`score >= min_score`	0.35	Weighted composite over the threshold
`!is_promoted(memory_id)`	—	Not already promoted in a prior sweep

Up to max_promotions_per_sweep (default 20) promoted per run; ordered by descending score.
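A minimal sketch of the gate table plus the "descending score, capped" selection. Candidate, Gates, and both function names are hypothetical; the thresholds mirror the documented config keys.

```rust
/// One scored candidate memory (hypothetical shape for this sketch).
#[derive(Debug, Clone, PartialEq)]
struct Candidate {
    memory_id: u64,
    recall_count: u32,
    unique_days: u32,
    distinct_queries: u32,
    score: f64,
    already_promoted: bool,
}

/// Thresholds taken from the `dreaming:` config block.
struct Gates {
    min_recall_count: u32,
    min_unique_queries: u32,
    min_score: f64,
    max_promotions_per_sweep: usize,
}

/// A candidate is promotable only if every gate holds.
fn passes_gates(c: &Candidate, g: &Gates) -> bool {
    c.recall_count >= g.min_recall_count
        && c.unique_days >= 1
        && c.distinct_queries >= g.min_unique_queries
        && c.score >= g.min_score
        && !c.already_promoted
}

/// Filter by gates, order by descending score, keep at most the cap.
fn select_promotions(mut candidates: Vec<Candidate>, g: &Gates) -> Vec<Candidate> {
    candidates.retain(|c| passes_gates(c, g));
    candidates.sort_by(|a, b| b.score.total_cmp(&a.score));
    candidates.truncate(g.max_promotions_per_sweep);
    candidates
}
```

Sorting before truncating is what guarantees the cap drops the weakest passing candidates rather than arbitrary ones.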

Outputs

`MEMORY.md` append

## Dreamed 2026-04-24 03:00 UTC

- Luis lives in Bogota and prefers Spanish _(score=0.42, hits=5, days=3)_
- Kate should default to short WhatsApp replies _(score=0.38, hits=4, days=2)_

Only memories promoted this sweep appear in the block.
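The block layout shown above is simple enough to sketch as a formatter. This is an illustration under assumptions: the Promotion struct and render_dreamed_block are hypothetical names, and the timestamp is passed in pre-formatted rather than computed.

```rust
/// One promoted memory as it appears in the appended block.
struct Promotion {
    text: String,
    score: f64,
    hits: u32,
    days: u32,
}

/// Renders a "## Dreamed" block in the layout shown in the example.
/// `timestamp_utc` is expected as "YYYY-MM-DD HH:MM".
fn render_dreamed_block(timestamp_utc: &str, promoted: &[Promotion]) -> String {
    let mut out = format!("## Dreamed {} UTC\n\n", timestamp_utc);
    for p in promoted {
        out.push_str(&format!(
            "- {} _(score={:.2}, hits={}, days={})_\n",
            p.text, p.score, p.hits, p.days
        ));
    }
    out
}
```

Because the block is only ever appended, a formatter like this never needs to parse what is already in MEMORY.md.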

`DREAMS.md` diary

A longer-form diary entry the agent can read back in my_stats().last_dream_ts context. One per sweep.

Side effects

memory_promotions row per promoted memory (prevents double-promote across sweeps)
concept_tags backfilled on older memories that were created before the tagging pipeline landed
workspace_git.commit_all("promote", <body with delta>) captures the full change

Idempotency

Re-running a sweep during the same interval is a no-op:

Promotions consult memory_promotions before writing
MEMORY.md is appended to, not rewritten
Git commit returns cleanly with skipped: true when the tree is unchanged

You can safely call a manual "dream now" during a stuck session (currently via restart with a lowered interval_secs) without corrupting state.

Safety rails

Shutdown cancellation. Dream sweeps run under a cancellation token tied to the shutdown sequence. Partial sweeps don't leave inconsistent state: the atomic trio (DB row + MEMORY.md append + git commit) runs after all candidates are scored and gated.
Heartbeat-only. Dreaming never fires from a user message turn, so a long sweep cannot block a user response.
Read-mostly. Sweep reads from recall_events; the only writes are memory_promotions, MEMORY.md append, DREAMS.md append, and git commit. Existing memory rows are untouched except for tag backfill.

What dreaming is not

Not a summarizer. It does not rewrite content.
Not a deduplicator. Two similar memories remain two memories; the recall layer will simply surface both and let the LLM pick.
Not an LLM call. The whole sweep is deterministic — no model inference, no per-sweep cost.

Tuning

Situation	Change
Memories stay too cold to promote	Lower `min_score` (e.g. 0.25)
Too many noise promotions	Raise `min_recall_count` to 5
MEMORY.md grows too fast	Lower `max_promotions_per_sweep`
Very chatty agent	Increase `interval_secs` — 24 h is already safe

Observability

Every sweep emits a summary log line with:

candidates scanned
candidates promoted
skipped (already promoted)
score range of the promoted set
workspace-git commit OID (or "clean tree")

Wire it into Prometheus via log scraping if you want time-series counters — no dedicated metric is exposed yet.

Gotchas

Turning dreaming on with min_score default produces a long first sweep. If the agent has been running for weeks without dreaming, there are a lot of candidates. Expect the first sweep to promote near the cap and subsequent sweeps to tail off.
Concept-tag backfill is O(candidates). Large backlogs will show first-sweep latency proportional to the candidate count. Not a bug — run the first sweep in a maintenance window if the backlog is large.
interval_secs is measured from last completed sweep. A failed sweep does not reset the clock — a retry will fire on the next heartbeat tick regardless.

Two-tier consolidation: light + deep (Phase 80.1)

Everything above describes the light pass — a deterministic scoring sweep that runs on the heartbeat. Phase 80.1 adds a deep pass: a forked subagent that periodically scans transcripts and rewrites the memory directory in-depth. The two pillars complement each other.

Dimension	Light pass (scoring)	Deep pass (fork)
Crate	`crates/core/src/agent/dreaming.rs`	`crates/dream/`
Cadence	Every heartbeat tick	Every 24 h, ≥ 5 transcripts
Cost	~1 SQLite query + ranking	A forked LLM goal, up to 30 turns
Writes	Append to `MEMORY.md`	Rewrite top-level `*.md` files in memory_dir
Failure mode	Returns empty `DreamReport`	Fails the audit row, rolls back the lock
Coordination	Defers when deep pass holds the lock	Acquires lock for the duration of the fork
Reference	Phase 10.6 (existing)	Phase 80.1 (this)

You can run either pass alone or both together. Each is production-safe on its own; when both run, they share the same memory_dir and the deep pass briefly suspends the light pass while it runs (see Coordination below).

Deep pass via fork (Phase 80.1)

The deep pass spawns a forked subagent — a fresh ChatRequest with skip_transcript: true and a 4-phase consolidation prompt — to rewrite memory under a constrained tool whitelist.

Gates (cheapest first)

A turn fires the fork only when all of these hold:

kairos_active == false (KAIROS uses a disk skill, skip to avoid double-fire).
is_remote_mode() == false.
is_auto_memory_enabled() == true.
auto_dream.enabled == true (per-binding YAML).
Time gate: hours_since(last_consolidated_at) ≥ min_hours (default 24 h).
Scan throttle: bail if a scan ran in the last 10 min.
Session gate: ≥ min_sessions transcripts touched since last fork (default 5).
Lock acquire: try_acquire_consolidation_lock() succeeds.

If any gate rejects, the runner returns RunOutcome::Skipped { gate } without firing.
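The "cheapest first" ordering can be sketched as a chain of early returns, with the lock attempt passed in as a closure so it only runs after every cheaper gate has passed. The Gate variant names, the GateInputs struct, and first_rejecting_gate are all hypothetical; only the order and the default thresholds come from the list above.

```rust
/// Which gate rejected the run (hypothetical labels for this sketch).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Gate {
    KairosActive,
    RemoteMode,
    AutoMemoryDisabled,
    AutoDreamDisabled,
    TimeGate,
    ScanThrottle,
    SessionGate,
    LockHeld,
}

/// Inputs to the chain; the real runner reads these from its own state.
struct GateInputs {
    kairos_active: bool,
    remote_mode: bool,
    auto_memory_enabled: bool,
    auto_dream_enabled: bool,
    hours_since_last_consolidation: f64,
    min_hours: f64,
    minutes_since_last_scan: Option<u64>, // None = no scan yet
    sessions_since_last_fork: u32,
    min_sessions: u32,
}

/// Returns the first rejecting gate, or None when the fork may fire.
/// `try_lock` is called only after every cheaper gate has passed.
fn first_rejecting_gate(i: &GateInputs, try_lock: impl FnOnce() -> bool) -> Option<Gate> {
    if i.kairos_active { return Some(Gate::KairosActive); }
    if i.remote_mode { return Some(Gate::RemoteMode); }
    if !i.auto_memory_enabled { return Some(Gate::AutoMemoryDisabled); }
    if !i.auto_dream_enabled { return Some(Gate::AutoDreamDisabled); }
    if i.hours_since_last_consolidation < i.min_hours { return Some(Gate::TimeGate); }
    if matches!(i.minutes_since_last_scan, Some(m) if m < 10) { return Some(Gate::ScanThrottle); }
    if i.sessions_since_last_fork < i.min_sessions { return Some(Gate::SessionGate); }
    if !try_lock() { return Some(Gate::LockHeld); }
    None
}
```

Keeping the lock acquisition last means a rejected run never touches the lock file, so a skipped turn leaves its mtime untouched.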

ConsolidationLock

The lock file lives at <memory_dir>/.consolidate-lock. Single instance per binding (one fork at a time). Properties:

mtime IS lastConsolidatedAt — one stat() per turn is cheaper than reading a separate state file.
Body is the holder's PID. The lock is stale if the PID is dead OR now - mtime ≥ holder_stale (default 1 h).
No heartbeat. If a fork legitimately runs longer than 1 h, raise holder_stale.
try_acquire: write our PID, re-read; if matches → acquired.
rollback(prior_mtime): rewind mtime to pre-acquire. prior == 0 → unlink.

The path is canonicalized at construction so a later symlink swap cannot redirect the lock target.
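The staleness and rollback rules are pure decisions once the inputs are known. The sketch below models only those decisions; it does not touch the filesystem, and the names is_stale, rollback_action, and RollbackAction are hypothetical. Whether the holder PID is alive is passed in as a boolean because the liveness check is platform-specific.

```rust
/// What a rollback should do to the lock file.
#[derive(Debug, PartialEq, Eq)]
enum RollbackAction {
    /// prior == 0: there was no lock before acquire, so remove the file.
    Unlink,
    /// Otherwise rewind the file's mtime to the pre-acquire value (ms).
    SetMtime(u64),
}

/// The lock is stale if the holder PID is dead OR the lock is older
/// than `holder_stale_ms` (default 1 h in the config).
fn is_stale(holder_alive: bool, now_ms: u64, mtime_ms: u64, holder_stale_ms: u64) -> bool {
    !holder_alive || now_ms.saturating_sub(mtime_ms) >= holder_stale_ms
}

/// Maps the recorded prior mtime to the rollback action described above.
fn rollback_action(prior_mtime_ms: u64) -> RollbackAction {
    if prior_mtime_ms == 0 {
        RollbackAction::Unlink
    } else {
        RollbackAction::SetMtime(prior_mtime_ms)
    }
}
```

saturating_sub guards against a clock that moved backwards: a negative age is treated as zero rather than wrapping into a huge value that would look stale.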

4-phase consolidation prompt

The forked subagent runs through:

Orient — read existing MEMORY.md, top-level *.md files, recent transcripts.
Gather — extract candidate facts, decisions, patterns from the sessions since the last consolidation.
Consolidate — rewrite the memory files, merging duplicates, refining wording.
Prune — drop stale entries, keep the index lean.

See crates/dream/src/consolidation_prompt.rs for the full prompt template.

AutoMemFilter (Phase 80.20)

The fork only sees memory-safe tools:

FileRead, Glob, Grep, REPL — unrestricted.
Bash — only when bash_security::is_read_only returns true (~45 read-only utilities: ls, find, grep, cat, stat, wc, head, tail, ...).
FileEdit, FileWrite — only when the path resolves under the agent's canonical memory_dir. Paths outside trigger a structured denial.

Provider-agnostic — the filter runs at the dispatch layer, not the LLM provider layer.
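A rough sketch of the dispatch-layer decision. It makes two loud assumptions: paths are already canonicalized (so a component-wise prefix check is sufficient), and looks_read_only is a toy stand-in for bash_security::is_read_only, which is certainly stricter. All type and function names here are hypothetical.

```rust
use std::path::Path;

#[derive(Debug, PartialEq, Eq)]
enum Decision {
    Allow,
    Deny(&'static str),
}

/// The tool surface the fork can reach (simplified for this sketch).
enum ToolCall<'a> {
    FileRead,
    Glob,
    Grep,
    Repl,
    Bash { command: &'a str },
    FileEdit { path: &'a Path },
    FileWrite { path: &'a Path },
}

/// Toy classifier: a known read-only utility and no shell metacharacters.
/// NOT the real bash_security::is_read_only.
fn looks_read_only(command: &str) -> bool {
    const READ_ONLY: [&str; 7] = ["ls", "grep", "cat", "stat", "wc", "head", "tail"];
    let has_shell_meta = command
        .chars()
        .any(|c| matches!(c, '>' | '<' | '|' | ';' | '&' | '`' | '$'));
    match command.split_whitespace().next() {
        Some(first) => !has_shell_meta && READ_ONLY.contains(&first),
        None => false,
    }
}

/// Assumes `memory_dir` and any write path are already canonical.
fn decide(call: &ToolCall<'_>, memory_dir: &Path) -> Decision {
    match call {
        ToolCall::FileRead | ToolCall::Glob | ToolCall::Grep | ToolCall::Repl => Decision::Allow,
        ToolCall::Bash { command } => {
            if looks_read_only(command) {
                Decision::Allow
            } else {
                Decision::Deny("bash command is not read-only")
            }
        }
        ToolCall::FileEdit { path } | ToolCall::FileWrite { path } => {
            // Path::starts_with compares whole components, so
            // "/m/kate-evil" is not under "/m/kate".
            if path.starts_with(memory_dir) {
                Decision::Allow
            } else {
                Decision::Deny("path outside memory_dir")
            }
        }
    }
}
```

The component-wise comparison is the reason to use Path::starts_with rather than a string prefix check; a string prefix would accept a sibling directory whose name merely begins with the memory directory's name.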

Post-fork escape audit

After a fork completes, the runner re-scans for any FileEdit/Write that landed outside memory_dir (e.g. via a Bash redirect that slipped through). If found, the outcome flips to RunOutcome::EscapeAudit { run_id, escapes, prior_mtime } and the audit row is updated. This is defense-in-depth on top of AutoMemFilter.

Cap

MAX_TURNS = 30. Server-side enforced. The fork is bounded; if the prompt explodes, the cap closes the run with RunOutcome::TimedOut.

Coordination: skip pattern (Phase 80.1.e)

When both passes are enabled, the light pass checks the consolidation lock at the start of run_sweep. If a live PID is holding the lock, the light pass skips entirely:

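// Inside run_sweep: defer entirely while a live deep-pass fork holds the lock.
if let Some(probe) = &self.consolidation_probe {
    if probe.is_live_holder() {
        return Ok(DreamReport {
            deferred_for_fork: true,
            candidates_considered: 0,
            promoted: vec![],
            // Remaining fields elided; this assumes DreamReport implements Default.
            ..Default::default()
        });
    }
}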

The light pass logs:

INFO dreaming agent_id=kate dream sweep deferred — autoDream fork holds consolidation lock

Trade-off: a memory that would have been promoted during the fork window is deferred to the next turn. Memories that score high still score high next turn — recoverable. The cost is at most one turn of latency vs the complexity of a buffer pattern (which we considered and rejected).

The pattern is mutually-exclusive-per-turn: when one writer is active, the other defers entirely. Recoverable on the next turn.

If the light pass runs without the deep pass enabled, the probe is None and the skip arm never fires — original behaviour preserved.

Audit trail

Two artifacts let you reconstruct what every fork did:

SQLite `dream_runs` table (Phase 80.18)

<state_root>/dream_runs.db carries one row per fork run:

Column	Type	Notes
`id`	UUID	Primary key, also the `run_id` echoed to git commits
`goal_id`	UUID	The driver-loop goal that triggered the fork
`status`	enum	`Running` → `Completed` / `Failed` / `Killed` / `LostOnRestart`
`phase`	enum	`Starting` → `Updating` (flips on first FileEdit)
`sessions_reviewing`	int	Count of transcripts the fork looked at
`prior_mtime_ms`	int?	Lock mtime before acquire (for rollback). `Some(0)` is distinct from `None`.
`files_touched`	JSON	Array of `PathBuf` — paths the fork wrote to (deduplicated)
`turns`	JSON	Last `MAX_TURNS = 30` assistant turns. Trimmed server-side.
`started_at`	TS	When the fork acquired the lock
`ended_at`	TS?	When the run reached terminal status
`fork_label`	string	`auto_dream`, `away_summary`, `eval`, ...
`fork_run_id`	UUID?	Optional pointer to `nexo_fork::ForkHandle::run_id`

Defenses: server-side MAX_TURNS = 30 cap, tail clamped at TAIL_HARD_CAP = 1000, idempotent insert on (goal_id, started_at).

Git commits (Phase 80.1.g)

When workspace_git.enabled = true for the binding, every successful fork that touched files lands a commit:

auto_dream: 3 file(s) consolidated

audit_run_id: 7a3b2f00-deaf-cafe-beef-001122334455

- MEMORY.md
- decisions/2026-04.md
- followups.md

Cross-link from git log back to the SQLite row:

$ git -C <workspace> log --grep "auto_dream" --pretty=oneline
<oid> auto_dream: 3 file(s) consolidated
$ nexo agent dream status 7a3b2f00-deaf-cafe-beef-001122334455

The Phase 77.7 secret guard runs transparently before each commit. If a fork somehow wrote a credential, the commit returns Err, a warning is logged, and the audit row stays intact (the audit row is the source of truth; the commit is bonus forensics).

Operator CLI: `nexo agent dream` (Phase 80.1.d)

Three sub-commands. None require a running daemon — they read the SQLite store directly. Read paths use a read-only pool; kill uses a read-write pool plus a filesystem lock-file rewind.

`tail` — list recent runs

$ nexo agent dream tail
# Dream Runs (db: /home/.../state/dream_runs.db)

| ID       | Goal     | Status    | Phase    | Sessions | Files | Started             | Ended               | Label      |
|----------|----------|-----------|----------|----------|-------|---------------------|---------------------|------------|
| 7a3b2f00 | b91c2d3a | Completed | Updating | 5        | 3     | 2026-04-30T10:12:01 | 2026-04-30T10:13:45 | auto_dream |
| f88e1100 | b91c2d3a | Failed    | Starting | 7        | 0     | 2026-04-30T08:00:01 | 2026-04-30T08:00:42 | auto_dream |

2 rows shown (last 20).

Filter by goal, change page size, or get JSON for scripting:

$ nexo agent dream tail --goal=b91c2d3a-... --n=5
$ nexo agent dream tail --json | jq '.[] | select(.status == "Failed") | .id'

Empty / missing DB returns a friendly message and exit 0:

$ nexo agent dream tail
(no dream runs recorded yet — db not found at /home/.../state/dream_runs.db)

`status` — single run detail

$ nexo agent dream status 7a3b2f00-deaf-cafe-beef-001122334455
# Dream Run 7a3b2f00-deaf-cafe-beef-001122334455

- **goal_id**: b91c2d3a-...
- **status**: Completed
- **phase**: Updating
- **sessions_reviewing**: 5
- **fork_label**: auto_dream
- **started_at**: 2026-04-30T10:12:01Z
- **ended_at**: 2026-04-30T10:13:45Z
- **prior_mtime_ms**: 1745939518000

## Files touched (3):
- MEMORY.md
- decisions/2026-04.md
- followups.md

`kill` — abort a running fork

$ nexo agent dream kill 7a3b2f00-... --force --memory-dir=/path/to/memory
[dream-kill] run_id=7a3b2f00-... status was Running, transitioning to Killed
[dream-kill] lock rollback: prior_mtime=1745939518000 → memory_dir=/path/to/memory
[dream-kill] done

Without --force on a Running row, the command warns and exits 2:

[dream-kill] run_id=7a3b2f00-... is still Running. Pass --force to abort.

Without --memory-dir, status flips but the lock is NOT rewound — the next fork tick may see the stale mtime as if a consolidation just completed:

[dream-kill] WARN: status flipped but lock not rolled back. Pass --memory-dir <path> next time to rewind the consolidation lock.

Already-terminal rows are no-op:

[dream-kill] run_id=7a3b2f00-... already in terminal state Completed; nothing to do

Database path resolution

The CLI resolves the dream-runs DB in three tiers:

--db <path> (explicit override, beats everything).
NEXO_STATE_ROOT env → <state_root>/dream_runs.db.
XDG default ~/.local/share/nexo/state/dream_runs.db.

The YAML tier is intentionally absent — agents.state_root is not a config field today (state_root flows into BootDeps directly per Phase 80.1.b.b.b). Set NEXO_STATE_ROOT to align the CLI with your daemon's actual data dir.
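The three-tier resolution can be sketched as a pure function with the inputs injected, which also makes it easy to test. The function name and the treatment of an empty NEXO_STATE_ROOT as "unset" are assumptions; the tier order and paths come from the list above.

```rust
use std::path::{Path, PathBuf};

/// Resolves the dream-runs DB path.
/// `cli_db` is the --db flag, `state_root_env` the value of NEXO_STATE_ROOT,
/// and `home` the user's home directory.
fn resolve_dream_db(cli_db: Option<&Path>, state_root_env: Option<&str>, home: &Path) -> PathBuf {
    // Tier 1: explicit override beats everything.
    if let Some(p) = cli_db {
        return p.to_path_buf();
    }
    // Tier 2: environment (an empty value is treated as unset; an assumption).
    if let Some(root) = state_root_env.filter(|s| !s.is_empty()) {
        return Path::new(root).join("dream_runs.db");
    }
    // Tier 3: XDG default under the home directory.
    home.join(".local/share/nexo/state/dream_runs.db")
}
```

A caller would pass std::env::var("NEXO_STATE_ROOT").ok().as_deref() for the second argument.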

LLM tool: `dream_now` (Phase 80.1.c)

When enabled, the agent itself can force a memory consolidation mid-turn:

{
  "name": "dream_now",
  "description": "Force a memory consolidation pass now, bypassing time/session gates. Use when you've just learned a lot and want it consolidated into long-term memory before continuing.",
  "parameters": {
    "type": "object",
    "properties": {
      "reason": {
        "type": "string",
        "description": "Optional human-readable reason recorded in the audit row."
      }
    },
    "additionalProperties": false
  }
}

The tool returns a structured envelope across all six RunOutcome variants:

{
  "outcome": "completed",
  "run_id": "7a3b2f00-deaf-cafe-beef-001122334455",
  "files_touched": ["MEMORY.md"],
  "duration_ms": 12450,
  "reason": "user just locked in 4 architectural decisions"
}

Other outcomes: skipped (with gate field), lock_blocked (another fork in progress), errored, timed_out, escape_audit.
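For callers that branch on the envelope, the six outcome strings map naturally onto an enum. The sketch below is a simplified stand-in: only Skipped { gate }, TimedOut, and EscapeAudit are variant names confirmed elsewhere on this page, the other three are assumed, and the payload fields are reduced to the minimum.

```rust
/// Simplified mirror of the six outcomes (payloads trimmed for brevity).
enum RunOutcome {
    Completed,
    Skipped { gate: String },
    LockBlocked, // assumed variant name
    Errored,     // assumed variant name
    TimedOut,
    EscapeAudit,
}

/// Maps each variant to the `outcome` string used in the JSON envelope.
fn outcome_label(outcome: &RunOutcome) -> &'static str {
    match outcome {
        RunOutcome::Completed => "completed",
        RunOutcome::Skipped { .. } => "skipped",
        RunOutcome::LockBlocked => "lock_blocked",
        RunOutcome::Errored => "errored",
        RunOutcome::TimedOut => "timed_out",
        RunOutcome::EscapeAudit => "escape_audit",
    }
}
```

An exhaustive match like this makes the compiler flag any new outcome that lacks an envelope label.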

Capability gate (Phase 80.1.c.b)

Two layers must both allow the tool for it to register on a binding's surface:

Host-level: operator must export NEXO_DREAM_NOW_ENABLED=true. Default is deny — without the env var, register_dream_now_tool short-circuits with tracing::info!("dream_now: host-level capability gate closed; tool not registered").
Per-binding: Phase 16 allowed_tools: ["dream_now", ...] must include the tool name on the binding's allowlist.
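The two layers combine as a plain logical AND with deny as the default. A minimal sketch, assuming the host gate opens only on the exact string "true" (the source shows that value but does not say whether other spellings are accepted); the function name is hypothetical.

```rust
/// Returns true only when both the host-level and per-binding layers allow
/// the tool. `host_env` is the value of NEXO_DREAM_NOW_ENABLED, if set.
fn dream_now_registered(host_env: Option<&str>, allowed_tools: &[&str]) -> bool {
    // Host-level: default deny; only an explicit "true" opens the gate
    // (exact-match is an assumption of this sketch).
    let host_open = host_env == Some("true");
    // Per-binding: the tool name must be on the binding's allowlist.
    let binding_open = allowed_tools.contains(&"dream_now");
    host_open && binding_open
}
```

Either layer alone is not enough: an allowlisted binding on a host without the env var still gets no dream_now tool.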

Verify with:

$ nexo setup doctor capabilities
# ... capability table ...
| dream | NEXO_DREAM_NOW_ENABLED | enabled  | Medium | Allow the LLM to force a memory-consolidation pass via the `dream_now` tool. ... |

The capability listing is provider-agnostic. Same gate semantics under Anthropic, MiniMax, OpenAI, Gemini, DeepSeek, xAI, Mistral.

Configuration (Phase 80.1)

# agents.yaml
agents:
  - id: kate
    workspace_git:
      enabled: true        # required for auto_dream → git commits (Phase 80.1.g)
      author_name: "kate"
      author_email: "kate@nexo.local"
    dreaming:
      enabled: true
      interval_secs: 86400
      # ... existing scoring-sweep config from sections above
    auto_dream:
      enabled: true
      min_hours: 24h
      min_sessions: 5
      scan_interval: 10min
      holder_stale: 1h
      fork_timeout: 5min
      memory_dir: null     # null = default <workspace>/.nexo-memory/<agent_id>

Boot logging confirms wiring:

INFO boot.auto_dream agent=kate auto_dream runner registered git_checkpoint_wired=true

Setting auto_dream.enabled = false (or omitting the block entirely) disables the deep pass — the light pass keeps running under dreaming.enabled = true. Setting dreaming.enabled = false turns off the light pass but leaves the deep pass independent.

The autonomous agent — capabilities overview

This page is the bird's-eye map of what an agent running on nexo can actually do without you holding its hand. Every sub-feature has its own page (linked at the end of each section); this page exists so you can see the whole picture without piecing it together from individual reference docs.

"Autonomous" here doesn't mean "AGI". It means: the agent runs in the background, decides when to act on its own schedule, remembers what it has learned, talks to the user through every channel the operator wired (Slack, Telegram, iMessage, email, WhatsApp), approves or escalates risky actions through curated gates, and survives daemon restarts without losing context.

The agent never executes anything the operator didn't authorise in YAML. Every autonomous behaviour is a knob the operator flips on with explicit consent — there are no implicit defaults that ship a user from "ran nexo for the first time" to "the agent is texting my boss".

1. Living in the background

The agent doesn't need a foreground TTY to run.

Session kinds — every running goal carries a SessionKind enum: Interactive (default, attached to a terminal), Bg (detached background goal — nexo agent run --bg <prompt>), Daemon (a long-running goal supervised by the daemon process itself), or DaemonWorker (a child of a daemon).
nexo agent run --bg "<prompt>" spawns a goal, returns the goal_id immediately, detaches. The agent keeps running even after you close the terminal.
nexo agent ps lists running goals filtered by kind; --all includes Interactive. RO SQLite — works without a daemon up.
nexo agent attach <goal_id> renders a markdown snapshot of any goal: kind, status, phase, started_at, finished_at, diff_stat, last decision, last event. Useful to check progress without interrupting.
nexo agent discover lists Running goals filtered to detached / daemon kinds. Pass --include-interactive to broaden.
Reattach on restart — boot flips prior-run Running rows to LostOnRestart and fires notify_origin once per goal so the originating chat sees a clean [abandoned] closure instead of silence.
Drain on SIGTERM — drain_running_goals runs BEFORE plugin teardown so [shutdown] notify_origin actually leaves the channel before the daemon dies. Per-hook 2 s timeout prevents stuck publishers from hanging shutdown.
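The reattach-on-restart step can be pictured as a single pass over the persisted goal rows. This is a sketch with hypothetical types (GoalStatus, GoalRow, mark_lost_on_restart); the real rows live in SQLite and the returned IDs would feed notify_origin.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum GoalStatus {
    Running,
    Completed,
    Failed,
    LostOnRestart,
}

/// In-memory stand-in for a persisted goal row.
struct GoalRow {
    goal_id: u32,
    status: GoalStatus,
}

/// Flips every prior-run Running row to LostOnRestart and returns the goal
/// IDs that need exactly one "[abandoned]" notification.
fn mark_lost_on_restart(rows: &mut [GoalRow]) -> Vec<u32> {
    let mut to_notify = Vec::new();
    for row in rows.iter_mut() {
        if row.status == GoalStatus::Running {
            row.status = GoalStatus::LostOnRestart;
            to_notify.push(row.goal_id);
        }
    }
    to_notify
}
```

Because the status flip and the notification list come from the same pass, running the step a second time finds no Running rows and notifies nobody, which is what "once per goal" requires.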

→ See Background agents (agent run --bg / ps / attach)

2. Memory + self-improvement

The agent learns. Three tiers, each with a different cost / durability trade-off.

Short-term memory — per-session, in RAM, scoped to the current goal. Cheap; gone on goal completion.
Long-term memory — SQLite + sqlite-vec embeddings (crates/memory/src/long_term.rs). Survives restarts; searchable by semantic + lexical query.
Git-backed MEMORY.md — every memory promotion writes a markdown file and commits it to a per-agent git repo. Full history; operator can git log MEMORY.md to audit what the agent decided to remember.

Three self-improvement loops the agent runs without operator intervention:

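Light-pass dreaming — scoring-based consolidation that runs on the heartbeat (every interval_secs, never from a user turn). Cheap, no LLM call; it promotes warm memories using a weighted score over frequency, relevance, recency, diversity, and consolidation.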
Deep-pass autoDream (Phase 80.1) — heavier consolidation via a forked sub-agent with its own 4-phase prompt, runs behind 7 gates: kairos active, time-since-last (default 24 h), session count ≥ 5 transcripts, scan throttle (10 min), live consolidation lock (PID + mtime), force bypass, post-fork escape audit. Deferred for fork (deferred_for_fork: true) when another process holds the lock — promotions land on the next turn rather than racing.
extract_memories (Phase 77.5) — post-turn LLM-driven extraction. After each turn, a small LLM call asks "what surprised you, what did you learn, what should we remember?" and writes structured memory rows.

Defenses:

Secret scanner (Phase 77.7) — regex set blocks Anthropic / OpenAI / GitHub / AWS / Stripe / Google / JWT key shapes before any memory commit. Fails the commit loud.
AutoMemFilter (Phase 80.20) — when a forked sub-agent writes memory, the can_use_tool whitelist locks FileEdit / FileWrite to paths under memory_dir, Bash to read-only classifier (Phase 77.8/77.9 destructive + sed-in-place defenses still apply), REPL unrestricted. Defense-in-depth.
Memdir relevance scorer (Phase 77.6) — relevance × recency × access ranking with age decay so old / unused memories don't inflate the working-memory cost.

→ See Dreaming, Memdir scanner

3. Self-driving execution loop

When the agent receives work, what runs the loop?

Driver-loop (Phase 67) — replaces a single LLM request/response with a multi-turn execution: read context → plan → propose tool calls → run permission gate → execute → inspect results → loop. Goal-scoped, with budget caps on turns + time + tokens. Persists to agent_handles SQLite so every turn survives a daemon restart.
Acceptance autodetect (Phase 75) — at goal completion the loop runs an autodetect pass: cargo build for Rust, pyproject.toml build for Python, npm test for Node, cmake --build for CMake, cargo test --no-run for cargo. Mismatch fails the goal — the agent doesn't claim "done" on a broken build.
Plan mode (Phase 79.1) — EnterPlanMode toggle puts the agent into a read-only mode where it can only call read tools + planning advisors (no Bash, no Write). ExitPlanMode resolves the plan with operator approval and re-enters the full surface.
Sleep { duration_ms, reason } tool (Phase 77.20) — the agent can decide "no work to do for now, wake me in 20 min" without holding a shell process. The runtime intercepts the sentinel result, pauses the goal, and schedules a wake-up with cache-aware timing (≤ 270 s keeps prompt cache warm, ≥ 1200 s amortises a cache miss; avoids the 270-1200 s window that pays the miss without benefit).
Forked sub-agent infra (Phase 80.19) — delegation_tool with mode: { Sync | ForkAndForget }. Cache-safe parameters (system_prompt + user_context + system_context + tool_use_context + fork_context_messages all five must match parent for cache hit). skipTranscript: true keeps the fork's messages out of the parent's history.
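The cache-aware timing rule for the Sleep tool reduces to three bands. The sketch below only classifies a requested duration; how the runtime reacts to a middle-band request is not specified in the list above, so that part is left out. The enum and function names are hypothetical.

```rust
#[derive(Debug, PartialEq, Eq)]
enum SleepBand {
    /// <= 270 s: short enough to keep the prompt cache warm.
    CacheWarm,
    /// >= 1200 s: long enough to amortise the cache miss.
    Amortised,
    /// Between the two: pays the cache miss without the benefit.
    WastefulMiddle,
}

/// Classifies a Sleep { duration_ms, .. } request by its cache impact.
fn classify_sleep(duration_ms: u64) -> SleepBand {
    const WARM_MAX_MS: u64 = 270 * 1000;
    const AMORTISE_MIN_MS: u64 = 1200 * 1000;
    if duration_ms <= WARM_MAX_MS {
        SleepBand::CacheWarm
    } else if duration_ms >= AMORTISE_MIN_MS {
        SleepBand::Amortised
    } else {
        SleepBand::WastefulMiddle
    }
}
```

The "wake me in 20 min" example lands exactly on the 1200 s boundary, so it counts as amortised.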

→ See Acceptance autodetect (deferred), Self-driving guide (deferred)

4. Time-based action

The agent can fire on its own schedule.

Heartbeat (Phase 7) — config-time, per-agent. Every N seconds invoke on_heartbeat(). Used for proactive messages, reminders, periodic state sync.
Cron (Phase 79.7) — LLM-time scheduled fires. The agent itself can call cron_create to schedule a future task; the runtime fires it via LlmCronDispatcher. Up to 50 entries per binding.
Cron jitter cluster (Phase 80.2-80.6) — six knobs:
- enabled — global killswitch.
- recurring_frac — fraction of next-fire interval used as jitter window.
- recurring_cap_ms — absolute cap (5 min default).
- one_shot_max_ms / one_shot_floor_ms — backward lead for one-shots.
- one_shot_minute_mod — modulus gate (mod=0 = never jitter one-shots).
- recurring_max_age_ms — auto-expire old recurring entries (permanent: true exempt).
All knobs are hot-reloadable via Arc<ArcSwap>. jitter_frac_from_entry_id derives the offset from the UUID hex prefix so retries don't move the target.
Boot-time missed-task quarantine — sweep_missed_entries(skew_ms) rewrites overdue next_fire_at to i64::MAX so a long-down daemon doesn't stampede on the next tick.
agent_turn poller (Phase 20) — config-time scheduled LLM turn → channel publish. Provider-agnostic; primary use case is "every morning at 7am, summarise the inbox and post to Slack".
Proactive mode (Phase 77.20) — proactive: { enabled: true, tick_interval_secs, jitter_pct, max_idle_secs } injects a periodic <tick> message into the agent's session. The agent decides whether to act on it or call Sleep. Mutually exclusive with role: coordinator.
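To illustrate why deriving jitter from the entry ID keeps retries stable, here is one way such a derivation could work. It is a guess at the idea, not the implementation of jitter_frac_from_entry_id: the use of the first 8 hex digits, the division by u32::MAX, and both function names are assumptions. The window arithmetic follows the recurring_frac and recurring_cap_ms descriptions in the cron jitter item.

```rust
/// Maps the first 8 hex digits of a UUID string to a fraction in 0.0..=1.0.
/// Returns None if the prefix is missing or not valid hex.
fn frac_from_id_prefix(entry_id: &str) -> Option<f64> {
    let prefix = entry_id.get(..8)?;
    let n = u32::from_str_radix(prefix, 16).ok()?;
    Some(n as f64 / u32::MAX as f64)
}

/// Jitter for a recurring entry: a deterministic fraction of the window,
/// where the window is `recurring_frac` of the interval, capped at
/// `recurring_cap_ms`.
fn recurring_jitter_ms(
    entry_id: &str,
    interval_ms: u64,
    recurring_frac: f64,
    recurring_cap_ms: u64,
) -> Option<u64> {
    let frac = frac_from_id_prefix(entry_id)?;
    let window_ms = (interval_ms as f64 * recurring_frac).min(recurring_cap_ms as f64);
    Some((frac * window_ms) as u64)
}
```

Since nothing here depends on the clock or a random source, recomputing the jitter for the same entry after a retry or a restart always yields the same offset.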

→ See Cron jitter (deferred), Proactive mode

5. Communication — every surface the agent can reach

5.1. Inbound from the user

Pairing (Phase 26) — every (channel, account_id) inbound goes through a pairing gate. Senders that haven't been allowlisted via nexo pair seed get a pairing challenge. Per-binding pairing_policy + auto_challenge knobs. Seeded senders survive daemon restarts via PairingStore::list_allow.
WhatsApp / Telegram / email / browser — first-party plugins (Phases 6, 22, plus email + browser CDP). Each is a Channel impl that maps inbound platform events to agent.intake.<binding> broker subjects.
MCP channels (Phase 80.9) — any MCP server that declares experimental['nexo/channel'] can push user messages into the agent. Provider-agnostic: write a Slack adapter as an MCP server and the agent gets Slack inbound for free.
- 5-step gate: capability + killswitch + per-binding allowlist + plugin source verification + approved allowlist.
- SQLite-backed session registry — Slack threads survive daemon restarts.
- Token bucket rate limit per server.
- Audit marker source: "channel:<server>" in the turn-log.
- Operator CLI nexo channel list / doctor / test.

5.2. Outbound to the user

notify_origin / notify_channel hooks — Phase 67.F callback shape so the agent can surface mid-goal updates back to the originating channel without holding the request open.
send_user_message tool (Phase 80.8) — when brief mode is active, the agent's visible output flows through this tool. status: "normal" for replies, "proactive" for unsolicited surfacings. Free text outside the tool stays visible in the detail view.
channel_send tool (Phase 80.9) — invoke any MCP channel server's outbound tool by name. Configurable outbound_tool_name per server (default send_message).
Reminder tool (Phase 7.3) — schedule a future message to any channel.

5.3. Inbound from the world

Generic webhook receiver (Phase 80.12) — HTTP receiver behind a tunnel. Configure each source in YAML: signature_spec (HMAC-SHA256/SHA1/raw token) + event_kind_from (header or body json-path) + publish_to (NATS subject). Constant-time signature compare via subtle::ConstantTimeEq. Provider-agnostic: GitHub, Stripe, Calendly, and Zapier are all configured in YAML.
Pollers (Phase 19) — config-time external endpoint polls. Fan-out to per-source NATS subjects.
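The reason for a constant-time signature compare is that an early-exit comparison leaks, through timing, how many leading bytes of a forged signature were correct. The std-only sketch below shows the shape of the technique; it is an illustration, not a replacement for subtle::ConstantTimeEq, which additionally guards against the compiler optimising the loop into an early exit.

```rust
/// Compares two byte strings without exiting early on the first mismatch.
/// Illustrative only: production code should use subtle::ConstantTimeEq.
fn constant_time_eq(a: &[u8], b: &[u8]) -> bool {
    // Length is not secret for fixed-size MACs, so this early return is fine.
    if a.len() != b.len() {
        return false;
    }
    // Accumulate differences; every byte pair is always visited.
    let mut diff = 0u8;
    for (x, y) in a.iter().zip(b.iter()) {
        diff |= x ^ y;
    }
    diff == 0
}
```

The expected and received signatures would both be decoded to raw bytes before the comparison, so that hex case or encoding differences do not cause false mismatches.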

5.4. Multi-agent coordination

Peer inbox (Phase 80.11) — every running goal has a NATS subject agent.inbox.<goal_id>. list_peers returns reachable peers (filtered by allowed_delegates); send_to_peer sends a typed InboxMessage with correlation_id.
InboxRouter (Phase 80.11.b) — single broker subscriber on agent.inbox.>, dashmap per-goal buffers (MAX_QUEUE=64, FIFO eviction). Renders <peer-message from="..." sent_at="..." correlation_id="..."> block into the agent's next turn.
Teams (Phase 79.6) — N parallel coordinated agents with a shared scratchpad directory. Distinct from Agent 1-to-1 delegation — suited to research fan-out + massive refactors.
Delegation tool (Phase 8) — agent-to-agent routing on agent.route.{target_id} with correlation_id. Sync mode awaits the response; ForkAndForget (Phase 80.19) fires the delegate without blocking.
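The per-goal buffer described for InboxRouter (MAX_QUEUE=64, FIFO eviction) behaves like a bounded queue. A minimal single-goal sketch, assuming "FIFO eviction" means the oldest message is dropped when a new one arrives at capacity; the real router keeps one such buffer per goal in a concurrent map. GoalInbox and its methods are hypothetical names.

```rust
use std::collections::VecDeque;

const MAX_QUEUE: usize = 64;

/// Bounded buffer for one goal's pending peer messages.
struct GoalInbox {
    queue: VecDeque<String>,
}

impl GoalInbox {
    fn new() -> Self {
        GoalInbox { queue: VecDeque::with_capacity(MAX_QUEUE) }
    }

    /// Enqueues a message; if the buffer is full, evicts and returns the oldest.
    fn push(&mut self, msg: String) -> Option<String> {
        let evicted = if self.queue.len() == MAX_QUEUE {
            self.queue.pop_front()
        } else {
            None
        };
        self.queue.push_back(msg);
        evicted
    }

    /// Takes everything, oldest first, for rendering into the next turn.
    fn drain(&mut self) -> Vec<String> {
        self.queue.drain(..).collect()
    }
}
```

Returning the evicted message from push gives the caller a hook for logging or counting drops.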

→ See MCP channels, Multi-agent coordination, AWAY_SUMMARY

6. Permission + safety

The agent has powerful tools. The safety story is layered.

Per-binding capability override (Phase 16) — each binding has its own EffectiveBindingPolicy that filters allowed_tools, rate limits, outbound allowlists, and capability gates. Same agent can have a public WhatsApp binding (locked-down tool set) AND a private Telegram binding (full power).
Auto-approve dial (Phase 80.17) — auto_approve: true skips the prompt for read-only / scoped-write tools, while destructive Bash, writes outside the workspace, ConfigTool, REPL, and remote_trigger always ask. The is_curated_auto_approve decision table has 25 entries, with a symlink-escape defense and a parent-canonicalize fallback for new files. Tools with an mcp_ / ext_ prefix default to ask. The default arm is _ => false.
Capability inventory — crates/setup/src/capabilities.rs::INVENTORY registers every dangerous env toggle (NEXO_DREAM_NOW_ENABLED, NEXO_KAIROS_REMOTE_CONTROL, etc). nexo doctor capabilities surfaces every armed knob.
Bash safety (Phase 77.8-77.10):
- Destructive command warning — flags rm -rf /-shaped invocations.
- Sed-in-place + path validation — rejects sed -i against paths outside the workspace.
- shouldUseSandbox heuristic with bwrap / firejail probe.
Channel permission relay (Phase 80.9.b) — ChannelRelayDecider decorator races the local approval prompt against any channel reply (yes <id> / no <id> from the user's phone). First decision wins. 5-letter ID alphabet a-z minus l (anti-confusable); substring blocklist for offensive combos. Local prompt always runs in parallel — channel approval is a second surface, never a replacement.
Setup doctor — nexo setup doctor audits (channel, account_id) tuples, capability gates, dispatch policy consistency, pairing allowlist coverage.
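The channel permission relay's reply format (yes <id> / no <id> with a 5-letter ID drawn from a-z minus l) can be validated with a small parser. This sketch assumes replies are lowercase and contain exactly two whitespace-separated tokens; the offensive-substring blocklist applies when IDs are generated and is not modelled here. Names are hypothetical.

```rust
#[derive(Debug, PartialEq, Eq)]
enum Reply<'a> {
    Yes(&'a str),
    No(&'a str),
}

/// An approval ID is exactly 5 letters from a-z, excluding 'l'.
fn is_valid_id(id: &str) -> bool {
    id.len() == 5 && id.bytes().all(|b| b.is_ascii_lowercase() && b != b'l')
}

/// Parses "yes <id>" or "no <id>"; anything else is not an approval reply.
fn parse_reply(text: &str) -> Option<Reply<'_>> {
    let mut parts = text.split_whitespace();
    let verb = parts.next()?;
    let id = parts.next()?;
    if parts.next().is_some() || !is_valid_id(id) {
        return None;
    }
    match verb {
        "yes" => Some(Reply::Yes(id)),
        "no" => Some(Reply::No(id)),
        _ => None,
    }
}
```

Rejecting anything that is not an exact match keeps ordinary chat messages that happen to start with "yes" or "no" from being treated as approvals.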

→ See Auto-approve dial, Capability toggles, Bash safety knobs

7. Audit + observability

Everything the agent does leaves a trail.

Turn-level audit log (Phase 72) — every driver-loop AttemptResult writes a row to goal_turns SQLite table: outcome, decision text, summary, diff_stat, error, raw_json, plus the channel source marker. 1000-row tail cap. Idempotent on (goal_id, turn_index) so a replay doesn't corrupt history.
agent_turns_tail goal_id=<uuid> [n=20] tool — read tool that surfaces the last N turns of a goal as a markdown table. Post-mortem debug surface.
DreamTask audit (Phase 80.18) — dream_runs SQLite table joined to goal_id with status, phase, sessions_reviewing, files_touched (JSON), prior_mtime_ms, started_at, ended_at. dream_runs_tail LLM tool. nexo agent dream tail/status/kill CLI.
Agent registry persistence (Phase 71) — agent_handles SQLite table tracks every Running / completed / aborted goal. Survives daemon restarts.
Channel turn-log marker (Phase 80.9.h) — channel-driven turns write source: "channel:<server>". Single SQL filter answers "what came in via Slack today?".
Prometheus metrics (Phase 9.2) — counters + gauges per agent / per binding / per tool / per channel. health.bind YAML key wires the scrape endpoint.
Tracing logs — every gate / every dispatch / every retry emits a tracing::info! or warn! with structured fields (server, binding, kind, reason, error). Operator-readable.
Config-changes log (Phase 79.10) — when ConfigTool mutates YAML, a row lands in config_changes table with patch_id, actor_origin, allowed paths.

→ See Logging, Metrics, Turn-level audit log (deferred)

8. Operator surface

The CLI commands a human runs to drive / debug / observe the agent:

Command	What it does
`nexo run --config config/agents.yaml`	Daemon entrypoint
`nexo agent run [--bg] "<prompt>"`	Spawn a goal
`nexo agent ps [--all] [--kind=...]`	List running goals
`nexo agent attach <goal_id>`	Snapshot of a goal
`nexo agent discover [--include-interactive]`	List discoverable goals
`nexo agent dream tail/status/kill`	DreamTask audit + control
`nexo channel list/doctor/test`	MCP channels surface
`nexo pair list/seed/start/revoke`	Pairing gate management
`nexo flow list/show/cancel/resume`	TaskFlow runtime
`nexo setup`	Interactive wizard
`nexo setup doctor`	Configuration audit
`nexo setup migrate --dry-run/--apply`	Schema migrations
`nexo doctor capabilities`	Env toggle inventory
`nexo ext install/list/uninstall/run`	Extension management
`nexo mcp-server`	Run nexo as an MCP server

→ See CLI reference

9. End-to-end use case

This is the kind of workflow the autonomous agent is built for.

Scenario: a marketing-agent named kate runs as a daemon process, paired with the operator's Slack workspace + Telegram account. It manages the editorial calendar and replies to user queries during business hours.

agents:
  - id: kate
    model:
      provider: anthropic
      model: claude-sonnet-4-5
    plugins: [memory, browser, web_search]
    assistant_mode:
      enabled: true
    auto_approve: true
    proactive:
      enabled: true
      tick_interval_secs: 1800   # check in every 30 min
      max_idle_secs: 86400
    auto_dream:
      enabled: true
    channels:
      enabled: true
      approved:
        - server: slack
        - server: telegram
    inbound_bindings:
      - plugin: telegram
        instance: kate_tg
        allowed_channel_servers: [slack, telegram]
        auto_approve: true
        dispatch_policy:
          mode: full

What happens at runtime:

Boot — nexo run spawns kate as a daemon. The daemon reads the YAML, validates, opens broker, opens SQLite stores (memory, agent registry, dream runs, turn log, channel sessions, pairing). Connects the configured MCP servers. Spawns a ChannelInboundLoop per (binding, server) plus a single ChannelBridge per process. Wraps the inner permission decider in ChannelRelayDecider.
First Slack DM — alice writes "What are we publishing today?" in Slack thread 1700000000.000. The Slack MCP server emits notifications/nexo/channel. The runtime parses it, derives session_key = "slack|thread_ts=1700000000.000", resolves a fresh session_uuid, persists it in mcp_channel_sessions.sqlite, and hands the <channel source="slack" thread_ts="1700000000.000"> block to the intake. The pairing gate verifies alice is allowlisted (or challenges her).
Agent decides — the LLM reads recent context (long-term memory + transcripts) and decides to look up the calendar. It calls Bash(python check_calendar.py). Auto-approve skips the prompt because the command is read-only and stays inside the workspace.
Reply — agent calls channel_send(server: "slack", content: "The Q2 blog post is still pending", arguments: { thread_ts: "1700000000.000" }). The runtime resolves the outbound tool name from the registered server's snapshot and invokes it through the MCP runtime. Slack MCP server posts to the Slack API.
Cron fires at 8 PM — cron_create from a previous turn scheduled a daily summary. Cron runner picks it up, dispatches an LLM turn through LlmCronDispatcher. Output goes to the operator's Telegram via notify_channel.
Risky tool prompt — the agent decides to schedule an email blast. The local approval prompt opens; in parallel the runtime emits notifications/nexo/channel/permission_request to both Slack and Telegram. Operator's phone shows Approve "Schedule email blast?" — yes abcde / no abcde. Operator types yes abcde in Telegram; Telegram MCP server parses, emits notifications/nexo/channel/permission. ChannelRelayDecider wins the race, returns AllowOnce. Email sends.
Operator sleeps — agent keeps running. Receives Slack DMs from team members; replies through the same threads. Cron tasks fire on schedule. Memory consolidates at midnight via auto_dream.
Daemon restart — operator pushes a new YAML, the watcher detects, validates, swaps via Phase 18 ArcSwap. The ChannelRegistry::reevaluate pass evicts handlers that no longer pass the gate. SQLite stores survive. When alice writes again in the same Slack thread, the agent reattaches to the same session — the bot doesn't re-introduce itself.
Operator returns after 12 h silence — first inbound triggers the AWAY_SUMMARY digest. Agent composes a markdown report of the past 12 h: 14 channel messages handled, 2 permission prompts approved, 1 cron fire completed. Sent before processing the operator's actual message.
Operator audits — agent_turns_tail goal_id=<uuid> n=50 shows every decision the agent made in the last 50 turns. nexo channel doctor validates the YAML against the gate. nexo agent dream tail shows last consolidations.

The operator never sat at a terminal during steps 5-9. The agent is autonomous within the bounds of the YAML.
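
The session reattachment in the walkthrough can be pictured with a small in-memory sketch. It assumes a key shaped like the `slack|thread_ts=...` example above; `derive_session_key` and `SessionStoreSketch` are hypothetical names, a `u64` counter stands in for the real `session_uuid`, and the actual store is SQLite-backed rather than a `HashMap`.

```rust
use std::collections::HashMap;

/// Hypothetical helper: builds a key shaped like `slack|thread_ts=1700000000.000`.
fn derive_session_key(server: &str, field: &str, value: &str) -> String {
    format!("{server}|{field}={value}")
}

/// In-memory stand-in for the channel session store (assumption: the real one persists to SQLite).
struct SessionStoreSketch {
    next_id: u64,
    by_key: HashMap<String, u64>,
}

impl SessionStoreSketch {
    fn new() -> Self {
        Self { next_id: 1, by_key: HashMap::new() }
    }

    /// Returns `(session_id, is_fresh)`; a known key reattaches to its existing session.
    fn resolve(&mut self, key: &str) -> (u64, bool) {
        if let Some(id) = self.by_key.get(key) {
            return (*id, false);
        }
        let id = self.next_id;
        self.next_id += 1;
        self.by_key.insert(key.to_string(), id);
        (id, true)
    }
}
```

Because the key depends only on the server and the thread identifier, a second message in the same Slack thread resolves to the same session, which is why the bot does not re-introduce itself.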

10. Provider-agnostic by design

Every autonomous behaviour works against any LLM provider:

MiniMax M2.5 (primary)
Anthropic Claude (subscription OAuth, API key, or Claude Code import)
OpenAI-compat providers
Gemini
Local llama.cpp (Phase 68 backlog — model-agnostic GGUF loader for tier-0 inference)

The LlmClient trait is the abstraction. No autonomous feature hard-codes a provider; everything routes through the registry + binding-level provider selection.

Channels work the same way: any MCP server that follows the protocol becomes a channel, regardless of which platform it adapts.

Pollers, webhooks, and channel adapters are all data-driven via YAML — operators don't write per-provider Rust to add a new external surface.

11. Code map — where each capability lives

Capability	Crate / file	Tests
Driver-loop	`crates/driver-loop/`	+ integration tests
Permission decider	`crates/driver-permission/src/decider.rs`	inline
Auto-approve dial	`crates/driver-permission/src/auto_approve.rs`	27
Channel relay decorator	`crates/driver-permission/src/channel_relay.rs`	8
Bash safety	`crates/driver-permission/src/bash_destructive.rs`	19
Long-term memory	`crates/memory/src/long_term.rs`	inline
Memdir relevance scorer	`crates/memory/src/memdir/`	inline
Secret guard	`crates/memory/src/secret_guard.rs`	inline
autoDream runner	`crates/dream/`	67
Cron schedule + jitter	`crates/core/src/cron_schedule.rs`	80
Channels gate + parser + bridge	`crates/mcp/src/channel*.rs`	109
Channel session store	`crates/mcp/src/channel_session_store.rs`	9
Channel permission relay	`crates/mcp/src/channel_permission.rs`	27
Channel boot helpers	`crates/mcp/src/channel_boot.rs`	5
Channel LLM tools	`crates/core/src/agent/channel_*_tool.rs`	21
Pairing	`crates/pairing/`	inline
TaskFlow	`crates/taskflow/`	inline
Agent registry persistence	`crates/agent-registry/`	51
Turn-level audit log	`crates/agent-registry/src/turn_log.rs`	9
Inbox router	`crates/core/src/agent/inbox*.rs`	17
Webhook receiver	`crates/webhook-receiver/`	33
Forked sub-agent	`crates/fork/`	42
Driver / runtime hookup	`src/main.rs`	smoke

Total channel-related lib tests: 168 passing, spread across 5 crates. The workspace-wide test count is much larger; see the phase-specific docs for the per-feature breakdown.

12. What's NOT done yet

Honest list of polish items still backlogged:

Sample MCP channel server fixture — extensions/sample-channel-server/ reference impl so operators can wire a fake channel quickly without writing an MCP server from scratch. ~200 LOC, high educational value, no functional impact.
Setup wizard panel for channels — nexo setup → Configurar agente → Channels interactive opt-in. UX nice-to-have.
Live-runtime channel doctor — the current nexo channel doctor is a static check against the YAML. A live version would consult the active ChannelRegistry via NATS to show what is actually registered in the running daemon.
channel_history LLM tool — tail of the turn-log filtered by source: "channel:<server>", useful for the agent to ask itself "what did Slack send today".
Phase 67.10–67.13 — escalation-to-channel paths for driver-loop are largely subsumed by notify_origin / notify_channel already. Remaining tickets in PHASES.md.
Phase 68 Local LLM tier (llama.cpp) — 15 sub-phases for tier-0 inference (PII / embeddings / poller pre-filter / classifiers / fallback). Planned to run on Termux ARM CPU + desktop CPU/GPU.

None of these block the autonomous agent's current capabilities.

13. Where to go next

Setting up your first autonomous agent → Quick start + Setup wizard.
Deep dive on assistant mode + auto-approve → Assistant mode overview.
MCP channels specifically → MCP channels.
Multi-agent coordination patterns → Multi-agent coordination.
Audit + observability stack → Logging + Metrics + Turn-level audit log (deferred).
Phase tracking — PHASES.md at repo root has the exhaustive sub-phase status (✅ MVP / ⬜ open / DEFERRED).

Assistant mode

Assistant mode is a per-binding behavioural toggle that flips an agent into a proactive posture: it can act on its own when no user is in the chat, run long-lived background goals, coordinate with peers, and summarise activity when the user re-connects after a silence. Default is disabled — bindings without the block keep their conventional request-response behaviour.

Quick start

# agents.yaml
agents:
  - id: kate
    workspace_git:
      enabled: true     # required for `auto_dream` git commits
    dreaming:
      enabled: true
      interval_secs: 86400
    assistant_mode:
      enabled: true
      # Operator override (optional). When omitted, the bundled
      # default text is appended to the system prompt.
      system_prompt_addendum: null
      # Auto-spawn teammates at boot. Wired in 80.15.b follow-up;
      # accepted at parse time so YAML doesn't need migration later.
      initial_team: []
    auto_approve: true   # see `auto-approve.md`
    away_summary:
      enabled: true
      threshold_hours: 4
      max_events: 50

Boot logging confirms wiring:

INFO boot.assistant agent=kate assistant_mode runner registered

What changes when assistant mode is on

1. Proactive system prompt addendum

The binding's effective system prompt picks up an addendum that nudges the agent toward proactive behaviour:

You are running in assistant mode. Your default posture is proactive: when the user is away, you may use scheduled triggers (cron) and channel inbound to drive your own actions, including spawning teammates, calling tools to gather context, and waiting on external events. When you have something useful to report, surface it succinctly through the configured outbound channel; otherwise stay quiet rather than narrating idle time. Only block on user input when you genuinely need a decision they can supply.

Operator can override the text via system_prompt_addendum: "...". Empty strings are rejected — omit the field to use the default.
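
A minimal sketch of the override rule, assuming the field parses to an `Option<&str>`. `resolve_addendum` and the abbreviated `DEFAULT_ADDENDUM` constant are illustrative names, not the crate's API.

```rust
/// Abbreviated placeholder for the bundled default text quoted above.
const DEFAULT_ADDENDUM: &str = "You are running in assistant mode.";

/// Hypothetical resolver: `None` selects the default, an empty override is rejected.
fn resolve_addendum(override_text: Option<&str>) -> Result<&str, String> {
    match override_text {
        None => Ok(DEFAULT_ADDENDUM),
        Some(text) if text.is_empty() => Err(
            "system_prompt_addendum must not be empty; omit the field to use the default"
                .to_string(),
        ),
        Some(text) => Ok(text),
    }
}
```

Rejecting the empty string instead of silently falling back keeps a typo in the YAML from quietly disabling a custom prompt.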

2. Boot-immutable flag

The enabled flag is captured at boot. Toggling requires a daemon restart so a single turn never sees a half-flipped state. The addendum content itself IS hot-reloadable through the Phase 18 config-watcher path — operators can iterate on the prompt text without bouncing the daemon.

3. Curated auto-approve dial (companion feature)

Enabling assistant mode (assistant_mode.enabled: true) is most useful paired with auto_approve: true (see auto-approve.md). Without the dial, the agent hangs on every tool call waiting for interactive approval — the proactive posture dies the first time it tries to run ls /tmp. With the dial, safe read-only / scoped-write tools auto-allow while destructive Bash, writes outside workspace, and self-config-edit tools always ask.

nexo setup doctor warns when these are misaligned (assistant_mode on but auto_approve off — see 80.17.c follow-up for the audit).

4. Always-on lifecycle

Bindings in assistant mode typically pair with:

BG sessions (agent run --bg) — long-lived goals that survive shell exit. See cli/agent-bg.md.
AWAY_SUMMARY — re-connection digest after silence. See away-summary.md.
Multi-agent coordination — list_peers + send_to_peer for in-process peer messaging. See multi-agent-coordination.md.
Heartbeat / cron — for time-driven proactive triggers (existing Phase 7 + future Phase 80.2 jitter cluster).

Reading the flag from code

Boot-time helpers resolve the configured value through a single view:

use nexo_assistant::ResolvedAssistant;
use nexo_config::types::assistant::AssistantConfig;

// At boot, per binding:
let resolved = ResolvedAssistant::resolve(cfg.assistant_mode.as_ref());
// AgentContext.assistant: ResolvedAssistant — read by:
//   - llm_behavior (system prompt addendum injection)
//   - cron defaults (80.15.c follow-up)
//   - brief mode auto-on (80.15.d follow-up)
//   - dream context kairos signal (Phase 80.1)
//   - remote-control auto-tier (80.17.b.b follow-up)

Status (Phase 80 cluster)

The assistant-mode cluster ships across multiple sub-phases. As of the most recent Phase 80 sweep:

Sub-phase	Feature	Status
80.15	`assistant_mode` flag + addendum + `ResolvedAssistant`	✅ MVP
80.10	`SessionKind` enum + `agent run --bg` + `agent ps`	✅ MVP
80.16	`agent attach` + `agent discover` (DB-only viewer)	✅ MVP
80.17 + 80.17.b	`auto_approve` dial + decorator	✅ MVP
80.14	AWAY_SUMMARY digest helper	✅ MVP
80.11	Agent inbox + `list_peers` / `send_to_peer` tools	✅ MVP
80.11.b	Receive side router + per-goal buffer + render	✅ MVP
80.1 cluster	`auto_dream` fork-style consolidation	✅ MVP
80.16.b	Live event streaming via NATS for attach	⬜
80.2-80.6	Cron jitter cluster	⬜
80.8	Brief mode + `SendUserMessage` tool	⬜
80.9	MCP channels routing (7-step gate)	⬜
80.12	Generic webhook receiver	⬜
80.21	Docs + admin-ui sweep	✅ (this page)

Each sub-phase ships its infrastructure standalone (testable in isolation, opt-in). Wiring the whole cluster end-to-end requires the operator to thread the deferred main.rs hookup snippets when their daemon dirty state allows.

Auto-approve dial

The auto_approve per-binding flag flips the approval gate from "always ask the operator" to "auto-allow a curated subset of safe tools". It is the missing piece that makes assistant mode practical: a proactive agent running cron-driven goals can't block on interactive approvals at every tool call.

Default is disabled — current interactive-approval behaviour preserved unchanged for every existing binding.

What auto-approves

When auto_approve: true AND the tool is in the curated subset AND its call passes the per-tool conditional checks, the approval prompt is skipped and the tool runs as AllowOnce. Otherwise the existing approval pipeline takes over.

Bucket	Tools	Notes
Read-only / info	`FileRead`, `Glob`, `Grep`, `LSP`, `WebFetch`, `WebSearch`, `list_agents`, `agent_status`, `agent_turns_tail`, `memory_history`, `dream_runs_tail`, `list_mcp_resources`, `read_mcp_resource`, `list_followups`, `list_peers`, `task_get`	Always auto when dial on
Bash conditional	`Bash`	Only when `is_read_only` AND not `destructive_command` AND not `sed_in_place`
Scoped writes	`FileEdit`, `FileWrite`	Only when path canonicalises under `workspace_path`; new-file case canonicalises parent then re-attaches filename; symlink-escape resistant
Notify + memory	`notify_origin`, `notify_channel`, `notify_push`, `forge_memory_checkpoint`, `dream_now`, `ask_user_question`	Always auto
Coordination	`delegate`, `team_create`, `team_delete`, `send_to_peer`, `task_create`, `task_update`, `task_stop`	Always auto

What ALWAYS asks (regardless of dial)

Tool	Why
`ConfigTool` / `config_self_edit`	Self-editing YAML is too dangerous
`REPL`	Stateful subprocess side-effects
`remote_trigger`	Outbound webhook to arbitrary URL
`schedule_cron`	Persistent state mutation
Bash with destructive / sed-in-place	Phase 77.8/77.9 vetoes ALWAYS apply
`mcp_` / `ext_` prefix	Heterogeneous per-server semantics
Unknown tool name	Default-deny — new tools must be explicitly added
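
The two tables can be read as one decision function. The sketch below is a simplified illustration, not the contents of `auto_approve.rs`: `auto_approve_sketch` and `CallFacts` are hypothetical, the tool lists are abbreviated, and the Bash and path facts are assumed to be computed by the caller.

```rust
/// Hypothetical pre-computed facts about a single tool call.
#[derive(Debug, Clone, Copy, Default)]
struct CallFacts {
    bash_read_only: bool,
    bash_destructive: bool,
    bash_sed_in_place: bool,
    path_under_workspace: bool,
}

/// Returns true when the prompt may be skipped (the call runs as AllowOnce).
fn auto_approve_sketch(tool: &str, dial_on: bool, facts: CallFacts) -> bool {
    if !dial_on {
        return false;
    }
    // Prefixed tools always ask: their semantics vary per server.
    if tool.starts_with("mcp_") || tool.starts_with("ext_") {
        return false;
    }
    match tool {
        // Read-only / info bucket (abbreviated).
        "FileRead" | "Glob" | "Grep" | "LSP" | "WebFetch" | "WebSearch" | "list_peers"
        | "task_get" => true,
        // Notify + coordination buckets (abbreviated).
        "notify_origin" | "notify_channel" | "delegate" | "send_to_peer" => true,
        // Conditional buckets.
        "Bash" => facts.bash_read_only && !facts.bash_destructive && !facts.bash_sed_in_place,
        "FileEdit" | "FileWrite" => facts.path_under_workspace,
        // Always-ask tools, spelled out for readability.
        "ConfigTool" | "config_self_edit" | "REPL" | "remote_trigger" | "schedule_cron" => false,
        // Default-deny: unknown tools never auto-approve.
        _ => false,
    }
}
```

The final wildcard arm is the important design choice: a tool added later falls through to the operator prompt until someone lists it explicitly.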

Layering

The dial composes with the existing gates rather than replacing them:

┌─────────────────────────────────────────────────────────────┐
│ 1. Phase 16 binding `allowed_tools` (tool name filter)      │
│    └── Tool not in list → never even reaches the registry   │
│ 2. Capability gate (env vars + cargo features)              │
│    └── Host-level dangerous toggle off → tool stripped      │
│ 3. Auto-approve dial (THIS layer)                           │
│    └── Curated subset → AllowOnce                           │
│    └── Otherwise → fall through to operator prompt          │
│ 4. Bash destructive heuristic (Phase 77.8/77.9)             │
│    └── ALWAYS vetoes regardless of dial                     │
│ 5. Operator interactive approval                            │
│    └── Companion-tui / pairing / chat reply                 │
└─────────────────────────────────────────────────────────────┘

The dial NEVER widens the tool surface. A tool absent from the binding's allowed_tools is still absent. The dial only skips the prompt for tools that are already on the surface AND fall in the curated subset.
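
One way to picture the composition is a single function that walks the layers in order. This is a sketch under assumptions: `Outcome`, `Call`, and `decide` are hypothetical, each layer is reduced to a pre-computed boolean, and the destructive veto is checked before the dial simply to guarantee it wins; the real crates may order and structure the checks differently.

```rust
#[derive(Debug, PartialEq)]
enum Outcome {
    NotOnSurface,
    StrippedByCapability,
    AllowOnce,
    AskOperator,
}

/// Hypothetical snapshot of what each layer knows about one call.
struct Call<'a> {
    tool: &'a str,
    allowed_tools: &'a [&'a str],
    capability_enabled: bool,
    in_curated_subset: bool,
    dial_on: bool,
    bash_destructive: bool,
}

fn decide(call: &Call) -> Outcome {
    // Layer 1: binding allow-list. The dial cannot add tools back.
    if !call.allowed_tools.contains(&call.tool) {
        return Outcome::NotOnSurface;
    }
    // Layer 2: host-level capability gate.
    if !call.capability_enabled {
        return Outcome::StrippedByCapability;
    }
    // Layer 4 checked early so the veto applies regardless of the dial.
    if call.bash_destructive {
        return Outcome::AskOperator;
    }
    // Layer 3: the dial skips the prompt for the curated subset only.
    if call.dial_on && call.in_curated_subset {
        return Outcome::AllowOnce;
    }
    // Layer 5: everything else goes to the operator.
    Outcome::AskOperator
}
```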

Configuration

agents:
  - id: kate
    workspace: /home/kate/projects   # used to scope FileEdit/Write
    allowed_tools: ["FileRead", "Bash", "FileEdit", "delegate"]
    auto_approve: true               # agent-level default
    inbound_bindings:
      - plugin: whatsapp
        # Per-binding override (optional). None inherits agent default.
        auto_approve: false           # this binding stays interactive

agent.auto_approve is the agent-level default; per-binding override at inbound_bindings[].auto_approve is Option<bool> where None inherits.
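
The inheritance rule reduces to a single `Option` fallback. A minimal sketch, with `effective_auto_approve` as a hypothetical name for whatever resolves the per-binding policy:

```rust
/// `None` on the binding inherits the agent-level default; `Some(_)` overrides it.
fn effective_auto_approve(agent_default: bool, binding_override: Option<bool>) -> bool {
    binding_override.unwrap_or(agent_default)
}
```

In the YAML above, the whatsapp binding resolves to `false` even though the agent default is `true`, while a binding that omits the key would resolve to `true`.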

Wiring on the operator side (deferred 80.17.b.b/c)

Today's slim MVP ships:

is_curated_auto_approve(tool_name, args, on, workspace_path) -> bool decision table (crates/driver-permission/src/auto_approve.rs)
AutoApproveDecider<D> decorator (same module) wrapping any PermissionDecider chain
AgentConfig.auto_approve: bool + InboundBinding.auto_approve: Option<bool> YAML schema
EffectiveBindingPolicy.auto_approve: bool + workspace_path: Option<PathBuf> resolved per binding

Pending:

80.17.b.b — boot-time wrap of the active decider with AutoApproveDecider::new(...) (1-line snippet)
80.17.b.c — caller-side metadata population: the wire that constructs PermissionRequest must insert metadata.auto_approve and metadata.workspace_path from the resolved policy before invoking the decider
80.17.c — nexo setup doctor warn for assistant_mode + !auto_approve misconfiguration

Until those ship, the decorator is a transparent pass-through — the helper is called but the metadata never reads true. Test it locally by hand-populating metadata in your decider wrapper.

Defense-in-depth

Five layers protect against agent misbehaviour even with the dial on:

Phase 16 binding policy — tool not on the surface = never reachable.
Default-deny match arm — newly introduced tools never auto-approve until explicitly added to the decision table.
Phase 77.8/77.9 destructive heuristic — rm -rf, dd, mkfs, sed -i, fork-bomb shapes always veto.
Workspace-scoped writes — symlink-escape resistant via Path::canonicalize + starts_with.
Operator restart kill-switch — flipping auto_approve: false and restarting takes < 5 seconds.
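
The workspace-scoped write check can be sketched with the standard library alone. `resolves_under_workspace` is a hypothetical helper, not the function in `auto_approve.rs`; it follows the description above (canonicalise, then `starts_with`), with the new-file case canonicalising the parent and re-attaching the filename.

```rust
use std::io;
use std::path::{Path, PathBuf};

/// Reports whether `target` resolves to a location under `workspace`.
/// Symlinks are resolved by `canonicalize`, so a link pointing outside is caught.
fn resolves_under_workspace(workspace: &Path, target: &Path) -> io::Result<bool> {
    let root = workspace.canonicalize()?;
    let resolved: PathBuf = if target.exists() {
        target.canonicalize()?
    } else {
        // New-file case: the file does not exist yet, so resolve its parent
        // directory and re-attach the final component.
        let parent = target
            .parent()
            .ok_or_else(|| io::Error::new(io::ErrorKind::InvalidInput, "target has no parent"))?;
        let name = target
            .file_name()
            .ok_or_else(|| io::Error::new(io::ErrorKind::InvalidInput, "target has no file name"))?;
        parent.canonicalize()?.join(name)
    };
    // `Path::starts_with` compares whole components, so `/ws-other` is not under `/ws`.
    Ok(resolved.starts_with(&root))
}
```

An error (for example a missing parent directory) is surfaced to the caller; treating any error as "not under the workspace" would be the conservative way to consume it.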

AWAY_SUMMARY digest

When the user has been silent for a configurable threshold (default 4 hours), the next inbound message triggers a short markdown digest that summarises everything the agent did during the silence: goals completed, aborts, failures, and turn counts. Default is disabled — per-binding opt-in.

Why

In assistant mode, the agent runs proactively in the user's absence. When the user comes back to the chat, they need a quick recap before the agent processes their new request. AWAY_SUMMARY is that recap.

The digest answers "what did you do while I was gone?" with a few counter bullets, NOT a long narrative. If you want a richer LLM-summarised version (1–3 sentences of natural prose), that's the deferred 80.14.b follow-up.

Configuration

agents:
  - id: kate
    away_summary:
      enabled: true
      threshold_hours: 4    # default
      max_events: 50        # default

Field	Type	Default	Notes
`enabled`	bool	`false`	Master toggle
`threshold_hours`	u64	`4`	Hours of silence before next inbound triggers digest. `0` would fire on every inbound — operator-side rate limiting becomes their responsibility. Rejected at validate time when > 30 days (likely operator confusion).
`max_events`	usize	`50`	Cap on events included in the digest. Larger windows still fire but truncate with a `(showing the most recent N — older events may exist)` suffix. Rejected at validate time when 0.
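
The defaults and validate-time rejections in the table can be sketched as follows. `AwaySummaryCfgSketch` and `validate_away_summary` are hypothetical stand-ins for the real config type, and the error wording is an assumption.

```rust
#[derive(Debug, Clone, PartialEq)]
struct AwaySummaryCfgSketch {
    enabled: bool,
    threshold_hours: u64,
    max_events: usize,
}

impl Default for AwaySummaryCfgSketch {
    fn default() -> Self {
        Self { enabled: false, threshold_hours: 4, max_events: 50 }
    }
}

/// 30 days expressed in hours.
const MAX_THRESHOLD_HOURS: u64 = 30 * 24;

fn validate_away_summary(cfg: &AwaySummaryCfgSketch) -> Result<(), String> {
    if cfg.threshold_hours > MAX_THRESHOLD_HOURS {
        return Err(format!(
            "threshold_hours {} exceeds 30 days ({MAX_THRESHOLD_HOURS} h)",
            cfg.threshold_hours
        ));
    }
    if cfg.max_events == 0 {
        return Err("max_events must be at least 1".to_string());
    }
    // threshold_hours == 0 is accepted: it fires on every inbound.
    Ok(())
}
```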

Output shape

Template-based markdown (no LLM call in the slim MVP — pure-fn render):

**While you were away** (last 6h12m):
- 7 goal turn(s) recorded
- 4 completed
- 1 aborted/cancelled
- 1 failed
- 1 in progress / other

When the event count hits max_events, a truncation suffix is appended:

_(showing the most recent 50 — older events may exist)_
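
A pure-function render along these lines might look like the sketch below. `DigestCounts` and `render_digest_sketch` are hypothetical, and two details are assumptions: minutes are zero-padded, and every counter line is printed even when its value is zero.

```rust
/// Hypothetical tally of turn-log events since `last_seen`.
struct DigestCounts {
    completed: usize,
    aborted: usize,
    failed: usize,
    other: usize,
}

fn render_digest_sketch(elapsed_secs: u64, counts: &DigestCounts, max_events: usize) -> String {
    let total = counts.completed + counts.aborted + counts.failed + counts.other;
    let (hours, minutes) = (elapsed_secs / 3600, (elapsed_secs % 3600) / 60);
    let mut out = format!("**While you were away** (last {hours}h{minutes:02}m):\n");
    out.push_str(&format!("- {total} goal turn(s) recorded\n"));
    out.push_str(&format!("- {} completed\n", counts.completed));
    out.push_str(&format!("- {} aborted/cancelled\n", counts.aborted));
    out.push_str(&format!("- {} failed\n", counts.failed));
    out.push_str(&format!("- {} in progress / other\n", counts.other));
    // Suffix text copied from the sample above; appended once the cap is hit.
    if total >= max_events {
        out.push_str(&format!(
            "\n_(showing the most recent {max_events} — older events may exist)_\n"
        ));
    }
    out
}
```

Keeping the render free of I/O and LLM calls makes the digest deterministic and cheap to test.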

Wiring (operator-side)

Today's slim MVP ships the digest helper as a pure-async function in nexo-dispatch-tools::away_summary:

use nexo_dispatch_tools::try_compose_away_digest;

// Inbound handler — called when a new user message arrives.
let digest = try_compose_away_digest(
    &cfg.away_summary.unwrap_or_default(),
    last_seen,           // Option<DateTime<Utc>> from caller storage
    chrono::Utc::now(),
    turn_log_store.as_ref(),
).await?;

if let Some(text) = digest {
    // Deliver via notify_origin BEFORE processing the user inbound.
    notify_origin(channel, &text).await?;
}

// Atomically update last_seen = now after composing
// (caller-managed storage — pairing store, separate SQLite, in-memory map).
update_last_seen(channel, sender_id, chrono::Utc::now()).await?;

// Now process the user's actual message.
process_inbound(...).await?;

The helper walks 4 gates cheapest-first:

cfg.enabled (opt-in)
last_seen.is_some() — None returns None without firing so the caller can use it as the bootstrap path (set last_seen = now without burning the threshold)
now - last_seen >= threshold — negative elapsed (clock skew) returns None
Turn-log has at least one event since last_seen — empty digest is not worth sending

When all four pass, returns Some(markdown).
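
The cheapest-first ordering can be illustrated with a synchronous sketch. It is not the signature of `try_compose_away_digest`: timestamps are plain Unix seconds instead of `DateTime<Utc>`, and the turn-log lookup is modelled as a closure so it only runs once the three cheaper gates have passed.

```rust
/// Hypothetical gate check; returns true when a digest should be composed.
fn should_fire_digest(
    enabled: bool,
    last_seen: Option<i64>,
    now: i64,
    threshold_hours: u64,
    count_events_since: impl FnOnce(i64) -> usize,
) -> bool {
    // Gate 1: opt-in.
    if !enabled {
        return false;
    }
    // Gate 2: bootstrap path, nothing to compare against yet.
    let Some(last) = last_seen else {
        return false;
    };
    // Gate 3: elapsed time, rejecting clock skew.
    let elapsed = now - last;
    if elapsed < 0 {
        return false;
    }
    if (elapsed as u64) < threshold_hours.saturating_mul(3600) {
        return false;
    }
    // Gate 4: the only gate that touches storage, so it runs last.
    count_events_since(last) > 0
}
```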

Atomic update pattern

The last_seen storage is operator-managed in the slim MVP: the helper accepts the timestamp as a parameter, doesn't couple to nexo-pairing or any specific table. Whatever you choose, update it atomically AFTER composing the digest so a rapid double-inbound doesn't fire twice.

Defense-in-depth

Edge case	Behaviour
`enabled: false`	Returns None
`last_seen: None` (bootstrap)	Returns None — caller sets last_seen without firing
`now - last_seen < threshold`	Returns None
`now < last_seen` (clock skew)	Returns None
Turn-log empty	Returns None
`max_events == 0`	Validate rejects at boot (not at runtime)
`threshold_hours > 30 days`	Validate rejects at boot

Deferred follow-ups

80.14.b — LLM-summarised version: forks a subagent that takes the events list and renders a 1–3 sentence prose summary. Today's MVP is template-based.
80.14.c — last_seen_at tracking in nexo-pairing::PairingStore with SQLite migration so operators don't roll their own.
80.14.d — Per-channel-adapter rendering (whatsapp / telegram render markdown differently).
80.14.e — Time-of-day awareness ("don't ping at 3am unless awake_hours covers").
80.14.f — Custom prompt template per agent (relevant once 80.14.b ships).
80.14.g — main.rs inbound interceptor wire (1-line invocation site, blocked on dirty-state pattern).

Multi-agent coordination

Two LLM tools — list_peers and send_to_peer — let in-process agents discover each other and exchange messages without going through a delegate-style RPC. Pairs cleanly with assistant mode: the agent's plain text output is NOT visible to other agents, so peer messaging IS the only way to communicate.

Subject contract

Per-goal NATS subject:

agent.inbox.<goal_id>

Wire format is JSON. Payload is InboxMessage:

pub struct InboxMessage {
    pub from_agent_id: String,
    pub from_goal_id: GoalId,
    pub to_agent_id: String,
    pub body: String,
    pub sent_at: DateTime<Utc>,
    pub correlation_id: Option<Uuid>,
}

correlation_id is omitted from the wire when None so request/response patterns can re-use the channel without accidentally injecting noise on one-shots.

Constants exposed via nexo_core::agent::inbox:

INBOX_SUBJECT_PREFIX = "agent.inbox"
MIN_BODY_CHARS = 1
MAX_BODY_BYTES = 64 * 1024
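
As a rough illustration of how these constants fit together, the sketch below builds and parses an inbox subject and checks a body against the limits. The three helper functions are hypothetical, and the goal id is treated as a plain string rather than the real `GoalId` type.

```rust
const INBOX_SUBJECT_PREFIX: &str = "agent.inbox";
const MIN_BODY_CHARS: usize = 1;
const MAX_BODY_BYTES: usize = 64 * 1024;

/// Builds `agent.inbox.<goal_id>`.
fn inbox_subject(goal_id: &str) -> String {
    format!("{INBOX_SUBJECT_PREFIX}.{goal_id}")
}

/// Recovers the goal id from a subject; malformed subjects yield `None`.
fn goal_id_from_subject(subject: &str) -> Option<&str> {
    let rest = subject.strip_prefix(INBOX_SUBJECT_PREFIX)?.strip_prefix('.')?;
    if rest.is_empty() || rest.contains('.') {
        None
    } else {
        Some(rest)
    }
}

/// The minimum is counted in characters, the maximum in bytes.
fn body_within_limits(body: &str) -> bool {
    body.chars().count() >= MIN_BODY_CHARS && body.len() <= MAX_BODY_BYTES
}
```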

`list_peers` LLM tool

Read-only enumeration. No-arg shape; returns peer summaries excluding the calling agent. Each entry includes a reachable flag based on the binding's allowed_delegates filter (Phase 16).

{
  "peers": [
    {
      "agent_id": "researcher",
      "description": "research agent",
      "reachable": true
    },
    {
      "agent_id": "writer",
      "description": "writer agent",
      "reachable": false
    }
  ]
}

When the binding has no PeerDirectory configured, returns:

{
  "peers": [],
  "note": "this agent has no PeerDirectory configured"
}

Use this BEFORE send_to_peer to discover valid to: targets.

`send_to_peer` LLM tool

Fire-and-forget. Resolves to: to a peer agent_id, looks up the peer's live goals, publishes the InboxMessage to each goal's inbox subject, returns the per-goal delivery report.

Tool shape

{
  "to": "researcher",
  "message": "task #1 ready for handoff",
  "correlation_id": "7a3b2f00-..."   // optional UUID
}

Output

{
  "delivered_to": ["b91c2d3a-...", "f88e1100-..."],
  "unreachable_reasons": []
}

Or, on failures:

{
  "delivered_to": [],
  "unreachable_reasons": [
    "unknown agent_id `ghost`"
  ]
}

Validation gates

The handler walks 6 gates:

to must be present + non-empty after trim
to != ctx.agent_id (self-sends rejected with explicit error)
message must be present + non-empty
Body ≤ MAX_BODY_BYTES (64 KB cap; rejected with explicit limit)
to must exist in PeerDirectory (otherwise the call fails fast with an "unknown agent_id" unreachable reason, before any broker round-trip)
Lookup must return at least one live goal id (empty → "no live goals" unreachable)

When all 6 pass, the handler iterates the live goals, builds an InboxMessage per goal with from_goal_id = ctx.session_id.map(GoalId).unwrap_or_else(GoalId::new) (best-effort; provenance preserved via from_agent_id), publishes via the broker, and accumulates per-goal results.
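
The six gates can be sketched as one function over an in-memory directory. Everything here is illustrative: `SendCheck` and `check_send` are hypothetical, the directory is a map from agent id to live goal ids, and the error strings other than the `unknown agent_id` shape shown earlier are assumed wording.

```rust
use std::collections::HashMap;

const MAX_BODY_BYTES: usize = 64 * 1024;

#[derive(Debug, PartialEq)]
enum SendCheck {
    Rejected(String),
    Unreachable(String),
    Deliver(Vec<String>),
}

fn check_send(
    self_id: &str,
    to: &str,
    message: &str,
    live_goals: &HashMap<String, Vec<String>>,
) -> SendCheck {
    let to = to.trim();
    if to.is_empty() {
        return SendCheck::Rejected("`to` must be non-empty".to_string());
    }
    if to == self_id {
        return SendCheck::Rejected("cannot send to self".to_string());
    }
    if message.is_empty() {
        return SendCheck::Rejected("`message` must be non-empty".to_string());
    }
    if message.len() > MAX_BODY_BYTES {
        return SendCheck::Rejected(format!("message exceeds {MAX_BODY_BYTES} bytes"));
    }
    // Unknown peers fail before any broker round-trip.
    let Some(goals) = live_goals.get(to) else {
        return SendCheck::Unreachable(format!("unknown agent_id `{to}`"));
    };
    if goals.is_empty() {
        return SendCheck::Unreachable(format!("no live goals for `{to}`"));
    }
    SendCheck::Deliver(goals.clone())
}
```

Input errors and unreachable peers are kept as separate variants because the tool reports them differently: the former as an explicit error, the latter inside `unreachable_reasons`.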

Per-goal fan-out

A peer with multiple live goals (Bg + Daemon + Interactive) gets the message at every goal's inbox. Per-goal failures don't cancel the whole call — unreachable_reasons accumulates while delivered_to records successful publishes.

Receive side (Phase 80.11.b)

Peer messages are queued in a per-goal in-memory FIFO buffer with a 64-message cap (FIFO eviction on overflow). The receiving goal's runtime drains the buffer at next turn start and renders the messages as a <peer-message from="..."> system block:

# PEER MESSAGES

<peer-message from="researcher" sent_at="2026-04-30T14:00:00+00:00">
task #1 ready for handoff
</peer-message>

correlation_id attribute is added when Some so the receiver can correlate replies back through the same channel.
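
A single-threaded sketch of the bounded buffer and the render step is shown below. The names are hypothetical (the real entry points are `InboxRouter` and `render_peer_messages_block`), timestamps are pre-formatted strings, and no escaping is applied to the body or attributes.

```rust
use std::collections::VecDeque;

const BUFFER_CAP: usize = 64;

struct PeerMessageSketch {
    from: String,
    sent_at: String,
    body: String,
    correlation_id: Option<String>,
}

struct InboxBufferSketch {
    queue: VecDeque<PeerMessageSketch>,
}

impl InboxBufferSketch {
    fn new() -> Self {
        Self { queue: VecDeque::with_capacity(BUFFER_CAP) }
    }

    /// Pushes a message; returns true when the oldest entry was evicted to make room.
    fn push(&mut self, msg: PeerMessageSketch) -> bool {
        let evicted = self.queue.len() == BUFFER_CAP;
        if evicted {
            self.queue.pop_front();
        }
        self.queue.push_back(msg);
        evicted
    }

    /// Empties the buffer in arrival order, as done at turn start.
    fn drain(&mut self) -> Vec<PeerMessageSketch> {
        self.queue.drain(..).collect()
    }
}

/// Renders the system block; `None` when there is nothing to show.
fn render_block_sketch(messages: &[PeerMessageSketch]) -> Option<String> {
    if messages.is_empty() {
        return None;
    }
    let mut out = String::from("# PEER MESSAGES\n");
    for m in messages {
        out.push_str(&format!("\n<peer-message from=\"{}\" sent_at=\"{}\"", m.from, m.sent_at));
        if let Some(id) = &m.correlation_id {
            out.push_str(&format!(" correlation_id=\"{id}\""));
        }
        out.push_str(&format!(">\n{}\n</peer-message>\n", m.body));
    }
    Some(out)
}
```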

Buffer-on-demand

Messages addressed to a goal that hasn't register()'d yet queue in a fresh buffer. When the goal eventually registers, it sees the buffered messages — race-safe under fast-spawn-then-immediate-send.

Wiring

use nexo_core::agent::inbox_router::{InboxRouter, render_peer_messages_block};

// Boot — single per-process spawn.
let router = InboxRouter::new(broker.clone());
let _handle = router.spawn(cancel.clone());

// Per-goal startup.
let buf = router.register(goal_id);
ctx.inbox_buffer = Some(buf);

// Per-turn loop, adjacent to assistant addendum push site.
let drained = ctx.inbox_buffer.as_ref().map(|b| b.drain()).unwrap_or_default();
if let Some(block) = render_peer_messages_block(&drained) {
    channel_meta_parts.push(block);
}

// Goal terminal.
router.forget(goal_id);

Defense-in-depth

Edge case	Behaviour
Self-send	Rejected by handler
Body > 64 KB	Rejected with explicit limit in error
Empty body	Rejected
Unknown agent_id	`unreachable_reasons: ["unknown agent_id ..."]`
No live goals for peer	`unreachable_reasons: ["no live goals ..."]`
Per-goal publish fails	Recorded in `unreachable_reasons`, others continue
Buffer full (64 messages)	FIFO eviction with `tracing::warn!`
Subject malformed	Dropped with `tracing::debug!`
Payload garbage	Dropped with `tracing::debug!`
Race: peer terminates between `list_peers` and `send_to_peer`	Reported through `unreachable_reasons`; no panic

Deferred follow-ups

80.11.b.b — Hook InboxRouter drain + render into the per-turn loop in llm_behavior.rs (1-line snippet).
80.11.b.c — main.rs router spawn + per-goal register / forget on goal lifecycle hooks.
80.11.c — Broadcast to: "*" with cap (linear in team size, marked expensive in tool description).
80.11.d — Cross-machine inbox via NATS cluster (works automatically with NATS, documents the operator's broker config requirement).
80.11.e — Bridge protocol responses (shutdown_request / plan_approval_request JSON shapes — niche, defer).
80.11.f — main.rs tool registration wire.

Coordinator mode (Phase 84)

Phase 77.18 introduced role: coordinator | worker as a binding flag and gated the team-coordination tool surface behind it. Phase 84 closes the gap that remained: until 84.1 shipped, a coordinator binding only saw the tools — it ran the same system prompt as any other binding and treated worker results as opaque chat fragments.

What 84.1 ships

When a binding's resolved BindingRole is Coordinator, the runtime prepends a purpose-built coordinator persona block ahead of the agent's existing system prompt. The block is deterministic (same inputs → same bytes) so prompt-cache prefix matching stays warm across turns.

Order of the rendered system prompt:

# COORDINATOR ROLE
{persona block — sections below}

{agent.system_prompt}

# CHANNEL ADDENDUM
{binding.system_prompt_extra — when set}

Worker, Proactive, and absent role bindings are byte-identical to today; the persona prefix only kicks in for Coordinator.
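
The ordering can be sketched as plain string assembly. `assemble_system_prompt` is a hypothetical function; it assumes non-coordinator bindings still get the channel addendum appended, and that blank lines separate the sections as in the layout above.

```rust
/// Deterministic: the same inputs always produce the same bytes.
fn assemble_system_prompt(
    is_coordinator: bool,
    persona_block: &str,
    agent_prompt: &str,
    channel_addendum: Option<&str>,
) -> String {
    let mut out = String::new();
    // The persona prefix is only emitted for coordinator bindings.
    if is_coordinator {
        out.push_str("# COORDINATOR ROLE\n");
        out.push_str(persona_block);
        out.push_str("\n\n");
    }
    out.push_str(agent_prompt);
    if let Some(extra) = channel_addendum {
        out.push_str("\n\n# CHANNEL ADDENDUM\n");
        out.push_str(extra);
    }
    out
}
```

Placing the stable persona block first means the longest shared prefix stays identical across turns, which is what keeps prompt-cache prefix matching effective.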

Persona block sections

Role declaration — frames the agent as a coordinator: directs workers, synthesizes results, communicates with the user.
Tools available to you — the binding's allowed_tools filtered to the curated coordinator surface (TeamCreate, TeamDelete, SendToPeer, ListPeers, SendMessageToWorker, TaskStop, TaskList, TaskGet, TodoWrite). Tools the binding doesn't surface drop out.
Worker result envelope — instruction to treat the <task-notification> XML envelope (Phase 84.2) as a system event, never as a user message.
Continue-vs-spawn matrix — decision table guiding when to reuse a finished worker (SendMessageToWorker, Phase 84.3) vs. spawn fresh (TeamCreate, Phase 79.6) vs. message a live peer (SendToPeer, Phase 80.11).
Synthesis discipline — coordinator must produce implementation specs with file paths and line numbers. Anti-pattern: "based on your findings, fix the bug" (delegates understanding back to the worker).
Verification rigor — real verification (run failing case first, apply fix, run again, confirm broader suite, read the diff). "The build passed" is not verification.
Parallelism — independent work fans out via concurrent tool calls in a single assistant message.
Scratchpad (optional) — appears when the binding has TodoWrite (Phase 79.4) in allowed_tools. Mandatory for 3+ workers.
Known workers (optional) — only rendered when the CoordinatorPromptCtx.workers slice is populated; the boot path passes empty by default since peer discovery is dynamic.

Configuring a coordinator binding

agents:
  ana:
    inbound_bindings:
      - plugin: whatsapp
        instance: ana_main
        role: coordinator
        allowed_tools:
          - TeamCreate
          - TeamDelete
          - SendToPeer
          - ListPeers
          - SendMessageToWorker     # Phase 84.3
          - TaskStop
          - TodoWrite               # enables Scratchpad section

The allowed_tools list shapes both the runtime tool surface (Phase 16) and the persona block's tool list — keeping the two in sync without a parallel config.

Continue-vs-spawn matrix

Situation	Action
Worker finished; new ask builds on its loaded context	Continue (`SendMessageToWorker`, Phase 84.3)
New work has no overlap with any finished worker	Spawn fresh (`TeamCreate`)
Two unrelated streams of work	Spawn in parallel — one assistant message, both calls
Worker still in_progress; want to nudge	Send to peer (`SendToPeer`)
Worker silent past budget	`TaskStop`, then decide spawn-vs-continue from partial

Default to continue when the new ask shares >50% of the prior worker's read files / search terms. Default to spawn when in a different subsystem.
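
The overlap rule is a heuristic for the LLM rather than code in the runtime, but it can be made concrete with a small sketch. The text does not define the denominator; this version assumes it is the new ask's own set of files and search terms, and `continue_or_spawn` is a hypothetical name.

```rust
use std::collections::HashSet;

#[derive(Debug, PartialEq)]
enum NextStep {
    Continue,
    SpawnFresh,
}

/// Continue when strictly more than half of the new ask's items
/// were already read or searched by the prior worker.
fn continue_or_spawn<'a>(prior: &HashSet<&'a str>, new_ask: &HashSet<&'a str>) -> NextStep {
    if new_ask.is_empty() {
        return NextStep::SpawnFresh;
    }
    let shared = new_ask.intersection(prior).count();
    if shared * 2 > new_ask.len() {
        NextStep::Continue
    } else {
        NextStep::SpawnFresh
    }
}
```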

`<task-notification>` envelope (Phase 84.2)

Worker results arrive in the coordinator's session wrapped in a <task-notification> XML block. The block carries task-id, status, summary, optional result, and optional usage:

<task-notification>
<task-id>goal-9f3a</task-id>
<status>completed</status>
<summary>Found 3 candidate fixes</summary>
<result>See `crates/auth.rs:142`.</result>
<usage>
<total_tokens>1280</total_tokens>
<tool_uses>4</tool_uses>
<duration_ms>12400</duration_ms>
</usage>
</task-notification>

status is one of completed | failed | killed | timeout. Optional elements (<result>, <usage>) collapse out when the producer has no value.

Treat these blocks as system events, not user messages. The persona prompt explicitly forbids <thank> / <acknowledge> responses to a notification — read it, factor into synthesis, and either continue the worker (84.3), spawn the next one, or report to the user.

Producer surface lives in nexo-fork::fork_handle:

let n = fork_result.to_task_notification(task_id, summary, duration_ms);
let xml = n.to_xml(); // injected into the coordinator's next user turn

fork_error_to_task_notification(err, task_id, duration_ms) covers the failure paths (ForkError::Aborted → killed, ForkError::Timeout → timeout, others → failed).
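
To make the envelope shape and the failure mapping concrete, here is a minimal standalone sketch. It does not use the real nexo-fork types: `Status`, `ForkErrorSketch`, and `NotificationSketch` are local stand-ins, the `<usage>` element is left out, and no XML escaping is performed (whether the real producer escapes content is not stated here).

```rust
/// Terminal states listed for the `<status>` element.
#[derive(Debug, Clone, Copy, PartialEq)]
enum Status {
    Completed,
    Failed,
    Killed,
    Timeout,
}

impl Status {
    fn as_str(self) -> &'static str {
        match self {
            Status::Completed => "completed",
            Status::Failed => "failed",
            Status::Killed => "killed",
            Status::Timeout => "timeout",
        }
    }
}

/// Local stand-in for the fork error type; only `Aborted` and `Timeout`
/// are named in the text, everything else is folded into `Other`.
#[derive(Debug)]
enum ForkErrorSketch {
    Aborted,
    Timeout,
    Other(String),
}

/// Failure mapping: Aborted -> killed, Timeout -> timeout, others -> failed.
fn status_for_error(err: &ForkErrorSketch) -> Status {
    match err {
        ForkErrorSketch::Aborted => Status::Killed,
        ForkErrorSketch::Timeout => Status::Timeout,
        ForkErrorSketch::Other(_) => Status::Failed,
    }
}

/// Hypothetical stand-in for the notification type.
struct NotificationSketch {
    task_id: String,
    status: Status,
    summary: String,
    result: Option<String>,
}

impl NotificationSketch {
    /// Render the envelope; `<result>` is omitted entirely when absent.
    fn to_xml(&self) -> String {
        let mut out = String::from("<task-notification>\n");
        out.push_str(&format!("<task-id>{}</task-id>\n", self.task_id));
        out.push_str(&format!("<status>{}</status>\n", self.status.as_str()));
        out.push_str(&format!("<summary>{}</summary>\n", self.summary));
        if let Some(result) = &self.result {
            out.push_str(&format!("<result>{}</result>\n", result));
        }
        out.push_str("</task-notification>");
        out
    }
}
```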

Consumer wiring (the producer-to-LLM-context bridge) lands with 84.3 — that's where the fork-pass + TeamCreate completion paths actually exist. The 84.2 work pre-builds the type + producer helpers so 84.3 has one canonical path.

Backwards compatibility: TaskNotification::parse_block(text) returns None when the input lacks the envelope, so legacy consumers that read raw final text keep working during the rollout.

`SendMessageToWorker` continuation tool (Phase 84.3)

The coordinator can re-engage a finished worker by appending a new user turn to its loaded session context. Distinct from SendToPeer (peer-to-peer messaging to a live agent) and TeamCreate (spawn fresh worker with empty context).

{
  "tool": "SendMessageToWorker",
  "args": {
    "worker_id": "w-research",         // task-id from prior <task-notification>
    "message": "Continue: investigate the token-expiry boundary at auth.rs:142"
  }
}

Response shape

Outcome	`kind`	Notes
Worker exists, finished, this binding spawned it	`Continued`	Returns `worker_id`, `prior_status`, `messages_count`, `pipeline_pending: true`
`worker_id` matches no registry entry in this binding	`UnknownWorker`	Same error returned for cross-binding probes (defense-in-depth — no existence oracle)
Worker exists but is `Running`	`WorkerStillRunning`	Use `SendToPeer` for live peer messaging
Message > 32 KiB	`MessageTooLarge`	Hard cap
Binding has no resolved channel/account	`BindingUnresolved`	Synthesised policies (delegation, heartbeat) refuse cleanly
Binding role isn't coordinator	`RoleRefused`	Defense-in-depth — even if `allowed_tools: ["*"]`

Cross-binding isolation

The registry keys workers by (coordinator_binding_key, worker_id), where coordinator_binding_key comes from EffectiveBindingPolicy.binding_id(), the canonical <channel>:<account_id|"default"> render. A worker registered under binding A is invisible to binding B: the lookup reports the same UnknownWorker outcome as a plain miss rather than a distinct wrong-binding error, so binding B can't enumerate binding A's worker ids.
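
The isolation property falls out of the composite key. The following sketch illustrates it with a plain `HashMap`; `WorkerRegistrySketch`, `Lookup`, and `binding_key` are hypothetical names, and the real registry presumably tracks more state than a two-value enum.

```rust
use std::collections::HashMap;

#[derive(Debug, Clone, Copy, PartialEq)]
enum WorkerState {
    Running,
    Finished,
}

#[derive(Debug, PartialEq)]
enum Lookup {
    Continued,
    WorkerStillRunning,
    UnknownWorker,
}

/// Hypothetical registry keyed by (coordinator binding key, worker id).
#[derive(Default)]
struct WorkerRegistrySketch {
    workers: HashMap<(String, String), WorkerState>,
}

impl WorkerRegistrySketch {
    fn register(&mut self, binding: &str, worker_id: &str, state: WorkerState) {
        self.workers
            .insert((binding.to_string(), worker_id.to_string()), state);
    }

    /// A plain miss and a cross-binding probe take the same `None` branch,
    /// so the caller cannot tell them apart.
    fn lookup(&self, binding: &str, worker_id: &str) -> Lookup {
        match self.workers.get(&(binding.to_string(), worker_id.to_string())) {
            Some(WorkerState::Finished) => Lookup::Continued,
            Some(WorkerState::Running) => Lookup::WorkerStillRunning,
            None => Lookup::UnknownWorker,
        }
    }
}

/// Render the `<channel>:<account_id|"default">` key format.
fn binding_key(channel: &str, account_id: Option<&str>) -> String {
    format!("{}:{}", channel, account_id.unwrap_or("default"))
}
```

Because the binding key is part of the map key, there is no code path that could return a "wrong binding" answer by accident.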

Pipeline pending

The 84.3 sub-phase ships the type, the registry, the tool, and all four spec error scenarios. The actual transcript-resume execution (loading the worker's prior messages, appending the new user turn, running another fork loop, emitting a fresh <task-notification> on completion) is deferred to the fork-as-tool spawn pipeline that lives outside this sub-phase. Today the success path returns pipeline_pending: true so a coordinator can verify the request was accepted; the resume itself wires up alongside the worker-spawn pipeline.

Composition with other phases

Phase	Composition
16 — `EffectiveBindingPolicy`	Persona prepend runs inside `resolve()` after `allowed_tools` is computed so the tool list reflects the effective binding surface
77.18 — `BindingRole`	The role string is parsed via `BindingRole::from_role_str`; only `Coordinator` triggers the prepend
79.4 — `TodoWrite`	When present in `allowed_tools`, the Scratchpad section is rendered
79.6 — `TeamCreate` / `TeamDelete`	Listed in the persona's tool section when on the binding's surface
80.11 — `SendToPeer` / `ListPeers`	Same
84.2 — `<task-notification>` envelope	The persona instructs the agent to treat these blocks as system events
84.3 — `SendMessageToWorker`	Listed in the persona's tool section; the continue-vs-spawn matrix references it

Inspecting the rendered prompt

For verification on a configured agent, the prompt is visible in the runtime EffectiveBindingPolicy::resolve(...).system_prompt field. A cargo test -p nexo-core agent::effective::tests::coordinator run exercises the full path including a YAML-fixture smoke test.

Worker mode (Phase 84.4)

Complement to coordinator mode. Bindings with role: worker get a worker-specific system prompt block prepended to the agent's existing system_prompt. Workers run self-contained tasks dispatched by a coordinator; the persona steers them away from user-facing dialogue and toward terse, verified, on-spec output.

What 84.4 ships

When a binding's resolved BindingRole is Worker, the runtime prepends the worker persona block. Coordinator bindings get the coordinator block instead; Proactive and absent role are byte-identical to today.

# WORKER ROLE
{persona block — sections below}

{agent.system_prompt}

# CHANNEL ADDENDUM
{binding.system_prompt_extra — when set}

Persona block sections

Role declaration — frames the agent as an executor: do the work, report results, do not initiate user-facing dialogue. Scope questions go back to the coordinator via the final answer ("blocked: need X").
Output discipline — the final answer is read by another agent, not a human. Optimize for parseability:
- Code work: file path + line range + actual diff (or commit hash).
- Research: bullet list with file_path:line references.
- Failures: actual error output verbatim, not paraphrased.
Self-verification — typecheck + test + read the diff before reporting done. False "done" reports poison the synthesis above.
Tools available to you — the binding's allowed_tools list verbatim (workers see exactly the surface the operator granted, no curated subset). Followed by an explicit reminder that TeamCreate, SendToPeer, SendMessageToWorker, and TaskStop are not worker tools — wanting one means the task scope is too large.
Scratchpad (optional) — appears when TodoWrite is in allowed_tools. Worker scratchpad is for the worker's own multi-step state, not for cross-worker coordination.

Configuring a worker binding

agents:
  ana-worker:
    inbound_bindings:
      - plugin: whatsapp
        instance: ana_worker
        role: worker
        allowed_tools:
          - BashTool
          - FileEdit
          - WebFetch
          - TodoWrite

Composition with other phases

Phase	Composition
16 — `EffectiveBindingPolicy`	Worker prepend runs inside `resolve()` after `allowed_tools` is computed
77.18 — `BindingRole::Worker`	The role string parses to `BindingRole::Worker`; only that triggers the prepend
79.4 — `TodoWrite`	Scratchpad section appears when `TodoWrite` is on the binding's surface
84.1 — coordinator persona	The two persona builders are sister modules; the boot-path matcher in `apply_persona_prefix` dispatches on role
84.2 — `<task-notification>` envelope	A worker's final assistant text becomes the `result` field of the envelope when the spawn pipeline ships
84.3 — `SendMessageToWorker`	Workers don't call SendMessageToWorker (it's coordinator-only), but their session is what the tool resumes

Inspecting the rendered prompt

The full path is exercised in cargo test -p nexo-core agent::effective::tests::worker_role_loaded_from_yaml_renders_persona_block, which deserializes a YAML binding fixture and asserts the # WORKER ROLE prefix lands ahead of the agent's own prompt with the expected sections.

Proactive Mode (Phase 77.20)

Proactive mode lets an agent run autonomously between user messages. Instead of waiting for a new inbound event, the runtime injects periodic <tick> prompts and the model decides whether to do work now or call Sleep { duration_ms, reason }.

Configuration

Enable at agent level or per binding (inbound_bindings[].proactive):

proactive:
  enabled: true
  tick_interval_secs: 600
  jitter_pct: 25
  max_idle_secs: 86400
  initial_greeting: true
  cache_aware_schedule: true
  allow_short_intervals: false
  daily_turn_budget: 200

Per-binding override replaces the full proactive block for that binding.

Sleep Tool

Sleep is the canonical way to wait in proactive mode. Do not use shell sleep for this.

Bounds: duration_ms is clamped to [60_000, 86_400_000].
Wake-up: runtime injects a synthetic <tick> with elapsed time + reason.
Interrupt: real inbound user messages cancel pending sleep immediately.

Inbound Queue Priority

Inbound events can optionally carry priority in payload:

now — highest priority (urgent interrupt)
next — default priority (normal user input)
later — deferred background notifications

When multiple messages are batched in the same debounce window, runtime processes them in now > next > later order, preserving FIFO within each priority class. now also bypasses debounce delay and flushes immediately. If now arrives during an in-flight turn, runtime preempts that turn and runs the now message first.
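
The batch ordering can be sketched with a stable sort over a three-value enum. This covers only the in-batch ordering, not the debounce bypass or turn preemption; `Priority`, `Inbound`, `parse_priority`, and `order_batch` are illustrative names, and treating an unrecognised value like a missing one is an assumption.

```rust
/// Variant order defines the sort order: now, then next, then later.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
enum Priority {
    Now,
    Next,
    Later,
}

#[derive(Debug, Clone, PartialEq)]
struct Inbound {
    priority: Priority,
    text: String,
}

/// Parse the optional payload field. A missing value means `next`;
/// mapping unrecognised values to `next` as well is an assumption.
fn parse_priority(raw: Option<&str>) -> Priority {
    match raw {
        Some("now") => Priority::Now,
        Some("later") => Priority::Later,
        _ => Priority::Next,
    }
}

/// Order one debounce batch. `sort_by_key` is a stable sort, so arrival
/// order (FIFO) is preserved within each priority class.
fn order_batch(mut batch: Vec<Inbound>) -> Vec<Inbound> {
    batch.sort_by_key(|m| m.priority);
    batch
}
```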

Cache-Aware Scheduling

When cache_aware_schedule: true, runtime biases sleep duration to avoid the Anthropic cache dead-zone:

<= 270_000ms: keep as-is (cache warm window).
270_001..1_199_999ms: snap to 270_000 or 1_200_000 (nearest).
>= 1_200_000ms: keep as-is.
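
As a worked example, the snapping rule combined with the Sleep bounds might look like the sketch below. Two points are assumptions because the text does not specify them: the clamp is applied before the snap, and an exact tie between the two edges resolves to the shorter value. The function and constant names are illustrative.

```rust
const MIN_SLEEP_MS: u64 = 60_000;
const MAX_SLEEP_MS: u64 = 86_400_000;
const CACHE_WARM_MS: u64 = 270_000;
const CACHE_COLD_MS: u64 = 1_200_000;

/// Clamp a requested sleep to the documented bounds.
fn clamp_sleep(duration_ms: u64) -> u64 {
    duration_ms.clamp(MIN_SLEEP_MS, MAX_SLEEP_MS)
}

/// Returns the adjusted duration and whether it was snapped.
fn cache_aware(duration_ms: u64) -> (u64, bool) {
    if duration_ms <= CACHE_WARM_MS || duration_ms >= CACHE_COLD_MS {
        return (duration_ms, false);
    }
    let to_warm = duration_ms - CACHE_WARM_MS;
    let to_cold = CACHE_COLD_MS - duration_ms;
    // Assumption: ties go to the shorter (warm) edge.
    if to_warm <= to_cold {
        (CACHE_WARM_MS, true)
    } else {
        (CACHE_COLD_MS, true)
    }
}

/// Assumed composition: clamp first, then snap.
fn effective_sleep(duration_ms: u64) -> (u64, bool) {
    cache_aware(clamp_sleep(duration_ms))
}
```

Under these assumptions a request for 300_000 ms becomes 270_000 ms, while a request for 1_000_000 ms is pushed out to 1_200_000 ms.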

Daily Tick Budget

daily_turn_budget limits proactive tick-driven turns per 24h window.

0 means unlimited.
When exhausted, wake-ups are suppressed and re-armed using the effective tick interval.

This prevents runaway autonomous loops from burning quota.

Telemetry

Prometheus counter:

nexo_proactive_events_total{agent,event}

Events:

tick.fired
sleep.entered
sleep.interrupted
cache_aware.snapped

Relation to `agent_turn` Poller

Phase 20 agent_turn is cron-driven external scheduling. Proactive mode is model-driven self-pacing inside a live goal. They are complementary and can coexist across different bindings.

Compact tiers

Context compaction and memory extraction in Nexo currently have four tiers:

Tier 1: micro compact (inline tool-result shrink)

Reduces oversized tool_result payloads before request send, keeping tool_use_id correlation stable while replacing bulky content with a compact marker (or provider-summary path when configured).

Operational intent:

protect prompt budget from one-off large tool outputs
preserve turn continuity without rewriting full history

Tier 2: auto compact (history folding — Phase 67.9 + 77.2)

When token pressure crosses configured thresholds or session age expires, runtime folds older history into a compact summary while preserving the hot tail.

Two independent triggers (Phase 77.2):

Token-pressure trigger

Fires when estimated_tokens / context_window >= token_pct. The default is 0.80 when the auto block is present; when it is absent, the legacy threshold (0.70) applies.

Age trigger

Fires when session_age_minutes >= max_age_minutes (default 120). Disabled when auto block is absent or max_age_minutes: 0.

Guards

Anti-storm: at least min_turns_between (default 5) turns must elapse between consecutive compactions.
Circuit breaker: after max_consecutive_failures (default 3) consecutive compaction failures, the policy stops requesting compacts for the remainder of the goal. A successful compact resets the counter.
Buffer tokens: buffer_tokens (default 13000) is the safety margin kept below the effective context window.
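
Putting the two triggers and the first two guards together, a decision function could look roughly like this. It is a sketch under stated assumptions: the guards are checked before the triggers, token pressure wins when both triggers fire, and buffer_tokens is not modelled. `AutoPolicy`, `SessionState`, `Trigger`, and `should_compact` are hypothetical names.

```rust
/// Thresholds from the `auto` block; `Default` mirrors the documented defaults.
struct AutoPolicy {
    token_pct: f64,
    max_age_minutes: u64,
    min_turns_between: u32,
    max_consecutive_failures: u32,
}

impl Default for AutoPolicy {
    fn default() -> Self {
        AutoPolicy {
            token_pct: 0.80,
            max_age_minutes: 120,
            min_turns_between: 5,
            max_consecutive_failures: 3,
        }
    }
}

/// Hypothetical per-goal snapshot fed to the policy.
struct SessionState {
    estimated_tokens: u64,
    context_window: u64,
    session_age_minutes: u64,
    turns_since_last_compact: u32,
    consecutive_failures: u32,
}

#[derive(Debug, PartialEq)]
enum Trigger {
    TokenPressure,
    Age,
}

fn should_compact(policy: &AutoPolicy, s: &SessionState) -> Option<Trigger> {
    // Circuit breaker: stop asking after too many consecutive failures.
    if s.consecutive_failures >= policy.max_consecutive_failures {
        return None;
    }
    // Anti-storm: enforce the minimum gap between compactions.
    if s.turns_since_last_compact < policy.min_turns_between {
        return None;
    }
    let pressure = if s.context_window == 0 {
        0.0
    } else {
        s.estimated_tokens as f64 / s.context_window as f64
    };
    if pressure >= policy.token_pct {
        return Some(Trigger::TokenPressure);
    }
    // `max_age_minutes: 0` disables the age trigger.
    if policy.max_age_minutes > 0 && s.session_age_minutes >= policy.max_age_minutes {
        return Some(Trigger::Age);
    }
    None
}
```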

Operational intent:

keep long-running sessions inside context window
age-based trigger catches memory pressure from accumulated tool outputs even when estimated tokens are low
reduce repeated cost of stale historical turns

Events

Event	Subject	When
`CompactRequested`	`agent.driver.compact`	Policy classifies and schedules a compact turn
`CompactCompleted`	`agent.driver.compact.completed`	Turn after compact, with `after_tokens`

Tier 3: session memory compact (Phase 77.3)

Persists compact summaries to long-term memory so resumed sessions can inject the last compact summary into the prompt without re-executing elided turns.

Operational intent:

survive daemon restart without losing compaction progress
feed prior summary into resumed goal's first-turn prompt
avoid redundant re-compaction of the same history

How it works

After a successful compact turn, the orchestrator extracts the LLM-generated summary from result.final_text.
Summary is persisted via LongTermMemory::remember() with tag compact_summary and goal_id embedded in the content for FTS5 recall.
On goal resume (daemon restart), load() retrieves the most recent summary and injects it into next_extras as compact_summary.
PostCompactCleanup runs after persistence (no-op placeholder for 77.5+ extractMemories integration).

Events

Event	Subject	When
`CompactSummaryStored`	`agent.driver.compact.summary_stored`	Summary persisted to LTM

Config

compact_policy:
  sm_compact:                  # Phase 77.3 (optional)
    min_tokens: 10000          # min tokens before store (default 10000)
    max_tokens: 40000          # max tokens per summary (default 40000)
    store_in_long_term_memory: true  # default true

sm_compact defaults to None — set it to enable session-memory persistence. store_in_long_term_memory: false uses the noop store for testing.

Tier 4: extractMemories (post-turn LLM extraction — Phase 77.5)

After every N eligible turns, a small LLM call reads the recent conversation transcript and writes durable memories to the persistent memory directory (~/.claude/projects/<path>/memory/*.md + MEMORY.md).

Four-type taxonomy (user / feedback / project / reference) with an explicit exclusion list (code patterns, git history, debug recipes, CLAUDE.md contents, ephemeral task details). Extraction is single-turn: the existing memory manifest is pre-injected into the system prompt so the LLM can decide what to update without file-system exploration. Response is parsed as a JSON array of {file_path, content} objects.

Operational intent:

complement Phase 10.6 dreaming (offline/recall-signal-based) with an inline/transcript-based path
keep the memory directory current without manual remember invocations
surface durable context to future sessions without re-reading full conversation history

Guards

Throttle: turns_throttle (default 1 = every turn; recommend 3+ in production to limit token cost).
Circuit breaker: after max_consecutive_failures (default 3) consecutive extraction failures, the breaker opens and extraction is skipped for the remainder of the goal.
Mutual exclusion: at most one extraction in-flight per goal. When a new turn arrives mid-extraction, its context is coalesced and runs as a single trailing extraction.
Main-agent write detection: extraction is skipped when the main agent already wrote to the memory directory this turn, avoiding clobbering intentional user-directed writes.
Path sandbox: file paths from the LLM are validated — absolute paths and .. traversal are rejected.
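
The path sandbox guard is the easiest of these to get subtly wrong, so a sketch may help. The version below uses only `std::path` and rejects any candidate with a root, a drive prefix, or a `..` component. `sandboxed_path` is a hypothetical helper name; the check is purely lexical and does not resolve symlinks, which a production guard may also need to consider.

```rust
use std::path::{Component, Path, PathBuf};

/// Resolve an LLM-supplied relative path inside `memory_dir`, or reject it.
fn sandboxed_path(memory_dir: &Path, candidate: &str) -> Option<PathBuf> {
    if candidate.is_empty() {
        return None;
    }
    let rel = Path::new(candidate);
    for part in rel.components() {
        match part {
            // Plain names and `.` stay inside the directory.
            Component::Normal(_) | Component::CurDir => {}
            // RootDir / Prefix mean an absolute path; ParentDir is `..`.
            Component::RootDir | Component::Prefix(_) | Component::ParentDir => {
                return None;
            }
        }
    }
    Some(memory_dir.join(rel))
}
```

Rejecting every `..` component, even one that would stay inside the directory after normalisation, keeps the rule simple to audit.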

Events

Event	Subject	When
`ExtractMemoriesCompleted`	`agent.driver.extract_memories.completed`	Extraction succeeded, N memories saved
`ExtractMemoriesSkipped`	`agent.driver.extract_memories.skipped`	Extraction skipped (disabled / throttled / in-progress / circuit-breaker / main-agent-wrote)

Config

compact_policy:
  extract_memories:            # Phase 77.5 (optional — default: disabled)
    enabled: true              # master switch (default false — opt-in)
    turns_throttle: 3          # run every N eligible turns (default 1)
    max_turns: 5               # max LLM turns per extraction (default 5)
    max_consecutive_failures: 3  # circuit breaker (default 3, 0=disabled)

extract_memories defaults to None — set it to enable post-turn extraction. The LLM backend is wired via the driver orchestrator's extract_memories() builder method; the binary crate supplies the LlmClient adapter.

Configuration surface

All tiers are controlled under llm.context_optimization.compaction in llm.yaml, with per-agent enable switches in agents.yaml.

Driver-side config (config/driver/claude.yaml):

compact_policy:
  enabled: true
  context_window: 200000      # model context window in tokens
  threshold: 0.7              # legacy token-pressure threshold (0.0-1.0)
  min_turns_between_compacts: 5
  auto:                       # Phase 77.2 (optional — age trigger disabled when absent)
    token_pct: 0.80           # token-pressure threshold (0.0-1.0, default 0.80)
    max_age_minutes: 120      # fire age trigger after 2 h (0 disables, default 120)
    buffer_tokens: 13000      # safety margin below context window (default 13000)
    min_turns_between: 5      # anti-storm gap (default 5)
    max_consecutive_failures: 3  # circuit breaker (default 3)
  sm_compact:                  # Phase 77.3 (optional)
    min_tokens: 10000
    max_tokens: 40000
    store_in_long_term_memory: true
  extract_memories:            # Phase 77.5 (optional — default disabled)
    enabled: true
    turns_throttle: 3
    max_turns: 5
    max_consecutive_failures: 3

Agent-side config (agents.yaml or per-binding llm.context_optimization.compaction):

compaction:
  enabled: true
  compact_at_pct: 0.7         # legacy threshold
  auto:                       # Phase 77.2
    token_pct: 0.80
    max_age_minutes: 120
    buffer_tokens: 13000
    min_turns_between: 5
    max_consecutive_failures: 3

Telemetry to watch

llm_compaction_triggered_total{agent,trigger,outcome} — trigger is token_pressure or age
llm_compaction_duration_seconds{agent,outcome}
agent_driver_compaction_requested_total{trigger}
agent_driver_compaction_completed_total{outcome}
agent_driver_compact_summary_stored_total
agent_driver_extract_memories_completed_total
agent_driver_extract_memories_skipped_total{reason}
prompt/token drift counters from token counter telemetry

Memdir scanner

Memdir scanner support is currently documented through the MCP server extension flow and OpenClaw-parity references.

Current status:

scanner-style memory path logic is referenced in docs/src/extensions/mcp-server.md (teamMemPaths parity notes)
there is no standalone operator CLI page yet for a dedicated memdir scan command

What operators should do today

Use the MCP server extension docs as the canonical path for memory directory layout and exposure behavior.
Rely on existing memory docs for storage/runtime semantics:
- Long-term memory (SQLite)
- Vector search
Track roadmap follow-ups in PHASES.md / FOLLOWUPS.md for an explicit scanner command surface.

Configuration — `memory.secret_guard` (C5)

The Phase 77.7 secret scanner blocks memory writes that contain API keys, tokens, or private keys. From C5 onwards, operators control its behaviour via the memory.secret_guard block in config/memory.yaml:

memory:
  short_term: { ... }
  long_term:  { ... }

  # C5 — secret-scanner policy (provider-agnostic).
  # Omit the entire block for the secure default (enabled=true,
  # on_secret=block, rules=all, exclude_rules=[]).
  secret_guard:
    enabled: true              # master switch (default true)
    on_secret: block           # block | redact | warn (default block)
    rules: all                 # "all" or a list of rule IDs
    exclude_rules: []          # list of rule IDs to skip (default empty)

Field	Type	Default	Effect
`enabled`	bool	`true`	Master switch. `false` makes every check a no-op.
`on_secret`	`block` \| `redact` \| `warn`	`block`	What to do on detection. `block` returns an error and the write is refused; `redact` replaces matched secrets with `[REDACTED:rule_id]` and writes; `warn` writes intact and emits a warn log + event.
`rules`	`"all"` or `[rule_id, ...]`	`"all"`	Which rules to apply. List form selects only the named rules.
`exclude_rules`	`[rule_id, ...]`	`[]`	Rule IDs to silence (false positives).

YAML typos (on_secret: deny, malformed rules, etc.) fail loudly at boot, never silently.
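
To illustrate how the three on_secret modes differ, the sketch below applies a policy to detections that some scanner has already produced. The detection itself (the regex rule set) is out of scope, as are enabled, rules, and exclude_rules. `Finding`, `WriteOutcome`, `parse_on_secret`, and `apply` are hypothetical names, and the findings are assumed to be valid, non-overlapping byte ranges.

```rust
use std::cmp::Reverse;

#[derive(Debug, Clone, Copy, PartialEq)]
enum OnSecret {
    Block,
    Redact,
    Warn,
}

/// One detection from an upstream scanner (hypothetical shape).
struct Finding<'a> {
    start: usize,
    end: usize,
    rule_id: &'a str,
}

#[derive(Debug, PartialEq)]
enum WriteOutcome {
    Refused,
    Written(String),
    WrittenWithWarning(String),
}

/// Unknown values are an error, mirroring the fail-at-boot behaviour.
fn parse_on_secret(raw: &str) -> Result<OnSecret, String> {
    match raw {
        "block" => Ok(OnSecret::Block),
        "redact" => Ok(OnSecret::Redact),
        "warn" => Ok(OnSecret::Warn),
        other => Err(format!("invalid on_secret value: {}", other)),
    }
}

fn apply(policy: OnSecret, content: &str, findings: &[Finding]) -> WriteOutcome {
    if findings.is_empty() {
        return WriteOutcome::Written(content.to_string());
    }
    match policy {
        OnSecret::Block => WriteOutcome::Refused,
        OnSecret::Warn => WriteOutcome::WrittenWithWarning(content.to_string()),
        OnSecret::Redact => {
            // Replace from the end so earlier byte offsets stay valid.
            let mut ordered: Vec<&Finding> = findings.iter().collect();
            ordered.sort_by_key(|f| Reverse(f.start));
            let mut out = content.to_string();
            for f in ordered {
                out.replace_range(f.start..f.end, &format!("[REDACTED:{}]", f.rule_id));
            }
            WriteOutcome::Written(out)
        }
    }
}
```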

Provider-agnostic

The scanner detects API keys for every supported LLM provider (Anthropic, MiniMax, OpenAI, Gemini, DeepSeek, xAI, Mistral) using the same regex set. exclude_rules operates on rule IDs (kebab-case like github-pat, aws-access-token, openai-api-key), not on providers — silencing one rule narrows by pattern shape, not by LLM-provider identity.

Common operator workflows

Switch from block to warn (e.g. dev environment debugging):
```
secret_guard: { on_secret: warn }
```
Memory writes are not refused; the daemon logs every detection for review.

Suppress a known false positive:
```
secret_guard:
  exclude_rules: [github-pat]
```
All other 35 rules stay active.

Hard-disable for an isolated test (NOT recommended in production):
```
secret_guard: { enabled: false }
```

Prior art (validated, not copied)

upstream agent CLI, 596-615,312-324 — hardcoded scanner with no YAML knob; activation via build flag (feature('TEAMMEM')) only. Operator override impossible without recompile. We adopt a richer operator-facing config rather than the hardcoded model.
research/src/config/zod-schema.ts — OpenClaw uses 2-value enums (redactSensitive: off|tools, mode: enforce|warn). We extend to 3 (block|redact|warn) for richer behaviour without forcing operators to choose between block and disabled.

Bash safety knobs

Nexo's Bash safety model is layered. Even when the Bash tool is available, execution is constrained by policy and runtime gates.

Main safety layers

Per-binding tool allowlist
- allowed_tools can remove Bash entirely for selected channels.
Plan mode gating
- mutating paths are blocked until explicit exit/approval workflow.
Destructive-intent integration
- plan-mode policy can auto-enter on destructive command detection.
Worker-role curation
- worker bindings run a constrained tool surface by default.

Relevant config knobs

agents[].allowed_tools
agents[].inbound_bindings[].allowed_tools
agents[].plan_mode.*
agents[].inbound_bindings[].plan_mode.*
agents[].inbound_bindings[].role

Operational guidance

For user-facing channels, prefer narrowing allowed_tools rather than trusting prompt-only behavior.
Keep plan mode enabled for coordinator bindings.
Use worker role for delegated execution to reduce blast radius.

Channel doctor

nexo channel is an operator CLI for debugging the MCP-channels surface without a running daemon. Three verbs:

nexo channel list   [--config=<path>] [--json]
nexo channel doctor [--config=<path>] [--binding=<id>] [--json]
nexo channel test   <server> [--binding=<id>] [--content=...]
                    [--config=<path>] [--json]

All three read from the operator's YAML directly. They never spin up the daemon, never connect to a live MCP server, and never publish on the broker. Safe to run on production configs from any operator workstation.

`nexo channel list`

Walks every agent and surfaces (enabled, approved_servers, bindings) per agent. When --json is passed the output is machine-readable; otherwise the renderer groups by agent for human reading.

$ nexo channel list
## agent kate — channels.ENABLED (2 approved)
  approved: slack
  approved: telegram
  binding telegram:kate_tg: 2 server(s) — slack, telegram

When an agent has no channels.approved entries, the (no approved servers) placeholder makes the gap obvious. When no binding lists allowed_channel_servers, (no binding has allowed_channel_servers) highlights that the configuration is incomplete.

`nexo channel doctor`

Runs the static half of the 5-step gate against every (agent, binding, server) triple in the YAML. The doctor cannot probe a live MCP server, so gate 1 (capability declared) is assumed true; gates 2/3/5 run normally; gate 4 (plugin source) reads from the approved entry. Each row carries one of three outcomes:

WOULD REGISTER — every static gate passes; the only thing the live daemon will check is whether the server actually declares the capability.
SKIP { kind, reason } — typed reason. disabled = channels.enabled: false. session = binding doesn't list the server. marketplace = plugin_source mismatch. allowlist = server isn't in approved.
NOT BOUND — the server appears in approved but no binding lists it. Surfaces a half-configured state where the operator vetted the server but forgot to bind it.

Filter to one binding with --binding=<plugin>:<instance>. The binding id format mirrors what the runtime registers — the same string that shows up in agent logs.

$ nexo channel doctor --binding=telegram:kate_tg
| Agent | Binding            | Server   | Outcome        | Skip       | Reason |
|-------|--------------------|----------|----------------|------------|--------|
| kate  | telegram:kate_tg   | slack    | WOULD REGISTER | -          | all static gates pass; live runtime must declare the capability |
| kate  | telegram:kate_tg   | telegram | WOULD REGISTER | -          | all static gates pass; live runtime must declare the capability |

`nexo channel test`

Synthesises a notifications/nexo/channel payload (with sample chat_id and user meta) and runs it through parse_channel_notification + wrap_channel_message. Prints the model-facing <channel> block plus the derived session_key. Cheap dry-run for tuning meta-key whitelists or verifying content-cap behaviour.

$ nexo channel test slack
# Channel test — server=slack

session_key: slack|chat_id=C_TEST

--- rendered XML (model-facing) ---
<channel source="slack" chat_id="C_TEST" user="operator">
hello from slack — channel test payload
</channel>

Override the body with --content="..." to test how the content cap (agents.channels.max_content_chars) clips long payloads. The output flags [content truncated by max_content_chars] when the cap fired.

When to use which

Setting up channels for the first time → list to verify the YAML structure, then doctor to confirm the gate would let the binding register, then start the daemon.
A server stopped delivering messages → doctor to see if the gate would still register it. Common causes: channels.enabled flipped off; binding's allowed_channel_servers doesn't include the server (typo); approved entry got renamed.
Tuning meta-key whitelists / content caps → test <server> with various --content payloads.

Live-runtime checks

doctor is intentionally static. To check live state — what's actually registered in the running daemon — the agent calls channel_list / channel_status from inside a turn, or the operator inspects the mcp.channel.> NATS subjects directly. Live-runtime CLI is on the roadmap.

Webhook receiver

Inbound HTTP webhook surface for any third-party provider that signs payloads with HMAC-SHA256 / HMAC-SHA1 / a raw shared token and exposes the event kind in a header or JSON body field. Provider-agnostic by construction: declare sources in YAML, no Rust code change per provider.

Successful requests are published to a NATS subject; downstream pollers, agent turns, or microapps subscribe and react.

Quick start

# config/webhook_receiver.yaml
enabled: true
bind: "0.0.0.0:8081"
body_cap_bytes: 1048576
request_timeout_ms: 15000

# (optional) defense for floods — token-bucket per (source, ip).
default_rate_limit:
  rps: 10
  burst: 20

# (optional) max in-flight requests per source. 0 = unbounded.
default_concurrency_cap: 32

# (optional) honour `X-Forwarded-For` only when the socket peer
# is in one of these CIDR blocks.
trusted_proxies:
  - "10.0.0.0/8"
allow_realip_fallback: false

sources:
  - id: "github_main"
    path: "/webhooks/github"
    signature:
      algorithm: "hmac-sha256"
      header: "X-Hub-Signature-256"
      prefix: "sha256="
      secret_env: "WEBHOOK_GITHUB_MAIN_SECRET"
    publish_to: "webhook.github_main.${event_kind}"
    event_kind_from:
      kind: "header"
      name: "X-GitHub-Event"

    # (optional) per-source overrides
    rate_limit:
      rps: 20.0
      burst: 40
    concurrency_cap: 8

Set the secret in the environment before starting the daemon:

export WEBHOOK_GITHUB_MAIN_SECRET='your-shared-secret'

Pipeline

Every incoming request passes through six gates in order. A failure at any gate short-circuits the request with the status shown below; the envelope is published only when every gate passes.

Gate	Reject status	What it checks
1. Method	405	Only `POST <path>` matches the route.
2. Body cap	413	`tower_http::limit::RequestBodyLimitLayer` enforces per-source `body_cap_bytes`.
3. Concurrency	503 + `Retry-After: 1`	Per-source semaphore. `0` = unbounded.
4. Rate limit	429	Token bucket per `(source_id, client_ip)`. LRU-evicts at 4096 keys to defend against IP-flood OOM.
5. Signature	401 / 422 / 500	HMAC verify (constant-time) + event-kind extract from header or JSON body path. `500` only when `secret_env` is unset.
6. Dispatch	502 / 422	`BrokerWebhookDispatcher` publishes the envelope. `502` = broker unavailable; `422` = envelope serialise rejected.

Successful dispatch returns 204 No Content.
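
Gate 4 is a standard token bucket keyed by (source_id, client_ip). The sketch below shows the refill arithmetic with an explicit clock so it can be tested deterministically. It is not the receiver's implementation: `RateLimiterSketch` is a hypothetical name, time is passed in as milliseconds, and the LRU eviction at 4096 keys is omitted.

```rust
use std::collections::HashMap;

struct Bucket {
    tokens: f64,
    last_ms: u64,
}

/// Hypothetical limiter: one bucket per (source_id, client_ip).
struct RateLimiterSketch {
    rps: f64,
    burst: f64,
    buckets: HashMap<(String, String), Bucket>,
}

impl RateLimiterSketch {
    fn new(rps: f64, burst: u32) -> Self {
        RateLimiterSketch {
            rps,
            burst: f64::from(burst),
            buckets: HashMap::new(),
        }
    }

    /// Returns true when the request may proceed, false for a 429.
    fn allow(&mut self, source_id: &str, client_ip: &str, now_ms: u64) -> bool {
        let (rps, burst) = (self.rps, self.burst);
        let bucket = self
            .buckets
            .entry((source_id.to_string(), client_ip.to_string()))
            .or_insert(Bucket { tokens: burst, last_ms: now_ms });
        // Refill in proportion to elapsed time, capped at `burst`.
        let elapsed_s = now_ms.saturating_sub(bucket.last_ms) as f64 / 1000.0;
        bucket.tokens = (bucket.tokens + elapsed_s * rps).min(burst);
        bucket.last_ms = now_ms;
        if bucket.tokens >= 1.0 {
            bucket.tokens -= 1.0;
            true
        } else {
            false
        }
    }
}
```

With rps 10 and burst 20, a client can send 20 requests at once and then roughly one more every 100 ms.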

NATS envelope

The dispatcher publishes a typed WebhookEnvelope (JSON):

{
  "schema": 1,
  "source_id": "github_main",
  "event_kind": "pull_request",
  "body_json": { "action": "opened", "...": "..." },
  "headers_subset": {
    "x-github-delivery": "abc-123",
    "user-agent": "GitHub-Hookshot/..."
  },
  "received_at_ms": 1746147600000,
  "envelope_id": "0c4a...-uuid",
  "client_ip": "1.2.3.4"
}

Subscribers can filter on topic == "webhook.<source_id>.<event_kind>" or on the broker Event.source field (which doubles as source_id).

Headers forwarded vs stripped

Forwarding every header would leak Authorization / Cookie / the signature itself to NATS subscribers. The receiver allowlists just the non-secret correlation headers downstream consumers actually need:

x-github-delivery
x-stripe-event-id
x-event-id
x-request-id
idempotency-key
user-agent
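
A filter over this allowlist is short enough to show in full. The sketch assumes headers arrive as name/value string pairs and lowercases names before comparing, since HTTP header names are case-insensitive. `FORWARDED_HEADERS` and `headers_subset` are illustrative names.

```rust
use std::collections::BTreeMap;

/// The allowlist from the text, in lowercase.
const FORWARDED_HEADERS: [&str; 6] = [
    "x-github-delivery",
    "x-stripe-event-id",
    "x-event-id",
    "x-request-id",
    "idempotency-key",
    "user-agent",
];

/// Keep only allowlisted headers; everything else (Authorization, Cookie,
/// the signature header, ...) is dropped before publishing.
fn headers_subset(headers: &[(&str, &str)]) -> BTreeMap<String, String> {
    headers
        .iter()
        .map(|(name, value)| (name.to_ascii_lowercase(), value.to_string()))
        .filter(|(name, _)| FORWARDED_HEADERS.iter().any(|h| *h == name.as_str()))
        .collect()
}
```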

Operating behind a reverse proxy

If the daemon is behind nginx / Cloudflare / a load balancer:

Set trusted_proxies to the proxy's source CIDR.
Optionally enable allow_realip_fallback if your proxy uses X-Real-IP instead of X-Forwarded-For.

Untrusted peers always have their forwarded headers ignored — clients claiming to be a proxy from outside the trusted CIDR get their socket address used for rate-limit keying. This is the correct defensive posture; tighten trusted_proxies until only your real proxies fit.
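
The trust rule can be sketched as follows. This is an IPv4-only illustration with hypothetical names (`Cidr`, `client_ip`); picking the left-most X-Forwarded-For entry is an assumption, and the X-Real-IP fallback is not modelled.

```rust
use std::net::Ipv4Addr;

/// IPv4-only CIDR block for this sketch.
struct Cidr {
    network: u32,
    prefix_len: u8,
}

impl Cidr {
    fn parse(s: &str) -> Option<Cidr> {
        let (addr, len) = s.split_once('/')?;
        let addr: Ipv4Addr = addr.parse().ok()?;
        let prefix_len: u8 = len.parse().ok()?;
        if prefix_len > 32 {
            return None;
        }
        Some(Cidr { network: u32::from(addr), prefix_len })
    }

    fn contains(&self, ip: Ipv4Addr) -> bool {
        let mask = if self.prefix_len == 0 {
            0
        } else {
            u32::MAX << (32 - u32::from(self.prefix_len))
        };
        (u32::from(ip) & mask) == (self.network & mask)
    }
}

/// Pick the address used for rate-limit keying.
fn client_ip(peer: Ipv4Addr, forwarded_for: Option<&str>, trusted: &[Cidr]) -> Ipv4Addr {
    // Forwarded headers from untrusted peers are ignored outright.
    if !trusted.iter().any(|c| c.contains(peer)) {
        return peer;
    }
    // Assumption: use the left-most entry of X-Forwarded-For.
    forwarded_for
        .and_then(|h| h.split(',').next())
        .and_then(|first| first.trim().parse::<Ipv4Addr>().ok())
        .unwrap_or(peer)
}
```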

Reserved ports

8080 — health server (Kubernetes liveness)
9091 — admin server (loopback only)

The webhook bind address must not collide with either; validation rejects collisions at boot with a typed WebhookConfigError::ReservedBind.

Secret rotation

Secrets are read fresh per request via std::env::var — no caching. To rotate:

Set the new value in the environment.
Restart the daemon. The lookup runs on every request, but a running process only sees the environment it was started with, so a value changed outside the process is not picked up until restart.
Verify with a known-good signed request.

Troubleshooting

All requests 401: tracing::warn! shows signature mismatch. Re-check that the operator-side WEBHOOK_<SOURCE>_SECRET env matches what the provider signs with.
All requests 500: secret_env is unset. Check the environment for the configured variable name.
Bursts get 429s: tighten the provider's retry/backoff or raise default_rate_limit.burst. The token bucket allows bursts up to burst, then throttles to rps; design for steady-state load plus a margin.
Bursts get 503s: default_concurrency_cap reached. Raise the cap, or lower the per-source concurrency_cap for noisy sources to keep them from starving the rest.

Validation errors at boot

Error	Cause	Fix
`BodyCapZero`	`body_cap_bytes: 0`	Raise to a positive value (default 1 MiB).
`RequestTimeoutZero`	`request_timeout_ms: 0`	Raise to a positive value (default 15 000 ms).
`DuplicateId`	Two sources share an `id`.	Rename one.
`DuplicatePath`	Two sources share a `path`.	Pick distinct paths.
`ReservedBind`	`bind` port is 8080 or 9091.	Pick a free port.
`Source { id, detail }`	Per-source schema invalid.	Read `detail` — typically empty `path` or empty `secret_env`.
`DefaultRateLimit`	`rps` negative or > 1000.	Use a sane positive value.
`ConcurrencyCapZero`	Per-source `concurrency_cap: 0`	Use `null` to inherit the global cap.

Event subscribers

Per-agent NATS subject patterns that, when matched, fire an agent turn. Covers the gap between webhook receivers / pollers / microapps publishing events and the agent runtime consuming them. Provider-agnostic by construction.

Quick start

# In agents.yaml under each agent:
agents:
  - id: marketing
    event_subscribers:
      - id: github_main
        subject_pattern: "webhook.github_main.>"
        synthesize_inbound: synthesize     # synthesize | tick | off
        inbound_template: "GitHub {{event_kind}}: {{body_json.repository.full_name}} — {{body_json.action}}"
        max_concurrency: 4
        max_buffer: 64
        overflow_policy: drop-oldest       # drop-oldest | drop-newest

When a NATS event matches subject_pattern, the runtime synthesises an inbound message and fires an agent turn. The agent receives the rendered template (or raw JSON fallback) in its turn context and decides what to do.

Synthesis modes

Mode	Behaviour
`synthesize` (default)	Render `inbound_template` against the event payload via mustache-lite (`{{path.to.field}}`). Fallback to JSON-stringify when no template.
`tick`	Fire an agent turn with a `<event subject="..." envelope_id="..."/>` marker as the body. Cheap on context window — agent can ignore or fetch payload via tooling.
`off`	Subscriber inactive. Useful for staging YAML before flipping it on (requires daemon restart at v0).

Auto-synthesised binding

When you declare an event_subscribers entry, the boot supervisor automatically synthesises a matching inbound_bindings entry:

# Implied automatically from event_subscribers above:
inbound_bindings:
  - plugin: event
    instance: github_main

If you declare the binding manually (e.g. to override allowed_tools or sender_rate_limit for that source), your manual entry survives — the auto-synth is idempotent.

Template syntax

Mustache-lite. Only {{path.to.field}} substitution; no conditionals or loops.

{{event_kind}} — top-level field.
{{body_json.action}} — nested object access.
{{tags.0}} — array index.
Missing path → <missing> placeholder (does not crash).
Object/array at the leaf → <missing> (avoids leaking struct shape into the agent body).
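
The substitution rules above can be sketched with a small renderer. This is an illustrative stand-in, not the runtime's implementation: the `Value` enum replaces a real JSON type so the example has no dependencies, and `render` / `lookup` are hypothetical names.

```rust
use std::collections::BTreeMap;

/// Tiny JSON-like value (stand-in for a real JSON type).
pub enum Value {
    Str(String),
    Num(f64),
    Bool(bool),
    Arr(Vec<Value>),
    Obj(BTreeMap<String, Value>),
}

/// Walk a dotted path: object keys by name, array elements by decimal index.
fn lookup<'a>(root: &'a Value, path: &str) -> Option<&'a Value> {
    path.split('.').try_fold(root, |cur, seg| match cur {
        Value::Obj(map) => map.get(seg),
        Value::Arr(items) => seg.parse::<usize>().ok().and_then(|i| items.get(i)),
        _ => None,
    })
}

/// Replace every {{path}} tag; anything unresolvable becomes "<missing>".
pub fn render(template: &str, payload: &Value) -> String {
    let mut out = String::new();
    let mut rest = template;
    while let Some(open) = rest.find("{{") {
        out.push_str(&rest[..open]);
        let after = &rest[open + 2..];
        match after.find("}}") {
            Some(close) => {
                let path = after[..close].trim();
                match lookup(payload, path) {
                    Some(Value::Str(s)) => out.push_str(s),
                    Some(Value::Num(n)) => out.push_str(&n.to_string()),
                    Some(Value::Bool(b)) => out.push_str(&b.to_string()),
                    // Missing path, or an object/array at the leaf.
                    _ => out.push_str("<missing>"),
                }
                rest = &after[close + 2..];
            }
            None => {
                // Unterminated tag: emit it verbatim and stop scanning.
                out.push_str(&rest[open..]);
                rest = "";
            }
        }
    }
    out.push_str(rest);
    out
}
```

Because there are no conditionals or loops, rendering is a single left-to-right pass and cannot fail; the worst case is a `<missing>` placeholder in the agent's inbound text.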

Buffer + concurrency

Each binding gets its own:

Bounded buffer (max_buffer, default 64) — absorbs bursts without blocking the broker.
Concurrency cap (max_concurrency, default 1 = serial) — enforces ordering and limits in-flight turns.
Overflow policy (drop-oldest default — recent events more relevant; drop-newest for conservative buffering).

Drops emit tracing::warn! with binding_id + drop counter.
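
The two overflow policies differ only in which event is discarded when the buffer is full. A minimal sketch, assuming a single-threaded queue (the real subscriber is presumably async; `BoundedBuffer` is a hypothetical type):

```rust
use std::collections::VecDeque;

#[derive(Clone, Copy, PartialEq, Debug)]
pub enum OverflowPolicy {
    DropOldest,
    DropNewest,
}

pub struct BoundedBuffer<T> {
    items: VecDeque<T>,
    max_buffer: usize,
    policy: OverflowPolicy,
    dropped: u64,
}

impl<T> BoundedBuffer<T> {
    pub fn new(max_buffer: usize, policy: OverflowPolicy) -> Self {
        // A zero-sized buffer would drop everything; clamp to 1 in this sketch.
        let max_buffer = max_buffer.max(1);
        BoundedBuffer { items: VecDeque::with_capacity(max_buffer), max_buffer, policy, dropped: 0 }
    }

    /// Enqueue an event. Returns the event that was dropped, if any.
    pub fn push(&mut self, item: T) -> Option<T> {
        if self.items.len() < self.max_buffer {
            self.items.push_back(item);
            return None;
        }
        self.dropped += 1;
        match self.policy {
            // Make room by discarding the stalest queued event.
            OverflowPolicy::DropOldest => {
                let evicted = self.items.pop_front();
                self.items.push_back(item);
                evicted
            }
            // Keep what is queued and discard the incoming event.
            OverflowPolicy::DropNewest => Some(item),
        }
    }

    pub fn pop(&mut self) -> Option<T> {
        self.items.pop_front()
    }

    /// Running drop counter, the kind of value a warn log would carry.
    pub fn dropped(&self) -> u64 {
        self.dropped
    }
}
```

Either way `push` never blocks, which is what lets the buffer absorb bursts without back-pressuring the broker.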

Defensive guards

Loop guard: if a binding's subject_pattern accidentally matches its own re-publish topic (plugin.inbound.event.<id>), the producer drops the self-event with a warn — never blows the buffer.
id validation at boot: rejects ., *, >, or whitespace in id (would mis-parse the re-publish topic).
Pattern validation: >, plugin.>, plugin.*.>, plugin.inbound.* are all rejected as loop-risk patterns at boot.
Per-binding cancel token: on SIGTERM all subscribers drain in at most 1 s.
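
The loop guard and the id check both depend on how NATS subject wildcards behave: `*` matches exactly one token and `>` matches one or more trailing tokens. A hedged sketch of those three checks follows; the function names are hypothetical, and the boot-time pattern denylist is not reproduced here.

```rust
/// NATS-style subject match. `*` matches exactly one token; `>` matches one
/// or more remaining tokens (this sketch assumes `>` only appears last).
pub fn subject_matches(pattern: &str, subject: &str) -> bool {
    let mut p = pattern.split('.');
    let mut s = subject.split('.');
    loop {
        match (p.next(), s.next()) {
            (Some(">"), Some(_)) => return true,
            (Some("*"), Some(_)) => continue,
            (Some(a), Some(b)) if a == b => continue,
            (None, None) => return true,
            _ => return false,
        }
    }
}

/// An id must be usable as a single subject token in the re-publish topic.
pub fn valid_id(id: &str) -> bool {
    !id.is_empty()
        && !id.chars().any(|c| c == '.' || c == '*' || c == '>' || c.is_whitespace())
}

/// Loop guard: true when an incoming event is the binding's own re-publish.
pub fn is_self_event(id: &str, subject: &str) -> bool {
    subject == format!("plugin.inbound.event.{id}")
}
```

An id containing `.` would split into several tokens inside `plugin.inbound.event.<id>`, which is why it is rejected before any subscription is created.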

Worked example: GitHub webhook → marketing agent

Phase 82.2 webhook receiver verifies a GitHub webhook → publishes webhook.github_main.pull_request to NATS with the typed WebhookEnvelope.
Phase 82.4 event_subscriber matches webhook.github_main.> → renders "GitHub pull_request: anthropic/repo — opened" → re-publishes to plugin.inbound.event.github_main.
The existing inbound resolver matches the auto-synthesised { plugin: "event", instance: "github_main" } binding → constructs BindingContext with event_source: Some({ subject: "webhook.github_main.pull_request", synthesis_mode: "synthesize", ... }).
The agent fires a turn; tools see the metadata in params._meta.nexo.binding.event_source.

Microapps consuming nexo-tool-meta parse the event_source field via parse_binding_from_meta(args._meta).event_source.

Hot-reload

v0 spawns subscribers at boot only. Adding/removing event_subscribers requires a daemon restart. Hot-reload via Phase 18 reload coordinator is the deferred 82.4.c follow-up.

Operator notes

Subscribers are independent of the agent's session — they feed events into the standard inbound flow, which the runtime routes per session like any other inbound.
The _nexo_event_source extension field on the re-published payload is the canonical seam between the EventSubscriber and the inbound resolver. Microapps should read it from the agent-side _meta.nexo.binding.event_source, not from the raw broker payload.
tracing::info! summary at boot: look for event subscribers online: count=N to confirm wiring took.
Validation failures are non-fatal: an invalid binding logs an error and skips; the daemon stays up.

Per-binding tool rate-limits

Phase 82.7 lets operators declare per-binding tool rate-limits on top of the per-agent ones from Phase 9.2. Same agent + same tool, two bindings → two independent buckets with independent caps. Use it to enforce SaaS tier policies (free / pro / enterprise) without spinning up separate agent processes.

When to use

Same agent answers a free-tier WhatsApp account AND an enterprise account; the enterprise tenant must not be starved by free-tier traffic on the shared marketing_send_drip tool.
An event-subscriber binding ingests cron tickers — these should run unlimited regardless of how the agent's other bindings are configured.
A webhook binding receives bursty github events; you want a cap so a runaway CI pipeline can't spam the LLM.

The agent-level tool_rate_limits from Phase 9.2 still applies when no per-binding override is declared. When an override IS declared on the matched binding, it FULLY REPLACES the global decision for that binding (no fall-through to global patterns).

Wire shape

agents:
  - id: ana
    inbound_bindings:
      - plugin: whatsapp
        instance: free_tier
        tool_rate_limits:
          patterns:
            marketing_send_drip:
              rps: 0.167         # 10 per minute
              burst: 10
              essential_deny_on_miss: true
            "memory_*":
              rps: 1.0
              burst: 5
            _default:
              rps: 5.0
              burst: 20

      - plugin: whatsapp
        instance: enterprise
        # no override → falls through to the agent-level tool_rate_limits (unlimited if none is defined)

      - plugin: webhook
        instance: github
        tool_rate_limits:
          patterns:
            "*":                 # everything on this binding
              rps: 2.0
              burst: 10

Field reference

Field	Type	Default	Meaning
`patterns.<glob>.rps`	f64	required	Tokens added per second. `0.167` ≈ 10/min.
`patterns.<glob>.burst`	u64	`ceil(rps).max(1)`	Initial bucket capacity. Higher burst = more leniency for bursty workloads.
`patterns.<glob>.essential_deny_on_miss`	bool	`false`	When `true`, the bucket is fail-closed: if LRU pressure evicts the bucket and the key is reallocated, the next call denies once before allocating fresh. Use for paid / quota-bound tools where you'd rather drop a single call than risk leaking quota.
`patterns._default`	object	none	Reserved key matched when no explicit pattern catches the tool. Same shape as other entries.

Glob matching

Same minimal glob as the agent-level patterns:

* alone matches anything.
foo* matches strings starting with foo.
*bar matches strings ending with bar.
foo*bar matches strings starting foo and ending bar.

Patterns evaluate in deterministic alphabetical order; first match wins. _default is always last.
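
A minimal sketch of the matching and ordering rules, assuming at most one `*` per pattern (the only forms listed above) and byte-wise ordering as the meaning of "alphabetical". `glob_match`, `resolve` and `Limit` are illustrative names, not the crate's API.

```rust
use std::collections::BTreeMap;

#[derive(Debug, Clone, Copy, PartialEq)]
pub struct Limit {
    pub rps: f64,
    pub burst: u64,
}

/// Minimal glob: a single optional `*` standing for any run of characters.
pub fn glob_match(pattern: &str, name: &str) -> bool {
    match pattern.split_once('*') {
        None => pattern == name,
        Some((prefix, suffix)) => {
            // The length check stops prefix and suffix from overlapping.
            name.len() >= prefix.len() + suffix.len()
                && name.starts_with(prefix)
                && name.ends_with(suffix)
        }
    }
}

/// First matching pattern in key order wins; `_default` is tried last.
pub fn resolve<'a>(patterns: &'a BTreeMap<String, Limit>, tool: &str) -> Option<(&'a str, Limit)> {
    patterns
        .iter() // BTreeMap iterates in sorted key order
        .filter(|(k, _)| k.as_str() != "_default")
        .find(|(k, _)| glob_match(k, tool))
        .or_else(|| patterns.get_key_value("_default"))
        .map(|(k, v)| (k.as_str(), *v))
}
```

With the free_tier map from the wire-shape example, `memory_store` resolves to `memory_*`, `marketing_send_drip` to its exact entry, and anything else to `_default`.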

Per-binding fully replaces global

Important semantic — different from how allowed_tools / outbound_allowlist overrides work in some other crates:

Binding declares tool_rate_limits: Some(map) → ONLY the patterns in map apply. Tools that don't match any pattern in the override (and don't match _default either) become unlimited on that binding, regardless of any global agent-level config.
Binding declares tool_rate_limits: None (or the field is omitted) → fall through to agent-level agents.<id>.tool_rate_limits from Phase 9.2.

Operators wanting "binding tighter, with global fallback for tools the binding doesn't mention" must explicitly include those global patterns in the binding map. The full-replace semantic is deliberate: it keeps the resolution path unambiguous and predictable in audit logs.
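
The full-replace rule is easiest to see as a decision function. In this sketch the maps use exact tool names only (no globs) to keep it short; `Decision`, `Patterns` and `decide` are hypothetical names.

```rust
use std::collections::BTreeMap;

#[derive(Debug, PartialEq, Clone, Copy)]
pub enum Decision {
    Limited { rps: f64 },
    Unlimited,
}

/// pattern -> rps; exact names plus the reserved `_default` key.
pub type Patterns = BTreeMap<String, f64>;

fn lookup(patterns: &Patterns, tool: &str) -> Decision {
    patterns
        .get(tool)
        .or_else(|| patterns.get("_default"))
        .map_or(Decision::Unlimited, |rps| Decision::Limited { rps: *rps })
}

pub fn decide(binding: Option<&Patterns>, agent: Option<&Patterns>, tool: &str) -> Decision {
    match (binding, agent) {
        // Override present: the agent-level map is never consulted.
        (Some(b), _) => lookup(b, tool),
        // No override: fall through to the agent-level config.
        (None, Some(a)) => lookup(a, tool),
        (None, None) => Decision::Unlimited,
    }
}
```

Note the consequence: a tool limited at agent level becomes unlimited on a binding whose override does not mention it and has no `_default`.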

Free / pro / enterprise example

- id: ana
  inbound_bindings:
    # Free tier — strict caps on paid tools
    - plugin: whatsapp
      instance: free_tier
      tool_rate_limits:
        patterns:
          marketing_send_drip:
            rps: 0.167          # 10/min
            burst: 10
            essential_deny_on_miss: true
          web_search:
            rps: 0.083          # 5/min
            burst: 5
          _default:
            rps: 1.0
            burst: 5

    # Pro tier — relaxed caps
    - plugin: whatsapp
      instance: pro
      tool_rate_limits:
        patterns:
          marketing_send_drip:
            rps: 1.667          # 100/min
            burst: 100
          _default:
            rps: 10.0
            burst: 50

    # Enterprise — unlimited (no override)
    - plugin: whatsapp
      instance: enterprise

A single marketing_send_drip flood from free_tier cannot deny calls on pro or enterprise; their buckets are independent.

Bucket lifecycle + LRU eviction

Buckets are allocated lazily — the first call for a given (agent, binding_id, tool) triple allocates a TokenBucket.

Bucket cardinality is capped (default 10_000); the cap fires only when allocating a new bucket would push the count past the limit. Eviction picks the stalest bucket by last_touch (a monotonic counter stamped on every try_acquire). Steady-state traffic amortises eviction cost to near zero.

When the evicted bucket's config had essential_deny_on_miss = true, the key is stamped into a separate "recently evicted essentials" set. The next call for that key consumes the entry and denies once, then allocates a fresh bucket. This adapts the fail-open + ESSENTIAL deny opt-in pattern from upstream production agent CLIs to the LRU eviction context.
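
The lifecycle described above (lazy allocation, eviction of the stalest bucket at the cap, one fail-closed denial for evicted essential keys) can be sketched as below. This is an assumption-laden illustration: `Limiter` and its fields are hypothetical, time is injected as seconds for determinism, and the fresh bucket is allocated on the call after the denial.

```rust
use std::collections::{HashMap, HashSet};

/// (agent, binding_id, tool)
pub type Key = (String, String, String);

struct Bucket {
    tokens: f64,
    burst: f64,
    rps: f64,
    last_refill: f64,
    last_touch: u64,
    essential: bool,
}

pub struct Limiter {
    buckets: HashMap<Key, Bucket>,
    evicted_essentials: HashSet<Key>,
    max_buckets: usize,
    clock: u64, // monotonic counter stamped on every try_acquire
}

impl Limiter {
    pub fn new(max_buckets: usize) -> Self {
        Limiter {
            buckets: HashMap::new(),
            evicted_essentials: HashSet::new(),
            max_buckets: max_buckets.max(1),
            clock: 0,
        }
    }

    /// `now` is seconds on any monotonic clock.
    pub fn try_acquire(&mut self, key: Key, rps: f64, burst: u64, essential: bool, now: f64) -> bool {
        self.clock += 1;
        let stamp = self.clock;
        if !self.buckets.contains_key(&key) {
            // Fail-closed: an evicted essential key denies exactly once.
            if self.evicted_essentials.remove(&key) {
                return false;
            }
            // The cap only fires when a new bucket would exceed it.
            if self.buckets.len() >= self.max_buckets {
                self.evict_stalest();
            }
            let fresh = Bucket {
                tokens: burst as f64,
                burst: burst as f64,
                rps,
                last_refill: now,
                last_touch: stamp,
                essential,
            };
            self.buckets.insert(key.clone(), fresh);
        }
        let b = self.buckets.get_mut(&key).expect("bucket ensured above");
        // Refill at `rps` tokens per second, never beyond `burst`.
        let elapsed = (now - b.last_refill).max(0.0);
        b.tokens = (b.tokens + elapsed * b.rps).min(b.burst);
        b.last_refill = now;
        b.last_touch = stamp;
        if b.tokens >= 1.0 {
            b.tokens -= 1.0;
            true
        } else {
            false
        }
    }

    fn evict_stalest(&mut self) {
        let stalest = self
            .buckets
            .iter()
            .min_by_key(|(_, b)| b.last_touch)
            .map(|(k, _)| k.clone());
        if let Some(k) = stalest {
            if let Some(b) = self.buckets.remove(&k) {
                if b.essential {
                    self.evicted_essentials.insert(k);
                }
            }
        }
    }
}
```

The linear scan in `evict_stalest` is acceptable here because eviction only runs when a new key arrives at the cap; steady traffic over a stable key set never reaches it.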

Phase 72 audit log marker

Every denial emits a tracing::info! event with the canonical marker:

rate_limited:tool=<name>,binding=<id|none>,rps=<f64>

Example:

rate_limited:tool=marketing_send_drip,binding=whatsapp:free_tier,rps=0.167

binding=none indicates a denial on the legacy single-tenant path (delegation receive, heartbeat, pre-Phase-82.7 callers).

Operator audit pipelines parse this format for billing / SaaS fair-use metrics. The format is wire-shape stable — format_rate_limit_hit in nexo-tool-meta is the source of truth.
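
An audit pipeline consuming the marker could parse it along these lines. This is a sketch only: `format_marker` is a hypothetical stand-in for `format_rate_limit_hit` (whose exact float formatting is not shown here), and `RateLimitHit` / `parse_marker` are illustrative names.

```rust
#[derive(Debug, PartialEq)]
pub struct RateLimitHit {
    pub tool: String,
    pub binding: Option<String>, // None for the legacy `binding=none` path
    pub rps: f64,
}

/// Build a marker in the documented shape (assumed default float formatting).
pub fn format_marker(tool: &str, binding: Option<&str>, rps: f64) -> String {
    format!("rate_limited:tool={},binding={},rps={}", tool, binding.unwrap_or("none"), rps)
}

/// Parse a marker back into its fields; returns None on any other string.
pub fn parse_marker(line: &str) -> Option<RateLimitHit> {
    let rest = line.strip_prefix("rate_limited:")?;
    let mut tool = None;
    let mut binding = None;
    let mut rps = None;
    for pair in rest.split(',') {
        let (key, value) = pair.split_once('=')?;
        match key {
            "tool" => tool = Some(value.to_string()),
            "binding" => binding = Some(value),
            "rps" => rps = value.parse::<f64>().ok(),
            _ => {} // tolerate fields this sketch does not know about
        }
    }
    Some(RateLimitHit {
        tool: tool?,
        binding: match binding? {
            "none" => None,
            b => Some(b.to_string()),
        },
        rps: rps?,
    })
}
```

Binding ids such as `whatsapp:free_tier` contain a colon but no comma or equals sign, so splitting on `,` and then on the first `=` is enough.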

Hot-reload behaviour

Per-binding overrides participate in the existing Phase 18 config snapshot path. After a yaml reload:

Existing buckets keep their state until naturally aged out by LRU.
New buckets allocated post-reload use the new config.
Worst case is a single turn of slack while the snapshot swap propagates.

For an immediate cold start of all buckets, restart the daemon.

Admin RPC integration

The limiter exposes drop_buckets_for_agent(agent: &str) so the admin RPC delete-agent path (Phase 82.10) can clear (agent, *, *) cells when an operator removes an agent. Without this, buckets would leak until LRU eviction.

Observability

Useful tracing fields when investigating denials:

agent_id — which agent ran the call
marker — canonical rate_limited:... string (parse for binding/tool/rps)
tool — tool name as the LLM saw it

Tracking metrics:

nexo_rate_limit_buckets_active — total live buckets across all agents (TODO; not yet emitted as Prometheus)

Limitations

Bucket evictions during a sustained burst can briefly allow a burst's worth of extra calls before the new bucket settles. Use essential_deny_on_miss: true on tools where this is unacceptable.
The marker's rps= field reflects the configured rate at the time of denial. After a hot-reload that changes the rate, the marker may show the old value for buckets that haven't been re-resolved yet.
The _default pattern only applies within its own scope: a per-binding _default does not fall through to the global _default.

Context optimization

Four independent mechanisms reduce the number of tokens sent to the LLM on every request, without changing the agent's behavior. They live under llm.context_optimization in llm.yaml and can be flipped per agent under agents.<id>.context_optimization.

# config/llm.yaml
context_optimization:
  prompt_cache:
    enabled: true                   # default
    long_ttl_providers: [anthropic, vertex]
  compaction:
    enabled: false                  # default off — opt in per agent
    compact_at_pct: 0.75
    tail_keep_tokens: 20000
    tool_result_max_pct: 0.30
    summarizer_model: ""            # empty = reuse the agent's main model
    lock_ttl_seconds: 300
  token_counter:
    enabled: true                   # default
    backend: auto                   # auto | anthropic_api | tiktoken
    cache_capacity: 1024
  workspace_cache:
    enabled: true                   # default
    watch_debounce_ms: 500
    max_age_seconds: 0              # 0 = never force refresh (notify is authoritative)

1. Prompt caching

Materializes the system prompt as a list of cache_control blocks on the Anthropic wire so the stable prefix (workspace + skills + tool catalog + binding glue) is billed at 0.1× input cost on every cache hit. OpenAI / DeepSeek paths surface their automatic prompt_tokens_details.cached_tokens field through the same CacheUsage struct. Gemini and MiniMax flatten the blocks into the legacy system slot today (warned once per process).

Block layout (4 cache breakpoints, the Anthropic max):

workspace — IDENTITY / SOUL / USER / AGENTS / MEMORY (Ephemeral1h)
skills — per-binding skill catalog (Ephemeral1h)
binding_glue — peer directory + per-binding system prompt + language directive (Ephemeral1h)
channel_meta — sender id + per-turn context (Ephemeral5m)

The tools array is sorted alphabetically by name, because the registry iterates a DashMap whose order is non-deterministic and an unstable order would change the cached prefix. The last tool gets a 1h cache_control marker when cache_tools=true.

What to watch

llm_cache_read_tokens_total{agent, provider, model} — should dominate llm_cache_creation_tokens_total after the first turn of a warm session.
llm_cache_hit_ratio{agent} — target >0.7 on multi-turn agents; <0.3 means you're paying the write premium without the discount.

When to flip off

Provider rejects the request with a 400 mentioning cache_control (very old model). Mitigation: the framework already strips markers for claude-2.x; if Anthropic adds another exception, override ANTHROPIC_CACHE_BETA="..." to disable the beta header.
A custom-built LLM gateway in front of Anthropic doesn't pass the cache_control field through.

2. Compaction (online history folding)

When the pre-flight token estimate crosses compact_at_pct * effective_window, the agent runs a secondary LLM call to fold history[..tail_start] into a single summary string. The summary replaces the head; the last tail_keep_tokens worth of turns ride forward verbatim. Subsequent turns prepend the summary as a synthetic user/assistant pair so Anthropic's role-alternation rule stays valid.
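
The trigger and the head/tail split can be sketched as a pure function over per-turn token counts. The names are hypothetical, and treating "crosses" as "greater than or equal" is an assumption.

```rust
pub struct CompactionCfg {
    pub compact_at_pct: f64,
    pub tail_keep_tokens: usize,
}

/// Returns Some(tail_start) when history[..tail_start] should be folded into
/// a summary, or None when no compaction is needed or nothing can be folded.
pub fn plan_compaction(
    turn_tokens: &[usize],
    effective_window: usize,
    cfg: &CompactionCfg,
) -> Option<usize> {
    let total: usize = turn_tokens.iter().sum();
    // Trigger: estimate >= compact_at_pct * effective_window.
    if (total as f64) < cfg.compact_at_pct * effective_window as f64 {
        return None;
    }
    // Walk backwards, keeping whole turns verbatim while they fit the budget.
    let mut kept = 0usize;
    let mut tail_start = turn_tokens.len();
    for (i, &tokens) in turn_tokens.iter().enumerate().rev() {
        if kept + tokens > cfg.tail_keep_tokens {
            break;
        }
        kept += tokens;
        tail_start = i;
    }
    // tail_start == 0 means the whole history already fits in the tail.
    (tail_start > 0).then_some(tail_start)
}
```

For example, with a 1000-token window, `compact_at_pct: 0.75`, `tail_keep_tokens: 300` and turns of 200, 200, 200, 150 and 100 tokens, the estimate (850) crosses the 750 threshold and the last two turns (250 tokens) ride forward while the first three are folded.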

Compaction is off by default, an intentionally conservative choice. Roll it out per agent via agents.<id>.context_optimization.compaction: true.

agents:
  - id: ana
    context_optimization:
      compaction: true   # ana opts in early, others stay off

What to watch

llm_compaction_triggered_total{agent, outcome} — outcomes are ok, failed, lock_held, no_boundary, tool_result_truncated.
llm_compaction_duration_seconds{agent, outcome="ok"|"failed"} — a rising p99 means the summarizer model is overloaded; lower compact_at_pct so triggers are smaller (cheaper) and more frequent.

When to flip off

Quality regression in long sessions — the summary may be losing active-task state. Inspect compactions_v1 rows in the SQLite store to see what was folded; bump tail_keep_tokens so more verbatim context survives.
Lock contention spikes — multiple processes (NATS multi-node) racing on the same session. The lock is per-session so this only happens with sticky-session misrouting; fix at the broker level rather than disabling compaction.

Safety nets

compaction_locks_v1 carries TTL (lock_ttl_seconds) — a crashed compactor doesn't deadlock the session; the next acquire after the TTL wins automatically.
Audit log: every successful compaction inserts a row in compactions_v1 with the summary text + token cost. Inspect with sqlite3 memory.db "SELECT * FROM compactions_v1 WHERE session_id = ? ORDER BY compacted_at DESC".
Failure path: 3 retries with backoff; on total failure the original history goes to the LLM unchanged (graceful degradation, never silent data loss).

3. Token counting (pre-flight sizing)

TokenCounter trait with two backends:

AnthropicTokenCounter — calls POST /v1/messages/count_tokens. Exact (matches billing). LRU-cached on blake3(payload): the stable tools+identity prefix hashes the same on every turn, so the network round-trip happens ~once per process lifetime.
TiktokenCounter — offline cl100k_base approximation. Drift vs Anthropic billing measured at 5–15%. Fine for budget gating, not for hard limits.

The cascade wraps the primary in a CircuitBreaker (failure_threshold=3, 30s→300s backoff): on count_tokens outage the agent loop falls back to tiktoken so the request still goes through. Once the breaker has opened at least once, is_exact() flips to false for the rest of the process so dashboards don't conflate sample populations.
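
A sketch of the cascade, with closures standing in for the two backends. The `Cascade` type is hypothetical and is not `nexo_resilience::CircuitBreaker`; the backoff is assumed to double from 30 s up to the 300 s cap, and time is injected as seconds so the behaviour is deterministic.

```rust
pub struct Cascade<P, F> {
    primary: P,
    fallback: F,
    failures: u32,
    backoff_secs: f64,
    open_until: f64,
    ever_opened: bool,
}

impl<P, F> Cascade<P, F>
where
    P: FnMut(&str) -> Result<usize, String>, // exact counter, may fail
    F: Fn(&str) -> usize,                    // offline approximation
{
    pub fn new(primary: P, fallback: F) -> Self {
        Cascade { primary, fallback, failures: 0, backoff_secs: 30.0, open_until: 0.0, ever_opened: false }
    }

    pub fn count(&mut self, payload: &str, now: f64) -> usize {
        // While the breaker is open the primary is skipped entirely.
        if now >= self.open_until {
            match (self.primary)(payload) {
                Ok(n) => {
                    self.failures = 0;
                    self.backoff_secs = 30.0;
                    return n;
                }
                Err(_) => {
                    self.failures += 1;
                    if self.failures >= 3 {
                        // failure_threshold reached: open and back off.
                        self.ever_opened = true;
                        self.open_until = now + self.backoff_secs;
                        self.backoff_secs = (self.backoff_secs * 2.0).min(300.0);
                    }
                }
            }
        }
        // The request still goes through with the approximate count.
        (self.fallback)(payload)
    }

    /// False for the rest of the process once the breaker has ever opened.
    pub fn is_exact(&self) -> bool {
        !self.ever_opened
    }
}
```

Keeping `is_exact()` latched at false after the first outage is what stops a dashboard from mixing exact and approximate samples in one series.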

What to watch

llm_prompt_tokens_estimated{agent, provider, model} — compare against llm_prompt_tokens_drift{...} (histogram in percent).
A drift p99 climbing past 20% means the active backend is wrong for your model — switch from tiktoken to anthropic_api (or vice versa for non-Anthropic providers).

When to flip off

The agent runs against a self-hosted gateway that doesn't honor count_tokens. Set backend: tiktoken to skip the round-trip.

4. Workspace bundle cache

Reads of IDENTITY / SOUL / USER / AGENTS / MEMORY MDs go through an in-memory Arc<WorkspaceBundle> cache keyed by (root, scope, sorted extras). A notify-debouncer-full watcher (default 500ms) drops every entry under a workspace root when any *.md changes. Non-MD file changes are ignored.

What to watch

workspace_cache_hits_total{path} should dominate workspace_cache_misses_total{path} once the cache is warm.
workspace_cache_invalidations_total{path} rising without operator edits points to a tool that writes to the workspace too aggressively.

When to flip off

NFS / FUSE filesystems where inotify(7) drops events. Set workspace_cache.max_age_seconds: 60 (or similar) to force a refresh after the absolute TTL even without a watch event.

Per-agent overrides

The four enables — and only the enables — can be flipped per agent in agents.yaml. The numeric knobs (compact_at_pct, tail_keep_tokens, watch_debounce_ms, …) stay global to keep the surface narrow.

agents:
  - id: ana
    context_optimization:
      prompt_cache: true
      compaction: true
      token_counter: true
      workspace_cache: true
  - id: bob
    context_optimization:
      prompt_cache: false  # bob runs against a gateway that strips cache_control

Hot-reload behavior

Changing global knobs (llm.yaml) takes effect on the next request once the reload coordinator picks up the file change (Phase 18). For per-agent enables, the override rides on Arc<AgentConfig> inside RuntimeSnapshot and is observed on the next policy_for(...) lookup. The LlmAgentBehavior struct itself still caches its compactor / prompt_cache_enabled fields at construction — toggling those without a process restart requires the future ArcSwap<CompactionRuntime> refactor noted in proyecto/FOLLOWUPS.md.

Rollout playbook

Deploy with everything at defaults — prompt_cache=true, compaction=false, token_counter=true, workspace_cache=true.
Watch llm_cache_hit_ratio for 24h. Expect it to climb to >0.7 on chatty agents; if it stays low, check that the workspace bundle is stable across turns (no MD writes mid-session).
Pick one agent, opt it into compaction (agents.<id>.context_optimization.compaction: true), reload config, watch for a week.
If llm_compaction_triggered_total{outcome="ok"} > 0 and quality feedback is positive, roll compaction out to the rest of the fleet.
If drift on llm_prompt_tokens_drift is consistently <10%, leave token_counter.backend: auto. If higher, consider backend: tiktoken for non-Anthropic providers — saves the round-trip without losing accuracy you didn't have anyway.

Link understanding

When a user message contains URLs, the runtime can fetch them, extract the main text, and inject a # LINK CONTEXT block into the system prompt for that turn. The agent stops saying "I can't see what's at that link" and starts answering against the actual page content.

The feature is off by default. Opt in per agent (and optionally override per binding).

Per-agent config

# config/agents.yaml
agents:
  - id: ana
    link_understanding:
      enabled: true              # default: false
      max_links_per_turn: 3      # cap URLs fetched per message
      max_bytes: 262144          # 256 KiB per response, streamed
      timeout_ms: 8000           # per-fetch HTTP timeout
      cache_ttl_secs: 600        # 0 disables cache
      deny_hosts:                # appended to built-in denylist
        - internal.corp

Built-in denylist (always applied, cannot be removed): localhost, 127.0.0.1, ::1, metadata.google.internal, 169.254.169.254. Defense against SSRF to internal endpoints.
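
A minimal sketch of how the built-in list and the operator's deny_hosts could combine. Exact, case-insensitive host matching is an assumption (the real matcher may also cover suffixes or resolved addresses), and `host_denied` is a hypothetical name.

```rust
/// Built-in entries, always applied.
const BUILTIN_DENY: [&str; 5] = [
    "localhost",
    "127.0.0.1",
    "::1",
    "metadata.google.internal",
    "169.254.169.254",
];

/// True when `host` is on the built-in list or in the operator's deny_hosts.
pub fn host_denied(host: &str, extra_deny: &[&str]) -> bool {
    // Normalise case and a trailing root dot before comparing.
    let h = host.trim_end_matches('.').to_ascii_lowercase();
    BUILTIN_DENY
        .iter()
        .chain(extra_deny.iter())
        .any(|d| h == d.to_ascii_lowercase())
}
```

Because config entries are chained onto the constant list rather than replacing it, no configuration can remove a built-in entry.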

Per-binding override

Per-binding link_understanding overrides the agent default. Useful to disable on a noisy channel:

agents:
  - id: ana
    link_understanding: { enabled: true }
    bindings:
      - inbound: plugin.inbound.whatsapp.*
        link_understanding: { enabled: false }   # narrow on WA
      - inbound: plugin.inbound.telegram.*
        # inherits agent default (enabled: true)

null / omitted = inherit. Any object = full replace.

What gets injected

For each fetched URL, one bullet:

# LINK CONTEXT

- https://example.com/post — Title of the page
  First paragraphs of main text, collapsed to ~max_bytes characters,
  HTML stripped, scripts and styles dropped.

The block lands inside the system prompt for that turn only. Cache hits skip the fetch but still render the block.

Hard caps (cannot be raised by config)

Cap	Value
URL length	2048 chars
Redirect chain	5 hops
User-Agent	`nexo-link-understanding/0.1`
Response stream cutoff	`max_bytes` (drops the rest)
Newlines / control chars in extracted text	sanitised (prompt-injection guard)

Operations

A single shared LinkExtractor (HTTP client + LRU cache, capacity 256) is built at boot and reused by every agent runtime in the process.
Cache is in-process only. Restarts cold.
Telemetry exported on /metrics:
- nexo_link_understanding_fetch_total{result="ok|blocked|timeout|non_html|too_big|error"} — counter, one increment per fetch attempt.
- nexo_link_understanding_cache_total{hit="true|false"} — counter, incremented on every TTL-cached lookup so dashboards can compute hit-rate without instrumenting the agent loop.
- nexo_link_understanding_fetch_duration_ms — histogram (single series, no labels). Only observed for attempts that actually issued an HTTP request — cache hits and host-blocked URLs skip it so latency percentiles reflect real network work.

When to leave it off

Agents talking to untrusted senders where the agent must not be pivoted into fetching attacker-controlled URLs.
Channels with strict latency budgets — a fetch can add up to timeout_ms to the turn.
Privacy-sensitive deployments where outbound HTTP from the agent host is not allowed.

Web search

The web_search built-in tool lets an agent query the web through one of four providers: Brave, Tavily, DuckDuckGo, Perplexity. The runtime owns provider selection, caching, sanitisation, and circuit breaking — agents only see results.

The feature is off by default. Operators opt in per agent (and optionally override per binding).

Per-agent config

# config/agents.yaml
agents:
  - id: ana
    web_search:
      enabled: true               # default false
      provider: auto              # "auto" | "brave" | "tavily" | "duckduckgo" | "perplexity"
      default_count: 5            # 1..=10
      cache_ttl_secs: 600         # 0 disables cache
      expand_default: false       # default value of `expand` arg

`provider: auto`

Picks the first credentialed provider in this order:

brave (env BRAVE_SEARCH_API_KEY)
tavily (env TAVILY_API_KEY)
perplexity (env PERPLEXITY_API_KEY, requires the perplexity feature)
duckduckgo (no key — bundled by default; the always-available fallback)

DuckDuckGo scrapes html.duckduckgo.com and is rate-limited / captcha-prone; the runtime detects bot challenges and trips the breaker so the next call rotates to a different provider.
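
The auto-detect order can be sketched as a function over an environment lookup. The lookup is injected as a closure so the example is testable; `pick_provider` is a hypothetical name, and treating an empty variable as "not credentialed" is an assumption.

```rust
/// Return the first credentialed provider, in the documented order.
pub fn pick_provider(
    env: impl Fn(&str) -> Option<String>,
    perplexity_feature: bool,
) -> &'static str {
    // A provider counts as credentialed when its variable is set and non-empty.
    let has = |var: &str| env(var).map_or(false, |v| !v.is_empty());
    if has("BRAVE_SEARCH_API_KEY") {
        return "brave";
    }
    if has("TAVILY_API_KEY") {
        return "tavily";
    }
    if perplexity_feature && has("PERPLEXITY_API_KEY") {
        return "perplexity";
    }
    // No key needed: the always-available fallback.
    "duckduckgo"
}
```

In a real process the lookup would be `|k| std::env::var(k).ok()`, with the second argument reflecting whether the perplexity feature was compiled in.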

Per-binding override

Same shape as link_understanding: null (default) inherits the agent value, any object replaces it.

agents:
  - id: ana
    web_search: { enabled: true }
    bindings:
      - inbound: plugin.inbound.whatsapp.*
        web_search: { enabled: false }   # silent on WA
      - inbound: plugin.inbound.telegram.*
        # inherits agent default

Tool surface

The LLM sees this signature:

{
  "name": "web_search",
  "parameters": {
    "query":     "string  (required)",
    "count":     "integer (1-10, optional)",
    "provider":  "string  (optional override)",
    "freshness": "day | week | month | year (optional)",
    "country":   "ISO-3166 alpha-2 (optional)",
    "language":  "ISO-639-1 (optional)",
    "expand":    "boolean (optional)"
  }
}

Return shape:

{
  "provider": "brave",
  "query":    "rust async runtimes",
  "from_cache": false,
  "results": [
    {
      "url": "https://example.com/post",
      "title": "Title",
      "snippet": "First 4 KiB of the description, sanitised.",
      "site_name": "example.com",
      "published_at": "2026-04-20T00:00:00Z"
    }
  ]
}

When expand: true and Phase 21 link understanding is enabled, the top three hits also get a body field populated by the shared LinkExtractor. Bodies obey the same denylist + size caps that Link understanding describes.

Cache

In-process SQLite cache shared across every agent. Key format:

sha256(SCHEMA_VERSION || provider || query || canonical_params)

canonical_params excludes provider (router decides) and expand (post-processing). cache_ttl_secs: 0 disables caching entirely.

Operators that want a separate cache file or a schema migration will be able to set web_search.cache.path in web_search.yaml. This is planned, not yet available; see FOLLOWUPS.

Circuit breaker

Every provider call goes through nexo_resilience::CircuitBreaker keyed web_search:<provider>. Default config: 5 consecutive failures trip the breaker, exponential backoff up to 120 s. Open-state calls return ProviderUnavailable(provider) immediately and the router rotates to the next candidate (when called via auto-detect).

Sanitisation

Every title, url, and snippet returned by a provider passes through sanitise_for_prompt:

control chars stripped,
CR / LF / tab collapsed to single spaces,
runs of whitespace collapsed,
byte-capped at 4 KiB (snippet) / 512 B (title) / 2 KiB (URL),
truncation respects UTF-8 char boundaries.

This is the same defence-in-depth Phase 19 (language directive) and Phase 21 (# LINK CONTEXT) apply: SERPs are attacker-controlled input.
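
The sanitisation steps listed above fit in one pass over the input. This sketch is not the actual `sanitise_for_prompt`; it takes the byte cap as a parameter, and as a side effect of collapsing whitespace it also trims leading and trailing whitespace, which is an assumption.

```rust
/// Collapse whitespace, strip control characters, and cap the output at
/// `max_bytes` without ever splitting a UTF-8 character.
pub fn sanitise_for_prompt(input: &str, max_bytes: usize) -> String {
    let mut out = String::with_capacity(input.len().min(max_bytes));
    let mut pending_space = false;
    for ch in input.chars() {
        // CR / LF / tab are both whitespace and control chars, so the
        // whitespace check must come first to turn them into one space.
        if ch.is_whitespace() {
            pending_space = !out.is_empty();
            continue;
        }
        if ch.is_control() {
            continue;
        }
        let needed = ch.len_utf8() + usize::from(pending_space);
        if out.len() + needed > max_bytes {
            break; // stop on a char boundary instead of cutting mid-character
        }
        if pending_space {
            out.push(' ');
            pending_space = false;
        }
        out.push(ch);
    }
    out
}
```

Flattening newlines matters most: a snippet cannot open a fresh line that looks like a new instruction or a forged section header in the prompt.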

Telemetry

Exported on /metrics:

nexo_web_search_calls_total{provider,result} — counter, one increment per provider attempt. result is ok (provider returned hits), error (network / HTTP / parse failure), or unavailable (the breaker short-circuited the call before it left the process).
nexo_web_search_cache_total{provider,hit} — counter, every TTL-cached lookup. provider is the first candidate (the one the cache key is built from). Compute hit rate as cache_total{hit="true"} / sum(cache_total).
nexo_web_search_breaker_open_total{provider} — counter; one increment per request the breaker rejected. Pair with circuit_breaker_state{breaker="web_search:<provider>"} to alert on sustained open state vs a flap.
nexo_web_search_latency_ms{provider} — histogram. Only observed for attempts that issued an HTTP request, so the percentile reflects real provider latency (cache hits and breaker short-circuits would pull p50 down to 0 and hide regressions).

When to leave it off

Privacy-sensitive deployments where outbound HTTP from the agent host is not allowed.
Channels where the cost of a noisy SERP in the prompt outweighs the agent's value (use per-binding enabled: false).
Agents that already have link_understanding for the URLs the user shares — no need for SERP duplication.

Web fetch

The web_fetch built-in tool lets an agent retrieve the cleaned body text + title for one or more URLs the agent already knows. Companion to Web search: web_search finds URLs, web_fetch retrieves them.

Distinct from web_search.expand=true because the agent often knows the URL up-front (skill output, RSS poll, calendar attachment, user message) and would otherwise have to either hallucinate a search query or shell out to a fetch-url extension.

When to use which

Scenario	Tool
Agent needs to find content matching a query	`web_search`
Agent has a URL from a `web_search` hit and wants the body	`web_search(expand=true)`
Agent has a URL from a poller / skill / user message	`web_fetch`
Agent has a list of URLs to triage	`web_fetch(urls=[...])`

Tool signature

{
  "name": "web_fetch",
  "parameters": {
    "urls":      ["https://example.com/article", "https://other.com/page"],
    "max_bytes":  65536          // optional; clamped to deployment cap
  }
}

Response shape:

{
  "results": [
    {
      "url":   "https://example.com/article",
      "title": "Example article",
      "body":  "First paragraph...",
      "ok":    true
    },
    {
      "url":    "https://internal.intranet.local/private",
      "ok":     false,
      "reason": "fetch failed (host blocked, timeout, non-HTML, oversized, or transport error). Check `nexo_link_understanding_fetch_total{result}` for the bucket."
    }
  ],
  "count": 2
}

A bad URL returns a {ok: false, reason} row instead of bailing the whole call, so the agent can still consume the successful ones. Per-call cap of 5 URLs; longer lists get trimmed with a warn log.

Configuration

web_fetch has no dedicated config. It rides on Link understanding:

link_understanding.enabled — gates the tool entirely. With it false, every fetch returns {ok: false, reason: "disabled by policy"}.
link_understanding.max_bytes — deployment-wide ceiling. The tool's max_bytes arg can shrink but never grow past this.
link_understanding.deny_hosts — host blocklist (loopback, private subnets, internal cloud metadata endpoints, plus whatever the operator added).
link_understanding.timeout_ms — per-fetch HTTP timeout.
link_understanding.cache_ttl_secs — cache TTL. Successful fetches are cached so a second web_fetch of the same URL inside the TTL is free.

Per-binding overrides via EffectiveBindingPolicy::link_understanding (see Per-binding capability override).
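
The two limits a web_fetch call is subject to (the per-call URL cap and the max_bytes ceiling) reduce to a few lines. `plan_fetch` and `MAX_URLS_PER_CALL` are hypothetical names used only for this sketch.

```rust
/// Per-call URL cap described above.
pub const MAX_URLS_PER_CALL: usize = 5;

/// Trim the URL list to the cap and clamp the requested size to the
/// deployment-wide ceiling (link_understanding.max_bytes).
pub fn plan_fetch<'a>(
    urls: &'a [String],
    requested_max_bytes: Option<usize>,
    deployment_cap: usize,
) -> (&'a [String], usize) {
    let kept = &urls[..urls.len().min(MAX_URLS_PER_CALL)];
    // The argument can shrink the limit but never grow past the ceiling.
    let max_bytes = requested_max_bytes.map_or(deployment_cap, |n| n.min(deployment_cap));
    (kept, max_bytes)
}
```

A caller that compares `kept.len()` with `urls.len()` can tell that trimming happened, which is the point where the warn log would be emitted.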

Telemetry

web_fetch reuses every counter the auto-link pipeline emits. There's no separate dashboard:

nexo_link_understanding_fetch_total{result} — ok / blocked / timeout / non_html / too_big / error.
nexo_link_understanding_cache_total{hit} — true / false.
nexo_link_understanding_fetch_duration_ms — histogram, only populated when an HTTP request actually went out (cache hits and host-blocked URLs skip it so percentiles reflect real fetch work).

The bundled Grafana dashboard (ops/grafana/nexo-llm.json) already plots all three.

Why a per-call cap of 5 URLs

A runaway agent given the prompt "fetch every link in this 10k RSS dump" would otherwise queue thousands of HTTP requests synchronously, blowing the prompt budget and hammering the target hosts. 5 covers every realistic agentic workflow (read 3 candidates, pick the best two, summarise) while leaving a clear ceiling. Operators who want batch behaviour should spawn a TaskFlow that calls web_fetch in chunks with cursor persistence.
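The chunked approach can be sketched as a cursor that advances five URLs at a time. This is an assumption-laden illustration: `next_chunk` is a hypothetical helper, and how a TaskFlow actually persists the cursor is not shown here.

```rust
/// Matches the per-call cap of web_fetch.
const CHUNK: usize = 5;

/// Return the next slice of at most `CHUNK` URLs starting at `cursor`,
/// together with the cursor value to persist for the following step.
fn next_chunk(urls: &[String], cursor: usize) -> (&[String], usize) {
    let start = cursor.min(urls.len());
    let end = (start + CHUNK).min(urls.len());
    (&urls[start..end], end)
}

fn main() {
    let urls: Vec<String> = (0..12).map(|i| format!("https://example.com/{i}")).collect();
    let mut cursor = 0; // a real flow would load this from durable state
    while cursor < urls.len() {
        let (chunk, next) = next_chunk(&urls, cursor);
        println!("web_fetch call with {} URL(s)", chunk.len());
        cursor = next; // persist here so a restart resumes at the right place
    }
}
```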

Comparison to extensions

The fetch-url Python extension does roughly the same thing. web_fetch differs in three ways:

In-process — no subprocess spawn, no Python interpreter, no extension wire protocol. Sub-100ms cold path on the happy case.
Shared cache + telemetry — links the user shares (auto-expanded by Phase 21 link-understanding) AND links the agent fetches via web_fetch populate the same LRU. A second access inside the cache TTL is free.
Same security defaults — same deny-host list, same size cap, same timeout. Operators tune one knob, two surfaces honour it.

Use the extension when the runtime path is the wrong shape (custom auth, post-only endpoints, non-HTML responses you want raw). Use web_fetch for the standard "give me the article" case, which is most of them.

Implementation

The tool lives at crates/core/src/agent/web_fetch_tool.rs::WebFetchTool and is registered for every agent unconditionally in src/main.rs. The per-binding link_understanding.enabled policy gates whether the underlying fetch happens; the tool itself is always visible in the agent's tool list so operators can write "call web_fetch on URL X" prompts without needing a per-agent web_fetch.enabled flag.

Source of truth for FOLLOWUPS W-2 closure.

Pairing protocol

Two coexisting protocols ship in nexo-pairing:

DM-challenge inbound gate — opt-in per binding. Unknown senders on WhatsApp / Telegram receive a one-time human-friendly code; the operator approves them via CLI. Existing senders pass through unchanged.
Setup-code QR — operator-initiated. nexo pair start issues a short-lived HMAC-signed bearer token + a gateway URL, packs them into a base64url payload, and renders a QR. A companion app scans, presents the token to the daemon, and gets a session token in return.

The feature is off by default. Existing setups see no behaviour change until the operator flips pairing_policy.auto_challenge on a binding.

DM-challenge gate

Per-binding config

# config/agents.yaml
agents:
  - id: ana
    inbound_bindings:
      - plugin: whatsapp
        instance: personal
        pairing_policy:
          auto_challenge: true   # default false

The gate runs before the plugin publishes to the broker. Three outcomes per inbound message:

Outcome	When	Plugin action
`Admit`	sender in `pairing_allow_from` (or policy off)	publish as normal
`Challenge { code }`	unknown sender, `auto_challenge: true`, slot free	reply with code, drop message
`Drop`	max-pending exhausted (3 per channel/account)	silent drop
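The three outcomes in the table map naturally onto an enum. The sketch below is illustrative only: `Decision`, `decide`, and its parameters are hypothetical, and it ignores details the table does not cover (for example a sender who already has a pending code).

```rust
use std::collections::HashSet;

/// Gate outcomes from the table above.
#[derive(Debug, PartialEq)]
enum Decision {
    Admit,
    Challenge { code: String },
    Drop,
}

/// Max pending challenges per (channel, account), per the table.
const MAX_PENDING: usize = 3;

/// Hypothetical decision function; `new_code` generates a fresh pairing code.
fn decide(
    auto_challenge: bool,
    allow_from: &HashSet<String>,
    pending_count: usize,
    sender: &str,
    new_code: impl FnOnce() -> String,
) -> Decision {
    // Policy off, or a known sender: publish as normal.
    if !auto_challenge || allow_from.contains(sender) {
        return Decision::Admit;
    }
    // No free slot: silent drop.
    if pending_count >= MAX_PENDING {
        return Decision::Drop;
    }
    // Unknown sender with a free slot: reply with a code, drop the message.
    Decision::Challenge { code: new_code() }
}

fn main() {
    let mut allow = HashSet::new();
    allow.insert("+15550100".to_string()); // made-up sender id
    let d = decide(true, &allow, 0, "+15550199", || "K7M9PQ2X".to_string());
    println!("{d:?}");
}
```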

Operator workflow

$ nexo pair list
CODE       CHANNEL         ACCOUNT          CREATED                     SENDER
K7M9PQ2X   whatsapp        personal         2026-04-25T13:21:00Z        +57311...

$ nexo pair approve K7M9PQ2X
Approved whatsapp:personal:+57311... (added to allow_from)

The next message from +57311... is admitted by the gate.

pair list only shows pending challenges by default. Use --all to also dump every active row in pairing_allow_from (approved + seeded), and --include-revoked to keep soft-deleted entries in the listing for audit:

$ nexo pair list --all
No pending pairing requests.

CHANNEL         ACCOUNT           SENDER                    VIA         APPROVED                    REVOKED
telegram        cody_nexo_bot     1194292426                seed        2026-04-26 17:52:10 UTC     -
whatsapp        personal          +57311...                 cli         2026-04-25 13:21:00 UTC     -

$ nexo pair list --all --include-revoked --json | jq '.allow[0]'
{
  "channel": "whatsapp",
  "account_id": "personal",
  "sender_id": "+57311...",
  "approved_via": "cli",
  "approved_at": "2026-04-25T13:21:00Z"
}

--json always returns { "pending": [...], "allow": [...] } so consumers get a stable shape regardless of --all.

Cache + revoke

The gate caches decisions for 30 s to keep SQLite off the hot path. Revokes (and freshly-seeded admits) are eventually consistent within that window:

$ nexo pair revoke whatsapp:+57311...
Revoked whatsapp:+57311...

For an immediate effect, trigger a hot-reload — the coordinator runs PairingGate::flush_cache as a post-reload hook (Phase 70.7), so nexo reload (or any file-watched config edit) drops the cache and the next inbound message re-queries the store:

$ nexo reload

A daemon restart still works as a hammer when reload is disabled.
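The eventual-consistency window follows directly from how a TTL cache behaves. A minimal sketch, assuming a cache keyed by sender with a 30 s lifetime; `DecisionCache` and its methods are hypothetical names, and the clock is passed in so the behaviour is easy to reason about.

```rust
use std::collections::HashMap;
use std::time::{Duration, Instant};

/// Cache lifetime from the text above.
const TTL: Duration = Duration::from_secs(30);

/// Hypothetical decision cache: sender id -> (admitted?, time cached).
struct DecisionCache {
    entries: HashMap<String, (bool, Instant)>,
}

impl DecisionCache {
    fn new() -> Self {
        Self { entries: HashMap::new() }
    }

    /// Return the cached flag only while it is younger than the TTL.
    fn get(&self, sender: &str, now: Instant) -> Option<bool> {
        match self.entries.get(sender) {
            Some(&(admitted, at)) if now.duration_since(at) < TTL => Some(admitted),
            _ => None,
        }
    }

    fn put(&mut self, sender: &str, admitted: bool, now: Instant) {
        self.entries.insert(sender.to_string(), (admitted, now));
    }

    /// What a post-reload hook would call: drop everything so the
    /// next message re-queries the store.
    fn flush(&mut self) {
        self.entries.clear();
    }
}

fn main() {
    let t0 = Instant::now();
    let mut cache = DecisionCache::new();
    cache.put("+15550100", true, t0); // made-up sender id
    println!("{:?}", cache.get("+15550100", t0 + Duration::from_secs(10)));
    println!("{:?}", cache.get("+15550100", t0 + Duration::from_secs(31)));
}
```

A revoke written to the store at second 5 is invisible to the gate until the entry expires or the cache is flushed, which is why reload gives an immediate effect.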

Migrating an existing bot

If you already have known senders, seed them so the gate doesn't challenge mid-conversation when you flip auto_challenge: true:

$ nexo pair seed whatsapp personal +57311... +57222... +57333...
Seeded 3 sender(s) into whatsapp:personal allow_from

seed is idempotent; running it twice is safe and re-activates any sender that was previously revoked.

Setup-code QR

Issuing

$ nexo pair start --public-url wss://nexo.example.com --qr-png /tmp/p.png --json
{
  "url": "wss://nexo.example.com",
  "url_source": "pairing.public_url",
  "bootstrap_token": "eyJwcm9maWxlIjoi...",
  "expires_at": "2026-04-25T13:32:00Z",
  "payload": "eyJ1cmwi..."
}

payload is what goes in the QR. The companion decodes it to recover {url, bootstrap_token, expires_at}, opens the WebSocket, and presents the token as Authorization: Bearer <bootstrap_token>.

URL resolution

Priority chain (first non-empty wins):

--public-url (CLI flag)
tunnel.url (Phase tunnel — TODO: wire when accessor lands)
gateway.remote.url
LAN bind address (when gateway.bind=lan)

Fail-closed: the daemon refuses to issue a code on a loopback-only gateway. As of Phase 70.5 the CLI also prints a ready-to-run nexo pair seed <channel> <account> <SENDER> for every plugin instance configured under config/plugins/, so a dev-machine operator can skip the QR flow entirely:

$ nexo pair start --ttl-secs 300
Pairing-start needs a non-loopback gateway URL.
For local testing you usually don't need the QR flow at all —
seed the operator's chat into the allowlist directly:

  nexo pair seed telegram cody_nexo_bot <YOUR_TELEGRAM_USER_ID>
  nexo pair seed whatsapp default <YOUR_WHATSAPP_NUMBER>

Or, to keep using the QR flow, set one of:
  - `pairing.public_url` in config/pairing.yaml
  - `--public-url <wss://…>` flag
  - run `nexo` with the tunnel enabled (writes tunnel.url)
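The "first non-empty wins" rule, with a fail-closed error when nothing resolves, can be sketched as below. `resolve_url` is a hypothetical helper; the source labels are the ones from the priority chain above, and the loopback check itself is not modelled here.

```rust
/// Walk the candidates in priority order and return the first non-empty one,
/// together with the label of the source it came from.
/// `candidates` holds (source label, configured value) pairs.
fn resolve_url<'a>(
    candidates: &[(&'a str, Option<&'a str>)],
) -> Result<(&'a str, &'a str), String> {
    candidates
        .iter()
        .find_map(|&(source, value)| match value {
            Some(v) if !v.trim().is_empty() => Some((source, v)),
            _ => None,
        })
        // Fail closed: no usable URL means no pairing code.
        .ok_or_else(|| "Pairing-start needs a non-loopback gateway URL.".to_string())
}

fn main() {
    let candidates = [
        ("--public-url", None),
        ("tunnel.url", None),
        ("gateway.remote.url", Some("wss://nexo.example.com")),
        ("LAN bind address", None),
    ];
    match resolve_url(&candidates) {
        Ok((source, url)) => println!("{url} (from {source})"),
        Err(msg) => eprintln!("{msg}"),
    }
}
```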

ws/wss security policy

Cleartext ws:// is allowed only on hosts the operator can reasonably trust to be private:

127.0.0.1 / ::1 (loopback)
RFC1918 (10/8, 172.16/12, 192.168/16)
link-local (169.254/16)
*.local mDNS hostnames
10.0.2.2 (Android emulator)
Any host listed in pairing.ws_cleartext_allow_extra

Everything else requires wss://. This matches OpenClaw's posture in research/src/pairing/setup-code.ts.
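As a rough sketch of the allowlist above, the host check could look like this. `cleartext_ws_allowed` is a hypothetical function, not the one in nexo-pairing. Note two simplifications: the standard library's `is_loopback` accepts all of 127.0.0.0/8 rather than only 127.0.0.1, and 10.0.2.2 needs no special case because it already falls inside 10/8.

```rust
use std::net::IpAddr;

/// Decide whether cleartext ws:// may be used for `host`.
/// `allow_extra` plays the role of pairing.ws_cleartext_allow_extra.
fn cleartext_ws_allowed(host: &str, allow_extra: &[&str]) -> bool {
    // Operator-listed hosts are always allowed.
    if allow_extra.iter().any(|h| h.eq_ignore_ascii_case(host)) {
        return true;
    }
    // *.local mDNS hostnames.
    if host.to_ascii_lowercase().ends_with(".local") {
        return true;
    }
    match host.parse::<IpAddr>() {
        // Loopback, RFC1918, and link-local (169.254/16).
        Ok(IpAddr::V4(ip)) => ip.is_loopback() || ip.is_private() || ip.is_link_local(),
        // ::1
        Ok(IpAddr::V6(ip)) => ip.is_loopback(),
        // Any other hostname must use wss://.
        Err(_) => false,
    }
}

fn main() {
    for host in ["127.0.0.1", "192.168.1.10", "printer.local", "nexo.example.com"] {
        println!("{host}: cleartext allowed = {}", cleartext_ws_allowed(host, &[]));
    }
}
```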

Token format

b64u(claims_json) + "." + b64u(hmac_sha256(secret, claims_json))

claims_json = {"profile":"companion-v1","expires_at":"...","nonce":"<32 hex>","device_label":"..."}
secret = 32 bytes in ~/.nexo/secret/pairing.key (auto-generated on first boot with 0600 perms; rotate by deleting + restarting).

Verification is constant-time (subtle crate) so timing leaks don't discriminate between "wrong sig" and "wrong claims".

Threat model

Concern	Mitigation
Brute-force pairing code	32^8 ≈ 10^12 keyspace; 60 min TTL; max 3 pending per (channel, account)
Token replay after expiry	TTL on `expires_at` (default 10 min); HMAC verify fails closed
Token forgery	HMAC-SHA256 with 32-byte secret; constant-time compare
Secret leak	Rotate via `rm ~/.nexo/secret/pairing.key && restart`; all in-flight tokens invalidate
TOCTOU on approve	Single SQL transaction (`approve` reads + insert + delete in one tx)
ws cleartext on hostile network	Refuse to issue cleartext URL outside private-host allowlist
DoS via flood of pending requests	Max 3 per (channel, account); TTL 60 min auto-prunes
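The keyspace figure in the first row is easy to check. The sketch below assumes what the table implies, a 32-symbol alphabet and 8-character codes, and derives the chance that a single blind guess hits any of the 3 codes that can be pending at once; `keyspace` is a helper written for this example.

```rust
/// Number of distinct codes for a given alphabet size and code length.
fn keyspace(alphabet: u64, length: u32) -> u64 {
    alphabet.pow(length)
}

fn main() {
    let space = keyspace(32, 8); // 32^8 = 2^40
    // With at most 3 pending codes per (channel, account), one guess
    // succeeds with probability 3 / keyspace.
    let per_guess = 3.0 / space as f64;
    println!("keyspace = {space}");
    println!("chance per guess = {per_guess:e}");
}
```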

Storage layout

Two SQLite tables in <memory_dir>/pairing.db:

pairing_pending (channel, account_id, sender_id PRIMARY KEY,
                 code, created_at, meta_json)

pairing_allow_from (channel, account_id, sender_id PRIMARY KEY,
                    approved_at, approved_via, revoked_at)

Soft-delete (revoked_at) keeps historical context: an operator can later see "+57311 was approved on X, revoked on Y" for audit.

When to leave it off

Single-user setups where the operator is the only sender — the gate adds a SQL hit per message for no security gain.
Bots that take public input by design (e.g. a self-service support bot) — the gate would block every customer.
Until you have an agent setup web-search-style wizard, manual pair seed is the only friendly migration path.

Adapter registry

Each channel that participates in pairing implements PairingChannelAdapter in its plugin crate. The adapter owns three channel-specific decisions the runtime cannot make on its own:

normalize_sender(raw) — canonicalise inbound sender ids before the gate hits the store. WhatsApp strips @c.us / @s.whatsapp.net and prepends +; Telegram lower-cases @username and passes numeric chat ids through.
format_challenge_text(code) — render the operator-facing pairing message. The default is plain UTF-8; the Telegram adapter overrides it to escape MarkdownV2 reserved characters and wrap the code in backticks so the user can long-press to copy.
send_reply(account, to, text) — publish the challenge through the channel's outbound topic (plugin.outbound.{whatsapp,telegram}[.<account>]) using the payload shape that channel's dispatcher expects.
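The normalisation rules in the first bullet can be sketched as two free functions. These are hypothetical stand-ins for the adapters' `normalize_sender` implementations, written only from the description above, and the sender ids in `main` are made up.

```rust
/// WhatsApp: strip the JID suffix and make sure the id starts with '+'.
fn normalize_whatsapp(raw: &str) -> String {
    let id = raw.trim();
    let id = id
        .strip_suffix("@c.us")
        .or_else(|| id.strip_suffix("@s.whatsapp.net"))
        .unwrap_or(id);
    if id.starts_with('+') {
        id.to_string()
    } else {
        format!("+{id}")
    }
}

/// Telegram: lower-case @usernames, pass numeric chat ids through unchanged.
fn normalize_telegram(raw: &str) -> String {
    let id = raw.trim();
    if id.starts_with('@') {
        id.to_lowercase()
    } else {
        id.to_string()
    }
}

fn main() {
    println!("{}", normalize_whatsapp("15550100@s.whatsapp.net"));
    println!("{}", normalize_telegram("@SomeUser"));
}
```

Normalising before the store lookup means an approved sender is recognised regardless of which raw form the channel delivers.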

The binary (src/main.rs) constructs a PairingAdapterRegistry at boot and registers the WhatsApp + Telegram adapters. The runtime consults the registry on every inbound event whose binding has pairing_policy.auto_challenge: true. Channels with no registered adapter fall back to a hardcoded broker publish that mirrors the legacy text on plugin.outbound.{channel}; operators still see the challenge in their channel, but without per-channel formatting.

Telemetry lives under pairing_inbound_challenged_total{channel,result} with result one of delivered_via_adapter, delivered_via_broker, publish_failed, no_adapter_no_broker_topic, so dashboards can split adapter vs. fallback delivery rates per channel.

CLI reference

nexo pair start [--for-device <name>] [--public-url <url>]
                 [--qr-png <path>] [--ttl-secs <n>] [--json]
nexo pair list  [--channel <id>] [--all] [--include-revoked] [--json]
nexo pair approve <CODE> [--json]
nexo pair revoke <channel>:<sender_id>
nexo pair seed <channel> <account_id> <sender_id> [<sender_id>...]
nexo pair help

Benchmarks

The workspace ships criterion benchmark suites for every hot path that runs on the data plane. CI executes them on every PR + weekly on main so regressions are visible before merge.

Quick run

# Single crate:
cargo bench -p nexo-resilience

# Single bench within a crate:
cargo bench -p nexo-broker --bench topic_matches

# Single group within a bench:
cargo bench -p nexo-broker --bench topic_matches -- 'topic_matches/wildcard'

Output goes to target/criterion/. Open index.html under that directory in a browser for the full HTML report.

Coverage matrix

Crate	Bench	What it measures	Run target
`nexo-resilience`	`circuit_breaker`	`CircuitBreaker::allow` (closed + open), `on_success`, `on_failure`, 8-task concurrent allow contention	sub-100ns per call
`nexo-broker`	`topic_matches`	NATS-style pattern matching (exact, single-wildcard `*`, multi-wildcard `>`, 50-pattern storm)	sub-100ns per match
`nexo-broker`	`local_publish`	End-to-end `LocalBroker::publish` with 0 / 1 / 10 / 50 subscribers (DashMap scan + try_send + slow-consumer drop counter)	sub-10µs at 50 subs
`nexo-llm`	`sse_parsers`	OpenAI / Anthropic / Gemini SSE parsers, 50-chunk fixtures (typical short answer)	chunks/sec scales linearly
`nexo-taskflow`	`tick`	`WaitEngine::tick` at 10 / 100 / 1 000 active waiting flows	sub-millisecond at single-host scale
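For readers unfamiliar with what the `topic_matches` bench exercises, NATS-style matching works token by token: `*` matches exactly one token and `>` matches one or more trailing tokens. The function below is an independent sketch of that rule, not the implementation in nexo-broker.

```rust
/// NATS-style subject matching on dot-separated tokens.
/// `*` matches exactly one token; `>` (last token only) matches one or more.
fn topic_matches(pattern: &str, subject: &str) -> bool {
    let mut pat = pattern.split('.');
    let mut sub = subject.split('.');
    loop {
        match (pat.next(), sub.next()) {
            // `>` needs at least one remaining subject token and must end the pattern.
            (Some(">"), Some(_)) => return pat.next().is_none(),
            // `*` consumes exactly one token.
            (Some("*"), Some(_)) => {}
            // Literal tokens must be equal.
            (Some(p), Some(s)) if p == s => {}
            // Both exhausted at the same time: full match.
            (None, None) => return true,
            // Length mismatch or differing literal.
            _ => return false,
        }
    }
}

fn main() {
    let subject = "plugin.outbound.whatsapp";
    for pattern in ["plugin.outbound.*", "plugin.>", "plugin.inbound.*"] {
        println!("{pattern} matches {subject}: {}", topic_matches(pattern, subject));
    }
}
```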

What's NOT benched yet

These are tracked under Phase 35.5 follow-up:

nexo-core transcripts FTS search — needs SQLite fixture seed before the bench is meaningful.
nexo-core redaction pipeline — wait for the local-LLM redaction backend (Phase 68.7) so we measure the real path operators ship.
nexo-mcp encode_request / parse_notification_method — cheap to add; will land alongside an MCP-stdio round-trip bench.
nexo-memory vector-search recall — needs a public dataset baseline.

Add a bench by following the patterns in crates/<x>/benches/:

[dev-dependencies] adds criterion = "0.5" (with async_tokio if you need a runtime).
[[bench]] registers name = "<bench>" and harness = false.
Bench file uses Throughput::Elements(N) so output is ops/sec, not raw ns/iter.
Each criterion_group! covers a distinct conceptual path — don't bundle unrelated paths.

CI integration

.github/workflows/bench.yml runs the matrix on:

every PR that touches crates/**, Cargo.lock, or Cargo.toml
weekly on Sunday 04:00 UTC against main
manual workflow_dispatch

Each run uploads target/criterion/ as an artifact retained 30 days. PR runs save with --save-baseline pr-<number>; main runs save as main. Compare locally with:

# Pull the artifact for PR #42
gh run download <run-id> --name bench-nexo-broker-<run-id>

# Compare against the local main baseline
cargo bench -p nexo-broker -- --baseline main

Today the CI job is informational: a regression doesn't fail the PR. Once we have ~10 main runs of baseline data per crate, the workflow will gate on >10% regression per group. That is the Phase 35.6 done-criterion.
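The planned gate boils down to a relative comparison against the baseline. The sketch below is hypothetical (the workflow does not implement it yet, and `is_regression` is a name made up here); it only shows the arithmetic a 10% threshold implies.

```rust
/// True when `current_ns` is slower than `baseline_ns` by more than
/// `threshold` (0.10 means 10%). Hypothetical helper.
fn is_regression(baseline_ns: f64, current_ns: f64, threshold: f64) -> bool {
    baseline_ns > 0.0 && (current_ns - baseline_ns) / baseline_ns > threshold
}

fn main() {
    // 100 ns baseline: 111 ns trips a 10% gate, 109 ns does not.
    println!("{}", is_regression(100.0, 111.0, 0.10));
    println!("{}", is_regression(100.0, 109.0, 0.10));
}
```

With a runner noise floor of roughly 5%, a 10% threshold leaves some headroom before noise alone fails a PR.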

Known limitations

GitHub Actions runners are noisy. The ubuntu-latest shared runner tier shows ±5-10% variance on microbenchmarks. This is why we don't gate on small regressions yet — the baseline noise floor is itself ~5%.
Benches don't measure cold cache. cargo bench's warm-up phase reaches steady-state CPU caches; first-call latency on a cold runtime is not captured. Add a separate bench_cold_* group when this matters (it usually doesn't — hot path is what matters at scale).
No cross-crate end-to-end benchmark yet. Phase 35.3 (load test rig) covers that; today's suites are per-crate microbenchmarks.

Reading criterion output

A typical run prints:

publish/mixed_50_subs   time:   [12.347 µs 12.451 µs 12.567 µs]
                        thrpt:  [3.9786 Melem/s 4.0153 Melem/s 4.0494 Melem/s]
                 change: time:   [-0.4%  +0.3%  +1.1%]    (p = 0.62 > 0.05)
                         thrpt:  [-1.1% -0.3% +0.4%]
                         No change in performance detected.

time is the per-iteration latency (lower better).
thrpt is throughput (higher better) — only present when the bench declared Throughput::Elements(N).
change compares against the previous run on the same hardware. p > 0.05 means the difference is within noise.

Look for change reporting "Performance has regressed" with a red bar — that's the signal a PR introduced a regression.
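The time and thrpt lines are two views of the same measurement: throughput is the declared element count divided by the per-iteration time. The sketch below assumes the sample bench declared 50 elements per iteration, which is inferred from the bench name and is consistent with the figures shown, not something the output states.

```rust
/// Elements per second for `elements` processed in `nanos` nanoseconds.
fn throughput_per_sec(elements: u64, nanos: f64) -> f64 {
    elements as f64 / (nanos * 1e-9)
}

fn main() {
    // Middle time estimate from the sample output: 12.451 µs = 12_451 ns.
    let melem = throughput_per_sec(50, 12_451.0) / 1e6;
    println!("{melem:.2} Melem/s");
}
```

The result lands within rounding of the reported middle throughput estimate; criterion computes each estimate separately, so the last digits need not match exactly.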

Native install (no Docker)

If you'd rather run nexo-rs directly on a Linux / macOS host — development loop, single-machine deploy, restricted container environment — this page walks through every step and names the bootstrap script that automates it.

Fast path

git clone git@github.com:lordmacu/nexo-rs.git
cd nexo-rs
./scripts/bootstrap.sh

scripts/bootstrap.sh verifies prerequisites, installs a local NATS, creates the runtime directories, stages example configs, and builds the agent binary. Re-runnable — each step is idempotent.

Keep reading for what it actually does (and what to do when a step needs manual intervention).

Prerequisites

Tool	Required for	Notes
Rust (stable, edition 2021)	building the binaries	`rust-toolchain.toml` pins the channel
Git	cloning + per-agent workspace-git	default on most hosts
NATS ≥ 2.10	the broker	binary or dev docker container is fine
SQLite ≥ 3.38	memory + broker disk queue	ships with most distros
Chrome / Chromium	browser plugin (optional)	skip if you don't use the browser plugin
ffmpeg + ffprobe	media-related skills (optional)	skip if you don't ship those skills
yt-dlp / tesseract / tmux / ssh	individual skills (optional)	each skill declares its `requires.bins`

On Ubuntu / Debian:

sudo apt update
sudo apt install -y build-essential pkg-config libsqlite3-dev git curl

On macOS:

xcode-select --install
brew install sqlite git

Install Rust

curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh
source "$HOME/.cargo/env"
rustup component add rustfmt clippy

The repo's rust-toolchain.toml pins the channel; no manual version pick is needed.

Install NATS

Pick one path:

Option A — native NATS server

# Linux x86_64
curl -L -o /tmp/nats.tar.gz \
  https://github.com/nats-io/nats-server/releases/download/v2.10.20/nats-server-v2.10.20-linux-amd64.tar.gz
tar -xzf /tmp/nats.tar.gz -C /tmp
sudo mv /tmp/nats-server-*/nats-server /usr/local/bin/

For macOS: brew install nats-server.

Start it:

nats-server -js                      # foreground
nats-server -js -D                   # foreground with debug
# or, as a systemd service: see below

Option B — dev throwaway via Docker

Even on a "no-Docker" box, a single short-lived container for the broker is often fine:

docker run -d --name nexo-nats --restart unless-stopped \
  -p 4222:4222 -p 8222:8222 nats:2.10-alpine

This is the same broker the compose stack would use; only the broker itself runs in a container.

Systemd unit (Linux, production)

/etc/systemd/system/nats-server.service:

[Unit]
Description=NATS Server
After=network.target

[Service]
Type=simple
ExecStart=/usr/local/bin/nats-server -js
Restart=on-failure
LimitNOFILE=65536

[Install]
WantedBy=multi-user.target

sudo systemctl daemon-reload
sudo systemctl enable --now nats-server

Build nexo-rs

git clone git@github.com:lordmacu/nexo-rs.git
cd nexo-rs
cargo build --release

The output is ./target/release/agent. Symlink it into $PATH if you want:

sudo ln -sf "$(pwd)/target/release/agent" /usr/local/bin/agent

Prepare runtime directories

mkdir -p ./data/{queue,workspace,media,transcripts}
mkdir -p ./secrets          # gitignored; holds API keys, nkey files, etc.
chmod 700 ./secrets         # restrictive — the credential gauntlet checks this

Stage config

The repo ships config/*.yaml with safe defaults. Override whatever you need:

# Optional: copy the ana sales agent template into the gitignored dir
cp config/agents.d/ana.example.yaml config/agents.d/ana.yaml

# Add an API key:
export MINIMAX_API_KEY=...
export MINIMAX_GROUP_ID=...
# or write to secrets/ files referenced from config/llm.yaml via ${file:...}

See Configuration — layout for the full reference.

Pair channels and set secrets

./target/release/agent setup

The wizard pairs WhatsApp / Telegram / Google / LLM credentials interactively. See Setup wizard.

First run

./target/release/agent --config ./config

Watch the startup summary — it tells you exactly which plugins loaded, which extensions were skipped and why, and whether the broker is reachable. If anything's missing, the log line names the specific file or env var to fix.

Run as a systemd service

/etc/systemd/system/nexo-rs.service:

[Unit]
Description=nexo-rs agent
Requires=nats-server.service
After=nats-server.service

[Service]
Type=simple
User=nexo
Group=nexo
WorkingDirectory=/srv/nexo-rs
Environment=RUST_LOG=info
Environment=AGENT_ENV=production
ExecStart=/usr/local/bin/agent --config /srv/nexo-rs/config
Restart=on-failure
RestartSec=5
# Optional: restrict where the agent can write (ReadWritePaths only takes effect together with a setting such as ProtectSystem=strict)
ReadWritePaths=/srv/nexo-rs/data /srv/nexo-rs/secrets

[Install]
WantedBy=multi-user.target

sudo useradd -r -s /bin/false -d /srv/nexo-rs nexo
sudo chown -R nexo:nexo /srv/nexo-rs
sudo systemctl daemon-reload
sudo systemctl enable --now nexo-rs

Logs:

journalctl -u nexo-rs -f

macOS launchd

~/Library/LaunchAgents/dev.nexo-rs.agent.plist:

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE plist PUBLIC "-//Apple//DTD PLIST 1.0//EN"
  "http://www.apple.com/DTDs/PropertyList-1.0.dtd">
<plist version="1.0">
<dict>
  <key>Label</key>          <string>dev.nexo-rs.agent</string>
  <key>WorkingDirectory</key><string>/Users/you/nexo-rs</string>
  <key>ProgramArguments</key>
  <array>
    <string>/Users/you/nexo-rs/target/release/agent</string>
    <string>--config</string><string>/Users/you/nexo-rs/config</string>
  </array>
  <key>EnvironmentVariables</key>
  <dict>
    <key>RUST_LOG</key><string>info</string>
  </dict>
  <key>RunAtLoad</key>      <true/>
  <key>KeepAlive</key>      <true/>
</dict>
</plist>

launchctl load -w ~/Library/LaunchAgents/dev.nexo-rs.agent.plist
launchctl start dev.nexo-rs.agent

Verify

agent status                    # lists running agents
curl localhost:8080/ready       # readiness
curl localhost:9090/metrics     # Prometheus metrics

See Metrics + health.

Upgrading

cd nexo-rs
git pull
cargo build --release
sudo systemctl restart nexo-rs      # Linux
# or: launchctl kickstart -k gui/$UID/dev.nexo-rs.agent   # macOS

The graceful shutdown sequence drains in-flight work and persists the disk queue before exit.

Uninstalling

sudo systemctl disable --now nexo-rs nats-server
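sudo rm /etc/systemd/system/{nexo-rs,nats-server}.service
sudo systemctl daemon-reload        # drop the removed units from systemd
sudo rm /usr/local/bin/{agent,nats-server}
sudo userdel nexo
sudo rm -rf /srv/nexo-rs            # owned by the nexo user, so root is needed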

Debian / Ubuntu (.deb)

Fedora / RHEL / Rocky (.rpm)

Termux (Android) install

Run nexo-rs directly on an Android phone under Termux. No Docker, no server — a self-hosted agent in your pocket.

Use this path for a personal agent (one phone, one WhatsApp, one Telegram). For multi-tenant / multi-process deployments the regular Linux setup on a server is the right shape.

Quickest path — pre-built `.deb`

Once a v* release is published (recipe lives in packaging/termux/build.sh), download the asset and install with one command:

# Inside Termux on the phone:
curl -LO https://github.com/lordmacu/nexo-rs/releases/latest/download/nexo-rs_aarch64.deb
pkg install ./nexo-rs_aarch64.deb

The deb pulls the runtime deps Termux already ships (libsqlite, openssl, ffmpeg, tesseract, python, yt-dlp). Its postinst scaffolds ~/.nexo/{data,secret} and prints the next steps. Skip the build-from-source section below if this works.

Root vs non-root

Everything in this guide runs without root. You do not need to root your phone to self-host nexo-rs on it.

Root only unlocks extras:

Scenario	Needs root?
Build + run the agent daemon	❌ no
Pair WhatsApp, Telegram, Google	❌ no
Local broker (`broker.type: local`)	❌ no
Native NATS Go binary	❌ no (installs to `$PREFIX/bin`)
`termux-wake-lock`, Termux:Boot autostart	❌ no
Install skills from `pkg` (ffmpeg, tesseract, yt-dlp)	❌ no
MCP client / server mode	❌ no
Browser plugin via `cdp_url` to a chromium you launched yourself	❌ no
Docker compose stack (via `proot-distro` or Linux Deploy)	✅ yes
SELinux permissive (if Chromium sandbox misbehaves)	✅ yes
Running multiple proot-distro containers side by side	✅ yes
Bypass Android's battery optimizer more aggressively	✅ yes

Short version: don't root just for nexo-rs. Root if you want the full compose stack in a Linux-Deploy chroot, otherwise skip it.

What works

Area	Status
Core runtime, memory, TaskFlow, dreaming	✅ full
Broker: `type: local` (in-process) or native NATS Go binary	✅ full
LLM providers (MiniMax / Anthropic / OpenAI-compat / Gemini)	✅ all rustls-based
WhatsApp plugin (pure Rust + Signal Protocol)	✅ pairing via Unicode QR
Telegram plugin	✅ Bot API over HTTP
Gmail / Google plugin + gmail-poller	✅ OAuth over HTTP
Extensions (stdio + NATS)	✅ spawn works
Skills: fetch-url, dns-tools, rss, weather, wikipedia, pdf-extract, brave-search, wolfram-alpha, summarize, translate	✅ pure Rust
MCP client + server	✅ stdio + HTTP
Health / metrics / admin HTTP servers (8080 / 9090 / 9091)	✅ unprivileged ports

What needs a tweak

Thing	Workaround
Service manager (no systemd)	termux-services (runit) or `tmux` + `nohup`
Run at boot	install the Termux:Boot app + drop a script in `~/.termux/boot/`
Survives screen-off	`termux-wake-lock` (from the Termux:API add-on) before running the agent
Browser plugin (Chrome/Chromium)	use `cdp_url:` to a chromium you start manually with `--no-sandbox --disable-dev-shm-usage`; or `disabled: [browser]` if you don't need it
Secrets file permission gauntlet	`export CHAT_AUTH_SKIP_PERM_CHECK=1` (Android filesystem perms model differs)
WhatsApp public tunnel (cloudflared)	skip the public tunnel; pair locally via Unicode QR rendered on the terminal
Docker / compose	use `broker.type: local` or native NATS binary — no containers involved

Prerequisites

From a fresh Termux install:

pkg update
pkg install -y rust git curl sqlite openssl clang pkg-config

Optional (enables specific skills):

pkg install -y ffmpeg tesseract yt-dlp tmux openssh

Optional (browser plugin):

pkg install -y tur-repo
pkg install -y chromium

Optional (run in background without the terminal session alive):

pkg install -y termux-services termux-api
# install the companion app "Termux:API" from F-Droid

Fast path — bootstrap script

The repo's scripts/bootstrap.sh auto-detects Termux and picks the right defaults:

git clone https://github.com/lordmacu/nexo-rs
cd nexo-rs
./scripts/bootstrap.sh --yes

What it does on Termux:

Verifies rust, git, curl, sqlite from pkg
Downloads the static nats-server Go binary (arm64), drops it in $PREFIX/bin/ — or skip with --nats=skip to use the local broker
Creates ./data/** and ./secrets/ (with Termux-compatible perms)
Stages config/agents.d/*.example.yaml → *.yaml if missing
Runs cargo build --release (grab a coffee — ~20–40 min on phone hardware)
Optionally launches agent setup to pair channels

Expect a ~60–100 MB final binary.

Manual install

1. Install Rust and deps

pkg install -y rust git curl sqlite openssl clang pkg-config

2. Clone and build

git clone https://github.com/lordmacu/nexo-rs
cd nexo-rs
cargo build --release --bin agent

3. Broker

Option A — local (simplest):

# config/broker.yaml
broker:
  type: local
  persistence:
    enabled: true
    path: ./data/queue

No NATS binary needed. All pub/sub stays in-process.

Option B — native NATS binary:

curl -L -o /tmp/nats.tar.gz \
  https://github.com/nats-io/nats-server/releases/download/v2.10.20/nats-server-v2.10.20-linux-arm64.tar.gz
tar -xzf /tmp/nats.tar.gz -C /tmp
install -m 0755 "$(find /tmp -name nats-server -type f | head -1)" \
  $PREFIX/bin/nats-server
nats-server -js &

Go binaries are static and work on Termux without libc surprises.

4. Runtime directories and secrets

mkdir -p ./data/{queue,workspace,media,transcripts} ./secrets

Termux stores files under /data/data/com.termux/files/home by default. Avoid pointing config paths at /sdcard — Android's scoped-storage model breaks directory permissions there.

5. Relax the credentials perm check

Android's filesystem does not honour permission bits the way a regular Linux host does, so the credentials gauntlet can refuse to boot on false-positive warnings. Skip the check:

export CHAT_AUTH_SKIP_PERM_CHECK=1

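Add it to your shell startup file (for example ~/.bashrc) or a wrapper shell script so it's set every time. ~/.termux/termux.properties configures the Termux app itself and does not export environment variables.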

6. Launch the wizard

./target/release/agent setup

For the WhatsApp pairing step, the wizard renders the QR as Unicode blocks directly in the terminal — scan from the phone's WhatsApp app (Settings → Linked Devices). No public tunnel needed.

7. Run the agent

termux-wake-lock                # keep CPU awake even with screen off
./target/release/agent --config ./config

Staying alive in the background

Android's aggressive task killing is the biggest operational surprise. Pick one:

A — `termux-wake-lock` + foreground notification

termux-wake-lock
# agent in foreground:
./target/release/agent --config ./config

The wake-lock persists until you run termux-wake-unlock or kill the session. Minimum friction, most reliable.

B — `termux-services` (runit)

pkg install -y termux-services
sv-enable termux-services
mkdir -p ~/.config/service/nexo-rs
cat > ~/.config/service/nexo-rs/run <<'EOF'
#!/data/data/com.termux/files/usr/bin/sh
cd /data/data/com.termux/files/home/nexo-rs
export CHAT_AUTH_SKIP_PERM_CHECK=1
exec ./target/release/agent --config ./config 2>&1
EOF
chmod +x ~/.config/service/nexo-rs/run
sv up nexo-rs
sv status nexo-rs

C — Termux:Boot (start on device boot)

Install the Termux:Boot app from F-Droid, then:

mkdir -p ~/.termux/boot
cat > ~/.termux/boot/start-agent <<'EOF'
#!/data/data/com.termux/files/usr/bin/sh
termux-wake-lock
cd /data/data/com.termux/files/home/nexo-rs
export CHAT_AUTH_SKIP_PERM_CHECK=1
exec ./target/release/agent --config ./config
EOF
chmod +x ~/.termux/boot/start-agent

Disabling the browser plugin

If you don't need headless browser control (most phone-hosted agents don't), drop it from config/extensions.yaml:

extensions:
  disabled: [browser]

Or, if you have tur-repo chromium installed and want nexo-rs to spawn it, use the browser.args field to forward the flags Termux needs:

# config/plugins/browser.yaml
browser:
  headless: true
  executable: /data/data/com.termux/files/usr/bin/chromium
  args:
    - --no-sandbox
    - --disable-dev-shm-usage
    - --disable-gpu

The built-in launch flags still apply; args is appended after them so you can also override any of the built-ins (Chrome's CLI parser uses last-wins).

Alternative: launch chromium yourself and attach via cdp_url:

# config/plugins/browser.yaml
browser:
  # Start chromium yourself with:
  #   chromium --headless --no-sandbox --disable-dev-shm-usage \
  #            --disable-gpu --remote-debugging-port=9222 &
  cdp_url: http://127.0.0.1:9222

When cdp_url is set, args is ignored — nexo-rs doesn't spawn Chrome, only connects to yours.
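The two modes described above (spawn with built-in flags plus `args`, or attach via `cdp_url` and ignore `args`) can be sketched as a small decision function. `BrowserLaunch` and `plan_launch` are hypothetical names, and the `--headless` entry is only a placeholder for whatever built-in flags the plugin really uses.

```rust
/// How the browser plugin would obtain a Chrome instance (hypothetical type).
#[derive(Debug, PartialEq)]
enum BrowserLaunch {
    /// Connect to a Chrome the operator started; no flags are applied.
    Attach { cdp_url: String },
    /// Spawn Chrome with the built-in flags followed by the configured `args`.
    Spawn { args: Vec<String> },
}

fn plan_launch(cdp_url: Option<&str>, builtin: &[&str], extra: &[&str]) -> BrowserLaunch {
    match cdp_url {
        // cdp_url wins: `extra` is ignored entirely.
        Some(url) => BrowserLaunch::Attach { cdp_url: url.to_string() },
        // Operator args go last so Chrome's last-wins parsing lets them override.
        None => BrowserLaunch::Spawn {
            args: builtin.iter().chain(extra).map(|s| s.to_string()).collect(),
        },
    }
}

fn main() {
    let builtin = ["--headless"]; // placeholder built-in flag
    let extra = ["--no-sandbox", "--disable-dev-shm-usage", "--disable-gpu"];
    println!("{:?}", plan_launch(None, &builtin, &extra));
    println!("{:?}", plan_launch(Some("http://127.0.0.1:9222"), &builtin, &extra));
}
```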

Verify

curl localhost:8080/ready
curl localhost:9090/metrics
./target/release/agent status

Upgrading

cd ~/nexo-rs
git pull
cargo build --release
# restart under whichever method you picked (wake-lock / runit / Boot)

The agent's graceful shutdown still runs on SIGTERM under Termux: closing the Termux session or killing the process (with the default SIGTERM) drains the disk queue cleanly.

Install — Nix

Nexo ships a Nix flake that pins the toolchain (Rust 1.80, MSRV) and the native build deps so a contributor or operator can go from clean shell to working binary without touching the host system.

Run without installing

nix run github:lordmacu/nexo-rs -- --help

First invocation builds from source (~3-5 min on cold cache); subsequent runs hit the local Nix store.

Build a local binary

nix build github:lordmacu/nexo-rs
./result/bin/nexo --help

The binary is the same nexo produced by cargo build --release --bin nexo. The build leaves a result/ symlink whose binary the operator can link into /usr/local/bin/ or copy elsewhere.

Contributor dev shell

git clone https://github.com/lordmacu/nexo-rs
cd nexo-rs
nix develop

Drops you into a shell with:

rustc 1.80 + cargo + clippy + rustfmt + rust-src
cargo-edit, cargo-watch, cargo-nextest, cargo-deny
mdbook + mdbook-mermaid (for mdbook build docs)
sqlite, pkg-config, openssl, libgit2 (build deps)

RUST_LOG=info is exported by default. The toolchain version is pinned in flake.nix — bump in lockstep with [workspace.package].rust-version in Cargo.toml.

What the flake does NOT install

The nexo binary alone is not enough for full functionality. Runtime tools the channel plugins shell out to live at the system level, not in the flake:

Chrome / Chromium — required by the browser plugin
cloudflared — used by the tunnel plugin
ffmpeg — media transcoding for WhatsApp voice notes
tesseract-ocr — OCR skill
yt-dlp — the yt-dlp extension

Operators install these via their distro's package manager. The native install guide lists the apt / pacman / brew commands. The Docker image bundles all of them — that's the path of least friction for a "just works" deploy.

Pinning a release

Once v* tags are published, pin to a specific release:

nix run github:lordmacu/nexo-rs/v0.1.1 -- --help

Or in a flake input:

{
  inputs.nexo-rs.url = "github:lordmacu/nexo-rs/v0.1.1";
}

Verifying the build

nix flake check

This verifies the flake metadata and evaluates all outputs (packages, apps, devShells, formatter) without building the packages. Useful in CI to catch flake regressions early.

Troubleshooting

"experimental feature 'flakes' is disabled" — add to ~/.config/nix/nix.conf:
```
experimental-features = nix-command flakes
```
First build is very slow — the build re-fetches and re-compiles every cargo dependency in the sandbox. Subsequent builds are cached. A future Phase 27.x will publish a cachix cache so nix build pulls the binary directly.
Build fails on macOS arm64 — git2-rs occasionally lags on Apple silicon. Workaround: build the binary inside the Docker image instead (see Docker).

Agent-centric setup wizard

The hub menu's Configurar agente (canal, modelo, idioma, skills) entry drops the operator into a per-agent submenu. Where the rest of the wizard groups actions by service (Telegram, OpenAI, the browser plugin), this submenu groups them by agent: pick one agent up front, then mutate its model, language, channels, and skills from a single dashboard. Every action reuses the existing channel / LLM / skill flows underneath, so behavior stays in lockstep with the rest of the wizard.

./target/release/agent setup
# → Configurar agente (canal, modelo, idioma, skills)

Dashboard

Agente: kate
  Modelo:   anthropic / claude-haiku-4-5  [creds ✔]
  Idioma:   es
  Canales:  ✔ telegram:default  (bound)
            ✗ whatsapp:default  (unbound)
  Skills:   8 / 24 attached

The dashboard is recomputed from disk on every loop iteration, so the screen always reflects the most recent YAML state.

After the dashboard renders, the operator picks one of:

Action	Effect
`Modelo`	Attach / detach / change the LLM provider + model name. Re-uses the LLM credential form when secrets are missing.
`Idioma`	Pick from `es / en / pt / fr / it / de`, or clear the directive.
`Canales`	Auth/Reauth, Bind, or Unbind a channel for this agent. Auth flows are the same `services_imperative` dispatchers the legacy menu uses.
`Skills`	Multi-select against the skill catalog. Newly added skills with required secrets prompt for creds.
`← volver`	Exit the submenu, return to the hub.

YAML mutations

Action	YAML path	Operation
Attach model	`agents[<id>].model.provider`, `…model.model`	`upsert_agent_field`
Detach model	`agents[<id>].model`	`remove_agent_field`
Set language	`agents[<id>].language`	`upsert_agent_field`
Clear language	`agents[<id>].language`	`remove_agent_field`
Bind channel	`agents[<id>].plugins[]`, `agents[<id>].inbound_bindings[]`	`append_agent_list_item` (idempotent)
Unbind channel	`agents[<id>].plugins[]`, `agents[<id>].inbound_bindings[]`	`remove_agent_list_item` by predicate
Replace skills	`agents[<id>].skills`	`upsert_agent_field` (full sequence)

All mutations land atomically (tempfile + rename) and are gated by the same process-wide YAML mutex the legacy upsert path uses, so concurrent wizard sessions don't corrupt the file.
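
The tempfile + rename pattern described above can be sketched in a few lines of standard-library Rust. This is a minimal illustration, not the code in crates/setup/src/yaml_patch.rs: the function name `write_atomic`, the `YAML_LOCK` static, and the `.yaml.tmp` suffix are all assumptions made for the example.

```rust
use std::fs::{self, File};
use std::io::{self, Write};
use std::path::Path;
use std::sync::Mutex;

// Hypothetical process-wide lock; the wizard's real mutex may be shaped differently.
static YAML_LOCK: Mutex<()> = Mutex::new(());

/// Replace `target` with `contents` so that readers observe either the old
/// file or the new one, never a half-written file.
pub fn write_atomic(target: &Path, contents: &str) -> io::Result<()> {
    // Serialize writers inside this process.
    let _guard = YAML_LOCK
        .lock()
        .unwrap_or_else(|poisoned| poisoned.into_inner());

    // The temp file sits next to the target so the rename stays on one filesystem.
    let tmp = target.with_extension("yaml.tmp");
    let mut file = File::create(&tmp)?;
    file.write_all(contents.as_bytes())?;
    file.sync_all()?; // flush data to disk before swapping the files
    drop(file);

    // On POSIX filesystems rename replaces the target in one step.
    fs::rename(&tmp, target)
}
```

A mutex of this kind only orders writers inside one process; the rename is what keeps a reader in another process from seeing a partial file.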

Hot-reload

After every successful mutation, the wizard fires a best-effort nexo --config <dir> reload so a running daemon picks up the YAML edit without a manual restart. The call is fire-and-forget: when the binary isn't on PATH or the daemon isn't running, the wizard keeps going silently.
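
A fire-and-forget call of this kind might look like the sketch below. It is an assumption-laden illustration rather than the wizard's code: `fire_reload` is a hypothetical helper, and the binary name is a parameter only so the example can be exercised without a real nexo install.

```rust
use std::path::Path;
use std::process::{Command, Stdio};

/// Best-effort reload. Returns true when the command could be spawned and
/// false otherwise (for example when the binary is not on PATH).
/// The caller is expected to ignore the result and keep going.
pub fn fire_reload(bin: &str, config_dir: &Path) -> bool {
    Command::new(bin)
        .arg("--config")
        .arg(config_dir)
        .arg("reload")
        // Detach from the wizard's terminal so the reload stays silent.
        .stdin(Stdio::null())
        .stdout(Stdio::null())
        .stderr(Stdio::null())
        .spawn() // no wait(): the wizard does not block on the daemon
        .is_ok()
}
```

Because the child is never waited on, a long-lived caller would eventually want to reap it; for a short-lived wizard session that is usually acceptable.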

Where the code lives

crates/setup/src/agent_wizard.rs — submenu + dashboard.
crates/setup/src/yaml_patch.rs — read_agent_field, upsert_agent_field, remove_agent_field, append_agent_list_item, remove_agent_list_item.
crates/setup/tests/agent_wizard_yaml.rs — schema-roundtrip tests that re-parse the mutated YAML through nexo_config::AgentsConfig.

Reproducible builds + SBOM

Every Nexo release ships with two artefacts that let an operator verify provenance and exact composition:

CycloneDX SBOM (sbom-cyclonedx.json) — every cargo dependency at the exact version + hash that was compiled into the binary.
SPDX SBOM (sbom-spdx.json) — full filesystem scan via syft, captures anything that wasn't a cargo dep (bundled binaries, generated assets, vendored data files).

Both SBOMs are Cosign-signed (*.bundle) using the same keyless OIDC chain documented in Verifying releases.

Reading the SBOMs

# Pretty-print the CycloneDX dep tree:
jq '.components | map({name, version, purl})' sbom-cyclonedx.json | less

# Find a specific crate:
jq '.components[] | select(.name == "tokio")' sbom-cyclonedx.json

# Audit the cargo deps with `cargo-audit`. It reads Cargo.lock, so run it
# from a checkout of the release tag (no rebuild needed):
cargo audit --db ~/.cargo/advisory-db --json | \
  jq -r '.vulnerabilities.list[].advisory.id'

Reproducible build claim

The release workflow targets a bit-identical binary across two runs given the same git sha, rust-toolchain.toml, and Cargo.lock. The pipeline pins:

Rust toolchain: rust-toolchain.toml fixes the channel + components.
Dependency versions: Cargo.lock is committed and --locked is used by every release build.
Build environment: GitHub Actions ubuntu-latest runner + cargo build --release with no RUSTFLAGS overrides.
Build provenance: SLSA Level 2 attestation generated by actions/attest-build-provenance (Phase 27.2 wiring).
Cosign signature: each binary + SBOM signed via OIDC (Phase 27.3).

Reproducing a release locally

# 1. Check out the exact tag.
git clone https://github.com/lordmacu/nexo-rs && cd nexo-rs
git checkout v0.1.1

# 2. Build with the locked deps.
rustup show       # confirms the toolchain matches rust-toolchain.toml
cargo build --release --bin nexo --locked

# 3. Compare your binary's sha256 against the release asset:
sha256sum target/release/nexo
# Expected: same hash listed in `sha256sums.txt` on the GitHub release.

If the hashes don't match: the build is not reproducible on your host. Common reasons:

Different glibc version → embedded __VERSIONED_SYMBOL strings drift. The release workflow runs on ubuntu-latest (currently Ubuntu 24.04, glibc 2.39); building on Debian 12 (glibc 2.36) produces different bytes.
Different LLVM in your local rustc build (rare, mostly affects Mac users compiling with stable + nightly side-by-side).
Local ~/.cargo/config.toml injecting RUSTFLAGS.
Building with the dev profile instead of the release profile (missing --release).

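For a controlled reproduction, build inside a pinned container:

docker run --rm -v "$(pwd)":/src -w /src \
  rust:1.80-bookworm \
  cargo build --release --bin nexo --locked

This pins glibc, the toolchain version, and LLVM, so two builds inside the container should match each other. Note that rust:1.80-bookworm is Debian 12 (glibc 2.36), not the Ubuntu 24.04 environment of the ubuntu-latest runner, so the hash matches the release asset only if the release job builds in this same image.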

SLSA verification

The workflow attaches an attestation.intoto.jsonl (SLSA Level 2 provenance) per release. Verify with slsa-verifier:

go install github.com/slsa-framework/slsa-verifier/v2/cli/slsa-verifier@latest

slsa-verifier verify-artifact nexo \
  --provenance-path attestation.intoto.jsonl \
  --source-uri github.com/lordmacu/nexo-rs \
  --source-tag v0.1.1

A green verification proves:

The artefact came from the lordmacu/nexo-rs repo
It was built by a GitHub-hosted runner (not a fork or local box)
The build inputs match what's recorded in the provenance

Auditing for known CVEs

The SBOMs let you scan for known CVEs without rebuilding:

# Convert CycloneDX → cargo-audit's format:
cyclonedx-cli convert --input-file sbom-cyclonedx.json \
  --input-format json --output-format json | \
  jq '...' > deps.json

# Or just feed it to grype (broader scope, multi-format):
grype sbom:./sbom-cyclonedx.json
grype sbom:./sbom-spdx.json

Grype catches CVEs across both Rust crates and any system-level deps captured by syft.

Out of scope (deferred)

apk / pkg SBOM for the Termux deb — Termux's package metadata doesn't speak SPDX yet. The release SBOMs cover the same artifact contents though.
Reproducible Docker image layers — the current Dockerfile uses apt-get update && apt-get install which pulls whatever's latest at build time. Pinning to specific Debian package versions is a follow-up (Phase 34 hardening).

MCP server from Claude Desktop

Expose nexo-rs tools (memory, Gmail, WhatsApp send, browser, etc.) to the Anthropic desktop app so your agent-sandboxed capabilities show up inside Claude conversations.

Same technique works for Cursor, Zed, and anything else that speaks MCP — the config shape is identical.

Prerequisites

Built agent binary at a known path (e.g. /usr/local/bin/agent)
A working config/ directory (reuse the one your daemon normally uses, or point at a dedicated one)
Anthropic API key (or OAuth bundle) configured for the agent

1. Enable the MCP server

config/mcp_server.yaml:

enabled: true
name: nexo
allowlist:
  - memory_*           # recall + store + history
  - forge_memory_checkpoint
  - google_*           # if you paired Google OAuth
  - browser_*          # if you want Claude to drive Chrome
expose_proxies: false  # hide ext_* and mcp_* from the IDE
auth_token_env: ""     # leave empty for local spawn; set if tunneling

Pick the smallest allowlist that covers what you want the IDE to do. Each glob is power you're handing the IDE user.

2. Wire Claude Desktop

Edit ~/Library/Application Support/Claude/claude_desktop_config.json (macOS) or %APPDATA%\Claude\claude_desktop_config.json (Windows):

{
  "mcpServers": {
    "nexo": {
      "command": "/usr/local/bin/agent",
      "args": ["mcp-server", "--config", "/srv/nexo-rs/config"],
      "env": {
        "RUST_LOG": "info",
        "AGENT_LOG_FORMAT": "json"
      }
    }
  }
}

Restart Claude Desktop. The nexo block should appear in the tool picker; pick tools from it the same way you pick built-ins.

3. Verify

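Ask Claude: "use the nexo tool my_stats and show me the output." The tool you name must be covered by the allowlist from step 1; with the sample allowlist above, memory_recall is a safe choice.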

If it works, Claude Desktop spawns agent mcp-server as a subprocess and exchanges JSON-RPC with it over stdin/stdout. Logs go to Claude's app-level log file plus the stderr of the spawned agent (format configurable via AGENT_LOG_FORMAT=json).

Wire shape

sequenceDiagram
    participant CD as Claude Desktop
    participant A as agent mcp-server
    participant TR as ToolRegistry
    participant MEM as Memory tool
    participant LTM as SQLite

    CD->>A: spawn subprocess
    CD->>A: initialize
    A-->>CD: {capabilities: {tools}}
    CD->>A: notifications/initialized
    CD->>A: tools/list
    A->>TR: enumerate (allowlist-filtered)
    TR-->>A: tool defs
    A-->>CD: [memory_recall, memory_store, …]
    CD->>A: tools/call {name: memory_recall, args: {query: "..."}}
    A->>MEM: invoke
    MEM->>LTM: SELECT ...
    LTM-->>MEM: rows
    MEM-->>A: result
    A-->>CD: content

Recipes within the recipe

Recall my cross-session memory from Claude

Allowlist:

allowlist:
  - memory_recall
  - memory_history

Now inside a Claude conversation: "recall what I told you about Luis's address last week." Claude calls memory_recall on your agent's SQLite — Claude itself has no persistent memory; this is how you give it one.

Post to WhatsApp from Claude

Allowlist:

allowlist:
  - whatsapp_send_message

⚠ Be careful. This gives whoever sits at the IDE the ability to send WhatsApp messages from your paired account. Only enable if you trust the IDE user as much as you'd trust the agent.

Read-only Gmail from Claude

Allowlist:

allowlist:
  - google_auth_status
  - google_call

Pair with GOOGLE_ALLOW_SEND= (unset) to keep the google_call tool read-only.

Auth token

If you expose the MCP server over a tunnel (not a local spawn), set auth_token_env to guard the initialize call:

auth_token_env: NEXO_MCP_TOKEN

Then set NEXO_MCP_TOKEN in the agent's env and have the client send it on initialize. Clients that don't present the token are rejected.

Gotchas

expose_proxies: true transitively exposes every upstream MCP server. If the agent already consumes a Gmail MCP server, turning this on lets Claude reach through — usually not what you want.
Allowlist globs match whole tool names. memory_* is OK; mem* is not. Enumerate the real tool names with agent ext list before wiring globs.
Rate limits still apply. whatsapp_send_message through this path counts against the same WhatsApp rate bucket as the agent's own uses.

Future marketing plugin: multi-client autonomous operation

This recipe shows how the current runtime can operate with a future marketing plugin without changing core architecture.

Goal

Run multiple marketing clients in the same system while preserving:

strict client isolation by plugin instance
per-agent model isolation (no cross-token usage)
autonomous review loops with operator interrupts

1. Agent template

Start from:

config/agents.d/marketing.multiclient.example.yaml

The template maps one client surface to one agent via strict bindings:

marketing_acme_intake listens only to plugin.inbound.marketing.acme_inbox
marketing_bravo_retention listens only to plugin.inbound.marketing.bravo_retention
marketing_charlie_exec listens only to plugin.inbound.marketing.charlie_exec

Each agent has its own model.provider + model.model.

2. Future plugin event contract

Your future plugin should publish inbound events with this shape:

topic: plugin.inbound.marketing.<instance>
Event.session_id: deterministic UUID per conversation thread
payload fields:
- text (required for text turns)
- from (sender/account/contact id)
- priority (optional: now | next | later)
- optional metadata (channel, campaign_id, thread_id, etc.)

Minimal example payload:

{
  "text": "Customer asked to pause campaign due to legal review.",
  "from": "client-ops@acme.com",
  "priority": "next"
}

Urgent interrupt payload:

{
  "text": "STOP ALL SENDS NOW",
  "from": "head-of-marketing@acme.com",
  "priority": "now"
}

3. Practical flow with current runtime logic

The plugin publishes on plugin.inbound.marketing.acme_inbox.
Runtime matches inbound_bindings strictly by (plugin, instance).
marketing_acme_intake receives the event; other agents do not.
Turn runs with ACME agent model only (MiniMax-M2.5 in template).
If the agent chooses Sleep, runtime schedules proactive wake and injects a synthetic <tick> later.
If priority: now arrives during an in-flight turn, runtime preempts the current turn and handles the urgent message first.
A BRAVO event on plugin.inbound.marketing.bravo_retention runs on the BRAVO agent model only (claude-haiku-4-5 in template).
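
The strict (plugin, instance) matching in the flow above can be sketched as follows. This is a hedged illustration of the routing rule only: the `Binding` struct, `parse_inbound_topic`, and `route` are hypothetical names, and the parser assumes the four-token topic shape from the event contract in section 2.

```rust
/// One inbound binding: the (plugin, instance) pair an agent listens to.
#[derive(Debug, Clone, PartialEq)]
pub struct Binding {
    pub agent_id: String,
    pub plugin: String,
    pub instance: String,
}

/// Split `plugin.inbound.<plugin>.<instance>` into (plugin, instance).
/// Any other shape (extra tokens, missing instance, other prefix) yields None.
pub fn parse_inbound_topic(topic: &str) -> Option<(&str, &str)> {
    let mut parts = topic.split('.');
    match (parts.next(), parts.next(), parts.next(), parts.next(), parts.next()) {
        (Some("plugin"), Some("inbound"), Some(plugin), Some(instance), None)
            if !plugin.is_empty() && !instance.is_empty() =>
        {
            Some((plugin, instance))
        }
        _ => None,
    }
}

/// Strict match: plugin and instance must both be equal. No wildcards and
/// no fallback agent, so an event for one client never reaches another.
pub fn route<'a>(bindings: &'a [Binding], topic: &str) -> Vec<&'a str> {
    let Some((plugin, instance)) = parse_inbound_topic(topic) else {
        return Vec::new();
    };
    bindings
        .iter()
        .filter(|b| b.plugin == plugin && b.instance == instance)
        .map(|b| b.agent_id.as_str())
        .collect()
}
```

With bindings for acme_inbox and bravo_retention, a topic ending in charlie_exec routes to no agent at all, which is the isolation property the template relies on.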

4. Why this is already ready for production hardening

Queue priority is built-in (now > next > later) with in-flight preemption.
Proactive loop is built-in (Sleep + wake <tick> + daily budget).
Session isolation is built-in (per-session debounce task).
Binding isolation is built-in (strict plugin/instance matching).
Model isolation is built-in (resolved from the matched agent/binding policy).
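
To make the now > next > later ordering concrete, here is a minimal sketch of a priority queue that is FIFO within each level. It is not the runtime's queue: `TurnQueue`, `has_interrupt`, and the choice to treat a missing priority as next are assumptions made for the example.

```rust
use std::cmp::Reverse;
use std::collections::BinaryHeap;

/// Declaration order defines the derived ordering: Later < Next < Now.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
pub enum Priority {
    Later,
    Next,
    Now,
}

impl Priority {
    /// Parse the optional `priority` payload field.
    /// Assumption: a missing or unknown value is treated as `next`.
    pub fn parse(raw: Option<&str>) -> Priority {
        match raw {
            Some("now") => Priority::Now,
            Some("later") => Priority::Later,
            _ => Priority::Next,
        }
    }
}

#[derive(Default)]
pub struct TurnQueue {
    // Max-heap on (priority, Reverse(arrival)): highest priority first,
    // and the earliest arrival first within one priority level.
    heap: BinaryHeap<(Priority, Reverse<u64>, String)>,
    seq: u64,
}

impl TurnQueue {
    pub fn push(&mut self, priority: Priority, text: &str) {
        self.heap.push((priority, Reverse(self.seq), text.to_string()));
        self.seq += 1;
    }

    pub fn pop(&mut self) -> Option<(Priority, String)> {
        self.heap.pop().map(|(priority, _, text)| (priority, text))
    }

    /// True when a `now` message is waiting, i.e. an in-flight turn
    /// should be preempted.
    pub fn has_interrupt(&self) -> bool {
        matches!(self.heap.peek(), Some((Priority::Now, _, _)))
    }
}
```

A turn loop built on this would poll `has_interrupt` between steps and, when it returns true, stop the current turn and pop the urgent message first.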

5. Rollout checklist when plugin is built

Emit deterministic session_id per thread.
Publish to instance-scoped inbound topics (plugin.inbound.marketing.<instance>).
Send priority: now only for real interrupts.
Keep one agent per client surface when you need strict token/cost isolation.
Narrow allowed_tools once the marketing tool surface is finalized.

Architecture Decision Records

Short documents capturing why the architecture is the way it is. Each ADR names an alternative that was considered and rejected, and the forces that drove the choice. Read these when you're tempted to change something load-bearing.

Format loosely follows Michael Nygard's ADR template: context, decision, consequences.

Index

#	Title	Status
0001	Single-process runtime over microservices	Accepted
0002	NATS as the broker	Accepted
0003	sqlite-vec for vector search	Accepted
0004	Per-agent tool sandboxing at registry build time	Accepted
0005	Drop-in `agents.d/` directory for private configs	Accepted
0006	Per-agent git repo for memory forensics	Accepted
0007	WhatsApp via whatsapp-rs (Signal Protocol)	Accepted
0008	MCP dual role — client and server	Accepted
0009	Dual MIT / Apache-2.0 licensing	Accepted

Writing a new ADR

Copy the template (next ADR below, or use 0001 as a reference)
Number sequentially: NNNN-short-slug.md
Set status: Proposed while in review, flip to Accepted or Rejected after the discussion settles
Link from this index
Do not edit accepted ADRs in place. Create a new ADR that supersedes it and mark the old one Superseded by NNNN.

ADRs are load-bearing documentation — they're how future you (and future contributors) learn that "NATS over RabbitMQ was not an accident."

ADR 0001 — Single-process runtime over microservices

Status: Accepted Date: 2026-01

Context

nexo-rs hosts N agents, each with its own LLM client, channel plugins, memory views, and extensions. The natural first instinct for Rust systems targeting real uptime is to split this into microservices: an agent service, a plugin service per channel, a memory service, etc., wired over the broker.

Every microservice adds:

A serialization boundary (more CPU, more latency)
A deployment artifact (more Dockerfiles, more CI)
A failure mode (service down vs process down)
An ops surface (metrics, health, logs per service)

The alternative — one binary hosting every subsystem as tokio tasks — gives up none of the durability (the disk queue + DLQ survive a process restart anyway) and keeps all in-memory caches naturally shared.

Decision

Ship one binary (agent) that hosts:

Every agent runtime (one tokio task per agent)
Every channel plugin (WhatsApp, Telegram, browser, …)
Broker client + disk queue + DLQ
Memory (short-term in-mem, long-term SQLite, vector sqlite-vec)
Extension runtimes (stdio / NATS)
MCP client and server
TaskFlow runtime
Metrics + health + admin HTTP servers

Coordination between tasks happens over the broker (NATS or the local mpsc fallback) exactly as if they were separate processes. Swapping to microservices later requires zero code changes on either side of the bus.

Consequences

Positive

One Dockerfile, one health probe, one metrics endpoint
No IPC overhead on hot paths (LLM tool calls go ToolRegistry → Extension through a tokio channel, not a network hop)
Memory caches (session, tool registry) are naturally shared
Simpler ops: one log stream, one trace span hierarchy

Negative

A bug that panics the process takes down every agent at once (the single-instance lockfile mitigates the blast radius by preventing silent double-boot)
Scaling out means running more agent processes pointed at the same NATS — isolation between them requires deliberate NATS subject partitioning

Escape hatch

If a subsystem needs its own lifecycle (example: a GPU-heavy inference service), ship it as a NATS extension — it's automatically out-of-process and auto-discovered by the agent. Microservices by the back door, without splitting the monolith first.

ADR 0002 — NATS as the broker

Status: Accepted Date: 2026-01

Context

The event bus sits under every inter-plugin and inter-agent communication. Requirements:

Subject-based routing with wildcards (plugin.inbound.*, agent.route.<id>)
Low-latency pub/sub (sub-millisecond on LAN)
No broker-side state to manage unless we opt in
Clustered production deployments
Mature async Rust client
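
The wildcard requirement refers to NATS subject semantics, where `*` matches exactly one dot-separated token and `>` matches one or more trailing tokens. The sketch below restates those rules in plain Rust for illustration; `subject_matches` is a hypothetical helper, not part of async-nats or of the nexo-rs broker code.

```rust
/// NATS-style subject match.
/// `*` matches exactly one token; `>` matches one or more trailing tokens
/// and is only valid as the last token of the pattern.
pub fn subject_matches(pattern: &str, subject: &str) -> bool {
    let mut pat = pattern.split('.');
    let mut sub = subject.split('.');
    loop {
        match (pat.next(), sub.next()) {
            // `>` needs at least one remaining subject token and must end the pattern.
            (Some(">"), Some(_)) => return pat.next().is_none(),
            (Some("*"), Some(tok)) => {
                if tok.is_empty() {
                    return false;
                }
            }
            (Some(p), Some(s)) => {
                if p != s {
                    return false;
                }
            }
            // Both exhausted at the same time: every token matched.
            (None, None) => return true,
            // One side ran out before the other.
            _ => return false,
        }
    }
}
```

Under these rules plugin.inbound.* matches plugin.inbound.whatsapp but not plugin.inbound.whatsapp.default, while plugin.inbound.> matches both.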

Alternatives considered:

RabbitMQ — heavier, queue-per-binding mental model fits less well for fan-out across plugin instances, ops overhead higher
Redis streams / pub-sub — streams are great for durable event logs but the stream-per-subject model clashes with free-form plugin.outbound.<channel>.<instance> naming; pub-sub has no durability
Kafka — overkill for sub-millisecond request/reply loops, heavy ops, partition count becomes a thing you think about
Custom over TCP — too much invented complexity

Additional implementation note: a crate literally called natsio came up in early design research; it does not exist on crates.io. The real Rust client is async-nats (from the NATS org itself), matching the NATS 2.10 server line.

Decision

Use NATS as the broker. Specifically:

Client: async-nats = "0.35" (pinned in Cargo.toml)
Subject namespace: plugin.inbound.*, plugin.outbound.*, plugin.health.*, agent.events.*, agent.route.*
Fallback: a local tokio::mpsc bus implementing the same Broker trait for offline / single-machine runs
Durability: SQLite disk queue in front of every publish; drains FIFO on reconnect; 3 attempts before DLQ
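
The durability rule (FIFO drain, 3 attempts, then DLQ) can be modelled in memory as below. This is a simplified sketch under stated assumptions: the real queue is SQLite-backed, and `OutboundQueue`, `Pending`, and the choice to stop draining at the first message that still has retries left are illustrative, not taken from the nexo-rs source.

```rust
use std::collections::VecDeque;

pub const MAX_ATTEMPTS: u32 = 3;

#[derive(Debug, Clone, PartialEq)]
pub struct Pending {
    pub subject: String,
    pub payload: String,
    pub attempts: u32,
}

#[derive(Default)]
pub struct OutboundQueue {
    pub pending: VecDeque<Pending>,
    pub dead_letters: Vec<Pending>,
}

impl OutboundQueue {
    pub fn enqueue(&mut self, subject: &str, payload: &str) {
        self.pending.push_back(Pending {
            subject: subject.to_string(),
            payload: payload.to_string(),
            attempts: 0,
        });
    }

    /// Drain in FIFO order, e.g. after a reconnect. `publish` returns true on
    /// success. Returns the number of messages delivered in this pass.
    pub fn drain<F>(&mut self, mut publish: F) -> usize
    where
        F: FnMut(&Pending) -> bool,
    {
        let mut delivered = 0;
        while let Some(mut msg) = self.pending.pop_front() {
            if publish(&msg) {
                delivered += 1;
                continue;
            }
            msg.attempts += 1;
            if msg.attempts >= MAX_ATTEMPTS {
                // Give up on this message and let the rest of the queue move.
                self.dead_letters.push(msg);
            } else {
                // Keep its place at the head and retry on the next drain.
                self.pending.push_front(msg);
                break;
            }
        }
        delivered
    }
}
```

Stopping at the first retryable failure preserves ordering at the cost of head-of-line blocking; a design that skips ahead would trade ordering for throughput.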

Consequences

Positive

Standard ops path (monitor on :8222/healthz, prometheus exporter, clustering via well-known recipes)
Pub/sub semantics are trivial to reason about
Swapping in JetStream later for persistent streams is additive
Zero broker state in the happy path — restart NATS without catastrophe thanks to the disk queue

Negative

NATS auth (NKey / JWT) has its own learning curve — see the NATS TLS + auth recipe
No built-in message ordering guarantee across subjects (only per-subscriber). Callers that need ordering (e.g. delegation with correlation id) must enforce it themselves

Forbidden anti-pattern

Do not use natsio or any other non-async-nats client. The crate doesn't exist on crates.io; copy-paste from older design docs will mislead.

ADR 0003 — sqlite-vec for vector search

Status: Accepted Date: 2026-02

Context

Agents benefit from semantic recall — surface a memory whose text doesn't share keywords with the query but shares meaning. The usual playbook: run a dedicated vector database.

Requirements:

Zero extra infrastructure for single-machine deployments
Same durability and transactional model as the rest of memory
Embedding-dimension sanity checks at startup
Hybrid retrieval (keyword ⊔ vector) without a separate query plane

Alternatives considered:

Qdrant / Weaviate / Milvus — all excellent; all require an extra service, network hop, and ops surface
pgvector — would force Postgres everywhere, abandoning SQLite for long-term memory
Simple numpy file + linear scan — works for small datasets, falls over past ~10k memories per agent

Decision

Use sqlite-vec: a SQLite extension that adds a vec0 virtual table in the same DB file as long-term memory.

One SQLite file holds memories, memories_fts, and vec_memories — a single JOIN returns content + tags alongside similarity
Dimension is checked at schema init; mismatch between config and existing rows aborts startup with an explicit message
sqlite3_auto_extension registers once per process
Hybrid retrieval uses Reciprocal Rank Fusion (K=60) over the keyword FTS5 hits and the vector neighbors
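
Reciprocal Rank Fusion scores each id as the sum of 1 / (K + rank) over every ranked list it appears in. The sketch below shows the arithmetic with K=60; `rrf_fuse` is a hypothetical helper, and 1-based ranks plus the id tie-break are assumptions, since the ADR does not spell them out.

```rust
use std::collections::HashMap;

pub const RRF_K: f64 = 60.0;

/// Fuse several ranked id lists (best hit first) into one ranking.
/// score(id) = sum over lists of 1 / (K + rank), with rank starting at 1.
pub fn rrf_fuse(rankings: &[&[i64]]) -> Vec<(i64, f64)> {
    let mut scores: HashMap<i64, f64> = HashMap::new();
    for ranking in rankings {
        for (idx, id) in ranking.iter().enumerate() {
            let rank = idx as f64 + 1.0;
            *scores.entry(*id).or_insert(0.0) += 1.0 / (RRF_K + rank);
        }
    }
    let mut fused: Vec<(i64, f64)> = scores.into_iter().collect();
    // Highest score first; ties broken by id so the output is deterministic.
    fused.sort_by(|a, b| b.1.total_cmp(&a.1).then(a.0.cmp(&b.0)));
    fused
}
```

For keyword hits [1, 2, 3] and vector neighbors [3, 1, 4], memory 1 scores 1/61 + 1/62 and memory 3 scores 1/63 + 1/61, so both outrank the ids that appear in only one list.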

Consequences

Positive

Zero-infra single-machine deploys keep working — no extra service to run
Backups, replication, export are all just "copy the .db file"
Transactional writes: INSERT into memories + vec_memories in one transaction; no dual-write races
Hybrid retrieval is easy (see vector docs)

Negative

sqlite-vec is newer than Qdrant; its indexing algorithm improves over time. Large indexes may need re-sorting periodically
Changing embedding models (even same-dimension ones) leaves a stale index. This ADR doesn't solve that; users must reindex
The sqlite3_auto_extension registration happens once per process and has caught test suites that spawn many short-lived connections off-guard

Swap-out path

EmbeddingProvider is a trait and the recall_mode = vector branch is a single code path. Replacing sqlite-vec with Qdrant is a day's work, not a rewrite.

ADR 0004 — Per-agent tool sandboxing at registry build time

Status: Accepted Date: 2026-02

Context

The same process hosts agents with very different blast radii. Ana runs on WhatsApp against leads; Kate manages a personal Telegram; ops has Proxmox credentials. The LLM in one agent must never see — let alone invoke — tools registered for another agent.

Three enforcement points are possible:

Prompt-level sandboxing — "don't use these tools." Relies on model compliance. Fails under adversarial prompts.
Runtime filter — every tools/call checks a policy before dispatch. Robust, but the LLM still sees the tools in tools/list and can hallucinate calls.
Registry build-time pruning — the agent's ToolRegistry is built with only the allowed tools. The LLM literally cannot see the others.

Decision

Default to registry build-time pruning.

allowed_tools: [] (empty) = every registered tool visible
allowed_tools: [glob, …] = strict allowlist, tools not matching are removed from the registry before the LLM's tools/list call is answered
For agents with inbound_bindings[], the base registry keeps every tool and per-binding overrides apply the same pruning when each turn's registry is resolved, so a single agent can narrow its surface differently per channel
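
The allowed_tools semantics above (empty list shows everything, otherwise a strict glob allowlist) can be sketched like this. It is an illustration only: `glob_match` and `build_registry` are hypothetical names, the glob dialect is assumed to support just `*`, and browser_open is a made-up tool name.

```rust
/// Minimal glob: `*` matches any run of characters (including none);
/// every other character is literal. The whole name must match.
pub fn glob_match(pattern: &str, name: &str) -> bool {
    let p: Vec<char> = pattern.chars().collect();
    let n: Vec<char> = name.chars().collect();
    let (mut pi, mut ni) = (0, 0);
    // (pattern index after the last '*', name index it was tried at)
    let mut backtrack: Option<(usize, usize)> = None;
    while ni < n.len() {
        if pi < p.len() && p[pi] == '*' {
            backtrack = Some((pi + 1, ni));
            pi += 1;
        } else if pi < p.len() && p[pi] == n[ni] {
            pi += 1;
            ni += 1;
        } else if let Some((bp, bn)) = backtrack {
            // Let the last '*' absorb one more character and retry.
            pi = bp;
            ni = bn + 1;
            backtrack = Some((bp, bn + 1));
        } else {
            return false;
        }
    }
    // Only trailing '*' may remain in the pattern.
    p[pi..].iter().all(|c| *c == '*')
}

/// Build the tool list one agent's LLM will see.
/// An empty allowlist means every registered tool stays visible.
pub fn build_registry<'a>(all_tools: &[&'a str], allowed_tools: &[&str]) -> Vec<&'a str> {
    all_tools
        .iter()
        .copied()
        .filter(|tool| {
            allowed_tools.is_empty() || allowed_tools.iter().any(|g| glob_match(g, tool))
        })
        .collect()
}
```

Because pruning happens before tools/list is answered, a tool that fails the allowlist is simply absent from the returned vector; there is no later policy check to bypass.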

Additional layers stack on top:

outbound_allowlist.<channel>: [recipients] — even with whatsapp_send_message in the registry, the runtime rejects sends to unlisted recipients (defense in depth)
tool_rate_limits — per-tool rate limiting for side-effectful tools
Per-agent workspace and long-term memory (WHERE agent_id = ?) — data-level isolation

Consequences

Positive

Adversarial prompts can't invoke missing tools: the model never receives their names or schemas
Easy mental model: grep allowed_tools to see what an agent can do
Prompt tokens stay small (tool list scales with allowlist, not registry)

Negative

A misconfigured allowed_tools silently hides tools the LLM expected to use — the agent returns "I can't do that," puzzling both user and developer. Mitigation: agent status shows the effective tool set per agent
Dynamic granting mid-session is not supported (would require re-handshake with the MCP clients)

Config — agents.yaml (allowed_tools semantics)
Per-agent credentials — the gauntlet validates that the binding's channel instance is actually allowed

ADR 0005 — Drop-in `agents.d/` directory for private configs

Status: Accepted Date: 2026-02

Context

Two kinds of agent content coexist in the same project:

Public — the framework demo agents, ops helpers, templates
Private — sales prompts, tarifarios, internal phone numbers, compliance-flagged customer scripts

The obvious "one agents.yaml" approach forces everything to be either committed (leaking business content) or gitignored (losing the template reference). Neither is acceptable.

Decision

Split by path convention:

config/agents.yaml — committed, public-safe defaults
config/agents.d/*.yaml — gitignored drop-in directory
config/agents.d/*.example.yaml — committed templates
Merge happens at load time: every .yaml in agents.d/ gets its agents: array concatenated to the base list
Files load in lexicographic filename order, so 00-common.yaml + 10-prod.yaml composes predictably
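
The merge rule can be sketched without a YAML parser by modelling each file as a name plus the agent ids it declares. `merge_agents` is a hypothetical helper written for this example; the duplicate-id rejection mirrors the loader behaviour described under Consequences, but the error wording is invented.

```rust
use std::collections::HashMap;

/// Concatenate agent ids from the base file and the drop-in files.
/// Drop-ins are applied in lexicographic filename order; an id that
/// appears twice aborts the merge with an explicit error.
pub fn merge_agents(
    base: &[&str],
    drop_ins: &[(&str, Vec<&str>)],
) -> Result<Vec<String>, String> {
    // Sort a copy so the caller's directory listing order does not matter.
    let mut files: Vec<(&str, Vec<&str>)> = drop_ins.to_vec();
    files.sort_by(|a, b| a.0.cmp(b.0));

    // The base file always loads first.
    let mut sources = vec![("agents.yaml", base.to_vec())];
    sources.extend(files);

    let mut seen: HashMap<&str, &str> = HashMap::new(); // id -> defining file
    let mut merged = Vec::new();
    for (file, ids) in sources {
        for id in ids {
            if let Some(first) = seen.insert(id, file) {
                return Err(format!(
                    "duplicate agent id `{id}` in {file} (first defined in {first})"
                ));
            }
            merged.push(id.to_string());
        }
    }
    Ok(merged)
}
```

With a base of [kate] and drop-ins 10-prod.yaml and 00-common.yaml passed in any order, the result always lists kate first, then the 00-common.yaml agents, then the 10-prod.yaml agents.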

.gitignore includes:

config/agents.d/*.yaml
!config/agents.d/*.example.yaml

Consequences

Positive

Safe to open-source the repo; real business content stays private
Templates stay in git (ana.example.yaml) so newcomers can copy and fill
Per-environment layering falls out for free (00-dev.yaml vs 10-prod.yaml per deploy)

Negative

Agent-id collisions across files are possible — the loader rejects them at startup with an explicit error. Operators must coordinate file naming
Not every config is split this way — some operators expected plugins.d/, llm.d/, etc. We decided against the generalization until a concrete need appeared

Config — drop-in agents — full mechanics
Recipes — WhatsApp sales agent — shows the pattern in practice

ADR 0006 — Per-agent git repo for memory forensics

Status: Accepted Date: 2026-03

Context

An agent's memory evolves over time — dream sweeps promote memories, the agent writes USER.md / AGENTS.md / SOUL.md revisions, session closes append to MEMORY.md. When an agent misbehaves, "what did it know and when?" is a real debugging question.

Options considered:

Append-only audit log per write — possible, but rolls out a custom scheme for every file
DB-level revision history — works for LTM rows but not for workspace markdown files
Git — battle-tested, standard tooling, git log and git blame ship with every developer's laptop

Decision

When workspace_git.enabled: true, the agent's workspace directory is a per-agent git repository. The runtime commits at three specific moments:

Dream sweep finishes — commit subject promote, body lists promoted memories with scores
Session close — commit subject session-close, body includes session id and agent id
Explicit forge_memory_checkpoint(note) tool call — commit subject checkpoint: {note}

Commit mechanics:

Staged: every non-ignored file (respects auto-generated .gitignore that excludes transcripts/, media/, *.tmp)
Skipped: files larger than 1 MiB (MAX_COMMIT_FILE_BYTES)
Idempotent: no-op commit if tree clean
Author: {agent_id} <agent@localhost> (configurable)
No remote by default — operators add one if archival matters
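
The staging rules above reduce to a small predicate. The sketch below is a simplification: `should_stage` is a hypothetical helper, it only checks root-level transcripts/ and media/ prefixes rather than full .gitignore semantics, and it assumes a file of exactly 1 MiB is still staged.

```rust
pub const MAX_COMMIT_FILE_BYTES: u64 = 1024 * 1024; // 1 MiB

/// Decide whether a workspace file belongs in the next commit.
/// `rel_path` is relative to the workspace root and uses `/` separators.
pub fn should_stage(rel_path: &str, size_bytes: u64) -> bool {
    // Mirrors the auto-generated .gitignore entries named above.
    let ignored = rel_path.starts_with("transcripts/")
        || rel_path.starts_with("media/")
        || rel_path.ends_with(".tmp");
    // Files strictly larger than the cap are skipped.
    !ignored && size_bytes <= MAX_COMMIT_FILE_BYTES
}
```

A commit routine would filter the workspace listing through this predicate and make no commit when the resulting tree is unchanged.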

Consequences

Positive

git log gives you a timestamped history of every memory evolution, for free
memory_history tool lets the LLM reason about its own past state — e.g. "what did I believe about this user last week?"
git diff <oldest>..HEAD is one command away when debugging
Familiar tooling for humans (git bisect a misbehaving agent)

Negative

Repositories grow over time; operators should add a remote with periodic push-and-repack
Commits are process-scoped: an agent process crash between "write MEMORY.md" and "commit" leaves an uncommitted diff. The next commit picks it up, but by then the change is folded into that later commit and loses its own audit event
Transcripts are intentionally excluded from commits — they can be enormous and aren't the forensic artifact the ADR is aimed at

Soul — MEMORY.md + workspace-git
Agent runtime — Graceful shutdown (session-close commit runs here)

ADR 0007 — WhatsApp via whatsapp-rs (Signal Protocol)

Status: Accepted Date: 2026-02

Context

"Add WhatsApp support" has three common paths:

Official WhatsApp Business API — rate-limited, costs per message, requires business verification, limits proactive outreach to approved templates. Fine for some deployments, a bad fit for "run an agent on your personal number for a small business."
Unofficial web-scraping libraries (e.g. whatsapp-web.js) — pretend to be a browser, fragile against UI changes, frequently banned
Signal Protocol reimplementation — speak the native protocol that the WhatsApp mobile app speaks. Stable, fast, no scraping, permits all message types (voice, media, reactions, edits, etc.)

Decision

Use whatsapp-rs (Cristian's crate) which implements the Signal Protocol handshake + pairing + message layer in Rust. nexo-rs wraps it in crates/plugins/whatsapp:

Pairing: setup-time QR scan via Client::new_in_dir() — the wizard creates a per-agent session dir and renders the QR as Unicode blocks
Runtime: the plugin subscribes to inbound messages, forwards to plugin.inbound.whatsapp[.<instance>], handles the outbound side via the tool family (whatsapp_send_message, whatsapp_send_reply, whatsapp_send_reaction, whatsapp_send_media)
Credentials expiry: the plugin does not fall back to a runtime QR on 401 — the operator must re-pair via the wizard. The runtime refuses to boot without valid creds. This is a deliberate safety net against silent re-pair loops that would cross-deliver to the wrong account
Multi-account: each agent points at its own session dir. No XDG_DATA_HOME mutation

Consequences

Positive

Full feature coverage (voice, media, reactions, edits, groups)
No per-message cost beyond the bandwidth
No business-verification paperwork
Works on a personal number, a secondary SIM, anything you can pair to WhatsApp's Linked Devices

Negative

Signal Protocol parity is non-trivial; keeping up with WhatsApp protocol evolution is an ongoing commitment of whatsapp-rs
Running an agent on a personal number is a policy choice. WhatsApp's Terms of Service don't love automated accounts; use whatsapp-rs on numbers you own and are ready to re-pair if they get banned
Multi-account needs careful session-dir management — see Plugins — WhatsApp gotchas

Forbidden alternatives

Puppeteer / whatsapp-web.js / selenium — pulls the entire Chromium runtime into the process, breaks constantly, and is detected and banned faster than the Signal Protocol path
Business API — only if the deployment pays for it and the agent flow survives template constraints; ship a separate plugin if this comes up

../whatsapp-rs/ sibling crate (Signal Protocol + pairing + Client)
Plugins — WhatsApp
Recipes — WhatsApp sales agent

ADR 0008 — MCP dual role: client and server

Status: Accepted Date: 2026-03

Context

Model Context Protocol is becoming the de facto integration surface for LLM-driven tools. Two questions arose during the Phase 12 design:

Should the agent be an MCP client (consume external MCP servers as tools)?
Should the agent be an MCP server (expose its own tools to external MCP clients like Claude Desktop, Cursor, Zed)?

These are independent decisions. Picking one does not force the other.

Decision

Do both. Same process, same ToolRegistry, different transports.

Client — McpRuntimeManager spawns stdio or HTTP MCP servers per session (with a shared "sentinel session" for servers that don't need per-session isolation). Their tools register into the per-session ToolRegistry with names like {server_name}_{tool_name} and are callable by the agent like any built-in
Server — agent mcp-server subcommand reads JSON-RPC from stdin and writes responses to stdout. An mcp_server.yaml allowlist controls which tools are exposed. Configurable auth_token_env guards the initialize call when the server is exposed through a tunnel

Both sides speak MCP 2024-11-05 (streamable HTTP) with SSE fallback for legacy servers.
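The client-side naming rule, {server_name}_{tool_name}, can be sketched as follows. A BTreeMap stands in for the per-session ToolRegistry, and the function names, the "github" server, and the "search_issues" tool are hypothetical. The real ToolRegistry API is documented in rustdoc and may look different.

```rust
use std::collections::BTreeMap;

/// Qualify an MCP tool name with its server name: `{server_name}_{tool_name}`.
fn qualified_tool_name(server_name: &str, tool_name: &str) -> String {
    format!("{server_name}_{tool_name}")
}

/// Register every tool of one MCP server into a stand-in registry
/// (qualified name -> description). The real registry type is an assumption here.
fn register_server_tools(
    registry: &mut BTreeMap<String, String>,
    server_name: &str,
    tools: &[(&str, &str)],
) {
    for (tool, description) in tools {
        registry.insert(qualified_tool_name(server_name, tool), description.to_string());
    }
}

fn main() {
    let mut registry = BTreeMap::new();
    // "github" and "search_issues" are made-up names for the example.
    register_server_tools(&mut registry, "github", &[("search_issues", "Search issues")]);
    for name in registry.keys() {
        println!("{name}");
    }
}
```

Prefixing with the server name keeps tools from different MCP servers from colliding inside one registry.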

Consequences

Positive

Being a client: any MCP-speaking tool ecosystem is reachable without writing a custom extension
Being a server: the agent's tools + memory become available inside Claude Desktop / Cursor / Zed — cross-session memory, remote actions, etc.
Interop with the broader MCP catalog is a configuration change, not a code change

Negative

Two independent code paths to keep current as the MCP spec evolves
expose_proxies configuration gotcha: enabling it on the server side makes every upstream MCP server transitively visible to the consuming client. Default is false and the docs call this out explicitly
MCP spec churn (2024-11-05 vs future versions) requires sustained maintenance effort
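To make the expose_proxies gotcha concrete, here is a rough model of the server-side visibility decision. It assumes that the mcp_server.yaml allowlist filters the agent's own tools and that expose_proxies alone decides whether proxied upstream tools are visible. The Tool struct, the exposed_tools function, and that exact interaction are illustrative assumptions, not the actual implementation.

```rust
/// A tool as seen by the MCP server side. `proxied` marks tools that came
/// from an upstream MCP server rather than from the agent itself.
/// Hypothetical type for illustration only.
#[derive(Debug, Clone, PartialEq)]
struct Tool {
    name: String,
    proxied: bool,
}

/// Names visible to an external MCP client.
/// Assumption: own tools must be allowlisted; proxied tools are all-or-nothing,
/// controlled by `expose_proxies`.
fn exposed_tools<'a>(tools: &'a [Tool], allowlist: &[&str], expose_proxies: bool) -> Vec<&'a str> {
    tools
        .iter()
        .filter(|t| {
            if t.proxied {
                expose_proxies
            } else {
                allowlist.contains(&t.name.as_str())
            }
        })
        .map(|t| t.name.as_str())
        .collect()
}

fn main() {
    let tools = vec![
        Tool { name: "memory".into(), proxied: false },
        Tool { name: "delegate".into(), proxied: false },
        // Made-up upstream tool, named per the {server_name}_{tool_name} rule.
        Tool { name: "github_search_issues".into(), proxied: true },
    ];
    println!("{:?}", exposed_tools(&tools, &["memory"], false));
    println!("{:?}", exposed_tools(&tools, &["memory"], true));
}
```

Under this model, flipping expose_proxies to true is the only change needed for an upstream tool to reach the consuming client, which is why the default is false.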

ADR 0009 — Dual MIT / Apache-2.0 licensing

Status: Accepted Date: 2026-04

Context

Open-sourcing nexo-rs required picking a license. Constraints:

The Rust ecosystem convention (rustc, tokio, serde, clap, axum…) is dual MIT / Apache-2.0
Downstream projects should be able to pick whichever license fits their own project's obligations
Attribution to the original author must be legally enforceable — the author explicitly asked that users "use it, just name me"
The author doesn't want to ship a custom / restrictive license that confuses or scares off contributors

Alternatives considered:

MIT alone — fine, but missing the explicit patent grant that Apache-2 gives (relevant to corporate downstream users)
Apache-2 alone — fine, but incompatible with GPLv2 downstream (MIT is compatible)
AGPL-3 — forces source-release on SaaS; nexo-rs isn't trying to prevent cloud forks
BSL (Business Source License) — source-available with time-delayed open-source conversion; inappropriate for a framework whose value is in wide adoption
Custom "use it, name me" — would need a lawyer for every edge case; a solved problem doesn't need a new solution

Decision

Dual-license under MIT OR Apache-2.0:

LICENSE-MIT — full text of the MIT License, copyright 2026 Cristian García
LICENSE-APACHE — full text of the Apache-2.0 License
Cargo.toml: license = "MIT OR Apache-2.0" (SPDX)
NOTICE file at repo root (required to be preserved by Apache-2.0 §4(d)) carries the attribution — author, contact, original repo URL
README links all three + explains the SPDX choice

Downstream users pick whichever they prefer. Attribution is mandatory under both.

Consequences

Positive

Fits existing Rust ecosystem tooling (crates.io, rustdoc headers, CI scanners)
Maximum compatibility: GPLv2 projects pick MIT, patent-sensitive corporate projects pick Apache-2
NOTICE file gives the author the strongest attribution lever available in permissive OSS: removing it is a license violation

Negative

Contributors who want to submit PRs agree (per Apache-2 §5) that their contributions are dual-licensed under the same terms. Some contributors may require a CLA discussion; none so far
Trademark on the name "nexo-rs" is not covered — this ADR is about the code, not the brand. If the brand becomes load-bearing, register a trademark separately

License — human-facing version of this decision
NOTICE — enforceable attribution block

Contributing

PRs welcome. A few ground rules keep the codebase coherent.

Workflow

All feature work follows the /forge pipeline:

/forge brainstorm <topic>  →  /forge spec <topic>  →  /forge plan <topic>  →  /forge ejecutar <topic>

Per-sub-phase done criteria live in PHASES.md.

Rules of the road

All code, code comments, and Markdown docs in English.
No hardcoded secrets. Use ${ENV_VAR} or ${file:...} in YAML.
Every external call goes through CircuitBreaker. No exceptions.
Don't commit anything under secrets/.
Don't skip hooks (--no-verify). Fix the underlying lint / test issue instead.
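For readers unfamiliar with the pattern behind the CircuitBreaker rule, the sketch below shows the general idea: count consecutive failures, stop calling the dependency once a threshold is hit, and allow a trial call after a cooldown. It is a generic, self-contained illustration. The struct shown here is not the nexo-resilience CircuitBreaker, and its fields and method signatures are assumptions; check rustdoc for the real API.

```rust
use std::time::{Duration, Instant};

/// Error returned by the sketch breaker: either the circuit is open
/// (call was not attempted) or the wrapped operation itself failed.
#[derive(Debug, PartialEq)]
enum BreakerError<E> {
    Open,
    Inner(E),
}

/// Illustrative breaker; NOT the nexo-resilience type.
struct CircuitBreaker {
    failure_threshold: u32,
    cooldown: Duration,
    consecutive_failures: u32,
    opened_at: Option<Instant>,
}

impl CircuitBreaker {
    fn new(failure_threshold: u32, cooldown: Duration) -> Self {
        Self { failure_threshold, cooldown, consecutive_failures: 0, opened_at: None }
    }

    /// Run `op` through the breaker.
    fn call<T, E>(&mut self, op: impl FnOnce() -> Result<T, E>) -> Result<T, BreakerError<E>> {
        if let Some(opened) = self.opened_at {
            if opened.elapsed() < self.cooldown {
                // Still cooling down: fail fast without touching the dependency.
                return Err(BreakerError::Open);
            }
            // Cooldown elapsed: fall through and allow one trial call.
        }
        match op() {
            Ok(value) => {
                // Any success closes the circuit and resets the counter.
                self.consecutive_failures = 0;
                self.opened_at = None;
                Ok(value)
            }
            Err(err) => {
                self.consecutive_failures += 1;
                if self.consecutive_failures >= self.failure_threshold {
                    // Open (or re-open) the circuit and restart the cooldown.
                    self.opened_at = Some(Instant::now());
                }
                Err(BreakerError::Inner(err))
            }
        }
    }
}

fn main() {
    let mut breaker = CircuitBreaker::new(2, Duration::from_secs(30));
    let flaky = || -> Result<(), &'static str> { Err("upstream timeout") };
    println!("{:?}", breaker.call(flaky)); // first failure
    println!("{:?}", breaker.call(flaky)); // second failure opens the circuit
    println!("{:?}", breaker.call(flaky)); // rejected while open
}
```

The point of routing every external call through a breaker is the third call above: once a dependency is known to be failing, callers get an immediate error instead of piling up timeouts.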

Docs must follow

Any change that touches user-visible behavior — features, config fields, CLI flags, tool surfaces, retry policies — must update the mdBook under docs/ in the same commit. Docs phase plan: docs/PHASES.md. All mdBook pages must be written in English.

Pure-internal changes (private renames, refactors, test-only) are exempt — mention that explicitly in the commit body.

Local checks

cargo fmt --all
cargo clippy --workspace --all-targets -- -D warnings
cargo test --workspace
./scripts/check_mdbook_english.sh
./scripts/check_markdown_english.sh
mdbook build docs

CI runs all of the above on every push and every PR.

Git pre-commit hook

The repo ships a pre-commit hook at .githooks/pre-commit that runs four checks:

Docs-sync gate — rejects the commit if production files under crates/, src/, config/, extensions/, scripts/, .github/, or Cargo.{toml,lock} are staged without anything under docs/.
cargo fmt --all -- --check
cargo clippy --workspace -- -D warnings
cargo test --workspace --quiet

Enable it once per clone:

git config core.hooksPath .githooks

(./scripts/bootstrap.sh does this for you.)

Bypass tags

The docs-sync gate honors a single opt-out tag. Include it in the commit message when the change is genuinely internal and doesn't need docs:

refactor: rename private fn [no-docs]

Acceptable reasons:

Private refactor, no change to any public API
Test-only changes
Dependency bumps with no behavior change
CI configuration tweaks that don't change operational behavior

Do not use [no-docs] for anything a user would notice. If in doubt, update the docs — it's the lower-regret path.
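The docs-sync rule and its [no-docs] opt-out amount to a small predicate over the staged paths and the commit message. The sketch below restates that rule in Rust for clarity. The real gate lives in the .githooks/pre-commit script, so the function names here are hypothetical, and treating Cargo.{toml,lock} as the root-level files only is an assumption.

```rust
/// Directory prefixes the gate treats as production code (from the list above).
const PRODUCTION_PREFIXES: [&str; 6] =
    ["crates/", "src/", "config/", "extensions/", "scripts/", ".github/"];

/// Assumption: only the root-level Cargo manifests count.
const PRODUCTION_FILES: [&str; 2] = ["Cargo.toml", "Cargo.lock"];

fn is_production(path: &str) -> bool {
    PRODUCTION_PREFIXES.iter().any(|prefix| path.starts_with(prefix))
        || PRODUCTION_FILES.contains(&path)
}

/// Returns true when the commit passes the docs-sync gate.
/// Hypothetical restatement of the hook's rule, not the hook itself.
fn docs_sync_ok(staged: &[&str], commit_message: &str) -> bool {
    // The single opt-out tag short-circuits the check.
    if commit_message.contains("[no-docs]") {
        return true;
    }
    let touches_production = staged.iter().any(|path| is_production(path));
    let touches_docs = staged.iter().any(|path| path.starts_with("docs/"));
    // Production changes are only accepted together with a docs change.
    !touches_production || touches_docs
}

fn main() {
    let staged = ["crates/core/src/lib.rs"];
    println!("{}", docs_sync_ok(&staged, "feat: add retry flag"));
    println!("{}", docs_sync_ok(&staged, "refactor: rename private fn [no-docs]"));
}
```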

Full escape hatch

git commit --no-verify disables all hooks (fmt, clippy, tests, docs-sync). Last resort, not a habit.

Reporting issues

Open a GitHub issue with:

nexo-rs version / commit hash
Rust version (rustc -V)
OS / arch
Relevant log lines (redact secrets)
Minimal reproduction

License of contributions

Contributions are dual-licensed MIT OR Apache-2.0 as described in License.

Releases

Two complementary tools own the release pipeline:

Tool	Owns
`release-plz`	version bumps, git tags, `crates.io` publish, per-crate `CHANGELOG.md`
`cargo-dist`	cross-target binary tarballs, `curl | sh` / PowerShell installers, sha256 sidecars

They run on the same tag (nexo-rs-v<version>) and stay independent — no overlapping config. Phase 27 brings both online; Phase 27.2 wires the GitHub Actions workflow that combines them on tag push.

What ships

The nexo binary is the only artifact in release tarballs. Every other binary in the workspace (driver subsystem, dispatch tools, companion-tui, mock MCP server) carries [package.metadata.dist] dist = false so cargo-dist excludes it. Dev / smoke programs (browser-test, integration-browser-check, llm_smoke) live as [[example]] entries under examples/ for the same reason.

Build provenance — `nexo version`

build.rs injects four stamps captured at compile time:

NEXO_BUILD_GIT_SHA — short git SHA of the build commit (or unknown outside a git checkout)
NEXO_BUILD_TARGET_TRIPLE — full Rust target triple
NEXO_BUILD_CHANNEL — opaque channel marker; defaults to source. The release workflow overrides via NEXO_BUILD_CHANNEL=apt-musl (etc.) so support tickets carry install-channel provenance.
NEXO_BUILD_TIMESTAMP — UTC ISO8601 timestamp of the build

Operators see them with:

nexo version
# nexo 0.1.1
#   git-sha:   abc1234
#   target:    x86_64-unknown-linux-musl
#   channel:   apt-musl
#   built-at:  2026-04-27T12:34:56Z

nexo --version (without --verbose or the subcommand) prints the short form nexo <version>.
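A build script can produce stamps like these by printing cargo:rustc-env directives, which Cargo turns into compile-time environment variables readable with env!. The following is a minimal sketch of that mechanism, not the project's build.rs: the helper names are hypothetical, the use of git rev-parse --short HEAD is an assumption, and the timestamp stamp is left out because formatting ISO8601 needs extra date logic.

```rust
use std::process::Command;

/// Short git SHA of HEAD, or "unknown" when git or the checkout is unavailable.
fn short_git_sha() -> String {
    Command::new("git")
        .args(["rev-parse", "--short", "HEAD"])
        .output()
        .ok()
        .filter(|out| out.status.success())
        .and_then(|out| String::from_utf8(out.stdout).ok())
        .map(|sha| sha.trim().to_string())
        .filter(|sha| !sha.is_empty())
        .unwrap_or_else(|| "unknown".to_string())
}

/// Channel marker: the override when set and non-empty, otherwise "source".
fn build_channel(override_value: Option<String>) -> String {
    override_value
        .filter(|value| !value.is_empty())
        .unwrap_or_else(|| "source".to_string())
}

/// One `cargo:rustc-env` directive for the build script's stdout.
fn rustc_env(key: &str, value: &str) -> String {
    format!("cargo:rustc-env={key}={value}")
}

fn main() {
    // Cargo sets TARGET for build scripts; fall back when run outside Cargo.
    let target = std::env::var("TARGET").unwrap_or_else(|_| "unknown".to_string());
    let channel = build_channel(std::env::var("NEXO_BUILD_CHANNEL").ok());

    println!("{}", rustc_env("NEXO_BUILD_GIT_SHA", &short_git_sha()));
    println!("{}", rustc_env("NEXO_BUILD_TARGET_TRIPLE", &target));
    println!("{}", rustc_env("NEXO_BUILD_CHANNEL", &channel));
    // Re-run the script when the channel override changes.
    println!("cargo:rerun-if-env-changed=NEXO_BUILD_CHANNEL");
}
```

The binary would then read a stamp with something like env!("NEXO_BUILD_GIT_SHA") when printing nexo version.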

Local validation

make dist-check

This builds the host-target tarball via dist build --target $(rustc -vV | sed -n 's|host: ||p') and runs scripts/release-check.sh. The smoke gate verifies that every tarball present contains the nexo binary, LICENSE-*, and README.md, and that the host-native --version output matches the workspace version. Targets the local toolchain can't satisfy emit [release-check] WARN lines instead of failing.
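The per-archive content rule can be stated as a small check over the list of entries in a tarball. The actual gate is the shell script scripts/release-check.sh. The Rust below only restates its rule, with hypothetical function names and the assumption that one LICENSE-* file is enough to satisfy the license requirement.

```rust
/// Last path component of an archive entry, e.g. "nexo-x/README.md" -> "README.md".
fn file_name(entry: &str) -> &str {
    entry.rsplit('/').next().unwrap_or(entry)
}

/// Required items that are absent from the archive listing.
/// Hypothetical restatement of the smoke gate's content rule.
fn missing_entries(entries: &[&str], bin_name: &str) -> Vec<String> {
    let names: Vec<&str> = entries.iter().map(|entry| file_name(entry)).collect();
    let mut missing = Vec::new();
    if !names.contains(&bin_name) {
        missing.push(bin_name.to_string());
    }
    // Assumption: at least one LICENSE-* file satisfies the rule.
    if !names.iter().any(|name| name.starts_with("LICENSE-")) {
        missing.push("LICENSE-*".to_string());
    }
    if !names.contains(&"README.md") {
        missing.push("README.md".to_string());
    }
    missing
}

fn main() {
    // Made-up listing; real tarball layouts may differ.
    let listing = ["nexo-x/nexo", "nexo-x/LICENSE-MIT", "nexo-x/LICENSE-APACHE"];
    println!("{:?}", missing_entries(&listing, "nexo"));
}
```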

Full setup notes (cargo-dist, cargo-zigbuild, zig, rustup targets): packaging/README.md.

What's automatic vs manual

Step	Owner
Bump version + open release PR	`release-plz` (CI on push to main)
Tag commit + `crates.io` publish	`release-plz` (on PR merge)
Build 2 musl tarballs (x86_64 + aarch64)	`release.yml` (Phase 27.2 ✅) — cargo-dist
Build Termux `.deb` (aarch64-linux-android)	`release.yml` (Phase 27.2 ✅) — `packaging/termux/build.sh`
Build Debian `.deb` (amd64 + arm64)	`release.yml` (Phase 27.4 ✅) — `packaging/debian/build.sh`
Build RPM (x86_64 + aarch64)	`release.yml` (Phase 27.4 ✅) — `packaging/rpm/build.sh`
Install-test `.deb` on Debian 12 / Ubuntu 22.04 / 24.04	`release.yml` (Phase 27.4 ✅) — docker matrix
Install-test `.rpm` on Fedora 40 / Rocky 9	`release.yml` (Phase 27.4 ✅) — docker matrix
Upload all tarballs + debs + rpms + sha256 sidecars	`release.yml` (Phase 27.2 ✅)
Smoke-test `nexo --version` + provenance stamps	`release.yml` (Phase 27.2 ✅)
Sign every asset (cosign keyless)	`sign-artifacts.yml` (Phase 27.3 ✅)
Generate CycloneDX + SPDX SBOMs	`sbom.yml` (Phase 27.9 🔄)
Apt repo publish + signed `Release` file	Phase 27.4.b deferred
Yum / dnf repo publish + `RPM-GPG-KEY-nexo`	Phase 27.4.b deferred
Termux pkg index	Phase 27.8 deferred
Homebrew bottle auto-PR	Phase 27.6 PARKED (Apple targets dropped)
`nexo self-update`	Phase 27.10 deferred

Adding a new bin to the release

Declare the [[bin]] in the appropriate crate's Cargo.toml.
If the crate that hosts the bin carries [package.metadata.dist] dist = false, either remove that opt-out or move the bin to a new crate that doesn't carry it.
Re-run make dist-check and confirm the new bin shows up under [bin] in the dist plan output.
Update scripts/release-check.sh's per-archive content check if the new bin should be required.

Adding a new target

Append the target triple to targets = […] in dist-workspace.toml.
Append the matching tarball name to EXPECTED_TARBALLS in the smoke gate.
Add the toolchain setup for the new target to the GitHub Actions release workflow (Phase 27.2); without that, the target builds locally only.

License

nexo-rs is dual-licensed under either:

MIT — see LICENSE-MIT
Apache License, Version 2.0 — see LICENSE-APACHE

at your option. SPDX: MIT OR Apache-2.0.

Attribution is required

Redistributions — source, binary, modified, or unmodified — must preserve the NOTICE file and the copyright attribution, as required by Section 4(d) of the Apache License.

Nexo-rs
Copyright 2026 Cristian García <informacion@cristiangarcia.co>

This product includes software developed by Cristian García.
Original project: https://github.com/lordmacu/nexo-rs

Why dual-licensed

Dual MIT / Apache-2.0 is the Rust ecosystem convention (rustc, tokio, serde, clap, etc.). It maximizes downstream compatibility:

MIT is compatible with GPLv2 (Apache-2.0 is not)
Apache-2.0 grants explicit patent rights (MIT does not)

Users pick whichever fits their project.

Contributions

Unless you explicitly state otherwise, any contribution intentionally submitted for inclusion in nexo-rs by you shall be dual-licensed as above, without any additional terms or conditions — per Section 5 of the Apache License.

API reference (rustdoc)

Every public type, trait, function, and module in the nexo-rs workspace is documented via cargo doc. The CI workflow runs cargo doc --workspace --no-deps and publishes the output under /api/ on the same GitHub Pages deployment as this book.

Open the rustdoc

Published site: https://lordmacu.github.io/nexo-rs/api/
Local build: cargo doc --workspace --no-deps --open

What's there

One rustdoc page per workspace crate:

Crate	Contents
`agent`	Top-level binary — mostly wiring; see `src/main.rs`.
`nexo-core`	`Agent` trait, `AgentRuntime`, `SessionManager`, `ToolRegistry`, `HookRegistry`, agent-facing tools (`memory`, `taskflow`, `self_report`, `delegate`, `workspace_git`).
`nexo-broker`	`Broker` trait (`NatsBroker`, `LocalBroker`), disk queue, DLQ.
`nexo-llm`	`LlmClient` trait, MiniMax / Anthropic / OpenAI-compat / Gemini clients, retry + rate limiter.
`nexo-memory`	Short-term / long-term / vector types, `LongTermMemory` API.
`nexo-config`	YAML struct types, env/file placeholder resolution.
`nexo-extensions`	`ExtensionManifest`, `ExtensionDiscovery`, `StdioRuntime`, CLI.
`nexo-mcp`	MCP client + server primitives.
`nexo-taskflow`	`Flow`, `FlowStore`, `FlowManager`, `WaitEngine`.
`nexo-resilience`	`CircuitBreaker`.
`nexo-setup`	Wizard field registry, YAML patcher.
`nexo-tunnel`	Cloudflared tunnel helper.
`nexo-auth`	Per-agent credential gauntlet, resolver, audit.
`nexo-plugin-*`	Channel plugins (browser, whatsapp, telegram, email, google, gmail-poller).

When to read rustdoc vs the book

Goal	Start here
Understand a subsystem's purpose	this book
Read a specific trait's methods / signatures	rustdoc
Wire two subsystems together	book → rustdoc
Embed a crate in your own binary	rustdoc
Audit what's public API	rustdoc (anything not in rustdoc is internal)

Building locally

# All crates, no dependencies:
cargo doc --workspace --no-deps

# Open the nexo-core rustdoc in a browser:
cargo doc -p nexo-core --no-deps --open

Warnings are rejected in CI (RUSTDOCFLAGS=-D warnings). Run the same locally before pushing if you edited doc comments:

RUSTDOCFLAGS="-D warnings" cargo doc --workspace --no-deps

Public-API stability

The workspace has not committed to semver-level stability yet. Public signatures change between code phases; follow PHASES.md and commit history when upgrading.

Cross-links

Contributing — how to add /// docs when you touch public surface
Architecture overview — the mental model; rustdoc supplies the fine detail