The autonomous agent — capabilities overview
This page is the bird's-eye map of what an agent running on
nexo can actually do without you holding its hand. Every
sub-feature has its own page (linked at the end of each
section); this page exists so you can see the whole picture
without piecing it together from individual reference docs.
"Autonomous" here doesn't mean "AGI". It means: the agent runs in the background, decides when to act on its own schedule, remembers what it has learned, talks to the user through every channel the operator wired (Slack, Telegram, iMessage, email, WhatsApp), approves or escalates risky actions through curated gates, and survives daemon restarts without losing context.
The agent never executes anything the operator didn't authorise in YAML. Every autonomous behaviour is a knob the operator flips on with explicit consent — there are no implicit defaults that ship a user from "ran nexo for the first time" to "the agent is texting my boss".
1. Living in the background
The agent doesn't need a foreground TTY to run.
- Session kinds: every running goal carries a `SessionKind` enum: `Interactive` (default, attached to a terminal), `Bg` (detached background goal, `nexo agent run --bg <prompt>`), `Daemon` (a long-running goal supervised by the daemon process itself), or `DaemonWorker` (a child of a daemon).
- `nexo agent run --bg "<prompt>"` spawns a goal, returns the `goal_id` immediately, and detaches. The agent keeps running even after you close the terminal.
- `nexo agent ps` lists running goals filtered by kind; `--all` includes `Interactive`. It reads SQLite read-only, so it works without a daemon up.
- `nexo agent attach <goal_id>` renders a markdown snapshot of any goal: kind, status, phase, started_at, finished_at, diff_stat, last decision, last event. Useful to check progress without interrupting.
- `nexo agent discover` lists Running goals filtered to detached / daemon kinds. Pass `--include-interactive` to broaden.
- Reattach on restart: boot flips prior-run `Running` rows to `LostOnRestart` and fires `notify_origin` once per goal so the originating chat sees a clean `[abandoned]` closure instead of silence.
- Drain on SIGTERM: `drain_running_goals` runs BEFORE plugin teardown so the `[shutdown]` `notify_origin` actually leaves the channel before the daemon dies. A per-hook 2 s timeout prevents stuck publishers from hanging shutdown.
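
The reattach-on-restart behaviour can be pictured with a small in-memory sketch. Everything except the `Running` and `LostOnRestart` names is an assumption made for illustration: the real rows live in SQLite, and `GoalRow`, `GoalStatus::Completed`, `reconcile_on_boot`, and the `notify` callback are hypothetical stand-ins, not nexo's types.

```rust
/// Hypothetical status enum; only `Running` and `LostOnRestart` are named in the docs.
#[derive(Debug, Clone, PartialEq)]
pub enum GoalStatus {
    Running,
    Completed,
    LostOnRestart,
}

/// Hypothetical in-memory stand-in for a row of the goal registry.
#[derive(Debug, Clone)]
pub struct GoalRow {
    pub goal_id: String,
    pub status: GoalStatus,
}

/// Flip every prior-run `Running` row to `LostOnRestart` and call `notify`
/// exactly once per flipped goal. Returns how many rows were flipped.
pub fn reconcile_on_boot<F: FnMut(&str)>(rows: &mut [GoalRow], mut notify: F) -> usize {
    let mut flipped = 0;
    for row in rows.iter_mut() {
        if row.status == GoalStatus::Running {
            row.status = GoalStatus::LostOnRestart;
            notify(&row.goal_id); // stands in for the `[abandoned]` notify_origin message
            flipped += 1;
        }
    }
    flipped
}
```

Because the status changes before the next boot sees the row, running the pass twice notifies nobody the second time, which is the "once per goal" property described above.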
→ See Background agents (agent run --bg / ps / attach)
2. Memory + self-improvement
The agent learns. Three tiers, each with a different cost / durability trade-off.
- Short-term memory — per-session, in RAM, scoped to the current goal. Cheap; gone on goal completion.
- Long-term memory: SQLite + sqlite-vec embeddings (`crates/memory/src/long_term.rs`). Survives restarts; searchable by semantic + lexical query.
- Git-backed `MEMORY.md`: every memory promotion writes a markdown file and commits it to a per-agent git repo. Full history; the operator can `git log MEMORY.md` to audit what the agent decided to remember.
Three self-improvement loops the agent runs without operator intervention:
- Light-pass dreaming — scoring-based consolidation runs every N turns. Cheap, no LLM call, just promotes warm memories via decay × access × recency.
- Deep-pass autoDream (Phase 80.1): heavier consolidation via a forked sub-agent with its own 4-phase prompt, runs behind 7 gates: kairos active, time-since-last (default 24 h), session count ≥ 5 transcripts, scan throttle (10 min), live consolidation lock (PID + mtime), force bypass, post-fork escape audit. Deferred for fork (`deferred_for_fork: true`) when another process holds the lock, so promotions land on the next turn rather than racing.
- `extract_memories` (Phase 77.5): post-turn LLM-driven extraction. After each turn, a small LLM call asks "what surprised you, what did you learn, what should we remember?" and writes structured memory rows.
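
To make the light-pass `decay × access × recency` product concrete, here is a minimal sketch. The three factor definitions (exponential half-life decay, log-scaled access count, hyperbolic recency) and every name in the block are assumptions chosen for illustration; the docs only state that the score is a product of these three terms and that no LLM call is involved.

```rust
/// Hypothetical per-memory statistics; field names are illustrative.
pub struct MemoryStats {
    pub access_count: u32,
    pub age_secs: f64,  // time since the memory was created
    pub idle_secs: f64, // time since the memory was last accessed
}

/// Score = decay * access * recency (each factor's formula is an assumption).
pub fn warmth(m: &MemoryStats, half_life_secs: f64, recency_window_secs: f64) -> f64 {
    // Halves every `half_life_secs` of age.
    let decay = 0.5_f64.powf(m.age_secs / half_life_secs);
    // Grows slowly with use; a never-accessed memory scores 0.
    let access = (1.0 + m.access_count as f64).ln();
    // 1.0 when just touched, falling towards 0 as idle time grows.
    let recency = 1.0 / (1.0 + m.idle_secs / recency_window_secs);
    decay * access * recency
}

/// Promote a memory when its score reaches a caller-supplied threshold.
pub fn should_promote(m: &MemoryStats, half_life_secs: f64, recency_window_secs: f64, threshold: f64) -> bool {
    warmth(m, half_life_secs, recency_window_secs) >= threshold
}
```

The point of the sketch is only that the whole pass is plain arithmetic over stored counters, which is why it can run every N turns cheaply.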
Defenses:
- Secret scanner (Phase 77.7) — regex set blocks Anthropic / OpenAI / GitHub / AWS / Stripe / Google / JWT key shapes before any memory commit. Fails the commit loud.
- `AutoMemFilter` (Phase 80.20): when a forked sub-agent writes memory, the `can_use_tool` whitelist locks `FileEdit` / `FileWrite` to paths under `memory_dir`, `Bash` to a read-only classifier (Phase 77.8/77.9 destructive + sed-in-place defenses still apply), and leaves `REPL` unrestricted. Defense-in-depth.
- Memdir relevance scorer (Phase 77.6): `relevance × recency × access` ranking with age decay so old / unused memories don't inflate the working-memory cost.
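
The secret scanner is described as a regex set. As a dependency-free approximation, the sketch below checks well-known public key prefixes instead of full regexes. The prefix table, the 20-character minimum, and the function names `find_secret` / `commit_memory` are assumptions for illustration, not the patterns shipped in `crates/memory/src/secret_guard.rs`.

```rust
/// (label, prefix) pairs for common key shapes. Order matters: the more
/// specific `sk-ant-` must be checked before the generic `sk-`.
const KEY_PREFIXES: &[(&str, &str)] = &[
    ("anthropic", "sk-ant-"),
    ("openai", "sk-"),
    ("github", "ghp_"),
    ("aws", "AKIA"),
    ("stripe", "sk_live_"),
    ("google", "AIza"),
];

/// A JWT is three non-empty base64url segments; the header starts with `eyJ`.
fn looks_like_jwt(token: &str) -> bool {
    let parts: Vec<&str> = token.split('.').collect();
    parts.len() == 3
        && parts[0].starts_with("eyJ")
        && parts.iter().all(|p| {
            !p.is_empty() && p.chars().all(|c| c.is_ascii_alphanumeric() || c == '-' || c == '_')
        })
}

/// Return the label of the first secret-shaped token, if any.
pub fn find_secret(text: &str) -> Option<&'static str> {
    let is_separator = |c: char| c.is_whitespace() || c == '"' || c == '\'' || c == '=';
    for token in text.split(is_separator) {
        if token.len() < 20 {
            continue; // too short to be a real key; keeps false positives down
        }
        for (label, prefix) in KEY_PREFIXES {
            if token.starts_with(*prefix) {
                return Some(*label);
            }
        }
        if looks_like_jwt(token) {
            return Some("jwt");
        }
    }
    None
}

/// Fail the memory commit loudly when a secret-shaped token is present.
pub fn commit_memory(text: &str) -> Result<(), String> {
    match find_secret(text) {
        Some(kind) => Err(format!("refusing to commit memory: {kind}-shaped secret detected")),
        None => Ok(()),
    }
}
```

The key design point carried over from the text is that the check runs before the commit and returns an error rather than silently redacting.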
→ See Dreaming, Memdir scanner
3. Self-driving execution loop
When the agent receives work, what runs the loop?
- Driver-loop (Phase 67): replaces a single LLM request/response with a multi-turn execution: `read context → plan → propose tool calls → run permission gate → execute → inspect results → loop`. Goal-scoped, with budget caps on turns + time + tokens. Persists to `agent_handles` SQLite so every turn survives a daemon restart.
- Acceptance autodetect (Phase 75): at goal completion the loop runs an autodetect pass: `cargo build` for Rust, `pyproject.toml` build for Python, `npm test` for Node, `cmake --build` for CMake, `cargo test --no-run` for cargo. Mismatch fails the goal: the agent doesn't claim "done" on a broken build.
- Plan mode (Phase 79.1): the `EnterPlanMode` toggle puts the agent into a read-only mode where it can only call read tools + planning advisors (no `Bash`, no `Write`). `ExitPlanMode` resolves the plan with operator approval and re-enters the full surface.
- `Sleep { duration_ms, reason }` tool (Phase 77.20): the agent can decide "no work to do for now, wake me in 20 min" without holding a shell process. The runtime intercepts the sentinel result, pauses the goal, and schedules a wake-up with cache-aware timing (≤ 270 s keeps prompt cache warm, ≥ 1200 s amortises a cache miss; avoids the 270-1200 s window that pays the miss without benefit).
- Forked sub-agent infra (Phase 80.19): `delegation_tool` with `mode: { Sync | ForkAndForget }`. Cache-safe parameters (`system_prompt` + `user_context` + `system_context` + `tool_use_context` + `fork_context_messages`; all five must match the parent for a cache hit). `skipTranscript: true` keeps the fork's messages out of the parent's history.
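
The cache-aware timing rule for `Sleep` can be sketched as a small pure function. The two thresholds (270 s and 1200 s) come from the description above; the policy of snapping a request inside the window to the nearer edge, and the function and constant names, are assumptions, since the docs only say the window is avoided.

```rust
/// Sleeps up to this long keep the prompt cache warm.
pub const CACHE_WARM_MAX_SECS: u64 = 270;
/// Sleeps at least this long amortise the cost of a cache miss.
pub const CACHE_MISS_MIN_SECS: u64 = 1200;

/// Return a sleep duration that avoids the 270-1200 s window.
/// Requests inside the window snap to the nearer edge (assumed policy);
/// ties go to the shorter, cache-warm side.
pub fn cache_aware_sleep_secs(requested: u64) -> u64 {
    if requested <= CACHE_WARM_MAX_SECS || requested >= CACHE_MISS_MIN_SECS {
        return requested;
    }
    let to_warm = requested - CACHE_WARM_MAX_SECS;
    let to_miss = CACHE_MISS_MIN_SECS - requested;
    if to_warm <= to_miss {
        CACHE_WARM_MAX_SECS
    } else {
        CACHE_MISS_MIN_SECS
    }
}
```

With this policy the "wake me in 20 min" request from the example (1200 s) passes through unchanged, while a 5-minute request is shortened to 270 s.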
→ See Acceptance autodetect (deferred), Self-driving guide (deferred)
4. Time-based action
The agent can fire on its own schedule.
- Heartbeat (Phase 7): config-time, per-agent. Every N seconds invoke `on_heartbeat()`. Used for proactive messages, reminders, periodic state sync.
- Cron (Phase 79.7): LLM-time scheduled fires. The agent itself can call `cron_create` to schedule a future task; the runtime fires it via `LlmCronDispatcher`. Up to 50 entries per binding.
- Cron jitter cluster (Phase 80.2-80.6): six knobs, all hot-reloadable via `Arc<ArcSwap>`. `jitter_frac_from_entry_id` derives the offset from the UUID hex prefix so retries don't move the target.
  - `enabled`: global killswitch.
  - `recurring_frac`: fraction of next-fire interval used as jitter window.
  - `recurring_cap_ms`: absolute cap (5 min default).
  - `one_shot_max_ms` / `one_shot_floor_ms`: backward lead for one-shots.
  - `one_shot_minute_mod`: modulus gate (`mod=0` = never jitter one-shots).
  - `recurring_max_age_ms`: auto-expire old recurring entries (`permanent: true` exempt).
- Boot-time missed-task quarantine: `sweep_missed_entries(skew_ms)` rewrites overdue `next_fire_at` to `i64::MAX` so a long-down daemon doesn't stampede on the next tick.
- `agent_turn` poller (Phase 20): config-time scheduled LLM turn → channel publish. Provider-agnostic; primary use case is "every morning at 7am, summarise the inbox and post to Slack".
- Proactive mode (Phase 77.20): `proactive: { enabled: true, tick_interval_secs, jitter_pct, max_idle_secs }` injects a periodic `<tick>` message into the agent's session. The agent decides whether to act on it or call `Sleep`. Mutually exclusive with `role: coordinator`.
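
Deterministic jitter is the reason retries "don't move the target": the offset depends only on the entry id, not on a random draw. The sketch below shows one way that could work. The function name `jitter_frac_from_entry_id` is taken from the text, but this body, the choice of the first 8 hex digits, and the helper `recurring_jitter_ms` are assumptions rather than the code in `crates/core/src/cron_schedule.rs`.

```rust
/// Map the first 8 hex digits of an entry id to a fraction in [0, 1).
/// Non-hex characters (such as the dashes in a UUID) are skipped.
pub fn jitter_frac_from_entry_id(entry_id: &str) -> f64 {
    let hex: String = entry_id
        .chars()
        .filter(|c| c.is_ascii_hexdigit())
        .take(8)
        .collect();
    // An id with no hex digits falls back to zero jitter.
    let n = u32::from_str_radix(&hex, 16).unwrap_or(0);
    n as f64 / (u32::MAX as f64 + 1.0)
}

/// Jitter for a recurring entry: a fixed fraction of a window that is
/// `recurring_frac` of the interval, capped at `recurring_cap_ms`.
pub fn recurring_jitter_ms(
    entry_id: &str,
    interval_ms: u64,
    recurring_frac: f64,
    recurring_cap_ms: u64,
) -> u64 {
    let window = (interval_ms as f64 * recurring_frac).min(recurring_cap_ms as f64);
    (window * jitter_frac_from_entry_id(entry_id)) as u64
}
```

For an hourly entry with `recurring_frac = 0.1`, the uncapped window would be 6 minutes, so the 5-minute default cap is what bounds the offset in that case.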
→ See Cron jitter (deferred), Proactive mode
5. Communication — every surface the agent can reach
5.1. Inbound from the user
- Pairing (Phase 26): every `(channel, account_id)` inbound goes through a pairing gate. Senders that haven't been allowlisted via `nexo pair seed` get a pairing challenge. Per-binding `pairing_policy` + `auto_challenge` knobs. Seeded senders survive daemon restarts via `PairingStore::list_allow`.
- WhatsApp / Telegram / email / browser: first-party plugins (Phases 6, 22, plus email + browser CDP). Each is a `Channel` impl that maps inbound platform events to `agent.intake.<binding>` broker subjects.
- MCP channels (Phase 80.9): any MCP server that declares `experimental['nexo/channel']` can push user messages into the agent. Provider-agnostic: write a Slack adapter as an MCP server and the agent gets Slack inbound for free.
  - 5-step gate: capability + killswitch + per-binding allowlist + plugin source verification + approved allowlist.
  - SQLite-backed session registry: Slack threads survive daemon restarts.
  - Token bucket rate limit per server.
  - Audit marker `source: "channel:<server>"` in the turn-log.
  - Operator CLI `nexo channel list / doctor / test`.
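
The per-server rate limit is described as a token bucket. A minimal version of that standard technique is sketched below; the struct layout, the explicit `now_secs` clock parameter (used so the example is deterministic), and the capacity and refill values are assumptions, not nexo's configuration.

```rust
/// Classic token bucket: holds up to `capacity` tokens and refills at
/// `refill_per_sec`. One bucket would be kept per channel server.
pub struct TokenBucket {
    capacity: f64,
    tokens: f64,
    refill_per_sec: f64,
    last_refill_secs: f64,
}

impl TokenBucket {
    /// Start with a full bucket at time `now_secs`.
    pub fn new(capacity: f64, refill_per_sec: f64, now_secs: f64) -> Self {
        Self { capacity, tokens: capacity, refill_per_sec, last_refill_secs: now_secs }
    }

    /// Refill for the elapsed time, then try to spend one token.
    /// Returns `true` when the message is allowed through.
    pub fn try_acquire(&mut self, now_secs: f64) -> bool {
        let elapsed = (now_secs - self.last_refill_secs).max(0.0);
        self.tokens = (self.tokens + elapsed * self.refill_per_sec).min(self.capacity);
        self.last_refill_secs = now_secs;
        if self.tokens >= 1.0 {
            self.tokens -= 1.0;
            true
        } else {
            false
        }
    }
}
```

A bucket allows short bursts up to `capacity` while holding the long-run rate to `refill_per_sec`, which suits chat traffic where a user may send several messages in quick succession.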
5.2. Outbound to the user
- `notify_origin` / `notify_channel` hooks: Phase 67.F callback shape so the agent can surface mid-goal updates back to the originating channel without holding the request open.
- `send_user_message` tool (Phase 80.8): when brief mode is active, the agent's visible output flows through this tool. `status: "normal"` for replies, `"proactive"` for unsolicited surfacings. Free text outside the tool stays visible in the detail view.
- `channel_send` tool (Phase 80.9): invoke any MCP channel server's outbound tool by name. Configurable `outbound_tool_name` per server (default `send_message`).
- Reminder tool (Phase 7.3): schedule a future message to any channel.
5.3. Inbound from the world
- Generic webhook receiver (Phase 80.12): HTTP receiver behind a tunnel. Configure each source by YAML: `signature_spec` (HMAC-SHA256/SHA1/raw token) + `event_kind_from` (header or body json-path) + `publish_to` (NATS subject). Constant-time signature compare via `subtle::ConstantTimeEq`. Provider-agnostic: GitHub, Stripe, Calendly, Zapier all in YAML.
- Pollers (Phase 19): config-time external endpoint polls. Fan-out to per-source NATS subjects.
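
The webhook receiver compares signatures through `subtle::ConstantTimeEq`. The dependency-free sketch below only illustrates the idea behind a constant-time comparison (touch every byte, never exit early on the first mismatch). It is not a substitute for `subtle`: an optimising compiler gives no timing guarantees for a hand-written loop, and the function name here is hypothetical.

```rust
/// Compare two byte strings without returning early on the first mismatch.
/// Illustrative only; production code should rely on a vetted crate.
pub fn constant_time_eq(a: &[u8], b: &[u8]) -> bool {
    // Length is not secret for fixed-size digests, so an early return is fine here.
    if a.len() != b.len() {
        return false;
    }
    // OR together the XOR of every byte pair; the result is 0 only if all match.
    let mut diff: u8 = 0;
    for (x, y) in a.iter().zip(b.iter()) {
        diff |= x ^ y;
    }
    diff == 0
}
```

A plain `==` on slices may stop at the first differing byte, which lets an attacker who can measure response time recover a valid signature byte by byte; accumulating the difference removes that data-dependent exit.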
5.4. Multi-agent coordination
- Peer inbox (Phase 80.11): every running goal has a NATS subject `agent.inbox.<goal_id>`. `list_peers` returns reachable peers (filtered by `allowed_delegates`); `send_to_peer` sends a typed `InboxMessage` with `correlation_id`.
- `InboxRouter` (Phase 80.11.b): single broker subscriber on `agent.inbox.>`, dashmap per-goal buffers (`MAX_QUEUE=64`, FIFO eviction). Renders a `<peer-message from="..." sent_at="..." correlation_id="...">` block into the agent's next turn.
- Teams (Phase 79.6): N parallel coordinated agents with a shared scratchpad directory. Distinct from `Agent` 1-to-1 delegation; suited to research fan-out + massive refactors.
- Delegation tool (Phase 8): agent-to-agent routing on `agent.route.{target_id}` with `correlation_id`. Sync mode awaits the response; ForkAndForget (Phase 80.19) fires the delegate without blocking.
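
The per-goal buffering in `InboxRouter` (bounded queue, oldest message evicted first) can be sketched with the standard library alone. `MAX_QUEUE = 64` and the `correlation_id` field come from the text; the other `InboxMessage` fields, the `InboxBuffers` type, and the use of a plain `HashMap` in place of the concurrent dashmap are assumptions for illustration.

```rust
use std::collections::{HashMap, VecDeque};

/// Per-goal buffer size, as stated in the docs.
pub const MAX_QUEUE: usize = 64;

/// Hypothetical message shape; only `correlation_id` is named in the docs.
#[derive(Debug, Clone, PartialEq)]
pub struct InboxMessage {
    pub from: String,
    pub correlation_id: String,
    pub body: String,
}

/// Single-threaded stand-in for the router's per-goal buffers.
#[derive(Default)]
pub struct InboxBuffers {
    queues: HashMap<String, VecDeque<InboxMessage>>,
}

impl InboxBuffers {
    /// Enqueue a message for `goal_id`, evicting the oldest one when full.
    pub fn push(&mut self, goal_id: &str, msg: InboxMessage) {
        let queue = self.queues.entry(goal_id.to_string()).or_default();
        if queue.len() == MAX_QUEUE {
            queue.pop_front(); // FIFO eviction
        }
        queue.push_back(msg);
    }

    /// Take every buffered message for `goal_id`, oldest first, so they can
    /// be rendered into the goal's next turn.
    pub fn drain(&mut self, goal_id: &str) -> Vec<InboxMessage> {
        self.queues
            .remove(goal_id)
            .map(|queue| queue.into_iter().collect())
            .unwrap_or_default()
    }
}
```

Bounding the queue means a chatty peer can cost a goal at most 64 buffered messages of memory; the trade-off is that the oldest messages are dropped silently once the bound is hit.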
→ See MCP channels, Multi-agent coordination, AWAY_SUMMARY
6. Permission + safety
The agent has powerful tools. The safety story is layered.
- Per-binding capability override (Phase 16): each binding has its own `EffectiveBindingPolicy` that filters `allowed_tools`, rate limits, outbound allowlists, and capability gates. The same agent can have a public WhatsApp binding (locked-down tool set) AND a private Telegram binding (full power).
- Auto-approve dial (Phase 80.17): `auto_approve: true` skips the prompt for read-only / scoped-write tools, while destructive Bash + writes outside workspace + ConfigTool + REPL + remote_trigger always ask. The `is_curated_auto_approve` decision table has 25 entries with symlink-escape defense + parent-canonicalize fallback for new files. `mcp_` / `ext_` prefixes default to ask. Default arm `_ => false`.
- Capability inventory: `crates/setup/src/capabilities.rs::INVENTORY` registers every dangerous env toggle (`NEXO_DREAM_NOW_ENABLED`, `NEXO_KAIROS_REMOTE_CONTROL`, etc). `nexo doctor capabilities` surfaces every armed knob.
- Bash safety (Phase 77.8-77.10):
  - Destructive command warning: flags `rm -rf /`-shaped invocations.
  - Sed-in-place + path validation: rejects `sed -i` against paths outside the workspace.
  - `shouldUseSandbox` heuristic with `bwrap` / `firejail` probe.
- Channel permission relay (Phase 80.9.b): the `ChannelRelayDecider` decorator races the local approval prompt against any channel reply (`yes <id>` / `no <id>` from the user's phone). First decision wins. 5-letter ID alphabet a-z minus `l` (anti-confusable); substring blocklist for offensive combos. The local prompt always runs in parallel: channel approval is a second surface, never a replacement.
- Setup doctor: `nexo setup doctor` audits `(channel, account_id)` tuples, capability gates, dispatch policy consistency, pairing allowlist coverage.
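
Parsing a `yes <id>` / `no <id>` channel reply against the 5-letter, no-`l` alphabet can be sketched as follows. The alphabet and ID length come from the text; the case-insensitive matching, the rejection of trailing words, and the names `Reply`, `is_valid_id`, and `parse_reply` are assumptions. The offensive-substring blocklist is left out.

```rust
/// a-z minus `l` (easily confused with `1` and `I`): 25 letters.
pub const ID_ALPHABET: &str = "abcdefghijkmnopqrstuvwxyz";

/// A parsed approval reply carrying the prompt id it refers to.
#[derive(Debug, PartialEq)]
pub enum Reply {
    Approve(String),
    Deny(String),
}

/// An id is exactly five letters, all drawn from `ID_ALPHABET`.
pub fn is_valid_id(id: &str) -> bool {
    id.len() == 5 && id.chars().all(|c| ID_ALPHABET.contains(c))
}

/// Parse `yes <id>` or `no <id>`; anything else is not a permission reply.
pub fn parse_reply(text: &str) -> Option<Reply> {
    let mut parts = text.split_whitespace();
    let verb = parts.next()?.to_ascii_lowercase();
    let id = parts.next()?.to_ascii_lowercase();
    // Reject trailing words so ordinary chat is never mistaken for approval.
    if parts.next().is_some() || !is_valid_id(&id) {
        return None;
    }
    match verb.as_str() {
        "yes" => Some(Reply::Approve(id)),
        "no" => Some(Reply::Deny(id)),
        _ => None,
    }
}
```

Requiring the id in the reply ties each answer to one specific pending prompt, so a stray "yes" in the conversation cannot approve anything.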
→ See Auto-approve dial, Capability toggles, Bash safety knobs
7. Audit + observability
Everything the agent does leaves a trail.
- Turn-level audit log (Phase 72): every driver-loop `AttemptResult` writes a row to the `goal_turns` SQLite table: outcome, decision text, summary, diff_stat, error, raw_json, plus the channel `source` marker. 1000-row tail cap. Idempotent on `(goal_id, turn_index)` so a replay doesn't corrupt history.
- `agent_turns_tail goal_id=<uuid> [n=20]` tool: read tool that surfaces the last N turns of a goal as a markdown table. Post-mortem debug surface.
- DreamTask audit (Phase 80.18): `dream_runs` SQLite table joined to `goal_id` with `status`, `phase`, `sessions_reviewing`, `files_touched` (JSON), `prior_mtime_ms`, `started_at`, `ended_at`. `dream_runs_tail` LLM tool. `nexo agent dream tail/status/kill` CLI.
- Agent registry persistence (Phase 71): the `agent_handles` SQLite table tracks every Running / completed / aborted goal. Survives daemon restarts.
- Channel turn-log marker (Phase 80.9.h): channel-driven turns write `source: "channel:<server>"`. A single SQL filter answers "what came in via Slack today?".
- Prometheus metrics (Phase 9.2): counters + gauges per agent / per binding / per tool / per channel. The `health.bind` YAML key wires the scrape endpoint.
- Tracing logs: every gate / every dispatch / every retry emits a `tracing::info!` or `warn!` with structured fields (server, binding, kind, reason, error). Operator-readable.
- Config-changes log (Phase 79.10): when ConfigTool mutates YAML, a row lands in the `config_changes` table with patch_id, actor_origin, allowed paths.
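
Two properties of the turn log are worth seeing side by side: writes are idempotent on `(goal_id, turn_index)`, and only a 1000-row tail is kept. The in-memory sketch below models both. It assumes the cap applies per goal (the text does not say), and `TurnRow`, `TurnLog`, and their methods are hypothetical names; the real store is the `goal_turns` SQLite table.

```rust
use std::collections::{BTreeMap, HashMap};

/// Tail cap from the docs; assumed here to apply per goal.
pub const TAIL_CAP: usize = 1000;

/// Hypothetical subset of a `goal_turns` row.
#[derive(Debug, Clone, PartialEq)]
pub struct TurnRow {
    pub outcome: String,
    pub source: Option<String>, // e.g. Some("channel:slack")
}

/// In-memory model: goal id -> (turn index -> row), ordered by turn index.
#[derive(Default)]
pub struct TurnLog {
    goals: HashMap<String, BTreeMap<u32, TurnRow>>,
}

impl TurnLog {
    /// Upsert keyed on (goal_id, turn_index): replaying a turn overwrites
    /// the existing row instead of appending a duplicate.
    pub fn record(&mut self, goal_id: &str, turn_index: u32, row: TurnRow) {
        let turns = self.goals.entry(goal_id.to_string()).or_default();
        turns.insert(turn_index, row);
        while turns.len() > TAIL_CAP {
            turns.pop_first(); // drop the oldest turn to keep only the tail
        }
    }

    /// Number of rows currently kept for a goal.
    pub fn len(&self, goal_id: &str) -> usize {
        self.goals.get(goal_id).map_or(0, |turns| turns.len())
    }

    /// Turn indices of the last `n` turns, oldest first.
    pub fn tail(&self, goal_id: &str, n: usize) -> Vec<u32> {
        match self.goals.get(goal_id) {
            Some(turns) => {
                let mut last: Vec<u32> = turns.keys().rev().take(n).copied().collect();
                last.reverse();
                last
            }
            None => Vec::new(),
        }
    }
}
```

In SQL terms the same behaviour would typically come from a unique key on the pair plus an upsert, followed by a delete of rows older than the newest 1000.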
→ See Logging, Metrics, Turn-level audit log (deferred)
8. Operator surface
The CLI commands a human runs to drive / debug / observe the agent:
| Command | What it does |
|---|---|
| `nexo run --config config/agents.yaml` | Daemon entrypoint |
| `nexo agent run [--bg] "<prompt>"` | Spawn a goal |
| `nexo agent ps [--all] [--kind=...]` | List running goals |
| `nexo agent attach <goal_id>` | Snapshot of a goal |
| `nexo agent discover [--include-interactive]` | List discoverable goals |
| `nexo agent dream tail/status/kill` | DreamTask audit + control |
| `nexo channel list/doctor/test` | MCP channels surface |
| `nexo pair list/seed/start/revoke` | Pairing gate management |
| `nexo flow list/show/cancel/resume` | TaskFlow runtime |
| `nexo setup` | Interactive wizard |
| `nexo setup doctor` | Configuration audit |
| `nexo setup migrate --dry-run/--apply` | Schema migrations |
| `nexo doctor capabilities` | Env toggle inventory |
| `nexo ext install/list/uninstall/run` | Extension management |
| `nexo mcp-server` | Run nexo as an MCP server |
→ See CLI reference
9. End-to-end use case
This is the kind of workflow the autonomous agent is built for.
Scenario: a marketing-agent named kate runs as a daemon
process, paired with the operator's Slack workspace + Telegram
account. It manages the editorial calendar and replies to user
queries during business hours.

    agents:
      - id: kate
        model:
          provider: anthropic
          model: claude-sonnet-4-5
        plugins: [memory, browser, web_search]
        assistant_mode:
          enabled: true
        auto_approve: true
        proactive:
          enabled: true
          tick_interval_secs: 1800   # check in every 30 min
          max_idle_secs: 86400
        auto_dream:
          enabled: true
        channels:
          enabled: true
          approved:
            - server: slack
            - server: telegram
        inbound_bindings:
          - plugin: telegram
            instance: kate_tg
            allowed_channel_servers: [slack, telegram]
            auto_approve: true
            dispatch_policy:
              mode: full

What happens at runtime:
1. Boot: `nexo run` spawns `kate` as a daemon. The daemon reads the YAML, validates it, opens the broker, and opens the SQLite stores (memory, agent registry, dream runs, turn log, channel sessions, pairing). It connects the configured MCP servers, spawns a `ChannelInboundLoop` per `(binding, server)` plus a single `ChannelBridge` per process, and wraps the inner permission decider in `ChannelRelayDecider`.
2. First Slack DM: `alice` writes "What are we publishing today?" in Slack thread `1700000000.000`. The Slack MCP server emits `notifications/nexo/channel`. The runtime parses it, derives `session_key = "slack|thread_ts=1700000000.000"`, resolves a fresh `session_uuid`, persists it in `mcp_channel_sessions.sqlite`, and hands off the `<channel source="slack" thread_ts="1700000000.000">` block to the intake. The pairing gate verifies `alice` is allowlisted (or challenges her).
3. Agent decides: the LLM reads recent context (long-term memory + transcripts) and decides to look up the calendar. It calls `Bash(python check_calendar.py)`. Auto-approve skips the prompt because the path is read-only and inside the workspace.
4. Reply: the agent calls `channel_send(server: "slack", content: "The Q2 blog post is still pending", arguments: { thread_ts: "1700000000.000" })`. The runtime resolves the outbound tool name from the registered server's snapshot and invokes it through the MCP runtime. The Slack MCP server posts to the Slack API.
5. Cron fires at 8 PM: a `cron_create` call from a previous turn scheduled a daily summary. The cron runner picks it up and dispatches an LLM turn through `LlmCronDispatcher`. Output goes to the operator's Telegram via `notify_channel`.
6. Risky tool prompt: the agent decides to schedule an email blast. The local approval prompt opens; in parallel the runtime emits `notifications/nexo/channel/permission_request` to both Slack and Telegram. The operator's phone shows `Approve "Schedule email blast?" — yes abcde / no abcde`. The operator types `yes abcde` in Telegram; the Telegram MCP server parses it and emits `notifications/nexo/channel/permission`. `ChannelRelayDecider` wins the race and returns `AllowOnce`. The email sends.
7. Operator sleeps: the agent keeps running. It receives Slack DMs from team members and replies through the same threads. Cron tasks fire on schedule. Memory consolidates at midnight via `auto_dream`.
8. Daemon restart: the operator pushes a new YAML; the watcher detects it, validates it, and swaps it in via the Phase 18 `ArcSwap`. The `ChannelRegistry::reevaluate` pass evicts handlers that no longer pass the gate. SQLite stores survive. When alice writes again in the same Slack thread, the agent reattaches to the same session, so the bot doesn't re-introduce itself.
9. Operator returns after 12 h silence: the first inbound triggers the AWAY_SUMMARY digest. The agent composes a markdown report of the past 12 h: 14 channel messages handled, 2 permission prompts approved, 1 cron fire completed. It is sent before processing the operator's actual message.
10. Operator audits: `agent_turns_tail goal_id=<uuid> n=50` shows every decision the agent made in the last 50 turns. `nexo channel doctor` validates the YAML against the gate. `nexo agent dream tail` shows the last consolidations.
The operator never sat at a terminal during steps 5-9. The agent is autonomous within the bounds of the YAML.
10. Provider-agnostic by design
Every autonomous behaviour works against any LLM provider:
- MiniMax M2.5 (primary)
- Anthropic Claude (subscription OAuth, API key, or Claude Code import)
- OpenAI-compat providers
- Gemini
- Local llama.cpp (Phase 68 backlog — model-agnostic GGUF loader for tier-0 inference)
The LlmClient trait is the abstraction. No autonomous feature
hard-codes a provider; everything routes through the registry +
binding-level provider selection.
Channels work the same way: any MCP server that follows the protocol becomes a channel, regardless of which platform it adapts.
Pollers, webhooks, and channel adapters are all data-driven via YAML — operators don't write per-provider Rust to add a new external surface.
11. Code map — where each capability lives
| Capability | Crate / file | Tests |
|---|---|---|
| Driver-loop | crates/driver-loop/ | + integration tests |
| Permission decider | crates/driver-permission/src/decider.rs | inline |
| Auto-approve dial | crates/driver-permission/src/auto_approve.rs | 27 |
| Channel relay decorator | crates/driver-permission/src/channel_relay.rs | 8 |
| Bash safety | crates/driver-permission/src/bash_destructive.rs | 19 |
| Long-term memory | crates/memory/src/long_term.rs | inline |
| Memdir relevance scorer | crates/memory/src/memdir/ | inline |
| Secret guard | crates/memory/src/secret_guard.rs | inline |
| autoDream runner | crates/dream/ | 67 |
| Cron schedule + jitter | crates/core/src/cron_schedule.rs | 80 |
| Channels gate + parser + bridge | crates/mcp/src/channel*.rs | 109 |
| Channel session store | crates/mcp/src/channel_session_store.rs | 9 |
| Channel permission relay | crates/mcp/src/channel_permission.rs | 27 |
| Channel boot helpers | crates/mcp/src/channel_boot.rs | 5 |
| Channel LLM tools | crates/core/src/agent/channel_*_tool.rs | 21 |
| Pairing | crates/pairing/ | inline |
| TaskFlow | crates/taskflow/ | inline |
| Agent registry persistence | crates/agent-registry/ | 51 |
| Turn-level audit log | crates/agent-registry/src/turn_log.rs | 9 |
| Inbox router | crates/core/src/agent/inbox*.rs | 17 |
| Webhook receiver | crates/webhook-receiver/ | 33 |
| Forked sub-agent | crates/fork/ | 42 |
| Driver / runtime hookup | src/main.rs | smoke |
Total channel-related lib tests: 168, all green, spread across 5 crates. The workspace-wide test count is much larger; see the phase-specific docs for the per-feature breakdown.
12. What's NOT done yet
Honest list of polish items still backlogged:
- Sample MCP channel server fixture: `extensions/sample-channel-server/` reference impl so operators can wire a fake channel quickly without writing an MCP server from scratch. ~200 LOC, high educational value, no functional impact.
- Setup wizard panel for channels: `nexo setup → Configurar agente → Channels` interactive opt-in. UX nice-to-have.
- Live-runtime channel doctor: the current `nexo channel doctor` is static against YAML. A live version would consult the active `ChannelRegistry` via NATS to show what's actually registered in the running daemon.
- `channel_history` LLM tool: tail of the turn-log filtered by `source: "channel:<server>"`, useful for the agent to ask itself "what did Slack send today".
- Phase 67.10–67.13: escalation-to-channel paths for driver-loop are largely subsumed by `notify_origin` / `notify_channel` already. Remaining tickets in `PHASES.md`.
- Phase 68 Local LLM tier (llama.cpp): 15 sub-phases for tier-0 inference (PII / embeddings / poller pre-filter / classifiers / fallback). Planned to run on Termux ARM CPU + desktop CPU/GPU.
None of these block the autonomous agent's current capabilities.
13. Where to go next
- Setting up your first autonomous agent → Quick start + Setup wizard.
- Deep dive on assistant mode + auto-approve → Assistant mode overview.
- MCP channels specifically → MCP channels.
- Multi-agent coordination patterns → Multi-agent coordination.
- Audit + observability stack → Logging + Metrics + Turn-level audit log (deferred).
- Phase tracking: `PHASES.md` at repo root has the exhaustive sub-phase status (✅ MVP / ⬜ open / DEFERRED).