Project tracker + multi-agent dispatch (Phase 67.A–H)
The project-tracker subsystem lets a nexo-rs agent answer "what phase
is the development on?" through Telegram / WhatsApp / a shell, and lets it
dispatch async programmer agents that ship phases on its behalf.
The implementation is layered:
| Layer | Crate | Responsibility |
|---|---|---|
| Project files | nexo-project-tracker | Parse PHASES.md + FOLLOWUPS.md, watch for changes, expose read tools. |
| Multi-agent state | nexo-agent-registry | DashMap + SQLite store of every in-flight goal, cap + queue + reattach. |
| Goal control | nexo-driver-loop | spawn_goal / pause_goal / resume_goal / cancel_goal per-goal. |
| Tool surface | nexo-dispatch-tools | program_phase, dispatch_followup, hook system, agent control + query, admin. |
| Capability gate | nexo-config + nexo-core | DispatchPolicy per agent / binding, ToolRegistry filter. |
Project tracker (Phase 67.A)
FsProjectTracker reads <root>/PHASES.md (required) and
<root>/FOLLOWUPS.md (optional) at startup, caches parsed state
behind a parking-lot RwLock with a 60 s TTL, and starts a notify
watcher on the parent directory that invalidates the cache on
Modify | Create | Remove events.
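The cache-plus-watcher pattern above can be sketched with std types (the real crate uses parking-lot; all names here are illustrative, not the nexo-project-tracker API):

```rust
use std::sync::RwLock;
use std::time::{Duration, Instant};

/// Read-through cache: parsed state lives behind an RwLock together
/// with the instant it was parsed. Reads past the TTL, or after a
/// watcher invalidation, re-parse.
struct CachedState<T> {
    ttl: Duration,
    slot: RwLock<Option<(Instant, T)>>,
}

impl<T: Clone> CachedState<T> {
    fn new(ttl: Duration) -> Self {
        Self { ttl, slot: RwLock::new(None) }
    }

    /// What the notify watcher would call on Modify | Create | Remove.
    fn invalidate(&self) {
        *self.slot.write().unwrap() = None;
    }

    /// Returns the cached value, re-parsing with `parse` when the
    /// entry is missing or older than the TTL.
    fn get(&self, parse: impl Fn() -> T) -> T {
        if let Some((at, v)) = self.slot.read().unwrap().as_ref() {
            if at.elapsed() < self.ttl {
                return v.clone();
            }
        }
        let fresh = parse();
        *self.slot.write().unwrap() = Some((Instant::now(), fresh.clone()));
        fresh
    }
}
```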
Read tools register through nexo_dispatch_tools::READ_TOOL_NAMES
(project_status, project_phases_list, followup_detail,
git_log_for_phase).
Set ${NEXO_PROJECT_ROOT} to point at a workspace other than the
daemon's cwd.
Multi-agent registry (Phase 67.B)
AgentRegistry is the single source of truth for every goal the
driver has admitted. Each entry holds an ArcSwap<AgentSnapshot>
(turn N/M, last acceptance, last decision summary, diff_stat) so
list_agents / agent_status readers never block writers.
- `admit(handle, enqueue)` enforces the global cap. Beyond the cap, `enqueue=true` parks the goal as `Queued`; `enqueue=false` rejects.
- `release(goal_id, terminal)` returns the next-up queued goal so the orchestrator can promote it via `promote_queued` once the worktree / binding is ready.
- `apply_attempt(AttemptResult)` refreshes the live snapshot. Idempotent against out-of-order replay (a lower `turn_index` is ignored).
- Reattach (Phase 67.B.4) walks the SQLite store at boot and rehydrates `Running` rows. With `resume_running=false` they flip to `LostOnRestart` and surface to the operator.
LogBuffer keeps a per-goal ring of recent driver events for the
agent_logs_tail tool — bounded so a chatty goal cannot OOM the
process.
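The bounded per-goal ring can be sketched like this (illustrative shape only, not the real LogBuffer API):

```rust
use std::collections::{HashMap, VecDeque};

/// Each goal keeps at most `cap` recent event lines, so a chatty
/// goal evicts its own oldest entries instead of growing without
/// bound.
struct LogRing {
    cap: usize,
    rings: HashMap<String, VecDeque<String>>,
}

impl LogRing {
    fn new(cap: usize) -> Self {
        Self { cap, rings: HashMap::new() }
    }

    fn push(&mut self, goal_id: &str, line: String) {
        let ring = self.rings.entry(goal_id.to_string()).or_default();
        if ring.len() == self.cap {
            ring.pop_front(); // drop the oldest event for this goal
        }
        ring.push_back(line);
    }

    /// What an agent_logs_tail-style reader would consume: the last
    /// `n` events in chronological order.
    fn tail(&self, goal_id: &str, n: usize) -> Vec<&str> {
        self.rings
            .get(goal_id)
            .map(|r| r.iter().rev().take(n).rev().map(String::as_str).collect())
            .unwrap_or_default()
    }
}
```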
Persistence wiring (Phase 71)
The bin reads agent_registry.store from
config/project-tracker/project_tracker.yaml and opens
SqliteAgentRegistryStore when the resolved path is non-empty.
Env placeholders (${NEXO_AGENT_REGISTRY_DB:-./data/agents.db})
are expanded before the open. Path open failures fall back to
MemoryAgentRegistryStore with a warn so a corrupt sqlite file
never bricks boot.
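The `${VAR:-default}` expansion can be sketched as follows (a hedged sketch handling one placeholder per value; the real config loader may differ):

```rust
use std::env;

/// Expands a single `${NAME:-default}` placeholder: the env var's
/// value if set, the default otherwise. Values without a
/// placeholder pass through unchanged.
fn expand_placeholder(raw: &str) -> String {
    let (Some(start), Some(end)) = (raw.find("${"), raw.find('}')) else {
        return raw.to_string();
    };
    let inner = &raw[start + 2..end];
    let (name, default) = match inner.split_once(":-") {
        Some((n, d)) => (n, d),
        None => (inner, ""),
    };
    let value = env::var(name).unwrap_or_else(|_| default.to_string());
    format!("{}{}{}", &raw[..start], value, &raw[end + 1..])
}
```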
When the registry is sqlite-backed and reattach_on_boot: true,
the bin runs the reattach sweep with resume_running=false. Every
prior-run Running row flips to LostOnRestart, and any
notify_origin / notify_channel hook attached to that goal fires
once with an [abandoned] summary so the originating chat learns
the goal could not be resumed. Subprocess respawn is intentionally
not attempted — restoring a Claude Code worktree the daemon no
longer owns is unsafe to do silently and lives under Phase 67.C.1.
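The shape of that sweep, sketched with stand-in types (names here are assumptions, not the real nexo-agent-registry API):

```rust
/// Every row the previous process persisted as Running flips to
/// LostOnRestart, and one notification per goal is collected so the
/// originating chat can be told exactly once.
#[derive(Debug, PartialEq, Clone)]
enum GoalState {
    Running,
    LostOnRestart,
    Done,
}

fn reattach_sweep(rows: &mut Vec<(String, GoalState)>) -> Vec<String> {
    let mut abandoned = Vec::new();
    for (goal_id, state) in rows.iter_mut() {
        if *state == GoalState::Running {
            *state = GoalState::LostOnRestart;
            abandoned.push(format!("[abandoned] goal `{goal_id}` could not be resumed"));
        }
    }
    abandoned
}
```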
Shutdown drain (Phase 71.3)
On SIGTERM the bin runs nexo_dispatch_tools::drain_running_goals
before plugin teardown so notify_origin reaches WhatsApp /
Telegram while their adapters are still alive. Each Running goal's
Cancelled hooks fire with a [shutdown] summary; per-hook
dispatch is bounded by a 2 s timeout so a stuck publish cannot
hold shutdown hostage. The row then flips to LostOnRestart so
the next boot's reattach sweep does not re-fire the same
notification.
[shutdown] daemon stopping — goal `<id>` was running and has
been marked abandoned. Re-dispatch with `program_phase
phase_id=<phase>` if you still need it.
SIGKILL still bypasses this — the boot-time reattach sweep is the safety net for that case.
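The bounded per-hook dispatch can be sketched with a std thread and channel instead of the real async runtime (a sketch of the idea, not the drain's actual implementation):

```rust
use std::sync::mpsc;
use std::thread;
use std::time::Duration;

/// Runs the hook on its own thread and waits at most `timeout` for
/// it to finish. Returns true if the hook completed in time; a
/// stuck hook is abandoned so it cannot hold shutdown hostage.
fn dispatch_with_timeout<F>(hook: F, timeout: Duration) -> bool
where
    F: FnOnce() + Send + 'static,
{
    let (tx, rx) = mpsc::channel();
    thread::spawn(move || {
        hook();
        let _ = tx.send(()); // receiver may already be gone after a timeout
    });
    rx.recv_timeout(timeout).is_ok()
}
```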
Turn-level audit log (Phase 72)
Live state (AgentSnapshot) only carries the latest decision /
diff / acceptance per goal. Once a turn rolls forward the previous
turn's data is gone. To answer "what did the agent actually do
across its 40 turns?" the runtime now writes a durable row per
turn into a goal_turns table on the same agents.db:
goal_turns(
  goal_id     TEXT,
  turn_index  INTEGER,
  recorded_at INTEGER,
  outcome     TEXT,  -- done | continue | needs_retry | …
  decision    TEXT,  -- last Decision rendered as
                     -- "<tool> (allow|deny:msg|observe:note) — rationale"
  summary     TEXT,  -- mirror of AgentSnapshot.last_progress_text
  diff_stat   TEXT,
  error       TEXT,  -- pre-rendered for needs_retry / escalate / budget
  raw_json    TEXT,  -- full AttemptResult payload
  PRIMARY KEY (goal_id, turn_index)
);
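The per-turn write is an upsert keyed on that primary key. A sketch of the statement shape (the real SQL may differ):

```sql
INSERT INTO goal_turns (goal_id, turn_index, recorded_at, outcome,
                        decision, summary, diff_stat, error, raw_json)
VALUES (?1, ?2, ?3, ?4, ?5, ?6, ?7, ?8, ?9)
ON CONFLICT (goal_id, turn_index) DO UPDATE SET
    recorded_at = excluded.recorded_at,
    outcome     = excluded.outcome,
    decision    = excluded.decision,
    summary     = excluded.summary,
    diff_stat   = excluded.diff_stat,
    error       = excluded.error,
    raw_json    = excluded.raw_json;
```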
EventForwarder writes a row on every AttemptResult event,
upsert-on-conflict so a replay can't dup history. The new chat tool
agent_turns_tail goal_id=<uuid> [n=20] returns a markdown table
of the last N rows (default 20, capped at 1000):
showing 20 of 40 turn(s) for `…`
| turn | outcome | decision | summary | error |
|---|---|---|---|---|
| 21 | continue | Edit (allow) — patch crate slack | wired Plugin trait | - |
| 22 | needs_retry | Bash (allow) — cargo build | … | E0432 in slack/src/lib.rs |
…
Best-effort writes: an append failure logs a warn but never blocks
the driver loop. When the registry isn't sqlite-backed (memory
fallback), the tool reports "set agent_registry.store in
project_tracker.yaml" rather than silently returning empty.
Async dispatch (Phase 67.C + 67.E)
DriverOrchestrator::spawn_goal(self: Arc<Self>, goal) returns a
tokio::task::JoinHandle so the calling tool returns the goal id
instantly without waiting for the run to finish. Per-goal pause /
cancel signals (watch<bool> and CancellationToken::child_token)
let pause_agent / cancel_agent target one goal without taking
down the rest of the orchestrator.
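The per-goal control signals have roughly this shape, sketched here with std atomics instead of the real `watch<bool>` / `CancellationToken` pair (illustrative, not the driver-loop API):

```rust
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::Arc;

/// Each spawned goal clones one handle; pause_agent / cancel_agent
/// flip only that goal's flags, leaving every other goal untouched.
#[derive(Clone, Default)]
struct GoalControl {
    paused: Arc<AtomicBool>,
    cancelled: Arc<AtomicBool>,
}

impl GoalControl {
    fn pause(&self) { self.paused.store(true, Ordering::SeqCst); }
    fn resume(&self) { self.paused.store(false, Ordering::SeqCst); }
    fn cancel(&self) { self.cancelled.store(true, Ordering::SeqCst); }
    fn is_paused(&self) -> bool { self.paused.load(Ordering::SeqCst) }
    fn is_cancelled(&self) -> bool { self.cancelled.load(Ordering::SeqCst) }
}
```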
program_phase_dispatch is the heart of the dispatch surface: it
reads the sub-phase out of PHASES.md, runs DispatchGate::check,
constructs a Goal with the dispatcher / origin metadata, asks the
registry for a slot, and either spawns the goal or returns
Queued / Forbidden / NotFound. dispatch_followup is the
mirror that pulls the description from a FOLLOWUPS.md item.
Capability gate (Phase 67.D)
DispatchPolicy { mode, max_concurrent_per_dispatcher, allowed_phase_ids, forbidden_phase_ids } lives on AgentConfig
and (as Option<DispatchPolicy>) on InboundBinding. The
per-binding override fully replaces the agent-level value so an
operator can be precise per channel ("asistente is none
everywhere except this Telegram chat where it is full").
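The replace-not-merge resolution can be sketched with simplified types (field set trimmed for illustration):

```rust
/// A binding-level policy wins wholesale: no field-by-field merge
/// with the agent-level value.
#[derive(Clone, Debug, PartialEq)]
struct DispatchPolicy {
    mode: String,
    allowed_phase_ids: Vec<String>,
}

fn resolve_policy(agent: &DispatchPolicy, binding: Option<&DispatchPolicy>) -> DispatchPolicy {
    binding.cloned().unwrap_or_else(|| agent.clone())
}
```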
DispatchGate::check short-circuits in this order:
- capability `None` → `CapabilityNone` (every kind). `ReadOnly` capability + write kind → `CapabilityReadOnly`.
- write + `require_trusted` + `!sender_trusted` → `SenderNotTrusted`. Read tools bypass the trust gate so `list_agents` stays open for unpaired senders.
- `forbidden_phase_ids` match → `PhaseForbidden`.
- non-empty `allowed_phase_ids` + no match → `PhaseNotAllowed`.
- dispatcher / sender / global caps. A goal over the global cap with `queue_when_full=true` is admitted; the orchestrator queues it. Without queue → `GlobalCapReached`.
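The short-circuit order above can be condensed into a sketch (simplified inputs; the real check consumes the full policy and cap snapshot):

```rust
#[derive(Debug, PartialEq)]
enum Denial {
    CapabilityNone,
    CapabilityReadOnly,
    SenderNotTrusted,
    PhaseForbidden,
    PhaseNotAllowed,
    GlobalCapReached,
}

/// Ok(true) = admitted, Ok(false) = admitted-but-queued.
fn gate_check(
    capability: &str, // "none" | "read_only" | "full"
    is_write: bool,
    require_trusted: bool,
    sender_trusted: bool,
    forbidden: &[&str],
    allowed: &[&str],
    phase: &str,
    global_cap_free: bool,
    queue_when_full: bool,
) -> Result<bool, Denial> {
    if capability == "none" { return Err(Denial::CapabilityNone); }
    if capability == "read_only" && is_write { return Err(Denial::CapabilityReadOnly); }
    if is_write && require_trusted && !sender_trusted { return Err(Denial::SenderNotTrusted); }
    if forbidden.contains(&phase) { return Err(Denial::PhaseForbidden); }
    if !allowed.is_empty() && !allowed.contains(&phase) { return Err(Denial::PhaseNotAllowed); }
    if !global_cap_free {
        return if queue_when_full { Ok(false) } else { Err(Denial::GlobalCapReached) };
    }
    Ok(true)
}
```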
ToolRegistry::apply_dispatch_capability(policy, is_admin) prunes
the registry of dispatch tool names not allowed by the resolved
policy. ToolRegistryCache::get_or_build_with_dispatch builds the
per-binding filtered registry that respects both allowed_tools
and dispatch_policy. Hot reload (Phase 18) constructs a fresh
ToolRegistryCache per snapshot, so a new dispatch_policy lands
on the next intake without restart; in-flight goals keep their
pre-reload tool surface so a hot reload never preempts.
Completion hooks (Phase 67.F)
Each hook is (on: HookTrigger, action: HookAction, id). Triggers
fire on Done | Failed | Cancelled | Progress { every_turns }.
Actions:
- `notify_origin` — publish a markdown summary to the chat that triggered the goal. No-op when `origin.plugin == "console"`.
- `notify_channel { plugin, instance, recipient }` — publish to an explicit channel different from the origin (escalate to ops).
- `dispatch_phase { phase_id, only_if }` — chain another goal when `only_if` matches the firing transition. Implemented via a pluggable `DispatchPhaseChainer` so the runtime owns the `program_phase_dispatch` plumbing.
- `nats_publish { subject }` — JSON payload to a custom subject.
- `shell { cmd, timeout }` — opt-in via `allow_shell_hooks`. The capability `PROGRAM_PHASE_ALLOW_SHELL_HOOKS` is registered with the setup inventory so `agent doctor capabilities` flags it the moment the operator exports the env var. Receives `NEXO_HOOK_GOAL_ID` / `PHASE_ID` / `TRANSITION` / `PAYLOAD_JSON` env vars.
`HookIdempotencyStore` (SQLite) keeps `(goal_id, transition, action_kind, action_id)` UNIQUE so at-least-once NATS replay or a
mid-hook restart cannot fire a hook twice.
HookRegistry (in-memory DashMap<GoalId, Vec<CompletionHook>>)
backs add_hook / remove_hook / agent_hooks_list.
NATS subjects (Phase 67.H.2)
| Subject | Producer |
|---|---|
| `agent.dispatch.spawned` | program_phase_dispatch admitted |
| `agent.dispatch.denied` | DispatchGate::check denied |
| `agent.tool.hook.dispatched` | hook fired ok |
| `agent.tool.hook.failed` | hook attempt errored |
| `agent.registry.snapshot.<goal_id>` | per-goal periodic beacon |
| `agent.driver.progress` | every Nth completed work-turn |
Plus the existing Phase 67.0–67.9 subjects:
`agent.driver.{goal,attempt}.{started,completed}`,
`agent.driver.{decision,acceptance,budget.exhausted,escalate,replay,compact}`.
CLI (Phase 67.H.1)
nexo-driver-tools mirrors the chat tool surface for shell use:
nexo-driver-tools status [--phase <id> | --followups]
nexo-driver-tools dispatch <phase_id>
nexo-driver-tools agents list [--filter running|queued|...]
nexo-driver-tools agents show <goal_id>
nexo-driver-tools agents cancel <goal_id> [--reason "…"]
origin.plugin = "console" so notify_origin is a no-op (the
operator sees stdout, not a chat reply).
Built-in registration (nexo daemon)
The default nexo agent binary registers every dispatch
tool definition at boot via
nexo_core::agent::dispatch_handlers::register_dispatch_tools_into.
The LLM sees program_phase, list_agents, agent_status,
etc. in its toolset; per-binding dispatch_capability
(config/agents.yaml) prunes the write tools for bindings that
opted out.
What's NOT bundled by default is the runtime
DispatchToolContext — the orchestrator + registry + tracker
references the handlers consult. Without it, a tool call returns a
clean `dispatch tools require AgentContext.dispatch to be set at boot`
error instead of pretending success. Two integration paths from there:
- In-process orchestrator — boot a `DriverOrchestrator` alongside the agents and share one `AgentRegistry`. See the next section for the wiring sample.
- NATS-based dispatch — the agent bin publishes a message to `agent.driver.dispatch.request` that a separate `nexo-driver` daemon consumes. This is the topology to use when the Claude subprocess needs hardware (a GPU box) the agent daemon doesn't have. The dispatch tool surface only changes in the registry it consults; operators can swap the in-process `AgentRegistry` for one that mirrors a NATS-backed registry without touching the handlers.
Boot wiring (B8)
The integrator's main.rs ties everything together. Minimal
shape:
use std::sync::Arc;
use nexo_agent_registry::{AgentRegistry, MemoryAgentRegistryStore, LogBuffer};
use nexo_core::agent::{
dispatch_handlers::{register_dispatch_tools_into, DispatchToolContext},
tool_registry::ToolRegistry,
};
use nexo_dispatch_tools::{
event_forwarder::EventForwarder,
hooks::{DefaultHookDispatcher, HookRegistry, NoopNatsHookPublisher},
policy_gate::CapSnapshot,
NoopTelemetry,
};
use nexo_pairing::PairingAdapterRegistry;
use nexo_project_tracker::FsProjectTracker;
// 1. Project tracker.
let tracker: Arc<dyn nexo_project_tracker::ProjectTracker> =
Arc::new(FsProjectTracker::open(std::env::current_dir().unwrap())?);
// 2. Agent registry + log buffer.
let registry = Arc::new(AgentRegistry::new(
Arc::new(MemoryAgentRegistryStore::default()),
4,
));
let log_buffer = Arc::new(LogBuffer::new(200));
let hook_registry = Arc::new(HookRegistry::new());
// 3. Hook dispatcher with the channel adapters that Phase 26
// registered (whatsapp / telegram).
let pairing = PairingAdapterRegistry::new();
// pairing.register(WhatsappPairingAdapter::new(...));
// pairing.register(TelegramPairingAdapter::new(...));
let hook_dispatcher = Arc::new(DefaultHookDispatcher::new(
pairing,
Arc::new(NoopNatsHookPublisher),
));
// 4. Orchestrator with EventForwarder so registry / log_buffer /
// hooks see every driver event.
let inner_sink: Arc<dyn nexo_driver_loop::DriverEventSink> =
Arc::new(nexo_driver_loop::NoopEventSink);
let event_sink: Arc<dyn nexo_driver_loop::DriverEventSink> =
Arc::new(EventForwarder::new(
registry.clone(),
log_buffer.clone(),
hook_registry.clone(),
hook_dispatcher.clone(),
inner_sink,
));
// (orchestrator builder consumes event_sink)
// 5. Bundle for AgentContext.dispatch.
let dispatch_ctx = Arc::new(DispatchToolContext {
tracker,
orchestrator: orch.clone(),
registry,
hooks: hook_registry,
log_buffer,
default_caps: CapSnapshot {
queue_when_full: true,
..Default::default()
},
require_trusted: true,
telemetry: Arc::new(NoopTelemetry),
});
// 6. Register the handlers into the base ToolRegistry. The
// per-binding cache prunes write tools when capability=None
// or read_only.
let base = ToolRegistry::new();
register_dispatch_tools_into(&base);
// 7. Per-session AgentContext.with_dispatch(dispatch_ctx)
// + .with_sender_trusted(true) + .with_inbound_origin(plugin,
// instance, sender).
Without step 6 the handlers exist but aren't reachable by the LLM. Without step 4 the registry / log_buffer / hooks stay inert. Without step 5 the handlers return `MissingDispatchCtx`.
See also
- `proyecto/PHASES.md` — Phase 67.A–H sub-phase status of record.
- `architecture/driver-subsystem.md` — Phase 67.0–67.9 driver loop replay + compact policies.