ADR 0004 — Per-agent tool sandboxing at registry build time
Status: Accepted
Date: 2026-02
Context
The same process hosts agents with very different blast radii.
Ana runs on WhatsApp against leads; Kate manages a personal Telegram;
ops has Proxmox credentials. The LLM in one agent must never see —
let alone invoke — tools registered for another agent.
Three enforcement points are possible:
- Prompt-level sandboxing: "don't use these tools." Relies on model compliance and fails under adversarial prompts.
- Runtime filter: every tools/call checks a policy before dispatch. Robust, but the LLM still sees the tools in tools/list and can hallucinate calls.
- Registry build-time pruning: the agent's ToolRegistry is built with only the allowed tools. The LLM literally cannot see the others.
Decision
Default to registry build-time pruning.
- allowed_tools: [] (empty) = every registered tool visible
- allowed_tools: [glob, …] = strict allowlist; tools not matching are removed from the registry before the LLM's tools/list call is answered
- For agents with inbound_bindings[], the base registry keeps every tool and per-binding overrides apply build-time filtering at turn time, so a single agent can narrow its surface differently per channel
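The allowlist rules above can be sketched in a few lines. This is a minimal illustration, not the project's implementation: the tool names are invented, the globs are assumed to be shell-style (matched here with Python's fnmatch), and a per-binding override is assumed to replace the agent-level allowlist for turns arriving on that binding.

```python
from fnmatch import fnmatchcase

# Hypothetical tool names, for illustration only.
ALL_TOOLS = [
    "whatsapp_send_message",
    "telegram_send_message",
    "proxmox_vm_restart",
    "memory_search",
]


def build_registry(all_tools, allowed_tools):
    """Return the tool names an agent's registry would expose.

    An empty allowlist means "no restriction"; otherwise a tool is kept
    only if it matches at least one glob pattern.
    """
    if not allowed_tools:
        return list(all_tools)
    return [
        name
        for name in all_tools
        if any(fnmatchcase(name, pattern) for pattern in allowed_tools)
    ]


def registry_for_turn(all_tools, agent_allowed, binding_allowed=None):
    """Pick the effective allowlist for one turn and prune with it.

    Assumption: a per-binding override, when present, replaces the
    agent-level allowlist for that turn.
    """
    effective = binding_allowed if binding_allowed is not None else agent_allowed
    return build_registry(all_tools, effective)
```

Because pruning happens before the tool list is answered, a tool such as proxmox_vm_restart never reaches an agent whose allowlist is ["whatsapp_*"]; there is nothing for a runtime policy to reject.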
Additional layers stack on top:
- outbound_allowlist.<channel>: [recipients] (defense in depth): even with whatsapp_send_message in the registry, the runtime rejects sends to unlisted recipients
- tool_rate_limits: per-tool rate limiting for side-effectful tools
- Per-agent workspace and long-term memory (WHERE agent_id = ?): data-level isolation
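A rough sketch of how the first two runtime layers might sit behind an already-pruned registry. Everything here is an assumption made for illustration: the OutboundGuard class, the (max_calls, window_seconds) shape of a rate limit, the sliding-window strategy, and the choice to treat a channel with no allowlist entry as unrestricted.

```python
import time


class OutboundGuard:
    """Runtime checks applied after the registry has already been pruned."""

    def __init__(self, outbound_allowlist, tool_rate_limits, clock=time.monotonic):
        # outbound_allowlist: {channel: [recipient, ...]}
        # tool_rate_limits: {tool_name: (max_calls, window_seconds)}  (assumed shape)
        self.outbound_allowlist = outbound_allowlist
        self.tool_rate_limits = tool_rate_limits
        self.clock = clock
        self._calls = {}  # tool_name -> timestamps of recent accepted calls

    def check_recipient(self, channel, recipient):
        """Return True if a send to this recipient may proceed."""
        allowed = self.outbound_allowlist.get(channel, [])
        # Assumption: no entry (or an empty list) means the channel is unrestricted.
        return not allowed or recipient in allowed

    def check_rate(self, tool_name):
        """Return True and record the call if the tool is under its limit."""
        limit = self.tool_rate_limits.get(tool_name)
        if limit is None:
            return True
        max_calls, window = limit
        now = self.clock()
        # Keep only the calls that still fall inside the sliding window.
        recent = [t for t in self._calls.get(tool_name, []) if now - t < window]
        allowed = len(recent) < max_calls
        if allowed:
            recent.append(now)
        self._calls[tool_name] = recent
        return allowed
```

The clock is injectable so the window logic can be exercised without sleeping. These checks complement pruning rather than replace it: they constrain how a visible tool is used, not whether it is visible.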
Consequences
Positive
- Adversarial prompts can't invoke missing tools — the model has no token string for them
- Easy mental model: grep allowed_tools to see what an agent can do
- Prompt tokens stay small (tool list scales with allowlist, not registry)
Negative
- A misconfigured allowed_tools silently hides tools the LLM expected to use: the agent returns "I can't do that," puzzling both user and developer. Mitigation: agent status shows the effective tool set per agent
- Dynamic granting mid-session is not supported (would require re-handshake with the MCP clients)
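To make the misconfiguration case concrete, here is a small sketch of the kind of report an effective-tool-set view could produce. The helper name and output shape are hypothetical and do not describe the actual agent status output; flagging patterns that match nothing is an illustrative extra, not something this ADR specifies. Globs are again assumed to be shell-style.

```python
from fnmatch import fnmatchcase


def effective_tool_report(all_tools, allowed_tools):
    """Summarize what an allowlist does to a registry (hypothetical helper)."""
    if not allowed_tools:
        # Empty allowlist: every registered tool stays visible.
        visible = list(all_tools)
    else:
        visible = [
            tool
            for tool in all_tools
            if any(fnmatchcase(tool, pattern) for pattern in allowed_tools)
        ]
    hidden = [tool for tool in all_tools if tool not in visible]
    # A pattern that matches no registered tool may indicate a typo.
    unmatched = [
        pattern
        for pattern in allowed_tools
        if not any(fnmatchcase(tool, pattern) for tool in all_tools)
    ]
    return {"visible": visible, "hidden": hidden, "unmatched_patterns": unmatched}
```

With a misspelled pattern such as "whatsap_*", the report lists whatsapp_send_message under hidden and the pattern under unmatched_patterns, which points at the cause of an otherwise silent "I can't do that."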
Related
- Config — agents.yaml (allowed_tools semantics)
- Per-agent credentials — the gauntlet validates that the binding's channel instance is actually allowed