Web fetch
The web_fetch built-in tool lets an agent retrieve the cleaned
body text and title for one or more URLs the agent already knows.
It is the companion to Web search: web_search finds
URLs, web_fetch retrieves them.
It is kept separate from web_search.expand=true because the agent often
knows the URL up-front (skill output, RSS poll, calendar
attachment, user message) and would otherwise have to either
hallucinate a search query or shell out to a fetch-url
extension.
When to use which
| Scenario | Tool |
|---|---|
| Agent needs to find content matching a query | web_search |
| Agent has a URL from a web_search hit and wants the body | web_search(expand=true) |
| Agent has a URL from a poller / skill / user message | web_fetch |
| Agent has a list of URLs to triage | web_fetch(urls=[...]) |
Tool signature
{
  "name": "web_fetch",
  "parameters": {
    "urls": ["https://example.com/article", "https://other.com/page"],
    "max_bytes": 65536 // optional; clamped to deployment cap
  }
}
Response shape:
{
  "results": [
    {
      "url": "https://example.com/article",
      "title": "Example article",
      "body": "First paragraph...",
      "ok": true
    },
    {
      "url": "https://internal.intranet.local/private",
      "ok": false,
      "reason": "fetch failed (host blocked, timeout, non-HTML, oversized, or transport error). Check `nexo_link_understanding_fetch_total{result}` for the bucket."
    }
  ],
  "count": 2
}
A bad URL returns a {ok: false, reason} row instead of failing
the whole call, so the agent can still consume the successful
ones. Each call accepts at most 5 URLs; longer lists are trimmed
and a warning is logged.
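As a minimal sketch of how a caller might consume this shape, the Python below splits a response into successful and failed rows. The function name `partition_results` and the sample payload are illustrative assumptions, not part of the tool; only the `results`, `ok`, `url`, `title`, `body`, and `reason` fields come from the response shape shown above.

```python
import json
from typing import Dict, List, Tuple


def partition_results(response: Dict) -> Tuple[List[Dict], List[Dict]]:
    """Split web_fetch rows into (succeeded, failed) by their `ok` flag."""
    rows = response.get("results", [])
    succeeded = [row for row in rows if row.get("ok")]
    failed = [row for row in rows if not row.get("ok")]
    return succeeded, failed


# Hypothetical payload mirroring the documented response shape.
sample = json.loads("""
{
  "results": [
    {"url": "https://example.com/article", "title": "Example article",
     "body": "First paragraph...", "ok": true},
    {"url": "https://internal.intranet.local/private", "ok": false,
     "reason": "fetch failed"}
  ],
  "count": 2
}
""")

succeeded, failed = partition_results(sample)
for row in failed:
    # Failed rows carry a reason instead of title/body.
    print(f"skipped {row['url']}: {row['reason']}")
```

Because failures arrive as ordinary rows, the caller never needs exception handling around a partially successful batch.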
Configuration
web_fetch has no dedicated config. It rides on
Link understanding:
- link_understanding.enabled: gates the tool entirely. With it false, every fetch returns {ok: false, reason: "disabled by policy"}.
- link_understanding.max_bytes: deployment-wide ceiling. The tool's max_bytes arg can lower the limit but never raise it above this ceiling.
- link_understanding.deny_hosts: host blocklist (loopback, private subnets, internal cloud metadata endpoints, plus whatever the operator added).
- link_understanding.timeout_ms: per-fetch HTTP timeout.
- link_understanding.cache_ttl_secs: cache TTL. Successful fetches are cached so a second web_fetch of the same URL inside the TTL is free.
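The max_bytes rule can be read as a simple clamp. The sketch below is an illustration only: `effective_max_bytes` is a hypothetical helper, and the fallback to the deployment ceiling when the argument is omitted is an assumption (the text above says only that the argument is optional and clamped).

```python
from typing import Optional


def effective_max_bytes(requested: Optional[int], deployment_cap: int) -> int:
    """Return the byte limit for one call.

    The per-call argument may lower the limit but never raise it above
    the deployment-wide ceiling (link_understanding.max_bytes).
    """
    if requested is None:
        # Assumption: an omitted argument falls back to the ceiling.
        return deployment_cap
    return min(requested, deployment_cap)


print(effective_max_bytes(1024, 65536))       # request below the ceiling is honoured
print(effective_max_bytes(1_000_000, 65536))  # request above the ceiling is clamped
```

How the real tool treats zero or negative values is not stated in this page, so the sketch leaves them unhandled.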
Per-binding overrides are applied via EffectiveBindingPolicy::link_understanding
(see Per-binding capability override).
Telemetry
web_fetch reuses every counter the auto-link pipeline emits.
There's no separate dashboard:
- nexo_link_understanding_fetch_total{result}: ok / blocked / timeout / non_html / too_big / error.
- nexo_link_understanding_cache_total{hit}: true / false.
- nexo_link_understanding_fetch_duration_ms: histogram, only populated when an HTTP request actually went out (cache hits and host-blocked URLs skip it so percentiles reflect real fetch work).
The bundled Grafana dashboard
(ops/grafana/nexo-llm.json)
already plots all three.
Why a per-call cap of 5 URLs
A runaway agent given the prompt "fetch every link in this 10k
RSS dump" would otherwise queue thousands of HTTP requests
synchronously, blowing the prompt budget and hammering the
target hosts. 5 covers every realistic agentic workflow
(read 3 candidates, pick the best two, summarise) while leaving
a clear ceiling. Operators who want batch behaviour should
spawn a TaskFlow that calls web_fetch
in chunks with cursor persistence.
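The chunking pattern described here might look like the following sketch. It does not use the TaskFlow API; `fetch_in_chunks`, `call_web_fetch`, and `fake_web_fetch` are hypothetical names, and persisting the cursor between batches is left to the caller.

```python
from typing import Callable, Dict, Iterator, List, Tuple

MAX_URLS_PER_CALL = 5  # per-call cap described in this section


def fetch_in_chunks(
    urls: List[str],
    call_web_fetch: Callable[[List[str]], Dict],
    cursor: int = 0,
) -> Iterator[Tuple[int, Dict]]:
    """Call web_fetch on successive slices of at most 5 URLs.

    Yields (next_cursor, response) after each batch so the caller can
    persist the cursor and resume from it after a restart.
    """
    while cursor < len(urls):
        batch = urls[cursor:cursor + MAX_URLS_PER_CALL]
        response = call_web_fetch(batch)
        cursor += len(batch)
        yield cursor, response


def fake_web_fetch(batch: List[str]) -> Dict:
    """Stand-in for the real tool call, used only for this example."""
    return {"results": [{"url": u, "ok": True} for u in batch], "count": len(batch)}


urls = [f"https://example.com/item/{i}" for i in range(12)]
cursors = [cursor for cursor, _ in fetch_in_chunks(urls, fake_web_fetch)]
print(cursors)  # [5, 10, 12]
```

Resuming is a matter of passing the last persisted cursor back in, for example `fetch_in_chunks(urls, fake_web_fetch, cursor=10)` to process only the final two URLs.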
Comparison to extensions
The fetch-url Python extension does roughly the same thing.
web_fetch differs in three ways:
- In-process: no subprocess spawn, no Python interpreter, no extension wire protocol. Sub-100ms cold path on the happy case.
- Shared cache + telemetry: links the user shares (auto-expanded by Phase 21 link-understanding) and links the agent fetches via web_fetch populate the same LRU. A second access inside the cache TTL is free.
- Same security defaults: same deny-host list, same size cap, same timeout. Operators tune one knob, two surfaces honour it.
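To illustrate the shared-cache idea, here is a small TTL-bounded LRU in Python. It is a sketch under stated assumptions, not the actual implementation: the class name `TtlLruCache`, its capacity parameter, and the eviction details are all hypothetical, chosen only to show how two callers writing to one cache let the second access skip the network.

```python
import time
from collections import OrderedDict
from typing import Any, Callable, Optional


class TtlLruCache:
    """LRU cache whose entries also expire after a fixed TTL."""

    def __init__(self, capacity: int, ttl_secs: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._capacity = capacity
        self._ttl = ttl_secs
        self._clock = clock
        self._entries: "OrderedDict[str, tuple]" = OrderedDict()  # url -> (stored_at, value)

    def get(self, url: str) -> Optional[Any]:
        entry = self._entries.get(url)
        if entry is None:
            return None
        stored_at, value = entry
        if self._clock() - stored_at >= self._ttl:
            del self._entries[url]  # expired: treat as a miss
            return None
        self._entries.move_to_end(url)  # mark as most recently used
        return value

    def put(self, url: str, value: Any) -> None:
        self._entries[url] = (self._clock(), value)
        self._entries.move_to_end(url)
        if len(self._entries) > self._capacity:
            self._entries.popitem(last=False)  # evict least recently used


shared = TtlLruCache(capacity=128, ttl_secs=600)
# The auto-link path stores a page the user shared...
shared.put("https://example.com/article", {"title": "Example article"})
# ...and a later web_fetch of the same URL is served from the cache.
hit = shared.get("https://example.com/article")
```

The injectable `clock` exists only so expiry can be exercised deterministically in tests.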
Use the extension when the built-in path is the wrong shape (custom
auth, POST-only endpoints, non-HTML responses you want raw).
Use web_fetch for the standard "give me the article" case,
which is most of them.
Implementation
The tool lives at
crates/core/src/agent/web_fetch_tool.rs::WebFetchTool and is
registered for every agent unconditionally in src/main.rs.
The per-binding link_understanding.enabled policy gates
whether the underlying fetch happens; the tool itself is always
visible in the agent's tool list so operators can write
"call web_fetch on URL X" prompts without needing a per-agent
web_fetch.enabled flag.
Source of truth for FOLLOWUPS W-2 closure.