Telegram

Bot API channel with long-polling intake, multi-bot routing, full send/reply/reaction/edit/location/media tool surface, and optional voice auto-transcription.

Source: crates/plugins/telegram/.

Topics

| Direction | Subject | Notes |
|-----------|---------|-------|
| Inbound | plugin.inbound.telegram | Legacy single-bot |
| Inbound | plugin.inbound.telegram.<instance> | Per-bot routing |
| Outbound | plugin.outbound.telegram | Legacy single-bot |
| Outbound | plugin.outbound.telegram.<instance> | Per-bot routing |

Each instance subscribes only to its own outbound topic, so two bots in the same process don't cross-wire.
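
The naming rule in the table can be sketched as a small helper. This is only an illustration of the subject layout; the function name `topic` is hypothetical and is not part of the plugin.

```python
from typing import Optional


def topic(direction: str, instance: Optional[str] = None) -> str:
    """Build a Telegram plugin subject for "inbound" or "outbound" traffic.

    Hypothetical helper: it only mirrors the naming scheme in the table.
    """
    if direction not in ("inbound", "outbound"):
        raise ValueError(f"unknown direction: {direction!r}")
    base = f"plugin.{direction}.telegram"
    # No instance label means the legacy bare topic.
    return f"{base}.{instance}" if instance else base


print(topic("outbound"))
print(topic("inbound", "sales_bot"))
```

Because the instance label is the last segment of the subject, two bots with different labels never share an outbound subject.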

Config

# config/plugins/telegram.yaml
telegram:
  token: ${file:./secrets/telegram_token.txt}
  instance: sales_bot
  polling:
    enabled: true
    interval_ms: 25000
    offset_path: ./data/media/telegram/sales_bot.offset
  allowlist:
    chat_ids: []        # empty = accept all
  auto_transcribe:
    enabled: false
    command: ./extensions/openai-whisper/target/release/openai-whisper
    language: es
  bridge_timeout_ms: 120000

Key fields:

| Field | Default | Purpose |
|-------|---------|---------|
| token | none (required) | Bot API token from @BotFather. |
| instance | None | Label for multi-bot routing. Unlabelled keeps the legacy bare topic. |
| allow_agents | [] | Agents permitted to publish from this bot. Empty = accept any agent holding a resolver handle. Defense-in-depth for the per-agent credentials binding. |
| polling.enabled | true | Long-polling intake. Webhook not yet supported. |
| polling.interval_ms | 25000 | Long-poll timeout hint. Telegram clamps to [1 s, 50 s]. |
| polling.offset_path | ./data/media/telegram/offset | File to persist update offset across restarts. |
| allowlist.chat_ids | [] | Numeric chat ids allowed. Empty = accept all. |
| auto_transcribe.enabled | false | Voice → text. |
| auto_transcribe.command | ./extensions/openai-whisper/.../openai-whisper | Path to whisper binary. |
| bridge_timeout_ms | 120000 | Handler deadline before a bridge_timeout event fires. |
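
The "empty = accept all" rule for allowlist.chat_ids is easy to get backwards, so here is a minimal sketch of that semantics. The function `chat_allowed` is a hypothetical name used for illustration; the plugin's actual check may be structured differently.

```python
from typing import Iterable


def chat_allowed(chat_id: int, allowlist: Iterable[int]) -> bool:
    """Return True if chat_id passes the allowlist.

    Hypothetical helper: an empty allowlist accepts every chat,
    a non-empty one accepts only the listed numeric ids.
    """
    allowed = set(allowlist)
    return not allowed or chat_id in allowed


print(chat_allowed(12345, []))
print(chat_allowed(12345, [-1001234567890]))
```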

Auth

Single mode: static bot token. No OAuth. Store the token under ./secrets/ and reference it via ${file:...}.

flowchart LR
    SETUP[agent setup] --> ASK[ask for bot token]
    ASK --> F[./secrets/telegram_token.txt]
    F -.->|${file:...}| CFG[config/plugins/telegram.yaml]
    CFG --> RUN[runtime: HTTP Bot API with long-poll]
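
To make the ${file:...} indirection concrete, the sketch below shows one way such a placeholder could be expanded. It is an assumption about the behaviour, not the runtime's actual resolver: the name `resolve_file_refs`, the `base_dir` parameter, and the trimming of trailing whitespace are all hypothetical.

```python
import re
from pathlib import Path

# Matches ${file:<path>} and captures the path.
_FILE_REF = re.compile(r"\$\{file:([^}]+)\}")


def resolve_file_refs(value: str, base_dir: Path = Path(".")) -> str:
    """Replace each ${file:path} placeholder with that file's contents.

    Hypothetical helper: paths are resolved relative to base_dir and
    surrounding whitespace (such as a trailing newline) is stripped.
    """

    def _read(match: "re.Match[str]") -> str:
        return (base_dir / match.group(1)).read_text(encoding="utf-8").strip()

    return _FILE_REF.sub(_read, value)
```

With this approach the token never appears in config/plugins/telegram.yaml itself, only the path to the file that holds it.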

Tools exposed to the LLM

| Tool | Notes |
|------|-------|
| telegram_send_message | Send text to chat id (negative for groups/channels). |
| telegram_send_reply | Quote a specific prior message. |
| telegram_send_reaction | Emoji on a message. |
| telegram_edit_message | Modify a prior message's text. |
| telegram_send_location | GPS coordinates. |
| telegram_send_media | File upload with caption and mime hint. |

All tools enforce outbound_allowlist.telegram per binding.

Event shapes

// message
{
  "kind": "message",
  "from": "12345",
  "chat": "12345",
  "chat_type": "private",
  "text": "hi",
  "reply_to": null,
  "is_group": false,
  "timestamp": 1714000000,
  "msg_id": "42",
  "username": "jdoe",
  "media": [],
  "latitude": null,
  "longitude": null,
  "forward": null
}

// media item (inside `media`)
{
  "kind": "voice" | "photo" | "video" | "document" | "audio",
  "local_path": "./data/media/telegram/....ogg",
  "file_id": "AgACAgEA...",
  "mime_type": "audio/ogg",
  "duration_s": 4,
  "width": null,
  "height": null,
  "file_name": null
}

// callback_query  (inline-keyboard button press, auto-ACKed)
{"kind": "callback_query", "from": "...", "chat": "...", "data": "buy"}

// chat_membership
{"kind": "chat_membership", "chat": "...", "status": "added" | "kicked" | ...}

// lifecycle
{"kind": "connected" | "disconnected"}
{"kind": "bridge_timeout", "msg_id": "...", "waited_ms": ...}
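
A consumer of the inbound topic will typically branch on the kind field. The following is a minimal sketch of such a dispatcher, using only the fields shown in the shapes above; the function name `summarize` and the summary strings are hypothetical.

```python
import json


def summarize(raw: str) -> str:
    """Turn one inbound event (a JSON string) into a one-line summary."""
    event = json.loads(raw)
    kind = event.get("kind")
    if kind == "message":
        media = event.get("media") or []
        text = event.get("text") or ""
        return (
            f"message from {event['from']} in {event['chat']}: "
            f"{text!r} ({len(media)} media)"
        )
    if kind == "callback_query":
        return f"button {event['data']!r} pressed by {event['from']}"
    if kind == "chat_membership":
        return f"membership in {event['chat']}: {event['status']}"
    if kind in ("connected", "disconnected"):
        return f"bot {kind}"
    if kind == "bridge_timeout":
        return f"handler for {event['msg_id']} timed out after {event['waited_ms']} ms"
    # Unknown kinds are reported rather than raised, so new event
    # types do not crash the consumer.
    return f"unhandled kind: {kind!r}"


print(summarize('{"kind": "callback_query", "from": "1", "chat": "1", "data": "buy"}'))
```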

Forwarded messages include a forward object:

"forward": {
  "source": "user" | "channel" | "chat",
  "from_user_id": 12345,
  "from_chat_id": null,
  "date": 1714000000
}
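
As a sketch of how the forward object might be read, the helper below returns a short origin label for a message event. It assumes that from_user_id is set when source is "user" and from_chat_id is set otherwise; the example above shows only the "user" case, so the second half is an assumption. The name `forward_origin` is hypothetical.

```python
from typing import Optional


def forward_origin(event: dict) -> Optional[str]:
    """Return "source:id" for a forwarded message, or None if not forwarded."""
    fwd = event.get("forward")
    if not fwd:
        return None
    if fwd["source"] == "user":
        return f"user:{fwd['from_user_id']}"
    # Assumption: channel and chat forwards carry from_chat_id instead.
    return f"{fwd['source']}:{fwd['from_chat_id']}"


print(forward_origin({"kind": "message", "forward": None}))
```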

Gotchas

  • Webhook mode is not supported yet. Long-polling only.
  • polling.interval_ms is clamped by Telegram. Values outside [1000, 50000] are clamped server-side; the default of 25000 is a good middle ground.
  • Negative chat ids are groups/channels. Telegram uses negative ids for groups and channels and positive ids for private chats. Don't strip the sign.
  • Auto-transcribe requires the whisper skill extension. The command path must point at a working binary; otherwise inbound voice messages arrive without text.
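
Two of these gotchas can be illustrated in a few lines. The sketch below mirrors the clamp locally (the actual clamping happens on Telegram's side, per the note above) and shows a sign-preserving group check on the string chat ids used in events. Both function names are hypothetical.

```python
def clamp_interval_ms(interval_ms: int) -> int:
    """Mirror the documented [1000, 50000] ms range for polling.interval_ms."""
    return max(1000, min(50000, interval_ms))


def is_group_chat(chat_id: str) -> bool:
    """Groups and channels have negative ids; the sign must be kept intact."""
    return int(chat_id) < 0


print(clamp_interval_ms(60000))
print(is_group_chat("-1001234567890"))
```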