TTS — text_to_speech

text_to_speech is the outbound voice tool. The agent passes text; a TtsProvider synthesises audio; channel adapters that support voice playback (Telegram sendVoice, Discord attachment, Slack files.upload) deliver it as a voice message. It mirrors the inbound STT path already running on the Telegram adapter.

Source

Tool factory: extensions/tools-tts/src/index.ts (createTtsTools). Provider implementations under extensions/tools-tts/src/providers/. Wiring: packages/wiring/src/index.ts registers the tool with the provider built from config.auxiliary?.tts.

Schema

| Field | Type | Required | Description |
| --- | --- | --- | --- |
| text | string | yes | Text to synthesise. Hard cap of 4096 characters; call repeatedly for longer content. |
| voice | string | no | Voice id (provider-specific, e.g. OpenAI's alloy / nova / shimmer). Omit for provider default. |
| speed | number | no | Speed multiplier (0.25 to 4.0, default 1.0). Provider clamps out-of-range values. |

Tool metadata: toolset: 'voice', maxResultChars: 1024, capabilities: {}. The capability surface is intentionally empty — voice delivery is the channel adapter's responsibility.
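
The 4096-character cap means longer content has to be split before calling the tool. The following is a minimal caller-side sketch, not part of the tools-tts package: chunkText and clampSpeed are hypothetical helper names, and the split-at-last-space strategy is an assumption about what reads well when spoken.

```typescript
// Hypothetical helpers for illustration; not part of tools-tts.
const MAX_TTS_CHARS = 4096; // hard cap from the schema table

// Split text into pieces of at most `limit` characters, preferring to
// break at the last space before the limit so words stay intact.
function chunkText(text: string, limit: number = MAX_TTS_CHARS): string[] {
  const chunks: string[] = [];
  let rest = text.trim();
  while (rest.length > limit) {
    let cut = rest.lastIndexOf(" ", limit);
    if (cut <= 0) cut = limit; // no space available: hard split
    chunks.push(rest.slice(0, cut).trimEnd());
    rest = rest.slice(cut).trimStart();
  }
  if (rest.length > 0) chunks.push(rest);
  return chunks;
}

// Mirror the documented 0.25 to 4.0 range on the caller side.
// The provider clamps too, so this is optional.
function clampSpeed(speed: number = 1.0): number {
  return Math.min(4.0, Math.max(0.25, speed));
}
```

Each chunk can then be passed to text_to_speech as its own call, in order.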

Availability gate

text_to_speech ships as unavailable when no provider is configured. The wiring path:

for (const tool of createTtsTools({ provider: null })) tools.register(tool);

provider: null means isAvailable() returns false, the tool registry filters it out of the personality's exposed set, and the LLM never sees it as an option. To enable, configure auxiliary.tts.* in ~/.ethos/config.yaml:

auxiliary.tts.provider: openai
auxiliary.tts.apiKey: ${secrets:providers/openai/apiKey}
auxiliary.tts.model: tts-1 # tts-1 (cheap, fast) / tts-1-hd (higher quality)
auxiliary.tts.defaultVoice: alloy

Wiring then constructs a TtsProvider and re-registers the tool with isAvailable() === true.
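
The gate can be pictured with a small model. This is a sketch under stated assumptions: Tool, makeTtsTool, and ToolRegistry are simplified stand-ins invented here, and the real registry and createTtsTools signatures may differ.

```typescript
// Simplified stand-in types; the real Tool and registry shapes may differ.
interface Tool {
  name: string;
  isAvailable(): boolean;
}

// A null provider yields a tool that reports itself as unavailable.
function makeTtsTool(provider: object | null): Tool {
  return { name: "text_to_speech", isAvailable: () => provider !== null };
}

class ToolRegistry {
  private tools: Record<string, Tool> = {};

  // Registering under an existing name replaces the earlier entry.
  register(tool: Tool): void {
    this.tools[tool.name] = tool;
  }

  // Only available tools are exposed to the model.
  exposedNames(): string[] {
    return Object.keys(this.tools).filter((n) => this.tools[n].isAvailable());
  }
}

const registry = new ToolRegistry();
registry.register(makeTtsTool(null)); // no provider configured
const before = registry.exposedNames();
registry.register(makeTtsTool({})); // provider built from config
const after = registry.exposedNames();
```

In this model `before` is empty and `after` contains text_to_speech, which matches the filter-then-re-register behaviour described above.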

Provider contract

export interface TtsProvider {
  synthesize(text: string, opts?: { voice?: string; speed?: number }):
    Promise<{ audio: Buffer; format: 'mp3' | 'opus' | 'wav' }>;
  readonly name: string;
  readonly availableVoices: string[];
}

Providers return raw audio bytes plus the container format. The tool wraps the bytes in a MEDIA: envelope the channel adapter recognises.
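
Because the contract exposes availableVoices, a caller can check a voice id before invoking synthesize. The sketch below is illustrative only: checkVoice and its result shape are hypothetical, and nothing in the source says the tool performs this check itself.

```typescript
// Hypothetical pre-flight check; result shape invented for illustration.
interface VoiceCheck {
  code: "ok" | "execution_failed";
  voice?: string; // undefined means "use the provider default"
  message?: string;
}

function checkVoice(availableVoices: string[], requested?: string): VoiceCheck {
  // Omitting the voice defers to the provider default.
  if (requested === undefined) return { code: "ok" };
  if (availableVoices.indexOf(requested) !== -1) {
    return { code: "ok", voice: requested };
  }
  return {
    code: "execution_failed",
    message: `Unknown voice "${requested}". Available: ${availableVoices.join(", ")}`,
  };
}
```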

Channel-adapter outbound

Adapter behaviour at delivery time:

| Adapter | Behaviour |
| --- | --- |
| Telegram | Routes through sendVoice (.opus) or sendAudio (.mp3 / .wav). Plays inline as a voice bubble. |
| Discord | Posts as a message attachment. Discord clients auto-show inline audio for audio/* MIME types. |
| Slack | Uses files.upload with chat:write + files:write scopes. Adapter must have canSendFiles: true (configured at adapter init). |
| Email | Attached to the outgoing message as audio/mpeg / audio/ogg. |

If the active channel can't deliver audio (no canSendFiles, missing scopes), the tool returns an error rather than silently dropping. Use send_message to a target on a different adapter as a fallback.
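
The Telegram routing rule and the capability check can be sketched as two small functions. The format-to-method mapping follows the adapter table; AdapterInfo, telegramMethodFor, and deliveryError are hypothetical names, and the error string is illustrative rather than the adapter's actual message.

```typescript
type AudioFormat = "mp3" | "opus" | "wav";

// Opus goes out as a voice message; other formats go out as audio files.
function telegramMethodFor(format: AudioFormat): "sendVoice" | "sendAudio" {
  return format === "opus" ? "sendVoice" : "sendAudio";
}

// Hypothetical adapter descriptor; only canSendFiles comes from the source.
interface AdapterInfo {
  name: string;
  canSendFiles: boolean;
}

// Returns null when delivery can proceed, otherwise an error description.
// The failure is reported instead of being silently dropped.
function deliveryError(adapter: AdapterInfo): string | null {
  return adapter.canSendFiles
    ? null
    : `execution_failed: ${adapter.name} cannot carry audio; use send_message to a target on a different adapter`;
}
```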

Errors

| Code | When | Operator fix |
| --- | --- | --- |
| input_invalid | text empty or > 4096 chars | Split, summarise, or chunk |
| not_available | No provider configured | Set auxiliary.tts.* in config.yaml |
| not_available | Provider unreachable (network) | Check API key + connectivity |
| execution_failed | Provider rejected the request (e.g. voice id unknown) | Pick a voice from availableVoices |
| execution_failed | Channel adapter can't carry audio | Use a different adapter via send_message |
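
For operator tooling, the error table can be turned into a lookup. This is a hedged sketch: OPERATOR_FIXES and operatorFixes are names invented here, and since two codes cover two causes each, the lookup returns every candidate fix for a code.

```typescript
type TtsErrorCode = "input_invalid" | "not_available" | "execution_failed";

// Fix text copied from the error table; the structure is an assumption.
const OPERATOR_FIXES: Record<TtsErrorCode, string[]> = {
  input_invalid: ["Split, summarise, or chunk"],
  not_available: [
    "Set auxiliary.tts.* in config.yaml",
    "Check API key + connectivity",
  ],
  execution_failed: [
    "Pick a voice from availableVoices",
    "Use a different adapter via send_message",
  ],
};

// Unknown codes yield an empty list instead of throwing.
function operatorFixes(code: string): string[] {
  return Object.prototype.hasOwnProperty.call(OPERATOR_FIXES, code)
    ? OPERATOR_FIXES[code as TtsErrorCode]
    : [];
}
```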

Examples

Read a calendar entry out loud

A voice-bot personality with text_to_speech in its toolset:

text_to_speech({
  text: "Next meeting: standup at 10:30 with the engineering team.",
  voice: "alloy"
})

Returns a MEDIA: path that the Telegram adapter delivers as a voice bubble in the chat.

Broadcast to multiple channels

Compose with send_message:

1. text_to_speech({ text: "Deploy starting" }) → MEDIA: path
2. send_message({ platform: "telegram", target: "-1001234567890", body: "<MEDIA:...>" })
3. send_message({ platform: "slack", target: "C0DEPLOY", body: "<MEDIA:...>" })

Each adapter delivers it according to its own outbound-files capability.
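
The fan-out step can be sketched as a pure function that builds one send_message argument object per target. The platform, target, and body field names come from the steps above; buildBroadcast and BroadcastTarget are hypothetical, and the media reference is treated as an opaque string returned by text_to_speech.

```typescript
// Argument shape taken from the send_message calls in the steps above.
interface SendMessageArgs {
  platform: string;
  target: string;
  body: string;
}

// Hypothetical helper type for illustration.
interface BroadcastTarget {
  platform: string;
  target: string;
}

// Reuse a single synthesised clip across every target, so
// text_to_speech is called once and send_message once per channel.
function buildBroadcast(mediaRef: string, targets: BroadcastTarget[]): SendMessageArgs[] {
  return targets.map((t) => ({ platform: t.platform, target: t.target, body: mediaRef }));
}

const calls = buildBroadcast("media-ref-from-text_to_speech", [
  { platform: "telegram", target: "-1001234567890" },
  { platform: "slack", target: "C0DEPLOY" },
]);
```

Synthesising once and reusing the reference avoids paying the provider for the same audio twice.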

Slow it down for accessibility

text_to_speech({
  text: "Please review the document and confirm your selection.",
  voice: "nova",
  speed: 0.85
})

See also