# text-to-speech

`station__kokoro__text-to-speech` · external (needs EXECUTION_BACKEND_URL configured) · domain `kokoro` · pv-relevance `non-pv`

Synthesize speech from text using the Kokoro neural TTS model. Returns the audio inline as base64. The default voice is `af_bella` and the default format is `mp3`.

> **Note:** This tool routes through an external execution backend. If `EXECUTION_BACKEND_URL` is unset on the server, calls return JSON-RPC error `-32603 "Tool execution backend not configured"`. Tools with `backend: native` execute in-process and are always callable.
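A client may want to tell this configuration error apart from other failures. The sketch below is one way to do that, assuming the error arrives in the standard JSON-RPC `error` object with `code` and `message` fields. The helper name `is_backend_unconfigured` is hypothetical and not part of the server.

```python
from typing import Any

# Error code quoted in the note above.
BACKEND_NOT_CONFIGURED_CODE = -32603


def is_backend_unconfigured(response: dict[str, Any]) -> bool:
    """Return True if a JSON-RPC response looks like the 'backend not configured' error."""
    error = response.get("error")
    if not isinstance(error, dict):
        return False
    # -32603 is JSON-RPC's generic "Internal error" code, so the message is
    # checked as well to avoid misreading unrelated internal errors.
    message = str(error.get("message", ""))
    return (
        error.get("code") == BACKEND_NOT_CONFIGURED_CODE
        and "backend not configured" in message.lower()
    )
```

Matching on the message text is brittle if the server wording changes, so treat this as a convenience check rather than a contract.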

## Agent metadata

- `idempotent`: unknown
- `read_only`: unknown
- `expected_latency_ms`: unknown (not yet contract-tested)
- `cost_tokens_estimate`: unknown

## Input schema

- `text` *string* (required) — Text to synthesize. Max 4096 characters.
- `voice` *string* — Voice identifier (e.g. `af_bella`, `af_kore`, `af_nicole`). Call `list-voices` for the full set of 67. Default: `af_bella`.
- `model` *string* — Model name. One of: `kokoro`, `tts-1`, `tts-1-hd`. Default: `kokoro`.
- `format` *string* — Audio container. One of: `mp3`, `wav`, `opus`, `flac`, `aac`. Default: `mp3`.
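The schema above can be checked on the client before a call is sent. The following is a minimal sketch, not the server's own validation: the function name `build_arguments` is hypothetical, the length check counts Python characters (the server may count differently), and `voice` is passed through unchecked because the full voice list is only available from `list-voices`.

```python
# Allowed values and limits copied from the input schema above.
MODELS = {"kokoro", "tts-1", "tts-1-hd"}
FORMATS = {"mp3", "wav", "opus", "flac", "aac"}
MAX_TEXT_CHARS = 4096


def build_arguments(text, voice="af_bella", model="kokoro", audio_format="mp3"):
    """Return an `arguments` dict for the tool, raising ValueError on bad input."""
    if not text:
        # Assumption: an empty string does not satisfy the required `text` field.
        raise ValueError("text is required")
    if len(text) > MAX_TEXT_CHARS:
        raise ValueError(f"text exceeds {MAX_TEXT_CHARS} characters")
    if model not in MODELS:
        raise ValueError(f"unknown model: {model!r}")
    if audio_format not in FORMATS:
        raise ValueError(f"unknown format: {audio_format!r}")
    return {"text": text, "voice": voice, "model": model, "format": audio_format}
```

Calling `build_arguments("Hello")` fills in the documented defaults, so the resulting dict can be placed directly under `params.arguments`.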

## Example call

```http
POST /api/mcp
Content-Type: application/json

{
  "jsonrpc": "2.0",
  "id": 1,
  "method": "tools/call",
  "params": {
    "name": "station__kokoro__text-to-speech",
    "arguments": {
      "text": "Hello from Kokoro."
    }
  }
}
```
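The same call can be made from Python with only the standard library. This is a sketch under stated assumptions: the server base URL is supplied by the caller, the endpoint is assumed to reply with a plain JSON body, and the response layout is not documented on this page. The helper `extract_audio` assumes the result carries a `content` list with an item whose `data` field holds the base64 audio, which is how MCP audio content is commonly shaped. Adjust it to the actual response.

```python
import base64
import json
import urllib.request

TOOL_NAME = "station__kokoro__text-to-speech"


def build_request(text, request_id=1):
    """Build the JSON-RPC payload shown in the example call."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": TOOL_NAME, "arguments": {"text": text}},
    }


def extract_audio(response):
    """Decode the first base64 `data` field found in result.content.

    ASSUMPTION: the response layout is not documented here; this follows
    the usual MCP content-list shape and may need adjusting.
    """
    if "error" in response:
        raise RuntimeError(f"JSON-RPC error: {response['error']}")
    for item in response.get("result", {}).get("content", []):
        if isinstance(item, dict) and "data" in item:
            return base64.b64decode(item["data"])
    raise ValueError("no base64 audio found in response")


def synthesize(base_url, text):
    """POST the request to <base_url>/api/mcp and return raw audio bytes."""
    req = urllib.request.Request(
        base_url.rstrip("/") + "/api/mcp",
        data=json.dumps(build_request(text)).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req, timeout=60) as resp:
        return extract_audio(json.load(resp))
```

With a reachable server, the bytes returned by `synthesize(base_url, "Hello from Kokoro.")` could be written to a file opened in binary mode, using an extension that matches the requested `format`.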

## Related

- [/tools](/tools) — all 3062 tools
- [/tools/kokoro__text-to-speech](/tools/kokoro__text-to-speech) — HTML page
- [/tools/kokoro__text-to-speech/json](/tools/kokoro__text-to-speech/json) — JSON form (agent-friendly)
- [/api/mcp](/api/mcp) — endpoint
- [/AGENTS.md](/AGENTS.md) — agent guide