Speech

Your agent can accept voice input and reply with audio. Both can be enabled from Model Settings.

Speech Settings

Open “Model Settings” on the Agents page and look for the Speech section.

Enable Speech to Text to let users send voice messages instead of typing.

How it works:

No additional configuration is needed. The feature works as soon as it is enabled.

Enable Text to Speech to have the agent’s replies played as audio.

Common settings:

OpenAI provides the gpt-4o-mini-tts model with 4 built-in voices:

How to use:

Speech generation uses Chatolia credits.

ElevenLabs offers two models:

ElevenLabs comes with all default voices available for selection.

How to use:

Speech generation uses Chatolia credits unless you provide your own API key.

If you want to use your own ElevenLabs account:

Benefits of using your own key:

Speech to Text uses Whisper for transcription.
OpenAI Text to Speech uses Chatolia credits.
ElevenLabs uses Chatolia credits unless you provide your own API key.
When using your own ElevenLabs key, speech generation is billed directly to your ElevenLabs account.
Public agent pages respect your speech settings.
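
The billing rules in the notes above can be summarized as a small decision function. This is only an illustrative sketch: the function name, provider strings, and return values are hypothetical and are not part of any Chatolia API.

```python
from typing import Optional


def billed_to(provider: str, own_api_key: Optional[str] = None) -> str:
    """Return who pays for text-to-speech generation.

    Hypothetical helper that mirrors the notes above; the names and
    return values are assumptions made for illustration.
    """
    provider = provider.lower()
    if provider == "openai":
        # OpenAI Text to Speech always uses Chatolia credits.
        return "chatolia_credits"
    if provider == "elevenlabs":
        # ElevenLabs uses Chatolia credits unless you supply your own key,
        # in which case usage is billed to your ElevenLabs account.
        return "elevenlabs_account" if own_api_key else "chatolia_credits"
    raise ValueError(f"unknown provider: {provider}")
```

For example, `billed_to("elevenlabs")` yields `"chatolia_credits"`, while passing a non-empty `own_api_key` yields `"elevenlabs_account"`.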