Speech Synthesis
Convert text to natural-sounding speech using TTS models.
Endpoint
POST /v1/audio/speech
Authentication
Authorization: Bearer gw_prod_...
Request
Headers
| Header | Required | Description |
|---|---|---|
| Authorization | Yes | Bearer token with API key |
| Content-Type | Yes | application/json |
Body Parameters
| Parameter | Type | Required | Description |
|---|---|---|---|
| model | string | Yes | Model ID (see Supported Models) |
| input | string | Yes | Text to synthesize (max 4096 chars) |
| voice | string | Yes | Voice ID or standard voice name (see Standard Voices) |
| response_format | string | No | Output audio format: mp3, opus, aac, flac, wav, or pcm |
| speed | number | No | Speed multiplier (0.25-4.0, default 1.0) |
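The limits in the table above can be checked on the client before a request is sent. The following is a minimal sketch, not part of any GateFlow SDK: the helper name `build_speech_payload` is hypothetical, and it assumes the 4096-character limit is counted the same way as Python's `len()`.

```python
ALLOWED_FORMATS = {"mp3", "opus", "aac", "flac", "wav", "pcm"}
MAX_INPUT_CHARS = 4096
MIN_SPEED, MAX_SPEED = 0.25, 4.0


def build_speech_payload(model, input_text, voice, response_format=None, speed=None):
    """Return a JSON-ready request body; raise ValueError on invalid values.

    Hypothetical helper: it mirrors the documented limits locally so that
    obvious mistakes fail before a network round trip.
    """
    if not model or not voice:
        raise ValueError("model and voice are required")
    if not input_text:
        raise ValueError("input is required")
    # Assumption: the gateway counts characters the way len() does.
    if len(input_text) > MAX_INPUT_CHARS:
        raise ValueError(f"input exceeds {MAX_INPUT_CHARS} characters")

    payload = {"model": model, "input": input_text, "voice": voice}

    # Optional parameters are only included when the caller sets them.
    if response_format is not None:
        if response_format not in ALLOWED_FORMATS:
            raise ValueError(f"unsupported response_format: {response_format}")
        payload["response_format"] = response_format
    if speed is not None:
        if not MIN_SPEED <= speed <= MAX_SPEED:
            raise ValueError(f"speed must be between {MIN_SPEED} and {MAX_SPEED}")
        payload["speed"] = speed
    return payload
```

The returned dictionary can be passed as the JSON body of a `POST /v1/audio/speech` request.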
Examples
Basic Speech
bash
curl -X POST https://api.gateflow.ai/v1/audio/speech \
  -H "Authorization: Bearer gw_prod_..." \
  -H "Content-Type: application/json" \
  -d '{
    "model": "tts-1",
    "input": "Hello, welcome to GateFlow!",
    "voice": "alloy"
  }' --output speech.mp3
Python
python
import openai

client = openai.OpenAI(
    base_url="https://api.gateflow.ai/v1",
    api_key="gw_prod_..."
)

response = client.audio.speech.create(
    model="tts-1",
    input="Hello, welcome to GateFlow!",
    voice="alloy"
)
# Write the returned audio bytes to disk
response.stream_to_file("speech.mp3")
High Quality
python
response = client.audio.speech.create(
    model="tts-1-hd",
    input="Hello, welcome to GateFlow!",
    voice="nova"
)
ElevenLabs Premium
python
response = client.audio.speech.create(
    model="eleven_multilingual_v2",
    input="Hello, welcome to GateFlow!",
    voice="rachel"
)
Using Standard Voices
python
# Standard voice maps to provider-specific voice
response = client.audio.speech.create(
    model="eleven_turbo_v2_5",
    input="How can I help you today?",
    voice="friendly"  # Maps to "rachel" on ElevenLabs
)
Streaming Speech
python
import asyncio
import httpx


async def stream_speech():
    async with httpx.AsyncClient() as http_client:
        async with http_client.stream(
            "POST",
            "https://api.gateflow.ai/v1/audio/speech",
            headers={
                "Authorization": "Bearer gw_prod_...",
                "Content-Type": "application/json"
            },
            json={
                "model": "eleven_turbo_v2_5",
                "input": "This is streaming speech synthesis.",
                "voice": "friendly",
                "stream": True
            }
        ) as response:
            # Fail early on 4xx/5xx instead of playing an error body
            response.raise_for_status()
            async for chunk in response.aiter_bytes():
                # play_audio_chunk is a placeholder for your own playback function
                play_audio_chunk(chunk)


asyncio.run(stream_speech())
Custom Voice Settings
python
response = client.audio.speech.create(
    model="eleven_multilingual_v2",
    input="Hello!",
    voice="friendly",
    extra_body={
        "gateflow": {
            "voice_settings": {
                "stability": 0.5,
                "similarity_boost": 0.75,
                "style": 0.5,
                "speed": 1.0
            }
        }
    }
)
Response
The response body is the audio data, encoded in the requested format.
Headers
| Header | Description |
|---|---|
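| Content-Type | MIME type of the audio, e.g. audio/mpeg or audio/opus |
| Content-Length | Size of the audio body in bytes |
| X-GateFlow-Provider | Provider that served the request |
| X-GateFlow-Model | Model that served the request |
| X-GateFlow-Latency-Ms | Processing time in milliseconds |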
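The response headers above can be collected into a small dictionary for logging or metrics. This is a sketch under assumptions: the helper name `read_gateflow_metadata` is hypothetical, and it assumes `X-GateFlow-Latency-Ms` carries a plain numeric value.

```python
def read_gateflow_metadata(headers):
    """Extract GateFlow metadata from a mapping of response headers.

    Hypothetical helper; accepts any mapping with an items() method.
    """
    # HTTP header names are case-insensitive, so normalize before lookup.
    lowered = {name.lower(): value for name, value in headers.items()}
    latency = lowered.get("x-gateflow-latency-ms")
    return {
        "content_type": lowered.get("content-type"),
        "provider": lowered.get("x-gateflow-provider"),
        "model": lowered.get("x-gateflow-model"),
        # Assumption: the latency header holds a number such as "182".
        "latency_ms": float(latency) if latency is not None else None,
    }
```

With `httpx`, the `response.headers` object can be passed to this function directly, since it behaves like a mapping.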
Supported Models
OpenAI
| Model | Quality | Latency | Cost |
|---|---|---|---|
| tts-1 | Good | Low | $15/1M chars |
| tts-1-hd | High | Medium | $30/1M chars |
ElevenLabs
| Model | Quality | Latency | Cost |
|---|---|---|---|
| eleven_multilingual_v2 | Excellent | Medium | $24/1M chars |
| eleven_turbo_v2_5 | Good | Low | $12/1M chars |
| eleven_flash_v2_5 | Good | Very Low | $8/1M chars |
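The per-character prices in the tables above translate into a rough cost estimate as follows. This is an illustrative sketch: the function name `estimate_cost_usd` is hypothetical, and it assumes billing is strictly proportional to `len(text)` with no minimum charge or rounding.

```python
# Prices in USD per one million characters, copied from the tables above.
PRICE_PER_MILLION_CHARS = {
    "tts-1": 15.0,
    "tts-1-hd": 30.0,
    "eleven_multilingual_v2": 24.0,
    "eleven_turbo_v2_5": 12.0,
    "eleven_flash_v2_5": 8.0,
}


def estimate_cost_usd(model, text):
    """Estimate the synthesis cost of `text` on `model`, in USD.

    Assumption: cost scales linearly with the character count.
    """
    if model not in PRICE_PER_MILLION_CHARS:
        raise KeyError(f"no listed price for model: {model}")
    return len(text) * PRICE_PER_MILLION_CHARS[model] / 1_000_000
```

Under these assumptions, a 1,000-character input costs about $0.015 on `tts-1` and about $0.03 on `tts-1-hd`.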
Standard Voices
| Voice ID | Persona | OpenAI | ElevenLabs |
|---|---|---|---|
| professional | Business | onyx | josh |
| friendly | Conversational | alloy | rachel |
| calm | Soothing | nova | bella |
| energetic | Upbeat | shimmer | antoni |
| serious | Formal | fable | adam |
| casual | Relaxed | echo | domi |
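The gateway resolves standard voices on the server side, so no client code is needed for the mapping. The sketch below only mirrors the table locally, which can be handy for logging or tests. The function name `resolve_voice` and the provider keys `"openai"` and `"elevenlabs"` are hypothetical.

```python
# Local copy of the standard voice table above.
STANDARD_VOICES = {
    "professional": {"openai": "onyx", "elevenlabs": "josh"},
    "friendly": {"openai": "alloy", "elevenlabs": "rachel"},
    "calm": {"openai": "nova", "elevenlabs": "bella"},
    "energetic": {"openai": "shimmer", "elevenlabs": "antoni"},
    "serious": {"openai": "fable", "elevenlabs": "adam"},
    "casual": {"openai": "echo", "elevenlabs": "domi"},
}


def resolve_voice(voice, provider):
    """Return the provider-specific voice for a standard voice name.

    Names that are not standard voices are returned unchanged, on the
    assumption that they are already provider voice IDs.
    """
    mapping = STANDARD_VOICES.get(voice)
    if mapping is None:
        return voice
    return mapping[provider]
```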
GateFlow Extensions
Provider Selection
python
response = client.audio.speech.create(
    model="auto",
    input="Hello!",
    voice="friendly",
    extra_body={
        "gateflow": {
            "prefer": "quality"  # or "speed" or "cost"
        }
    }
)
Fallbacks
python
response = client.audio.speech.create(
    model="eleven_multilingual_v2",
    input="Hello!",
    voice="friendly",
    extra_body={
        "gateflow": {
            "fallbacks": ["tts-1-hd", "tts-1"]
        }
    }
)
Audio Formats
| Format | MIME Type | Use Case |
|---|---|---|
| mp3 | audio/mpeg | General use, good compression |
| opus | audio/opus | Streaming, low latency |
| aac | audio/aac | Apple devices |
| flac | audio/flac | Lossless, archival |
| wav | audio/wav | Uncompressed, editing |
| pcm | audio/pcm | Raw audio, processing |
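The format table above can be used to name output files and to sanity-check the `Content-Type` of a response. This is a minimal sketch: both helper names are hypothetical, and using the format name as the file extension is a convention assumed here, not something the API requires.

```python
# Format name to MIME type, copied from the table above.
MIME_TYPES = {
    "mp3": "audio/mpeg",
    "opus": "audio/opus",
    "aac": "audio/aac",
    "flac": "audio/flac",
    "wav": "audio/wav",
    "pcm": "audio/pcm",
}


def output_filename(stem, response_format):
    """Build a file name such as 'speech.flac' for the given format."""
    if response_format not in MIME_TYPES:
        raise ValueError(f"unsupported format: {response_format}")
    # Assumption: the format name doubles as the file extension.
    return f"{stem}.{response_format}"


def matches_format(content_type, response_format):
    """Check whether a Content-Type header matches the requested format."""
    # Drop parameters such as "; codecs=opus" and compare case-insensitively.
    media_type = content_type.split(";", 1)[0].strip().lower()
    return media_type == MIME_TYPES[response_format]
```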
Errors
| Code | Description |
|---|---|
| 400 | Invalid parameters or text too long |
| 401 | Invalid API key |
| 429 | Rate limit exceeded |
| 500 | Provider error |
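One way to act on these codes is to retry transient failures and surface the rest immediately. The sketch below is an assumption about client policy, not documented GateFlow behavior: it treats 429 and 500 as retryable and 400 and 401 as permanent, and all names (`is_retryable`, `backoff_delays`, `call_with_retries`) are hypothetical. The `send` argument is any callable that performs the request and returns a `(status_code, body)` tuple.

```python
import time

# Assumption: rate limits and provider errors may succeed on a later attempt.
RETRYABLE_STATUS = {429, 500}


def is_retryable(status_code):
    """Return True for status codes worth retrying under this policy."""
    return status_code in RETRYABLE_STATUS


def backoff_delays(max_attempts, base=0.5, cap=8.0):
    """Exponential backoff delays, in seconds, between consecutive attempts."""
    return [min(cap, base * 2 ** i) for i in range(max_attempts - 1)]


def call_with_retries(send, max_attempts=3, sleep=time.sleep):
    """Call send() until it returns a non-retryable status or attempts run out.

    Returns the last (status_code, body) tuple produced by send().
    """
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    delays = backoff_delays(max_attempts)
    for attempt in range(max_attempts):
        status, body = send()
        last_attempt = attempt == max_attempts - 1
        if not is_retryable(status) or last_attempt:
            return status, body
        sleep(delays[attempt])
```

Passing `sleep` as a parameter keeps the helper easy to test without real delays. A 400 caused by text that is too long will not succeed on retry, so the input should be shortened or split instead.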
See Also
- Transcriptions - Speech to text
- Voices - List available voices
- Voice Mapping - Voice configuration