
Speech Synthesis

Convert text to natural-sounding speech using TTS models.

Endpoint

POST /v1/audio/speech

Authentication

Authorization: Bearer gw_prod_...

Request

Headers

| Header | Required | Description |
| --- | --- | --- |
| Authorization | Yes | Bearer token with API key |
| Content-Type | Yes | application/json |

Body Parameters

| Parameter | Type | Required | Description |
| --- | --- | --- | --- |
| model | string | Yes | Model ID (see supported models) |
| input | string | Yes | Text to synthesize (max 4096 chars) |
| voice | string | Yes | Voice ID or standard voice name |
| response_format | string | No | mp3, opus, aac, flac, wav, pcm |
| speed | number | No | Speed multiplier (0.25-4.0, default 1.0) |
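The limits in the table above can be checked on the client before a request is sent. The following is a minimal sketch, not part of any GateFlow SDK: `build_speech_payload` is a hypothetical helper, rejecting empty input is an assumption, and `len()` counts Unicode code points, which may differ from how the API counts characters.

```python
# Limits taken from the Body Parameters table.
ALLOWED_FORMATS = {"mp3", "opus", "aac", "flac", "wav", "pcm"}
MAX_INPUT_CHARS = 4096
MIN_SPEED, MAX_SPEED = 0.25, 4.0


def build_speech_payload(model, input_text, voice, response_format=None, speed=None):
    """Return a JSON-serializable request body, raising ValueError on bad input.

    Hypothetical helper for illustration; the API performs its own validation.
    """
    if not model or not voice:
        raise ValueError("model and voice are required")
    if not input_text:
        # Assumption: an empty string is treated as invalid input.
        raise ValueError("input must not be empty")
    if len(input_text) > MAX_INPUT_CHARS:
        raise ValueError(f"input is {len(input_text)} chars; max is {MAX_INPUT_CHARS}")

    payload = {"model": model, "input": input_text, "voice": voice}

    # Optional parameters are only included when the caller sets them,
    # so the server-side defaults apply otherwise.
    if response_format is not None:
        if response_format not in ALLOWED_FORMATS:
            raise ValueError(f"unsupported response_format: {response_format!r}")
        payload["response_format"] = response_format

    if speed is not None:
        if not MIN_SPEED <= speed <= MAX_SPEED:
            raise ValueError(f"speed must be between {MIN_SPEED} and {MAX_SPEED}")
        payload["speed"] = speed

    return payload
```

The returned dictionary can be passed as the JSON body of a `POST /v1/audio/speech` request.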

Examples

Basic Speech

bash
curl -X POST https://api.gateflow.ai/v1/audio/speech \
  -H "Authorization: Bearer gw_prod_..." \
  -H "Content-Type: application/json" \
  -d '{
    "model": "tts-1",
    "input": "Hello, welcome to GateFlow!",
    "voice": "alloy"
  }' --output speech.mp3

Python

python
import openai

client = openai.OpenAI(
    base_url="https://api.gateflow.ai/v1",
    api_key="gw_prod_..."
)

response = client.audio.speech.create(
    model="tts-1",
    input="Hello, welcome to GateFlow!",
    voice="alloy"
)

# write_to_file saves the full audio body; stream_to_file is deprecated in the OpenAI SDK
response.write_to_file("speech.mp3")

High Quality

python
response = client.audio.speech.create(
    model="tts-1-hd",
    input="Hello, welcome to GateFlow!",
    voice="nova"
)

ElevenLabs Premium

python
response = client.audio.speech.create(
    model="eleven_multilingual_v2",
    input="Hello, welcome to GateFlow!",
    voice="rachel"
)

Using Standard Voices

python
# Standard voice maps to provider-specific voice
response = client.audio.speech.create(
    model="eleven_turbo_v2_5",
    input="How can I help you today?",
    voice="friendly"  # Maps to "rachel" on ElevenLabs
)

Streaming Speech

python
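import asyncio

import httpx


def play_audio_chunk(chunk: bytes) -> None:
    """Placeholder: hand each chunk to your audio player or buffer."""
    ...


async def main() -> None:
    async with httpx.AsyncClient() as http_client:
        async with http_client.stream(
            "POST",
            "https://api.gateflow.ai/v1/audio/speech",
            headers={
                "Authorization": "Bearer gw_prod_...",
                "Content-Type": "application/json"
            },
            json={
                "model": "eleven_turbo_v2_5",
                "input": "This is streaming speech synthesis.",
                "voice": "friendly",
                "stream": True
            }
        ) as response:
            # Fail fast on 4xx/5xx instead of playing an error body
            response.raise_for_status()
            async for chunk in response.aiter_bytes():
                play_audio_chunk(chunk)


asyncio.run(main())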

Custom Voice Settings

python
response = client.audio.speech.create(
    model="eleven_multilingual_v2",
    input="Hello!",
    voice="friendly",
    extra_body={
        "gateflow": {
            "voice_settings": {
                "stability": 0.5,
                "similarity_boost": 0.75,
                "style": 0.5,
                "speed": 1.0
            }
        }
    }
)

Response

The response is an audio file in the requested format.

Headers

| Header | Description |
| --- | --- |
| Content-Type | audio/mpeg, audio/opus, etc. |
| Content-Length | Size in bytes |
| X-GateFlow-Provider | Provider used |
| X-GateFlow-Model | Model used |
| X-GateFlow-Latency-Ms | Processing time |
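The routing headers above are useful for logging which provider and model actually served a request. A minimal sketch of reading them follows; `read_gateflow_metadata` is a hypothetical helper, and it assumes `X-GateFlow-Latency-Ms` carries a plain numeric string, which the table does not confirm.

```python
from typing import Mapping, Optional


def read_gateflow_metadata(headers: Mapping[str, str]) -> dict:
    """Collect routing metadata from response headers.

    Header names are matched case-insensitively, since HTTP header
    names are not case-sensitive.
    """
    lowered = {name.lower(): value for name, value in headers.items()}

    latency_ms: Optional[float] = None
    raw_latency = lowered.get("x-gateflow-latency-ms")
    if raw_latency is not None:
        try:
            # Assumption: the value is a number of milliseconds.
            latency_ms = float(raw_latency)
        except ValueError:
            latency_ms = None

    return {
        "content_type": lowered.get("content-type"),
        "provider": lowered.get("x-gateflow-provider"),
        "model": lowered.get("x-gateflow-model"),
        "latency_ms": latency_ms,
    }
```

With `httpx`, the `response.headers` object can be passed to this function directly.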

Supported Models

OpenAI

| Model | Quality | Latency | Cost |
| --- | --- | --- | --- |
| tts-1 | Good | Low | $15/1M chars |
| tts-1-hd | High | Medium | $30/1M chars |

ElevenLabs

| Model | Quality | Latency | Cost |
| --- | --- | --- | --- |
| eleven_multilingual_v2 | Excellent | Medium | $24/1M chars |
| eleven_turbo_v2_5 | Good | Low | $12/1M chars |
| eleven_flash_v2_5 | Good | Very Low | $8/1M chars |
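Because pricing is quoted per million characters, the cost of a request can be estimated from the input length. The sketch below copies the prices from the tables above; `estimate_cost_usd` is a hypothetical helper, and it assumes billing follows Python's `len()` character count, so actual invoices may differ.

```python
# USD per 1M input characters, copied from the Supported Models tables.
PRICE_PER_MILLION_CHARS_USD = {
    "tts-1": 15.0,
    "tts-1-hd": 30.0,
    "eleven_multilingual_v2": 24.0,
    "eleven_turbo_v2_5": 12.0,
    "eleven_flash_v2_5": 8.0,
}


def estimate_cost_usd(model: str, text: str) -> float:
    """Estimate the cost of synthesizing `text` with `model`.

    Raises KeyError for models that are not in the price table.
    """
    price = PRICE_PER_MILLION_CHARS_USD[model]
    return len(text) * price / 1_000_000


# A full 4096-character request on tts-1 costs roughly six cents.
print(f"{estimate_cost_usd('tts-1', 'x' * 4096):.5f}")
```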

Standard Voices

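| Voice ID | Persona | OpenAI | ElevenLabs |
| --- | --- | --- | --- |
| professional | Business | onyx | josh |
| friendly | Conversational | alloy | rachel |
| calm | Soothing | nova | bella |
| energetic | Upbeat | shimmer | antoni |
| serious | Formal | fable | adam |
| casual | Relaxed | echo | domi |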
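The table above can be read as a lookup from a standard voice and a provider to a provider-specific voice. The sketch below mirrors that table for illustration only: the gateway performs this mapping itself, and `resolve_voice` along with the provider keys `"openai"` and `"elevenlabs"` are hypothetical names.

```python
# Standard voice -> provider-specific voice, copied from the table above.
STANDARD_VOICES = {
    "professional": {"openai": "onyx", "elevenlabs": "josh"},
    "friendly": {"openai": "alloy", "elevenlabs": "rachel"},
    "calm": {"openai": "nova", "elevenlabs": "bella"},
    "energetic": {"openai": "shimmer", "elevenlabs": "antoni"},
    "serious": {"openai": "fable", "elevenlabs": "adam"},
    "casual": {"openai": "echo", "elevenlabs": "domi"},
}


def resolve_voice(voice: str, provider: str) -> str:
    """Return the provider-specific voice for a standard voice name.

    Names that are not standard voices are assumed to already be
    provider-specific and are returned unchanged.
    """
    mapping = STANDARD_VOICES.get(voice)
    if mapping is None:
        return voice
    return mapping[provider]
```

For example, `resolve_voice("friendly", "elevenlabs")` returns `"rachel"`, matching the comment in the Using Standard Voices example.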

GateFlow Extensions

Provider Selection

python
response = client.audio.speech.create(
    model="auto",
    input="Hello!",
    voice="friendly",
    extra_body={
        "gateflow": {
            "prefer": "quality"  # or "speed" or "cost"
        }
    }
)

Fallbacks

python
response = client.audio.speech.create(
    model="eleven_multilingual_v2",
    input="Hello!",
    voice="friendly",
    extra_body={
        "gateflow": {
            "fallbacks": ["tts-1-hd", "tts-1"]
        }
    }
)

Audio Formats

| Format | MIME Type | Use Case |
| --- | --- | --- |
| mp3 | audio/mpeg | General use, good compression |
| opus | audio/opus | Streaming, low latency |
| aac | audio/aac | Apple devices |
| flac | audio/flac | Lossless, archival |
| wav | audio/wav | Uncompressed, editing |
| pcm | audio/pcm | Raw audio, processing |
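When saving a response, the `Content-Type` header can be mapped back to a format name to pick a file extension. This is a minimal sketch built from the table above; `format_for_content_type` is a hypothetical helper, and stripping parameters such as `; codecs=...` is a defensive assumption rather than documented behavior.

```python
from typing import Optional

# Format -> MIME type, copied from the Audio Formats table.
FORMAT_TO_MIME = {
    "mp3": "audio/mpeg",
    "opus": "audio/opus",
    "aac": "audio/aac",
    "flac": "audio/flac",
    "wav": "audio/wav",
    "pcm": "audio/pcm",
}
MIME_TO_FORMAT = {mime: fmt for fmt, mime in FORMAT_TO_MIME.items()}


def format_for_content_type(content_type: str) -> Optional[str]:
    """Return the format name for a Content-Type value, or None if unknown."""
    # Drop any parameters after ';' and normalize case before the lookup.
    mime = content_type.split(";", 1)[0].strip().lower()
    return MIME_TO_FORMAT.get(mime)
```

A caller might use the result as the file extension, for example `f"speech.{format_for_content_type(content_type)}"`, after checking that it is not `None`.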

Errors

| Code | Description |
| --- | --- |
| 400 | Invalid parameters or text too long |
| 401 | Invalid API key |
| 429 | Rate limit exceeded |
| 500 | Provider error |
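Of these codes, 429 and 500 describe conditions that may clear on their own, while 400 and 401 need a corrected request. The sketch below retries only the former with exponential backoff and jitter. Treating 500 as retryable is an assumption, and `call_with_retries` and the `send` callable are hypothetical names, not part of any GateFlow client.

```python
import random
import time

# Assumption: rate limits and provider errors are worth retrying;
# 400 and 401 are returned to the caller immediately.
RETRYABLE_STATUS = {429, 500}


def call_with_retries(send, max_attempts=4, base_delay=0.5, sleep=time.sleep):
    """Call send() until it returns a non-retryable status.

    `send` takes no arguments and returns (status_code, body).
    `max_attempts` must be at least 1. The last response is returned
    even if it is still a retryable error.
    """
    for attempt in range(max_attempts):
        status, body = send()
        if status not in RETRYABLE_STATUS:
            return status, body
        if attempt < max_attempts - 1:
            # Delay doubles each attempt; jitter avoids synchronized retries.
            delay = base_delay * (2 ** attempt) + random.uniform(0, base_delay)
            sleep(delay)
    return status, body
```

In practice `send` would wrap the HTTP call, for example a function that posts the request with `httpx` and returns `(response.status_code, response.content)`.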
