Audio Providers
Configure speech-to-text (STT) and text-to-speech (TTS) providers for your voice pipelines.
Supported Providers
Speech-to-Text (STT)
| Provider | Model | Languages | Best For |
|---|---|---|---|
| OpenAI | whisper-1 | 50+ | General transcription |
| Mistral | voxtral-mini-latest | 100+ | Multilingual, low latency |
| Google | gemini-2.5-flash | 100+ | Audio via chat API |
Text-to-Speech (TTS)
| Provider | Model | Voices | Best For |
|---|---|---|---|
| OpenAI | tts-1, tts-1-hd | 6 | General use |
| ElevenLabs | eleven_multilingual_v2 | 100+ | Premium quality |
| ElevenLabs | eleven_turbo_v2_5 | 100+ | Low latency |
| ElevenLabs | eleven_flash_v2_5 | 100+ | Cost-effective |
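Each model name in the tables above belongs to one provider, so a client can sanity-check a model name before sending a request. The sketch below is a hypothetical client-side lookup built from those tables. `MODEL_PROVIDERS` and `provider_for` are illustrative names, not part of any GateFlow SDK, and the `google` entry for `gemini-2.5-flash` is inferred from the model name. It says nothing about how GateFlow itself routes requests.

```python
# Hypothetical client-side lookup; names are illustrative, not a GateFlow API.
MODEL_PROVIDERS = {
    # Speech-to-text models
    "whisper-1": ("openai", "stt"),
    "voxtral-mini-latest": ("mistral", "stt"),
    "gemini-2.5-flash": ("google", "stt"),  # provider inferred from the model name
    # Text-to-speech models
    "tts-1": ("openai", "tts"),
    "tts-1-hd": ("openai", "tts"),
    "eleven_multilingual_v2": ("elevenlabs", "tts"),
    "eleven_turbo_v2_5": ("elevenlabs", "tts"),
    "eleven_flash_v2_5": ("elevenlabs", "tts"),
}


def provider_for(model: str, kind: str) -> str:
    """Return the provider for a model, checking that it matches the task kind."""
    try:
        provider, model_kind = MODEL_PROVIDERS[model]
    except KeyError:
        raise ValueError(f"unknown model: {model!r}") from None
    if model_kind != kind:
        raise ValueError(f"{model!r} is a {model_kind} model, not {kind}")
    return provider


print(provider_for("whisper-1", "stt"))          # openai
print(provider_for("eleven_turbo_v2_5", "tts"))  # elevenlabs
```

Passing a TTS model where an STT model is expected raises `ValueError`, which catches a common mix-up before any network call is made.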
Configuring Providers
OpenAI (Whisper)
```bash
curl -X POST https://api.gateflow.ai/v1/management/providers \
  -H "Authorization: Bearer gw_prod_admin_key" \
  -H "Content-Type: application/json" \
  -d '{
    "provider": "openai",
    "credentials": {
      "api_key": "sk-..."
    }
  }'
```

STT Usage:
```bash
curl -X POST https://api.gateflow.ai/v1/audio/transcriptions \
  -H "Authorization: Bearer gw_prod_..." \
  -F "file=@audio.mp3" \
  -F "model=whisper-1"
```

TTS Usage:
```bash
curl -X POST https://api.gateflow.ai/v1/audio/speech \
  -H "Authorization: Bearer gw_prod_..." \
  -H "Content-Type: application/json" \
  -d '{
    "model": "tts-1-hd",
    "input": "Hello, how can I help you?",
    "voice": "alloy"
  }' --output speech.mp3
```

ElevenLabs
```bash
curl -X POST https://api.gateflow.ai/v1/management/providers \
  -H "Authorization: Bearer gw_prod_admin_key" \
  -H "Content-Type: application/json" \
  -d '{
    "provider": "elevenlabs",
    "credentials": {
      "api_key": "..."
    }
  }'
```

TTS Usage:
```bash
curl -X POST https://api.gateflow.ai/v1/audio/speech \
  -H "Authorization: Bearer gw_prod_..." \
  -H "Content-Type: application/json" \
  -d '{
    "model": "eleven_multilingual_v2",
    "input": "Hello, how can I help you?",
    "voice": "rachel"
  }' --output speech.mp3
```

Mistral (Voxtral)
```bash
curl -X POST https://api.gateflow.ai/v1/management/providers \
  -H "Authorization: Bearer gw_prod_admin_key" \
  -H "Content-Type: application/json" \
  -d '{
    "provider": "mistral",
    "credentials": {
      "api_key": "..."
    }
  }'
```

STT Usage:
```bash
curl -X POST https://api.gateflow.ai/v1/audio/transcriptions \
  -H "Authorization: Bearer gw_prod_..." \
  -F "file=@audio.mp3" \
  -F "model=voxtral-mini-latest"
```

Provider Selection
Automatic Selection
Set the model to "auto" and pass a "prefer" hint to let GateFlow choose the provider:
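```python
from openai import OpenAI

# Assumption: the OpenAI SDK pointed at the GateFlow base URL used in the curl examples.
client = OpenAI(base_url="https://api.gateflow.ai/v1", api_key="gw_prod_...")

with open("audio.mp3", "rb") as audio_file:
    response = client.audio.transcriptions.create(
        file=audio_file,
        model="auto",
        extra_body={
            "gateflow": {
                "prefer": "quality"  # or "speed" or "cost"
            }
        },
    )
```

Explicit Selection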
Specify the exact provider and model:
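```python
# Assumes `client` is an OpenAI-compatible SDK client configured for GateFlow.
with open("audio.mp3", "rb") as audio_file:
    response = client.audio.transcriptions.create(
        file=audio_file,
        model="whisper-1",  # Always uses OpenAI
    )
```

Provider Comparison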
STT Quality
| Model | Accuracy | Latency | Cost |
|---|---|---|---|
| Whisper-1 | Excellent | Medium | $0.006/min |
| Voxtral Mini | Excellent | Low | $0.02/min |
| Gemini 2.5 | Good | Low | Via chat tokens |
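To make the per-minute prices concrete, here is a rough cost estimate for a single recording. It is a sketch that assumes billing is prorated linearly by audio length, which the table does not state, and that the "Voxtral Mini" row corresponds to the `voxtral-mini-latest` model. `STT_RATE_PER_MIN` and `stt_cost` are illustrative names.

```python
# Per-minute rates copied from the STT table above. Gemini is billed via chat
# tokens, so it has no per-minute rate and is left out.
STT_RATE_PER_MIN = {
    "whisper-1": 0.006,
    "voxtral-mini-latest": 0.02,
}


def stt_cost(model: str, seconds: float) -> float:
    """Estimate transcription cost in USD, assuming linear proration by length."""
    return STT_RATE_PER_MIN[model] * seconds / 60


# Estimate for a 45-minute recording.
for model in STT_RATE_PER_MIN:
    print(f"{model}: ${stt_cost(model, 45 * 60):.2f}")
# whisper-1: $0.27
# voxtral-mini-latest: $0.90
```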
TTS Quality
| Model | Naturalness | Latency | Cost |
|---|---|---|---|
| ElevenLabs v2 | Excellent | Medium | $24/1M chars |
| ElevenLabs Turbo | Good | Low | $12/1M chars |
| OpenAI TTS-1-HD | Good | Medium | $30/1M chars |
| OpenAI TTS-1 | Good | Low | $15/1M chars |
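One way to read the TTS table is to fix a latency requirement and then take the cheapest row that meets it. The sketch below does that and estimates the cost of a given text. It assumes characters are counted as `len(text)`, which may differ from how each provider meters usage. `TTS_OPTIONS`, `cheapest`, and `tts_cost` are illustrative names.

```python
# Rows copied from the TTS table above:
# (label, naturalness, latency, USD per 1M characters).
TTS_OPTIONS = [
    ("ElevenLabs v2", "Excellent", "Medium", 24.0),
    ("ElevenLabs Turbo", "Good", "Low", 12.0),
    ("OpenAI TTS-1-HD", "Good", "Medium", 30.0),
    ("OpenAI TTS-1", "Good", "Low", 15.0),
]


def cheapest(latency: str):
    """Return the lowest-cost row with the requested latency rating."""
    matches = [row for row in TTS_OPTIONS if row[2] == latency]
    if not matches:
        raise ValueError(f"no TTS option with latency {latency!r}")
    return min(matches, key=lambda row: row[3])


def tts_cost(rate_per_million: float, text: str) -> float:
    """Estimate synthesis cost in USD, counting characters with len()."""
    return rate_per_million * len(text) / 1_000_000


label, _, _, rate = cheapest("Low")
print(label)                                     # ElevenLabs Turbo
print(f"${tts_cost(rate, 'a' * 250_000):.2f}")   # $3.00
```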
Fallback Configuration
Configure fallbacks for reliability:
```bash
curl -X POST https://api.gateflow.ai/v1/management/audio-fallbacks \
  -H "Authorization: Bearer gw_prod_admin_key" \
  -H "Content-Type: application/json" \
  -d '{
    "stt": {
      "primary": "whisper-1",
      "fallbacks": ["voxtral-mini-latest"]
    },
    "tts": {
      "primary": "eleven_multilingual_v2",
      "fallbacks": ["tts-1-hd", "tts-1"]
    }
  }'
```

Next Steps
- Voice Mapping - Map voices across providers
- Pipeline Templates - Pre-configured pipelines
- Streaming Speech - Real-time audio