Voice Agent Premium
High-quality voice pipeline for premium voice experiences.
Overview
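The `voice-agent-premium` template trades latency for quality. It pairs batch `whisper-1` transcription with `gpt-5.2` and the `eleven_multilingual_v2` voice model, giving an estimated time to first byte of 850-1650ms versus 400-750ms for the fast template.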
Configuration
```yaml
template: voice-agent-premium
stt:
  model: whisper-1
  streaming: false  # Batch for accuracy
  language: auto
llm:
  model: gpt-5.2
  max_tokens: 500
  temperature: 0.7
tts:
  model: eleven_multilingual_v2
  voice: professional
  streaming: true
  voice_settings:
    stability: 0.6
    similarity_boost: 0.8
```
Performance
| Stage | Latency | Notes |
|---|---|---|
| STT | 500-1000ms | Batch processing |
| LLM (first token) | 200-400ms | Advanced model |
| TTS (first chunk) | 150-250ms | Premium model |
| Total TTFB | 850-1650ms | Higher quality |
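The total row equals the sum of the three stage ranges, which is consistent with the stages running one after another before the first audio chunk is returned. A minimal sketch of that arithmetic follows; `STAGE_LATENCY_MS` and `total_ttfb` are illustrative names, not part of any GateFlow API.

```python
# Per-stage latency ranges (min_ms, max_ms) copied from the table above.
STAGE_LATENCY_MS = {
    "stt": (500, 1000),
    "llm_first_token": (200, 400),
    "tts_first_chunk": (150, 250),
}


def total_ttfb(stages):
    """Sum the per-stage bounds, assuming the stages run strictly in sequence."""
    low = sum(lo for lo, _ in stages.values())
    high = sum(hi for _, hi in stages.values())
    return low, high


print(total_ttfb(STAGE_LATENCY_MS))  # (850, 1650)
```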
Quality Comparison
| Feature | Fast | Premium |
|---|---|---|
| STT Accuracy | Good | Excellent |
| Response Quality | Good | Excellent |
| Voice Naturalness | Good | Excellent |
| Multilingual | Limited | Full |
| Latency (total TTFB) | 400-750ms | 850-1650ms |
| Cost | $0.003-0.005 | $0.008-0.015 |
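To put the cost row in context, the sketch below projects a spend range for a given request volume. It assumes the figures are USD per request (the table does not state the unit) and that a fixed share of traffic goes to each template. `COST_PER_REQUEST` and `projected_cost` are hypothetical names used only for this illustration.

```python
# Cost ranges from the comparison table, ASSUMED to be USD per request.
COST_PER_REQUEST = {
    "voice-agent-fast": (0.003, 0.005),
    "voice-agent-premium": (0.008, 0.015),
}


def projected_cost(requests, premium_share=1.0):
    """Return the (low, high) spend in USD for `requests` calls.

    `premium_share` is the fraction of calls sent to the premium template;
    the remainder is assumed to use the fast template.
    """
    if not 0.0 <= premium_share <= 1.0:
        raise ValueError("premium_share must be between 0 and 1")
    fast_lo, fast_hi = COST_PER_REQUEST["voice-agent-fast"]
    prem_lo, prem_hi = COST_PER_REQUEST["voice-agent-premium"]
    fast_share = 1.0 - premium_share
    low = requests * (premium_share * prem_lo + fast_share * fast_lo)
    high = requests * (premium_share * prem_hi + fast_share * fast_hi)
    return round(low, 2), round(high, 2)


print(projected_cost(10_000))                     # every call on premium
print(projected_cost(10_000, premium_share=0.3))  # 30% premium, 70% fast
```

Under these assumptions, sending most traffic to the fast template roughly halves the projected range, which is the trade-off the comparison table describes.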
Usage
Basic Usage
```python
import base64

from gateflow_mcp import MCPClient

client = MCPClient(agent_id="agent_abc123", api_key="gf-agent-...")

# Base64-encode the recorded question so it can be sent as a tool argument
with open("question.mp3", "rb") as f:
    audio_b64 = base64.b64encode(f.read()).decode()

result = client.call_tool(
    name="voice/pipeline",
    arguments={
        "audio": audio_b64,
        "template": "voice-agent-premium",
        "context": "You are a knowledgeable executive assistant.",
    },
)

print(f"Transcription: {result['transcription']}")
print(f"Response: {result['response']}")
```
Extended Conversation
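Multi-turn context is passed through `conversation_history`, a list of role/content messages, as in the request that follows. One way to maintain that list between calls is sketched here. `record_turn` is a hypothetical helper, not part of `gateflow_mcp`, and it assumes each result exposes the `transcription` and `response` fields used in Basic Usage.

```python
def record_turn(history, result):
    """Return a new history list with the latest exchange appended.

    Assumes `result` is the dict returned by the voice/pipeline tool, with
    "transcription" (the user's speech) and "response" (the assistant reply).
    The input list is left unchanged.
    """
    return history + [
        {"role": "user", "content": result["transcription"]},
        {"role": "assistant", "content": result["response"]},
    ]


# Stand-in result for illustration; a real one would come from client.call_tool
fake_result = {
    "transcription": "I need to schedule a board meeting.",
    "response": "What date and time works best for you?",
}
history = record_turn([], fake_result)
print(len(history))  # 2
```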
```python
result = client.call_tool(
    name="voice/pipeline",
    arguments={
        "audio": audio_b64,
        "template": "voice-agent-premium",
        "context": """You are an executive assistant helping schedule meetings.
Today's date: February 16, 2026
User's timezone: Eastern
Be professional and thorough in your responses.""",
        "conversation_history": [
            {"role": "user", "content": "I need to schedule a board meeting."},
            {"role": "assistant", "content": "I'd be happy to help schedule a board meeting. What date and time works best for you?"},
            {"role": "user", "content": "Next Tuesday at 2 PM."},
        ],
    },
)
```
Multilingual Support
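```python
# Works with 100+ languages
# french_audio_b64 is base64-encoded French speech, prepared as in Basic Usage
result = client.call_tool(
    name="voice/pipeline",
    arguments={
        "audio": french_audio_b64,
        "template": "voice-agent-premium",
        "context": "Vous êtes un assistant utile. Répondez en français.",
        # Pin STT and TTS to French instead of relying on `language: auto`
        "overrides": {
            "stt": {"language": "fr"},
            "tts": {"language": "fr"},
        },
    },
)
```
Response Format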
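The fields in the sample that follows are enough to persist the synthesized reply and sanity-check the timings. The sketch below decodes the base64 `audio` field to a file and adds up the per-stage latencies for comparison with `total_ms`. `save_reply_audio` and `stage_latency_sum` are hypothetical helpers, not part of `gateflow_mcp`, and the stand-in result uses fake audio bytes.

```python
import base64
from pathlib import Path


def save_reply_audio(result, directory="."):
    """Decode the base64 audio and write it as reply.<audio_format>."""
    path = Path(directory) / f"reply.{result['audio_format']}"
    path.write_bytes(base64.b64decode(result["audio"]))
    return path


def stage_latency_sum(result):
    """Add up the stt/llm/tts timings reported under "latency"."""
    latency = result["latency"]
    return latency["stt_ms"] + latency["llm_ms"] + latency["tts_ms"]


# Stand-in result mirroring the documented shape (the audio bytes are fake).
sample = {
    "audio": base64.b64encode(b"not-real-mp3-bytes").decode(),
    "audio_format": "mp3",
    "latency": {"stt_ms": 750, "llm_ms": 380, "tts_ms": 220, "total_ms": 1350},
}
print(stage_latency_sum(sample) == sample["latency"]["total_ms"])  # True
```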
```json
{
  "transcription": "I need to reschedule the board meeting to next Friday.",
  "response": "I'll reschedule the board meeting to Friday, February 21st. Would you like me to send updated calendar invites to all attendees? I can also prepare an updated agenda if needed.",
  "audio": "base64_encoded_mp3_audio...",
  "audio_format": "mp3",
  "latency": {
    "stt_ms": 750,
    "llm_ms": 380,
    "tts_ms": 220,
    "total_ms": 1350
  },
  "usage": {
    "stt_seconds": 5.2,
    "llm_tokens": {"prompt": 120, "completion": 65},
    "tts_characters": 180
  },
  "cost": 0.012
}
```
Voice Customization
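Overrides are nested by stage (`stt`, `llm`, `tts`), mirroring the template configuration, and the request that follows changes individual TTS settings. As an illustration of what a partial override is presumably meant to do, the sketch below deep-merges an override dict onto the template defaults so that only the named keys change. This merge behavior is an assumption about the gateway, not documented behavior, and `merge_overrides` is a hypothetical helper for reasoning about the effective settings locally.

```python
import copy

# TTS defaults copied from the Configuration section.
TEMPLATE_DEFAULTS = {
    "tts": {
        "model": "eleven_multilingual_v2",
        "voice": "professional",
        "streaming": True,
        "voice_settings": {"stability": 0.6, "similarity_boost": 0.8},
    },
}


def merge_overrides(defaults, overrides):
    """Recursively layer `overrides` on `defaults` without mutating either.

    ASSUMPTION: nested dicts are merged key by key; any other value replaces
    the default outright.
    """
    merged = copy.deepcopy(defaults)
    for key, value in overrides.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = merge_overrides(merged[key], value)
        else:
            merged[key] = copy.deepcopy(value)
    return merged


effective = merge_overrides(
    TEMPLATE_DEFAULTS,
    {"tts": {"voice_settings": {"stability": 0.5, "style": 0.3}}},
)
# stability is overridden, similarity_boost is kept, style is added
print(effective["tts"]["voice_settings"])
```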
```python
result = client.call_tool(
    name="voice/pipeline",
    arguments={
        "audio": audio_b64,
        "template": "voice-agent-premium",
        "overrides": {
            "tts": {
                "voice": "21m00Tcm4TlvDq8ikWAM",  # Specific ElevenLabs voice
                "voice_settings": {
                    "stability": 0.5,
                    "similarity_boost": 0.9,
                    "style": 0.3,
                    "speed": 0.95,
                },
            },
        },
    },
)
```
Best For
- Executive assistants - Complex scheduling and tasks
- Premium customer service - High-value interactions
- Accessibility applications - Natural voice interfaces
- Multilingual applications - Global user bases
- Complex conversations - Multi-turn dialogues
Use Case: Financial Advisor
```python
result = client.call_tool(
    name="voice/pipeline",
    arguments={
        "audio": audio_b64,
        "template": "voice-agent-premium",
        "context": """You are a financial advisor assistant.
Guidelines:
- Never give specific investment advice
- Always recommend consulting a licensed advisor
- Be informative about general financial concepts
- Maintain a professional, reassuring tone""",
        "overrides": {
            "llm": {
                "temperature": 0.5,  # More conservative
            },
            "tts": {
                "voice": "serious",
            },
        },
    },
)
```
Permissions Required
```yaml
permissions:
  tools:
    - voice/pipeline
  models:
    - whisper-1
    - gpt-5.2
    - eleven_multilingual_v2
  pipelines:
    - voice-agent-premium
```
Cost Optimization
Balance quality and cost:
```python
# Use premium for complex queries, fast for simple ones
def select_template(query_complexity):
    if query_complexity == "simple":
        return "voice-agent-fast"
    else:
        return "voice-agent-premium"
```
Next Steps
- Voice Agent Fast - Lower latency option
- Custom Templates - Build your own
- Voice Mapping - Voice configuration