Model Fallbacks
Fallbacks ensure your application stays running when a provider has issues. When the primary model fails, GateFlow automatically tries alternatives.
How Fallbacks Work
When a request to your primary model fails with a retryable error (see Fallback Triggers below), GateFlow retries the same request against each configured fallback model in priority order until one succeeds or the chain is exhausted.
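The chain-walking behavior described above can be sketched as a simple loop. This is an illustrative model of the logic, not GateFlow internals; the `send` callable and status names are assumptions for the sketch:

```python
# Retryable failure classes, matching the Fallback Triggers table below.
RETRYABLE = {"provider_error", "timeout", "rate_limit", "model_unavailable", "circuit_open"}

def complete_with_fallbacks(request, primary, fallbacks, send):
    """Try the primary model, then each fallback in priority order.

    `send(model, request)` returns a (status, payload) tuple, where a
    non-"ok" status names the failure class (e.g. "provider_error").
    """
    attempts = []
    for model in [primary, *fallbacks]:
        status, payload = send(model, request)
        attempts.append({"model": model, "status": status})
        if status == "ok":
            return payload, attempts
        if status not in RETRYABLE:
            # Auth / bad-request / policy errors fail fast: no fallback.
            raise RuntimeError(f"{model}: non-retryable error {status}")
    raise RuntimeError("all models in the fallback chain failed")
```

Note that non-retryable errors abort the chain immediately, mirroring the "Fallbacks do not activate for" list below.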
Configuring Fallbacks
Via Dashboard
1. Go to Settings → Routing → Fallbacks
2. Select a primary model
3. Add fallback models in priority order
4. Click Save
Via API
```bash
curl -X POST https://api.gateflow.ai/v1/management/fallback-chains \
  -H "Authorization: Bearer gw_prod_admin_key" \
  -H "Content-Type: application/json" \
  -d '{
    "primary": "gpt-5.2",
    "fallbacks": [
      "claude-sonnet-4-5-20250929",
      "gemini-3-pro"
    ]
  }'
```
Per-Request Fallbacks
Override fallbacks for specific requests:
```python
response = client.chat.completions.create(
    model="gpt-5.2",
    messages=[...],
    extra_body={
        "gateflow": {
            "fallbacks": ["claude-sonnet-4-5-20250929", "gemini-3-pro"]
        }
    },
)
```
Fallback Triggers
Fallbacks activate when:
| Trigger | Description | Example |
|---|---|---|
| Provider Error | 5xx from provider | OpenAI returns 500 |
| Timeout | Request exceeds timeout | No response in 60s |
| Rate Limit | Provider rate limit hit | 429 from Anthropic |
| Model Unavailable | Model temporarily down | Scheduled maintenance |
| Circuit Open | Too many recent failures | Provider marked unhealthy |
Fallbacks do not activate for:
- Authentication errors (invalid API key)
- Bad request errors (malformed input)
- Content policy violations
- Cost limit exceeded (your limit, not provider's)
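Because these errors surface directly to your client rather than triggering a fallback, client code should treat them as terminal. A small classifier over HTTP status codes, following the common REST convention (this mapping is an assumption for illustration, not an exhaustive GateFlow error list):

```python
# Statuses that indicate a problem with the request or key itself:
# 400 bad request, 401/403 auth, 422 validation. Retrying won't help.
NON_RETRYABLE_STATUS = {400, 401, 403, 422}

def should_client_retry(status_code: int) -> bool:
    """Decide whether the client should retry after fallbacks are exhausted.

    GateFlow has already retried retryable failures across the chain, so
    only transient statuses (429 or 5xx) are worth retrying client-side.
    """
    if status_code in NON_RETRYABLE_STATUS:
        return False  # fix the request, key, or limits instead
    return status_code == 429 or status_code >= 500
```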
Fallback Behavior
Response Metadata
When a fallback is used, the response indicates this:
```json
{
  "model": "claude-sonnet-4-5-20250929",
  "choices": [...],
  "gateflow": {
    "routing": {
      "requested_model": "gpt-5.2",
      "selected_model": "claude-sonnet-4-5-20250929",
      "fallback_used": true,
      "fallback_reason": "provider_error",
      "attempts": [
        {"model": "gpt-5.2", "status": "failed", "error": "503"}
      ]
    }
  }
}
```
Streaming Fallbacks
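Client code can inspect this metadata to log or alert when a fallback was used. A small helper, assuming the response is available as a dict shaped like the `gateflow.routing` block above:

```python
def fallback_info(response: dict):
    """Extract (fallback_used, reason, failed_models) from a GateFlow response."""
    routing = response.get("gateflow", {}).get("routing", {})
    failed = [
        a["model"]
        for a in routing.get("attempts", [])
        if a.get("status") == "failed"
    ]
    return routing.get("fallback_used", False), routing.get("fallback_reason"), failed
```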
For streaming requests, GateFlow buffers briefly before starting the stream, so a failure before the first token is sent still triggers a fallback transparently. Once the primary starts streaming successfully, it continues even if it later fails mid-stream; a partial stream is not replayed on a fallback model.
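Since GateFlow will not switch models once tokens have started flowing, clients that need complete responses may want their own mid-stream retry. A minimal client-side sketch; the `start_stream` callable is illustrative, not part of any GateFlow SDK:

```python
def consume_stream(start_stream, on_chunk):
    """Drain a token stream; return (text_so_far, error).

    GateFlow does not fall back mid-stream, so a failure after the first
    token surfaces here along with whatever text was already received.
    The caller can then decide whether to retry the whole request.
    """
    parts = []
    try:
        for chunk in start_stream():
            parts.append(chunk)
            on_chunk(chunk)
    except Exception as exc:
        return "".join(parts), exc
    return "".join(parts), None
```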
Recommended Fallback Chains
General Chat
```json
{
  "primary": "gpt-5.2",
  "fallbacks": ["claude-sonnet-4-5-20250929", "gemini-3-pro"]
}
```
Code Generation
```json
{
  "primary": "devstral-2",
  "fallbacks": ["gpt-5.2-codex", "claude-sonnet-4-5-20250929"]
}
```
Cost-Sensitive
```json
{
  "primary": "gpt-5-mini",
  "fallbacks": ["claude-haiku-4-5-20251015", "gemini-2.5-flash"]
}
```
High Quality
```json
{
  "primary": "claude-opus-4-5-20251107",
  "fallbacks": ["gpt-5.2", "gemini-3-pro"]
}
```
Reasoning Tasks
```json
{
  "primary": "o3",
  "fallbacks": ["o4-mini", "claude-opus-4-5-20251107"]
}
```
Embeddings
```json
{
  "primary": "text-embedding-3-large",
  "fallbacks": ["text-embedding-004", "embed-english-v3.0"]
}
```
Cross-Provider Considerations
When falling back across providers, be aware of:
Context Window Differences
| Model | Context Window |
|---|---|
| GPT-5.2 | 256k |
| Claude Opus 4.5 | 200k |
| Gemini 3 Pro | 2M |
| Mistral Large 3 | 128k |
If your request has 150k tokens and falls back from Gemini 3 Pro to Mistral Large 3, it will fail.
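One way to guard against this is to restrict the chain to models whose context window fits the request. A sketch using the window sizes from the table above; counting the request's tokens is assumed to happen elsewhere:

```python
# Context windows in tokens, from the table above.
CONTEXT_WINDOWS = {
    "gpt-5.2": 256_000,
    "claude-opus-4-5-20251107": 200_000,
    "gemini-3-pro": 2_000_000,
    "mistral-large-3": 128_000,
}

def viable_fallbacks(fallbacks, prompt_tokens, max_output_tokens=0):
    """Keep only fallback models whose context window fits the request.

    Unknown models default to a window of 0 and are filtered out.
    """
    needed = prompt_tokens + max_output_tokens
    return [m for m in fallbacks if CONTEXT_WINDOWS.get(m, 0) >= needed]
```

For the 150k-token example above, Mistral Large 3 (128k) is filtered out while the larger-window models remain.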
Feature Differences
| Feature | GPT-5.2 | Claude 4.5 | Gemini 3 |
|---|---|---|---|
| Function calling | Yes | Yes | Yes |
| Vision | Yes | Yes | Yes |
| JSON mode | Yes | Yes | Yes |
| System prompts | Yes | Yes | Yes |
Output Consistency
Different models may produce different outputs for the same input. For applications requiring consistency:
```python
response = client.chat.completions.create(
    model="gpt-5.2",
    messages=[...],
    extra_body={
        "gateflow": {
            "fallback_mode": "fail"  # Don't use fallbacks
        }
    },
)
```
Or use fallbacks within the same provider:
```json
{
  "primary": "gpt-5.2",
  "fallbacks": ["gpt-5.1", "gpt-5"]
}
```
Monitoring Fallbacks
Dashboard Metrics
- Fallback rate (% of requests using fallbacks)
- Fallback reasons breakdown
- Model distribution when fallbacks occur
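The fallback rate can also be derived client-side from the response metadata described earlier. A sketch, assuming each logged response carries the `gateflow.routing` block:

```python
def fallback_rate(responses):
    """Fraction of responses that were served by a fallback model."""
    if not responses:
        return 0.0
    used = sum(
        1 for r in responses
        if r.get("gateflow", {}).get("routing", {}).get("fallback_used")
    )
    return used / len(responses)
```

Tracking this number over a rolling window gives you the same signal the dashboard's fallback-rate metric reports.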
Alerts
Configure alerts for high fallback rates:
```bash
curl -X POST https://api.gateflow.ai/v1/management/alerts \
  -H "Authorization: Bearer gw_prod_admin_key" \
  -H "Content-Type: application/json" \
  -d '{
    "name": "High Fallback Rate",
    "condition": {
      "metric": "fallback_rate",
      "operator": "gt",
      "threshold": 0.1,
      "window_minutes": 15
    },
    "notify": {
      "channels": ["slack", "email"]
    }
  }'
```
Disabling Fallbacks
Globally
```bash
curl -X PATCH https://api.gateflow.ai/v1/management/settings \
  -H "Authorization: Bearer gw_prod_admin_key" \
  -H "Content-Type: application/json" \
  -d '{
    "fallbacks_enabled": false
  }'
```
Per Request
```python
response = client.chat.completions.create(
    model="gpt-5.2",
    messages=[...],
    extra_body={
        "gateflow": {
            "fallback_mode": "fail"
        }
    },
)
```
Next Steps
- Cost vs Performance - Balance cost and quality
- Model Change Management - Handle deprecations