Automated Fallbacks
Automated fallbacks protect your application when models are deprecated or fail. GateFlow handles the switch automatically, with no code changes needed.
How It Works
Enabling Automated Fallbacks
Global Setting
Enable for all deprecated models:
bash
curl -X PATCH https://api.gateflow.ai/v1/management/settings \
  -H "Authorization: Bearer gw_prod_admin_key" \
  -H "Content-Type: application/json" \
  -d '{
    "auto_fallback_on_deprecation": true
  }'
Per-Model Configuration
Configure fallback for specific models:
bash
curl -X POST https://api.gateflow.ai/v1/management/fallback-chains \
  -H "Authorization: Bearer gw_prod_admin_key" \
  -H "Content-Type: application/json" \
  -d '{
    "primary": "gpt-4-turbo",
    "fallbacks": ["gpt-5.2", "claude-sonnet-4-5-20250929"],
    "auto_activate_on_deprecation": true
  }'
Fallback Behavior
When Fallbacks Activate
| Trigger | Behavior |
|---|---|
| Model deprecated | Route to first fallback |
| Provider error (5xx) | Try next in chain |
| Rate limit exceeded | Try next provider |
| Timeout | Try next in chain |
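The table can be read as a loop over the chain: try the primary, and on any listed trigger move to the next model. The sketch below is only an illustration of that reading, not GateFlow's implementation. `UpstreamError`, `call_with_fallbacks`, `fake_call`, and every trigger name other than `model_deprecated` are hypothetical.

```python
from typing import Callable, Sequence, Tuple


class UpstreamError(Exception):
    """Hypothetical error that records which trigger made a model call fail."""

    def __init__(self, trigger: str) -> None:
        super().__init__(trigger)
        self.trigger = trigger


# Hypothetical names for the four triggers in the table above.
FALLBACK_TRIGGERS = {"model_deprecated", "provider_error", "rate_limited", "timeout"}


def call_with_fallbacks(
    chain: Sequence[str], call_model: Callable[[str], str]
) -> Tuple[str, str]:
    """Try each model in chain order and return (model_used, result)."""
    last_error = None
    for model in chain:
        try:
            return model, call_model(model)
        except UpstreamError as err:
            if err.trigger not in FALLBACK_TRIGGERS:
                raise  # not a fallback trigger, so surface it unchanged
            last_error = err  # remember the failure and try the next model
    raise RuntimeError("every model in the chain failed") from last_error


def fake_call(model: str) -> str:
    """Stand-in for a real model call: the primary is treated as deprecated."""
    if model == "gpt-4-turbo":
        raise UpstreamError("model_deprecated")
    return f"reply from {model}"


chain = ["gpt-4-turbo", "gpt-5.2", "claude-sonnet-4-5-20250929"]
print(call_with_fallbacks(chain, fake_call))
```

With GateFlow this loop runs on the gateway side, so application code keeps sending requests for the primary model.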
Response Metadata
When a fallback is used, the response indicates this:
json
{
  "model": "gpt-5.2",
  "choices": [...],
  "gateflow": {
    "fallback": {
      "used": true,
      "reason": "model_deprecated",
      "original_model": "gpt-4-turbo",
      "fallback_model": "gpt-5.2"
    }
  }
}
Logging
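Client code can also record fallbacks itself by reading the `gateflow.fallback` object from the response shown above. The following is a minimal sketch that assumes the response has already been parsed into a Python dict, and that a response served without a fallback either omits the object or sets `used` to `false`. The helper names and the logger name are hypothetical.

```python
import logging
from typing import Optional

logger = logging.getLogger("gateflow.client")  # hypothetical logger name


def fallback_info(response: dict) -> Optional[dict]:
    """Return the gateflow.fallback object if a fallback served the request."""
    fallback = (response.get("gateflow") or {}).get("fallback") or {}
    return fallback if fallback.get("used") else None


def log_if_fallback(response: dict) -> bool:
    """Log a warning when a fallback was used; return whether one was."""
    info = fallback_info(response)
    if info is None:
        return False
    logger.warning(
        "fallback used: %s -> %s (%s)",
        info.get("original_model"),
        info.get("fallback_model"),
        info.get("reason"),
    )
    return True
```

Calling `log_if_fallback(response)` after each completion gives application logs a record that can be compared with the gateway's own events.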
All fallback events are logged:
bash
curl https://api.gateflow.ai/v1/management/logs \
  -H "Authorization: Bearer gw_prod_admin_key" \
  -G -d "event_type=fallback_activated"
Fallback Chains
Creating Chains
bash
curl -X POST https://api.gateflow.ai/v1/management/fallback-chains \
  -H "Authorization: Bearer gw_prod_admin_key" \
  -H "Content-Type: application/json" \
  -d '{
    "name": "GPT family fallback",
    "primary": "gpt-4-turbo",
    "fallbacks": [
      {
        "model": "gpt-5.2",
        "priority": 1
      },
      {
        "model": "gpt-5",
        "priority": 2
      },
      {
        "model": "claude-sonnet-4-5-20250929",
        "priority": 3,
        "note": "Cross-provider fallback"
      }
    ]
  }'
Chain Priority
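Fallbacks are tried in priority order. In the chain above, gpt-5.2 (priority 1) is tried first, then gpt-5, then claude-sonnet-4-5-20250929.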
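The ordering can be sketched in a few lines of Python, assuming, as in the example chain, that a lower `priority` number is tried earlier. `ordered_models` is a hypothetical helper for illustration and not part of GateFlow.

```python
from typing import Dict, List


def ordered_models(fallbacks: List[Dict]) -> List[str]:
    """Return fallback model names sorted by ascending priority number."""
    return [entry["model"] for entry in sorted(fallbacks, key=lambda e: e["priority"])]


# The chain from the example above, deliberately listed out of order.
fallbacks = [
    {"model": "claude-sonnet-4-5-20250929", "priority": 3},
    {"model": "gpt-5.2", "priority": 1},
    {"model": "gpt-5", "priority": 2},
]
print(ordered_models(fallbacks))
```

Because the order comes from the `priority` field, the position of an entry in the `fallbacks` array does not matter under this assumption.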
Conditional Fallbacks
Route to different fallbacks based on conditions:
json
{
  "primary": "gpt-4-turbo",
  "fallbacks": [
    {
      "model": "claude-opus-4-5-20251107",
      "condition": {"context_length_gt": 100000}
    },
    {
      "model": "gpt-5.2",
      "condition": {"always": true}
    }
  ]
}
Deprecation-Specific Fallbacks
Configure what happens when a specific model is deprecated:
bash
curl -X POST https://api.gateflow.ai/v1/management/deprecation-fallbacks \
  -H "Authorization: Bearer gw_prod_admin_key" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-4-turbo",
    "on_deprecation": {
      "action": "fallback",
      "fallback_to": "gpt-5.2",
      "notify": true
    }
  }'
Available Actions
| Action | Description |
|---|---|
| fallback | Route to fallback model |
| fail | Return error (no fallback) |
| queue | Queue request until migration complete |
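When rules are generated by a script, it can help to validate the action before sending the request. The sketch below builds the request body for the `deprecation-fallbacks` endpoint shown earlier. It assumes that `fallback_to` is required only for the `fallback` action and is omitted otherwise, since the source shows a payload for `fallback` only. `deprecation_rule` is a hypothetical helper.

```python
import json
from typing import Optional

VALID_ACTIONS = {"fallback", "fail", "queue"}  # actions from the table above


def deprecation_rule(
    model: str, action: str, fallback_to: Optional[str] = None, notify: bool = True
) -> dict:
    """Build a deprecation rule body, rejecting unknown or incomplete actions."""
    if action not in VALID_ACTIONS:
        raise ValueError(f"unknown action: {action!r}")
    if action == "fallback" and not fallback_to:
        raise ValueError("action 'fallback' needs a fallback_to model")
    on_deprecation = {"action": action, "notify": notify}
    if action == "fallback":
        # Assumption: only the fallback action carries a target model.
        on_deprecation["fallback_to"] = fallback_to
    return {"model": model, "on_deprecation": on_deprecation}


print(json.dumps(deprecation_rule("gpt-4-turbo", "fallback", "gpt-5.2"), indent=2))
```

The printed JSON can be passed as the `-d` body of the curl command above.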
Monitoring Fallbacks
Dashboard Metrics
The dashboard shows:
- Fallback activation rate
- Fallback reasons breakdown
- Model distribution during fallbacks
- Cost impact of fallbacks
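The first three figures can be approximated on the client side from the `gateflow.fallback` metadata that accompanies each response. This is a rough sketch over a list of already-parsed response dicts. `fallback_metrics` is a hypothetical helper, and the dashboard's own calculation may differ.

```python
from collections import Counter
from typing import Dict, List


def fallback_metrics(responses: List[Dict]) -> Dict:
    """Summarize fallback rate, reasons, and serving models for parsed responses."""
    used = []
    for response in responses:
        fallback = (response.get("gateflow") or {}).get("fallback") or {}
        if fallback.get("used"):
            used.append(fallback)
    total = len(responses)
    return {
        # Share of responses that were served by a fallback model.
        "fallback_rate": len(used) / total if total else 0.0,
        "reasons": dict(Counter(f.get("reason") for f in used)),
        "models": dict(Counter(f.get("fallback_model") for f in used)),
    }


sample = [
    {"model": "gpt-4-turbo"},
    {"model": "gpt-4-turbo"},
    {"model": "gpt-4-turbo"},
    {
        "model": "gpt-5.2",
        "gateflow": {
            "fallback": {
                "used": True,
                "reason": "model_deprecated",
                "original_model": "gpt-4-turbo",
                "fallback_model": "gpt-5.2",
            }
        },
    },
]
print(fallback_metrics(sample))
```

Cost impact is left out because the response metadata shown in this document carries no pricing information.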
Alerts
Configure alerts for fallback activity:
bash
curl -X POST https://api.gateflow.ai/v1/management/alerts \
  -H "Authorization: Bearer gw_prod_admin_key" \
  -H "Content-Type: application/json" \
  -d '{
    "name": "High fallback rate",
    "condition": {
      "metric": "fallback_rate",
      "operator": "gt",
      "threshold": 0.1,
      "window_minutes": 15
    },
    "notify": {
      "channels": ["slack", "email"]
    }
  }'
Real-Time Status
Check current fallback status:
bash
curl https://api.gateflow.ai/v1/management/fallback-status \
  -H "Authorization: Bearer gw_prod_admin_key"
Response:
json
{
  "active_fallbacks": [
    {
      "original_model": "gpt-4-turbo",
      "fallback_model": "gpt-5.2",
      "reason": "model_deprecated",
      "activated_at": "2026-04-01T00:00:00Z",
      "requests_routed": 15420
    }
  ],
  "fallback_rate_1h": 0.12,
  "fallback_rate_24h": 0.08
}
Testing Fallbacks
Simulate Deprecation
Test how your system handles fallbacks:
bash
curl -X POST https://api.gateflow.ai/v1/management/fallback-chains/test \
  -H "Authorization: Bearer gw_prod_admin_key" \
  -H "Content-Type: application/json" \
  -d '{
    "primary": "gpt-4-turbo",
    "simulate": "deprecation",
    "test_requests": 10
  }'
Dry Run
See what would happen without actually routing:
bash
curl -X POST https://api.gateflow.ai/v1/chat/completions \
  -H "Authorization: Bearer gw_prod_admin_key" \
  -H "X-GateFlow-Dry-Run: true" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-4-turbo",
    "messages": [...]
  }'
Response:
json
{
  "dry_run": true,
  "would_route_to": "gpt-5.2",
  "reason": "model_deprecated",
  "fallback_chain": ["gpt-4-turbo", "gpt-5.2", "claude-sonnet-4-5-20250929"]
}
Best Practices
1. Always Configure Fallbacks
Every model should have at least one fallback:
json
{
  "fallback_chains": [
    {"primary": "gpt-5.2", "fallbacks": ["claude-sonnet-4-5-20250929"]},
    {"primary": "claude-sonnet-4-5-20250929", "fallbacks": ["gpt-5.2"]},
    {"primary": "gpt-5-mini", "fallbacks": ["claude-haiku-4-5-20251015", "gemini-2.5-flash"]}
  ]
}
2. Test Cross-Provider Fallbacks
Ensure fallbacks to different providers work:
python
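# Assumption: `client` is an OpenAI SDK client pointed at GateFlow; adjust to your setup.
from openai import OpenAI

client = OpenAI(base_url="https://api.gateflow.ai/v1", api_key="gw_prod_admin_key")

# Test that the Claude fallback produces acceptable output
response = client.chat.completions.create(
    model="gpt-5.2",
    messages=[...],
    extra_body={
        "gateflow": {
            "force_fallback": "claude-sonnet-4-5-20250929"
        }
    },
)
3. Monitor Fallback Costs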
Different models have different costs. Track cost changes:
bash
curl https://api.gateflow.ai/v1/management/analytics/cost-by-fallback \
  -H "Authorization: Bearer gw_prod_admin_key"
4. Document Fallback Behavior
Let your team know which fallbacks are configured and why.
Next Steps
- Zero-Downtime Migrations - Production migration strategies
- Model Fallbacks - General fallback configuration