Automated Fallbacks

Automated fallbacks protect your application when a model is deprecated or fails. GateFlow switches to the configured fallback automatically, with no code changes needed.

How It Works

Enabling Automated Fallbacks

Global Setting

Enable automatic fallback for all deprecated models:

bash

curl -X PATCH https://api.gateflow.ai/v1/management/settings \
  -H "Authorization: Bearer gw_prod_admin_key" \
  -H "Content-Type: application/json" \
  -d '{
    "auto_fallback_on_deprecation": true
  }'

Per-Model Configuration

Configure fallback for specific models:

bash

curl -X POST https://api.gateflow.ai/v1/management/fallback-chains \
  -H "Authorization: Bearer gw_prod_admin_key" \
  -H "Content-Type: application/json" \
  -d '{
    "primary": "gpt-4-turbo",
    "fallbacks": ["gpt-5.2", "claude-sonnet-4-5-20250929"],
    "auto_activate_on_deprecation": true
  }'

Fallback Behavior

When Fallbacks Activate

| Trigger | Behavior |
| --- | --- |
| Model deprecated | Route to first fallback |
| Provider error (5xx) | Try next in chain |
| Rate limit exceeded | Try next provider |
| Timeout | Try next in chain |
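The trigger table can be read as a loop over the chain. The sketch below is a client-side illustration of those semantics, not GateFlow's implementation: `ProviderError`, `should_try_next`, `call_with_fallback`, and the `send` callback are hypothetical names, and treating HTTP 429 as the rate-limit signal is an assumption.

```python
from typing import Callable, Optional, Sequence


class ProviderError(Exception):
    """Hypothetical error carrying an HTTP-style status code."""

    def __init__(self, status: int) -> None:
        super().__init__(f"provider returned {status}")
        self.status = status


def should_try_next(exc: Exception) -> bool:
    """True for the failure triggers in the table: 5xx, rate limit, timeout."""
    if isinstance(exc, TimeoutError):
        return True
    if isinstance(exc, ProviderError):
        # Assumption: rate limiting surfaces as HTTP 429
        return exc.status >= 500 or exc.status == 429
    return False


def call_with_fallback(
    chain: Sequence[str],
    send: Callable[[str], str],
    deprecated: frozenset = frozenset(),
) -> tuple:
    """Walk the chain in order and return (model_used, result)."""
    last_error: Optional[Exception] = None
    for model in chain:
        if model in deprecated:
            continue  # deprecated model: route straight to the next entry
        try:
            return model, send(model)
        except Exception as exc:
            if not should_try_next(exc):
                raise  # e.g. a 4xx client error is not a fallback trigger
            last_error = exc
    raise RuntimeError("no model in the chain could serve the request") from last_error


def flaky_send(model: str) -> str:
    """Stand-in for a provider call: gpt-5.2 fails with a 503 in this demo."""
    if model == "gpt-5.2":
        raise ProviderError(503)
    return "ok:" + model


chain = ["gpt-4-turbo", "gpt-5.2", "claude-sonnet-4-5-20250929"]
used, result = call_with_fallback(chain, flaky_send, frozenset({"gpt-4-turbo"}))
print(used)
```

In the demo, `gpt-4-turbo` is skipped as deprecated, `gpt-5.2` fails with a 5xx, and the request lands on the last model in the chain.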

Response Metadata

When a fallback is used, the response includes a `gateflow.fallback` object that describes it:

json

{
  "model": "gpt-5.2",
  "choices": [...],
  "gateflow": {
    "fallback": {
      "used": true,
      "reason": "model_deprecated",
      "original_model": "gpt-4-turbo",
      "fallback_model": "gpt-5.2"
    }
  }
}
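Application code can check this metadata to record when a fallback served a request. The following is a minimal sketch that works on an already-parsed response; `fallback_info` is a hypothetical helper, and it assumes the `gateflow.fallback` object may be absent when no fallback was used.

```python
from typing import Any, Optional


def fallback_info(response: dict) -> Optional[dict]:
    """Return the fallback metadata if a fallback served the request, else None."""
    fallback: dict = response.get("gateflow", {}).get("fallback", {})
    return fallback if fallback.get("used") else None


# Payload shaped like the response above (choices omitted)
response: dict = {
    "model": "gpt-5.2",
    "gateflow": {
        "fallback": {
            "used": True,
            "reason": "model_deprecated",
            "original_model": "gpt-4-turbo",
            "fallback_model": "gpt-5.2",
        }
    },
}

info = fallback_info(response)
if info:
    print(f"{info['original_model']} -> {info['fallback_model']} ({info['reason']})")
```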

Logging

All fallback events are logged. Query them by event type:

bash

curl https://api.gateflow.ai/v1/management/logs \
  -H "Authorization: Bearer gw_prod_admin_key" \
  -G -d "event_type=fallback_activated"

Fallback Chains

Creating Chains

bash

curl -X POST https://api.gateflow.ai/v1/management/fallback-chains \
  -H "Authorization: Bearer gw_prod_admin_key" \
  -H "Content-Type: application/json" \
  -d '{
    "name": "GPT family fallback",
    "primary": "gpt-4-turbo",
    "fallbacks": [
      {
        "model": "gpt-5.2",
        "priority": 1
      },
      {
        "model": "gpt-5",
        "priority": 2
      },
      {
        "model": "claude-sonnet-4-5-20250929",
        "priority": 3,
        "note": "Cross-provider fallback"
      }
    ]
  }'

Chain Priority

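Fallbacks are tried in ascending `priority` order. In the chain above, a failed `gpt-4-turbo` request goes to `gpt-5.2` first, then `gpt-5`, then `claude-sonnet-4-5-20250929`.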
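As a minimal sketch of that ordering, assuming a lower `priority` value is tried earlier (the source does not state this explicitly); `attempt_order` is a hypothetical helper, not part of GateFlow:

```python
def attempt_order(chain: dict) -> list:
    """Primary first, then fallbacks sorted by ascending priority."""
    fallbacks = sorted(chain["fallbacks"], key=lambda entry: entry["priority"])
    return [chain["primary"]] + [entry["model"] for entry in fallbacks]


# Same chain as in "Creating Chains", listed out of order on purpose
chain = {
    "name": "GPT family fallback",
    "primary": "gpt-4-turbo",
    "fallbacks": [
        {"model": "claude-sonnet-4-5-20250929", "priority": 3},
        {"model": "gpt-5.2", "priority": 1},
        {"model": "gpt-5", "priority": 2},
    ],
}

order = attempt_order(chain)
print(order)
```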

Conditional Fallbacks

Route to different fallbacks based on conditions:

json

{
  "primary": "gpt-4-turbo",
  "fallbacks": [
    {
      "model": "claude-opus-4-5-20251107",
      "condition": {"context_length_gt": 100000}
    },
    {
      "model": "gpt-5.2",
      "condition": {"always": true}
    }
  ]
}
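One way to read this configuration is "first matching condition wins, top to bottom". The sketch below evaluates it under that assumption; `matches` and `pick_fallback` are hypothetical helpers, and the unit of `context_length_gt` (tokens or characters) is not stated in the source, so the value is treated as an opaque number.

```python
from typing import Optional


def matches(condition: dict, context_length: int) -> bool:
    """Evaluate the two condition keys that appear in the example above."""
    if condition.get("always"):
        return True
    limit = condition.get("context_length_gt")
    return limit is not None and context_length > limit


def pick_fallback(fallbacks: list, context_length: int) -> Optional[str]:
    """Return the first fallback whose condition holds, or None."""
    for entry in fallbacks:
        if matches(entry["condition"], context_length):
            return entry["model"]
    return None


fallbacks = [
    {"model": "claude-opus-4-5-20251107", "condition": {"context_length_gt": 100000}},
    {"model": "gpt-5.2", "condition": {"always": True}},
]

long_request = pick_fallback(fallbacks, context_length=150000)
short_request = pick_fallback(fallbacks, context_length=2000)
```

Under this reading, the `always` entry acts as a catch-all and belongs last; placed first, it would shadow every later condition.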

Deprecation-Specific Fallbacks

Configure what happens when a specific model is deprecated:

bash

curl -X POST https://api.gateflow.ai/v1/management/deprecation-fallbacks \
  -H "Authorization: Bearer gw_prod_admin_key" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-4-turbo",
    "on_deprecation": {
      "action": "fallback",
      "fallback_to": "gpt-5.2",
      "notify": true
    }
  }'

Available Actions

| Action | Description |
| --- | --- |
| `fallback` | Route to fallback model |
| `fail` | Return error (no fallback) |
| `queue` | Queue request until migration complete |

Monitoring Fallbacks

Dashboard Metrics

The dashboard shows:

- Fallback activation rate
- Fallback reasons breakdown
- Model distribution during fallbacks
- Cost impact of fallbacks

Alerts

Configure alerts for fallback activity:

bash

curl -X POST https://api.gateflow.ai/v1/management/alerts \
  -H "Authorization: Bearer gw_prod_admin_key" \
  -H "Content-Type: application/json" \
  -d '{
    "name": "High fallback rate",
    "condition": {
      "metric": "fallback_rate",
      "operator": "gt",
      "threshold": 0.1,
      "window_minutes": 15
    },
    "notify": {
      "channels": ["slack", "email"]
    }
  }'
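To make the alert condition concrete, the sketch below evaluates it locally. It assumes `fallback_rate` means the share of requests in the window that were served by a fallback, which the source does not define; `fallback_rate`, `alert_fires`, and the event tuples are hypothetical, and only the `gt` operator from the example is handled.

```python
import operator
from datetime import datetime, timedelta, timezone

# Only "gt" appears in the alert example; other operators are not assumed here
OPERATORS = {"gt": operator.gt}


def fallback_rate(events: list, now: datetime, window_minutes: int) -> float:
    """Share of requests inside the window that used a fallback.

    `events` is a list of (timestamp, used_fallback) pairs.
    """
    cutoff = now - timedelta(minutes=window_minutes)
    recent = [used for ts, used in events if ts >= cutoff]
    return sum(recent) / len(recent) if recent else 0.0


def alert_fires(condition: dict, rate: float) -> bool:
    """Apply the condition's operator and threshold to a measured rate."""
    compare = OPERATORS[condition["operator"]]
    return compare(rate, condition["threshold"])


condition = {
    "metric": "fallback_rate",
    "operator": "gt",
    "threshold": 0.1,
    "window_minutes": 15,
}

now = datetime(2026, 4, 1, 12, 0, tzinfo=timezone.utc)
# Ten requests in the last ten minutes; the ones at 5 and 10 minutes used a fallback
events = [(now - timedelta(minutes=m), m % 5 == 0) for m in range(1, 11)]
# One older fallback that falls outside the 15-minute window
events.append((now - timedelta(minutes=60), True))

rate = fallback_rate(events, now, condition["window_minutes"])
fires = alert_fires(condition, rate)
```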

Real-Time Status

Check current fallback status:

bash

curl https://api.gateflow.ai/v1/management/fallback-status \
  -H "Authorization: Bearer gw_prod_admin_key"

Response:

json

{
  "active_fallbacks": [
    {
      "original_model": "gpt-4-turbo",
      "fallback_model": "gpt-5.2",
      "reason": "model_deprecated",
      "activated_at": "2026-04-01T00:00:00Z",
      "requests_routed": 15420
    }
  ],
  "fallback_rate_1h": 0.12,
  "fallback_rate_24h": 0.08
}
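A monitoring script might summarize this payload and flag a rising trend. The sketch below works on the parsed JSON only; `describe_active` and `rate_is_rising` are hypothetical helpers, and comparing the 1-hour rate against the 24-hour rate is one possible heuristic rather than anything GateFlow prescribes.

```python
def describe_active(status: dict) -> list:
    """One human-readable line per active fallback."""
    return [
        f"{item['original_model']} -> {item['fallback_model']} "
        f"({item['reason']}, {item['requests_routed']} requests)"
        for item in status["active_fallbacks"]
    ]


def rate_is_rising(status: dict) -> bool:
    """True when the last hour's fallback rate exceeds the 24-hour rate."""
    return status["fallback_rate_1h"] > status["fallback_rate_24h"]


# Payload copied from the status response above
status = {
    "active_fallbacks": [
        {
            "original_model": "gpt-4-turbo",
            "fallback_model": "gpt-5.2",
            "reason": "model_deprecated",
            "activated_at": "2026-04-01T00:00:00Z",
            "requests_routed": 15420,
        }
    ],
    "fallback_rate_1h": 0.12,
    "fallback_rate_24h": 0.08,
}

lines = describe_active(status)
rising = rate_is_rising(status)
```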

Testing Fallbacks

Simulate Deprecation

Test how your system handles fallbacks:

bash

curl -X POST https://api.gateflow.ai/v1/management/fallback-chains/test \
  -H "Authorization: Bearer gw_prod_admin_key" \
  -H "Content-Type: application/json" \
  -d '{
    "primary": "gpt-4-turbo",
    "simulate": "deprecation",
    "test_requests": 10
  }'

Dry Run

See what would happen without actually routing:

bash

curl -X POST https://api.gateflow.ai/v1/chat/completions \
  -H "Authorization: Bearer gw_prod_admin_key" \
  -H "X-GateFlow-Dry-Run: true" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-4-turbo",
    "messages": [...]
  }'

Response:

json

{
  "dry_run": true,
  "would_route_to": "gpt-5.2",
  "reason": "model_deprecated",
  "fallback_chain": ["gpt-4-turbo", "gpt-5.2", "claude-sonnet-4-5-20250929"]
}

Best Practices

1. Always Configure Fallbacks

Every model should have at least one fallback:

json

{
  "fallback_chains": [
    {"primary": "gpt-5.2", "fallbacks": ["claude-sonnet-4-5-20250929"]},
    {"primary": "claude-sonnet-4-5-20250929", "fallbacks": ["gpt-5.2"]},
    {"primary": "gpt-5-mini", "fallbacks": ["claude-haiku-4-5-20251015", "gemini-2.5-flash"]}
  ]
}
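A small check can catch models that were left without a fallback. This sketch assumes you keep the configuration in the shape shown above and maintain your own list of models in use; `models_without_fallback` is a hypothetical helper, not a GateFlow API.

```python
def models_without_fallback(config: dict, models_in_use: set) -> set:
    """Return the models in use that have no chain with at least one fallback."""
    covered = {
        chain["primary"]
        for chain in config["fallback_chains"]
        if chain.get("fallbacks")
    }
    return models_in_use - covered


config = {
    "fallback_chains": [
        {"primary": "gpt-5.2", "fallbacks": ["claude-sonnet-4-5-20250929"]},
        {"primary": "claude-sonnet-4-5-20250929", "fallbacks": ["gpt-5.2"]},
        {"primary": "gpt-5-mini", "fallbacks": ["claude-haiku-4-5-20251015", "gemini-2.5-flash"]},
    ]
}

# "gpt-5" is in use here but has no chain of its own
missing = models_without_fallback(config, {"gpt-5.2", "gpt-5-mini", "gpt-5"})
print(missing)
```

Running a check like this in CI keeps the configuration in step with the models your application requests.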

2. Test Cross-Provider Fallbacks

Ensure fallbacks to different providers work:

python

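from openai import OpenAI

# Assumption: GateFlow is called through the OpenAI SDK by overriding the base URL
client = OpenAI(
    base_url="https://api.gateflow.ai/v1",
    api_key="gw_prod_admin_key",
)

# Test that the Claude fallback produces acceptable output
response = client.chat.completions.create(
    model="gpt-5.2",
    messages=[...],
    extra_body={
        "gateflow": {
            "force_fallback": "claude-sonnet-4-5-20250929"
        }
    }
)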

3. Monitor Fallback Costs

Different models have different costs. Track cost changes:

bash

curl https://api.gateflow.ai/v1/management/analytics/cost-by-fallback \
  -H "Authorization: Bearer gw_prod_admin_key"

4. Document Fallback Behavior

Let your team know which fallbacks are configured and why.

Next Steps

- Zero-Downtime Migrations: Production migration strategies
- Model Fallbacks: General fallback configuration
