# Retry Logic
Configurable retry strategies for handling transient failures from AI providers.
## Overview
GateFlow automatically retries failed requests using configurable backoff strategies.
## Default Retry Configuration
```json
{
  "retry": {
    "max_attempts": 3,
    "initial_delay_ms": 1000,
    "max_delay_ms": 30000,
    "backoff_multiplier": 2.0,
    "retryable_status_codes": [429, 500, 502, 503, 504]
  }
}
```

## Retry Strategies
### Exponential Backoff (Default)
Doubles delay between each retry:
| Attempt | Delay |
|---|---|
| 1 | 1s |
| 2 | 2s |
| 3 | 4s |
| 4 | 8s |
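The schedule above follows `initial_delay_ms * backoff_multiplier ** (attempt - 1)`, capped at `max_delay_ms`. A minimal sketch of the same calculation (the formula is inferred from the defaults and the table above, not from a published spec):

```python
def backoff_delay_ms(attempt: int,
                     initial_delay_ms: int = 1000,
                     max_delay_ms: int = 30000,
                     backoff_multiplier: float = 2.0) -> float:
    """Delay before retry attempt `attempt` (1-based), capped at max_delay_ms."""
    return min(initial_delay_ms * backoff_multiplier ** (attempt - 1), max_delay_ms)

# With the defaults, attempts 1-4 wait 1s, 2s, 4s, 8s:
print([backoff_delay_ms(n) for n in range(1, 5)])  # [1000.0, 2000.0, 4000.0, 8000.0]
```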
```python
import openai

client = openai.OpenAI(
    base_url="https://api.gateflow.ai/v1",
    api_key="gw_prod_..."
)

response = client.chat.completions.create(
    model="gpt-5.2",
    messages=[{"role": "user", "content": "Hello"}],
    extra_body={
        "gateflow": {
            "retry": {
                "strategy": "exponential",
                "max_attempts": 5,
                "initial_delay_ms": 500
            }
        }
    }
)
```

### Linear Backoff
Fixed delay between retries:
```python
response = client.chat.completions.create(
    model="gpt-5.2",
    messages=[{"role": "user", "content": "Hello"}],
    extra_body={
        "gateflow": {
            "retry": {
                "strategy": "linear",
                "max_attempts": 3,
                "delay_ms": 2000
            }
        }
    }
)
```

### Immediate Retry
No delay between attempts; useful for transient load balancer errors:
```python
response = client.chat.completions.create(
    model="gpt-5.2",
    messages=[{"role": "user", "content": "Hello"}],
    extra_body={
        "gateflow": {
            "retry": {
                "strategy": "immediate",
                "max_attempts": 2
            }
        }
    }
)
```

## Retryable Errors
### Automatically Retried
| Status Code | Error Type | Description |
|---|---|---|
| 429 | Rate Limit | Too many requests |
| 500 | Server Error | Internal provider error |
| 502 | Bad Gateway | Upstream connection error |
| 503 | Service Unavailable | Provider temporarily down |
| 504 | Gateway Timeout | Request timed out |
### Never Retried
| Status Code | Error Type | Description |
|---|---|---|
| 400 | Bad Request | Invalid request format |
| 401 | Unauthorized | Invalid API key |
| 403 | Forbidden | Insufficient permissions |
| 404 | Not Found | Model not found |
| 422 | Validation Error | Invalid parameters |
## Custom Retry Conditions
Configure which errors trigger retries:
```bash
curl -X POST https://api.gateflow.ai/v1/management/retry-policies \
  -H "Authorization: Bearer gw_prod_admin_key" \
  -H "Content-Type: application/json" \
  -d '{
    "name": "aggressive-retry",
    "max_attempts": 5,
    "initial_delay_ms": 500,
    "backoff_multiplier": 1.5,
    "retryable_status_codes": [429, 500, 502, 503, 504],
    "retryable_error_types": ["timeout", "connection_error"],
    "non_retryable_error_types": ["context_length_exceeded"]
  }'
```

## Jitter
Add randomization to prevent thundering herd:
```python
response = client.chat.completions.create(
    model="gpt-5.2",
    messages=[{"role": "user", "content": "Hello"}],
    extra_body={
        "gateflow": {
            "retry": {
                "strategy": "exponential",
                "max_attempts": 3,
                "jitter": True,  # Adds ±25% randomization
                "jitter_factor": 0.25
            }
        }
    }
)
```

## Retry Headers
GateFlow returns retry information in response headers:
| Header | Description |
|---|---|
| X-GateFlow-Retry-Count | Number of retries attempted |
| X-GateFlow-Total-Latency-Ms | Total time including retries |
| X-GateFlow-Provider-Attempts | Providers tried (comma-separated) |
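If you need these values programmatically, a sketch of turning the headers into a structured record (the helper name and sample values are illustrative; with the OpenAI Python SDK the raw headers are accessible via the with_raw_response wrapper):

```python
def retry_info(headers) -> dict:
    """Extract GateFlow retry metadata from a mapping of response headers."""
    return {
        "retries": int(headers.get("X-GateFlow-Retry-Count", 0)),
        "total_latency_ms": int(headers.get("X-GateFlow-Total-Latency-Ms", 0)),
        "providers": [p for p in
                      headers.get("X-GateFlow-Provider-Attempts", "").split(",") if p],
    }

# Example headers as they might appear on a retried request:
info = retry_info({
    "X-GateFlow-Retry-Count": "2",
    "X-GateFlow-Total-Latency-Ms": "4350",
    "X-GateFlow-Provider-Attempts": "openai,anthropic",
})
print(info)  # {'retries': 2, 'total_latency_ms': 4350, 'providers': ['openai', 'anthropic']}
```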
## Combining with Fallbacks
Retries work with model fallbacks:
```python
response = client.chat.completions.create(
    model="gpt-5.2",
    messages=[{"role": "user", "content": "Hello"}],
    extra_body={
        "gateflow": {
            "fallbacks": ["claude-sonnet-4-5-20250929", "gemini-3-pro"],
            "retry": {
                "max_attempts": 2,  # Per provider
                "initial_delay_ms": 500
            }
        }
    }
)
```

## Disabling Retries
For latency-sensitive applications:
```python
response = client.chat.completions.create(
    model="gpt-5.2",
    messages=[{"role": "user", "content": "Hello"}],
    extra_body={
        "gateflow": {
            "retry": {
                "enabled": False
            }
        }
    }
)
```

## Monitoring Retries
View retry metrics in the dashboard or via API:
```bash
curl https://api.gateflow.ai/v1/management/analytics/retries \
  -H "Authorization: Bearer gw_prod_..." \
  -G -d "start_date=2026-02-01" -d "end_date=2026-02-16"
```

Response:
```json
{
  "period": "2026-02-01 to 2026-02-16",
  "total_requests": 50000,
  "requests_with_retries": 1250,
  "retry_rate": 0.025,
  "avg_retries_per_failed": 1.8,
  "by_provider": {
    "openai": {"retry_rate": 0.02},
    "anthropic": {"retry_rate": 0.015},
    "google": {"retry_rate": 0.03}
  },
  "by_error_type": {
    "rate_limit": 800,
    "timeout": 300,
    "server_error": 150
  }
}
```

## Next Steps
- Rate Limits - Understanding rate limiting
- Model Fallbacks - Configure fallback chains
- Request Queuing - Queue management