
Retry Logic

Configurable retry strategies for handling transient failures from AI providers.

Overview

GateFlow automatically retries failed requests, waiting between attempts according to a configurable backoff strategy.

Default Retry Configuration

json
{
  "retry": {
    "max_attempts": 3,
    "initial_delay_ms": 1000,
    "max_delay_ms": 30000,
    "backoff_multiplier": 2.0,
    "retryable_status_codes": [429, 500, 502, 503, 504]
  }
}

Retry Strategies

Exponential Backoff (Default)

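Doubles the delay before each successive retry. With the default initial_delay_ms of 1000 and backoff_multiplier of 2.0, the delays are: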

| Attempt | Delay |
| --- | --- |
| 1 | 1s |
| 2 | 2s |
| 3 | 4s |
| 4 | 8s |

python
import openai

client = openai.OpenAI(
    base_url="https://api.gateflow.ai/v1",
    api_key="gw_prod_..."
)

response = client.chat.completions.create(
    model="gpt-5.2",
    messages=[{"role": "user", "content": "Hello"}],
    extra_body={
        "gateflow": {
            "retry": {
                "strategy": "exponential",
                "max_attempts": 5,
                "initial_delay_ms": 500
            }
        }
    }
)
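
The schedule in the table above can be reproduced in a few lines of Python. This is a minimal sketch, assuming the delay before retry n is initial_delay_ms multiplied by backoff_multiplier to the power n-1 and then capped at max_delay_ms. The function name `backoff_delays` is hypothetical, and GateFlow's internal calculation may differ.

```python
def backoff_delays(retries, initial_delay_ms=1000, backoff_multiplier=2.0,
                   max_delay_ms=30000):
    """Return the assumed delay in ms before each of the first `retries` retries."""
    delays = []
    for n in range(retries):
        # Assumption: the delay grows geometrically and is capped at max_delay_ms.
        delay = initial_delay_ms * backoff_multiplier ** n
        delays.append(int(min(delay, max_delay_ms)))
    return delays


print(backoff_delays(4))                        # [1000, 2000, 4000, 8000]
print(backoff_delays(4, initial_delay_ms=500))  # [500, 1000, 2000, 4000]
```

The second call uses the initial_delay_ms of 500 from the request example, which halves every delay in the schedule.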

Linear Backoff

Waits a fixed delay (delay_ms) between retries:

python
response = client.chat.completions.create(
    model="gpt-5.2",
    messages=[{"role": "user", "content": "Hello"}],
    extra_body={
        "gateflow": {
            "retry": {
                "strategy": "linear",
                "max_attempts": 3,
                "delay_ms": 2000
            }
        }
    }
)

Immediate Retry

Retries without any delay, which is useful for load balancer errors:

python
response = client.chat.completions.create(
    model="gpt-5.2",
    messages=[{"role": "user", "content": "Hello"}],
    extra_body={
        "gateflow": {
            "retry": {
                "strategy": "immediate",
                "max_attempts": 2
            }
        }
    }
)

Retryable Errors

Automatically Retried

| Status Code | Error Type | Description |
| --- | --- | --- |
| 429 | Rate Limit | Too many requests |
| 500 | Server Error | Internal provider error |
| 502 | Bad Gateway | Upstream connection error |
| 503 | Service Unavailable | Provider temporarily down |
| 504 | Gateway Timeout | Request timed out |

Never Retried

| Status Code | Error Type | Description |
| --- | --- | --- |
| 400 | Bad Request | Invalid request format |
| 401 | Unauthorized | Invalid API key |
| 403 | Forbidden | Insufficient permissions |
| 404 | Not Found | Model not found |
| 422 | Validation Error | Invalid parameters |
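
The split between the two lists amounts to a membership test on the status code. The following is a minimal sketch using the default retryable_status_codes from the configuration above. The helper `is_retryable` is illustrative only and is not part of any GateFlow SDK.

```python
# Default retryable codes, copied from the default retry configuration.
DEFAULT_RETRYABLE_STATUS_CODES = frozenset({429, 500, 502, 503, 504})


def is_retryable(status_code, retryable=DEFAULT_RETRYABLE_STATUS_CODES):
    """Return True if a response with this status code would be retried."""
    return status_code in retryable


for code in (429, 503, 400, 404):
    print(code, is_retryable(code))
```

Client errors such as 400 or 401 fall outside the set because repeating the same invalid request would fail the same way.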

Custom Retry Conditions

Configure which errors trigger retries:

bash
curl -X POST https://api.gateflow.ai/v1/management/retry-policies \
  -H "Authorization: Bearer gw_prod_admin_key" \
  -H "Content-Type: application/json" \
  -d '{
    "name": "aggressive-retry",
    "max_attempts": 5,
    "initial_delay_ms": 500,
    "backoff_multiplier": 1.5,
    "retryable_status_codes": [429, 500, 502, 503, 504],
    "retryable_error_types": ["timeout", "connection_error"],
    "non_retryable_error_types": ["context_length_exceeded"]
  }'

Jitter

Add randomization to retry delays to avoid a thundering herd, where many clients retry at the same moment:

python
response = client.chat.completions.create(
    model="gpt-5.2",
    messages=[{"role": "user", "content": "Hello"}],
    extra_body={
        "gateflow": {
            "retry": {
                "strategy": "exponential",
                "max_attempts": 3,
                "jitter": True,  # Enable randomized delays
                "jitter_factor": 0.25  # Adds ±25% randomization
            }
        }
    }
)
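
To make the effect of jitter_factor concrete, here is a minimal sketch of one common way to apply jitter, assuming the randomization is uniform and symmetric around the computed delay. The function `apply_jitter` is hypothetical; the distribution GateFlow actually uses is not stated here.

```python
import random


def apply_jitter(delay_ms, jitter_factor=0.25, rng=None):
    """Scale delay_ms by a random factor in [1 - jitter_factor, 1 + jitter_factor]."""
    if rng is None:
        rng = random
    # Assumption: uniform, symmetric jitter around the base delay.
    return delay_ms * (1 + rng.uniform(-jitter_factor, jitter_factor))


rng = random.Random(42)  # seeded only so the sketch is repeatable
samples = [apply_jitter(1000, 0.25, rng) for _ in range(5)]
print([round(s) for s in samples])
```

With a base delay of 1000 ms and a jitter_factor of 0.25, every sample falls between 750 ms and 1250 ms, so clients that failed together spread their retries over a 500 ms window.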

Retry Headers

GateFlow returns retry information in response headers:

| Header | Description |
| --- | --- |
| X-GateFlow-Retry-Count | Number of retries attempted |
| X-GateFlow-Total-Latency-Ms | Total time including retries |
| X-GateFlow-Provider-Attempts | Providers tried (comma-separated) |
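
These headers can be turned into a small structure for logging. The sketch below assumes the count and latency arrive as numeric strings; the helper `parse_retry_headers` and the sample values are made up for illustration. With the OpenAI Python SDK, response headers are available through `client.chat.completions.with_raw_response.create(...)`, whose result exposes `.headers`.

```python
def parse_retry_headers(headers):
    """Extract GateFlow retry information from a mapping of response headers."""
    providers = headers.get("X-GateFlow-Provider-Attempts", "")
    return {
        # Assumption: missing headers mean no retries were recorded.
        "retry_count": int(headers.get("X-GateFlow-Retry-Count", "0")),
        "total_latency_ms": float(headers.get("X-GateFlow-Total-Latency-Ms", "0")),
        "providers": [p.strip() for p in providers.split(",") if p.strip()],
    }


# Hypothetical header values, for illustration only.
example = {
    "X-GateFlow-Retry-Count": "2",
    "X-GateFlow-Total-Latency-Ms": "3150",
    "X-GateFlow-Provider-Attempts": "openai, anthropic",
}
print(parse_retry_headers(example))
```

A plain dict lookup is case-sensitive, whereas HTTP header names are not; header objects returned by HTTP client libraries usually handle this, but a hand-built dict needs the exact casing shown.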

Combining with Fallbacks

Retries work together with model fallbacks, and max_attempts applies per provider:

python
response = client.chat.completions.create(
    model="gpt-5.2",
    messages=[{"role": "user", "content": "Hello"}],
    extra_body={
        "gateflow": {
            "fallbacks": ["claude-sonnet-4-5-20250929", "gemini-3-pro"],
            "retry": {
                "max_attempts": 2,  # Per provider
                "initial_delay_ms": 500
            }
        }
    }
)
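
Because max_attempts applies per provider, the worst-case number of upstream calls grows with the length of the fallback list. The sketch below enumerates that worst case, assuming the primary model is exhausted first and each fallback is then tried in the order listed. The function `attempt_plan` is hypothetical, and the real ordering may differ.

```python
def attempt_plan(model, fallbacks, max_attempts):
    """List (model, attempt_number) pairs in the assumed worst-case order."""
    return [
        (candidate, attempt)
        for candidate in [model, *fallbacks]
        for attempt in range(1, max_attempts + 1)
    ]


plan = attempt_plan(
    "gpt-5.2",
    ["claude-sonnet-4-5-20250929", "gemini-3-pro"],
    max_attempts=2,
)
for candidate, attempt in plan:
    print(candidate, attempt)
print(len(plan))  # 6
```

Under these assumptions the request above could make up to six upstream calls before failing, which is worth factoring into client-side timeouts.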

Disabling Retries

For latency-sensitive applications, turn retries off:

python
response = client.chat.completions.create(
    model="gpt-5.2",
    messages=[{"role": "user", "content": "Hello"}],
    extra_body={
        "gateflow": {
            "retry": {
                "enabled": False
            }
        }
    }
)

Monitoring Retries

View retry metrics in the dashboard or via API:

bash
curl https://api.gateflow.ai/v1/management/analytics/retries \
  -H "Authorization: Bearer gw_prod_..." \
  -G -d "start_date=2026-02-01" -d "end_date=2026-02-16"

Response:

json
{
  "period": "2026-02-01 to 2026-02-16",
  "total_requests": 50000,
  "requests_with_retries": 1250,
  "retry_rate": 0.025,
  "avg_retries_per_failed": 1.8,
  "by_provider": {
    "openai": {"retry_rate": 0.02},
    "anthropic": {"retry_rate": 0.015},
    "google": {"retry_rate": 0.03}
  },
  "by_error_type": {
    "rate_limit": 800,
    "timeout": 300,
    "server_error": 150
  }
}
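
The derived fields in this response can be cross-checked from the raw counts. The following is a minimal sketch using the sample values above; the helper names are illustrative, and it is an assumption that by_error_type counts requests rather than individual retry events (in the sample, the three counts sum to requests_with_retries).

```python
# Sample values copied from the response above.
stats = {
    "total_requests": 50000,
    "requests_with_retries": 1250,
    "by_error_type": {"rate_limit": 800, "timeout": 300, "server_error": 150},
}


def retry_rate(stats):
    """Fraction of all requests that needed at least one retry."""
    return stats["requests_with_retries"] / stats["total_requests"]


def error_type_shares(stats):
    """Fraction of retried requests attributed to each error type."""
    counts = stats["by_error_type"]
    total = sum(counts.values())
    return {name: count / total for name, count in counts.items()}


print(retry_rate(stats))  # 0.025
print(error_type_shares(stats))
```

In this sample, rate limits account for roughly 64% of retried requests, which suggests looking at request pacing before provider reliability.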
