Zero-Downtime Migrations

Migrate production traffic between models without interrupting service. This guide covers strategies for safe, observable model transitions.

Migration Strategies

Strategy 1: Canary Deployment

Gradually shift traffic from the source model to the target model, starting at a small percentage and increasing it at a fixed interval:

Configuration

bash
curl -X POST https://api.gateflow.ai/v1/management/migrations \
  -H "Authorization: Bearer gw_prod_admin_key" \
  -H "Content-Type: application/json" \
  -d '{
    "source_model": "gpt-4-turbo",
    "target_model": "gpt-5.2",
    "strategy": "canary",
    "config": {
      "initial_percentage": 10,
      "increment": 20,
      "increment_interval_hours": 24,
      "auto_rollback": true,
      "rollback_conditions": {
        "error_rate_increase": 0.02,
        "latency_p99_increase_factor": 1.5
      }
    }
  }'
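
To make the rollout pace concrete, the following minimal sketch computes the traffic schedule implied by `initial_percentage`, `increment`, and `increment_interval_hours`. It assumes the last step is capped at 100% and that each increment happens exactly on the interval; `canary_schedule` is a hypothetical helper, not part of the GateFlow API.

```python
def canary_schedule(initial_percentage, increment, increment_interval_hours):
    """Return (hours_since_start, target_percentage) pairs until 100% is reached."""
    if increment <= 0:
        raise ValueError("increment must be positive")
    schedule = []
    hours, percentage = 0, initial_percentage
    while percentage < 100:
        schedule.append((hours, percentage))
        hours += increment_interval_hours
        percentage += increment
    # Assumption: the final step is capped at 100% rather than overshooting.
    schedule.append((hours, 100))
    return schedule


# Values taken from the canary configuration above.
for hours, percentage in canary_schedule(10, 20, 24):
    print(f"t+{hours}h: {percentage}% of traffic on target")
```

Under these assumptions, the configuration above would reach full traffic five days after the canary starts.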

Monitoring Canary

bash
curl https://api.gateflow.ai/v1/management/migrations/mig_abc123/metrics \
  -H "Authorization: Bearer gw_prod_admin_key"

Response:

json
{
  "current_split": {
    "source": 50,
    "target": 50
  },
  "metrics": {
    "source": {
      "requests": 7500,
      "error_rate": 0.001,
      "latency_p50": 450,
      "latency_p99": 890
    },
    "target": {
      "requests": 7500,
      "error_rate": 0.0008,
      "latency_p50": 320,
      "latency_p99": 560
    }
  },
  "comparison": {
    "error_rate_change": -0.0002,
    "latency_p99_change": -0.37,
    "status": "healthy"
  }
}
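
The `comparison` figures appear to be derived from the per-model metrics. The sketch below reproduces them under two assumptions: `error_rate_change` is the absolute difference (target minus source), and `latency_p99_change` is the relative change against the source. `compare` is a hypothetical helper for checking the numbers locally.

```python
def compare(source, target):
    """Derive comparison figures from per-model metrics (assumed formulas)."""
    return {
        # Absolute difference in error rate, target minus source.
        "error_rate_change": round(target["error_rate"] - source["error_rate"], 4),
        # Relative change in p99 latency, as a fraction of the source value.
        "latency_p99_change": round(
            (target["latency_p99"] - source["latency_p99"]) / source["latency_p99"], 2
        ),
    }


# Metrics copied from the example response above.
source = {"error_rate": 0.001, "latency_p99": 890}
target = {"error_rate": 0.0008, "latency_p99": 560}
print(compare(source, target))
```

Negative values mean the target model has a lower error rate and lower latency than the source.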

Strategy 2: Shadow Mode

Send a sample of production requests to both models and compare the results. Shadow traffic has no production impact:

Configuration

bash
curl -X POST https://api.gateflow.ai/v1/management/migrations \
  -H "Authorization: Bearer gw_prod_admin_key" \
  -H "Content-Type: application/json" \
  -d '{
    "source_model": "gpt-4-turbo",
    "target_model": "gpt-5.2",
    "strategy": "shadow",
    "config": {
      "duration_hours": 72,
      "sample_rate": 0.1,
      "compare_metrics": ["semantic_similarity", "output_length", "latency"]
    }
  }'

Shadow Results

bash
curl https://api.gateflow.ai/v1/management/migrations/mig_abc123/shadow-results \
  -H "Authorization: Bearer gw_prod_admin_key"

Response:

json
{
  "shadow_period": {
    "start": "2026-04-01T00:00:00Z",
    "end": "2026-04-04T00:00:00Z"
  },
  "requests_compared": 15000,
  "results": {
    "semantic_similarity": {
      "mean": 0.94,
      "p50": 0.95,
      "p5": 0.88
    },
    "output_length_ratio": {
      "mean": 1.02,
      "note": "Target produces 2% longer outputs"
    },
    "latency_comparison": {
      "source_p50": 450,
      "target_p50": 320,
      "improvement": "29%"
    }
  },
  "recommendation": "proceed_with_migration"
}
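
If you want to apply your own go/no-go criteria to the shadow results instead of relying only on `recommendation`, a check along these lines may help. The thresholds (`min_p5_similarity`, `max_length_ratio`) are illustrative assumptions, not GateFlow defaults, and both functions are hypothetical helpers.

```python
def latency_improvement(source_p50, target_p50):
    """Fractional p50 latency reduction of the target relative to the source."""
    return (source_p50 - target_p50) / source_p50


def shadow_gate(results, min_p5_similarity=0.85, max_length_ratio=1.25):
    """Return True when the shadow results pass all locally defined checks."""
    similarity_ok = results["semantic_similarity"]["p5"] >= min_p5_similarity
    length_ok = results["output_length_ratio"]["mean"] <= max_length_ratio
    latency = results["latency_comparison"]
    latency_ok = latency["target_p50"] <= latency["source_p50"]
    return similarity_ok and length_ok and latency_ok


# Figures copied from the example response above.
results = {
    "semantic_similarity": {"mean": 0.94, "p50": 0.95, "p5": 0.88},
    "output_length_ratio": {"mean": 1.02},
    "latency_comparison": {"source_p50": 450, "target_p50": 320},
}
print(f"{latency_improvement(450, 320):.0%}")
print(shadow_gate(results))
```

Gating on the 5th percentile of similarity rather than the mean is a deliberate choice here: it focuses on the worst-matching responses, which are the ones most likely to surprise users.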

Strategy 3: Blue-Green

Switch all traffic to the target model at once, while keeping the source model warm for quick rollback:

Configuration

bash
curl -X POST https://api.gateflow.ai/v1/management/migrations \
  -H "Authorization: Bearer gw_prod_admin_key" \
  -H "Content-Type: application/json" \
  -d '{
    "source_model": "gpt-4-turbo",
    "target_model": "gpt-5.2",
    "strategy": "blue_green",
    "config": {
      "keep_blue_warm": true,
      "auto_rollback_window_minutes": 30
    }
  }'

Execute the switch:

bash
curl -X POST https://api.gateflow.ai/v1/management/migrations/mig_abc123/switch \
  -H "Authorization: Bearer gw_prod_admin_key"

Strategy 4: Feature Flag

Route requests based on custom criteria, such as a request header or an API key tag:

bash
curl -X POST https://api.gateflow.ai/v1/management/migrations \
  -H "Authorization: Bearer gw_prod_admin_key" \
  -H "Content-Type: application/json" \
  -d '{
    "source_model": "gpt-4-turbo",
    "target_model": "gpt-5.2",
    "strategy": "feature_flag",
    "config": {
      "flag_name": "use_gpt52",
      "rules": [
        {"condition": {"header": "X-Beta-User"}, "target": true},
        {"condition": {"api_key_tag": "internal"}, "target": true},
        {"condition": {"default": true}, "target": false}
      ]
    }
  }'
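
The rule list above can be read as a first-match evaluation. The sketch below illustrates that reading; it assumes rules are checked in order, that a `header` condition matches when the header is present regardless of its value, and that header names are case-insensitive. The actual gateway semantics may differ, and `routes_to_target` is a hypothetical helper.

```python
def routes_to_target(rules, headers, api_key_tags):
    """Return True if the first matching rule sends the request to the target model."""
    header_names = {name.lower() for name in headers}
    for rule in rules:
        condition = rule["condition"]
        if "header" in condition and condition["header"].lower() in header_names:
            return rule["target"]
        if "api_key_tag" in condition and condition["api_key_tag"] in api_key_tags:
            return rule["target"]
        if condition.get("default"):
            return rule["target"]
    # Assumption: with no matching rule, traffic stays on the source model.
    return False


# Rules copied from the configuration above.
rules = [
    {"condition": {"header": "X-Beta-User"}, "target": True},
    {"condition": {"api_key_tag": "internal"}, "target": True},
    {"condition": {"default": True}, "target": False},
]
print(routes_to_target(rules, {"X-Beta-User": "1"}, set()))
print(routes_to_target(rules, {}, {"internal"}))
print(routes_to_target(rules, {}, {"external"}))
```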

Rollback Procedures

Automatic Rollback

Configure conditions for automatic rollback:

json
{
  "auto_rollback": true,
  "rollback_conditions": {
    "error_rate_absolute": 0.05,
    "error_rate_increase": 0.02,
    "latency_p99_ms": 2000,
    "latency_p99_increase_factor": 2.0,
    "quality_score_drop": 0.1
  }
}
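
To reason about when these conditions would fire, the sketch below evaluates them against a baseline and a current metrics snapshot. It assumes that any single breached condition triggers a rollback, that `error_rate_increase` and `quality_score_drop` are absolute differences from the baseline, and that `latency_p99_increase_factor` is a ratio to the baseline. The `quality_score` field and the `rollback_reasons` helper are hypothetical.

```python
def rollback_reasons(conditions, baseline, current):
    """Return the names of all breached rollback conditions (empty list if healthy)."""
    checks = {
        "error_rate_absolute": current["error_rate"] > conditions["error_rate_absolute"],
        "error_rate_increase": (
            current["error_rate"] - baseline["error_rate"] > conditions["error_rate_increase"]
        ),
        "latency_p99_ms": current["latency_p99"] > conditions["latency_p99_ms"],
        "latency_p99_increase_factor": (
            current["latency_p99"]
            > baseline["latency_p99"] * conditions["latency_p99_increase_factor"]
        ),
        "quality_score_drop": (
            baseline["quality_score"] - current["quality_score"]
            > conditions["quality_score_drop"]
        ),
    }
    return [name for name, breached in checks.items() if breached]


# Thresholds copied from the configuration above.
conditions = {
    "error_rate_absolute": 0.05,
    "error_rate_increase": 0.02,
    "latency_p99_ms": 2000,
    "latency_p99_increase_factor": 2.0,
    "quality_score_drop": 0.1,
}
baseline = {"error_rate": 0.001, "latency_p99": 890, "quality_score": 0.95}
degraded = {"error_rate": 0.03, "latency_p99": 1200, "quality_score": 0.93}
print(rollback_reasons(conditions, baseline, degraded))
```

In this example only the relative error-rate condition fires: a 3% error rate is below the 5% absolute ceiling but is more than 2 points above the 0.1% baseline. Pairing absolute and relative thresholds this way catches regressions on services whose baseline is already very low.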

Manual Rollback

bash
curl -X POST https://api.gateflow.ai/v1/management/migrations/mig_abc123/rollback \
  -H "Authorization: Bearer gw_prod_admin_key" \
  -H "Content-Type: application/json" \
  -d '{
    "reason": "User reports of quality issues"
  }'

Rollback Speed

Strategy        Rollback Time
Blue-Green      Instant
Canary          Instant
Feature Flag    Instant
Shadow          N/A (no production impact)

Pre-Migration Checklist

1. Compatibility Verification

bash
# Check feature compatibility
curl -X POST https://api.gateflow.ai/v1/management/migrations/compatibility-check \
  -H "Authorization: Bearer gw_prod_admin_key" \
  -H "Content-Type: application/json" \
  -d '{
    "source_model": "gpt-4-turbo",
    "target_model": "gpt-5.2"
  }'

2. Cost Impact Analysis

bash
# Estimate cost changes
curl -X POST https://api.gateflow.ai/v1/management/migrations/cost-estimate \
  -H "Authorization: Bearer gw_prod_admin_key" \
  -H "Content-Type: application/json" \
  -d '{
    "source_model": "gpt-4-turbo",
    "target_model": "gpt-5.2",
    "based_on_days": 30
  }'

Response:

json
{
  "current_monthly_cost": 2345.67,
  "estimated_monthly_cost": 1567.89,
  "savings": 777.78,
  "savings_percentage": 33.2
}
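
The savings fields follow directly from the two cost figures. A small sketch, assuming `savings_percentage` is expressed relative to the current cost and rounded to one decimal place (`savings_summary` is a hypothetical helper):

```python
def savings_summary(current_monthly_cost, estimated_monthly_cost):
    """Compute absolute and percentage savings from the two monthly cost figures."""
    savings = round(current_monthly_cost - estimated_monthly_cost, 2)
    # Assumption: the percentage is relative to the current monthly cost.
    savings_percentage = round(100 * savings / current_monthly_cost, 1)
    return {"savings": savings, "savings_percentage": savings_percentage}


# Figures copied from the example response above.
print(savings_summary(2345.67, 1567.89))
```

A negative `savings` value would indicate that the target model is expected to cost more than the source.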

3. Staging Test

bash
# Run test in staging
curl -X POST https://api.gateflow.ai/v1/management/migrations/test \
  -H "Authorization: Bearer gw_prod_admin_key" \
  -H "Content-Type: application/json" \
  -d '{
    "source_model": "gpt-4-turbo",
    "target_model": "gpt-5.2",
    "workspace": "staging",
    "test_requests": 1000
  }'

4. Alert Configuration

bash
# Set up migration alerts
curl -X POST https://api.gateflow.ai/v1/management/alerts \
  -H "Authorization: Bearer gw_prod_admin_key" \
  -H "Content-Type: application/json" \
  -d '{
    "name": "Migration Error Rate",
    "condition": {
      "metric": "migration_error_rate_increase",
      "threshold": 0.01
    },
    "notify": ["slack", "pagerduty"]
  }'

Migration Playbook

Week Before

  1. Run compatibility check
  2. Estimate cost impact
  3. Test in staging
  4. Configure rollback conditions
  5. Set up alerts
  6. Notify team of migration window

Day Of

  1. Verify staging tests passed
  2. Confirm team availability
  3. Start canary at low percentage
  4. Monitor metrics closely
  5. Gradually increase traffic

Post-Migration

  1. Review metrics for 24-48 hours
  2. Check user feedback channels
  3. Document any issues
  4. Update runbooks
  5. Remove old model references

Monitoring During Migration

Key Metrics

Metric              Alert Threshold
Error rate          > 1% or +0.5% from baseline
Latency P99         > 2x baseline
Quality score       < 90% similarity
Cost per request    > 1.5x baseline

Dashboard Setup

Create a migration dashboard showing:

  • Traffic split (source vs target)
  • Error rates (side by side)
  • Latency percentiles (side by side)
  • Cost comparison
  • Quality metrics

Example: GPT-4-turbo to GPT-5.2 Migration

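Timeline

The migration runs in two phases: 24 hours of shadow mode sampling 10% of requests, followed by a canary rollout that steps through 5%, 20%, 50%, 80%, and 100% of traffic, holding each step for 24 hours.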

Configuration

bash
curl -X POST https://api.gateflow.ai/v1/management/migrations \
  -H "Authorization: Bearer gw_prod_admin_key" \
  -H "Content-Type: application/json" \
  -d '{
    "name": "GPT-4-turbo to GPT-5.2",
    "source_model": "gpt-4-turbo",
    "target_model": "gpt-5.2",
    "phases": [
      {
        "strategy": "shadow",
        "duration_hours": 24,
        "sample_rate": 0.1
      },
      {
        "strategy": "canary",
        "steps": [5, 20, 50, 80, 100],
        "step_duration_hours": 24,
        "auto_rollback": true
      }
    ]
  }'
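
The phased configuration above implies a timeline that can be derived mechanically. The sketch below assumes phases run back to back with no gap and that each canary step starts as soon as the previous one ends; `phase_timeline` is a hypothetical helper, not a GateFlow API.

```python
def phase_timeline(phases):
    """Return (start_hour, description) pairs for each phase step, in order."""
    timeline, hours = [], 0
    for phase in phases:
        if phase["strategy"] == "shadow":
            timeline.append((hours, f"shadow, {phase['sample_rate']:.0%} sampled"))
            hours += phase["duration_hours"]
        elif phase["strategy"] == "canary":
            for step in phase["steps"]:
                timeline.append((hours, f"canary, {step}% on target"))
                hours += phase["step_duration_hours"]
        else:
            raise ValueError(f"unsupported strategy: {phase['strategy']}")
    return timeline


# Phases copied from the configuration above.
phases = [
    {"strategy": "shadow", "duration_hours": 24, "sample_rate": 0.1},
    {"strategy": "canary", "steps": [5, 20, 50, 80, 100], "step_duration_hours": 24},
]
for start_hour, description in phase_timeline(phases):
    print(f"t+{start_hour}h: {description}")
```

Under these assumptions the target model would begin serving all traffic 120 hours (five days) after the migration starts.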
