
Rate Limits

Control agent request rates and resource usage.

Overview

Rate limits prevent agents from consuming excessive resources and help manage costs.

Limit Types

| Limit | Description | Scope |
| --- | --- | --- |
| requests_per_minute | Tool calls per minute | Per agent |
| cost_per_session | Max cost per session | Per session |
| cost_daily | Max daily cost | Per day |
| cost_monthly | Max monthly cost | Per month |
| audio_minutes_daily | Audio processing minutes | Per day |
| ocr_pages_daily | OCR pages processed | Per day |
| concurrent_sessions | Simultaneous sessions | Per agent |

Configuring Limits

At Agent Creation

bash
curl -X POST https://api.gateflow.ai/v1/mcp/agents \
  -H "Authorization: Bearer gw_prod_admin_key" \
  -H "Content-Type: application/json" \
  -d '{
    "name": "Support Bot",
    "permissions": {
      "tools": ["llm/chat", "retrieval/search"]
    },
    "limits": {
      "requests_per_minute": 60,
      "cost_per_session": 5.00,
      "cost_daily": 100.00,
      "cost_monthly": 2000.00
    }
  }'

Updating Limits

bash
curl -X PATCH https://api.gateflow.ai/v1/mcp/agents/agent_abc123 \
  -H "Authorization: Bearer gw_prod_admin_key" \
  -H "Content-Type: application/json" \
  -d '{
    "limits": {
      "requests_per_minute": 120,
      "cost_daily": 200.00
    }
  }'
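
For callers who script limit changes rather than running curl by hand, the same update can be sketched in Python with only the standard library. The helper name `build_limits_patch` and the `API_BASE` constant are illustrative assumptions; the URL, headers, and body mirror the curl example above. The request is built but not sent, so it can be inspected first.

```python
import json
import urllib.request

# Base URL taken from the curl examples; the constant name is an assumption.
API_BASE = "https://api.gateflow.ai/v1/mcp"


def build_limits_patch(agent_id: str, limits: dict, admin_key: str) -> urllib.request.Request:
    """Build (but do not send) a PATCH request that updates an agent's limits."""
    body = json.dumps({"limits": limits}).encode("utf-8")
    return urllib.request.Request(
        url=f"{API_BASE}/agents/{agent_id}",
        data=body,
        method="PATCH",
        headers={
            "Authorization": f"Bearer {admin_key}",
            "Content-Type": "application/json",
        },
    )


request = build_limits_patch(
    "agent_abc123",
    {"requests_per_minute": 120, "cost_daily": 200.00},
    admin_key="gw_prod_admin_key",
)
# To send it: urllib.request.urlopen(request)
```

Only the limits included in the body are sent, which matches the partial update shown in the curl example.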

Limit Templates

Free Tier

yaml
limits:
  requests_per_minute: 10
  cost_per_session: 0.50
  cost_daily: 5.00
  cost_monthly: 50.00
  concurrent_sessions: 1

Standard Tier

yaml
limits:
  requests_per_minute: 60
  cost_per_session: 5.00
  cost_daily: 100.00
  cost_monthly: 1000.00
  concurrent_sessions: 5

Enterprise Tier

yaml
limits:
  requests_per_minute: 300
  cost_per_session: 50.00
  cost_daily: 1000.00
  cost_monthly: 20000.00
  concurrent_sessions: 50
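
One way to apply these templates from code is to keep them in a lookup table and copy the chosen tier into the agent-creation payload. This is a minimal sketch: `TIER_TEMPLATES` and `limits_for_tier` are hypothetical names, and the values are copied from the templates above.

```python
from typing import Optional

# Values copied from the tier templates above.
TIER_TEMPLATES = {
    "free": {
        "requests_per_minute": 10,
        "cost_per_session": 0.50,
        "cost_daily": 5.00,
        "cost_monthly": 50.00,
        "concurrent_sessions": 1,
    },
    "standard": {
        "requests_per_minute": 60,
        "cost_per_session": 5.00,
        "cost_daily": 100.00,
        "cost_monthly": 1000.00,
        "concurrent_sessions": 5,
    },
    "enterprise": {
        "requests_per_minute": 300,
        "cost_per_session": 50.00,
        "cost_daily": 1000.00,
        "cost_monthly": 20000.00,
        "concurrent_sessions": 50,
    },
}


def limits_for_tier(tier: str, overrides: Optional[dict] = None) -> dict:
    """Return a copy of a tier template with optional per-agent overrides applied."""
    if tier not in TIER_TEMPLATES:
        raise ValueError(f"Unknown tier: {tier!r}")
    limits = dict(TIER_TEMPLATES[tier])  # copy so the template is never mutated
    limits.update(overrides or {})
    return limits


# Body for the agent-creation request, using the standard tier with a lower daily cap.
payload = {
    "name": "Support Bot",
    "permissions": {"tools": ["llm/chat", "retrieval/search"]},
    "limits": limits_for_tier("standard", {"cost_daily": 50.00}),
}
```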

Rate Limit Errors

Requests Per Minute

json
{
  "error": {
    "type": "rate_limit_error",
    "code": "rpm_exceeded",
    "message": "Agent exceeded requests per minute limit",
    "limit": 60,
    "current": 62,
    "retry_after_seconds": 45
  }
}

Cost Limit

json
{
  "error": {
    "type": "rate_limit_error",
    "code": "cost_limit_exceeded",
    "message": "Session cost limit exceeded",
    "limit_type": "cost_per_session",
    "limit": 5.00,
    "current": 5.12
  }
}

Daily Limit

json
{
  "error": {
    "type": "rate_limit_error",
    "code": "daily_limit_exceeded",
    "message": "Daily cost limit exceeded",
    "limit": 100.00,
    "current": 100.45,
    "resets_at": "2026-02-17T00:00:00Z"
  }
}
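
The three error shapes call for different reactions, which the following sketch makes explicit. It assumes the caller has already extracted the inner `error` object from the response; the function name `seconds_until_retry` is hypothetical, and treating the session cost error as non-retryable is this sketch's choice rather than documented behavior.

```python
from datetime import datetime, timezone
from typing import Optional


def seconds_until_retry(error: dict, now: Optional[datetime] = None) -> Optional[float]:
    """Return how long to wait before retrying, or None if waiting will not help."""
    now = now or datetime.now(timezone.utc)
    code = error.get("code")

    if code == "rpm_exceeded":
        # The RPM error carries an explicit delay; fall back to 60 seconds.
        return float(error.get("retry_after_seconds", 60))

    if code == "daily_limit_exceeded" and "resets_at" in error:
        # Normalize the trailing "Z" so fromisoformat accepts it on older Pythons.
        resets_at = datetime.fromisoformat(error["resets_at"].replace("Z", "+00:00"))
        return max(0.0, (resets_at - now).total_seconds())

    # cost_limit_exceeded has no reset time in the example payload,
    # so this sketch treats it as non-retryable.
    return None
```

A caller can sleep for the returned number of seconds, or give up immediately when the result is `None`.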

Checking Usage

From Agent

python
from gateflow_mcp import MCPClient

client = MCPClient(agent_id="agent_abc123", api_key="gf-agent-...")

usage = client.call_tool("self_inspect/get_my_usage", {})

print(f"Session cost: ${usage['session']['cost']:.4f}")
print(f"Daily cost: ${usage['daily']['cost']:.2f} / ${usage['limits']['cost_daily']:.2f}")
print(f"Remaining: ${usage['daily']['remaining_budget']:.2f}")

# Check warnings
for warning in usage.get("warnings", []):
    print(f"⚠️ {warning['message']}")

From Admin API

bash
curl "https://api.gateflow.ai/v1/mcp/agents/agent_abc123/usage" \
  -H "Authorization: Bearer gw_prod_admin_key"

Handling Rate Limits

Client-Side

python
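import time

# Assumption: the SDK exports RateLimitError alongside MCPClient.
from gateflow_mcp import RateLimitError

def call_with_retry(client, tool, arguments, max_retries=3):
    """Call a tool, sleeping and retrying when the agent is rate limited."""
    for attempt in range(max_retries):
        try:
            return client.call_tool(tool, arguments)
        except RateLimitError as e:
            # Out of retries: surface the error to the caller.
            if attempt == max_retries - 1:
                raise
            # Prefer the server-provided delay; fall back to 60 seconds.
            wait = getattr(e, "retry_after_seconds", None) or 60
            print(f"Rate limited, waiting {wait}s...")
            time.sleep(wait)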
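
Retrying reacts to a limit after it has been hit. A complementary approach is to pace calls on the client so the limit is rarely reached. The sketch below is a sliding-window throttle; the class `MinuteThrottle` is hypothetical and not part of the SDK, and it assumes the server counts requests over a rolling 60-second window, which the documentation does not state.

```python
import time
from collections import deque


class MinuteThrottle:
    """Sliding-window throttle: allows at most `limit` calls in any 60-second window."""

    def __init__(self, limit, clock=time.monotonic, sleep=time.sleep):
        self.limit = limit
        self.clock = clock    # injectable for testing
        self.sleep = sleep    # injectable for testing
        self.calls = deque()  # timestamps of calls still inside the window

    def wait_for_slot(self):
        """Block until a call is allowed, then record it."""
        now = self.clock()
        # Forget calls that have left the 60-second window.
        while self.calls and now - self.calls[0] >= 60:
            self.calls.popleft()
        if len(self.calls) >= self.limit:
            # Sleep until the oldest recorded call leaves the window.
            self.sleep(60 - (now - self.calls[0]))
            now = self.clock()
            self.calls.popleft()
        self.calls.append(now)


# Usage: call throttle.wait_for_slot() before each client.call_tool(...).
throttle = MinuteThrottle(limit=60)
```

Setting `limit` slightly below the configured `requests_per_minute` leaves headroom for clock differences between client and server.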

Pre-Flight Check

python
def check_budget_before_operation(client, estimated_cost):
    """Raise if the remaining daily budget cannot cover the estimated cost (in dollars)."""
    usage = client.call_tool("self_inspect/get_my_usage", {})

    remaining = usage["daily"]["remaining_budget"]
    if remaining < estimated_cost:
        raise RuntimeError(f"Insufficient budget: ${remaining:.2f} < ${estimated_cost:.2f}")

    return True

Resource-Specific Limits

Audio Processing

yaml
limits:
  audio_minutes_daily: 60    # 60 minutes of audio per day
  audio_minutes_monthly: 1000  # 1,000 minutes of audio per month

OCR Processing

yaml
limits:
  ocr_pages_daily: 100       # 100 pages per day
  ocr_pages_monthly: 2000    # 2,000 pages per month

Embeddings

yaml
limits:
  embedding_tokens_daily: 1000000  # 1M tokens per day

Auto-Suspend on Limit

Configure an agent to be suspended automatically on a cost-limit breach or on requests-per-minute abuse:

bash
curl -X PATCH https://api.gateflow.ai/v1/mcp/agents/agent_abc123 \
  -H "Authorization: Bearer gw_prod_admin_key" \
  -H "Content-Type: application/json" \
  -d '{
    "auto_suspend": {
      "on_cost_limit": true,
      "on_rpm_abuse": true,
      "abuse_threshold": 3
    }
  }'

Limit Alerts

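Register a webhook to be notified before an agent hits its limits. Each threshold is a fraction of the limit, so 0.8 fires at 80% of the configured value: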

bash
curl -X POST https://api.gateflow.ai/v1/mcp/agents/agent_abc123/alerts \
  -H "Authorization: Bearer gw_prod_admin_key" \
  -H "Content-Type: application/json" \
  -d '{
    "type": "limit_warning",
    "thresholds": {
      "cost_daily": [0.5, 0.8, 0.95],
      "cost_monthly": [0.5, 0.8, 0.95]
    },
    "webhook_url": "https://your-app.com/limit-alert"
  }'

Alert Payload:

json
{
  "event": "limit_warning",
  "agent_id": "agent_abc123",
  "limit_type": "cost_daily",
  "threshold": 0.8,
  "current": 82.50,
  "limit": 100.00,
  "timestamp": "2026-02-16T15:30:00Z"
}
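
On the receiving side, a webhook handler mostly needs to turn this payload into something readable. The sketch below covers only that step and leaves the HTTP server out; `summarize_alert` is a hypothetical helper, and the field names are taken from the payload above.

```python
def summarize_alert(payload: dict) -> str:
    """Turn a limit_warning webhook payload into a one-line message."""
    if payload.get("event") != "limit_warning":
        raise ValueError(f"Unexpected event: {payload.get('event')!r}")

    used_fraction = payload["current"] / payload["limit"]
    remaining = payload["limit"] - payload["current"]
    return (
        f"{payload['agent_id']}: {payload['limit_type']} at {used_fraction:.1%} "
        f"(crossed {payload['threshold']:.0%} threshold, ${remaining:.2f} remaining)"
    )


alert = {
    "event": "limit_warning",
    "agent_id": "agent_abc123",
    "limit_type": "cost_daily",
    "threshold": 0.8,
    "current": 82.50,
    "limit": 100.00,
    "timestamp": "2026-02-16T15:30:00Z",
}
message = summarize_alert(alert)
```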

Best Practices

  1. Set conservative limits - Start low and raise them as needed
  2. Monitor usage - Track usage patterns before tightening or loosening limits
  3. Use session limits - Cap cost per session to stop runaway sessions
  4. Set alerts - Get warned before limits are hit
  5. Review regularly - Adjust limits based on actual usage

Limit Inheritance

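Agents inherit organization limits. An agent can override an inherited limit with a lower value, but cannot exceed the organization limit: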

yaml
# Organization defaults
organization:
  limits:
    cost_daily: 1000.00

# Agent inherits organization limits and can override them with lower values
agent:
  limits:
    cost_daily: 100.00  # Cannot exceed org limit
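
The inheritance rule can be expressed as a small merge function. This is a sketch under assumptions: `effective_limits` is a hypothetical helper, and it clamps an over-limit agent value to the organization ceiling. The source does not say whether the API clamps such a value or rejects the request.

```python
def effective_limits(org_limits: dict, agent_limits: dict) -> dict:
    """Merge limits so that an agent value never exceeds the organization value."""
    resolved = dict(org_limits)  # start from the inherited organization defaults
    for name, value in agent_limits.items():
        if name in org_limits:
            # Assumption: clamp to the organization ceiling instead of failing.
            resolved[name] = min(value, org_limits[name])
        else:
            # No organization ceiling is defined for this limit.
            resolved[name] = value
    return resolved


org = {"cost_daily": 1000.00}
agent = {"cost_daily": 100.00}
resolved = effective_limits(org, agent)
```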

