Rate Limits
Control agent request rates and resource usage.
Overview
Rate limits prevent agents from consuming excessive resources and help manage costs.
Limit Types
| Limit | Description | Scope |
|---|---|---|
| requests_per_minute | Tool calls per minute | Per agent |
| cost_per_session | Max cost per session | Per session |
| cost_daily | Max daily cost | Per day |
| cost_monthly | Max monthly cost | Per month |
| audio_minutes_daily | Audio processing minutes | Per day |
| ocr_pages_daily | OCR pages processed | Per day |
| concurrent_sessions | Simultaneous sessions | Per agent |
Configuring Limits
At Agent Creation
bash
curl -X POST https://api.gateflow.ai/v1/mcp/agents \
  -H "Authorization: Bearer gw_prod_admin_key" \
  -H "Content-Type: application/json" \
  -d '{
    "name": "Support Bot",
    "permissions": {
      "tools": ["llm/chat", "retrieval/search"]
    },
    "limits": {
      "requests_per_minute": 60,
      "cost_per_session": 5.00,
      "cost_daily": 100.00,
      "cost_monthly": 2000.00
    }
  }'
Updating Limits
bash
curl -X PATCH https://api.gateflow.ai/v1/mcp/agents/agent_abc123 \
  -H "Authorization: Bearer gw_prod_admin_key" \
  -H "Content-Type: application/json" \
  -d '{
    "limits": {
      "requests_per_minute": 120,
      "cost_daily": 200.00
    }
  }'
Limit Templates
Free Tier
yaml
limits:
  requests_per_minute: 10
  cost_per_session: 0.50
  cost_daily: 5.00
  cost_monthly: 50.00
  concurrent_sessions: 1
Standard Tier
yaml
limits:
  requests_per_minute: 60
  cost_per_session: 5.00
  cost_daily: 100.00
  cost_monthly: 1000.00
  concurrent_sessions: 5
Enterprise Tier
yaml
limits:
  requests_per_minute: 300
  cost_per_session: 50.00
  cost_daily: 1000.00
  cost_monthly: 20000.00
  concurrent_sessions: 50
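The tier templates can also be kept in code and merged into the request body used at agent creation. The following is a minimal sketch under stated assumptions: `TIER_LIMITS` and `build_agent_payload` are hypothetical names, not part of any GateFlow SDK. The limit values are copied from the templates above, and the payload shape follows the agent-creation example.

```python
# Hypothetical helper: tier values mirror the Free, Standard, and Enterprise templates.
TIER_LIMITS = {
    "free": {
        "requests_per_minute": 10,
        "cost_per_session": 0.50,
        "cost_daily": 5.00,
        "cost_monthly": 50.00,
        "concurrent_sessions": 1,
    },
    "standard": {
        "requests_per_minute": 60,
        "cost_per_session": 5.00,
        "cost_daily": 100.00,
        "cost_monthly": 1000.00,
        "concurrent_sessions": 5,
    },
    "enterprise": {
        "requests_per_minute": 300,
        "cost_per_session": 50.00,
        "cost_daily": 1000.00,
        "cost_monthly": 20000.00,
        "concurrent_sessions": 50,
    },
}


def build_agent_payload(name, tools, tier, overrides=None):
    """Build an agent-creation body from a tier template plus optional overrides."""
    limits = dict(TIER_LIMITS[tier])  # copy so the template itself is never mutated
    limits.update(overrides or {})
    return {
        "name": name,
        "permissions": {"tools": list(tools)},
        "limits": limits,
    }
```

For example, `build_agent_payload("Support Bot", ["llm/chat"], "standard", {"cost_monthly": 2000.00})` starts from the Standard tier and raises only the monthly cap.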
Rate Limit Errors
Requests Per Minute
json
{
  "error": {
    "type": "rate_limit_error",
    "code": "rpm_exceeded",
    "message": "Agent exceeded requests per minute limit",
    "limit": 60,
    "current": 62,
    "retry_after_seconds": 45
  }
}
Cost Limit
json
{
  "error": {
    "type": "rate_limit_error",
    "code": "cost_limit_exceeded",
    "message": "Session cost limit exceeded",
    "limit_type": "cost_per_session",
    "limit": 5.00,
    "current": 5.12
  }
}
Daily Limit
json
{
  "error": {
    "type": "rate_limit_error",
    "code": "daily_limit_exceeded",
    "message": "Daily cost limit exceeded",
    "limit": 100.00,
    "current": 100.45,
    "resets_at": "2026-02-17T00:00:00Z"
  }
}
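The three error payloads above call for different client reactions: a per-minute error carries `retry_after_seconds`, a daily error carries `resets_at`, and the per-session cost error carries neither. The sketch below shows one way to turn a payload into a wait time. It assumes only the fields shown in the examples. The helper name `seconds_until_retry` is hypothetical, and treating errors without a retry hint as non-retryable is an assumption rather than documented behavior.

```python
from datetime import datetime, timezone
from typing import Optional


def seconds_until_retry(payload: dict, now: Optional[datetime] = None) -> Optional[float]:
    """Return how long to wait before retrying, or None if waiting will not help."""
    error = payload.get("error", {})
    now = now or datetime.now(timezone.utc)

    # Per-minute limits carry an explicit retry hint.
    if "retry_after_seconds" in error:
        return float(error["retry_after_seconds"])

    # Daily limits report when the counter resets (ISO 8601, UTC "Z" suffix).
    if "resets_at" in error:
        resets_at = datetime.fromisoformat(error["resets_at"].replace("Z", "+00:00"))
        return max(0.0, (resets_at - now).total_seconds())

    # Assumption: errors with no hint (e.g. cost_per_session) do not clear by waiting.
    return None
```

A caller might sleep for the returned number of seconds, and start a new session or surface the error when the result is `None`.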
Checking Usage
From Agent
python
from gateflow_mcp import MCPClient

client = MCPClient(agent_id="agent_abc123", api_key="gf-agent-...")

usage = client.call_tool("self_inspect/get_my_usage", {})
print(f"Session cost: ${usage['session']['cost']:.4f}")
print(f"Daily cost: ${usage['daily']['cost']:.2f} / ${usage['limits']['cost_daily']:.2f}")
print(f"Remaining: ${usage['daily']['remaining_budget']:.2f}")

# Check warnings
for warning in usage.get("warnings", []):
    print(f"⚠️ {warning['message']}")
From Admin API
bash
curl "https://api.gateflow.ai/v1/mcp/agents/agent_abc123/usage" \
  -H "Authorization: Bearer gw_prod_admin_key"
Handling Rate Limits
Client-Side
python
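import time

# Assumption: RateLimitError is exported by gateflow_mcp; adjust the import to match your SDK.
from gateflow_mcp import RateLimitError

def call_with_retry(client, tool, arguments, max_retries=3):
    """Call a tool, waiting and retrying when the agent is rate limited."""
    for attempt in range(max_retries):
        try:
            return client.call_tool(tool, arguments)
        except RateLimitError as e:
            # Re-raise once the final attempt has failed
            if attempt == max_retries - 1:
                raise
            # Prefer the server's hint; fall back to 60 seconds
            wait = e.retry_after_seconds or 60
            print(f"Rate limited, waiting {wait}s...")
            time.sleep(wait)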
Pre-Flight Check
python
def check_budget_before_operation(client, estimated_cost):
    """Raise if the remaining daily budget cannot cover the estimated cost."""
    usage = client.call_tool("self_inspect/get_my_usage", {})
    remaining = usage["daily"]["remaining_budget"]

    if remaining < estimated_cost:
        raise RuntimeError(f"Insufficient budget: ${remaining:.2f} < ${estimated_cost:.2f}")

    return True
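The pre-flight check above looks only at the daily budget, but a session can reach `cost_per_session` first. A variant might compare the estimate against whichever budget is tighter. The sketch below works on an already-fetched usage dictionary. The function names are hypothetical, and the `cost_per_session` key under `limits` is an assumption (the usage example above only shows `cost_daily` there), so the code falls back to the daily budget when that key is absent.

```python
def remaining_budget(usage: dict) -> float:
    """Return the smaller of the remaining daily and per-session budgets."""
    daily_remaining = usage["daily"]["remaining_budget"]

    # Assumption: usage["limits"] may include cost_per_session. If it does not,
    # only the daily budget constrains the operation.
    session_limit = usage.get("limits", {}).get("cost_per_session")
    if session_limit is None:
        return daily_remaining

    session_remaining = session_limit - usage["session"]["cost"]
    return min(daily_remaining, session_remaining)


def can_afford(usage: dict, estimated_cost: float) -> bool:
    """True when the estimated cost fits inside the tightest remaining budget."""
    return estimated_cost <= remaining_budget(usage)
```

Returning a boolean instead of raising lets the agent choose a cheaper tool or a smaller batch when the budget is short.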
Resource-Specific Limits
Audio Processing
yaml
limits:
  audio_minutes_daily: 60  # 60 minutes of audio per day
  audio_minutes_monthly: 1000
OCR Processing
yaml
limits:
  ocr_pages_daily: 100  # 100 pages per day
  ocr_pages_monthly: 2000
Embeddings
yaml
limits:
  embedding_tokens_daily: 1000000  # 1M tokens per day
Auto-Suspend on Limit
Configure automatic suspension:
bash
curl -X PATCH https://api.gateflow.ai/v1/mcp/agents/agent_abc123 \
  -H "Authorization: Bearer gw_prod_admin_key" \
  -H "Content-Type: application/json" \
  -d '{
    "auto_suspend": {
      "on_cost_limit": true,
      "on_rpm_abuse": true,
      "abuse_threshold": 3
    }
  }'
Limit Alerts
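Get notified before hitting limits. In the example, each threshold is a fraction of the limit, so 0.8 corresponds to 80% of the daily or monthly cap: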
bash
curl -X POST https://api.gateflow.ai/v1/mcp/agents/agent_abc123/alerts \
  -H "Authorization: Bearer gw_prod_admin_key" \
  -H "Content-Type: application/json" \
  -d '{
    "type": "limit_warning",
    "thresholds": {
      "cost_daily": [0.5, 0.8, 0.95],
      "cost_monthly": [0.5, 0.8, 0.95]
    },
    "webhook_url": "https://your-app.com/limit-alert"
  }'
Alert Payload:
json
{
  "event": "limit_warning",
  "agent_id": "agent_abc123",
  "limit_type": "cost_daily",
  "threshold": 0.8,
  "current": 82.50,
  "limit": 100.00,
  "timestamp": "2026-02-16T15:30:00Z"
}
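In the payload above, `current` divided by `limit` is 0.825, which has passed the 0.8 threshold but not 0.95. The sketch below shows how a webhook receiver might work out which thresholds have been crossed and format a short message. Both function names are hypothetical, and reading thresholds as fractions of the limit is an inference from the example payload rather than a documented rule.

```python
def crossed_thresholds(current: float, limit: float, thresholds: list) -> list:
    """Return the configured thresholds that current usage has reached, in ascending order."""
    if limit <= 0:
        raise ValueError("limit must be positive")
    fraction_used = current / limit
    return [t for t in sorted(thresholds) if fraction_used >= t]


def describe_alert(payload: dict) -> str:
    """Format a one-line summary from a limit_warning payload."""
    percent_used = payload["current"] / payload["limit"] * 100
    return (
        f"{payload['agent_id']}: {payload['limit_type']} at {percent_used:.1f}% "
        f"(threshold {payload['threshold']:.0%})"
    )
```

With the example payload, `crossed_thresholds(82.50, 100.00, [0.5, 0.8, 0.95])` returns `[0.5, 0.8]`.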
Best Practices
- Set conservative limits - Start low, increase as needed
- Monitor usage - Track patterns before setting limits
- Use session limits - Prevent runaway sessions
- Set alerts - Get warned before limits are reached
- Review regularly - Adjust limits based on actual usage
Limit Inheritance
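Agents inherit organization limits. An agent can override an inherited limit with a lower value, but cannot exceed the organization's limit: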
yaml
# Organization defaults
organization:
  limits:
    cost_daily: 1000.00

# Agent inherits, can override lower
agent:
  limits:
    cost_daily: 100.00  # Cannot exceed org limit
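The inheritance rule can be expressed as a small merge: start from the organization limits, apply the agent's overrides, and refuse any override above the organization's value. This is a sketch of the rule as described here, not of the server's implementation. `effective_limits` is a hypothetical name, and raising an error (instead of silently clamping to the organization value) is an assumption.

```python
def effective_limits(org_limits: dict, agent_limits: dict) -> dict:
    """Merge agent overrides onto organization limits, enforcing the org ceiling."""
    effective = dict(org_limits)  # the agent inherits every organization limit
    for name, value in agent_limits.items():
        org_value = org_limits.get(name)
        # Assumption: an override above the org limit is rejected, not clamped.
        if org_value is not None and value > org_value:
            raise ValueError(
                f"{name}={value} exceeds organization limit {org_value}"
            )
        effective[name] = value
    return effective
```

With the values above, `effective_limits({"cost_daily": 1000.00}, {"cost_daily": 100.00})` yields `{"cost_daily": 100.0}`.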
Next Steps
- Cost Transparency - Track costs
- Agent Lifecycle - Auto-suspend
- Tool Permissions - Tool access