Error Handling

Handle errors gracefully in MCP agent applications.

Error Format

All MCP errors follow a consistent format:

json
{
  "error": {
    "type": "permission_error",
    "code": "tool_not_permitted",
    "message": "Agent does not have permission to use tool: voice/synthesize",
    "details": {
      "tool": "voice/synthesize",
      "agent_id": "agent_abc123"
    },
    "doc_url": "https://docs.gateflow.ai/mcp/error-handling#tool_not_permitted"
  }
}
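
If you receive this payload as raw JSON rather than through the SDK, it could be unpacked roughly as follows. This is a minimal sketch: `ParsedMCPError` and `parse_error` are hypothetical names used for illustration, and treating `details` and `doc_url` as optional is an assumption.

```python
import json
from dataclasses import dataclass, field


@dataclass
class ParsedMCPError:
    """Hypothetical container mirroring the documented error fields."""
    type: str
    code: str
    message: str
    details: dict = field(default_factory=dict)
    doc_url: str = ""


def parse_error(payload: str) -> ParsedMCPError:
    """Parse a JSON error envelope of the shape shown above."""
    body = json.loads(payload)["error"]
    return ParsedMCPError(
        type=body["type"],
        code=body["code"],
        message=body["message"],
        # Assumption: details and doc_url may be absent, so default them.
        details=body.get("details") or {},
        doc_url=body.get("doc_url", ""),
    )


raw = """
{
  "error": {
    "type": "permission_error",
    "code": "tool_not_permitted",
    "message": "Agent does not have permission to use tool: voice/synthesize",
    "details": {"tool": "voice/synthesize", "agent_id": "agent_abc123"}
  }
}
"""
err = parse_error(raw)
print(err.type, err.code, err.details["tool"])
```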

Error Types

| Type | Description |
| --- | --- |
| permission_error | Access denied |
| rate_limit_error | Limits exceeded |
| validation_error | Invalid input |
| provider_error | Upstream failure |
| internal_error | Server error |

Error Codes

Permission Errors

| Code | Description | Resolution |
| --- | --- | --- |
| tool_not_permitted | Tool not in permissions | Add tool to agent permissions |
| model_not_allowed | Model not in allowlist | Add model to allowlist |
| classification_denied | Data classification too high | Adjust classification permissions |
| collection_not_permitted | Collection access denied | Grant collection access |

Rate Limit Errors

| Code | Description | Resolution |
| --- | --- | --- |
| rpm_exceeded | Requests per minute exceeded | Wait and retry |
| cost_limit_exceeded | Cost limit reached | Increase limit or wait for reset |
| daily_limit_exceeded | Daily limit reached | Wait for daily reset |
| concurrent_limit_exceeded | Too many sessions | Close other sessions |

Validation Errors

| Code | Description | Resolution |
| --- | --- | --- |
| invalid_arguments | Bad tool arguments | Fix argument format |
| missing_required | Required field missing | Add required field |
| invalid_format | Wrong data format | Use correct format |
| file_too_large | File exceeds limit | Reduce file size |

Provider Errors

| Code | Description | Resolution |
| --- | --- | --- |
| provider_unavailable | Provider is down | Retry or use fallback |
| provider_timeout | Request timed out | Retry with smaller input |
| model_overloaded | Model capacity full | Wait and retry |
| content_filtered | Content blocked | Modify input |
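
The resolutions listed above can be kept next to your error handling as a lookup table, so that logs and user-facing messages can include a hint. This is a sketch only: `RESOLUTIONS`, `RETRYABLE_CODES`, `resolution_hint`, and `is_retryable` are hypothetical names, and the choice of which codes count as retryable is one reading of the "Resolution" column, not something the SDK defines.

```python
# Resolution hints copied from the error code tables, keyed by error code.
RESOLUTIONS = {
    "tool_not_permitted": "Add tool to agent permissions",
    "model_not_allowed": "Add model to allowlist",
    "classification_denied": "Adjust classification permissions",
    "collection_not_permitted": "Grant collection access",
    "rpm_exceeded": "Wait and retry",
    "cost_limit_exceeded": "Increase limit or wait for reset",
    "daily_limit_exceeded": "Wait for daily reset",
    "concurrent_limit_exceeded": "Close other sessions",
    "invalid_arguments": "Fix argument format",
    "missing_required": "Add required field",
    "invalid_format": "Use correct format",
    "file_too_large": "Reduce file size",
    "provider_unavailable": "Retry or use fallback",
    "provider_timeout": "Retry with smaller input",
    "model_overloaded": "Wait and retry",
    "content_filtered": "Modify input",
}

# Assumption: codes whose documented resolution is a short wait or a retry.
RETRYABLE_CODES = frozenset(
    {"rpm_exceeded", "provider_unavailable", "provider_timeout", "model_overloaded"}
)


def resolution_hint(code: str) -> str:
    """Return the documented resolution, or a generic pointer for unknown codes."""
    return RESOLUTIONS.get(code, "See the error's doc_url for guidance")


def is_retryable(code: str) -> bool:
    """True when an automatic retry is a reasonable first response."""
    return code in RETRYABLE_CODES


print(resolution_hint("rpm_exceeded"))
print(is_retryable("invalid_arguments"))
```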

Handling Errors

Python SDK

python
from gateflow_mcp import MCPClient, MCPError

client = MCPClient(agent_id="agent_abc123", api_key="gf-agent-...")

try:
    result = client.call_tool(
        name="llm/chat",
        arguments={"messages": [{"role": "user", "content": "Hello"}]}
    )
except MCPError as e:
    if e.type == "permission_error":
        print(f"Permission denied: {e.message}")
        print(f"Allowed tools: {e.details.get('permitted_tools')}")

    elif e.type == "rate_limit_error":
        print(f"Rate limited: {e.message}")
        retry_after = e.details.get("retry_after_seconds", 60)
        print(f"Retry after: {retry_after}s")

    elif e.type == "validation_error":
        print(f"Invalid input: {e.message}")
        print(f"Field: {e.details.get('field')}")

    elif e.type == "provider_error":
        print(f"Provider error: {e.message}")
        print(f"Provider: {e.details.get('provider')}")

    else:
        print(f"Error: {e.type} - {e.message}")

Retry Logic

python
import time
from gateflow_mcp import MCPClient, MCPError

def call_with_retry(client, tool, arguments, max_retries=3):
    for attempt in range(max_retries):
        try:
            return client.call_tool(tool, arguments)

        except MCPError as e:
            if e.type == "rate_limit_error":
                if attempt < max_retries - 1:
                    wait = e.details.get("retry_after_seconds", 60)
                    print(f"Rate limited, waiting {wait}s...")
                    time.sleep(wait)
                    continue

            elif e.type == "provider_error":
                if e.code in ["provider_unavailable", "provider_timeout"]:
                    if attempt < max_retries - 1:
                        wait = 2 ** attempt  # Exponential backoff
                        print(f"Provider error, retrying in {wait}s...")
                        time.sleep(wait)
                        continue

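            raise  # Re-raise if not retryable or if this was the last attempt

    # Only reached when max_retries < 1, because the loop body never ran
    raise RuntimeError("max_retries must be at least 1")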
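
When many agents retry at once, a fixed `2 ** attempt` schedule can make them all retry at the same moment. A common refinement is to cap the delay and add random jitter. The sketch below illustrates that idea under stated assumptions: the `MCPError` class is a stand-in so the example runs without the SDK, and `backoff_delay` and `call_with_backoff` are hypothetical helpers, not SDK functions.

```python
import random
import time


class MCPError(Exception):
    """Stand-in for the SDK exception so this sketch runs on its own."""

    def __init__(self, type, code, message, details=None):
        super().__init__(message)
        self.type = type
        self.code = code
        self.message = message
        self.details = details or {}


def backoff_delay(attempt, base=1.0, cap=30.0, rng=random.random):
    """Jittered delay: a random fraction of min(cap, base * 2**attempt)."""
    return rng() * min(cap, base * (2 ** attempt))


def call_with_backoff(call, max_retries=3, sleep=time.sleep):
    """Retry `call` on transient provider errors with capped, jittered backoff."""
    for attempt in range(max_retries):
        try:
            return call()
        except MCPError as e:
            retryable = e.code in ("provider_unavailable", "provider_timeout")
            if not retryable or attempt == max_retries - 1:
                raise
            sleep(backoff_delay(attempt))
    raise ValueError("max_retries must be at least 1")


# Demo: a call that times out twice, then succeeds.
attempts = []


def flaky():
    attempts.append(1)
    if len(attempts) < 3:
        raise MCPError("provider_error", "provider_timeout", "Request timed out")
    return "ok"


delays = []  # record the delays instead of actually sleeping
result = call_with_backoff(flaky, sleep=delays.append)
print(result, len(delays))
```

Passing `sleep` as a parameter keeps the helper testable: the demo records the computed delays rather than pausing.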

Error Recovery

python
from gateflow_mcp import MCPError

def safe_search(client, query):
    try:
        return client.call_tool(
            name="retrieval/search",
            arguments={"query": query}
        )
    except MCPError as e:
        if e.code == "collection_not_permitted":
            # Fall back to public collection
            return client.call_tool(
                name="retrieval/search",
                arguments={
                    "query": query,
                    "collection": "public-docs"
                }
            )
        raise

Error Logging

Structured Logging

python
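import json
import logging
from typing import Optional

from gateflow_mcp import MCPError

logger = logging.getLogger(__name__)

def log_mcp_error(e: MCPError, context: Optional[dict] = None):
    # Collect the error fields and any caller-supplied context in one record
    error_log = {
        "type": e.type,
        "code": e.code,
        "message": e.message,
        "details": e.details,
        "context": context or {}
    }
    # default=str keeps logging from raising if details holds non-JSON values
    logger.error("MCP Error: %s", json.dumps(error_log, default=str))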
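
To check that the structured log line can be parsed back by a log pipeline, you can capture it in memory. The following is a self-contained sketch: the `MCPError` class is a stand-in for the SDK exception, the logger name `mcp_demo` is arbitrary, and this variant of `log_mcp_error` takes the logger as a parameter purely to keep the example isolated.

```python
import io
import json
import logging


class MCPError(Exception):
    """Stand-in for the SDK exception so this sketch runs on its own."""

    def __init__(self, type, code, message, details=None):
        super().__init__(message)
        self.type = type
        self.code = code
        self.message = message
        self.details = details or {}


def log_mcp_error(logger, e, context=None):
    """Emit one JSON-encoded log line describing the error."""
    entry = {
        "type": e.type,
        "code": e.code,
        "message": e.message,
        "details": e.details,
        "context": context or {},
    }
    logger.error("MCP Error: %s", json.dumps(entry, default=str))


# Route the demo logger to an in-memory stream instead of stderr.
stream = io.StringIO()
demo_logger = logging.getLogger("mcp_demo")
demo_logger.setLevel(logging.ERROR)
demo_logger.addHandler(logging.StreamHandler(stream))
demo_logger.propagate = False

err = MCPError(
    "rate_limit_error",
    "rpm_exceeded",
    "Requests per minute exceeded",
    {"retry_after_seconds": 30},
)
log_mcp_error(demo_logger, err, context={"tool": "llm/chat"})

# Strip the prefix and parse the JSON payload back into a dict.
line = stream.getvalue().strip()
logged = json.loads(line.split("MCP Error: ", 1)[1])
print(logged["code"], logged["context"]["tool"])
```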

Error Metrics

python
from gateflow_mcp import MCPError
from prometheus_client import Counter

mcp_errors = Counter(
    'mcp_errors_total',
    'MCP errors by type and code',
    ['type', 'code']
)

def track_error(e: MCPError):
    mcp_errors.labels(type=e.type, code=e.code).inc()

Graceful Degradation

Fallback Strategies

python
from gateflow_mcp import MCPError

def get_response(client, user_input):
    # Try preferred model first
    try:
        return client.call_tool(
            name="llm/chat",
            arguments={
                "messages": [{"role": "user", "content": user_input}],
                "model": "gpt-5.2"
            }
        )
    except MCPError as e:
        if e.code == "model_not_allowed":
            # Fall back to allowed model
            return client.call_tool(
                name="llm/chat",
                arguments={
                    "messages": [{"role": "user", "content": user_input}],
                    "model": "gpt-5-mini"
                }
            )
        raise

Service Unavailable Handling

python
from gateflow_mcp import MCPError

def handle_service_unavailable(e: MCPError):
    if e.code == "provider_unavailable":
        # Return cached response or placeholder
        return {
            "content": "Service temporarily unavailable. Please try again.",
            "cached": True
        }
    raise e
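
The "cached response" branch needs somewhere to keep earlier results. One possible shape is sketched below, assuming tool responses are plain dicts and that an in-process dictionary is an acceptable cache. `call_with_cache` and `_last_good` are hypothetical names, and the `MCPError` class is a stand-in so the example runs without the SDK.

```python
class MCPError(Exception):
    """Stand-in for the SDK exception so this sketch runs on its own."""

    def __init__(self, type, code, message, details=None):
        super().__init__(message)
        self.type = type
        self.code = code
        self.message = message
        self.details = details or {}


# Hypothetical in-memory cache: key -> last successful response.
_last_good = {}


def call_with_cache(call, key):
    """Run `call`; if the provider is down, serve the last good response."""
    try:
        response = call()
    except MCPError as e:
        if e.code == "provider_unavailable" and key in _last_good:
            # Mark the response so callers can tell it may be stale.
            return {**_last_good[key], "cached": True}
        raise
    _last_good[key] = response
    return response


def provider_down():
    raise MCPError("provider_error", "provider_unavailable", "Provider is down")


first = call_with_cache(lambda: {"content": "hello"}, key="greeting")
second = call_with_cache(provider_down, key="greeting")
print(first, second)
```

If nothing has been cached for the key, the error is re-raised so the caller can fall back to a placeholder message instead.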

Validation Before Calling

Pre-Flight Checks

python
def validate_before_call(client, tool, arguments):
    # Check permissions
    whoami = client.call_tool("self_inspect/whoami", {})

    if tool not in whoami["permissions"]["tools"]:
        raise ValueError(f"Tool {tool} not permitted")

    # Check budget
    usage = client.call_tool("self_inspect/get_my_usage", {})
    if usage["daily"]["remaining_budget"] < 0.10:
        raise ValueError("Insufficient budget")

    return True
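
A variant of the pre-flight check that collects every problem instead of raising on the first one can be exercised without network access. In this sketch, `FakeClient` and `preflight_problems` are hypothetical; the response shapes simply mirror the fields read in the example above (`permissions.tools` and `daily.remaining_budget`), and the 0.10 threshold is carried over from it.

```python
class FakeClient:
    """Hypothetical test double returning canned self_inspect responses."""

    def __init__(self, tools, remaining_budget):
        self._tools = tools
        self._remaining = remaining_budget

    def call_tool(self, name, arguments):
        if name == "self_inspect/whoami":
            return {"permissions": {"tools": self._tools}}
        if name == "self_inspect/get_my_usage":
            return {"daily": {"remaining_budget": self._remaining}}
        raise KeyError(name)


def preflight_problems(client, tool, min_budget=0.10):
    """Return a list of reasons the call would likely fail (empty if none)."""
    problems = []

    # Check that the tool is in the agent's permitted tools
    whoami = client.call_tool("self_inspect/whoami", {})
    if tool not in whoami["permissions"]["tools"]:
        problems.append(f"Tool {tool} not permitted")

    # Check that enough daily budget remains
    usage = client.call_tool("self_inspect/get_my_usage", {})
    if usage["daily"]["remaining_budget"] < min_budget:
        problems.append("Insufficient budget")

    return problems


ok = preflight_problems(FakeClient(["llm/chat"], 5.00), "llm/chat")
blocked = preflight_problems(FakeClient(["llm/chat"], 0.05), "voice/transcribe")
print(ok, blocked)
```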

Error Documentation URLs

Each error includes a doc_url linking to detailed documentation:

python
try:
    result = client.call_tool("llm/chat", arguments)
except MCPError as e:
    print(f"Error: {e.message}")
    print(f"Learn more: {e.doc_url}")

Best Practices

  1. Always handle errors - Wrap MCP calls in try/except
  2. Use specific handlers - Handle different error types appropriately
  3. Implement retries - For transient errors
  4. Log comprehensively - Include context in error logs
  5. Degrade gracefully - Provide fallbacks where possible
  6. Pre-validate - Check permissions before critical operations

Common Error Patterns

Budget Exhausted

python
try:
    result = client.call_tool("llm/chat", arguments)
except MCPError as e:
    if e.code == "cost_limit_exceeded":
        if e.details.get("limit_type") == "cost_per_session":
            # Start new session
            client.new_session()
            result = client.call_tool("llm/chat", arguments)
        else:
            # Daily/monthly limit - need to wait
            raise

Permission Escalation Needed

python
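# request_permission_escalation and UserFacingError are assumed to be
# application-defined helpers; they are not imported from the SDK here.
try:
    result = client.call_tool("voice/transcribe", arguments)
except MCPError as e:
    if e.code == "tool_not_permitted":
        # Log for admin review
        request_permission_escalation(
            agent_id=client.agent_id,
            tool="voice/transcribe",
            reason="User requested transcription feature"
        )
        # Chain the original error so the cause stays visible in tracebacks
        raise UserFacingError("This feature requires additional permissions.") from e
    raise  # Other errors are not handled here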