Error Handling

Handle errors gracefully in MCP agent applications.

Error Format

All MCP errors follow a consistent format:

json

{
  "error": {
    "type": "permission_error",
    "code": "tool_not_permitted",
    "message": "Agent does not have permission to use tool: voice/synthesize",
    "details": {
      "tool": "voice/synthesize",
      "agent_id": "agent_abc123"
    },
    "doc_url": "https://docs.gateflow.ai/mcp/error-handling#tool_not_permitted"
  }
}
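
When working with raw responses rather than the SDK, the envelope above can be unpacked into a small typed object. The following is a minimal sketch; `ParsedError` and `parse_error` are hypothetical names for illustration and are not part of any GateFlow SDK.

```python
import json
from dataclasses import dataclass, field
from typing import Any, Dict, Optional


@dataclass
class ParsedError:
    """Hypothetical container mirroring the fields of the error envelope."""
    type: str
    code: str
    message: str
    details: Dict[str, Any] = field(default_factory=dict)
    doc_url: Optional[str] = None


def parse_error(body: str) -> Optional[ParsedError]:
    """Return a ParsedError if the JSON body holds an "error" object, else None."""
    payload = json.loads(body)
    err = payload.get("error") if isinstance(payload, dict) else None
    if not isinstance(err, dict):
        return None
    return ParsedError(
        type=err.get("type", ""),
        code=err.get("code", ""),
        message=err.get("message", ""),
        details=err.get("details") or {},
        doc_url=err.get("doc_url"),
    )
```

A response body without an `error` key yields `None`, so callers can branch on the return value before inspecting `type` or `code`.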

Error Types

Type	Description
`permission_error`	Access denied
`rate_limit_error`	Limits exceeded
`validation_error`	Invalid input
`provider_error`	Upstream failure
`internal_error`	Server error

Error Codes

Permission Errors

Code	Description	Resolution
`tool_not_permitted`	Tool not in permissions	Add tool to agent permissions
`model_not_allowed`	Model not in allowlist	Add model to allowlist
`classification_denied`	Data classification too high	Adjust classification permissions
`collection_not_permitted`	Collection access denied	Grant collection access

Rate Limit Errors

Code	Description	Resolution
`rpm_exceeded`	Requests per minute exceeded	Wait and retry
`cost_limit_exceeded`	Cost limit reached	Increase limit or wait for reset
`daily_limit_exceeded`	Daily limit reached	Wait for daily reset
`concurrent_limit_exceeded`	Too many sessions	Close other sessions

Validation Errors

Code	Description	Resolution
`invalid_arguments`	Bad tool arguments	Fix argument format
`missing_required`	Required field missing	Add required field
`invalid_format`	Wrong data format	Use correct format
`file_too_large`	File exceeds limit	Reduce file size

Provider Errors

Code	Description	Resolution
`provider_unavailable`	Provider is down	Retry or use fallback
`provider_timeout`	Request timed out	Retry with smaller input
`model_overloaded`	Model capacity full	Wait and retry
`content_filtered`	Content blocked	Modify input
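
The Resolution columns above can be condensed into a lookup that tells calling code what kind of response an error code calls for. This is a sketch only: the action labels (`retry`, `wait_for_reset`, and so on) are an interpretation of the tables made for illustration, not values defined by the API.

```python
# Hypothetical action labels derived from the "Resolution" columns above.
ACTION_BY_CODE = {
    # Permission errors: someone has to change the agent's configuration.
    "tool_not_permitted": "request_access",
    "model_not_allowed": "request_access",
    "classification_denied": "request_access",
    "collection_not_permitted": "request_access",
    # Rate limit errors: short waits versus longer budget resets.
    "rpm_exceeded": "retry",
    "cost_limit_exceeded": "wait_for_reset",
    "daily_limit_exceeded": "wait_for_reset",
    "concurrent_limit_exceeded": "free_resources",
    # Validation errors: the request itself has to change.
    "invalid_arguments": "fix_request",
    "missing_required": "fix_request",
    "invalid_format": "fix_request",
    "file_too_large": "fix_request",
    # Provider errors: mostly transient, except filtered content.
    "provider_unavailable": "retry",
    "provider_timeout": "retry",
    "model_overloaded": "retry",
    "content_filtered": "fix_request",
}


def suggested_action(code: str) -> str:
    """Return the action label for a known code, or "unknown" otherwise."""
    return ACTION_BY_CODE.get(code, "unknown")
```

Keeping the mapping in one place means retry loops and user-facing messages can share the same classification instead of repeating code lists.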

Handling Errors

Python SDK

python

from gateflow_mcp import MCPClient, MCPError

client = MCPClient(agent_id="agent_abc123", api_key="gf-agent-...")

try:
    result = client.call_tool(
        name="llm/chat",
        arguments={"messages": [{"role": "user", "content": "Hello"}]}
    )
except MCPError as e:
    if e.type == "permission_error":
        print(f"Permission denied: {e.message}")
        print(f"Allowed tools: {e.details.get('permitted_tools')}")

    elif e.type == "rate_limit_error":
        print(f"Rate limited: {e.message}")
        retry_after = e.details.get("retry_after_seconds", 60)
        print(f"Retry after: {retry_after}s")

    elif e.type == "validation_error":
        print(f"Invalid input: {e.message}")
        print(f"Field: {e.details.get('field')}")

    elif e.type == "provider_error":
        print(f"Provider error: {e.message}")
        print(f"Provider: {e.details.get('provider')}")

    else:
        print(f"Error: {e.type} - {e.message}")
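
As the number of handled types grows, the if/elif chain above can be swapped for a dispatch table keyed on the error type. The sketch below assumes only that the error object exposes `type`, `message`, and `details`, as in the example above; `StubMCPError` is a stand-in so the snippet runs without the SDK installed.

```python
from typing import Callable, Dict, Optional


class StubMCPError(Exception):
    """Stand-in exposing the same attributes the SDK example above reads."""

    def __init__(self, type: str, code: str, message: str,
                 details: Optional[dict] = None):
        super().__init__(message)
        self.type = type
        self.code = code
        self.message = message
        self.details = details or {}


def describe_permission(e) -> str:
    return f"Permission denied: {e.message}"


def describe_rate_limit(e) -> str:
    wait = e.details.get("retry_after_seconds", 60)
    return f"Rate limited: {e.message} (retry after {wait}s)"


def describe_validation(e) -> str:
    return f"Invalid input: {e.message} (field: {e.details.get('field')})"


def describe_provider(e) -> str:
    return f"Provider error: {e.message} (provider: {e.details.get('provider')})"


# One handler per error type; unknown types fall through to a generic message.
HANDLERS: Dict[str, Callable] = {
    "permission_error": describe_permission,
    "rate_limit_error": describe_rate_limit,
    "validation_error": describe_validation,
    "provider_error": describe_provider,
}


def describe_error(e) -> str:
    handler = HANDLERS.get(e.type)
    if handler is None:
        return f"Error: {e.type} - {e.message}"
    return handler(e)
```

Adding support for a new type then means registering one more function rather than extending a branch.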

Retry Logic

python

import time
from gateflow_mcp import MCPClient, MCPError

def call_with_retry(client, tool, arguments, max_retries=3):
    for attempt in range(max_retries):
        try:
            return client.call_tool(tool, arguments)

        except MCPError as e:
            if e.type == "rate_limit_error":
                if attempt < max_retries - 1:
                    wait = e.details.get("retry_after_seconds", 60)
                    print(f"Rate limited, waiting {wait}s...")
                    time.sleep(wait)
                    continue

            elif e.type == "provider_error":
                # model_overloaded is listed as "Wait and retry" in the Provider Errors table
                if e.code in ("provider_unavailable", "provider_timeout", "model_overloaded"):
                    if attempt < max_retries - 1:
                        wait = 2 ** attempt  # Exponential backoff
                        print(f"Provider error, retrying in {wait}s...")
                        time.sleep(wait)
                        continue

            raise  # Re-raise if not retryable

    raise Exception("Max retries exceeded")
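
The backoff above doubles the wait on every attempt with no upper bound, and every client that fails at the same moment retries at the same moment. A common refinement is to cap the delay and randomize it (sometimes called full jitter). The helper below is a sketch of that idea; `backoff_delay` and its `base` and `cap` parameters are illustrative names, not SDK settings.

```python
import random


def backoff_delay(attempt: int, base: float = 1.0, cap: float = 30.0,
                  rng=random) -> float:
    """Return a random delay in [0, min(cap, base * 2**attempt)] seconds.

    attempt is zero-based, matching the loop variable in the retry example.
    rng can be the random module or a random.Random instance (useful in tests).
    """
    ceiling = min(cap, base * (2 ** attempt))
    return rng.uniform(0, ceiling)
```

In the retry loop, `wait = backoff_delay(attempt)` could replace `wait = 2 ** attempt` for provider errors. For rate limit errors, the server-supplied `retry_after_seconds` is usually the better guide.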

Error Recovery

python

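from gateflow_mcp import MCPError

def safe_search(client, query):
    """Search without naming a collection; retry against the public one if access is denied."""
    try:
        return client.call_tool(
            name="retrieval/search",
            arguments={"query": query}
        )
    except MCPError as e:
        if e.code == "collection_not_permitted":
            # Fall back to public collection
            return client.call_tool(
                name="retrieval/search",
                arguments={
                    "query": query,
                    "collection": "public-docs"
                }
            )
        raise  # Any other error is not recoverable here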

Error Logging

Structured Logging

python

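import json
import logging
from typing import Optional

from gateflow_mcp import MCPError

logger = logging.getLogger(__name__)

def log_mcp_error(e: MCPError, context: Optional[dict] = None):
    """Log an MCP error as a single JSON object so log pipelines can parse it."""
    error_log = {
        "type": e.type,
        "code": e.code,
        "message": e.message,
        "details": e.details,
        "context": context or {}
    }
    # default=str keeps logging from failing on values json cannot serialize
    logger.error("MCP Error: %s", json.dumps(error_log, default=str))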

Error Metrics

python

from gateflow_mcp import MCPError
from prometheus_client import Counter

mcp_errors = Counter(
    'mcp_errors_total',
    'MCP errors by type and code',
    ['type', 'code']
)

def track_error(e: MCPError):
    mcp_errors.labels(type=e.type, code=e.code).inc()
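
Tracking is easy to forget at individual call sites, so one option is to wrap calls in a context manager that records the error and then re-raises it. The sketch below uses an in-memory `collections.Counter` so it runs without Prometheus. `observe_errors` and `error_counts` are hypothetical names, and the code relies only on the exception carrying `type` and `code` attributes.

```python
from collections import Counter
from contextlib import contextmanager

# In-memory tally keyed by (type, code); swap in a metrics client as needed.
error_counts: Counter = Counter()


@contextmanager
def observe_errors():
    """Count any exception raised inside the block, then re-raise it unchanged."""
    try:
        yield
    except Exception as e:
        # Exceptions without MCP-style attributes are tallied as "unknown".
        key = (getattr(e, "type", "unknown"), getattr(e, "code", "unknown"))
        error_counts[key] += 1
        raise
```

Usage would look like `with observe_errors(): client.call_tool(...)`. Because the exception is re-raised, existing handlers around the call keep working.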

Graceful Degradation

Fallback Strategies

python

def get_response(client, user_input):
    # Try preferred model first
    try:
        return client.call_tool(
            name="llm/chat",
            arguments={
                "messages": [{"role": "user", "content": user_input}],
                "model": "gpt-5.2"
            }
        )
    except MCPError as e:
        if e.code == "model_not_allowed":
            # Fall back to allowed model
            return client.call_tool(
                name="llm/chat",
                arguments={
                    "messages": [{"role": "user", "content": user_input}],
                    "model": "gpt-5-mini"
                }
            )
        raise

Service Unavailable Handling

python

def handle_service_unavailable(e: MCPError):
    if e.code == "provider_unavailable":
        # Return cached response or placeholder
        return {
            "content": "Service temporarily unavailable. Please try again.",
            "cached": True
        }
    raise e
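
The handler above returns a fixed placeholder. To serve a genuinely cached response instead, the last successful result has to be stored somewhere. The following is a minimal sketch of a last-known-good cache; `LastGoodCache` and the `is_outage` predicate are illustrative assumptions, and a real deployment would also need expiry and size limits.

```python
from typing import Any, Callable, Dict, Hashable


class LastGoodCache:
    """Remember the last successful result per key and reuse it during outages."""

    def __init__(self) -> None:
        self._store: Dict[Hashable, Any] = {}

    def call(self, key: Hashable, fn: Callable[[], Any],
             is_outage: Callable[[Exception], bool]) -> Dict[str, Any]:
        try:
            result = fn()
        except Exception as e:
            # Serve the stored result only for outage-type errors we have data for.
            if is_outage(e) and key in self._store:
                return {"content": self._store[key], "cached": True}
            raise
        self._store[key] = result
        return {"content": result, "cached": False}
```

Here `is_outage` could be `lambda e: getattr(e, "code", None) == "provider_unavailable"`, and `fn` a zero-argument closure around `client.call_tool(...)`.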

Validation Before Calling

Pre-Flight Checks

python

def validate_before_call(client, tool, arguments):
    # Check permissions
    whoami = client.call_tool("self_inspect/whoami", {})

    if tool not in whoami["permissions"]["tools"]:
        raise ValueError(f"Tool {tool} not permitted")

    # Check budget
    usage = client.call_tool("self_inspect/get_my_usage", {})
    if usage["daily"]["remaining_budget"] < 0.10:
        raise ValueError("Insufficient budget")

    return True
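
The check above stops at the first failure and needs a live client. Separating the decision logic into a pure function makes it easier to unit test and lets it report every problem at once. This sketch assumes the same response shapes used above (`whoami["permissions"]["tools"]` and `usage["daily"]["remaining_budget"]`); `preflight_problems` and `min_budget` are hypothetical names.

```python
from typing import List


def preflight_problems(whoami: dict, usage: dict, tool: str,
                       min_budget: float = 0.10) -> List[str]:
    """Return a list of human-readable problems; an empty list means go ahead."""
    problems: List[str] = []

    # Permission check against the tools reported for this agent.
    permitted = whoami.get("permissions", {}).get("tools", [])
    if tool not in permitted:
        problems.append(f"Tool {tool} not permitted")

    # Budget check; a missing value is reported rather than assumed to be fine.
    remaining = usage.get("daily", {}).get("remaining_budget")
    if remaining is None:
        problems.append("Remaining budget unknown")
    elif remaining < min_budget:
        problems.append("Insufficient budget")

    return problems
```

The two dictionaries would come from the `self_inspect/whoami` and `self_inspect/get_my_usage` calls shown above.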

Error Documentation URLs

Each error includes a `doc_url` field that links to the documentation for that specific error code:

python

try:
    result = client.call_tool("llm/chat", arguments)
except MCPError as e:
    print(f"Error: {e.message}")
    print(f"Learn more: {e.doc_url}")

Best Practices

Always handle errors - Wrap MCP calls in try/except
Use specific handlers - Handle different error types appropriately
Implement retries - For transient errors
Log comprehensively - Include context in error logs
Degrade gracefully - Provide fallbacks where possible
Pre-validate - Check permissions before critical operations

Common Error Patterns

Budget Exhausted

python

try:
    result = client.call_tool("llm/chat", arguments)
except MCPError as e:
    if e.code != "cost_limit_exceeded":
        raise  # Not a budget problem; let the caller handle it
    if e.details.get("limit_type") == "cost_per_session":
        # Session budget is spent: start a new session and retry once
        client.new_session()
        result = client.call_tool("llm/chat", arguments)
    else:
        # Daily/monthly limit - need to wait
        raise

Permission Escalation Needed

python

# request_permission_escalation and UserFacingError are assumed to be defined by your application
try:
    result = client.call_tool("voice/transcribe", arguments)
except MCPError as e:
    if e.code != "tool_not_permitted":
        raise  # Unrelated error; do not swallow it
    # Log for admin review
    request_permission_escalation(
        agent_id=client.agent_id,
        tool="voice/transcribe",
        reason="User requested transcription feature"
    )
    raise UserFacingError("This feature requires additional permissions.") from e

Next Steps

Audit Logging - Track all operations
Cost Transparency - Monitor costs
Transport Config - Connection settings
