OpenAI Integration

GateFlow provides comprehensive support for OpenAI's advanced AI models with sustainability optimization and carbon-efficient routing.

Available Models

Chat Models

GPT-5.2 Family (Flagship Models)

  • gpt-5.2 - Flagship model with extended reasoning capabilities

    • Context window: 200,000 tokens
    • Max output: 16,384 tokens
    • Supports: Chat, Vision, Functions, Streaming
    • Average latency: 1,500ms
    • Quality score: 10/10
  • gpt-5.2-instant - Fast variant with optimized latency

    • Context window: 200,000 tokens
    • Max output: 16,384 tokens
    • Supports: Chat, Vision, Functions, Streaming
    • Average latency: 800ms
    • Quality score: 9/10
  • gpt-5.2-codex - Specialized for code generation and debugging

    • Context window: 200,000 tokens
    • Max output: 16,384 tokens
    • Supports: Chat, Functions, Streaming
    • Average latency: 1,600ms
    • Quality score: 10/10

GPT-5 Family (Production Models)

  • gpt-5 - Balanced model for production use

    • Context window: 128,000 tokens
    • Max output: 16,384 tokens
    • Supports: Chat, Vision, Functions, Streaming
    • Average latency: 1,000ms
    • Quality score: 9/10
  • gpt-5-mini - Cost-effective for high-volume applications

    • Context window: 128,000 tokens
    • Max output: 16,384 tokens
    • Supports: Chat, Functions, Streaming
    • Average latency: 500ms
    • Quality score: 8/10
  • gpt-5-nano - Ultra-fast for simple tasks

    • Context window: 128,000 tokens
    • Max output: 4,096 tokens
    • Supports: Chat, Streaming
    • Average latency: 300ms
    • Quality score: 7/10

Specialized Models

  • o3 - Advanced reasoning model

    • Context window: 200,000 tokens
    • Max output: 100,000 tokens
    • Supports: Chat
    • Average latency: 3,000ms
    • Quality score: 10/10
  • o4-mini - Fast reasoning model

    • Context window: 128,000 tokens
    • Max output: 65,536 tokens
    • Supports: Chat
    • Average latency: 1,800ms
    • Quality score: 9/10

Embedding Models

  • text-embedding-3-large - High-quality embeddings (3,072 dimensions)

    • Context window: 8,191 tokens
    • Price: $0.13 per 1M tokens
    • Average latency: 150ms
    • Quality score: 10/10
  • text-embedding-3-small - Fast embeddings (1,536 dimensions)

    • Context window: 8,191 tokens
    • Price: $0.02 per 1M tokens
    • Average latency: 100ms
    • Quality score: 8/10

Sustainability Features

OpenAI integration through GateFlow offers several sustainability benefits:

  • Carbon-Optimized Routing: Automatically select the most energy-efficient data center
  • Model Efficiency: GPT-5.2 models are significantly more efficient than previous generations
  • Time-Shifted Execution: Defer non-urgent requests to low-carbon periods
  • Request Batching: Combine multiple requests for reduced overhead
  • Automatic Model Selection: Choose the most efficient model for your task
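Request batching, mentioned above, can also be approximated client-side by grouping prompts before they are sent. A minimal, illustrative helper (the function name and batch size are this sketch's own, not part of the GateFlow API):

```python
def batch_prompts(prompts, max_batch_size=8):
    """Group prompts into batches so multiple items can share one API call."""
    return [prompts[i:i + max_batch_size]
            for i in range(0, len(prompts), max_batch_size)]

# Ten items collapse into three API calls instead of ten
batches = batch_prompts([f"Summarize item {n}" for n in range(10)], max_batch_size=4)
print(len(batches))     # 3
print(len(batches[0]))  # 4
```

Each batch can then be sent as a single request (for example, as one `input` list to the embeddings endpoint), reducing per-request overhead.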

Example Usage

Basic Chat Completion

python
from openai import OpenAI

client = OpenAI(
    base_url="https://api.gateflow.ai/v1",
    api_key="gw_prod_your_key_here"
)

# Using GPT-5.2 for complex reasoning with sustainability optimization
response = client.chat.completions.create(
    model="gpt-5.2",
    messages=[{"role": "user", "content": "Analyze this complex document and provide insights"}],
    routing_mode="sustain_optimized",  # Enable carbon-optimized routing
    minimum_quality_score=9  # Balance quality and efficiency
)

print(f"Response: {response.choices[0].message.content}")
print(f"Model used: {response.model}")
print(f"Carbon footprint: {response.sustainability.carbon_gco2e} gCO₂e")
print(f"Carbon saved: {response.sustainability.carbon_saved_gco2e} gCO₂e")

Using Embeddings for RAG

python
# High-quality embeddings for semantic search
embedding_response = client.embeddings.create(
    model="text-embedding-3-large",
    input=[
        "Document 1 content about sustainability practices",
        "Document 2 content about renewable energy",
        "User query about eco-friendly AI solutions"
    ],
    routing_mode="sustain_optimized"
)

# Use embeddings for semantic search
for i, embedding in enumerate(embedding_response.data):
    print(f"Embedding {i+1}: {len(embedding.embedding)} dimensions")
    print(f"Carbon footprint: {embedding.sustainability.carbon_gco2e} gCO₂e")
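Once the embeddings are retrieved, semantic search itself is a client-side similarity ranking. A minimal cosine-similarity sketch (the toy vectors below stand in for the real embedding data returned above):

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm

# Toy vectors standing in for the two document embeddings and the query embedding
doc_vectors = [[0.9, 0.1, 0.0], [0.2, 0.8, 0.1]]
query_vector = [0.85, 0.15, 0.0]

# Rank documents by similarity to the query, best match first
ranked = sorted(range(len(doc_vectors)),
                key=lambda i: cosine_similarity(doc_vectors[i], query_vector),
                reverse=True)
print(ranked)  # [0, 1]
```

In a real RAG pipeline the same ranking would run over `embedding_response.data[i].embedding` vectors, typically via a vector database rather than a linear scan.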

Advanced Reasoning with O3

python
# Using O3 for complex reasoning tasks
response = client.chat.completions.create(
    model="o3",
    messages=[{"role": "user", "content": "Solve this complex mathematical problem step by step"}],
    routing_mode="sustain_optimized",
    timeout_seconds=30  # Allow extra time for complex reasoning
)

print(f"Reasoning steps: {response.choices[0].message.content}")

Code Generation with GPT-5.2 Codex

python
# Specialized code generation
response = client.chat.completions.create(
    model="gpt-5.2-codex",
    messages=[{"role": "user", "content": "Generate Python code for a sustainable AI pipeline"}],
    routing_mode="sustain_optimized"
)

print(f"Generated code:\n{response.choices[0].message.content}")

OpenAI-Specific Features

Function Calling

python
# Define functions (works across all OpenAI models)
tools = [
    {
        "type": "function",
        "function": {
            "name": "get_weather",
            "description": "Get weather information for a location",
            "parameters": {
                "type": "object",
                "properties": {
                    "location": {"type": "string"},
                    "unit": {"type": "string", "enum": ["celsius", "fahrenheit"]}
                }
            }
        }
    }
]

response = client.chat.completions.create(
    model="gpt-5.2",
    messages=[{"role": "user", "content": "What's the weather in Paris?"}],
    tools=tools,
    routing_mode="sustain_optimized"
)
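When the model responds with a tool call, your application executes the matching local function and returns the result. A minimal dispatch sketch (the `get_weather` implementation and its fixed temperature are illustrative placeholders):

```python
import json

# Hypothetical local implementation backing the get_weather tool defined above
def get_weather(location, unit="celsius"):
    return {"location": location, "temperature": 21, "unit": unit}

def dispatch_tool_call(name, arguments_json):
    """Route a model-issued tool call to the matching local function."""
    handlers = {"get_weather": get_weather}
    return handlers[name](**json.loads(arguments_json))

# In a real flow, name and arguments come from
# response.choices[0].message.tool_calls[0].function
result = dispatch_tool_call("get_weather", '{"location": "Paris", "unit": "celsius"}')
print(result["location"])  # Paris
```

The result is then appended to the conversation as a `"tool"` role message so the model can produce its final answer.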

Vision Capabilities

python
# Multi-modal input with vision
response = client.chat.completions.create(
    model="gpt-5.2",
    messages=[
        {
            "role": "user",
            "content": [
                {"type": "text", "text": "Analyze this sustainability chart"},
                {
                    "type": "image_url",
                    "image_url": {
                        "url": "https://example.com/sustainability-chart.png"
                    }
                }
            ]
        }
    ],
    routing_mode="sustain_optimized"
)

Streaming Responses

python
# Stream responses for better user experience
stream = client.chat.completions.create(
    model="gpt-5.2-instant",
    messages=[{"role": "user", "content": "Generate a detailed sustainability report"}],
    stream=True,
    routing_mode="sustain_optimized"
)

for chunk in stream:
    if chunk.choices[0].delta.content:
        print(chunk.choices[0].delta.content, end="", flush=True)

Model Selection Guide

| Use Case | Recommended Model | Key Features | Sustainability Benefits |
| --- | --- | --- | --- |
| Complex reasoning | gpt-5.2 | 200K context, multimodal | Highest efficiency per token |
| Fast responses | gpt-5.2-instant | 800ms latency | Optimized for speed and efficiency |
| Code generation | gpt-5.2-codex | Specialized coding | Reduced compute for coding tasks |
| Production use | gpt-5 | Balanced performance | Best quality-to-carbon ratio |
| High-volume apps | gpt-5-mini | Cost-effective | Lowest carbon footprint |
| Simple tasks | gpt-5-nano | Ultra-fast | Minimal energy consumption |
| Advanced reasoning | o3 | 100K output tokens | Optimized for complex tasks |
| Fast reasoning | o4-mini | 65K output tokens | Efficient reasoning architecture |
| High-quality embeddings | text-embedding-3-large | 3,072 dimensions | Optimized embedding generation |
| Cost-effective embeddings | text-embedding-3-small | 1,536 dimensions | Lowest energy embeddings |

Sustainability Best Practices

Optimization Strategies

  1. Right-size your model: Use gpt-5-mini or gpt-5-nano for simple tasks instead of flagship models
  2. Enable Sustain Mode: Let GateFlow automatically choose the most efficient OpenAI model
  3. Use time-shifting: Defer non-urgent requests to low-carbon periods
  4. Batch requests: Process multiple items in single API calls to reduce overhead
  5. Combine with caching: Cache frequent OpenAI requests for maximum savings
  6. Region optimization: Select data centers in low-carbon regions

Configuration Example

python
# Configure OpenAI provider with sustainability settings
response = client.chat.completions.create(
    model="auto",  # Let GateFlow choose most efficient OpenAI model
    messages=[{"role": "user", "content": "Process this sustainably"}],
    routing_mode="sustain_optimized",
    minimum_quality_score=8,  # Balance quality and efficiency
    region_preference="us-west",  # Prioritize low-carbon regions
    max_carbon_budget_gco2e=50  # Set maximum carbon budget
)

Performance Characteristics

Latency Comparison

  • Fastest: gpt-5-nano (300ms)
  • Balanced: gpt-5-mini (500ms), gpt-5.2-instant (800ms)
  • Standard: gpt-5 (1,000ms), gpt-5.2 (1,500ms)
  • Advanced: o4-mini (1,800ms), o3 (3,000ms)

Token Limits

  • Standard models: 128K-200K context window
  • Output limits: 4K-100K tokens depending on model
  • Embedding models: 8K token input limit

Pricing Overview

  • Input prices: $0.10-$10.00 per 1M tokens
  • Output prices: $0.04-$40.00 per 1M tokens
  • Embeddings: $0.02-$0.13 per 1M tokens
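Request cost follows directly from token counts and the per-1M-token rates above. A small helper showing the arithmetic (the example uses the low end of the quoted ranges; substitute the current rates for your chosen model):

```python
def estimate_cost(input_tokens, output_tokens,
                  input_price_per_1m, output_price_per_1m):
    """Estimate request cost in USD from token counts and per-1M-token prices."""
    return ((input_tokens / 1_000_000) * input_price_per_1m
            + (output_tokens / 1_000_000) * output_price_per_1m)

# 50K input + 5K output tokens at $0.10 / $0.04 per 1M tokens
cost = estimate_cost(50_000, 5_000, input_price_per_1m=0.10, output_price_per_1m=0.04)
print(f"${cost:.4f}")  # $0.0052
```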

Integration with Other GateFlow Features

Multi-Provider Fallbacks

python
# Configure OpenAI as primary with fallbacks
response = client.chat.completions.create(
    model="gpt-5.2",  # Primary: OpenAI
    messages=[{"role": "user", "content": "Important request"}],
    fallback_providers=["anthropic", "mistral"],  # Fallback chain
    routing_mode="sustain_optimized"
)

Semantic Caching

python
# Cache frequent OpenAI requests
response = client.chat.completions.create(
    model="gpt-5",
    messages=[{"role": "user", "content": "Frequently asked sustainability question"}],
    cache_ttl_seconds=3600,  # Cache for 1 hour
    embedding_model="text-embedding-3-small"  # Use for semantic matching
)

Troubleshooting

"OpenAI API key not configured"

Solution: Add your OpenAI API key in the GateFlow Dashboard under Settings → Providers.

"Model not found: gpt-4-turbo"

Solution: Use current models like gpt-5 instead of deprecated models. Check the model compatibility guide.

"Rate limit exceeded"

Solution:

  1. Check your OpenAI account limits
  2. Configure fallbacks to other providers
  3. Enable request queuing in GateFlow settings
  4. Use gpt-5-mini for high-volume applications
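Between those steps, a client-side retry with exponential backoff smooths over transient rate limits. A generic sketch (here `RuntimeError` stands in for the SDK's rate-limit exception, `openai.RateLimitError` in openai>=1.0):

```python
import random
import time

def with_backoff(fn, max_retries=4, base_delay=1.0):
    """Retry fn on rate-limit errors with exponential backoff and jitter."""
    for attempt in range(max_retries):
        try:
            return fn()
        except RuntimeError:  # stand-in for the SDK's RateLimitError
            if attempt == max_retries - 1:
                raise  # out of retries; surface the error
            # 1s, 2s, 4s, ... plus jitter to avoid synchronized retries
            time.sleep(base_delay * (2 ** attempt) + random.uniform(0, 0.1))

# Usage: wrap the API call in a zero-argument callable
# result = with_backoff(lambda: client.chat.completions.create(...))
```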

"Carbon savings lower than expected"

Solution:

  1. Verify Sustain Mode is properly configured
  2. Check grid carbon intensity in your region
  3. Try different OpenAI models for better efficiency
  4. Enable time-shifted execution for non-urgent requests

Migration from Direct OpenAI API

Key Differences

| Feature | Direct OpenAI API | GateFlow OpenAI Integration |
| --- | --- | --- |
| API Format | OpenAI-specific | OpenAI-compatible |
| Authentication | OpenAI API key | GateFlow API key |
| Model Names | gpt-4, gpt-3.5-turbo | gpt-5.2, gpt-5 |
| Carbon Tracking | Manual | Automatic |
| Multi-provider | No | Yes |
| Fallbacks | Manual | Automatic |
| Sustainability | Basic | Advanced optimization |

Migration Example

Before (Direct OpenAI API):

python
import openai  # legacy openai<1.0 SDK style
openai.api_key = "sk-proj-..."
response = openai.ChatCompletion.create(
    model="gpt-4",
    messages=[{"role": "user", "content": "Hello from OpenAI!"}]
)

After (GateFlow Integration):

python
from openai import OpenAI
client = OpenAI(
    base_url="https://api.gateflow.ai/v1",
    api_key="gw_prod_your_gateflow_key"
)
response = client.chat.completions.create(
    model="gpt-5.2",  # Use current models
    messages=[{"role": "user", "content": "Hello from OpenAI via GateFlow with sustainability benefits!"}],
    routing_mode="sustain_optimized"  # Enable carbon optimization
)
