# OpenAI Integration
GateFlow provides comprehensive support for OpenAI's advanced AI models with sustainability optimization and carbon-efficient routing.
## Available Models

### Chat Models

#### GPT-5.2 Family (Flagship Models)
**`gpt-5.2`** - Flagship model with extended reasoning capabilities
- Context window: 200,000 tokens
- Max output: 16,384 tokens
- Supports: Chat, Vision, Functions, Streaming
- Average latency: 1,500ms
- Quality score: 10/10
**`gpt-5.2-instant`** - Fast variant with optimized latency
- Context window: 200,000 tokens
- Max output: 16,384 tokens
- Supports: Chat, Vision, Functions, Streaming
- Average latency: 800ms
- Quality score: 9/10
**`gpt-5.2-codex`** - Specialized for code generation and debugging
- Context window: 200,000 tokens
- Max output: 16,384 tokens
- Supports: Chat, Functions, Streaming
- Average latency: 1,600ms
- Quality score: 10/10
#### GPT-5 Family (Production Models)
**`gpt-5`** - Balanced model for production use
- Context window: 128,000 tokens
- Max output: 16,384 tokens
- Supports: Chat, Vision, Functions, Streaming
- Average latency: 1,000ms
- Quality score: 9/10
**`gpt-5-mini`** - Cost-effective for high-volume applications
- Context window: 128,000 tokens
- Max output: 16,384 tokens
- Supports: Chat, Functions, Streaming
- Average latency: 500ms
- Quality score: 8/10
**`gpt-5-nano`** - Ultra-fast for simple tasks
- Context window: 128,000 tokens
- Max output: 4,096 tokens
- Supports: Chat, Streaming
- Average latency: 300ms
- Quality score: 7/10
#### Specialized Models
**`o3`** - Advanced reasoning model
- Context window: 200,000 tokens
- Max output: 100,000 tokens
- Supports: Chat
- Average latency: 3,000ms
- Quality score: 10/10
**`o4-mini`** - Fast reasoning model
- Context window: 128,000 tokens
- Max output: 65,536 tokens
- Supports: Chat
- Average latency: 1,800ms
- Quality score: 9/10
### Embedding Models
**`text-embedding-3-large`** - High-quality embeddings (3,072 dimensions)
- Context window: 8,191 tokens
- Price: $0.13 per 1M tokens
- Average latency: 150ms
- Quality score: 10/10
**`text-embedding-3-small`** - Fast embeddings (1,536 dimensions)
- Context window: 8,191 tokens
- Price: $0.02 per 1M tokens
- Average latency: 100ms
- Quality score: 8/10
## Sustainability Features
OpenAI integration through GateFlow offers several sustainability benefits:
- Carbon-Optimized Routing: Automatically select the most energy-efficient data center
- Model Efficiency: GPT-5.2 models are significantly more efficient than previous generations
- Time-Shifted Execution: Defer non-urgent requests to low-carbon periods
- Request Batching: Combine multiple requests for reduced overhead
- Automatic Model Selection: Choose the most efficient model for your task
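One of these ideas, time-shifted execution, can be illustrated with a small standalone sketch: given an hourly grid carbon-intensity forecast, pick the lowest-carbon hour that still meets a deadline. This is illustrative logic only; the helper and the forecast values are hypothetical, not part of the GateFlow SDK:

```python
# Sketch of the time-shifting idea: given an hourly grid carbon-intensity
# forecast (gCO2e/kWh), pick the lowest-carbon hour before a deadline.
# The forecast numbers below are made up for illustration.

def best_execution_hour(forecast: dict[int, float], deadline_hour: int) -> int:
    """Return the hour with the lowest forecast intensity,
    considering only hours up to and including the deadline."""
    eligible = {h: v for h, v in forecast.items() if h <= deadline_hour}
    return min(eligible, key=eligible.get)

forecast = {9: 420.0, 10: 380.0, 11: 310.0, 12: 290.0, 13: 450.0}
print(best_execution_hour(forecast, deadline_hour=12))  # 12 (lowest intensity before deadline)
```

In practice GateFlow performs this scheduling server-side; the sketch only shows why deferring a non-urgent request can lower its footprint.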
## Example Usage

### Basic Chat Completion
```python
from openai import OpenAI

client = OpenAI(
    base_url="https://api.gateflow.ai/v1",
    api_key="gw_prod_your_key_here"
)

# Using GPT-5.2 for complex reasoning with sustainability optimization
response = client.chat.completions.create(
    model="gpt-5.2",
    messages=[{"role": "user", "content": "Analyze this complex document and provide insights"}],
    routing_mode="sustain_optimized",  # Enable carbon-optimized routing
    minimum_quality_score=9  # Balance quality and efficiency
)

print(f"Response: {response.choices[0].message.content}")
print(f"Model used: {response.model}")
print(f"Carbon footprint: {response.sustainability.carbon_gco2e} gCO₂e")
print(f"Carbon saved: {response.sustainability.carbon_saved_gco2e} gCO₂e")
```

### Using Embeddings for RAG
```python
# High-quality embeddings for semantic search
embedding_response = client.embeddings.create(
    model="text-embedding-3-large",
    input=[
        "Document 1 content about sustainability practices",
        "Document 2 content about renewable energy",
        "User query about eco-friendly AI solutions"
    ],
    routing_mode="sustain_optimized"
)

# Use embeddings for semantic search
for i, embedding in enumerate(embedding_response.data):
    print(f"Embedding {i+1}: {len(embedding.embedding)} dimensions")
    print(f"Carbon footprint: {embedding.sustainability.carbon_gco2e} gCO₂e")
```

### Advanced Reasoning with o3
```python
# Using o3 for complex reasoning tasks
response = client.chat.completions.create(
    model="o3",
    messages=[{"role": "user", "content": "Solve this complex mathematical problem step by step"}],
    routing_mode="sustain_optimized",
    timeout_seconds=30  # Allow extra time for complex reasoning
)

print(f"Reasoning steps: {response.choices[0].message.content}")
```

### Code Generation with GPT-5.2 Codex
```python
# Specialized code generation
response = client.chat.completions.create(
    model="gpt-5.2-codex",
    messages=[{"role": "user", "content": "Generate Python code for a sustainable AI pipeline"}],
    routing_mode="sustain_optimized"
)

print(f"Generated code:\n{response.choices[0].message.content}")
```

## OpenAI-Specific Features
### Function Calling
```python
# Define functions (works across all OpenAI models)
tools = [
    {
        "type": "function",
        "function": {
            "name": "get_weather",
            "description": "Get weather information for a location",
            "parameters": {
                "type": "object",
                "properties": {
                    "location": {"type": "string"},
                    "unit": {"type": "string", "enum": ["celsius", "fahrenheit"]}
                }
            }
        }
    }
]

response = client.chat.completions.create(
    model="gpt-5.2",
    messages=[{"role": "user", "content": "What's the weather in Paris?"}],
    tools=tools,
    routing_mode="sustain_optimized"
)
```

### Vision Capabilities
```python
# Multi-modal input with vision
response = client.chat.completions.create(
    model="gpt-5.2",
    messages=[
        {
            "role": "user",
            "content": [
                {"type": "text", "text": "Analyze this sustainability chart"},
                {
                    "type": "image_url",
                    "image_url": {
                        "url": "https://example.com/sustainability-chart.png"
                    }
                }
            ]
        }
    ],
    routing_mode="sustain_optimized"
)
```

### Streaming Responses
```python
# Stream responses for better user experience
stream = client.chat.completions.create(
    model="gpt-5.2-instant",
    messages=[{"role": "user", "content": "Generate a detailed sustainability report"}],
    stream=True,
    routing_mode="sustain_optimized"
)

for chunk in stream:
    if chunk.choices[0].delta.content:
        print(chunk.choices[0].delta.content, end="", flush=True)
```

## Model Selection Guide
| Use Case | Recommended Model | Key Features | Sustainability Benefits |
|---|---|---|---|
| Complex reasoning | gpt-5.2 | 200K context, multimodal | Highest efficiency per token |
| Fast responses | gpt-5.2-instant | 800ms latency | Optimized for speed and efficiency |
| Code generation | gpt-5.2-codex | Specialized coding | Reduced compute for coding tasks |
| Production use | gpt-5 | Balanced performance | Best quality-to-carbon ratio |
| High-volume apps | gpt-5-mini | Cost-effective | Lowest carbon footprint |
| Simple tasks | gpt-5-nano | Ultra-fast | Minimal energy consumption |
| Advanced reasoning | o3 | 100K output tokens | Optimized for complex tasks |
| Fast reasoning | o4-mini | 65K output tokens | Efficient reasoning architecture |
| High-quality embeddings | text-embedding-3-large | 3,072 dimensions | Optimized embedding generation |
| Cost-effective embeddings | text-embedding-3-small | 1,536 dimensions | Lowest energy embeddings |
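The table above can be mirrored in a small client-side helper that maps a use case to its recommended model. The mapping comes straight from the table; the helper function itself is a hypothetical convenience, not part of any SDK:

```python
# Use-case -> model mapping taken from the Model Selection Guide table.
RECOMMENDED_MODELS = {
    "complex_reasoning": "gpt-5.2",
    "fast_responses": "gpt-5.2-instant",
    "code_generation": "gpt-5.2-codex",
    "production": "gpt-5",
    "high_volume": "gpt-5-mini",
    "simple_tasks": "gpt-5-nano",
    "advanced_reasoning": "o3",
    "fast_reasoning": "o4-mini",
}

def pick_model(use_case: str) -> str:
    # Fall back to the balanced production model for unknown use cases
    return RECOMMENDED_MODELS.get(use_case, "gpt-5")

print(pick_model("simple_tasks"))  # gpt-5-nano
print(pick_model("unknown"))       # gpt-5
```

The string returned by `pick_model` can be passed directly as the `model` parameter in the examples above.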
## Sustainability Best Practices

### Optimization Strategies
- **Right-size your model**: Use `gpt-5-mini` or `gpt-5-nano` for simple tasks instead of flagship models
- **Enable Sustain Mode**: Let GateFlow automatically choose the most efficient OpenAI model
- **Use time-shifting**: Defer non-urgent requests to low-carbon periods
- **Batch requests**: Process multiple items in single API calls to reduce overhead
- **Combine with caching**: Cache frequent OpenAI requests for maximum savings
- **Region optimization**: Select data centers in low-carbon regions
### Configuration Example
```python
# Configure OpenAI provider with sustainability settings
response = client.chat.completions.create(
    model="auto",  # Let GateFlow choose the most efficient OpenAI model
    messages=[{"role": "user", "content": "Process this sustainably"}],
    routing_mode="sustain_optimized",
    minimum_quality_score=8,  # Balance quality and efficiency
    region_preference="us-west",  # Prioritize low-carbon regions
    max_carbon_budget_gco2e=50  # Set maximum carbon budget
)
```

## Performance Characteristics
### Latency Comparison
- **Fastest**: `gpt-5-nano` (300ms)
- **Balanced**: `gpt-5-mini` (500ms), `gpt-5.2-instant` (800ms)
- **Standard**: `gpt-5` (1,000ms), `gpt-5.2` (1,500ms)
- **Advanced**: `o4-mini` (1,800ms), `o3` (3,000ms)
### Token Limits
- Standard models: 128K-200K context window
- Output limits: 4K-100K tokens depending on model
- Embedding models: 8K token input limit
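A rough pre-flight check against these limits can catch oversized inputs before they reach the API. The character-based token estimate below is a crude heuristic (roughly 4 characters per token for English text); a real tokenizer such as tiktoken would give accurate counts:

```python
# Illustrative pre-flight check against the documented context windows.
# Token counts are approximated; use a real tokenizer for production.
CONTEXT_WINDOWS = {
    "gpt-5.2": 200_000,
    "gpt-5": 128_000,
    "o3": 200_000,
    "text-embedding-3-large": 8_191,
}

def fits_context(model: str, text: str) -> bool:
    approx_tokens = len(text) // 4  # rough heuristic: ~4 chars per token
    return approx_tokens <= CONTEXT_WINDOWS[model]

print(fits_context("text-embedding-3-large", "word " * 10_000))  # False
```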
### Pricing Overview
- Input prices: $0.10-$10.00 per 1M tokens
- Output prices: $0.04-$40.00 per 1M tokens
- Embeddings: $0.02-$0.13 per 1M tokens
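For the embedding models, whose per-token prices are quoted exactly above, per-request cost is simple arithmetic (price per 1M tokens × tokens ÷ 1,000,000). A minimal sketch:

```python
# Cost arithmetic using the embedding prices quoted above ($ per 1M tokens).
# Chat prices vary by model, so only the fixed embedding prices appear here.
PRICE_PER_1M = {
    "text-embedding-3-large": 0.13,
    "text-embedding-3-small": 0.02,
}

def embedding_cost_usd(model: str, tokens: int) -> float:
    return PRICE_PER_1M[model] * tokens / 1_000_000

# Embedding 500K tokens with the large model:
print(f"${embedding_cost_usd('text-embedding-3-large', 500_000):.3f}")  # $0.065
```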
## Integration with Other GateFlow Features

### Multi-Provider Fallbacks
```python
# Configure OpenAI as primary with fallbacks
response = client.chat.completions.create(
    model="gpt-5.2",  # Primary: OpenAI
    messages=[{"role": "user", "content": "Important request"}],
    fallback_providers=["anthropic", "mistral"],  # Fallback chain
    routing_mode="sustain_optimized"
)
```

### Semantic Caching
```python
# Cache frequent OpenAI requests
response = client.chat.completions.create(
    model="gpt-5",
    messages=[{"role": "user", "content": "Frequently asked sustainability question"}],
    cache_ttl_seconds=3600,  # Cache for 1 hour
    embedding_model="text-embedding-3-small"  # Use for semantic matching
)
```

## Troubleshooting
### "OpenAI API key not configured"

**Solution**: Add your OpenAI API key in the GateFlow Dashboard under Settings → Providers.
### "Model not found: gpt-4-turbo"

**Solution**: Use current models like `gpt-5` instead of deprecated models. Check the model compatibility guide.
### "Rate limit exceeded"

**Solution**:
- Check your OpenAI account limits
- Configure fallbacks to other providers
- Enable request queuing in GateFlow settings
- Use `gpt-5-mini` for high-volume applications
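Alongside GateFlow's built-in queuing and fallbacks, a client-side retry with exponential backoff is a common safety net for rate limits. The sketch below is generic Python; substitute your SDK's actual rate-limit exception type for the placeholder `RuntimeError`:

```python
import time

def with_backoff(call, max_attempts=4, base_delay=0.5):
    """Retry `call` on failure, doubling the delay after each attempt."""
    for attempt in range(max_attempts):
        try:
            return call()
        except RuntimeError:  # placeholder: use the SDK's rate-limit error type
            if attempt == max_attempts - 1:
                raise
            time.sleep(base_delay * 2 ** attempt)  # 0.5s, 1s, 2s, ...

# Example: a call that fails twice before succeeding
attempts = {"n": 0}
def flaky():
    attempts["n"] += 1
    if attempts["n"] < 3:
        raise RuntimeError("rate limit exceeded")
    return "ok"

print(with_backoff(flaky, base_delay=0.1))  # ok (after 2 retries)
```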
### "Carbon savings lower than expected"

**Solution**:
- Verify Sustain Mode is properly configured
- Check grid carbon intensity in your region
- Try different OpenAI models for better efficiency
- Enable time-shifted execution for non-urgent requests
## Migration from Direct OpenAI API

### Key Differences
| Feature | Direct OpenAI API | GateFlow OpenAI Integration |
|---|---|---|
| API Format | OpenAI-specific | OpenAI-compatible |
| Authentication | OpenAI API key | GateFlow API key |
| Model Names | gpt-4, gpt-3.5-turbo | gpt-5.2, gpt-5 |
| Carbon Tracking | Manual | Automatic |
| Multi-provider | No | Yes |
| Fallbacks | Manual | Automatic |
| Sustainability | Basic | Advanced optimization |
### Migration Example

**Before (Direct OpenAI API):**
```python
import openai

openai.api_key = "sk-proj-..."

response = openai.ChatCompletion.create(
    model="gpt-4",
    messages=[{"role": "user", "content": "Hello from OpenAI!"}]
)
```

**After (GateFlow Integration):**
```python
from openai import OpenAI

client = OpenAI(
    base_url="https://api.gateflow.ai/v1",
    api_key="gw_prod_your_gateflow_key"
)

response = client.chat.completions.create(
    model="gpt-5.2",  # Use current models
    messages=[{"role": "user", "content": "Hello from OpenAI via GateFlow with sustainability benefits!"}],
    routing_mode="sustain_optimized"  # Enable carbon optimization
)
```

## Next Steps
- Explore Anthropic Integration - Advanced reasoning models
- Try Google Gemini Models - Large context capabilities
- Configure Sustain Mode - Automatic carbon optimization
- View Provider Analytics - Monitor your OpenAI carbon savings