Google Integration

Google's Gemini models provide advanced AI capabilities with large context windows and multimodal support. GateFlow integrates seamlessly with Google's AI platform for optimized routing and sustainability.

Available Models

Chat Models

| Model ID | Type | Context Window | Best For |
|---|---|---|---|
| gemini-3-pro | Chat | 2M tokens | Complex reasoning, multi-turn conversations |
| gemini-3-flash | Chat | 1M tokens | Fast, cost-effective responses |
| gemini-2.5-pro | Chat | 1M tokens | Balanced performance |
| gemini-2.5-flash | Chat | 1M tokens | High-speed, low-cost tasks |
| gemini-2.5-flash-lite | Chat | 500K tokens | Lightweight tasks |

Embedding Models

| Model ID | Dimensions | Max Tokens | Best For |
|---|---|---|---|
| text-embedding-004 | 768 | 2,048 | Semantic search, clustering |

Configuration

json
{
  "provider": "google",
  "credentials": {
    "api_key": "AIza..."
  }
}
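
If you manage providers programmatically rather than through the Dashboard, a request like the following could submit this configuration. Note that the /v1/providers endpoint and payload shape are assumptions for illustration; the documented path is adding the key under Settings → Providers in the Dashboard.

python
import requests

# Hypothetical admin call; the endpoint below is an assumption, not a
# documented GateFlow API. The Dashboard (Settings → Providers) is the
# confirmed way to register credentials.
resp = requests.post(
    "https://api.gateflow.ai/v1/providers",
    headers={"Authorization": "Bearer gw_prod_your_key_here"},
    json={
        "provider": "google",
        "credentials": {"api_key": "AIza..."}
    },
)
resp.raise_for_status()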

Pricing

| Model | Input ($/M tokens) | Output ($/M tokens) |
|---|---|---|
| gemini-3-pro | $3.50 | $14.00 |
| gemini-3-flash | $0.10 | $0.40 |
| gemini-2.5-pro | $1.25 | $5.00 |
| gemini-2.5-flash | $0.075 | $0.30 |
| gemini-2.5-flash-lite | $0.05 | $0.20 |
| text-embedding-004 | $0.025 | N/A |
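
As a quick sanity check, the snippet below turns the chat-model rates above into a per-request cost estimate. The prices are copied from the table; the token counts are made up for illustration.

python
# Chat-model prices from the table above: (input, output) per 1M tokens
PRICES = {
    "gemini-3-pro": (3.50, 14.00),
    "gemini-3-flash": (0.10, 0.40),
    "gemini-2.5-pro": (1.25, 5.00),
    "gemini-2.5-flash": (0.075, 0.30),
    "gemini-2.5-flash-lite": (0.05, 0.20),
}

def estimate_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Rough request cost in USD from the published per-1M-token rates."""
    in_rate, out_rate = PRICES[model]
    return (input_tokens * in_rate + output_tokens * out_rate) / 1_000_000

# Example: a 500K-token document summarized into ~4K output tokens
print(f"${estimate_cost('gemini-3-pro', 500_000, 4_096):.4f}")  # ≈ $1.8073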

Sustainability Features

Google integration through GateFlow offers several sustainability benefits:

  • Carbon-Neutral Data Centers: Google matches 100% of its data center electricity use with renewable energy purchases
  • TPU Acceleration: Energy-efficient Tensor Processing Units for AI workloads
  • Intelligent Routing: Automatically select the most energy-efficient data center
  • Time-Shifted Execution: Defer non-urgent requests to low-carbon periods (see the sketch after this list)
  • Automatic Model Selection: Choose the most efficient Gemini model for your task
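
The sketch below shows what time-shifting could look like at the request level. Only routing_mode appears in the examples later on this page; the execution_deadline field is an assumed name used for illustration, so check the GateFlow routing reference for the actual parameter.

python
from openai import OpenAI

client = OpenAI(
    base_url="https://api.gateflow.ai/v1",
    api_key="gw_prod_your_key_here"
)

# Hypothetical: mark the request as deferrable so the gateway may hold it
# until a low-carbon window. "execution_deadline" is an assumed field name.
response = client.chat.completions.create(
    model="gemini-2.5-flash",
    messages=[{"role": "user", "content": "Generate the nightly report"}],
    extra_body={
        "routing_mode": "sustain_optimized",
        "execution_deadline": "2026-01-01T06:00:00Z"  # assumed parameter
    }
)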

Example Usage

Basic Chat Completion

python
from openai import OpenAI

client = OpenAI(
    base_url="https://api.gateflow.ai/v1",
    api_key="gw_prod_your_key_here"
)

# Use Google Gemini for large context tasks
response = client.chat.completions.create(
    model="gemini-3-pro",
    messages=[{"role": "user", "content": "Analyze this 1M token document"}],
    routing_mode="sustain_optimized"
)

print(f"Response: {response.choices[0].message.content}")
print(f"Model used: {response.model}")
print(f"Carbon footprint: {response.sustainability.carbon_gco2e} gCO₂e")
print(f"Carbon saved: {response.sustainability.carbon_saved_gco2e} gCO₂e")

Using Embeddings

python
# Generate embeddings for semantic search
embedding_response = client.embeddings.create(
    model="text-embedding-004",
    input=[
        "Document 1 content",
        "Document 2 content",
        "User query"
    ],
    routing_mode="sustain_optimized"
)

for i, embedding in enumerate(embedding_response.data):
    print(f"Embedding {i+1}: {len(embedding.embedding)} dimensions")
    print(f"Carbon footprint: {embedding.sustainability.carbon_gco2e} gCO₂e")

Large Context Processing

python
# Process very large documents with Gemini 3 Pro
response = client.chat.completions.create(
    model="gemini-3-pro",
    messages=[{"role": "user", "content": "Summarize this 500K token research paper"}],
    routing_mode="sustain_optimized",
    max_tokens=4096
)

print(f"Summary: {response.choices[0].message.content}")

Google-Specific Features

Multi-modal Support

python
# Multi-modal input with images
response = client.chat.completions.create(
    model="gemini-3-pro",
    messages=[
        {
            "role": "user",
            "content": [
                {"type": "text", "text": "Analyze this chart"},
                {
                    "type": "image_url",
                    "image_url": {
                        "url": "https://example.com/chart.png"
                    }
                }
            ]
        }
    ],
    routing_mode="sustain_optimized"
)

Function Calling

python
# Define functions for tool use
tools = [
    {
        "type": "function",
        "function": {
            "name": "search_knowledge_base",
            "description": "Search company knowledge base",
            "parameters": {
                "type": "object",
                "properties": {
                    "query": {"type": "string"}
                }
            }
        }
    }
]

response = client.chat.completions.create(
    model="gemini-3-pro",
    messages=[{"role": "user", "content": "Find information about our sustainability initiatives"}],
    tools=tools,
    routing_mode="sustain_optimized"
)
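
Assuming GateFlow relays Gemini tool calls in the OpenAI-compatible format, handling the response follows the standard loop: read tool_calls, run your function, and send the result back as a tool message. The search_knowledge_base stub below stands in for your own implementation.

python
import json

def search_knowledge_base(query: str) -> dict:
    # Stub; replace with your real knowledge-base lookup
    return {"results": [f"No backend wired up for: {query}"]}

message = response.choices[0].message
if message.tool_calls:
    call = message.tool_calls[0]
    args = json.loads(call.function.arguments)
    result = search_knowledge_base(args["query"])

    followup = client.chat.completions.create(
        model="gemini-3-pro",
        messages=[
            {"role": "user", "content": "Find information about our sustainability initiatives"},
            message,  # assistant turn containing the tool call
            {"role": "tool", "tool_call_id": call.id, "content": json.dumps(result)},
        ],
        tools=tools,
        extra_body={"routing_mode": "sustain_optimized"},
    )
    print(followup.choices[0].message.content)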

Model Selection Guide

| Use Case | Recommended Model | Key Features | Sustainability Benefits |
|---|---|---|---|
| Large context | gemini-3-pro | 2M context window | Carbon-neutral data centers |
| Fast reasoning | gemini-3-flash | 400ms latency | TPU-optimized efficiency |
| Balanced performance | gemini-2.5-pro | 1M context | Best quality-to-carbon ratio |
| High-volume | gemini-2.5-flash | Ultra-fast | Lowest carbon footprint |
| Lightweight tasks | gemini-2.5-flash-lite | Cost-effective | Minimal energy consumption |
| Embeddings | text-embedding-004 | 768 dimensions | Optimized embedding generation |
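
If you pick models in code rather than hard-coding them, the table collapses into a small lookup that simply mirrors the recommendations above:

python
# Mirror of the recommendation table above
RECOMMENDED = {
    "large_context": "gemini-3-pro",
    "fast_reasoning": "gemini-3-flash",
    "balanced": "gemini-2.5-pro",
    "high_volume": "gemini-2.5-flash",
    "lightweight": "gemini-2.5-flash-lite",
    "embeddings": "text-embedding-004",
}

model = RECOMMENDED["high_volume"]  # "gemini-2.5-flash"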

Sustainability Best Practices

Optimization Strategies

  1. Right-size your model: Use gemini-2.5-flash for simple tasks instead of Pro models
  2. Enable Sustain Mode: Let GateFlow automatically choose the most efficient Google model
  3. Use time-shifting: Defer non-urgent requests to low-carbon periods
  4. Batch requests: Process multiple items in single API calls to reduce overhead
  5. Leverage TPUs: Google's Tensor Processing Units provide energy-efficient acceleration

Configuration Example

python
# Configure Google provider with sustainability settings
response = client.chat.completions.create(
    model="google:auto",  # Let GateFlow choose the most efficient Google model
    messages=[{"role": "user", "content": "Process this sustainably"}],
    # GateFlow-specific options go through extra_body so the OpenAI SDK accepts them
    extra_body={
        "routing_mode": "sustain_optimized",
        "minimum_quality_score": 8,         # Balance quality and efficiency
        "region_preference": "us-central1"  # Prioritize Google's carbon-neutral regions
    }
)

Performance Characteristics

Latency Comparison

  • Fastest: gemini-2.5-flash (200ms)
  • Balanced: gemini-3-flash (400ms)
  • Standard: gemini-2.5-pro (900ms)
  • Advanced: gemini-3-pro (1,200ms)
  • Deep Think: gemini-3-deep-think (2,500ms)

Token Limits

  • Gemini 3 Pro: 2M context window, 8K output tokens
  • Gemini 3 Flash: 1M context window, 8K output tokens
  • Gemini 2.5 models: 500K-1M context window, 8K output tokens
  • Embedding model: 2K token input limit
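
Because the embedding model caps input at 2,048 tokens, a rough pre-flight check can avoid rejected requests. Gemini tokenizers are not exposed client-side through GateFlow, so the ~4-characters-per-token heuristic below is an approximation, not an exact count.

python
EMBEDDING_TOKEN_LIMIT = 2048

def roughly_fits(text: str, limit: int = EMBEDDING_TOKEN_LIMIT) -> bool:
    """Heuristic pre-flight check: assume ~4 characters per token."""
    return len(text) / 4 <= limit

doc = "Document 1 content " * 1000
if not roughly_fits(doc):
    # Naive truncation; chunking the document is usually the better fix
    doc = doc[: EMBEDDING_TOKEN_LIMIT * 4]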

Pricing Overview

  • Input prices: $0.05-$3.50 per 1M tokens
  • Output prices: $0.20-$14.00 per 1M tokens
  • Embeddings: $0.025 per 1M tokens

Integration with Other GateFlow Features

Multi-Provider Fallbacks

python
# Configure Google as primary with fallbacks
response = client.chat.completions.create(
    model="gemini-3-pro",  # Primary: Google
    messages=[{"role": "user", "content": "Important request"}],
    extra_body={
        "fallback_providers": ["anthropic", "openai"],  # Fallback chain
        "routing_mode": "sustain_optimized"
    }
)

Semantic Caching

python
# Cache frequent Google requests
response = client.chat.completions.create(
    model="gemini-2.5-pro",
    messages=[{"role": "user", "content": "Frequently asked question"}],
    extra_body={
        "cache_ttl_seconds": 3600,  # Cache for 1 hour
        "embedding_model": "text-embedding-004"  # Google embeddings for semantic matching
    }
)

Troubleshooting

"Google API key not configured"

Solution: Add your Google API key in the GateFlow Dashboard under Settings → Providers.

"Model not found: gemini-1.5-pro"

Solution: Use current models like gemini-3-pro instead of deprecated models.

"Rate limit exceeded"

Solution:

  1. Check your Google Cloud quota
  2. Configure fallbacks to other providers
  3. Enable request queuing in GateFlow settings (a client-side backoff sketch follows this list)
  4. Use gemini-2.5-flash for high-volume applications
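
On the client side, a minimal exponential-backoff wrapper pairs well with items 2 and 3; the sketch below retries on the OpenAI SDK's RateLimitError, reusing the client from the examples above.

python
import time
import openai

def create_with_backoff(max_retries: int = 5, **kwargs):
    """Retry chat completions with exponential backoff on rate limits."""
    for attempt in range(max_retries):
        try:
            return client.chat.completions.create(**kwargs)
        except openai.RateLimitError:
            time.sleep(2 ** attempt)  # 1s, 2s, 4s, ...
    raise RuntimeError("Still rate-limited after retries")

response = create_with_backoff(
    model="gemini-2.5-flash",  # recommended for high-volume workloads
    messages=[{"role": "user", "content": "Quick classification task"}],
)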

"Carbon savings lower than expected"

Solution:

  1. Verify Sustain Mode is properly configured
  2. Check grid carbon intensity in your region
  3. Try different Google models for better efficiency
  4. Enable time-shifted execution for non-urgent requests

Migration from Direct Google API

Key Differences

| Feature | Direct Google API | GateFlow Google Integration |
|---|---|---|
| API Format | Google-specific | OpenAI-compatible |
| Authentication | Google API key | GateFlow API key |
| Model Names | gemini-1.5-pro | gemini-3-pro |
| Carbon Tracking | Manual | Automatic |
| Multi-provider | No | Yes |
| Fallbacks | Manual | Automatic |
| Sustainability | Basic | Advanced optimization |

Migration Example

Before (Direct Google API):

python
import google.generativeai as genai
genai.configure(api_key="your-google-api-key")
model = genai.GenerativeModel("gemini-1.5-pro")
response = model.generate_content("Hello from Google!")

After (GateFlow Integration):

python
from openai import OpenAI
client = OpenAI(
    base_url="https://api.gateflow.ai/v1",
    api_key="gw_prod_your_gateflow_key"
)
response = client.chat.completions.create(
    model="gemini-3-pro",  # Use current models
    messages=[{"role": "user", "content": "Hello from Google via GateFlow with sustainability benefits!"}],
    routing_mode="sustain_optimized"  # Enable carbon optimization
)
