Cohere Integration

GateFlow provides full support for Cohere's efficient AI models, with built-in sustainability optimization.

Available Models

Chat Models

  • command-r-plus - Most capable model for complex tasks
  • command-r - Balanced performance and efficiency
  • command - Cost-effective for simpler tasks

Embedding Models

  • embed-english-v3.0 - Optimized for English text
  • embed-multilingual-v3.0 - Supports 100+ languages

Rerank Models

  • rerank-english-v3.0 - Improve RAG quality with semantic ranking
  • rerank-multilingual-v3.0 - Multilingual reranking
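
For programmatic model selection, the catalog above can be captured in a small lookup. This is a sketch that mirrors the lists in this section, not an official GateFlow API:

```python
# Cohere model catalog as listed above, keyed by capability
COHERE_MODELS = {
    "command-r-plus": "chat",
    "command-r": "chat",
    "command": "chat",
    "embed-english-v3.0": "embedding",
    "embed-multilingual-v3.0": "embedding",
    "rerank-english-v3.0": "rerank",
    "rerank-multilingual-v3.0": "rerank",
}

def models_for(capability):
    """Return the Cohere model IDs that support a given capability."""
    return [name for name, kind in COHERE_MODELS.items() if kind == capability]

print(models_for("rerank"))  # ['rerank-english-v3.0', 'rerank-multilingual-v3.0']
```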

Sustainability Benefits

Cohere models are optimized for efficiency:

  • Lower Carbon Footprint: Up to 30% less CO₂ per token vs comparable models
  • Faster Inference: Reduced compute time means lower energy consumption
  • Competitive Pricing: Cost efficiency often correlates with carbon efficiency
  • Specialized Models: Right-sized models for specific tasks

Example Usage

Basic Chat Completion

python
from openai import OpenAI

client = OpenAI(
    base_url="https://api.gateflow.ai/v1",
    api_key="gw_prod_your_key_here"
)

# Use Cohere for efficient chat completion
response = client.chat.completions.create(
    model="command-r-plus",
    messages=[{"role": "user", "content": "Analyze this document efficiently"}]
)

print(response.choices[0].message.content)
print(f"Carbon footprint: {response.sustainability.carbon_gco2e} gCO₂e")

Using Cohere Rerank for Better RAG

python
# Use Cohere rerank for better RAG results
documents = [
    "Document 1 content about sustainability...",
    "Document 2 content about AI efficiency...",
    "Document 3 content about renewable energy..."
]

rerank_response = client.rerank.create(
    model="rerank-english-v3.0",
    query="sustainable AI practices",
    documents=documents,
    top_n=2
)

# Get the top 2 most relevant documents
for result in rerank_response.results:
    print(f"Document {result.index}: Score {result.score}")
    print(f"Carbon saved: {result.sustainability.carbon_saved_gco2e} gCO₂e")

Sustain Mode with Cohere

python
# Let GateFlow choose the most sustainable Cohere model
response = client.chat.completions.create(
    model="cohere:auto",  # Auto-select most efficient Cohere model
    routing_mode="sustain_optimized",
    messages=[{"role": "user", "content": "Generate eco-friendly content"}]
)

print(f"Selected model: {response.model}")
print(f"Carbon saved: {response.sustainability.carbon_saved_gco2e} gCO₂e")

Cohere-Specific Features

Tool Use

Cohere models support function calling with GateFlow's unified interface:

python
# Define tools (works across all providers)
tools = [
    {
        "type": "function",
        "function": {
            "name": "get_weather",
            "description": "Get weather information for a location",
            "parameters": {
                "type": "object",
                "properties": {
                    "location": {"type": "string"},
                    "unit": {"type": "string", "enum": ["celsius", "fahrenheit"]}
                }
            }
        }
    }
]

response = client.chat.completions.create(
    model="command-r-plus",
    messages=[{"role": "user", "content": "What's the weather in Paris?"}],
    tools=tools
)
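
When the model decides to call a tool, the response carries the tool's name and its JSON-encoded arguments; the application executes the function and sends the result back. A minimal dispatch sketch follows (the `dispatch_tool_call` helper, the registry, and the `get_weather` stub are illustrative, not part of GateFlow or the OpenAI SDK):

```python
import json

def dispatch_tool_call(name, arguments_json, registry):
    """Look up a tool by name and call it with the decoded JSON arguments."""
    args = json.loads(arguments_json)
    return registry[name](**args)

# Hypothetical stub standing in for a real weather lookup
def get_weather(location, unit="celsius"):
    return {"location": location, "unit": unit, "temperature": 18}

registry = {"get_weather": get_weather}
result = dispatch_tool_call("get_weather", '{"location": "Paris"}', registry)
print(result)
```

In a real loop, `name` and `arguments_json` would come from the tool call entries on the chat completion response, and the returned value would be appended as a tool message in the next request.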

Search & Rerank Pipeline

Combine embeddings and rerank for optimal search results:

python
# Step 1: Generate embeddings
embedding_response = client.embeddings.create(
    model="embed-english-v3.0",
    input=["query text", "document 1", "document 2", "document 3"]
)

# Step 2: Use rerank for precision
rerank_response = client.rerank.create(
    model="rerank-english-v3.0",
    query="query text",
    documents=["document 1", "document 2", "document 3"],
    embeddings=embedding_response.data
)

Sustainability Best Practices

Model Selection Guide

| Use Case | Recommended Model |
| --- | --- |
| Complex analysis | command-r-plus |
| General chat | command-r |
| Simple tasks | command |
| English embeddings | embed-english-v3.0 |
| Semantic search | rerank-english-v3.0 |
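
The guide above can be expressed as a lookup helper. This is a sketch: the use-case keys are illustrative, and defaulting to the balanced chat model is an assumption, not documented behavior:

```python
# Recommended Cohere model per use case, per the selection guide above
RECOMMENDED = {
    "complex_analysis": "command-r-plus",
    "general_chat": "command-r",
    "simple_tasks": "command",
    "english_embeddings": "embed-english-v3.0",
    "semantic_search": "rerank-english-v3.0",
}

def pick_model(use_case):
    """Return the recommended Cohere model, defaulting to the balanced chat model."""
    return RECOMMENDED.get(use_case, "command-r")

print(pick_model("simple_tasks"))  # command
```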

Optimization Tips

  1. Right-size your model: Use command instead of command-r-plus for simple tasks
  2. Batch requests: Process multiple items in single API calls
  3. Use rerank: Improve RAG quality while reducing overall compute
  4. Enable caching: Cache frequent Cohere requests for maximum savings
  5. Combine with Sustain Mode: Let GateFlow optimize across all providers
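
Tip 2 (batching) can be as simple as slicing inputs before sending them, for example when embedding many documents. A sketch, assuming a batch size of 96, which is an illustrative choice rather than a documented limit:

```python
def batches(items, size=96):
    """Yield consecutive slices of at most `size` items."""
    for i in range(0, len(items), size):
        yield items[i:i + size]

texts = [f"doc {n}" for n in range(200)]
sizes = [len(chunk) for chunk in batches(texts)]
print(sizes)  # [96, 96, 8]
```

Each chunk would then go into a single `client.embeddings.create(...)` call instead of 200 individual requests.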

Performance Characteristics

Latency

  • Chat models: 200-800ms typical response time
  • Embedding models: 50-200ms per batch
  • Rerank models: 100-300ms per query

Token Limits

  • Chat models: Up to a 128K-token context window
  • Embedding models: Up to 512 tokens per text
  • Rerank models: Up to 512 documents per query
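
To stay under the 512-token-per-text embedding limit, long texts need truncation or chunking before submission. A naive whitespace approximation is sketched below; real subword tokenizers count more tokens than words, so leave headroom:

```python
def truncate_approx(text, max_tokens=512):
    """Crude whitespace-based truncation; subword tokenizers count more tokens than words."""
    words = text.split()
    return " ".join(words[:max_tokens])

long_text = "word " * 1000
print(len(truncate_approx(long_text).split()))  # 512
```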

Integration with Other GateFlow Features

Semantic Caching

Cohere embeddings work seamlessly with GateFlow's semantic caching:

python
# Enable semantic caching with Cohere embeddings
response = client.chat.completions.create(
    model="command-r-plus",
    messages=[{"role": "user", "content": "Frequently asked question"}],
    cache_ttl_seconds=3600,  # Cache for 1 hour
    embedding_model="embed-english-v3.0"  # Use Cohere for semantic matching
)

Multi-Provider Fallbacks

Configure Cohere as fallback for other providers:

python
# Set up fallback chain in Dashboard:
# Primary: OpenAI gpt-5.2
# Fallback 1: Cohere command-r-plus
# Fallback 2: Anthropic claude-3-5-sonnet

response = client.chat.completions.create(
    model="gpt-5.2",  # Will fallback to Cohere if OpenAI unavailable
    messages=[{"role": "user", "content": "Important request"}]
)

Troubleshooting

"Cohere API key not configured"

Solution: Add your Cohere API key in the GateFlow Dashboard under Settings → Providers.

"Model not found: command-r-plus"

Solution: Ensure the model name exactly matches one of the available Cohere models listed above.

"Rate limit exceeded"

Solution:

  1. Check your Cohere account limits
  2. Configure fallbacks to other providers
  3. Enable request queuing in GateFlow settings
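
Alongside server-side queuing, the client can retry with exponential backoff before giving up. A generic sketch; in practice, catch the SDK's rate-limit exception rather than bare `Exception`:

```python
import random
import time

def with_backoff(fn, max_retries=5, base_delay=1.0):
    """Call fn(), retrying on failure with exponentially growing, jittered delays."""
    for attempt in range(max_retries):
        try:
            return fn()
        except Exception:
            if attempt == max_retries - 1:
                raise
            # Delay grows 1x, 2x, 4x... with up to 100% random jitter
            time.sleep(base_delay * (2 ** attempt) * (1 + random.random()))
```

Usage would look like `with_backoff(lambda: client.chat.completions.create(...))`.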

Migration from Direct Cohere API

Key Differences

| Feature | Direct Cohere API | GateFlow Cohere Integration |
| --- | --- | --- |
| API Format | Cohere-specific | OpenAI-compatible |
| Authentication | Cohere API key | GateFlow API key |
| Model Names | command-r-plus | command-r-plus |
| Tool Support | Cohere format | OpenAI format |
| Carbon Tracking | Manual | Automatic |
| Multi-provider | No | Yes |
| Fallbacks | Manual | Automatic |

Migration Example

Before (Direct Cohere API):

python
import cohere
co = cohere.Client("your-cohere-api-key")
response = co.chat(
    model="command-r-plus",
    message="Hello from Cohere!"
)

After (GateFlow Integration):

python
from openai import OpenAI
client = OpenAI(
    base_url="https://api.gateflow.ai/v1",
    api_key="gw_prod_your_gateflow_key"
)
response = client.chat.completions.create(
    model="command-r-plus",
    messages=[{"role": "user", "content": "Hello from Cohere via GateFlow!"}]
)
