Curated Eval Suites
10+ pre-built evaluation suites covering safety, quality, RAG faithfulness, and compliance. Start evaluating in minutes with battle-tested benchmarks.
Evaluate continuously, route intelligently, comply automatically. The first AI gateway where eval results drive every decision.
Most evaluation tools are disconnected from production. You run evals in notebooks, see results in dashboards, but nothing changes automatically. GateFlow closes the loop.
GateFlow enables quality-driven AI infrastructure:
Evals run automatically on production traffic:
```python
from openai import OpenAI

client = OpenAI(
    base_url="https://api.gateflow.ai/v1",
    api_key="gf-..."  # Your GateFlow API key
)

# Standard inference - evals sample automatically
response = client.chat.completions.create(
    model="auto",  # Routing informed by eval scores
    messages=[{"role": "user", "content": "Hello!"}]
)

# Or run explicit eval suites
from gateflow import EvalClient

eval_client = EvalClient(api_key="gf-...")
results = eval_client.run_suite(
    suite="safety-core",
    model="gpt-4o"
)
print(f"Safety score: {results.aggregate_score}%")
```

That's it. Your requests flow through GateFlow with automatic evaluation, quality-driven routing, and compliance reporting.
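The closed loop, where eval scores feed back into model selection, can be sketched in plain Python. This is an illustrative pattern only, not GateFlow's actual routing logic (which is not shown here); the `route` function, score values, and model names are all hypothetical:

```python
# Hypothetical sketch of eval-score-driven routing (not GateFlow's
# internal implementation): pick the best model that clears a floor.
def route(scores: dict[str, float], min_score: float = 90.0) -> str:
    """Return the highest-scoring model that meets the quality floor."""
    eligible = {model: s for model, s in scores.items() if s >= min_score}
    if not eligible:
        raise RuntimeError("no model meets the quality floor")
    return max(eligible, key=eligible.get)

# Illustrative aggregate eval scores per model
scores = {"gpt-4o": 96.5, "claude-sonnet": 97.1, "small-model": 88.0}
print(route(scores))  # -> claude-sonnet
```

The point of the sketch is the feedback loop: scores produced by continuous evals become the input to the next routing decision, so a model that degrades on a suite is routed around automatically.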
Latest Release - v3.0.0
Eval Platform Launch - Gateway-native evaluation with 10+ curated suites, tiered evaluators for 97% cost reduction, closed-loop routing, and EU AI Act compliance reporting. Read the changelog →
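The "tiered evaluators" above presumably reduce cost by letting a cheap deterministic check settle clear-cut cases and escalating only ambiguous outputs to an expensive judge model. A minimal sketch of that general pattern, with entirely hypothetical functions and thresholds (GateFlow's actual tiers are described in the changelog):

```python
from typing import Callable, Optional

# Illustrative markers a fast rule-based tier might flag outright.
BLOCKLIST = {"ssn", "credit card"}

def cheap_check(text: str) -> Optional[float]:
    """Score clear-cut cases for free; return None when ambiguous."""
    lowered = text.lower()
    if any(term in lowered for term in BLOCKLIST):
        return 0.0  # obvious fail, no judge call needed
    if len(lowered) < 20:
        return None  # too little signal; escalate to the judge
    return 1.0      # passes the fast tier

def evaluate(text: str, judge: Callable[[str], float]) -> float:
    """Run the cheap tier first; call the costly judge only on None."""
    score = cheap_check(text)
    return score if score is not None else judge(text)
```

Because the expensive judge runs only on the ambiguous minority of outputs, total eval spend scales with the escalation rate rather than with traffic, which is the kind of effect a large cost-reduction figure implies.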