Skip to content

Changelog

All notable changes to GateFlow.

[3.0.0] - 2025-03-15

Added

  • Sustainability & Performance Update: Complete rewrite focused on cost efficiency and resource optimization
  • 4 Intelligent Routing Modes: cost_optimized, performance_optimized, balanced, and direct routing strategies
  • Task Classification: Automatic categorization of requests into 5 task types for optimal model selection
  • Vector Search: Qdrant-powered semantic search capabilities for advanced use cases
  • Redis Metrics Dashboard: Sub-100ms real-time analytics with PII redaction
  • Enhanced Security: AES-256 encryption for all API keys and improved error handling
  • 52+ Model Versions: Specific model versions with pricing and capability data across OpenAI, Anthropic, Google, and Mistral

Changed

  • OpenAI: Added 29 models including gpt-5.2, gpt-5, o3, o4-mini, and legacy versions
  • Anthropic: Added 6 models including claude-opus-4-5, claude-sonnet-4-5, claude-haiku-4-5
  • Google: Added 7 models including gemini-3-pro, gemini-3-flash, gemini-3-deep-think
  • Mistral: Added 11 models including mistral-large-3, devstral-2, pixtral-large-latest
  • Complete focus on sustainability with 40-70% cost reduction through intelligent routing
  • 90% compute reduction through semantic caching
  • 3.2x longer hardware lifespan through zero-downtime migrations

Sustainability Improvements

  • Eco-Efficient Routing: Automatic selection of right-sized models for each task
  • Resource Optimization: Continuous monitoring and optimization of resource usage
  • Waste Reduction: Elimination of duplicate computations through advanced caching
  • Long-Term Planning: Future-proof infrastructure with seamless model upgrades

[2.4.0] - 2025-02-01

Added

  • EU AI Act Compliance Module: Automated risk classification, human oversight controls, and compliance reporting for high-risk AI systems
  • Litigation Hold API: Place documents under legal hold to prevent deletion
  • Voice Pipeline Templates: New legal-dictation template optimized for legal transcription
  • Model Alias Wildcards: Use patterns like gpt-4* in alias definitions

Changed

  • Improved semantic cache hit detection accuracy by 15%
  • Reduced embedding latency for cache lookups
  • Enhanced PII detection for medical terminology

Fixed

  • Fixed race condition in fallback chain execution
  • Corrected cost calculation for streaming responses with early termination
  • Resolved audit log timestamp precision issues

[2.3.0] - 2025-01-15

Added

  • Shadow Mode Migrations: Test new models with production traffic without affecting users
  • Cost Anomaly Detection: Automatic alerts for unusual spending patterns
  • Multi-language Reranking: Support for rerank-multilingual-v3.0
  • Batch Embeddings API: Process up to 2048 texts in a single request

Changed

  • Default cache TTL increased from 12 hours to 24 hours
  • Improved error messages for provider authentication failures
  • Dashboard now shows real-time cache hit rates

Fixed

  • Fixed memory leak in long-running streaming connections
  • Corrected rate limit header values for burst scenarios
  • Resolved edge case in canary deployment traffic splitting

[2.2.0] - 2025-01-01

Added

  • Gemini 1.5 Flash Support: Added as a fast, cost-effective option
  • Request Priority Queuing: High-priority requests skip the queue during rate limits
  • Custom Cache Scope Keys: Add custom dimensions to cache keys
  • Webhook Retry Configuration: Configure retry attempts for failed webhooks

Changed

  • Upgraded to pgvector 0.6.0 with improved HNSW performance
  • Reduced cold start latency for serverless deployments
  • Enhanced deprecation alerts with migration timeline estimates

Fixed

  • Fixed incorrect token count for Claude models with images
  • Corrected timezone handling in scheduled reports
  • Resolved SSO session expiration edge cases

[2.1.0] - 2024-12-15

Added

  • MCP Agent Governance: Full MCP server implementation with permissions and audit
  • Voice Pipelines: STT → LLM → TTS in a single API call
  • Data Residency Controls: Specify allowed regions for data storage
  • Organization Workspaces: Separate resources by environment

Changed

  • Improved fallback latency by 40% with parallel health checks
  • Enhanced dashboard with real-time metrics
  • Updated model pricing to reflect provider changes

Fixed

  • Fixed streaming chunk ordering for high-latency connections
  • Corrected RBAC permission inheritance
  • Resolved API key rotation webhook timing

[2.0.0] - 2024-12-01

Breaking Changes

  • API version path changed from /api/v1 to /v1
  • Removed deprecated gpt-4-32k alias (use model ID directly)
  • Changed cache response header from X-Cache-Hit to X-GateFlow-Cache-Hit

Added

  • Semantic Caching v2: New embedding-based similarity with configurable thresholds
  • Model Change Management: Deprecation alerts, migration wizard, automated fallbacks
  • Data Pillar: Document ingestion, PII detection, vector storage
  • Compliance Framework: HIPAA, Legal, GDPR support with audit trails

Changed

  • Complete dashboard redesign with improved UX
  • New pricing model with included cache savings
  • Enhanced API response format with gateflow extension object

Removed

  • Removed legacy /api/v1 endpoints (deprecated in 1.9)
  • Removed simple_cache mode (superseded by semantic cache)

[1.9.0] - 2024-11-01

Added

  • Claude 3.5 Sonnet Support: Added latest Anthropic model
  • Cost Budgets: Set daily/monthly spending limits
  • API Key Permissions: Restrict keys to specific models
  • Export Analytics: CSV export for cost data

Deprecated

  • /api/v1 endpoints deprecated in favor of /v1
  • simple_cache mode deprecated in favor of semantic cache

[1.8.0] - 2024-10-01

Added

  • GPT-4o Support: Added OpenAI's latest model
  • A/B Testing: Split traffic between models
  • Latency-Optimized Routing: Route to fastest available provider
  • Custom Model Aliases: Create friendly names for models

Changed

  • Improved rate limit handling with better queuing
  • Enhanced error messages with doc_url field
  • Reduced average latency by 25%

For older releases, see the full changelog archive.

Versioning

GateFlow follows Semantic Versioning:

  • Major (X.0.0): Breaking changes
  • Minor (0.X.0): New features, backward compatible
  • Patch (0.0.X): Bug fixes, backward compatible

Release Schedule

  • Major releases: Quarterly (with 6-month deprecation notice)
  • Minor releases: Monthly
  • Patch releases: As needed

Staying Updated

Built with reliability in mind.