OCR

Extract text, tables, and form fields from documents and images using AI-powered OCR.

POST /v1/ocr

Overview

The OCR API converts scanned documents and images into structured text. Powered by Mistral's document AI, it supports:

  • Text extraction - Accurate text from documents and images
  • Table extraction - Structured table data in JSON
  • Form field extraction - Key-value pairs from forms
  • Multi-page support - Process entire PDFs
  • Multilingual - 100+ languages

Request

bash
curl -X POST https://api.gateflow.ai/v1/ocr \
  -H "Authorization: Bearer gw_prod_..." \
  -F "file=@document.pdf" \
  -F "extract_tables=true" \
  -F "extract_forms=true"
python
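import requests

# Open the file in a context manager so the handle is closed after the upload.
with open("document.pdf", "rb") as f:
    response = requests.post(
        "https://api.gateflow.ai/v1/ocr",
        headers={"Authorization": "Bearer gw_prod_..."},
        files={"file": f},
        # Form values travel as strings; lowercase matches the curl example.
        data={
            "extract_tables": "true",
            "extract_forms": "true",
            "pages": "1-5",
        },
    )
response.raise_for_status()  # raise on 4xx/5xx instead of parsing an error body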
typescript
// fileBlob is a Blob or File holding the document, e.g. from an <input type="file">
const formData = new FormData();
formData.append('file', fileBlob, 'document.pdf');
formData.append('extract_tables', 'true');
formData.append('extract_forms', 'true');

const response = await fetch('https://api.gateflow.ai/v1/ocr', {
  method: 'POST',
  headers: {
    'Authorization': 'Bearer gw_prod_...',
  },
  body: formData,
});

Parameters

| Parameter | Type | Required | Default | Description |
|---|---|---|---|---|
| file | file | Yes | - | Document file (PDF, PNG, JPEG, TIFF, BMP, or WebP; see Supported Formats) |
| extract_tables | boolean | No | true | Extract tables as structured data |
| extract_forms | boolean | No | true | Extract form field key-value pairs |
| pages | string | No | all | Page range (e.g., 1-5, all) |
| language_hints | array | No | null | Language codes to improve accuracy |
| output_format | string | No | json | Output format: json, text, or markdown |
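
The optional parameters above can be assembled into multipart form fields with a small helper. This is a minimal sketch, not part of any GateFlow SDK: `build_ocr_fields` is a hypothetical name, and sending `language_hints` as one repeated form field per language code is an assumption, since the wire encoding for array parameters is not specified here.

```python
from typing import List, Optional, Tuple


def build_ocr_fields(
    extract_tables: bool = True,
    extract_forms: bool = True,
    pages: str = "all",
    language_hints: Optional[List[str]] = None,
    output_format: str = "json",
) -> List[Tuple[str, str]]:
    """Return form fields for POST /v1/ocr as (name, value) pairs."""
    if output_format not in ("json", "text", "markdown"):
        raise ValueError(f"unsupported output_format: {output_format!r}")
    fields = [
        ("extract_tables", "true" if extract_tables else "false"),
        ("extract_forms", "true" if extract_forms else "false"),
        ("pages", pages),
        ("output_format", output_format),
    ]
    # Assumption: array parameters are sent as one repeated field per value.
    for code in language_hints or []:
        fields.append(("language_hints", code))
    return fields


fields = build_ocr_fields(
    pages="1-5", language_hints=["en", "nl"], output_format="markdown"
)
print(fields)
```

With `requests`, the returned list can be passed directly as `data=fields` alongside `files={"file": f}`.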

Response

json
{
  "document_id": "550e8400-e29b-41d4-a716-446655440000",
  "total_pages": 3,
  "pages": [
    {
      "page_number": 1,
      "text": "INVOICE\n\nInvoice Number: INV-2024-001\nDate: January 15, 2024\n...",
      "tables": [
        {
          "table_id": "table_1",
          "rows": [
            ["Item", "Quantity", "Price"],
            ["Widget A", "10", "$50.00"],
            ["Widget B", "5", "$25.00"]
          ],
          "bounding_box": {"x": 50, "y": 200, "width": 400, "height": 150}
        }
      ],
      "forms": [
        {"key": "Invoice Number", "value": "INV-2024-001", "confidence": 0.98},
        {"key": "Date", "value": "January 15, 2024", "confidence": 0.95}
      ],
      "confidence": 0.96
    }
  ],
  "processing_time_ms": 1250,
  "_gateway": {
    "provider": "mistral",
    "model": "mistral-ocr-latest",
    "cost": {"usd": 0.03}
  }
}

Response Fields

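| Field | Type | Description |
|---|---|---|
| document_id | string | Unique document identifier |
| total_pages | integer | Number of pages processed |
| pages | array | Per-page extraction results |
| pages[].page_number | integer | Page number within the document |
| pages[].text | string | Extracted plain text |
| pages[].tables | array | Extracted tables; each has table_id, rows, and bounding_box |
| pages[].forms | array | Extracted form fields; each has key, value, and confidence |
| pages[].confidence | number | Overall confidence score for the page (0-1) |
| processing_time_ms | integer | Processing time in milliseconds |
| _gateway | object | GateFlow metadata (provider, model, cost) |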
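
Each entry in `pages[].tables` carries its cells as a list of rows. The sketch below turns one table into a list of dictionaries keyed by column name. It assumes the first row is the header, as in the sample response; that is not guaranteed for every document. `table_to_records` is a hypothetical helper name.

```python
from typing import Dict, List


def table_to_records(table: dict) -> List[Dict[str, str]]:
    """Convert a table's rows into dicts, treating the first row as the header."""
    rows = table.get("rows", [])
    if not rows:
        return []
    header, *body = rows
    # zip() stops at the shorter sequence, so ragged rows lose trailing cells.
    return [dict(zip(header, row)) for row in body]


# Table copied from the sample response above.
sample_table = {
    "table_id": "table_1",
    "rows": [
        ["Item", "Quantity", "Price"],
        ["Widget A", "10", "$50.00"],
        ["Widget B", "5", "$25.00"],
    ],
}

records = table_to_records(sample_table)
print(records[0])
```

All cell values stay strings, so quantities and prices still need to be parsed before any arithmetic.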

Supported Formats

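| Format | Extensions | Max Size | Notes |
|---|---|---|---|
| PDF | .pdf | 50 MB | Multi-page supported |
| PNG | .png | 20 MB | Single image |
| JPEG | .jpg, .jpeg | 20 MB | Single image |
| TIFF | .tiff, .tif | 50 MB | Multi-page supported |
| BMP | .bmp | 20 MB | Single image |
| WebP | .webp | 20 MB | Single image |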
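
Checking the extension and size before uploading avoids a round trip that would end in a 400 or 413. The following sketch encodes the limits from the table above. `check_upload` is a hypothetical helper, and treating 1 MB as 1,000,000 bytes is a conservative assumption, since the exact byte definition is not stated here.

```python
import os

MB = 1_000_000  # assumption: decimal megabytes, the stricter reading of "MB"

# Maximum upload size per extension, taken from the Supported Formats table.
MAX_BYTES = {
    ".pdf": 50 * MB,
    ".png": 20 * MB,
    ".jpg": 20 * MB,
    ".jpeg": 20 * MB,
    ".tiff": 50 * MB,
    ".tif": 50 * MB,
    ".bmp": 20 * MB,
    ".webp": 20 * MB,
}


def check_upload(filename: str, size_bytes: int) -> None:
    """Raise ValueError if the file type is unsupported or the file is too large."""
    ext = os.path.splitext(filename)[1].lower()
    limit = MAX_BYTES.get(ext)
    if limit is None:
        raise ValueError(f"unsupported file type: {ext or '(none)'}")
    if size_bytes > limit:
        raise ValueError(f"{filename} is {size_bytes} bytes; limit is {limit}")


check_upload("scan.PDF", 1_200_000)  # passes silently
```

For a file on disk, `os.path.getsize(path)` supplies the size argument. The check only looks at the extension, so the server may still reject a mislabeled or corrupted file.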

Language Support

GateFlow OCR supports 100+ languages. Use language_hints to improve accuracy:

json
{
  "file": "<binary>",
  "language_hints": ["en", "nl", "de"]
}

Common language codes:

  • en - English
  • es - Spanish
  • fr - French
  • de - German
  • nl - Dutch
  • zh - Chinese
  • ja - Japanese
  • ko - Korean
  • ar - Arabic

Use Cases

Invoice Processing

python
import requests

with open("invoice.pdf", "rb") as f:
    response = requests.post(
        "https://api.gateflow.ai/v1/ocr",
        headers={"Authorization": "Bearer gw_prod_..."},
        files={"file": f},
        data={"extract_tables": "true", "extract_forms": "true"},
    )
response.raise_for_status()

# Print the invoice fields of interest (keys are matched case-insensitively)
wanted = {"invoice number", "total", "date"}
for page in response.json()["pages"]:
    for field in page["forms"]:
        if field["key"].lower() in wanted:
            print(f"{field['key']}: {field['value']}")
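
Every form field in the response carries its own confidence score, so low-confidence values can be routed to manual review instead of being trusted blindly. This is a sketch under assumptions: `split_by_confidence` is a hypothetical helper and the 0.96 threshold is arbitrary; pick one that suits your documents.

```python
from typing import Dict, Tuple


def split_by_confidence(
    page: dict, min_confidence: float
) -> Tuple[Dict[str, str], Dict[str, str]]:
    """Split a page's form fields into (accepted, needs_review) by confidence."""
    accepted: Dict[str, str] = {}
    needs_review: Dict[str, str] = {}
    for field in page.get("forms", []):
        # A field with no confidence score is treated as low confidence.
        target = accepted if field.get("confidence", 0.0) >= min_confidence else needs_review
        target[field["key"]] = field["value"]
    return accepted, needs_review


# Form fields copied from the sample response.
sample_page = {
    "forms": [
        {"key": "Invoice Number", "value": "INV-2024-001", "confidence": 0.98},
        {"key": "Date", "value": "January 15, 2024", "confidence": 0.95},
    ]
}

accepted, needs_review = split_by_confidence(sample_page, min_confidence=0.96)
print("accepted:", accepted)
print("needs review:", needs_review)
```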

Medical Record Processing

python
import requests

# Process medical record with PHI awareness
with open("medical_record.pdf", "rb") as f:
    response = requests.post(
        "https://api.gateflow.ai/v1/ocr",
        headers={"Authorization": "Bearer gw_prod_..."},
        files={"file": f},
        data={
            "extract_forms": "true",
            "language_hints": ["en"],
        },
    )
response.raise_for_status()

# Automatically detects and flags PHI fields

Batch Processing

python
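import asyncio
import os

import aiohttp

API_URL = "https://api.gateflow.ai/v1/ocr"
HEADERS = {"Authorization": "Bearer gw_prod_..."}

async def ocr_file(session, file_path):
    # Read the file up front so the handle is closed before the request is sent.
    with open(file_path, "rb") as f:
        payload = f.read()
    data = aiohttp.FormData()
    # Send only the base name, not the full local path.
    data.add_field("file", payload, filename=os.path.basename(file_path))
    data.add_field("extract_tables", "true")
    # "async with" releases the connection once the body has been read.
    async with session.post(API_URL, headers=HEADERS, data=data) as resp:
        resp.raise_for_status()
        return await resp.json()

async def process_documents(files):
    async with aiohttp.ClientSession() as session:
        # Send all requests concurrently; results come back in input order.
        return await asyncio.gather(*(ocr_file(session, p) for p in files))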

Error Codes

| Code | Description |
|---|---|
| 400 | Invalid file format or corrupted file |
| 401 | Invalid API key |
| 413 | File too large |
| 422 | Could not process document |
| 429 | Rate limit exceeded |
| 500 | OCR service error |
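
Of the codes above, 429 is a natural candidate for a retry after a delay, while 400, 401, 413, and 422 will fail again with the same input. The sketch below retries with exponential backoff and jitter. Treating 500 as retryable is an assumption, as are the attempt count and delays; `call_with_retry` and its `send` callback are hypothetical names, and nothing here relies on a `Retry-After` header, which this page does not mention.

```python
import random
import time
from typing import Callable, Tuple

RETRYABLE = {429, 500}  # assumption: 500 is transient and safe to retry


def call_with_retry(
    send: Callable[[], Tuple[int, dict]],
    max_attempts: int = 4,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Tuple[int, dict]:
    """Call send() until it returns a non-retryable status or attempts run out."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    attempt = 1
    while True:
        status, body = send()
        if status not in RETRYABLE or attempt == max_attempts:
            return status, body
        # Exponential backoff: 1x, 2x, 4x the base delay, plus random jitter.
        delay = base_delay * 2 ** (attempt - 1)
        sleep(delay + random.uniform(0, delay / 2))
        attempt += 1


# Simulated responses stand in for real HTTP calls in this demo.
fake_responses = iter([(429, {}), (500, {}), (200, {"total_pages": 3})])
status, body = call_with_retry(lambda: next(fake_responses), sleep=lambda _s: None)
print(status, body)
```

In real use, `send` would wrap the `requests.post` call and return `(response.status_code, response.json())`. Each successful retry of an OCR request is presumably billed like any other call, so keep `max_attempts` small.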

Pricing

| Operation | Cost |
|---|---|
| Per page | $0.01 |
| Table extraction | Included |
| Form extraction | Included |
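
At a flat per-page rate, the cost of a job is the page count times $0.01, which matches the sample response (3 pages, `cost.usd` of 0.03). A small sketch of that arithmetic follows. `estimate_cost_usd` is a hypothetical helper, and it assumes every processed page is billed at the listed rate with no minimums or volume discounts.

```python
from decimal import Decimal

PRICE_PER_PAGE_USD = Decimal("0.01")  # from the pricing table above


def estimate_cost_usd(page_count: int) -> Decimal:
    """Estimate the OCR cost in USD for a given number of pages."""
    if page_count < 0:
        raise ValueError("page_count must be non-negative")
    # Decimal avoids binary floating-point rounding on currency amounts.
    return PRICE_PER_PAGE_USD * page_count


print(estimate_cost_usd(3))  # 0.03
```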
