Classification Patterns
Automatic data classification rules and patterns for compliance.
Overview
GateFlow uses pattern matching and machine learning to automatically assign each piece of data a sensitivity level, so that it is handled consistently across your organization.
Classification Levels
| Level | Code | Description | Default Access |
|---|---|---|---|
| Public | public | Publicly available information | All users |
| Internal | internal | Internal business use | Authenticated users |
| Confidential | confidential | Sensitive business data | Role-based access |
| Restricted | restricted | Highly sensitive | Named individuals |
| PHI | phi | Protected Health Information | HIPAA-compliant access |
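The levels above are listed from least to most sensitive, and the upload example on this page uses a floor (`min_classification`). The sketch below shows one way such an ordering and floor could be modelled locally. The numeric ranks, the placement of `phi` above `restricted`, and the helper names `parse_level`, `most_sensitive`, and `apply_floor` are assumptions made for illustration; they are not part of the GateFlow API.

```python
from enum import IntEnum


class Level(IntEnum):
    """Sensitivity levels, ordered from least to most sensitive (assumed ranks)."""
    PUBLIC = 0
    INTERNAL = 1
    CONFIDENTIAL = 2
    RESTRICTED = 3
    PHI = 4  # assumption: treated as the most sensitive level


def parse_level(code: str) -> Level:
    """Map a level code such as "internal" to its Level member."""
    return Level[code.upper()]


def most_sensitive(codes: list[str]) -> str:
    """Return the code of the most sensitive level in a non-empty list."""
    return max(codes, key=parse_level)


def apply_floor(detected: str, floor: str) -> str:
    """Raise a detected level to the floor if it falls below it."""
    return max(detected, floor, key=parse_level)


print(most_sensitive(["internal", "restricted", "confidential"]))  # restricted
print(apply_floor("public", "internal"))  # internal
```

Using an ordered enum keeps comparisons explicit, so a document that matches several patterns can be resolved to its most sensitive level with a single `max` call.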
Built-in Patterns
PII Patterns
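```python
import httpx
from openai import OpenAI

client = OpenAI(
    base_url="https://api.gateflow.ai/v1",
    api_key="gw_prod_..."
)

# View built-in PII patterns.
# The OpenAI SDK's generic verb helpers require `cast_to`;
# httpx.Response returns the raw response so the JSON body can be read directly.
response = client.get("/compliance/patterns/pii", cast_to=httpx.Response)
patterns = response.json()

for pattern in patterns["patterns"]:
    print(f"{pattern['name']}: {pattern['classification']}")
```

Built-in PII Patterns: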
| Pattern | Example | Default Classification |
|---|---|---|
| SSN | 123-45-6789 | restricted |
| CREDIT_CARD | 4111-1111-1111-1111 | restricted |
| EMAIL | user@example.com | internal |
| PHONE | (555) 123-4567 | internal |
| ADDRESS | 123 Main St, City, ST 12345 | confidential |
| DATE_OF_BIRTH | 01/15/1990 | confidential |
| PASSPORT | AB1234567 | restricted |
| DRIVER_LICENSE | D1234567 | restricted |
| BANK_ACCOUNT | 1234567890 | restricted |
| IP_ADDRESS | 192.168.1.1 | internal |
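
To make the table concrete, the sketch below scans a piece of text for three of these patterns with Python's `re` module and reports how often each one matched. The regular expressions are simplified stand-ins written for this example; the expressions GateFlow uses for its built-in patterns are not shown on this page and are likely stricter. The `scan` helper is likewise hypothetical.

```python
import re

# Simplified, illustrative expressions (assumptions, not GateFlow's own).
EXAMPLE_PATTERNS = {
    "SSN": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "EMAIL": re.compile(r"\b[\w.+-]+@[\w-]+(?:\.[\w-]+)+\b"),
    "IP_ADDRESS": re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b"),
}

# Default classifications copied from the table above.
DEFAULT_CLASSIFICATION = {
    "SSN": "restricted",
    "EMAIL": "internal",
    "IP_ADDRESS": "internal",
}


def scan(text: str) -> dict[str, int]:
    """Count matches per pattern name, omitting patterns with no matches."""
    counts = {}
    for name, regex in EXAMPLE_PATTERNS.items():
        hits = regex.findall(text)
        if hits:
            counts[name] = len(hits)
    return counts


sample = "Contact user@example.com from 192.168.1.1 about SSN 123-45-6789."
for name, count in scan(sample).items():
    print(f"{name}: {count} match(es) -> {DEFAULT_CLASSIFICATION[name]}")
```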
PHI Patterns (HIPAA)
| Pattern | Example | Classification |
|---|---|---|
| MEDICAL_RECORD_NUMBER | MRN-12345678 | phi |
| HEALTH_PLAN_ID | HP-987654321 | phi |
| DIAGNOSIS_CODE | ICD-10: J06.9 | phi |
| MEDICATION | Lisinopril 10mg | phi |
| LAB_RESULT | A1C: 6.5% | phi |
| PROVIDER_NPI | 1234567890 | confidential |
Financial Patterns
| Pattern | Example | Classification |
|---|---|---|
| ACCOUNT_NUMBER | 1234567890 | restricted |
| ROUTING_NUMBER | 021000021 | confidential |
| TAX_ID | 12-3456789 | restricted |
| SALARY | $125,000 | restricted |
| REVENUE | $1.2M quarterly | confidential |
Custom Patterns
Create a Custom Pattern
```python
import httpx

# Define a custom classification pattern
response = client.post(
    "/compliance/patterns",
    cast_to=httpx.Response,
    body={
        "name": "employee_id",
        "description": "Company employee ID format",
        "pattern": {
            "type": "regex",
            "value": r"EMP-[A-Z]{2}\d{6}",
            "case_sensitive": False
        },
        "classification": "internal",
        "actions": {
            "on_detect": "tag",
            "redact_in_logs": True
        }
    }
)

print(f"Pattern ID: {response.json()['pattern_id']}")
```

Pattern Types
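
Before registering a pattern of any type, it can be useful to check the expression locally. The sketch below does this for the `employee_id` expression from the custom pattern example, using Python's `re` module. It assumes that `"case_sensitive": False` behaves like `re.IGNORECASE` and that GateFlow's regex dialect agrees with Python's for this expression; neither is confirmed on this page. The sample strings are made up for the test.

```python
import re

# Expression from the employee_id example; IGNORECASE mirrors
# "case_sensitive": False (assumed equivalence).
EMPLOYEE_ID = re.compile(r"EMP-[A-Z]{2}\d{6}", re.IGNORECASE)

should_match = ["EMP-AB123456", "emp-xy000001"]
should_not_match = ["EMP-A123456", "EMP-AB12345", "EMPAB123456"]

for text in should_match:
    assert EMPLOYEE_ID.fullmatch(text), f"expected a match: {text}"

for text in should_not_match:
    assert not EMPLOYEE_ID.fullmatch(text), f"unexpected match: {text}"

print("employee_id pattern behaves as expected on the samples")
```

Note that the expression has no word boundaries, so a substring search also finds it inside longer tokens such as `EMP-AB1234567`. Whether that matters depends on how GateFlow applies the pattern to document text.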
Regex Patterns
```python
import httpx

# Complex regex pattern
client.post(
    "/compliance/patterns",
    cast_to=httpx.Response,
    body={
        "name": "project_code",
        "pattern": {
            "type": "regex",
            "value": r"PRJ-\d{4}-[A-Z]{3}",
            "flags": ["IGNORECASE"]
        },
        "classification": "confidential"
    }
)
```

Keyword Patterns
```python
import httpx

# Keyword-based classification
client.post(
    "/compliance/patterns",
    cast_to=httpx.Response,
    body={
        "name": "confidential_keywords",
        "pattern": {
            "type": "keyword_list",
            "values": [
                "confidential",
                "proprietary",
                "trade secret",
                "internal only",
                "do not distribute"
            ],
            "match_type": "any",
            "case_sensitive": False
        },
        "classification": "confidential"
    }
)
```

ML-Based Patterns
```python
import httpx

# Machine learning classification
client.post(
    "/compliance/patterns",
    cast_to=httpx.Response,
    body={
        "name": "legal_documents",
        "pattern": {
            "type": "ml_classifier",
            "model": "document-classifier-v2",
            "labels": ["contract", "nda", "agreement", "legal"],
            "threshold": 0.85
        },
        "classification": "confidential"
    }
)
```

Classification Rules
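
As a mental model for this section, the sketch below evaluates rules locally: rules are sorted by priority (lower number first), each rule's conditions are combined with its `any` or `all` operator, and the first matching rule decides the classification. First-match-wins is an assumption; this page states the evaluation order but not how conflicts between several matching rules are resolved. The `rule_matches` and `evaluate` functions and the simplified rule shape (pattern-only conditions) are hypothetical and exist only for this illustration.

```python
from typing import Optional

# Simplified rules: each condition is just a pattern name.
RULES = [
    {"name": "default_internal", "priority": 100,
     "patterns": ["EMAIL", "PHONE"], "operator": "any",
     "classification": "internal"},
    {"name": "phi_override", "priority": 1,
     "patterns": ["MEDICAL_RECORD_NUMBER", "DIAGNOSIS_CODE"], "operator": "any",
     "classification": "phi"},
    {"name": "pii_sensitive", "priority": 10,
     "patterns": ["SSN", "CREDIT_CARD", "BANK_ACCOUNT"], "operator": "any",
     "classification": "restricted"},
]


def rule_matches(rule: dict, detected: set[str]) -> bool:
    """Combine a rule's conditions with its operator ("any" or "all")."""
    combine = all if rule["operator"] == "all" else any
    return combine(p in detected for p in rule["patterns"])


def evaluate(rules: list[dict], detected: set[str]) -> Optional[str]:
    """Return the classification of the first matching rule, or None.

    Assumption: the highest-priority (lowest number) matching rule wins.
    """
    for rule in sorted(rules, key=lambda r: r["priority"]):
        if rule_matches(rule, detected):
            return rule["classification"]
    return None


print(evaluate(RULES, {"EMAIL", "SSN"}))             # restricted
print(evaluate(RULES, {"EMAIL", "DIAGNOSIS_CODE"}))  # phi
print(evaluate(RULES, {"ADDRESS"}))                  # None
```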
Rule Priority
Rules are evaluated in priority order (lower number = higher priority):
```python
import httpx

# Create classification rules with priority
rules = [
    {
        "name": "phi_override",
        "priority": 1,
        "conditions": [
            {"pattern": "MEDICAL_RECORD_NUMBER"},
            {"pattern": "DIAGNOSIS_CODE"}
        ],
        "operator": "any",
        "classification": "phi"
    },
    {
        "name": "pii_sensitive",
        "priority": 10,
        "conditions": [
            {"pattern": "SSN"},
            {"pattern": "CREDIT_CARD"},
            {"pattern": "BANK_ACCOUNT"}
        ],
        "operator": "any",
        "classification": "restricted"
    },
    {
        "name": "default_internal",
        "priority": 100,
        "conditions": [
            {"pattern": "EMAIL"},
            {"pattern": "PHONE"}
        ],
        "operator": "any",
        "classification": "internal"
    }
]

# Register each rule; evaluation order is set by "priority", not creation order
for rule in rules:
    client.post(
        "/compliance/classification-rules",
        cast_to=httpx.Response,
        body=rule
    )
```

Conditional Rules
```python
import httpx

# Classification based on multiple conditions
client.post(
    "/compliance/classification-rules",
    cast_to=httpx.Response,
    body={
        "name": "financial_report_rule",
        "priority": 5,
        "conditions": [
            {"pattern": "REVENUE"},
            {"metadata": {"document_type": "financial_report"}},
            {"content_contains": "quarterly results"}
        ],
        "operator": "all",  # All conditions must match
        "classification": "restricted"
    }
)
```

Context-Aware Rules
```python
import httpx

# Different classification based on context
client.post(
    "/compliance/classification-rules",
    cast_to=httpx.Response,
    body={
        "name": "email_context",
        "priority": 15,
        "conditions": [
            {"pattern": "EMAIL"}
        ],
        "context_rules": [
            {
                "context": {"document_type": "employee_directory"},
                "classification": "internal"
            },
            {
                "context": {"document_type": "customer_list"},
                "classification": "confidential"
            }
        ],
        "default_classification": "internal"
    }
)
```

Automatic Classification
On Document Upload
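```python
import httpx

# Upload with automatic classification.
# With the SDK's generic helper, form fields go in `body`, the file goes in
# `files`, and the Content-Type header selects multipart encoding.
with open("report.pdf", "rb") as f:
    response = client.post(
        "/data/documents",
        cast_to=httpx.Response,
        body={
            "auto_classify": True,
            "classification_rules": "default",  # Use default ruleset
            "min_classification": "internal"  # Floor classification
        },
        files={"file": f},
        options={"headers": {"Content-Type": "multipart/form-data"}}
    )

result = response.json()
print(f"Detected classification: {result['classification']}")
print(f"Patterns matched: {result['patterns_matched']}")
```

Classification Report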
```python
import httpx

# Get classification details for a document
document_id = "..."  # ID of a previously uploaded document
report = client.get(
    f"/data/documents/{document_id}/classification-report",
    cast_to=httpx.Response
).json()

print(f"Final Classification: {report['classification']}")

print("\nPatterns Detected:")
for pattern in report["patterns_detected"]:
    print(f"  - {pattern['name']}: {pattern['count']} occurrences")
    print(f"    Locations: {pattern['locations'][:3]}...")

print("\nRules Applied:")
for rule in report["rules_applied"]:
    print(f"  - {rule['name']} (priority {rule['priority']})")
```

Override and Escalation
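
As a mental model for manual overrides, the sketch below resolves the classification that is in effect for a document: an active override replaces the automatic result, and an override whose `expires_at` is `None` never lapses. The function name `effective_classification` and the behaviour after expiry (falling back to the automatic classification) are assumptions for illustration; this page only states that `None` marks a permanent override.

```python
from datetime import datetime, timezone
from typing import Optional


def effective_classification(
    automatic: str,
    override: Optional[dict],
    now: Optional[datetime] = None,
) -> str:
    """Return the override's level while it is active, else the automatic one.

    `override` mirrors the fields of a manual override:
    {"classification": ..., "expires_at": datetime or None}.
    """
    if override is None:
        return automatic
    expires_at = override.get("expires_at")
    now = now or datetime.now(timezone.utc)
    if expires_at is None or now < expires_at:  # None means permanent
        return override["classification"]
    return automatic  # assumption: an expired override no longer applies


permanent = {"classification": "restricted", "expires_at": None}
expired = {
    "classification": "restricted",
    "expires_at": datetime(2020, 1, 1, tzinfo=timezone.utc),
}

print(effective_classification("internal", permanent))  # restricted
print(effective_classification("internal", expired))    # internal
```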
Manual Override
```python
import httpx

# Override automatic classification
document_id = "..."  # ID of the document to reclassify
client.patch(
    f"/data/documents/{document_id}/classification",
    cast_to=httpx.Response,
    body={
        "classification": "restricted",
        "reason": "Contains merger details not detected by patterns",
        "override_by": "compliance_officer@company.com",
        "expires_at": None  # Permanent override
    }
)
```

Escalation Rules
```python
import httpx

# Configure automatic escalation
client.post(
    "/compliance/escalation-rules",
    cast_to=httpx.Response,
    body={
        "name": "high_volume_pii",
        "trigger": {
            "pattern_count": {"SSN": {"$gte": 10}},
            "timeframe_minutes": 60
        },
        "action": {
            "escalate_to": ["security-team@company.com"],
            "auto_quarantine": True,
            "classification_override": "restricted"
        }
    }
)
```

Monitoring and Alerts
Classification Dashboard
```python
import httpx

# Get classification statistics
stats = client.get(
    "/compliance/classification-stats",
    cast_to=httpx.Response,
    options={
        "params": {
            "period": "30d",
            "group_by": "classification"
        }
    }
).json()

for level, data in stats["by_classification"].items():
    print(f"{level}:")
    print(f"  Documents: {data['document_count']}")
    print(f"  Size: {data['total_size_mb']} MB")
    print(f"  Growth: {data['growth_percent']}%")
```

Alert Configuration
```python
import httpx

# Set up classification alerts
client.post(
    "/compliance/alerts",
    cast_to=httpx.Response,
    body={
        "name": "phi_detection_alert",
        "condition": {
            "classification": "phi",
            "source": {"$ne": "healthcare_system"}  # Unexpected PHI
        },
        "notify": ["compliance@company.com"],
        "severity": "high"
    }
)
```

Integration with Workflows
Pre-Processing Hook
```python
import httpx

# Configure classification as pre-processing step
client.post(
    "/workflows/hooks",
    cast_to=httpx.Response,
    body={
        "event": "document.uploaded",
        "action": {
            "type": "classify",
            "ruleset": "enterprise",
            "on_restricted": {
                "require_approval": True,
                "notify": ["data-governance@company.com"]
            }
        }
    }
)
```

Best Practices
- Start with built-in patterns - Use proven patterns as a foundation
- Layer custom patterns - Add organization-specific patterns on top
- Set appropriate priorities - Higher sensitivity = lower priority number
- Test before production - Validate patterns against sample data
- Monitor false positives - Tune patterns to reduce noise
- Regular audits - Review classification effectiveness quarterly
- Document overrides - Always require reasons for manual changes
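
The testing and false-positive advice above can be made measurable. The following sketch runs a candidate regex over a small hand-labelled sample set and reports precision and recall. The samples, the `evaluate_pattern` helper, and the use of Python's `re` module as a stand-in for GateFlow's matcher are all assumptions for illustration.

```python
import re


def evaluate_pattern(regex: re.Pattern, samples: list[tuple[str, bool]]) -> dict:
    """Compare regex hits against labels; each sample is (text, should_match)."""
    tp = fp = fn = 0
    for text, expected in samples:
        found = regex.search(text) is not None
        if found and expected:
            tp += 1
        elif found and not expected:
            fp += 1
        elif not found and expected:
            fn += 1
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return {"precision": precision, "recall": recall,
            "false_positives": fp, "false_negatives": fn}


# Project-code expression from the regex example on this page.
project_code = re.compile(r"PRJ-\d{4}-[A-Z]{3}", re.IGNORECASE)

labelled = [
    ("Kickoff for PRJ-2024-ABC is Monday", True),
    ("see prj-0001-xyz for details", True),
    ("Invoice PRJ-2024-ABCD-77 (not a project code)", False),  # false positive
    ("Ticket PRJ 2024 ABC", True),                            # false negative
    ("No codes here", False),
]

print(evaluate_pattern(project_code, labelled))
```

On these samples the expression produces one false positive and one false negative, so precision and recall both come out at 2/3. Tightening the expression (for example with word boundaries) and re-running the same sample set shows whether a change actually reduces noise.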
API Reference
| Endpoint | Method | Description |
|---|---|---|
| /compliance/patterns | GET | List all patterns |
| /compliance/patterns | POST | Create pattern |
| /compliance/patterns/{id} | PATCH | Update pattern |
| /compliance/classification-rules | GET | List rules |
| /compliance/classification-rules | POST | Create rule |
| /data/documents/{id}/classification-report | GET | Get classification details |
Next Steps
- Data Classification - Classification levels
- PII Detection - PII handling
- Audit Trail - Classification logging