RAG Injection
Automatically inject relevant context into LLM requests.
How It Works
On each request, GateFlow retrieves the most relevant chunks from a configured document collection and injects them into the request before forwarding it to the model. You control which collection is searched, how many chunks are injected, and where in the request they are placed.
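The flow can be sketched as follows. This is a hypothetical illustration of what the gateway does server-side, not GateFlow's actual implementation; `search_collection` stands in for the vector-search step:

```python
def search_collection(collection: str, query: str, top_k: int) -> list[dict]:
    # Hypothetical stand-in for the gateway's retrieval step
    return [{"text": "Refunds available within 30 days"}][:top_k]

def inject_context(messages: list[dict], collection: str, top_k: int = 5) -> list[dict]:
    """Sketch of prepend-style RAG injection performed by the gateway."""
    query = messages[-1]["content"]  # use the latest user message as the search query
    chunks = search_collection(collection, query, top_k)
    context = "\n".join(f"- {c['text']}" for c in chunks)
    context_msg = {"role": "system", "content": f"Context from documents:\n{context}"}
    return [context_msg] + messages  # inject before the user message
```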
Enable RAG
```python
from openai import OpenAI

# Point the OpenAI client at your GateFlow gateway endpoint
client = OpenAI(base_url="https://your-gateway.example.com/v1", api_key="...")

response = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "What's our refund policy?"}],
    extra_body={
        "gateflow": {
            "rag": {
                "enabled": True,
                "collection": "policies",
                "top_k": 5
            }
        }
    }
)
```

Injection Modes
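The modes below differ only in where the retrieved context lands in the outgoing request. A minimal sketch of the transformation (the gateway applies this server-side; message shapes follow the OpenAI chat format, and a real "system" mode would merge into an existing system message rather than always adding one):

```python
def apply_injection(messages: list[dict], context: str, mode: str = "prepend",
                    system_template: str = "Use this context to answer: {context}") -> list[dict]:
    """Sketch of the three injection modes."""
    if mode == "prepend":
        # Default: context as a system message before the user message
        return [{"role": "system", "content": f"Context from documents:\n{context}"}] + messages
    if mode == "system":
        # Context rendered into the system prompt via a template
        return [{"role": "system", "content": system_template.format(context=context)}] + messages
    if mode == "append":
        # Context added after the user message
        return messages + [{"role": "system", "content": f"Context from documents:\n{context}"}]
    raise ValueError(f"unknown injection_mode: {mode}")
```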
Prepend (Default)
Context added before user message:
```
[System] Context from documents:
- Refunds available within 30 days...
- Full refund for defective items...
[User] What's our refund policy?
```

System Prompt
Context added to system prompt:
```json
{
  "rag": {
    "injection_mode": "system",
    "system_template": "Use this context to answer: {context}"
  }
}
```

Append
Context after user message:
```json
{
  "rag": {
    "injection_mode": "append"
  }
}
```

Configuration
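The per-request `rag` block overrides gateway-level settings; a minimal sketch of that merge, where the default values shown here are illustrative assumptions (only the field names come from this page, and `collection` has no default because it must be supplied):

```python
# Illustrative defaults; the gateway's actual defaults may differ
DEFAULT_RAG_CONFIG = {
    "enabled": False,
    "top_k": 5,
    "min_score": 0.7,
    "rerank": False,
    "include_metadata": False,
    "max_context_tokens": 4000,
}

def resolve_rag_config(request_overrides: dict) -> dict:
    """Merge per-request 'rag' options over the gateway defaults."""
    return {**DEFAULT_RAG_CONFIG, **request_overrides}
```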
```json
{
  "rag": {
    "enabled": true,
    "collection": "policies",
    "top_k": 5,
    "min_score": 0.7,
    "rerank": true,
    "include_metadata": true,
    "max_context_tokens": 4000
  }
}
```

Response Metadata
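Each response carries a `gateflow.rag` block like the JSON below. A sketch of pulling the cited sources out of a parsed response body (a plain dict here; with the OpenAI Python client you may need to read the raw response to see these extra fields):

```python
def rag_sources(response_body: dict) -> list[dict]:
    """Return the injected source references, or [] if RAG was not used."""
    rag = response_body.get("gateflow", {}).get("rag", {})
    if not rag.get("used"):
        return []
    return rag.get("sources", [])

body = {
    "gateflow": {
        "rag": {
            "used": True,
            "chunks_injected": 3,
            "sources": [{"document_id": "doc_123", "chunk_id": "chunk_456"}],
        }
    }
}
print(rag_sources(body))  # -> [{'document_id': 'doc_123', 'chunk_id': 'chunk_456'}]
```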
```json
{
  "gateflow": {
    "rag": {
      "used": true,
      "chunks_injected": 3,
      "sources": [
        {"document_id": "doc_123", "chunk_id": "chunk_456"}
      ]
    }
  }
}
```

Next Steps
- Semantic Search - Search configuration
- Data Classification - Access controls