Implementation Patterns

6 Production Patterns

Battle-tested patterns for deploying AI operations at scale, from gateway routing to multi-agent orchestration.

Unified LLM Gateway

Route all LLM calls through a single gateway for cost control, rate limiting, and provider failover.

LiteLLM, Portkey, Martian, OCI GenAI

# LiteLLM proxy config
model_list:
  - model_name: gpt-4
    litellm_params:
      model: openai/gpt-4
  - model_name: command-a
    litellm_params:
      model: cohere/command-a
      api_base: https://genai.oci.example.com
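
The failover behavior a gateway provides can be sketched in plain Python. This is a minimal illustration, not LiteLLM's implementation: `call_with_failover`, `AllProvidersFailed`, and the two stand-in provider functions are hypothetical names introduced here, and a real gateway would add retries, timeouts, and rate limits.

```python
from typing import Callable, Sequence


class AllProvidersFailed(RuntimeError):
    """Raised when every provider in the chain has failed."""


def call_with_failover(
    prompt: str,
    providers: Sequence[tuple[str, Callable[[str], str]]],
) -> tuple[str, str]:
    """Try each (name, call) pair in order; return (name, reply) from the first success."""
    errors: list[str] = []
    for name, call in providers:
        try:
            return name, call(prompt)
        except Exception as exc:  # a real gateway would catch narrower error types
            errors.append(f"{name}: {exc}")
    raise AllProvidersFailed("; ".join(errors))


# Hypothetical stand-ins for real provider clients.
def flaky_primary(prompt: str) -> str:
    raise TimeoutError("primary timed out")


def healthy_fallback(prompt: str) -> str:
    return f"echo: {prompt}"


used, reply = call_with_failover(
    "ping", [("gpt-4", flaky_primary), ("command-a", healthy_fallback)]
)
print(used, reply)  # command-a echo: ping
```

Keeping the provider order in one place means callers never need to know which backend answered, which is the main point of routing through a gateway.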

Cascade Model Routing

Route each query to the cheapest model that can handle it. Simple tasks skip expensive models entirely.

RouteLLM, Custom logic, KEDA scaling

# Cascade: simple → cheap, complex → premium
def route(query):
    # classify() is a placeholder: any scorer that returns complexity in [0, 1]
    complexity = classify(query)
    if complexity < 0.3:
        return "gemini-flash-lite"  # $0.075/1M tokens
    elif complexity < 0.7:
        return "command-a"          # ~$3/1M tokens
    else:
        return "grok-4.1"           # premium
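
The router above depends on a complexity score. One possible stand-in, shown here only as a toy heuristic, scores a query by its length and by the presence of reasoning verbs. The `REASONING_HINTS` set and the weights are assumptions made for illustration; a production classifier would more likely be a small trained model.

```python
import re

# Hypothetical keyword list; tune or replace with a trained classifier.
REASONING_HINTS = {"why", "explain", "compare", "prove", "design", "debug", "analyze"}


def classify(query: str) -> float:
    """Toy complexity score in [0, 1]: longer queries and reasoning verbs score higher."""
    words = re.findall(r"[a-z']+", query.lower())
    length_score = min(len(words) / 40, 1.0)
    hint_score = min(sum(w in REASONING_HINTS for w in words) / 2, 1.0)
    return 0.5 * length_score + 0.5 * hint_score


def route(query: str) -> str:
    """Map the complexity score onto the three tiers used in the cascade."""
    complexity = classify(query)
    if complexity < 0.3:
        return "gemini-flash-lite"
    elif complexity < 0.7:
        return "command-a"
    return "grok-4.1"


print(route("What time is it?"))              # gemini-flash-lite
print(route("Explain why the cache misses"))  # command-a
```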

Memory-Augmented Agents

Persistent memory across sessions using vector storage and knowledge graphs.

Mem0, Graphiti, Oracle AI DB 26ai

# Mem0 integration
from mem0 import Memory
m = Memory()
m.add("User prefers RAG over fine-tuning",
      user_id="frank", metadata={"topic": "ai-ops"})
# Figures reported by Mem0: +26% accuracy, -90% token usage
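
To make the idea of cross-session memory concrete without any external service, the sketch below persists per-user facts to a JSON file and retrieves them by token overlap. It is not how Mem0 works internally: `TinyMemory` is a hypothetical class, and Jaccard similarity stands in for the vector search a real system would use.

```python
import json
import tempfile
from pathlib import Path


class TinyMemory:
    """Toy per-user memory persisted as JSON; token overlap stands in for vector search."""

    def __init__(self, path: Path) -> None:
        self.path = path
        self.items = json.loads(path.read_text()) if path.exists() else []

    def add(self, text: str, user_id: str) -> None:
        self.items.append({"user_id": user_id, "text": text})
        self.path.write_text(json.dumps(self.items))

    def search(self, query: str, user_id: str, limit: int = 3) -> list[str]:
        q = set(query.lower().split())
        scored = []
        for item in self.items:
            if item["user_id"] != user_id:
                continue  # memories are scoped to one user
            tokens = set(item["text"].lower().split())
            union = q | tokens
            overlap = len(q & tokens) / len(union) if union else 0.0  # Jaccard
            if overlap > 0:
                scored.append((overlap, item["text"]))
        scored.sort(key=lambda pair: pair[0], reverse=True)
        return [text for _, text in scored[:limit]]


store_path = Path(tempfile.mkdtemp()) / "memory.json"
session_one = TinyMemory(store_path)
session_one.add("User prefers RAG over fine-tuning", user_id="frank")
session_one.add("User deploys on OCI", user_id="frank")

# A new object simulates a later session reading the same file.
session_two = TinyMemory(store_path)
hits = session_two.search("should I use RAG", user_id="frank")
print(hits)  # ['User prefers RAG over fine-tuning']
```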

Observability Pipeline

End-to-end tracing from user query through retrieval, generation, and response.

Langfuse, LangSmith, Arize, OCI Monitoring

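# Langfuse trace (Python SDK v2-style API)
from langfuse import Langfuse
langfuse = Langfuse()  # reads LANGFUSE_PUBLIC_KEY / LANGFUSE_SECRET_KEY from the environment
trace = langfuse.trace(name="rag-query")
span = trace.span(name="retrieval")
# ... vector search ...
span.end()
# context and response hold the retrieved text and the model's reply
gen = trace.generation(
    name="llm", model="command-a",
    input=context, output=response)
langfuse.flush()  # send buffered events before the process exits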
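
The trace-and-span structure itself needs very little machinery. The following standard-library sketch records a named span with its duration for each pipeline stage; `Trace` and `Span` are hypothetical classes written for this example, and the retrieval and generation steps are placeholders.

```python
import time
from contextlib import contextmanager
from dataclasses import dataclass, field


@dataclass
class Span:
    name: str
    duration_ms: float = 0.0


@dataclass
class Trace:
    name: str
    spans: list[Span] = field(default_factory=list)

    @contextmanager
    def span(self, name: str):
        """Time the enclosed block and record it, even if the block raises."""
        record = Span(name)
        start = time.perf_counter()
        try:
            yield record
        finally:
            record.duration_ms = (time.perf_counter() - start) * 1000
            self.spans.append(record)


trace = Trace("rag-query")
with trace.span("retrieval"):
    docs = ["doc-1", "doc-2"]  # placeholder for a vector search
with trace.span("generation"):
    answer = f"answer based on {len(docs)} docs"  # placeholder for an LLM call

for s in trace.spans:
    print(f"{trace.name}/{s.name}: {s.duration_ms:.3f} ms")
```

Recording the span in a `finally` clause means failed stages still show up in the trace, which is usually when the timing data matters most.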

Multi-Agent Orchestration

Coordinate specialized agents with structured communication and shared context.

LangGraph, CrewAI, Oracle ADK, Agent Spec

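# LangGraph state machine
from langgraph.graph import StateGraph, START, END
# AgentState (the state schema) and the three agent functions are defined elsewhere
graph = StateGraph(AgentState)
graph.add_node("researcher", researcher_agent)
graph.add_node("writer", writer_agent)
graph.add_node("reviewer", reviewer_agent)
graph.add_edge(START, "researcher")  # entry point, required to compile
graph.add_edge("researcher", "writer")
graph.add_edge("writer", "reviewer")
graph.add_edge("reviewer", END)
app = graph.compile()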
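
The shared-context idea behind the researcher → writer → reviewer flow can be approximated in plain Python. This sketch is not LangGraph: `run_pipeline` is a hypothetical helper, the state fields are assumptions, and each agent is a stub that returns a partial update to merge into the shared state.

```python
from typing import Callable, TypedDict


class AgentState(TypedDict, total=False):
    topic: str
    notes: str
    draft: str
    approved: bool


Agent = Callable[[AgentState], AgentState]


def researcher_agent(state: AgentState) -> AgentState:
    return {"notes": f"key facts about {state['topic']}"}


def writer_agent(state: AgentState) -> AgentState:
    return {"draft": f"Article using {state['notes']}"}


def reviewer_agent(state: AgentState) -> AgentState:
    return {"approved": len(state["draft"]) > 0}


def run_pipeline(state: AgentState, agents: list[Agent]) -> AgentState:
    """Run agents in order, merging each partial update into the shared state."""
    for agent in agents:
        state = {**state, **agent(state)}
    return state


final = run_pipeline(
    {"topic": "vector search"},
    [researcher_agent, writer_agent, reviewer_agent],
)
print(final["draft"])     # Article using key facts about vector search
print(final["approved"])  # True
```

Having each agent return only the keys it changed keeps the agents decoupled: none of them needs to know what the others wrote, only which fields it reads.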

Production RAG Pipeline

Enterprise RAG with hybrid search, reranking, and quality evaluation.

Cohere Embed 4, Cohere Rerank 3.5, Oracle AI Database 26ai

-- Oracle AI Database 26ai Hybrid Search
-- SCORE(1) ranges from 0 to 100, so scale it to match the 0-1 vector similarity
SELECT id, title,
  (0.7 * (1 - VECTOR_DISTANCE(emb, :qvec, COSINE))
   + 0.3 * SCORE(1) / 100) AS hybrid_score
FROM documents
WHERE CONTAINS(content, :kw, 1) > 0
ORDER BY hybrid_score DESC
FETCH FIRST 10 ROWS ONLY;
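
The weighted blend in the query can be checked by hand with a small Python sketch. It assumes the keyword score has already been scaled to [0, 1] (Oracle Text `SCORE` values range from 0 to 100); `hybrid_score`, the sample vectors, and the sample keyword scores are all made up for illustration.

```python
import math


def cosine_distance(a: list[float], b: list[float]) -> float:
    """1 minus cosine similarity, matching the COSINE metric in the SQL."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / norm


def hybrid_score(
    doc_vec: list[float],
    query_vec: list[float],
    keyword_score: float,  # assumed to be pre-scaled to [0, 1]
    w_vec: float = 0.7,
    w_kw: float = 0.3,
) -> float:
    """Weighted blend of vector similarity and keyword relevance."""
    similarity = 1.0 - cosine_distance(doc_vec, query_vec)
    return w_vec * similarity + w_kw * keyword_score


# Made-up documents: (embedding, scaled keyword score).
docs = {
    "a": ([1.0, 0.0], 0.2),
    "b": ([0.0, 1.0], 1.0),
    "c": ([1.0, 1.0], 0.5),
}
query = [1.0, 0.0]

ranked = sorted(
    docs,
    key=lambda d: hybrid_score(docs[d][0], query, docs[d][1]),
    reverse=True,
)
print(ranked)  # ['a', 'c', 'b']
```

Document "b" has the best keyword score but an orthogonal embedding, so the 0.7 vector weight pushes it to the bottom; changing the weights is how the balance between semantic and lexical matching is tuned.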