
AI Agent Governance for Financial Services

SEC, OCC, and FINRA-compliant AI agent orchestration. Model risk management, audit trails, and human-in-the-loop approvals for finserv agents.

The SEC's examination priorities explicitly call out AI governance. The OCC's guidance on AI in banking describes expectations for control frameworks, audit trails, and human oversight. An ungoverned AI system in financial services is a risk engine.

Firms are deploying agents for trading recommendations, customer service, portfolio rebalancing, and regulatory reporting. These agents call external systems, make decisions that affect customer money, and create audit obligations that traditional ML governance doesn't cover.

Why agents change the compliance surface

Agents act, not just predict

A traditional ML model predicts something: market direction, default probability, customer churn. The prediction feeds a human decision. An investment agent is different. It analyzes market data, evaluates positions, calls a trading tool, and records the decision.

The SEC wants to know: What data did it access? What decision rules did it follow? Was there human review? Can you reproduce the decision? Who is responsible?

Model risk management applies to agents too

SR 11-7 (Federal Reserve guidance on model risk management) requires a model inventory, independent validation before deployment, ongoing monitoring, and escalation procedures. A single agent might call five different models. You need to:

  • Pin model versions and prevent silent upgrades
  • Route decisions to appropriate models by risk level
  • Monitor performance and alert on degradation
  • Enforce model-specific constraints (token budgets, confidence thresholds)
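The version-pinning requirement could be expressed in an AgentPolicy like the one shown later on this page. Note the version suffix (`@v1.2.3`) is an illustrative assumption, not confirmed Orloj syntax:

YAML
apiVersion: orloj.dev/v1
kind: AgentPolicy
metadata:
  name: model-risk-controls
spec:
  apply_mode: scoped
  target_systems:
    - portfolio-rebalancing-system
  # Hypothetical version pinning: an upgrade requires a reviewed manifest
  # change, never a silent vendor rollout
  allowed_models:
    - gpt-4-finance@v1.2.3
  max_tokens_per_run: 50000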

Audit trails are compliance obligations

FINRA requires audit trails sufficient to reconstruct activity. The OCC expects documented decision chains. The SEC will ask for the logs.

With Orloj, every agent action creates a structured, queryable audit record: what the agent intended, what tools it called, what data it accessed, whether humans approved it, and what the outcome was.
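A single audit record covering those dimensions might be shaped roughly like this. The field names and values below are illustrative, not the actual Orloj schema:

YAML
# Illustrative audit record shape; actual field names may differ
timestamp: "2025-03-14T10:23:47Z"
agent: portfolioOptimizer
intent: "Rebalance portfolio toward target allocation"
model: gpt-4-finance
model_version: v1.2.3
tool_calls:
  - tool: market-data-api
    operation_class: read
data_accessed:
  - portfolio/positions        # illustrative path
approvals:
  - approver: portfolioManager
    verdict: approved
outcome: executed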

Declarative governance in practice

Governance in Orloj uses multiple resource kinds that work together. Define them as version-controlled YAML manifests; diff in PRs, roll back safely.

Constrain models and block dangerous tools with an AgentPolicy:

YAML
apiVersion: orloj.dev/v1
kind: AgentPolicy
metadata:
  name: portfolio-governance
spec:
  apply_mode: scoped
  target_systems:
    - portfolio-rebalancing-system
  allowed_models:
    - gpt-4-finance
    - gpt-3.5-turbo
  blocked_tools:
    - account-close
    - wire-transfer
  max_tokens_per_run: 50000

Grant scoped permissions with an AgentRole:

YAML
apiVersion: orloj.dev/v1
kind: AgentRole
metadata:
  name: trading-role
spec:
  description: Can invoke trading and market data tools.
  permissions:
    - tool:market-data-api:invoke
    - tool:trading-execution:invoke
    - capability:market.read

Require human approval before executing trades with a ToolPermission:

YAML
apiVersion: orloj.dev/v1
kind: ToolPermission
metadata:
  name: trading-execution-permission
spec:
  tool_ref: trading-execution
  match_mode: all
  required_permissions:
    - tool:trading-execution:invoke
  operation_rules:
    - operation_class: write
      verdict: approval_required

When a trade triggers approval_required, the task pauses and a ToolApproval resource is created. A trading desk head or risk officer reviews and approves or denies via the API or web console. If the approval TTL expires, the task fails with approval_timeout.
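While the task is paused, the generated ToolApproval resource might look roughly like this. The spec and status fields shown are assumptions inferred from the workflow described above, not confirmed schema:

YAML
apiVersion: orloj.dev/v1
kind: ToolApproval
metadata:
  name: trading-execution-approval-7f3a   # per-request name (illustrative)
spec:
  tool_ref: trading-execution
  requested_operation: write
  ttl_seconds: 300      # assumed field: on expiry the task fails with approval_timeout
status:
  state: pending        # assumed values: pending -> approved | denied | expired
  reviewers:
    - trading-desk-head
    - risk-officer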

What this governance model enforces:

  • Model pinning prevents silent upgrades. Only listed models are allowed; agents with unlisted models are denied.
  • Dangerous tools are blocked outright. wire-transfer and account-close can never be invoked, regardless of permissions.
  • Trade execution requires human approval. The ToolApproval workflow pauses the task until a human signs off.
  • Token budgets are enforced. max_tokens_per_run stops runaway chains of thought.
  • Unauthorized calls fail closed. Denied with tool_permission_denied and logged.

Regulatory requirement mapping

How Orloj addresses each requirement:

  • Decision rationale documentation: full audit trail with model inputs, confidence scores, and tool calls
  • Segregation of duties: tool permissions by role; human approval gates
  • Reconstructable audit trails: immutable ledger of agent actions, approvals, and outcomes
  • Model inventory and versioning: model endpoint pinning; version tracking in every decision
  • Performance monitoring: confidence thresholds; alerting on approval rejections
  • Human oversight: approval gates enforced at the execution layer
  • Exposure controls: transaction value limits, blacklists, and per-tool rate limits
  • Business continuity: lease-based task ownership; automatic failover

Example: portfolio rebalancing agent

An investment firm runs an agent that monitors client portfolios and recommends rebalancing.

  1. Portfolio drifts 5% from target allocation
  2. Agent fetches current prices via market-data-api (sampled audit)
  3. Agent analyzes portfolio using pinned gpt-4-finance model (full audit)
  4. Agent determines: sell tech-heavy positions, buy bonds to restore target
  5. Agent attempts to call trading-execution

Orloj intervenes:

  • The tool requires human approval (defined in policy)
  • Request routes to the trading desk head and risk officer
  • Both must approve within 5 minutes or the action is rejected
  • The trading desk head reviews the order, rationale, and confidence score
  • They approve with a modification: "Execute at 50% of recommended position size"

Orloj logs the agent's recommendation and the human override, then executes the modified order.

Time       Event
10:23:45   Agent: portfolioOptimizer
10:23:46   Model: gpt-4-finance v1.2.3
10:23:47   Decision: Sell 1000 NVDA, Buy 2000 BND
10:23:48   Confidence: 0.82
10:24:30   Approval: portfolioManager APPROVED (partial)
10:24:31   Approval: riskOfficer APPROVED
10:24:32   Execution: Sell 500 NVDA, Buy 1000 BND (modified)

If regulators ask "Why did this trade happen?", the firm produces: the agent's analysis, model version and confidence score, human reviews, the actual executed order, and the modification applied.

Compliance checklist

Model risk management (SR 11-7)

  • Maintain an inventory of all models agents use
  • Enforce validation requirements before deployment
  • Pin model versions and prevent automatic upgrades
  • Monitor performance and alert on degradation

Audit and regulatory reporting

  • Export audit logs in formats suitable for SEC examination
  • Reconstruct any agent decision with full context and approvals
  • Prove transactions were authorized before execution
  • Identify all customer accounts affected by a specific agent

Human oversight

  • Define which actions require approval
  • Enforce time limits on approval windows
  • Route approvals to the right role by transaction type and size
  • Log all modifications and rejections

Risk controls

  • Enforce transaction size limits
  • Implement security and counterparty blacklists
  • Prevent agents from operating outside business hours
  • Auto-disable agents that violate policies repeatedly
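Several of these controls could plausibly be expressed declaratively alongside the AgentPolicy shown earlier. Every field in the sketch below (transaction limits, trading hours, rate limits) is a hypothetical extension for illustration, not confirmed Orloj schema:

YAML
apiVersion: orloj.dev/v1
kind: AgentPolicy
metadata:
  name: risk-limits
spec:
  apply_mode: scoped
  target_systems:
    - portfolio-rebalancing-system
  # All fields below are illustrative assumptions
  max_transaction_value_usd: 250000
  blocked_securities:
    - restricted-list-a
  allowed_hours:
    timezone: America/New_York
    start: "09:30"
    end: "16:00"
  rate_limits:
    trading-execution: 10/hour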

Getting started

Phase 1: Define compliance boundaries. Work with legal and compliance to define approval requirements, audit evidence needs, retention periods, and applicable frameworks (SEC, OCC, FINRA).

Phase 2: Write policy manifests. Translate those requirements into Orloj policies: tool permissions, model pinning, rate limits, and audit levels.

Phase 3: Deploy and validate. Run in observation mode. Verify audit logs capture what compliance needs. Validate with your examination team.

Phase 4: Enforce and monitor. Flip gates to enforcement. Track agent behavior, approval latency, and policy violations.
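The observation-to-enforcement flip in phases 3 and 4 might amount to a one-line manifest change reviewed in a PR. The `enforcement` field here is an assumption for illustration; the source does not specify how observation mode is configured:

YAML
# Hypothetical: run gates in observation mode first, then flip to enforce
spec:
  enforcement: observe   # assumed field; logs would-be verdicts without blocking
  # enforcement: enforce # flipped in phase 4 via a reviewed manifest change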

Most finserv firms move through these phases in 6–8 weeks. The investment in writing policies pays off the first time an examiner asks for evidence.


Frequently asked questions

Is Orloj itself regulated by the SEC?

No. Orloj is software you run on your infrastructure. The SEC regulates your use of AI; you configure the governance, Orloj enforces it.

Can Orloj govern high-frequency or fully algorithmic trading systems?

Orloj is designed for semi-autonomous systems that benefit from human oversight and governance. Pure algorithmic systems with microsecond latency and no human involvement have different requirements.

How do we route approvals by transaction size?

Define that in your AgentPolicy. A $100K trade might go to a junior trader. A $1M trade escalates to the portfolio manager and risk officer.
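A value-based escalation rule might be sketched as an extension of the ToolPermission shown earlier. The `conditions` and `approver_roles` fields are assumptions for illustration, not confirmed Orloj syntax:

YAML
apiVersion: orloj.dev/v1
kind: ToolPermission
metadata:
  name: trade-value-escalation
spec:
  tool_ref: trading-execution
  match_mode: all
  required_permissions:
    - tool:trading-execution:invoke
  operation_rules:
    # Hypothetical value-based routing; field names are illustrative
    - operation_class: write
      conditions:
        max_value_usd: 100000
      verdict: approval_required
      approver_roles: [junior-trader]
    - operation_class: write
      verdict: approval_required
      approver_roles: [portfolio-manager, risk-officer]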

How do we detect model drift?

Orloj captures model version and confidence scores in every decision. Query for low-confidence decisions over time. If drift is detected, pin to an older model version while you investigate.

What happens when a human rejects an approval request?

Orloj logs the rejection and returns an error to the agent. The agent can retry with different parameters, escalate, or fail gracefully.

How do we prepare for a regulatory examination?

Export audit logs for the period under examination. Provide AgentPolicy manifests showing your governance rules. Examiners can verify that policies are enforced and decisions are documented.