
Why Every Agent System Needs a Governance Layer (Not Just Guardrails)

Jon Mandraki

Everyone talks about "adding guardrails" to agent systems. It's become the default move when someone asks about safety. Build your agent, add guardrails at the exit, and you're protected.

This is incomplete. Guardrails and governance are not the same thing, and conflating them leaves you vulnerable.

Guardrails Are Output Filters

Start with what guardrails actually do. Guardrails (Guardrails AI, NeMo Guardrails, and similar products), as they are typically deployed, inspect the outputs of a language model before they reach the user. They check for:

  • Toxicity, hate speech, or harmful content
  • Personally identifiable information (PII)
  • Hallucinations or contradictions
  • Refusals that the model shouldn't make

They're a gate at the output stage. The agent runs, generates output, and the guardrail sits between the model and the world saying "yes, that's safe to output" or "no, filter this."
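To make the "gate at the output stage" concrete, here is a minimal Python sketch of an output filter. It is not how Guardrails AI or NeMo Guardrails are implemented. The regex, the blocklist, and the name `check_output` are made up for illustration, and real products use far richer detectors.

```python
import re

# Hypothetical patterns for illustration only.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
BLOCKED_TERMS = {"example-slur"}  # placeholder blocklist


def check_output(text: str) -> tuple[bool, str]:
    """Return (allowed, text_to_show). Runs only after the model has produced text."""
    lowered = text.lower()
    if any(term in lowered for term in BLOCKED_TERMS):
        return False, "[response withheld by output filter]"
    # Redact rather than block when the only issue is an email address.
    return True, EMAIL_RE.sub("[redacted email]", text)
```

Note what the function receives: a string. It has no view of which tools ran or what they did while that string was being produced.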

This matters. Output filtering is real work and it's not trivial to get right.

But here's the trap: guardrails assume the problem is the model saying something bad. They don't prevent the model from doing something bad.

What Guardrails Miss

An agent calling a tool is not an output. It's an action. If an agent decides to delete your production database, that's not an LLM hallucination. That's an executed action. By the time a guardrail sees the output, the database is already gone.

Guardrails can't control:

Tool selection and invocation. If an agent calls a tool, the guardrail might see the tool call in the logs, but it's too late. The tool already ran.

Token budgets. A filter at the output stage can't stop a model mid-generation because you've hit your budget cap. It can measure spend after the fact, but preventing overspend requires controlling token usage at the inference layer, before the call is made.

Model selection. If an agent decides to call GPT-4 instead of GPT-4o, the guardrail sees the output, not the model decision. By then the expensive model has already run and you've already paid for it.

Audit trails and compliance. Guardrails can log outputs, but if you need to prove that an action followed policy, you need logging at the point where the policy was enforced, not at the output stage.

Execution context. If an agent is running with elevated privileges because of a misconfiguration, a guardrail still sees the same outputs. The problem is the privilege level, not the text.

Failure recovery. If an agent makes ten tool calls and the eighth one fails, guardrails don't help you handle the failure. You need orchestration.

These are not edge cases. These are production concerns that every ops team eventually hits.

A Governance Layer Is Different

A governance layer sits in the execution environment. It's between the agent and the tools, between the agent and the models, between the agent and the budget. Every action flows through it.

What does governance actually do?

Policies define what's allowed. Before an agent runs, you define policies: which tools it can call, which models it can invoke, how many tokens it can spend, which data sources it can access. These policies live as code or configuration.

Runtime enforcement. When the agent tries to call a tool, the governance layer checks: "Is this tool in the policy?" If not, the call fails immediately. The tool never runs.

Authorization. Agents have roles. Roles have permissions. An agent with the "read-only" role can call read tools. It can't call delete tools. This is checked before the action runs, not after.
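The enforcement and authorization steps above can be sketched in a few lines of Python. This is an illustration under assumptions, not Orloj's API: `GovernedTools`, `ROLE_PERMISSIONS`, `PolicyDenied`, and the tool names are all hypothetical.

```python
from typing import Any, Callable

# Hypothetical role table: role name -> set of tool names it may call.
ROLE_PERMISSIONS: dict[str, set[str]] = {
    "read-only": {"read_record"},
    "admin": {"read_record", "delete_record"},
}


class PolicyDenied(Exception):
    """Raised when a call is rejected before the tool runs."""


class GovernedTools:
    """Single entry point for tool calls; the agent never holds the raw functions."""

    def __init__(self, tools: dict[str, Callable[..., Any]]):
        self._tools = tools

    def call(self, role: str, tool_name: str, **kwargs: Any) -> Any:
        allowed = ROLE_PERMISSIONS.get(role, set())  # unknown role: nothing allowed
        if tool_name not in allowed or tool_name not in self._tools:
            raise PolicyDenied(f"role {role!r} may not call {tool_name!r}")
        # Only reached when the policy check passed.
        return self._tools[tool_name](**kwargs)
```

The design choice that matters is that the check and the dispatch live in the same method. There is no code path that reaches the tool without passing the check first.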

Budget enforcement. If an agent tries to call a model but has hit its token budget, the governance layer says no. The model call doesn't happen.
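A rough sketch of that pre-call check, assuming a reserve-then-settle scheme. `TokenBudget`, `governed_completion`, and the `call_model` callable are hypothetical names, and for brevity the sketch treats `max_tokens` as the worst-case cost of a call and ignores prompt tokens.

```python
class BudgetExceeded(Exception):
    """Raised before a model call that would exceed the budget."""


class TokenBudget:
    def __init__(self, limit: int):
        self.limit = limit
        self.spent = 0

    def reserve(self, max_tokens: int) -> None:
        """Reserve the worst-case cost up front; refuse if it does not fit."""
        if self.spent + max_tokens > self.limit:
            raise BudgetExceeded(
                f"need {max_tokens}, only {self.limit - self.spent} left"
            )
        self.spent += max_tokens

    def settle(self, reserved: int, actual: int) -> None:
        """Return the unused part of a reservation once real usage is known."""
        self.spent -= max(reserved - actual, 0)


def governed_completion(budget, call_model, prompt, max_tokens):
    budget.reserve(max_tokens)  # raises before any tokens are spent
    # call_model is a hypothetical client returning (text, tokens_used).
    text, used = call_model(prompt, max_tokens)
    budget.settle(max_tokens, used)
    return text
```

Reserving before the call is what turns a budget from a report into a limit: the request that would overshoot is never sent.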

Audit trails. Every action is logged at the governance layer: which agent made the request, what it requested, whether it was allowed, and what happened. Kept in append-only storage, this is the record you can show for compliance.
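One common way to make such a log tamper-evident is to chain entries by hash. The sketch below is an assumption about how this could be done, not a description of any particular product. `AuditLog` and its field names are hypothetical, and timestamps are left out to keep the example deterministic.

```python
import hashlib
import json


class AuditLog:
    """Append-only log where each entry commits to the previous entry's hash."""

    def __init__(self):
        self.entries: list[dict] = []

    def append(self, agent: str, action: str, allowed: bool, outcome: str) -> dict:
        prev_hash = self.entries[-1]["hash"] if self.entries else "0" * 64
        record = {
            "agent": agent,
            "action": action,
            "allowed": allowed,
            "outcome": outcome,
            "prev": prev_hash,
        }
        record["hash"] = self._digest(record)
        self.entries.append(record)
        return record

    @staticmethod
    def _digest(record: dict) -> str:
        # Hash every field except the hash itself, with stable key order.
        body = {k: v for k, v in record.items() if k != "hash"}
        return hashlib.sha256(json.dumps(body, sort_keys=True).encode()).hexdigest()

    def verify(self) -> bool:
        """Recompute the chain; editing an entry or dropping one mid-chain breaks it."""
        prev = "0" * 64
        for record in self.entries:
            if record["prev"] != prev or record["hash"] != self._digest(record):
                return False
            prev = record["hash"]
        return True
```

A chain like this does not detect someone truncating the tail of the log, so in practice the latest hash would also need to be stored somewhere the writer cannot modify.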

Failure handling. If a tool call fails, the governance layer decides what happens next: retry with backoff, escalate to a human, mark the task as dead-lettered. This is central policy, not scattered error handling across your code.
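As a sketch of what "central policy" might mean here, the function below retries with exponential backoff and parks the task in a dead-letter list when attempts run out. The name `run_with_recovery`, its parameters, and the in-memory dead-letter list are illustrative assumptions. Escalation to a human is left out.

```python
import time
from typing import Any, Callable


def run_with_recovery(
    task_id: str,
    action: Callable[[], Any],
    dead_letters: list[dict],
    max_attempts: int = 3,
    base_delay: float = 0.5,
    sleep: Callable[[float], None] = time.sleep,
) -> Any:
    """Retry with exponential backoff; dead-letter the task if every attempt fails."""
    last_error = None
    for attempt in range(max_attempts):
        try:
            return action()
        except Exception as exc:  # a real policy would retry only transient errors
            last_error = exc
            if attempt < max_attempts - 1:
                sleep(base_delay * 2 ** attempt)  # 0.5s, then 1s, then 2s, ...
    dead_letters.append({"task": task_id, "error": repr(last_error)})
    return None
```

Because every tool call goes through one function, changing the retry count or the backoff is a one-line policy change rather than a hunt through call sites.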

Cost attribution. You track not just total spend, but per-agent spend. Which agent ran up the bill? Which tools consumed the most tokens?
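Per-agent attribution falls out almost for free once every call passes through one place. A minimal sketch, with `CostLedger` and the agent and tool names invented for the example:

```python
from collections import defaultdict


class CostLedger:
    """Accumulates token usage keyed by agent and by tool."""

    def __init__(self):
        self.by_agent: dict[str, int] = defaultdict(int)
        self.by_tool: dict[str, int] = defaultdict(int)

    def record(self, agent: str, tool: str, tokens: int) -> None:
        self.by_agent[agent] += tokens
        self.by_tool[tool] += tokens

    def top_agent(self) -> str:
        """Agent with the highest spend; raises ValueError if nothing was recorded."""
        return max(self.by_agent, key=self.by_agent.get)
```

Calling `record` from the same code path that authorizes the call means the ledger cannot drift from what actually ran.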

This isn't middleware you bolt onto an existing system. It's the execution environment itself.

Why the Distinction Matters

Let me make this concrete with an example.

You're running an agent that manages cloud resources. It can call tools like describe_instances, start_instance, stop_instance, and terminate_instance.

With only guardrails: The agent decides to call terminate_instance on your production database server. The function runs. The database is gone. The guardrail looks at the output ("Successfully terminated instance...") and allows it through because technically the statement is true and not toxic. Now you're explaining to your boss why an AI system deleted production.

With governance: Before the agent runs, you set a policy: "This agent role can call describe and start, but not terminate." The agent tries to call terminate_instance. The governance layer checks the policy, rejects the call, and the action never reaches the tool. The agent fails safely. You can handle that failure gracefully, escalate to a human, or route to a different workflow.
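The two scenarios can be played out in a toy Python script. Everything here is a stand-in: the `terminated` list fakes the cloud side effect, `output_guardrail` is a deliberately trivial filter, and `governed_call` is a hypothetical wrapper, not a real framework function.

```python
from typing import Any

terminated: list[str] = []  # stands in for real cloud side effects


def describe_instances() -> list[str]:
    return ["i-prod-db"]


def terminate_instance(instance_id: str) -> str:
    terminated.append(instance_id)
    return f"Successfully terminated instance {instance_id}"


def output_guardrail(text: str) -> bool:
    """Toy output filter: the success message is true and non-toxic, so it passes."""
    return "password" not in text.lower()


def guardrails_only(instance_id: str) -> str:
    result = terminate_instance(instance_id)  # the side effect happens here
    return result if output_guardrail(result) else "[filtered]"


ALLOWED_TOOLS = {"describe_instances", "start_instance"}  # policy from the example
TOOLS = {
    "describe_instances": describe_instances,
    "terminate_instance": terminate_instance,
}


def governed_call(tool_name: str, **kwargs: Any) -> Any:
    # The policy check comes first; a denied call never reaches the tool.
    if tool_name not in ALLOWED_TOOLS:
        raise PermissionError(f"{tool_name} is not permitted for this role")
    return TOOLS[tool_name](**kwargs)
```

In `guardrails_only`, the filter runs on the line after the damage is done. In `governed_call`, a request for `terminate_instance` raises `PermissionError` and `terminated` stays empty.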

Better outcomes don't come from filtering the text. They come from controlling the execution.

The Bolted-On Problem

Most frameworks treat governance as optional. You can add it. You should add it. But it's not built in.

This is how you end up with governance that's inconsistently applied. One service enforces policies, another one doesn't. One agent flow has audit logging, another one skips it when things are simple. One developer remembered to add permission checks. Another one said "this internal tool doesn't need that."

When governance is optional, it becomes inconsistent. When it's inconsistent, it doesn't work.

Real governance is structural. It's the thing that the system can't work without, not the thing you add if you're being careful. It's like authentication in a web server: it's not middleware you layer on top of HTTP. It's part of the request handling from the first byte.

Orloj's governance is built into the runtime. Every tool call, every model invocation, every resource access goes through the governance layer. You can't opt out. You can loosen the policies, but you can't skip the check.

When You Need Both

Guardrails and governance are orthogonal.

Governance controls whether an action is allowed to run. Guardrails control which model outputs reach the user after the model has run.

You need governance to prevent unauthorized actions from executing. You need guardrails to prevent the model from outputting toxic, private, or misleading information.

An agent with full governance controls can still be tricked into saying something horrible. A guardrail will catch that. The reverse also holds: an agent whose every output passes the guardrail can still hold execute permissions on your production infrastructure, and clean text doesn't protect you from what it does with them.

Both matter. But governance is the load-bearing one.

What Governance Looks Like in Practice

Real governance policies look like this:

apiVersion: orloj.io/v1
kind: Policy
metadata:
  name: data-analyst-agent
spec:
  agents:
    - selector: role=data-analyst
  allowedTools:
    - bigquery:read_table
    - bigquery:list_datasets
  allowedModels:
    - gpt-4o
  budget:
    tokensPerRun: 10000
    tokensPerDay: 100000
  auditLevel: full

This says: agents with role data-analyst can read from BigQuery, but not write. They can use GPT-4o but not GPT-4 or Claude. They get a token budget per run and per day. Everything is logged.

If an agent tries to call a tool outside this policy, the request fails closed. If it hits the token budget, execution stops. If you need to revoke access, you change the policy and redeploy.
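To show what "fails closed" could mean in code, here is a hypothetical evaluator over a Python mirror of the policy above. The field names are copied from the YAML, but `evaluate`, the request shape, and the `bigquery:delete_table` tool name are assumptions made for the example, not Orloj's actual behavior or API.

```python
# Python mirror of the YAML policy; field names copied from it.
POLICY = {
    "allowedTools": ["bigquery:read_table", "bigquery:list_datasets"],
    "allowedModels": ["gpt-4o"],
    "budget": {"tokensPerRun": 10000, "tokensPerDay": 100000},
}


def evaluate(policy: dict, request: dict, used_run: int, used_day: int) -> tuple[bool, str]:
    """Return (allowed, reason). Anything not explicitly allowed is denied."""
    kind = request.get("kind")
    if kind == "tool":
        ok = request.get("name") in policy.get("allowedTools", [])
        return ok, ("tool allowed" if ok else "tool not in policy")
    if kind == "model":
        if request.get("name") not in policy.get("allowedModels", []):
            return False, "model not in policy"
        tokens = request.get("tokens", 0)
        budget = policy.get("budget", {})
        # A missing budget field defaults to 0, so an incomplete policy denies.
        if used_run + tokens > budget.get("tokensPerRun", 0):
            return False, "per-run budget exceeded"
        if used_day + tokens > budget.get("tokensPerDay", 0):
            return False, "per-day budget exceeded"
        return True, "model call allowed"
    return False, "unknown request kind"  # fail closed
```

For instance, `evaluate(POLICY, {"kind": "tool", "name": "bigquery:delete_table"}, 0, 0)` is denied because the tool is absent from `allowedTools`, and a request of a kind the evaluator does not recognize is denied by the final `return`.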

This lives in version control. It's code. It's auditable.

The Trend in Frameworks

LangGraph is adding more governance-like features. So is CrewAI with its enterprise offering. This is good. It means the space is recognizing that governance matters.

But most of the implementations are still optional. You can add a policy layer. You should. But nothing forces you to.

Orloj treats it as required because, well, infrastructure needs governance. You wouldn't run Kubernetes without RBAC. You wouldn't run a database without user permissions. Why would you run an agent without governance?

Final Word

If you're building a prototype, guardrails might be enough. Output filtering is real safety.

If you're running agents in production, you need both. Guardrails catch bad outputs. Governance prevents bad actions.

The hard part is that governance is unsexy. It's not a product feature. It's not something you demo. It doesn't make agents faster or smarter. It just makes them reliable and auditable.

But that's the job. That's what production infrastructure looks like.
