
Orloj vs. Kagent: Declarative Agents Without Kubernetes

Jon Mandraki

Kagent got accepted into the CNCF Sandbox last year. That's a real signal. It means the cloud-native community thinks agent orchestration belongs in Kubernetes, managed the same way you manage everything else on K8s: CRDs, operators, kubectl.

Orloj thinks agents should be declared too. We use YAML manifests, a CLI, and version-controlled resources. But we don't require Kubernetes. Orloj runs anywhere from a single process on your laptop to a distributed deployment with Postgres and NATS.

Both projects share a conviction that imperative agent code is a dead end for production. The difference is where the runtime lives and how governance works.

Architecture: where agents run

Kagent is a Kubernetes operator. You define agents as custom resources, whose schemas are registered through Custom Resource Definitions (CRDs). The Kagent controller watches for changes, reconciles state, and manages agent lifecycles using the standard K8s control loop. Agents run as pods. Governance is Kubernetes RBAC.

If you already think in namespaces, resource quotas, and Helm charts, this feels natural. Your agents get the same infrastructure patterns as your microservices.
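The control loop idea is easier to see in a few lines of code. The sketch below is a generic illustration of reconciliation, not Kagent's controller: `AgentSpec`, its fields, and the action strings are all hypothetical names invented for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AgentSpec:
    """Desired state for one agent (hypothetical fields, for illustration only)."""
    name: str
    replicas: int


def reconcile(desired: dict, running: dict) -> list:
    """Compare declared state with observed state and return the actions needed.

    `desired` maps agent name to AgentSpec.
    `running` maps agent name to the number of pods currently observed.
    """
    actions = []
    for name, spec in sorted(desired.items()):
        observed = running.get(name, 0)
        if observed < spec.replicas:
            actions.append(f"start {spec.replicas - observed} pod(s) for {name}")
        elif observed > spec.replicas:
            actions.append(f"stop {observed - spec.replicas} pod(s) for {name}")
    # Anything still running that is no longer declared gets removed.
    for name in sorted(set(running) - set(desired)):
        actions.append(f"delete all pods for {name}")
    return actions


desired = {"triage": AgentSpec("triage", 2), "summarizer": AgentSpec("summarizer", 1)}
running = {"triage": 1, "legacy": 3}
for action in reconcile(desired, running):
    print(action)
```

A real operator runs this comparison every time a watched resource changes, which is why editing the manifest is enough to change what runs.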

Orloj is a standalone server and worker architecture. You define agents, systems, policies, and tools in YAML manifests and apply them with orlojctl. The server handles scheduling, resource storage, and the API. Workers execute agent tasks. Governance is built into the runtime layer: agent policies, roles, tool permissions, token budgets.

You can run Orloj on a laptop for development, then scale to distributed workers for production. No cluster required.

What this means in practice

Want to add a new agent? In Kagent, you write a custom resource manifest and run kubectl apply. In Orloj, you write a YAML manifest and run orlojctl apply. The developer experience is intentionally similar.

The differences show up in operations. Kagent agents are K8s pods. They scale, crash-restart, and network-isolate the way pods do. Orloj agents are tasks managed by Orloj's own scheduler. They get lease-based ownership, retry with jitter, dead-letter handling, and idempotency tracking from Orloj's runtime.
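To make three of those primitives concrete, here is a minimal sketch of retry with jitter, dead-letter handling, and idempotency tracking. It is not Orloj's scheduler code: `run_task`, `backoff_delay`, the in-memory `completed` dict, and the `dead_letter` list are assumptions made for illustration, and lease-based ownership is left out.

```python
import random


def backoff_delay(attempt, base=0.5, cap=30.0):
    """Full-jitter exponential backoff: a random delay in [0, min(cap, base * 2**attempt)]."""
    return random.uniform(0.0, min(cap, base * (2 ** attempt)))


def run_task(task_id, handler, completed, dead_letter, max_attempts=3,
             sleep=lambda seconds: None):
    """Run `handler` with retries.

    Skips work that already finished (idempotency), waits a jittered delay
    between attempts, and records the task in `dead_letter` once attempts
    are exhausted. `sleep` defaults to a no-op so the demo runs instantly;
    a real worker would pass time.sleep.
    """
    if task_id in completed:
        return completed[task_id]
    last_error = None
    for attempt in range(max_attempts):
        try:
            completed[task_id] = handler()
            return completed[task_id]
        except Exception as exc:
            last_error = exc
            if attempt < max_attempts - 1:
                sleep(backoff_delay(attempt))
    dead_letter.append((task_id, repr(last_error)))
    return None


calls = {"n": 0}


def flaky():
    """Fails twice, then succeeds."""
    calls["n"] += 1
    if calls["n"] < 3:
        raise RuntimeError("transient failure")
    return "done"


completed, dead = {}, []
print(run_task("task-1", flaky, completed, dead))  # succeeds on the third attempt
print(run_task("task-1", flaky, completed, dead))  # answered from the idempotency record
print(calls["n"])                                  # the handler ran only three times
```

The jitter matters when many workers fail at once: randomized delays keep them from retrying in lockstep against the same dependency.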

Neither approach is wrong. They're built for different infrastructure assumptions.

Governance: RBAC vs. runtime policies

This is the sharpest difference.

Kagent relies on Kubernetes RBAC for access control. You use K8s roles, service accounts, and namespaces to control what agents can access. This works well for infrastructure-level permissions: which Kubernetes API resources can this pod's service account call, which secrets can it read, which namespaces can it operate in.

What K8s RBAC doesn't do is agent-level governance. Questions like: Can this agent use GPT-4 or only GPT-3.5? How many tokens can it consume per task? Which tools is it allowed to call? What's the maximum number of steps before it must stop?

Orloj has a dedicated governance layer. AgentPolicy resources define per-agent constraints: allowed tools, model restrictions, token budgets, step limits, rate limits. AgentRole resources define role-based access control at the agent level. ToolPermission resources control which tools are available to which agents.

All of this is enforced at the execution layer, not in application code. An agent that tries to call a tool it's not authorized for gets a fail-closed denial. The event is logged. There's an audit trail.
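A fail-closed check with an audit trail can be sketched in a few lines. This is an illustration of the idea, not Orloj's enforcement code: the `AgentPolicy` field names, the `Enforcer` class, and the agent, tool, and model strings are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class AgentPolicy:
    """Hypothetical per-agent constraints; field names are illustrative."""
    allowed_tools: frozenset = frozenset()
    allowed_models: frozenset = frozenset()
    token_budget: int = 0
    max_steps: int = 0


@dataclass
class Enforcer:
    policies: dict
    audit_log: list = field(default_factory=list)

    def authorize(self, agent, tool, model, tokens_used, step):
        """Return True only if every constraint passes; log every decision."""
        policy = self.policies.get(agent)
        if policy is None:
            reason = "no policy found"  # fail closed: unknown agents are denied
        elif tool not in policy.allowed_tools:
            reason = f"tool {tool!r} not allowed"
        elif model not in policy.allowed_models:
            reason = f"model {model!r} not allowed"
        elif tokens_used > policy.token_budget:
            reason = "token budget exceeded"
        elif step > policy.max_steps:
            reason = "step limit exceeded"
        else:
            reason = None
        allowed = reason is None
        self.audit_log.append(
            {"agent": agent, "tool": tool, "allowed": allowed, "reason": reason}
        )
        return allowed


enforcer = Enforcer({
    "support-bot": AgentPolicy(
        allowed_tools=frozenset({"search_docs"}),
        allowed_models=frozenset({"gpt-3.5"}),
        token_budget=4000,
        max_steps=10,
    )
})
print(enforcer.authorize("support-bot", "search_docs", "gpt-3.5", 1200, 3))
print(enforcer.authorize("support-bot", "delete_user", "gpt-3.5", 1200, 4))
print(enforcer.authorize("unknown-bot", "search_docs", "gpt-3.5", 0, 1))
for entry in enforcer.audit_log:
    print(entry)
```

The key design choice is that the default answer is denial: a missing policy or an unlisted tool never falls through to "allowed", and both outcomes land in the log.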

If your agents operate in regulated environments (healthcare, financial services, government), runtime-enforced governance with audit trails is a requirement, not a feature.

Tool isolation

Kagent isolates tools through Kubernetes networking and MCP. Agents interact with tools as MCP servers, which can be network-isolated using K8s network policies. Practical, especially if your tools are already running as services in the cluster.

Orloj offers four isolation backends: direct (tools run in the worker process), sandboxed (restricted execution), container (OCI containers), and WASM (WebAssembly modules). You configure isolation per tool based on risk. High-risk tools get WASM with read-only filesystem and no network access. Low-risk tools get direct execution.
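As a rough sketch of choosing isolation by risk, the snippet below maps a risk label to one of the four backends. The high-risk and low-risk choices follow the description above. The `medium` tier, the `ToolIsolation` fields, and the fallback to the strictest setting for unknown labels are assumptions for illustration, not Orloj's configuration schema.

```python
from dataclasses import dataclass
from enum import Enum


class Backend(Enum):
    DIRECT = "direct"        # runs in the worker process
    SANDBOXED = "sandboxed"  # restricted execution
    CONTAINER = "container"  # OCI container
    WASM = "wasm"            # WebAssembly module


@dataclass(frozen=True)
class ToolIsolation:
    """Hypothetical per-tool isolation settings."""
    backend: Backend
    read_only_fs: bool
    network: bool


def isolation_for(risk: str) -> ToolIsolation:
    """Pick an isolation profile from a risk label."""
    if risk == "low":
        return ToolIsolation(Backend.DIRECT, read_only_fs=False, network=True)
    if risk == "medium":
        # Assumption: the post does not say which backend suits medium risk.
        return ToolIsolation(Backend.CONTAINER, read_only_fs=True, network=True)
    # "high" and any unrecognized label get the strictest profile.
    return ToolIsolation(Backend.WASM, read_only_fs=True, network=False)


for tool, risk in [("format_date", "low"), ("fetch_url", "medium"), ("run_user_code", "high")]:
    profile = isolation_for(risk)
    print(tool, profile.backend.value, profile.read_only_fs, profile.network)
```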

The granularity is different. K8s network policies are coarse-grained (pod-to-pod). Orloj's tool isolation is fine-grained (per-tool execution environment).

When to use Kagent

You run everything on Kubernetes and plan to keep it that way. Your team thinks in CRDs and operators. Your agents primarily interact with K8s resources and cloud-native services. You want CNCF ecosystem integration (Prometheus, Grafana, OpenTelemetry) through standard K8s patterns. Kubernetes RBAC is sufficient for your governance needs.

Kagent makes sense when your agents are part of a larger K8s platform and you want them managed the same way.

When to use Orloj

You need agent-specific governance beyond infrastructure RBAC. You're in a regulated industry and need audit trails with per-agent policy enforcement. You want to run agents without requiring a Kubernetes cluster. You need fine-grained tool isolation (WASM, container sandboxing). You want built-in reliability primitives (lease-based ownership, dead-letter queues, idempotent retries) without building them on top of K8s.

Orloj makes sense when your agents need their own operational layer, regardless of where that layer runs.

Can they work together?

Maybe. Orloj can run on Kubernetes. You could deploy orlojd and orlojworker as K8s deployments, use Postgres on K8s for state, and NATS JetStream for messaging. In that setup, K8s handles infrastructure lifecycle while Orloj handles agent lifecycle and governance.

Kagent could potentially manage Orloj components as K8s resources, though nobody has built this integration.

The more interesting question is whether the agent layer and the infrastructure layer should be the same thing. Kagent says yes: agents are infrastructure, manage them with the same tools. Orloj says no: agents need their own governance and reliability primitives that don't map cleanly to K8s concepts.

I'm biased, obviously. But I think the governance question is the deciding factor. If K8s RBAC covers your access control needs, Kagent's approach is clean and well-integrated. If you need token budgets, tool-level permissions, model restrictions, and fail-closed policy enforcement, you need a dedicated agent governance layer.

Decision matrix

| Dimension | Kagent | Orloj |
| --- | --- | --- |
| Infrastructure requirement | Kubernetes cluster | Any (laptop to distributed) |
| Agent definition | K8s CRDs | YAML manifests |
| CLI | kubectl | orlojctl |
| Governance model | K8s RBAC | Runtime policies, roles, permissions |
| Token budgets | Not built in | Per-agent budgets |
| Tool isolation | K8s network policies | Direct, sandbox, container, WASM |
| Scaling | K8s pod autoscaling | Worker horizontal scaling |
| Observability | K8s-native (Prometheus, OTel) | Built-in + integration hooks |
| CNCF ecosystem | Native (Sandbox project) | Compatible but standalone |
| Audit trails | K8s audit logs | Agent-level audit trails |
| Learning curve | K8s + CRDs + Kagent | Orloj manifests + architecture |
| Maturity | Early (CNCF Sandbox) | Early (pre-1.0) |

Both are early-stage. Both are open source. Both are building toward the same insight: agents need to be declared and managed, not wired together with glue code.

Pick the one that fits how you run infrastructure today and what governance you need tomorrow.
