Tool Boundaries

Today: agents call tools they should not touch. With Orloj: tool permissions are enforced at execution time.
Orloj is an open-source orchestration runtime for multi-agent AI systems. Define agents, tools, policies, and workflows in YAML. Orloj schedules, executes, and governs them.
Pick your lens.
You have AI agents. They need rules, schedules, and someone watching. Orloj is that someone.
→ Clone and run it

You know the frameworks. Orloj is what you reach for when prototypes have to become reliable production systems: YAML-defined workflows, tools running in isolated containers, fail-closed governance, retries, and observability without the glue code.
→ Read the integration docs

It’s Kubernetes for AI agents. Open source. Declarative. The missing infrastructure layer between ‘demo agent’ and ‘production agent fleet.’
→ View the repository

Declarative YAML manifests for agents, policies, and DAG workflows. Tools run in isolated containers with explicit image pins. CLI apply/rollback. Fail-closed governance at the execution layer. Lease-based scheduling with dead-letter handling.
```yaml
apiVersion: orloj.dev/v1
kind: Agent
metadata:
  name: research-agent
spec:
  model_ref: openai-default
  prompt: You are a research assistant. Be concise.
  roles:
    - analyst-role
  tools:
    - name: web_search
    - name: code_exec
  limits:
    max_steps: 6
    timeout: 30s
```

→ Read the architecture docs
Same agent ambition. Different operational outcomes once runtime constraints are enforced as policy, not convention.
| Capability | Today | With Orloj |
|---|---|---|
| Tool Boundaries | Agents call tools they should not touch. | Tool permissions enforced at execution time. |
| Cost Controls | Token spend spikes without policy limits. | Per-agent token caps and model allowlists. |
| Failure Handling | Retries and dead-letter handling are hand-rolled. | Lease-based retry, replay, and dead-letter primitives. |
| System Composition | Multi-agent wiring lives in bespoke glue code. | Declarative YAML graphs with fan-out and join gates. |
| Auditability | No end-to-end trace when incidents hit production. | Full task trace and message lifecycle logging. |
The platform is designed for teams that need deterministic execution, policy enforcement, and safe operations under real production load.
Policies and permissions are evaluated inline on every turn and tool call. Unauthorized actions fail closed with traceable outcomes.
Version-controlled manifests for agents, tools, models, and workflows. Apply once, diff in PRs, and roll back safely.
Reliability primitives you'd otherwise hand-roll. Fan-out/fan-in and failure handling are part of the runtime, not application code you maintain.
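Without a runtime, that wiring is exactly the glue code teams end up maintaining. A minimal asyncio sketch of fan-out with a join gate (the agent names and the `run_agent` stub are illustrative, not Orloj APIs):

```python
import asyncio

async def run_agent(name: str, task: str) -> str:
    # Stand-in for a real agent invocation (LLM call, tool use, etc.).
    await asyncio.sleep(0)
    return f"{name} result for {task!r}"

async def fan_out_fan_in(task: str) -> list[str]:
    # Fan out: start every branch agent concurrently.
    branches = [run_agent(n, task) for n in ("planner", "researcher", "writer")]
    # Fan in: the join gate waits for all branches before the workflow continues.
    return await asyncio.gather(*branches)

results = asyncio.run(fan_out_fan_in("draft report"))
print(results)
```

In Orloj this shape lives in the workflow graph instead, so failure handling and joins are the runtime's problem rather than application code.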
`orlojctl apply -f ./your-system/` reconciles agents, graph, governance, and tasks in a single declarative pass.
```yaml
apiVersion: orloj.dev/v1
kind: Agent
metadata:
  name: research-agent
spec:
  model_ref: openai-default
  prompt: |
    You are a research assistant.
    Produce concise, evidence-backed answers.
  tools:
    - web_search
    - vector_db
  roles:
    - analyst-role
  limits:
    max_steps: 6
    timeout: 30s
```

```yaml
apiVersion: orloj.dev/v1
kind: AgentSystem
metadata:
  name: report-system
spec:
  agents:
    - planner-agent
    - research-agent
    - writer-agent
  graph:
    planner-agent:
      next: research-agent
    research-agent:
      next: writer-agent
```

```yaml
apiVersion: orloj.dev/v1
kind: AgentPolicy
metadata:
  name: cost-and-security-policy
spec:
  apply_mode: scoped
  target_systems:
    - report-system
  max_tokens_per_run: 50000
  allowed_models:
    - gpt-4o
  blocked_tools:
    - filesystem_delete
```

Single process. In-memory storage. Sequential execution. No external dependencies.
```shell
orlojd --embedded-worker --storage-backend=memory
```
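At dispatch time, a policy like `cost-and-security-policy` reduces to a pair of inline checks. A sketch in Python (function and constant names are hypothetical, not Orloj internals):

```python
MAX_TOKENS_PER_RUN = 50_000          # mirrors max_tokens_per_run
ALLOWED_MODELS = {"gpt-4o"}          # mirrors allowed_models

def check_dispatch(model: str, tokens_used: int, tokens_requested: int) -> None:
    # Model allowlist: fail closed on anything not explicitly approved.
    if model not in ALLOWED_MODELS:
        raise PermissionError(f"model {model!r} is not on the allowlist")
    # Token cap: refuse the call that would push the run over budget.
    if tokens_used + tokens_requested > MAX_TOKENS_PER_RUN:
        raise RuntimeError("per-run token budget exhausted")

check_dispatch("gpt-4o", tokens_used=0, tokens_requested=1_000)  # passes
```

The runtime runs checks like these on every turn, so a spend spike is cut off at the call that would exceed the cap rather than discovered on the invoice.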
Webhook-triggered. Agents pull logs, correlate metrics, check recent deployments. Read-only tool permissions mean investigation agents can look but can't roll back infrastructure.
Pipeline agents check contracts against regulatory requirements. Model allowlists keep sensitive content off unapproved providers. Every finding is traced and auditable.
Researcher, analyst, and editor stages in a hierarchical agent system. The researcher can query CVE databases; only the editor can write to the output. Token budgets enforced per run.
Agents scan infrastructure for stale or exposed secrets using WASM-isolated tools. Metadata-only access patterns let agents audit secrets without reading secret values.
```shell
brew tap OrlojHQ/orloj
brew install orlojctl
orlojctl init example-system
```
```shell
curl -sSfL https://raw.githubusercontent.com/OrlojHQ/orloj/main/scripts/install.sh | sh
orlojd --storage-backend=memory --embedded-worker
orlojctl apply -f example-system
```
The full Orloj runtime, open source. Deploy on your own infrastructure with no limits.
Managed Orloj infrastructure so your team can focus on building agents, not operating them.
For organizations that need advanced security, compliance, and dedicated support.
Agent orchestration means coordinating multiple AI agents in production with governance, scheduling, and observability. Think of it as Kubernetes for agents: an agent fleet needs the same operational rigor as containers or databases.
LangChain helps you build agents. CrewAI helps agents collaborate. Orloj runs agents in production, with governance, observability, and the reliability patterns you expect from infrastructure. They solve different problems; they don’t compete.
Fail-closed means unauthorized actions are denied by default. An agent can only use tools you explicitly permit. Fail-open (the alternative) would allow actions unless you explicitly block them, which is a risky default in production.
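The two defaults can be sketched in a few lines of Python (illustrative only; Orloj enforces this at its execution layer, not via code like this):

```python
ALLOWED_TOOLS = {"web_search", "vector_db"}   # explicit permit list

def authorize_fail_closed(tool: str) -> bool:
    # Deny by default: only explicitly permitted tools pass.
    return tool in ALLOWED_TOOLS

BLOCKED_TOOLS = {"filesystem_delete"}         # explicit deny list

def authorize_fail_open(tool: str) -> bool:
    # Allow by default: anything not explicitly blocked passes.
    return tool not in BLOCKED_TOOLS

# A tool nobody thought to list: fail-closed denies it, fail-open lets it through.
print(authorize_fail_closed("shell_exec"))  # False
print(authorize_fail_open("shell_exec"))    # True
```

The unlisted tool is the whole argument: with a deny list you have to anticipate every dangerous capability in advance; with a permit list you only have to name the safe ones.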
Orloj is an orchestration plane for running agents. You can build agents in Orloj just like you would with frameworks like LangChain, LlamaIndex, or CrewAI. Orloj then manages them at scale with governance, scheduling, and reliability.
Not necessarily. Orloj works with agents built in any framework via standardized tool interfaces. Some refactoring may be needed for specific governance requirements, but you don’t need to rebuild from scratch.
Orloj includes lease-based task ownership, retry with jitter, idempotency tracking, and dead-letter handling. These patterns prevent cascading failures and ensure your agent fleet survives partial outages.
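Hand-rolled, retry with exponential backoff and full jitter looks roughly like this (a generic sketch of the pattern, not Orloj's implementation):

```python
import random
import time

def retry_with_jitter(fn, max_attempts: int = 5, base_delay: float = 0.5):
    """Exponential backoff with full jitter around a flaky callable."""
    for attempt in range(max_attempts):
        try:
            return fn()
        except Exception:
            if attempt == max_attempts - 1:
                raise  # exhausted: a runtime would dead-letter the task here
            # Full jitter: sleep a random amount up to the exponential cap,
            # so retrying agents don't stampede a recovering dependency.
            time.sleep(random.uniform(0, base_delay * 2 ** attempt))
```

Jitter matters in a fleet: if every agent retries on the same exponential schedule, they all hit the recovering dependency at once and knock it over again.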
Orloj logs all agent actions, tool calls, and policy decisions. The structured audit trail is designed to support compliance workflows for frameworks like HIPAA, SOC 2, and the EU AI Act. Governance is enforced at the execution layer, not as an afterthought.
Orloj provides structured logging, distributed tracing, metrics collection, and cost attribution. You can trace an agent’s decision path, see which tools it called, understand latency, and allocate costs by agent or workflow.
Yes. Orloj is Apache 2.0 licensed and developed publicly on GitHub. You can run it on-premise or in your own VPC.
If you’re familiar with Kubernetes, Docker, or infrastructure-as-code tools, Orloj will feel familiar. You define agents and policies in YAML manifests and deploy with a single command. The concepts are straightforward for engineers.
Orloj is Apache 2.0. The full runtime is open source: governance, orchestration, scheduling, observability.
Define your agents, enforce your policies, and ship to production.