AI Agent Orchestration Gets a Control Plane: Databricks Open-Sources Omnigent
Source:TechTimes

On the morning Databricks co-founder and CTO Matei Zaharia takes the keynote stage at the Data + AI Summit in San Francisco, his open-source project is already two days old. Zaharia published Omnigent to GitHub on June 13, giving it a weekend of organic community momentum before the world's largest data and AI conference opened today. The project represents a bet that the most consequential frontier in enterprise AI has quietly shifted from building better agents to governing the ones already running.

Omnigent is a meta-harness: a software layer that sits above existing AI coding agents (including Anthropic's Claude Code, OpenAI's Codex, and InflectionAI's Pi) and gives teams a single place to compose, govern, and collaborate on those agents without changing what is underneath them. It ships under the Apache 2.0 license and is in alpha as of this writing.

The release lands in a market that is, by several measures, actively breaking down. Engineering teams deploying multiple AI agents today manage each one in isolation — separate tabs, manual context transfers between Claude Code and Codex, no shared audit trail. Cost spirals have become endemic: Uber burned through its entire 2026 AI budget in four months as roughly 5,000 engineers ran Claude Code sessions with no spending caps, with individual heavy users reaching $500 to $2,000 per month. One unnamed enterprise spent $500 million on a single month of AI usage before finance teams intervened. Gartner forecasts that more than 40% of AI agent projects will be canceled by 2027 due to escalating costs and inadequate risk controls.

What a Harness Is — and Why the Layer Above It Matters

To understand what Databricks has built, the vocabulary needs unpacking. A harness is the scaffolding that wraps a foundation model and turns it into an agent capable of reading files, running terminal commands, and calling external tools. Claude Code, Codex, and Pi are all harnesses in this sense — the tools engineers already have open.

Omnigent sits one layer above. Rather than replacing any of those tools, it treats each harness as an interchangeable component behind a common interface. The architectural insight behind this design is simple but consequential: regardless of how each agent harness communicates with its underlying language model internally, the interface it presents to the outside world is the same — messages and files go in, text streams and tool calls come out. Omnigent standardizes that interface so that harnesses become swappable. An engineer who today switches from Claude Code to Codex must rebuild context, reconfigure permissions, and restart any shared session. With Omnigent, that switch is a one-line change to a YAML configuration file.
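The "messages and files in, text streams and tool calls out" idea can be sketched in a few lines of Python. This is an illustration only, not Omnigent's actual API: the names `Harness`, `HarnessEvent`, `EchoHarness`, `HARNESSES`, and `run_session`, along with the `"harness"` config key, are all assumptions made for this example, and the stand-in harness calls no real model.

```python
from dataclasses import dataclass
from typing import Iterator, Protocol


@dataclass
class HarnessEvent:
    """One item of harness output: streamed text or a tool call."""
    kind: str      # "text" or "tool_call"
    payload: str


class Harness(Protocol):
    """Hypothetical common interface: a message and files in, events out."""

    def run(self, message: str, files: list[str]) -> Iterator[HarnessEvent]: ...


class EchoHarness:
    """Stand-in for a real harness; it only labels its output with a vendor name."""

    def __init__(self, vendor: str) -> None:
        self.vendor = vendor

    def run(self, message: str, files: list[str]) -> Iterator[HarnessEvent]:
        yield HarnessEvent("text", f"[{self.vendor}] working on: {message}")
        for path in files:
            yield HarnessEvent("tool_call", f"read_file({path})")


# Registry keyed by a config value, so swapping harnesses means changing one value.
HARNESSES: dict[str, Harness] = {
    "claude-code": EchoHarness("claude-code"),
    "codex": EchoHarness("codex"),
}


def run_session(config: dict, message: str, files: list[str]) -> list[HarnessEvent]:
    """Look up the configured harness and collect everything it emits."""
    harness = HARNESSES[config["harness"]]
    return list(harness.run(message, files))
```

Because every harness satisfies the same protocol, calling `run_session({"harness": "codex"}, ...)` instead of `run_session({"harness": "claude-code"}, ...)` is the only change the caller makes, which is the kind of swap the article describes.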

Databricks frames this moment as equivalent to the shift that Kubernetes made for server infrastructure: engineers once managed individual processes and machines, then moved to managing entire fleets via an orchestration layer above them. Omnigent proposes the same abstraction step for AI agents — one layer up, so that sessions, policies, and team access travel with the work regardless of which underlying agent is running.

How Omnigent Actually Works

The architecture has two components. A runner wraps any agent — whether a terminal-based coding tool or a higher-level SDK — in a sandboxed session with a uniform API. A server layer sits above the runner to manage policies and session sharing, and exposes every session simultaneously over a terminal interface, a desktop or web application, mobile interfaces, and REST APIs. The entire system installs in one command and operates under two interchangeable CLI names. On first run, it detects whatever model credentials already exist in the environment.

Two reference implementations ship with the repository and illustrate how the architecture composes in practice. Polly is a multi-agent coding orchestrator that writes no code itself: it plans, then delegates work to Claude Code, Codex, or Pi sub-agents running in parallel git worktrees, routing each completed diff to a reviewer from a different vendor than the one that wrote it. Debby is a brainstorming interface with two simultaneous heads — one Claude, one GPT — answering every query in parallel and debating each other on command.
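Polly's cross-vendor review rule is easy to state in code. The sketch below is a guess at the selection logic, not Polly's implementation: `pick_reviewer` and `assign_reviews` are hypothetical names, and the vendor strings are placeholders.

```python
def pick_reviewer(author_vendor: str, vendors: list[str]) -> str:
    """Return the first available vendor that differs from the diff's author."""
    for vendor in vendors:
        if vendor != author_vendor:
            return vendor
    raise ValueError("cross-vendor review needs at least two distinct vendors")


def assign_reviews(authors: dict[str, str], vendors: list[str]) -> dict[str, str]:
    """Map each task name to a reviewer vendor, given a task -> author-vendor map."""
    return {task: pick_reviewer(author, vendors) for task, author in authors.items()}
```

A real orchestrator would presumably also balance load across reviewers. The sketch only enforces the one invariant the article states: no diff is reviewed by the vendor that wrote it.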

Why Prompt-Based Guardrails Are the Wrong Architecture

This is the technical argument at Omnigent's core, and one that most coverage of agent tooling has consistently understated. Every major AI coding agent today enforces its guardrails through prompts: a system message saying "ask before deleting files" or "do not push without approval." Prompt-based constraints are fragile by design. They break when a long session pushes early instructions out of the model's effective context window. They break when a model update changes how the system message is weighted. And they cannot track dynamic state: a prompt instruction cannot know that an agent has just installed an unreviewed npm package and therefore should require human approval before its next git push.

Omnigent moves policy enforcement to the infrastructure layer, where policies are stateful and independent of what any particular model does with a system message. A cost-budget policy can pause a session after every $100 spent and request confirmation before proceeding. A permission policy can intercept any git push that follows a package installation, regardless of whether the model ever registered that it was supposed to ask. These policies are enforced before the action happens, not after.
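What "stateful" means here is easier to see in code than in prose. The following is a minimal sketch under stated assumptions, not Omnigent's policy engine: `PolicyEngine`, `SessionState`, the decision strings, and the naive command parsing are all invented for illustration.

```python
from dataclasses import dataclass


@dataclass
class SessionState:
    spent_usd: float = 0.0
    next_checkpoint_usd: float = 100.0
    unreviewed_install: bool = False


class PolicyEngine:
    """Hypothetical engine that decides before an action runs, using session state."""

    def __init__(self, budget_step_usd: float = 100.0) -> None:
        self.step = budget_step_usd
        self.state = SessionState(next_checkpoint_usd=budget_step_usd)

    def record_cost(self, usd: float) -> str:
        """Add model spend; return "pause" each time a budget checkpoint is crossed."""
        self.state.spent_usd += usd
        if self.state.spent_usd >= self.state.next_checkpoint_usd:
            # Move the checkpoint past the current spend so the next pause
            # only fires after another full budget step.
            while self.state.next_checkpoint_usd <= self.state.spent_usd:
                self.state.next_checkpoint_usd += self.step
            return "pause"
        return "allow"

    def check_command(self, command: str) -> str:
        """Gate a shell command using what has happened earlier in the session."""
        tokens = command.split()
        if tokens[:2] == ["npm", "install"]:
            self.state.unreviewed_install = True  # remembered for later decisions
            return "allow"
        if tokens[:2] == ["git", "push"] and self.state.unreviewed_install:
            return "needs_approval"
        return "allow"

    def approve_install(self) -> None:
        """A human reviewed the installed package; clear the flag."""
        self.state.unreviewed_install = False
```

The same `git push` gets a different answer depending on what preceded it in the session, and that decision lives outside the model's context window, so a long conversation or a model update cannot erase it.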

The OS-level sandbox — named Omnibox and built by Databricks' security team — locks down filesystem access and intercepts outbound network requests at the egress proxy layer using kernel-level enforcement: bubblewrap and seccomp on Linux, Seatbelt on macOS. The practical effect is that an agent running under Omnigent never sees a GitHub security token directly. The token is injected into the proxy only on approved outbound requests, and the agent interacts with an abstracted interface that cannot exfiltrate the credential itself. This is architecturally different from an instruction to the agent not to share its credentials — it is enforcement at the operating-system level.
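The credential-injection pattern can be illustrated without any kernel machinery. This sketch models only the proxy's decision, assuming a hypothetical allowlist and a hypothetical `forward_request` helper. It does not reproduce Omnibox, and it performs no real network I/O.

```python
from urllib.parse import urlsplit

# Hypothetical allowlist; a real deployment would load this from policy.
APPROVED_HOSTS = {"api.github.com"}


def forward_request(url: str, headers: dict[str, str], token: str) -> dict[str, str]:
    """Return the headers the proxy would send upstream, or raise if egress is blocked.

    The agent supplies `url` and `headers` only; `token` is held by the proxy
    and is never passed back to the agent.
    """
    host = urlsplit(url).hostname
    if host not in APPROVED_HOSTS:
        raise PermissionError(f"egress to {host} is not approved")
    # Drop any Authorization header the agent tried to set, then inject the real one.
    outbound = {k: v for k, v in headers.items() if k.lower() != "authorization"}
    outbound["Authorization"] = f"Bearer {token}"
    return outbound
```

Because the token is attached only after the destination check passes, a prompt-injected agent that tries to send data to an unapproved host never has the credential in hand to leak.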

The field has independently arrived at the same conclusion. A Gravitee survey published this year found that only 14.4% of AI agents went live with full security and IT approval, meaning more than 85% were deployed without complete sign-off. Unit 42 researchers documented nine attack scenarios against production agent frameworks, finding that credential theft, tool exploitation, and remote code execution all arise from insecure design patterns at the harness level, not from model failures. Omnigent's enforcement architecture addresses these gaps structurally, rather than through the same prompt layer that attackers can manipulate through prompt injection.

Who Is Running This in Practice

Databricks has deployed coding agents across its engineering team of more than 5,000, and says the patterns it observed there drove the decision to build a meta-harness layer. Three multi-agent patterns, each spanning multiple harnesses, recur at production scale. Legal AI firm Harvey pairs an open-source worker model with a frontier advisor to manage quality and cost without paying frontier prices for every step. Anthropic's own research product operates as a lead agent orchestrating parallel sub-agents. Databricks' Genie product uses different language models for planning, search, and code generation in a single flow. None of these patterns can be implemented within a single harness.

The project was built in collaboration with Neon and is available as an open-source alpha. The Apache 2.0 license removes a common procurement barrier for organizations that require permissive open-source terms before deploying infrastructure-level tooling.

What Omnigent Does Not Do Yet

Omnigent is in alpha. The project ships two reference orchestrator implementations but no production-ready prebuilt agents for specific enterprise workflows. The roadmap lists automatic optimization at the meta-harness level via a technique called GEPA, code-based introspection using approaches the team calls MemEx and Reinforcement Learning from Models, and an Omnigent Server MCP that would allow agents to operate across sessions. None of these are available yet.

How Do I Know Whether My Enterprise Is Ready for a Meta-Harness?

The signal is operational friction, not agent count. If engineers on a team are maintaining separate configuration files for each agent, transferring context by copying text between windows, or discovering cost overruns only when the monthly invoice arrives, those are the conditions Omnigent is designed to address. Teams running a single agent for a single workflow may not need a meta-harness layer yet. Teams running three or more agents on overlapping codebases — or any team that has experienced a runaway cost event — are the target deployment scenario.

The same Gravitee survey found that 82% of executives feel confident their existing policies protect against unauthorized agent actions, while field data shows more than half of deployed agents operate without consistent security oversight or logging. That gap between executive confidence and operational reality is precisely where infrastructure-layer enforcement becomes the defensible answer.


Frequently Asked Questions

What is the difference between an AI agent harness and a meta-harness?

An agent harness wraps a foundation model and turns it into an agent that can read files, run terminal commands, and call tools — Claude Code, Codex, and Pi are all harnesses. A meta-harness sits one layer above and treats each harness as an interchangeable component. Omnigent provides the shared layer where multi-agent composition, governance, and real-time collaboration live, regardless of which underlying agent is running.

Why do prompt-based guardrails fail for multi-agent governance?

Prompt instructions lose effect when long sessions push them out of a model's effective context window, and they cannot track dynamic session state. A prompt saying "ask for approval before pushing code" cannot detect that an unreviewed npm package was just installed and raise the risk level accordingly. Infrastructure-layer policies, like those Omnigent enforces at the runner level, are stateful, context-aware, and independent of what any model does with a system message.

How do enterprises control AI agent costs today — and what does Omnigent change?

Most enterprises currently lack per-session cost visibility, which is why documented incidents include one organization spending $500 million in a single month and Uber burning its 2026 AI budget in four months. Omnigent's cost policies can pause a session after a configurable spending threshold and require human confirmation before proceeding, enforcing budget limits at the infrastructure layer rather than asking individual engineers to self-monitor.

Is Omnigent a replacement for LangChain, CrewAI, or AutoGen?

No. LangChain, CrewAI, and AutoGen are orchestration frameworks that coordinate what agents do within a single execution environment. Omnigent operates at the layer above those frameworks and above individual coding agents, providing governance, credential management, cross-device session access, and team collaboration for whatever harnesses or SDKs are already in use. It is infrastructure that sits above the agent, not a replacement for the agent itself.