Our Vision for AI for Prod

05/18/2026

In March, we introduced AI for prod: AI agents for production systems that work across code, infrastructure, telemetry, and tribal knowledge to debug, remediate, and prevent issues that consume most of an engineering team's time today. Our goal is self-driving production (autonomously operating end-to-end with humans on the loop, not in the critical path) and production fluency democratized across every engineer, regardless of tenure, who is otherwise bottlenecked by gaps in context, expertise, or tools. We wrote about why production has always been the harder half of software engineering, and why AI had finally crossed the threshold where automating it end-to-end was possible. The world has changed dramatically since then, but this argument has only become more true.

In the past four months, the acceleration of AI-generated software has arrived as a reality that engineering leaders are already navigating. Andrej Karpathy hasn't written a line of code since December, directing fleets of agents instead.¹ Spotify's best developers haven't either, merging hundreds of AI-generated pull requests a month through an internal system built on Claude Code.² StrongDM, a security infrastructure company, runs a three-engineer team where no human writes or reviews code.³ One engineering org we work with ran an internal survey and found that 93% of their code was already being written by agents. At Resolve AI, we are no different: over 95% of the code we ship is generated by agents. This shift is an evolution of operating models that will expand in scope and depth over the coming months and years.

Specs go in, software comes out, and agents handle everything in between: the Dark Factory, as some have called it.⁴ The question has shifted from whether software factories will become commonplace to how fast. And yet the software factory solves only half the problem. Code ships at a 10x multiplier, and a human still has to run it.

This is the gap AI for prod exists to close. Every advancement in code generation makes production more difficult to manage: 10x code velocity yields roughly 10x growth in production complexity. More changes ship per day, more services need coordinating, and there is more surface area to monitor, while the same engineers stay on call. Without AI managing production, complexity quickly outpaces what any one person or team can track, and production drifts into a state of degrading reliability that no individual mind can fully hold. The leverage from agents that write software is real, but it does not compound until agents can run and improve production autonomously, with humans off the critical path. Without that, the factory breaks down at the moment of delivery: the bottleneck simply moves from authoring code to running it, and teams that lean on coding agents without an equivalent system for production end up producing, ever faster, the work that overwhelms them.

Where this is going

Software engineering will continue to move along the arc the software factory points to: fleets of agents authoring, reviewing, and shipping code, with engineers increasingly in oversight roles. The copilot era is ending. Long-horizon, increasingly autonomous agents are already doing complex work within codebases that used to require a human in the loop, and that arc naturally bends toward production. The next phase reaches across the full lifecycle of the factory with agents that:

bring production insights into planning and authoring, helping coding agents produce code that is production-aware.
collaborate with coding agents on the operational scaffolding around every change: telemetry instrumentation, dashboards, alerting, and rollout plans informed by the current state of production and the specifics of what is being shipped.
handhold releases as they roll out, acting in real time to prevent a bad rollout from causing damage.
respond to pages automatically, mitigating a large fraction of operational issues on their own without waking anyone up.
proactively triage complex incidents as they occur, pull in the right experts, and collaborate with them to offload the heavy lifting.
continuously optimize production against configuration drift, suboptimal resource usage, and operational and security vulnerabilities.

The value increases with every workflow they take on. Mean-time-to-resolution (MTTR) drops from the hours engineers need to synthesize signals down to minutes. Pages are reduced as the system catches their precursors. Engineering capacity, the most valuable and expensive resource any company has, is redirected from operating software to building it. As these agents prove out, trust in them compounds, and so does the scope of work that can be safely delegated.

The advantage becomes durable because each new model generation forces the harness, evals, training data, and orchestration around it to evolve as well. Companies investing early are building the system that will absorb the upcoming release.

What an effective AI for prod system has to be

AI for prod is the layer that closes the long-broken feedback loop between writing software and running it. It introduces agents that understand, investigate, and increasingly run the environments that code ships into, collaborating directly with the coding agents that produce that code. Together, they enable complete automation of the SDLC and of production.

The following four properties of an AI for prod system describe what such a layer has to be.

It owns the workflow, not the task. Most AI tools in production today are task-scoped: you give them a question, and they return an answer, but you and your teams still hold the investigation together, deciding which tools to query, synthesizing evidence, executing the fix, and verifying the result. An AI for prod system owns the workflow end-to-end: investigation, decision, action, verification, and record, all under guardrails the engineering team defines. That means stateful orchestration that plans across multiple steps, branches across competing hypotheses, recovers from tool failures mid-investigation without losing context, and hands off cleanly when human judgment is required. Engineers stop being participants in every investigation and become managers of the system that runs them.
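As a rough illustration of what owning the workflow could look like, here is a minimal Python sketch of a stateful loop over the five stages named above. Every name in it (`Workflow`, `allowed_actions`, `max_retries`, the step functions) is hypothetical and invented for this example; it is not a description of any real product's API. It only shows context carried across stages, retries after a tool failure, a team-defined guardrail, and a handoff to a human.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Set

# Hypothetical names throughout: an illustration, not a real product API.
STAGES = ["investigate", "decide", "act", "verify", "record"]


@dataclass
class Workflow:
    allowed_actions: Set[str]          # guardrail defined by the engineering team
    max_retries: int = 2               # retries allowed per stage after a tool failure
    context: Dict[str, str] = field(default_factory=dict)  # state kept across stages
    log: List[str] = field(default_factory=list)

    def run(self, steps: Dict[str, Callable[[dict], str]]) -> str:
        for stage in STAGES:
            for _ in range(self.max_retries + 1):
                try:
                    result = steps[stage](self.context)
                    break
                except RuntimeError as err:
                    # Tool failure: note it and retry; earlier context is untouched.
                    self.log.append(f"{stage}: retry after {err}")
            else:
                # Retries exhausted: hand off to a human with the context so far.
                self.log.append(f"{stage}: handoff")
                return "handed_off"
            self.context[stage] = result
            self.log.append(f"{stage}: {result}")
            if stage == "decide" and result not in self.allowed_actions:
                # The chosen action is outside the guardrails: stop before acting.
                self.log.append(f"decide: handoff, {result} not allowed")
                return "handed_off"
        return "resolved"


calls = {"n": 0}


def flaky_investigate(ctx: dict) -> str:
    """Fails on the first call to simulate a tool timeout."""
    calls["n"] += 1
    if calls["n"] == 1:
        raise RuntimeError("metrics API timeout")
    return "bad deploy suspected"


steps = {
    "investigate": flaky_investigate,
    "decide": lambda ctx: "rollback",
    "act": lambda ctx: "rolled back",
    "verify": lambda ctx: "error rate normal",
    "record": lambda ctx: "timeline saved",
}

wf = Workflow(allowed_actions={"rollback", "restart"})
outcome = wf.run(steps)
```

In this toy run the first investigation call fails, is retried, and the workflow still completes. If `rollback` were removed from `allowed_actions`, the same run would stop after the decision and hand off before acting.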

It runs continuously, not only reactively. A system that activates when paged reduces time-to-response but does not change the operating model. A production environment is not static: services are deployed, configuration drifts, traffic patterns shift, and capacity trends in directions that will not manifest as incidents for days or weeks. An AI for prod system watches continuously, builds a live model of the environment's state, and acts on what it observes without being asked. A system that only activates on an alert is doing a meaningfully smaller job than one that prevents the alert from firing in the first place.
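Two of the slow-moving signals mentioned above, configuration drift and capacity trends, can be made concrete with a small Python sketch. The function names, the config keys, and the straight-line extrapolation are all assumptions chosen for illustration. A real system would use richer state and better forecasting.

```python
from typing import Dict, List, Optional, Tuple


def detect_drift(desired: dict, observed: dict) -> Dict[str, Tuple[object, object]]:
    """Return {key: (desired_value, observed_value)} for every mismatch."""
    keys = desired.keys() | observed.keys()
    return {
        k: (desired.get(k), observed.get(k))
        for k in sorted(keys)
        if desired.get(k) != observed.get(k)
    }


def days_until_full(daily_usage: List[float], capacity: float) -> Optional[float]:
    """Straight-line extrapolation from the first to the last daily sample.

    Returns None when there are too few samples or usage is not growing.
    """
    if len(daily_usage) < 2:
        return None
    growth_per_day = (daily_usage[-1] - daily_usage[0]) / (len(daily_usage) - 1)
    if growth_per_day <= 0:
        return None
    return (capacity - daily_usage[-1]) / growth_per_day


# Hypothetical service config: what was declared versus what is running.
drift = detect_drift(
    {"replicas": 3, "timeout_s": 30},
    {"replicas": 3, "timeout_s": 5, "debug": True},
)

# Hypothetical disk usage in GB over four days, on a 100 GB volume.
runway = days_until_full([50, 55, 60, 65], capacity=100)
```

Neither result would trigger a page today: the drifted timeout and the roughly week-long disk runway are the kind of precursors a continuously running system could act on before an alert fires.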

It learns your environment continually, across team and operational boundaries. General-purpose models arrive knowing nothing about the uniqueness of your production systems: not which services depend on which, not what your runbooks say, not what your on-call rotation learned from last month's SEV0. The status quo for this knowledge is partial, fragmented, and out of sync with current code and production realities. An effective AI for prod system continuously builds and maintains a world model of the specific environment it runs, with dependencies, deploys, change history, and the unstructured context that lives in past investigations, postmortems, and senior-engineer corrections, and keeps that model up to date as the underlying system evolves. Crucially, the learning happens at the team level, not the individual level. When the on-caller corrects a hypothesis at 2 AM, the next person picking up the rotation inherits the correction. On-call handoffs stop being a context tax. The system in month twelve has accumulated a year of your team's incidents, corrections, and patterns.

It operates across the team, not in isolation. Production is, by nature, a multi-user, multi-agent, team-level activity. An AI for prod system must be able to work with engineers and agents alike. The team responding to an incident is rarely one person on one surface: an on-caller in Slack, a senior engineer reviewing a hypothesis on the web, a service owner pinged on their phone. An AI for prod system has to flow naturally across all of these without losing context as the conversation moves between them. It also coordinates with the other agents in the environment, receiving signals from coding agents like Cursor and Claude Code about what changed and why, and returning operational signals about how those changes behaved in production. A coding agent ships a pull request; a production agent observes it in deployment, catches regressions before anyone is paged, and feeds that signal back. Software will be created and managed by both humans and agents; an AI for prod system must support context sharing and collaboration across multiple surfaces and across human and agent paradigms.
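The two-way exchange between a coding agent and a production agent can be pictured as a pair of messages. The sketch below is only an assumption about what such signals might contain: `ChangeSignal`, `OperationalSignal`, `observe`, the `pr_id` field, and the relative error-rate threshold are all hypothetical and do not correspond to any real agent protocol.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ChangeSignal:
    """Hypothetical message from a coding agent: what changed and why."""
    pr_id: str
    service: str
    summary: str


@dataclass(frozen=True)
class OperationalSignal:
    """Hypothetical reply from a production agent: how the change behaved."""
    pr_id: str
    regression: bool
    detail: str


def observe(
    change: ChangeSignal,
    error_rate_before: float,
    error_rate_after: float,
    tolerance: float = 0.5,
) -> OperationalSignal:
    """Flag a regression when the error rate grows by more than `tolerance` (relative)."""
    limit = error_rate_before * (1 + tolerance)
    regression = error_rate_after > limit
    detail = (
        f"{change.service} error rate "
        f"{error_rate_before:.2%} -> {error_rate_after:.2%}"
    )
    return OperationalSignal(change.pr_id, regression, detail)


change = ChangeSignal("pr-1", "checkout", "switch payment client to async calls")
bad = observe(change, error_rate_before=0.01, error_rate_after=0.04)
ok = observe(change, error_rate_before=0.01, error_rate_after=0.012)
```

Keying both messages on the same `pr_id` is what would let the coding agent tie a production observation back to the specific change that caused it.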

What AI for prod makes possible

Coding agents have already changed how software is written. AI for prod is the essential ingredient that makes the software factory whole. It's what lets software written by agents land in production safely, at the velocity those agents produce, without piling up the downstream bottlenecks that consume engineering teams today. An ideal AI for prod system owns production workflows end-to-end, runs continuously, learns the environment, and coordinates with outside agents, enabling much greater reliability and a highly performant software factory. The closed loop between writing software and running it becomes continuous and self-improving.

Companies that invest in this early will truly unlock the velocity that AI-driven development has been promising: software that scales, production that holds up under it, and engineers freed from operating systems to do the work only humans can do.

¹ - Sequoia Ascent 2026 summary, Andrej Karpathy
² - Spotify says its best developers haven’t written a line of code since December, thanks to AI
³ - Built by Agents, Tested by Agents, Trusted by Whom?
⁴ - The Five Levels: from Spicy Autocomplete to the Dark Factory

Mayank Agarwal

Founder and CTO
