Generative AI has already transformed the way we develop software. Code generation tools accelerate development, shorten feedback loops, and remove friction from everyday tasks. Companies like Robinhood, JPMorgan Chase, Walmart, Microsoft, Coinbase, and Google have all gone on public record, citing broad adoption of agents in code development and review.
However, the truth is that coding was never the bottleneck. It represents just 30 percent of engineering time. The harder 70 percent is running that code in production, where complexity, tool silos, knowledge gaps, and the pace of change all collide. You can code faster, but engineering velocity is not improving because teams still spend the majority of their time fighting production issues.
IDC analysis shows developers dedicate far more hours to operational and background work than to writing code, with some studies finding that only about 16 percent of time is spent directly coding¹. The world’s most expensive engineering talent is spending most of its time firefighting, triaging incidents, and wrestling with workflows designed for a different era.
This is the productivity paradox. Code gets faster, production gets harder. Without solving the 70 percent problem, gains from investments in code generation barely make a dent.
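The arithmetic behind this paradox has the same shape as Amdahl’s law: if only the coding share of the work gets faster, the overall gain is capped by everything that does not. The sketch below is an illustration, not a measurement. It takes the 30 percent coding share from the text, treats the speedup factors as hypothetical, and assumes the remaining 70 percent stays exactly as slow as before, which is optimistic if production is in fact getting harder.

```python
def overall_speedup(coding_share: float, coding_speedup: float) -> float:
    """Amdahl's-law style estimate of total engineering speedup.

    coding_share:   fraction of engineering time spent coding (0..1)
    coding_speedup: hypothetical factor by which coding alone gets faster
    """
    if not 0.0 <= coding_share <= 1.0:
        raise ValueError("coding_share must be between 0 and 1")
    if coding_speedup <= 0:
        raise ValueError("coding_speedup must be positive")
    # Time left after the speedup: untouched work plus the shrunken coding work.
    remaining_time = (1.0 - coding_share) + coding_share / coding_speedup
    return 1.0 / remaining_time


CODING_SHARE = 0.30  # from the text: coding is about 30 percent of engineering time

# Hypothetical speedup factors, chosen only for illustration.
for factor in (2, 5, 10):
    gain = overall_speedup(CODING_SHARE, factor)
    print(f"{factor}x faster coding -> {gain:.2f}x overall")

# Upper bound even if coding took no time at all.
ceiling = 1.0 / (1.0 - CODING_SHARE)
print(f"ceiling with instant coding -> {ceiling:.2f}x overall")
```

Under these assumptions, doubling coding speed yields roughly a 1.18x overall gain, and even instantaneous coding cannot exceed about 1.43x. The remaining headroom sits entirely in the 70 percent.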
Today’s production environments are sprawling and noisy. Cloud-native architectures, containerized workloads, and Kubernetes orchestration have created more telemetry, more dependencies, and more moving parts than ever. When something breaks, engineers are pulled into a series of cascading war rooms. Multiple teams are engaged, each contributing experts in specific components of the production system. They bounce between dashboards, logging systems, incident workflows, chat tools, and static runbooks, each with its own query language, data format, and context.
Additionally, production systems are rarely greenfield. They are the product of years of layered builds, legacy migrations, and shifting deployment models. Enterprises typically run a patchwork of on-prem, private cloud, multi-cloud, and SaaS services, each with its own failure modes, operational quirks, and layers of dependencies. This accumulated complexity makes downtime harder to prevent, degradations harder to detect, and remediation slower.
The result is not only costly outages but also frequent downtime and degraded performance. These smaller incidents are far more common than major outages and, while less visible to customers, they drain developer productivity. Every time engineers are pulled into a war room, roadmap work stalls, context switching rises, and incident fatigue sets in. What feels like “just a few hours” of degraded service quickly adds up to thousands of lost developer hours each year.
The business cost of this downtime is enormous. Oxford Economics estimates that downtime and service degradation cost the Global 2000 about $400 billion annually². Other analyses suggest the price of downtime for large organizations can reach $9,000 per minute³. For global enterprises, every wasted second translates into lost revenue, broken trust, and missed opportunities.
Organizations have been employing automation to address these problems for years. Site Reliability Engineering codified best practices. Pipelines made deployments faster. APIs made integration easier. Dashboards made telemetry visible.
But all of these share the same limitation: they scale data and costs, not understanding. Runbooks automate known steps but fail in novel situations. Observability tools surface metrics but still place the cognitive load on engineers to decide what matters. Traditional workflow tools escalate issues but do not address the root cause.
The outcome: more alerts, more dashboards, more logs, and more manual decisions. Ultimately, legacy automation failed to deliver on its promise, amplifying toil rather than eliminating it.
AI has already proven its value in software engineering. The 2025 Stack Overflow Developer Survey found that 84 percent of developers are using or plan to use AI tools, up from 76 percent the year before⁴. Adoption is widespread, but trust is uneven. Engineers will not hand over production operations to AI unless it is transparent, reliable, and grounded in real systems.
AI SRE changes the equation. Purpose-built AI SRE systems use large language models and multi-agent intelligence to correlate code, infrastructure, and telemetry across logs, metrics, traces, past incidents, and their own memories. Instead of forcing engineers to query tools manually, AI SRE generates real-time narratives of what is happening, pinpoints likely root causes with supporting evidence, and recommends prescriptive remediation steps.
This relieves the heaviest burden on engineers: figuring out what went wrong, why it broke, and how to fix it. AI SRE does not replace engineers. It gives them the same leverage AI brought to coding, but applied to the complexity of production systems. Instead of scaling data, it scales understanding at machine scale.
Several converging forces make AI SRE urgent today, not a year from now:
The first wave of AI delivered coding assistants. The second wave is delivering AI SRE: from scaling code to scaling reliability.
At Resolve AI, we believe AI SRE is not about chatting with logs, metrics, or dashboards. It involves embedding intelligent agents into the core of production workflows, which requires both deep domain and AI expertise to approach the problem holistically.
An AI SRE must be built with these core capabilities:
Because it captures and codifies knowledge across systems, AI SRE also shortens onboarding time for new engineers, reduces the ad-hoc ‘shoulder taps’ that consume peacetime hours, and automates large parts of postmortem creation. This means faster ramp, fewer interruptions, and less fatigue for teams already stretched thin.
AI SRE is not an experiment. It is already running in production at some of the world's largest organizations, delivering measurable improvements in mean time to resolution, reducing downtime costs, and empowering engineers to run their production systems more efficiently with complete system context at their disposal.
Code generation solved the easy part. The hard part is running software reliably in production, where downtime, degradations, and outages cost millions, incident fatigue is rising, and engineers are overwhelmed by war rooms and workflows.
AI SRE is how the world’s largest organizations reclaim engineering time, improve resilience, and turn site reliability engineering into a competitive advantage. For leaders, the value is measurable: fewer teams pulled into incidents, fewer people required to respond, shorter MTTR, and reduced downtime costs. These are the levers that determine whether engineering velocity improves or stalls.
The question is no longer whether you need an AI SRE, but whether you will build or buy. That is where your evaluation must begin.
Ben Jaderstrom
VP of Worldwide Sales
@ Resolve AI
I’ve spent the last decade helping build and scale high-growth software companies. At Grafana, I was part of a journey that grew the business 40× and helped redefine modern observability. Most recently, at Windsurf, I led GTM efforts as we merged missions with Cognition, the team behind Devin. Now at Resolve AI, I’m focused on building a world-class GTM organization to help the world’s most strategic customers ship reliable software in the AI-native era.