©Resolve.ai - All rights reserved
With vibe coding, turning intent into product has never been easier or faster. But coding is only one part of a software engineer’s job. We cannot move faster until we also accelerate the operational aspects of software engineering, which are getting harder as the volume of AI-generated code grows.
Incidents are stressful and slow developers down. But the real killer of velocity and flow is the constant stream of interruptions and ambiguous questions about production: "Why are pages rendering slowly today?", "Is this performance dip related to the feature flag I just enabled?", "What’s causing the latency to suddenly spike?", "Is some infrastructure problem causing the alert for my service?", or "Which of the code changes that just landed is consuming the error budget?"
Answering these questions requires going dashboard by dashboard, looking through recent changes and trying to visually correlate them with symptoms, digging through outdated docs, diving into unfamiliar code, and shoulder-tapping colleagues. These daily interruptions are the true "death by a thousand cuts" preventing effortless software engineering.
Vibe debugging is the process of using AI agents to investigate any software issue, from understanding code to troubleshooting the daily incidents that disrupt your flow. In a natural language conversation, the agent translates your intent (whether a vague question or a specific hypothesis) into the necessary tool calls, analyzes the resulting data, and delivers a synthesized answer.
Vibe debugging is a new paradigm that collapses the entire investigative loop in software engineering, from forming a hypothesis to validating it with evidence, into a fluid conversation with an AI agent.
In vibe debugging, AI agents act as trusted partners between you and your production system. They use deep production context to perform the tactical, time-consuming tasks of investigation on your behalf. Instead of pursuing one path at a time, AI agents explore all relevant hypotheses in parallel: gathering evidence from code, querying live telemetry, cross-referencing deployment history, and drawing on known issues from past incidents. Without waiting for the next instruction, the agents proactively drive the conversation, offering suggestions, highlighting correlations, and surfacing insights you might not have thought to look for.
Let us look at a real-world example where I was vibe debugging the failed deployment of a service (svc-analysis) using Resolve AI.
And it took me all the way from telemetry, to the underlying issue in code, and to an actionable resolution:
When I asked why the deployment failed, Resolve AI identified the node-canvas error as the root cause and pointed to the file (.github/src/base/node-prod/Dockerfile) and line numbers to add the fix.
The result is that the complex, multi-threaded investigative loop (hypothesis -> evidence -> validation) between a question and an answer is abstracted into a single-threaded conversation with Resolve AI. This radically reduced the time to investigate, and I didn’t need to be an expert in Node.js native dependencies or the specific Docker image structure. Resolve AI acted as my expert for all of it.
Vibe Debugging with agents like Resolve AI makes debugging as natural as asking a question of a very experienced teammate: someone who knows your production systems intimately, helps you reason through possibilities, and gathers relevant evidence without requiring you to have perfect knowledge of the underlying system. And you get all of this without the downside of having to interrupt them.
Vibe debugging introduces a new way of working by fundamentally changing how engineers interact with their systems at each stage of an investigation. Let's explore this process using a real-world example of an investigation with Resolve AI:
Vibe Debugging meets you where you are, whether your starting point is a human question or an automated alert. You can start with a vague observation like "Can you check why the UI has been slow in the last 2 hours" or with a specific hypothesis. In this case, Steven uses an alert from Grafana as a vague starting point. He initiates a Vibe Debugging conversation to challenge the alert's premise and build his own hypothesis:
Resolve AI takes these high-level questions and returns a synthesized analysis, confirming the issue is not widespread and that CPU is normal. This allows Steven to instantly discard the initial theory and form a more specific one based on evidence. This interaction illustrates how Vibe Debugging took a machine-generated alert and, through a natural language conversation, turned it into an evidence-based hypothesis, setting a clear direction for the rest of the investigation.
To answer his single question manually, Iain would have to run a serial investigation: find the deployment history; then read through the commits; then switch to a metrics tool to look for memory spikes; and finally try to correlate the timestamps.
In contrast, Resolve AI investigates multiple hypotheses in parallel.
It examined svc-entity-graph and also checked svc-entity-graph-ingest for correlated network traffic anomalies.
It understood the request, simultaneously querying deployment systems, parsing Git history and feature flag changes, analyzing historical memory metrics, and cross-referencing all the timestamps. Resolve AI collapsed a long sequence of manual, tool-by-tool steps into a single action.
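The parallel pattern described above can be sketched in a few lines. This is only an illustration of concurrent evidence gathering, not Resolve AI's implementation: the `check_*` functions, the `Evidence` type, and the canned results they return are all hypothetical stand-ins for real deployment, feature flag, metrics, and network integrations.

```python
import asyncio
from dataclasses import dataclass


@dataclass
class Evidence:
    hypothesis: str
    supported: bool
    detail: str


# Hypothetical stand-ins for real integrations. Each one sleeps briefly to
# imitate network latency and returns canned, illustrative data.
async def check_recent_deploys(service: str) -> Evidence:
    await asyncio.sleep(0.05)
    return Evidence("recent deploy", True, f"{service}: rollout found in window")


async def check_feature_flags(service: str) -> Evidence:
    await asyncio.sleep(0.05)
    return Evidence("feature flag change", True, f"{service}: flags changed in window")


async def check_memory_trend(service: str) -> Evidence:
    await asyncio.sleep(0.05)
    return Evidence("memory regression", True, f"{service}: memory rose after rollout")


async def check_network_anomalies(service: str) -> Evidence:
    await asyncio.sleep(0.05)
    return Evidence("network anomaly", False, f"{service}: traffic looks normal")


async def investigate(service: str) -> list[Evidence]:
    # gather() runs every check concurrently, so the total wait is roughly
    # the slowest check rather than the sum of all of them. Results come
    # back in the same order the checks were passed in.
    results = await asyncio.gather(
        check_recent_deploys(service),
        check_feature_flags(service),
        check_memory_trend(service),
        check_network_anomalies(service),
    )
    return list(results)


if __name__ == "__main__":
    for evidence in asyncio.run(investigate("svc-entity-graph")):
        status = "supported" if evidence.supported else "ruled out"
        print(f"{evidence.hypothesis}: {status} ({evidence.detail})")
```

A serial version would await each check in turn. The concurrent version keeps the same checks but lets them overlap, which is the difference between the manual workflow and the agent's workflow described here.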
Vibe Debugging abstracts away the complex, multi-tool investigation that engineers are typically forced to do manually. Let's look at the example where Iain asks Resolve AI if recent changes to svc-entity-graph led to increased memory usage.
While Iain asked a single question in plain English, Resolve AI performed a multi-step evidence-gathering process in the background, as shown in its work log:
The screenshot below reveals what was truly abstracted away from the user. Behind the scenes, Resolve AI began by orienting itself, identifying the correct resource (svc-entity-graph as a kube_deployment in prod) and setting the precise time range for the investigation.
From there, it initiated a parallel investigation, automatically querying completely different systems:
It surfaced 1 recent commit.
It reviewed 13 different charts to pinpoint the 2 that were most relevant to a memory issue.
Finally, it confirmed the entire theory by tracking the reversion of that same change in the CI/CD pipeline and linking it to the system's immediate stabilization.
Iain never had to open a single one of these tabs. He was fully insulated from the underlying tools and their specific query languages. Vibe Debugging with Resolve AI acts as the translator, converting his question into dozens of underlying queries against the right systems.
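The correlation step in this investigation, linking a change and its later reversion to a shift in a metric, can be sketched roughly as follows. This is a minimal sketch under stated assumptions: the `shift_around` helper, the event names, and the memory samples are all hypothetical and for illustration only. A real system would pull these from deployment and metrics tools.

```python
from datetime import datetime, timedelta
from statistics import mean


def shift_around(samples, event_time, window):
    """Return mean(after) - mean(before) for one event, or None if either side has no samples."""
    before = [v for t, v in samples if event_time - window <= t < event_time]
    after = [v for t, v in samples if event_time <= t < event_time + window]
    if not before or not after:
        return None
    return mean(after) - mean(before)


# Illustrative data only: one memory sample (MiB) per minute.
start = datetime(2025, 1, 1, 12, 0)
levels = [500] * 10 + [800] * 10 + [510] * 10
samples = [(start + timedelta(minutes=i), v) for i, v in enumerate(levels)]

# Hypothetical change events: a rollout and the later reversion of it.
events = {
    "rollout": start + timedelta(minutes=10),
    "revert": start + timedelta(minutes=20),
}
window = timedelta(minutes=10)

for name, when in events.items():
    print(name, shift_around(samples, when, window))
```

In this toy data, memory jumps after the rollout and drops back after the revert. A jump followed by a matching drop is the kind of paired signal that supports blaming a specific change rather than unrelated load.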
Vibe Debugging requires your operational history to be a queryable resource, so that it can investigate issues beyond what is happening in the present moment. For example, in the investigation below, Steven starts by asking a question that is explicitly comparative and historical:
Resolve AI performs a complex, time-based analysis. As it states in its plan, Resolve AI updates its time range to "cover both the recent period and the timeframe from 3 hours ago so we can compare".
Resolve AI then builds a multi-hour narrative that contrasts the two periods. It identifies that the severe "OOM (out-of-memory) events" were happening earlier but have now ceased, while a more gradual "possible memory leak" persists. This ability to perform a comparative analysis over custom time windows is critical for understanding regressions, chronic issues, and the impact of changes. A traditional chatbot with a static knowledge base would struggle with such analyses.
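A comparison over two custom time windows might look something like the sketch below. It counts OOM events and estimates the memory growth rate in each window, which is enough to separate "crashes that have stopped" from "a slow leak that continues". The `summarize` and `least_squares_slope` helpers, the `WindowSummary` type, and all of the sample data are assumptions made for illustration. Times are expressed as minutes since an arbitrary start.

```python
from dataclasses import dataclass


@dataclass
class WindowSummary:
    oom_events: int
    slope_mib_per_min: float


def least_squares_slope(points):
    """Ordinary least-squares slope of y over x for a list of (x, y) pairs."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    denom = sum((x - mean_x) ** 2 for x, _ in points)
    if denom == 0:
        return 0.0
    return sum((x - mean_x) * (y - mean_y) for x, y in points) / denom


def summarize(memory, oom_times, start, end):
    """Summarize one half-open window [start, end) of minutes."""
    points = [(t, v) for t, v in memory if start <= t < end]
    ooms = sum(1 for t in oom_times if start <= t < end)
    slope = least_squares_slope(points) if len(points) >= 2 else 0.0
    return WindowSummary(ooms, slope)


# Illustrative data only: memory (MiB) sampled every 10 minutes with a steady
# upward drift, plus three OOM events early in the timeline.
memory = [(t, 400 + 0.5 * t) for t in range(0, 240, 10)]
oom_times = [10, 25, 40]

earlier = summarize(memory, oom_times, 0, 60)     # about 3 hours ago
recent = summarize(memory, oom_times, 180, 240)   # the most recent hour

print("earlier:", earlier)
print("recent: ", recent)
```

With this toy data, the earlier window shows OOM events and the recent one shows none, while the positive slope appears in both. That mirrors the narrative above: the severe events ceased, but a gradual increase persists.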
Resolve AI doesn't just return a list of deployments and a chart of memory usage. It synthesizes these disparate data points into a single, actionable narrative. After its initial investigation, Resolve AI presented its findings, culminating in a conclusion:
This conclusion, "The timing and nature of the rollout directly correlates with the memory surge", synthesizes evidence from deployment history, configuration changes, and monitoring data into a single, confident narrative.
This leads to the second act of synthesis: Resolve AI now synthesizes operational knowledge. It provides a detailed, safe procedure for disabling the flags, complete with best practices ("coordinate with your team," "use a low-traffic period") and a pre-planned Rollback Plan.
This is the essence of Vibe Debugging. First, Resolve AI synthesizes data into a clear diagnosis, and then, based on human direction, it synthesizes knowledge into a safe resolution plan. It partners with the engineer through the entire lifecycle of the problem.
Both investigations, Iain's and Steven's, from the complex initial question to the follow-up about how to safely disable the flags, happened as conversations on their phones. This natural language interface is the key that unlocks all the underlying power. It makes a deep, historical, multi-system investigation as simple as having a conversation, empowering any engineer to diagnose issues that would previously have required a select group of domain experts.
Generative AI has created a near-frictionless experience for the first half of the engineering lifecycle: code generation. But this fluid experience with “vibe coding” has also created an imbalance. The faster we are able to build, the more painfully apparent the friction becomes in the second half of the loop: debugging production. What good is writing a feature in five minutes if it takes five hours to diagnose why it's failing in staging?
This is where a holistic Generative AI strategy becomes critical. To achieve 10x velocity, we need a counterbalance with AI that makes the production side as fluid as the code generation side. This is the role of Vibe Debugging.
Systemically, Vibe Debugging provides the safety net to move at the speed of Vibe Coding. With Agentic AI like Resolve AI that can understand context, reason through production systems, and learn from every interaction, the cost and time of every debugging scenario plummet.
When debugging is no longer a slow, isolating, and adversarial process, the fear of breaking things diminishes. The focus shifts from "who caused the problem?" to "how quickly can we understand and solve it together?" The academic "blameless postmortem" evolves into a collaborative investigation. Here’s a quick example of how debugging can be fun as well:
In the screenshot, Resolve AI is acting as a knowledgeable and culturally aware participant. The simple act of delegating questions to AI becomes a safe, impartial move, turning the moment of blame into a moment of collective, dark humor.
The incident lifecycle doesn't end when the problem is solved. The final interaction, the request for a poem for "our savior - justin", is about closing the emotional loop. This is the ultimate answer to "Why Vibe Debug?" You get more than a smart tool; you get a cultural catalyst. You get an environment where debugging evolves into collaborative and sometimes even fun learning sessions.
Resolve AI, founded by the co-creators of OpenTelemetry, is building AI agents for software engineering.
Resolve AI understands your production environments, reasons like your seasoned engineers, and learns from every interaction to give your teams decisive control over on-call incidents with autonomous investigations and clear resolution guidance. Resolve AI also helps you ship quality code faster and improve reliability by revealing hidden system context and operational behaviors.
With Resolve AI, customers like DataStax, Tubi, and Rappi have increased engineering velocity and systems reliability by putting machines on-call for humans and letting engineers just code.
Steven Karis
Founding Engineer
Steven is a founding engineer at Resolve AI. He is focused on building the agentic AI systems that power Resolve's AI Production Engineer. He has previously held engineering roles at Splunk, Uber, and Microsoft.
Varun Krovvidi
Product Marketing Manager
Varun is a product marketer at Resolve AI. As an engineer turned marketer, he is passionate about making complex technology accessible by blending technical fluency with storytelling. Most recently, he was at Google, bringing the story of multi-agent systems and products like the Agent2Agent protocol to market.