
The "Bag More 9s" tote was everywhere at AWS Summit New York last week. The line for it ran four booths long and wrapped through the hall. The same bold design and slogan blanketed the Javits Center, the Vessel, and the Hudson Yards Shops in the form of digital billboards. Most attendees had seen “Bag More 9s” a few times before they reached the booth.
So what does it mean?
Software reliability gets measured in nines. The things that cost you nines are the constant stream of alerts and the complex incidents that bring your software down. At Resolve AI, we're building AI for prod, agents that run and fix your software in production, to help you earn those nines back.
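To make "nines" concrete, here is a minimal Python sketch of the arithmetic: N nines means an availability of 1 - 10^-N, and the remainder is the downtime budget. The function names and the 365-day year are our own assumptions for illustration, not part of any Resolve AI product.

```python
# Downtime budget implied by a given number of nines.
# Assumption: a 365-day year; function names are illustrative only.
MINUTES_PER_YEAR = 365 * 24 * 60


def availability(nines: int) -> float:
    """Availability as a fraction, e.g. 3 nines -> 0.999."""
    return 1 - 10 ** (-nines)


def downtime_minutes_per_year(nines: int) -> float:
    """Minutes per year a service may be down and still hit the target."""
    return MINUTES_PER_YEAR * 10 ** (-nines)


if __name__ == "__main__":
    for n in range(2, 6):
        pct = availability(n) * 100
        budget = downtime_minutes_per_year(n)
        # Show one decimal place per nine beyond the first two.
        print(f"{n} nines ({pct:.{n - 2}f}%): {budget:.2f} minutes/year")
```

Each extra nine cuts the budget by a factor of ten: three nines allows roughly 8.76 hours of downtime a year, while five nines allows only about 5.26 minutes.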
What was most interesting to us was how far the conversations had moved. Last December at re:Invent, we spent most of the show explaining what AI for prod is. This time, the conversation shifted to the practical side. When AI writes more of your code, you ship faster. Production gets bigger and changes more often, but the team and the tools that help run it aren’t growing at the same rate. If agents are writing the code, you need something in production to help you run it. Most of the attendees we met had already decided they needed to run production with agents, and the only open questions were practical ones: how can agents run your software, what guardrails should be in place, and how does the pricing model work?
Pierre Tessier ran a breakout session at the end of the day on Resolve AI’s agents that resolve incidents and run daily production work. The discussion centered on how an agent fixes a real production issue: investigating an alert end to end, getting to an actual root cause, and handling the routine operational toil that eats at an engineer's day.
Much of the conversation was about what it takes for an agentic harness to hold its own in real production environments: choosing the right model for each step, a context graph that knows and remembers how production works, actions within defined guardrails, and a learning system that improves with every interaction. Evals keep it honest as the systems and the models change.
What also surprised our team onsite was how fast the conversation turned to personal agentic workflows. Engineers wanted to know they wouldn't be trading one silo for another: whether they could run Resolve AI inside the tools and agents they already use, keep their own workflows, and decide what an agent is allowed to do on its own.
Resolve AI is exposed as an MCP server, an API, and Skills, so your own agents pull its production context, investigations, and remediation without rebuilding any of it, and you keep control of what touches production. Ask Claude Code about a slow service and it can query Resolve for the real root cause and a suggested fix. More in Build custom agents on Resolve.
The last question came up almost every time someone watched the agent finish a piece of work: “How do you charge for this?” We price on credits, not tokens, and that fact resonated more than almost anything else. Most AI tools pass raw token usage straight through, so your bill tracks how much the model talked rather than what it got done, and you can't predict it month to month. Credits tie cost to the work the agent completes. Attendees weren't sizing up a pilot; they were working out what it costs to run something like Resolve AI every day at scale.
If you put these three questions together, they point in one direction: engineers weren't asking whether AI belongs in production. They were asking what it takes to run it, whether they stay in control of it, and what it costs to keep it running.
If your team is feeling that production burden, see what AI for prod could look like for you. And thanks to everyone who waited in the Bag More 9s line. If you missed the bag, grab yours here.
