
Production systems are more than just their technical components. The context you need to understand them (how they work, why decisions were made, what behaviors to expect) lives in runbooks, design docs, team chats, and sometimes isn't written down at all.
One of the biggest challenges in working with production is that this knowledge is fragmented. The context about your code, infrastructure, and telemetry is rarely in one place. It's scattered across documentation, issue trackers, and institutional memory.
That's why we built Resolve AI as an AI for Production Systems to work across code, infrastructure, and telemetry. With every interaction, Resolve captures knowledge and evolves its understanding of your systems.
Today, we are excited to announce that Resolve AI now supports the Model Context Protocol (MCP) for Atlassian, bringing seamless integration with Jira and Confluence. This isn't just about reading docs. Any tool with MCP can read a Confluence page. The difference is that Resolve AI already understands your code, infrastructure, and telemetry.
By adding Atlassian MCP support, Resolve connects that deep technical context with your team's knowledge. It knows when a specific runbook applies to a specific infrastructure alert, or which architectural decision record explains a confusing pattern in your code. It automatically applies the right knowledge in the right context—something a generic coding assistant simply cannot do.
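Under the hood, MCP standardizes tool invocation as JSON-RPC 2.0 messages with a `tools/call` method. As a rough sketch of what a request to an Atlassian MCP server might look like (the tool name `confluence_search` and its arguments are illustrative assumptions, not the server's actual schema):

```python
import json

def make_tool_call(request_id: int, tool_name: str, arguments: dict) -> str:
    """Build an MCP tools/call request as a JSON-RPC 2.0 message."""
    return json.dumps({
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": arguments},
    })

# Hypothetical search tool exposed by an Atlassian MCP server
msg = make_tool_call(1, "confluence_search", {"query": "PostgreSQL runbook"})
print(msg)
```

The point is that any MCP client can send this message; what Resolve adds is knowing *when* to send it and what to do with the result.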
Here is how giving Atlassian context to Resolve AI improves three core production workflows.
Most work with production systems involves debugging. Not just major Sev1 incidents that impact customers and demand immediate attention, but also day-to-day questions when you notice something unusual: Why is this API slower than normal? Which service is making excessive database calls? What changed before this error started appearing?
Runbooks contain the investigation procedures, but they're static text. You have to find the right runbook, interpret it, and manually execute the steps across multiple tools.
Example: Debugging a PostgreSQL Memory Issue
The Old Way: PostgreSQL runs out of memory. You jump into war-room mode: query Kubernetes for pod memory, grep through application logs, check deployment history, search Confluence for the PostgreSQL runbook, and manually correlate metrics with error patterns. Each step requires switching tools and piecing together context. By the time you connect the dots, the system has been degraded for 30 minutes.
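The manual correlation described above is mechanical work: compare each pod's memory usage against its limit and flag anything over the runbook's threshold. A minimal sketch, assuming a sample of `kubectl top pods` output and illustrative per-pod limits (pod names and limit values are made up for this example):

```python
SAMPLE_TOP_OUTPUT = """\
NAME                      CPU(cores)   MEMORY(bytes)
postgres-0                250m         3800Mi
svc-entity-graph-ingest   900m         7600Mi
api-gateway               120m         512Mi
"""

# Illustrative per-pod memory limits in Mi
MEMORY_LIMITS_MI = {
    "postgres-0": 4096,
    "svc-entity-graph-ingest": 8192,
    "api-gateway": 1024,
}

def pods_over_threshold(top_output: str, threshold: float = 0.85) -> list[str]:
    """Return pods whose memory usage exceeds `threshold` of their limit."""
    flagged = []
    for line in top_output.strip().splitlines()[1:]:  # skip the header row
        name, _cpu, memory = line.split()
        used_mi = int(memory.rstrip("Mi"))
        limit = MEMORY_LIMITS_MI.get(name)
        if limit and used_mi / limit >= threshold:
            flagged.append(name)
    return flagged

print(pods_over_threshold(SAMPLE_TOP_OUTPUT))
# → ['postgres-0', 'svc-entity-graph-ingest']
```

Note that both the database and the ingest service exceed the 85% warning threshold here; deciding which one is the root cause still takes the cross-tool context the next section describes.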
With AI for Production Systems: You ask Resolve: "Why did PostgreSQL run out of memory?" Because Resolve understands the telemetry, it knows exactly what to look for. It uses MCP to retrieve the specific PostgreSQL runbook from Confluence that matches the symptoms, extracting memory thresholds (85% warning, 95% critical) and investigation procedures. With the right knowledge applied to the current context, Resolve executes the investigation autonomously:
- Identifies svc-entity-graph-ingest as the actual culprit, not PostgreSQL
- Pinpoints the storeEvents function in postgres.js failing to clean up after errors
- Recommends a kubectl rollout restart command to remediate

The hardest part of building new features in a brownfield environment (existing systems with legacy code and established patterns) isn't writing the code; it's fitting it into the existing architecture. "How do we handle retries?" "What's the standard for logging?" "Why did we choose this library over that one?"
The answers are often buried in closed Jira tickets, design documents, and architectural decision records (ADRs) in Confluence, which capture the key decisions and context behind your system's evolution.
The Use Case: You are tasked with adding a new payment method. Instead of pinging the Principal Engineer (who is busy), you ask Resolve: "How should I implement the retry logic for the new payment gateway? Do we have a standard pattern?" Resolve understands your current codebase and infrastructure, so it knows what "standard pattern" implies for your system. It uses MCP to search your Confluence engineering guides, design docs, and past Jira tickets related to "payments" and "retries."
With AI for Production Systems:
It responds: "According to the 'Payment Resilience Strategy' design doc in Confluence, we standardized on exponential backoff with jitter for all external payment calls to prevent thundering herd issues. I also found JIRA-5678 where we explicitly decided against using the default retry mechanism due to a bug in version 1.2. Here is the recommended code snippet that aligns with our internal standards and includes the required patch."
It’s like having your most experienced engineer pair programming with you, with instant recall of every decision your team has ever made.
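The pattern the (hypothetical) design doc names, exponential backoff with full jitter, is worth seeing concretely. A minimal sketch, with illustrative delay parameters rather than any team's actual standard:

```python
import random
import time

def retry_with_backoff(call, max_attempts=5, base_delay=0.5, max_delay=8.0):
    """Retry `call` with full-jitter exponential backoff.

    Jitter randomizes each client's wait so callers that failed together
    don't all retry together (the "thundering herd" problem).
    """
    for attempt in range(max_attempts):
        try:
            return call()
        except Exception:
            if attempt == max_attempts - 1:
                raise  # out of attempts; surface the last error
            # Full jitter: sleep a random amount up to the exponential cap
            delay = random.uniform(0, min(max_delay, base_delay * 2 ** attempt))
            time.sleep(delay)
```

A call that fails transiently twice and then succeeds would return on the third attempt; a call that never succeeds raises after `max_attempts`.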
Cutting costs is rarely just about "turning off unused servers." It requires understanding why resources were provisioned in the first place. Is that over-provisioned RDS instance a mistake, or is it there to handle a quarterly spike documented in a Jira ticket from last year?
The Use Case: You ask Resolve: "Identify cost savings in our dev environment." Instead of just looking at CPU metrics, Resolve analyzes your Jira history and Confluence architecture docs. It might find:
- A namespace, dev-load-test, with 50 large pods that were spun up for a ticket closed three weeks ago (JIRA-1234: Q3 Load Testing).

With AI for Production Systems:
It proposes: "I found 50 large-memory pods in the dev-load-test namespace linked to closed ticket JIRA-1234. They haven't processed traffic in 20 days. Deleting the namespace will free up 200 vCPUs and 800GB RAM. Shall I proceed?"
It optimizes based on intent, not just utilization.
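The savings estimate in that proposal is simple arithmetic over pod specs. A sketch, assuming the pod sizes implied by the example above (4 vCPUs and 16 GB RAM per pod, which are assumptions, not measured values):

```python
def reclaimable_capacity(pod_count: int, vcpu_per_pod: float, ram_gb_per_pod: float):
    """Total CPU and memory freed by deleting a set of idle pods."""
    return pod_count * vcpu_per_pod, pod_count * ram_gb_per_pod

# 50 large pods at 4 vCPU / 16 GB each, matching the JIRA-1234 example
cpus, ram = reclaimable_capacity(50, 4, 16)
print(f"{cpus:.0f} vCPUs, {ram:.0f} GB RAM")  # → 200 vCPUs, 800 GB RAM
```

The hard part is not the arithmetic but the intent check that precedes it: confirming the linked ticket is closed and the pods are genuinely idle.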
AI for Production Systems is about understanding and operating all of your production systems and tools. It captures the tribal knowledge of your unique environment and combines the expertise of all your engineers. Resolve works across code, infrastructure, and telemetry to autonomously investigate incidents, optimize costs, and help you ship faster. Companies like Coinbase, Zscaler, and Gametime use Resolve AI to run their production systems. Ready to see it in action? Book a demo to see how Resolve AI can turn your Atlassian knowledge base into an engine for autonomous operations.
