
Summary
• A Node.js service deployment was failing with CrashLoopBackOff across multiple pods, with no clear indication of the root cause from standard Kubernetes diagnostics
• Traditional debugging would require manually analyzing logs across failed pods, understanding Node.js native module dependencies, and tracing through container build processes
• Resolve AI analyzed logs from multiple failed pods, identified missing system dependencies for node-canvas, then traced through the codebase to find the exact Dockerfile location for the fix
• Provided specific infrastructure code changes to resolve the deployment failure, connecting runtime symptoms with build-time solutions
“Asked Resolve AI to check the failed pod logs. It analyzed logs across multiple pods, identified the root cause as missing native dependencies for node-canvas, then when I asked for a fix, it searched our codebase and told me exactly which Dockerfile to edit and what system libraries to add. From mysterious deployment failure to specific infrastructure fix in minutes.”
What was the deployment failure?
The svc-analysis service deployment was consistently failing with pods stuck in CrashLoopBackOff state. Multiple pods were failing to start properly, but the standard Kubernetes diagnostics weren't providing clear insights into the root cause. The failure pattern suggested an application-level issue rather than resource constraints or configuration problems.
This type of deployment failure typically requires extensive manual debugging: examining logs across multiple failed pods, understanding the application's dependencies, correlating error patterns, and tracing the issue back through the container build process. Without deep knowledge of Node.js native modules and containerization, these failures can take hours or days to resolve.
How did Resolve AI trace from telemetry to fix?

Resolve AI immediately began comprehensive log analysis across the failed deployment. Rather than examining a single pod, it systematically searched logs from multiple pods (svc-analysis-66b555c4fb-hf7r7, svc-analysis-66b555c4fb-s4jzt, svc-analysis-6d98b7f978-mfnv9) across the relevant time window, looking for error patterns and startup failures.
The log analysis revealed a consistent pattern: recurring errors loading the node-canvas native module, with failures in node_modules/canvas@3.1.2/node_modules/canvas/lib/bindings.js. This pattern appeared dozens of times across different pods, indicating a systematic issue with native dependency loading that prevented the Node.js application from starting.
Resolve AI synthesized the log patterns into a clear root cause: the container environment was missing system libraries required for the node-canvas native module. Rather than just identifying "canvas errors," it understood that this indicated missing build dependencies like libcairo2-dev, libpango1.0-dev, and related graphics libraries.
The analysis went beyond immediate symptoms to understand the underlying infrastructure cause: the container image lacked the system-level dependencies that node-canvas requires for successful installation and loading in a production environment.

*Code analysis identified the exact Dockerfile location and provided specific system library installation commands.*

When asked for a fix, Resolve AI searched through the codebase to understand the container build process. It identified that svc-analysis uses the node-prod-base image defined in .github/src/base/node-prod/Dockerfile and provided the exact location to add the required system dependencies.

The solution was comprehensive and specific: exactly which system libraries to install (build-essential, libcairo2-dev, libpango1.0-dev, libjpeg-dev, libgif-dev, librsvg2-dev), where to add them in the Dockerfile, and alternative approaches for service-specific vs. base image fixes.
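
As a rough illustration, the base image fix could look something like the fragment below. This is a sketch under assumptions, not the actual change: the contents of .github/src/base/node-prod/Dockerfile are not shown here, and the use of apt-get assumes a Debian- or Ubuntu-based base image, which the package names suggest but the source does not state.

```dockerfile
# Hypothetical addition to .github/src/base/node-prod/Dockerfile.
# Assumes a Debian- or Ubuntu-based base image (apt-get available);
# the FROM line and the rest of the file are not shown in the source.
RUN apt-get update \
    && apt-get install -y --no-install-recommends \
        build-essential \
        libcairo2-dev \
        libpango1.0-dev \
        libjpeg-dev \
        libgif-dev \
        librsvg2-dev \
    && rm -rf /var/lib/apt/lists/*
```

If node-canvas is compiled during dependency installation, a step like this would need to run before that installation happens in the image build. Clearing the apt package lists in the same layer is a common way to keep the image smaller, and is a choice made for this sketch rather than part of the original recommendation.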
What was the impact?
- Connected runtime failures with infrastructure fixes by tracing deployment symptoms back to specific container build requirements
- Eliminated lengthy debugging cycles that would typically require deep knowledge of Node.js native modules, container dependencies, and build processes
- Provided immediately actionable solutions with exact file locations and code changes rather than generic troubleshooting advice
- Prevented future similar issues by identifying the base image fix that ensures all Node.js services have required native dependencies
- Accelerated DevOps troubleshooting by bridging the gap between Kubernetes operations and application development concerns