Debugging PostgreSQL deadlock issues
Learn how PostgreSQL deadlocks form, how to read deadlock log output, and how to fix the three most common patterns: row-level lock inversion, multi-table escalation, and SELECT FOR UPDATE conflicts.
A deadlock occurs when two or more transactions each hold a lock the other needs, and neither can proceed. PostgreSQL detects this automatically and resolves it by canceling one of the transactions. You'll see ERROR: deadlock detected, but detection is the symptom, not the diagnosis. Investigating means understanding which transactions conflicted, why they conflicted, and how to keep the conflict from recurring.
What actually happens during a deadlock
Transaction A acquires a lock on row 1, then tries to lock row 2. Transaction B has already locked row 2 and is waiting to lock row 1. Neither can proceed. PostgreSQL's deadlock detector finds the cycle in the wait graph and terminates one transaction with:
ERROR: deadlock detected
DETAIL:
Process 12345 waits for ShareLock on transaction 67890; blocked by process 11111.
Process 11111 waits for ShareLock on transaction 67889; blocked by process 12345.
HINT: See server log for query details.
The canceled transaction receives this error and its work is rolled back. The surviving transaction acquires the lock and continues. From the application's perspective, one request fails with an error that needs to be retried or surfaced to the user.
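At its core, the detector's job is cycle detection over a waits-for graph. The sketch below is a toy model only; PostgreSQL's real detector also accounts for lock modes, lock queues, and possible queue reordering:

```python
def find_cycle(waits_for):
    """Return a cycle in a waits-for graph, or None if there is none.

    waits_for maps a process id to the single process id it is blocked
    on -- a simplified model of the graph PostgreSQL's detector walks.
    """
    for start in waits_for:
        path = []
        node = start
        while node in waits_for:
            if node in path:
                return path[path.index(node):]  # the cycle itself
            path.append(node)
            node = waits_for[node]
    return None

# The two processes from the log output above wait on each other:
print(find_cycle({12345: 11111, 11111: 12345}))  # [12345, 11111]
```

Once a cycle is found, one process in it is chosen as the victim and its transaction is canceled, which is exactly the error shown above.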
Common deadlock patterns
Row-level lock inversion
The most common pattern: two transactions access the same set of rows but in different orders.
Transaction A:
BEGIN;
UPDATE accounts SET balance = balance - 100 WHERE id = 1;
UPDATE accounts SET balance = balance + 100 WHERE id = 2;
COMMIT;
Transaction B (running concurrently):
BEGIN;
UPDATE accounts SET balance = balance - 50 WHERE id = 2;
UPDATE accounts SET balance = balance + 50 WHERE id = 1;
COMMIT;
A locks row 1, B locks row 2, then each waits for what the other holds. The fix is consistent lock ordering: always acquire locks on the lower ID first, or sort the rows before processing. This is a simple rule to state and surprisingly easy to violate in practice when multiple code paths touch the same tables independently.
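One way to enforce the rule is to route every transfer through a single helper that sorts by row id before issuing updates. A minimal sketch (the helper is illustrative, not from any particular driver; the parameter style is psycopg-like):

```python
def transfer_statements(debit_id, credit_id, amount):
    """Build both balance updates so row locks are always acquired in
    ascending id order, whichever direction the money moves."""
    deltas = {debit_id: -amount, credit_id: +amount}
    return [
        ("UPDATE accounts SET balance = balance + %s WHERE id = %s",
         (deltas[row_id], row_id))
        for row_id in sorted(deltas)
    ]

# Transaction A (1 -> 2) and Transaction B (2 -> 1) now both lock id 1 first:
assert [p[1] for _, p in transfer_statements(1, 2, 100)] == [1, 2]
assert [p[1] for _, p in transfer_statements(2, 1, 50)] == [1, 2]
```

Because both concurrent transfers now touch row 1 before row 2, one simply waits for the other instead of deadlocking.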
Multi-table lock escalation
Deadlocks frequently involve more than one table. Transaction A updates orders then inventory. Transaction B updates inventory then orders. Same inversion problem, harder to spot because the locks are on different tables and the code paths are often in different parts of the application.
-- Transaction A (order fulfillment service)
BEGIN;
UPDATE orders SET status = 'confirmed' WHERE id = $1;
UPDATE inventory SET reserved = reserved + 1 WHERE sku = $2;
COMMIT;
-- Transaction B (inventory adjustment service)
BEGIN;
UPDATE inventory SET quantity = quantity - 1 WHERE sku = $2;
UPDATE orders SET inventory_updated = true WHERE id = $1;
COMMIT;
These two code paths may have been written by different teams who didn't coordinate on lock ordering. The solution is the same: consistent table access order. But the diagnosis requires seeing both code paths together.
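When the two updates inside a transaction don't depend on each other, as in the example above, the ordering rule can be expressed as shared code. A sketch, assuming each service can represent its statements as data before executing them (names are illustrative):

```python
# One deterministic rule, shared by every service that writes these tables.
# Alphabetical order is arbitrary but easy to audit.
TABLE_LOCK_ORDER = {"inventory": 0, "orders": 1}

def in_lock_order(statements):
    """Sort (table, sql, params) triples into the canonical table order.
    Only valid when the statements are independent of each other, as the
    two updates in each transaction above are."""
    return sorted(statements, key=lambda s: TABLE_LOCK_ORDER[s[0]])

stmts = [
    ("orders", "UPDATE orders SET status = 'confirmed' WHERE id = %s", (42,)),
    ("inventory", "UPDATE inventory SET reserved = reserved + 1 WHERE sku = %s", ("A1",)),
]
# Both services end up touching inventory first, then orders:
assert [t for t, _, _ in in_lock_order(stmts)] == ["inventory", "orders"]
```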
SELECT FOR UPDATE with joins
SELECT FOR UPDATE acquires row-level exclusive locks at select time. When a query joins multiple tables and locks rows in both, the lock acquisition order depends on the query execution plan, which can vary based on statistics, table size, and planner decisions. Two concurrent instances of the same query can deadlock if they acquire the same locks in different orders based on their execution plans.
SELECT o.id, i.quantity
FROM orders o
JOIN inventory i ON i.sku = o.sku
WHERE o.status = 'pending'
FOR UPDATE;
Making the join order explicit and consistent, or restructuring to lock tables separately in a defined sequence, eliminates the variability.
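One restructuring is to replace the single joined query with a fixed sequence of locking statements: orders first, then the matching inventory rows, always in that order. A sketch against a generic DB-API cursor (the SQL mirrors the example above; the sequence, not the planner, now determines lock order):

```python
LOCK_STEPS = [
    # Step 1: lock the pending orders themselves, in id order.
    "SELECT id, sku FROM orders WHERE status = 'pending' "
    "ORDER BY id FOR UPDATE",
    # Step 2: lock the matching inventory rows, in sku order.
    "SELECT sku, quantity FROM inventory WHERE sku IN "
    "(SELECT sku FROM orders WHERE status = 'pending') "
    "ORDER BY sku FOR UPDATE",
]

def lock_pending_rows(cursor):
    """Execute the locking steps in their fixed sequence; every worker
    acquires orders locks before any inventory locks."""
    results = []
    for stmt in LOCK_STEPS:
        cursor.execute(stmt)
        results.append(cursor.fetchall())
    return results
```

Two workers running this concurrently can still block each other, but they block in the same order, so the wait graph can't form a cycle.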
Structural fixes
Consistent lock ordering is the primary defense against deadlocks. Any code that updates multiple rows or tables in a single transaction should acquire locks in the same order every time: by primary key, by table name, or some other deterministic rule. This requires coordination across services that touch the same data.
Shorter transactions reduce the window in which locks are held and therefore the probability of conflicting with concurrent transactions. A transaction that holds locks for 50ms is far less likely to deadlock than one holding locks for 2 seconds. Long transactions are often a symptom of doing non-database work inside a transaction: making HTTP calls, writing files, or waiting on external responses while the database transaction sits open.
Retry logic in application code is a necessary complement to structural fixes, not a substitute for them. Deadlocks will happen in any system with concurrent writes, and the application needs to handle deadlock detected errors gracefully with exponential backoff and retry, with an appropriate upper bound on retries before surfacing the failure.
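A minimal retry wrapper, assuming the driver exposes the SQLSTATE on the raised exception (psycopg 3 errors carry a sqlstate attribute; other drivers expose the code differently, so adapt the check):

```python
import random
import time

DEADLOCK_SQLSTATE = "40P01"  # PostgreSQL's SQLSTATE for "deadlock detected"

def run_with_retry(txn_fn, max_attempts=5, base_delay=0.05):
    """Run txn_fn, retrying with exponential backoff on deadlock errors.

    txn_fn is expected to open, execute, and commit one full transaction,
    and to raise an exception carrying a `sqlstate` attribute on failure.
    """
    for attempt in range(max_attempts):
        try:
            return txn_fn()
        except Exception as exc:
            if getattr(exc, "sqlstate", None) != DEADLOCK_SQLSTATE:
                raise  # not a deadlock: don't mask real bugs with retries
            if attempt == max_attempts - 1:
                raise  # retry budget exhausted: surface the failure
            # Exponential backoff with jitter, so the colliding
            # transactions don't retry in lockstep and collide again.
            time.sleep(base_delay * (2 ** attempt) * random.uniform(0.5, 1.5))
```

The jitter matters: two transactions that deadlocked once and retry after identical delays are likely to deadlock again.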
How Resolve AI investigates PostgreSQL deadlocks
Deadlock log entries tell you which queries conflicted. Resolving a deadlock structurally means finding every code path that acquires locks on the same tables and verifying they do so in consistent order. In a system where multiple services write to the same tables, that's not a grep problem.
When a deadlock occurs, Resolve AI starts by correlating the database error with production signals across three layers.
In the codebase, it searches for the specific lock acquisition patterns that cause each deadlock class: UPDATE statements across the same tables in different orders, SELECT FOR UPDATE on joined tables where execution plan variability can cause inconsistent lock ordering, and DEFERRABLE constraint definitions that introduce implicit lock acquisition during constraint checks. For the row inversion pattern specifically, it traces every code path that writes to the conflicting tables and maps the order in which locks are acquired across each path.
In logs, it searches for deadlock detected and error code 40P01 events, extracts the lock wait graphs from the DETAIL blocks, and looks for timing correlations between transaction types. A deadlock between order fulfillment and inventory adjustment that appears only during peak hours points toward a concurrency threshold, not just a code pattern.
In traces, it examines database span latency to identify lock contention that precedes deadlock events, maps which service endpoints are active in the window when deadlocks occur, and surfaces transaction retry patterns that indicate the application is encountering deadlocks more broadly than the error logs capture.
The output is a causal chain: which services own the conflicting queries, where in each codebase the inconsistent lock ordering originates, and which fix (consistent ordering, advisory locks, or optimistic locking) applies to the specific pattern involved.