Database Concurrency Control: MVCC, Locking, and Deadlocks



Modern databases must handle thousands of concurrent transactions reading and writing the same rows. Concurrency control mechanisms ensure correctness while maximizing throughput. PostgreSQL's approach is built on Multi-Version Concurrency Control (MVCC).

Multi-Version Concurrency Control (MVCC)

MVCC is the foundation of concurrency in PostgreSQL and Oracle. Instead of locking rows for readers, MVCC preserves multiple versions of each row. Each transaction sees the data as of the moment its snapshot was taken: at each statement under `READ COMMITTED`, or at the first statement under `REPEATABLE READ`.

Every row in PostgreSQL carries two hidden system columns: `xmin` (the transaction ID that created this version) and `xmax` (the transaction ID that deleted or updated this version). When a transaction reads a row, it checks visibility rules:




```sql
-- A row version is visible when:
--   xmin is committed, and was not still in progress in our snapshot
--   xmax is unset, aborted, or not yet committed in our snapshot
```





This design means **readers never block writers, and writers never block readers**, one of the most valuable properties for OLTP workloads.
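A rough sketch of these visibility rules in Python may help. This is a deliberate simplification (real PostgreSQL visibility also consults the snapshot's in-progress list, hint bits, and subtransactions), and the `Snapshot` shape below is an assumption of the sketch, not the server's internal structure:

```python
from dataclasses import dataclass
from typing import Optional

@dataclass(frozen=True)
class Snapshot:
    """Simplified snapshot: the set of XIDs already committed when it was taken."""
    committed: frozenset

def row_is_visible(xmin: int, xmax: Optional[int], snap: Snapshot) -> bool:
    # The creating transaction must have committed before our snapshot...
    if xmin not in snap.committed:
        return False
    # ...and the deleting transaction (if any) must NOT have committed yet.
    if xmax is not None and xmax in snap.committed:
        return False
    return True

# Transaction 100 created the row; transaction 200 later deleted it.
print(row_is_visible(100, 200, Snapshot(frozenset({100}))))       # True
print(row_is_visible(100, 200, Snapshot(frozenset({100, 200}))))  # False
```

A snapshot taken before transaction 200 committed still sees the row; one taken after does not, which is exactly how a delete becomes visible without ever blocking a reader.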

Transaction Snapshots

A transaction's snapshot captures which transactions were in-progress at the moment the snapshot was taken. The `REPEATABLE READ` isolation level uses snapshot semantics:




```sql
BEGIN ISOLATION LEVEL REPEATABLE READ;
SELECT * FROM accounts WHERE id = 1;  -- sees snapshot at this moment
-- Another transaction updates and commits account 1
SELECT * FROM accounts WHERE id = 1;  -- still sees the original snapshot
COMMIT;
```
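The same behavior can be simulated with a toy multi-version store. `MVStore` and its methods are inventions of this sketch, not PostgreSQL APIs; a snapshot here is just the highest committed transaction ID at the moment it was taken:

```python
class MVStore:
    """Toy MVCC store: each key maps to a list of (commit_xid, value) versions."""

    def __init__(self):
        self.versions = {}  # key -> list of (commit_xid, value)
        self.next_xid = 1

    def write(self, key, value):
        # Commit a new version under a fresh transaction ID.
        xid = self.next_xid
        self.next_xid += 1
        self.versions.setdefault(key, []).append((xid, value))

    def snapshot(self):
        # A snapshot remembers the highest XID committed so far.
        return self.next_xid - 1

    def read(self, key, snap_xid):
        # Return the newest version committed at or before the snapshot.
        for xid, value in reversed(self.versions.get(key, [])):
            if xid <= snap_xid:
                return value
        return None

store = MVStore()
store.write("account:1", 100)         # committed balance = 100
snap = store.snapshot()               # REPEATABLE READ takes its snapshot here
store.write("account:1", 50)          # another transaction commits an update
print(store.read("account:1", snap))  # 100 -- the original snapshot, unchanged
```

The second `write` never disturbs the first reader: it simply appends a newer version that the old snapshot is not entitled to see.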





Optimistic vs Pessimistic Locking

Optimistic Locking

Optimistic locking assumes conflicts are rare. The application reads a row, performs work, and checks that the row has not changed before writing. It is typically implemented with a version column:




```sql
CREATE TABLE inventory (
    id       INTEGER PRIMARY KEY,
    quantity INTEGER,
    version  INTEGER DEFAULT 1
);

-- Application reads: SELECT quantity, version FROM inventory WHERE id = 42;
-- Application computes the new quantity, then writes conditionally:
UPDATE inventory
SET quantity = 5, version = version + 1
WHERE id = 42 AND version = 3;  -- version from the read

-- If 0 rows were updated, another transaction changed the row → retry
```





Optimistic locking works well when contention is low and transactions are short. It avoids holding database locks between application operations.
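The read-compute-conditional-write cycle is usually wrapped in a retry loop. A sketch with an in-memory dict standing in for the `inventory` table (all names here are illustrative, and the lock merely simulates the atomicity a real `UPDATE` statement provides):

```python
import threading

# In-memory stand-in for the inventory row: {id: (quantity, version)}
table = {42: (10, 1)}
table_lock = threading.Lock()  # simulates the database's atomic UPDATE

def conditional_update(row_id, new_qty, expected_version):
    """Like: UPDATE ... SET quantity = %s, version = version + 1
       WHERE id = %s AND version = %s  -> returns rows affected."""
    with table_lock:
        qty, version = table[row_id]
        if version != expected_version:
            return 0  # someone else changed the row first
        table[row_id] = (new_qty, version + 1)
        return 1

def decrement_with_retry(row_id, amount, max_attempts=5):
    for _ in range(max_attempts):
        qty, version = table[row_id]  # read current state and version
        if conditional_update(row_id, qty - amount, version):
            return True  # write succeeded against the version we read
    return False  # contention exceeded the retry budget

print(decrement_with_retry(42, 3))  # True; row is now quantity 7, version 2
```

The retry budget matters: under heavy contention an optimistic loop can spin, which is exactly the regime where pessimistic locking (below) becomes the better trade.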

Pessimistic Locking

Pessimistic locking assumes conflicts are likely and acquires locks proactively:




```sql
BEGIN;
SELECT * FROM inventory WHERE id = 42 FOR UPDATE;
-- Hold the lock, do application work, then write
UPDATE inventory SET quantity = 5 WHERE id = 42;
COMMIT;  -- lock released
```





PostgreSQL offers four row-level lock modes:


* `FOR UPDATE`: Prevents other transactions from locking, updating, or deleting the row.

* `FOR NO KEY UPDATE`: Weaker variant signalling that key columns will not change; it does not block `FOR KEY SHARE` lockers (such as foreign-key checks).

* `FOR SHARE`: Allows concurrent reads and shared locks but prevents updates and deletes.

* `FOR KEY SHARE`: Blocks only `DELETE` and updates that change key columns.





PostgreSQL also provides advisory locks, which lock application-defined integer keys rather than rows:

```sql
-- Advisory locks for application-level concurrency
SELECT pg_advisory_lock(12345);
-- critical section
SELECT pg_advisory_unlock(12345);
```





Deadlock Detection

A deadlock occurs when two or more transactions each hold locks the other needs:




```
Transaction A: locks row 1, waits for row 2
Transaction B: locks row 2, waits for row 1
```





PostgreSQL's deadlock detector is not a background poller: it runs in a backend whose lock wait has exceeded `deadlock_timeout` (default 1 second). When it finds a cycle in the waits-for graph, it aborts one of the transactions:




```
ERROR:  deadlock detected
DETAIL:  Process 123 waits for ShareLock on transaction 1002; blocked by process 456.
         Process 456 waits for ShareLock on transaction 1001; blocked by process 123.
HINT:  See server log for query details.
```





To minimize deadlocks:


1. **Access tables and rows in a consistent order** across all transactions.

2. **Keep transactions short** to reduce the window for conflicts.

3. **Use `NOWAIT` or `SKIP LOCKED`** to fail fast instead of waiting:




```sql
SELECT * FROM inventory WHERE id = 42 FOR UPDATE NOWAIT;
SELECT * FROM jobs ORDER BY priority LIMIT 1 FOR UPDATE SKIP LOCKED;
```
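The consistent-ordering rule can be demonstrated with two threads that would classically deadlock, except that each sorts the row ids before locking (Python `threading.Lock`s stand in for row locks; this is a sketch of the principle, not database code):

```python
import threading

row_locks = {1: threading.Lock(), 2: threading.Lock()}

def transfer(row_a, row_b, log):
    # Deadlock avoidance: always acquire row locks in ascending id order,
    # regardless of the order the caller named them in.
    first, second = sorted((row_a, row_b))
    with row_locks[first]:
        with row_locks[second]:
            log.append((row_a, row_b))  # "do the transfer" under both locks

log = []
# Transaction A wants (1, 2); transaction B wants (2, 1) -- the classic cycle.
t1 = threading.Thread(target=transfer, args=(1, 2, log))
t2 = threading.Thread(target=transfer, args=(2, 1, log))
t1.start(); t2.start()
t1.join(); t2.join()
print(len(log))  # 2 -- both complete; no deadlock
```

Remove the `sorted()` and the two threads can each grab their first lock and wait forever on the other's, which is precisely the cycle the detector would have to break by aborting one of them.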





Lock Monitoring

The `pg_locks` view shows current lock state:




```sql
SELECT locktype, relation::regclass, mode, granted,
       pid, virtualtransaction
FROM pg_locks
WHERE NOT granted;
```





This query reveals which backends are waiting for locks and who holds them. The `pg_blocking_pids()` function helps identify blockers:




```sql
SELECT pid, pg_blocking_pids(pid) AS blocked_by,
       state, query
FROM pg_stat_activity
WHERE state = 'active';
```





MVCC Bloat

MVCC's strength creates a weakness: dead row versions accumulate. An old version cannot be reclaimed until no active snapshot can still see it. This accumulation is called bloat.




```sql
-- Query to estimate table bloat
SELECT schemaname, relname,
       n_dead_tup, n_live_tup,
       ROUND(n_dead_tup * 100.0 / NULLIF(n_live_tup + n_dead_tup, 0), 2) AS dead_pct
FROM pg_stat_user_tables
ORDER BY dead_pct DESC;
```





`VACUUM` reclaims space from dead tuples. `VACUUM FREEZE` prevents transaction ID wraparound, an operational imperative in PostgreSQL:




```sql
VACUUM ANALYZE accounts;

-- For aggressive cleanup:
VACUUM (VERBOSE, INDEX_CLEANUP ON) accounts;
```
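What `VACUUM` reclaims can be sketched in miniature: a version is dead once its deleting XID precedes every active snapshot. This is a simplification (real vacuuming also cleans indexes, updates the free space map, and freezes tuples), and the data shapes here are assumptions of the sketch:

```python
def vacuum(versions, active_snapshots):
    """Drop row versions whose delete (xmax) precedes every active snapshot.

    versions:          list of dicts with 'xmin' and 'xmax' (None if live)
    active_snapshots:  XIDs marking the snapshot of each open transaction
    """
    # The oldest snapshot is the horizon: anything it might see must stay.
    horizon = min(active_snapshots) if active_snapshots else float("inf")
    kept = [v for v in versions if v["xmax"] is None or v["xmax"] >= horizon]
    return kept, len(versions) - len(kept)

rows = [
    {"xmin": 10, "xmax": 20},    # deleted long ago -> dead to everyone
    {"xmin": 20, "xmax": 90},    # deleted, but an old snapshot may still see it
    {"xmin": 30, "xmax": None},  # live version
]
kept, reclaimed = vacuum(rows, active_snapshots=[85])
print(reclaimed)  # 1 -- only the version dead to every snapshot is removed
```

This is also why a single long-running transaction inflates bloat everywhere: it pins the horizon, so versions deleted after it began cannot be reclaimed until it ends.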





Summary

PostgreSQL's MVCC provides the foundation: readers and writers coexist without blocking. On top of MVCC, row-level locks enable pessimistic patterns when needed. The key is choosing the right strategy for each operation and monitoring for deadlocks, long-held locks, and MVCC bloat. Proper concurrency control is what separates a smoothly scaling application from one that stalls under load.