Back to clusters

Row locks, relation locks, deadlocks

Where MVCC ends and real locks begin: relation lock modes and their conflicts, row locks through xmax, FOR UPDATE and SKIP LOCKED, deadlocks, idle in transaction, the lock queue under DDL. Here they check whether you understand why a "harmless" ALTER TABLE takes production down.

7 questions · ~30 min read

#relation-lock-modes

intermediateOften

What relation lock modes are there, and which of them conflict?

What to say

A table has eight lock modes from ACCESS SHARE to ACCESS EXCLUSIVE, lined up by strength. A plain `SELECT` takes the weakest, ACCESS SHARE; `INSERT`/`UPDATE`/`DELETE` take ROW EXCLUSIVE; commands like `CREATE INDEX` take SHARE; most `ALTER TABLE`, `DROP`, `TRUNCATE`, and `VACUUM FULL` take the strongest, ACCESS EXCLUSIVE, which conflicts with everything, including a plain SELECT. It is not the commands that conflict but the modes: reads are compatible with each other, while ACCESS EXCLUSIVE is compatible with nothing. An incompatible request gets into the queue.

What they want to hear

A senior should: - understand that locks are taken in modes and the conflict is decided by a compatibility table, not by the command name - know the key pairs: SELECT takes ACCESS SHARE, DML takes ROW EXCLUSIVE, heavy DDL takes ACCESS EXCLUSIVE - explain that ACCESS EXCLUSIVE blocks even readers, which is the danger of DDL on a hot table - be able to inspect current locks and waits through `pg_locks` and `pg_stat_activity`

Pitfalls

  • Thinking a `SELECT` blocks nothing. It holds ACCESS SHARE, and that is enough to block ACCESS EXCLUSIVE
  • Assuming relation locks are about rows. These are locks on the whole table as an object
  • Confusing a mode's strength with a command name. The conflict is decided by the compatibility table of modes

Follow-up

  • ? Why can a long `SELECT` delay an `ALTER TABLE`?
  • ? Which mode does a plain `UPDATE` take?
  • ? How do you tell from `pg_locks` who is waiting on whom?

Depth in knowledge base

tags: locks, relation, ddlbook: postgresql_internals-17.pdf:ch13 relation-level locks

#row-locks-implementation

seniorOften

How are row locks implemented? Where are they stored?

What to say

Row locks do not sit in memory. That would be far too expensive with millions of locked rows. The lock mark is written into the row version itself: `xmax` gets the id of the locking transaction, and bits in `t_infomask` say what kind of lock it is, for delete/update or a "soft" one (FOR SHARE/FOR KEY SHARE). If several transactions hold the row at once, a MultiXact is placed in `xmax`. The only thing in memory is a short-lived lock for the moment of changing the version. So the number of locked rows is unbounded. They are essentially free in terms of memory.

What they want to hear

A senior should: - say the key point: a row lock is stored in the row (xmax plus infomask), not in a shared lock table in memory - explain why: otherwise locking a million rows would eat memory - tie several lockers of one row to a MultiXact - distinguish a row lock from a heavy relation lock. They are different mechanisms

Pitfalls

  • Thinking row locks sit one by one in `pg_locks`. They are not there. The mark is in the row itself
  • Assuming many locked rows are a memory load. By memory they are nearly free
  • Forgetting MultiXact when several transactions hold a row

Follow-up

  • ? Why is a lock on a million rows not visible one by one in `pg_locks`?
  • ? What happens in `xmax` when two transactions lock a row at once?
  • ? How does FOR SHARE differ from FOR UPDATE at the row bit level?

Depth in knowledge base

tags: locks, row, multixactbook: postgresql_internals-17.pdf:ch14 row-level locks

#select-for-update-variants

intermediateOften

FOR UPDATE, FOR SHARE, SKIP LOCKED, NOWAIT: when do you use each?

What to say

`SELECT ... FOR UPDATE` locks the selected rows for change: other such queries wait. `FOR SHARE` is softer. It lets others read with a lock too but not change. `FOR KEY SHARE`/`FOR NO KEY UPDATE` are finer variants that conflict less with each other and let part of an UPDATE proceed in parallel. The behavior modifiers for a busy row: `NOWAIT` fails right away with an error, while `SKIP LOCKED` skips locked rows and takes the next free ones. The pairing `FOR UPDATE SKIP LOCKED` is the canonical way to build a task queue without races.

What they want to hear

A senior should: - distinguish the strength: FOR UPDATE versus FOR SHARE and the finer KEY variants - explain `SKIP LOCKED` as the basis of queues: workers pick different rows without colliding - know `NOWAIT` for the case "better an immediate error than a wait" - warn that `FOR UPDATE` in a long transaction holds the rows and the horizon, which hurts cleanup

Pitfalls

  • Building a task queue without `SKIP LOCKED`. Workers will line up on one row
  • Holding `FOR UPDATE` in a long transaction. The locked rows and the horizon stand for the whole time
  • Confusing `SKIP LOCKED` (skip the busy ones) with `NOWAIT` (fail with an error)

Follow-up

  • ? How do you build a task queue for several workers without races?
  • ? How does `NOWAIT` differ from `SKIP LOCKED` in behavior?
  • ? Why does a long `FOR UPDATE` hurt cleanup?

Depth in knowledge base

tags: locks, for-update, queuebook: postgresql_internals-17.pdf:ch14 row-level locks

#deadlocks

intermediateOften

How does a deadlock arise, and what does PostgreSQL do about it?

What to say

A deadlock is a wait cycle: transaction A holds resource 1 and waits for resource 2, while transaction B holds resource 2 and waits for resource 1. They will not break apart on their own. PostgreSQL does not wait forever: after `deadlock_timeout` (one second by default) the waiting transaction runs a check of the wait graph, finds the cycle, and rolls one transaction back with a `deadlock detected` error, breaking the cycle. The classic cause is two transactions updating the same rows in a different order. It is cured by a consistent lock order and short transactions, not by raising the timeout.

What they want to hear

A senior should: - define a deadlock as a cycle in the wait graph and give the example with reversed update order - know the mechanics: `deadlock_timeout`, building the graph, rolling a victim back with an error - give the prevention: a single order of grabbing rows, short transactions, a retry on the application side - distinguish a deadlock (a wait cycle, broken automatically) from an ordinary long wait (it just waits)

Pitfalls

  • Curing deadlocks by raising `deadlock_timeout`. That only delays detection, it does not remove the cause
  • Not retrying: a deadlock is a normal error, and the application should repeat the transaction
  • Confusing a deadlock with a long wait. The latter resolves itself, the former does not

Follow-up

  • ? Give an example of two transactions guaranteed to deadlock
  • ? What does `deadlock_timeout` do, and why is it not zero?
  • ? How does ordering row updates help avoid deadlocks?

Depth in knowledge base

tags: locks, deadlock, retrybook: postgresql_internals-17.pdf:ch14 row-level locks

#idle-in-transaction

intermediateOften

Why is a transaction in the idle in transaction state dangerous?

What to say

Idle in transaction is an open `BEGIN` that does nothing yet is not finished: the application took a connection, ran a query, and forgot to commit or roll back. While the transaction is alive, it holds its snapshot, and therefore the horizon: vacuum cannot remove dead versions newer than that snapshot, and tables and indexes bloat. If the transaction also managed to lock something, it holds the locks too, and the queue behind it grows. The defense is `idle_in_transaction_session_timeout`, which terminates such sessions, plus discipline in the code: do not leave transactions open.

What they want to hear

A senior should: - explain the double harm: holding the horizon (bloat) and possibly holding locks (a queue) - tie the cause to the application: a connection pool, a forgotten commit, long work between queries inside a transaction - name the defense: `idle_in_transaction_session_timeout` and careful transaction boundaries in code - be able to find such sessions in `pg_stat_activity` by state and the transaction start time

Pitfalls

  • Treating idle in transaction as harmless. It holds the horizon and piles up garbage as badly as a long query
  • Opening a transaction and calling external services inside it. The snapshot hangs for that whole time
  • Not setting `idle_in_transaction_session_timeout` and trusting every client to be disciplined

Follow-up

  • ? How do you find idle in transaction sessions and their age?
  • ? Why does such a session hinder vacuum even if it locks nothing?
  • ? What does `idle_in_transaction_session_timeout` do?

Depth in knowledge base

tags: locks, idle-in-transaction, horizonbook: postgresql_internals-17.pdf:ch6 vacuum · codelibs.ru_postgresql-mistakes-and-how-to-avoid-them.pdf:long transactions

#ddl-lock-queue

seniorOften

Why can a harmless ALTER TABLE take down a loaded database?

What to say

Heavy DDL takes ACCESS EXCLUSIVE, incompatible with everything. If there is a long query on the table, the ALTER gets into the queue behind it and does not run yet. The trouble is that the queued ALTER already blocks everyone who arrives after it: even fast SELECTs line up behind it. One long query plus one ALTER turn into a full table stall. The defense is a `lock_timeout` before the DDL (better to fail than to freeze the queue), migrations in small steps, and techniques that lower the lock strength: `CREATE INDEX CONCURRENTLY`, `ADD COLUMN` without rewriting the table, adding constraints through `NOT VALID` followed by `VALIDATE`.

What they want to hear

A senior should: - explain the queue effect: a waiting ACCESS EXCLUSIVE blocks everyone who arrived after it - name the defense: a `lock_timeout` before the migration, windows, small steps - know the "lighter" forms of DDL: CONCURRENTLY, NOT VALID plus VALIDATE, a cheap ADD COLUMN with DEFAULT since PG 11 - understand the cause is often not the ALTER itself but a long query or an idle in transaction ahead of it

Pitfalls

  • Running `ALTER TABLE` on a hot table without `lock_timeout`. One long query turns it into a full stall
  • Building an index with a plain `CREATE INDEX` in production instead of `CONCURRENTLY`
  • Adding a validated constraint in one step instead of `NOT VALID` plus `VALIDATE`

Follow-up

  • ? Why does a queued `ALTER` block even fast SELECTs?
  • ? Why set a `lock_timeout` before a migration?
  • ? How do you add a validated constraint without locking the table for long?

Depth in knowledge base

tags: locks, ddl, migrationbook: postgresql_internals-17.pdf:ch13 relation-level locks · codelibs.ru_postgresql-mistakes-and-how-to-avoid-them.pdf:migrations

#lwlocks-predicate-locks

seniorRare

How do lightweight locks and predicate locks differ from ordinary ones?

What to say

Heavyweight locks protect objects like tables and rows, are visible in `pg_locks`, can wait in a queue, and take part in deadlock detection. Lightweight locks (LWLocks) protect internal structures in shared memory, buffers, lists, caches; they are short, come in shared/exclusive modes, and do not look for deadlocks. Lower still are spinlocks held for a few instructions. Predicate locks (SIRead) are a special beast at the Serializable level: they block no one and only mark what was read, so SSI can later find dangerous dependencies. In `pg_locks` they appear as rows with mode `SIReadLock`, but unlike ordinary locks they never make anyone wait.

What they want to hear

A senior should: - lay out the hierarchy: heavyweight (objects, queue, deadlocks), LWLock (memory structures), spinlocks (very short) - say that LWLocks do not take part in deadlock detection and are usually observed through wait events - explain predicate locks as read marks for SSI: visible in `pg_locks` but blocking no one - tie LWLock contention to diagnostics: a high share of waits on internal locks signals a bottleneck

Pitfalls

  • Looking for LWLock contention in `pg_locks`. It is seen through wait events, not there
  • Treating predicate locks as ordinary ones. They wait for nothing, they only mark what was read
  • Confusing a spinlock with a heavy lock. A spinlock spins for a few instructions and does not wait in a queue

Follow-up

  • ? Where do you observe contention on lightweight locks?
  • ? Why do predicate locks cause no waits?
  • ? How does a spinlock differ from an LWLock in how long it is held?

Depth in knowledge base

tags: locks, lwlock, predicatebook: postgresql_internals-17.pdf:ch15 miscellaneous locks