linuxlab.io
Tutorials▾
  • Linux & networking
    File system, processes, TCP/IP, BGP and OSPF
    →
  • Terraform & IaC
    HCL, state, plan/apply on a LocalStack sandbox
    →
  • Git & GitHub
    Object model, plumbing, branching, GitHub Actions
    →
  • PostgreSQL internals
    Page and tuple, MVCC, vacuum, WAL, the planner and indexes
    →
All tutorials →
PricingAboutSign inCreate account
/
Intro
Lessons
Footer
linuxlab-TutorialsPricingAboutPrivacy & cookies
Copyright © 2026 LinuxLab. All rights reserved.

Vacuum, freeze, wraparound, autovacuum

The flip side of MVCC: dead versions have to be removed, and the transaction counter must not overflow. Vacuum, the transaction horizon, HOT cleanup, freeze and wraparound, autovacuum tuning. This is the production killer: half of the incidents with bloat and server stalls live right here.

8 questions · ~35 min read

Questions

On this page

  1. Why do you need VACUUM, and what exactly does it do?
  2. What is the transaction horizon, and why does a long transaction hinder cleanup?
  3. What is a HOT update, and how is in-page cleanup tied to it?
  4. What are freezing and wraparound? What does overflowing the transaction counter threaten?
  5. When does autovacuum trigger, and which parameters drive its behavior?
  6. VACUUM versus VACUUM FULL: what is the difference, and when do you use each?
  7. What is a MultiXact, and why does it have its own wraparound?
  8. How does in-page cleanup differ from a full vacuum?

#why-vacuum

junior · Often

Why do you need VACUUM, and what exactly does it do?

What to say

Every UPDATE and DELETE leaves behind an old row version. Once no live snapshot can see it, that version is dead, yet it still takes up space in the page. VACUUM walks the table, finds versions invisible to every live snapshot, and frees their slots for reuse. The space stays with the table: the holes inside the pages are filled by new rows. Along the way it updates the free space map (FSM) and the visibility map (VM), reclaims line pointers, and advances freezing. A plain VACUUM runs without blocking reads or writes and, apart from truncating empty pages at the very end of the table, does not return space to the file system.
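
A toy model can make the "space stays with the table" point concrete. The sketch below is plain Python, not PostgreSQL code: `RowVersion`, `Page` and `plain_vacuum` are hypothetical names invented for illustration, the horizon is passed in as a plain integer, every `xmax` is assumed to belong to a committed transaction, and xid wraparound is ignored.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class RowVersion:
    xmin: int                   # xid that created the version
    xmax: Optional[int] = None  # xid that deleted or updated it, None if still live


@dataclass
class Page:
    slots: List[Optional[RowVersion]] = field(default_factory=list)

    def free_slots(self) -> int:
        return sum(1 for s in self.slots if s is None)


def plain_vacuum(pages: List[Page], horizon: int) -> int:
    """Free slots whose version died before the horizon; return how many were freed."""
    freed = 0
    for page in pages:
        for i, version in enumerate(page.slots):
            if version is not None and version.xmax is not None and version.xmax < horizon:
                page.slots[i] = None  # the slot becomes reusable, the page stays in the file
                freed += 1
    return freed


pages = [
    Page([RowVersion(10, 20), RowVersion(20)]),
    Page([RowVersion(11, 30), RowVersion(12, 45)]),
]
pages_before = len(pages)
freed = plain_vacuum(pages, horizon=40)
print(freed, len(pages) == pages_before, [p.free_slots() for p in pages])
```

In this model two versions are freed, the version deleted by xid 45 survives because it is not older than the horizon, and the number of pages does not change: the holes are there for new rows, but the "file" is the same size.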

What they want to hear

A senior should:

  • explain that vacuum reuses space inside the table rather than returning it to the OS (only `VACUUM FULL` does that)
  • tie "what can be removed" to the horizon: versions older than the oldest live snapshot are removed
  • name the side tasks: updating the FSM/VM, freezing, trimming pointers
  • know that a plain vacuum does not block SELECT/UPDATE and takes only a light lock

Pitfalls

  • ✗ Saying "VACUUM frees disk space". A plain vacuum reuses space inside the table, and only `VACUUM FULL` returns disk to the OS
  • ✗ Thinking vacuum removes any dead version. Only those older than the horizon
  • ✗ Assuming vacuum blocks writes. The plain mode runs online

Follow-up

  • ? How does VACUUM differ from VACUUM FULL in its effect on disk?
  • ? Which row versions is vacuum not allowed to remove?
  • ? Why does vacuum touch the FSM and VM?

Depth in knowledge base

  • VACUUM and the removable cutoff
  • Transaction horizon
  • In-page cleanup (prune)
tags: vacuum, maintenance, bloat
book: postgresql_internals-17.pdf:ch6 vacuum

#transaction-horizon

intermediate · Often

What is the transaction horizon, and why does a long transaction hinder cleanup?

What to say

The horizon is the id of the oldest transaction whose snapshot might still be needed. A row version can be removed only if it became dead before the horizon; otherwise some snapshot may still need to see it. Any long transaction (an open `BEGIN`, a session forgotten in `idle in transaction`, a long analytical query at Repeatable Read) holds the horizon in place. While the horizon does not move, vacuum sees the dead versions but is not allowed to remove them. They pile up, the table bloats, and the indexes swell. Rogov calls this an event horizon: beyond it everything is already committed, and cleanup is safe.
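
The effect of one old session on the horizon can be sketched in a few lines. This is a simplified model, not server code: `horizon` and `removable` are hypothetical helpers, the list of xmins stands in for the `backend_xmin` values of open sessions, and wraparound, replication slots and prepared transactions are ignored.

```python
from typing import Iterable, Optional


def horizon(next_xid: int, backend_xmins: Iterable[Optional[int]]) -> int:
    """Oldest xmin any session still holds; with no snapshots, the next xid to be assigned."""
    held = [x for x in backend_xmins if x is not None]
    return min(held, default=next_xid)


def removable(dead_since_xid: int, hz: int) -> bool:
    """A dead version can go only if the xid that killed it is older than the horizon."""
    return dead_since_xid < hz


# Three sessions: one idle in transaction since xid 1000, one recent, one without a snapshot.
hz_stuck = horizon(next_xid=5000, backend_xmins=[1000, 4990, None])

# After the old session ends, the horizon jumps forward.
hz_moved = horizon(next_xid=5000, backend_xmins=[4990, None])

print(hz_stuck, removable(3000, hz_stuck))
print(hz_moved, removable(3000, hz_moved))
```

A version killed by xid 3000 is dead in both cases, but it becomes removable only once the session pinned at xid 1000 is gone. No vacuum setting appears anywhere in the model, which is the point: the fix is ending the long transaction.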

What they want to hear

A senior should:

  • define the horizon as the oldest needed snapshot and tie it to vacuum's right to remove versions
  • list what holds the horizon: long transactions, idle in transaction, old snapshots on a replica with feedback
  • show the diagnostics: `backend_xmin` in `pg_stat_activity`, the age of the oldest transaction, a rising `n_dead_tup`
  • understand this is fixed not by tuning vacuum but by ending the long transaction

Pitfalls

  • ✗ Tweaking autovacuum parameters when the real cause is an unclosed long transaction holding the horizon
  • ✗ Confusing "there are many dead versions" with "they cannot be removed". Vacuum sees them, but the horizon will not let it
  • ✗ Forgetting about replicas: a snapshot on a standby with `hot_standby_feedback` also shifts the horizon on the primary

Follow-up

  • ? How do you find the transaction holding the horizon through `pg_stat_activity`?
  • ? Why is idle in transaction more dangerous than an active long query?
  • ? How can a replica stall cleanup on the primary?

Depth in knowledge base

  • Transaction horizon
  • VACUUM and the removable cutoff
  • Data snapshot
tags: vacuum, horizon, bloat
book: postgresql_internals-17.pdf:ch6 vacuum

#hot-updates-cleanup

intermediate · Often

What is a HOT update, and how is in-page cleanup tied to it?

What to say

HOT (heap-only tuple) is an update where the new row version stays in the same page and no index entries are created for it. Two conditions must hold: no indexed column changed, and the page has room for the new version. The old version points to the new one through `t_ctid`, forming a HOT chain; once the dead head of the chain is pruned, its line pointer becomes a redirect to the live version. When the page is running out of room, in-page cleanup (HOT pruning) kicks in: it collapses chains of dead versions and frees space without running a full vacuum over the whole table. This happens during ordinary queries that touch the page.
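
The two HOT conditions, and the way `fillfactor` helps with the second one, can be written as a small predicate. This is an illustrative sketch only: `hot_eligible` and `reserved_bytes` are hypothetical helpers, page header and line pointer overhead are ignored, and so are refinements such as summarizing (BRIN) indexes.

```python
from typing import Set


def hot_eligible(changed_cols: Set[str], indexed_cols: Set[str],
                 page_free_bytes: int, new_tuple_bytes: int) -> bool:
    """Both conditions must hold: no indexed column changed, and the new version fits."""
    no_indexed_change = changed_cols.isdisjoint(indexed_cols)
    fits_in_page = new_tuple_bytes <= page_free_bytes
    return no_indexed_change and fits_in_page


def reserved_bytes(page_size: int, fillfactor: int) -> int:
    """Bytes that inserts leave free on a page for later updates."""
    return page_size * (100 - fillfactor) // 100


indexed = {"id", "email"}

# Updating a non-indexed column on a page with reserved room: HOT is possible.
ok = hot_eligible({"last_seen"}, indexed, reserved_bytes(8192, 80), 120)

# Same update on a page packed to fillfactor 100: no room, so no HOT.
full_page = hot_eligible({"last_seen"}, indexed, reserved_bytes(8192, 100), 120)

# Touching an indexed column rules HOT out regardless of free space.
indexed_change = hot_eligible({"email"}, indexed, reserved_bytes(8192, 80), 120)

print(ok, full_page, indexed_change)
```

In practice a page at fillfactor 100 can still have room after earlier deletes or pruning; the model only shows why reserving space up front raises the odds.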

What they want to hear

A senior should:

  • name both HOT conditions: indexed columns unchanged and room in the page
  • explain the benefit: no extra index entries, less bloat and less load on vacuum
  • distinguish HOT pruning (local page cleanup during queries) from a full vacuum (a table walk that updates the maps)
  • give the lever: a `fillfactor` below 100 reserves room in the page and raises the HOT ratio

Pitfalls

  • ✗ Assuming HOT works on any UPDATE. Change an indexed column and it falls away
  • ✗ Confusing HOT pruning with vacuum. Pruning is local and does not update the visibility map across the whole table
  • ✗ Keeping `fillfactor=100` on a table with heavy UPDATEs and wondering at the low HOT ratio

Follow-up

  • ? How does HOT pruning differ from a full VACUUM?
  • ? How do you see the HOT update ratio through `pg_stat_user_tables`?
  • ? Why does an index on a frequently changing column lower the HOT ratio?

Depth in knowledge base

  • HOT updates and fillfactor
  • In-page cleanup (prune)
  • Visibility Map
tags: vacuum, hot, pruning
book: postgresql_internals-17.pdf:ch5 hot updates

#freeze-wraparound

senior · Often

What are freezing and wraparound? What does overflowing the transaction counter threaten?

What to say

A transaction id is 32 bits wide, so the counter wraps around in a circle. Visibility is judged by "xmin is in the past", and "the past" on a ring is relative: only about 2 billion ids behind the current one count as past. To keep old rows from suddenly looking "from the future" and vanishing, vacuum freezes them: it marks them as always visible, so their real xmin no longer matters. Freezing advances the table's `relfrozenxid`. If freezing falls behind and the table's age passes `autovacuum_freeze_max_age`, autovacuum forces an aggressive freeze, and right at the edge the server protects itself: it stops issuing new xids and lets only cleanup through. That is the wraparound emergency.
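
The relativity of "the past" on a ring is easiest to see with modular arithmetic. The sketch below models the comparison in Python; `xid_precedes` and `xid_age` are hypothetical names, and the handful of reserved special xids that a real server treats separately are ignored.

```python
XID_SPACE = 2 ** 32   # a 32-bit counter has this many distinct values
HALF_SPACE = 2 ** 31  # anything within this distance behind counts as "the past"


def xid_precedes(a: int, b: int) -> bool:
    """True if xid a is in the past relative to xid b on the 32-bit ring."""
    diff = (a - b) % XID_SPACE
    # A wrapped difference of 2**31 or more corresponds to a negative
    # signed 32-bit value, meaning a is behind b.
    return diff >= HALF_SPACE


def xid_age(current: int, frozen: int) -> int:
    """How many xids have been consumed since `frozen`, modulo the ring size."""
    return (current - frozen) % XID_SPACE


row_xmin = 100
near = row_xmin + HALF_SPACE - 1  # the row is still in the past
far = row_xmin + HALF_SPACE + 1   # an unfrozen row would now look like it is from the future

print(xid_precedes(row_xmin, near))
print(xid_precedes(row_xmin, far))
print(xid_age(current=50, frozen=XID_SPACE - 50))
```

The same row flips from "past" to "future" once the counter moves a little more than 2**31 ids ahead of its xmin, which is exactly what freezing prevents. The age helper shows why monitoring compares `age(relfrozenxid)` against a limit long before that distance is reached.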

What they want to hear

A senior should:

  • explain the cause: a 32-bit counter and the relativity of "the past" on a ring
  • say that freezing removes a row's dependency on xmin and advances `relfrozenxid`
  • describe the edge protection: it stops issuing xids, enters a "vacuum-only" mode, and how you get out of it
  • give the monitoring: `age(relfrozenxid)` per table, `autovacuum_freeze_max_age`, alarm in advance rather than after the stall

Pitfalls

  • ✗ Thinking wraparound is "long obsolete". Under heavy load with long transactions it is real and stops writes
  • ✗ Disabling autovacuum "so it does not get in the way". That is how you reach an emergency freeze
  • ✗ Confusing freezing (always-visible) with deletion. Freeze does not remove a row, it fixes its visibility forever

Follow-up

  • ? How do you see wraparound approaching in advance through the system views?
  • ? What does the server do right at the edge of xid overflow?
  • ? Why is `relfrozenxid` more important than the database's total age?

Depth in knowledge base

  • Freezing and relfrozenxid
  • Wraparound and the XID wheel
  • Multixact and relminmxid
tags: vacuum, freeze, wraparound
book: postgresql_internals-17.pdf:ch7 freezing

#autovacuum-tuning

intermediate · Often

When does autovacuum trigger, and which parameters drive its behavior?

What to say

Autovacuum wakes on a timer (`autovacuum_naptime`) and computes a threshold for each table: a base number plus a fraction of the table size (`autovacuum_vacuum_threshold` plus `autovacuum_vacuum_scale_factor` times the row count). Once more dead versions than that have accumulated, a vacuum starts. Separately, it watches table age for the sake of freezing (`autovacuum_freeze_max_age`). Intensity is throttled by a cost-based delay (`autovacuum_vacuum_cost_delay` and `autovacuum_vacuum_cost_limit`) so that it does not saturate the disk. On large tables the default scale factor of 0.2 is too big: vacuum comes rarely and late, so it is lowered per table through `ALTER TABLE ... SET`.
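
Plugging numbers into the threshold formula shows why the default scales poorly. The helper below is a hypothetical Python sketch of that arithmetic; the defaults of 50 and 0.2 are the stock values of `autovacuum_vacuum_threshold` and `autovacuum_vacuum_scale_factor`, and the per-table value of 0.01 is only an example, not a recommendation.

```python
def vacuum_threshold(row_count: float,
                     base_threshold: int = 50,
                     scale_factor: float = 0.2) -> float:
    """Dead row versions that must accumulate before autovacuum picks the table up."""
    return base_threshold + scale_factor * row_count


small = vacuum_threshold(10_000)
large_default = vacuum_threshold(500_000_000)
large_tuned = vacuum_threshold(500_000_000, scale_factor=0.01)

print(f"10k rows, defaults:      {small:,.0f} dead versions")
print(f"500M rows, defaults:     {large_default:,.0f} dead versions")
print(f"500M rows, factor 0.01:  {large_tuned:,.0f} dead versions")
```

With the defaults, a table of half a billion rows has to collect on the order of a hundred million dead versions before the garbage trigger fires, which is why the scale factor is usually lowered for such tables rather than globally.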

What they want to hear

A senior should:

  • give the threshold formula (threshold plus scale factor of the size) and understand why the default scales poorly to large tables
  • distinguish the two triggers: garbage accumulation and age for freezing
  • explain cost-based throttling: how autovacuum limits its own load and when you need to speed it up
  • give the practice: per-table settings for hot and large tables instead of globally cranking it up

Pitfalls

  • ✗ Leaving `autovacuum_vacuum_scale_factor = 0.2` on a table with hundreds of millions of rows. Vacuum will come far too late
  • ✗ Disabling autovacuum entirely. A near-guaranteed path to bloat and wraparound
  • ✗ Cranking up the frequency while forgetting `autovacuum_vacuum_cost_delay`. Vacuum hits the throttle and never catches up to the load

Follow-up

  • ? Why does a large table's default scale factor have to be lowered?
  • ? How does autovacuum limit its own disk load?
  • ? How does the garbage threshold differ from the age trigger?

Depth in knowledge base

  • Autovacuum: thresholds and bloat
  • VACUUM and the removable cutoff
  • Transaction horizon
tags: vacuum, autovacuum, tuning
book: postgresql_internals-17.pdf:ch6 vacuum · codelibs.ru_monitoring-postgresql.pdf:autovacuum

#vacuum-vs-vacuum-full

intermediate · Sometimes

VACUUM versus VACUUM FULL: what is the difference, and when do you use each?

What to say

A plain VACUUM works online: it marks dead versions reusable inside the table, does not block reads and writes, but does not shrink the file on disk. VACUUM FULL rewrites the table into a new file with no dead versions, physically shrinks it, and returns space to the OS, but it takes an ACCESS EXCLUSIVE lock. The table is unavailable for the whole time and you need room for a copy. So everyday hygiene is plain vacuum and autovacuum; VACUUM FULL is a one-off when a table has already bloated badly and you have a window. The alternative without a full lock is `pg_repack`.

What they want to hear

A senior should:

  • contrast online vacuum (space inside the table) with VACUUM FULL (a new file, disk returned to the OS)
  • name the cost of VACUUM FULL: ACCESS EXCLUSIVE and room for a copy of the table
  • explain that regular vacuum exists precisely so you never reach VACUUM FULL
  • know `pg_repack` as a way to remove bloat with less blocking, and `CLUSTER` as a rewrite that also reorders rows but takes the same ACCESS EXCLUSIVE lock

Pitfalls

  • ✗ Running VACUUM FULL on a schedule. It is a full table lock, not a routine operation
  • ✗ Expecting a plain vacuum to shrink the file on disk. It does not
  • ✗ Starting VACUUM FULL without free space for a copy. The operation fails midway

Follow-up

  • ? Why does VACUUM FULL need free space the size of the table?
  • ? When is `pg_repack` justified instead of VACUUM FULL?
  • ? What does `CLUSTER` change besides removing bloat?

Depth in knowledge base

  • VACUUM and the removable cutoff
  • In-page cleanup (prune)
tags: vacuum, vacuum-full, bloat
book: postgresql_internals-17.pdf:ch6 vacuum

#multixact-wraparound

senior · Rare

What is a MultiXact, and why does it have its own wraparound?

What to say

When several transactions lock one row at the same time (for example several `SELECT FOR SHARE`), a single transaction id in `xmax` is not enough to record them all. A MultiXact is created instead: an identifier for a group of transactions, with the list of members held in the `pg_multixact` SLRU directories. MultiXact has its own 32-bit counter and therefore its own wraparound and its own freezing (`autovacuum_multixact_freeze_max_age`). Under a load with active row locking (task queues, FOR SHARE) MultiXact ids are consumed fast and can become the bottleneck before the ordinary xid does.
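
A toy model shows why one `xmax` field is not enough and why the separate counter moves quickly. Everything here is hypothetical Python for illustration: `MultiXactStore` stands in for the member lists kept under `pg_multixact`, and `add_locker` allocates a fresh group id every time the member list changes, which is a modeling assumption that treats a group id as naming one fixed list of members.

```python
from typing import Dict, List, Optional, Tuple

Xmax = Optional[Tuple[str, int]]  # ("xid", n) for one locker, ("multi", n) for a group


class MultiXactStore:
    """Toy stand-in for the member lists kept under pg_multixact."""

    def __init__(self) -> None:
        self.next_id = 1
        self.members: Dict[int, Tuple[int, ...]] = {}

    def create(self, xids: List[int]) -> int:
        mxid = self.next_id
        self.next_id += 1  # the separate counter that has its own wraparound
        self.members[mxid] = tuple(sorted(xids))
        return mxid


def add_locker(xmax: Xmax, locker_xid: int, store: MultiXactStore) -> Xmax:
    """Return the new xmax after one more transaction locks the row."""
    if xmax is None:
        return ("xid", locker_xid)  # a single locker fits in xmax directly
    kind, value = xmax
    current = [value] if kind == "xid" else list(store.members[value])
    return ("multi", store.create(current + [locker_xid]))


store = MultiXactStore()
xmax: Xmax = None
for xid in (100, 101, 102):
    xmax = add_locker(xmax, xid, store)

print(xmax, store.members[xmax[1]], store.next_id - 1)
```

Three sessions locking one row consume two group ids in this model, so a workload that keeps sharing row locks burns through the MultiXact counter even when ordinary xid consumption is modest.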

What they want to hear

A senior should:

  • explain why a MultiXact exists: several lockers of one row do not fit in a single `xmax`
  • know that it has a separate counter, separate wraparound, and separate freezing
  • tie it to load: patterns like FOR SHARE and queues accelerate MultiXact growth
  • be able to watch the MultiXact age, not only the ordinary xid, when debugging freeze problems

Pitfalls

  • ✗ Watching only the xid age and missing a MultiXact overflow
  • ✗ Not knowing that `SELECT FOR SHARE` from several sessions produces a MultiXact
  • ✗ Assuming that freezing the xid also handles MultiXact. It has its own thresholds

Follow-up

  • ? Which load pattern accelerates MultiXact growth the most?
  • ? Where is the list of MultiXact members stored?
  • ? Why must the MultiXact age be monitored separately from the xid age?

Depth in knowledge base

  • Multixact and relminmxid
  • Freezing and relfrozenxid
  • Row locks
tags: vacuum, multixact, freeze
book: postgresql_internals-17.pdf:ch7 freezing

#heap-pruning

senior · Sometimes

How does in-page cleanup differ from a full vacuum?

What to say

In-page cleanup (page pruning) happens right during an ordinary query when it touches a page: PostgreSQL collapses HOT chains and marks dead versions reusable within that one page. It is cheap and needs no table walk, but it does not touch indexes and does not update the visibility map fully. A full vacuum walks the whole table, cleans index references, updates the FSM and VM, and advances freezing. Pruning relieves pressure between vacuum runs, but it does not replace vacuum.

What they want to hear

A senior should:

  • localize pruning: one page, during a query, with no table walk
  • list what pruning does not do: it does not clean indexes, does not update the maps fully, does not freeze
  • explain that pruning and HOT work as a pair and smooth out bloat between vacuum runs
  • understand a full vacuum is still needed for indexes and freezing

Pitfalls

  • ✗ Assuming page pruning replaces vacuum. It does not touch indexes or freezing
  • ✗ Thinking pruning runs as a separate process. It is built into the ordinary page read
  • ✗ Expecting pruning to update the visibility map across the whole table. That is vacuum's job

Follow-up

  • ? What can page pruning fundamentally not do?
  • ? When exactly does in-page cleanup trigger?
  • ? Why does a table still need vacuum even with active pruning?

Depth in knowledge base

  • In-page cleanup (prune)
  • HOT updates and fillfactor
  • VACUUM and the removable cutoff
tags: vacuum, pruning, hot
book: postgresql_internals-17.pdf:ch5 hot updates