Question 1

Why do you need VACUUM, and what exactly does it do?

Accepted Answer

Every UPDATE and DELETE leaves behind an old row version. Once no live
snapshot can see it, that version is dead: useless, but still taking up
space in the page. VACUUM walks the table, finds the versions invisible to
every live snapshot, and frees their slots for
reuse. The space stays with the table, but holes appear inside the pages
for new rows. Along the way it updates the free space map (FSM) and the
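visibility map (VM), cleans up the matching index entries, reclaims dead
line pointers, and advances freezing. A plain VACUUM generally does not
return space to the file system (at most it truncates empty pages at the
end of the file) and runs without blocking writes.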

Question 2

What is the transaction horizon, and why does a long transaction hinder cleanup?

Accepted Answer

The horizon is the id of the oldest transaction whose snapshot might still
be needed. A row version can be removed only if it became dead before the
horizon; otherwise some snapshot may still need to see it. Any long
transaction (an open `BEGIN`, a forgotten session in the
`idle in transaction` state, a long analytical query at Repeatable Read)
holds the horizon in place.
While it does not move, vacuum sees the dead versions but has no right to
remove them. They pile up, the table bloats, the indexes swell. Rogov
calls this an event horizon: beyond it everything is already committed,
and cleanup is safe.
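
The removal rule can be sketched in a few lines of Python. This is a toy model, not PostgreSQL's code: the `RowVersion` class and the `horizon` and `removable` helpers are hypothetical names, xids are plain integers with no wraparound, and every deleting transaction is assumed to have committed.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass
class RowVersion:
    xmin: int            # xid of the transaction that created this version
    xmax: Optional[int]  # xid of the transaction that deleted or updated it; None if live


def horizon(snapshot_xmins: Iterable[int], next_xid: int) -> int:
    """Oldest xid that some open snapshot may still need (next_xid if none are open)."""
    return min(snapshot_xmins, default=next_xid)


def removable(version: RowVersion, horizon_xid: int) -> bool:
    """A version may be removed only if it became dead before the horizon."""
    return version.xmax is not None and version.xmax < horizon_xid


# One row updated twice: two dead versions and one live version.
versions = [RowVersion(100, 105), RowVersion(105, 120), RowVersion(120, None)]

free = horizon([], next_xid=130)          # no open snapshots
held = horizon([103, 125], next_xid=130)  # a long transaction started at xid 103

print([removable(v, free) for v in versions])  # [True, True, False]
print([removable(v, held) for v in versions])  # [False, False, False]
```

With nothing holding the horizon, both dead versions can go. A single snapshot opened at xid 103 keeps both of them in the table, even though the transactions that killed them committed long ago.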

Question 3

What is a HOT update, and how is in-page cleanup tied to it?

Accepted Answer

HOT (heap-only tuple) is an update where the new row version stays in the
same page and no index entries are created for it. The conditions: no
indexed column changed and the page had room for the new version. The old
version points to the new one through `t_ctid`, forming a HOT chain. Index
entries keep pointing at the head of the chain; once the head version is
pruned, its line pointer becomes a redirect to the first surviving version.
In-page cleanup (HOT pruning) kicks in when a query touches a page that is
running short of room: it collapses chains of dead versions and frees
space without running a full vacuum over the whole table. This happens
right during ordinary queries.
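
The two HOT conditions can be written down as a small predicate. This is only an illustration of the rule stated above: `can_use_hot` and its parameters are hypothetical, and the size check ignores tuple headers, alignment, and line pointer overhead.

```python
from typing import Iterable


def can_use_hot(
    changed_columns: Iterable[str],
    indexed_columns: Iterable[str],
    page_free_bytes: int,
    new_version_bytes: int,
) -> bool:
    """Return True if an update could stay heap-only under the simplified rule."""
    # Condition 1: the update must not touch any indexed column.
    touches_index = bool(set(changed_columns) & set(indexed_columns))
    # Condition 2: the new version must fit into the same page.
    fits_in_page = new_version_bytes <= page_free_bytes
    return not touches_index and fits_in_page


indexed = {"id", "email"}

print(can_use_hot({"status"}, indexed, page_free_bytes=400, new_version_bytes=120))  # True
print(can_use_hot({"email"}, indexed, page_free_bytes=400, new_version_bytes=120))   # False
print(can_use_hot({"status"}, indexed, page_free_bytes=64, new_version_bytes=120))   # False
```

The third case is why a lower `fillfactor` helps update-heavy tables: leaving free room in each page makes the second condition easier to satisfy.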

Question 4

What are freezing and wraparound? What does overflowing the transaction counter threaten?

Accepted Answer

A transaction id is 32 bits, and it wraps around in a circle. Visibility
is judged by "xmin is in the past", and "the past" on a ring is relative.
To keep old rows from suddenly looking "from the future" and vanishing,
vacuum freezes them: it marks them as always visible, so their real xmin no
longer matters (modern versions set a frozen flag in the tuple header
rather than overwriting xmin). Freezing advances the table's
`relfrozenxid`. If freezing falls behind and the table's age exceeds
`autovacuum_freeze_max_age`, autovacuum launches an aggressive freeze, and
right at the edge the server goes into protection: it stops issuing new
xids and lets only cleanup through. That is the wraparound emergency.
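
Why "the past" is relative on a ring is easiest to see with the comparison itself. The sketch below mimics the usual signed 32-bit difference trick for ordinary xids; the function name `xid_precedes` is a hypothetical one, and the reserved special xids are ignored.

```python
XID_SPACE = 2 ** 32  # size of the 32-bit xid ring
HALF = 2 ** 31       # each xid has about two billion xids "behind" it


def xid_precedes(a: int, b: int) -> bool:
    """True if xid a counts as older than xid b on the 32-bit ring."""
    diff = (a - b) % XID_SPACE
    # Read diff as a signed 32-bit number: the upper half of the range is negative.
    return diff >= HALF


old_xmin = 100

# Less than 2^31 transactions later the row is still in the past.
current = (old_xmin + HALF - 50) % XID_SPACE
print(xid_precedes(old_xmin, current))  # True

# A little more than 2^31 transactions later the same xmin looks like the future.
current = (old_xmin + HALF + 50) % XID_SPACE
print(xid_precedes(old_xmin, current))  # False

# The comparison keeps working across the wrap point of the counter.
print(xid_precedes(XID_SPACE - 10, 5))  # True
```

An unfrozen row whose xmin flips from True to False in this comparison would disappear from every query, which is exactly what freezing prevents.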

Question 5

When does autovacuum trigger, and which parameters drive its behavior?

Accepted Answer

Autovacuum wakes on a timer (`autovacuum_naptime`) and for each table
computes a threshold: a base number plus a fraction of the size
(`autovacuum_vacuum_threshold` plus `autovacuum_vacuum_scale_factor`
times the row count). Once more dead versions than that have accumulated,
a vacuum starts. Separately it watches age for the sake of freezing
(`autovacuum_freeze_max_age`). Intensity is throttled by a cost-based
delay (`autovacuum_vacuum_cost_delay` and `autovacuum_vacuum_cost_limit`)
so it does not saturate the disk. On large tables the default scale factor
of 0.2 is too big: vacuum comes rarely and late, so it is lowered per table
through `ALTER TABLE ... SET`.
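
The arithmetic behind "too big for large tables" is worth running once. The sketch below only evaluates the formula from the answer; the function names are hypothetical, the row counts are made up, and the base threshold of 50 is assumed to be the stock default for `autovacuum_vacuum_threshold`.

```python
def vacuum_trigger_point(
    row_count: float,
    threshold: int = 50,        # assumed default of autovacuum_vacuum_threshold
    scale_factor: float = 0.2,  # default of autovacuum_vacuum_scale_factor
) -> float:
    """Number of dead versions that must be exceeded before autovacuum starts."""
    return threshold + scale_factor * row_count


def needs_vacuum(dead_versions: int, row_count: float, **settings) -> bool:
    """True once the dead versions exceed the trigger point."""
    return dead_versions > vacuum_trigger_point(row_count, **settings)


# Small table: about 250 dead versions are enough.
print(vacuum_trigger_point(1_000))

# Large table with defaults: about 20 million dead versions pile up first.
print(vacuum_trigger_point(100_000_000))

# The same table with a per-table scale factor of 0.01: about 1 million.
print(vacuum_trigger_point(100_000_000, scale_factor=0.01))
```

On the hundred-million-row table the default lets roughly a fifth of the table turn into garbage before the first run, which is why the scale factor is the first knob to lower per table.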

Question 6

VACUUM versus VACUUM FULL: what is the difference, and when do you use each?

Accepted Answer

A plain VACUUM works online: it marks dead versions reusable inside the
table, does not block reads and writes, but does not shrink the file on
disk. VACUUM FULL rewrites the table into a new file with no dead
versions, physically shrinks it, and returns space to the OS, but it takes
an ACCESS EXCLUSIVE lock. The table is unavailable for the whole time and
you need room for a copy. So everyday hygiene is plain vacuum and
autovacuum; VACUUM FULL is a one-off when a table has already bloated badly
and you have a window. The alternative without a full lock is `pg_repack`.

Question 7

What is a MultiXact, and why does it have its own wraparound?

Accepted Answer

When several transactions lock one row at the same time (for example
several `SELECT ... FOR SHARE`), you cannot write a single number into `xmax`.
Then a MultiXact is created: an identifier for a group of transactions,
with the list of members held in the `pg_multixact` SLRU directories.
MultiXact has its own 32-bit counter and therefore its own wraparound and
its own freezing (`autovacuum_multixact_freeze_max_age`). Under a load
with active row locking (task queues, FOR SHARE) MultiXact grows fast and
can become the bottleneck before the ordinary xid does.

Question 8

How does in-page cleanup differ from a full vacuum?

Accepted Answer

In-page cleanup (page pruning) happens right during an ordinary query when
it touches a page: PostgreSQL collapses HOT chains and marks dead versions
reusable within that one page. It is cheap and needs no table walk, but it
does not touch indexes and does not update the visibility map fully. A
full vacuum walks the whole table, cleans index references, updates the
FSM and VM, and advances freezing. Pruning relieves pressure between
vacuum runs, but it does not replace vacuum.

Vacuum, freeze, wraparound, autovacuum