Page anatomy: 8 KB, filled from both ends

A table on disk is an array of 8 KB pages, and each page fills from two sides at once: a line-pointer array grows down from the header, tuples grow up from the end, and the free space is whatever gap is left in the middle. Here is how one page fills up.

PostgreSQL never reads or writes a table row by row or byte by byte. Its unit of I/O, caching, and locking is the page (also called a block), a fixed size of 8 KB (8192 bytes) in a default build. A table file is just an array of these pages, numbered from zero.
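Because the pages are fixed-size and numbered from zero, finding a page is plain arithmetic. A minimal sketch, assuming a single table file and the default 8192-byte page; the function names are illustrative and not part of PostgreSQL:

```python
PAGE_SIZE = 8192  # bytes per page, the default size given in the text


def page_byte_range(block_number):
    """Return the (start, end) byte range of a page inside one table file.

    The end offset is exclusive. Hypothetical helper for illustration only.
    """
    if block_number < 0:
        raise ValueError("block numbers start at zero")
    start = block_number * PAGE_SIZE
    return start, start + PAGE_SIZE


def block_of_offset(byte_offset):
    """Return the number of the page that contains a given byte offset."""
    if byte_offset < 0:
        raise ValueError("offsets start at zero")
    return byte_offset // PAGE_SIZE
```

For example, page 3 covers bytes 24576 up to (but not including) 32768, and byte 8191 still belongs to page 0.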

Inside, every page has the same four zones:

  • a 24-byte header with the page's LSN, checksum, and three offsets: pd_lower, pd_upper, pd_special;
  • an array of line pointers, 4 bytes each, that grows down from the header;
  • the free space in the middle;
  • the tuples (the row versions themselves), which grow up from the end of the page.

The trick is that the pointer array and the tuples grow toward each other. The page is full once the gap between them is too small for the next tuple plus its 4-byte line pointer. A line pointer is the stable address of a row (see line-pointers), so a tuple can move inside the page without its row id changing.
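The two-ended fill can be modelled with two integers. The sketch below is a toy model, not PostgreSQL's code: the Page class and its method names are made up for illustration, and it ignores tuple alignment padding, which a real page adds.

```python
PAGE_SIZE = 8192        # whole page, in bytes
HEADER_SIZE = 24        # page header
LINE_POINTER_SIZE = 4   # one line pointer


class Page:
    """Toy model of a heap page: only the two cursors and the pointer array."""

    def __init__(self):
        self.pd_lower = HEADER_SIZE   # end of the line-pointer array
        self.pd_upper = PAGE_SIZE     # start of the tuples
        self.line_pointers = []       # (offset, length) per tuple

    def free_space(self):
        return self.pd_upper - self.pd_lower

    def insert(self, tuple_len):
        """Place a tuple; return its 1-based line pointer number, or None if full."""
        if tuple_len + LINE_POINTER_SIZE > self.free_space():
            return None  # the two regions would collide
        self.pd_upper -= tuple_len             # tuples grow from the end
        self.line_pointers.append((self.pd_upper, tuple_len))
        self.pd_lower += LINE_POINTER_SIZE     # pointers grow from the header
        return len(self.line_pointers)


page = Page()
inserted = 0
while page.insert(100) is not None:
    inserted += 1
# Each 100-byte tuple costs 104 bytes, so 78 fit in the 8168 free bytes,
# leaving pd_lower = 336, pd_upper = 392 and 56 bytes unused.
```

Under these assumptions every insert moves both cursors: pd_lower by 4 bytes, pd_upper by the tuple length.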


[Figure, step 1 of 5: an empty page, 8192 bytes with a 24-byte header. Zones from offset 0 to 8192: header (24 B: lsn, checksum, lower, upper, special), line pointers, free space, tuples, special (heap: empty). pd_lower marks the end of the pointers. Almost all of the page is free space.]

§ steps

  1. A freshly extended page is almost entirely free space. At the very top sits the 24-byte header:

    pd_lsn       last WAL record that touched this page
    pd_checksum  optional page checksum
    pd_lower     24   -> end of the line-pointer array
    pd_upper     8192 -> start of the tuples
    pd_special   8192 -> the special area (empty for a heap)

    On an empty page pd_lower sits right after the header and pd_upper sits at the very end, so the gap between them is the whole page. Everything that follows is just these two numbers moving toward each other.
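To make the 24 bytes concrete, here is a sketch that builds and parses the header of an empty heap page with Python's struct module. Several things here are assumptions not stated in the text above: the byte order (little-endian, as on x86), the split of pd_lsn into two 32-bit halves, and the three extra fields pd_flags, pd_pagesize_version and pd_prune_xid, which come from PostgreSQL's PageHeaderData definition. The helper names are illustrative.

```python
import struct

# Assumed layout, following PageHeaderData in PostgreSQL's bufpage.h:
# two 32-bit LSN halves, six 16-bit fields, one 32-bit field = 24 bytes.
HEADER_FORMAT = "<IIHHHHHHI"
FIELDS = (
    "pd_lsn_hi", "pd_lsn_lo", "pd_checksum", "pd_flags",
    "pd_lower", "pd_upper", "pd_special",
    "pd_pagesize_version", "pd_prune_xid",
)


def empty_heap_page():
    """Build an 8192-byte page holding only an empty-page header."""
    header = struct.pack(
        HEADER_FORMAT,
        0, 0,        # pd_lsn: no WAL record has touched the page yet
        0,           # pd_checksum
        0,           # pd_flags
        24,          # pd_lower: right after the header
        8192,        # pd_upper: the very end of the page
        8192,        # pd_special: empty special area for a heap
        8192 | 4,    # page size combined with layout version 4 (assumed)
        0,           # pd_prune_xid
    )
    return header + bytes(8192 - len(header))


def parse_page_header(page):
    """Decode the first 24 bytes of a page into a dict of header fields."""
    values = struct.unpack_from(HEADER_FORMAT, page, 0)
    return dict(zip(FIELDS, values))


header = parse_page_header(empty_heap_page())
free_bytes = header["pd_upper"] - header["pd_lower"]  # 8192 - 24 = 8168
```

The parse step is the useful half: given the raw bytes of a page, the two cursors sit at fixed positions inside the header.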

recap

What to remember:

  • A page is 8192 bytes, the unit of I/O and the granularity of the buffer cache (see buffer-cache). A table is an array of pages; an index is a different arrangement of the same 8 KB blocks.
  • Two cursors define the layout: pd_lower is where the line-pointer array ends, pd_upper is where the tuples begin. Free space is exactly pd_upper - pd_lower, and the free space map tracks it per page (see free-space-map) so an INSERT can find a page with room.
  • A row's address is a ctid of (block, line-pointer), not a byte offset. The line pointer holds the tuple's real byte offset; page compaction can move the tuple and rewrite that offset, while the ctid stays put. The details are in line-pointers.
  • A tuple carries its own header with the MVCC fields (see tuple-header), so one logical row can have several versions on the same page at once.
  • The special area at the very end is empty for a heap table; index access methods use it for their own bookkeeping. When the gap between pd_lower and pd_upper is too small for the next row, that row goes to another page, and a row that grows past roughly 2 KB has its largest values compressed or moved out to TOAST storage (see toast).
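The indirection through line pointers is easiest to see in a small model. The sketch below is a simplification under stated assumptions: the buffer models only the tuple area, a dead tuple is represented by a pointer set to None, and compact() is a hypothetical stand-in for PostgreSQL's real page compaction, which differs in detail.

```python
PAGE_SIZE = 8192


def place(pointers, data, upper, payload):
    """Store payload just below `upper`, add a line pointer, return the new upper."""
    upper -= len(payload)
    data[upper:upper + len(payload)] = payload
    pointers.append((upper, len(payload)))
    return upper


def fetch(pointers, data, line_pointer_number):
    """Read a tuple through its 1-based line pointer number."""
    offset, length = pointers[line_pointer_number - 1]
    return bytes(data[offset:offset + length])


def compact(pointers, data):
    """Repack live tuples against the end of the page; return the new upper.

    Only the offsets stored in the pointers change. Their positions in the
    array, and therefore the row addresses, stay the same.
    """
    repacked = bytearray(PAGE_SIZE)
    upper = PAGE_SIZE
    for i, pointer in enumerate(pointers):
        if pointer is None:          # dead tuple: its space is reclaimed
            continue
        offset, length = pointer
        upper -= length
        repacked[upper:upper + length] = data[offset:offset + length]
        pointers[i] = (upper, length)
    data[:] = repacked
    return upper


pointers, data = [], bytearray(PAGE_SIZE)
upper = PAGE_SIZE
for payload in (b"alpha", b"bravo", b"charlie"):
    upper = place(pointers, data, upper, payload)
pointers[1] = None                   # pretend the second row version is dead
upper = compact(pointers, data)      # "charlie" moves from 8175 to 8180
```

After compaction, fetch(pointers, data, 3) still returns b"charlie": the third line pointer now stores a different offset, but a reference to line pointer 3 never had to change.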

You can read all of this live with the pageinspect extension: page_header() returns the three offsets, heap_page_items() lists the line pointers and tuples.
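Once you have the three offsets, the rest of the layout follows by subtraction. A small sketch that turns the lower, upper and special values into sizes for each zone; summarize_page is a hypothetical helper, and the 24-byte header and 4-byte pointer sizes are the ones given earlier on this page:

```python
HEADER_SIZE = 24
LINE_POINTER_SIZE = 4


def summarize_page(lower, upper, special, page_size=8192):
    """Derive zone sizes from the three offsets in a page header."""
    if not HEADER_SIZE <= lower <= upper <= special <= page_size:
        raise ValueError("offsets are not in a valid order for this page size")
    return {
        "line_pointers": (lower - HEADER_SIZE) // LINE_POINTER_SIZE,
        "free_bytes": upper - lower,
        "tuple_bytes": special - upper,
        "special_bytes": page_size - special,
    }
```

With lower = 336, upper = 392 and special = 8192, this reports 78 line pointers, 56 free bytes, 7800 bytes of tuples and an empty special area.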
