linuxlab.io
Tutorials▾
  • Linux & networking
    File system, processes, TCP/IP, BGP and OSPF
    →
  • Terraform & IaC
    HCL, state, plan/apply on a LocalStack sandbox
    →
  • Git & GitHub
    Object model, plumbing, branching, GitHub Actions
    →
  • PostgreSQL internals
    Page and tuple, MVCC, vacuum, WAL, the planner and indexes
    →
All tutorials →
PricingAboutSign inCreate account
/
Intro
Lessons
Footer
linuxlab-TutorialsPricingAboutPrivacy & cookies
Copyright © 2026 LinuxLab. All rights reserved.

Streaming and logical replication, failover

How PostgreSQL keeps copies of the data and survives a node failure: streaming replication over WAL, synchronous and asynchronous modes, lag, standby feedback, logical replication, slots, failover, and the pitfalls of distributed systems. A replica is not a backup, and that is the first thing they check here.

8 questions · ~35 min read

Questions

  1. How does streaming replication work? What is sent between nodes?
  2. Synchronous and asynchronous replication: what do you pay for each?
  3. What is replication lag, how do you measure it, and where does it come from?
  4. Why do you need hot_standby_feedback, and what conflict does it solve?
  5. How does logical replication differ from physical, and when do you need it?
  6. What is a replication slot, and why is an abandoned slot dangerous?
  7. What are failover and split-brain? Why is automatic switchover dangerous?
  8. Is a replica a backup? What are the pitfalls of working with distributed data?

#streaming-replication

intermediate · Often

How does streaming replication work? What is sent between nodes?

What to say

Streaming (physical) replication sends the WAL stream. On the primary, a walsender process ships WAL records as they are generated; on the replica, a walreceiver process receives them and writes them to the local WAL, and the startup process replays them, reproducing the same page changes. A replica is a byte-for-byte copy of the cluster: the same files, the same LSNs. A standby can be a hot standby, accepting read-only queries while it keeps applying WAL. Replay follows the same redo rules as crash recovery, so a replica can advance only as far as the WAL it has received. The gap between the primary's position and the replica's is the lag.
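
The receive-then-replay flow can be sketched as a toy model. This is a minimal illustration in Python, not PostgreSQL's implementation: the `Standby` class, the integer LSNs, and the `(lsn, page, value)` record shape are all assumptions made for the example.

```python
from dataclasses import dataclass, field


@dataclass
class Standby:
    """Toy standby: stores received WAL records and replays them in order."""
    received: list = field(default_factory=list)  # (lsn, page, value) tuples
    pages: dict = field(default_factory=dict)     # page id -> contents
    replay_lsn: int = 0                           # last applied position

    def receive(self, records):
        # Receiving side: append incoming records to the local WAL copy.
        self.received.extend(records)

    def replay(self):
        # Redo: apply every received record that has not been applied yet.
        for lsn, page, value in self.received:
            if lsn > self.replay_lsn:
                self.pages[page] = value
                self.replay_lsn = lsn


# Hypothetical WAL on the primary: three page changes.
primary_wal = [(100, "page-1", "a"), (200, "page-2", "b"), (300, "page-1", "c")]

standby = Standby()
standby.receive(primary_wal[:2])  # the third record has not arrived yet
standby.replay()

# The standby can advance only as far as the WAL it has received.
lag = primary_wal[-1][0] - standby.replay_lsn
print(standby.replay_lsn, lag)
```

The standby ends at the position of the last record it received, and the difference to the primary's last position is the lag.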

What they want to hear

A senior should:

  • name what is sent: the WAL stream, and the walsender/walreceiver processes
  • say that a physical replica is a binary copy of the whole cluster at the page level
  • tie application on the replica to the redo mechanism and LSN
  • distinguish the roles: a hot standby accepts reads, a plain standby only catches up

Pitfalls

  • ✗ Thinking a physical replica copies tables selectively. It is the whole cluster at the WAL level
  • ✗ Assuming you can write on a physical replica. It is read-only
  • ✗ Confusing the WAL stream with logical row changes. Physical replication works in pages

Follow-up

  • ? How does walsender differ from walreceiver?
  • ? Why can't you run an INSERT on a physical replica?
  • ? How does a replica know up to which position it applied the WAL?

Depth in knowledge base

  • Physical (streaming) replication
  • WAL: the write-ahead rule and LSN
tags: replication, streaming, wal
book: postgresql_internals-17.pdf:ch11 wal modes · codelibs.ru_mastering-postgresql-15-advanced-techniques-to-build-and-manage-scalable-reliable-and-fault-tolerant-database-applications-5-ed.pdf:replication

#sync-vs-async

intermediate · Often

Synchronous and asynchronous replication: what do you pay for each?

What to say

With asynchronous replication, a commit on the primary is confirmed right away, without waiting for the replica. This is fast, but if the primary is suddenly lost, the last transactions that did not reach the replica are lost with it. With synchronous replication (`synchronous_commit = on` plus `synchronous_standby_names`), a commit waits until the required synchronous standbys (at least one) confirm that they have flushed the WAL to disk. A confirmed commit then survives the loss of the primary, but every commit pays a network round-trip, and if the only synchronous replica drops out, commits on the primary stall. The neighbouring levels fine-tune what exactly to wait for: `remote_write` waits only until the standby has written the WAL to its operating system, without a flush, and `remote_apply` waits until the standby has replayed it, so the change is visible to queries there. The choice is an explicit trade-off between data loss and latency.
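
What each `synchronous_commit` level waits for can be sketched as a small decision function. This is a simplified model, assuming one configured synchronous standby and integer LSNs; the `StandbyProgress` class and `standby_confirms` function are hypothetical names for the example, not PostgreSQL APIs.

```python
from dataclasses import dataclass


@dataclass
class StandbyProgress:
    write_lsn: int   # WAL written to the standby's operating system
    flush_lsn: int   # WAL flushed to the standby's disk
    replay_lsn: int  # WAL replayed and visible to standby queries


def standby_confirms(level: str, commit_lsn: int, s: StandbyProgress) -> bool:
    """Return True if the commit may be acknowledged under the given level."""
    if level in ("off", "local"):
        return True  # these levels do not wait for any standby
    position = {
        "remote_write": s.write_lsn,
        "on": s.flush_lsn,
        "remote_apply": s.replay_lsn,
    }[level]
    return position >= commit_lsn


# Hypothetical standby that has written more than it flushed or replayed.
progress = StandbyProgress(write_lsn=500, flush_lsn=400, replay_lsn=300)
for level in ("local", "remote_write", "on", "remote_apply"):
    print(level, standby_confirms(level, 450, progress))
```

The stricter the level, the further along the standby must be before the client sees the commit return, which is where the extra latency comes from.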

What they want to hear

A senior should:

  • state the trade-off: async loses the tail of transactions on failure, sync pays in latency
  • name the risk of the synchronous scheme: a lost synchronous replica stops commits on the primary
  • know the `synchronous_commit` levels: off, local, remote_write, on, remote_apply, and what they mean
  • advise at least two synchronous replicas or a quorum so one failure does not hang writes
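
The quorum advice can be illustrated with a short sketch. It models the idea behind a setting such as `synchronous_standby_names = 'ANY 1 (s1, s2)'`: a commit is acknowledged once enough of the listed standbys have confirmed it. The function name, the standby names, and the use of `None` for an unreachable standby are assumptions for the example.

```python
def quorum_commit_ok(flushed_lsns: dict, commit_lsn: int, required: int) -> bool:
    """True if at least `required` standbys have flushed WAL up to commit_lsn.

    flushed_lsns maps a standby name to its flushed LSN, or None if it is down.
    """
    confirmed = sum(
        1 for lsn in flushed_lsns.values() if lsn is not None and lsn >= commit_lsn
    )
    return confirmed >= required


# One synchronous standby and it is down: commits stall.
single = quorum_commit_ok({"s1": None}, commit_lsn=800, required=1)

# Two candidates, any one is enough: losing s1 does not stop writes.
quorum = quorum_commit_ok({"s1": None, "s2": 900}, commit_lsn=800, required=1)

print(single, quorum)
```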

Pitfalls

  • ✗ Setting one synchronous replica with no spare. Its failure freezes commits on the primary
  • ✗ Calling an asynchronous replica safe for zero loss. The tail of transactions vanishes in a disaster
  • ✗ Confusing `synchronous_commit=off` (a local loss risk) with asynchronous replication (a risk on the replica side)

Follow-up

  • ? What happens to commits if the only synchronous replica goes down?
  • ? How does `remote_write` differ from `remote_apply`?
  • ? How many synchronous replicas do you need to survive one failure without stopping writes?

Depth in knowledge base

  • Physical (streaming) replication
  • hot_standby_feedback and vacuum on the primary
tags: replication, sync, durability
book: codelibs.ru_mastering-postgresql-15-advanced-techniques-to-build-and-manage-scalable-reliable-and-fault-tolerant-database-applications-5-ed.pdf:synchronous replication

#replication-lag

intermediate · Often

What is replication lag, how do you measure it, and where does it come from?

What to say

Lag is how far a replica trails the primary. It is measured two ways: by volume (the LSN difference between what the primary wrote and what the replica received and applied) and by time (replay lag, how many seconds stale the replica's data is). The causes: limited network bandwidth that cannot carry the WAL fast enough; a replica that cannot keep up with replay, because redo is single-threaded and waits on the disk or on conflicts with read queries; a spike of writes on the primary. You watch it through `pg_stat_replication` on the primary (`sent_lsn`/`write_lsn`/`flush_lsn`/`replay_lsn`) and `pg_last_wal_replay_lsn()` on the replica. Large lag means reads from the replica return stale data, and a failover loses the tail.
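
Lag by volume is plain arithmetic on LSNs. An LSN is printed as two hexadecimal numbers separated by a slash (the high and low 32 bits of a 64-bit position), so it can be converted to an integer and subtracted. The sketch below does that in Python; the sample LSN values are made up for illustration.

```python
def lsn_to_int(lsn: str) -> int:
    """Convert a textual LSN such as '0/3000060' to a 64-bit byte position."""
    high, low = lsn.split("/")
    return (int(high, 16) << 32) | int(low, 16)


def lag_bytes(ahead: str, behind: str) -> int:
    """Number of WAL bytes between two positions."""
    return lsn_to_int(ahead) - lsn_to_int(behind)


# Hypothetical positions: what the primary wrote, what the replica
# received, and what the replica has replayed so far.
primary_lsn = "0/3000060"
received_lsn = "0/3000000"
replayed_lsn = "0/2000000"

transfer_lag = lag_bytes(primary_lsn, received_lsn)  # still on the network
apply_lag = lag_bytes(received_lsn, replayed_lsn)    # received, not yet replayed
print(transfer_lag, apply_lag)
```

Splitting the gap this way shows whether the bottleneck is the network (transfer lag) or replay on the replica (apply lag).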

What they want to hear

A senior should:

  • distinguish lag by volume (LSN) from lag by time (replay)
  • name the causes: network, redo apply speed, conflicts with queries on the replica, write spikes
  • show the tools: `pg_stat_replication`, LSN differences, `pg_last_wal_replay_lsn()`
  • tie lag to two consequences: stale reads and a lost tail on a switchover

Pitfalls

  • ✗ Measuring lag only by time and missing that the replica received WAL but did not apply it
  • ✗ Ignoring that redo on the replica is essentially single-threaded and may not keep up with the primary's parallel writes
  • ✗ Reading critical data from a badly lagging replica and being surprised at stale results

Follow-up

  • ? How does `write_lsn` in `pg_stat_replication` differ from `replay_lsn`?
  • ? Why can a replica receive WAL yet trail in application?
  • ? How does lag affect data loss on a failover?

Depth in knowledge base

  • Physical (streaming) replication
  • hot_standby_feedback and vacuum on the primary
tags: replication, lag, monitoring
book: codelibs.ru_monitoring-postgresql.pdf:replication lag

#hot-standby-feedback

senior · Sometimes

Why do you need hot_standby_feedback, and what conflict does it solve?

What to say

A hot standby runs long read queries, and they need old row versions. Meanwhile vacuum on the primary may remove versions that a query on the replica still needs. When WAL replay reaches that removal, a conflict arises: the replica either delays replay or cancels the query (`ERROR: canceling statement due to conflict with recovery`). `hot_standby_feedback = on` addresses this: the replica reports its xmin horizon to the primary, and the primary does not clean up versions that the replica's queries still need. The price is paid on the primary: its horizon is now held back by the replica too, so a long query on the standby stalls cleanup and piles up garbage on the primary.
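
The effect of feedback on the cleanup horizon can be sketched with two small functions. This is a simplified model that ignores transaction ID wraparound and treats the horizon as a single integer; the function names and the sample transaction IDs are assumptions for the example.

```python
from typing import Optional


def vacuum_horizon(primary_oldest_xmin: int,
                   standby_xmin: Optional[int],
                   feedback_on: bool) -> int:
    """Oldest transaction ID whose view of the data must still be preserved."""
    if feedback_on and standby_xmin is not None:
        return min(primary_oldest_xmin, standby_xmin)
    return primary_oldest_xmin


def removable(deleted_by_xid: int, horizon: int) -> bool:
    """A dead row version can be cleaned only if it was deleted before the horizon."""
    return deleted_by_xid < horizon


# Hypothetical state: a row version deleted by xid 1500, the primary's own
# oldest snapshot at 2000, and a long standby query holding xmin 1200.
without_feedback = removable(1500, vacuum_horizon(2000, 1200, feedback_on=False))
with_feedback = removable(1500, vacuum_horizon(2000, 1200, feedback_on=True))
print(without_feedback, with_feedback)
```

Without feedback the version is cleaned on the primary and the standby query runs into a recovery conflict; with feedback it stays, and the primary carries the garbage for as long as the standby query runs.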

What they want to hear

A senior should:

  • describe the conflict: vacuum on the primary versus long queries on the replica
  • explain what feedback does: the replica holds back the primary's horizon, so vacuum does not remove the versions it needs
  • name the cost: the primary's horizon is held by the replica, and bloat grows from long queries on the standby
  • know the alternatives without feedback: `max_standby_streaming_delay` as a compromise between cancelling queries and lag

Pitfalls

  • ✗ Turning on `hot_standby_feedback` and forgetting that a long query on the replica now bloats the primary
  • ✗ Treating a query cancellation on the replica as a bug. It is a routine conflict with recovery
  • ✗ Confusing feedback's role with eliminating lag. It is about the vacuum conflict, not about speed

Follow-up

  • ? Why does a long query on a replica with feedback hurt the primary?
  • ? What does `max_standby_streaming_delay` do without feedback?
  • ? Where does the query cancellation error on a standby come from?

Depth in knowledge base

  • hot_standby_feedback and vacuum on the primary
  • Transaction horizon
  • Physical (streaming) replication
tags: replication, standby, horizon
book: postgresql_internals-17.pdf:ch6 vacuum · codelibs.ru_mastering-postgresql-15-advanced-techniques-to-build-and-manage-scalable-reliable-and-fault-tolerant-database-applications-5-ed.pdf:hot standby

#logical-replication

intermediate · Often

How does logical replication differ from physical, and when do you need it?

What to say

Physical replication copies the whole cluster at the page level: all or nothing, the same version, read-only on the replica. Logical replication works at the row level through publication/subscription: WAL is decoded into logical changes (INSERT/UPDATE/DELETE on specific tables) and applied on the subscriber as ordinary row operations. This gives what physical cannot: replicating selected tables, replicating between different major versions (handy for an upgrade with minimal downtime), and replicating into a database where the subscriber has its own tables and accepts writes. It requires `wal_level = logical`, and tables need a way to identify a row (REPLICA IDENTITY, usually the primary key). DDL is not replicated.
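
Why row-level apply needs a row identifier can be shown with a toy subscriber. This is only an illustration of the idea, not the real apply worker: the change dictionaries, the `id` column used as the key, and the `apply_change` function are assumptions for the example.

```python
def apply_change(table: dict, change: dict) -> None:
    """Apply one decoded row change to a table stored as {key: row}."""
    op = change["op"]
    if op == "insert":
        table[change["new"]["id"]] = change["new"]
        return
    # UPDATE and DELETE must first locate the old row by its identity.
    key = change.get("key")
    if key is None:
        raise ValueError(f"{op} without a row identity: cannot locate the row")
    table.pop(key, None)
    if op == "update":
        table[change["new"]["id"]] = change["new"]


subscriber_table = {}
apply_change(subscriber_table, {"op": "insert", "new": {"id": 1, "name": "a"}})
apply_change(subscriber_table, {"op": "update", "key": 1, "new": {"id": 1, "name": "b"}})

try:
    # A change that carries no key, as from a table with no way to identify rows.
    apply_change(subscriber_table, {"op": "delete", "key": None})
    failed = False
except ValueError:
    failed = True

print(subscriber_table, failed)
```

An INSERT carries the full new row and needs no lookup, while an UPDATE or DELETE is meaningless unless the change says which existing row it refers to.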

What they want to hear

A senior should:

  • contrast the levels: physical is pages and the whole cluster, logical is rows and selected tables
  • name the logical scenarios: selective replication, an upgrade between major versions, aggregation into one sink
  • know the requirements: `wal_level=logical`, REPLICA IDENTITY for UPDATE/DELETE
  • name the limits: DDL is not replicated, sequences and large objects need attention

Pitfalls

  • ✗ Expecting logical replication to carry schema changes (DDL). It replicates only data
  • ✗ Forgetting REPLICA IDENTITY. Without it an UPDATE/DELETE on a published keyless table fails with an error on the publisher
  • ✗ Confusing selectivity: physical is all-or-nothing, you can pick tables only with logical

Follow-up

  • ? Why does logical replication need REPLICA IDENTITY for an UPDATE?
  • ? How does logical replication help an upgrade between major versions?
  • ? What happens to a subscription on an `ALTER TABLE` at the publisher?

Depth in knowledge base

  • Logical replication (publication/subscription)
  • WAL levels: minimal, replica, logical
  • Physical (streaming) replication
tags: replication, logical, publication
book: postgresql_internals-17.pdf:ch11 wal modes · codelibs.ru_mastering-postgresql-15-advanced-techniques-to-build-and-manage-scalable-reliable-and-fault-tolerant-database-applications-5-ed.pdf:logical replication

#replication-slots

intermediate · Often

What is a replication slot, and why is an abandoned slot dangerous?

What to say

A replication slot is a record on the primary that remembers the WAL position a specific consumer (a replica or a logical subscription) has read up to. While the slot exists, the primary must keep all WAL from that position onward and (for logical slots) must not clean up the catalog row versions needed for decoding. This saves a replica that dropped out for a while: on return it catches up, because the needed WAL is preserved. But the flip side is dangerous: if the consumer is gone for good and the slot was not removed, the primary piles up WAL until the disk runs out and the server stops. An abandoned slot is the classic cause of a suddenly full `pg_wal`.
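
How a slot pins WAL can be sketched as arithmetic on positions. This is a simplified model with integer byte positions; the function names and sample sizes are assumptions, and `max_keep_bytes=None` stands in for an unlimited `max_slot_wal_keep_size`.

```python
from typing import Optional

GIB = 1024 ** 3


def retained_wal_bytes(current_lsn: int, slot_restart_lsns: list) -> int:
    """WAL the server must keep: everything since the oldest slot position."""
    if not slot_restart_lsns:
        return 0
    return current_lsn - min(slot_restart_lsns)


def slot_exceeds_limit(current_lsn: int, restart_lsn: int,
                       max_keep_bytes: Optional[int]) -> bool:
    """True if the slot holds back more WAL than the configured bound allows."""
    if max_keep_bytes is None:
        return False  # no bound: retention grows until the disk is full
    return current_lsn - restart_lsn > max_keep_bytes


# Hypothetical state: one live replica close to the primary and one
# abandoned slot whose consumer stopped at the 2 GiB mark.
current = 10 * GIB
live_slot, abandoned_slot = 10 * GIB - 4096, 2 * GIB

retained = retained_wal_bytes(current, [live_slot, abandoned_slot])
over_limit = slot_exceeds_limit(current, abandoned_slot, max_keep_bytes=4 * GIB)
print(retained // GIB, over_limit)
```

The slowest slot alone decides how much WAL stays on disk, which is why a single forgotten slot is enough to fill `pg_wal`.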

What they want to hear

A senior should:

  • define a slot as the WAL retention point (and for logical, the retention of row versions) for a specific consumer
  • explain the benefit: a replica survives a temporary break without losing its position
  • name the danger: an abandoned slot holds WAL and fills the disk, and the server stops
  • give the practice: monitor `pg_replication_slots`, cap retention with `max_slot_wal_keep_size`, drop dead slots

Pitfalls

  • ✗ Creating a slot for a replica, removing the replica, and forgetting the slot. `pg_wal` will grow until failure
  • ✗ Not setting `max_slot_wal_keep_size` and trusting the disk will not fill
  • ✗ Confusing a slot with WAL itself. A slot is only a retention marker: it stores no WAL, but it forces the server to keep the real WAL files

Follow-up

  • ? What happens to `pg_wal` if a slot exists but the consumer does not?
  • ? Why do you need `max_slot_wal_keep_size`?
  • ? How does a logical slot differ from a physical one in what it retains?

Depth in knowledge base

  • Physical (streaming) replication
  • Logical replication (publication/subscription)
  • WAL: the write-ahead rule and LSN
tags: replication, slots, wal
book: codelibs.ru_mastering-postgresql-15-advanced-techniques-to-build-and-manage-scalable-reliable-and-fault-tolerant-database-applications-5-ed.pdf:replication slots · codelibs.ru_monitoring-postgresql.pdf:wal retention

#failover-split-brain

senior · Sometimes

What are failover and split-brain? Why is automatic switchover dangerous?

What to say

Failover is promoting a replica to primary when the former primary has failed. Technically it is a `promote`: a standby stops being read-only and starts accepting writes. The danger is split-brain: if the old primary is actually alive (only the network was down) and you promoted a replica, the cluster ends up with two primaries, both accepting writes, and the data diverges irreversibly. So reliable automatic failover requires a quorum arbiter and a fencing mechanism (reliably shut off the old primary, STONITH), not just a ping. Tools like Patroni build this on top of distributed consensus. A naive auto-failover by a ping timeout is a direct path to split-brain.
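
The majority rule behind quorum-based failover can be sketched in a few lines. This is a toy decision function, not how Patroni or any other tool is implemented: the names `has_quorum`, `next_role`, and `leader_lock_free` are hypothetical, and real systems add fencing and leases on top.

```python
def has_quorum(reachable_voters: int, total_voters: int) -> bool:
    """A node may act only if it can reach a strict majority of voters."""
    return reachable_voters > total_voters // 2


def next_role(current_role: str, leader_lock_free: bool,
              reachable_voters: int, total_voters: int) -> str:
    """Decide the role of a node after a health check."""
    if not has_quorum(reachable_voters, total_voters):
        return "standby"  # a primary cut off from the majority steps down
    if current_role == "primary":
        return "primary"
    # A standby promotes itself only when nobody holds the leader lock.
    return "primary" if leader_lock_free else "standby"


# Hypothetical partition in a three-voter cluster: the old primary is
# isolated (it counts only itself), the replica still reaches two voters.
old_primary = next_role("primary", leader_lock_free=False,
                        reachable_voters=1, total_voters=3)
replica = next_role("standby", leader_lock_free=True,
                    reachable_voters=2, total_voters=3)
print(old_primary, replica)
```

Because two disjoint groups cannot both hold a strict majority, at most one side of a partition ever decides to be primary, which a bare ping timeout cannot guarantee.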

What they want to hear

A senior should:

  • define failover as promoting a replica, and split-brain as two live primaries with diverging data
  • explain why a ping is not enough: a network partition is indistinguishable from a dead node
  • name the defense: a quorum, fencing/STONITH, external consensus (Patroni, etcd)
  • understand the trade-off: automation is faster but riskier, manual failover is slower but controlled

Pitfalls

  • ✗ Building auto-failover on one ping timeout with no quorum and no fencing. You get split-brain on a network partition
  • ✗ Promoting a replica without making sure the old primary is reliably shut off
  • ✗ Confusing promote (a logical promotion) with data recovery. Failover will not bring back diverged writes

Follow-up

  • ? Why can't unreachability over the network be equated with the primary's death?
  • ? What is fencing/STONITH, and why is it needed at failover?
  • ? What role does an external quorum play in a safe switchover?

Depth in knowledge base

  • Physical (streaming) replication
  • Distributed pitfalls: 2PC, multi-master, CAP
tags: replication, failover, ha
book: codelibs.ru_mastering-postgresql-15-advanced-techniques-to-build-and-manage-scalable-reliable-and-fault-tolerant-database-applications-5-ed.pdf:high availability

#distributed-pitfalls

senior · Sometimes

Is a replica a backup? What are the pitfalls of working with distributed data?

What to say

A replica is not a backup. It copies faithfully, mistakes and all: a `DROP TABLE` or a value corrupted by the application instantly travels to every replica. You need a separate backup (a base backup plus a WAL archive for PITR) so you can roll back to a point before the mistake. Other distributed traps: a read from an asynchronous replica returns stale data (read-your-writes breaks if you write to the primary and immediately read from a replica); synchronous replication adds commit latency; a failover loses the tail in async mode; distributed transactions across nodes require a two-phase commit and are not free. The main rule: replication is about availability, backup is about preservation, and these are different jobs.
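
One common way to keep read-your-writes with asynchronous replicas is to remember the WAL position of a session's last write and read from a replica only once it has replayed past that position. The sketch below shows that routing rule; it is an application-side idea under assumed names (`choose_node`, integer LSNs), not a built-in PostgreSQL feature.

```python
from typing import Optional


def choose_node(session_last_write_lsn: Optional[int],
                replica_replay_lsn: int) -> str:
    """Pick where a session's next read should go."""
    if session_last_write_lsn is None:
        return "replica"  # the session wrote nothing, any replica will do
    if replica_replay_lsn >= session_last_write_lsn:
        return "replica"  # the replica has already replayed the session's write
    return "primary"      # the replica is stale for this session


# Hypothetical session that committed at position 5000.
stale = choose_node(session_last_write_lsn=5000, replica_replay_lsn=4200)
caught_up = choose_node(session_last_write_lsn=5000, replica_replay_lsn=5100)
print(stale, caught_up)
```

The rule trades some load on the primary for consistency: only sessions that have just written are sent back to it, and only until the replica catches up.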

What they want to hear

A senior should:

  • firmly say "a replica is not a backup" and explain why: it replicates mistakes too
  • name PITR (a base backup plus a WAL archive) as the real defense against logical errors
  • list the traps: stale reads from a replica, broken read-your-writes, a lost tail on an async failover
  • separate the goals: replication is availability, backup is preservation

Pitfalls

  • ✗ Treating a replica as a backup substitute. A `DROP TABLE` repeats on every replica in seconds
  • ✗ Reading from an asynchronous replica right after a write to the primary and expecting fresh data
  • ✗ Thinking a failover in async mode loses nothing. The tail of unreplicated transactions vanishes

Follow-up

  • ? Why does replication not protect against an accidental `DELETE` without a `WHERE`?
  • ? What does PITR give that a replica does not?
  • ? How does read-your-writes break when reading from an asynchronous replica?

Depth in knowledge base

  • Distributed pitfalls: 2PC, multi-master, CAP
  • Backup and point-in-time recovery (PITR)
  • Logical replication (publication/subscription)
tags: replication, backup, consistency
book: codelibs.ru_postgresql-mistakes-and-how-to-avoid-them.pdf:replication is not backup · codelibs.ru_mastering-postgresql-15-advanced-techniques-to-build-and-manage-scalable-reliable-and-fault-tolerant-database-applications-5-ed.pdf:backup and recovery