PostgreSQL 17 Logical Replication Failover Slots HA Playbook

Date: 2026-03-23
Category: knowledge
Scope: How to make PostgreSQL logical subscribers survive primary failover without re-snapshotting.

1) Why this matters

Before PostgreSQL 17 introduced failover slots, promoting a standby often meant painful subscriber surgery:

- re-creating logical slots,
- risking data gaps or duplicates,
- sometimes re-bootstrapping large tables.

PostgreSQL 17 formalizes logical replication failover so subscriber continuity can survive primary failover if you wire the slot-sync pipeline correctly.

2) Core mental model

A subscription can continue after failover only if:

- Its logical slot is marked failover = true on the publisher side.
- Slot state was synchronized to the standby in time.
- The standby has a usable synced slot at promotion time.
- Subscriber conninfo is switched to the new primary.

Think of it as two lanes:

- Data lane: WAL shipping (physical replication)
- Control lane: logical slot state sync

Both must be healthy.
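
As a rough illustration of the two lanes, the queries below look at each one separately from the current primary. The views and columns are standard PostgreSQL 17 catalog objects; what counts as "healthy" (for example, an acceptable gap between sent_lsn and replay_lsn) is an assumption to adapt to your environment.

```sql
-- Data lane: physical streaming replication to the standby (run on the primary).
SELECT application_name,
       state,
       sent_lsn,
       replay_lsn
FROM pg_stat_replication;

-- Control lane: logical slots that are eligible to be synced to the standby.
SELECT slot_name,
       failover,
       confirmed_flush_lsn
FROM pg_replication_slots
WHERE slot_type = 'logical';
```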

3) Required wiring (minimum viable HA)

3.1 Publisher / primary

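- Make sure the logical slot is failover-enabled: create the subscription (on the subscriber) with failover = true, or create the logical slot on the primary yourself with failover enabled.
- Set synchronized_standby_slots to the physical slot(s) of the candidate failover standby(s). Logical walsenders then wait for those standbys to confirm WAL receipt before sending changes to subscribers, so logical decoding does not outrun standby durability.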
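
A minimal sketch of the primary-side wiring, assuming a single candidate standby that streams through a physical slot named standby1_slot (a hypothetical name; substitute your own). It uses ALTER SYSTEM plus a configuration reload rather than editing postgresql.conf by hand.

```sql
-- Run on the primary. 'standby1_slot' is a hypothetical physical slot name.
SELECT pg_create_physical_replication_slot('standby1_slot');

-- Make logical walsenders wait for this standby before sending changes downstream.
ALTER SYSTEM SET synchronized_standby_slots = 'standby1_slot';
SELECT pg_reload_conf();

-- Confirm the setting is active.
SHOW synchronized_standby_slots;
```

The trade-off is that subscribers now wait on the listed standby: if it stops confirming WAL, logical replication to subscribers stalls until it catches up or the setting is changed.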

3.2 Physical standby (future primary)

On the standby, configure:

- sync_replication_slots = true
- hot_standby_feedback = on
- primary_slot_name (the physical slot on the upstream primary)
- primary_conninfo with a valid dbname (the slot sync worker needs a database to connect to)

If any of these is missing, slot synchronization does not run and failover slots never reach the standby.
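
The standby settings above can be sketched as follows. The slot name standby1_slot is hypothetical, and the connection values reuse the host, database, and user from the subscription example in this playbook, with the password assumed to come from a password file. Note that ALTER SYSTEM SET primary_conninfo replaces the existing value, so carry over any options you already rely on.

```sql
-- Run on the standby. Slot name and connection details are illustrative assumptions.
ALTER SYSTEM SET sync_replication_slots = on;
ALTER SYSTEM SET hot_standby_feedback = on;
ALTER SYSTEM SET primary_slot_name = 'standby1_slot';
ALTER SYSTEM SET primary_conninfo = 'host=primary-db dbname=app user=repl';
SELECT pg_reload_conf();

-- Once the slot sync worker has run, failover slots from the primary appear here.
SELECT slot_name, synced, failover, temporary
FROM pg_replication_slots
WHERE slot_type = 'logical';
```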

4) Subscription creation patterns

4.1 Preferred: explicit failover-enabled subscription

CREATE SUBSCRIPTION sub_orders
CONNECTION 'host=primary-db dbname=app user=repl password=***'
PUBLICATION pub_orders
WITH (
  create_slot = true,
  slot_name = 'sub_orders',
  copy_data = false,
  failover = true
);

4.2 Deferred/manual slot mode (advanced)

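If create_slot = false, make sure the failover property of the slot on the publisher matches the failover option of the subscription. A mismatch produces confusing behavior: the slot may be synced to standbys even though the subscription says failover is off, or skipped by slot sync even though the subscription says failover is on.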
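
A sketch of the manual-slot pattern, reusing the sub_orders and pub_orders names from 4.1. It assumes the default pgoutput plugin and passes the failover flag as the fifth positional argument of pg_create_logical_replication_slot.

```sql
-- On the publisher: create the slot with failover enabled.
-- Arguments: slot_name, plugin, temporary, twophase, failover.
SELECT pg_create_logical_replication_slot('sub_orders', 'pgoutput', false, false, true);

-- On the subscriber: attach to the existing slot with a matching failover setting.
CREATE SUBSCRIPTION sub_orders
CONNECTION 'host=primary-db dbname=app user=repl password=***'
PUBLICATION pub_orders
WITH (
  create_slot = false,
  slot_name = 'sub_orders',
  copy_data = false,
  failover = true
);

-- On the publisher: confirm the slot-side flag agrees with the subscription.
SELECT slot_name, failover
FROM pg_replication_slots
WHERE slot_name = 'sub_orders';
```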

5) Pre-failover readiness checklist (must-pass)

5.1 On subscriber: list main slots tied to failover-enabled subscriptions

SELECT array_agg(quote_literal(s.subslotname)) AS slots
FROM pg_subscription s
WHERE s.subfailover
  AND s.subslotname IS NOT NULL;

5.2 On subscriber: list relevant table-sync slots (finished copy only)

SELECT array_agg(quote_literal(slot_name)) AS slots
FROM (
  SELECT CONCAT('pg_', srsubid, '_sync_', srrelid, '_', ctl.system_identifier) AS slot_name
  FROM pg_control_system() ctl,
       pg_subscription_rel r,
       pg_subscription s
  WHERE r.srsubstate = 'f'
    AND s.oid = r.srsubid
    AND s.subfailover
) t;

5.3 On target standby: confirm slots are failover-ready

SELECT slot_name,
       (synced AND NOT temporary AND invalidation_reason IS NULL) AS failover_ready
FROM pg_replication_slots
WHERE slot_name IN ('sub1','sub2','sub3');

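Replace the placeholder names in the IN list with the slot names collected in 5.1 and 5.2. Only promote when every critical slot appears in the result and shows failover_ready = true; a slot that is absent from the output has not been synced at all.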
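
One possible variant of the readiness check, sketched under the assumption that the expected slot names are pasted in by hand: a LEFT JOIN against the expected list makes a slot that is missing on the standby show up as not ready instead of silently dropping out of the result. The names sub1, sub2, and sub3 are the same placeholders used above.

```sql
-- Run on the target standby. Replace the VALUES list with the names from 5.1 and 5.2.
WITH expected(slot_name) AS (
  VALUES ('sub1'), ('sub2'), ('sub3')
)
SELECT e.slot_name,
       (s.slot_name IS NOT NULL) AS present_on_standby,
       COALESCE(s.synced AND NOT s.temporary AND s.invalidation_reason IS NULL,
                false) AS failover_ready
FROM expected e
LEFT JOIN pg_replication_slots s ON s.slot_name = e.slot_name
ORDER BY e.slot_name;
```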

6) Failover runbook (planned event)

1. Pause apply on the subscribers:
   - ALTER SUBSCRIPTION ... DISABLE; (recommended before promotion).
2. Promote the standby to primary.
3. Update subscriber connection strings:
   - ALTER SUBSCRIPTION ... CONNECTION 'host=new-primary ...';
4. Re-enable subscriptions:
   - ALTER SUBSCRIPTION ... ENABLE;
5. Validate that confirmed_flush_lsn shows no gap or regression, and run app-level monotonic checks.

Why disable first? If the old primary is still reachable, subscribers may keep consuming from it until their conninfo is switched, risking divergence.
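
The runbook can be sketched as the following sequence for a single subscription. The subscription name sub_orders and the connection values are carried over from the earlier example, and the host new-primary stands in for whatever address the promoted standby has. pg_promote() is one way to promote; pg_ctl promote or your HA tooling are equally valid.

```sql
-- 1. On each subscriber: stop apply before the promotion.
ALTER SUBSCRIPTION sub_orders DISABLE;

-- 2. On the standby being promoted.
SELECT pg_promote();

-- 3. On each subscriber: point the subscription at the new primary.
ALTER SUBSCRIPTION sub_orders
  CONNECTION 'host=new-primary dbname=app user=repl password=***';

-- 4. On each subscriber: resume apply.
ALTER SUBSCRIPTION sub_orders ENABLE;

-- 5. On the new primary: the slot should become active and confirmed_flush_lsn should advance.
SELECT slot_name, active, confirmed_flush_lsn
FROM pg_replication_slots
WHERE slot_name = 'sub_orders';
```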

7) Operational observability queries

7.1 Slot health on candidate primary/standby

SELECT slot_name,
       slot_type,
       failover,
       synced,
       active,
       wal_status,
       restart_lsn,
       confirmed_flush_lsn,
       invalidation_reason
FROM pg_replication_slots
ORDER BY slot_name;

7.2 Subscription posture on subscriber

SELECT subname,
       subenabled,
       subfailover,
       subslotname,
       subtwophasestate,
       subsynccommit
FROM pg_subscription
ORDER BY subname;

7.3 Table sync status that can influence slot expectations

SELECT s.subname,
       r.srrelid::regclass AS relation,
       r.srsubstate,
       r.srsublsn
FROM pg_subscription_rel r
JOIN pg_subscription s ON s.oid = r.srsubid
ORDER BY s.subname, relation;

8) Common failure modes

- failover = true forgotten at creation: the subscriber appears healthy until the first failover drill.
- Standby missing sync_replication_slots / primary_slot_name / hot_standby_feedback: slot sync is silently incomplete; promotion breaks logical continuity.
- No synchronized_standby_slots on primary: the logical consumer can run ahead of standby durability; the failover-ready window becomes fragile.
- Promoting with non-persistent synced slots: synced = true alone is insufficient if the slot is temporary or invalidated.
- Skipping pre-promotion slot readiness SQL: you discover missing slot state only after cutover.
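
For the first failure mode, an existing subscription can usually be retrofitted instead of re-created. The sketch below assumes the sub_orders subscription from section 4 and relies on the PostgreSQL 17 behavior that the failover option can only be altered while the subscription is disabled; verify this against the ALTER SUBSCRIPTION documentation for your exact version before using it.

```sql
-- On the subscriber. Run these as separate statements, not inside a transaction block.
ALTER SUBSCRIPTION sub_orders DISABLE;
ALTER SUBSCRIPTION sub_orders SET (failover = true);
ALTER SUBSCRIPTION sub_orders ENABLE;

-- On the subscriber: confirm the subscription-side flag.
SELECT subname, subfailover
FROM pg_subscription
WHERE subname = 'sub_orders';
```

Afterwards, check the failover column of the matching slot in pg_replication_slots on the publisher so both sides agree.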

9) Practical SLO guardrails

- Require a green readiness query before promotion approvals.
- Alert on invalidation_reason IS NOT NULL for logical slots.
- Track slot lag budgets (pg_current_wal_lsn() versus each slot's restart_lsn and confirmed_flush_lsn).
- Treat failover drills as a recurring game day, not a one-time setup.
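
The lag and invalidation guardrails might be sketched as two monitoring queries. They are meant to run on the primary, since pg_current_wal_lsn() is not available during recovery; the column aliases are illustrative, and the thresholds you alert on are left to your own budgets.

```sql
-- Run on the primary: WAL each logical slot has yet to confirm, and WAL it retains.
SELECT slot_name,
       pg_wal_lsn_diff(pg_current_wal_lsn(), confirmed_flush_lsn) AS confirm_lag_bytes,
       pg_wal_lsn_diff(pg_current_wal_lsn(), restart_lsn)         AS retained_wal_bytes,
       wal_status
FROM pg_replication_slots
WHERE slot_type = 'logical'
ORDER BY confirm_lag_bytes DESC NULLS LAST;

-- Alert condition: any logical slot that has been invalidated.
SELECT slot_name, invalidation_reason
FROM pg_replication_slots
WHERE slot_type = 'logical'
  AND invalidation_reason IS NOT NULL;
```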

10) Bottom line

PostgreSQL 17 makes logical failover far more operationally sane, but only if you treat slot synchronization as a first-class HA dependency.

If you run logical replication in production, add failover-slot readiness checks to your promotion gate the same way you gate on replication lag and application health.

References

PostgreSQL 17 docs — Logical Replication Failover: https://www.postgresql.org/docs/17/logical-replication-failover.html
PostgreSQL 17 docs — Logical Decoding Concepts (slot synchronization): https://www.postgresql.org/docs/17/logicaldecoding-explanation.html
PostgreSQL 17 docs — CREATE SUBSCRIPTION (failover option): https://www.postgresql.org/docs/17/sql-createsubscription.html
PostgreSQL 17 docs — ALTER SUBSCRIPTION: https://www.postgresql.org/docs/17/sql-altersubscription.html
PostgreSQL 17 docs — pg_replication_slots view: https://www.postgresql.org/docs/17/view-pg-replication-slots.html
PostgreSQL parameter notes — sync_replication_slots: https://postgresqlco.nf/doc/en/param/sync_replication_slots/
PostgreSQL parameter notes — synchronized_standby_slots: https://postgresqlco.nf/doc/en/param/synchronized_standby_slots/