For database engineers, data platform teams, and Python ETL developers automating change data capture (CDC) pipelines, pg_create_logical_replication_slot is the deterministic entry point for streaming logical WAL changes. This guide isolates the exact operational sequence required to provision, tune, and validate a logical replication slot for production CDC workloads. Proper execution prevents WAL bloat, ensures consumer offset consistency, and aligns with enterprise Logical Replication Setup & Management standards.
Prerequisite Parameter Tuning
Before invoking the slot creation function, the PostgreSQL instance must be configured to retain and decode WAL records at the cluster level. The following parameters require precise tuning and a full service restart:
wal_level = logical(mandatory; cannot be hot-reloaded)max_replication_slots >= N + 2(whereNis the number of concurrent CDC consumers; default10is often insufficient for multi-tenant data platforms)max_wal_senders >= max_replication_slots + 2(must exceed slot count to allow connection overhead and background workers)max_slot_wal_keep_size(set to a strict upper bound, for example10GBor20GB, to enforce automatic slot invalidation if a consumer stalls and prevent unbounded disk exhaustion)
Apply these via postgresql.conf or ALTER SYSTEM, restart the cluster, and verify with SHOW wal_level; and SHOW max_replication_slots;.
Step 1: Privilege & Database Boundary Validation
Logical replication slots are strictly database-scoped. The executing role must hold REPLICATION privileges and connect directly to the target database. Cross-database slot creation is unsupported by the PostgreSQL architecture.
Grant privileges explicitly:
ALTER ROLE cdc_pipeline_user REPLICATION;
GRANT CONNECT ON DATABASE target_db TO cdc_pipeline_user;
Verify role capabilities before proceeding. A role without the REPLICATION attribute cannot create logical slots and will receive ERROR: must be superuser or replication role to use replication slots.
Step 2: Execute Slot Creation
The core invocation follows this exact signature. Parameter selection must align with your CDC consumer architecture and PostgreSQL version baseline (14+):
SELECT pg_create_logical_replication_slot(
slot_name := 'cdc_etl_pipeline_v1',
plugin := 'pgoutput',
temporary := false,
two_phase := false
);
plugin:pgoutputis the native PostgreSQL 10+ output plugin. It streams changes in binary format, supports publication filtering, and minimizes CPU overhead compared to JSON-based decoders.temporary: Set tofalsefor persistent pipelines. Temporary slots drop on disconnect, which destroys offset tracking and forces full resyncs in Python ETL consumers.two_phase: Enable (true) only if your pipeline requires decodingPREPARE TRANSACTION/COMMIT PREPAREDsequences. Defaults tofalse.
PostgreSQL 17+ only: A failover parameter was added in PostgreSQL 17 that, when set to true, allows a logical slot to be synchronized to physical standbys via pg_sync_replication_slots() and survive physical standby promotion. This parameter is not available in PostgreSQL 16 or earlier.
-- PostgreSQL 17+ only
SELECT pg_create_logical_replication_slot(
slot_name := 'cdc_etl_pipeline_v1',
plugin := 'pgoutput',
temporary := false,
two_phase := false,
failover := true
);
For architectural context on how this function fits into the broader replication topology, consult Initializing Replication Slots before proceeding to consumer binding.
Step 3: Validate Offset Anchoring & State
Immediately after creation, verify the slot’s initial state and anchor LSNs. The function returns (slot_name, lsn), where lsn represents the WAL position at creation time.
SELECT slot_name, plugin, slot_type, active, restart_lsn, confirmed_flush_lsn, wal_status
FROM pg_replication_slots
WHERE slot_name = 'cdc_etl_pipeline_v1';
restart_lsn: The oldest WAL position the slot requires. The server will retain WAL from this point forward.confirmed_flush_lsn: InitiallyNULLfor a newly created slot that has not yet had a consumer connect. Advances only when the consumer explicitly acknowledges processed changes.wal_status: Monitor forextendedorlost. A value ofextendedindicates the slot is retaining more WAL thanmax_slot_wal_keep_sizeallows and will be invalidated if the consumer does not advance.
Step 4: Consumer Integration & Acknowledgment Flow
Python ETL pipelines must consume via the logical replication protocol and explicitly advance the confirmed LSN. Using psycopg2’s LogicalReplicationConnection, the consumer must send periodic standby status update messages with the latest processed LSN. Failure to send these ACKs causes confirmed_flush_lsn to stall, triggering WAL accumulation.
import psycopg2
from psycopg2.extras import LogicalReplicationConnection
DSN = "host=primary-db dbname=cdc_source user=replicator replication=database"
conn = psycopg2.connect(DSN, connection_factory=LogicalReplicationConnection)
cur = conn.cursor()
cur.start_replication(
slot_name='cdc_etl_pipeline_v1',
options={'proto_version': '1', 'publication_names': 'cdc_pub'},
decode=False
)
def process_change(msg):
# Parse msg.payload here, then acknowledge
cur.send_feedback(flush_lsn=msg.data_start) # advances confirmed_flush_lsn
cur.consume_stream(process_change)
Refer to the official Logical Decoding documentation for protocol message formats and LSN arithmetic rules.
Step 5: Async Monitoring & WAL Guardrails
Deploy asynchronous monitoring that queries pg_replication_slots and pg_stat_replication at 15–30 second intervals. Track the delta between pg_current_wal_lsn() and restart_lsn to calculate consumer lag.
SELECT
slot_name,
pg_wal_lsn_diff(pg_current_wal_lsn(), restart_lsn) AS pending_bytes,
wal_status,
active
FROM pg_replication_slots
WHERE slot_name = 'cdc_etl_pipeline_v1';
Alert thresholds:
pending_bytes > 5 GB: Trigger PagerDuty/Slack alert for consumer lag investigation.wal_status = 'extended': Immediate intervention required. The slot is approachingmax_slot_wal_keep_sizeand will be invalidated if not advanced.active = false: Consumer disconnected. Iftemporary = false, the slot persists but WAL retention continues.
Step 6: Error Handling, Retry Logic & Emergency Failover
Production CDC pipelines must handle slot invalidation, network partitions, and LSN gaps gracefully.
- Slot Invalidation Recovery: If
max_slot_wal_keep_sizeis breached, PostgreSQL marks the slot invalid and drops it. The ETL pipeline must detect theERROR: replication slot "..." does not existexception, trigger a full snapshot resync, and recreate the slot. - Retry Logic: Implement exponential backoff for connection drops. Do not attempt to recreate an existing slot; use
SELECT pg_replication_slot_advance('slot_name', 'target_lsn')to manually reconcile offsets if the consumer processed changes but failed to ACK. - Emergency Failover: During primary failover, logical replication slots do not automatically migrate to the new primary unless
failover = truewas set at creation time (PostgreSQL 17+). For earlier versions, manually recreate the slot on the promoted standby and coordinate a snapshot-based catch-up. Ensure your orchestration layer handles publication recreation and subscription sync procedures before restarting the ETL consumer.
Maintain strict idempotency in your deployment scripts. Slot creation is not idempotent by default; wrap it in a guard:
DO $$
BEGIN
IF NOT EXISTS (
SELECT 1 FROM pg_replication_slots WHERE slot_name = 'cdc_etl_pipeline_v1'
) THEN
PERFORM pg_create_logical_replication_slot('cdc_etl_pipeline_v1', 'pgoutput');
END IF;
END $$;