Configuring the Debezium PostgreSQL connector requires precise alignment between database-level logical replication parameters, Kafka Connect runtime constraints, and downstream CDC pipeline automation workflows. This guide details production-ready configuration patterns for database engineers, data platform teams, and Python ETL developers deploying change data capture at scale.
PostgreSQL Logical Replication Prerequisites
Before deploying the connector, PostgreSQL must expose transaction logs via logical decoding. Set wal_level = logical in postgresql.conf and restart the instance. Allocate sufficient replication slots by tuning max_replication_slots and max_wal_senders to accommodate concurrent connector deployments and failover scenarios. Create a dedicated publication targeting only the required tables to minimize WAL bloat and network overhead:
CREATE PUBLICATION cdc_prod_publication FOR TABLE public.orders, public.users, public.inventory;
The connector consumes from this publication using a persistent replication slot. Dropping a slot prematurely forces a full resnapshot and breaks downstream state consistency. For end-to-end pipeline architecture, refer to the foundational CDC Pipeline Implementation with Python & Debezium documentation. Consult the official PostgreSQL Logical Replication documentation for slot lifecycle management and WAL retention policies.
Idempotent Deployment Payload
Deploy the connector via the Kafka Connect REST API using an idempotent PUT request. This guarantees that repeated deployments converge to the same configuration state without triggering duplicate connector instances. Critical parameters dictate behavior, fault tolerance, and data fidelity:
{
"name": "pg-cdc-connector",
"config": {
"connector.class": "io.debezium.connector.postgresql.PostgresConnector",
"database.hostname": "pg-primary.internal",
"database.port": "5432",
"database.user": "cdc_replicator",
"database.password": "${file:/opt/kafka/config/secrets/db-pass.properties:password}",
"database.dbname": "analytics_prod",
"topic.prefix": "pg-analytics",
"plugin.name": "pgoutput",
"publication.name": "cdc_prod_publication",
"slot.name": "debezium_pg_analytics",
"slot.drop.on.stop": "false",
"snapshot.mode": "initial",
"snapshot.locking.mode": "minimal",
"snapshot.isolation.mode": "repeatable_read",
"heartbeat.interval.ms": "10000",
"poll.interval.ms": "500",
"max.batch.size": "2048",
"max.queue.size": "8192",
"tombstones.on.delete": "true",
"key.converter": "org.apache.kafka.connect.json.JsonConverter",
"value.converter": "org.apache.kafka.connect.json.JsonConverter",
"key.converter.schemas.enable": "false",
"value.converter.schemas.enable": "false"
}
}
Secrets must never be hardcoded. Use the ${file:...} or ${env:...} syntax to inject credentials at runtime. Deploy idempotently using the Kafka Connect REST API PUT /connectors/{name}/config endpoint.
Snapshot Strategy & Performance Tuning
The initial snapshot phase dictates baseline data latency and database load. snapshot.locking.mode controls table lock duration during the initial sync:
minimal: Acquires locks only during the initial table scan. Reduces production impact but may capture inconsistent states if concurrent DDL occurs.extended: Holds locks across the entire snapshot. Guarantees strict consistency but extends initial sync windows.none: Skips locking entirely. Only safe for append-only or read-only tables.
For large tables, chunk-based snapshotting (snapshot.fetch.size) significantly reduces peak memory usage on the Kafka Connect worker. Tune this value together with max.batch.size and available JVM heap to prevent OOM kills during the initial sync.
Memory Allocation & Runtime Stability
Debezium runs within the Kafka Connect JVM. Misconfigured batch and queue sizes directly cause heap exhaustion and container OOM kills. The max.batch.size and max.queue.size parameters must scale proportionally with available JVM heap (KAFKA_HEAP_OPTS). A common production baseline allocates 4–8 GB heap with max.batch.size=2048 and max.queue.size=8192. Monitor kafka.connect.source-record-poll-rate and kafka.connect.source-record-active-count metrics to detect backpressure.
Downstream Serialization & Parsing
JSON converters are suitable for development, but production pipelines require schema evolution control and compact serialization. Switch to io.confluent.connect.avro.AvroConverter or io.debezium.converters.CloudEventsConverter to enforce schema registry compliance. Enable tombstones.on.delete=true to propagate delete events explicitly, allowing downstream consumers to handle soft vs. hard deletes deterministically. When transforming raw CDC payloads, implement JSON to Avro Transformation to standardize field types and reduce network payload size. Python ETL developers should leverage Python CDC Parser Development to deserialize before/after payloads, handle schema versioning, and route events to analytical sinks without blocking the main consumer loop.
Debugging & Verification Workflow
Idempotent configuration requires a repeatable verification loop:
1. Deploy & validate state
curl -X PUT http://kafka-connect:8083/connectors/pg-cdc-connector/config \
-H "Content-Type: application/json" \
-d @connector-config.json
Verify state transitions: curl http://kafka-connect:8083/connectors/pg-cdc-connector/status. Expect RUNNING for both connector and tasks.
2. Offset & slot verification
Check committed offsets: curl http://kafka-connect:8083/connectors/pg-cdc-connector/offsets. Cross-reference with PostgreSQL slot lag:
SELECT slot_name, restart_lsn, confirmed_flush_lsn,
pg_wal_lsn_diff(pg_current_wal_lsn(), confirmed_flush_lsn) AS pending_bytes
FROM pg_replication_slots WHERE slot_name = 'debezium_pg_analytics';
A growing pending_bytes value indicates downstream consumer lag or connector stall.
3. Failure recovery
If the connector enters FAILED state, do not delete it. Inspect the task trace via the status endpoint, resolve the root cause (for example, network timeout, schema mismatch, or WAL gap), and issue another PUT request. Kafka Connect will resume from the last committed offset. Ensure slot.drop.on.stop remains false to prevent slot loss during maintenance windows.