Quick Logical Replication Checklist

by gorthx

By request…

Before starting the publication/subscription, check your schemas.
(Adjust if you’re not replicating all tables, or have dropped indexes/constraints to make the initial sync go faster)
– Do you have the correct number of tables on both sides?
– Do you have the correct number of indexes on both sides?
– Do you have the same number of constraints on both sides?
– Are all of the indexes and constraints valid?

After starting replication, check the logs on the publisher and the subscriber for errors.
– Investigate anything unusual.
– If you’re getting duplicate key errors, check the old database logs first – this may be normal!
– If it’s not normal, this can indicate something is writing to the subscriber. Find it and make it stop1, then fix the situation2.
– If you are getting duplicate key errors after reversing the direction of repication: did you remember to copy the sequences over?

Then connect to the databases and check them out from psql.

On the publisher:

/* "describe publication" - is this information correct? */
\dRp+


/*
pg_replication_slots.slot_name should be the name of the subscription (as it appears on the subscriber)
pg_replication_slots.type should be 'logical'
pg_replication_slots.active should be 't'
*/
SELECT * FROM pg_replication_slots;


/*
pg_stat_replication.application_name should be the name of the subscription (as it appears on the subscriber)
pg_stat_replication.state should be 'streaming'
*/
SELECT * FROM pg_stat_replication;


/* This query should return nothing unexpected: */
SELECT schemaname, relname FROM pg_stat_user_tables
WHERE relname NOT IN (SELECT tablename FROM pg_publication_tables);


On the subscriber:

/* "describe subscription" - is this information correct? */
\dRs+

/* When replication is caught up, pg_subscription_rel.srsubstate will be 'r' for all tables */
SELECT srrelid::regclass, srsubstate, srsublsn
FROM pg_subscription_rel order by srsubstate;
/*
srsubstate key:
i = initialize
d = data is being copied
f = finished table copy
s = synchronized
r = ready (normal replication)
*/


Spot-check data to verify publisher and subscriber match.


If you switch replication direction, run the above checks again.


1 – You may want to enable `default_transaction_read_only` mode on the subscriber. Yes, logical replication will still work. If you’re on AWS, make the change to your parameter group, as you can’t use `ALTER SYSTEM`.
2 – Fix your data by either starting over from the initial copy step, or refilling just the affected tables (briefly: drop table from publication, refresh subscription, truncate table on the subscriber, add table back to publication, refresh subscription3). Depending on the extent of the damage, it can be better to just start over.
3 – It’s not a bad idea to rehearse this particular scenario in staging.

Advertisement
%d bloggers like this: