Tech News

How to Set Up Logical Replication with PostgreSQL 10 on Ubuntu 18.04


PostgreSQL is an open-source object-relational database system that is highly extensible and compliant with ACID (Atomicity, Consistency, Isolation, Durability) and the SQL standard. Version 10.0 of PostgreSQL introduced support for logical replication, in addition to physical replication. In a logical replication scheme, high-level write operations are streamed from a master database server into one or more replica database servers. In a physical replication scheme, binary write operations are instead streamed from master to replica, producing a byte-for-byte exact copy of the original content. In cases where you would like to target a particular subset of data, such as off-load reporting, patching, or upgrading, logical replication can offer speed and flexibility.

Setting up an application for production often useful to have multiple copies of your database in place. The process of keeping database copies in sync is called replication. Replication can provide high-availability horizontal scaling for high volumes of simultaneous read operations, along with reduced read latencies. It also allows for peer-to-peer replication between geographically distributed database servers.

In this guide, we will show you how to set up logical replication with PostgreSQL 10 on two Ubuntu 18.04 servers, with one server acting as the master and the other as the replica. At the end of this guide, you will be able to replicate data from the master server to the replica using logical replication.

Prerequisites
To begin with this guide, you will need two Ubuntu 18.04 servers, which we'll name dbs-master and dbs-replica, each set up with a regular user account and sudo privileges. You should have installed PostgreSQL 10 on both servers before proceeding with this tutorial.

Configuring PostgreSQL for Logical Replication
There are several configuration settings you will need to modify to enable logical replication between your servers. First, you'll configure Postgres to listen on your private network interface. Then you'll configure the appropriate settings to allow replication to dbs-replica.

On dbs-master, open /etc/postgresql/10/main/postgresql.conf, the main server configuration file:

sudo nano /etc/postgresql/10/main/postgresql.conf

Find the following line:

#listen_addresses = 'localhost' # what IP address(es) to listen on;

Uncomment it by removing the #, and add the following to enable network communication:

listen_addresses = 'dbs-master, 172.22.10.100'

This makes dbs-master listen for incoming connections on the private network.

Next, find the following line:

#wal_level = replica # minimal, replica, or logical

Uncomment it, and change it to set the PostgreSQL Write Ahead Log (WAL) level to logical. This increases the volume of entries in the log, adding the necessary information for extracting discrepancies or changes to particular data sets:

wal_level = logical

The entries on this log will be consumed by the replica server, allowing for the replication of the high-level write operations from the master.

Save the file and close it.

Next, let's edit /etc/postgresql/10/main/pg_hba.conf, the file that controls allowed hosts, authentication, and access to databases:

sudo nano /etc/postgresql/10/main/pg_hba.conf

After the last line, let's add a line to allow incoming network connections from dbs-replica. We'll use dbs-replica's private IP address, and specify that connections are allowed from all users and databases:

# TYPE DATABASE USER ADDRESS METHOD
host all all 172.22.10.110/24 md5

Incoming network connections will now be allowed from dbs-replica, authenticated by a password hash (md5).

Save the file and close it.

Next, let's set our firewall rules to allow traffic from dbs-replica to port 5432 on dbs-master:

sudo ufw allow from 172.22.10.110 to any port 5432

Finally, restart the PostgreSQL server for the changes to take effect:

sudo systemctl restart postgresql

With your configuration set to allow logical replication, you can now move on to creating a database, user role, and table.

Setting Up a Database, User Role, and Table
To test the functionality of your replication settings, let's create a database, table, and user role. You will create an sample database with a sample table, which you can then use to test logical replication between your servers. You will also create a dedicated user and assign them privileges over both the database and the table.

First, open the psql prompt as the postgres user with the following command on both dbs-master and dbs-replica and perform following steps on both servers at the same time.

sudo -u postgres psql

Create a new database called sample on both hosts:

postgres=# CREATE DATABASE sample;

Using the \connect meta-command, connect to the databases you just created on each host:

postgres=# \c sample

Create a new table called widgets with arbitrary fields on both hosts:

sample=# CREATE TABLE widgets
sample=# (
sample=# id SERIAL,
sample=# name TEXT,
sample=# price DECIMAL,
sample=# CONSTRAINT widgets_pkey PRIMARY KEY (id)
sample=# );

The table on dbs-replica does not need to be identical to its dbs-master counterpart. However, it must contain every single column present on the table at dbs-master. Additional columns must not have NOT NULL or other constraints. If they do, replication will fail.

On dbs-master, let's create a new user role with the REPLICATION option and a login password. The REPLICATION attribute must be assigned to any role used for replication. We will call our user sampleuser, but you can replace this with your own username. Make sure to also replace P@ssw0rd with your own secure password:

postgres=# CREATE ROLE sampleuser WITH REPLICATION LOGIN PASSWORD 'P@ssw0rd';

Make a note of your password, as you will use it later on dbs-replica to set up replication.

Still on dbs-master, grant full privileges on the sample database to the user role you just created:

sample=# GRANT ALL PRIVILEGES ON DATABASE sample TO sampleuser;

Next, grant privileges on all of the tables contained in the database to your user:

sample=# GRANT ALL PRIVILEGES ON ALL TABLES IN SCHEMA public TO sampleuser;

The public schema is a default schema in each database into which tables are automatically placed.

With these privileges set, you can now move on to making the tables in your sample database available for replication.

Setting Up a Publication
Publications are the mechanism that PostgreSQL uses to make tables available for replication. The database server will keep track internally of the connection and replication status of any replica servers associated with a given publication. On dbs-master, you will create a publication, sample_publication, that will function as a master copy of the data that will be sent to your subscribers — in our case, dbs-replica.

On dbs-master, create a publication called sample_publication:

sample=# CREATE PUBLICATION sample_publication;

Add the widgets table you created previously to it:

sample=# ALTER PUBLICATION sample_publication ADD TABLE widgets;

With your publication in place, you can now add a subscriber that will pull data from it.

Creating a Subscription
Subscriptions are used by PostgreSQL to connect to existing publications. A publication can have many subscriptions across different replica servers, and replica servers can also have their own publications with subscribers. To access the data from the table you created on dbs-master, you will need to create a subscription to the publication you created in the previous step, sample_publication.

On dbs-replica, let's create a subscription called sample_subscription. The CREATE SUBSCRIPTION command will name the subscription, while the CONNECTION parameter will define the connection string to the publisher. This string will include the master server's connection details and login credentials, including the username and password you defined earlier, along with the name of the sample database. Once again, remember to use dbs-master's private IP address, and replace P@ssw0rd with your own password:

sample=# CREATE SUBSCRIPTION sample_subscription CONNECTION 'host=172.22.10.100 port=5432 password=P@ssw0rd user=sampleuser dbname=sample' PUBLICATION sample_publication;

You will see the following output confirming the subscription:

Output
NOTICE: created replication slot "sample_subscription" on publisher
CREATE SUBSCRIPTION

Upon creating a subscription, PostgreSQL will automatically sync any pre-existing data from the master to the replica. In our case there is no data to sync since the widgets table is empty, but this is a useful feature when adding new subscriptions to an existing database.

With a subscription in place, let's test the setup by adding some demo data to the widgets table.

Testing and Troubleshooting
To test replication between our master and replica, let's add some data to the widgets table and verify that it replicates correctly.

On dbs-master, insert the following data on the widgets table:

sample=# INSERT INTO widgets (name, price) VALUES ('Knife', 3.50), ('Tea Mug', 5.20), ('MugHolder', 4.80);

On dbs-replica, run the following query to fetch all the entries on this table:

sample=# SELECT * FROM widgets;

You should now see:

Output
id | name | price
----+------------+-------
1 | Knife | 3.50
2 | Tea Mug | 5.20
3 | MugHolder | 4.80
(3 rows)

The entries have been successfully replicated from dbs-master to dbs-replica. From now on, all INSERT, UPDATE, and DELETE queries will be replicated across servers unidirectionally.

One thing to note about write queries on replica servers is that they are not replicated back to the master server. PostgreSQL currently has limited support for resolving conflicts when the data between servers diverges. If there is a conflict, the replication will stop and PostgreSQL will wait until the issue is manually fixed by the database administrator. For that reason, most applications will direct all write operations to the master server, and distribute reads among available replica servers.

You can now exit the psql prompt on both servers:

postgres=# \q

Now that you have finished testing your setup, you can add and replicate data on your own.

Troubleshooting
If replication doesn't seem to be working, a good first step is checking the PostgreSQL log on dbs-replica for any possible errors:

tail /var/log/postgresql/postgresql-10-main.log

Here are some common problems that can prevent replication from working:

  • Networking is not enabled on both servers, or the servers are on different networks;
  • dbs-master is not configured to listen for connections on the correct network IP;
  • The Write Ahead Log level on dbs-master is incorrectly configured (it must be set to logical);
  • dbs-master is not configured to accept incoming connections from the correct dbs-replica IP address;
  • A firewall like UFW is blocking incoming PostgreSQL connections on port 5432;
  • There are mismatched table names or fields between dbs-master and dbs-replica;
  • The sampleuser database role is missing the required permissions to access the sample database on dbs-master;
  • The sampleuser database role is missing the REPLICATION option on dbs-master;
  • The sample database role is missing the required permissions to access the widgets table on dbs-master;
  • The table wasn't added to the publication on dbs-master.

After resolving the existing problem(s), replication should take place automatically. If it doesn't, use following command to remove the existing subscription before recreating it:

sample=# DROP SUBSCRIPTION sample_subscription;

Conclusion
In this guide you've successfully installed PostgreSQL 10 on two Ubuntu 18.04 servers and configured logical replication between them. You now have the required knowledge to experiment with horizontal read scaling, high availability, and the geographical distribution of your PostgreSQL database by adding additional replica servers.

No comments