This guide will help you to set up a highly available PostgreSQL cluster using docker on an Ubuntu 20.04. For this guide, we will create 3-node PostgreSQL cluster with 3-node etcd cluster as the DCS and one haproxy node.
Docker is a set of platform as a service products that use OS-level virtualization to deliver software in packages called containers.
PostgreSQL, also known as Postgres, is a free and open-source relational database management system emphasizing extensibility and SQL compliance.
Patroni is a cluster manager used to customize and automate deployment and maintenance of PostgreSQL HA (High Availability) clusters.
etcd is a strongly consistent, distributed key-value store that provides a reliable way to store data that needs to be accessed by a distributed system or cluster of machines.
HAProxy is a free and open source software that provides a high availability load balancer and proxy server for TCP and HTTP-based applications that spreads requests across multiple servers.
PgBouncer is an open-source, lightweight connection pooler for PostgreSQL.
We have also recorded the steps in this quick video, you can watch alongside this guide to avoid any confusion while setting up your cluster.
To follow this tutorial along, you will need one (physical or virtual) machine installed with Ubuntu 20.04.
STEP1 - Install Prerequisites
Log in to your Ubuntu 20.04 using a non-root user with sudo privileges, and perform the following steps.
Type below command to set correct timezone, update, and upgrade your Ubuntu:
For this guide, we will install PostgreSQL version 14 which is the most latest release. If you wish, you can install any PostgreSQL version of your choice:
listen pg_cluster bind *:5000 option httpchk http-check expect status 200 default-server inter 3s fall 3 rise 2 on-marked-down shutdown-sessions server postgresql_127.0.0.1_5432 127.0.0.1:5432 maxconn 100 check port 8008 server postgresql_127.0.0.1_5433 127.0.0.1:5433 maxconn 100 check port 8009
Save and close the editor when you are finished.
Next, edit entrypoint.sh file:
nano docker/entrypoint.sh
Find the below entries, replace highlighted text with yours:
Next, edit pgbouncer.toml file, and replace the highlighted text with yours:
nano extras/confd/conf.d/pgbouncer.toml
prefix = "/service/pg_cluster"
Save and close the editor when you are finished.
STEP4 - Build & Run Docker Container
In this step, we will build the docker environment, and create all the required containers using the below command:
python setup.py build
docker build -t patroni .
Next, execute below command to start all the containers at once:
docker-compose up -d
You can check running containers with below command:
docker ps
We will check the logs to see which patroni node is acting as master:
docker logs patroni1
Output: 2022-02-16 06:20:53,918 INFO: no action. I am (patroni1), a secondary, and following a leader (patroni2)
As you can see from the logs, patroni2 is acting as master node in the cluster, and patroni1, patroni3 are the replica nodes.
Next, access the shell prompt of any patroni node like below:
docker exec -ti patroni1 bash
At postgres@patroni1:~$ shell prompt, type below command to check patroni cluster status:
patronictl list
You will see the output similar to like below:
Type below command to check etcd cluster status:
etcdctl member list
You will see the output similar to like below:
Type exit to bounce back to your Ubuntu shell prompt:
exit
STEP5 - Test PostgreSQL Cluster
You should remember that, HAproxy on port 5000 will always connect you to your master node in the cluster for read/write queries, and on port 5001 will connect you to your replica node in the cluster for readonly queries.
Make a connection to database from host using psql on port 5000 like below:
psql -h localhost -p 5000 -U postgres -W
This will prompt you for password, which is secret@123 in our case, for user postgres.
When you are connected to database, from postgres=# prompt, create a test database like below:
CREATE DATABASE db_test;
Switch to your newly created database:
\c db_test
Next, from db_test=# prompt, create a table name "links" like below:
CREATE TABLE links
(id SERIAL PRIMARY KEY,
url VARCHAR(255) NOT NULL,
name VARCHAR(255) NOT NULL,
description VARCHAR(255),
last_update DATE
);
Next, insert some data into a newly created table:
INSERT INTO links (url, name)
VALUES('https://www.techsupportpk.com','Tech Support Pakistan');
Disconnect from the database by typing \q and it will bounce you back to your host shell prompt.
Next, make a connection to your replica database node in the cluster on port 5001 like below:
psql -h localhost -p 5001 -U postgres -W
This will prompt to you for password, which is secret@123 in our case:
At the postgres=# prompt, type below to verify whether your newly created database replicated accross the nodes in the cluster:
\l
\c db_test
\d links
SELECT * FROM links;
STEP6 - Access HAProxy Dashboard
You can monitor your cluster from haproxy dashboard by accessing http://your_host_ip:7000 using any of your favorite browser.
You should see your PostgreSQL cluster status as shown in the screenshot below:
STEP7 - PostgreSQL Cluster Manual Failover
In the event of disaster, if master node goes down due to whatsoever reason, failover should take place automatically, and any of the replica node will become a master node.
Type below command to access your master node (patroni2) in our case, shell prompt:
docker exec -ti patroni2 bash
Type below command to initiate a manual failover:
patronictl failover
This will ask you, which replica node you would like to promote as master:
Candidate ['patroni1', 'patroni3'] []: patroni1
Are you sure you want to failover cluster pg_cluster, demoting current master patroni1? [y/N]:y
This will demote your current master node to replica, and one of the replica node you chose will take master role in the cluster.
Type exit to bounce back to your host shell prompt:
STEP8 - PostgreSQL Cluster Auto Failover
In this step we will test auto failover functinality by just stopping the master node like below:
docker stop patroni2
This will stop your master node container, and any of the replica node will become master automatically.
You can verify which node has taken over master role in the cluster from the logs:
docker logs patroni1
Next, we will start patroni2 again, and it should synchronize as replica node in the cluster automatically:
docker start patroni2
docker logs patroni2
Wrapping up
Now you have a robust, highly available, ready to use PostgreSQL cluster running in docker container on an Ubuntu 20.04. You should add one more haproxy container, and configure IP failover to make the haproxy highly available.
This is great tutorial and setup. I am curious, though, how all this would look and what steps you have to repeat the setup if you want to run all this in separate VMs? - I want the 3 PGs to be in separate VMs each
This is great tutorial and setup. I am curious, though, how all this would look and what steps you have to repeat the setup if you want to run all this in separate VMs? - I want the 3 PGs to be in separate VMs each
ReplyDeleteThis guide will help you: https://www.techsupportpk.com/2022/01/set-up-highly-available-postgresql13-cluster-ubuntu.html
Delete