Set Up a Highly Available PostgreSQL Cluster using Docker on Ubuntu

This guide will help you to set up a highly available PostgreSQL cluster using docker on an Ubuntu 20.04. For this guide, we will create 3-node PostgreSQL cluster with 3-node etcd cluster as the DCS and one haproxy node.
 

Docker is a set of platform as a service products that use OS-level virtualization to deliver software in packages called containers. 
PostgreSQL, also known as Postgres, is a free and open-source relational database management system emphasizing extensibility and SQL compliance. 
Patroni is a cluster manager used to customize and automate deployment and maintenance of PostgreSQL HA (High Availability) clusters.
etcd is a strongly consistent, distributed key-value store that provides a reliable way to store data that needs to be accessed by a distributed system or cluster of machines. 
HAProxy is a free and open source software that provides a high availability load balancer and proxy server for TCP and HTTP-based applications that spreads requests across multiple servers. 
PgBouncer is an open-source, lightweight connection pooler for PostgreSQL. 
 
 
We have also recorded the steps in this quick video, you can watch alongside this guide to avoid any confusion while setting up your cluster.
 
 
 
To follow this tutorial along, you will need one (physical or virtual) machine installed with Ubuntu 20.04. 
 

STEP1 - Install Prerequisites

Log in to your Ubuntu 20.04 using a non-root user with sudo privileges, and perform the following steps. 
 
Type below command to set correct timezone, update, and upgrade your Ubuntu:
sudo timedatectl set-timezone Asia/Karachi

sudo apt update
sudo apt upgrade
Once upgrade process end, execute below command to automatically remove unwanted packages from your Ubuntu:
sudo apt autoremove
Reboot your Ubuntu to take latest packages effect:
sudo shutdown -r now
Type below command to install docker, and docker-compose:
sudo apt install -y docker docker-compose

sudo usermod -aG docker $USER
sudo systemctl restart docker
Reboot your Ubuntu to take the docker installation into effect:
sudo shutdown -r now
It is mandatory to install PostgreSQL on your host machine (Ubuntu) as this setup need some binaries to create and run PostgreSQL cluster in docker:
sudo sh -c 'echo "deb http://apt.postgresql.org/pub/repos/apt $(lsb_release -cs)-pgdg main" > /etc/apt/sources.list.d/pgdg.list'
wget --quiet -O - https://www.postgresql.org/media/keys/ACCC4CF8.asc | sudo apt-key add -
sudo apt update
For this guide, we will install PostgreSQL version 14 which is the most latest release. If you wish, you can install any PostgreSQL version of your choice:
sudo apt install -y postgresql-14 postgresql-client-14
Type below command to create symbolic link of PostgreSQL binaries:
sudo ln -s /usr/lib/postgresql/14/bin/* /usr/sbin/
We do not need PostgreSQL to run on our host machine as we just need some binaries of it, so we will stop and disable PostgreSQL service:
sudo systemctl stop postgresql
sudo systemctl disable postgresql
Type below command to install python, and to create its symbolic link as well:
sudo apt -y install python3 python3-pip libpq-dev
sudo ln -s /usr/bin/python3 /usr/bin/python
Type below command to install some important dependencies:
sudo -H pip install testresources
sudo -H pip install --upgrade setuptools

sudo -H pip install psycopg2-binary
sudo -H pip install psycopg2

STEP2 - Clone Patroni 

In this step, we will clone patroni from GitHub which is being maintained by zalando:
cd ~
git clone https://github.com/zalando/patroni.git
 

STEP3 - Prepare patroni, etcd, haproxy 

In this step, we will edit some of the files from the patroni directory, and make them ready according to our requirements:
cd ~/patroni
Edit Dockerfile and replace the PG_MAJOR= parameter's value with the version of PostgreSQL you would like run in your cluster:
nano Dockerfile

ARG PG_MAJOR=14

Save and close the editor when you are finished. 

Next, edit docker-compose.yml file, and make the changes accordingly:

nano docker-compose.yml
 After modification, docker-compose.yml file should look similar to like below:
version: "2"
services:
etcd1: &etcd
image: patroni
environment:
ETCD_LISTEN_PEER_URLS: http://0.0.0.0:2380
ETCD_LISTEN_CLIENT_URLS: http://0.0.0.0:2379
ETCD_INITIAL_CLUSTER: etcd1=http://etcd1:2380,etcd2=http://etcd2:2380,etcd3=http://etcd3:2380
ETCD_INITIAL_CLUSTER_STATE: new
ETCD_INITIAL_CLUSTER_TOKEN: etcd_cluster
container_name: etcd1
hostname: etcd1
command: etcd -name etcd1 -initial-advertise-peer-urls http://etcd1:2380

etcd2:
<<: *etcd
container_name: etcd2
hostname: etcd2
command: etcd -name etcd2 -initial-advertise-peer-urls http://etcd2:2380

etcd3:
<<: *etcd
container_name: etcd3
hostname: etcd3
command: etcd -name etcd3 -initial-advertise-peer-urls http://etcd3:2380

haproxy1:
image: patroni
env_file: docker/patroni.env
hostname: haproxy
container_name: haproxy1
ports:
- "5000:5000"
- "5001:5001"
- "7000:7000"
command: haproxy
environment: &haproxy_env
ETCDCTL_ENDPOINTS: http://etcd1:2379,http://etcd2:2379,http://etcd3:2379
PATRONI_ETCD3_HOSTS: "'etcd1:2379','etcd2:2379','etcd3:2379'"
PATRONI_SCOPE: pg_cluster

patroni1:
image: patroni
env_file: docker/patroni.env
hostname: patroni1
container_name: patroni1
environment:
<<: *haproxy_env
PATRONI_NAME: patroni1

patroni2:
image: patroni
env_file: docker/patroni.env
hostname: patroni2
container_name: patroni2
environment:
<<: *haproxy_env
PATRONI_NAME: patroni2

patroni3:
image: patroni
env_file: docker/patroni.env
hostname: patroni3
container_name: patroni3
environment:
<<: *haproxy_env
PATRONI_NAME: patroni3<: -="" -initial-advertise-peer-urls="" -name="" command:="" container_name:="" docker="" env_file:="" environment:="" etcd1:2379="" etcd2:2379="" etcd2:2380="" etcd2="" etcd3:2379="" etcd3:2380="" etcd3:="" etcd3="" etcd="" etcdctl_endpoints:="" haproxy1:="" haproxy1="" haproxy="" haproxy_env="" hostname:="" http:="" image:="" patroni.env="" patroni1:="" patroni1="" patroni2:="" patroni2="" patroni3:="" patroni3="" patroni="" patroni_etcd3_hosts:="" patroni_name:="" patroni_scope:="" pg_cluster="" ports:="" pre="">

Save and close the editor when you are finished. 

Next, edit haproxy.cfg file and replace batman with pg_cluster accordingly:

nano haproxy.cfg

global
maxconn 100

defaults
log global
mode tcp
retries 2
timeout client 30m
timeout connect 4s
timeout server 30m
timeout check 5s

listen stats
mode http
bind *:7000
stats enable
stats uri /

listen pg_cluster
bind *:5000
option httpchk
http-check expect status 200
default-server inter 3s fall 3 rise 2 on-marked-down shutdown-sessions
server postgresql_127.0.0.1_5432 127.0.0.1:5432 maxconn 100 check port 8008
server postgresql_127.0.0.1_5433 127.0.0.1:5433 maxconn 100 check port 8009

Save and close the editor when you are finished. 

Next, edit entrypoint.sh file:

nano docker/entrypoint.sh

Find the below entries, replace highlighted text with yours:

readonly PATRONI_SCOPE=${PATRONI_SCOPE:-pg_cluster}
export PATRONI_REPLICATION_PASSWORD="${PATRONI_REPLICATION_PASSWORD:-secret@123}"
export PATRONI_SUPERUSER_PASSWORD="${PATRONI_SUPERUSER_PASSWORD:-secret@123}"

Save and close the editor when you are finished. 

Next, edit patroni.env file:

nano docker/patroni.env

Find the below entries, and replace highlighted text with yours:

PATRONI_RESTAPI_PASSWORD=secret@123
PATRONI_SUPERUSER_PASSWORD=secret@123
PATRONI_REPLICATION_PASSWORD=secret@123

Save and close the editor when you are finished. 

Next, edit pgbouncer.toml file, and replace the highlighted text with yours:

nano extras/confd/conf.d/pgbouncer.toml

prefix = "/service/pg_cluster"

Save and close the editor when you are finished.

 

STEP4 - Build & Run Docker Container

In this step, we will build the docker environment, and create all the required containers using the below command:
python setup.py build

docker build -t patroni .
Next, execute below command to start all the containers at once:
docker-compose up -d
You can check running containers with below command:
docker ps
 We will check the logs to see which patroni node is acting as master:
docker logs patroni1

Output:
2022-02-16 06:20:53,918 INFO: no action. I am (patroni1), a secondary, and following a leader (patroni2)

As you can see from the logs, patroni2 is acting as master node in the cluster, and patroni1, patroni3 are the replica nodes. 

Next, access the shell prompt of any patroni node like below:

docker exec -ti patroni1 bash
At postgres@patroni1:~$ shell prompt, type below command to check patroni cluster status:
patronictl list

You will see the output similar to like below:



Type below command to check etcd cluster status:
etcdctl member list
You will see the output similar to like below:
 

 
Type exit to bounce back to your Ubuntu shell prompt:
exit
 

STEP5 - Test PostgreSQL Cluster 

You should remember that, HAproxy on port 5000 will always connect you to your master node in the cluster for read/write queries, and on port 5001 will connect you to your replica node in the cluster for readonly queries. 
 
Make a connection to database from host using psql on port 5000 like below:
psql -h localhost -p 5000 -U postgres -W
This will prompt you for password, which is secret@123 in our case, for user postgres. When you are connected to database, from postgres=# prompt, create a test database like below:
CREATE DATABASE db_test;
Switch to your newly created database:
\c db_test
Next, from db_test=# prompt, create a table name "links" like below:
CREATE TABLE links 
	(id SERIAL PRIMARY KEY,
	url VARCHAR(255) NOT NULL,
	name VARCHAR(255) NOT NULL,
	description VARCHAR(255),
	last_update DATE
	);
Next, insert some data into a newly created table:
INSERT INTO links (url, name)

VALUES('https://www.techsupportpk.com','Tech Support Pakistan');

Disconnect from the database by typing \q and it will bounce you back to your host shell prompt.

Next, make a connection to your replica database node in the cluster on port 5001 like below:

psql -h localhost -p 5001 -U postgres -W

This will prompt to you for password, which is secret@123 in our case: 

At the postgres=# prompt, type below to verify whether your newly created database replicated accross the nodes in the cluster:

\l
\c db_test
\d links

SELECT * FROM links;
 

STEP6 - Access HAProxy Dashboard 

You can monitor your cluster from haproxy dashboard by accessing http://your_host_ip:7000 using any of your favorite browser. 
 
You should see your PostgreSQL cluster status as shown in the screenshot below:


 

STEP7 - PostgreSQL Cluster Manual Failover

In the event of disaster, if master node goes down due to whatsoever reason, failover should take place automatically, and any of the replica node will become a master node. 
 
Type below command to access your master node (patroni2) in our case, shell prompt:
docker exec -ti patroni2 bash
Type below command to initiate a manual failover:
patronictl failover
This will ask you, which replica node you would like to promote as master:
Candidate ['patroni1', 'patroni3'] []: patroni1

Are you sure you want to failover cluster pg_cluster, demoting current master patroni1? [y/N]:y

This will demote your current master node to replica, and one of the replica node you chose will take master role in the cluster. 

Type exit to bounce back to your host shell prompt:

 

STEP8 - PostgreSQL Cluster Auto Failover 

In this step we will test auto failover functinality by just stopping the master node like below:
docker stop patroni2

This will stop your master node container, and any of the replica node will become master automatically.

You can verify which node has taken over master role in the cluster from the logs:

docker logs patroni1
Next, we will start patroni2 again, and it should synchronize as replica node in the cluster automatically:
docker start patroni2
docker logs patroni2
 

Wrapping up 

Now you have a robust, highly available, ready to use PostgreSQL cluster running in docker container on an Ubuntu 20.04. You should add one more haproxy container, and configure IP failover to make the haproxy highly available.

2 comments:

  1. This is great tutorial and setup. I am curious, though, how all this would look and what steps you have to repeat the setup if you want to run all this in separate VMs? - I want the 3 PGs to be in separate VMs each

    ReplyDelete
    Replies
    1. This guide will help you: https://www.techsupportpk.com/2022/01/set-up-highly-available-postgresql13-cluster-ubuntu.html

      Delete

Powered by Blogger.