PostgreSQL Database Replication
Replication is a basic technology for any database server because the downtime or the data loss can result in reducing accessibility, productivity and product confidence. Using data replication from a primary(master) server to one or more standbys(slaves) decreases the possibility of any data loss. With PostgreSQL, you can easily create a database cluster of Master-Slave topology with one or more standby servers.
Using WAL (Write-Ahead Logging) data is the fastest available way of replication with a great performance, so-called asynchronous replication. In this case the primary database server works in archiving mode just writing the WAL files to the storage and propagating them to the standby database server that operates in recovery mode. These files are transferred to the standby DB promptly after writing is completed.
So, let’s see the main configuration parameters are set up to configure the PostgreSQL database cluster of Master-Slave topology with high availability by setting up the hot standby (or streaming) replication to one or more slaves that can be queried as a read-only database.
Since the PostgreSQL changes configuration with every new release this article will be valid for the version 12 which is the latest at the moment.
PostgreSQL DB cluster is enabled via embedded Auto-Clustering feature at the Dashboard. To do this invoke a new environment topology wizard, pick the software stack(e.g. PostgreSQL 12.3) and just turn on a specially designed Auto-Clustering switch.
With hovering over the tooltip sign you can observe the topology is about to be created.
Master PostgreSQL Configuration
Let’s take a look at the master node configuration parameters used in auto-clustering.
1. Find the environment with the master database in your environments list. Click the Config button next to the PostgreSQL Master node.
2. Open the conf directory and navigate to the postgresql.conf file.
The following lines related to the WAL files can be changed if necessary:
- The wal_level parameter determines how much information is written to the WAL. There are three possible values:
- minimal - leaves only the information needed to recover from a failure or emergency shutdown.
- replica - default value, which writes enough data to support WAL archiving and replication, including running read-only queries on the standby server. In releases prior to 9.6, the archive and hot_standby values were allowed for this parameter. In later releases they are acceptable but mapped to the replica.
- logical - value adds the information required to support logical decoding to the replica logging level.
- max_wal_senders sets the maximum number of simultaneously running WAL transfer processes.
- archive_mode allows WAL archiving along with wal_level parameter(all values enable archiving except for minimal value).
- archive_command - the local shell command that will be executed to archive the completed WAL segment. By default it does nothing by executing ‘cd .' that means the archiving actually disabled. You may try to change it as follows to copy WAL archives to the destination directory you prefer(e.g /tmp/mydata):
Press the Save button above the editor.
3. Open the pg_hba.conf configuration file. The standby database connection is permitted by stating the following parameters:
Let’s inspect configuration files at Slave node. There are only three options that distinguish slave from master:
1. Open the postgresql.conf file, find the Standby Servers section. As you can see this server is acting as standby since the hot_standby parameter is on unlike the master node where this parameter is commented out.
2. Scroll down to the end of the config file. There is a primary_conninfo parameter that specifies the connection string which the standby server will use to connect to the sending server. The connection string must indicate the host name (or address) of the sending server, as well as the port number. The username corresponding to the role with the appropriate privileges on the sending server is also provided. The password must also be specified in the primary_conninfo or in a separate ~/.pgpass file on the backup server if the sender requires password authentication.
3. The last option that makes database server as slave is standby.signal file availability, which indicates the server should start up as a hot standby. File must be located in the PostgreSQL data directory and it can be empty or contain any information. Once a slave is promoted to master this file will be deleted.
Note: Keep in mind that most of the options that are being changed require the server should be restarted. It can be done it in two ways:
1. From the dashboard you can restart either one or both nodes.
2. Through a command line interface via Web SSH client. To do this click on the Web SSH button at the required node e.g. slave.
And issue a command to restart database server:
1. Open the phpPgAdmin panel for your master database by clicking Open in Browser button next to it.
2. Log in with the database credentials you’ve got via email earlier and create a new database.
3. Then you should open the admin panel of your standby database server (in the same way as for master one) and check whether the new database was replicated successfully or not.
PostgreSQL has no native automatic failover scenario for database cluster. On the other hand since there are many third-party solutions that you can use to implement to ensure high-availability for your system. At the same time you may create your own solution to overcome failures of your database cluster. The multiple situations of cluster failure are possible in real life. Here we consider only one most common workflow that can help you to automate failover scenario.
The default topology comprises two nodes:
Once the master node fails the slave node must be promoted to a new master. It can be done with the utility pg_ctl which is used to initialize, start, stop, or control a PostgreSQL server. To do this log in to the standby server via Web SSH and issue command as follows:
where /var/lib/pgsql/data is a database data directory.
Once the slave database is promoted to the master, you should change your application connection string in order to change database cluster entry point to a new hostname or IP address.
Failover process can rely on pg_isready utility that issues a connection check to a PostgreSQL database.
You may create a simple script which checks master database server availability and promotes the standby in case of master failure. Run the script through a crontab at the slave node with an appropriate interval. The script may look like below. Let’s call it as failover.sh:
Once script is triggered the slave promotion to master, the script output should look like:
Now your database got back to work and ready to handle read/write requests by the new master address.
With a new master address you can easily avoid tuning your application connection string changing IP addresses of the master database. To do this you have to put a load balancer in front of the cluster that will monitor the status of its components and route traffic to the current master, but this is out of scope of this document. Here we will demonstrate how to restore original cluster topology and thus no changes will be required at the frontend.
Another reason the topology should be restored is related to ensuring a scaling ability of the cluster. Only original topology can be scaled in/out horizontally.
Let’s see how to perform PostgreSQL database cluster restoration after the former master was dropped off from the cluster and the former slave was promoted to the master.
So, the task is: the dropped-off master should become the actual master and the current master(former slave) should become the actual slave.
The initial data are:
- Database cluster comprises two nodes master(IP: 172.25.2.22) and slave(IP: 172.25.2.31).
- Master node went down and the primary database was stopped.
- Standby database was promoted to the primary role.
- Now the slave retains the reads/writes.
- Former master node was fixed out and is ready to be reintroduced to the replication as primary.
Do the steps as follows to get the cluster of initial topology:
1. Enter former master node via Web SSH and issue command:
2. Add former master IP address 172.22.2.22 to pg_hba.conf at the current master node:
Restart current primary database to apply the changes:
3. Enter the former master node via Web SSH and issue command:
- pg_basebackup - is used to take base backups of a running PostgreSQL database cluster.
- 172.25.2.31 - IP address of the current master node.
- /var/lib/pgsql/data - PostgreSQL data directory.
4. Make sure the IP address in host parameter described in clause #2 of Configuring Standby contains proper IP address of the former master.
5. Create standby.signal file at the current master:
And restart the node to get new slave database:
Remove the standby.signal file at the former master:
And restart the node to get new master database:
6. Finally, in order to reach a consistent recovery state for both primary and standby databases the final restart is required that can be done via dashboard as follows:
Once the restart process is completed the cluster comes back to original topology and may be scaled horizontally.