Last modified: December 03, 2024

This article is written in: 🇺🇸

Master-Standby Replication

Master-Standby replication is a widely adopted database replication topology where a primary database server, known as the master, replicates data to one or more secondary servers called standbys. This setup enhances data availability, fault tolerance, and load balancing within a database system. Standby servers can handle read-only queries and, in case of a master server failure, can be promoted to become the new master, ensuring continuous operation.

Understanding the Architecture

To visualize how Master-Standby replication works, consider the following ASCII diagram:

#
               +---------------------+
               |                     |
               |      Clients        |
               | (Write & Read Ops)  |
               |                     |
               +----------+----------+
                          |
           +--------------+--------------+---------------------------------+
           |                             |                                 |
 Write & Read Operations           Read Operations                   Read Operations
           |                             |                                 |
           v                             v                                 v
+----------+-----------+       +---------+----------+            +---------+----------+
|                      |       |                    |            |                    |
|    Master Server     |       |  Standby Server 1  |            |  Standby Server 2  |
|                      |       |  (Read-Only)       |            |  (Read-Only)       |
+----------+-----------+       +---------+----------+            +---------+----------+
           |                             ʌ                                 ʌ
           |                             |                                 |
           |                         Replication                       Replication
           |                             |                                 |
           +-----------------------------+---------------------------------+

In this architecture, the master server handles all write operations, such as inserts, updates, and deletes. The standby servers continuously receive data changes from the master to stay synchronized and can serve read-only queries, offloading read traffic from the master. This arrangement not only improves performance but also provides a failover mechanism in case the master server becomes unavailable.

The Purpose of Master-Standby Replication

Master-Standby replication serves several essential purposes in database systems:

  1. By replicating data to standby servers, the system can prevent data loss and minimize downtime during failures. If the master server fails, a standby can be promoted to take over, ensuring uninterrupted service.
  2. Offloading read-heavy operations to standby servers distributes the workload more evenly, enhancing performance and scalability. This allows the master server to focus on write operations without being overwhelmed.
  3. Regular maintenance tasks, such as backups or software updates, can be performed on the master or standby servers without significant downtime. Standby servers can be updated one at a time, providing continuous service to users.
  4. As demand on the database grows, additional standby servers can be added to handle increased read traffic. This horizontal scaling is a cost-effective way to enhance system capacity without overhauling the existing infrastructure.

Advantages

Implementing Master-Standby replication offers several benefits:

Challenges

Despite its advantages, Master-Standby replication presents some challenges:

Implementing in PostgreSQL

PostgreSQL offers built-in support for streaming replication, making it a suitable choice for implementing Master-Standby replication. Below is a practical example of how to set up this replication using PostgreSQL.

Prerequisites

Before beginning the setup, ensure the following:

Configuring the Master Server

Editing postgresql.conf

Locate and edit the postgresql.conf file, typically found in the data directory (e.g., /var/lib/pgsql/data/ or /etc/postgresql/). Modify the following parameters to enable replication:

# Enable Write-Ahead Logging (WAL) level suitable for replication
wal_level = replica

# Allow the master to send WAL data to standby servers
max_wal_senders = 3

# Set the maximum number of replication slots (optional)
max_replication_slots = 3

# Retain WAL data to assist standby synchronization
wal_keep_size = 128MB

Editing pg_hba.conf

Update the pg_hba.conf file to allow the standby servers to connect for replication. Add the following line:

# TYPE  DATABASE        USER            ADDRESS                 METHOD

# Allow replication connections from standby servers
host    replication     replicator      standby_ip/32           md5

Replace replicator with the username of the replication role and standby_ip/32 with the IP address of the standby server.

Creating a Replication User

Log into the PostgreSQL prompt on the master server and create a user for replication:

CREATE ROLE replicator WITH REPLICATION LOGIN ENCRYPTED PASSWORD 'your_password';

This user will be used by the standby servers to authenticate with the master.

Restarting PostgreSQL

Restart the PostgreSQL service on the master server to apply the configuration changes:

# For Linux systems using systemd
sudo systemctl restart postgresql

Configuring the Standby Server

Stopping the PostgreSQL Service

Ensure that the PostgreSQL service on the standby server is stopped before proceeding:

sudo systemctl stop postgresql

Creating a Base Backup from the Master

Use the pg_basebackup utility to create a base backup of the master server on the standby server:

pg_basebackup -h master_ip -D /var/lib/pgsql/data/ -U replicator -W -P --wal-method=stream

Option Description
-h master_ip The IP address of the master server.
-D /var/lib/pgsql/data/ The data directory on the standby server.
-U replicator The replication user created earlier.
-W Prompt for the password.
--wal-method=stream Stream WAL files during the backup.

Creating the standby.signal File

For PostgreSQL versions 12 and above, create an empty file named standby.signal in the data directory to indicate that this server should start in standby mode:

touch /var/lib/pgsql/data/standby.signal

For versions before 12, a recovery.conf file is required with the necessary parameters.

Editing postgresql.conf on the Standby

Set the following parameters in the standby server's postgresql.conf file:

# Enable read-only queries on standby
hot_standby = on

# Configure connection information to the primary server
primary_conninfo = 'host=master_ip port=5432 user=replicator password=your_password'

Replace master_ip with the IP address of the master server and your_password with the password for the replication user.

Starting the PostgreSQL Service

Start the PostgreSQL service on the standby server:

sudo systemctl start postgresql

Verifying Replication

Checking Replication Status on the Master

Connect to the master server and execute the following SQL query to check the replication status:

SELECT client_addr, state
FROM pg_stat_replication;

This should display an entry for each standby server, indicating that they are connected and replicating.

Testing Data Replication

On the master server, create a test table and insert data:

-- Create a test table
CREATE TABLE replication_test (id SERIAL PRIMARY KEY, data TEXT);

-- Insert sample data
INSERT INTO replication_test (data) VALUES ('Test data');

On the standby server, query the test table to confirm that the data has been replicated:

-- Select data from the replicated table
SELECT * FROM replication_test;

If the data appears on the standby server, replication is working correctly.

Performing a Failover Procedure

In the event that the master server fails, you can promote a standby server to become the new master.

Promoting the Standby to Master

On the standby server, run the following command to promote it:

pg_ctl promote -D /var/lib/pgsql/data/

Alternatively, you can create a promote.signal file in the data directory:

touch /var/lib/pgsql/data/promote.signal

This action transitions the standby server to accept write operations.

Updating Application Connections

Redirect your application's database connections to the new master server to resume normal operations.

Reconfiguring the Failed Master as a Standby (Optional)

Once the original master server is operational again, you can configure it as a standby to the new master, ensuring it remains part of the replication setup.

Handling Replication Slots (Optional)

Replication slots prevent the master server from discarding WAL segments until they have been received by all standby servers. This helps maintain synchronization, especially when standbys are temporarily disconnected.

On the master server, create a replication slot for each standby:

SELECT * FROM pg_create_physical_replication_slot('standby_slot');

Modify the primary_conninfo on the standby server to include the slot name:

primary_slot_name = 'standby_slot'

Table of Contents

    Master-Standby Replication
    1. Understanding the Architecture
    2. The Purpose of Master-Standby Replication
    3. Advantages
    4. Challenges
    5. Implementing in PostgreSQL
      1. Prerequisites
      2. Configuring the Master Server
      3. Configuring the Standby Server
      4. Verifying Replication
      5. Performing a Failover Procedure
      6. Handling Replication Slots (Optional)