Last modified: May 16, 2025
This article is written in: 🇺🇸
Multi-master replication is a database replication model where multiple database nodes, referred to as masters, can perform read and write operations concurrently. Each master node propagates its data changes to every other master node, ensuring consistency across the entire system. This approach enhances scalability, availability, and performance but introduces complexities like conflict resolution and increased configuration overhead.
To visualize how multi-master replication works, consider the following diagram:
                                     +---------------------+
                                     |                     |
                                     |       Clients       |
                                     |  (Write & Read Ops) |
                                     |                     |
                                     +----------+----------+
                                                |
          +-------------------------------------+-------------------------------------+
          |                                     |                                     |
Write & Read Operations              Write & Read Operations               Write & Read Operations
          |                                     |                                     |
          v                                     v                                     v
+---------+---------+                 +---------+---------+                 +---------+---------+
|                   |   Replication   |                   |   Replication   |                   |
|   Master Server   |  ------------>  |   Master Server   |  ------------>  |   Master Server   |
|                   |  <------------  |                   |  <------------  |                   |
+-------------------+                 +-------------------+                 +-------------------+
In this setup, each master node is connected to every other master node through replication links. Data changes made on any node are replicated to all other nodes, and clients can connect to any master node for both read and write operations. This interconnected architecture allows for a highly available and scalable system.
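To make the idea concrete, here is a minimal sketch of what "write anywhere, read anywhere" looks like from a client's point of view. The hostnames, the app user, and the demo schema are placeholders, not part of any setup described later in this article.

# Hypothetical hosts and credentials -- adjust to your own cluster.
HOSTS=(master1.example.com master2.example.com master3.example.com)

# Write through the first master.
mysql -h "${HOSTS[0]}" -u app -p'app_pw' -e "
  CREATE DATABASE IF NOT EXISTS demo;
  CREATE TABLE IF NOT EXISTS demo.t (id INT PRIMARY KEY, note VARCHAR(50));
  INSERT INTO demo.t VALUES (1, 'written on master 1');"

# The same row should be readable (and writable) through every other master.
for h in "${HOSTS[@]:1}"; do
  mysql -h "$h" -u app -p'app_pw' -e "SELECT * FROM demo.t WHERE id = 1;"
done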
Multi-master replication has several defining features:

- Every node accepts both reads and writes, so there is no single designated primary.
- Changes made on one node are replicated to all other nodes, typically over a full mesh of replication links.
- The cluster remains available to clients even if one or more nodes fail.
The primary goals of multi-master replication include:

- High availability: losing a single node does not interrupt reads or writes.
- Write scalability: write load can be spread across several nodes.
- Locality: clients can write to the nearest node, reducing latency.
Implementing multi-master replication offers several benefits:

- No single point of failure for writes.
- Better hardware utilization, since every node serves traffic.
- Simpler failover, because any surviving node can keep accepting writes.
Despite its advantages, multi-master replication introduces several challenges:

- Write conflicts when the same data is modified on different nodes at nearly the same time.
- Additional replication and certification overhead on every write.
- More complex configuration, monitoring, and recovery procedures.
- Consistency trade-offs, depending on whether replication is synchronous or asynchronous.
Handling conflicts is a critical aspect of multi-master replication. Various strategies can be employed:

- Synchronous certification, where conflicting transactions are rejected at commit time (the approach Galera uses).
- Last-write-wins or timestamp-based resolution in asynchronous setups.
- Application-level partitioning, so a given row or key range is only ever written on one node.
- Manual or application-defined conflict handlers for cases the database cannot resolve automatically.
Galera Cluster is a synchronous multi-master replication plugin for MySQL and MariaDB databases. It certifies every transaction on all nodes at commit time, so a transaction either commits cluster-wide or is rolled back, providing strong data consistency across the cluster.
              ┌──────────────────────────────┐
              │    Galera Quorum Network     │
              └───────────────┬──────────────┘
                              │ (TCP 4567)
           ┌──────────────────┴──────────────────┐
           │                                     │
  ┌────────┴────────┐                   ┌────────┴────────┐
  │     Node-1      │    synchronous    │     Node-2      │
  │   10.0.0.20     │◄───replication───►│   10.0.0.21     │
  │   (Master*)     │ (4567/4568/4444)  │   (Master*)     │
  └────────┬────────┘                   └────────┬────────┘
           │                                     │
           │         ┌─────────────────┐         │
           └────────►│     Node-3      │◄────────┘
                     │   10.0.0.22     │
                     │   (Master*)     │
                     └─────────────────┘
                          (Full Mesh)
Legend: All three nodes act as masters ("writer-writers"). Traffic flows in a full mesh using the Galera ports 4567 (replication), 4568 (incremental state transfer, IST) and 4444 (state snapshot transfer, SST). When you see 10.0.0.20/21/22 or Node-1/2/3 below, they refer to this diagram.
Before installing, give every node a static IP address (10.0.0.20-22 in the diagram) and keep the clocks in sync with chronyd or systemd-timesyncd. Then install MariaDB and the Galera provider on all three nodes.

Ubuntu / Debian:

sudo apt update
sudo apt install mariadb-server galera-4 rsync

RHEL / CentOS:

sudo dnf install mariadb-server galera-4 rsync

(Replace mariadb-… with Percona packages if you need MySQL-8 compatibility.)
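A quick, optional sanity check that the packages landed where the configuration below expects them; the library path varies slightly between distributions, so this is a sketch rather than a definitive location.

mariadb --version                                    # client and server packages installed
find /usr/lib* -name 'libgalera_smm.so' 2>/dev/null  # wsrep provider library location
systemctl status mariadb --no-pager                  # service exists, not yet clustered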
The foundation of a stable Galera cluster is consistent configuration across all nodes. This section outlines the essential parameters you must set in the Galera configuration file to enable multi-master replication, ensure data consistency, and tune basic performance settings.
Edit /etc/mysql/conf.d/60-galera.cnf (Debian/Ubuntu) or /etc/my.cnf.d/galera.cnf (RHEL/CentOS) on every node:
[mysqld]
# ==== Basic ====
bind-address = 0.0.0.0
default_storage_engine = InnoDB
binlog_format = ROW # mandatory
innodb_autoinc_lock_mode = 2 # mandatory for multi-master
# ==== Galera ====
wsrep_on = ON
wsrep_provider = /usr/lib/galera/libgalera_smm.so
wsrep_cluster_name = my_galera_cluster
wsrep_cluster_address = gcomm://10.0.0.20,10.0.0.21,10.0.0.22
# ---- Node-specific (edit on every host) ----
wsrep_node_name = node1 # node2 / node3 respectively
wsrep_node_address = 10.0.0.20 # 10.0.0.21 / 10.0.0.22
# State Snapshot Transfer (full clone)
wsrep_sst_method = rsync # mariabackup is faster for TB-scale
wsrep_sst_auth = sstuser:s3cr3t! # will create below
# Performance / flow-control
innodb_buffer_pool_size = 2G # ≥ 40% of RAM (adjust)
wsrep_slave_threads = 4 # = CPU cores (rule of thumb)
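Galera also needs its ports reachable between the nodes (3306 for clients, plus 4567, 4568 and 4444 from the diagram). The exact firewall commands depend on your distribution; the following is a sketch assuming firewalld on RHEL/CentOS or ufw on Ubuntu/Debian.

# firewalld (RHEL / CentOS)
for p in 3306 4567 4568 4444; do
  sudo firewall-cmd --permanent --add-port=${p}/tcp
done
sudo firewall-cmd --permanent --add-port=4567/udp   # group communication can also use UDP
sudo firewall-cmd --reload

# ufw (Ubuntu / Debian)
sudo ufw allow 3306,4444,4567,4568/tcp
sudo ufw allow 4567/udp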
A State Snapshot Transfer (SST) copies the full dataset from a donor node to a node that is joining the cluster. Create the SST user and grant it the necessary privileges once on any cluster node; the account propagates to the other nodes when they join.
mysql -u root <<'SQL'
CREATE USER 'sstuser'@'%' IDENTIFIED BY 's3cr3t!';
GRANT RELOAD, LOCK TABLES, PROCESS, REPLICATION CLIENT ON *.* TO 'sstuser'@'%';
FLUSH PRIVILEGES;
SQL
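An optional check that the account and grants exist as expected, run on the same node:

mysql -u root -e "SELECT User, Host FROM mysql.user WHERE User = 'sstuser';"
mysql -u root -e "SHOW GRANTS FOR 'sstuser'@'%';"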
Before bringing up the cluster, tighten the default MariaDB security posture. Run the secure installation script to set a strong root password, remove unused accounts, and disable remote root access.
sudo mysql_secure_installation # set root pwd, remove test DB, disallow remote root
The first node must be started in bootstrap mode to initialize the cluster state. This step only runs once and sets up the initial primary component.
sudo systemctl stop mariadb
sudo galera_new_cluster # or: mysqld --wsrep-new-cluster &
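Note that galera_new_cluster is only used when creating a brand-new cluster or re-bootstrapping after a full shutdown. In the re-bootstrap case, pick the node with the most advanced state and make sure its grastate.dat marks it as safe to bootstrap; the path below assumes the default datadir /var/lib/mysql.

sudo cat /var/lib/mysql/grastate.dat      # compare seqno across nodes, check safe_to_bootstrap
sudo sed -i 's/^safe_to_bootstrap: 0/safe_to_bootstrap: 1/' /var/lib/mysql/grastate.dat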
Verify that the cluster is running and contains only the bootstrap node:
mysql -e "SHOW GLOBAL STATUS LIKE 'wsrep_cluster_size';"
# Expect Value = 1
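Beyond the cluster size, a few other status variables are worth checking on the bootstrap node before joining the rest:

mysql -e "SHOW GLOBAL STATUS LIKE 'wsrep_cluster_status';"      # expect: Primary
mysql -e "SHOW GLOBAL STATUS LIKE 'wsrep_ready';"               # expect: ON
mysql -e "SHOW GLOBAL STATUS LIKE 'wsrep_local_state_comment';" # expect: Synced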
Subsequent nodes join the existing cluster by starting their MariaDB service. They will perform an SST from the primary node to synchronize state before becoming active members.
sudo systemctl start mariadb
Monitor the SST process in real time with:
journalctl -u mariadb -f
Once complete, verify every node sees the full cluster:
mysql -e "SHOW GLOBAL STATUS LIKE 'wsrep_cluster_size';"
# Expect Value = 3 on every node
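For routine checks you can sweep all three nodes at once. The sketch below assumes passwordless SSH as root to the node IPs from the diagram; adapt the access method to your environment.

for ip in 10.0.0.20 10.0.0.21 10.0.0.22; do
  echo "== $ip =="
  ssh root@"$ip" \
    "mysql -e \"SHOW GLOBAL STATUS WHERE Variable_name IN ('wsrep_cluster_size','wsrep_ready','wsrep_local_state_comment');\""
done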
Galera uses optimistic concurrency control and write-set certification to resolve conflicts in a synchronous multi-master setup. At commit time, a transaction's write-set is broadcast to all nodes and certified against concurrently replicated transactions; the first transaction to certify wins, and a conflicting transaction is rolled back and reported to the client as a deadlock (errno 1213).
Proper application design can minimize and handle conflicts:

- Retry transactions that are rolled back with a deadlock error, since certification failures surface to the client as deadlocks.
- Where possible, direct writes for a given table or key range to a single node.
- Set auto_increment_offset and auto_increment_increment to prevent primary-key clashes; a quick check follows this list.
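With wsrep_auto_increment_control left at its default of ON, Galera manages those two variables automatically, setting the increment to the cluster size and giving each node a distinct offset. A quick way to confirm, run on each node:

mysql -e "SHOW VARIABLES LIKE 'wsrep_auto_increment_control';"  # expect: ON
mysql -e "SHOW VARIABLES LIKE 'auto_increment%';"               # increment = 3, per-node offset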
Validate cluster behavior under different scenarios to ensure reliability. The table below outlines core tests and expected outcomes.
| Test | Procedure (refer to Node IPs) | Expected result |
|------|-------------------------------|-----------------|
| Consistency | INSERT on Node-1 → SELECT on Node-2/3 | Rows visible in < 1 s |
| Conflict | Begin two sessions, UPDATE the same row concurrently on Node-2 & Node-3 | One session commits, the other gets errno 1213 |
| Failover | systemctl stop mariadb on Node-1 | Cluster size drops to 2, writes continue on Node-2/3 |
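The consistency row of the table can be scripted. The database, table, and client credentials below are placeholders; substitute an account that is allowed to connect remotely.

# Write a marker row through Node-1.
mysql -h 10.0.0.20 -u app -p'app_pw' -e "
  CREATE DATABASE IF NOT EXISTS smoke;
  CREATE TABLE IF NOT EXISTS smoke.ping (id INT PRIMARY KEY, src VARCHAR(20));
  REPLACE INTO smoke.ping VALUES (1, 'node-1');"

# It should be visible on Node-2 and Node-3 almost immediately.
sleep 1
for ip in 10.0.0.21 10.0.0.22; do
  echo "== $ip =="
  mysql -h "$ip" -u app -p'app_pw' -e "SELECT * FROM smoke.ping;"
done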