How to Set Up MariaDB Galera Multi-Master Synchronous Replication on Debian 10

MariaDB offers two different high availability (HA) and clustering solutions. The first is standard MariaDB master/slave replication, which can be configured in various topologies, mainly for load balancing, HA, and backup purposes. The second is MariaDB Galera, a multi-master synchronous clustering solution. Its main features are as follows:

  • Multi-master: All nodes in a Galera cluster can perform both read and write operations, offering better scalability.
  • Nodes can join a cluster automatically, and are evicted upon failure.
  • Galera replication is synchronous, meaning changes on one node are guaranteed to be applied on the other nodes. In theory, this ensures that no data is lost when a node fails.

This guide will walk you through the installation of MariaDB and its configuration as a Galera cluster. We will be using three Debian 10 nodes for demonstration, though any number of nodes (three or more) can be used. Setting up two nodes in a Galera cluster is technically possible but does not provide fault tolerance, as the failure of one node causes the remaining node to lose quorum and stop accepting queries.

Requirements

  • Three or more Debian 10 instances.
  • Access to the root user or any user with sudo privileges.
  • The $EDITOR environment variable should be set.

NOTE: Galera clusters can work over WAN or LAN. If your nodes share a private network, use private IP addresses where applicable. Otherwise, WAN addresses should be used.
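
If you are unsure which address each node should use, you can list the IPv4 addresses configured on a node with, for example:

ip -4 addr show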

If using a sudo user, open a root shell for the duration of this setup with:

sudo -s

Step 1: Installing MariaDB

This step should be executed on all nodes.

Use the following commands to install MariaDB, the Galera library, and rsync. The latter is used by Galera for state transfers between nodes.

apt update
apt install -y mariadb-server mariadb-client galera-3 rsync

Ensure the MariaDB service is enabled:

systemctl enable mariadb.service

Secure your MariaDB instances using the mysql_secure_installation script:

mysql_secure_installation

Answer the questions as shown below and make sure you choose a strong password for the MariaDB root user.

Enter current password for root (enter for none): Press <Enter>
Set root password? [Y/n] y
New password: your_password
Re-enter new password: your_password
Remove anonymous users? [Y/n] y
Disallow root login remotely? [Y/n] y
Remove test database and access to it? [Y/n] y
Reload privilege tables now? [Y/n] y
All done!  If you've completed all of the above steps, your MariaDB
installation should now be secure.

Step 2: Configuring MariaDB

This step should be executed on all nodes.

Stop the MariaDB service on all nodes:

systemctl stop mariadb.service

By default, the MariaDB daemon listens for connections on localhost only. In order for the cluster to work, this should be changed to an externally accessible address. To do so, edit the option file /etc/mysql/mariadb.conf.d/50-server.cnf:

$EDITOR /etc/mysql/mariadb.conf.d/50-server.cnf

Find the following line:

bind-address = 127.0.0.1

If you are using a private network for the cluster and do not want to expose MariaDB to other networks (i.e. the WAN), specify the local IPv4 address of each node. Otherwise, use 0.0.0.0, which instructs MariaDB to listen on all interfaces. For example:

bind-address = 0.0.0.0

Save the change and exit your text editor.
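
If you want to double-check the change without reopening the editor, the following should print the updated value:

grep bind-address /etc/mysql/mariadb.conf.d/50-server.cnf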

We will now configure cluster-related options. Create a new option file:

$EDITOR /etc/mysql/mariadb.conf.d/99-cluster.cnf

Enter the following configuration into the file, replacing the example IP addresses with those of your nodes. The file should be identical on all nodes.

[galera]

wsrep_on = on
wsrep_provider = /lib/galera/libgalera_smm.so
wsrep_cluster_address = gcomm://192.0.2.1,192.0.2.2,192.0.2.3
wsrep_cluster_name = galera_cluster_0
default_storage_engine = InnoDB
innodb_autoinc_lock_mode = 2
innodb_doublewrite = 1
binlog_format = ROW
  • wsrep_on = on enables write set replication, the underlying functionality used by Galera.
  • wsrep_provider specifies the path to the Galera library. It is provided by the galera-3 package at /lib/galera/libgalera_smm.so on Debian 10.
  • wsrep_cluster_address should contain at least one address of another cluster member. Listing all members of the cluster is recommended. No particular order is necessary.
  • wsrep_cluster_name should be unique to the cluster and should be identical on all nodes of the same Galera cluster.
  • The remaining options are required for Galera to work properly and should not be changed.
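
As an optional sanity check before bootstrapping, you can confirm on each node that the Galera provider library exists at the configured path:

ls -l /lib/galera/libgalera_smm.so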

Step 3: Bootstrapping the cluster

Make sure MariaDB is stopped/inactive on all nodes before proceeding:

systemctl status mariadb.service

To start the cluster, one node first needs to create it. On Debian 10, this can be done with the galera_new_cluster script, which should be executed on one node only, and only once, to initialize the cluster.

galera_new_cluster

This will start MariaDB on the current node. Make sure it is running with:

systemctl status mariadb.service

Then start MariaDB on the other nodes with:

systemctl start mariadb.service

The cluster should now be operational. 
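
To verify from any node that all nodes have joined, you can check the cluster size status variable (covered in more detail in Step 5):

mysql -u root -p -e "SHOW STATUS LIKE 'wsrep_cluster_size';"

The value should match the number of nodes you started, three in this example.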

Step 4: Testing

To make sure the cluster is working as intended, pick any node and log in to MariaDB:

mysql -u root -p

Issue the following statement to create a database:

> CREATE DATABASE test0;
> \q

Then check for this new database on all other nodes:

mysql -u root -p -e "SHOW DATABASES;"

The above command should return a list containing test0:

+--------------------+
| Database           |
+--------------------+
| information_schema |
| mysql              |
| performance_schema |
| test0              |
+--------------------+

You may want to test more thoroughly by writing to the cluster from every node; a short example of such a test is sketched below. Once you're satisfied with testing, clean up any unneeded databases from the cluster. Any node can be used.

mysql -u root -p -e "DROP DATABASE test0;"
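
Here is what such a round-trip test could look like. The database and table names (test1, t1) are arbitrary and only used for illustration; run each command on a different node.

On the first node, create a database and a table, and insert a row:

mysql -u root -p -e "CREATE DATABASE test1; CREATE TABLE test1.t1 (id INT PRIMARY KEY, origin VARCHAR(64)) ENGINE=InnoDB; INSERT INTO test1.t1 VALUES (1, 'node 1');"

On the second node, insert another row:

mysql -u root -p -e "INSERT INTO test1.t1 VALUES (2, 'node 2');"

On the third node, read the table back; both rows should be present:

mysql -u root -p -e "SELECT * FROM test1.t1;"

Finally, clean up from any node:

mysql -u root -p -e "DROP DATABASE test1;"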

Step 5: Troubleshooting Tips

Use the following query to view information about the current state of the node/cluster:

mysql -u root -p -e "SELECT * FROM information_schema.global_status WHERE variable_name IN ('WSREP_CLUSTER_STATUS','WSREP_LOCAL_STATE_COMMENT','WSREP_CLUSTER_SIZE','WSREP_EVS_REPL_LATENCY','WSREP_EVS_DELAYED','WSREP_READY');"

A healthy 3-node cluster should return the following:

+---------------------------+----------------+
| VARIABLE_NAME             | VARIABLE_VALUE |
+---------------------------+----------------+
| WSREP_CLUSTER_SIZE        | 3              |
| WSREP_CLUSTER_STATUS      | Primary        |
| WSREP_EVS_DELAYED         |                |
| WSREP_EVS_REPL_LATENCY    | 0/0/0/0/0      |
| WSREP_LOCAL_STATE_COMMENT | Synced         |
| WSREP_READY               | ON             |
+---------------------------+----------------+
  • WSREP_CLUSTER_SIZE represents the current number of nodes in the cluster component.
  • WSREP_CLUSTER_STATUS represents the state of the cluster component and not the cluster as a whole.
  • WSREP_EVS_DELAYED shows a list of nodes that are delayed. An empty value is expected in a healthy cluster.
  • WSREP_EVS_REPL_LATENCY shows replication latency in the format min/avg/max/stddev/samplesize. The values are displayed in seconds. Very high latencies may lead to degraded performance. 
  • WSREP_LOCAL_STATE_COMMENT shows the current node state.
  • WSREP_READY indicates whether the node can accept queries.
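
If you want a broader picture than the filtered query above, you can also dump all wsrep status variables at once and scan the output:

mysql -u root -p -e "SHOW GLOBAL STATUS LIKE 'wsrep_%';"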

When a node in a 3-node cluster loses connectivity, the cluster is partitioned into a primary component consisting of 2 nodes and a non-primary component. The primary component is not affected by the outage and continues normal operation. From the perspective of the non-primary component, the query shown above would return the following:

+---------------------------+--------------------------------------------------------------------------------------------------------------------------------+
| VARIABLE_NAME             | VARIABLE_VALUE                                                                                                                 |
+---------------------------+--------------------------------------------------------------------------------------------------------------------------------+
| WSREP_CLUSTER_SIZE        | 1                                                                                                                              |
| WSREP_CLUSTER_STATUS      | non-Primary                                                                                                                    |
| WSREP_EVS_DELAYED         | 6b7864f2-fe7d-11e9-84ab-93e58c0d2907:tcp://192.0.2.1:4567:3,a421be89-fe7d-11e9-a91e-7e62f7562e58:tcp://192.0.2.3:4567:2        |
| WSREP_EVS_REPL_LATENCY    | 0/0/0/0/0                                                                                                                      |
| WSREP_LOCAL_STATE_COMMENT | Initialized                                                                                                                    |
| WSREP_READY               | OFF                                                                                                                            |
+---------------------------+--------------------------------------------------------------------------------------------------------------------------------+

Notice the WSREP_EVS_DELAYED value, which indicates connectivity problems to the other nodes.

On primary component nodes, the same query returns:

+---------------------------+----------------------------------------------------------------+
| VARIABLE_NAME             | VARIABLE_VALUE                                                 |
+---------------------------+----------------------------------------------------------------+
| WSREP_CLUSTER_SIZE        | 2                                                              |
| WSREP_CLUSTER_STATUS      | Primary                                                        |
| WSREP_EVS_DELAYED         | a2217526-fe7d-11e9-8692-1f2f0cdb403d:tcp://192.0.2.2:4567:2    |
| WSREP_EVS_REPL_LATENCY    | 0/0/0/0/0                                                      |
| WSREP_LOCAL_STATE_COMMENT | Synced                                                         |
| WSREP_READY               | ON                                                             |
+---------------------------+----------------------------------------------------------------+

No manual intervention is required to recover from single-node failures: when the failed node reconnects to the cluster, it synchronizes automatically.
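
For example, if the failed node was rebooted, simply make sure the MariaDB service is running on it again; it will rejoin and resync on its own:

systemctl start mariadb.service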

More Information

For advanced configuration options, refer to Galera Cluster System Variables.
