Installing a database cluster using Galera Cluster with MariaDB

Galera Cluster robust database cluster made on top of MariaDB or MySQL. It contains several features that make it more reliable that standard MySQL Replication. It is a battle-tested solution capable of supporting high traffic applications/websites efficiently. With galera cluster, the application can read and write on any node. It does not use a dedicated master unlike standard MySQL replication. Adding and removing new nodes to the cluster is also much easier.

In this guide, we will create a database server using MariaDB 10.11 (latest stable as at publishing date). To complete this, we will need 4 node. Below are my node details:

HostnameIP AddressFeaturesSoftware
loadbalancer192.168.0.61 core, 512MB RAMUbuntu 20.04, HAProxy
db01192.168.0.91 core, 2048MB RAMUbuntu 20.04
db02192.168.0.21 core, 2048MB RAMUbuntu 20.04
db03192.168.0.31 core, 2048MB RAMUbuntu 20.04

Setting Up the Load Balancer

One of the tasks of a load balancer is to distribute traffic to backend servers. We are going to set up a single node load balancer for our demo. The purpose of your load balancer is to help your connect your applications to your database infrastructure. Your app connects to the load balancer, the load balancer decides which backend database server is available and able to process the apps request.

Note: This load balancer section is not a production ready setup, since having just one load balancer node is in itself a single point of failure. A single load balancer negates the database high availability work. So in production environments, consider having a highly available load balancer set up, or subscribe to load balancers provided by cloud providers or Cloudflare

1.Login to your server via SSH.

2.Install HAProxy on the server using the command below

$ apt install haproxy

3.Backup the HAProxy config file at /etc/haproxy/haproxy.cfg to /etc/haproxy/haproxy.cfg.backup

4. Create a new HAProxy config file and open it using your favorite editor then copy the lines below

global
    log /dev/log    local0
    log /dev/log    local1 notice
    chroot /var/lib/haproxy
    stats socket /run/haproxy/admin.sock mode 660 level admin expose-fd listeners
    stats timeout 30s
    user haproxy
    group haproxy
    daemon

defaults
    log     global
    mode    tcp
    option  tcplog
    option  dontlognull
        timeout connect 5000ms
        timeout client  50000ms
        timeout server  50000ms

frontend mysql_frontend
    bind *:3306
    mode tcp
    default_backend mysql_backend

backend mysql_backend
    mode tcp
    balance roundrobin
    option mysql-check user haproxy_check
    server db01 192.168.0.9:3306 check inter 2000 rise 2 fall 3
    server db02 192.168.0.2:3306 check inter 2000 rise 2 fall 3
    server db03 192.168.0.3:3306 check inter 2000 rise 2 fall 3

5. Restart haproxy

$ systemctl restart haproxy

6.Make sure port 3306 is open on the server firewall

$ ufw allow 3306/tcp

Setting up the database servers

Now, let us set up our databases and cluster then. We will install MariaDB on the three servers.

1. Access your the first server via SSH using root user or a sudo enabled user.

2.Run the following commands in order to install needed dependencies and the MariaDB repository

$ sudo apt update
$ sudo apt-get install apt-transport-https curl ca-certificates
$ sudo curl -o /etc/apt/trusted.gpg.d/mariadb_release_signing_key.asc '<a href="https://mariadb.org/mariadb_release_signing_key.asc">https://mariadb.org/mariadb_release_signing_key.asc</a>'
$ sudo sh -c "echo 'deb <a href="https://mariadb.mirror.liquidtelecom.com/repo/10.11/ubuntu">https://mariadb.mirror.liquidtelecom.com/repo/10.11/ubuntu</a> focal main' >>/etc/apt/sources.list"

Note: You can navigate to https://mariadb.org/download to get the latest commands as the versions of MariaDB keep being released. MariaDB has a nice command generator that gives you commands to run for your linux distribution. You want to use the latest version from MariaDB Server Repositories.

3.Repeat the above process on the other 2 database servers

4. Next, we need to configure the database servers. To do this, login to the first server via SSH again and navigate to the mariadb configuration folder. This is located under /etc/mysql/mariadb.cnf.d

5. Rename the file called 50-server.cnf to 50-server.cnf.backup. We will create a new one and copy our own content.

$ mv 50-server.cnf 50-server.cnf.backup

6. Create a new 50-server.cnf file

$ touch 50-server.cnf

Then open the file using your favorite text editor(e.g. vim or nano) and copy the content below into the file

# MySQL Options

[mysqld]
datadir=/var/lib/mysql
bind-address=0.0.0.0
user=mysql

default_storage_engine=InnoDB
innodb_autoinc_lock_mode=2
innodb_flush_log_at_trx_commit=0
innodb_buffer_pool_size=128M

binlog_format=ROW
symbolic-links=0

log-error=/var/log/mysqld.log
pid-file=/var/run/mysqld/mysqld.pid

# Galera Options

wsrep_on=ON
wsrep_provider=/usr/lib/galera/libgalera_smm.so

wsrep_node_name='db02'
wsrep_node_address="192.168.0.2"

wsrep_cluster_name='galera-training'
wsrep_cluster_address="gcomm://192.168.0.9,192.168.0.2,192.168.0.3"
wsrep_provider_options="gcache.size=300M; gcache.page_size=300M"
wsrep_slave_threads=4
wsrep_sst_method=rsync

Below is a quick explanation of the some of the key options

bind-address=0.0.0.0

This options tells MariaDB from where it should accept connections. The default value is 127.0.0.1 but this value cannot work in a cluster environment since the various servers need to communicate with each other. Therefore, this must be set to 0.0.0.0, to tell MariaDB to allow connections from any network.

default_storage_engine=InnoDB

Galera cluster only supports InnoDB engine. As such, its important to ensure that all new tables created use InnoDB. This options facilitates that.

binlog_format=ROW

Galera cluster only supports row-level replication as the binary log format. It does not use statement-level replication. The correct format is set by the directive above.

wsrep_on=ON 
wsrep_provider=/usr/lib/galera/libgalera_smm.so

These options are used to enable galera replication. The official documentation may have the wsrep_provider path pointing to a different location. But this depends on the OS. In Ubuntu, the correct path is as indicated above. Should you get any errors while starting the galera cluster(as seen later) due to incorrect wsrep_provider path, use the command below to find the location of libgalera_smm.so

$ find /usr -name "libgalera_smm.so"
wsrep_node_name='db02'
wsrep_node_address="192.168.0.2"

These are node specific directives. You must update them to match a nodes IP and name, each of which must be unique for every node in the cluster. They help the nodes in the cluster identify each other.

For the other options and more, please refer to the official documentation and this tutorial created by Codership Oy

7. Repeat step 6 above on the other nodes, changing the value of wsrep_node_name and wsrep_node_address accordingly

8. We now need to open the relevant ports. Galera cluster uses the following ports:

3306 tcp, 4444 tcp, 4567 tcp and udp, 4568 tcp

To open these ports on an Ubuntu server that uses ufw, run the command below

$ sudo ufw allow 3306/tcp && sudo ufw allow 4567/tcp && sudo ufw allow 4567/udp && sudo ufw allow 4568/tcp && sudo ufw allow 4444/tcp

Starting the cluster

Great. Now that we have configured the 3 servers appropriately, we need to start them. A galera cluster is not started as a standalone mariadb or mysql installation would be started. You will not successfully start it using systemctl start mariadb command

1.To successfully start a galera cluster, you need to select one of the server to be the bootstraping server. This can be any of the 3 servers, nothing special about it really.

This mariadb server will need to be started by running the command below

$ galera_new_cluster

2.Once it starts, the other servers in the cluster can be started using the usual systemd command, one server at a time:

$ systemctl start mariadb.

Once started, login to mariadb using your mariadb password and run the following query to check the size of the cluster.

MariaDB [(none)]> show status like 'wsrep_cluster_size';

The result should be as follows

+--------------------+-------+
| Variable_name | Value |
+--------------------+-------+
| wsrep_cluster_size | 3 |
+--------------------+-------+

If you stop or poweroff any of the database servers and run the command in one of the remaining ones, you should see the cluster size reduce as follows

+--------------------+-------+
| Variable_name | Value |
+--------------------+-------+
| wsrep_cluster_size | 2 |
+--------------------+-------+

When enough servers shutdown so that quorum is not met, the whole cluster will shutdown. In this case, you would need to restart the cluster after fixing whatever issue caused the downtime

Restarting a galera cluster

To restart a Galera cluster, you would first need to check which server is ‘safe to bootstrap’ with. To do this take note of your data directory, which in our case – and also by default, is /var/lib/mysql. Inside it, there will be a file called grastate.dat

$ ls -l /var/lib/mysql/grastate.dat
-rw-rw---- 1 mysql mysql 113 Apr 15 13:42 /var/lib/mysql/grastate.dat

Using grep, look for safe_to_bootstrap option inside the file. This should be done on each server.

$ grep safe_to_bootstrap /var/lib/mysql/grastate.dat

3. In one of the servers, the value will be set to 1. This is the server from which you can successfully restart the cluster. Therefore in this server, run the command galera_new_cluster.

4. Then run the systemctl start mariadb command on the other servers.

In some cases, when the cluster did not shut down gracefully, the value of safe_to_bootstrap will be 0 on all sevrers. In that case, select any one of the servers and manually edit the /var/lib/mysql/grastate.dat file, setting safe_to_bootstrap:1

Save the file and use that server to bootstrap the cluster (start the cluster)

That is that! You can now enjoy a highly available, load balanced database infrastructure.

Conclusion

Galera cluster has proven itself as a robust solution capable of handling high load traffic in a highly available manner. Check out some companies using Galera cluster and the various use cases of Galera cluster to see how it can fit into your database needs.

Leave a Reply

Your email address will not be published. Required fields are marked *