High availability (HA) means keeping a system operational with as little downtime as possible, even when hardware, software, or applications fail. In a production environment, it is achieved by building redundancy at the following levels:
- Multiple Web Servers
- Multiple Database Servers
- Multiple Load Balancers
- Leveraging different storage systems
The most commonly used HA configurations fall into two categories:
- Active/Passive: One node is always active while the other stays in passive (i.e. standby) mode. If the active node goes down, the passive node becomes active.
- Active/Active: Both nodes are active and both serve client requests.
How to set up High Availability via HAProxy?
HA can be set up with many software and hardware solutions, but the most widely used technologies today are HAProxy and Pacemaker.
We are going to set up HA via HAProxy. For this, we have used the following system requirements:
- 2 instances for load balancers with public IP
- 2 instances for web servers with public IP
- 3 instances for database servers with public IP
- One floating or virtual IP
Load Balancers: They distribute traffic or requests across the web servers and database servers according to the rules defined in the configuration files. In most cases, the load balancers run in Active/Passive mode, and we have done the same.
Web Servers: They are the application servers where the website content or application data is stored.
Database Servers: They are the servers on which MySQL or other database software is installed.
We have used the following infrastructure:
Load Balancer: Active/Passive mode via Keepalived
Web Servers: Active/Passive mode via HAProxy
Database Servers: Active/Passive mode via HAProxy
On the web servers, we have added an extra 10GB hard disk to both servers so that shared storage can be created and made available on both of them. The website content is placed in this shared storage. In a production environment, shared storage is usually configured with SAN or NAS technology, but these are a bit costly. Nowadays, the Ceph storage platform is also used for the storage system.
On the database servers, we have installed Galera, which brings all the database instances into Active/Active mode. That means all three nodes are active, and in most cases this works well. However, there is some chance of data inconsistency if multiple nodes are written to at the same time. So, we have configured HAProxy in such a way that there is only one active node and the remaining nodes act as backup nodes.
Below are the server hostnames along with their IP assignments. Here "vip-poc" is not a server; it is a floating IP assigned to the load balancers (i.e. lb1-poc and lb2-poc) via the Keepalived daemon, and all traffic or requests pass through this VIP only.
192.168.1.159 vip-poc.cloudhost.net vip-poc
192.168.1.160 lb1-poc.cloudhost.net lb1-poc
192.168.1.161 lb2-poc.cloudhost.net lb2-poc
192.168.1.162 web1-poc.cloudhost.net web1-poc
192.168.1.163 web2-poc.cloudhost.net web2-poc
192.168.1.164 db1-poc.cloudhost.net db1-poc
192.168.1.165 db2-poc.cloudhost.net db2-poc
192.168.1.166 db3-poc.cloudhost.net db3-poc
Server setup and configuration:
First, run the update command on all the servers after OS installation.
# yum update -y
Then, set up name resolution so that all the servers are reachable (i.e. pingable) by hostname. Since we are not using a DNS server right now, we can add the entries below to the /etc/hosts file on every server so that all servers can reach each other.
192.168.1.159 vip-poc.cloudhost.net vip-poc
192.168.1.160 lb1-poc.cloudhost.net lb1-poc
192.168.1.161 lb2-poc.cloudhost.net lb2-poc
192.168.1.162 web1-poc.cloudhost.net web1-poc
192.168.1.163 web2-poc.cloudhost.net web2-poc
192.168.1.164 db1-poc.cloudhost.net db1-poc
192.168.1.165 db2-poc.cloudhost.net db2-poc
192.168.1.166 db3-poc.cloudhost.net db3-poc
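Because the same entries are copied by hand to eight machines, a quick sanity check can help. The following is a minimal sketch, not part of the original setup: the helper names parse_hosts and conflicting_names are hypothetical, and the check only looks for a name or alias that is mapped to more than one IP address.

```python
from collections import defaultdict

# The hosts entries used in this write-up.
HOSTS_TEXT = """\
192.168.1.159 vip-poc.cloudhost.net vip-poc
192.168.1.160 lb1-poc.cloudhost.net lb1-poc
192.168.1.161 lb2-poc.cloudhost.net lb2-poc
192.168.1.162 web1-poc.cloudhost.net web1-poc
192.168.1.163 web2-poc.cloudhost.net web2-poc
192.168.1.164 db1-poc.cloudhost.net db1-poc
192.168.1.165 db2-poc.cloudhost.net db2-poc
192.168.1.166 db3-poc.cloudhost.net db3-poc
"""


def parse_hosts(text):
    """Return {name: set of IPs} for every hostname and alias in hosts-file text."""
    names = defaultdict(set)
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and blank lines
        if not line:
            continue
        ip, *hostnames = line.split()
        for name in hostnames:
            names[name].add(ip)
    return dict(names)


def conflicting_names(text):
    """Return the names that are mapped to more than one IP address."""
    return sorted(n for n, ips in parse_hosts(text).items() if len(ips) > 1)


if __name__ == "__main__":
    print(conflicting_names(HOSTS_TEXT) or "no conflicts")
```

On a real server the same functions could be fed the contents of /etc/hosts instead of the embedded text.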
We need to configure a time service on all the servers so that they all keep the same, synchronized time. We have used the chrony service rather than the NTP package because it works well. Install it on all servers:
# yum install chrony -y
We have used lb1-poc as the master server for time synchronization, so add the parameter below to the /etc/chrony.conf file on the lb1-poc server.
# vi /etc/chrony.conf
—————–
allow 192.168.1.0/24
—————-
Start and enable chrony service.
# systemctl start chronyd.service
# systemctl enable chronyd.service
On the other servers, add the line below to the /etc/chrony.conf file.
# vi /etc/chrony.conf
—————–
server lb1-poc iburst
Then, start and enable chrony service
# systemctl start chronyd.service
# systemctl enable chronyd.service
Now, verify time synchronization on each server with the command below:
# chronyc sources
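In the chronyc sources output, the source the server is currently synchronized to is marked with "^*" at the start of its line. The sketch below shows one way to check this in a script. The function name synced_source is hypothetical, and the sample output is illustrative only (the numbers are made up).

```python
# Illustrative sample of "chronyc sources" output; values are invented.
SAMPLE_OUTPUT = """\
210 Number of sources = 1
MS Name/IP address         Stratum Poll Reach LastRx Last sample
===============================================================================
^* lb1-poc                       3   6   377    21    +12us[  +15us] +/-   30ms
"""


def synced_source(chronyc_output):
    """Return the name of the source marked '^*' (currently selected), or None."""
    for line in chronyc_output.splitlines():
        if line.startswith("^*"):
            parts = line.split()
            if len(parts) >= 2:
                return parts[1]
    return None


if __name__ == "__main__":
    print(synced_source(SAMPLE_OUTPUT))
```

On the client servers, the expected result is lb1-poc. If the function returns None, the server has not selected a time source yet.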
1. Load Balancer Setup:
We have used HAProxy with the Keepalived daemon to load balance traffic across the web and database servers.
So, install the HAProxy and Keepalived packages on both LB nodes:
# yum install haproxy keepalived -y
We have used the lb1-poc node as master and lb2-poc as the backup node.
So the keepalived.conf file looks like this:
On the lb1-poc node:
# vi /etc/keepalived/keepalived.conf
———-
global_defs {
# Keepalived process identifier
lvs_id haproxy_DH
}
# Script used to check if HAProxy is running
vrrp_script check_haproxy {
script "killall -0 haproxy"
interval 2
weight 2
}
# Virtual interface
# The priority specifies the order in which the assigned interface to take over in a failover
vrrp_instance VI_01 {
state MASTER
interface em1
virtual_router_id 51
priority 101
# The virtual ip address shared between the two loadbalancers
virtual_ipaddress {
192.168.1.159
}
track_script {
check_haproxy
}
}
———————
Note: 192.168.1.159 is the virtual (floating) IP.
On the lb2-poc node:
# vi /etc/keepalived/keepalived.conf
———————
global_defs {
# Keepalived process identifier
lvs_id haproxy_DH_passive
}
# Script used to check if HAProxy is running
vrrp_script check_haproxy {
script "killall -0 haproxy"
interval 2
weight 2
}
# Virtual interface
# The priority specifies the order in which the assigned interface to take over in a failover
vrrp_instance VI_01 {
state BACKUP
interface em1
virtual_router_id 51
priority 100
# The virtual ip address shared between the two load balancers
virtual_ipaddress {
192.168.1.159
}
track_script {
check_haproxy
}
}
—————————-
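The priorities (101 and 100) and the track_script weight (2) work together: with a positive weight, Keepalived adds the weight to a node's priority while the check script succeeds, and the node with the highest resulting priority holds the VIP. The arithmetic can be sketched as below. This is a simplified model with hypothetical function names; it ignores the case where Keepalived itself is down and does not model ties.

```python
def effective_priority(base, weight, script_ok):
    """With a positive weight, the weight is added while the tracked script succeeds."""
    return base + weight if script_ok else base


def vip_holder(nodes):
    """nodes maps name -> (base_priority, weight, haproxy_running).

    The node with the highest effective priority holds the floating IP.
    """
    return max(nodes, key=lambda name: effective_priority(*nodes[name]))


if __name__ == "__main__":
    # Both HAProxy processes running: 103 vs 102, lb1-poc keeps the VIP.
    print(vip_holder({"lb1-poc": (101, 2, True), "lb2-poc": (100, 2, True)}))
    # HAProxy stopped on lb1-poc: 101 vs 102, the VIP moves to lb2-poc.
    print(vip_holder({"lb1-poc": (101, 2, False), "lb2-poc": (100, 2, True)}))
```

The failover only happens because the gap between the two base priorities (1) is smaller than the weight (2). With a larger gap, a failed HAProxy on lb1-poc would not move the VIP.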
On both LB nodes, add the parameters below to the haproxy.cfg file:
# vi /etc/haproxy/haproxy.cfg
——————————
global
log 127.0.0.1 local2
chroot /var/lib/haproxy
pidfile /var/run/haproxy.pid
maxconn 4000
user haproxy
group haproxy
daemon
# turn on stats unix socket
stats socket /var/lib/haproxy/stats
#———————————————————————
# common defaults that all the ‘listen’ and ‘backend’ sections will
# use if not designated in their block
#———————————————————————
defaults
mode http
log global
option httplog
option dontlognull
option http-server-close
option forwardfor except 127.0.0.0/8
option redispatch
retries 3
timeout http-request 10s
timeout queue 1m
timeout connect 10s
timeout client 1m
timeout server 1m
timeout http-keep-alive 10s
timeout check 10s
maxconn 3000
#———————————————————————
# main frontend which proxies to the backends
#———————————————————————
frontend LB
bind 192.168.1.159:80
reqadd X-Forwarded-Proto:\ http
default_backend Nodes
backend Nodes
mode http
# balance roundrobin
balance source
option httpchk
option httpclose
option forwardfor
cookie LB insert
server web1-poc 192.168.1.162:80 cookie web1-poc check # web1 server
server web2-poc 192.168.1.163:80 cookie web2-poc check backup # web2 server
listen DB_Servers 192.168.1.159:3306
balance source
mode tcp
option httpchk
server db1-poc 192.168.1.164:3306 port 9200 check
server db2-poc 192.168.1.165:3306 port 9200 check backup
server db3-poc 192.168.1.166:3306 port 9200 check backup
listen stats *:1936
stats enable
stats uri /
stats hide-version
stats auth haproxy:******
Note 1: web1-poc is active and web2-poc is a backup, so the web tier works in Active/Passive mode.
Note 2: Similarly, for the DB servers, db1-poc is active and db2-poc and db3-poc are backups. We use this for data consistency.
Note 3: The HAProxy statistics URL is http://192.168.1.159:1936/ with username: haproxy, password: ******. You can use any unused port instead of 1936.
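The effect of the backup keyword in Notes 1 and 2 can be modelled in a few lines. HAProxy sends traffic to backup servers only when no non-backup server is healthy, and by default (without option allbackups) it uses only the first healthy backup in configuration order. The sketch below is a simplified model with a hypothetical function name, not HAProxy's actual code.

```python
def eligible_servers(servers):
    """servers: list of (name, is_backup, is_up) tuples in configuration order.

    Returns the servers that would receive traffic: all healthy non-backup
    servers, or else only the first healthy backup server.
    """
    active = [name for name, is_backup, is_up in servers if is_up and not is_backup]
    if active:
        return active
    backups = [name for name, is_backup, is_up in servers if is_up and is_backup]
    return backups[:1]


if __name__ == "__main__":
    db = [("db1-poc", False, True), ("db2-poc", True, True), ("db3-poc", True, True)]
    print(eligible_servers(db))  # db1-poc only
    db[0] = ("db1-poc", False, False)
    print(eligible_servers(db))  # db2-poc takes over
```

This is why only one database node receives writes at a time even though all three Galera nodes are able to accept them.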
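Then start and enable the HAProxy and Keepalived daemons on both LB nodes. Note that HAProxy on the backup node can bind to 192.168.1.159 only if the kernel allows binding to non-local addresses (net.ipv4.ip_nonlocal_bind = 1):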
# systemctl start haproxy keepalived
# systemctl enable haproxy keepalived
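Once HAProxy is running, the statistics page can also be exported as CSV by appending ;csv to the stats URI, which is convenient for scripted checks. The sketch below parses such an export. The function name server_status is hypothetical, and the sample is trimmed to three columns for brevity; the real export has many more columns, which the parser ignores.

```python
import csv
import io

# Trimmed, illustrative sample of the CSV export (real exports have more columns).
SAMPLE_CSV = """# pxname,svname,status
LB,FRONTEND,OPEN
Nodes,web1-poc,UP
Nodes,web2-poc,UP
Nodes,BACKEND,UP
DB_Servers,db1-poc,DOWN
DB_Servers,db2-poc,UP
DB_Servers,db3-poc,UP
"""


def server_status(csv_text):
    """Return {(proxy, server): status}, skipping FRONTEND/BACKEND summary rows."""
    # The header line starts with "# ", which is stripped before parsing.
    reader = csv.DictReader(io.StringIO(csv_text.lstrip("# ")))
    return {
        (row["pxname"], row["svname"]): row["status"]
        for row in reader
        if row["svname"] not in ("FRONTEND", "BACKEND")
    }


if __name__ == "__main__":
    for (proxy, server), status in server_status(SAMPLE_CSV).items():
        print(proxy, server, status)
```

In this setup the CSV text would come from http://192.168.1.159:1936/;csv, fetched with the same username and password as the statistics page.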
2. Web Server Setup:
We use CentOS 6.8 64-bit for the web nodes because we are going to use the GFS2 file system with DLM locking.
Install DRBD on both web nodes:
# rpm -Uvh http://www.elrepo.org/elrepo-release-6-6.el6.elrepo.noarch.rpm
# yum -y update
# yum install -y kmod-drbd84 drbd84-utils
# modprobe drbd
Now create the r0.res resource file with the parameters below on both nodes:
# vi /etc/drbd.d/r0.res
———————
resource r0 {
protocol C;
startup {
become-primary-on both; ### For Primary/Primary ###
degr-wfc-timeout 60;
wfc-timeout 30;
}
disk {
on-io-error detach;
}
net {
allow-two-primaries; ### For Primary/Primary ###
cram-hmac-alg sha1;
shared-secret "mysecret";
after-sb-0pri discard-zero-changes;
after-sb-1pri violently-as0p;
after-sb-2pri violently-as0p;
}
on web1-poc.cloudhost.net {
device /dev/drbd0;
disk /dev/sdc1;
address 192.168.1.162:7788;
meta-disk internal;
}
on web2-poc.cloudhost.net {
device /dev/drbd0;
disk /dev/sdc1;
address 192.168.1.163:7788;
meta-disk internal;
}
}
————————
On both nodes, create the metadata for the r0 resource and start DRBD:
# drbdadm create-md r0
# /etc/init.d/drbd start
On the web1-poc node:
# drbdadm primary --force r0
This command makes web1-poc the primary and web2-poc the secondary, and then starts data synchronization.
You can check the status of data synchronization with the command below:
# cat /proc/drbd
Let the data finish synchronizing between both nodes.
After that, promote web2-poc to primary as well:
On the web2-poc node:
# drbdadm primary r0
Now both web1-poc and web2-poc are primary. That means they are in Active/Active mode.
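Before putting a cluster file system on top, it is worth confirming that /proc/drbd really shows both nodes as Primary and both disks as UpToDate. The sketch below parses the DRBD 8.4 status line for that purpose. The function names are hypothetical and the sample lines are illustrative.

```python
import re

# Matches the "cs:... ro:.../... ds:.../..." part of a /proc/drbd status line.
STATUS_RE = re.compile(r"cs:(\S+)\s+ro:(\S+)/(\S+)\s+ds:(\S+)/(\S+)")

SYNCED = " 0: cs:Connected ro:Primary/Primary ds:UpToDate/UpToDate C r-----"
SYNCING = " 0: cs:SyncSource ro:Primary/Secondary ds:UpToDate/Inconsistent C r-----"


def drbd_state(proc_drbd_text):
    """Return connection state, roles and disk states, or None if no status line is found."""
    match = STATUS_RE.search(proc_drbd_text)
    if match is None:
        return None
    cs, local_role, peer_role, local_disk, peer_disk = match.groups()
    return {
        "connection": cs,
        "roles": (local_role, peer_role),
        "disks": (local_disk, peer_disk),
    }


def is_dual_primary_in_sync(proc_drbd_text):
    """True only when the nodes are connected, both Primary, and both UpToDate."""
    state = drbd_state(proc_drbd_text)
    return (
        state is not None
        and state["connection"] == "Connected"
        and state["roles"] == ("Primary", "Primary")
        and state["disks"] == ("UpToDate", "UpToDate")
    )


if __name__ == "__main__":
    print(is_dual_primary_in_sync(SYNCED), is_dual_primary_in_sync(SYNCING))
```

On a web node, the text would be read from /proc/drbd instead of the sample strings.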
Configure GFS:
We are going to configure GFS2, a cluster file system, to use on top of DRBD.
Install the packages below on both web nodes:
# yum -y install gfs2-utils cman lvm2-cluster
Then add the parameters below to /etc/cluster/cluster.conf on both web nodes:
[root@web1-poc cluster]# vi cluster.conf
———————————————–
<?xml version="1.0"?>
<cluster name="cluster1" config_version="3">
  <cman two_node="1" expected_votes="1"/>
  <clusternodes>
    <clusternode name="web1-poc.cloudhost.net" votes="1" nodeid="1">
      <fence>
        <method name="single">
          <device name="manual" ipaddr="192.168.1.162"/>
        </method>
      </fence>
    </clusternode>
    <clusternode name="web2-poc.cloudhost.net" votes="1" nodeid="2">
      <fence>
        <method name="single">
          <device name="manual" ipaddr="192.168.1.163"/>
        </method>
      </fence>
    </clusternode>
  </clusternodes>
  <fence_daemon clean_start="1" post_fail_delay="0" post_join_delay="3"/>
  <fencedevices>
    <fencedevice name="manual" agent="fence_manual"/>
  </fencedevices>
</cluster>
————————-
Use the command below to check and validate the cluster configuration on the web1-poc node:
[root@web1-poc ~]# ccs_config_validate
Configuration validates
[root@web1-poc ~]#
Start cluster services on both nodes:
# /etc/init.d/cman start
# /etc/init.d/clvmd start
# /etc/init.d/gfs2 start
Enable cluster services in both nodes:
# chkconfig cman on
# chkconfig clvmd on
# chkconfig gfs2 on
Now format the device, only on web1-poc node:
# mkfs.gfs2 -p lock_dlm -t cluster1:gfs -j 2 /dev/drbd0
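Two arguments of the mkfs.gfs2 command depend on cluster.conf: the lock table given with -t has the form clustername:fsname, where clustername must match the cluster name, and -j must provide one journal for each node that mounts the file system. The sketch below checks both. The function name is hypothetical and the XML sample is a trimmed copy of the configuration used here.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of cluster.conf: only the parts this check needs.
CLUSTER_CONF = """<cluster name="cluster1" config_version="3">
  <clusternodes>
    <clusternode name="web1-poc.cloudhost.net" votes="1" nodeid="1"/>
    <clusternode name="web2-poc.cloudhost.net" votes="1" nodeid="2"/>
  </clusternodes>
</cluster>"""


def check_mkfs_args(cluster_conf_xml, lock_table, journals):
    """Return a list of problems with the -t and -j values (empty if none found)."""
    root = ET.fromstring(cluster_conf_xml)
    cluster_name = root.get("name")
    node_count = len(root.findall("./clusternodes/clusternode"))

    problems = []
    table_cluster, _, fs_name = lock_table.partition(":")
    if table_cluster != cluster_name:
        problems.append(f"lock table cluster {table_cluster!r} != {cluster_name!r}")
    if not fs_name:
        problems.append("lock table has no file system name")
    if journals < node_count:
        problems.append(f"{journals} journal(s) for {node_count} nodes")
    return problems


if __name__ == "__main__":
    print(check_mkfs_args(CLUSTER_CONF, "cluster1:gfs", 2) or "arguments look consistent")
```

For the command above (-t cluster1:gfs -j 2) the check returns no problems.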
Create the /data directory and mount the formatted device on it on the web1-poc node:
# mkdir /data
# mount -t gfs2 /dev/drbd0 /data
Then in web2-poc node:
# mkdir /data
# mount -t gfs2 /dev/drbd0 /data
Add the device to /etc/fstab on both nodes (i.e. web1-poc and web2-poc):
# vi /etc/fstab
————–
/dev/drbd0 /data gfs2 defaults 0 0
————–
How to Set Up MariaDB Galera Cluster 10.1 on CentOS 7? (Database Server Setup)
MariaDB Galera Cluster is a synchronous multi-master cluster for MariaDB, a fast and scalable relational database. It is available on Linux only and supports the XtraDB/InnoDB storage engines.
MariaDB Galera Cluster Features:
- Multi-master topology with Active-active mode
- Easy read and write to any cluster node
- True parallel replication (row level) & Automatic node joining
- Direct client connections, native MariaDB look & feel
- Automatic membership control, failed nodes drop from the cluster
MariaDB Galera Cluster Advantages:
You can gain several benefits from the above-mentioned features of a DBMS clustering solution, including:
- Loss free transactions
- Read and write scalability
- No replication slave lag
- Smaller client latencies
We have used 3 freshly deployed VMs running a minimal install of CentOS 7.2 x86_64:
# hostnamectl set-hostname db1-poc.cloudhost.net <-- use each server's own hostname
# vi /etc/hosts <-- set up the hosts file on all servers
192.168.1.162 web1-poc.cloudhost.net web1-poc
192.168.1.163 web2-poc.cloudhost.net web2-poc
192.168.1.164 db1-poc.cloudhost.net db1-poc
192.168.1.165 db2-poc.cloudhost.net db2-poc
192.168.1.166 db3-poc.cloudhost.net db3-poc
# vi /etc/hosts.allow <-- set up TCP wrapper security for SSH login
# rpm -Uvh https://dl.fedoraproject.org/pub/epel/epel-release-latest-7.noarch.rpm
# rpm -Uvh https://mirror.webtatic.com/yum/el7/webtatic-release.rpm
MariaDB Galera setup:
Create a mariadb repository /etc/yum.repos.d/mariadb.repo using following content in your system.
# vi /etc/yum.repos.d/mariadb.repo
# MariaDB 10.1 CentOS repository list – created 2015-11-08 17:34 UTC
# http://mariadb.org/mariadb/repositories/
[mariadb]
name = MariaDB
baseurl = http://yum.mariadb.org/10.1/centos7-amd64
gpgkey=https://yum.mariadb.org/RPM-GPG-KEY-MariaDB
gpgcheck=1
How to Install MariaDB Galera Cluster 10.1 software?
Use the following commands to install it (in MariaDB 10.1, Galera support is part of the MariaDB-server package):
# yum install MariaDB-server MariaDB-client MariaDB-compat galera rsync socat jemalloc
# systemctl start mariadb
# mysql_secure_installation
# mysql -u root -p
GRANT ALL PRIVILEGES ON *.* TO 'root'@'%' IDENTIFIED BY '******' WITH GRANT OPTION;
FLUSH PRIVILEGES;
# systemctl stop mariadb
Edit the MariaDB Galera Cluster config:
# vim /etc/my.cnf.d/server.cnf
[galera]
# Mandatory settings
wsrep_on=ON
wsrep_provider=/usr/lib64/galera/libgalera_smm.so
wsrep_cluster_address='gcomm://192.168.1.164,192.168.1.165,192.168.1.166'
wsrep_node_address='192.168.1.164'
wsrep_sst_method=rsync
wsrep_cluster_name='cluster1'
wsrep_node_name='db1-poc'
binlog_format=row
default_storage_engine=InnoDB
innodb_autoinc_lock_mode=2
bind-address=0.0.0.0
Starting the Galera Cluster:
Once the configuration is done, start the cluster. Bootstrap it first on the db1-poc node with this command:
# galera_new_cluster
IMPORTANT NOTE: Before starting MariaDB on db2 and db3, do not forget to adjust the wsrep_node_address and wsrep_node_name variables in their configuration.
On db2:
wsrep_node_address='192.168.1.165'
wsrep_node_name='db2-poc'
On db3:
wsrep_node_address='192.168.1.166'
wsrep_node_name='db3-poc'
Then start the other two nodes with the normal systemctl command:
# systemctl start mariadb
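Since only two values differ between the three database nodes, the per-node settings can be generated from a single table instead of being edited by hand. The following is a minimal sketch with hypothetical names (NODES, galera_settings); it only prints the node-specific lines and does not write any file.

```python
# Node name -> IP address, taken from the hosts entries in this write-up.
NODES = {
    "db1-poc": "192.168.1.164",
    "db2-poc": "192.168.1.165",
    "db3-poc": "192.168.1.166",
}


def galera_settings(node_name, nodes=NODES, cluster_name="cluster1"):
    """Return the wsrep lines for one node; only address and name vary per node."""
    addresses = ",".join(nodes.values())
    return "\n".join([
        f"wsrep_cluster_address='gcomm://{addresses}'",
        f"wsrep_cluster_name='{cluster_name}'",
        f"wsrep_node_address='{nodes[node_name]}'",
        f"wsrep_node_name='{node_name}'",
    ])


if __name__ == "__main__":
    for name in NODES:
        print(f"--- {name} ---")
        print(galera_settings(name))
```

The cluster address and cluster name lines come out identical on every node, which is what Galera expects.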
Check the Galera cluster size:
# mysql -u root -p -e "show status like 'wsrep%'"
wsrep_local_state_comment | Synced <-- cluster is synced
wsrep_incoming_addresses | 192.168.1.164:3306 <-- node db1 is a provider
wsrep_cluster_size | 3 <-- cluster consists of 3 nodes
wsrep_ready | ON <-- good
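The same three values can be checked in a script, for example from a monitoring job. The sketch below parses the status output and applies the checks described above. The function names are hypothetical, and the parser accepts both the tab-separated and the "|"-separated output formats of the mysql client.

```python
import re

SAMPLE_STATUS = """\
Variable_name\tValue
wsrep_local_state_comment\tSynced
wsrep_incoming_addresses\t192.168.1.164:3306,192.168.1.165:3306,192.168.1.166:3306
wsrep_cluster_size\t3
wsrep_ready\tON
"""


def parse_wsrep_status(output):
    """Return {variable: value} for every wsrep_* row in the status output."""
    status = {}
    for line in output.splitlines():
        tokens = [t for t in re.split(r"[\s|]+", line) if t]
        if len(tokens) >= 2 and tokens[0].startswith("wsrep_"):
            status[tokens[0]] = tokens[1]
    return status


def cluster_healthy(status, expected_size=3):
    """True when the node is synced, ready, and sees the expected number of nodes."""
    return (
        status.get("wsrep_local_state_comment") == "Synced"
        and status.get("wsrep_ready") == "ON"
        and status.get("wsrep_cluster_size") == str(expected_size)
    )


if __name__ == "__main__":
    print(cluster_healthy(parse_wsrep_status(SAMPLE_STATUS)))
```

A cluster size below 3 on any node means that at least one node has dropped out and should be investigated.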
Verify replication:
Create a database named cluster on the db1 node (create database cluster;). Once it is created on db1, it is automatically replicated to all the other nodes.
If you have any questions about this write-up, let me know in the comments section.