QMT Failover replication Setup

From QmailToaster

Craig Smith - 26th October 2006 - craig@doc-net.com

Thanks to Jake for taking the time to review this for me before posting. It always helps to have a sounding board and Jake was kind enough to be that board for me.


This page describes how to configure a backup QMT server that is available for failover if the primary server fails. With the default cron schedule, the backup is never more than about 10 minutes behind the primary (depending on cron job timing).

Please note that the initial replication (the first run) will take some time, so schedule it for off-peak hours. Once the first run has finished and unison has built its database of the files it manages, subsequent runs are pretty quick. Enable the cron job at a time when you can absorb the traffic of the initial replication.

This setup also assumes two servers where the unison port is internal and not publicly visible. The socket transport is not secure and should not be used on open networks; if you cannot use a private network, read up on running unison over ssh instead.

This was set up and tested on Fedora Core 5 on both servers, and it works without any hiccups.

The details are pretty much cut and paste.

I would test this installation on a similar setup first, as I only have my own system to compare with, but given how easy unison is to install and use, I don't foresee any major problems.

This setup assumes that QMT is installed and configured on both servers. The backup server's version must match the primary's.

I have also written a script that tests your server's availability by pinging it. It is posted at the bottom of this page, but still needs work. The mail commands are purely for testing: if the primary server really were down, the emails wouldn't go out until you switched to the backup, which doesn't help. Replacing the mail commands with SMS text messages would certainly help. It is a quick and easy monitoring system that can be added to cron to run every x minutes.

This is a first draft and these are also my first proper linux/unix scripts, so please forgive any errors and let me know if you have any problems. If the primary server fails and the switch to the backup is made quickly, clients will probably not even notice the downtime.

  • To change which files/folders are replicated or ignored, edit the unison profile (qmail.prf).
  • This setup is for servers in a LAN environment where the unison port is not visible to the public world, so the security overhead of ssh is not needed. However, unison can be run over ssh if needed.
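For readers who do need ssh, the following is a minimal sketch of what the replication command could look like. It only prints the command instead of running it. The host name is a placeholder, the two paths are a subset of those in qmail.prf, and -servercmd is included on the assumption that unison lives in /unison on the backup server rather than on root's PATH. Key-based ssh login for root would also be needed for unattended cron runs. Check the unison manual for your version before relying on it.

```shell
#!/bin/sh
# Hypothetical value: replace with your own backup host.
BACKUP_HOST=${BACKUP_HOST:-backup.example.com}

# Print (not run) a unison command that replicates over ssh instead of a socket.
# ssh://root@host// names the root directory on the remote host.
# -servercmd tells unison where its binary is on the remote side
# (assumed to be /unison/unison, as installed in this guide).
unison_ssh_cmd() {
    echo "/unison/unison / ssh://root@$1// -servercmd /unison/unison -force / -batch -path home/vpopmail -path var/qmail/control"
}

unison_ssh_cmd "$BACKUP_HOST"
```

With this approach the unison-run socket service on the backup server would not be needed, since ssh starts the remote unison process on demand.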

Comments and Notes

Please feel free to leave comments about your experience with this procedure.

Craig Smith - 26th October 2006: On our setup of two Fedora Core 5 boxes, whenever I make a change on the primary I check the backup to make sure it has replicated, and so far it has worked like a charm every time. I've not run into anything strange.

Also, I'm not sure how people prefer the logs. I initially had it set to log each run to its own file, but changed that to keep a single log until it reaches 20MB, which is roughly 24 hours' worth of logging. Times are logged, so finding a specific run should be fairly easy. However, it's quite easy for me to change back to logging each run in a separate file. I will go with preference really, so please let me know.

* Note about non-Fedora Core versions.

One of the variables in the script obtains the file size by cutting the relevant field from an ls listing. It turns out that the position of this field is not the same on every distribution as it is on Fedora, so if your script runs into errors, the problem more than likely lies here. To fix it, run the (ls -l $LOG|cut -d ' ' -f 5) portion on the command line and change the -f number until it correctly displays the size field, then change the number in the script to match.

It is the following variable.

size=`ls -l  $LOG|cut -d ' ' -f 5`

I believe for CentOS it is f6, so that variable would be changed to

size=`ls -l  $LOG|cut -d ' ' -f 6`
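An alternative that avoids guessing the field number is to count the bytes directly. This is a sketch, not part of the original script: log_size is a hypothetical helper name, and it also returns 0 when the log does not exist yet (as on the very first run), which the ls-based version does not handle.

```shell
#!/bin/sh
# Print the size of a file in bytes, or 0 if it does not exist.
# wc -c reads from stdin here, so its output contains no file name
# and there is no column position to guess.
log_size() {
    if [ -f "$1" ]; then
        wc -c < "$1" | tr -d ' '
    else
        echo 0
    fi
}

LOG=${LOG:-/unison/unisonlog.full}
size=$(log_size "$LOG")
echo "$LOG is $size bytes"
```

The resulting $size can be used in the existing comparison against 20000000 without further changes.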

Primary Server (Client) setup

  • Primary server is the main server being replicated. This has the unison program and the client side script to replicate data based on the profile.
  • All scripts assume a default path of /unison.
  • The script calls unison to run as follows
/unison/unison -force / -batch qmail
  • -force / is very important, as it specifies this root as the primary and resolves all conflicts in favour of this root. NB IF THIS IS NOT INCLUDED ANY CHANGES MADE ON THE BACKUP SERVER WILL REPLICATE TO PRIMARY.
  • -batch tells unison to run without prompts and qmail refers to the qmail.prf file. (see below)
To configure unison on the client side take the following steps as root/su
mkdir /unison
cd /unison
wget https://svn.cis.upenn.edu/svnroot/unison-contributed-binaries/linux/unison-2.13.16-linux-text.bz2
bzip2 -d unison-2.13.16-linux-text.bz2
mv unison-2.13.16-linux-text unison
chmod 755 unison
./unison                   *This will create the initial /root/.unison database folder
vi qmail-replicatec        *Once in the editor, copy and paste the qmail-replicatec script from below
:wq
chmod 755 qmail-replicatec *Cron cannot run the script unless it is executable
vi /root/.unison/qmail.prf *Once in the editor, copy and paste the qmail.prf details from below
:wq

Client script gets added to cron to run every 10 minutes.

crontab -e
*/10 * * * * /unison/qmail-replicatec

*Below is the full script for the client side, script name qmail-replicatec

#!/bin/sh

#Version 1.0 - Oct 11th 2006
#Script created by Craig Smith - craig@doc-net.com

#This script is the client (Primary  mail server) side script to replicate based on the
#qmail.prf file located under /root/.unison.  The Mysql variables and dump were taken from 
#qmailbackup script written initially by Nate Davis. 

#To add to or change files paths edit /root/.unison/qmail.prf accordingly.

#This script assumes unison and scripts are placed in /unison

#To keep logs from building up excessively, as this script runs fairly regularly,
#the full log is moved to the /unison/oldlogs folder once it exceeds 20MB.  When more
#than 15 old logs exist they are moved to /unison/oldlogs/previous, replacing the
#previous batch.  You can increase this if needed.  See LOG COMMENT below.
#Add email address below for notifications of log rotation.

#starting with mysql so the dumped file can replicate

#Checking for mysql.dump and oldlogs folder

folder1=/unison/mysql.dump
folder2=/unison/oldlogs
lock=/unison/replicate.lock
email="you@example.com"   #Placeholder: replace with your notification address
if [ ! -d $folder1 ]; then
mkdir -p $folder1
fi

if [ ! -d $folder2 ];then
mkdir -p $folder2 
fi

#Refuse to start while a previous run still holds the lock file
if [ -f $lock ]; then
echo "Lock file still in place, investigate."|mail -s "unison script problems please investigate server" $email
exit 1
fi
touch $lock
# MYSQL variables
mysqlfile="/home/vpopmail/etc/vpopmail.mysql";
mysqlhost=`cut -d\| -f1 < $mysqlfile`;
mysqlport=`cut -d\| -f2 < $mysqlfile`;
mysqluser=`cut -d\| -f3 < $mysqlfile`;
mysqlpswd=`cut -d\| -f4 < $mysqlfile`;
mysqldb=`cut -d\| -f5 < $mysqlfile`;
if [ "$mysqlport" = "0" ]; then
mysqlport=""
else
mysqlport="-P$mysqlport"
fi

#echo "Backing up MYSQL Data"
mysqldump -u$mysqluser -p$mysqlpswd -h$mysqlhost $mysqlport $mysqldb > /unison/mysql.dump/vpopmail

#script for log clearup
#Old logs are kept in /unison/oldlogs until more than 15 have accumulated.
#The full log will be replaced every 20MB and moved to /unison/oldlogs.

LOG=/unison/unisonlog.full
LOGCOUNT=`ls $folder2 |wc -l`
size=`ls -l  $LOG|cut -d ' ' -f 5`
LOGSAVE=/unison/oldlogs/unisonlog.`date +%Y%m%d%H%M`

#echo $size 

if [ $size -gt 20000000 ];then
echo "this is bigger than 20MB, moving">>$LOG
mv $LOG $LOGSAVE
echo "" >$LOG
fi

#LOG COMMENT : if you want to increase the saved logs change the 15 below.

if [ $LOGCOUNT -gt 15 ];
then
echo "more than 15 logs exist, moving to folder /unison/oldlogs/previous"
  #keep previous batch in previous
  mkdir -p /unison/oldlogs/previous
  rm -f /unison/oldlogs/previous/*
  mv /unison/oldlogs/unisonlog.* /unison/oldlogs/previous
  echo "`date` : log files have been moved to oldlogs/previous" >>$LOG
  echo "Unison Log files moved, please check folders on server"|mail -s "Unison Logfile Rotation : `date`" $email
fi

echo "" >>$LOG
echo "`date` ***STARTING REPLICATION RUN**** " >>$LOG

 
#Begin unison replication using /root/.unison/qmail.prf file for folder locations
#Log file location and format

/unison/unison -force / -batch qmail >> $LOG  2>&1
echo "Deleting lock file" >> $LOG  2>&1
rm -f $lock
echo "Done `date`"  >> $LOG  2>&1
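One weakness of the lock file in the script above is that it stays behind if the script is interrupted before the final rm, so every later run will refuse to start until someone removes it by hand. The following is a minimal sketch of one way around that, using a lock directory and a trap. LOCKDIR and the two function names are assumptions for illustration and are not part of the original script.

```shell
#!/bin/sh
# Sketch: directory-based lock with automatic cleanup.
# LOCKDIR is a hypothetical path; the script above uses /unison/replicate.lock.
LOCKDIR=${LOCKDIR:-/tmp/replicate.lock.d}

# mkdir either creates the directory or fails, in one atomic step,
# so two overlapping cron runs cannot both acquire the lock.
acquire_lock() {
    mkdir "$1" 2>/dev/null
}

release_lock() {
    rmdir "$1" 2>/dev/null
}

if acquire_lock "$LOCKDIR"; then
    # Remove the lock when the script exits, including after an error.
    trap 'release_lock "$LOCKDIR"' EXIT
    echo "lock acquired, replication would run here"
else
    echo "previous run still holds $LOCKDIR, skipping this run"
fi
```

A lock left behind by a hard kill or power loss would still need manual removal, so the email notification in the original script remains useful.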


*Below are the contents of the qmail.prf file. This file points to the backup server's IP and port and specifies the folders to be replicated. Only what is specified by path is replicated; if you don't include any path, the whole root is replicated. If you want to add more, e.g. squirrelmail prefs, add path = path to folder (note: no trailing /). Add the correct IP address and port number when you paste this into the qmail.prf file, e.g. root = socket://10.0.0.1:1234//

#Root and path setup for qmail backup
root = /
root = socket://xxx.xxx.xxx.xxx:port no.//

path = home/vpopmail
path = var/qmail/control
path = var/qmail/users
path = etc/mail/spamassassin
path = etc/tcprules.d
path = unison/mysql.dump

ignore = Name *.lock
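If you would rather generate the profile than paste and edit it, a small sketch follows. write_profile is a hypothetical helper, the address and port are the example values from the text above, and the output goes to a scratch file rather than /root/.unison/qmail.prf so nothing is overwritten by accident.

```shell
#!/bin/sh
# Write a unison profile for the given backup address and port.
# Arguments: backup IP, socket port, output file.
write_profile() {
    cat > "$3" <<EOF
#Root and path setup for qmail backup
root = /
root = socket://$1:$2//

path = home/vpopmail
path = var/qmail/control
path = var/qmail/users
path = etc/mail/spamassassin
path = etc/tcprules.d
path = unison/mysql.dump

ignore = Name *.lock
EOF
}

# Example values only; point the third argument at
# /root/.unison/qmail.prf once the output looks right.
write_profile 10.0.0.1 1234 /tmp/qmail.prf.example
grep '^root' /tmp/qmail.prf.example
```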

Backup Server (Server) Setup

  • The backup server runs the unison program with the -socket option. This creates a listening socket that the client will try to connect to. There is also a script that deals with qmail cleanup and configuration, as unison does not replicate ownerships.

*To configure Unison on the Server side take the following steps. The same unison application as above is used.

mkdir /unison
cd /unison
wget https://svn.cis.upenn.edu/svnroot/unison-contributed-binaries/linux/unison-2.13.16-linux-text.bz2
bzip2 -d unison-2.13.16-linux-text.bz2
mv unison-2.13.16-linux-text unison
chmod 755 unison
./unison                   *This will create the initial /root/.unison database folder
vi qmail-replicateb        *Once in the editor, copy and paste the script from below
:wq
chmod 755 qmail-replicateb
vi unison-run              *Once in the editor, copy and paste the unison-run script from below
:wq
chmod 755 unison-run
vi qmail-switch            *Once in the editor, copy and paste the qmail-switch script from below.
:wq
chmod 755 qmail-switch
cp unison-run /etc/init.d
ln -s /etc/init.d/unison-run /etc/rc3.d/S50unison-run  *This will configure unison to start on boot up.
  • To manually start or stop the socket do the following
/unison/unison-run start 

or

/unison/unison-run stop
  • The qmail-switch script uses a qmail queue repair tool. This needs to be in place before running the script
cd /unison
wget http://pyropus.ca/software/queue-repair/queue-repair-0.9.0.tar.gz
tar -xzf queue-repair-0.9.0.tar.gz
mv queue-repair-0.9.0 queue-repair
  • Run the qmail-switch script if you need to switch to the backup server

*Below is the script for the server, called qmail-replicateb

#!/bin/sh

#Version 1.0 - Oct 11th 2006
#Script created by Craig Smith - craig@doc-net.com

#This script is the server (backup mail server) side script to monitor 
#changes and import the changes and fix qmail accordingly.
#It also sets the correct ownership on the vpopmail folders as unison doesn't
#replicate ownership.

#This script assumes unison and scripts are placed in /unison

#The mysql commands,qmail-newu  and queue repair were taken from Jake Vickers' qmail 
#restore scripts.
#Put this script in a cron job to finalize changes based on the main server
#Check the /unison/unisonlog.<timestamp> file for each run to see if changes were made.
#Log file will be adjusted to similarly match client side script.

#Currently qmail is left off to prevent unnecessary direct mailings to the
#backup server.  This can be uncommented to leave it running.

#Please add your mysql root password below
mysqlrootpass="mysql password"


#This portion compares the vpopmail files between replication for changes and 
#acts accordingly.  If you want this to run with every cron job, comment out 
#the next 4 script lines as well as the last 3 script lines.

#Checking for mysql.dump and oldlogs folder

folder1=/unison/mysql.dump
folder2=/unison/oldlogs

if [ ! -d $folder1 ]; then
  mkdir -p $folder1
fi
 
if [ ! -d $folder2 ]; then
   mkdir -p $folder2
fi

#script for log clearup
#Only last 10 run log files are kept in /unison. The rest are moved to
#/unison/oldlogs and cleared out each time 10 are exceeded to prevent excessive
#buildup. 

#Log file location and format

LOG=/unison/unisonlog.`date +%Y%m%d%H%M`

LOGCOUNT=`ls /unison/unisonlog.* 2>/dev/null |wc -l`

#mysql file to be imported
FILE=/unison/mysql.dump/vpopmail

#LOG COMMENT : if you want to increase the saved logs change the 10 below.

if [ $LOGCOUNT -gt 10 ]
then
  # more than 10 logs exist, moving to folder /unison/oldlogs
  rm -f /unison/oldlogs/*
  mv /unison/unisonlog.* /unison/oldlogs
  echo "`date` : log files have been moved to oldlogs" >$LOG
else
  echo "`date` : Log count is fine, nothing to do" >>$LOG
fi

mysqladmin -f -uroot -p$mysqlrootpass drop vpopmail >>$LOG 
mysqladmin -uroot -p$mysqlrootpass flush-tables >>$LOG
mysqladmin -uroot -p$mysqlrootpass refresh >>$LOG
mysqladmin -uroot -p$mysqlrootpass reload >>$LOG

mysqladmin create vpopmail -uroot -p$mysqlrootpass >>$LOG

mysql -uroot -p$mysqlrootpass vpopmail < $FILE

#Set the file permission for vpopmail.vchkpw on vpopmail folder
cd /home
chown -R vpopmail:vchkpw vpopmail
#cd /home/vpopmail/
#chown -R vpopmail:vchkpw *

#Run qmail-newu

echo "Running qmail-newu for any other loose ends....">>$LOG
/var/qmail/bin/qmail-newu >>$LOG

#Reload mysql data and restart httpd

echo "Reloading (and refreshing) MySQL and Apache...." >>$LOG
mysqladmin -uroot -p$mysqlrootpass reload >>$LOG
mysqladmin -uroot -p$mysqlrootpass refresh >>$LOG
/sbin/service httpd reload >>$LOG

#Rebuild qmail cdb's to accommodate any replicated changes
echo "Rebuilding QMail's CDB and starting the mail server...." >>$LOG
echo "If qmail is not running the svc errors are normal and can be ignored." >>$LOG
qmailctl stop >>$LOG 2>&1
sleep 3
qmailctl cdb >>$LOG
#Leaving qmail stopped to prevent unnecessary direct mailing etc.
#qmailctl start
echo "`date` The database has been imported into MYSQL. Verify user details." >>$LOG
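The comments in the script mention comparing the vpopmail files between replications, but as posted it drops and re-imports the database on every run. One possible way to skip the import when nothing has changed is to remember a checksum of the last imported dump. This is a sketch only: STATE, dump_changed and mark_imported are hypothetical names, not part of the original script.

```shell
#!/bin/sh
# Sketch: import the dump only when its checksum differs from the last import.
# STATE is a hypothetical file that remembers the checksum of the last import.
FILE=${FILE:-/unison/mysql.dump/vpopmail}
STATE=${STATE:-/unison/mysql.dump/.last-import.cksum}

# Succeeds when the dump exists and differs from the recorded checksum.
dump_changed() {
    [ -f "$1" ] || return 1
    new=$(cksum < "$1")
    old=$(cat "$2" 2>/dev/null)
    [ "$new" != "$old" ]
}

# Record the checksum after a successful import.
mark_imported() {
    cksum < "$1" > "$2"
}

if dump_changed "$FILE" "$STATE"; then
    echo "dump changed: run the drop/create/import steps, then mark_imported"
else
    echo "dump unchanged or missing: nothing to import"
fi
```

Note that the state file sits in a replicated folder in this sketch. Since the primary does not have it, unison would likely delete it on the next run, so a location outside the replicated paths may be a better choice.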


*Below is the script for the unison control file, called unison-run

#!/bin/sh

#Script created by Craig Smith - craig@doc-net.com
#Script for Unison Start up shutdown
#Set the socket number in the /unison/unison -socket # line.


case "$1" in
 start)
   # have a silent kill in case someone tries to start the service when it
   # is already running
   /unison/unison-run stop >/dev/null 2>&1
   echo -n "Starting Unison Socket Server"
   /unison/unison -socket xxxx  &
   echo $! > /var/run/unison.pid
   echo "."
   ;;
 stop)
   echo -n "Stopping Unison Socket Server"
   kill `cat /var/run/unison.pid`
   rm -rf /var/run/unison.pid
   echo "."
   ;;
 help)
   echo "Usage: unison-run {start|stop}"
   ;;
 *)
   echo "Usage: unison-run {start|stop}"
   exit 1
   ;;

 esac
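The control script has no way to ask whether the socket server is actually running. A status check along the following lines could be added as another case branch. It is a sketch that reuses the /var/run/unison.pid file written by the start branch; is_running is a hypothetical helper name.

```shell
#!/bin/sh
# Sketch of a "status" check for the unison socket server.
PIDFILE=${PIDFILE:-/var/run/unison.pid}

# Succeeds when the pid file exists and the process it names is alive.
# kill -0 sends no signal; it only tests whether the process can be signalled.
is_running() {
    [ -f "$1" ] || return 1
    pid=$(cat "$1")
    [ -n "$pid" ] && kill -0 "$pid" 2>/dev/null
}

if is_running "$PIDFILE"; then
    echo "Unison Socket Server is running (pid $(cat "$PIDFILE"))"
else
    echo "Unison Socket Server is not running"
fi
```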

Below is the script to use when you need to switch to your backup server. Change the backup server's IP to match the primary's and then run /unison/qmail-switch. By changing the IP you don't have to worry about DNS updates to the host record. The script will do a final cleanup and then start qmail.

Below is the qmail-switch script

#!/bin/sh 

#Version 1.0 - Oct 11th 2006
#Script created by Craig Smith - craig@doc-net.com

#Please add your mysql root password below
mysqlrootpass="mysql rootpass"
LOG=/unison/qmailswitch.log
FILE=/unison/mysql.dump/vpopmail

mysqladmin -f -uroot -p$mysqlrootpass drop vpopmail >>$LOG
mysqladmin -uroot -p$mysqlrootpass flush-tables >>$LOG
mysqladmin -uroot -p$mysqlrootpass refresh >>$LOG
mysqladmin -uroot -p$mysqlrootpass reload >>$LOG

mysqladmin create vpopmail -uroot -p$mysqlrootpass >>$LOG

mysql -uroot -p$mysqlrootpass vpopmail < $FILE

#Set the file permission for vpopmail.vchkpw on vpopmail folder
cd /home
chown -R vpopmail:vchkpw vpopmail

#Run queue repair

echo "Running Charles Cazabon's queue_repair python utility to fix any loose  ends...." >>$LOG
sleep 3
cd /unison/queue-repair
chmod 755 queue_repair.py
./queue_repair.py -r >>$LOG

echo
echo "Running qmail-newu for any other loose ends....">>$LOG
/var/qmail/bin/qmail-newu >>$LOG

#Reload mysql data and restart httpd

echo "Reloading (and refreshing) MySQL and Apache...." >>$LOG
mysqladmin -uroot -p$mysqlrootpass reload >>$LOG
mysqladmin -uroot -p$mysqlrootpass refresh >>$LOG
/sbin/service httpd reload >>$LOG

#Rebuild qmail cdb's to accommodate any replicated changes
echo "Rebuilding QMail's CDB and starting the mail server...." >>$LOG
echo "If qmail is not running the svc errors are normal and can be ignored." >>$LOG
qmailctl stop >>$LOG 2>&1
sleep 3
qmailctl cdb >>$LOG
qmailctl start
echo "`date` The database has been imported into MYSQL. Verify user details." >>$LOG
echo "QMAIL has been fixed and started. It should now be live and working as primary mail. Please verify." >>$LOG

Base State Check script

This script pings your host to check that it is up, and creates a failure file each time the ping fails. To use it, vi filename (e.g. statecheck) and paste in the contents from below, then chmod 755 filename. At this time it's not really that helpful without a method of notifying you. If you have an email address that you monitor regularly and that is not hosted on your primary server, uncomment all the email options below and you will be notified at the various levels of failure.


#!/bin/sh

#Created by Craig Smith 16/10/2006

#Script file to check for up time on hosts.  Will ping at x min intervals, and email after each consecutive failure.
#To increase failure numbers, add more variables i.e. fourth, fifth etc. Add to cron to specify check rate.
#There are probably other ways of doing this, but for now, this is what came about.

HOST="host to be pinged"   #Placeholder: replace with the name or IP of the primary server
#EMAIL=mail for testing
FOLDER=/unison/uptime
first=/unison/uptime/fail1
second=/unison/uptime/fail2
third=/unison/uptime/fail3
fine=/unison/uptime/fine

if [ ! -d $FOLDER ];then
mkdir /unison/uptime
fi

#If increasing error count, extend these counts.

if [ -f $first ];then
 file=$second
elif [ -f $second ];then
 file=$third
elif [ ! -f $first ];then
 file=$first
fi

rm -f $FOLDER/*
#Test ping's exit status directly; any non-zero status counts as a failure
if ping -c 4 $HOST >/dev/null 2>&1; then
  echo "$HOST is up and visible" > $fine
else
  echo "failure `date`" >$file
fi

#used for testing. Should be replaced with sms based commands.
#if [ -f $first ];then
#echo "$HOST has failed to respond for the 1st time on `date`"|  mail -s "Please perform first time $HOST state check" $EMAIL
#elif [ -f $second ];then
#echo "$HOST has failed to respond for the 2nd time on `date`"|  mail -s "Please perform second time $HOST state check!!" $EMAIL
#elif [ -f $third ];then
#  echo "$HOST has failed to respond for the 3rd time on `date`"|  mail -s "$HOST state!!! OFFLINE 3RD FAILURE" $EMAIL
#elif [ -f $fine ]; then
#echo "$HOST is alive and well"| mail -s "No problems were detected `date`." $EMAIL;
#fi
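The script notes that more failure levels require more variables (fourth, fifth and so on). An alternative, shown here only as a sketch, is to keep a single counter of consecutive failures in one file, which allows any threshold without extra variables. COUNTFILE and record_result are hypothetical names, not part of the script above.

```shell
#!/bin/sh
# Sketch: keep a single counter of consecutive ping failures.
# COUNTFILE is a hypothetical path; the script above uses fail1/fail2/fail3.
COUNTFILE=${COUNTFILE:-/tmp/uptime.failcount}

# record_result STATUS FILE
# STATUS is the exit status of the check (0 means the host answered).
# Updates FILE and prints the number of consecutive failures so far.
record_result() {
    count=$(cat "$2" 2>/dev/null)
    case "$count" in
        ''|*[!0-9]*) count=0 ;;   # missing or corrupt file counts as zero
    esac
    if [ "$1" -eq 0 ]; then
        count=0
    else
        count=$((count + 1))
    fi
    echo "$count" > "$2"
    echo "$count"
}
```

A possible use from cron would be to run ping -c 4 "$HOST" >/dev/null 2>&1 and then failures=$(record_result $? "$COUNTFILE"), sending a notification once $failures reaches whatever threshold you choose.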


User Tips & Tricks