sandip's blog

Migrating Sendmail Mail Server

Below is how I migrated a mail server with minimal downtime, using mailertable on the old server to route mail to the new one for as long as some ISPs still resolve the old IP address.

  • 48 hours prior to migration, set the TTL value for the mail server DNS A record to a short time like 15 minutes.
  • Prepare for the migration by rsyncing the mail spool folder and the users' home mail folders.
    rsync --progress -a -e "ssh -i /root/.ssh/key -p 22" old.mailserver:/var/spool/mail/ /var/spool/mail/
    rsync --progress -a -e "ssh -i /root/.ssh/key -p 22" old.mailserver:/var/www/web1/mail/ /var/www/web1/mail/
    rsync --progress -a -e "ssh -i /root/.ssh/key -p 22" --exclude='*/bak' --exclude='*/web' old.mailserver:/var/www/web1/user/ /var/www/web1/user/
  • At the time of migration, firewall incoming port 25 on the old mail server and update the DNS A record to point to the new server.
  • Run rsync one final time.
  • Set up Sendmail on the old server with mailertable to relay incoming mail over to the new mail server. This is similar to the setup for secondary mail servers.
  • Add "FEATURE(`mailertable', `hash -o /etc/mail/mailertable.db')dnl" to "/etc/mail/sendmail.mc" if it does not already exist.
  • Create "/etc/mail/mailertable" file with contents of the routing table:
    domain.tld esmtp:[xxx.xxx.xxx.xxx]

    The square brackets skip the MX lookup, so an IP address can be used instead of a hostname.
  • Remove the domain name from "/etc/mail/local-host-names" so that mail does not get delivered locally.
  • Edit "/etc/mail/access" to relay mail for the domain.
    TO:domain.tld RELAY
  • Rebuild the access and mailertable databases.
    cd /etc/mail
    makemap hash access.db < access
    makemap hash mailertable.db < mailertable
  • Restart sendmail and open up the firewall.
  • Test by telnetting to port 25 on the old server's IP and sending an email. It should get relayed over to the new server.
  • Use a new subdomain and redirect existing webmail url to the new server.
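
The two map entries from the steps above can also be generated rather than typed by hand. This is only a minimal sketch: the function names are made up for illustration, and 192.0.2.10 in the usage line is a placeholder documentation address, not a real server.

```shell
#!/bin/bash
# Hypothetical helpers; the names are illustrative, not part of Sendmail.

# Print the mailertable line that routes a domain to a fixed IP.
# The square brackets tell Sendmail to skip the MX lookup.
mailertable_entry() {
    local domain="$1" ip="$2"
    printf '%s\tesmtp:[%s]\n' "$domain" "$ip"
}

# Print the access line that allows relaying for the domain.
access_entry() {
    local domain="$1"
    printf 'TO:%s\tRELAY\n' "$domain"
}
```

For example, "mailertable_entry domain.tld 192.0.2.10 >> /etc/mail/mailertable" followed by "access_entry domain.tld >> /etc/mail/access", then rebuild both databases with makemap as shown above.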

Check glue record for domain

If you've just changed the nameservers for a domain, you can verify that the change has propagated to the TLD servers.

First look up the nameservers for the corresponding TLD. So for .com domains:

dig ns com

The output is as below:

;; ANSWER SECTION:
com.                    172800  IN      NS      h.gtld-servers.net.
com.                    172800  IN      NS      k.gtld-servers.net.
com.                    172800  IN      NS      e.gtld-servers.net.
com.                    172800  IN      NS      d.gtld-servers.net.
com.                    172800  IN      NS      j.gtld-servers.net.
com.                    172800  IN      NS      i.gtld-servers.net.
com.                    172800  IN      NS      c.gtld-servers.net.
com.                    172800  IN      NS      b.gtld-servers.net.
com.                    172800  IN      NS      m.gtld-servers.net.
com.                    172800  IN      NS      l.gtld-servers.net.
com.                    172800  IN      NS      g.gtld-servers.net.
com.                    172800  IN      NS      f.gtld-servers.net.
com.                    172800  IN      NS      a.gtld-servers.net.

Now query one of those TLD servers for the domain:

dig ns edices.com @g.gtld-servers.net

The ADDITIONAL SECTION of the result, which lists the nameserver IP addresses, shows the glue records.

;; AUTHORITY SECTION:
edices.com.             172800  IN      NS      ns1.edices.com.
edices.com.             172800  IN      NS      ns2.edices.com.
edices.com.             172800  IN      NS      ns3.edices.com.

;; ADDITIONAL SECTION:
ns1.edices.com.         172800  IN      A       207.44.207.121
ns2.edices.com.         172800  IN      A       207.44.206.16
ns3.edices.com.         172800  IN      A       67.228.161.76
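
If you only want the glue records, the additional section can be picked out of the dig output with a small filter. This is a rough sketch; the function name glue_records is made up here, and it assumes dig's usual layout with ";;" section headers and blank lines between sections.

```shell
#!/bin/bash
# Hypothetical helper: read dig output on stdin and print
# "name address" for each A/AAAA record in the ADDITIONAL SECTION.
glue_records() {
    awk '
        /^;; ADDITIONAL SECTION:/ { in_additional = 1; next }
        /^;;/ || NF == 0          { in_additional = 0 }
        in_additional && ($4 == "A" || $4 == "AAAA") { print $1, $5 }
    '
}
```

Usage: dig ns edices.com @g.gtld-servers.net | glue_records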

Speed up SSH

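Try setting up the ssh client with compression and the faster arcfour/blowfish ciphers. Also avoid IPv6 lookups and reuse connections through a control socket. Note that OpenSSH 7.6 and later no longer support the arcfour and blowfish ciphers, so leave out the Ciphers line on newer clients.

Add the following to ~/.ssh/config: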

Host *
Ciphers arcfour,blowfish-cbc
Compression yes
AddressFamily inet
ControlMaster auto
ControlPath ~/.ssh/socket-%r@%h:%p
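
Before putting a Ciphers line in the config, it may be worth checking which of the ciphers your client actually supports. The sketch below assumes an OpenSSH client recent enough to have "ssh -Q cipher"; the function name supported_ciphers is made up for illustration.

```shell
#!/bin/bash
# Hypothetical helper: given a comma-separated wish list as $1 and the
# client's cipher list on stdin (one per line), print the wanted ciphers
# that are available, comma-separated and in the order requested.
supported_ciphers() {
    local wanted="$1" available result="" cipher
    available=$(cat)
    local IFS=,
    for cipher in $wanted; do
        # -x matches the whole line, -F treats the name as a fixed string
        if printf '%s\n' "$available" | grep -qxF -- "$cipher"; then
            result="${result:+$result,}$cipher"
        fi
    done
    printf '%s\n' "$result"
}
```

Usage: ssh -Q cipher | supported_ciphers arcfour,blowfish-cbc,aes128-ctr

An empty result means none of the requested ciphers are available, in which case the Ciphers line is best left out.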

Troubleshooting device or resource busy

In order to extend an LVM partition, I had to unmount the mounted volume.

When I tried to umount the volume, it complained that the device was busy.

When I tried to find the process using the device with `fuser -m /dev/vg0/lv0`, it returned nothing, so I did a lazy umount with:

umount -l /dev/vg0/lv0

However, after I extended the partition with lvextend, e2fsck complained that the device was still busy and failed to check the volume.

I then realized that this was most probably caused by NFS mounts. Once I stopped the NFS service, I was able to check the volume successfully.
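
Since fuser does not show the kernel NFS server as a user of the device, one quick check before unmounting is whether anything at or below the mount point is listed in /etc/exports. This is only a sketch under the assumption that the volume was being exported over NFS by the same host; the function name and the /data mount point in the usage line are made up.

```shell
#!/bin/bash
# Hypothetical helper: print every exported path that is the given
# mount point or lies below it. $2 optionally names the exports file.
exports_under() {
    local mountpoint="${1%/}" exports_file="${2:-/etc/exports}"
    awk -v mp="$mountpoint" '
        /^[[:space:]]*#/ || NF == 0 { next }   # skip comments and blanks
        $1 == mp || index($1, mp "/") == 1 { print $1 }
    ' "$exports_file"
}
```

Usage: exports_under /data

If this prints anything, stopping the NFS service (or unexporting those paths) before the umount is worth trying.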

vzdump on CentOS

Current versions of vzdump depend on cstream and perl-LockFile-Simple, both available via rpmforge. Below is how I got it to install and run on CentOS 5.5 (x86_64).

wget http://packages.sw.be/rpmforge-release/rpmforge-release-0.5.1-1.el5.rf.x86_64.rpm
rpm -ivh rpmforge-release-0.5.1-1.el5.rf.x86_64.rpm
yum --enablerepo=rpmforge install cstream perl-LockFile-Simple
rpm -ivh http://download.openvz.org/contrib/utils/vzdump/vzdump-1.2-4.noarch.rpm

It's necessary to export the location of the PVE libraries that vzdump requires. This can be added to ".bash_profile":

export PERL5LIB=/usr/share/perl5/

Run process with least cpu and IO priority

Below is the command to run a process with the least CPU and IO priority.

nice -n 19 ionice -c 3 <command>

You could also include the same at the beginning of a script:

#!/bin/bash
# Make process nice
renice +19 -p $$ >/dev/null 2>&1
ionice -c3 -p $$ >/dev/null 2>&1
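
The one-liner can also be wrapped in a small shell function so it is easier to reuse. This is a sketch rather than a standard tool: the name lowprio is made up, and the fallback to plain nice when ionice is missing or refuses the idle class is my own addition.

```shell
#!/bin/bash
# Hypothetical wrapper: run a command with the lowest CPU priority and,
# where ionice is usable, the idle IO scheduling class as well.
lowprio() {
    if command -v ionice >/dev/null 2>&1 && ionice -c 3 true >/dev/null 2>&1; then
        nice -n 19 ionice -c 3 "$@"
    else
        # ionice missing or not permitted: fall back to CPU niceness only
        nice -n 19 "$@"
    fi
}
```

Usage: lowprio rsync -a /src/ /dest/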

easy php-fpm install via yum

On CentOS, php-fpm can be easily installed via the CentALT yum repository. This requires the EPEL repository too, and yum will pull down any dependencies as needed.

  • Install EPEL release:
    rpm -Uvh http://download.fedora.redhat.com/pub/epel/5/x86_64/epel-release-5-3.noarch.rpm
  • Install CentALT release:
    rpm -Uvh http://centos.alt.ru/repository/centos/5/x86_64/centalt-release-5-3.noarch.rpm
  • Install via yum:
    yum --enablerepo=CentALT --enablerepo=epel install php-fpm
  • Look through and edit /etc/php-fpm.conf. The config options are well commented, and are also documented at php-fpm.org.
  • The default settings should work quite well.
  • Bring up the service via:
    /etc/init.d/php-fpm start

expect script for ssh password prompt

Below is a sample expect script to handle the ssh password prompt, should you be unable to get ssh keys working between hosts:

#!/usr/bin/expect -f

set host XXX
set user XXX
set password XXX
set remote_path XXX
set local_path XXX

# disables the timeout, so script waits as long as it takes for the transfer
set timeout -1

# call rsync
spawn rsync -av -e ssh $user@$host:$remote_path $local_path

# raise the match buffer size so that earlier bytes are not forgotten if the output is too large
match_max 100000

# wait for the password prompt; the pattern matches both password: and Password:
expect  "*?assword:" { send "$password\r"}

# send a newline to make sure we get back to the command line
send -- "\r"

# wait for the end-of-file in the output
expect eof

Show swap label

blkid can be used to display the swap label:

# blkid /dev/md2
/dev/md2: TYPE="swap" LABEL="SWAP-md2"
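
If only the label itself is needed, for instance in a script, it can be cut out of that output line. A small sketch, with a made-up function name:

```shell
#!/bin/bash
# Hypothetical helper: read blkid output on stdin and print the value
# of the LABEL tag. The leading whitespace in the pattern keeps tags
# such as PARTLABEL from matching.
label_from_blkid() {
    sed -n 's/.*[[:space:]]LABEL="\([^"]*\)".*/\1/p'
}
```

Usage: blkid /dev/md2 | label_from_blkid

blkid can also print just the value directly with "blkid -s LABEL -o value /dev/md2".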

smartctl notes

Below is a list of smartctl commands I frequently use to quickly verify disk health and status, especially when smartd is logging errors to the messages log file.

  • Print all SMART (Self-Monitoring, Analysis and Reporting Technology) information for drive /dev/sda (Primary Master).

    smartctl -a /dev/sda

  • Enable SMART on device.

    smartctl --smart=on /dev/sda

  • Get info about the device:

    smartctl -i /dev/sda

  • Show the capabilities of drive. Also provides status when tests are being carried out.

    smartctl -c /dev/sda

  • Basic health status:

    smartctl -H /dev/sda

  • Display attributes. The attributes to watch for a failing disk are Reallocated_Sector_Ct, Reallocated_Event_Count, Current_Pending_Sector and Offline_Uncorrectable. Their RAW_VALUE should normally be "0".

    smartctl -A /dev/sda

  • Immediate offline test, which updates the attribute values. Good to run after a badblocks/fsck check, before looking at the attribute values.

    smartctl -t offline /dev/sda

  • Run a thorough long test if you see suspect attributes with -A option as mentioned above.

    smartctl -t long /dev/sda

  • Examine self-test log. Shows if tests failed or passed.

    smartctl -l selftest /dev/sda

  • Display most recent error log.

    smartctl -l error /dev/sda

There are more examples in man smartctl.
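
The attribute check described above can be scripted as a filter over the -A output. This is a sketch only: the function name is made up, and it assumes the usual attribute table layout where the name is the second column and RAW_VALUE is the tenth.

```shell
#!/bin/bash
# Hypothetical helper: read "smartctl -A" output on stdin and print the
# name and raw value of any watched attribute whose RAW_VALUE is not 0.
suspect_attributes() {
    awk '
        $2 == "Reallocated_Sector_Ct" ||
        $2 == "Reallocated_Event_Count" ||
        $2 == "Current_Pending_Sector" ||
        $2 == "Offline_Uncorrectable" {
            if ($10 + 0 != 0) print $2, $10
        }
    '
}
```

Usage: smartctl -A /dev/sda | suspect_attributes

No output means all four attributes have a raw value of 0.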
