Grabber Softwares : Enhance Your Computing

Backups and Disaster Recovery

Take regular backups of your Linux box with these simple commands

Computers have steadily become more compact, more powerful and more available. A computer system makes use of rotating mechanical devices such as hard disk drives, floppy disk drives and cooling fans. Disk drives are used for long-term storage, but unfortunately, being mechanical in nature, they have a lower mean time between failures (MTBF) than static storage components such as RAM and NVRAM. Added to this, dear Murphy provides an accurate sense of timing for such failures: a blue-chip company invites you for a demo of your product, you land up ahead of time, unpack your laptop, and boom! The laptop reports a hard disk drive error. Needless to say, redundancy has become the name of the game and backup devices are big business.

If you think you use Linux and don’t need any of this, you are wrong. There’s precious little Linux can do to prevent your hard disk from failing. If the disk goes down, Linux goes down with it, and depending upon the situation, Linux would perhaps care to tell you that something is drastically wrong. So with the need for backups firmly established, let’s look at what Linux provides in terms of support for backing up and recovering data from backups. Note that any situation that implies loss of data is termed a "disaster".

backup.weekly
#!/bin/sh
# backup to remote tape drive on dumpyard, compressed, all files under
# /export/home
DEVICE=backups@dumpyard:/dev/st0
FILES="*"
#
cd /export/home
echo "Insert Cartridge labelled WEEKLY"
read x
tar -zcvf $DEVICE $FILES >/tmp/backup.log.`date '+%b%d%Y'`

Linux, like every other flavor of Unix, provides the basic data archiving utilities like tar, cpio, and dd, used for backing up data on tapes. Linux also provides an implementation of software RAID that provides up to level 5 support. Armed with this information, let’s see how we can plan for backups and disaster recovery.

First, what is your environment? Is the MTTR (Mean Time To Repair) so critical that it must be kept as low as possible? If so, you would choose hardware RAID with at least level 1 (mirroring), possibly combined with level 0 (striping, which by itself adds no redundancy). Hardware RAIDs come with hot disk swap capability; such a solution is expensive but gives you practically zero down time. An alternative to such a system is a software RAID implementation that functionally eliminates service interruptions due to disk failures by making use of the mirrored disk and keeping the services going. However, the absence of hot swap capability means that the system must be shut down, the disk replaced and synchronized with the mirror disk, and the services started up again. This down time could be scheduled during the lean service hours. The least expensive solution, with the longest MTTR, is using tapes to restore the data onto a new disk.

It’s important to draw a clear line between two kinds of data on the system: system data (data used for configuring system services like printers, network, kernel, etc) and user data (data generated by users). Typically, system data, apart from system logs, does not change very often, whereas user data changes daily. So one would handle this by backing up the user area frequently, possibly every day, and the system data less frequently. Backing up itself is done in two possible ways: complete or full backups, and incremental backups. As the name suggests, incremental backups record only the changes that have occurred since an earlier backup, either the last full backup or the last incremental one.

A short note about the backup devices. Linux, by default, has support for SCSI tapes. The device /dev/st0 indicates the first such tape drive attached to your system. If you use mirroring across disks, you’ll see devices like /dev/md0, /dev/md1, etc. These are logical devices, each representing a file system that is backed by two mirrored physical disk partitions.

Now, let’s look at what utilities to use when you do not use mirroring. I’d advocate the use of cpio to copy files from one disk to another. One single command does all, elegantly. Here’s a typical command that you’d use to copy the directory structure under /export/users to /spare/backup/users

backup.daily
#!/bin/sh
# Dump thingie
# Do incremental backups either to tape that is already in the remote
# tape drive or onto a spare disk that is mounted locally; it is assumed that
# these archives will be written to tape later.
#
# Exactly one argument is expected: "disk" or "tape".
if [ $# -ne 1 ]; then
    echo "Usage: $0 disk|tape"
    exit 1
fi
#
BASE=/usr/local/bin
now=`date '+%b %d %T %Y'`
# Date stamp used in file names, with any blanks removed
nowfile=`date '+%b%d%Y' | tr -d '\040*'`
# Time of the previous dump, as saved by the last run of this script
then=`cat $BASE/date.last.dump`
if [ "$1" = "disk" ]; then
    DEVICE=/mnt/backups/$nowfile.tar.gz
    echo "Backing up to $DEVICE; Check if the backup partition is mounted in /mnt"
    echo "Enter to continue; ^C to abort"
    # The mount check for the partition can be automated too; for the present,
    # do it one at a time, abort and restart.
    read x
else
    DEVICE=root@dumpyard:/dev/st0
    echo "Backing up to $DEVICE; Assuming tape is loaded"
    read x
fi
#
HOME=/export/home
#
cd $HOME
#
FILES="*"
# Update FILES as required
#
# Archive only files changed since the previous dump (-N) and label the
# volume with the period it covers (-V).
/usr/sbin/tar -z -c -v \
    -f $DEVICE \
    -N "$then" \
    -V "Dump from $then to $now" \
    $FILES >/tmp/backup.log.$nowfile
# Remember when this dump was taken, for the next run
echo "$now" > $BASE/date.last.dump

# cd /export/users; find . -depth -print | cpio -pldmuv /spare/backup/users/

You could similarly create a cpio archive by specifying a tape device instead of the destination directory.

A variation of the same using the (GNU) tar is

# cd /export/users; tar zcvf /dev/st0 * >/tmp/backup.log.DDMMYY

dd is typically used in conjunction with cpio or tar.

# cd /export/users; tar zcvf - * | dd of=/dev/st0

is equivalent to the earlier command. To back up to a remote system, use

# cd /export/users; tar zcvf backups@remote.host.name:/dev/st0 *

(Backup onto tape device attached to host remote.host.name on which userid "backups" has permissions to allow remote command execution from userid root on your machine.)

How you back up is also dependent on how your file systems are laid out. The most trivial case is where the root file system has all the system areas (/var, /usr, /sbin, /etc, /dev, /lib, /boot and /bin), and the other file system is the user file system (typically named /export, /home or /users). In such a situation, contents of the user file systems will be backed up more frequently when compared to the root file system. When you back up the root file system, you may have non-critical data like the cache area of the web/ftp proxy-cache, which you may want to exclude from the backup.

I use tar effectively and would advise you to do so too. tar (expands to tAPE arCHIVER) is used for archiving on tape. Here’s an example of creating a compressed tar archive of /usr/doc/HOWTO excluding the "mini" directory:

# cd /usr/doc/HOWTO; tar --exclude mini -zcvf /tmp/x.tar.gz .

Note that the source file list is specified as the current directory ".". A common mistake users make here is to use "*", which contradicts the exclusion and overrides it. The archive can be written to tape by replacing /tmp/x.tar.gz with the local or remote device name. Note that the archive will not contain the leading path names since we have descended to that directory. To record the full path names instead, replace "mini" and "." with the pathnames "/usr/doc/HOWTO/mini" and "/usr/doc/HOWTO", respectively. Even then, tar will strip the leading "/" from the pathnames in the archive; use the -P option of tar to retain the leading "/".
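The exclusion is easy to try out on a scratch directory before pointing tar at a tape. The following is a minimal sketch, assuming GNU tar; the function names make_sample_tree and archive_without and the sample file names are made up for this illustration and are not part of the original scripts.

```shell
#!/bin/sh
# Sketch: archive a directory while leaving out one subdirectory.
# Assumes GNU tar; function names and file names are illustrative only.

# make_sample_tree DIR
# Build a small sample tree: DIR/howto.txt and DIR/mini/small.txt
make_sample_tree() {
    mkdir -p "$1/mini"
    echo "big howto" > "$1/howto.txt"
    echo "mini howto" > "$1/mini/small.txt"
}

# archive_without SRC_DIR EXCLUDED_NAME ARCHIVE
# Descend into SRC_DIR and archive "." so that the exclusion applies
# to everything found below it. ARCHIVE should be an absolute path
# outside SRC_DIR.
archive_without() {
    ( cd "$1" && tar --exclude="$2" -zcf "$3" . )
}
```

Listing the result with "tar -ztf" shows ./howto.txt but nothing under ./mini.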

Here’s how to back up files that have changed after a particular date, using the -N option of tar. You would use this option to back up config file changes in the system areas or to perform a daily backup of user areas. The example shows how to back up changes on the user file system since Valentine’s Day.

# cd /export/users/; tar -zcv -N "Feb 14 17:30:32 1999" \
? -V "Incremental backup from Feb 14 17:30:32 1999 to Mar 1 12:00:00 1999" \
? -f /dev/st0 *

(The "\" at the end of a line, followed by <cr>, continues the command on the next line. The "?" is the secondary prompt that the shell prints to signal a continuation; it depends on your shell, and bash shows ">" by default. You could also type the whole command on one line without these continuations.)

The -V option allows you to tag a volume label to the tape archive.
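A date-based incremental backup can be rehearsed on a scratch directory as well. This is only a sketch, assuming GNU tar; the function name incremental_since is made up for the illustration. It uses --newer-mtime rather than -N because -N also looks at the status-change time, which is "now" for any freshly copied or touched file and would make a rehearsal pick up everything.

```shell
#!/bin/sh
# Sketch: archive only files modified after a given date.
# Assumes GNU tar. --newer-mtime compares modification times only,
# whereas -N (--newer) also considers the status-change time.

# incremental_since SRC_DIR DATE ARCHIVE
# DATE is any date string GNU tar understands, e.g. 2021-01-01.
# ARCHIVE should be an absolute path outside SRC_DIR.
incremental_since() {
    ( cd "$1" && tar -zcf "$3" --newer-mtime="$2" \
          -V "Changes since $2" . ) 2>/dev/null
}
```

For example, incremental_since /export/users "1999-02-14" /mnt/backups/inc.tar.gz would collect the files modified after that date (both paths are placeholders).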

Having seen the various options, let’s now construct a backup strategy. We’ll assume that you rarely tinker with your system areas. As soon as you have installed the system and configured and customized it to your requirements, it’s time to take an incremental backup of the system area. You may or may not want to take a complete backup of the system area just after installation, since you have it on your distribution anyway. You could repeat such incremental backups either after each change you make to the system config files (recommended), or periodically, say every fortnight. You could make the increment either from the date of install or from the last incremental backup date. The former does not record a history of changes, but it is faster when it comes to restoring the state after a crash: one extract brings the system back to the state before the crash.

As for the user file system, it’s only sensible that incremental backups are done daily, and a full backup every week. The high frequency is again meant to let you restore the latest state of the file system as soon as possible. So you would do an incremental backup of the system areas as soon as your system is ready after installation, and again each time a change is made. Your user areas would be backed up fully at the start, incrementally every day, and fully again every weekend. To guard against tape errors, it’s good practice to retain the previous week’s set of backups: one full backup and that week’s incremental backups. If the current set fails, you would be set back in time when you extract the file system, but not by too much.
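The weekly rhythm for the user areas can be reduced to a small decision that a cron-driven wrapper might make. The sketch below is an assumption-laden illustration: the function name backup_kind is made up, and picking Sunday for the full backup is just one way to read "every weekend".

```shell
#!/bin/sh
# Sketch: pick the backup type from the day of the week.
# Day numbers follow `date +%u` (1 = Monday ... 7 = Sunday); choosing
# Sunday for the full backup is an assumption made for this example.

# backup_kind DAY_NUMBER  ->  prints "full" or "incremental"
backup_kind() {
    if [ "$1" -eq 7 ]; then
        echo full
    else
        echo incremental
    fi
}

# Typical call: backup_kind "$(date +%u)"
```

A wrapper could then run the weekly or the daily script depending on the word printed.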

In case of a crash, you would first install the bare system after perhaps replacing the disk and then begin to extract the file system states, in the same sequence as they were backed up so that the latest backup is extracted last. In case of the system backup, you’d need just one extract, if you did an incremental backup each time from the date of install. In case of the user file systems, you would extract the full backup first and then the incremental backups for that week.
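The restore order for the user file system can be sketched as a loop. This assumes GNU tar and a purely hypothetical naming scheme, full-YYYYMMDD.tar.gz and incr-YYYYMMDD.tar.gz, with only one week's set in the directory, so that sorting the names also sorts the archives by date. The function name restore_all is likewise made up.

```shell
#!/bin/sh
# Sketch: replay a full backup and then its incrementals, oldest first.
# Assumes archives named full-YYYYMMDD.tar.gz and incr-YYYYMMDD.tar.gz
# (a hypothetical scheme) and a single week's set in ARCHIVE_DIR.

# restore_all ARCHIVE_DIR TARGET_DIR
restore_all() {
    # The shell expands each pattern in sorted order, so the full
    # backup comes first and the newest incremental comes last.
    for archive in "$1"/full-*.tar.gz "$1"/incr-*.tar.gz; do
        [ -f "$archive" ] || continue   # skip patterns that matched nothing
        tar -zxf "$archive" -C "$2" || return 1
    done
}
```

Because later archives overwrite files from earlier ones, the target ends up with the state recorded by the last incremental backup.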

Extracting all files from an archive is simple. Assuming you have archived the information with leading pathnames (using -P), here’s what to do:

# tar zxvf /dev/st0

If only a particular directory is to be extracted, specify that particular directory as the last argument to tar.
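As a small illustration of a partial restore, here is a sketch, assuming GNU tar, that extracts one directory into a chosen target. The function name extract_member is made up. The member name has to be spelled exactly as "tar -t" lists it, for example ./docs if the archive was created from ".".

```shell
#!/bin/sh
# Sketch: extract a single directory (or file) from a compressed archive.
# Assumes GNU tar; the function name is illustrative only.

# extract_member ARCHIVE MEMBER TARGET_DIR
# MEMBER must match the name shown by "tar -ztf ARCHIVE".
extract_member() {
    tar -zxf "$1" -C "$3" "$2"
}
```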

There are other options in tar: -M allows an archive to be split across tapes, and -L lets you specify the length of the tape after which tar asks for a fresh tape.

Typically, data cartridges used are DDS-I, 90 meter tapes that have a capacity of 2 GB. Depending on the tape drive, compression can be enabled by using the "mt" command in Linux. Try "mt -f /dev/st0 densities" to get a listing of the densities of various drives and formats. DDS-II tapes give higher densities and hence more capacity.

There’s another driver in Linux that uses the floppy drive interrupt (IRQ 6). That device is /dev/ftape. It’s not commonly used as of now.

A commercial package available to do backups with a GUI front-end is BRU (Backup and Restore Utility) (www.estnic.com/). One interesting option that BRU provides is to estimate the backup size, number of tapes required, etc. An option for data consistency check is also available. The big difference with BRU is its capability to archive raw disk partitions. For example, one could use "bru -options -r /dev/sda1" to back up the root partition of your SCSI disk. BRU is an interesting utility to use given that it bunches together the capabilities of tar, dd and dump (a command available to archive file systems on Solaris(TM)).

Before I sign off, a few tips for those of you who handle sensitive data and require low downtime. Here’s my experience: downtimes are largely due to disk crashes or power supply failures. Disk crashes occur when you don’t want them to: December 31 is my favorite; it has happened three times so far! Build backing up into your routine; automate it when possible. Do not automate full backups if you can avoid it; have someone around while they are running. After backing up, make sure that the data has been backed up properly by actually extracting a file from that backup. This is still not foolproof. You might want to add a file to the archive and then extract that file. In the process, the entire tape is scanned, since the added file is at the end of the archive. Finally, as a rule of thumb, I discard tapes that I use for full backups after about 60 writes; at a frequency of once a week, they last me for a good year.
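The "extract a file and check it" tip can be scripted. The following is a minimal sketch, assuming GNU tar and a gzip-compressed archive; the function name verify_member is made up for the illustration. It writes the archived copy to standard output and compares it byte for byte with the file on disk, so nothing is overwritten.

```shell
#!/bin/sh
# Sketch: check that one archived file matches the copy on disk.
# Assumes GNU tar and a gzip-compressed archive; the function name
# is illustrative only.

# verify_member ARCHIVE MEMBER ORIGINAL_FILE
# Returns 0 when the archived copy is identical to ORIGINAL_FILE and
# non-zero when it differs or cannot be read from the archive.
verify_member() {
    tar -zxOf "$1" "$2" 2>/dev/null | cmp -s - "$3"
}
```

The check only makes sense for files that have not changed since the backup was taken, so pick a file that is known to be stable.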

What I have not mentioned here is how to manage the tapes themselves and details of the tape drives. The commands shown are not optimal; they are kept simple and non-repulsive by cutting down geekish appearance or usage. The boxes include sample programs that do weekly and daily backups, adapted from the GNU tar information files.

Happy archiving.

Linux user groups in India
There is an all India user group Linux–India www.linux-india.org and chapters in various cities.
To subscribe to the Linux-India mailing list send mail to [email protected] with subscribe Linux-India in the body of the message.
Besides, there are active user groups in many cities, including Chennai, Bangalore, Mumbai, Ahmedabad, Delhi, and Trivandrum.

Linux software vendors
This is an incomplete list of vendors stocking Linux software. Today, you can find Linux software on the shelves of most bookstalls, including Computer Bookshop in Mumbai, Gangarams in Bangalore, and Ebony in Delhi.

Bangalore
G T Enterprises
913, 14th Main, 4th Cross, Maruthi Circle, Hanumantha Nagar, Bangalore. Tel: 80-6606093,6671407
E-mail:[email protected]

http://www.gtcdrom.com./

Prices at www.gtcdrom.com/plist.txt
Genisys
# 2, MIG, 2nd Stage, KHB Colony, Basaveswaranagar,Bangalore 560079
Tel: 80-3481443,3481315 E-mail: [email protected]
Delhi

G Bhattacherjee
Tel: 11-6855711 E-mail: [email protected]
Prabhakar Singh
Tel:11-91297478