UNIX System Administration Handbook - Evi Nemeth [106]
We begin this chapter with some general backup philosophy, followed by a discussion of the most commonly used backup devices and media (their strengths, weaknesses, and costs). Next, we discuss the standard UNIX backup and archiving commands and give some suggestions as to which commands are best for which situations. We then talk about how to design a backup scheme and review the mechanics of the UNIX commands dump and restore. Finally, we take a look at Amanda, a free network backup package, and offer some comments about its commercial alternatives.
10.1 MOTHERHOOD AND APPLE PIE
Before we get into the meat and potatoes of backups, we want to pass on some general hints that we have learned over time (usually, the hard way). None of these suggestions is an absolute rule, but you will find that the more of them you follow, the smoother your dump process will be.
Perform all dumps from one machine
rdump allows you to perform dumps over the network. Although there is some performance penalty for doing this, the ease of administration makes it worthwhile. We have found that the best method is to run a script from a central location that executes rdump (by way of rsh or ssh) on each machine that needs to be dumped, or to use a software package (commercial or free) that automates this process. All dumps should go to the same backup device (nonrewinding, of course, so that each dump is written after the previous one instead of overwriting it).
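As a minimal sketch of the central-script approach described above, the Python below builds one ssh command per filesystem that runs rdump on the client and sends the output to a tape drive on the dump server. The host names, the device path /dev/nrst0, the filesystem lists, and the option string 0uf are assumptions for illustration only; rdump syntax and device naming vary among systems, so check the local man page before adapting it.

```python
import subprocess

TAPE_HOST = "tapehost"        # hypothetical dump server
TAPE_DEVICE = "/dev/nrst0"    # hypothetical nonrewinding device; names vary by OS

# Hypothetical clients and the filesystems to dump on each one.
CLIENTS = {
    "alpha": ["/", "/usr"],
    "beta": ["/home"],
}

def build_dump_commands(clients, tape_host, tape_device, level=0):
    """Return one ssh argument list per (host, filesystem) pair."""
    commands = []
    for host in sorted(clients):
        for fs in clients[host]:
            # Assumed traditional syntax: level digit, u (record the dump),
            # f (output device), then host:device and the filesystem.
            remote = f"rdump {level}uf {tape_host}:{tape_device} {fs}"
            commands.append(["ssh", host, remote])
    return commands

def run_dumps(commands, dry_run=True):
    """Run the commands one after another; return those that failed."""
    failures = []
    for cmd in commands:
        if dry_run:
            print(" ".join(cmd))   # show what would be run, change nothing
            continue
        result = subprocess.run(cmd)
        if result.returncode != 0:
            failures.append(cmd)
    return failures

if __name__ == "__main__":
    run_dumps(build_dump_commands(CLIENTS, TAPE_HOST, TAPE_DEVICE))
```

The dumps run one at a time because they all share a single tape drive. With dry_run left at its default, the script only prints the commands, which is a reasonable way to review them before committing a tape.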
If your network is too large to be backed up by a single tape drive, you should still try to keep your backup system as centralized as possible. Centralization makes administration easier and allows you to verify that all machines were dumped correctly. Depending on the backup media you are using, you can often put more than one tape drive on a server without affecting performance. With today’s high-performance (6 MB/s and up) tape drives, however, it may be impractical to do this.
Dumps created with rdump can be restored only on machines that have the same byte order as the dump host (and in most cases, only on machines running the same OS). You can sometimes use dd to take care of byte-swapping problems, but this simple fix won’t help resolve differences among incompatible versions of rdump.
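To make the byte-swapping idea concrete: dd's conv=swab conversion exchanges each pair of adjacent bytes as the data passes through. The Python function below only illustrates that transformation (the function name is ours, not part of any tool); whether such a swap is enough to make a particular dump readable depends on the dump format, as noted above.

```python
def swab_pairs(data: bytes) -> bytes:
    """Swap every pair of adjacent bytes, as dd conv=swab does.

    If the input has an odd length, the final byte is left in place.
    """
    even_len = len(data) - (len(data) % 2)
    out = bytearray(data)
    out[0:even_len:2] = data[1:even_len:2]   # odd-indexed bytes move forward
    out[1:even_len:2] = data[0:even_len:2]   # even-indexed bytes move back
    return bytes(out)

# A 16-bit value written as 0x12 0x34 on one machine reads as 0x34 0x12
# after the swap.
print(swab_pairs(b"\x12\x34\x56\x78").hex())   # prints 34127856
```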
Label your tapes
It is essential that you label each dump tape clearly and completely. An unlabeled tape is a scratch tape.
The tapes themselves should be labeled to uniquely identify their contents. Detailed information such as lists of filesystems and dump dates can be written on the cases.
You must be able to restore the root and /usr filesystems without looking at dump scripts. Label the dump tapes for these filesystems with their format, the exact syntax of the dump command used to create them, and any other information you would need to restore from them without referring to on-line documentation.
Free and commercial labeling programs abound. Save yourself a major headache and invest in one. If you purchase labels for your laser printer, the label vendor can usually provide (Windows) software that generates labels. For the economy-minded, a quick troff program will do the trick.
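For sites that would rather script their labels, here is a minimal sketch in Python that formats the information recommended above (tape identifier, format, the exact dump command, and the filesystems in tape order) as plain text suitable for printing. The field names, layout, and sample values are all hypothetical; adjust them to whatever your label stock and dump scripts actually use.

```python
from datetime import date

def format_tape_label(tape_id, host, filesystems, dump_command, dump_date, fmt="dump"):
    """Return a plain-text label describing one dump tape."""
    lines = [
        f"TAPE:    {tape_id}",
        f"HOST:    {host}",
        f"DATE:    {dump_date.isoformat()}",
        f"FORMAT:  {fmt}",
        f"COMMAND: {dump_command}",
        "FILESYSTEMS (in tape order):",
    ]
    # Number the filesystems so the operator knows how many tape files
    # to skip before restoring a given one.
    lines += [f"  {n}. {fs}" for n, fs in enumerate(filesystems, start=1)]
    return "\n".join(lines)

if __name__ == "__main__":
    # Sample values are invented for illustration.
    print(format_tape_label(
        "A-0042", "alpha", ["/", "/usr"],
        "rdump 0uf tapehost:/dev/nrst0 <filesystem>",
        date(2000, 1, 3),
    ))
```

Recording the filesystems in the order they were written matters because a tape holding several dumps must be positioned at the right tape file before a restore can begin.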
Pick a reasonable backup interval
The more often backups are done, the smaller the amount of data that can be lost in a crash. However, backups use system resources and an operator’s time. The sysadmin must provide adequate data security at a reasonable cost of time and materials.
On busy systems, it is generally appropriate to back up filesystems with home directories every workday. On systems that are used less heavily or on which the data is less volatile, you might decide that performing backups several times a week is sufficient. On a small system with only one user, performing backups once a week is probably adequate. How much data are your users willing to lose?