Backups are a snap with rsnapshot
We’ve all heard the reasons for backing up our data regularly: accidental deletion of files (rm -rf *), corrupted files from crashed applications, the dreaded hard disk failure, and the list goes on. Nevertheless, on average, only 25 per cent of computer users perform routine backups of their data, as shown by a recent Harris Interactive survey. So why do the remaining 75 per cent put off this important task? Well, manual backups are often an ad hoc measure, unreliable, and time-consuming. Automating an otherwise tedious backup process is key to producing routine and reliable backups. With that in mind, we’ll take a look at rsnapshot, a handy backup utility based on rsync, a well-known open source tool.
rsnapshot was written by Nathan Rosenquist as a replacement for a patchwork of complex shell scripts he had crafted to do rsync backups. Any change to the backup scheme meant manually editing the scripts and making sure no bugs were introduced. rsnapshot was a great improvement over this process: it was easy to configure, portable across different operating systems, supported remote backups, and best of all, automated the entire backup process.
rsnapshot enables users to keep multiple backups of their data, from local or remote systems, readily accessible. Each backup is a complete snapshot of the data at a specific point in time. rsnapshot minimizes disk space usage by utilizing hard links (multiple entries in the file system to share a single data entity) and rsync. Thus, the total amount of disk space used is the space for one full backup, plus any incremental snapshots.
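To make the hard link idea concrete, here is a minimal sketch that runs in a throwaway directory. The directory names hourly.0 and hourly.1 and the file notes.txt are made up for illustration, and the sketch shows only the file system behaviour, not how rsnapshot itself is implemented.

```shell
# Two "snapshot" directories that share one copy of an unchanged file.
demo=$(mktemp -d)
mkdir -p "$demo/hourly.0" "$demo/hourly.1"
echo "unchanged data" > "$demo/hourly.1/notes.txt"

# ln (without -s) adds a second directory entry for the same data.
ln "$demo/hourly.1/notes.txt" "$demo/hourly.0/notes.txt"

# Both names show the same inode number in the first column.
ls -li "$demo"/hourly.*/notes.txt

# The second field of "ls -l" is the number of names pointing at the data.
link_count=$(ls -l "$demo/hourly.0/notes.txt" | awk '{print $2}')
echo "link count: $link_count"
```

The file appears in both directories, but its contents are stored only once; the data is freed only when the last name is removed.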
Since rsnapshot is written entirely in Perl, it’s a snap to install on most modern versions of Linux or BSD. In fact, rsnapshot comes pre-installed on Debian, Gentoo, FreeBSD, OpenBSD, and NetBSD. Users with other distributions can compile and install rsnapshot by downloading the latest version from www.rsnapshot.org.
To get started, I will download and install rsnapshot (v1.2.1) on my Fedora Core 4 system (mango). If you’re using a distribution that already has rsnapshot installed, just skip to the next section.
To install rsnapshot you will need to have both perl (v5.004+) and rsync available on your system. Although not required, it helps to have OpenSSH, GNU cp, and GNU du available as well. If you have both on your system, follow the simple instructions below to install rsnapshot.
$ wget -q http://www.rsnapshot.org/downloads/rsnapshot-1.2.1.tar.gz
$ tar xzf rsnapshot-1.2.1.tar.gz
$ cd rsnapshot-1.2.1
$ ./configure --prefix=/usr/local --sysconfdir=/etc
The --sysconfdir=/etc parameter above tells rsnapshot to look for its configuration file (rsnapshot.conf) in /etc. Installing rsnapshot requires root privileges.
$ make install
Make sure rsnapshot is available in your command search path.
$ whereis rsnapshot
rsnapshot: /usr/local/bin/rsnapshot
For the purposes of this article, we will use rsnapshot to back up data from one Linux system (kiwi) to another (mango). rsnapshot will run on mango, which will also host the backup archives. Both systems should have rsync and ssh installed.
All configuration parameters of rsnapshot are controlled via the rsnapshot.conf file. Before we set up rsnapshot, we’ll copy the default configuration file /etc/rsnapshot.conf.default and save it as /etc/rsnapshot.conf. This way we can revert to a clean configuration if we mangle our config file.
Now, let’s edit /etc/rsnapshot.conf on mango to set up our backup system. Most of the parameter defaults do not need modification, so we’ll just focus on those that do.
Where will backups be stored?
The snapshot_root parameter in the SNAPSHOT ROOT DIRECTORY section specifies the directory where rsnapshot will place backup snapshots as they are created. Make sure you select a disk partition with adequate free space to hold your backups.
# Note: Use TABS (not spaces) to separate
# the configuration directive and the value.
# If specifying a directory, put a
# slash at the end.
snapshot_root	/usr2/snapshots/
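Because a stray space in place of a tab is an easy mistake to make, a quick check can help before going further. The sketch below is only an illustration: it builds a small sample file in a temporary location (with one deliberately wrong line) and uses awk to flag non-comment lines that contain no tab at all. To check a real configuration, point the awk command at /etc/rsnapshot.conf instead.

```shell
# Build a small sample config; the third line wrongly uses spaces.
conf=$(mktemp)
printf '# sample config\n' > "$conf"
printf 'snapshot_root\t/usr2/snapshots/\n' >> "$conf"
printf 'interval hourly 8\n' >> "$conf"

# Report every non-comment, non-blank line that has no tab in it.
awk '!/^#/ && NF > 0 && !/\t/ { print "no tab on line " NR ": " $0 }' "$conf"

# Count the offending lines (prints 0 when the file is clean).
bad_lines=$(awk '!/^#/ && NF > 0 && !/\t/ { n++ } END { print n + 0 }' "$conf")
echo "lines without a tab: $bad_lines"
rm -f "$conf"
```

This catches only lines with no tab whatsoever; rsnapshot's own syntax check remains the authoritative test.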
If you plan on using a USB/FireWire hard disk for storing backups, then the no_create_root parameter should be set to 1. This tells rsnapshot not to create the snapshot root directory if it doesn’t already exist, so a backup never lands on the bare mount point when the disk isn’t attached.
Which external programs will rsnapshot use?
The EXTERNAL PROGRAM DEPENDENCIES section contains parameters to specify paths for optional external tools that rsnapshot depends on to provide certain features. Be sure to uncomment any of the lines shown below (cmd_cp, cmd_ssh, cmd_du) that are commented out by removing the hash (#) mark at the beginning of the line.
# use GNU cp
cmd_cp	/bin/cp
# use ssh for secure remote backups
cmd_ssh	/usr/bin/ssh
# use GNU du to check disk space usage
cmd_du	/usr/bin/du
How often will backups happen?
The configuration parameters in the BACKUP INTERVALS section determine how often rsnapshot will perform backups and how many snapshots will be kept. The keyword interval is followed by an alphanumeric label, followed by a number, signifying how many intervals to keep.
In our backup system, we want to take a snapshot of kiwi every 3 hours, so that’s 8 snapshots per day. Each time rsnapshot hourly is executed, it will create a new snapshot, rotate the old ones, and retain the 8 most recent (hourly.0 through hourly.7) snapshots. We also want to take a daily snapshot, and keep a week’s (7 days) worth of snapshots.
#interval	minutes	6
interval	hourly	8
interval	daily	7
#interval	weekly	4
The order of the interval definitions is very important. The first interval line must represent the smallest unit of time, with each subsequent line representing a larger interval. If you were to add a weekly interval, it would appear after the daily interval. Similarly, a minutes interval would appear before hourly.
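The rotation behind an interval such as hourly 8 can be pictured with a short shell sketch. The rotate_snapshots function below is a hypothetical helper written for this illustration; it is a simplified model of "drop the oldest, shift the rest, start a new .0", not rsnapshot's actual code, and it works on empty directories in a temporary location.

```shell
# rotate_snapshots ROOT LABEL KEEP  (hypothetical helper for illustration)
rotate_snapshots() {
    root=$1; label=$2; keep=$3
    oldest=$((keep - 1))
    # Drop the oldest snapshot so the set never exceeds KEEP entries.
    rm -rf "$root/$label.$oldest"
    # Shift label.N to label.N+1, working from the oldest to the newest.
    i=$((oldest - 1))
    while [ "$i" -ge 0 ]; do
        if [ -d "$root/$label.$i" ]; then
            mv "$root/$label.$i" "$root/$label.$((i + 1))"
        fi
        i=$((i - 1))
    done
    # A fresh label.0 is where the next backup run would write.
    mkdir -p "$root/$label.0"
}

# Simulate ten runs with a limit of eight snapshots.
snap_root=$(mktemp -d)
n=0
while [ "$n" -lt 10 ]; do
    rotate_snapshots "$snap_root" hourly 8
    n=$((n + 1))
done
snapshot_count=$(ls "$snap_root" | wc -l | tr -d ' ')
echo "snapshots kept: $snapshot_count"
```

However many times the rotation runs, the directory never holds more than hourly.0 through hourly.7.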
What is included or excluded from the backup?
Most of the parameters in the GLOBAL OPTIONS section can be left at their default values. However, there are two parameters, include and exclude, that you can use to include or exclude files from the backup. Both parameters get passed directly to rsync, so take a look at the --include and --exclude options in the rsync man page for a thorough explanation of how to construct match patterns. If you prefer listing all your include/exclude patterns in separate files, specify them using the include_file and exclude_file parameters.
Here are some simple examples to get you started.
# exclude anything starting with a dot character (.)
exclude	.*
# exclude anything ending with a tilde character (~)
exclude	*~
# include .ssh directory
include	/home/nsharma/.ssh/
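For a rough feel of what the two exclude patterns above catch, the sketch below applies the same wildcards to the last component of a few sample paths using the shell's own pattern matching. The is_excluded function and the sample file names are hypothetical, and this is only an approximation: rsync has its own pattern rules, and the sketch ignores the include line entirely.

```shell
# is_excluded PATH  (hypothetical helper, not part of rsnapshot or rsync)
# Succeeds when the final path component matches ".*" or "*~".
is_excluded() {
    base=${1##*/}
    case "$base" in
        .*|*~) return 0 ;;
        *)     return 1 ;;
    esac
}

for path in /home/nsharma/.bashrc /home/nsharma/report.txt /home/nsharma/draft.txt~; do
    if is_excluded "$path"; then
        echo "excluded: $path"
    else
        echo "kept:     $path"
    fi
done
```

The rsnapshot dry run described later in this article remains the reliable way to see what rsync will actually be told to transfer.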
What should be backed up?
The BACKUP POINTS / SCRIPTS section tells rsnapshot what is to be backed up and where the backup snapshot is stored. This part is very important, so pay attention. We will use rsync over ssh to back up two directories and a file from the system named kiwi, and store the snapshots in a directory named kiwi_backups. The hostname kiwi must resolve to an IP address, either via DNS or the local hosts file (/etc/hosts).
# two directories (/home/nsharma, /my_articles)
backup	root@kiwi:/home/nsharma/	kiwi_backups/
backup	root@kiwi:/my_articles/	kiwi_backups/
# one file
backup	root@kiwi:/etc/passwd	kiwi_backups/
The configuration above will only work if we can log in (without manually entering passwords) to kiwi as root via ssh. The easiest way to set up access is by creating “passphraseless” keys with ssh-keygen, and here’s how to do it.
Setting up “passphraseless” keys
On mango, use the ssh-keygen program to create a public/private key pair of the Digital Signature Algorithm (DSA) type
$ ssh-keygen -t dsa
Generating public/private dsa key pair.
Enter file in which to save the key (/root/.ssh/id_dsa):
Created directory '/root/.ssh'.
Enter passphrase (empty for no passphrase):<HIT ENTER>
Enter same passphrase again:<HIT ENTER>
Your identification has been saved in /root/.ssh/id_dsa.
Your public key has been saved in /root/.ssh/id_dsa.pub.
The key fingerprint is:
0d:f0:ea:bc:b8:0d:69:c6:6d:e0:59:c2:ee:31:4d:90 email@example.com
Transfer the public key from mango to kiwi using scp
$ scp .ssh/id_dsa.pub root@kiwi:mango.pub
root@kiwi's password:<TYPE kiwi’s root PASSWORD><HIT ENTER>
id_dsa.pub 100% 619 0.6KB/s 00:00
Logged in to kiwi as root, append the mango public key to the authorized_keys file
$ cat mango.pub >> /root/.ssh/authorized_keys
Remove the mango.pub file from kiwi
$ rm -f mango.pub
We should now be able to log in to kiwi from mango without being prompted for a password.
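One way to confirm this without being fooled by a password prompt is to ask ssh to run in batch mode, where it fails rather than asking for input. The check_passwordless_login function below is a hypothetical helper for this purpose; the five second timeout is an arbitrary choice.

```shell
# check_passwordless_login TARGET  (hypothetical helper)
# BatchMode=yes makes ssh fail instead of prompting for a password,
# so a zero exit status means key-based login worked.
check_passwordless_login() {
    ssh -o BatchMode=yes -o ConnectTimeout=5 "$1" true
}

# Example, run as root on mango:
#   check_passwordless_login root@kiwi && echo "key login OK"
```

If the command fails, rsnapshot's unattended runs from cron would fail in the same way, so it is worth fixing the key setup before continuing.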
If you’re uncomfortable with the idea of “passphraseless” keys, then take a look at the ssh-agent man page and a utility called keychain available at www.gentoo.org/proj/en/keychain/index.xml.
Testing our configuration
Before we run rsnapshot for the first time, we should make sure the syntax of our configuration file is correct, and execute a dry run of each interval we have defined.
Checking for correct syntax
$ rsnapshot configtest
rsnapshot will either show you the errors, or a Syntax OK message if there are no errors.
Dry run for each interval
# test run for 'hourly' interval
$ rsnapshot -t hourly
# test run for 'daily' interval
$ rsnapshot -t daily
The output from each command will show you exactly what rsnapshot will do for the specified intervals.
Automating the backup process
Our next and final step is to automate the execution of rsnapshot on mango. We’ll add two entries to the cron scheduling service to request execution of rsnapshot every 3 hours on the hour, for the hourly interval, and every night at 11:00 pm, for the daily interval. Logged in as root on mango, we’ll invoke the crontab program with the edit (-e) option. The crontab program invokes the default editor, as specified by the VISUAL or EDITOR shell environment variable.
$ crontab -e
Now, we add the following entries and save and close the file.
0 */3 * * * /usr/local/bin/rsnapshot hourly
0 23 * * * /usr/local/bin/rsnapshot daily
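As a quick sanity check on the first entry, the hour field */3 matches every hour that is divisible by 3. The small loop below simply enumerates those hours; it is plain arithmetic and does not involve cron itself.

```shell
# List the hours of the day matched by the cron hour field */3.
hour=0
runs=0
hours=""
while [ "$hour" -le 23 ]; do
    if [ $((hour % 3)) -eq 0 ]; then
        hours="$hours $hour"
        runs=$((runs + 1))
    fi
    hour=$((hour + 1))
done
echo "hourly runs per day: $runs (at hours:$hours)"
```

The result is eight runs a day, which lines up with the eight hourly snapshots we asked rsnapshot to keep, and none of them falls at 23:00, when the daily run is scheduled.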
That’s it. We now have a fully automated backup system which creates hourly and daily snapshots of our data. For detailed documentation about rsnapshot, check out the rsnapshot man page and the rsnapshot website at www.rsnapshot.org.
Knowing what data to preserve and how to recover it in an emergency is critical to having a solid backup plan. Using the right tools to implement that backup plan is just as important. Take control of your backups with rsnapshot!
Before we finish, here’s an actual run of rsnapshot against the hourly interval.
$ rsnapshot -v hourly
echo 19462 > /var/run/rsnapshot.pid
mkdir -m 0755 -p /usr2/snapshots/hourly.0/
/usr/bin/rsync -a --delete --numeric-ids --relative --delete-excluded \
    --include=/home/nsharma/.ssh/ --exclude=.* --exclude=*~ \
    --rsh=/usr/bin/ssh root@kiwi:/home/nsharma/ \
    /usr2/snapshots/hourly.0/kiwi_backups/
/usr/bin/rsync -a --delete --numeric-ids --relative --delete-excluded \
    --include=/home/nsharma/.ssh/ --exclude=.* --exclude=*~ \
    --rsh=/usr/bin/ssh root@kiwi:/my_articles/ \
    /usr2/snapshots/hourly.0/kiwi_backups/
/usr/bin/rsync -a --delete --numeric-ids --relative --delete-excluded \
    --include=/home/nsharma/.ssh/ --exclude=.* --exclude=*~ \
    --rsh=/usr/bin/ssh root@kiwi:/etc/passwd \
    /usr2/snapshots/hourly.0/kiwi_backups/
touch /usr2/snapshots/hourly.0/
rm -f /var/run/rsnapshot.pid