The other day at work someone asked me if there was some way to have OS X run an rsync command to an external drive whenever it was plugged in. Since we were talking about Mac OS X 10.4, the answer was easy: of course you can do that.

Why would anyone want to do that? Well, when he plugged in the external drive, he wanted it to immediately start backing up his data to the disk, instead of having to type a command or run a script manually. No problem my friend, OS X can accommodate you!

New in 10.4 is a system daemon called launchd. Launchd is Apple’s replacement for a number of *NIX daemons that are typically used for launching system services at boot time or on demand after system launch. Launchd, although a work in progress, is extremely powerful. Process ID 1 in the system is in fact launchd. It’s always running, and always watching.

Launchd gets its configuration information for an agent or daemon from a Property List file (plist). Examples of plist files used by launchd for the system are located in:

/System/Library/LaunchDaemons (system-wide daemons provided by Apple)
/System/Library/LaunchAgents (per-user agents provided by Apple)

At the user level, you can run launchd processes in user space in a number of ways. You can use launchctl (man launchctl) from the command line. Or you can create your own plist file and place it where launchd will find it when you log in, by creating the equivalent “Launch” directories in ~/Library (the /System/Library folders hold Apple’s own jobs; system admins place global configuration files in /Library/LaunchDaemons and /Library/LaunchAgents). Alternatively, you can add the command to a $HOME/.launchd.conf file that you can create and modify (again, the launchctl man page has more information).

The plist file contains information that launchd is going to use to figure out exactly what it’s supposed to do. It could perform a system task or run a custom script.

Ok, enough blabbing, let me illustrate with an example geared toward the request from my co-worker. It’s easier to understand that way. The example assumes you have a FireWire or USB external drive to attach to your system.

Basic Setup

1) In terminal cd to ~/Library
2) If you don’t have a LaunchAgents directory create one:

	mkdir ~/Library/LaunchAgents

3) While you are at it create a folder called Scripts:

	mkdir ~/Library/Scripts

Remember, at login, launchd will scan the contents of the ~/Library/LaunchAgents folder for plist files to process. Once you put one in there, launchd will take over for you every time you log in.

Property List

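1) Launch Terminal.app and in the terminal cd into ~/Library/LaunchAgents and issue the following commands (launchd job files conventionally carry a .plist extension, so the file is named that way here):

  touch com.macresearch.backup.plist
  open -e com.macresearch.backup.plist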

2) With the new file open in TextEdit add the following content to it:


<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE plist PUBLIC "-//Apple Computer//DTD PLIST 1.0//EN" "http://www.apple.com/DTDs/PropertyList-1.0.dtd">
<plist version="1.0">
<dict>
        <key>Label</key>
        <string>com.macresearch.backup</string>
        <key>LowPriorityIO</key>
        <true/>
        <key>Program</key>
        <string>/Users/gohara/Library/Scripts/backup.com</string>
        <key>ProgramArguments</key>
        <array>
                <string>backup.com</string>
        </array>
        <key>WatchPaths</key>
        <array>
                <string>/Volumes</string>
        </array>
</dict>
</plist>


Let’s go over this as a lot of important stuff is here. All of the important information is between the dictionary statements (<dict></dict>).

<key>Label</key>
<string>com.macresearch.backup</string>

This is a unique identifier that launchd will use when it loads the plist file. Once launchd has loaded a configuration file, you can issue “launchctl list” at the command line to see what tasks it is monitoring, and this is the string it will report. Make the string meaningful, as it’s the quickest way to tell what a launchd job is designed to do.

<key>LowPriorityIO</key>
<true/>

Since we are doing file IO, and we may need to use the computer for something more important like…playing online Poker, we want to minimize the system resources diverted to the backup. This is entirely optional.

<key>Program</key>
<string>/Users/gohara/Library/Scripts/backup.com</string>

This tells launchd what program we want it to….well launch.

<key>ProgramArguments</key>
<array>
        <string>backup.com</string>
</array>

The program arguments are important. The first argument listed is always the program itself (it becomes argv[0] of the launched process). If you want to pass in additional arguments, you simply add more <string></string> elements between the array delimiters, one per argument.

<key>WatchPaths</key>
<array>
<string>/Volumes</string>
</array>

Finally we tell launchd what we want to use as a trigger for launching the script. In this example we are telling it to watch the path /Volumes. Why? Well, anytime we mount a device on the file system, its mount point appears in /Volumes. From this point on, launchd knows to watch /Volumes for ANY changes. If it detects a change it will then launch the backup script (our “program”). Again, you can add multiple paths for it to watch by adding path strings between the array delimiters. You can check the man page for the plist format for more options (man launchd.plist).

The important thing to remember here is that launchd will execute the script regardless of what is added or removed from the /Volumes path. This includes CDs/DVDs, USB devices, disk images, or even a folder you create in /Volumes. Launchd is powerful, but it’s stupid (for now). So we need to build some smarts into our program (or script in this case) to make sure the script does the right thing.
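The plist above hard-codes my home directory (/Users/gohara) and a single watch path. If you would rather not edit the XML by hand, something like the following sh function could print an equivalent plist for your own account. Treat it as a rough sketch: the name make_backup_plist and its argument order are made up for this example and are not part of launchd, and it does no XML escaping.

```shell
#!/bin/sh
# Hypothetical helper: print a launchd plist for a label, a program, and
# one or more paths to watch. Values containing &, < or > would need XML
# escaping, which this sketch does not attempt.
#   $1 = label, $2 = full path to the program, $3... = paths to watch
make_backup_plist() {
    label=$1
    program=$2
    shift 2

    cat <<EOF
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE plist PUBLIC "-//Apple Computer//DTD PLIST 1.0//EN" "http://www.apple.com/DTDs/PropertyList-1.0.dtd">
<plist version="1.0">
<dict>
        <key>Label</key>
        <string>$label</string>
        <key>LowPriorityIO</key>
        <true/>
        <key>Program</key>
        <string>$program</string>
        <key>ProgramArguments</key>
        <array>
                <string>$(basename "$program")</string>
        </array>
        <key>WatchPaths</key>
        <array>
EOF
    # One <string> element per remaining argument.
    for watch_path in "$@"; do
        printf '                <string>%s</string>\n' "$watch_path"
    done
    cat <<EOF
        </array>
</dict>
</plist>
EOF
}
```

Calling make_backup_plist com.macresearch.backup "$HOME/Library/Scripts/backup.com" /Volumes prints the plist to standard output; redirect it into your file in ~/Library/LaunchAgents. Any extra arguments after /Volumes become additional watch paths.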

Script

I’m going to create a tcsh script for this example. If you are more comfortable with bash (or even AppleScript), you can convert this example to those forms as well.

1) In Terminal, cd to ~/Library/Scripts and issue the following commands:

  touch backup.com
  chmod 755 backup.com
  open -e backup.com

2) Copy the following into the document:

#!/bin/tcsh

# Convenience variables to specify what I want to
# backup and where I want to back it up to

set folderToBackup = "/Users/Shared/Expenses"
set backupVolume = "/Volumes/BACKUP"
set backupTo = "${backupVolume}/backup"

# This sleep timer has been added to allow enough
# time for the system to mount the external drive
# On my PowerBook 30 sec. is more than enough time

sleep 30

# This check is added to test for cases when we are
# removing a drive from /Volumes or if the drive failed
# to mount in the first place

if (! -e $backupVolume ) then
 exit 0
endif

# Create the folder to back up the data to (defined above)

if (! -e $backupTo) then
 mkdir -p $backupTo
endif

# Copy the files over. Here we are using rsync.

rsync -aq --delete $folderToBackup $backupTo

# Optionally, once the rsync is done, unmount the drive.

#hdiutil detach $backupVolume

exit 0

Ok, let’s go through this.

set folderToBackup = "/Users/Shared/Expenses"
set backupVolume = "/Volumes/BACKUP"
set backupTo = "${backupVolume}/backup"

These are convenience variables. The first defines what I want to back up (the Expenses folder in /Users/Shared). The second is the path to the backup volume (in this case my FireWire drive has a volume called BACKUP). The third is the location on the backup drive I want to back the Expenses folder up to (in this case a folder called backup). Obviously, if your drive or the folder you want to back up is named something else, you’ll need to change these lines.

sleep 30

On my PowerBook it takes about 10 seconds from the time the device is plugged in and mounted in /Volumes until it is ready to accept modifications (that is, until it can be written to). Put another way, launchd launches the script as soon as it sees the device appear in /Volumes, but it can still take a few seconds before anything can be written to the drive. So this sleep is just a buffer to ensure the device is ready.
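If a fixed 30 second pause bothers you, one alternative (not what the script in this article does) is to poll until the volume looks writable and give up after a timeout. Here is a rough sh sketch of that idea. The function name wait_for_writable is hypothetical, and whether a passing “writable” test really lines up with the moment your drive accepts writes is something you would need to check on your own hardware.

```shell
#!/bin/sh
# Hypothetical helper: poll once per second until the directory exists and
# is writable, or give up after the timeout.
#   $1 = directory to wait for, $2 = timeout in seconds (defaults to 30)
wait_for_writable() {
    dir=$1
    timeout=${2:-30}
    waited=0
    while [ "$waited" -lt "$timeout" ]; do
        if [ -d "$dir" ] && [ -w "$dir" ]; then
            return 0
        fi
        sleep 1
        waited=$((waited + 1))
    done
    # One last check so a timeout of 0 still tests the directory once.
    [ -d "$dir" ] && [ -w "$dir" ]
}
```

The trade-off is that when a drive is ejected (or some unrelated disk is mounted), the function waits for the full timeout before reporting failure, so it is no faster than the plain sleep in those cases.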

if (! -e $backupVolume ) then
 exit 0
endif

This set of instructions is designed to make sure that we don’t try to write to the device during the unmounting stage. Remember, launchd will execute this script ANY time a change is made in /Volumes. When we “eject” the disk, launchd will run the script again. This test makes sure that if we have ejected the volume, the script exits before the mkdir and rsync steps can put files directly in the /Volumes directory.

if (! -e $backupTo) then
 mkdir -p $backupTo
endif

rsync should create the directory structure for us in general, but it’s not bad to make sure it’s already in place. And if it isn’t, make it so.

rsync -aq --delete $folderToBackup $backupTo

Finally let’s do the backup.

One optional step is to unmount the disk when the process is complete. To do this, uncomment the following line in the script:

hdiutil detach $backupVolume
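For those who prefer a Bourne-style shell, here is roughly how the same logic might look in sh. This is my own loose translation, not a tested replacement for the tcsh script: the function name backup_if_mounted is made up for the sketch, and the paths are passed in as arguments instead of being set at the top.

```shell
#!/bin/sh
# Rough sh translation of the tcsh script; backup_if_mounted is a
# hypothetical name chosen for this sketch.
#   $1 = folder to back up
#   $2 = mount point of the backup volume
#   $3 = seconds to wait for the drive to settle (defaults to 30)
backup_if_mounted() {
    folder_to_backup=$1
    backup_volume=$2
    settle_seconds=${3:-30}
    backup_to="$backup_volume/backup"

    # Give the system time to finish mounting the drive.
    sleep "$settle_seconds"

    # Nothing to do if the volume is absent (eject, or failed mount).
    if [ ! -e "$backup_volume" ]; then
        return 0
    fi

    # Make sure the destination folder exists, then copy.
    mkdir -p "$backup_to" || return 1
    rsync -aq --delete "$folder_to_backup" "$backup_to"
}

# To use this as the launchd program, end the script with a call such as:
# backup_if_mounted "/Users/Shared/Expenses" "/Volumes/BACKUP"
```

Quoting the variables means a volume name containing spaces should not trip up the checks. If you go this route, remember to point the Program key in the plist at the new script.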

Register the Script with Launchd

There are two ways to register the script with launchd: from the command line, or by simply logging out and then back in. To save some effort let’s register it from the command line:

  launchctl load ~/Library/LaunchAgents

Now issue the command:

  launchctl list

You should see something like this:

[Voyager:~/Library/Scripts] gohara% launchctl list
com.macresearch.backup

Ok. Launchd is aware and ready to go.
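One thing to keep in mind: launchd reads the plist when the job is loaded, so if you edit the file later you need to unload and load it again (or log out and back in). A small sh helper along these lines can save some typing. The function name reload_agent and the LAUNCHCTL override are my own inventions for this sketch; only “launchctl load” and “launchctl unload” are real launchctl subcommands.

```shell
#!/bin/sh
# Hypothetical helper: unload and then load one job definition so launchd
# picks up edits. LAUNCHCTL can be overridden (for example with "echo") to
# preview the commands without touching launchd.
LAUNCHCTL=${LAUNCHCTL:-launchctl}

#   $1 = path to the job's plist file
reload_agent() {
    plist=$1
    if [ ! -f "$plist" ]; then
        echo "no such file: $plist" >&2
        return 1
    fi
    # Ignore a failed unload, since the job may not be loaded yet.
    "$LAUNCHCTL" unload "$plist" 2>/dev/null || true
    "$LAUNCHCTL" load "$plist"
}
```

Pass it the full path to your job file in ~/Library/LaunchAgents, then run “launchctl list” again to confirm the label is still reported.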

Plug in the Drive

Once you plug in the drive (and wait ~30 seconds) what you should notice is that the folder (and its contents) you designated to be backed up will begin appearing on the drive at the specified path. Pretty cool huh?

Once the process is complete you can safely eject the disk.

Afterthoughts

The WatchPaths directive is very powerful. Imagine you have a folder that you occasionally want to dump files into. Maybe those files are data generated by some other program. You can tell launchd to watch that folder, and whenever data appears there (or really, whenever any modification is made) launchd can run a command/script/program to do something with that data. For example, you could have launchd run a script that will convert the data, pass it into a plotting program, generate plots, and then email the plots to you or a colleague. Pretty cool stuff!
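To make the drop-folder idea a bit more concrete, here is a minimal sh sketch of the kind of script launchd could run. Everything in it is illustrative: process_drop_folder is a name I made up, and the handler is whatever command you want applied to each file (a converter, a plotting script, and so on).

```shell
#!/bin/sh
# Hypothetical sketch: run a handler on every regular file in a watched
# folder, then move the file elsewhere so it is not processed twice.
#   $1 = watched folder
#   $2 = folder for finished files (keep it OUTSIDE the watched folder)
#   $3... = handler command and its arguments
process_drop_folder() {
    watched=$1
    finished=$2
    shift 2

    mkdir -p "$finished" || return 1
    for file in "$watched"/*; do
        # Skip subfolders and the unexpanded pattern of an empty folder.
        [ -f "$file" ] || continue
        # The handler receives the file path as its last argument.
        if "$@" "$file"; then
            mv "$file" "$finished"/
        fi
    done
    return 0
}
```

Because launchd reacts to any change in the watched path, moving the files out is itself a change, so expect the job to be started once more; that second run simply finds nothing to do. Files whose handler fails are left in place and will be retried on the next trigger.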