Getting Started with OpenMPI and Xgrid
There are many MPI implementations around for the Mac, but one is particularly well integrated: OpenMPI. OpenMPI is the new kid on the block, though descended from old hands like LAM-MPI, and it already supports a wide variety of platforms and batch systems, including Mac OS X and Xgrid.
One of the weaknesses of OpenMPI at this early stage in its development is documentation. So although Xgrid is supported by OpenMPI, you are largely on your own when it comes to figuring out how to use it. That's where this tutorial comes in. I've done some of the heavy lifting so that you don't have to. Here I'll discuss how you install OpenMPI for use with Xgrid, and how you need to configure your environment to get started running jobs. What this tutorial will not do is tell you how to use Xgrid. For the time being, you'll need to look elsewhere for a tutorial on that.
Building and Installing OpenMPI
There is an installer package available for Mac OS X from the OpenMPI downloads page, but it doesn't seem to be kept very current, and it has a few shortcomings, such as no support for 64-bit machines and no Fortran bindings. Given these issues, you are probably best off compiling your own copy, which, luckily, is not too involved.
Begin by downloading the source files for the latest stable release. At the time of writing, this is version 1.1.2, and the file to download is openmpi-1.1.2.tar.gz. Unpack it somewhere, either by double clicking it, or by using the command:
tar xvzf openmpi-1.1.2.tar.gz
Change into the root directory of the unpacked files, and create a directory in there to build in called 'build'. Change into the build directory.
cd openmpi-1.1.2
mkdir build
cd build
Now we need to do the usual configure/make/make install dance. First, here is the configure command that I used:
../configure --prefix=/usr/local/openmpi --with-fortran --enable-shared --disable-static --with-xgrid
This will result in an OpenMPI installation in the /usr/local/openmpi directory, with Fortran bindings enabled, and shared libraries rather than static ones. Note also that you need to ensure Xgrid support is included; the documentation seems to suggest it is included by default, but my experience is that it is not.
The configure command above will attempt to guess which Fortran compiler you have. If you are using gfortran, available here, everything should work fine. If you want to use another compiler, like Intel's ifort or IBM's xlf, you will need to set the environment variables F77, F90, and FC. If you also want to set compiler flags and link options, you should set F77FLAGS, F90FLAGS, FCFLAGS, and LDFLAGS. A configure command for the XLF compiler, for example, may look like this:
F90=xlf90 FC=xlf90 F77=xlf ../configure --prefix=/usr/local --with-fortran \
    --enable-shared --disable-static --with-xgrid
More information on these environment variables is available by supplying the --help option to the configure command.
Having configured the installation, it's time to build it. Simply issue the command
make
and if all goes well
sudo make install
You will be asked to enter your admin password for the latter.
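Since Xgrid support may not be built by default, it can be worth confirming that it made it into the installation. The sketch below uses the ompi_info tool that ships with OpenMPI, and assumes that Xgrid components show up in its listing with "xgrid" in their names; the has_xgrid helper is a hypothetical name introduced just for this example. The full path is used because the OpenMPI commands are not yet in your path at this point.

```shell
# Succeeds if the component listing on stdin mentions Xgrid.
# Assumption: Xgrid components are listed with "xgrid" in their names.
has_xgrid() {
    grep -qi xgrid
}

# Path follows the --prefix used in the configure command above.
OMPI_INFO=/usr/local/openmpi/bin/ompi_info

if [ -x "$OMPI_INFO" ]; then
    if "$OMPI_INFO" | has_xgrid; then
        echo "Xgrid support found"
    else
        echo "No Xgrid component listed; check the configure options"
    fi
fi
```

If no Xgrid component is listed, rerunning configure with --with-xgrid and rebuilding is the first thing to try.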
Lastly, you will need to make sure the OpenMPI commands are in your path. You can add them to your shell resource file, but if you want them available system-wide, I suggest adding the following to the /etc/profile file:
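A minimal sketch of such a line, assuming the /usr/local/openmpi prefix from the configure command above (the OPENMPI_PREFIX variable name is only for illustration):

```shell
# Assumes OpenMPI was installed with --prefix=/usr/local/openmpi.
OPENMPI_PREFIX=/usr/local/openmpi

# Put the OpenMPI commands (mpirun, mpicc, ...) ahead of anything else in the path.
export PATH="$OPENMPI_PREFIX/bin:$PATH"
```

If you installed to a different prefix, adjust OPENMPI_PREFIX to match.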
Installing on Other Systems
If you want to use OpenMPI with Xgrid, you need to ensure it is available on every machine in the grid. You can either try to set up a shared file system and install OpenMPI there, or you can simply copy the installation you just created to each machine. You can archive the installation with tar, like this:
tar cvzf openmpibinaries.tgz /usr/local/openmpi
Copy the openmpibinaries.tgz file to the other machines in your grid, and unpack it like this:
sudo tar xvzf openmpibinaries.tgz -C /
To finish the installation, modify the /etc/profile file on each machine, as described above. This will allow user 'nobody', under which the Xgrid tasks are actually run, to access the OpenMPI commands.
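With more than a couple of machines, the copy and unpack steps in this section get repetitive. The sketch below prints the commands for each host as a dry run, so you can review them before running anything. The host names and the install_commands helper are hypothetical, and it assumes Remote Login (SSH) is enabled on every machine.

```shell
# Hypothetical host names; replace with the machines in your grid.
HOSTS="node1.local node2.local"
ARCHIVE=openmpibinaries.tgz

# Print the commands that copy the archive to one host and unpack it there.
# Assumes Remote Login (SSH) is enabled on the host.
install_commands() {
    echo "scp $ARCHIVE $1:/tmp/"
    echo "ssh -t $1 sudo tar xzf /tmp/$ARCHIVE -C /"
}

# Dry run: only print the commands, one pair per host.
for host in $HOSTS; do
    install_commands "$host"
done
```

Once the printed commands look right, they can be pasted into a terminal one at a time. The -t option gives sudo a terminal to prompt for the password on the remote machine.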
Installing a Test Program
Now we need an MPI program to test, and we need to install that program on every computer in the grid. For my test program, I went to Google's new Code Search engine and looked for a simple ping-pong program that sends messages back and forth between two nodes. I came up with this, which is included in a package that can be downloaded here.
To compile the ping pong program, unpack the archive, and change to the directory lab_SMI+MPI_SCI-Summer-School_2001/MPI/pingpong. Enter a make command to build. The resulting pingpong executable needs to be transferred to each computer in your grid, into a directory that is accessible by user 'nobody'. One possibility is /tmp.
Running an OpenMPI Program via Xgrid
To test your Xgrid setup, first set up an Xgrid with password authentication. (For instructions, go here.) On the job submission machine, set the environment variables XGRID_CONTROLLER_HOSTNAME and XGRID_CONTROLLER_PASSWORD appropriately. This is important, because OpenMPI looks for these environment variables when deciding what type of batch environment it will use. If these variables are set, Xgrid will get the nod.
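In a Bourne-style shell, setting the two variables might look like the sketch below. The host name and password are placeholders, and the xgrid_env_ok helper is a hypothetical convenience for checking that both variables are set before you submit a job.

```shell
# Hypothetical values; substitute your own controller host name and password.
export XGRID_CONTROLLER_HOSTNAME=controller.example.com
export XGRID_CONTROLLER_PASSWORD=secret

# Return success only if both variables are set and non-empty.
xgrid_env_ok() {
    [ -n "${XGRID_CONTROLLER_HOSTNAME:-}" ] && [ -n "${XGRID_CONTROLLER_PASSWORD:-}" ]
}

if xgrid_env_ok; then
    echo "Xgrid controller: $XGRID_CONTROLLER_HOSTNAME"
else
    echo "Xgrid variables are not set" >&2
fi
```

Keep in mind that a password stored in a shell resource file is readable by anyone who can read that file.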
Next, you will want to install the Xgrid Admin tool, if you don't already have it. This will allow you to monitor the progress of jobs, see where they are running, and see whether they fail. Launch the tool and connect to your Xgrid controller.
Before you actually submit your job, you will need to temporarily disable your firewalls, or configure them to allow access from other machines in the grid. OpenMPI currently uses random port numbers, which unfortunately means you can't just open up a few ports on the firewall of each machine.
Finally, to submit your ping pong test, issue this command:
mpirun -n 2 /tmp/pingpong
If all goes well, the output of the ping pong test should give you a table of data throughput as a function of message size. When running both processes on a single shared-memory Mac, I get numbers like this:
size = 0, bandwidth = 0.000 MB/s latency = 135.031 usec.,
size = 1, bandwidth = 0.007 MB/s latency = 127.969 usec.,
size = 2, bandwidth = 0.014 MB/s latency = 132.470 usec.,
size = 4, bandwidth = 0.025 MB/s latency = 154.469 usec.,
size = 8, bandwidth = 0.061 MB/s latency = 124.681 usec.,
size = 16, bandwidth = 0.123 MB/s latency = 123.920 usec.,
size = 32, bandwidth = 0.249 MB/s latency = 122.650 usec.,
size = 64, bandwidth = 0.485 MB/s latency = 125.930 usec.,
size = 128, bandwidth = 0.935 MB/s latency = 130.620 usec.,
size = 256, bandwidth = 1.938 MB/s latency = 126.002 usec.,
size = 512, bandwidth = 3.531 MB/s latency = 138.302 usec.,
size = 1024, bandwidth = 7.765 MB/s latency = 125.771 usec.,
size = 2048, bandwidth = 16.652 MB/s latency = 117.290 usec.,
size = 4096, bandwidth = 33.392 MB/s latency = 116.980 usec.,
size = 8192, bandwidth = 57.058 MB/s latency = 136.921 usec.,
size = 16384, bandwidth = 93.568 MB/s latency = 166.991 usec.,
size = 32768, bandwidth = 110.980 MB/s latency = 281.582 usec.,
size = 65536, bandwidth = 110.892 MB/s latency = 563.612 usec.,
size = 131072, bandwidth = 160.956 MB/s latency = 776.610 usec.,
size = 262144, bandwidth = 210.162 MB/s latency = 1189.561 usec.,
size = 524288, bandwidth = 160.634 MB/s latency = 3112.669 usec.,
size = 1048576, bandwidth = 289.705 MB/s latency = 3451.791 usec.,
When I run the same test on machines connected via gigabit ethernet, I get this performance:
size = 0, bandwidth = 0.000 MB/s latency = 210.061 usec.,
size = 1, bandwidth = 0.005 MB/s latency = 191.882 usec.,
size = 2, bandwidth = 0.013 MB/s latency = 151.482 usec.,
size = 4, bandwidth = 0.025 MB/s latency = 153.999 usec.,
size = 8, bandwidth = 0.053 MB/s latency = 144.269 usec.,
size = 16, bandwidth = 0.103 MB/s latency = 148.239 usec.,
size = 32, bandwidth = 0.205 MB/s latency = 148.680 usec.,
size = 64, bandwidth = 0.383 MB/s latency = 159.559 usec.,
size = 128, bandwidth = 0.710 MB/s latency = 171.809 usec.,
size = 256, bandwidth = 1.364 MB/s latency = 179.050 usec.,
size = 512, bandwidth = 2.413 MB/s latency = 202.339 usec.,
size = 1024, bandwidth = 3.826 MB/s latency = 255.220 usec.,
size = 2048, bandwidth = 4.687 MB/s latency = 416.710 usec.,
size = 4096, bandwidth = 7.140 MB/s latency = 547.101 usec.,
size = 8192, bandwidth = 7.976 MB/s latency = 979.512 usec.,
size = 16384, bandwidth = 9.060 MB/s latency = 1724.560 usec.,
size = 32768, bandwidth = 10.133 MB/s latency = 3084.109 usec.,
size = 65536, bandwidth = 1.338 MB/s latency = 46721.501 usec.,
size = 131072, bandwidth = 5.549 MB/s latency = 22527.230 usec.,
size = 262144, bandwidth = 4.853 MB/s latency = 51509.829 usec.,
size = 524288, bandwidth = 4.535 MB/s latency = 110247.691 usec.,
size = 1048576, bandwidth = 4.248 MB/s latency = 235423.400 usec.,
These tests are by no means scientific, but it is pretty clear that things are much slower over the network, as you would expect: bandwidth peaks at about 290 MB/s in shared memory, but only at about 10 MB/s over the network.
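To compare runs without reading the tables by eye, the peak bandwidth can be pulled out of the pingpong output with a little awk. This is a sketch that assumes the output format shown above, where each record contains the words "bandwidth = " followed by a number; the peak_bandwidth function name is hypothetical.

```shell
# Print the highest bandwidth figure (in MB/s) found in pingpong output on stdin.
# Assumes records of the form "size = N, bandwidth = X MB/s latency = Y usec.,".
peak_bandwidth() {
    awk '{
        # Scan for the token sequence: bandwidth = <number>
        for (i = 1; i <= NF - 2; i++)
            if ($i == "bandwidth" && $(i + 1) == "=" && $(i + 2) + 0 > max)
                max = $(i + 2) + 0
    }
    END { printf "%.3f\n", max }'
}
```

It can be used by piping the test straight into it, for example: mpirun -n 2 /tmp/pingpong | peak_bandwidth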
That's all for now, but we have plenty more Xgrid goodness coming soon. MacResearch has been successful in convincing the Xgrid-guru Charles Parnot to write a series of articles on the technology. So keep an eye on your MacResearch RSS feed, and don't miss it.