Tutorial: Mixed C and Fortran...and some LAPACK

A recent question in one of the discussion forums led to the general question of using some of the performance libraries in Mac OS X with a specific emphasis on mixing C and Fortran, as well as implied data types in LAPACK. In this mini-tutorial I'm going to cover some of the basics of mixed C and Fortran using LAPACK (really the clapack interface) from the Accelerate framework. I'll cover more of Accelerate when I resume that tutorial series.

C and Fortran

Calling functions/subroutines across C and Fortran can be a mixed bag. This is almost always due to name mangling of the routines that occurs at compile time. In general, without user intervention of any kind (such as typdefs) expect the C frontend to preserve the routine symbols in the form defined in the source file (plus a leading underscore). For example:


void foo_(){ return; }
void FoO(){ return; }

Will produce symbols in the compiled file that look like this ('man nm' for more information on symbol lookup):


_foo_
_FoO

You'll note that in both cases the symbols have a leading underscore. C compilers typically do this to all external scope items to avoid name clashes from the runtime language support mechanisms. However, you'll want to otherwise note that the case and trailing underscores are preserved as is.

Fortran on the other hand is slightly different. What you type and what you get are often very different. It can be complicated by the fact that the behavior can also be different depending on the compiler you use or the operating system. For example with gfortran:


subroutine foo()
subroutine FoO()

Will produce the same symbol:


_foo_

You'll notice the following. First, a trailing underscore was automatically added. Second, in both cases the code produced a symbol in all lowercase. Now if you actually tried to write a Fortran program with both of those subroutines defined and compiled as is, the compiler would complain and halt. The trailing underscores with Fortran compilers are there for historical reasons and compatibility across compiler vendors. To suppress trailing underscores you would use a flag like -fno-underscoring in gfortran/g77. To use underscores (or to keep a second underscore from getting appended if you decide to name your subroutines with underscores for C compatability) or not depends on the situation. If you decide that you will manually include underscores in your names you may need to use -fno-second underscore. It can be tricky and confusing...

But what about the case sensitivity issue? There are ways to preserve case so that foo and FoO are treated differently although the settings vary depending on the compiler (for example with g77 you can use -fcase-preserve, the same flag does not appear to work with gfortran).

What to Expect

All things being equal, which they never are in practice, my recommendation is to expect that the C compiler will create symbols from routine names exactly as they appear in the source and for Fortran expect the names to be converted to lowercase with a trailing underscore. If you assume that, and depending on the code, mix of C and Fortran sources that may not be a valid assumption.

So let's create an example, you'll need gfortran installed on your computer and gcc which comes with Xcode or from here. We will do all of this from terminal for simplicity.

You can download the files from here.

In the folder called part1 are a C and Fortran file. We are going to start from Fortran so open that first.

We've got our main program, and a subroutine called foo, foo then calls cfoo. So open the C file. In the C file, there is function called cfoo_ (note the trailing underscore). We've added the trailing underscore because, if you'll recall, we are expecting the Fortran compiler to generate a trailing underscore.

Let's compile them and see what happens:


gcc -c cfile.c
gfortran -o test fortran.f cfile.o

Before we run it, let's check all of the symbols that contain foo:


[buster:part1] bubb% nm test | grep -i foo
00002a64 T _cfoo_
00002a38 T _foo_

And then let's run the program:


[buster:part1] bubb% ./test 
Hello from C

Sweet! We've now got a program that will make calls from Fortran to C. Let's play some ping pong. Look at the C and Fortran files in part2. Basically we added a new subroutine in Fortran and call to it from C. Again, we are relying on the fact (in C) that the Fortran compiler will append a trailing underscore. You can get around having to mangle your functions using typedefs and compiler flags (but I won't cover that here).


[buster:part2] bubb% gcc -c cfile.c
[buster:part2] bubb% gfortran -o test fortran.f cfile.o
[buster:part2] bubb% nm test | grep -i foo
000029cc T _cfoo_
00002948 T _ffoo_
0000291c T _foo_
[buster:part2] bubb% ./test
Hello from C
 Hello from Fortran

Passing Variables
This is simple. By default, Fortran (77 at least) passes everything by address. In C it's by value. So, when calling from Fortran to C you either need to explicitly send in a value (typically using %VAL(myVariable)) or more commonly in C, you take the value as a pointer.

Look in the folder called part3. We've modified foo (fortran.f) to send in a variable by address and by value. You'll need g77 for this example as I'm not sure if gfortran supports the %VAL extension yet.


[buster:part3] bubb% gcc -c cfile.c
[buster:part3] bubb% g77 -o test fortran.f cfile.o -lSystemStubs
[buster:part3] bubb% ./test
Value from Fortran foo by reference: 1
Value from Fortran foo by value    : 1

Similarly if you wanted to call the same C function from within C, you'd either pass in the address of the variable. Look at part4 to see more:


[buster:part4] bubb% ./test
Value from Fortran foo by reference: 1
Value from Fortran foo by value    : 1
Calling from cfoo2
Value from Fortran foo by reference: 1
Value from Fortran foo by value    : 1

Note in the C code we passed in the address of the variable as the first argument and the variable directly in the second.

Finally you can do pass through from Fortran to C to call another function in C. Loot at part5 for more information:


[buster:part5] bubb% gcc -c cfile.c
[buster:part5] bubb% g77 -o test fortran.f cfile.o -lSystemStubs
[buster:part5] bubb% ./test
Value from Fortran foo by reference: 1
Value from Fortran foo by value    : 1

Note that we passed the first argument straight through (we already have a pointer and we need one for the first argument) and for the second we dereferenced it to send in the value at the address of that pointer.

"That's great, but what does this have to do with LAPACK"

Ok, Ok... I'm getting to it. Let's fast forward to LAPACK as implemented in Accelerate using the clapack entry.

The LAPACK header file is located in:

/System/Library/Frameworks/Accelerate.framework/Versions/A/Frameworks/vecLib.framework/Headers/

Open the file clapack.h in that directory.

At the top you'll see:


#if defined(__LP64__) /* In LP64 match sizes with the 32 bit ABI */
typedef int 		__CLPK_integer;
typedef int 		__CLPK_logical;
typedef float 		__CLPK_real;
typedef double 		__CLPK_doublereal;
typedef 		__CLPK_logical 	(*__CLPK_L_fp)();
typedef int 		__CLPK_ftnlen;
#else
typedef long int 	__CLPK_integer;
typedef long int 	__CLPK_logical;
typedef float 		__CLPK_real;
typedef double 		__CLPK_doublereal;
typedef __CLPK_logical 	(*__CLPK_L_fp)();
typedef long int 	__CLPK_ftnlen;
#endif

typedef struct { __CLPK_real r, i; } __CLPK_complex;
typedef struct { __CLPK_doublereal r, i; } __CLPK_doublecomplex;

So to the first part of the question in the post/thread mentioned above. What are the data types that you need. That's easy. The data types are typedef'd for you by Apple, so in reality you send in the the data type that corresponds to the normal C type. Some might argue that you should use the data type as it's typedef'd by Apple. And they may be right. But unless I have to recreate some complex structure, it makes the most sense to simply use the intrinsic type that C (or Fortran provide).

Now the first place to look for the actual usage and implementation is in the documentation:

http://www.netlib.org/lapack/

The documentation will cover in some detail what LAPACK can (and cannot) do. It will also cover issues related to sizing. In particular if you download the source of LAPACK and look at the file for the subroutine of interest you will see information regarding sizing and size dependencies, in addition to information about the variables and function purpose. What I'll cover is how to access it in Accelerate from Fortran and C.

Look in the folder called part6. You will see a single Fortran source file with a coll to cgeql2. To compile the code and get it to link against Accelerate do the following:


[buster:part6] bubb% gfortran -o test fortran.f -Wl,-framework -Wl,Accelerate
[buster:part6] bubb% ./test

Of course the calculation doesn't really do anything or print anything out. But if you wanted to use it with your own data, that is how you would.

How does this look from C?

From C you can use either the direct entry to a function if you are using Accelerate. For example a call from C to CGELQ2, can be done as:


cgelq2(&m,&n,a,&lda,tau,work,&info);

Or as:


cgelq2_(&m,&n,a,&lda,tau,work,&info);

Note that in both cases scalar values (m,n,lda and info) are passed in by address. Remember that in C arrays (either static or dynamic) are really pointers to a location in memory, so those don't need any modification.

One other note, if you do as I said above and do NOT use the typedef's in Accelerate for the data types, the compiler will complain. In particular if you define your own complex or double complex data type. Although the warning is harmless (because the data types are actually correct), this is one of those instances where it might be better to use the type defined by the header.

Am I giving mixed advice? Yes. But in the end you need to evaluate what is most important for YOUR code. Clean, self documenting code is important, but not at the expense of compile time errors or checks that can make debugging harder later on. So when in doubt use the typdef in the header, and make a note about that in your code (so someone else looking at your code doesn't have to dig into the framework headers to see what is going on).

Ok that's it for now.

Comments

Comment viewing options

Select your preferred way to display the comments and click "Save settings" to activate your changes.

Easier than I thought

Thanks for the tutorial! This brings me a good bit closer to what I'm trying to accomplish.
Justin

LAPACK argument documentation

It's great to discover that some of the numerical tools I was looking for are essentially preinstalled for me. But where can I find documentation of argument usage? The names of the arguments in clapack.h are (though descriptive to experts, I'm sure) opaque to me, and the documentation at netlib.org does not seem to have the description that I need - the "Specifications of Routines" section at the end looks like a stub.

For example, in your function call above:

cgelq2(&m,&n,a,&lda,tau,work,&info);

how do I find out exactly what m, n, and a do? Am I missing something here?

LAPACK resource

Hi,

I don't use LAPACK much, but I thought maybe the User's Guide begins to address some of your documentation needs.

http://www.netlib.org/lapack/lug/node110.html

Note that this User's Guide documents the FORTRAN routines(as far as I know)

In case you were unfamiliar with C syntax, the & symbol in front of a parameter means that you are passing in the address of that variable(so if a routine requires a pointer parameter, you would use that).

Regards,

Chinmoy.

aha - here's where

You can look up the source code for each routine online and read the comments at the top. The "User's Manual" still seems incomplete, though.

Question about C/Fortran type conversion

Hi,

I am writing a C++ program in Xcode. I have added Accelerate framework and included Accelerate/Accelerat.h. However, the compiler prompts:
error: invalid conversion from 'int*' to '__CLPK_integer*'

for the following line:
dgbsv_( &n, &ib, &ib, &rhs_n, &mat[0][0], &height, &pivot[0], &rhs[0], &n, &ierr );

where n, ib, ... are integers.

The tutorial by dgohara says, in the beginning of clapack.h, we have the following definition:
#if defined(__LP64__) /* In LP64 match sizes with the 32 bit ABI */
typedef int __CLPK_integer;
typedef int __CLPK_logical;
typedef float __CLPK_real;
typedef double __CLPK_doublereal;
typedef __CLPK_logical (*__CLPK_L_fp)();
typedef int __CLPK_ftnlen;
#else
typedef long int __CLPK_integer;
typedef long int __CLPK_logical;
typedef float __CLPK_real;
typedef double __CLPK_doublereal;
typedef __CLPK_logical (*__CLPK_L_fp)();
typedef long int __CLPK_ftnlen;
#endif

I am working on a Macbook Core 2 Duo. Does the error mean that the architecture of my machine does not support LP64? I have already turned on the "Use 64-bit Integer Math" in the build options.

Many thanks.

Re: Question about C/Fortran type conversion

Hi,

I believe what the compiler is complaining about is the probably due to the type you declared your variables as. Specifically as integers. That is to say the prototype for dgbsv probably is something like:

dgbsv(__CLPK_int *n, __CLPK_int *ib ...);

So it's really just objecting to the fact that you probably declared your n and ib variables as:

int n;
int ib;

as opposed to:

__CLPK_int n;
__CLPK_int ib;

Alternatively you can probably pass in the variables as so:

dgbsv_((__CLPK_int) &n, (__CLPK_int) &ib,,,);

Hope that helps,

Dave