This section discusses a number of problems that may arise when using MPI in a Fortran program. It is intended as advice to users, and clarifies how MPI interacts with Fortran. It does not add to the standard, but is intended to clarify the standard.
As noted in the original MPI specification, the interface violates the Fortran standard in several ways. While these cause few problems for Fortran 77 programs, they become more significant for Fortran 90 programs, so that users must exercise care when using new Fortran 90 features. The violations were originally adopted and have been retained because they are important for the usability of MPI. The rest of this section describes the potential problems in detail. It supersedes and replaces the discussion of Fortran bindings in the original MPI specification (for Fortran 90, not Fortran 77).
The following MPI features are inconsistent with Fortran 90.
MPI-1 contained several routines that take address-sized information as input or return address-sized information as output. In C such arguments were of type MPI_Aint and in Fortran of type INTEGER. On machines where integers are smaller than addresses, these routines can lose information. In MPI-2 the use of these functions has been deprecated and they have been replaced by routines taking INTEGER arguments of KIND=MPI_ADDRESS_KIND. A number of new MPI-2 functions also take INTEGER arguments of non-default KIND. See Section Language Binding on page Language Binding and Section Type Constructors with Explicit Addresses on page Type Constructors with Explicit Addresses for more information.
All MPI functions with choice arguments associate actual arguments of different Fortran datatypes with the same dummy argument. This is not allowed by Fortran 77, and in Fortran 90 is technically only allowed if the function is overloaded with a different function for each type. In C, the use of void* formal arguments avoids these problems.
The following code fragment is technically illegal and may
generate a compile-time error.
integer i(5) real x(5) ... call mpi_send(x, 5, MPI_REAL, ...) call mpi_send(i, 5, MPI_INTEGER, ...)In practice, it is rare for compilers to do more than issue a warning, though there is concern that Fortran 90 compilers are more likely to return errors.
It is also technically illegal in Fortran to pass a scalar actual argument to an array dummy argument. Thus the following code fragment may generate an error since the buf argument to MPI_SEND is declared as an assumed-size array <type> buf(*).
integer a call mpi_send(a, 1, MPI_INTEGER, ...)
In the event that you run into one of the problems related to
type checking, you may be able to work around it by using a compiler
flag, by compiling separately, or by using an MPI implementation
with Extended Fortran Support as described in Section
Extended Fortran Support
.
An alternative that will usually work with variables local to a
routine but not with arguments to a function or subroutine is to use
the EQUIVALENCE statement to create another variable with a type
accepted by the compiler.
( End of advice to users.)
Implicit in MPI is the idea of a contiguous chunk of memory accessible through a linear address space. MPI copies data to and from this memory. An MPI program specifies the location of data by providing memory addresses and offsets. In the C language, sequence association rules plus pointers provide all the necessary low-level structure.
In Fortran 90, user data is not necessarily stored contiguously. For example, the array section A(1:N:2) involves only the elements of A with indices 1, 3, 5, ... . The same is true for a pointer array whose target is such a section. Most compilers ensure that an array that is a dummy argument is held in contiguous memory if it is declared with an explicit shape (e.g., B(N)) or is of assumed size (e.g., B(*)). If necessary, they do this by making a copy of the array into contiguous memory. Both Fortran 77 and Fortran 90 are carefully worded to allow such copying to occur, but few Fortran 77 compilers do it.*
Because MPI dummy buffer arguments are assumed-size arrays, this leads
to a serious problem for a non-blocking call: the compiler copies the
temporary array back on return but MPI continues to copy data to the
memory that held it. For example, consider the following code fragment:
real a(100) call MPI_IRECV(a(1:100:2), MPI_REAL, 50, ...)Since the first dummy argument to MPI_IRECV is an assumed-size array ( <type> buf(*)), the array section a(1:100:2) is copied to a temporary before being passed to MPI_IRECV, so that it is contiguous in memory. MPI_IRECV returns immediately, and data is copied from the temporary back into the array a. Sometime later, MPI may write to the address of the deallocated temporary. Copying is also a problem for MPI_ISEND since the temporary array may be deallocated before the data has all been sent from it.
Most Fortran 90 compilers do not make a copy if the actual argument is the whole of an explicit-shape or assumed-size array or is a `simple' section such as A(1:N) of such an array. (We define `simple' more fully in the next paragraph.) Also, many compilers treat allocatable arrays the same as they treat explicit-shape arrays in this regard (though we know of one that does not). However, the same is not true for assumed-shape and pointer arrays; since they may be discontiguous, copying is often done. It is this copying that causes problems for MPI as described in the previous paragraph.
Our formal definition of a `simple' array section is
name ( [:,]... [<subscript>]:[<subscript>] [,<subscript>]... )That is, there are zero or more dimensions that are selected in full, then one dimension selected without a stride, then zero or more dimensions that are selected with a simple subscript. Examples are
A(1:N), A(:,N), A(:,1:N,1), A(1:6,N), A(:,:,1:N)Because of Fortran's column-major ordering, where the first index varies fastest, a simple section of a contiguous array will also be contiguous.*
The same problem can occur with a scalar argument. Some compilers, even for
Fortran 77, make a copy of some scalar dummy arguments within
a called procedure. That this can cause a problem is illustrated by the
example
call user1(a,rq) call MPI_WAIT(rq,status,ierr) write (*,*) a subroutine user1(buf,request) call MPI_IRECV(buf,...,request,...) endIf a is copied, MPI_IRECV will alter the copy when it completes the communication and will not alter a itself.
Note that copying will almost certainly occur for an argument that is a non-trivial expression (one with at least one operator or function call), a section that does not select a contiguous part of its parent (e.g., A(1:n:2)), a pointer whose target is such a section, or an assumed-shape array that is (directly or indirectly) associated with such a section.
If there is a compiler option that inhibits copying of arguments, in either the calling or called procedure, this should be employed.
If a compiler makes copies in the calling procedure of arguments that are explicit-shape or assumed-size arrays, simple array sections of such arrays, or scalars, and if there is no compiler option to inhibit this, then the compiler cannot be used for applications that use MPI_GET_ADDRESS, or any non-blocking MPI routine. If a compiler copies scalar arguments in the called procedure and there is no compiler option to inhibit this, then this compiler cannot be used for applications that use memory references across subroutine calls as in the example above.
MPI requires a number of special ``constants'' that cannot be implemented as normal Fortran constants, including MPI_BOTTOM, MPI_STATUS_IGNORE, MPI_IN_PLACE, MPI_STATUSES_IGNORE and MPI_ERRCODES_IGNORE. In C, these are implemented as constant pointers, usually as NULL and are used where the function prototype calls for a pointer to a variable, not the variable itself.
In Fortran the implementation of these special constants may require the use of language constructs that are outside the Fortran standard. Using special values for the constants (e.g., by defining them through parameter statements) is not possible because an implementation cannot distinguish these values from legal data. Typically these constants are implemented as predefined static variables (e.g., a variable in an MPI-declared COMMON block), relying on the fact that the target compiler passes data by address. Inside the subroutine, this address can be extracted by some mechanism outside the Fortran standard (e.g., by Fortran extensions or by implementing the function in C).
MPI does not explicitly support passing Fortran 90 derived types to choice dummy arguments. Indeed, for MPI implementations that provide explicit interfaces through the mpi module a compiler will reject derived type actual arguments at compile time. Even when no explicit interfaces are given, users should be aware that Fortran 90 provides no guarantee of sequence association for derived types or arrays of derived types. For instance, an array of a derived type consisting of two elements may be implemented as an array of the first elements followed by an array of the second. Use of the SEQUENCE attribute may help here, somewhat.
The following code fragment shows one possible way to send a derived
type in Fortran. The example assumes that all data is passed by address.
type mytype integer i real x double precision d end type mytype type(mytype) foo integer blocklen(3), type(3) integer(MPI_ADDRESS_KIND) disp(3), base call MPI_GET_ADDRESS(foo%i, disp(1), ierr) call MPI_GET_ADDRESS(foo%x, disp(2), ierr) call MPI_GET_ADDRESS(foo%d, disp(3), ierr) base = disp(1) disp(1) = disp(1) - base disp(2) = disp(2) - base disp(3) = disp(3) - base blocklen(1) = 1 blocklen(2) = 1 blocklen(3) = 1 type(1) = MPI_INTEGER type(2) = MPI_REAL type(3) = MPI_DOUBLE_PRECISION call MPI_TYPE_CREATE_STRUCT(3, blocklen, disp, type, newtype, ierr) call MPI_TYPE_COMMIT(newtype, ierr) ! unpleasant to send foo%i instead of foo, but it works for scalar ! entities of type mytype call MPI_SEND(foo%i, 1, newtype, ...)
MPI provides operations that may be hidden from the user code and run concurrently with it, accessing the same memory as user code. Examples include the data transfer for an MPI_IRECV. The optimizer of a compiler will assume that it can recognize periods when a copy of a variable can be kept in a register without reloading from or storing to memory. When the user code is working with a register copy of some variable while the hidden operation reads or writes the memory copy, problems occur. This section discusses register optimization pitfalls.
When a variable is local to a Fortran subroutine (i.e., not in a module or COMMON block), the compiler will assume that it cannot be modified by a called subroutine unless it is an actual argument of the call. In the most common linkage convention, the subroutine is expected to save and restore certain registers. Thus, the optimizer will assume that a register which held a valid copy of such a variable before the call will still hold a valid copy on return.
Normally users are not afflicted with this. But the user should pay attention to this section if in his/her program a buffer argument to an MPI_SEND, MPI_RECV etc., uses a name which hides the actual variables involved. MPI_BOTTOM with an MPI_Datatype containing absolute addresses is one example. Creating a datatype which uses one variable as an anchor and brings along others by using MPI_GET_ADDRESS to determine their offsets from the anchor is another. The anchor variable would be the only one mentioned in the call. Also attention must be paid if MPI operations are used that run in parallel with the user's application.
Example 14 shows what Fortran compilers are allowed to do.
Example
Fortran 90 register optimization.
This source ... can be compiled as:
call MPI_GET_ADDRESS(buf,bufaddr, call MPI_GET_ADDRESS(buf,...) ierror) call MPI_TYPE_CREATE_STRUCT(1,1, call MPI_TYPE_CREATE_STRUCT(...) bufaddr, MPI_REAL,type,ierror) call MPI_TYPE_COMMIT(type,ierror) call MPI_TYPE_COMMIT(...) val_old = buf register = buf val_old = register call MPI_RECV(MPI_BOTTOM,1,type,...) call MPI_RECV(MPI_BOTTOM,...) val_new = buf val_new = registerThe compiler does not invalidate the register because it cannot see that MPI_RECV changes the value of buf. The access of buf is hidden by the use of MPI_GET_ADDRESS and MPI_BOTTOM.
Example 14 shows extreme, but allowed, possibilities.
Example
Fortran 90 register optimization -- extreme.
Source compiled as or compiled as
call MPI_IRECV(buf,..req) call MPI_IRECV(buf,..req) call MPI_IRECV(buf,..req) register = buf b1 = buf call MPI_WAIT(req,..) call MPI_WAIT(req,..) call MPI_WAIT(req,..) b1 = buf b1 := registerMPI_WAIT on a concurrent thread modifies buf between the invocation of MPI_IRECV and the finish of MPI_WAIT. But the compiler cannot see any possibility that buf can be changed after MPI_IRECV has returned, and may schedule the load of buf earlier than typed in the source. It has no reason to avoid using a register to hold buf across the call to MPI_WAIT. It also may reorder the instructions as in the case on the right.
To prevent instruction reordering or the allocation of a buffer in a register there are two possibilities in portable Fortran code:
call DD(buf) call MPI_RECV(MPI_BOTTOM,...) call DD(buf)with the separately compiled
subroutine DD(buf) integer buf end
(assuming that buf has type INTEGER). The compiler may be similarly prevented from moving a reference to a variable across a call to an MPI subroutine.
In the case of a non-blocking call, as in the above call of MPI_WAIT, no reference to the buffer is permitted until it has been verified that the transfer has been completed. Therefore, in this case, the extra call ahead of the MPI call is not necessary, i.e., the call of MPI_WAIT in the example might be replaced by
call MPI_WAIT(req,..) call DD(buf)
In C, subroutines which modify variables that are not in the argument list will not cause register optimization problems. This is because taking pointers to storage objects by using the & operator and later referencing the objects by way of the pointer is an integral part of the language. A C compiler understands the implications, so that the problem should not occur, in general. However, some compilers do offer optional aggressive optimization levels which may not be safe.