MPI_REDUCE(sendbuf, recvbuf, count, datatype, op, root, comm)
IN sendbuf | address of send buffer (choice) |
OUT recvbuf | address of receive buffer (choice, significant only at root) |
IN count | number of elements in send buffer (non-negative integer) |
IN datatype | data type of elements of send buffer (handle) |
IN op | reduce operation (handle) |
IN root | rank of root process (integer) |
IN comm | communicator (handle) |
int MPI_Reduce(const void* sendbuf, void* recvbuf, int count, MPI_Datatype datatype, MPI_Op op, int root, MPI_Comm comm)
MPI_Reduce(sendbuf, recvbuf, count, datatype, op, root, comm, ierror)
TYPE(*), DIMENSION(..), INTENT(IN) :: sendbuf
TYPE(*), DIMENSION(..) :: recvbuf
INTEGER, INTENT(IN) :: count, root
TYPE(MPI_Datatype), INTENT(IN) :: datatype
TYPE(MPI_Op), INTENT(IN) :: op
TYPE(MPI_Comm), INTENT(IN) :: comm
INTEGER, OPTIONAL, INTENT(OUT) :: ierror
MPI_REDUCE(SENDBUF, RECVBUF, COUNT, DATATYPE, OP, ROOT, COMM, IERROR)
<type> SENDBUF(*), RECVBUF(*)
INTEGER COUNT, DATATYPE, OP, ROOT, COMM, IERROR
If comm is an intracommunicator, MPI_REDUCE combines the elements provided in the input buffer of each process in the group, using the operation op, and returns the combined value in the output buffer of the process with rank root. The input buffer is defined by the arguments sendbuf, count and datatype; the output buffer is defined by the arguments recvbuf, count and datatype; both have the same number of elements, with the same type. The routine is called by all group members using the same arguments for count, datatype, op, root and comm. Thus, all processes provide input buffers of the same length, with elements of the same type as the output buffer at the root. Each process can provide one element, or a sequence of elements, in which case the combine operation is executed element-wise on each entry of the sequence. For example, if the operation is MPI_MAX and the send buffer contains two elements that are floating point numbers (count = 2 and datatype = MPI_FLOAT), then recvbuf(1) = global max(sendbuf(1)) and recvbuf(2) = global max(sendbuf(2)).
Section Predefined Reduction Operations lists the set of predefined operations provided by MPI. That section also enumerates the datatypes to which each operation can be applied.
In addition, users may define their own operations that can be overloaded to operate on several datatypes, either basic or derived. This is further explained in Section User-Defined Reduction Operations.
The operation op is always assumed to be associative. All predefined operations are also assumed to be commutative. Users may define operations that are assumed to be associative, but not commutative. The "canonical" evaluation order of a reduction is determined by the ranks of the processes in the group. However, the implementation can take advantage of associativity, or associativity and commutativity, in order to change the order of evaluation. This may change the result of the reduction for operations that are not strictly associative and commutative, such as floating point addition.
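To see why a changed evaluation order can change a floating point sum, consider the following serial sketch. It does not call MPI; it only models two possible groupings of the same per-rank operands, and both function names are hypothetical. The balanced tree is just one grouping an implementation might choose, not a statement about any particular MPI library.

```c
#include <stddef.h>

/* Fold the operands strictly in rank order: ((x0 + x1) + x2) + ... */
double sum_in_rank_order(const double *x, size_t n)
{
    double acc = 0.0;
    for (size_t i = 0; i < n; ++i)
        acc += x[i];
    return acc;
}

/* Combine the same operands as a balanced binary tree:
 * (x0 + x1) + (x2 + x3) for n = 4. Operand order is preserved,
 * only the grouping differs. */
double sum_as_tree(const double *x, size_t n)
{
    if (n == 0)
        return 0.0;
    if (n == 1)
        return x[0];
    size_t half = n / 2;
    return sum_as_tree(x, half) + sum_as_tree(x + half, n - half);
}
```

With the operands {1e30, 1.0, -1e30, 1.0} in IEEE 754 double precision, the rank-order fold yields 1.0 while the tree grouping yields 0.0, because adding 1.0 to a value of magnitude 1e30 is lost to rounding.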
Advice to implementors.
It is strongly recommended that MPI_REDUCE be implemented so
that the same result be obtained
whenever the function is applied on the same arguments,
appearing in the same order. Note that this may
prevent optimizations that take
advantage of the physical location of ranks.
(End of advice to implementors.)
Advice to users.
Some applications may not be able to ignore the non-associative nature of
floating-point operations or may use user-defined operations
(see Section User-Defined Reduction Operations) that require a special
reduction order and cannot be treated as associative.
Such applications should enforce the order of evaluation explicitly.
For example, in the case of operations that require a strict left-to-right
(or right-to-left) evaluation order, this could be done by gathering all
operands at a single process (e.g., with MPI_GATHER), applying the
reduction operation in the desired order (e.g., with MPI_REDUCE_LOCAL),
and, if needed, broadcasting or scattering the result to the other processes
(e.g., with MPI_BCAST).
(End of advice to users.)
The datatype argument of MPI_REDUCE must be compatible with op.
Predefined operators work only with the MPI types listed in
Section Predefined Reduction Operations and Section MINLOC and MAXLOC.
Furthermore, the datatype and op given for predefined operators
must be the same on all processes.
Note that it is possible for users to supply different user-defined operations
to MPI_REDUCE in each process. MPI does not define which
operations are used on which operands in this case.
User-defined operators may operate on general, derived datatypes.
In this case, each argument that
the reduce operation is applied to is one element described by such a datatype,
which may contain several basic values.
This is further explained in Section User-Defined Reduction Operations.
Advice to users.
Users should make no assumptions about how MPI_REDUCE is
implemented. It is safest to ensure that the same function is passed to
MPI_REDUCE by each process.
(End of advice to users.)
Overlapping datatypes are permitted in "send" buffers.
Overlapping datatypes in "receive" buffers are erroneous
and may give unpredictable results.
The "in place" option for intracommunicators is specified by passing the value MPI_IN_PLACE to the argument sendbuf at the root. In such a case, the input data is taken at the root from the receive buffer, where it will be replaced by the output data.
If comm is an intercommunicator, then the call involves all processes in the intercommunicator, but with one group (group A) defining the root process. All processes in the other group (group B) pass the same value in argument root, which is the rank of the root in group A. The root passes the value MPI_ROOT in root. All other processes in group A pass the value MPI_PROC_NULL in root. Only send buffer arguments are significant in group B and only receive buffer arguments are significant at the root.