It is often useful in a put operation to combine the data moved to the target process with the data that resides at that MPI process, rather than replacing it. This will allow, for example, the accumulation of a sum by having all involved MPI processes add their contributions to the sum variable in the memory of one MPI process. The accumulate functions have slightly different semantics with respect to overlapping data accesses than the put and get functions; see Section Semantics and Correctness for details.
MPI_ACCUMULATE(origin_addr, origin_count, origin_datatype, target_rank, target_disp, target_count, target_datatype, op, win)
IN origin_addr | initial address of buffer (choice) |
IN origin_count | number of entries in buffer (non-negative integer) |
IN origin_datatype | datatype of each entry (handle) |
IN target_rank | rank of target (non-negative integer) |
IN target_disp | displacement from start of window to beginning of target buffer (non-negative integer) |
IN target_count | number of entries in target buffer (non-negative integer) |
IN target_datatype | datatype of each entry in target buffer (handle) |
IN op | accumulate operator (handle) |
IN win | window object (handle) |
Accumulate the contents of the origin buffer (as defined by origin_addr, origin_count, and origin_datatype) to the buffer specified by arguments target_count and target_datatype, at offset target_disp, in the target window specified by target_rank and win, using the operator op. This is like MPI_PUT except that data is combined into the target area instead of overwriting it.
Any of the predefined operators for MPI_REDUCE can be used. User-defined operators cannot be used. For example, if op is MPI_SUM, each element of the origin buffer is added to the corresponding element in the target, replacing the former value in the target.
Each datatype argument must be a predefined datatype or a derived datatype, where all basic components are of the same predefined datatype. Both datatype arguments must be constructed from the same predefined datatype. The operator op applies to elements of that predefined type. The parameter target_datatype must not specify overlapping entries, and the target buffer must fit in the target window.
An additional predefined operator, MPI_REPLACE, is defined. It corresponds to the associative function f(a,b) = b; i.e., the current value in the target memory is replaced by the value supplied by the origin.
MPI_REPLACE can be used only in MPI_ACCUMULATE, MPI_RACCUMULATE, MPI_GET_ACCUMULATE, MPI_FETCH_AND_OP, and MPI_RGET_ACCUMULATE, but not in collective reduction operations such as MPI_REDUCE.
Advice to users.
MPI_PUT can be considered a special case of MPI_ACCUMULATE with the operator MPI_REPLACE. Note, however, that MPI_PUT and MPI_ACCUMULATE have different constraints on concurrent updates.
(End of advice to users.)
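The element-wise combination described above can be illustrated with a small serial model. This is only a sketch of the semantics, not an MPI call: the names `acc_op`, `ACC_SUM`, `ACC_REPLACE`, `acc_combine`, and `model_accumulate` are hypothetical, the target window is modeled as a plain local array, and displacements are counted in elements.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for MPI_SUM and MPI_REPLACE; these are not MPI handles. */
typedef enum { ACC_SUM, ACC_REPLACE } acc_op;

/* Combine one target element a with one origin element b. */
double acc_combine(acc_op op, double a, double b)
{
    switch (op) {
    case ACC_SUM:     return a + b;  /* behaves like MPI_SUM */
    case ACC_REPLACE: return b;      /* f(a,b) = b, behaves like MPI_REPLACE */
    }
    return a;  /* not reached for valid operators */
}

/* Serial model of an accumulate: origin[i] is combined into
 * target[disp + i], and the result replaces the former target value. */
void model_accumulate(const double *origin, size_t count,
                      double *target, size_t target_len,
                      size_t disp, acc_op op)
{
    assert(disp + count <= target_len);  /* target buffer must fit in the window */
    for (size_t i = 0; i < count; i++)
        target[disp + i] = acc_combine(op, target[disp + i], origin[i]);
}
```

With `ACC_REPLACE` the model degenerates to a plain copy, which mirrors the remark that a put behaves like an accumulate with MPI_REPLACE. The model says nothing about the differing constraints on concurrent updates.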
Example
We want to compute $B(j) = \sum_{map(i)=j} A(i)$. The arrays A, B, and map are distributed in the same manner. We write the simple version.
This code is identical to the code in Example Examples for Communication Calls, except that a call to MPI_GET has been replaced by a call to MPI_ACCUMULATE. (Note that, if map is one-to-one, the code computes $B = A(map^{-1})$, which is the reverse assignment to the one computed in that previous example.) In a similar manner, in Example Examples for Communication Calls, we can replace the call to get by a call to accumulate, thus performing the computation with only one communication between any two MPI processes.
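As a serial reference for the result of this example, the following sketch assumes that B(j) is the sum of all A(i) with map(i) = j, that indices are zero-based, and that each of p processes owns m consecutive entries of A, B, and map. The function name `model_sum_by_map` and the block distribution are assumptions of this sketch. It is not the listing of the example, and it contains no MPI calls.

```c
#include <assert.h>
#include <stddef.h>

/* Serial model: p hypothetical processes each own m consecutive entries of
 * A, B, and map. B must be zero-initialized by the caller. */
void model_sum_by_map(const double *A, double *B, const size_t *map,
                      size_t m, size_t p)
{
    for (size_t rank = 0; rank < p; rank++) {      /* each origin process */
        for (size_t i = 0; i < m; i++) {           /* each local entry of A */
            size_t g = map[rank * m + i];          /* global index into B */
            assert(g < m * p);
            size_t target_rank = g / m;            /* process that owns B(g) */
            size_t target_disp = g % m;            /* offset within its part of B */
            /* An MPI version would issue one accumulate with a sum operator
             * here; the model adds into the owner's entry directly. */
            B[target_rank * m + target_disp] += A[rank * m + i];
        }
    }
}
```

When map is one-to-one, every entry of B receives exactly one contribution, so B(map(i)) = A(i) for all i.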
It is often useful to have fetch-and-accumulate semantics such that the remote data is returned to the caller before the sent data is accumulated into the remote data. The get and accumulate steps are executed atomically for each basic element in the datatype (see Section Semantics and Correctness for details). The predefined operator MPI_REPLACE provides fetch-and-set behavior.
MPI_GET_ACCUMULATE(origin_addr, origin_count, origin_datatype, result_addr, result_count, result_datatype, target_rank, target_disp, target_count, target_datatype, op, win)
IN origin_addr | initial address of buffer (choice) |
IN origin_count | number of entries in origin buffer (non-negative integer) |
IN origin_datatype | datatype of each entry in origin buffer (handle) |
OUT result_addr | initial address of result buffer (choice) |
IN result_count | number of entries in result buffer (non-negative integer) |
IN result_datatype | datatype of each entry in result buffer (handle) |
IN target_rank | rank of target (non-negative integer) |
IN target_disp | displacement from start of window to beginning of target buffer (non-negative integer) |
IN target_count | number of entries in target buffer (non-negative integer) |
IN target_datatype | datatype of each entry in target buffer (handle) |
IN op | accumulate operator (handle) |
IN win | window object (handle) |
Accumulate origin_count elements of type origin_datatype from the origin buffer (origin_addr) to the buffer at offset target_disp, in the target window specified by target_rank and win, using the operator op, and return in the result buffer result_addr the content of the target buffer before the accumulation, specified by target_disp, target_count, and target_datatype. The data transferred from origin to target must fit, without truncation, in the target buffer. Likewise, the data copied from target to origin must fit, without truncation, in the result buffer.
The origin and result buffers (origin_addr and result_addr) must be disjoint. Each datatype argument must be a predefined datatype or a derived datatype where all basic components are of the same predefined datatype. All datatype arguments must be constructed from the same predefined datatype. The operator op applies to elements of that predefined type. The parameter target_datatype must not specify overlapping entries, and the target buffer must fit in the target window or in attached memory in a dynamic window. The operation is executed atomically for each basic datatype; see Section Semantics and Correctness for details.
Any of the predefined operators for MPI_REDUCE, as well as MPI_NO_OP or MPI_REPLACE, can be specified as op. User-defined functions cannot be used. An additional predefined operator, MPI_NO_OP, is defined. It corresponds to the associative function f(a,b) = a; i.e., the current value in the target memory is returned in the result buffer at the origin and the target buffer is not updated. If MPI_NO_OP is specified as the operator, the origin_addr, origin_count, and origin_datatype arguments are ignored. MPI_NO_OP can be used only in MPI_GET_ACCUMULATE, MPI_RGET_ACCUMULATE, and MPI_FETCH_AND_OP. MPI_NO_OP cannot be used in MPI_ACCUMULATE, MPI_RACCUMULATE, or collective reduction operations such as MPI_REDUCE.
Advice to users.
MPI_GET is similar to MPI_GET_ACCUMULATE, with the operator MPI_NO_OP. Note, however, that MPI_GET and MPI_GET_ACCUMULATE have different constraints on concurrent updates.
(End of advice to users.)
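The fetch-then-combine behavior, including the special roles of MPI_REPLACE and MPI_NO_OP, can be sketched with a serial model. The names `fetch_op`, `FETCH_SUM`, `FETCH_REPLACE`, `FETCH_NO_OP`, and `model_get_accumulate` are hypothetical; the target window is modeled as a local array with element-based displacements, and the model does not capture atomicity or concurrency.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for MPI_SUM, MPI_REPLACE, and MPI_NO_OP. */
typedef enum { FETCH_SUM, FETCH_REPLACE, FETCH_NO_OP } fetch_op;

/* Serial model of a get-accumulate on count elements: result[i] receives the
 * old target value, then the target is updated according to op. */
void model_get_accumulate(const long *origin, long *result, size_t count,
                          long *target, size_t target_len, size_t disp,
                          fetch_op op)
{
    assert(disp + count <= target_len);   /* target buffer must fit in the window */
    for (size_t i = 0; i < count; i++) {
        long old = target[disp + i];
        result[i] = old;                  /* content before the accumulation */
        switch (op) {
        case FETCH_SUM:     target[disp + i] = old + origin[i]; break;
        case FETCH_REPLACE: target[disp + i] = origin[i];       break; /* fetch-and-set */
        case FETCH_NO_OP:   break;        /* origin is ignored; target is unchanged */
        }
    }
}
```

In the `FETCH_NO_OP` case the origin pointer is never dereferenced, which corresponds to the rule that the origin arguments are ignored for MPI_NO_OP.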
The generic functionality of MPI_GET_ACCUMULATE might limit the performance of fetch-and-increment or fetch-and-add calls that might be supported by special hardware operations. MPI_FETCH_AND_OP thus allows for a fast implementation of a commonly used subset of the functionality of MPI_GET_ACCUMULATE.
MPI_FETCH_AND_OP(origin_addr, result_addr, datatype, target_rank, target_disp, op, win)
IN origin_addr | initial address of buffer (choice) |
OUT result_addr | initial address of result buffer (choice) |
IN datatype | datatype of the entry in origin, result, and target buffers (handle) |
IN target_rank | rank of target (non-negative integer) |
IN target_disp | displacement from start of window to beginning of target buffer (non-negative integer) |
IN op | accumulate operator (handle) |
IN win | window object (handle) |
Accumulate one element of type datatype from the origin buffer origin_addr to the buffer at offset target_disp, in the target window specified by target_rank and win, using the operator op, and return in the result buffer result_addr the content of the target buffer before the accumulation.
The origin and result buffers (origin_addr and result_addr) must be disjoint. Any of the predefined operators for MPI_REDUCE, as well as MPI_NO_OP or MPI_REPLACE, can be specified as op; user-defined functions cannot be used. The datatype argument must be a predefined datatype. The operation is executed atomically.
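A shared-memory analogue may help to picture the single-element operation. The sketch below uses C11 atomics on a local variable in place of a remote window location; it is an analogy only and makes no MPI calls. The function names `local_fetch_and_add`, `local_fetch_and_set`, and `local_fetch_no_op` are hypothetical and loosely correspond to the operators MPI_SUM, MPI_REPLACE, and MPI_NO_OP.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Analogue of a fetch-and-op with a sum operator: atomically add operand
 * and return the value held before the addition. */
long local_fetch_and_add(atomic_long *target, long operand)
{
    assert(target != NULL);
    return atomic_fetch_add(target, operand);
}

/* Analogue of a fetch-and-op with a replace operator (fetch-and-set). */
long local_fetch_and_set(atomic_long *target, long value)
{
    assert(target != NULL);
    return atomic_exchange(target, value);
}

/* Analogue of a fetch-and-op with a no-op operator: an atomic read that
 * leaves the target unchanged. */
long local_fetch_no_op(atomic_long *target)
{
    assert(target != NULL);
    return atomic_load(target);
}
```

A typical use of fetch-and-add is a shared counter that hands out unique, increasing tickets: each caller adds 1 and keeps the returned value.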
Another useful operation is an atomic compare and swap where the value at the origin is compared to the value at the target, which is atomically replaced by a third value only if the values at origin and target are equal.
MPI_COMPARE_AND_SWAP(origin_addr, compare_addr, result_addr, datatype, target_rank, target_disp, win)
IN origin_addr | initial address of buffer (choice) |
IN compare_addr | initial address of compare buffer (choice) |
OUT result_addr | initial address of result buffer (choice) |
IN datatype | datatype of the element in all buffers (handle) |
IN target_rank | rank of target (non-negative integer) |
IN target_disp | displacement from start of window to beginning of target buffer (non-negative integer) |
IN win | window object (handle) |
This function compares one element of type datatype in the compare buffer compare_addr with the buffer at offset target_disp in the target window specified by target_rank and win. If the compare buffer and the target buffer are identical, it replaces the value at the target with the value in the origin buffer origin_addr. The original value at the target is returned in the buffer result_addr. The parameter datatype must belong to one of the following categories of predefined datatypes: C integer, Fortran integer, Logical, Multi-language types, or Byte as specified in Section Predefined Reduction Operations. The origin and result buffers (origin_addr and result_addr) must be disjoint.
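The compare, conditional replace, and return of the original value can be pictured with a C11 atomics analogue on a local variable. This is a sketch by analogy, not an MPI call; the function name `local_compare_and_swap` is hypothetical, and its parameters `compare` and `origin` play the roles of the compare and origin buffers.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* If *target equals compare, atomically store origin into *target.
 * In either case, return the value *target held before the call. */
long local_compare_and_swap(atomic_long *target, long compare, long origin)
{
    assert(target != NULL);
    long expected = compare;
    /* On success the target becomes origin and expected keeps the old value
     * (equal to compare). On failure the target is unchanged and expected is
     * overwritten with the value actually found. */
    atomic_compare_exchange_strong(target, &expected, origin);
    return expected;
}
```

A caller can tell whether the swap took place by checking whether the returned value equals the compare value, for example when trying to acquire a simple lock by swapping 0 for 1.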