MPI supports the following RMA communication calls: MPI_PUT and MPI_RPUT transfer data from the caller memory (origin) to the target memory; MPI_GET and MPI_RGET transfer data from the target memory to the caller memory; MPI_ACCUMULATE and MPI_RACCUMULATE perform element-wise atomic updates of locations in the target memory, e.g., by adding to these locations values sent from the caller memory; MPI_GET_ACCUMULATE, MPI_RGET_ACCUMULATE, and MPI_FETCH_AND_OP perform element-wise atomic read-modify-write updates and return each value before the update; and MPI_COMPARE_AND_SWAP performs a remote atomic compare and swap operation. These procedures are nonblocking. The operation is completed, at the origin or both the origin and the target, when a subsequent synchronization procedure is called by the origin on the involved window object. These synchronization procedures are described in Section Synchronization Calls. RMA communication operations can also be completed with calls to flush procedures; see Section Flush and Sync for details. Request-based operations MPI_RPUT, MPI_RGET, MPI_RACCUMULATE, and MPI_RGET_ACCUMULATE can be completed at the origin by using the MPI test or wait procedures described in Section Communication Completion.
The local communication buffer of an RMA operation should not be updated once the operation has started and until the operation completes at the origin. The local communication buffer of a get operation should not be accessed once the operation has started and until the operation completes at the origin.
Two concurrent accesses to the same memory location are called conflicting if one of them is a put operation, if exactly one of them is an accumulate operation, or if one of them is a get operation and the other a local store access. The outcome of conflicting accesses to the same memory location is undefined; if a location is updated by a put or accumulate operation, then the outcome of loads or other RMA operations is undefined until the updating operation has completed at the target. There is one exception to this rule; namely, the same location can be updated by several concurrent accumulate operations, the outcome being as if these updates occurred in some order. In addition, the outcome of concurrent load/store accesses and RMA updates to the same memory location is undefined. These restrictions are described in more detail in Section Semantics and Correctness.
The calls use general datatype arguments to specify communication buffers at the origin and at the target. Thus, a transfer operation may also gather data at the source and scatter it at the destination. However, all arguments specifying both communication buffers are provided by the caller.
For all RMA communication operations, the target process may be identical to the origin process; i.e., an MPI process may use an RMA operation to move data within its own memory.
Rationale.
The choice of supporting "self-communication" is the same as for message-passing. It simplifies some coding, and is very useful with accumulate operations, to allow atomic updates of local variables.
(End of rationale.)
MPI_PROC_NULL is a valid target rank in all MPI RMA communication calls.
The effect is the same as for MPI_PROC_NULL in MPI point-to-point
communication.
After any RMA operation with rank MPI_PROC_NULL, it is still necessary to
close the RMA epoch with the synchronization method that opened the epoch.