233. Semantics and Correctness


Up: Contents Next: Atomicity Previous: Error Classes

The semantics of RMA operations is best understood by assuming that the system maintains a separate public copy of each window, in addition to the original location in process memory (the private window copy). There is only one instance of each variable in process memory, but a distinct public copy of the variable for each window that contains it. A load accesses the instance in process memory (this includes MPI sends). A store accesses and updates the instance in process memory (this includes MPI receives), but the update may affect other public copies of the same locations. A get on a window accesses the public copy of that window. A put or accumulate on a window accesses and updates the public copy of that window, but the update may affect the private copy of the same locations in process memory, and public copies of other overlapping windows. This is illustrated in Figure 21 .


Figure 21: Schematic description of window

The following rules specify the latest time at which an operation must complete at the origin or the target. The update performed by a get call in the origin process memory is visible when the get operation is complete at the origin (or earlier); the update performed by a put or accumulate call in the public copy of the target window is visible when the put or accumulate has completed at the target (or earlier). The rules also specify the latest time at which an update of one window copy becomes visible in another overlapping copy.

    1. An RMA operation is completed at the origin by the ensuing call to MPI_WIN_COMPLETE, MPI_WIN_FENCE or MPI_WIN_UNLOCK that synchronizes this access at the origin.
    2. If an RMA operation is completed at the origin by a call to MPI_WIN_FENCE then the operation is completed at the target by the matching call to MPI_WIN_FENCE by the target process.
    3. If an RMA operation is completed at the origin by a call to MPI_WIN_COMPLETE then the operation is completed at the target by the matching call to MPI_WIN_WAIT by the target process.
    4. If an RMA operation is completed at the origin by a call to MPI_WIN_UNLOCK then the operation is completed at the target by that same call to MPI_WIN_UNLOCK.
    5. An update of a location in a private window copy in process memory becomes visible in the public window copy at latest when an ensuing call to MPI_WIN_POST, MPI_WIN_FENCE, or MPI_WIN_UNLOCK is executed on that window by the window owner.
    6. An update by a put or accumulate call to a public window copy becomes visible in the private copy in process memory at latest when an ensuing call to MPI_WIN_WAIT, MPI_WIN_FENCE, or MPI_WIN_LOCK is executed on that window by the window owner.
The MPI_WIN_FENCE or MPI_WIN_WAIT call that completes the transfer from public copy to private copy (6) is the same call that completes the put or accumulate operation in the window copy (2, 3). If a put or accumulate access was synchronized with a lock, then the update of the public window copy is complete as soon as the updating process executed MPI_WIN_UNLOCK. On the other hand, the update of private copy in the process memory may be delayed until the target process executes a synchronization call on that window (6). Thus, updates to process memory can always be delayed until the process executes a suitable synchronization call. Updates to a public window copy can also be delayed until the window owner executes a synchronization call, if fences or post-start-complete-wait synchronization is used. Only when lock synchronization is used does it becomes necessary to update the public window copy, even if the window owner does not execute any related synchronization call.

The rules above also define, by implication, when an update to a public window copy becomes visible in another overlapping public window copy. Consider, for example, two overlapping windows, win1 and win2. A call to MPI_WIN_FENCE(0, win1) by the window owner makes visible in the process memory previous updates to window win1 by remote processes. A subsequent call to MPI_WIN_FENCE(0, win2) makes these updates visible in the public copy of win2. A correct program must obey the following rules.

    1. A location in a window must not be accessed locally once an update to that location has started, until the update becomes visible in the private window copy in process memory.
    2. A location in a window must not be accessed as a target of an RMA operation once an update to that location has started, until the update becomes visible in the public window copy. There is one exception to this rule, in the case where the same variable is updated by two concurrent accumulates that use the same operation, with the same predefined datatype, on the same window.
    3. A put or accumulate must not access a target window once a local update or a put or accumulate update to another (overlapping) target window have started on a location in the target window, until the update becomes visible in the public copy of the window. Conversely, a local update in process memory to a location in a window must not start once a put or accumulate update to that target window has started, until the put or accumulate update becomes visible in process memory. In both cases, the restriction applies to operations even if they access disjoint locations in the window.
A program is erroneous if it violates these rules.


Rationale.

The last constraint on correct RMA accesses may seem unduly restrictive, as it forbids concurrent accesses to nonoverlapping locations in a window. The reason for this constraint is that, on some architectures, explicit coherence restoring operations may be needed at synchronization points. A different operation may be needed for locations that were locally updated by stores and for locations that were remotely updated by put or accumulate operations. Without this constraint, the MPI library will have to track precisely which locations in a window were updated by a put or accumulate call. The additional overhead of maintaining such information is considered prohibitive. ( End of rationale.)

Advice to users.

A user can write correct programs by following the following rules:

fence:
During each period between fence calls, each window is either updated by put or accumulate calls, or updated by local stores, but not both. Locations updated by put or accumulate calls should not be accessed during the same period (with the exception of concurrent updates to the same location by accumulate calls). Locations accessed by get calls should not be updated during the same period.
post-start-complete-wait:
A window should not be updated locally while being posted, if it is being updated by put or accumulate calls. Locations updated by put or accumulate calls should not be accessed while the window is posted (with the exception of concurrent updates to the same location by accumulate calls). Locations accessed by get calls should not be updated while the window is posted.

With the post-start synchronization, the target process can tell the origin process that its window is now ready for RMA access; with the complete-wait synchronization, the origin process can tell the target process that it has finished its RMA accesses to the window.

lock:
Updates to the window are protected by exclusive locks if they may conflict. Nonconflicting accesses (such as read-only accesses or accumulate accesses) are protected by shared locks, both for local accesses and for RMA accesses.
changing window or synchronization mode:
One can change synchronization mode, or change the window used to access a location that belongs to two overlapping windows, when the process memory and the window copy are guaranteed to have the same values. This is true after a local call to MPI_WIN_FENCE, if RMA accesses to the window are synchronized with fences; after a local call to MPI_WIN_WAIT, if the accesses are synchronized with post-start-complete-wait; after the call at the origin (local or remote) to MPI_WIN_UNLOCK if the accesses are synchronized with locks.

In addition, a process should not access the local buffer of a get operation until the operation is complete, and should not update the local buffer of a put or accumulate operation until that operation is complete.

The RMA synchronization operations define when updates are guaranteed to become visible in public and private windows. Updates may become visible earlier, but such behavior is implementation dependent. ( End of advice to users.)
The semantics are illustrated by the following examples:


Example Rule 5:

Process A:                 Process B: 
                           window location X 
 
                           MPI_Win_lock(EXCLUSIVE,B) 
                           store X /* local update to private copy of B */ 
                           MPI_Win_unlock(B)  
                           /* now visible in public window copy */ 
 
MPI_Barrier                MPI_Barrier 
 
MPI_Win_lock(EXCLUSIVE,B) 
MPI_Get(X) /* ok, read from public window */ 
MPI_Win_unlock(B) 


Example Rule 6:

Process A:                 Process B: 
                           window location X 
 
MPI_Win_lock(EXCLUSIVE,B) 
MPI_Put(X) /* update to public window */ 
MPI_Win_unlock(B) 
 
MPI_Barrier                MPI_Barrier 
 
                           MPI_Win_lock(EXCLUSIVE,B)  
                           /* now visible in private copy of B */ 
                           load X 
                           MPI_Win_unlock(B) 
Note that the private copy of X has not necessarily been updated after the barrier, so omitting the lock-unlock at process B may lead to the load returning an obsolete value.


Example The rules do not guarantee that process A in the following sequence will see the value of X as updated by the local store by B before the lock.


Process A:                 Process B: 
                           window location X 
 
                           store X /* update to private copy of B */ 
                           MPI_Win_lock(SHARED,B) 
MPI_Barrier                MPI_Barrier 
 
MPI_Win_lock(SHARED,B) 
MPI_Get(X) /* X may not be in public window copy */ 
MPI_Win_unlock(B) 
                           MPI_Win_unlock(B)  
                           /* update on X now visible in public window */ 


Example In the following sequence

Process A:                 Process B: 
window location X 
window location Y 
 
store Y 
MPI_Win_post(A,B) /* Y visible in public window */ 
MPI_Win_start(A)           MPI_Win_start(A) 
 
store X /* update to private window */ 
 
MPI_Win_complete           MPI_Win_complete 
MPI_Win_wait 
/* update on X may not yet visible in public window */ 
 
MPI_Barrier                MPI_Barrier 
 
                           MPI_Win_lock(EXCLUSIVE,A) 
                           MPI_Get(X) /* may return an obsolete value */  
                           MPI_Get(Y) 
                           MPI_Win_unlock(A) 
it is not guaranteed that process B reads the value of X as per the local update by process A, because neither MPI_WIN_WAIT nor MPI_WIN_COMPLETE calls by process A ensure visibility in the public window copy. To allow B to read the value of X stored by A the local store must be replaced by a local MPI_PUT that updates the public window copy. Note that by this replacement X may become visible in the private copy in process memory of A only after the MPI_WIN_WAIT call in process A. The update on Y made before the MPI_WIN_POST call is visible in the public window after the MPI_WIN_POST call and therefore correctly gotten by process B. The MPI_GET(Y) call could be moved to the epoch started by the MPI_WIN_START operation, and process B would still get the value stored by A.


Example Finally, in the following sequence

Process A:                 Process B: 
                           window location X 
 
MPI_Win_lock(EXCLUSIVE,B) 
MPI_Put(X) /* update to public window */ 
MPI_Win_unlock(B) 
 
MPI_Barrier                MPI_Barrier 
 
                           MPI_Win_post(B) 
                           MPI_Win_start(B) 
 
                           load X /* access to private window */ 
                                  /* may return an obsolete value */ 
 
                           MPI_Win_complete 
                           MPI_Win_wait 
rules (5,6) do not guarantee that the private copy of X at B has been updated before the load takes place. To ensure that the value put by process A is read, the local load must be replaced with a local MPI_GET operation, or must be placed after the call to MPI_WIN_WAIT.



Up: Contents Next: Atomicity Previous: Error Classes


Return to MPI-2.2 Standard Index
Return to MPI Forum Home Page

(Unofficial) MPI-2.2 of September 4, 2009
HTML Generated on September 10, 2009