219. General Active Target Synchronization



MPI_WIN_START(group, assert, win)
IN   group    group of target processes (handle)
IN   assert   program assertion (integer)
IN   win      window object (handle)

int MPI_Win_start(MPI_Group group, int assert, MPI_Win win)

MPI_WIN_START(GROUP, ASSERT, WIN, IERROR)
INTEGER GROUP, ASSERT, WIN, IERROR

void MPI::Win::Start(const MPI::Group& group, int assert) const

Starts an RMA access epoch for win. RMA calls issued on win during this epoch must access only windows at processes in group. Each process in group must issue a matching call to MPI_WIN_POST. RMA accesses to each target window will be delayed, if necessary, until the target process has executed the matching call to MPI_WIN_POST. MPI_WIN_START is allowed to block until the corresponding MPI_WIN_POST calls have been executed, but is not required to.

The assert argument provides assertions on the context of the call that may be used for various optimizations. This is described in Section Assertions. A value of assert = 0 is always valid.

MPI_WIN_COMPLETE(win)
IN   win      window object (handle)

int MPI_Win_complete(MPI_Win win)

MPI_WIN_COMPLETE(WIN, IERROR)
INTEGER WIN, IERROR

void MPI::Win::Complete() const

Completes an RMA access epoch on win started by a call to MPI_WIN_START. All RMA communication calls issued on win during this epoch will have completed at the origin when the call returns.

MPI_WIN_COMPLETE enforces completion of preceding RMA calls at the origin, but not at the target. A put or accumulate call may not have completed at the target when it has completed at the origin.

Example. Consider the following sequence of calls.


MPI_Win_start(group, flag, win); 
MPI_Put(...,win); 
MPI_Win_complete(win); 

The call to MPI_WIN_COMPLETE does not return until the put call has completed at the origin, and the target window will be accessed by the put operation only after the call to MPI_WIN_START has matched a call to MPI_WIN_POST by the target process. This still leaves much choice to implementors. The call to MPI_WIN_START can block until the matching call to MPI_WIN_POST occurs at all target processes. One can also have implementations where the call to MPI_WIN_START is nonblocking, but the call to MPI_PUT blocks until the matching call to MPI_WIN_POST has occurred; or implementations where the first two calls are nonblocking, but the call to MPI_WIN_COMPLETE blocks until the call to MPI_WIN_POST has occurred; or even implementations where all three calls can complete before any target process has called MPI_WIN_POST. In this last case, the data put must be buffered so that the put can complete at the origin ahead of its completion at the target. However, once the call to MPI_WIN_POST is issued, the sequence above must complete, without further dependencies.

MPI_WIN_POST(group, assert, win)
IN   group    group of origin processes (handle)
IN   assert   program assertion (integer)
IN   win      window object (handle)

int MPI_Win_post(MPI_Group group, int assert, MPI_Win win)

MPI_WIN_POST(GROUP, ASSERT, WIN, IERROR)
INTEGER GROUP, ASSERT, WIN, IERROR

void MPI::Win::Post(const MPI::Group& group, int assert) const

Starts an RMA exposure epoch for the local window associated with win. Only processes in group should access the window with RMA calls on win during this epoch. Each process in group must issue a matching call to MPI_WIN_START. MPI_WIN_POST does not block.

MPI_WIN_WAIT(win)
IN   win      window object (handle)

int MPI_Win_wait(MPI_Win win)

MPI_WIN_WAIT(WIN, IERROR)
INTEGER WIN, IERROR

void MPI::Win::Wait() const

Completes an RMA exposure epoch started by a call to MPI_WIN_POST on win. This call matches calls to MPI_WIN_COMPLETE(win) issued by each of the origin processes that were granted access to the window during this epoch. The call to MPI_WIN_WAIT will block until all matching calls to MPI_WIN_COMPLETE have occurred. This guarantees that all these origin processes have completed their RMA accesses to the local window. When the call returns, all these RMA accesses will have completed at the target window.

Figure 20 illustrates the use of these four functions.


Figure 20: Active target communication. Dashed arrows represent synchronizations and solid arrows represent data transfer.

Process 0 puts data in the windows of processes 1 and 2, and process 3 puts data in the window of process 2. Each start call lists the ranks of the processes whose windows will be accessed; each post call lists the ranks of the processes that access the local window. The figure illustrates a possible timing for the events, assuming strong synchronization; with weak synchronization, the start, put or complete calls may occur ahead of the matching post calls.

MPI_WIN_TEST(win, flag)
IN   win      window object (handle)
OUT  flag     success flag (logical)

int MPI_Win_test(MPI_Win win, int *flag)

MPI_WIN_TEST(WIN, FLAG, IERROR)
INTEGER WIN, IERROR
LOGICAL FLAG

bool MPI::Win::Test() const

This is the nonblocking version of MPI_WIN_WAIT. It returns flag = true if MPI_WIN_WAIT would return, and flag = false otherwise. The effect of a return of MPI_WIN_TEST with flag = true is the same as the effect of a return of MPI_WIN_WAIT. If flag = false is returned, then the call has no visible effect.

MPI_WIN_TEST should be invoked only where MPI_WIN_WAIT can be invoked. Once the call has returned flag = true, it must not be invoked again until the window is posted again.
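
The calls described so far can be combined as in the following sketch, which reproduces the communication pattern of Figure 20 on exactly four processes: process 0 puts to processes 1 and 2, and process 3 puts to process 2. The buffer layout (one slot per origin), the values written, and the choice to let process 2 poll with MPI_Win_test while process 1 blocks in MPI_Win_wait are illustrative assumptions, not part of the standard. All assertions are passed as 0, which is always valid.

```c
#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv)
{
    int rank, size;
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);
    if (size != 4) {
        if (rank == 0)
            fprintf(stderr, "this sketch expects exactly 4 processes\n");
        MPI_Abort(MPI_COMM_WORLD, 1);
    }

    /* Every process exposes two ints. Assumed layout for this sketch:
       slot 0 is written by rank 0, slot 1 is written by rank 3. */
    int buf[2] = { -1, -1 };
    MPI_Win win;
    MPI_Win_create(buf, (MPI_Aint)sizeof buf, (int)sizeof(int),
                   MPI_INFO_NULL, MPI_COMM_WORLD, &win);

    MPI_Group world_group, peers;
    MPI_Comm_group(MPI_COMM_WORLD, &world_group);

    if (rank == 0 || rank == 3) {
        /* Origin side: the start group lists the targets to be accessed. */
        int targets_of_0[] = { 1, 2 };
        int targets_of_3[] = { 2 };
        int *targets = (rank == 0) ? targets_of_0 : targets_of_3;
        int ntargets = (rank == 0) ? 2 : 1;
        MPI_Aint slot = (rank == 0) ? 0 : 1;
        int value = 100 + rank;

        MPI_Group_incl(world_group, ntargets, targets, &peers);
        MPI_Win_start(peers, 0, win);          /* begin access epoch */
        for (int k = 0; k < ntargets; k++)
            MPI_Put(&value, 1, MPI_INT, targets[k], slot, 1, MPI_INT, win);
        MPI_Win_complete(win);                 /* puts complete at origin */
    } else {
        /* Target side: the post group lists the origins allowed to access. */
        int origins_of_1[] = { 0 };
        int origins_of_2[] = { 0, 3 };
        int *origins = (rank == 1) ? origins_of_1 : origins_of_2;
        int norigins = (rank == 1) ? 1 : 2;

        MPI_Group_incl(world_group, norigins, origins, &peers);
        MPI_Win_post(peers, 0, win);           /* begin exposure epoch */
        if (rank == 1) {
            MPI_Win_wait(win);                 /* block until origins complete */
        } else {
            int done = 0;
            while (!done) {
                MPI_Win_test(win, &done);      /* poll; not called again once true */
                /* unrelated local work could be done here */
            }
        }
        /* The exposure epoch has ended, so the local window can be read. */
        printf("rank %d: buf = {%d, %d}\n", rank, buf[0], buf[1]);
    }

    MPI_Group_free(&peers);
    MPI_Group_free(&world_group);
    MPI_Win_free(&win);
    MPI_Finalize();
    return 0;
}
```

With a typical MPI installation, this would be compiled with the MPI compiler wrapper and launched on four processes. Note that the origins never name themselves in a post call and the targets never call start, because in this pattern their corresponding groups are empty.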

Assume that window win is associated with a "hidden" communicator wincomm, used for communication by the processes of win. The rules for matching post and start calls, and for matching complete and wait calls, can be derived from the rules for matching sends and receives by considering the following (partial) model implementation.

MPI_WIN_POST(group,0,win)
    Initiate a nonblocking send with tag tag0 to each process in group, using wincomm. No need to wait for the completion of these sends.
MPI_WIN_START(group,0,win)
    Initiate a nonblocking receive with tag tag0 from each process in group, using wincomm. An RMA access to a window in target process i is delayed until the receive from i is completed.
MPI_WIN_COMPLETE(win)
    Initiate a nonblocking send with tag tag1 to each process in the group of the preceding start call. No need to wait for the completion of these sends.
MPI_WIN_WAIT(win)
    Initiate a nonblocking receive with tag tag1 from each process in the group of the preceding post call. Wait for the completion of all receives.

No races can occur in a correct program: each of the sends matches a unique receive, and vice-versa.
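
The matching rule implied by the model implementation can be checked with a small stand-alone sketch that does not call MPI at all. It counts, for every pair of processes, the tag0 message sent by a post and the tag0 message received by a start, and reports whether every send has exactly one matching receive. The function name epochs_match, the boolean-matrix representation of the groups, and the MAX_PROCS bound are assumptions made for illustration only.

```c
#include <assert.h>
#include <stdbool.h>

#define MAX_PROCS 8   /* illustrative upper bound, not an MPI limit */

/* post[t][o]  is true if target t lists origin o in its MPI_WIN_POST group.
   start[o][t] is true if origin o lists target t in its MPI_WIN_START group.
   Returns true if, in the model implementation, every tag0 send issued by a
   post is matched by exactly one tag0 receive issued by a start, and vice
   versa. The tag1 messages of complete and wait are sent over the same groups
   in the opposite direction, so the same condition covers them as well. */
bool epochs_match(int n, bool post[MAX_PROCS][MAX_PROCS],
                  bool start[MAX_PROCS][MAX_PROCS])
{
    assert(n >= 0 && n <= MAX_PROCS);
    for (int t = 0; t < n; t++) {
        for (int o = 0; o < n; o++) {
            int tag0_sends = post[t][o] ? 1 : 0;   /* t sends to o      */
            int tag0_recvs = start[o][t] ? 1 : 0;  /* o receives from t */
            if (tag0_sends != tag0_recvs)
                return false;  /* unmatched send or unmatched receive */
        }
    }
    return true;
}
```

An unmatched receive in this model corresponds to an origin whose RMA accesses are delayed indefinitely, and an unmatched tag1 receive corresponds to a target that never returns from MPI_WIN_WAIT, which is why a correct program must supply consistent groups at both ends.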


Rationale.

The design for general active target synchronization requires the user to provide complete information on the communication pattern, at each end of a communication link: each origin specifies a list of targets, and each target specifies a list of origins. This provides maximum flexibility (hence, efficiency) for the implementor: each synchronization can be initiated by either side, since each "knows" the identity of the other. This also provides maximum protection from possible races. On the other hand, the design requires more information than RMA generally needs: in general, it is sufficient for the origin to know the rank of the target, but not vice versa. Users who want more "anonymous" communication will be required to use the fence or lock mechanisms. (End of rationale.)

Advice to users.

Assume a communication pattern that is represented by a directed graph $G = (V, E)$, where $V = \{0, \ldots, n-1\}$ and $(i, j) \in E$ if origin process i accesses the window at target process j. Then each process i issues a call to MPI_WIN_POST(ingroup_i, ...), followed by a call to MPI_WIN_START(outgroup_i, ...), where $\mathrm{outgroup}_i = \{ j : (i, j) \in E \}$ and $\mathrm{ingroup}_i = \{ j : (j, i) \in E \}$. A call is a no-op, and can be skipped, if the group argument is empty. After the communication calls, each process that issued a start will issue a complete. Finally, each process that issued a post will issue a wait.

Note that each process may call MPI_WIN_POST or MPI_WIN_START with a group argument that has different members. (End of advice to users.)
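
The derivation of the two groups from the communication graph, as described in the advice above, can be sketched in plain C. The edge-list representation, the type name rma_edge, the helper names build_ingroup and build_outgroup, and the MAX_PROCS bound are hypothetical choices for illustration; the sketch also assumes the edge list contains no duplicate edges.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_PROCS 64   /* illustrative bound on the group size */

/* A directed edge: process `origin` accesses the window at process `target`. */
typedef struct {
    int origin;
    int target;
} rma_edge;

/* Collect the origins that access the window of process i. These ranks form
   the group argument of MPI_WIN_POST at process i. Returns the group size. */
int build_ingroup(const rma_edge *edges, size_t nedges, int i,
                  int ranks[MAX_PROCS])
{
    int count = 0;
    for (size_t k = 0; k < nedges; k++) {
        if (edges[k].target == i) {
            assert(count < MAX_PROCS);
            ranks[count++] = edges[k].origin;
        }
    }
    return count;
}

/* Collect the targets whose windows process i accesses. These ranks form
   the group argument of MPI_WIN_START at process i. Returns the group size. */
int build_outgroup(const rma_edge *edges, size_t nedges, int i,
                   int ranks[MAX_PROCS])
{
    int count = 0;
    for (size_t k = 0; k < nedges; k++) {
        if (edges[k].origin == i) {
            assert(count < MAX_PROCS);
            ranks[count++] = edges[k].target;
        }
    }
    return count;
}
```

In an MPI program, each resulting rank array could be turned into a group handle with MPI_Group_incl applied to the group of the window's communicator. A returned size of zero indicates that the corresponding post or start call, and its matching wait or complete, can be skipped.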




MPI-2.0 of July 1, 2008
HTML Generated on July 6, 2008