Virtual topologies specify a communication graph, but they implement no communication function themselves. Many applications require sparse nearest neighbor communications that can be expressed as graph topologies. We now describe several collective operations that perform communication along the edges of a graph representing a virtual topology. All of these functions are collective; i.e., they must be called by all MPI processes in the specified communicator. See Section Collective Communication for an overview of other dense (global) collective communication operations and the semantics of collective operations.
If the graph was created with MPI_DIST_GRAPH_CREATE_ADJACENT with sources and destinations containing 0, ..., n-1, where n is the number of MPI processes in the group of comm_old (i.e., the graph is fully connected and also includes an edge from each node to itself), then the sparse neighborhood communication routine performs the same data exchange as the corresponding dense (fully-connected) collective operation. In the case of a Cartesian communicator, only nearest neighbor communication is provided, corresponding to rank_source and rank_dest in MPI_CART_SHIFT with input disp = 1.
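The following sketch illustrates this equivalence. It creates a fully connected distributed graph with MPI_DIST_GRAPH_CREATE_ADJACENT, including the self-edge, and then performs a neighborhood all-to-all exchange; reorder = 0 is assumed so that rank numbering in the new communicator matches comm_old (here MPI_COMM_WORLD), and the one-integer-per-neighbor payload is illustrative only.

#include <mpi.h>
#include <stdlib.h>

int main(int argc, char **argv)
{
    int n, rank;
    MPI_Comm graph_comm;

    MPI_Init(&argc, &argv);
    MPI_Comm_size(MPI_COMM_WORLD, &n);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    /* every rank is both a source and a destination of every rank,
       including itself: a fully connected graph with self-edges */
    int *ranks = (int *)malloc(n * sizeof(int));
    for (int i = 0; i < n; i++) ranks[i] = i;

    MPI_Dist_graph_create_adjacent(MPI_COMM_WORLD,
                                   n, ranks, MPI_UNWEIGHTED,   /* sources      */
                                   n, ranks, MPI_UNWEIGHTED,   /* destinations */
                                   MPI_INFO_NULL, 0 /* reorder */, &graph_comm);

    int *sendbuf = (int *)malloc(n * sizeof(int));
    int *recvbuf = (int *)malloc(n * sizeof(int));
    for (int i = 0; i < n; i++) sendbuf[i] = rank * n + i;

    /* on this topology the neighborhood exchange moves the same data as
       MPI_Alltoall(sendbuf, 1, MPI_INT, recvbuf, 1, MPI_INT, MPI_COMM_WORLD) */
    MPI_Neighbor_alltoall(sendbuf, 1, MPI_INT, recvbuf, 1, MPI_INT, graph_comm);

    free(sendbuf); free(recvbuf); free(ranks);
    MPI_Comm_free(&graph_comm);
    MPI_Finalize();
    return 0;
}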
Rationale.
Neighborhood collective communications enable communication on a virtual topology. This high-level specification of data exchange among neighboring MPI processes enables optimizations in the MPI library because the communication pattern is known statically (the topology). Thus, the implementation can compute optimized message schedules during creation of the topology [40]. This functionality can significantly simplify the implementation of neighbor exchanges [36].
(End of rationale.)
For a distributed graph topology, created with MPI_DIST_GRAPH_CREATE, the sequence of neighbors in the send and receive buffers at each MPI process is defined as the sequence returned by MPI_DIST_GRAPH_NEIGHBORS for destinations and sources, respectively. For a general graph topology, created with MPI_GRAPH_CREATE, the use of neighborhood collective communication is restricted to adjacency matrices where the number of edges between any two MPI processes is defined to be the same for both MPI processes (i.e., with a symmetric adjacency matrix). In this case, the order of neighbors in the send and receive buffers is defined as the sequence of neighbors as returned by MPI_GRAPH_NEIGHBORS. Note that general graph topologies should generally be replaced by the distributed graph topologies.
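The following sketch shows how an application can query the neighbor sequence that fixes this buffer layout on a communicator created with MPI_DIST_GRAPH_CREATE. The graph is assumed to be unweighted, and the helper name and one-integer-per-neighbor exchange are illustrative only.

#include <mpi.h>
#include <stdlib.h>

/* graph_comm is assumed to have been created with MPI_Dist_graph_create
   and to be unweighted; the function name is illustrative only */
void exchange_one_int_per_neighbor(MPI_Comm graph_comm, int myvalue)
{
    int indegree, outdegree, weighted;
    MPI_Dist_graph_neighbors_count(graph_comm, &indegree, &outdegree, &weighted);

    int *sources      = (int *)malloc(indegree  * sizeof(int));
    int *destinations = (int *)malloc(outdegree * sizeof(int));
    MPI_Dist_graph_neighbors(graph_comm,
                             indegree,  sources,      MPI_UNWEIGHTED,
                             outdegree, destinations, MPI_UNWEIGHTED);

    /* sendbuf[i] is sent to destinations[i]; recvbuf[j] is received from
       sources[j], in exactly the order returned above */
    int *sendbuf = (int *)malloc(outdegree * sizeof(int));
    int *recvbuf = (int *)malloc(indegree  * sizeof(int));
    for (int i = 0; i < outdegree; i++) sendbuf[i] = myvalue;

    MPI_Neighbor_alltoall(sendbuf, 1, MPI_INT, recvbuf, 1, MPI_INT, graph_comm);

    free(sendbuf); free(recvbuf); free(sources); free(destinations);
}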
For a Cartesian topology, created with MPI_CART_CREATE, the sequence of neighbors in the send and receive buffers at each MPI process is defined by the order of the dimensions, first the neighbor in the negative direction and then in the positive direction with displacement 1. The numbers of sources and destinations in the communication routines are 2*ndims with ndims defined in MPI_CART_CREATE. If a neighbor does not exist, i.e., at the border of a Cartesian topology in the case of a nonperiodic virtual grid dimension (i.e., periods[...]=false), then this neighbor is defined to be MPI_PROC_NULL.
If a neighbor in any of the functions is MPI_PROC_NULL, then the neighborhood collective communication behaves like a point-to-point communication with MPI_PROC_NULL in this direction. That is, the buffer is still part of the sequence of neighbors but it is neither communicated nor updated.
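The following sketch, for a two-dimensional nonperiodic grid, illustrates both the Cartesian buffer ordering and the MPI_PROC_NULL behavior: the receive buffer holds 2*ndims entries ordered dim 0 negative, dim 0 positive, dim 1 negative, dim 1 positive, and at the grid border the slot belonging to the missing (MPI_PROC_NULL) neighbor is not updated, so it keeps its initial value of -1 here. The grid extents are an assumption chosen by MPI_DIMS_CREATE for illustration.

#include <mpi.h>

int main(int argc, char **argv)
{
    int nprocs, rank;
    int ndims = 2, dims[2] = {0, 0}, periods[2] = {0, 0};   /* nonperiodic */
    MPI_Comm cart_comm;

    MPI_Init(&argc, &argv);
    MPI_Comm_size(MPI_COMM_WORLD, &nprocs);
    MPI_Dims_create(nprocs, ndims, dims);
    MPI_Cart_create(MPI_COMM_WORLD, ndims, dims, periods, 0, &cart_comm);
    MPI_Comm_rank(cart_comm, &rank);

    /* 2*ndims receive slots, ordered: dim 0 negative, dim 0 positive,
       dim 1 negative, dim 1 positive */
    int sendval = rank;
    int recvbuf[4] = {-1, -1, -1, -1};
    MPI_Neighbor_allgather(&sendval, 1, MPI_INT, recvbuf, 1, MPI_INT, cart_comm);

    /* recvbuf[2*d] came from rank_source and recvbuf[2*d+1] from rank_dest
       as returned by MPI_Cart_shift(cart_comm, d, 1, ...); slots belonging
       to MPI_PROC_NULL neighbors at the border are not updated (stay -1) */

    MPI_Comm_free(&cart_comm);
    MPI_Finalize();
    return 0;
}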