The following routine starts a number of MPI processes and establishes communication with them, returning an inter-communicator.
Advice to users.
It is possible in MPI to start an SPMD or MPMD application with a
fixed number of processes after initialization by first starting one
process and having that process start its siblings with
MPI_COMM_SPAWN. This practice is discouraged primarily for reasons of
performance. If possible, it is preferable to start all processes at
once, as a single MPI application.
( End of advice to users.)
MPI_COMM_SPAWN(command, argv, maxprocs, info, root, comm, intercomm, array_of_errcodes) | |
IN command | name of program to be spawned (string, significant only at root) |
IN argv | arguments to command (array of strings, significant only at root) |
IN maxprocs | maximum number of processes to start (integer, significant only at root) |
IN info | a set of key-value pairs telling the runtime system where and how to start the processes (handle, significant only at root) |
IN root | rank of process in which previous arguments are examined (integer) |
IN comm | intra-communicator containing group of spawning processes (handle) |
OUT intercomm | inter-communicator between original group and the newly spawned group (handle) |
OUT array_of_errcodes | one code per process (array of integers) |
MPI_COMM_SPAWN tries to start maxprocs identical copies of the MPI program specified by command, establishing communication with them and returning an inter-communicator. The spawned processes are referred to as children. The children have their own MPI_COMM_WORLD, which is separate from that of the parents. MPI_COMM_SPAWN is collective over comm, and might not return until MPI_INIT has been called in the children. Similarly, MPI_INIT in the children might not return until all parents have called MPI_COMM_SPAWN. In this sense, MPI_COMM_SPAWN in the parents and MPI_INIT in the children form a collective operation over the union of parent and child processes.
The inter-communicator returned by MPI_COMM_SPAWN contains the parent processes in the local group and the child processes in the remote group. The ordering of processes in the local and remote groups is the same as the ordering of the group of the comm in the parents and of MPI_COMM_WORLD of the children, respectively. This inter-communicator can be obtained in the children through the function MPI_COMM_GET_PARENT.
Advice to users.
An implementation may automatically establish communication before
MPI_INIT is called by the children. Thus, completion of
MPI_COMM_SPAWN in the parent does not necessarily mean that
MPI_INIT has been called in the children (although the
returned inter-communicator can be used immediately).
( End of advice to users.)
The arguments are:
Advice to implementors.
The implementation should use a natural rule for finding
executables and determining working directories. For instance, a
homogeneous system with a global file system might look first in the
working directory of the spawning process, or might search the
directories in a PATH environment variable as do Unix shells.
An implementation
should document its rules for finding executables and determining
working directories, and a high-quality implementation should give the
user some control over these rules.
( End of advice to implementors.)
If the program named in command does not call
MPI_INIT, but instead forks a process that calls
MPI_INIT, the results are undefined. Implementations
may allow this case to work but are not required to.
Advice to users.
MPI does not say what happens if the program you start is
a shell script and that shell script starts a program that
calls MPI_INIT. Though some implementations may allow
you to do this, they may also have restrictions, such as requiring
that arguments supplied to the shell script be supplied
to the program, or requiring that certain parts of the environment
not be changed.
( End of advice to users.)
Example
Examples of argv in C and Fortran
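To run the program ``ocean'' with arguments ``-gridfile'' and
``ocean1.grd'' in C, pass ``ocean'' as command and, as argv, an array
holding the two argument strings followed by a terminating null pointer.
The array may be initialized statically or, if not everything is known
at compile time, allocated and filled in at run time.
In Fortran, argv is an array of character strings terminated by an
empty (all-blank) string.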
Arguments are supplied to the program if this is allowed by the operating system. In C, the MPI_COMM_SPAWN argument argv differs from the argv argument of main in two respects. First, it is shifted by one element. Specifically, argv[0] of main is provided by the implementation and conventionally contains the name of the program (given by command). argv[1] of main corresponds to argv[0] in MPI_COMM_SPAWN, argv[2] of main to argv[1] of MPI_COMM_SPAWN, etc. Passing an argv of MPI_ARGV_NULL to MPI_COMM_SPAWN results in main receiving argc of 1 and an argv whose element 0 is (conventionally) the name of the program. Second, argv of MPI_COMM_SPAWN must be null-terminated, so that its length can be determined.
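The index shift described above can be made concrete with a small model. The following C sketch is not MPI code: model_child_argv is a hypothetical helper, and a NULL spawn_argv is used here only as a stand-in for MPI_ARGV_NULL. Under those assumptions, it illustrates how the null-terminated argv given to MPI_COMM_SPAWN would conventionally map onto the argc and argv seen by main in a child.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical helper (not part of MPI): builds the argv that main() of a
 * spawned program would conventionally receive, given the command and the
 * null-terminated argv passed to MPI_COMM_SPAWN. A NULL spawn_argv stands in
 * for MPI_ARGV_NULL. The strings are borrowed, not copied; the caller frees
 * only the returned array. Returns NULL if allocation fails. */
char **model_child_argv(const char *command, char *const spawn_argv[],
                        int *argc_out)
{
    assert(command != NULL && argc_out != NULL);

    /* The spawn argv carries no count; its length comes from the terminator. */
    int n = 0;
    if (spawn_argv != NULL) {
        while (spawn_argv[n] != NULL) {
            n++;
        }
    }

    char **child = malloc((size_t)(n + 2) * sizeof *child);
    if (child == NULL) {
        return NULL;
    }

    child[0] = (char *)command;        /* argv[0] of main: the program name */
    for (int i = 0; i < n; i++) {
        child[i + 1] = spawn_argv[i];  /* shifted by one element */
    }
    child[n + 1] = NULL;               /* main's argv[argc] is a null pointer */
    *argc_out = n + 1;
    return child;
}
```

For the ``ocean'' example, calling model_child_argv with command ``ocean'' and an array holding ``-gridfile'', ``ocean1.grd'' and a terminating null pointer yields an argc of 3, with the two arguments at positions 1 and 2. Passing NULL yields an argc of 1.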
If a Fortran implementation supplies routines that allow a program to obtain its arguments, the arguments may be available through that mechanism. In C, if the operating system does not support arguments appearing in argv of main(), the MPI implementation may add the arguments to the argv that is passed to MPI_INIT.
By default, MPI_COMM_SPAWN raises an error of class MPI_ERR_SPAWN if it is unable to spawn all maxprocs processes. An implementation may allow the info argument to change this default behavior, such that if the implementation is unable to spawn all maxprocs processes, it may spawn a smaller number of processes instead of raising an error. In principle, the info argument may specify an arbitrary set {m_i : 0 <= m_i <= maxprocs} of allowed values for the number of processes spawned. The set {m_i} does not necessarily include the value maxprocs. If an implementation is able to spawn one of these allowed numbers of processes, MPI_COMM_SPAWN returns successfully and the number of spawned processes, m, is given by the size of the remote group of intercomm. If m is less than maxprocs, reasons why the other processes were not spawned are given in array_of_errcodes as described below. If it is not possible to spawn one of the allowed numbers of processes, MPI_COMM_SPAWN raises an error of class MPI_ERR_SPAWN.
A spawn call with the default behavior is called hard. A spawn call for which fewer than maxprocs processes may be returned is called soft. See Section Reserved Keys for more information on the soft key for info.
Advice to users.
By default, requests are hard and MPI errors are fatal. This means
that by default there will be a fatal error if MPI cannot
spawn all the requested processes. If you want the behavior
``spawn as many processes as possible, up to N,'' you
should do a soft spawn, where the set of allowed values {m_i}
is {0, ..., N}. However, this is not completely portable,
as implementations are not required to support soft spawning.
( End of advice to users.)
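The hard and soft cases can be summarized in a short C model. This is a sketch rather than MPI code: model_soft_spawn and model_hard_spawn are hypothetical names, and can_start stands for however many processes the runtime happens to be able to start. The rule of choosing the largest satisfiable allowed value, and of ignoring values outside 0..maxprocs, is an assumption about how an implementation that supports soft spawning would behave.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model (not an MPI routine): returns the number of processes m
 * that a soft spawn would produce, or -1 where MPI_COMM_SPAWN would instead
 * raise an error of class MPI_ERR_SPAWN. `allowed` holds the set of allowed
 * values; `can_start` is how many processes the runtime is able to start.
 * Assumption: the largest allowed value that can be satisfied is chosen. */
int model_soft_spawn(const int *allowed, size_t n_allowed,
                     int maxprocs, int can_start)
{
    assert(maxprocs >= 0 && can_start >= 0);

    int best = -1;
    for (size_t i = 0; i < n_allowed; i++) {
        int m = allowed[i];
        if (m < 0 || m > maxprocs) {
            continue;              /* assumed: values outside 0..maxprocs are ignored */
        }
        if (m <= can_start && m > best) {
            best = m;              /* largest allowed value seen so far that fits */
        }
    }
    return best;
}

/* A hard spawn is the special case whose only allowed value is maxprocs. */
int model_hard_spawn(int maxprocs, int can_start)
{
    return model_soft_spawn(&maxprocs, 1, maxprocs, can_start);
}
```

With the allowed set {0, ..., 4} and a runtime able to start three processes, the model returns 3. With the allowed set {2, 4} it returns 2. A hard request for four processes fails in the same situation.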
For the SPAWN calls, info provides additional (and possibly implementation-dependent) instructions to MPI and the runtime system on how to start processes. An application may pass MPI_INFO_NULL in C or Fortran. Portable programs not requiring detailed control over process locations should use MPI_INFO_NULL.
MPI does not specify the content of the info argument, except to reserve a number of special key values (see Section Reserved Keys). The info argument is quite flexible and could even be used, for example, to specify the executable and its command-line arguments. In this case the command argument to MPI_COMM_SPAWN could be empty. The ability to do this follows from the fact that MPI does not specify how an executable is found, and the info argument can tell the runtime system where to ``find'' the executable "" (empty string). Of course, a program that does this will not be portable across MPI implementations.
Advice to implementors.
MPI_ERRCODES_IGNORE in Fortran is a special type of constant, like
MPI_BOTTOM. See the discussion in Section Named Constants.
( End of advice to implementors.)
MPI_COMM_GET_PARENT(parent) | |
OUT parent | the parent communicator (handle) |
If a process was started with MPI_COMM_SPAWN or MPI_COMM_SPAWN_MULTIPLE, MPI_COMM_GET_PARENT returns the ``parent'' inter-communicator of the current process. This parent inter-communicator is created implicitly inside of MPI_INIT and is the same inter-communicator returned by SPAWN in the parents.
If the process was not spawned, MPI_COMM_GET_PARENT returns MPI_COMM_NULL.
After the parent communicator is freed or disconnected, MPI_COMM_GET_PARENT returns MPI_COMM_NULL.
Advice to users.
MPI_COMM_GET_PARENT returns a handle to a single inter-communicator.
Calling MPI_COMM_GET_PARENT a second time returns a handle
to the same inter-communicator. Freeing the handle with MPI_COMM_DISCONNECT
or MPI_COMM_FREE will cause other references to
the inter-communicator to become invalid (dangling).
Note that calling MPI_COMM_FREE on the parent
communicator is not useful.
( End of advice to users.)
Rationale.
The desire of the Forum was to create a constant MPI_COMM_PARENT
similar to MPI_COMM_WORLD. Unfortunately such a constant cannot
be used (syntactically) as an argument to MPI_COMM_DISCONNECT,
which is explicitly allowed.
( End of rationale.)