Up: The Dynamic Process Model
Next: Process Manager Interface
Previous: Starting Processes
The MPI_COMM_SPAWN and
MPI_COMM_SPAWN_MULTIPLE
routines provide an interface between MPI and
the runtime environment of an MPI application.
The difficulty is that there is an enormous range of runtime
environments and application requirements, and MPI must not be
tailored to any particular one. Examples of such environments are:
- MPP managed by a batch queueing system. Batch queueing
systems generally allocate resources before an application begins,
enforce limits on resource use (CPU time, memory use, etc.), and do
not allow a change in resource allocation after a job begins.
Moreover, many MPPs have special limitations or extensions, such as a
limit on the number of processes that may run on one processor, or
the ability to gang-schedule processes of a parallel application.
- Network of workstations with PVM. PVM (Parallel Virtual
Machine) allows a user to create a "virtual machine" out of
a network of workstations. An application may extend the virtual
machine or manage processes (create, kill, redirect output, etc.)
through the PVM library. Requests to manage the machine or processes
may be intercepted and handled by an external resource manager.
- Network of workstations managed by a load balancing system.
A load balancing system may choose the location of spawned processes
based on dynamic quantities, such as load average. It may
transparently migrate processes from one machine to another when a
resource becomes unavailable.
- Large SMP with Unix. Applications are run directly
by the user. They are scheduled at a low level by the operating
system. Processes may have special scheduling characteristics
(gang-scheduling, processor affinity, deadline scheduling, processor
locking, etc.) and be subject to OS resource limits (number of
processes, amount of memory, etc.).
MPI assumes, implicitly, the existence of an environment in which an
application runs. It does not provide "operating system" services,
such as a general ability to query what processes are running, to kill
arbitrary processes, or to find out properties of the runtime environment
(how many processors, how much memory, etc.).
Complex interaction of an
MPI application with its runtime environment should
be done through an environment-specific API.
An example of such an API would be the PVM task and machine management
routines (pvm_addhosts, pvm_config, pvm_tasks, etc.), possibly modified
to return an MPI (group, rank) pair when possible.
A Condor or PBS API would be another possibility.
At some low level, obviously, MPI must be able to interact with the
runtime system, but the interaction is not visible at the application
level and the details of the interaction are not specified by the MPI
standard.
In many cases, it is impossible to keep environment-specific
information out of the MPI interface without seriously compromising
MPI functionality. To permit applications to take advantage of
environment-specific functionality, many MPI routines take
an info argument that allows an application to
specify environment-specific information. There is a tradeoff
between functionality and portability: applications that
make use of info are not portable.
MPI does not require the existence of an underlying "virtual machine"
model, in which there is a consistent global view of an MPI
application and an implicit "operating system" managing resources
and processes. For instance, processes spawned by one task may not be
visible to another; additional hosts added to the runtime environment
by one process may not be visible in another process; tasks spawned by
different processes may not be automatically distributed over available
resources.
Interaction between MPI and the runtime environment is limited to the
following areas:
- A process may start new processes with MPI_COMM_SPAWN and
MPI_COMM_SPAWN_MULTIPLE.
- When a process spawns a child process, it may optionally use an
info argument to tell the runtime environment where or how to
start the process. This extra information may be opaque to MPI.
- An attribute MPI_UNIVERSE_SIZE on MPI_COMM_WORLD
tells a program how "large" the initial runtime environment
is, namely how many processes can
usefully be started in total. Subtracting the size of
MPI_COMM_WORLD from this value gives the number of processes that
might usefully be started in addition to those already running.
MPI-2.0 of July 1, 2008
HTML Generated on July 6, 2008