An MPI implementation cannot or may choose not to handle some errors that occur during MPI calls. These can include errors that generate exceptions or traps, such as floating point errors or access violations. The set of errors that are handled by MPI is implementation-dependent. Each such error generates an MPI exception.
The above text takes precedence over any text on error handling within this document. Specifically, text that states that errors will be handled should be read as may be handled.
A user can associate error handlers to three types of objects: communicators, windows, and files. The specified error handling routine will be used for any MPI exception that occurs during a call to MPI for the respective object. MPI calls that are not related to any objects are considered to be attached to the communicator MPI_COMM_WORLD. The attachment of error handlers to objects is purely local: different processes may attach different error handlers to corresponding objects.
Several predefined error handlers are available in MPI, including MPI_ERRORS_ARE_FATAL and MPI_ERRORS_RETURN, both discussed below.
The error handler MPI_ERRORS_ARE_FATAL is associated by default with MPI_COMM_WORLD after initialization. Thus, if the user chooses not to control error handling, every error that MPI handles is treated as fatal. Since (almost) all MPI calls return an error code, a user may choose to handle errors in the main code, by testing the return code of MPI calls and executing suitable recovery code when a call was not successful. In this case, the error handler MPI_ERRORS_RETURN will be used. Usually it is more convenient and more efficient not to test for errors after each MPI call, and to have such errors handled by a nontrivial MPI error handler.
After an error is detected, the state of MPI is undefined. That is, using a user-defined error handler, or MPI_ERRORS_RETURN, does not necessarily allow the user to continue to use MPI after an error is detected. The purpose of these error handlers is to allow a user to issue user-defined error messages and to take actions unrelated to MPI (such as flushing I/O buffers) before a program exits. An MPI implementation is free to allow MPI to continue after an error but is not required to do so.
Advice to implementors. A high-quality implementation will, to the greatest possible extent, circumscribe the impact of an error, so that normal processing can continue after an error handler was invoked. The implementation documentation will provide information on the possible effect of each class of errors. (End of advice to implementors.)
An MPI error handler is an opaque object, which is accessed by a handle.
MPI calls are provided to create new error handlers, to associate error
handlers with objects, and to test which error handler is associated with
an object.
C has distinct typedefs for user-defined error handling callback functions that accept communicator, file, and window arguments. In Fortran there are three user routines.
An error handler object is created by a call to MPI_XXX_CREATE_ERRHANDLER, where XXX is, respectively, COMM, WIN, or FILE.
An error handler is attached to a communicator, window, or file by a call to MPI_XXX_SET_ERRHANDLER. The error handler must be either a predefined error handler, or an error handler that was created by a call to MPI_XXX_CREATE_ERRHANDLER, with matching XXX. The predefined error handlers MPI_ERRORS_RETURN and MPI_ERRORS_ARE_FATAL can be attached to communicators, windows, and files.
The error handler currently associated with a communicator, window, or file can be retrieved by a call to MPI_XXX_GET_ERRHANDLER.
The MPI function MPI_ERRHANDLER_FREE can be used to free an error handler that was created by a call to MPI_XXX_CREATE_ERRHANDLER.
MPI_{COMM,WIN,FILE}_GET_ERRHANDLER behave as if a new error handler object is created. That is, once the error handler is no longer needed, MPI_ERRHANDLER_FREE should be called with the error handler returned from MPI_{COMM,WIN,FILE}_GET_ERRHANDLER to mark the error handler for deallocation. This provides behavior similar to that of MPI_COMM_GROUP and MPI_GROUP_FREE.
Advice to implementors. High-quality implementations should raise an error when an error handler that was created by a call to MPI_XXX_CREATE_ERRHANDLER is attached to an object of the wrong type with a call to MPI_YYY_SET_ERRHANDLER. To do so, it is necessary to maintain, with each error handler, information on the typedef of the associated user function. (End of advice to implementors.)
The syntax for these calls is given below.