At the most basic level, file interoperability is the ability to read the information previously written to a file: not just the bits of data, but the actual information the bits represent. MPI guarantees full interoperability within a single MPI environment, and supports increased interoperability outside that environment through the external data representation (Section External Data Representation: ``external32'') as well as the data conversion functions (Section User-Defined Data Representations).
Interoperability within a single MPI environment (which could be considered ``operability'') ensures that file data written by one MPI process can be read by any other MPI process, subject to the consistency constraints (see Section File Consistency), provided that it would have been possible to start the two processes simultaneously and have them reside in a single MPI_COMM_WORLD. Furthermore, both processes must see the same data values at every absolute byte offset in the file for which data was written.
This single-environment file interoperability implies that file data is accessible regardless of the number of processes.
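There are three aspects to file interoperability: transferring the bits, converting between different file structures, and converting between different machine representations. The first two aspects of file interoperability are beyond the scope of this standard, as both are highly machine dependent.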
The remaining aspect of file interoperability, converting between different machine representations, is supported by the typing information specified in the etype and filetype. This facility allows the information in files to be shared between any two applications, regardless of whether they use MPI, and regardless of the machine architectures on which they run.
MPI supports multiple data representations: ``native,'' ``internal,'' and ``external32.'' An implementation may support additional data representations. MPI also supports user-defined data representations (see Section User-Defined Data Representations). The ``native'' and ``internal'' data representations are implementation dependent, while the ``external32'' representation is common to all MPI implementations and facilitates file interoperability. The data representation is specified in the datarep argument to MPI_FILE_SET_VIEW.
Advice to users.
MPI is not guaranteed to retain knowledge of what data representation was used when a file is written. Therefore, to correctly retrieve file data, an MPI application is responsible for specifying the same data representation as was used to create the file.
(End of advice to users.)
Advice to users.
The ``native'' data representation should only be used in a homogeneous MPI environment, or when the MPI application is capable of performing the data type conversions itself.
(End of advice to users.)
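As an illustration of an application performing the conversion itself, the following C sketch reverses the byte order of 32-bit values read from a ``native'' file that was written on a host with the opposite byte order. The function names are hypothetical and are not part of MPI. The sketch assumes that the application knows the writer's byte order by some means of its own (for example, a flag that it stored itself), and it does not handle differences in type sizes or floating-point formats.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Returns 1 on a little-endian host, 0 on a big-endian host. */
int host_is_little_endian(void)
{
    const uint16_t probe = 1;
    unsigned char first;
    memcpy(&first, &probe, 1);   /* inspect the lowest-addressed byte */
    return first == 1;
}

/* Reverses the byte order of one 32-bit value. */
uint32_t swap32(uint32_t v)
{
    return ((v & 0x000000FFu) << 24) |
           ((v & 0x0000FF00u) << 8)  |
           ((v & 0x00FF0000u) >> 8)  |
           ((v & 0xFF000000u) >> 24);
}

/* Hypothetical helper: converts n values, read as raw bytes from a
 * "native" file, to the host byte order. writer_little_endian is
 * information the application must supply; MPI does not record it. */
void fix_native_u32(uint32_t *vals, size_t n, int writer_little_endian)
{
    if (writer_little_endian == host_is_little_endian())
        return;                  /* same byte order: nothing to do */
    for (size_t i = 0; i < n; i++)
        vals[i] = swap32(vals[i]);
}
```

In a homogeneous environment the helper returns without touching the data, which matches the case in which no conversion is needed.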
Advice to implementors.
When implementing read and write operations on top of MPI message-passing, the message data should be typed as MPI_BYTE to ensure that the message routines do not perform any type conversions on the data.
(End of advice to implementors.)
Rationale.
The ``internal'' data representation allows the implementation to perform I/O efficiently in a heterogeneous environment, though with implementation-defined restrictions on how the file can be reused.
(End of rationale.)
Advice to implementors.
Since ``external32'' is a superset of the functionality provided by ``internal,'' an implementation may choose to implement ``internal'' as ``external32.''
(End of advice to implementors.)
The ``external32'' data representation has several advantages. First, all processes reading the file in a heterogeneous MPI environment will automatically have the data converted to their respective native representations. Second, the file can be exported from one MPI environment and imported into any other MPI environment with the guarantee that the second environment will be able to read all the data in the file.
The disadvantage of this data representation is that data precision and I/O performance may be lost in data type conversions.
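The portability of a canonical file representation can be illustrated with a small C sketch that stores a 32-bit integer as four bytes in a fixed order and reads it back. The sketch assumes big-endian byte order, which is what the ``external32'' definition elsewhere in the standard prescribes; the function names are hypothetical and are not part of MPI, and an MPI implementation performs this kind of conversion internally when ``external32'' is selected.

```c
#include <stdint.h>
#include <string.h>

/* Stores a 32-bit signed integer as four bytes, most significant
 * byte first. The result does not depend on the host byte order. */
void encode_i32_be(int32_t value, unsigned char out[4])
{
    uint32_t u;
    memcpy(&u, &value, sizeof u);      /* reinterpret the bit pattern */
    out[0] = (unsigned char)(u >> 24);
    out[1] = (unsigned char)(u >> 16);
    out[2] = (unsigned char)(u >> 8);
    out[3] = (unsigned char)u;
}

/* Rebuilds the host value from four big-endian bytes. */
int32_t decode_i32_be(const unsigned char in[4])
{
    uint32_t u = ((uint32_t)in[0] << 24) |
                 ((uint32_t)in[1] << 16) |
                 ((uint32_t)in[2] << 8)  |
                  (uint32_t)in[3];
    int32_t value;
    memcpy(&value, &u, sizeof value);
    return value;
}
```

Because the byte order in the file is fixed, a value encoded on one architecture decodes to the same value on any other. The cost is the extra conversion work on hosts whose native order differs.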
Advice to implementors.
When implementing read and write operations on top of MPI message-passing, the message data should be converted to and from the ``external32'' representation in the client, and sent as type MPI_BYTE. This will avoid possible double data type conversions and the associated further loss of precision and performance.
(End of advice to implementors.)
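The client-side conversion described in this advice might be sketched in C as follows. The sketch packs an array of doubles into a byte buffer that could then be sent as MPI_BYTE, so that the message layer performs no second conversion. It assumes that the host double is an 8-byte IEEE 754 value whose byte order matches that of the host's integers, and that the canonical form is big-endian; the function names are hypothetical and the message-passing calls themselves are omitted.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical client-side packer: converts n host doubles into
 * 8*n bytes, most significant byte first. dst must have room for
 * 8*n bytes. Assumes sizeof(double) == 8 (IEEE 754 binary64). */
void pack_doubles_be(const double *src, size_t n, unsigned char *dst)
{
    for (size_t i = 0; i < n; i++) {
        uint64_t bits;
        memcpy(&bits, &src[i], sizeof bits);   /* copy the bit pattern */
        for (int b = 0; b < 8; b++)
            dst[8 * i + b] = (unsigned char)(bits >> (56 - 8 * b));
    }
}

/* Inverse of pack_doubles_be: rebuilds n host doubles from 8*n
 * big-endian bytes received as an untyped byte stream. */
void unpack_doubles_be(const unsigned char *src, size_t n, double *dst)
{
    for (size_t i = 0; i < n; i++) {
        uint64_t bits = 0;
        for (int b = 0; b < 8; b++)
            bits = (bits << 8) | src[8 * i + b];
        memcpy(&dst[i], &bits, sizeof bits);
    }
}
```

Converting once in the client and then moving only bytes means that each value undergoes a single conversion in each direction.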