15.5.3. User-Defined Data Representations

PreviousUpNext
Up: File Interoperability Next: Extent Callback Previous: External Data Representation: "external32"

There are two situations that cannot be handled by the required representations:

    1. a user wants to write a file in a representation unknown to the implementation, and
    2. a user wants to read a file written in a representation unknown to the implementation.
User-defined data representations allow the user to insert a third party converter into the I/O stream to do the data representation conversion.

MPI_REGISTER_DATAREP(datarep, read_conversion_fn, write_conversion_fn, dtype_file_extent_fn, extra_state)
IN datarepdata representation identifier (string)
IN read_conversion_fnfunction invoked to convert from file representation to native representation (function)
IN write_conversion_fnfunction invoked to convert from native representation to file representation (function)
IN dtype_file_extent_fnfunction invoked to get the extent of a datatype as represented in the file (function)
IN extra_stateextra state
C binding
int MPI_Register_datarep(const char *datarep, MPI_Datarep_conversion_function *read_conversion_fn, MPI_Datarep_conversion_function *write_conversion_fn, MPI_Datarep_extent_function *dtype_file_extent_fn, void *extra_state)
int MPI_Register_datarep_c(const char *datarep, MPI_Datarep_conversion_function_c *read_conversion_fn, MPI_Datarep_conversion_function_c *write_conversion_fn, MPI_Datarep_extent_function *dtype_file_extent_fn, void *extra_state)
Fortran 2008 binding
MPI_Register_datarep(datarep, read_conversion_fn, write_conversion_fn, dtype_file_extent_fn, extra_state, ierror)

CHARACTER(LEN=*), INTENT(IN) :: datarep
PROCEDURE(MPI_Datarep_conversion_function) :: read_conversion_fn, write_conversion_fn
PROCEDURE(MPI_Datarep_extent_function) :: dtype_file_extent_fn
INTEGER(KIND=MPI_ADDRESS_KIND), INTENT(IN) :: extra_state
INTEGER, OPTIONAL, INTENT(OUT) :: ierror
MPI_Register_datarep_c(datarep, read_conversion_fn, write_conversion_fn, dtype_file_extent_fn, extra_state, ierror) !(_c)

CHARACTER(LEN=*), INTENT(IN) :: datarep
PROCEDURE(MPI_Datarep_conversion_function_c) :: read_conversion_fn, write_conversion_fn
PROCEDURE(MPI_Datarep_extent_function) :: dtype_file_extent_fn
INTEGER(KIND=MPI_ADDRESS_KIND), INTENT(IN) :: extra_state
INTEGER, OPTIONAL, INTENT(OUT) :: ierror
Fortran binding
MPI_REGISTER_DATAREP(DATAREP, READ_CONVERSION_FN, WRITE_CONVERSION_FN, DTYPE_FILE_EXTENT_FN, EXTRA_STATE, IERROR)

CHARACTER*(*) DATAREP
EXTERNAL READ_CONVERSION_FN, WRITE_CONVERSION_FN, DTYPE_FILE_EXTENT_FN
INTEGER(KIND=MPI_ADDRESS_KIND) EXTRA_STATE
INTEGER IERROR

The call associates read_conversion_fn, write_conversion_fn, and dtype_file_extent_fn with the data representation identifier datarep. datarep can then be used as an argument to MPI_FILE_SET_VIEW, causing subsequent data access operations to call the conversion functions to convert all data items accessed between file data representation and native representation. MPI_REGISTER_DATAREP is a local operation and only registers the data representation for the calling MPI process. If datarep is already defined, an error in the error class MPI_ERR_DUP_DATAREP is raised using the default file error handler (see Section I/O Error Handling). The length of a data representation string is limited to the value of MPI_MAX_DATAREP_STRING. MPI_MAX_DATAREP_STRING must have a value of at least 64. No routines are provided to delete data representations and free the associated resources; it is not expected that an application will generate them in significant numbers.


PreviousUpNext
Up: File Interoperability Next: Extent Callback Previous: External Data Representation: "external32"


15.5.3.1. Extent Callback

PreviousUpNext
Up: User-Defined Data Representations Next: Datarep Conversion Functions Previous: User-Defined Data Representations

typedef int MPI_Datarep_extent_function(MPI_Datatype datatype, MPI_Aint *extent, void *extra_state);
ABSTRACT INTERFACE
    SUBROUTINE MPI_Datarep_extent_function(datatype, extent, extra_state, ierror)

TYPE(MPI_Datatype) :: datatype
INTEGER(KIND=MPI_ADDRESS_KIND) :: extent, extra_state
INTEGER :: ierror
SUBROUTINE DATAREP_EXTENT_FUNCTION(DATATYPE, EXTENT, EXTRA_STATE, IERROR)

INTEGER DATATYPE, IERROR
INTEGER(KIND=MPI_ADDRESS_KIND) EXTENT, EXTRA_STATE

The function dtype_file_extent_fn must return, in file_extent, the number of bytes required to store datatype in the file representation. The function is passed, in extra_state, the argument that was passed to the MPI_REGISTER_DATAREP call. MPI will only call this routine with predefined datatypes employed by the user.


Rationale.

This callback does not have a large count variant because it is anticipated that large counts will not be required to represent the extent output value. ( End of rationale.)
MPI_Datarep_conversion_function also supports large count types in separate additional MPI procedures in C (suffixed with the ``_c'') and multiple abstract interfaces in Fortran when using USE mpi_f08.

If the extent cannot be represented in extent, the callback function shall set extent to MPI_UNDEFINED. The MPI implementation will then raise an error of class MPI_ERR_VALUE_TOO_LARGE.


PreviousUpNext
Up: User-Defined Data Representations Next: Datarep Conversion Functions Previous: User-Defined Data Representations


15.5.3.2. Datarep Conversion Functions

PreviousUpNext
Up: User-Defined Data Representations Next: Matching Data Representations Previous: Extent Callback

typedef int MPI_Datarep_conversion_function(void *userbuf, MPI_Datatype datatype, int count, void *filebuf, MPI_Offset position, void *extra_state);
typedef int MPI_Datarep_conversion_function_c(void *userbuf, MPI_Datatype datatype, MPI_Count count, void *filebuf, MPI_Offset position, void *extra_state);
ABSTRACT INTERFACE
    SUBROUTINE MPI_Datarep_conversion_function(userbuf, datatype, count, filebuf, position, extra_state, ierror)

USE, INTRINSIC :: ISO_C_BINDING, ONLY : C_PTR
TYPE(C_PTR), VALUE :: userbuf, filebuf
TYPE(MPI_Datatype) :: datatype
INTEGER :: count, ierror
INTEGER(KIND=MPI_OFFSET_KIND) :: position
INTEGER(KIND=MPI_ADDRESS_KIND) :: extra_state
ABSTRACT INTERFACE
    SUBROUTINE MPI_Datarep_conversion_function_c(userbuf, datatype, count, filebuf, position, extra_state, ierror) !(_c)

USE, INTRINSIC :: ISO_C_BINDING, ONLY : C_PTR
TYPE(C_PTR), VALUE :: userbuf, filebuf
TYPE(MPI_Datatype) :: datatype
INTEGER(KIND=MPI_COUNT_KIND) :: count
INTEGER(KIND=MPI_OFFSET_KIND) :: position
INTEGER(KIND=MPI_ADDRESS_KIND) :: extra_state
INTEGER :: ierror
SUBROUTINE DATAREP_CONVERSION_FUNCTION(USERBUF, DATATYPE, COUNT, FILEBUF, POSITION, EXTRA_STATE, IERROR)

<TYPE> USERBUF(*), FILEBUF(*)
INTEGER DATATYPE, COUNT, IERROR
INTEGER(KIND=MPI_OFFSET_KIND) POSITION
INTEGER(KIND=MPI_ADDRESS_KIND) EXTRA_STATE

The function read_conversion_fn must convert from file data representation to native representation. Before calling this routine, MPI allocates and fills filebuf with count
contiguous data items. The type of each data item matches the corresponding entry for the predefined datatype in the type signature of datatype. The function is passed, in extra_state, the argument that was passed to the MPI_REGISTER_DATAREP call. The function must copy all count data items from filebuf to userbuf in the distribution described by datatype, converting each data item from file representation to native representation. datatype will be equivalent to the datatype that the user passed to the read function. If the size of datatype is less than the size of the count data items, the conversion function must treat datatype as being contiguously tiled over the userbuf. The conversion function must begin storing converted data at the location in userbuf specified by position into the (tiled) datatype.


Advice to users.

Although the conversion functions have similarities to MPI_PACK and MPI_UNPACK, one should note the differences in the use of the arguments count and position. In the conversion functions, count is a count of data items (i.e., count of typemap entries of datatype), and position is an index into this typemap. In MPI_PACK, incount refers to the number of whole datatypes, and position is a number of bytes. ( End of advice to users.)

Advice to implementors.

A converted read operation could be implemented as follows:

    1. Get file extent of all data items
    2. Allocate a filebuf large enough to hold all count data items
    3. Read data from file into filebuf
    4. Call read_conversion_fn to convert data and place it into userbuf
    5. Deallocate filebuf
( End of advice to implementors.)
If MPI cannot allocate a buffer large enough to hold all the data to be converted from a read operation, it may call the conversion function repeatedly using the same datatype and userbuf, and reading successive chunks of data to be converted in filebuf. For the first call (and in the case when all the data to be converted fits into filebuf), MPI will call the function with position set to zero. Data converted during this call will be stored in the userbuf according to the first count data items in datatype. Then in subsequent calls to the conversion function, MPI will increment the value in position by the count of items converted in the previous call, and the userbuf pointer will be unchanged.


Rationale.

Passing the conversion function a position and one datatype for the transfer allows the conversion function to decode the datatype only once and cache an internal representation of it on the datatype. Then on subsequent calls, the conversion function can use the position to quickly find its place in the datatype and continue storing converted data where it left off at the end of the previous call. ( End of rationale.)

Advice to users.

Although the conversion function may usefully cache an internal representation on the datatype, it should not cache any state information specific to an ongoing conversion operation, since it is possible for the same datatype to be used concurrently in multiple conversion operations. ( End of advice to users.)
The function write_conversion_fn must convert from native representation to file data representation. Before calling this routine, MPI allocates filebuf of a size large enough to hold count contiguous data items. The type of each data item matches the corresponding entry for the predefined datatype in the type signature of datatype. The function must copy count data items from userbuf in the distribution described by datatype, to a contiguous distribution in filebuf, converting each data item from native representation to file representation. If the size of datatype is less than the size of count data items, the conversion function must treat datatype as being contiguously tiled over the userbuf.

The function must begin copying at the location in userbuf specified by position into the (tiled) datatype. datatype will be equivalent to the datatype that the user passed to the write function. The function is passed, in extra_state, the argument that was passed to the MPI_REGISTER_DATAREP call.

The predefined constant MPI_CONVERSION_FN_NULL may be used as either write_conversion_fn or read_conversion_fn in bindings of MPI_REGISTER_DATAREP without large counts in these conversion callbacks, whereas the constant MPI_CONVERSION_FN_NULL_C can be used in the large count version (i.e., MPI_Register_datarep_c). In either of these cases, MPI will not attempt to invoke write_conversion_fn or read_conversion_fn, respectively, but will perform the requested data access using the native data representation.

An MPI implementation must ensure that all data accessed is converted, either by using a filebuf large enough to hold all the requested data items or else by making repeated calls to the conversion function with the same datatype argument and appropriate values for position.

An implementation will only invoke the callback routines in this section ( read_conversion_fn, write_conversion_fn, and dtype_file_extent_fn) when one of the read or write routines in Section Data Access, or MPI_FILE_GET_TYPE_EXTENT is called by the user. dtype_file_extent_fn will only be passed predefined datatypes employed by the user. The conversion functions will only be passed datatypes equivalent to those that the user has passed to one of the routines noted above.

The conversion functions must be reentrant. User defined data representations are restricted to use byte alignment for all types. Furthermore, it is erroneous for the conversion functions to call any collective routines or to free datatype.

The conversion functions should return an error code. If the returned error code has a value other than MPI_SUCCESS, the implementation will raise an error in the class MPI_ERR_CONVERSION.


PreviousUpNext
Up: User-Defined Data Representations Next: Matching Data Representations Previous: Extent Callback


Return to MPI-4.1 Standard Index
Return to MPI Forum Home Page

(Unofficial) MPI-4.1 of November 2, 2023
HTML Generated on November 19, 2023