Derived datatypes are an integral part of the Message Passing Interface (MPI), the defacto standard for programming massively parallel, high performance applications. The mechanism is essential for efficient and portable implementations of programs working with complex data layouts. MPI defines several type constructors that are capable of describing arbitrarily complex, heterogeneous and noncontiguous data layouts. The type constructors may be applied recursively, leading to treelike representations of data layouts, also called type trees. Typically, multiple different representations of a given data layout exist. MPI implementations require concise representations to process derived datatypes efficiently. It is not easy to see for users which representation is the best for a given data layout, since many nonobvious factors need to be considered, such as machine specific hardware capabilities and optimizations. The problem of computing the most concise (or optimal) type tree representation for a given data layout was coined the Type Reconstruction Problem. It has been an open question whether this problem can be solved to optimality in polynomial time (w.r.t. to the size of the represented data layout). So far, heuristics have been used to improve a given representation, without providing any guarantees about the quality of the result. In this master's thesis, we present an algorithm that solves the Type Reconstruction Problem for arbitrarily complex data layouts in order n to the fourth time, where n is the size of the data layout. A recent work showed that optimal representations can be computed in subquadratic time if the set of considered type constructors is reduced to those with only one subtype, i.e., if the type trees are restricted to type paths. For this special case, called the Type Path Reconstruction Problem, we improve the currently best known asymptotic bound significantly.
