Performance Optimization of Tensor Contraction Expressions for Many-Body Methods in Quantum Chemistry

Abstract
Complex tensor contraction expressions arise in accurate electronic structure models in quantum chemistry, such as the coupled cluster method. This paper addresses two complementary aspects of performance optimization of such tensor contraction expressions. Transformations using algebraic properties of commutativity and associativity can be used to significantly decrease the number of arithmetic operations required for evaluation of these expressions. The identification of common subexpressions among a set of tensor contraction expressions can result in a reduction of the total number of operations required to evaluate the tensor contractions. The first part of the paper describes an effective algorithm for operation minimization with common subexpression identification and demonstrates its effectiveness on tensor contraction expressions for coupled cluster equations. The second part of the paper highlights the importance of data layout transformation in the optimization of tensor contraction computations on modern processors. A number of considerations, such as minimization of cache misses and utilization of multimedia vector instructions, are discussed. A library for efficient index permutation of multidimensional tensors is described, and experimental performance data is provided that demonstrates its effectiveness.