Code Generation and Performance Engineering for Matrix-Free Finite Element Methods on Hybrid Tetrahedral Grids

Abstract

This paper introduces a code generator designed for extreme-scalable, matrix-free finite element operators on hybrid tetrahedral grids. It optimizes the local evaluation of bilinear forms through various techniques including tabulation, relocation of loop invariants, and interelement vectorization, all implemented as transformations of an abstract syntax tree. A key contribution is the generator’s ability to optimize not only the local assembly, but also the loop over the underlying grid. To that end, the developed “cubes” loop pattern leverages local structure resulting from grid refinement and significantly enhances arithmetic intensity. Its effectiveness is established during a memory and layer condition analysis. The paper demonstrates the generator’s capabilities through a comprehensive, educational cycle of performance analysis, bottleneck identification, and emission of dedicated optimizations. For three differential operators ($-\Delta$, $-\nabla \cdot (k(\mathbf{x})\, \nabla)$, $\alpha (\mathbf{x})\, \mathbf{curl} \, \mathbf{curl} + \beta (\mathbf{x})$), the set of most effective optimizations is determined from a larger pool. They result in matrix-free operators with a kernel throughput of 1.3 to 2.1 GDoF/s, achieving up to 62% of peak performance on a 36-core Intel Ice Lake socket. We make performance comparisons with a sparse matrix-vector multiplication and operators generated by the Firedrake project at increasing levels of refinement. Finally, the solution of the curl-curl problem with more than a trillion $(10^{12})$ degrees of freedom on 21,504 processes in less than 50 s showcases the generated operators’ performance and extreme scalability as part of a full multigrid solver.

BibTeX
@article{id3052,
  author = {B\"ohm, Fabian and Bauer, Daniel and Kohl, Nils and Alappat, Christie L. and Th\"onnes, Dominik and Mohr, Marcus and K\"ostler, Harald and R\"ude, Ulrich},
  doi = {10.1137/24m1653756},
  journal = {SIAM Journal on Scientific Computing},
  language = {en},
  number = {1},
  pages = {B131--B159},
  title = {Code Generation and Performance Engineering for Matrix-Free Finite Element Methods on Hybrid Tetrahedral Grids},
  volume = {47},
  year = {2025},
}

EndNote
%0 Journal Article
%A Böhm, Fabian
%A Bauer, Daniel
%A Kohl, Nils
%A Alappat, Christie L.
%A Thönnes, Dominik
%A Mohr, Marcus
%A Köstler, Harald
%A Rüde, Ulrich
%R 10.1137/24m1653756
%J SIAM Journal on Scientific Computing
%G en
%N 1
%P B131–B159
%T Code Generation and Performance Engineering for Matrix-Free Finite Element Methods on Hybrid Tetrahedral Grids
%V 47
%D 2025