Communication Avoiding Block Low-Rank Parallel Multifrontal Triangular Solve with Many Right-Hand Sides
Abstract
Block low-rank (BLR) compression can significantly reduce the memory and time costs of parallel sparse direct solvers. In this paper, we investigate the performance of the BLR triangular solve phase, which we observe to be underwhelming when dealing with many right-hand sides (RHS). We explain that this is because the bottleneck of the triangular solve is not in accessing the BLR LU factors, but rather in accessing the RHS, which are uncompressed. Motivated by this finding, we propose several new hybrid variants, which combine the right-looking and left-looking communication patterns to minimize the number of accesses to the RHS. We confirm via a theoretical analysis that these new variants can significantly reduce the total communication volume. We assess the impact of this reduction on the time performance on a range of real-life applications using the MUMPS solver, obtaining up to 20% time reduction.
Origin | Files produced by the author(s) |
---|