Increasing data locality and introducing Level-3 BLAS in the Neville elimination

Authors:

Highlights:

Abstract

In this paper we present two new algorithmic variants to compute the Neville elimination, with and without pivoting, which improve data locality and cast most of the computations in terms of high-performance Level 3 BLAS. The experimental evaluation on a state-of-the-art multi-core processor demonstrates that the new blocked algorithms exhibit a much higher degree of concurrency and better cache usage, yielding higher performance while offering numerical accuracy akin to that of the traditional columnwise variant in most cases.
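For orientation, Neville elimination differs from Gaussian elimination in that each entry below the diagonal is annihilated using the row immediately above it, rather than a fixed pivot row. The following is a minimal sketch of the traditional unblocked variant without pivoting, not the blocked Level-3 BLAS algorithms the paper proposes. The function name `neville_eliminate` and the list-of-lists matrix representation are assumptions made for illustration only.

```python
def neville_eliminate(a):
    """Reduce a square matrix to upper triangular form with Neville
    elimination (no pivoting). Illustrative sketch only; returns a new
    list of lists and leaves the input untouched."""
    n = len(a)
    u = [[float(x) for x in row] for row in a]
    for k in range(n - 1):
        # Sweep bottom-up so that row i-1 has not yet been modified
        # in this column when it is used to eliminate row i.
        for i in range(n - 1, k, -1):
            if u[i][k] == 0.0:
                continue  # entry is already zero, nothing to do
            if u[i - 1][k] == 0.0:
                # A pivoting variant would exchange rows here.
                raise ZeroDivisionError("zero entry above a nonzero one")
            m = u[i][k] / u[i - 1][k]  # multiplier from the adjacent row
            for j in range(k, n):
                u[i][j] -= m * u[i - 1][j]
            u[i][k] = 0.0  # set the eliminated entry to an exact zero
    return u


# Usage: a 3x3 Vandermonde matrix with nodes 1, 2, 3.
u = neville_eliminate([[1, 1, 1], [1, 2, 4], [1, 3, 9]])
```

Because every update touches only two adjacent rows, this row-by-row form is made up of vector operations with little data reuse, which is the kind of limitation that blocked formulations aim to overcome.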

Keywords: Neville elimination, Pivoting, Linear systems, Multi-core processors, High performance

Review history: Available online 23 September 2011.

Official paper URL: https://doi.org/10.1016/j.amc.2011.08.076