Parallel preconditioned conjugate gradient algorithm on GPU
Authors:
Abstract
We propose a parallel implementation of the preconditioned conjugate gradient algorithm on a GPU platform. The preconditioning matrix is an approximate inverse derived from the SSOR preconditioner. Because it is applied through sparse matrix–vector multiplication, the proposed preconditioner is well suited to the massively parallel GPU architecture. Compared with a CPU implementation of the conjugate gradient algorithm, our GPU preconditioned conjugate gradient implementation is between 8 and 10 times faster.
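The abstract does not state the preconditioner's formula, so the following is only a minimal serial sketch of the general idea, under stated assumptions. It takes the SSOR preconditioner with relaxation parameter ω = 1, M = (D + L) D⁻¹ (D + L)ᵀ, and truncates the Neumann series of (D + L)⁻¹ after the first-order term, which gives M⁻¹ ≈ Kᵀ D⁻¹ K with K = I − L D⁻¹. Applying it then needs only sparse matrix–vector products and diagonal scalings, with no triangular solves. The choice of ω, the truncation order, the CSR layout, the test matrix, and every function name are illustrative assumptions, not the authors' implementation, and the code runs on the CPU in plain Python.

```python
import math

def poisson_1d_csr(n):
    """CSR arrays (indptr, indices, data) of the SPD matrix tridiag(-1, 2, -1)."""
    indptr, indices, data = [0], [], []
    for i in range(n):
        for j, v in ((i - 1, -1.0), (i, 2.0), (i + 1, -1.0)):
            if 0 <= j < n:
                indices.append(j)
                data.append(v)
        indptr.append(len(indices))
    return indptr, indices, data

def spmv(csr, x):
    """Sparse matrix-vector product y = A x for A stored in CSR form."""
    indptr, indices, data = csr
    return [sum(data[k] * x[indices[k]] for k in range(indptr[i], indptr[i + 1]))
            for i in range(len(indptr) - 1)]

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def split(csr):
    """Return (diagonal, strict lower CSR, strict upper CSR) of a CSR matrix."""
    indptr, indices, data = csr
    n = len(indptr) - 1
    diag = [0.0] * n
    lower, upper = ([0], [], []), ([0], [], [])
    for i in range(n):
        for k in range(indptr[i], indptr[i + 1]):
            j = indices[k]
            if j == i:
                diag[i] = data[k]
            else:
                part = lower if j < i else upper
                part[1].append(j)
                part[2].append(data[k])
        lower[0].append(len(lower[1]))
        upper[0].append(len(upper[1]))
    return diag, lower, upper

def make_ssor_approx_inverse(csr):
    """Assumed form: z = K^T D^{-1} K r with K = I - L D^{-1} (omega = 1).

    For symmetric A the strict upper triangle equals L^T, so both
    triangular factors are applied as ordinary sparse products.
    """
    diag, lower, upper = split(csr)

    def apply(r):
        t = [ri / di for ri, di in zip(r, diag)]                    # D^{-1} r
        v = [ri - li for ri, li in zip(r, spmv(lower, t))]          # K r
        w = [vi / di for vi, di in zip(v, diag)]                    # D^{-1} K r
        u = spmv(upper, w)                                          # L^T w
        return [wi - ui / di for wi, ui, di in zip(w, u, diag)]     # K^T w

    return apply

def pcg(csr, b, apply_minv, tol=1e-10, max_iter=1000):
    """Preconditioned conjugate gradient; returns (x, iterations used)."""
    x = [0.0] * len(b)
    b_norm = math.sqrt(dot(b, b))
    if b_norm == 0.0:
        return x, 0
    r = list(b)
    z = apply_minv(r)
    p = list(z)
    rz = dot(r, z)
    for it in range(1, max_iter + 1):
        ap = spmv(csr, p)
        alpha = rz / dot(p, ap)
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * qi for ri, qi in zip(r, ap)]
        if math.sqrt(dot(r, r)) <= tol * b_norm:
            return x, it
        z = apply_minv(r)
        rz_new = dot(r, z)
        p = [zi + (rz_new / rz) * pi for zi, pi in zip(z, p)]
        rz = rz_new
    return x, max_iter

if __name__ == "__main__":
    n = 50
    A = poisson_1d_csr(n)
    b = spmv(A, [1.0] * n)          # right-hand side whose exact solution is all ones
    x, iters = pcg(A, b, make_ssor_approx_inverse(A))
    print("iterations:", iters)
    print("max error:", max(abs(xi - 1.0) for xi in x))
```

In this sketch every step of the iteration is a sparse matrix–vector product, a dot product, or an elementwise vector update. These operations have independent per-row or per-element work, which is the property the abstract points to when it calls the preconditioner well suited to a GPU.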
Keywords: Preconditioned conjugate gradient, Parallel computing, Graphics processor unit
Review history: Received 30 September 2010, Revised 1 April 2011, Available online 27 April 2011.
Official page: https://doi.org/10.1016/j.cam.2011.04.025