R. Cook
M. Pakzad
C. Phillips
University of Newcastle upon Tyne, 1994
We consider parallel implementations of the conjugate gradient method applicable to both shared memory multiprocessors and local memory multicomputers. The basic method consists of a sequence of linear algebra steps, each of which can be readily parallelised at the cost of a certain amount of synchronisation or communication. In practice, the method is usually modified by incorporating a preconditioning step that improves its convergence characteristics. One of the more popular preconditioners is based on incomplete factorisation; in a parallel environment, the forward and backward substitutions associated with this preconditioning step can be a limiting factor. We consider two related ways of improving the potential for parallelism at the cost of an increase in the number of iterations, and quantify the effect by means of numerical examples.
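To make the sequence of linear algebra steps concrete, the following is a minimal sketch of a preconditioned conjugate gradient iteration in plain Python. It is an illustration under stated assumptions and not the implementation studied here: the function names (`pcg`, `matvec`, `axpy`, `jacobi_preconditioner`) are hypothetical, the matrix is stored densely, and a diagonal (Jacobi) preconditioner stands in for the incomplete factorisation. The diagonal preconditioner is applied element by element and so has none of the sequential dependence of forward and backward substitution, which is precisely the step that would replace `apply_M_inv` when an incomplete factorisation is used.

```python
from typing import Callable, List, Tuple

Vector = List[float]
Matrix = List[List[float]]


def dot(x: Vector, y: Vector) -> float:
    # Inner product: a reduction, so a parallel version needs
    # synchronisation (shared memory) or communication (local memory).
    return sum(a * b for a, b in zip(x, y))


def matvec(A: Matrix, x: Vector) -> Vector:
    # Matrix-vector product: each row can be computed independently.
    return [dot(row, x) for row in A]


def axpy(alpha: float, x: Vector, y: Vector) -> Vector:
    # Returns alpha * x + y; element-wise, so readily parallelised.
    return [alpha * a + b for a, b in zip(x, y)]


def jacobi_preconditioner(A: Matrix) -> Callable[[Vector], Vector]:
    # ASSUMPTION: stand-in preconditioner M = diag(A), used only for
    # illustration. It is NOT an incomplete factorisation.
    diag = [A[i][i] for i in range(len(A))]
    return lambda r: [ri / di for ri, di in zip(r, diag)]


def pcg(
    A: Matrix,
    b: Vector,
    apply_M_inv: Callable[[Vector], Vector],
    tol: float = 1e-10,
    max_iter: int = 1000,
) -> Tuple[Vector, int]:
    # Solves A x = b for symmetric positive definite A, starting from x = 0.
    x = [0.0] * len(b)
    r = list(b)                # residual r = b - A x, with x = 0
    z = apply_M_inv(r)         # preconditioning step
    p = list(z)                # initial search direction
    rz = dot(r, z)
    b_norm = dot(b, b) ** 0.5
    for k in range(max_iter):
        # Stop when the residual is small relative to the right-hand side.
        if dot(r, r) ** 0.5 <= tol * b_norm:
            return x, k
        Ap = matvec(A, p)
        alpha = rz / dot(p, Ap)
        x = axpy(alpha, p, x)      # x = x + alpha * p
        r = axpy(-alpha, Ap, r)    # r = r - alpha * A p
        z = apply_M_inv(r)         # preconditioning step
        rz_new = dot(r, z)
        beta = rz_new / rz
        p = axpy(beta, p, z)       # p = z + beta * p
        rz = rz_new
    return x, max_iter
```

A call such as `pcg(A, b, jacobi_preconditioner(A))` returns the approximate solution together with the number of iterations performed. Passing the preconditioner as a function keeps the iteration itself unchanged when a different preconditioning step is substituted, so the trade-off between parallelism in that step and the iteration count can be examined in isolation.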