40 Years of Computing at Newcastle

Department Technical Report Series No. 553

A System for Fault-Tolerant Execution of Data and Compute Intensive Programs Over a Network of Workstations.

J. A. Smith and S. K. Shrivastava.

University of Newcastle upon Tyne. 1996

Abstract

A well-known structuring technique for a wide class of parallel applications is the bag of tasks, which allows a computation to be partitioned dynamically among a collection of concurrent processes. This paper describes a fault-tolerant implementation of this structure that uses atomic actions (atomic transactions) to operate on persistent objects, which are accessed in a distributed setting via Remote Procedure Call (RPC). The system is suited to the parallel execution of data- and compute-intensive programs that require persistent storage and fault-tolerance facilities. Its suitability is examined in the context of the measured performance of three specific applications: ray tracing, matrix multiplication and Cholesky factorization. The system runs on stock hardware and software platforms, specifically UNIX and C++.
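The bag-of-tasks structure named in the abstract can be illustrated with a minimal C++ sketch. This is not the report's system: it assumes threads in one process stand in for workstations, an in-memory queue stands in for the persistent bag, and a mutex stands in for atomic actions, so it shows only the dynamic partitioning and none of the persistence, RPC or fault tolerance. The names `Task`, `TaskBag` and `multiply_with_bag` are hypothetical, and the choice of one matrix row per task is an assumption made for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <mutex>
#include <queue>
#include <thread>
#include <vector>

using Matrix = std::vector<std::vector<double>>;

// Hypothetical task: compute one row of the product C = A * B.
struct Task {
    std::size_t row;
};

// In-memory stand-in for the bag. A mutex guards each operation;
// the report's system uses atomic actions on persistent objects instead.
class TaskBag {
public:
    void put(Task t) {
        std::lock_guard<std::mutex> lock(mutex_);
        tasks_.push(t);
    }

    // Removes one task into `out`; returns false when the bag is empty.
    bool take(Task& out) {
        std::lock_guard<std::mutex> lock(mutex_);
        if (tasks_.empty()) return false;
        out = tasks_.front();
        tasks_.pop();
        return true;
    }

private:
    std::mutex mutex_;
    std::queue<Task> tasks_;
};

// Fills the bag with one task per row, then lets `workers` threads
// drain it. Faster workers simply take more tasks.
Matrix multiply_with_bag(const Matrix& a, const Matrix& b, unsigned workers) {
    assert(a.empty() || a[0].size() == b.size());
    const std::size_t n = a.size();
    const std::size_t k = b.size();
    const std::size_t m = b.empty() ? 0 : b[0].size();
    Matrix c(n, std::vector<double>(m, 0.0));

    TaskBag bag;
    for (std::size_t i = 0; i < n; ++i) bag.put(Task{i});

    auto worker = [&]() {
        Task t{0};
        while (bag.take(t)) {
            // Each row is written by exactly one worker, so the
            // result matrix itself needs no further locking.
            for (std::size_t j = 0; j < m; ++j) {
                double sum = 0.0;
                for (std::size_t x = 0; x < k; ++x) sum += a[t.row][x] * b[x][j];
                c[t.row][j] = sum;
            }
        }
    };

    std::vector<std::thread> pool;
    for (unsigned w = 0; w < workers; ++w) pool.emplace_back(worker);
    for (auto& th : pool) th.join();
    return c;
}
```

Calling `multiply_with_bag(a, b, 3)` on two conforming matrices returns their product, computed by three workers. Because workers pull tasks rather than being assigned a fixed share, the split of rows between them is decided at run time, which is the dynamic partitioning the abstract refers to.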
Technical Report Abstract No. 553, 30 June 1997