S.K. Shrivastava
A. Tully
University of Newcastle upon Tyne, 1993
Replicated execution of distributed programs provides a means of masking hardware (processor) failures in a distributed system. Application-level entities (processes, objects) are replicated to execute on distinct processors. Non-deterministic program constructs within the replicas could cause messages to be processed in non-identical order, or computations to choose different execution paths, causing the replicas' states to diverge. The replicas could thereafter produce inconsistent responses to identical messages and hence appear to be faulty. We identify possible sources of non-determinism and present general solutions for ensuring that non-faulty replicas process messages in identical order and follow identical execution paths in their computations, thereby preventing state divergence. Particular attention is paid to real-time programs, which can contain a variety of non-deterministic program constructs.
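The divergence problem described above can be illustrated with a minimal sketch. It is not the protocol of this paper: the names `Message`, `Replica` and `run`, and the use of a pre-agreed sequence number `seq`, are assumptions made purely for illustration. Two replicas receive the same two messages in different arrival orders; because the operations do not commute, the states differ unless both replicas process the messages in one agreed order.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Message:
    seq: int    # hypothetical sequence number agreed by all replicas
    op: str     # "add" or "mul"
    value: int


@dataclass
class Replica:
    state: int = 1

    def process(self, msg: Message) -> None:
        # Deterministic state transition for a single message.
        if msg.op == "add":
            self.state += msg.value
        elif msg.op == "mul":
            self.state *= msg.value
        else:
            raise ValueError(f"unknown operation: {msg.op}")


def run(arrivals: list[Message], ordered: bool) -> int:
    """Feed messages to a fresh replica and return its final state.

    With ordered=True the replica processes messages by the agreed
    sequence number instead of by local arrival order.
    """
    replica = Replica()
    batch = sorted(arrivals, key=lambda m: m.seq) if ordered else arrivals
    for msg in batch:
        replica.process(msg)
    return replica.state


m1 = Message(seq=1, op="add", value=2)
m2 = Message(seq=2, op="mul", value=3)

arrival_a = [m1, m2]   # arrival order seen by replica A
arrival_b = [m2, m1]   # arrival order seen by replica B

# Processing in arrival order: the two replicas end in different states.
unordered = (run(arrival_a, ordered=False), run(arrival_b, ordered=False))

# Processing in the agreed order: both replicas end in the same state.
ordered = (run(arrival_a, ordered=True), run(arrival_b, ordered=True))

print(unordered)  # (9, 5)
print(ordered)    # (9, 9)
```

The sketch covers only message ordering. It assumes the sequence numbers are already agreed and says nothing about how they are assigned, nor about the other sources of non-determinism (such as divergent execution paths) that the paper addresses.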