J. Xu
Brian Randell
University of Newcastle upon Tyne. 1995.
Roll-forward checkpointing schemes [Long et al. 1990; Pradhan and Vaidya 1992] were developed to avoid rollback in the presence of independent faults and to increase the likelihood that a task completes within a tight deadline. Despite their use of roll-forward recovery, these schemes are not necessarily appropriate for time-critical applications, because interactions with the external environment and communications between processes must be deferred during the checkpoint validation steps (typically two checkpoint intervals) until the fault-free processors have been identified. Deadlines for providing services may thus be violated. In this paper we present and discuss two alternative roll-forward recovery schemes, intended especially for time-critical and interaction-intensive applications, that deliver correct, timely results even when checkpoint validation is required.