A necessary and sufficient condition for transforming limited accuracy failure detectors

作者:

Highlights:

摘要

Unreliable failure detectors are oracles that give information about process failures. Chandra and Toueg were first to study such failure detectors for distributed systems, and they identified a number that enabled the solution of the Consensus problem in asynchronous distributed systems. This paper focuses on two of these, denoted S (strong) and ♢S (eventually strong). The characteristics of a given unreliable failure detector are usually described by its completeness and accuracy properties. Completeness is a requirement on the actual detection of failures, while accuracy limits the mistakes a failure detector can make. Let the scope of the accuracy property of an unreliable failure detector be the minimum number (k) of processes that may not erroneously suspect a correct process to have crashed. Usual failure detectors implicitly consider a scope equal to n (the total number of processes). Accuracy properties with limited scope give rise to the classes of failure detectors that we call Sk and ♢Sk. This paper investigates the following question: “Given Sk and ♢Sk, under which condition is it possible to transform their failure detectors into their counterparts with unlimited accuracy, i.e., S and ♢S?”. The paper answers this question in the following way. It first presents a particularly simple protocol that realizes such a transformation when f

论文关键词:Asynchronous distributed system,Crash failure,Failure detector accuracy,Reduction protocol,Unreliable failure detector

论文评审过程:Received 3 January 2000, Revised 19 August 2003, Available online 11 December 2003.

论文官网地址:https://doi.org/10.1016/j.jcss.2003.08.003