One of the most explored approaches to overcome the impossibility of distributed consensus in fully asynchronous systems was the introduction of the concept of unreliable failure detectors by Chandra and Toueg in 1996. Failure detectors emerged not only as theoretical encapsulation of the synchrony needed for consensus, but also as useful building blocks for many distributed algorithms.
Therefore, in literature a lot of effort has been spent to implement such failure detectors efficiently. However, almost all currently known solutions are based on the assumption of a fully connected network between the processors of the system.
This thesis focuses on the implementation of various failure detectors in sparsely connected networks. Sparse networks model the nature of many non-broadcast networks on a low-level basis, namely wireless ad-hoc networks, sensor networks, and even the Internet. From such an approach, an implementation can profit regarding message complexity and accuracy. However, proving the correctness of a failure detector algorithm in sparse networks showed up to be an elaborating task. The introduction of a local failure detector as a new class of failure detectors with appropriate transformation algorithms to global failure detectors simplifies these things.
The failure detector implementations in this work cover several system models that head at distinct goals, including constant message size in arbitrary large networks, weak timing, and self-stabilization.
Especially the latter seems to be a very interesting combination of two flavors of fault tolerance---robustness and self-stabilization---which has not been addressed sufficiently in literature.