Weak system models for fault-tolerant distributed agreement problems

Robinson, Peter

Record link:

https://resolver.obvsg.at/urn:nbn:at:at-ubtuw:1-42017
http://hdl.handle.net/20.500.12708/12220

Title:

Weak system models for fault-tolerant distributed agreement problems

Citation:

Robinson, P. (2010). Weak system models for fault-tolerant distributed agreement problems [Dissertation, Technische Universität Wien]. reposiTUm. https://resolver.obvsg.at/urn:nbn:at:at-ubtuw:1-42017

CatalogPlus:

AC07809343

Publication Type:

Thesis - Dissertation

Language:

English

Authors:

Robinson, Peter

Advisor:

Schmid, Ulrich

Co-advisor:

Raynal, Michel

Organisational Unit:

E182 - Institut für Technische Informatik

Date (published):

2010

Number of Pages:

151

Keywords:

schwache Systemmodelle; fehlertolerante verteilte Systeme; Byzantinische Fehler; Uhrensynchronisation; partielle Synchronität; Fehlerdetektor; k-set Konsens; Kommunikationsprädikat

weak system models; fault-tolerant distributed systems; Byzantine failures; clock synchronization; partial synchrony; failure detectors; k-set agreement; communication predicates

Abstract:

Diese Arbeit untersucht Synchronitätsaspekte fehlertoleranter, verteilter Systemmodelle. In Teil I wird die Einbettung der Arbeit erläutert und die verwandte Literatur besprochen, sowie ein Überblick über die grundlegenden Systemannahmen gegeben. In Teil II wird das Asynchronous-Bounded-Cycle-Modell (ABC Modell) eingeführt, welches ohne explizite Echtzeitschranken auf Kommunikationskanäle bzw. Taktgeschwindigkeiten auskommt. Stattdessen verwendet das ABC Modell eine graphentheoretische Sychronitätsbedingung auf gewissen kausalen Ketten. Weiters vergleichen wir das ABC Modell mit anderen partiell synchronen Modellen, insbesondere mit den klassischen Modellen von Dwork, Lynch und Stockmeyer. Wir zeigen, wie man im ABC Modell eine Gleichtaktsimulation trotz Byzantinischer Fehler bewerkstelligen kann und beweisen deren Korrektheit. Daraus folgt, dass im ABC Modell auch das - für verteilte Systeme fundamentale, in asynchronen Systemen jedoch unlösbare - Konsensusproblem gelöst werden kann. Danach präsentieren wir das formal schwierigste Resultat dieser Arbeit: Wir beweisen, dass jeder Algorithmus, dessen Korrektheit für das Theta-Modell (von Le Lann und Schmid) bereits bewiesen wurde, auch in unserem ABC Modell funktioniert. In dem technisch aufwendigen Beweis verwenden wir eine Variante des Farkas'schen Theorems über lineare Ungleichungssysteme und entwickeln einen neuartigen Zyklenraum auf gerichteten Graphen, um die Existenz gewisser Formationen kausaler Ketten zu beweisen. Damit zeigen wir, dass jede zeitfreie Safety-Bedingungen auch im ABC Modell gilt. Mittels mengentheoretischer Topologie erweitern wir dieses Resultat dann auf Liveness-Bedingungen. In Teil III wenden wir uns dem Grenzbereich zwischen Modellen, wo Konsensus erreichbar ist, und dem asynchronen Modell zu. Zu diesem Zweck betrachten wir eine Generalisierung des Konsensusproblems: Das sogenannte k-Set-Agreement-Problem verlangt, dass sich alle korrekten Prozesse systemweit auf höchstens k unterschiedliche Entscheidungswerte einigen. Wir präsentieren zwei sehr schwache Modelle MAnti und MSink und beweisen, dass Konsensus in diesen Modellen nicht erreicht werden kann. Weiters untersuchen wir den Bezug zwischen dem Modell MSink und den f-source Modellen von Aguilera et al. Nichtsdestotrotz zeigen wir, dass in MAnti und MSink das (n-1)-Set-Agreement-Problem gelöst werden kann, indem wir eine Implementierung des Fehlerdetektors L angeben. Weiters diskutieren wir den Fehlerdetektor L(k), welcher eine Generalisierung von L darstellt und beweisen die Korrektheit eines Algorithmus, der mittels L(k) das k-Set-Agreement-Problem löst. Ein interessanter Aspekt dieses Algorithmus ist, dass er auch in anonymen Systemen funktioniert. Im darauffolgenden Kapitel analysieren wir das Verhältnis zwischen L(k) und bereits existierenden Fehlerdetektoren für das k-Set-Agreement-Problem. Wir diskutieren auch diverse Aspekte von L(k) in anonymen Systemen. Danach präsentieren wir ein generisches Theorem, welches zur Charakterisierung der Unlösbarkeit des k-Set-Agreement-Problems verwendet werden kann. Dies ermöglicht es uns zu zeigen, dass ([Sigma]_k,[Omega]_k) nicht ausreichend für k-set agreement ist. Die Klasse der partiell synchronen Systeme liefert uns einen weiteren Anwendungsfall für dieses Theorem. Danach beschäftigen wir uns mit dem k-Set-Agreement-Problem in rundenbasierten Systemen. Zuerst erläutern wir eine graphentheoretische Abstraktion, den sogenannten Skelettgraphen, welcher die kontinuierlich vorhandene Synchronität eines Laufs des Systems verkörpert. Wir präsentieren einen Algorithmus, der es Prozessen ermöglicht, den Skelettgraphen lokal zu approximieren. Weiters führen wir ein Prädikat PSources(k) ein, welches eine hinreichende Synchronitätsbedingung zur Lösung des k-Set-Agreement-Problems beschreibt, und beweisen, dass die nächstschwierigere Instanz, nämlich (k-1)-Set-Agreement, mit PSources(k) nicht lösbar ist. Abschliessend zeigen wir, wie man basierend auf unserer Skelettgraphenapproximation das k-Set-Agreement-Problem in Systemen, wo PSources(k) hält, lösen kann.

This thesis investigates various aspects of weak system models for agreement problems in fault-tolerant distributed computing. In Part I we provide an introduction to the context of this work, discuss related literature and describe the basic system assumptions. In Part II of this thesis, we introduce the Asynchronous Bounded-Cycle (ABC) model, which is entirely time-free. In contrast to existing system models, the ABC model does not require explicit time-based synchrony bounds, but rather stipulates a graph-theoretic synchrony condition on the relative lengths of certain causal chains of messages in the space-time graph of a run. We compare the ABC model to other models in literature, in particular to the classic models by Dwork, Lynch, and Stockmeyer. Despite Byzantine failures, we show how to simulate lock-step rounds, and therefore make consensus solvable, and prove the correctness of a clock synchronization algorithm in the ABC model. We then present the technically most involved result of this thesis: We prove that any algorithm working correctly in the partially synchronous Theta-Model by Le Lann and Schmid, also works correctly in the time-free ABC model. In the proof, we use a variant of Farkas' Theorem of Linear Inequalities and develop a non-standard cycle space on directed graphs in order to guarantee the existence of a certain message delay transformation for finite prefixes of runs. This shows that any time-free safety property satisfied by an algorithm in the Theta-Model also holds in the ABC model. By employing methods from point-set topology, we can extend this result to liveness properties. In Part III we shift our attention to the borderland between models where consensus is solvable and the purely asynchronous model. To this end, we look at the k-set agreement problem where processes need to decide on at most k distinct decision values. We introduce two very weak system models MAnti and MSink and prove that consensus is impossible in these models. Nevertheless, we show that (n-1)-set agreement is solvable in MAnti and MSink, by providing algorithms that implement the weakest failure detector L. We also discuss how models MAnti and MSink relate to the f-source models by Aguilera et al. for solving consensus. In the subsequent chapter, we present a novel failure detector L(k) that generalizes L, and analyze an algorithm for solving k-set agreement with L(k), which works even in systems without unique process identifiers. Moreover, We explore the relationship between L(k) and existing failure detectors for k-set agreement. Some aspects of L(k) relating to anonymous systems are also discussed. Next, we present a generic theorem that can be used to characterize the impossibility of achieving k-set agreement in various system models. This enables us to show that ([Sigma]_k,[Omega]_k) is not sufficient for solving k-set agreement. Furthermore, we instantiate our theorem with a partially synchronous system model. Finally, we consider the k-set agreement problem in round-based systems. First, we introduce a novel abstraction that encapsulates the perpetual synchrony of a run, the so called stable skeleton graph, which allows us to express the solvability power of a system via graph-theoretic properties. We then present an approximation algorithm where processes output an estimate of their respective component of the stable skeleton graph. We define a class of communication predicates PSources(k) in this framework, and show that PSources(k) tightly captures the amount of synchrony necessary for k-set agreement, as (k-1)-set agreement is impossible with PSources(k). Based on the stable skeleton approximation, we present an algorithm that solves k-set agreement when PSources(k) holds.

Additional information:

Zsfassung in dt. Sprache

License:

In Copyright

Appears in Collections:

Thesis