Failure detectors and communication efficiency in the crash and general omisión failure models

Cortiñas, Roberto

Failure detectors and communication efficiency in the crash and general omisión failure models

Cortiñas, Roberto

Dirigée par:

Alberto Lafuente Rojo Directeur/trice
Iratxe Soraluce Arriola Directeur/trice

Université de défendre: Universidad del País Vasco - Euskal Herriko Unibertsitatea

Fecha de defensa: 04 avril 2011

Jury:

Sergio Arévalo Viñuales President
Mikel Larrea Álava Secrétaire
José Javier Astrain Escola Rapporteur
Felix Freiling Rapporteur
Ernesto Jimenez Merino Rapporteur

Département:

Arquitectura y Tecnología de Computadores

Type: Thèses

Teseo: 310115 DIALNET ADDI editor

Résumé

Consensus is one of the fundamental problems in fault tolerant distributed systems. In addition to the importance of the problem itself, consensus can be a way to solve many other problems in distributed systems, so it is considered a key topic in the Distributed Computing area. Although many solutions have been proposed to solve consensus in synchronous systems, [Fischer, Lynch, and Paterson, 1985] presented an impossibility result, namely Fischer-Lynch-Paterson or FLP, that states that it is impossible to reach consensus in asynchronous systems where even one process may crash. In order to circumvent FLP, [Chandra and Toueg, 1996] proposed the unreliable failure detector abstraction, which has been widely studied in several systems, specially those where processes can only fail by crashing. Failure detectors offer a modular approach that allows other applications such as consensus to use them as a building block. Additionally, the failure detector abstraction allows to encapsulate the synchrony assumptions of the system, so that applications which make use of failure detectors can be designed as if they run in pure asynchronous systems. In this work we show that failure detectors can also be applied to the general omission failure model, in which processes may fail by crashing and by omitting messages either when sending or receiving. As a practical example, we propose a solution to a security area problem called Secure Multiparty Computation by using failure detectors for general omission. In the context of failure detectors in the crash failure model we also study communication efficiency, a performance measure achieved when there are only n links that carry messages forever, being n the number of processes. We improve this measure by defining communication optimality, in which only c links are needed, being c the number of correct processes. In this regard, we propose some communication-optimal implementations of the eventually perfect failure detector class <>P. Finally, we propose a communication-efficient implementation of a failure detector for the general omission failure model. In this case, we define communication efficiency as a linear number of links carrying messages forever.