Facoltà di scienze informatiche

On the many faces of atomic multicast

Coelho, Paulo ; Pedone, Fernando (Dir.)

Thèse de doctorat : Università della Svizzera italiana, 2019 ; 2019INFO004.

Many current online services need to serve clients distributed across geographic areas. Coordinating highly available and scalable geographically distributed replicas, however, is challenging. While State Machine Replication is the most direct way of achieving availability, no scalability comes from the traditional approach. Typically, scalability is obtained by partitioning the original... More

Add to personal list
    Summary
    Many current online services need to serve clients distributed across geographic areas. Coordinating highly available and scalable geographically distributed replicas, however, is challenging. While State Machine Replication is the most direct way of achieving availability, no scalability comes from the traditional approach. Typically, scalability is obtained by partitioning the original application state among groups of servers, which leads to further challenges. Atomic multicast is a group communication abstraction that groups processes, providing reliability and ordering guarantees, and can be explored to provide partially replicated applications a scalable and consistent alternative. This work confronts the challenges of providing practical group communication abstractions for crash fault-tolerant and Byzantine fault-tolerant (BFT) models. Although there are plenty of atomic multicast algorithms that tolerate crash failures, they suffer from two major issues: (a) high latency for messages addressed to multiple groups, and (b) low performance when proportion of messages to multiple groups is high. To solve the first problem and reduce the latency of multi-group messages, this work presents FastCast, an algorithm with unprecedented four communication delays. The second problem can be addressed by maximizing the proportion of single- group messages and eliminating additional communication among groups to execute operations. In this direction, this document introduces GeoPaxos, a protocol that partitions the ordering of operations like atomic multicast while still keeping the state fully replicated. In the BFT model, the task is more challenging, since servers can behave arbitrarily. This thesis presents ByzCast, the first algorithm that tolerates Byzantine failures. ByzCast is hierarchical and introduces a new class of atomic multicast defined as partially genuine. Lastly, since at the very core of most strong consistent replicated system resides a consensus protocol, the thesis concludes with Kernel Paxos, a Paxos implementation provided as a loadable kernel module, providing at the same time high performance, and abstracting ordering from the application execution.