## Optimal atomic broadcast and multicast algorithms for wide area networks

### Schiper, Nicolas ; Pedone, Fernando

In this paper, we study the atomic broadcast and multicast problems, two fundamental abstractions for building fault-tolerant systems. As opposed to atomic broadcast, atomic multicast allows messages to be addressed to a subset of the processes in the system, each message possibly being multicast to a different subset. We require atomic multicast algorithms to be genuine, i.e., only processes... More

Add to personal list- Summary
- In this paper, we study the atomic broadcast and multicast problems, two fundamental abstractions for building fault-tolerant systems. As opposed to atomic broadcast, atomic multicast allows messages to be addressed to a subset of the processes in the system, each message possibly being multicast to a different subset. We require atomic multicast algorithms to be genuine, i.e., only processes addressed by the multicast message are involved in the protocol. Our study focuses on wide area networks where groups of processes, i.e., processes physically close to each other, are inter-connected through high latency communication links. In this context, we capture the cost of algorithms, denoted latency degree, as the number of inter-group message delays between the broadcasting (multicasting) of a message and its delivery. We present an atomic multicast algorithm with a latency degree of two and show that it is optimal. We then present the first fault-tolerant atomic broadcast algorithm with a latency degree of one. To achieve such a low latency, the algorithm is proactive, i.e., it may take actions even though no messages are broadcast. Nevertheless, it is quiescent: provided that the number of broadcast messages is finite, the algorithm eventually ceases its operation. As a consequence, in runs where the algorithm becomes quiescent too early, its latency degree is two. We show that this is unavoidable, and establish a lower bound on the quiescence of atomic broadcast algorithms. These two lower bound results stem from a common cause, namely the reactiveness of the processes at the time when the message is cast (broadcast or multicast). This reveals an interesting link between the quiescence of total order algorithms and the genuineness of atomic multicast, two problems which seemed to be unrelated at first sight.