Brief Announcement: Validity Bound of Regular Registers with Churn and Byzantine Processes ∗

(1)

Brief Announcement: Validity Bound of Regular Registers

with Churn and Byzantine Processes

^∗

Roberto Baldoni Silvia Bonomi Amir Soltani Nezhad

Dipartimento di Informatica e Sistemistica "Antonio Ruberti"

Università degli Studi di Roma "La Sapienza"

Via Ariosto 25, I-00185 Roma, Italy

baldoni,bonomi,amir@dis.uniroma1.it

ABSTRACT

This paper studies the problem of building a byzantine fault tolerant storage service in a distributed system affected by servers join and leave (i.e., servers churn). We show a bound for ensuring both validity of read operations and the persis- tence of a value written by a write operation. This bound correlates the churn rate, the number of faulty processes and the time taken by register operations (i.e., join, read and write operations).

Categories and Subject Descriptors

H.3.4 [Information Storage and Retrieval]: System and Software—distributed systems; D.4.2 [Operating Systems]:

Storage Management—distributed memories; D.4.5 [Operating Systems]: Reliability—fault-tolerance

General Terms

Reliability, Theory

Keywords

Regular Register, Dynamic Distributed Systems, Byzantine failure, Churn.

1. INTRODUCTION

Building a storage able to tolerate arbitrary failures is a central component in any mission-critical system that has to ensure both correct and highly-available operation execu- tions despite software and accidental errors as well as ma- licious operations. Protocols that can tolerate such failures are said to be Byzantine Fault-Tolerant (BFT) [5]. In the re- cent years, BFT storage has become a first class abstraction in large scale distributed systems to implement services such as directory services, sequencer services, namespaces, key management etc. These services are usually implemented by dozens of servers geographycally deployed over large scale settings (e.g. interconnected datacenters) that notoriously have to withstand various types, patterns, degrees and rates of churn. In such environments, churn creates unpredictable patterns of servers that leave and join a computation and no

∗This work is partially supported by the European STREP project SM4All

Copyright is held by the author/owner(s).

ACM X-XXXXX-XX-X/XX/XX.

assumption can be done on the duration of the churn period and on the distance in time between two successive churn periods (i.e., churn is non-quiescent). Thus, churn adds a new complexity dimension to the implementation of BFT storage, and it must be handled to avoid the storage to become unavailable.

To the best of our knowledge, this is the first paper analyzing implications of building a BFT storage under churn. Imple- mentations of registers resilient to crash failures have been provided indeed on the top of a distributed system prone to quiescent and non-quiescent churn respectively.

2. SYSTEM MODEL

The distributed system is composed of a universe of clients Uc(i.e. clients system) and of a disjoint universe of servers Us(i.e. servers system). The clients system is composed of a finite arbitrary number of processes, while the servers system is dynamic, i.e. processes may join and leave the system at their will. A server enters the server system by executing the join System procedure. Such an operation aims at connect- ing the new process to both clients and servers that already belonged to the system. A server leaves the system by means of the leave System operation. In order to model processes continuously arriving to and departing from the system, we assume the infinite arrival model. The set of processes that can participate in the servers system, i.e. the server system population, is composed of a potentially infinite set of processes Us= {. . . si, sj, sk. . . }, each one having a unique identifier (i.e. its index). However, the servers system is composed, at each time, of a finite subset of the population.

Clients and servers can communicate only by exchanging messages through reliable and authenticated channels. We assume the existence of a protocol managing the arrival and the departure of servers from the distributed system; such a protocol is also responsible for the connectivity maintenance among processes of the system.

Distributed Computation. At time t0, when the server system is set up, n servers belong to the servers computation. A server si, belonging to the servers system, that wants to join the distributed computation has to execute the join Server() operation. Such an operation, invoked at some time t, is not instantaneous and takes time to be executed;

how much this time is, depends on the specific implementation provided for the join Server() operation. However, from time t, the server si can receive and process messages sent by any other processes participating in the computation.

When a server sjparticipating in the distributed computation wishes to leave the computation, it executes the leave Server() operation. We model the leave Server() operation as

(2)

an implicit operation, i.e., when a process pileaves the computation, it just stops to send and process messages related to the register computation. Moreover, we assume that if a server leaves the computation and later wishes to re-join, it executes again the join Server() operation with a new iden- tity. We assume at most f ≤ n/3 servers can deviate from their specifications during the whole life of the computation.

The remaining servers are correct.

Non-Quiescent Churn. The computation alternates periods of churn (Tc) and periods of stability (Ts). More specif- ically, there exist some periods Tc, T_c⁰, T_c⁰⁰ etc. in which servers join and leave the computation, and between two churn periods Tcand T_c⁰, there exists a period Tswhere the computation becomes stable and no join or leave operations occur. However, no assumption is made about how long Tc

and Ts are. During the churn periods, we assume that the churn refreshes a fraction of the computation participants at each time unit. More precisely, we define as churn rate, de- noted c, the percentage of processes that are “refreshed” at every time unit (c ∈ [0, 1]). This means that while the number of servers remains constant (equal to n), in every time unit in Tc, c × n processes leave the computation and the same number of processes invok the join Server() operation.

3. REGULAR REGISTER

A register is a shared variable accessed by a set of processes, i.e. clients, through two operations, namely read() and write(). We consider a regular register specified as fol- lows:

Termination: If a correct process participating in the computation invokes an operation and does not leave the system, it eventually returns from that operation.

Validity: A read operation returns the last value written be- fore its invocation, or a value written by a write operation concurrent with it.

Due to the presence of churn, we add one more property, namely write persistency, that an algorithm has to satisfy:

WritePersistency: Servers maintain the last value written by a write operation despite server departures.

Protocol Model. A protocol Preg implementing a regular register is a collection of distributed algorithms, one for each operation (i.e. Preg={AJ S, AR, AW} where AJ S, AR, AW

are respectively the distributed algorithms implementing the join Server(), the read() and the write() operations). Each algorithm is composed of a sequence of computation steps and communication steps. A computation step is represented by the computation executed locally to each process while a communication step is represented by the sending and the delivering events of a message.

A register is maintaied by a set of servers (possibly a subset of the ones belonging to the computation). No agreement abstraction is assumed to be available at a server. Clients do not maintain any register information; they can just trig- ger operations and interact with servers through message exchanges. Moreover, we assume that each server has the same role in the distributed computation (i.e. there is no special server acting as a coordinator).

For the sake of simplicity, we assume a single register repli- cated at each server belonging to the computation.

4. VALIDITY BOUND

Consider a generic protocol Preg implementing a regular register such that (i) every operation terminates and that (ii) there exists a period of churn longer than the longest operation issued on the register. Starting form the lower bound of 3f + 1 processes needed to tolerate f failures [7], in [3] we prove the following Theorem.

Theorem 1. Let A^{J S}, ARand AW be the algorithms implementing respectively join Server(), read() and write() operations. Let ∆tj, ∆tr and ∆tw be the maximum time in- tervals needed by AJ S to terminate an operation. If c ≥ min{_n×∆t^n−3f

r,_n×(∆t^n−3f

j+∆t_w)}, then it is not possible to ensure both the write-persistency and the read-validity.

Interestingly, the bound is affected by the time taken by the operations to be executed, which depends on the algorithm implementing the operations and on the synchrony assumptions of the underlying distributed systems. In the case of a synchronous distributed system, the values of ∆tj,

∆tr and ∆tw are finite and thus, the value of c can be cal- culated exactly. Moving towards an asynchronous system, the values of ∆tj, ∆trand ∆twcan only be estimated; thus, there is no more a deterministic guarantee that the system works, but it is still possible to provide an estimation of the churn rate that can be tolerated by the system. This result confirms the impossibility result found in [2] that is no algorithm can implement a regular register in the presence of non quiescent churn on the top of a fully asynchronous distributed system.

5. FUTURE WORK

This paper has established a bound for ensuring validity of read operations issued on a regular register which is able to resist arbitrary failures and non-quiescent churn. Future work will study if register storage is weaker than consensus not only in distributed systems prone to both crash failures and quiescent churn (as shown in [1]) but also in a setting including byzantine failures.

6. REFERENCES

[1] Aguilera M. K., Keidar I., Malkhi D., Shraer A., Dynamic atomic storage without consensus, in Proceedings of 28^th Annual ACM Symposium on Principles of Distributed Computing (PODC) 2009, 17-25

[2] Baldoni R., Bonomi S., Kermarrec A.M., Raynal M., Implementing a Register in a Dynamic Distributed System, in Proceedings of the 29th IEEE International Conference on Distributed Computing Systems (ICDCS’09), IEEE Computer Society Press, Montreal (Canada), June 2009.

[3] Baldoni R., Bonomi S., Soltani Nezhad A. Regular Registers in Dynamic Distributed Systems with Byzantine Processes: Bounds and Performance Analysis, Technical report - MIDLAB 3/11 - 2011

http://www.dis.uniroma1.it/~midlab/articoli/

[4] Lamport. L., On Interprocess Communication, Part 1:

Models, Part 2: Algorirhms, Distributed Computing, 1(2):77-101, 1986.

[5] Lamport L., Shostak R. E., Pease M. C., The Byzantine Generals Problem, ACM Transactions on Programming Languages and Systems (TOPLAS) 4(3), 382-401, 1982.

[6] Malkhi D., Reiter M. K. Byzantine Quorum Systems, Distributed Computing 11(4): 203-213, 1998

[7] Martin J., Alvisi L., Dahlin M.. Minimal Byzantine Storage, in Proceedings of the 16th International Symposium on Distributed Computing (DISC 2002), pp.

311-325, 2002.