
INFORMATICS AND STATISTICS

MASTER'S DEGREE COURSE IN COMPUTER ENGINEERING

MASTER'S THESIS

THE COOKING PHILOSOPHERS

Scheduling Heterogeneous Resources in Domotic Environments

Candidate: Bruno Ambrogio

Advisor: Prof. Roberto Baldoni

Co-advisor: Dott.ssa Silvia Bonomi

Academic Year 2010/2011

To my family.

Contents

List of Figures
List of Algorithms
Abstract
Acknowledgements

1 Introduction
1.1 Context
1.2 Thesis Contribution
1.3 Chapter Organization

2 Related Works
2.1 Concurrency in Database Management Systems
2.1.1 The ACID Properties
2.1.2 Concurrent execution of transactions
2.1.3 Lock-Based Concurrency Control
2.1.4 Optimistic Concurrency Control
2.1.5 Timestamp-Based Concurrency Control
2.2 Concurrency in Operating Systems
2.2.1 Scheduling Criteria
2.2.2 Scheduling Algorithms
2.2.3 Multiple Processors Scheduling
2.2.4 Deadlock Management
2.3 The Mutual Exclusion Problem
2.3.1 Dijkstra's Algorithm
2.3.2 Lamport's Bakery Algorithm
2.3.3 The Philosophers Saga
2.4 Discussion

3 System Model and Problem Statement
3.1 System Model
3.2 Problem Formulation and Properties
3.2.1 Conditions for Strong No-Starvation
3.3 Task and Activity Representation
3.3.1 Task Graphs
3.3.2 Activity Graphs

4 A Framework for Resource Allocation
4.1 Deadlock Avoidance Policies
4.1.1 Resource Allocation Weighted Graph
4.1.2 Timestamp-based rule
4.2 Centralized Lock Manager
4.2.1 The Algorithm
4.2.2 Algorithm Instances
4.2.3 The SELECT functions

5 Validation
5.1 Infinite Quanta, Infinite Lifetime
5.1.1 Summary
5.2 Finite Quanta, Infinite Lifetime
5.2.1 Summary
5.2.2 Effects of the ReturnPredicate
5.3 Infinite Quanta, Finite Lifetime
5.4 Resource Usage and SELECT functions

6 Conclusions

Appendix A: Petri Nets to model Tasks and Activities
Appendix B: Contention Degree
Appendix C: Omitted Charts

List of Figures

2.1 Precedence graph in which three transactions are involved
2.2 Strict-2PL Algorithm
2.3 Waits-for graph
2.4 Prediction of next CPU burst
2.5 Multilevel Queue Scheduling
2.6 Conflicts between processes represented as a graph
2.7 Analogy of the Driving Philosophers problem with a roundabout

3.1 Locking and Unlocking Mechanism
3.2 Example of a task graph
3.3 Example of an activity graph with eight tasks

4.1 Example of resource allocation weighted graph
4.2 Deadlock example
4.3 Example with a cycle but deadlock-free
4.4 Timestamp-based rule example
4.5 Lock Processing
4.6 Unlock Processing

5.1 Completed Tasks per Second
5.2 Restarts and Activity Latency
5.3 Waiting Queue
5.5 Restarts and Latencies comparison with RAWG
5.6 Restarts and Latencies comparison with WAITDIE
5.7 Waiting Queue size comparison with RAWG
5.8 Waiting Queue size comparison with WAITDIE
5.9 Waiting Queue size trend during a simulation
5.10 Completed Activities and Permanent Aborts with 1 resource per task
5.11 Activity Latencies and Restarts with 1 resource per task
5.12 Completed Activities and Permanent Aborts with 3 resources per task
5.13 Activity Latencies and Restarts with 3 resources per task
5.14 Completed Activities and Permanent Aborts comparison - RAWG
5.15 Completed Activities and Permanent Aborts comparison - WAITDIE
5.16 Activity Latencies and Restarts comparison - RAWG
5.17 Activity Latencies and Restarts comparison - WAITDIE
5.18 Completed Activities and Permanent Aborts with returnPredicate
5.19 Activity Latencies and Restarts with returnPredicate
5.20 Completed Activities and Permanent Aborts with 3 resources per task
5.21 Activity Latencies and Restarts with 3 resources per task
5.22 Consumption distribution with serial activities
5.23 Consumption distribution with maximum concurrency degree
5.24 SELECT functions impact on the resource set

1 The Task Graph for T
2 The Petri net modeling the task T
3 Activity with serial tasks modeled as a Petri net
4 Activity with parallel tasks modeled as a Petri net
5 Modeling resources that could disappear and/or reappear
6 Our nets are not siphon-circuit or trap-circuit nets
8 Comparisons - 10 activities
9 Throughput - 20 activities
10 Comparisons - 20 activities
11 Throughput - 40 activities
12 Comparisons - 40 activities
13 Throughput - 50 activities
14 Comparisons - 50 activities
15 Throughput comparison - RAWG2
16 Comparisons - RAWG2
17 Throughput comparison - WAITDIE2
18 Comparisons - WAITDIE2
19 Finite reusability - 2 resources per task
20 Finite reusability - 4 resources per task
21 Finite reusability - 5 resources per task
22 Finite reusability - RAWG2 comparison
23 Finite reusability - WAITDIE2 comparison
24 Finite lifetime - 1 resource per task
25 Finite lifetime - 2 resources per task
26 Finite lifetime - 4 resources per task
27 Finite lifetime - 5 resources per task
28 Consumption distribution, serial activities - WAITDIE
29 Consumption distribution, concurrent activities - WAITDIE

List of Algorithms

1 Process code for Mutual Exclusion
2 Dijkstra Algorithm
3 Bakery Algorithm
4 Lock Manager - LOCK message processing
5 Lock Manager - UNLOCK message processing
6 Lock Manager - Waiting Queue Observer
7 Lock Manager - Lifetime Observer

Abstract

We introduce a new concurrency management problem, useful for example in smart services for home automation. The name comes from the following scenario: suppose that the philosophers Socrate and Platone are in a kitchen. The former would like to bake a cake, while the latter wants to prepare fresh pasta. Both of them need eggs to complete their activity. Precisely, suppose that Socrate needs 3 eggs while Platone needs 4. However, there are only 5 eggs in the fridge, so only one of our philosophers can complete his activity. This is only one of the scenarios considered in our problem: the eggs are going to expire after some time, and even the oven is going to be used by both, although it does not have the same characteristics as the eggs.

Our problem therefore considers a wide range of resources with extremely different characteristics (lifetime, reusability, single and multiple instances, and so on). We also have to deal with deadlock issues: on this topic, this thesis presents a theorem that allows deadlocks to be detected and avoided in our context. A solution is given by means of a centralized algorithm, with different implementations and behaviours.

Acknowledgements

For me too, the time has come to write the acknowledgements page of my master's thesis, perhaps the most difficult page of the entire work. Five years have passed since the beginning of the university journey that brought me here. With these pages an important chapter of my life comes to a close, making room for a brand new one, no longer made of exams and theses to prepare.

The first people I would like to thank are my Mom and Dad, who have always been by my side, through both the sacrifices and the satisfactions these years have brought.

I embrace my sister Nadia, and Emiliano, who have always encouraged me and helped me in my choices.

If I have come this far, the credit also goes to all of them, who not only made it possible for me to study, but also gave me the desire to do so, instilling in me the importance of studying ever since I was a child.

I would also like to thank Prof. Roberto Baldoni and Dott.ssa Silvia Bonomi, who supervised my work, and all my colleagues and friends from the university, with whom I shared five years of life, united by the same goals.

Thanks to my dear friend Giorgio, who holds me in great esteem, and with whom the dream of becoming famous through computer science is still alive.

Last, but absolutely not least in importance, I thank Martina, companion of my joys and misfortunes over this past year, always fascinated by the sometimes 'hidden' world of computer science, who always believes in me and stays close to me even in the hardest moments.

1 Introduction

This master's thesis formalizes, discusses and gives a first solution to a new concurrency management problem in which completely heterogeneous resources are considered. The focus is on smart houses, classical examples of environments in which heterogeneous resources coexist. Nevertheless, the generality maintained in the following makes it possible to apply the model to any scenario in which resources can be well modeled using our formalism.

1.1 Context

Nowadays, home automation represents a growing market. Today's systems are mainly based on ad-hoc, proprietary solutions, with little interoperability and smart integration. Systems like SM4ALL (Smart hoMes for All, http://www.sm4all-project.eu) [16] are meant to be part of the environment where humans live, and are thus designed to allow interaction with multiple users at the same time. This means that several actions, involving different users and devices, can be executed simultaneously. In this context, several issues about concurrency arise. Unlike the well-known concurrency management settings in the literature, however, the home automation scenario introduces a strong heterogeneity of the resources, which in this case are physical objects. The number of instances of a single resource matters, and so does their lifetime, used to model the fact that a physical object can expire after some time. Furthermore, physical objects are subject to consumption due to their normal usage.

1.2 Thesis Contribution

In [15], Georgia Tech introduces a domotic home built for older adults, with the goals of compensating for physical decline, aiding recall of past actions and supporting awareness for extended family members. The work also evaluates the likelihood that older adults will use these technologies in their daily lives.

The project Cyber Crumbs, presented in [21], aims at guiding blind users by equipping them with a smart badge and pervading the environment with active landmarks. Another smart home targeting elderly citizens is the Gator Tech Smart House, presented in [10].

Many research projects on smart homes for assisting people with physical or mental disabilities are thus under way. However, none of the existing solutions directly addresses the concurrency problems arising when multiple users interact with the system at the same time. The need for concurrency management in these scenarios is motivated for the first time in [3].

1.3 Chapter Organization

In chapter 2, related works are discussed, ranging from concurrency control in DBMSs and operating systems to the mutual exclusion problem, introduced for the first time by Dijkstra in [7]. Finally, the dining, drinking and driving philosophers problems are briefly introduced.

Chapter 3 defines the problem and the system model in which we operate, giving the definitions of the basic objects that we are going to use and stating some properties of the problem.

A framework for resource allocation is described in chapter 4, while its validation is given in chapter 5, where several tests are presented.

Finally, chapter 6 concludes this thesis, discussing some starting points for possible future works.

2 Related Works

In this chapter we will see how concurrency is managed in different contexts. First of all, we focus on the issues that concurrency raises in a DBMS, where data is seen as a resource. Then we move to operating systems, in which the main resource to schedule is the CPU. Finally, we review the mutual exclusion problem, in which we want to avoid concurrent accesses to a critical section of code.

2.1 Concurrency in Database Management Systems

The foundation for concurrent execution in a DBMS is the concept of transaction. A transaction is an execution of a user program, seen by the DBMS as a series of read and write operations. For performance reasons, a DBMS should execute several transactions in an interleaved fashion. The execution of a transaction in fact involves both CPU and I/O activities: whenever the CPU needs data that has to be loaded from external storage, it has to wait until the I/O activity finishes, which results in high idle times. Executing transactions one by one would therefore keep the CPU idle for long periods, reducing the number of transactions that can be completed in a given time, i.e., reducing the system throughput.

Since I/O activity can proceed in parallel with CPU activity, while the DBMS is waiting for data from disk it can use the CPU to carry on some other transaction. Concurrent execution is thus necessary to increase the throughput and to decrease the average time needed to complete a transaction, i.e., the response time.

However, interleaving introduces some issues to deal with. Consider the case in which two concurrent transactions use the same object from the database. Suppose the first transaction reads the object and then the second one updates it, changing its value. Now the first transaction reads the object again. This second read returns a different value for the data object, even though the first transaction has not modified it: a common anomaly that a DBMS must avoid.

A DBMS must ensure that the effect of a concurrent execution of transactions on a database is exactly equivalent to the effect of a serial execution of the same transactions. In other words, from the point of view of a user executing a transaction, the existence of other running transactions must be completely transparent.

2.1.1 The ACID Properties

In order to provide some reliability in the processing of transactions, the DBMS must guarantee four important properties of transactions, called the ACID properties, defined in [9]:

• ATOMICITY: The execution of a transaction is seen by the user as an atomic operation: either all actions are carried out or none are.

• CONSISTENCY: Each transaction, executed alone on the database, must preserve the consistency of the data. The DBMS assumes that consis- tency holds for each transaction.

• ISOLATION: Transactions must be isolated from the effects of other concurrently executing transactions. In other words, each transaction must see the system as if there are no other running transactions, i.e., there is no interleaving.

• DURABILITY: Once the DBMS confirms that a transaction has been successfully completed, its effects must be permanent on the database, even if some crash occurs.


Consistency must be taken into account by users. A user must write transactions that, starting from a consistent database state, leave it in a consistent state. The DBMS cannot detect inconsistencies due to errors in the user's transaction code.

Atomicity and durability are needed in order to deal with incomplete transactions. A transaction can be aborted (terminated unsuccessfully) either because some anomaly arises during the execution (in this case it is aborted by the DBMS) or because it runs into an unexpected situation, such as reading unexpected data, and decides to abort (in this case it terminates itself). Once the DBMS aborts a transaction, the transaction is automatically restarted.

For the sake of atomicity, the DBMS must roll back the effects of the incomplete execution of a transaction by undoing the operations it performed before its interruption. In order to be able to do such a rollback, the DBMS logs all modifications of the database. This log is also used to ensure the durability property in case of system crashes.

2.1.2 Concurrent execution of transactions

The DBMS sees a transaction as a list of reads and writes of database objects. In addition, each transaction has a final action, which is either commit, in case it completes successfully, or abort, otherwise. We hereby make two assumptions:

1. There is no direct interaction between transactions. They interact with each other only via database read and write operations.

2. A database is made up of a fixed collection of independent objects. So, no insertions or deletions are taken into account.

A list of actions from a set of transactions such that, if operation Op1 comes before operation Op2 in a transaction T, they appear in the same order in the list, is called a schedule. It is, in other words, a potential execution sequence as seen by the DBMS. A transaction could in fact also carry out other operations, for example evaluating arithmetic expressions; those operations are not shown in a schedule.

If a schedule contains all the actions of all transactions, it is called a complete schedule. If the actions in the list are not interleaved, the schedule is called serial.

A schedule over a set S of committed transactions whose effect on any consistent database instance is guaranteed to be identical to that of some complete serial schedule over S is called serializable. This means that its effects on the database are identical to the effects obtained by executing the transactions in some serial order. Of course, a DBMS must not allow non-serializable schedules to be executed. In [11] and [25] we can find a comprehensive collection of papers about concurrency control in DBMSs. The performance of various concurrency control algorithms is discussed in [1], [23] and [24].

Anomalies

Due to the interleaving of transaction actions, some anomalies can appear in the data; that is, the database can be brought to an inconsistent state. Intuitively, an anomaly can occur only if some transaction changes the value of some object, i.e., if there is a write action. Two actions on the same object, at least one of which is a write, are called conflicting actions. Thus, there are three types of conflicts: write-read (WR), read-write (RW) and write-write (WW). Here we present an example for each kind of conflict:

• WR: UNCOMMITTED READS

Suppose a transaction T1 reads a data object A that has been modified by another transaction T2 that has not committed yet. Such a read is called a dirty read. Suppose that T2 is not going to commit: in this case, T1 will work on a dirty value for the object A.

• RW: UNREPEATABLE READS

We expect that if, during the execution of a transaction, we read the same object twice without modifying it, we get the same value. However, due to the interleaving, it is possible that, between the two reads, another transaction modifies the value of that object. In this situation, the second read of the first transaction returns a different value for that object, i.e., the isolation property is violated.

• WW: OVERWRITING UNCOMMITTED DATA

Say we have two transactions T1 and T2, both working on the same data objects A and B. T1 sets A and B to 10, while T2 sets A and B to 15. Due to the interleaving, it can happen that T2 sets A to 15, then T1 sets A to 10, then T1 sets B to 10, and finally T2 sets B to 15. At the end we come up with A=10 and B=15. It is easy to see that no serial execution of T1 and T2 leaves the database in this state, so the schedule is not serializable.
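To make the WW example concrete, here is a minimal sketch in Python (the run helper and the object names are purely illustrative) that replays the interleaved schedule above and compares it with the two possible serial executions:

```python
# Replay the WW (overwriting uncommitted data) example:
# T1 sets A and B to 10; T2 sets A and B to 15.

def run(schedule):
    """Apply a list of (object, value) writes to a fresh database."""
    db = {"A": None, "B": None}
    for obj, value in schedule:
        db[obj] = value
    return db

interleaved = [("A", 15), ("A", 10), ("B", 10), ("B", 15)]  # T2, T1, T1, T2
serial_t1_t2 = [("A", 10), ("B", 10), ("A", 15), ("B", 15)]
serial_t2_t1 = [("A", 15), ("B", 15), ("A", 10), ("B", 10)]

print(run(interleaved))   # {'A': 10, 'B': 15} -- matches neither serial run
print(run(serial_t1_t2))  # {'A': 15, 'B': 15}
print(run(serial_t2_t1))  # {'A': 10, 'B': 10}
```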

In case of aborting transactions, other issues arise. Consider a transaction T1 that updates a data object A, and a transaction T2 that then reads A and uses its value during its execution. If T1 aborts, the result is that T2 has been working on a dirty copy of A, produced by the write action of T1 before its abort. By the atomicity property, since T1 has aborted, the effects of that write must be canceled. Inevitably, T2 must also be aborted, since otherwise it would leave the database in an inconsistent state. It is easy to see that this situation can occur even with more than two transactions: all transactions that read a value written by an aborting transaction must also be aborted, resulting in cascading aborts. A schedule in which every transaction commits only after all transactions whose changes it read have committed is called recoverable. If a schedule is recoverable, we can also easily avoid cascading aborts.

In short, a DBMS must ensure that only serializable and recoverable schedules are executed.

2.1.3 Lock-Based Concurrency Control

A locking protocol is a set of rules to be followed by each transaction to ensure that no anomalies can arise. In other words, even if the actions of all transactions are interleaved, the final effect on the database is the same as that of some serial execution of the same transactions.

The fundamental rule for transactions in this kind of concurrency control is:

A transaction can access a data object A only if it holds a lock on A.

Of course, two transactions both reading (and not modifying) the same object A can execute concurrently: in this case, a shared lock on A can be granted to both. For write operations, we must ensure that no other transaction is reading or modifying the same object, so a transaction that wants to change the value of A must hold an exclusive lock on A, which can be granted only if no other lock on A (shared or not) has been granted. With an exclusive lock, of course, a transaction can also read the object. Moreover, if a transaction wants to write A and it already holds a shared lock on A, it can upgrade the lock to an exclusive one.

A well-known algorithm for lock-based control is Two-Phase Locking (2PL). The rule to follow for this algorithm is:

A transaction cannot request additional locks once it releases any lock

This results in each transaction having a growing phase, in which all locks are acquired, followed by a shrinking phase, in which it releases all its locks.

A schedule is said to be strict if a value written by a transaction T is neither read nor overwritten by other transactions until T either aborts or commits. Strict schedules are recoverable and also avoid cascading aborts. A variant of the 2PL rule improves on it by adding the strictness property, resulting in the Strict-2PL protocol, whose rule is:

All locks held by a transaction are released when the transaction is completed.

Under this protocol, a transaction that writes an object holds the exclusive lock until it commits or aborts. Hence, no other transaction can either read or modify the same object until the last transaction that wrote it completes.

Locking protocols guarantee that non-serializable schedules are not executed. This can be shown by proving that the schedules accepted by the 2PL and Strict-2PL protocols form a subset of the set of serializable schedules, called conflict serializable schedules.

Two schedules are called conflict equivalent if each pair of conflicting actions appears in the same relative order in both schedules. Hence, a schedule is conflict serializable if it is conflict equivalent to some serial schedule. Every conflict serializable schedule is also serializable, but the converse is not true.

All conflicts between transactions in a schedule S can be represented in a graph called the precedence graph. Such a graph has a node for each transaction in S, and an edge from Ti to Tj if an action of Ti precedes and conflicts with one of Tj's actions. A schedule is conflict serializable if and only if its precedence graph is acyclic. If so, a topological order of the graph can be computed, and it represents a conflict equivalent serial schedule.
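As an illustration, the acyclicity test can be sketched as follows (Python; precedence_graph and conflict_serializable are hypothetical helper names, and schedules are simplified to (transaction, action, object) triples):

```python
from collections import defaultdict, deque

def precedence_graph(schedule):
    """Edges Ti -> Tj for each pair of conflicting actions, Ti first."""
    edges = defaultdict(set)
    txns = {t for t, _, _ in schedule}
    for i, (ti, ai, oi) in enumerate(schedule):
        for tj, aj, oj in schedule[i + 1:]:
            if ti != tj and oi == oj and "W" in (ai, aj):  # conflicting pair
                edges[ti].add(tj)
    return txns, edges

def conflict_serializable(schedule):
    """Kahn's topological sort: conflict serializable iff the graph is acyclic."""
    txns, edges = precedence_graph(schedule)
    indeg = {t: 0 for t in txns}
    for src in edges:
        for dst in edges[src]:
            indeg[dst] += 1
    queue = deque(t for t in txns if indeg[t] == 0)
    order = []
    while queue:
        t = queue.popleft()
        order.append(t)
        for dst in edges[t]:
            indeg[dst] -= 1
            if indeg[dst] == 0:
                queue.append(dst)
    return len(order) == len(txns), order  # order = equivalent serial schedule

# The WW example above: T2 -> T1 on object A, T1 -> T2 on object B.
ok, order = conflict_serializable(
    [("T2", "W", "A"), ("T1", "W", "A"), ("T1", "W", "B"), ("T2", "W", "B")])
print(ok)  # False: the precedence graph contains the cycle T2 -> T1 -> T2
```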

Fig. 2.1: Precedence graph in which three transactions are involved

The Strict-2PL protocol allows only conflict serializable schedules, since it ensures that the precedence graph of any schedule it allows is acyclic. A serial order can be obtained by looking at the order in which transactions commit. It can be shown that even nonstrict 2PL ensures acyclicity of the precedence graph and therefore allows only conflict serializable schedules; in this case, a serial schedule is given by the order in which transactions enter the shrinking phase.

Lock Management

The part of the DBMS that keeps track of the locks issued to transactions is called the Lock Manager. All the information about locks is stored in a lock table, which is a hash table keyed by the data object. Information about transactions is stored in a transaction table, which contains a descriptive entry for each transaction in which the DBMS stores a pointer to the list of locks held by that transaction.

In the Strict-2PL protocol, when a transaction needs a lock on an object, it issues a lock request to the lock manager:

1. If the requested lock is a shared one, the queue of requests is empty, and the object is not currently locked in exclusive mode, the lock manager grants the lock and updates the lock table.

2. If the requested lock is an exclusive one and no transaction currently holds a lock on the object (which also implies that the queue of requests is empty), the lock manager grants the lock and updates the lock table.

3. Otherwise, the requested lock cannot be immediately granted and the request is added to the queue of lock requests for that object. The issuing transaction is suspended.

Fig. 2.2: Strict-2PL Algorithm

For nonstrict 2PL, for each lock request the lock manager must also check that the issuing transaction has not entered its shrinking phase, in order not to violate the 2PL rule. When a transaction aborts or commits, it releases all its locks. At this point the DBMS updates its data structures and examines the lock requests at the head of the waiting queues of the objects locked by the completed transaction. Of course, in order to allow different instances of the lock manager to execute concurrently, accesses to the lock table must be synchronized.
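The grant and release rules above can be condensed into a toy, single-threaded sketch (Python; the LockTable class and its method names are illustrative, not the thesis's lock manager, and lock upgrades, deadlocks and synchronization are ignored):

```python
from collections import defaultdict, deque

class LockTable:
    """Toy Strict-2PL lock table: one entry per database object."""
    def __init__(self):
        self.holders = defaultdict(dict)   # obj -> {txn: "S" or "X"}
        self.queue = defaultdict(deque)    # obj -> pending (txn, mode) requests

    def request(self, txn, obj, mode):
        holders = self.holders[obj]
        if mode == "S" and not self.queue[obj] and "X" not in holders.values():
            holders[txn] = "S"               # rule 1: grant a shared lock
            return True
        if mode == "X" and not holders:
            holders[txn] = "X"               # rule 2: grant an exclusive lock
            return True
        self.queue[obj].append((txn, mode))  # rule 3: suspend the transaction
        return False

    def release_all(self, txn):
        """On commit/abort, free the locks and wake compatible waiters (FIFO)."""
        for obj, holders in self.holders.items():
            if txn in holders:
                del holders[txn]
                while self.queue[obj]:
                    waiter, mode = self.queue[obj][0]
                    if mode == "S" and "X" not in holders.values():
                        holders[waiter] = "S"
                    elif mode == "X" and not holders:
                        holders[waiter] = "X"
                    else:
                        break                # head of queue still blocked
                    self.queue[obj].popleft()
```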


Deadlock Management

A DBMS usually deals with deadlock detection in two different ways: by looking at the waits-for graph or by a timeout mechanism.

In a waits-for graph, there is a node for each transaction, and there is an arc from Ti to Tj if and only if Ti is waiting for Tj to release a lock. Building and maintaining this graph is up to the lock manager. The waits-for graph is periodically checked for cycles, which indicate a deadlock.

Fig. 2.3: Waits-for graph

An alternative to the waits-for graph is to use a timeout mechanism to infer deadlocks: if a transaction has been waiting too long for a lock, we assume that it is in a deadlock cycle and abort it.

It is also possible to implement a deadlock prevention policy, even though empirical results show that deadlocks are relatively infrequent in DBMSs, and hence a deadlock detection approach works fairly well in practice. Deadlocks can be prevented by giving each transaction a priority and imposing some rule based on it. The idea is to rule out the possibility that a cycle appears in the waits-for graph. One way to achieve this is to assign each transaction a timestamp when it starts up. Younger transactions (i.e., those with higher timestamps) have lower priorities.

Two policies can be followed by the lock manager whenever a transaction Ti requests a lock on an object that is currently locked by transaction Tj:

• WAIT-DIE: If Ti has higher priority, it is allowed to wait; otherwise, it is aborted.

• WOUND-WAIT: If Ti has higher priority, Tj is aborted and the lock is granted to Ti; otherwise, Ti waits.

The wound-wait policy is a preemptive version of the wait-die rule.
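Both rules can be expressed compactly. The following sketch (Python; the function name is hypothetical) uses startup timestamps as priorities, with a lower timestamp meaning an older, higher-priority transaction:

```python
def on_conflict(policy, ts_requester, ts_holder):
    """Return the action for a lock conflict under wait-die or wound-wait.

    Lower timestamp means older transaction, hence higher priority.
    """
    if policy == "wait-die":
        # Older requester waits; younger requester dies (is aborted).
        return "wait" if ts_requester < ts_holder else "abort requester"
    if policy == "wound-wait":
        # Older requester wounds (aborts) the holder; younger requester waits.
        return "abort holder" if ts_requester < ts_holder else "wait"
    raise ValueError(f"unknown policy: {policy}")

print(on_conflict("wait-die", 3, 7))    # wait (requester is older)
print(on_conflict("wound-wait", 9, 2))  # wait (requester is younger)
```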


2.1.4 Optimistic Concurrency Control

Suppose we are in a system where contention on data is relatively low; then the overhead of lock management becomes significant. In optimistic concurrency control [12] we start from the premise that there are few conflicts among transactions, and we are as permissive as possible in allowing transactions to execute. The execution of a transaction is split into three phases:

• READ: The transaction reads all the values it needs from the database and copies them into its private workspace.

• VALIDATION: When the transaction is about to commit, the DBMS checks whether the transaction could possibly have conflicted with any other concurrently executing transaction. If there is a possible conflict, the transaction is aborted and restarted with a cleared workspace.

• WRITE: If the validation phase completes successfully, the changes made by the transaction in its private workspace are reflected on the database.

With this policy, if there are few conflicts and the validation phase can be done efficiently, we can achieve better performance than with locking mechanisms. On the other hand, if the number of conflicts grows, this approach aborts and restarts transactions repeatedly, hurting performance significantly.
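As a rough sketch of the three phases (Python; the backward-validation test used here, which compares the read set against the write sets of transactions committed during the read phase, is one common variant and not necessarily the exact scheme of [12]):

```python
class OptimisticTxn:
    """Toy optimistic transaction with read, validation and write phases."""
    history = []  # write sets of committed transactions, in commit order

    def __init__(self, db):
        self.db = db
        self.start = len(OptimisticTxn.history)  # txns committed before we began
        self.workspace = {}                      # private copies of objects
        self.read_set, self.write_set = set(), set()

    def read(self, key):
        if key not in self.workspace:
            self.workspace[key] = self.db[key]   # READ: copy into the workspace
        self.read_set.add(key)
        return self.workspace[key]

    def write(self, key, value):
        self.workspace[key] = value
        self.write_set.add(key)

    def commit(self):
        # VALIDATION: a transaction that committed while we were running and
        # wrote something we read is a possible conflict -> abort.
        if any(ws & self.read_set for ws in OptimisticTxn.history[self.start:]):
            return False                         # caller restarts the transaction
        # WRITE: reflect the private workspace on the database.
        for key in self.write_set:
            self.db[key] = self.workspace[key]
        OptimisticTxn.history.append(set(self.write_set))
        return True

db = {"A": 1}
t = OptimisticTxn(db)
t.write("A", t.read("A") + 1)
print(t.commit(), db)  # True {'A': 2}
```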

2.1.5 Timestamp-Based Concurrency Control

Timestamps can also be used for concurrency control. In this case too, each transaction Ti is assigned a timestamp ts(Ti) at startup. We can ensure at execution time that, if action Opi of Ti conflicts with action Opj of Tj, then Opi occurs before Opj if ts(Ti) < ts(Tj). If an action violates this ordering, the transaction is aborted and restarted. It is easy to see that every schedule satisfying the timestamp rule is conflict equivalent to the serial schedule obtained by considering transactions in the order of their timestamps, and hence every schedule that follows the timestamp rule is conflict serializable.

Every database object O is given:

• a read timestamp rts(O), equal to the highest timestamp among the active transactions that have read O,

• a write timestamp wts(O), equal to the highest timestamp among the active transactions that have written O,

• a committed write timestamp cwts(O), equal to the timestamp of the last committed transaction that has written O, and

• a commit bit cb(O), which is true if the last transaction that has written O has committed, and false otherwise.

If transaction T wants to read object O, the timestamp-based scheduler checks whether ts(T) ≥ wts(O). If so, T reads O and rts(O) = max(rts(O), ts(T)). If instead ts(T) < wts(O), T is aborted and restarted with a new, larger timestamp. This is necessary since the order of the read would violate the timestamp order with respect to the most recent write on the same object O.

For clarity, suppose that Tw is the transaction that executed the last write on O, wts(O) = 4 and ts(T) = 3. The action read_T(O) comes after write_Tw(O) in the schedule, so the timestamp rule is violated, since those operations should appear in the opposite order.

If transaction T wants to write O, the scheduler checks whether ts(T) < rts(O). If so, similarly to the previous case, the write action conflicts with the most recent read action on O, and T is therefore aborted and restarted with a new timestamp.

If ts(T) < wts(O), the last transaction that wrote O has a higher timestamp than the write operation the scheduler is considering. These two actions again violate the timestamp rule, so we should abort and restart T. However, ignoring the write of T is safe: if the two writes were in the right order, the latter would have overwritten the former anyway. This rule is called the Thomas Write Rule. If instead ts(T) ≥ wts(O), T can write O and wts(O) is set to ts(T).

The commit bit is used to avoid the dirty read anomaly: if T wants to read O, and O was previously written by another transaction that has not committed yet, T is put in a waiting state until the last transaction that wrote O commits or aborts. The same happens if T wants to write O and O was previously written by an uncommitted transaction.

The committed write timestamp is used during the rollback of a transaction. Indeed, if T aborts, we must set wts(A) = cwts(A) for each data object A written by T, and also set cb(A) = true, since the transaction that wrote A before T has surely committed.
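A condensed sketch of the read and write rules above, including the Thomas Write Rule (Python; the object state is reduced to the rts/wts timestamps, and the commit-bit waiting logic is omitted):

```python
def ts_read(txn_ts, obj):
    """Timestamp-ordering read rule. obj carries fields 'rts' and 'wts'."""
    if txn_ts < obj["wts"]:
        return "abort"                 # the read would follow a newer write
    obj["rts"] = max(obj["rts"], txn_ts)
    return "ok"

def ts_write(txn_ts, obj):
    """Timestamp-ordering write rule with the Thomas Write Rule."""
    if txn_ts < obj["rts"]:
        return "abort"                 # a newer transaction already read obj
    if txn_ts < obj["wts"]:
        return "skip"                  # Thomas Write Rule: ignore the write
    obj["wts"] = txn_ts
    return "ok"

o = {"rts": 0, "wts": 4}
print(ts_read(3, o))   # abort: the example in the text (ts(T)=3 < wts(O)=4)
print(ts_write(5, o))  # ok: wts(O) becomes 5
```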

Multiversion timestamps

In this variant, studied in [20] and [4], the idea is that a transaction never has to wait to read a database object. For each database object, several versions are maintained, each with a write and a read timestamp.

When T wants to read object O, the version with the highest write timestamp among those whose timestamp precedes ts(T) is selected. After the read, the read timestamp of that version is set to ts(T) if ts(T) > rts(O).

If T wants to write object O and ts(T) < rts(O), T is aborted and restarted with a new timestamp. Otherwise, a new version of O is created and its read and write timestamps are set to ts(T). In the former case, T must be aborted because another transaction T' with a higher timestamp has already read O: the changes made by T would not be seen by T', even though T' should have seen them.

This variant has the drawback of the cost of maintaining versions. However, when the workload of the DBMS is dominated by read operations, performance improves, since reads are never blocked.
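The version-selection rule can be sketched as follows (Python; mv_read is a hypothetical helper, and versions are assumed sorted by ascending write timestamp):

```python
import bisect

def mv_read(txn_ts, versions):
    """Multiversion read: pick the newest version not newer than the reader.

    versions is a list of dicts sorted by ascending write timestamp 'wts'.
    Returns the chosen version, or None if every version is too new.
    """
    idx = bisect.bisect_right([v["wts"] for v in versions], txn_ts) - 1
    if idx < 0:
        return None
    version = versions[idx]
    version["rts"] = max(version["rts"], txn_ts)  # record the read
    return version

versions = [{"wts": 1, "rts": 1, "val": "a"}, {"wts": 5, "rts": 5, "val": "b"}]
print(mv_read(3, versions)["val"])  # 'a': the wts=5 version is invisible at ts 3
```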

2.2 Concurrency in Operating Systems

In a system with a single processor, only one process can run at a time; all the others must wait until the processor becomes free. Typically, processes in a computer system not only use the CPU, but also make I/O requests to some device. In other words, it is very unlikely that a process only computes without interacting with other devices in the system. This means that, after an I/O request, the issuing process must wait until the data it has requested is retrieved; in the meanwhile, the CPU remains idle.

Reducing the idle times due to I/O by rescheduling the CPU to other processes is desirable, since the total system throughput increases, making the system more productive. Processes alternate between the CPU-burst (computing) and the I/O-burst (interacting with devices) states. We can characterize processes in two main behavioral sets: CPU-bound processes, which are CPU-intensive, and I/O-bound processes, which mainly use I/O devices. A CPU-bound program might have a few long CPU-bursts, while in I/O-bound programs CPU-bursts are usually very short and I/O-bursts dominate.

Most of the time, I/O activity is the bottleneck for system throughput, since I/O devices are so much slower than the CPU. Thus, in typical systems, the CPU is idle most of the time.

CPU scheduling is the basis of multiprogrammed operating systems. The CPU is the primary computer resource; scheduling it efficiently is central to operating system design.

2.2.1 Scheduling Criteria

Several CPU scheduling algorithms have been proposed. Characterizing the processes we are dealing with can help in choosing the most appropriate one. In order to compare different CPU scheduling algorithms, several criteria can be used:

• THROUGHPUT: the number of completed processes per time unit.

• CPU UTILIZATION: idle times have to be minimized. The CPU utilization is the fraction of time the CPU is busy.

• TURNAROUND TIME: the interval from the time of submission of a process to the time of completion.

• WAITING TIME: the time spent by the process waiting to be executed, given that it is actually ready to run.

• RESPONSE TIME: the interval from the time of submission of a process to the time at which the first response is produced.

Of course, it is desirable to maximize the CPU utilization and the throughput, while the turnaround, waiting and response times should be minimized.

2.2.2 Scheduling Algorithms

The role of a scheduling algorithm is to choose, among all the processes that are ready to execute, the one to which the CPU should be allocated. All the processes that are ready to execute are kept (more precisely, their PCBs are kept) in a ready queue. In the following we briefly introduce several algorithms.

FCFS Scheduler

First-come, first-served is the simplest scheduling scheme we can implement: the process that requests the CPU first is allocated the CPU first. This algorithm can be implemented efficiently using a FIFO queue. When a process enters the ready queue, its PCB is linked onto the tail of the queue. When the CPU becomes free and a process can be allocated to it, the one at the head of the ready queue is chosen and removed from the queue.

Within this scheme, the average waiting time is generally not minimal. Consider the scenario in which three processes arrive at time 0: P1 with a burst time of 30 ms, and P2 and P3 both with a burst time of 5 ms. If the processes arrive in the order < P1, P2, P3 >, the average waiting time is

(0 + 30 + 35) / 3 ≈ 21.6 ms

If, instead, the order is < P3, P2, P1 >, then the average waiting time is

(0 + 5 + 10) / 3 = 5 ms

So, long-running processes can substantially increase the average waiting time of the processes behind them. This is called the convoy effect.
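The two averages can be verified with a few lines (a Python sketch, assuming all processes arrive at time 0):

```python
def fcfs_waits(bursts):
    """Waiting time of each process under FCFS, arrivals all at time 0."""
    waits, clock = [], 0
    for burst in bursts:
        waits.append(clock)  # a process waits for everyone ahead of it
        clock += burst
    return waits

for order in ([30, 5, 5], [5, 5, 30]):  # <P1,P2,P3> vs <P3,P2,P1>
    waits = fcfs_waits(order)
    print(order, waits, sum(waits) / len(waits))
# [30, 5, 5] [0, 30, 35] 21.666...
# [5, 5, 30] [0, 5, 10] 5.0
```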


Round-Robin Scheduler

This algorithm is specialized for time-sharing systems. The round-robin scheme adds to the FCFS one the concept of preemption, by which a running process can be temporarily interrupted even if it is not waiting for any request to complete. A small unit of time, called a time slice, is defined. The ready queue is treated as a circular queue: the scheduler allocates the CPU to each process in the ready queue for one time slice, circularly.

This kind of scheduler can be implemented easily. The ready queue is again used as a FIFO queue, in which new processes are added at the tail. When the CPU becomes idle, the head of the queue is selected and the CPU is allocated to it, after setting a timeout interval. If, during its execution, the process issues an I/O request, it leaves the CPU by itself. Otherwise, if the process keeps running and the timeout interval expires, the scheduler interrupts it and selects another process to allocate on the CPU. The interrupted process is enqueued again so that, eventually, it will be allocated the CPU again.

If there are n processes in the system and the time slice is q, then each process gets 1/n of the CPU time, in chunks of at most q time units, so each process waits at most (n−1)q time units before executing again. The performance of this scheme depends on the size of the time slice: smaller time slices reduce the average waiting time, but increase the number of context switches on the CPU.
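A tiny round-robin simulation (Python sketch; pure CPU bursts, no I/O, zero-cost context switches) reproduces the behavior described above:

```python
from collections import deque

def round_robin(bursts, q):
    """Simulate round-robin over CPU bursts; return per-process completion times."""
    queue = deque(enumerate(bursts))         # (pid, remaining burst time)
    clock, done = 0, {}
    while queue:
        pid, left = queue.popleft()
        run = min(q, left)                   # run for at most one time slice
        clock += run
        if left > run:
            queue.append((pid, left - run))  # preempted: back to the tail
        else:
            done[pid] = clock
    return done

print(round_robin([30, 5, 5], q=5))  # {1: 10, 2: 15, 0: 40}
```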

Priority Scheduler

In this kind of scheme, each process is associated with a priority value, and the CPU is allocated to the process with the highest priority. Processes with equal priority are scheduled in FCFS order.

Priority values can be defined either internally or externally. Internal priority values are based on some measurable quantity about the execution of the processes, for example memory requirements, timing, or the ratio between I/O and CPU activity. Externally defined priority values are based on criteria outside the operating system, such as the importance of a process.

The priority scheduler can be preemptive or not.

The main problem with priority scheduling is the possibility of starvation: a process waits forever in the ready queue, because there is always some other process with a higher priority to execute.

A solution to this problem is to carry out priority scheduling with an aging mechanism: the priority of processes that have been waiting for a long time is gradually increased, so that eventually they will be executed.

Another possibility is to split the ready queue into two parts. New processes are inserted into one part, while the priority scheduling algorithm runs on the other. When the queue on which we run the algorithm becomes empty, we move all the processes from the other queue into it.

Shortest-Job-First Scheduler

In this case the algorithm associates with each process the length of its next CPU-burst. When the CPU becomes available, it is assigned to the process that has the smallest next CPU-burst. If the next CPU-bursts of two processes are equal, FCFS scheduling is used to decide between the two. This is a special case of the priority scheduler, in which the priority value is the inverse of the length of the predicted next CPU-burst.

Suppose we have the set < (p1, 7), (p2, 10), (p3, 4), (p4, 15) >. Using this scheduler, processes will be executed in the order < p3, p1, p2, p4 >. Note that the SJF algorithm can be preemptive or not.

It can be shown that this scheme is optimal, in the sense that it gives the minimum average waiting time for any given set of processes.

Intuitively, suppose we have the sequence < p1, p2, ..., pn > of processes, where the next CPU-burst of pi is denoted T(pi) and T(pi) < T(pj) if i < j. The average waiting time is computed as follows:

W = [W(p1) + W(p2) + W(p3) + ... + W(pn)] / n

Note that W(pi) = W(pi−1) + T(pi−1) for each i > 1, while the waiting time of p1 is of course zero. Substituting this into the previous formula we get:

W = (0 + [T(p1)] + [W(p2) + T(p2)] + ... + [W(pn−1) + T(pn−1)]) / n

and, doing some algebra, we eventually get:

W = (1/n) [(n−1)T(p1) + (n−2)T(p2) + (n−3)T(p3) + ... + T(pn−1)]

In the last formula we can see that the smallest next CPU-burst contributes to the sum n−1 times, while the last contributing burst, T(pn−1), appears just once, and the largest burst, T(pn), does not contribute at all. It is easy to see that any other ordering of the same processes would give a greater W.

The real difficulty with this algorithm is estimating the length of the next CPU-burst of each process. These values can only be estimated, since we cannot predict the future; the more accurate the estimate, the better the algorithm behaves.

The next CPU-burst is generally predicted as an exponential average of the measured lengths of the previous CPU-bursts.

Let tn be the length of the n-th CPU-burst, and let τn+1 be our predicted value for the next CPU-burst. Then, for α with 0 ≤ α ≤ 1, we define:

τn+1 = α tn + (1 − α) τn

The parameter α defines the relative weight of recent and past history in our prediction. Small values of this parameter correspond to a small contribution of the recent history.
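For instance, a sketch of the estimator (Python; the initial guess τ0 and the sample burst lengths are arbitrary assumptions):

```python
def predict_bursts(measured, alpha, tau0=10.0):
    """Exponential average: tau_{n+1} = alpha*t_n + (1-alpha)*tau_n."""
    tau = tau0
    predictions = [tau]
    for t in measured:
        tau = alpha * t + (1 - alpha) * tau
        predictions.append(tau)
    return predictions

# alpha = 0.5 weights the last measured burst and the old estimate equally.
print(predict_bursts([6, 4, 6, 4, 13, 13, 13], alpha=0.5))
```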

Multilevel Schedulers

Whenever we can classify processes into different categories, we can use a multilevel queue scheduler. The ready queue is partitioned into several separate queues, each associated with a category of processes, and the queues are ordered by priority. The scheduler can exploit the presence of several queues and priorities in different ways:

• No process waiting in a queue can be executed unless all the queues with higher priority are empty. The starvation problem is still present here.

Fig. 2.4: Prediction of next CPU burst

Fig. 2.5: Multilevel Queue Scheduling

• Using time slicing: each queue gets a portion of the CPU depending on its priority, and the scheduling discipline within each queue can differ from the others. Suppose we have two queues, one with a priority value of 0.8 and the other with a priority value of 0.2. In this case, the first queue gets 80% of the total CPU time, while the remaining 20% is left to the other queue. The first queue could run an FCFS-based serving scheme, while a round-robin approach can be used in the other.

The classical multilevel scheduler described here is quite inflexible, in the sense that once a process has been assigned to a certain category, the assignment is permanent. A different, more flexible approach is used in the multilevel feedback-queue scheduler, in which processes can move from one queue to another. The idea is to characterize processes on the basis of their CPU-burst lengths: a process that uses the CPU intensively is moved to a lower-priority queue. With this criterion, I/O-bound processes remain in the higher-priority queues. Moreover, a process that waits too long in a lower-priority queue may be moved to a higher-priority queue, in order to prevent starvation. Again, the serving mechanism within one queue can differ from that of another queue.

2.2.3 Multiple Processors Scheduling

So far we have discussed single-CPU systems. When there are several identical processors in the system, the scheduling problem becomes slightly different: the problem is not only choosing which process to allocate, but also on which CPU to allocate it. Of course, we want to spread the load evenly over all resources, achieving load balancing, and we do not want to waste resources.

It is important to note that if the processors are not functionally identical, other issues arise, since we may have additional constraints on the choice of the CPU. Consider for example the case in which not all processors can use all I/O devices, because some processor is not connected to the bus of a device. In this case, a process that needs that device cannot be scheduled on a processor that cannot access it.

There are two main approaches to the scheduling problem in multiple-processor environments:

• ASYMMETRIC MULTIPROCESSING: one processor is responsible for scheduling decisions, I/O processing and other system activities, while the other processors execute only user code. In this case, only one processor accesses the system data structures, so we do not need to care about concurrent accesses to them.

• SYMMETRIC MULTIPROCESSING (SMP): each processor is self-scheduling; it is a kind of scheduler distributed among the processors. All the processes can be in a single common queue, or each processor may have its own private ready queue. In the case of a single common ready queue, the scheduler must take into account the criticalities of maintaining data consistency.

The choice of the CPU on which to allocate the next process can be made on the basis of some architectural characteristics. Each processor has a cache memory in which some of the data used by the current process is stored, so that subsequent accesses to the same data are faster. The cache is populated as the process executes. Re-scheduling a process each time on a different processor would waste this cache potential, since each time the content of one cache must be invalidated and another cache re-populated. Such operations are costly, so SMP systems usually try to avoid process migration. This is the concept of processor affinity: a process has higher affinity with the processor on which it is currently running.

Load balancing can be achieved in two main ways:

• PUSH MIGRATION: a specific process monitors the load of each processor and periodically checks for imbalance. If there is any, it moves some processes from busy processors to less busy ones, hence distributing the load.

• PULL MIGRATION: an idle processor pulls a waiting process from a busy processor.
