Four semantics for a disciplined concurrency in COP.

Academic year: 2021


Università di Pisa

Master’s degree in Computer Science

Four semantics for a disciplined concurrency in COP.

Supervisors:

Prof. Pierpaolo Degano

Dott. Letterio Galletta

Candidate:

Matteo Busi


Contents

1 Introduction 1

2 Background 4

2.1 Autonomic computing . . . 4

2.2 Context-oriented programming . . . 6

2.3 Small-step operational semantics . . . 10

2.4 Type and effect systems . . . 15

3 MLCoDa 21

3.1 The design of MLCoDa . . . 21

3.2 A guided tour of MLCoDa . . . 23

3.2.1 The Context . . . 23

3.2.2 MLCoDa adaptation mechanisms . . . 24

3.2.3 Updating the context . . . 27

3.2.4 History expressions and verification . . . 27

3.3 Syntax and semantics of MLCoDa . . . 28

3.3.1 Syntax of MLCoDa . . . 28

3.3.2 Semantics of MLCoDa . . . 30

3.4 Type and effect system . . . 34

3.4.1 History expressions . . . 34

3.4.2 Type and effect system . . . 35

4 Concurrency in MLCoDa 41

4.1 The MLCoDa concurrent execution model . . . 41

4.2 A guided tour of concurrency in MLCoDa . . . 43

4.3 The k-non-permissive semantics . . . 48

4.3.1 A modified syntax for MLCoDa . . . 48

4.3.2 A modified type and effect system . . . 49

4.3.3 The first (family of) concurrent semantics . . . 52

4.4 The “myopic” semantics . . . 55


4.4.2 The second concurrent semantics . . . 56

4.5 The freezing semantics . . . 57

4.5.1 The third concurrent semantics . . . 57

4.6 The freezing environment-based semantics . . . 57

4.6.1 A new syntax for MLCoDa . . . 58

4.6.2 The fourth concurrent semantics . . . 59

4.7 Relationships among semantics . . . 67


Chapter 1

Introduction

Computer systems are pervasive in our daily lives, and we depend on them for most of our activities, be it chatting with friends and family, tracking our health, booking our trips, or navigating unknown cities. As a result, our always-connected world brings with it the need for software that is available anytime and anywhere, even in ever-changing environments that are unknown a priori, often referred to as contexts.

Many solutions to the issues of context-awareness and context-dependency in software systems have been proposed. Historically, the first way to implement context-awareness was to model the context using a dedicated data structure and then to provide a querying interface to it. However, the programmer had full responsibility for the context implementation. Moreover, this kind of approach runs counter to usual software engineering practice, where re-use, maintainability, separation of concerns, and simplicity of debugging are primary interests.

Recently, context-oriented programming (COP) has been proposed as a way to solve the above issues. COP introduces, at the programming language level, new context-oriented features needed to support the design and programming of context-aware and context-dependent software systems. The newly introduced programming mechanisms, which hide most of the low-level details of context definition and management from the programmer, impose a new programming discipline, with a positive impact on the software engineering issues cited above.

The MLCoDa programming language is a two-component COP language, constituted by a declarative component for programming the context and a functional one for computing. More precisely, context adaptation in MLCoDa is possible through two mechanisms: context-dependent bindings and behavioural variations. The first permits binding program variables to different values, depending on the properties holding in the current context, while the second allows defining chunks of behaviour that are dynamically activated depending on the context.


Context-adaptivity introduces a new form of runtime functional errors, called adaptation errors, i.e. errors happening due to the inability of an application to adapt to a context. A two-phase static analysis has been proposed in [DFG16b] for MLCoDa to ensure that no adaptation error will occur at runtime. The first step of the analysis happens at compile-time, when the compiler type-checks the application and computes its effect, an over-approximation of the manipulations on the context at runtime. The second step of the verification occurs at load-time. More specifically, a control-flow analysis uses the sequence of predicted manipulations of the context (given by the effect computed at compile-time) to predict changes in the context and to verify that the application being loaded will be able to adapt to each possible context that may arise at runtime.

The problem of concurrency in COP (and in MLCoDa) was overlooked until recently, even though it is a natural development of the scenario presented above. Introducing concurrency in MLCoDa consists in considering a setting where many applications manipulate, query, and communicate through a shared context. A first step in extending MLCoDa to support concurrency is in [DFG16a], which considers systems with two threads: the context thread and the actual application. The first thread virtualises the resources and the communication infrastructure. The other is the application, whose interactions with the other entities in the context are rendered as the occurrence of asynchronous events representing the relevant changes on it.

In this thesis we extend this approach by introducing multiple applications sharing and manipulating the same context, possibly communicating and exchanging data through it. We do not assume any scheduling policy or any fixed set of applications: a new application may be loaded at any time without any prior knowledge of its behaviour by others.

To address the problem of the interference that arises, we enriched MLCoDa with a runtime verification mechanism, embodied in the dynamic semantics of the language. Indeed, we propose four different and alternative semantics, each with its own runtime verification mechanism.

The first is actually a family of semantics, depending on a parameter called the look-ahead. Since the runtime mechanism checks for possible harmful behaviour by the running application, the look-ahead parameter tells the mechanism how far into the future of the execution the search should go. More precisely, this first mechanism checks, whenever an application is about to interact with the context (either by modifying it or by querying it), whether the future behaviour of the application resulting from the execution of the pending interaction may be harmful to other applications running concurrently. To foresee the future behaviour of the application, the verification mechanism employs the effect associated with the application at compile-time. Dually, the mechanism also verifies that no other application may, in the future, hinder the execution of the running one.

The second semantics, which we call “myopic”, is generally more permissive than the first one, and only checks for consistency when a program attempts to modify the context. If the inspection fails, the application halts until it is safe to execute it again.

The third semantics deals with the applications that halt, explicitly marking them as frozen. Frozen applications are not considered by the runtime mechanism when checking for interference, hence removing the risk of deadlocks, i.e. situations in which two or more applications mutually wait for resources owned by the others. A frozen application is eventually un-frozen once its possible changes to the context are no longer harmful to the other active programs.

Finally, the fourth semantics is an environment-based variant of the third: in this semantics, when an application is un-frozen, the bindings for its variables are computed again in the new context.

Structure of the thesis. The next chapter, Chapter 2, introduces the background required to understand the remainder of the thesis: we introduce the concepts of autonomic computing, context-oriented programming, small-step operational semantics, and type and effect systems. Chapter 3 introduces the sequential version of the MLCoDa language, in terms of its syntax, semantics, and type and effect system. In Chapter 4 we introduce the four semantics and runtime verification mechanisms sketched above, and we show the relationships among them. Finally, Chapter ?? summarises the results and concludes the work.


Chapter 2

Background

2.1 Autonomic computing

Information technology (IT) professionals have to improve the quality of service and reduce maintenance costs even as the complexity of components and infrastructures increases.

In an effort to deal with this problem, IBM [IBM06] proposed the autonomic computing approach. The term autonomic comes from the Greek for self-governing and is borrowed from physiology. In human physiology, the autonomic nervous system regulates (among other things) body temperature, guides respiration, and monitors the heartbeat without any conscious effort by the individual. In the same way, IT systems should be able to govern themselves with minimal human intervention. Of course the systems cannot and should not be completely self-governing: the human administrator must be able to limit their freedom by means of policies and rules. As shown in Figure 2.1, an autonomic system is a collection of interacting autonomic components (or elements). The constituents of a single component are the managed element, either a hardware resource like a CPU or a printer or a software resource such as a database or a file, and the autonomic manager.

The autonomic manager is itself constituted by different parts, whose duty is to monitor and acquire knowledge about the managed element and its surrounding environment, and to construct and execute plans based on an analysis of the available information. In particular, autonomic components, according to [IBM06, KC03], have to deal with four different aspects of self-management:

Self-configuration the components should be able to dynamically adapt to changes in the environment (e.g. the introduction of a new component or the removal of an existing one) as instructed by policies given by the IT administrator. For example, a new autonomic peripheral attached to a system will register itself and its capabilities so that other components can use it.

Figure 2.1: Structure of an autonomic system, with a zoom on the internals of a single autonomic component. [KC03]

Self-optimization the components can tune themselves to gain better performance with respect to the needs of the system or the end-user. Kephart and Chess [KC03] advocate that components should always seek ways to improve their operation. For example, a complex autonomic database management system could automatically tune its parameters and relocate some resources based on frequent queries from other applications in the system.

Self-healing since in complex systems it is hard to determine the cause of failures, a component of an autonomic system must be able to detect, diagnose, and repair localized problems. For example, an autonomic software updater could perform testing based on known exploits and apply relevant patches if available.


Self-protection the components must be able to protect themselves against threats. Self-protecting components can detect hostile behaviour and take corrective actions following the policies given by the administrator. For example, if a file is found to be infected by a virus, that file could be moved to quarantine.

In recent years, different ideas to tackle self-adaptation, particularly for software systems, have been explored; see e.g. [KRV+15]. The proposed methods include model-based approaches, architecture-based approaches, and programming paradigms. The focus of the first approach is on models, considered as first-class entities that describe software and its environment. The architecture-based approach focuses on software architecture, i.e. the set of software components, the relations among them, and the properties of both. The emphasis of the so-called programming paradigms is on the programming constructs needed to suitably support self-adaptation.

The need for dedicated programming constructs naturally arises when one tries to solve the problem of adaptivity with standard constructs. Indeed, a programmer could write a dedicated data structure modelling the context (i.e. the set of resources available to the program) and use a sequence of if/case statements to properly answer a finite number of queries on the context.

Of course, the necessity to write an ad-hoc data structure for the context increases the code complexity and makes a good separation of cross-cutting concerns difficult. Complex code and a bad separation of cross-cutting concerns make the code difficult to test, maintain, fix, and debug. Moreover, this approach hinders the development of automated analysis tools. A linguistic-level solution seems ideal to tackle adaptivity: it can provide flexible enough mechanisms through dedicated constructs and help in keeping the code simple. A first approach that falls in this category is (dynamic) aspect-oriented programming (AOP) [KLM+97, Tan08, BHMO04, MO03]. AOP allows decoupling the application logic from the context-dependent behaviour by defining aspects, i.e. standalone modules used to organize and separate orthogonal code. Although dynamic AOP permits adaptation through the dynamic addition and removal of aspects, it does not provide an explicit representation of the context. As a result, the programmer needs to implement the context, and the mechanisms to react to its changes, by hand. For this reason, AOP is not completely suitable. A more promising approach is context-oriented programming [CH05, HCN08], which we describe in the next section.

2.2 Context-oriented programming

Context-oriented programming (COP) is a recently proposed programming paradigm, born from the need to have context-dependent behaviour distinguished by an explicit treatment of the context and proper mechanisms to dynamically adapt the behaviour of applications as a reaction to changes in the context. Such behaviour can be found in a wide range of applications, for example in location-based services or deeply personalized apps on mobile devices.

More precisely Hirschfeld et al. [HCN08] advocate that a language supporting COP should at least implement:

Behavioural variations. Chunks of new, modified or removed behaviour that can be dynamically activated.

Layers. Groups of related context-dependent behavioural variations. Layers are usually also first-class objects in the language, so that they can be assigned to variables and passed to/returned by functions.

Context. Represents any computationally accessible information. The context is implicitly characterized by a set of layers activated and deactivated at runtime.

Activation. Mechanisms to activate and deactivate layers at run-time.

Scoping. The scope in which layers are activated or deactivated should be controlled explicitly.

Some fundamental design choices are highlighted by Salvaneschi et al. [SGP12a]. The first choice concerns the layer declaration strategy, which can be either class-in-layer or layer-in-class. The first strategy denotes a language where layers are declared outside the lexical scope of the classes for which they provide behavioural variations. The second one characterizes a language where declarations are inside the lexical scope of the module they provide variations for.

The second choice involves the activation mechanisms for layers. When using a global activation mechanism, there exists a unique shared context, and a behavioural activation affects the control-flow of all threads in the application. With a local activation mechanism, instead, different local contexts exist and the activation only impacts the entities (e.g. threads) affected by the modified context.

Finally, the third choice is about the extent of behavioural variations. If dynamically scoped activation is chosen, then a language construct (e.g. with (layer){ statements; } or without (layer){ statements; }) must be present to activate (or de-activate) a specific layer while executing a piece of code. Activated (or de-activated) layers are usually put on a stack to allow nested calls. Otherwise, if the indefinite activation strategy is chosen, programmers have to specify themselves when a layer is activated and, consequently, de-activated.


Some languages for COP. Many COP extensions to existing languages have been designed and implemented [AHH+09, SGP12a].

ContextL [CH05, Cos08] was one of the first extensions; it was based on Lisp and extended the Common Lisp Object System. It implemented both class-in-layer and layer-in-class for layer declaration, and used the global activation mechanism and dynamically scoped activation [SGP12a]. Some other COP extensions for dynamic languages include ContextS [HCN08, HCH08] for Smalltalk, ContextErlang [SGP12b] for Erlang, and PyContext [vLDN07] and ContextPy [Sch08] for Python. In the field of static languages, Java generated ContextJ [HCN08, AHHM11], JCop [AHM+10], and others [SGP12a]. The interested reader may find further details on the features of and comparisons among these languages in [AHH+09, SGP12a].

Formal studies. Some formal studies about constructs for adaptivity in COP have been conducted.

An early example is due to Schippers et al. [SJHH08]. They introduced a formal semantics for cj, and other languages in the j family, exploiting the δ-calculus [AD02], a delegation-based calculus. Another approach to giving a formal semantics to COP layer constructs is based on graph transformations [MSJH10].

Clarke and Sergey [CS09] introduced ContextFJ, a core calculus extending Featherweight Java [IPW01]. They adopted the class-in-layer strategy for behavioural variations and added layers and scoped layer activation/deactivation, but did not consider constructs for inheritance. They also introduced a type system ensuring that each dispatched method call has a proper binding. Hirschfeld et al. [HIM11] worked on the same core calculus, but with inheritance, and proposed a small-step operational semantics. They also introduced a simple type system dealing with the common field-not-found and method-not-found runtime errors. Unfortunately, the proposed type system is too restrictive, since it does not allow layers to introduce new methods. This issue was addressed by Igarashi et al. [IHM12], using a type system to deal with dynamic layer composition. Moreover, Hirschfeld et al. [HIM11] do not have constructs for layer de-activation. Kamina et al. [KAI14] solved this problem using on-demand activation.

On the side of functional programming languages, Degano et al. [DFGM12] proposed ContextML, a core calculus that extends a purely functional subset of ML with typical COP constructs, i.e. layers, layered expressions, and scoped activation mechanisms. In addition to the dynamic small-step operational semantics, they also give a type system to ensure that the dispatching mechanism always succeeds at run-time for well-typed expressions. A further evolution of ContextML, introducing some novelties such as a Datalog context and a load-time analysis, is MLCoDa [DFG16b]. The approach taken for MLCoDa will be explained in detail in Chapter 3.


Example 2.2.1. We further clarify COP constructs and concepts by introducing an example from [CH05]. First, imagine we have a layered class (i.e. a class depending on a layer) person defined as

(define-layered-class person ()
  ((name :initarg :name :accessor person-name)))

with the meaning that person has no superclass and a single slot name, possibly initialized using :name and accessed via person-name.

Now, if we define a function to display an object

(define-layered-function display-object (object))

a corresponding method for the root layer t

(define-layered-method display-object :in-layer t ((object person))
  (format t "Person~%")
  (format t "Name: ~A~%" (person-name object)))

and an instance

(defvar *stephen*
  (make-instance 'person :name "Stephen V. Strange"))

then the output of

(display-object *stephen*)

will be:

Person
Name: Stephen V. Strange

Suppose now that the layer address-layer has been defined using the macro deflayer:

(deflayer address-layer)

We can then vary the behaviour of the person class. As an example, consider the addition of a slot address, accessible only when address-layer is active:

(define-layered-class address :in-layer address-layer ()
  ((address :initarg :value :accessor address-value)))

(define-layered-class person :in-layer address-layer ()
  ((address :initarg :address :layered-accessor person-address)))

If we also add method definitions for display-object in address-layer, i.e.

(define-layered-method display-object :in-layer address-layer
  ((object address))
  (format t "Address: ~A~%" (address-value object)))

(define-layered-method display-object :in-layer address-layer :after
  ((object person))
  (display-object (person-address object)))

then executing:

(defvar *svsaddress*
  (make-instance 'address :value "177A Bleecker Street, New York City"))

(defvar *stephen*
  (make-instance 'person :name "Stephen V. Strange" :address *svsaddress*))

(display-object *stephen*)

will produce:

Person
Name: Stephen V. Strange
Address: 177A Bleecker Street, New York City



2.3 Small-step operational semantics

To speak about the meaning of programs and their behaviour, we must precisely describe how each language construct is evaluated or interpreted. Natural languages are not suitable, since they are inherently ambiguous and cannot be used as a basis for implementation, analysis, and verification.

Formal semantics comes to help in three different flavours [NN]:

Axiomatic semantics: which describes program behaviour in terms of assertions (i.e. the axioms) about the program state. The focus is mainly on properties of programs.

Denotational semantics: Which describes meaning in terms of mathematical objects representing the effects of execution. In this case the focus is on what is the effect of a computation.

Operational semantics: Which describes the meaning of constructs as the com-putation it induces when executed on a machine. Here the focus is on how the effect of a program is produced.

Moreover operational semantics can be further divided into two styles:

1. Big-step operational semantics (also known as natural semantics), which describes how the overall results of executions are obtained.

2. Small-step operational semantics (also known as structural operational semantics, SOS), which describes how results are obtained step by step.

Since in the remainder of the thesis all the semantics will use the SOS style, in this section we will concentrate only on this kind of semantics.

In SOS semantics the meaning of programs is given in terms of transition systems.

Transition systems are defined in terms of a set of configurations C and a transition relation → ⊆ C × C, that describes how the execution takes place.

The relation → is usually induced as the smallest relation satisfying a set of inference rules, where an inference rule assumes the form

p1 . . . pn
─────────── (RuleName)
     c

meaning that if the premises p1, . . . , pn hold then the conclusion c holds too.

Example 2.3.1. As a first example, we present an SOS semantics of the WHILE language [NN]. Even if this is an imperative language, the concepts presented here will be useful when specifying some of the MLCoDa semantics. Let us assume we have the set of numerals Num, ranged over by n, n0, n1, n2, . . ., and the set of variables Var, ranged over by x, y, . . ..

Also, assume that the language is generated using the following BNF grammar:

Aexp ∋ a ::= n | x | a1 ⊕ a2 | a1 ⊗ a2 | a1 ⊖ a2
Bexp ∋ b ::= true | false | a1 ⩽ a2 | not b | b1 or b2
Com ∋ S ::= skip | x := a | S1; S2 | if b then S1 else S2 | while b do S

In this case we have three transition systems:

(i) The transition system for arithmetic expressions, whose transition relation is →a,

(ii) the transition system for boolean expressions, whose transition relation is →b,

(iii) the transition system for commands, whose transition relation is →.

Of course the shape of the above relations varies, but all of them rely on the idea of stores. The set of stores, Σ, is the set of functions mapping variables to numerals, i.e. Σ = {σ | σ : Var → Num}.

The change of the value of a variable is modelled through the updating operation, defined as

σ[n/x](y) = n      if y = x
σ[n/x](y) = σ(y)   otherwise

The first relation →a ⊆ (Σ × Aexp) × ((Σ × Aexp) ∪ Num) for arithmetic expressions is the smallest relation induced by the inference rules in Figure 2.2. Since the store never changes, we simply write σ ⊢ a →a a′ instead of σ, a →a σ, a′.

The first rule, (Var), trivially reduces the variable x to its value in the current store. The second and the third rules reduce, step-by-step, the first and the second operand of a syntactic operation. Finally, rule (Op) computes the final value using one of the operators on Num. It is worth noting that, here, the operators ⊕, ⊖, ⊗ are purely syntactical, while the operators +, −, × are actual arithmetic operators.
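The rules (Var), (LOp), (ROp), and (Op) can be rendered as a small interpreter. The sketch below is our own illustration in Python (the tuple representation of expressions and all function names are assumptions, not part of the WHILE formalisation):

```python
# Illustrative small-step interpreter for the arithmetic fragment.
# Expressions are tuples: ('num', n), ('var', x), or (op, a1, a2) with op
# in {'+', '-', '*'} standing for the syntactic operators; the store sigma
# is a dict from variable names to numbers, and sigma[n/x] is dict update.
import operator

OPS = {'+': operator.add, '-': operator.sub, '*': operator.mul}

def is_num(a):
    return a[0] == 'num'

def step_a(sigma, a):
    """One step of sigma |- a ->a a'. Numerals do not reduce."""
    if a[0] == 'var':                       # (Var): look the variable up
        return ('num', sigma[a[1]])
    op, a1, a2 = a
    if not is_num(a1):                      # (LOp): reduce the left operand
        return (op, step_a(sigma, a1), a2)
    if not is_num(a2):                      # (ROp): reduce the right operand
        return (op, a1, step_a(sigma, a2))
    return ('num', OPS[op](a1[1], a2[1]))   # (Op): apply the real operator

def eval_a(sigma, a):
    """Transitive closure of ->a, down to a numeral."""
    while not is_num(a):
        a = step_a(sigma, a)
    return a[1]

sigma = {'x': 3}
# (x + 2) * 4 steps to (3 + 2) * 4, then 5 * 4, then 20
print(eval_a(sigma, ('*', ('+', ('var', 'x'), ('num', 2)), ('num', 4))))
```

Note how `step_a` mirrors the rule selection: exactly one rule applies to each non-numeral expression, so the relation is deterministic.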

Similarly, the small-step semantics for boolean expressions is defined as the smallest relation →b ⊆ (Σ × Bexp) × ((Σ × Bexp) ∪ {true, false}) induced by the set of rules in Figure 2.3. Again, for simplicity, we write σ ⊢ b →b b′ instead of σ, b →b σ, b′.

In this case, the operator ≤ stands for the real order relation over Num, whereas ⩽ is purely syntactical. It is interesting to note that in rules (Not1-3) and (Or1-5) the reduced values of the operands are explicitly inspected to produce the final truth value of the whole expression, as is typical in this kind of semantics, where the stress is put on how things are computed. Finally, we consider the relation → for commands. In this case the shape of the relation is different


(Var)   σ ⊢ x →a σ(x)

(LOp)   σ ⊢ a1 →a a1′
        ─────────────────────────────   ops ∈ {⊕, ⊖, ⊗}
        σ ⊢ a1 ops a2 →a a1′ ops a2

(ROp)   σ ⊢ a2 →a a2′
        ─────────────────────────────   ops ∈ {⊕, ⊖, ⊗}
        σ ⊢ n1 ops a2 →a n1 ops a2′

(Op)    σ ⊢ n1 ops n2 →a n1 opNum n2    ops ∈ {⊕, ⊖, ⊗} ∧ opNum ∈ {+, −, ×}

Figure 2.2: Rules inducing the relation →a for arithmetic expressions of the WHILE language.

(Lleq)  σ ⊢ a1 →a a1′
        ─────────────────────────
        σ ⊢ a1 ⩽ a2 →b a1′ ⩽ a2

(Rleq)  σ ⊢ a2 →a a2′
        ─────────────────────────
        σ ⊢ n1 ⩽ a2 →b n1 ⩽ a2′

(Leq)   σ ⊢ n1 ⩽ n2 →b n1 ≤ n2

(Not1)  σ ⊢ b →b b′
        ─────────────────────
        σ ⊢ not b →b not b′

(Not2)  σ ⊢ not true →b false

(Not3)  σ ⊢ not false →b true

(Or1)   σ ⊢ b1 →b b1′
        ───────────────────────────
        σ ⊢ b1 or b2 →b b1′ or b2

(Or2)   σ ⊢ true or b2 →b true

(Or3)   σ ⊢ b2 →b b2′
        ──────────────────────────────────
        σ ⊢ false or b2 →b false or b2′

(Or4)   σ ⊢ false or true →b true

(Or5)   σ ⊢ false or false →b false

Figure 2.3: Rules inducing the relation →b for boolean expressions of the WHILE language.

with respect to the ones above, since the final result is no longer a value but a store, i.e. → ⊆ (Σ × Com) × ((Σ × Com) ∪ Σ). This relation is induced as the smallest one satisfying the set of rules in Figure 2.4.

The first rule gives meaning to the skip command. The second and the third rules, (Assign0-1), are pretty straightforward: the first reduces the right-hand side to a value, the second introduces the new binding in the store. Rules (Comp0-1) deal with command composition: the first rule reduces the first command S1 step-by-step, while the second rule moves on to the second command S2 once S1 has completed its reduction to a store. Rules (If0-2) reduce the if-then-else command: first they reduce the guard b to a boolean value, then they execute either S1 or S2 according to the obtained value. The last rule, (While), copes with the while command by unfolding the loop into a conditional.


(Skip)     σ, skip → σ

(Assign0)  σ ⊢ a →a a′
           ─────────────────────────
           σ, x := a → σ, x := a′

(Assign1)  σ, x := n → σ[n/x]

(Comp0)    σ, S1 → σ′, S1′
           ─────────────────────────────
           σ, S1; S2 → σ′, S1′; S2

(Comp1)    σ, S1 → σ′
           ─────────────────────
           σ, S1; S2 → σ′, S2

(If0)      σ ⊢ b →b b′
           ──────────────────────────────────────────────────────
           σ, if b then S1 else S2 → σ, if b′ then S1 else S2

(If1)      σ ⊢ b →b true
           ───────────────────────────────────
           σ, if b then S1 else S2 → σ, S1

(If2)      σ ⊢ b →b false
           ───────────────────────────────────
           σ, if b then S1 else S2 → σ, S2

(While)    σ, while b do S → σ, if b then (S; while b do S) else skip

Figure 2.4: Rules inducing the relation → for commands of the WHILE language.
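As an illustration of how the command rules execute, here is a possible Python rendering (our own sketch; for brevity, guards and the right-hand sides of assignments are evaluated in a single step, collapsing rules (Assign0) and (If0), while commands step one rule at a time):

```python
# Illustrative stepper for WHILE commands. Commands and expressions are
# tuples; stores are dicts. The (While) rule unfolds loops into conditionals.
def aval(sigma, a):
    if a[0] == 'num': return a[1]
    if a[0] == 'var': return sigma[a[1]]
    op, a1, a2 = a
    f = {'+': lambda x, y: x + y, '-': lambda x, y: x - y,
         '*': lambda x, y: x * y}[op]
    return f(aval(sigma, a1), aval(sigma, a2))

def bval(sigma, b):
    if b in ('true', 'false'): return b == 'true'
    if b[0] == 'leq': return aval(sigma, b[1]) <= aval(sigma, b[2])
    if b[0] == 'not': return not bval(sigma, b[1])
    if b[0] == 'or':  return bval(sigma, b[1]) or bval(sigma, b[2])

def step(sigma, S):
    """One step of sigma, S -> ...; returns (sigma', S'), or sigma' when done."""
    tag = S[0]
    if tag == 'skip':                              # (Skip)
        return sigma
    if tag == 'assign':                            # (Assign0-1), collapsed
        return {**sigma, S[1]: aval(sigma, S[2])}  # sigma[n/x]
    if tag == 'seq':                               # (Comp0-1)
        res = step(sigma, S[1])
        if isinstance(res, dict):                  # S1 reduced to a store
            return (res, S[2])
        sigma1, S1 = res
        return (sigma1, ('seq', S1, S[2]))
    if tag == 'if':                                # (If0-2), collapsed
        return (sigma, S[2] if bval(sigma, S[1]) else S[3])
    if tag == 'while':                             # (While): unfold the loop
        b, body = S[1], S[2]
        return (sigma, ('if', b, ('seq', body, S), ('skip',)))

def run(sigma, S):
    while True:
        res = step(sigma, S)
        if isinstance(res, dict):
            return res
        sigma, S = res

# x := 0; while x <= 2 do x := x + 1   leaves x = 3 in the final store
prog = ('seq', ('assign', 'x', ('num', 0)),
        ('while', ('leq', ('var', 'x'), ('num', 2)),
         ('assign', 'x', ('+', ('var', 'x'), ('num', 1)))))
print(run({}, prog))
```

The `while` case makes the unfolding of rule (While) visible: the loop itself reappears inside the conditional, so termination depends entirely on the guard eventually becoming false.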

 Example 2.3.2. As a further example let us consider an SOS semantics for a pure functional language, which is a subset of MLCoDa (see Chapter 3).

Assume we have a set of constants Const, ranged over by c, c′, . . ., and a set of variables Var, ranged over by x, x′, y, . . ..

The syntax of this language is generated by the BNF grammar

Val ∋ v ::= c | λf x.e
Exp ∋ e ::= v | x | let x = e1 in e2 | if e1 then e2 else e3 | e1 e2

The first production generates the set of values Val, which includes constants and function abstractions. Note that function abstractions have the form λf x.e, indicating that the function f has formal parameter x and body e. Functions are named to allow for recursion. The second production generates the set of expressions Exp, which includes values, variables, a let construct, a conditional construct, and a function application construct. Moreover, we indicate the substitution of a variable x with the expression e′ in an expression e as e{e′/x}, defined as follows:

c{e′/x} = c
x{e′/x} = e′
x′{e′/x} = x′                              if x ≠ x′
(λf x′.e){e′/x} = λf x′.e{e′/x}            if f ≠ x ∧ x′ ≠ x ∧ f, x′ ∉ FV(e′)
(e1 e2){e′/x} = e1{e′/x} e2{e′/x}
(if e1 then e2 else e3){e′/x} = if e1{e′/x} then e2{e′/x} else e3{e′/x}
(let x′ = e1 in e2){e′/x} = let x′ = e1{e′/x} in e2{e′/x}


(Let1)  e1 → e1′
        ─────────────────────────────────────
        let x = e1 in e2 → let x = e1′ in e2

(Let2)  let x = v in e2 → e2{v/x}

(If1)   e1 → e1′
        ──────────────────────────────────────────────────
        if e1 then e2 else e3 → if e1′ then e2 else e3

(If2)   if true then e2 else e3 → e2

(If3)   if false then e2 else e3 → e3

(App1)  e1 → e1′
        ───────────────
        e1 e2 → e1′ e2

(App2)  e2 → e2′
        ─────────────────────────────
        (λf x.e) e2 → (λf x.e) e2′

(App3)  (λf x.e) v → e{v/x}{λf x.e/f}

Figure 2.5: Rules inducing the relation → for expressions of the functional language in Example 2.3.2.

The transition relation → is a subset of Exp × Exp and is defined only for expressions without free variables. We denote a step of computation as e → e′.

The relation → is the smallest one induced by the set of rules in Figure 2.5. The first two rules, (Let1-2), reduce the let construct: the first one reduces the expression e1 step-by-step to a value v, and the second reduces the whole construct to e2 where x is substituted by v. Rule (If1) reduces the guard of the conditional construct to a value, while rules (If2-3) simply reduce the whole conditional to the right expression according to the guard. Finally, the last three rules implement an eager semantics for function application. Rule (App1) reduces the expression e1 to a functional value; once done, rule (App2) reduces e2 to a value v, which is then substituted into the function’s body by rule (App3).

Interestingly the semantics does not include rules for variables, since we assume expressions to be closed.
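The substitution function and the rules of Figure 2.5 can be transcribed almost literally. The following Python sketch is our own illustration (the tuple representation and names are assumptions; following the text, substitution performs no renaming since expressions are assumed closed):

```python
# Illustrative substitution-based evaluator for Example 2.3.2.
# Expressions: ('const', c), ('var', x), ('lam', f, x, e),
# ('let', x, e1, e2), ('if', e1, e2, e3), ('app', e1, e2).
def is_val(e):
    return e[0] in ('const', 'lam')

def subst(e, x, e1):
    """e{e1/x}; no capture-avoiding renaming, as expressions are closed."""
    tag = e[0]
    if tag == 'const': return e
    if tag == 'var':   return e1 if e[1] == x else e
    if tag == 'lam':   # stop if x is shadowed by f or the formal parameter
        f, y, body = e[1], e[2], e[3]
        return e if x in (f, y) else ('lam', f, y, subst(body, x, e1))
    if tag == 'let':
        y = e[1]
        body = e[3] if y == x else subst(e[3], x, e1)
        return ('let', y, subst(e[2], x, e1), body)
    if tag == 'if':
        return ('if',) + tuple(subst(ei, x, e1) for ei in e[1:])
    if tag == 'app':
        return ('app', subst(e[1], x, e1), subst(e[2], x, e1))

def step(e):
    tag = e[0]
    if tag == 'let':                                    # (Let1-2)
        x, e1, e2 = e[1], e[2], e[3]
        return subst(e2, x, e1) if is_val(e1) else ('let', x, step(e1), e2)
    if tag == 'if':                                     # (If1-3)
        if e[1] == ('const', True):  return e[2]
        if e[1] == ('const', False): return e[3]
        return ('if', step(e[1]), e[2], e[3])
    if tag == 'app':
        e1, e2 = e[1], e[2]
        if not is_val(e1): return ('app', step(e1), e2)   # (App1)
        if not is_val(e2): return ('app', e1, step(e2))   # (App2)
        f, x, body = e1[1], e1[2], e1[3]                  # (App3)
        return subst(subst(body, x, e2), f, e1)

def evaluate(e):
    while not is_val(e):
        e = step(e)
    return e

# λf x. if x then 1 else (f true), applied to false: the recursive call
# through f reaches the then-branch on the second iteration.
fn = ('lam', 'f', 'x', ('if', ('var', 'x'), ('const', 1),
                        ('app', ('var', 'f'), ('const', True))))
print(evaluate(('app', fn, ('const', False))))
```

Note how rule (App3) performs two substitutions: the argument for the formal parameter, and the abstraction itself for its own name f, which is what enables recursion without a separate fixpoint construct.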



2.4 Type and effect systems

Static analysis is a wide area of computer science, comprising a broad collection of methodologies and techniques to safely predict, without actual execution, properties of programs. Usually, the predicted properties approximate the set of values or the set of behaviours arising dynamically during execution. For example, a static analysis may forecast a superset of the values that a variable may assume during execution, or the set of program points that may invoke a function at runtime.

Such predictions are useful in a variety of situations, like code optimization or implementation correctness.

According to Nielson and Riis Nielson [NN99], static analysis approaches fall into two categories:

Flow-based approaches, which include data-flow analysis techniques, usually applied to imperative and object-oriented programming; control-flow analysis techniques, mainly developed for functional and object-oriented programming; and abstract interpretation techniques, for imperative, functional, concurrent, and logic languages.

Inference-based approaches, which include type and effect systems, used for functional, imperative, and concurrent languages.

This section focuses on the latter category, considering in some detail type and effect systems applied to strongly typed functional languages.

Let us suppose that a language is defined as in Section 2.3; then the first step is to define a type system. A type system is defined by a set of types, intuitively describing the structure of the data used and produced by programs, and by a set of inference rules associating types to programs.

Since sub-programs may contain free variables, we define a type environment Γ that maps free variables to their types. The typing relation then assumes the form of a judgement

Γ ` p : τ

meaning that in the type environment Γ the program p has type τ .

When designing a type system we must ensure that it is semantically correct with respect to the dynamic semantics of the programming language. Moreover we must guarantee that [NN99]:

• the type system is decidably verifiable, meaning that there should be an algorithm, called type checking algorithm, able to check types assigned to a program,

• there is always a “best” type, called principal type, such that it can be inferred automatically by an algorithm.

Example 2.4.1. To better illustrate the concept we introduce a simple type system for the language introduced in the Example 2.3.2.

The set of types may be defined as follows:

τ ::= τbase | τ1 → τ2.

We assume that τbase is the set of base types (e.g. {bool, int, string, . . .}), and that each constant c ∈ Const of the language has an associated base type τc ∈ τbase.


(TConst)  Γ ⊢T c : τc

(TVar)  Γ(x) = τ  ⟹  Γ ⊢T x : τ

(TLet)  Γ ⊢T e1 : τ1    Γ, x : τ1 ⊢T e2 : τ  ⟹  Γ ⊢T let x = e1 in e2 : τ

(TIf)  Γ ⊢T e1 : bool    Γ ⊢T e2 : τ    Γ ⊢T e3 : τ  ⟹  Γ ⊢T if e1 then e2 else e3 : τ

(TAbs)  Γ, x : τ1, f : τ1 → τ2 ⊢T e : τ2  ⟹  Γ ⊢T λf x.e : τ1 → τ2

(TApp)  Γ ⊢T e1 : τ1 → τ2    Γ ⊢T e2 : τ1  ⟹  Γ ⊢T e1 e2 : τ2

Figure 2.6: Rules inducing the typing relation of Example 2.4.1.

For this language, the typing relation has the form

Γ ⊢T e : τ

and is the smallest relation induced by the rules in Figure 2.6. The first rule, (TConst), is an axiom associating each constant c with its type τc. The rule (TVar) extracts the type of the variable x from the environment Γ. The let construct is typed using the rule (TLet): the type of the whole expression is the type of e2, inferred in an enriched environment Γ, x : τ1, where τ1 is the type of the expression e1. The rule for the conditional is more interesting: the whole expression has type τ in Γ if the guard e1 has type bool and both branches have the same type τ. For example, the expression if true then 42 else 7 can be typed as int, while the expression if true then 42 else false is not well typed according to our type system, since the type of the then branch differs from that of the else branch. Rule (TAbs) types the abstraction by assuming the type of the formal argument and deriving the type of the body e accordingly. Finally, rule (TApp) types function application, requiring that e1 has a functional type and that the actual parameter e2 has the same type as the formal parameter x.
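The rules of Figure 2.6 can be sketched as a type checker; the encoding below is hypothetical (in particular, constants and abstractions carry explicit type annotations, which the formal rules leave implicit).

```python
# A sketch of the typing rules of Figure 2.6 (hypothetical encoding).
# Expressions: ("const", c, tau_c), ("var", x), ("let", x, e1, e2),
# ("if", e1, e2, e3), ("abs", f, x, body, tau1, tau2), ("app", e1, e2).

def typecheck(gamma, e):
    """Return the type of e in the environment gamma, or raise TypeError."""
    tag = e[0]
    if tag == "const":                               # (TConst)
        return e[2]
    if tag == "var":                                 # (TVar)
        return gamma[e[1]]
    if tag == "let":                                 # (TLet)
        _, x, e1, e2 = e
        tau1 = typecheck(gamma, e1)
        return typecheck({**gamma, x: tau1}, e2)
    if tag == "if":                                  # (TIf)
        _, e1, e2, e3 = e
        if typecheck(gamma, e1) != "bool":
            raise TypeError("guard must have type bool")
        tau = typecheck(gamma, e2)
        if typecheck(gamma, e3) != tau:
            raise TypeError("branches must have the same type")
        return tau
    if tag == "abs":                                 # (TAbs)
        _, f, x, body, tau1, tau2 = e
        env = {**gamma, x: tau1, f: ("fun", tau1, tau2)}
        if typecheck(env, body) != tau2:
            raise TypeError("body does not match the declared result type")
        return ("fun", tau1, tau2)
    if tag == "app":                                 # (TApp)
        _, e1, e2 = e
        tfun = typecheck(gamma, e1)
        if tfun[0] != "fun" or typecheck(gamma, e2) != tfun[1]:
            raise TypeError("ill-typed application")
        return tfun[2]

# if true then 42 else 7 : int
print(typecheck({}, ("if", ("const", True, "bool"),
                     ("const", 42, "int"), ("const", 7, "int"))))  # int
```

On the ill-typed example of the text, if true then 42 else false, `typecheck` raises a TypeError because the two branches disagree, mirroring the failure of (TIf).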

 The idea behind type and effect systems is to further refine type systems by adding effects, i.e. annotations on types that express properties concerning the semantics of the program. We illustrate the idea and methodology with an example taken and adapted from [NNH15].

Example 2.4.2. Let us consider the functional language from Example 2.3.2 with the type system introduced in Example 2.4.1, which we will call underlying type system. We introduce a type and effect system to perform an analysis called call-tracking analysis, meaning that for each subexpression we are interested in which function abstractions may be applied during its evaluation [NNH15].


(SRefl)  τ̂ ⊑ τ̂

(SFun)  τ̂1′ ⊑ τ̂1    τ̂2 ⊑ τ̂2′    ϕ ⊆ ϕ′  ⟹  τ̂1 →ϕ τ̂2 ⊑ τ̂1′ →ϕ′ τ̂2′

(TEConst)  Γ̂ ⊢TE c : τ̂c & ∅

(TEVar)  Γ̂(x) = τ̂  ⟹  Γ̂ ⊢TE x : τ̂ & ∅

(TELet)  Γ̂ ⊢TE e1 : τ̂1 & ϕ1    Γ̂, x : τ̂1 ⊢TE e2 : τ̂ & ϕ2  ⟹  Γ̂ ⊢TE let x = e1 in e2 : τ̂ & ϕ1 ∪ ϕ2

(TEIf)  Γ̂ ⊢TE e1 : bool & ϕ1    Γ̂ ⊢TE e2 : τ̂ & ϕ2    Γ̂ ⊢TE e3 : τ̂ & ϕ3  ⟹  Γ̂ ⊢TE if e1 then e2 else e3 : τ̂ & ϕ1 ∪ ϕ2 ∪ ϕ3

(TEAbs)  Γ̂, x : τ̂1 ⊢TE e : τ̂2 & ϕ2  ⟹  Γ̂ ⊢TE λf x.e : τ̂1 →{f}∪ϕ2 τ̂2 & ∅

(TEApp)  Γ̂ ⊢TE e1 : τ̂1 →ϕ τ̂2 & ϕ1    Γ̂ ⊢TE e2 : τ̂1 & ϕ2  ⟹  Γ̂ ⊢TE e1 e2 : τ̂2 & ϕ1 ∪ ϕ2 ∪ ϕ

(TESub)  Γ̂ ⊢TE e : τ̂ & ϕ    τ̂ ⊑ τ̂′    ϕ ⊆ ϕ′  ⟹  Γ̂ ⊢TE e : τ̂′ & ϕ′

Figure 2.7: Rules for the type and effect system of Example 2.4.2.

In this kind of analysis, effects are sets of function names and, to properly keep track of them, we first need to annotate types as follows

τ̂ ::= τbase | τ̂1 →ϕ τ̂2

where ϕ is the effect, i.e.

ϕ ::= ∅ | ϕ1 ∪ {f} | ϕ1 ∪ ϕ2    with f the name of a function.

Moreover, let ⌊·⌋ be the operator that maps annotated types into their unannotated counterparts, namely:

⌊τ̂⌋ = τ̂                          if τ̂ ∈ τbase
⌊τ̂1 →ϕ τ̂2⌋ = ⌊τ̂1⌋ → ⌊τ̂2⌋

Judgements now have the form

Γ̂ ⊢TE e : τ̂ & ϕ

meaning that the expression e has (annotated) type τ̂ and effect ϕ in the annotated type environment Γ̂, i.e. a typing environment mapping variables to annotated types.

Figure 2.7 shows the rules of the type and effect system; a brief explanation of their meaning is given below. The first two rules, (SRefl) and (SFun), define the relation ⊑ among annotated types. This definition makes τ̂1 →ϕ τ̂2 contravariant in τ̂1 and covariant in τ̂2 and ϕ. Rule (TEConst) is a trivial extension of the corresponding typing rule of Example 2.4.1 and trivially produces an empty effect. Rule (TEVar) produces the expected annotated type and an empty effect, since naming a variable causes no evaluation. Rules (TELet) and (TEIf) extend the corresponding typing rules and combine the effects of their subexpressions. Note that rule (TEIf) may cause a loss of precision in the analysis, since the effects of both branches are taken into account. Rule (TEAbs) produces a functional type decorated with a latent effect, namely the union of the singleton containing the name of the function and the effect of the body of the function abstraction. Rule (TEApp) handles function application and combines the effects of reducing e1, the latent effect of the function abstraction, and the effect of reducing the actual parameter. Finally, rule (TESub) implements subtyping and subeffecting, allowing types and effects to be enlarged as needed.
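The call-tracking analysis can be sketched as a recursive function returning a pair (annotated type, effect); the encoding below is hypothetical and, for brevity, it omits subeffecting (TESub) and the recursive uses of f inside the body.

```python
# A sketch of the call-tracking analysis of Figure 2.7 (hypothetical encoding,
# no subeffecting). Each case returns a pair (annotated type, effect), where
# the effect is the set of names of functions possibly applied.
# Expressions: ("const", c, tau), ("var", x), ("let", x, e1, e2),
# ("if", e1, e2, e3), ("abs", f, x, body, tau1), ("app", e1, e2).
# Annotated function types: ("fun", t1, latent_effect, t2).

def analyse(gamma, e):
    tag = e[0]
    if tag == "const":                                 # (TEConst)
        return e[2], set()
    if tag == "var":                                   # (TEVar)
        return gamma[e[1]], set()
    if tag == "let":                                   # (TELet)
        _, x, e1, e2 = e
        t1, phi1 = analyse(gamma, e1)
        t, phi2 = analyse({**gamma, x: t1}, e2)
        return t, phi1 | phi2
    if tag == "if":                                    # (TEIf): both branches
        _, e1, e2, e3 = e                              # contribute to the effect
        _, phi1 = analyse(gamma, e1)
        t, phi2 = analyse(gamma, e2)
        _, phi3 = analyse(gamma, e3)
        return t, phi1 | phi2 | phi3
    if tag == "abs":                                   # (TEAbs): latent effect
        _, f, x, body, t1 = e                          # is {f} ∪ (body effect)
        t2, phi2 = analyse({**gamma, x: t1}, body)
        return ("fun", t1, {f} | phi2, t2), set()
    if tag == "app":                                   # (TEApp)
        _, e1, e2 = e
        tfun, phi1 = analyse(gamma, e1)
        _, _, latent, t2 = tfun
        _, phi2 = analyse(gamma, e2)
        return t2, phi1 | phi2 | latent

# Applying the identity function named "id": the effect records the call.
ident = ("abs", "id", "x", ("var", "x"), "int")
t, phi = analyse({}, ("app", ident, ("const", 42, "int")))
print(t, phi)  # int {'id'}
```

The abstraction itself has empty effect; only its application releases the latent effect {id}, exactly as rules (TEAbs) and (TEApp) prescribe.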

Note that annotations on types are implicitly considered equal modulo the existence of a unit, commutativity, associativity, and idempotence. Functional types are considered equal if they share the same underlying types and all annotations on corresponding function arrows are "equal as sets." To be more precise, one should introduce suitable rules defining such equivalences among annotations and types.

 Finally, type and effect systems must be semantically correct and a conservative extension of the underlying type system. The first property guarantees that the analysis is correct with respect to the semantics, i.e. that the calculated properties are safe approximations of the real ones. The latter, instead, guarantees that every expression with a type in the underlying type system also has a type in the type and effect system; and whenever an expression has a type in the type and effect system then it has a type in the underlying type system.

Example 2.4.3. Referring again to the functional language from Example 2.3.2 with the type and effect system of Example 2.4.2, semantic correctness takes the form of a subject-reduction theorem:

Theorem 2.4.1 (Subject reduction). If Γ̂ ⊢TE e : τ̂ & ϕ and e → e′, then Γ̂ ⊢TE e′ : τ̂ & ϕ′ where ϕ′ ⊆ ϕ.

Proof. This result can be shown by structural induction over the structure of expressions. [NNH15]

While the property of being a conservative extension becomes:

Theorem 2.4.2 (Conservative extension). (i) If Γ̂ ⊢TE e : τ̂ & ϕ then ⌊Γ̂⌋ ⊢T e : ⌊τ̂⌋.

(ii) If Γ ⊢T e : τ then there exist Γ̂, τ̂ and ϕ such that Γ̂ ⊢TE e : τ̂ & ϕ, ⌊Γ̂⌋ = Γ and ⌊τ̂⌋ = τ.


Proof. Rule (TESub) for subtyping and subeffecting in Figure 2.7 guarantees that the type and effect system is a conservative extension of the type system in Figure 2.6, as requested. [NNH15]

 The interested reader may find further details about type and effect systems, algorithms, and methodologies in [NN99, NNH15].


Chapter 3

MLCoDa

3.1 The design of MLCoDa

In this section, we informally overview the main programming features of MLCoDa.

In particular, we discuss the mechanisms that allow adaptation and permit defining and managing the context. After that, we briefly summarise the execution model of MLCoDa programs.

As mentioned above, MLCoDa is a context-oriented programming language composed of two components [DFG16b]. This two-component design enforces separation of concerns between the specific abstractions for the context and those used for the actual computations.

The first component is Datalog [CGT89] and it is dedicated to context definition and management. Intuitively, the context is a heterogeneous collection of data coming from different sources and having different representations. Designing a context requires different skills than those needed to program standard applications and, usually, a requirement engineer takes care of that. The declarative approach of Datalog allows engineers to express what information the context has to include, leaving to the virtual machine the task of collecting and managing the data. In MLCoDa the context is constituted by a set of facts predicating over the data domain, and a set of rules to deduce further (implicit) properties of the context itself. The deduction mechanism of Datalog, which is fully decidable in polynomial time [CGT89], allows programs to query the context to decide whether properties hold in it or not. Finally, to keep data coming from inside the application separate from those coming from outside, the context is split into two parts: the system part and the application part. The system part concerns the data about the environment in which the program runs, while the application part includes specific knowledge related to the program (e.g. application settings). Both system and application parts are accessed and managed uniformly from the language.

Figure 3.1: The execution model of MLCoDa. [DFG16b]

The second component, the computational one, is inspired by ML to which MLCoDa adds two adaptation primitives.

The first primitive is called context-dependent binding and allows the programmer to declare variables that depend on the context (such variables will be called parameters hereafter). The dlet construct implements such a primitive using a syntax similar to the classical let, but with the addition of a Datalog goal.

The second primitive, which is one of the fundamental concepts in COP, is the behavioural variation. Intuitively, a behavioural variation is a chunk of behaviour that can be activated depending on the context, allowing an application to adapt and modify its behaviour dynamically. In MLCoDa this mechanism is implemented as a construct similar to pattern matching with Datalog goals as selectors. Syntactically, a behavioural variation is essentially a list of pairs of the form goal.expression. When a behavioural variation is evaluated, it is reduced to the expression associated with the first goal that holds in the context. In MLCoDa behavioural variations are first-class constructs, so they can be referred to by identifiers, and passed to and returned by functions. This helps in programming dynamic adaptation patterns, as well as reusable and modular code. Also, a behavioural variation can be supplied by the context, and then composed with existing ones.

Another important aspect of MLCoDa concerns its execution model, which is somewhat atypical (see Figure 3.1). The compiler produces a triple (C, e, H), where C is the application context, e is the program object code, and H over-approximates e in terms of actions performed on the context. H can be used to show and verify properties about the program. Given (C, e, H), the virtual machine first resolves the system variables and links the application and the system contexts (linking phase); then it uses H to check whether e will adapt to all the contexts that may occur at runtime (verification phase). If both phases succeed, the program can be executed; otherwise, the execution is aborted.

3.2 A guided tour of MLCoDa

In this section we survey the main characteristics and mechanisms provided by MLCoDa, using an example. This example builds on those found in [DFG16b] and

in [BDG16], with some additions.

Consider a museum with a cinema room used to project documentaries and videos about the realisation of exhibits or their history, and suppose that we are required to develop a ticketing system for this internal cinema. The showing time is divided into time slots, and videos are played once per time slot. Of course, the number of tickets is limited by the number of seats in the room. Visitors have different ways to buy tickets: they can buy them from the museum website or directly on the spot, through one of the several vending machines or cashiers, each with her own personal computer. Once the visitor has the ticket, she can visit the museum, using the guide application [DFG16b], and also access the cinema room during its time slot.

In the first subsection below we illustrate how to realise the Datalog knowledge base, i.e. the context, for the museum and the ticketing system. The second subsection describes how the adaptation mechanisms anticipated in Section 3.1 work using short examples of code and simulating their execution. The third subsection introduces the constructs that MLCoDa uses to manipulate the facts in

the context, namely tell and retract. Finally, in the last subsection we introduce intuitively how the verification mechanism works.

To simplify the presentation and make things clearer, hereafter we use a sugared, ML-like version of the MLCoDa syntax of Section 3.3.

3.2.1 The Context

The context design constitutes one of the most important choices in the development of a context-aware application. As anticipated above, we split the context into two parts, one for the system and one for the application.

As for the system part of the context, the Datalog knowledge base should include predicates and facts representing properties and capabilities of the system. For example, we know that the ticketing application may be executed on different machines, i.e. a personal computer at the museum counter, a vending machine, or a web server. Of course, the application itself should be able to determine on which device it is running to decide, for example, which GUI to provide or what information must be visible. If the program is running on a vending machine, the following fact will hold in the system context


on_vending_machine () ←

on the other hand, if the program is running on the cashier’s personal computer, it will hold

on_cashier_pc () ←

and, finally, if the application is running on a web server, the following will hold

on_webserver () ←

Besides the system context, the requirement engineer should also specify the application context. In our example, it may be useful to store user preferences, such as her preferred language or accessibility settings, and seat availability information. For example, if a user prefers the Italian language and she declares to be blind, the following facts will hold in the application context

user_prefer (italian) ←
user_acc_opt (blind) ←

or, again, the following fact tells that the seat 42 (denoted by seat_42) is busy in the second time slot (ts_2) on the 26th of May (date_26_05)

busy (seat_42, date_26_05, ts_2) ←

Up to now we listed only facts, but the machinery of Datalog also allows for complex predicates that require some deduction in the knowledge base. For example, the predicate full is a conjunction of busy predicates, indicating whether on a given date and time slot the room is full:

full (date_26_05, ts_2) ←

busy (seat_1, date_26_05, ts_2), . . .

busy (seat_N, date_26_05, ts_2).
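The deduction of full from the busy facts can be sketched as follows; the three-seat room and the names below are invented for illustration.

```python
# A sketch of deducing the `full` predicate from `busy` facts; the context is
# modelled as a set of ground fact tuples (a hypothetical encoding).

SEATS = ["seat_1", "seat_2", "seat_3"]   # a hypothetical 3-seat room

def holds_full(facts, date, slot):
    """full(date, slot) holds iff busy(s, date, slot) holds for every seat."""
    return all(("busy", s, date, slot) in facts for s in SEATS)

# Every seat is busy in ts_2 on the 26th of May, so the room is full then.
facts = {("busy", s, "date_26_05", "ts_2") for s in SEATS}
print(holds_full(facts, "date_26_05", "ts_2"))   # True
print(holds_full(facts, "date_26_05", "ts_1"))   # False
```

The conjunction in the clause body corresponds directly to the `all(...)` over the busy facts; a real Datalog engine would derive this implicit fact by its standard deduction mechanism.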

3.2.2 MLCoDa adaptation mechanisms

As anticipated, MLCoDa provides two mechanisms for adaptation: context-dependent binding and behavioural variation. Below we present some short code snippets to better illustrate their use and meaning.

Context-dependent binding

As said in Section 3.1, MLCoDa provides context-dependent binding as a construct with a syntax similar to the classical let, but with the addition of a Datalog goal. More precisely, it is a construct of the form


dlet x = e1 when G in e2

that intuitively declares the parameter x, which may denote different objects depending on the current context and on its properties, checked by evaluating the goal G.

Assume that our application has a window through which the user is asked to select the desired date and time slot for booking the ticket. Of course, we want the application to warn the user if she selects a date or time slot for which the cinema room is full. In the code snippet below, when txt_full_warn is referred to in a context where full holds, i.e. when the user selected a fully booked date and time slot, txt_full_warn is bound to the handle of the label with the warning message returned by getLabel (). On the other hand, if the snippet is evaluated in a context where full does not hold, an adaptation error occurs, unless the programmer has considered this case. Roughly, an adaptation error occurs when the application relies on a property that does not hold in the context.

dlet txt_full_warn = getLabel (str_cinema_full_warning)
  when ← full (selected_date, selected_slot)
in (* ... *)

Things become more interesting when one nests multiple context-dependent bindings. For example, consider the need to translate the application into multiple languages. As expected, the parameter str_cinema_full_warning of the above snippet must assume different values, depending on the language the user selected as preferred. To implement this behaviour, in the snippet below we declare the parameter str_cinema_full_warning multiple times. Differently from Standard ML there is no shadowing among declarations; instead, the binding appropriate for the context is selected when the parameter is referred to:

dlet str_cinema_full_warning = getItalianWarning()
  when ← user_prefer (italian)
in
  dlet str_cinema_full_warning = getEnglishWarning()
    when ← user_prefer (english)
  in
    (* ... *)

Behavioural variation

The mechanism of behavioural variations is designed to ease programming dynamic adaptation patterns. With the following code we declare the behavioural variation

bv with a formal parameter x

bv = (x) { G1.e1, . . . , Gn.en }

that, when applied, alters the control flow assuming the behaviour of the selected expression ei according to which goal Gi holds in the context, so as to dynamically adapt the running application. Behavioural variations are first-class values, so we can bind them to variables, pass them to functions, manipulate them with ad-hoc MLCoDa operators, and so on. Executing a behavioural variation is similar

to calling a function. They are applied to their parameters (e.g. x), through the application operator #(bv, v). The application triggers the so-called dispatching mechanism that in MLCoDa consists of (i) visiting the cases of the behavioural

variation in textual order; (ii) selecting the first expression ei whose goal Gi holds; (iii) evaluating ei in an environment where x is bound to v. If none of the goals holds, then the behavioural variation fails with an adaptation error.

The example below illustrates the usage of behavioural variations to generate and show a different user interface for each of the available buying methods, i.e. website, cashier or vending machine.

fun showBuyTicketUI () =
  let showGUIBV = () {
    ← on_webserver. deliverWebPage(),
    ← on_cashier_pc. showCashierGUI(),
    ← on_vending_machine. showVendingMachineGUI()
  }
  in #(showGUIBV, ())

Calling showGUIBV triggers the dispatching mechanism. It sequentially inspects the goals (i.e. on_webserver, on_cashier_pc, and on_vending_machine) to select the right expression to evaluate. If a customer is buying a ticket through the vending machine, both the first and the second goal fail, while the last one is satisfied and the function showVendingMachineGUI is invoked and evaluated. Note that dispatching is not invoked when the behavioural variation is defined, but only when it is applied using the operator #. In this example, an adaptation error would occur if the program were run on a device different from those expected. Finally, since behavioural variations are first-class values, we can also manipulate them. MLCoDa is equipped with the binary operator ∪ that concatenates two

behavioural variations into one, extending the cases over which they are defined. As an example consider the following snippet of code


The function addDefault takes a parameter-less behavioural variation bv and an expression dv, and uses the operator ∪ to extend the behavioural variation with a default case that behaves like dv.
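The dispatching mechanism described in steps (i)-(iii) above can be sketched as follows; goals are modelled as plain predicates over a set of facts, a simplification of the Datalog goals of MLCoDa (in particular, substitutions are ignored), and the thunks standing for the GUI code are invented.

```python
# A sketch of dispatching over a behavioural variation: cases are inspected
# in textual order and the first case whose goal holds in the context is
# selected; if none holds, an adaptation error occurs.

class AdaptationError(Exception):
    pass

def dispatch(context, variation):
    """Return the expression of the first case whose goal holds, else fail."""
    for goal, expr in variation:
        if goal(context):
            return expr
    raise AdaptationError("no case of the behavioural variation applies")

# A variation in the style of showGUIBV, with thunks in place of GUI code.
show_gui = [
    (lambda C: "on_webserver" in C,       lambda: "web page"),
    (lambda C: "on_cashier_pc" in C,      lambda: "cashier GUI"),
    (lambda C: "on_vending_machine" in C, lambda: "vending machine GUI"),
]

ctx = {"on_vending_machine"}
print(dispatch(ctx, show_gui)())   # vending machine GUI
```

The operator ∪ then amounts to list concatenation in this model: appending a case whose goal always holds, as addDefault does, guarantees that dispatching never raises AdaptationError.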

3.2.3 Updating the context

We explained how to define a context, how to query it in MLCoDa, and how programs can adapt to it, but not how to manipulate it. MLCoDa provides two constructs to manipulate the context: tell, which adds facts to the context, and retract, which removes facts from it.

For example, in the program settings there may be an option to switch the application's language. The function changeLanguage implements such a feature and accepts the parameter nl that specifies the new language to be used:

fun changeLanguage nl =
  tell ( user_prefer ( nl ) )

Note that tell and retract are not in the Datalog part of the language and cannot be used within goals or in context descriptions, so querying the context has no side effects. As expected, MLCoDa restricts these two operations to the application context, hence preventing security flaws and invalid states.
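The split of the context and the restriction of updates to its application part can be sketched as follows; the class and its interface are invented for illustration.

```python
# A sketch of tell/retract over a context split into a system part and an
# application part (hypothetical encoding). Mirroring the restriction above,
# updates only touch the application part; the system part is read-only.

class Context:
    def __init__(self, system, application):
        self.system = frozenset(system)     # read-only from the program
        self.application = set(application)

    def holds(self, fact):
        return fact in self.system or fact in self.application

    def tell(self, fact):                   # add a fact to the app part
        self.application.add(fact)

    def retract(self, fact):                # remove a fact from the app part
        self.application.discard(fact)

ctx = Context(system={("on_vending_machine",)},
              application={("user_prefer", "english")})

# In the spirit of changeLanguage: replace the preferred language.
def change_language(ctx, old, nl):
    ctx.retract(("user_prefer", old))
    ctx.tell(("user_prefer", nl))

change_language(ctx, "english", "italian")
print(ctx.holds(("user_prefer", "italian")))   # True
```

Since `tell` and `retract` only touch `application`, system facts such as on_vending_machine can never be asserted or retracted by the program, matching the restriction stated above.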

3.2.4 History expressions and verification

As outlined in Section 3.1, MLCoDa uses a two-step verification mechanism to detect adaptation errors. More precisely, in the first step the compiler computes a triple (C, e, H), where C is the initial application context, e is the object code corresponding to the program, and H is a history expression that over-approximates the queries and updates to the context possibly made during the execution of e. In the second step, at load time, the virtual machine merges the context C with the system context and uses H to produce a graph describing how the merged context may change at run time. By visiting the graph, the virtual machine can detect whether the queries in the program may fail at run time.

As an example, consider a function that allows a visitor to delete a booked ticket and that behaves differently across devices. More precisely, the request of deletion can be completed only at the cashier's desk or online. Assuming the existence of a function showMessage that shows a message on the screen, and of a properly localised string str_not_possible_warning telling the user to go to the cashier's desk to get a refund, the following snippet of code implements the above:

fun deleteTicket date timeslot seat =
  let deleteBV = () {
    ← on_vending_machine.
        showMessage str_not_possible_warning,
    ← True.
        retract (busy (seat, date, timeslot))
  }
  in #(deleteBV, ())

The execution of the behavioural variation deleteBV is abstractly represented by the following history expression, where showMessage is assumed to have an empty effect (i.e. HM = ε):

H = ask ← on_vending_machine. HM ⊗
      (ask ← True. retract(busy(seat, date, timeslot)) ⊗ fail)
  = ask ← on_vending_machine. ε ⊗
      (ask ← True. retract(busy(seat, date, timeslot)) ⊗ fail)

The first element checks whether the goal ← on_vending_machine holds in the current context, and then reduces to the abstraction of the behaviour of showMessage. Otherwise the other alternative is checked, where retract(busy(seat, date, timeslot)) abstracts the removal of the fact busy(seat, date, timeslot) from the context. If no guard succeeds, the occurrence of an adaptation error is represented by the distinguished history expression fail.

In this example, given any initial context C, the load-time analysis shows that the behavioural variation never fails. Indeed, this demonstrates that deleteBV will never incur an adaptation error.

An in-depth description of the verification mechanism of MLCoDa is outside the scope of this thesis; further details are in Degano et al. [DFG16b].

3.3 Syntax and semantics of MLCoDa

In this section we define the syntax and the semantics of MLCoDa.

In the first subsection, we introduce the original MLCoDa syntax [DFG16b], which is enriched in Chapter 4 to deal with concurrent applications and to allow for runtime checks. In the second subsection, we present the substitution semantics of MLCoDa, recalling it from [DFG16b]. This semantics is the starting point of the first three concurrent semantics presented here (see Chapter 4).

3.3.1 Syntax of MLCoDa

As already said, MLCoDa consists of two components: Datalog (the context component) and a core of ML

x ∈ Var    c ∈ Const    P ∈ Predicate

u ::= x | c                        u ∈ Term
A ::= P(u1, . . . , un)            A ∈ Atom
L ::= A | ¬A                       L ∈ Literal
R ::= A ← B.                       R ∈ Clause
B ::= ε | L, B                     B ∈ ClauseBody
F ::= A ← ε                        F ∈ Fact, where no variables occur in A
G ::= ← L1, . . . , Ln.            G ∈ Goal

Figure 3.2: Syntax of the context component of MLCoDa.

extended with COP features for computations (the application component).

Context component

As said, the Datalog part is standard: a program is a finite set of facts and clauses. In the following we also assume a closed world and that each Datalog program is safe and stratified. A Datalog program is safe if (i) each fact is ground; (ii) each variable occurring in the head of a clause occurs in the body of the same clause; and (iii) each variable occurring in a negative literal also occurs in a positive literal of the same clause. A Datalog program is stratified if, before evaluating a predicate in a clause head, it is always possible to completely evaluate all the predicates occurring negatively in the clause body or in the bodies of some subsequent clauses, together with all the predicates needed to evaluate these negative predicates [CGT89]. Finally, we assume given the set Var (ranged over by x, y, . . .) of variables, Const (ranged over by c, n, . . .) of basic constants, and Predicate (ranged over by P, . . .) of predicate symbols.
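The safety conditions (i)-(iii) can be sketched as a small check over clauses; the clause encoding and the uppercase-variable convention below are assumptions made for illustration.

```python
# A sketch of the Datalog safety conditions (i)-(iii) above. A clause is
# encoded as (head, body), a head as (predicate, args), and a body literal as
# (sign, predicate, args) with sign "pos" or "neg". By a hypothetical
# convention, variables are strings starting with an uppercase letter.

def is_var(term):
    return term[:1].isupper()

def vars_of(args):
    return {t for t in args if is_var(t)}

def is_safe(head, body):
    positive = set()                       # variables bound by positive literals
    for sign, _, args in body:
        if sign == "pos":
            positive |= vars_of(args)
    if not body:                           # (i) a fact must be ground
        return not vars_of(head[1])
    if not vars_of(head[1]) <= positive:   # (ii) head vars occur in the body
        return False
    for sign, _, args in body:             # (iii) vars of negative literals
        if sign == "neg" and not vars_of(args) <= positive:  # occur positively
            return False
    return True

# full(D, T) <- busy(seat_1, D, T), busy(seat_2, D, T).   -- safe
safe_clause = (("full", ["D", "T"]),
               [("pos", "busy", ["seat_1", "D", "T"]),
                ("pos", "busy", ["seat_2", "D", "T"])])
print(is_safe(*safe_clause))            # True
print(is_safe(("busy", ["S"]), []))     # False: non-ground fact
```

Stratification, instead, is a global property of the whole program (it constrains the dependency graph of predicates through negation) and is not captured by this per-clause check.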

In Figure 3.2 we show the grammar generating the language of the MLCoDa context component. As usual in Datalog, a term is a variable x or a constant c; an atom A is a predicate symbol P applied to a list of terms; a literal is a positive or a negative atom; a clause is composed of a head, i.e. an atom, and a body, i.e. a possibly empty sequence of literals; a fact is a clause with an empty body (we only have ground facts); and a goal is a clause with an empty head. G ranges over goals, which occur in behavioural variations. Finally, each Datalog predicate is assumed to have a fixed arity and a type, according to e.g. [MO84].

Applications component

The application component inherits most ML constructs. Figure 3.3 shows the syntax of the MLCoDa application component. As can be seen, MLCoDa values are,


Va ::= G.e | G.e, Va
v  ::= c | λf x.e | (x){Va} | F
e  ::= v | x | x̃ | e1 e2 | if e1 then e2 else e3 | let x = e1 in e2 |
       dlet x̃ = e1 when G in e2 | tell(e1) | retract(e1) | e1 ∪ e2 | #(e1, e2)

Figure 3.3: Syntax of the application component of MLCoDa.

besides the basic constants c and the function abstractions λf x.e, variations Va and Datalog facts F. The application component also introduces the so-called parameters, in the form x̃ ∈ DynVar (with Var ∩ DynVar = ∅), i.e. variables that assume values depending on the properties of the running context. As illustrated above in Sections 3.1 and 3.2, MLCoDa has two COP mechanisms. The first is that of behavioural variations (x){Va}, each consisting of a variation Va, i.e. a list G1.e1, . . . , Gn.en of expressions guarded by Datalog goals Gi (x possibly free in ei). At run time, the first goal Gi satisfied by the context determines the expression ei to be run (dispatching). The second mechanism is the dlet construct, which implements the context-dependent binding of a parameter x̃ to a variation Va. The tell/retract constructs update the context by asserting/retracting facts. The append operator e1 ∪ e2 dynamically concatenates behavioural variations, thus allowing for dynamically composing them. The application of a behavioural variation #(e1, e2) applies e1 to the value of its argument e2. To do so, the dispatching mechanism is triggered to query the context and to select from e1 the expression to run.

3.3.2 Semantics of MLCoDa

Degano et al. [DFG16b] provided a substitution-based small-step operational semantics for MLCoDa.

Concerning the context component, the adopted semantics is the standard top-down semantics for stratified Datalog programs, as presented in [CGT89]. Given a context C and a goal G, C ⊨ G with θ means that there exists a substitution θ, i.e. a mapping from the variables of G to constants, such that the goal G is satisfied in C.

Concerning the application component, the semantics is inductively defined on closed expressions, with no free variables but possibly with free parameters. In fact, since program execution takes place after the linking and verification phases as outlined in Section 3.1, it is safe to assume that only closed expressions will be evaluated. Parameters will assume values according to a dynamic environment ρ,


i.e. a function mapping parameters to variations, DynVar → Va. The standard updating operator is used to update the dynamic environment:

ρ[b/x](y) = b      if y = x
ρ[b/x](y) = ρ(y)   otherwise

In this small-step operational semantics, transitions are in the form

ρ ⊢ C, e → C′, e′

meaning that, in the dynamic environment ρ, the expression e in the context C reduces in one step to e′, changing the context C into C′.

Figure 3.4 shows the semantic rules of MLCoDa.

Rules (Tell1-2) and (Retract1-2) deal with the tell/retract constructs of MLCoDa in the standard way. First, the actual parameter of the construct is reduced to a fact (rules (Tell1)/(Retract1)); then the resulting fact is added to (tell) or removed from (retract) the context. The rules for dlet and the rule (Par) implement context-dependent binding in MLCoDa. Rule (Par) reduces the parameter x̃ to the correct expression e, as selected by the dispatching mechanism. More precisely, the rule looks in ρ for the variation Va bound to the parameter x̃ and uses the dispatching mechanism to dynamically resolve parameters and behavioural variations. The dispatching mechanism is implemented as the function dsp:

dsp(C, (G.e, Va)) = (e, θ)         if C ⊨ G with θ
dsp(C, (G.e, Va)) = dsp(C, Va)     otherwise

dsp(C, G.e) = (e, θ)               if C ⊨ G with θ
dsp(C, G.e) = fail                 otherwise

The function inspects the variation Va from left to right, looking for the first goal G that is satisfied by C under a substitution θ binding the variables of G. In case of success the result is the pair (e, θ), where e is the expression corresponding to the goal G, and the parameter x̃ is reduced to eθ. If instead the dispatching mechanism fails, the computation gets stuck, since the program cannot adapt to the given context. Rule (Dlet1) appends the pair G.e1 to the binding of x̃ in ρ, and then evaluates e2 in the updated environment. Rule (Dlet2) is standard: the dlet reduces to the value eventually computed by e2. The ∪ operator on variations is implemented by the rules (Append1-3). First, rule (Append1) reduces the expression e1 until it becomes a behavioural variation; then rule (Append2) does the same with e2. Finally, rule (Append3) concatenates the two variations, renaming their bound variables to a fresh z to avoid name captures. The rules for the if construct, the let construct and function application are essentially those presented in Example 2.3.2. Finally, the rules (VaApp1-3) deal with behavioural variation application #(e1, e2). They first evaluate e1 until it reduces to a behavioural variation (x){Va} and e2 until it reduces to a value v (rules (VaApp1-2)). Then rule (VaApp3) invokes the dispatching mechanism to select the expression e and the substitution {c̄/ȳ}, substitutes v for x and c̄ for ȳ in e, and continues the execution with the resulting expression.
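To make the mechanism concrete, here is a minimal Python sketch of dsp, under the simplifying assumptions that a variation is a list of (goal, expression) pairs and that goal satisfaction C ⊨ G is delegated to a user-supplied predicate returning a substitution on success; all names are illustrative:

```python
# Hypothetical sketch of the dispatching mechanism dsp. A variation is a
# list of (goal, expression) pairs, inspected from left to right.

def dsp(context, variation, satisfies):
    """Return (e, theta) for the first goal satisfied by the context,
    or None when the dispatching mechanism fails (stuck computation)."""
    for goal, expr in variation:
        theta = satisfies(context, goal)   # C |= G with substitution theta
        if theta is not None:
            return expr, theta
    return None

# Toy goal satisfaction: a goal is a single ground fact, satisfied with
# the empty substitution exactly when it belongs to the context.
def satisfies(context, goal):
    return {} if goal in context else None

C = {"gps_on", "user_logged"}
va = [("bluetooth_on", "e_bt"), ("gps_on", "e_gps")]
print(dsp(C, va, satisfies))       # -> ('e_gps', {})
print(dsp(set(), va, satisfies))   # -> None
```

In the real language the substitution binds the goal variables resolved by the Datalog machinery; here it is trivially empty because the toy goals are ground.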


(Tell1)    ρ ⊢ C, e → C′, e′  ⟹  ρ ⊢ C, tell(e) → C′, tell(e′)
(Tell2)    ρ ⊢ C, tell(F) → C ∪ {F}, ()
(Retract1) ρ ⊢ C, e → C′, e′  ⟹  ρ ⊢ C, retract(e) → C′, retract(e′)
(Retract2) ρ ⊢ C, retract(F) → C \ {F}, ()
(Par)      ρ(x̃) = Va, dsp(C, Va) = (e, {c̄/ȳ})  ⟹  ρ ⊢ C, x̃ → C, e{c̄/ȳ}
(Dlet1)    ρ[G.e1, ρ(x̃)/x̃] ⊢ C, e2 → C′, e2′  ⟹  ρ ⊢ C, dlet x̃ = e1 when G in e2 → C′, dlet x̃ = e1 when G in e2′
(Dlet2)    ρ ⊢ C, dlet x̃ = e1 when G in v → C, v
(Append1)  ρ ⊢ C, e1 → C′, e1′  ⟹  ρ ⊢ C, e1 ∪ e2 → C′, e1′ ∪ e2
(Append2)  ρ ⊢ C, e2 → C′, e2′  ⟹  ρ ⊢ C, (x){Va1} ∪ e2 → C′, (x){Va1} ∪ e2′
(Append3)  z fresh  ⟹  ρ ⊢ C, (x){Va1} ∪ (y){Va2} → C, (z){Va1{z/x}, Va2{z/y}}
(If1)      ρ ⊢ C, e1 → C′, e1′  ⟹  ρ ⊢ C, if e1 then e2 else e3 → C′, if e1′ then e2 else e3
(If2)      ρ ⊢ C, if true then e2 else e3 → C, e2
(If3)      ρ ⊢ C, if false then e2 else e3 → C, e3
(Let1)     ρ ⊢ C, e1 → C′, e1′  ⟹  ρ ⊢ C, let x = e1 in e2 → C′, let x = e1′ in e2
(Let2)     ρ ⊢ C, let x = v in e2 → C, e2{v/x}
(App1)     ρ ⊢ C, e1 → C′, e1′  ⟹  ρ ⊢ C, e1 e2 → C′, e1′ e2
(App2)     ρ ⊢ C, e2 → C′, e2′  ⟹  ρ ⊢ C, (λf x.e) e2 → C′, (λf x.e) e2′
(App3)     ρ ⊢ C, (λf x.e) v → C, e{v/x, (λf x.e)/f}
(VaApp1)   ρ ⊢ C, e1 → C′, e1′  ⟹  ρ ⊢ C, #(e1, e2) → C′, #(e1′, e2)
(VaApp2)   ρ ⊢ C, e2 → C′, e2′  ⟹  ρ ⊢ C, #((x){Va}, e2) → C′, #((x){Va}, e2′)
(VaApp3)   dsp(C, Va) = (e, {c̄/ȳ})  ⟹  ρ ⊢ C, #((x){Va}, v) → C, e{v/x, c̄/ȳ}

Figure 3.4: Semantics of MLCoDa. [DFG16b]


H ::= ε | h | µh.H | H1 + H2 | H1 · H2 | tell F | retract F | ∆

∆ ::= ask G.H ⊗ ∆ | fail

Figure 3.5: Syntax of history expressions in MLCoDa. [DFG16b]

3.4

Type and effect system

MLCoDa expressions are endowed with a type and effect system. Each expression is annotated with a type, which is almost standard, and with an effect, called a history expression. History expressions are used by the virtual machine in the verification phase to ensure that the dispatching mechanism will never fail during the execution. Moreover, we use history expressions in the first concurrent semantics proposed here (see Chapter 4).

3.4.1

History expressions

History expressions [BDFZ09] are a basic process algebra abstracting the set of runtime behaviours that a program may exhibit. In MLCoDa, history expressions approximate the sequence of actions that a program may perform on the context at runtime. Figure 3.5 introduces their syntax: ε is the empty history expression, abstracting programs with no interactions with the context; µh.H abstracts possibly recursive functions, where h is the recursion variable; the non-deterministic sum H1 + H2 describes conditionals, i.e. the if-then-else construct; the concatenation H1 · H2 stands for sequences of actions; tell F and retract F denote the corresponding constructs used to manipulate the context in MLCoDa; finally ∆, called the abstract variation, abstracts the behaviour of the dispatching mechanism and is defined as a list of history expressions guarded by goals.

A closed history expression (i.e. one with no free variables) can be evaluated in a context using a transition system. The transitions defined in Figure 3.6 have the form

C, H → C′, H′

indicating that the history expression H in context C reduces to H′, producing the context C′. The first two rules reduce H1 · H2 by first reducing H1 to ε and then evaluating ε · H2 to H2. The rule for recursion reduces a history µh.H to the history H in which h is substituted with µh.H. The sum H1 + H2 is non-deterministically evaluated either to the reduced version of H1 or to that of H2.


C, ε · H → C, H
C, H1 → C′, H1′  ⟹  C, H1 · H2 → C′, H1′ · H2
C, µh.H → C, H[µh.H/h]
C, H1 → C′, H1′  ⟹  C, H1 + H2 → C′, H1′
C, H2 → C′, H2′  ⟹  C, H1 + H2 → C′, H2′
C, tell F → C ∪ {F}, ε
C, retract F → C \ {F}, ε
C ⊨ G  ⟹  C, ask G.H ⊗ ∆ → C, H
C ⊭ G  ⟹  C, ask G.H ⊗ ∆ → C, ∆

Figure 3.6: Semantics of history expressions in MLCoDa. [DFG16b]

The rules for tell F and retract F respectively add the fact F to, or remove it from, the context, leading to the empty history expression ε. Finally, the rules for the abstract variation reduce ask G.H ⊗ ∆ to H if G is satisfied in C, and to ∆ otherwise. If no goal is satisfied, the stuck configuration fail is reached, representing an adaptation error.
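The transition system above can be prototyped as a small evaluator; the following Python sketch encodes history expressions as tuples and resolves the non-deterministic sum with a left-biased choice, so it explores only one of the possible reductions (all names and the encoding are illustrative):

```python
# Hypothetical encoding of history expressions as tuples:
# ("eps",), ("var", h), ("mu", h, H), ("sum", H1, H2), ("seq", H1, H2),
# ("tell", F), ("retract", F), ("ask", G, H, Delta), ("fail",).

def subst(H, h, R):
    """H[R/h]: substitute the recursion variable h with R in H."""
    tag = H[0]
    if tag == "var":
        return R if H[1] == h else H
    if tag == "mu" and H[1] == h:      # h is rebound: stop
        return H
    return (tag,) + tuple(subst(a, h, R) if isinstance(a, tuple) else a
                          for a in H[1:])

def run(C, H, sat):
    """Reduce H in context C to a final context; raise on fail."""
    tag = H[0]
    if tag == "eps":
        return C
    if tag == "tell":
        return C | {H[1]}
    if tag == "retract":
        return C - {H[1]}
    if tag == "seq":
        return run(run(C, H[1], sat), H[2], sat)
    if tag == "sum":
        return run(C, H[1], sat)       # left-biased choice, a sketch
    if tag == "mu":
        return run(C, subst(H[2], H[1], H), sat)
    if tag == "ask":
        return run(C, H[2] if sat(C, H[1]) else H[3], sat)
    raise RuntimeError("adaptation error: fail reached")

sat = lambda C, G: G in C              # toy goal satisfaction
H = ("seq", ("tell", "F2"),
     ("ask", "F2", ("retract", "F1"), ("fail",)))
print(run({"F1"}, H, sat))             # -> {'F2'}
```

Note that the real semantics is non-deterministic for sums and may diverge on µ; a faithful analysis would explore all branches, while this sketch picks one.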

3.4.2

Type and effect system

The type and effect system of MLCoDa is based on the history expressions presented above and assumes that the Datalog part is typed, i.e. that each predicate has a fixed arity and a type [DFG16b]. More precisely, from now on we assume that there exists a function γ that, given a Datalog goal G, returns a list of pairs (x, type-of-x), one for each variable x of G.

Recalling what we explained in Section 2.4, a typing environment is needed:

Γ ::= ∅ | Γ, x : τ

where ∅ denotes the empty typing environment and Γ, x : τ denotes the environment with all the bindings of Γ plus a binding of type τ for the variable x (x not occurring in Γ). Since MLCoDa also comprises parameters, we need a parameter environment

K ::= ∅ | K, (˜x, τ, ∆)

mapping each parameter x̃ to a type and an abstract variation. Typing judgements of MLCoDa then have the form

Γ; K ` e : τ . H

meaning that in the typing environment Γ and parameter environment K the expression e has type τ and effect H.

As Figure 3.7 shows, types include basic types, i.e. int, bool and unit, functional types, behavioural variation types, and facts. Some annotations are associated with functional types and behavioural variation types to support the static
