Generation of concurrent code from Labelled Transition Systems


UNIVERSITY OF PISA
MASTER'S DEGREE IN COMPUTER ENGINEERING
MASTER'S THESIS

Generation of Concurrent Code from LTS Models

Author: Cosimo SACCO
Supervisor: Paolo ANCILOTTI
Co-Supervisor: Andrea DOMENICI

Academic Year 2012/2013


Abstract

Concurrent programs are intrinsically more complex than sequential programs. Such additional complexity hampers all development phases. At the design level, proving safety and liveness properties becomes more difficult. Translating concurrency domain concepts into language-specific mechanisms does not preserve design information. Manual implementation is time-consuming and error-prone. Finally, the non-reproducibility of synchronization errors makes traditional testing techniques totally ineffective.

Model-based design approaches can be used to deal with concurrent program design issues. By abstracting from all implementation details, model-based design achieves high intentionality and complexity reduction. Design correctness can be mechanically and formally proved by means of model validation and verification.

The implementation phase can be automated through generative programming. Generative programming is an implementation approach focused on the production of highly reusable code generators. The introduction of code generators in the development cycle leads to a drastic reduction of defects and a considerable productivity increase.

fspc (a model-based analysis tool) and codegen (a code generator) constitute an open source development toolchain. The proposed approach is based on the specification language FSP, a process algebra, which is translated into LTS models. These are validated, verified and translated into high-level code stubs.

This work focuses on the design and implementation of codegen, with reference to the Java monitor code generation case study. A comprehensive analysis of code generation-specific model validation constraints is conducted, resulting in the definition of original model validation techniques aimed at guaranteeing code generation feasibility. Code generation is seen as incremental information refinement carried out through model-to-model and model-to-text transformations.


Contents

1 Introduction 1

1.1 Definitions . . . 1

1.2 Differences between sequential programs and concurrent programs . . . 2

1.2.1 Advantages of concurrent programming . . . 2

1.3 Difficulties in concurrent programming . . . 3

1.3.1 Difficulties in concurrent code development . . . 3

1.3.2 A new kind of errors . . . 3

1.3.3 Proving correctness . . . 4

1.4 Approaches based on models . . . 4

1.5 Generative programming . . . 6

1.5.1 Drawbacks of manual implementation . . . 6

1.5.2 About generative programming . . . 6

1.6 About this thesis . . . 7

1.6.1 Subject . . . 7

1.6.2 fspc overview . . . 8

1.6.3 Organization . . . 8

2 FSP specifications and LTS models 11

2.1 Specifications and models . . . 11

2.2 FSP expressions . . . 12

2.3 FSP definitions . . . 13

2.3.1 Const, set and range definitions . . . 13

2.3.2 Process definitions . . . 13

2.4 Parallel composition . . . 17

2.4.1 Parallel composition of non-interacting processes . . . 17

2.4.2 Parallel composition of interacting processes . . . 17

2.4.3 Formal definition of the parallel composition operator . . . 18

2.5 Process labeling . . . 20

2.6 Process sharing . . . 21


3.2 State of the art . . . 27

3.2.1 CSP++ . . . 27

3.2.2 CSP# to C# . . . 27

3.2.3 Finite State Process Autocoder . . . 27

3.2.4 PADL2Java . . . 28

3.2.5 Process Calculus Virtual Machine . . . 28

3.3 Comparison with fspc codegen . . . 28

3.3.1 Human interaction . . . 28

3.3.2 Portability . . . 29

3.3.3 Correctness . . . 29

4 fspc codegen 31

4.1 codegen overview . . . 31

4.2 codegen design . . . 34

4.2.1 Integration with fspc . . . 34

4.2.2 Information representation . . . 34

4.2.3 Information refinement . . . 35

5 Model validation and transformation 39

5.1 LTS expressiveness . . . 40

5.1.1 Control flow direction . . . 40

5.1.2 Data representation . . . 43

5.2 Restricted Monitor Normal Form . . . 44

5.2.1 Semantic constraints . . . 46

5.2.2 Additional semantic constraints . . . 57

5.2.3 Restricted Monitor Normal Form . . . 58

5.2.4 Topological constraints . . . 59

5.2.5 RMNF model validation . . . 63

5.2.6 Thread constraints . . . 64

5.2.7 Restricted Monitor Normal Form models . . . 64

5.2.8 Overall code generation approach . . . 68

6 Code generation 69

6.1 Considerations on threads and monitors . . . 70

6.2 Java rendering of LTS behaviour . . . 71

6.2.1 Monitor behaviour . . . 72

6.2.2 Thread behaviour . . . 76


7 Conclusions 81

7.1 Drawbacks of the current solutions . . . 81

7.2 Main results . . . 82

7.2.1 codegen produces better code . . . 82

7.2.2 codegen implements a novel validation approach . . . 82

7.2.3 codegen is highly extensible . . . 82

7.3 Future work and extensions . . . 83

A Metaprogramming double dispatch 85

A.1 Visitor . . . 85

A.2 Acyclic Visitor . . . 86


List of Figures

1.1 Model-based development approach . . . 5

2.1 LTSs relative to the FSP processes in listing 2.3 . . . 14

2.2 LTSs relative to the FSP processes in listing 2.4 . . . 14

2.3 LTS relative to the FSP processes in listing 2.5 . . . 14

2.4 In LTS models, information on logic conditions is lost . . . 17

2.5 LTSs relative to the FSP processes in listing 2.11 . . . 18

2.6 LTSs relative to the FSP processes in listing 2.12 . . . 19

2.7 Non-reachable states and transitions in parallel composition . . . 20

2.8 LTSs relative to the FSP processes in listing 2.13 . . . 21

2.9 Shared Lock specification . . . 22

2.10 Locking protocol . . . 22

2.11 Deterministic and non-deterministic states . . . 24

3.1 Process algebra transformation and execution . . . 25

3.2 Human interaction degree . . . 28

4.1 Classic information flow in software development . . . 31

4.2 Detailed information flow in software development . . . 32

4.3 Code generation strategy dispatch . . . 32

4.4 User requirements . . . 33

4.5 codegen integration with fspc . . . 34

4.6 Information classes . . . 35

4.7 Analyst and developer concepts . . . 35

4.8 Analyst information flow . . . 36

4.9 Developer information flow . . . 36

4.10 fspc access through generic decorator . . . 37

4.11 Code generator information flow . . . 37

4.12 Component diagram for CodeGenerator . . . 38

5.1 XAnalyst1 activity diagram . . . 39


5.5 Interactions are represented using a bold font . . . . 44

5.6 Local actions are represented using an italic font . . . 45

5.7 Determinism and non determinism in LTSs . . . 47

5.8 unfair triggers an infinite execution of local actions . . . 48

5.9 Nested monitor calls may cause deadlock . . . 49

5.10 Monitor dependency cycles may cause deadlock . . . 50

5.11 Reentrant calls may cause deadlock . . . 51

5.12 Java code generation based on text-to-text transformations . . . 56

5.13 Allowed components are marked with “OK” . . . 59

5.14 codegen internal model for candidate monitor LTSs . . . 65

5.15 Washing machine, LTS model . . . 66

5.16 codegen internal model for candidate thread LTSs . . . 66

5.17 Saturday night, LTS model . . . 67

5.18 codegen transformation approach . . . 68

6.1 XDeveloper2 activity diagram . . . 69

6.2 Object diagram inferred from listing 6.1 . . . 71

6.3 Coffee vending machine . . . 74

A.1 Visitor design pattern . . . 86

A.2 Acyclic Visitor design pattern . . . 87


Listings

2.1 FSP process definition . . . 13

2.2 FSP process definition . . . 13

2.3 Use of FSP process expression literals . . . 13

2.4 Use of the action prefix operator . . . 13

2.5 Use of the choice operator . . . 14

2.6 P, EVAL_P and EVAL_EVAL_P are equivalent . . . 14

2.7 P and EVAL_P are equivalent . . . 15

2.8 Process definition in terms of local processes . . . 15

2.9 Local processes indexing . . . 16

2.10 EVAL_P is equivalent to P . . . 16

2.11 Parallel composition of independent processes . . . 17

2.12 Parallel composition of interacting processes . . . 18

2.13 Parallel composition of two instances of the same process . . . 20

2.14 Process labeling syntactic sugar . . . 21

2.15 Critical section, FSP specification . . . 21

5.1 Data buffering, FSP specification . . . 40

5.2 Common FSP modeling pattern . . . 42

5.3 Simulation of data in FSP descriptions . . . 43

5.4 Bank account modeling attempt . . . 43

5.5 Also non-shared actions can be interactions . . . 46

5.6 When starting a conversation one can decide to be fair or unfair . . . . 48

5.7 A possible Java implementation for the LTS in figure 5.8 . . . 48

5.8 Active control interactions cannot be deduced from FSP specifications . . . . 51

5.9 CANDIDATE_1 actively interacts with CANDIDATE_2 . . . 51

5.10 CANDIDATE_1 exposes int1 and int2, CANDIDATE_2 exposes int2 . . . 52

5.11 Defective vending machine stochastically giving away coffee for free . . . 53

5.12 Hybrid guarded choice . . . 53

5.13 When defective, this coffee vending machine gives away coffee for free . . . . 54

5.14 Local process with complex choice syntax . . . 54


5.18 FSP description based on elementary local processes . . . 55

5.19 Interacting choice equation relative to listing 5.18 . . . 55

5.20 Setting equations to complete the FSP description 5.19 . . . 56

5.21 The choice between local actions win and loose is non-deterministic . . . 57

5.22 Stochastic internal evolution . . . 57

5.23 Adjacency-based FSP specification . . . 60

5.24 Washing machine, FSP description . . . 65

5.25 Saturday night, FSP description . . . 67

5.26 Saturday night and washing machine, FSP description . . . 67

6.1 Candidate monitors and threads . . . 70

6.2 Monitor lock . . . 72

6.3 Java rendering of LTS state-space . . . 73

6.4 Java rendering of LTS state . . . 73

6.5 A condition variable is associated to each interaction . . . 73

6.6 Local actions modeled through private methods . . . 73

6.7 Java implementation model for LTS interactions . . . 73

6.8 codegen implementation of the LTS 6.3 . . . 74

6.9 Monitor references declaration and initialization . . . 76

6.10 Choice methods . . . 77

6.11 Run method . . . 77

6.12 Simple case statements for externally deterministic states . . . 77

6.13 CHOICE POINTS management . . . 77

6.14 codegen implementation based on the LTS 5.17 . . . 78

A.1 Scalable Acyclic Visitor, implementation . . . 89


List of Tables

5.1 MNF structure for candidate monitor WASHING_MACHINE . . . 65

5.2 TNF structure for candidate thread SATURDAY_NIGHT . . . 67


Chapter 1

Introduction

This thesis is about the design and implementation of codegen, a code generator which translates LTS models describing concurrent programs into code stubs.

This chapter begins with a concise exposition of the basic concepts used throughout the thesis (section 1.1). Section 1.2 is a brief survey of the differences between concurrent and sequential programs, noting some of the advantages of the former, while section 1.3 highlights the main difficulties involved in concurrent programming. Some design and implementation approaches can mitigate such difficulties. The proposed approach is a combination of model-based design and generative programming, briefly described in sections 1.4 and 1.5 respectively. Finally, section 1.6 describes the thesis subject and its organization.

1.1 Definitions

Information processing or computation is described by programs and carried out by one or more execution flows.

Execution flows are sequences of actions performed by some entity.

Sequential programs describe a computation in terms of a single execution flow.

In concurrent programs, computation is described in terms of multiple execution flows having overlapping lifetimes. Concurrent programs can also have parallel execution, that is, multiple execution flows are allowed to overlap in time.

The actual entities embodying execution flows vary according to the context of computation (threads in a concurrent process, concurrent processes supported by an operating system, processes communicating over a network, people competing and/or cooperating, and more). In this work I will abstract from any particular implementation of execution flows, and I will use the terms execution flow and process interchangeably to indicate a single flow of actions constituting, together with others, a concurrent computation.


Processes interact when the execution of one process can affect or be affected by the execution of another process. Interactions can be related to competition (access to common resources) or cooperation (sharing a common goal).

Programming is the activity of defining programs. Computer programs are defined through source code, adhering to some programming language.

Concurrent programming is a form of modular programming: the overall computation is properly factorised into subcomputations (modularity) that may be executed concurrently (concurrency). Competition and cooperation among such subcomputations can be regulated through a variety of concurrency mechanisms, such as semaphores, monitors, synchronous or asynchronous communication primitives and more.
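One of these mechanisms, the monitor, is the code generation target studied later in this thesis. As a concrete illustration, here is a minimal Java monitor regulating a one-slot buffer shared between two execution flows; this sketch and all its names are illustrative, not part of the fspc/codegen toolchain.

```java
// Illustrative one-slot buffer monitor (hypothetical names, not from fspc/codegen).
public class SlotMonitor {
    private Integer slot = null; // shared resource guarded by the monitor lock

    // Blocks the producing flow while the slot is full.
    public synchronized void put(int value) throws InterruptedException {
        while (slot != null) wait();   // synchronization constraint: slot must be empty
        slot = value;
        notifyAll();                   // wake flows waiting for the slot to fill
    }

    // Blocks the consuming flow while the slot is empty.
    public synchronized int get() throws InterruptedException {
        while (slot == null) wait();   // synchronization constraint: slot must be full
        int value = slot;
        slot = null;
        notifyAll();                   // wake flows waiting for the slot to empty
        return value;
    }
}
```

The synchronized methods provide mutual exclusion, while the wait/notifyAll pairs encode the synchronization constraints (the slot must be empty to put, full to get).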

1.2 Differences between sequential programs and concurrent programs

In sequential programs:

• execution is deterministic in the sense that, if the input information is known, the evolution of each instantiation of the program can be predicted;

• usually, the main goal of the computation is to produce an output information, therefore termination is often desirable;

In concurrent programs:

• execution is, in general, non-deterministic in the sense that the evolution of the program instances cannot be predicted, since the net effect of each interaction between processes may vary according to conditions which are unpredictable ahead of execution time (process interleavings, relative execution speeds and more);

• non-termination is often desirable, since concurrent programs are usually used to describe continuous activities with no output information.

1.2.1 Advantages of concurrent programming

Concurrent programming has many advantages:

• concurrent programs can model real-world concurrent phenomena in a more intuitive way;

• concurrent programs can span multiple processors with considerable performance gains;

• while one of the execution flows is performing blocking I/O operations, the other execution flows can continue their computations, thus allowing a throughput increase;


• proper modularization into well-constructed subcomputations can lead to increased responsiveness.

Many applications take advantage of concurrent programming, from simple interactive games running on mobile devices to complex weather simulation software running on supercomputers.

1.3 Difficulties in concurrent programming

1.3.1 Difficulties in concurrent code development

Thinking in terms of concurrent execution flows can be difficult. The number of all possible execution traces can be huge. In fact, two non-interacting processes can interleave in all possible ways, generating a high number of traces. Each concurrency mechanism provides means to define synchronization constraints, that is, constraints relative to the order of events. Synchronization constraints shrink the set of all possible execution traces so that only the traces satisfying the constraints remain. Such mechanisms are non-trivial and hard to master.
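To make the blow-up concrete: two non-interacting processes with traces of length m and n can interleave in C(m+n, n) ways. The following computation is an illustration added here (not part of the thesis toolchain) to show how fast this number grows.

```java
import java.math.BigInteger;

public class InterleavingCount {
    // Number of interleavings of two independent traces of lengths m and n:
    // the binomial coefficient C(m+n, n), computed as the exact product
    // prod_{k=1..n} (m+k)/k (each intermediate division is exact).
    public static BigInteger interleavings(int m, int n) {
        BigInteger result = BigInteger.ONE;
        for (int k = 1; k <= n; k++) {
            result = result.multiply(BigInteger.valueOf(m + k))
                           .divide(BigInteger.valueOf(k));
        }
        return result;
    }

    public static void main(String[] args) {
        // Two processes of just 10 actions each already admit C(20, 10) interleavings.
        System.out.println(interleavings(10, 10)); // prints 184756
    }
}
```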

1.3.2 A new kind of errors

All the errors typically showing up in sequential programs are inherited by concurrent programs, and concurrency makes them even harder to spot. Moreover, improper use of concurrency mechanisms can lead to new, subtle errors.

Deadlock

A bad design of synchronization constraints can lead to situations where all the processes wait for an event to happen and none of them can continue its execution. Since all the processes are waiting, none of them can act in such a way that one of the expected events happens, resulting in a (usually undesired) halt of all activities.
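The circular wait at the heart of deadlock can be sketched in Java as follows. This illustrative example (not from the thesis) uses tryLock so that the program terminates and reports the failed acquisitions; with blocking lock acquisitions, both flows would wait on each other forever.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.locks.ReentrantLock;

public class CircularWaitDemo {
    // Each flow holds one lock and requests the other: a circular wait.
    // Returns {otherGotA, mainGotB}; both are deterministically false here.
    public static boolean[] run() throws InterruptedException {
        ReentrantLock lockA = new ReentrantLock();
        ReentrantLock lockB = new ReentrantLock();
        CountDownLatch bHeld = new CountDownLatch(1);
        CountDownLatch release = new CountDownLatch(1);
        AtomicBoolean otherGotA = new AtomicBoolean(true);

        lockA.lock(); // this flow holds A and will request B
        Thread other = new Thread(() -> {
            lockB.lock(); // the other flow holds B and will request A
            bHeld.countDown();
            otherGotA.set(lockA.tryLock()); // fails: A is held by the first flow
            try {
                release.await();
            } catch (InterruptedException ignored) {
            } finally {
                lockB.unlock();
            }
        });
        other.start();
        bHeld.await();
        boolean mainGotB = lockB.tryLock(); // fails: B is held by the other flow
        if (mainGotB) lockB.unlock();       // defensive; never taken here
        release.countDown();
        other.join();
        lockA.unlock();
        return new boolean[] { otherGotA.get(), mainGotB };
    }
}
```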

Starvation

A bad design of synchronization constraints could fail to guarantee that, in all possible execution traces, some expected event eventually happens. Without such guarantee, a process could wait for that event to happen for an unbounded amount of time.

Priority inversion

Different priorities can be defined for each process. However, in some subtle circumstances, an incorrect usage of concurrency mechanisms can lead to a de facto inversion of the processes' priorities.


Interference

Synchronization mechanisms can be used to regulate the access to shared resources. If such regulation is not properly designed, the execution of non-atomic operations on the shared resource can lead to inconsistencies. Such inconsistencies can influence the execution of other processes, leading to unexpected states and unintended execution traces.
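A classic instance of interference is the non-atomic increment of a shared counter. The sketch below (an illustration added here, not from the thesis) contrasts an unregulated counter with one whose access is regulated by mutual exclusion.

```java
public class InterferenceDemo {
    static int unsafeCounter = 0;            // unprotected shared resource
    static int safeCounter = 0;              // access regulated by mutual exclusion
    static final Object lock = new Object();

    // Two flows increment both counters; only safeCounter is protected.
    public static void run(int increments) throws InterruptedException {
        Runnable work = () -> {
            for (int i = 0; i < increments; i++) {
                unsafeCounter++;             // non-atomic: read, add, write
                synchronized (lock) {
                    safeCounter++;           // atomic with respect to the other flow
                }
            }
        };
        Thread t1 = new Thread(work);
        Thread t2 = new Thread(work);
        t1.start(); t2.start();
        t1.join(); t2.join();
    }
}
```

After both flows terminate, safeCounter always equals 2 × increments, while unsafeCounter may be smaller: interleaved read-add-write sequences silently lose updates.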

1.3.3 Proving correctness

A property, in the context of programs, is a predicate which must hold for each possible execution [1].

Safety and liveness are two categories of properties. Safety properties state that no erroneous state can be reached in any possible execution of the program, while liveness properties state that, eventually, all possible executions of a program reach a correct state.

Partial correctness is a safety property stating that, if a program (all its possible executions) terminates, then a correct state is reached. Total correctness is a liveness property stating that each possible execution of a program terminates in a correct state or, equivalently, the program is partially correct and always terminates.

Partial and total correctness are the most important safety and liveness properties for sequential and terminating concurrent processes. However, while verifying such properties on sequential programs is quite simple, it is not so for concurrent programs. The classic debugging techniques are ineffective in the case of concurrent processes, for which the number of possible execution traces is very high and errors are not always reproducible.

Concurrent programs, moreover, must satisfy a series of additional properties: mutual exclusion on critical sections and absence of deadlock, which are safety properties, and absence of starvation, which is a liveness property. Simple code inspection and classic debugging techniques are totally ineffective in those cases. The presence of errors in concurrent programs can have utterly undesirable consequences.

1.4 Approaches based on models

The later a defect is discovered, the worse its effects. If a defect is discovered during the preliminary analysis and design phases, it can be immediately eradicated with few or no consequences. If a defect is found during the implementation phase, its removal could involve a substantial re-design and refactoring of the whole source code. The same applies when errors are spotted in the testing phase. However, if defects manifest when the program has already been deployed, the consequences can be serious.

Once a concurrent program is fully implemented, proving that it satisfies all safety and liveness properties can be very hard. The main purpose of model-based approaches is to discover as many defects as possible in the design phase by producing, validating, verifying


and implementing models, which are simplified representations of a system. Concurrent programs can be modelled, and the models be validated and verified (refer to figure 1.1).


Figure 1.1: Model-based development approach

Model-based approaches have many advantages:

• reasoning about models leads to a better understanding of the modeled system;

• since abstraction from implementation details is required when modeling a concurrent system, the model actually achieves complexity reduction;

• the validation phase (e.g. through simulation) makes it possible to gain confidence in the model and its correspondence to the modeled system;

• safety and liveness properties can be mechanically verified on the models.

If the model satisfies all safety and liveness properties, it can be implemented.


1.5 Generative programming

The model implementation should preserve all behavioural properties of the models. If the implementation does not reflect the intended and verified behaviour, all modeling, validation and verification efforts are in vain.

1.5.1 Drawbacks of manual implementation

Currently, the most common implementation approach is manual implementation. Such an approach has some drawbacks [2]:

• direct implementation using general-purpose languages hides design information, that is, general-purpose languages are not highly intentional, where intentionality is the ability of a language to represent domain concepts in an intuitive way, without any loss of information due to obscure language idioms¹;

• it is difficult, since, in general, no well-defined procedure exists to translate designs into implementations;

• being difficult and non-highly intentional, it is also error-prone;

• it is time-consuming, also because, being error-prone, a large amount of time must be dedicated to debugging.

Generative programming is a different, emerging implementation approach to overcome some of the limitations of manual implementation.

1.5.2 About generative programming

In generative programming approaches, the programmer describes the desired system in abstract terms, and the system is produced by a code generator, that is, a program describing a computation whose result is a program.

Generative Programming is a comprehensive software development paradigm aiming to achieve [2]:

• high intentionality: a program is mechanically generated by a code generator following some abstract description, thus relieving the developer from direct management of the implementation details and allowing her/him to focus on domain-specific issues;

• correctness: the number of defects (from banal mistakes to subtle errors) is drastically reduced and, if the code generator is formally proven to be correct, defects can be totally absent;

¹As an example, consider the semaphore mechanism. Simple concepts in the concurrency domain, such as barriers, synchronization on events or critical sections, are implemented through a series of rather obscure techniques, which hide the simple design information.


• high productivity: implementation times can be reduced by several orders of magnitude, while the small number of defects can noticeably speed up the testing phase;

• reusability: a code generator can be reused countless times.

Generative programming is based on three fundamental steps [3]:

I. design of generic implementation components, which are simple, composable source code fragments and templates that can be used as building blocks by code generators;

II. definition of the configuration knowledge, namely:

• constraints on the input specification;

• mappings and composition rules to produce the target program;

III. implementation of a code generator.

Code generators can be implemented exploiting a variety of programming techniques and paradigms. This work is about a transformation-based code generator.

In the context of code generation, transformations can be classified as follows [4]:

• model-to-model transformations, that is, transformations among models. These can be either horizontal (the output model is of the same type) or vertical (the output model is of a different type);

• model-to-text transformations, that is, transformations from models to text (such as source code). Notice that this is a particular case of vertical model-to-model transformation.
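As a toy illustration of a vertical model-to-text transformation (not codegen's actual internals; the model representation and all names are hypothetical), the following sketch renders the transition relation of a deterministic model, one transition per source state, as the text of a Java switch statement.

```java
import java.util.List;

public class ModelToText {
    // A minimal model element: (source state, action, target state).
    public record Transition(int from, String action, int to) {}

    // Toy model-to-text transformation: emits a switch-statement skeleton
    // from the transition relation of a deterministic model.
    public static String render(List<Transition> model) {
        StringBuilder out = new StringBuilder("switch (state) {\n");
        for (Transition t : model) {
            out.append("case ").append(t.from())
               .append(": ").append(t.action())
               .append("(); state = ").append(t.to()).append("; break;\n");
        }
        return out.append("}\n").toString();
    }
}
```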

1.6 About this thesis

1.6.1 Subject

This thesis proposes a model-based approach to the design of concurrent programs, while generative programming is the elected implementation paradigm.

Concurrent programs are modeled through FSP (Finite State Processes) descriptions. Such descriptions are translated into LTS (Labelled Transition System) models. LTS models are validated through simulation, safety and liveness properties are then verified. Finally, the concurrent program is implemented by means of code generation.

The proposed approach is embodied by the fspc analysis tool and the codegen code generator.


1.6.2 fspc overview

fspc (see [5, 6]) is an FSP compiler and LTS analysis tool written in C++ offering:

• LTS model generation from FSP specifications;

• model validation through simulation;

• model minimization, achieving complexity reduction;

• model verification through model checking.

fspc is composed of three main modules: the FSP compiler, the LTS analyzer and the user interface. The compiler has to deal with an ambiguous LR(1) grammar. Ambiguities are resolved through preprocessing, that is, text-to-text transformations targeting a slightly modified, unambiguous version of the language. The unambiguous description is then parsed, producing a parse tree. LTS models are produced by visiting the parse tree. The parsing module is generated using well-known technologies, namely Flex for the scanner and GNU Bison for the parser.

The LTS analyzer fulfills several analysis tasks:

• deadlock analysis;

• terminal sets discovery;

• test for determinism;

• property checking;

• progress checking;

• LTS simulation.

fspc provides an interactive shell as well as a script-based interface. codegen is a code generation extension for fspc.

1.6.3 Organization

The remainder of the thesis is organized into six chapters and one appendix.

Chapter 2: FSP specifications and LTS models The first part of this chapter is a brief survey of the FSP process algebra and LTS models. The second part provides some definitions needed to prove the theorems exposed in chapter 5.

Chapter 3: Process algebraic code generation: state of the art This chapter is an overview of the main results in the field of code generation based on process algebraic descriptions. A comparison with codegen is made to highlight its innovative features.


Chapter 4: fspc codegen The main codegen design choices are discussed in this chapter. Correctness, safety and easy extensibility are the main project directives.

Chapter 5: Model validation and transformation In the first part of this chapter a brief survey on LTS expressiveness limitations is presented.

The second part presents a thorough analysis of code generation-specific model verification constraints. The main results in this area are integrated with original insights leading to the definition of the Restricted Monitor Normal Form, namely a set of semantic constraints on FSP specifications guaranteeing code generation feasibility. Equivalent topological constraints on LTS models are then deduced to ease model verification. A complete proof of the equivalence of semantic and topological constraints is also provided.

The chapter concludes with the definition of new internal models with the purpose of easing code generation. Appropriate model-to-model transformations are also defined. The dissertation is limited to the Java monitor code generation case study.

Chapter 6: Code generation The code generation techniques and model-to-text transformations implemented in codegen are discussed in this chapter. The discussion is limited to the Java monitor case study.

Chapter 7: Conclusions Conclusions and future work are exposed in this chapter.

Appendix A: Metaprogramming double dispatch This appendix presents a double dispatch library based on the Acyclic Visitor pattern, implemented through template metaprogramming techniques.


Chapter 2

FSP specifications and LTS models

This chapter presents the main ideas behind the FSP language and LTS models. Refer to [7] for a more comprehensive treatment.

2.1 Specifications and models

Process calculi are formal approaches for modelling and reasoning about concurrent systems. Such formalisms provide a way to describe interactions and communications between processes [8]. Formal reasoning is supported by algebraic laws allowing manipulation and analysis of the process descriptions. Finite State Processes (FSP) is a process calculus designed to be easily machine readable. Though providing a large framework of algebraic properties, it is primarily used, in this work, as a succinct way to describe Labelled Transition Systems (LTS).

Definition 1 (Labelled Transition System). A Labelled Transition System (LTS) is a quadruple (σ, S0, α, R) where:

• σ is a set of states;

• S0 is the initial state;

• α is a finite, non-empty set of actions;

• R ⊆ σ × α × σ is a transition relation.

Let the following notations be defined:

• σ(λ) denotes the σ set of LTS λ;

• α(λ) denotes the α set of LTS λ;

• P −a→_λ Q denotes (P, a, Q) ∈ R.

LTSs are usually represented as labelled graphs for convenience.
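Definition 1 translates directly into a data structure. The following Java sketch (an illustration added here, not fspc's internal representation) encodes the quadruple and instantiates it for the LTS of the process expression (a -> END).

```java
import java.util.Set;

public class LtsExample {
    // A transition (P, a, Q) of the relation R.
    public record Transition(String from, String action, String to) {}

    // Direct rendering of Definition 1: an LTS is (sigma, S0, alpha, R).
    public record Lts(Set<String> sigma, String s0, Set<String> alpha,
                      Set<Transition> r) {}

    public static void main(String[] args) {
        // LTS of (a -> END): two states, one action, one transition.
        Lts lts = new Lts(
                Set.of("PROCESS", "END"),
                "PROCESS",
                Set.of("a"),
                Set.of(new Transition("PROCESS", "a", "END")));
        System.out.println(lts.r().contains(
                new Transition("PROCESS", "a", "END"))); // prints true
    }
}
```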


2.2 FSP expressions

The admitted expressions are:

• integer literal expressions (e.g. 1, -1, 12);

• arithmetic expressions: C-like integer arithmetic expressions are defined in FSP (e.g. 1 + 1, (2*2)%3, (2-(3*5)));

• boolean expressions: C-like relational operators are defined in FSP, adopting the C convention for assigning a logical value to an arithmetic expression:

i ≠ 0 → i = true
i = 0 → i = false

where i is an integer expression;

• range expressions: expressed in the form [min..MAX] and evaluating to the range of integer values between min and MAX, where min and MAX are arithmetic expressions (e.g. [2..3*2] evaluates to [2, 3, 4, 5, 6]);

• action label literal expressions: represented through lowercase strings;

• action label expressions: following the production rule

⟨actionLabel⟩ ::= ⟨actionLabelLiteral⟩ |
  '[' ⟨arithmeticExpression⟩ ']' |
  ⟨actionLabel⟩ '[' ⟨arithmeticExpression⟩ ']' |
  ⟨actionLabel⟩ '.' ⟨actionLabel⟩

(e.g. a, [1], a[a], a[2-1], a.a, a[a.[1]] and so on);

• set expressions: expressed in the form {labels}, where labels is a comma-separated list of action labels; a set expression evaluates to a set of action labels (e.g. {a[0..5-3].[2], a, b[b]} evaluates to a[0].[2], a[1].[2], a[2].[2], a, b[b]);

• set generator expressions: sets can be conveniently generated through the following rules:

⟨setGenerator⟩ ::= ⟨actionLabel⟩ ⟨rangeExpression⟩
(e.g. [1].a[0..1] produces the set {[1].a[0], [1].a[1]});

⟨setGenerator⟩ ::= ⟨actionLabel⟩ '[' ⟨setExpression⟩ ']'
(e.g. a[{b[0..2]}] produces the set {a.b[0], a.b[1]});

• uppercase strings, representing identifiers.


2.3 FSP definitions

2.3.1 Const, set and range definitions

Symbolic names can be given to arithmetic, set and range expressions (see listing 2.1).

const C = (2+3)*5
range R = [0..2*C]
set S = {a[R]}

Listing 2.1: FSP process definition

2.3.2 Process definitions

An FSP description is a collection of process definitions (see listing 2.2).

PROCESS_IDENTIFIER = <<process expression>>.

Listing 2.2: FSP process definition

A process identifier is also a valid process expression. Process expressions are briefly discussed below.

Process literal expressions

The following terminal process expressions are defined:

• END: this process is used to model correct termination;

• STOP: this process is used to model situations in which no action can be undertaken (deadlock);

• ERROR: this process is used to model erroneous states.

Listing 2.3 shows some valid FSP process definitions using these literal process expressions.

PROCESS_A = END.
PROCESS_B = STOP.
PROCESS_C = ERROR.

Listing 2.3: Use of FSP process expression literals

The related LTSs are represented in figure 2.1.

Action prefix

The action prefix operator (->) extends a process expression by adding a new state and defining an action leading from the new state to the extended process expression (see listing 2.4 and figure 2.2). Let the left operand (a label) be called the prefix action, and the right operand the action consequence.

PROCESS = (a->END).


Figure 2.1: LTSs relative to the FSP processes in listing 2.3

Figure 2.2: LTS relative to the FSP process in listing 2.4

Choice

The choice mechanism is an n-ary operator which can be used to achieve multiple traces. Choices require action prefix expressions as parameters. A choice without parameters resolves to the STOP process expression. If action prefixes are specified, one of the prefix actions is undertaken, then the process continues behaving as specified by the related action consequence. The syntax and the related model are shown in listing 2.5 and figure 2.3.

PROCESS = (a->END | b->c->ERROR | d->PROCESS).

Listing 2.5: Use of the choice operator

Figure 2.3: LTS relative to the FSP process in listing 2.5
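The LTS of listing 2.5 can be conveniently pictured as an adjacency structure. The following Python sketch is only an illustration (the intermediate state name S1 and the dictionary encoding are choices made for this example, not fspc internals): each state maps to its outgoing (action, successor) pairs, so a choice is simply a state with several outgoing transitions.

```python
# Minimal Python encoding of the LTS in figure 2.3: states map to
# lists of (action, successor-state) pairs.
LTS = {
    "PROCESS": [("a", "END"), ("b", "S1"), ("d", "PROCESS")],
    "S1":      [("c", "ERROR")],   # intermediate state introduced by b->c
    "END":     [],
    "ERROR":   [],
}

def eligible(state):
    """Actions that can be undertaken from a given state."""
    return [action for action, _ in LTS[state]]

print(eligible("PROCESS"))  # ['a', 'b', 'd']
```

The same representation is reused in the sketches that follow for parallel composition, determinism and bisimulation.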

A convenient mechanism to refer to a (possibly large) number of alternative action prefixes is to use set generator expressions as shown in listing 2.6.

P = (a[0..2]->END).

EVAL_P = ({a[0], a[1], a[2]}->END).

EVAL_EVAL_P = (a[0]->END | a[1]->END | a[2]->END).


Variables can be defined to iterate over a range or a set and can be used in the choice scope (see listing 2.7).

P = (c[i:1..3]->d[i]->END).

EVAL_P = (c[1]->d[1]->END | c[2]->d[2]->END | c[3]->d[3]->END).

Listing 2.7: P and EVAL_P are equivalent

Local process

A process can be factorized, for convenience, into a collection of sub-processes called local processes. Local process definitions are part of a process definition. Each local process has its own identifier, whose scope is the whole process definition. An example is shown in listing 2.8.

P = (a->b->P).

/* P can be seen as undertaking 'a' and then behaving as LOCAL_PROCESS,
 * which undertakes 'b' and then behaves as P */
P_EQUIVALENT = (a->LOCAL_PROCESS),
LOCAL_PROCESS = (b->P).

Q = (a->b->Q | a->c->d->STOP).

/* In Q, 'a' is the starting action common to all choices.
 * In these cases, the common prefix can be put in evidence; the
 * choice is consequently limited to the differing terminating
 * action sequences */
Q_EQUIVALENT = (a->(b->Q | c->d->STOP)).

/* A local process can identify a choice. So, Q_EQUIVALENT can be
 * written in terms of local processes as follows. */
Q_EQUIVALENT_EQUIVALENT = (a->LOCAL_PROCESS_0),
LOCAL_PROCESS_0 = (b->Q | c->LOCAL_PROCESS_1),
LOCAL_PROCESS_1 = (d->STOP).

Listing 2.8: Process definition in terms of local processes

A convenient way to refer to a large number of local processes is to use indexing and variables. The scope of such variables is limited to the local process (see listing 2.9).


const A = 1000000
const B = A*A*A*A
const C = B*B*B*B

P = LOCAL_PROCESS[0],
LOCAL_PROCESS[i:0..C] = (out[i]->LOCAL_PROCESS[(i+1)%C]).

P_EQUIVALENT = LOCAL_PROCESS[0],
LOCAL_PROCESS[0] = (out[0]->LOCAL_PROCESS[1]),
LOCAL_PROCESS[1] = (out[1]->LOCAL_PROCESS[2]),
...
LOCAL_PROCESS[C] = (out[C]->LOCAL_PROCESS[0]).

Listing 2.9: Local processes indexing

Guarded actions

For each choice alternative, an action guard can be defined following the syntax presented in listing 2.10 (process P).

Notice that, in the related LTS shown in figure 2.4, all information about guard conditions is lost. Guarded actions, in fact, serve merely as a shorthand to concisely describe LTS structures. The alternative, shown in the process EVAL_P in listing 2.10, is to define each state and transition explicitly, respectively through local processes and one-action action prefixes.

P = LINE[0],
LINE[x:0..5] = (when x < 3 firstHalf->LINE[(x+1)%6]
               |when x >= 3 secondHalf->LINE[(x+1)%6]).

EVAL_P = LINE[0],
LINE[0] = (firstHalf->LINE[1]),
LINE[1] = (firstHalf->LINE[2]),
LINE[2] = (firstHalf->LINE[3]),
LINE[3] = (secondHalf->LINE[4]),
LINE[4] = (secondHalf->LINE[5]),
LINE[5] = (secondHalf->LINE[0]).
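A small Python sketch can make the shorthand nature of guards concrete: unrolling the index range of listing 2.10 while evaluating the two guards mechanically reproduces the explicit transitions of EVAL_P. The encoding below is illustrative only.

```python
# Sketch of guard elimination: the guards of listing 2.10 disappear
# once the index range is unrolled, leaving one plain transition per
# state -- exactly the EVAL_P formulation.
def unroll():
    lts = {}
    for x in range(6):
        # the two guard conditions of LINE[x:0..5]
        action = "firstHalf" if x < 3 else "secondHalf"
        lts[f"LINE[{x}]"] = [(action, f"LINE[{(x + 1) % 6}]")]
    return lts

lts = unroll()
print(lts["LINE[2]"])  # [('firstHalf', 'LINE[3]')]
print(lts["LINE[5]"])  # [('secondHalf', 'LINE[0]')]
```

As the text notes, the resulting LTS contains no trace of the guard predicates themselves: only the transitions they allowed survive.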

Figure 2.4: In LTS models, information on logic conditions is lost

2.4 Parallel composition

2.4.1 Parallel composition of non-interacting processes

The parallel composition operator (||) is an n-ary operator producing an LTS modeling all the possible execution traces. When defining a process using such operator, the process identifier must be preceded by ||. Consider the simple listing 2.11, showing the basic || syntax, and the related LTS models shown in figure 2.5.

P = (a->b->END).
Q = (x->y->END).
||R = (P || Q).

Listing 2.11: Parallel composition of independent processes

The parallel composition process R models all possible interleavings between processes P and Q. Since P and Q are independent, in the sense of non-interacting, the execution of each action by each process is not constrained by any synchronization, so the actions of P and Q can freely interleave in all possible ways.

2.4.2 Parallel composition of interacting processes

FSP models synchronization through shared actions.

Figure 2.5: LTSs relative to the FSP processes in listing 2.11

Definition 2 (Shared action). Let λ and µ be LTSs. Then a is shared between λ and µ if

a ∈ α(λ) ∩ α(µ)

If a is a shared action between LTSs λ and µ, it can be undertaken only if, λ being in state Li and µ being in state Mj, it is true that a ∈ α(Li) ∩ α(Mj). Such mechanism reflects on the parallel composition operator: in fact, the composing processes cannot freely interleave. Consider listing 2.12.

P = (a->b->END).
Q = (x->b->END).
||R = (P || Q).

Listing 2.12: Parallel composition of interacting processes

The processes P and Q share the action b which, in order to be executed, requires both P and Q to be in a state where b is an eligible (active) action. Figure 2.6 clearly shows that such synchronization constrains the parallel composition process, restricting all the possible interleavings to those satisfying synchronization on b.

2.4.3 Formal definition of the parallel composition operator

The considerations reported in 2.4.1 and 2.4.2 lead to a formal definition of the parallel composition operation.

Figure 2.6: LTSs relative to the FSP processes in listing 2.12

Definition 3 (Parallel composition). Let λ = (σ(λ), S0λ, α(λ), Rλ) and µ = (σ(µ), S0µ, α(µ), Rµ) be LTSs. Then the parallel composition operator is defined as follows:

λ||µ = (σ(λ) × σ(µ), (S0λ, S0µ), α(λ) ∪ α(µ), Rλ||µ)

where the relation Rλ||µ is defined as follows: ∀a ∈ α(λ) ∪ α(µ),

  if L1 −a→λ L2 ∧ a ∉ α(µ), then ∀M ∈ σ(µ): (L1, M) −a→λ||µ (L2, M)

  if M1 −a→µ M2 ∧ a ∉ α(λ), then ∀L ∈ σ(λ): (L, M1) −a→λ||µ (L, M2)

  if L1 −a→λ L2 ∧ M1 −a→µ M2, then (L1, M1) −a→λ||µ (L2, M2)

Remark 1 (Transition relation and non-reachable transitions and states). Notice that the definition of the transition relation Rλ||µ of a parallel composite process also includes transitions which may be non-reachable. Consider the LTSs in figure 2.7. The P||Q LTS has some non-reachable states and transitions. The state-space of P||Q can be safely restricted to the reachable states, and unreachable transitions can be discarded.
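Definition 3, together with the reachability restriction suggested in remark 1, translates directly into a worklist algorithm: breadth-first exploration from the initial state pair builds only the reachable part of the product. The Python sketch below is an illustrative rendition (not the fspc implementation), applied to the two processes of listing 2.12.

```python
# Reachable parallel composition: the three cases of definition 3,
# explored by BFS from the initial state pair (remark 1).
from collections import deque

def alphabet(lts):
    return {a for trans in lts.values() for a, _ in trans}

def compose(lam, mu, l0, m0):
    al, am = alphabet(lam), alphabet(mu)
    result, frontier = {}, deque([(l0, m0)])
    while frontier:
        state = frontier.popleft()
        if state in result:
            continue
        L, M = state
        moves = []
        for a, L2 in lam[L]:
            if a not in am:                 # action private to lambda
                moves.append((a, (L2, M)))
        for a, M2 in mu[M]:
            if a not in al:                 # action private to mu
                moves.append((a, (L, M2)))
        for a, L2 in lam[L]:                # shared action: both must move
            for b, M2 in mu[M]:
                if a == b:
                    moves.append((a, (L2, M2)))
        result[state] = moves
        frontier.extend(s for _, s in moves)
    return result

# Listing 2.12: P = (a->b->END), Q = (x->b->END); b is shared.
P = {"P0": [("a", "P1")], "P1": [("b", "PEND")], "PEND": []}
Q = {"Q0": [("x", "Q1")], "Q1": [("b", "QEND")], "QEND": []}
R = compose(P, Q, "P0", "Q0")
print(len(R))                   # 5 reachable state pairs, as in figure 2.6
print(R[("P1", "Q1")])          # shared 'b' fires only when both are ready
```

The five reachable state pairs match figure 2.6; the unreachable product states discussed in remark 1 are never generated.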

Remark 2 (Algebraic properties of the parallel composition operator). Some algebraic properties of the parallel composition operator are now stated:

commutativity

λ||µ = µ||λ


Figure 2.7: Non-reachable states and transitions in parallel composition

associativity

(λ||µ)||ϑ = λ||(µ||ϑ) = λ||µ||ϑ

idempotency

λ||λ = λ

2.5 Process labeling

The process labeling operator (:) is a binary operator requiring, as operands, a label and a process expression. Let P be a process expression/LTS. Let a be a label. Then

α(a:P) = {a.i, i ∈ α(P)}

that is, the alphabet of the new process expression is composed of all the labels of the original process expression prefixed by 'a.'. The R relation is consequently updated.
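The relabeling performed by the labeling operator is purely mechanical, as the following Python sketch shows: every action is prefixed while states and transitions are otherwise preserved. This is an illustrative rendition of α(a:P) and the updated R relation, not fspc code.

```python
# Process labeling as a pure relabeling of the LTS: every action in
# the alphabet is prefixed; states and transition structure are kept.
def label(prefix, lts):
    return {state: [(f"{prefix}.{a}", dest) for a, dest in trans]
            for state, trans in lts.items()}

P = {"P0": [("a", "END")], "END": []}
print(label("first", P))   # {'P0': [('first.a', 'END')], 'END': []}
```

Applying `label` twice with different prefixes yields the two independent instances composed in listing 2.13.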

Labeling is used to produce new and independent instances of a process. Consider listing 2.13 and figure 2.8. Since Q is obtained by composing two different instances of P, two different traces lead to END.

P = (a->END).

||Q = (first:P || second:P).

Figure 2.8: LTSs relative to the FSP processes in listing 2.13

Some syntactic sugar is added for convenience (see listing 2.14).

P = (a->END).

||Q = (forall[i:0..2] q[i]:P).

||Q_EQUIVALENT = ({q[i:0..2]}:P).

||Q_EQUIVALENT_EQUIVALENT = ({q[0], q[1], q[2]}:P).

||Q_EQUIVALENT_EQUIVALENT_EQUIVALENT = (q[0]:P || q[1]:P || q[2]:P).

Listing 2.14: Process labeling syntactic sugar

2.6 Process sharing

The process sharing operator (::) is a binary operator requiring, as operands, a set and a process expression. Let P be a process expression/LTS. Let s be a set. Then

α(s::P) = ⋃_{a ∈ α(P)} {i.a, i ∈ s}

Sj −a→P Sk ⟺ ∀i ∈ s: Sj −i.a→s::P Sk

that is, for each transition Sj −a→P Sk in P, s::P has a set of transitions Sj −i.a→s::P Sk for i iterating over s, and the alphabet is adjusted accordingly. Consider, as an example, listing 2.15.

LOCK = (lock->unlock->LOCK).

PROCESS = (lock->critical->unlock->END).

||SHARING = ({p1, p2}:PROCESS || {p1, p2}::LOCK).

Figure 2.9: Shared Lock specification

Figure 2.10: Locking protocol

The process {p1, p2}::LOCK (figure 2.9) offers a dedicated copy of its labels to each PROCESS instance. Each copy is shared exclusively with one instance of PROCESS. State transitions in {p1, p2}::LOCK can happen only when p1:PROCESS or p2:PROCESS synchronizes with it. Figure 2.10 shows that each critical section is executed immediately after the process has acquired the lock and immediately before the lock is released. Once one of the processes has acquired the lock, it is the only process having the right and the possibility to execute its critical section.
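The behaviour specified by listing 2.15 corresponds to the familiar mutual-exclusion idiom of thread libraries. As a hedged illustration only, the sketch below uses Python's `threading.Lock` to play the role of the shared LOCK process, with each thread acting as one labeled PROCESS copy:

```python
# The locking protocol of listing 2.15 rendered with Python threads:
# the shared Lock object stands in for the LOCK process, each thread
# for one labeled copy of PROCESS.
import threading

lock = threading.Lock()
trace = []                             # records the actions actually taken

def process(name):
    with lock:                         # lock -> critical -> unlock
        trace.append(f"{name}.critical")

threads = [threading.Thread(target=process, args=(p,)) for p in ("p1", "p2")]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(trace)   # both critical sections ran, one at a time
```

Exactly as in figure 2.10, each `name.critical` entry can only appear between the corresponding acquisition and release of the shared lock.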


2.7 Bisimulation

Bisimulation is a semantic equivalence relation (reflexive, symmetric, transitive) over LTSs [9].

Definition 4 (Bisimulation). [10] Let σ0 and σ1 be sets of states. Let B ⊆ σ0 × σ1 be a relation over the states of σ0 and σ1. Then B is a bisimulation if the following predicates hold:

∀(S, T) ∈ B:

  S −a→ S′ ⇒ ∃T′ : T −a→ T′ ∧ (S′, T′) ∈ B

  T −a→ T′ ⇒ ∃S′ : S −a→ S′ ∧ (S′, T′) ∈ B

That is, if S and T are bisimilar (notation: S ∼ T) then, if it is possible to directly reach S′ from S through a transition a, there must exist a state T′ directly reachable from T through the same transition, and S′ and T′ must also be bisimilar; moreover, the dual condition (exchanging Ss and Ts) must hold too.

Definition 5 (Bisimilar LTSs). [10] Let λ = (σ(λ), L0, α(λ), Rλ) and µ = (σ(µ), M0, α(µ), Rµ) be LTSs. Then λ and µ are bisimilar if

L0 ∼ M0

The ∼ notation can be used with LTSs, so if the previous condition holds, then λ ∼ µ. Bisimilarity, in this definition, is also called strong bisimilarity.
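Definition 4 suggests a simple, if inefficient, decision procedure: start from the full relation and repeatedly discard pairs that violate the two transfer conditions, until a greatest fixpoint is reached. The Python sketch below is illustrative only (real tools use partition refinement for efficiency); the negative example is the classic pair of trace-equivalent but non-bisimilar processes.

```python
# Naive greatest-fixpoint check of definition 4: discard pairs that
# violate either transfer condition until nothing changes.
def bisimilar(lam, mu, l0, m0):
    B = {(s, t) for s in lam for t in mu}
    changed = True
    while changed:
        changed = False
        for (S, T) in list(B):
            fwd = all(any(a == b and (S2, T2) in B for b, T2 in mu[T])
                      for a, S2 in lam[S])
            bwd = all(any(a == b and (S2, T2) in B for a, S2 in lam[S])
                      for b, T2 in mu[T])
            if not (fwd and bwd):
                B.discard((S, T))
                changed = True
    return (l0, m0) in B

# P = (a->b->P) against its local-process formulation: bisimilar.
P  = {"P0": [("a", "P1")], "P1": [("b", "P0")]}
PE = {"Q0": [("a", "Q1")], "Q1": [("b", "Q0")]}
print(bisimilar(P, PE, "P0", "Q0"))   # True

# a->(b->STOP | c->STOP) versus (a->b->STOP | a->c->STOP):
# trace-equivalent, yet not bisimilar.
A = {"A0": [("a", "A1")], "A1": [("b", "A2"), ("c", "A3")],
     "A2": [], "A3": []}
B2 = {"B0": [("a", "B1"), ("a", "B2")], "B1": [("b", "B3")],
      "B2": [("c", "B4")], "B3": [], "B4": []}
print(bisimilar(A, B2, "A0", "B0"))   # False
```

The second example anticipates remark 3: trace equivalence alone does not guarantee bisimilarity.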

Definition 6 (Trace). Let λ = (σ(λ), L0, α(λ), Rλ) be an LTS. Let D be a discrete interval starting from 0, with |D| ≥ 2. A trace tk(S), k ∈ D, starting from state S ∈ σ(λ) is a finite (D = {0, ..., n−1}, n ∈ ℕ) or infinite (D = ℕ) ordered collection of labels such that, being S0 = S, the following condition holds:

∀i ∈ D, i ≠ 0, i < |D| − 1: ∃Si ∈ σ(λ), ∃Si+1 ∈ σ(λ): Si −ti→λ Si+1

Moreover, if D is finite and |D| = n, then Sn−1 (the last state reached following the trace) must be a state with no outgoing transitions (that is, tk must be a completed trace). So, a trace starting from S is a path of labels obtained collecting each label encountered undertaking legal transitions, until a state with no outgoing transitions is found.

The set Tr(S) is the set of all possible traces starting from S. The set Tr(λ) = Tr(S0) is the set of all possible traces for the LTS λ.

Definition 7 (Trace equivalence). Two LTSs λ and µ are trace-equivalent if

Tr(λ) = Tr(µ)

Definition 8 (Deterministic LTS). An LTS λ is deterministic if

∀(S, a), S ∈ σ(λ), a ∈ α(λ): |{T : S −a→λ T}| ∈ {0, 1}

That is, if transition a is eligible in state S, then there is at most one possible destination state T. Let the notation Det(λ) denote that the LTS λ is deterministic. Figure 2.11 gives an example of deterministic and non-deterministic LTSs.

Figure 2.11: Deterministic and non-deterministic states
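Definition 8 is directly checkable on the adjacency representation used in the earlier sketches. The Python fragment below is again illustrative (it assumes no duplicate transitions with the same action and destination); the two toy LTSs mirror figure 2.11.

```python
# Definition 8 as a predicate: no state may offer the same action
# towards two different destination states (duplicate-free lists assumed).
def deterministic(lts):
    return all(len({a for a, _ in trans}) == len(trans)
               for trans in lts.values())

DET  = {"d": [("a", "l")], "l": [("b", "r")], "r": []}
NDET = {"n": [("a", "l"), ("a", "r")], "l": [], "r": []}
print(deterministic(DET), deterministic(NDET))  # True False
```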

Remark 3 (Bisimilarity and trace equivalence). Let λ and µ be LTSs. Then λ ∼ µ implies Tr(λ) = Tr(µ), while trace equivalence does not, in general, imply bisimilarity.


Chapter 3

Process algebraic code generation: state of the art

3.1 Code generation and execution approaches

Figure 3.1 shows some approaches adopted to shape executable systems from process algebraic descriptions. For each approach, the components in black must be implemented from scratch. These approaches can be briefly described as follows:

Figure 3.1: Process algebra transformation and execution


I (extended) process algebras as interpreted programming languages;

II (extended) process algebras as compiled programming languages;

III (extended) process algebras compiled to ad-hoc bytecode;

IV (extended) process algebras compiled to pre-existing bytecodes;

V generation of high-level language code stubs assisted by additional abstractions;

VI generation of high-level code stubs.

The main objective of approaches I, II, III and IV is to obtain execution directly from process algebraic specifications. Of course, such approaches require language extensions, since process algebras are not expressive enough to convey many aspects of computation (control flow direction, data representation, algorithms and more). The term extended process algebra will be used in the following sections to denote such languages.

Approaches V and VI intend to produce code stubs, so every aspect unrelated to synchronization, execution traces, etc. can be left to the developer. Advantages and drawbacks of each approach are briefly discussed.

Development

Interpreters are usually easier to develop than compilers [11].

The complexity of compilers can be cut down by taking advantage of the functionalities offered by the target language¹. So, compilers targeting bytecode can be significantly simpler than those producing machine-dependent object code. Among these, those targeting ad-hoc designed bytecodes are, of course, the simplest. Such benefit, however, is set off against the need of implementing a brand-new virtual machine.

Generating high-level code stubs can be a simpler alternative to interpretation and compilation. Process algebra-specific abstractions can be implemented in additional target-language frameworks in order to ease code generation. However, the dependency on a particular framework can undermine code portability and requires a further effort from developers when dealing with the generated code.

Performance

Approaches based on interpretation and virtualization perform worse than those based on compilation towards machine-dependent object code. Virtual machines implementing Just-In-Time compilation can perform better, but they are also more difficult to develop. The performance of high-level code generated following approaches V and VI depends on two main factors:

¹As an example, consider the Java bytecode instruction new, a powerful instruction which allocates a new object of the desired type. Without such instruction, the compilation of Java code allocating new objects would be far more complex. So, the Java compiler takes advantage of the features offered by the target bytecode to reduce compilation complexity. Such complexity, however, is simply moved onto the Java Virtual Machine.


• target language nature: interpreted, compiled to bytecode or compiled to machine-dependent object code;

• code generation optimizations;

Concerning optimizations, it is very important to remember that

Optimizing compilers are so difficult to get right that we dare say that no optimizing compiler is completely error-free! [12]

3.2 State of the art

3.2.1 CSP++

CSP++ (see [13]) generates C++ code stubs from CSPm specifications. CSPm is a CSP dialect enhanced with an expression language inspired by languages like Miranda and Haskell [14]. The stubs produced by CSP++ depend on a framework, the CSP++ object-oriented application framework, based on GNU Portable Threads. CSPm is used to specify a formally-verifiable control backbone of a system. The CSPm event mechanism is used to invoke user-coded functions written in C++.

CSP++ uses an extended process algebra and produces code stubs which depend on a particular framework.

3.2.2 CSP# to C#

[15] proposes an approach to generate C# code stubs from CSP# models. CSP# is a CSP dialect providing high-level modeling operators and supporting shared variables and asynchronous communication channels. The generated code depends on a C# library called PAT.Runtime, which implements CSP# operators in C#.

This approach is based on an extended process algebra and produces code stubs which depend on a particular framework.

3.2.3 Finite State Process Autocoder

Finite State Process Autocoder [16] is a code generating back-end for the LTSA tool ([17, 18]). The generated code uses a centralized component, called MasterChooser, whose task is to decide which action will be executed from a collective action pool.

FSP Autocoder uses plain FSP, but the produced code still needs an additional component.


3.2.4 PADL2Java

PADL2Java [19] is a software tool that translates PADL models into Java code. PADL2Java provides a package called Sync to enhance Java with PADL abstractions, in order to shorten the distance between formal specification and target language.

PADL2Java uses plain PADL, but the target language is enhanced with a PADL-specific framework.

3.2.5 Process Calculus Virtual Machine

[20] presents a virtual machine for a strongly typed, polymorphic, concurrent, object-oriented programming language based on the TyCO process calculus. A compiler translates TyCO sources into bytecode, which is executed by an underlying virtual machine.

This approach implies the adoption of a heavily extended process algebra as the main programming language.

3.3 Comparison with fspc codegen

The approaches presented in section 3.2 are now compared to fspc codegen under the aspects of required human interaction, portability, correctness and performance.

3.3.1 Human interaction

Figure 3.2 (legend: process algebra; extended process algebra; high-level code stubs; high-level code stubs + framework; high-level source code; bytecode or object code; human intervention vs. automated tool steps)


As shown in figure 3.2, the approaches based on extended process algebras (3.2.1, 3.2.2, 3.2.5) require further modeling efforts, since the specification language is enriched with exogenous concepts like data representation, data types, communication, algorithms and more.

If the generated stub code is tied to a particular framework (3.2.1, 3.2.2, 3.2.3, 3.2.4) the user is required to undergo a preliminary framework-learning phase.

The approaches based on direct compilation of extended process algebras (in 3.2.5, compilation to bytecode) require the least user intervention.

fspc codegen uses a plain process algebra (namely, FSP), so the modeling phase is restricted only to synchronization and concurrency aspects. Of course, all other aspects of computation have to be specified by the user during the stub-filling phase. Such operation does not require any preliminary learning phase, since the produced stubs are not tied to any particular framework.

3.3.2 Portability

The dependency on particular frameworks (3.2.1, 3.2.2, 3.2.3, 3.2.4) hampers source code portability, since the framework has to be ported too. Moreover, software dependency management can be difficult and a source of subtle problems.

The portability of bytecode (3.2.5) is linked to the underlying virtual machine.

fspc codegen code stubs can be freely ported to any architecture supporting the target language.

3.3.3 Correctness

Defining process algebra-related abstractions in frameworks shortens the distance between the input process algebra and the target language [19]. Consider, as an example, the following case: a Java (target language) class LocalProcess could be defined to model the FSP (input process algebra) concept of local process; then, the translation of an FSP local process would be a simple LocalProcess class instantiation. If a substantial part of the process algebra is modeled in the framework, the code generation transformation becomes similar to an isomorphism.

If the code generation transformation is sufficiently simple, its correctness could be proven by very simple means. However, the correctness of the framework implementation must be proven too.

A detailed dissertation on fspc codegen design correctness is presented in chapter 5.


Chapter 4

fspc codegen

4.1 codegen overview

codegen is an fspc extension module providing code generation functionalities. Correctness and safety are by far the most important design goals, and easy extensibility has been taken into high consideration. Performance constraints, instead, are not pressing, since the terms of comparison are human reasoning and handwritten implementation times.

Figure 4.1: Classic information flow in software development

Figure 4.1 shows the classic information flow in software development. The design of codegen is based on this prototypical situation: code generation is seen as incremental information refinement carried out by specialized actors.

Analysis can be tailored to specific concurrency mechanisms (semaphores, monitors, remote procedure call...), highlighting useful information, hiding redundancies and using proper data structures to represent models. As a consequence, developers must be able to handle heterogeneous models. Thus, figure 4.1 can be refined as shown in figure 4.2: the selection of a strategy for code generation (method resolution) depends both on the required concurrency mechanism and on the target language (figure 4.3).


Figure 4.2: Detailed information flow in software development

Figure 4.3 (formalism: semaphores, monitor, actors, RPC; language: C, C++, Java, Python; types, algorithms)


An efficient approach to provide method resolution based on several parameters would be to store function pointers in a map whose keys depend on the tuple <mechanism, language>. Such approach, however, has some drawbacks, namely:

• non-uniform call sequence;

• explicit (thus error-prone) call resolution;

• more than linear growth of initialization code².

Exploiting polymorphism for call resolution would have none of these drawbacks. Unfortunately, C++ does not allow method resolution based on the dynamic type of multiple arguments [21]. A common solution to this problem is the popular Visitor design pattern [22], which offers double dispatch depending on the dynamic types of two objects. However, this solution suffers from some problems too. The solution used in codegen is based on Visitor but does not have many of its drawbacks. Appendix A contains all the details.
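The map-based resolution discussed above can be sketched in a few lines (Python is used here for brevity, although codegen itself is C++; every function and key name below is hypothetical). Note how resolution is explicit at the call site, and how the table must be filled with one entry per <mechanism, language> pair:

```python
# Sketch of the map-based method resolution the text argues against:
# one entry per <mechanism, language> pair, resolved explicitly.
def monitor_to_java(model):   return f"// Java monitor for {model}"
def monitor_to_cpp(model):    return f"// C++ monitor for {model}"
def semaphore_to_java(model): return f"// Java semaphores for {model}"

DISPATCH = {
    ("monitor", "java"):   monitor_to_java,
    ("monitor", "c++"):    monitor_to_cpp,
    ("semaphore", "java"): semaphore_to_java,
    # ...one line per pair: initialization grows with mechanisms x languages
}

def generate(mechanism, language, model):
    try:
        return DISPATCH[(mechanism, language)](model)   # explicit resolution
    except KeyError:
        raise ValueError(f"no generator for <{mechanism}, {language}>")

print(generate("monitor", "java", "PROCESS"))
```

Polymorphic double dispatch removes both the explicit lookup and the hand-maintained table, which is the motivation for the Visitor-based design described above.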

Figure 4.4: User requirements

Figure 4.4 shows a classification for user requirements in the context of automated code generation. In particular, behaviour and safety are model-level specifications, while language requirements and the use of a particular concurrency mechanism are target-language constraints.

In order to be manipulated by software, user requirements must be expressed in a form that can be easily read by a machine. In the specific case, behavioural requirements are embodied by FSP processes (explicitly designed to be easily machine-readable), while the remaining requirements (concurrency mechanism, target language and more) can be expressed by the user through a simple regular language (see 6.3).

Concurrency mechanism and language requirements can be used as routing coordinates to select a proper code generation path in situations similar to those modeled in figure 4.2. Once the proper code generation path has been chosen, the right internal model can be built, adhering to the requested concurrency mechanism, and used in code generation towards the right target language.

²For m concurrency mechanisms and n languages, the number of lines of code needed to initialize the call-map grows as m·n.

4.2 codegen design

4.2.1 Integration with fspc

Figure 4.5: codegen integration with fspc

Figure 4.5 shows codegen integration at a functional level. FSP specifications are compiled into LTS models by the FSP compiler module. Safety and liveness constraints, concurrency mechanism and target language can be specified either through an interactive shell or through scripts. Safety and liveness analysis is done by the LTS Analyzer module. codegen internals are dynamically rearranged to satisfy the requests.

4.2.2 Information representation

As said in section 4.1, code generation is seen as an incremental information refinement. Each refinement step implements a model-to-model or model-to-text transformation.

Figure 4.6: Information classes

Information is represented by the Information class hierarchy (figure 4.6), which has been set up in order to allow uniform manipulation. An Information is a visitable element. The result of a visit is a new information.

User requirements are modeled by the UserRequirements class, while each mechanism-dependent internal representation is modeled through a dedicated class. Source code is represented as an instance of SourceCode, which also implements the Serializable interface. SourceCode is a composition of SourceFile.

4.2.3 Information refinement

Figure 4.7: Analyst and developer concepts

Analyst and Developer (figure 4.7) are Information visitors. In particular, Analysts can handle UserRequirements, while Developers can produce SourceCode.

Each Analyst implementor can visit UserRequirements producing a more refined model, thus implementing a model-to-model transformation. Analyst and Information are roots for a covariant types hierarchy, as shown in figure 4.8.

Figure 4.8: Analyst information flow

Each Developer implementor can visit a variable subset⁴ of the possible mechanism-dependent representations made by Analysts (figure 4.9). Developers produce SourceCode, thus implementing a model-to-text transformation.

Figure 4.9: Developer information flow
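The Analyst/Developer pipeline of figures 4.7-4.9 can be illustrated with a toy Python rendition. The class names mirror the text, but every method body, the requirements format and the generated stub shape are invented for this sketch; codegen's actual C++ design is discussed in appendix A.

```python
# Toy rendition of the Analyst -> model -> Developer pipeline:
# a model-to-model step followed by a model-to-text step.
class MonitorSpecification:
    """Mechanism-dependent internal model (illustrative fields only)."""
    def __init__(self, name, actions):
        self.name, self.actions = name, actions

class MonitorAnalyst:
    def visit(self, requirements):           # model-to-model transformation
        return MonitorSpecification(requirements["process"],
                                    requirements["actions"])

class JavaDeveloper:
    def visit(self, spec):                   # model-to-text transformation
        methods = "\n".join(
            f"    public synchronized void {a}() {{ /* stub */ }}"
            for a in spec.actions)
        return f"public class {spec.name} {{\n{methods}\n}}"

requirements = {"process": "Lock", "actions": ["lock", "unlock"]}
source = JavaDeveloper().visit(MonitorAnalyst().visit(requirements))
print(source)
```

Each refinement step consumes one information representation and produces a more concrete one, which is exactly the incremental-refinement view stated in section 4.2.2.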

User requirements refer to LTS models, which are generated and managed by the fspc environment. Both requirements analysis and code generation could need to access these models and the fspc analytical functionalities. Thus, concrete Analysts and Developers

⁴Depending on the specific Developer type, some visits could make no sense. As an example, there is no meaning in trying to produce occam-π code from a MonitorSpecification. A classic Visitor approach cannot handle such situations well (a visit method has to be defined for each type in the visitable hierarchy, even if that visit has no meaning), while the solution used in codegen (see appendix A) can.
