Modelling trophic network with PEPA and Bio-PEPA

(1)

Ca’ Foscari University

MSc in Computer Science

MODELLING TROPHIC

NETWORKS IN PEPA AND

BIO-PEPA

Moscardo Marco, 841583

Supervisor: Cocco Nicoletta

12

sn

March 2015

A.A. 2013-2014

(2)

(3)

1 Introduction 7

2 Trophic networks 9

2.1 Trophic networks . . . 9

2.2 Trophic Analysis - First step: Rappresentation . . . 11

2.2.1 Graph representation . . . 12

2.2.2 Matrix representation . . . 13

2.3 Second step - Trophic Analysis . . . 13

2.3.1 Descriptive Parameters for Trophic Networks . . . 13

2.3.2 Flow matrix . . . 16

2.3.3 Dietary and Length Pathway Analysis . . . 17

2.3.4 Dependency Analysis . . . 20

2.3.5 Activity analysis . . . 21

2.3.6 Efficency Analysis . . . 22

2.3.7 Contribution analysis . . . 23

2.3.8 Total impact analysis . . . 23

2.4 Analysis of Cycling . . . 24

2.5 Dynamic Analysis . . . 24

2.6 Tools for representing and analysing trophic networks . . . . 25

2.6.1 Ecopath with Ecosim . . . 25

3 PEPA and Bio-PEPA 27 3.1 PEPA . . . 27

3.1.1 System representation: Syntax of PEPA . . . 28

3.1.2 System behaviour: PEPA Labelled Transition System 30 3.1.3 PEPA properties . . . 33

3.2 Bio-PEPA . . . 36

3.2.1 System representation: Syntax of Bio-PEPA . . . 37

3.2.2 System behaviour: Bio-PEPA Labelled Transition Sys-tem . . . 42

3.2.3 Bio-PEPA Properties . . . 43

3.3 Tools for PEPA and Bio-PEPA . . . 47 3

(4)

4 The trophic network to be modelled: Wood 51

4.1 Introduction to the problem . . . 51

4.2 Description of Wood . . . 51

4.3 Species and interactions . . . 52

4.4 Weight of interactions . . . 54

4.5 Initial values of Biomass . . . 54

4.6 Graph of Wood . . . 55

5 Modelling Wood in PEPA 57 5.1 How to define a PEPA model . . . 57

5.2 Analyses offered by PEPA tool . . . 58

5.3 Model Wood1 . . . 59

5.3.1 Set of Components C . . . 59

5.3.2 Set of Activities Act . . . 59

5.3.3 Set of Activity Rates R . . . 59

5.3.4 Sequential Component P . . . 60

5.3.5 Analyses . . . 61

5.4 Model Wood2 . . . 63

5.4.5 Analyses . . . 64

5.5 Model Wood3 . . . 65

5.5.5 Analyses . . . 66

5.6 Model Wood4 . . . 68

5.6.5 Analyses . . . 70

5.7 Discussion on PEPA Models . . . 70

6 Modelling Wood in Bio-PEPA 73 6.1 How to define a Bio-PEPA model . . . 73

6.2 How to use functional rates . . . 74

6.3 How to use stoichiometry coefficients . . . 75

6.4 Analyses offered by Bio-PEPA tool . . . 76

6.5 Model Wood5 . . . 77

(5)

6.5.2 Set of Quantities N . . . 77

6.5.3 Set of Functional Rates FR . . . 78

6.5.4 Set of parameters K . . . 79

6.5.5 The model component P . . . 79

6.5.6 Set of sequential components Comp . . . 79

6.5.7 Comments on the model . . . 80

6.5.8 Static analysis . . . 80

6.5.9 Dynamic analysis . . . 81

6.5.10 Wood5’ introducing a CO2 generator . . . 81

6.6 Model Wood6 . . . 83

6.6.1 Set of Compartments V . . . 84

6.7 Model Wood7 . . . 86

6.8 Model Wood8 . . . 90

6.8.10 Discussion of Bio-PEPA Models . . . 94

(6)

A Translation into code 99

A.1 PEPA Eclipse’s tool syntax . . . 99

A.2 Bio-PEPA Eclipse’s tool syntax . . . 100

B Models Code for the Models of Wood 103 B.1 PEPA . . . 103 B.1.1 Model Wood1 . . . 103 B.1.2 Model Wood2 . . . 104 B.1.3 Model Wood3 . . . 104 B.1.4 Model Wood4 . . . 105 B.2 Bio-PEPA . . . 106 B.2.1 Model Wood5 . . . 106 B.2.2 Model Wood5’ . . . 107 B.2.3 Model Wood6 . . . 107 B.2.4 Model Wood7 . . . 108 B.2.5 Model Wood8 . . . 109

(7)

Introduction

A trophic network is a set of organisms in an ecological community that are linked to each other through the transfer of energy and nutrients. Such networks are modelled from the ecologists to study, analyse and simulate the behaviour of them under some conditions. The main software used in this field is Ecopath with Ecosim, EwE [13]. This software permits to the ecologist to calculate the missing data by using specific ecological equations representing properties of the data. It also permits to represent and anal-yse the average state of an ecosystem description and to analanal-yse various properties of such systems.

This thesis focuses on modelling trophic networks by using tools devel-oped in the field of process algebras to model complex systems, evaluating their advantages and disadvantages. The first tool that we consider is PEPA [5] and the second one is Bio-PEPA [1]. PEPA is a stochastic process al-gebra which is used for modelling systems composed of concurrently active components which co-operate and share work. PEPA allows the modeller to study either behavioural or performance properties. This process alge-bra has been developed for computer systems and not for ecological systems. Our goal is to verify if it can model also trophic networks. The second tool is Bio-PEPA. It extends PEPA in order to handle some features of biochemical networks, such as stoichiometry and different kinds of kinetic laws. A main feature of Bio-PEPA is the possibility to support different kinds of analy-sis, including stochastic simulation, analysis based on ordinary differential equations (ODEs) and model checking in PRISM. Our goal is to study the application of both process algebras to trophic networks, their limitations and advantages. In such study we consider a toy case, Wood.

The structure of the thesis is as follows. Chapter 2 focuses on trophic networks. We give the main concepts to understand what a trophic network is and how it is represented. Then we introduce also some static analysis that are possible knowing only the structure of the ecosystem. We present a simple algorithm to find cycles in the trophic network. In the end of

(8)

the chapter, EwE, the main software that the ecologists use to analyse real trophic networks, is presented. In Chapter 3 we introduce PEPA and Bio-PEPA, through the definition of all the components that describe a PEPA model and a Bio-PEPA model. We explain a network behaviour by using a Labelled Transition system. For each process algebra we introduce also the concepts of bisimulation and isomorphism. Chapter 4 presents the toy case Wood. It shows all the species of the network and all the interactions that exist between the species. We give also a graph representation of Wood by using the method explained in Chapter 2. In Chapter 5 and 6 are explained the models of Wood developed for PEPA and Bio-PEPA, respectively. For each process algebra we implement four different models and for each model all the meaningful analyses that the tool provides are performed, where pos-sible. Then a short conclusion follows. Appendix A contains a small guide to translate PEPA and Bio-PEPA syntax into code for the corresponding Eclipses tools. The Eclipses tool codes for all the models of Wood is given in Appendix B.

(9)

Trophic networks

In this chapter we introduce the main concept of trophic networks, starting from the definition of ecological community and ending with the different type of analysis that we can do on them. Hence, we initially give the basic definition of the main components that compose trophic networks. Then we explain two technique to represent trophic networks and several static analysis that are used to extract some important informations using only the structure of the network. This chapter include also the definition of the two main descriptive parameters for a trophic network, namely connectivity and connectance, that are used to compare different trophic networks without the use of some kind of analysis. In the end we briefly introduce dynamic analysis and we show one of the most important and used software by ecologists to analyse and study trophic networks, namely Ecopath with Ecosim Ew E software.

All the information are taken from [12], [13], [7], [11], [10], [6], [8], [9].

2.1 Trophic networks

In the recent years trophic models have been widely used in ecology since they are very useful for purposes of analysis and prediction. These models permit to represent every kind of ecosystem and to analyse its properties and temporal evolution.

First of all we introduce all basic concepts to understand how a trophic network is structured and how it works.

The idea of a trophic network is to represent all the reactions between all the species in an ecological community.

Definition 2.1.1. An Ecological Community is an assemblage or asso-ciations of populations of different species that are in the same geographical area at the same time.

The study of an ecological community considers the reactions between the species in the community.

(10)

In the real world there are also nonliving components that affect the organisms. For this reason the concept of ecological community is extended to incorporate also the nonliving components. The result is called ecosystem. Definition 2.1.2. An Ecosystem is a community of living organisms, such as plants, animal and microbes, in conjunction with the nonliving compo-nents of their environment such as air, water and detrite which interact as a system.

Now, we focus on the reactions existing in a community. Let us image an reaction as an action that a species performs on another one. For example a possible reaction is: a leopard eats a goat. The system, in this case, is composed by two species: leopard and goat, and the reaction is eat. This system of species can be seen as a chain for that specific reaction. This chain is called food chain.

Definition 2.1.3. A food chain is a sequence of one single reaction in a biological community (an ecosystem) to obtain nutrition.

Obviously in a complex ecosystem can exist a lot of reactions and thus there can be more that one chain. In general all the reactions in an ecosys-tem can be seen as a network where the species in the ecosysecosys-tem are the elements of the network and the links between pair of elements are the species’ reactions. Such a network is called ecological network.

Definition 2.1.4. An Ecological network is a representation of biotic reactions in the ecosystem, where the species are connected with pairwise link.

There are two fundamental types of reactions:

• trophic: this type of reaction represents feeding connections;

• symbiotic: in this reaction species depend on each other for survival. Usually reactions are subdivided in different organizational levels. These levels are called trophic levels and they are used to specify the role that each species unroll in the ecosystem.

Definition 2.1.5. The trophic level of an organism is the position it oc-cupies in a food chain.

From the definition of food chain, we know that it is a succession of organisms that interact between themself. This chain has thus a starting point which is the element that is eaten and it never eats some species. The number of steps from the start of the chain to the current position of a species in the chain is a measure of its trophic level. Let us consider a simple system composed by three species: leopard, goat and green plants.

(11)

The food chain of this system is the following: leopard eats goat, goat eats green plants. Using this succession we can determinate the tropic level of each species. The green plants are in the trophic level 1, goat is in the trophic level 2 because it eat the green plants and occupies the second position of the food chain. Leopard is in the trophic level 3 because it eat the goat that is in the level 2.

Definition 2.1.6. An ecological network is called food web when all the reaction in the system are trophic.

Food webs provide a quantitative framework to link community structure (i.e. community ecology) with fluxes of energy and material (i.e. ecosystem ecology) and therefore reconcile biodiversity with ecosystem function.

A classical example of food web is the predator-prey model (generally defined by Lotka-Volterra differential equations [10]), where each animal species is represented as a node and a connection between two nodes exists if a species eats another one. The aim of the model changes the type of food web that is considered. In fact there are different categories of food webs:

• source web: the analysis is focused on the predators; • sink web: the analysis is focused on the prey;

• community web: the analysis is focused on a group of species and on all their connections;

• energy flow web: the analysis is focused on the fluxes of energy between nodes.

The flow networks most familiar to ecologists are trophic in nature, thus the most common food web is an energy flow web, that is a trophic network.

The analysis of a trophic network emphasizes the links between popula-tions and within the information of the species level, it is able to highlight the details of the structure of the community.

The trophic analysis is done through two steps:

• first step: describe the network representing all the reactions and the species;

• second step: apply mathematical and analytical concepts to extract information.

2.2 Trophic Analysis - First step:

Rappresenta-tion

To represent a trophic network, first of all we have to single out the n taxon we want to represent. A taxa can represent an individual, a population of a

(12)

given species or some aggregation of species populations according to guild, habitat, or trophic level that is the elements that will be studied. For each taxon, we have to determine which other taxa are included in its diet.

An important characteristic of an ecosystem is that it is generally not closed. It means that there are always flows of material or energy or both between the system and the ‘rest of the world’. The connections with the external world can be bundled in two main flows, one in input and one in output, without loss of generality. For this reason when we represent trophic networks, we have to take into account also the input and output flows. Example 2.2.1. Let us consider a simple ecosystem without connection with the external world. We suppose to have 4 species, where the reaction is eat:

• species 1 is eaten from nobody and it eats species 2, 3 and 4; • species 2 is eaten from species 1 and it eats species 3;

• species 3 is eaten from species 1, 2 and 4 and it eats nobody; • species 4 is eaten from species 1 and it eats species 3.

In the next section we show how represent this ecosystem.

2.2.1 Graph representation

One way to represent a trophic model is to create a graph where each node represents a taxa and each arc, used to link a pair of nodes, denotes an reaction between the nodes. A possible meaning of an arc from node A to node B is the presence of a flow of energy between A and B. The arc which represents the reaction can be directional. The directionality means that the starting node influences the ending node or vice versa. In this case we have a directed graph or digraph. In the real world some connections can happen more often than others and they are always greater in effect than others. For this reason it is assigned to each connection a quantity that distinguish the connection. The resulting graph is a weighted graph. Figure 2.1 shows the three different types of graph that can be used to represent the ecological network in the example 2.2.1.

(13)

2.2.2 Matrix representation

Another way to represent an ecological network with n taxa is by using a matrix n×n where in the row we insert the starting taxon and in the column the ending taxon. If the cell (i, j) contains one, then there is a connection from taxon i to taxon j. This connection may represent a flow of material or energy between the taxa. Otherwise the cell contains the value zero. This matrix is the adjacency matrix of a corresponding directed graph.

The main advantage of this representation is that we can apply linear algebra to it and use it for the analysis. For this reason matrix methods are the most used.

From the topology of the resulting graph (or the matrix of analysis com-position) we can learn a lot of information about how the system behaves, for example the presence of cycles.

The adjacency matrix of the example 2.2.1 is the following:     0 1 1 1 0 0 1 0 0 0 0 0 0 0 1 0    

2.3 Second step - Trophic Analysis

There are numerous experimental and mathematical methods to study trophic webs. Experimental methods include serological methods, the analysis of stomach contents, stable isotope analysis and indirect methods as the stud-ies on the correlations based on data from the census. In this section we focus only on the mathematical methods.

2.3.1 Descriptive Parameters for Trophic Networks

Trophic networks may be characterized by a series of structural parameters which describe one or more aspects of the network. Now we explain the most significant of them.

Connectance

Connectance represents one of the most important parameter in the analysis of trophic networks, because it allows to summarize more than one charac-teristic of a network in the form of a number easily comparable. It is defined as the ratio between the number of interspecific reactions that really exist among the elements of the network, and the maximum number of possible reactions. In other words it represents the probability of reaction of a species with any other elements of the community. The mathematical definition of

(14)

connectance is:

C = L

S (S − 1) (2.1)

where L represents the number of the reactions and S is the number of the species in the network. When the number of species is high, the expression (2.1) can be approximated as:

C = L S2

Let us consider in L only the reactions of predation. In this case we have connectance minimum. Instead, if in L we considered also the reaction of competition1, we have connectance maximum.

The calculation of the connectance is can be done by using the adjacency matrix, which summarizes enough information about the reactions in the network.

Example 2.3.1. Let us consider the example 2.2.1, the network is repre-sented graphically in figure 2.2, and the network reprerepre-sented in figure 2.3. We want to calculate the connectance minimum and maximum for both ex-amples and compare them.

Figure 2.2: Graph of example 2.2.1

Figure 2.3: Graph representation of other ecosystem

To calculate the connectance minimum we consider only the reaction of predation. This means that if a species is a prey for 3 different species,

1_{Competition is an reaction between organisms or species, in which the fitness of one}

is lowered by the presence of another. There are two type of competion: intraspecific competition that is among members of the same species, interspecific competition that is between individuals of different species.

(15)

then all the reactions are considered as only one reaction. As we can see from figure 2.2 the species 4 is a prey for the species 1, 2 and 3, thus it is considered as one reaction. Instead to calculate the connectance maxi-mum, we consider three reactions because we consider also the reactions of competition.

Hence, for the example 2.2.1 we have that there are 3 reactions of preda-tion and 5 reacpreda-tions of predapreda-tion and competipreda-tion. To use the equapreda-tion 2.1, we need the number of species in the ecosystem: there are 4 different species represented by the node of the graph.

connectance minimum C = 3 4 (4 − 1) = 3 12 = 0.25 connectance maximum C = 5 4 (4 − 1) = 5 12 = 0.42

Now let us consider the example 2.3.4. In this case we have 3 reactions of predation and 6 reactions of predation and competition. Thus the con-nectance maximum will be greater.

connectance minimum C = 3 4 (4 − 1) = 3 12 = 0.25 connectance maximum C = 6 4 (4 − 1) = 6 12 = 0.50

The second example has a greater connectance maximum than the first one, this means that the probability of reaction of a species with any other elements in the second community is greater than in the first community. If we consider the connectance minimum we have the same value and thus the same probability of reaction between two different species.

Connectivity

A parameter related to the connectance is connectivity, which is represented by the product S · C, where S is the number of species and C is the con-nectance maximum. This parameter takes a constant value and it oscillates between 4 and 5 in any system. This means that on average a species in-teracts with a few others, regardless of the total number of species present in the network. To clarify this point, we do a simple mathematical con-sideration. Let us consider a network with a large number of species S. In this case, simplifying and rearranging the terms of the relationship that expresses the connectivity we have:

S · C = L

S = constant

This property is also called density of links per species. From this property we can say that when the number of species in the network increase, the connectance decreases hyperbolically (Fig. 2.4)

(16)

Figure 2.4: Relation between connectivity and number of species

2.3.2 Flow matrix

With the adjacency matrix, the information on the weight of each reaction is lost. For this reason it is possible to define another matrix, called flow matrix, which contains, instead of weights, the rate of exchange among the taxa. This matrix will be denoted with the letter T. Each element Tij

represents the rate of a transfer from the species i to the species j.

The rate is an important notion and it is useful to know how many times an action happens. For a sufficiently long time interval, we can assume that the energy that flows is balanced around each species. Thus, around each taxon, i, we have that

Xi+ n X j=1 Tji= n X k=1 Tik+ Ei+ Ωi (2.2)

where Xi represent the rate of the external input flow, Ei is the rate of

output flow from taxon i to the outside world and Ωi is the dissipation rate.

Dissipation represents all the actions, as respiration or degradation, which generate a lower energetic form. For shorter time intervals, this balancing condition may not be verified.

In the real world many rates are unknown and an investigator gener-ally uses this condition to calculate them indirectly. The problem of un-known information is the biggest problem that the ecologist still encounters. Nowadays there are tabled information regarding the metabolism of differ-ent species which can be useful to obtain the desired rates. It is possible to multiply the density of the population with the corresponding factor in the table to obtain the rate.

(17)

Example 2.3.2. (Taken from [6]) Let us assume that the density of mi-crozooplankton is measured as 150 mgC m−2 and that published metabolic constants show their average consumption rate to be 160% per diem, then the total demand by these organisms should be roughly

150 mgC m−2 × 160

100 × 365 days = 87600 mgC m

−2

yr−1.

Respiration, excretion, and natural mortality for this population can be sim-ilarly reckoned from published tables.

2.3.3 Dietary and Length Pathway Analysis

Using a graph representation, the analyst can see the structure of the net-work and the connection of each node with the others. One analysis that can be done is how a node affect the whole system. In other word a species interacts immediately with only a limited number of other organisms, but it communicates indirectly with a much larger number of elements of the ecosystem. This indirect effect can be seen from the graph representation, but by using the matrix representation it is easier.

To analyse the indirect effect of a species it is used a matrix, called matrix of dietary coefficients, denoted with the letter G. Each element gi j

of this matrix represents the fraction which i comprises of the total input into j, i.e. the contribution of i to the diet of j. To calculate this matrix the flow matrix T and the input vector X are used. For each compartment j:

gij =

Tij

T.j+ Xj

(2.3) The elements gij represent the percentage of i in the diet of j. The power

of this matrix reveal the fractions of the diet of j given by i along pathways of length corresponding to the integer power to which G is raised. Another information that we can extract from Gn is that all the elements different from zero which appear in Gn mean that there exists at least a path from the node i to the node j with length n. Let us see a simple example. Example 2.3.3. (Taken from [6]) Let us consider this graph representation of a network:

then the adjacency matrix is the following:     0 1 1 1 0 0 1 1 0 0 0 1 0 0 0 0    

(18)

Figure 2.5: Simple example of network with four node

In the flow matrix T the ones are substituted with the rates of reactions:

T =     0 t1,2 t1,3 t1,4 0 0 t2,3 t2,4 0 0 0 t3,4 0 0 0 0    

The matrix of dietary coefficients G have thus the same form with differ-ent coefficidiffer-ents: G =     0 g1,2 g1,3 g1,4 0 0 g2,3 g2,4 0 0 0 g3,4 0 0 0 0    

At this point we can calculate the power of G and we obtain:

G2=     0 0 g1,2g2,3 g1,2g2,4+ g1,3g3,4 0 0 0 g2,3 0 0 0 0 0 0 0 0    

Each nonzero element of G2 is a term that represents all the pathways of length 2 that connect i with j. For example, element (1, 3) of G2 _{reveals how}

much biomass gets from 1 to 3 over a single two-step pathway 1 → 2 → 3. The element (1, 4) of G2 is composed by two two-step pathways, the first term represents the path 1 → 2 → 4 and the second one the path 1 → 3 → 4.

Now we calculate also G3:

G3=     0 0 0 g1,2g2,3g3,4 0 0 0 0 0 0 0 0 0 0 0 0    

(19)

The resulting matrix shows only one element different from zero, it repre-sents the path 1 → 2 → 3 → 4. It is possible to see this path also from the graph representation in figure 2.5.

If we want to identify all the paths with length 4, we have to multiply G3 with G and look for elements different from zero. In this case the resulting matrix contains only elements equal to zero, thus there exists no pathway longer than three.

Leontief structure matrix

The way in which the elements in G were normalized guarantees that each gij ≤ 1, making it highly probable that elements in the higher powers of G

will grow progressively smaller. In the 1949 Simon and Hawkins [7] demon-strated that the infinite series of matrix powers does converge to a finite limit and this limit is

lim{ I + G + G2 + G3 + · · · } → [I − G]−1

The matrix [I − G]−1is also called Leontief structure matrix, denoted by S, and the (i, j)-th element of it reveals the fraction of the total input into j that left i and travelled over all paths of all lengths to satisfy a final demand upon j of one unit. Leontief structure matrix is generally used to estimate the production of CO2by each species required to satisfy a prescribed vector

of final demands. The respiration is important to ecologists. In 1980 Levine [11] observed that the sum of each column of the structure matrix S was related to the number of trophic transfers that a species i traverses during its flow to a given species j. Now we use an example to show this relation. Example 2.3.4. (Taken from [6]) Let us consider an ecosystem represented in figure 2.6, where there are the compartments 1, 2, and 3 that are arrayed in chain-like fashion and the compartment 4 that receives only 5 of its 50 units of total flow at the fourth trophic level. It means that the compartment 4 has position four as fourth trophic level only in the chain 1 → 2 → 3 → 4. The compartment 4 receives 50 units and 5 of it is given from the compart-ment 3. It represents the 10% of its sustenance. From the compartcompart-ment 2 arrives 15 units as third trophic level in the chain 1 → 2 → 4 and from th compartment 1 arrives 30 units as second trophic level in the chain 1 → 4. Thus the 30% of its sustenance arrives from the third level and 60% from the second.

Hence, the effective trophic level of compartment can be reckoned as (0.6 × 2) + (0.3 × 3) + (0.1 × 4) = 2.5 .

In the S matrix corresponding to this network, the first three columns sum to 1.0, 2.0, and 3.0, respectively, and the sum down the fourth column is 2.5, as just calculated.

(20)

Figure 2.6: Graph representation of real ecosystem Lindeman trophic transformation matrix

A very similar matrix that use the same idea of the Leontief structure ma-trix is the Linderman trophic transformation mama-trix and it is denoted with the letter L. Each elements Lij of this matrix reveals the fraction by which

taxon j feeds at the i-th trophic level. As for the Leontief structure matrix, Linderman trophic transformation matrix uses the matrix of dietary coeffi-cients G. It is defined by levels, where each level is represent by a row, in this way:

• the first level (L1)T is defined using the coefficients (gO)T, that

repre-sent all the external inputs of the system;

• the other levels (greater that level 1) are defined recursively as (Li)T =

(Li−1)T G.

Let us apply it to the example 2.3.4.

Example 2.3.5. (Taken from [6]) The Linderman trophic transformation matrix L of the network is:

L =     1.0 0 0 0 0 1.0 0 0.6 0 0 1.0 0.3 0 0 0 0.1    

As we can see, the columns of the matrix represent the trophic levels and the row define the compartment, namely the row 1 describes the compartment 1 varying the trophic levels. From the matrix we can see that the first three taxon act wholly at one of the first three trophic levels, but that taxa 4 is partitioned as described in the example 2.3.4.

2.3.4 Dependency Analysis

There is another important matrix called the total dependency matrix, indi-cated by D. The element dij represents how much of what arrives at j can be

(21)

traced to a particular species i. The total dependency matrix is calculated using the Leontief structure matrix S and the flow matrix T. The formula to calculate each element is the following:

dij = (Sij − δij)(

Ti .

siiT. j

)

where δij is the element of the identity matrix I. To represent all the

in-formations in the system, the indexes run from 0 to n+2. The element T0j

represents the external inputs to j, T_i,(n+1)are the useable exports from i to other comparable systems, and Ti,(n+2) are the dissipative losses from i.

The matrix D in the column j contains the indirect diet of j, namely how much j ‘eats’from each element i in the ecosystem. In other words column j shows the amounts by which j depends upon the activity of each element in the ecosystem.

Another utilisation of the matrix of indirect diets is to differentiate trophic roles.

Example 2.3.6. (Taken from [9]) The Chesapeake mesohaline ecosystem is host to two piscivorous predators, striped bass (Morone saxatillis) and bluefish (Pomotatus saltatrix). One would expect heavy competition between them. Their indirect diets reveal (among others) the following indirect de-pendencies:

• striped bass depends on zooplankton: 65.8%; • bluefish depends on zooplankton 28.7%; • striped depends bass on polychaetes 1.8%; • bluefish depends on polychaetes 48.0%.

These apportionments reveal significant stratification of trophic resources between the predators. Striped Bass ultimately shows a high dependency on Pelagic2 production, whereas Bluefish derives their resources more from Benthic1 secondary production3.

2.3.5 Activity analysis

An important information which ecologist are interested in is the activity among species that flows directly from a species i to some other one j. The

2_{Pelagic fishes are a group of fishes that live in open sea, while Beuthic fishes live at}

the sea bottom

3

The secondary production is the transformation, through consumption, of biomass into other forms.

(22)

activity is represented by the matrix of host coefficients, denoted by F, which depends from the flow matrix T.

fij =

Tij

(Ti .+ Ei+ Ωi)

It is possible to notice that this matrix employs a normalization across the rows of T, instead of a normalization down T columns.

Output structure matrix

The corresponding output structure matrix is given by Σ = (I − FT)−1. It is useful to estimate how much of the activity of any node j is generated by a single unit of primary input from i.

2.3.6 Efficency Analysis

A fondamental aspect of a network is its efficiency. This propriety is rep-resented by the total contribution matrix, denoted by C. Its elements cij

represent the efficiencies with which biomass flows from a compartment i to j.

This matrix is based on the same concept that was used for the total de-pendency matrix D. The calculation of each element is done in the following way:

cij = (σji− fji)(

Ti .

σiiT. j

) where σij is the element (i, j) of Σ.

Example 2.3.7. (Taken from [8]) In this example the matrix C is used to compare the trophic efficiencies of two networks. Two tidal marsh4

ecosys-tems situated near the Crystal River nuclear power plant on the west coast of Florida are considered. The idea is to study two or more species into an ecosystem with different conditions and compare them using the C matrix. Two networks are compared: the first network N1 is situated on the thermal outfall from the plant ( with variation of temperature ∆T ≈ 6 C), while the second marsh N2, was far away and thus it is not influenced by the ther-mal effluent. Calculation of the total contribution coefficients of primary production5 to the Gulf Killifish (Fundulus grandis) and to two Needlefish (Strongylura marina and Strongylura notata) revealed the following: The impact of the thermal effluent was obvious. Heated water lowered the overall efficiency of the ecosystem for producing top carnivores by ∼50-60%. The calculation can be seen in detail in [8].

4_{Tidial marsh is a marsh that is found along coasts} 5

Primary production is the process in which living organisms form energy-rich organic material (biomass) from energy-poor inorganic materials in the environment through pho-tosynthesis

(23)

N2 N1 Efficiency Gulf Killifish 0.147 × 10−3 0.67 × 10−3 (-54%) Needlefish 0.338 × 10−3 0.140 × 10−3 (-59%)

2.3.7 Contribution analysis

A limitation of all the previous analysis is that the flows between the nodes are only positive. Some kind of analysis could focus on the negative effects of some species that can propagate throughout the system. The same linear algebra techniques used to quantify positive contributions can evaluate as well the net negative trophic impact that one species has on any other.

Let us assume to be interested to know the overall effect of i on j. To extract this information we have to take the matrix G, which quantifies the positive impact of i upon j, and subtract the matrix F, which measures the negative impact that i has on j. Unfortunately, we can not use directly the matrix F because each element of it is normalized by the total output from i, rather than the predatory losses that i sustains. For this reason we have to remormalize F by its secondary production. This matrix will be denoted as F∗. The elements of the renormalized matrix F∗ are calculated as:

f_ij∗ = _P_nlTij

m=1Tim

where nl is the number of living components of the full suite of n ecosystem components.

In this way we can have a more accurate measure of the net direct effect that i has on j. Thus the calculation is the following:

qij = gij− fji∗

where −1 ≤ qij ≤ 1.

2.3.8 Total impact analysis

At this point we have all the elements to extract the net trophic impact (direct and indirect) of any given compartment i upon any other j. It is the element (i, j) of the matrix of net total impacts, denoted by M, and it is calculated as

M = [I − Q]−1− I.

All matrices and indexes described so far are used to describe only steady-state or temporally averaged networks. Several attempts have been made to expand I/O theory to treat time-varying systems. The main idea is to apply information-theoretic methods to a time series of network snapshots depicting the dynamics over an interval.

(24)

2.4 Analysis of Cycling

A very important step during a trophic analysis is the analysis of cycles. The presence of them indicates that in the ecosystem there is a reutilization of material or there are food chains in which a prey may eat a predator. The Finn Cycling Index FCI [12] was defined To analyse such situations and figure out indexes useful to quantify the percentage of all fluxes that is generated by cycling and compare different network.

This index can be calculated with three different combination of ma-trixes: the first combination uses the Leontief structure matrix S and the flow matrix T, the second one substitutes the Leontief structure matrix with the dependency matrix D, while the third one uses the total contribution matrix C and the flow matrix T. The first method is the most commonly used because its calculation is easier and faster than the other, but the uti-lization of the second and the third combination represents more accurately the probability that a given quantum leaves a particular taxon and returns to it.

Let us focus on the calculation of this index. To calculate the FCI we have to multiply all the diagonal elements of the Leontief structure matrix S with the rows of the flow matrix T, to sum it and then divide it by the total system throughflow T.. that is the double summation of the flow matrix T.

More precisely:

FCI = P

i[Ti.Sii]

T..

The same procedure is used when S is substituted with the dependency matrix D or the total contribution matrix C.

When the ecosystem is perturbed, this index can increase even though it must be constant. This variation of FCI represents a problem because in this case the index can not reflect the developmental status of an ecosystem. Other problem that can provoke a variation of FCI is the presence of the cycles. For this reason, during the analysis the cycles are removed and then studied separately. Hence, first of all we must find the cycle. To find the simple cycles one may perform a depth-first search with backtracking.

2.5 Dynamic Analysis

All the analyses presented previously are static, but it is generally interest-ing to do also dynamic analysis of a trophic system. To do this, ordinary differential equations ODEs that describe all the reactions for each compart-ment, are used. The idea is to represent dynamically the temporal variation of biomass for each species. A variation happens when an reaction with another species happens. When the system of differential equations is built, the mathematical techniques to solve this system can be applied to extract

(25)

all the information that we need, such as finding an equilibrium in the pop-ulations. We show in the following the well-known Lotka-Volterra ODEs model of prey and predator.

Example 2.5.1. (Lotka-Volterra model)

This model is also known as Predator-Prey model because it describes dy-namically the variation of population of two species where individuals of one species, predators, eat individuals of the other one, preys. Hence, the model is described by two differential equations, one for the predators and one for the preys. Each equation describes how the population of predators or preys changes. The aim of the model is to describe the dynamics of reaction be-tween the species.

Now we show a simple example of Predator-Prey model, where the preda-tors are snakes and the preys are rats.

The model is composed by two equations, one for the rats population R and one for the snakes population S, and the dynamics of the reaction is described by using two differential equations:

dR

dt = αR(t) − βR(t)S(t) dS

dt = ηβR(t)S(t) − γS(t)

where α is the growth rate of R, γ is the death rate of snakes where R are not available (snakes don’t eat), β is death rate of rats due to the fact that S eats R, η is the efficiency of predation. The product of R(t)S(t) represents the reaction between rats and snakes.

To describe better the variation of the two populations, the constant α, β, γ and η are often substituted with functions which depend from time.

2.6 Tools for representing and analysing trophic

networks

2.6.1 Ecopath with Ecosim

The most important software developed in the ecological field is Ecopath with Ecosim, EwE, by the Fisheries Centre of the University of British Columbia [13]. This is a free ecosystem modelling software suite which can be used to address ecological questions, to evaluate ecosystem effects of fishing, to model effects of environmental changes and for other analyses on the ecosys-tems. This software permits to the ecologist to calculate the missing data by using specific ecological equations representing properties of the data. This is an important point, since the solution of the problem of unknown data is a main problem in the ecological field and through the utilisation of this

(26)

program it is possible to do analysis in spite of missing data. This is the most used software in the ecological community.

This software is divided into three components that can work indepen-dently:

• Ecopath gives a static mass-balanced snapshot of the system;

• Ecosim provides a time dynamic simulation module for policy explo-ration, it uses differential equations to describe the ecological network; • Ecospace is a spatial and temporal dynamic module primarily designed

for exploring impact and placement of protected areas.

Ecopath describes the system by using mass balance equations, then Ecosim studies its behaviour using differential equations and finally Ecospace analyses the impact of some actions. All of these components may work dy-namically based on the system of differential equations.

(27)

PEPA and Bio-PEPA

In this chapter we introduce two stochastic process algebras used to per-form different types of analysis: PEPA and Bio-PEPA. In the first part we give the fondamental definition of PEPA and its properties. The proper-ties explained are isomorphism and strong bisimulation which are the most important properties for a PEPA system. Then we also introduce the idea of weak isomorphism and weak bisimulation that are weaker form of iso-morphism and bisimulation. In the second part of the chapter we give the definition of Bio-PEPA systems, followed by the definition of Bio-PEPA components. Then we give some important properties that will be useful in the analysis. As for PEPA the most important properties for Bio-PEPA are isomorphism and strong bisimulation. In this case, therefore, we don’t have any weak form of isomorphism and bisimulation.

All the information are taken from [1], [14], [5], [15], [16], [17].

3.1 PEPA

PEPA [5] is a stochastic process algebra defined for the performance analy-sis of computer systems with concurrent behaviour, which extends classical process algebras such as Milner’s CCS [16] and Hoare’s CSP [17] by intro-ducing probabilistic branching and timing of transitions. A PEPA model is composed by a set of components, or agents, which interact through a set of actions. This process algebra assigns to each action a duration which is represented by a random variable with a negative exponential distribution. PEPA offers a set of combinators to describe the possible reactions in a system. A system is described as the concurrent reaction of simple sequential components.

PEPA represents each possible state of the systems as interacting com-ponents. PEPA allows to study quantitative properties of a system of inter-acting processes such as utilisation, throughput, response time and permits also to find deadlocks if they exist.

(28)

We will introduce the main notions that are the basis of the definition of a PEPA model. We will not give a formal definition but only the idea of what component, activity and activity rate are.

We can consider as components all the identificable parts of the system. Let us consider for example a queue, it is composed by an arrival component and a service component which interact. An activity in PEPA is different from the usual process algebra notion of an instantaneous action. A PEPA activity is defined by a pair of elements (α, r), where α represents the action type, namely the name of the action, and r is the associated duration of this action and it is called rate. r is expressed as a random variable with an exponential distribution. The set of all the rates is called set of activity rate.

We give a brief panoramic of PEPA operators:

Prefix: (α, r) . P is the basic term of PEPA. It means that there is a com-ponent with action type α and duration 1/r followed by the process P.

Choice: P + Q means that the system can behave in two different ways, but only one can happen. For example if P happens, then Q is discharged. Constant: C def= P represents a component C whose meaning is given by a defining equation. With this notation we can assign names to different patterns and we can use recursion in the definition.

Hiding: P/L means that all the activities which are perfomed by the compo-nent P and are contained in L can be considered internal or private. Cooperation: P BC

L Q denotes cooperation between P and Q over the cooperation

set L. The cooperants are forced to synchronise for every action in-side L. For actions not in the cooperation set, P and Q can behave independently one at a time. If L is empty, then we have that the two processes work in parallel. Parallelism is represented by shuffling.

3.1.1 System representation: Syntax of PEPA

The syntax of a PEPA component or process, P , is the following: P ::= (α, r) . P | P + Q | P BC

L Q | P/L | X | C. (3.1)

In the prefix term (α, r) . P , the parameter r is the rate of the action α. The operator “+”expresses a choice between possible actions. The constant C must be defined by an equation C def= P . The process P BC_L Q denotes synchronisation between components; the set L determines the activities on which the operands are forced to synchronise. When L is empty the cooperation operator may be substituted by the parallel operator k. The

(29)

process P/L represents hiding, it means that P can evolve with any action except for any action in the set L. All activities in L are hidden and their type is not witnessed upon completion. When the process P performs an action in L it will appear as τ1, namely the action is represented as an internal delay. X represents a process variable: it means that an indexed set of variables X may be replaced by an indexed set of components P [14].

Now we give the formal definition of a PEPA system.

Definition 3.1.1. A PEPA system P is a 4-tuple hC, Act, R, P i, where: • C is the set of components;

• Act is the set of possible activities;

• R is the set of activity rates, { x | x > 0 ; x ∈ R} ∪ {>}; • P is the sequential component describing the system.

where > represents the unspecified activity rate which is the fastest possible rate.

We may use the unspecified activity rate > to define passive compo-nent with respect certain activities shared by many processes with different associated rates. The unspecified rate > means that the component does not contribute to the execution of the activities. In this way the active components determine the rate since PEPA in the cooperation between two components. A model where it contains a component which has an activity with unspecified rate and not shared, is called incomplete.

Some inequalities and equations define the comparison and manipulation of the unspecified activity rates >:

r < w> for all r ∈ R+ and for all w ∈ N w1> < w2> if w1< w2 for all w1, w2∈ N w1> + w2> = (w1+ w2)> for all w1, w2∈ N w1> w2> = w1 w2 for all w1, w2∈ N

Now we show an example in which we define a simple PEPA model of a system of interacting processes.

Example 3.1.1. Let us suppose to have two different users that want to use the same computer. This computer can be used only by one user and the other user must wait until the cumputer becomes available again. Hence we may represent each user as a process which performs two actions, such as use and think, and represent the computer as a process which can perform

1

(30)

two actions, such as use and reset. Each action will have a different rate, but we suppose that the time in which a computer is used by a user is the same for both users. Let us consider that the duration of the utilisation of the computer is equal to 1/u, then the rate is equal to u. For the component Computer we use the unspecified rate > because the computer is used by the users and the duration of the utilisation is given by the specific user. In this case the component Computer is passive with respect the activity use. About the rates of the actions think and reset, we assign them randomly.

We give the definition of the three processes, namely two for the users and one for the computer.

User1 def= (use, u).(think1, t1).User1; User2 def= (use, u).(think2, t2).User2; Computer def= (use, >).(reset, r).Computer;

Now we have to define the sequential component, Sys, to describe the be-haviour of the system. We must use the cooperative operator on the action use.

Sysdef= (User1 k User2) _BC

use Computer

We give the set C of the components, the set Act of activities and the set R of the activity rates of this system:

C = { User1, User2, Computer } Act = {| (use, k1), (use, k2) |}

R = { u, t1, t2, r, > }

where k1 = u_u >_>min(u, >) = u and k2 = u_u >_>min(u, >) = u.

3.1.2 System behaviour: PEPA Labelled Transition System

A Labelled Transition System (LTS) is an abstraction of the behaviour of a system and consists of the graph of its possible states. When an reaction happens in the system, it produces a transition from a state to another and such transitions are represented in the LTS of the system.

As we have seen before, the rate is inversely proportional to the duration of the action thus if an action have a small duration then it has a large rate because in a time period it happen many times. Let us suppose that we have one action with rate 5 and another action with rate 10. Then the first action has duration 1/5 and the second one has duration 1/10, thus the second action will happen more times than the first one in the same period. In PEPA, when two activities may occur in a choice composition, a race condition is considered, namely the activity with greater rate occurs before the activity with lower rate. In other words the activities with higher rate have a greater probability to be executed because they have higher priority than the activities with lower rate.

(31)

Another important concept for the PEPA system is the definition of apparent rate. The apparent rate of a component P with respect to an action α, is the total capacity of the component P to carry out α.

Definition 3.1.2. The apparent rate of an action of type α in a component P , denoted rα(P ), is the sum of the rates of all the activities of type α in

Act(P ). 1. rα((β, r).P ) = ( r if β = α 0 if β 6= α 2. rα(P + Q) = rα(P ) + rα(Q) 3. rα(P/L) = ( rα(P ) if α /∈ L 0 if α ∈ L 4. rα(P BC_L Q) = ( min(rα(P ), rα(Q)) if α ∈ L rα(P ) + rα(Q) if α /∈ L

When there is cooperation between two components, one of them may be passive and the other one active with respect an action in the cooper-ation set. Then the apparent rate is completely determined by the active component.

The set Act(P ) is the multiset of the current activities of P and we use these parantesis {| · · · |} to list the activities. We use the following abbreviation for the set containing all the current activities of component P different from the activities in L

Act_\L(P ) = {|(α, r) ∈ Act(P )|α /∈ L|}

and the next abbreviation for the set containing all the common current activities between the component P and the set L

Act∩L(P ) = {|(α, r) ∈ Act(P )|α ∈ L|}

The set Act is defined for each operator in this way: 1. Act((α, r).P ) = {|(α, r)|}

2. Act(P + Q) = Act(P ) ] Act(Q)

3. Act(P/L) = Act_\L(P ) ] {|(τ, r) ∈ Act∩L(P )|}

4. Act(P BC_L Q) = Act\L(P ) ] Act\L(Q)]

{|(α, r)|α ∈ L, ∃(α, r1) ∈ Act∩L(P ), ∃(α, r2) ∈ Act∩L(Q),

r = r1

rα(P )

r2

(32)

PEPA provides one relation, the transition relation, used to represent the behaviour of a process. It is defined as:

t

−

→⊆ S × S

where S is set of states of the process and t belongs to the set of transition labels T (the set of labels of actions).

The transition relation in PEPA satisfies the following rules: Prefix (α, r).E−−−→ E(α,r) Choice F −−−→ F(α,r) 0 E + F −−−→ E0(α,r) E−−−→ E0(α,r) E + F −−−→ F 0(α,r) Cooperation E−−−→ E(α,r) 0 E BC L F (α,r) −−−→ E 0BC_L F (α /∈ L) F (α,r) −−−→ F0 E BC L F (α,r) −−−→ EBC L F 0 (α /∈ L) E−−−→ E(α,r1) 0 _F (α,r2) −−−→ F0 E BC L F (α,R) −−−→ EBC_L F0 (α ∈ L) where R = r1 rα(E) r2 rα(F ) min(rα(E), rα(F )) Hiding E −−−→ E0(α,r) E/L−−−→ E0/L(α,r) (α /∈ L) E (α,r) −−−→ E0 E/L−−−→ E0/L(τ,r) (α ∈ L) Constant E−−−→ E0(α,r) A−−−→ E0(α,r) (Adef= E)

When two components cooperate on the same activity with different rates, the system executes that activity with rate R.

Now we give the definition of LTS for a PEPA system.

Definition 3.1.3. Let P = hC, Act, R, P i be a PEPA system, a Labelled Transition System LTS for P is a triple (C, Act, {−−−→ |(α, r) ∈ Act})(α,r) where the transition relation −−−→ is given by the rules above.(α,r)

(33)

3.1.3 PEPA properties

Now we present some interesting properties of PEPA systems: isomorphism and strong bisimulation.

We consider only well-defined system then we have at most one action with the same label in each sequential term and each component describes the behaviour of a single process.

Isomorphism in PEPA

Two components, or systems, are isomorphic if they have the same structure in their derivation graph and perform exactly the same activities.

First we define component isomorphism and then we will define when two processes are isomorphic.

Definition 3.1.4. Let P be a system, then ds(P) denotes its derivative set. The derivative set is the set of components which contains all the reachable states of the system.

Definition 3.1.5. A function, F : ds(P ) → ds(Q), is a component iso-morphism between P and Q, if F is an injective function, and for any component P0, Act(P0) = Act(F (P0)), and for all a ∈ Act, the set of a-derivaties of F (P0) is the same as the set of F -images of the a-derivaties of P0, i.e.

{Q0| F (P0)−→ Qa 0} = {F (P00) | P0 a−→ P00}

The derivative graph is the visualization of the derivative set using LTS. Definition 3.1.6. Two components, P and Q, are isomorphic, de-noted P = Q, if there exists a component isomorphism F between them such that D(F (P )) = D(Q), where D denotes the derivative graph.

In PEPA we have the following equational laws for isomorphic compo-nents: Choice: (1) P + Q = Q + P (commutativity of +) (2) P + (Q + R) = (P + Q) + R (associativity of +) Hiding: (1) (P + Q)/L = P/L + Q/L( distributivity of /) (2) ((α, r).P )/L = ( (τ, r).P/L α ∈ L (α, r).P/L α /∈ L ; (3) (P/L)/K = P/(L ∪ K);

(34)

(4) P/L = P if L ∩ A(P ) = ∅ Cooperation (1) P BC L Q = QBCL P (commutativity of BCL ) (2) P BC L (QBCL R) = (P BCL Q)BCL R (associativity of BCL ) (3) (P BC_L Q)/(K ∪M ) = ((P/K)BC_L (Q/K))/M where K ∩M = K ∩L = ∅ (4) P BC K Q = P BCL Q if K ∩ (A(P ) ∪ A(Q)) = L (5) (P BC_L Q)BC_K R = ( P BC L (QBCK R) ifA(R)) ∩ (L\K) = ∅ ∧ A(P )) ∩ (K\L) = ∅ QBC L (P BCK R) if A(R)) ∩ (L\K) = ∅ ∧ A(Q)) ∩ (K\L) = ∅

In the case in which L is an emptyset, then the operation of cooperation is also called parallel.

Parallel (1) P k Q = Q k P (commutativity of k) (2) P k (Q k R) = P k Q k R = (P k Q) k R (3) (P k Q)/K = P/K k Q/K Constant If Adef= P then A = P .

In PEPA there exists also a less strict definition of isomorphism which ignores τ activities and it is called weak isomorphism. We will give no formal definition of it but we will explain the concept. The idea of weak isomorphism is that for a component which carries out several consecutive τ type activities we can find an equivalent compact form, which has the same visible behaviour but a single τ activity of longer duration [5].

Bisimulation in PEPA

Two components, or systems, are strongly bisimilar if they are able to per-form the same actions with the same rates resulting in derivatives that are themselves bisimilar.

First we define when a binary relation over the components is a strong bisimulation and then we define when two processes are strongly bisimilar. Definition 3.1.7. A binary relation, R ⊆ C × C, over components is a strong bisimulation if (P, Q) ∈ R implies, for all α ∈ A,

i) rα(P ) = rα(Q);

(35)

ii) whenever P −→ Pa 0 then, ∃Q0, Q−→ Qa 0, and (P0, Q0) ∈ R; iii) whenever Q−→ Qa 0 then ∃P0, P −→ Pa 0, and (P0, Q0) ∈ R.

As we can see from the definition, if R is a strong bisimulation, then also its reverse R−1 is a strong bisimulation. This conditions are also transitive and preserved by union.

Definition 3.1.8. P and Q are strongly bisimilar, written P ∼ Q, if (P, Q) ∈ R for some strong bisimulation R. Strong bisimilarity is defined as

∼= ∪{R : R is a strong bisimulation}

As for isomorphism, also bisimilarity have a weak form. The idea of weak bisimulation is that two components are weakly bisimilar if any action of one of them is matched by the other, up to the inclusion of additional τ actions [5].

Proposition 3.1.1. If F is a component isomorphism then for any P , P ∼ F (P ).

From the previous proposition we can say that isomorphism between components is a stronger relation between components than strong bisimi-larity.

Now we show an example of simple system modelled with PEPA. Example 3.1.2. In the example 3.1.1 we have fundamentally two different type of processes, the first one is the User process and the second one is the Computer Process, thus we have to built one LTS which represents the three process involved in the system (two users and one computer).

Before showing the LTS, we do some transformation to make more un-derstandable the resulting graph.

User1 1 def= (use, u).User1 2; User1 2 def=(think1, t1).User1 1;

User2 1 def= (use, u).User2 2; User2 2 def= (think2, t2).User2 1; Computer 1 def= (use, >).Computer 2; Computer 2def= (reset, r).Computer 1; Sys def= (User1 1 k User2 1)_BC

use Computer 1;

where Sys1def= (U ser1 1 k U ser2 1)_BC

useComputer 1,

Sys2def= (U ser1 2 k U ser2 1)_BC

useComputer 2,

(36)

useComputer 2,

useComputer 1,

useComputer 2 and

useComputer 1.

PEPA can be very useful to study biosystems, but it does not represent all the features of biological networks, for this reason it has been defined a new process algebra called Bio-PEPA which will be presented in the next section.

3.2 Bio-PEPA

Bio-PEPA is a stochastic process algebra based on PEPA, which has been proposed for modelling and analysing biochemical networks. Differently from PEPA it allows one to explicitly represent stoichiometric coefficients and the roles of species in reactions and it supports also the definition of general kinetic laws.

In a Bio-PEPA model, components represent chemical species and ac-tions represent reacac-tions. The role of the species involved in the reaction is determined by the type of reaction. The effect of a reaction occurrence is to decrease the amount of reactants and to increase the amount of products according to the stoichiometry.

Similarly to PEPA, Bio-PEPA provides a set of combinators which are used to describe the concurrent reactions of the components of the sys-tem. In Bio-PEPA we have reaction that correspond in PEPA to action or activity.

Bio-PEPA permits different analysis technique: numerical methods based on differential equations and stochastic analysis via stochastic simulation (Gillespie’s algorithm [18]) or via numerical evaluation of CTMC2. Depend-ing on the type of analysis, the parameter representDepend-ing the amount of the component may be interpreted as the number of molecules or as discrete levels of concentration. The first interpretation is used for the definition of a transition system and the second one is used for the derivation of a CTMC.

In Bio-PEPA every state of the system is like a container, called com-partment, that represents the location of the various species. Every reaction is an action that can increase, decrease or modify in some ways the number of the molecules inside the container. Moreover, there are some operators

2_{CTMC or continuous-time Markov chain is a stochastic process with continuous time}

and Markov property. It is a mathematical model which takes values in some finite or countable set and for which the time spent in each state takes non-negative real values and has an exponential distribution.

(37)

which can put in relation different compartment. In this way we can de-scribe the behaviour of biochemical systems and we can study and simulate possible evolutions of them.

Before giving the formal definition of a Bio-PEPA system, we have to state some assumptions:

• compartments are static;

• reactions are irreversible reactions;

where static means that the compartments are containers, which are not involved in the reactions. This first assumption simplifies the language and, nevertheless, permits to represent most features of biochemical network. The second assumption is not restrictive, in fact we can represent a reversible reaction as two different irreversible reactions.

We give a brief panoramic about operators of Bio-PEPA.

Prefix: (α, k) op P is the basic term of Bio-PEPA. It means that there is a reaction with label α and stoichiometric coefficient k that, when it happens, it modifies the process P in some ways, such as increasing, decreasing or keeping equal the level of P depending on op.

Choice: P + Q means that the system can behave in two different ways, but only one can happen, as in PEPA. For example if P happens, then Q is discharged.

Constant: C def= P represents a component C whose meaning is given by a defining equation. With this notation we can assign names to different patterns and give recursive definitions.

Cooperation: P BC

L Q denotes cooperation between P and Q over the cooperation set

L. The cooperants are forced to synchronise for every action inside L. If an action is not in the cooperation set, then either P or Q can evolve but not both. If L is empty, then we have that the two cooperants evolve in parallel.

In a Bio-PEPA sustem we define the species component, describing the behaviour of each species, and the model component, describing the overall system and the reaction among components.

3.2.1 System representation: Syntax of Bio-PEPA

The syntax of Bio-PEPA components is the following:

S ::= (α, κ) op S | S + S | C with op =↑ | ↓ | ⊕ | | . (3.2)

(38)

where S is the sequential component and P is the model component. In the prefix term (α, κ) op S of (3.2), the parameter κ is the stoichiometric coefficient of the species in the reaction α, op is the prefix combinator and it represents the role of S in the reaction. Specifically, ↓ indicates a reactant, ↑ a product, ⊕ an activator, an inhibitor and a generic modifier. For reactants and products the level of the species that participate in a reaction depends on the stoichiometric coefficients. For a reactant this parameter specifies the minimum level of the species necessary to start the reaction. The operator “+”expresses a choice between possible actions. The constant C is defined by an equation C def= S. The process P BC

L Q denotes

synchro-nisation between components; the set L determines the activities on which the operands are forced to synchronise. In Bio-PEPA we can set L = ∗ and it respesents the biggest set of common action between the cooperands, namely by using this term the cooperands are forced to synchronise for each possible commom action between them. The parameter x in S(x) represents the amount of S, if x ∈ R+ then it represents the initial concentration of S, else, if x ∈ N, it represents the initial number of molecules of S. In this text we always use the second meaning of S(x).

Bio-PEPA supports a modelling style in terms of concentration levels, where each species is represented by a discrete concentration level from 0, null concentration, to Mi, maximum concentration of the species i. Each

level represents an interval of concentration and the granularity of the system is expressed in terms of the step size H. We assume that H has the same size for all the species because, from the law of conservation of mass, the concentration of a reactant (mass consumed) must be balanced with the concentration of a product (mass created). It is also possible to define different step sizes H for different species, but this doesn’t influence the value of the concentration. The assumption of having a maximum concentration Mi for species i makes it possible to have a finite number of states. Hence

a Bio-PEPA model will have Ni+ 1 levels for a given species i, where Ni =

dM_i/He and for each state, the concentration level li is given by the ratio

between the concentration xi and the step size H, li= dxi/He.

Definition 3.2.1. A Bio-PEPA system P is a 6-tuple hV, N , K, FR, Comp, P i,

where:

• V is the set of compartments;

• N is the set of quantities describing each species; • K is the set of parameter definitions;

• FR is the set of functional rate definitions;

• Comp is the set of definition of sequential compartments; • P is the model component describing the system.

(39)

Now we introduce the main notions that are the basis of Bio-PEPA systems. The definitions are about compartments, elements of a species and functional rates.

The first definition is about compartments. The intuition is that a com-partment is a container that collects the same type of molecules.

Definition 3.2.2. Each compartment is described as “V: v unit”, where V is the compartment name, “v ”is a positive real number expressing the compartment size and the (optional) ”unit” denotes the unit associated with the compartments size. The set of compartments is denoted by V.

Each compartment is defined by a name and the information about the compartment.

Definition 3.2.3. For each species we define the elements “C: H = value H, N = value N, M = value M, V =value V, unit = value u”, where:

• C is the species compartment name, • H is the step size and the value H ∈ R+_,

• N is the minimum level and value N ∈ R+_{∪ {},}

• M is the maximum concentration and value M ∈ R+_{∪ {},}

• V is the name of the enclosing compartment and value V ∈ name(V)∪ {},

• value u represent the unit for concentration,

where is the empty string. The set of all the elements described above is denoted by N .

Depending on the analysis that we want to apply to the system, some elements become necessary. If we are interested in the CTMC with levels or model checking in PRISM3, we need the compartment name and the value for H and N (or, equivalently, H and M ). For stochastic simulation we only need the compartment name. In a well-defined Bio-PEPA system, the value H must be greater than 0 and it belongs to R+, the value N must be equal or greater than 1 and it belongs to N .

In order to describe the kinetics of a biological system, we can associate to each action αi a functional rate fαi. It is possible to define a functional

rate by means of simple mathematical expressions, using also constant com-ponents and parameters.

3

PRISM is a probabilistic model checker, a tool for formally modelling and analysing systems that exhibit random or probabilistic behaviour.