
4.3 Getting an analyzable model

4.3.1 Example of analyzable SPN model

This last part of the chapter illustrates, through a simple example, how the proposed layered structure and the library of predefined GSPN models can be used to produce an analyzable SPN model. For this purpose, we follow the guidelines given in this section.

The automation system under analysis is a cyclic application that activates two concurrent processes: each process reads a sample input from a plant, elaborates it to produce the future state, saves the new state in memory and produces the new output for the plant. The two processes are executed completely in parallel: each process uses its own automation function to execute a copy of the future state in a memory unit.

To increase the dependability of the automation system, a fault-tolerance strategy has been devised in which the error processing step consists of error detection, error diagnosis and error recovery. The error detection step is carried out by a standard watchdog mechanism, while the error diagnosis and recovery steps are implemented by a recovery mechanism. The watchdog, once initialized by the automation system, which sets its timer to a specific value, starts its countdown to expiration. Periodically, it receives signals of life from the automation system that reset its timer. If the watchdog does not receive any signal from the automation system before the timer reaches zero, it expires and sends a notification message to the recovery mechanism.

The recovery mechanism is a software mechanism activated by the reception of “exceptions”. When it receives a notification message from the watchdog, it terminates the watchdog and checks the status of the automation system. If no error is present in the system, the watchdog expiration is a false alarm caused by a delay in the automation system or in the communication, and the watchdog is simply reinitialized. If instead an error is present in the system, a recovery action is carried out that consists of repairing the faulty components, removing the error in the affected automation functions and, if the automation system has not failed yet, reinitializing the processes that used the erroneous functions.
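The interplay between watchdog and recovery mechanism described above can be sketched in a few lines of Python. This is only an illustrative sketch: the class and function names, the tick-based timer and the `system_has_error` flag are assumptions introduced here and do not come from the design specification of the system.

```python
from dataclasses import dataclass, field


@dataclass
class Watchdog:
    """Count-down timer that expires unless it is reset by a signal of life."""
    timeout: int                      # initial timer value, in ticks (assumed unit)
    remaining: int = field(init=False)

    def __post_init__(self) -> None:
        self.remaining = self.timeout

    def signal_of_life(self) -> None:
        # A signal of life from the automation system resets the timer.
        self.remaining = self.timeout

    def tick(self) -> bool:
        """Advance the timer by one tick; return True when it reaches zero."""
        self.remaining -= 1
        return self.remaining <= 0


def handle_notification(watchdog: Watchdog, system_has_error: bool) -> str:
    """Reaction of the recovery mechanism to a watchdog expiration."""
    if not system_has_error:
        # False alarm: the watchdog is simply reinitialized.
        watchdog.signal_of_life()
        return "false alarm"
    # Repair, error removal and process reinitialization are outside this sketch.
    return "recovery"
```

With a timeout of three ticks, the watchdog expires on the third consecutive tick without a signal of life, and a notification received while no error is present is classified as a false alarm.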

Let us suppose that an application-specific CD scheme has been produced for this automation system by following the customization process specified in Section 3.2 of Chapter 3. The measures to be computed that are indicated in the customized CD scheme include the availability of the automation functions f, used to execute a copy of the future state of the plants, and the probability of failure of the automation system. The memory units mem are parts of automation components, and they can be affected by physical faults that may provoke errors in the automation functions that use the basic operations offered by the memory units. Communication units are instead not affected by faults.

Guideline 1 From the customized CD scheme and from the description of the automation system given above, we can identify the following SPN component models:

• AC_COM, GSPN model depicted in Figure 4.9 (A), representing a communication unit. We have reused the GSPN component model of a resource as originally defined in [28], since the communication unit is assumed not to be affected by faults. The communication unit performs a basic “transmission” operation.

• AC_MEM, SWN model depicted in Figure 4.9 (B), representing the memory units. We have reused and colored the GSPN component model of a resource depicted in Figure 4.3, in which only a “copy” operation is possible. Since a recovery action can be carried out, the SWN model includes “reset” transitions that bring the model from the faulty states back to its initial state.

• FT_PHYSICAL, SWN model depicted in Figure 4.9 (C), modeling the behavior of a physical fault affecting the memory units. The model is a colored version of the predefined physical fault GSPN model described in Section 4.2, to which a transition that brings the fault model back to its initial state has been added.

• ACF_SYNCH, GSPN model depicted in Figure 4.9 (D), representing the synchronous communication function used by the automation system to initialize the watchdog. It is a simplified model of communication in which only the basic transmission operation is requested and the transmitted packet is assumed to be a few bytes long. The execution time of the setup and reset operations of the communication channel is assumed negligible with respect to the time required to perform the transmission.

• ACF_ASYNCH, GSPN model depicted in Figure 4.9 (E), representing the asynchronous communication function used by the automation system to send signals of life to the watchdog. The model is a simplified version of the model of Figure 4.4(S), presented in Section 4.2, representing an asynchronous automation communication function. As in the case of model ACF_SYNCH, the only relevant basic operation is the transmission over the channel, the setup/reset operations are not explicitly modeled and the transmitted packet is a few bytes long.

• AF_MEM, SWN model depicted in Figure 4.9 (F), modeling the behavior of the automation functions f. We have used the skeleton of the service model of Figure 4.3, adding “reset” transitions that bring the model from the erroneous states back to its initial state.

CHAPTER 4. USE OF STOCHASTIC PETRI NETS IN THE DEPAUDE METHODOLOGY 91

Figure 4.9: Example of analyzable SPN model: the set of SPN components

• ER_MEM, SWN model depicted in Figure 4.9 (G), representing the errors affecting the automation functions f, due to a fault that occurred in the memory units. We have modified the predefined memory error GSPN model described in Section 4.2 so as not to consider error propagation, to represent an effective error latency and to reset the error model.

• REC_MECH, GSPN model depicted in Figure 4.9 (H), representing the behavior of the recovery mechanism. This model has been produced from the high-level design specification of the recovery mechanism.

• AS, SWN model depicted in Figure 4.9 (I), representing the behavior of the automation system. It has been constructed from the high-level design specification of the automation system. Before starting its own activities, the automation system initializes and configures the watchdog. It then starts two parallel processes and waits for their termination: when both processes have terminated, the automation system sends a signal of life to the watchdog and restarts the execution of the two processes. The model is a refinement of the process model depicted in Figure 4.3.

• FAIL, SWN model depicted in Figure 4.9 (J), modeling a failure of the automation system. A failure occurs if the error is not detected and recovered in due time, and it is considered a halting failure. The model is the colored version of the predefined failure mode GSPN model shown in Figure 4.7, in which no repair actions are modeled.

• WD, GSPN model depicted in Figure 4.9 (K), representing the behavior of the watchdog mechanism. The model is a modified version of the watchdog model derived automatically from the translation of a StateChart, which will be presented in the second part, in Section 6.4 of Chapter 6. The modifications concern the transformation of the type of interface (from place interface to transition interface). Moreover, with respect to the original version, a number of non-observable immediate transitions have been eliminated.

As shown in Figure 4.9, the SPN models identified and described above have been inserted in the three-layered structure. Some of them have been placed according to the classes of the application-specific scheme that they represent. In particular, the models AC_COM, AC_MEM and FT_PHYSICAL, representing communication and memory units of automation components and faults, respectively, are considered resource models and hence are placed at the lowest level of the structure. The models ACF_SYNCH, ACF_ASYNCH, AF_MEM and ER_MEM, representing automation (communication) functions and errors, are placed at service level. The models AS and FAIL, representing the automation system and the halting failure mode, respectively, are placed at process level.

The dependability mechanisms have been inserted in the structure taking into account their design specification: the watchdog is a process, normally running on a different node from the controlled application, while the recovery mechanism can be seen as a service that interacts with processes, functions and resources. The models WD and REC_MECH have therefore been placed at process level and at service level, respectively.
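The placement of the eleven component models can be summarized as a small lookup table. The sketch below simply restates the layer and the formalism given in the text for each model; the dictionary layout and the helper name `models_at` are choices made here for illustration.

```python
# Layer and formalism of each component model, as listed under Guideline 1.
MODELS = {
    "AC_COM":      ("resource", "GSPN"),
    "AC_MEM":      ("resource", "SWN"),
    "FT_PHYSICAL": ("resource", "SWN"),
    "ACF_SYNCH":   ("service",  "GSPN"),
    "ACF_ASYNCH":  ("service",  "GSPN"),
    "AF_MEM":      ("service",  "SWN"),
    "ER_MEM":      ("service",  "SWN"),
    "REC_MECH":    ("service",  "GSPN"),
    "AS":          ("process",  "SWN"),
    "FAIL":        ("process",  "SWN"),
    "WD":          ("process",  "GSPN"),
}


def models_at(layer):
    """Return the component models placed at the given layer, in listing order."""
    return [name for name, (lay, _) in MODELS.items() if lay == layer]
```

For instance, `models_at("process")` lists the three models composed at process level.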

Guideline 2 We first identify the interactions existing between models lying at the same level in order to define the operators for the horizontal composition. The associations affect defined between memory units and physical faults, memory errors and automation functions, and halting failures and automation systems allow us to define three labels: ftmem, to synchronize the memory units model AC_MEM and the physical faults model FT_PHYSICAL; erraf, to synchronize the automation functions model AF_MEM and the memory errors model ER_MEM; and fail, to synchronize the automation system model AS and the halting failure model FAIL. Moreover, new labels are defined to model the interaction between the recovery mechanism and the memory error model (labels detect, noerr) and between the recovery mechanism and the automation functions model (recaf). We can then define the sets of labels for the horizontal compositions: L_res = {ftmem}, L_srv = {erraf}, L'_srv = {detect, noerr, recaf} and L_pr = {fail}.

The resource layer model RES, the service layer model SERV and the process layer model PROC are then obtained by applying the composition operator || over transition labels:

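\[
RES = \left( AC_{COM} \;||_{\emptyset,\emptyset}\; AC_{MEM} \right) \;||_{L_{res},\emptyset}\; FT_{PHYSICAL} \tag{4.1}
\]

\[
SERV = \left\{ \left[ \left( ACF_{ASYNCH} \;||_{\emptyset,\emptyset}\; ACF_{SYNCH} \right) \;||_{\emptyset,\emptyset}\; AF_{MEM} \right] \;||_{L_{srv},\emptyset}\; ER_{MEM} \right\} \;||_{L'_{srv},\emptyset}\; REC_{MECH} \tag{4.2}
\]

\[
PROC = \left( AS \;||_{L_{pr},\emptyset}\; FAIL \right) \;||_{\emptyset,\emptyset}\; WD \tag{4.3}
\]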
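To make the effect of composing over transition labels more concrete, the following sketch implements a simplified transition superposition in Python. It is not the algorithm of the algebra program: arc weights, colors, place labels and multiple labels per transition are ignored, and the place and transition names of the two net fragments are invented for the example. The assumed rule is that transitions carrying a label in the given set are replaced by one merged transition per equally labelled pair, while all other transitions are kept unchanged.

```python
from dataclasses import dataclass
from itertools import product
from typing import FrozenSet, Optional, Tuple


@dataclass(frozen=True)
class Transition:
    name: str
    label: Optional[str]        # None means "no synchronization label"
    pre: FrozenSet[str]         # input places (arc weights omitted)
    post: FrozenSet[str]        # output places


@dataclass(frozen=True)
class Net:
    places: FrozenSet[str]
    transitions: Tuple[Transition, ...]


def compose(n1: Net, n2: Net, labels) -> Net:
    """Superpose the transitions of n1 and n2 whose label is in `labels`."""
    if n1.places & n2.places:
        raise ValueError("place names of the two nets must be disjoint")
    # Transitions that do not synchronize are kept unchanged.
    kept = [t for net in (n1, n2) for t in net.transitions
            if t.label not in labels]
    # Each pair of equally labelled transitions becomes one merged transition.
    merged = [
        Transition(f"{a.name}+{b.name}", a.label, a.pre | b.pre, a.post | b.post)
        for a, b in product(n1.transitions, n2.transitions)
        if a.label in labels and a.label == b.label
    ]
    return Net(n1.places | n2.places, tuple(kept + merged))


# Hypothetical fragments of AC_MEM and FT_PHYSICAL (names invented here).
ac_mem = Net(
    frozenset({"mem_ok", "mem_faulty"}),
    (Transition("copy", None, frozenset({"mem_ok"}), frozenset({"mem_ok"})),
     Transition("mem_fault", "ftmem",
                frozenset({"mem_ok"}), frozenset({"mem_faulty"}))),
)
ft_physical = Net(
    frozenset({"ft_idle", "ft_active"}),
    (Transition("ft_occ", "ftmem",
                frozenset({"ft_idle"}), frozenset({"ft_active"})),),
)
res_fragment = compose(ac_mem, ft_physical, {"ftmem"})
```

In `res_fragment` the two transitions labelled ftmem are fused into a single transition that needs a token in both `mem_ok` and `ft_idle`, whereas composing with an empty label set just places the two nets side by side.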

Some of the labels used in the vertical compositions are identified by the associations perform defined between the automation components (communication and memory units) and the automation (communication) functions; in particular, S_transf and E_transf are used to synchronize the automation communication function models ACF_ASYNCH and ACF_SYNCH with the communication unit model AC_COM, while S_mem and E_mem are used to synchronize the automation functions model AF_MEM and the memory units model AC_MEM. From the associations effect defined between the physical faults model and the memory error model, and between the memory error model and the halting failure model, we have derived the labels fterr and errfail, respectively.

New labels are added to model: 1) the interactions between the recovery mechanism model and the component models lying at resource level (recmem, noft) and at process level (S_notify, E_notify, S_e_termination, E_e_termination, recas, S_e_start, E_e_start); 2) the interactions between the automation system model and the automation (communication) function models (S_init, E_init, S_af, E_af, S_Iamalive, E_Iamalive); and, finally, 3) the interactions between the watchdog model and the automation communication function models (S_e_start, E_e_start, S_e_heartbeat, E_e_heartbeat).

We can then define the two sets of labels used in the vertical composition:

L_sr = {S_transf, E_transf, S_mem, E_mem, fterr, recmem, noft},

L_psr = {S_init, E_init, S_e_start, E_e_start, S_af, E_af, S_Iamalive, E_Iamalive, S_e_heartbeat, E_e_heartbeat, errfail, S_notify, E_notify, S_e_termination, E_e_termination, recas}.

The final SPN model PSR is then defined as follows:

\[
PSR = PROC \;||_{L_{psr},\emptyset}\; \left( SERV \;||_{L_{sr},\emptyset}\; RES \right) \tag{4.4}
\]

Note that in the compositions we have not used further “control” SWN models to establish the relationships between objects. Indeed, this is a simple case in which the relationships at object level are captured by: 1) the definition of a common basic color class C for the places of the SWN models, defined as the union of two static subclasses C1 = {c1} and C2 = {c2}, each containing a single color; and 2) the use of the same name for the variables in the expressions of the arcs related to the synchronized transitions, in order to unify their values. For example, a process p_i activated by the automation system will use the automation function f_j at service level which, when activated, performs a copy operation in a memory unit mem_k, where i = j = k (i, j, k = 1, 2).

Guideline 3 The initial marking of each component model has already been defined by considering the following assumptions: 1) there is one communication unit; 2) memory unit m1 is more worn out than memory unit m2, hence we focus only on the physical faults affecting m1. The GSPN component models are characterized by their corresponding initial places marked with one token. The SWN models AC_MEM and AF_MEM are characterized by an initial marking parameter M0, defined as the set of all the colors of the basic color class C, while the SWN models FT_PHYSICAL, ER_MEM and FAIL have an initial marking parameter M1, defined as the set containing the color c1. Color c1 identifies the memory unit affected by the fault, the automation function affected by the error and the process that may be delayed and cause the system failure. The initial marking of the composed SPN model is then defined as the union of the initial markings of its component models.
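The marking parameters can be written down as multisets of colors. The sketch below is a minimal illustration that assumes each component has a single initial place, here called `init` (a hypothetical name), and that place names become disjoint once they are prefixed with the component name; only the components whose initial marking is stated in the text are included, with AC_COM standing for the GSPN case.

```python
from collections import Counter

# Basic color class C as the union of two static subclasses.
C1, C2 = {"c1"}, {"c2"}
C = C1 | C2

M0 = Counter(C)    # one token for every color of C
M1 = Counter(C1)   # a single token of color c1


def initial_marking(components):
    """Union of the component markings; qualified place names are disjoint."""
    marking = {}
    for name, (place, tokens) in components.items():
        marking[f"{name}.{place}"] = Counter(tokens)
    return marking


components = {
    "AC_MEM":      ("init", M0),
    "AF_MEM":      ("init", M0),
    "FT_PHYSICAL": ("init", M1),
    "ER_MEM":      ("init", M1),
    "FAIL":        ("init", M1),
    # GSPN component: one uncolored token in its initial place.
    "AC_COM":      ("init", Counter({"token": 1})),
}
m0_composed = initial_marking(components)
```

The composed marking keeps the tokens of each component in its own place, so restricting faults to memory unit m1 amounts to putting only color c1 in the initial places of FT_PHYSICAL, ER_MEM and FAIL.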

Guideline 4 From the application-specific CD scheme we can identify the following input parameters: 1) error_latency_rate, a new parameter defined in the Memory Error customized class, which characterizes the length of time between the presence of an error in the system and the occurrence of the related failure; an exponential probability distribution function (PdF) with parameter λ = 0.1 ms⁻¹ has been assigned to this attribute. 2) fault_rate and 3) duration_rate, two parameters of the Physical Faults customized class, representing the rate of the fault occurrence and its rate of persistence, respectively; an exponential PdF with rate λ = 0.001 ms⁻¹ has been assigned to these attributes. In order to get an analyzable SPN model it is necessary to assign a value to the rate parameters of other timed transitions, such as: timeout, the timed transition of the watchdog representing the timer; TRANS and COPY, representing the execution of the basic operations performed by the communication unit and by the memory units, respectively; begin, representing the initial watchdog configuration activity executed by the automation system. The rate parameters assigned to the timed transitions of the SPN model, together with their values, are summarized in Table 4.1.

Rate parameter        Transition   Value (ms⁻¹)
comm_rate             TRANS        0.1
copy_rate             COPY         0.1
count_rate            timeout      0.05
fault_rate            ft_occ       0.001
duration_rate         ft_end       0.001
error_latency_rate    err_fail     0.1
config_rate           begin        1

Table 4.1: Rate parameters of the SPN model.

Concerning the metrics to be evaluated, these are specified by three attributes: availability, defined in the Automation Function customized class; PF, defined in the Halting Failure customized class, specifying the probability of failure; and false alarm, defined in the Watchdog customized class, which suggests the computation of the frequency of false alarms. For the availability A_f₁ of the automation function that can be affected by error and for the false alarms, mean values are required, so they can be computed in steady state as the probability that place error is empty, i.e., Pr{M[error] = 0}, and as the throughput of the transition delay, respectively. For the probability of failure, a probability distribution function must instead be computed; this metric is therefore evaluated in transient state and is defined as the probability that place failure becomes marked within time t, i.e., Pr{M[failure](x) = 1, x ≤ t}.
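The following sketch shows how the quantities above can be read numerically. It assumes that the values of Table 4.1 are rates expressed per millisecond, so that the mean firing delay of an exponential transition is the reciprocal of its rate, and it computes the two steady-state measures from a stationary distribution over markings. The function names and the two-marking distribution are hypothetical and serve only to exercise the definitions; they are not results of the model.

```python
# Rate parameters from Table 4.1, assumed here to be rates per millisecond.
RATES = {
    "comm_rate": 0.1,
    "copy_rate": 0.1,
    "count_rate": 0.05,
    "fault_rate": 0.001,
    "duration_rate": 0.001,
    "error_latency_rate": 0.1,
    "config_rate": 1.0,
}


def mean_delay_ms(rate):
    """Mean of an exponentially distributed firing delay with the given rate."""
    return 1.0 / rate


def prob_place_empty(distribution, place):
    """Steady-state probability that `place` holds no token.

    `distribution` is a list of (marking, probability) pairs, where a
    marking maps place names to token counts.
    """
    return sum(p for marking, p in distribution if marking.get(place, 0) == 0)


def throughput(distribution, rate, is_enabled):
    """Steady-state throughput of a timed transition with a constant rate."""
    return sum(p * rate for marking, p in distribution if is_enabled(marking))


# Hypothetical two-marking distribution, only to exercise the functions.
toy = [({"error": 0}, 0.75), ({"error": 1}, 0.25)]
availability = prob_place_empty(toy, "error")
```

Under the stated assumption, a count_rate of 0.05 corresponds to a mean watchdog timeout of 20 ms, and the availability is the total probability of the markings in which place error is empty.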

Guideline 5 We have used the GreatSPN tool [69] to construct the SPN component models depicted in Figure 4.9 and the program algebra [7] to carry out their composition by using the horizontal operators (4.1), (4.2), (4.3) and the vertical operator (4.4). The reachability graph of the final SWN model contains 115 tangible markings, 778 vanishing markings and 4 dead markings (corresponding to the failure of the automation system). The probability of system failure PF can be computed directly on the final SWN: since it is a transient state measure, we do not need to modify the model. The availability A_f₁ and the frequency of false alarms are instead steady state measures, so it is necessary to add “reset” transitions out of the deadlock markings.

We first present the results obtained from the computation of PF, using the tool MultiSolve [81]. We have computed PF over the time interval [0.1 ms, 2700 ms] by assigning different values to the rate parameter count_rate and by setting the other parameters of the SWN model to the values shown in Table 4.1. Figure 4.10 shows the three curves of the probability of system failure. As expected, the rate of expiration assigned to the timer of the watchdog has a great impact on the probability of system failure: the greater the rate, the lower the probability of a system failure, since the recovery mechanism, more frequently activated by the reception of the notification of expiration, is able to carry out a timely recovery action in case of error before the system fails.
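The qualitative effect of count_rate can be illustrated with a back-of-the-envelope race between two exponential delays. The sketch below is a deliberate simplification and not the SWN model analyzed here: it assumes that, once an error is present, detection happens after a single exponential delay with rate count_rate and failure after an exponential delay with the error latency rate of Table 4.1, and it ignores communication, recovery time and everything else. The function name is introduced for this example only.

```python
def failure_before_detection(latency_rate, detection_rate):
    """Probability that an exponential latency expires before detection.

    For two independent exponential delays with rates a and b, the first
    one fires before the second with probability a / (a + b).
    """
    return latency_rate / (latency_rate + detection_rate)


ERROR_LATENCY_RATE = 0.1  # from Table 4.1, assumed to be per millisecond

for count_rate in (0.05, 0.55, 1.05):
    p = failure_before_detection(ERROR_LATENCY_RATE, count_rate)
    print(f"count_rate = {count_rate}: P(failure before detection) = {p:.3f}")
```

In this toy race the probability that an error turns into a failure decreases as count_rate grows, which is consistent with the trend discussed above, although the values it prints are not those of the full model.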

Figure 4.10: Probability of failure of the automation system. [Plot of the probability of failure (vertical axis, 0 to 0.8) against time in ms (horizontal axis, 0 to 2500), with one curve for each of count_rate = 0.05, 0.55 and 1.05.]

The computation of the metrics in steady state is carried out on the modified ergodic model, whose RG is characterized by 103 tangible markings and 1051 vanishing markings. As for the probability of failure, we have analyzed the model for different values assigned to the rate parameter count_rate, and the results are reported in Table 4.2. The availability of the automation function affected by error and the false alarms increase as the notifications of exception sent to the recovery mechanism become more frequent: when the rate of the watchdog expiration is set to 1.05 ms⁻¹ the availability of the automation function reaches 99.9%, but as a counterpart 52.25% of the notifications of the watchdog are false alarms.

count_rate (ms⁻¹)   A_f₁ (%)   false alarm (%)
0.05                98.0       0.25
0.55                99.8       27.54
1.05                99.9       52.25

Table 4.2: Metrics in steady state
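The trade-off reported in Table 4.2 can be made explicit by looking at the change between consecutive rows. The short sketch below uses only the values of the table; the helper name `marginal_changes` is introduced here for illustration.

```python
# (count_rate, availability A_f1 in %, false alarms in %) copied from Table 4.2.
TABLE_4_2 = [
    (0.05, 98.0, 0.25),
    (0.55, 99.8, 27.54),
    (1.05, 99.9, 52.25),
]


def marginal_changes(rows):
    """Change in availability and false alarms between consecutive rows."""
    changes = []
    for (r0, a0, f0), (r1, a1, f1) in zip(rows, rows[1:]):
        changes.append((r0, r1, round(a1 - a0, 2), round(f1 - f0, 2)))
    return changes


for r0, r1, d_avail, d_false in marginal_changes(TABLE_4_2):
    print(f"count_rate {r0} -> {r1}: availability +{d_avail} points, "
          f"false alarms +{d_false} points")
```

The first increase of count_rate buys 1.8 percentage points of availability, while the second buys only 0.1, and in both steps the share of false alarms grows by roughly 25 to 27 percentage points.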

Part II

From UML behavioral diagrams into analyzable Generalized Stochastic Petri Net models.

This second part of the thesis is devoted to the description of the automatic translation of UML Sequence Diagrams into the SPN formalism and of the role played by the derived models in the context of property validation and quantitative evaluation of the system. Automatic translation techniques are advocated when the UML description of a system is meant for design specification, since they allow both to minimize the number of errors and omissions made during the modeling activities and to reduce the time required by the V&V activities.

We do not make any assumption on the domain of the modeled system; observe, however, that in the case of automation systems additional information, useful for the generation and the analysis of the final SPN models, is provided by the customization of the UML CD scheme for the specific system. Such information concerns the attributes of the customized classes that can be mapped to input parameters of the translated SPN models, the possible values to be assigned to them, the properties to be verified and the measures to be computed.

The standard UML still does not have a formal semantics for UML behavioral diagrams, although there are a number of attempts along this line coordinated by the precise UML working group [39]; as a result, there are some gaps and ambiguities for which we have made a specific choice: in this second part of the thesis, we

From the document: Building Stochastic Petri Net models for the verification of complex software systems (pages 89-107)