
Università di Pisa

School of Engineering

Master's Degree Programme in Robotics and Automation Engineering

Master's Thesis in Control of Uncertain Systems

Unified Framework for Coverage Control

Supervisor: Prof. Mario Innocenti

Candidate: Alberto Mellone


Abstract

This thesis is concerned with the control of teams of autonomous agents cooperatively performing coverage tasks. The control is set in the Descriptor Function Framework, which makes it possible to deal with different scenarios, such as Deployment Problems and Effective or Persistent Coverage, in a unified fashion. The problem of full decentralization of the control law is addressed; as far as dynamic tasks are concerned, the main hurdles are the estimation of the attained coverage over time and a convenient deadlock-escaping strategy. It is shown that if the agents are able to communicate with others within a limited range, task fulfillment is still guaranteed. Inter-agent collision avoidance is integrated with an obstacle avoidance policy that does not require a priori knowledge of the position and shape of the obstacles. Without introducing further complexity, agents only need to detect the obstacles' walls through proximity sensors. As regards the agents' dynamics, the state-of-the-art control laws, originally applied to nonlinear affine-in-control and driftless models, are adapted to a new class of vehicles with drift components, taking into account their dynamic model. Finally, Deployment Problems are framed in the context of differential cooperative games, for which a suboptimal solution is provided in closed form. Throughout the work, the validity of the assertions is proved by means of formal arguments, while their effectiveness is shown in scenario simulations aimed at comparing the performance of different approaches.

Keywords: Coverage problems · Autonomous agents · Descriptor Function · Decentralized control · Obstacle avoidance · Heterogeneous dynamics · Cooperative games

Contents

Introduction
1 On the Coverage Problem
  1.1 Literature Review
  1.2 The Descriptor Function Framework
    1.2.1 Agent Descriptor Function
    1.2.2 Task Descriptor Function
    1.2.3 Task Error Function
    1.2.4 Control
    1.2.5 Deployment Problems - Simulation Results
2 Decentralized Control in the DF Framework
  2.1 State of the Art
    2.1.1 Collision Avoidance Control
    2.1.2 Coverage Control
    2.1.3 Waypoint Guidance Control
  2.2 Computation of the Error Function Gradient
  2.3 Local Switching Policy
  2.4 Attained Coverage Estimation
    2.4.1 Discrete Time Implementation
    2.4.2 Continuous Time Implementation
    2.4.3 Further Improvements
  2.5 Connectivity Preservation
    2.5.1 $\lambda_2$ Estimation
    2.5.2 Control
    2.5.3 Implementation
  2.6 Considerations about the Persistent Coverage
3 Obstacle Avoidance
  3.1 State of the Art
  3.2 General Strategy for Obstacle Avoidance
    3.2.1 Obstacles Detection: an Example
4 Robots with Drift Component
  4.1 First and Second Order Kinematic Models
  4.2 Control
5 Coverage as Cooperative Game
  5.1 Cooperative Games
  5.2 Suboptimal Solution of the Cooperative Game
  5.3 Application to the Coverage Problem
    5.3.1 First Order Kinematic Models
    5.3.2 Second Order Kinematic Models

List of Figures

1.1 Gaussian ADFs Examples
1.2 Gaussian ADF with FoV
1.3 Initial cTDF
1.4 Uniform Deployment of the Swarm
1.5 TDF associated to the Static Coverage Task
1.6 Static Coverage Task
1.7 TDF associated to the Target Assignment Task
1.8 Target Assignment Task
2.1 Normalized error function over time
2.2 Inter-agent distances
2.3 TDF evolution
2.4 Comparison between the error functions $\xi_1$ and $\xi_2$ when following the original or the new switching policy, respectively
2.5 TDF evolution - Case of the new switching policy
2.6 Inter-agent distances - Case of the new switching policy
2.7 Comparison between the error functions $\xi_1$ and $\xi_2$ when the coverage control is computed using the true attained coverage or the estimated one, respectively
2.8 TDF evolution - Case of the estimated coverage approach
2.9 Coverage estimation error of the green agent
2.10 Inter-agent distances - Case of the estimated coverage approach
2.11 Comparison between the error functions $\xi_1$ and $\xi_2$ when the connectivity preserving control is inactive or active, respectively
2.12 Time history of the true $\lambda_2$ (dashed black line) and the agents' estimates (coloured lines)
2.13 Inter-agent distances - Case of connectivity preserving control
2.14 TDF evolution - Case of connectivity preserving control. Green agents are the swarm leaders
2.15 Effective Coverage - TDF estimated by the green agent at different times
2.16 Effective Coverage - True TDF
2.17 Comparison between the error functions $\xi_1$ and $\xi_2$ when the estimation quality is low or high, respectively
2.18 Persistent Coverage - Performance with different gains
2.19 Persistent Coverage - Error function evolution with different gains
2.20 Persistent Coverage - Performance with different $\sigma$ values in the north-east quarter
3.1 Target Assignment with obstacle
3.2 Trajectories comparison
3.3 Persistent Coverage - Unknown obstacles
3.4 Persistent Coverage - Known obstacles
3.5 Persistent Coverage - Moving obstacles scenario
3.6 Effective Coverage - Obstacles Detection
3.7 Effective Coverage - Agents-Obstacles distances
4.1 Uniform Deployment: normalized error index evolution
4.2 Static Coverage: normalized error index evolution
4.3 Target Assignment: normalized error index evolution
4.4 Effective Coverage: Heterogeneous Team
4.5 Persistent Coverage: Heterogeneous Team
5.1 Cooperative Strategy - Target Assignment

Introduction

Nature shows that cooperation is the basis of life's success. Evolution has confirmed that more complex organisms may only arise from the strict interaction between simpler forms of life: the social behaviour of heterogeneous cells results in multicellular organisms. Complex tasks, which a single cell alone is not able to accomplish, are made feasible by clustering. In just the same way, multicellular organisms, whether plants or animals, that display a tendency to aggregate are rewarded with significantly higher survival rates as a result of a proper redistribution of tasks and resources. Both cases can be traced back to an extended concept of society. In [1] the following definition is given:

Society is a complex of forms or processes each of which is living and growing by interaction with the others, the whole being so unified that what takes place in one part affects all the rest. It is a vast tissue of reciprocal activity, differentiated into innumerable systems, some of them quite distinct, others not readily traceable, and all interwoven to such a degree that you see different systems according to the point of view you take.

The definition just mentioned is so general that it complies with multicellular organisms as well as with communities of individuals. It points out that the behaviour of the single is deeply dependent on the group to which it belongs and emphasizes how vital their interaction is. Furthermore, heterogeneity is identified as a building block of a complex society.

The field of Robotics is often inspired by natural processes and their working principles; the control of groups of robots is no exception. While the main hurdles towards the accomplishment of a challenging task are computational abilities and resource limitations, nature's response over millennia has been the development of societies. If the problem can be split into smaller and simpler components, which can be executed in parallel by a multitude of agents, both costs and complexity may be reduced. This form of cooperation is extremely common in groups of animals and is known as emergent behaviour. A large number of examples following this paradigm can be given: ants and bees swarming, fish schooling, sheep and birds flocking. The key feature is relying on team members, with only partial knowledge of the surrounding environment, enforcing simple action protocols that result in a desired global and intelligent behaviour. Besides simplicity, cooperation achieves higher robustness, since single points of failure represented by centralized control and management systems are avoided: if a member of the group is damaged or lost, task execution is barely affected.

When it comes to the control of teams of robots, the main problem is formulating a policy as general as possible in order to comply with team heterogeneity: in practical applications robots may have different sizes, move according to various dynamic laws, and have non-uniform effectiveness in accomplishing their assigned task. The primary objective is dealing with these difficulties in a unified manner, of course keeping an eye on mathematical soundness, complexity issues, and the feasibility of physical implementation.

From the point of view of the global behaviour of the swarm, the most desirable objective is achieving full optimality: the swarm should carry out the task to the best of its ability, with the further requirement of reducing consumption. On the other hand, the complexity of the problem itself, which may involve a wide variety of scenarios and nonlinear robot dynamics, makes the search for globally optimal performance hardly successful, if meaningful at all.

The focus of this work is the description of a unified framework for the control of swarms of heterogeneous agents performing coverage tasks. The goal is providing a strategy that suits different scenarios well, regardless of the specific task, the swarm composition in terms of agents' mobility and efficiency, and the environment set-up.

The coverage problem is vast and widely discussed in the literature. Several solutions have been proposed in the last decades; to the best of the author’s knowledge, however, what is missing is a common way to model different objectives and to design control laws of general validity, whereas ad-hoc solutions are widely preferred. Chapter 1 is devoted to presenting some approaches to the different coverage tasks. The Descriptor Function Framework, which will be the common thread of this work, will be addressed as well, reporting the results achieved in the related seminal works.

Chapter 2 extends the Framework, discussing some solutions in terms of team heterogeneity and achieving a fully decentralized control. In Chapter 3 a control law to ensure agents' safety when moving in an environment with obstacles is proposed. As will be shown, agents do not necessarily need a priori knowledge of the obstacles' positions and shapes, as long as they are able to detect them through proximity sensors. Chapter 4 introduces a new class of vehicle dynamics into the framework, thus proving its flexibility. In Chapter 5 the coverage task is modeled as a differential cooperative game, for which a suboptimal solution is proposed.

The validity of the proposed control laws will be proved by means of formal arguments and their effectiveness will be assessed through simulations. Particular attention will be devoted to the trade-off between performance, resources and task requirements.


Chapter 1

On the Coverage Problem

The aim of this Chapter is to provide some general background on the problem of coverage and to illustrate the features of the Descriptor Function Framework.

The literature related to the deployment of swarms of autonomous agents to perform surveillance and patrolling tasks is extremely vast. The focus is often on particular case studies; therefore, in order to retain only the most significant approaches in relation to the scope of this work, a general overview of the main scenarios and techniques will be presented. On the other hand, since the present work is set in the Descriptor Function Framework, a more detailed discussion of its formalism and of the results obtained so far will be provided.

1.1 Literature Review

The goal of a coverage protocol for a set of autonomous robots is to properly cover a search domain with the union of the sensory areas of the agents belonging to the

deployed swarm. The sensory area of each agent is the region where it is capable

of performing some sort of task. In the case of environment monitoring, this can be considered the portion of space where the agent is capable of acquiring information, for instance using a camera. Several applications can be considered, from surveillance to exploration. In other settings, the sensory area is meant as the region in which the robot can act in order to modify the environment. This can be useful, for example, in scenarios of lawn mowing, floor cleaning or room heating, in which each robot has a limited action range and a joint operation allows pursuing a global objective.

Besides their physical meaning, more substantial differences can be highlighted as far as the kind of coverage tasks is concerned.

A first category consists in finding the optimal locations that a set of immobile agents should occupy in order to minimize some cost function. This issue is of relatively little interest in the field of Robotics, as the computation can be performed offline and no control law is synthesized to drive the motion of the agents. The entire domain is divided into proper Voronoi cells. Let $\mathbb{R}^n$ be the considered environment and let $D = \{p_1, \ldots, p_d\} \subset \mathbb{R}^n$ be a finite set of points in $\mathbb{R}^n$. A Voronoi partition (or tessellation) of $\mathbb{R}^n$ generated by $D$ is a collection of subsets $V_i(D)$, $i = 1, \ldots, d$, of $\mathbb{R}^n$ such that $V_i \cap V_j = \emptyset$ for $i \neq j$, $\bigcup_{i=1}^{d} V_i = \mathbb{R}^n$ and

$$V_i(D) = \left\{ q \in \mathbb{R}^n \;\middle|\; \|q - p_i\| \le \|q - p_j\|, \;\; p_i, p_j \in D, \; p_i \neq p_j \right\}.$$

In most cases, the problem is solved using centroidal Voronoi partitions, i.e. partitions generated by the centroids of the cells themselves. The agents' optimal positions are these centroids. References [2] and [3] provide a more in-depth discussion of this topic.
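To make the construction concrete, a centroidal Voronoi configuration can be approximated numerically on a discretized domain: grid points are assigned to their nearest generator (a discrete Voronoi partition), and each generator is moved to the density-weighted centroid of its own cell (one Lloyd iteration). The sketch below is purely illustrative and not code from [2] or [3]; all names (`lloyd_step`, the grid resolution, the density `phi`) are hypothetical choices.

```python
import numpy as np

def lloyd_step(generators, grid, phi):
    """One Lloyd iteration: build the discrete Voronoi partition of the
    grid points, then move each generator to the phi-weighted centroid
    of its own cell."""
    # squared distance from every grid point to every generator
    d2 = ((grid[:, None, :] - generators[None, :, :]) ** 2).sum(axis=2)
    owner = d2.argmin(axis=1)               # Voronoi cell membership
    new_gen = generators.copy()
    for i in range(len(generators)):
        w = phi[owner == i]
        if w.sum() > 0:                     # skip empty cells
            new_gen[i] = (grid[owner == i] * w[:, None]).sum(0) / w.sum()
    return new_gen

# toy usage: 5 generators on a 40x40 square with uniform density phi
rng = np.random.default_rng(0)
gen = rng.uniform(0.0, 40.0, size=(5, 2))
xs, ys = np.meshgrid(np.linspace(0, 40, 80), np.linspace(0, 40, 80))
grid = np.column_stack([xs.ravel(), ys.ravel()])
phi = np.ones(len(grid))
for _ in range(30):
    gen = lloyd_step(gen, grid, phi)   # iterates towards a centroidal partition
```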

Static Coverage

Another type of coverage problem, mainly known as Static Coverage, refers to the motion control of a swarm of agents in order to make them achieve final optimal positions; a wide range of examples is provided in [4] and references therein. Static Coverage is the enhanced version of the aforementioned problem, in that the vehicles’ initial positions are not the optimal ones; a feedback control law is necessary to drive them from their deployment locations to final ones minimizing some cost function.

One of the most widespread approaches resorts to Voronoi partitions. In particular, a Voronoi tessellation generated by the agents' positions is defined and updated over time; a control law aims at driving the swarm to a balanced configuration, which is generally the solution of an optimization problem.

An example is provided in [5], where, for a team of $d$ agents carrying isotropic sensors whose performance degrades with increasing distance, the objective function

$$\mathcal{H} = \sum_{i=1}^{d} \int_{V_i} f(\|q - p_i\|)\, \phi(q)\, dq$$

is defined, with $f : \mathbb{R}^+_0 \to \mathbb{R}$ a non-increasing performance function and $\phi$ weighting each point of the environment. The swarm configurations for which the maxima of this function are attained are solutions of the static coverage problem. A particular case is choosing $f(x) = -x^2$, for which the optimal configurations are centroidal Voronoi partitions weighted by $\phi$; this is known as the centroid problem. If the $d$ agents have single-integrator dynamics $\dot p_i = u_i$ and are controlled by the law $u_i = \partial \mathcal{H} / \partial p_i$, convergence to a critical point of $\mathcal{H}$ (though not necessarily the global maximum) has been proven. This algorithm is fully distributed, since it is shown that each robot only needs to exchange information with its neighbours.

Building on the previous definition of $\mathcal{H}$ and restricting the scope to the centroid problem, in [6] a control law driving the agents (with single-integrator dynamics) towards their Voronoi cell centroids is discussed. Namely, the authors assume that each robot does not know the function $\phi$, hence it must compute and update an estimate of the centroid position through a consensus-based filter; a proportional control is implemented to make the agent reach it. As shown, every member of the team accomplishes this task and, under further assumptions on its trajectory, the estimated centroid eventually coincides with the true one, thus yielding the optimal result.

A further variation is provided in [7], where the authors consider the case of anisotropic sensors. By employing a non-Euclidean distance, new Voronoi partitions $V_i^*$ are defined. Restricting the scope to uniform sensor orientations, a distributed control law is then synthesized to drive the agents towards the centroids of this new tessellation. The introduction of modified Voronoi partitions is the foundation of other works presented in [8] and [9], where the case of arbitrary convex sensor footprints is explored and a distributed control law, requiring each agent's knowledge of its neighbours' spatial configurations, is synthesized resorting to gradient ascent on $\mathcal{H}$.

Voronoi-based approaches to the Static Coverage problem have a particularly evident practical drawback: they are computationally intensive, as the Voronoi tessellation must be continuously updated. The Descriptor Function Framework presented in Section 1.2 offers a less demanding solution to this task. A particular case of Static Coverage is the so-called Uniform Deployment task, which occurs when there are no preferential regions within the operational area. In such a case, an alternative to Voronoi tessellations is resorting to potential fields. In [10], for instance, a potential field is defined over the whole environment as the sum of contributions from obstacles and agents. Namely, these are treated as carriers of positive electric charges. As a result, each agent, modeled as a point mass, experiences a force opposite to the gradient of the potential field, a function of its relative distance from the other agents and the obstacles. If an equation of motion involving a viscous friction contribution is used, convergence to a static equilibrium can easily be proved.

Effective Coverage

When considering Static tasks, the implicit assumption is that the number of agents deployed is sufficiently high in relation to the size of the operational area, so that a satisfactory level of coverage is achieved. On the other hand, if the operational area is too vast to be covered by the team, reaching a final static, possibly optimal, configuration still provides incomplete knowledge of the environment due to sensory area limitations. Instead, it may be desirable that the team stop only after sufficiently wandering over the entire operational area. This kind of approach is mainly suited to applications in which a limited number of agents have the goal of exploring the whole environment, perhaps as a preliminary step towards obtaining a global map.

The Effective Coverage task was introduced in [11]. Each agent, with single-integrator dynamics, carries a sensor characterized by an instantaneous coverage function, which describes how effectively the sensor is covering a point $q$. The effective coverage attained at a point $q$ is then the integral over time of the sum of all the agents' coverage contributions at $q$. A cumulative error function over the whole environment is defined, and the motion control of the swarm makes the agents descend its gradient, thus moving towards poorly covered areas. A more in-depth and formal description of this task will be provided in Section 1.2, where it is shown how it is suitably contextualized in the Descriptor Function Framework.

One issue of the Effective Coverage task is that the swarm may get stuck in local minima of the cost function, which results in all the agents stopping in fully covered areas while some portions of the environment have not yet been sufficiently explored. An improvement to the control strategy is necessary to take this possibility into account. The solution proposed in [11] consists in agents selecting the closest uncovered point and using a feedback law to drive towards it whenever this situation occurs.

In [12] the work is extended by including an inter-agent collision avoidance term in the control input of robots, thus guaranteeing that their relative distances do not drop under a specified threshold.

By adding further control terms, in [13] the inter-agent distances are upper-bounded, thus obtaining a flocking behaviour of the team. Two control modes, coverage and swarming, are defined, and all agents switch between them when proper conditions are verified. During coverage mode, agents are driven by a gradient-descent control law aimed at minimizing a coverage error function until a local optimum is reached, which makes them switch to swarming mode; while in swarming mode, their sensors are turned off and the control input makes them head towards a chosen, unique waypoint in an uncovered area; once each of them has reached a minimum distance from the waypoint, the sensors are switched on again and coverage mode resumes.

Similarly, in [14], a so-called supervised control is proposed, involving the presence of a static agent, the supervisor, which assigns waypoints to the covering agents whenever the swarm is in the proximity of local minima. To preserve a communication link between the agents and the supervisor, the control law guarantees that agents move within the communication range of the supervisor.

Team heterogeneity in relation to the type of task can also be found in [15], where some agents are not involved in covering, but are in charge of spreading coverage-related information over the network. Flocking behaviour is guaranteed by proper proximity maintenance control as well, thus preserving communication links.

The authors of [16] propose a control strategy for teams of robots with unicycle dynamics; agents carry sensors with a limited field of view, which suitably models the acquisition ability of cameras. This feature is particularly useful in the Effective Coverage task when the goal is exploring an unknown environment.

In [17] the perturbation control drives the agents towards waypoints that allow them to cover the operational area in a hierarchically ordered fashion. Namely, the environment is divided into a first order of cells; each of them is divided again into a second order of subcells; this procedure goes on iteratively until the subcells of the highest order are small enough to be thoroughly covered by a single agent. The agents' aim is then to properly cover all the cells belonging to the same order, which guarantees that the parent cell is fully covered as well; afterwards, they switch to another cell and repeat the procedure. This waypoint guidance control is mixed with the usual gradient-descent coverage control by means of a suitable, time-varying weighting scheme. Some energy considerations are made in [18], where the perturbation mode consists in agents being driven towards waypoints chosen in a way that properly distributes the control effort burden. In particular, agents deployed with a lower battery level, or those that underwent a higher energy consumption as a result of coverage control, are assigned closer waypoints.

Persistent Coverage

It is sometimes required that agents explore portions of the environment that they have already fully covered in the past. Such a need can be justified by a decrease over time in the validity of the information collected, or by the vanishing of the effects brought to the environment by the robots in the past. A typical example, which frequently occurs in practical situations, is the task of patrolling: it requires agents to continuously patrol a closed area; clearly, robots should go back to visited regions and, in general, spots that have not been visited for a long time have a higher priority. Another application is, for instance, heating closed spaces; here, the requirement that agents re-visit areas is justified by the decay of temperature over time as a consequence of diffusion.

Approaches to this kind of task are mainly of two types: algorithmic or gradient-descent-based. The former basically consists in some kind of discretization of the environment and in the design of a policy guaranteeing that each area is visited periodically. The latter is more similar to the previously discussed techniques and drives agents towards areas that allow them to minimize a defined cost function. An example of an algorithmic solution to the Persistent Coverage task can be found in [19], where, at the initialization stage, agents identify via points in the operational area and compute proper paths between them. The goal then becomes following these paths in an ordered way, and this is done by means of a fixed-structure algorithm. The control law to drive the agents is not explicitly given, since the proposed solution is general and can be adapted to agents with arbitrarily complex dynamics. This work is extended in [20], where the minimum number of agents required to obtain full coverage, in spite of the decay rate, is given; furthermore, conditions for the periodic achievement of full coverage at predefined points of interest are deduced, and it is shown that, in the case of no decay (thus an Effective Coverage task), the algorithm guarantees full coverage. Such approaches have the advantage of high predictability and avoid the issue of incurring local minima that arises when seeking optimal configurations for the swarm. However, some drawbacks are evident: there is a lack of flexibility and the solution is fault-intolerant, since guarantees on the periodicity or completeness of the attained coverage are lost in case of malfunctioning agents. Also, the environment is supposed to be obstacle-free, which is a rather strong assumption in practical situations.

In [21] a discrete-time/discrete-state procedure is proposed. In particular, an optimization problem is introduced in order to find the best strategy over a limited prediction horizon; the guarantee of optimality is obtained as a consequence of considering the environment as a finite collection of points. The strategy is based on the exploration of a tree of possible control laws, further exploring branches that allow for a more rewarding cost minimization and pruning those which would result in a poorer performance. The main drawback of this approach is the computational complexity required to guarantee a satisfactory swarm behaviour. A gradient-descent strategy, combined with a perturbation term to prevent agents from being trapped in local minima of the cost function, is presented in [22]. Inter-agent collision as well as obstacle avoidance techniques are reported. The perturbation control consists in waypoint guidance towards poorly covered points. The behaviour is not characterized by switching between modes; rather, both control terms take part in a time-varying weighted sum, with the coverage contribution outweighing the waypoint guidance in areas that require more intensive exploration. In [23] this approach is slightly modified, though preserving the general structure, and some energy considerations are made. Namely, limiting the range of the sensors, as well as their power, when the instantaneous coverage is low brings about beneficial outcomes in terms of reduced energy consumption.

1.2 The Descriptor Function Framework

Introduced in the 2010 work [24] and extended in [25], the Descriptor Function Framework addresses the need for a unified method to approach different types of tasks. The robots belonging to a team may have different missions, as a result of robot heterogeneity as well as of time-varying goals. In this way, if agents are able to perform different kinds of tasks, these can be properly switched in order to achieve a global, more complex swarm objective. The mathematical tool of the Descriptor Function allows for an abstraction from the actual task, thus providing control laws for the entire swarm which have general validity and fit several applications, from static swarm deployment to dynamic exploration of the environment.

Notation and Conventions

The general notation employed in the remainder of this work is now presented. $\mathcal{T}$ is the set of all agents, whose cardinality is $N$. The following subsets of $\mathbb{R}$ are introduced: $\mathbb{R}^+_0 = \{x \in \mathbb{R} \mid x \ge 0\}$ and $\mathbb{R}^+ = \mathbb{R}^+_0 \setminus \{0\}$. Without loss of generality, $\mathbb{R}^+_0$ will represent the time horizon. The operational area where agents are allowed to operate is a closed and bounded set $Q \subset \mathbb{R}^n$. A configuration space on $Q$, denoted by $\mathcal{C}(Q)$, is introduced, which is the set of all possible configurations for an agent located in $Q$. $q_i \in Q$ will denote the $i$-th robot's position, while $p_i \in \mathcal{C}(Q)$ will be called its pose. For instance, in the case of unicycle agents, $Q \subset \mathbb{R}^2$ and a proper choice of the configuration space could be $\mathcal{C}(Q) = Q \times [0, 2\pi)$. In such cases, $p_i = [p_{x,i}, p_{y,i}, \theta_i]$ and the selection matrix

$$S = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \end{bmatrix}$$

will be used to extract the position vector from the pose vector: $q_i = S p_i$. The vector $p = [p_1^T, \ldots, p_N^T]^T$ will denote the concatenation of all the agents' poses.

The following shorthand will be used for the weighted norm of a real vector $v$: $\|v\|_A = \sqrt{v^T A v}$.

The support operator $\mathrm{supp}\{\cdot\}$ is introduced; for a generic function $f(\cdot, q) \in \mathbb{R}^{m \times n}$, $m, n \in \mathbb{N}$, it is defined as

$$\mathrm{supp}\{f\} = \{q \in \mathbb{R}^n \mid f(\cdot, q) \neq 0\},$$

i.e. it is the set of points in the operational space where $f$ is not identically zero. For the sake of generality, all quantities will be indicated without units of measurement.

1.2.1 Agent Descriptor Function

The Agent Descriptor Function (ADF) of robot $i \in \mathcal{T}$ is a function $d_i : \mathcal{C}(Q) \times Q \to \mathbb{R}^+_0$. It represents the spatial effectiveness of agent $i$ in performing the assigned task. For instance, in the case of measurement acquisition, $d_i(p_i, q)$ models the sensing quality achieved by the $i$-th robot with pose $p_i$ at the different points of the environment $q \in Q$. The ADF is assumed to be continuous and differentiable over $Q$. As far as ADFs are concerned, the definition of support specializes to

$$\mathrm{supp}\{d_i(p_i, \cdot)\} = \{q \in Q \mid d_i(p_i, q) \neq 0\}, \qquad (1.1)$$

and it is assumed to be a connected set.

[Figure 1.1: Gaussian ADFs Examples. (a) Isotropic Gaussian ADF. (b) Gaussian ADF with preferential direction.]

An example of an ADF in the case of $Q \subset \mathbb{R}^2$ is the Gaussian ADF

$$d_G(p_i, q) = A \exp\left( -\frac{1}{2} \left\| \begin{bmatrix} \cos\theta_i & \sin\theta_i \\ -\sin\theta_i & \cos\theta_i \end{bmatrix} (q - S p_i) \right\|^2_{\Sigma^{-1}} \right) \qquad (1.2)$$

with $A$ the maximum ADF value, attained at $S p_i$, and $\Sigma = \mathrm{diag}\{\sigma_x, \sigma_y\}$ describing the decay along the two directions. If $\sigma_x = \sigma_y$ the resulting sensor is isotropic. Figure 1.1 represents the cases of isotropic and anisotropic Gaussian ADFs. On the other hand, some sensors, typically cameras, are able to acquire environment information only within a limited field of view. In [26], using the sigmoidal logistic function $(1 + e^{-x})^{-1}$ as a starting point, the authors propose the definition of a Field of View function $f_{FOV} : \mathcal{C}(Q) \times Q \to [0, 1]$:

$$f_{FOV}(p_i, q) = \frac{1}{1 + e^{-k\left(r_1 \cos\frac{\phi}{2} + r_2 \sin\frac{\phi}{2}\right)}} \cdot \frac{1}{1 + e^{-k\left(-r_1 \cos\frac{\phi}{2} + r_2 \sin\frac{\phi}{2}\right)}} \qquad (1.3)$$

with $k$ characterizing the slope of the sigmoid and $\phi \in [0, 2\pi]$ defining the angular span of the field of view. The quantities $r_1$ and $r_2$ depend on the agent's pose:

$$r_1(p_i, q) = (q_x - p_{x,i}) \cos\left(\theta_i - \frac{\pi}{2}\right) + (q_y - p_{y,i}) \sin\left(\theta_i - \frac{\pi}{2}\right), \qquad (1.4)$$

$$r_2(p_i, q) = -(q_x - p_{x,i}) \sin\left(\theta_i - \frac{\pi}{2}\right) + (q_y - p_{y,i}) \cos\left(\theta_i - \frac{\pi}{2}\right). \qquad (1.5)$$

The function $f_{FOV}$ shapes the agent's field of view through its pose and the parameters $k$ and $\phi$, which describe how the sensing quality degrades at the borders of the field of view and how large it is, respectively. Therefore, combining this shaping function with an ADF, it is possible to obtain an ADF with a limited field of view. For instance,

$$d_G(p_i, q)\, f_{FOV}(p_i, q), \quad \sigma_x = \sigma_y, \qquad (1.6)$$

is an ADF whose amplitude decreases isotropically away from the agent, but towards a preferential direction $\theta_i$ and with a visual angle equal to $\phi$. Figure 1.2 shows an example of an ADF of the form (1.6).

[Figure 1.2: Gaussian ADF with FoV.]

Other ADFs, found in [25], [26] and [27], can be characterized either by radial symmetry (polynomial ADF) or by planar symmetry (sinusoidal ADF), as well as by non-symmetrical patterns. Each of these shapes may be the most suitable for describing the acquisition area of a sensor or the action range of a device carried by the mobile robot.

The most powerful property of ADFs, however, is that their shape does not affect how they are treated mathematically. As explained later, the control law is not concerned with the size or the field of view of the agent's sensory area, and the same formalism deals with arbitrary kinds of ADFs. This helps abstract from the tasks actually performed and makes the framework completely general.

Theoretically, some ADFs, such as the Gaussian or the sinusoidal one, have a support coinciding with $Q$, since they only approach zero asymptotically. However, it is reasonable to consider a more limited support, which better reflects the behaviour of real devices with spatially confined capabilities. Due to the requirement of continuity and differentiability of the ADF, it cannot simply be truncated; nevertheless, provided that a lower threshold $\underline{A} \in \mathbb{R}^+_0$ on the ADF value is set, a new definition of support can be introduced:

$$\mathrm{supp}_{\underline{A}}\{d_i(p_i, \cdot)\} = \{q \in Q \mid d_i(p_i, q) > \underline{A}\}. \qquad (1.7)$$

In the remainder, the distinction between spatially bounded and unbounded ADFs will not be explicitly addressed and, with a slight abuse of notation, $\mathrm{supp}\{\cdot\}$ will also denote $\mathrm{supp}_{\underline{A}}\{\cdot\}$.

Remark 1.1. Because of the positive semidefiniteness of the ADF, it is possible to draw the following conclusion about its gradient with respect to the agent's pose:

$$\mathrm{supp}\left\{ \frac{\partial d_i}{\partial p_i} \right\} \subseteq \mathrm{supp}\{d_i\}. \qquad (1.8)$$

In fact, it is sufficient to observe that

$$d_i(p_i, \tilde q) = 0 \implies \left. \frac{\partial d_i(p_i, q)}{\partial p_i} \right|_{q = \tilde q} = 0.$$

The sum of all the ADFs is called the current Task Descriptor Function (cTDF) and represents the cumulative amount of sensing that the team is instantaneously achieving. In particular, it is defined as $d : \mathcal{C}^N(Q) \times Q \to \mathbb{R}^+_0$,

$$d(p, q) = \sum_{i \in \mathcal{T}} d_i(p_i, q). \qquad (1.9)$$
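As an illustration of how such ADFs can be evaluated numerically, the sketch below implements the Gaussian ADF (1.2), the field-of-view shaping (1.3)-(1.5), the limited-FoV ADF (1.6), and the cTDF (1.9) over an array of sample points. It is a minimal sketch: the function names and default parameters are illustrative assumptions, not part of the Framework.

```python
import numpy as np

def gaussian_adf(pose, q, A=3.0, sx=3.0, sy=3.0):
    """Gaussian ADF (1.2): pose = (px, py, theta), q an (M, 2) array of points."""
    px, py, th = pose
    R = np.array([[np.cos(th), np.sin(th)],
                  [-np.sin(th), np.cos(th)]])
    v = (q - np.array([px, py])) @ R.T        # points in the agent frame
    # exponent is the Sigma^{-1}-weighted squared norm, Sigma = diag(sx, sy)
    return A * np.exp(-0.5 * (v[:, 0]**2 / sx + v[:, 1]**2 / sy))

def fov(pose, q, k=2.0, phi=np.pi / 2):
    """Field-of-view shaping function (1.3), with r1, r2 as in (1.4)-(1.5)."""
    px, py, th = pose
    dx, dy = q[:, 0] - px, q[:, 1] - py
    r1 = dx * np.cos(th - np.pi/2) + dy * np.sin(th - np.pi/2)
    r2 = -dx * np.sin(th - np.pi/2) + dy * np.cos(th - np.pi/2)
    s1 = 1.0 / (1.0 + np.exp(-k * ( r1*np.cos(phi/2) + r2*np.sin(phi/2))))
    s2 = 1.0 / (1.0 + np.exp(-k * (-r1*np.cos(phi/2) + r2*np.sin(phi/2))))
    return s1 * s2

def ctdf(poses, q):
    """Current TDF (1.9): sum of limited-FoV ADFs (1.6) over all agents."""
    return sum(gaussian_adf(p, q) * fov(p, q) for p in poses)
```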

1.2.2 Task Descriptor Function

The (desired) Task Descriptor Function (TDF) $d^*$ describes how the agents should be distributed in the operational area to maximize goal achievement. In other words, for each $q \in Q$, it assesses how much of the available amount of resources is needed there. Another strength of the Framework lies in its ability to deal with a large variety of task types simply by using a proper definition of $d^*$. In the following paragraphs this flexibility will be discussed; in particular, it will be highlighted how the different problems presented in Section 1.1 can be uniformly addressed in the Descriptor Function Framework.

Static Coverage

In the case of Static Coverage, the TDF is a function $d^* : Q \to \mathbb{R}^+_0$ that represents how much coverage is needed at each $q$ and does not change over time. Regions where $d^*(q)$ is higher will need a correspondingly higher agent density. Notice that Uniform Deployment is a special case of Static Coverage with $d^*(q) = \bar d^* > 0$.

Rather than a constant amount of sensing, the Effective Coverage task requires that each point of the operational area achieve a minimum level of coverage over time; when this happens, the point becomes of no interest to the swarm. It is clear, then, that the Task Descriptor Function associated to Effective Coverage changes over time, as the task unfolds and the agents explore the environment. The total coverage attained, also called information, is a function $I : \mathbb{R}^+_0 \times Q \to \mathbb{R}^+_0$,

$$I(t, q) = \sum_{i \in \mathcal{T}} \int_0^t d_i(p_i(\tau), q)\, d\tau = \int_0^t d(p(\tau), q)\, d\tau. \qquad (1.10)$$

Note that it is the integral of the cTDF (1.9) over time. Therefore, if $C^*$ is the minimum level of information required at all points of the environment, the TDF for the Effective Coverage task is $d^* : \mathbb{R}^+_0 \times Q \to \mathbb{R}^+_0$:

$$d^*(t, q) = \max\{0, C^* - I(t, q)\}. \qquad (1.11)$$

The $\max\{\cdot\}$ operator is used to constrain the TDF to be a nonnegative real.

Persistent Coverage

In those applications in which the relevance of information acquired in the past fades as time passes, it is necessary that agents explore again points previously visited. To model this phenomenon, a first-order differential equation involving $I$ is used. Namely:

$$\frac{\partial I}{\partial t}(t, q) = \delta I(t, q) + d(p(t), q), \qquad (1.12)$$

where the parameter $\delta \le 0$ is called the information decay rate and characterizes how fast the swarm loses awareness of point $q$. Observe that the case $\delta = 0$ coincides with the Effective Coverage task, while the general closed-form expression for $I$, assuming zero initial information, is

$$I(t, q) = \int_0^t e^{\delta(t - \tau)}\, d(p(\tau), q)\, d\tau, \qquad (1.13)$$

and the TDF is, as before, $d^*(t, q) = \max\{0, C^* - I(t, q)\}$. Points currently covered by the agents have increasing information, thus lower TDF values; conversely, the decay process makes the need for resources rise where agents are absent.
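On a discretized environment, the information dynamics (1.12) and the TDF (1.11) can be simulated with a simple forward-Euler step, as in the hedged sketch below; setting `delta = 0` recovers Effective Coverage (1.10), while `delta < 0` reproduces the fading memory of (1.13). Function names and the integration scheme are illustrative assumptions.

```python
import numpy as np

def information_step(I, d_current, delta, dt):
    """Forward-Euler integration of dI/dt = delta*I + d, cf. (1.12).
    I and d_current are arrays of values over the grid points of Q."""
    return I + dt * (delta * I + d_current)

def tdf(I, C_star):
    """Desired TDF (1.11): remaining coverage demand, clipped at zero."""
    return np.maximum(0.0, C_star - I)
```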

Target Assignment

Suppose that there are $N_w$ targets in the environment. The Target Assignment task requires that the agents track at most $N$ of the targets. Assuming identically shaped ADFs $d_i$, each target can be described by an analogous Descriptor Function, thus yielding the following TDF, $d^* : Q^{N_w} \times Q \to \mathbb{R}^+_0$:

$$d^*\left( [q_{w_1}^T, \ldots, q_{w_{N_w}}^T]^T, q \right) = \sum_{j=1}^{N_w} d_j(q_{w_j}, q). \qquad (1.14)$$


1.2.3 Task Error Function

With the definitions of the cTDF and of the TDF, it is natural to introduce an error between the amount of resources that the task requires and that actually provided by the swarm at each $q \in Q$. The (absolute) Task Error Function (TEF) is defined as

$$e(t, p, q) = d^*(t, q) - d(p, q), \qquad (1.15)$$

where the time dependence is explicit only in the case of Effective or Persistent Coverage. With a slight abuse of notation, the TEF will be denoted by $e(t, p, q)$ also when dealing with a generic, time-invariant task, to preserve generality.

Where $|e|$ is small, the right amount of sensing is available. At points where large values $e > 0$ are measured, the task is not sufficiently fulfilled; on the contrary, $e < 0$ indicates an excess of resources. In [25] another possible definition of the TEF is given as the ratio between the TDF and the cTDF:

$$e_r(t, p, q) = \frac{d^*(t, q)}{d(p, q) + c}, \qquad (1.16)$$

where the constant $c > 0$ ensures that it is always well defined. The task is considered fulfilled as $e_r$ approaches 1, although the exact disparity between TDF and cTDF is not directly measurable. This can be an advantage when a precise model of the resource needs is not available. Unless explicitly mentioned, the TEF adopted is the absolute one.

1.2.4 Control

The agents' motion in the environment is driven by a control law set in the context of an optimization problem. In particular, a cost function is introduced, which is a cumulative measure of the TEF over the whole operational area, and the proposed input to the agents is aimed at minimizing it with respect to the robots' poses.

The aforementioned cost function, also called the error function, is, in the most general case,

$$\xi(t, p) = \int_Q f(e(t, p, q))\, \sigma(q)\, dq. \qquad (1.17)$$

The function $f : \mathbb{R} \to \mathbb{R}^+_0$ is continuous in $\mathbb{R}$, while $\sigma : Q \to \mathbb{R}^+_0$ weights each point of the environment. The meaning of the cost function is clear: it is a measure, weighted at $q$ by $\sigma(q)$, of the function $f(e(t, p, q))$ over the whole environment $Q$. In general, such a function will be of the form

$$f(e) = \max\{0, e\}^p, \quad p \in \{2, 3, \ldots\}. \qquad (1.18)$$
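Numerically, the integral (1.17) reduces to a weighted sum over the cells of a discretization grid. A minimal grid-quadrature sketch (function names are illustrative assumptions):

```python
import numpy as np

def penalty(e, p=2):
    """Penalty function f(e) = max{0, e}^p from (1.18): only coverage
    deficits (e > 0) contribute to the cost."""
    return np.maximum(0.0, e) ** p

def error_function(d_star, d_curr, sigma, cell_area, p=2):
    """Grid quadrature of the cost (1.17): the TEF (1.15) is evaluated
    pointwise and f(e) is integrated with the weight sigma."""
    e = d_star - d_curr
    return float(np.sum(penalty(e, p) * sigma) * cell_area)
```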

The desired swarm configuration is thus the one minimizing the cost function with respect to the robots' poses:

$$p_d = \arg\min_{p \in \mathcal{C}^N(Q)} \xi(t, p). \qquad (1.19)$$

However, the general non-convexity of $\xi$, a typical feature of tasks such as coverage, may lead to suboptimal solutions, as the swarm configuration approaches its local minima.

Assuming that the agents' motion is described by the single-integrator dynamics $\dot p_i(t) = u_i(t)$, the control law is synthesized using Lyapunov theory as a starting point. Define the control

$$u_i(t) = -\beta \left( \frac{\partial \xi}{\partial p_i} \right)^T, \qquad (1.20)$$

with $\beta > 0$ a design parameter. Because of the specified agents' dynamics, letting $u = [u_1^T, \ldots, u_N^T]^T$, the compact form for the whole team is

$$\dot p = -\beta \left( \frac{\partial \xi}{\partial p} \right)^T, \qquad (1.21)$$

which has a set of equilibria described by

$$\Lambda(t) = \left\{ \bar p \in \mathcal{C}^N(Q) \;\middle|\; \frac{\partial \xi(t, \bar p)}{\partial p} = 0 \right\}, \qquad (1.22)$$

i.e. the equilibria of the global system are the critical points of $\xi$. In fact, consider the cost function as a Lyapunov candidate; this is possible since $\xi \ge 0$, and $\xi = 0$ if and only if the task error function $e$ is lower than or equal to zero over the whole environment. Differentiation with respect to time yields

$$\frac{d\xi(t, p)}{dt} = \frac{\partial \xi(t, p)}{\partial t} + \frac{\partial \xi(t, p)}{\partial p}\, \dot p \qquad (1.23)$$

and, introducing the global control law (1.21),

$$\frac{d\xi(t, p)}{dt} = \frac{\partial \xi(t, p)}{\partial t} - \beta \left\| \frac{\partial \xi(t, p)}{\partial p} \right\|^2 = \frac{\partial \xi(t, p)}{\partial t} - \frac{1}{\beta} \sum_{i \in \mathcal{T}} \|u_i(t)\|^2. \qquad (1.24)$$

For tasks in which the TEF is not explicitly time-dependent (Static Coverage, Uniform Deployment, Target Assignment), also called Deployment Problems, $\xi$ is only a function of the instantaneous poses $p$, which results in $\partial \xi / \partial t = 0$. In such situations, it is straightforward to conclude that

$$\frac{d\xi(p)}{dt} = -\beta^{-1} \sum_{i \in \mathcal{T}} \|u_i(t)\|^2 \le 0, \qquad (1.25)$$

so that, by LaSalle's Invariance Principle, the system converges to the largest invariant set contained in $\Lambda$; on the other hand, $\Lambda$ itself is invariant because of the specified control law. Therefore, the global configuration $p$ will converge to a single point $\bar p$ in $\Lambda$, i.e. a critical point of the cost function. This can be a minimum, a maximum or a saddle point; notice, however, that in the latter cases the equilibrium would be unstable and arbitrarily small perturbations would make the system reach a minimum, in which case asymptotic stability is clearly achieved.
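For simulation purposes, the gradient-descent law (1.20)-(1.21) can be approximated without an analytic gradient: the sketch below perturbs each pose coordinate, estimates $\partial \xi / \partial p$ by central finite differences, and applies one explicit-Euler step of the single-integrator dynamics. This is an illustrative numerical scheme under those assumptions, not the derivation used in the cited works.

```python
import numpy as np

def gradient_step(poses, xi, beta=1.0, dt=0.05, h=1e-4):
    """One Euler step of p_dot = -beta * (dxi/dp)^T, cf. (1.20)-(1.21).
    `xi` is any callable mapping the (N, dim) pose array to the scalar
    cost (1.17), e.g. the grid-quadrature error_function sketched above."""
    poses = np.asarray(poses, dtype=float)
    grad = np.zeros_like(poses)
    for idx in np.ndindex(*poses.shape):    # central finite differences
        pp, pm = poses.copy(), poses.copy()
        pp[idx] += h
        pm[idx] -= h
        grad[idx] = (xi(pp) - xi(pm)) / (2.0 * h)
    return poses - dt * beta * grad
```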

The case of time-varying cost functions will be dealt with in more detail in Chapter 2, where the more general case of agents characterized by driftless, affine-in-control dynamics will be discussed. Notice, however, that the following result regarding Effective Coverage will be used several times in the remainder of this work.

Lemma 1.1. In the case of the Effective Coverage task, if the penalty function is chosen as in (1.18), the following implication holds:

$$\frac{\partial \xi(t, p)}{\partial t} = 0 \implies \frac{\partial \xi(t, p)}{\partial p_i} = 0 \quad \forall i \in \mathcal{T}. \qquad (1.26)$$

Proof. By definition:

$$\frac{\partial \xi}{\partial t} = \int_Q \frac{\partial f}{\partial e} \frac{\partial e(t, p, q)}{\partial t}\, \sigma(q)\, dq, \qquad (1.27)$$

$$\frac{\partial \xi}{\partial p_i} = \int_Q \frac{\partial f}{\partial e} \frac{\partial e(t, p, q)}{\partial p_i}\, \sigma(q)\, dq. \qquad (1.28)$$

Recall that, in the case of Effective Coverage,

$$e(t, p, q) = \max\left\{ 0,\; C^* - \int_0^t d(p(\tau), q)\, d\tau \right\} - d(p(t), q);$$

first observe that both integrands in (1.27) and (1.28) are zero at points where $e \le 0$, due to the definition of the penalty function. Also, since $d \ge 0$, if $d^* = 0$ then necessarily $e \le 0$. Therefore, points where $d^* = 0$ do not contribute to the integrals. Then, for $d^* > 0$,

$$\frac{\partial e}{\partial t} = -d(p, q) \qquad (1.29)$$

and

$$\frac{\partial e}{\partial p_i} = -\frac{\partial d(p, q)}{\partial p_i} = -\frac{\partial d_i(p_i, q)}{\partial p_i}. \qquad (1.30)$$

Hence the integrals (1.27) and (1.28) can be rewritten as

$$\frac{\partial \xi}{\partial t} = -\int_{\mathrm{supp}\{d\}} \frac{\partial f}{\partial e}\, d(p, q)\, \sigma(q)\, dq, \qquad (1.31)$$

$$\frac{\partial \xi}{\partial p_i} = -\int_{\mathrm{supp}\left\{\frac{\partial d_i}{\partial p_i}\right\}} \frac{\partial f}{\partial e} \frac{\partial d_i(p_i, q)}{\partial p_i}\, \sigma(q)\, dq. \qquad (1.32)$$

Extending the result discussed in Remark 1.1 and noticing that $\mathrm{supp}\{d\} = \cup_i\, \mathrm{supp}\{d_i\}$, the following holds:

$$\mathrm{supp}\left\{ \frac{\partial d_i}{\partial p_i} \right\} \subseteq \mathrm{supp}\{d\}, \qquad (1.33)$$

and it can be concluded that if $\partial \xi / \partial t = 0$, then $e \le 0$ $\forall q \in \mathrm{supp}\{d\}$, hence $\forall q \in \mathrm{supp}\{\partial d_i / \partial p_i\}$ and, therefore, $\partial \xi / \partial p_i = 0$ $\forall i \in \mathcal{T}$.

Remark 1.2. Notice that, since $\mathrm{supp}\{\partial d_i / \partial p_i\} \subseteq \mathrm{supp}\{d_i\}$, $\partial \xi / \partial p_i = 0$ as soon as $e \le 0$ $\forall q \in \mathrm{supp}\{d_i\}$.

1.2.5 Deployment Problems - Simulation Results

In this Section some simulations of Deployment Problems in $Q \subset \mathbb{R}^2$ are provided. The team is made up of eight agents and the environment is a square area with side length 40. Four of the robots have Gaussian ADFs of the form (1.2) and the remaining ones are of type (1.6); for all of them, the parameters associated to the Gaussian surface are $A = 3$, $\sigma_x = \sigma_y = 3$; the field of view of the agents carrying limited-visual-angle sensors is characterized by $k = 2$ and $\phi = \pi/2$. The initial deployment configuration has been kept identical for the three different scenarios, and the corresponding cTDF is shown in Figure 1.3.

[Figure 1.3: Initial cTDF.]

In the definition of the error function $\xi(q)$, $\sigma(q) = 1$, while the penalty function

$$f(e) = \max\{0, e\}^2 \qquad (1.34)$$

has been chosen. If $f(e) = e^2$ had been preferred, the values $e < 0$, i.e. an excess of resources, would have been penalized in the same way as a lack of resources. Such a choice, however, may lead to unsatisfactory performance, which can be avoided if the $\max\{\cdot\}$ operator is used. The control gain for all agents is $\beta = 1$.

[Figure 1.4: Uniform Deployment of the Swarm. (a) Final cTDF. (b) Evolution of the normalized cost function.]

The first task considered is Uniform Deployment. The associated TDF is $d^*(q) = 1$. Figure 1.4 shows the final cTDF and the time history of the normalized error function $\xi$. The final swarm configuration is such that each ADF is as far as possible from the others, thus reaching a minimum of the cost function.

As regards the Static Coverage task, it is first necessary to define the associated TDF. For the performed simulation, the following function has been chosen:

$$d^*(q) = \frac{1}{2} \left( \sin(0.3(q_x + q_y)) + \cos(0.3(q_x - q_y)) \right) + 1, \qquad (1.35)$$

which, in the considered domain $Q$, has four global maxima, as displayed in Figure 1.5. Starting from the configuration depicted in Figure 1.3, the described control law drives the agents so as to make their ADFs gather where the resource requirement is higher. As a result, three of the agents carrying FoV sensors cover the TDF maximum located in the north-east of the environment; the fourth FoV sensor and the Gaussian ones are placed on the remaining TDF maxima, as Figure 1.6 highlights.
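For reference, the TDF (1.35) is straightforward to evaluate over the simulation grid used in the earlier sketches; the function name below is an illustrative choice:

```python
import numpy as np

def d_star_static(q):
    """Static Coverage TDF (1.35); q is an (M, 2) array of points of Q."""
    qx, qy = q[:, 0], q[:, 1]
    return 0.5 * (np.sin(0.3 * (qx + qy)) + np.cos(0.3 * (qx - qy))) + 1.0
```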

[Figure 1.5: TDF associated to the Static Coverage Task.]

[Figure 1.6: Static Coverage Task. (a) Final cTDF. (b) Evolution of the normalized cost function.]

Finally, consider the case of Target Assignment. Two targets have been defined, located at the points $q_{w_1} = [10, 10]^T$ and $q_{w_2} = [20, 20]^T$. The ADFs associated to the two targets are both Gaussian and isotropic, with amplitude and standard deviation equal to 5 (Figure 1.7).

[Figure 1.7: TDF associated to the Target Assignment Task.]

In Figure 1.8 the final swarm configuration is shown. In the graph illustrating the evolution of the normalized cost function $\xi$, observe that, after a first steep drop, the system comes to the neighbourhood of a saddle point, as the decrease rate suddenly approaches zero. Nevertheless, saddle and maximum points are unstable, as already mentioned, thus causing the system to move away from them. This is noticeable at approximately time 4, when the error function starts decreasing again until a minimum is reached.

All the considered scenarios confirm that the cost function never satisfies $\dot\xi > 0$ and, in particular, the system converges to configurations $p$ such that $\dot\xi = 0$, corresponding to (local or global) minima of $\xi$.

[Figure 1.8: Target Assignment Task. (a) Final cTDF. (b) Evolution of the normalized cost function.]

Chapter 2

Decentralized Control in the DF Framework

In Chapter 1 an introduction to the Descriptor Function Framework was provided. In this Chapter it will be highlighted that the swarm control, as previously formulated, is centralized. On the other hand, when a set of agents is deployed, it is desirable that they carry out their tasks in a decentralized way, meaning that they are expected to act on the basis of their own knowledge of the environment or on the information provided by nearby agents via communication links. This does not necessarily coincide with a full knowledge of the swarm configuration or of the global task achievement. The aim of this Chapter is to show that both Deployment Problems and Effective or Persistent Coverage problems can be faced within a decentralized scheme. In the latter types of scenario, this comes at the expense of a partial deterioration of performance.

It turns out that the Effective Coverage task is the most representative among the coverage problems. It will therefore be the main topic of this Chapter, although an eye will be constantly kept on how to extend the results to other scenarios. The Chapter is structured in a progressive way: with the goal of obtaining a fully decentralized control, the approach presented in [28] will be gradually modified and a proof of convergence will be provided.

2.1 State of the Art

This Section is devoted to summarizing the main results achieved regarding Effective Coverage. For the sake of clarity, the main features related to this task, as dealt with in the Descriptor Function Framework, are briefly recalled where necessary. The environment is supposed obstacle-free. Agents can have heterogeneous affine-in-control driftless dynamics of the form

$$\dot x_i = g_i(x_i)\, u_i, \qquad (2.1)$$

where $x_i \in \mathbb{R}^{q_i}$ and $u_i \in \mathbb{R}^{m_i}$ are the $i$-th agent's state and control vectors, respectively, and $g_i : \mathbb{R}^{q_i} \to \mathbb{R}^{q_i \times m_i}$ is nonzero for all $x_i$. Moreover, let

$$p_i = h_i(x_i), \qquad (2.2)$$

with $h_i : \mathbb{R}^{q_i} \to \mathcal{C}(\mathbb{R}^n)$ a map between the $i$-th agent's state vector and its pose in the configuration space. Both $g_i$ and $h_i$ are assumed continuous and differentiable in $\mathbb{R}^{q_i}$.

Moreover, define

$$x = \begin{bmatrix} x_1 \\ \vdots \\ x_N \end{bmatrix}, \qquad p = \begin{bmatrix} p_1 \\ \vdots \\ p_N \end{bmatrix} = \begin{bmatrix} h_1(x_1) \\ \vdots \\ h_N(x_N) \end{bmatrix} = h(x).$$

Recall that the amount of coverage (also called information) attained at a point $q \in \mathbb{R}^n$ is ruled, by definition, by the first-order dynamics

$$\dot I(t, q) = \delta I(t, q) + \sum_{i \in \mathcal{T}} d_i(p_i(t), q) = \delta I(t, q) + d(p(t), q), \qquad (2.3)$$

where $\delta$, the information decay rate, is zero in the case of Effective Coverage and strictly negative in the case of Persistent Coverage. Hence, for both cases, the TDF is defined as

$$d^*(t, q) = \max\{0, C^* - I(t, q)\}, \qquad (2.4)$$

which means that $q$ is considered sufficiently covered if the information attained is above the threshold $C^* > 0$. Notice that the closed-form expression of $I(t, q)$ is given by

$$I(t, q) = e^{\delta(t - t_0)} I(t_0, q) + \int_{t_0}^{t} e^{\delta(t - \tau)}\, d(p(\tau), q)\, d\tau, \qquad (2.5)$$

which, for null initial information and in an Effective Coverage scenario, becomes

$$I(t, q) = \int_{t_0}^{t} d(p(\tau), q)\, d\tau. \qquad (2.6)$$

The TEF is

$$e(t, p, q) = d^*(t, q) - d(p, q), \qquad (2.7)$$

which allows one to define the error function $\xi : \mathbb{R}^+_0 \times \mathcal{C}^N(\mathbb{R}^n) \to \mathbb{R}^+_0$:

$$\xi(t, p) = \int_Q f(e(t, p, q))\, \sigma(q)\, dq, \qquad (2.8)$$

with $f : \mathbb{R} \to \mathbb{R}^+_0$, $f(e) = \max\{0, e\}^p$, $p \in \{2, 3, \ldots\}$, known as the penalty function, and $\sigma : Q \to \mathbb{R}^+_0$ a weight function which quantifies the importance of sensing each point of the environment. By definition, $f$ is positive semidefinite in $\mathbb{R}$ and $\partial^j f / \partial e^j$, $j \in \{0, \ldots, p-1\}$, is strictly convex in $\mathbb{R}^+$.

The Effective Coverage task is completed when $\xi = 0$ is satisfied. For this to happen, because of the definition of the penalty function, $e(t, p, q) \le 0$ $\forall q \in Q$ must hold.

2.1.1 Collision Avoidance Control

The desired behaviour of the swarm is to accomplish the assigned task while avoiding collisions between agents. Hence, the control action should guarantee that the distance between agents in $\mathbb{R}^n$ does not drop below a specified threshold $r > 0$. Moreover, it is assumed that each robot is able to detect another one when their relative distance is under $R > r$. First, define the function $l : \mathbb{R} \to \mathbb{R}^+_0$,

$$l(x) = \left( \min\left\{ 0, \frac{x^2 - R^2}{x^2 - r^2} \right\} \right)^2, \qquad (2.9)$$

and observe that, if $x \ge R$, $l(x) = 0$, while for $x \in (r, R)$ the function is strictly decreasing and $\lim_{x \to r^+} l(x) = +\infty$. The collision avoidance function is defined in the following way:

$$v(p) = \sum_{i \in \mathcal{T}} v_i(p_i) = \sum_{i \in \mathcal{T}} \sum_{j \in \mathcal{T} \setminus \{i\}} l\left( \|S(p_i - p_j)\| \right), \qquad (2.10)$$

where $S \in \mathbb{R}^{n \times \dim\{\mathcal{C}(Q)\}}$ extracts the position components. The distance $\|S(p_i - p_j)\|$ between agents $i$ and $j$ will also be denoted by $\rho_{i,j}$. Moreover, let

$$\mathcal{P}(x) = \left\{ p \in \mathcal{C}^N(\mathbb{R}^n) \mid \rho_{i,j} > x \;\; \forall i, j \in \mathcal{T},\; i \neq j \right\}.$$

From its definition, it is clear that if $v$ is shown to be bounded throughout the execution of the task, collisions cannot occur. To avoid collisions, each agent $i \in \mathcal{T}$ implements the following law:

$$u_{v,i}(t) = -\gamma\, g_i^T(x_i(t)) \left( \frac{\partial h_i(x_i(t))}{\partial x_i(t)} \right)^T \left( \frac{\partial v(p(t))}{\partial p_i(t)} \right)^T, \qquad (2.11)$$

with $\gamma > 0$, which drives it towards points where $v$ is lower, i.e. farther from the other agents. The law (2.11) will be added to other control input terms in order to guarantee the safe execution of tasks.

Observe that the collision avoidance control does not depend on the kind of coverage scenario, which makes it suitable both for Deployment Problems and for Effective or Persistent Coverage tasks, thus being as general as possible. Moreover, it is implicitly distributed; indeed,

$$\frac{\partial v}{\partial p_i} = 2 \frac{\partial v_i}{\partial p_i}, \qquad (2.12)$$

and $v_i \neq 0$ if and only if agent $i$ is at distance $r < \rho_{i,j} < R$ from some other agent $j$. In this case,

$$\left( \frac{\partial v_i}{\partial p_i} \right)^T = \sum_{\substack{j \in \mathcal{T} \setminus \{i\} \\ r < \rho_{i,j} < R}} 4 (R^2 - r^2)\, \frac{\rho_{i,j}^2 - R^2}{(\rho_{i,j}^2 - r^2)^3}\, S^T S (p_i - p_j), \qquad (2.13)$$

and all the information needed is directly obtainable by means of proximity sensors with a detection radius greater than or equal to $R$. Alternatively, it is sufficient that agents $i$ and $j$ exchange their poses (in particular their positions, due to the selection matrix $S$), provided that they are able to communicate at distances not smaller than $R$. From a practical point of view, the range $R$ should be small enough to guarantee communication with a certain degree of confidence.
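A minimal sketch of this repulsive term, assuming agents whose pose coincides with their planar position (so that $g_i$ and $\partial h_i / \partial x_i$ reduce to identities and $S^T S (p_i - p_j)$ is simply the position difference); names, gains and radii are illustrative:

```python
import numpy as np

def collision_gradient(positions, i, r=0.5, R=1.0):
    """(dv_i/dp_i)^T from (2.13): nonzero only for neighbours j whose
    distance rho to agent i satisfies r < rho < R."""
    grad = np.zeros(positions.shape[1])
    for j, pj in enumerate(positions):
        if j == i:
            continue
        diff = positions[i] - pj
        rho2 = float(diff @ diff)
        if r**2 < rho2 < R**2:
            grad += 4.0 * (R**2 - r**2) * (rho2 - R**2) / (rho2 - r**2)**3 * diff
    return grad

def u_collision(positions, i, gamma=1.0):
    """Collision avoidance input (2.11) for single-integrator agents;
    dv/dp_i = 2 dv_i/dp_i by (2.12), so the factor 2 appears here."""
    return -gamma * 2.0 * collision_gradient(positions, i)
```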

2.1.2 Coverage Control

The main idea of the coverage control is to drive the agents towards areas where more information is needed, i.e. such that the error function $\xi$ is properly decreased. Hence, a gradient-descent approach with respect to $\xi$ is used; namely, the implemented control law is

$$u_{\xi,i}(t) = -\beta\, g_i^T(x_i(t)) \left( \frac{\partial h_i(x_i(t))}{\partial x_i(t)} \right)^T \left( \frac{\partial \xi(t, p(t))}{\partial p_i(t)} \right)^T, \qquad (2.14)$$

with $\beta > 0$.

In case of Deployment Problems, the following Proposition extends the results shown in Section 1.2.4 to the more general case of agents with dynamics (2.1) - (2.2), with the guarantee of inter-agent collision avoidance.

Proposition 2.1. If the following conditions hold:

1. $\ker\left\{ g_i^T(x_i(t)) \left( \dfrac{\partial h_i(x_i(t))}{\partial x_i(t)} \right)^T \right\} = \{0\} \quad \forall i \in \mathcal{T}$, (2.15)

2. $p(t_0) \in \mathcal{P}(r)$,

3. $u_i(t) = u_{\xi,i}(t) + u_{v,i}(t) \quad \forall i \in \mathcal{T}$,

then the agents safely execute the assigned task until a minimum of the function $\xi(p(t)) + (\gamma/\beta)\, v(p(t))$ is reached.

Proof. Consider the candidate Lyapunov function

$$V = \xi(p(t)) + \frac{\gamma}{\beta} v(p(t)) \ge 0, \qquad (2.16)$$

which is zero if the task is completely accomplished and $p \in \mathcal{P}(R)$. Differentiation with respect to time yields

$$\dot V = \sum_{i \in \mathcal{T}} \left( \frac{\partial \xi}{\partial p_i} + \frac{\gamma}{\beta} \frac{\partial v}{\partial p_i} \right) \frac{\partial h_i}{\partial x_i}\, \dot x_i. \qquad (2.17)$$

The substitution of $\dot x_i = g_i u_i$ leads to

$$\dot V = \sum_{i \in \mathcal{T}} \left( \frac{\partial \xi}{\partial p_i} + \frac{\gamma}{\beta} \frac{\partial v}{\partial p_i} \right) \frac{\partial h_i}{\partial x_i}\, g_i g_i^T \left( \frac{\partial h_i}{\partial x_i} \right)^T \left( -\beta \left( \frac{\partial \xi}{\partial p_i} \right)^T - \gamma \left( \frac{\partial v}{\partial p_i} \right)^T \right) \qquad (2.18)$$

$$= -\frac{1}{\beta} \sum_{i \in \mathcal{T}} \| u_{\xi,i} + u_{v,i} \|^2 \le 0, \qquad (2.19)$$

and, by LaSalle's Invariance Principle, it can be concluded that convergence to the largest invariant set where $u_i = 0$ is ensured. Because of condition (2.15) and of the formulation of the control law, the swarm converges to configurations corresponding to local minima of $V$. Moreover, since $V$ is initially upper-bounded and $\dot V \le 0$, $V$ cannot grow to infinity and, therefore, the agents' safety is ensured.

Remark 2.1. The swarm no longer converges to local minima of the error function: in fact, the collision avoidance term prevents agents from coming too close to each other, even if this would imply a lower value of $\xi$. The relative importance of coverage with respect to collision avoidance can be chosen by tuning the gains $\beta$ and $\gamma$. As $\gamma/\beta$ approaches zero, the coverage task takes priority and the agents get closer to each other. The converse holds as the ratio increases.

2.1.3 Waypoint Guidance Control

As shown later, it is sometimes necessary to implement a motion control to drive an agent from its current position to a waypoint. All robots performing a waypoint guidance belong to the set W ⊆ T. If robot i has been assigned a waypoint w_i(p_i), the control law to reach it is gradient-based and aims at reducing the waypoint error function e_w : C_N(R^n) → R_0^+:
$$e_w(p) = \frac{1}{2} \sum_{i \in W} \left\| p_i - w_i(p_i) \right\|^2. \tag{2.20}$$

As a result, the waypoint guidance is given by:

$$u_{w,i}(t) = -\alpha\, g_i^T(x_i(t)) \left(\frac{\partial h_i(x_i(t))}{\partial x_i(t)}\right)^{\!T} \left(\frac{\partial e_w(p(t))}{\partial p_i(t)}\right)^{\!T}, \tag{2.21}$$


with α > 0. Note that if i ∉ W, u_{w,i}(t) = 0, while, if i ∈ W:
$$u_{w,i} = -\alpha\, g_i^T(x_i) \left(\frac{\partial h_i(x_i)}{\partial x_i}\right)^{\!T} \left(I_n - \frac{\partial w_i(p_i)}{\partial p_i}\right)^{\!T} (p_i - w_i(p_i)).$$
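As a rough illustration of how the three gradient terms compose into a single input, consider the following hedged sketch (names and the calling convention are illustrative, not prescribed by the Framework), which assumes the row gradients ∂ξ/∂p_i, ∂v/∂p_i and ∂e_w/∂p_i have already been computed and that the model matrices g_i and ∂h_i/∂x_i are available:

```python
import numpy as np

def control_input(grad_xi, grad_v, grad_ew, G, Jh, alpha=1.0, beta=1.0, gamma=1.0):
    """Sketch of the composite control law u_i = u_{xi,i} + u_{v,i} + u_{w,i}.

    grad_xi, grad_v, grad_ew : 1-D arrays holding the gradients of xi, v, e_w
                               with respect to p_i (grad_ew is zero if i is not in W)
    G  : input matrix g_i(x_i) of the driftless model  x_i' = g_i(x_i) u_i
    Jh : Jacobian dh_i/dx_i of the output map p_i = h_i(x_i)
    """
    M = G.T @ Jh.T   # common factor g_i^T (dh_i/dx_i)^T shared by (2.11), (2.14), (2.21)
    return -M @ (beta * grad_xi + gamma * grad_v + alpha * grad_ew)
```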

In [28] the following Lemmas regarding the Effective Coverage scenario are proved.

Lemma 2.1. If the following conditions hold:

1. $$\ker\left\{ g_i^T(x_i(t)) \left(\frac{\partial h_i(x_i(t))}{\partial x_i(t)}\right)^{\!T} \right\} = \{0\} \quad \forall i \in T, \tag{2.22}$$

2. γ = ω_c β, ω_c ∈ R^+,

3. p(t_0) ∈ P(r),

4. u_i(t) = u_{ξ,i}(t) + u_{v,i}(t) ∀i ∈ T,

then the agents explore the operational area until they reach at time t_d a safe configuration p_d ∈ P(R) such that:
$$e(t_d, p, q) \le 0 \quad \forall q \in \mathrm{supp}\{ d(p_d, q) \}. \tag{2.23}$$

If ξ(t_d, p) = 0, the Effective Coverage task has been fulfilled. If ξ(t_d, p) > 0, a so-called deadlock occurs and p_d is called a deadlock configuration. In this case, the team can recover by entering a perturbation mode described by Algorithm 1 in [28], which basically assigns a waypoint in an uncovered area to a subset of agents W ⊆ T. The following result is proven.

Lemma 2.2. Let condition (2.22) and γ = ω_p α, ω_p ∈ R^+ hold; then, if u_i(t) = u_{w,i}(t) + u_{v,i}(t) ∀i ∈ T, the agents in W reach their waypoints and no collisions occur during the perturbation mode.

A finite number of switchings between the two strategies leads to the total coverage of the operational area, as shown in Lemma 3 in [28].

Simulation Results

In this paragraph the effectiveness of the control policy will be presented through the results of a simulation involving N = 5 agents performing an Effective Coverage task. Two of them have the dynamics of a single integrator and are characterized by an isotropic Gaussian ADF (as described in Chapter 1) with parameters A = 3, σ_x = σ_y = 3; they are deployed at the opposite north-west and south-east corners. The remaining three agents carry an ADF whose field of view is limited (see Chapter 1). The parameters of these sensors are: A = 3, σ_x = σ_y = 3, k = 2, φ = π/2; they are all deployed in the south-west corner, at a safe distance. For all the agents, R = 1, r = 0.5, α = β = γ = 1. The initial TDF is equal to C* = 1 in all the points of the environment.

Figure 2.1 shows the decrease of the error function (normalized with respect to its initial value) over time, while Figure 2.2 illustrates the trend of inter-agent distances, proving that no collision occurs. Finally, the contour plot of the TDF in the environment is provided, for different instants, in Figure 2.3.


Figure 2.1: Normalized error function over time.


Figure 2.2: Inter-agent distances.

2.2 Computation of the Error Function Gradient

In [28] the error function ξ is minimized jointly by the swarm of agents. In particular, for each agent to operate, the knowledge of all the other agents' ADFs seems to be necessary to compute the cTDF appearing in (2.7), hence in (2.8). As explained in Chapter 1, the point of defining ξ in this way lies in making the Framework as general as possible, thus approaching different kinds of tasks in a unified manner. Such knowledge is implicitly a feature of a centralized architecture.


Figure 2.3: TDF evolution. (a) t = 0.5, ξ(t)/ξ(t_0) = 0.9832; (b) t = 25, ξ(t)/ξ(t_0) = 0.5408; (c) t = 250, ξ(t)/ξ(t_0) = 0.0719; (d) t = 400, ξ(t)/ξ(t_0) = 0.0016.



On the other hand, it is worth observing that, as far as the control term ∂ξ/∂p_i, which agent i needs to compute, is concerned, the knowledge of the ADFs of the whole swarm is not actually needed. Indeed, recall that:

$$\frac{\partial \xi(t,p)}{\partial p_i} = -\int_Q \frac{\partial f(e(t,p,q))}{\partial e}\, \frac{\partial d(p,q)}{\partial p_i}\, \sigma(q)\, dq \tag{2.24}$$
$$= -\int_{\mathrm{supp}\left\{ \frac{\partial d_i}{\partial p_i} \right\}} \frac{\partial f(e(t,p,q))}{\partial e}\, \frac{\partial d_i(p_i,q)}{\partial p_i}\, \sigma(q)\, dq, \tag{2.25}$$

which means that the TEF computation can be restricted to points q ∈ supp{∂d_i/∂p_i}. This implies that the information needed by agent i is only related to the ADFs whose support overlaps with supp{∂d_i/∂p_i}. Also, since supp{∂d_i/∂p_i} ⊆ supp{d_i}, a sufficient condition is that the overlap occurs with the latter. Let O_i be the set of agents, including i, carrying ADFs for which this condition is verified.

Now, assume that agents are able to exchange information with each other; for a thorough description of the conventions about inter-agent communication, see Section 2.4. Here, it is sufficient to suppose that two agents can communicate if they are within distance R_com of each other and that, if their ADF supports overlap, this condition is certainly verified. Such a condition is very loose and can be easily met in practical implementations thanks to the available long-range communication technologies.

If agents exchange their poses and ADFs’ parameters, they are able to compute in a distributed way a partial cTDF, taking into account the contribution of a subset of robots within communication range. Hence, agent i can locally compute a partial TEF:

$$e_i(t,p,q) = d^*(t,q) - \sum_{j \in O_i} d_j(p_j, q). \tag{2.26}$$

It is clear that the following equality holds:
$$\frac{\partial \xi(t,p)}{\partial p_i} = -\int_{\mathrm{supp}\left\{ \frac{\partial d_i}{\partial p_i} \right\}} \frac{\partial f(e_i(t,p,q))}{\partial e_i}\, \frac{\partial d_i(p_i,q)}{\partial p_i}\, \sigma(q)\, dq, \tag{2.27}$$

which leads to the conclusion that, even though ξ is not known to agent i, it is still able to compute its own coverage control term in a distributed fashion.
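As an illustration, the integral (2.27) can be approximated on a sampling grid of Q using only locally available data. The following sketch assumes a hypothetical agent interface (adf, adf_gradient, gradient_support_mask, cell_area), which is not prescribed by the Framework:

```python
import numpy as np

def coverage_gradient_local(agent, neighbors, d_star, grid, sigma, f_prime):
    """Sketch of the distributed computation of the gradient (2.27) via (2.26).

    agent     : object exposing adf(q), adf_gradient(q), gradient_support_mask(grid)
                and the grid cell area cell_area (hypothetical interface)
    neighbors : agents in O_i, i.e. those whose ADF support overlaps agent's,
                reconstructed from poses/parameters received over the channel
    d_star    : TDF values d*(t, q) sampled on the grid points
    grid      : (M, 2) array of sample points q of the operational area Q
    sigma     : weight sigma(q) sampled on the grid points
    f_prime   : derivative of the penalty function f with respect to e
    """
    mask = agent.gradient_support_mask(grid)        # q in supp{dd_i/dp_i}
    q = grid[mask]
    # partial cTDF: own ADF plus the ADFs of the agents in O_i
    coverage = agent.adf(q) + sum(nb.adf(q) for nb in neighbors)
    e_i = d_star[mask] - coverage                   # partial TEF (2.26)
    grad_d = agent.adf_gradient(q)                  # dd_i/dp_i, shape (len(q), dim_p)
    integrand = f_prime(e_i)[:, None] * grad_d * sigma[mask][:, None]
    return -np.sum(integrand, axis=0) * agent.cell_area   # Riemann sum of (2.27)
```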

Remark 2.2. In the case of Deployment Problems it is rational to assume that the time-invariant TDF d*(q) is known to the agents when they are deployed. As a result, the observations in this Section are sufficient to conclude that such tasks are easily executed in a distributed fashion and the performance is not affected by the partial knowledge of the agents. As regards Effective and Persistent Coverage, the switching policy and the time-varying TDF are still hurdles towards a total decentralization of the control. They will be the topic of Sections 2.3 and 2.4, respectively.


2.3 Local Switching Policy

Control decentralization for Deployment Problems has been obtained by observing that agents can easily retrieve all the information they need from neighbouring agents. Therefore, from now on, only time-varying tasks will be taken into account.

As explained in Section 2.1, when performing Effective Coverage, the entire swarm switches between a coverage mode, during which agents are controlled towards the direction of maximum decrease of the error function, and a perturbation mode, during which each agent moves to its own waypoint (if defined) and waits until all the others have reached theirs. Some observations are in order:

1. while in coverage mode, if u_{ξ,i} = 0, agent i is idle, i.e. it is not contributing to lowering ξ;

2. while in perturbation mode, if agent i has reached its waypoint, it will remain idle until the coverage mode is restored;

3. all agents must know the rate of change of ξ, so that they can simultaneously switch to the perturbation mode; conversely, they need to know when everyone has reached its waypoint in order to switch to the coverage mode.

Therefore, it is clear that, besides performing poorly, this approach presupposes a centralized knowledge of ξ and of its rate of change. Moreover, all the agents need to know instantly the state of all the others. The goal is now to design a new local switching policy: each agent autonomously switches between control modes, thus avoiding idle time spent waiting for a global deadlock or recovery. Some notes on notation are given hereinafter, followed by the discussion of the new distributed strategy.

Definition 1. An agent i is said to be safe if ∄ j ∈ T\{i} such that ‖S(p_i − p_j)‖ < R.

Let U(t) ⊆ Q be the set of uncovered points, i.e. U(t) = {q ∈ Q | I(t, q) < C*}, and Ū(t) = Q\U(t). Define a function w : C(Ū) → C(U) which associates to a configuration in the fully covered area a configuration in the uncovered area; hence, for p ∈ C(Ū), w(p) is said to be the waypoint associated to p. In the case Q ⊂ R², let such function be:
$$w(p) = \begin{bmatrix} \tilde q_x \\[2pt] \tilde q_y \\[2pt] \arctan \dfrac{\tilde q_y - p_y}{\tilde q_x - p_x} \end{bmatrix}, \qquad \tilde q = \begin{bmatrix} \tilde q_x \\ \tilde q_y \end{bmatrix} = \arg\min_{q \in U} \| Sp - q \|, \qquad p = \begin{bmatrix} p_x \\ p_y \\ \theta \end{bmatrix} \tag{2.28}$$

In other words, the waypoint associated to a covered point is the closest uncovered point and the angle is the one between the x axis and the line joining them.
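A possible discretized implementation of (2.28), in which the uncovered set U is represented by an array of sampled points, is sketched below (names are illustrative); arctan2 is used in place of arctan so that the heading is well defined in all quadrants:

```python
import numpy as np

def waypoint(p, uncovered_pts):
    """Sketch of the waypoint map (2.28) for Q in R^2: the closest uncovered
    point, with heading aligned to the line joining the agent to it.

    p             : pose [px, py, theta]
    uncovered_pts : (M, 2) array of sampled points of U
    """
    pos = p[:2]                                        # S p, with S selecting x, y
    dists = np.linalg.norm(uncovered_pts - pos, axis=1)
    q_tilde = uncovered_pts[np.argmin(dists)]          # arg min over q in U of ||Sp - q||
    theta = np.arctan2(q_tilde[1] - pos[1], q_tilde[0] - pos[0])
    return np.array([q_tilde[0], q_tilde[1], theta])
```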

In what follows, waypoints will be assigned in a subset U* ⊆ U according to the following procedure:



1. Initialize U* = U;

2. If a new agent i with pose p_i is assigned the waypoint w_i(p_i) in a position q̃_i, define B_R(q̃_i) = {q ∈ Q | ‖q − q̃_i‖ < R};

3. Update U* = U*\B_R(q̃_i).
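Reusing the waypoint map sketched above, the bookkeeping of U* might be implemented as follows (again a hypothetical sketch; U_star is the array of sampled points currently in U*):

```python
import numpy as np

def assign_waypoint(agent_pose, U_star, R):
    """Assign the closest point of U* as waypoint and remove the ball B_R
    around it, so that subsequent waypoints stay at least R apart."""
    w = waypoint(agent_pose, U_star)               # map (2.28), previous sketch
    q_tilde = w[:2]
    keep = np.linalg.norm(U_star - q_tilde, axis=1) >= R   # U* \ B_R(q_tilde)
    return w, U_star[keep]                         # waypoint and updated U*
```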

Therefore, waypoints are assigned such that they can be safely reached by agents. Now, define two subsets of T , namely W and I, with W ∩ I = ∅; W is the set of agents with an assigned waypoint and I is the set of idle agents. Consider the following rules for a generic agent i to switch between sets:

1. Safety: necessary condition for agent i to switch is that it is safe; this condition will be considered tacit in the remaining rules;

2. Waypoint assignment: if i ∈ T\W and, at time t_{w,i}, the following conditions hold:
$$\begin{cases} \dfrac{\partial \xi}{\partial p_i} = 0, \\[6pt] U^* \neq \emptyset, \end{cases} \tag{2.29}$$
then i switches to W; this condition will be called a local deadlock for agent i and, whenever it is verified, agent i is assigned a waypoint w_i(p_i(t_{w,i})) according to the previously described procedure;

3. Idle agent: if i ∈ T\(W ∪ I) and
$$\begin{cases} \dfrac{\partial \xi}{\partial p_i} = 0, \\[6pt] U^* = \emptyset, \end{cases} \tag{2.30}$$
then i switches to I;

4. Recovery: if i ∈ (W ∪ I) and condition
$$\frac{\partial \xi}{\partial p_i} \neq 0 \tag{2.31}$$
holds, then i switches to T\(W ∪ I);

5. Successive waypoint: if i ∈ W and
$$\begin{cases} \| p_i - w_i \| = 0, \\[2pt] U^* \neq \emptyset, \end{cases} \tag{2.32}$$
then i remains in W and is assigned a new waypoint according to the previously described procedure.
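These rules lend themselves to a per-agent state machine. The sketch below is one possible encoding, under the assumption that the boolean predicates (zero gradient, empty U*, waypoint reached, safety) are evaluated elsewhere; the actual waypoint (re)assignment on entering W is performed by the assignment procedure described above.

```python
from enum import Enum, auto

class Mode(Enum):
    COVERAGE = auto()   # i in T \ (W u I)
    WAYPOINT = auto()   # i in W
    IDLE = auto()       # i in I

def next_mode(mode, grad_xi_is_zero, U_star_empty, at_waypoint, is_safe):
    """One evaluation of the local switching rules for a single agent.
    Rule 1 (safety) gates every transition."""
    if not is_safe:
        return mode                                            # rule 1: no unsafe switch
    if mode is not Mode.COVERAGE and not grad_xi_is_zero:
        return Mode.COVERAGE                                   # rule 4, recovery (2.31)
    if mode is not Mode.WAYPOINT and grad_xi_is_zero and not U_star_empty:
        return Mode.WAYPOINT                                   # rule 2, local deadlock (2.29)
    if mode is Mode.COVERAGE and grad_xi_is_zero and U_star_empty:
        return Mode.IDLE                                       # rule 3 (2.30)
    if mode is Mode.WAYPOINT and at_waypoint and not U_star_empty:
        return Mode.WAYPOINT                                   # rule 5, new waypoint (2.32)
    return mode
```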
