Elenco delle tesi il cui relatore è "Grammatico, Sergio" - Webthesis

(1)

Master of science in Mechatronic engineering

Master Degree Thesis

Learning from humans to improve Socially-Aware motion planning

Supervisor

prof. Alessandro Rizzo

Candidate:

Giada Galati

identification number: 253202

External supervisor Technische Universiteit Delft

prof. Sergio Grammatico

(2)

This work is subject to the Creative Commons Licence

(3)

(4)

Acknowledgements

Ho riflettuto molto in questi ultimi mesi sul come sono riuscita a raggiungere questo traguardo. Fortuna? Talento? Costanza e dedizione? Ho provato a confermare ogni singola ipotesi cercando di trovare il connubio perfetto per spiegare ciò che stava accadendo. La fortuna è troppo imprevedibile e da sola sicuramente non sarebbe bastata per completare un percorso in ingegneria. Penso che sia lo stesso anche per il talento. Quest’ultimo sicuramente può essere una buona base per creare e ottenere qualcosa di ambizioso, ma ovviamente non basta.

Cosa dire della costanza e della perseveranza? Ritengo siano la chiave di ogni successo! Penso che senza la quotidianità e il duro lavoro non si possa raggiungere nessun risultato di qualità. Questa tesi è il risultato di un lavoro che ha impegnato gli ultimi 8 mesi del mio percorso universitario. 8 mesi di domande e dubbi, 8 mesi di profondi cambiamenti da tutti i punti di vista.

Ma ovviamente il quadro per spiegare il fenomeno non è ancora completo, sicuramente un ruolo fondamentale lo hanno avuto tutte le persone che hanno creduto in me e che mi hanno sostenuto in questi anni.

In primis vorrei ringraziare il Professor Rizzo, per avermi dato la possibilità di fare un’esperienza all’estero e per essere stato sempre molto disponibile.

Ringrazio il mio supervisor dell’università di Delft, il professor Sergio Grammatico, che è riuscito a indicarmi la giusta via quando il progetto era ancora agli inizi.

In ultimo ringrazio Stefano, che è sempre stato gentile e a disposizione per cercare di trovare la migliore soluzione ai nuovi problemi che si presentavano durante lo svolgimento del progetto.

Ma il ringraziamento più grande di tutti va ai miei genitori.

Loro hanno sempre creduto in me nonostante le scelte accademiche poco usuali per i loro standard. Loro non sapevano che tipo di percorso stavo intraprendendo ma avevano e hanno una fiducia incondizionata in me. Hanno sempre scommesso su di me, mi hanno sempre lasciata libera e penso che abbiano trovato la chiave vincente per fare uscire la parte migliore di me. Li ringrazio per il supporto morale che sono riusciti a darmi, soprattutto in Olanda, li ringrazio per aver messo in secondo piano le loro esigenze e per essere stati così forti da rinunciare ad avermi accanto nella quotidianità in nome del mio successo e della mia felicità.

Ringrazio mia madre per essermi stata sempre vicino nonostante la distanza.

(5)

Un ringraziamento speciale va anche a Peppe, che in punta di piedi è entrato nella mia vita e ha cercato in tutti i modi di starmi accanto volendomi bene e prendendosi cura di me fin dall’adolescenza.

Ma adesso passiamo agli amici che mi hanno fatto sentire a casa qui a Torino ma anche in Olanda.

Ad Aldo, il mio migliore amico. L’unico che mi supporta da quasi 11 anni.

L’unico che riesce a capirmi con uno sguardo e che nonostante la distanza e le esperienze completamente diverse riesce quotidianamente a confermare e a rafforzare la nostra profonda amicizia.

A Roberta, Nina e Carlo, amici di adolescenza su cui ho sempre potuto contare.

Ringrazio Chiara, coinquilina e amica sopra le righe. Vulcano di simpatia e uragano di emozioni. Penso che non dimenticherò mai le serate passate nella mia stanzetta in collegio a ridere, scherzare e a discutere sulla nostra quotidianità.

Ringrazio Giulia, compagna di viaggio di questo lungo e faticoso percorso universitario. Vicina di stanza perfetta e posticipatrice seriale di sveglie proprio come la sottoscritta.

A Betta, amica e coinquilina adorabile! Otrantina famosa per le sue doti culi- narie che ha esportato anche in Svizzera, dove adesso vive. Grazie per essermi stata accanto in questo viaggio nonostante la distanza.

A Francesco, amico e guida in questo percorso universitario. Amico sempre disponibile e brillante!

A Marco, compagno di abbuffate e di consigli molto particolari.

Ringrazio mia cugina Simona, per avermi supportato e consigliato sempre per il meglio. Un ringraziamento speciale va ai miei zii di Vigevano che mi hanno accolto nella loro casa facendomi sentire in famiglia anche qui nell’estremo Nord.

Ringrazio Sara, compagna di full immersion pre-esame degli ultimi 2 anni. Amica con cui ho condiviso gioie e dolori di tutto il percorso magistrale. Grazie per esserci sempre stata.

Ringrazio il mio amico Paolo, che con tanta pazienza mi è stato accanto durante l’esperienza Erasmus.

Ringrazio i ragazzi del primo e secondo piano del Collegio Einaudi, che mi hanno permesso di sperimentare la vita in comunità e che hanno risposto al questionario relativo a questo progetto, consentendomi di raccogliere i dati nel più breve tempo possibile.

In ultimo, non di certo per importanza, ringrazio i ragazzi del "Ground floor"

di Delft, che mi hanno permesso di conoscere realtà provenienti da tutto il mondo (dall’India all’Uruguay).

L’ultimissimo ringraziamento va al programma Erasmus. Si dice che questa

(6)

esperienza cambia lo studente e lo migliora. Io penso che l’Erasmus non cambia la studente ma migliora indubbiamente la consapevolezza del sé.

È un viaggio dentro se stessi ed è il miglior modo per conoscersi e scoprire parti nascoste della propria personalità che nella comfort zone non potrebbero mai venir fuori.

Grazie al programma Erasmus perché mi ha anche migliorato professionalmente.

(7)

List of Tables ix

List of Figures x

1 Introduction and motivation 3

1.1 Our approach . . . 6

1.2 Thesis organisation . . . 7

2 Related work 9 2.1 State of art of human motion prediction . . . 9

2.1.1 Reactive model . . . 10

2.1.2 Predictive planners . . . 11

2.1.3 Learning approach . . . 13

2.1.4 Game theory modelling . . . 14

2.1.5 Proposed approach . . . 15

2.2 Robot navigation . . . 16

2.2.1 Motion controller review . . . 16

3 Background on Game Theory 19 3.1 Overview. . . 19

3.1.1 Terminology . . . 19

3.1.2 Game types . . . 20

3.1.3 Nash equilibrium of Non-cooperative game . . . 21

3.2 Navigation example with game theory. . . 21

4 Model of human motion prediction 25 4.1 Social behaviour. . . 25

4.2 Proxemic. . . 25

4.3 Personal space . . . 26

4.4 Modelling navigation as a game . . . 28

4.5 Optimisation problem . . . 30

4.6 Constraints and objective function . . . 30

(8)

4.7 Motivation for the choice of a discrete approach . . . 36

4.8 Algorithm . . . 38

4.8.1 Function explanation . . . 40

4.8.2 Group of people . . . 41

4.8.3 Game solution . . . 45

4.8.4 Human-static object interaction . . . 45

5 Model validation and experiment 49 5.1 Human motion model validation with videos . . . 49

5.2 Test . . . 52

5.2.1 Evaluation method . . . 52

5.2.2 Experimental setup . . . 56

6 Test results 59 6.1 Second section test result . . . 61

6.1.1 Surveillance videos . . . 61

6.1.2 Humans with non-player robot. . . 62

6.1.3 Humans with player robot . . . 63

6.2 Third section test result . . . 64

6.2.2 Humans with circled non-player robot . . . 66

6.2.3 Humans with circled player robot . . . 67

6.3 Fourth section test result . . . 68

6.3.2 Human motion model. . . 69

6.4 Discussion . . . 70

7 Conclusions and future works 71

Bibliography 73

(9)

4.1 Space around a person considering social interaction according to the Hall’s study [13]. . . 26 6.1 General information collected from the participants. . . 59

(10)

List of Figures

1.1 Socially-aware navigation in a crowd of humans [49]. . . 4 1.2 During navigation, the robot should take into account: a) Personal

space; b) O-space of an interaction [41].. . . 5 2.1 a) Reactive based planning: the pedestrian changes its direction when

another human appears in his way; b) Predictive planning: the hu- man first predicts the motion of the other agent and then, it calculates its path considering the mutual avoidance manoeuvres in advance. . . 11 2.2 Example of human motion prediction in literature: a) Social Force

model; b) Linear; c) Probable state transition in a grid map; d) Us- ing potential field; e) Growing uncertainty; f ), g) Machine Learning technique [23]. . . 12 2.3 Overview of robot navigation: a)Player robot technique; b)Non-

player robot approach. . . 17 3.1 Example of human motion and interaction-aware navigation [49].. . . 22 3.2 a)Situation in Figure 3.1 modelled as static game; b)The action of

each player and the respective cost function is shown. If the two players collide the cost is infinite. The Nash equilibrium is circled [49]. 23 4.1 Situation in which humans respect personal space (blue circle)[41]. . . 26 4.2 Different forms of surrounding personal area: a) Concentric circles

[13]; b) Egg shape [15]; c)Ellipse shape [16]; d)Asymmetric shape [11]. 27 4.3 Resume of a dynamic game solution with two pedestrians. . . . 29 4.4 Weight target at different time instant. . . 34 4.5 Cost map of human in a static environment. Blue areas are with the

smallest cost, while yellow areas are with the highest cost, i.e. with a cost equal or greater than 1000. . . 35 4.6 Comparison between the performance of Knitro and fmincon [39]. . . 37 4.7 Simulation results: a) Unfeasible solution with Knitro; b) Feasible

solution with fmincon. In both graphs the axis identify the two- dimensional Cartesian space. . . 37 4.8 Simulation results in which agents do not intersect obstacles and other

pedestrians trajectories. . . 41

(11)

4.10 Simulation results: a) First estimation and group recognition; b)

Model solution. . . 44

4.11 In a) the first estimation with the obstacle and intersection recogni- tion; In b) the model solution with game theory. . . . 46

4.12 In a) the first estimation with the intersection recognition. In b) the model solution with game theory. . . . 47

4.13 Simulation results: a) First estimation and obstacles recognition; b) Model solution. . . 48

5.1 Validation with surveillance videos. On the left: the real trajectories are shown; On the right: Trajectories output of the game theory model. 50 5.2 Validation with surveillance videos. The scenario is the same described in Figure 5.1. . . 50

5.3 Validation with surveillance videos. The scenario is the same described in Figure 5.1. . . 51

5.4 Validation with surveillance videos. The scenario is the same described in Figure 5.1. . . 51

5.5 Training a) First section training; b) Training with arrows. . . . 52

5.6 Overview of the experiment. . . 53

5.7 Arrows that are moving as agents in the same training urban envi- ronment. a) Second and fourth section set-up; b) Third phase set-up. 54 5.8 a) Non-player robot behaviour (VFH+); b) Player robot behaviour. . 54

5.9 Robot in urban space: the yellow arrow, in both images, represents the robot position at the same instant of pedestrians (blue points). a) Non-player robot (VFH+); b) Player robot. . . . 55

6.1 Level of experience of the participants in robotics. . . 60

6.2 Distribution of participants’ age.. . . 60

6.3 Surveillance videos recognition. . . 61

6.4 On the left: recognition of the non-player robot; On the right: nat- uralness of the recognised non-player robot on a Likert scale from 1 (completely unnatural) to 5 (completely natural). Labels (a), (b) and (c) are used to identify the specific video. . . 62

6.5 On the left: recognition player robot; On the right: naturalness of the recognised player robot. Labels (a), (b) and (c) are used to identify the specific video. . . 64

6.6 On the left: recognition of the circled arrow as human; On the right: naturalness of the circled arrow on a Likert scale from 1 (completely unnatural) to 5 (completely natural). Labels (a), (b) and (c) are used to identify the specific video. . . 65

(12)

6.7 Each image row is labelled by a letter (it identifies the results of the same videos). On the left: the recognition of the circled non-player robot as human; On the right: the degree of naturalness of the circled arrow. . . 66 6.8 On the left: the recognition of the circled player robot as human; On

the right: the degree of naturalness of the circled arrow. The image row is labelled by a letter in order to identify easily the final results of each video. . . 67 6.9 Graphs show the percentage of naturalness for the player robot that

was classified as an artificial arrow for the three videos. . . . 68 6.10 On the left: arrows categorized as human; On the right: natural-

ness of arrows on a Likert scale from 1 (completely unnatural) to 5 (completely natural). . . 69 6.11 On the left: in orange is shown how many participants identified the

arrows’ movement as human motion; On the right: the naturalness of all arrows on a Likert scale from 1 (completely unnatural) to 5 (completely natural). . . 69

(13)

(14)

Abstract

In the near future, one of the most likely scenarios is the daily coexistence of humans and robots. This scenario will be witnessed, for instance, in assistive robots for seniors, cleaning robots, reception robots in public or private shops.

Although there already exist technological tools to perform a secure motion in a crowded environment, not too many authors have studied if a robot trajectory would be pleasant and accepted by humans. The thesis fits in this scenario and is focused on the trajectory generation for a mobile robot in order to be acceptable and not annoying humans.

How can we reach this ambitious goal?

Humans are more successful in planning a collision-free trajectory with mutual avoidance manoeuvres in a populated environment than any motion planning algorithm developed so far.

While humans can easily deal with predicting the motion of surrounding people, robotic systems are still facing problems.

Moreover, humans tend to attribute intentions and consciousness to non-human en- tity. In literature, this behaviour is named anthropomorphizing. Here, we leverage anthropomorphizing to improve human-robot interaction and, thus, raise the acceptance for humans. This is achieved by attempting to design the robot motion in a human-like manner.

For these reasons, we want to improve the robot navigation starting from the study of humans’ decisions.

The majority of the approaches concentrate their attention to predict human motion individually without considering the interaction between humans.

Here, we model the motion of humans considering the interaction between and with pedestrians using a game-theoretic approach.

Generally, many motion planning approaches at the state of the art have a reactive behaviour, i.e. they avoid obstacles without considering the prediction of human motion and the interaction with them.

Game theory has a lot of advantages over reactive methods, in fact, it can model the mutual anticipation of the influence of other agents and adapt their own decisions based on the possible actions of others.

In this thesis, non-cooperative game theory is applied to predict the decision of multiple humans that interact with each other during navigation.

Hence, the concept of Nash equilibrium in dynamic games is applied to solve our model.

In the last decades, some scientists studied human motion from a game-theoretic point of view, but their models comprised two people. Moreover, static obstacles

(15)

people and detecting groups of them, as well as evaluating patterns of natural in- teraction between humans and static objects.

The model has been validated with real-world surveillance videos, qualitatively com- paring real trajectories and the output of our model.

In the second part of the dissertation, we used the model previously developed, to create a human-like trajectory for an autonomous robot that navigates among multiple humans, considering the robot as a player of the game.

In the end, with a variation of the Turing test, we tried to evaluate quantitatively whether the final motion robot planning is socially acceptable for humans. This quality has been measured considering the human likeness of the trajectory gener- ated by our motion planner.

50 volunteers participated in our test and the collected data validates the proposed approach.

(16)

Chapter 1 Introduction and motivation

The adoption of robots in our daily life causes an important issue about the necessity of humans to interact with them. Generally, in industrial applications, robots are completely separated from the humans’ workspace, but this is not the case in the near future applications, where robots will share the same environment with humans.

In order to accomplish a diverse set of objectives, navigation is an essential task for autonomous mobile robots. As a matter of fact, to improve the collaboration with humans, the robot should ensure:

• human-likeness motion: i.e., the robot’s trajectory should follow a smooth behaviour similar to human motion;

• safe motion: i.e., the robot does not harm the human in a physical or psycho- logical way;

• reliable and effective motion: i.e., the robot executes the task adequately and reaches the goal considering its motion limit (i.e. minimum turning radius of the robot, maximum linear and angular velocities, etc.);

• interaction awareness: i.e., the robot anticipates the human mutual collision avoidance.

If a robot satisfies the above conditions, its interaction with humans becomes easier and more intuitive for humans.

In particular, our object is to design a robot’s motion that is acceptable for humans.

How to measure the social acceptance of a robot?

Humans attribute sometimes, intentions and consciousness to non-human agent [55].

The term to explain this behaviour is anthropomorphizing. In this thesis, we use anthropomorphizing to improve human-robot interaction [50]. This is achieved by attempting to design the robot motion in a human-like manner.

This perspective opens up important research opportunities.

Since the robot should move like a human and should “know” the intentions of

(17)

humans, a model of human motion behaviour is necessary to compute a robust socially-aware motion planning.

Definition 1 Socially-aware navigation is the strategy exhibited by a social robot which identifies and follows social conventions in order to preserve a comfortable interaction with humans. The resulting behaviour is predictable, adaptable and easily understood by humans [41].

What are the advantages of modelling human’s intentions from the robotic point of view?

A first benefit in predicting the human motion is that humans are no longer recognised as dynamical obstacles but as social entities that interact with other pedestrians.

In addition, the output of a human motion model can be used as a source of information that allows robot to adapt its motion according to the predicted human motion.

Robot can promptly respond in a safe manner (Figure 1.1). Moreover, robot can use their predictions of human motion to generate its own motion planning, in order to achieve a given task. In this way, the “intention” of the robot becomes more intelligible and natural to be predicted by humans [5]. Then, this, increases the social acceptance of robots in daily life.

Last but not the least, it may happen that the robot locks in a place (or executes unnecessary manoeuvres) while trying to avoid collisions with humans, since all available trajectories are unsafe, it remains stuck in a deadlock. The name of this situation is “freezing robot problem” (FRP), which can be avoided, for example, by implementing an socially-aware motion navigation [48].

Figure 1.1: Socially-aware navigation in a crowd of humans [49].

From the literature, it is possible to summarise the requirements that socially- aware navigation should have [23]:

(18)

1 – Introduction and motivation

• Respect personal space. When humans interact with others, they feel annoyed if others are too close or too far away from their own personal space (the proxemic interpersonal distance is shown in Table 4.1 and the personal space in Figure 1.2a).

In particular, if the distance to someone is excessive, it indicates a dislike.

In this regard, also the robot must keep an appropriate social distance with humans to avoid fear and discomfort.

• Respect activity spaces. Robot should avoid the space where humans can per- form actions. In the related work [29], there is not a precise definition of the shape of the activity space because it depends on the type of humans activities.

• Respect group of agents zone. In the literature, the area shared among in- teracting people is called O-space [20], as shown in Figure 1.2b. In general, the geometrical shape of the O-space depends on the posture and orientation of humans. In [20] the authors show that, generally, humans are placed in a circle.

• Avoid weird motion or noises that could cause a distraction for humans.

• Modulate the velocity based on the distance from humans or maintain the same speed similar to human walking speeds. We used the latter situation to increase the possibility of anthropomorphizing the robot as in [34].

Figure 1.2: During navigation, the robot should take into account: a) Personal space; b) O-space of an interaction [41].

The following section starts with the presentation of our human motion model and proceeds with the description of the requisites that have been implemented among of those presented above, to achieve the goal of the socially-aware navigation.

(19)

1.1 Our approach

Game theory is a branch of mathematics that is useful to model situations in which players make decisions that are interdependent with other participants. Indeed, game theory is a powerful tool to investigate optimal decision-making by rational players considering the other player’s possible decisions.

Before proceeding with the dissertation, for more clearness, it is necessary to un- derline that from now the terms player, pedestrian, person and human are used as synonyms.

Moreover, with the term agent, we denote any mobile entity, either human or robot.

This study uses a game-theoretic approach to model the decision process of pedes- trians in a dynamic environment. Players are considered as rational agents, i.e. they try to maximise their profits.

In general, we adopt a microscopic approach, i.e. we focus on predicting both the interaction and the decision processes between humans in a crowd without considering the overall movement of the crowd (macroscopic analysis).

In this dissertation the interaction-aware decision making is modelled as a non- cooperative, dynamic and non-zero-sum game (this nomenclature will be explained in more detail in Chapter3).

The pedestrians respect the other personal zones by maintaining a certain distance based on the relationship between players.

In our approach, we consider the achievement of the Nash equilibrium [35] as the solution of the game. The Nash equilibrium is the best response for all players, though we give a more formal definition of Nash equilibrium in Chapter3.

Generally, many human motion approaches at the state of the art have a reactive behaviour, i.e. they avoid obstacles without considering the human motion and the interaction with them [16, 17].

Our approach overcomes the reactive methods (see Section 2.1.1), since with game theory we can predict the motion of other pedestrians. Thus, each player can adapt its own decisions accordingly to the others strategies. In this way, our model is considered predictive.

Recently (see Section 2.1.4), some authors studied human motion from a game- theoretic point of view considering only two players, without focusing on the recognition of groups, and avoiding a static obstacle in a human-like way [49].

Here, we have reformulated the problem considering a different cost function, ex- tending the model considering multiple people and detecting groups of them, as well as evaluating patterns of natural interaction between humans and static objects.

The group recognition is important because if a robot identifies a group, it avoids to move through or too close to it, thus, increasing the quality of the interaction with humans.

(20)

1.2 – Thesis organisation

Based on the prediction from our model, we created a socially-aware motion planning for an autonomous robot to navigate in a crowd. Thus, since our trajectory is inspired by our human motion model, during navigation the robot satisfies some of the requirements that we described in the previous section. In particular, the robot does not invade the human personal zone and conserves the same speed as a human.

Moreover, the robot avoids weird motion because the cost function has a term con- nected to the smoothness of the final trajectory.

The respect of the activity space is out of the scope of this dissertation, since we are studying human motion in urban space, without involving specific activities.

About the recognition of the O-space shared by a group of people that are stuck in a place, the robot identifies that group as a static obstacle, thus, it will maintain a certain safe distance and it will avoid to pass through it.

In a nutshell, the final objects of this thesis are:

1. Modelling the intention of humans in populated environment using game theory.

2. Improving existing motion planning for mobile robots, based on the model of human behaviour described in the previous point.

1.2 Thesis organisation

The rest of the thesis is organised as follows.

Chapter 2 reviews models of pedestrian motion and the state of the art of reactive robot navigation. The latter, will be used to implement the non-player robot for the videos that we designed for the final experiment.

Chapter 3 illustrates the preliminaries on game theory. The detailed description of our model is shown in Chapter 4. In Section 5, we discuss and validate the game theoretical model through real-world surveillance videos. Besides, we describe the setup and details of our variation of Turing experiment used to validate the human motion model and the human likeness of our socially-aware motion planner.

Chapter6reports results coming from the collected data analysis of our experiment.

Finally, Chapter 7 draws the conclusions of this dissertation and devises future avenues of research.

(21)

(22)

Chapter 2 Related work

When a mobile robot moves in a dynamic scenario, where humans are goal-directed, the sensors perceiving the environment and detect the human positions.

If the robot estimates the human motion with a realistic model, the robot can navigate safely and comfortably with humans.

Thereby, a performing socially-aware navigation is strictly connects to a deep research in human motion models.

2.1 State of art of human motion prediction

Lots of modelling and simulating pedestrian motion have been developed in the last years in different areas to reach different objects such as: simulating evacuation movements to design a safely public or private building, human motion analysis for socially-aware robot navigation or gaming computer animation.

In general, the research studies about predict human decision in a crowd can be divided into two big approaches: macroscopic and microscopic.

The macroscopic approach is normally used in the case of large crowd considering the group of people as a whole. The macroscopic model is less computationally intense than the microscopic model, because it considers fewer details among indi- viduals interactions and between individuals and the environment.

On the other hand, the microscopic description focuses on predicting the interac- tion and the decision processes between humans in a crowd without considering the overall movement of the crowd.

In this dissertation, since one of the goals is find a good model that is able to pre- dict the human motion in a crowd, we will take into account only the microscopic approach.

We present in the following section, the state of the art of human motion prediction considering an overview of the reactive models, predictive planners, learning-based strategies and game theory models.

(23)

In the scientific literature, there is not a stringent separation between the geometric reasoning approach and the learning technique, because both models have things in common and can be combined to obtain a hybrid approach.

Nevertheless, in this thesis, for simplicity, we try to divide the publications considering the approach they focus on.

For each class, we consider the advantages and disadvantages and we motivate the choice of game theory.

In the end, we present an overview of robot navigation because it will be useful for the test explained in Chapter5.

2.1.1 Reactive model

Pioneering work of modelling humans as reactive particles is the social force method [16]. In this model, the agents navigate in the environment considering attractive forces that guide them towards the destination and repulsive forces that ensure collision-avoidance depending on their relative distances (Figure 2.2a). It generates plausible patterns about global motion. On the other hand, in local level, individual trajectory is not human-like.

In most of the available research [2], the scientists assume a static world, where the human prediction is simplified with the assumptions of proceeding to walk straight with current direction and speed (Figure2.2b).

One of the well-known models based on grid motion decisions is the Cellular Au- tomaton (CA). This technique uses a discrete representation for the environment and the human decision motion. The first research about CA is conducted by Ta- dokoro et al. [46]. Human motion is predicted using a probability distribution maps of movement in the near future, considering the status of neighbouring cells. The main idea is visualised in Figure 2.2c, where the darkness of grid maps represents the likelihood of the transition.

Schadschneider [44] proposes the concept of floor field to improve the CA technique.

Starting from the CA grid map, Schadschneider adds a second grid of cells that can be static or dynamic. The first one does not depend on time and it is used to model the most attractive region in the map (for instance an emergency exit). On the other hand, the dynamic floor is adopted to take into account the interaction between agents.

In contrast to this discrete approach, there are several instances of continuum models. For example, Hoeller et al. [17] define the human motion as the combination of attractive and repulsive potential fields. The first one guides the pedestrian to the possible destination and the other one is useful to avoid collision with obstacles.

Since the true destination of a person is unknown, Hoeller defines an attractor that guess the possible destinations (Figure 2.2d).

The models examined so far consider the pedestrians as passive subjects that

(24)

2.1 – State of art of human motion prediction

react only to external forces (Figure 2.1a). In this way, they ignore the interactive nature of humans. Indeed, pedestrians are active agents that, during navigation, repeat decision-making progress based on the surrounding humans prediction. This aspect is taken into account in the predictive planner (as shown in Figure2.1b) and in the game theory models.

Figure 2.1: a) Reactive based planning: the pedestrian changes its direction when another human appears in his way; b) Predictive planning: the human first predicts the motion of the other agent and then, it calculates its path considering the mutual avoidance manoeuvres in advance.

2.1.2 Predictive planners

Paris et al. [36], based on Fiorini and Shiller [8] approach, create a velocity-based model that improves the Fiorini’s work. The main advantage of Paris’ technique is the resolution of the oscillation between velocities due to the lack of the anticipation of the surrounding agents. With this method, the reference entity is not repelled by neighbouring pedestrians and static object, but humans actively find a free path through the crowd. The considered agent computes the path planning evaluating:

the reachable space regarding all directions and a limited set of velocities, simultaneously, it researches for the possible collision with the surrounding pedestrians, thus, it finds the optimal path for the near future.

Then, Karamouzas et al. [19], based on the Paris’ approach, tried to reduce the computational effort focusing more on upcoming collisions.

In the last year, Warren [54] builds a model where the pedestrian motion is computed considering the superposition of 3 frameworks: the closest, the furthest and

(25)

Figure 2.2: Example of human motion prediction in literature: a) Social Force model; b) Linear; c) Probable state transition in a grid map; d) Using potential field; e) Growing uncertainty; f ), g) Machine Learning technique [23].

the intermediate zone.

The closest pedestrians area is the repulsion zone that corresponds to the personal Hall space [13], the furthest area is the zone where pedestrians move toward neighbours, instead, the intermediate zone is the alignment area where the agent matches the speed and the heading directions of neighbours.

In continuous models, the increasing prediction uncertainty is commonly represented as in Figure 2.2e. In this regard, Trautman et al. [48] study dense human crowds and evaluate the pedestrian motion considering the Gaussian process. The authors start from the "freezing robot problem", where the motion planner cannot compute safe decision because the environment is too crowded, thus, the robot freezes in a place or performs unnecessary motions. In order to solve this problem, the robot predicts human cooperation with interacting Gaussian processes.

The predictive methods adhere to the rules of human motion navigation. For this reason, predictive humans planners are more feasible and incorporate much social information than the reactive one.

On the other hand, the pedestrian prediction requires much computational effort, especially when the crowd density becomes significant.

Furthermore, since humans take stochastic decisions, the long-time prevision is significantly uncertain.

(26)

2.1.3 Learning approach

In the previous approaches, predictions are based on how pedestrians behave in general. In learning strategy, previsions are estimated considering the observations of humans individually in particular situations and environments.

Recently, the machine learning technique is becoming significantly popular due to the advantage of improving over time and adapting to special circumstances.

The greatest computational effort occurs offline with the training process that needs a large amount of data, especially when the number of pedestrians increases.

Bennewitz et al. [3] create an algorithm that has a set of trajectories as input. As start and target points, they identify the so called "resting place", where the pedestrians normally stop and stay for a certain amount of time.

The model learns observing human motion and clustering the trajectories into a set of motion patterns through the use of Expectation-Maximisation algorithm (Figure 2.2f).

Foka et al. in [9] forecast trajectories and velocity of humans using neural network approach. This technique is useful to predict non-linear behaviours for one step ahead prediction (Figure2.2g).

More recent works have developed models with a recurrent neural network (RNN) to predict future human action as in [45]. However, this approach does not consider the behaviour of other people in proximity, for this reason, Alahi et al. [1] propose a social LSTM (Long short-term memory) that combine the forecasting of all agents inside a crowd and the common sense rule in a shared environment with a "Social pooling of hidden states". Nevertheless, this method evaluates pedestrian prediction near the considered human without analysing interactions between all agents.

Furthermore, Alahi et al. have as output a single trajectory that is the "average behaviour".

In this regards, Gupta et al. [12] try to overcome these restrictions and build a model with Generative Adversarial Networks (GANs), that is also more competitive about computational complexity.

Liang et al. [27] outperform the social LSTM model and the social GAN, forecasting the future path and the possible activity in videos.

The main advantages of the machine learning technique are the natural behaviour trajectory because the model is trained using real trajectory data, and the capability to be accurate also in a complex crowded settings.

Learning structure incorporates the influence of other pedestrians considering the forecasting state of other agents (interaction-awareness).

Nevertheless, the main problem is the scenario generalisation, indeed if the environment changes, the model should be trained again. This approach should deal with the generalisation problem efficiency.

(27)

2.1.4 Game theory modelling

The game theory has been tested in different scenarios to demonstrate the capability of modelling cooperative behaviour of rational decision players. Some applications of game theory are: biology, computer science, psychology, economics, political science, evacuation process, electric power market, etc.

Despite the corroborated ability of the game theory to model different types of situations, in field of human motion prediction the literature is limited.

The pioneer of human motion prediction with the game theory is Hoogendoorn et al. [18]. The study analyses the pedestrian navigation in a simulated environment.

In particular, the authors adopt differential game to model the human motion and describe pedestrians as optimal feedback oriented controller that tries to reach their goals minimising the cost of navigation, supposing the motions of the other agents.

However, the game solution is not an equilibrium but is computed as an optimal control problem based on a pedestrian cost function that takes into account also the running cost of the other agents.

In another study, game theory is joined with Cellular Automata [47] and after some years, Mesmer and Bloebaum [32] present a model where human decision navigation is modelled considering game theory and velocity obstacle. Though, both mix methods are modelled for an emergency evacuation situation.

Turnwald et al. [49] analyse the interaction during human navigation in a microscopic way. They study five different cost functions for a non-cooperative game and evaluate the best result with a real experiment (the experimental players are only two). The most suitable cost function is related to the length of the trajectory.

However, the main framework of this work is the examination and validation of the Nash equilibrium for human motion. In other words, they demonstrate that all players choose the trajectory that generate a Nash equilibrium.

More recently, Ma et al. [31] combine the game theory and the deep learning approach to forecast future trajectories of multi-agents. The authors used the Brown’s fictitious game to predict the long-term navigation considering the interaction with the other players, instead, to customize the pattern for each pedestrian they used the learning approach from a single image.

A similar procedure is adopted in [40], where pedestrian motion is modelled as two games: the first is played with the closest agent in the visibility zone and the other one with all surrounding humans modelled as the learning-based game. The game with the nearest person is modelled as a static, non-cooperative and non-zero-sum game. The method is validated considering a microscopic and macroscopic approach.

In another work, Roy et al. [43] investigate the avoidance technique of two interacting pedestrians with the Fokker-Plank Nash game. The differential game is solved considering the scenario as an optimal control problem, namely, each player tries to minimise the collision cost function.

So far, we have shown an overview of methods considering the interaction with a

(28)

limited number of players. If the amount of pedestrians increases, the computational cost will become extremely hard to manage. To surmount this problem, some works adopt the mean field game to handle a big crowd as in [6] and [37].

2.1.5 Proposed approach

Our goal is to study and model the human motion based only on game theory.

Game theory has a lot of benefits over the methods presented above. Specifically, game theory overcomes the reactive method considering the possible pedestrian motion in advance. The learning approach is notably promising because can customise the trajectory of each pedestrian collecting human motion data from the sensor. This is at the same time a downside, because the model is not versatile for all scenarios but it depends on trial data. With game theory, human motion is not customised for each player, but the method is adaptable and can be used in a new environment without passing through the training data.

The exclusive work that could be compared with our scenario is the Turnald et al [49] method. Nonetheless, we overcome that model because we increase the number of players, that in [49] were two, and we build a different cost function. In addition, we also modelled the interaction between humans and static object considering the avoidance in a natural (i.e. in a human-like) way.

In our model, we consider also the group recognition that improves the computational performance of the game theory algorithm. The last two features are the main differences between the Turnald’s work [49].

(29)

2.2 Robot navigation

Typically, the robot navigation can be summarised as follows:

• Localisation: is the method of figuring out where the robot is on the map;

• Mapping: is the process where the robot creates the map of the environment, starting from the knowledge of its pose (localisation);

• Planning: the process where the robot finds a sequence of valid configurations in order to reach its final goal;

• Motion control: moves the robot considering the path planning as a reference, to reach the given goal;

Robots use active localisation that combines the measurements from odometry (proprioceptive sensor) and from exteroceptive sensors through probabilistic filters.

The most famous approaches are: extended Kalman filter and Particle filters (Monte Carlo Localisation methods). About mapping is possible to use two main methods:

landmarks and occupation grids. The first one builds stochastic maps with a probabilistic description of static obstacles. Indeed, occupation grids create a map where each cell is associated with the occupation probability.

In general, when a robot is moving in an unknown environment prefers to not solve the two problems separately but concurrently.

For this reason, in most cases, robots use SLAM (Simultaneous Localisation And Mapping).

The analysis of localisation and mapping is out of this work.

We did the hypothesis that the robot has the MAP with static obstacle before it starts the navigation. Then, the robot uses the SLAM technique to localise itself and map the positions of dynamic obstacles (Figure 2.3a) and finally, the robot utilises our socially-aware motion planning.

On the other hand, we study also the non-player robot navigation (i.e. the reactive motion) to compare the two scenarios with a variation of the Turing test (Figure 2.3b).

The next section gives an outline of the state of the art motion controller for robotics, that we will use for the implementation of the non-player robot.

2.2.1 Motion controller review

The complementary framework of path planning with the motion controller (Figure 2.3a) is only the obstacle avoidance problem (Figure 2.3b).

The final goal is to reach the robot target considering the obstacles detection with sensors. In this way, the robot adapts its motion in a local manner.

(30)

2.2 – Robot navigation

Figure 2.3: Overview of robot navigation: a)Player robot technique; b)Non-player robot approach.

One of the first related work for motion controller problem is developed by Khatib [21] with time-varying artificial potential field to avoid obstacles in real-time.

The robot navigates toward the target considering the superimposition of two types of potential field: repulsive from the obstacle and attractive from the goal pose.

In [30], Lumelsky et al. create a Bug algorithm where the robot follows the perime- ters of the obstacles if the latter are in the way toward the final target.

Boreinstein et al. [4] use the Vector Field Histogram algorithm (VFH), to compute a polar obstacle density histogram over the sensor angular sector. To select the final output direction, the algorithm studies each histogram and compares them with a given threshold. The set of available directions (candidate valley) are the histograms below the threshold and the final strategy is chosen considering three heuristics that depend on the target position.

The main problems of this technique are: the oscillations between positions in the case of environments with a lot of narrows, the absence of smooth trajectory and the neglect of the robot dynamics and kinematics.

To solve these problems, after some years, the same authors develop VFH+ [51] and VFH* [52] that are enhanced versions of VFH.

In VFH+ algorithm there are two phases: first, it is necessary to read from the sensor the angles and the distance between the obstacle and the actual position of the robot to build a polar histogram for obstacle locations. In the second phase, the algorithm generates a masked histogram based on two hysteresis thresholds that solve the indecisive robot behaviour of the basic VFH.

The VFH+ algorithm considers a set of steering directions based on a cost function

(31)

(based on the smoothness of the navigation and the goal-distance) and chooses the free direction with minimal cost. It is possible to set also the Robot radius that takes into account the robot width.

Further work, related to obstacle avoidance, is proposed in [10] with the Dynamic- Window Approach (DWA), where the motion commands are selected from the space of velocities.

In [33], Minguez et al. develop a motion controller able to have a good performance in an unknown and cluttered environment that is improved in [7].

So far we have considered only motion controllers that have local approach to the environment. This is a limitation because an optimal local solution does not guarantee the best solution for the final general path.

For this reason, some authors try to close the gap between global path planning and local motion control. For instance, Quinlan and Khatib [38] present the Elastic Band approach, where the algorithm deforms in real-time the general path planning subject to internal and external force to obtain a smooth path and an appropriate distance from obstacles. Rosmann et al. [42] expand that method with the Timed- Elastic-Band.

However, the reason behind the research about motion robot control is to find a reasonable state of the art obstacle avoidance to do the comparison with our socially- aware trajectory planning.

We chose the VFH+ algorithm to implement the non-player robot because there is a good trade-off between performance and ease of implementation.

As a matter of fact, in two parts of our experiment, we showed to volunteers some videos with our player-robot and non-player robot to understand if they recognise the difference or not.

(32)

Chapter 3 Background on Game Theory

The purpose of this thesis section is to give to the reader the main concepts about game theory, in order to work out the human motion model developed in Chapter 4. For completeness and clarity, at the end of this chapter is given also an example of a simple navigation model with game theory.

3.1 Overview

Game theory is a vast discipline that has been studied for decades and has been used in different fields as: economics, politics, biology, evacuation process, military strategy, human motion forecasting, etc.

Game theory creates a mathematical model to study the strategy of rational decision- maker. The pioneer on this field was Von Neumann that published in 1928 the general theory for solving the zero-sum cooperative game. This work has been improved in 1944 with [53]. Nevertheless, the most famous concept about game theory was developed in 1950 by John Nash [35] with the Nash equilibrium theory in non- cooperative game.

In the following section there is the most widely used terminology in game theory useful to understand the following chapters.

3.1.1 Terminology

In a game, the participants are the players. Each player has an action set that is used to make the decision during the game. The number of times in which the player are called to make decision are the stages. The information about all players in that particular stage is the state of the game.

Furthermore, a strategy in game theory, means a procedure in which the player decides what to do for all situations throughout the game. In other words, when the game starts, the player specifies the action that will take for each possible situation

(33)

of the game. In particular, the specification can be deterministic (pure strategy), or the specification can be gives as a probability distribution for every action of the action set (mixed strategy).

For each player is defined a cost function that guides the choice of a certain action during the game. The players have a rational behaviour, it means that they try to maximise their profit (i.e. minimise the expected cost).

To study the interaction among players there are a different class of games that are presented in more details in the following section.

3.1.2 Game types

Games can be classified according to certain attributes, but in the following, we present only a summary of the common types of games [26] that we will use to model the interaction and the behaviour of pedestrians in a crowded environment.

Non-cooperative or cooperative:

In non-cooperative games the focus is set on the individual player that tries to max- imise the own profit. It does not mean that the players do not cooperate, but they collaborate if the coalition can help them to reach their individual interest [26]. If the game leads to a situation where all players maximise their profits, the game reaches the equilibrium point, that in non cooperative game is the Nash equilibrium.

On the other hand, in cooperative games the unit is the group and the players put the interest of the coalition before their own.

Zero-sum or Non-zero-sum

In zero-sum game the gain of one player corresponds to the same amount of loss of the other player. Thus, the sum of the payoffs of all players is zero. This situation is the most extreme circumstance of conflicting interest.

Meanwhile, in non-zero-sum game the player not wins the same amount that the other player loses.

Static and dynamic game

In the static case the decisions of all players are taken simultaneously. Considering the pedestrians scenario, if the condition is modelled as static game, the agents ob- serve the situation and react instinctively. Instead, in dynamic game, the decisions are taken sequentially [49]. In this last condition, it is necessary to describe the amount of information that each player has about the current and previous state of the other players.

Perfect information game:

If each player, when makes a decision, knows the previous actions of all players, the

(34)

3.2 – Navigation example with game theory

game is called perfect information game.

Finite game:

If the number of players and the corresponding actions are limited the game is called finite.

3.1.3 Nash equilibrium of Non-cooperative game

The most famous solution for non-cooperative games is the Nash equilibrium concept [35]. The equilibrium represents a game strategy where all players find a balance between the self interests of all the agents. In other words:

"A Nash equilibrium is a combination of strategies where no agent can reduce its own cost by changing its action if the other agents stick to their actions. A Nash equilibrium is the best response for everyone [50]."

From now on we indicate the Nash equilibrium with an asterisk s^j∗_i = (s^j∗₁ , ..., s^j∗_N)

this is the combination of the optimal strategies of all payers (subscript i). The superscript j refers to the stage of the game.

In mathematical terms it is possible to define the Nash equilibrium as follows:

the N-tuple (a set on N items, where each item is correlated to a different player [35]) of strategies s^j∗_i is a Nash equilibrium if the following N inequalities are satisfied [49]:

J₁(s^j₁^∗, s^j₂^∗, ..., s^j_N^∗) ≤ J₁(s^j₁, s^j₂^∗, ..., s^j_N^∗) J₂(s^j₁^∗, s^j₂^∗, ..., s^j_N^∗) ≤ J₂(s^j₁^∗, s^j₂, ..., s^j_N^∗)

...

J_N(s^j₁^∗, s^j₂^∗, ..., s^j_N^∗) ≤ J_N(s^j₁^∗, s^j₂^∗, ..., s^j_N) where:

J_i is the cost function for the i-th player.

3.2 Navigation example with game theory

One of the simplest navigation decision problem scenario of two players is shown in Figure3.1. Two pedestrians, P₁ and P₂, walk toward each other. In this example, for simplicity, the game is modelled as static, non-cooperative and non-zero sum.

(35)

Figure 3.1: Example of human motion and interaction-aware navigation [49].

For each player (P_i) in Figure 3.2a is represented the action set (a¹_i, a²_i..., a⁵_i) for the stage j and the relative cost for each trajectory. The cost for each action (in this case coincide with a trajectory), is computed considering the length of the path and crossing right is more convenient than passing left. If the two trajectories collide the cost is infinite (Figure 3.2b). Each cell of the table in Figure 3.2b contains the cost pairs J₁(a^k₁) | J₂(a^k₂). According to the definition in Section 3.1.3, it is possible to compute the Nash equilibrium at the j-th stage of the game as:

J₁(s^j₁^∗, s^j₂^∗) = min(J₁(a^k₁, a^k₂^∗)) J₂(s^j₁^∗, s^j₂^∗) = min(J₂(a^k₁^∗, a^k₂))

In this example, we find four Nash equilibrium that is highlighted with a circle in Figure3.2b.

How to calculate the Nash equilibrium by looking only at the table?

At the same time 2 conditions should be observed:

• the cost column J₁ is less or equal than all the other same column cells.

• the cost row J₂ is less or equal than all the other same row cells.

This scenario was tested in [49] with a real experiment and the result was that the two players during navigation adopted one of the Nash equilibrium solutions.

Nevertheless, this example originates the question about which equilibrium a player should choose. It is possible to select the equilibrium trajectories reasonably. For instance, for each player is computed the Nash equilibrium at each time step. In order to select in the current time (t) the right equilibrium, the player should con- sider the set of Nash equilibrium in the previous time step (t − 1) and in the current time (t) but also the set of observed trajectory in the previous time step (t − 1).

First of all, the player considers which observed trajectory is similar to the Nash

(36)

3.2 – Navigation example with game theory

Figure 3.2: a)Situation in Figure3.1 modelled as static game; b)The action of each player and the respective cost function is shown. If the two players collide the cost is infinite. The Nash equilibrium is circled [49].

equilibrium in the previous time step, then, compares the chosen equilibrium with the equilibrium set in the current time.

The main problem of this approach is the computation and evaluation of all equilibrium each time for all players. For this reason, we modelled the human motion behaviour considering a sequential best response, it means that the game is dynamic and the solution is calculated considering the current observation of all agents and, thus, the Nash equilibrium concept (this concept will be explained in the Chapter 4).

The best response strategy in our model is unique and overcomes the static approach.

(37)

(38)

Chapter 4 Model of human motion prediction

4.1 Social behaviour

Socially-aware planning combines the typical information used to perform the motion planning (composed by the abstraction of the environment coming from the proprioceptive and/or exteroceptive robot’s sensor) and the social conventions con- nected to the society in which the robot will move.

Definition 2 The social conventions are behaviours created and accepted by the society that help humans to understand intentions of others and facilitate the com- munication [41].

In order to generate a safe and comfortable strategy for robot navigation in a populated humans environment, it is essential to consider the main aspect of the social conventions (Definition2) during human motion.

In this regard, it is interesting to explore, during navigation, how humans manage their surrounding area and, thus, the distance that human mutually respects to prevent emotional discomfort.

4.2 Proxemic

The concept of "Proxemic" was proposed by Edward T.Hall in [13] for human-human interaction scenario.

Definition 3 Proxemics is the study of spatial distances individuals maintain in various social and interpersonal situations. These distances vary depending on en- vironment or cultural factors [41].

Hall [13] studied the existence of particular unwritten rules that humans adhere during the interaction depending on their relationship. He observed different social distances that individuals maintain from others, as shown in Table 4.1.

(39)

Zone Distance Intention or Relationship Intimate ≤ 0.45 m Embracing, touching, whispering Personal 0.45 - 1.2 m Friends

Social 1.2 - 3.6 m Acquaintances and strangers Public >3.6m Public speaking

Table 4.1: Space around a person considering social interaction according to the Hall’s study [13].

Based on the Theory of Mind [22], during navigation humans maintain a certain space for themselves as those they imagine others would prefer.

It should be noted that our human motion model takes into account the proxemic for all players, groups included.

4.3 Personal space

Definition 4 A personal space is the region around humans that they actively main- tain into which others cannot intrude without causing discomfort [14].

An example of people that respect the individual personal space is shown in Fig- ure4.1, in which the personal space is illustrated using a blue circle and considering the Hall’s theory.

Figure 4.1: Situation in which humans respect personal space (blue circle)[41].

In the scientific literature different shapes of personal space (Figure 4.2) have been proposed.

Concentric Circle (Figure 4.2a). As shown in Table 4.1, it is possible to classify the space around a human in four specific zones. It is necessary to highlight that

(40)

4.3 – Personal space

the distances between people are not rigid and vary with the culture, age, type of relationship and context. Indeed, in some cultures the physical contact is avoided, instead of others that are more tolerant. For this reason, it is very important to underline that the zones proposed by Hall are refereed to US citizens.

Egg shape (Figure 4.2b). Humans are more exigent regarding the respect of the frontal area. Thus, the invasion of the frontal zone is considered as uncomfortable [15].

Ellipse shape (Figure 4.2c). One of the most famous approach to represent human motion behaviour is the Social Force Model [16]. In this model, there are two types of forces: attractive and repulsive. The first one guides humans to the destination. On the other hand, the repulsive force is used to avoid collisions between pedestrians. The potential repulsive force is modelled as a monotonic decreasing function with equipotential lines having the form of an ellipse directed to the direction of motion.

Asymmetric shape (Figure 4.2d). After physical experiments and virtual simula- tions, the researchers in [11] concluded that the size of the personal space does not change with the walking speed during the circumvention of a static obstacle. In particular, the personal zone is smaller in the pedestrian’s dominant side.

Figure 4.2: Different forms of surrounding personal area: a) Concentric circles [13];

b) Egg shape [15]; c)Ellipse shape [16]; d)Asymmetric shape [11].

In our human motion model we designed the personal space as a concentric circle (Figure 4.2a) considering a social zone greater than 1.2m diameter, as shown in Table 4.1.

(41)

4.4 Modelling navigation as a game

During navigation, human observes the environment and reacts to people and generic obstacles, taking sequential decisions. To model this condition, we use a dynamic game. In fact, in this scenario the player chooses the strategy after observing the actions of other players.

It is necessary to highlight that each player has perfect information, i.e. knows the current and previous actions of all players.

Each pedestrian wants to reach its own goal, thus, the game is non-cooperative.

Further, if a pedestrian "wins", it is not necessary that another player loses the same amount, hence, the game is non-zero-sum.

Each player has a finite number of actions included in the action set. The proposed approach uses an action set with 7 actions.

The solution used to solve this game is based on the Nash equilibrium, that is explained in Section3.1.3.

The solution of the dynamic game, with only two players, is presented in Figure 4.3. The player1 observes the player2 and notes that the player2 is moving in a particular direction. Then, the player1 hypothesis that the player2 will move in the same direction for the following time steps (T). This hypothesis is sketched with

"Initial Strategy2" in Figure 4.3. Thus, the player1 solves its optimisation problem and the output is the "Strategy1". Subsequently, a control action is necessary to verify if the actual total strategy is the same computed in the previous iteration.

During the first iteration the total strategy is completely composed by the actual directions of the players.

If the Nash equilibrium is not reached, the player2 solves its optimisation problem considering the Strategy1. The output of the optimisation problem is the Strategy2 and, then, the same control strategy previously presented is applied.

The solution is achieved if the process reaches the Nash equilibrium.

Resuming, the game is dynamic, finite, perfect information, non-cooperative and non-zero sum and is solved with a Nash equilibrium.

In the following, we present in details the optimisation problem solved for each pedestrian.

Furthermore, in general, the model considers three situations:

• single pedestrian that interacts with all people

• recognition and resolution of the game considering a group of people

• interaction between human and a static object.

(42)

4.4 – Modelling navigation as a game

Figure 4.3: Resume of a dynamic game solution with two pedestrians.

(43)

4.5 Optimisation problem

To produce a human-like prediction we use a constrained optimisation problem composed by a cost function subject to a set of constraints.

A cost or objective function is a mathematical function that must be maximised or minimised with respect to some variables, in order to search for the optimal ¹ solution of the problem.

Constraints are mathematical concepts used to limit the variables evaluated by the optimisation. Constraints are hard or soft: hard constraint is an absolute limit that cannot be overcame; while soft constraint allows to relax the inflexible constraints acting a penalty to the objective function.

The standard form of an optimisation problem is:

minimise

x f (x)

subject to g_i(x) ≤ 0, i = 1, . . . , m.

h_j(x) = 0, j = 1, . . . , p.

where:

• f (x) is the objective function to be minimised over the vector x with n-variable

• g_i(x) ≤ 0 are m inequality constraints

• hj(x) = 0 are p equality constraints

• m ≥ 0 and p ≥ 0. If m and p are equal to zero, the optimisation problem is unconstrained.

4.6 Constraints and objective function

We started assuming that pedestrians are always goal-directed, i.e. their motion is always directed toward the goal. In this regard, an element of the objective function (see Equation4.1) is modelled minimising the overall path length; in this way, the resulting path planned by people is the shortest one.

minimise

εi

γ₁||p_i(t) − p^∗_i|| (4.1) where:

1Optimal solution is the best feasible result for a given problem