
Development of a cognitive and emotional control system for a social humanoid robot

University of Pisa
Research Center “E. Piaggio”
Doctoral School of Engineering “Leonardo Da Vinci”
PhD Program in Automation, Robotics and Bioengineering, XXVI Cycle (2011-2013)
SSD: ING-INF/06

Author: Nicole Lazzeri

Tutors: Prof. Danilo De Rossi, Eng. Daniele Mazzei

Abstract

In recent years, an increasing number of social robots have come out of science fiction novels and movies and become reality. These social robots are of interest not only in the science fiction world but also in scientific research. Building socially intelligent robots in a human-centred manner can help us to better understand ourselves and the psychological and behavioural dynamics behind a social interaction.

The primary and most important function of a social robot is to appear “believable” to human observers and interaction partners. This means that a social robot must be able to express its own state and perceive the state of its social environment in a human-like way in order to act successfully, i.e., it must possess a “social intelligence” for maintaining the illusion of dealing with a real human being. The term “social intelligence” includes aspects of both appearance and behaviour, which are factors tightly coupled with each other. For example, a social robot designed to be aesthetically similar to an animal is expected to have limited functionalities. Instead, a humanoid robot that physically resembles a human being elicits strong expectations about its behavioural and cognitive capabilities, and if such expectations are not met then a person is likely to experience disorientation and disappointment.

The believability of a social robot is not only an objective matter; it also depends on a subjective evaluation by the person involved in the interaction. A social robot will be judged believable or not on the basis of the individual experience and background of the person who interacts with it. Clearly, it is not possible to know what is really going on in the mind of that person during the interaction. Nevertheless, it is possible to analyse and evaluate the psychophysiological and behavioural reactions of the subject to obtain useful cues for improving the quality and performance of the social interaction.

Based on these considerations, this thesis aims to answer two research questions: (1) How can a robot be believable and behave in a socially acceptable manner? (2) How can the social interaction of a subject with the robot be evaluated?

This thesis presents, on the one hand, the development of a novel software architecture for controlling a humanoid robot able to reproduce realistic facial expressions and, on the other hand, the development of a software platform for analysing human-robot interaction studies from the point of view of the subject who interacts with the robot.

The architecture developed for controlling the robot is based on a hybrid Deliberative/Reactive paradigm, which makes the robot able both to react quickly to events, i.e., reactive behaviours, and to perform more complex high-level tasks that require reasoning, i.e., deliberative behaviours. The integration of a deliberative system based on a rule-based expert system with the reactive system makes the robot controllable through a declarative language that is closer to the human natural way of thinking. An interactive graphical interface provides the user with a tool for controlling the behaviour of the robot. Thus, the robot becomes a research tool suitable for investigating its “being social and believable” and for testing social behavioural models defined by sets of rules.

The hybrid architecture for controlling the robot has proven to be a good design for making the robot able to perform complex animations and convey emotional stimuli. The robot can perceive and interpret social cues in the environment, react emotionally to people in its surroundings and follow the person who attracts its attention.

The platform developed for studying the subject’s psychophysiological and behavioural reactions during the interaction with a robot is designed to be modular and configurable. On the basis of the experiment specifications, multiple and heterogeneous sensors with different hardware and software characteristics can be integrated into the platform. Collecting and fusing complementary and redundant subject-related information makes it possible to obtain an enriched scene interpretation. Indeed, merging different types of data can highlight important information that may otherwise remain hidden if each type of data is analysed separately. The multimodal data acquisition platform was used in the context of a research project aimed at evaluating the interaction of normally developing and autistic children with social robots. The results demonstrated the reliability and effectiveness of the platform in storing different types of data synchronously. In multimodal data fusion systems, the problem of keeping the temporal coherence between data coming from different sensors is fundamental. The availability of synchronized heterogeneous data acquired by the platform, such as self-report annotations, physiological measures and behavioural observations, facilitated the analysis and evaluation of the interaction of the subjects with the robot.


Acknowledgment

First of all, I would like to express my warmest thanks to my family, who have always supported me. I would not be here and I would not have reached this goal without their help and encouragement throughout these long years.

I would like to thank my supervisor, Prof. Danilo De Rossi, for his academic guidance and support throughout these three years of research. I am grateful to my co-supervisor, Daniele Mazzei, for his constant suggestions and comments, without which this thesis would not have been the same. I cannot forget to thank Prof. Antonio Cisternino, tutor of my master’s thesis and a constant mentor, for his technical support.

I want to thank Roberta Igliozzi, Alice Mancini, Antonio Lanatà, Alberto Greco and Annalisa Rotesi for their collaboration, technical support and availability for projects and experiments during my thesis. Working together we reached important results.

I want to express my gratitude to Prof. Nadia Magnenat-Thalmann, who hosted me for three months at the MIRALab research center in Geneva (Switzerland) and gave me the opportunity to live this experience. I would like to thank everyone at MIRALab and in particular Maher Ben Moussa for his friendly and technical support during my stay, Marlene Arevalo for her availability as an actress for my experiment, Niels Nijdam for his friendly support and his help with the motion capture acquisition, and Lara Broi for her kindness and her secretarial support. I would also like to thank everyone who participated in my experiment.

I thank all of my working group, FACETeam, for their collaboration and everyday hard work, and all my friends and colleagues at the research centre “E. Piaggio” who shared this academic journey with me. I would also like to thank my previous working group at the Computer Science Department, CVSLab, with whom I shared work, laughs and cakes. In particular, I thank Nadia for the contribution of her master’s thesis and Andrea for his technical support.

Last but not least, my heartfelt thanks to my closest friends, Vera, Giulia and Pasquale, who were always a great support and were there cheering me up through the good and bad times.

Nicole


Contents

Abstract

Introduction

1 State of the art
   1.1 Social robots
      1.1.1 Cartoon-like social robots
      1.1.2 Human-like social robots
      1.1.3 Too much realism? The Uncanny Valley hypothesis
   1.2 Control systems for social robots
      1.2.1 Embodied agents
      1.2.2 Modelling the behaviour
      1.2.3 Examples of control systems

2 The FACE android

3 The FACE’s control system
   3.1 Designing the system
   3.2 The reactive system: SENSE-ACT
      3.2.1 The sensory subsystem
   3.3 The deliberative system: PLAN
      3.3.1 i-CLIPS Brain

4 The FACE’s expression synthesis
   4.1 Coding facial expressions
   4.2 Mapping facial expressions
   4.3 HEFES

5 User’s state acquisition during HRI experiments
   5.1 Data acquisition platform
   5.2 Related Works
   5.3 Research approach
   5.4 Technical implementation
   5.5 The HIPOP Modules
      5.5.1 The FACE robot state recording module
      5.5.2 Video and audio acquisition module
      5.5.3 Sensorized T-Shirt acquisition module
      5.5.4 Multi-functional acquisition glove
      5.5.5 Eye tracking system
      5.5.6 Open source module
      5.5.7 Bodymap Viewer
   5.6 Performance evaluation

6 Case studies
   6.1 IDIA project
      6.1.1 Related works
      6.1.2 FACE-Therapy
      6.1.3 Data analysis and results
      6.1.4 Conclusions
   6.2 What is it like to behold an expressive humanoid face?
      6.2.1 Related works
      6.2.2 The stimuli
      6.2.3 The experimental protocol
      6.2.4 Data analysis and results
      6.2.5 Conclusions
   6.3 Dynamics and voice: key elements in discriminating facial expressions?
      6.3.1 Related works
      6.3.2 The stimuli
      6.3.3 The experimental protocol
      6.3.4 Data analysis and results
      6.3.5 Conclusions

7 Conclusions
   7.1 Contributions and results
   7.2 Future works

List of Publications


Introduction

Research Context

The recent history of machines and robotic devices shows that we are entering a new era in which personal service robots will interact directly with people. The first generation began after World War II, thanks to the rapid progress of science and technology, and was mainly dominated by the development of industrial robots for automation processes. Subsequently, the need to acquire external information through sensors in order to better perform operations led to the so-called second-generation adaptive robots, which gradually prevailed in the 1970s. To overcome some limitations of the first- and second-generation robots, three conditions were proposed: a robot should be autonomous, independent and mobile [118]. The third generation, which started around the 1990s, focused on designing and building functional robots endowed with human social and communicative skills and able to cooperate with humans in the same environment. This is the main objective of scientists interested in the new research field of human-robot interaction (HRI), which presents challenges related to, but distinct from, those of human-computer interaction (HCI).

Several HCI studies confirm what Reeves and Nass outlined in 1996 with the Media Equation Theory: “... individuals’ interactions with computers, television and new media are fundamentally social and natural, just like interactions in real life. ... Everyone expects media to obey a wide range of social and natural rules. All these rules come from the world of interpersonal interaction, and from studies of how people interact with the real world. But all of them apply equally well to media...” [138]. Many of the studies presented by Reeves and Nass were based on interactions between a human and a media device, such as a computer or a television. Attributing human-like characteristics, motivations, intentions, and emotions to non-human agents has been shown to be a typical attitude of all human beings, especially children, who ascribe lifelike qualities to simple electronic toys and even to their calculators [168, 34, 134]. Thus, we could hypothesize that a similar effect could be found in interactions between humans and cartoon-like robots, and even more so between humans and human-like robots.

Motivations and Objectives

Humans are fascinated by the creation of robots that have a similar level of emotional understanding, sensation and communication. These robots rely on people’s projections and expectations, due to the lifelike behaviour suggested by their aesthetic form [70]. Appearance and behaviour represent two complementary elements of the human social sphere which are tightly coupled with each other: the aesthetic aspect is the first significant element that impacts a communication, while the behavioural aspect is a crucial factor in evaluating the ongoing interaction. These elements are directly linked with the concept of believability: “It does not mean an honest or reliable character... but one that provides the illusion of life... and thus permits the audience’s suspension of disbelief.” [16].

If “being believable” is important for a social robot, as it means that the robot is able to engage people in an interaction, this feature becomes fundamental for a humanoid social robot, whose human-like morphology creates stronger expectations in human beings. As in human-human interactions, on first encounter the believability of a humanoid social robot is communicated through its physical embodiment, which strongly influences people’s expectations about how it will behave. Later on, the perception of the robot’s believability is given by its expressiveness, behaviour and reactions to external stimuli, which can make a human-robot interaction more or less natural and lifelike.

Although there are objective elements for defining the believability of a social robot, the measure of a robot’s believability also depends on the subjective perception and interpretation of the person who interacts with it: “Believability is in the eye of the observer which means that it is influenced by the observer’s individual personality, naive psychology and empathy mechanism.” [47]. Thus, it is also necessary to investigate and understand how people interpret the human-like signs expressed by social robots and how people feel when interacting with these synthetic partners.

The work presented in this dissertation started from the need to control an android able to mimic realistic facial expressions. The robot is equipped with a passive human-size mannequin body and a female face that is a high-fidelity reproduction of a human head. Since the robot appears extremely realistic, the expectations of people who interact with it increase accordingly. This raised two questions: How can a humanoid robot be believable and behave in a socially acceptable manner? Given a socially interactive robot, how can the social interaction of a subject with such a robot be evaluated?

Clearly the answer to these questions involves more than one discipline, from robotics and computer science to psychology and cognitive science. This is a new interdisciplinary research field, defined by Hiroshi Ishiguro as “android science”, which is based on the finding that human-like robots can elicit the sorts of responses that people direct toward each other and which therefore aims at “realizing a human-like robot and finding the essential factors for representing human likeness.” [76]. Android science is a bidirectional research field. Robotic and computer scientists need knowledge from the human sciences to understand how to program the robot so that it can adapt itself to the interlocutor’s emotional state and show behaviours that humans consider acceptable. At the same time, psychologists and cognitive researchers have begun to use social robots for investigating the human cognition underlying daily interactions between individuals.

Following this approach, this research thesis is focused on two complementary aspects: on one hand, the development of a software system for controlling a humanoid robot in order to make it able to emotionally engage people in social interactions; on the other hand, the development of a multimodal acquisition platform able to record different types of data during the ongoing interaction in order to investigate how people feel and react while interacting with this robot.

Outline

This thesis is organized as follows. Chapter 1 presents a general state of the art on social robots, their features and their purposes in the field of human-robot interaction. Following that, the chapter introduces the motivations behind the development of social robots and briefly describes the theoretical background on artificial intelligence and embodied agents. Finally, some examples of existing control systems are reported and discussed in comparison with the system presented in this thesis. Chapter 2 presents the technical description of the android used in this research. Chapter 3 illustrates the software architecture developed for controlling the android; the chapter discusses the motivations behind the overall architecture and explains its design and implementation in detail. Chapter 4 introduces the emotional system of the android. It describes how human facial expressions are codified using a standard system for describing facial muscle movements. Then, the chapter presents the system used for generating and applying facial expressions to the robotic head. Chapter 5 presents the multimodal acquisition platform developed for acquiring data from the subject who interacts with the robot. The chapter starts with the motivations and the research approach that guided the development of the platform and follows with the description of its technical details. Chapter 6 describes three case studies that used the android and the multimodal platform for different purposes; for each case study, the protocol, results and discussion are reported. Chapter 7 summarizes the conclusions and achievements of this research thesis and outlines perspectives for future work.


Chapter 1

State of the art

1.1 Social robots

The imagination of human-like creatures able to interact with us and to move around our physical and social spaces has inspired writers, producers and directors since the dawn of the science fiction genre. From the robots in Karel Capek’s R.U.R. to the monster of Mary Shelley’s Frankenstein, from the Star Wars droids R2-D2 and C-3PO to the positronic robots of Asimov’s short stories and Philip K. Dick’s replicants, science fiction novels, plays and movies have illustrated how this robotic technology might live together with us and benefit society, but also raise questions about ethics and responsibility.

In the last decades, this imagination has become reality thanks to the enormous advances in hardware performance, computer graphics, robotics technology and artificial intelligence (AI). Different reasons can guide researchers in building a robot able to interact with people in a human-centered way. We are a profoundly social species, and understanding our sociality can help us to better understand ourselves and our humanity [26]. Such robots can be a test bed for modelling human social behaviours, and the parameters of those models can be systematically varied to study and analyse behavioural disorders [26]. If it is possible to interact with social robots in a natural and familiar way, they can be used to enhance the quality of life. In the future, a personal social robot could assist people in a wide range of activities, from domestic to service tasks up to educational and medical assistance.

There are many points of view on how to define a social robot [26, 58, 11], but all of these researchers agree that there are some fundamental characteristics related to “being social”: the robot should be able to recognize the presence of humans and engage them in an interaction, express its own emotional state and interpret the emotions of others, and use its gestures for communicating in a natural and intuitive way.

Many social robots have been designed and developed, and much more still has to be done. Even if social robots vary greatly in terms of size, shape, functions and features, their aesthetic form is designed to guarantee an intuitive usage and a better comprehension [70]. According to their appearance, social robots can be roughly classified into two main categories: cartoon-like social robots and human-like social robots.

1.1.1 Cartoon-like social robots

Since the 1990s, the field of social robots has come on in leaps and bounds, attracting much attention from both commercial and academic research communities. The motivations behind the choice to produce cartoon-like robots can be various, such as teaching children caring behaviours, allowing elderly and ill persons to develop emotional relationships, studying human behavioural models or developing household intelligent systems.

Figure 1.1: Examples of cartoon-like social robots: (a) Furby (Tiger Electronics/Hasbro); (b) AIBO ERS-7 (Sony); (c) Kismet (MIT); (d) Leonardo (MIT); (e) Paro (AIST, Japan); (f) Keepon (NICT, Japan); (g) iCat (Philips).

In 1998, a gremlin-like talking robot known as Furby became the must-have toy for kids (Fig. 1.1a). Behind its strange appearance, somewhere between a hamster and an owl, and its ability to talk, Furby was advertised as a learning robot speaking a new language made of simple syllables, short words and various sounds called “Furbish” [69]. Over time, Furby gradually replaced Furbish with English, as the instruction manual touted: “The more time you spend with me, the sooner I will be able to speak your language.” Although Furby’s ability to interact with people was limited, it represented one of the first attempts to build a social robot with increasingly lifelike features.

One year later, Sony introduced a series of robotic pets called AIBO (Artificial Intelligence Robot) (Fig. 1.1b). AIBOs were marketed for domestic use as “Entertainment Robots” but were also widely used for educational and research purposes. All AIBOs came with a rich set of sensors, such as cameras, proximity and touch sensors, microphones and speakers, a complex body with 16 degrees of freedom, and preloaded software that gave the robot a personality and the ability to walk, analyse its environment, and recognize spoken commands for interacting with humans [159].

In the same period, Cynthia Breazeal developed the most expressive sociable robot built up to that moment, Kismet (Fig. 1.1c). Its engineering was inspired by the social development of human infants: it was programmed to have the same basic motivations as a 6-month-old child, i.e., the drives for novelty, social interaction and periodic rest [29]. Kismet has an animated head with big blue eyes, flirty lashes, red lips and pink ears that curve upward or downward depending on its mood. Changes of facial expression reflect Kismet’s mood states, i.e., aroused, bored or neutral, according to the satisfaction of its drives. Kismet was one of the first robots to respond to people in a natural way through visual, auditory and proprioceptive sensory systems.

The successor of Kismet looked nothing like it. Leonardo is a 2.5-foot-tall creature-like robot with a youthful appearance specifically designed to engage people in social interactions (Fig. 1.1d). It has big eyes, enormous pointy ears, a mouth with soft lips and tiny teeth, a furry belly, furry legs and pliable hands. It did not try to mimic any living creature, since the idea was that robots must be their own kind of creature to be accepted and valued on those terms [28]. It was designed to be expressive and communicative with humans in social interactions, and it can gesture and manipulate objects in simple ways. The software control of Leonardo [27] includes: a visual tracking system to know where people are and what they are doing, to understand aspects of the inanimate environment and to specify a target of attention to be tracked; a real-time face recognition system to learn and memorize people’s faces; a cognitive-affective learning system to enable the robot to understand and decode emotional messages conveyed through the facial expressions of the people who are interacting with it; and a social learning system to learn new skills from natural human interactions.

At the beginning of the new millennium a baby harp seal robot made its first appearance in public. Paro (Fig. 1.1e) was designed by Takanori Shibata at Japan’s AIST for people who cannot take care of real animals and for those who live in places where pets are forbidden [129]. It is covered with soft artificial fur to make people feel comfortable, and its cute appearance and calming effect make Paro a research tool for Animal-Assisted Therapy, used to elicit emotional responses in patients with disabilities. Through its tactile sensors Paro can respond to petting by moving its tail and opening and closing its eyes. It can also recognize the direction of voices and some words through its audio sensor. Paro is active during the daytime and gets sleepy at night, and it can express emotions, such as surprise, happiness and anger, just like a human.

Some years later, Dr. Kozima and Dr. Nakagawa developed a yellow creature-like robot that resembles a small chick: Keepon (Fig. 1.1f). The minimal design of Keepon’s appearance and behaviour was chosen for exchanging emotions and attention with infants and studying their social development [88]. Keepon has only 4 degrees of freedom: it shakes and tilts its body to direct its attention, and rocks and shrinks/stretches its body to express its emotions. The body is made of silicone rubber, which deforms whenever it changes posture and when someone touches it. Keepon’s eyes and nose are two cameras and a microphone, respectively. Since Keepon was intended to interact with autistic children, other facial features, such as eyebrows or a mouth, were not added, in order to avoid intimidating the children.

In the same period, Philips Research laboratories were investigating technical and social aspects of user interface robots in an “Ambient Intelligence” environment. Focusing on research platforms for studying human-robot interaction, they built a small cat-like robot: iCat (Fig. 1.1g). The robot is 38 cm tall and is equipped with 13 servo motors that control the face and head position to convey emotions and create social interaction dialogues. The multisensory system of iCat includes: a camera in the nose for recognizing objects and faces; a microphone and a speaker in the feet for identifying and playing sounds; and touch sensors and multi-colour LEDs in the feet and ears to sense whether the user touches the robot and to communicate through coloured light [171]. Through its internet connection, iCat can obtain further information for interacting with its environment.

1.1.2 Human-like social robots

Since ancient times, humans have been curious about understanding and simulating human nature. Nowadays, thanks to the rapid advances in robotics, engineering and computer science, this curiosity has become reality. A new generation of robots with an anthropomorphic body and human-like senses is enjoying increasing popularity as a research tool. The efforts behind these human-like robots aim at creating robots that communicate with humans in the same way that humans communicate with each other and that work in close cooperation with humans in the same environment. It is therefore not surprising that these robots are equipped with a human-like body, to take advantage of the human-centered design of the environment, and with a set of sensors that reproduce human-like senses and make the robot capable of intuitively communicating with humans [62, 73]. All these features make such robots research tools suitable for studying human intelligence and behavioural models by investigating the social dynamics during a human-robot interaction.

Starting in 1973, Prof. Ichiro Kato at Waseda University created Wabot-1 (Fig. 1.2a), the first full-scale anthropomorphic robot able to walk on two legs [82]. Wabot-1 was able to communicate with a person in Japanese, to measure distances and directions to objects and to grip and transport objects with hands equipped with tactile sensors. Ten years later Prof. Kato’s team designed its successor, Wabot-2 (Fig. 1.2a), with the aim of building a “specialist robot” endowed with human-like intelligence and dexterity [83]. Wabot-2 was designed to be a musician humanoid robot able to read a normal musical score, play a keyboard instrument and converse with a person.

Figure 1.2: Examples of human-like social robots: (a) Wabot-1 and Wabot-2 (Waseda University); (b) Asimo (Honda); (c) Toyota Partner Robots (Toyota); (d) Repliee actroids (Osaka University); (e) Geminoids (Hiroshi Ishiguro Laboratories); (f) Kaspar (University of Hertfordshire); (g) Nao (Aldebaran); (h) Zeno (Hanson Robotics); (i) iCub (IIT, RoboCub project).

In the same period, Honda began its research and development program focused on “intelligence” and “mobility”, since a robot “should coexist and cooperate with human beings, by doing what a person cannot do and by cultivating a new dimension in mobility to ultimately benefit society.” [72]. To be employed for home use, the robot had to be capable of moving through furnished rooms, going up and down stairs and employing two-foot/leg mobility. After ten years of research, in 1996 Honda presented P2 and P3 (Fig. 1.2b), the first full-body humanoid robots with realistic movements. This research line culminated in 2000 when Honda unveiled ASIMO (Advanced Step in Innovative MObility) (Fig. 1.2b), the first completely independent, two-legged humanoid walking robot [144]. ASIMO was designed to be a multi-functional mobile assistant and to operate in human environments with the aim of becoming part of people’s everyday lives. Since 2002 ASIMO has also been equipped with social capabilities, such as approaching people by following them, recognizing their faces and addressing them by name, and interpreting voice commands and human gestures to respond accordingly [144]. It is able to walk and run on two feet at 6 km/h through a posture control logic that enables the robot to maintain balance while increasing walking speed.

In the same year that ASIMO appeared, Toyota started its humanoid robot program. Rather than concentrating on a single robot, Toyota developed the Toyota Partner Robots, a series of humanoid robots designed to embody kindness and intelligence and to assist with human activities [167]. The music-playing robots (Fig. 1.2c) were designed to use tools and to play an entertaining role in performances. Able to move their whole body with advanced coordination control, these robots demonstrate the agility of their arms, hands and fingers as they play trumpets, tubas and drums. The Toyota band includes: Richie, who keeps the beat with his two fully articulated arms and hands; Chuck, the tuba player, who carries his instrument on his shoulder with one hand while fingering the valves with the other; Harry, the bipedal trumpeter; and Dave, the rolling trumpeter.

At the Expo 2005 in Aichi (Japan), Honda’s and Toyota’s robots met Repliee Q1-expo (Fig. 1.2d), a female android greeting the visitors at the information booth. Unveiled in 2004, Repliee Q1-expo was modelled after a Japanese newscaster and was a pioneering example of a real humanoid robot with strong visual human-likeness, called an “actroid”, developed by Prof. Ishiguro (Osaka University) in collaboration with Kokoro Co., Ltd. Repliee Q1 had a sister called Repliee R1, modelled after a 5-year-old Japanese girl [116]. Its successor was Repliee Q2, modelled after the faces of several young Japanese women to suggest the appearance of an anonymous Japanese girl [157]. All these androids can mimic lifelike functions such as blinking, speaking and breathing, and they interact with humans by processing speech and responding in kind. Starting from 2006, Prof. Ishiguro began to develop a new series of androids called Geminoids (Fig. 1.2e), following two approaches: an engineering approach to develop effective tele-operation interfaces and the generation of natural, human-like motion, and a cognitive approach to investigate the effect of transmitting the human presence [123]. Geminoid robots completely resemble a real person both in appearance and behaviour; therefore, they need to be tightly connected with their source person to replicate personality and social behaviours.

In the same period, a parallel research line contributed to building robots whose design combined the guidelines of the robots seen so far: robots that resemble humans in shape and behaviour, like the human-like robots, but that focus on teaching children, assisting people with disabilities and serving as new therapeutic social tools, like cartoon-like companion robots.

In 2005 the Adaptive Systems Research Group led by Dr. Kerstin Dautenhahn developed Kaspar (Fig. 1.2f), a child-size humanoid robot aimed at investigating the role of robots in autism therapy for children. Children with autism have difficulties approaching people and interpreting social cues; therefore, Kaspar was based on a minimalistic design with a simplified human expressive behaviour [48]. Kaspar’s features, from size to posture to clothing, were all chosen to be “child-friendly” so that autistic children can consider the robot a playmate. Kaspar interacts with people through simple facial expressions, gestures and body movements and can produce words and sentences in order to provide feedback. Its hands, feet, chest, arms and face are provided with tactile sensors to respond to being touched.

Nao (Fig. 1.2g), initially used as a soccer player in an international robot soccer competition, the RoboCup Standard Platform League (SPL) [75], is currently used as an educational and therapeutic tool to teach children in schools [102, 155]. Since 2008 Nao has been released to universities and institutes for research and educational purposes. Unlike Kaspar, Nao is a 58 cm tall robot that does not resemble a real human but rather a cartoon-like character. It is equipped with 2 cameras for detecting and recognizing faces and shapes, 4 microphones to track sounds, 2 IR emitters and receivers to estimate the distances to obstacles in the environment, 1 inertial board to provide Nao with stability and positioning within space, and 9 tactile sensors and 8 pressure sensors to give Nao information through touch or to trigger actions. Its communication system includes a voice synthesizer, LED lights and 2 high-fidelity speakers.

Similar to Nao but with a more realistic human-like appearance, Zeno (Fig. 1.2h) was built by Hanson Robotics in 2007. Zeno is a child-size robot with a cartoon-like body and a fully expressive human-like head made of a particular silicone material called “Frubber” [65]. The latest prototype of Zeno can stand, make eye contact and engage people in conversations. It is equipped with multiple sensors to detect faces, sense touch, and track motion and sounds. Through advanced artificial intelligence software Zeno can react and interact physically and verbally in social scenarios. Zeno was inspired by the “desire to create socially interactive robots with human-inspired personalities that model the cognitive affect of people” [65], and it is currently used as a therapeutic tool for children with autism, like Nao.

In the same year, the European Commission funded a 5-year-long project, RobotCub, to study human cognition through the implementation of a child-size humanoid robot: iCub (Fig. 1.2i). Like a child who learns by interacting with the environment, iCub is not programmed to perform specific actions but is designed to learn cognitive skills by using its body to explore the world and gather data through its senses [114]. The project aimed at reproducing the perceptual system and articulation of a two-and-a-half-year-old child, to be tested in cognitive learning scenarios in which iCub could interact in the same way such a child does. Through its motor skills and sense abilities, i.e., vision, sound, touch, balance and proprioception, iCub can crawl on all fours, grasp and manipulate objects and direct its attention to follow gestures.

1.1.3 Too much realism? The Uncanny Valley hypothesis

The development of social robots more and more similar to humans in morphology and functionality has raised the question of whether or not these androids could fall into the so-called Uncanny Valley. The Uncanny Valley hypothesis was proposed by Masahiro Mori in 1970 as an association between an uncanny feeling and robot design [119]. Mori hypothesized that the acceptance of a humanoid robot increases with its realism, but only up to a certain point, after which the curve suddenly plunges into the uncanny valley: the observer loses the sense of affinity and experiences an eerie sensation (Fig. 1.3). Mori observed that even subtle imperfections detected in a human-like robot can raise a sense of strangeness and be reminiscent of a cold, still corpse; e.g., at first glance prosthetic hands are indistinguishable from human hands, but shaking one would cause someone to feel uneasy or shocked [119].

Figure 1.3: Visualization of Mori’s hypothesis as a relation between human likeness and perceived familiarity: the acceptance of a humanoid robot increases with its realism up to a certain point, when the curve suddenly plunges into the Uncanny Valley. After this point subtle differences in appearance and behaviour provoke a sense of eeriness.

For years, the Uncanny Valley hypothesis has been a guideline for avoiding excessively realistic anthropomorphism when designing and building robots for commercial purposes. However, many researchers have questioned the validity of this hypothesis, even in the light of recent technological advances. Indeed, MacDorman and Ishiguro [106] highlighted that the Uncanny Valley hypothesis was based on theoretical assumptions, since when Mori published his essay there was no human-like robot perfect enough to verify whether it could pull itself out of the Uncanny Valley. Recently Hanson et al. [67] demonstrated that if robots attain certain aesthetic cues they can be appealing to humans, since high levels of realism make people more sensitive; indeed, it is the poor quality of an aesthetic design that can cause the illusion to fall into uncanniness. Conversely, Bartneck et al. [12, 13] showed that there is no strong evidence supporting the Uncanny Valley hypothesis: in an experiment with an android robot that was an exact copy of a human being, no significant differences were found between the likeability of android and human stimuli.

Nowadays the common idea among researchers is that androids have an enormous potential: any feelings associated with the Uncanny Valley that arise from the interaction with these high-fidelity androids could help to evaluate and modify the underlying social and cognitive models.

1.2 Control systems for social robots

A theme common to all debates about Mori’s Uncanny Valley concerns the consistency between the appearance and the behaviour of a robot, i.e., the robot should behave in a way that is consistent with its appearance [177]. Indeed, especially in robots with an extremely realistic appearance, even slight inconsistencies in behaviour can have a powerful unsettling effect [177]. Thus, researchers’ efforts are focused on developing robot control systems aimed at emulating the movements and behaviours of real humans as closely as possible.

As already mentioned, creating a sociable robot requires additional knowledge from different fields, such as social psychology, computer science, affective computing and AI. The contributions of each of these fields have repercussions on the design of the underlying control framework. Social psychology provides information on how people react to social stimuli, which represents a guideline for the creation of the robot’s behaviour. Computer science deals with the development of the software systems that control the behaviour of the robot and its interaction with people and the world. Affective computing is a new interdisciplinary field focused on simulating empathy in machines [132], i.e., giving machines the ability to interpret the emotional state of humans and adapt their own state and behaviour to it. AI is fundamental for enhancing the capabilities and believability of the robot, using models and algorithms to learn from human behaviours, to process environmental cues, and to determine what action to take at a given instant on the basis of the current social context.


1.2.1 Embodied agents

Over the last 60 years AI has dramatically changed its paradigm, from a computational perspective, which includes research topics such as problem solving, knowledge representation, formal games and search techniques, to an embodied perspective, which concerns developing systems in the real physical and social world that must deal with many issues extraneous to the first perspective.

This new multidisciplinary field called “embodied artificial intelligence” started to acquire another meaning in addition to the traditional algorithmic approach, also known as GOFAI (Good Old-Fashioned Artificial Intelligence): it designates a paradigm aimed at understanding biological systems, abstracting general principles of intelligent behaviour and applying this knowledge to build intelligent artificial systems.

On this research line, promoters of embodied intelligence began to build autonomous agents able to interact in a complex, dynamic and hostile world, taking the human being as a reference point. An autonomous embodied agent should be able to act in and react to the environment by building a “world model”, i.e., a dynamic map of information, changing over time, acquired through its sensors. The body assumes a key role in the exchange of information between the agent and the environment. The world is affected by the agent through the actions of its body, and the agent’s goal can be affected by the world through the agent’s body sensors. However, building a world model also requires the ability to simulate and make abstract representations of what it is possible to do in certain situations, which means “having a mind”.

One of the major figures who outlined the tight bond between mind and body is Antonio Damasio: “Mind is not something disembodied, it is something that is, in total, essential, intrinsic ways, embodied. There would not be a mind if you did not have in the brain the possibility of constructing maps of our own organism. [...] you need the maps in order to portray the structure of the body, portray the state of the body, so that the brain can construct a response that is adequate to the structure and state and generate some kind of corrective action.”

By combining the biological and the robotic perspectives, building an intelligent embodied agent requires both a body and a mind. The body is the means through which the agent acquires knowledge of the external world, and the mind is the means through which the agent models that knowledge and controls its behaviour.

1.2.2 Modelling the behaviour

From a robotic point of view, humans are sophisticated autonomous agents that are able to work in complex environments through a combination of reactive behaviours and deliberative reasoning. A control system for an autonomous robot must perform tasks based on complex information processing in real time. Typically a robot has a number of inputs and outputs that have to be handled simultaneously, and it operates in an environment in which the boundary conditions, determined through its sensors, change rapidly. The robot must be able to react to these changes in order to reach a stable state [6].

Over the years, many approaches have been used in AI to control robotic machines. The three most common paradigms are the Hierarchical, the Reactive and the hybrid Deliberative/Reactive paradigm, and all of them are defined by the relationship among the three primitives, i.e., SENSE, PLAN and ACT, and by the way the system processes the sensory data [122].

The Hierarchical paradigm is historically the oldest method, used in robotics since 1967 with the first AI robot, Shakey [126]. In the Hierarchical paradigm, the robot senses the world to construct a model, plans the next actions to reach the goal and finally acts to carry out the first directive. This sequence of activities is repeated in a loop in which the goal may or may not have changed (Fig. 1.4).

Figure 1.4: The Hierarchical paradigm based on a repetitive cycle of SENSE, PLAN and ACT.

Fig. 1.5 shows an example of the Hierarchical paradigm characterized by a horizontal decomposition, as depicted by Rodney Brooks [30]. The first module collects and processes the environmental data received through the robot’s sensors. The processed data are used to either construct or update an internal world model. The model is usually constituted by a set of symbols, composed of predicates and values, which can be manipulated by a logical system. The third module, i.e., the planner, uses the world model and the current perception to decide on a feasible plan of actions to be executed to achieve the desired goal. Once a suitable set of actions has been found, the fourth and fifth modules execute the actions by converting the high-level commands into low-level commands to control the actuators of the robot. This process is repeated continuously until the main goal of the robot has been achieved.

Figure 1.5: Example of traditional decomposition of a mobile robot control system into functional modules.

Using a top-down design and sequential modules, the Hierarchical paradigm lacks robustness, because every subsystem is required to work, and it needs greater computational resources due to the modelling and planning phases.
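
As a rough illustration, not taken from any specific robot framework, the following Python sketch shows the SENSE-PLAN-ACT cycle of the Hierarchical paradigm; the world model, the stub planner and the sensor/actuator callbacks are invented placeholders.

```python
# Minimal sketch of the Hierarchical (SENSE -> PLAN -> ACT) cycle.
# All names are illustrative placeholders, not tied to any robot framework.

class WorldModel:
    """Symbolic snapshot of the environment built from sensor readings."""
    def __init__(self):
        self.facts = {}

    def update(self, readings):
        # SENSE: fold the latest readings into the symbolic model
        self.facts.update(readings)


def plan(world):
    # PLAN: derive an ordered list of actions from the current model
    if world.facts.get("obstacle_ahead"):
        return ["turn_left", "move_forward"]
    return ["move_forward"]


def hierarchical_loop(read_sensors, execute_action, steps=5):
    world = WorldModel()
    for _ in range(steps):
        world.update(read_sensors())   # SENSE
        actions = plan(world)          # PLAN
        execute_action(actions[0])     # ACT: carry out the first directive


# Toy run with stubbed sensing and acting
hierarchical_loop(
    read_sensors=lambda: {"obstacle_ahead": False},
    execute_action=lambda a: print("executing:", a),
)
```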

Starting from the 1970s, many roboticists in the field of AI explored the biological and cognitive sciences in order to understand and replicate the different aspects of the intelligence that animals use to live in an “open world”, overcoming the previous “closed world” assumption. They tried to develop robot control paradigms with a tighter link between perception and action, i.e., between the SENSE and ACT components, and literally threw away the PLAN component (Fig. 1.6).

Figure 1.6: The Reactive paradigm based on a direct link between SENSE and ACT.

In the Reactive paradigm the system is decomposed into “task-achieving behaviours” which operate in parallel and independently of one another. Each “behaviour module” implements a complete and functional robot behaviour, rather than one single aspect of an overall control task, and it has access to sensors and actuators independently of any other module. The fundamental idea of a behaviour-based decomposition is that intelligent behaviour is not achieved by designing one complex, monolithic control structure but by bringing together the “right” type of simple behaviours, i.e., it is an emergent functionality.

The subsumption architecture developed by Rodney Brooks in 1986 [30] is perhaps the best known representative of the Reactive paradigm for controlling a robot. The model is based on the idea that cognition can be observed simply by using perceptive and action systems that interact directly with each other in a feedback loop through the environment. The subsumption architecture is focused on the idea of removing centralized control structures in order to build a robot control system with increasing levels of competence. Each layer of the behaviour-based controller is responsible for producing one or a few independent behaviours. All layers except the bottom one presuppose the existence of the lower layers, but none of the layers presupposes the existence of the higher layers. In other words, if the robot is built with a bottom-up approach, each stage of the system development is able to operate. This architecture entails that a basic control system can be established for the lowest hardware-level functionality of the robot and that additional levels of competence can be built on top without compromising the whole system. Fig. 1.7 shows an example of a behaviour-based decomposition of a mobile robot control system with the subsumption architecture.

Figure 1.7: Example of decomposition of a mobile robot control system based on task-achieving behaviours.
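
The sketch below gives a much-simplified flavour of such a behaviour-based controller in plain Python: each behaviour maps sensor readings directly to an action or abstains, and a fixed-priority arbitration stands in for the suppression mechanism of the subsumption architecture. Behaviour names and sensor fields are invented, and the sketch is not Brooks’ actual architecture.

```python
# Much-simplified priority arbitration in the spirit of the Reactive paradigm.
# Behaviour names and sensor fields are invented for illustration.

def avoid_obstacle(sensors):
    # Safety reflex: steer away from anything too close
    if sensors.get("obstacle_distance", float("inf")) < 0.3:
        return "turn_away"
    return None

def follow_person(sensors):
    # Social behaviour: head towards a detected person
    if sensors.get("person_visible"):
        return "approach_person"
    return None

def wander(sensors):
    # Default behaviour: keep moving when nothing else applies
    return "move_forward"

# Behaviours run conceptually in parallel; here arbitration simply lets the
# highest-priority layer that produces an output suppress the layers below it.
LAYERS = [avoid_obstacle, follow_person, wander]

def reactive_step(sensors):
    for behaviour in LAYERS:
        action = behaviour(sensors)
        if action is not None:
            return action
    return "idle"

print(reactive_step({"person_visible": True, "obstacle_distance": 1.0}))
```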


However, the Reactive paradigm eliminated planning and any other reasoning functions; therefore, a robot could not select the best behaviour to accomplish a task or follow a person based on some criteria. Thus, at the beginning of the 1990s AI roboticists tried to reintroduce the PLAN component without disrupting the success of the reactive behavioural control, which was considered the correct way to perform low-level control [122]. From that moment, architectures that used reactive behaviours and incorporated planning activities were referred to as using a hybrid Deliberative/Reactive paradigm (Fig. 1.8).

Figure 1.8: The hybrid Deliberative/Reactive paradigm, which reintroduces the PLAN component and combines a behaviour-based reactive layer with a logic-based deliberative layer.

The hybrid Deliberative/Reactive paradigm can be described as PLAN, then SENSE-ACT: the robot first plans how to best decompose a task into subtasks, then it decides which behaviours are suitable for accomplishing each subtask. The robot instantiates a set of behaviours to be executed as in the Reactive paradigm. Planning is done in one step, while sensing and acting are done together. The system is conceptually divided into a reactive layer and a deliberative layer.

In a hybrid Deliberative/Reactive system the three primitives are not clearly separated. Sensing remains local and behaviour-specific, as it was in the Reactive paradigm, but it is also used to create the world model required by the planning. Therefore some sensors can be shared between the model-making processes and the perceptual systems of the individual behaviours, while other sensors can be dedicated to providing observations which are useful for world modelling and are not used by any active behaviour.

Here, the term “behaviour” has a slightly different connotation than in the Reactive paradigm: if “behaviour” indicates a purely reflexive action in a Reactive paradigm, the term is nearer to the concept of “skill” in a hybrid Deliberative/Reactive paradigm.
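
The following Python sketch gives a rough picture of this organisation: a deliberative layer decomposes a task into subtasks and instantiates a set of behaviours (skills) for each, while sensing and acting run together in the inner loop. All task, behaviour and sensor names are invented placeholders, not taken from any specific system.

```python
# Sketch of the "PLAN, then SENSE-ACT" organisation of a hybrid controller.
# Task, subtask, behaviour and sensor names are hypothetical placeholders.

def deliberative_plan(task):
    # Deliberative layer: decompose the task into subtasks and choose the
    # behaviours (skills) suited to each one.
    if task == "greet_visitor":
        return [("find_person", ["scan_room", "track_face"]),
                ("engage", ["make_eye_contact", "smile", "speak_greeting"])]
    return []

def reactive_sense_act(behaviours, read_sensors, execute, max_steps=50):
    # Reactive layer: sensing and acting are done together for each subtask
    for _ in range(max_steps):
        sensors = read_sensors()            # SENSE
        for behaviour in behaviours:
            execute(behaviour, sensors)     # ACT
        if sensors.get("subtask_done"):
            return

def run_task(task, read_sensors, execute):
    for subtask, behaviours in deliberative_plan(task):   # PLAN per subtask
        print("subtask:", subtask)
        reactive_sense_act(behaviours, read_sensors, execute)

# Toy run with stubbed sensing and acting
run_task("greet_visitor",
         read_sensors=lambda: {"subtask_done": True},
         execute=lambda b, s: print("  running behaviour:", b))
```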

1.2.3 Examples of control systems

This section briefly presents examples of existing architectures for controlling social robots. The first example is Kismet, belonging to the category of cartoon-like robots, which is focused on building a socially intelligent machine able to communicate with and learn from people and to express its personality through facial expressions. The second example is iCub, a full-body humanoid robot, more focused on building cognitive capabilities based on enactive development by means of interaction with the environment. A comparison is outlined with the system presented in this thesis for controlling an android able to reproduce realistic facial expressions (Chapter 2).

Kismet

Kismet is the first robot designed to explicitly engage people in natural and expressive face-to-face interactions, and it is widely recognized as the pioneering effort in the new field of social robotics. The design of Kismet was inspired by infants, who “are born as a coherent system, albeit immature, with the ability to respond to and act within their environment in a manner that promotes their survival and continued growth.” [26].


Kismet’s appearance was conceived to encourage people to treat it as if it were a very young child or infant. Kismet can communicate its emotional state and social cues to a social partner through its face, gaze direction, body posture and voice.

Figure 1.9: The framework used for designing Kismet’s synthetic nervous system.

The underlying architecture of Kismet was designed on the basis of behavioural models and mechanisms of living creatures, referred to by Cynthia Breazeal as the robot’s synthetic nervous system (SNS). Kismet’s SNS is a modular system that includes six modules (Fig. 1.9):

• The low-level feature extraction system is responsible for acquiring the raw sensory information and extracting the features that are relevant for the behaviour of the robot. From the earliest stages of development, human infants can discriminate between social stimuli, such as faces or voices, and salient non-social stimuli, such as brightly coloured objects, loud noises or large motions [26]. Therefore the detection of eyes as a visual cue or the recognition of vocal affect as an auditory cue may be interesting perceptual cues from the point of view of Kismet.

• The high-level perception system is responsible for generating perceptions that are behaviourally relevant, starting from the low-level features of the target stimuli identified by the attention system. Each behaviour and emotive response has a corresponding releaser, i.e., a collection of feature detectors used to identify a particular object or event of behavioural significance. A releaser determines whether all perceptual conditions are right for the response to become active. In that case, active responses are passed to their corresponding behaviour process in the behaviour system and also to the affective appraisal stage, where they can influence the emotion system [26].

• The attention system receives the low-level visual perceptions and selects the ones that are particularly salient or relevant at that time. The selection of perceptual stimuli can depend on different factors; e.g., someone who suddenly appears, or something that has a special significance for the robot or an inherent saliency, can attract the robot’s attention. On the basis of the stimulus considered to be the most salient, the robot can organize its subsequent behaviour around it.

• The motivation system regulates and maintains the “well-being” state of the robot, which varies between an alert state, when it is interacting well with people, and a mildly positive affective state, when the interactions are neither overwhelming nor under-stimulating [26]. The nature of the robot is defined by its “needs”, which influence its behaviour since it acts to satisfy them. The motivation system consists of two related subsystems: drives and emotions. Kismet’s drives model critical parameters necessary to the homeostatic balance, which have to be maintained within a bounded range. Kismet’s emotions are idealized models of emotions and arousal states which serve, in social contexts, to respond in an adaptive manner.

• The behaviour system implements and arbitrates between typical competing behaviours of infants. Each behaviour is viewed as an independent goal-directed entity that competes with other behaviours. Behaviours are organized into competing functional groups, where each group is responsible for maintaining one of the three homeostatic functions: to be social, to be stimulated by the environment, and to occasionally rest [26]. Each functional group consists of an organized hierarchy of behaviour groups, each of which represents a competing strategy for satisfying the goal of its parent behaviour. Kismet uses an arbitration mechanism to determine which behaviour has to be activated and for how long: at the behavioural category level, it selects the functional group which represents the need to be satisfied; at the strategy level, it decides which behaviour group belonging to the winning functional group is the winner; finally, at the task level, it selects one of the behaviours belonging to the winning behaviour group (a schematic sketch of this selection is given after this list).

• The motor system is responsible for commanding the actuators in order to carry out the task selected by the behaviour system and to convey the affective state established by the motivation system. Kismet's architecture includes the vocalization system to express utterances, the facial animation system to orchestrate facial expressions and lip synchronization, and the oculo-motor system to reproduce human-like eye movements and head orientations. At a given time, concurrently active behaviours may compete for the same actuators, therefore the motor skills system is responsible for appropriately blending the motor actions. The motor skills system is also responsible for smoothly transitioning between sequentially active behaviours in a timely manner, so as not to disrupt the natural flow of the interaction, and for moving the robot's actuators to convey the appropriate emotional state of the robot.
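
To make the arbitration idea concrete, the following minimal sketch, an illustrative reconstruction rather than Kismet's actual code, applies a winner-take-all selection within one functional group: the behaviour with the highest activation level, however its releaser and drive contributions were combined, gains control. The behaviour names and activation values are invented for the example.

```cpp
#include <iostream>
#include <string>
#include <vector>

// Illustrative stand-in for a goal-directed behaviour competing within a group.
struct Behaviour {
    std::string name;
    double activation;  // combined contribution of releasers, drives and emotions
};

// Winner-take-all arbitration inside one functional group: the behaviour
// with the highest activation level is granted control of the actuators.
const Behaviour* arbitrate(const std::vector<Behaviour>& group) {
    const Behaviour* winner = nullptr;
    for (const Behaviour& b : group) {
        if (winner == nullptr || b.activation > winner->activation) {
            winner = &b;
        }
    }
    return winner;
}

int main() {
    // Hypothetical strategies competing to satisfy the "be social" function.
    std::vector<Behaviour> socialGroup = {
        {"engage-person", 0.7},
        {"call-to-person", 0.4},
        {"withdraw", 0.2},
    };
    if (const Behaviour* w = arbitrate(socialGroup)) {
        std::cout << "active behaviour: " << w->name << "\n";
    }
    return 0;
}
```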

Kismet is mainly designed to model the social interaction between an infant and its caregiver, but it is conceived neither as a tool nor as an interface [26]. Kismet is not used to perform specific tasks; its cognitive system is motivated by basic drives which are typical of a child, i.e., thirst, hunger and fatigue. The modular architecture is structured to provide Kismet with the ability to express lifelike qualities, to perceive and understand the complexity of human social behaviours, and to adapt to the current social scenario by changing its behaviour by means of a physical body that allows the robot to be socially situated with people.

The same social features and abilities define the nature of FACE but, unlike Kismet, FACE is designed to be a research tool for investigating the dynamics underlying human social interactions. Psychologists and behavioural scientists can create social scenarios and behavioural models and use FACE as a test bed, even modifying the behaviour of the robot in real-time. Therefore FACE's cognitive system is based on a rule-based expert system that exposes an interactive tool to control the behaviour of the robot even at runtime, by adding new rules or modifying the existing ones (Chapter 3).

iCub

iCub is an infant-like robot with the motor and cognitive abilities of a two-and-a-half-year-old child. The development of iCub is based on replicating the learning process that a real child goes through, from a dependent, speechless newborn to a walking, talking being [125]. The physical appearance of iCub, with its height of 90 cm, reflects the age of a baby. A semi-transparent mask with coloured light-emitting diodes highlights the eyebrows and the mouth, allowing the robot to smile and frown.

The fully articulated body makes iCub able to crawl and sit. The robot is equipped with human-like senses [125]: stereoscopic vision provided by two cameras mounted on moving ocular bulbs with eyelids; an auditory system with two microphones mounted on the head; a vestibular system using an inertial sensor that provides absolute orientation and angular acceleration; proprioceptive perception to detect the position of all joints; and a tactile system based on capacitive sensors on the hands which provides contact and pressure information.

Taking inspiration from psychological and neuroscience studies on the development of babies, iCub's brain is provided with a rich set of innate action and perception abilities that a newborn already possesses, such as recognizing a human face and detecting objects against a background [125]. More complex cognitive abilities are learned over time through an incremental development process.

The software architecture provides the basic control of the hardware based on the YARP middleware [113], an open-source software library that supports distributed computation with a special focus on robots. All data are exchanged as IP packets over a GBit Ethernet connection. The lowest level filters the information acquired through the sensory system to determine the most salient signals to be sent to the cognitive architecture.
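
To give a flavour of how modules exchange data over this middleware, the minimal sketch below publishes one pre-processed perceptual feature on a YARP port; the port name and message fields are assumptions chosen for illustration, not the actual interfaces of iCub or FACE.

```cpp
#include <yarp/os/Network.h>
#include <yarp/os/BufferedPort.h>
#include <yarp/os/Bottle.h>

int main() {
    yarp::os::Network yarpNet;  // connect to the YARP name server

    // Hypothetical output port of a low-level perception module.
    yarp::os::BufferedPort<yarp::os::Bottle> out;
    out.open("/perception/salience:o");

    // Publish one salient feature (label + score) as a Bottle message.
    yarp::os::Bottle& msg = out.prepare();
    msg.clear();
    msg.addString("face");  // detected stimulus type
    msg.addDouble(0.85);    // saliency score in [0, 1]
    out.write();            // any connected module receives the message

    out.close();
    return 0;
}
```

Modules written this way can be connected and reconnected at runtime (e.g., with `yarp connect`), which is what gives the architecture its modular and distributed character.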

The cognitive architecture includes three main levels:

Figure 1.10: The software control architecture of iCub.

• the multi-functional perceptuo-motor circuits represent all the abilities that can be considered innate and initially planned in neonatal development, such as the basic ability to re-orient the gaze towards local perturbations in the tactile, auditory and visual fields, or the more complex ability to detect human faces and follow the eyes. These circuits operate concurrently, competitively and co-operatively, therefore specific mechanisms are required to specify which skills are selected and uninhibited [172];

• the modulation circuits receive data from the lower level and compare those data with combinations of actions and sensory information that iCub has encountered, before deciding the next action. This level is based on the mechanism by which the agent achieves an increasingly greater degree of anticipation and simulation as it learns and develops with experience [172];

• the self-modification circuits explore and predict new perceptual possibilities, not on the basis of an objective environment, but on the basis of prior experience and the space of possible actions that the system can engage in whilst still maintaining the consistency of its coupling with the environment [172]. This information is sent down to the middle level to help determine the robot's next action.

iCub has been developed with the aim of studying and reproducing the self-development process of a child. Its cognitive architecture is based on the concept of an enactive system, that is, a system able to experience and assimilate, anticipate and predict, learn and develop autonomously. Five concepts define enactive cognitive science [173]: embodiment, i.e., a physical entity that interacts with its environment; experience, i.e., the history of interaction with the world; emergence, i.e., the development of cognitive behaviours from the dynamic interplay between component parts; autonomy, i.e., self-generated identity and self-regulation of homeostasis; and sense-making, i.e., the autonomous generation of knowledge by acting in the world. From this point of view, cognition is the process whereby an autonomous system adapts to its environment through a continuous process of self-organization. Thus, embodiment becomes fundamental to make the robot able to move in space, manipulate the environment and gain experience from these manipulations.

Conversely, the research approach behind FACE aims at providing neuroscientists, psychologists and human behaviour researchers with an easy-to-use system for studying computational models of human social abilities during controlled social scenarios with the robot. In this context, cognition entails the manipulation of explicit symbolic representations of the external world and the storage of the knowledge gained from experience in order to reason more effectively in the future [173]. An interactive interface can be used to modify the current behaviour of the robot or to define new behavioural models to be tested in real-time (Chapter 3). Although cognitivism supports the idea that embodiment is not always necessary, this research thesis also aims at highlighting the importance of a physical embodiment to improve the quality, realism and effectiveness of the interaction with the robot.

Both systems are based on YARP, which makes the architecture intrinsically modular, distributable even across machines with different operating systems, highly scalable, and extensible with additional modules without the need for structural changes.


Chapter 2

The FACE android

FACE (Facial Automaton for Conveying Emotions) is a humanoid robot with a believable facial display system based on biomimetic engineering principles, equipped with a passive articulated body. The latest prototype of the head has been fabricated by David Hanson through a life-casting technique. It is an aesthetic copy of the head of a female subject, both in shape and texture, and the final result appears extremely realistic (Fig. 2.3).

The skull is sculpted following forensic facial reconstruction processes, to allow the robot to reproduce natural-looking expressions, and printed in ABS plastic using a 3D printer. The artificial skull is covered by a porous silicone elastomer called Frubber™, developed and patented by Hanson Robotics (http://hansonrobotics.wordpress.com/). Frubber™ is an extremely soft, supple and strong silicone that closely mimics living facial tissue [68]. Flexible rubber-cloth anchors are designed and cast directly into the Frubber™ material. These anchor points are strategically defined to simulate the stress distribution of the facial muscles on the human skin surface and to reproduce human-like wrinkles, folds and bunches. At the end of the casting process, the Frubber™ skin is placed on the skull and the anchors are connected by yarns to the servo motors that constitute the actuation system of FACE (Fig. 2.2).

Figure 2.2: The actuation system of FACE composed of 32 servo motors integrated into the skull and the upper torso.

The actuation system is controlled by 32 electric gearhead servo motors which are integrated into the skull and the upper torso, similarly to the major facial muscles (Fig. 2.3a and 2.3b). A total of 25 servo motors simulates the facial muscles, 4 servo motors move the neck and the remaining 3 motors control the movement of the eyes.

Figure 2.3: (a) Servo motor positions in the robot’s skull; (b) The major facial muscles used as reference for positioning the servo motors.

Thanks to the physical and mechanical characteristics of Frubber™, the linkages distribute the motor force into the skin, which can be easily modelled and stretched with 20 times less force than solid elastomers, producing a full range of simulated facial expressions in a repeatable and flexible way [68]. The facial expressions of FACE appear extremely realistic, as shown in Fig. 2.4.
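
Purely as an illustration of how such an actuation system can be driven at the software level, a facial expression can be represented as a vector of normalized servo positions, one per motor. The sketch below assumes a hypothetical interface in which each of the 32 servos accepts a position in [0, 1]; it does not reflect the actual low-level motor protocol of FACE, and the servo indices are invented.

```cpp
#include <array>
#include <cstddef>
#include <iostream>

constexpr std::size_t kNumServos = 32;  // 25 facial + 4 neck + 3 eye motors

// A facial configuration expressed as normalized servo positions in [0, 1].
using Expression = std::array<double, kNumServos>;

// Linear blend between two expressions, e.g., for smooth transitions.
Expression blend(const Expression& from, const Expression& to, double t) {
    Expression result{};
    for (std::size_t i = 0; i < kNumServos; ++i) {
        result[i] = (1.0 - t) * from[i] + t * to[i];
    }
    return result;
}

int main() {
    Expression neutral{};   // all servos at rest position 0.0
    Expression smile{};
    smile[12] = 0.8;        // hypothetical servo index pulling one lip corner
    smile[13] = 0.8;        // hypothetical servo index pulling the other corner

    Expression halfway = blend(neutral, smile, 0.5);
    std::cout << "servo 12 at half smile: " << halfway[12] << "\n";
    return 0;
}
```

Blending between such vectors is one simple way to obtain smooth, repeatable transitions between expressions.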


Chapter 3

The FACE’s control system

3.1 Designing the system

Controlling a social robot means developing a computational infrastructure to make the robot able to interpret and send human-readable social cues and to employ a variety of behavioural and communicative skills to engage people in social interactions.

Taking inspiration from the biological model, human intelligence does not depend on monolithic internal models, monolithic control, or general-purpose processing [31]. Humans perceive the world and their internal state through multiple sensory modalities that in parallel acquire an enormous amount of information used to create multiple internal representations. Moreover, behaviours and skills are not innate knowledge but are assimilated by means of a development process, i.e., by performing incrementally more difficult tasks in complex environments [31]. There is also evidence that pure rational reasoning is not sufficient for making decisions, since human beings lacking emotional capabilities often show cognitive function deficits [44].

The control system of FACE relies on multiple sensors to acquire information. Raw data are processed and organised to create "metamaps", i.e., structured objects describing the robot itself, the world, and its social partners, which together form the knowledge base. Representing knowledge as structured objects offers the advantage of manipulating information at a higher level of abstraction and in a more flexible and natural way, using a rule-based declarative language. The application of rules to the existing knowledge produces new structured information that can also be decoded and processed again by a procedural language.
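
As a loose illustration of this idea, and not the actual data model of the FACE system, a metamap entry about a social partner can be a structured object over which a declarative rule, here reduced to a condition-action pair, asserts new knowledge; all field names, rule contents and values are hypothetical.

```cpp
#include <functional>
#include <iostream>
#include <string>
#include <vector>

// Illustrative structured entry of the social "metamap" (hypothetical fields).
struct PartnerState {
    std::string id;
    bool isLooking;       // gaze directed at the robot
    bool isSpeaking;
    bool wantsAttention;  // derived knowledge, asserted by a rule
};

// A declarative rule reduced to its essence: a condition over the knowledge
// base and an action that asserts new structured information.
struct Rule {
    std::function<bool(const PartnerState&)> condition;
    std::function<void(PartnerState&)> action;
};

int main() {
    std::vector<PartnerState> knowledgeBase = {{"person-1", true, true, false}};

    // "IF the partner looks at the robot AND speaks THEN he/she wants attention"
    Rule attentionRule{
        [](const PartnerState& p) { return p.isLooking && p.isSpeaking; },
        [](PartnerState& p) { p.wantsAttention = true; }};

    for (PartnerState& p : knowledgeBase) {
        if (attentionRule.condition(p)) attentionRule.action(p);
    }

    std::cout << std::boolalpha
              << "person-1 wants attention: " << knowledgeBase[0].wantsAttention << "\n";
    return 0;
}
```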

As in the human nervous system, planning is the slowest part of the control loop. Rule-based expert systems can deal with a huge number of rules but require time to compute the final action. In the meantime, sensors and actuators have to be linked through direct communication channels to perform fast reactive actions. Thus, a hybrid Deliberative/Reactive paradigm which supports heterogeneous knowledge representations is a good solution for designing the control architecture of a social robot. Integrating a logic-based deliberative system with a behaviour-based reactive system ensures that the robot can handle the real-time challenges of its environment appropriately while performing high-level tasks that require reasoning processes [136].
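
A minimal sketch of this hybrid arrangement is given below; the loop rates, shared variables and thread structure are assumptions chosen for illustration rather than the actual implementation. A fast SENSE-ACT loop couples perception almost directly to actuation, while a slower SENSE-PLAN loop reasons over the accumulated knowledge and modulates the goals pursued by the reactive layer.

```cpp
#include <atomic>
#include <chrono>
#include <iostream>
#include <thread>

std::atomic<bool> running{true};
std::atomic<int>  currentGoal{0};   // written by the deliberative layer
std::atomic<int>  lastStimulus{0};  // written by the reactive layer

// Fast SENSE-ACT loop: raw perception is coupled almost directly to actuation.
void reactiveLoop() {
    int tick = 0;
    while (running) {
        lastStimulus = ++tick;   // stand-in for a fresh perception
        int goal = currentGoal;  // goal modulated by the deliberative layer
        (void)goal;              // act(goal, stimulus) would be invoked here
        std::this_thread::sleep_for(std::chrono::milliseconds(20));   // ~50 Hz
    }
}

// Slow SENSE-PLAN loop: the rule-based layer reasons over the accumulated
// knowledge and updates the goal pursued by the reactive layer.
void deliberativeLoop() {
    while (running) {
        currentGoal = lastStimulus / 10;  // stand-in for rule-based planning
        std::this_thread::sleep_for(std::chrono::milliseconds(500));  // ~2 Hz
    }
}

int main() {
    std::thread reactive(reactiveLoop);
    std::thread deliberative(deliberativeLoop);

    std::this_thread::sleep_for(std::chrono::seconds(2));  // run briefly
    running = false;
    reactive.join();
    deliberative.join();
    std::cout << "final goal: " << currentGoal.load() << "\n";
    return 0;
}
```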

Fig. 3.1 shows the current state of the architecture, based on a hybrid Deliberative/Reactive paradigm. The architecture is highly modular, encapsulating functionalities into single modules. Procedural modules collect and elaborate raw data gathered from sensors or received from other modules, while declarative modules process high-level information through a rule-based language. For instance, the sensory subsystem acquires and processes incoming data and makes the output available both to the actuation subsystem, which manages fast and instinctive behaviours (SENSE-ACT), and to the deliberative system, which creates metamaps of the social world and the robot itself (SENSE-PLAN). Based on these
