
Mnemonic effects of action simulation from pictures and phrases


Several theoretical approaches suggest that language comprehension and action observation rely on similar mental simulations. Granted that these two simulations partially overlap, we assumed that simulations stemming from action observation are more direct than those stemming from action phrases. The implied prediction was that simulation from action observation should prevail over simulation from action phrases when their effects are contrasted. The results of three experiments confirmed that, when at encoding the phrases were paired with pictures of actions whose kinematics were incongruent with the implied kinematics of the actions described in the phrases, memory for action phrases was impaired (Experiment 1). However, the reverse was not true: when the pictures were paired with phrases describing actions whose kinematics were incongruent with the kinematics of the actions portrayed in the pictures, memory for pictures portraying actions was not impaired (Experiment 2). Also, in line with evidence that simulations from action phrases and those from action observation partially overlap, when their effects were not contrasted their products were misrecognized. In our experiments, when action phrases only presented at recognition described actions depicted in pictures seen at encoding, they were misrecognized as having already been read at encoding (Experiment 1); further, when pictures only presented at recognition portrayed actions described in phrases presented at encoding, they were misrecognized as having been seen at encoding (Experiment 2). A third experiment excluded the possibility that this pattern of findings was simply a consequence of better memory for pictures of actions as opposed to memory for action phrases (Experiment 3). The implications of our results for the literature on simulation in language comprehension and action observation are discussed.


1. Introduction

The main assumption underlying our investigation relied on two main findings in the literature: observing an action performed by an agent, as well as comprehending an action-related phrase, triggers a mental simulation of the action (see, e.g., Barsalou, 2008; Zwaan & Taylor, 2006).

When we observe someone else executing an action, our brain simulates the performance of the action being observed (Jeannerod, 1994); this internal representation of the observed motor programs, in the absence of overt movements, is usually called "action simulation" (Jeannerod, 2001). Several studies have suggested that such simulation is rapid and obligatory; thus, for example, our memory for the final position of a moving object is systematically distorted forward along its path of motion (a phenomenon termed "representational momentum"; see, e.g., Freyd, 1983).

Simulation stemming from action observation relies on the observer's motor system. In both non-human primates (e.g., Gallese, Fadiga, Fogassi, & Rizzolatti, 1996) and humans (e.g., Rizzolatti, 2005), the simple observation of an action involves the activation of the same areas devoted to action production. Specifically, several studies have detected the activation of a complex neural network (AON, the "action observation network"), which mainly involves the inferior frontal cortex, the dorsal premotor cortex, the supplementary motor area and the inferior parietal lobule (see, e.g., Caspers, Zilles, Laird, & Eickhoff, 2010; Van Overwalle & Baetens, 2009). In this view, motor reactivity to observed actions likely reflects the anticipatory simulation of future phases of the observed action (e.g., Schütz-Bosbach & Prinz, 2007).

Even observing a still photograph of an actor in motion triggers a mental simulation through the activation of the observer's motor system. Indeed, an action photo conveys dynamic information about the position of the actor after the photo was taken (i.e., implied motion). For example, studies have revealed that the observation of a scene depicted in a photo portraying an actor performing an action may trigger a simulation that in turn can give rise to a false memory of the actor performing an advanced phase of the action (Ianì, Mazzoni & Bucciarelli, 2018). Consistent with these findings, viewing static snapshots of hands that imply body actions activates the human motor system (see, e.g., Urgesi, Moro, Candidi & Aglioti, 2006), and viewing photographs of humans with implied motion activates the same medial temporal/medial superior temporal area involved in the visual analysis of real motion (e.g., Kourtzi & Kanwisher, 2000). Most relevant for the present investigation, observing the start and middle phases of grasp and flick actions seems to engender significantly higher motor activation than observing their final posture (Urgesi et al., 2010). These results suggest that dynamic information is extracted from static images of the human body: observing specific features of other people's actions allows an observer to simulate those actions and thus to understand the actors' intentions (i.e., the goal of an action; see, for example, Blakemore & Decety, 2001; Wilson & Knoblich, 2005). The attribution of intention to an actor relies on a mental simulation guided by the kinematics of a specific action (see, e.g., Castiello, 2005), which in turn depends on the actor's intention: different kinematics represent different intentions (see, for a review, Quesque & Coello, 2015). Besides the motor resonance resulting from the observed action kinematics, the presence of a specific object in a photo allows the observer to recognize and predict the action (see Bach, Nicholson & Hudson, 2014). The results of motor-evoked potential studies have revealed that when participants were presented with a handled object (i.e., a mug) close to them, motor-evoked potentials were higher compared to the observation of either a non-graspable object or a graspable object falling outside their reach; crucially, the same effect also occurred when the object was close to a virtual individual such as an avatar (Cardellicchio, Sinigaglia & Costantini, 2013). These results suggest that representing the affording features of a given object in terms of their effective readiness-to-hand allows one to anticipate what other people are in a position to do and therefore to guess what they most likely intend to do (see also Costantini, Committeri & Sinigaglia, 2011).

When comprehending an action-related phrase we also rely on a sensorimotor simulation of the action (see, for a review, Fischer & Zwaan, 2008). Reading or hearing single words describing objects triggers a mental simulation of the actions that we usually perform with them (see Tucker & Ellis, 2004). For instance, participants in the pioneering study of Tucker and Ellis (1998) made push-button responses with the left or right hand depending on whether common graspable objects were upright or inverted: their responses were affected by the orientation of the objects, which was manipulated to make them compatible with a grasp movement by either the left or the right hand. The visual representation of objects triggers the activation of the motor representations associated with their affordances. Crucially for the present investigation, studies have detected similar affordance effects when participants merely had to read the names of the objects (see, e.g., Tucker & Ellis, 2004). These results are in line with evidence that reading action words referring to face, arm, or leg actions triggers a somatotopic activation in the readers' premotor and motor areas (e.g., Hauk, Johnsrude & Pulvermüller, 2004). In a study involving action phrases rather than objects or names of objects, participants read a phrase and then responded to an object's picture (e.g., Stanfield & Zwaan, 2001; Zwaan, Stanfield & Yaxley, 2002). Their task was to respond "yes" if the name of the pictured object was mentioned in the phrase (target trials) and "no" if it was not (filler trials). The results of these studies revealed that readers were faster in correctly responding to pictures that were congruent, rather than incongruent, with the implied shape/orientation of the object in the phrase. For instance, after reading the phrase "He hammered the nail into the floor" participants were faster to respond to a picture of a vertically oriented nail than to a picture of a horizontally oriented nail. These results suggest that readers used contextual information inherent in the phrase and incorporated it into their mental simulation from the phrase.

Also the so-called "action-sentence compatibility effect" suggests that language understanding involves mental simulation. The participants in a study by Glenberg and Kaschak (2002) read phrases describing actions toward their body (e.g., "Mary gave you a pencil") or away from their body (e.g., "I closed the door"), as well as nonsense phrases. Their task was to judge the sensibility of each phrase by moving their hand toward or away from their own body. Participants were faster in responding correctly when the direction implied in the phrase was congruent with the action with which they had to respond.


Overall, these results have shown that simulations involved in language comprehension, action observation and memory rely on the same sensorimotor resources (see Dijkstra & Post, 2015). However, for the aim of our investigation it is important not to overlook the differences between observation of an action (or picture of an action) and comprehension of a description of that action. In Section 2 we argue in favor of a distinction between “direct” simulations from observed actions and “indirect” simulations from described actions. In Section 3 we test the predictions implied by this assumption. In Section 4 we discuss the results of our investigation in relation to the literature on simulation from language comprehension and action observation.

2. “Direct” and “indirect” action simulations

Mental model theory (Johnson-Laird, 2006), our theoretical framework, assumes that individuals construct mental models to comprehend and reason about states of affairs perceived or described. Mental models can be either static or kinematic. A kinematic model unfolds in time, and the sequence of situations it represents corresponds to a temporal order of events in the world, real or imaginary (Johnson-Laird, 1983; Schaeken, Johnson-Laird, & d'Ydewalle, 1996); the kinematic simulation is akin to a mental "movie" (Hegarty, 1992; Johnson-Laird, 1983). Within this theoretical framework, we highlight some reasons why mental simulations from action observation and from language processing should differ.

Kinematic mental simulations rest heavily on visuospatial working memory. Given the existence of separate buffers in working memory, specialized for maintaining visuospatial representations and verbal representations (Baddeley, 2002), studies have demonstrated that a visuospatial working memory load interferes more than a verbal working memory load with tasks involving kinematic mental simulation, such as mechanical reasoning; similarly, mechanical reasoning interferes more with a visuospatial than with a verbal memory loading task (Sims & Hegarty, 1997). In this respect, as opposed to simulation from action observation, simulation from comprehension of action phrases also relies on verbal working memory. In particular, mental model theory implies that comprehension of action phrases relies on the propositional representation resulting from parsing processes and then on the construction, starting from that propositional representation, of a kinematic mental model of the state of affairs described (see also Kintsch, 1998). Observing the action, instead, should give rise to the direct construction of a kinematic mental model (see, e.g., Bauer & Johnson-Laird, 1993; Glenberg & Langston, 1992). As Bucciarelli (2007) pointed out, verbal representations, in contrast to iconic representations, are "indirect" in that they have an abstract syntax interposed between representations and represented entities (see McKendree, Small, Stenning & Conlon, 2002; Stenning, 2000). In line with the tenets of mental model theory, we assume that simulation in action observation and in language comprehension differs in quality and in time course (the latter should be more indirect and slower than the former).

Our assumption is consistent with the well-established finding that when we observe an agent we gain a rich and complex comprehension of what she is doing through "automatic, implicit, and non-reflexive simulation mechanisms" (Gallese, 2005, p. 117), also referred to as "relatively smart perception" (Gallagher, 2008). When observing the movements of another person, one already sees their meaning, because mental simulation is triggered by the specific kinematics of the action. This simulation is not based on conscious strategies but depends on the implicit engagement of neural systems. When we comprehend an action phrase, instead, the linguistic system controls the simulation (see, e.g., Barsalou, Santos, Simmons & Wilson, 2008). In this case, simulation is indirect in that its input is the propositional representation resulting from parsing processes. Consistent with this claim, studies have revealed that cortical responses to action observation occur starting from about 400 ms after action onset, whereas those arising from sentence presentation start about 1200 ms after action verb onset (Schaller, Weiss & Müller, 2017). As pointed out by the "situated simulation" view, when a word is perceived during language processing, both the linguistic system and the simulation system become active initially, but the linguistic system's activation peaks before the simulation system's activation (Barsalou et al., 2008). Indeed, studies have revealed that although motor resonance occurs in both action observation and language comprehension (see, e.g., Urgesi et al., 2010, and Hauk et al., 2004, respectively), in language comprehension it is modulated by incoming linguistic input (see, e.g., Glenberg et al., 2008). Further, it seems that action observation leads to the mental simulation of a specific instance of an action, whereas comprehension of action phrases leads to the mental simulation of a more underspecified action. As a result, motor resonance should be greater in action observation than in comprehension of action phrases. The results of a study comparing electrophysiological brain responses during observation of prototypical arm movements and during semantic processing of action sentences containing arm-related action verbs were consistent with this assumption: oscillations that are clearly related to the activation of the motor system were strong during action observation, and weaker during processing of action sentences (Schaller et al., 2017). Thus, the results suggested that these action simulations recruit a graded involvement of the motor system: motor representations seemed to be retrieved to a different extent depending on the stimuli triggering the mental simulations. Motor resonance, in turn, appeared to impact the formation of memory traces. Studies have revealed that a specific motor memory, similar to that induced by practicing movements, is formed by observation (Stefan et al., 2005), and that the acquisition of motor behaviors is facilitated by previous observation of subjects learning the novel task; on the contrary, motor learning by observation is impaired when the motor system is engaged with an unrelated movement task (Mattar & Gribble, 2005).

We assume that the greater motor resonance in action observation, compared to comprehension of action phrases, has specific effects on memory when simulations from observation and from language are contrasted. The literature reviewed above reinforces the assumption that simulation elicited by action observation should prevail over simulation elicited by action phrases in tasks involving them both. We tested the predictions deriving from this assumption in two recognition-task experiments whose rationale was to establish the effects of the different simulations in the absence of any task demands that explicitly encouraged the formation of mental images (see Ditman, Brunyé, Mahoney & Taylor, 2010). At the encoding phase, the participants encountered action phrase-action photo pairs, and the members of each pair depicted the same action (congruent pairing) or not (incongruent pairing). The action phrase "She is drinking from the bottle" paired with a photo portraying an actress about to drink from a bottle is an example of a congruent pairing, whereas the same phrase paired with a photo portraying an actress about to pour water from a bottle is an example of an incongruent pairing. We used photos representing an actor about to perform the action, since a main assumption underlying our investigation is that the observation of the start and middle phases of an action elicits in the observer significantly higher motor activation than observation of the final state (see Urgesi et al., 2010).

In Experiment 1 we tested the prediction that memory for action phrases should be impaired in cases of incongruent pairings, and in Experiment 2 we tested the prediction that, instead, memory for pictures depicting actions should not be impaired in cases of incongruent pairings. Further, in line with the literature according to which simulations from action observation and from action phrases share some features, in both experiments we tested the prediction that when the effects of the two simulations are not contrasted, their products can easily be confused at recognition. Specifically, action phrases only presented at recognition should be misrecognized as having already been read at encoding when they describe actions depicted in pictures seen at encoding (Experiment 1), and pictures only presented at recognition should be misrecognized as having already been seen at encoding when they depict actions described in phrases presented at encoding (Experiment 2).

It is a well-established finding in the literature that memory for pictures is generally superior to memory for words (see, e.g., Paivio & Csapo, 1973; Erdelyi & Becker, 1974). Memory for action pictures, however, has never been compared with memory for action phrases. If memory for action pictures were superior to memory for action phrases, results confirming the predictions tested in Experiments 1 and 2 could be accounted for in terms of a hypermnesia for pictures. In that case, there would be no need to invoke mental simulation arising from action observation in order to explain the findings (i.e., the "action" manipulation would be unnecessary). To exclude this alternative explanation and to single out the unique role of action observation, we devised Experiment 3, in which we compared memory for action phrases and memory for action pictures. The experiments we carried out to test these predictions were in accordance with the Code of Ethics of the World Medical Association and had the approval of the Ethical Committee of the University of Turin.

3. Experiment 1: Memory for action phrases is impaired when presented at encoding with photos of actions featuring different kinematics

The participants in the experiment read a series of action phrases on a computer screen, each one followed by a photo depicting either the action described by the phrase or a different action on the same object; they then performed a recognition task for the phrases. When presented with the phrase-photo pairs, participants were informed that the second part of the experiment was a memory test for the phrases. In the recognition task of the second part of the experiment, participants encountered the same phrases they had encountered at encoding, along with phrases not presented at encoding but describing actions either depicted or not depicted in the photos presented at encoding. We predicted that memory for action phrases would be impaired when, at encoding, the phrases were paired with pictures of actions whose kinematics were incongruent with the kinematics of the actions described, and that action phrases only presented at recognition would be misrecognized as read at encoding when they described actions depicted in pictures presented at encoding.

3.1 Participants

The participants in the experiment were 40 adults, all Caucasian, students of a Psychology course at the University of Turin (6 males and 34 females; mean age = 22.95 years, SD = 1.24 years). All had normal or corrected-to-normal vision. They took part in the experiment voluntarily in exchange for course credits, after giving informed consent.


3.2 Material and Procedure

The experimental material consisted of 10 phrases presented at both encoding and recognition, each describing an action performed on an object (see Appendix A). Each phrase was paired with a photo: in the congruent pairing the photo depicted an actor performing the action described in the phrase (hereafter, "congruent" photo); in the incongruent pairing the photo depicted the same actor performing a different action, but on the same object (hereafter, "incongruent" photo). These photos were a subset of those used by Ianì et al. (2018) in their experiments on memory for photos. The participants in those experiments encountered the action photos at encoding and, at recognition, misrecognized photos depicting the unfolding in time of the same action more often than photos depicting the unfolding in time of a different action on the same object. For instance, when participants saw at encoding the photo of an actor holding a bottle so as to drink from it, they misrecognized the photo of the actor while drinking more than a photo portraying the final state of another action on the same object (i.e., pouring from the bottle). This finding suggests that participants interpreted the stimuli in the way we intended in our experiments. Further, our assumptions on the meaning of our stimuli (i.e., the actions depicted in the photos) are consistent with the literature on action observation illustrated in the Introduction: both the action's kinematics and the object's features trigger a specific mental simulation in the observer, which leads in turn to a specific interpretation of the action observed (Bach et al., 2014).

The following is an example of the photo-phrase pairs in our experiment, for the phrase "Open a box" [congruent and incongruent photos not reproduced here].

Also, the material comprised 10 phrases only presented at recognition (see Appendix A). Five of them described actions portrayed by the photos of the incongruent pairings at encoding (that is, actions not read but seen in the photos), and 5 of them were completely new action phrases (that is, neither read nor seen in any photo before).

We devised two experimental protocols, so that in each protocol 5 of the 10 phrases were paired with the “congruent” photos and 5 were paired with the “incongruent” photos. In Protocol 1, phrases numbered from 1 to 5 in Appendix A were paired with congruent photos (CO-congruent pairing), and phrases numbered from 6 to 10 in Appendix A were paired with incongruent photos (IN-incongruent pairing). In Protocol 2, phrases numbered from 1 to 5 in Appendix A were paired with incongruent photos (IN-incongruent pairing), and phrases numbered from 6 to 10 were paired with congruent photos (CO-congruent pairing). The presentation order of the phrase/photo pairs within each protocol was randomized for each participant using E-Prime 2.0 Software. Half of the participants were randomly assigned to Protocol 1 and half to Protocol 2.
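To make the counterbalancing concrete, the following is a minimal sketch in Python (the experiment itself was run in E-Prime 2.0, so this is only an illustrative re-implementation; the item numbers refer to Appendix A):

```python
import random

PHRASES = list(range(1, 11))  # the 10 action phrases of Appendix A

def encoding_list(protocol, seed=None):
    """Build a participant's randomized encoding list for a given protocol.

    Protocol 1 pairs phrases 1-5 with congruent photos and 6-10 with
    incongruent photos; Protocol 2 reverses the assignment.
    """
    congruent_items = set(PHRASES[:5]) if protocol == 1 else set(PHRASES[5:])
    pairs = [(item, "congruent" if item in congruent_items else "incongruent")
             for item in PHRASES]
    # presentation order is randomized anew for each participant
    random.Random(seed).shuffle(pairs)
    return pairs

# half of the participants receive Protocol 1, half Protocol 2
print(encoding_list(protocol=1, seed=42))
```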

The experiment consisted of a single session: an observation phase followed by a recognition phase. In the observation phase, the participants observed the phrase-photo pairs on a computer screen; for each pair, first the phrase (5 seconds) and then the photo (5 seconds) were presented. Between the presentation of the phrase and that of the photo we inserted a 5-second interval. This interval guaranteed that the simulations stemming from the phrase and the photo were both in the participants' attentional focus (see, e.g., Cowan, 2008). In the recognition phase, the participants observed the 10 phrases encountered at encoding and the 10 new phrases (each for 5 seconds). At recognition, the action phrases were presented to each participant in random order on the computer screen. The action phrases presented at encoding required affirmative responses, and those only presented at recognition required negative responses.

The experiment took place in the sole presence of the experimenter and an assistant. The participants sat on a chair in front of a desk on which a computer was placed (approximately 8 inches from the participant). They received the following instructions:

Thanks for your participation and for your time. In this phase you'll observe some phrases. Each phrase is followed by a photo. Your task is to read and observe with attention each phrase-photo pair. In the second phase of the experiment, you'll be presented with a series of phrases and your task will be to judge which phrases you read in the first phase. The first phase is purely observational, and each phrase and photo will remain on the screen for 5 seconds. In between each phrase-photo pair a black screen will indicate that you have to press the space bar to move on to the next pair.

Immediately after this phase, participants were asked to perform the recognition task, for which they were instructed as follows:

You will read some phrases. For each phrase your task is to tell whether you encountered the phrase in the previous phase. Press “Yes” if you saw it, press “No” if you did not see the phrase.

3.3 Results

We analyzed correct acceptances ("old" items) and correct rejections ("new" items) separately, because there were two types of old items and two types of new items in this experiment. "Old" phrases had been paired at encoding either with congruent or with incongruent photos, and "new" phrases were either totally new phrases or "partially new" phrases, in that the latter described actions seen in photos at encoding. Hence, collapsing correct rejections and false positive responses for totally new phrases with those for partially new phrases, as well as collapsing hits and false negative responses for congruent phrases with those for incongruent phrases, would not be fully justified.

Table 1 summarizes the means of correct recognitions of phrases presented at encoding and the relative response times as a function of the pairing. As predicted, accuracy was greater for congruent pairings than for incongruent pairings (Wilcoxon test: z = 1.70, one-tailed p < .05; Cliff's δ = .15). Further, response times for incongruent pairings were longer than for congruent pairings, although the difference was not statistically significant (Wilcoxon test: z = 1.63, one-tailed p = .052; Cliff's δ = .15); this tendency suggests that participants had difficulty ignoring what was portrayed in the photos even when memory for the phrases was correct.


Phrases presented at encoding

                        Accuracy            Response times
Congruent pairing       4.63 (SD = 0.74)    1668 (SD = 641)
Incongruent pairing     4.40 (SD = 0.84)    1915 (SD = 887)

Table 1. Means of correct recognitions of phrases presented at encoding, and relative response times (in milliseconds), as a function of the congruent and incongruent pairings (Experiment 1).
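For readers who wish to reproduce this kind of analysis, the following is a minimal sketch (the data arrays are hypothetical placeholders, not our data; `wilcoxon` with `method="approx"` requires SciPy 1.9 or later, and the `cliffs_delta` helper is our own) of the one-tailed Wilcoxon signed-rank test and Cliff's δ used for the paired comparisons reported throughout Experiments 1 and 2:

```python
import numpy as np
from scipy.stats import wilcoxon

def cliffs_delta(x, y):
    """Cliff's delta: P(x > y) - P(x < y), computed over all cross-pairs."""
    x, y = np.asarray(x), np.asarray(y)
    greater = (x[:, None] > y[None, :]).sum()
    less = (x[:, None] < y[None, :]).sum()
    return (greater - less) / (x.size * y.size)

# hypothetical per-participant correct recognitions (0-5) in each pairing
rng = np.random.default_rng(1)
congruent = rng.integers(3, 6, size=40)
incongruent = rng.integers(2, 6, size=40)

# one-tailed paired comparison: congruent > incongruent
res = wilcoxon(congruent, incongruent, alternative="greater", method="approx")
print(f"z = {res.zstatistic:.2f}, p = {res.pvalue:.3f}, "
      f"delta = {cliffs_delta(congruent, incongruent):.2f}")
```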

Table 2 summarizes the results for the phrases only presented at recognition. As predicted, correct rejections were more frequent for phrases describing actions never seen in photos at encoding than for phrases describing actions seen in photos at encoding (Wilcoxon test: z = 2.99, one-tailed p < .002; Cliff's δ = .27). Further, still in line with our predictions, response times were longer for phrases describing actions seen in photos than for phrases describing actions never seen (Wilcoxon test: z = 1.69, one-tailed p < .05; Cliff's δ = .12).

We acknowledge that the overall magnitude of our predicted effects (in terms of accuracy and RTs) is relatively small. However, this is mitigated by the fact that, at encoding, participants were explicitly warned that they would be tested only on their memory for the phrases. Hence, the results revealed that participants showed the effect of the incongruent action photo even though they were aware of the nature of the memory test. It is highly likely that, had the participants not been warned about the type of test at recall (i.e., for phrases or for photos), the effect would have been greater than the one we observed.

Phrases only presented at recognition

                         Accuracy            Response times
Never seen in photos     4.80 (SD = 0.4)     1874 (SD = 756)
Seen in photos           4.45 (SD = 0.7)     1977 (SD = 781)

Table 2. Means of correct rejections of phrases only presented at recognition, and relative response times (in milliseconds), depending on whether or not they described actions seen in photos presented at encoding (Experiment 1).

The global results of Experiment 1 confirmed our predictions: memory for action phrases was impaired when, at encoding, the phrases were paired with pictures of actions whose kinematics were incongruent with the kinematics of the actions described. Also, still in line with our predictions, action phrases only presented at recognition were misrecognized as read at encoding when they described actions depicted in pictures seen at encoding. The results suggest that simulation from action observation prevailed over simulation from action phrases when their effects were contrasted. Experiment 2 tested the complementary prediction that memory for action pictures is not impaired when, at encoding, the pictures are paired with phrases describing actions whose kinematics are incongruent with those of the actions depicted in the pictures.

4. Experiment 2: Memory for action photos is not impaired by action phrases featuring different kinematics

The participants in the experiment watched a series of action photos on a computer screen, each followed by a phrase describing either the action depicted in the photo or a different action on the same object. When presented with the photo-phrase pairs, participants were informed that the second part of the experiment was a memory test for the photos. Then, participants performed a recognition task: they encountered the same photos they had encountered at encoding, along with photos not presented at encoding, which either depicted actions described by phrases presented at encoding or not. We predicted that memory for action photos would not be impaired when, at encoding, the photos were paired with action phrases whose kinematics were incongruent with the kinematics of the actions depicted in the photos. However, action photos only presented at recognition would be misrecognized as seen at encoding when they depicted actions described in phrases presented at encoding.

4.1 Participants

The participants in the experiment were 40 adults, all Caucasian, Psychology students at the University of Turin (5 males and 35 females; mean age = 23.6 years, SD = 1.45 years). All had normal or corrected-to-normal vision. They took part in the experiment voluntarily in exchange for course credits, after giving informed consent. None of them had taken part in Experiment 1.

4.2 Material and Procedure

The experimental material consisted of 10 photos, each depicting an action on an object. Each photo was paired with an action phrase: in the congruent pairing, the phrase described the action depicted in the photo (hereafter, “congruent” phrase); in the incongruent pairing, the phrase described a different action performed on the same object (hereafter, “incongruent” phrase). The following is an example of the pairings:

[Photo not reproduced here.] Congruent phrase: "Open a box". Incongruent phrase: "Lift a box".

Also, the material comprised 10 photos only presented at recognition. Five of them depicted actions described by the phrases of the incongruent pairings at encoding (that is, actions not seen but read), and 5 of them were completely new photos (that is, neither seen nor read). Given the way in which the photos were devised, at recognition each object appeared twice: once in a photo already seen at encoding and once in a photo only presented at recognition. The photos and the phrases used in the experiment are in Appendix B.


We devised two experimental protocols, so that in each protocol each photo occurred with either the “congruent” action phrase or the “incongruent” action phrase, and in each protocol there were 5 congruent pairings and 5 incongruent pairings. In Protocol 1, photos numbered from 1 to 5 in Appendix B were paired with the congruent phrases (CO-congruent pairing), and photos numbered from 6 to 10 were paired with the incongruent phrases (IN-incongruent pairing). In Protocol 2, photos numbered from 1 to 5 in Appendix B were paired with the incongruent phrases (IN-incongruent pairing), and photos numbered from 6 to 10 were paired with the congruent phrases (CO-congruent pairing). The order of presentation of the photo/phrase pairs within each protocol was randomized for each participant using E-Prime 2.0 Software. Half of the participants were randomly assigned to Protocol 1 and half to Protocol 2.

The experimental procedure was the same as for Experiment 1, with the exception that the participants observed first the photo and then the phrase of each photo-phrase pair, and that they were told that in the second phase of the experiment their task would be to recognize the photos they had seen in the first phase.

4.3 Results

We analyzed correct acceptances ("old" items) and correct rejections ("new" items) separately because, in this experiment too, there were two types of old items and two types of new items. "Old" photos had been paired at encoding either with congruent or with incongruent phrases, and "new" photos were either totally new photos or "partially new" photos, in that the latter depicted actions read about in phrases at encoding.

Table 3 summarizes the means of correct recognitions of photos presented at encoding and the relative response times as a function of the pairing. As predicted, accuracy was no greater for congruent pairings than for incongruent pairings (Wilcoxon test: z = .13, one-tailed p = .45, Cliff's δ = .05). Further, still consistent with our prediction, response times for incongruent pairings and for congruent pairings did not differ significantly (Wilcoxon test: z = .55, one-tailed p = .29, Cliff's δ = .02).


Photos presented at encoding

                        Accuracy            Response times
Congruent pairing       4.53 (SD = 0.68)    1973 (SD = 983)
Incongruent pairing     4.53 (SD = 0.85)    2114 (SD = 1264)

Table 3. Means of correct recognitions of photos presented at encoding, and relative response times (in milliseconds), as a function of the congruent and incongruent pairings (Experiment 2).

Table 4 summarizes the results for the photos only presented at recognition. As predicted, correct rejections were more frequent for photos depicting actions never described by phrases encountered at encoding than for photos depicting actions described by phrases read at encoding (Wilcoxon test: z = 1.83, one-tailed p < .05, Cliff's δ = .19). Response times for photos depicting actions never described by phrases and for photos depicting actions described by phrases did not differ significantly (Wilcoxon test: z = .34, one-tailed p = .37, Cliff's δ = .08).

Since these null effects were critical to our predictions, we performed a Bayes Factor (BF) analysis (see Rouder, Speckman, Sun, Morey, & Iverson, 2009) in order to determine the ratio of evidence in favour of the null hypothesis over the alternative hypothesis for our pair-wise comparisons of interest. Bayes factor tests were run using JASP software (JASP Team, JASP Version 0.8.2, 2017) with the default JASP prior, that is, a Cauchy prior with a location parameter of 0 and a scale parameter of 0.707. Specifically, we compared both the accuracy and the response times (of correct responses) for congruent pairings with those for incongruent pairings, testing H0 (no differences between the two conditions) against H1 (differences between the two conditions) using a Bayesian t-test. As regards accuracy, we obtained a BF01 of 5.86 for the difference between the two conditions, whereas for the response times we obtained a BF01 of 4.01. This suggests that both datasets actually provide more support for the null hypothesis, the data being 5.86 and 4.01 times more likely to occur under the null hypothesis than under the alternative hypothesis (see Jarosz & Wiley, 2014).
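The computation behind these values can be made explicit. Below is a minimal sketch of the two-sided JZS Bayes factor of Rouder et al. (2009) under the same zero-centred Cauchy prior (scale 0.707) that JASP uses by default; the function name and example values are ours, `t` is the paired t statistic and `n` the number of participants, and a one-sided test (as in Experiment 3 below) would further restrict the prior to the predicted direction:

```python
import numpy as np
from scipy.integrate import quad

def bf01_jzs(t, n, r=0.707):
    """BF01 for a one-sample/paired t statistic under a Cauchy(0, r)
    prior on effect size (Rouder, Speckman, Sun, Morey, & Iverson, 2009)."""
    nu = n - 1
    # marginal likelihood of the data under H0 (effect size exactly 0)
    m0 = (1 + t**2 / nu) ** (-(nu + 1) / 2)
    # under H1, integrate over g: the Cauchy prior is a scale mixture of
    # normals, with g following an inverse-gamma(1/2, 1/2) distribution
    def integrand(g):
        k = 1 + n * g * r**2
        return (k ** -0.5
                * (1 + t**2 / (k * nu)) ** (-(nu + 1) / 2)
                * (2 * np.pi) ** -0.5 * g ** -1.5 * np.exp(-1 / (2 * g)))
    m1, _ = quad(integrand, 0, np.inf)
    return m0 / m1

print(bf01_jzs(t=0.5, n=40))  # a small t with n = 40 yields BF01 > 1, favouring H0
```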

Photos only presented at recognition

                          Accuracy            Response times
Never read in phrases     3.73 (SD = 1.11)    1943 (SD = 715)
Read in phrases           3.38 (SD = 1.10)    1895 (SD = 813)

Table 4. Means of correct rejections of photos only presented at recognition, and relative response times (in milliseconds), depending on whether or not they depicted actions described by phrases presented at encoding (Experiment 2).

The results of Experiment 2 were not directly comparable to those of Experiment 1 because the two experiments involved two different groups of participants. However, an emerging pattern was that participants were most accurate in rejecting phrases describing actions neither read nor seen in photos (Experiment 1), and least accurate in rejecting photos depicting actions neither seen nor described in phrases (Experiment 2). We believe that this result can be explained in terms of an experimental artifact. In Experiment 2, each photo depicting an action neither seen in a photo nor described in a phrase at encoding nevertheless showed the same actor and the same object that had been seen in a photo at encoding, although the actor was performing a different action with the object. Hence, in a way, this new photo was familiar to the participants. The same was not true for the phrases not read at encoding in Experiment 1.

The global results of Experiment 2 confirmed our prediction: memory for action photos was not impaired when, at encoding, the photos were paired with phrases describing actions whose kinematics were incongruent with the kinematics of the actions depicted in the photos. Also, again in line with our predictions, action photos only presented at recognition were misrecognized as seen at encoding when they depicted actions described in phrases presented at encoding. These results strengthened the assumption that simulation from action observation prevails over simulation from action phrases when their effects are contrasted. The reverse does not seem to occur.

The aim of Experiment 3 was to exclude an alternative explanation of the results of Experiments 1 and 2, namely, that memory for action pictures is better than memory for action phrases.

5. Experiment 3: Is memory for pictures of actions better than memory for action phrases?

At encoding, the participants encountered phrases and pictures, each in a separate block: the phrases were those presented at encoding in Experiment 1 (block 1) and the pictures were those presented at encoding in Experiment 2 (block 2). At the end of each block, the participants performed a recognition task (for the phrases at the end of block 1 and for the photos at the end of block 2). If memory for action pictures were better than memory for action phrases, then the results of Experiments 1 and 2 could be explained in terms of a hypermnesia for action pictures.

5.1 Participants

The participants in the experiment were 31 Psychology students at the University of Turin (11 males and 20 females; mean age = 25.77 years, SD = 2.99 years), all Caucasian, with normal or corrected-to-normal vision. They took part in the experiment voluntarily, in exchange for course credits, after giving informed consent. None of them had taken part in Experiments 1 or 2.

5.2 Material and Procedure

The experimental material consisted of all and only the action phrases and the action pictures used in Experiments 1 and 2. In block 1, the participants encountered at encoding and at recognition the phrases encountered by the participants in Experiment 1 at encoding and at recognition, respectively (see Appendix A). Therefore, they encountered 10 action phrases at encoding and the same 10 phrases (i.e., signals) at recognition, along with 10 other action phrases (i.e., noise trials; see the bottom of Appendix A). In block 2, the participants encountered at encoding and at recognition the photos encountered by the participants in Experiment 2 at encoding and at recognition, respectively (see Appendix B). Therefore, they encountered 10 action photos at encoding and the same 10 photos (i.e., signals) at recognition, along with 10 other action photos (i.e., noise trials). Each participant dealt with both blocks of trials, and the order of the two blocks was counterbalanced across participants.

Further, in between the two blocks of trials the participants were invited to perform a reading task (i.e., to search for articles in a short written scientific text); since reading requires visual and linguistic processing (e.g., Engle & Conway, 1988), this procedure was meant to avoid possible interference between memory for the actions involved in the first block and memory for the actions involved in the second block.

In each block, at encoding, the participants observed the phrases or the photos on a computer screen, each for 5 seconds. Soon after the presentation, the participants performed a recognition task in which they observed the phrases or photos for 5 seconds each. The action phrases or photos presented at encoding required affirmative responses, and those only presented at recognition required negative responses. At both encoding and recognition, the presentation of the phrases and the photos was randomized for each participant using E-Prime 3.0 Software.

The experiment took place in a single session in the sole presence of the experimenter. The participants sat on a chair in front of a desk on which a computer was placed (approximately 8 inches from the participant). At the beginning of the experimental session they received the following instructions: "Thank you for participating in this experiment, which is in two parts: one is concerned with the ability to remember phrases, and one with the ability to remember photos".

Then, for the block of trials concerning memory for phrases, the participants received the following instructions: "Now I will show you some phrases on the computer screen. Please read them carefully because later on I will present you some phrases and for each one I will ask you whether you have already read it. Each sentence will be presented for 5 seconds. Press the space bar when you're ready". At the end of the presentation of the phrases, the following instructions appeared on the screen: "Now I will present you with some phrases. For each phrase, you should determine whether you have already read it. Please press 'yes' if you have already read it and 'no' if you have not. Press the space bar when you're ready".

For the block of trials concerning memory for photos, the participants received the following instructions: "Now I will show you some photos on the computer screen. Please watch them carefully because later on I will present you some photos and for each one I will ask you if you have already seen it. Each photo will be presented for 5 seconds. Press the space bar when you're ready". At the end of the presentation of the photos, the following instructions appeared on the screen: "Now I will present you with some photos. For each photo, you should determine whether you have already seen it. Please press 'yes' if you have already seen it and 'no' if you have not already seen it. Press the space bar when you're ready".

In the recognition tasks of both block 1 and block 2, the B and N keys of the keyboard were labelled YES and NO.

5.3 Results

Our recognition test was a yes-no task involving signal and noise trials: the participants encountered 10 signals (i.e., original phrases/photos seen at encoding) and 10 noise trials (i.e., phrases/photos only presented at recognition). On signal trials, "yes" responses were correct (hits); on noise trials, "yes" responses were incorrect (false alarms). Table 5 illustrates the means of hits and false alarms in action phrase and action photo recognition.

            Hits                False alarms
Phrases     0.89 (SD = 0.13)    0.15 (SD = 0.16)
Photos      0.93 (SD = 0.09)    0.20 (SD = 0.14)

Table 5. Means (and standard deviations) of hits and false alarms in action phrase and action photo recognition in Experiment 3.


To evaluate how participants distinguished between signal and noise trials, we computed d′, a measure of signal detection sensitivity (z-scored hit rate minus z-scored false alarm rate; Green & Swets, 1966), for both the phrase and the photo tasks. For each participant, we calculated the proportion of hits ("yes" responses when the stimulus was a signal) and false alarms ("yes" responses when the stimulus was a noise trial). Since signals differed markedly from noise trials, some hit and false alarm rates were equal to 1 (corresponding to a z score of +∞) or 0 (corresponding to a z score of -∞). Therefore, hit and false alarm rates were calculated by adding 0.5 to both the number of hits and the number of false alarms and adding 1 to both the number of signal trials and the number of noise trials (Hautus, 1995). The results revealed that, as predicted, d′ for memory for phrases (mean of 2.26, SD = .93) and d′ for memory for photos (mean of 2.14, SD = .75) did not differ significantly (t(30) = .64, p = .53).
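For illustration, this correction can be written as the following minimal sketch (the function name and example counts are ours and hypothetical). The sketch also returns the response bias c = -0.5 × (z-scored hit rate + z-scored false alarm rate), which we report later in this section:

```python
from scipy.stats import norm

def sdt_measures(hits, false_alarms, n_signal, n_noise):
    """d' and bias c with the Hautus (1995) correction: add 0.5 to the hit
    and false-alarm counts and 1 to the signal and noise trial counts, so
    that perfect rates of 1 or 0 no longer map onto z = +/- infinity."""
    h = (hits + 0.5) / (n_signal + 1)
    f = (false_alarms + 0.5) / (n_noise + 1)
    zh, zf = norm.ppf(h), norm.ppf(f)
    return zh - zf, -0.5 * (zh + zf)  # (d', c)

# e.g., a participant with 10/10 hits and 2/10 false alarms
d_prime, c = sdt_measures(hits=10, false_alarms=2, n_signal=10, n_noise=10)
print(f"d' = {d_prime:.2f}, c = {c:.2f}")
```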

Also, we did not detect significant differences between the time the participants took to respond correctly to signals (i.e., “yes” response) as opposed to noise trials (i.e., “no” response) when dealing with recognition memory for phrases (mean RT= 1666, SD= 624) compared to when dealing with recognition memory for photos (mean RT= 1743, SD= 935; t(30) = .88, p = .39).

Finally, since these null effects were critical to our prediction, we performed a Bayes Factor (BF) analysis (see Rouder et al., 2009) in order to determine the ratio of evidence in favour of the null hypothesis over the alternative hypothesis for our pair-wise comparisons of interest. Bayes factor tests were run using JASP software (JASP Team, JASP Version 0.8.2, 2017) with the default JASP prior, that is, a Cauchy prior with a location parameter of 0 and a scale parameter of 0.707. In particular, we compared both the d′ values and the response times (of correct responses) for the phrase memory task with those for the photo memory task, testing H0 (no differences between the two conditions) against H1 (photos are more likely to be remembered than phrases; reaction times are shorter for photos than for phrases) using a Bayesian t-test. As regards d′, we obtained a BF01 of 7.96 for the difference between the two tasks, whereas for the response times we obtained a BF01 of 9.06. This suggests that both datasets actually provide more support for the null hypothesis, the data being 7.96 and 9.06 times more likely to occur under the null hypothesis than under the alternative hypothesis (see Jarosz & Wiley, 2014).

To investigate response bias, we computed c using the following formula: c = -0.5 × (z-scored hit rate + z-scored false alarm rate) (Green & Swets, 1966). Response bias was negative for both memory tasks (c = -.07, SD = .27 for the phrase task; c = -.24, SD = .22 for the photo task), which showed that there was a tendency to respond "yes" more often than "no". This bias was significantly stronger for the photo task than for the phrase task (t(30) = 2.91, p < .01), indicating that a more liberal response criterion was adopted for the photo task, whereas for the phrase task the criterion was slightly more conservative.

The results of Experiment 3 revealed that memory for action photos is not greater than memory for action phrases, at least with the specific material used in our experiments. These results, along with those of Experiments 1 and 2, entitle us to conclude that simulation from action observation prevailed over simulation from action phrases when their effects were contrasted.

6. Discussion and Conclusions

A vast literature has evidenced the pivotal role of mental simulation in action observation and language comprehension. Some scholars have hypothesized that mental simulation exploits similar sensorimotor resources in these two circumstances (see, e.g., Dijkstra & Post, 2015).

Consistent with evidence that simulations from observation and from language share some features, we tested the prediction that when the effects of the two are not contrasted, their products are likely to be misrecognized. The results of our experiments confirmed this prediction: the participants in Experiment 1 misrecognized as read at encoding action phrases describing actions depicted in photos seen at encoding but not read at encoding, and the participants in Experiment 2 misrecognized as seen at encoding photos depicting actions described by phrases read at encoding but never seen at encoding.

However, studies in the literature have also suggested that simulation in language comprehension differs from simulation in action observation in some respects. In line with that claim, we assumed that action observation triggers direct simulation, whereas language comprehension involves indirect, language-mediated, simulation. In our experiments we tested a main prediction deriving from this assumption: when the effects of the two types of simulation are contrasted, simulation from action observation prevails over simulation from language comprehension. The results of our experiments confirmed this prediction as well: the participants in Experiment 1 were less accurate in recognizing action phrases paired at encoding with photos depicting actions featuring different kinematics, whereas participants' accuracy in recognizing photos in Experiment 2 did not suffer from incongruent photo-phrase pairings at encoding.

There is a possible alternative explanation of our results. It is well known that visual long-term memory is better for pictures than for text. According to the so-called "picture superiority effect" (e.g., Paivio & Csapo, 1973), people remember more details from pictures than they do from words. As such, the asymmetric pattern of results we detected in our experiments could be attributed to better memory for pictures compared to phrases. In this case, there would be no need to invoke mental simulation arising from action observation in order to explain the findings of Experiments 1 and 2. In order to rule out the possibility that our findings were simply a consequence of better memory for pictures than for text, we devised Experiment 3, in which we compared memory for phrases with memory for pictures, using the phrases from Experiment 1 and the pictures from Experiment 2. Experiment 3 revealed no difference between these two types of memory, thereby excluding the alternative explanation of our results: the effects observed in Experiments 1 and 2 could not be accounted for just by the "picture superiority effect". We had no a priori reason to expect that the results obtained in the classical studies run within the dual-code paradigm would indeed generalize to stimuli like ours. Our stimuli were not words but action phrases, and we know that the two differ in two main respects: verbs elicit more motor activation than non-verb words, and phrases are more elaborated than single words. These could be the reasons why memory for phrases was not worse than memory for pictures in Experiment 3. In this respect, the results of Experiment 3 are new and seem to suggest that the results obtained in the dual-code literature are not generalizable to more complex visual and verbal stimuli.

Our aim was to investigate how the congruity of the information conveyed by language and by body movement at encoding may affect memory. In this respect, few studies have investigated the relation between language and body kinematics, whereas there are plenty of studies investigating the effect of speech-gesture mismatches on memory for speech. Our assumption that simulation from action observation prevails over simulation from action phrases when their effects are contrasted echoes the findings in the literature on speech-gesture mismatch. When gestures convey information contradicting that conveyed by speech, they disturb sentence processing (see, e.g., Cassell, McNeill & McCullough, 1999), listeners more strongly engage their motor system to "simulate" the mismatching gesture and reevaluate whether it fits with the processed speech signal (Drijvers, Özyürek & Jensen, 2018), and recall of sentences declines over time (Feyereisen, 2006).

A limitation of our investigation is the complex nature of the stimuli: since they involve both the observed action kinematics and the kinematics elicited by the object involved in the action, they do not allow us to disentangle the roles of action kinematics and object kinematics in the congruence/incongruence effect; future studies could dissociate the contribution of each of these components. Further, future studies could investigate the congruence/incongruence effect when action observation is contrasted with comprehension of spoken rather than written action phrases: in this case participants could be facilitated in attending just to the phrases or the photos, and they might not experience the effect of incongruent pairings. Also, since the results of some studies have revealed that simulation from single action verbs is very fast (see, e.g., Mirabella, Iaconelli, Spadacenta, Federico & Gallese, 2012), future studies could investigate in more depth how different degrees of motor activation elicited by different kinds of linguistic stimuli, as compared to motor activation from action observation, impact the congruence/incongruence effect.


Finally, why might simulations from perceived states of affairs be more direct than simulations from described states of affairs? A possible reason could be that "[…] the conceptual system evolved primarily to process nonlinguistic stimuli, including perceptual, motor, and introspective aspects of experience" (Barsalou, 2008, p. 248). Therefore, the processing of experience is more central to human cognition than the processing of words. This view is in line with the idea of a direct perception of another person's intentions (Gallagher, 2008): we can understand an action just through an implicit and non-reflexive simulation mechanism (Gallese, 2007). Some scholars before us hypothesized that mental simulations triggered by different stimuli might fall into different categories (see, e.g., Kent & Lamberts, 2008; Moulton & Kosslyn, 2004). The results of our investigation suggest a clear distinction: a direct simulation seems to mediate action observation, whereas an indirect simulation appears to be at stake in language comprehension.

Acknowledgements

This research did not receive any specific grant from funding agencies in the public, commercial, or not-for-profit sectors.

References

Bach, P., Nicholson, T., & Hudson, M. (2014). The affordance-matching hypothesis: How objects guide action understanding and prediction. Frontiers in Human Neuroscience, 8, 254. https://doi.org/10.3389/fnhum.2014.00254

Baddeley, A. D. (2002). Is working memory still working? European Psychologist, 7, 85-97. https://doi.org/10.1027//1016-9040.7.2.85

Barsalou, L. W. (2008). Grounded cognition. Annual Review of Psychology, 59, 617-645. https://doi.org/10.1146/annurev.psych.59.103006.093639


Barsalou, L. W., Santos, A., Simmons, W. K., & Wilson, C. D. (2008). Language and simulation in conceptual processing. In M. De Vega, A. M. Glenberg, & A. Graesser (Eds.), Symbols, embodiment, and meaning (pp. 245-283). Oxford, UK: Oxford University Press. https://doi.org/10.1093/acprof:oso/9780199217274.003.0013

Bauer, M. I., & Johnson-Laird, P. N. (1993). How diagrams can improve reasoning. Psychological Science, 4, 372-378. https://doi.org/10.1111/j.1467-9280.1993.tb00584.x

Blakemore, S. J., & Decety, J. (2001). From the perception of action to the understanding of intention. Nature Reviews Neuroscience, 2, 561-567. https://doi.org/10.1038/35086023

Bucciarelli, M. (2007). How the construction of mental models improves learning. Mind & Society, 6, 67-89. https://doi.org/10.1007/s11299-006-0026-y

Cardellicchio, P., Sinigaglia, C., & Costantini, M. (2013). Grasping affordances with the other's hand: A TMS study. Social Cognitive and Affective Neuroscience, 8, 455-459. https://doi.org/10.1093/scan/nss017

Caspers, S., Zilles, K., Laird, A. R., & Eickhoff, S. B. (2010). ALE meta-analysis of action observation and imitation in the human brain. NeuroImage, 50, 1148-1167. https://doi.org/10.1016/j.neuroimage.2009.12.112

Cassell, J., McNeill, D., & McCullough, K. E. (1999). Speech-gesture mismatches: Evidence for one underlying representation of linguistic and nonlinguistic information. Pragmatics & Cognition, 7, 1-34. https://doi.org/10.1075/pc.7.1.03cas

Castiello, U. (2005). The neuroscience of grasping. Nature Reviews Neuroscience, 6, 726-736. https://doi.org/10.1038/nrn1744

Costantini, M., Committeri, G., & Sinigaglia, C. (2011). Ready both to your and to my hands: mapping the action space of others. PloS One, 6, e17923.

(29)

Cowan, N. (2008). What are the differences between long-term, short-term, and working memory? Progress in Brain Research, 169, 323–338. https://doi.org/10.1016/S0079-6123(07)00020-9

Dijkstra, K., & Post, L. (2015). Mechanisms of embodiment. Frontiers in Psychology, 6, 1525. http://10.3389/fpsyg.2015.01525

Ditman, T., Brunyé, T. T., Mahoney, C. R., & Taylor, H. A. (2010). Simulating an enactment effect: Pronouns guide action simulation during narrative comprehension. Cognition, 115, 172-178. https://doi.org/10.1016/j.cognition.2009.10.014

Drijvers, L., Özyürek, A., & Jensen, O. (2018). Hearing and seeing meaning in noise: alpha, beta, and gamma oscillations predict gestural enhancement of degraded speech

comprehension. Human Brain Mapping, 39, 2075-2087.

Engle, R. W., & Conway, A. R. A. (1988). Working memory and comprehension. In R. H. Logie & K. J. Gilhooly (Eds.), Working memory and thinking, 67-91. UK: Psychology Press.

Erdely, M. H., & Becker, J. (1974). Hypermnesia for pictures: Incremental memory for pictures, but not words in multiple recall trials. Cognitive Psychology, 6, 159-171.

https://doi.org/10.1016/0010-0285(74)90008-5

Feyereisen, P. (2006). Further investigation on the mnemonic effect of gestures: Their meaning matters. European Journal of Cognitive Psychology, 18, 185-205.

https://doi.org/10.1080/09541440540000158

Fischer, M. H., & Zwaan, R. A. (2008). Embodied language: A review of the role of the motor system in language comprehension. Quarterly Journal of Experimental Psychology, 61, 825-850. http://10.1080/17470210701623605

Freyd, J. J. (1983). The mental representation of movement when static stimuli are

viewed. Perception & Psychophysics, 33, 575-581. http://dx.doi.org/10.3758/BF03202940 Gallagher, S. (2008). Direct perception in the intersubjective context. Consciousness and

(30)

Gallese, V. (2005). “Being Like Me”: Self-Other Identity, Mirror Neurons, and Empathy. In S. Hurley & N. Chater (Eds.), Perspectives on imitation: From neuroscience to social science: Vol. 1. Mechanisms of imitation and imitation in animals (pp. 101-118). Cambridge, MA, US: MIT Press.

Gallese, V. (2007). Before and below ‘theory of mind’: embodied simulation and the neural correlates of social cognition. Philosophical Transactions of the Royal Society of London B: Biological Sciences, 362, 659-669. DOI: 10.1098/rstb.2006.2002

Gallese, V., Fadiga, L., Fogassi, L., & Rizzolatti, G. (1996). Action recognition in the premotor cortex. Brain, 119, 593-609. http://dx.doi.org/10.1093/brain/119.2.593

Glenberg, A. M., Sato, M., Cattaneo, L., Riggio, L., Palumbo, D., & Buccino, G. (2008). Processing abstract language modulates motor system activity. The Quarterly Journal of Experimental Psychology, 61, 905-919. https://doi.org/10.1080/17470210701625550

Glenberg, A. M., & Kaschak, M. P. (2002). Grounding language in action. Psychonomic Bulletin & Review, 9, 558-565. http://10.3758/BF03196313

Glenberg, A. M., & Langston, W. E. (1992). Comprehension of illustrated text: Pictures help to build mental models. Journal of Memory and Language, 31, 129-151.

https://doi.org/10.1016/0749-596X(92)90008-L

Green, D. M., & Swets, J. A. (1966). Signal detection theory and psychophysics. New York: Wiley Hauk, O., Johnsrude, I., & Pulvermüller, F. (2004). Somatotopic representation of action words in

human motor and premotor cortex. Neuron, 41, 301-307. https://doi.org/10.1016/S0896-6273(03)00838-9

Hautus, M. J. (1995). Corrections for extreme proportions and their biasing effects on estimated values of d′. Behavior Research Methods, Instruments, & Computers, 27, 46-51. DOI: 10.3758/BF03203619

(31)

Hegarty, M. (1992). Mental animation: inferring motion from static diagrams of mechanical

systems. Journal of Experimental Psychology, Learning, Memory & Cognition, 18, 1084-1102. http://dx.doi.org/10.1037/0278-7393.18.5.1084

Ianì, F., Mazzoni, G., & Bucciarelli, M. (2018). The role of kinematic mental simulation in creating false memories. Journal of Cognitive Psychology, 30, 292-306.

https://doi.org/10.1080/20445911.2018.1426588

JASP Team (2017). JASP (Version 0.8.2)[Computer software].

Jeannerod, M. (1994). The representing brain: Neural correlates of motor intention and imagery. Behavioral and Brain sciences, 17, 187-202.

https://doi.org/10.1017/S0140525X00034026

Jeannerod, M. (2001). Neural simulation of action: a unifying mechanism for motor cognition. NeuroImage, 14, S103-S109. http://10.1006/nimg.2001.0832

Jarosz, A. F., & Wiley, J. (2014). What are the odds? A practical guide to computing and reporting Bayes factors. The Journal of Problem Solving, 7, 2-9. DOI: 10.7771/1932-6246.1167

Johnson-Laird, P. N. (1983). Mental models: Towards a cognitive science of language, inference, and consciousness. Cambridge: Cambridge University Press. Cambridge, MA: Harvard University Press.

Johnson-Laird, P. N. (2006). How we reason. Oxford University Press, USA.

Kent, C., & Lamberts, K. (2008). The encoding–retrieval relationship: retrieval as mental simulation. Trends in Cognitive Sciences, 12, 92-98.

http://psycnet.apa.org/doi/10.1016/j.tics.2007.12.004

Kintsch, W. (1998). Comprehension: A paradigm for cognition. New York, NY: Cambridge University Press.

Kourtzi, Z., & Kanwisher, N. (2000). Activation in human MT/MST by static images with implied motion. Journal of Cognitive Neuroscience, 12, 48-55.

(32)

Mattar, A. A., & Gribble, P. L. (2005). Motor learning by observing. Neuron, 46(1), 153-160. https://doi.org/10.1016/j.neuron.2005.02.009

McKendree, J., Small, C., Stenning, K., & Conlon, T. (2002) The role of representation in teaching and learning critical thinking. Educational Review, 54, 57–67.

http://dx.doi.org/10.1080/00131910120110884

Mirabella, G., Iaconelli, S., Spadacenta, S., Federico, P., & Gallese, V. (2012). Processing of hand-related verbs specifically affects the planning and execution of arm reaching movements. PloS One, 7, e35403. https://doi.org/10.1371/journal.pone.0035403

Moulton, S. T., & Kosslyn, S. M. (2009). Imagining predictions: mental imagery as mental

emulation. Philosophical Transactions of the Royal Society B: Biological Sciences, 364, 1273-1280. http://10.1098/rstb.2008.0314

Paivio, A. & Csapo, K. (1973). Picture superiority in free recall: Imagery or dual coding? Cognitive Psychology, 5, 176-206. https://doi.org/10.1016/0010-0285(73)90032-7

Quesque, F., & Coello, Y. (2015). Perceiving what you intend to do from what you do: evidence for embodiment in social interactions. Socioaffective Neuroscience & Psychology, 5,

10.3402/snp.v5.28602. http://doi.org/10.3402/snp.v5.28602R

Rizzolatti, G. (2005). The mirror neuron system and its function in humans. Anatomy and Embryology, 210, 419-421. http://10.1007/s00429-005-0039-z

Rouder, J. N., Speckman, P. L., Sun, D., Morey, R. D., & Iverson, G. (2009). Bayesian t tests for accepting and rejecting the null hypothesis. Psychonomic Bulletin & Review, 16, 225-237. https://doi.org/10.3758/PBR.16.2.225

Schaeken, W. S., Johnson-Laird, P. N., d’Ydewalle, G. (1996). Mental models and temporal reasoning. Cognition, 60(3), 205-234. https://doi.org/10.1016/0010-0277(96)00708-1 Schütz-Bosbach, S., & Prinz, W. (2007). Prospective coding in event representation. Cognitive

(33)

Schaller, F., Weiss, S., & Müller, H. M. (2017). EEG beta-power changes reflect motor

involvement in abstract action language processing. Brain and Language, 168, 95-105. doi: 10.1016/j.bandl.2017.01.010

Sims, V. K., & Hegarty, M. (1997). Mental animation in the visuospatial sketchpad: Evidence from dual-task studies. Memory & Cognition, 25, 321-332. DOI: 10.3758/BF03211288

Stanfield, R. A., & Zwaan, R. A. (2001). The effect of implied orientation derived from verbal context on picture recognition. Psychological Science, 12, 153-156.

https://doi.org/10.1111/1467-9280.00326

Stefan, K., Cohen, L. G., Duque, J., Mazzocchio, R., Celnik, P., Sawaki, L., Ungerleider, L., & Classen, J. (2005). Formation of a motor memory by action observation. Journal of

Neuroscience, 25, 9339-9346. DOI: https://doi.org/10.1523/JNEUROSCI.2282-05.2005 Stenning, K. (2000). Distinctions with differences: comparing criteria for distinguishing

diagrammatic from sentential systems. Theory and application of diagrams. Lecture Notes in Artificial Intelligence, 1899, 132–148. http://10.1007/3-540-44590-0

Taylor, L. J., & Zwaan, R. A. (2008). Motor resonance and linguistic focus. Quarterly Journal of Experimental Psychology, 61, 896-904. http://10.1080/17470210701625519

Tucker, M., & Ellis, R. (1998). On the relations between seen objects and components of potential actions. Journal of Experimental Psychology: Human perception and performance, 24, 830-846. http://psycnet.apa.org/doi/10.1037/0096-1523.24.3.830

Tucker, M., & Ellis, R. (2004). Action priming by briefly presented objects. Acta Psychologica, 116, 185-203. http://10.1016/j.actpsy.2004.01.004

Urgesi, C., Maieron, M., Avenanti, A., Tidoni, E., Fabbro, F., & Aglioti, S. M. (2010). Simulating the future of actions in the human corticospinal system. Cerebral Cortex, 20, 2511-2521. https://doi.org/10.1093/cercor/bhp292

(34)

Urgesi, C., Moro, V., Candidi, M., & Aglioti, S. M. (2006). Mapping implied body actions in the human motor system. Journal of Neuroscience, 26, 7942-7949.

https://doi.org/10.1523/JNEUROSCI.1289-06.2006

Van Overwalle, F., & Baetens, K. (2009). Understanding others' actions and goals by mirror and mentalizing systems: a meta-analysis. NeuroImage, 48, 564-584.

http://10.1016/j.neuroimage.2009.06.009

Wilson, M., & Knoblich, G. (2005). The case for motor involvement in perceiving conspecifics. Psychological Bulletin, 131, 460-473. http://dx.doi.org/10.1037/0033-2909.131.3.460

Zwaan, R. A., Stanfield, R. A., & Yaxley, R. H. (2002). Language comprehenders mentally represent the shapes of objects. Psychological Science, 13, 168-171.

https://doi.org/10.1111/1467-9280.00430

Zwaan, R. A., & Taylor, L. J. (2006). Seeing, acting, understanding: Motor resonance in language comprehension. Journal of Experimental Psychology: General, 135, 1-11.

(35)

Appendix A.

The experimental material of Experiment 1. The phrases are translated from Italian. For each numbered phrase, the labels N_CO and N_IN stand for the paired photographs (omitted here) whose kinematics were, respectively, congruent and incongruent with the phrase.

1. Drink from a bottle

1_CO 1_IN

2. Lean out the window

2_CO 2_IN

3. Open a box

3_CO 3_IN

4. Kick a ball

4_CO 4_IN

5. Play a trumpet

5_CO 5_IN


6. Hang a hat

6_CO 6_IN

7. Store the pen

7_CO 7_IN

8. Knock on a door

8_CO 8_IN

9. Make a pass

9_CO 9_IN

10. Open a suitcase

10_CO 10_IN


The actions portrayed in the incongruent pictures: Pour from a bottle; Open the window; Lift a box; Pick up a ball; Put away the trumpet; Wear a hat; Write with a pen; Open a door; Shoot a basket; Lift a suitcase

Appendix B.

The experimental material of Experiment 2. The phrases are translated from Italian. For each numbered photograph (omitted here), the labels N_CO and N_IN stand for the phrases describing kinematics, respectively, congruent and incongruent with the pictured action.

1.

1_CO Drink from a bottle 1_IN Pour from a bottle

2.

2_CO Lean out the window 2_IN Open the window

3.

3_CO Open a box 3_IN Lift a box

4.

4_CO Kick a ball 4_IN Pick up a ball

5.

5_CO Play a trumpet 5_IN Put away the trumpet

6.

6_CO Hang a hat 6_IN Wear a hat

7.

7_CO Store the pen 7_IN Write with a pen

8.

8_CO Knock on a door 8_IN Open a door

9.

9_CO Make a pass 9_IN Shoot a basket

10.

10_CO Open a suitcase 10_IN Lift a suitcase
