Call the plumber, I need to fix my television: Interpreting (and Enriching) Typical and Atypical Sentences


Academic year: 2021


Department of Philology, Literature and Linguistics

PhD Programme in Linguistic Disciplines and Foreign Literatures

Cycle XXXII

Doctoral Thesis

SUPERVISOR

Prof. Alessandro LENCI

CANDIDATE

Paolo VASSALLO

Index

INTRODUCTION

CHAPTER 1: ARGUMENT STRUCTURES AND EVENT KNOWLEDGE IN LINGUISTICS AND COGNITIVE SCIENCE

1.1 SEMANTIC ROLES IN LINGUISTIC THEORY

1.1.1 The Classical Notion of Semantic Role: syntax-semantics interface

1.1.2 Dowty’s Proto-Roles

1.1.3 Towards a gradable conception of Selectional Restrictions

1.2 LEXICAL MEANING NOWADAYS

1.2.1 The Generalized Event Knowledge (GEK) paradigm: “lexical knowledge without a lexicon...”

1.2.2 Event Knowledge and Predictive Processing

1.2.3 From Generalized Semantic Roles to Verb-Specific Proto-Roles: Thematic Fit Judgments

1.2.4 Compositionality and Thematic Fit Update

1.3 DISTRIBUTIONAL MODELS OF THEMATIC FIT

1.3.1 An Exemplar-Based Model of Selectional Preferences

1.3.2 Datasets for Thematic Fit Modeling Evaluation

1.3.2.1 McRae et al. (1998) Dataset

1.3.2.2 Ferretti et al. (2001) Dataset: Instruments and Locations

1.3.2.3 Padó (2007) Dataset

1.3.3 Compositional Thematic Fit Modeling: Expectations Composition and Update

1.3.3.1 Bicknell et al. (2010) Datasets

1.4 THEMATIC RELATIONS AND EVENTS INTERPRETATION

1.5 SUMMARY

CHAPTER 2: THE DYNAMIC THEMATIC FIT DATASET

2.1 TRADITIONAL THEMATIC FIT DATASETS SHORTCOMINGS: NEED AN UPDATE…

2.2 DATA COLLECTION PROCEDURE 1: AGENTS AND PATIENTS

2.2.1 From Image Descriptions to Verb-Object List

2.2.2 Agents (and Patients) Production

2.2.3 Collecting Typicality Ratings for Agents and Patients: the Tasks

2.2.4 Results and Discussion

2.3 DATA COLLECTION PROCEDURE 2: INSTRUMENTS AND LOCATIONS

2.3.1 The Instrument Semantic Role

2.3.2 Producing Typical and Atypical Instruments

2.3.3 The Location Semantic Role

2.3.4 Producing Typical and Atypical Locations

2.3.5 Collecting Typicality Ratings for Instruments and Locations

2.3.6 Results and Discussion

2.4 DATA COLLECTION PROCEDURE 3: TIMES AND RECIPIENTS/GOALS

2.4.1 The Time/Temporal Role: at the boundary between Lexical and Discourse Semantics

2.4.2 Producing Typical and Atypical Times for Complex Events

2.4.3 Collecting Typicality Ratings for Times

2.4.4 Results and Discussion

2.4.5 The Recipient/Goal Semantic Role: Syntax-Semantics Interface

2.4.6 Producing Typical and Atypical Recipients

2.4.7 Collecting Typicality Ratings for Recipients


2.5 THE DTFIT DATASET: SUMMARY

2.6 THE DTFIT DATASET: EVALUATION STRATEGIES

2.7 THE DTFIT DATASET: EXPERIMENTS

2.8 THE DTFIT DATASET: CONCLUSIONS

CHAPTER 3: ATYPICAL EVENTS INTERPRETATION

3.1 INTERPRETING TYPICAL AND ATYPICAL EVENTS: WHAT IS IN A SENTENCE SEMANTIC REPRESENTATION?

3.2 GENERALIZED EVENT KNOWLEDGE MODULATION AND COMPOSITION IN ATYPICAL EVENTS COMPREHENSION

3.3 ATYPICAL EVENTS NORMING PRODUCTION TASK

3.3.1 From English Transitive Verbs to Agent-Verb-Patient Triples

3.3.2 Assessing Atypicality: collecting Typicality Ratings for Triples

3.3.3 Creating Target Sentences for Norming Tasks

3.3.4 Norming Tasks for Partial Sentences

3.3.5 Norming Task for Complete Sentences

3.3.6 Normalizing Collected Data

3.3.7 Comparing Inferential Content of Complete Sentences with event knowledge retrieved by the corresponding Partial Sentences

3.3.8 Annotating Event Participants with Thematic Relations Labels

3.3.9 Comparing single aspects of Inferential Content with the corresponding type of event knowledge retrieved by Partial Sentences

3.3.10 New Elements of Complete Sentences: where do they come from?

3.3.11 Integrating and Suppressing Elements from Partial Sentences

3.3.12 Conclusions

CHAPTER 4: GENERAL CONCLUSIONS

BIBLIOGRAPHY

APPENDIX A: DTFIT VERB LIST

APPENDIX B: AGENT-VERB PAIRS RATINGS

APPENDIX C: AGENT-VERB-PATIENT TRIPLES RATINGS

APPENDIX D: AGENT-VERB-PATIENT-INSTR RATINGS

APPENDIX E: AGENT-VERB-PATIENT-LOCATION RATINGS

APPENDIX F: AGENT-VERB-PATIENT-TIME RATINGS

APPENDIX G: AGENT-VERB-PATIENT-REC RATINGS

APPENDIX H: (A)TYPICALITY RATINGS FOR AGENT-VERB-PATIENT TRIPLES USED IN EXPERIMENTS ON ATYPICAL EVENTS INTERPRETATION

APPENDIX I: SUBJ-VERB-OBJ ATYPICAL SENTENCES

APPENDIX J: SUBJ-VERB PARTIAL SENTENCES

APPENDIX K: VERB-OBJ PARTIAL SENTENCES

APPENDIX L: SAMPLE OF ANNOTATED ELEMENTS PRODUCED FOR COMPLETE SENTENCES


Introduction

In the present work, concepts and Thematic Roles are written in capital letters (e.g. AGENT, PATIENT, INSTRUMENT, HOUSE, TREE), while word lemmas are in italics (e.g. house, tree).

The present thesis focuses on two main, related issues: (1) the collection of a dataset of human-elicited typicality ratings for verb-argument combinations of different lengths and (2) a first attempt at investigating the mechanism behind the interpretation of Subject-Verb-Object sentences describing atypical but possible events.

The dataset collection, introduced in Chapter 2, is aimed at providing a benchmark for computational models that try to simulate what is called in the current neuro-cognitive literature “Selectional Preferences Update”.

By collecting typicality judgements for candidate nouns in the context of broader verb-argument combinations, we test the ability of computational models to exploit compositional information and therefore to adapt the Selectional Preferences of a verb according to how its other argument positions are filled.

The requirement for such a dataset is twofold. On the one hand, it needs to contain typicality ratings for a wide inventory of Semantic Roles, allowing researchers to compare the performance of their models on a variety of argument positions. On the other hand, it needs to contain a considerable number of judgements for each Semantic Role, so that it can support general conclusions about how difficult it is to model Selectional Restrictions for each role.

As for the second issue, developed in Chapter 3, our aim is to shed some light on how speakers exploit their general world knowledge to interpret short sentences describing atypical events.

Assuming that speakers are able to comprehend atypical sentences and to build a coherent semantic representation that also includes elements not overtly expressed in the linguistic structure, we try to understand what kind of information speakers use to accomplish this process.

This topic is closely related to that of Selectional Restrictions, to the extent that we define atypicality as something that can be measured for a particular verb-argument combination using typicality ratings such as those collected in the previously mentioned dataset.


Chapter 1: Argument Structures and Event Knowledge in Linguistics and Cognitive Science

In the present chapter fundamental concepts of Semantic Roles Theory and Event Knowledge Exploitation are introduced.

In paragraph 1.1 the classical definition of Semantic Role (1.1.1), its later evolution into Proto-Role (1.1.2) and the introduction of a more gradable conception of Selectional Restrictions (1.1.3) are presented.

Paragraph 1.2 offers an overview of the status of Semantic Roles in contemporary theories of lexical meaning (1.2.1 and 1.2.3), the close relationship between Argument Structure and so-called Predictive Processing (1.2.2) and the impact of compositionality on Semantic Roles’ representations (1.2.4).

Paragraph 1.3 introduces Distributional Models of Thematic Fit: the main state-of-the-art exemplar models of Selectional Preferences (1.3.1) and the datasets commonly used to evaluate them (1.3.2) are described in detail; the less explored issue of simulating Selectional Preferences update and the available datasets are addressed in 1.3.3 and 1.3.3.1 respectively.

Finally, 1.4 briefly introduces an emergent type of semantic relations, Thematic Relations, and their possible contribution to language comprehension.


1.1 Semantic Roles in Linguistic Theory

1.1.1 The Classical Notion of Semantic Role: syntax-semantics interface

Generally speaking, Semantic Roles can be defined as labels that are assigned to verb argument positions; they identify the semantic relation between a verb and its arguments (Fillmore, 1970; Jackendoff, 2002; Levin & Rappaport Hovav, 2005).

Consider, for example, the three English transitive verbs kill, break and open: their very different meanings notwithstanding, we can identify semantic properties shared by their direct object arguments (i.e. direct object nouns change state, are not voluntarily involved in the event, etc.) (Fillmore, 1970; 1971b).

To put it more formally, Semantic Roles emerged as a way to name clusters of semantic properties shared by arguments of different verbs (Dowty, 1989; 1991).

The procedure traditionally followed in defining Semantic Roles can be summarized as follows.

Suppose we are given a set of English verbs having quite different meanings; how could we proceed to identify a finite set of Semantic Roles that allow us to fully characterize the meaning of each verb in the set?

For each verb in the given set, specify the semantic properties of its argument nouns; this way we end up having the properties of all verb arguments.

Then, imagine grouping arguments that share exactly the same properties into equivalence classes: we would probably obtain a small number of classes that contain a considerable number of verb arguments and many other classes containing very few arguments.

Ignoring, for the moment, the classes with very few members, the labels assigned to the remaining ones are the so-called Semantic Roles.

Semantic Roles are in fact sets of verb arguments that share exactly the same configuration of semantic properties¹ (Levin & Rappaport Hovav, 2005).

¹ Notice that we have omitted the question concerning the nature and granularity of the semantic features used to characterize argument positions, which is indeed a fundamental and debated question in the field of Semantic Roles Theory (Dowty, 1991; Jackendoff, 2002; Levin & Rappaport Hovav, 2005).
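The grouping procedure just described can be illustrated with a minimal sketch. The property names and the annotations assigned to each argument slot are hypothetical and chosen only for illustration; they are not a feature inventory taken from the literature, and the threshold `min_size` is likewise an assumption.

```python
from collections import defaultdict

# Hypothetical property annotations for a few verb argument slots.
# Property names and assignments are illustrative assumptions only.
ARGUMENT_PROPERTIES = {
    ("kill", "subject"): {"volitional", "causes_change"},
    ("break", "subject"): {"volitional", "causes_change"},
    ("open", "subject"): {"volitional", "causes_change"},
    ("kill", "object"): {"changes_state", "non_volitional"},
    ("break", "object"): {"changes_state", "non_volitional"},
    ("open", "object"): {"changes_state", "non_volitional"},
    ("resemble", "object"): {"non_volitional"},
}


def equivalence_classes(properties):
    """Group argument slots that share exactly the same property set."""
    classes = defaultdict(list)
    for slot, props in properties.items():
        classes[frozenset(props)].append(slot)
    return dict(classes)


def candidate_roles(properties, min_size=2):
    """Keep only the classes with enough members to count as a role."""
    return {
        props: slots
        for props, slots in equivalence_classes(properties).items()
        if len(slots) >= min_size
    }


roles = candidate_roles(ARGUMENT_PROPERTIES)
for props, slots in roles.items():
    print(sorted(props), "->", slots)
```

In this toy setting the two large classes would correspond to what the literature labels AGENT and PATIENT, while the singleton class for the object of resemble is set aside, mirroring the step of ignoring classes with very few members.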


The number and type of roles is a matter of debate in Linguistics and there is little consensus on the proposed lists; it is worth noting, however, that a few roles seem uncontroversial among researchers in the field: AGENT, PATIENT, INSTRUMENT, LOCATION, GOAL, ...

In Linguistic Theory each verb meaning was characterized as the list of Semantic Roles the verb assigned to its argument positions; this made it possible to define verb classes containing verbs that assigned the same Semantic Roles to their arguments (Croft, 1991; Jackendoff, 1972; 1987; Rappaport Hovav & Levin, 1998).

Semantic Roles were therefore used to formulate generalizations on verb argument syntactic realizations: argument nouns that are assigned the same Semantic Role have access to the same morphosyntactic realization options.

For example, English verbs break and open are taken to assign the same role, PATIENT, to the argument that undergoes the action described by the verb; this participant can therefore be realized either as Direct Object of a transitive clause or as Subject of an intransitive one (Fillmore, 1970; Levin & Rappaport Hovav, 2005; Goldberg, 1995; 2006; Rappaport Hovav & Levin, 2012):

(1) John broke the window

(1a) The window broke

(2) John opened the window

(2a) The window opened

This is the well-known Causative Alternation.

Clearly, Semantic Roles are a theoretical construct that is functional to both models of semantic interpretation and theories of syntax-semantics interface.

As for the former, identifying the semantic properties of verb arguments is a fundamental step in characterizing verb meaning in a way that is useful for building formal representations of sentence meaning (Carlson, 1984; Carlson & Tanenhaus, 1988; Dowty, 1989; Jackendoff, 1987; 1990; Parsons, 1990; 1995; Pustejovsky, 1995).

On the other hand, theories of syntax-semantics interface assume certain facets of meaning to be syntactically relevant; those grammatically relevant semantic features should be the ones accounted for by a Semantic Role List (Fillmore, 1968; 1970; 1971b; Levin, 1993; Rappaport Hovav & Levin, 1988; 1998a; Van Valin, 1999; Van Valin & LaPolla, 1997; Van Valin & Wilkins, 1996).


1.1.2 Dowty’s Proto-Roles

As previously stated, each Semantic Role is strictly defined by a set of features a particular verb argument position has to possess in order to be assigned the considered role.

This model immediately faces an implementation problem, which Dowty termed Role Fragmentation (Dowty, 1991; Reisinger et al., 2015): there seems to be a very large number of possible combinations of semantic properties, with argument positions sharing all the features that define a canonical Semantic Role but also having additional characteristics (which can be shared with other Semantic Roles).

Therefore, accounting for this variability would mean introducing a very large Semantic Role List, up to idiosyncratic roles identifying particular verb arguments; in this way we would lose the generalization power a Semantic Role Theory should have.

From a methodological standpoint, the most famous and fruitful solution proposed for this problem is that of the linguist David Dowty (Dowty, 1989; 1991): Semantic Roles are no longer considered sets of jointly necessary and sufficient features an argument position has to possess, but can be conceived of as cluster concepts or prototypes (Rosch, 1973; Rosch & Mervis, 1975; Dowty, 1991; Schlesinger, 1995; Levin & Rappaport Hovav, 2005).

Instead of positing a list of Semantic Roles, Dowty introduces only two proto-roles, Agent Proto-Role and Patient Proto-Role, and provides a set of defining properties for each (Dowty, 1991; Levin & Rappaport Hovav, 2005):

(a) Contributing properties for the Agent Proto-Role:

– volitional involvement in the event or state

– sentience (and/or perception)

– causing an event or change of state in another participant

– movement (relative to the position of another participant)

– (exists independently of the event named by the verb)

(b) Contributing properties for the Patient Proto-Role:

– undergoes change of state

– incremental theme

– causally affected by another participant

– stationary relative to movement of another participant

– (does not exist independently of the event, or not at all)

(Dowty 1991: 572, (28))

For each argument position, we count the number of entailments it shares with each prototype representation to state which is the most suitable one.

Notice that this model was motivated on both semantic and grammatical grounds: the idea was to formalize a principle that made it possible to determine which verb argument is realized as Subject and which one as Direct Object.

The argument bearing the greatest number of Agent Proto-Role entailments will be realized as Subject while the one sharing a greater number of entailments with Patient Proto-Role will be realized as Direct Object (see Argument Selection Principle; Dowty, 1991: 576; Levin & Rappaport Hovav, 2005).

Consider the following sentence:

(3) John (ARG1) assassinated the clown (ARG2)

Argument slot ARG1 of the verb to assassinate produces the following entailments on its filler noun, John: (a) volitional involvement in the event or state, (b) sentience, (c) causing an event or change of state in another participant.

Argument slot ARG2 of the verb produces the following entailments on its filler noun, clown: (a) sentience, (b) undergoes change of state, (c) causally affected by another participant.

The mapping algorithm (i.e. the Argument Selection Principle) tells us that the argument slot with more Agent Proto-Role properties will be realized as Subject, while the one with more Patient Proto-Role properties will surface as Direct Object: ARG1 has three Agent Proto-Role properties (only one, sentience, for ARG2), so it is assigned the Subject grammatical relation; ARG2 has two Patient Proto-Role entailments (none for ARG1), so it is assigned the Direct Object grammatical relation.
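The counting procedure applied to sentence (3) can be sketched as follows. This is a simplified illustration, not a full implementation of Dowty's principle: the short property identifiers are our own labels for the entailments listed above, and ties between arguments (which Dowty handles through additional corollaries) are not treated here.

```python
# Short labels (our own naming) for the proto-role entailments listed above.
PROTO_AGENT = {
    "volition", "sentience", "causation", "movement", "independent_existence",
}
PROTO_PATIENT = {
    "change_of_state", "incremental_theme", "causally_affected",
    "stationary", "dependent_existence",
}


def proto_counts(entailments):
    """Return (number of Agent entailments, number of Patient entailments)."""
    return len(entailments & PROTO_AGENT), len(entailments & PROTO_PATIENT)


def select_arguments(slots):
    """Pick Subject and Direct Object by counting proto-role entailments.

    Simplification: ties are not resolved in any principled way.
    """
    subject = max(slots, key=lambda s: proto_counts(slots[s])[0])
    remaining = [s for s in slots if s != subject]
    direct_object = max(remaining, key=lambda s: proto_counts(slots[s])[1])
    return subject, direct_object


# Entailments of "assassinate" as described for sentence (3).
assassinate = {
    "ARG1": {"volition", "sentience", "causation"},
    "ARG2": {"sentience", "change_of_state", "causally_affected"},
}

for slot, entailments in assassinate.items():
    print(slot, proto_counts(entailments))
print(select_arguments(assassinate))
```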

The interesting aspect of this proposal is not in the particular content of the semantic entailments provided by Dowty, which closely resemble those previously proposed by other Semantic Role Theories, but in the conception of Semantic Roles as prototype concepts with no clear boundaries; we can quantify the agenthood or patienthood of a given argument by simply counting the number of entailments it shares with each defined prototype.


1.1.3 Towards a gradable conception of Selectional Restrictions

Another fundamental issue in classical approaches to verb meaning analysis is that of Selectional Restrictions (Katz & Fodor, 1963; Chomsky, 1965; Katz, 1972; Jackendoff, 1990; Caplan et al., 1994; Pustejovsky, 1995).

As a first approximation, Selectional Restrictions are semantic properties a noun should possess in order to be an appropriate filler of a verb argument position (Resnik, 1996).

Consider the English transitive verb murder, whose meaning in the Cambridge Dictionary is “to commit the crime of intentionally killing a person”; this predicate assigns two Semantic Roles, AGENT and PATIENT.

To build an interpretable sentence with murder its argument slots have to be filled with two nouns bearing the semantic feature HUMAN.

For example:

(4) The prisonerAG murdered a guardPA

The two argument nouns are perfectly integrated with the verb’s Semantic Roles: both prisoner (AGENT) and guard (PATIENT) have the semantic property HUMAN.

So, according to classical approaches to meaning representations, interpretation of sentence (4) gives rise to a fully coherent and informative semantic representation.

Conversely, sentence (5), in which one of the two preceding argument nouns has been replaced with another word denoting an inanimate entity, sounds odd and very difficult to interpret in a meaningful way:

(5) The prisonerAG murdered a gatePA

Direct Object gate, being an inanimate entity, violates Selectional Restrictions imposed by the verb on its PATIENT role.

As a result, it becomes difficult (if not impossible) for a speaker to build a mental representation of the described situation².

Semantic Roles and Selectional Restrictions are two different, yet closely related, issues in verb meaning characterization.

Semantic Roles, being entailments a verb produces on its argument positions, can be conceived of as properties a target noun comes to possess when it fills in a particular argument position of the verb (Dowty, 1989; 1991; Jackendoff, 1990; 2002); Selectional Restrictions, on the other hand, are semantic features a noun has to include as part of its lexical meaning and consist of ontological properties characterizing the encoded concept independently of any particular argument position.

This may seem at first a very subtle distinction but it is necessary for a formal model of Sentence and Discourse Representation (Asher & Lascarides, 2003; van Eijck & Kamp, 2011).

During language production and interpretation we introduce, through different linguistic expressions, a set of entities, the so-called Discourse Referents, that become part of the Discourse Domain and whose representations are constantly updated with new information as the discourse unfolds (Kamp & Reyle, 1993; van Eijck & Kamp, 2011; Kamp, 2013).

The entailments a verb produces on one of its arguments can be considered a way to update a Discourse Referent and are then encoded into the discourse representation as features the object has in that particular context (cf. Dowty, 1989; Lebani & Lenci, 2018).

Consider sentence (6):

(6) As soon as the party began, John broke a glass…

The NP a glass introduces into the Discourse Domain a generic instance of the concept GLASS, x; being the PATIENT of to break, x comes to be in the resulting state encoded by the verb, and this information is encoded into the Discourse Representation, ready to be retrieved when new linguistic input has to be processed.

Suppose the discourse proceeds with the following sentence:

(7) Immediately a housemaid arrived to remove the glass from the floor.

² Decompositional Theories of word meaning assume that a Selectional Restrictions violation prompts a figurative (i.e. metaphorical) reading of the sentence (Katz & Fodor, 1963; Katz, 1972; Taylor, 1989).

When interpreting the NP the glass, a speaker retrieves discourse referent x, the glass previously introduced and having the property IS_BROKEN, stated in sentence (6).

On the other hand, the Selectional Restrictions that to break imposes on its PATIENT argument can be formalized through a binary feature [+PHYS OBJECT] that is possessed by the noun glass before it becomes an argument of the verb.

The concept of Selectional Restrictions emerged in the context of decompositional approaches to meaning representation: word meaning was decomposed into a set of semantic features both necessary and sufficient to characterize its referent (Katz & Fodor, 1963; Katz & Postal, 1964; Katz, 1972; Taylor, 1989).

Semantic features were then used to model predicate-argument composition processes by checking whether a given argument noun possessed all the semantic properties imposed by the predicate on its argument slots; if a mismatch occurred, a semantic anomaly was produced and alternative interpretations were triggered.

Subsequent approaches developed the notion of Selectional Restrictions into a more formal and articulated system by expanding the set of semantic primitives and compositional operations (Jackendoff, 1983; 1990; 1997; 2002; Pustejovsky, 1991; 1995; Wierzbicka, 1972; 1996; Goddard, 1998).

However, the core aspect of the theory was preserved: Selectional Restrictions violation was a matter of binary judgment.

A candidate noun could either possess all the semantic features imposed by the verb on the considered argument slot, thus being an appropriate filler, or fail to possess at least one of these properties, giving rise to a semantic anomaly.
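The binary nature of this check can be made concrete with a minimal sketch. The feature sets below are assumptions introduced for illustration, built around the examples discussed above (murder requiring HUMAN fillers, to break requiring a physical object); they do not reproduce any specific feature inventory from the decompositional literature.

```python
# Features each (verb, role) slot requires of its filler (illustrative).
RESTRICTIONS = {
    ("murder", "AGENT"): {"HUMAN"},
    ("murder", "PATIENT"): {"HUMAN"},
    ("break", "PATIENT"): {"PHYS_OBJECT"},
}

# Lexical features of candidate nouns (illustrative assumptions).
NOUN_FEATURES = {
    "prisoner": {"HUMAN", "ANIMATE", "PHYS_OBJECT"},
    "guard": {"HUMAN", "ANIMATE", "PHYS_OBJECT"},
    "gate": {"PHYS_OBJECT"},
    "glass": {"PHYS_OBJECT"},
}


def satisfies(verb, role, noun):
    """True only if the noun has every feature required by the slot."""
    return RESTRICTIONS[(verb, role)] <= NOUN_FEATURES[noun]


print(satisfies("murder", "PATIENT", "guard"))  # sentence (4)
print(satisfies("murder", "PATIENT", "gate"))   # sentence (5)
print(satisfies("break", "PATIENT", "glass"))   # sentence (6)
```

The function returns a plain Boolean: a single missing feature is enough to yield an anomaly, and no distinction is drawn among fillers that pass the check.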

The use of very abstract binary features was instrumental to theories that drew a sharp distinction between Linguistic Semantics and World Knowledge (Katz, 1972; Chomsky, 1965; Sperber and Wilson 1995; Schlesinger 1995; Jackendoff, 2002; Bornkessel and Schlesewsky 2006; Warren and McConnell 2007).

Selectional Restrictions were a simple system that could be easily included in a syntactic processor responsible for the computation of a structural description of an input sentence (Frazier 1995; Binder et al. 2001; Van Gompel et al. 2005; Clifton and Staub, 2008).

In the early sentence processing literature (Ferreira & Clifton, 1986; Frazier, 1987), mainly influenced by Generative Grammar Theory, when a syntactic ambiguity involving Semantic Role assignment occurred, the only kind of semantic information exploited in its resolution was that encoded by Selectional Restrictions (i.e. whether a given argument position strongly selects for an animate noun, etc.); information concerning typical participants and settings of the described event became accessible only in a second phase.

Current research in Sentence Processing has brought extensive evidence that during processing speakers activate and integrate a complex array of conceptual information, including World Knowledge that was usually excluded from early interpretation processes (McRae et al., 1997; 1998; 2005; 2009; Ferretti et al., 2007; Matsuki et al., 2011; Elman, 2009; 2011; 2014).

This very detailed knowledge is exploited to solve structural ambiguities by computing how plausible a noun is for a particular argument position (Rayner, Carlson & Frazier, 1983; Taraban & McClelland, 1988; McRae et al., 1998; Hare et al., 2009).

As a consequence, the two core aspects of Selectional Restrictions have to be revised: (a) they are no longer general ontological properties of entities but become very detailed features derived from rich World Knowledge; (b) a noun does not have to possess all these properties to be an appropriate argument of the verb.

Consider, for example, the three following sentences (see also Chersoni et al., 2016a):

(8) The dog chases a ball

(9) The dog chases a lawyer

(10) The dog chases a wall

In sentence (8) the NP a ball is both an acceptable and a typical PATIENT of the verb to chase (according to our everyday experience, balls are the sort of things that dogs usually like to chase); in (9) the NP a lawyer is a perfectly acceptable but atypical PATIENT of the verb, yet every speaker is able to build a coherent and meaningful representation of the described situation.

Finally, sentence (10) can be considered a classical example of a Selectional Restrictions violation, in which the argument noun is an impossible PATIENT of the verb to chase (which requires its direct object to be an entity that can be set in motion).

Traditional Linguistic Theory was only able to distinguish between (8)-(9) on the one hand and (10) on the other; that is, the only relevant opposition was between acceptable sentences and unacceptable ones, i.e. sentences violating Selectional Restrictions (Chomsky, 1957; Jackendoff, 2002; Chersoni et al., 2016a).

However, recent empirical evidence suggests that (8) and (9), despite being both acceptable, have a very different cognitive status: given that ball is a more typical PATIENT of the verb to chase than lawyer, sentence (8) is interpreted more rapidly and efficiently than (9), which requires a greater cognitive effort to compositionally build a semantic representation (Federmeier and Kutas 1999; Kutas and Federmeier 2000; van Berkum et al. 2005; Baggio and Hagoort, 2011).

Though in Linguistic Theory Selectional Restrictions and Selectional Preferences are used interchangeably, we henceforth use Selectional Preferences; this seems more appropriate given that they are actually conceived of as a set of properties an argument noun can entirely or partly possess, depending on how appropriate a filler it is for the considered argument position.
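The contrast between a binary and a graded conception can be illustrated with a toy sketch based on sentences (8)-(10). Everything here is an assumption made for illustration: the property names are invented, and scoring a filler by the proportion of preferred properties it possesses is only one simple way of obtaining a graded value, not the method adopted in this thesis.

```python
# Hypothetical properties preferred by the PATIENT slot of "chase".
CHASE_PATIENT_PREFERENCES = {
    "can_be_set_in_motion", "thrown_in_play", "attracts_dogs",
}

# Hypothetical properties of the three candidate fillers.
FILLER_PROPERTIES = {
    "ball": {"can_be_set_in_motion", "thrown_in_play", "attracts_dogs"},
    "lawyer": {"can_be_set_in_motion"},
    "wall": set(),
}


def graded_fit(preferences, filler_properties):
    """Proportion of preferred properties the filler possesses (0.0 to 1.0)."""
    return len(preferences & filler_properties) / len(preferences)


scores = {
    noun: graded_fit(CHASE_PATIENT_PREFERENCES, props)
    for noun, props in FILLER_PROPERTIES.items()
}
for noun, score in sorted(scores.items(), key=lambda item: -item[1]):
    print(f"{noun}: {score:.2f}")
```

Under these assumptions the three fillers are ranked rather than split into two classes: ball scores highest, lawyer falls in between, and wall receives the lowest score.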


1.2 Lexical Meaning nowadays

1.2.1 The Generalized Event Knowledge (GEK) paradigm: “lexical knowledge without a lexicon...”

A problem that is necessarily faced by any theory of linguistic meaning is that of the amount and type of semantic information that is stored in a lexical entry.

Traditional Lexical Semantics Models draw a sharp distinction between semantic properties that are encoded in a lexical entry (what is properly termed lexical meaning) and other dimensions of meaning that are taken to be part of a more general World Knowledge (Katz & Fodor, 1963; Katz, 1972; Levin & Rappaport Hovav, 2005; Elman, 2009; 2011; 2014).

Consider, for example, the word dog, encoding the concept DOG; according to classical theories, its lexical meaning consists at most of the semantic features that define the referred object (i.e. has ears, has four legs, has a tail, etc.), while information concerning typical events in which a dog participates (the fact that a dog usually barks, walks and bites) and other entities that occur with it in the same situation (veterinarians, cookies, leashes, etc.) is excluded from its lexical representation.

The lexicon is therefore conceived of as a data structure mapping, in long-term memory, each lexeme to a lexical entry containing phonological, syntactic and semantic information (Jackendoff, 2002; Elman, 2009; 2011; 2014):

“A lexical entry lists a small chunk of phonology, a small chunk of syntax, and a small chunk of semantics.” (Jackendoff, 2002: 131)

A question that immediately arises in the context of these models is what tests we could use to determine whether a certain kind of semantic information is part of a lexical entry or resides elsewhere.

One proposal that has been advanced is to use the notion of core meaning: a lexical entry should include only semantic information that is stable across contexts of usage, that is, semantic dimensions shared by every occurrence of a lexeme, independently of the particular context.

In psycholinguistics it is traditionally assumed that during sentence processing lexical information, consisting of abstract representations of the sort posited by formal linguistics, is accessed more rapidly than extralexical information, mainly World Knowledge, which is involved at a later stage of processing (Katz 1972; Chomsky 1975; Sperber & Wilson 1995; Schlesinger 1995; Bornkessel & Schlesewsky 2006; Warren & McConnell 2007).

It is therefore possible to use processing evidence to investigate the amount and type of semantic information that is stored in a lexical entry (Elman, 2011; 2014).

However, recent findings in sentence processing research call into question this assumption showing that it is very likely for specific and detailed semantic information to be involved immediately in the processing of an input sequence (Van Petten & Kutas, 1987; Taraban & McClelland, 1988; McRae et al., 1997; 1998; McRae & Matsuki, 2009; Matsuki et al., 2011; Elman, 2009; 2014).

Most studies consist of eye-movement and reading-time experiments involving the processing of sentences that could be assigned two different syntactic interpretations; monitoring the strategies adopted by comprehenders to solve the temporary ambiguity is useful to understand what kind of information is accessible at each stage of processing.

One of the most famous of these studies is that of McRae and his colleagues (McRae, Ferretti & Amyote, 1997; McRae, Spivey-Knowlton & Tanenhaus, 1998) in the context of Thematic Roles investigation.

Sentences employed for the tasks are of the following type:

(11) The man arrested…

English syntax allows this sentence to be ambiguous between two possible readings: (a) an active verb interpretation in which the NP the man is taken to be the AGENT of arrest; (b) a reduced relative reading in which the verb is in the passive voice and the NP plays the PATIENT role of arrest.

Comprehenders have to select one of the two potential readings, assigning a particular semantic role of verb arrest to the man.

McRae et al. (1998) observed that comprehenders exploited detailed knowledge of real-world events and their typical participants to solve the syntactic ambiguity: if the NP is a likely AGENT of the verb (e.g. The cop arrested…), an active interpretation is preferred; on the other hand, if the NP is a likely PATIENT (e.g. The burglar arrested...), a reduced relative reading is favored (see also Taraban & McClelland, 1988).
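The logic of this finding can be sketched as a comparison between how well a noun fits the AGENT and the PATIENT role of the verb. The scores below are invented toy values on a 1-7 scale, not the norms collected by McRae et al. (1998), and the decision rule is a deliberate simplification: actual comprehension integrates several constraints at once, and the preference for the active reading on a tie is our own assumption.

```python
# Toy thematic fit scores on a 1-7 scale. These values are invented for
# illustration and are NOT the ratings reported by McRae et al. (1998).
THEMATIC_FIT = {
    ("arrest", "cop"): {"AGENT": 6.5, "PATIENT": 1.5},
    ("arrest", "burglar"): {"AGENT": 1.5, "PATIENT": 6.5},
}


def preferred_reading(verb, noun):
    """Choose the reading favored by the noun's better-fitting role.

    Assumption: on a tie, the active main clause reading is chosen.
    """
    fit = THEMATIC_FIT[(verb, noun)]
    if fit["AGENT"] >= fit["PATIENT"]:
        return "active main clause"
    return "reduced relative"


for noun in ("cop", "burglar"):
    print(f"The {noun} arrested... ->", preferred_reading("arrest", noun))
```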

This strongly supports the hypothesis that even at an early stage of processing, as syntactic parsing has traditionally been considered, comprehenders access detailed information deriving from everyday experience with events to choose between competing structural analyses.

This evidence, together with other experimental findings, suggests (assuming that rapidity of access is a good test for distinguishing lexical from extralexical information) that lexical representations encode a complex array of knowledge and that this knowledge is used as rapidly as possible during sentence processing (McRae et al., 1998; McRae & Matsuki, 2009; Matsuki et al., 2011; Metusalem, 2012; Elman, 2014).

Recently, psycholinguists have proposed a framework for lexical knowledge representation that overcomes the arbitrary distinction between linguistic meaning and world knowledge (Elman, 2009; 2011; 2014; McRae & Matsuki, 2009).

Words are no longer elements mapping to bounded packages of semantic information; rather, they are conceived of as stimuli whose processing modifies, in a predictable way, the mental state of the interpretation system, a state that itself results from previous linguistic and extralinguistic processing (Rumelhart, 1979; Elman, 2009; 2011; 2014).

In Elman’s (2014) terms:

<<...words are not mental objects that reside in a mental lexicon. They are operators on mental states. From this perspective, words do not have meaning; rather, they are cues to meaning.>> (Elman, 2014: 129)

The effect an input word has on the system is both a function of its content and the mental state reached up to that moment (Elman, 2009).

In less formal terms, a word activates, in long-term memory, a portion of so-called Generalized Event Knowledge (GEK; McRae & Matsuki, 2009; Matsuki et al., 2011; Metusalem, 2012), which encodes knowledge about common events and states, their settings, the objects present in the scene and their typical participants.

Going back to our example word dog, the GEK it targets is likely to include events in which a dog is usually involved (barking, eating, walking, playing, biting, etc.), typical participants in these events (veterinarian, owner, handler, postman, etc.), places (park, house, beach, garden, veterinary clinic, etc.) and objects present in the same physical context (sofa, leash, ball, cookie, kennel, collar, etc.).

Although this knowledge is obtained through particular experiences with the target concept, it is stored in semantic memory in a generalized fashion, probably as prototype representations (McRae et al., 1998; McRae & Matsuki, 2009; Elman, 2014).

As previously stated, this very detailed and complex information is immediately retrieved during sentence processing and used as a cue to choose the appropriate syntactic structure. How this generalized knowledge is represented and computed in long-term memory is currently under investigation; we can only assume that, since a word is associated with a large array of information, the linguistic and extralinguistic context in which it is processed modulates its activation pattern, making some parts of GEK more prominent while moving others to the background (McRae & Matsuki, 2009; Metusalem, 2012; Elman, 2011; 2014).

A series of semantic priming experiments supports the claim that even individual words in isolation activate a portion of event knowledge (Ferretti et al., 2001; McRae et al., 2005; Hare et al., 2009; see McRae & Matsuki, 2009 for a review).

Ferretti et al. (2001) found that verbs primed their typical agents (e.g. arresting-cop), patients (e.g. serving-customer) and instruments (e.g. stirred-spoon), but did not prime locations (e.g. skated-arena).

McRae et al. (2005) investigated whether it was possible to obtain a similar priming effect in the opposite direction: instead of using verbs as primes and nouns as targets, nouns denoting typical participants of events were presented as primes and verbs encoding those events as targets.

They found that agents (e.g. waiter-serving), patients (e.g. guitar-strummed), instruments (e.g. chainsaw-cutting) and locations (e.g. cafeteria-eating) primed their typically associated events. Finally, Hare et al. (2009) used event, instrument and location nouns as primes to test noun-noun priming: event nouns primed people (e.g. sale-shopper) and objects (e.g. breakfast-eggs) usually taking part in the denoted events; location nouns primed people (e.g. hospital-doctor) and objects (e.g. barn-hay) typically found at those locations.

Interestingly, instrument nouns were able to prime the objects on which they are typically used (e.g. key-door) but not the people that typically use them (e.g. knife-chef).

To summarize, the Generalized Event Knowledge (GEK) framework for lexical representations assumes that words do not encode limited and abstract chunks of semantic information but activate event knowledge in long-term memory; furthermore, the evidence provided by different words is integrated and composed during sentence processing to incrementally build a coherent representation of the conveyed situation.

1.2.2 Event Knowledge and Predictive Processing

Suppose the interpretation system has to sequentially process the following sentence:

(12) The tailor cuts…

Processing the NP the tailor activates the event knowledge associated with the concept TAILOR in semantic memory (events in which a tailor typically participates, entities frequently present in the same situations, places, instruments used by tailors, etc.).

Assuming the NP is taken to be the subject of an upcoming verb, the system starts to formulate predictions about what this verb could be: knowledge about actions typically performed by tailors is exploited to pre-activate semantic and lexical features of verbs that denote those actions (sew, cut, measure, etc.).

When cut is encountered, expectations are satisfied and the new linguistic input is easily integrated into the preceding context.

At this point the system makes predictions about the type of object that is likely to undergo the action described by cut, given the AGENT tailor.

Concepts like FABRIC, THREAD, TROUSERS, CLOTH, etc. are foregrounded in semantic memory and their properties are pre-activated before new linguistic input occurs.

Imagine the continuation of the sentence is the noun bread, which is not an object tailors typically cut: the previously computed expectations are disconfirmed and interpretation requires a complex inferential process.

Notice that the noun bread is not an impossible PATIENT of the verb cut (being a physical object, it does not violate the classical Selectional Restrictions of cut); it is just an atypical one.

This example illustrates a very prolific line of research originating from the Generalized Event Knowledge paradigm: the investigation of how comprehenders use event knowledge during sentence processing to predict upcoming linguistic input (McRae et al., 1998; Federmeier & Kutas, 1999; Van Berkum, 2009; Kutas & Federmeier, 2011; Paczynski & Kuperberg, 2012; Kuperberg & Jaeger, 2015).

<< I use the term prediction simply to refer to some effect of context on the state of activation at a particular level of representation, ahead of new bottom-up information becoming available at this level of representation. >> (Kuperberg, 2016: 2)

The idea is that event knowledge is exploited during language comprehension to generate expectations about the unfolding linguistic input: knowledge of how events are typically organized in the real world allows comprehenders to anticipate words and linguistic structure, under the assumption that the speaker is conveying a typical situation.

One source of empirical evidence that has been analyzed to understand the use of event knowledge in language interpretation is the modulation of the so-called N400 component (Kutas & Hillyard, 1984; Van Berkum, Hagoort, & Brown, 1999; Kuperberg, Sitnikova, Caplan, & Holcomb, 2003; Hagoort, Hald, Bastiaansen, & Petersson, 2004; Hagoort, Baggio, & Willems, 2009; Metusalem et al., 2012).

The N400 ERP component is a negative-going waveform peaking at approximately 400 ms after stimulus presentation (Federmeier & Kutas, 1999; Van Berkum, 2009; Kutas & Federmeier, 2011; Kuperberg, 2016; Kuperberg & Jaeger, 2015).

Setting aside the long debate on what exactly the component indicates, one of the latest trends assumes that it reflects the ease of accessing semantic features associated with an incoming word (Kuperberg, 2016: 2).

Thus, comprehenders use event knowledge to pre-activate the semantic features associated with likely incoming words; when the actual input word is observed, if its semantic properties have been pre-activated by the context, an attenuated (less negative) N400 response is recorded.

On the other hand, if expectations are not matched by the input word, a very large N400 response is produced and the integration of the new word into the preceding context results in an atypical event.

N400 amplitude is therefore inversely related to the degree to which the semantic features associated with the incoming word have been pre-activated by the context.

Going back to the previous example sentence, (12): once the system has processed the sequence The tailor cuts, it makes predictions about the upcoming direct object. We can imagine that if the next input word is fabric, its semantic features have been pre-activated by the context and an attenuated N400 is observed, whereas if the noun bread is provided as direct object, expectations are disconfirmed and a larger N400 effect is recorded.

The N400 component is also taken to reflect the additional cognitive effort required to build a semantic representation for an input sequence (Hagoort, Baggio, and Willems, 2009; Baggio and Hagoort, 2011; Baggio, van Lambalgen, and Hagoort, 2012).

Provided it is possible to integrate an unexpected word into an unfolding representation, the process incurs additional cognitive costs.

This empirical evidence is useful for theoretical research on Selectional Preferences insofar as it forces models to move beyond the binary assumption and to consider in-between cases: from a cognitive standpoint there is a crucial difference between nouns violating the Selectional Preferences of a verb, nouns that are typical fillers of the argument position and nouns whose composition with the verb gives rise to an atypical event (Metusalem et al., 2012; Paczynski & Kuperberg, 2012; Kuperberg & Jaeger, 2016; Kuperberg, 2016; Chersoni et al., 2016).

In sum, during language processing, comprehenders incrementally build a higher-level meaning representation that, interacting with the event knowledge stored in semantic memory, generates expectations about upcoming linguistic material (Paczynski & Kuperberg, 2012; Chow et al., 2015; Kim et al., 2016; Kuperberg, 2016; Chersoni et al., 2016; Chersoni et al., 2017).

These expectations can be satisfied, resulting in a straightforward integration of new linguistic input into the unfolding representation or mismatched, requiring a complex inferential process.

Furthermore, if we consider expectations a verb produces on one of its argument positions we have to take into account how other Semantic Roles of the predicate are actually filled: going back to example (12), expectations verb cut generates on its PATIENT role are dramatically influenced by the actual AGENT of the verb, tailor.

1.2.3 From Generalized Semantic Roles to Verb-Specific Proto-Roles: Thematic Fit Judgments

Before delving into a more psycholinguistically oriented presentation of state-of-the-art theories of Semantic Roles, it is useful to sketch the reasoning behind prototype-based representations in this research field (Dowty, 1989; 1991; Schlesinger, 1995; McRae et al., 1997; 1998).

Let $V = \{v_1, v_2, \dots, v_n\}$ be the set of verbs of a language $L$; each $v_i \in V$ assigns an array of argument positions³, $\{\mathrm{ARG}_1, \mathrm{ARG}_2, \dots, \mathrm{ARG}_m\}$.

Suppose we have built the two Semantic Role prototypes, PROTO-AGENT and PROTO-PATIENT, each defined by a list of entailments a verb can produce on one of its argument positions.

Given a $v_i \in V$, we can determine the entailments it produces on each of its argument positions and place each argument at a certain distance from both prototypes.

Repeating this operation for each verb in V, we categorize each particular verb argument with respect to proto-roles.
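
The classification step just described can be made concrete with a minimal sketch. It assumes, for illustration only, that the proto-role entailments (paraphrased here from Dowty, 1991) are encoded as sets of labels and that the distance of an argument from a prototype is one minus the share of that prototype's entailments the argument receives. The entailment annotation for break and the names ENTAILMENTS and proto_role_distance are hypothetical.

```python
# Proto-role entailments, paraphrased from Dowty (1991).
PROTO_AGENT = {"volition", "sentience", "causation", "movement",
               "independent_existence"}
PROTO_PATIENT = {"change_of_state", "incremental_theme", "causally_affected",
                 "stationary", "dependent_existence"}

# Hypothetical annotation of the entailments a verb imposes on its arguments.
ENTAILMENTS = {
    ("break", "ARG1"): {"volition", "sentience", "causation", "movement",
                        "independent_existence"},
    ("break", "ARG2"): {"change_of_state", "causally_affected",
                        "independent_existence"},
}


def proto_role_distance(verb, arg):
    """Distance of a verb argument from each prototype (0 = all entailments shared)."""
    entailed = ENTAILMENTS[(verb, arg)]
    return {
        "PROTO-AGENT": 1 - len(entailed & PROTO_AGENT) / len(PROTO_AGENT),
        "PROTO-PATIENT": 1 - len(entailed & PROTO_PATIENT) / len(PROTO_PATIENT),
    }


if __name__ == "__main__":
    for arg in ("ARG1", "ARG2"):
        print(arg, proto_role_distance("break", arg))
```

Under these toy annotations the first argument of break coincides with the agent prototype, while the second lies closer to the patient prototype than to the agent one. Note that no filler noun enters the computation, which is precisely the limitation discussed next.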

Notice, however, that proto-roles are abstract categories that do not correspond to any particular verb argument position but are used to classify each of them with respect to a prototype, without taking into account specific fillers that may occur in that position. The logic adopted by latest theories of Semantic Roles is quite different (McRae et al., 1997; 1998; Ferretti et al., 2001; Elman, 2011; 2014): for each argument position $\mathrm{ARG}_j$ of each verb $v_i \in V$, define a prototype representation encoding semantic features shared according to different degrees by particular fillers occurring in $\mathrm{ARG}_j$ with a certain frequency.

These models, given the set of argument positions $\{\mathrm{ARG}_1, \dots, \mathrm{ARG}_m\}$ a verb assigns, aim to provide a prototype representation for each element of the set by considering nouns observed as fillers of the argument.

Verb-specific roles are then used to categorize filler nouns according to their similarity to the computed prototype (McRae et al., 1997; 1998; Elman, 2014).
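
A minimal sketch of this verb-specific logic follows. It assumes, purely for illustration, that nouns are described by small hand-made feature vectors, that a role prototype is the frequency-weighted centroid of the observed fillers, and that similarity is measured with the cosine. The feature values, the filler frequencies and the function names are hypothetical and are not taken from McRae et al. (1997).

```python
import math

# Hypothetical feature vectors (dimensions: animate, has authority, commits crimes).
NOUNS = {
    "cop":     [1.0, 0.9, 0.1],
    "sheriff": [1.0, 0.8, 0.1],
    "burglar": [1.0, 0.1, 0.9],
    "thief":   [1.0, 0.0, 1.0],
}

# Hypothetical frequencies of the fillers observed in the two roles of "arrest".
OBSERVED = {
    "AGENT":   {"cop": 8, "sheriff": 2},
    "PATIENT": {"burglar": 6, "thief": 4},
}


def prototype(fillers):
    """Frequency-weighted centroid of the vectors of the observed fillers."""
    total = sum(fillers.values())
    dims = len(next(iter(NOUNS.values())))
    return [
        sum(NOUNS[noun][d] * freq for noun, freq in fillers.items()) / total
        for d in range(dims)
    ]


def cosine(u, v):
    """Cosine similarity between two vectors (0.0 if either has zero length)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def thematic_fit(noun, role):
    """Graded similarity between a candidate noun and a verb-specific role prototype."""
    return cosine(NOUNS[noun], prototype(OBSERVED[role]))


if __name__ == "__main__":
    for noun in ("cop", "burglar"):
        print(noun, {role: round(thematic_fit(noun, role), 3) for role in OBSERVED})
```

With these toy values cop is closer to the AGENT prototype of arrest and burglar to its PATIENT prototype; the scores are graded rather than binary.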

The seminal work for this transition from generalized Semantic Proto-Roles (cf. Dowty, 1991) to Verb-Specific prototypes is that by McRae, Ferretti & Amyote (1997).

3 An argument position, as a first approximation, can be conceived of as a slot in a verb’s representation that has to be filled by a particular participant in the described event.

Their initial and controversial hypothesis is that Semantic Roles are not just empty slots, associated with very abstract properties, that have to be filled by particular nouns; rather, they contain very detailed information about their typical fillers (i.e. World Knowledge).

Furthermore, this information, derived from everyday experience with real world events and situations, is organized in a way that allows it to be used immediately during sentence processing (McRae et al., 1997; 1998; McRae & Matsuki, 2009; Matsuki et al., 2011; Elman, 2011; 2014).

In the authors’ words:

<<...role concepts are formed through the everyday experiences during which people learn about the entities and objects that tend to play certain roles in certain events. >> (McRae et al., 1997: 141)

This claim calls into question the traditional assumption, particularly strong in the Chomskyan tradition, that a verb’s semantic role representation contains only a few generic features (Chomsky, 1965; Culicover, 1988; Caplan et al., 1994; Schlesinger, 1995; Levin & Rappaport Hovav, 2005).

According to McRae et al. (1997), a verb’s semantic role is a specific concept whose representation is computed by abstracting over all the entities that occur in the corresponding argument position; this prototype will therefore encode a ranking of the features shared by filler objects.

That is:

<< ...a computed role concept can be viewed as a set of features that is typically possessed by the fillers of that role. >> (McRae et al., 1997: 142)

Consider the verb break. People possess a large amount of knowledge about breaking events, which derives from first-hand participation in such events, reading and hearing about them, watching them on television, etc.

[…] undergoes the action denoted by the verb is very likely to have semantic properties like <has a function>, <has parts>, <can be fixed>, <is important>, <has a cost>, <is fragile>, <is useless>, etc.

As this example illustrates, features defining a prototype are of a different nature with respect to those proposed by Dowty (Dowty, 1991; see also Schlesinger, 1995); they are specific properties of objects actually playing a verb’s role.

To support their claims, the authors introduce a series of psycholinguistic experiments investigating the nature and use of verb-specific features.

Among these experiments the most important to the present work, and those that will be briefly introduced, are the first one and the first part of the second one.

In the first experiment (EXPERIMENT 1), participants were given verb-specific Semantic Roles, AGENT or PATIENT (e.g. someone who is convicted), and asked to list features for that argument position.

This experiment was designed to tap directly into participants’ intuitions regarding role concepts and to build reliable representations of these prototypes.

The second experiment was split into two sub-experiments.

In the first part (EXPERIMENT 2A), which is the most relevant to our purposes, participants provided ratings for role-filler pairs.

Given questions of the type How common is it for a hostage to rescue someone?, subjects had to provide a rating on a 7-point scale expressing the typicality of the given noun as a filler of the considered Semantic Role.

In the case of the previous example question, the authors were measuring the typicality of the noun hostage as AGENT of the verb rescue (which should intuitively be very low).

This kind of task, named Thematic Fit Judgment, is taken to involve role concept representations that are used to determine whether a candidate noun is an appropriate filler for the verb-specific role: the process resembles non-binary categorization, which consists in computing the similarity between an instance and a prototype representation (Rosch & Mervis, 1975; Rosch, 1978; Barsalou, 1987; Malt & Smith, 1984; McRae, de Sa & Seidenberg, 1997).

In other words, Thematic Fit can be conceived of as a process through which speakers compute a gradable categorization judgment for the concept denoted by a candidate noun with respect to the prototypical category defined by a Semantic Role.

An empirical domain in which Thematic Fit Judgments have proven particularly useful is the investigation of event knowledge use during sentence processing (McRae et al., 1998; Ferretti et al., 2001; Warren & McConnell, 2007; Matsuki et al., 2011; Metusalem et al., 2012; Paczynski & Kuperberg, 2012; Elman, 2011; 2014; Warren et al., 2015; see also EXPERIMENT 3 in McRae et al., 1997).

As previously mentioned, one of the most important types of studies in sentence processing research is that involving sentences temporarily ambiguous between two possible syntactic analyses (Rayner et al., 1983; Taraban & McClelland, 1988; McRae et al., 1998; Elman, 2011; 2014).

To use a different example from that introduced in the previous paragraph:

(13) The hostage rescued...

Sentence (13) is ambiguous at the verb between (a) a main verb reading in which rescue is in its active voice and hostage is AGENT and (b) a reduced relative interpretation with the verb in its passive voice and the noun phrase playing the role of PATIENT.

As is evident, the two possible interpretations imply different Semantic Role assignments (cf. McRae et al., 1998).

According to empirical evidence, the preferred reading depends, at least partly, on Thematic Fit between noun hostage and each of the relevant Semantic Roles, AGENT and PATIENT; Thematic Fit is in turn computed by comparing the concept HOSTAGE with the prototype representations of AGENT and PATIENT roles of verb rescue.

If noun hostage turns out to be a very typical AGENT of verb rescue (which is not the case), a main verb reading is favored (i.e. hostage is assigned Subject grammatical relation); on the other hand, if hostage is a typical PATIENT of verb rescue, a reduced relative interpretation is preferred (i.e. hostage is assigned Direct Object grammatical relation).
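
The role of Thematic Fit in resolving the ambiguity can be illustrated with a deliberately simplified sketch. It assumes that the only available cue is a pair of mean typicality ratings on a 7-point scale, one for the noun as AGENT and one as PATIENT, and that the support for each reading is the normalized share of the corresponding rating. The rating values and the names RATINGS, reading_support and preferred_reading are hypothetical; real comprehenders combine Thematic Fit with other constraints.

```python
# Hypothetical mean ratings on a 7-point scale, in the style of Thematic Fit Judgments.
RATINGS = {
    ("rescue", "hostage"): {"AGENT": 1.5, "PATIENT": 6.5},
    ("arrest", "cop"):     {"AGENT": 6.5, "PATIENT": 1.5},
}


def reading_support(verb, noun):
    """Share of thematic-fit support for each reading of 'The NOUN VERBed...'."""
    fit = RATINGS[(verb, noun)]
    total = fit["AGENT"] + fit["PATIENT"]
    return {
        "main verb": fit["AGENT"] / total,           # noun interpreted as AGENT
        "reduced relative": fit["PATIENT"] / total,  # noun interpreted as PATIENT
    }


def preferred_reading(verb, noun):
    """Reading with the larger share of support."""
    support = reading_support(verb, noun)
    return max(support, key=support.get)


if __name__ == "__main__":
    for verb, noun in RATINGS:
        print(f"The {noun} {verb}ed... ->", preferred_reading(verb, noun))
```

Keeping the graded support values, rather than only the winning reading, reflects the claim that the preference depends on Thematic Fit only in part.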

Given that the prototype representations are built by abstracting over the properties of the observed fillers of argument positions, the involvement of this kind of very detailed information at early stages of sentence processing is strong evidence in favor of theories that do not draw a distinction between linguistic semantics and event knowledge (McRae et al., 1998; Matsuki et al., 2011; Metusalem et al., 2012; Elman, 2011; 2014; see also Kuperberg, 2016).

According to the Predictive Processing framework, processing a sentence containing a particular verb generates expectations about the possible fillers of the verb’s argument positions (Kamide et al., 2003; DeLong et al., 2005; Federmeier, 2007; Van Petten and Luka, 2012; Willems et al., 2015; Metusalem, 2012; Paczynski & Kuperberg, 2012; Kuperberg, 2016).

As previously stated, prediction consists in the pre-activation of semantic features associated with expected input words (see Kuperberg, 2016); following this statement we could assume that processing a verb pre-activates, according to different degrees, features defining the prototype representations of its Semantic Roles (McRae et al., 1998; Federmeier & Kutas, 1999; Kutas & Federmeier, 2000; van Berkum et al. 2005).

When a noun is provided as candidate filler of one of the Semantic Roles, if its semantic properties match those pre-activated by the verb, the processing of new input is facilitated (shorter reading times, modulation of N400 amplitude, etc.).

The facilitation is explained in terms of retrieval of pre-stored semantic units as opposed to more compositional interpretation processes required by less typical verb arguments (Hagoort, Baggio & Willems, 2009; Baggio & Hagoort, 2011; Baggio, van Lambalgen & Hagoort, 2012).

Summing up, Thematic Fit is a fundamental concept also for current psycholinguistic research even if it has been embedded in a more general world knowledge exploitation mechanism.

1.2.4 Compositionality and Thematic Fit Update

Consider verb hit and the set of argument positions it assigns, {AGENT, PATIENT, INSTRUMENT}.

For each of these Semantic Roles a prototype representation encoding the features shared by particular fillers of the role is built; these representations are then used during sentence processing to generate expectations on potential fillers of the roles.

There is however an important issue that has been omitted up to this moment: verbs do not occur in isolation, as individual lexemes generating expectations on each of their argument positions, but are part (indeed a fundamental one) of a linguistic sequence unfolding through time.

It is always the case that distinct roles are filled by nouns at different time points depending on the syntactic structure of the actual sentence, the particular language and other complex factors⁴.

If we are to use Thematic Fit as an operative notion to investigate semantic processing we have to take into account the sequence of Semantic Roles assignment and consider the possibility that when an argument position is filled all other role prototypes are updated according to information added by the particular filler noun.

The following sentences, sharing the same verb but differing for the AGENT noun, suggest different continuations at the verb:

(14) The player hit...

(15) The guard hit...

In both cases the first role that is filled is AGENT.

As a consequence the PATIENT’s prototype is updated in a way that it predicts nouns denoting objects on which the given AGENT performs the action described by the verb.

4 Notice that this assumption implies that each NP is assigned at most one Semantic Role, which is not an uncontroversial issue in the linguistics literature (see for example Jackendoff, 1990).

Therefore, if the Direct Objects of (14) and (15) turn out to be the ball and a prisoner respectively, the comprehension system computes a high Thematic Fit score.

On the other hand, if we swap the PATIENT nouns, obtaining the sentences The player hit a prisoner and The guard hit the ball, a low Thematic Fit score is computed, showing that the way the AGENT role is filled considerably changes the plausibility of an argument noun (though a ball is something a guard can hit, it is not the object on which he usually performs the action described by the verb).
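
The update of PATIENT expectations can be sketched as conditioning on the AGENT. The sketch below assumes, for illustration only, that event knowledge is summarized by counts of agent-verb-patient triples and that expectations are relative frequencies; the counts and the function name patient_expectations are hypothetical.

```python
from collections import Counter

# Hypothetical counts of (agent, verb, patient) events.
TRIPLES = Counter({
    ("player", "hit", "ball"): 9,
    ("player", "hit", "prisoner"): 1,
    ("guard", "hit", "prisoner"): 7,
    ("guard", "hit", "ball"): 1,
})


def patient_expectations(verb, agent=None):
    """Relative frequency of each patient for the verb, optionally given the agent."""
    counts = Counter()
    for (a, v, p), c in TRIPLES.items():
        if v == verb and (agent is None or a == agent):
            counts[p] += c
    total = sum(counts.values())
    return {p: c / total for p, c in counts.items()} if total else {}


if __name__ == "__main__":
    print("hit alone:     ", patient_expectations("hit"))
    print("player + hit:  ", patient_expectations("hit", agent="player"))
    print("guard + hit:   ", patient_expectations("hit", agent="guard"))
```

With the verb alone, ball and prisoner receive comparable expectations; once the AGENT is known, the distribution shifts sharply towards the patient that is typical for that agent.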

These intuitions are supported by empirical evidence employing different experimental paradigms (Kamide et al., 2003; Bicknell et al., 2010; Matsuki et al., 2011; see also Elman, 2014 for a review).

Kamide et al. (2003) tracked participants’ eye movements on a screen displaying several pictures of objects while they listened to sentences of the type The woman will drink or The baby will drink.

They found anticipatory looks towards the picture depicting the more likely PATIENT of the whole event described by the sentence (wine and milk respectively for our example sentences).

Given that the sentences share the same verb and syntactic structure, these anticipatory looks can be considered as evidence for the influence of an AGENT noun on the expectations generated by the verb on its PATIENT role.

Bicknell et al. (2010) explicitly state that expectations a verb produces on its PATIENT argument position are a function of both the linguistic input processed so far and the event knowledge cued by that sequence:

<<...we test the hypothesis that expectations regarding likely fillers of a patient role rely on event knowledge, and that these expectancies are driven by multiple cues that are dynamically integrated during processing.>>(Bicknell et al., 2010: 3)

To support this hypothesis they set up two experiments using reading-time and ERP paradigms.

In the first experiment, organized in multiple stages, a set of 50 verbs whose meaning seemed to be particularly sensitive to the noun filling the AGENT argument position was selected based on intuition.

Each of these verbs was combined with two agent nouns that changed the characteristics of the described event (The toymaker produced vs. The moviemaker produced).

Using a production norming task, a list of typical patients was produced for each agent-verb pair.

To exclude the possibility that produced patients were directly associated with agents, without any influence of the verb, a free association task and an agent-verb priming task were conducted.

The tasks allowed the selection of 32 verbs, each occurring in two agent-verb-patient triples (toymaker-produce-puppet and moviemaker-produce-blockbuster); for each verb-patient pair, two sentences containing the two different agents of the verb were built, resulting in 64 pairs of sentences (a pair contained a congruent and an incongruent condition).

The congruent condition included a typical PATIENT for the given agent-verb pair whereas the incongruent one contained an atypical PATIENT.

Finally, reading times for each sentence in a pair were measured, revealing shorter times (at the word immediately following the PATIENT) for the congruent condition than for the incongruent one.

An ERP experiment confirmed the pattern observed for reading times: the processing of a PATIENT noun that is expected given the agent-verb combination is facilitated with respect to an argument that, despite being a typical filler of the verb alone, is unexpected with the particular AGENT noun.

On the basis of the two experiments the authors conclude:

<<...we demonstrated that comprehenders combine knowledge about an agent and a verb to influence their expectations for upcoming patient nouns. >> (Bicknell et al., 2010: 11)

Bicknell et al. (2010; see also Morris, 1994 and Kamide et al., 2003) investigated the effect of an AGENT noun on expectations a verb produces on its PATIENT role; Matsuki et al. (2011), on the other hand, focused on instrument-verb combinations.

Their aim was to investigate whether the filler of the INSTRUMENT role can alter the nature of the described event determining different expectations for the PATIENT role:

<<...test whether participants rapidly combine instruments and actions to generate expectations for different classes of patients.>> (Matsuki et al., 2011: 916)

After two norming studies, they ended up having a set of 96 sentences organized in 48 pairs containing a typical and an atypical condition.

Sentences in a pair share the same verb-patient combination while differing for the INSTRUMENT noun:

(16a) Rene used the coins to purchase the hand-made candy at the farmers’ market… (TYPICAL INSTRUMENT)

(16b) Rene used the credit card to purchase the hand-made candy at the farmers’ market… (ATYPICAL INSTRUMENT)

Matsuki et al. (2011) used reading-time and eye-tracking experimental paradigms to monitor the processing of the two conditions and to verify whether there was a difference in complexity between the two types of sentences (reading times at the PATIENT, among other measures).

They found the atypical condition to cause a disruption during sentence interpretation thus requiring more attention and longer reading times at the PATIENT noun.

The two previous studies demonstrated that the way AGENT and INSTRUMENT roles are filled directly influences expectations a verb produces on its PATIENT argument, determining an update of the role’s prototype (Taraban & McClelland, 1988; Morris, 1994; Kamide et al., 2003; Bicknell et al., 2010; Matsuki et al., 2011; see Elman, 2014 for a review).

In the present work we further assume that, given the incremental nature of linguistic processing, the structure of a role’s prototype and consequently expectations it generates on the position depend on the fillers of all previously filled argument positions; in other words, every time an argument position is filled by a particular noun, all the prototypes of the available argument positions are updated according to the provided information.

This assumption is consistent with recent findings in the ERP literature, according to which expectations about upcoming linguistic input depend on a representation of the event described so far, which is built by integrating linguistic input with stored semantic information and is used to generate predictions about upcoming linguistic material (Paczynski & Kuperberg, 2012; Metusalem et al., 2012; Kim et al., 2016; Kuperberg & Jaeger, 2016; Kuperberg, 2016).

1.3 Distributional Models of Thematic Fit

1.3.1 An Exemplar-Based Model of Selectional Preferences

To understand the logic behind Computational Models of Thematic Fit it is useful to first consider the following quote from McRae et al. (1997):

<< ...we draw no sharp distinctions among lexical, semantic, conceptual and episodic knowledge, as in Jackendoff (1983), but contrary to Schlesinger (1995). Verbs describe episodes; that is, one way to view them is as labels for sets of similar events (i.e. event categories; Kersten & Billman, 1995). Thus a verb’s semantic representation is computed from the set of episodes that have been linked to the verb’s word (i.e. spelling or sound). >> (McRae et al., 1997: 141)

McRae et al. (1997) wrote about episodes, conceiving them as real world situations in which an instance of the described event has been experienced.

Let $E_v = \{e_1, e_2, \dots, e_N\}$ be the set of particular episodes linked to the verb’s form; in each $e_i \in E_v$ a Semantic Role of the verb is played by a particular entity.

Observing the fillers of this role for each episode we can extract some important generalizations on the properties an ideal filler should possess; in other words, by clustering the concepts that occupy a particular verb slot it is possible to build a prototype representation encoding features shared according to different degrees by those filler concepts.

This representation is then used to compute the plausibility of a candidate filler for the considered Semantic Role.

It is commonly assumed in the current psycholinguistic literature on event knowledge (Elman, 2009; 2011; 2014; McRae & Matsuki, 2009; Matsuki et al., 2011; Chersoni et al., 2016; see also Baroni & Lenci, 2010) that at least part of the Generalized Event Knowledge speakers possess is derived from previous linguistic experience and can therefore be modeled with distributional information extracted from corpora; that is, event knowledge is gathered not only through direct participation in events and situations but also from linguistic descriptions of them.

Imagine approximating the set of episodes to which a verb form applies with the set of occurrences of the verb in a very large corpus of linguistic data: each verb token will assign a subset (possibly all) of its Semantic Roles to syntactically realized constituents, which can be used to identify the concepts playing particular roles in the event.

This approach is exactly that taken by current Distributional Models of Thematic Fit Estimation (Erk, 2007; Baroni & Lenci, 2010; Erk, Padó & Padó, 2010; Lenci, 2011; Greenberg et al., 2015; Santus et al., 2017 among others).

Building on the work in Erk (2007), Erk, Padó & Padó (2010) propose an Exemplar-Based Model of Thematic Fit estimation that remembers all the nouns observed in a particular argument position of a verb (for Exemplar Models see Nosofsky, 1986; Daelemans & van den Bosch, 2005; Hay, Nolan & Drager, 2006).

To build a formal representation of a verb’s Semantic Roles, the set of nouns observed in the syntactic relation of the verb used to approximate the role is extracted from a syntactically parsed corpus.
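The following sketch shows one way an exemplar-style score could be computed once the fillers of a slot have been collected. It assumes that every noun already has a vector and that a candidate is scored by its frequency-weighted mean cosine similarity to all stored fillers; the weighting scheme, the toy vectors and the function names are assumptions of this sketch and should not be read as the exact formulation of Erk, Padó & Padó (2010).

```python
import math

def cosine(u, v):
    """Cosine similarity between two dense vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0

def exemplar_fit(candidate_vec, seen_fillers, vectors):
    """Frequency-weighted mean similarity of a candidate to every stored filler."""
    total_weight = sum(seen_fillers.values())
    if total_weight == 0:
        return 0.0
    weighted = sum(
        freq * cosine(candidate_vec, vectors[noun])
        for noun, freq in seen_fillers.items()
    )
    return weighted / total_weight

# Hypothetical fillers of one verb slot with their corpus frequencies.
SEEN = {"pipe": 3, "faucet": 1}

# Hypothetical two-dimensional vectors for the stored fillers.
VECTORS = {"pipe": [1.0, 0.0], "faucet": [0.0, 1.0]}

fit_like_pipe = exemplar_fit([1.0, 0.0], SEEN, VECTORS)
fit_like_faucet = exemplar_fit([0.0, 1.0], SEEN, VECTORS)
```

Unlike a prototype, no summary of the slot is stored: every observed filler contributes to the score, so a candidate that resembles a frequent filler is rated higher than one that resembles a rare filler.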

Each filler noun is represented as a distributional vector in a Semantic Space (Baroni & Lenci, 2010; Turney & Pantel, 2010; Lenci, 2018) whose dimensions are linguistic contexts in which a target noun can occur (filler vectors are built using a different corpus from that employed to extract verb-argument occurrences; see Erk, 2007; Erk, Padó & Padó, 2010). Distributional Semantic Models of lexical meaning (DSMs henceforth; Lenci, 2008; Jurafsky & Martin, 2008; Turney & Pantel, 2010; Clark, 2015; Erk, 2016; Lenci, 2018) draw their inspiration from the very popular Distributional Hypothesis (DH henceforth; Harris, 1954).

The DH states that two words w1 and w2 that frequently occur in the same linguistic contexts are semantically similar; in other words, they share semantic features that allow the two words to occur in the same linguistic contexts.

According to this hypothesis, to determine the semantic similarity between two words we can compare their distributional profiles, each of which is in turn defined as the set of linguistic contexts in which the word frequently occurs (Harris, 1954; Miller & Charles, 1991; Jurafsky & Martin, 2008; Lenci, 2008; 2018).
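The comparison of distributional profiles can be illustrated with a deliberately small example. The corpus, the two-word context window and the use of raw co-occurrence counts with cosine similarity are assumptions of this sketch; real DSMs rely on much larger corpora and usually on weighted counts.

```python
import math
from collections import Counter

# Hypothetical toy corpus, one sentence per string.
CORPUS = [
    "the plumber fixed the pipe",
    "the plumber fixed the sink",
    "the electrician fixed the television",
    "the child watched the television",
]

def profile(target, sentences, window=2):
    """Count the words occurring within `window` tokens of each target occurrence."""
    counts = Counter()
    for sentence in sentences:
        tokens = sentence.split()
        for i, token in enumerate(tokens):
            if token != target:
                continue
            start = max(0, i - window)
            end = min(len(tokens), i + window + 1)
            for j in range(start, end):
                if j != i:
                    counts[tokens[j]] += 1
    return counts

def cosine(p, q):
    """Cosine similarity between two sparse count profiles."""
    dot = sum(p[key] * q[key] for key in p if key in q)
    norm_p = math.sqrt(sum(value * value for value in p.values()))
    norm_q = math.sqrt(sum(value * value for value in q.values()))
    return dot / (norm_p * norm_q) if norm_p and norm_q else 0.0

sim_pipe_sink = cosine(profile("pipe", CORPUS), profile("sink", CORPUS))
sim_pipe_tv = cosine(profile("pipe", CORPUS), profile("television", CORPUS))
```

In this toy corpus "pipe" and "sink" occur in identical contexts, while "television" also occurs after "watched", so the profile of "pipe" is closer to that of "sink" than to that of "television".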
