
Psychology of Music, 1–15. © The Author(s) 2021. DOI: 10.1177/0305735620978697

Multimodal recognition of emotions in music and language

Alice Mado Proverbio 1,2 and Francesca Russo 1,2

1 Department of Psychology, University of Milano-Bicocca, Milan, Italy
2 Milan Center for Neuroscience, Milan, Italy

Corresponding author: Alice Mado Proverbio, Department of Psychology, University of Milano-Bicocca, Piazza dell’Ateneo Nuovo 1, U6, 20126 Milan, Italy. Email: mado.proverbio@unimib.it

Abstract

We investigated through electrophysiological recordings how music-induced emotions are recognized and combined with the emotional content of written sentences. Twenty-four sad, joyful, and frightening musical tracks were presented to 16 participants reading 270 short sentences conveying a sad, joyful, or frightening emotional meaning. Audiovisual stimuli could be emotionally congruent or incongruent with each other; participants were asked to pay attention and respond to filler sentences containing cities’ names, while ignoring the rest. The amplitude values of event-related potentials (ERPs) were subjected to repeated measures ANOVAs. Distinct electrophysiological markers were identified for the processing of stimuli inducing fear (N450, either linguistic or musical), for language-induced sadness (P300) and for joyful music (positive P2 and LP potentials). The music/language emotional discordance elicited a large N400 mismatch response (p = .032). Its stronger intracranial source was the right superior temporal gyrus (STG) devoted to multisensory integration of emotions. The results suggest that music can communicate emotional meaning as distinctively as language.

Keywords

event-related potentials, musical emotions, STG, multimodal stimuli, mismatch response, incongruence, reading

The aim of this study was to investigate if, through which mechanism, and how clearly music can transmit distinct emotional states (Panksepp, 1994). While it is known that semantics and prosody can convey precise emotional hues, the neural mechanism by which music transmits emotions is still under investigation. Coutinho and Dibben (2013) found that reported emotions to music and speech prosody could be predicted from a set of common psychoacoustic features including loudness, tempo/speech rate, melody/prosody contour, spectral structure, sharpness, and roughness. This hints at a common mechanism for extracting the emotional content of auditory affective information. Music is indeed able to arouse specific emotions in the listener. The most frequently reported musical emotions are calm/relaxation, happiness/joy, nostalgia, pleasure/enjoyment, sadness/melancholy, energy, and love/tenderness (see Juslin, 2013; Juslin, Liljeström, Västfjäll, Barradas, & Silva, 2008).

As for the brain mechanisms involved, we know, for example, that fearful music activates the amygdala (e.g., Baumgartner, Lutz, Schmidt, & Jäncke, 2006; Koelsch et al., 2013), that atonal music induces fear bradycardia (Proverbio et al., 2015), and that listening to heroic music activates the ventral striatum (Vuilleumier & Trost, 2015). Again, listening to sad music activates the insula and the cingulate cortex (Janata, 2009; Trost, Ethofer, Zentner, & Vuilleumier, 2012), but the mechanisms by which musical auditory frequencies are interpreted as meaningful by brain structures remain unclear. Paquette, Takerkart, Saget, Peretz, and Belin (2018) proposed that music could tap into neuronal circuits that have evolved primarily for the processing of emotional vocalizations, such as laughter and crying (see also Proverbio, De Benedetto, & Guazzone, 2020; Proverbio, Santoni, & Adorni, 2020 for an integrated model of emotion comprehension in speech, vocalizations, and music). However, the precision and relative universality with which music can transmit emotional meanings remain a rather unexplored issue.

The two-dimensional circumplex model of emotional states (Russell, 1980) predicts that emotions can be conceptualized as varying in intensity, along a mild/strong gradient, and in polarity (positive vs. negative). In this view, relaxation and sadness would be comparable in intensity (although varying in polarity). Here, we only considered fear, joy, and sadness, while a relaxation state was not considered as an emotion, for several reasons. First, because of the specificity of language as an aid for transmitting emotions: relaxing written words are experienced as neutral ones, as shown by preliminary tests. Moreover, as shown by stimulus validation, sad sentences (describing child abuse, rape, orphan stories, etc.) were more arousing than theoretically assumed by this model (i.e., more arousing than scary words). Furthermore, a two-dimensional model of emotional states (considering the pleasure–displeasure and deactivation–activation dimensions) does not hold for brain activation approaches. In fact, the neural circuits supporting the various emotions are non-coincident, so that one area (e.g., the insula or the cingulate cortex) might be more active in association with a supposedly “deactivating” emotion (e.g., sadness), and less active in association with an “activating” emotion (e.g., happiness). It therefore seemed more appropriate to conceptualize emotions as distinct states supported by dedicated neural circuits (e.g., Panksepp, 1994).
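The contrast between the two views can be made concrete with a short sketch. Below, the circumplex model is encoded as (valence, arousal) coordinates and the discrete-emotions view as plain categories; the numeric coordinates are hypothetical placeholders, not values from this study, and serve only to show that sadness and relaxation nearly coincide in arousal under the dimensional view while remaining distinct categories under the discrete view.

```python
# Hypothetical coordinates in Russell's (1980) circumplex space:
# valence in [-1, 1] (negative to positive), arousal in [0, 1] (calm to activated).
CIRCUMPLEX = {
    "joy":        (+0.8, 0.7),
    "fear":       (-0.7, 0.8),
    "sadness":    (-0.7, 0.3),   # dimensional view: low arousal
    "relaxation": (+0.6, 0.2),   # similar arousal to sadness, opposite valence
}

# Discrete-emotions view (e.g., Panksepp, 1994): each state is its own category,
# assumed to be supported by a dedicated neural circuit.
DISCRETE_EMOTIONS = {"joy", "fear", "sadness"}   # relaxation treated as neutral here

def arousal_gap(a: str, b: str) -> float:
    """Difference in arousal between two emotions under the circumplex view."""
    return abs(CIRCUMPLEX[a][1] - CIRCUMPLEX[b][1])

# Under the circumplex view the gap is small by construction; the validation data
# reported below suggest sad sentences were in fact highly arousing, which the
# discrete view accommodates more naturally.
print(arousal_gap("sadness", "relaxation"))
```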

One possible way of investigating how music can convey clear emotional information is to compare music with other means of transmitting emotions (e.g., language, prosody, lyrics, facial expressions). Several studies explored the integration between affective information conveyed by facial expressions and music (e.g., Jeong et al., 2011; Jolij & Meurs, 2011; Proverbio & De Benedetto, 2018; Proverbio, Camporeale, & Brusa, 2020) or images and music (Spreckelmeyer, Kutas, Urbach, Altenmüller, & Münte, 2006), finding enhanced N400 responses to emotional discordance. Using magnetoencephalography, Parkes, Perry, and Goodin (2016) also found enhanced N400 responses to word emotional discordance during sentence reading (e.g., “My mother was killed and I felt great”). In classical N400 paradigms, pairs of semantically congruent words (such as “tea” and “coffee”), as opposed to pairs of incongruent items (such as “milk” and “hydrocarbon”), are presented during electroencephalogram (EEG) recording. The second item of an incongruent pair typically elicits a measurable negative potential known as the N400 response, reflecting the conjoined processing of two incongruous pieces of information (Kutas & Federmeier, 2000; Lau et al., 2008).
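In practice, the congruency effect in such paradigms is usually quantified as the difference in mean amplitude within the N400 time window between incongruent and congruent trials. The sketch below shows one minimal way of doing this with NumPy, assuming baseline-corrected epochs are already available as (trials × time points) arrays time-locked to the second item; the array names, sampling rate, and 300–500 ms window are illustrative assumptions, not the analysis pipeline of this study.

```python
import numpy as np

SFREQ = 512.0        # sampling rate in Hz (matches the EEG recording described below)
EPOCH_START = -0.1   # epoch onset relative to stimulus onset, in seconds

def mean_amplitude(epochs: np.ndarray, t_min: float, t_max: float) -> float:
    """Mean amplitude (µV) of single-trial data within [t_min, t_max] seconds.

    epochs: array of shape (n_trials, n_times) from one electrode
    (or an electrode-cluster average), baseline-corrected.
    """
    times = EPOCH_START + np.arange(epochs.shape[1]) / SFREQ
    window = (times >= t_min) & (times <= t_max)
    return float(epochs[:, window].mean())

# Illustrative usage with random data standing in for real single-trial epochs.
rng = np.random.default_rng(0)
congruent = rng.normal(0.0, 5.0, size=(90, 512))     # hypothetical trial counts
incongruent = rng.normal(-1.0, 5.0, size=(90, 512))  # more negative around N400

n400_effect = (mean_amplitude(incongruent, 0.3, 0.5)
               - mean_amplitude(congruent, 0.3, 0.5))
print(f"N400 congruency effect: {n400_effect:.2f} µV")
```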


might be emotionally discordant, or not, with the verbal message, by means of the N400 paradigm. Music has rarely been compared with linguistic stimuli, except for single words (e.g., Goerlich et al., 2012), while it has often been compared with lyrics and prosody (Morton & Trehub, 2001; Vidas et al., 2018).

The aim of the present study was multifold. By measuring the amplitude of event-related potential (ERP) responses as a function of the emotional content of music and its congruence with linguistic information, we hoped to gather information about how precisely listeners comprehend the emotional content of music. (1) We wished to assess whether music was able to transmit emotions (joy, sadness, and fear) as distinctive as those conveyed by the semantics of language. To this aim, emotional sentences were congruently or incongruently paired with piano music excerpts transmitting sadness, joy, or fear. If music clearly conveyed an emotional significance, an N400 response to incongruent music/sentence pairings should be observed. (2) Furthermore, we wished to investigate whether the valence of the emotional content of stimuli (positive vs. negative) differentially modulated ERP responses. Specifically, it was hypothesized that fear (conveyed by music or written language) would be associated with an enlarged N400-like anterior response, as previously found for negative emotions conveyed by music (Proverbio et al., 2020), speech prosody (Grossman, Striano, & Friederici, 2005; Schirmer, Kotz, & Friederici, 2002), and vocalizations (Proverbio et al., 2020). Again, we expected that positive emotions such as joy (as opposed to sadness) would enhance the amplitude of the P2 and late positivity (LP), as found for speech prosody (Paulmann, Bleichner, & Kotz, 2013; Paulmann & Kotz, 2008), music and vocalizations (Proverbio et al., 2020), and language (Kanske & Kotz, 2007). Finally, we expected that the P300 component, reflecting the arousal response induced by a stimulus, would be greater to negative than to positive emotions (Cuthbert, Schupp, Bradley, Birbaumer, & Lang, 2000).

Material and methods

Participants

Twenty university students (10 males and 10 females) volunteered for the EEG study. They ranged in age from 21 to 28 years (Mage = 23.65 years, SD = 1.75) and had a high level of education (diploma, n = 6; university degree, n = 14). Experiments were conducted with the understanding and written consent of each participant, with approval from the local Ethical Committee (Prot. N. RM-2019-176) and in compliance with APA ethical standards. Four participants were subsequently discarded for excessive EEG artifacts. Additional methodological details can be found in the Supplemental Materials online.

Stimulus and procedure

Stimulus validation. Twenty university students (10 males and 10 females) took part in the validation procedure. They ranged in age from 22 to 28 years (Mage = 24.15 years, SD = 2.08) and had a high level of education (diploma, n = 4; university degree, n = 16).


scale. In all cases, the intended emotion was clearly recognized, with more than 80% of correct responses (joy = 99%, sadness = 84%, and fear = 82%).

The visual stimuli were 300 Italian sentences created on purpose for the present study: joy (n = 100), sadness (n = 100), and fear (n = 100). They elicited the same types of emotions as the musical stimuli: joy (e.g., “The mother heard the first word of her infant”), sadness (e.g., “He fought hopelessly all his life against cancer”), and fear (e.g., “He opened his eyes wide and found himself submerged by the rubble”).

To validate the sentences’ emotional content, the participants of this validation study were asked to read them and decide which emotional content they expressed among joy, sadness, and fear. Participants were also asked to indicate the intensity of the emotion induced by means of a 3-point Likert-type scale (“I feel”: 0 = calm, 1 = activated, and 2 = very activated) taken from a modified version of the Self-Assessment Manikin tool (Bradley & Lang, 1994). This evaluation tool is based on nonverbal responses and directly measures the activation associated with the emotional reaction elicited in a person by an object or an event.

The validation results showed that 270 out of 300 sentences were evaluated coherently by at least 80% of evaluators, while 90% of sentences were categorized as reflecting the same emotion by 18 out of 20 evaluators. These sentences were selected as stimuli for the EEG recording study and were equally paired to congruent and incongruent musical pieces as shown in Table 1.
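The 80% agreement criterion can be made concrete with a short sketch. Below, a hypothetical long-format ratings table (one row per evaluator × sentence judgment) is reduced to the proportion of evaluators agreeing with the intended emotion, and sentences reaching 0.80 are retained. The column names and the pandas-based approach are illustrative assumptions, not the authors’ actual scoring script.

```python
import pandas as pd

# Hypothetical long-format ratings: one row per (evaluator, sentence) judgment.
ratings = pd.DataFrame({
    "sentence_id":      [1, 1, 1, 2, 2, 2],
    "intended_emotion": ["joy", "joy", "joy", "fear", "fear", "fear"],
    "rated_emotion":    ["joy", "joy", "sadness", "fear", "fear", "fear"],
})

# Proportion of evaluators whose label matches the intended emotion, per sentence.
agreement = (
    ratings.assign(correct=ratings["rated_emotion"] == ratings["intended_emotion"])
           .groupby("sentence_id")["correct"]
           .mean()
)

# Keep sentences on which at least 80% of evaluators agreed (270/300 in the study).
selected = agreement[agreement >= 0.80].index.tolist()
print(agreement)
print("Selected sentences:", selected)
```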

The final set comprised 270 emotional sentences (90 joyful, 90 sad, and 90 frightening) plus 30 neutral sentences containing a city name (e.g., “Many composers lived in Wien in the 1800s” or “The carbonara is a typical dish from Rome”), which acted as target stimuli.

Table 1. Example of stimulus pairs as a function of congruency condition and specific emotional content.

Incongruent
  Phrase: Joy (e.g., Opened anxiously his birthday present); Music: Sad
  Phrase: Joy (e.g., The little girl crawled for the first time); Music: Frightening
  Phrase: Fear (e.g., The snarling dog pounced on her); Music: Joyful
  Phrase: Sadness (e.g., Her one-year-old child died of meningitis); Music: Joyful
  Phrase: Fear (e.g., He remembered the threatening words of his tormentor); Music: Sad
  Phrase: Sadness (e.g., That gentleman spent his entire life alone); Music: Frightening

Congruent
  Phrase: Joy (e.g., He breathed clean, fresh air as he pedaled); Music: Joyful
  Phrase: Sadness (e.g., That young boy died in a work accident); Music: Sad
  Phrase: Fear (e.g., The cellar door slammed shut behind him); Music: Frightening

Procedure. Participants sat comfortably in a dark, acoustically and electrically shielded test area facing a screen located 120 cm from their eyes. They were instructed to gaze at a central translucent yellow dot that served as a fixation point, and to avoid any eye or body movements during the recording session.

The sentences, written in white Arial Narrow size 10 font on a black background, were presented centrally and distributed over two to three lines, to facilitate reading without saccadic movements. Sentences occupied a visual angle of approximately 3°30′ in length and 1°30′ in height. The presentation of the auditory and visual stimuli was subdivided into eight experimental runs lasting ~2 min each (112 s). In total, 54 music tracks (n = 18 joy, n = 18 sadness, and n = 18 fear), 270 affective sentences (n = 90 joy, n = 90 sadness, and n = 90 fear), and 30 target sentences were presented across runs. In this way, each run contained emotionally congruent and incongruent music/language associations, depending on the emotional content of music and language and their respective congruence or incongruence.

Every musical fragment lasted ~12 s and was paired with five sentences. The 30 filler sentences containing city names were randomly distributed across the eight runs. Sentences lasted 1,700 ms and were separated by an inter-stimulus interval (ISI) varying randomly from 400 to 600 ms. To obtain enough stimulus repetitions for ERP averaging, musical stimuli were repeated twice. Musical fragments were normalized with Audacity to −1 dB and leveled at −70 dB. Musical stimuli were validated by Vieillard et al. (2008).
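To make the trial structure explicit, the sketch below builds the event list for a single hypothetical run: each ~12-s musical fragment is accompanied by five sentences of 1,700 ms, each followed by a 400–600 ms jittered inter-stimulus interval. The number of fragments per run and the random seed are illustrative assumptions; this is not the authors’ presentation script, which is not described at code level in the paper.

```python
import random

SENTENCE_S = 1.7
ISI_RANGE_S = (0.4, 0.6)       # inter-stimulus interval, uniformly jittered
FRAGMENT_S = 12.0              # approximate duration of each musical fragment
SENTENCES_PER_FRAGMENT = 5
FRAGMENTS_PER_RUN = 8          # hypothetical count; not specified per run in the paper

def build_run(seed: int = 0) -> list[tuple[float, str]]:
    """Return (onset_in_seconds, event_label) pairs for one hypothetical run."""
    rng = random.Random(seed)
    events = []
    for frag in range(FRAGMENTS_PER_RUN):
        block_start = frag * FRAGMENT_S
        events.append((block_start, f"music_fragment_{frag}"))   # music onset
        t = block_start
        for sent in range(SENTENCES_PER_FRAGMENT):
            events.append((t, f"sentence_{frag}_{sent}"))        # sentence onset
            t += SENTENCE_S + rng.uniform(*ISI_RANGE_S)          # sentence + ISI
    return events

if __name__ == "__main__":
    run = build_run()
    print(f"events: {len(run)}, run duration ≈ {FRAGMENTS_PER_RUN * FRAGMENT_S:.0f} s")
```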

The experimental task consisted of responding, as quickly and accurately as possible, by pressing a key on a joypad with the right index finger to sentences containing a city name (e.g., “The historical buildings of Graz are very famous”). The task was fictitious and served to draw the participants’ attention toward the stimulation without revealing the study’s aim.

EEG recordings

EEG signals were continuously recorded from 128 scalp sites at a sampling rate of 512 Hz using tin electrodes mounted in an elastic cap. Further details are reported in Supplemental Materials online.
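For readers who want to reproduce this kind of pipeline, the sketch below shows how 128-channel continuous EEG sampled at 512 Hz could be band-pass filtered and segmented into stimulus-locked epochs with MNE-Python. The file name, trigger codes, and filter/epoch settings are assumptions for illustration; the authors’ actual preprocessing parameters are given in their Supplemental Materials, not here.

```python
import mne

# Hypothetical raw file; the actual acquisition format is not specified here.
raw = mne.io.read_raw_fif("subject01_raw.fif", preload=True)

# Band-pass filter the continuous data (illustrative cut-offs).
raw.filter(l_freq=0.1, h_freq=30.0)

# Hypothetical trigger codes for sentence onsets in each emotion condition.
events = mne.find_events(raw, stim_channel="STI 014")
event_id = {"joy": 1, "sadness": 2, "fear": 3}

# Epoch from 100 ms before to 1,000 ms after sentence onset, baseline-corrected.
epochs = mne.Epochs(raw, events, event_id=event_id,
                    tmin=-0.1, tmax=1.0, baseline=(None, 0.0), preload=True)

# Average per condition to obtain the ERPs entering the amplitude analyses.
evokeds = {cond: epochs[cond].average() for cond in event_id}
print(evokeds["fear"])
```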

Results

Hit percentage was 98.55%, while the mean response time was 967 ms. Behavioral data were not analyzed because there was no factor of variability, since targets were just rare attention-capturing stimuli.

Linguistic ERPs

The ANOVA carried out on the linguistic N170 response did not show any significant effect of emotional content, indicating that stimuli were comparable across classes from the perceptual, orthographic, and lexical points of view (see Figure 1). Full inferential statistics can be found in Supplemental Table 1.

The ANOVA performed on the N450 linguistic component showed a significant effect of hemisphere, with greater N450 amplitudes over the left (–1.66 µV; SD = 0.38) than the right hemisphere (–.35 µV; SD = 0.55). The emotion × hemisphere interaction was also significant. Post hoc comparisons showed that N450 was larger to frightening (–2.01 µV; SD = 0.39) than to joyful (–1.60 µV; SD = 0.50) or sad sentences (–1.38 µV; SD = 0.41) over the left hemisphere. The electrode × hemisphere interaction was also statistically significant. Post hoc comparisons showed that N450 was of greater amplitude at left anterior frontal sites (–2.10 µV; …


The ANOVA performed on P300 amplitudes showed a significant effect of the emotional content factor. Post hoc comparisons (p = .016) indicated that sad sentences elicited larger positivities (–.47 µV; SD = 0.58) than frightening (–1.40 µV; SD = 0.50) or joyful sentences (–1.45 µV; SD = 0.62), as can also be appreciated by looking at the waveforms of Figure 1.
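The repeated-measures ANOVAs reported throughout this Results section can be reproduced with standard tools once mean amplitudes have been extracted per participant and condition. Below is a minimal sketch using statsmodels’ AnovaRM on a long-format table with hypothetical column names (subject, emotion, hemisphere, amplitude); it illustrates a generic 3 × 2 within-subject design and is not the authors’ actual analysis code.

```python
import numpy as np
import pandas as pd
from statsmodels.stats.anova import AnovaRM

# Simulated long-format data: 16 participants × 3 emotions × 2 hemispheres,
# one mean ERP amplitude (µV) per cell, standing in for real measurements.
rng = np.random.default_rng(42)
rows = [
    {"subject": s, "emotion": emo, "hemisphere": hemi,
     "amplitude": rng.normal(loc=-1.0, scale=0.5)}
    for s in range(1, 17)
    for emo in ("joy", "sadness", "fear")
    for hemi in ("left", "right")
]
data = pd.DataFrame(rows)

# Repeated-measures ANOVA with emotion and hemisphere as within-subject factors.
result = AnovaRM(data, depvar="amplitude", subject="subject",
                 within=["emotion", "hemisphere"]).fit()
print(result)
```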

Auditory ERPs

The ANOVA carried out on P2 mean area amplitudes revealed a significant effect of the emotional content factor. Post hoc comparisons showed that joyful music elicited a larger P2 component (1.05 µV; SD = 0.42) than frightening (.69 µV; SD = 0.40) or sad music (.33 µV; SD = 0.44). This result can be appreciated by looking at the waveforms of Figure 2 and was confirmed by post hoc comparisons (p = .030). The ANOVA also revealed a significant effect of hemisphere, indicating larger P2 responses over the left (1.26 µV; SD = 0.42) than the right hemisphere (.12 µV; SD = 0.41).

Figure 1. Grand-average ERPs recorded at prefrontal and occipito/temporal sites in response to different categories of linguistic stimuli. It is visible how the orthographic N170 response did not change as a function of the sentence’s emotional content.

The ANOVA carried out on N400 mean area amplitudes revealed a significant effect of emotional content. Post hoc comparisons (p = .003) showed that N400 responses elicited by frightening music were much greater (–1.80 µV; SD = 0.44) than those to joyful (–.79 µV; SD = 0.50) or sad music (–1.01 µV; SD = 0.48), as can be clearly appreciated in Figure 2.

The ANOVA carried out on LP mean area amplitudes showed a significant effect of the emotional content factor. Post hoc comparisons (p = .017) showed that joyful music elicited a wider LP component (1.02 µV; SD = 0.42) than sad music (.45 µV; SD = 0.45) or frightening music (.40 µV; SD = 0.44).

The ANOVA also highlighted a significant electrode factor, indicating greater LP amplitudes over inferior frontal than frontal electrode sites. The ANOVA also yielded a significant emotional content × hemisphere interaction. Post hoc comparisons indicated much larger LPs to joyful than to sad and frightening music over right hemispheric sites (see Figure 2) and a lack of discriminative effects over left hemispheric sites (RH: joyful = 1.2 µV, SD = 0.43; sad = .38 µV, SD = 0.46; frightening = .48 µV, SD = 0.48. LH: joyful = .87 µV, SD = 0.44; sad = .75 µV, SD = 0.47; frightening = .55 µV, SD = 0.54).

Figure 2. Grand-average ERPs recorded at left and right sites in response to different categories of auditory stimuli.


Audiovisual ERPs

The ANOVA performed on the N400 mean area amplitudes showed a significant congruency × hemisphere interaction. Post hoc comparisons showed greater N400 responses to incongruent than congruent audiovisual stimuli, especially over the right hemisphere (p < .001; congruent: LH = –2.30 µV, SD = 0.40; RH = –1.05 µV, SD = 0.57; incongruent: LH = –2.066 µV, SD = 0.42; RH = –1.61 µV, SD = 0.58). This effect can be clearly observed in Figure 3.

An analysis of the swLORETA inverse solution of surface potentials was also performed to identify the intracranial generators responsible for the cortical activity recorded when auditory and visual information conflicted from the emotional point of view (i.e., in the incongruent condition). The most active area was the right superior temporal gyrus (STG, BA22), followed, in order of magnitude, by the left fusiform area (BA37), the left Wernicke area (BA39), prefrontal areas bilaterally (BA10), and inferior frontal areas (BA10/BA11). The source reconstruction can be observed in Figure 4, while a list of active brain areas is reported in Supplemental Materials online (Supplemental Table 2).
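Distributed inverse solutions of this kind can be sketched with open tools, although swLORETA itself is not available in MNE-Python; the closest readily available stand-in is sLORETA, shown below under the assumption that epochs for the incongruent condition, a forward model (fwd), and a noise covariance (noise_cov) have already been computed for the 128-channel montage. This is a generic illustration of localizing the N400 window, not the authors’ swLORETA pipeline.

```python
from mne.minimum_norm import make_inverse_operator, apply_inverse

# Assumed to exist already: `epochs` (with an "incongruent" condition), a forward
# solution `fwd`, and a noise covariance `noise_cov` from the pre-stimulus baseline.
evoked_incongruent = epochs["incongruent"].average()

# Build the inverse operator and apply sLORETA (a stand-in for swLORETA here).
inv = make_inverse_operator(evoked_incongruent.info, fwd, noise_cov)
stc = apply_inverse(evoked_incongruent, inv, lambda2=1.0 / 9.0, method="sLORETA")

# Restrict to the 400-500 ms window analyzed for the emotional-incongruence N400
# and find the vertex with the strongest source activity in that window.
stc_n400 = stc.copy().crop(tmin=0.4, tmax=0.5)
peak_vertex, peak_time = stc_n400.get_peak()
print(f"Strongest source at vertex {peak_vertex}, t = {peak_time:.3f} s")
```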

Discussion

The study’s aim was to investigate the neural mechanisms subserving the comprehension of the emotional meaning of language and music, as well as their conjoined processing and integration within the semantic system. The effect of emotional incongruity between a phrase and a musical soundtrack was observed with an implicit paradigm. Sentences eliciting three different emotional states (joy, fear, and sadness) and musical tracks expressing the same emotional categories were congruently or incongruently paired, while participants engaged in a fictitious task.

Several ERP components were analyzed. Some of them were associated with language processing (N170, N450, and P300), others with the encoding of the musical stimuli (P2, N400, and LP), while the audiovisual N400 reflected the conjoined processing of auditory and visual stimuli and therefore the congruence/incongruence effect. The amplitude of the N170, sensitive to orthographic stimulus properties (Maurer, Zevin, & McCandliss, 2008), did not differ as a function of emotional class, thus suggesting an optimal perceptual and linguistic matching of stimuli across classes.

Language: frontal N450 larger to fear

Figure 3. (a) Grand-average ERP waveforms relative to the perception of audiovisual stimuli.

Language: P300 response larger to sadness

The P300 response was greater to sad than to happy/frightening sentences. This fits with previous literature (Bernat, Bunce, & Shevrin, 2001) reporting larger P300s to negative than to positive words/sentences. Here, the fact that P300 was larger to sad than to frightening sentences suggests that the former were more arousing than the latter stimuli. Indeed, the P300 response is larger to more arousing than to less arousing stimuli (Cuthbert et al., 2000). The brain activation induced by reading sad events might reflect the affective relevance of the stories depicted (e.g., descriptions of persons suffering from excruciating pain), inducing empathic distress (Cuthbert et al., 2000; Schupp et al., 2004).

Music: positive responses (P2 and LP) larger to joy

The P2 component was larger to joyful than to frightening/sad music. This response is thought to play a crucial role in the analysis of physical stimulus characteristics, such as intensity, frequency, and tonality, as well as in selective attention (Kotz & Paulmann, 2011; Morris, Steinmetzger, & Tøndering, 2016), which might suggest that joyful musical pieces were more complex from the perceptual point of view. Indeed, they were more articulated and fast, with a higher number of notes per second, than sad/fearful music. Interestingly, P2 was larger over the left than the right hemisphere, which recalls the hemispheric asymmetry model predicting a left-hemisphere role for positive emotions (Rodway & Schepman, 2007; Schmidt & Trainor, 2001) and a right-hemisphere role for negative emotions such as fear/sadness (Ross, Thompson, & Yenkosky, 1997).

Figure 4. Coronal and axial views of swLORETA solutions relative to the N400 response to emotional incongruence (400–500 ms).


Later, the LP deflection was greater to happy than to sad/frightening music. This is strikingly similar to what was found for music, nonverbal vocalizations, and speech by previous investigations (Proverbio, De Benedetto, & Guazzone, 2020; Proverbio, Santoni, & Adorni, 2020). The inferior frontal distribution of this component (larger to positive stimuli) also resembles previous functional magnetic resonance imaging (fMRI) findings by Belin et al. (2008) and Koelsch et al. (2006) relative to positive vocalizations and pleasant music.

Music: N400 larger to fear

The auditory N400 response (450–550 ms) was larger to frightening music, as opposed to sad and happy music. Quite consistently, larger auditory N400s to negative prosody (Schirmer et al., 2002), speech (Grossman et al., 2005), human vocalizations, and music (e.g., crying; Proverbio et al., 2020) were previously reported, hinting at a common mechanism for comprehending the emotional content of music.

Audiovisual stimuli: N400 larger to incongruent emotions

The N400 response showed a strong increase in amplitude induced by the emotional incongruence of audiovisual stimuli (regardless of the specific content, e.g., joy, fear, or sadness), especially over the right anterior areas. For example, reading a joyful phrase while listening to sad music induced an increase in N400 amplitude. To investigate the cortical areas responsible for comprehending the emotional significance of audiovisual stimuli, a swLORETA analysis was performed on the N400 response to incongruent pairs. The most active electromagnetic dipole was located in the right STG (BA22), which is thought to play a key role in audiovisual integration (Jeong et al., 2011) and in the extraction of the emotions conveyed by speech and concurrent visual stimuli (e.g., lip motion). Previous studies have shown how the STG is involved in processing multisensory emotional information (Pehrs et al., 2014; Salmi et al., 2017), prosody (Beaucousin et al., 2007), and music (Angulo-Perkins et al., 2014; Khalfa et al., 2008; Proverbio & De Benedetto, 2018). In our study, the left fusiform gyrus, known as the “visual word form area” (Polk et al., 2002; Proverbio, Zani, & Adorni, 2008), was also very active according to swLORETA. The third most active region explaining the N400 response to affective incongruence was the left middle temporal gyrus (BA39), also known as the “extended Wernicke’s area,” involved in word associations and semantics, as well as in speech prosody integration.

The medial and superior frontal areas (BA10), described as key structures in music listening (Altenmüller, Siggel, Mohammadi, Samii, & Münte, 2014; Khalfa et al., 2005; Proverbio & De Benedetto, 2018) and supporting working memory and attention, were also bilaterally active. In addition, swLORETA showed the activation of the right inferior frontal gyrus (BA45) and the left orbitofrontal cortex (BA11) during the coding of incongruent stimuli. The inferior frontal gyrus is thought to play a key role in the syntactic processing of language (Zaccarella, Meyer, Makuuchi, & Friederici, 2017). Its activity during the processing of incongruent audiovisual pairs might reflect the detection of a syntactic violation in the incoming information. Indeed, neuroimaging studies (Janata et al., 2002; Koelsch, 2011) suggest that music-syntactic processing involves the pars opercularis of the inferior frontal gyrus, which fits well with our data.


this study, so that the role of musical structure (harmonic content, melodic profile, tempo, rhythmic structure, and timbre) in transmitting specific emotional information needs to be further investigated. To this purpose, future studies involving musical fragments of a more complex nature, played by instruments other than the piano, should be carried out.

Conclusion

The present data showed that the extraction and integration of the emotional content of audiovisual stimuli is an automatic and fast-occurring process. Piano musical fragments were able to transmit clear-cut information about their emotional content as early as 250 ms post-stimulus, while semantic comprehension showed later effects (450 ms). Distinct ERP markers were identified for the processing of fear (N450, either linguistic or musical), for language-induced sadness (P300), and for joyful music (positive P2 and LP potentials). Overall, this evidence supports the ability of music to convey and communicate emotions as distinctively and effectively as verbal messages. A key role in this process would be played by the right STG, thanks to its multimodal and affective properties in the processing of music, vocalizations, and speech.

Acknowledgements

The authors are very grateful to all participants and to Francesco De Benedetto, Roberta Adorni, Andrea Orlandi, and Alessandra Brusa for their technical help.

Funding

The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: The study was supported by the 2018-ATE-0003 28882 grant entitled “Neural encoding of the emotional content of speech and music” from the University of Milano-Bicocca to AMP.

ORCID iD

Alice Mado Proverbio https://orcid.org/0000-0002-5138-1523

Supplemental material

Supplemental material for this article is available online.

References

Altenmüller, E., Siggel, S., Mohammadi, B., Samii, A., & Münte, T. F. (2014). Play it again, Sam: Brain correlates of emotional music recognition. Frontiers in Psychology, 5, Article 114. doi:10.3389/fpsyg.2014.00114

Angulo-Perkins, A., Aubé, W., Peretz, I., Barrios, F. A., Armony, J. L., & Concha, L. (2014). Music listening engages specific cortical regions within the temporal lobes: Differences between musicians and non-musicians. Cortex, 59, 126–137. doi:10.1016/j.cortex.2014.07.013

Baumgartner, T., Lutz, K., Schmidt, C. F., & Jäncke, L. (2006). The emotional power of music: How music enhances the feeling of affective pictures. Brain Research, 1075, 151–164. doi:10.1016/j.brainres.2005.12.065


Belin, P., Fecteau, S., Charest, I., Nicastro, N., Hauser, M. D., & Armony, J. L. (2008). Human cerebral response to animal affective vocalizations. Proceedings of the Royal Society B: Biological Sciences, 275, 473–481. doi:10.1098/rspb.2007.1460

Bernat, E., Bunce, S., & Shevrin, H. (2001). Event-related brain potentials differentiate positive and negative mood adjectives during both supraliminal and subliminal visual processing. International Journal of Psychophysiology, 42, 11–34. doi:10.1016/s0167-8760(01)00133-7

Bradley, M. M., & Lang, P. J. (1994). Measuring emotion: The Self-Assessment Manikin and the Semantic Differential. Journal of Behavior Therapy and Experimental Psychiatry, 25, 49–59. doi:10.1016/0005-7916(94)90063-9

Coutinho, E., & Dibben, N. (2013). Psychoacoustic cues to emotion in speech prosody and music. Cognition and Emotion, 27, 658–684. doi:10.1080/02699931.2012.732559

Cuthbert, B. N., Schupp, H. T., Bradley, M. M., Birbaumer, N., & Lang, P. J. (2000). Brain potentials in affective picture processing: Covariation with autonomic arousal and affective report. Biological Psychology, 52, 95–111. doi:10.1016/s0301-0511(99)00044-7

Goerlich, K. S., Witteman, J., Schiller, N. O., Van Heuven, V. J., Aleman, A., & Martens, S. (2012). The nature of affective priming in music and speech. Journal of Cognitive Neuroscience, 24, 1725–1741. doi:10.1162/jocn_a_00213

Grossmann, T., Striano, T., & Friederici, A. D. (2005). Infants’ electric brain responses to emotional prosody. NeuroReport, 16, 1825–1828. doi:10.1097/01.wnr.0000185964.34336.b1

Janata, P. (2009). The neural architecture of music-evoked autobiographical memories. Cerebral Cortex, 19, 2579–2594. doi:10.1093/cercor/bhp008

Janata, P., Birk, J., Van Horn, J. D., Leman, M., Tillmann, B., & Bharucha, J. J. (2002). The cortical topography of tonal structures underlying Western music. Science, 298, 2167–2170. doi:10.1126/science.1076262

Jeong, J. W., Diwadkar, V. A., Chugani, C. D., Sinsoongsud, P., Muzik, O., Behen, M. E., & Chugani, D. C. (2011). Congruence of happy and sad emotion in music and faces modifies cortical audiovisual activation. NeuroImage, 54, 2973–2982. doi:10.1016/j.neuroimage.2010.11.017

Jolij, J., & Meurs, M. (2011). Music alters visual perception. PLoS ONE, 6(4), Article e18861. doi:10.1371/journal.pone.0018861

Juslin, P. N. (2013). From everyday emotions to aesthetic emotions: Towards a unified theory of musical emotions. Physics of Life Reviews, 10, 235–266. doi:10.1016/j.plrev.2013.05.008

Juslin, P. N., Liljeström, S., Västfjäll, D., Barradas, G., & Silva, A. (2008). An experience sampling study of emotional reactions to music: Listener, music, and situation. Emotion, 8, 668–683. doi:10.1037/a0013505

Kanske, P., & Kotz, S. A. (2007). Concreteness in emotional words: ERP evidence from a hemifield study. Brain Research, 1148, 138–148. doi:10.1016/j.brainres.2007.02.044

Khalfa, S., Schon, D., Anton, J., & Liégeois-Chauvel, C. (2005). Brain regions involved in the recognition of happiness and sadness in music. NeuroReport, 16, 1981–1984. doi:10.1097/00001756-200512190-00002

Khalfa, S., Guye, M., Peretz, I., Chapon, F., Girard, N., Chauvel, P., & Liégeois-Chauvel, C. (2008). Evidence of lateralized anteromedial temporal structures involvement in musical emotion processing. Neuropsychologia, 46, 2485–2493. doi:10.1016/j.neuropsychologia.2008.04.009

Koelsch, S. (2011). Toward a neural basis of music perception—A review and updated model. Frontiers in Psychology, 2, Article 110. doi:10.3389/fpsyg.2011.00110

Koelsch, S., Cramon, D. Y. V., Müller, K., & Friederici, A. D. (2006). Investigating emotion with music: An fMRI study. Human Brain Mapping, 27, 239–250. doi:10.1002/hbm.20180

Koelsch, S., Skouras, S., Fritz, T., Herrera, P., Bonhage, C., Küssner, M. B., & Jacobs, A. M. (2013). The roles of superficial amygdala and auditory cortex in music-evoked fear and joy. NeuroImage, 81, 49–60. doi:10.1016/j.neuroimage.2013.05.008


Kutas, M., & Federmeier, K. D. (2000). Electrophysiology reveals semantic memory use in language comprehension. Trends in Cognitive Sciences, 4, 463–470. doi:10.1016/S1364-6613(00)01560-6

Lau, E. F., Phillips, C., & Poeppel, D. (2008). A cortical network for semantics: (De)constructing the N400. Nature Reviews Neuroscience, 9, 920–933. doi:10.1038/nrn2532

Maurer, U., Zevin, J. D., & McCandliss, B. D. (2008). Left-lateralized N170 effects of visual expertise in reading: Evidence from Japanese syllabic and logographic scripts. Journal of Cognitive Neuroscience, 20, 878–891. doi:10.1162/jocn.2008.20125

Morris, D. J., Steinmetzger, K., & Tøndering, J. (2016). Auditory event-related responses to diphthongs in different attention conditions. Neuroscience Letters, 626, 158–163. doi:10.1016/j.neulet.2016.05.002

Morton, J. B., & Trehub, S. E. (2001). Children’s understanding of emotion in speech. Child Development, 72, 834–843. doi:10.1111/1467-8624.00318

Panksepp, J. (1994). The basics of emotion. In P. Ekman & R. J. Davidson (Eds.), The nature of emotions: Fundamental questions (pp. 20–24). Oxford, UK: Oxford University Press.

Paquette, S., Takerkart, S., Saget, S., Peretz, I., & Belin, P. (2018). Cross-classification of musical and vocal emotions in the auditory cortex. Annals of the New York Academy of Sciences, 1423, 329–337. doi:10.1111/nyas.13666

Parkes, L., Perry, C., & Goodin, P. (2016). Examining the N400m in affectively negative sentences: A magnetoencephalography study. Psychophysiology, 53, 689–704. doi:10.1111/psyp.12601

Paulmann, S., Bleichner, M., & Kotz, S. A. (2013). Valence, arousal, and task effects in emotional prosody processing. Frontiers in Psychology, 4, Article 345. doi:10.3389/fpsyg.2013.00345

Paulmann, S., & Kotz, S. A. (2008). Early emotional prosody perception based on different speaker voices. NeuroReport, 19, 209–213. doi:10.1097/WNR.0b013e3282f454db

Pehrs, C., Deserno, L., Bakels, J. H., Schlochtermeier, L. H., Kappelhoff, H., Jacobs, A. M., & Kuchinke, L. (2014). How music alters a kiss: Superior temporal gyrus controls fusiform-amygdalar effective connectivity. Social Cognitive and Affective Neuroscience, 9, 1770–1778. doi:10.1093/scan/nst169

Polk, T., Stallcup, M., Aguirre, G., Alsop, D., D’Esposito, M., & Detre, A. (2002). Neural specialization for letter recognition. Journal of Cognitive Neuroscience, 14, 145–159. doi:10.1162/089892902317236803

Proverbio, A. M., Camporeale, E., & Brusa, A. (2020). Multimodal recognition of emotions in music and facial expressions. Frontiers in Human Neuroscience, 14, Article 32. doi:10.3389/fnhum.2020.00032

Proverbio, A. M., & De Benedetto, F. (2018). Auditory enhancement of visual memory encoding is driven by emotional content of the auditory material and mediated by superior frontal cortex. Biological Psychology, 132, 164–175. doi:10.1016/j.biopsycho.2017.12.003

Proverbio, A. M., De Benedetto, F., & Guazzone, M. (2020). Shared neural mechanisms for processing emotions in music and vocalizations. European Journal of Neuroscience, 51, 1987–2007. doi:10.1111/ejn.14650

Proverbio, A. M., Manfrin, L., Arcari, L. A., De Benedetto, F., Gazzola, M., Guardamagna, M., & Zani, A. (2015). Non-expert listeners show decreased heart rate and increased blood pressure (fear bradycardia) in response to atonal music. Frontiers in Psychology, 6, Article 1646. doi:10.3389/fpsyg.2015.01646

Proverbio, A. M., Santoni, S., & Adorni, R. (2020). ERP markers of valence coding in emotional speech processing. iScience, 23, 100933. doi:10.1016/j.isci.2020.100933

Proverbio, A. M., Zani, A., & Adorni, R. (2008). The left fusiform area is affected by written frequency of words. Neuropsychologia, 46, 2292–2299. doi:10.1016/j.neuropsychologia.2008.03.024

Rodway, P., & Schepman, A. (2007). Valence specific laterality effects in prosody: Expectancy account and the effects of morphed prosody and stimulus lead. Brain and Cognition, 63, 31–41. doi:10.1016/j.bandc.2006.07.008

Ross, E. D., Thompson, R. D., & Yenkosky, J. (1997). Lateralization of affective prosody in brain and the callosal integration of hemispheric language functions. Brain and Language, 56, 27–54. doi:10.1006/brln.1997.1731


Salmi, J., Koistinen, O. P., Glerean, E., Jylänki, P., Vehtari, A., Jääskeläinen, I. P., . . . Sams, M. (2017). Distributed neural signatures of natural audiovisual speech and music in the human auditory cortex. NeuroImage, 157, 108–117. doi:10.1016/j.neuroimage.2016.12.005

Schirmer, A., Kotz, S. A., & Friederici, A. D. (2002). Sex differentiates the role of emotional prosody during word processing. Cognitive Brain Research, 14, 228–233. doi:10.1016/s0926-6410(02)00108-8

Schmidt, L. A., & Trainor, L. J. (2001). Frontal brain electrical activity (EEG) distinguishes valence and intensity of musical emotions. Cognition and Emotion, 15, 487–500. doi:10.1016/j.bandc.2007.02.008

Schupp, H. T., Öhman, A., Junghöfer, M., Weike, A. I., Stockburger, J., & Hamm, A. O. (2004). The facilitated processing of threatening faces: An ERP analysis. Emotion, 4, 189–200. doi:10.1037/1528-3542.4.2.189

Spreckelmeyer, K. N., Kutas, M., Urbach, T. P., Altenmüller, E., & Münte, T. F. (2006). Combined perception of emotion in pictures and musical sounds. Brain Research, 1070, 160–170. doi:10.1016/j.brainres.2005.11.075

Trost, W., Ethofer, T., Zentner, M., & Vuilleumier, P. (2012). Mapping aesthetic musical emotions in the brain. Cerebral Cortex, 22, 2769–2783. doi:10.1093/cercor/bhr353

Vidas, D., Dingle, G. A., & Nelson, N. L. (2018). Children’s recognition of emotion in music and speech. Music and Science, 1, 1–10. doi:10.1177/2059204318762650

Vieillard, S., Peretz, I., Gosselin, N., Khalfa, S., Gagnon, L., & Bouchard, B. (2008). Happy, sad, scary and peaceful musical excerpts for research on emotions. Cognition and Emotion, 22, 720–752. doi:10.1080/02699930701503567

Vuilleumier, P., & Trost, W. (2015). Music and emotions: From enchantment to entrainment. Annals of the New York Academy of Sciences, 1337, 212–222. doi:10.1111/nyas.12676
