Distributional methods for Sentiment Analysis

(1)

Conclusions

In this study, we have presented some simple distributional approaches to sentiment-related tasks.

After showing the state of the art in the areas of research of our interest, Sentiment Analysis and Distributional Semantics, we searched for possible, promising intersections between these two fields. The idea on which our work is based is that, in order to

recognise the "polarity" (positive or negative) of a linguistic expression, it is possible to extend Firth's claim as follows: "you shall know the polarity of a word from the company it keeps".

In the first experiment, we tested this possibility in a simple task of polarity classification, in which the aim was to identify the polarity of the words of a sentiment lexicon.

Starting from Turney and Littmann's work, we used the resources of the Distributional Memory framework to build a semantic vector space and to classify the words on the basis of their distance from prototype vectors 1_{. We built different prototype vectors, using} various seed sets, but the best performance has been achieved with Turney-Littmann's seed set, which was composed of common adjectives.

Moreover, we tried an experiment of "sentiment compositionality", using additive and multiplicative models similar to those recently proposed by Mitchell and Lapata, and our results are consistent with their findings 2_{: multiplicative models seem to be more suitable} for semantic composition, even for a polarity classification task, since they preserve only the dimensions of meaning that are shared by all the words composing a more complex linguistic expression. In a future study, it would be interesting to test a model of

"sentiment composition" implementing knowledge about selectional preferences, following Erk and Padớ's approach. 3

1 P. D. Turney, M. Littmann, ibid., 2003.

2 J. Mitchell, M. Lapata, ibid., 2008; J. Mitchell, M. Lapata, ibid., 2011. 3 K. Erk, S. Padớ, ibid., 2008.

(2)

In our opinion, there is another research area in which distributional methods could be productively applied, that is the identification and extraction of the features of general concepts and items. The identification of an object's features allows a more fine-grained Sentiment Analysis, because it is possible to identify those aspects about which people are expressing judgements. But it is worth stressing that there is a growing interest in this task, even outside the borders of this discipline, for example in cognitive sciences. Some theories on the representation of concepts in the human brain have supposed a feature-based model, in the sense that such representations can be thought as patterns of

activation over sets of interconnected semantic feature nodes 4_{. The aim of a lot of recent} studies is to develop computational semantic models for the extraction -from large textual corpora- of salient properties characterizing entities, in order to test their ability to learn the inner structure of concepts. 5

It would be very promising, for the purposes of Sentiment Analysis, to combine pattern-based and distributional methods to extract the main features of objects and entities that are cited in online texts, in particular those features which tend to co-occur with

subjectivity clues: for instance, if a feature occurs very frequently in the reviews of a certain category of products and it is often associated with verbs / adjectives expressing opinions or subjective evaluations, it is highly probable that the users consider that feature important in judging that kind of product. Of course, a system able to identify such

features would be of incredible interest for web marketing, just to make an example.

Investigating sentiment with distributional methods is an intriguing perspective, because it creates a link -and consequently new possibilities of an exchange of methods and

resources- between theoretical researches on language and cognition and a subfield of the studies in knowledge discovery that is expected, with the development of Web 2.0 and the increasing availability of opinionated textual data, to take on an important role in the

4 G. Cree, K. McRae, Analyzing the factors underlying the structure and computation of the meaning of chipmunk, cherry, chisel, cheese and cello (and many other such concrete nouns), in Journal of Experimental Psychology, vol. 132, no. 2, 2003, pp. 163-201; K. McRae et al., Semantic feature production norms for a large set of living and

nonliving things, in Behavioral Research Methods, Instruments and Computers, vol. 37, 2005, pp. 547-559.

5 See, for example, M. Baroni, A. Lenci, Concepts and properties in word spaces, in Rivista di Linguistica, vol. 20, n. 1, 2008, pp. 53-86; M. Baroni et al., Strudel: a corpus-based semantic model based on properties and types, in

Cognitive Science, vol. 34, n. 10, 2010, pp. 222-254; C. Kelly et al., Acquiring human-like feature-based conceptual representations from corpora, in Proceedings of the NAACL HLT 2010 - 1st_{Workshop on Computation}

Neurolinguistics, Los Angeles, 2010, pp. 61-69; C. Kelly et al., Semi-supervised learning for automatic conceptual property extraction, in Proceedings of the 3rd_{Workshop on Cognitive Modeling and Computational Linguistics}

(CMCL 2012), Montreal, 2012, pp. 11-20.

(3)

marketing and in the public communication of the future.

Of course, there is still much to do: first of all, we are just at the beginning regarding the attempts of computational modeling of the processes of meaning composition;

consequently, we are very far from the realization of a syntax-aware model for the

calculation of the contextual polarity of phrases and sentences. In this thesis work, we just tried to lay the foundations for a future in-depth research.