Sentiment Analysis for the Italian language

The PhD thesis of Dr. Paolo Casoto on Sentiment Analysis. The work presented in this thesis provides several contributions to Sentiment Analysis applied to product reviews written in Italian. In particular, the following contributions are proposed:
• a generic framework for defining, training and testing automatic Sentiment Analysis tools based on supervised classifiers has been designed and implemented. The SENT-IT framework provides a complete set of integrated tools for linguistic analysis and machine learning, which can be applied to easily generate new automatic tools for sentiment classification and to evaluate their performance experimentally. A comprehensive description of the SENT-IT framework and its modules is provided in Chapter 3. The SENT-IT framework is based on open-source solutions and will be freely released soon for research purposes.
• a set of automatically annotated corpora of product reviews written in Italian, grouped by product domain (e.g., movies, cars, cell phones), has been collected and shared with other researchers. Each product review consists of a short text, a set of additional and optional information, such as date, author name and age, and an overall polarity rating indicator representing the polarity expressed by the author within the review. The corpora, developed to evaluate the proposed methodologies for Sentiment Analysis, could be used in the future by other researchers as a gold standard, which was not available for the Italian language when this thesis began. The review corpora were publicly released in 2008 in XML format and are available at the author's site.
• a document feature representation schema suitable for Sentiment Analysis applied to the Italian language has been proposed and experimentally evaluated. The set of selected features, described in detail in Chapter 3, consists of representation features described as suitable in the literature for the English language, and ad hoc features defined according to the specific characteristics of the Italian language.
• a domain-independent meta-classifier devoted to Sentiment Analysis has been implemented by applying a stacking approach to previously trained domain-dependent classifiers. The stacking approach has been investigated in order to improve the effectiveness of the ensemble classifier on unknown or already known domains.
• a lexical resource of polarity-oriented terms for the Italian language has been developed, by proposing a shortest path algorithm based on a graph representation of the input terms. Semantic relations connecting terms, such as synonymy, antonymy and similarity, have been used in order to generate the graph representation.

Sentiment Analysis for the Italian language

  1. 1. Università degli Studi di Udine, Dipartimento di Matematica e Informatica, Dottorato di Ricerca in Informatica. Ph.D. Thesis: Sentiment Analysis for the Italian language. Candidate: dott. Paolo Casoto. Supervisor: Professor Carlo Tasso. January 8, 2012
  2. 2. Author's address: Dipartimento di Matematica e Informatica, Università degli Studi di Udine, Via delle Scienze, 206, 33100 Udine, Italia
  3. 3. Abstract

     Sentiment Analysis is the discipline aimed at analyzing and classifying the orientation of the opinions expressed in a document or, more generally, in a textual entity. Each textual entity can be classified as positive, negative or neutral, according to the orientation of the opinions it expresses. By means of Sentiment Analysis the sentence "Il motore della Fiat Punto è brillante e piacevole da guidare" ("The Fiat Punto's engine is lively and pleasant to drive") is automatically classified as positive, while the sentence "Il cambio è impreciso e ne compromette la guidabilità" ("The gearbox is imprecise and compromises drivability") is classified as negative, due to the different orientation of the opinions they express.

     The research activities described in this thesis aim at investigating and proposing different techniques for Sentiment Analysis applied to documents written in the Italian language. The need for automatic tools for Sentiment Analysis is justified by the huge amount of opinionated content available on the Web (e.g., review sites, blogs, forums) and its continuous growth. Users cannot deal with such an amount of data; automated tools able to summarize the polarity rating expressed by other reviewers in a set of heterogeneous information sources are required.

     Sentiment Analysis has many potential applications, ranging from tracking users' opinions and preferences about products or political candidates as expressed in online forums, to customer relationship management or terrorism prevention. Moreover, Sentiment Analysis includes several different tasks, also referred to as Opinion Mining, which aim at investigating and identifying subjective elements, such as opinions or judgements, that may appear in a given document.

     Sentiment Analysis has been investigated since 2001 by many researchers worldwide; most of the research activity is focused on the English language. This thesis and its related publications represent the first approach to Sentiment Analysis for the Italian language.
     In particular, we aim at developing and evaluating a set of methodologies, based on both linguistic and machine learning algorithms, for defining domain-dependent and domain-independent classifiers for the opinion polarity of product reviews.

     In order to support the experimental activity, the SENT-IT framework has been designed and implemented; it provides a complete toolbox for document analysis, feature extraction, and classifier training and evaluation. The SENT-IT framework has been used to evaluate the proposed methodology for opinion polarity analysis in both domain-dependent and domain-independent settings. The results confirm, in
  4. 4. terms of classification accuracy, that automatic tools for Sentiment Analysis in the Italian language can reach performance similar to that reported in the literature for the English language.
  5. 5. Acknowledgments

     During the last four years I thought many times about leaving this thesis uncompleted; fortunately I did not listen to these voices in my mind. I must admit these four years have been a long and adventurous journey, both professionally and personally and, in the end, I'm glad I did it.

     First of all I want to thank Professor Carlo Tasso, my advisor since my bachelor thesis in 2003, who helped me during this work with his experience and advice. I want to thank my colleagues at the Artificial Intelligence Laboratory: Andrea, Antonina, Felice and Nirmala for their support, advice, help and friendship.

     I want to thank Luca and Ivan for their help, especially in a very bad moment of my life. Probably neither I nor my thesis would be here without their motivational work.
  7. 7. Contents
     1 Introduction
       1.1 The importance of Opinion and Sentiment in Modern Information Access
       1.2 Terminology
       1.3 Applications of Opinion Mining and Sentiment Analysis methodologies
       1.4 Contribution
       1.5 Outline of the thesis
     2 Sentiment Analysis: challenges, solutions and tasks
       2.1 Introduction
       2.2 Challenges in Sentiment Analysis
       2.3 Sentiment Polarity Classification
         2.3.1 Sentiment Polarity Regression
       2.4 Opinion Mining
       2.5 Affect computing
       2.6 Multilingual Sentiment Analysis
     3 A Supervised Approach to Overall Opinion Polarity Analysis
       3.1 Introduction
       3.2 Related Work
         3.2.1 Turney
         3.2.2 Pang et al.
         3.2.3 Dave et al.
         3.2.4 Salvetti et al.
       3.3 The SENT-IT Framework
         3.3.1 Product Review Crawler
         3.3.2 Analysis Module
       3.4 Experiments
         3.4.1 The Movie Review Corpus
         3.4.2 Results
       3.5 A novel visualization approach for polarity classified reviews
         3.5.1 Basics of graph theory
         3.5.2 Zz-structures
  8. 8. Contents (continued)
         3.5.3 Data Visualization Module
       3.6 Conclusions
     4 Domain Independent Sentiment Analysis
       4.1 Introduction
       4.2 Related Work
         4.2.1 Aue and Gamon
         4.2.2 Engstrom
         4.2.3 Agrin
         4.2.4 Blitzer
       4.3 Domain Independent OvOP
       4.4 Experiments
         4.4.1 Test set
         4.4.2 Results
       4.5 Conclusions
     5 Automatic Generation of Lexical Resources for Sentiment Analysis
       5.1 Introduction
         5.1.1 Prior subjectivity status contextualization
       5.2 Related Work
         5.2.1 Hatzivassiloglou and McKeown
         5.2.2 Turney and Littman
         5.2.3 Kamps et al.
         5.2.4 Takamura et al.
         5.2.5 Esuli et al.
       5.3 Determining the polarity orientation
       5.4 Experiments
         5.4.1 The OpenOffice dictionary
         5.4.2 The SinonimiMaster dictionary
         5.4.3 Test set
         5.4.4 Seed sets and parameters
         5.4.5 Results
       5.5 OvOP analysis based on sentiment oriented terms
       5.6 Experiments
         5.6.1 Test set
         5.6.2 Results
       5.7 Conclusions
     6 Conclusions
     A Publications
     Bibliography
  9. 9. List of Figures
     2.1 Different approaches to text categorization and polarity classification.
     2.2 Template adopted in [87] and [88] for opinion oriented information extraction.
     2.3 The EmpathyBuddy email agent [49, 48] in action.
     3.1 Overall architecture of the SENT-IT framework.
     3.2 OvOP Workflows available in the SENT-IT framework.
     3.3 Overall architecture of the Product Review Crawler module.
     3.4 Distribution of preassigned OvOP in MRC.
     3.5 Feature selection and accuracy for both NB and SVM classifiers.
     3.6 An example of zz-structure.
     3.7 An example of H-view on focus v7.
     3.8 Set of reviews retrieved from the MRC with the query "Johnny Depp".
     3.9 A view related to dimensions "Johnny Depp" and "Pirati dei Caraibi".
     4.1 The meta-classification OvOP process.
     5.1 Prior subjectivity status contextualization process.
     5.2 A subset of the nodes and edges constituting the WordNet graph analyzed in [38].
     5.3 The classification of the term "efficient" provided by SentiWordNet.
     5.4 The term polarity evaluation process.
     5.5 Data provided by the SinonimiMaster dictionary for the term efficiente (efficient).
     5.6 The OvOP analysis process.
     5.7 Accuracy of OvOP analysis.
  11. 11. List of Tables
      2.1 List of polarity conveying terms collected by two human experts in [63].
      2.2 List of polarity conveying terms collected by human expert and statistical analysis of document corpus in [63].
      3.1 Average three-fold cross-validation accuracies achieved by Pang et al. in [63].
      3.2 Average accuracy of trained classifiers.
      3.3 Average accuracy+ and accuracy− of trained classifiers.
      3.4 Top 50 features extracted from the training set with the highest IG value.
      3.5 Average accuracy of the U3 and UBT3 based classifiers after feature selection.
      4.1 Average three-fold cross-validation accuracies for each domain dependent OvOP classifier trained according to the UBT3 feature set.
      4.2 Top 30 features extracted from each training set with the highest IG value.
      4.3 Average three-fold cross-validation accuracy for each domain dependent OvOP classifier applied to different domains.
      4.4 Classification accuracy of a classifier trained on three domains and tested on the fourth domain.
      4.5 Classification accuracy of a classifier trained on the four domains.
      4.6 Classification accuracy of a meta-classifier evaluated on the four domains.
      5.1 Adjectives provided by users in L1 with two or more occurrences.
      5.2 Positive and negative adjectives with the highest orientation value O(t) generated from the OpenOffice dictionary.
      5.3 Positive and negative adjectives with the highest orientation value O(t) generated from the SinonimiMaster dictionary.
      5.4 Coverage and accuracy of both generated sentiment-classified lexical resources with respect to test set L1.
      5.5 Accuracy of generated sentiment-classified lexical resources with respect to test set L2.
  12. 12. List of Tables (continued)
      5.6 Extraction rules used for OvOP analysis.
      5.7 Accuracy of lexical resource based OvOP analysis.
  13. 13. Chapter 1: Introduction

      1.1 The importance of Opinion and Sentiment in Modern Information Access

      Most of the decision-making processes we carry out during our life are based on subjective information: opinions and sentiments. Knowing what other people think, and how they perceive reality, is required to support several daily activities, such as renting a movie or buying a new digital camera. Most people ask a friend for a recommendation about which product to buy, which candidate to vote for, or which book to read before making a decision [62].

      The importance of gathering and knowing people's opinions and preferences is well known to companies, organizations and public authorities. Companies, for example, collect opinions about their products and brands to support marketing strategy and to plan new activities aimed at improving the way the company is perceived by its customers. Politicians could use opinions collected from their electors, and even from the electors of their competitors, in order to define their programs and plans.

      The ability to predict how shareholders and the markets will respond to good and bad news is a key issue for both companies (e.g., banks, financial traders, brokerage agencies, insurance companies) and public authorities; financial trends, in fact, are particularly sensitive to the opinions, judgements and even fears of investors.

      Opinions could also be used by public authorities to analyze the happiness of citizens and to identify the issues perceived by the population, in order to prevent critical situations like terrorism.

      Opinion sharing was limited, until ten years ago, in terms of both the amount of available information and the potential sources of such information: friends, newspapers, television. The growth of the World Wide Web, on the other hand, has made it possible for everyone to access a huge number of information sources (e.g., forums, blogs, review sites)
      providing opinions and, more generally, subjective information.
  14. 14. However, the amount of available input data cannot be easily handled by a single person; automatic tools are required in order to filter relevant information and aggregate it in a way suitable for decision making. For example, given a set of 1000 reviews regarding a brand-new digital camera, an automatic tool could extract the number of positive and negative opinions and present them to the customer.

      Analysis of opinions can go even further, by giving users the ability to distinguish between opinions associated with different features of a given product: for example, given a specific digital camera, specific methodologies can be applied to identify that reviewers are enthusiastic about the quality of the lenses but, at the same time, very disappointed by the low precision of the autofocus system.

      Another issue concerning the large amount of available opinions is trust: opinions provided by a friend or by a well-known movie critic tend to be more trusted than opinions provided by strangers. Even in this case analysis of opinions can help users: by automatically analyzing and classifying thousands of documents, wrong or fake opinions provided by single untrustworthy users can be offset by taking into account the large amount of knowledge provided by the crowd. Understanding how the trustworthiness of a seller is perceived by its customers according to the reviews written by other customers is becoming a critical issue in e-commerce, especially in systems like eBay, where trust is considered a key factor at the beginning of a new transaction.

      Opinion Mining and Sentiment Analysis identify the new field of research devoted to designing and evaluating tools for automatic opinion analysis. It started approximately in 2001, with contributions from researchers coming from the domains of machine learning, computational linguistics and information retrieval. Most of the
Most of theresearch experiences aimed at investigating Sentiment Analysis and Opinion Miningfocused on the English language; in fact only a small part of the available worksdeals with the problem of Sentiment Analysis for other languages, where the amountof available tools, resources and previous experiences is significantly reduced.
  15. 15. 1.2 Terminology

      Several different terms have been proposed during the last ten years by authors working in the field of sentiment analysis in order to describe their work and how it differs from work done by others; in fact, no uniform terminology is available.

      According to Wiebe, the subjectivity of a text is defined as the set of elements describing the private state of the writer. Assumptions, beliefs, thoughts, experiences, opinions, and judgments expressed in texts are typical clues of subjectivity [91]. Sentiment is defined as the subset of subjective clues that can be measured in terms of positive, neutral or negative orientation.

      The automatic analysis of the opinions, sentiment and subjectivity of a given text is known, respectively, as Opinion Mining, Sentiment Analysis and Subjectivity Analysis.

      The term Opinion Mining (OM) was introduced in 2003 in order to describe the activity aimed at "processing a set of search results for a given item, generating a list of product attributes (quality, features, et al.) and aggregating opinions about each of them (poor, mixed, good)" [19]. Opinion Mining, according to the definition provided by Dave, is concerned with the analysis of the opinions expressed by a document, without considering in any way the specific topic (topicality) of the analyzed document: for this reason OM is classified as a non-topical text analysis task (see footnote 1). The term has also been used in several works available in the literature, including [55], [32] and [28].

      The term Sentiment Analysis (SA) was introduced in 2001, in order to describe the process aimed at automatically evaluating the polarity expressed by a set of given documents. The term Sentiment was borrowed by Das and Chen [16] from the economic domain: their work aims at defining a prediction algorithm able to determine the future market-share tendency given a set of documents concerning public companies extracted from economic newspapers. More specifically,
More specificallythe analysis of future market tendency is usually referred in the economical domainas market sentiment. The term Sentiment Analysis has been used in several worksin literature, including [82], [84], [63] and [97]. While OM is mainly focused on recognizing opinions expressed in a given textwith respect to specific attributes (e.g.: to recognize the opinions related to theengine in a car review and separate them from the opinions regarding tyres), SAis focused, on the other hand, on classifying a given document according with thepolarity it expresses (either negative or positive). 1 Other text analysis tasks, which could be seen as non-topical, are genre recognition, aimedat identifying and classifying the type of the analyzed document, author recognition, aimed atrecognizing the author of a document in a set of potential authors, each one characterized by itsown writing style.
  16. 16. In order to properly classify the different issues and solutions proposed in the literature in the specific domain of OM and SA, two different dimensions can be analyzed: the granularity of the input documents and the opinion-related goal each solution aims at (e.g., determining the subjectivity, polarity or force of the input document).

      Granularity defines which textual entities will be considered and analyzed by the application:
      • document level granularity [63]: each document is seen as the base element for analysis of the opinion-related properties. Each property is evaluated on the document seen as a whole, even if it consists of sentences expressing different opinion-related properties.
      • sentence level granularity [60]: each sentence of a given document is seen as the base element for analysis of the opinion-related properties (e.g., given a car review, a polarity rating is assigned to each of its sentences).
      • proposition and text span level granularity: propositions and text spans, consisting of two or more words, are seen as the base element for analysis of the opinion-related properties.
      • term level granularity [33]: analysis of the opinion-related properties is performed on a vocabulary of terms. The terms are not considered with respect to the contexts in which they appear; properties evaluated on terms are general. Analysis at term level granularity can lead to the construction of opinion-related lexical resources. Such resources can be used by applications devoted to OM or SA to improve their effectiveness (e.g., the term "buono" (good) expresses a positive polarity, while the term "casa" (house) does not express any subjectivity).
      • term sense level granularity [26]: the most detailed level at which OM and SA analysis can be performed, exploited by Esuli et al. during the development of the SentiWordNet resource.
        With respect to the term level granularity, each different sense of a term is considered as the base element for analysis of the opinion-related properties.

      Opinion-related dimensions define which aspect of subjectivity the subtask will focus on. Three opinion-related dimensions have been mainly investigated in the literature:
      • subjectivity [98], [60], [42]: subjectivity analysis is the activity aimed at recognizing whether a given textual entity, at the selected granularity, contains subjective expressions. The subjectivity analysis task, for example, could be used to determine that the sentence "Il motore della Fiat Punto è brillante e piacevole da guidare" contains an opinion, while the sentence "Il motore della Fiat Punto eroga una potenza di 120 CV" ("The Fiat Punto's engine delivers 120 CV of power") provides only objective information. The term OM is usually associated with applications involved in subjectivity analysis.
  17. 17. • polarity [82], [63]: polarity analysis is the activity aimed at evaluating the polarity (in terms of positive or negative orientation) expressed by a textual entity. For example, polarity analysis applied to the sentences "Il motore della Fiat Punto è brillante e piacevole da guidare" and "Il cambio è impreciso e ne compromette la guidabilità" recognizes that the former expresses a positive orientation, while the latter expresses a negative orientation. The term SA is usually associated with applications involved in polarity (sometimes referred to as orientation) analysis.
      • force [61]: force analysis is aimed at recognizing the intensity of the subjective elements contained in a given textual entity. This dimension is usually investigated in association with subjectivity or polarity. Force analysis can be exploited, for example, to compare the intensity of the two sentences "Il cambio è un po' impreciso" ("The gearbox is a bit imprecise") and "Il cambio è terribilmente impreciso" ("The gearbox is terribly imprecise") and to conclude that the latter expresses a more intense polarity than the former, which provides a lighter orientation.

      Subjectivity and polarity analysis can usually be seen as a typical classification problem: given a textual entity D as input, a specific class C (with C ∈ {objective, subjective} or C ∈ {positive, negative}) is assigned to D. Force analysis, on the other hand, has a regression-like nature: given a textual entity D as input, the analysis task aims at assigning a force value F_D to D.
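The difference between the classification view (polarity) and the regression-like view (force) described above can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the tiny lexicon, the modifier weights and the function names are hypothetical and are not taken from the thesis or from the SENT-IT framework.

```python
import re

# Hypothetical toy lexicon: +1 for positive terms, -1 for negative terms.
POLARITY = {"brillante": 1, "piacevole": 1, "impreciso": -1}
# Hypothetical intensity modifiers; the weights are illustrative only.
MODIFIERS = {"terribilmente": 2.0, "po'": 0.5}


def tokenize(text: str) -> list[str]:
    """Lowercase the text and keep runs of letters and apostrophes."""
    return re.findall(r"[a-zàèéìòù']+", text.lower())


def classify_polarity(text: str) -> str:
    """Classification: assign a class C in {positive, negative} to D."""
    score = sum(POLARITY.get(tok, 0) for tok in tokenize(text))
    # Binary decision; a zero score falls back to "negative" arbitrarily.
    return "positive" if score > 0 else "negative"


def estimate_force(text: str) -> float:
    """Regression-like task: assign a real-valued force F_D to D."""
    tokens = tokenize(text)
    weight = 1.0
    for tok in tokens:
        weight *= MODIFIERS.get(tok, 1.0)
    polar_terms = sum(1 for tok in tokens if tok in POLARITY)
    return weight * polar_terms
```

With this toy lexicon, the two "cambio" sentences receive the same class but different force values, which is the distinction the paragraph above draws.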
  18. 18. 1.3 Applications of Opinion Mining and Sentiment Analysis methodologies

      Opinion Mining and Sentiment Analysis have been successfully applied to several application domains, in both research and industrial contexts. The application domains explored since 2001 range from product recommendation to product marketing, from brand and reputation analysis to business and government intelligence, and from analysis of market sentiment to terrorism prevention.

      The most significant application domain of Sentiment Analysis is product recommendation and review: Sentiment Analysis can be applied to automatically summarize the polarity expressed by a set of reviews concerning a given product or a specific property of the product, identified by means of Opinion Mining.

      Summarized and aggregated information can be used by customers to evaluate how a product is perceived by other customers and to decide whether it should be bought. Customers can easily base their decision-making process on aggregated data without reading the whole set of product reviews.

      Summarization and analysis of the reviews concerning a specific product become even more significant for companies building or selling the product itself: they can identify, by means of both Opinion Mining and Sentiment Analysis, which properties of their products are perceived by customers as a benefit or, on the other hand, as an issue. At the same time, the described techniques can be applied to reviews concerning products built by competitors, in order to compare the feedback provided by users. A company could find, for example, that its products are perceived as better than the products of its competitors because of a faster spare parts delivery system. On the other hand, the company could find that one of its products is perceived as too expensive by potential customers with respect to the set of provided functionalities.
      Summarization of review polarity has been analyzed by Turney in [84]: reviews are segmented into sentences, and for each sentence the semantic orientation rating is evaluated by comparing its similarity to a positive reference word ("excellent") with its similarity to a negative reference word ("poor"). Similarity between phrases and words is defined by means of the PMI-IR algorithm [82]. The overall polarity of the review is evaluated as the average semantic orientation of the sentences constituting it.

      Pang et al. [63] proposed a machine learning based approach for sentiment classification of movie reviews; supervised binary classifiers were applied to reviews in order to evaluate their orientation. Pang showed how machine learning could improve on the effectiveness of the method proposed by Turney in the specific domain of movie reviews.

      Review classification and clustering have also been explored in [5]: four different corpora of reviews belonging to different domains have been analyzed and evaluated.
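The semantic orientation scheme attributed to Turney above can be sketched as follows. This is only an illustration under stated assumptions: PMI-IR estimates the counts from search-engine hits, whereas the sketch uses small hard-coded tables (`HITS`, `NEAR_HITS`) with invented values; the smoothing constant and the function names are also assumptions. The score is the difference between the pointwise mutual information of a text unit with "excellent" and with "poor", and the review polarity is the sign of the average score.

```python
import math

# Hypothetical hit counts for the two reference words (invented values).
HITS = {"excellent": 1000, "poor": 1000}
# Hypothetical co-occurrence counts: (text unit, reference word) -> count.
NEAR_HITS = {
    ("great engine", "excellent"): 80,
    ("great engine", "poor"): 10,
    ("imprecise gearbox", "excellent"): 5,
    ("imprecise gearbox", "poor"): 40,
}
SMOOTHING = 0.01  # assumed constant, avoids division by zero for unseen pairs


def semantic_orientation(unit: str) -> float:
    """PMI(unit, 'excellent') - PMI(unit, 'poor'), written as one log ratio."""
    near_pos = NEAR_HITS.get((unit, "excellent"), 0) + SMOOTHING
    near_neg = NEAR_HITS.get((unit, "poor"), 0) + SMOOTHING
    return math.log2((near_pos * HITS["poor"]) / (near_neg * HITS["excellent"]))


def review_polarity(units: list[str]) -> tuple[str, float]:
    """Average the orientation of the units and map its sign to a label."""
    if not units:
        raise ValueError("a review needs at least one text unit")
    average = sum(semantic_orientation(u) for u in units) / len(units)
    return ("positive" if average > 0 else "negative"), average
```

The hit count of the text unit itself cancels out in the difference of the two PMI values, which is why it does not appear in the ratio.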
  19. 19. The different types of data they consider range from movie reviews to short, phrase-level user feedback from web surveys. The authors present an innovative clustering approach aimed at graphically representing the sentiment orientation of different aspects of the input reviews.

      The movie domain has been analyzed in [54]: a new trend prediction algorithm based on the sentiment expressed by messages available in blogs is presented. More specifically, the authors present a new approach to predict the sales of a movie by analyzing both the number of references to it available in the blogosphere and the sentiment orientation of the textual context (at text span level granularity) surrounding each movie reference. The authors showed that a correlation exists between a movie's financial performance and the number of positively oriented references to it.

      Politics has also been influenced by the growing popularity of both Sentiment Analysis and Opinion Mining; in particular, two different tasks have been explored by several authors in the literature: identifying the opinions of the voters and clarifying the position expressed by a politician with respect to a specific topic.

      The first task has been explored, for example, in [47, 56]; in particular Mullen et al. evaluated the performance of a classification method based on Naïve Bayes classifiers aimed at inferring the political affiliation of a blogger. The experimental activity described by the authors leads to poor results in terms of classification accuracy, with a best performance of 64.48%. A better accuracy, 65.57%, was reached by replacing the Naïve Bayes classifier with a simple classification rule: assigning a user to the political affiliation opposite to that of the users they tend to quote or be quoted by.
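The simple quotation-based rule mentioned above can be sketched in a few lines of Python. The sketch is an assumption-laden illustration, not the cited authors' implementation: it assumes two affiliations with hypothetical labels, a list of (quoter, quoted) pairs, and a majority vote over the neighbours whose affiliation is already known.

```python
from collections import Counter

# Hypothetical two-way labelling; the actual label set is not given in the text.
OPPOSITE = {"left": "right", "right": "left"}


def infer_affiliation(user, quotes, known):
    """Assign `user` the affiliation opposite to the majority of the users
    they quote or are quoted by.

    quotes: iterable of (quoter, quoted) pairs.
    known:  dict mapping some users to their affiliation.
    Returns None when no neighbour with a known affiliation exists.
    """
    neighbours = Counter()
    for quoter, quoted in quotes:
        if quoter == user:
            other = quoted
        elif quoted == user:
            other = quoter
        else:
            continue
        if other in known:
            neighbours[known[other]] += 1
    if not neighbours:
        return None
    majority, _ = neighbours.most_common(1)[0]
    return OPPOSITE[majority]
```

For example, a user who mostly quotes, or is quoted by, users labelled "right" would be assigned "left" under this rule.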
In [80] the authors investigated the problem of determining whether a politician speaking during a debate agrees or not with the contents of the debate: in particular they showed how integrating constraints based on speaker identity and on direct textual references between statements can significantly increase the accuracy of support/opposition classification.

A similar domain has been explored in [46], where a new application for automatically analyzing a large set of documents has been developed in order to identify those documents which support or oppose a specific rule proposition.

Sentiment Analysis and Opinion Mining could also be used to implement opinion related search engines; an opinion related search engine can be defined as a search engine providing users with the ability to filter the retrieved data according to a specific subjectivity or orientation (e.g.: a user asks for documents expressing negative opinions about “Fiat Punto”). An example of an opinion related search engine has been proposed in [4]; the system described by the authors has been evaluated on blog contents included in the 2006 TREC Blog track. In particular the authors showed how query reformulation including opinion-related terms could be profitably used to improve accuracy in the retrieval of opinion related contents.

Sentiment Analysis has been exploited in order to implement tools for checking the coherence of the review expressed by a user: for example SA could be used to automatically check if the contents of a review are compatible with the rating
expressed by the user to summarize the review.

Sentiment Analysis and Opinion Mining have also been integrated into product recommender systems, in order to provide augmented recommendations based on both collaborative filtering and analysis of user feedback. Products with many positive reviews will be recommended with a higher probability than products with many negative reviews.

Moreover, Sentiment Analysis and Opinion Mining could be adopted in order to identify flames (messages with improper language) in email communications, forums, blogs and websites. Even the accuracy of Information Extraction could be improved by integrating Opinion Mining into the extraction workflow, as described in [70]: sentences characterized by the highest subjectivity are discarded, limiting the extraction process to objective sentences.

Opinion Mining has also been exploited in question answering in order to develop an opinion related question answering system [77] [75]: a system which answers user questions by providing both objective and subjective information. For example, given the question “Com’è il motore della Fiat Punto?”², an opinion related question answering system could provide both positively and negatively oriented perspectives on the same topic. In [98] an opinion mining application operating at both document and sentence level is described; its goal is the identification of subjective textual entities which could be exploited in order to answer questions expressed by users.

Other activities where Sentiment Analysis and Opinion Mining have been integrated in order to improve effectiveness include summarization [73] and citation analysis [64], where Sentiment Analysis could be exploited in order to identify whether an author agrees or not with a hypothesis or a result expressed by other authors.

² “How good is Fiat Punto’s engine?”
1.4 Contribution

The work presented in this thesis provides several contributions to the specific task of Sentiment Analysis applied, more specifically, to product reviews written in the Italian language. In particular the following contributions have been proposed:

• a generic framework aimed at defining, training and testing automatic tools devoted to Sentiment Analysis based on supervised classifiers has been designed and implemented. The SENT-IT framework provides a complete set of integrated tools for linguistic analysis and machine learning, which could be applied in order to easily generate new automatic tools for sentiment classification and to evaluate their performance experimentally. A comprehensive description of the SENT-IT framework and its modules is provided in Chapter 3. The SENT-IT framework is based on open-source solutions and will be freely released soon for research purposes.

• a set of automatically annotated corpora consisting of product reviews written in the Italian language, grouped by product domain (e.g.: movies, cars, cell phones, etc.), has been collected and shared with other researchers. Each product review consists of a short text, a set of additional and optional information, such as date, author name and age, and an overall polarity rating indicator, aimed at representing the polarity expressed by the author within the review. The corpora, which have been developed in order to evaluate the proposed methodologies for Sentiment Analysis, could be used in the future by other researchers as a Gold Standard, which was not available for the Italian language until the beginning of this thesis. The review corpora were publicly released in 2008 in XML format and are available at the author’s site³.

• a document feature representation schema suitable for Sentiment Analysis applied to the Italian language has been proposed and experimentally evaluated. The set of selected features, described in detail in Chapter 3, consists of representation features described as suitable in the literature for the English language, and of ad-hoc features defined according to the specific peculiarities of the Italian language.

• a domain independent meta-classifier devoted to Sentiment Analysis has been implemented by applying a stacking approach to previously trained domain-dependent classifiers. The stacking approach has been investigated in order to improve the effectiveness of the ensemble classifier on unknown or already known domains.

• a lexical resource of polarity oriented terms for the Italian language has been developed, by proposing a shortest path algorithm based on a graph representation of the input terms. Semantic relations connecting terms, like synonymy, antonymy and similarity, have been used in order to generate the graph representation. In particular our research has focused on evaluating the polarity orientation of attributes; in fact attributes, as described in detail in both Chapters 3 and 5, carry most of the overall polarity of the documents analyzed in this work.

• a novel information visualization and navigation approach, based on zz-structures, has been proposed. The navigation module is aimed at providing users with an effective and personalized way to browse the set of reviews according to the polarity expressed by each review.

³ http://users.dimi.uniud.it/ paolo.casoto/research.html

The research activities described in this thesis and in [13] and [14] represent the first solution published in the literature to the problem of Sentiment Analysis applied to documents written in the Italian language. For this reason the results presented in Chapters 3, 4 and 5 could only be compared with similar results presented in the literature but evaluated on different languages. In particular, results have been compared with similar approaches applied to documents written in English.

The SENT-IT framework has also been considered as part of several theoretical proposals for novel information and knowledge sharing systems, presented in [68], [14], and [6]. The outcomes of the SENT-IT framework could be used to properly annotate, in an automatic way, a set of input documents. Annotations provided by SENT-IT have been grouped with the annotations generated by Information Extraction tools on the same documents and used to infer and suggest further annotations to users in a proactive way.

In addition to Sentiment Analysis, the author focused, during his PhD research activity, on the domain of Digital Libraries and Digital Preservation of Cultural Heritage. Activities and results which have been achieved in these domains are outside the scope of this thesis and will not be described. The full set of publications is listed in Appendix A.
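To make the stacking idea named among the contributions more concrete, here is a minimal sketch in which the outputs of two domain-dependent (level-0) classifiers become the input features of a meta-classifier (level-1). Everything in it is an assumption made for illustration: the toy lexicon-based base classifiers, the perceptron used as meta-classifier and the four training sentences are hypothetical and do not reproduce the SENT-IT implementation, whose details are given in Chapter 4 of the thesis.

```python
import re

# Hypothetical domain lexicons standing in for trained domain-dependent classifiers.
MOVIE_LEXICON = {"gripping": 1, "moving": 1, "boring": -1, "cliched": -1}
CAR_LEXICON = {"reliable": 1, "quiet": 1, "noisy": -1, "thirsty": -1}


def base_score(text, lexicon):
    """Level-0 classifier: sum of the lexicon weights of the tokens in `text`."""
    return sum(lexicon.get(tok, 0) for tok in re.findall(r"[a-z]+", text.lower()))


def meta_features(text):
    """Outputs of the level-0 classifiers become the input of the meta-classifier."""
    return [base_score(text, MOVIE_LEXICON), base_score(text, CAR_LEXICON)]


def predict(weights, x):
    """Return +1 or -1 from a bias (weights[0]) plus a weighted sum of features."""
    activation = weights[0] + sum(w * v for w, v in zip(weights[1:], x))
    return 1 if activation > 0 else -1


def train_perceptron(rows, labels, epochs=10):
    """Level-1 (meta) classifier: a perceptron over the level-0 outputs."""
    weights = [0.0] * (len(rows[0]) + 1)
    for _ in range(epochs):
        for x, y in zip(rows, labels):
            if predict(weights, x) != y:
                weights[0] += y
                for i, value in enumerate(x):
                    weights[i + 1] += y * value
    return weights


# Hypothetical labelled reviews from two domains (+1 positive, -1 negative).
TRAINING = [
    ("a gripping and moving story", 1),
    ("boring and cliched plot", -1),
    ("reliable and quiet engine", 1),
    ("noisy and thirsty engine", -1),
]
WEIGHTS = train_perceptron([meta_features(t) for t, _ in TRAINING],
                           [y for _, y in TRAINING])


def classify(text):
    """Classify a review from any of the known domains through the meta-classifier."""
    return "positive" if predict(WEIGHTS, meta_features(text)) == 1 else "negative"
```

In a more careful setup the meta-classifier would be trained on base-classifier outputs obtained from held-out data, so that it does not simply inherit the overfitting of the level-0 models; the sketch skips this step for brevity.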
1.5 Outline of the thesis

The remainder of this thesis consists of five chapters; in Chapter 2 a detailed description of the Sentiment Analysis activity is provided. Issues which affect the accuracy of the Sentiment Analysis task are described and compared with topic-based classification activities, well known in the literature, like document categorization. The problem of applying Sentiment Analysis techniques to non-English languages, where lexical or linguistic resources are often missing, is analyzed by describing some of the solutions available in the literature.

In Chapter 3 we focus on the specific problem of Overall Opinion Polarity classification at document level; more specifically we are interested in defining and evaluating supervised classifiers aimed at classifying a movie review as positive or negative according to the sentiment orientation its author expresses. In order to address this problem several document representation schemas and classification methods have been proposed and evaluated; the SENT-IT framework is presented in detail and results are analyzed and discussed. The chapter ends with a brief description of the information visualization module, based on zz-structures, which has been proposed in [14] in order to improve user navigation of movie reviews.

In Chapter 4 we move from the problem of effectively identifying the Overall Opinion Polarity of reviews in a specific domain to a more difficult activity: generating and training a classifier able to perform Overall Opinion Polarity analysis on documents concerning different or unknown domains. In particular we describe our meta-classification approach, based on the stacking method of ensemble classification. The results are analyzed and discussed with respect to similar experiences available in the literature.

In Chapter 5 we move forward from automatic classification of Overall Opinion Polarity at document level to term level.
More specifically we propose an original method for determining the polarity orientation of a set of terms extracted from the vocabulary of the Italian language, based on shortest path models applied to the link graph representing the selected terms. This activity is aimed at creating a polarity oriented lexicon for the Italian language, which could be used in order to improve the effectiveness of the Overall Opinion Polarity analysis at document level, by identifying unigrams carrying domain independent features for document representation.

In Chapter 6 a brief summary of the obtained results is reported and analyzed and possible future paths of research are described.
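The general idea of using shortest paths over a term graph to assign polarity can be illustrated as follows. This sketch is not the algorithm of Chapter 5, which is not reproduced in this section: the seed terms “buono” and “cattivo”, the decision rule (closer seed wins) and the tiny synonymy graph are all assumptions chosen only to show the mechanics.

```python
from collections import deque


def shortest_path_length(graph, source, target):
    """Breadth-first search; returns the number of edges, or None if unreachable."""
    if source == target:
        return 0
    seen = {source}
    queue = deque([(source, 0)])
    while queue:
        node, dist = queue.popleft()
        for neighbour in graph.get(node, ()):
            if neighbour == target:
                return dist + 1
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append((neighbour, dist + 1))
    return None


def term_polarity(graph, term, pos_seed="buono", neg_seed="cattivo"):
    """Label a term by comparing its distance from the two (assumed) seed terms."""
    d_pos = shortest_path_length(graph, term, pos_seed)
    d_neg = shortest_path_length(graph, term, neg_seed)
    if d_pos is None and d_neg is None:
        return "unknown"
    if d_neg is None or (d_pos is not None and d_pos < d_neg):
        return "positive"
    if d_pos is None or d_neg < d_pos:
        return "negative"
    return "neutral"


# Hypothetical undirected synonymy links between Italian adjectives.
SYNONYMS = {
    "buono": ["ottimo", "valido"],
    "ottimo": ["buono", "eccellente"],
    "eccellente": ["ottimo"],
    "valido": ["buono"],
    "cattivo": ["pessimo"],
    "pessimo": ["cattivo", "scadente"],
    "scadente": ["pessimo"],
}
```

A graph that also encodes antonymy or similarity links, as mentioned in the contributions, would need edge types or weights; the sketch uses unweighted synonymy edges only.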
Chapter 2
Sentiment Analysis: challenges, solutions and tasks

Abstract

In this Chapter a detailed survey of the challenges in Sentiment Analysis and of the related solutions presented in the literature is provided. This chapter is aimed at describing the critical issues which affect the accuracy of the SENT-IT framework described in Chapter 3 in overall opinion classification. A brief description of the complementary tasks which are related to Sentiment Analysis but not directly covered by our activities on the SENT-IT framework is provided. The Chapter ends with the analysis of works presented in the literature for Sentiment Analysis applied to non-English languages.

2.1 Introduction

Interest in Sentiment Analysis is constantly increasing in both the industrial and research sectors, due to its wide range of potential applications, the first of them being business intelligence.

Since the beginning of research activity in this specific area it has been clear, as stated by Turney in [82] and then confirmed by many other works, that Sentiment Analysis is different from classic document classification activities, such as text categorization.

“Text categorization (also referred as text classification, or topic spotting) is the activity of labelling natural language texts with thematic categories (also referred as classes) from a predefined set.” [72]

Categories are defined according to the specific goals of the users or applications which will perform categorization: different tasks are based on different sets of categories. In fact the number of categories which could be used in text categorization could vary from a small set constituted by just two categories (binary classification)
to larger sets including a thousand or more categories (e.g.: the categories required to classify a newspaper article according to its topic, by using a structured taxonomy or ontology). Moreover text categorization could be performed according to two different approaches, described in Figure 2.1:

• single label: input text t_i is assigned to exactly one category C_j; in fact categories do not overlap each other;
• multi label: input text t_i could be assigned to one or more categories C_1, C_2, C_3, ..., overlapping each other.

Sentiment Analysis, on the other hand, is usually based on a relatively small set of categories (e.g.: “positive” and “negative” in a binary classification approach; “5 stars” ... “1 star” in a multiclass classification approach). Such classes are domain independent and generalize across users and applications. The “positive” category, for example, clearly represents the set of positively oriented documents in both the movie and the car review domains. Generalization of categories across users and applications may not, on the contrary, be a valid hypothesis when dealing with text categorization. Consider, for example, two users who are interested in classifying their documents according to two different taxonomies describing document topicality. The same category C_i, for example “sport”, could be located at different levels of the taxonomies and be associated with different documents.

Figure 2.1: Different approaches to text categorization and polarity classification.
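The difference between the single label and multi label approaches can be shown with a small sketch. It assumes that some classifier has already produced a score per category for a text; the score values, the category names and the 0.5 threshold are hypothetical.

```python
def single_label(scores):
    """Single label: the text is assigned to exactly one category (the best one)."""
    return max(scores, key=scores.get)


def multi_label(scores, threshold=0.5):
    """Multi label: the text is assigned to every category scoring above a threshold."""
    return {category for category, score in scores.items() if score >= threshold}


# Hypothetical per-category scores for one newspaper article.
ARTICLE_SCORES = {"sport": 0.8, "politics": 0.6, "economy": 0.1}
```

Here the single label approach returns only "sport", while the multi label approach returns both "sport" and "politics".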
Another difference that arises when comparing Sentiment Analysis with traditional text categorization tasks is the strong relationship which connects categories to each other. More specifically, while text categorization is based on unrelated (or hierarchically related, when dealing with taxonomies) categories, Sentiment Analysis uses categories representing opposite concepts (e.g.: “positive” and “negative” in binary classification) or categories connected by an order relation (e.g.: “5 stars” ... “1 star” in multiclass classification).

In fact, as stated by Pang and Lee in [62]: “the regression-like nature of strength of feeling, degree of positivity, and so on seems rather unique to sentiment categorization (although one could argue that the same phenomenon exists with respect to topic-based relevance)”.

Opinion Mining also differs in many respects from another activity commonly applied to unstructured texts: Information Extraction (IE). Information Extraction could be defined as “the process of filling the fields and records of a database from unstructured or loosely formatted text” [50].

The IE process analyses the input text in order to identify and extract references to entities (e.g. people, places, dates, currencies, etc.) appearing in the text and their relationships (e.g. person X is going to place Y). Extracted data are structured by means of a frame-like structure, the template, a list of slots filled with the strings extracted from the input document during the IE activity. Each template is strongly coupled with the specific domain and task it is aimed at: templates devised for the medical domain, where slots contain references to diseases, drugs, viruses or chemical compounds, do not perform effectively when applied to unstructured bibliographic references.
In fact Information Extraction is described in the literature as a strongly domain dependent activity; many works, like [29] and [31], describe this issue of IE, introducing and evaluating new approaches aimed at domain-independent Information Extraction from documents available on the Web.

Opinion oriented Information Extraction, on the other hand, is based on templates whose fields (e.g.: appraiser, appraised, orientation, attitude, strength, etc.) generalize well across different domains. The described template could be adopted to extract opinions expressed by a text in both the car and the movie review domains. In [87] and [88] a particular domain-independent template, defined as an appraisal group, is used to extract opinion related information from a set of input texts and perform Sentiment Analysis classification. The frame-like structure used to describe an appraisal group and the set of textual entities which could be assigned to the attitude slot are represented in Figure 2.2.
Figure 2.2: Template adopted in [87] and [88] for opinion oriented information extraction.
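A frame-like template of this kind maps naturally onto a record type. The sketch below uses the slot names listed in the text (appraiser, appraised, orientation, attitude, strength); the value conventions noted in the comments and the two filled instances are assumptions for illustration and are not taken from [87] or [88].

```python
from dataclasses import dataclass, fields
from typing import Optional


@dataclass
class AppraisalGroup:
    """One opinion-oriented template; slot names follow the fields listed in the text."""
    appraised: str                   # entity the opinion is about
    attitude: str                    # textual span carrying the evaluation
    orientation: str                 # assumed values: "positive" / "negative"
    strength: str = "neutral"        # assumed values: "low" / "neutral" / "high"
    appraiser: Optional[str] = None  # opinion holder, when it can be identified


def slot_names(template):
    """List the slots of a filled template, in declaration order."""
    return [f.name for f in fields(template)]


# The same template filled from two different review domains (hypothetical data).
car = AppraisalGroup(appraised="gearbox", attitude="precise and quiet",
                     orientation="positive")
movie = AppraisalGroup(appraised="plot", attitude="very boring",
                       orientation="negative", strength="high", appraiser="reviewer")
```

The point of the example is that both instances share exactly the same slots, whereas a classic IE template for, say, the medical domain would need slots that make no sense for car or movie reviews.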
2.2 Challenges in Sentiment Analysis

Both sentiment polarity classification and opinion oriented information extraction, as described in the previous section, generalize well across different domains, different users and even different information needs.

By observing this domain independence of the activities constituting the research field of Sentiment Analysis, the following assumption could be formulated: the polarity (or, in the same way, the subjectivity for opinion oriented information extraction) of a text is given by the polarity of the single words constituting it. In order to classify the polarity of an opinion expressed in a given text, a set of specific keywords should be identified.

In order to illustrate this assumption, which has been proven to be partially wrong by several publications such as [63], consider the following example from the cell phone review domain:

“Considero il Nokia 5250 un vero affare, visto che possiede tutte le funzioni più ricercate in un telefono di ultima generazione e costa come una serata al ristorante. Il 5230 è un ottima via di mezzo ad un prezzo basso, secondo me ne venderanno una miriade.” [English: “I consider the Nokia 5250 a real bargain, since it has all the most sought-after features of a latest-generation phone and costs as much as an evening at a restaurant. The 5230 is an excellent middle ground at a low price; in my opinion they will sell a myriad of them.”]

The topic of this review could be easily identified by the entity “Nokia 5250”¹, while the presence of words like “affare” and “ottima” clearly suggests that the review’s author is expressing a positive opinion. Other textual elements, such as “più ricercate”, “ultima generazione” and “ne venderanno una miriade”, help in conveying the polarity of the opinion expressed by the author.

¹ The appraised entity is easily extracted, in the specific domain of cell phones, by implementing a named entity extraction [15] module based on a small set of extraction rules. Named entity extraction could be achieved, for example, by identifying the textual spans containing a reference to a manufacturer of cell phones (e.g. Nokia, Apple), listed in a gazetteer, followed by a number (e.g.: 5240, 3310).

By looking at the previous example it seems that distinguishing positive from negative reviews is relatively easy for humans, especially in comparison to other traditional text categorization problems, like topic categorization applied to documents concerning very similar topics. However, as stated by Pang in [63], the identification of keywords conveying sentiment polarity is difficult even for human classifiers. In [63] the authors asked two human classifiers to collect, independently of each other, a list of indicators of positive or negative orientation in a given document. Both lists of keywords proposed by the human experts, though intuitively plausible, have been used to classify a set of input documents from the movie review domain: the evaluated accuracy, however, is only about 60%, with respect to a baseline of 50% based on random classification. The two lists of polarity bearing terms collected by the human experts are reported in Table 2.1.

Table 2.1: List of polarity conveying terms collected by two human experts in [63].

Human 1
  Positive: dazzling, brilliant, phenomenal, excellent, fantastic
  Negative: suck, terrible, awful, unwatchable, hideous
Human 2
  Positive: gripping, mesmerizing, riveting, spectacular, cool, awesome, thrilling, badass, excellent, moving, exciting
  Negative: bad, cliched, sucks, boring, stupid, slow

In order to improve the accuracy of the classification activity based on keyword identification, a new list of polarity conveying terms has been collected and evaluated: words have been chosen according to a preliminary examination of the frequency counts characterizing the test set. The second list, shown in Table 2.2, has the same size as the previously described lists but, at the same time, presents some peculiarities. Terms whose semantics does not directly bear a polarity orientation have been included as indicators of positive (“still”) or negative (question mark and exclamation mark) orientation.

Although such textual entities would probably not have been proposed by human experts, their usage as polarity indicators arises from the statistical analysis of term frequency applied to the test corpus. The accuracy achieved by the latter list of keywords is almost 70%.

Table 2.2: List of polarity conveying terms collected by human expert and statistical analysis of the document corpus in [63].

  Positive: love, wonderful, best, great, superb, still, beautiful
  Negative: bad, worst, stupid, waste, boring, ?, !

Pang et al. showed that polarity classification based on keyword identification could be outperformed by adopting a machine learning based approach. More specifically, by using unigram models and Naïve Bayesian classifiers, an accuracy of over 80% has been achieved. However such accuracy, even if better than the performance achieved with keyword identification, is still lower than the performance expected in typical topic-based binary classification [72].
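The keyword identification baseline discussed above can be sketched as a simple count of list hits. The two word lists are the “Human 1” lists of Table 2.1; the tokenization and the handling of ties are assumptions of this sketch and may differ from the procedure followed in [63].

```python
import re

# "Human 1" keyword lists from Table 2.1.
POSITIVE = {"dazzling", "brilliant", "phenomenal", "excellent", "fantastic"}
NEGATIVE = {"suck", "terrible", "awful", "unwatchable", "hideous"}


def keyword_polarity(text):
    """Classify a text by counting positive and negative keyword occurrences."""
    tokens = re.findall(r"[a-z']+", text.lower())
    pos = sum(token in POSITIVE for token in tokens)
    neg = sum(token in NEGATIVE for token in tokens)
    if pos > neg:
        return "positive"
    if neg > pos:
        return "negative"
    return "tie"  # no decision: no keyword found, or equal counts
```

For instance, `keyword_polarity("A brilliant, fantastic film")` yields "positive", while a sentence such as “She runs the gamut of emotions from A to B.” contains none of the listed keywords and yields "tie", which hints at why such lists reach only a modest accuracy.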
But why is the sentiment classification problem harder than traditional topic-based binary classification, considering, in particular, that the “positive” and “negative” classes are so semantically different from each other?
The most significant difference from topic classification of textual entities, as stated in [82], [63], [71] and others, is that “sentiment can often be expressed in a more subtle manner, making it difficult to be identified by any of a sentence or document’s terms when considered in isolation” [62].

In order to better understand how sentiment orientation could be expressed without requiring the presence of specific polarity bearing terms, consider the following examples in both English and Italian²:

• “If you are reading this because it is your darling fragrance, please wear it at home exclusively, and tape the windows shut.” The example has been taken from the perfume review domain; it clearly expresses a negative orientation, although no ostensibly negative words occur.

• “She runs the gamut of emotions from A to B.” Even in this example, taken from the movie review domain, no ostensibly negative words occur but, in fact, the author expresses a strongly negative opinion.

• “Everytime I read ‘Pride and Prejudice’ I want to dig her up and beat her over the skull with her own shin-bone.” This example, extracted from a review written by Mark Twain about Jane Austen’s books, expresses a strongly negative opinion.

• “La nuova Fiat Punto rappresenta l’anello di congiunzione fra il carro bestiame del secolo scorso e le automobili tedesce del giorno d’oggi.” [English: “The new Fiat Punto is the missing link between the cattle wagon of the last century and today’s German cars.”] In this example, taken from the car review domain, a strongly negative opinion is expressed although no ostensibly negative words occur.

• “Certamente un pomeriggio con la suocera può rivelarsi più entusiasmante di questa sceneggiatura.” [English: “An afternoon with one’s mother-in-law can certainly turn out to be more exciting than this screenplay.”] Another example, taken from the movie review domain, where a negative opinion is expressed. In this particular review the sentiment is subtle and very difficult to understand. In fact irony is used to convey the negative opinion the author wants to express about the plot of the movie. She/he compares the time spent seeing the movie with the time spent with the mother-in-law, which is usually associated with negative moments and feelings.

• “Il nuovo Nokia N8 è una bomba, l’ultima ancora di salvataggio per il bilancio dell’azienda finlandese.” [English: “The new Nokia N8 is a blast, the last lifeline for the Finnish company’s balance sheet.”] In this example a positive opinion arises; more specifically the author wants to express how the appeal of the described product could attract many new customers. Even in this case no reference to positive terms is present in the review. Moreover words like “bomba” (bomb) and “ultima” (last) are used to convey a positive polarity, even if both, in their English version, have been classified, by using SentiWordNet [25, 26, 27, 28], as strongly objective and slightly oriented to negative polarity.

² English examples have been extracted from [62], while the Italian examples have been manually extracted during the preliminary analysis of the corpora described in Chapters 3 and 4.

The provided examples show how polarity orientation could be conveyed without requiring opinions, even if strongly oriented, to be associated with specific keywords or phrases.

Moreover, as stated by Kim and Hovy in [43], another issue affecting the classification of polarity is the difficulty in recognizing and distinguishing objective and subjective parts of a given text. According to the authors, this task proves particularly difficult for human classifiers too; more specifically they express their doubts by claiming that “human annotators often disagreed on whether a belief statement was or was not an opinion”.

The conclusions expressed by Kim and Hovy have not been widely accepted by the research community. In [74] [75] the authors present their results based on manual identification of opinionated sentences and their respective polarities in 24 different documents (13 for study A and 11 for study B, performed by the same experts two months later). The experimental activity shows an inter-evaluator agreement of 83% for study A and 85% for study B, which outperforms the results presented by Kim and Hovy.

Even objective sentences of a text, indeed, could provide opinions and polarity clues; even “facts”, strongly objective sentences, do not guarantee the absence of opinion. In order to clarify this statement, consider the following examples written in both English and Italian:

• “I must familiarise my mind with the fact that Miss Austen is not a poetess. I must learn to acknowledge her as one of the greatest artists, of the greatest painters of human character, and one of the writers with the nicest sense of means to an end that ever lived.”

• “Il Nokia N8 non è un comune telefono.
Il suo schermo da 3.5’ e la connettività wireless Wi-Fi / 3G lo rendono un vero computer portatile. La batteria garantisce oltre 6 ore di autonomia.” [English: “The Nokia N8 is no ordinary phone. Its 3.5’ screen and its Wi-Fi / 3G wireless connectivity make it a true portable computer. The battery guarantees over 6 hours of autonomy.”]

In both examples a strong opinion is expressed by sentences which are objective and subjective at the same time. Consider the second example, extracted from the cell phone review domain; it consists of 3 different sentences, which could all be classified as facts conveying only objective information (e.g.: the first sentence, “Il Nokia N8 non è un comune telefono”, is an example of a fact, providing the definition of an entity, a cell phone). However, although made up of objective sentences, the text expresses a positive opinion; in particular it shows that the described product provides a wider set of capabilities not available in similar products, leading to an advanced and unique product.

Similarly, the first example shows how the phrase “the fact that” does not necessarily guarantee the objective truth of what follows it. The objective sentence “Miss
Austen is not a poetess” provides emphasis to the following sentences, by strengthening their orientation; the sentence, in other terms, is not aimed at describing a real fact, but at conveying an opinion in a more suitable or elegant way.

In addition to the difficulties in recognizing sentences which could be classified as opinions, another issue, widely discussed in the literature, is the identification of the opinion holder and the opinion object (also referred to as appraiser and appraised in [87, 88]). This issue has been studied, in particular, in [47, 56, 80, 57] in order to identify the holder of an opinion in transcriptions of political debates.

The general notion of positive and negative opinions is consistent across different domains, as described in Section 2.1; however the sentiment and subjectivity of a given text depend, as shown by the previous examples, on the context where the text is located. This assumption can be generalized at domain level: in each domain different terms can be used to convey the same opinion and polarity or, on the other hand, the same terms can convey different semantics in different domains and contexts [62]. The following examples show terms or sentences whose polarity changes across different domains:

• “Go read the book” [62]. This simple sentence clearly expresses a positive opinion when it appears in a book review. The same sentence, however, expresses a quite negative sentiment when used in the review of a movie. The same sentence, thus, could be used to express completely opposite opinions and sentiments in different contexts or domains.

• “La memoria RAM installata a bordo è di 512 MB” [English: “The on-board RAM is 512 MB”]. This simple sentence presents both the previously described issues: it is a fact, expressing objective information about a product, but provides, at the same time, an implicit opinion concerning the quality of the described product. Moreover it assumes different orientations when applied to different domains.
For example it could describe a high-end cell phone, assuming a positive orientation, in the cell phone domain, while it could assume a quite negative orientation when used in the computer domain, where 512 MB is considered, nowadays, a poor amount of RAM.

This phenomenon is more frequent when documents contain off-topic or cross-topic sections, for example when the author moves to a different domain dependent vocabulary inside the body of the review.

The last issue which affects Sentiment Analysis concerns the importance of modeling the structure of the discourse expressed by the author of a text. In traditional text categorization the order in which different subjects are presented is not important; terms which occur relatively frequently in the text concur in determining the topic of the document. In Sentiment Analysis the order in which opinions are presented influences the polarity expressed by the document; the same sentences
in a different order could lead to a completely opposite overall sentiment polarity. The following examples provide a better understanding of the described phenomenon:

• “This film should be brilliant. It sounds like a great plot, the actors are first grade, and the supporting cast is good as well, and Stallone is attempting to deliver a good performance. However, it can’t hold up.” The orientation of the text, except for the last sentence, is clearly positive; in fact the presence of several positive indicators like “brilliant”, “great”, “first grade” and “good” supports this hypothesis. However the overall sentiment, which is clearly negative, is provided only by the last sentence. The last sentence, in fact, is crucial for determining the overall polarity of the review but, at the same time, does not provide any explicit negative polarity indicator.

• “Il cambio è preciso e silenzioso, lo sterzo pronto alle sollecitazioni. L’elettronica di bordo non delude. Tuttavia il prodotto finale è ancora troppo arretrato rispetto ai diretti concorrenti sul mercato.” [English: “The gearbox is precise and quiet, the steering responsive. The on-board electronics do not disappoint. However, the final product is still too far behind its direct competitors on the market.”] Similarly to the previous example, in this review extracted from the car domain the overall negative polarity is expressed by the last sentence. The first sentences, in particular, provide a positive opinion about the car under description.

This issue has been described by Turney in [82] as the problem of identifying and distinguishing sentences providing opinions concerning the whole from sentences providing opinions concerning single elements. This issue affects, in particular, specific domains, as shown by Turney: “good beaches do not necessarily add up to a good vacation.
On the other hand good banking services add up to a good bank”.

2.3 Sentiment Polarity Classification

Sentiment polarity classification is the task of classifying the opinion expressed by an opinionated text as “positive” or “negative”, or of locating its position on the continuum between these two polarities [62].

Polarity classification of opinionated texts could improve the effectiveness of several activities based on the analysis of large amounts of textual data, the most important one being Business Intelligence; as described in Chapter 1, polarity classification has been exploited in literature in order to improve the effectiveness of several applications. In [22] polarity-classified sentences have been used to define novel sentiment information retrieval models in the framework of probabilistic language models, aimed at improving the accuracy of polarity-oriented retrieval.

The term sentiment polarity classification can refer broadly to binary categorization (e.g.: the opinion expressed by text A is classified as “positive” or “negative”), to regression (e.g.: the opinion expressed by text A is classified as “2” on a scale between “0”, which represents an “extremely negative” opinion, and “10”, which represents an “extremely positive” opinion), or to ranking (e.g.: the opinion expressed by text A is more positive than the opinion expressed by text B on the same topic).
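The three formulations can be contrasted on a single underlying polarity score. The following is a minimal sketch, assuming a toy lexicon and a simple additive scoring function; the names LEXICON, raw_score, binary_label, rating_0_10 and more_positive are hypothetical and introduced only for illustration, not taken from the works cited above.

```python
# Toy polarity lexicon: an assumption made for illustration only.
LEXICON = {"great": 2.0, "good": 1.0, "poor": -1.0, "awful": -2.0}


def raw_score(text: str) -> float:
    """Sum the lexicon weights of the tokens found in the text."""
    tokens = (tok.strip(".,!?").lower() for tok in text.split())
    return sum(LEXICON.get(tok, 0.0) for tok in tokens)


def binary_label(text: str) -> str:
    """Binary categorization: the text is 'positive' or 'negative'."""
    return "positive" if raw_score(text) >= 0 else "negative"


def rating_0_10(text: str, max_abs: float = 4.0) -> float:
    """Regression: map the clipped raw score onto the 0..10 scale."""
    clipped = max(-max_abs, min(max_abs, raw_score(text)))
    return 5.0 + 5.0 * clipped / max_abs


def more_positive(text_a: str, text_b: str) -> bool:
    """Ranking: is text A more positive than text B?"""
    return raw_score(text_a) > raw_score(text_b)
```

In this sketch the three tasks differ only in how the same score is read: thresholded, rescaled, or compared across texts. A trained classifier or regressor would replace raw_score in practice.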
Sentiment polarity classification, in both its binary and regression formulations, is usually based on two opposite classes, like “positive” and “negative” or “like” and “dislike”, whose semantics are quite clear. However, such dichotomies could also assume different nuances, for example in the field of politics, as in [56], [57] and [80], where the authors are interested in determining whether an opinion holder supports the topic under discussion during a debate, or in [44], where the authors are interested in predicting which party will win an election by looking at informal opinions left by users on an election prediction website. The authors report a prediction accuracy of 81.68%, obtained by adopting a classification approach based on the SVM method, improved by integration with a novel technique that generalizes n-gram feature patterns.

All the described variants of the standard sentiment polarity classification activity could be addressed by means of similar machine learning techniques, such as Naïve Bayes and SVM classifiers.

In [41] the authors focused on determining the reasons why a product is liked or disliked by reviewers. More specifically, the work is aimed at identifying and classifying which expressions of a review describe the “pros” and “cons” of a given product, for example:

• “The battery life of this laptop is only 2 hours long.”
• “La tastiera è affidabile e poco rumorosa.” [The keyboard is reliable and quiet.]

The authors applied a Maximum Entropy approach; more specifically, in order to deal easily with a multi-class classification problem (sentences can express a “pro” reason - PR, a “con” reason - CR, or no reason - NR), a two-step binary classification approach has been used: the first classifier distinguishes CR or PR sentences from NR sentences, which are not relevant in pro and con extraction. The second classifier, instead, distinguishes CR sentences from PR sentences.
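The two-step decomposition described for [41] can be sketched as a cascade of two binary decisions. This is only an illustration of the control flow: the keyword rules below are hypothetical stand-ins for the two trained Maximum Entropy classifiers, and the cue lists and function names are assumptions introduced here.

```python
from typing import Callable, List

# Hypothetical cue words; a real system would use trained classifiers instead.
PRO_CUES = {"reliable", "quiet", "great", "fast"}
CON_CUES = {"only", "noisy", "slow", "fragile"}


def tokenize(sentence: str) -> List[str]:
    """Lowercase the sentence and strip basic punctuation from each token."""
    return [tok.strip(".,!?\"'").lower() for tok in sentence.split()]


def has_reason(sentence: str) -> bool:
    """Step 1 stand-in: does the sentence express any pro or con reason?"""
    return any(tok in PRO_CUES or tok in CON_CUES for tok in tokenize(sentence))


def is_pro(sentence: str) -> bool:
    """Step 2 stand-in: among reason sentences, is this one a 'pro'?"""
    tokens = tokenize(sentence)
    pros = sum(tok in PRO_CUES for tok in tokens)
    cons = sum(tok in CON_CUES for tok in tokens)
    return pros > cons


def two_step_classify(
    sentence: str,
    step1: Callable[[str], bool] = has_reason,
    step2: Callable[[str], bool] = is_pro,
) -> str:
    """Return 'NR', 'PR' or 'CR' using two chained binary classifiers."""
    if not step1(sentence):
        return "NR"  # filtered out by the first classifier
    return "PR" if step2(sentence) else "CR"
```

Because the second classifier only ever sees sentences accepted by the first, each model solves a simpler binary problem than a single three-class model would.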
The ability to extract the elements that represent “pros” and “cons” in a given text leads to a significant improvement in Sentiment Analysis: the ability to evaluate the agreement between the overall rating expressed by a reviewer and the actual contents of his/her review. Such analysis could be used, for example, to determine the reputation of a reviewer and, consequently, the trust placed in his/her reviews.

Although most of the polarity orientation of a given text is provided by subjective (opinionated) contents, sentiment polarity classification could be applied to objective texts too. Some of the examples reported and described in Section 2.2 support the importance of applying sentiment polarity classification to objective information as well. The following sentences are two further examples of how objective information could help in determining the sentiment polarity of a given text:

• “The Nokia N8 has got a large and brilliant display, with a resolution of 320 by 480 pixels.”
• “Il Nokia N8 ha uno schermo ampio e brillante, con una risoluzione di 320 per 480 pixel.”

Both sentences are objective: they express a fact about the capabilities of the display. Such information, however, could prove useful in determining the overall polarity of the product. In [45] a sentiment classification activity applied to objective texts is described: the authors are interested in developing a novel model able to predict the trend of a public company's stock with respect to a set of news items concerning the company itself. In order to evaluate the accuracy of the proposed prediction model, the Multex Significant Developments corpus, constituted by more than 12,000 news items, has been used as testing set. The predicted results have been compared, for each company, with the real trend of the company's stock during the same period. The measured accuracy ranges from 52% to 70.3%, depending on the parameters and labelling methods considered in the experiment.

The overall polarity orientation of a given text could also be determined by the presence of comparative sentences, like “Canon EOS optics are better than those of Sony and Nikon”, which represent a relationship between different opinions expressed across the same document or across different documents by the same author. Given the previous example and an opinionated text written by the same author about the optics of a Nikon digital camera, previously classified as positively oriented, we could plausibly assume that the evaluation expressed about the Canon EOS digital camera would be positively oriented too.

This particular problem has been investigated in [37] and [36]: different supervised learning approaches have been exploited in order to identify and extract comparative sentences in a domain-dependent environment.
Results have been evaluated on three different domains: news articles, consumer reviews of products, and Internet forum postings.

2.3.1 Sentiment Polarity Regression

Sentiment polarity classification could be generalized from a binary classification problem to a multi-class classification problem, where the ratings assigned to a text in order to describe the polarity it expresses represent the classes. Multi-class classification based on ordinal ratings is in fact a form of ordinal regression.

Moving towards a multi-class classification problem makes the classification activity more informative, by providing a more detailed rating schema (e.g.: documents could be classified by using classes that span from “extremely positive” to “extremely negative”). At the same time, however, it can lower the accuracy of the trained classifiers by increasing the complexity of the problem.

An interesting property of the multi-class reformulation of the sentiment polarity classification activity is represented by the following observation: although each class representing a specific rating is characterized by a specific vocabulary (the set of keywords that could be used to infer that the polarity expressed by a
text matches the class), texts containing a mixture of terms from opposite classes could be assigned to a third class. Consider, for example, a classification problem based on three different classes: the “positive” class, the “negative” class and the “neutral” class. Texts could fall into the neutral class because they contain multiple references to neutral terms (e.g.: “normal”, “standard”, “mediocre”) or because they contain a mix of terms from both the positive and the negative class, whose mixture leads to an overall neutral polarity.

The neutral class is a critical class in sentiment polarity classification, whose semantics could be particularly subtle. For this reason, as described in Chapter 3, it has not been considered in this thesis. The neutral class, in fact, could represent three different situations at the same time:

1. the text does not express any information concerning its polarity orientation;
2. the text includes both positive and negative opinions, mitigating each other without leading to a clear opinion;
3. the text explicitly expresses a neutral opinion: the author wants to express an opinion that can be classified neither as positive nor as negative.

Another aspect of the neutral class has been observed in [12]; the authors show how neutral comments are usually perceived as slightly negative by users. According to the authors, who based their work on the observation of the dynamics of sellers’ reputation on eBay, the effects of a neutral feedback are similar to the effects of a negative feedback.
Cabral’s assumption has been further corroborated in our work, as described in both Chapters 3 and 4: most of the neutral comments we collected in our experimental activity could be seen as negative feedback, softened by social influences such as shame (e.g.: not letting others know that I made a bad deal by buying a specific product) and fear (e.g.: according to [12] “a buyer leaving a negative comment has a 40% chance of being hit back, while a buyer leaving a neutral comment only has a 10% chance of being retaliated upon by the seller.”).

2.4 Opinion Mining

In Section 2.3 we have seen how most of the works described in literature that are aimed at sentiment polarity classification are based on the assumption that opinionated texts are provided as input. The importance of deciding whether a given document contains subjective information has been summarized by Mihalcea in [52]:

“the problem of distinguishing subjective versus objective instances has often proved to be more difficult than subsequent polarity classification, so improvements in subjectivity classification promise to positively impact sentiment classification.”
The role of adjectives as carriers of orientation and, moreover, their effects on sentence subjectivity have been examined by Hatzivassiloglou and Wiebe in [34]. The authors are interested in determining whether a given sentence is subjective by analyzing the adjectives it contains. Several projects, described in detail in [91], explore the problem of determining the subjectivity of a given text, sentence or sub-sentence in different domains, as in [92], [90], [98] and [93]. Moreover, [91] provides a comprehensive survey of subjectivity recognition using different clues and features.

Wilson [94] addresses the problem of determining clause-level opinion strength (e.g.: “how mad are you?”). In particular, the author shows how the problem of determining opinion strength is different from inferring the polarity of an opinion. A text classified as a neutral opinion by polarity classification is not necessarily an objective text, where a clear opinion is missing: it can convey a strong “mediocre” opinion, or it can describe some aspects of the product as positive and some others as negative, mitigating each other.

Subjectivity detection and ranking at the document level is a task derived from genre classification, the process devoted to inferring the genre of a given text. In [98] the authors obtained a high accuracy (97%) with a Naïve Bayes classifier on a testing set constituted by articles from the Wall Street Journal. The authors aimed at discriminating between two classes of articles: News and Business (facts) versus Editorial and Letter to the Editor (opinions), as previously done in [92] and [91].

2.5 Affect computing

Another research field related to the Sentiment Analysis domain is represented by computational affect (also defined as affect analysis).
Computational affect goes beyond the identification and extraction of opinionated pieces of text and the subsequent sentiment polarity classification, focusing on the identification of specific emotions appearing in the text as part of an opinion.

Most of the research works are inspired by the six universal emotions described by Ekman in his study on the possible expressions of the human face [23]: anger, disgust, fear, happiness, sadness, and surprise. In [49], [48] a subset of the Open Mind Common-sense Corpus is used to generate four different affect classification models, based on the six emotions described by Ekman. The input text is split into sentences and the affective classification is performed for each sentence; several techniques have been introduced, such as the analysis of the global mood at document level, to smooth the transition between consecutive sentences.

Figure 2.3 shows an example of the system developed in [49], [48], called the EmpathyBuddy email browser; the system is aimed at representing in real time the affective quality of the text being typed by the user by means of Chernoff-style face feedback.

Another novel approach described in [49], [48] is the ability to define and extract patterns representing meta-emotions: complex emotions, such as frustration, relief or horror, which could be represented as a mix of the six basic emotions.
Frustration, for example, is defined as the repetition of words expressing anger with a low strength.

Figure 2.3: The EmpathyBuddy email agent [49, 48] in action.

Affect computing has moreover been investigated by Valitutti [86]: the authors developed WordNet-Affect, a subset of the WordNet resource labelled by means of a taxonomy of 11 categories representing affective concepts, defined as a-labels. A set of 1,314 WordNet synsets, including 3,340 different terms, has been labelled with the set of pre-defined a-labels.

WordNet-Affect was initially developed by manually labelling a set of more than 1,900 terms selected from different resources, like dictionaries. Labelled terms have been linked to their respective WordNet synsets, each one with an associated frame of related information (e.g. Italian and English version, a-label).

The WordNet hierarchy has been exploited in order to identify new affective synsets not included in the WordNet-Affect core; the following relations between synsets have been investigated according to the assumption that they preserve the affective meaning of the related synsets: antonymy, similarity, derived-from, pertains-to, attribute and see-also.

2.6 Multilingual Sentiment Analysis

Most of the research activities on Sentiment Analysis available in literature are focused on documents written in the English language; in fact, most of the resources required in Opinion Mining and Sentiment Analysis, like lexicons and manually labelled corpora, are easily available only for the English language.

The lack of linguistic resources is described as a critical issue in most of the research experiences concerning Sentiment Analysis of non-English languages. In this thesis, as described in Chapters 3 and 5, many experimental activities have been carried out in order to deal with the lack of available resources for the Italian language.
The development of new tools and resources for a foreign language requires several years of work; for this reason we focused more on supervised and unsupervised machine learning approaches than on more complex analysis tools, like parsers or information extraction engines.

Sentiment Analysis techniques applied to foreign languages have been investigated in [79] and [39] for the Japanese language, in [35, 99, 100] for the Chinese language, in [1] for the Arabic language and in [43] for the German language.

Many researchers investigated novel methods to automatically generate the resources required in Sentiment Analysis for a new language: lexical resources already defined for the English language have been projected to the target languages by means of different cross-lingual projection strategies. In [52] both bilingual dictionaries and parallel corpora have been investigated in order to perform sentiment analysis for the Romanian language.

The simplest strategy devoted to cross-lingual Sentiment Analysis is automatic machine translation [7]: during pre-processing the input text is translated into English and, subsequently, classified by means of a Sentiment Analysis classifier. Nine different languages have been investigated by means of a state-of-the-art automatic translation system, the WebSphere Translation Service developed by IBM. The authors show how Sentiment Analysis applied to automatically translated texts performs in a consistent way across different languages. Moreover, the authors show how the proposed approach could be generalized beyond the specific translation service used.
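The translate-then-classify strategy can be sketched as a two-stage pipeline. This is a minimal illustration under strong assumptions: the word-by-word dictionary lookup below is a hypothetical stand-in for a real machine translation system, and the English lexicon classifier is a toy; none of the names come from [7].

```python
# Hypothetical Italian-to-English glossary standing in for a real MT system.
TOY_GLOSSARY = {
    "ottimo": "excellent",
    "pessimo": "terrible",
    "regalo": "gift",
    "prodotto": "product",
    "un": "a",
}

# Toy English polarity lexicon, also an assumption for illustration.
ENGLISH_LEXICON = {"excellent": 1, "good": 1, "terrible": -1, "bad": -1}


def translate_to_english(text: str) -> str:
    """Stage 1 stand-in: translate word by word, keeping unknown words as-is."""
    words = (tok.strip(".,!?").lower() for tok in text.split())
    return " ".join(TOY_GLOSSARY.get(word, word) for word in words)


def classify_english(text: str) -> str:
    """Stage 2: classify the English text with a simple lexicon vote."""
    score = sum(ENGLISH_LEXICON.get(tok, 0) for tok in text.split())
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "unknown"


def classify_foreign(text: str) -> str:
    """Full pipeline: translate first, then reuse the English classifier."""
    return classify_english(translate_to_english(text))
```

The appeal of this design is that only the translation stage is language specific; the English classifier is reused unchanged for every source language.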
Another interesting result presented in [7] is the analysis of the cross-cultural orientation of automatically annotated terms: English and Italian are the languages most biased towards negative sentiment orientation, while Korean is the language most biased towards positive sentiment orientation.

In [52] the subjectivity lexicon used in the OpinionFinder system [89] is translated into Romanian by using both an authoritative and a web-based English-Romanian dictionary containing, respectively, 41,500 and 4,500 entries. The OpinionFinder lexicon is constituted by expressions of one or more words, labelled according to their subjectivity and strength.

Several issues have been faced in order to properly perform cross-lingual projection, such as word lemmatization, resolution of translation ambiguities and, finally, translation of multi-word expressions. Multi-word expressions have been translated word by word from English to Romanian and validated by counting the number of occurrences of the translated expressions on the AltaVista search engine. By manually annotating 150 sample expressions extracted from the generated lexicon, the authors found that subjectivity clues tend to be less reliable in the target language. In other words, part of the subjectivity is lost in translation, and this issue is stronger for weakly subjective expressions.

In addition to lexicon translation, corpus-based cross-lingual projection has been investigated in [52] in order to overcome the limitations observed in lexicon translation: the authors translated a set of 107 English documents into Romanian and manually
annotated them in order to be used as gold standard in evaluation. English documents have been automatically annotated by means of the OpinionFinder system; annotations have been projected to the corresponding Romanian texts. A Bayes classifier has been trained on the translated sentences and evaluated; the authors show how corpus-based cross-lingual projection performs better than lexicon translation.

According to the authors, corpus-based cross-lingual projection improves on the effectiveness of lexicon translation: the context in which a text is used in the original language could reduce its ambiguity in the new language.

Cross-lingual projection allows researchers to investigate Sentiment Analysis in new languages without requiring large linguistic resources or complex analysis tools to be developed. On the other hand, cross-lingual projection presents several issues, including translation ambiguities and cross-cultural differences, like irony, that are not easily projectable onto different languages. Such issues affect, as shown by the described research experiences, the effectiveness of both Sentiment Analysis and Opinion Mining activities, leading to lower performance with respect to the English language.
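Corpus-based projection can be sketched in three steps: annotate the English side of a parallel corpus, copy each label to the aligned target sentence, and train a classifier on the target side. The sketch below is illustrative only: annotate_english is a hypothetical keyword rule standing in for an automatic annotator such as OpinionFinder, the tiny parallel corpus uses Italian rather than Romanian, and the Naive Bayes classifier with add-one smoothing is written here for the example.

```python
import math
from collections import Counter, defaultdict

# Hypothetical cue words standing in for an automatic English annotator.
SUBJECTIVE_CUES = {"love", "hate", "terrible", "wonderful"}

# Toy parallel corpus (English, Italian); invented for illustration.
PARALLEL = [
    ("i love this phone", "adoro questo telefono"),
    ("the screen is wonderful", "lo schermo è meraviglioso"),
    ("the phone weighs 100 grams", "il telefono pesa 100 grammi"),
    ("the box contains a charger", "la scatola contiene un caricatore"),
]


def annotate_english(sentence):
    """Label an English sentence as subjective or objective."""
    tokens = sentence.lower().split()
    return "subjective" if any(t in SUBJECTIVE_CUES for t in tokens) else "objective"


def project_labels(parallel_pairs):
    """Copy the label of each English sentence to its aligned translation."""
    return [(target, annotate_english(source)) for source, target in parallel_pairs]


def train_nb(labelled):
    """Collect the counts needed by a multinomial Naive Bayes classifier."""
    class_counts, word_counts, vocab = Counter(), defaultdict(Counter), set()
    for text, label in labelled:
        class_counts[label] += 1
        for tok in text.lower().split():
            word_counts[label][tok] += 1
            vocab.add(tok)
    return class_counts, word_counts, vocab


def predict_nb(model, text):
    """Return the most probable class, using add-one (Laplace) smoothing."""
    class_counts, word_counts, vocab = model
    total_docs = sum(class_counts.values())
    best_label, best_score = None, -math.inf
    for label, n_docs in class_counts.items():
        score = math.log(n_docs / total_docs)
        total_words = sum(word_counts[label].values())
        for tok in text.lower().split():
            if tok in vocab:  # words never seen in training are ignored
                count = word_counts[label][tok] + 1
                score += math.log(count / (total_words + len(vocab)))
        if score > best_score:
            best_label, best_score = label, score
    return best_label


model = train_nb(project_labels(PARALLEL))
```

Note that no target-language lexicon is needed: the classifier learns target-language cues directly from the projected labels, which is what lets sentence context compensate for word-level translation ambiguity.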
Chapter 3
A Supervised Approach to Overall Opinion Polarity Analysis

Abstract

In this Chapter we present a new supervised approach for evaluating the overall opinion polarity (OvOP) of a set of documents written in the Italian language. The proposed method is based on two different supervised learners: a Naïve Bayes classifier and an SVM classifier. The set of features introduced in order to represent the input documents includes several contributions previously presented in literature for the English language. During the experimental activity we tried to evaluate and adapt the set of selected features to the specific context of the Italian language. The collected results provide a good evaluation of the effectiveness of our approach for the Italian language.

3.1 Introduction

The amount of product reviews freely available online has grown continuously over the last ten years: the Internet has become the best means of communication used by people to express their opinions about every kind of product and service. At the same time the Internet represents, nowadays, the most valuable source of information, and more specifically of subjective information, for each customer interested in buying a product.

It is interesting to notice how product reviews can be used in different ways by users, according to their specific information needs: for example, as described in [71], a customer who is already interested in a certain product may want to read some negative reviews just to pinpoint possible drawbacks, but has no interest in spending time reading positive reviews. In contrast, customers interested in watching a good movie may want to read reviews that express a positive overall opinion polarity.
However, the abundance of such reviews may at the same time become an issue; reading each review in order to evaluate its subjectivity and, moreover, the polarity it conveys is a complex and time-consuming task. Tools aimed at automatically determining the polarity of a given review according to its contents are required; for each review a polarity value, expressed as positive or negative, is required in order to represent the orientation it bears. Such a value is defined as the Overall Opinion Polarity (OvOP): the polarity that is assigned to the opinion expressed in a document seen as a whole. The process aimed at determining the OvOP of a document is referred to as OvOP analysis (also referred to as OvOP classification or OvOP identification [71]).

OvOP analysis, according to the taxonomy introduced in Chapter 1, is classified as a subtask of Sentiment Analysis with a document-level granularity. In this Chapter we design and implement a set of binary supervised classifiers aimed at distinguishing between positively and negatively oriented product reviews. For example, given the following user-generated review concerning an MP3 player:

Il prezzo di questo oggetto è attualmente di circa 130 euro, ed il design e le funzionalità lo rendono un ottimo regalo per tutti. A me piace molto soprattutto per le sue funzionalità ma lo apprezzo molto anche per la durata della batteria, che garantisce ben 12 ore di musica in riproduzione continua.

[This item currently costs about 130 euros, and its design and features make it an excellent gift for anyone. I like it a lot, above all for its features, but I also greatly appreciate its battery life, which guarantees a full 12 hours of continuous music playback.]

we are interested in automatically identifying that the OvOP expressed by the document is positive.

3.2 Related Work

3.2.1 Turney

In [82] the author presents an unsupervised algorithm aimed at classifying reviews as “recommended” or “not recommended” based on the average semantic orientation of the phrases they contain.
The semantic orientation of a phrase, described in detail in Section 5.2.2, is defined as the mutual information between the phrase and the word “excellent” minus the mutual information between the phrase and the word “poor”. Mutual information is calculated according to the PMI-IR metric, introduced in [83], based on the number of hits returned by the AltaVista search engine when the NEAR operator is used in query formulation. The semantic orientation of a phrase indicates its polarity (positive vs. negative) and, at the same time, the strength of the opinion it conveys.

Phrases are extracted from reviews by using a set of five extraction rules based on POS tagging; the extraction rules are aimed at identifying occurrences of adjectives and adverbs, when used together or in association with common and proper nouns. The semantic orientation of a document is calculated as the average semantic orientation of its phrases.
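The definition above reduces to a log-ratio of hit counts, since the difference of the two mutual information terms cancels the hit count of the phrase itself. The following is a minimal sketch, assuming the hit counts have already been obtained from a search engine; the function names, the smoothing constant eps and the decision threshold of zero are assumptions made for illustration.

```python
import math
from typing import List


def semantic_orientation(
    near_excellent: float,
    near_poor: float,
    hits_excellent: float,
    hits_poor: float,
    eps: float = 0.01,
) -> float:
    """PMI(phrase, 'excellent') - PMI(phrase, 'poor') from raw hit counts.

    near_excellent / near_poor: hits for the phrase NEAR each reference word.
    hits_excellent / hits_poor: hits for each reference word alone.
    eps: small smoothing constant (an assumption) avoiding division by zero.
    """
    numerator = (near_excellent + eps) * hits_poor
    denominator = (near_poor + eps) * hits_excellent
    return math.log2(numerator / denominator)


def classify_review(phrase_orientations: List[float]) -> str:
    """Average the phrase orientations and threshold the result at zero."""
    average = sum(phrase_orientations) / len(phrase_orientations)
    return "recommended" if average > 0 else "not recommended"
```

With equal hit counts for the two reference words, a phrase seen four times as often near “excellent” as near “poor” gets an orientation of about 2, since the base-2 logarithm of 4 is 2.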
The algorithm has been tested on 410 product reviews covering different domains, including a set of 120 reviews concerning movies. 59% of the reviews constituting the test set are positive; the results show an average accuracy of 74.39% across the different domains. The movie domain presents the lowest performance, with an accuracy of 65.83%; this result is in contrast with the average accuracy obtained for the car domain (84.00%) and for the bank domain (80.00%).

The author shows, in particular, how this domain-independent approach presents some limitations when the same word conveys different orientations in different domains. An example is the word “unpredictable”, which is considered positive in a movie review context (“unpredictable plot”) but negative in a car review (“unpredictable steering”).

3.2.2 Pang et al.

In [63] the authors present the first application of machine-learning-based techniques aimed at determining the OvOP of a set of movie reviews, a domain already exploited in [84] by means of the Pointwise Mutual Information approach. Movie reviews collected from the web are classified as positive or negative according to the rating indicator assigned by their authors; in particular, the star rating provided by reviewers to summarize the sentiment of each review has been considered. The corpus of movie reviews collected by the authors represents the Gold Standard for the evaluation of OvOP classification systems; many works, including [71], use the corpus in order to evaluate the performance of the proposed approaches.

The first assumption the authors prove wrong is the idea that a few words expressing strong sentiment are enough to classify documents according to their OvOP. In order to test the assumption, a list of words bearing polarity orientation has been compiled by two different human annotators.
Classification based only on the presence of the identified words presents, however, relatively poor results, varying from 55% to 65%, partly due to the low coverage offered by the lists of words collected by the human annotators, each limited to 20 terms. The results show how sentiment-carrying words are not enough to perform sentiment classification in a proficient way.

A third list constituted by 7 positive and 7 negative words and symbols (exclamation and question marks are included in order to describe negative documents) has been collected by looking at the polarity clues occurring most frequently in the input corpus. The list has been used for OvOP classification, leading to an accuracy of 69%.

A machine learning approach is defined in order to improve the effectiveness of the OvOP classification process. Each document is represented as a feature vector; the set of features used in each experimental activity is reported in Table 3.1. Three different learning methods for OvOP classification are exploited: Naïve Bayes (NB), Maximum Entropy (ME) classification and Support Vector Machines (SVM).

The set of unigrams used in feature representation has been filtered in order
