Sentiment Mining
Prof. Maurice Mulvenna
University of Kassel
14 December 2011

Outline
§ Ulster
§ What is Sentiment Mining
§ Why Sentiment Mining
§ Challenges
§ Methods
§ Data Sources
§ Applications
§ Examples
§ Simple Keyword-based Prototype
§ Some Results

The Right Choice
COLERAINE, JORDANSTOWN, MAGEE, BELFAST
FOUR CAMPUSES - ONE UNIVERSITY

• Largest university in Ireland: over 25,500 local, national and international students
• International reputation in research
• “Excellence” in teaching
• Graduate employment well above national average
• Excellent study and recreational facilities
University of Ulster in top 10 UK universities for applications

What to Study
Around 600 degree programmes:
Computing and Multimedia
Electrical and Mechanical Engineering
Humanities/Performing Arts
Life and Health Sciences
Social Sciences
Art, Design and Built Environment
Business and Management

Faculty of Computing and Engineering
Within the Faculty there are:
§ 5 Schools
§ Approximately 3000 students
§ 200 staff
§ Extensive specialist facilities on the Coleraine, Jordanstown and Magee Campuses

What is Sentiment Mining
§ Also referred to as sentiment analysis or opinion mining
§ It refers to the application of natural language processing, computational linguistics, and text analytics to identify and extract subjective information in source materials. (Wikipedia)
§ Its aim is to determine
    the attitude or mood of a user or user group (e.g. happy or sad)
    the contextual polarity of statements or larger documents (e.g. positive or negative)
    the intended emotional communication (e.g. sarcasm or irony)

Why Sentiment Mining
§ Capture and analyse public opinion
§ Capture the word-of-mouth effect
§ Evaluate the social profile of individuals
§ News detection and analysis
§ Quantify the emotional state of users (e.g. duress, stress, sadness, anger, etc.)
§ Feedback mechanism to e.g. policy makers
§ National (e.g. UK riots) and
§ International (الربيع العربي or ‘Arab Spring’) events that impact and resonate in people’s daily lives

Challenges
§ Sentiment is a subjective measure and as such is subject to interpretation
§ Data volumes
    Number of statements, users, documents, etc.
    Size and complexity of documents (topic, sentence, paragraph, chapter, document level)
§ Noise and unstructured data
§ Slang, vernaculars, abbreviations (e.g. wdc, cu, ru, lol, etc.)
§ Language heterogeneity
    Demographic dependencies
    Social dependency
§ Ambivalence
§ Complexity of NLP tasks

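The abbreviation problem listed above is often handled by a normalisation step before any sentiment scoring. The following is a minimal sketch of that idea, not part of the talk: the `ABBREVIATIONS` table and the `normalise` function are hypothetical, and the expansions given for "cu", "ru" and "lol" are assumed readings. "wdc" is left untouched because its meaning is not stated.

```python
# Hypothetical lookup table; the expansions are assumed readings,
# not taken from the slides.
ABBREVIATIONS = {
    "cu": "see you",
    "ru": "are you",
    "lol": "laughing out loud",
}


def normalise(text):
    """Lower-case the text and expand known abbreviations.

    Splits on whitespace only, so a token with attached
    punctuation (e.g. "lol!") is left unchanged.
    """
    words = text.lower().split()
    return " ".join(ABBREVIATIONS.get(word, word) for word in words)


print(normalise("RU ok lol"))
```

Unknown tokens pass through unchanged, so the step can be applied safely to any input before keyword matching or classification.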
Methods
§ Keyword-based approaches
§ Machine learning techniques
    Latent semantic analysis
    Support vector machines
    "Bag of words" methods
    Naive Bayes classifiers
    Other NLP tools that allow the detailed parsing of text-related sources, including the underlying grammar.

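To make the keyword-based approach concrete, here is a minimal sketch under stated assumptions. It is not the prototype mentioned in the outline: the word lists `POSITIVE`, `NEGATIVE` and `NEGATORS` and the function `keyword_polarity` are hypothetical, and the one-word negation rule is a simplification chosen for illustration.

```python
import re

# Hypothetical mini-lexicons; a real system would need far larger,
# curated word lists.
POSITIVE = {"good", "great", "happy", "love", "excellent"}
NEGATIVE = {"bad", "sad", "hate", "terrible", "angry"}
NEGATORS = {"not", "no", "never"}


def keyword_polarity(text):
    """Return (score, label) by counting sentiment keywords in text."""
    tokens = re.findall(r"[a-z']+", text.lower())
    score = 0
    for i, token in enumerate(tokens):
        # +1 for a positive keyword, -1 for a negative one, 0 otherwise.
        value = (token in POSITIVE) - (token in NEGATIVE)
        # Flip the contribution when the preceding token is a negator.
        if value and i > 0 and tokens[i - 1] in NEGATORS:
            value = -value
        score += value
    if score > 0:
        label = "positive"
    elif score < 0:
        label = "negative"
    else:
        label = "neutral"
    return score, label


print(keyword_polarity("I love this great film"))
print(keyword_polarity("This is not good"))
```

A scorer of this kind is easy to build and inspect, but it has no notion of sarcasm, irony or context, which is one reason the machine learning techniques listed above are used as well.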
Data Sources
§ Any single document or document collection (e.g. reviews of any kind: travel, food, movie, etc.)
§ Social media networks (e.g. Twitter)
§ Spoken communication (either directly or after converting it into a textual representation)
→ Any source in which an opinion or emotion is expressed or communicated

Applications
§ Reputation Management
§ Customer Profiling
§ Product Management
§ News Detection and Analysis
§ Public Opinion Analysis
§ Affective Computing, where systems should interpret the emotional state of users and adapt their behaviour accordingly, providing an appropriate response to the emotions detected.

The essence of the book is Lanier's attempt to answer the question: "What happens when we stop shaping technology and technology starts shaping us?"