
Social media week


17 slides from the session on September 25th forming part of Social Media Week Glasgow

Published in: Technology, Business

  1. Sentiment Analysis #SMWsentiment
     Tuesday 25th September, 2-3pm
     Stephen Tagg & Jillian Ney
  2. Workshop Overview
       1. Have you checked in (& on Foursquare)? Backchat on #SMWsentiment…
       2. A little introduction, definitions
       3. Sentiment Analysis issues
       4. An example using free software
       5. Other free software
       6. Other software
       7. Workshop Discussion, Q&A
  3. Definitions
     • Text Analysis – a bit more specific
     • Opinion Mining – not quite the same, but there is overlap – will it make surveys less relevant?
     • “Sentiment analysis or opinion mining refers to the application of natural language processing, computational linguistics, and text analytics to identify and extract subjective information in source materials. Generally speaking, sentiment analysis aims to determine the attitude of a speaker or a writer with respect to some topic or the overall contextual polarity of a document. The attitude may be his or her judgement or evaluation (see appraisal theory), affective state (that is to say, the emotional state of the author when writing), or the intended emotional communication (that is to say, the emotional effect the author wishes to have on the reader).” (Wikipedia)
     • Appraisal theory – psychological theories of emotion
  4. Sentiment analysis and Web 2.0
     • The rise of social media such as blogs and social networks has fuelled interest in sentiment analysis. With the proliferation of reviews, ratings, recommendations and other forms of online expression, online opinion has turned into a kind of virtual currency for businesses looking to market their products, identify new opportunities and manage their reputations. As businesses look to automate the process of filtering out the noise, understanding the conversations, identifying the relevant content and actioning it appropriately, many are now looking to the field of sentiment analysis. If Web 2.0 was all about democratizing publishing, then the next stage of the web may well be based on democratizing data mining of all the content that is getting published.
     • One step towards this aim is accomplished in research. Several research teams in universities around the world currently focus on understanding the dynamics of sentiment in e-communities through sentiment analysis. The CyberEmotions project, for instance, recently identified the role of negative emotions in driving social network discussions. Sentiment analysis could therefore help understand why certain e-communities die or fade away (e.g., MySpace) while others seem to grow without limits (e.g., Facebook).
     • The problem is that most sentiment analysis algorithms use simple terms to express sentiment about a product or service. However, cultural factors, linguistic nuances and differing contexts make it extremely difficult to turn a string of written text into a simple pro or con sentiment. The fact that humans often disagree on the sentiment of text illustrates how big a task it is for computers to get this right. The shorter the string of text, the harder it becomes.
  5. Sentiment Analysis issues: Hype
     • Gartner hype cycle – text analysis is in a good place – after the initial hype.
     • More avenues for research
         – General Sentiment, Inc
         – Attensity
         – Lexalytics
         – Telligent Systems
         – CyberEmotions project
  6. An example
     • 31,000+ hotel evaluations from Dubai (thanks to Prof Alan Wilson).
     • Applied the R tm package. R is a statistical package, and tm is a text-mining package available for it.
     • Considerable time was spent learning how to read in the data correctly.
     • sentiment is another R package, used to generate a sentiment analysis of texts.
     • Hotel evaluations need feature/aspect-based sentiment analysis.
     • Machine learning – latent semantic analysis, support vector machines, bag of words, semantic orientation.
  7. First three cases
     Case 1
       ID: 24760659
       Hotel: Ibis World Trade Centre Dubai Hotel
       Date of Review: 07-MAY-2010
       Data source: agoda
       Title: OK if your work is at the world trade center
       Body: PROS: - Access to WTC - Not far from anywhere. CONS: - No in-room safe - No mini bar in room - No comp water - Rather expensive internet access - poor information at lobby. . Upon check in they asked me for a cash deposit which was higher than the entire hotel bill and informed in advance that the change will not be in US $ but in local currency. Anyway did a credit card inprint which satisfied them. No bell boy service, you carry all your luggage yourself; travel light. Breakfast is ok, and with a few chages day by day.
     Case 2
       ID: 24761560
       Hotel: Ramada Continental Hotel, Dubai
       Date of Review: 28-MAY-2010
       Data source: agoda
       Title: Conveniently located with all the facilities
       Body: PROS: - Convenient location: very near to airport, shopping centres - Hotel facilities are superb - Good restaurants and entertainment centres - Easy to take taxis. CONS: - problems with the room key card, which get spoilt fast and need to be repalced at least once a day. . . If you go to Dubai, I would definately recommend this hotel. It is conveniently located and has all the facilities. The hotel is clean and the staff are friendly. It has a wide choice of restaurants and entertainment centres. It has some shopping centres nearby and a metro station is currently being built very near to this hotel. The airport is only about 15 mins drive from this hotel
     Case 3
       ID: 25658726
       Hotel: Nihal Hotel, Dubai
       Date of Review: 01-MAY-2010
       Data source: agoda
       Title: Nihal Hotel Dubai with
       Body: PROS: - Cozy - Inexpensive - Good service - Nice disco (Filipino disco) - 450 Meter from Dubai metro - Clean rooms . CONS: - Small - No swimming pool/Gym - Rooms doors are with old-style locks. . I like the place so much as it is inexpensive, cozy, in the down town, has a nice disco (which i like too much), has clean rooms, and near to Dubai metro.
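The review bodies above follow a loose "PROS: - … CONS: - … . . free text" layout, which is one reason aspect-based analysis suits this data. The following is a minimal Python sketch of splitting a body into those parts. It is not the R/tm workflow used in the workshop; the function names and the regular expressions are assumptions made for illustration, and the sample text is case 1 with the middle sentences left out.

```python
import re

# Assumed layout: "PROS: <items> CONS: <items>. . <free-text comment>"
SECTION_RE = re.compile(
    r"PROS:(?P<pros>.*?)CONS:(?P<cons>.*?)\.\s*\.(?P<rest>.*)", re.S
)


def split_items(text):
    """Split ' - item - item' into a list; hyphenated words stay intact."""
    parts = re.split(r"(?:^|\s)-\s", text)
    return [p.strip(" .") for p in parts if p.strip(" .")]


def split_pros_cons(body):
    """Return the pros, cons and free-text comment found in a review body."""
    match = SECTION_RE.search(body)
    if match is None:
        # Body does not follow the assumed layout: treat it all as comment.
        return {"pros": [], "cons": [], "comment": body.strip()}
    return {
        "pros": split_items(match.group("pros")),
        "cons": split_items(match.group("cons")),
        "comment": match.group("rest").lstrip(" .").rstrip(),
    }


# Case 1 from the slide, abridged (middle sentences of the comment omitted).
example = (
    "PROS: - Access to WTC - Not far from anywhere. "
    "CONS: - No in-room safe - No mini bar in room - No comp water "
    "- Rather expensive internet access - poor information at lobby. . "
    "Breakfast is ok, and with a few chages day by day."
)

parsed = split_pros_cons(example)
print(parsed["pros"])
print(parsed["cons"])
print(parsed["comment"])
```

Each pro or con item could then be scored separately, so that a hotel's location and its internet access, say, do not cancel each other out in a single overall score.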
  8. Terms/roots occurring at least 1,000 times
     findFreqTerms(bodytdm, 1000)
     "access" "airport" "amaz" "apart" "arriv" "avail" "bad" "bar"
     "bathroom" "beach" "beauti" "bed" "bedroom" "bit" "book" "breakfast"
     "buffet" "burj" "busi" "call" "car" "charg" "check" "choic"
     "citi" "clean" "close" "club" "comfort" "comfortable" "cons" "cost"
     "day" "definit" "desk" "door" "drink" "dubai" "easi" "emir"
     "enjoy" "especi" "etc" "excel" "excellent" "expect" "expens" "experi"
     "extra" "extrem" "facil" "famili" "fantast" "feel" "floor" "food"
     "found" "free" "friend" "friendly" "front" "guest" "help" "helpful"
     "holiday" "hot" "hotel" "hour" "huge" "includ" "internet" "kitchen"
     "like" "littl" "locat" "location" "look" "lot" "love" "main"
     "mall" "manag" "metro" "minut" "money" "near" "nice" "night"
     "offer" "park" "pay" "peopl" "perfect" "pool" "poor" "price"
     "pros" "provid" "qualiti" "quiet" "rate" "reason" "recept" "recommend"
     "relax" "restaur" "return" "road" "servic" "service" "shop" "shower"
     "shuttl" "size" "spacious" "special" "staff" "standard" "star" "station"
     "stay" "suit" "swim" "taxi" "time" "told" "towel" "travel"
     "tri" "trip" "upgrad" "valu" "view" "visit" "wait" "walk"
     "water" "wonder" "worth"
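To show what a frequent-terms query like the one on this slide does, here is a small Python sketch of the same idea: tokenise, drop stopwords, reduce words to rough roots, count, and keep terms at or above a threshold. It is an illustration under stated assumptions, not the tm implementation: the stopword list, the `crude_stem` helper and the three toy documents are all invented for the example, and the suffix stripping is far cruder than a real stemmer.

```python
import re
from collections import Counter

# Tiny illustrative stopword list (an assumption, not tm's list).
STOPWORDS = {"the", "a", "and", "is", "to", "of", "in", "it", "has", "are", "was"}


def crude_stem(word):
    """Very rough suffix stripping; a stand-in for a real stemmer."""
    for suffix in ("ing", "ed", "es", "s"):
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def term_counts(documents):
    """Count stemmed, non-stopword terms across all documents."""
    counts = Counter()
    for doc in documents:
        for token in re.findall(r"[a-z]+", doc.lower()):
            if token not in STOPWORDS:
                counts[crude_stem(token)] += 1
    return counts


def find_freq_terms(counts, lowfreq):
    """Return terms whose total count is at least lowfreq, sorted A-Z."""
    return sorted(term for term, n in counts.items() if n >= lowfreq)


# Toy documents, invented for the example.
docs = [
    "The hotel is clean and the staff are friendly.",
    "Clean rooms, friendly staff, near the metro.",
    "Hotel rooms are small but the staff helped us.",
]

counts = term_counts(docs)
print(find_freq_terms(counts, 2))
```

With 31,000+ reviews the threshold on the slide is 1,000; with three toy sentences a threshold of 2 plays the same role.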
  9. Sentiment analysis results
     Hotel * BEST_FIT Crosstabulation. Cells show Count (% within Hotel).

     Hotel                                      | anger      | disgust    | fear       | joy        | sadness    | surprise   | Total
     ABC Arabain Suites                         | 0 (0.0%)   | 0 (0.0%)   | 0 (0.0%)   | 1 (100.0%) | 0 (0.0%)   | 0 (0.0%)   | 1 (100.0%)
     Abu Dhabi Gulf Hotel                       | 0 (0.0%)   | 0 (0.0%)   | 0 (0.0%)   | 0 (0.0%)   | 1 (100.0%) | 0 (0.0%)   | 1 (100.0%)
     Admiral Plaza Hotel, Dubai                 | 8 (6.7%)   | 0 (0.0%)   | 1 (0.8%)   | 92 (76.7%) | 12 (10.0%) | 7 (5.8%)   | 120 (100.0%)
     Akas-Inn Hotel Apartment                   | 0 (0.0%)   | 0 (0.0%)   | 0 (0.0%)   | 2 (100.0%) | 0 (0.0%)   | 0 (0.0%)   | 2 (100.0%)
     Al Bustan Centre & Residence Hotel, Dubai  | 1 (5.9%)   | 0 (0.0%)   | 1 (5.9%)   | 13 (76.5%) | 1 (5.9%)   | 1 (5.9%)   | 17 (100.0%)
     Al Bustan Rotana Hotel, Dubai              | 11 (14.5%) | 1 (1.3%)   | 0 (0.0%)   | 55 (72.4%) | 5 (6.6%)   | 4 (5.3%)   | 76 (100.0%)
     Al Deyafa Hotel Apartments 3, Dubai        | 0 (0.0%)   | 0 (0.0%)   | 0 (0.0%)   | 9 (81.8%)  | 1 (9.1%)   | 1 (9.1%)   | 11 (100.0%)
     Al Faris Hotel Apartments 1, Dubai         | 0 (0.0%)   | 0 (0.0%)   | 0 (0.0%)   | 0 (0.0%)   | 0 (0.0%)   | 1 (100.0%) | 1 (100.0%)
     Al Faris Hotel Apartments 2, Dubai         | 0 (0.0%)   | 0 (0.0%)   | 1 (9.1%)   | 8 (72.7%)  | 2 (18.2%)  | 0 (0.0%)   | 11 (100.0%)
     Al Jawhara Gardens Hotel, Dubai            | 5 (13.9%)  | 2 (5.6%)   | 1 (2.8%)   | 21 (58.3%) | 4 (11.1%)  | 3 (8.3%)   | 36 (100.0%)
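The "% within Hotel" figures in this crosstabulation are row percentages: each emotion count divided by the hotel's total. As a hedged illustration, the Python sketch below rebuilds one row from per-review labels and recomputes the percentages. The function names are assumptions; the counts are taken from the slide (Al Faris Hotel Apartments 2 and Admiral Plaza Hotel).

```python
from collections import Counter, defaultdict

EMOTIONS = ("anger", "disgust", "fear", "joy", "sadness", "surprise")


def crosstab(records):
    """Tally (hotel, best-fit emotion) pairs into one Counter per hotel."""
    table = defaultdict(Counter)
    for hotel, emotion in records:
        table[hotel][emotion] += 1
    return table


def row_percentages(row):
    """Percentage of each emotion within one hotel, to one decimal place."""
    total = sum(row.values())
    return {
        emotion: (round(100 * row[emotion] / total, 1) if total else 0.0)
        for emotion in EMOTIONS
    }


# One label per review, matching the slide's counts for this hotel:
# fear 1, joy 8, sadness 2 (11 reviews in total).
hotel = "Al Faris Hotel Apartments 2, Dubai"
records = [(hotel, "fear")] + [(hotel, "joy")] * 8 + [(hotel, "sadness")] * 2

al_faris = row_percentages(crosstab(records)[hotel])
print(al_faris)

# The same calculation from ready-made counts (Admiral Plaza Hotel, Dubai).
admiral = row_percentages(Counter(dict(zip(EMOTIONS, [8, 0, 1, 92, 12, 7]))))
print(admiral)
```

Note how hotels with only one or two reviews produce 100% rows, which says more about sample size than about the hotel.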
  10. Sentiment Analysis issues: Indicators
     • The R sentiment package uses either Janyce Wiebe’s subjectivity lexicon to classify polarity (+, 0, -) or “WordNet-Affect” to generate scores for anger, disgust, fear, joy, sadness and surprise.
     • You may need your own lexicon.
     • Remove objective statements.
     • Use machine learning and artificial intelligence to address ambiguity, multiple meanings, context…
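The lexicon idea on this slide can be sketched in a few lines of Python. This is only an illustration of lexicon lookup by simple word counting; it is not necessarily how the R package scores texts, and the two tiny lexicons below are invented for the example. They are not Wiebe's subjectivity lexicon or WordNet-Affect.

```python
import re
from collections import Counter

# Hypothetical mini-lexicons, made up for this sketch.
POLARITY_LEXICON = {
    "clean": 1, "friendly": 1, "superb": 1, "good": 1,
    "poor": -1, "expensive": -1, "problems": -1, "small": -1,
}
EMOTION_LEXICON = {
    "love": "joy", "superb": "joy", "friendly": "joy",
    "angry": "anger", "rude": "anger", "dirty": "disgust",
    "afraid": "fear", "disappointed": "sadness", "unexpected": "surprise",
}


def tokens(text):
    """Lower-case the text and keep alphabetic words only."""
    return re.findall(r"[a-z]+", text.lower())


def polarity_label(text):
    """Sum word scores and map the total to '+', '0' or '-'."""
    score = sum(POLARITY_LEXICON.get(tok, 0) for tok in tokens(text))
    if score > 0:
        return "+"
    if score < 0:
        return "-"
    return "0"


def best_fit_emotion(text):
    """Return the emotion with the most lexicon hits, or None if no hits."""
    hits = Counter(
        EMOTION_LEXICON[tok] for tok in tokens(text) if tok in EMOTION_LEXICON
    )
    return hits.most_common(1)[0][0] if hits else None


print(polarity_label("The hotel is clean and the staff are friendly."))
print(polarity_label("Rather expensive internet access - poor information at lobby."))
print(best_fit_emotion("The hotel is clean and the staff are friendly."))
```

A sentence with no lexicon words comes out as "0" with no emotion, which is why a domain-specific lexicon (hotel vocabulary, in this case) may be needed.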
  11. Other free software
     • GATE – an academic computer science tool that aims to allow sophisticated processing.
     • RapidMiner – built on Java. Assumes you’ve got maybe 20 or 200 texts (not 31,000), so rapid it wasn’t!
     • You’ll struggle to acquire data and process it, and then you need to be able to summarise and interpret it!
  12. Sentiment Analysis as an add-on to social media metrics
     • Brandwatch – sentiment analysis – based on machine learning – select an industry for the query.
     • Alterian SM2 – part of dictionary – includes emoticons
     • Radian6 – includes automated sentiment analysis and a Clarabridge sentiment analysis
     • Other options on the resources hand-out
  13. Agenda for participants
     • Your experience and interests
         – What sentiment analysis have you seen/used?
         – What could sentiment analysis contribute?
     • Your concerns, barriers to use
         – Privacy issues
         – Effort/cost of acquiring data and doing it yourself
         – Trusting a third party to do sentiment analysis – judging their offerings – cost and value for money
  14. Workshop activities
     • What do people say about
         – your brand,
         – your company/organisation?
     • How do they feel?
     • What UGC (User Generated Content) do you contribute to social media?
  15. Other developments
     • What do you say about…? Scottish independence issues – effect of other forum content (are Scots overwhelmed by so many English on the BBC discussion forum?)
     • I’m always on the lookout for data – but I can take months/years!