SlideShare a Scribd company logo
Opinion mining
for social media

    Diana Maynard
University of Sheffield, UK
Who Wants to be a Millionaire?




Ask the audience?



          Or Phone a Friend?
How did the Royal Wedding influence
          the UK's health?
Can Twitter help us predict the stock market?




“Hey Jon, Derek in Scunthorpe's having a bacon and egg butty. Is that
good for wheat futures?”
Can Twitter predict earthquakes?
Outline

• Introduction - social media, opinion mining
  and the Arcomem project
• What is opinion mining and why is it hard?
• Current research in the Arcomem project
• What does the future hold?
The Social Web


Information, thoughts
and opinions are shared
prolifically these days on
the social web
Drowning in information

• It can be difficult to get the
  relevant information out of such
  large volumes of data in a useful
  way
• Social web analysis is all about the
  users who are actively engaged
  and generate content
• Social networks are pools of a
  wide range of articulation
  methods, from simple "I like it"
  buttons to complete articles
Opinion Mining

• Along with entity, topic and
  event recognition, opinion
  mining forms the
  cornerstone for social web
  analysis
What is Opinion Mining?

• OM is a relatively recent discipline that studies the extraction of
  opinions using IR, AI and/or NLP techniques.
• More informally, it's about extracting the opinions or sentiments
  given in a piece of text
• Also referred to as Sentiment Analysis (though technically this is a
  more specific task)
• Web 2.0 nowadays provides a great medium for people to share
  things.
• This provides a great source of unstructured information (especially
  opinions) that may be useful to others (e.g. companies and their
  rivals, other consumers, governments...)
It's about finding out what people
               think...
Opinion Mining is Big Business

●
    Someone who wants to buy a camera
    ●
        Looks for comments and reviews
●
    Someone who just bought a camera
    ●
        Comments on it
    ●
        Writes about their experience
●
    Camera Manufacturer
    ●
        Gets feedback from customer
    ●
        Improve their products
    ●
        Adjust Marketing Strategies
Analysing and preserving opinions
• We want to collect, store and later retrieve public opinions about events
  and their changes or developments over time
• Not only can online social networks provide a snapshot of such
  situations, but they can actually trigger a chain of reactions and events
• Ultimately these events might led to societal, political or administrative
  changes
• Since the Royal Wedding, Pilates classes have become incredibly popular
  in the UK solely as a result of social media.
On a more serious note...
●
    We can measure the impact of a political entity or event on the overall
    sentiment about another entity or event, over the course of time.
●
    Aggregation of opinions over entities and events to cover sentences and
    documents
●
    Combined with time information and/or geo-locations, we can then
    analyse changes to opinions about the same entity/event over time, and
    other statistical correlations
But there are lots of tools that do this
                       already....
• Here are some examples:
   – Twitter sentiment http://twittersentiment.appspot.com/
   – Twends: http://twendz.waggeneredstrom.com/
   – Twittratr: http://twitrratr.com/
   – SocialMention: http://socialmention.com/
Why not use existing sentiment apps?

• Easy to search for opinions about famous people, brands and
  so on
• Hard to search for more abstract concepts, perform a non-
  keyword based string search
   – e.g. to find opinions about Lady Gaga's' dress, you can only
     search on “Lady Gaga” to get hits
• And the opinion finding they do isn't very good...
Why are these sites unsuccessful?

• They don't work well at more than a very basic level
• They mainly use dictionary lookup for positive and negative
  words
• They classify the tweets as positive or negative, but not with
  respect to the keyword you're searching for
• First, the keyword search just retrieves any tweet mentioning
  it, but not necessarily about it as a topic
• Second, there is no correlation between the keyword and the
  sentiment: the sentiment refers to the tweet as a whole
• Sometimes this is fine, but it can also go horribly wrong
Whitney Houston wasn't very popular...
Or was she?
Negative tweets about Whitney?

• the #nationalenquirers cover photo of
  #whitneyhouston lying in a casket has sparked
  outrage in the #media world
• feeling bad for #whitneyhouston right now
• It's such a shame #whitneyhouston is dead
What did people think of the Olympics?
Twittrater's view of the Olympics

• A keyword search for Olympics shows exactly
  how existing systems fail to cut the mustard
• Lookup of sentiment words is not enough if
  – they're part of longer words
  – they're used in different contexts
  – the tweet itself isn't relevant
  – they're used in a negative or sarcastic sentence
  – they're ambiguous
Opinion mining for social media
What's the capital of Spain?



    A: Barcelona

    B: Madrid

    C: Valencia

    D: Seville
What's the height of Mt Kilimanjaro?


         A: 19,341 ft

         B: 23,341 ft

         C: 15,341 ft

         D: 21,341 ft
Go for the majority or trust an expert?
• It depends what kind of question you're asking
• In Who Wants to Be a Millionaire, people tend to ask the audience
  fairly early on, because once the questions get hard, they can't rely
  on the audience getting it right
• Similarly with opinion mining, you have to know who to trust

What's the height of Mt              What's the capital of Spain?
Kilimanjaro?

  A: 19,341 ft                       A: Barcelona
  B: 23,341 ft                       B: Madrid
  C: 15,341 ft                       C: Valencia
  D: 21,341 ft                       D: Seville
Arcomem project

• EU-funded research project involving memory institutions like
  archives, museums, and libraries in the age of the Social Web.
• Arcomem aims to
   – help transform archives into collective memories that are
     more tightly integrated with their community of users
   – exploit Social Web and the wisdom of crowds to make
     Web archiving a more selective and meaning-based
     process
• http://vimeo.com/45954490
Key questions
• What are the opinions on crucial social events and who are the
  main protagonists?
• How are these opinions distributed in relation to demographic user
  data?
• How have these opinions evolved?
• Who are the opinion leaders?
• What is their impact and influence?
ETOEs detection in Arcomem
• ETOES - Entities, Topics, Opinions and Events
• ETOEs detection is used to guide the crawler and to add useful
  metadata to the information stored for future search
• Opinion mining builds on entities, terms and events identified,
  finding relevant opinions about them
• Text-based sentiment analysis performed by GATE tools (USFD) in
  English and German
• Text-based analysis is then combined with topic detection to
  produce clustered opinions over documents
• Image- and video-based opinion mining also plays a supporting
  role, e.g. match new images against existing images which have
  been manually annotated as positive or negative
ETOEs Workflow
                                             Relation
                                            Extraction




  Linguistic                    Entity
Preprocessing                 Extraction                 Event Extraction




            Word Sense                                       Topic Identification
                                      Opinion Mining
           Determination




                                     Dynamics
                                     Capturing
Integration Architecture
Main Research Directions

    New techniques for opinion mining based around entities and events in
    social media - different from typical product reviews.
    
        Approach involves building up a number of linguistic subcomponents
    
        Includes techniques for multimedia opinion analysis
    
        Investigating the reusability of these components for different
        domains, document types, languages etc (Knowledge Transfer)
    
        Experiments on cross- domain/language adaptation
• Aggregation of opinions over a document
    
        Extension of work from LivingKnowledge project on drilling down
        into subtopics about specific entities, rather than just getting
        information about general topics
    
        Provide summaries of opinions for multiple occurrences of entities in
        a document
Text-based Opinion Mining in Arcomem

• Performed basic sentiment analysis on various English and
  German datasets
• Documents may be in multiple languages, e.g. Greek crisis
  documents contain both English and Greek sentences
• Entities and events used as seeds for opinion objects
• Experimenting with sentiment lexicons for initial lookup -
  currently SentiWordNet (English), SentiWS (German) and
  additions automatically derived from test corpora
• Components for detection of negatives, swear words etc
• Output is an opinion feature on the entity/event, plus polarity
  and score features where appropriate
Opinion Mining sub-components
• Top level sub-components
    
      source and target identification (entity/event-based)
    
      opinion-feature association (uses linguistic relations)
• Dealing with the complexities of natural language
    
        incorrect English
    
        negatives
    
        conditional statements
    
        slang/swear words/
    
        irony/sarcasm
    
        spam opinion detection
What can we realistically hope to achieve?
●
    Opinion mining is hard, and it's harder because:
     ●
       we have lots of different text types and domains.
    ●
        we're processing social media
    ●
        we're processing multiple language
    ●
        we don't necessarily know what we're looking for
●
    Experimenting with different techniques and subcomponents
    to build up a complex OM system
●
    Experimenting with techniques for cross-language adaptation
GATE (General Architecture for Text
                  Engineering)
●
    GATE is a tool for Language Engineering in development in Sheffield
    since 2000.
●
    GATE includes:

  – components for language processing, e.g. parsers, machine
    learning tools, stemmers, IR tools, IE components for various
    languages...
  – tools for visualising and manipulating text, annotations,
    ontologies, parse trees, etc.
  – various information extraction tools
  – evaluation and benchmarking tools
• More info and freely available at http://gate.ac.uk
Applications
• Developed a series of initial applications for opinion mining
  from social media using GATE
• Based on previous work identifying political opinions from
  tweets
• Extended to more generic analysis about any kind of entity or
  event, in 2 domains
   – Greek financial crisis
   – Rock am Ring (German rock festival)
• Uses a variety of social media including twitter, facebook and
  forum posts
• Based on entity and event extraction
Machine learning approach
    
        ML: automating the process of inferring new data from existing data
    
        In GATE, that means creating annotations or adding features to
        annotations by learning how they relate to other annotations
    
        For example, we haveToken annotations with string features and Product
        annotations
●
        ML could learn that a Product close to the Token “stinks” expresses a
        negative sentiment, then add a polarity =“negative” feature to the
        Sentence.
          The     new     Acme       Model         33         stinks     !
         Token   Token    Token      Token        Token       Token    Token
                                      Product
                                      Sentence
Rule-based techniques

• These rely primarily on sentiment dictionaries, plus some
  rules to do things like attach sentiments to targets, or modify
  the sentiment scores
• Examples include:
   – analysis of political tweets (Maynard and Funk, 2011)
   – analysis of opinions expressed about political events and
     rock festivals in social media (Maynard, Bontcheva and
     Rout, 2012)
   – SO-CAL (Taboada et al, 2011) for detecting positive and
     negative sentiment of ePinions reviews on the web.
Why Rule-based?

• Although ML applications are typically used for Opinion
  Mining, this task involves documents from many different text
  types, genres, languages and domains
• This is problematic for ML because it requires many
  applications trained on the different datasets, and methods to
  deal with acquisition of training material
• Aim of using a rule-based system is that the bulk of it can be
  used across different kinds of texts, with only the pre-
  processing and some sentiment dictionaries which are
  domain and language-specific
GATE Application

• Structural pre-processing, specific to social media types
• Linguistic pre-processing (including language detection), NE,
  term and event recognition
• Additional targeted gazetteer lookup
• JAPE grammars
• Aggregation of opinions
• Dynamics
Structural pre-processing on Twitter




Original markups set
Linguistic pre-processing

• Language identification (per sentence) using TextCat
• Standard tokenisation, POS tagging etc using GATE
• Modified versions of ANNIE and TermRaider for NE and term
  recognition
• Event recognition using specially developed GATE application
  (e.g. band performance, economic crisis, industrial strike)
Language ID with TextCat
Basic approach for opinion finding

• Find sentiment-containing words in a linguistic relation with
  entities/events (opinion-target matching)
• Use a number of linguistic sub-components to deal with issues
  such as negatives, irony, swear words etc.
• Starting from basic sentiment lookup, we then adjust the
  scores and polarity of the opinions via these components
Sentiment finding components

• Flexible Gazetteer Lookup: matches lists of affect/emotion
  words against the text, in any morphological variant
• Gazetteer Lookup: matches lists of affect/emotion words
  against the text only in non-variant forms, i.e. exact string
  match (mainly the case for specific phrases, swear words,
  emoticons etc.)
• Sentiment Grammars: set of hand-crafted JAPE rules which
  annotate sentiments and link them with the relevant targets
  and opinion holders
• RDF Generation: create the relevant RDF-XML for the
  annotations according to the data model (so they can be used
  by other components)
Opinions on Greek Crisis Text
Austrian Parliament
• Our partners in the Austrian Parliament have quite specific
  requests for opinion mining on Parliamentary texts (in German):
    
       identification of comments being in favour or against a draft
       bill
    
       if possible, distinguish between the reasons for dislike (e.g.
       because it goes too far, or does not go far enough)
    
       if possible, also distinguish alternative concepts in the
       comments
• These are difficult challenges, mainly to be tackled in the later
  stages of the project as we extend beyond
  positive/negative/neutral opinions
• Different strategies required, e.g. not necessarily starting from
  entities and events, but making more use of topics
Sentiment Finding in Parliamentary docs
Opinion scoring
• Sentiment gazetteers (developed from sentiment words in
  WordNet) have a starting “strength” score
• These get modified by context words, e.g. adverbs, swear
  words, negatives and so on
Opinions about Entities in Rock am Ring
Opinions about sentences in RockamRing
Challenges imposed by social media

• Language: specific pre-processing for Twitter. use shallow
  analysis techniques with back-off strategies; incorporate
  specific subcomponents for swear words, sarcasm etc.
• Relevance: topics and comments can rapidly diverge. Solutions
  involve training a classifier or using clustering techniques
• Target identification: use an entity-centric approach
• Contextual information: use metadata for further information,
  also aggregation of data can be useful
Short sentences, e.g. tweets

• Social media, and especially tweets, can be problematic because
  sentences are very short and/or incomplete
• Typically, linguistic pre-processing tools such as POS taggers and
  parsers do badly on such texts
• Even things like language identification tools can have problems
• The best solution is to try not to rely too heavily on these tools
   – Does it matter if we get the wrong language for a sentence?
   – Do we actually need full parsing?
   – Can we use other clues when POS tags may be incorrect?
Dealing with incorrect English

• Frequent problem in any NLP task involving social media
• Incorrect capitalisation, spelling, grammar, made-up words (eg swear
  words, infixes)
• Some specific pre-processing
• Backoff strategies include
   ●
       using more flexible gazetteer matching
   ●
       using case-insensitive resources (but be careful)
   ●
       avoiding full parsing and using shallow techniques
   ●
       using very general grammar rules
   ●
       adding specialised gazetteer entries for common misspellings, or
       using co-reference techniques
Tokenisation
• Plenty of “unusual”, but very important tokens in social
  media:
   – @Apple – mentions of company/brand/person names
   – #fail, #SteveJobs – hashtags expressing sentiment, person
     or company names
   – :-(, :-), :-P – emoticons (punctuation and optionally letters)
   – URLs
• Tokenisation key for entity recognition and opinion mining
• A study of 1.1 million tweets: 26% of English tweets have a
  URL, 16.6% - a hashtag, and 54.8% - a user name mention
  [Carter, 2013].
Example

#WiredBizCon #nike vp said when @Apple saw what
http://nikeplus.com did, #SteveJobs was like wow I didn't expect
this at all.
  ●
    Tokenising on white space doesn't work that well:
  ●
    Nike and Apple are company names, but if we have tokens such
    as #nike and @Apple, this will make the entity recognition
    harder, as it will need to look at sub-token level
  ●
    Tokenising on white space and punctuation characters doesn't
    work well either: URLs get separated (http, nikeplus), as are
    emoticons and email addresses
The GATE Twitter Tokeniser

• Treat RTs, emoticons, and URLs as 1 token each
• #nike is two tokens (# and nike) plus a separate annotation
  HashTag covering both. Same for @mentions
• Capitalisation is preserved, but an orthography feature is
  added: all caps, lowercase, mixCase
• Date and phone number normalisation, lowercasing, and
  other such cases are optionally done later in separate
  modules
• Consequently, tokenisation is faster and more generic
De-duplication and Spam Removal

• Approach from [Choudhury & Breslin, #MSM2011]:
• Remove as duplicates/spam:
   – Messages with only hashtags (and optional URL)
   – Similar content, different user names and with the same
     timestamp are considered to be a case of multiple
     accounts
   – Same account, identical content are considered to be
     duplicate tweets
   – Same account, same content at multiple times are
     considered as spam tweets
Normalisation
• “RT @Bthompson WRITEZ: @libbyabrego honored?! Everybody
  knows the libster is nice with it...lol...(thankkkks a bunch;))”
• OMG! I’m so guilty!!! Sprained biibii’s leg! ARGHHHHHH!!!!!!
• Similar to SMS normalisation
• For some later components to work well (POS tagger, parser), it
  is necessary to produce a normalised version of each token
• BUT uppercasing, and letter and exclamation mark repetition
  often convey strong sentiment, so we keep both versions of
  tokens
• Syntactic normalisation: determine when @mentions and #tags
  have syntactic value and should be kept in the sentence, vs
  replies, retweets and topic tagging
Irony and sarcasm

• Life's too short, so be sure to read as many articles about celebrity
  breakups as possible.
• I had never seen snow in Holland before but thanks to twitter and
  facebook I now know what it looks like. Thanks guys, awesome!
• On a bright note if downing gets injured we have Henderson to
  come in.
How do you know when someone is
                  being sarcastic?
• Use of hashtags in tweets such as #sarcasm
• Large collections of tweets based on hashtags can be used to
  make a training set for machine learning
• But you still have to know which bit of the tweet is the
  sarcastic bit
To the hospital #fun #sarcasm
Man , I hate when I get those chain letters & I don't resend
them , then I die the next day .. #Sarcasm
lol letting a baby goat walk on me probably wasn't the best idea.
Those hooves felt great. #sarcasm
How else can you deal with it?

• Look for word combinations with opposite polarity, e.g. “rain”
  or “delay” plus “brilliant”
Going to the dentist on my weekend home. Great. I'm totally
pumped. #sarcasm
• Inclusion of world knowledge / ontologies can help (e.g.
  knowing that people typically don't like going to the dentist,
  or that people typically like weekends better than weekdays.
• It's an incredibly hard problem and an area where we expect
  not to get it right that often
• Still very much work in progress for us
Evaluation
• Very hard to measure opinion polarity beyond
  positive/negative/neutral
• On a small corpus of 20 facebook posts, we identified
  sentiment-containing sentences with 86% Precision and 71%
  Recall
• Of these, the polarity accuracy was 66%
• While this is not that high, not all the subcomponents are
  complete in the system, so we would expect better results
  with improved methods for negation and sarcasm detection
• NE recognition was high on these texts: 92% Precision and
  69% Recall (compared with other NE evaluations on social
  media)
Comparison of Opinion Finding in
                      Different Tasks

Corpus                      Sentiment   Polarity    Target
                            detection   detection   assignment

Political Tweets            78%         79%         97.9%


Financial Crisis Facebook   55%         81.8%       32.7%


Financial Crisis Tweets     90%         93.8%       66.7%
Image-opinion identification
Facial expression analysis/classification
    –
       Working on processing pipeline for this
    –
       Helps with facial similarity calculations and
       face recognition
    –
       Can be used to predict sentiment/polarity
    –
       Can be combined with analysis text from
       document
Coarse-grained opinion classification
   –
     Looking at image-feature classification for
     abstract concepts (sentiment / privacy /
     attractiveness)
   –
     e.g. looking at image colours, placement of
     interesting images in the picture
Multimodal opinion analysis
●
    Investigate correlation between images and
    whole-document opinions
    ●
        Do documents asserting specific opinions
        get illustrated with the same imagery?
    ●
        e.g. articles about euro-scepticism in the
        UK might be illustrated with images of
        specific Conservative peers….
    ●
        Is there correlation between low-level
        image features and specific opinions?
●
    Investigate finer-grained (i.e. sub-document)
    correlations between imagery and opinions
    ●
        e.g. sentence-level correlations
        incorporating analysis of the document
        layout
Summary

• Ongoing work on developing opinion mining
  tools and adapting them to social media
• Deal with multilinguality, ungrammatical
  English, and very short posts (tweets)
• Components for negation, swear words,
  sarcasm etc
• Promising initial evaluations
• Much further work still to come, e.g.
  integration with multimedia components
Further information

• Work done in the context of the EU-funded ARCOMEM and
  TrendMiner projects
• See http://www.arcomem.eu and http://www.trend-miner.eu for
  more details
• More information about GATE at http://gate.ac.uk
• More information about opinion mining see the LREC 2012 Tutorial
  “Opinion Mining: Exploiting the Sentiment of the Crowd”
• Module 12 of the GATE Training Course (new material since June
  2012) https://gate.ac.uk/family/training.html
Questions?

More Related Content

What's hot

Big Data Analytics | What Is Big Data Analytics? | Big Data Analytics For Beg...
Big Data Analytics | What Is Big Data Analytics? | Big Data Analytics For Beg...Big Data Analytics | What Is Big Data Analytics? | Big Data Analytics For Beg...
Big Data Analytics | What Is Big Data Analytics? | Big Data Analytics For Beg...
Simplilearn
 
Sentiment Analysis using Twitter Data
Sentiment Analysis using Twitter DataSentiment Analysis using Twitter Data
Sentiment Analysis using Twitter Data
Hari Prasad
 
Sentiment analysis using ml
Sentiment analysis using mlSentiment analysis using ml
Sentiment analysis using ml
Pravin Katiyar
 
Text analytics in social media
Text analytics in social mediaText analytics in social media
Text analytics in social media
Jeremiah Fadugba
 
Sentiment analysis presentation
Sentiment analysis presentationSentiment analysis presentation
Sentiment analysis presentation
GunjanSrivastava23
 
Social Media Sentiments Analysis
Social Media Sentiments AnalysisSocial Media Sentiments Analysis
Social Media Sentiments Analysis
PratisthaSingh5
 
Sentiment analysis
Sentiment analysisSentiment analysis
Sentiment analysis
Amenda Joy
 
Sentiment Analysis Using Twitter
Sentiment Analysis Using TwitterSentiment Analysis Using Twitter
Sentiment Analysis Using Twitter
piya chauhan
 
Sentiment analysis of Twitter Data
Sentiment analysis of Twitter DataSentiment analysis of Twitter Data
Sentiment analysis of Twitter Data
Nurendra Choudhary
 
Sentiment Analysis
Sentiment AnalysisSentiment Analysis
Sentiment Analysis
Aditya Nag
 
Twitter sentiment analysis
Twitter sentiment analysisTwitter sentiment analysis
Twitter sentiment analysis
Sunil Kandari
 
Sentiment Analysis
Sentiment AnalysisSentiment Analysis
Sentiment Analysis
Data Science Society
 
Twitter sentiment analysis ppt
Twitter sentiment analysis pptTwitter sentiment analysis ppt
Twitter sentiment analysis ppt
AntaraBhattacharya12
 
Sentiment analysis
Sentiment analysisSentiment analysis
Sentiment analysis
Makrand Patil
 
Opinion Mining Tutorial (Sentiment Analysis)
Opinion Mining Tutorial (Sentiment Analysis)Opinion Mining Tutorial (Sentiment Analysis)
Opinion Mining Tutorial (Sentiment Analysis)
Kavita Ganesan
 
How Does Customer Feedback Sentiment Analysis Work in Search Marketing?
How Does Customer Feedback Sentiment Analysis Work in Search Marketing?How Does Customer Feedback Sentiment Analysis Work in Search Marketing?
How Does Customer Feedback Sentiment Analysis Work in Search Marketing?
Countants
 
How Sentiment Analysis works
How Sentiment Analysis worksHow Sentiment Analysis works
How Sentiment Analysis works
CJ Jenkins
 
Opinion Mining
Opinion MiningOpinion Mining
Opinion Mining
Shital Kat
 
New sentiment analysis of tweets using python by Ravi kumar
New sentiment analysis of tweets using python by Ravi kumarNew sentiment analysis of tweets using python by Ravi kumar
New sentiment analysis of tweets using python by Ravi kumar
Ravi Kumar
 
web mining
web miningweb mining
web mining
Arpit Verma
 

What's hot (20)

Big Data Analytics | What Is Big Data Analytics? | Big Data Analytics For Beg...
Big Data Analytics | What Is Big Data Analytics? | Big Data Analytics For Beg...Big Data Analytics | What Is Big Data Analytics? | Big Data Analytics For Beg...
Big Data Analytics | What Is Big Data Analytics? | Big Data Analytics For Beg...
 
Sentiment Analysis using Twitter Data
Sentiment Analysis using Twitter DataSentiment Analysis using Twitter Data
Sentiment Analysis using Twitter Data
 
Sentiment analysis using ml
Sentiment analysis using mlSentiment analysis using ml
Sentiment analysis using ml
 
Text analytics in social media
Text analytics in social mediaText analytics in social media
Text analytics in social media
 
Sentiment analysis presentation
Sentiment analysis presentationSentiment analysis presentation
Sentiment analysis presentation
 
Social Media Sentiments Analysis
Social Media Sentiments AnalysisSocial Media Sentiments Analysis
Social Media Sentiments Analysis
 
Sentiment analysis
Sentiment analysisSentiment analysis
Sentiment analysis
 
Sentiment Analysis Using Twitter
Sentiment Analysis Using TwitterSentiment Analysis Using Twitter
Sentiment Analysis Using Twitter
 
Sentiment analysis of Twitter Data
Sentiment analysis of Twitter DataSentiment analysis of Twitter Data
Sentiment analysis of Twitter Data
 
Sentiment Analysis
Sentiment AnalysisSentiment Analysis
Sentiment Analysis
 
Twitter sentiment analysis
Twitter sentiment analysisTwitter sentiment analysis
Twitter sentiment analysis
 
Sentiment Analysis
Sentiment AnalysisSentiment Analysis
Sentiment Analysis
 
Twitter sentiment analysis ppt
Twitter sentiment analysis pptTwitter sentiment analysis ppt
Twitter sentiment analysis ppt
 
Sentiment analysis
Sentiment analysisSentiment analysis
Sentiment analysis
 
Opinion Mining Tutorial (Sentiment Analysis)
Opinion Mining Tutorial (Sentiment Analysis)Opinion Mining Tutorial (Sentiment Analysis)
Opinion Mining Tutorial (Sentiment Analysis)
 
How Does Customer Feedback Sentiment Analysis Work in Search Marketing?
How Does Customer Feedback Sentiment Analysis Work in Search Marketing?How Does Customer Feedback Sentiment Analysis Work in Search Marketing?
How Does Customer Feedback Sentiment Analysis Work in Search Marketing?
 
How Sentiment Analysis works
How Sentiment Analysis worksHow Sentiment Analysis works
How Sentiment Analysis works
 
Opinion Mining
Opinion MiningOpinion Mining
Opinion Mining
 
New sentiment analysis of tweets using python by Ravi kumar
New sentiment analysis of tweets using python by Ravi kumarNew sentiment analysis of tweets using python by Ravi kumar
New sentiment analysis of tweets using python by Ravi kumar
 
web mining
web miningweb mining
web mining
 

Viewers also liked

Sentiment Analysis of Twitter Data
Sentiment Analysis of Twitter DataSentiment Analysis of Twitter Data
Sentiment Analysis of Twitter Data
Sumit Raj
 
Introduction to Sentiment Analysis
Introduction to Sentiment AnalysisIntroduction to Sentiment Analysis
Introduction to Sentiment Analysis
Jaganadh Gopinadhan
 
What do you really mean when you tweet? Challenges for opinion mining on soci...
What do you really mean when you tweet? Challenges for opinion mining on soci...What do you really mean when you tweet? Challenges for opinion mining on soci...
What do you really mean when you tweet? Challenges for opinion mining on soci...
Diana Maynard
 
Challenges of social media analysis in the real world
Challenges of social media analysis in the real worldChallenges of social media analysis in the real world
Challenges of social media analysis in the real world
Diana Maynard
 
Opinion Mining using Data Mining Techniques
Opinion Mining using Data Mining TechniquesOpinion Mining using Data Mining Techniques
Opinion Mining using Data Mining Techniques
Aerofoil Kite
 
Twitter analysis by Kaify Rais
Twitter analysis by Kaify RaisTwitter analysis by Kaify Rais
Twitter analysis by Kaify Rais
Ajay Ohri
 
Opinion mining
Opinion miningOpinion mining
Opinion mining
Ha noi
 
IEEE 2014 DOTNET DATA MINING PROJECTS Ai and opinion mining
IEEE 2014 DOTNET DATA MINING PROJECTS Ai and opinion miningIEEE 2014 DOTNET DATA MINING PROJECTS Ai and opinion mining
IEEE 2014 DOTNET DATA MINING PROJECTS Ai and opinion mining
IEEEMEMTECHSTUDENTPROJECTS
 
Expression opinion kls X
Expression opinion kls XExpression opinion kls X
Expression opinion kls X
Cherish Mink
 
Identifying features in opinion mining via intrinsic and extrinsic domain rel...
Identifying features in opinion mining via intrinsic and extrinsic domain rel...Identifying features in opinion mining via intrinsic and extrinsic domain rel...
Identifying features in opinion mining via intrinsic and extrinsic domain rel...
Gajanand Sharma
 
Practical Opinion Mining for Social Media
Practical Opinion Mining for Social MediaPractical Opinion Mining for Social Media
Practical Opinion Mining for Social Media
Diana Maynard
 
Using opinion mining techniques for early crisis detection
Using opinion mining techniques for early crisis detectionUsing opinion mining techniques for early crisis detection
Using opinion mining techniques for early crisis detection
Faculty of Computer Science
 
NE7012- SOCIAL NETWORK ANALYSIS
NE7012- SOCIAL NETWORK ANALYSISNE7012- SOCIAL NETWORK ANALYSIS
NE7012- SOCIAL NETWORK ANALYSIS
rathnaarul
 
Tutorial on Opinion Mining and Sentiment Analysis
Tutorial on Opinion Mining and Sentiment AnalysisTutorial on Opinion Mining and Sentiment Analysis
Tutorial on Opinion Mining and Sentiment Analysis
Yun Hao
 
Tiffany Iwantoro_Exchange Rate Directional Forecasting using Sentiment Analys...
Tiffany Iwantoro_Exchange Rate Directional Forecasting using Sentiment Analys...Tiffany Iwantoro_Exchange Rate Directional Forecasting using Sentiment Analys...
Tiffany Iwantoro_Exchange Rate Directional Forecasting using Sentiment Analys...
Tiffany Iwantoro
 
Science Festival 2011 Programme
Science Festival 2011 ProgrammeScience Festival 2011 Programme
Science Festival 2011 Programme
Associazione Festival della Scienza
 
Multimodal opinion mining from social media
Multimodal opinion mining from social mediaMultimodal opinion mining from social media
Multimodal opinion mining from social media
Diana Maynard
 
Customer relationship management_dwm_ankita_dubey
Customer relationship management_dwm_ankita_dubeyCustomer relationship management_dwm_ankita_dubey
Customer relationship management_dwm_ankita_dubey
Ankita Dubey
 
Opinion mining
Opinion miningOpinion mining
Opinion mining
Antigoni-Maria Founta
 
Major
MajorMajor
Major
Vickysin
 

Viewers also liked (20)

Sentiment Analysis of Twitter Data
Sentiment Analysis of Twitter DataSentiment Analysis of Twitter Data
Sentiment Analysis of Twitter Data
 
Introduction to Sentiment Analysis
Introduction to Sentiment AnalysisIntroduction to Sentiment Analysis
Introduction to Sentiment Analysis
 
What do you really mean when you tweet? Challenges for opinion mining on soci...
What do you really mean when you tweet? Challenges for opinion mining on soci...What do you really mean when you tweet? Challenges for opinion mining on soci...
What do you really mean when you tweet? Challenges for opinion mining on soci...
 
Challenges of social media analysis in the real world
Challenges of social media analysis in the real worldChallenges of social media analysis in the real world
Challenges of social media analysis in the real world
 
Opinion Mining using Data Mining Techniques
Opinion Mining using Data Mining TechniquesOpinion Mining using Data Mining Techniques
Opinion Mining using Data Mining Techniques
 
Twitter analysis by Kaify Rais
Twitter analysis by Kaify RaisTwitter analysis by Kaify Rais
Twitter analysis by Kaify Rais
 
Opinion mining
Opinion miningOpinion mining
Opinion mining
 
IEEE 2014 DOTNET DATA MINING PROJECTS Ai and opinion mining
IEEE 2014 DOTNET DATA MINING PROJECTS Ai and opinion miningIEEE 2014 DOTNET DATA MINING PROJECTS Ai and opinion mining
IEEE 2014 DOTNET DATA MINING PROJECTS Ai and opinion mining
 
Expression opinion kls X
Expression opinion kls XExpression opinion kls X
Expression opinion kls X
 
Identifying features in opinion mining via intrinsic and extrinsic domain rel...
Identifying features in opinion mining via intrinsic and extrinsic domain rel...Identifying features in opinion mining via intrinsic and extrinsic domain rel...
Identifying features in opinion mining via intrinsic and extrinsic domain rel...
 
Practical Opinion Mining for Social Media
Practical Opinion Mining for Social MediaPractical Opinion Mining for Social Media
Practical Opinion Mining for Social Media
 
Using opinion mining techniques for early crisis detection
Using opinion mining techniques for early crisis detectionUsing opinion mining techniques for early crisis detection
Using opinion mining techniques for early crisis detection
 
NE7012- SOCIAL NETWORK ANALYSIS
NE7012- SOCIAL NETWORK ANALYSISNE7012- SOCIAL NETWORK ANALYSIS
NE7012- SOCIAL NETWORK ANALYSIS
 
Tutorial on Opinion Mining and Sentiment Analysis
Tutorial on Opinion Mining and Sentiment AnalysisTutorial on Opinion Mining and Sentiment Analysis
Tutorial on Opinion Mining and Sentiment Analysis
 
Tiffany Iwantoro_Exchange Rate Directional Forecasting using Sentiment Analys...
Tiffany Iwantoro_Exchange Rate Directional Forecasting using Sentiment Analys...Tiffany Iwantoro_Exchange Rate Directional Forecasting using Sentiment Analys...
Tiffany Iwantoro_Exchange Rate Directional Forecasting using Sentiment Analys...
 
Science Festival 2011 Programme
Science Festival 2011 ProgrammeScience Festival 2011 Programme
Science Festival 2011 Programme
 
Multimodal opinion mining from social media
Multimodal opinion mining from social mediaMultimodal opinion mining from social media
Multimodal opinion mining from social media
 
Customer relationship management_dwm_ankita_dubey
Customer relationship management_dwm_ankita_dubeyCustomer relationship management_dwm_ankita_dubey
Customer relationship management_dwm_ankita_dubey
 
Opinion mining
Opinion miningOpinion mining
Opinion mining
 
Major
MajorMajor
Major
 

Similar to Opinion mining for social media

Practical sentiment analysis
Practical sentiment analysisPractical sentiment analysis
Practical sentiment analysis
Diana Maynard
 
Leeds Met talk on social media monitoring
Leeds Met talk on social media monitoringLeeds Met talk on social media monitoring
Leeds Met talk on social media monitoring
Anthony Devenish
 
Cross-Platform Profiling tutorial at the Digital Methods Summer School 2013
Cross-Platform Profiling tutorial at the Digital Methods Summer School 2013Cross-Platform Profiling tutorial at the Digital Methods Summer School 2013
Cross-Platform Profiling tutorial at the Digital Methods Summer School 2013
Digital Methods Initiative
 
Sentiment Analysis and Social Media: How and Why
Sentiment Analysis and Social Media: How and WhySentiment Analysis and Social Media: How and Why
Sentiment Analysis and Social Media: How and Why
Davide Feltoni Gurini
 
Social Media and AI: Don’t forget the users
Social Media and AI: Don’t forget the usersSocial Media and AI: Don’t forget the users
Social Media and AI: Don’t forget the users
Mounia Lalmas-Roelleke
 
Do Users Really Generate Content? Tips and Tools for Building Engaged Online ...
Do Users Really Generate Content? Tips and Tools for Building Engaged Online ...Do Users Really Generate Content? Tips and Tools for Building Engaged Online ...
Do Users Really Generate Content? Tips and Tools for Building Engaged Online ...
Laura Norvig
 
Hr roundtable slides
Hr roundtable slidesHr roundtable slides
Hr roundtable slides
daniherro
 
E-challenges11 WeGov Workshop
E-challenges11 WeGov WorkshopE-challenges11 WeGov Workshop
E-challenges11 WeGov Workshop
WeGov project
 
WeGov Analysis Tools to connect Policy Makers with Citizens Online
WeGov Analysis Tools to connect Policy Makers with Citizens OnlineWeGov Analysis Tools to connect Policy Makers with Citizens Online
WeGov Analysis Tools to connect Policy Makers with Citizens Online
Timo Wandhoefer
 
The Well Connected Facility
The Well Connected FacilityThe Well Connected Facility
The Well Connected Facility
Ryan Duggan
 
CIL Stats Workshop April1 2022 Abram Silk.pdf
CIL Stats Workshop April1 2022 Abram Silk.pdfCIL Stats Workshop April1 2022 Abram Silk.pdf
CIL Stats Workshop April1 2022 Abram Silk.pdf
Stephen Abram
 
Twitter data analysis using R
Twitter data analysis using RTwitter data analysis using R
Twitter data analysis using R
santoshi mangalgi
 
Maximizing Social Capital to Increase Core Facility Exposure and Usage
Maximizing Social Capital to Increase Core Facility Exposure and UsageMaximizing Social Capital to Increase Core Facility Exposure and Usage
Maximizing Social Capital to Increase Core Facility Exposure and Usage
Ryan Duggan
 
Nycon social media nyfa presentation
Nycon social media nyfa presentationNycon social media nyfa presentation
Nycon social media nyfa presentation
Andrew Marietta
 
analyzing public sentiments using twitter feeds
 analyzing public sentiments using twitter feeds analyzing public sentiments using twitter feeds
analyzing public sentiments using twitter feeds
Orakzay
 
Cultural Networks 2012: Headway UK
Cultural Networks 2012: Headway UKCultural Networks 2012: Headway UK
Cultural Networks 2012: Headway UK
University of Portsmouth
 
Storytelling 1
Storytelling 1Storytelling 1
Storytelling 1
Julia Goldberg
 
LAK13: Epistemology, Pedagogy, Assessment and Learning Analytics
LAK13: Epistemology, Pedagogy, Assessment and Learning AnalyticsLAK13: Epistemology, Pedagogy, Assessment and Learning Analytics
LAK13: Epistemology, Pedagogy, Assessment and Learning Analytics
Simon Knight
 
Citizen Sensor Data Mining, Social Media Analytics and Applications
Citizen Sensor Data Mining, Social Media Analytics and ApplicationsCitizen Sensor Data Mining, Social Media Analytics and Applications
Citizen Sensor Data Mining, Social Media Analytics and Applications
Amit Sheth
 
Social Media Dataset
Social Media DatasetSocial Media Dataset
Social Media Dataset
Jose Carlos Cortizo Perez
 

Similar to Opinion mining for social media (20)

Practical sentiment analysis
Practical sentiment analysisPractical sentiment analysis
Practical sentiment analysis
 
Leeds Met talk on social media monitoring
Leeds Met talk on social media monitoringLeeds Met talk on social media monitoring
Leeds Met talk on social media monitoring
 
Cross-Platform Profiling tutorial at the Digital Methods Summer School 2013
Cross-Platform Profiling tutorial at the Digital Methods Summer School 2013Cross-Platform Profiling tutorial at the Digital Methods Summer School 2013
Cross-Platform Profiling tutorial at the Digital Methods Summer School 2013
 
Sentiment Analysis and Social Media: How and Why
Sentiment Analysis and Social Media: How and WhySentiment Analysis and Social Media: How and Why
Sentiment Analysis and Social Media: How and Why
 
Social Media and AI: Don’t forget the users
Social Media and AI: Don’t forget the usersSocial Media and AI: Don’t forget the users
Social Media and AI: Don’t forget the users
 
Do Users Really Generate Content? Tips and Tools for Building Engaged Online ...
Do Users Really Generate Content? Tips and Tools for Building Engaged Online ...Do Users Really Generate Content? Tips and Tools for Building Engaged Online ...
Do Users Really Generate Content? Tips and Tools for Building Engaged Online ...
 
Hr roundtable slides
Hr roundtable slidesHr roundtable slides
Hr roundtable slides
 
E-challenges11 WeGov Workshop
E-challenges11 WeGov WorkshopE-challenges11 WeGov Workshop
E-challenges11 WeGov Workshop
 
WeGov Analysis Tools to connect Policy Makers with Citizens Online
WeGov Analysis Tools to connect Policy Makers with Citizens OnlineWeGov Analysis Tools to connect Policy Makers with Citizens Online
WeGov Analysis Tools to connect Policy Makers with Citizens Online
 
The Well Connected Facility
The Well Connected FacilityThe Well Connected Facility
The Well Connected Facility
 
CIL Stats Workshop April1 2022 Abram Silk.pdf
CIL Stats Workshop April1 2022 Abram Silk.pdfCIL Stats Workshop April1 2022 Abram Silk.pdf
CIL Stats Workshop April1 2022 Abram Silk.pdf
 
Twitter data analysis using R
Twitter data analysis using RTwitter data analysis using R
Twitter data analysis using R
 
Maximizing Social Capital to Increase Core Facility Exposure and Usage
Maximizing Social Capital to Increase Core Facility Exposure and UsageMaximizing Social Capital to Increase Core Facility Exposure and Usage
Maximizing Social Capital to Increase Core Facility Exposure and Usage
 
Nycon social media nyfa presentation
Nycon social media nyfa presentationNycon social media nyfa presentation
Nycon social media nyfa presentation
 
analyzing public sentiments using twitter feeds
 analyzing public sentiments using twitter feeds analyzing public sentiments using twitter feeds
analyzing public sentiments using twitter feeds
 
Cultural Networks 2012: Headway UK
Cultural Networks 2012: Headway UKCultural Networks 2012: Headway UK
Cultural Networks 2012: Headway UK
 
Storytelling 1
Storytelling 1Storytelling 1
Storytelling 1
 
LAK13: Epistemology, Pedagogy, Assessment and Learning Analytics
LAK13: Epistemology, Pedagogy, Assessment and Learning AnalyticsLAK13: Epistemology, Pedagogy, Assessment and Learning Analytics
LAK13: Epistemology, Pedagogy, Assessment and Learning Analytics
 
Citizen Sensor Data Mining, Social Media Analytics and Applications
Citizen Sensor Data Mining, Social Media Analytics and ApplicationsCitizen Sensor Data Mining, Social Media Analytics and Applications
Citizen Sensor Data Mining, Social Media Analytics and Applications
 
Social Media Dataset
Social Media DatasetSocial Media Dataset
Social Media Dataset
 

More from Diana Maynard

Filth and lies: analysing social media
Filth and lies: analysing social mediaFilth and lies: analysing social media
Filth and lies: analysing social media
Diana Maynard
 
Adding value to NLP: a little semantics goes a long way
Adding value to NLP: a little semantics goes a long wayAdding value to NLP: a little semantics goes a long way
Adding value to NLP: a little semantics goes a long way
Diana Maynard
 
Methodological possibilities for strengthening the monitoring of SDG indicato...
Methodological possibilities for strengthening the monitoring of SDG indicato...Methodological possibilities for strengthening the monitoring of SDG indicato...
Methodological possibilities for strengthening the monitoring of SDG indicato...
Diana Maynard
 
Getting the-most-out-of-conferences
Getting the-most-out-of-conferencesGetting the-most-out-of-conferences
Getting the-most-out-of-conferences
Diana Maynard
 
Understanding the world with NLP: interactions between society, behaviour and...
Understanding the world with NLP: interactions between society, behaviour and...Understanding the world with NLP: interactions between society, behaviour and...
Understanding the world with NLP: interactions between society, behaviour and...
Diana Maynard
 
The language of social media
The language of social mediaThe language of social media
The language of social media
Diana Maynard
 
Text analysis-semantic-search
Text analysis-semantic-searchText analysis-semantic-search
Text analysis-semantic-search
Diana Maynard
 
Using language to save the world: interactions between society, behaviour and...
Using language to save the world: interactions between society, behaviour and...Using language to save the world: interactions between society, behaviour and...
Using language to save the world: interactions between society, behaviour and...
Diana Maynard
 
Adapting NLP tools to diverse data: challenges and solutions
Adapting NLP tools to diverse data: challenges and solutionsAdapting NLP tools to diverse data: challenges and solutions
Adapting NLP tools to diverse data: challenges and solutions
Diana Maynard
 
Ontologies as bridges between data sources and user queries: the KNOWMAK proj...
Ontologies as bridges between data sources and user queries: the KNOWMAK proj...Ontologies as bridges between data sources and user queries: the KNOWMAK proj...
Ontologies as bridges between data sources and user queries: the KNOWMAK proj...
Diana Maynard
 
20 Years of Text Mining Applications with GATE: from Donald Trump to curing c...
20 Years of Text Mining Applications with GATE: from Donald Trump to curing c...20 Years of Text Mining Applications with GATE: from Donald Trump to curing c...
20 Years of Text Mining Applications with GATE: from Donald Trump to curing c...
Diana Maynard
 
Text Analysis and Semantic Search with GATE
Text Analysis and Semantic Search with GATEText Analysis and Semantic Search with GATE
Text Analysis and Semantic Search with GATE
Diana Maynard
 
GATE: a text analysis tool for social media
GATE: a text analysis tool for social mediaGATE: a text analysis tool for social media
GATE: a text analysis tool for social media
Diana Maynard
 
Tools for (Almost) Real-Time Social Media Analysis
Tools for (Almost) Real-Time Social Media AnalysisTools for (Almost) Real-Time Social Media Analysis
Tools for (Almost) Real-Time Social Media Analysis
Diana Maynard
 
Social media analytics as a service: tools from GATE
Social media analytics as a service: tools from GATESocial media analytics as a service: tools from GATE
Social media analytics as a service: tools from GATE
Diana Maynard
 
Text analysis and Semantic Search with GATE
Text analysis and Semantic Search with GATEText analysis and Semantic Search with GATE
Text analysis and Semantic Search with GATE
Diana Maynard
 
Cls8 decarbonet
Cls8 decarbonetCls8 decarbonet
Cls8 decarbonet
Diana Maynard
 
Can Social Media Analysis Improve Collective Awareness of Climate Change?
Can Social Media Analysis Improve Collective Awareness of Climate Change?Can Social Media Analysis Improve Collective Awareness of Climate Change?
Can Social Media Analysis Improve Collective Awareness of Climate Change?
Diana Maynard
 
Can Social Media Analysis Improve Collective Awareness of Climate Change?
Can Social Media Analysis Improve Collective Awareness of Climate Change?Can Social Media Analysis Improve Collective Awareness of Climate Change?
Can Social Media Analysis Improve Collective Awareness of Climate Change?
Diana Maynard
 
Do we really know what people mean when they tweet?
Do we really know what people mean when they tweet?Do we really know what people mean when they tweet?
Do we really know what people mean when they tweet?
Diana Maynard
 

More from Diana Maynard (20)

Filth and lies: analysing social media
Filth and lies: analysing social mediaFilth and lies: analysing social media
Filth and lies: analysing social media
 
Adding value to NLP: a little semantics goes a long way
Adding value to NLP: a little semantics goes a long wayAdding value to NLP: a little semantics goes a long way
Adding value to NLP: a little semantics goes a long way
 
Methodological possibilities for strengthening the monitoring of SDG indicato...
Methodological possibilities for strengthening the monitoring of SDG indicato...Methodological possibilities for strengthening the monitoring of SDG indicato...
Methodological possibilities for strengthening the monitoring of SDG indicato...
 
Getting the-most-out-of-conferences
Getting the-most-out-of-conferencesGetting the-most-out-of-conferences
Getting the-most-out-of-conferences
 
Understanding the world with NLP: interactions between society, behaviour and...
Understanding the world with NLP: interactions between society, behaviour and...Understanding the world with NLP: interactions between society, behaviour and...
Understanding the world with NLP: interactions between society, behaviour and...
 
The language of social media
The language of social mediaThe language of social media
The language of social media
 
Text analysis-semantic-search
Text analysis-semantic-searchText analysis-semantic-search
Text analysis-semantic-search
 
Using language to save the world: interactions between society, behaviour and...
Using language to save the world: interactions between society, behaviour and...Using language to save the world: interactions between society, behaviour and...
Using language to save the world: interactions between society, behaviour and...
 
Adapting NLP tools to diverse data: challenges and solutions
Adapting NLP tools to diverse data: challenges and solutionsAdapting NLP tools to diverse data: challenges and solutions
Adapting NLP tools to diverse data: challenges and solutions
 
Ontologies as bridges between data sources and user queries: the KNOWMAK proj...
Ontologies as bridges between data sources and user queries: the KNOWMAK proj...Ontologies as bridges between data sources and user queries: the KNOWMAK proj...
Ontologies as bridges between data sources and user queries: the KNOWMAK proj...
 
20 Years of Text Mining Applications with GATE: from Donald Trump to curing c...
20 Years of Text Mining Applications with GATE: from Donald Trump to curing c...20 Years of Text Mining Applications with GATE: from Donald Trump to curing c...
20 Years of Text Mining Applications with GATE: from Donald Trump to curing c...
 
Text Analysis and Semantic Search with GATE
Text Analysis and Semantic Search with GATEText Analysis and Semantic Search with GATE
Text Analysis and Semantic Search with GATE
 
GATE: a text analysis tool for social media
GATE: a text analysis tool for social mediaGATE: a text analysis tool for social media
GATE: a text analysis tool for social media
 
Tools for (Almost) Real-Time Social Media Analysis
Tools for (Almost) Real-Time Social Media AnalysisTools for (Almost) Real-Time Social Media Analysis
Tools for (Almost) Real-Time Social Media Analysis
 
Social media analytics as a service: tools from GATE
Social media analytics as a service: tools from GATESocial media analytics as a service: tools from GATE
Social media analytics as a service: tools from GATE
 
Text analysis and Semantic Search with GATE
Text analysis and Semantic Search with GATEText analysis and Semantic Search with GATE
Text analysis and Semantic Search with GATE
 
Cls8 decarbonet
Cls8 decarbonetCls8 decarbonet
Cls8 decarbonet
 
Can Social Media Analysis Improve Collective Awareness of Climate Change?
Can Social Media Analysis Improve Collective Awareness of Climate Change?Can Social Media Analysis Improve Collective Awareness of Climate Change?
Can Social Media Analysis Improve Collective Awareness of Climate Change?
 
Can Social Media Analysis Improve Collective Awareness of Climate Change?
Can Social Media Analysis Improve Collective Awareness of Climate Change?Can Social Media Analysis Improve Collective Awareness of Climate Change?
Can Social Media Analysis Improve Collective Awareness of Climate Change?
 
Do we really know what people mean when they tweet?
Do we really know what people mean when they tweet?Do we really know what people mean when they tweet?
Do we really know what people mean when they tweet?
 

Recently uploaded

(CISOPlatform Summit & SACON 2024) Digital Personal Data Protection Act.pdf
(CISOPlatform Summit & SACON 2024) Digital Personal Data Protection Act.pdf(CISOPlatform Summit & SACON 2024) Digital Personal Data Protection Act.pdf
(CISOPlatform Summit & SACON 2024) Digital Personal Data Protection Act.pdf
Priyanka Aash
 
(CISOPlatform Summit & SACON 2024) Regulation & Response In Banks.pdf
(CISOPlatform Summit & SACON 2024) Regulation & Response In Banks.pdf(CISOPlatform Summit & SACON 2024) Regulation & Response In Banks.pdf
(CISOPlatform Summit & SACON 2024) Regulation & Response In Banks.pdf
Priyanka Aash
 
Introduction-to-the-IAM-Platform-Implementation-Plan.pptx
Introduction-to-the-IAM-Platform-Implementation-Plan.pptxIntroduction-to-the-IAM-Platform-Implementation-Plan.pptx
Introduction-to-the-IAM-Platform-Implementation-Plan.pptx
313mohammedarshad
 
Mastering OnlyFans Clone App Development: Key Strategies for Success
Mastering OnlyFans Clone App Development: Key Strategies for SuccessMastering OnlyFans Clone App Development: Key Strategies for Success
Mastering OnlyFans Clone App Development: Key Strategies for Success
David Wilson
 
How UiPath Discovery Suite supports identification of Agentic Process Automat...
How UiPath Discovery Suite supports identification of Agentic Process Automat...How UiPath Discovery Suite supports identification of Agentic Process Automat...
How UiPath Discovery Suite supports identification of Agentic Process Automat...
DianaGray10
 
Integrating Kafka with MuleSoft 4 and usecase
Integrating Kafka with MuleSoft 4 and usecaseIntegrating Kafka with MuleSoft 4 and usecase
Integrating Kafka with MuleSoft 4 and usecase
shyamraj55
 
MAKE MONEY ONLINE Unlock Your Income Potential Today.pptx
MAKE MONEY ONLINE Unlock Your Income Potential Today.pptxMAKE MONEY ONLINE Unlock Your Income Potential Today.pptx
MAKE MONEY ONLINE Unlock Your Income Potential Today.pptx
janagijoythi
 
Vulnerability Management: A Comprehensive Overview
Vulnerability Management: A Comprehensive OverviewVulnerability Management: A Comprehensive Overview
Vulnerability Management: A Comprehensive Overview
Steven Carlson
 
High Profile Girls Call ServiCe Hyderabad 0000000000 Tanisha Best High Class ...
High Profile Girls Call ServiCe Hyderabad 0000000000 Tanisha Best High Class ...High Profile Girls Call ServiCe Hyderabad 0000000000 Tanisha Best High Class ...
High Profile Girls Call ServiCe Hyderabad 0000000000 Tanisha Best High Class ...
aslasdfmkhan4750
 
Girls Call Churchgate 9910780858 Provide Best And Top Girl Service And No1 in...
Girls Call Churchgate 9910780858 Provide Best And Top Girl Service And No1 in...Girls Call Churchgate 9910780858 Provide Best And Top Girl Service And No1 in...
Girls Call Churchgate 9910780858 Provide Best And Top Girl Service And No1 in...
maigasapphire
 
Sonkoloniya documentation - ONEprojukti.pdf
Sonkoloniya documentation - ONEprojukti.pdfSonkoloniya documentation - ONEprojukti.pdf
Sonkoloniya documentation - ONEprojukti.pdf
SubhamMandal40
 
Connector Corner: Leveraging Snowflake Integration for Smarter Decision Making
Connector Corner: Leveraging Snowflake Integration for Smarter Decision MakingConnector Corner: Leveraging Snowflake Integration for Smarter Decision Making
Connector Corner: Leveraging Snowflake Integration for Smarter Decision Making
DianaGray10
 
leewayhertz.com-Generative AI tech stack Frameworks infrastructure models and...
leewayhertz.com-Generative AI tech stack Frameworks infrastructure models and...leewayhertz.com-Generative AI tech stack Frameworks infrastructure models and...
leewayhertz.com-Generative AI tech stack Frameworks infrastructure models and...
alexjohnson7307
 
(CISOPlatform Summit & SACON 2024) Orientation by CISO Platform_ Using CISO P...
(CISOPlatform Summit & SACON 2024) Orientation by CISO Platform_ Using CISO P...(CISOPlatform Summit & SACON 2024) Orientation by CISO Platform_ Using CISO P...
(CISOPlatform Summit & SACON 2024) Orientation by CISO Platform_ Using CISO P...
Priyanka Aash
 
IPLOOK Remote-Sensing Satellite Solution
IPLOOK Remote-Sensing Satellite SolutionIPLOOK Remote-Sensing Satellite Solution
IPLOOK Remote-Sensing Satellite Solution
IPLOOK Networks
 
Acumatica vs. Sage Intacct vs. NetSuite _ NOW CFO.pdf
Acumatica vs. Sage Intacct vs. NetSuite _ NOW CFO.pdfAcumatica vs. Sage Intacct vs. NetSuite _ NOW CFO.pdf
Acumatica vs. Sage Intacct vs. NetSuite _ NOW CFO.pdf
BrainSell Technologies
 
(CISOPlatform Summit & SACON 2024) Keynote _ Power Digital Identities With AI...
(CISOPlatform Summit & SACON 2024) Keynote _ Power Digital Identities With AI...(CISOPlatform Summit & SACON 2024) Keynote _ Power Digital Identities With AI...
(CISOPlatform Summit & SACON 2024) Keynote _ Power Digital Identities With AI...
Priyanka Aash
 
(CISOPlatform Summit & SACON 2024) Workshop _ Most Dangerous Attack Technique...
(CISOPlatform Summit & SACON 2024) Workshop _ Most Dangerous Attack Technique...(CISOPlatform Summit & SACON 2024) Workshop _ Most Dangerous Attack Technique...
(CISOPlatform Summit & SACON 2024) Workshop _ Most Dangerous Attack Technique...
Priyanka Aash
 
Acumatica vs. Sage Intacct _Construction_July (1).pptx
Acumatica vs. Sage Intacct _Construction_July (1).pptxAcumatica vs. Sage Intacct _Construction_July (1).pptx
Acumatica vs. Sage Intacct _Construction_July (1).pptx
BrainSell Technologies
 
leewayhertz.com-AI agents for healthcare Applications benefits and implementa...
leewayhertz.com-AI agents for healthcare Applications benefits and implementa...leewayhertz.com-AI agents for healthcare Applications benefits and implementa...
leewayhertz.com-AI agents for healthcare Applications benefits and implementa...
alexjohnson7307
 

Recently uploaded (20)

(CISOPlatform Summit & SACON 2024) Digital Personal Data Protection Act.pdf
(CISOPlatform Summit & SACON 2024) Digital Personal Data Protection Act.pdf(CISOPlatform Summit & SACON 2024) Digital Personal Data Protection Act.pdf
(CISOPlatform Summit & SACON 2024) Digital Personal Data Protection Act.pdf
 
(CISOPlatform Summit & SACON 2024) Regulation & Response In Banks.pdf
(CISOPlatform Summit & SACON 2024) Regulation & Response In Banks.pdf(CISOPlatform Summit & SACON 2024) Regulation & Response In Banks.pdf
(CISOPlatform Summit & SACON 2024) Regulation & Response In Banks.pdf
 
Introduction-to-the-IAM-Platform-Implementation-Plan.pptx
Introduction-to-the-IAM-Platform-Implementation-Plan.pptxIntroduction-to-the-IAM-Platform-Implementation-Plan.pptx
Introduction-to-the-IAM-Platform-Implementation-Plan.pptx
 
Mastering OnlyFans Clone App Development: Key Strategies for Success
Mastering OnlyFans Clone App Development: Key Strategies for SuccessMastering OnlyFans Clone App Development: Key Strategies for Success
Mastering OnlyFans Clone App Development: Key Strategies for Success
 
How UiPath Discovery Suite supports identification of Agentic Process Automat...
How UiPath Discovery Suite supports identification of Agentic Process Automat...How UiPath Discovery Suite supports identification of Agentic Process Automat...
How UiPath Discovery Suite supports identification of Agentic Process Automat...
 
Integrating Kafka with MuleSoft 4 and usecase
Integrating Kafka with MuleSoft 4 and usecaseIntegrating Kafka with MuleSoft 4 and usecase
Integrating Kafka with MuleSoft 4 and usecase
 
MAKE MONEY ONLINE Unlock Your Income Potential Today.pptx
MAKE MONEY ONLINE Unlock Your Income Potential Today.pptxMAKE MONEY ONLINE Unlock Your Income Potential Today.pptx
MAKE MONEY ONLINE Unlock Your Income Potential Today.pptx
 
Vulnerability Management: A Comprehensive Overview
Vulnerability Management: A Comprehensive OverviewVulnerability Management: A Comprehensive Overview
Vulnerability Management: A Comprehensive Overview
 
High Profile Girls Call ServiCe Hyderabad 0000000000 Tanisha Best High Class ...
High Profile Girls Call ServiCe Hyderabad 0000000000 Tanisha Best High Class ...High Profile Girls Call ServiCe Hyderabad 0000000000 Tanisha Best High Class ...
High Profile Girls Call ServiCe Hyderabad 0000000000 Tanisha Best High Class ...
 
Girls Call Churchgate 9910780858 Provide Best And Top Girl Service And No1 in...
Girls Call Churchgate 9910780858 Provide Best And Top Girl Service And No1 in...Girls Call Churchgate 9910780858 Provide Best And Top Girl Service And No1 in...
Girls Call Churchgate 9910780858 Provide Best And Top Girl Service And No1 in...
 
Sonkoloniya documentation - ONEprojukti.pdf
Sonkoloniya documentation - ONEprojukti.pdfSonkoloniya documentation - ONEprojukti.pdf
Sonkoloniya documentation - ONEprojukti.pdf
 
Connector Corner: Leveraging Snowflake Integration for Smarter Decision Making
Connector Corner: Leveraging Snowflake Integration for Smarter Decision MakingConnector Corner: Leveraging Snowflake Integration for Smarter Decision Making
Connector Corner: Leveraging Snowflake Integration for Smarter Decision Making
 
leewayhertz.com-Generative AI tech stack Frameworks infrastructure models and...
leewayhertz.com-Generative AI tech stack Frameworks infrastructure models and...leewayhertz.com-Generative AI tech stack Frameworks infrastructure models and...
leewayhertz.com-Generative AI tech stack Frameworks infrastructure models and...
 
(CISOPlatform Summit & SACON 2024) Orientation by CISO Platform_ Using CISO P...
(CISOPlatform Summit & SACON 2024) Orientation by CISO Platform_ Using CISO P...(CISOPlatform Summit & SACON 2024) Orientation by CISO Platform_ Using CISO P...
(CISOPlatform Summit & SACON 2024) Orientation by CISO Platform_ Using CISO P...
 
IPLOOK Remote-Sensing Satellite Solution
IPLOOK Remote-Sensing Satellite SolutionIPLOOK Remote-Sensing Satellite Solution
IPLOOK Remote-Sensing Satellite Solution
 
Acumatica vs. Sage Intacct vs. NetSuite _ NOW CFO.pdf
Acumatica vs. Sage Intacct vs. NetSuite _ NOW CFO.pdfAcumatica vs. Sage Intacct vs. NetSuite _ NOW CFO.pdf
Acumatica vs. Sage Intacct vs. NetSuite _ NOW CFO.pdf
 
(CISOPlatform Summit & SACON 2024) Keynote _ Power Digital Identities With AI...
(CISOPlatform Summit & SACON 2024) Keynote _ Power Digital Identities With AI...(CISOPlatform Summit & SACON 2024) Keynote _ Power Digital Identities With AI...
(CISOPlatform Summit & SACON 2024) Keynote _ Power Digital Identities With AI...
 
(CISOPlatform Summit & SACON 2024) Workshop _ Most Dangerous Attack Technique...
(CISOPlatform Summit & SACON 2024) Workshop _ Most Dangerous Attack Technique...(CISOPlatform Summit & SACON 2024) Workshop _ Most Dangerous Attack Technique...
(CISOPlatform Summit & SACON 2024) Workshop _ Most Dangerous Attack Technique...
 
Acumatica vs. Sage Intacct _Construction_July (1).pptx
Acumatica vs. Sage Intacct _Construction_July (1).pptxAcumatica vs. Sage Intacct _Construction_July (1).pptx
Acumatica vs. Sage Intacct _Construction_July (1).pptx
 
leewayhertz.com-AI agents for healthcare Applications benefits and implementa...
leewayhertz.com-AI agents for healthcare Applications benefits and implementa...leewayhertz.com-AI agents for healthcare Applications benefits and implementa...
leewayhertz.com-AI agents for healthcare Applications benefits and implementa...
 

Opinion mining for social media

  • 1. Opinion mining for social media Diana Maynard University of Sheffield, UK
  • 2. Who Wants to be a Millionaire? Ask the audience? Or Phone a Friend?
  • 3. How did the Royal Wedding influence the UK's health?
  • 4. Can Twitter help us predict the stock market? “Hey Jon, Derek in Scunthorpe's having a bacon and egg butty. Is that good for wheat futures?”
  • 5. Can Twitter predict earthquakes?
  • 6. Outline • Introduction - social media, opinion mining and the Arcomem project • What is opinion mining and why is it hard? • Current research in the Arcomem project • What does the future hold?
  • 7. The Social Web Information, thoughts and opinions are shared prolifically these days on the social web
  • 8. Drowning in information • It can be difficult to get the relevant information out of such large volumes of data in a useful way • Social web analysis is all about the users who are actively engaged and generate content • Social networks are pools of a wide range of articulation methods, from simple "I like it" buttons to complete articles
  • 9. Opinion Mining • Along with entity, topic and event recognition, opinion mining forms the cornerstone for social web analysis
  • 10. What is Opinion Mining? • OM is a relatively recent discipline that studies the extraction of opinions using IR, AI and/or NLP techniques. • More informally, it's about extracting the opinions or sentiments given in a piece of text • Also referred to as Sentiment Analysis (though technically this is a more specific task) • Web 2.0 nowadays provides a great medium for people to share things. • This provides a great source of unstructured information (especially opinions) that may be useful to others (e.g. companies and their rivals, other consumers, governments...)
  • 11. It's about finding out what people think...
  • 12. Opinion Mining is Big Business ● Someone who wants to buy a camera ● Looks for comments and reviews ● Someone who just bought a camera ● Comments on it ● Writes about their experience ● Camera Manufacturer ● Gets feedback from customer ● Improve their products ● Adjust Marketing Strategies
  • 13. Analysing and preserving opinions • We want to collect, store and later retrieve public opinions about events and their changes or developments over time • Not only can online social networks provide a snapshot of such situations, but they can actually trigger a chain of reactions and events • Ultimately these events might led to societal, political or administrative changes • Since the Royal Wedding, Pilates classes have become incredibly popular in the UK solely as a result of social media.
  • 14. On a more serious note... ● We can measure the impact of a political entity or event on the overall sentiment about another entity or event, over the course of time. ● Aggregation of opinions over entities and events to cover sentences and documents ● Combined with time information and/or geo-locations, we can then analyse changes to opinions about the same entity/event over time, and other statistical correlations
  • 15. But there are lots of tools that do this already.... • Here are some examples: – Twitter sentiment http://twittersentiment.appspot.com/ – Twends: http://twendz.waggeneredstrom.com/ – Twittratr: http://twitrratr.com/ – SocialMention: http://socialmention.com/
  • 16. Why not use existing sentiment apps? • Easy to search for opinions about famous people, brands and so on • Hard to search for more abstract concepts, perform a non- keyword based string search – e.g. to find opinions about Lady Gaga's' dress, you can only search on “Lady Gaga” to get hits • And the opinion finding they do isn't very good...
  • 17. Why are these sites unsuccessful? • They don't work well at more than a very basic level • They mainly use dictionary lookup for positive and negative words • They classify the tweets as positive or negative, but not with respect to the keyword you're searching for • First, the keyword search just retrieves any tweet mentioning it, but not necessarily about it as a topic • Second, there is no correlation between the keyword and the sentiment: the sentiment refers to the tweet as a whole • Sometimes this is fine, but it can also go horribly wrong
  • 18. Whitney Houston wasn't very popular...
  • 20. Negative tweets about Whitney? • the #nationalenquirers cover photo of #whitneyhouston lying in a casket has sparked outrage in the #media world • feeling bad for #whitneyhouston right now • It's such a shame #whitneyhouston is dead
  • 21. What did people think of the Olympics?
  • 22. Twittrater's view of the Olympics • A keyword search for Olympics shows exactly how existing systems fail to cut the mustard • Lookup of sentiment words is not enough if – they're part of longer words – they're used in different contexts – the tweet itself isn't relevant – they're used in a negative or sarcastic sentence – they're ambiguous
  • 24. What's the capital of Spain? A: Barcelona B: Madrid C: Valencia D: Seville
  • 25. What's the height of Mt Kilimanjaro? A: 19,341 ft B: 23,341 ft C: 15,341 ft D: 21,341 ft
  • 26. Go for the majority or trust an expert? • It depends what kind of question you're asking • In Who Wants to Be a Millionaire, people tend to ask the audience fairly early on, because once the questions get hard, they can't rely on the audience getting it right • Similarly with opinion mining, you have to know who to trust What's the height of Mt What's the capital of Spain? Kilimanjaro? A: 19,341 ft A: Barcelona B: 23,341 ft B: Madrid C: 15,341 ft C: Valencia D: 21,341 ft D: Seville
  • 27. Arcomem project • EU-funded research project involving memory institutions like archives, museums, and libraries in the age of the Social Web. • Arcomem aims to – help transform archives into collective memories that are more tightly integrated with their community of users – exploit Social Web and the wisdom of crowds to make Web archiving a more selective and meaning-based process • http://vimeo.com/45954490
  • 28. Key questions • What are the opinions on crucial social events and who are the main protagonists? • How are these opinions distributed in relation to demographic user data? • How have these opinions evolved? • Who are the opinion leaders? • What is their impact and influence?
  • 29. ETOEs detection in Arcomem • ETOES - Entities, Topics, Opinions and Events • ETOEs detection is used to guide the crawler and to add useful metadata to the information stored for future search • Opinion mining builds on entities, terms and events identified, finding relevant opinions about them • Text-based sentiment analysis performed by GATE tools (USFD) in English and German • Text-based analysis is then combined with topic detection to produce clustered opinions over documents • Image- and video-based opinion mining also plays a supporting role, e.g. match new images against existing images which have been manually annotated as positive or negative
  • 30. ETOEs Workflow Relation Extraction Linguistic Entity Preprocessing Extraction Event Extraction Word Sense Topic Identification Opinion Mining Determination Dynamics Capturing
  • 32. Main Research Directions  New techniques for opinion mining based around entities and events in social media - different from typical product reviews.  Approach involves building up a number of linguistic subcomponents  Includes techniques for multimedia opinion analysis  Investigating the reusability of these components for different domains, document types, languages etc (Knowledge Transfer)  Experiments on cross- domain/language adaptation • Aggregation of opinions over a document  Extension of work from LivingKnowledge project on drilling down into subtopics about specific entities, rather than just getting information about general topics  Provide summaries of opinions for multiple occurrences of entities in a document
  • 33. Text-based Opinion Mining in Arcomem • Performed basic sentiment analysis on various English and German datasets • Documents may be in multiple languages, e.g. Greek crisis documents contain both English and Greek sentences • Entities and events used as seeds for opinion objects • Experimenting with sentiment lexicons for initial lookup - currently SentiWordNet (English), SentiWS (German) and additions automatically derived from test corpora • Components for detection of negatives, swear words etc • Output is an opinion feature on the entity/event, plus polarity and score features where appropriate
  • 34. Opinion Mining sub-components • Top level sub-components  source and target identification (entity/event-based)  opinion-feature association (uses linguistic relations) • Dealing with the complexities of natural language  incorrect English  negatives  conditional statements  slang/swear words/  irony/sarcasm  spam opinion detection
  • 35. What can we realistically hope to achieve? ● Opinion mining is hard, and it's harder because: ● we have lots of different text types and domains. ● we're processing social media ● we're processing multiple language ● we don't necessarily know what we're looking for ● Experimenting with different techniques and subcomponents to build up a complex OM system ● Experimenting with techniques for cross-language adaptation
  • 36. GATE (General Architecture for Text Engineering) ● GATE is a tool for Language Engineering in development in Sheffield since 2000. ● GATE includes: – components for language processing, e.g. parsers, machine learning tools, stemmers, IR tools, IE components for various languages... – tools for visualising and manipulating text, annotations, ontologies, parse trees, etc. – various information extraction tools – evaluation and benchmarking tools • More info and freely available at http://gate.ac.uk
  • 37. Applications • Developed a series of initial applications for opinion mining from social media using GATE • Based on previous work identifying political opinions from tweets • Extended to more generic analysis about any kind of entity or event, in 2 domains – Greek financial crisis – Rock am Ring (German rock festival) • Uses a variety of social media including twitter, facebook and forum posts • Based on entity and event extraction
  • 38. Machine learning approach  ML: automating the process of inferring new data from existing data  In GATE, that means creating annotations or adding features to annotations by learning how they relate to other annotations  For example, we haveToken annotations with string features and Product annotations ● ML could learn that a Product close to the Token “stinks” expresses a negative sentiment, then add a polarity =“negative” feature to the Sentence. The new Acme Model 33 stinks ! Token Token Token Token Token Token Token Product Sentence
  • 39. Rule-based techniques • These rely primarily on sentiment dictionaries, plus some rules to do things like attach sentiments to targets, or modify the sentiment scores • Examples include: – analysis of political tweets (Maynard and Funk, 2011) – analysis of opinions expressed about political events and rock festivals in social media (Maynard, Bontcheva and Rout, 2012) – SO-CAL (Taboada et al, 2011) for detecting positive and negative sentiment of ePinions reviews on the web.
  • 40. Why Rule-based? • Although ML applications are typically used for Opinion Mining, this task involves documents from many different text types, genres, languages and domains • This is problematic for ML because it requires many applications trained on the different datasets, and methods to deal with acquisition of training material • Aim of using a rule-based system is that the bulk of it can be used across different kinds of texts, with only the pre- processing and some sentiment dictionaries which are domain and language-specific
  • 41. GATE Application • Structural pre-processing, specific to social media types • Linguistic pre-processing (including language detection), NE, term and event recognition • Additional targeted gazetteer lookup • JAPE grammars • Aggregation of opinions • Dynamics
  • 42. Structural pre-processing on Twitter Original markups set
  • 43. Linguistic pre-processing • Language identification (per sentence) using TextCat • Standard tokenisation, POS tagging etc using GATE • Modified versions of ANNIE and TermRaider for NE and term recognition • Event recognition using specially developed GATE application (e.g. band performance, economic crisis, industrial strike)
  • 44. Language ID with TextCat
  • 45. Basic approach for opinion finding • Find sentiment-containing words in a linguistic relation with entities/events (opinion-target matching) • Use a number of linguistic sub-components to deal with issues such as negatives, irony, swear words etc. • Starting from basic sentiment lookup, we then adjust the scores and polarity of the opinions via these components
  • 46. Sentiment finding components • Flexible Gazetteer Lookup: matches lists of affect/emotion words against the text, in any morphological variant • Gazetteer Lookup: matches lists of affect/emotion words against the text only in non-variant forms, i.e. exact string match (mainly the case for specific phrases, swear words, emoticons etc.) • Sentiment Grammars: set of hand-crafted JAPE rules which annotate sentiments and link them with the relevant targets and opinion holders • RDF Generation: create the relevant RDF-XML for the annotations according to the data model (so they can be used by other components)
  • 47. Opinions on Greek Crisis Text
  • 48. Austrian Parliament • Our partners in the Austrian Parliament have quite specific requests for opinion mining on Parliamentary texts (in German):  identification of comments being in favour or against a draft bill  if possible, distinguish between the reasons for dislike (e.g. because it goes too far, or does not go far enough)  if possible, also distinguish alternative concepts in the comments • These are difficult challenges, mainly to be tackled in the later stages of the project as we extend beyond positive/negative/neutral opinions • Different strategies required, e.g. not necessarily starting from entities and events, but making more use of topics
  • 49. Sentiment Finding in Parliamentary docs
  • 50. Opinion scoring • Sentiment gazetteers (developed from sentiment words in WordNet) have a starting “strength” score • These get modified by context words, e.g. adverbs, swear words, negatives and so on
  • 51. Opinions about Entities in Rock am Ring
  • 52. Opinions about sentences in RockamRing
  • 53. Challenges imposed by social media • Language: specific pre-processing for Twitter. use shallow analysis techniques with back-off strategies; incorporate specific subcomponents for swear words, sarcasm etc. • Relevance: topics and comments can rapidly diverge. Solutions involve training a classifier or using clustering techniques • Target identification: use an entity-centric approach • Contextual information: use metadata for further information, also aggregation of data can be useful
  • 54. Short sentences, e.g. tweets • Social media, and especially tweets, can be problematic because sentences are very short and/or incomplete • Typically, linguistic pre-processing tools such as POS taggers and parsers do badly on such texts • Even things like language identification tools can have problems • The best solution is to try not to rely too heavily on these tools – Does it matter if we get the wrong language for a sentence? – Do we actually need full parsing? – Can we use other clues when POS tags may be incorrect?
  • 55. Dealing with incorrect English • Frequent problem in any NLP task involving social media • Incorrect capitalisation, spelling, grammar, made-up words (eg swear words, infixes) • Some specific pre-processing • Backoff strategies include ● using more flexible gazetteer matching ● using case-insensitive resources (but be careful) ● avoiding full parsing and using shallow techniques ● using very general grammar rules ● adding specialised gazetteer entries for common misspellings, or using co-reference techniques
  • 56. Tokenisation • Plenty of “unusual”, but very important tokens in social media: – @Apple – mentions of company/brand/person names – #fail, #SteveJobs – hashtags expressing sentiment, person or company names – :-(, :-), :-P – emoticons (punctuation and optionally letters) – URLs • Tokenisation key for entity recognition and opinion mining • A study of 1.1 million tweets: 26% of English tweets have a URL, 16.6% - a hashtag, and 54.8% - a user name mention [Carter, 2013].
  • 57. Example #WiredBizCon #nike vp said when @Apple saw what http://nikeplus.com did, #SteveJobs was like wow I didn't expect this at all. ● Tokenising on white space doesn't work that well: ● Nike and Apple are company names, but if we have tokens such as #nike and @Apple, this will make the entity recognition harder, as it will need to look at sub-token level ● Tokenising on white space and punctuation characters doesn't work well either: URLs get separated (http, nikeplus), as are emoticons and email addresses
  • 58. The GATE Twitter Tokeniser • Treat RTs, emoticons, and URLs as 1 token each • #nike is two tokens (# and nike) plus a separate annotation HashTag covering both. Same for @mentions • Capitalisation is preserved, but an orthography feature is added: all caps, lowercase, mixCase • Date and phone number normalisation, lowercasing, and other such cases are optionally done later in separate modules • Consequently, tokenisation is faster and more generic
  • 59. De-duplication and Spam Removal • Approach from [Choudhury & Breslin, #MSM2011]: • Remove as duplicates/spam: – Messages with only hashtags (and optional URL) – Similar content, different user names and with the same timestamp are considered to be a case of multiple accounts – Same account, identical content are considered to be duplicate tweets – Same account, same content at multiple times are considered as spam tweets
  • 60. Normalisation • “RT @Bthompson WRITEZ: @libbyabrego honored?! Everybody knows the libster is nice with it...lol...(thankkkks a bunch;))” • OMG! I’m so guilty!!! Sprained biibii’s leg! ARGHHHHHH!!!!!! • Similar to SMS normalisation • For some later components to work well (POS tagger, parser), it is necessary to produce a normalised version of each token • BUT uppercasing, and letter and exclamation mark repetition often convey strong sentiment, so we keep both versions of tokens • Syntactic normalisation: determine when @mentions and #tags have syntactic value and should be kept in the sentence, vs replies, retweets and topic tagging
  • 61. Irony and sarcasm • Life's too short, so be sure to read as many articles about celebrity breakups as possible. • I had never seen snow in Holland before but thanks to twitter and facebook I now know what it looks like. Thanks guys, awesome! • On a bright note if downing gets injured we have Henderson to come in.
  • 62. How do you know when someone is being sarcastic? • Use of hashtags in tweets such as #sarcasm • Large collections of tweets based on hashtags can be used to make a training set for machine learning • But you still have to know which bit of the tweet is the sarcastic bit To the hospital #fun #sarcasm Man , I hate when I get those chain letters & I don't resend them , then I die the next day .. #Sarcasm lol letting a baby goat walk on me probably wasn't the best idea. Those hooves felt great. #sarcasm
  • 63. How else can you deal with it? • Look for word combinations with opposite polarity, e.g. “rain” or “delay” plus “brilliant” Going to the dentist on my weekend home. Great. I'm totally pumped. #sarcasm • Inclusion of world knowledge / ontologies can help (e.g. knowing that people typically don't like going to the dentist, or that people typically like weekends better than weekdays. • It's an incredibly hard problem and an area where we expect not to get it right that often • Still very much work in progress for us
  • 64. Evaluation • Very hard to measure opinion polarity beyond positive/negative/neutral • On a small corpus of 20 facebook posts, we identified sentiment-containing sentences with 86% Precision and 71% Recall • Of these, the polarity accuracy was 66% • While this is not that high, not all the subcomponents are complete in the system, so we would expect better results with improved methods for negation and sarcasm detection • NE recognition was high on these texts: 92% Precision and 69% Recall (compared with other NE evaluations on social media)
  • 65. Comparison of Opinion Finding in Different Tasks Corpus Sentiment Polarity Target detection detection assignment Political Tweets 78% 79% 97.9% Financial Crisis Facebook 55% 81.8% 32.7% Financial Crisis Tweets 90% 93.8% 66.7%
  • 66. Image-opinion identification Facial expression analysis/classification – Working on processing pipeline for this – Helps with facial similarity calculations and face recognition – Can be used to predict sentiment/polarity – Can be combined with analysis text from document Coarse-grained opinion classification – Looking at image-feature classification for abstract concepts (sentiment / privacy / attractiveness) – e.g. looking at image colours, placement of interesting images in the picture
  • 67. Multimodal opinion analysis ● Investigate correlation between images and whole-document opinions ● Do documents asserting specific opinions get illustrated with the same imagery? ● e.g. articles about euro-scepticism in the UK might be illustrated with images of specific Conservative peers…. ● Is there correlation between low-level image features and specific opinions? ● Investigate finer-grained (i.e. sub-document) correlations between imagery and opinions ● e.g. sentence-level correlations incorporating analysis of the document layout
  • 68. Summary • Ongoing work on developing opinion mining tools and adapting them to social media • Deal with multilinguality, ungrammatical English, and very short posts (tweets) • Components for negation, swear words, sarcasm etc • Promising initial evaluations • Much further work still to come, e.g. integration with multimedia components
  • 69. Further information • Work done in the context of the EU-funded ARCOMEM and TrendMiner projects • See http://www.arcomem.eu and http://www.trend-miner.eu for more details • More information about GATE at http://gate.ac.uk • More information about opinion mining see the LREC 2012 Tutorial “Opinion Mining: Exploiting the Sentiment of the Crowd” • Module 12 of the GATE Training Course (new material since June 2012) https://gate.ac.uk/family/training.html