How Emotional are
  Users’ Needs?
            Exploring Emotion in Query Logs
                                      Marina Santini
                                       29 Jan 2013
 Marina Santini - CyberEmotions2013
 Warsaw University of Technology                       1
 29-30 Jan 2013
Outline
• Inspirational Triggers:
   o The Big Unstructured Textual Data Issue
   o Emotion in IR
   o Research hypothesis

• Genre- and Emotion- Profiling of Query Logs
   o Characterization of genre
   o Definition of emotion
   o Benefits of genre and emotion awareness in query log analysis

• Experiments
   o Query Logs from GenitoriCrescono thematic blog (in Italian)
   o Query Logs from Västra Götaland Region (in Swedish)

• Conclusions



Inspirational Trigger 1
  BIG UNSTRUCTURED TEXTUAL DATA



Big Unstructured Textual Data
 “Merrill Lynch estimates that more than 85 percent of
  all business information exists as unstructured data –
  commonly appearing in e‐mails, memos, notes from
  call centers and support operations, news, user
  groups, chats, reports, letters, surveys, white
  papers, marketing material, research, presentations
  and web pages.” [DM Review Magazine, February
  2003 Issue]

 ECONOMIC LOSS!
                                      Lots of different genres!

Simple search is not
               enough…
• Of course, it is possible to use simple search. But
  simple search is unrewarding because it is based on
  single terms.
    o ”a search is made on the term felony. In a simple search, the term felony
      is used, and everywhere there is a reference to felony, a hit to an
      unstructured document is made. But a simple search is crude. It does not
      find references to crime, arson, murder, embezzlement, vehicular
      homicide, and such, even though these crimes are types of felonies” [
      Source: Inmon, B. & A. Nesavich, "Unstructured Textual Data in the
      Organization" from "Managing Unstructured data in the
      organization", Prentice Hall 2008, pp. 1–13]
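The felony example quoted above can be sketched in a few lines. This is a minimal illustration, not the method described by Inmon and Nesavich: the `NARROWER_TERMS` mapping, the sample `DOCUMENTS`, and both function names are hypothetical and exist only to show why a single-term search misses related documents.

```python
# Hypothetical mini-taxonomy (an assumption for illustration): some of the
# crime types named in the quote, grouped under the broader term "felony".
NARROWER_TERMS = {
    "felony": ["arson", "murder", "embezzlement", "vehicular homicide"],
}

# Hypothetical sample documents.
DOCUMENTS = [
    "The suspect was charged with a felony.",
    "Police are investigating a case of arson downtown.",
    "The accountant was convicted of embezzlement.",
    "The weather was sunny all week.",
]


def simple_search(term, documents):
    """Return documents that contain the literal term (case-insensitive)."""
    term = term.lower()
    return [doc for doc in documents if term in doc.lower()]


def expanded_search(term, documents, narrower=NARROWER_TERMS):
    """Return documents that contain the term or any of its narrower terms."""
    terms = [term.lower()] + narrower.get(term.lower(), [])
    return [doc for doc in documents if any(t in doc.lower() for t in terms)]


simple_hits = simple_search("felony", DOCUMENTS)
expanded_hits = expanded_search("felony", DOCUMENTS)
print(len(simple_hits), "hit(s) with simple search")
print(len(expanded_hits), "hit(s) with expanded search")
```

The simple search only finds the document that literally mentions the term, while the expanded search also reaches documents about specific crime types.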




Text Analytics
• A set of NLP techniques that provide some structure
  to textual documents.
• Common components:
    o   Tokenization
    o   Morphological Analysis
    o   Syntactic Analysis
    o   Named Entity Recognition
    o   Sentiment Analysis
    o   Automatic Summarization
    o   Etc.




Text Analytics Products
      and Frameworks
• Commercial:
   o   Attensity
   o   Clarabridge
   o   Temis
   o   Lexalytics
   o   Texify
   o   SAS
   o   IBM Cognos
   o   etc.
• Open Source:
   o   GATE
   o   NLTK
   o   UIMA
   o   etc.


                                     Business Intelligence (BI)
                                     Customer Experience Management (CEM)



Actionable Intelligence
• Business Intelligence (BI) + Customer Experience
  Management (CEM) = Actionable Intelligence

• Actionable Intelligence is information that:
   1. must be accurate and verifiable
   2. must be timely
   3. must be comprehensive
   4. must be comprehensible
   5. must give the power to make decisions and to act
      straightaway

In 2003, Merrill Lynch pointed out
     that it was too difficult to automatically
     extract usable intelligence
     from the following genres:

       o e‐mails
       o memos
       o notes from call centers and support operations
       o news
       o user groups
       o chats
       o reports
       o letters
       o surveys
       o white papers
       o marketing material
       o research
       o presentations
       o web pages

Today… Previous genres plus
       • Blogs
       • Tweets
       • FB microposts
       • FB comments
       • Many other social network textual ”interactions”


From Big Data to Query Logs
Current State of Affairs
1. Big Unstructured Textual Data
2. Text Analytics (commercial products and frameworks)
3. Structured information for BI and CEM
Viable Alternative
• Query Logs
• Genre- & Context-aware Text Analytics
• Actionable Information (BI, CEM, sentiment, emerging topics…)
The main advantage of using query logs (when they are
available) instead of other genres is
REDUCED DATA SIZE, REDUCED PRE-PROCESSING,
REDUCED NOISE, REDUCED DATA CLEANING!
Typical Use Case
A company managing:
• Website
• Blog
• eMails
• Facebook Page
• Twitter account
SearchInFocus
Exploratory Query-log Analysis Workshop
Organized by Findwise AB, Sweden
SLTC 2012
Exploratory Study on Query Logs
   and Actionable Intelligence


                                       Query Logs provide
                                       Actionable Intelligence for:
                                       - search providers
                                       - clients
                                       - end-users
Inspirational Trigger 2
EMOTION IN INFORMATION RETRIEVAL (IR)



Emotion in IR
       o Three concepts:
             • Emotion need
             • Emotion object
             • Emotion relevance

Role of Emotion in Information Retrieval
by Yashar Moshfeghi
PhD Thesis at University of Glasgow, 2012
”uncover social situations where emotion is the primary
factor (i.e., source of motivation) in an IR&S
process.” (from the abstract)




Emotion Need
• The whole IR&S behaviour is driven by an emotion
  need.

• An emotion need is more fundamental than an
  information need in the sense that if an information
  need exists it implies that there is an underlying
  emotion need to satisfy this information need.

• Emotion needs, even when they do not lead to a
  particular information need, can motivate
  searchers to use an IR system.
Research Hypothesis for the
   exploration of emotion in
          query logs
It is plausible that much of the IR&S behaviour is driven by an
emotion need and that users’ emotions are expressed in the
  queries that are typed in search boxes and stored in query
                               logs.


If this is true, emotion extraction from query logs also provides
actionable intelligence, because extracted emotions can be
used to improve decision making and to make better-grounded
future choices.

Research Questions
• Is it possible to extract emotion from query logs?

• If so, is it possible to use emotion from query logs for
  actionable intelligence?




Genre Profiling of Query Logs



What characterizes a
                genre?
1.   Must have a name
2.   Must be recognized within a community
3.   Must be produced during a task
4.   Must have conventions
5.   Must raise expectations
6.   Can change over time. It is a cultural artifact
     (culture here includes
     society, media, technology, etc.)


The query log genre is…
         a newly acknowledged but fully
                          emerged webgenre
1. Name: in line with other digital genres (e.g., web log,
    blog)
2. Community: internet users, IR practitioners
3. Task: to express searchers’ needs in a search engine
4. Conventions: short texts written in ”keywordese”
5. Expectations: to find information relevant to the
   query
6. Cultural artifact: a product of internet-based
   society OR a by-product of search engines
The query log genre:
          Linguistic and Textual
              Conventions
• Length: short text (a query log can be seen as a
  corpus of very short texts, shorter than
  tweets, mobile text messages, chat logs, etc.)
• Sublanguage/Jargon: ”keywordese”
• Register: neutral
• Morphology: REDUCED
• Syntax : REDUCED (usually no subclauses, etc.)



The Query Log Genre:
              Benefits
• wrt discourse analysis:
    o Conceptually lean and essential jargon
       • reduced morphology
       • reduced syntax
       • short texts
       • mostly nouns and verbs

   Benefit1: Predictable Sublanguage

• wrt BIG UNSTRUCTURED TEXTUAL DATA
  BENEFIT 2: REDUCED SIZE, REDUCED PRE-
  PROCESSING; LITTLE DATA CLEANING!

Emotion Profiling of Query Logs



What is emotion?
BROAD DEFINITION: ANY DEGREE OF JUDGEMENTAL EVALUATION,
             AS IN SENTISTRENGTH’S SCALE:
      A DUAL 5-POINT SYSTEM FOR POSITIVE [1; 2; 3; 4; 5]
         AND NEGATIVE [-1; -2; -3; -4; -5] EMOTIONS
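The dual scale can be made concrete with a small sketch. This is not SentiStrength's implementation: `TOY_LEXICON` and `dual_score` are hypothetical, the word strengths are invented for illustration, and the sketch simply assumes that the strongest positive and the strongest negative word in a text determine its two scores, with 1 and -1 meaning that no emotion was found.

```python
# Hypothetical word strengths (assumptions, not SentiStrength's lexicon).
TOY_LEXICON = {"love": 3, "happy": 2, "hate": -4, "aggressive": -3}


def dual_score(text, lexicon=TOY_LEXICON):
    """Return a (positive, negative) pair on the dual 5-point scale.

    positive is in 1..5 and negative is in -1..-5; (1, -1) means that
    no word of the text was found in the lexicon.
    """
    strengths = [lexicon[w] for w in text.lower().split() if w in lexicon]
    positive = max([s for s in strengths if s > 0], default=1)
    negative = min([s for s in strengths if s < 0], default=-1)
    return positive, negative


print(dual_score("aggressive children"))
print(dual_score("happy but aggressive"))
print(dual_score("feber"))
```

Keeping two separate scores, rather than one summed value, lets a single query carry both a positive and a negative emotion at the same time.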


Explorations



Thematic Blog – Italian
      Logs from Google
         Analytics




Genitori Crescono (Parents Grow Up)
http://genitoricrescono.com/
• Parents Grow Up: learning the profession of parenting
  together
• About: parenthood, childcare, maternity, upbringing,
  behaviours during childhood…
• Belongs to: FattoreMammaNetwork (gathers websites
  targeted to mothers and written by mothers)
Queries from Google Analytics
www.genitoricrescono.com - Search Overview, 2009-01-01 to 2012-11-10
                       togliere il pannolino = stop wearing nappies/stop using diapers
                       genitori crescono = the name of the website
                       Nopron = the name of a controversial syrup to make children
                       sleep all night long

                       Tracy Hogg = a maternity nurse to Hollywood stars, known as
                       'the baby whisperer' for her skill in calming unruly infants

                       nanna = familiar word for sleep: bye-byes (Brit), beddy-byes

                       neonato 4 mesi = 4-month-old baby
                       io mi svezzo da solo = I wean by myself

                       nulla osta = certificate of no impediment

                       aborto terapeutico = therapeutic abortion
Zipf’s
                                          distribution



                                     “… much research has shown
                                      that query term frequency
                                      distributions conform to the
                                         power law, or long tail
                                      distribution curves. That is, a
                                       small portion of the terms
                                     observed in a large query log
                                     (e.g. > 100 million queries) are
                                       used most often, while the
                                     remaining terms are used less
                                           often individually."
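The long-tail pattern described in the quote can be inspected with a simple rank-frequency count. The sketch below is only an illustration: `TOY_QUERIES` is a hypothetical handful of queries built from terms shown on the previous slide, not the real log, and `term_frequencies` and `head_share` are assumed helper names.

```python
from collections import Counter

# Hypothetical toy log (an assumption); the real log is far larger.
TOY_QUERIES = [
    "nanna",
    "nanna neonato",
    "togliere il pannolino",
    "nanna",
    "bambini aggressivi",
    "pannolino",
    "neonato 4 mesi",
    "nanna bambini",
]


def term_frequencies(queries):
    """Return (term, count) pairs sorted from most to least frequent."""
    counts = Counter()
    for query in queries:
        counts.update(query.lower().split())
    return counts.most_common()


def head_share(ranked, k):
    """Return the fraction of all term occurrences covered by the top k terms."""
    total = sum(count for _, count in ranked)
    return sum(count for _, count in ranked[:k]) / total


ranked = term_frequencies(TOY_QUERIES)
for rank, (term, count) in enumerate(ranked, start=1):
    print(rank, term, count)
print("share of the top term:", round(head_share(ranked, 1), 2))
```

On a real log, plotting rank against count on logarithmic axes is a common way to check how closely the distribution follows a power law.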



Parts of Speech
[Chart: distribution of parts of speech in the queries: nouns; verbs; adjectives and adverbs; articles and prepositions]
Most Frequent Syntactic
         Patterns
                         inserimento al nido = settling in at nursery
                         bambini aggressivi = aggressive children
                         metodo estivill = Estivill method




Average Lengths




“The average length of a search query was 2.4 terms"

“… in a recent study in 2011 it was found that the average
length of queries has grown steadily over time and average
length of non-English languages queries had increased more
than English queries."

Long query, informal
                   syntax


How to stop breastfeeding and make it sleep alone i am planning second pregnacy




• Queries’ Emotional
  Strength (i)
                                         SentiStrength
                                        (basic options)




The power of genre and
    the importance of the
  communicative situation
• ”bambini aggressivi”

• Refinement of the concept presented in ”Topic-
  based Sentiment Analysis in the Social Media …”
  (Thelwall and Buckley, 2012): the polarity of affect
  words might flip according to genre and the
  communicative situation, and not only according
  to the topic.

Addition: Emotion Words




Emotional Strength:
  Basic vs. Boosted




Negation
Query (English gloss)                                       Basic Options    Boosted
                                                            pos    neg       pos    neg
bambini che non mangiano (children who do not eat)          2      -1        1      -1
quando i bambini non dormono (when children do not sleep)   2      -1        1      -1




Most Frequent Word Trigrams




Query ”Normalization”
• Stopword removal
• Lemmatization
• And ideally synonym expansion
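The three normalization steps can be sketched as follows. Everything in this block is an assumption made for illustration: the `STOPWORDS`, `LEMMAS`, and `SYNONYMS` tables are tiny hand-made stand-ins for real Italian resources, and `normalize` and `expand` are hypothetical helper names, not the tooling used in the study.

```python
# Hypothetical, hand-made resources (assumptions for illustration only).
STOPWORDS = {"il", "i", "la", "che", "al", "quando"}
LEMMAS = {"bambini": "bambino", "dormono": "dormire", "mangiano": "mangiare"}
SYNONYMS = {"nanna": ["sonno"]}


def normalize(query):
    """Lowercase, drop stopwords, and map each remaining token to its lemma."""
    tokens = query.lower().split()
    tokens = [t for t in tokens if t not in STOPWORDS]
    return [LEMMAS.get(t, t) for t in tokens]


def expand(tokens):
    """Append the known synonyms of each token to the token list."""
    expanded = list(tokens)
    for token in tokens:
        expanded.extend(SYNONYMS.get(token, []))
    return expanded


print(normalize("quando i bambini non dormono"))
print(normalize("bambini che non mangiano"))
print(expand(normalize("nanna")))
```

The negation word "non" is deliberately left out of the stopword list in this sketch, since removing it would change the meaning of the query.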




Use emotion needs as Actionable Intelligence


Ex: for increasing traffic to a website
Increase emotion relevance:
    • Be empathetic to searchers’ problems by
      sympathising and by converting the negative
      words into more neutral concepts
    • Give heart and hope and offer many solutions…
    • In a few words: offer a new
      communication strategy…

Public organization website

 Enterprise search and log
  server by Findwise, AB.



Within the Västra Götaland
    Region website…




…hittavård [find health
        care center]


                                     Regional
                                     HealthCare




VGR Corpus Description
• Corpus Time frame: 2010-2011 (2 years)

• Description: “These logs come from the search
  at hittavard.vgregion.se. The biggest bulk should come
  from 1177.se. The rest should be from vgregion.se. The
  target audience are both VGR (Västra Götalands
  Region) users/employees as well as the general
  public, as it is a public site. The internal files are
  searches made from within the VGR…”

• Corpus size:
    o   size = 3,167 KB (only queries) (BIG DATA is usually > 1TB)
    o   number of queries = 249,243
    o   number of words = 306,453
• Average query length: 1.23 words
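The average query length follows directly from the two corpus counts above. The short check below uses only those reported figures; the helper `average_query_length` and its toy input are hypothetical additions that show how the same value could be computed from raw queries.

```python
# Corpus figures as reported on the slide.
NUM_QUERIES = 249_243
NUM_WORDS = 306_453

# Average query length = total number of words / total number of queries.
average_length = NUM_WORDS / NUM_QUERIES
print(round(average_length, 2))


def average_query_length(queries):
    """Return the mean number of whitespace-separated words per query."""
    return sum(len(q.split()) for q in queries) / len(queries)


# Toy input (an assumption): one two-word query and one single-word query.
print(average_query_length(["boka tid", "feber"]))
```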

VGR Top Queries




                                 egenremiss = self-referral

                                 mina vårdkontakter = my healthcare contacts

                                 webbisar = an invented word referring to newborn
                                 babies whose pictures have been published on the web

                                 sjukresa/or = trip(s) to the hospital
Linguistic Remarks
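• At the top of the frequency list:
    o Simple nouns
        • feber = fever
        • influensa = influenza
        • klamydia = chlamydia
        • …
    o Compounds
        • urinvägsinfektion = urinary tract infection
        • öroninflammation = ear infection
        • Reseersättning = travel reimbursement
        • …
    o V+N
        • byta vårdcentral = change health care center
        • avboka tid = cancel an appointment
        • boka tid = book an appointment
        • …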




More complex constructions
       at the bottom




SentiStrength on VGR




It seems that no emotion is
conveyed by VGR users…

• Are Swedes less emotional than Italians?
• Is the ”healthcare” topic less emotional than the
  ”childcare” topic?




It might be that…
     There is a difference in users’ emotional
     behaviour between specifying queries to a
     web search engine and using the
     search engine of a specialized website.




Emotion Interpretation…
is not straightforward…

• There are several factors to be accounted for:
    o One important factor is the context of communication: similar words or
      sentences can convey positive emotion in a query and negative emotion
      in a Facebook post, for example.




different communicative contexts = different genres




Genre Awareness
• In practical terms, genre awareness is important in
  text analytics and sentiment analysis because, all
  things being equal:
    o it lets you choose the easiest and least problematic texts to
      process;
    o it helps interpret and disambiguate the real meanings of
      words and sentences according to the different
      communicative contexts in which they appear.




In Summary



Is it possible to identify
 and extract emotion from
         query logs?
• It is possible to identify and extract emotion from web
  query logs.

• It seems more difficult to extract emotion from
  enterprise search engine query logs.



Is it possible to use
     emotion from query logs
           for actionable
            intelligence?
• If present, query log emotion can be used for
  actionable intelligence.




What do you think?




      Thank you for your attention!

Details



Benefits for the Search
              Provider
• Mining query logs to extract user-created knowledge, i.e.
  queries that can be used as tags (metadata)
• Quickly create domain-specific taxonomies you can
  capitalize upon, especially for new client companies
  working in related fields
• Enhancements of current search products
• Inexpensive creation of annotated corpora: document
  annotation through query logs is a simple technique that
  in a short time will build massive annotated corpora
  to use for machine learning, which will allow more
  sophisticated search refinements.


Benefits for Clients & End
              Users
• Somebody said: SEARCH MUST BE MIND READER!
• BUT ALSO faster, more friendly, more exhaustive and more
  accurate.
• If this happens, clients will spend less on customer care. If
  the end user finds what s/he needs online and
  quickly, there is no need to call a helpdesk or customer
  care service.
• Through the analysis of query logs, log analysts can spot
  the least ”satisfied” queries (i.e. users’ needs). Companies
  can use this information to plan future products,
  product enhancements, marketing strategies, etc. (BI)

More Related Content

Similar to How Emotional Are Users' Needs? Emotion in Query Logs

Big Data Analytics : Understanding for Research Activity
Big Data Analytics : Understanding for Research ActivityBig Data Analytics : Understanding for Research Activity
Big Data Analytics : Understanding for Research Activity
Andry Alamsyah
 
Togy Jose: Organizational Network Analytics - Revealing the Real Networks
Togy Jose: Organizational Network Analytics - Revealing the Real NetworksTogy Jose: Organizational Network Analytics - Revealing the Real Networks
Togy Jose: Organizational Network Analytics - Revealing the Real Networks
Edunomica
 
SFBA_SUG_2023-08-02.pdf
SFBA_SUG_2023-08-02.pdfSFBA_SUG_2023-08-02.pdf
SFBA_SUG_2023-08-02.pdf
Becky Burwell
 
Iotx futures research_futures_trends_2011
Iotx futures research_futures_trends_2011Iotx futures research_futures_trends_2011
Iotx futures research_futures_trends_2011
Andy Hunter
 
Deck 92-146 (3)
Deck 92-146 (3)Deck 92-146 (3)
Deck 92-146 (3)
Thinkful
 
Put it Together
Put it TogetherPut it Together
Put it Together
UXPA Boston
 
Knowledge Extraction from Social Media
Knowledge Extraction from Social MediaKnowledge Extraction from Social Media
Knowledge Extraction from Social Media
Seth Grimes
 
Startds9.19.17sd
Startds9.19.17sdStartds9.19.17sd
Startds9.19.17sd
Thinkful
 
Oess NCRM Festival
Oess NCRM FestivalOess NCRM Festival
Data sci sd-11.6.17
Data sci sd-11.6.17Data sci sd-11.6.17
Data sci sd-11.6.17
Thinkful
 
Getstarteddssd12717sd
Getstarteddssd12717sdGetstarteddssd12717sd
Getstarteddssd12717sd
Thinkful
 
Big Data Analytics : A Social Network Approach
Big Data Analytics : A Social Network ApproachBig Data Analytics : A Social Network Approach
Big Data Analytics : A Social Network Approach
Andry Alamsyah
 
The Elusive Nature of Software Documentation
The Elusive Nature of Software DocumentationThe Elusive Nature of Software Documentation
The Elusive Nature of Software Documentation
Margaret-Anne Storey
 
Enriching social media personas with personality traits
Enriching social media personas with personality traitsEnriching social media personas with personality traits
Enriching social media personas with personality traits
Joni Salminen
 
D92-198gstindspdx
D92-198gstindspdxD92-198gstindspdx
D92-198gstindspdx
Thinkful
 
Datamining Etnography
Datamining  EtnographyDatamining  Etnography
Datamining Etnography
Suhermin Pujiati
 
Netnography online course part 1 of 3 17 november 2016
Netnography online course part 1 of 3 17 november 2016Netnography online course part 1 of 3 17 november 2016
Netnography online course part 1 of 3 17 november 2016
suresh sood
 
SENTIMENT ANALYSIS – SARCASM DETECTION USING MACHINE LEARNING
SENTIMENT ANALYSIS – SARCASM DETECTION USING MACHINE LEARNINGSENTIMENT ANALYSIS – SARCASM DETECTION USING MACHINE LEARNING
SENTIMENT ANALYSIS – SARCASM DETECTION USING MACHINE LEARNING
IRJET Journal
 
Overview of Data and Analytics Essentials and Foundations
Overview of Data and Analytics Essentials and FoundationsOverview of Data and Analytics Essentials and Foundations
Overview of Data and Analytics Essentials and Foundations
NUS-ISS
 
Text Mining : Experience
Text Mining : ExperienceText Mining : Experience
Text Mining : Experience
Boonlert Aroonpiboon
 

Similar to How Emotional Are Users' Needs? Emotion in Query Logs (20)

Big Data Analytics : Understanding for Research Activity
Big Data Analytics : Understanding for Research ActivityBig Data Analytics : Understanding for Research Activity
Big Data Analytics : Understanding for Research Activity
 
Togy Jose: Organizational Network Analytics - Revealing the Real Networks
How Emotional Are Users' Needs? Emotion in Query Logs

  • 1. How Emotional are Users’ Needs? Exploring Emotion in Query Logs Marina Santini 29 Jan 2013 Marina Santini - CyberEmotions2013 Warsaw University of Technology 1 29-30 Jan 2013
  • 2. Outline
      • Inspirational Triggers:
          o The Big Unstructured Textual Data Issue
          o Emotion in IR
          o Research hypothesis
      • Genre- and Emotion-Profiling of Query Logs
          o Characterization of genre
          o Definition of emotion
          o Benefits of genre and emotion awareness in query log analysis
      • Experiments
          o Query Logs from GenitoriCrescono thematic blog (in Italian)
          o Query Logs from Västra Götaland Region (in Swedish)
      • Conclusions
  • 3. Inspirational Trigger 1: BIG UNSTRUCTURED TEXTUAL DATA
  • 4. Big Unstructured Textual Data
      “Merrill Lynch estimates that more than 85 percent of all business information exists as unstructured data – commonly appearing in e-mails, memos, notes from call centers and support operations, news, user groups, chats, reports, letters, surveys, white papers, marketing material, research, presentations and web pages.” [DM Review Magazine, February 2003 Issue]
      ECONOMIC LOSS! Lots of different genres!
  • 5. Simple search is not enough…
      • Of course, it is possible to use simple search. But simple search is unrewarding, because it is based on single terms.
          o “a search is made on the term felony. In a simple search, the term felony is used, and everywhere there is a reference to felony, a hit to an unstructured document is made. But a simple search is crude. It does not find references to crime, arson, murder, embezzlement, vehicular homicide, and such, even though these crimes are types of felonies” [Source: Inmon, B. & A. Nesavich, "Unstructured Textual Data in the Organization" from "Managing Unstructured data in the organization", Prentice Hall 2008, pp. 1–13]
  • 6. Text Analytics
      • A set of NLP techniques that provide some structure to textual documents.
      • Common components:
          o Tokenization
          o Morphological Analysis
          o Syntactic Analysis
          o Named Entity Recognition
          o Sentiment Analysis
          o Automatic Summarization
          o Etc.
  • 7. Text Analytics Products and Frameworks
      • Commercial:
          o Attensity
          o Clarabridge
          o Temis
          o Lexalytics
          o Texify
          o SAS
          o IBM Cognos
          o etc.
      • Open Source:
          o GATE
          o NLTK
          o UIMA
          o etc.
      Business Intelligence (BI); Customer Experience Management (CEM)
  • 8. Actionable Intelligence
      • Business Intelligence (BI) + Customer Experience Management (CEM) = Actionable Intelligence
      • Actionable Intelligence is information that:
          1. must be accurate and verifiable
          2. must be timely
          3. must be comprehensive
          4. must be comprehensible
          5. must give the power to make decisions and to act straightaway
  • 9. In 2003, Merrill Lynch pointed out that it was too difficult to automatically extract usable intelligence from the following genres:
          o e-mails
          o memos
          o notes from call centers and support operations
          o news
          o user groups
          o chats
          o reports
          o letters
          o surveys
          o white papers
          o marketing material
          o research
          o presentations
          o web pages
      Today… Previous genres plus:
      • Blogs
      • Tweets
      • FB microposts
      • FB comments
      • Many other social network textual “interactions”
  • 10. From Big Data to Query Logs
      Current State of Affairs:
          1. Big Unstructured Textual Data
          2. Text Analytics (commercial products and frameworks)
          3. Structured information for BI and CEM
      Viable Alternative:
      • Query Logs
      • Genre- & Context-aware Text Analytics
      • Actionable Information (BI, CEM, sentiment, emerging topics…)
      The main advantage of using query logs (when they are available) instead of other genres consists in: REDUCED DATA SIZE, REDUCED PRE-PROCESSING, REDUCED NOISE, REDUCED DATA CLEANING!
      Typical Use Case: a company managing:
      • Website
      • Blog
      • eMails
      • Facebook Page
      • Twitter account
  • 11. Exploratory Query-log Analysis
      Workshop organized by Findwise AB, Sweden: SearchInFocus
      SLTC 2012: Exploratory Study on Query Logs and Actionable Intelligence
      Query Logs provide Actionable Intelligence for:
          - search providers
          - clients
          - end-users
  • 12. Inspirational Trigger 2: EMOTION IN INFORMATION RETRIEVAL (IR)
  • 13. Emotion in IR
      Role of Emotion in Information Retrieval, by Yashar Moshfeghi, PhD Thesis at University of Glasgow, 2012
          o Three concepts:
              • Emotion need
              • Emotion object
              • Emotion relevance
      “uncover social situations where emotion is the primary factor (i.e., source of motivation) in an IR&S process.” (from the abstract)
  • 14. Emotion Need
      • The whole IR&S behaviour is driven by an emotion need.
      • An emotion need is more fundamental than an information need in the sense that if an information need exists it implies that there is an underlying emotion need to satisfy this information need.
      • Emotion needs, even when they do not lead to a particular information need, can motivate searchers to use an IR system.
  • 15. Research Hypothesis for the exploration of emotion in query logs
      It is plausible that much of the IR&S behaviour is driven by an emotion need and that users’ emotions are expressed in the queries that are typed in search boxes and stored in query logs. If this is true, emotion extraction from query logs also provides actionable intelligence, because extracted emotions can be used to improve decision making and to make future choices more grounded.
  • 16. Research Questions
      • Is it possible to extract emotion from query logs?
      • If so, is it possible to use emotion from query logs for actionable intelligence?
  • 17. Genre Profiling of Query Logs
  • 18. What characterizes a genre?
          1. Must have a name
          2. Must be recognized within a community
          3. Must be produced during a task
          4. Must have conventions
          5. Must raise expectations
          6. Can change over time. It is a cultural artifact (culture here includes society, media, technology, etc.)
  • 19. The query log genre is… a newly acknowledged but fully-emerged web genre
          1. Name: in line with other digital genres (ex: web log → blog)
          2. Community: internet users, IR practitioners
          3. Task: to express searchers’ needs in a search engine
          4. Conventions: short texts written in “keywordese”
          5. Expectations: to find information relevant to the query
          6. Cultural artifact: a product of internet-based society OR a subproduct of search engines
  • 20. The query log genre: Linguistic and Textual Conventions
      • Length: short text (a query log can be seen as a corpus of very short texts, shorter than tweets, mobile text messages, chat logs, etc.)
      • Sublanguage/Jargon: “keywordese”
      • Register: neutral
      • Morphology: REDUCED
      • Syntax: REDUCED (usually no subclauses, etc.)
  • 21. The Query Log Genre: Benefits
      • wrt discourse analysis:
          o Conceptually lean and essential jargon
              • reduced morphology
              • reduced syntax
              • short texts
              • mostly nouns and verbs
        BENEFIT 1: PREDICTABLE SUBLANGUAGE
      • wrt BIG UNSTRUCTURED TEXTUAL DATA
        BENEFIT 2: REDUCED SIZE, REDUCED PRE-PROCESSING, LITTLE DATA CLEANING!
  • 22. Emotion Profiling of Query Logs
  • 23. What is emotion?
      BROAD DEFINITION: ANY DEGREE OF JUDGEMENTAL EVALUATION. LIKE SENTISTRENGTH’S SCALE: DUAL 5-POINT SYSTEM FOR POSITIVE [1; 2; 3; 4; 5] AND NEGATIVE [-1; -2; -3; -4; -5] EMOTIONS
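The dual scale can be illustrated with a small sketch. It assumes, as a simplification, that a query's positive score is the strongest positive word it contains and its negative score the strongest negative word, with 1 and -1 meaning that no emotion was detected. The lexicon entries and their strengths are invented for illustration and are not taken from SentiStrength; `TOY_LEXICON` and `dual_score` are hypothetical names.

```python
# Toy lexicon: words and strengths are invented for illustration only.
# Positive strengths lie in 2..5, negative strengths in -2..-5.
TOY_LEXICON = {"aggressivi": -3, "felice": 3}


def dual_score(text, lexicon=TOY_LEXICON):
    """Return a (positive, negative) pair on the dual 5-point scale.

    Assumption: the strongest word on each side decides the score,
    and (1, -1) is the neutral result when no lexicon word is found.
    """
    positive, negative = 1, -1
    for word in text.lower().split():
        strength = lexicon.get(word, 0)
        if strength > 0:
            positive = max(positive, strength)
        elif strength < 0:
            negative = min(negative, strength)
    return positive, negative


neutral = dual_score("nulla osta")             # no lexicon word present
negative_only = dual_score("bambini aggressivi")
```

Reporting two numbers instead of one net value keeps mixed queries visible: a query can be strongly positive and strongly negative at the same time.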
  • 24. Explorations
  • 25. Thematic Blog – Italian Logs from Google Analytics
  • 26. Parents Grow Up: Genitori Crescono
      • http://genitoricrescono.com/
      • to learn the parenting profession together
      • belongs to: FattoreMammaNetwork (gathers websites targeted to mothers and written by mothers)
      • About: parenthood, childcare, maternity, upbringing, behaviours during childhood…
  • 27. Queries from Google Analytics
      www.genitoricrescono.com - Search Overview, 2009-01-01 to 2012-11-10
          o togliere il pannolino = stop wearing nappies/stop using diapers
          o genitori crescono = the website name
          o Nopron = the name of a controversial syrup to make children sleep all night long
          o Tracy Hogg = maternity nurse to Hollywood stars, known as 'the baby whisperer' for her skill in calming unruly infants
          o nanna = familiar: bye-byes (Brit), beddy-byes
          o neonato 4 mesi = 4-month-old baby
          o io mi svezzo da solo = I wean myself
          o nulla osta = certificate of no impediment
          o aborto terapeutico = therapeutic abortion
  • 28. Zipf’s distribution
      “… much research has shown that query term frequency distributions conform to the power law, or long tail distribution curves. That is, a small portion of the terms observed in a large query log (e.g. > 100 million queries) are used most often, while the remaining terms are used less often individually.”
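A rank-frequency table is the usual first step for checking whether a query log shows this long-tail shape. The sketch below is a minimal illustration, not the analysis used in the talk: `term_rank_frequency` is a hypothetical helper, and the sample list reuses a handful of queries quoted on the slides, so its counts say nothing about the real log.

```python
from collections import Counter


def term_rank_frequency(queries):
    """Return (rank, term, count) tuples, most frequent term first.

    Terms are lowercased and split on whitespace; ties are broken
    alphabetically so the ranking is deterministic.
    """
    counts = Counter(term for q in queries for term in q.lower().split())
    ranked = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    return [(rank, term, count)
            for rank, (term, count) in enumerate(ranked, start=1)]


# Illustrative sample only: a few queries quoted on the slides.
sample_queries = [
    "togliere il pannolino",
    "genitori crescono",
    "bambini aggressivi",
    "bambini che non mangiano",
    "quando i bambini non dormono",
]

table = term_rank_frequency(sample_queries)
for rank, term, count in table[:3]:
    print(rank, term, count)
```

On a full log, plotting count against rank on log-log axes shows whether the head is short and the tail long, as the quoted passage describes.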
  • 29. Parts of Speech
      [Chart categories: nouns; verbs; adjectives and adverbs; articles and prepositions]
  • 30. Most Frequent Syntactic Patterns
      • inserimento al nido
      • bambini aggressivi
      • metodo estivill
  • 31. Average Lengths
      “The average length of a search query was 2.4 terms”
      “in a recent study in 2011 it was found that the average length of queries has grown steadily over time and average length of non-English languages queries had increased more than English queries.”
  • 32. Long query, informal syntax
      • How to stop breastfeeding and make it sleep alone
      • i am planning second pregnacy
  • 33. Queries’ Emotional Strength (i): SentiStrength (basic options)
  • 34. The power of genre and the importance of the communicative situation
      • “bambini aggressivi” (aggressive children)
      • Refinement of the concept presented in “Topic-based Sentiment Analysis in the Social Media …” (Thelwall and Buckley, 2012): the polarity of affect words might flip according to genre and the communicative situation, and not only according to the topic.
  • 35. Addition: Emotion Words
  • 36. Emotional Strength: Basic vs. Boosted
  • 37. Negation
      Query                                                        Basic Options   Boosted
      bambini che non mangiano (children who do not eat)           2 -1            1 -1
      quando i bambini non dormono (when children do not sleep)    2 -1            1 -1
  • 38. Most frequent word trigrams
  • 39. Query “Normalization”
      • Stopword removal
      • Lemmatization
      • And ideally synonym expansion
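The first two normalization steps can be sketched as follows. This is a minimal illustration under stated assumptions: the stopword set and the lemma table are tiny hand-made stand-ins (a real pipeline would use a full Italian stopword list and a proper lemmatizer), `normalize` is a hypothetical name, and the negation word "non" is deliberately kept because dropping it would change what the query expresses. Synonym expansion is left out.

```python
# Hand-made stand-ins for illustration; not a complete Italian resource.
STOPWORDS = {"il", "i", "che", "al", "quando"}
LEMMAS = {
    "bambini": "bambino",
    "aggressivi": "aggressivo",
    "mangiano": "mangiare",
    "dormono": "dormire",
}


def normalize(query):
    """Lowercase, drop stopwords, and map each remaining token to its lemma.

    Tokens missing from the lemma table are kept unchanged.
    """
    tokens = query.lower().split()
    return [LEMMAS.get(token, token) for token in tokens
            if token not in STOPWORDS]


first = normalize("bambini che non mangiano")
second = normalize("quando i bambini non dormono")
```

After normalization, surface variants of the same need collapse onto shared lemmas, which makes frequency counts and emotion scoring less fragmented.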
  • 40. Use emotion needs as Actionable Intelligence
      Ex: for increasing traffic to a website
      Increase emotion relevance:
      • be empathetic to searchers’ problems by sympathising and by converting the negative words into more neutral concepts
      • give heart and hope and offer many solutions…
      • in a few words: offer a new communication strategy…
  • 41. Public organization website
      Enterprise search and log server by Findwise AB.
  • 42. Within the Västra Götaland Region website…
  • 43. …hittavård [find health care center]
      Regional HealthCare
  • 44. VGR Corpus Description
      • Corpus time frame: 2010-2011 (2 years)
      • Description: “These logs come from the search at hittavard.vgregion.se. The biggest bulk should come from 1177.se. The rest should be from vgregion.se. The target audience are both VGR (Västra Götalands Region) users/employees as well as the general public, as it is a public site. The internal files are searches made from within the VGR…”
      • Corpus size:
          o size = 3,167 KB (only queries) (BIG DATA is usually > 1TB)
          o number of queries = 249,243
          o number of words = 306,453
      • Average query length: 1.23 words
  • 45. VGR Top Queries • egenremiss = self-certification • mina vårdkontakter = my healthcare contacts • webbisar = an invented word referring to newborn babies whose pictures have been published on the web • sjukresa/or = trip to the hospital
  • 46. Linguistic Remarks • At the top of the frequency list: o Simple nouns: feber, influensa, klamydia, … o Compounds: urinvägsinfektion, öroninflammation, reseersättning, … o V+N: byta vårdcentral, avboka tid, boka tid, …
  • 47. More complex constructions at the bottom
  • 48. SentiStrength on VGR
  • 49. It seems that no emotion is conveyed by VGR users… • Are Swedes less emotional than Italians? • Is the ”healthcare” topic less emotional than the ”childcare” topic?
  • 50. It might be that… There is a difference in users’ emotional behaviour when specifying queries to a web search engine OR when using the search engine of a specialized website.
  • 51. Emotion Interpretation… is not straightforward… • There are several factors to be accounted for: o One important factor is the context of communication: similar words or sentences can convey positive emotion in a query and negative emotion in a Facebook post, for example.
  • 52. different communicative contexts = different genres
  • 53. Genre Awareness • In practical terms, genre awareness is important in text analytics and sentiment analysis because, all things being equal, it: o lets you choose the easiest and least problematic texts to process; o helps interpret and disambiguate the real meanings of words and sentences according to the different communicative contexts in which they appear.
  • 54. In Summary
  • 55. Is it possible to identify and extract emotion from query logs? • It is possible to identify and extract emotion from web query logs. • It seems more difficult to extract emotion from enterprise search engine query logs.
  • 56. Is it possible to use emotion from query logs for actionable intelligence? • If present, query log emotion can be used for actionable intelligence.
  • 57. What do you think? Thank you for your attention!
  • 58. Details
  • 59. Benefits for the Search Provider • Mining query logs to extract user-created knowledge, i.e. queries that can be used as tags (metadata) • Quickly create domain-specific taxonomies you can capitalize upon, especially for new client companies working in related fields • Enhancements of current search products • Inexpensive creation of annotated corpora: document annotation through query logs is a simple technique that in a short time will build massive annotated corpora to use for machine learning, which will allow more sophisticated search refinements.
  • 60. Benefits for Clients & End Users • Somebody said: SEARCH MUST BE A MIND READER! • BUT ALSO faster, friendlier, more exhaustive and more accurate. • If this happens, clients will spend less on customer care. If the end user finds what s/he needs online and quickly, there is no need to call a helpdesk or customer care service. • Through the analysis of query logs, log analysts can spot the less ”satisfied” queries (i.e. users’ needs). Companies can use this information to plan future products, product enhancements, marketing strategies, etc. (BI)
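
The query profiling steps mentioned on slides 38, 39 and 44 (n-gram frequency lists, stopword removal, average query length) can be sketched roughly as follows. This is a minimal illustration and not the pipeline used in the experiments: the stopword list, the function names and the whitespace tokenization are all assumptions, and lemmatization and synonym expansion are left out.

```python
from collections import Counter

# Hypothetical, deliberately tiny Italian stopword list (an assumption for
# illustration). The negation "non" is kept on purpose, because negation
# matters for sentiment scoring.
STOPWORDS = {"i", "il", "la", "le", "che", "da", "al", "di"}


def tokenize(query):
    """Lowercase a query and split it on whitespace."""
    return query.lower().split()


def remove_stopwords(tokens):
    """Drop tokens found in the stopword list."""
    return [t for t in tokens if t not in STOPWORDS]


def average_query_length(queries):
    """Mean number of whitespace-separated words per query."""
    if not queries:
        return 0.0
    return sum(len(tokenize(q)) for q in queries) / len(queries)


def ngram_frequencies(queries, n, normalize=False):
    """Count word n-grams over all queries, optionally after stopword removal."""
    counts = Counter()
    for query in queries:
        tokens = tokenize(query)
        if normalize:
            tokens = remove_stopwords(tokens)
        counts.update(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))
    return counts


# Example queries taken from the slides.
queries = [
    "bambini che non mangiano",
    "quando i bambini non dormono",
    "bambini aggressivi",
]

print(average_query_length(queries))
print(ngram_frequencies(queries, 3).most_common(3))
print(ngram_frequencies(queries, 2, normalize=True).most_common(3))
```

Applied to the VGR figures on slide 44, the same ratio, 306,453 words over 249,243 queries, gives about 1.23 words per query, which matches the reported average query length.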

Editor's Notes

  1. In this talk, I would like to share and discuss with you the preliminary results of the emotional analysis of query logs. There is still much to be investigated in this field and to be capitalized on. How Emotional are Users’ Needs? Abstract: Emotional behaviour seems to be ubiquitous on the web. Predictably, social media web genres such as tweets, blog posts and blog comments show high emotional involvement. What about other genres on the web? In this talk, the focus is on the search query log genre. According to recent IR research, searchers’ behaviour is not only limited to traditional informational, navigational and transactional needs. A novel hypothesis is that the seeking behaviour is driven by emotion. But can emotion be detected by analysing the queries typed by users in a search box? In this talk, I will present the results of some experiments carried out to investigate whether it is possible to identify emotion in the query log genre, and discuss how emotion could be utilized to improve the relevance of retrieved documents in searches. These experiments are part of SearchInFocus, a study centred on search.
  2. Big dataset: Hadoop, R, etc. Merrill Lynch: financial management and advisory, www.ml.com/. Merrill Lynch is one of the world's leading financial management and advisory companies, providing financial advice and investment banking services. E‐mails, memos, notes from call centers and support operations, news, user groups, chats, reports, letters, surveys, white papers, marketing material, research, presentations, etc. are different genres, i.e. different types of text. For example, emails and white papers are both textual genres but they differ a lot from each other. They might deal with the same topic, but in a completely different way. So the type of information related to the same topic can vary according to genre.
  3. Business intelligence (BI) is the ability of an organization to collect, maintain, and organize data. This produces large amounts of information that can help develop new opportunities. Identifying these opportunities, and implementing an effective strategy, can provide a competitive market advantage and long-term stability. BI technologies provide historical, current and predictive views of business operations. Customer Experience Management (CEM) is the practice of actively listening to the Voice of the Customer through a variety of listening posts, analyzing customer feedback to create a basis for acting on better business decisions and then measuring the impact of those decisions to drive even greater operational performance and customer loyalty. Through this process, a company strategically organizes itself to manage a customer's entire experience with its product, service or company. Companies invest in CEM to improve customer retention.
  4. If this company wants to analyse the interaction with clients/customers/fans/complainers to identify main topics, main problems, main sentiment, and decide which directions are profitable for the future, I would suggest starting from query log analysis.
  5. Findwise….
  6. So if we have genre awareness, we can decide which genre is better than another for our purposes. I tried to advocate the use of query logs (when available) because they might be easier to mine. Can we also find emotions in query logs?
  7. It has been shown in previous research that emotion plays an important role in the success of an IR&S process which has the purpose of satisfying an information need. However, these previous studies do not give a sufficiently prominent position to emotion in IR, since they limit the role of emotion to a secondary factor, by assuming that a lack of knowledge (the need for information) is the primary factor (the motivation of the search). Yashar proposes to treat emotion as the principal factor in the system of needs of a searcher, and therefore one that ought to be considered by the retrieval algorithms. He presents a view of searchers’ needs by considering not only theories from information retrieval and science, but also from psychology, philosophy, and sociology. We extensively report on the role of emotion in every aspect of human behaviour, both at an individual and social level. This serves not only to modify the current IR views of emotion, but more importantly to uncover social situations where emotion is the primary factor (i.e., source of motivation) in an IR&S process. Emotion need: an individual or group’s desire to be in a particular emotion state by means of acquiring information and/or emotion (p. 52). Emotion object: emotion extracted from the content of a document that represents the emotion of the document creator, the emotion of document viewer (p. 56). Emotion relevance: an IR system must know about searchers’ emotion need as well as their information needs (p. 60).
  8. Previous research: the role of emotion in the information seeking process is to alleviate and/or diminish the negative feelings experienced because of uncertainty, so the emotion need here is to experience positive feelings of satisfaction via obtaining information. Yashar argues that physiological needs are not directly satisfied through an information seeking process, but that they instead lead either to an emotion or information need that initiates the information seeking behaviour which goes on to satisfy these needs. For example, hunger (i.e., physiological need) can lead to either searching for close-by restaurants (i.e., information need) or negative emotion states (e.g., frustration) needing to be resolved by watching funny clips (i.e., emotion need). Due to this delegation of physiological need to information or emotion need, we do not further investigate physiological need. Therefore, all we need is to investigate the relationship between information need and emotion need. Yashar argues that an emotion need is more fundamental than an information need in the sense that if an information need exists it implies that there is an underlying emotion need to satisfy this information need. The whole IR&S behaviour is thus driven by an emotion need. However, the converse may not necessarily be true, e.g., a user could want to be happy/sad/angry but without having a well-defined IN. Thus, whenever information need is discussed, an emotion need is preexistent. In the case when the emotion need of the searcher is to diminish the negative feelings associated with a lack of knowledge (i.e., an IN), the emotion need would be satisfied if the IN associated with it is resolved. For example, if a searcher’s IN is to know about topic x, the searcher must believe that information about x has been acquired, in order for their emotion need to be satisfied.
Thus, the emotion need will not be resolved unless the underlying information need is resolved, since in this context, the information need is the dominant one. There are in fact emotion needs that do not imply an information need in the way we have defined information above. An example of such needs are the scenarios explained in Section 3.5.4, i.e., users who are stressed and look at some clips that they know will help to relieve their stress. Of course, one way of remembering these clips is by employing the associative nature of the relationship between emotion and memory (see Section 3.4.4). Other ways include looking at the popular (most viewed/highly recommended) objects. In all these scenarios there is no particular information need to be resolved, but only an emotion need, e.g., when searchers are seeking for funny clips in YouTube. In these scenarios, it is argued that the emotion aspect of information objects is more important than their information aspect, and we label them as extreme emotion need scenarios. Thus, one can present emotion need as a continuous spectrum ranging from informational needs to extreme emotional needs. It has been shown that information need motivates searchers to engage with an IR system. An emotion need can be a motive for searchers to use an IR system when it manifests itself as information need. It is our belief that emotion needs, even when they do not lead to a particular information need, can motivate searchers to use an IR system.
  9. Yashar says that emotion is the driving force of Information Retrieval and Seeking. If this is true, it is plausible that we can find emotion in the queries that the user writes in a search box, and consequently in query logs.
  10. “Keywordese”, i.e. the kind of sublanguage/jargon we use to communicate with search engines (that is, a language without articles, without prepositions and other stop words, without much syntax or hedges, etc.): query logs are skimmed texts that require no cleaning from redundancies or rhetorical ornaments, and reduced pre-processing.
  11. One of the many blogs focussing on maternity and child-related issues. There is a wide network of similar blogs and websites in Italy called FattoreMamma Network: http://fattoremamma.com/network/
  12. Repetitions. Use of function words (stopwords). Proper names. One-word and multi-word queries. Book titles: Io mi svezzo da solo, written by Lucio Piermarini, pediatrician. Wean = accustom an infant to food other than its mother's milk. Nulla osta = certificate of no impediment. Aborto terapeutico = therapeutic abortion = miscarriage. Dr. Estivill, a pediatrician and neurophysiologist, is the director of the Sleep Clinic at the Institut Dexeus in Barcelona, Spain. He is also the coordinator of the Sleep Unit at Catalonia General Hospital and Incosol Clinic in Marbella. Dr. Estivill has written many popular books on sleep and other habits, including 5 Days to a Perfect Night’s Sleep for your Child. Togliere il pannolino = stop wearing nappies/stop using diapers; the masculine plural article ”i” has been used. Nopron: syrup = psicofarmaco = drug used in the treatment of mental conditions. Fare la nanna: nanna (familiar) = bye-byes (Brit.), beddy-byes. Neonato 4 mesi. Nulla osta. Io mi svezzo da solo. Tracy Hogg. (not provided): About a year ago, Google decided that all users logged into Google, such as Google+ and Gmail, would be redirected from http://google.com to https://google.com, thereby encrypting data. Google claims this move was done to protect users’ privacy. While the jury is still out on whether or not this move protects anyone’s privacy, one thing is for certain: web-based businesses and SEO experts have to jump through some hoops in order to get around the dreaded "(not provided)" keyword.
  13. High frequency of nouns and verbs indicates density of information (Biber, 1988: 105). Adjectives elaborate nominal information. A density of content words,
  14. Inserimento al nido = ”settling-in phase" (period in which children are gradually introduced to the nursery)
  15. When we know the genre of texts, we can surmise the purpose of the text producer. When we have a high frequency of a string such as ”bambini aggressivi” in a specialized blog for parents, it probably means that users have a problem with aggressive children and they want to find suggestions on how to solve the problem and also find some empathy (they are not the only ones having this problem). So the negative adjective aggressivi does not convey a negative feeling but a positive energy to solve a problem. So this query conveys a positive emotion and the hope that something negative (an aggressive behaviour) can be changed or solved. These types of adjectives, which are negative at face value, should become positive because they express an emotion need that is a positive attitude. What I am proposing here is a refinement of the concept presented in ”Topic-based Sentiment Analysis in the Social Media …” (Thelwall and Buckley, 2012): the polarity of affect words might flip according to genre (which gives us a hint of the purpose for which a text has been produced) and the communicative situation, and not only the topic. If somebody writes ”bambini aggressivi” in a Facebook post, the purpose is probably different, like witnessing children who behave aggressively.
  16. Tweaking is not enough.
  17. Not sure how to interpret this
  18. Instead of applying a general affect vocabulary, one easy way to identify the core affect vocabulary is to make an n-gram frequency list. In this example you can see the most frequent trigrams from GenitoriCrescono. You can see many repetitions because in these queries there are many function words (articles, prepositions, etc.). In order to get a cleaner list, the queries can be normalized (stopword removal, lemmatization).
  19. National health care (NHC)
  20. Not big data
  21. Self-certification. My medical center contact. Mammography. German measles. Change healthcare ward. Webbisar = regulation on how to publish pictures of newborn babies on the web.
  22. Urinary tract infection. Ear infection. Reseersättning = travel reimbursement.
  23. 20 occurrences…
  24. I think that the VGR website genre is less emotional… There is possibly a difference in users’ emotional behaviour when specifying queries to a web search engine and when using a specialized website. In the first case, users communicate their emotions more extensively. In the second case, they just specify the single word that best represents their information need.
  25. The expectation from a public service website is to be informative. VGR queries are specified by users who are in the website and use the internal search engine. Therefore they go straight to the point by specifying a single word without any descriptive part of speech. On the web, by contrast, users must describe in a more detailed way what they are looking for. It would have been interesting to compare how the same topics expressed in the VGR queries were specified in a forum or a medical blog.…
  26. From the exploratory results, we can say that