How Emotional Are Users' Needs? Emotion in Query Logs

Emotional behaviour seems to be ubiquitous on the web. Predictably, social media web genres such as tweets, blog posts and blog comments show high emotional involvement. What about other genres on the web? In this talk, the focus is on the search query log genre. According to recent IR research, searchers’ behaviour is not only limited to traditional informational, navigational and transactional needs. A novel hypothesis is that the seeking behaviour is driven by emotion. But can emotion be detected by analysing the queries typed by users in a search box? In this talk, I will present the results of some experiments carried out to investigate whether it is possible to identify emotion in the query log genre, and discuss how emotion could be utilized to improve the relevance of retrieved documents in searches. These experiments are part of SearchInFocus, a study centred on search.

Published in: Technology

  • In this talk, I would like to share and discuss with you the preliminary results of an emotional analysis of query logs. There is still much to be investigated in this field and to be capitalized on.
  • Big datasets: Hadoop, R, etc. Merrill Lynch – financial management and advisory (www.ml.com/): Merrill Lynch is one of the world's leading financial management and advisory companies, providing financial advice and investment banking services. E‐mails, memos, notes from call centers and support operations, news, user groups, chats, reports, letters, surveys, white papers, marketing material, research, presentations, etc. are different genres, i.e. different types of text. For example, emails and white papers are both textual genres but they differ a lot from each other. They might deal with the same topic, but in a completely different way. So the type of information related to the same topic can vary according to genre.
  • Business intelligence (BI) is the ability of an organization to collect, maintain, and organize data. This produces large amounts of information that can help develop new opportunities. Identifying these opportunities, and implementing an effective strategy, can provide a competitive market advantage and long-term stability. BI technologies provide historical, current and predictive views of business operations. Customer Experience Management (CEM) is the practice of actively listening to the Voice of the Customer through a variety of listening posts, analyzing customer feedback to create a basis for acting on better business decisions and then measuring the impact of those decisions to drive even greater operational performance and customer loyalty. Through this process, a company strategically organizes itself to manage a customer's entire experience with its product, service or company. Companies invest in CEM to improve customer retention.
  • If this company wants to analyse the interaction with clients/customers/fans/complainers to identify main topics, main problems, main sentiment, and decide which directions are profitable for the future, I would suggest starting from query log analysis.
  • Findwise….
  • So if we have genre awareness, we can decide which genre is better than another for our purposes. I tried to advocate the use of query logs (when available) because they might be easier to mine. Can we also find emotions in query logs?
  • It has been shown in previous research that emotion plays an important role in the success of an IR&S process which has the purpose of satisfying an information need. However, these previous studies do not give a sufficiently prominent position to emotion in IR, since they limit the role of emotion to a secondary factor, by assuming that a lack of knowledge (the need for information) is the primary factor (the motivation of the search).
    Yashar proposes to treat emotion as the principal factor in the system of needs of a searcher, and therefore one that ought to be considered by the retrieval algorithms. He presents a view of searchers’ needs by considering not only theories from information retrieval and science, but also from psychology, philosophy, and sociology. We extensively report on the role of emotion in every aspect of human behaviour, both at an individual and social level. This serves not only to modify the current IR views of emotion, but more importantly to uncover social situations where emotion is the primary factor (i.e., source of motivation) in an IR&S process.
    Emotion need: an individual or group’s desire to be in a particular emotion state by means of acquiring information and/or emotion (p. 52).
    Emotion object: emotion extracted from the content of a document that represents the emotion of the document creator, the emotion of document viewer (p. 56).
    Emotion relevance: an IR system must know about searchers’ emotion need as well as their information needs (p. 60).
  • Previous research: the role of emotion in the information seeking process is to alleviate and/or diminish the negative feelings experienced because of uncertainty, so the emotion need here is to experience positive feelings of satisfaction via obtaining information.
    Yashar argues that physiological needs are not directly satisfied through an information seeking process, but that they instead lead either to an emotion or information need that initiates the information seeking behaviour which goes on to satisfy these needs. For example, hunger (i.e., physiological need) can lead to either searching for close-by restaurants (i.e., information need) or negative emotion states (e.g., frustration) needing to be resolved by watching funny clips (i.e., emotion need). Due to this delegation of physiological need to information or emotion need, we do not further investigate physiological need. Therefore, all we need is to investigate the relationship between information need and emotion need.
    Yashar argues that an emotion need is more fundamental than an information need in the sense that if an information need exists it implies that there is an underlying emotion need to satisfy this information need. The whole IR&S behaviour is thus driven by an emotion need. However, the converse may not necessarily be true, e.g., a user could want to be happy/sad/angry but without having a well-defined information need (IN). Thus, whenever information need is discussed, an emotion need is preexistent. In the case when the emotion need of the searcher is to diminish the negative feelings associated with a lack of knowledge (i.e., an IN), the emotion need would be satisfied if the IN associated with it is resolved. For example, if a searcher’s IN is to know about topic x, the searcher must believe that information about x has been acquired, in order for their emotion need to be satisfied. Thus, the emotion need will not be resolved unless the underlying information need is resolved, since in this context, the information need is the dominant one.
    There are in fact emotion needs that do not imply an information need in the way we have defined information above. An example of such needs are the scenarios explained in Section 3.5.4, i.e., users who are stressed and look at some clips that they know will help to relieve their stress. Of course, one way of remembering these clips is by employing the associative nature of the relationship between emotion and memory (see Section 3.4.4). Other ways include looking at the popular (most viewed/highly recommended) objects. In all these scenarios there is no particular information need to be resolved, but only an emotion need, e.g., when searchers are seeking funny clips on YouTube. In these scenarios, it is argued that the emotion aspect of information objects is more important than their information aspect, and we label them as extreme emotion need scenarios. Thus, one can present emotion need as a continuous spectrum ranging from informational needs to extreme emotional needs.
    It has been shown that information need motivates searchers to engage with an IR system. An emotion need can be a motive for searchers to use an IR system when it manifests itself as information need. It is our belief that emotion needs, even when they do not lead to a particular information need, can motivate searchers to use an IR system.
  • Yashar says that emotion is the driving force of Information Retrieval and Seeking. If this is true, it is plausible that we can find emotion in the queries that the user writes in a search box, and consequently in query logs.
  • Because they are written in “keywordese”, i.e. the kind of sublanguage/jargon we use to communicate with search engines (that is, a language without articles, without prepositions and other stop words, without much syntax or hedges, etc.), query logs are skimmed texts that require no cleaning from redundancies or rhetorical ornaments, and only reduced pre-processing.
  • One of the many blogs focussing on maternity and child-related issues. There is a wide network of similar blogs and websites in Italy called FattoreMammaNetwork: http://fattoremamma.com/network/
  • Repetitions; use of function words (stopwords); proper names; one-word and multi-word queries; book titles.
    Io mi svezzo da solo: book written by Lucio Piermarini, pediatrician. Wean = accustom an infant to food other than its mother's milk.
    Nulla osta = certificate of no impediment.
    Aborto terapeutico = therapeutic abortion = miscarriage.
    Dr. Estivill, a pediatrician and neurophysiologist, is the director of the Sleep Clinic at the Institut Dexeus in Barcelona, Spain. He is also the coordinator of the Sleep Unit at Catalonia General Hospital and Incosol Clinic in Marbella. Dr. Estivill has written many popular books on sleep and other habits, including 5 Days to a Perfect Night’s Sleep for your Child.
    Togliere il pannolino = stop wearing nappies/stop using diapers. The masculine plural article “i” has been used.
    Nopron: syrup = psicofarmaco = drug used in treatment of mental conditions.
    Fare la nanna: nanna = (familiar) bye-byes (Brit), beddy-byes.
    Other queries: neonato 4 mesi; nulla osta; io mi svezzo da solo; Tracy Hogg.
    (not provided): About a year ago, Google decided that all users logged into Google (such as Google+, Gmail) would be redirected from http://google.com to https://google.com, thereby encrypting data. Google claims this move was done to protect users’ privacy. While the jury is still out on whether or not this move protects anyone’s privacy, one thing is for certain: web-based businesses and SEO experts have to jump through some hoops in order to get around the dreaded "(not provided)" keyword.
  • High frequency of nouns and verbs indicates density of information (Biber, 1988: 105). Adjectives elaborate nominal information. A density of content words,
  • Inserimento al nido = “settling-in phase” (period in which children are gradually introduced to the nursery)
  • When we know the genre of texts, we can surmise the purpose of the text producer. When we have a high frequency of a string such as “bambini aggressivi” in a specialized blog for parents, it probably means that users have a problem with aggressive children and they want to find suggestions on how to solve the problem and also find some empathy (they are not the only ones having this problem). So the negative adjective aggressivi does not convey a negative feeling but a positive energy to solve a problem. So this query conveys a positive emotion and the hope that something negative (an aggressive behaviour) can be changed or solved. These types of adjectives, which are negative at face value, should become positive because they express an emotion need that is a positive attitude. What I am proposing here is a refinement of the concept presented in “Topic-based Sentiment Analysis in the Social Media …” (Thelwall and Buckley, 2012): the polarity of affect words might flip according to genre (which gives us a hint of why a text has been produced) and the communicative situation, and not only the topic. If somebody writes “bambini aggressivi” on FB, the purpose is probably different, like witnessing children who behave aggressively.
  • Tweaking is not enough.
  • Not sure how to interpret this
  • Instead of applying a general affect vocabulary, one easy way to identify the core affect vocabulary is to make an n-gram frequency list. In this example you can see the most frequent trigrams from genitoricrescono. You can see many repetitions because in these queries there are many function words (articles, prepositions, etc.). In order to get a cleaner list…
  • National Health Care / NHC
  • Not big data
  • Self-certification; my medical center contact; mammography; German measles; change healthcare ward; webbisar = regulation on how to publish pictures of newborn babies on the web
  • Urinary tract infection; ear infection; reseersättning = travel reimbursement
  • 20 occurrences…
  • I think that the VGR website genre is less emotional… There is possibly a difference in users’ emotional behaviour when specifying queries to a web search engine and when using a specialized website. In the first case, users communicate their emotions more extensively. In the second case, they just specify the single word that best represents their information need.
  • The expectation from a public service website is to be informative. VGR queries are specified by users who are in the website and use the internal search engine. Therefore they go straight to the point by specifying a single word without any descriptive part of speech. On the web, by contrast, users must describe in a more detailed way what they are looking for. It would have been interesting to compare how the same topics expressed in the VGR queries were specified in a forum or a medical blog.…
  • From the exploratory results, we can say that
  • Transcript

    • 1. How Emotional are Users’ Needs? Exploring Emotion in Query Logs Marina Santini 29 Jan 2013 Marina Santini - CyberEmotions2013 Warsaw University of Technology 1 29-30 Jan 2013
    • 2. Outline
      • Inspirational Triggers:
        o The Big Unstructured Textual Data Issue
        o Emotion in IR
        o Research hypothesis
      • Genre- and Emotion-Profiling of Query Logs
        o Characterization of genre
        o Definition of emotion
        o Benefits of genre and emotion awareness in query log analysis
      • Experiments
        o Query Logs from GenitoriCrescono thematic blog (in Italian)
        o Query Logs from Västra Götaland Region (in Swedish)
      • Conclusions
    • 3. Inspirational Trigger 1: BIG UNSTRUCTURED TEXTUAL DATA
    • 4. Big Unstructured Textual Data
      “Merrill Lynch estimates that more than 85 percent of all business information exists as unstructured data – commonly appearing in e‐mails, memos, notes from call centers and support operations, news, user groups, chats, reports, letters, surveys, white papers, marketing material, research, presentations and web pages.” [DM Review Magazine, February 2003 Issue]
      ECONOMIC LOSS! Lots of different genres!
    • 5. Simple search is not enough…
      • Of course, it is possible to use simple search. But simple search is unrewarding, because it is based on single terms.
        o “a search is made on the term felony. In a simple search, the term felony is used, and everywhere there is a reference to felony, a hit to an unstructured document is made. But a simple search is crude. It does not find references to crime, arson, murder, embezzlement, vehicular homicide, and such, even though these crimes are types of felonies” [Source: Inmon, B. & A. Nesavich, "Unstructured Textual Data in the Organization" from "Managing Unstructured data in the organization", Prentice Hall 2008, pp. 1–13]
    • 6. Text Analytics
      • A set of NLP techniques that provide some structure to textual documents.
      • Common components:
        o Tokenization
        o Morphological Analysis
        o Syntactic Analysis
        o Named Entity Recognition
        o Sentiment Analysis
        o Automatic Summarization
        o Etc.
    • 7. Text Analytics Products and Frameworks
      • Commercial:
        o Attensity
        o Clarabridge
        o Temis
        o Lexalytics
        o Texify
        o SAS
        o IBM Cognos
        o etc.
      • Open Source:
        o GATE
        o NLTK
        o UIMA
        o etc.
      Business Intelligence (BI), Customer Experience Management (CEM)
    • 8. Actionable Intelligence
      • Business Intelligence (BI) + Customer Experience Management (CEM) = Actionable Intelligence
      • Actionable Intelligence is information that:
        1. must be accurate and verifiable
        2. must be timely
        3. must be comprehensive
        4. must be comprehensible
        5. must give the power to make decisions and to act straightaway
    • 9. In 2003, Merrill Lynch pointed out that it was too difficult to automatically extract usable intelligence from the following genres:
        o e‐mails
        o memos
        o notes from call centers and support operations
        o news
        o user groups
        o chats
        o reports
        o letters
        o surveys
        o white papers
        o marketing material
        o research
        o presentations
        o web pages
      Today… previous genres plus:
        • Blogs
        • Tweets
        • FB microposts
        • FB comments
        • Many other social network textual “interactions”
    • 10. From Big Data to Query Logs
      Current state of affairs:
        1. Big Unstructured Textual Data
        2. Text Analytics (commercial products and frameworks)
        3. Structured information for BI and CEM
      Viable alternative:
        • Query Logs
        • Genre- & Context-aware Text Analytics
        • Actionable Information (BI, CEM, sentiment, emerging topics…)
      The main advantage of using query logs (when they are available) instead of other genres consists in REDUCED DATA SIZE, REDUCED PRE-PROCESSING, REDUCED NOISE, REDUCED DATA CLEANING!
      Typical use case: a company managing:
        • Website
        • Blog
        • eMails
        • Facebook Page
        • Twitter account
    • 11. Exploratory Query-log Analysis Workshop
      Organized by Findwise AB – Sweden
      SearchInFocus, SLTC 2012
      Exploratory Study on Query Logs and Actionable Intelligence
      Query Logs provide Actionable Intelligence for:
        - search providers
        - clients
        - end-users
    • 12. Inspirational Trigger 2: EMOTION IN INFORMATION RETRIEVAL (IR)
    • 13. Emotion in IR
      Role of Emotion in Information Retrieval, by Yashar Moshfeghi. PhD Thesis at University of Glasgow, 2012
      “uncover social situations where emotion is the primary factor (i.e., source of motivation) in an IR&S process.” (from the abstract)
        o Three concepts:
          • Emotion need
          • Emotion object
          • Emotion relevance
    • 14. Emotion Need
      • The whole IR&S behaviour is driven by an emotion need.
      • An emotion need is more fundamental than an information need in the sense that if an information need exists it implies that there is an underlying emotion need to satisfy this information need.
      • Emotion needs, even when they do not lead to a particular information need, can motivate searchers to use an IR system.
    • 15. Research Hypothesis for the exploration of emotion in query logs
      It is plausible that much of the IR&S behaviour is driven by an emotion need and that users’ emotions are expressed in the queries that are typed in search boxes and stored in query logs.
      If this is true, emotion extraction from query logs also provides actionable intelligence, because extracted emotions can be used to improve decision making and to make more grounded future choices.
    • 16. Research Questions
      • Is it possible to extract emotion from query logs?
      • If so, is it possible to use emotion from query logs for actionable intelligence?
    • 17. Genre Profiling of Query Logs
    • 18. What characterizes a genre?
        1. Must have a name
        2. Must be recognized within a community
        3. Must be produced during a task
        4. Must have conventions
        5. Must raise expectations
        6. Can change over time. It is a cultural artifact (culture here includes society, media, technology, etc.)
    • 19. The query log genre is… a newly acknowledged but fully-emerged web genre
        1. Name: in line with other digital genres (ex: web log → blog)
        2. Community: internet users, IR practitioners
        3. Task: to express searchers’ needs in a search engine
        4. Conventions: short texts written in “keywordese”
        5. Expectations: to find information relevant to the query
        6. Cultural artifact: a product of internet-based society OR a subproduct of search engines
    • 20. The query log genre: Linguistic and Textual Conventions
      • Length: short text (a query log can be seen as a corpus of very short texts, shorter than tweets, mobile text messages, chat logs, etc.)
      • Sublanguage/Jargon: “keywordese”
      • Register: neutral
      • Morphology: REDUCED
      • Syntax: REDUCED (usually no subclauses, etc.)
    • 21. The Query Log Genre: Benefits
      • wrt discourse analysis:
        o Conceptually lean and essential jargon
          • reduced morphology
          • reduced syntax
          • short texts
          • mostly nouns and verbs
        BENEFIT 1: PREDICTABLE SUBLANGUAGE
      • wrt BIG UNSTRUCTURED TEXTUAL DATA
        BENEFIT 2: REDUCED SIZE, REDUCED PRE-PROCESSING, LITTLE DATA CLEANING!
    • 22. Emotion Profiling of Query Logs
    • 23. What is emotion?
      BROAD DEFINITION: ANY DEGREE OF JUDGEMENTAL EVALUATION.
      LIKE SENTISTRENGTH’S SCALE: DUAL 5-POINT SYSTEM FOR POSITIVE [1; 2; 3; 4; 5] AND NEGATIVE [-1; -2; -3; -4; -5] EMOTIONS
    • 24. Explorations
    • 25. Thematic Blog – Italian Logs from Google Analytics
    • 26. Parents Grow Up: Genitori Crescono, http://genitoricrescono.com/
      • To learn together the parent profession
      • Belongs to: FattoreMammaNetwork (gathers websites targeted to mothers and written by mothers)
      • About: parenthood, childcare, maternity, upbringing, behaviours during childhood…
    • 27. Queries from Google Analytics
      www.genitoricrescono.com - Search Overview, 2009-01-01 to 2012-11-10
        • togliere il pannolino = stop wearing nappies/stop using diapers
        • genitori crescono = the website name
        • Nopron = the name of a controversial syrup to make children sleep all night long
        • Tracy Hogg = a maternity nurse to Hollywood stars, known as the baby whisperer for her skill in calming unruly infants
        • nanna = (familiar) bye-byes (Brit), beddy-byes
        • neonato 4 mesi = 4-month-old baby
        • io mi svezzo da solo = I wean by myself
        • nulla osta = certificate of no impediment
        • aborto terapeutico = therapeutic abortion
    • 28. Zipf’s distribution
      “… much research has shown that query term frequency distributions conform to the power law, or long tail distribution curves. That is, a small portion of the terms observed in a large query log (e.g. > 100 million queries) are used most often, while the remaining terms are used less often individually.”
    • 29. Parts of Speech [chart; categories: nouns; verbs; adjectives and adverbs; articles and prepositions; surviving value: 1.9]
    • 30. Most Frequent Syntactic Patterns
        • inserimento al nido
        • bambini aggressivi
        • metodo estivill
    • 31. Average Lengths
      “The average length of a search query was 2.4 terms.”
      “In a recent study in 2011 it was found that the average length of queries has grown steadily over time and average length of non-English languages queries had increased more than English queries.”
    • 32. Long query, informal syntax: “How to stop breastfeeding and make it sleep alone i am planning second pregnacy”
    • 33. Queries’ Emotional Strength (i): SentiStrength (basic options)
    • 34. The power of genre and the importance of the communicative situation
      • “bambini aggressivi”
      • Refinement of the concept presented in “Topic-based Sentiment Analysis in the Social Media …” (Thelwall and Buckley, 2012): the polarity of affect words might flip according to genre and the communicative situation, and not only according to the topic.
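
The idea that an affect word may change polarity with genre can be illustrated with a toy lookup. This is only a sketch under stated assumptions: it is not SentiStrength and not the method used in the talk, and the lexicon entry, the genre labels (`parenting_blog_query`, `facebook_post`) and the function `polarity` are all hypothetical names introduced here.

```python
# Hypothetical base lexicon: face-value polarity of affect words.
BASE_POLARITY = {"aggressivi": -1}

# Hypothetical genre overrides: words whose polarity flips in a genre.
# In a parenting-blog query, "aggressivi" signals a wish to solve a problem.
FLIP_IN_GENRE = {"parenting_blog_query": {"aggressivi"}}

def polarity(word, genre):
    """Return the face-value polarity, flipped if the genre calls for it."""
    base = BASE_POLARITY.get(word, 0)
    if word in FLIP_IN_GENRE.get(genre, set()):
        return -base
    return base

print(polarity("aggressivi", "parenting_blog_query"))  # flipped reading
print(polarity("aggressivi", "facebook_post"))         # face-value reading
```

Keeping the genre overrides in a separate table, rather than editing the base lexicon, means the same lexicon can be reused across genres with different readings.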
    • 35. Addition: Emotion Words
    • 36. Emotional Strength: Basic vs. Boosted
    • 37. Negation
      • Basic options:
        o bambini che non mangiano: 2 -1
        o quando i bambini non dormono: 2 -1
      • Boosted:
        o bambini che non mangiano: 1 -1 (children who do not eat)
        o quando i bambini non dormono: 1 -1 (when children do not sleep)
    • 38. Most frequent word trigrams
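
A word-trigram frequency list of the kind shown on this slide can be built with a few lines of standard-library Python. The sketch below is illustrative only: the helper names `word_trigrams` and `trigram_counts` and the two sample queries are assumptions, and the tool actually used for the slide is not stated in the source.

```python
from collections import Counter

def word_trigrams(query):
    """Return all consecutive three-word sequences in a query."""
    tokens = query.lower().split()
    return [tuple(tokens[i:i + 3]) for i in range(len(tokens) - 2)]

def trigram_counts(queries):
    """Count word trigrams over a list of queries."""
    return Counter(t for q in queries for t in word_trigrams(q))

# Hypothetical sample; queries shorter than three words yield no trigrams.
sample = [
    "quando i bambini non dormono",
    "i bambini non mangiano",
    "metodo estivill",
]
counts = trigram_counts(sample)
print(counts.most_common(2))
```

Because function words are kept, trigrams such as "i bambini non" dominate the list, which is the repetition effect described in the speaker notes.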
    • 39. Query “Normalization”
      • Stopword removal
      • Lemmatization
      • And ideally synonym expansion
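
The first two normalization steps can be sketched as follows. This is a minimal illustration, assuming a tiny hand-made Italian stopword set and a lookup table standing in for a real lemmatizer; `STOPWORDS`, `LEMMAS` and `normalize` are hypothetical names, and synonym expansion is left out. The negation "non" is deliberately kept, since the negation slide suggests it matters for emotion scoring.

```python
# Hypothetical, deliberately tiny resources for illustration only.
STOPWORDS = {"i", "il", "la", "al", "che", "da", "di", "quando"}
LEMMAS = {
    "bambini": "bambino",
    "dormono": "dormire",
    "mangiano": "mangiare",
    "aggressivi": "aggressivo",
}

def normalize(query):
    """Lower-case, drop stopwords, and map each token to its lemma."""
    tokens = query.lower().split()
    return [LEMMAS.get(t, t) for t in tokens if t not in STOPWORDS]

print(normalize("quando i bambini non dormono"))
print(normalize("bambini che non mangiano"))
```

Running an n-gram count on the normalized tokens instead of the raw queries would give the cleaner frequency list the notes ask for.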
    • 40. Use emotion needs as Actionable Intelligence
      Ex: for increasing traffic to a website
      Increase emotion relevance:
        • Be empathetic to searchers’ problems by sympathising and by converting the negative words into more neutral concepts
        • Give heart and hope and offer many solutions…
        • In a few words: offer a new communication strategy…
    • 41. Public organization website. Enterprise search and log server by Findwise AB.
    • 42. Within the Västra Götaland Region website…
    • 43. …hittavård [find health care center], Regional HealthCare
    • 44. VGR Corpus Description
      • Corpus time frame: 2010-2011 (2 years)
      • Description: “These logs come from the search at hittavard.vgregion.se. The biggest bulk should come from 1177.se. The rest should be from vgregion.se. The target audience are both VGR (Västra Götalands Region) users/employees as well as the general public, as it is a public site. The internal files are searches made from within the VGR…”
      • Corpus size:
        o size = 3,167 KB (only queries) (BIG DATA is usually > 1TB)
        o number of queries = 249,243
        o number of words = 306,453
      • Average query length: 1.23 words
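
The average query length on this slide follows directly from the two corpus counts (306,453 words over 249,243 queries). The short Python sketch below reproduces that arithmetic and shows how the same figure would be computed from raw queries; the function name `average_query_length` and the three sample queries are assumptions for illustration.

```python
def average_query_length(queries):
    """Mean number of whitespace-separated words per query."""
    if not queries:
        return 0.0
    return sum(len(q.split()) for q in queries) / len(queries)

# Figures reported on the slide.
NUM_WORDS = 306_453
NUM_QUERIES = 249_243
reported_average = NUM_WORDS / NUM_QUERIES
print(round(reported_average, 2))

# Hypothetical mini sample using queries quoted elsewhere in the talk.
sample = ["feber", "boka tid", "byta vårdcentral"]
print(average_query_length(sample))
```

An average this close to 1 means most VGR queries are single words, which fits the later remark that users of the internal search engine go straight to the point.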
    • 45. VGR Top Queries
        • egenremiss = self-certification
        • mina vårdkontakter = my healthcare contacts
        • webbisar = an invented word referring to newborn babies whose pictures have been published on the web
        • sjukresa/or = trip to the hospital
    • 46. Linguistic Remarks
      • At the top of the frequency list:
        o Simple nouns: feber, influensa, klamydia, …
        o Compounds: urinvägsinfektion, öroninflammation, reseersättning, …
        o V+N: byta vårdcentral, avboka tid, boka tid, …
    • 47. More complex constructions at the bottom
    • 48. SentiStrength on VGR
    • 49. It seems that no emotion is conveyed by VGR users…
      • Are Swedes less emotional than Italians?
      • Is the “healthcare” topic less emotional than the “childcare” topic?
    • 50. It might be that… there is a difference in users’ emotional behaviour when specifying queries to a web search engine OR when using the search engine of a specialized website.
    • 51. Emotion Interpretation… is not straightforward…
      • There are several factors to be accounted for:
        o One important factor is the context of communication: similar words or sentences can convey positive emotion in a query and negative emotion in a Facebook post, for example.
    • 52. Different communicative contexts = different genres
    • 53. Genre Awareness
      • In practical terms, genre awareness is important in text analytics and sentiment analysis because, all things being equal, it:
        o lets you choose the easiest and least problematic texts to process;
        o helps interpret and disambiguate the real meanings of words and sentences according to the different communicative contexts in which they appear.
    • 54. In Summary
    • 55. Is it possible to identify and extract emotion from query logs?
      • It is possible to identify and extract emotion from web query logs.
      • It seems more difficult to extract emotion from enterprise search engine query logs.
    • 56. Is it possible to use emotion from query logs for actionable intelligence?
      • If present, query log emotion can be used for actionable intelligence.
    • 57. What do you think? Thank you for your attention!
    • 58. Details
    • 59. Benefits for the Search Provider
      • Mining query logs to extract user-created knowledge, i.e. queries that can be used as tags (metadata)
      • Quickly create domain-specific taxonomies you can capitalize upon, especially for new client companies working in related fields
      • Enhancements of current search products
      • Inexpensive creation of annotated corpora: document annotation through query logs is a simple technique that in a short time will build massive annotated corpora to use for machine learning, which will allow more sophisticated search refinements.
    • 60. Benefits for Clients & End Users
      • Somebody said: SEARCH MUST BE A MIND READER!
      • BUT ALSO faster, more friendly, more exhaustive and more accurate.
      • If this happens, clients will spend less on customer care. If the end user finds what s/he needs online and quickly, there is no need to call a helpdesk or customer care service.
      • Through the analysis of query logs, log analysts can spot the less “satisfied” queries (i.e. users’ needs). Companies can use this information to plan future products or product enhancements or marketing strategies, etc. (BI)
