Opinion-Based Entity Ranking

A brief description of the Opinion-Based Entity Ranking paper published in the Information Retrieval Journal, Volume 15, Number 2, 2012.

Slides By Kavita Ganesan.

Speaker notes
  • So this long keyword query will be split into 3 separate queries, each called an aspect query. These aspect queries are scored separately and the results are then combined.
  • For each entity, average the numerical ratings of each aspect. Assumption: this would be a good approximation to human judgment.
  • Otherwise, this tells you that the system is not really doing well in ranking.
  • We could not obtain natural queries, so we used semi-synthetic queries. What we did was ask users for seed preferences on different aspects, and then we randomly combined these queries… to form a set of queries.
  • Then, finally, we conducted a user study where users were asked to manually determine the relevance of the system-generated results to the query. This is to validate that the results made sense to real users, and also to validate the effectiveness of the gold-standard rankings, which are based on the… Based on this we found that… which means that this evaluation method can be safely used for similar ranking tasks…
  • Transcript

    • 1. Ganesan & Zhai 2012, Information Retrieval, Vol. 15, Number 2. Kavita Ganesan (www.kavita-ganesan.com), University of Illinois @ Urbana-Champaign. [Links on slide: Journal, Project Page]
    • 2.  Currently: No easy or direct way of finding entities (e.g. products, people, businesses) based on online opinions You need to read opinions about different entities to find entities that fulfill personal criteria e.g. finding mp3 players with ‘good sound quality’
    • 3.  Currently: No easy or direct way of finding entities (e.g. products, people, businesses) based on online opinions You need to read opinions about different entities to find entities that fulfill personal criteria  e.g. finding mp3 players with ‘good sound quality’ Time-consuming process & impairs user productivity!
    • 4.  Use existing opinions to rank entities based on a set of unstructured user preferences Example of user preferences:  Finding a hotel: “clean rooms, heated pools”  Finding a restaurant: “authentic food, good ambience”
    • 5.  The most obvious way: use results of existing opinion mining methods  Find sentiment ratings on various aspects ▪ For example, for an mp3 player: find ratings for the screen, sound, and battery life aspects ▪ Then, rank entities based on these discovered aspect ratings  The problem is that this is not practical! ▪ Costly – It is costly to mine large amounts of textual content ▪ Prior knowledge – You need to know the set of queryable aspects in advance, so you may have to define aspects for each domain either manually or through text mining ▪ Supervision – Most of the existing methods rely on some form of supervision, such as the presence of overall user ratings. Such information may not always be available.
    • 6.  Leverage Existing Text Retrieval Models Why?  Retrieval models can scale up to large amounts of textual content  The models themselves can be tweaked or redefined  This does not require costly information extraction or text mining
    • 7. Leveraging robust text retrieval models. [Diagram: user preferences (the query) are matched by keyword against indexed per-entity review documents using retrieval models (BM25, LM, PL2), producing a ranked list of entities: Entity 1, Entity 2, Entity 3.]
    • 8. Leveraging robust text retrieval models. [Same diagram, with the entities now ranked Entity 3, Entity 2, Entity 1, illustrating that the ranking comes from the keyword match between user preferences and textual reviews.]
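As an illustration of the setup on slides 7 and 8, here is a minimal sketch of ranking entities by scoring their concatenated reviews against a keyword preference query with BM25. It uses the third-party rank_bm25 package as a stand-in for the retrieval models named on the slide (BM25, LM, PL2); the entity names and review snippets are invented placeholders, and the real system would index full review collections.

```python
# Minimal sketch (not the paper's system): rank entities by the BM25 score of the
# preference query against each entity's concatenated reviews.
from rank_bm25 import BM25Okapi  # assumes the rank_bm25 package is installed

# Invented placeholder data: one "document" per entity, built from its reviews.
entity_reviews = {
    "hotel_a": "spotless clean rooms, friendly staff, the heated pool was warm",
    "hotel_b": "rooms were dirty but the pool and gym are great",
    "hotel_c": "clean rooms and a heated pool, service was a bit slow",
}

entities = list(entity_reviews)
corpus = [entity_reviews[e].lower().split() for e in entities]
bm25 = BM25Okapi(corpus)  # index the per-entity review documents

query = "clean rooms, heated pools".lower().replace(",", " ").split()
scores = bm25.get_scores(query)  # one relevance score per entity document

for entity, score in sorted(zip(entities, scores), key=lambda x: -x[1]):
    print(f"{entity}\t{score:.3f}")
```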
    • 9.  Based on the basic setup, this ranking problem seems similar to the regular document retrieval problem However, there are important differences: 1. The query is meant to express a user's preferences in keywords  The query is expected to be longer than regular keyword queries  The query may contain sub-queries expressing preferences for different aspects  It may actually be beneficial to model these semantic aspects 2. Ranking is to capture how well an entity satisfies a user's preferences  Not the relevance of a document to a query (as in regular retrieval)  The matching of opinion/sentiment words would be important in this case
    • 10.  Investigate use of text retrieval models for the task of Opinion-Based Entity Ranking Explore some extensions over IR models Propose evaluation method for the ranking task User Study  To determine if results make sense to users  Validate effectiveness of evaluation method
    • 11.  In standard text retrieval we cannot distinguish the multiple preferences in a query. For example: “clean rooms, cheap, good service”  Would be treated as a long keyword query even though there are 3 preferences in the query  Problem with this is that an entity may score highly because of matching one aspect extremely well To improve this:  We try to score each preference separately and then combine the results
    • 12. Aspect Queries. [Diagram: the query “clean rooms, cheap, good service” is split into the aspect queries “clean rooms”, “cheap”, and “good service”; each is scored separately by the retrieval model, and the three result sets are then combined into the final results.]
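A minimal sketch of the query aspect modeling idea from slides 11 and 12: split the preference query on commas into aspect queries, score each separately, and combine the per-aspect score vectors. Averaging is shown as one simple combination rule; the paper's exact combination strategy may differ. The sketch reuses the `bm25.get_scores` scorer and `entities` list from the earlier BM25 sketch.

```python
import numpy as np

def qam_scores(preference_query, score_fn):
    """Score each comma-separated aspect query separately, then average the score vectors."""
    aspect_queries = [a.strip() for a in preference_query.lower().split(",") if a.strip()]
    per_aspect = [score_fn(a.split()) for a in aspect_queries]  # one score vector per aspect
    return np.mean(per_aspect, axis=0)

# Usage with the BM25 sketch above (illustrative only):
combined = qam_scores("clean rooms, cheap, good service", bm25.get_scores)
ranked_entities = [entities[i] for i in np.argsort(-combined)]
print(ranked_entities)
```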
    • 13.  In standard retrieval models the matching of an opinion word & a standard topic word is not distinguished However, with Opinion-Based Entity Ranking:  It is important to match opinion words in the query, but opinion words tend to have more variation than topic words  Solution: Expand a query with similar opinion words to help emphasize the matching of opinions
    • 14. [Diagram: the query “Fantastic battery life” alongside review documents containing phrases with a similar meaning: “Good battery life”, “Great battery life”, “Excellent battery life”.]
    • 15. [Diagram: the same query, now expanded by adding synonyms of the word “fantastic”, giving the expanded query “fantastic, good, great, excellent… battery life”, which matches the review documents above.]
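A minimal sketch of the opinion expansion step from slides 13 to 15: before retrieval, opinion words in the query are augmented with near-synonyms, so that reviews saying “good” or “excellent battery life” can still match “fantastic battery life”. The synonym table here is hand-made for illustration; the paper's actual source of similar opinion words may differ.

```python
# Hand-made synonym table, for illustration only.
OPINION_SYNONYMS = {
    "fantastic": ["good", "great", "excellent"],
    "good": ["great", "nice", "decent"],
    "clean": ["spotless", "tidy", "immaculate"],
}

def expand_opinion_query(query_tokens):
    """Append synonyms of any recognized opinion word to the query."""
    expanded = []
    for token in query_tokens:
        expanded.append(token)
        expanded.extend(OPINION_SYNONYMS.get(token, []))
    return expanded

print(expand_opinion_query("fantastic battery life".split()))
# -> ['fantastic', 'good', 'great', 'excellent', 'battery', 'life']
```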
    • 16.  Document Collection Gold Standard: Relevance Judgement User Queries Evaluation Measure
    • 17.  Document Collection:  Reviews of Hotels – Tripadvisor  Reviews of Cars – Edmunds. [Screenshot labels on slide: “Numerical aspect ratings” (used as the gold standard) and “Free text reviews”.]
    • 18.  Gold Standard:  Needed to assess the performance of the ranking task For each entity & for each aspect (in the dataset):  Average numerical ratings across reviews. This will give the judgment score for each aspect  Assumption: Since the numerical ratings were given by users, this would be a good approximation to actual human judgment
    • 19.  Gold Standard: Ex. User looking for cars with “good performance”  Ideally, the system should return cars with ▪ High numerical ratings on the performance aspect ▪ Otherwise, we can say that the system is not doing well in ranking. [Slide callout: “Should have high ratings on performance”.]
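A minimal sketch of the gold-standard construction described on slides 18 and 19: for each entity and aspect, average the numerical ratings across its reviews and use the mean as the judgment score. The ratings below are invented; the real judgments come from the TripAdvisor and Edmunds datasets.

```python
from collections import defaultdict

# Invented (entity, aspect, rating) triples standing in for per-review ratings.
ratings = [
    ("car_x", "performance", 5), ("car_x", "performance", 4),
    ("car_y", "performance", 2), ("car_y", "performance", 3),
    ("car_y", "fuel", 5),
]

totals, counts = defaultdict(float), defaultdict(int)
for entity, aspect, rating in ratings:
    totals[(entity, aspect)] += rating
    counts[(entity, aspect)] += 1

# Judgment score per (entity, aspect): the mean rating across that entity's reviews.
gold = {key: totals[key] / counts[key] for key in totals}
print(gold[("car_x", "performance")])  # 4.5
```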
    • 20.  User Queries  Semi-synthetic queries  Not able to obtain a natural sample of queries  Ask users to specify preferences on different aspects of cars & hotels based on the aspects available in the dataset ▪ Seed queries ▪ Ex. Fuel: “good gas mileage”, “great mpg”  Randomly combine seed queries from different aspects  forms synthetic queries ▪ Ex. Query 1: “great mpg, reliable car” ▪ Ex. Query 2: “comfortable, good performance”
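A minimal sketch of the semi-synthetic query construction on slide 20: start from user-provided seed queries grouped by aspect, then randomly combine seeds from different aspects into longer preference queries. The seed phrases below echo the slide's examples but are otherwise illustrative.

```python
import random

# Seed preferences grouped by aspect (illustrative, in the spirit of the slide).
seed_queries = {
    "fuel": ["good gas mileage", "great mpg"],
    "reliability": ["reliable car", "few mechanical problems"],
    "comfort": ["comfortable", "smooth ride"],
    "performance": ["good performance", "quick acceleration"],
}

def make_synthetic_query(num_aspects=2, rng=random):
    """Pick distinct aspects at random and join one seed query from each."""
    aspects = rng.sample(list(seed_queries), num_aspects)
    return ", ".join(rng.choice(seed_queries[aspect]) for aspect in aspects)

print(make_synthetic_query())  # e.g. "great mpg, reliable car"
```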
    • 21.  Evaluation Measure: nDCG  This measure is ideal because it supports multiple (graded) relevance levels  The numerical ratings used as judgment scores have a range of values, and nDCG supports this.
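A minimal sketch of nDCG for a single query, using the averaged aspect ratings as graded relevance (slide 21). The gain and discount shown are the common (2^rel - 1) / log2(rank + 1) form; the paper may use a slightly different DCG variant, and the relevance values below are invented.

```python
import math

def dcg(relevances):
    """Discounted cumulative gain with (2^rel - 1) gain and a log2 rank discount."""
    return sum((2 ** rel - 1) / math.log2(rank + 2) for rank, rel in enumerate(relevances))

def ndcg(relevances_in_system_order):
    """Normalize by the DCG of the ideal (descending-relevance) ordering."""
    ideal = dcg(sorted(relevances_in_system_order, reverse=True))
    return dcg(relevances_in_system_order) / ideal if ideal > 0 else 0.0

# Gold aspect ratings of each entity, listed in the order the system ranked them.
print(round(ndcg([4.5, 2.5, 5.0]), 3))
```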
    • 22.  Users were asked to manually determine the relevance of system-generated rankings to a set of queries. Two reasons for the user study: Validate that results made sense to real users  On average, users thought that the entities retrieved by the system were a reasonable match to the queries Validate the effectiveness of the gold-standard rankings  The gold-standard ranking has relatively strong agreement with user rankings. This means the gold standard based on numerical ratings is a good approximation to human judgment
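The agreement check on slide 22 can be made concrete with a rank correlation; the sketch below uses Kendall's tau from SciPy purely as an example of measuring agreement between the gold-standard ranking and a user's ranking. The paper may report a different agreement statistic, and the rank lists here are invented.

```python
from scipy.stats import kendalltau

# Invented example: positions assigned to five entities by the gold standard
# and by one user-study participant.
gold_positions = [1, 2, 3, 4, 5]
user_positions = [1, 3, 2, 4, 5]

tau, p_value = kendalltau(gold_positions, user_positions)
print(f"Kendall's tau = {tau:.2f} (p = {p_value:.3f})")
```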
    • 23. [Bar charts: improvement in ranking for Hotels (up to about 8.0%) and Cars (up to about 2.5%) when using QAM and QAM + OpinExp over the PL2, LM, and BM25 retrieval models; annotations note that the gains are most effective on BM25. Chart captions: “Improvement in ranking using QAM” and “Improvement in ranking using QAM + OpinExp”.]
    • 24.  Lightweight approach to ranking entities based on opinions  Use existing text retrieval models Explored some enhancements over retrieval models  Namely opinion expansion & query aspect modeling  Both showed some improvement in ranking Proposed evaluation method using user ratings  User study shows that the evaluation method is sound  This method can be used for future evaluation tasks
