Opinion-Driven Decision Support System

Summary of Kavita Ganesan's PhD Thesis

  • I would like to thank all of you for being present for my talk despite the time differences. Today I will be presenting my thesis proposal. The title of my thesis is Opinion-Driven Decision Support System.
  • We use opinions on the web for all sorts of decision-making tasks. For example, when visiting a new city, we use opinions to decide which hotel to stay at. Opinions are essential for decision making: if we are looking for a hotel in NYC, we often read many online opinions to figure out which one to stay at. Without these online opinions the decision-making task becomes more difficult, because you only have limited information to go on, perhaps just the price and a description.
  • Similarly, we use opinions to decide which attractions to visit. Without these online opinions the decision-making task becomes more difficult, because you only have limited information to go on, such as the price and a location description.
  • Most of the existing work leveraging opinions has focused on summarization of opinions to help users better digest them. This is mainly in the form of structured summaries, such as a positive or negative label on a given piece of text (a sentence, passage, or document), or a more fine-grained summary with sentiment ratings on different aspects. This alone is not sufficient, because you need other components to support effective decision making.
  • We actually need to address a broader set of problems to effectively support such a decision making task.
  • First of all, we need large amounts of opinions. Since opinions are subjective and can vary quite a bit, we need a large set of them to allow users to get the complete picture. Then we need different analysis tools: we can have sentiment trend visualization, which shows fluctuation in sentiments over time, aspect-level summaries, textual summaries, and so on.
  • Then, we also need to incorporate search so that users can actually find different items and entities using existing opinions. This would improve user productivity because it cuts down on the time spent reading a large number of opinions.
  • Finally, we also need to know how to present the opinions at hand effectively. For example, if you have aspect-level summaries, you need to understand how to organize them: do you show scores or visuals like star ratings, and do you also need supporting phrases? If you have full opinions, then you need to think about how to allow effective browsing from one passage to the next, so as to not overwhelm users.
  • With this, I propose a framework called the ODSS, which encompasses data collection, analysis tools, search, and presentation, all of which support a more complete decision-making platform based on opinions. In my thesis I try to solve some of the problems related to data collection, search capabilities, and analysis.
  • The focus of the methods proposed in this thesis is (1) to make them very general, so that they can work across various domains and content types (cars, electronics, news, legal documents), and (2) to make them practical and lightweight, so that they can be easily applied in practice and can scale up to large amounts of data.
  • So first we will look at my search-related work on Opinion-Based Entity Ranking, which was published in the Information Retrieval Journal.
  • So how do you go about solving this ranking problem? One option is to use the results of existing opinion mining methods, but this approach is usually not practical. Some of the existing methods also rely on some form of supervision, such as the overall user ratings, and this kind of information may not always be available.
  • So in this paper, we propose to leverage existing text retrieval models to rank entities directly based on the stated preferences.
  • First we will look into some of the extensions that we explored. The first is modeling the aspects in the query. With the standard retrieval method, multiple preferences in a query are treated as one long keyword query. To improve this, we look into scoring each preference separately and then combining the results.
  • In standard text retrieval models, matching an opinion word and a standard topic word is not distinguished. But in this ranking task it is important to match opinion words in the user's query. However, opinion words generally have more variation than topic words, so the intuition is that by expanding a user query with additional equivalent opinion words, we can emphasize the matching of opinion words.
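To make these two extensions concrete, here is a minimal sketch of query aspect modeling plus opinion-word expansion. The toy term-frequency score and the tiny synonym map are illustrative stand-ins, not the retrieval models (BM25, PL2, DirichletLM) or the expansion resource used in the thesis.

```python
# Sketch: QAM (score each preference separately, combine) + opinion expansion.
from collections import Counter

# Hypothetical opinion-word expansions; in practice these would come from a
# thesaurus or paraphrase resource, not a hand-built map.
OPINION_EXPANSIONS = {"clean": ["spotless", "tidy"], "good": ["great", "excellent"]}

def expand(preference):
    words = preference.lower().split()
    expanded = list(words)
    for w in words:
        expanded.extend(OPINION_EXPANSIONS.get(w, []))
    return expanded

def score(query_words, review_text):
    # Toy relevance score: term-frequency overlap (stand-in for BM25 etc.).
    tf = Counter(review_text.lower().split())
    return sum(tf[w] for w in query_words)

def rank_entities(query, entity_reviews):
    # QAM: treat each comma-separated preference as its own query,
    # score it separately, then combine per-preference scores by summing.
    preferences = [p.strip() for p in query.split(",")]
    results = []
    for entity, text in entity_reviews.items():
        combined = sum(score(expand(p), text) for p in preferences)
        results.append((entity, combined))
    return sorted(results, key=lambda x: x[1], reverse=True)

reviews = {
    "Hotel A": "very clean rooms and excellent service, spotless lobby",
    "Hotel B": "cheap rates but rooms were not clean",
}
print(rank_entities("clean rooms, good service", reviews))
```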
  • So here are the results. These are the results with the use of QAM and OpinExp in two domains, hotels and cars. The blue bar is the improvement from QAM over standard retrieval, and the red bar is the improvement from OpinExp combined with QAM over standard retrieval. First, you see that both extensions bring improvements in most cases, and this is especially clear with the use of OpinExp.
  • Then you see that with just QAM any of the retrieval models can be used – because the improvements are not that different.
  • But when you pair OpinExp with QAM, it is clear that BM25 is the most effective retrieval model. One reason for this is that BM25 does not over-reward high-frequency words, so an entity is not ranked highly just because it matches one of the words in the query very often.
  • Next, we will move to the analysis part, where I looked into abstractive summarization of opinions. For this, I have explored two different approaches.
  • Most current work in opinion summarization focuses on predicting the aspect-based ratings for an entity. For example, for an iPod, you may predict that the appearance is 5 stars, ease of use is 3 stars, and so on.
  • But the problem is, if you wanted to know more about each of these aspects, you would actually have to read many sentences from the thousands of reviews to have your questions answered.
  • For textual summaries to be useful to users, we first require that the summary actually summarizes the major opinions, is concise so that it is viewable on smaller screens, and of course is reasonably readable.
  • Existing extractive methods can miss out on critical information during sentence selection, and the selected sentences tend to be verbose, so we need more of an abstractive approach.
  • So this is how it works at a very high level, but there are many details that are outlined in the paper, such as how to stitch two different subgraphs into one and how we use positional information to find promising paths.
  • The Opinosis-Graph has three unique properties that help in finding candidate summaries; these are the key concepts used in the summarization algorithm.
  • If you look at the words "drop" and "frequently" from sentence 2, even though the gap between the words is 2, there is already a link between "drop" and "frequently" in the graph, so you can leverage this link to find more redundancies.
  • Here are two new sentences: "calls drop frequently with the iphone" and "calls drop frequently with the blackberry". This is the resulting Opinosis-Graph. You can see that there is one high-redundancy path followed by a high fan-out; the node "the" thus acts like a hub. Such a structure can be used to merge sentences to form one such as "calls drop frequently with the iphone and blackberry". This kind of structure is easily discoverable using the Opinosis-Graph.
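As a rough illustration of the data structure behind this, here is a small sketch of building an Opinosis-Graph-like word graph: one node per unique word, each node keeping (sentence id, position id) references, and directed edges between adjacent words. This is a simplification (no POS tags, no gapped-subsequence handling, no path scoring), just enough to show how shared paths surface.

```python
# Sketch of an Opinosis-Graph-style word graph (simplified).
from collections import defaultdict

def build_opinosis_graph(sentences):
    positions = defaultdict(list)   # word -> [(sentence_id, position_id), ...]
    edges = defaultdict(set)        # word -> {next_word, ...}
    for sid, sent in enumerate(sentences, start=1):
        words = sent.lower().split()
        for pid, w in enumerate(words, start=1):
            positions[w].append((sid, pid))
            if pid < len(words):
                edges[w].add(words[pid])   # link to the word that follows
    return positions, edges

sentences = [
    "my phone calls drop frequently with the iphone",
    "great device but the calls drop too frequently",
]
positions, edges = build_opinosis_graph(sentences)

# Words touched by more than one sentence hint at redundant, summary-worthy paths.
shared = {w for w, refs in positions.items() if len({sid for sid, _ in refs}) > 1}
print(sorted(shared))
```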
  • Well-formed means formed according to the language's grammatical rules. In this work we emphasize three different criteria: compactness, representativeness, and readability.
  • We try to capture these three criteria using the following optimization framework. The objective function optimizes the representativeness and readability scores, so it tries to ensure that the summaries reflect opinions from the original text and are reasonably well formed. Srep(mi) is the representativeness score and Sread(mi) is the readability score, where mi represents a micropinion in the summary.
  • The first constraint, which controls the maximum length of the summary, captures compactness. The constraint that controls the similarity between phrases also captures compactness by minimizing redundancy. Both of these thresholds are user adjustable: if the user can tolerate more redundancy, the similarity threshold can simply be raised.
  • To score the similarity between phrases, we simply use the Jaccard similarity measure.
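To make the constraints concrete, here is a greedy approximation of the phrase-selection step, assuming representativeness and readability scores have already been combined into one score per candidate phrase. The thresholds and the greedy strategy are placeholders of my own; the thesis formulates this as an optimization problem rather than a single greedy pass.

```python
# Greedy sketch of selecting micropinion phrases under summary-size and
# redundancy (max phrase similarity) constraints.
def jaccard(a, b):
    a, b = set(a.split()), set(b.split())
    return len(a & b) / len(a | b) if a | b else 0.0

def select_summary(candidates, max_words=15, max_sim=0.3):
    # candidates: list of (phrase, combined_score) pairs.
    chosen, used = [], 0
    for phrase, score in sorted(candidates, key=lambda x: x[1], reverse=True):
        n = len(phrase.split())
        if used + n > max_words:                                   # summary size cap
            continue
        if any(jaccard(phrase, p) > max_sim for p, _ in chosen):   # redundancy cap
            continue
        chosen.append((phrase, score))
        used += n
    return [p for p, _ in chosen]

cands = [("battery life is too short", 0.9),
         ("short battery life", 0.85),
         ("very easy to use", 0.7)]
print(select_summary(cands))   # keeps the first phrase, skips its near-duplicate
```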
  • Then, in scoring representativeness, we have defined two properties of highly representative phrases: the words in each phrase should be strongly associated within a narrow window in the original text, and the words in each phrase should be sufficiently frequent in the original text. These two properties are captured by a modified pointwise mutual information function.
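Below is a toy version of this modified PMI idea: the joint probability is boosted by the raw co-occurrence count so that word pairs that are both strongly associated and frequent score higher. The window handling and probability estimation here are deliberately simplified and the numbers are illustrative only.

```python
# Toy modified-PMI scorer for word pairs within a small window.
import math
from collections import Counter

def cooccurrence_counts(sentences, window=3):
    unigrams, pairs = Counter(), Counter()
    for sent in sentences:
        words = sent.lower().split()
        unigrams.update(words)
        for i, w in enumerate(words):
            for j in range(i + 1, min(i + window, len(words))):
                pairs[(w, words[j])] += 1
    return unigrams, pairs

def modified_pmi(wi, wj, unigrams, pairs, total):
    c_ij = pairs[(wi, wj)]
    if c_ij == 0:
        return float("-inf")
    p_ij = c_ij / total
    p_i, p_j = unigrams[wi] / total, unigrams[wj] / total
    # Standard PMI multiplied by the co-occurrence count (the "modified" part).
    return math.log2((p_ij * c_ij) / (p_i * p_j))

sents = ["battery life is short", "the battery life is too short", "screen is great"]
uni, pairs = cooccurrence_counts(sents)
total = sum(uni.values())
print(modified_pmi("battery", "life", uni, pairs, total))
```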
  • Readability scoring determines how well formed the constructed phrases are. Since the phrases are constructed from seed words, we can have new phrases that did not occur in the original text. The intuition is that if a generated phrase occurs frequently on the web, according to a web n-gram model, then the phrase is likely readable.
  • Moving into the evaluation part….
  • This graph shows the ROUGE scores of the different summarization methods for different summary lengths. We have KEA, a supervised keyphrase extraction model; a simple tf-idf based keyphrase extraction method; and Opinosis, shown previously. For this task, WebNgram performs the best. Opinosis does not perform as well, most likely because there is a lack of structural redundancy within the full reviews on CNET.
  • Now we will change gears and move into the new work that I have done for the data collection part, which is to be submitted to an upcoming conference.
  • So even though we have an abundance of opinions on the web, there is no easy way to obtain a comprehensive set of opinions about a given entity.
  • Since user reviews alone make up a big portion of online opinions, I narrow the focus to crawling online reviews. The goal of this task is to provide an efficient method for crawling reviews, and I do this by focusing on intelligently discovering a set of review-rich pages for a given entity; these act as seeds for the actual crawler.
  • The goal of OpinoFetch is to be general enough to work across domains.
  • The input is basically a set of entities on which reviews are required, for example all hotels in a particular city. Then for each entity, we find a set of initial candidate review pages, referred to as CRPs. This is done using a general web search engine such as Bing or Google, where the top N results serve as the initial CRPs.
  • Then we expand the CRP list by exploring links in the neighborhood of the initial CRPs until a depth limit is met; this is to obtain more potential CRPs. Finally, to collect relevant pages, each CRP is scored in terms of entity relevance and review page relevance, and only those that satisfy the minimum thresholds are retained.
  • The first step is to find the initial candidate review pages. This is done on a per-entity basis using a general web search engine like Bing or Google. The query format used is the entity name, followed by the brand or address, followed by the word "reviews". The intuition is that search engines already index most of the web, so we can leverage this to dig out the relevant review pages instead of trying to crawl the entire web.
  • The next step is to find more CRPs. This is done by exploring links around the initial CRPs. Different link exploration approaches have been proposed, but this is not our focus. In OpinoFetch, we use URL prioritization, where we follow the top N URLs in a given page and stop when a certain depth is reached. The prioritization strategy is to bias the crawl path towards entity-related pages; this is achieved using a priority score assigned to URLs, based on the average cosine similarity between the entity query and the URL tokens and between the entity query and the anchor tokens. The intuition is that the more the anchor text and the URL resemble the entity query, the more likely the page is relevant to the target entity.
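Here is a small sketch of that priority score: the average of the similarity between the entity query and the URL tokens and between the entity query and the anchor text. Cosine over raw token counts and the naive URL tokenization are stand-ins for illustration, not the exact implementation.

```python
# Sketch: URL prioritization score for OpinoFetch-style crawling.
import math, re
from collections import Counter

def tokens(text):
    return re.findall(r"[a-z0-9]+", text.lower())

def cosine(a, b):
    ca, cb = Counter(a), Counter(b)
    dot = sum(ca[t] * cb[t] for t in ca)
    norm = math.sqrt(sum(v * v for v in ca.values())) * math.sqrt(sum(v * v for v in cb.values()))
    return dot / norm if norm else 0.0

def url_priority(entity_query, url, anchor_text):
    eq = tokens(entity_query)
    # Average of URL-query and anchor-query similarity.
    return (cosine(eq, tokens(url)) + cosine(eq, tokens(anchor_text))) / 2.0

print(url_priority("Hampton Inn Champaign reviews",
                   "http://www.tripadvisor.com/Hotel_Review-Hampton_Inn_Champaign",
                   "Hampton Inn Champaign hotel reviews"))
```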
  • For review page relevance scoring, we use a review vocabulary (RV), which is a lexicon consisting of the most commonly occurring words within review pages, where each word is weighted by importance (the details on how this lexicon is constructed are outlined in the full thesis). The idea is to score a page based on the number of review-vocabulary words occurring in the page: if a page contains many of the review vocabulary terms, then it is likely a review page.
  • In the scoring formula, t is a term in the review vocabulary and c(t, pi) is the frequency of t in page pi; the log is used to scale down the term frequency, otherwise one very frequently occurring word could dominate. wt(t) is the importance of t in the review vocabulary, so the more important t is, the higher the weighting. Because the numerator is a sum of weighted terms, this value can become quite large for highly dense review pages, so we normalize it to be able to set a sensible score threshold.
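A minimal sketch of this raw score follows: a weighted sum over review-vocabulary terms with log-scaled term frequencies. The tiny vocabulary and its weights are made up for illustration; the slide's formula uses the log of the raw count, while here 1 is added only so that single occurrences do not score zero.

```python
# Sketch of the raw review-page relevance score S_rev_raw(p_i).
import math
from collections import Counter

REVIEW_VOCAB = {"review": 1.0, "stayed": 0.8, "rating": 0.7, "helpful": 0.5}  # illustrative

def s_rev_raw(page_text, vocab=REVIEW_VOCAB):
    tf = Counter(page_text.lower().split())
    score = 0.0
    for term, weight in vocab.items():
        if tf[term]:
            # log scaling keeps one very frequent term from dominating
            score += math.log2(1 + tf[term]) * weight
    return score

page = "we stayed here last week and wrote a review with a rating of 4 of 5 helpful"
print(s_rev_raw(page))
```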
  • It is unclear what the best normalizer would be for the raw review relevance score, so we explored three options. The first is the SiteMax normalizer, where we use the maximum raw score amongst pages from a given site; the intuition is that if a site is densely populated with reviews, this maximum will be high, so non-review pages get eliminated easily. The second is the EntityMax normalizer, where we use the maximum raw score amongst pages related to an entity: if an entity is highly popular there will be many more reviews and a higher maximum, and if it is not so popular, reviews will be sparse and the maximum lower. A review page of an unpopular entity would still receive a high normalized score, because the normalizer adjusts according to entity popularity. The third option is to combine EntityMax or SiteMax with the global maximum, to help with cases where EntityMax or SiteMax is unreliable.
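A sketch of the three options is below. The per-site and per-entity raw-score lists are assumed to be available from the crawl bookkeeping, and the way the entity/site maximum is combined with the global maximum (a simple average here) is my own placeholder, since the exact combination is not spelled out in these notes.

```python
# Sketch: normalizing the raw review-relevance score with SM, EM, or a +GM combination.
def normalize(raw_score, site_scores, entity_scores, all_scores, mode="EM+GM"):
    site_max = max(site_scores) if site_scores else 0.0
    entity_max = max(entity_scores) if entity_scores else 0.0
    global_max = max(all_scores) if all_scores else 0.0
    if mode == "SM":          # SiteMax: normalize by the densest page on the same site
        denom = site_max
    elif mode == "EM":        # EntityMax: normalize by the best page for the same entity
        denom = entity_max
    elif mode == "SM+GM":     # placeholder combination: average with the global maximum
        denom = (site_max + global_max) / 2.0
    else:                     # "EM+GM", same placeholder combination
        denom = (entity_max + global_max) / 2.0
    return raw_score / denom if denom else 0.0

print(normalize(3.0, site_scores=[8.0, 3.0], entity_scores=[3.0],
                all_scores=[8.0, 3.0], mode="EM"))   # -> 1.0
```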
  • Most review pages have URLs that are highly descriptive. For example, in the URL for iPhone reviews on Amazon, the name of the item is within the URL itself; in the URL for reviews of the Hampton Inn on TripAdvisor, you again see the name of the hotel and the city within the URL. With this, the entity relevance scoring is based on how similar the entity query is to the page URL. The intuition is that the more the URL resembles the query, the more likely the page is relevant to the target entity.
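As described above, this boils down to a Jaccard similarity between the entity-query tokens and the URL tokens; a minimal sketch (with naive tokenization) is:

```python
# Sketch: entity relevance as Jaccard similarity between entity query and page URL.
import re

def jaccard(a, b):
    a, b = set(a), set(b)
    return len(a & b) / len(a | b) if a | b else 0.0

def entity_relevance(entity_query, url):
    eq_tokens = re.findall(r"[a-z0-9]+", entity_query.lower())
    url_tokens = re.findall(r"[a-z0-9]+", url.lower())
    return jaccard(eq_tokens, url_tokens)

print(entity_relevance("Hampton Inn Champaign reviews",
                       "http://www.tripadvisor.com/Hotel_Review-Hampton_Inn_Champaign.html"))
```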
  • There are a number of ways in which we could implement the proposed steps that I just described. However, if we want to serve real applications, we need to think about what will make the approach useful in practice and useful to client applications.
  • First of all, we need an approach that is efficient, because our goal is to allow review collection for a large number of entities, so the task should terminate in reasonable time with reasonable accuracy. Problems usually happen when we cannot access required information quickly, for example when computing term frequencies within a page for relevance scoring. The second thing that would be useful is access to rich information, so that client applications can obtain information beyond just the list of crawled pages, for example all review pages for entity X from the top 10 most popular sites. With current methods it would be difficult to get such information, so we need a rich information representation to deduce it. Databases are not well suited for this because you end up with complex joins and they do not naturally model complex information, and the plain web graph does not model such complex relationships either.
  • So this is an example of a FetchGraph. As you can see, each component is represented by a simple node in the graph, and relationships are modeled using edges. You have entity nodes, site nodes, page nodes, term nodes, and logical nodes, which are conceptual nodes.
  • Entity nodes represent the entities on which reviews are needed, and you can have a very large set of entities (e.g. all hotels in the US); here you only have three. Each entity node connects to a set of page nodes, based on the set of CRPs found for that entity: if p1, p2, and p3 appeared as CRPs for E1, then there would be edges from E1 to p1, p2, and p3. The pages themselves can be related to one another; for example, if one page is a near duplicate of another, or the parent of another, you can model that relationship. Next, each page can be made up of several components, such as the title, the URL, and the textual contents; this is modeled using logical nodes, which are simply nodes representing different concepts. Each of these components is made up of terms, modeled through relationships with term nodes. Each term node represents a unique word in the entire vocabulary, so there is only one node per term, and the edges to the term nodes can hold term weighting information. You can also have other logical nodes: in OpinoFetch, we have the query and a review vocabulary, whose contents are again captured through relationships with term nodes.
  • The FetchGraph has many uses. First of all, it serves as a simple data representation structure: you do not need separate indexes for the terms and the other components. You can then access all sorts of information using this one structure. You can get various statistics; for example, you can easily get the term frequency of a word in a given page by reading the weight of the edge connecting the content node to the relevant term node. You can also access complex relationships and global information, because the relationships between the different components are tracked over the course of data collection. Also, since this structure has the potential of being compact, it can easily be kept as an in-memory data structure, and since the network can be persisted and accessed at a later time, the client application can use it to answer interesting and important application-related questions, for example obtaining all review pages for a given entity from the top 10 most popular sites.
  • So now we will look at how to obtain the statistics needed from the FetchGraph. It is very easy to compute the raw review relevance score. The review vocabulary is modeled using a logical node whose outgoing edges to term nodes represent the terms that are part of the vocabulary, with the weight on each edge representing the importance of the term. So to compute the raw review relevance score, we look at the content node of a page and see which terms appear both in the content and in the review vocabulary. With this, you do not need to parse page contents each time a page is encountered, and the lookup of review vocabulary words within a page is fast.
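A minimal FetchGraph-style sketch is shown below: page content nodes and a review-vocabulary node both connect to shared term nodes, with edge weights storing term frequencies or term importance, so the raw relevance score becomes an edge intersection rather than a re-parse of the page. The class and node naming are my own; the real FetchGraph also models entities, sites, titles, URLs, and page-page relations.

```python
# Minimal sketch of a FetchGraph-like heterogeneous graph for relevance scoring.
import math
from collections import defaultdict

class FetchGraph:
    def __init__(self):
        self.edges = defaultdict(dict)   # node -> {neighbor: weight}

    def add_page_content(self, page_id, text):
        content = ("content", page_id)
        for term in text.lower().split():
            node = ("term", term)
            # edge weight = term frequency of the word in this page
            self.edges[content][node] = self.edges[content].get(node, 0) + 1

    def set_review_vocab(self, weights):
        for term, wt in weights.items():
            # edge weight = importance of the term in the review vocabulary
            self.edges[("rv",)][("term", term)] = wt

    def s_rev_raw(self, page_id):
        # Intersect the page's content terms with the review-vocabulary terms;
        # no re-parsing of the page is needed once the edges exist.
        content, rv = self.edges[("content", page_id)], self.edges[("rv",)]
        return sum(math.log2(1 + tf) * rv[t] for t, tf in content.items() if t in rv)

g = FetchGraph()
g.set_review_vocab({"review": 1.0, "stayed": 0.8, "rating": 0.7})
g.add_page_content("p1", "great hotel review we stayed two nights rating excellent review")
print(g.s_rev_raw("p1"))
```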
  • To compute the similarity between the URL and the entity query, you access the URL node of a page and the corresponding query node. The union of terms is obtained by looking at all term nodes connected to either the URL node or the query node; for the intersection, you look at the term nodes that both are connected to.
  • The gold standard is the set of valid URLs for a given entity around the vicinity of the search results. Precision is computed as the number of relevant review pages divided by the total number of retrieved pages for the given entity. Recall is computed as the number of relevant pages found divided by the number of relevant pages according to the gold standard.
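Written out per entity, the two measures just described are (notation for an entity e_k):

```latex
\mathrm{Prec}(e_k)   = \frac{\#\mathrm{RelPages}(e_k)}{\#\mathrm{RetrievedPages}(e_k)}, \qquad
\mathrm{Recall}(e_k) = \frac{\#\mathrm{RelPages}(e_k)}{\#\mathrm{GoldStdRelPages}(e_k)}
```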
  • This graph shows the recall achieved by Google, OpinoFetch, and the unnormalized version of OpinoFetch at different search result sizes. We see that the recall achieved by Google is consistently low, much lower than OpinoFetch, even with an increasing number of results. This shows that many of the search results are not necessarily relevant to the entity query or are not direct pointers to review pages, and that there are many more relevant review pages around the vicinity of the search results than what the search engine deems relevant.
  • Then we see that with OpinoFetch, recall actually improves with an increasing number of search results. This shows that there is a lot of relevant content around the vicinity of the search results, and OpinoFetch is able to discover such relevant content. The actual recall is higher still if you account for near-duplicate pages (where you get penalized if you do not find all versions of a page).
  • Next, we see that by normalizing the raw review relevance score, the recall is better than without normalization. One reason for this is that the scores are normalized using special normalizers like EntityMax or SiteMax (in this case EntityMax), so the scores are adjusted according to entity popularity, making it easier to separate truly relevant review pages from irrelevant ones.
  • The next question is, what is the best way to normalize review relevance score. This graph shows the average change in precision over not pruning the crawled pages using different normalizers. Here we can see that EM+GM gives the best precision and SM gives the lowest precision. SM is worst performing most probably because certain sites like Tripadvisor, have reviews on different types of entities like attractions, hotels, and so on. So using the max score from the site may be unreliable for sparse entities like attractions.
  • Next, we look at the growth of the FetchGraph. Since we use a single network to track all information, it may seem that it would grow too large too fast. This graph shows the growth of the FetchGraph with respect to the number of pages crawled. It is clear that the growth is actually linear in the number of pages collected, and this is without any form of optimization or compression; the growth could be contained further with more optimization. So this shows that it is possible to use this as an in-memory data structure for different data collection problems.
  • Now we will look at the improvement in efficiency from using the FetchGraph. The first row shows the average time to compute the raw relevance score with and without the FetchGraph. As you can see, it takes about 0.085 ms using the FetchGraph, and about 8.62 ms without it. Without the FetchGraph, you need to load a page into memory each time, parse it, and then compute the score; even if the page was already encountered previously, you would still have to load and parse it. With the FetchGraph, a page is loaded into memory only once and the connections with the review vocabulary are established immediately, so from then on computing the raw score is very straightforward.
  • Then, to compute the EntityMax normalizer, it takes 0.056 ms using the FetchGraph and 4.39 s without it. This is because without a dedicated data structure you basically have to load sets of pages back into memory to find the entity maximum; with the FetchGraph, global information is tracked, so you just need to do a lookup on the related set of pages and obtain the maximum score from that.
  • Now I will talk a little about the web demo system that I have developed, called Findilike, which integrates some of the ideas from this thesis. The demo was shown at WWW 2012.
  • Findilike finds and ranks entities based on a set of user preferences. These can be unstructured opinion preferences, which is the unique part, and also structured preferences such as price, brand, and so on. Beyond search, it supports analysis of entities in terms of textual review summaries and tag cloud visualization of reviews. The current version works in the hotels domain.
  • So this is the interface of Findilike. This is where you specify the unstructured preferences; in this case it is a search for clean hotels. This is where you specify the structured preferences, such as distance from a particular location; in this case, the location specified is Universal Studios in LA. And this is the ranking of hotels based on the preferences specified, so you have opinion preferences as well as distance.
  • This shows the tag clouds of reviews.
  • This shows the textual review summaries. You can see that the summaries are fairly well formed.
  • I have also updated part of the demo with reviews crawled using the OpinoFetch method. Here is an example: this summary is for the Hampton Inn in Champaign and is based on the initial reviews crawled from one or two sources, 26 reviews in total. With the reviews crawled using OpinoFetch, I obtained about 135 reviews after some filtering, based on 8 sources. I wrote a baseline review extractor to extract the individual reviews; the reviews selected were based on the length of the review (not too short) and a subjectivity score.
  • In terms of future work, with Opinosis I would like to look into how to scale the approach up to really large amounts of text; I want to explore the use of the MapReduce framework for this. I also would like to see how the approach works on other types of text, such as tweets, Facebook comments, news articles, and patient health records. For the opinion acquisition work, I would like to compare the proposed method with a supervised crawler, and to see how to further improve recall using just web search engines; to do this at a reasonable scale, I need to think about how to approximate judgments without relying completely on human judges. Then for the OBER work, I would like to see how query logs and click information can be used to further improve the ranking of entities, which is now possible because everything is logged in the demo system. I would also like to look into the use of phrasal search for ranking; I have not had much success with phrase search and would like to understand why. Perhaps I will try a back-off type of approach where the scoring of entities is first based on the phrase and then relaxed without the phrase restriction.

Transcript

  • 1. Opinion-Driven Decision Support System
  • 2. Visiting a new city… Online Opinions: Which hotel to stay at?
  • 3. Visiting a new city… Online Opinions: What attractions to visit? Without opinions, decision making becomes difficult!
  • 4. ODSS Components. 1. Data: comprehensive set of opinions to support search and analysis capabilities. 2. Analysis Tools: tools to help digest opinions (ex. summaries, opinion trend visualization). 3. Search Capabilities: ability to find entities using existing opinions. 4. Presentation: putting it all together, an easy way for users to explore results of the search and analysis components (ex. organizing and summarizing results). Focus of existing work: opinion summarization, structured summaries: (1) sentiment summary (ex. +ve/-ve on a piece of text); (2) fine-grained sentiment summary (ex. Battery life: 2 stars; Audio: 1 star). Not a complete solution to support decision making based on opinions!
  • 5. ODSS Components (same diagram). Need to address a broader set of problems to enable opinion-driven decision support.
  • 6. We need data: a large number of online opinions. Allow users to get a complete and unbiased picture; opinions are very subjective and can vary a lot. Currently: no study on how to systematically collect opinions from the web.
  • 7. We need different analysis tools to help users analyze & digest opinions: sentiment trend visualization (fluctuation over time), aspect-level sentiment summaries, textual summaries, etc. Currently: focus on structured summarization.
  • 8. We need to incorporate search: allow users to find different items or entities based on existing opinions. This can improve user productivity by cutting down on the time spent reading a large number of opinions.
  • 9. We also need to know how to organize & present opinions at hand effectively. Aspect-level summaries: how to organize these summaries? Scores or visuals (stars)? Do you show supporting phrases? Full opinions: how to allow effective browsing of reviews/opinions without overwhelming users?
  • 10. ODSS Components. 1. Data: comprehensive set of opinions to support opinion-based search & analysis tasks. 2. Analysis Tools: tools to help analyze & digest opinions (ex. summaries, opinion trend visualization). 3. Search Capabilities: find items/entities based on existing opinions (ex. show "clean" hotels only). 4. Presentation: organizing opinions to support effective decision making.
  • 11. 1. Should be general: works across different domains & possibly content types. 2. Should be practical & lightweight: can be integrated into existing applications; can potentially scale up to large amounts of data.
  • 12. Ganesan & Zhai 2012 (Information Retrieval)
  • 13. Currently: no direct way of finding entities based on online opinions. Need to read opinions about different entities to find entities that fulfill personal criteria. Time consuming & impairs user productivity!
  • 14. Use existing opinions to rank entities based on a set of unstructured user preferences. Finding a hotel: "clean rooms, good service". Finding a restaurant: "authentic food, good ambience".
  • 15. Use results of existing opinion mining methods: find sentiment ratings on different aspects, rank entities based on discovered aspect ratings. Problem: not practical! Costly (mine large amounts of textual content), need prior knowledge of the set of queriable aspects, and most existing methods rely on supervision (e.g. overall user rating).
  • 16. Use existing text retrieval models for ranking entities based on preferences: can scale up to large amounts of textual content, can be tweaked, and do not require costly IE or text mining.
  • 17. Investigate use of text retrieval models for Opinion-Based Entity Ranking. Compare 3 state-of-the-art retrieval models: BM25, PL2, DirichletLM (shown to work best for TR tasks); which one works best for this ranking task? Explore some extensions over existing IR models; can ranking improve with these extensions? Compile the first test set & propose an evaluation method for this new ranking task.
  • 18.
  • 19. Standard retrieval cannot distinguish multiple preferences in a query. E.g. the query "clean rooms, cheap, good service" is treated as a long keyword query but is actually 3 preferences. Problem: an entity may score highly because of matching one aspect extremely well. To address this: score each preference separately (multiple queries) and combine the results of each query using different strategies: score combination (works best), average rank, min rank, max rank.
  • 20. In standard retrieval, matching an opinion word & a standard topic word is not distinguished. In Opinion-Based Entity Ranking it is important to match opinion words in the query; opinion words have more variation than topic words (e.g. great: excellent, good, fantastic, terrific…). Intuition: expand a query with similar opinion words to help emphasize matching of opinions.
  • 21. (Chart: improvement over standard retrieval using QAM and QAM + OpinExp for PL2, LM, and BM25 in the Hotels and Cars domains.)
  • 22. (Same chart.) QAM: any model can be used.
  • 23. (Same chart.) QAM + OpinExp: BM25 most effective.
  • 24.
  • 25. Current methods: focus on generating structured summaries of opinions [Lu et al., 2009; Lerman et al., 2009; …]. (Example: opinion summary for iPod.)
  • 26. We need supporting textual summaries! To know more: read many redundant sentences. (Example: opinion summary for iPod.)
  • 27. Summarize the major opinions: what are the major complaints/praise in the text? Concise: easily digestible, viewable on a smaller screen. Readable: easily understood.
  • 28. Widely studied for years [Radev et al. 2000; Erkan & Radev, 2004; Mihalcea & Tarau, 2004…]. Not suitable for generating concise summaries. Bias: with a limit on summary size, selected sentences may have missed critical info. Verbose: does not shorten sentences. We need more of an abstractive approach.
  • 29. 2 Abstractive Summarization Methods. Opinosis: graph-based summarization framework, relies on structural redundancies in sentences. WebNgram: optimization framework based on readability & representativeness scoring; phrases generated by combining words in the original text.
  • 30. Input: set of sentences (topic specific, POS annotated).
  • 31. Step 1: generate a graph representation of the text (Opinosis-Graph). (Figure: word graph over "my phone calls drop frequently with the iphone" and "great device, but the calls drop too frequently".)
  • 32. Step 2: find promising paths (candidate summaries) & score the candidates, e.g. candidate 1 "calls drop frequently" (3.2), candidate 2 "great device" (2.5).
  • 33. Step 3: select top-scoring candidates as the final summary: "The iPhone is a great device, but calls drop frequently."
  • 34. Assume 2 sentences about the "call quality of iphone": 1. My phone calls drop frequently with the iPhone. 2. Great device, but the calls drop too frequently.
  • 35. (Figure: Opinosis-Graph for the two sentences, each node annotated with sentence:position IDs.) One node for each unique word + POS combination; SID and PID maintained at each node; edges indicate the relationship between words in a sentence.
  • 36. (Same graph figure.)
  • 37. A path shared by 2 sentences is naturally captured by the nodes (e.g. calls, drop, frequently).
  • 38. Easily discover redundancies for high-confidence summaries.
  • 39. Gap between words = 2.
  • 40. Gapped subsequences allow redundancy enforcement and discovery of new sentences.
  • 41. "Calls drop frequently with the iPhone" / "Calls drop frequently with the BlackBerry": one common high-redundancy path followed by a high fan-out, yielding "calls drop frequently with the iphone and blackberry".
  • 42. Input: topic-specific sentences from user reviews. Evaluation measure: automatic ROUGE evaluation.
  • 43. ROUGE Recall (ROUGE-1 / ROUGE-SU4): HUMAN (17 words) 0.3184 / 0.1293; OPINOSIS-best (15 words) 0.2831 / 0.0851; MEAD (75 words) 0.4932 / 0.2316. ROUGE Precision (ROUGE-1 / ROUGE-SU4): HUMAN 0.3434 / 0.3088; OPINOSIS-best 0.4482 / 0.3271; MEAD 0.0916 / 0.1515. MEAD has the highest recall but the lowest precision, with much longer sentences; MEAD does not do well in generating concise summaries.
  • 44. (Same results.) Performance of Opinosis is reasonable, similar to human performance.
  • 45. Use existing words in the original text to generate micropinion summaries (a set of short phrases). Emphasis on 3 aspects: compactness (use as few words as possible), representativeness (reflect major opinions in the text), readability (fairly well formed).
  • 46. Optimization framework: M = \arg\max_{m_1 \ldots m_k} \sum_{i=1}^{k} S_{rep}(m_i) + S_{read}(m_i), subject to \sum_{i=1}^{k} |m_i| \le \sigma_{ss}; \; S_{rep}(m_i) \ge \sigma_{rep}; \; S_{read}(m_i) \ge \sigma_{read}; \; sim(m_i, m_j) \le \sigma_{sim} for i \ne j.
  • 47. Objective function: optimize representativeness & readability scores. Ensure: summaries reflect key opinions & are reasonably well formed.
  • 48. S_{rep}(m_i): representativeness score of m_i. S_{read}(m_i): readability score of m_i.
  • 49. Constraint 1: maximum length of summary. User adjustable; captures compactness.
  • 50. Constraints 2 & 3: minimum representativeness & readability. Help improve efficiency; do not affect performance.
  • 51. Constraint 4: maximum similarity of phrases. User adjustable; captures compactness by minimizing redundancies.
  • 52. Measure used: standard Jaccard similarity measure. Why important? Allows the user to control the amount of redundancy. E.g. a user who desires good coverage of information on a small device can request less redundancy!
  • 53. Purpose: measure how well a phrase represents opinions from the original text. 2 properties of a highly representative phrase: (1) words should be strongly associated in the text; (2) words should be sufficiently frequent in the text. Captured by a modified pointwise mutual information (PMI) function: pmi'(w_i, w_j) = \log_2 \frac{p(w_i, w_j) \, c(w_i, w_j)}{p(w_i) \, p(w_j)} (add frequency of co-occurrence within a window).
  • 54. Purpose: measure the well-formedness of a phrase. Readability scoring: use Microsoft's Web N-gram model (publicly available) to obtain conditional probabilities of phrases. Intuition: a readable phrase occurs more frequently on the web than a non-readable phrase. S_{read}(w_1 \ldots w_k) = \frac{1}{K} \sum_{q} \log p(w_q \mid w_{q-n+1} \ldots w_{q-1}) (chain rule to compute the joint probability in terms of conditional probabilities, averaged).
  • 55. Input: user reviews for 330 products (CNET). Evaluation measure: automatic ROUGE evaluation.
  • 56. (Chart: ROUGE-2 recall vs. summary size (max 5 to 30 words) for KEA, Tfidf, Opinosis, and WebNGram.) WebNgram performs the best for this task; KEA is slightly better than tfidf; tfidf has the worst performance.
  • 57. (Same chart, with an example CNET review page showing PROS, CONS, and FULL REVIEW sections.)
  • 58. To Submit.
  • 59. No easy way to obtain a comprehensive set of opinions about an entity. Where to get opinions now? Rely on content providers or crawl a few sources. Problem: can result in source-specific bias; data sparseness for some entities.
  • 60. Automatically crawl online reviews for arbitrary entities, e.g. cars, restaurants, doctors. Target online reviews: they represent a big portion of online opinions.
  • 61. Focused crawlers are meant to collect pages relevant to a topic, e.g. "Database Systems", "Boston Terror Attack"; page type is not as important (news article, review pages, forum page, etc.). Most focused crawlers are supervised and require large amounts of training data for each topic. Not suitable for review collection on arbitrary entities: needing training data for each entity will not scale up to a large number of entities.
  • 62. A focused crawler for collecting review pages on arbitrary entities. Unsupervised approach: does not require large amounts of training data. Solves the crawling problem efficiently: uses a special data structure for relevance scoring.
  • 63. Input: set of entities in a domain (e.g. all hotels in a city), such as Hampton Inn Champaign, I Hotel Conference Center, La Quinta Inn Champaign, Drury Inn, … Step 1: for each entity, obtain an initial set of Candidate Review Pages (CRPs), e.g. via a search for "Hampton Inn … Reviews".
  • 64. Step 2: expand the list of CRPs by exploring links in the neighborhood of the initial CRPs. Step 3: score CRPs on entity relevance (Sent) and review page relevance (Srev); select pages with Srev > σrev and Sent > σent. Collect the relevant review pages.
  • 65. Use any general web search engine (e.g. Bing/Google), on a per-entity basis. Search engines do partial matching of entities to pages, so pages in the vicinity of the search results are more likely related to the entity query. Entity Query format: "entity name + brand / address" + "reviews", e.g. "Hampton Inn Champaign 1200 W University Ave Reviews".
  • 66. Follow the top-N URLs around the vicinity of the search results. Use a URL prioritization strategy: bias the crawl path towards entity-related pages. Score each URL based on the similarity between (a) URL + Entity Query, Sim(URL, EQ), and (b) Anchor + Entity Query, Sim(Anchor, EQ).
  • 67. To determine if a page is indeed a review page, use a review vocabulary V: a lexicon with the most commonly occurring words within review pages (details in thesis). Idea: score a page based on the number of review-page words it contains: S_{rev}^{raw}(p_i) = \sum_{t \in V} \log_2 c(t, p_i) \, wt(t), and S_{rev}(p_i) = \frac{S_{rev}^{raw}(p_i)}{normalizer(p_i)} \in [0, 1].
  • 68. (Same formula.) t is a term in the review vocabulary V; c(t, p_i) is the frequency of t in page p_i (tf); wt(t) is the importance weighting of t in the RV. The raw review-page relevance score is normalized to obtain the final review-page relevance score; the normalizer is needed to set proper thresholds.
  • 69. Explored 3 normalization options. SiteMax (SM): max S_{rev}^{raw}(p_i) amongst all pages related to a particular site; normalize based on site density. EntityMax (EM): max S_{rev}^{raw}(p_i) amongst all pages related to an entity; normalize based on entity popularity. EntityMax + GlobalMax (GM) or SiteMax + GlobalMax (GM): to help with cases where SM/EM are unreliable.
  • 70. To determine if a page is about the target entity: based on the similarity between the page URL & the Entity Query. Why it works: most review pages have highly descriptive URLs, and the Entity Query is a detailed description of the entity; the more the URL resembles the query, the more likely it is relevant to the target entity. Similarity measure: Jaccard similarity.
  • 71. The steps proposed so far can be implemented in a variety of different ways. Our goal: make the crawling framework usable in practice.
  • 72. 1. Efficiency: allow review collection for a large number of entities; the task should terminate in reasonable time & accuracy. Problems happen when required information cannot be accessed quickly, e.g. repeated access to term frequencies of different pages. 2. Rich Information Access (RIA): allow the client to access information beyond the crawled pages, e.g. get all review pages from the top 10 popular sites for entity X. A DB is not suitable because you cannot naturally model complex relationships and it would yield large joins.
  • 73. Heterogeneous graph data structure. Models complex relationships between different components in a data collection problem.
  • 74. (Figure: example FetchGraph with entity nodes (Hampton Inn Champaign, I-Hotel Conference Center, Drury Inn Champaign), page nodes, site nodes (tripadvisor.com, hotels.com, local.yahoo.com), term nodes, and logical nodes (t = title, u = url, c = content; review vocabulary; current query), connected by weighted edges.)
  • 75. (Same figure.) Entity nodes: the list of entities on which reviews are required. Entity-page edges: based on the set of CRPs found for each entity. At the core, everything is made up of terms: one node per unique term.
  • 76. Maintain one simple data structure. Access to various statistics, e.g. the TF of a word in a page = edge weight (content node to term node). Access to complex relationships and global information. Compact: can be an in-memory data structure. The network can be persisted and accessed later; client applications can use the network to answer interesting application-related questions, e.g. get all review pages for entity X from the top 10 popular sites.
  • 77. To compute S_{rev}^{raw}(p_i): take the terms present in both the content node and the review vocabulary node; TFs and weights can be obtained from the edges (content-term edge weight = TF; RV-term edge weight = importance). Lookup of review vocabulary words within a page is fast; there is no need to parse page contents each time a page is encountered.
  • 78. Access all pages connected to a site node: requires the complete graph.
  • 79. Access all pages connected to an entity node: requires the complete graph.
  • 80. (Figure: entity query node, "Hampton Inn Champaign 1200 W Univ… Reviews", and URL node, tripadvisor.com/ShowUser…, connected to shared term nodes.)
  • 81. Goal: evaluate accuracy & give insights into efficiency using the FetchGraph. Evaluated in 3 domains: Electronics (5), Hotels (5), Attractions (4). Only 14 entities: it is expensive to obtain judgments. Gold standard: for each entity, explore the top 50 Google results & links around the vicinity of the results (up to depth 3); 3 human judges determine the relevance of collected links to the entity query (crowdsourcing); final judgment by majority voting.
  • 82. Baseline: Google search results (deemed relevant to the entity query). Evaluation measures: precision, and recall as an estimate of coverage of review pages: Prec(e_k) = \frac{\#RelPages(e_k)}{\#RetrievedPages(e_k)}, \quad Recall(e_k) = \frac{\#RelPages(e_k)}{\#GoldStdRelPages(e_k)}.
  • 83.
  • 84. (Chart: recall vs. number of search results (10 to 50) for Google, OpinoFetch, and OpinoFetch-Unnormalized.) Google: recall consistently low. Search results are not always relevant to the EQ or are not direct pointers to actual review pages.
  • 85. (Same chart.) OpinoFetch: recall keeps improving. A lot of relevant content lies in the vicinity of the search results, and OpinoFetch is able to discover such relevant content.
  • 86. (Same chart.) OpinoFetch: better recall with normalization. Scores are normalized using special normalizers (e.g. EntityMax / SiteMax), making it easier to distinguish relevant review pages.
  • 87. (Chart: % change in precision by normalizer: EntityMax + GlobalMax 97.23%, EntityMax 85.72%, SiteMax + GlobalMax 36.23%, SiteMax 19.62%.) EM + GM gives the best precision; SM gives the lowest precision. SM is the worst performing: certain sites cover different classes of entities, so the max score from the site may be unreliable for sparse entities.
  • 88. (Chart: graph size vs. number of pages crawled, 0 to 1000.) Linear growth without any optimization/compression; possible to use the FetchGraph as an in-memory data structure.
  • 89. Avg. execution time with/without FetchGraph: S_{rev}^{raw}(p_i): 0.09 ms with vs. 8.60 ms without; EntityMax normalizer: 0.06 ms with vs. 4.40 s without. Without FetchGraph: parse page contents each time. With FetchGraph: a page is loaded into memory once, then the FetchGraph is used to compute S_{rev}^{raw}(p_i).
  • 90. (Same table.) Without FetchGraph: load sets of pages into memory to find the entity max normalizer. With FetchGraph: global info is tracked till the end; only a lookup on the related set of pages is needed to obtain the entity max normalizer.
  • 91. Proposed: an unsupervised, practical method for collecting reviews on arbitrary entities; works with reasonable accuracy without requiring large amounts of training data. Proposed FetchGraph: helps with efficient lookup of various statistics; useful for answering application-related queries.
  • 92. Ganesan & Zhai, WWW 2012
  • 93. Finds & ranks entities based on user preferences: unstructured opinion preferences (novel) and structured preferences (e.g. price, brand, etc.). Beyond search: support for analysis of entities; ability to generate textual summaries of reviews; ability to display tag clouds of reviews. Current version: works in the hotels domain.
  • 94. Search: find entities based on unstructured opinion preferences; combine with structured preferences. Ranking: how well are all preferences matched?
  • 95. Tag clouds weighted by frequency; related snippets ("convenient location").
  • 96. Opinion summaries: readable, well formed; related snippets.
  • 97. Summary with initial reviews: 26 reviews in total, 1-2 sources. Summary with OpinoFetch reviews: 135 reviews (8 sources), extracted with a baseline extractor; not all reviews were included (filtered based on the length of the review and a subjectivity score).
  • 98. Opinion-Based Entity Ranking: use click-through & query logs to further improve the ranking of entities (now possible since everything is logged by the demo system). Look into the use of phrasal search for ranking: limit deviation from the actual query (e.g. "close to university"); explore "back-off" style scoring: score based on the phrase, then remove the phrase restriction.
  • 99. Opinosis: how to scale up to very large amounts of text? Explore use of the MapReduce framework. Would this approach work with other types of texts, e.g. tweets, Facebook comments (shorter texts)? Opinion Acquisition: compare OpinoFetch with a supervised crawler (can it achieve comparable results?); how to improve the recall of OpinoFetch? To evaluate at a reasonable scale: approximate judgments without relying on humans?
  • 100. References:
[Barzilay and Lee 2003] Barzilay, Regina and Lillian Lee. 2003. Learning to paraphrase: an unsupervised approach using multiple-sequence alignment. In NAACL '03: Proceedings of the 2003 Conference of the North American Chapter of the Association for Computational Linguistics on Human Language Technology, pages 16-23, Morristown, NJ, USA.
[DeJong 1982] DeJong, Gerald F. 1982. An overview of the FRUMP system. In Lehnert, Wendy G. and Martin H. Ringle, editors, Strategies for Natural Language Processing, pages 149-176. Lawrence Erlbaum, Hillsdale, NJ.
[Erkan and Radev 2004] Erkan, Günes and Dragomir R. Radev. 2004. LexRank: graph-based lexical centrality as salience in text summarization. Journal of Artificial Intelligence Research, 22(1):457-479.
[Finley and Harabagiu 2002] Finley, Sanda Harabagiu and Sanda M. Harabagiu. 2002. Generating single and multi-document summaries with GISTexter. In Proceedings of the Workshop on Automatic Summarization, pages 30-38.
[Hu and Liu 2004] Hu, Minqing and Bing Liu. 2004. Mining and summarizing customer reviews. In KDD, pages 168-177.
[Jing and McKeown 2000] Jing, Hongyan and Kathleen R. McKeown. 2000. Cut and paste based text summarization. In Proceedings of the 1st North American Chapter of the Association for Computational Linguistics Conference, pages 178-185, San Francisco, CA, USA. Morgan Kaufmann Publishers Inc.
[Lerman et al. 2009] Lerman, Kevin, Sasha Blair-Goldensohn, and Ryan McDonald. 2009. Sentiment summarization: Evaluating and learning user preferences. In 12th Conference of the European Chapter of the Association for Computational Linguistics (EACL-09).
[Mihalcea and Tarau 2004] Mihalcea, R. and P. Tarau. 2004. TextRank: Bringing order into texts. In Proceedings of EMNLP 2004, the Conference on Empirical Methods in Natural Language Processing, July.
[Pang and Lee 2004] Pang, Bo and Lillian Lee. 2004. A sentimental education: Sentiment analysis using subjectivity summarization based on minimum cuts. In Proceedings of the ACL, pages 271-278.
[Pang et al. 2002] Pang, Bo, Lillian Lee, and Shivakumar Vaithyanathan. 2002. Thumbs up? Sentiment classification using machine learning techniques. In Proceedings of the 2002 Conference on Empirical Methods in Natural Language Processing (EMNLP), pages 79-86.
[Radev and McKeown 1998] Radev, D. R. and K. McKeown. 1998. Generating natural language summaries from multiple on-line sources. Computational Linguistics, 24(3):469-500.
[More in Thesis Report]