Is This Entity Relevant to Your Needs? - CIKM 2012
  1. Is This Entity Relevant to Your Needs? David Carmel, IBM Research - Haifa, Israel. © 2012 IBM Corporation
  2. Outline
     - Some open questions in Entity-Oriented Search (EoS): What makes an entity relevant to the user's needs? Is it the same relevance that the IR community deals with? Can we adopt existing IR models in this new area?
     - The classical model of relevance in IR: user-based relevance; topic-based relevance (aboutness); similarity-based relevance measurements
     - Supportive evidence as an indication of relevance: for Q&A and for EoS
     - Relevance estimation approaches for EoS
     - Exploration and discovery in EoS
     - Summary
  3. Entity-Oriented Search (EoS)
     - When people use retrieval systems, they are often not searching for documents or text passages.
     - Named entities often play a central role in answering such information needs: persons, organizations, locations, products...
     - At least 20-30% of the queries submitted to Web search engines are simply named entities; ~71% of Web search queries contain named entities ("Named entity recognition in query", Guo et al., SIGIR 2009).
  4. Popular entity-oriented search tools
     - Product search: on-line shopping (books, movies, electronic devices...) - Amazon, eBay...; travel (places, hotels, flights...) - Yahoo! Travel, Kayak...; multimedia (music, video, images...) - Last.fm, YouTube, Flickr...
     - People search: expert search (for a specific topic) - LinkedIn, ArnetMiner...; friends (colleagues, other people with mutual interests, lost friends...) - Facebook...
     - Location search: addresses; businesses; proximity search (find sites close to the searcher's current location)
  5. (Figure slide - no text content)
  6. Expert search
     - The task: identify people who are knowledgeable on a specific topic - find people who have skills and experience on a given topic.
     - How can knowledgeability be measured?
     - How should people be ranked, in response to a query, so that those with relevant expertise are ranked first?
  7. Do those entities satisfy our needs?
     - What makes an entity relevant to the user's need? What is the meaning of relevance in this context? Is it the same relevance that the IR community has dealt with for many decades in the context of document retrieval? Can we adopt existing IR models in this new area of entity-oriented search in a straightforward manner?
     - In this talk I'll try to address some of those questions, overview how the same questions are handled in related areas (especially in Q&A), and raise some research directions that might lead to a better understanding of the concept of relevance in EoS.
  8. What is an entity?
     - Entity: an object or a "thing" that can be uniquely identified in the world. An entity must be distinguishable from other entities, and can be anything (including an abstract thing!).
     - Attributes are used to describe entities; an attribute contains a single piece of information.
     - Key: a minimal set of attributes that uniquely identifies an entity.
     - Entity set: a set of entities of the same type and attributes.
     - (ERD example: an Actor entity with attributes id, name, birthday, address)
  9. What is a relationship?
     - Relationship: an association among two or more entities. A relationship may also have attributes.
     - Relationship set: a set of relationships of the same type.
     - (ERD example: a Prescription relationship among Patient, Physician, and Medication entities, with a Date attribute)
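The definitions on the two slides above can be sketched in code. This is a hypothetical illustration - the `Entity`/`Relationship` class names and fields are mine, not from the talk:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Entity:
    key: str      # minimal identifying attribute (e.g. an id)
    etype: str    # entity type, e.g. "Patient" or "Physician"
    attrs: tuple  # remaining descriptive attributes as (name, value) pairs

@dataclass(frozen=True)
class Relationship:
    rtype: str        # relationship type, e.g. "Prescription"
    members: tuple    # keys of the participating entities
    attrs: tuple = () # relationship attributes, e.g. (("date", "2012-10-29"),)

# The ERD example from the slide: a Prescription associating a patient,
# a physician, and a medication, carrying its own Date attribute.
patient = Entity("p1", "Patient", (("name", "Alice"),))
physician = Entity("ph1", "Physician", (("name", "Bob"),))
rx = Relationship("Prescription", ("p1", "ph1", "m1"), (("date", "2012-10-29"),))
```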
  10. (Figure: example ERD for social search in the enterprise, including a Creator relationship)
  11. Entity Relationship Graph (ERG)
     - Represents entity instances as graph nodes and binary relationships as (weighted) edges.
     - N-ary relations are broken into binary ones.
  12. Entity-Oriented Search (EoS) architecture
     - Offline: entity-relationship data is processed into an index of entities and relations.
     - Runtime: a query (free text, entity, or hybrid - e.g. "Nikon D40", "Teammates of Michael Schumacher", "data mining") retrieves related entities and relationships for ranking, navigation, and exploration.
  13. The concept of relevance in IR
  14. The classical concept of relevance in IR (Saracevic 1976, Mizzaro 1996)
     - Problem (P): the user has a problem to solve or an aim to achieve.
     - Information Need (IN): the user builds a mental, implicit representation of P (which may be incorrect or incomplete).
     - Request (R): the user expresses IN explicitly, usually in natural language (sometimes with the help of an intermediary).
     - Query (Q): formalization - R is translated to a formal query understandable by the search system.
     - Judgment (J): the same user judges the relevance of the search results.
  15. User-based (subjective) relevance
     - Relevance is a dynamic concept that depends on the user's subjective judgment.
     - Subjective relevance judgment may depend on the user's characteristics and perceptions: gender, age, education, income, occupation...; preferences, interests, state of mind.
     - It may also depend on the context of search: the user's level of expertise regarding the topic of interest, current time, current location, session status, and dependencies between retrieved items for the specific query or across sequential queries during the session.
  16. Topic-based relevance judgment
     - How well the topic of the retrieved information matches the topic of the request: an object is objectively relevant to a request if it deals with the topic of the request (aboutness).
     - TREC working definition for relevance assessment: "If you are writing a report on the topic and would use the information contained in the document in the report, then the document is considered relevant to the topic." A document is judged relevant if any piece of it is relevant, regardless of how small that piece is in relation to the rest of the document.
  17. The Probability Ranking Principle
     - Given a set of documents that "match" the entity-oriented query, how do we rank them for the user?
     - The Probability Ranking Principle (PRP) for document retrieval (Robertson 1971): "If a retrieval system's response to each request is a ranking of the documents in the collection in order of decreasing probability of relevance to the user who submitted the request, where the probabilities are estimated as accurately as possible on the basis of whatever data have been made available to the system for this purpose, the overall effectiveness of the system to its user will be the best..."
     - In document retrieval we estimate Pr(R=1 | d, q); for EoS we need Pr(R=1 | e, q) - a reliable and coherent methodology for measuring the probability of relevance of an entity to a query.
  18. Relevance estimation in classic document retrieval
     - Most relevance approximation approaches for document retrieval measure some kind of similarity between the user's query and the retrieved documents:
     - Vector space: the cosine of the angle between the two vectors.
     - Concept space: similarity in a latent concept space (e.g. LDA, LSI, ESA).
     - Language models: similarity between the document's and the query's term distributions.
     - Can we use similar approaches for EoS?
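The vector-space case from the slide above can be illustrated with a minimal sketch over raw term frequencies (a simplification: real systems use tf-idf weights and proper tokenization):

```python
import math
from collections import Counter

def cosine(q_text: str, d_text: str) -> float:
    """Cosine of the angle between bag-of-words term-frequency vectors -
    the classic vector-space relevance estimate."""
    q, d = Counter(q_text.lower().split()), Counter(d_text.lower().split())
    dot = sum(q[t] * d[t] for t in q)
    norm = (math.sqrt(sum(v * v for v in q.values()))
            * math.sqrt(sum(v * v for v in d.values())))
    return dot / norm if norm else 0.0
```

Identical texts score 1.0; texts with no shared terms score 0.0, which already hints at the entity-similarity problem raised on the next slide.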
  19. Entity similarity
     - While similarity plays a central role in relevance estimation for document retrieval, many relevant entities are not similar to the queried entity - at least according to standard definitions of similarity.
     - This problem is well known in the question answering domain: the answer is not necessarily "similar" to the question, and the supportive passage is not always similar to the question.
     - Example: Who killed JFK? "John F. Kennedy (JFK), the thirty-fifth President of the United States, was assassinated at 12:30 p.m. Central Standard Time (18:30 UTC) on Friday, November 22, 1963, in Dealey Plaza, Dallas, Texas. The ten-month investigation of the Warren Commission of 1963-1964 concluded that the President was assassinated by Lee Harvey Oswald."
  20. Relevance judgment in question answering
     - In QA we usually assume a question that identifies the information need "precisely": Who was the first American in space? How many calories are there in a Big Mac? How many Grand Slam titles did Bjorn Borg win?
     - When will an answer be considered relevant to the question? It must be correct, i.e. it must have supportive evidence from reliable sources.
     - A prominent factor in answering a question is not so much finding an answer as validating whether the candidate answer is correct; therefore supportive evidence is essential.
     - Assessment instructions from TREC's QA track: assessors read each candidate answer and make a binary decision as to whether or not the candidate is actually an answer to the question in the context provided by the supportive document.
  21. What do you mean, the answer is correct?
     - As in document retrieval, correctness/relevance in QA may be subjective and user dependent. Where is the Taj Mahal? Agra, India - the famous temple? Or Atlantic City, NJ - the casino?
     - In TREC it is common to consider each candidate answer with (relevant) supportive evidence as correct. This leads to an understanding of how candidate answers can be ranked: relevance judgment is transformed into judgment of the relevance of the supporting evidence.
     - This approach can be applied to entity-oriented search: rank retrieved entities according to the amount and quality of their supportive evidence. Entity ranking should be based on the supportive evidence for their relevance to the query.
  22. Relevance estimation approaches for EoS
  23. The expert-profile-based approach (Craswell et al., 2001)
     - Represent each person by a virtual document (a profile): the employee directory entry (in the enterprise) plus a concatenation of all existing passages mentioning the person.
     - Rank those profiles according to their relevance to the query, using standard IR ranking techniques.
     - The profile can naturally be used as supportive evidence for the person's expertise.
     - Difficulties: co-reference resolution and name disambiguation; privacy concerns.
  24. EoS: the voting approach (Balog 2006, Macdonald 2009)
     - Any relevant document is a "voter" for the entities it mentions or relates to:

       Score(p, q) = Σ_d Score(d, q) × Score(p, d)

     - What is the rationale behind this? An entity mentioned many times in relevant (top-retrieved) documents is more likely to be relevant to the given topic.
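The voting formula above can be sketched as follows. This assumes Score(d, q) comes from any standard document ranker and Score(p, d) is a document-entity association strength (e.g. a mention count); the dictionary shapes are my own illustration, not from the cited papers:

```python
def vote_scores(doc_scores, doc_entities):
    """Voting approach: every retrieved document 'votes' for the entities
    it is associated with, weighted by the document's retrieval score
    Score(d, q) and the association strength Score(p, d).

    doc_scores:   {doc_id: Score(d, q)}
    doc_entities: {doc_id: {entity_id: Score(p, d)}}
    Returns {entity_id: Score(p, q)}.
    """
    entity_scores = {}
    for d, sdq in doc_scores.items():
        for p, spd in doc_entities.get(d, {}).items():
            entity_scores[p] = entity_scores.get(p, 0.0) + sdq * spd
    return entity_scores
```

An entity mentioned in both of two top-ranked documents accumulates both votes, illustrating the rationale stated on the slide.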
  25. Relevance propagation (Serdyukov 2008)
     - We should also consider entities that are indirectly related to the query: relevance is propagated through the entity-relationship graph.
     - How should relevance be propagated in the graph?
  26. Proximity in the entity-relationship graph - random walks
     - Random walk approach: the relationship strength between two nodes is reflected by the probability that a random surfer who starts at one node will visit the second one during the walk.
     - Justification: the more paths that connect the two entities in the graph, the higher the probability that the surfer will visit the target entity, and the higher the relationship strength between the two.
     - Popular random walk approaches:
     - SimRank(u, v): how soon two random surfers (starting at u and v) are expected to meet at the same node.
     - Random walk with restart (RWR): the surfer has a fixed restart probability of returning to the source.
     - Lazy random walk: the surfer has a fixed probability of halting the walk at each step.
     - Effective conductance: only simple (cycle-free) paths, treating edges as resistors.
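As an illustration of RWR specifically, here is a minimal power-iteration sketch; the restart value, iteration count, and graph encoding are my assumptions, not details from the talk:

```python
def rwr(adj, source, restart=0.15, iters=100):
    """Random walk with restart on a weighted graph.
    adj: {node: {neighbor: edge_weight}}.
    Returns visit probabilities, usable as relationship strength
    from `source` to every other entity."""
    nodes = list(adj)
    p = {n: (1.0 if n == source else 0.0) for n in nodes}
    for _ in range(iters):
        # Restart mass always returns to the source.
        nxt = {n: (restart if n == source else 0.0) for n in nodes}
        for u in nodes:
            out = sum(adj[u].values())
            if not out:
                nxt[source] += (1 - restart) * p[u]  # dangling mass restarts
                continue
            for v, w in adj[u].items():
                nxt[v] += (1 - restart) * p[u] * w / out
        p = nxt
    return p
```

On a chain a-b-c, the walk from a assigns b a higher probability than c, matching the intuition that more (and shorter) paths mean a stronger relationship.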
  27. Markov Random Fields for EoS (Raviv, Carmel, Kurland, 2012)
     - The query is Q = <{q1...qn}, T>: query terms plus a target type.
     - The entity score aggregates three components - the entity's document (D), type (T), and name (N):

       P(E | Q) ∝ Σ_{X ∈ {D,T,N}} λ_X P(E_X | Q)
  28. MRF-based entity-document scoring P(E_D | Q)
     - We consider three types of cliques: full independent, sequentially dependent, and fully dependent.
     - The feature function over cliques measures how well the clique's terms represent the entity document, based on a Dirichlet-smoothed language model:

       f(q_i, E_D) = log( (tf(q_i, E_D) + μ · cf(q_i)/|C|) / (|E_D| + μ) )

     - For the dependent models we replace q_i with #1(q_i..q_{i+k}) and #uwN({q_i,..,q_j}) respectively.
     - The entity-document scoring function aggregates the feature functions over all clique types:

       P(E_D | Q) ∝ Σ_{I ∈ {T,O,U}} λ_I Σ_{c ∈ I_{E_D}} f_I(c)
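The per-term clique feature above can be computed directly. A minimal sketch; μ = 2500 is a common default for Dirichlet smoothing and is my choice, not a value stated on the slide:

```python
import math

def dirichlet_feature(tf, doc_len, cf, coll_len, mu=2500.0):
    """Dirichlet-smoothed language-model feature for one query term in an
    entity document, matching the slide's formula:
      f(q_i, E_D) = log( (tf(q_i, E_D) + mu * cf(q_i)/|C|) / (|E_D| + mu) )
    tf: term frequency in the entity document; doc_len: |E_D|;
    cf: collection frequency of the term; coll_len: |C|."""
    return math.log((tf + mu * cf / coll_len) / (doc_len + mu))
```

Smoothing means a term absent from the entity document still gets a (small) non-zero probability from the collection model, so the log is finite; more occurrences in the document raise the feature.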
  29. Entity-type scoring P(E_T | Q)
     - We measure the "similarity" between the query type and the entity type:

       P(E_T | Q) = f_T(c) = log( e^{-α d(Q_T, E_T)} / Σ_{E' ∈ R} e^{-α d(Q_T, E'_T)} )

     - d(Q_T, E_T), the type distance, is domain dependent. In our experiments we measured it in the Wikipedia category graph: the minimal path length between all pairs of the query's and the entity's page categories.
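The type feature above is a log-softmax over negative, α-scaled type distances; a minimal sketch, where the input encoding (precomputed distances for all candidate entities) is my assumption:

```python
import math

def type_score(dist_to_query, all_dists, alpha=1.0):
    """Entity-type feature from the slide: log of a softmax over negative
    category-graph distances.
    dist_to_query: d(Q_T, E_T) for the scored entity;
    all_dists: d(Q_T, E'_T) for every candidate E' in the result set R."""
    num = math.exp(-alpha * dist_to_query)
    den = sum(math.exp(-alpha * d) for d in all_dists)
    return math.log(num / den)
```

An entity whose type sits closer to the query type in the category graph gets a strictly higher score, and the scores are log-probabilities (always negative when more than one candidate exists).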
  30. Entity-name scoring P(E_N | Q)
     - We measure the dependency between the query term(s) and the entity name:
     - Globally: measure the proximity between the query term(s) and the entity name in the whole collection, using pointwise mutual information (PMI) - the likelihood of finding one term in proximity to another.
     - Locally: measure the proximity between the query terms and the entity name in the top retrieved documents.

       P(E_N | Q) = Σ_{X ∈ A} λ_X Σ_{c ∈ X_{E_N}} f_N^X(c),  A = {S, T, O, U, PMI_T, PMI_O, PMI_U}
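The global PMI signal can be sketched over co-occurrence windows. The window-count encoding is my illustration; the slide does not specify how proximity windows are counted:

```python
import math

def pmi(cooc, count_a, count_b, total):
    """Pointwise mutual information between two terms, estimating how
    likely the query term and the entity name are to occur near each
    other in the collection.
      cooc:    number of windows containing both terms
      count_a: windows containing term a; count_b: windows containing b
      total:   total number of windows in the collection"""
    p_ab = cooc / total
    p_a, p_b = count_a / total, count_b / total
    return math.log(p_ab / (p_a * p_b))
```

PMI is positive when the terms co-occur more often than independence predicts, zero under independence, and negative when they avoid each other.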
  31. Experimental results over the INEX Entity track (2007-2009)
     - (Bar charts for the full-independence, sequential-dependence, and full-dependence models, comparing S(ED), S(ED,ET), S(ED,ET,EN), and the top INEX run for each year.)
     - Results improved significantly when type and name scoring were added.
     - Final results are superior to the top INEX results for 2007 and 2008, and comparable for 2009.
     - Dependence models did not improve over the independence model??
  32. Exploratory EoS
     - When only an entity is given as input, the information need is quite fuzzy: any related entity has the potential to be relevant, therefore any related entity should be retrieved! This yields high diversity in search results (entity types, relationship types).
     - How can we help the user find the most relevant answers? Iterative IR - let the user navigate and explore the ER graph.
     - Facet search: categorize the search results according to their facets (entity types, attributes, ...) and let the user drill down, restricting retrieved entities to a specific facet. NOTE: we still need to rank the search results within each facet!
     - Graph navigation: let the user explore the graph by using a retrieved entity as a pivot for a new search.
     - Query reformulation.
  33. Search over Social Media Data (SaND) - (Carmel 2009, Guy 2010)
     - SaND provides social aggregation over social data: it builds an entity-entity relationship matrix that maps a given entity to all related entities, weighted by their relationship strength.
     - Direct relations of a user: to a document - as an author, tagger, or commenter; to another user - as a friend or as a manager/employee; to a tag - one she used, or was tagged with by others; to a group - as a member/owner.
     - Indirect relations: two entities are indirectly related if both are directly related to the same entity.
     - The overall relationship strength between two entities is determined by a linear combination of their direct and indirect relationship strengths.
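The direct/indirect combination described above can be sketched as one step of matrix-style aggregation. This is a hypothetical illustration: the λ weight and the dictionary encoding of the relationship matrix are mine, not SaND's actual implementation:

```python
def combined_strength(direct, lam=0.5):
    """SaND-style aggregation sketch: overall entity-entity strength is a
    linear combination of direct strength and indirect strength, where
    two entities are indirectly related through a shared neighbor
    (one sparse matrix product).
    direct: {u: {v: weight}} direct-relation weights."""
    indirect = {}
    for u, nbrs in direct.items():
        for m, w1 in nbrs.items():          # u -- m
            for v, w2 in direct.get(m, {}).items():  # m -- v
                if v != u:
                    indirect.setdefault(u, {})[v] = (
                        indirect.get(u, {}).get(v, 0.0) + w1 * w2)
    combined = {}
    for u in direct:
        for v in set(direct.get(u, {})) | set(indirect.get(u, {})):
            combined.setdefault(u, {})[v] = (
                lam * direct.get(u, {}).get(v, 0.0)
                + (1 - lam) * indirect.get(u, {}).get(v, 0.0))
    return combined
```

For example, two users with no direct relation who both authored the same document become indirectly related through it, and receive a non-zero combined strength.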
  34. (Screenshot: searching for the term 'social'.) Results contain different types of entities - blogs, communities, bookmarked documents, etc.; popular, higher-ranked results appear higher in the result set. Related People: a ranked list of people related to the topic and to the result set, in one or more relationship types (author, commenter, tagger, etc.). Related Tags: a ranked tag cloud for this result set.
  35. (Screenshot: narrowing the search to Luis Suarez's related results. Hovering over a result highlights the related people and tags.)
  36. (Screenshot: viewing results for query 'social' and person 'Luis Suarez' - Luis's business card and the results related to him.)
  37. Summary
     - In this talk we raised several questions related to the concept of relevance in EoS: What makes an entity relevant to the user's need? What is the meaning of relevance in this context? Is it the same notion of relevance used in document retrieval?
     - We argued that the relevance of an entity can be estimated according to the supportive evidence provided by the search system.
     - We covered common EoS retrieval techniques: the profile-based approach, the voting approach, and relevance propagation.
     - We discussed several examples of EoS systems and how relevance estimation can be applied in these domains.
     - We claimed that the scale and diversity of EoS search results demand exploratory search techniques such as facet search and graph navigation.
  38. Open questions and challenges
     - Entity similarity: while similarity plays a central role in relevance judgment for document retrieval, entity similarity measurement should still be better understood - attribute-based similarity, evidence-based similarity, graph proximity, hybrid approaches.
     - The clustering hypothesis: are two "similar" entities likely to be relevant to the same information need? To what extent are relevant entities indeed similar to each other, and according to which similarity measure?
     - Relevance propagation: which relationship types provide effective relevance propagation channels? Do your friends inherit your own expertise? Which relationship types contribute to relevance propagation?
  39. Thank you! Questions?
  40. Is This Entity Relevant to Your Needs? David Carmel, IBM Research - Haifa, Israel