Entity Aware Click Graph

Query logs record the actual usage of search systems and
their analysis has proven critical to improving search engine
functionality. Yet, despite the deluge of information, query
log analysis often suffers from the sparsity of the query space.
Based on the observation that most queries pivot around a
single entity that represents the main focus of the user’s
need, we propose a new model for query log data called the
entity-aware click graph. In this representation, we decompose
queries into entities and modifiers, and measure their
association with clicked pages. We demonstrate the benefits
of this approach on the crucial task of understanding which
websites fulfill similar user needs, showing that using this
representation we can achieve a higher precision than other
query log-based approaches.
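
The decomposition described above can be illustrated with a minimal sketch. It is not the authors' implementation: the toy entity dictionary, the longest-match entity spotting, the `E:`/`M:` feature prefixes, the `.example` site names, and the use of cosine similarity are all assumptions made for illustration.

```python
from collections import Counter, defaultdict
from math import sqrt

# Hypothetical toy entity dictionary (assumption; stands in for a knowledge base).
ENTITIES = {"brad pitt", "forrest gump"}


def decompose(query, entities=ENTITIES):
    """Split a query into (entity, modifier) by longest-match dictionary lookup.

    Returns (None, normalized_query) when no known entity is found.
    """
    tokens = query.lower().split()
    n = len(tokens)
    # Try longer spans first so a full entity name wins over a shorter match.
    for length in range(n, 0, -1):
        for start in range(n - length + 1):
            span = " ".join(tokens[start:start + length])
            if span in entities:
                rest = tokens[:start] + tokens[start + length:]
                return span, " ".join(rest)
    return None, " ".join(tokens)


def build_graph(clicks, entities=ENTITIES):
    """Map each site to click-weighted counts of entity and modifier features."""
    graph = defaultdict(Counter)
    for query, site, count in clicks:
        entity, modifier = decompose(query, entities)
        if entity is not None:
            graph[site]["E:" + entity] += count
        if modifier:
            graph[site]["M:" + modifier] += count
    return graph


def cosine(a, b):
    """Cosine similarity between two sparse feature-count vectors."""
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


# Hypothetical click records: (query, clicked site, click count).
CLICKS = [
    ("brad pitt movies", "imdb.example", 3),
    ("forrest gump cast", "imdb.example", 2),
    ("brad pitt news", "gossip.example", 4),
    ("forrest gump movies", "films.example", 1),
]
GRAPH = build_graph(CLICKS)
```

In this toy log no two sites share a complete query, so a similarity measure based on whole queries would leave them unconnected. After decomposition, `imdb.example` and `gossip.example` share the entity "brad pitt", and `imdb.example` and `films.example` share both an entity and the modifier "movies", so `cosine` returns a non-zero score for both pairs.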


  1. Mendes, Mika, Zaragoza, Blanco. Measuring Website Similarity using an Entity-Aware Click Graph. Pablo N. Mendes (1), Peter Mika (2), Hugo Zaragoza (2), Roi Blanco (2). (1) Freie Universität Berlin, (2) Yahoo! Research Barcelona. Nov 1st 2012, Maui, CIKM 2012
  2. Introduction: query log analysis
     ● Query logs record user interaction with Web search engines
     ● Query log analysis has proven critical to improving search
     ● For search engines
       – Ranking, autosuggest, “Also try”, etc.
     ● For site owners
       – Insight into user needs, allows optimizing Web presence, etc.
  3. Introduction: website similarity
     ● Click graph: relating queries and websites, edges are clicks
     ● Allows modeling website relatedness based on shared queries leading to each website pair
     [Figure: click graph and the derived site similarity graph (SG)]
  4. Problems: Sparsity
     ● 44% of queries occur only once even when considering a full year of data [1]
     ● Using “shared queries” as a relatedness measure becomes difficult in the long tail
     [1] Baeza-Yates. Relating content through web usage. In HT ’09, 2009.
  5. Problems: partial overlaps
     ● Breaking up into words distorts semantics
       – “Forest” vs “Forest Gump”
       – “Pitt” vs “Brad Pitt”
  6. Introduction
     ● >62% of queries contain an entity name or type [20]
     [20] Pound, Mika, & Zaragoza. Ad-hoc object retrieval in the web of data. In WWW ’10, 2010.
  7. Entity-aware Click Graph
     ● Websites can share entities and/or modifiers
  8. Entity-aware Website Similarity Graph
     ● More connected
     ● Preserves semantics
     ● Allows analysis of how websites relate to entities and modifiers
  9. Experiments
     ● Website similarity
       – Find top K similar sites
       – Evaluation: two sites are “similar” if they are in the same category in ODP (Open Directory Project)
     ● Website characteristics from the searcher’s point of view
       – What entities lead to a website
       – What context words lead to a website
  10. Dataset Statistics: Query Log
      ● 1 month of queries from Yahoo!, 45M sessions
      ● 5M entities from Freebase
  11. Results 1
      ● Similarity edge prediction
  12. Results 1
      ● Similarity edge prediction with credit to partial category overlap
  13. Results 2
      [Figure: websites characterized by two axes, entropy of distribution of entities and entropy of distribution of modifiers; labeled regions: many entities / few modifiers, many entities / many modifiers, few entities / many modifiers]
  14. Results 2
  15. Conclusion
      ● Recognizing entities in Web search logs allows for click graphs that account for the internal composition of queries
      ● New similarity graphs built from entity-aware click graphs enable more robust and flexible similarity analysis (evaluated for website similarity)
      ● Future:
        – Exploit the knowledge base (e.g. type hierarchy)
        – More complex queries
        – etc.
  16. Thank you! Questions?
      ● Web: http://pablomendes.com
      ● E-mail: pablo.mendes@fu-berlin.de
      ● Twitter: @pablomendes
      ● Slideshare: slideshare.net/pablomendes
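
The "Results 2" slide characterizes websites by the entropy of the distribution of entities and of modifiers that lead to them. A minimal sketch of that measurement follows, assuming Shannon entropy in bits (the slides do not state the logarithm base) and click-weighted counts. The two example profiles are hypothetical and not taken from the dataset.

```python
from collections import Counter
from math import log2


def entropy(counts):
    """Shannon entropy (in bits) of the distribution implied by raw counts."""
    total = sum(counts.values())
    if total == 0:
        return 0.0
    return -sum((c / total) * log2(c / total) for c in counts.values() if c > 0)


# Hypothetical site reached through many different entities but a single
# modifier, e.g. a lyrics site ("<song> lyrics").
lyrics_entities = Counter({"song a": 1, "song b": 1, "song c": 1, "song d": 1})
lyrics_modifiers = Counter({"lyrics": 4})

# Hypothetical site reached through one entity with many modifiers,
# e.g. a fan site dedicated to a single person.
fan_entities = Counter({"brad pitt": 4})
fan_modifiers = Counter({"news": 1, "photos": 1, "movies": 1, "biography": 1})

profiles = {
    "lyrics site": (entropy(lyrics_entities), entropy(lyrics_modifiers)),
    "fan site": (entropy(fan_entities), entropy(fan_modifiers)),
}
```

Under these assumptions the lyrics-style profile falls in the "many entities, few modifiers" region (high entity entropy, zero modifier entropy), and the fan-site profile falls in the "few entities, many modifiers" region.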
