
Determining Relevance Rankings from Search Click Logs



  1. Dr. Carson Kai-Sang Leung, Inderjeet Singh (Database and Data Mining Lab). Comp 7220, 1/11/2012.
  2. Outline: Introduction, Problem, Solution, Methodology, Evaluation.
  3. [Image slide; no text]
  4. Applications of click-log mining: mining user behaviour/preferences, predicting document relevance, re-ranking the search results, comparing different ranking functions (train/test), optimizing ad performance, and query suggestions.
     How big are these logs?
     ◦ 10+ terabytes of entries each day
     ◦ Composed of billions of distinct (query, url) pairs
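     At this scale the raw log is usually collapsed into per-(query, url) statistics before any modelling. Below is a minimal sketch of that aggregation step in Python; the log field names ('query', 'url', 'position', 'clicked') are assumptions for illustration, not the format used in the talk:

         from collections import defaultdict

         def aggregate_click_log(entries):
             """entries: iterable of dicts with hypothetical keys
             'query', 'url', 'position', 'clicked'."""
             stats = defaultdict(lambda: {"impressions": 0, "clicks": 0})
             for e in entries:
                 key = (e["query"], e["url"])
                 stats[key]["impressions"] += 1
                 stats[key]["clicks"] += int(e["clicked"])
             return stats

         log = [
             {"query": "gmail", "url": "mail.google.com", "position": 1, "clicked": True},
             {"query": "gmail", "url": "en.wikipedia.org/wiki/Gmail", "position": 2, "clicked": False},
             {"query": "gmail", "url": "mail.google.com", "position": 1, "clicked": True},
         ]
         for (q, u), s in aggregate_click_log(log).items():
             print(q, u, s["clicks"] / s["impressions"])  # naive click-through rate

     In practice the naive click-through rate printed here is heavily biased by result position, which is exactly what the click models discussed later try to correct for.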
  5. Documents/results are presented in order of relevance to the query. Many ranking factors are considered when ranking these results; ranking factors depend on the query, the document, and the query-document pair. Improving ranking based on user preferences (likes/dislikes): personalized search + social search, recency (temporal) ranking.
  6. [Figure from David Green's blog]
  7. [Figure: # of clicks received; CIKM09 Tutorial]
  8. Trust factor: users prefer certain URLs more than others, e.g., wikipedia.com, stackoverflow.com, Yahoo Answers, about.com.
     What is missing in previous models?
     ◦ Modelling the trust factor
     ◦ Clicks on sponsored results
     ◦ Related queries/searches (sidebars)
     ◦ Realistic and flexible assumptions on user behaviour
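     To make the "trust factor" idea concrete, here is one illustrative way it could enter a click model: as a per-domain multiplier on click probability, alongside position bias. The decomposition P(click) = P(examine | position) x attractiveness x trust, and all the numbers below, are assumptions made for this sketch, not the model from the slides:

         # Hypothetical examination probabilities per rank (position bias).
         POSITION_BIAS = {1: 1.00, 2: 0.60, 3: 0.40, 4: 0.25}
         # Hypothetical trust priors for well-known domains.
         DOMAIN_TRUST = {"wikipedia.org": 1.3, "stackoverflow.com": 1.2}

         def click_probability(position, attractiveness, domain):
             examine = POSITION_BIAS.get(position, 0.10)
             trust = DOMAIN_TRUST.get(domain, 1.0)
             return min(1.0, examine * attractiveness * trust)

         # The trusted domain gets a higher click probability at the same
         # position and with the same snippet attractiveness.
         print(click_probability(2, 0.5, "wikipedia.org"))  # 0.39
         print(click_probability(2, 0.5, "example.com"))    # 0.30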
  9. [Image slide; no text]
 10. Query types:
     1. Informational query – "DDR3 memory", "SATA 3 hard drives", "American history"
     2. Navigational query – "gmail", "digg", "CIBC", "CIBC credit cards"
 11. [Flowchart: two variants of the user browsing model. For each result: Examine snippet? → if yes, Snippet attractive? → if yes, click; Enough utility? → if yes, end the session, otherwise continue to the next result.]
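     The flowchart above can be read as a generative process per search session. A minimal simulation of that browsing model, with made-up probabilities for examination, snippet attractiveness, and utility:

         import random

         def simulate_session(results, p_examine=0.8):
             """results: list of (attractiveness, utility) pairs in rank order."""
             clicks = []
             for rank, (attr, util) in enumerate(results, start=1):
                 if random.random() > p_examine:   # Examine snippet?  No -> skip result
                     continue
                 if random.random() > attr:        # Snippet attractive?  No -> move on
                     continue
                 clicks.append(rank)               # the user clicks this result
                 if random.random() < util:        # Enough utility?  Yes -> end session
                     break
             return clicks

         random.seed(7220)
         print(simulate_session([(0.9, 0.7), (0.5, 0.4), (0.3, 0.9)]))

     Relevance estimation then inverts this process: given the observed click sequences, infer the attractiveness and utility parameters that best explain them.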
 12. ◦ Realistic and flexible assumptions on user behaviour (session modelling)
     ◦ Consider trust bias (the trust factor)
     ◦ Order the results for a particular query by the relevance scores predicted by the model
     ◦ Compare this ordering to the editorial ranking
     Is it a good model? Yes, if the orderings agree up to a considerable extent.
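     One concrete way to test whether the orderings "agree up to a considerable extent" is a rank correlation such as Kendall's tau; using tau here is an assumption of this sketch, since the talk does not name a specific agreement statistic:

         from itertools import combinations

         def kendall_tau(rank_a, rank_b):
             """rank_a, rank_b: dicts mapping document id -> rank position (no ties)."""
             concordant = discordant = 0
             for d1, d2 in combinations(list(rank_a), 2):
                 s = (rank_a[d1] - rank_a[d2]) * (rank_b[d1] - rank_b[d2])
                 if s > 0:
                     concordant += 1
                 elif s < 0:
                     discordant += 1
             return (concordant - discordant) / (concordant + discordant)

         model_order = {"doc1": 1, "doc2": 2, "doc3": 3, "doc4": 4}
         editorial   = {"doc1": 1, "doc2": 3, "doc3": 2, "doc4": 4}
         print(kendall_tau(model_order, editorial))  # ~0.67; 1.0 = perfect agreement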
 13. Deploy this model as a feature/factor for predicting relevance in a learning-to-rank algorithm, deriving a retrieval/ranking function. If there are metric gains over the baseline ranking function, the model's insights can be used as a feature in the ranking function. Test the ranking function with different classes of queries for metric gains.
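     A hypothetical sketch of that deployment step: append the click model's predicted relevance as one extra column in a learning-to-rank feature vector, then compare metrics against the baseline. The feature layout and the stand-in click model below are assumptions for illustration:

         def augment_features(base_features, query, url, predict_relevance):
             """Append the click-model score to an existing feature vector."""
             return base_features + [predict_relevance(query, url)]

         # Stand-in for a trained click model's relevance prediction.
         toy_click_model = lambda q, u: 0.8 if "wikipedia" in u else 0.3

         row = augment_features([0.12, 4.0, 0.5], "gmail",
                                "en.wikipedia.org/wiki/Gmail", toy_click_model)
         print(row)  # [0.12, 4.0, 0.5, 0.8]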
 14. Metrics:
     ◦ Discounted Cumulative Gain (DCG)
     ◦ Normalized DCG (NDCG)
     ◦ Precision
     ◦ Recall
     Two types of data:
     1. Search click logs (from real or meta search engines)
     2. The LEarning TO Rank (LETOR) benchmark dataset for information retrieval
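     A small worked example of the DCG/NDCG metrics listed above, using the common formulation DCG@k = sum_i rel_i / log2(i + 1); note that other gain variants (e.g. 2^rel - 1) also appear in the literature:

         import math

         def dcg(relevances, k=None):
             rels = relevances[:k] if k else relevances
             return sum(rel / math.log2(i + 2) for i, rel in enumerate(rels))

         def ndcg(relevances, k=None):
             ideal = dcg(sorted(relevances, reverse=True), k)
             return dcg(relevances, k) / ideal if ideal > 0 else 0.0

         # Editorial relevance grades, in the order the engine ranked the results.
         ranked_rels = [3, 2, 3, 0, 1]
         print(round(dcg(ranked_rels), 3))   # 6.149
         print(round(ndcg(ranked_rels), 3))  # 0.972; 1.0 would be the ideal ordering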
 15. [Figures from Guo et al., 2009 and Chapelle and Zhang, 2009]
 16. References:
     ◦ David Green Blog. http://davidgreen.com/comparative-value-of-google-search-rankings (accessed 20 April 2011)
     ◦ Fan Guo and Chao Liu. Statistical Models for Web Search Click Log Analysis. Tutorial, 2009.
     ◦ Fan Guo, Chao Liu, and Yi-Min Wang. Efficient multiple-click models in web search. In Proceedings of the Second ACM Conference on Web Search and Data Mining (WSDM), Barcelona, Spain, pages 124-131. ACM, 9-11 February 2009.
     ◦ Olivier Chapelle and Ya Zhang. A dynamic bayesian network click model for web search and ranking. In Proceedings of the 18th International Conference on World Wide Web (WWW), Madrid, Spain, pages 1-10. ACM, 20-24 April 2009.
 17. [Image from the Tmcnet.com blog]
