Topic-Sensitive PageRank (Review)

Published in: Technology
  1. Topic-Sensitive PageRank (Review)
  2. Motivation
     HITS (Hubs and Authorities)
       • Kleinberg, SODA '98
       • Determines important hub pages and important authority pages
       • + Query-specific rank score
       • - Expensive at runtime
     PageRank
       • Brin, Page et al., '98
       • Assigns a priori "importance" estimates to pages
       • - Query-independent rank score
       • + Inexpensive at runtime
  3. Motivation
     Topic-Sensitive PageRank
       • Assigns multiple a priori "importance" estimates to pages
       • One PageRank score per basis topic
       • + Query-specific rank score
       • + Makes use of context
       • + Inexpensive at runtime
  4. Architecture
     Diagram labels: Query; Query Processor; Web graph; (Page, topic) -> ranktopic; Classifier; Topic-Sensitive PageRank(); ODP
  5. Content
     Offline Processing
       • Input:
           - Web W
           - Basis topics [c1, ..., c16]: 16 categories (the first level of the ODP)
       • Output:
           - List of rank vectors [r1, ..., r16]
           - rj : page -> importance of the page for topic cj
           - rj = IPR(W, vj)
  6. Content
     Query Processing
       • Goal: calculate a distribution of weights over the 16 topics
       • Use a multinomial Naïve Bayes classifier
       • Training set: pages listed in the ODP
       • Input: the query, or the query and its context
       • Output: probability distribution (weights) over the basis topics
  7. Content
     Composite Link Score
       • Use the distribution w to weight the respective topic-specific ranks, forming the topic-sensitive PageRank score for document d, i.e. the sum over topics j of wj * rj(d)
  8. Experiment
       • Test set of 10 queries
       • 5 users were each shown the top 10 results for the queries, ranked using
           - Standard PageRank
           - Topic-sensitive PageRank
       • A page was counted as relevant if at least 3 of the 5 users judged it to be relevant.
       • Training set: 280,000 of the 3 million URLs in the ODP
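The offline step (rj = IPR(W, vj)) and the composite link score can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the reviewed paper's implementation: the slides do not define vj or IPR, so the sketch assumes vj is a teleport vector that is uniform over the pages listed under topic cj and that IPR is PageRank by power iteration with that teleport vector. The function names, the damping value 0.85, the toy graph, and the topic weights are all hypothetical.

```python
def biased_pagerank(links, bias_pages, damping=0.85, iterations=50):
    """PageRank by power iteration, teleporting only to bias_pages.

    links maps every page to the list of pages it links to; every link
    target must also be a key of links. bias_pages is a non-empty set of
    pages assumed to be listed under one topic (the assumed role of vj).
    """
    pages = list(links)
    v = {p: (1.0 / len(bias_pages) if p in bias_pages else 0.0) for p in pages}
    rank = dict(v)
    for _ in range(iterations):
        # Teleport share goes only to the topic's pages.
        new_rank = {p: (1.0 - damping) * v[p] for p in pages}
        for p in pages:
            targets = links[p]
            if targets:
                share = damping * rank[p] / len(targets)
                for q in targets:
                    new_rank[q] += share
            else:
                # Dangling page: hand its rank back according to v.
                for q in pages:
                    new_rank[q] += damping * rank[p] * v[q]
        rank = new_rank
    return rank


def composite_score(topic_weights, topic_ranks, page):
    """Weighted sum of a page's topic-specific ranks."""
    return sum(w * topic_ranks[t][page] for t, w in topic_weights.items())


# Toy web graph and two hypothetical basis topics.
links = {"a": ["b", "c"], "b": ["c"], "c": ["a"], "d": ["c"]}
topic_pages = {"Sports": {"a"}, "Arts": {"d"}}

# Offline: one rank vector per topic.
topic_ranks = {t: biased_pagerank(links, ps) for t, ps in topic_pages.items()}

# Query time: weights would come from the classifier; here they are made up.
topic_weights = {"Sports": 0.8, "Arts": 0.2}
scores = {p: composite_score(topic_weights, topic_ranks, p) for p in links}
```

Only the cheap weighted sum runs at query time; the expensive power iteration is done once per topic offline, which is the efficiency argument the slides make.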
  9. Experiment
     The average precision for rankings induced by topic-sensitive PageRank scores is substantially higher than that of the unbiased PageRank.
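For concreteness, the evaluation can be sketched as follows. The slides do not say exactly how "average precision" was computed, so this sketch assumes it means precision over the top 10 results, averaged over queries, with relevance decided by the 3-of-5 user rule from the experiment setup. The function names and the vote layout (one list of 0/1 votes per result) are hypothetical.

```python
def precision_at_k(votes_per_result, k=10, min_votes=3):
    """Fraction of the top-k results judged relevant by enough users.

    votes_per_result is a ranked list; each entry is a list of 0/1 votes,
    one vote per user.
    """
    top = votes_per_result[:k]
    if not top:
        return 0.0
    relevant = sum(1 for votes in top if sum(votes) >= min_votes)
    return relevant / len(top)


def mean_precision(votes_per_query, k=10, min_votes=3):
    """Average of precision_at_k over all queries (assumed reading)."""
    values = [precision_at_k(r, k, min_votes) for r in votes_per_query.values()]
    return sum(values) / len(values) if values else 0.0
```

Running both functions once per ranking method (standard and topic-sensitive) on the same queries would give the two numbers being compared.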
  10. Novelty
       • Flexibility: uniformly treats a variety of sources of context and personalization
       • Transparency: topic weights are easily interpreted by the user
       • Privacy: topic weights reveal less unintentionally
       • Efficiency: low query-time cost, with a small additional preprocessing cost
  11. Discussion
     Why did the author choose ODP?
       • Rocchio
       • K-Nearest Neighbor
       • Decision Tree
       • Naïve Bayes
       • Support Vector Machine
     We do not really need the names of the topics.
  12. Discussion
     Why are the weights calculated by counting the number of terms?
       • Topic-sensitive PageRank considers how many times a term occurs in each topic
       • Is there another way to calculate the weights?
           - For each topic, count the number of documents that contain query X
           - If many of the retrieved documents are in topic A, then X is strongly related to topic A
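The two weighting schemes in this discussion can be compared with a small sketch. It is an assumption-laden illustration rather than the paper's classifier: the first function is a multinomial Naïve Bayes over term counts with a uniform topic prior and Laplace smoothing (both assumed), and the second is the document-counting alternative raised above. The function names and the token-list document format are hypothetical.

```python
import math
from collections import Counter


def term_count_weights(query_terms, topic_docs, alpha=1.0):
    """Topic weights from term counts (multinomial Naive Bayes).

    topic_docs maps a topic to a list of documents, each a list of tokens.
    Assumes a uniform prior over topics and Laplace smoothing with alpha.
    """
    vocab = {t for docs in topic_docs.values() for d in docs for t in d}
    log_scores = {}
    for topic, docs in topic_docs.items():
        counts = Counter(t for d in docs for t in d)
        total = sum(counts.values())
        log_scores[topic] = sum(
            math.log((counts[t] + alpha) / (total + alpha * len(vocab)))
            for t in query_terms
        )
    # Normalize in a numerically stable way so the weights sum to 1.
    top = max(log_scores.values())
    unnorm = {t: math.exp(s - top) for t, s in log_scores.items()}
    z = sum(unnorm.values())
    return {t: u / z for t, u in unnorm.items()}


def doc_count_weights(query_terms, topic_docs):
    """Topic weights from the share of matching documents per topic."""
    hits = {
        topic: sum(1 for d in docs if all(t in d for t in query_terms))
        for topic, docs in topic_docs.items()
    }
    total = sum(hits.values())
    if total == 0:
        # No document matches anywhere: fall back to uniform weights.
        return {topic: 1.0 / len(hits) for topic in hits}
    return {topic: h / total for topic, h in hits.items()}
```

One visible difference: with smoothing, the term-count weights are never exactly zero, while document counting assigns weight 0 to any topic with no matching document.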
  13. Discussion
     What if there are relevant documents in topics that have a weight of 0?
       • If the query is not related to Shopping, I cannot retrieve any document in Shopping.
       • Calculate how similar the topics are to each other and use that to cover the zero-weight topics.
