- 1. Learning to rank fulltext results from clicks Tomáš Kramár @tkramar @synopsitv
- 2. Let's build a fulltext search engine. Query → Find matches → Rank results
- 3. Let's build a fulltext search engine. Query → Find matches → Rank results. Ways to find matches: ElasticSearch, LIKE %%, ...
- 4. Let's build a fulltext search engine. Query → Find matches → Rank results. Ways to rank results: by number of hits, by PageRank, by date, ...
- 5. How do you choose relevant results?
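- 6. Two results compared by their features:

  | Feature | Document 1 | Document 2 |
  | --- | --- | --- |
  | Number of keywords in title | 2 | 2 |
  | Number of keywords in text | 2 | 0 |
  | Domain | careerjet.sk | vienna-rb.at |
  | Category | Job search | Programming |
  | Language | Slovak | English |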
- 7. How much I care about each document feature (the higher the value, the more I care):

  | Document feature | How much I care about it |
  | --- | --- |
  | # keywords in title | 2.1 |
  | # keywords in text | 1 |
  | Domain is careerjet.sk | -2 |
  | Domain is vienna-rb.at | 3.5 |
  | Category is Job Search | -1 |
  | Category is Programming | 4.2 |
  | Language is Slovak | 0.9 |
  | Language is English | 1.5 |
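- 8. Each document is a feature vector d, my preferences are a vector u, and rank = d · u:

  | Document feature | How much I care about it (u) | Document 1 (d) | Document 2 (d) |
  | --- | --- | --- | --- |
  | # keywords in title | 2.1 | 2 | 2 |
  | # keywords in text | 1 | 2 | 0 |
  | Domain is careerjet.sk | -2 | 1 | 0 |
  | Domain is vienna-rb.at | 3.5 | 0 | 1 |
  | Category is Job Search | -1 | 1 | 0 |
  | Category is Programming | 4.2 | 0 | 1 |
  | Language is Slovak | 0.9 | 1 | 0 |
  | Language is English | 1.5 | 0 | 1 |
  | rank = d · u | | 4.1 | 13.4 |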
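
The rank on this slide is a plain dot product of a document's feature vector with the preference vector. A minimal sketch, using the numbers from the slide's table (the variable and function names are made up for illustration):

```python
# Preference weights u and feature vectors d, copied from the slide's table.
# Feature order: keywords in title, keywords in text, domain careerjet.sk,
# domain vienna-rb.at, category Job Search, category Programming,
# language Slovak, language English.
u = [2.1, 1, -2, 3.5, -1, 4.2, 0.9, 1.5]
d1 = [2, 2, 1, 0, 1, 0, 1, 0]  # the job-search result
d2 = [2, 0, 0, 1, 0, 1, 0, 1]  # the programming result


def rank(d, u):
    """Dot product d . u: each feature value times how much I care about it."""
    return sum(d_i * u_i for d_i, u_i in zip(d, u))


rank_d1 = rank(d1, u)
rank_d2 = rank(d2, u)
print(round(rank_d1, 1), round(rank_d2, 1))  # 4.1 13.4
```

The second document scores higher, so it would be listed first.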
- 9. Rate each result on a scale of 1 to 5.
- 10. $\text{rating} = d \cdot u = d_1 u_1 + d_2 u_2 + \dots + d_n u_n$

  $$
  \begin{aligned}
  d_{1,1} u_1 + d_{1,2} u_2 + \dots + d_{1,n} u_n &= 3 \\
  d_{2,1} u_1 + d_{2,2} u_2 + \dots + d_{2,n} u_n &= 5 \\
  d_{3,1} u_1 + d_{3,2} u_2 + \dots + d_{3,n} u_n &= 1 \\
  d_{4,1} u_1 + d_{4,2} u_2 + \dots + d_{4,n} u_n &= 3
  \end{aligned}
  $$

- 11. $\text{rating} = d \cdot u = d_1 u_1 + d_2 u_2 + \dots + d_n u_n$

  $$
  \begin{aligned}
  d_{1,1} u_1 + d_{1,2} u_2 + \dots + d_{1,n} u_n &= 3 \\
  d_{2,1} u_1 + d_{2,2} u_2 + \dots + d_{2,n} u_n &= 5 \\
  d_{3,1} u_1 + d_{3,2} u_2 + \dots + d_{3,n} u_n &= 1 \\
  d_{4,1} u_1 + d_{4,2} u_2 + \dots + d_{4,n} u_n &= 3
  \end{aligned}
  $$

  The $d_{i,j}$ are known, so solve this system of equations and you have $u$. Done.
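
To make "solve this system of equations" concrete, here is a small sketch with three documents and three features. The feature values are invented for illustration; only the ratings 3, 5 and 1 are taken from the slide. The `solve` helper is a hypothetical hand-written Gauss-Jordan elimination, used here only to keep the example free of dependencies.

```python
def solve(a, b):
    """Solve a . x = b for a square matrix a (Gauss-Jordan, partial pivoting)."""
    n = len(b)
    # Augmented matrix [a | b], copied so the inputs are not modified.
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        # Pick the row with the largest entry in this column as the pivot.
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("the system has no unique solution")
        m[col], m[pivot] = m[pivot], m[col]
        # Eliminate this column from every other row.
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [x - factor * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


# Hypothetical feature vectors d (one row per rated document).
documents = [
    [2, 1, 0],
    [1, 0, 2],
    [0, 1, 0],
]
ratings = [3, 5, 1]  # explicit ratings, as on the slide

u = solve(documents, ratings)
print(u)  # [1.0, 1.0, 2.0]
```

This only works when there are exactly as many independent equations as preferences, which real click data will rarely provide.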
- 12. Except... ● You don't know the explicit ratings ● User preferences change over time ● Those equations probably don't have a solution
- 13. Clicked! Assume rating 1. Not clicked. Assume rating 0.
- 14. Except... ● You don't know the explicit ratings ● User preferences change over time ● Those equations probably don't have a solution
- 15. Approximation function $h(d)$: $d \rightarrow \text{rank}$, with $h(d) = d_1 u_1 + \dots + d_n u_n = \text{estimated\_rank}$. If the function is good, it should make minimal errors: $\text{error} = (\text{estimated\_rank} - \text{real\_rank})^2$
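
A minimal sketch of the approximation function and its error. The two-feature documents, the preferences and the click-based ranks (1 for clicked, 0 for not clicked) are all invented for illustration.

```python
def h(d, u):
    """Estimated rank of document d under preferences u."""
    return sum(d_i * u_i for d_i, u_i in zip(d, u))


def square_error(d, u, real_rank):
    """(estimated_rank - real_rank) squared, for one document."""
    return (h(d, u) - real_rank) ** 2


# Hypothetical data: [keywords in title, keywords in text] per document.
u = [0.5, 0.25]
docs = [[2, 2], [2, 0]]
real_ranks = [1, 0]  # first result clicked, second not

errors = [square_error(d, u, r) for d, r in zip(docs, real_ranks)]
mean_square_error = sum(errors) / len(errors)
print(errors, mean_square_error)  # [0.25, 1.0] 0.625
```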
- 16. Gradient descent
  1. Set user preferences (u) to arbitrary values
  2. Calculate the estimated rank h(d) for each document
  3. Calculate the mean square error
  4. Adjust preferences u in a way that minimizes the error
  5. Repeat until the error converges
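
The five steps can be sketched as a short loop. This is an illustrative batch gradient descent under assumptions of my own: the documents and click labels are invented, the starting preferences are all zeros, and `alpha`, `max_iters` and `tol` are hypothetical parameter names and values.

```python
def h(d, u):
    """Estimated rank of document d under preferences u."""
    return sum(d_i * u_i for d_i, u_i in zip(d, u))


def mean_square_error(docs, ranks, u):
    return sum((h(d, u) - r) ** 2 for d, r in zip(docs, ranks)) / len(docs)


def gradient_descent(docs, ranks, alpha=0.1, max_iters=10000, tol=1e-12):
    u = [0.0] * len(docs[0])                       # step 1: arbitrary start
    previous = mean_square_error(docs, ranks, u)
    for _ in range(max_iters):
        gradient = [0.0] * len(u)
        for d, r in zip(docs, ranks):
            error = h(d, u) - r                    # step 2: estimate each rank
            for i, d_i in enumerate(d):
                # Derivative of the mean square error with respect to u[i].
                gradient[i] += 2 * error * d_i / len(docs)
        # Step 4: move against the gradient.
        u = [u_i - alpha * g_i for u_i, g_i in zip(u, gradient)]
        current = mean_square_error(docs, ranks, u)  # step 3
        if abs(previous - current) < tol:          # step 5: converged
            break
        previous = current
    return u


# Hypothetical data: [keywords in title, keywords in text], 1 = clicked.
docs = [[2, 2], [2, 0], [0, 1], [1, 0]]
ranks = [1, 0, 1, 0]

u = gradient_descent(docs, ranks)
print([round(u_i, 3) for u_i in u])
```

On this toy data the loop settles near the least-squares preferences, which weight keywords in the text positively and keywords in the title slightly negatively.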
- 17. [Plot of the cost function: mean square error against u for "# of keywords in title".]
- 18. [Same plot of the cost function.] Calculate the derivative of the cost function at the current point and it will give you the direction to move in.
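- 19. Preference update: $u_i := u_i - \alpha \frac{\partial J}{\partial u_i}$, where $\alpha$ is the learning rate and $\frac{\partial J}{\partial u_i}$ is the partial derivative of the cost function $J$ with respect to $u_i$.
- 20. Preference update: $u_i := u_i - \alpha \frac{\partial J}{\partial u_i}$, where $\alpha$ is the learning rate and $\frac{\partial J}{\partial u_i}$ is the partial derivative of the cost function $J$ with respect to $u_i$. The learning rate sets how fast you move. Too low: slow progress. Too high: you will overshoot.
- 21. Preference update: $u_i := u_i - \alpha \frac{\partial J}{\partial u_i}$, where $\alpha$ is the learning rate and $\frac{\partial J}{\partial u_i}$ is the partial derivative of the cost function $J$ with respect to $u_i$. Nothing scary. You can find these derivatives online for standard cost functions. For the square error $(h(d) - \text{rank}(d))^2$ it is $2\,(h(d) - \text{rank}(d))\,d_i$.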
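
The effect of the learning rate is easy to see on a one-dimensional toy problem. The sketch below assumes a single document with one feature value `d = 2` and a real rank of 1, so the best preference is `u = 0.5`... in fact `u = 0.125` would be wrong here, so note the arithmetic: `h(d) = 2u` equals 1 at `u = 0.5`. All numbers and names are invented for illustration.

```python
def update(u, d, real_rank, alpha):
    """One preference update for a single-feature document."""
    estimated_rank = d * u
    gradient = 2 * (estimated_rank - real_rank) * d  # derivative of square error
    return u - alpha * gradient


def run(alpha, steps=20, d=2.0, real_rank=1.0):
    u = 0.0
    for _ in range(steps):
        u = update(u, d, real_rank, alpha)
    return u


best_u = 0.5  # because 2 * 0.5 == 1

slow = run(alpha=0.01)  # too low: still far from 0.5 after 20 steps
good = run(alpha=0.1)   # reasonable: very close to 0.5
wild = run(alpha=0.3)   # too high: every step overshoots further

print(slow, good, wild)
```

With this cost the distance to the optimum is multiplied by `1 - 8 * alpha` on every step, so any `alpha` above 0.25 makes the updates diverge.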
- 22. Gradient descent
  1. Set user preferences (u) to arbitrary values
  2. Calculate the estimated rank h(d) for each document
  3. Calculate the square error
  4. Adjust preferences u in a way that minimizes the error
  5. Repeat until the error converges
- 23. Clicked! Assume rating 1. Clicked! Assume rating 1. Or? Doesn't this mean result #1 is not relevant?
- 24. Clicked! Assume nothing. Clicked! Assume it is better than #2 and #3.
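
One way to turn this reading of clicks into data is to emit a preference pair for every clicked result and each unclicked result shown above it. The sketch below assumes the two clicks were on results #1 and #4, and the function name is hypothetical.

```python
def preference_pairs(ranking, clicked):
    """Return (preferred, other) pairs: a clicked result beats every
    unclicked result that was displayed above it."""
    pairs = []
    for position, doc in enumerate(ranking):
        if doc not in clicked:
            continue
        for above in ranking[:position]:
            if above not in clicked:
                pairs.append((doc, above))
    return pairs


ranking = ["d1", "d2", "d3", "d4"]  # results in displayed order
clicked = {"d1", "d4"}              # assumed clicks on #1 and #4

pairs = preference_pairs(ranking, clicked)
print(pairs)  # [('d4', 'd2'), ('d4', 'd3')]
```

The click on the top result produces no pair, because nothing was skipped to reach it.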
- 25. What's changed? We no longer have ratings, just document comparisons: d4 > d3, d4 > d2. Cost function: something that considers ordering, e.g., Kendall's τ (based on the number of concordant and discordant pairs). h is now a function of 2 parameters: h(d1, d2). But you can just compute d2 - d1 and learn on that.
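
A rough sketch of learning on differences: each comparison becomes one difference vector, and gradient descent pushes its score toward a positive target. The feature vectors, the target of 1, the learning rate and the iteration count are assumptions for illustration, and `pair_agreement` is a simplified Kendall-style score computed over the observed pairs only, not the full statistic.

```python
def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def learn_from_pairs(pairs, alpha=0.05, iterations=500):
    """pairs holds (preferred, other) feature vectors. Fits u so that
    u . (preferred - other) is close to 1 for every pair."""
    diffs = [[p - o for p, o in zip(preferred, other)]
             for preferred, other in pairs]
    u = [0.0] * len(diffs[0])
    for _ in range(iterations):
        gradient = [0.0] * len(u)
        for diff in diffs:
            error = dot(diff, u) - 1.0  # assumed target difference of 1
            for i, x in enumerate(diff):
                gradient[i] += 2 * error * x / len(diffs)
        u = [u_i - alpha * g_i for u_i, g_i in zip(u, gradient)]
    return u


def pair_agreement(pairs, u):
    """(concordant - discordant) / number of observed pairs."""
    concordant = sum(1 for p, o in pairs if dot(p, u) > dot(o, u))
    discordant = sum(1 for p, o in pairs if dot(p, u) < dot(o, u))
    return (concordant - discordant) / len(pairs)


# Hypothetical feature vectors for the documents in the comparisons.
d2 = [2, 0, 1]
d3 = [1, 1, 0]
d4 = [0, 2, 1]
pairs = [(d4, d3), (d4, d2)]  # d4 > d3, d4 > d2

u = learn_from_pairs(pairs)
agreement = pair_agreement(pairs, u)
print(agreement)  # 1.0
```

An agreement of 1.0 means the learned preferences order every observed pair the way the clicks suggested.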