Set Retrieval 2.0

This presentation frames information seeking as a dialogue and walks through concrete examples that illustrate the principles of human-computer information retrieval (HCIR). The foundation is an interactive set retrieval approach that responds to queries with an overview of the user's current context and an organized set of options for incremental exploration. Contextual summaries of document sets optimize the system's communication with the user, while query refinement options optimize the user's communication with the system.

By enabling bidirectional communication between the user and the system, we can address the inherent limitations of best-match approaches.

    1. Set Retrieval 2.0
      Daniel Tunkelang, Chief Scientist, Endeca
    2. howdy!
      • 1988 – 1992
      • 1993 – 1998
      • 1999 –
    3. overview
      • what’s right with search today?
      • what’s wrong with search today?
      • how do we fix it?
    4. let’s quickly review some history…
    5. 1947: Hans Peter Luhn
    6. 1968: Gerard Salton
    7. 1972: Karen Spärck Jones
    8. 1980s: lots of progress
    9. 1990s – 2000s: WWW
    10. today
    11. so, do we all feel lucky?
    12. recession? what recession?
    13. ask the users…
    14. …though they do have complaints
      • 78% wish search engines could read their minds
      • what frustrates users most?
        • 25%: deluge of results
        • 24%: too many paid listings
        • 19%: inability to understand their keywords
        • 19%: disorganized / random results
      • (source: The State of Search, Autobytel & Kelton Research, Oct ’07)
    15. web search vs. enterprise search
      • “Search on the internet is solved. I always find what I need. But why not in the enterprise? Seems like a solution waiting to happen.”
        – a Fortune 500 CTO
    16. enterprise users really have complaints
      • Why is Joe the Knowledge Worker so upset?
        • 49%: finding the information needed to do their job is difficult and time-consuming
        • 50%: findability within their organization is worse than on their own consumer-facing site
      • (source: Market IQ Report on Findability, AIIM, June ’08)
    17. selection bias?
    18. the library and information science critique
      • models
        • relevance is subjective
      • evaluation
        • neglects interactivity
      • tools
        • no support for exploration
    19. the rebuttal: “Tell us what to do, and we will do it.”
    20. besides, search is 90% solved
    21. we need to call a truce
      • real, effective systems
      • that support interaction
      • cost-effective to evaluate
    22. let’s go back to the 80s for a moment
    23. then vs. now
      • known-item search was an open problem
        • now it’s a commodity
      • library and information science ideas of the 80s
        • ahead of their time
      • now we can find known items
        • let’s tackle more ambitious information needs
    24. requirements
    25. transparency
    26. control
    27. guidance
    28. set retrieval
      • precision = fraction of retrieved documents that are relevant
      • recall = fraction of relevant documents that are retrieved
      [Venn diagram: retrieved documents vs. relevant documents]
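      A minimal sketch of these two set-based measures in Python, using hypothetical document IDs (the sets and values below are illustrative only, not from the deck):

        # Precision and recall computed over document-ID sets (hypothetical IDs).
        retrieved = {"d1", "d2", "d3", "d4"}
        relevant = {"d2", "d3", "d5"}

        hits = retrieved & relevant              # relevant documents that were retrieved
        precision = len(hits) / len(retrieved)   # 2 / 4 = 0.50
        recall = len(hits) / len(relevant)       # 2 / 3 ≈ 0.67

        print(f"precision={precision:.2f}, recall={recall:.2f}")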
    29. the classic trade-off
      [plot: precision vs. recall]
    30. set retrieval: 2 out of 3
    31. set retrieval 2.0 = set retrieval + guidance
      [screenshot: a “Did you mean: guidance” suggestion plus related searches such as “Guidance Counselor Salary”, “Career Guidance”, and “Child Guidance”]
    32. guidance vs. mind reading
      • system can’t read your mind
      • spouse / best friend can’t read your mind
      • sometimes you can’t read your own mind
    33. so where does guidance come from?
    34. it’s people!
    35. human-computer information retrieval
      • don’t just guess the user’s intent
        • optimize communication
      • de-emphasize the top ten documents
        • response is a set of documents
      • think beyond single queries
        • support refinement and exploration
    36. hcir cheats the trade-off
      [plot: precision vs. recall]
    37. but how do we get there?
    38. set retrieval 2.0
      • set retrieval that responds to queries with
        • an overview of the user's current context
        • an organized set of options for exploration
      • contextual summaries of document sets
        • optimize the system’s communication with the user
      • query refinement options
        • optimize the user’s communication with the system
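      A minimal sketch of what such a response could carry, assuming a tiny in-memory corpus and a hypothetical respond() helper (none of the field or function names below come from the deck):

        from collections import Counter

        # Hypothetical corpus: each document is a dict of facet -> value.
        DOCS = [
            {"title": "Compact microwave", "category": "Appliances", "brand": "Acme"},
            {"title": "Ceiling fan, 52in", "category": "Lighting & Fans", "brand": "Breeze"},
            {"title": "Countertop microwave", "category": "Appliances", "brand": "Breeze"},
        ]

        def respond(query):
            """Answer a query with a contextual summary plus organized refinement options."""
            matches = [d for d in DOCS if query.lower() in d["title"].lower()]
            # Contextual summary: the system's side of the dialogue.
            summary = {"query": query, "result_count": len(matches)}
            # Refinement options: the user's side, one facet value at a time.
            refinements = {
                facet: Counter(d[facet] for d in matches)
                for facet in ("category", "brand")
            }
            return {"summary": summary, "results": matches, "refinements": refinements}

        print(respond("microwave")["refinements"])
        # {'category': Counter({'Appliances': 2}), 'brand': Counter({'Acme': 1, 'Breeze': 1})}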
    39. faceted search guides refinement
    40. showing the right facets: microwaves
    41. showing the right facets: ceiling fans
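      One plausible reading of "showing the right facets" is to display only facets along which the current result set actually varies; the useful_facets() helper and its evenness score below are assumptions for illustration, not Endeca's actual algorithm:

        from collections import Counter

        def useful_facets(results, candidate_facets, max_facets=3):
            """Rank candidate facets by how well they split the current result set."""
            scored = []
            for facet in candidate_facets:
                values = Counter(d[facet] for d in results if facet in d)
                if len(values) < 2:
                    continue  # a facet with a single value cannot guide refinement
                # Prefer facets whose values spread the results more evenly.
                evenness = len(values) / max(values.values())
                scored.append((evenness, facet))
            return [facet for _, facet in sorted(scored, reverse=True)[:max_facets]]

        results = [
            {"category": "Appliances", "brand": "Acme", "wattage": "900W"},
            {"category": "Appliances", "brand": "Breeze", "wattage": "1100W"},
            {"category": "Appliances", "brand": "Acme", "wattage": "700W"},
        ]
        print(useful_facets(results, ["category", "brand", "wattage"]))
        # ['wattage', 'brand'] -- "category" is hidden because every result shares one value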
    42. query-driven clarification before refinement
      • Matching Categories include:
        • Appliances > Small Appliances > Irons & Steamers
        • Appliances > Small Appliances > Microwaves & Steamers
        • Bath > Sauna & Spas > Steamers
        • Kitchen > Bakeware & Cookware > Cookware > Open Stock Pots > Double Boilers & Steamers
        • Kitchen > Small Appliances > Steamers
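      A minimal sketch of this kind of clarification, assuming the category paths are available as plain strings (matching_categories() is a hypothetical helper, not a real catalog API; the paths are the ones on the slide):

        CATEGORY_PATHS = [
            "Appliances > Small Appliances > Irons & Steamers",
            "Appliances > Small Appliances > Microwaves & Steamers",
            "Bath > Sauna & Spas > Steamers",
            "Kitchen > Bakeware & Cookware > Cookware > Open Stock Pots > Double Boilers & Steamers",
            "Kitchen > Small Appliances > Steamers",
        ]

        def matching_categories(query):
            """Offer every category path that mentions the query term."""
            return [path for path in CATEGORY_PATHS if query.lower() in path.lower()]

        print(matching_categories("steamer"))  # all five paths above match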
    43. results-driven clarification before refinement
      [screenshot: results for the search “storage”]
    44. taxonomies are so 1990s
    45. dynamic topic facet
      • Subject
        • Electronic data processing (1002)
        • Distributed processing (937)
        • Parallel processing (619)
        • Computer networks (562)
        • Fault-tolerant computing (365)
        • Show more…
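      A hedged sketch of rendering such a facet, assuming the subject counts for the current result set have already been computed (the last value is hypothetical, added only to exercise the "Show more" link):

        from collections import Counter

        subject_counts = Counter({
            "Electronic data processing": 1002,
            "Distributed processing": 937,
            "Parallel processing": 619,
            "Computer networks": 562,
            "Fault-tolerant computing": 365,
            "Distributed databases": 201,  # hypothetical extra value
        })

        def render_facet(counts, top_n=5):
            """Show the top facet values with counts, plus a 'Show more' link if truncated."""
            lines = [f"{value} ({count})" for value, count in counts.most_common(top_n)]
            if len(counts) > top_n:
                lines.append("Show more…")
            return lines

        print("\n".join(render_facet(subject_counts)))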
    46. facets populated using entity extraction
      [screenshot: entity-extraction facets for “apple production”]
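      The deck does not name an extraction tool; as one possible illustration, an off-the-shelf named-entity model such as spaCy's could populate entity facets from raw text (the example sentence is made up):

        import spacy
        from collections import Counter

        # Assumes the small English model has been installed:
        #   python -m spacy download en_core_web_sm
        nlp = spacy.load("en_core_web_sm")

        def entity_facets(texts):
            """Count extracted entities per type (ORG, GPE, PERSON, ...) as facet values."""
            facets = {}
            for doc in nlp.pipe(texts):
                for ent in doc.ents:
                    facets.setdefault(ent.label_, Counter())[ent.text] += 1
            return facets

        print(entity_facets(["Apple increased iPhone production in China last quarter."]))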
    47. bootstrap on folksonomies
    48. or learn from users
    49. hcir using set retrieval 2.0
      • emphasize set summaries over ranked lists
      • establish a dialog between the user and the data
      • enable exploration and discovery
    50. think outside the (search) box
      • best-first search works for many use cases
      • but not for some of the most valuable ones
      • set retrieval 2.0 = set retrieval + guidance
      • human-computer information retrieval
    51. thank you
      • communication 1.0
        • email: [email_address]
      • communication 2.0
        • blog: http://thenoisychannel.com
        • twitter: http://twitter.com/dtunkelang
