Is Search Broken?!


Is Search Broken?! - Presentation Transcript

  1. Is Search Broken?!
    • Daniel Tunkelang, Chief Scientist, Endeca
  2. howdy!
    • 1992: Bachelor’s + Master’s from MIT in CS + Math
    • 1998: PhD from CMU in CS (ACO program)
    • 1999: Co-founded Endeca!
    • 2008: ???
  3. overview
    • Who is Endeca?
    • Is search broken?
    • If it is, what can we do about it?
  4. who / what is endeca?
    • Software to help people explore, analyze, and understand complex information, guiding them to unexpected insights and better decisions.
    • 500+ customers
    • $108M revenue in 2007.
  5. some of our customers
    • Is search broken?
    • Search has hit a wall.
  6. search hits a wall in ecommerce
  7. search hits a wall in knowledge management
    • Current Search: it outsourcing
  8. search even hits a wall on the web
    • Results 1-10 out of about 344,000,000 for ir
    • But is search broken?
  9. the accountants don’t think so
  10. most users don’t think so 75
  11. or do they?
    • 78% wish search engines could read their minds.
    • What frustrates users most?
      • 25%: deluge of results
      • 24%: too many paid listings
      • 19%: inability to understand their keywords
      • 19%: disorganized / random results
    • Source: The State of Search, Autobytel & Kelton Research, Oct ’07
  12. web search vs. enterprise search
    • “Search on the internet is solved. I always find what I need. But why not in the enterprise? Seems like a solution waiting to happen.”
    • - a Fortune 500 CTO
    • Can theory help?
  13. precision and recall
    • precision = fraction of retrieved documents that are relevant
    • recall = fraction of relevant documents that are retrieved
    • Diagram labels: retrieved documents, relevant documents
  14. why improve precision?
    • the truth, nothing but the truth
  15. why improve recall?
    • the whole truth
  16. what we want…
    • the truth, the whole truth, nothing but the truth
  17. but there is a trade-off…
    • Chart axes: recall, precision
  18. which should we favor?
    • Precision…to avoid annoying users with irrelevant results?
    • Recall…to make sure we don’t throw away results the user wants / needs?
    • Enough stalling…what’s the answer?!
  19. depends on what you want vs.
  20. you get what you pay for
    • There are easy use cases…
      • 30% of queries are navigational.
      • 30% of queries lead to Wikipedia pages.
      • Users won’t pay, but advertisers will!
    • …and hard use cases.
      • Queries where recall matters.
      • Exploratory search.
      • Enterprises will pay for insight.
    • Great, bring on the insight!
  21. technology alone can’t provide insight
    • The system can’t read your mind.
    • Your spouse / best friend can’t read your mind.
    • Sometimes you can’t read your own mind.
    • So should we just give up?
  22. technology is a catalyst
    • Computers are good at analysis.
    • People are good at using what they know.
    • How do we get the best of both worlds?
  23. with apologies to luis von ahn
  24. human-computer information retrieval
    • Instead of guessing the user’s intent, optimize communication.
    • De-emphasize the top ten documents; response is a set of documents.
    • Think beyond single queries; support refinement and exploration.
  25. hcir cheats the trade-off
    • Chart axes: recall, precision
    • But how do we implement HCIR?
  26. endeca's approach: guided summarization
    • Set retrieval that responds to queries with
      • an overview of the user's current context.
      • an organized set of options for incremental exploration.
    • Contextual summaries of document sets optimize system’s communication with user.
    • Query refinement options optimize user’s communication with system.
  27. guided summarization for ecommerce
    • Matching Categories include:
      • Appliances > Small Appliances > Irons & Steamers
      • Appliances > Small Appliances > Microwaves & Steamers
      • Bath > Sauna & Spas > Steamers
      • Kitchen > Bakeware & Cookware > Cookware > Open Stock Pots > Double Boilers & Steamers
      • Kitchen > Small Appliances > Steamers
  28. guided summarization for KM
    • Guided summarization starts with faceted search.
  29. facets 101
    • But faceted search isn’t enough…
  30. showing the right facets: microwaves vs.
  31. showing the right facets: ceiling fans
  32. traditional topic taxonomy
  33. dynamic topic facet
    • Subject
      • Electronic data processing (1002)
      • Distributed processing (937)
      • Parallel processing (619)
      • Computer networks (562)
      • Fault-tolerant computing (365)
      • Show more…
    • Subject
      • Artificial intelligence (227)
      • Automatic theorem proving (9)
      • Client/server computing (185)
      • Computer algorithms (110)
      • Computer architecture (162)
      • Computer networks (552)
      • Computer programs (139)
      • Computer security (151)
      • Computer software (253)
      • Computers (124)
      • Database management (277)
      • Distributed processing (937)
      • Electronic data processing (1002)
      • Electronic digital computers (148)
      • Fault-tolerant computing (365)
      • High performance computing (244)
      • History (11)
      • Information technology (145)
      • Java (77)
      • Law and legislation (70)
      • Logic, Symbolic and mathematical (16)
      • Mathematics (70)
      • Mobile communication systems (54)
      • Operating systems (87)
      • Parallel processing (619)
      • Research (83)
      • Software engineering (197)
      • Supercomputers (139)
      • Web databases (54)
      • Wireless communication systems (97)
  34. facets populated using entity extraction
    • Search: apple production
  35. cutting through facets to show the big picture
    • Search: storage
  36. summarization: more than search and browse
  37. guided summarization – a summary
    • Guided summarization enables a dialog between the user and the data, enabling exploration and discovery.
    • The Moral
  38. think outside the box
    • Search works for many use cases.
    • But not for some of the most valuable ones.
    • Focus on human-computer information retrieval.
    • One More Thing
  39. maybe we should treat search as a game
  40. thank you
    • Questions?
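
As a supplementary sketch of the ideas on slides 13, 25, and 26: the Python below computes precision and recall for a broad query, shows the facet counts a system could offer as refinement options, and then measures the refined result set. It is only an illustration, not Endeca's implementation. The toy catalog, the single "category" facet, the function names, and the choice of which documents count as relevant are all assumptions made for this example.

```python
from collections import Counter

# Hypothetical toy catalog: free text plus one facet ("category") per document.
DOCS = {
    1: {"text": "garment steamer for clothes", "category": "Irons & Steamers"},
    2: {"text": "steam iron with steamer mode", "category": "Irons & Steamers"},
    3: {"text": "vegetable steamer basket", "category": "Cookware"},
    4: {"text": "double boiler and steamer pot", "category": "Cookware"},
    5: {"text": "facial steamer for spa use", "category": "Sauna & Spas"},
    6: {"text": "nonstick frying pan", "category": "Cookware"},
}


def search(term, category=None):
    """Return the set of document ids whose text contains `term` as a word,
    optionally restricted to one facet value."""
    return {
        doc_id
        for doc_id, doc in DOCS.items()
        if term in doc["text"].split()
        and (category is None or doc["category"] == category)
    }


def precision(retrieved, relevant):
    """Fraction of retrieved documents that are relevant."""
    return len(retrieved & relevant) / len(retrieved) if retrieved else 0.0


def recall(retrieved, relevant):
    """Fraction of relevant documents that are retrieved."""
    return len(retrieved & relevant) / len(relevant) if relevant else 0.0


def facet_counts(doc_ids):
    """Count how many of the given documents fall under each facet value."""
    return Counter(DOCS[doc_id]["category"] for doc_id in doc_ids)


# Assumption: the user wants something to steam food with, so docs 3 and 4 are relevant.
relevant = {3, 4}

# Step 1: a broad query keeps recall high but precision low.
broad = search("steamer")
print(precision(broad, relevant), recall(broad, relevant))  # 0.4 1.0

# Step 2: the response is a set plus an overview of refinement options.
print(facet_counts(broad))

# Step 3: the user picks a facet value; precision rises and recall is preserved.
refined = search("steamer", category="Cookware")
print(precision(refined, relevant), recall(refined, relevant))  # 1.0 1.0
```

In this toy case the refinement step, chosen by the user rather than guessed by the system, recovers precision without giving up recall. Real collections will rarely separate this cleanly.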

Daniel Tunkelang, 2 years ago


Keynote address by Daniel Tunkelang, Chief Scientis…


© All Rights Reserved
