Better Search Through Query Understanding
Presented as a Data Talk at Intuit on April 22, 2014

Search is a fundamental problem of our time — we use search engines daily to satisfy a variety of personal and professional information needs. But search engine development still feels stuck in an information retrieval paradigm that focuses on result ranking. In this talk, I’ll advocate an emphasis on query understanding. I’ll talk about how we implement query understanding at LinkedIn, and I’ll present examples from the broader web. Hopefully you’ll come out with a different perspective on search and share my appreciation for how we can improve search through query understanding.

About the Speaker

Daniel Tunkelang leads LinkedIn's efforts around query understanding. Before that, he led LinkedIn's product data science team. He previously led a local search quality team at Google and was a founding employee of Endeca (acquired by Oracle in 2011). He has written a textbook on faceted search, and is a recognized advocate of human-computer interaction and information retrieval (HCIR). He has a PhD in Computer Science from CMU, as well as BS and MS degrees from MIT.

Published in: Technology, Business
Transcript

  • 1. Recruiting Solutions. Daniel Tunkelang, Head, Query Understanding. better search through query understanding
  • 2. overview: query understanding: what is it?; how we do query understanding at LinkedIn; some other thoughts from search in the wild. what I’m not going to cover:
  • 3. bird’s-eye view of how a search engine works. user: information need → query → select from results. system: rank using IR model (tf-idf, PageRank)
  • 4. user: information need → query → select from results. system: rank using IR model (tf-idf, PageRank). query understanding
  • 5. search is a communication problem
  • 6. tag: skill OR title; related skills: search, ranking, … | tag: company; id: 1337; industry: internet | verticals: people, jobs; intent: exploratory
  • 7. query understanding pipeline: raw query → spellcheck → query tagging → vertical intent prediction → query expansion → structured query + annotations
  • 8. query understanding pipeline: raw query → spellcheck → query tagging → vertical intent prediction → query expansion → structured query + annotations
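
The pipeline on slides 7 and 8 can be read as a chain of stages that each take a query and add to it. The following is a minimal sketch of that shape only, not LinkedIn's implementation: the `StructuredQuery` type, the stage functions, and every lookup table and score in them are hypothetical placeholders.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class StructuredQuery:
    raw: str    # the query exactly as typed
    text: str   # the query after rewriting stages
    annotations: Dict[str, object] = field(default_factory=dict)


Stage = Callable[[StructuredQuery], StructuredQuery]


def spellcheck(q: StructuredQuery) -> StructuredQuery:
    fixes = {"manger": "manager"}  # toy correction table (assumption)
    q.text = " ".join(fixes.get(tok, tok) for tok in q.text.split())
    return q


def tag_query(q: StructuredQuery) -> StructuredQuery:
    titles = {"product manager"}  # toy title dictionary (assumption)
    q.annotations["tags"] = ["TITLE"] if q.text in titles else []
    return q


def predict_vertical(q: StructuredQuery) -> StructuredQuery:
    # Toy rule with made-up scores: title queries lean toward people search.
    has_title = "TITLE" in q.annotations.get("tags", [])
    q.annotations["verticals"] = {"people": 0.7, "jobs": 0.3} if has_title else {}
    return q


def expand(q: StructuredQuery) -> StructuredQuery:
    synonyms = {"product manager": ["pm"]}  # toy synonym table (assumption)
    q.annotations["expansions"] = synonyms.get(q.text, [])
    return q


# Stage order follows the slide: spellcheck, tagging, vertical intent, expansion.
PIPELINE: List[Stage] = [spellcheck, tag_query, predict_vertical, expand]


def understand(raw: str) -> StructuredQuery:
    """Turn a raw query into a structured query plus annotations."""
    q = StructuredQuery(raw=raw, text=raw.lower().strip())
    for stage in PIPELINE:
        q = stage(q)
    return q
```

Calling `understand("Product Manger")` keeps the raw string, rewrites the text to `product manager`, and attaches the toy annotations. Keeping the raw query next to the rewritten one lets later stages, or the UI, show the user what was changed.
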
  • 9. spelling correction: fix obvious typos; help users spell names
  • 10. spelling out the details. sources: PEOPLE NAMES, COMPANIES, TITLES, PAST QUERIES. n-grams: marissa => ma ar ri is ss sa. metaphone: mark/marc => MRK. co-occurrence counts: marissa:mayer = 1000. example: [marisa meyer yahoo], with candidates marissa | marisa, meyer | mayer, yahoo
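
Slide 10's n-gram example (`marissa => ma ar ri is ss sa`) suggests indexing known terms by character bigrams so that misspellings can retrieve nearby candidates. Below is a small sketch of that idea. The Jaccard overlap score and the `min_overlap` threshold are assumptions chosen for illustration; the slide does not say how candidates are scored, and the metaphone and co-occurrence signals are not shown here.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Set


def char_bigrams(word: str) -> List[str]:
    """Return overlapping character bigrams, e.g. marissa -> ma ar ri is ss sa."""
    return [word[i:i + 2] for i in range(len(word) - 1)]


def build_index(vocabulary: Iterable[str]) -> Dict[str, Set[str]]:
    """Map each bigram to the vocabulary words that contain it."""
    index: Dict[str, Set[str]] = defaultdict(set)
    for word in vocabulary:
        for gram in char_bigrams(word):
            index[gram].add(word)
    return index


def candidates(term: str, index: Dict[str, Set[str]], min_overlap: float = 0.3) -> List[str]:
    """Rank vocabulary words by bigram overlap (Jaccard) with the typed term."""
    grams = set(char_bigrams(term))
    seen: Set[str] = set()
    for gram in grams:
        seen |= index.get(gram, set())
    scored = []
    for word in seen:
        other = set(char_bigrams(word))
        score = len(grams & other) / len(grams | other)
        if score >= min_overlap:
            scored.append((score, word))
    # Highest overlap first; ties broken alphabetically for stable output.
    return [word for _, word in sorted(scored, key=lambda sw: (-sw[0], sw[1]))]


index = build_index(["marissa", "mayer", "yahoo"])
print(char_bigrams("marissa"))
print(candidates("marisa", index))
print(candidates("meyer", index))
```

In a fuller system the candidates for each token would then be re-ranked with the other signals on the slide, such as how often `marissa` and `mayer` occur together.
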
  • 11. spelling out the details. problem: corpus as well as query logs contain many spelling errors; certain spelling errors are quite frequent, while genuine words (especially names) might be infrequent
  • 12. spelling out the details. problem: corpus & query logs contain spelling errors. solution: use query chains to infer correct spelling. [product manger] → [product manager] → CLICK; [marissa mayer] → CLICK
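
Slide 12's query-chain idea can be sketched as counting how often a query with no click is immediately followed by a similar query that does get a click. The event format, the function names, and the use of `difflib.SequenceMatcher` with a `min_similarity` cutoff are all assumptions made for this sketch; the talk does not describe the log schema or the similarity test.

```python
from collections import Counter
from difflib import SequenceMatcher
from typing import List, Optional, Tuple

Event = Tuple[str, str]  # ("query", text) or ("click", result_id)


def rewrite_counts(sessions: List[List[Event]], min_similarity: float = 0.8) -> Counter:
    """Count (typed, rewritten) pairs where the rewrite was followed by a click."""
    counts: Counter = Counter()
    for events in sessions:
        for i in range(len(events) - 2):
            (kind1, q1), (kind2, q2), (kind3, _) = events[i], events[i + 1], events[i + 2]
            if (kind1, kind2, kind3) != ("query", "query", "click"):
                continue
            # Only treat near-identical strings as spelling rewrites,
            # so that topic changes are not mistaken for corrections.
            if q1 != q2 and SequenceMatcher(None, q1, q2).ratio() >= min_similarity:
                counts[(q1, q2)] += 1
    return counts


def best_correction(counts: Counter, query: str) -> Optional[str]:
    """Return the most frequently clicked rewrite of a query, if any."""
    options = {fixed: n for (typed, fixed), n in counts.items() if typed == query}
    return max(options, key=options.get) if options else None


sessions = [
    [("query", "product manger"), ("query", "product manager"), ("click", "r1")],
    [("query", "marissa mayer"), ("click", "r2")],
    [("query", "product manger"), ("query", "product manager"), ("click", "r9")],
    [("query", "java"), ("query", "python developer"), ("click", "r3")],
]
counts = rewrite_counts(sessions)
print(best_correction(counts, "product manger"))
```

The click is what separates this from plain frequency counting: a frequent misspelling still tends to be followed by a rewrite, while a rare but genuine name is clicked directly, as in the `[marissa mayer]` chain.
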
  • 13. query understanding pipeline: raw query → spellcheck → query tagging → vertical intent prediction → query expansion → structured query + annotations
  • 14. query tagging: identifying entities in the query. tags shown: TITLE, CO, GEO. TITLE-237: software engineer, software developer, programmer, … CO-1441: Google Inc.; Industry: Internet. GEO-7583: Country: US; Lat: 42.3482 N; Long: 75.1890 W. (RECOGNIZED TAGS: NAME, TITLE, COMPANY, SCHOOL, GEO, SKILL)
  • 15. query tagging: identifying entities in the query. TITLE, CO, GEO: MORE PRECISE MATCHING WITH DOCUMENTS
  • 16. entity-based filtering: BEFORE
  • 17. entity-based filtering: BEFORE, AFTER
  • 18. entity-based filtering: BEFORE
  • 19. entity-based filtering: BEFORE, AFTER
  • 20. entity-based suggestions
  • 21. entity-based suggestions
  • 22. query tagging: sequential model. TRAINING: EMISSION PROBABILITIES (learned from user profiles); TRANSITION PROBABILITIES (learned from query logs)
  • 23. query tagging: sequential model. INFERENCE: given a query, find the most likely sequence of tags
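
Slides 22 and 23 describe a sequential model with emission and transition probabilities and an inference step that finds the most likely tag sequence. One standard way to do that inference is the Viterbi algorithm over a hidden Markov model, sketched below. Treating the model as an HMM is an assumption, and every probability in the toy tables is invented for illustration; in the talk, emissions are learned from user profiles and transitions from query logs.

```python
import math
from typing import Dict, List

# Toy parameters (assumptions, not learned values).
TAGS = ["TITLE", "CO"]
START = {"TITLE": 0.6, "CO": 0.4}
TRANS = {"TITLE": {"TITLE": 0.5, "CO": 0.5}, "CO": {"TITLE": 0.6, "CO": 0.4}}
EMIT = {
    "TITLE": {"software": 0.4, "engineer": 0.5},
    "CO": {"google": 0.8, "software": 0.05},
}


def viterbi(
    tokens: List[str],
    tags: List[str],
    start: Dict[str, float],
    trans: Dict[str, Dict[str, float]],
    emit: Dict[str, Dict[str, float]],
    floor: float = 1e-6,
) -> List[str]:
    """Return the most likely tag sequence for the tokens."""
    if not tokens:
        return []

    def logp(table: Dict[str, Dict[str, float]], outer: str, inner: str) -> float:
        # Unseen events get a small floor probability instead of zero.
        return math.log(table.get(outer, {}).get(inner, floor))

    # Each column maps tag -> (best log score ending in tag, previous tag).
    column = {
        t: (math.log(start.get(t, floor)) + logp(emit, t, tokens[0]), None)
        for t in tags
    }
    columns = [column]
    for tok in tokens[1:]:
        prev = columns[-1]
        column = {}
        for t in tags:
            score, back = max((prev[p][0] + logp(trans, p, t), p) for p in tags)
            column[t] = (score + logp(emit, t, tok), back)
        columns.append(column)

    # Follow back-pointers from the best final tag.
    tag = max(tags, key=lambda t: columns[-1][t][0])
    path = [tag]
    for column in reversed(columns[1:]):
        tag = column[tag][1]
        path.append(tag)
    return path[::-1]


print(viterbi(["software", "engineer", "google"], TAGS, START, TRANS, EMIT))
```

Working in log space avoids numeric underflow on longer queries, and the floor value stands in for whatever smoothing a production model would use.
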
  • 24. query understanding pipeline: raw query → spellcheck → query tagging → vertical intent prediction → query expansion → structured query + annotations
  • 25. vertical intent prediction: distribution. JOBS, PEOPLE, COMPANIES (probability distribution over verticals)
  • 26. vertical intent prediction: relevance. [company] [employees] [jobs] [name search]
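
Slide 25 describes vertical intent as a probability distribution over verticals. The talk does not say how that distribution is estimated, so the following is only one simple possibility: normalizing per-vertical click counts for a query with additive smoothing. The function name, the `alpha` parameter, and the counts are hypothetical.

```python
from typing import Dict, List


def vertical_distribution(
    clicks: Dict[str, int], verticals: List[str], alpha: float = 1.0
) -> Dict[str, float]:
    """Turn per-vertical click counts into a smoothed probability distribution."""
    # Add-alpha smoothing keeps unseen verticals at a small nonzero probability.
    total = sum(clicks.get(v, 0) for v in verticals) + alpha * len(verticals)
    return {v: (clicks.get(v, 0) + alpha) / total for v in verticals}


# Made-up counts for a single query.
dist = vertical_distribution({"people": 8, "jobs": 1}, ["jobs", "people", "companies"])
print(dist)
```

Keeping a full distribution, instead of picking the single most likely vertical, lets the results page blend verticals when intent is ambiguous.
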
  • 27. query understanding pipeline: raw query → spellcheck → query tagging → vertical intent prediction → query expansion → structured query + annotations
  • 28. query expansion: name synonyms
  • 29. query expansion: job title synonyms
  • 30. query expansion: signals. trained using query chains: [jon] → [jonathan] → CLICK; [programmer] → [developer] → CLICK; [software engineer] → [software developer] → CLICK. symmetric but not transitive! [francis] ⇔ [frank]; [franklin] ⇔ [frank]; [francis] ≠ [franklin]. context based! [software engineer] => [software developer]; [civil engineer] ≠ [civil developer]
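
Slide 30 makes two points about synonyms: the relation is symmetric but not transitive, and it depends on context. A minimal data structure with both properties stores unordered pairs of whole phrases and never computes a transitive closure. The `SynonymTable` class below is a hypothetical sketch, not the structure used at LinkedIn.

```python
from typing import FrozenSet, List, Set


class SynonymTable:
    """Symmetric, non-transitive synonym pairs keyed on whole phrases."""

    def __init__(self) -> None:
        self._pairs: Set[FrozenSet[str]] = set()

    def add(self, a: str, b: str) -> None:
        if a != b:
            # An unordered pair makes the relation symmetric by construction.
            self._pairs.add(frozenset((a, b)))

    def are_synonyms(self, a: str, b: str) -> bool:
        return a == b or frozenset((a, b)) in self._pairs

    def expand(self, phrase: str) -> List[str]:
        # Only direct pairs are returned; no chaining through shared terms.
        return sorted(
            other for pair in self._pairs if phrase in pair for other in pair - {phrase}
        )


table = SynonymTable()
table.add("francis", "frank")
table.add("franklin", "frank")
# Stored at the phrase level, so "engineer" alone never maps to "developer".
table.add("software engineer", "software developer")

print(table.expand("frank"))
print(table.are_synonyms("francis", "franklin"))
print(table.expand("civil engineer"))
```

A union-find or graph-closure structure would be the wrong choice here, because it would merge `francis` and `franklin` through `frank`.
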
  • 31. query understanding pipeline: raw query → spellcheck → query tagging → vertical intent prediction → query expansion → structured query + annotations
  • 32. what else can we learn from search in the wild?
  • 33. don’t guess when it’s better to ask
  • 34. clarify then refine (examples: computers, books)
  • 35. give users transparency, guidance, and control
  • 36. think beyond individual search queries (Gene Golovchinsky, FXPAL)
  • 37. know when you don’t know (Claudia Hauff, Query Difficulty for Digital Libraries [2009])
  • 38. Daniel Tunkelang, dtunkelang@linkedin.com, https://linkedin.com/in/dtunkelang
