Have you ever been asked to fix a bad search experience? Have you ever been asked to predict the outcome of a change to your search algorithm? Fixing search problems can feel like a game of whack-a-mole -- fixing one set of queries breaks others. It's a frustrating game of guesswork and trade-offs. But you're responsible for the overall performance of a search system, and the powers that be want assurances. Wouldn't it be great if you could report, with confidence, the overall quality of search? Search quality metrics such as MAP, MRR, and nDCG exist to help the search engineer, but mapping them to real-world business goals is challenging. And pointing to obscure metrics in the face of contrary perceptions is unsatisfying to everyone. Can we do better?
In this talk, we'll look at some of these evaluation metrics and how to use them. Then, we'll explore the gap between these metrics and the business perspective on search quality. Finally, we'll explore tactics you can use to tell an effective story about search quality that non-experts can understand.
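As a concrete reference point for two of the metrics named above, here is a minimal sketch of MRR and nDCG. It assumes each query's results arrive as a list of graded relevance judgments in ranked order (0 means not relevant), uses the linear-gain form of DCG, and computes the ideal ordering only from the judged results in the list. The function names are illustrative and not taken from any particular library.

```python
import math


def reciprocal_rank(relevances):
    """Return 1/rank of the first relevant result, or 0.0 if none is relevant."""
    for rank, rel in enumerate(relevances, start=1):
        if rel > 0:
            return 1.0 / rank
    return 0.0


def mean_reciprocal_rank(judged_queries):
    """Average the reciprocal rank over a non-empty list of judged result lists."""
    return sum(reciprocal_rank(r) for r in judged_queries) / len(judged_queries)


def dcg(relevances, k):
    """Discounted cumulative gain over the top k results (linear gain)."""
    return sum(
        rel / math.log2(rank + 1)
        for rank, rel in enumerate(relevances[:k], start=1)
    )


def ndcg(relevances, k):
    """DCG divided by the DCG of the best possible ordering of the same judgments."""
    ideal = dcg(sorted(relevances, reverse=True), k)
    return dcg(relevances, k) / ideal if ideal > 0 else 0.0
```

A single number such as `ndcg(judgments, 10)` is easy to compute; the harder part, which the rest of the talk addresses, is explaining what it means for the business.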
7. A METHOD TO THE MADNESS
1. DO YOUR HOMEWORK
2. GATHER PRELIMINARIES
3. INTERVIEWS
4. ANALYSIS
5. ROADMAPPING
6. PRIORITIZE & PLAN
7. EXECUTE ITERATIVELY
a. IRON TRIANGLE
b. FEEDBACK LOOPS
9. Search Quality Anti-Patterns
● Search Quality is Bug Squashing
● Search Quality is Too Hard For You
● Search Quality is Off Limits
● Search Quality is a Feeling
● Search Quality is Relevance
● Search Quality is Mysterious
30. Analysis
● Big Picture
○ Use Cases, Domain, IR Vertical
● Classes of Queries
○ Navigational, Informational, Transactional, Research
○ Length, Category, Intent
● Distribution of Queries
○ Head/Tail Analysis
○ Over Tail, Over Time
○ Relative Performance, Outliers
○ Quantify Scale/Impact
● Connect to Revenue
● Points of Interest
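The head/tail analysis on this slide can be sketched roughly as follows. The sketch assumes the query log is simply a list of raw query strings, and it defines the "head" as the most frequent queries that together cover a chosen share of traffic. Both the function name and the 50% default are illustrative assumptions, not something the slides specify.

```python
from collections import Counter


def head_tail_split(query_log, head_share=0.5):
    """Split distinct queries into head and tail by cumulative traffic share.

    Returns (head, tail, covered), where covered is the fraction of all
    query traffic accounted for by the head queries.
    """
    # Normalize lightly so trivial variants count as the same query.
    counts = Counter(q.strip().lower() for q in query_log)
    total = sum(counts.values())
    if total == 0:
        return [], [], 0.0

    head, covered = [], 0
    for query, n in counts.most_common():
        if covered / total >= head_share:
            break
        head.append(query)
        covered += n

    head_set = set(head)
    tail = [q for q in counts if q not in head_set]
    return head, tail, covered / total


log = ["star wars"] * 5 + ["alien"] * 3 + ["tom hanks", "heat"]
head, tail, share = head_tail_split(log)
```

Running the same split on logs from different time windows gives a rough view of how the distribution shifts over time.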
35. Searching for People
Motivation: Many customers search for movies by actor name. About 10% of a random sample of queries include an actor name. These queries perform poorly relative to their peers, as measured by click-through.
We believe that recognizing named entities (people's names) will bring performance in line with the peer group.
IMPACT: HIGH
RISK: LOW
LOE: MEDIUM
SCOPE: Named entity queries
SCALE: ~10%
MEASURE: Relative CTR
VALUE: $XX.XX
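One way the "Relative CTR" measure on this card might be computed is sketched below. It assumes a hypothetical event format with one dict per search, holding a `query` string and a `clicked` flag. A naive substring lookup against a made-up `KNOWN_ACTORS` set stands in for real named-entity recognition, which the slide does not describe.

```python
# Hypothetical lookup table; a stand-in for a real named-entity recognizer.
KNOWN_ACTORS = {"tom hanks", "meryl streep"}


def mentions_actor(query):
    """Return True if the query contains a known actor name (naive substring match)."""
    q = query.lower()
    return any(name in q for name in KNOWN_ACTORS)


def ctr(events):
    """Fraction of search events that received at least one click."""
    if not events:
        return 0.0
    return sum(1 for e in events if e["clicked"]) / len(events)


def relative_ctr(events, in_segment):
    """CTR of the segment divided by CTR of its peers; None if peers have no clicks."""
    segment = [e for e in events if in_segment(e["query"])]
    peers = [e for e in events if not in_segment(e["query"])]
    peer_ctr = ctr(peers)
    return ctr(segment) / peer_ctr if peer_ctr > 0 else None


events = [
    {"query": "tom hanks movies", "clicked": False},
    {"query": "Tom Hanks", "clicked": True},
    {"query": "space comedy", "clicked": True},
    {"query": "heist thriller", "clicked": True},
]
ratio = relative_ctr(events, mentions_actor)
```

A ratio below 1.0 means the segment underperforms its peers, and the hypothesis on the card predicts the ratio moves toward 1.0 once names are recognized.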
43. RELEVANCE JUDGMENTS
● relevance is approximately query similarity
● relevance is binary
● judgment lists agree with users
● judgment lists are complete and consistent
46. What are we talking about?
1. Does it deliver value?
2. At what cost?
47. Who are we talking about?
1. Marketing
a. define offerings
b. attract and retain customers
2. Management
a. set goals
b. plan and allocate resources
48. HIERARCHY OF BUSINESS OBJECTIVES
[Diagram labels: REVENUE, ENGAGEMENT, CONVERSIONS, RETENTION, MARKET SHARE, CUSTOMER SATISFACTION, CTR, ABANDONMENT, ACQUISITION]
49. Search Behaviors
● behaviors make sense
● behaviors are measurable
○ even without relevance data
● behaviors tell a story
● you can map behaviors to user tasks
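To illustrate that behaviors can be measured without any relevance judgments, here is a minimal sketch that derives two behavioral rates from session logs. It assumes a hypothetical log format in which each session is an ordered list of searches, each with a `clicks` count. The definitions are simplifications chosen for this sketch: a search with zero clicks counts as abandoned, and an abandoned search followed by another search in the same session counts as a reformulation. Real definitions vary by product.

```python
def behavior_rates(sessions):
    """Compute abandonment and reformulation rates over all searches in all sessions."""
    searches = abandoned = reformulated = 0
    for session in sessions:
        for i, search in enumerate(session):
            searches += 1
            if search["clicks"] == 0:
                abandoned += 1
                # A zero-click search followed by another search suggests a retry.
                if i + 1 < len(session):
                    reformulated += 1
    if searches == 0:
        return {"abandonment_rate": 0.0, "reformulation_rate": 0.0}
    return {
        "abandonment_rate": abandoned / searches,
        "reformulation_rate": reformulated / searches,
    }


sessions = [
    [{"query": "tom hanks", "clicks": 0}, {"query": "tom hanks movies", "clicks": 1}],
    [{"query": "alien", "clicks": 0}],
]
rates = behavior_rates(sessions)
```

Rates like these need only clickstream data, and they correspond to user tasks (giving up, trying again) that non-experts can recognize.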
52. Development Framework for Search
[Diagram labels: LAB, OPERATIONS, INTEGRATION, CLICKSTREAM, A/B TESTING, EVALUATION, OFFLINE, RELEVANCE, ONLINE, AUTOMATED TESTING, IRON TRIANGLE]