Improving Findability through Site Search Analytics

Louis Rosenfeld • May 12, 2009
ESS 2009 • New York...
Brief talk given at the Enterprise Search Summit; New York, NY, USA; May 12, 2009.


  1. Improving Findability through Site Search Analytics. Louis Rosenfeld • May 12, 2009 • ESS 2009 • New York, NY, USA
  2. What we’ll cover ★ Quick intro ★ SSA from the Bottom Up ★ SSA from the Top Down ★ Putting them together
  3. Quick Intro ★ Where search query data comes from ★ Our friend Zipf ★ Long tail, meet short head
  4. Sample query data (from Google Search Appliance):
     XXX.XXX.X.104 - - [10/Jul/2006:10:25:46 -0800] "GET /search?access=p&entqr=0&output=xml_no_dtd&sort=date%3AD%3AL%3Ad1&ud=1&site=AllSites&ie=UTF-8&client=www&oe=UTF-8&proxystylesheet=www&q=lincense+plate&ip=XXX.XXX.X.104 HTTP/1.1" 200 971 0 0.02
     XXX.XXX.X.104 - - [10/Jul/2006:10:25:48 -0800] "GET /search?access=p&entqr=0&output=xml_no_dtd&sort=date%3AD%3AL%3Ad1&ie=UTF-8&client=www&q=license+plate&ud=1&site=AllSites&spell=1&oe=UTF-8&proxystylesheet=www&ip=XXX.XXX.X.104 HTTP/1.1" 200 8283 146 0.16
     XXX.XXX.XX.130 - - [10/Jul/2006:10:24:38 -0800] "GET /search?access=p&entqr=0&output=xml_no_dtd&sort=date%3AD%3AL%3Ad1&ud=1&site=AllSites&ie=UTF-8&client=www&oe=UTF-8&proxystylesheet=www&q=regional+transportation+governance+commission&ip=XXX.XXX.X.130 HTTP/1.1" 200 9718 62 0.17
  5. The Zipf Distribution
  6. Zipf, textually: the power of the short head
  7. But can you get at your data?
  8. SSA from the Bottom Up ★ The basics: play and ask questions ★ Five things you should be doing
  9. Generic questions help you play with your data ★ What are the most frequent unique queries? ★ Are frequent queries retrieving quality results? ★ Click-through rates per frequent query? ★ Most frequently clicked result per query? ★ Which frequent queries retrieve zero results? ★ What are the referrer pages for frequent queries? ★ Which queries retrieve popular documents? ★ What interesting patterns emerge in general?
  10. Bottom Up SSA: Five things you should do 1. Cluster your data to get a better picture of metadata and content needs 2. Track for seasonality 3. Take failure further: beyond failed searches 4. Leverage your best bets 5. Don’t be satisfied with generic reports
  11. Hunting for metadata patterns: CANDIDATE VALUES, CANDIDATE ATTRIBUTES
  16. Hunting for content types
  19. Surfacing content types
  21. The When of search
  22. Failure is underrated: digging deeper
  23. Beyond best bets
  24. Netflix moves beyond generic reports
  30. Analyzing data from the bottom up: play with the data, look for patterns, trends, and outliers. So what’s being measured?
  31. SSA from the Top Down ★ The basics: why are we here? ★ The hard part: what can we measure?
  33. First: why are we here? ★ Commerce ★ Lead Generation ★ Content/Media ★ Support/Self-Service. Data supports metrics... but which metrics for search?
  35. Can we measure findability? Does measure mean monetize?
  36. Vanguard and the quantification of search

      Metric                     Target   Oct 3   Oct 10   Oct 16
      Mean distance from 1st     3        13      7        5
      Median distance from 1st   2        7       3        1
      Count: Below 1st           47%      84%     62%      58%
      Count: Below 5th           12%      58%     38%      14%
      Count: Below 10th          7%       38%     10%      7%
      Precision – Strict         42%      15%     36%      39%
      Precision – Loose          71%      38%     53%      65%
      Precision – Permissive     96%      55%     72%      92%

      Quantification, not monetization
  37. Search-related metrics ★ Jeannine Bartlett’s SIX Metrics(tm) Framework ★ Lee Romero’s search metrics ★ Both here: http://bit.ly/1a2mzk Disconnect: analytics world of KPI vs. experiential world of search
  39. Analyzing data from the top down: start with metrics, benchmark and measure performance. But you can’t measure what you don’t know
  42. Putting it all together: what / Top-down analysis / Bottom-up analysis / why
  44. [Sample query data from slide 4, repeated] BU Q (bottom-up question): “What are the most common queries?”
  46. [Sample query data from slide 4, repeated] TD Q (top-down question): “Are we converting license plate renewals?”
  47. [Figure, © 2008 Christian Rohrer: user research methods plotted by Data Source (Behavioral to Attitudinal) and Approach (Qualitative/direct to Quantitative/indirect). Methods shown: Eyetracking; Data Mining/Analysis; A/B (Live) Testing; Usability Benchmarking (in lab); Usability Lab Studies; Online User Experience Assessments (“Vividence-like” studies); Ethnographic Field Studies; Diary/Camera Study; Message Board Mining; Participatory Design; Customer feedback via email; Focus Groups; Desirability studies; Intercept Surveys; Phone Interviews; Cardsorting; Email Surveys. Key for context of product use during data collection: natural use of product; scripted (often lab-based) use of product; de-contextualized / not using product; combination / hybrid.] A LOVELY USER RESEARCH STRAW MAN
  50. The data that drive our decisions

      Web Analytics              User Experience
      behavioral                 attitudinal
      quantitative               qualitative
      high fidelity              artificial
      high volume                high quality
      This data is about WHAT    This data is about WHY
  57. Common queries can drive task analysis: “Can you find a map of the campus?” “What study abroad options are available to students?” “When is the last home football game of the season?”
  60. Query data can augment personas: “What Steven Searches” added to existing persona (from Adaptive Path)
  63. This is not statistics. This is not difficult. This is very useful.
  66. Systems can help us objectify the subjective: Subjective evaluations... ...lead to objective decisions
  67. Integrating through shared goals
  68. What we covered ★ Quick intro ★ SSA from the Bottom Up ★ SSA from the Top Down ★ Putting them together
  69. Some day my book will come... Search Analytics for Your Site: Conversations with Your Customers. Louis Rosenfeld & Marko Hurst. Rosenfeld Media, 2009 (?). rosenfeldmedia.com/books/searchanalytics
  70. Until then... Louis Rosenfeld, 457 Third Street, #4R, Brooklyn, NY 11215 USA • lou@louisrosenfeld.com • www.louisrosenfeld.com • www.rosenfeldmedia.com • Twitter: louisrosenfeld, rosenfeldmedia
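
Worked example for slides 4 and 9. This is a minimal sketch of how raw search log lines could be turned into answers to two of the generic questions: the most frequent queries, and the queries that retrieve zero results. It assumes the last two fields of each Google Search Appliance line are the result count and the response time in seconds, which the slides do not state. The names `parse_line`, `summarize`, and `short_head_share` are illustrative and do not come from the talk or from any product.

```python
import re
from collections import Counter
from urllib.parse import parse_qs, urlsplit

# ASSUMPTION: the tail of each line is: status, bytes, result count, seconds.
LOG_LINE = re.compile(
    r'"GET (?P<url>\S+) HTTP/[\d.]+" (?P<status>\d{3}) (?P<size>\d+) '
    r'(?P<results>\d+) (?P<seconds>[\d.]+)\s*$'
)

# The three sample lines from slide 4, rejoined onto single lines.
COMMON = "access=p&entqr=0&output=xml_no_dtd&sort=date%3AD%3AL%3Ad1"
SAMPLE_LOG = [
    'XXX.XXX.X.104 - - [10/Jul/2006:10:25:46 -0800] "GET /search?' + COMMON +
    '&ud=1&site=AllSites&ie=UTF-8&client=www&oe=UTF-8&proxystylesheet=www'
    '&q=lincense+plate&ip=XXX.XXX.X.104 HTTP/1.1" 200 971 0 0.02',
    'XXX.XXX.X.104 - - [10/Jul/2006:10:25:48 -0800] "GET /search?' + COMMON +
    '&ie=UTF-8&client=www&q=license+plate&ud=1&site=AllSites&spell=1'
    '&oe=UTF-8&proxystylesheet=www&ip=XXX.XXX.X.104 HTTP/1.1" 200 8283 146 0.16',
    'XXX.XXX.XX.130 - - [10/Jul/2006:10:24:38 -0800] "GET /search?' + COMMON +
    '&ud=1&site=AllSites&ie=UTF-8&client=www&oe=UTF-8&proxystylesheet=www'
    '&q=regional+transportation+governance+commission'
    '&ip=XXX.XXX.X.130 HTTP/1.1" 200 9718 62 0.17',
]


def parse_line(line):
    """Return (normalized query, result count), or None if the line does not match."""
    match = LOG_LINE.search(line)
    if match is None:
        return None
    params = parse_qs(urlsplit(match.group("url")).query)
    raw_query = params.get("q", [""])[0]
    # Lowercase and collapse whitespace so trivially different queries group together.
    query = " ".join(raw_query.lower().split())
    if not query:
        return None
    return query, int(match.group("results"))


def summarize(lines):
    """Count how often each query occurs and how often it returned zero results."""
    counts = Counter()
    zero_hits = Counter()
    for line in lines:
        parsed = parse_line(line)
        if parsed is None:
            continue
        query, results = parsed
        counts[query] += 1
        if results == 0:
            zero_hits[query] += 1
    return counts, zero_hits


def short_head_share(counts, top_n):
    """Fraction of all searches accounted for by the top_n most frequent queries."""
    total = sum(counts.values())
    if total == 0:
        return 0.0
    return sum(n for _, n in counts.most_common(top_n)) / total


if __name__ == "__main__":
    counts, zero_hits = summarize(SAMPLE_LOG)
    for query, n in counts.most_common(10):
        print(n, query)
    print("zero-result queries:", sorted(zero_hits))
```

Under the assumption about the trailing fields, the misspelled `lincense plate` line is the one reported as a zero-result query. On a real log, `short_head_share(counts, 100)` would give a rough sense of how much of the traffic the short head covers.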
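
Worked example for slide 36. The Vanguard table tracks "distance from 1st" and "Count: Below Nth" figures, but the slides do not define them. The sketch below assumes one plausible reading: for each test query, someone records the 1-based position at which the best result appeared, and the metrics summarize those positions. The function name `distance_metrics` and the sample positions are hypothetical and are not Vanguard's data or method.

```python
from statistics import mean, median


def distance_metrics(best_positions):
    """Summarize where the best result ranked for a set of test queries.

    best_positions: 1-based rank of the best result for each query
    (ASSUMPTION: this is what "distance from 1st" is measured against).
    """
    if not best_positions:
        raise ValueError("need at least one test query")
    # A best result in position 1 has distance 0 from the top.
    distances = [position - 1 for position in best_positions]
    total = len(best_positions)

    def share_below(rank):
        # Fraction of queries whose best result ranked lower than `rank`.
        return sum(1 for position in best_positions if position > rank) / total

    return {
        "mean_distance": mean(distances),
        "median_distance": median(distances),
        "below_1st": share_below(1),
        "below_5th": share_below(5),
        "below_10th": share_below(10),
    }


if __name__ == "__main__":
    # Hypothetical positions for five test queries.
    for name, value in distance_metrics([1, 1, 2, 4, 12]).items():
        print(name, value)
```

Re-running the same test queries on each date and comparing against a target column, as the slide does, is what turns these figures into a benchmark.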
