Site Search Analytics
in a Nutshell
Louis Rosenfeld
lou@louisrosenfeld.com • @louisrosenfeld
Webdagene • 10 September 2013
Hello, my name is Lou
www.louisrosenfeld.com | www.rosenfeldmedia.com
Let’s look at the data
No, let’s look at the real data
Critical elements in bold: IP address, time/date stamp, query, and # of
results:
XXX.XXX.X...
SSA is semantically rich data, and...
Queries
sorted by
frequency
...what users want--in their own words
A little goes a long way
A handful of queries/tasks/ways to navigate/features/documents
meet the needs of your most import...
(and the tail is quite long)
The Long Tail is
much longer than
you’d suspect
The Zipf Distribution, textually
Some things you can do with SSA
1. Make it harder to get lost in deep content
2. Make search smarter
3. Reduce jargon
4. Learn...
#1
Make it harder to get lost
Start with basic SSA data:
queries and query frequency
Percent: volume
of search activity
for a unique
query during a
part...
Tease out common content types
Took an hour to...
• Analyze top 50 queries (20% of all search activity)
• Ask and iterate:...
Clear content types lead to
better contextual navigation
artist descriptions
album reviews
album pages
artist bios
discogra...
#2
Make search smarter
Clear content types improve
search performance
Content objects
related to products
Raw search results
Contextualizing “advanced” features
Session data suggest
progression and context
search session patterns
1. solar energy
2. how solar energy works
search sess...
Recognizing
proper nouns,
dates, and
unique ID#s
#3
Reduce jargon
Saving the brand by killing jargon
at a community college
Jargon related to online education: FlexEd, COD,
College on Dema...
#4
Learn how your audiences differ
Who cares about what?
Why analyze queries by audience?
Fortify your personas with data
Learn about differences between audiences
• Open Universi...
#5
Know when to publish what
Interest in the
football team:
going...
...going...
gone
Time to
study!
Before
Tax Day
After
Tax Day
#6
Own and enjoy your failures
Failed navigation?
Examining unexpected searching
Look for places
searches happen
beyond main page
What’s going on?
• Navi...
Where navigation is failing
(“Professional Resources” page)
Do users and
AIGA mean
different
things by
“Professional
Resou...
Comparing what users find
and what they want
Failed business goals?
Developing custom metrics
Netflix asks
1. Which movies most frequently searched? (query count)
2. Wh...
#7
Avoid disasters
The new and improved search engine
that wasn’t
Vanguard used SSA to help benchmark
existing search engine’s performance an...
What to do?
Test performance of common queries
“Before and after” testing using two sets of
metrics
1.Relevance: how relia...
Old engine (target) and new compared
Note: low relevance and high precision scores are optimal
More on Vanguard case study...
#8
Predict the future
Shaping the
Financial Times’ editorial agenda
FT compares these
• Spiking queries
for proper nouns
(i.e., people and
compan...
Can SSA bring us together?
Lou’s TABLE OF
OVERGENERALIZED
DICHOTOMIES
Web Analytics User Experience
What they analyze
Users' behaviors (what's
happen...
Lands End and SKUs
SKU: # 39072-2AH1
Use SSA to start work
on a site report card
SSA helps
determine common
information needs
Read this
Search Analytics for Your Site:
Conversations with
Your Customers
by Louis Rosenfeld
(Rosenfeld Media, 2011)
...
Louis Rosenfeld
lou@louisrosenfeld.com
www.louisrosenfeld.com
www.rosenfeldmedia.com
www.slideshare.net/lrosenfeld
@louisr...
Originally presented at SXSW March 13, 2011, on panel with Fred Beecher and Austin Govella. Modified and updated for Web 2.0 Expo talk, October 12, 2011, UX Web Summit September 26, 2012; Webdagene September 10, 2013.

Published in: Technology, Business, Design
  • Chris, what’s happening? The file is Keynote, BTW; anyone who wants the PDF can grab it from here: http://dl.dropbox.com/u/6208785/Web2.0.pdf
  • We get two major things out of this data: SESSIONS and FREQUENT QUERIES
  • Your brain on data: what will it do?
  • Amazing drawing by Eva-Lotta Lamm: www.evalotta.net
  • Personas: http://www.uie.com/images/blog/YahooExamplePersona.gif
    Table: From Jarrett, Quesenbery, Stirling, and Allen’s report “Search Behaviour at OU,” April 6, 2007.
  • Examples: “OO7” versus “007”; porn-related (not carried by Netflix); “yoga”: not stocking enough? Or not indexing enough record content? Some other problem?
  • More great illustrations by Eva-Lotta Lamm
  • Transcript of "Site Search Analytics in a Nutshell"

    1. 1. Site Search Analytics in a Nutshell Louis Rosenfeld lou@louisrosenfeld.com • @louisrosenfeld Webdagene • 10 September 2013
    2. 2. Hello, my name is Lou www.louisrosenfeld.com | www.rosenfeldmedia.com
    3. 3. Let’s look at the data
    6. 6. No, let’s look at the real data Critical elements in bold: IP address, time/date stamp, query, and # of results:
       XXX.XXX.X.104 - - [10/Jul/2006:10:25:46 -0800] "GET /search?access=p&entqr=0&output=xml_no_dtd&sort=date%3AD%3AL%3Ad1&ud=1&site=AllSites&ie=UTF-8&client=www&oe=UTF-8&proxystylesheet=www&q=lincense+plate&ip=XXX.XXX.X.104 HTTP/1.1" 200 971 0 0.02
       XXX.XXX.X.104 - - [10/Jul/2006:10:25:48 -0800] "GET /search?access=p&entqr=0&output=xml_no_dtd&sort=date%3AD%3AL%3Ad1&ie=UTF-8&client=www&q=license+plate&ud=1&site=AllSites&spell=1&oe=UTF-8&proxystylesheet=www&ip=XXX.XXX.X.104 HTTP/1.1" 200 8283 146 0.16
       What are users searching? How often are users failing?
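
The four elements the slide calls out can be pulled from lines like these with a small parser. This is a minimal sketch, not the tooling used in the talk. It assumes the trailing numbers are status, bytes, number of results, and response time (the slide does not label them), and the sample lines are shortened versions of the ones above. The function name `parse_search_log_line` is made up for illustration.

```python
import re
from urllib.parse import unquote_plus

# Assumed layout: IP, bracketed timestamp, quoted GET request with a q= parameter,
# then status, bytes, result count, seconds. The field order is an assumption.
LOG_PATTERN = re.compile(
    r'^(?P<ip>\S+) - - \[(?P<timestamp>[^\]]+)\] '
    r'"GET (?P<request>.*?) HTTP/[\d.]+" '
    r'(?P<status>\d+) (?P<size>\d+) (?P<results>\d+) (?P<seconds>[\d.]+)$'
)
QUERY_PATTERN = re.compile(r'[?&]q=([^&\s]*)')


def parse_search_log_line(line):
    """Return a dict with ip, timestamp, query, and results, or None if unparseable."""
    match = LOG_PATTERN.match(line.strip())
    if match is None:
        return None
    query_match = QUERY_PATTERN.search(match.group("request"))
    if query_match is None:
        return None
    return {
        "ip": match.group("ip"),
        "timestamp": match.group("timestamp"),
        "query": unquote_plus(query_match.group(1)),  # "license+plate" -> "license plate"
        "results": int(match.group("results")),
    }


# Shortened sample lines modeled on the slide (most URL parameters dropped).
SAMPLE_LINES = [
    'XXX.XXX.X.104 - - [10/Jul/2006:10:25:46 -0800] "GET /search?access=p&q=lincense+plate&ip=XXX.XXX.X.104 HTTP/1.1" 200 971 0 0.02',
    'XXX.XXX.X.104 - - [10/Jul/2006:10:25:48 -0800] "GET /search?access=p&q=license+plate&spell=1&ip=XXX.XXX.X.104 HTTP/1.1" 200 8283 146 0.16',
]

for record in map(parse_search_log_line, SAMPLE_LINES):
    print(record["query"], "->", record["results"], "results")
```

Under that reading, the misspelled query returns zero results and the corrected one two seconds later returns 146, which is the kind of failure the slide asks about.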
    8. 8. SSA is semantically rich data, and... Queries sorted by frequency
    9. 9. ...what users want--in their own words
    10. 10. A little goes a long way. A handful of queries/tasks/ways to navigate/features/documents meet the needs of your most important audiences
    11. 11. Not all queries are distributed equally
    13. 13. Nor do they diminish gradually
    15. 15. 80/20 rule isn’t quite accurate
    20. 20. (and the tail is quite long) The Long Tail is much longer than you’d suspect
    21. 21. The Zipf Distribution, textually
    22. 22. Some things you can do with SSA
        1. Make it harder to get lost in deep content
        2. Make search smarter
        3. Reduce jargon
        4. Learn how your audiences differ
        5. Know when to publish what
        6. Own and enjoy your failures
        7. Avoid disaster
        8. Predict the future
    23. 23. #1 Make it harder to get lost
    24. 24. Start with basic SSA data: queries and query frequency Percent: volume of search activity for a unique query during a particular time period Cumulative Percent: running sum of percentages
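
The Percent and Cumulative Percent columns described on this slide can be computed in a few lines. The sketch below assumes the queries have already been extracted for one time period. The sample queries and the name `query_frequency_report` are made up for illustration.

```python
from collections import Counter


def query_frequency_report(queries):
    """Return rows of (query, count, percent, cumulative_percent), most frequent first."""
    # Normalizing case and whitespace is an assumption; some sites may prefer raw queries.
    counts = Counter(q.strip().lower() for q in queries if q.strip())
    total = sum(counts.values())
    rows = []
    cumulative = 0.0
    for query, count in counts.most_common():
        percent = 100.0 * count / total
        cumulative += percent  # running sum of percentages
        rows.append((query, count, round(percent, 2), round(cumulative, 2)))
    return rows


# Hypothetical queries for one time period.
SAMPLE_QUERIES = ["campus map"] * 5 + ["library hours"] * 3 + ["parking"] * 2

for row in query_frequency_report(SAMPLE_QUERIES):
    print(row)
```

Reading down the cumulative column shows how few unique queries account for a large share of all search activity.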
    27. 27. Tease out common content types Took an hour to... • Analyze top 50 queries (20% of all search activity) • Ask and iterate: “what kind of content would users be looking for when they searched these terms?” • Add cumulative percentages Result: prioritized list of potential content types #1) application: 11.77% #2) reference: 10.5% #3) instructions: 8.6% #4) main/navigation pages: 5.91% #5) contact info: 5.79% #6) news/announcements: 4.27%
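
Once each top query has been labeled by hand with the kind of content it seems to seek, the prioritized list is a sum of percentages per label. The sketch below shows that roll-up. The queries, percentages, and labels in the sample are invented and are not the data behind the slide.

```python
from collections import defaultdict


def rollup_by_content_type(labeled_queries):
    """labeled_queries: iterable of (query, percent_of_searches, content_type).

    Returns (content_type, summed_percent) pairs, largest share first.
    """
    totals = defaultdict(float)
    for _query, percent, content_type in labeled_queries:
        totals[content_type] += percent
    return sorted(totals.items(), key=lambda item: item[1], reverse=True)


# Hypothetical hand-labeled top queries.
LABELED = [
    ("apply", 4.0, "application"),
    ("application form", 2.5, "application"),
    ("academic calendar", 3.0, "reference"),
    ("how to register", 1.5, "instructions"),
]

for content_type, share in rollup_by_content_type(LABELED):
    print(f"{content_type}: {share}%")
```

The labeling itself stays a manual, iterative step, as the slide describes. Only the arithmetic is automated here.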
    28. 28. Clear content types lead to better contextual navigation: artist descriptions, album reviews, album pages, artist bios, discography, TV listings
    29. 29. #2 Make search smarter
    33. 33. Clear content types improve search performance Content objects related to products Raw search results
    34. 34. Contextualizing “advanced” features
    40. 40. Session data suggest progression and context search session patterns 1. solar energy 2. how solar energy works search session patterns 1. solar energy 2. energy search session patterns 1. solar energy 2. solar energy charts search session patterns 1. solar energy 2. explain solar energy search session patterns 1. solar energy 2. solar energy news
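
Session patterns like these can be approximated from the log by grouping queries from the same IP address that arrive close together in time. The sketch below assumes a 30-minute inactivity cutoff, a common convention rather than anything stated in the talk. The events and function names are hypothetical.

```python
from collections import Counter
from datetime import datetime, timedelta


def build_sessions(events, max_gap=timedelta(minutes=30)):
    """events: iterable of (ip, datetime, query). Returns a list of query lists."""
    sessions = []
    last_seen = {}  # ip -> (timestamp of last query, that session's query list)
    for ip, when, query in sorted(events, key=lambda e: (e[0], e[1])):
        previous = last_seen.get(ip)
        if previous is None or when - previous[0] > max_gap:
            current = []  # start a new session after a long pause
            sessions.append(current)
        else:
            current = previous[1]
        current.append(query)
        last_seen[ip] = (when, current)
    return sessions


def follow_up_queries(sessions, first_query):
    """Count which queries immediately follow first_query within a session."""
    follow_ups = Counter()
    for session in sessions:
        for current, following in zip(session, session[1:]):
            if current == first_query:
                follow_ups[following] += 1
    return follow_ups


# Hypothetical events from two visitors.
EVENTS = [
    ("10.0.0.1", datetime(2013, 9, 10, 10, 0), "solar energy"),
    ("10.0.0.1", datetime(2013, 9, 10, 10, 1), "how solar energy works"),
    ("10.0.0.2", datetime(2013, 9, 10, 10, 0), "solar energy"),
    ("10.0.0.2", datetime(2013, 9, 10, 10, 2), "solar energy charts"),
    ("10.0.0.2", datetime(2013, 9, 10, 12, 0), "energy"),
]

sessions = build_sessions(EVENTS)
print(follow_up_queries(sessions, "solar energy"))
```

IP addresses are an imperfect stand-in for users (shared networks, changing addresses), so the resulting sessions are best read as rough patterns.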
    41. 41. Recognizing proper nouns, dates, and unique ID#s
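
One way a search system might recognize such queries is with simple pattern checks applied before the normal search runs. The sketch below is illustrative only: the patterns are assumptions, and capitalization is a weak signal for proper nouns. The ID pattern is loosely modeled on the SKU shown later in the deck.

```python
import re

# Assumed patterns; a real site would tune these to its own dates and ID formats.
DATE_PATTERN = re.compile(r'^\d{1,2}[/-]\d{1,2}[/-]\d{2,4}$|^\d{4}-\d{2}-\d{2}$')
ID_PATTERN = re.compile(r'^#?\s*\d{3,}(-[0-9A-Za-z]+)*$')


def classify_query(query):
    """Return 'date', 'id', 'proper_noun_candidate', or 'other'."""
    text = query.strip()
    if DATE_PATTERN.match(text):
        return "date"
    if ID_PATTERN.match(text):
        return "id"
    words = text.split()
    # Every word capitalized: only a hint that this might be a name.
    if words and all(word[:1].isupper() for word in words):
        return "proper_noun_candidate"
    return "other"


for sample in ["39072-2AH1", "2013-09-10", "Louis Rosenfeld", "solar energy"]:
    print(sample, "->", classify_query(sample))
```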
    42. 42. #3 Reduce jargon
    43. 43. Saving the brand by killing jargon at a community college
        Jargon related to online education: FlexEd, COD, College on Demand
        Marketing’s solution: expensive campaign to educate public (via posters, brochures)
        The Numbers (from SSA):
          query rank   query
          #22          online*
          #101         COD
          #259         College on Demand
          #389         FlexTrack
          *“online” part of 213 queries
        Result: content relabeled, money saved
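
A rank table like the one on this slide can be produced by ranking all queries by frequency and looking up the jargon terms. The sketch below uses invented sample data and hypothetical function names. It also counts the distinct queries containing a plain-language word, mirroring the footnote about "online".

```python
from collections import Counter


def query_ranks(queries):
    """Map each normalized query to its frequency rank (1 = most frequent)."""
    counts = Counter(q.strip().lower() for q in queries if q.strip())
    return {query: rank for rank, (query, _count) in enumerate(counts.most_common(), start=1)}


def jargon_ranks(queries, jargon_terms):
    """Return {term: rank or None} for each jargon term."""
    ranks = query_ranks(queries)
    return {term: ranks.get(term.lower()) for term in jargon_terms}


def distinct_queries_containing(queries, word):
    """Count distinct queries that include the given word."""
    distinct = {q.strip().lower() for q in queries if q.strip()}
    return sum(1 for q in distinct if word.lower() in q.split())


# Hypothetical query log.
SAMPLE = ["online classes"] * 4 + ["financial aid"] * 3 + ["online degree"] * 2 + ["cod"]

print(jargon_ranks(SAMPLE, ["COD", "FlexTrack"]))
print(distinct_queries_containing(SAMPLE, "online"))
```

A jargon term that ranks far below its plain-language equivalent, or does not appear at all, is a candidate for relabeling.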
    44. 44. #4 Learn how your audiences differ
    48. 48. Who cares about what?
    49. 49. Why analyze queries by audience? Fortify your personas with data Learn about differences between audiences • Open University “Enquirers”: 16 of 25 queries are for subjects not taught at OU • Open University Students: search for course codes, topics dealing with completing program Determine what’s commonly important to all audiences (these queries better work well)
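
If each query can be tagged with an audience (for example, by login status or site section, which is an assumption about the available data), both the per-audience differences and the shared queries fall out of a simple grouping. The audiences and queries below are invented, and the function names are hypothetical.

```python
from collections import Counter


def top_queries_by_audience(events, n=25):
    """events: iterable of (audience, query). Returns {audience: top-n query list}."""
    counts = {}
    for audience, query in events:
        counts.setdefault(audience, Counter())[query.strip().lower()] += 1
    return {
        audience: [query for query, _count in counter.most_common(n)]
        for audience, counter in counts.items()
    }


def common_to_all(top_by_audience):
    """Queries that appear in every audience's top list."""
    query_sets = [set(queries) for queries in top_by_audience.values()]
    return set.intersection(*query_sets) if query_sets else set()


# Hypothetical tagged queries.
EVENTS = [
    ("enquirer", "fees"),
    ("enquirer", "law degree"),
    ("student", "fees"),
    ("student", "m255"),
]

top = top_queries_by_audience(EVENTS)
print(top)
print(common_to_all(top))
```

The intersection is the short list the slide says had "better work well" for everyone.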
    50. 50. #5 Know when to publish what
    54. 54. Interest in the football team: going... ...going... gone Time to study!
    55. 55. Before Tax Day
    56. 56. After Tax Day
    57. 57. #6 Own and enjoy your failures
    58. 58. Failed navigation? Examining unexpected searching Look for places searches happen beyond main page What’s going on? • Navigational failure? • Content failure? • Something else?
    59. 59. Where navigation is failing (“Professional Resources” page) Do users and AIGA mean different things by “Professional Resources”?
    60. 60. Comparing what users find and what they want
    64. 64. Failed business goals? Developing custom metrics Netflix asks 1. Which movies most frequently searched? (query count) 2. Which of them most frequently clicked through? (MDP views) 3. Which of them least frequently added to queue? (queue adds)
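
The three questions can be combined into one report: for heavily searched titles, compare click-throughs to searches and queue adds to click-throughs. This is a sketch under assumptions. The field names, the threshold, and the numbers are invented, and nothing here describes how Netflix actually computes it.

```python
def underperforming_titles(stats, min_searches=100):
    """stats: {title: {"searches": int, "mdp_views": int, "queue_adds": int}}.

    Returns (title, click_through_rate, add_rate) tuples for frequently searched
    titles, lowest add rate first.
    """
    rows = []
    for title, s in stats.items():
        if s["searches"] < min_searches:
            continue  # not searched often enough to matter
        click_through = s["mdp_views"] / s["searches"]
        add_rate = s["queue_adds"] / s["mdp_views"] if s["mdp_views"] else 0.0
        rows.append((title, round(click_through, 2), round(add_rate, 2)))
    return sorted(rows, key=lambda row: row[2])


# Hypothetical counts.
STATS = {
    "Title A": {"searches": 1000, "mdp_views": 800, "queue_adds": 40},
    "Title B": {"searches": 500, "mdp_views": 400, "queue_adds": 300},
    "Title C": {"searches": 10, "mdp_views": 5, "queue_adds": 0},
}

for row in underperforming_titles(STATS):
    print(row)
```

Titles at the top of the list are searched and viewed often but rarely added, which is where to start asking why.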
    65. 65. #7 Avoid disasters
    66. 66. The new and improved search engine that wasn’t Vanguard used SSA to help benchmark existing search engine’s performance and help select new engine New search engine “performed” poorly But IT needed convincing to delay launch Information Architect & Dev Team Meeting Search seems to have a few problems… Nah . Where’s the proof? You can’t tell for sure.
    67. 67. What to do? Test performance of common queries “Before and after” testing using two sets of metrics 1.Relevance: how reliably the search engine returns the best matches first 2.Precision: proportion of relevant results clustered at the top of the list
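
Both metrics can be scored per query from a ranked result list and a human judgment of which results are good. In the sketch below, relevance is taken as the position of the single best match (lower is better, consistent with the later note that low relevance scores are optimal). Precision is taken as the share of relevant results in the top k. These exact definitions and the value k=5 are assumptions, not the case study's formulas.

```python
def best_match_rank(results, best_match):
    """1-based position of the best match in the results, or None if it is missing."""
    try:
        return results.index(best_match) + 1
    except ValueError:
        return None


def precision_at_k(results, relevant, k=5):
    """Share of the top k results that were judged relevant."""
    top = results[:k]
    if not top:
        return 0.0
    return sum(1 for item in top if item in relevant) / len(top)


# Hypothetical result lists for one common query on each engine.
OLD_ENGINE = ["a", "b", "c", "d", "e"]
NEW_ENGINE = ["x", "c", "a", "y", "z"]
BEST = "a"
RELEVANT = {"a", "b", "c"}

for name, results in [("old", OLD_ENGINE), ("new", NEW_ENGINE)]:
    print(name, best_match_rank(results, BEST), precision_at_k(results, RELEVANT))
```

Running the same common queries against both engines gives the "before and after" comparison in numbers that can be shown to a skeptical team.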
    70. 70. Old engine (target) and new compared Note: low relevance and high precision scores are optimal More on Vanguard case study: http://bit.ly/D3B8c uh-oh better
    71. 71. #8 Predict the future
    72. 72. Shaping the Financial Times’ editorial agenda FT compares these • Spiking queries for proper nouns (i.e., people and companies) • Recent editorial coverage of people and companies Discrepancy? • Breaking story?! • Let the editors know! Seed your
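
The comparison described here can be sketched as two steps: flag queries whose count today is well above their recent average, then drop the ones already covered editorially. The thresholds, names, and data below are invented for illustration and do not describe the FT's actual process.

```python
from statistics import mean


def spiking_queries(daily_counts, today_counts, ratio=3.0, min_count=20):
    """daily_counts: {query: [counts on earlier days]}; today_counts: {query: count today}."""
    spikes = []
    for query, today in today_counts.items():
        history = daily_counts.get(query, [])
        baseline = mean(history) if history else 0.0
        # Treat very small baselines as 1 so brand-new queries can still register.
        if today >= min_count and today >= ratio * max(baseline, 1.0):
            spikes.append(query)
    return sorted(spikes)


def uncovered_spikes(spikes, recently_covered):
    """Spiking queries with no matching recent editorial coverage."""
    covered = {name.lower() for name in recently_covered}
    return [query for query in spikes if query.lower() not in covered]


# Hypothetical counts for company and person names.
HISTORY = {"acme corp": [5, 6, 4], "jane doe": [30, 28, 32]}
TODAY = {"acme corp": 60, "jane doe": 31, "newco": 25}

spikes = spiking_queries(HISTORY, TODAY)
print(spikes)
print(uncovered_spikes(spikes, ["Acme Corp"]))
```

Anything left after the second step is a discrepancy worth passing to the editors.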
    73. 73. Can SSA bring us together?
    74. 74. Lou’s TABLE OF OVERGENERALIZED DICHOTOMIES
        (each row: Web Analytics | User Experience)
        What they analyze: Users' behaviors (what's happening) | Users' intentions and motives (why those things happen)
        What methods they employ: Quantitative methods to determine what's happening | Qualitative methods for explaining why things happen
        What they're trying to achieve: Helps the organization meet goals (expressed as KPI) | Helps users achieve goals (expressed as tasks or topics of interest)
        How they use data: Measure performance (goal-driven analysis) | Uncover patterns and surprises (emergent analysis)
        What kind of data they use: Statistical data ("real" data in large volumes, full of errors) | Descriptive data (in small volumes, generated in lab environment, full of errors)
    76. 76. Lands End and SKUs SKU: # 39072-2AH1
    78. 78. Use SSA to start work on a site report card SSA helps determine common information needs
    79. 79. Read this Search Analytics for Your Site: Conversations with Your Customers by Louis Rosenfeld (Rosenfeld Media, 2011) www.rosenfeldmedia.com Use code WEBDAGENE2013 for 20% off all Rosenfeld Media books
    80. 80. Louis Rosenfeld lou@louisrosenfeld.com www.louisrosenfeld.com www.rosenfeldmedia.com www.slideshare.net/lrosenfeld @louisrosenfeld @rosenfeldmedia Say hello