Transcript

  • 1. Site Search Analytics in a Nutshell Louis Rosenfeld lou@louisrosenfeld.com • @louisrosenfeld Webdagene • 10 September 2013
  • 2. Hello, my name is Lou www.louisrosenfeld.com | www.rosenfeldmedia.com
  • 3. Let’s look at the data
  • 6. No, let’s look at the real data. Critical elements (bold on the original slide): IP address, time/date stamp, query, and # of results:
      XXX.XXX.X.104 - - [10/Jul/2006:10:25:46 -0800] "GET /search?access=p&entqr=0&output=xml_no_dtd&sort=date%3AD%3AL%3Ad1&ud=1&site=AllSites&ie=UTF-8&client=www&oe=UTF-8&proxystylesheet=www&q=lincense+plate&ip=XXX.XXX.X.104 HTTP/1.1" 200 971 0 0.02
      XXX.XXX.X.104 - - [10/Jul/2006:10:25:48 -0800] "GET /search?access=p&entqr=0&output=xml_no_dtd&sort=date%3AD%3AL%3Ad1&ie=UTF-8&client=www&q=license+plate&ud=1&site=AllSites&spell=1&oe=UTF-8&proxystylesheet=www&ip=XXX.XXX.X.104 HTTP/1.1" 200 8283 146 0.16
      What are users searching? How often are users failing?
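A minimal sketch of pulling those four elements out of a log line like the ones on the slide. The field layout is an assumption read off the sample entries (status, bytes, result count, elapsed seconds after the request); the names `LOG_PATTERN` and `parse_search_entry` are hypothetical, and a real search engine's log format may differ.

```python
import re
from urllib.parse import parse_qs, urlsplit

# Assumed layout: IP, two unused fields, [timestamp], "GET url HTTP/x",
# then status, bytes, result count, elapsed seconds.
LOG_PATTERN = re.compile(
    r'(?P<ip>\S+) \S+ \S+ \[(?P<timestamp>[^\]]+)\] '
    r'"GET (?P<url>\S+) HTTP/[\d.]+" '
    r'(?P<status>\d+) (?P<size>\d+) (?P<results>\d+) (?P<elapsed>[\d.]+)'
)


def parse_search_entry(line):
    """Return the four critical elements of one log line, or None if it does not match."""
    match = LOG_PATTERN.match(line.strip())
    if match is None:
        return None
    params = parse_qs(urlsplit(match.group("url")).query)
    return {
        "ip": match.group("ip"),
        "timestamp": match.group("timestamp"),
        "query": params.get("q", [""])[0],  # parse_qs decodes "+" as a space
        "results": int(match.group("results")),
    }


# Shortened version of the first entry on the slide.
SAMPLE = (
    'XXX.XXX.X.104 - - [10/Jul/2006:10:25:46 -0800] '
    '"GET /search?access=p&q=lincense+plate&ip=XXX.XXX.X.104 HTTP/1.1" '
    '200 971 0 0.02'
)
entry = parse_search_entry(SAMPLE)
print(entry["query"], entry["results"])
```

Read this way, the misspelled query returns zero results, which is the kind of failure the slide asks about.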
  • 7. SSA is semantically rich data, and...
  • 8. SSA is semantically rich data, and... Queries sorted by frequency
  • 9. ...what users want--in their own words
  • 10. A little goes a long way: a handful of queries/tasks/ways to navigate/features/documents meet the needs of your most important audiences
  • 11. A little goes a long way: a handful of queries/tasks/ways to navigate/features/documents meet the needs of your most important audiences. Not all queries are distributed equally
  • 13. A little goes a long way: a handful of queries/tasks/ways to navigate/features/documents meet the needs of your most important audiences. Nor do they diminish gradually
  • 15. A little goes a long way: a handful of queries/tasks/ways to navigate/features/documents meet the needs of your most important audiences. The 80/20 rule isn’t quite accurate
  • 16. (and the tail is quite long)
  • 20. (and the tail is quite long) The Long Tail is much longer than you’d suspect
  • 21. The Zipf Distribution, textually
  • 22. Some things you can do with SSA: 1. Make it harder to get lost in deep content 2. Make search smarter 3. Reduce jargon 4. Learn how your audiences differ 5. Know when to publish what 6. Own and enjoy your failures 7. Avoid disaster 8. Predict the future
  • 23. #1 Make it harder to get lost
  • 24. Start with basic SSA data: queries and query frequency. Percent: a unique query’s share of all search activity during a particular time period. Cumulative Percent: running sum of the percentages
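The Percent and Cumulative Percent columns can be sketched in a few lines. This assumes the input is simply a list of raw query strings for one time period and that lowercasing is an acceptable normalization; `frequency_report` is a hypothetical name.

```python
from collections import Counter


def frequency_report(queries):
    """Return (query, count, percent, cumulative percent) rows, most frequent first."""
    counts = Counter(q.strip().lower() for q in queries)
    total = sum(counts.values())
    report = []
    cumulative = 0.0
    for query, count in counts.most_common():
        percent = 100.0 * count / total
        cumulative += percent  # running sum of the percentages
        report.append((query, count, round(percent, 2), round(cumulative, 2)))
    return report


for row in frequency_report(["map", "Map", "parking", "jobs"]):
    print(row)
```

Reading down the cumulative column shows how few unique queries account for a large share of search activity.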
  • 25. Tease out common content types
  • 27. Tease out common content types. Took an hour to...
      • Analyze top 50 queries (20% of all search activity)
      • Ask and iterate: “what kind of content would users be looking for when they searched these terms?”
      • Add cumulative percentages
      Result: prioritized list of potential content types
      #1) application: 11.77%
      #2) reference: 10.5%
      #3) instructions: 8.6%
      #4) main/navigation pages: 5.91%
      #5) contact info: 5.79%
      #6) news/announcements: 4.27%
  • 28. Clear content types lead to better contextual navigation: artist descriptions, album reviews, album pages, artist bios, discography, TV listings
  • 29. #2 Make search smarter
  • 30. Clear content types improve search performance
  • 32. Clear content types improve search performance Content objects related to products
  • 33. Clear content types improve search performance Content objects related to products Raw search results
  • 34. Contextualizing “advanced” features
  • 35. Session data suggest progression and context
  • 40. Session data suggest progression and context
      search session patterns: 1. solar energy 2. how solar energy works
      search session patterns: 1. solar energy 2. energy
      search session patterns: 1. solar energy 2. solar energy charts
      search session patterns: 1. solar energy 2. explain solar energy
      search session patterns: 1. solar energy 2. solar energy news
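One rough way to surface session patterns like these is to count which query most often comes right after a given query from the same visitor. The sketch below assumes a session can be approximated as "same IP, next query within 30 minutes"; that cutoff, the tuple layout, and the name `follow_up_queries` are all illustrative assumptions, not something the talk specifies.

```python
from collections import Counter
from datetime import datetime, timedelta


def follow_up_queries(entries, seed, max_gap=timedelta(minutes=30)):
    """Count queries issued directly after `seed` by the same IP within `max_gap`.

    entries: iterable of (ip, timestamp, query) tuples.
    """
    counts = Counter()
    last_seen = {}  # ip -> (timestamp, query) of that visitor's previous search
    for ip, when, query in sorted(entries, key=lambda e: e[1]):
        previous = last_seen.get(ip)
        if (previous is not None and previous[1] == seed
                and query != seed and when - previous[0] <= max_gap):
            counts[query] += 1
        last_seen[ip] = (when, query)
    return counts


start = datetime(2013, 9, 10, 9, 0)
sample_entries = [
    ("10.0.0.1", start, "solar energy"),
    ("10.0.0.1", start + timedelta(minutes=2), "how solar energy works"),
    ("10.0.0.2", start, "solar energy"),
    ("10.0.0.2", start + timedelta(minutes=1), "energy"),
    ("10.0.0.3", start, "solar energy"),
    ("10.0.0.3", start + timedelta(hours=2), "solar energy news"),  # too late to count
]
print(follow_up_queries(sample_entries, "solar energy"))
```

IP addresses are an imperfect stand-in for individual users, so the counts are best treated as hints about progression rather than exact session data.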
  • 41. Recognizing proper nouns, dates, and unique ID#s
  • 42. #3 Reduce jargon
  • 43. Saving the brand by killing jargon at a community college
      Jargon related to online education: FlexEd, COD, College on Demand
      Marketing’s solution: expensive campaign to educate public (via posters, brochures)
      The Numbers (from SSA):
        query rank   query
        #22          online*
        #101         COD
        #259         College on Demand
        #389         FlexTrack
        *“online” part of 213 queries
      Result: content relabeled, money saved
  • 44. #4 Learn how your audiences differ
  • 45. Who cares about what?
  • 49. Why analyze queries by audience?
      Fortify your personas with data
      Learn about differences between audiences
      • Open University “Enquirers”: 16 of 25 queries are for subjects not taught at OU
      • Open University Students: search for course codes, topics dealing with completing program
      Determine what’s commonly important to all audiences (these queries better work well)
  • 50. #5 Know when to publish what
  • 51. Interest in the football team: going...
  • 52. Interest in the football team: going... ...going...
  • 53. Interest in the football team: going... ...going... gone
  • 54. Interest in the football team: going... ...going... gone Time to study!
  • 55. Before Tax Day
  • 56. After Tax Day
  • 57. #6 Own and enjoy your failures
  • 58. Failed navigation? Examining unexpected searching
      Look for places searches happen beyond main page
      What’s going on?
      • Navigational failure?
      • Content failure?
      • Something else?
  • 59. Where navigation is failing (“Professional Resources” page) Do users and AIGA mean different things by “Professional Resources”?
  • 60. Comparing what users find and what they want
  • 62. Failed business goals? Developing custom metrics
      Netflix asks
      1. Which movies most frequently searched? (query count)
      2. Which of them most frequently clicked through? (MDP views)
      3. Which of them least frequently added to queue? (queue adds)
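A hedged sketch of how the three questions could be chained into one custom metric: take the most-searched titles, then order them by how rarely a detail-page view turns into a queue add. The data shape, the `top_n` cutoff, and the name `underperformers` are assumptions for illustration; the slide does not describe the actual calculation.

```python
def underperformers(stats, top_n=3):
    """Rank the most-searched titles by queue adds per detail-page view, lowest first.

    stats: {title: (query_count, mdp_views, queue_adds)}
    """
    most_searched = sorted(stats, key=lambda t: stats[t][0], reverse=True)[:top_n]

    def add_rate(title):
        _, views, adds = stats[title]
        return adds / views if views else 0.0

    return sorted(most_searched, key=add_rate)


# Made-up numbers, purely to show the shape of the input.
sample_stats = {
    "Title A": (100, 80, 40),
    "Title B": (90, 70, 7),
    "Title C": (5, 4, 4),
}
print(underperformers(sample_stats, top_n=2))
```

Titles at the front of the list are searched for and viewed often but seldom added, which is where something may be failing.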
  • 65. #7 Avoid disasters
  • 66. The new and improved search engine that wasn’t
      Vanguard used SSA to help benchmark existing search engine’s performance and help select new engine
      New search engine “performed” poorly
      But IT needed convincing to delay launch
      Information Architect & Dev Team Meeting: “Search seems to have a few problems…” “Nah.” “Where’s the proof?” “You can’t tell for sure.”
  • 67. What to do? Test performance of common queries
      “Before and after” testing using two sets of metrics
      1. Relevance: how reliably the search engine returns the best matches first
      2. Precision: proportion of relevant results clustered at the top of the list
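One plausible reading of the two metrics, sketched for a single query: relevance as the position of the known best match in the result list (so a lower number is better), and precision as the share of relevant results among the top few. The exact definitions Vanguard used are not given on the slide, so the function names, the top-5 cutoff, and the judgment of what counts as "relevant" are assumptions.

```python
def relevance_rank(results, best_match):
    """1-based position of the best match in the result list, or None if it is missing."""
    try:
        return results.index(best_match) + 1
    except ValueError:
        return None


def precision_at(results, relevant, n=5):
    """Share of the top `n` results that were judged relevant."""
    top = results[:n]
    if not top:
        return 0.0
    return sum(1 for r in top if r in relevant) / len(top)


# Hypothetical result list for one common query.
results = ["page-a", "page-b", "page-c", "page-d", "page-e"]
print(relevance_rank(results, "page-c"))
print(precision_at(results, {"page-a", "page-c"}))
```

Running both functions over the same set of common queries on the old and new engines gives a simple before-and-after comparison.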
  • 68. Old engine (target) and new compared Note: low relevance and high precision scores are optimal More on Vanguard case study: http://bit.ly/D3B8c
  • 69. Old engine (target) and new compared Note: low relevance and high precision scores are optimal More on Vanguard case study: http://bit.ly/D3B8c uh-oh
  • 70. Old engine (target) and new compared Note: low relevance and high precision scores are optimal More on Vanguard case study: http://bit.ly/D3B8c uh-oh better
  • 71. #8 Predict the future
  • 72. Shaping the Financial Times’ editorial agenda
      FT compares these
      • Spiking queries for proper nouns (i.e., people and companies)
      • Recent editorial coverage of people and companies
      Discrepancy?
      • Breaking story?!
      • Let the editors know!
      Seed your
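The comparison could be sketched as: flag names whose query volume has jumped relative to a baseline period, then drop the ones editors have already covered. The thresholds (`factor`, `min_count`), the input shapes, and the name `uncovered_spikes` are illustrative assumptions; the slide does not say how FT detects spikes.

```python
def uncovered_spikes(baseline, current, covered, factor=3.0, min_count=10):
    """Names whose query count spiked versus the baseline and lack recent coverage.

    baseline, current: {name: query count} for the earlier and latest periods.
    covered: set of names with recent editorial coverage.
    """
    spikes = []
    for name, count in current.items():
        usual = baseline.get(name, 0)
        spiking = count >= min_count and count >= factor * max(usual, 1)
        if spiking and name not in covered:
            spikes.append(name)
    return sorted(spikes, key=lambda n: current[n], reverse=True)


# Made-up counts for illustration.
baseline_counts = {"acme corp": 5, "jane doe": 20}
current_counts = {"acme corp": 60, "jane doe": 25, "newco": 12}
recently_covered = {"newco"}
print(uncovered_spikes(baseline_counts, current_counts, recently_covered))
```

Anything the function returns is a candidate discrepancy worth passing to the editors.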
  • 73. Can SSA bring us together?
  • 74. Lou’s TABLE OF OVERGENERALIZED DICHOTOMIES: Web Analytics vs. User Experience
      What they analyze. Web Analytics: users' behaviors (what's happening). User Experience: users' intentions and motives (why those things happen)
      What methods they employ. Web Analytics: quantitative methods to determine what's happening. User Experience: qualitative methods for explaining why things happen
      What they're trying to achieve. Web Analytics: helps the organization meet goals (expressed as KPI). User Experience: helps users achieve goals (expressed as tasks or topics of interest)
      How they use data. Web Analytics: measure performance (goal-driven analysis). User Experience: uncover patterns and surprises (emergent analysis)
      What kind of data they use. Web Analytics: statistical data ("real" data in large volumes, full of errors). User Experience: descriptive data (in small volumes, generated in lab environment, full of errors)
  • 75. Lands End and SKUs
  • 76. Lands End and SKUs SKU: # 39072-2AH1
  • 77. Use SSA to start work on a site report card
  • 78. Use SSA to start work on a site report card SSA helps determine common information needs
  • 79. Read this: Search Analytics for Your Site: Conversations with Your Customers by Louis Rosenfeld (Rosenfeld Media, 2011) www.rosenfeldmedia.com Use code WEBDAGENE2013 for 20% off all Rosenfeld Media books
  • 80. Louis Rosenfeld lou@louisrosenfeld.com www.louisrosenfeld.com www.rosenfeldmedia.com www.slideshare.net/lrosenfeld @louisrosenfeld @rosenfeldmedia Say hello