SIKM Leaders July 2012 - Understanding your Search Log

Transcript of "SIKM Leaders July 2012 - Understanding your Search Log"

  1. Search analytics – Understanding the long tail
     SIKM Leaders, July 2012
     Lee Romero
     blog.leeromero.org
  2. About me
     My background and early career are both in software engineering. I've worked in the knowledge management field for the last 12+ years – almost all of it in the technology of KM. I've worked with various search solutions for the last 7-8 years – and spent most of that time trying to figure out how to measure their usefulness and improve them in any way I can.
     I've spoken at both Enterprise Search Summit and Taxonomy Boot Camp twice.
     My writings on search analytics have been featured by a number of experts in the field, including Lou Rosenfeld and Avi Rappoport.
  3. Search Analytics
     Definition: Search analytics is the field of analyzing and aggregating usage statistics of your search solution to understand user behavior and to improve the experience.
     Some search analytics are focused on SEO / SEM activities (for internet searches). The focus here will be on enterprise search, so I will primarily be focusing on the aspect of improving the user experience. Further, I will primarily focus here on keyword search and understanding the user language found in search logs.
     Always remember – analytics without action does not have much value.
  4. The challenge of your search log
  5. Understanding your search log
     For enterprise search solutions¹, the "80-20" rule is not true. The language variability is very high in a couple of ways (covered in the next few slides). Yet having a good understanding of the language, frequency and commonality in your search log is critical to being able to make sustainable improvements to your search.
     The remainder of this presentation first provides some evidence supporting my claim and then will cover some ideas and research into this problem.
     ¹ This does not seem to apply equally to e-commerce solutions.
  6. Some facts about search terms
     There's an anecdote that goes something like, "80% of your searches are from 20% of your search terms."
     • Equivalently, some will say that you can make a significant impact by paying attention to a few of your most common terms (you can, but in limited ways).
     Fact: in enterprise search solutions the curve is much shallower.
     [Chart: the inverted power curve for two different solutions I'm currently working with.]
     In the second case, it takes 13% of terms to cover 50% of searches – and that is over 7,000 distinct terms in a typical month!
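The "shallow curve" claim can be checked on any log by computing how far down the frequency-ranked term list you must go to cover a given share of searches. A minimal sketch (the toy log and the `coverage_depth` name are invented for illustration):

```python
from collections import Counter

def coverage_depth(search_log, target=0.5):
    """Return the fraction of distinct terms (ranked by frequency)
    needed to cover `target` fraction of all searches."""
    counts = Counter(search_log)
    total = sum(counts.values())
    covered = 0
    for rank, (_, n) in enumerate(sorted(counts.items(), key=lambda kv: -kv[1]), 1):
        covered += n
        if covered / total >= target:
            return rank / len(counts)
    return 1.0

# Toy log: two common terms plus a long tail of one-off searches.
log = ["vpn"] * 5 + ["payroll"] * 3 + ["a", "b", "c", "d", "e", "f"]
print(coverage_depth(log))  # 2 of 8 distinct terms cover half of searches
```

On a real enterprise log, this fraction is the "13% of terms for 50% of searches" figure above; a steeper (e-commerce-like) curve would return a much smaller value.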
  7. Some facts about search terms: part 2
     Another myth: a large percent of searches repeat over and over again.
     Fact: on enterprise search solutions, there is surprisingly little commonality month-to-month. Over a recent six-month period, which saw a total of ~289K distinct search terms, only 11% of terms occurred in more than 1 month!

     # of months   # of terms   % of searches
     1             257,665      89.2%
     2             17,994       6.2%
     3             5,790        2.0%
     4             2,900        1.0%
     5             2,019        0.7%
     6             2,340        0.8%
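Month-over-month commonality of this kind can be computed from per-month sets of distinct terms. A minimal sketch with toy data (the month sets are invented):

```python
from collections import Counter

def repeat_fraction(months):
    """Fraction of distinct terms that occur in more than one month.
    `months` is a list of sets of distinct terms, one set per month."""
    seen = Counter()
    for terms in months:
        for term in terms:
            seen[term] += 1
    repeats = sum(1 for n in seen.values() if n > 1)
    return repeats / len(seen)

months = [{"vpn", "payroll", "sap"}, {"vpn", "travel"}, {"vpn", "benefits"}]
print(repeat_fraction(months))  # only "vpn" repeats: 1 of 5 terms
```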
  8. Some facts about search terms: part 3
     Another myth: a good percentage of your search terms will repeat in sequential periods.
     Fact: there is much more churn even month-to-month than you might expect – in the period covered below, only about 13% of terms repeated from one month to the next (covering about 36% of searches).
  9. What to do with your search log?
     The summary of the previous slides:
     • It is hard to understand a decent percentage of terms within a given time period (month)!
     • If you could do that, the problem during the next time period isn't that much easier!
     The next sections describe a couple of research projects I've been working on to tackle these issues.
  10. Understanding your users' information needs
  11. Categorizing your users' language
     Given the challenges previously laid out, using the search log to understand user needs seems very challenging. Beyond the first several dozen terms, it is hard to understand what users are looking for.
     • And those several dozen terms cover a vanishingly small percentage of all searches!
     However, it would be very useful to understand your users' information needs if we could somehow understand the entirety of the search log. How do we handle this? Categorize the search terms!
  12. Categorizing your users' language, p2
     So we need to categorize search terms to really be able to understand our users' information needs. To do this, we face two challenges:
     1. What categorization scheme should we use?
     2. How do we apply categorization in a repeatable, scalable and manageable way?
     For the first challenge, I would recommend you use your taxonomy (you do have one, right?). The second challenge is a bit more difficult but is addressed later in this deck.
  13. Categories to use
     Proposal: start with your own taxonomy and its vocabularies as the categories into which search terms are grouped. Some searches will not fit into any of these categories, so you can anticipate the need to add further categories.
     As an aside, this exercise actually provides a great measurement tool for your taxonomy:
     • You can quantitatively assess the percent of your users' language that is classifiable with your taxonomy
     • A number you may wish to drive up over time (through evolution of your taxonomy)
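The "percent of your users' language that is classifiable" metric falls out directly once terms carry categories. A small sketch (the counts and category map are toy values):

```python
def classifiable_share(term_counts, categorized):
    """Share of searches (weighted by frequency) whose term falls into
    some taxonomy category – a health metric for the taxonomy."""
    total = sum(term_counts.values())
    covered = sum(n for term, n in term_counts.items() if term in categorized)
    return covered / total

terms = {"vpn": 50, "payroll": 30, "xyzzy": 20}       # term -> # of searches
cats = {"vpn": "IT", "payroll": "HR"}                 # categorized terms
print(classifiable_share(terms, cats))  # 80 of 100 searches are classifiable
```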
  14. Automating categorization
     Now we turn to the hairier challenge – how can we categorize search terms?
     To describe the problem, we have:
     1. A set of categories, which may be hierarchically related (most taxonomies are)
     2. A set of search terms, as entered by users, that need to be assigned to those categories
     [Diagram: search terms on the left, categories on the right, with the mapping between them marked as unknown.]
  15. Automating categorization, p2
     The proposed solution is based on a couple of concepts:
     1. You can think of this categorization problem as search!
     2. You are taking each search term and searching in an index in which the potential search results are categories!
     Question: what is the "body" of what you are searching?
     Answer: previously-categorized search terms!
     Using this approach, you can consider the set of previously-categorized search terms as a corpus against which to search. You can apply all of the same heuristics to this search as any search:
     • Word matching (not string matching)
     • Stemming
     • Relevancy (word ordering, proximity, # of matches, etc.)
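The matching idea can be sketched with a toy word-overlap matcher. This is a deliberately crude stand-in for the word matching, stemming and relevancy heuristics named above, and `categorize`, `known` and the sample terms are all invented for illustration:

```python
def categorize(term, categorized):
    """Match `term` against previously-categorized terms by word overlap.
    `categorized` maps known search terms to their categories.
    Returns the best category and its score (Jaccard similarity)."""
    words = set(term.lower().split())
    best, best_score = None, 0.0
    for known, category in categorized.items():
        known_words = set(known.lower().split())
        score = len(words & known_words) / len(words | known_words)
        if score > best_score:
            best, best_score = category, score
    return best, best_score

known = {"vacation policy": "HR", "sap login": "IT Systems"}
print(categorize("policy for vacation days", known))  # matches the HR term
```

A real implementation would plug in a search engine's scoring (stemming, proximity, ordering) in place of the Jaccard score, but the shape of the problem – term in, ranked categories out – is the same.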
  16. Automating categorization, p3
     [Diagram: uncategorized search terms and sets of previously-categorized terms feeding into a matching process, which produces category assignments.]
     The red oval represents the "matching" process – it takes as input the search terms to be categorized and the set of categories along with previously-matched search terms, and produces as output a set of categories associated with the new search terms.
  17. Automating categorization, p4: Bootstrapping
     This approach depends on matching to previously-categorized terms.
     • Every time you categorize a new search term, you expand the set of categorized terms, enabling more matches in the future.
     Bootstrapping: you can take the names of the categories (the terms in your taxonomy) as the first set of "categorized search terms."
     • This allows you to start with no search terms having been categorized at all
     • You run a first round of matching against the categories to find first-level matches
     • Take those that seem like "good" matches and pull those into the set of categorized search terms for a second iteration, etc.
     • Using this in initial testing resulted in 10% of distinct terms from a month being associated with at least one category
     Another aspect: any manual categorization of common search terms will add to the success of categorization.
  18. Automating categorization, p5: Iterative
     [Diagram: the matching process run repeatedly – each pass's new categorizations are added to the previously-categorized terms that feed the next pass.]
  19. Automating categorization, p5: Iterative
     This approach also needs to be applied iteratively.
     • You start with a set of categorized search terms and a new set of (uncategorized) search terms
     • You then apply this matching to the uncategorized search terms, getting a set of newly-categorized search terms (with some measure of probability of "correctness" of the match, i.e., relevancy)
     • You pull in the newly-categorized search terms and run the matching process again
     • Each time, as you expand the set of categorized search terms (from a previous match), you increase the possibility of more matches (in subsequent matches)
  20. Automating categorization, p6: Iterative
     It will be beneficial to have a human review the set of matches for each iteration and determine if they are accurate enough.
     • The measurement of relevancy is intended to do this but would likely only be partially successful.
     Over time, using this process, you build up a larger and larger set of categorized search terms.
     • This makes it more likely in future iterations that more terms will be categorizable.
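The bootstrapped, iterative loop described on the last few slides can be sketched as follows. Everything here is a toy: `simple_match` stands in for the real relevancy-based matcher, the threshold replaces human review, and the seed and terms are invented:

```python
def simple_match(term, categorized):
    """Toy matcher: a term matches a known term if they share any word."""
    words = set(term.split())
    for known, category in categorized.items():
        if words & set(known.split()):
            return category, 1.0
    return None, 0.0

def iterate_categorization(uncategorized, categorized, match, threshold=0.5):
    """Repeatedly match uncategorized terms against the growing set of
    categorized terms; stop when a pass yields no new matches."""
    remaining = set(uncategorized)
    while True:
        newly = {}
        for term in remaining:
            category, score = match(term, categorized)
            if category is not None and score >= threshold:
                newly[term] = category
        if not newly:
            return categorized
        categorized.update(newly)   # grow the corpus for the next pass
        remaining -= set(newly)

# Bootstrapped from a single taxonomy term: "expense report" matches on the
# first pass, which then lets "report template" match on the second pass.
result = iterate_categorization(
    ["expense report", "report template", "xyz"],
    {"expense": "Finance"}, simple_match)
print(result)
```

Note how "report template" is only reachable because an earlier pass added "expense report" to the corpus, which is exactly the expansion effect the slide describes; "xyz" is left for human review.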
  21. Automating categorization, p7: No matches
     There will always be search terms that do not get matched.
     • This may be because the terminology used does not match
     • This may be because there are no categories in the global taxonomy that would be useful for categorization
     The first issue would require a human to recognize the association (thus categorizing the term and then enabling matches on future uses of that term). The second issue would require adding in new categories (not part of the global taxonomy):
     • And then categorizing the term into the newly-added category(ies)
  22. Summary
     With this approach, we can take a set of search terms at any time and categorize them (partially) automatically.
     • Over time, the accuracy of the matching will improve through human review-and-approval of matches.
     We then are able to relate these information needs to a variety of other pieces of data:
     • Volume of content available to users – significant mismatches can highlight the need for new content
     • Rating of content in these categories – can highlight that a particular area of interest has content but it isn't quality content
     • Downloads of content in these categories – could highlight navigational issues (e.g., when a category is much more highly represented in search than in downloads)
     This does not require directly working with end-users and is scalable.
  23. Additional benefits: Measuring your taxonomy
     As mentioned earlier, part of the challenge will be that there will be terms that do not match the starting categories (i.e., the global taxonomy). This actually highlights some valuable insight obtainable from this:
     • We can identify gaps in our taxonomy (terms requiring new categories)
     • We can identify areas of our taxonomy where we have many search terms associated with a taxonomy term and consider if we need to either add or split terms in order to better match our users' real language
     • We can identify areas of the taxonomy that are of little use in terms of the language used by our users
  24. Additional benefits: Linguistic statistics
     Word counts – independent of term usage, what are the most common individual words?

     Word         Distinct Terms   Searches
     management   3,128            8,283
     sap          1,931            3,873
     strategy     1,414            3,728
     business     1,558            3,599
     it           1,343            2,992
     process      1,515            2,920
     data         1,264            2,899
     project      1,249            2,823
     model        1,296            2,791
     plan         987              2,170

     Word networks – we can understand the inter-relationships between individual words (which pairs occur commonly together, which words occur commonly for a given word).
     These are not as much about information needs as about understanding the language users use (so this insight can help shape categorization). These are also very useful to prioritize your efforts in reviewing your search logs.
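Word counts like those in the table can be derived directly from the raw log. A minimal sketch (the toy log and `word_stats` name are invented):

```python
from collections import Counter

def word_stats(search_log):
    """For each word: (a) the number of distinct terms containing it and
    (b) the total number of searches whose term contains it."""
    distinct, searches = Counter(), Counter()
    term_counts = Counter(search_log)
    for term, n in term_counts.items():
        for word in set(term.lower().split()):
            distinct[word] += 1
            searches[word] += n
    return distinct, searches

log = ["change management", "management model", "change management", "sap"]
d, s = word_stats(log)
print(d["management"], s["management"])  # 2 distinct terms, 3 searches
```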
  25. Additional benefits: Comparing to your content space
     With the statistics described in the previous slide, you could conceivably compare them to the same analysis applied to your "content space." For example, derive the statistics for the titles of content available in your search.
     • Do you find significant differences? This could represent differences between the names people apply to things and what they expect to use to find the content.
     Another interesting angle is to use other controlled lists as the matched terms in a category:
     • People names (applied this and found about 8% of terms match a person's name)
     • Client names
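A rough version of the search-log-vs-content-titles comparison might look like this. The vocabularies here are toy sets; real use would feed in the word statistics from the previous slide and a vocabulary built from content titles:

```python
def vocabulary_gap(search_words, title_words):
    """Fraction of common search-log words that never appear in any
    content title – a crude 'language mismatch' signal."""
    missing = [w for w in search_words if w not in title_words]
    return len(missing) / len(search_words)

search_top = {"timesheet", "payslip", "vpn"}          # from the search log
title_vocab = {"time", "reporting", "vpn", "network"}  # from content titles
print(vocabulary_gap(search_top, title_vocab))  # 2 of 3 words have no title match
```

A large gap suggests users' names for things ("payslip") differ from the names used in content ("time reporting") – exactly the mismatch the slide is probing for.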
  26. Understanding the quality of your users' experience
  27. The Problem
     Search sucks!
     Yes, the common refrain from many users – "search doesn't return what I'm looking for" or "I can never find what I'm looking for."
     There are many tools available to improve the users' experience, including:
     • Improving the UI
     • Improving the content included
     • Manipulating settings in the engine to modify relevancy calculations, possibly even changing the engine itself
     The challenge for many of these is: once you make a change, how do you know it has improved the results?
  28. A solution?
     One way to assess the impact is to have a set of users perform either a set of pre-defined searches or a set of their own searches and then evaluate the quality of results. The challenge with this is that it is very labor intensive, can take a long calendar time and is hard to do iteratively.
     An alternative could be to automate this evaluation!
     It is important to keep in mind that this is not about the relevancy of the results or determining whether the engine is returning the "right" items.
     • It's about assessing the user-perceived quality of a set of results given a set of criteria for a search.
  29. Automating evaluation
     The idea is to automate some of the analysis of the quality of the result set by examining properties of the result set. This approach attempts to perform a simple test similar to what a human user would do in scanning a set of search results.
     • It uses the data returned by the search engine and displayed on the first page of results
     • It does not do a "deep" review of content
  30. The approach
     The algorithm takes the following approach:
     • For each search term, it executes the query against the search engine and retrieves the results
       ‒ For each individual result, it calculates a quality score from 0.0 to 1.0 (a higher score implies the result looks like a better result)
       ‒ The individual scores for a search term's set of results are averaged to get a single score for that search term
     • In addition, the current POC outputs data in a tabular format including most of the individual elements returned by the search engine along with the derived score
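The score-then-average step can be sketched as follows. All names here are hypothetical, and `toy_score` is a deliberately crude stand-in for the real per-result scorer:

```python
def score_result_set(results, score_result):
    """Average per-result quality scores (each 0.0-1.0) into a single
    score for the search term; an empty result set scores 0.0."""
    if not results:
        return 0.0
    return sum(score_result(r) for r in results) / len(results)

# Toy per-result scorer: 1.0 if the query word appears in the title.
def toy_score(result, query="travel"):
    return 1.0 if query in result["title"].lower() else 0.0

results = [{"title": "Travel policy"}, {"title": "Expense report"}]
print(score_result_set(results, toy_score))  # one of two results matches: 0.5
```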
  31. What are we looking at in assessing quality?
     Facets that influence quality – focusing primarily on user-visible aspects:
     • First page
     • Result set size
     • Snippet
     • Title
     • Age
     • Uniqueness of title
  32. What are we looking at in assessing quality?
     Factors that influence quality:
     • Only examining the first page of results
     • Similarity / dissimilarity of keywords to title
     • Similarity / dissimilarity of keywords to excerpt
     • Uniqueness of titles within the result set (just first page)
     • Size of total result set
     • Age of results
     • Looking for specific "known" targets
     • (One "cheat") Presence of keywords in "concepts" identified by engine
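A few of these factors can be combined into an illustrative per-result scorer. The weights and the five-year age decay are invented for this sketch, not the values used in the actual tool, and the sample result is hypothetical:

```python
from datetime import date

def result_quality(keywords, result, today=date(2012, 7, 1)):
    """Illustrative per-result score from three of the listed factors:
    keyword/title overlap, keyword/excerpt overlap, and age."""
    words = set(keywords.lower().split())
    title_sim = len(words & set(result["title"].lower().split())) / len(words)
    snippet_sim = len(words & set(result["snippet"].lower().split())) / len(words)
    age_years = (today - result["modified"]).days / 365.0
    freshness = max(0.0, 1.0 - age_years / 5.0)   # linear decay over 5 years
    return 0.5 * title_sim + 0.3 * snippet_sim + 0.2 * freshness

good = {"title": "Global Travel Policy",
        "snippet": "Our travel policy for employees",
        "modified": date(2011, 7, 1)}
print(result_quality("travel policy", good))  # near 1.0: fresh, on-topic result
```

In the real tool each factor would presumably get a tunable weight, which is what makes the weight-adjustment experiments on the later validation slides possible.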
  33. What are we looking at in assessing quality?
     Others that may be explored:
     • Balance across sources of content (does it match the overall ratio?)
     • Ratings of individual results
     • Web domain of content (following an internet expectation that "some sources are better than others")
     • Match of terms could be altered to consider synonyms
     • Examining taxonomy values
       ‒ Could apply matching to taxonomy values?
       ‒ Could be a "bonus" to items that have taxonomy?
     • May want to make weights (e.g., impact of age) consider source or class of content
     • Currently, in our search engine, best bets are automatically included.
       ‒ Would prefer to have them not included to see where they end up organically.
     • Also, in our search engine, the exact order on a page has not been replicated, so we can't include the exact order as a factor
  34. Validating the approach
     Does this reflect how a human user would perceive the quality?
     • This idea seems reasonable, but do we really have a way to determine if it is valid?
       ‒ Or do we run the risk that this would lead to "local maximums" for the factors measured but not meaningfully improve the user's experience?
     • So far, I have 2 independent ways to assess this:
       ‒ Comparing the results of this against a human assessment
       ‒ Comparing the results of this against other factors that have been used as indicators of quality in the past
  35. Validating the approach, p2
     Comparing against a human assessment:
     • One of our on-going operations in GCKM is to review the quality of results for a very small number of terms.
       ‒ The chart takes the output of the most recent of these reviews for a subset of our "super search terms" and compares it against the programmatically calculated quality.
       ‒ There is at least a correlation between the automated score (the Y axis) and the manual score (the X axis).
     [Scatter chart: automated score vs. manual score; trend line y = 0.2781x + 0.3826, R² = 0.5803.]
  36. Validating the approach, p3
     Comparing against searches/term:
     • Within our search program, we use the ratio of searches per visit for a term as an indicator of the quality of the results.
       ‒ The more pages of results a user looks at for a term, the harder it is for the user to find what they are looking for.
       ‒ The chart compares searches/visit (X axis) against the automated quality score (Y axis).
       ‒ Again, we can see that there is a correlation, though perhaps not as strong as compared to the manual review.
     [Scatter chart: searches/visit vs. automated quality score; trend line y = -0.6857x + 55.234, R² = 0.5225.]
  37. Validating the approach, p4
     Summing up:
     • At this point, I am confident that the quality assessment we are producing automatically is reflecting the user's general experience.
       ‒ On individual items, it can vary significantly, but in aggregate it appears to be valid.
       ‒ I have not yet dug into this, but the automation enables the weights of each factor to be adjusted, and it's possible that we can get the automated score closer still to the "real" quality of results through adjusting weights.
  38. Additional benefits of this tool
     Better analysis:
     • Given that this utility can output data in a spreadsheet format, this presents some other capabilities:
       ‒ Estimate total "search impressions" for specific targets
         • Analyze "search impressions" vs. usage
       ‒ Analyze spread of returned results across sources
       ‒ Analyze quality along a variety of dimensions (source, taxonomy values, etc.)
       ‒ Compare result sets between terms that should show similar results
         • E.g., how similar are the results really for two synonyms?
       ‒ Compare result sets along a temporal dimension
         • How much change is there from one month (or week) to the next?
       ‒ Analyze factors by depth into the "long tail"
       ‒ Evaluate the quality of results for auto-complete terms
  39. Quality of results split by taxonomy on the content
     Better analysis – examples:
     • Quality of results averaged over the service area assigned to content
     [Bar chart: quality by service area of content – Enterprise Applications, Human Capital (Consulting), Outsourcing, Strategy & Operations, Technology Integration – with the overall average marked; values range roughly from 33.5 to 38.0.]
  40. Quality of results by depth into the "long tail"
     Better analysis – examples:
     • A chart of the quality of the result pages by how far into the long tail a search term is
     [Scatter chart: quality by depth into the "long tail" (term rank from 0 to ~17,500); trend line y = 55.685x^-0.14, R² = 0.5253.]
  41. Quality over time – comparing before and after an upgrade
     Better analysis – examples:
     • This chart shows the # of terms by their change in quality through an upgrade of our search engine – the overall change was +2%!
     [Histogram: number of terms by change in quality, roughly -46% to +81%, split into "Worse" and "Better" halves around zero.]
  42. And, finally
     For more about search analytics, I would highly recommend:
     • "Search Analytics for Your Site" by Lou Rosenfeld
     • www.searchtools.com – edited by Avi Rappoport
     Also, you can find my own writings on search analytics (along with a variety of other KM topics) on my blog:
     • blog.leeromero.org