How Just a Little Data Analysis Can Improve your Content

Slides from a webinar given on February 5th, 2013, organized by Comtech Services (http://comtech-serv.com/). Abstract as follows:

In the past, it was often difficult for information development teams to obtain quantitative data on how their content was used. In recent years, with the spread of online content delivery, it has become easier to obtain such data. Now, the challenge is how to interpret it in order to make content more effective.

In this webinar, Joe Pairman from HTC's User Education team will show how content usage data, ratings by users, and search query records can:
• Indicate appropriate vocabulary
• Contribute to taxonomy development
• Suggest areas of focus for content improvements
• Help to answer specific questions about designing effective content

Published in: Business

  1. How just a little data analysis can improve your content
     Joe Pairman
     Image: Listening for movement in the mine. National Institute for Occupational Safety and Health (NIOSH). www.flickr.com/photos/25069384@N03/2492849690/
  2. Background
     • DITA XML implementation at HTC: effective web content a primary driver
     • From “How do we design for the web?” to “What can we learn from the web?”
     • Co-ordinated analytics and user feedback plan
     • Main focus is improving content
     • This presentation covers methods, tips, and lessons learned from that
     • Exploration of ideas rather than a technical guide
     Footer: Introduction | How just a little data analysis can improve your content | Joe Pairman
  3. Slide types
     • Ideas and overviews
     • Cautionary notes
     • How to
     • Tips and insights
  4. Examples in this presentation
     Online knowledge base of support articles for a fictitious e-reader device
     Image: http://commons.wikimedia.org/wiki/File%3AEbook_reader_icon.png By netalloy (Open Clip Art Library images page) [see page for license], via Wikimedia Commons
  5. The predominant flavor of web analytics
     “This is a fast-growing category that's generated tremendous interest in recent years due to the advertising and marketing value derived from tracking and understanding user behavior.” Morville & Rosenfeld, Information Architecture for the World Wide Web, 3rd Edition (emphasis added)
     • Much web analytics aims to directly improve sales
     • In contrast, content-based sites focus on delivering effective information
     • Of course (for a commercial site), the goal is still sales, but indirectly
  6. What can web data tell us about content?
     • What people are searching for, and the language they use to search for it
     • What they’re viewing and how long they’re staying there
     • (With a ratings system) How much they like what they’re seeing
     • (With a combination of metrics) What we can focus on for improvement
     • What effect particular qualities have (graphics, word count, links, etc.)
  7. You need…
     • Access to analytics data
     • A significant body of homogeneous content, such as a knowledge base or an established blog
     • Significant views of that content
     • Data such as searches, page views, ratings
  8. What can’t web data tell us?
     • How to design our content (it can suggest which things work better but in the end we still need a coherent design)
     • Why the patterns exist (interpretation is up to us)
     • What the full context is
  9. Ultimately
     The data provides focus and pointers, not answers
  10. Search terms
  11. What can search query data tell us?
     • Top searches (so crucial content)
     • The vocabulary that customers use
     • The way that customers classify things
     • And much more
  12. External search vs. site search 1
     Site search
       Pros
       • Users more likely to know what they’re looking for?
       • A much wider range of data available
       Cons
       • Still only those who made it to your site
       • Poorer range and quality of results may drive people away from your site search
     External search
       Pros
       • Potentially many more queries
       • A much wider range of search terms
       • Increasingly, Google is where people search first
       Cons
       • Google encrypted search: now up to a third of queries may not have associated terms
  13. External search vs. site search 2
     • It’s possible that site search is used more for technical or specialized info
     • But some argue against this
     • Best way would be to actually compare external (referral) to site (local) terms
       Rosenfeld, Louis. 2011. Search Analytics for Your Site. New York: Rosenfeld Media. www.rosenfeldmedia.com/books/searchanalytics/
     • External is probably still the best way to get started
  14. Processing search terms 1: Data collection
     • Even if your content is only one section of the site, it’s best to get the whole site’s search queries
     • If there are a lot, try using a phrase to filter, such as "how to". Also filter out the obviously irrelevant terms
     • But if you do this, compare with other sources to make sure the sample is not too skewed
  15. Processing search terms 2: Common phrases
     • Filter out small words: and, the, a
     • Consider getting 2- and 3-word phrases too: back up ≠ back + up
     • Even at this stage the results may be very interesting
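The phrase counting on slide 15 could be sketched in Python roughly as follows. This is a minimal illustration, not the presenter's script: the function name `count_phrases`, the stop-word list, and the sample queries are all made up for the example.

```python
from collections import Counter

# Hypothetical stop-word list (the slide names only "and", "the", "a").
STOP_WORDS = {"and", "the", "a"}

def count_phrases(queries, n):
    """Count n-word phrases across queries, ignoring stop words."""
    counts = Counter()
    for query in queries:
        words = [w for w in query.lower().split() if w not in STOP_WORDS]
        for i in range(len(words) - n + 1):
            counts[" ".join(words[i:i + n])] += 1
    return counts

# Made-up sample queries for a fictitious e-reader.
sample_queries = [
    "how to back up the notes",
    "back up books",
    "turn up the brightness",
]
single_words = count_phrases(sample_queries, 1)
two_word_phrases = count_phrases(sample_queries, 2)
```

Here `single_words` counts "up" three times, while `two_word_phrases` shows that only two of those belong to "back up", which is the distinction the slide draws. Because stop words are dropped before phrases are formed, a phrase can span a removed word; that is a simplification of this sketch.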
  16. Processing search terms 3: Categorizing
     • Based on the frequent keywords, draft out categories. Not too granular; the idea is to make big baskets to categorize quickly.
     • Categorize the original search terms, based on these categories (automate this!). Anything uncategorized goes in “Other”.
     • Spot check your categorized terms so far.
     • Look at “Other”, and think up new categories.
     • Iterate a couple of times. Probably some manual categorization at the end.
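One way to automate the categorizing step on slide 16 is simple keyword matching. The sketch below assumes hypothetical categories and keywords; the names `CATEGORIES` and `categorize` are illustrative and do not come from the presentation.

```python
from collections import Counter

# Hypothetical categories and keywords, drafted from frequent terms.
CATEGORIES = {
    "Backup": ["back up", "backup", "restore"],
    "Display": ["screen", "display", "brightness"],
}

def categorize(query):
    """Return the first category whose keyword appears in the query."""
    text = query.lower()
    for category, keywords in CATEGORIES.items():
        if any(keyword in text for keyword in keywords):
            return category
    return "Other"

# Made-up sample queries.
sample_queries = ["back up books", "screen too dim", "sync notes"]
by_category = Counter(categorize(q) for q in sample_queries)
uncategorized = [q for q in sample_queries if categorize(q) == "Other"]
```

Reviewing `uncategorized` after each pass is the "look at Other and think up new categories" step: add keywords or categories, rerun, and repeat a couple of times.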
  17. Using search data 1: Prioritization
     • Do you have gaps? Are you putting energy into the right places?
  18. Using search data 2: Language
     • Based on your categories, look into the language that people actually search for most: display or screen? storage, memory, or just space?
     • Best place for frequent terms is the page title; next is the intro paragraph
     • After that, try to get terms into the body of the page
     • Last resort is index or other non-visible keywords (but that’s mostly for internal site search, not external searches)
     • Strike a balance between using a range of terms and “stuffing”
  19. Using search data 3: Classification
     • How do your site users classify subject areas? For example, a UI-driven category of “Sharing” might not match users’ distinct searches for recommend a book and sync notes
     • If designing from scratch (or doing a big revamp), this work should probably come first
     • Search terms seem particularly amenable to a flat, “tagging” approach, but can be informative no matter the approach
  20. Other avenues for exploration
     • Segmentation by screen size / geography / language
     • Social media monitoring
     • Further site search data such as audience and searches with no results
  21. Page views and time on page
  22. Food for thought
     [Chart (simulated data); axes: Page views, Pages]
  23. High (unique) page views
     • Some indication of what's popular
     • Compare with search keyword categories, to identify gaps
     • Doesn’t identify whether the pages are doing a good job, or even if they’re actually the things users were looking for
  24. Low (unique) page views
     • Generally could indicate candidate for removal, but...
     • Could be not effective information on a “niche” topic
     • Could be useful but not findable
  25. Time on page
     • Seems appealing at first: longer means better (up to a point)?
     • But people can just leave a page open
     • Some pages might be harder to read than others, so take longer?
     • Some topics are just deeper than others
     • However, low time on page could be useful...
  26. Time on page correlates with related keywords
     • When people land on a page that wasn’t what they wanted, they don’t tend to stay long
     • Pages with an average time of less than a minute could be flagged
     • Though tip-style pages may have short time on page but still be popular
  27. Page ratings
  28. What can ratings tell us?
     • Do people like the page or not? (For whatever reason.)
     • Can be a good metric, when combined with other data. A simple example:
       High positive ratings ratio
         High page views: Good
         Low page views: Could be helpful info on a niche subject, or perhaps is hard to find
       Low positive ratings ratio
         High page views: Needs improvement
         Low page views: Possible candidate for removal?
  29. Cautions about ratings
     • Avoid assumptions. “Not helpful” doesn’t always mean the page content is unsuitable for its purpose.
     • Don’t use in isolation.
     • Combine with qualitative data if at all possible. Comments, usability studies, social media monitoring, etc.
  30. What you need…
     • A rating per page
     • Should have at least the ability to rate positively and negatively (not just "like", which is dubious: people don't even remember what they liked and why)
     • Not really about lengthy surveys; they are a separate thing and require a lot more preparation
  31. Getting a better response rate
     • Keep the ratings system as simple as possible
     • If there’s the chance to provide a comment, make sure this shows up after a rating is selected
       Kohavi, R; Henne, R; and Sommerfield, D: Practical Guide to Controlled Experiments on the Web: Listen to Your Customers not to the HiPPO (Slides from talk on Controlled Experiments). www.exp-platform.com/Documents/controlledExperimentsHippoEbay.pdf
  32. How to prepare ratings data
     • Make sure it's comparable, i.e. don't compare the product section to the support section
     • If it's binary (helpful or not), divide positive by negative: 756 helpful divided by 230 not helpful gives a helpfulness ratio of 3.29
     • Even if you have multiple negative options, sum them and do the same, though hang on to the source data; it could be useful
     • You end up with a list of pages, ranked by their helpfulness ratio
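The helpfulness ratio on slide 32 is a single division, so a short Python sketch is enough to show the ranking step. Only the 756 / 230 figures come from the slide; the other pages and counts are made up, and returning `None` for pages with no negative ratings is an assumption of this sketch rather than something the presentation specifies.

```python
def helpfulness_ratio(helpful, not_helpful):
    """Positive ratings divided by negative ratings, as on the slide."""
    if not_helpful == 0:
        # Assumption: a page with no negative ratings has no defined ratio.
        return None
    return helpful / not_helpful

# Made-up counts; only the first row is the slide's worked example.
ratings = {
    "Back up your books": (756, 230),
    "Change the font size": (120, 95),
    "Reset the device": (40, 0),
}
ratios = {page: helpfulness_ratio(*counts) for page, counts in ratings.items()}

# Rank the pages that have a ratio, most helpful first.
ranked = sorted(
    (page for page, ratio in ratios.items() if ratio is not None),
    key=lambda page: ratios[page],
    reverse=True,
)
```

If the rating widget offers several negative options, sum them into `not_helpful` before calling the function, and keep the raw counts alongside the ratio.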
  33. If few rate a page, do the ratings count?
     • Response rate may correlate with helpfulness ratio (so don’t ignore pages with low response rate)
       [Chart (simulated data); axes: Response rate, Helpfulness]
     • Response rate is a useful metric in itself
  34. Making a dashboard with synthetic metrics
  35. A combination of metrics could indicate …
     • Which pages need to tackle their subject more effectively
     • Which pages need to be more findable (similar to above but not the same)
     • Which pages need to discourage wrong searches (different again)
     • Which pages are candidates for removal
     • Which pages work well (so are examples to follow)
  36. Dashboard: overview
  37. Relative measures are fine
     • What’s a good helpfulness ratio? How many page views do we need?
     • Very hard to answer these kinds of questions (especially at first)
     • Rather, focus on relative measures: which pages are comparatively weak or strong
  38. Calculating low, medium, & high rankings
     • For each metric, create a column to show whether the page is in the bottom third, middle third, or top third
     • In Excel, use something like this:
       =IF(RANK(AC2,AC:AC)>ROUND(COUNT(AC:AC)*2/3,0),"Low",IF(RANK(AC2,AC:AC)>ROUND(COUNT(AC:AC)*1/3,0),"Medium","High"))
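For anyone taking the Python route, the Excel formula on slide 38 could be mirrored roughly as below. The function names and the sample numbers are hypothetical. The sketch assumes Excel's default descending `RANK` (1 is the largest value, ties share a rank) and rounds halves upward as Excel's `ROUND` does for positive numbers, since Python's built-in `round` behaves differently on exact halves.

```python
import math

def excel_round(value):
    """Round half up, as Excel's ROUND does for positive numbers."""
    return math.floor(value + 0.5)

def tercile_labels(values):
    """Label each value Low, Medium, or High by its descending rank."""
    count = len(values)
    low_cutoff = excel_round(count * 2 / 3)
    medium_cutoff = excel_round(count * 1 / 3)
    labels = []
    for value in values:
        # Like Excel's RANK: 1 is the largest value, and ties share a rank.
        rank = 1 + sum(1 for other in values if other > value)
        if rank > low_cutoff:
            labels.append("Low")
        elif rank > medium_cutoff:
            labels.append("Medium")
        else:
            labels.append("High")
    return labels

page_views = [900, 120, 450, 30, 610, 75]  # made-up numbers
labels = tercile_labels(page_views)
```

The rank calculation compares every value with every other, which is fine for a few hundred pages; a sort-based approach would suit much larger lists.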
  39. Synthesizing metrics
     • Indicators for “Improve searchability”: high helpfulness ratio, low page views, and response rate at least medium
  40. Ratings with other metrics
     • Improve content? Low helpfulness, and page views are at least medium
     • Improve searchability? Low page views, high helpfulness, and response rate at least medium
  41. Ratings with other metrics
     • Unrelated searches: may be indicated by low time on page; check keywords for these (remember tip-type pages may have low time on page too)
     • Consider getting rid of: low page views, and low or medium helpfulness
  42. Ratings with other metrics
     • Good topics: high helpfulness, and at least medium response rate and page views
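The rules on slides 39 to 42 translate fairly directly into code once each metric has a Low / Medium / High label. The following Python sketch is one possible reading; the function name `indicators` and the flag strings are chosen for illustration. It leaves out the "unrelated searches" rule, which depends on time on page and a follow-up keyword check.

```python
AT_LEAST_MEDIUM = {"Medium", "High"}

def indicators(helpfulness, page_views, response_rate):
    """Return the dashboard flags that apply to one page's tercile labels."""
    flags = []
    if helpfulness == "Low" and page_views in AT_LEAST_MEDIUM:
        flags.append("Improve content?")
    if (page_views == "Low" and helpfulness == "High"
            and response_rate in AT_LEAST_MEDIUM):
        flags.append("Improve searchability?")
    if page_views == "Low" and helpfulness in {"Low", "Medium"}:
        flags.append("Consider getting rid of")
    if (helpfulness == "High" and response_rate in AT_LEAST_MEDIUM
            and page_views in AT_LEAST_MEDIUM):
        flags.append("Good topic")
    return flags

# Example: a helpful page that few people find.
example_flags = indicators("High", "Low", "Medium")
```

Returning a list keeps the rules independent of each other, so a change to one threshold does not silently affect the others.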
  43. Further research into a (potential) problem page
     • Does it really have a problem? For example, the time on page may be low, but ratings very good. Is it a short, tip-style page?
     • How do people get there? Where do they go when they leave? (Search terms, navpaths, exits.)
     • Is there anything the good pages have in common that the problem ones don’t? (See next section …)
  44. Investigating specific attributes
  45. Ratings ratios for answering specific questions
     • Are pages with graphics more helpful?
     • Is it better to have more subtopics on a page?
     • Does the number of links on a page affect bounce rate?
  46. Looking at relationships
     • Excel CORREL function (0.3 or above is respectable)
     • Scatter chart, with optional trend line
     • But remember that correlation is not causation!
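Excel's CORREL returns the Pearson correlation coefficient, which can also be computed in a few lines of Python. The sketch below is illustrative only: the function name `pearson` and the data (graphics per page against helpfulness ratio) are made up.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient, the statistic Excel's CORREL returns."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return covariance / (spread_x * spread_y)

# Made-up data: graphics per page and helpfulness ratio for six pages.
graphics = [0, 1, 1, 2, 3, 4]
helpfulness = [1.1, 1.4, 0.9, 2.0, 1.8, 2.6]
r = pearson(graphics, helpfulness)
```

The function fails with a division by zero if either list has no variation (for example, every page has the same number of graphics), which is a reasonable signal that the question cannot be answered from that data.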
  47. Correlating with XHTML / XML structure
     • For example, pages with more graphics: <img> or perhaps <fig>
     • More subtopics on a page: <h2> or perhaps use information from DITA maps
     • Several ways to automate this: Python with the LXML library is powerful and not too intimidating
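Counting structural elements, as slide 47 suggests, might look like the sketch below. To stay self-contained it uses the standard library's `xml.etree.ElementTree` instead of the lxml library the slide recommends; `lxml.etree` offers a largely compatible interface. The function name `count_elements` and the sample page are hypothetical.

```python
import xml.etree.ElementTree as ET

def count_elements(xhtml, tag):
    """Count elements with the given local name, ignoring any namespace."""
    root = ET.fromstring(xhtml)
    return sum(1 for el in root.iter() if el.tag.rsplit("}", 1)[-1] == tag)

# Made-up XHTML page for a fictitious e-reader knowledge base.
sample_page = """<html xmlns="http://www.w3.org/1999/xhtml">
  <body>
    <h1>Back up your books</h1>
    <h2>Before you start</h2>
    <p><img src="cable.png" alt="USB cable"/></p>
    <h2>Copy the files</h2>
    <p><img src="folder.png" alt="Books folder"/></p>
  </body>
</html>"""

graphics = count_elements(sample_page, "img")
subtopics = count_elements(sample_page, "h2")
```

This only works on well-formed XHTML or XML. Running it over every page gives one count per page, which can then be set against the helpfulness ratio or bounce rate.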
  48. Bounce rate
     • Why should we try to keep people on the site? Don't we want to give them the answer and then have them leave satisfied?
     • However, bounce rate can indicate things like whether links are being used (correlate links on page to bounce rate)
  49. Combining ratings with non-web data
     • Assign human-judged ratings and see how they match up. (Is a particular word usage important? Friendly style?)
     • (For support content) Matching to support call issues. What types of pages are used more on the web vs. called about?
  50. Next steps
  51. Web data in the whole organization
     • Content teams should have access to the data
     • Can not only improve content but provide valuable feedback for other groups in the organization
     • Resourcing may require persuasion
     • Potential legal issues may need to be addressed
     • Once we have the data, we need to treat it responsibly
  52. Schedule
     • Search terms: every six months
     • Synthetic metrics dashboard: every month or two
     • Specific questions: as necessary
  53. General principles
     • Always present data in terms of the question it’s aiming to answer (though it’s good to explore the data first)
     • Surprises are good. They indicate that you're not just confirming your prejudices.
     • Don't assume that your data answers the question. Be very suspicious. Use all other sources possible. And use common sense.
     • Watch your resources.
     • Analytics is not going to write your content or guarantee its success. And it's reactive: it only measures what's there, not what could be there.
  54. Further information
  55. Useful resources
     • Search Analytics for Your Site, by Louis Rosenfeld: a thorough and thought-provoking investigation of applications for internal site search data. (Also see slide deck with some key points at the same link.) www.rosenfeldmedia.com/books/searchanalytics/
     • Best Practices for “Was this helpful?”: a discussion about the design of page ratings systems. www.ixda.org/node/24101
     • For “Was this page helpful” data, should I take response rate into account?: a question with some useful comments and answers. stats.stackexchange.com/questions/46428/for-was-this-page-helpful-data-should-i-take-response-rate-into-account
  56. A simple synthetic metrics dashboard: steps
     In Excel:
     1. Get data from each source such as your analytics tool and your ratings database. Get the data in any format that Excel can open.
     2. Combine the data from different sources. Use the VLOOKUP formula if the value you’re matching on is to the left of other values; INDEX and MATCH if not. If matching on page title, remember to allow for any underscores / percent-encoded characters / garbled characters.
     3. Calculate rankings for key metrics. See slide 38. An example formula:
        =IF(RANK(AC2,AC:AC)>ROUND(COUNT(AC:AC)*2/3,0),"Low",IF(RANK(AC2,AC:AC)>ROUND(COUNT(AC:AC)*1/3,0),"Medium","High"))
     4. Set synthetic metric indicators. See slide 39. An example formula:
        =IF(AND(AC2="Low", OR(L2="High", L2="Medium"), OR(N2="High", N2="Medium")), "1","")
     Or, get your data as CSV/TSV, do steps 2-4 with a Python script, write to a CSV file, and then open the result in any spreadsheet package.
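For the Python route mentioned on slide 56, step 2 (combining sources on page title) is where underscores and percent-encoded characters tend to cause mismatches. The sketch below shows one way to normalize titles before joining and then write CSV output. All names and figures are hypothetical, and the data is held in dictionaries rather than read from real export files.

```python
import csv
import io
from urllib.parse import unquote

def normalize_title(title):
    """Decode percent-encoding and treat underscores as spaces."""
    return unquote(title).replace("_", " ").strip().lower()

# Made-up exports; in practice these would be read from CSV files.
page_views = {"Back_up_your_books": 900, "Change%20the%20font%20size": 120}
helpfulness = {"Back up your books": 3.29, "Change the font size": 1.26}

# Join the two sources on the normalized title.
views_by_key = {normalize_title(t): v for t, v in page_views.items()}
rows = []
for title, ratio in helpfulness.items():
    key = normalize_title(title)
    rows.append({"page": title,
                 "page_views": views_by_key.get(key, ""),
                 "helpfulness_ratio": ratio})

# Written to an in-memory buffer here; open a real file with newline="" to save it.
output = io.StringIO()
writer = csv.DictWriter(output, fieldnames=["page", "page_views", "helpfulness_ratio"])
writer.writeheader()
writer.writerows(rows)
csv_text = output.getvalue()
```

Pages missing from one source get an empty cell rather than being dropped, so gaps stay visible when the file is opened in a spreadsheet package.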
  57. Find me on Google+ via: joepairman.com
