
Data Driven Design: Using Web Analytics to Improve Information Architectures


Presentation of peer-reviewed research paper for the 2007 IA Summit, presented March 24, 2007 in Las Vegas.

Published in: Technology, Education

  1. Data Driven Design: Using Web Analytics to Improve Information Architectures. Andrea Wiggins, IA Summit 2007
  2. Motivation: What Information Architects Want to Know
      - Interviewees said:
        - Context for making design decisions
        - Validation of heuristic assumptions
        - Understand why visitors come to the site & what they seek
  3. Agenda
      - Overview for Context
      - Insert show of hands here! (topic, tools, data)
      - What is web analytics (WA)? How is it done?
        - major WA concepts
        - what the data look like
      - IA questions to answer
      - Rubinoff’s user experience audit
      - Some WA measures for heuristic validation
  4. What is web analytics?
      - Data mining from web traffic logs
        - Web server log files
        - Page tag logs from client-side data collection (end up in server logs)
        - Cookies to identify “unique visitors”
      - What for?
        - Proving web site value (ROI)
        - Marketing campaign evaluation
        - Executive decision making: markets & products
        - Web site design parameters
        - More…
  5. How do you do it?
      - Vendor analysis solutions
        - Hosted ASP
          - Currently most popular model
          - Provides traffic stats “on-demand”
        - Software
          - Runs on dedicated servers
          - Scalability: requires significant data storage space and data maintenance
        - Costs
          - Starts at FREE for Google Analytics and goes way, way up
          - Large organizations spend $50K/yr and up
      - Open source: not a robust option
  6. Very Quick Major Concepts
      - Sessionizing (cookie > IP & UA): grouping hits into visits; a cookie identifies a visitor more reliably than IP address plus user agent
      - Hits: all server requests
      - Pageviews: all server requests for page filetypes, variously defined
      - Visits & Visitors: stronger measures from sessionizing, sensitive to time periods
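
As an illustration of sessionizing, here is a minimal Python sketch that groups hits into visits. It is not from the talk: the 30-minute inactivity cutoff, the dictionary keys (`cookie`, `ip`, `user_agent`, `time`), and the function names are all assumptions made for the example.

```python
from datetime import datetime, timedelta

SESSION_TIMEOUT = timedelta(minutes=30)  # assumed inactivity cutoff, not from the slides


def visitor_key(hit):
    """Prefer the cookie ID; fall back to IP address plus user agent."""
    return hit.get("cookie") or (hit["ip"], hit["user_agent"])


def sessionize(hits):
    """Group hits (dicts with a datetime under 'time') into visits."""
    visits = []
    latest_visit = {}  # visitor key -> that visitor's most recent visit
    for hit in sorted(hits, key=lambda h: h["time"]):
        key = visitor_key(hit)
        visit = latest_visit.get(key)
        # Start a new visit for a new visitor or after a long gap.
        if visit is None or hit["time"] - visit[-1]["time"] > SESSION_TIMEOUT:
            visit = []
            visits.append(visit)
            latest_visit[key] = visit
        visit.append(hit)
    return visits
```

Because visits are cut by a time window, the same set of hits can yield different visit and visitor counts under a different cutoff or reporting period, which is the sensitivity the slide mentions.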
  7. Sample Logs
      - #Software: Microsoft Internet Information Services 6.0
      - #Version: 1.0
      - #Date: 2005-08-01 00:00:35
      - #Fields: date time cs-method cs-uri-stem cs-username c-ip cs-version cs(User-Agent) cs(Referer) sc-status sc-bytes
      - 2005-08-01 00:10:05 GET /index.htm - 216.xx.76.7 HTTP/1.1 Mozilla/4.0+(compatible;+MSIE+6.0;+Windows+98) 200 13099
      - 2005-08-01 00:10:29 GET /current.html - 216.xx.76.7 HTTP/1.1 Mozilla/4.0+(compatible;+MSIE+6.0;+Windows+98) 200 17985
      - 2005-08-01 00:11:24 GET /tickets.html - 216.xx.76.7 HTTP/1.1 Mozilla/4.0+(compatible;+MSIE+6.0;+Windows+98) 200 15689
      - 2005-08-01 00:18:06 GET /index.htm - HTTP/1.0 Mozilla/4.0+(compatible;+MSIE+6.0;+AOL+9.0;+Windows+NT+5.1;+SV1;+.NET+CLR+1.1.4322) 304 300
      - 2005-08-01 00:20:18 GET /index.htm - 68.xx.117.55 HTTP/1.1 Mozilla/4.0+(compatible;+MSIE+6.0;+Windows+NT+5.1;+SV1;+.NET+CLR+1.1.4322) 200 13099
      - 2005-08-01 00:20:21 GET /classes.html - 68.xx.117.55 HTTP/1.1 Mozilla/4.0+(compatible;+MSIE+6.0;+Windows+NT+5.1;+SV1;+.NET+CLR+1.1.4322) 200 15296
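
To show how such a log becomes analyzable data, the following Python sketch reads the `#Fields:` directive and turns each entry into a dictionary keyed by field name. It is a sketch under assumptions: the function name `parse_w3c_log` is made up, and lines whose value count does not match the field list are simply skipped. The sample lines on the slide appear to omit some fields, so they would be skipped by this parser as written.

```python
def parse_w3c_log(lines):
    """Yield one dict per log entry, keyed by the names in the #Fields directive."""
    fields = []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith("#Fields:"):
            # The directive lists the column names for the entries that follow.
            fields = line[len("#Fields:"):].split()
        elif line.startswith("#"):
            continue  # other directives (#Software, #Version, #Date)
        else:
            # Spaces inside values are logged as '+', so whitespace splitting is safe.
            values = line.split()
            if len(values) == len(fields):
                yield dict(zip(fields, values))
```

A complete entry then exposes values such as `row["cs-uri-stem"]` and `row["sc-status"]` for the measures discussed in the rest of the talk.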
  8. Spiders
      - 2005-08-01 00:49:32 GET /robots.txt - HTTP/1.0 Mozilla/5.0+ (compatible;+Yahoo!+Slurp;+ - 200 319
      - 2005-08-01 00:49:32 GET /plays/completing_dahlia.html - HTTP/1.0 Mozilla/5.0+ (compatible;+Yahoo!+Slurp;+ - 200 3507
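
Spider traffic like this is usually separated from human traffic before computing visit metrics. A rough Python sketch, assuming hits are dictionaries keyed by the W3C field names above; the marker list is an illustrative assumption and far from complete:

```python
# Assumed, incomplete list of substrings that suggest a crawler's user agent.
BOT_MARKERS = ("slurp", "googlebot", "crawler", "spider")


def is_spider(hit):
    """Flag a hit as spider traffic by user agent or a robots.txt request."""
    user_agent = hit.get("cs(User-Agent)", "").lower()
    if hit.get("cs-uri-stem") == "/robots.txt":
        return True
    return any(marker in user_agent for marker in BOT_MARKERS)


def split_traffic(hits):
    """Return (human_hits, spider_hits) so each can be analyzed separately."""
    humans = [h for h in hits if not is_spider(h)]
    spiders = [h for h in hits if is_spider(h)]
    return humans, spiders
```

Keeping the spider hits rather than discarding them leaves them available for questions about which content is being crawled.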
  9. A Few Good Metrics
      - Information Architects want to know:
        - Confirmation of heuristics
          - Do users leave at first glance of this awful page?
          - Where do they click?
          - What position on the screen or layout produces the most clicks for the same content?
          - Do the users “pogo-stick” back and forth between pages? What are they comparing?
        - Ambient findability measures
          - At what hierarchy depth do visitors enter the site? How do they get in on deep pages?
          - Do they ever see the home page?
          - Can they find their way to where we want them to go?
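
Two of the findability questions above (entry depth and home page exposure) can be approximated from sessionized page paths. This Python sketch assumes each visit is a list of URL paths in viewing order, that directory depth stands in for hierarchy depth, and that `HOME_PAGES` lists the site's home page URLs; all three are assumptions for illustration.

```python
from collections import Counter
from urllib.parse import urlsplit

HOME_PAGES = {"/", "/index.htm"}  # assumed home page URLs


def directory_depth(url):
    """Number of directories above the page: '/index.htm' is 0, '/plays/x.html' is 1."""
    return max(0, urlsplit(url).path.count("/") - 1)


def entry_report(visits):
    """Return (entry depth distribution, share of visits that saw the home page)."""
    depths = Counter(directory_depth(v[0]) for v in visits if v)
    saw_home = sum(1 for v in visits if any(page in HOME_PAGES for page in v))
    share = saw_home / len(visits) if visits else 0.0
    return depths, share
```

URL structure does not always mirror the navigation hierarchy, so on some sites a lookup table from URL to hierarchy level would be a better depth measure.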
  10. Searching for IA Answers
      - On-site search behaviors
        - How many searches do users make?
        - Do users refine their search results?
        - What type of queries do users make?
        - How often are search results the last page?
        - From what pages are searches initiated?
        - Do the search terms have context in the page from which the search is initiated?
        - Why are users querying about chimpanzees?!?
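
Several of these search questions reduce to counting over sessionized paths. A minimal Python sketch, assuming each visit is a list of URL paths, that search results live at a single URL (`/search` here, a made-up value), and that a search immediately following another search counts as a refinement:

```python
from collections import Counter


def search_behavior(visits, search_page="/search"):
    """Summarize on-site search use; `search_page` is an assumed results URL."""
    searches = refinements = exits = 0
    origins = Counter()
    for path in visits:
        for i, page in enumerate(path):
            if page != search_page:
                continue
            searches += 1
            if i == len(path) - 1:
                exits += 1  # search results were the last page of the visit
            if i > 0 and path[i - 1] == search_page:
                refinements += 1  # a search directly after another search
            elif i > 0:
                origins[path[i - 1]] += 1  # page the search was started from
    return {
        "searches_per_visit": searches / len(visits) if visits else 0.0,
        "refinements": refinements,
        "exit_rate": exits / searches if searches else 0.0,
        "origins": origins,
    }
```

The query text itself is not used here; questions about query types and page context need the search terms as well as the paths.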
  11. What IAs Want
      - Good navigation and content make the online world go ‘round
        - Where in a process do users leave? Where do they go? Do they re-enter the process?
        - How do users move through the site? Is there a better route?
        - What pages don’t get visited? What pages get unexpectedly high visits?
        - What prompts conversion?
        - Where do search engine spiders go in the site? Is the best content being indexed?
  12. Everybody Loves Rubinoff
      - UX audit quantifies subjective measures
        - Offers structure for comparing properties of the site
        - Completely customizable, use strategically
      - In a perfect world:
        - Analyst & IA work together to set key performance indicators (KPI) and measurable heuristics
        - Each independently evaluates the site on the same points, then they compare the IA’s heuristics to user data for validation
        - They set before-and-after measures to prove value for the entire project
  13. Rubinoff’s Four Categories
      - Using a sample of statements from Rubinoff’s model:
        - Branding
          - Engaging, memorable brand experience
          - Value of multimedia & graphics
        - Functionality
          - Server response time & technical errors
          - Security & privacy practices
        - Usability
          - Error prevention & recovery
          - Supporting user goals & tasks
        - Content
          - Navigation & site structure
          - Search & referrals
  14. 1a: Branding: Memorable & Engaging Experiences
      - Ratio of new to returning visitors is key; set target KPI specific to site business goals
      - Track trends over time and in relation to cross-channel marketing
      - Median visit length in minutes
      - Average visit length in pages viewed
      - Depth, breadth of visits
      - Segment new and returning visitors to examine visit trends for different audiences
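
The engagement measures listed above can be computed from a table of visits. This Python sketch assumes each visit is a dictionary with `returning` (bool), `minutes` (visit length), and `pages` (pages viewed); those keys and the function name are assumptions for the example.

```python
from statistics import mean, median


def engagement_summary(visits):
    """Ratio of new to returning visits, median minutes, and mean pages per visit."""
    new = sum(1 for v in visits if not v["returning"])
    returning = len(visits) - new
    return {
        "new_to_returning": new / returning if returning else None,
        "median_minutes": median([v["minutes"] for v in visits]),
        "mean_pages": mean([v["pages"] for v in visits]),
    }


def by_segment(visits):
    """Compute the same summary separately for new and returning visits."""
    segments = {
        "new": [v for v in visits if not v["returning"]],
        "returning": [v for v in visits if v["returning"]],
    }
    return {name: engagement_summary(vs) for name, vs in segments.items() if vs}
```

Visit duration tends to be skewed by a few very long visits, which is one reason to prefer the median for minutes, as the slide does.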
  15. 1b: Branding: Value of Multimedia & Graphics
      - Flash & AJAX require deciding upon what to measure, programming appropriate data collection, and configuring analysis tools
      - Plan to include measures when designing multimedia applications to prove value
      - Compare clickthrough rates for clickable graphics to rates for standard navigation links
      - Great tools like Crazy Egg’s heatmap: easy! (also relevant to navigation, of course)
  16. Crazy Egg Heatmap Example
  17. Crazy Egg Overlay Example
  18. Crazy Egg List Example
  19. 2a: Functionality: Response Time & Technical Errors
      - Response time is a default log field, easy to measure
      - Check at peak load time to make sure site is responding quickly enough
      - Monitor the rate of 500 (server) errors: this should be an extremely low number
  20. 2b: Functionality: Security & Privacy Practices
      - A matter of design for measurement, not measurement of design: considerations for designing a site that will be measured
        - Privacy best practices:
          - Give a short, accurate, easy to understand privacy statement and stand by your word
          - True first-party cookie
        - Security best practices: (from an IA/analytic POV)
          - SSL encryption on any transactional forms: lead generation, ecommerce, surveys
          - Secure file transfer for & restricted access to raw web analytic data; password restrictions at minimum
  21. 3a: Usability: Error Prevention & Recovery
      - Percentage of visits experiencing 404 and 500 errors: errors should be < 0.5% of all hits
      - Percentage of visits including an error that also end with an error: visitors frustrated into leaving
      - Where do 404 errors occur?
        - Use to build a redirect page list to ensure (temporary) continuity of service to bookmarked URLs
        - Path/navigation analysis: how did users arrive at 404? What did they do after?
      - User errors: identify problems & re-enact or test
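
The two error percentages above can be sketched as follows in Python, assuming each visit has already been reduced to its list of HTTP status codes in request order; the function name and data shape are assumptions.

```python
ERROR_STATUSES = {404, 500}


def error_visit_rates(visits):
    """Return (share of visits with an error, share of those that end on an error)."""
    with_error = [v for v in visits if any(s in ERROR_STATUSES for s in v)]
    ended_on_error = [v for v in with_error if v[-1] in ERROR_STATUSES]
    share_with_error = len(with_error) / len(visits) if visits else 0.0
    share_ending = len(ended_on_error) / len(with_error) if with_error else 0.0
    return share_with_error, share_ending
```

For example, `error_visit_rates([[200, 200], [200, 404, 200], [200, 404], [200, 500]])` reports that three of four visits hit an error and two of those three ended on it.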
  22. 3b: Usability: Supporting User Goals & Tasks
      - Scenario/conversion analysis
        - Define tasks and procedures supporting user goals
        - Examine completion rates, step by step, intervals & overall
          - A to B, B to C, C to D; A to C, B to D; A to D
        - Look at leakage points
          - Where did they go when they left the process? Did they come back later?
        - Shopping cart analysis
          - Keep in mind that users shop online for offline purchases
          - Do behaviors suggest a need for a tool like a shipping calculator or product comparison?
        - Online form completion
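
The step-by-step, interval, and overall completion rates (A to B, A to C, A to D, and so on) can be computed in one pass. This Python sketch assumes each visit is a list of page identifiers and that the process steps must be reached in order, though other pages may be viewed in between; the function names are made up.

```python
from itertools import combinations


def steps_reached(path, steps):
    """Count how many process steps a visit reached, in order."""
    position = 0
    for page in path:
        if position < len(steps) and page == steps[position]:
            position += 1
    return position


def funnel_rates(paths, steps):
    """Return completion rates for every pair of steps, e.g. ('A', 'C')."""
    reached = [0] * len(steps)
    for path in paths:
        for i in range(steps_reached(path, steps)):
            reached[i] += 1
    rates = {}
    for i, j in combinations(range(len(steps)), 2):
        if reached[i]:
            rates[(steps[i], steps[j])] = reached[j] / reached[i]
    return rates
```

Adjacent pairs with low rates mark the leakage points; the last page viewed in visits that stopped there shows where those users went instead.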
  23. 4a: Content: Navigation & Site Structure
      - Pogo-sticking: jumping back & forth between content or hierarchy levels (what about tabs?)
        - Need a comparison tool, can’t identify product: not enough detail at the right level of site hierarchy or step of the purchase decision process
      - Compare page-level traffic statistics for larger trends, broad navigation analysis: the usual #s
      - Path analysis on navigation tools (by type) to pinpoint navigation and labeling problems
        - Extensive use of supplemental navigation may indicate need for updates to global navigation
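
One simple way to look for pogo-sticking is to count A, B, A sequences in each visit's path. The Python sketch below assumes each visit is a list of URL paths in viewing order; treating any immediate return to the previous page as a bounce is a simplifying assumption, and as the slide notes, tabbed browsing can hide this pattern from the logs.

```python
from collections import Counter


def pogo_sticks(paths):
    """Count (hub, detail) pairs where a visitor went hub -> detail -> hub."""
    bounces = Counter()
    for path in paths:
        for first, middle, last in zip(path, path[1:], path[2:]):
            if first == last and first != middle:
                bounces[(first, middle)] += 1
    return bounces
```

Hubs with many bounces across several detail pages are candidates for a comparison tool or for more detail at the listing level.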
  24. 4b: Content: Mining Search & Referrals
      - Popularity = value? What about findability? If it’s not findable, it probably won’t be popular.
        - Compare the content’s value (against similar content) with proportions of returning visitors, average page viewing length, external referrals, especially search referrals
      - Search log analysis: what do your users value?
        - Does user query language match site contents? Are users searching for panties when you’re selling pants?
  25. Validate the Match Between the Site & the Real World
      - More ways to use search log analysis:
        - Does user vocabulary match site vocabulary?
        - Do different audiences have different vocabularies, and does the site support them equally?
        - Brand measurement returns
          - Product and industry terminology usage
          - “Accuracy” of brand queries: spelling, inclusion of competitors’ brands, advertising slogans
        - Did users find what they expected? How many visits end on search results? Null results are revealing.
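
A vocabulary check of this kind can start from the on-site search log. This Python sketch assumes the log has been reduced to `(query, result_count)` pairs and that `site_terms` is a set of lowercase words the site itself uses; both inputs and the function name are assumptions for illustration.

```python
from collections import Counter


def search_log_summary(searches, site_terms):
    """Return (null result rate, null queries, query words absent from the site)."""
    null_queries = Counter(q.lower() for q, hits in searches if hits == 0)
    unmatched = Counter(
        word
        for q, _ in searches
        for word in q.lower().split()
        if word not in site_terms
    )
    null_rate = sum(null_queries.values()) / len(searches) if searches else 0.0
    return null_rate, null_queries, unmatched
```

Frequent unmatched words are candidates for synonyms, labels, or new content; exact word matching is crude, so misspellings and plurals would need extra handling.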
  26. Language Validation
  27. Conclusions
      - Not much out there in the academic literature on using web analytics (hopefully to change!)
      - WA data is flawed and tough to handle, but ultimately pays off in developing holistic understanding of user behavior
      - Best-suited to case studies
      - WA is ripe for adoption into formal usability frameworks, particularly for persona design and determining design parameters
      - Best used iteratively: beginning, middle, end, annual follow-up…
  28. Thanks! Questions?