
Web analytics overview


Published in: Technology

  1. 1. Web Analytics Overview
  2. 2. Data Collection Mechanism • There are four core groups of data: • Clickstream • A clickstream is the sequence of clicks or pages requested as a visitor explores a website. It helps in understanding how much time visitors spend on your site, how often they return, and which pages are viewed most frequently. • Outcomes • Outcome data shows what the time visitors spent on the site actually produced, for the customer or for the company. • Research (Qualitative) • Qualitative research allows us to get really close to our customers and get a real-world feel for their needs, wants, and perceptions of interactions with our websites. • Competitive Data • Competitive data helps in understanding market trends, the role of search engines, campaigns, visitor demographics, and other statistics about competitors’ sites.
  3. 3. Clickstream Data • There are four main ways of capturing clickstream data: • Web Logs • Web Beacons • JavaScript Tags • Packet Sniffing • Let’s look at each of them in detail.
  4. 4. Web Logs • The data capture process is as follows: • A user types a URL into a browser. • The request for the web page reaches one of the web servers. • The web server accepts the request and creates an entry in the web log (including the page name, IP address, browser details, and date-time stamp). • The web server sends the web page to the visitor. [Figure: visitor browser, Internet, server, web log] • Benefits: • Easily accessible. • The only mechanism that captures search engine visits and behavior, because search engine robots don’t execute JavaScript tags. • Many log parsers are available. • You own the web log data. • Concerns: • Captures errors, server usage, and similar technical data, but is not optimally suited for business information. • Image, page error, CSS, and robot traffic must be filtered out to get an accurate traffic trend. • Page caching by ISPs and proxy servers means that some traffic is invisible. • Visitors cannot be identified accurately without setting a cookie.
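The filtering concern above can be made concrete with a short sketch. It assumes the server writes the widely used combined log format; the static-file suffixes, the robot hints, and the sample lines are illustrative assumptions, not part of the original slides.

```python
import re

# Combined log format: client IP, timestamp, request line, status, size,
# referrer, and user agent. Assumed here; real servers may be configured differently.
LOG_PATTERN = re.compile(
    r'(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) \S+" (?P<status>\d{3}) \S+ '
    r'"(?P<referrer>[^"]*)" "(?P<agent>[^"]*)"'
)

STATIC_SUFFIXES = (".css", ".js", ".png", ".gif", ".jpg", ".ico")
ROBOT_HINTS = ("bot", "crawler", "spider")  # illustrative, not exhaustive


def is_page_view(line):
    """Return True if the log line looks like a successful, non-robot page request."""
    match = LOG_PATTERN.match(line)
    if match is None:
        return False
    path = match["path"].split("?")[0].lower()
    if path.endswith(STATIC_SUFFIXES):
        return False  # image, CSS, and script requests are not page views
    if not match["status"].startswith("2"):
        return False  # page errors are excluded from the traffic trend
    agent = match["agent"].lower()
    return not any(hint in agent for hint in ROBOT_HINTS)


sample_lines = [
    '203.0.113.5 - - [10/Oct/2010:13:55:36 +0000] "GET /products.html HTTP/1.1" 200 2326 "-" "Mozilla/5.0"',
    '203.0.113.5 - - [10/Oct/2010:13:55:37 +0000] "GET /style.css HTTP/1.1" 200 512 "-" "Mozilla/5.0"',
    '198.51.100.7 - - [10/Oct/2010:13:56:01 +0000] "GET /products.html HTTP/1.1" 200 2326 "-" "Googlebot/2.1"',
    '203.0.113.9 - - [10/Oct/2010:13:57:12 +0000] "GET /missing.html HTTP/1.1" 404 128 "-" "Mozilla/5.0"',
]
page_views = [line for line in sample_lines if is_page_view(line)]
```

Matching robots by user-agent substring is only a rough heuristic; a production log parser would normally rely on a maintained robot list.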
  5. 5. Web Beacons • Web beacons are 1 × 1 pixel transparent images that are placed in web pages within an img src HTML tag. The transparent images are usually hosted on a third-party server. • Data capture mechanism: • The customer types a URL into a browser. • The request reaches one of the web servers, which sends back the page along with a GET request for a 1 × 1 pixel image from a third-party server. • As the page loads, it executes the call for the 1 × 1 pixel image, thus sending data about the page view to the third-party server. • The third-party server sends the image back to the browser along with code that can read cookies and capture anonymous visitor data such as the page view, IP address, and time. [Figure: visitor browser, Internet, server, 1 × 1 beacon, third-party server, log]
  6. 6. Continued… • Benefits: • Easy to implement, and targeted at narrow purposes such as banners and emails. • You can optimize exactly what data the beacon collects (for example, just the page viewed, or time, or cookie values, or referrers), and because robots do not execute image requests, you won’t collect unwanted data. • Useful for collecting data across multiple websites or domains for comparison (see figure). • Concerns: • Antispyware programs automatically remove third-party cookies, which makes it difficult to track visitors. • Beacons are not as expansive and customizable as JavaScript tags in terms of the data they can capture. • If image requests are turned off in email programs or browsers, you can’t collect the data. [Figure: visitor, Internet, website 1, website 2, third-party server, log]
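As a rough illustration of the beacon mechanism, the sketch below builds the image tag a page might embed and shows how the collecting server could read the reported fields back out of the request URL. The host `stats.example.com`, the parameter names `page` and `campaign`, and both function names are hypothetical.

```python
from urllib.parse import urlencode, urlparse, parse_qs

BEACON_HOST = "https://stats.example.com/pixel.gif"  # hypothetical third-party endpoint


def beacon_tag(page, campaign):
    """Build the 1 x 1 <img> tag that reports a single page view."""
    query = urlencode({"page": page, "campaign": campaign})
    return f'<img src="{BEACON_HOST}?{query}" width="1" height="1" alt="">'


def read_beacon_request(url):
    """Recover the reported fields on the collecting server's side."""
    params = parse_qs(urlparse(url).query)
    return {name: values[0] for name, values in params.items()}


tag = beacon_tag("/offers/spring", "email-42")
hit = read_beacon_request(f"{BEACON_HOST}?page=%2Foffers%2Fspring&campaign=email-42")
```

Because the data travels in the image URL, a beacon carries only what the page author encodes there, which is one reason beacons capture less than JavaScript tags.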
  7. 7. JavaScript Tags • JavaScript tagging allows more data to be collected, and collected more accurately. • Data serving is separated from data capture, which reduces reliance on corporate IT departments for data capture requests. • The data capture process is as follows: • A user types a URL into a browser. • The request for the web page reaches one of the web servers. • As the page loads, it executes the JavaScript code, which captures the page view, details about the visitor session, and cookies, and sends them to the data collection server. • In some cases, upon receipt of the first set of data, the server sends additional code back to the browser to set additional cookies or collect more data. [Figure: visitor browser, Internet, server, JavaScript tags, third-party or in-house server, log]
  8. 8. Continued… • Benefits: • Easier implementation effort, with the benefit of a massive amount of data captured. • If you don’t have (technical) access to your web servers or your web server logs, JavaScript tagging is your only choice. • Page caching is not a problem for JavaScript tagging; the analytics tools will still be able to collect data. • Greater control over exactly what data is collected, and the ability to add custom tags on special pages. • JavaScript enables you to separate data capture from data serving, thereby allowing you to set cookies. • Tracking users across multiple domains becomes easier, because your third-party cookie and its identifying elements stay consistent as visitors move across the domains where your JavaScript tags exist. • Concerns: • Not all website visitors have JavaScript turned on, often for privacy or other reasons (usually 2-6% of total users). • The data collected is divorced from other metadata, so creating tags that capture site taxonomy and hierarchy requires thought and planning. • Capturing data about downloads (for example, PDFs or EXEs) and redirects is harder than with web logs. • Visitors cannot be identified accurately without setting a cookie. • If your website is already JavaScript-heavy, your web analytics JavaScript tag could cause conflicts. • Some websites, rather than storing data in cookies or URL parameters, store it on the servers during the visitor session. In this case, the tags will not capture essential data.
  9. 9. Packet Sniffing • Packet sniffing is one of the most sophisticated ways of collecting web data. • The data capture process is as follows: • A user types a URL into a browser. • The request to the web server passes through a software- or hardware-based packet sniffer that collects attributes of the request that provide more data about the visitor. • The packet sniffer sends the request on to the web server. • The web server’s response is sent back to the customer but first passes through the packet sniffer, which captures information about the page going back and stores that data. • Some vendor packet-sniffing solutions append a JavaScript tag that can send more data about the visitor back to the packet sniffer. • The packet sniffer sends the page on to the visitor’s browser. [Figure: visitor browser, Internet, packet sniffer, server]
  10. 10. Continued… • Benefits: • No need to modify the website with JavaScript tags, because all data passes through the packet sniffer. • Implementation time is longer than for JavaScript tags but shorter than for other methods (it relies on the IT team). • Provides the ability to collect more data. For example, you can get server errors, bandwidth usage, and all the technical data as well as the page-related business data. • Provides the ability to always use first-party cookies. • Concerns: • It is difficult to convince the IT department to add an additional layer of software or hardware to the data center and route all traffic through it. • Data privacy is the biggest concern, because the sniffer captures all raw packets, which include data such as passwords, names, addresses, and credit card information. • JavaScript tags are still needed to collect optimal data from cached pages, Adobe Flash files, AJAX, or rich Internet applications. • Packet sniffers cannot capture the core structure of pages or metadata about them. • Expensive for web farm architectures.
  11. 11. Outcomes Data • Listed below are outcome data capture strategies for different businesses. • E-commerce • For e-commerce, the standard practice is to use JavaScript tags to capture data from the order confirmation page. • The data captured may include the following: - The order’s unique identifier - Product or service ordered - Quantity and price of each item - Discounts and promotions applied - Metadata about the customer session: A/B or multivariate test IDs, cookie values, etc. - Metadata about the products or services: product hierarchy, campaign hierarchy, product attributes • Lead Generation • Lead generation data may be collected on the “thank you” page (which the customer sees after successfully submitting a lead). • Partner with other websites that might be collecting and storing leads on your behalf to get the lead generation data. • Plan on identifying where the data is being captured and how you can get access to it (for example, via JavaScript tags, beacons, or database exports).
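The e-commerce fields listed above could be held in a record like the following. This is a minimal sketch: the class names, field names, and sample order are assumptions for illustration, not a schema from any analytics tool.

```python
from dataclasses import dataclass, field


@dataclass
class OrderLine:
    product: str
    quantity: int
    unit_price: float


@dataclass
class OrderOutcome:
    order_id: str                  # the order's unique identifier
    lines: list                    # products ordered, with quantity and price
    discount: float = 0.0          # discounts and promotions applied
    test_ids: dict = field(default_factory=dict)  # session metadata, e.g. A/B test IDs

    def revenue(self):
        """Order value after discounts."""
        gross = sum(line.quantity * line.unit_price for line in self.lines)
        return round(gross - self.discount, 2)


order = OrderOutcome(
    order_id="ORD-1001",
    lines=[OrderLine("widget", 2, 10.0), OrderLine("gadget", 1, 25.0)],
    discount=5.0,
    test_ids={"checkout_layout": "B"},
)
```

Keeping the test IDs on the order record is what later allows revenue to be compared across A/B variants.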
  12. 12. Continued… • Brand/Advocacy and Support • Outcomes in this case are harder to determine because we do not know whether a page view actually solved the customer’s problem. • For the longest time, the assumption was that if the user saw a certain page, the mission was accomplished. • A better way to start is to survey a statistically significant sample of site visitors and get their ratings of success. • Having an internal data warehouse gives you the flexibility to capture more data (for example, event logs from your Flash or rich Internet applications, Google search data, metadata from other parts of the company, and CRM or phone channel data). • This allows you to create a true end-to-end view of customer behavior and outcomes that can scale effectively over time.
  13. 13. Research Data • The goal of qualitative analysis is to understand the rationale behind the metrics and trends that we see and to actively incorporate the voice of the customer (VOC) into our decision making. • The following user-centric design (UCD) and human-computer interaction (HCI) methodologies are commonly used to understand the customer perspective: • Surveys • Surveys are the optimal method for collecting questionnaire-based feedback from a very large number of customers relatively inexpensively and quickly. • Conclusions based on survey data, if the survey is done right, are more accurate and reliable and provide insights that help us better understand customer perspectives. • Heuristic evaluations • Heuristic evaluations follow a set of well-established rules (best practices) in web design and in how website visitors experience websites and interact with them. • Here, a user researcher acts as a website customer and attempts to complete a set of predetermined tasks related to the website’s reason for existence (for example, trying to place an order). • In addition to the best practices, the user researcher draws on their own experience of running usability studies and their general knowledge of standard design principles.
  14. 14. Continued… • Usability testing (lab and remote) • Lab usability tests measure a user’s ability to complete tasks. • Usability tests are best for optimizing user interface (UI) designs and workflows, understanding the customer’s voice, and understanding what customers really do. • In a typical usability test, a user attempts to complete a task or set of tasks by using a website (or software or a product). • Each of these tasks has a specified goal, with effectiveness, efficiency, and satisfaction identified in a specified usage context. • Site visits (or follow-me-homes) • In a site visit, user researchers, and often other key stakeholders, go to the home or office of the customer to observe them completing tasks in a real-world environment. • You can observe customers interacting with websites in the midst of all the other distractions of their environment (for example, ringing phones and weird pop-up blockers). • This experience is very different from a lab because of the complicating environmental factors. • Choose the best-fit methodology based on the following: • Scope (both size and complexity) of the problem you are trying to solve (entire website, core segments of experience, particular pages, and so forth) • Timing (whether you need it overnight or over the next few weeks) • Number of participants (how many customers you would like feedback from)
  15. 15. Competitive Data • Competitive intelligence is key to understanding your performance in the context of the greater web ecosystem, and it lets you judge whether a certain result is caused by ecosystem trends or by your actions (or lack thereof). • A focused competitive intelligence program can help you exploit market trends, build on the success of your competitors, or optimize your search engine marketing program. • There are three main methodologies used to collect competitive data: • Panel-based measurement • Here, the participant agrees to have their web browsing behavior tracked in exchange for an incentive. • A company called ComScore Networks uses panel-based measurement to compile data that is used by many companies for competitive analysis. • ISP-based measurement • Here, the data is collected from the various Internet Service Providers (ISPs) through which users connect to the Internet while surfing the web. • Companies such as Hitwise have agreements with ISPs worldwide whereby the ISPs share the anonymous web log data collected on the ISP network with Hitwise.
  16. 16. Continued… • Search engine data • Search engines also often know information about their users (for example, search keywords, regions, and cities). • Google (Google Trends) and MSN (adLabs) have recently opened up lab/beta environments where users can run queries against their databases to glean insights into competitive data.
  17. 17. Analysis of data
  18. 18. Process of Data Analysis • Web analytics methodology has the following steps: • Defining Business Metrics (KPIs) • To get real business metrics, you need to look at your website in the context of your overall business strategy and desired user behavior. • Desired behaviors include such things as the paths you want users to take, the marketing initiatives you want them to come into contact with, and the products you want them to buy. • The second step is to monetize these desired behaviors. In other words, you should figure out the value of each behavior to your business. • Reports • A report is the representation of the metrics (or KPIs) you’ve identified and the other contributing metrics that help you better understand the details behind your performance (for example, which pages users visit after successfully submitting a request). [Figure: Business Metrics, Reports, Analysis, Optimization & Action]
  19. 19. Continued… • The data from other sources may include data from call centers, retail stores, attitudinal surveys, or information about your competitors. • Analysis • Analysis involves looking at the factors driving your performance so you can identify opportunities for improvement • Optimization and Action • There are a large number of ways you can take action and optimize a site, with the most common being A/B and multivariate testing. • You can also make changes to the design, information architecture, the structure of your promotions, and much more.
  20. 20. What does web analytics help answer?
  21. 21. How much did Visitors benefit my Business? • Here, you try to find out how well your site helps visitors accomplish the things you hoped they’d do, such as purchasing, clicking on advertisements, and subscribing. • Conversion & Abandonment • The percentage of visitors that your site converts to contributors, buyers, or users is the most important metric you can track. • Click-Throughs • For a site that relies on third-party ads, click-through data is the metric that relates directly to revenue. • Offline Activity • Many actions that start on the web end elsewhere, for example a purchase that starts online and ends in a call center. • It is important to associate such call center requests with the online support information (by providing a unique code to visitors) to understand the effectiveness of a website. • User-Generated Content • If your site thrives on user-generated content (UGC), contribution is key. • You need to know how many people are adding to the site, either as editors or commenters, and whether your contributors are creating the content that your audience wants.
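Conversion and abandonment reduce to simple ratios over the steps of a funnel. The sketch below is illustrative only; the funnel steps and visitor counts are made-up numbers, and `funnel_rates` is a hypothetical helper.

```python
def funnel_rates(step_counts):
    """Return overall conversion and per-step abandonment for an ordered funnel.

    step_counts: list of (step_name, visitors_reaching_step), first step first.
    """
    first = step_counts[0][1]
    last = step_counts[-1][1]
    conversion = last / first if first else 0.0
    abandonment = {}
    # Abandonment at a step is the share of its visitors who never reach the next step.
    for (name, count), (_, next_count) in zip(step_counts, step_counts[1:]):
        abandonment[name] = 1 - next_count / count if count else 0.0
    return conversion, abandonment


funnel = [("product page", 1000), ("cart", 200), ("checkout", 100), ("confirmation", 50)]
conversion, abandonment = funnel_rates(funnel)
```

Reporting abandonment per step, rather than only the overall conversion rate, shows where in the process visitors are being lost.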
  22. 22. Continued… • Subscriptions • Some media sites offer premium subscriptions that give paying customers more storage, downloadable content, better bandwidth, and so on. • This can be the main revenue source for analyst firms, writers, and large media content sites such as independent video producers. • Additional bandwidth costs money, so subscriptions need to be monitored for cost to ensure that the premium service contributes to the business as a whole. • Billing and Account Use • If you’re running a subscription website—such as a SaaS application—then your subscribers pay a recurring fee to use the application. • This is commonly billed per month, and may be paid for by the individual user or as a part of a wider subscription from an employer. • It’s essential to track billing and account use, not only because it shows your revenues, but also because it can pinpoint users who are unlikely to renew. • You’ll need to define what constitutes an “active” user, and watch how many of your users are no longer active. You’ll also want to watch the rate of nonrenewal to measure churn.
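Tracking activity and churn starts with choosing a definition of “active”. The sketch below assumes a 30-day window and a simple period churn rate; the window, the function names, and the sample data are all assumptions for illustration.

```python
from datetime import date, timedelta

ACTIVE_WINDOW = timedelta(days=30)  # assumed definition of an "active" user


def inactive_subscribers(last_login_by_user, today):
    """Return subscribers whose last login is older than the active window."""
    return sorted(
        user for user, last_login in last_login_by_user.items()
        if today - last_login > ACTIVE_WINDOW
    )


def churn_rate(subscribers_at_start, non_renewals):
    """Share of subscribers at the start of a period who did not renew."""
    return non_renewals / subscribers_at_start if subscribers_at_start else 0.0


logins = {
    "ana": date(2024, 3, 28),
    "ben": date(2024, 1, 15),
    "cho": date(2024, 2, 20),
}
at_risk = inactive_subscribers(logins, today=date(2024, 4, 1))
monthly_churn = churn_rate(subscribers_at_start=200, non_renewals=10)
```

Subscribers flagged as inactive are candidates for outreach before their renewal date, which is the practical use of watching account activity alongside billing.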
  23. 23. Where is my Traffic coming from? • Getting the right visitors to your site takes a combination of affiliate marketing, search engine marketing, and search engine optimization. • Referring Websites • If the user followed a link to a page from elsewhere, the browser includes a referring URL. This lets you know who’s sending you traffic. • If you know the page that referred visitors, you can track those visits back to the site that sent them and see what’s driving them to you. Remember, however, that you need to look not only at who’s sending you visitors, but also at who’s sending you the ones that convert. • Inbound Links from Social Networks • An increasing number of visitors come to you from social networks. • If your media site breaks a news story or offers popular content, social communities will often link to it. This includes not only social news aggregators like reddit or Digg, but also bloggers, comment threads, and sites like Twitter and Facebook. • Site operators need to identify the source of the traffic so they can engage the people who brought them the attention. • Visitor Motivation • Sometimes the only way to get inside a visitor’s head is to ask her, using surveys and questions on the site. This approach is called the voice of the customer (VOC). • It involves finding out what customers are trying to accomplish, whether they planned to purchase, which products and services they are considering, and so on.
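To see which referrers send visitors that convert, and not just visitors, referring URLs can be grouped by domain. This is a minimal sketch; the function names, the "(direct)" label, and the sample visits are assumptions.

```python
from collections import Counter
from urllib.parse import urlparse


def referrer_domain(referrer_url):
    """Reduce a referring URL to its host; an empty referrer counts as direct traffic."""
    host = urlparse(referrer_url).netloc.lower()
    if host.startswith("www."):
        host = host[4:]
    return host or "(direct)"


def conversions_by_referrer(visits):
    """visits: iterable of (referrer_url, converted) pairs.

    Returns {domain: (visit_count, conversion_rate)}.
    """
    total, converted = Counter(), Counter()
    for referrer_url, did_convert in visits:
        domain = referrer_domain(referrer_url)
        total[domain] += 1
        if did_convert:
            converted[domain] += 1
    return {d: (total[d], converted[d] / total[d]) for d in total}


visits = [
    ("https://www.reddit.com/r/analytics", False),
    ("https://www.reddit.com/r/webdev", False),
    ("https://twitter.com/someone/status/1", True),
    ("", True),
]
report = conversions_by_referrer(visits)
```

In this made-up sample, reddit.com sends the most visits but none of them convert, which is exactly the distinction the slide warns about.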
  24. 24. What is Working Best (and Worst)? • Site Effectiveness • A site that convinces visitors to purchase more than they initially intended is an effective site. • Many e-commerce sites suggest related purchases or offer package deals (up-selling). Similarly, collaborative sites track how many visitors subscribe to mailing lists or RSS feeds. These are important metrics to track. • Ad and Campaign Effectiveness • With the exception of organic traffic, most visitors arrive because of a campaign. • This may be an online campaign (banner ads, sponsorship, or paid content), an offline campaign such as a movie trailer or radio spot, or simply good word of mouth and an informal community. • Analytics applications can segment incoming traffic by campaign to measure how much each campaign helped the bottom line. • Search Effectiveness • Users prefer to search for what they’re looking for and choose from the results rather than browse through several levels of a directory. • If this search data is tied into analytics, we can measure search effectiveness and then label and index the site better.
  25. 25. Continued… • Trouble Ticketing and Escalation • An increase in call center activity and support email messages are sure signs of a broken site. • Site operators need to track the volume of trouble tickets related to the website, and ideally relate those trouble tickets to the user visits that cause them in order to speed up problem diagnosis. • Content Popularity • Media sites are about content. The successful ones put popular content on the home page, alongside ad space for which they charge a premium. • Who you attract with your content, and what they do afterward, is an important part of what works best. In other words, content popularity has to tie back to site goals rather than just page views. • Usability • No site will succeed if it’s hard to use. • Focus groups and prerelease testing can identify egregious errors before you launch, but there’s no substitute for watching a site in production.
  26. 26. Continued… • User Productivity • User productivity looks at whether visitors can accomplish their tasks quickly and without errors. • Every website operator should care whether visitors can accomplish their goals, but for SaaS sites this is particularly important, as users may spend their entire workday interacting with the application. • Community Rankings and Rewards • For sites like Wikipedia, it’s sometimes important to watch the top contributors and the rewards given for contribution.
  27. 27. How good is my relationship with my Visitors? • Once you’ve got your site in order, traffic is flowing in, and you’re making the most of all of your visitors, it’s time to be sure your relationship with them is long and fruitful. • Loyalty • The best visitors are those who keep coming back. Thanks to browser cookies, most web analytics applications show the ratio of new to returning visitors. • Strike a healthy balance here: get new blood so you can grow, but encourage existing visitors to return so they become regular buyers or contributors. • The average time between visits and the number of users who no longer engage with the site are the metrics to watch for user engagement. • Enrollment • Enrollment is valuable because consumers are increasingly skeptical of web marketing. • Enrollment also provides better targeting. You can ask subscribers for demographic information such as gender, interests, and income, then tailor your messages, and those of your advertisers, to your audience. • Reach • Whether through email subscriptions, alerts, or RSS feeds, reach is the measurement of how many enrolled visitors actually see your messages. • Reach is a far more meaningful measure than raw subscription counts, since it discounts “stale” enrollments and shows how well your outbound messages, blogs, and alerts result in action.
  28. 28. How healthy is my Infrastructure? • Slow page loads or excessive downtime can undermine even the best-designed, most effective, easiest-to-use website. Hence, end-user monitoring is needed. • Availability and Performance • The most basic metrics for web health are availability (is it working?) and performance (how fast is it?). • These can be measured on a broad, site-wide basis by running synthetic tests at regular intervals, or they can be measured for every visit to every page with real user monitoring (RUM). • Service Level Agreement Compliance • If people pay to use your site, you have an implied contract or Service Level Agreement (SLA) that you’ll be available and usable. • A properly crafted SLA includes not only acceptable performance and availability, but also time windows, stakeholders, and which functions of the website are covered. • Measure and report the metrics that make up an SLA regularly to both your colleagues and your customers. • Content Delivery • Content delivery is important for media companies; for example, a Flash ad may be measured for its delivery. • Users may need to interact with the content (by rolling over the ad, clicking a sound button, or playing the ad), after which the user either clicks on the offer or ignores it.
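Availability and performance from synthetic tests can be summarized as in the sketch below. It assumes each check is recorded as a success flag plus a response time, and it uses the nearest-rank percentile method; the sample checks are invented.

```python
import math


def availability(results):
    """Fraction of synthetic checks that succeeded."""
    return sum(1 for ok, _ in results if ok) / len(results)


def percentile(values, pct):
    """Nearest-rank percentile of a non-empty list of numbers."""
    ordered = sorted(values)
    rank = max(1, math.ceil(pct / 100 * len(ordered)))
    return ordered[rank - 1]


# (succeeded, response_time_seconds) for each synthetic check
checks = [(True, 0.8), (True, 1.1), (False, 30.0), (True, 0.9), (True, 2.4)]
uptime = availability(checks)
p95 = percentile([seconds for ok, seconds in checks if ok], 95)
```

Percentiles are used instead of the average so that a few very slow responses stay visible rather than being smoothed away. Failed checks are excluded from the latency figure here, which is a design choice and not the only reasonable one.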
  29. 29. Continued… • Hence it is important to track content engagement, attention, completion of the media, pauses, and so on. • Capacity and Flash Traffic • When a community (such as blogs or social news aggregators) suddenly discovers content that it likes, the result is a flash crowd (thousands of visitors). • While flash crowds create dramatic bursts of traffic, a gradual, sustained increase in traffic can sneak up on you and consume all available capacity. • You need to monitor long-term increases in page latency or server processing, or decreases in availability, that may be linked to increased demand for your website. • Impact of Performance on Outcomes • Poor performance has a direct impact on outcomes like conversion rate, as well as on user productivity. • Responsive websites lead to increased productivity, while slow sites encourage distraction. • The relationship between performance and conversion can be measured on an individual basis, by making performance a metric that’s tracked by analytics and by segmenting conversion rates for visitors who had different levels of page latency.
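Segmenting conversion by page latency, as described above, can be sketched as follows. The bucket thresholds, function names, and sample visits are assumptions chosen for illustration.

```python
from collections import defaultdict

BUCKETS = [(2.0, "under 2s"), (5.0, "2-5s")]  # assumed thresholds, in seconds


def latency_bucket(seconds):
    """Map an average page latency to a named bucket."""
    for limit, label in BUCKETS:
        if seconds < limit:
            return label
    return "5s or more"


def conversion_by_latency(visits):
    """visits: iterable of (avg_page_latency_seconds, converted) pairs."""
    counts = defaultdict(lambda: [0, 0])  # label -> [visits, conversions]
    for seconds, converted in visits:
        entry = counts[latency_bucket(seconds)]
        entry[0] += 1
        entry[1] += int(converted)
    return {label: conv / total for label, (total, conv) in counts.items()}


visits = [
    (1.2, True), (1.5, True), (1.8, False),
    (3.0, False), (4.1, True),
    (6.5, False), (7.0, False),
]
rates = conversion_by_latency(visits)
```

A falling conversion rate across the slower buckets is the kind of evidence that ties infrastructure work to business outcomes, though a real analysis would need far more visits per bucket than this toy sample.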
  30. 30. Continued… • Traffic Spikes from Marketing Efforts • Marketing campaigns should drive site traffic. • You need to identify the additional volume of visitors to your site not only for marketing reasons, but also to understand the impact that marketing promotions have on your infrastructure and capacity. • Seasonal Usage Patterns • If your business is highly seasonal, you need to understand historical usage patterns • It helps to understand usage trends so you can plan for capacity changes & to meet long-term SLAs.
  31. 31. How Am I Doing Against the Competition? • In addition to monitoring your own website and the communities that affect your business, you also need to watch your competition. • Site Popularity and Ranking • Most startups and media outlets are judged by their monthly unique visitor count • Keep a watch on Google PageRank; Google Trends; Google Insights; & • If you’re a media site or portal that has to report traffic estimates as part of your business, ComScore and Nielsen dominate traffic measurement, with Quantcast and Hitwise as smaller alternatives. • How People Are Finding My Competitors • Knowing which organic terms are leading visitors to your competitors helps you understand what customers are looking for and how they’re thinking about your products or services. • On the other hand, using a competitor’s web domain, you can find out what search terms the market thinks apply to your product category and change your marketing accordingly
  32. 32. Continued… • Relative Site Performance • Compare your competitors’ performance and availability with your own and with industry benchmarks. • You may want to set up a synthetic testing service for competitors’ sites, or even a transactional benchmark that shows the difference in performance across similar workflows. • Competitor Activity • Use alerts to monitor competitor activity, such as changes with business impact to competitors’ pages: pricing information, financing, media materials, screenshots, and executive teams.
  33. 33. Where Are My Risks? • You need to monitor your site for abusive content, legal liabilities, and anonymous detractors attacking you publicly. • Trolling and Spamming • Any website that offers comment fields, collaboration, and content sharing will become a target for two main groups of mischief-makers: spammers and trolls. • Hence, monitor the number of users who exhibit unwanted behaviors, the percentage of spammy comments, the traffic sources that generate spam, and the volume of community flags. • Copyright and Legal Liability • If your site lets users post content, you may have to take steps to ensure that this content isn’t subject to copyright held by other organizations. • Best practices today are to ask users to confirm that they are legally permitted to post the content, and to provide links for someone to report illegal content. • Fraud, Privacy, and Account Sharing • Safeguarding your visitors’ personally identifiable information is a legal obligation. • Make sure you’ve got plenty of detailed log files to monitor breaches in privacy or cases of fraud. • Watch for fraud related to account sharing by keeping track of the number of concurrent logins per account and the number of states and distinct user agents from which a user has logged in.
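The account-sharing signals mentioned above can be counted from login records. This is a rough sketch under stated assumptions: the event tuple layout, the thresholds, and the function name are hypothetical, and concurrent-session counting is left out for brevity.

```python
from collections import defaultdict


def sharing_signals(login_events, max_agents=2, max_regions=2):
    """Flag accounts seen from unusually many user agents or regions.

    login_events: iterable of (account, user_agent, region) tuples.
    The thresholds are assumed values, not recommendations.
    """
    agents = defaultdict(set)
    regions = defaultdict(set)
    for account, user_agent, region in login_events:
        agents[account].add(user_agent)
        regions[account].add(region)
    return sorted(
        account for account in agents
        if len(agents[account]) > max_agents or len(regions[account]) > max_regions
    )


events = [
    ("acct-1", "Firefox/Windows", "CA"),
    ("acct-1", "Safari/macOS", "NY"),
    ("acct-1", "Chrome/Linux", "TX"),
    ("acct-2", "Chrome/Windows", "WA"),
    ("acct-2", "Chrome/Windows", "WA"),
]
flagged = sharing_signals(events)
```

A flag here is only a prompt for review; legitimate users also switch devices and travel, so thresholds would need tuning against real data.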
  34. 34. What Are People Saying About Me? • You should subscribe to a keyword across various types of sites (blogs, mailing lists, news aggregators, etc.), then review the results wherever someone is talking about things that matter to your organization. • Site Reputation • Keep track of site reputation by watching Google PageRank, Technorati ranking, StumbleUpon rating, and other Internet ranking tools. • Trends • Use Google Trends, Yahoo! Buzz, and Google Insights to understand the relative popularity of content on the Internet in order to optimize the wording of your site or to downplay aging themes. • Social Network Activity • Monitor search results for your company name, URL, product names, executives, and relevant keywords across social sites like Digg, Summize, and Twitter, as well as any that are relevant to your particular industry or domain.
  35. 35. How Are My Site and Content Being Used Elsewhere? • Track and monitor other parties that are using your site. They may be doing so as part of a mashup or by running search engine crawlers, or they may be competitors checking up on you. • API Access and Usage • Your site may offer formal web services or APIs that let your users access your application programmatically through automated scripts. • Keep track of the traffic volume and number of requests for each API you offer, the number of failed authentications to the API, the number of API requests by developer, and the top URLs by traffic and request volume. • Mashups, Stolen Content, and Illegal Syndication • Your site’s data can easily appear online in a mashup. By combining several sites and services, web users can create a new application, often without the original sites knowing it. • If this is happening to you, you’ll see referring URLs belonging to the mashup page, and you can track back to that URL to determine where the traffic is coming from and take action if needed. • Try to treat mashups as business opportunities, not threats. If you have interesting content, find a way to deliver it that benefits both you and the mashup site. • Integration with Legacy Systems • Some SaaS applications connect to their subscribers’ enterprise software through dedicated links in order to exchange customer, employee, and financial data. • Such calls may degrade the performance of the website, so keep track of the volume and performance of API calls between the application and enterprise customers or data partners.
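The API metrics listed above can be tallied from request records in a few lines. The tuple layout, the treatment of HTTP 401 and 403 as failed authentication, and the sample requests are assumptions made for this sketch.

```python
from collections import Counter


def api_usage_summary(requests):
    """Summarize API traffic.

    requests: iterable of (endpoint, developer, http_status) tuples.
    """
    by_endpoint = Counter()
    by_developer = Counter()
    failed_auth = 0
    for endpoint, developer, status in requests:
        by_endpoint[endpoint] += 1
        by_developer[developer] += 1
        if status in (401, 403):  # assumed to indicate a failed authentication
            failed_auth += 1
    return {
        "by_endpoint": dict(by_endpoint),
        "by_developer": dict(by_developer),
        "failed_auth": failed_auth,
        "top_endpoint": by_endpoint.most_common(1)[0][0] if by_endpoint else None,
    }


api_requests = [
    ("/v1/orders", "dev-a", 200),
    ("/v1/orders", "dev-b", 401),
    ("/v1/users", "dev-a", 200),
]
summary = api_usage_summary(api_requests)
```

The same per-developer counts can feed rate limiting or outreach to heavy users, depending on whether the usage is welcome.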
  36. 36. References • Web Analytics: An Hour a Day, by Avinash Kaushik • Complete Web Monitoring, by Alistair Croll and Sean Power • Groundswell: Winning in a World Transformed by Social Technologies • 244205.shtml