clickstream analysis


Business Analytics

Published in: Engineering

  1. 1. Introduction A clickstream is the recording of the parts of the screen a computer user clicks on while web browsing or using another software application. As the user clicks anywhere in the webpage or application, the action is logged on a client or inside the web server, as well as possibly the web browser, router, proxy server.
  2. 2. Introduction Clickstream analysis is useful for web activity analysis, software testing, market research, and for analyzing employee productivity. Clickstream as defined by Internet Advertising Bureau (IAB) : “The electronic path a user takes while navigating from site to site, and from page to page within a site. It is a comprehensive body of data describing the sequence of activity between a user’s browser and any other Internet resource, such as a Web site or third party ad server”
  3. 3. Methodology The click stream data is analyzed to identify different paths taken by the visitors and the sequence of pages that lead to payment of membership fee. Based on this analysis, specific strategies are recommended to maximize the revenue for the website. The main point of clickstream tracking is to give webmasters insight into what visitors on their site are doing.
  4. 4. Data Data is obtained from the site in the form of click stream records. Each record consists of the details of clicks by the visitors and each record contains the following details: Server IP Client IP Time stamp with Date Status: HTTP Status code URL requested: has three subfields namely The request method, resource requested and the protocol used No. of bytes transferred The country of origin for a specific request is identified using the IP address.
  5. 5. Data URL is used to identify the information/web page browsed by the visitors. Time stamp of each click is used to sequence the movement of the visitors across different pages in the website. Identifying a unique user session is an important step in the analysis of click stream data. Inactivity for more than 30 minutes is considered as a break of session. This is an approximation since there could be multiple users accessing from the same IP, or the same user accessing from different IPs. Due to lack of more data available we consider hits from each unique IP as belonging to a unique user for a unique session.
  6. Technology-Enabled Approaches: The Web provides marketers with huge amounts of information about users, and this data is collected automatically. Server-side data collection: log file analysis (historical data) and real-time profiling (tracking user clickstreams). Client-side data collection: cookies. Data mining. These techniques did not exist prior to the Internet. They allow marketers to make quick, responsive changes to web pages, promotions, and pricing. The main challenge is analysis and interpretation.
  7. Web server log files: Web servers typically log (record) each HTTP request automatically. A server log is a log file (or several files) that a server automatically creates and maintains to record the activity it performs. A typical example is a web server log, which maintains a history of page requests. Most log file formats can be extended to include “cookie” information, which allows you to identify a user at the “visitor” level.
  8. Web Server Logging – How Does it Work? Web servers such as Apache or Microsoft IIS record activity as they receive and fulfill requests. Web servers provide general-purpose logging at a very detailed level. To prepare the data for analysis, the web team must clean and organize the log records, which is a big job.
  9. Web Server Logging – A Log Record Example
  10. What log files can record: number of requests to the server (hits); number of page views; total unique visitors (using “cookies”); the referring web site; number of repeat visits; time spent on a page; route through the site (click path); search terms used; most and least popular pages.
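A few of the metrics listed above (hits, page views, unique visitors, most popular page) could be computed along these lines. This is a sketch under stated assumptions: the input is a list of `(visitor_id, resource)` pairs, the visitor ID comes from a cookie, and a "page view" is any request that is not for a static asset. The suffix list and the name `summarize` are illustrative, not taken from the slides.

```python
from collections import Counter

# Assumption: requests for these file types are hits but not page views.
ASSET_SUFFIXES = (".css", ".js", ".png", ".jpg", ".gif", ".ico")


def summarize(requests):
    """Compute basic traffic metrics from (visitor_id, resource) pairs."""
    page_requests = [
        (visitor, resource)
        for visitor, resource in requests
        if not resource.lower().endswith(ASSET_SUFFIXES)
    ]
    page_counts = Counter(resource for _, resource in page_requests)
    return {
        "hits": len(requests),              # every request counts as a hit
        "page_views": len(page_requests),   # hits minus static assets
        "unique_visitors": len({visitor for visitor, _ in requests}),
        "most_popular": page_counts.most_common(1)[0][0] if page_counts else None,
    }
```

The gap between hits and page views is why the two are reported separately: one page view usually triggers several hits for images and stylesheets.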
  11. Software for log file analysis (web analytics): The market leader is Webtrends.
  12. How do you use log files effectively? (1) Identify leading indicators of business success. (2) Identify the key performance metrics with which to measure them. (3) Establish benchmarks to track changes over time. (4) Configure software and use settings consistently.
  13. Shortcomings of log file analysis: It cannot identify individual people; the log file records the computer's IP address and/or the “cookie”, not the user. Information may be incomplete because of caching. Assumptions made in defining “user sessions” may be incorrect. This is why benchmarking is so important: focus on trends rather than absolute numbers.
  14. Log file analysis is a useful tool to: identify what visitors are looking for; see what content they find most interesting; see which search and navigation tools they find most useful; check whether promotions are succeeding; identify normal volatility in usage levels; and measure growth in site usage compared with overall web usage.
  15. Enhancing marketing tactics using web analytics, some examples: Identify the point of drop-off in the registration or purchasing process: pinpoint the problem and concentrate efforts on the apparent trouble spot to improve conversion rates. Maximize cross-selling opportunities in an online store: identify the top non-purchased products that customers also looked at before completing the purchasing process, and add these products as suggestions. Refine search engine placements by implementing a keyword strategy: use referrer files to identify commonly used search terms and the search engine or directory that sent the customer.
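Finding the drop-off point in a purchasing process can be sketched as a simple funnel count. Everything named here is hypothetical: the step URLs in `FUNNEL`, the function names, and the input shape (each session is a list of page URLs in visit order). The sketch counts how many sessions reach each step in order, then reports the transition that loses the largest share of visitors.

```python
# Hypothetical purchasing steps; a real site would substitute its own URLs.
FUNNEL = ["/cart", "/checkout", "/payment", "/confirmation"]


def funnel_counts(sessions, funnel):
    """Count sessions that reach each funnel step, in order.

    Other pages may be visited between steps; a step only counts
    once the previous step has been reached in the same session.
    """
    counts = [0] * len(funnel)
    for pages in sessions:
        step = 0
        for page in pages:
            if step < len(funnel) and page == funnel[step]:
                counts[step] += 1
                step += 1
    return counts


def biggest_drop_off(counts, funnel):
    """Return (from_step, to_step, lost_fraction) for the largest drop, or None."""
    worst = None
    for i in range(len(counts) - 1):
        if counts[i] == 0:
            continue  # nobody reached this step, so no drop can be measured
        lost = 1 - counts[i + 1] / counts[i]
        if worst is None or lost > worst[2]:
            worst = (funnel[i], funnel[i + 1], lost)
    return worst
```

The transition returned by `biggest_drop_off` is the "apparent trouble spot" where improvement effort would be concentrated.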
  16. Improve web site structure using web analytics, some examples: Analyze search logs to improve findability on the web site: do people search by “category” rather than by “uniquely identifying” search terms? Redesign the home page to make the most commonly used links more visible and so promote usability; demote the least used items to “below the fold”. Analyze “click paths” and entry and exit points to trace the most common routes around the site: identify areas where navigation seems unclear or confusing, and improve navigation to match demonstrated user preferences.
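Tallying click paths, entry points, and exit points can be sketched with a few counters. The function name `path_summary` and the input shape (each session is a list of page URLs in visit order) are assumptions for illustration.

```python
from collections import Counter


def path_summary(sessions):
    """Tally entry pages, exit pages, and full click paths.

    sessions: list of sessions, each a list of page URLs in visit order.
    Empty sessions are ignored.
    Returns three Counters: (entries, exits, paths).
    """
    non_empty = [s for s in sessions if s]
    entries = Counter(s[0] for s in non_empty)    # first page of each session
    exits = Counter(s[-1] for s in non_empty)     # last page of each session
    paths = Counter(tuple(s) for s in non_empty)  # whole route as a hashable tuple
    return entries, exits, paths
```

Calling `paths.most_common(5)` would list the five most frequent routes, and a page that ranks high in `exits` but is not a natural end point may indicate confusing navigation.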
  17. Clickstream monitoring and personalization: How is this done? This type of personalization is very complex and expensive to achieve. Existing customer and order databases must be mined for buying patterns, for example: people who bought a Norah Jones CD also bought a John Grisham novel. This is called collaborative filtering. Real-time monitoring of customers on your site is needed so that you can make recommendations or special offers at the right time. It becomes even more complex when combined with information actually provided by the customer.
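The "people who bought X also bought Y" pattern can be illustrated with a plain co-purchase count. This is a deliberately simple stand-in, not a full collaborative filtering system: it only counts how many customers bought each pair of items. The names `co_purchase_counts` and `recommend`, and the input shape (a dict mapping each customer to the set of items bought), are assumptions.

```python
from collections import Counter, defaultdict
from itertools import permutations


def co_purchase_counts(orders):
    """Count, for each item, how many customers also bought each other item.

    orders: dict mapping customer -> set of items purchased.
    Returns a dict mapping item -> Counter of co-purchased items.
    """
    co = defaultdict(Counter)
    for items in orders.values():
        # Every ordered pair (a, b) of distinct items in one customer's basket.
        for a, b in permutations(set(items), 2):
            co[a][b] += 1
    return co


def recommend(item, co, top_n=3):
    """Return up to top_n items most often bought alongside `item`."""
    return [other for other, _ in co[item].most_common(top_n)]
```

For example, if three customers bought both a Norah Jones CD and a John Grisham novel, `recommend("Norah Jones CD", co, top_n=1)` would suggest the novel. Production systems typically normalize these counts so that universally popular items do not dominate every recommendation.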
  18. Data Analysis and Distribution: Data collected from all customer touch points is stored in the data warehouse and is available for analysis and distribution to marketing decision makers. Analysis for marketing decision making includes: data mining; customer profiling; RFM analysis (recency, frequency, monetary).
  19. Data mining = extraction of hidden predictive information from large databases through statistical analysis. Marketers look for patterns in the data, such as: Do more people buy in particular months? Are there purchases that tend to be made after a particular life event? They use these patterns to refine marketing mix strategies, identify new product opportunities, and predict consumer behavior.
  20. Real-Space Approaches: Real-space primary data collection occurs at offline points of purchase. Smart card and credit card readers, interactive point-of-sale machines (iPOS), and bar code scanners are mechanisms for collecting real-space consumer data. Offline data, when combined with online data, paint a complete picture of consumer behavior for individual retail firms.
  21. Customer profiling = uses data warehouse information to help marketers understand the characteristics and behavior of specific target groups. It helps them to: understand who buys particular products; understand how customers react to promotional offers and pricing changes; select target groups for promotional appeals; find and keep customers with a higher lifetime value to the firm; understand the important characteristics of heavy product users; direct cross-selling activities to appropriate customers; and reduce direct mailing costs by targeting high-response customers.
  22. RFM analysis (recency, frequency, monetary) = scans the database for three criteria: When did the customer last purchase (recency)? How often has the customer purchased products (frequency)? How much has the customer spent on product purchases (monetary value)? This allows firms to target offers to the customers who are most responsive, saving promotional costs and increasing sales.
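The three RFM criteria can be computed from a purchase history roughly as follows. This is a minimal sketch: the function name `rfm` and the input shape, a list of `(customer, purchase_date, amount)` tuples, are assumptions, and recency is expressed here as days since the last purchase.

```python
from datetime import date


def rfm(purchases, today):
    """Compute recency, frequency, and monetary value per customer.

    purchases: iterable of (customer, purchase_date, amount) tuples.
    today: the reference date for measuring recency.
    Returns a dict mapping customer -> (recency_days, frequency, monetary).
    """
    table = {}
    for customer, when, amount in purchases:
        last, freq, total = table.get(customer, (when, 0, 0.0))
        table[customer] = (max(last, when), freq + 1, total + amount)
    return {
        customer: ((today - last).days, freq, total)
        for customer, (last, freq, total) in table.items()
    }
```

Customers with low recency, high frequency, and high monetary value would be the first candidates for targeted offers; how the three numbers are combined into a score is a business decision the slides leave open.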